piso.accessor.ArrayAccessor.difference

ArrayAccessor.difference(*interval_arrays, squeeze=False, return_type='infer')

Performs a set difference operation.

The arrays in interval_arrays, together with the interval array the accessor belongs to (an instance of pandas.IntervalIndex or pandas.arrays.IntervalArray), are the sets over which the operation is performed. Each array is assumed to contain disjoint intervals, so that it satisfies the definition of a set. Any array whose intervals overlap is first mapped to one with disjoint intervals via a union operation.
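The normalisation step described above can be pictured with a small pure-Python sketch. This is not piso's implementation: the helper name `merge_intervals` is hypothetical, and intervals are modelled as plain `(left, right)` tuples that are assumed to be right-closed, i.e. `(left, right]`.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching right-closed intervals (left, right].

    Hypothetical helper for illustration only; not part of piso.
    """
    merged = []
    for left, right in sorted(intervals):
        # (a, b] and (b, c] together cover (a, c], so touching intervals merge too.
        if merged and left <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], right))
        else:
            merged.append((left, right))
    return merged


# Same endpoints as arr1 in the Examples section.
arr1 = [(0, 4), (2, 5), (3, 6), (7, 8), (8, 9), (10, 12)]
disjoint = merge_intervals(arr1)
print(disjoint)
```

With the endpoints of arr1 from the Examples section, the overlapping intervals collapse into three disjoint ones before any difference is taken.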

The list interval_arrays must contain at least one element. If interval_arrays contains a single element then the result is the set difference between the interval array the accessor belongs to, and this single element. If interval_arrays contains multiple elements then the result is the set difference between the interval array the accessor belongs to and the union of the sets in interval_arrays. This is equivalent to iteratively applying a set difference operation with each array in interval_arrays as the second operand.
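The equivalence stated above (difference against the union versus repeated differences) can be checked with a minimal pure-Python sketch. It does not use piso and is not how piso is implemented: `merge_intervals` and `difference` are hypothetical helpers, and intervals are plain `(left, right)` tuples assumed to be right-closed.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching right-closed intervals (hypothetical helper)."""
    merged = []
    for left, right in sorted(intervals):
        if merged and left <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], right))
        else:
            merged.append((left, right))
    return merged


def difference(a, b):
    """Return the parts of a not covered by b (hypothetical helper, not piso's API)."""
    cuts = merge_intervals(b)
    result = []
    for left, right in merge_intervals(a):
        start = left  # left edge of the piece that is still uncovered
        for cut_left, cut_right in cuts:
            if cut_right <= start or cut_left >= right:
                continue  # this cut does not touch the remaining piece
            if cut_left > start:
                result.append((start, cut_left))  # keep the part before the cut
            start = max(start, cut_right)
            if start >= right:
                break
        if start < right:
            result.append((start, right))
    return result


# Same endpoints as arr1, arr2 and arr3 in the Examples section.
arr1 = [(0, 4), (2, 5), (3, 6), (7, 8), (8, 9), (10, 12)]
arr2 = [(4, 7), (8, 11)]
arr3 = [(2, 5), (7, 13)]

via_union = difference(arr1, arr2 + arr3)                  # subtract the union
iterative = difference(difference(arr1, arr2), arr3)       # subtract one at a time
print(via_union, iterative)
```

Both routes leave the same single interval, which matches the result of arr1.piso.difference(arr2, arr3) shown in the Examples section.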

Parameters
*interval_arrays : argument list of pandas.IntervalIndex or pandas.arrays.IntervalArray

Must contain at least one argument.

squeeze : boolean, default False

If True, will try to coerce the return value to a single pandas.Interval. If supplied, it must be passed as a keyword argument.

return_type : {"infer", pandas.IntervalIndex, pandas.arrays.IntervalArray}, default "infer"

If "infer", the return type will be the same as that of the interval array the accessor belongs to. If supplied, it must be passed as a keyword argument.

Returns
pandas.IntervalIndex or pandas.arrays.IntervalArray, or a single pandas.Interval if squeeze is True and the result can be coerced to one

Examples

>>> import pandas as pd
>>> import piso
>>> piso.register_accessors()
>>> arr1 = pd.arrays.IntervalArray.from_tuples(
...     [(0, 4), (2, 5), (3, 6), (7, 8), (8, 9), (10, 12)],
... )
>>> arr2 = pd.arrays.IntervalArray.from_tuples(
...     [(4, 7), (8, 11)],
... )
>>> arr3 = pd.arrays.IntervalArray.from_tuples(
...     [(2, 5), (7, 13)],
... )
>>> arr1.piso.difference(arr2)
<IntervalArray>
[(0.0, 4.0], (7.0, 8.0], (11.0, 12.0]]
Length: 3, closed: right, dtype: interval[float64]
>>> arr1.set_closed("left").piso.difference(arr2.set_closed("left"))
<IntervalArray>
[[0.0, 4.0), [7.0, 8.0), [11.0, 12.0)]
Length: 3, closed: left, dtype: interval[float64]
>>> arr1.piso.difference(arr2, return_type=pd.IntervalIndex)
IntervalIndex([(0.0, 4.0], (7.0, 8.0], (11.0, 12.0]],
              closed='right',
              dtype='interval[float64]')
>>> arr1.piso.difference(arr2, arr3)
<IntervalArray>
[(0.0, 2.0]]
Length: 1, closed: right, dtype: interval[float64]
>>> arr1.piso.difference(arr2, arr3, squeeze=True)
Interval(0.0, 2.0, closed='right')