piso.difference

piso.difference(interval_array, *interval_arrays, squeeze=False, return_type='infer')

Performs a set difference operation.

The argument interval_array and each array in interval_arrays are the sets over which the operation is performed. Each array is assumed to contain disjoint intervals, so that it satisfies the definition of a set. Any array containing overlapping intervals is first mapped to one with disjoint intervals via a union operation.
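
The mapping of overlapping intervals to disjoint ones can be pictured with a minimal pure-Python sketch. This is not piso's implementation: merge_overlaps is a hypothetical helper, intervals are plain (left, right) tuples, and every interval is assumed to be closed on the right, so touching intervals such as (7, 8] and (8, 9] merge into one.

```python
def merge_overlaps(intervals):
    """Union right-closed intervals (left, right] into disjoint ones.

    Hypothetical helper for illustration only; not part of piso.
    """
    merged = []
    for left, right in sorted(intervals):
        if merged and left <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], right))
        else:
            merged.append((left, right))
    return merged


# Same endpoints as arr1 in the Examples section.
arr1 = [(0, 4), (2, 5), (3, 6), (7, 8), (8, 9), (10, 12)]
disjoint = merge_overlaps(arr1)
print(disjoint)  # [(0, 6), (7, 9), (10, 12)]
```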

The argument list interval_arrays must contain at least one element. If it contains a single element, the result is the set difference between interval_array and that element. If it contains multiple elements, the result is the set difference between interval_array and the union of the sets in interval_arrays. This is equivalent to iteratively applying a set difference operation with each array in interval_arrays as the second operand.
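
The equivalence between subtracting the union of the operands and subtracting them one at a time can be checked with a small pure-Python sketch. It is an illustration under stated assumptions, not piso's implementation: merge_overlaps, subtract and difference are hypothetical helpers, and intervals are (left, right) tuples treated as closed on the right.

```python
def merge_overlaps(intervals):
    """Union right-closed intervals (left, right] into sorted, disjoint ones."""
    merged = []
    for left, right in sorted(intervals):
        if merged and left <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], right))
        else:
            merged.append((left, right))
    return merged


def subtract(a, b):
    """Remove b from a; both are sorted lists of disjoint intervals."""
    out = []
    for left, right in a:
        cursor = left  # start of the part of (left, right] not yet removed
        for b_left, b_right in b:
            if b_right <= cursor or b_left >= right:
                continue  # no overlap with the remaining part
            if b_left > cursor:
                out.append((cursor, b_left))  # gap before b survives
            cursor = max(cursor, b_right)
            if cursor >= right:
                break
        if cursor < right:
            out.append((cursor, right))
    return out


def difference(first, *others):
    """Iteratively subtract each operand in others from first."""
    if not others:
        raise ValueError("at least one second operand is required")
    result = merge_overlaps(first)
    for other in others:
        result = subtract(result, merge_overlaps(other))
    return result


# Same endpoints as arr1, arr2 and arr3 in the Examples section.
arr1 = [(0, 4), (2, 5), (3, 6), (7, 8), (8, 9), (10, 12)]
arr2 = [(4, 7), (8, 11)]
arr3 = [(2, 5), (7, 13)]

iterative = difference(arr1, arr2, arr3)
via_union = difference(arr1, merge_overlaps(arr2 + arr3))
print(iterative, via_union)  # [(0, 2)] [(0, 2)]
```

Both routes give the same single interval, which matches the endpoints returned by piso.difference(arr1, arr2, arr3) in the Examples section.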

Parameters
interval_array : pandas.IntervalIndex or pandas.arrays.IntervalArray

The first operand to the difference operation.

*interval_arrays : argument list of pandas.IntervalIndex or pandas.arrays.IntervalArray

Must contain at least one argument.

squeeze : boolean, default False

If True, will try to coerce the return value to a single pandas.Interval. If supplied, it must be passed as a keyword argument.

return_type : {"infer", pandas.IntervalIndex, pandas.arrays.IntervalArray}, default "infer"

If "infer", the return type will be the same as that of interval_array. If supplied, it must be passed as a keyword argument.

Returns
pandas.IntervalIndex or pandas.arrays.IntervalArray, or a single pandas.Interval if squeeze=True and the result can be coerced to one

Examples

>>> import pandas as pd
>>> import piso
>>> arr1 = pd.arrays.IntervalArray.from_tuples(
...     [(0, 4), (2, 5), (3, 6), (7, 8), (8, 9), (10, 12)],
... )
>>> arr2 = pd.arrays.IntervalArray.from_tuples(
...     [(4, 7), (8, 11)],
... )
>>> arr3 = pd.arrays.IntervalArray.from_tuples(
...     [(2, 5), (7, 13)],
... )
>>> piso.difference(arr1, arr2)
<IntervalArray>
[(0.0, 4.0], (7.0, 8.0], (11.0, 12.0]]
Length: 3, closed: right, dtype: interval[float64]
>>> piso.difference(arr1.set_closed("left"), arr2.set_closed("left"))
<IntervalArray>
[[0.0, 4.0), [7.0, 8.0), [11.0, 12.0)]
Length: 3, closed: left, dtype: interval[float64]
>>> piso.difference(arr1, arr2, return_type=pd.IntervalIndex)
IntervalIndex([(0.0, 4.0], (7.0, 8.0], (11.0, 12.0]],
              closed='right',
              dtype='interval[float64]')
>>> piso.difference(arr1, arr2, arr3)
<IntervalArray>
[(0.0, 2.0]]
Length: 1, closed: right, dtype: interval[float64]
>>> piso.difference(arr1, arr2, arr3, squeeze=True)
Interval(0.0, 2.0, closed='right')