piso.symmetric_difference

piso.symmetric_difference(interval_array, *interval_arrays, min_overlaps=2, squeeze=False, return_type='infer')

Performs a set symmetric difference operation.

What is considered a set depends on the number of positional arguments, that is, on whether interval_arrays is empty.

If interval_arrays is empty then the sets are considered to be the intervals contained in interval_array.

If interval_arrays is not empty then the sets are considered to be interval_array and the elements in interval_arrays. Each of these arrays is assumed to contain disjoint intervals (and satisfy the definition of a set). Any array containing overlaps between intervals will be mapped to one with disjoint intervals via a union operation.
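The normalisation step described above can be illustrated without pandas. The following is a minimal sketch, not piso's implementation: it assumes right-closed intervals represented as plain `(left, right)` tuples, and `union_sketch` is a hypothetical helper name introduced only for illustration.

```python
def union_sketch(intervals):
    """Merge overlapping or touching right-closed intervals into disjoint ones.

    Illustrative only: intervals are (left, right) tuples, each read as (left, right].
    """
    merged = []
    for left, right in sorted(intervals):
        if merged and left <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], right))
        else:
            merged.append((left, right))
    return merged


# An array with overlaps is reduced to a single set of disjoint intervals.
overlapping = [(0, 4), (2, 5), (7, 8), (8, 9)]
disjoint = union_sketch(overlapping)
print(disjoint)  # [(0, 5), (7, 9)]
```

Under this reading, an array passed alongside others contributes one set regardless of how many of its own intervals overlap each other.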

The symmetric difference can be defined as the set difference of the union and the intersection. The min_overlaps parameter of piso.intersection(), which defines the minimum number of intervals in an overlap required to constitute an intersection, carries over to the symmetric difference under this definition.
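To make the "union minus intersection" definition concrete, here is a small pure-Python sketch for the case where interval_arrays is empty. It is an illustration under stated assumptions rather than piso's actual algorithm: intervals are right-closed `(left, right)` tuples, and `coverage_segments`, `merge_adjacent` and `symmetric_difference_sketch` are hypothetical names. A point belongs to the union if at least one interval covers it and to the intersection if at least min_overlaps intervals cover it, so the symmetric difference keeps the pieces covered by at least 1 but fewer than min_overlaps intervals.

```python
def coverage_segments(intervals):
    """Split the line at every endpoint and count the intervals covering each piece."""
    points = sorted({p for interval in intervals for p in interval})
    segments = []
    for left, right in zip(points, points[1:]):
        count = sum(1 for a, b in intervals if a <= left and right <= b)
        segments.append((left, right, count))
    return segments


def merge_adjacent(pieces):
    """Join consecutive pieces that share an endpoint, e.g. (7, 8] and (8, 9]."""
    merged = []
    for left, right in pieces:
        if merged and merged[-1][1] == left:
            merged[-1] = (merged[-1][0], right)
        else:
            merged.append((left, right))
    return merged


def symmetric_difference_sketch(intervals, min_overlaps=2):
    """Union minus intersection, where intersection needs >= min_overlaps intervals."""
    if min_overlaps == "all":
        min_overlaps = len(intervals)
    kept = [
        (left, right)
        for left, right, count in coverage_segments(intervals)
        if 1 <= count < min_overlaps
    ]
    return merge_adjacent(kept)


arr = [(0, 4), (2, 5), (3, 6), (7, 8), (8, 9), (10, 12)]
print(symmetric_difference_sketch(arr))                      # [(0, 2), (5, 6), (7, 9), (10, 12)]
print(symmetric_difference_sketch(arr, min_overlaps=3))      # [(0, 3), (4, 6), (7, 9), (10, 12)]
print(symmetric_difference_sketch(arr, min_overlaps="all"))  # [(0, 6), (7, 9), (10, 12)]
```

The three results correspond to the single-array outputs listed in the Examples section for the same input intervals.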

Parameters
interval_array : pandas.IntervalIndex or pandas.arrays.IntervalArray

The first (and possibly only) operand to the symmetric difference operation.

*interval_arrays : argument list of pandas.IntervalIndex or pandas.arrays.IntervalArray

May contain zero or more arguments.

min_overlaps : int or “all”, default 2

Specifies the minimum number of intervals which must overlap in order to define an intersection. If min_overlaps is an int then it must be no smaller than 2. If min_overlaps is “all” then an intersection is only defined where every interval overlaps. If supplied, it must be passed as a keyword argument.

squeeze : boolean, default False

If True, will try to coerce the return value to a single pandas.Interval. If supplied, it must be passed as a keyword argument.

return_type : {“infer”, pandas.IntervalIndex, pandas.arrays.IntervalArray}, default “infer”

If “infer”, the return type will be the same as that of interval_array. If supplied, it must be passed as a keyword argument.

Returns
pandas.IntervalIndex or pandas.arrays.IntervalArray

Examples

>>> import pandas as pd
>>> import piso

Examples with interval_arrays empty:

>>> arr = pd.arrays.IntervalArray.from_tuples(
...     [(0, 4), (2, 5), (3, 6), (7, 8), (8, 9), (10, 12)],
... )
>>> piso.symmetric_difference(arr)
<IntervalArray>
[(0.0, 2.0], (5.0, 6.0], (7.0, 9.0], (10.0, 12.0]]
Length: 4, closed: right, dtype: interval[float64]
>>> piso.symmetric_difference(pd.IntervalIndex(arr))
IntervalIndex([(0.0, 2.0], (5.0, 6.0], (7.0, 9.0], (10.0, 12.0]],
              closed='right',
              dtype='interval[float64]')
>>> piso.symmetric_difference(arr, return_type=pd.IntervalIndex)
IntervalIndex([(0.0, 2.0], (5.0, 6.0], (7.0, 9.0], (10.0, 12.0]],
              closed='right',
              dtype='interval[float64]')
>>> piso.symmetric_difference(arr, min_overlaps=3)
<IntervalArray>
[(0.0, 3.0], (4.0, 6.0], (7.0, 9.0], (10.0, 12.0]]
Length: 4, closed: right, dtype: interval[float64]
>>> piso.symmetric_difference(arr, min_overlaps="all")
<IntervalArray>
[(0.0, 6.0], (7.0, 9.0], (10.0, 12.0]]
Length: 3, closed: right, dtype: interval[float64]

Examples with interval_arrays non-empty:

>>> arr1 = pd.arrays.IntervalArray.from_tuples(
...     [(0, 4), (5, 6), (7, 8), (10, 12)],
... )
>>> arr2 = pd.arrays.IntervalArray.from_tuples(
...     [(3, 5), (8, 9)],
... )
>>> arr3 = pd.arrays.IntervalArray.from_tuples(
...     [(6, 8), (9, 11)],
... )
>>> piso.symmetric_difference(arr1, arr2)
<IntervalArray>
[(0.0, 3.0], (4.0, 6.0], (7.0, 9.0], (10.0, 12.0]]
Length: 4, closed: right, dtype: interval[float64]
>>> piso.symmetric_difference(arr1, arr2, arr3)
<IntervalArray>
[(0.0, 3.0], (4.0, 7.0], (8.0, 10.0], (11.0, 12.0]]
Length: 4, closed: right, dtype: interval[float64]
>>> piso.symmetric_difference(arr1, arr2, arr3, min_overlaps="all")
<IntervalArray>
[(0.0, 12.0]]
Length: 1, closed: right, dtype: interval[float64]