piso.accessor.ArrayAccessor.coverage

ArrayAccessor.coverage(domain=None, bins=False, how='fraction')

Calculates how much of a domain (or possibly multiple domains) is covered by a collection of intervals, either as a fraction of the domain or as a total length.

The intervals are contained in the array object the accessor belongs to. The intervals may overlap one another, and together they may cover none, part, or all of the domain.

Calculation over multiple domains is only possible when bins = True.
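For a single domain, the calculation described above can be sketched in plain Python. This is only an illustration of the idea (merge overlapping intervals, clip them to the domain, then total the lengths), not piso's implementation; the helper names `merge_intervals` and `covered_length` are hypothetical and interval endpoints are treated as simple numeric tuples.

```python
def merge_intervals(intervals):
    """Merge overlapping (left, right) tuples into disjoint, sorted intervals."""
    merged = []
    for left, right in sorted(intervals):
        if merged and left <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], right)
        else:
            merged.append([left, right])
    return [tuple(pair) for pair in merged]


def covered_length(intervals, domain):
    """Total length of a single (lower, upper) domain covered by the intervals."""
    lower, upper = domain
    total = 0
    for left, right in merge_intervals(intervals):
        # Clip each merged interval to the domain; skip pieces outside it.
        overlap = min(right, upper) - max(left, lower)
        if overlap > 0:
            total += overlap
    return total


intervals = [(0, 4), (3, 5), (7, 8)]
domain = (0, 10)

covered = covered_length(intervals, domain)    # analogous to how="sum"
fraction = covered / (domain[1] - domain[0])   # analogous to how="fraction"
print(covered, fraction)
```

Overlapping intervals such as (0, 4) and (3, 5) are counted once, which is why the covered length here is 6 rather than 7.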

Parameters
domain : tuple, pandas.Interval, pandas.IntervalIndex or pandas.arrays.IntervalArray, optional

Specifies the domain over which to calculate the “coverage”. If domain is None, then the domain is taken to span the extremities of the intervals contained in the interval array the accessor belongs to, that is, from the smallest left bound to the largest right bound. If domain is a tuple then it should specify lower and upper bounds, and is treated as equivalent to a pandas.Interval. If domain is a pandas.IntervalIndex or pandas.arrays.IntervalArray then the intervals it contains define a possibly disconnected domain. If bins = True then domain must be a pandas.IntervalIndex or pandas.arrays.IntervalArray with disjoint intervals.

bins : boolean, default False

If False, then domain is interpreted as a single domain and one value is returned. If True, then domain is interpreted as a set of disjoint bins, and coverage is calculated separately for each bin.

how : {"fraction", "sum"}, default "fraction"

If how = "fraction" then the result is the covered length expressed as a fraction of the size of the domain. If how = "sum" then the result is the total length of the domain that is covered.

New in version 0.8.0.
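The interaction between bins and how can be sketched without pandas. The snippet below is a minimal illustration under simplifying assumptions (numeric tuples for intervals and bins, a hypothetical helper `covered_length`), not the library's code. It computes the covered length per bin, then derives both a per-bin fraction and the single fraction obtained when the bins are treated as one disconnected domain.

```python
def covered_length(intervals, lower, upper):
    """Length of (lower, upper) covered by the union of the intervals."""
    total = 0
    reach = lower  # right edge of the part of the domain already counted
    for left, right in sorted(intervals):
        start = max(left, reach)   # never recount an overlapping stretch
        end = min(right, upper)    # never count beyond the domain
        if end > start:
            total += end - start
            reach = end
    return total


intervals = [(0, 4), (3, 5), (7, 8)]
bins = [(4, 6), (7, 10)]

# Per-bin results, analogous to bins=True.
sum_per_bin = {b: covered_length(intervals, *b) for b in bins}
fraction_per_bin = {b: sum_per_bin[b] / (b[1] - b[0]) for b in bins}

# One result over the whole disconnected domain, analogous to bins=False.
total_covered = sum(sum_per_bin.values())
total_size = sum(upper - lower for lower, upper in bins)
overall_fraction = total_covered / total_size

print(sum_per_bin)
print(fraction_per_bin)
print(overall_fraction)
```

Note that the overall fraction is the total covered length divided by the total domain size, which in general differs from the mean of the per-bin fractions.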

Returns
float or pandas.Series

A single value if bins = False, or a pandas.Series indexed by the bins if bins = True.

Examples

>>> import pandas as pd
>>> import piso
>>> piso.register_accessors()
>>> arr1 = pd.arrays.IntervalArray.from_tuples(
...     [(0, 4), (3, 5), (7, 8)],
... )
>>> arr1.piso.coverage()
0.75
>>> arr1.piso.coverage((0, 10))
0.6
>>> arr1.piso.coverage(pd.Interval(-10, 10))
0.3
>>> arr1.piso.coverage(pd.Interval(-10, 10), how="sum")
6
>>> domain = pd.arrays.IntervalArray.from_tuples(
...     [(4,6), (7, 10)],
... )
>>> arr1.piso.coverage(domain)
0.4
>>> arr1.piso.coverage(domain, bins=True)
(4, 6]     0.500000
(7, 10]    0.333333
dtype: float64
>>> arr1.piso.coverage(domain, bins=True, how="sum")
(4, 6]     1.0
(7, 10]    1.0
dtype: float64