piso.join

piso.join(*frames_or_series, how='left', suffixes=None, sort=False)

Joins multiple dataframes or series by their pandas.IntervalIndex.

Each interval in a pandas.IntervalIndex is treated as a set of points, and the interval index itself as the set formed by the union of its intervals. The join types are as follows:

  • left: the set defined by the interval index of the result is the same as the set defined by the index of the first argument in frames_or_series

  • right: the set defined by the interval index of the result is the same as the set defined by the index of the last argument in frames_or_series

  • inner: the set defined by the interval index of the result is the intersection of sets defined by interval indexes from all join arguments

  • outer: the set defined by the interval index of the result is the union of sets defined by interval indexes from all join arguments
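To make the set interpretation concrete, the following minimal sketch uses only pandas (not piso) to show that two different interval indexes can define the same set. The helper `covers` is a hypothetical name introduced for illustration and is not part of piso.

```python
import pandas as pd

def covers(index, point):
    """Return True if `point` lies in the set defined by `index`.

    Hypothetical helper for illustration; not part of piso.
    """
    return any(point in interval for interval in index)

# Two different indexes (intervals are right-closed by default).
one_piece = pd.IntervalIndex.from_tuples([(1, 3)])
two_pieces = pd.IntervalIndex.from_tuples([(1, 2), (2, 3)])

# Both describe the same set (1, 3], so they agree at every sample point.
sample_points = [0.5, 1, 1.5, 2, 2.5, 3, 3.5]
same_set = all(
    covers(one_piece, p) == covers(two_pieces, p) for p in sample_points
)
```

This is why the index of a left join may be split into smaller intervals than the index of the first argument while still defining the same set.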

Parameters
*frames_or_series : argument list of pandas.DataFrame or pandas.Series

May contain two or more arguments, all of which must be indexed by a pandas.IntervalIndex containing disjoint intervals. The index can have any closed value. Every pandas.Series must have a name.

how : {"left", "right", "inner", "outer"}, default "left"

What sort of join to perform.

suffixes : list of str or None, default None

Suffixes to use for overlapping columns. If provided, the list should have the same length as frames_or_series.

sort : bool, default False

Order result DataFrame lexicographically by the join key. If False, the order of the join key depends on the join type.
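The requirements on frames_or_series can be checked before calling piso.join. The sketch below is one possible pre-check using only pandas. `check_join_input` is a hypothetical helper, not part of piso, and it assumes that "disjoint" corresponds to pandas' `IntervalIndex.is_overlapping` being False.

```python
import pandas as pd

def check_join_input(obj):
    """Return a description of the first problem found, or None.

    Hypothetical helper for illustration; not part of piso.
    """
    if not isinstance(obj.index, pd.IntervalIndex):
        return "index is not a pandas.IntervalIndex"
    # Assumption: "disjoint" is taken to mean no two intervals share a point.
    if obj.index.is_overlapping:
        return "intervals are not disjoint"
    if isinstance(obj, pd.Series) and obj.name is None:
        return "Series has no name"
    return None

disjoint = pd.IntervalIndex.from_tuples([(2, 4), (5, 6)])
overlapping = pd.IntervalIndex.from_tuples([(1, 3), (2, 4)])

named = pd.Series([True, False], index=disjoint, name="C")
unnamed = pd.Series([True, False], index=disjoint)
bad_index = pd.Series([True, False], index=overlapping, name="C")

# One entry per input: None means no problem was found.
problems = [check_join_input(x) for x in (named, unnamed, bad_index)]
```

An unnamed series can be given a name with `Series.rename`, for example `unnamed.rename("C")`.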

Returns
pandas.DataFrame

A dataframe containing the columns from the elements of frames_or_series.

Examples

>>> import pandas as pd
>>> import piso
>>> df = pd.DataFrame(
...     {"A":[4,3], "B":["x","y"]},
...     index=pd.IntervalIndex.from_tuples([(1,3), (5,7)]),
... )
>>> s = pd.Series(
...     [True, False],
...     index=pd.IntervalIndex.from_tuples([(2,4), (5,6)]),
...     name="C",
... )
>>> piso.join(df, s)
        A  B      C
(1, 2]  4  x    NaN
(2, 3]  4  x   True
(5, 6]  3  y  False
(6, 7]  3  y    NaN
>>> piso.join(df, s, how="right")
          A    B      C
(2, 3]  4.0    x   True
(3, 4]  NaN  NaN   True
(5, 6]  3.0    y  False
>>> piso.join(df, s, how="inner")
        A  B      C
(2, 3]  4  x   True
(5, 6]  3  y  False
>>> piso.join(df, s, how="outer")
          A    B      C
(1, 2]  4.0    x    NaN
(2, 3]  4.0    x   True
(5, 6]  3.0    y  False
(6, 7]  3.0    y    NaN
(3, 4]  NaN  NaN   True
>>> piso.join(df, s, how="outer", sort=True)
          A    B      C
(1, 2]  4.0    x    NaN
(2, 3]  4.0    x   True
(3, 4]  NaN  NaN   True
(5, 6]  3.0    y  False
(6, 7]  3.0    y    NaN
>>> piso.join(df, df, suffixes=["", "2"])
        A  B  A2 B2
(1, 3]  4  x   4  x
(5, 7]  3  y   3  y
>>> df2 = pd.DataFrame(
...     {"D":[1,2]},
...     index=pd.IntervalIndex.from_tuples([(1,2), (6,7)]),
... )
>>> piso.join(df, s, df2)
        A  B      C    D
(1, 2]  4  x    NaN  1.0
(2, 3]  4  x   True  NaN
(5, 6]  3  y  False  NaN
(6, 7]  3  y    NaN  2.0
>>> piso.join(df, s, df2, how="right")
        D  A  B    C
(1, 2]  1  4  x  NaN
(6, 7]  2  3  y  NaN