BUG: ValueError: Length of values (5) does not match length of index (4)
when subtracting two Series with MultiIndex and Index and NaN values
#60908
Labels
Bug
Indexing
Related to indexing on series/frames, not to indexes themselves
Missing-data
np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
MultiIndex
Needs Triage
Issue that has not been reviewed by a pandas team member
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
Issue Description
It is possible to carry out arithmetic operations on two Series with "mixed" indices when at least one level is the same. However, in my case `s1 - s2`, `s1` contains an all-NaN index row, which raises a `ValueError: Length of values (5) does not match length of index (4)`.
I found that this could be an error in how the two Series are aligned.
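For illustration, the situation described above might look like the following minimal sketch. The labels, values, and level names are hypothetical (they are not the original reproducer), and whether the subtraction raises depends on the installed pandas version, so the outcome is captured rather than assumed.

```python
import numpy as np
import pandas as pd

# Hypothetical data: s1 has a two-level MultiIndex whose last row is all-NaN.
idx1 = pd.MultiIndex.from_tuples(
    [("a", 1), ("b", 2), ("c", 3), ("d", 4), (np.nan, np.nan)],
    names=["key", "num"],
)
s1 = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx1)

# s2 has a plain Index that shares the level name "key" with s1.
s2 = pd.Series(
    [10.0, 20.0, 30.0, 40.0],
    index=pd.Index(["a", "b", "c", "d"], name="key"),
)

# The all-NaN row is stored as code -1 in each level of the MultiIndex.
print(idx1.codes)

# Capture the outcome instead of assuming it; the exception type and
# message may differ between pandas versions.
try:
    result = s1 - s2
    outcome = "ok"
except Exception as exc:
    outcome = f"{type(exc).__name__}: {exc}"
print(outcome)
```

On an affected version, `outcome` would be expected to hold the `ValueError` quoted in this report; on a fixed version it should be `"ok"`.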
I traced the origin of the mismatching codes to `pandas.core.indexes.base.py:Index._join_level`, which ignores missing values when constructing the new index.
This is all possible because `verify_integrity` is set to False (and not passed down). If I set `verify_integrity=True`, the `join_index = MultiIndex(...)` fails much earlier with `ValueError: Length of levels and codes must match. NOTE: this index is in an inconsistent state.`
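As a rough sketch of why skipped validation matters, the snippet below filters `-1` out of a hypothetical code array (the names `old_codes` and `taker` are borrowed from the code quoted in this report, but the arrays are made up) and then builds a `MultiIndex` from codes of unequal length. This is not the code path inside `_join_level`, and the message raised here is not necessarily the one quoted above; it only shows that `verify_integrity=True` rejects inconsistent codes at construction time.

```python
import numpy as np
import pandas as pd

# Hypothetical codes for one level; -1 marks a missing (NaN) entry.
old_codes = np.array([0, 1, 2, 3, -1])

# Filtering out -1 drops the NaN row, so 5 entries become 4.
taker = old_codes[old_codes != -1]
print(len(old_codes), len(taker))

levels = [["a", "b", "c", "d"], [1, 2, 3, 4]]
codes = [taker, old_codes]  # deliberately inconsistent lengths

# With verify_integrity=True pandas validates the codes and raises.
try:
    pd.MultiIndex(levels=levels, codes=codes, verify_integrity=True)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)
print(strict_error)
```

A length mismatch of this kind (four entries after filtering versus five before) has the same shape as the 5-versus-4 mismatch in the reported error, although that link is an assumption here.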
I tried to fix this by changing `taker = old_codes[old_codes != -1]` to `taker = old_codes`. This alleviates the initial `ValueError` (just tested for my case). If I also comment out the -1 handling, I get the desired expected behaviour.
Expected Behavior
Installed Versions
Also happens with pandas==2.2.3