FB Fails if Indices are Missing/Out of Order #359

@xehu

Description

It appears that the FeatureBuilder (in particular, the mimicry feature) assumes that the input dataframe's row index increases by exactly one from each row to the next. If the index has gaps (for example, because the dataframe was filtered before processing and a few rows were dropped), the FeatureBuilder fails with the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File /opt/anaconda3/envs/tpm_virtualenv/lib/python3.11/site-packages/pandas/core/indexes/base.py:3805, in Index.get_loc(self, key)
   3804 try:
-> 3805     return self._engine.get_loc(casted_key)
   3806 except KeyError as err:

File index.pyx:167, in pandas._libs.index.IndexEngine.get_loc()

File index.pyx:196, in pandas._libs.index.IndexEngine.get_loc()

File pandas/_libs/hashtable_class_helper.pxi:2606, in pandas._libs.hashtable.Int64HashTable.get_item()

File pandas/_libs/hashtable_class_helper.pxi:2630, in pandas._libs.hashtable.Int64HashTable.get_item()

KeyError: 813

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[88], line 19
      1 # first, run the feature builder on the labeled utterances
      2 labeled_reddit_feature_builder = FeatureBuilder(
      3 	input_df=df_labeled_utterances_to_annotate,
      4 	custom_features=[
...
   3815     #  InvalidIndexError. Otherwise we fall through and re-raise
   3816     #  the TypeError.
   3817     self._check_indexing_error(key)

KeyError: 813
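
For anyone hitting this, here is a minimal sketch of what I think is going on, plus a possible workaround. The toy dataframe and the `previous_message_by_label` helper are hypothetical and only illustrate the suspected pattern (looking up the previous row by index label); I have not confirmed that the mimicry feature is implemented this way.

```python
import pandas as pd

# Hypothetical toy data standing in for the real input dataframe.
df = pd.DataFrame({
    "conversation_num": [1, 1, 1, 1],
    "message": ["hi", "hello", "how are you", "fine"],
})

# Filtering drops a row and leaves a gap in the index: 0, 1, 3.
filtered = df[df["message"] != "how are you"]


def previous_message_by_label(frame, label):
    """Assumed failure pattern: fetch the previous row as `label - 1`."""
    return frame.loc[label - 1, "message"]


try:
    previous_message_by_label(filtered, 3)
    lookup_failed = False
except KeyError:
    # Label 2 was dropped by the filter, so the label-based lookup fails.
    lookup_failed = True

# Possible workaround: restore a contiguous 0..n-1 index before
# passing the dataframe to the FeatureBuilder.
clean = filtered.reset_index(drop=True)
previous = previous_message_by_label(clean, 2)
```

Calling `reset_index(drop=True)` on the filtered dataframe before constructing the `FeatureBuilder` may avoid the error until the index assumption is removed. A longer-term fix would probably be to reset the index internally or to use positional lookups (`.iloc`) instead of label lookups.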
Labels: bug (Something isn't working)
