Obtaining information for a single job fails #18

@smjanke

Hi there,

When I run GA4HPC to get information on a single job, it crashes as follows:

```
sh myCarbonFootprint.sh -S 2024-01-20 --filterJobIDs 4198761
Virtualenv: OK
Python versions: OK
Traceback (most recent call last):
  File "GreenAlgo4HPC/GA_env/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3652, in get_loc
    return self._engine.get_loc(casted_key)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 2606, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 2630, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "GreenAlgo4HPC/__init__.py", line 155, in <module>
    extracted_data = main_backend(args)
                     ^^^^^^^^^^^^^^^^^^
  File "GreenAlgo4HPC/backend/__init__.py", line 240, in main_backend
    summary_stats = summarise_data(df2, args=args)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "GreenAlgo4HPC/backend/__init__.py", line 193, in summarise_data
    userID = df.UserX[0]
             ~~~~~~~~^^^
  File "GreenAlgo4HPC/GA_env/lib/python3.11/site-packages/pandas/core/series.py", line 1012, in __getitem__
    return self._get_value(key)
           ^^^^^^^^^^^^^^^^^^^^
  File "GreenAlgo4HPC/GA_env/lib/python3.11/site-packages/pandas/core/series.py", line 1121, in _get_value
    loc = self.index.get_loc(label)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "GreenAlgo4HPC/GA_env/lib/python3.11/site-packages/pandas/core/indexes/base.py", line 3654, in get_loc
    raise KeyError(key) from err
KeyError: 0
```

As the error message indicates, the error originates from line 193 of `backend/__init__.py`, specifically the expression `userID = df.UserX[0]`. From what I can tell, when there is only one job, `df.UserX` no longer contains the index label `0`, so the label-based lookup `[0]` raises `KeyError: 0`.

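A workaround that seems to work is to convert `df.UserX` to a string when there is only one job (though this of course sidesteps any proper pandas-based solution):

```python
if len(df.UserX) == 1:
    # str() of a one-row Series starts with "<index label>  <value>",
    # so the second whitespace-separated token is the user ID.
    userID = str(df.UserX).split()[1]
else:
    userID = df.UserX[0]
```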
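A pandas-based alternative might be to select the first row by position with `.iloc[0]` instead of by label with `[0]`. The sketch below is only an illustration under assumptions: the toy DataFrame, the `JobID` column, the user name `alice`, and the helper `first_user` are hypothetical, and I am assuming (not having checked the GA4HPC code) that the job filter leaves the original row labels in place.

```python
import pandas as pd

# Hypothetical stand-in for the job table; only the UserX column name
# is taken from the traceback, everything else is made up.
df = pd.DataFrame(
    {"JobID": [4198760, 4198761, 4198762], "UserX": ["alice", "alice", "alice"]}
)

# Boolean filtering keeps the original row labels, so the one remaining
# row carries label 1 rather than 0.
df_one = df[df["JobID"] == 4198761]


def first_user(frame):
    """Return the first UserX value by position, ignoring index labels."""
    return frame["UserX"].iloc[0]


# The label-based lookup fails on the filtered frame ...
try:
    df_one.UserX[0]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# ... while the positional lookup works for one row or many.
user_id = first_user(df_one)
```

If that assumption holds, calling `reset_index(drop=True)` on the filtered DataFrame before `summarise_data` would be another option, since it renumbers the rows from 0.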
