Ubibrowser KeyError #1418

Closed
kkaris opened this issue Sep 8, 2023 · 3 comments · Fixed by #1423

@kkaris

kkaris commented Sep 8, 2023

Running sources.ubibrowser.api.process_from_web raises a KeyError, likely because the column headers changed in the latest data from UbiBrowser.

Update: the base URL needs to be updated as well. The new URL is http://ubibrowser.bio-it.cn/ubibrowser_v3/Public/download/literature/

http://ubibrowser.bio-it.cn/ubibrowser_v3/home/download

To re-create the error:

from indra.sources import ubibrowser
up = ubibrowser.process_from_web()

The error output (the Pandas part of the stack trace is omitted for brevity):

KeyError                                  Traceback (most recent call last)
Input In [2], in <cell line: 1>()
----> 1 up = ubibrowser.process_from_web()

File ~/repos/indra/indra/sources/ubibrowser/api.py:23, in process_from_web()
     21 e3_df = pandas.read_csv(E3_URL, sep='\t')
     22 dub_df = pandas.read_csv(DUB_URL, sep='\t')
---> 23 return process_df(e3_df, dub_df)

File ~/repos/indra/indra/sources/ubibrowser/api.py:65, in process_df(e3_df, dub_df)
     49 """Process data frames containing UbiBrowser data.
     50 
     51 Parameters
   (...)
     62     extracted in its statements attribute.
     63 """
     64 up = UbiBrowserProcessor(e3_df, dub_df)
---> 65 up.extract_statements()
     66 return up

File ~/repos/indra/indra/sources/ubibrowser/processor.py:16, in UbiBrowserProcessor.extract_statements(self)
     13 for df, stmt_type in [(self.e3_df, Ubiquitination),
     14                       (self.dub_df, Deubiquitination)]:
     15     for _, row in df.iterrows():
---> 16         stmt = self._process_row(row, stmt_type)
     17         if stmt:
     18             self.statements.append(stmt)

File ~/repos/indra/indra/sources/ubibrowser/processor.py:26, in UbiBrowserProcessor._process_row(row, stmt_type)
     20 @staticmethod
     21 def _process_row(row, stmt_type):
     22     # Note that even in the DUB table the subject of the statement
     23     # is called "E3"
     24     # There are some examples where a complex is implied (e.g., BMI1-RNF2),
     25     # for simplicity we just ignore these
---> 26     if '-' in row['E3AC']:
     27         return None
     28     subj_agent = get_standard_agent(row['E3GENE'], {'UP': row['E3AC']})

[...]

KeyError: 'E3AC'

Inspecting the row variable in the IPython debugger shows that the columns now have different names:

NUMBER                                1
SwissProt ID (E3)            ADO1_ARATH
SwissProt ID (Substrate)    APRR1_ARATH
SwissProt AC (E3)                Q94BT6
SwissProt AC (Substrate)         Q9LKL2
Gene Symbol (E3)                   ADO1
Gene Symbol (Substrate)           APRR1
SOURCE                          MEDLINE
SOURCEID                       22199232
SENTENCE                          E3Net
E3TYPE                            Other
COUNT                                 1
type                              Other
species                      A.thaliana
Name: 0, dtype: object
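
For illustration, here is a minimal sketch of a header check that could run before rows are processed, so a renamed column surfaces as one readable message instead of a KeyError deep in the row loop. The helper find_missing_columns and the REQUIRED_COLUMNS tuple are hypothetical names and not part of INDRA. The two required names are simply the ones the traceback shows being read, and the header list is copied from the row dump above.

```python
# Hypothetical helper, not part of INDRA: report which expected
# column names are absent from a table header.
REQUIRED_COLUMNS = ("E3AC", "E3GENE")  # the columns read in the traceback


def find_missing_columns(columns, required=REQUIRED_COLUMNS):
    """Return the required column names absent from `columns`, in order."""
    present = set(columns)
    return [name for name in required if name not in present]


# Header of the current download, copied from the row dump above.
new_header = [
    "NUMBER",
    "SwissProt ID (E3)",
    "SwissProt ID (Substrate)",
    "SwissProt AC (E3)",
    "SwissProt AC (Substrate)",
    "Gene Symbol (E3)",
    "Gene Symbol (Substrate)",
    "SOURCE",
    "SOURCEID",
    "SENTENCE",
    "E3TYPE",
    "COUNT",
    "type",
    "species",
]

missing = find_missing_columns(new_header)
if missing:
    print(f"Unexpected UbiBrowser header, missing columns: {missing}")
```

With a pandas data frame, the same check could be called as find_missing_columns(e3_df.columns).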
@bgyori

bgyori commented Sep 8, 2023

It sounds like they renamed their columns, changing E3AC to SwissProt AC (E3), so we will have to update the code accordingly.
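
One way to handle the rename, sketched under assumptions: look each field up under either the new or the old header. The names COLUMN_ALIASES and get_field are hypothetical and this is not necessarily how the actual fix was done. Only the E3AC mapping is stated in this thread. The E3GENE mapping to Gene Symbol (E3) is inferred from the row dump above.

```python
# Hypothetical sketch, not the actual INDRA fix.
# Candidate column names per field, newest first.
COLUMN_ALIASES = {
    # Stated in this thread: E3AC became "SwissProt AC (E3)".
    "e3_ac": ("SwissProt AC (E3)", "E3AC"),
    # Assumption inferred from the row dump: E3GENE became "Gene Symbol (E3)".
    "e3_gene": ("Gene Symbol (E3)", "E3GENE"),
}


def get_field(row, field, aliases=COLUMN_ALIASES):
    """Return the value of `field` from `row` under whichever alias exists.

    `row` can be any mapping; a pandas Series also works here because
    `name in series` tests membership in its index labels.
    """
    for name in aliases[field]:
        if name in row:
            return row[name]
    raise KeyError(f"None of {aliases[field]} found in row")


# Values taken from the row shown earlier in this issue.
new_row = {"SwissProt AC (E3)": "Q94BT6", "Gene Symbol (E3)": "ADO1"}
# The same values under the old headers, for comparison.
old_row = {"E3AC": "Q94BT6", "E3GENE": "ADO1"}

print(get_field(new_row, "e3_ac"), get_field(old_row, "e3_ac"))
```

Renaming the columns once after loading would be the simpler option if the old format never needs to be read again.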

@bgyori

bgyori commented Sep 9, 2023

I fixed all the issues on the db-sources-updates branch.

@kkaris

kkaris commented Sep 20, 2023

Resolved in #1423.

kkaris linked a pull request on Sep 20, 2023 that will close this issue