Added plugin to insert and update sources from scienti database #7

Merged: 4 commits merged on Aug 7, 2023

Conversation

muzgash
Contributor

@muzgash muzgash commented Jul 31, 2023

No description provided.

@muzgash muzgash requested review from omazapa and restrepo July 31, 2023 18:13
Contributor

@omazapa omazapa left a comment

Kahi_scienti_sources/README.md (resolved)
@muzgash muzgash requested a review from omazapa August 1, 2023 21:11
Contributor

@omazapa omazapa left a comment

Hi @muzgash

With the wrong collection name, the run was logged as OK. After fixing the collection name to product, the plugin would not run. I then deleted the log collection, it started running, and I got this:

 kahi_run --workflow flow.yml --verbose 5
OrderedDict([('config', OrderedDict([('database_url', 'localhost:27017'), ('database_name', 'kahi'), ('log_database', 'kahi_log'), ('log_collection', 'log')])), ('workflow', OrderedDict([('scienti_sources', [OrderedDict([('database_url', 'localhost:27017'), ('database_name', 'scienti_udea_2022'), ('collection_name', 'product')]), OrderedDict([('database_url', 'localhost:27017'), ('database_name', 'scienti_uec_2022'), ('collection_name', 'product')]), OrderedDict([('database_url', 'localhost:27017'), ('database_name', 'scienti_univalle_2022'), ('collection_name', 'product')])])]))])
Log retrieved from database
[]
Loading plugin: kahi_scienti_sources
Running plugin: kahi_scienti_sources
Plugin scienti_sources failed
Traceback (most recent call last):
  File "/home/ozapatam/.local/bin/kahi_run", line 37, in <module>
    kahi.run()
  File "/home/ozapatam/.local/lib/python3.11/site-packages/kahi/Kahi.py", line 126, in run
    status = run()
             ^^^^^
  File "/home/ozapatam/.local/lib/python3.11/site-packages/kahi_scienti_sources/Kahi_scienti_sources.py", line 165, in run
    self.process_scienti(config, verbose=5)
  File "/home/ozapatam/.local/lib/python3.11/site-packages/kahi_scienti_sources/Kahi_scienti_sources.py", line 131, in process_scienti
    "from_date": int(dt.strptime(paper["DTA_CREACION"], "%a, %d %b %Y %H:%M:%S %Z").timestamp()),
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '2008-11-23 12:19:44' does not match format '%a, %d %b %Y %H:%M:%S %Z'

I think that if the log was created because of an error and I want to run again with the fixed workflow, there should be an option to force the execution and reset the log.
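The ValueError in the traceback above comes from a DTA_CREACION value ('2008-11-23 12:19:44') that does not match the single format passed to strptime. One possible way to tolerate both shapes is sketched below. The helper name parse_creation_date and the CANDIDATE_FORMATS tuple are hypothetical and not part of the plugin, and treating the values as UTC is an assumption made only so the result is deterministic.

```python
from datetime import datetime, timezone

# Formats to try, in order. The first is the one used in the strptime call
# shown in the traceback; the second matches the value that failed
# ('2008-11-23 12:19:44'). Assumption: no other formats occur in the data.
CANDIDATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S %Z",
    "%Y-%m-%d %H:%M:%S",
)


def parse_creation_date(value):
    """Return a Unix timestamp (int) for a DTA_CREACION string.

    Hypothetical helper: tries each known format and raises ValueError
    only when none of them matches.
    """
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        # Assumption: the stored values are interpreted as UTC.
        return int(parsed.replace(tzinfo=timezone.utc).timestamp())
    raise ValueError(f"unrecognized DTA_CREACION format: {value!r}")
```

With this sketch, both "2008-11-23 12:19:44" and "Sun, 23 Nov 2008 12:19:44 GMT" parse to the same timestamp, and any other shape still fails loudly instead of being silently skipped.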

scienti_sources:
  - database_url: localhost:27017
    database_name: scienti_111
    collection_name: products
Contributor

The collection is product. When it is left as products, the code does not produce a warning or anything similar.
It would be good to get a message if the collection is not the right one or is empty.
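A minimal sketch of the kind of check being asked for, assuming the plugin has a pymongo Database object at hand. The function name check_source_collection is hypothetical and is not taken from the plugin. It relies only on list_collection_names() and estimated_document_count(), and it emits a warning rather than raising, so whether a missing collection should abort the run stays an open design choice.

```python
import warnings


def check_source_collection(db, collection_name):
    """Warn when a configured source collection is missing or empty.

    `db` is expected to behave like a pymongo Database: it must provide
    list_collection_names() and item access returning a collection that
    has estimated_document_count(). Returns True when the collection
    exists and holds at least one document.
    """
    if collection_name not in db.list_collection_names():
        warnings.warn(f"collection {collection_name!r} not found in database")
        return False
    if db[collection_name].estimated_document_count() == 0:
        warnings.warn(f"collection {collection_name!r} is empty")
        return False
    return True
```

With pymongo installed, a call could look like check_source_collection(MongoClient("localhost:27017")["scienti_111"], "products"), which would warn if only product exists in that database.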

Contributor Author

If the collection does not exist, that is something the user has to be sure about. That is not an error; even MongoDB treats it this way. Which cases should we treat as an error? All of them, none, this one in particular? Is this particular issue reason enough to abort the pull request?

scienti_sources:
  - database_url: localhost:27017
    database_name: scienti_111
    collection_name: products
Contributor

Same here and below: the collection name should be "product".

@muzgash muzgash merged commit e00ca9c into main Aug 7, 2023
0 of 4 checks passed
2 participants