Create recipes for TED by topics #930
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@ Coverage Diff @@
##            main     #930     +/-   ##
==========================================
- Coverage  87.98%   87.84%   -0.14%
==========================================
  Files         94       93       -1
  Lines       5327     5307      -20
==========================================
- Hits        4687     4662      -25
- Misses       640      645       +5
Can you explain what this is for, and what the Content Team's actions will be after running this script?
- Will those be individually checked? What's the strategy regarding title/description metadata? Do you want to run them all and let those above 30/80 characters fail?
- What about mul and the static list of languages? Do all topics have all those languages? In that order of importance? If not, how will this be acknowledged and fixed by the content team?
- With enabled=True we are talking about hundreds of runs that will need to be checked. The alternative is to set enabled=False and let the Content team run/review at their own pace.
- How many recipes is this?
This is the script which has been used to generate all TED recipes by topics. The goal is only to not lose it, so that it can be reused later if needed. See openzim/zim-requests#789 as well for answers to some of your questions.
They will be individually checked before being moved to "prod" (library.kiwix.org)
I don't understand this question, sorry
Good point, it reminded me I forgot to create some issues in the TED scraper. This is kind of a hack to retrieve videos in all languages (it is not possible to say "all" in languages, see openzim/ted#171). Clearly not all topics have those languages, and order is not handled either (the scraper should filter and order them, see openzim/ted#172). I don't think this should be acknowledged and fixed by the content team; I consider that it should be fixed by the dev team. Doing it manually while it is mostly straightforward to automate with code is a bit sad.
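To illustrate the kind of filtering and ordering discussed here, a minimal sketch follows. It is not what openzim/ted#172 implements: the function name `order_languages`, the per-language video counts as input, and the choice to read "order of importance" as "number of videos" are all assumptions made for illustration.

```python
def order_languages(video_counts):
    """Drop languages with no videos and sort the rest by video count.

    `video_counts` is a hypothetical mapping of language code to the number
    of videos found for one topic. Ties are broken alphabetically so the
    result is deterministic.
    """
    available = {lang: count for lang, count in video_counts.items() if count > 0}
    return sorted(available, key=lambda lang: (-available[lang], lang))
```

For example, `order_languages({"en": 120, "fr": 40, "de": 0})` would drop `de` and place `en` before `fr`.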
This has already run and allowed us to proceed quickly ^^
355 recipes, 355 ZIMs
Well if it's to be kept for reference, don't request a review!
I was wondering if you wanted to check the length of all those title/desc with the script or just let it run and have the scraper fail if it didn't fit. I understand from the last answer that all recipes created their ZIM so all metadata did fit.
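A pre-flight length check of the kind mentioned here could look like the sketch below. The 30 and 80 limits are taken from the "30/80" figures earlier in this thread; the recipe structure (dicts with `name`, `title` and `description` keys) and the function name are hypothetical and not taken from the script in this PR. Note that `len()` counts Unicode code points, which may differ from how the scraper measures length.

```python
# Limits assumed from the "30/80" figures mentioned in the review.
MAX_TITLE_LEN = 30
MAX_DESCRIPTION_LEN = 80


def find_oversized_metadata(recipes):
    """Return (recipe name, field, length) for every field over its limit."""
    problems = []
    for recipe in recipes:
        title = recipe["title"]
        description = recipe["description"]
        if len(title) > MAX_TITLE_LEN:
            problems.append((recipe["name"], "title", len(title)))
        if len(description) > MAX_DESCRIPTION_LEN:
            problems.append((recipe["name"], "description", len(description)))
    return problems
```

Running this over the generated recipes before creating them would list the offending entries up front, instead of waiting for individual scraper runs to fail.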
👍 I think this should be fixed before moving to prod if we are sending this to
Rationale
This is a maintenance script. It is not expected to be used on a regular basis, but it is still useful to keep and share.