Exploration: roadmap for explorers and mdims #3992
Comments
I spent a good chunk of time browsing various explorers, and whoa... this isn't going to be easy. It feels like every explorer is unique, and there's no obvious way to have a single approach for everything. The only thing I can confidently say is that CSV-based explorers are bad (though that alone doesn't justify spending time migrating them). I'm still wrapping my head around everything, so take the following notes with a grain of salt.

1. MDIMs and Explorers Should Come from ETL

The main question is whether we'd allow editing explorers from Admin or not. If yes, we'd need either some kind of "override" in the Admin layer (either in [...]

Explorers with many combinations, like minerals, are well suited for ETL, but more bespoke explorers, like migration, are much more complex. Then again, some people prefer YAML, while others prefer Python, and it's unclear whether we should enforce a single approach.

2. Standardize the Tooling Used in Explorers and MDIMs

@lucasrodes has already done this with the COVID explorer and COVID MDIM. The explorer YAML representation is really close to MDIMs. I can imagine generating a similar config file that could power both MDIMs and (indicator-based) explorers. If we can make it work for COVID, where we're already pretty close, then it should be doable for anything. But does this grand unification bring enough value? I guess we need a couple more MDIMs to better decide where to put our energy.

Appendix

Some explorers I found interesting:
Thanks for the summary, @Marigold! You touch on very valid points.

Just to disclose my bias up front, my dream is to migrate all explorers and have a standardized way of doing things in the MDIM/explorer space, as we have for data steps. My take is that this might not provide much value in the short term, but it will in the long term. I'm especially concerned with the update flow, where I think we should assume that everything is ETL-powered. So I don't think this is super urgent, but it is a goal that would be great to reach in, say, 1-2 years' time.

In general, I think that deprecating CSV-based (and chart-based) explorers will help us maintain our infrastructure in the long run. When developing tools, it's annoying to account for all these edge cases that do not come from ETL.

1. MDIMs and Explorers Should Come from ETL

I think we should probably create an issue listing all explorers and rank them somehow by type or complexity. Also, whenever attempting to "migrate" one, we should advertise it to avoid conflicts with other edits. One risk here is that the data scientist in charge of the explorer might be used to their current pipeline, so we should make sure that the new indicator-based approach is easy to understand and comes with appropriate tooling. I think it could make sense to do this after agreeing on some templating (as in MDIMs) in point 2 below.

2. Standardize the Tooling Used in Explorers and MDIMs

I am happy to look at the COVID explorer again and see how the MDIM tooling/approach can be applied there. I think we may need some engineering work here to add some of the features that we now have on MDIMs (being able to reference indicators by catalogPath, display settings per view, etc.). Basically, it'd be nice to improve the explorer config API on the engineering side and align it with MDIMs a bit.
One-liner
Define our ETL workflow for Explorers and MDIMs while unifying tooling as much as possible.
(previous context: #3969)
Context: MDIM vs Explorers
We have different kinds of similar objects in etl/owid-content:
While we want to adopt more and more MDIM pages, we will still have explorers around. This is because both objects are, conceptually, different things:
Therefore, we need to improve the data workflow to support both products.
Goals
1. MDIMs and Explorers should come from ETL
Given the context explained above, and after various discussions, we agree that we should move towards having both explorers and MDIMs be ETL-based (export://explorers/ and export://multidim/, respectively). owid-content should be generated automatically from ETL export://explorers steps.
2. Standardize the tooling used in explorers and MDIMs
These two objects are very similar, and ideally, they should rely on standard tooling to minimize the maintenance burden. This implies some additional transition work in the coming months.
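To make the idea of shared tooling concrete, here is a minimal sketch of one config object that expands dimension choices into views and then renders them for either target. Everything in it is an assumption made for illustration: the class names, the `table_path` value, the `path#column` indicator naming, and the output dict shapes are hypothetical and do not reflect the actual ETL, MDIM, or explorer schemas. Only the two step prefixes are taken from the text above.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Dimension:
    """One user-facing dropdown (hypothetical shape)."""
    slug: str
    choices: list[str]


@dataclass
class CollectionConfig:
    """Hypothetical shared config; NOT the real MDIM or explorer schema."""
    name: str
    table_path: str  # assumed catalog path of the source table
    dimensions: list[Dimension] = field(default_factory=list)

    def views(self) -> list[dict]:
        """Expand every combination of dimension choices into one view."""
        slugs = [d.slug for d in self.dimensions]
        result = []
        for combo in product(*(d.choices for d in self.dimensions)):
            result.append({
                "dimensions": dict(zip(slugs, combo)),
                # Assumed naming convention: table path + "#" + joined choices.
                "indicator": f"{self.table_path}#{'__'.join(combo)}",
            })
        return result


# Step prefixes quoted from the issue text; the dict layout is an assumption.
STEP_PREFIX = {"explorer": "export://explorers/", "mdim": "export://multidim/"}


def render(config: CollectionConfig, kind: str) -> dict:
    """Render the same views for either target ("explorer" or "mdim")."""
    return {
        "kind": kind,
        "step_prefix": STEP_PREFIX[kind],
        "name": config.name,
        "views": config.views(),
    }


# Example usage with made-up dimensions and a made-up table path.
config = CollectionConfig(
    name="example",
    table_path="hypothetical/table",
    dimensions=[
        Dimension("metric", ["cases", "deaths"]),
        Dimension("interval", ["daily", "weekly", "cumulative"]),
    ],
)
explorer = render(config, "explorer")
mdim = render(config, "mdim")
```

In this sketch both outputs are derived from the same `views()` call, so a new dimension choice would be added in one place only. Whether the real explorer and MDIM formats are close enough for this to work is exactly the open question discussed in this issue.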
3. Create a pleasant workflow experience for data scientists