Joins are mysterious processes.
If I'm doing a left join, for example, I often want to know if all of the values on my left and right datasets found a match. It can also be helpful to know if there were duplicates in my right dataset that tried to match. These little warning messages often help me catch problems in the data – especially Census data where different files often contain strange geographies that I don't expect but want to know about. A number of years ago, I made a join utility that includes these kinds of reports and messages. I'd like to keep all of my data transformation steps in an arquero flow, however, so it would be neat if the library provided this kind of feedback.
I'm not sure what the best API would be for this and leave that up for discussion. Probably an option such as `table.join(other, 'keyShared', { report: true })` would work. In my joiner library, output comes in both machine readable objects and human-friendly sentences. If you wanted to go that route, you could have something like `table.join(other, 'keyShared', { report: 'data' })` or `table.join(other, 'keyShared', { report: 'human' })`. (Localization could be an issue but these sentences are pretty simple and given that arquero uses English verbs, the linguistic complexity isn't more than what already exists.)
I'd be happy to start a PR for this if it sounds like a useful direction for you.
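To make the idea concrete, here is a rough sketch of what such a report could contain. It is only an illustration under stated assumptions: `joinReport` is a hypothetical helper, not part of Arquero, and the field names (`leftUnmatched`, `rightUnmatched`, `rightDuplicates`) and the sample rows are invented for the example. It works on plain arrays of row objects (which, as far as I know, `table.objects()` can produce) and returns both a machine-readable object and a human-friendly sentence.

```javascript
// Hypothetical helper (not part of Arquero): describes how a left join on a
// single shared key would match, given plain arrays of row objects.
function joinReport(leftRows, rightRows, key) {
  // Count how often each key value occurs in the right dataset.
  const rightCounts = new Map();
  for (const row of rightRows) {
    rightCounts.set(row[key], (rightCounts.get(row[key]) || 0) + 1);
  }

  // Distinct key values in the left dataset.
  const leftKeys = new Set(leftRows.map(row => row[key]));

  const data = {
    // Left key values with no partner on the right.
    leftUnmatched: [...leftKeys].filter(k => !rightCounts.has(k)),
    // Right key values with no partner on the left.
    rightUnmatched: [...rightCounts.keys()].filter(k => !leftKeys.has(k)),
    // Right key values that occur more than once and match a left row.
    rightDuplicates: [...rightCounts]
      .filter(([k, n]) => n > 1 && leftKeys.has(k))
      .map(([k]) => k)
  };

  // Human-friendly summary built from the same counts.
  const human = [
    `${data.leftUnmatched.length} of ${leftKeys.size} distinct left keys found no match.`,
    `${data.rightUnmatched.length} of ${rightCounts.size} distinct right keys found no match.`,
    `${data.rightDuplicates.length} matched right keys appear more than once.`
  ].join(' ');

  return { data, human };
}

// Made-up sample rows, for illustration only.
const left = [{ fips: '01' }, { fips: '02' }, { fips: '03' }];
const right = [
  { fips: '01', pop: 10 },
  { fips: '01', pop: 11 },
  { fips: '99', pop: 5 }
];

const report = joinReport(left, right, 'fips');
console.log(report.data);
console.log(report.human);
```

With a `report` option, `'data'` could return something like the `data` object and `'human'` something like the sentence, but the exact shape is open for discussion.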