Conversation
69941d3 to
6d309d5
Compare
DanSimon
left a comment
There was a problem hiding this comment.
So I think there's one main thing not yet supported in this PR: it should be possible to get a config by name, eg klio_config.data.inputs["my-data-input"].
A while ago I actually took a stab at this issue but forgot about it: https://github.com/spotify/klio/compare/dsimon/config-io-names . Overall I think your approach is better, but I was able to solve the above by writing the KlioIOConfigSubContainer that allows you to access either by index (so it's backwards compatible) or by name.
For some reason I didn't include name as an attr, I can't remember why, but looking again I don't think that was the right approach, so you'll have to adapt my code to account for that (or do it your own way if you have a better idea 😄 ).
| @attr.attrs(frozen=True) | ||
| class KlioPubSubConfig(object): | ||
| name = "pubsub" | ||
| type_name = "pubsub" |
There was a problem hiding this comment.
I think we should use TYPE_NAME in caps to indicate this is more like a class-level constant.
| type_id = type_counters[type_name] | ||
| type_counters[type_name] += 1 | ||
| name = "{}{}".format(type_name, type_id) | ||
| name = count |
There was a problem hiding this comment.
Is there a reason why the type is being dropped from the generated name? I'm not strongly for keeping it but I also don't see any clear reason for removing it.
There was a problem hiding this comment.
I think i remember we discussed as a team and thought there was weirdness since there is another area https://github.com/spotify/klio/blob/develop/exec/src/klio_exec/commands/run.py#L466 where we again append an index to the type. i THINK it's used because multiple event reading requires a unique label (I'll cite this when i find the link) and this generated name was for the unique label. I'll ask the team and confirm if we want to keep or not and I can pluck off that commit if we do indeed want to keep it.
Theres also this TODO written by @econchick that touches on stuff disappearing if we convert to enforcing a dictionary for event inputs and i think the two are related. https://github.com/spotify/klio/blob/develop/exec/src/klio_exec/commands/run.py#L460
Yeah I have this on another branch locally and went down pretty deep into that exploration 😓 i THINK the reason you didnt use attr is cuz to maintain backwards compatibility between |
Description
Currently, inputs and outputs can be given names in klio-job.yaml. If no name is provided then one is generated based on the type of I/O, something like "gcs1".
Right now this name is basically dropped when the KlioConfig object is created. It is removed as part of config pre-processing and klio_config.events.inputs stores each I/O object in a list (so I/O cannot be accessed by name), and if as_dict() is used, "name" will not be included.
There is currently a "name" attribute on the base IOConfig class, however this is a class-level attribute that refers to the string name of the IO config type used in klio-job.yaml, e.g. "gcs" or "pubsub", and so would conflict with actually trying to add a "name" attribute to store the name of the individual instance itself.
This PR disambiguates "type" from "name" to "type_name", adds "name" into the config dict and drops the name generation.
Testing
Checklist for PR author(s)
[cli] Fixes bugs in 'klio job fake-cmd'.docs/src.docs/src/reference/<package>/changelog.rst.