-
Notifications
You must be signed in to change notification settings - Fork 524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenEnum
: an enum to represent protobuf's enumeration field values
#1079
base: master
Are you sure you want to change the base?
Conversation
Please add a description that explains your proposed change. That will make reviewing easier. |
Do you think that this logic could be applied to the oneof field ? |
I believe it works differently for oneofs: if an unknown field number is encountered in a message and it's not a known oneof variant or a regular field, the field is ignored. So the oneof field that would get the value if the variant field were described in the proto would get |
Will this break compatibility with proto2/closed enums? Personally, I don't use proto2, so I am not sure whether it is properly supported at all. |
I think it makes sense to provide a migration guide. |
What is the error message for a |
I would like to see some tests for |
prost-derive/src/field/map.rs
Outdated
self.#ident.get(#take_ref key).cloned().and_then(|x| { x.known() }) | ||
} | ||
#[doc=#insert_doc] | ||
pub fn #insert(&mut self, key: #key_ty, value: #ty) -> ::core::option::Option<#ty> { | ||
self.#ident.insert(key, value as i32).and_then(|x| { | ||
let result: ::core::result::Result<#ty, _> = ::core::convert::TryFrom::try_from(x); | ||
result.ok() | ||
}) | ||
self.#ident.insert(key, value.into()).and_then(|x| { x.known() }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it beneficial to provide a get
and insert
function with the new wrapper? I feel like the map can be used directly with the helper functions provided by OpenEnum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As can be seen in the generated code itself, these methods provide considerable convenience for the most common use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps the insert method could be changed to return the possible old value as Option<OpenEnum<#ty>>
so the caller can choose the fallback behavior for unknown values, but that API would be inconsistent with the get method. If it's to be changed so, the only added convenience there would be the .into()
conversion call.
prost-derive/src/field/scalar.rs
Outdated
pub fn #get(&self) -> #ty { | ||
::core::convert::TryFrom::try_from(self.#ident).unwrap_or(#default) | ||
self.#ident.unwrap_or(#default) | ||
} | ||
|
||
#[doc=#set_doc] | ||
pub fn #set(&mut self, value: #ty) { | ||
self.#ident = value as i32; | ||
self.#ident = value.into(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it still beneficial to provide a get
, set
and push
functions with the new wrapper? I feel like the field can be used directly with the helper functions provided by OpenEnum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is some convenience in the setter method:
msg.set_kind(Kind::Foo)
is nicer and more type-ahead friendly than
msg.kind = Kind::Foo.into()
The same applies to the push method for repeated fields.
The getter hides a bit much opinionated behavior, in my opinion, so it might be better to leave it to OpenEnum
helper methods to explicitly decide on how unknown values should be treated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing to consider here is porting the existing code. If these methods get used instead of direct access to fields, which I assume happens a lot now because the fields are just i32
or Option<i32>
, updating to the version that generates OpenEnum
would not require changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the better API is to have some sort of set
operation on OpenEnum
. Maybe something like:
msg.kind.set(Kind::Foo)
Well, this change will break existing code in many ways, so I think we should directly go for the best API
prost-derive/src/field/scalar.rs
Outdated
@@ -555,13 +550,13 @@ impl Ty { | |||
Ty::Bool => quote!(bool), | |||
Ty::String => quote!(&str), | |||
Ty::Bytes(..) => quote!(&[u8]), | |||
Ty::Enumeration(..) => quote!(i32), | |||
Ty::Enumeration(..) => unreachable!("an enum should never be queried for its ref type"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain me why it should never be queried by ref?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is preempted by logic in prost-derive. For one example, the rust_ref_type
method is used on map field's key type, which cannot be constructed as an enum.
prost-derive/src/field/scalar.rs
Outdated
} | ||
} | ||
|
||
pub fn module(&self) -> Ident { | ||
match *self { | ||
Ty::Enumeration(..) => Ident::new("int32", Span::call_site()), | ||
Ty::Enumeration(..) => Ident::new("enumeration", Span::call_site()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to use self.as_str()
for enumerations as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, that would give "enum", but we can't use that as a module name. OTOH, maybe the current behavior of Ty::as_str
should be changed: "enum" is not a real type name anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the result of as_str
to "enumeration" will change the output of Debug
and Display
for Ty
. I don't know if this is a problem; it does not break any tests.
for value in values { | ||
encode(tag, value, buf); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could int32::encode_repeated
be reused here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, unless int32::encode_repeated
is generalized to take an impl IntoIterator
that produces i32
. Which might be a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that would need a breaking change for encode_repeated
which I don't think is worth the little code duplication at this point.
if values.is_empty() { | ||
return; | ||
} | ||
|
||
encode_key(tag, WireType::LengthDelimited, buf); | ||
let len: usize = values | ||
.iter() | ||
.map(|value| encoded_len_varint(value.to_raw() as u64)) | ||
.sum(); | ||
encode_varint(len as u64, buf); | ||
|
||
for value in values { | ||
encode_varint(value.to_raw() as u64, buf); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could int32::encode_packed
be reused here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, not worth changing the signature of int32::encode_packed
for this.
prost/src/encoding.rs
Outdated
if wire_type == WireType::LengthDelimited { | ||
// Packed. | ||
merge_loop(values, buf, ctx, |values, buf, ctx| { | ||
let mut value = Default::default(); | ||
merge(WireType::Varint, &mut value, buf, ctx)?; | ||
values.push(value); | ||
Ok(()) | ||
}) | ||
} else { | ||
// Unpacked. | ||
check_wire_type(WireType::Varint, wire_type)?; | ||
let mut value = Default::default(); | ||
merge(wire_type, &mut value, buf, ctx)?; | ||
values.push(value); | ||
Ok(()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could int32::merge_repeated
be reused here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, definitely not. This function merges into a passed in &mut Vec<OpenEnum<T>>
.
key_len(tag) * values.len() | ||
+ values | ||
.iter() | ||
.map(|value| encoded_len_varint(value.to_raw() as u64)) | ||
.sum::<usize>() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could int32::encoded_len_repeated
be reused here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not without breaking changes, as explained above.
if values.is_empty() { | ||
0 | ||
} else { | ||
let len = values | ||
.iter() | ||
.map(|value| encoded_len_varint(value.to_raw() as u64)) | ||
.sum::<usize>(); | ||
key_len(tag) + encoded_len_varint(len as u64) + len | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could int32::encoded_len_packed
be reused here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not without breaking changes, as explained above.
let mut raw = 0; | ||
<i32 as Message>::merge_field(&mut raw, tag, wire_type, buf, ctx)?; | ||
*self = OpenEnum::from_raw(raw); | ||
Ok(()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same function as encoding::enumeration::merge
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, <i32 as Message>::merge_field
calls to skip_field
if the tag argument is not 1.
I believe prost has never been conformant with closed enums: the raw values have been left in the message as is.
I prefer option 2, even though it requires more work. Protobuf edition 2023 has closed enums as a feature, so they need to be supported even if we ignore proto2. |
What would be a good place for it? I can add a section to the README. |
I'm going to add at least one good example/doctest on the type. |
Have thought some more about this PR: I don't want to break users in this way. At least not now. I suggest making this an option in Once we feel good about the new API, we can think about changing the default. Please look at bytes for a good example of changing the generated data type. |
Instead of introducing a new type, a simple |
It's weird to have a Another benefit of introducing a new type is for the add-on macros and code generators that derive something on structs generated by prost-build. These could deal with the |
How are default values handles in this solution? Especially default values set for a specific field in the proto file. |
Good question. Maybe there should be a const generic parameter for this, but it will then crop up everywhere. Updated: explained below |
9ec7c5d
to
f76175a
Compare
prost-derive/src/field/scalar.rs
Outdated
@@ -799,7 +749,7 @@ impl DefaultValue { | |||
|
|||
pub fn typed(&self) -> TokenStream { | |||
if let DefaultValue::Enumeration(_) = *self { | |||
quote!(#self as i32) | |||
quote!(::prost::OpenEnum::from(#self)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This handles the custom default values for message fields.
The generated getter methods handled per-field defaults for optional fields. For non-optional fields, the default is initialized before merging field values from the wire. I'm not sure how best to proceed here. Suppose we have this generated message: pub struct Msg {
#[prost(enumeration = "AnEnum", optional, default = "AnEnum::B")]
pub field: Option<OpenEnum<AnEnum>>
} It would be nice to have API implementing the default value behavior for proto2 and edition 2023. The protobuf guide recommends a getter method that returns the default value, but whether out-of-range values are exposed is up to the |
The wrapper to represent field of enumerated types with the possibility of unknown values.
Replace i32 with the type-checked wrapper.
These are meant to be the primary ways to get and set open enumeration fields, instead of the getters and setters generated by Message derive.
f76175a
to
cb5e59c
Compare
cb5e59c
to
905fbc8
Compare
Add typed_enum_fields method to prost-build configuration, which allows type-checked representation of enumerations in fields of message structs and variants of oneof enums. The argument and the invocation order works like with the boxed method. Depending on the syntax (and preparing for the future support of editions), the type-checked representation can be closed (for proto2) or open (for proto3). The former is represented by the generated enum type itself, while the latter is represented by OpenEnum wrapping the enum type. A new enum_type annotation is supported in the prost attribute inside derives, which allows to specify the type-checked representation of enum types in message fields and oneof variants. The accepted values are "open" or "closed".
905fbc8
to
9bd8272
Compare
I have addressed this and most of the other comments in recent commits. There is now a The proto2 behavior (and the closed enums feature in future editions) is also implemented for enums, albeit with some current limitations: state available to the code generator does not track which syntax the enum types were defined in, that is needed for proto2 ← proto3 includes. I'll try to resolve this. Getter methods are retained for the optional fields for convenience and to implement per-field default values. Protobuf guidelines also insist on there being getters returning non-nullable values. |
@caspermeijn please remove the breaking change label as the revised implementation is purely opt-in. |
Another solution for #276, amenable to destructuring.
This changes the representation of enum fields in message structs to this generic wrapper, parameterized over the generated known enum type:
In an improvement over #1061, this allows convenient matching of enum field values as part of the message. There are also convenience methods to fallibly extract the known value in an
Option
orResult
.