Skip to content

Conversation

ydylla
Copy link

@ydylla ydylla commented Oct 3, 2021

This maps deserialized enum integers into real python enum instances.

Closes #157, #169

ydylla added 2 commits October 3, 2021 17:15
Fallback to int like before, not really happy with this but no other idea.
@ydylla
Copy link
Author

ydylla commented Oct 3, 2021

For the default enum handling I had to add a fallback to int like it was before, in case the enum class does not have a value defined for 0.
The FieldDescriptorProto from the example.proto would otherwise generate this stacktrace when running poe generate -v:

Generated output for 'nested'
Traceback (most recent call last):
  File "G:\Code\python-betterproto\venv\Scripts\\protoc-gen-python_betterproto", line 5, in <module>
    main()
  File "G:\Code\python-betterproto\src\betterproto\plugin\main.py", line 25, in main
    request.parse(data)
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 949, in parse
    value = self._postprocess_single(
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 889, in _postprocess_single
    value = cls().parse(value)
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 949, in parse
    value = self._postprocess_single(
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 889, in _postprocess_single
    value = cls().parse(value)
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 949, in parse
    value = self._postprocess_single(
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 889, in _postprocess_single
    value = cls().parse(value)
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 953, in parse
    current = getattr(self, field_name)
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 653, in __getattribute__
    value = self._get_field_default(name)
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 818, in _get_field_default
    return self._betterproto.default_gen[field_name]()
  File "G:\Code\python-betterproto\src\betterproto\__init__.py", line 841, in <lambda>
    return lambda: t(0)  # Enums always default to zero.
  File "E:\Python3\lib\enum.py", line 339, in __call__
    return cls.__new__(cls, value)
  File "E:\Python3\lib\enum.py", line 662, in __new__
    raise ve_exc
ValueError: 0 is not a valid FieldDescriptorProtoLabel
--python_betterproto_out: protoc-gen-python_betterproto: Plugin failed with status code 1.

@roblabla
Copy link
Contributor

roblabla commented Oct 18, 2021

I think this approach is not ideal. Before this PR, python-betterproto generates types that it doesn't respect. E.G. it may generate code like:

class TestEnum(betterproto.Enum):
    ZERO = 0
    ONE = 1

@dataclass
class Foo(betterproto.Message):
    bar: TestEnum = betterproto.enum_field(1)

But when parsing Foo, bar will not be a TestEnum, but an int. This makes errors when accessing TestEnum fields (like name) cause runtime errors that cannot be caught by tools like mypy.


With this PR, the behavior is changed so bar is a TestEnum most of the time, but:

  • If the enum has no 0 value, bar will still be an int when left unset to its default value
  • If we attempt to deserialize an enum with a value outside its expected range, we will end up with an error

I think both of those are suboptimal. The first case means we still lie to the type system, and the second goes directly against the Proto3 Language Guide:

During deserialization, unrecognized enum values will be preserved in the message, though how this is represented when the message is deserialized is language-dependent. In languages that support open enum types with values outside the range of specified symbols, such as C++ and Go, the unknown enum value is simply stored as its underlying integer representation. In languages with closed enum types such as Java, a case in the enum is used to represent an unrecognized value, and the underlying integer can be accessed with special accessors. In either case, if the message is serialized the unrecognized value will still be serialized with the message.

What I think would work best would be to change betterproto.Enum's constructor to accept arbitrary numbers, and have name return None for values that fall outside the expected range. I'm not sure if this is possible with python's enum.IntEnum however. Maybe a larger change should be considered that moves away from enum.IntEnum towards something that functions closer to an open enum instead of a closed set.

@ydylla
Copy link
Author

ydylla commented Oct 18, 2021

What I think would work best would be to change betterproto.Enum's constructor to accept arbitrary numbers, and have name return None for values that fall outside the expected range.

I 100% agree with you @roblabla. I already experimented with this approach but could not get it to work properly.
But I stopped spending time on it, because for my current project this implementation was good enough.

@Gobot1234
Copy link
Collaborator

I might be able to help in the enums department and whip out an old PR or 2 and make a custom implementation for Enum which is more open (and more performant).

@smalyala
Copy link

Hi, thanks for creating this PR. I'm using a hacky workaround for mapping deserialized ints to enum but this would be much better. Any update on when this PR will be ready to merge?

@Gobot1234
Copy link
Collaborator

I'm going to give @roblabla's suggestion of implementing Enum as an open set a go hopefully over the weekend with these changes included.

@smalyala
Copy link

Hi @Gobot1234. Any progress on this?

@Gobot1234
Copy link
Collaborator

I have the changes implemented I just need to make a pr

@Gobot1234
Copy link
Collaborator

Closing this in favour of #293

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enum doesn't get mapped on receive
4 participants