An experiment in generating an empty list default for multivalued slots in a LinkML model.
It is difficult or impossible in LinkML to specify that you want an empty list
as the default value for a slot that is both required and multivalued. This
repository contains some materials to help demonstrate this, as well as a
workaround.
For more context, see the following issues:
- If you are using Nix flakes, run `nix develop` to install dependencies. Otherwise, ensure you have Python 3.10 installed.
- Create a virtual environment with `python -m venv venv` and activate it with `. ./venv/bin/activate`.
- Install the Python dependencies with `pip install -r requirements.txt`.
- Generate a Pydantic model by running `gen-pydantic personinfo_busted.yaml >personinfo_busted.py`. Note that setting both `multivalued=true` and `required=true` correctly infers the type of the `aliases3` field as `List[str]` (rather than `Optional[List[str]]` as for `aliases2`, which is not required), but that it is not possible to set a default value of `[]` directly (the Pydantic generator does not have a way to encode list values in the `ifabsent` attribute; attempts to do so generate the string `"[]"`).
- Generate another Pydantic model by running `gen-pydantic personinfo_workaround.yaml >personinfo_workaround.py`. Note that this model sets a globally unique but incorrectly typed placeholder default (the string `"aliases3dummy"`) on `aliases3`.
- Run `gen-pydantic personinfo_workaround.yaml | sed s/\"aliases3dummy\"/[]/ >personinfo_workaround.py`. Note that the definition of `aliases3` now correctly has an empty list as its default value.
- Fire up Python and run this script:

      from personinfo_workaround import Person
      p = Person(name2='Ada Byron', aliases2=['Lady Byron'])
      p

  Note that `p` contains the expected values for all fields, including a default `[]` for `aliases3`.
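The `sed` substitution above can also be expressed in Python, which may be convenient where `sed` is unavailable. The following is a minimal sketch, assuming the placeholder appears in the generated source exactly as the quoted string `"aliases3dummy"`. The function name `patch_placeholder` and the sample line are hypothetical and are not part of this repository or of LinkML; the actual generated code may look different.

```python
def patch_placeholder(
    source: str,
    placeholder: str = '"aliases3dummy"',
    replacement: str = "[]",
) -> str:
    """Replace the first occurrence of the placeholder on each line.

    This mirrors the behaviour of sed's s/old/new/ without the g flag.
    """
    if placeholder not in source:
        # Extra safety that the sed version does not have (an assumption
        # about what is useful): fail loudly if there is nothing to patch.
        raise ValueError(f"placeholder {placeholder} not found in source")
    patched_lines = [
        line.replace(placeholder, replacement, 1)
        for line in source.splitlines(keepends=True)
    ]
    return "".join(patched_lines)


# Hypothetical sample line; the real generator output may be shaped differently.
sample = 'aliases3: List[str] = Field("aliases3dummy")\n'
print(patch_placeholder(sample), end="")
```

Unlike the `sed` pipeline, this sketch raises an error when the placeholder is missing, so a change in the generator's output format would be noticed rather than silently passed through.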
The need to post-process like this is unfortunate, but seems to be necessary due to unresolved issues with defaults for multivalued slots in LinkML ([1], [2]).
It is possible to generate correct Pydantic models from LinkML schemata featuring required, multivalued attributes with a list-valued default, but it currently requires a transformation of the generated code.