-
Notifications
You must be signed in to change notification settings - Fork 544
CXX-3126 Refactor EVG config to use config_generator #1244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Accidentally dropped extra alignment fixes for valgrind tasks during a rebase. Addressed and verified by this patch. Aside: libmongocrypt is disabled for sanitizer tasks for more accurate current-config reproduction, but we probably want to "revert" this and extend sanitizer tasks to cover CSFLE code paths soon. |
Latest changes verified by this patch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love it. Some minor Python cleanups recommended.
I haven't used uv
yet, but intend to get into it when I have time. Poetry was a significant improvement over pipenv
(which was better than requirements.txt
), but Poetry's implicit virtualenv management is so finicky and getting it to play with pyenv
is a chore. If this works well, I would like to port the same changes over to C and libmongoc.
@classmethod | ||
def call(cls, **kwargs): | ||
return cls.default_call(**kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most versions of call
are like this. Can it be lifted into the Function
base class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC forcing the definition of call
per class was meant to encourage evaluating meaningful parameters for each function, but I do not think that played out as intended. I'm in favor of the reduced boilerplate.
) | ||
|
||
@classmethod | ||
def call(cls, compiler: str, vars: Mapping[str, str] = {}): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python pitfall (mis-feature from the 90s?): Don't use mutable values as default arguments. The value is evaluated and attached to the function definition, meaning that mutating it persists between function calls. Instead, use a None
.
def call(cls, compiler: str, vars: Mapping[str, str] = {}): | |
def call(cls, compiler: str, vars: Mapping[str, str] | None = None): |
vars = vars if vars else {} | ||
|
||
vars |= compiler_to_vars(compiler) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dict operator |=
mutates the object in-place. If the caller passes in a dict, then this function will mutate their copy. Create a clone of the dict up-front:
vars = vars if vars else {} | |
vars |= compiler_to_vars(compiler) | |
vars = dict(vars or {}) | |
vars |= compiler_to_vars(compiler) |
Aside: I'd be surprised if the type checker doesn't flag this, because Mapping
is an immutable type and does not have a |=
operator defined (as opposed to MutableMapping
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL the dict or {}
pattern (which does not evaluate to bool
).
with open(filename.resolve(), 'w', encoding='utf-8') as file: | ||
file.write(yml) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pathlib.Path
has a shorthand:
with open(filename.resolve(), 'w', encoding='utf-8') as file: | |
file.write(yml) | |
with filename.open('w', encoding='utf-8') as file: | |
file.write(yml) |
Or, even shorter in this case:
with open(filename.resolve(), 'w', encoding='utf-8') as file: | |
file.write(yml) | |
filename.write_text(yml, encoding='utf-8') |
] | ||
|
||
ordered = { | ||
field: mapping.pop(field) for field in before if field in mapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend copying mapping
at the top of this func (i.e. mapping = mapping.copy()
) before calling pop
, otherwise this modifies the caller's value.
echo "Configuring with CMake flags:" | ||
for flag in "${cmake_flags[@]}"; do | ||
echo " - ${flag:?}" | ||
done | ||
echo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor shorthand with printf
:
echo "Configuring with CMake flags:" | |
for flag in "${cmake_flags[@]}"; do | |
echo " - ${flag:?}" | |
done | |
echo | |
echo "Configuring with CMake flags:" | |
printf " - %s\n" "${cmake_flags[@]}" | |
echo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my aversion to using printf
was due to discrepancies in printf
behavior across distros, but I do not recall the details (I might be conflating this with some other printf
behavior), and a quick scan of EVG output across all three major OS's (Ubuntu, MacOS, Windows) appears to be just fine, so I will apply the pattern as suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. The config generation, use of display tasks, and fixes to extra alignment tasks are very much appreciated.
from config_generator.etc.utils import bash_exec | ||
|
||
from shrub.v3.evg_build_variant import BuildVariant, DisplayTask | ||
from shrub.v3.evg_command import EvgCommandType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from shrub.v3.evg_command import EvgCommandType |
EvgCommandType
already imported above.
@@ -0,0 +1,30 @@ | |||
#!/usr/bin/env python3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a README file (similar to the C driver README.md) or a comment to config.yml
to document this script can be run through uv
.
from config_generator.components.funcs.compile import Compile | ||
from config_generator.components.funcs.fetch_det import FetchDET | ||
from config_generator.components.funcs.install_c_driver import InstallCDriver | ||
from config_generator.components.funcs.install_c_driver import InstallCDriver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from config_generator.components.funcs.install_c_driver import InstallCDriver |
from config_generator.etc.utils import bash_exec | ||
|
||
from shrub.v3.evg_build_variant import BuildVariant | ||
from shrub.v3.evg_command import EvgCommandType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from shrub.v3.evg_command import EvgCommandType |
from config_generator.components.funcs.setup import Setup | ||
|
||
from config_generator.etc.distros import find_small_distro | ||
from config_generator.etc.function import Function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from config_generator.etc.function import Function |
@@ -0,0 +1,113 @@ | |||
from config_generator.components.funcs.compile import Compile | |||
from config_generator.components.funcs.fetch_c_driver_source import FetchCDriverSource |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from config_generator.components.funcs.fetch_c_driver_source import FetchCDriverSource |
# Python: ImportError: DLL load failed while importing _rust: The specified procedure could not be found. | ||
echo "Preparing CSFLE venv environment... skipped." | ||
exit 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Python: ImportError: DLL load failed while importing _rust: The specified procedure could not be found. | |
echo "Preparing CSFLE venv environment... skipped." | |
exit 0 | |
# Python: ImportError: DLL load failed while importing _rust: The specified procedure could not be found. | |
echo "Preparing CSFLE venv environment... skipped." | |
exit 0 |
To indent.
# Python: ImportError: DLL load failed while importing _rust: The specified procedure could not be found. | ||
echo "Starting mock KMS servers... skipped." | ||
exit 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Python: ImportError: DLL load failed while importing _rust: The specified procedure could not be found. | |
echo "Starting mock KMS servers... skipped." | |
exit 0 | |
# Python: ImportError: DLL load failed while importing _rust: The specified procedure could not be found. | |
echo "Starting mock KMS servers... skipped." | |
exit 0 |
To indent.
Summary
Resolves CXX-3126. Verified by this patch. Reattempt of #1242.
Imports the EVG config generator from the C Driver and converts the current EVG config into generator components. Changes to task matrices are deliberately minimized. Auditing and improving the task matrices is outside the scope of this PR.
Once this PR is merged, the
.mci.yml
file may be deleted in favor of the new.evergreen/config.yml
file (for consistency with the C Driver). They are equivalent given the changes in this PR.Astral uv
The projects managed by our team have seen a steady progression of Python tooling: a pip
requirements.txt
file -> virtual environments (venv
) -> Poetry project configuration (pyproject.toml
) -> Python binary management (pyenv
in./tools/python.sh
). This PR takes this opportunity to take another step by adopting a new Python tool which aims to supercede all such tooling which came prior: uv.Important
This PR does not use uv in Evergreen and does not add any scripts to assist with obtaining and using uv. Users should install uv themselves according to uv's installation instructions as best appropriate for their local development environment. Exploring the use of uv in Evergreen scripts is outside the scope of this PR.
The shockingly impressive convenience of this tool is demonstrated by running the following command, whose only prerequisite is that
uv
is installed and available to use:Note
The
--frozen
argument is needed to avoid updating the lock file when runninguv
commands.Running this single command accomplishes all of the following steps:
pyenv
, system package managers, and manual installation, which is arguably the most frustrating part of maintaining reproducibility of Python scripts..venv
by default) for (re)use by all subsequent Python package operations and script execution, using the Python binary selected in the earlier step.venv
orpoetry
, but goes even further by properly handling Python binary version compatibility through (re)generation of the virtual environment according to the requested Python version (via-p
/--python
).requirements.txt
orpyproject.toml
(Poetry), but goes even further by supporting resolution strategies to easily validate minimum (or maximum) dependency version requirements.The package version requirements in this PR were obtained by initially using inline script metadata and
--resolution lowest-direct
for requirement validation before copying the dependencies into thepyproject.toml
file, demonstrating the power of built-in Python binary management and Python-version-aware virtual environment management:Note
This behavior can be reproduced despite the presence of the project configuration file using a combination of
--no-project
,--isolated
, and--with
:Note
Inline script metadata is not used to specify dependencies due to isolated virtual environments interacting poorly with external tooling which expect a persistent virtual environment to be present. The dependencies are instead (redundantly) listed per associated script in the project configuration file, which may be used to generate a local and persistent virtual environment (
.venv
by default) usinguv sync --frozen
(or as part of auv run --frozen
command).Note
Although testing with
lowest
would be ideal, thelowest-direct
resolution was used instead due to insufficiently-specified requirements in transitive dependencies leading to unresolvable failures. This PR avoids introducing constraints or overrides to keep things simple.Note
The
clang_format.py
script still requires Python 2. Arequires-python
was added to enforce this requirement (noteuv
does not support Python 2):Updating this script to work with Python 3 is deferred, but would make a good case for using
uv tool
/uvx
, which supercedespipx
, e.g.uvx clang-tools -t clang-format -i <version> -d build && ./build/clang-format --version
.Despite uv not yet supporting a stable API, I believe its powerful featureset (and already high popularity) makes it an very valuable tool which we can and should adopt in our projects. This PR hopes to be a leading test case for its eventual adoption by other projects as well.
EVG Config Generator Adjustments
Some adjustments were required to support the C++ Driver EVG config. These adjustments include:
.evergreen
tosys.path
to permit relative imports ofconfig_generator
modules regardless of the current working directory.distros.py
and support for*-latest
distros.distros.py
and extending compiler helpers to help specify corresponding CMake generators and platforms (e.g.vs2019x64
->CMAKE_GENERATOR="Visual Studio 16 2019"
+CMAKE_GENERATOR_PLATFORM="x64"
).teardown_task_can_fail_task
field for task groups.serialize_as_any=True
:EVG Config Migration
Tip
Reviewing this PR by-commit is recommended to more easily compare how individual functions and components are migrated from the old config into their generator component form.
Most functions are translated as-is into their component form with minimal changes. Functions commonly reused by components are given explicit parameters in their
call()
functions to help with consistency and reduce verbosity (e.g.mongodb_version
->TOPOLOGY
,polyfill
->BSONCXX_POLYFILL
, etc.).Scripts under
.evergreen
which are invoked by the EVG config are relocated into the.evergreen/scripts
directory, formatted, given executable permissions, and audited withshellcheck
(excluding the packaging-related scripts). Scripts underetc
are left in their current location.Some lessons learned during the C Driver's migration are applied in this PR. Unlike with the C Driver's generator components, the C++ Driver's components minimize cross-component inclusion and reuse (with the exclusion of function components). Despite leading to some repetition and verbosity across components, this is done to improve separation-of-concerns of components and matrices. In particular, despite its large matrix and complicated generation routine, I believe the single
integration
component is nevertheless more straightforward and understandable than the layeredsasl
->cse
->asan
/tsan
components in the C Driver. Unlike with the C Driver, sanitizer and valgrind tasks are grouped into seperate, completely independent components.Note
Although many tasks are grouped into a display task (e.g. auth-matrix, compile-only-matrix, etc.), this is not done for the integration matrix. This is to facilitate better filtering, selection, and sorting in Spruce, as such operations appear to be limited for members of a display task. Grouping by distro, by build type, and by server version/topology were all considered but reverted in favor of the current no-grouping state. This may be reconsidered if the Spruce UI is improved to better handle filtering/selecting/sorting members of display tasks: see DEVPROD-12402.
Some additional notes:
Extra Alignment
Important
Adding support for configuring the auto-downloaded C Driver with
ENABLE_EXTRA_ALIGNMENT=ON
was considered but rejected to avoid encouraging use of extra alignment; see also CDRIVER-2813 and MONGOCRYPT-725.Validating correct migration revealed that extra alignment tasks were broken by #967. The
BSON_EXTRA_ALIGNMENT
has no affect onfetch_c_driver_source
nor the subsequent call to thecompile
EVG function (despite misleadingly being set for thecompile
function). Extra alignment was therefore not being enabled and tested with auto-downloaded C Driver configurations, as the auto-downloaded C Driver disables extra alignment by default. All current extra alignment tasks use the auto-downloaded C Driver, therefore extra alignment current has zero test coverage.Fixing the enabling of extra alignment (by reverting relevant tasks to using
install_c_driver
withBSON_EXTRA_ALIGNMENT=ON
instead offetch_c_driver_source
) revealed compiler errors due to-Werror
+-Waligned-new
(enabled by-Wall
):This appears to be due to #1049 which removed the
-Wno-aligned-new
flag "due to conflicts with Clang tasks + appearing to cause no trouble for existing tasks (not entirely clear what motivated its addition or if it still applies)". This PR being dated Nov 2023 likely explains why "no trouble" was observed following #967 on Aug 2023. The-Wno-aligned-new
flag is restored with exclusions for Clang using theCC
/CXX
env vars.