-
Notifications
You must be signed in to change notification settings - Fork 727
Mem value sanitizing #633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mem value sanitizing #633
Conversation
5c50093
to
d8026cf
Compare
src/daemon/daemon.cpp
Outdated
auto mem_size = std::get<0>(mem_disk_name_and_errors); // use structured bindings instead in C++17 | ||
auto disk_space = std::get<1>(mem_disk_name_and_errors); // use structured bindings instead in C++17 | ||
auto instance_name = std::get<2>(mem_disk_name_and_errors); // use structured bindings instead in C++17 | ||
auto option_errors = std::get<3>(mem_disk_name_and_errors); // use structured bindings instead in C++17 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dislike this because there's no clear connection between 0 and mem_size, 1 and disk_space, etc. Sure it works, but when we add an entry in the tuple, or swap entries with same type, it's easy to refer to the wrong index and create a bug, yet the code will still compile.
A quick struct will give names to these entries, establishing a stronger key/value connection. e.g. VirtualMachineDescription
Note, you could correctly point out that we do something very similar in the return statements of validate_create_arguments, to_machine_desc, etc - where we use brace initializers to populate a struct - as again there's no clear key/value connection. I'm not a giant fan of that style either, but at least it is localised to the creation of the struct, and not the consumer, so the person who changes it will (I think) be more aware of the consequences.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good point and, when talking about public functions or functions with multiple uses, I generally agree. But in localized private functions that only exist to offload code from the caller, I am not so sure. There are pros and cons and I am not sure how to proceed in this occasion.
- Currently (with tuple), the code will compile only if the types are compatible with how they are used by the consumer. This would be the same with struct.
- I do agree it is more readable to have the vars named rather than indexed in large tuples. But it is less clear if structs proliferate for every useful return combination. (I wish we had named returns [and params for that matter]).
- Indexes would go away with C++17 (i.e.
auto [mem, disk, name, errors] = validate_create_arguments(...)
). - It is still the ordering that determines what is what, whether the consumer of the returned expression is outside (current case) or within
validate_create_arguments
(if we returned a struct instead). In both cases, code outside of the function is affected by changes in the return. - But, as you say, in one case those changes are properly localized to the producer, while in the other it leaks to its consumer. This is the main argument for the struct approach and what makes it the better choice in public interfaces.
- The cons of the tuple approach are minimized in this particular case, as the function is local to the compilation unit (anonymous namespace) and only separate from the caller in the interest of readability. This is still no reason to preclude a struct, but...
- The goal of all of this code (most of
launch
) is already to initialize a (complete)VirtualMachineDescription
. So I am not sure creating another intermediate description struct is appropriate. - How would I name this struct?
VirtualMahineMemDiskNameAndErrors
?VirtualMachineSomeOptions
?VirtualMachinePartialDescription
? - I suppose the right thing would be for
VirtualMachineDescription
's constructor to receive the request's parameters and derive itself from them. But that sounds like excessive refactoring for the current purposes...
Alternatives:
- input parameters - I tend to dislike those because I like objects to be meaningful from construction (the other [the literal] meaning of RAII).
- partially filling
VirtualMachineDescription
invalidate_create_arguments
- I wouldn't favor this either for the same reason - validate arguments separately (one function for memory, one for disk, etc.) - they would still need to return pairs (because of possible errors). Additionally, to maintain the same behavior as today,
option_errors
would have to be combined after all functions were called. - put all the code directly in launch...
So, @gerboland what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking your points
- as I said, swapping entries of same type will still compile, but create bug. Public/private is irrelevant to that argument, and the greatest strength structs have.
- I'm only taking this case, not every return. Named returns would be ideal, closest is a struct IMO.
- same problem, just replace indexes with key names.
- I'm not saying it's ideal, but least worst option IMO
- not just public IMO
- readability isn't my priority here, avoiding bugs is - especially those possibly introduced later by a different author.
- it's a struct just to pass a list of values with labels. My objection is purely down to the lack of labels with your approach here.
- I dunno, ValidatedVMOptions?
Sorry, my opinion hasn't changed. Note I'm not making a big deal here, it's just code after all :)
I'll let @townsend2010 have his say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do feel like the proposed approach is more prone to bugs accidentally introduced if some other maintainer or contributor down the road added another check here, like for example, validating the number of cores/CPU's as well. What I'm not a fan of, is that the consumer has to know the exact order of the elements that are returned from validate_create_arguments
. With a struct, it only needs to know the struct's member name and the rest still are valid, even if a contributor puts a new member variable anywhere in the struct.
But as @gerboland said, it's just code and it works now, so I wouldn't block on this either:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments. No big deal here either! Just debating an interesting issue with implications in future coding practices :) I actually agree with you to a large extent, but I don't quite follow what you say in 1 and 6.
Suppose I used a struct Foo
, with ctor Foo(mem, disk, name, errors)
. The function would say return Foo{a,b,c,d};
right? Aren't argument swapping bugs as likely? The way I see it, a change in the producer requires a change in the consumer in both cases. The compiler will do the same. What changes is where those responsibilities lie, which is why I mentioned public/private.
A struct-based approach encapsulates the consumer, preventing impact on the function's client(s). But if the function has a single client and is only there to offload code into the anonymous namespace, the benefit is small IMO. When considering also drawbacks (struct proliferation, a la carte types representing no clear concept), I place the ideal compromise a bit differently (barring named returns).
Anyway... I just saw @townsend2010's comment and will modify the code to use the struct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just committed changes with the ordering now encapsulated in a struct. Does that address the concerns?
Replace variable initializations that were previously marked as candidates for structured bindings in C++17.
Use a corrected regular expression to validate memory size values. Accept lowercase suffixes (k, m, g). Improve and extend corresponding tests. Fixes #470.
Test a new memory normalization interface, to simultaneously validate and convert memory values to the smallest unit (bytes).
Implement verification and normalization of memory values in a single step. Use that in the daemon, replacing mere validation. Update hostname validation for consistency. Update tests accordingly. Fixes #616.
Provide a first step to obtain the number of bytes (to compare with other numbers) and a second step to express the same thing as a string that is suitable for backend memory/disk arguments.
Make tests expect error messages in error stream instead.
Makes failing tests pass. Fixes #588.
This attempts to address a review concern.
Use Qt's API rather than the standard library to convert a QString to long long, thus avoiding the intermediate std::string step. This addresses a review concern.
ccf86ef
to
984d4c3
Compare
This currently uses C++17 features (fallthrough attribute and structured bindings), so rebased on that branch. |
std::string disk_space; | ||
std::string instance_name; | ||
mp::LaunchError option_errors; | ||
} ret{mem_size, disk_space, instance_name, option_errors}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't address my concern. There is still no connection between this ordering of params and the order in the consumer. E.g. this will compile, but is wrong:
auto [disk_space, mem_size, instance_name, option_errors] = validate_create_arguments(request);
This is the root of my dislike of this syntactic sugar.
If the params were all different types, the type system would catch errors. So I'm ok with this syntax being used with care. But here, I think using a struct like this would give more safety. Here's my thinking:
https://pastebin.ubuntu.com/p/RsQ3sf6dZy/
My motivation is that this moves the responsibility for parameter order checking to solely the location the param list is created (i.e. in validate_create_arguments). Not where it is consumed.
With this struct approach, if someone swaps mem_size and disk_space, they only need to care about reordering those params in the creation of the struct. No other code needs to change, and no bugs created.
Sure, we could go all the way and insist on this at creation:
CheckedArguments args;
args.mem_size = mem_size;
args.disk_space = ....
return args;
but I think that's unnecessary. I trust that if an author wants to change the CheckedArguments list order, they'll be sure to do the right thing. But I don't trust they'll update all the consumers correctly too.
Is it pretty? No. But IMO it's worth it for safety.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just my $0.02, but that pastebin is what I envisioned too when I made my comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will do it that way then, but...
My motivation is that this moves the responsibility for parameter order checking to solely the location the param list is created (i.e. in validate_create_arguments). Not where it is consumed.
Well, the ordering of public attributes is part of a struct's interface, so it is not being contained. IOW, anyone changing the order would still have to check all clients (think also construction, assignment...). But I understand your concern is that they may forget so it is perhaps safer not to rely on that here.
Drop structured-bindings and access struct attributes by name. This addresses a review request.
OK take 3 :) @gerboland did as you requested, keeping the struct inside the function as discussed. |
Moved back onto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for taking my comments on board. I'm happy with the approach.
I am getting a compile error on Bionic (with Qt5.9)
src/utils/utils.cpp:70:26: error: ‘const class QString’ has no member named ‘front’
switch (unit.front().toLower().toLatin1())
Unfortunately the QString::front() method was added in Qt5.10. Alternative is at(0)
@gerboland thanks for noticing that! I will have to move this because changing the base of this PR closed it and github does not allow reopening. New PR comming... |
654: Mem value sanitizing - take 3 r=gerboland a=ricab Moved from #633 (after rebase and force-push). Co-authored-by: Ricardo Abreu <[email protected]>
[utils] Use a consistent namespace alias for utils
Improves the handling of memory and disk size inputs. Means to address three issues, one of which is still pending.