Skip to content

Conversation

ricab
Copy link
Collaborator

@ricab ricab commented Feb 22, 2019

Improves the handling of memory and disk size inputs. Means to address three issues, one of which is still pending.

@ricab ricab force-pushed the mem-value-sanitizing branch from 5c50093 to d8026cf Compare February 26, 2019 20:59
@ricab ricab changed the title WIP: Mem value sanitizing Mem value sanitizing Feb 26, 2019
auto mem_size = std::get<0>(mem_disk_name_and_errors); // use structured bindings instead in C++17
auto disk_space = std::get<1>(mem_disk_name_and_errors); // use structured bindings instead in C++17
auto instance_name = std::get<2>(mem_disk_name_and_errors); // use structured bindings instead in C++17
auto option_errors = std::get<3>(mem_disk_name_and_errors); // use structured bindings instead in C++17
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike this because there's no clear connection between 0 and mem_size, 1 and disk_space, etc. Sure it works, but when we add an entry in the tuple, or swap entries with same type, it's easy to refer to the wrong index and create a bug, yet the code will still compile.

A quick struct will give names to these entries, establishing a stronger key/value connection. e.g. VirtualMachineDescription

Note, you could correctly point out that we do something very similar in the return statements of validate_create_arguments, to_machine_desc, etc - where we use brace initializers to populate a struct - as again there's no clear key/value connection. I'm not a giant fan of that style either, but at least it is localised to the creation of the struct, and not the consumer, so the person who changes it will (I think) be more aware of the consequences.

Copy link
Collaborator Author

@ricab ricab Feb 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is a good point and, when talking about public functions or functions with multiple uses, I generally agree. But in localized private functions that only exist to offload code from the caller, I am not so sure. There are pros and cons and I am not sure how to proceed in this occasion.

  • Currently (with tuple), the code will compile only if the types are compatible with how they are used by the consumer. This would be the same with struct.
  • I do agree it is more readable to have the vars named rather than indexed in large tuples. But it is less clear if structs proliferate for every useful return combination. (I wish we had named returns [and params for that matter]).
  • Indexes would go away with C++17 (i.e. auto [mem, disk, name, errors] = validate_create_arguments(...)).
  • It is still the ordering that determines what is what, whether the consumer of the returned expression is outside (current case) or within validate_create_arguments (if we returned a struct instead). In both cases, code outside of the function is affected by changes in the return.
  • But, as you say, in one case those changes are properly localized to the producer, while in the other it leaks to its consumer. This is the main argument for the struct approach and what makes it the better choice in public interfaces.
  • The cons of the tuple approach are minimized in this particular case, as the function is local to the compilation unit (anonymous namespace) and only separate from the caller in the interest of readability. This is still no reason to preclude a struct, but...
  • The goal of all of this code (most of launch) is already to initialize a (complete) VirtualMachineDescription. So I am not sure creating another intermediate description struct is appropriate.
  • How would I name this struct? VirtualMahineMemDiskNameAndErrors? VirtualMachineSomeOptions? VirtualMachinePartialDescription?
  • I suppose the right thing would be for VirtualMachineDescription's constructor to receive the request's parameters and derive itself from them. But that sounds like excessive refactoring for the current purposes...

Alternatives:

  • input parameters - I tend to dislike those because I like objects to be meaningful from construction (the other [the literal] meaning of RAII).
  • partially filling VirtualMachineDescription in validate_create_arguments - I wouldn't favor this either for the same reason
  • validate arguments separately (one function for memory, one for disk, etc.) - they would still need to return pairs (because of possible errors). Additionally, to maintain the same behavior as today, option_errors would have to be combined after all functions were called.
  • put all the code directly in launch...

So, @gerboland what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking your points

  1. as I said, swapping entries of same type will still compile, but create bug. Public/private is irrelevant to that argument, and the greatest strength structs have.
  2. I'm only taking this case, not every return. Named returns would be ideal, closest is a struct IMO.
  3. same problem, just replace indexes with key names.
  4. I'm not saying it's ideal, but least worst option IMO
  5. not just public IMO
  6. readability isn't my priority here, avoiding bugs is - especially those possibly introduced later by a different author.
  7. it's a struct just to pass a list of values with labels. My objection is purely down to the lack of labels with your approach here.
  8. I dunno, ValidatedVMOptions?

Sorry, my opinion hasn't changed. Note I'm not making a big deal here, it's just code after all :)

I'll let @townsend2010 have his say.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do feel like the proposed approach is more prone to bugs accidentally introduced if some other maintainer or contributor down the road added another check here, like for example, validating the number of cores/CPU's as well. What I'm not a fan of, is that the consumer has to know the exact order of the elements that are returned from validate_create_arguments. With a struct, it only needs to know the struct's member name and the rest still are valid, even if a contributor puts a new member variable anywhere in the struct.

But as @gerboland said, it's just code and it works now, so I wouldn't block on this either:)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments. No big deal here either! Just debating an interesting issue with implications in future coding practices :) I actually agree with you to a large extent, but I don't quite follow what you say in 1 and 6.

Suppose I used a struct Foo, with ctor Foo(mem, disk, name, errors). The function would say return Foo{a,b,c,d}; right? Aren't argument swapping bugs as likely? The way I see it, a change in the producer requires a change in the consumer in both cases. The compiler will do the same. What changes is where those responsibilities lie, which is why I mentioned public/private.

A struct-based approach encapsulates the consumer, preventing impact on the function's client(s). But if the function has a single client and is only there to offload code into the anonymous namespace, the benefit is small IMO. When considering also drawbacks (struct proliferation, a la carte types representing no clear concept), I place the ideal compromise a bit differently (barring named returns).

Anyway... I just saw @townsend2010's comment and will modify the code to use the struct.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just committed changes with the ordering now encapsulated in a struct. Does that address the concerns?

Replace variable initializations that were previously marked as candidates for structured bindings in C++17.
@ricab ricab added the no-merge label Feb 27, 2019
@ricab ricab changed the title Mem value sanitizing WIP: Mem value sanitizing Feb 27, 2019
ricab added 11 commits February 27, 2019 18:13
Use a corrected regular expression to validate memory size values. Accept lowercase suffixes (k, m, g). Improve and extend corresponding tests. Fixes #470.
Test a new memory normalization interface, to simultaneously validate and convert memory values to the smallest unit (bytes).
Implement verification and normalization of memory values in a single step. Use that in the daemon, replacing mere validation. Update hostname validation for consistency. Update tests accordingly. Fixes #616.
Provide a first step to obtain the number of bytes (to compare with other numbers) and a second step to express the same thing as a string that is suitable for backend memory/disk arguments.
Make tests expect error messages in error stream instead.
Makes failing tests pass. Fixes #588.
This attempts to address a review concern.
Use Qt's API rather than the standard library to convert a QString to long long, thus avoiding the intermediate std::string step. This addresses a review concern.
@ricab ricab force-pushed the mem-value-sanitizing branch from ccf86ef to 984d4c3 Compare February 28, 2019 11:06
@ricab ricab changed the base branch from master to move-to-cpp17 February 28, 2019 11:08
@ricab ricab changed the title WIP: Mem value sanitizing Mem value sanitizing Feb 28, 2019
@ricab ricab removed the no-merge label Feb 28, 2019
@ricab
Copy link
Collaborator Author

ricab commented Feb 28, 2019

This currently uses C++17 features (fallthrough attribute and structured bindings), so rebased on that branch.

@bors bors bot closed this Feb 28, 2019
@Saviq Saviq changed the base branch from move-to-cpp17 to master February 28, 2019 13:31
@Saviq Saviq reopened this Feb 28, 2019
std::string disk_space;
std::string instance_name;
mp::LaunchError option_errors;
} ret{mem_size, disk_space, instance_name, option_errors};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't address my concern. There is still no connection between this ordering of params and the order in the consumer. E.g. this will compile, but is wrong:

auto [disk_space, mem_size, instance_name, option_errors] = validate_create_arguments(request);

This is the root of my dislike of this syntactic sugar.

If the params were all different types, the type system would catch errors. So I'm ok with this syntax being used with care. But here, I think using a struct like this would give more safety. Here's my thinking:
https://pastebin.ubuntu.com/p/RsQ3sf6dZy/

My motivation is that this moves the responsibility for parameter order checking to solely the location the param list is created (i.e. in validate_create_arguments). Not where it is consumed.

With this struct approach, if someone swaps mem_size and disk_space, they only need to care about reordering those params in the creation of the struct. No other code needs to change, and no bugs created.

Sure, we could go all the way and insist on this at creation:

CheckedArguments args;
args.mem_size = mem_size;
args.disk_space = ....
return args;

but I think that's unnecessary. I trust that if an author wants to change the CheckedArguments list order, they'll be sure to do the right thing. But I don't trust they'll update all the consumers correctly too.

Is it pretty? No. But IMO it's worth it for safety.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just my $0.02, but that pastebin is what I envisioned too when I made my comments.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will do it that way then, but...

My motivation is that this moves the responsibility for parameter order checking to solely the location the param list is created (i.e. in validate_create_arguments). Not where it is consumed.

Well, the ordering of public attributes is part of a struct's interface, so it is not being contained. IOW, anyone changing the order would still have to check all clients (think also construction, assignment...). But I understand your concern is that they may forget so it is perhaps safer not to rely on that here.

Drop structured-bindings and access struct attributes by name. This addresses a review request.
@ricab
Copy link
Collaborator Author

ricab commented Feb 28, 2019

OK take 3 :)

@gerboland did as you requested, keeping the struct inside the function as discussed.

@Saviq Saviq changed the base branch from master to move-to-cpp17 February 28, 2019 20:21
@Saviq
Copy link
Collaborator

Saviq commented Feb 28, 2019

Moved back onto move-to-cpp17 as we had to revert it temporarily.

Copy link
Contributor

@gerboland gerboland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for taking my comments on board. I'm happy with the approach.

I am getting a compile error on Bionic (with Qt5.9)

src/utils/utils.cpp:70:26: error: ‘const class QString’ has no member named ‘front’
             switch (unit.front().toLower().toLatin1())

Unfortunately the QString::front() method was added in Qt5.10. Alternative is at(0)

@ricab ricab closed this Mar 1, 2019
@ricab ricab changed the base branch from move-to-cpp17 to move-to-cpp17-take2 March 1, 2019 10:58
@ricab
Copy link
Collaborator Author

ricab commented Mar 1, 2019

@gerboland thanks for noticing that!

I will have to move this because changing the base of this PR closed it and github does not allow reopening. New PR comming...

bors bot added a commit that referenced this pull request Mar 8, 2019
654: Mem value sanitizing - take 3 r=gerboland a=ricab

Moved from #633 (after rebase and force-push).

Co-authored-by: Ricardo Abreu <[email protected]>
ricab pushed a commit that referenced this pull request May 8, 2025
[utils] Use a consistent namespace alias for utils
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants