Improve harness generated code readability #5087
Codecov Report
@@ Coverage Diff @@
## develop #5087 +/- ##
==========================================
- Coverage 69.66% 69.56% -0.1%
==========================================
Files 1319 1315 -4
Lines 109239 109122 -117
==========================================
- Hits 76098 75916 -182
- Misses 33141 33206 +65
Continue to review full report at Codecov.
✔️
Passed Diffblue compatibility checks (cbmc commit: c7c9708).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/126018554
✔️
Passed Diffblue compatibility checks (cbmc commit: 8cb2f6e).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/126026333
Mostly looks fine to me, but I think the decision of what things to initialise or not should be made outside of recursive_initialisation
@@ -195,6 +197,9 @@ void function_call_harness_generatort::implt::generate(

  generate_nondet_globals(function_body);
  call_function(arguments, function_body);
  for(const auto &global_pointer : global_pointers)
⛏️ braces please
Added.
@@ -195,6 +197,9 @@ void function_call_harness_generatort::implt::generate(

  generate_nondet_globals(function_body);
  call_function(arguments, function_body);
  for(const auto &global_pointer : global_pointers)
    function_body.add(code_function_callt{
Hm, I thought you'd have to pass more than one parameter here sometimes?
code_blockt code{};

// add initialization for existing globals
for(auto pair : goto_model.symbol_table)
const &
Added.
{
  if(
    id2string(pair.first).find(CPROVER_PREFIX) != 0 &&
    pair.second.is_static_lifetime)
See the function call harness, you'll also want
&& symbol.is_lvalue
&& symbol.type.id() != ID_code
&& !has_prefix(id2string(symbol.name), CPROVER_PREFIX)
because we don't want to initialise function symbols, non-lvalues or cprover internal variables (we should probably factor out this filtering stuff into a separate utility)
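The filter suggested above could be factored into a single predicate along the following lines. This is only a sketch: `sketch_symbolt`, `sketch_has_prefix`, `sketch_cprover_prefix` and `should_be_initialised` are hypothetical stand-ins invented for illustration, not the types or helpers used in the PR, and the `is_function` flag stands in for the `symbol.type.id() == ID_code` check.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a symbol-table entry. The real
// symbol type carries far more information than this.
struct sketch_symbolt
{
  std::string name;
  bool is_static_lifetime;
  bool is_lvalue;
  bool is_function; // stands in for symbol.type.id() == ID_code
};

// Assumed value of the reserved prefix, taken from the "__CPROVER_"
// prefix mentioned in this review thread.
const std::string sketch_cprover_prefix = "__CPROVER_";

// Returns true if `s` starts with `prefix`.
bool sketch_has_prefix(const std::string &s, const std::string &prefix)
{
  return s.compare(0, prefix.size(), prefix) == 0;
}

// One place that decides whether a global should be initialised:
// static lifetime, an lvalue, not a function, not an internal variable.
bool should_be_initialised(const sketch_symbolt &symbol)
{
  return symbol.is_static_lifetime && symbol.is_lvalue &&
         !symbol.is_function &&
         !sketch_has_prefix(symbol.name, sketch_cprover_prefix);
}
```

With a predicate like this, each harness generator would ask the same question of every symbol instead of repeating the individual conditions.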
Extracted and moved to recursive_initializationt.
"min_depth",
from_integer(
  initialization_config.min_null_tree_depth,
  signed_int_type())))
it's kind of irrelevant but I'd have expected these to be size_t
The only reason I went for int is that it looks better in the dumped code.
initialize_struct_tag(lhs, depth, known_tags, body);
}
else if(type.id() == ID_pointer)
const irep_idt &fun_name = build_constructor(lhs);
Surely build_constructor only needs lhs.type()?
We also need the name of the symbol (if lhs is a symbol) to get the associated_size_variable etc.
if(lhs.id() == ID_symbol)
const irep_idt &lhs_name = to_symbol_expr(lhs).get_identifier();
// skip initialisation of max/min depth globals
if(lhs_name == max_depth_var_name || lhs_name == min_depth_var_name)
I think it'd be cleaner to remove these from the global set in the first place - IIRC at least the function-harness automatically strips all variables with a __CPROVER_ prefix from the list of variables to nondet (all harnesses should do this, may be worth extracting into separate functionality). All our global (maybe even local, given that cbmc doesn't really care all that much about scope) variables should have the __CPROVER_ prefix attached to them anyway (that way we can avoid accidentally overriding variables that we add in here without having to remember to block each of them individually here).
Done.
goto_model.symbol_table.lookup_ref(size_var.value()).symbol_expr();
body.add(code_function_callt{
  fun_symbol.symbol_expr(),
  {depth, address_of_exprt{lhs}, address_of_exprt{size_symbol}}});
That's one way to do it, but I don't like having the replicated logic between the constructor and the call; I'd prefer if the dynamic array constructor just always took a size pointer and just set it to null if it's not needed (kind of mirroring similar functions in the real world)
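The calling convention proposed here (always pass a size pointer, and pass null when the size is not wanted) can be sketched outside the harness as follows. `sketch_make_array` is a hypothetical helper written for illustration; the constructors generated by the PR have a different shape.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: it always takes a size pointer, which may be null.
// Callers that do not care about the size pass nullptr, so there is only
// one signature and no duplicated logic between definition and call site.
std::vector<int> sketch_make_array(std::size_t length, std::size_t *size_out)
{
  std::vector<int> result(length, 0);
  if(size_out != nullptr)
  {
    *size_out = result.size();
  }
  return result;
}
```

A caller that tracks the size passes the address of its size variable; a caller that does not simply passes `nullptr`, much like optional out-parameters in many C library functions.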
Done.
{
  symbolt &fresh_symbol = get_fresh_aux_symbol(
    signed_int_type(),
    "__goto_harness",
Probably makes more sense to use CPROVER_PREFIX, same in following
Done.
symbol_exprt recursive_initializationt::get_free_function()
{
  auto free_sym = goto_model.symbol_table.lookup("free");
FYI we had some talk at some point about using CBMC's internal allocate/deallocate instructions for this, but I didn't have much luck with them (no need for changes, just a remark)
b5aec2e to 099cf12
✔️
Passed Diffblue compatibility checks (cbmc commit: 099cf12).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/128903045
Looks good to me overall, just some minor comments.
@@ -428,6 +428,9 @@ std::string recursive_initialization_configt::to_string() const

recursive_initializationt::array_convertert
recursive_initializationt::default_array_member_initialization()
irep_idt recursive_initializationt::get_fresh_global_name(
  const std::string &symbol_name,
get_fresh_global_name and get_fresh_global_symexpr look very similar. Could the bulk of the work be performed in one function, with the helpers calling into it?
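One way to read this suggestion: a single function does the real work of producing a fresh name, and the other variant is a thin wrapper around it. The sketch below uses hypothetical stand-ins (`sketch_name_sourcet`, `sketch_symbol_exprt`, and a `$`-suffix naming scheme chosen arbitrarily); it does not reproduce how the PR creates symbols.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a symbol expression.
struct sketch_symbol_exprt
{
  std::string identifier;
};

class sketch_name_sourcet
{
public:
  // Does the bulk of the work: finds an unused name and registers it.
  std::string fresh_name(const std::string &base)
  {
    std::string candidate = base;
    for(unsigned suffix = 0; used.count(candidate) != 0; ++suffix)
    {
      candidate = base + "$" + std::to_string(suffix);
    }
    used.insert(candidate);
    return candidate;
  }

  // Thin wrapper: reuses fresh_name and only changes the return shape.
  sketch_symbol_exprt fresh_symexpr(const std::string &base)
  {
    return sketch_symbol_exprt{fresh_name(base)};
  }

private:
  std::set<std::string> used;
};
```

Keeping the uniqueness logic in one place means a later change to the naming scheme only has to be made once.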
symbolt *mutable_symbol = symbol_table.get_writeable(function_symbol.name);

// the body is specific for each type of expression
mutable_symbol->value = build_constructor_body(
slightly weird that build_constructor_body is being referenced here but added in the next commit and there's a prototype added for it as well in recursive_initialization.h
I wasn't sure how to break the PR into individual commits so I sometimes formed a commit around a body of a function but only declared the functions called inside that body.
{
  if(lhs_name.has_value())
  {
    if(should_be_treated_as_cstring(*lhs_name) && type == char_type())
Just a question, since I'm not sure why it's happening:
Why do we compare type.id() in other places (for instance type.id() == ID_pointer in line 137), while here we compare the type itself with the constructor result? Are we sure equality is working as it should? If yes, should all of these be harmonised?
That's just my being lazy and not finding out what type-id char should have.
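For readers following the question, the difference between the two styles of check can be sketched with a toy type representation. Everything here is hypothetical (`sketch_typet`, `sketch_char_type`, the `"signedbv"` id and the widths are assumptions made for illustration): comparing ids only looks at the kind of type, while comparing whole types also compares the remaining attributes.

```cpp
#include <cassert>
#include <string>

// Hypothetical toy type: an id naming the kind of type plus one attribute.
struct sketch_typet
{
  std::string id;
  unsigned width;
};

// Whole-type equality compares every field, not just the id.
bool operator==(const sketch_typet &a, const sketch_typet &b)
{
  return a.id == b.id && a.width == b.width;
}

// Assumed shape of a char type for this sketch only.
sketch_typet sketch_char_type()
{
  return sketch_typet{"signedbv", 8};
}
```

Under these assumptions a 32-bit integer type shares its id with the char type but is not equal to it, so an id check and a whole-type check are not interchangeable.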
because we also want to convert the function that will be added during initialisation.
also add mode to memory-snapshot.
and storing them in the symbol table (and one for `free`).
Previously all recursive initialisation took place in a single function. Now each type has its own constructor function, and these constructors are called recursively. The results (variables to be initialised) are passed by pointer.
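The structure described in this commit message can be pictured with a small hand-written analogue. The names below (`nodet`, `init_int`, `init_node`, `init_node_ptr`, `sketch_max_depth`, `free_list`) are hypothetical, and the sketch uses fixed values and a fixed depth bound where the generated harness would use nondeterministic choices; it only shows the "one constructor per type, result passed by pointer" shape.

```cpp
#include <cassert>

// Hypothetical recursive data type to be initialised.
struct nodet
{
  int value;
  nodet *next;
};

// Assumed depth bound for this sketch.
const int sketch_max_depth = 2;

void init_node(int depth, nodet *result);

// Constructor for `int`: writes its result through the pointer.
void init_int(int depth, int *result)
{
  (void)depth;
  *result = 0; // the real harness would assign a nondeterministic value
}

// Constructor for `nodet *`: allocates and recurses until the bound.
void init_node_ptr(int depth, nodet **result)
{
  if(depth >= sketch_max_depth)
  {
    *result = nullptr;
    return;
  }
  *result = new nodet;
  init_node(depth + 1, *result);
}

// Constructor for `nodet`: delegates each member to its type's constructor.
void init_node(int depth, nodet *result)
{
  init_int(depth, &result->value);
  init_node_ptr(depth, &result->next);
}

// Releases a list built by init_node_ptr.
void free_list(nodet *head)
{
  while(head != nullptr)
  {
    nodet *next = head->next;
    delete head;
    head = next;
  }
}
```

Because each constructor only knows about its own type and calls the constructors of its members, the dumped harness reads as a set of small functions rather than one long initialisation body.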
099cf12 to 9ca0824
✔️
Passed Diffblue compatibility checks (cbmc commit: 9ca0824).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/130278481