Skip to content

Improve harness generated code readability #5087

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

xbauch
Copy link
Contributor

@xbauch xbauch commented Sep 4, 2019

Previously all recursive initialisation took place in a single function. Now each type has it's own constructor function, which are called recursively. The results (variables to be initialised) are passed by pointer.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@codecov-io
Copy link

codecov-io commented Sep 5, 2019

Codecov Report

Merging #5087 into develop will decrease coverage by 0.09%.
The diff coverage is 92.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop    #5087     +/-   ##
==========================================
- Coverage    69.66%   69.56%   -0.1%     
==========================================
  Files         1319     1315      -4     
  Lines       109239   109122    -117     
==========================================
- Hits         76098    75916    -182     
- Misses       33141    33206     +65
Impacted Files Coverage Δ
src/goto-harness/recursive_initialization.h 100% <ø> (ø) ⬆️
...c/goto-harness/function_call_harness_generator.cpp 89.81% <100%> (+0.29%) ⬆️
src/pointer-analysis/value_set.cpp 76.02% <100%> (-0.78%) ⬇️
...goto-harness/memory_snapshot_harness_generator.cpp 90.32% <100%> (+0.32%) ⬆️
src/goto-harness/recursive_initialization.cpp 83.59% <92.33%> (+8.59%) ⬆️
src/analyses/natural_loops.cpp 0% <0%> (-100%) ⬇️
src/analyses/natural_loops.h 76.31% <0%> (-23.69%) ⬇️
src/util/invariant.cpp 60.52% <0%> (-13.16%) ⬇️
src/goto-symex/build_goto_trace.cpp 84.82% <0%> (-2.76%) ⬇️
src/analyses/local_may_alias.cpp 56.25% <0%> (-2.28%) ⬇️
... and 72 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eaf8cb3...8cb2f6e. Read the comment docs.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: c7c9708).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/126018554

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 8cb2f6e).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/126026333

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks fine to me, but I think the decision of what things to initialise or not should be made outside of recursive_initialisation

@@ -195,6 +197,9 @@ void function_call_harness_generatort::implt::generate(

generate_nondet_globals(function_body);
call_function(arguments, function_body);
for(const auto &global_pointer : global_pointers)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ braces please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@@ -195,6 +197,9 @@ void function_call_harness_generatort::implt::generate(

generate_nondet_globals(function_body);
call_function(arguments, function_body);
for(const auto &global_pointer : global_pointers)
function_body.add(code_function_callt{

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I thought you'd have to pass more than one parameter here sometimes?

code_blockt code{};

// add initialization for existing globals
for(auto pair : goto_model.symbol_table)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const &

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

{
if(
id2string(pair.first).find(CPROVER_PREFIX) != 0 &&
pair.second.is_static_lifetime)
Copy link
Contributor

@hannes-steffenhagen-diffblue hannes-steffenhagen-diffblue Sep 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the function call harness, you'll also want

&& symbol.is_lvalue 
&& symbol.type.id() != ID_code 
&& !has_prefix(id2string(symbol.name), CPROVER_PREFIX)

because we don't want to initialise function symbols, non-lvalues or cprover internal variables (we should probably factor out this filtering stuff into a separate utility)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted and moved to recursive_initializationt.

"min_depth",
from_integer(
initialization_config.min_null_tree_depth,
signed_int_type())))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's kind of irrelevant but I'd have expected these to be size_t

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason I went for int is that it looks better in the dumped code.

initialize_struct_tag(lhs, depth, known_tags, body);
}
else if(type.id() == ID_pointer)
const irep_idt &fun_name = build_constructor(lhs);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surely build_constructor only needs lhs.type()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need the name of the symbol (if lhs is a symbol) to get the associated_size_variable etc.

if(lhs.id() == ID_symbol)
const irep_idt &lhs_name = to_symbol_expr(lhs).get_identifier();
// skip initialisation of max/min depth globals
if(lhs_name == max_depth_var_name || lhs_name == min_depth_var_name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be cleaner to remove these from the global set in the first place - IIRC at least the function-harness automatically strips all variables with a __CPROVER_ prefix from the list of variables to nondet (all harnesses should do this, may be worth extracting into separate functionality). All our global (maybe even local, given that cbmc doesn't really care all that much about scope) variables should have the __CPROVER_ prefix attached to them anyway (that way we can avoid accidentally overriding variables that we add in here without having to remember to block each of them individually here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

goto_model.symbol_table.lookup_ref(size_var.value()).symbol_expr();
body.add(code_function_callt{
fun_symbol.symbol_expr(),
{depth, address_of_exprt{lhs}, address_of_exprt{size_symbol}}});

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's one way to do it, but I don't like having the replicated logic between the constructor and the call; I'd prefer if the dynamic array constructor just always took a size pointer and just set it to null if it's not needed (kind of mirroring similar functions in the real world)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

{
symbolt &fresh_symbol = get_fresh_aux_symbol(
signed_int_type(),
"__goto_harness",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably makes more sense to use CPROVER_PREFIX, same in following

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


symbol_exprt recursive_initializationt::get_free_function()
{
auto free_sym = goto_model.symbol_table.lookup("free");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we had some talk at some point about using CBMCs internal allocate/deallocate instructions for this, but I didn't have much luck for them (no need for changes, just a remark)

@xbauch xbauch force-pushed the feature/better-harness branch 2 times, most recently from b5aec2e to 099cf12 Compare September 24, 2019 10:32
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 099cf12).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/128903045

Copy link
Contributor

@NlightNFotis NlightNFotis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall, just some minor comments.

@@ -428,6 +428,9 @@ std::string recursive_initialization_configt::to_string() const

recursive_initializationt::array_convertert
recursive_initializationt::default_array_member_initialization()
irep_idt recursive_initializationt::get_fresh_global_name(
const std::string &symbol_name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_fresh_global_name and get_fresh_global_symexpr look very similar. Are we sure the bulk of the work performed in one function, and then have the helpers call into it?

symbolt *mutable_symbol = symbol_table.get_writeable(function_symbol.name);

// the body is specific for each type of expression
mutable_symbol->value = build_constructor_body(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slightly weird that build_constructor_body is being referenced here but added in the next commit and there's a prototype added for it as well in recursive_initialization.h

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure how to break the PR into individual commits so I sometimes formed a commit around a body of a function but only declared the functions called inside that body.

{
if(lhs_name.has_value())
{
if(should_be_treated_as_cstring(*lhs_name) && type == char_type())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question, since I'm not sure why it's happening:

Why in other places we compare the type.id(), like for instance in line 137 type.id() == ID_pointer and here we compare the type itself with the constructor result? Are we sure equality is working as it should? If yes, should all of these be harmonised?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's just my being lazy and not finding out what type-id char should have.

because we also want to convert the function that will be added during
initialisation.
also add mode to memory-snapshot.
and storing the in the symbol table (and one for `free`).
Previously all recursive initialisation took place in a single function. Now
each type has it's own constructor function, which are called recursively. The
results (variables to be initialised) are passed by pointer.
@xbauch xbauch force-pushed the feature/better-harness branch from 099cf12 to 9ca0824 Compare October 3, 2019 14:27
@xbauch xbauch marked this pull request as ready for review October 3, 2019 15:04
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 9ca0824).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/130278481

@hannes-steffenhagen-diffblue hannes-steffenhagen-diffblue merged commit cb20917 into diffblue:develop Nov 1, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants