diff --git a/doc/cprover-manual/complexity-graph.md b/doc/cprover-manual/complexity-graph.md new file mode 100644 index 00000000000..41404708305 --- /dev/null +++ b/doc/cprover-manual/complexity-graph.md @@ -0,0 +1,116 @@ +[CPROVER Manual TOC](../) + +## Complexity Graph + +This tool is meant to help decompose CBMC proofs. +It does this by producing a control-flow graph that contains information about where CBMC may run into issues. + +``` +goto-instrument FILE.goto FLAGS +``` + +The recommended file to use is `FUNCTION_harness1.goto` which is produced from the Makefile provided in the starter kit. If this is not available, use `goto-cc` to compile all relevant source files into a `.goto` file. +We recommend this file because it is produced before any program transformations are made, which can complicate the graph. + +## Relevant FLAGS + + --show-complexity-graph FILE.dot + instructs CBMC to write the graph to the .dot file + This can be compiled to a PDF by using the command + ``` + dot -Tpdf FILE.dot -o FILE.pdf + ``` + + + --complexity-graph-roots FUNCTION-NAME + Instructs CBMC to generate the graph spanned by this function. This flag can be passed multiple times and produces one graph spanned by all roots. If no roots are provided, all functions in the provided source files will be used. + Typically this will be `FUNCTION_harness`. + + + --remove-function-body FUNCTION-NAME + Removes the body of a function, which has the effect of removing all outedges of that vertex from the graph. + + + --omit-function FUNCTION-NAME + Omits a function from the graph entirely. Useful for removing high-indegree sink nodes to reduce visual clutter. + + + --constant-propagator + Recommended to run for the reason that it simplifies the underlying program, and hence the graph. + + + --omit-function-pointers + Removes function pointer edges from the graph, which have a habit of producing visual clutter. + + + --add-library + Includes the CBMC library for functions like `malloc`, `free`, etc. + + --complexity-graph-global-scores + Uses a more global notion of scoring, where the "global score" for a node is the number of paths to that node multiplied by the sum of the global scores of its children. + +The following flags can be used with the main CBMC executable +``` +cbmc FILE.goto FLAGS +``` + + --show-complexity-graph-with-symex FILE.dot + Write the complexity graph with symbolic execution information to FILE.dot. + + + --show-complexity-graph-with-solver FILE.dot + Write the complexity graph with symbolic execution and solver formula information to FILE.dot. + This flag requires the addition of `--write-solver-stats-to FILE.out` to obtain solver formula information. + + + --complexity-graph-instructions + When used with '--show-complexity-graph-with-symex' or '--show-complexity-graph-with-solver' displays the instructions of the program that have a high cost in symbolic execution and/or the solver formula. + +## Reading the graph + +### Node shapes + +- Rectangluar nodes: normal functions +- Elliptical nodes: private functions +- Diamond nodes: functions with no body +- Arrow-shaped nodes: function pointers +- Pentagon: standard/CBMC library functions + +### Node colors + +The intensity of red coloring on a node is a display of its complexity relative to other nodes in the graph. + +When using `--show-complexity-graph-with-symex` with `--complexity-graph-instructions`, intensity of pink on instructions shows the complexity of that instruction in symbolic execution, and intensity of yellow shows the complexity of that instruction in the creation of the solver formula. + + +## Sample interaction + +Suppose we have a function `foobar` and we'd like to do a proof for it. +After creating a GOTO file `foobar_harness1.goto` with the starter kit or `goto-cc`, try the following + +``` +goto-instrument foobar_harness1.goto --show-complexity-graph graph.dot --complexity-graph-roots foobar +``` +Followed by +``` +dot -Tpdf graph.dot -o graph.pdf +``` + +This will display the call graph for foobar. +The first step is to take steps to clean the graph. + +Likely, you can ignore function pointers at this point. See [this page](http://cprover.diffblue.com/md__home_travis_build_diffblue_cbmc_doc_architectural_restrict-function-pointer.html) for how to restrict function pointers - the information provided in the graph displays CBMC's sound overapproximation of what functions can be targets of which function pointers. + +Additionally, there may be some high-indegree functions in the graph that don't appear to be useful and just complicate the graph. +Let's say one of those functions is `notuseful`. +We can then run a new command +``` +goto-instrument foobar_harness1.goto --show-complexity-graph graph.dot --complexity-graph-roots foobar --complexity-graph-omit-function-pointers --complexity-graph-omit-function notuseful +``` +Which should provide a cleaner graph to inspect. + + +When doing a proof, you should use information provided in the graph along with some domain knowledge about the codebase to decide where to decompose. See `doc/cprover-manual/contracts-functions.md` for a tutorial on proof decomposition using function contracts, which we refer to as abstracting a function. +It's good to abstract functions that are complex. In the case that there is a simple function `g` that calls a complex function `f`, it may be better to abstract `g` if the contract is easier to write. +A good way to approximate where a contract is easy to write is by how many memory locations a function can write to. + diff --git a/src/cbmc/cbmc_parse_options.cpp b/src/cbmc/cbmc_parse_options.cpp index a05f909e2de..ccbb295ffcb 100644 --- a/src/cbmc/cbmc_parse_options.cpp +++ b/src/cbmc/cbmc_parse_options.cpp @@ -58,6 +58,7 @@ Author: Daniel Kroening, kroening@kroening.com #include #include #include +#include #include #include #include @@ -334,8 +335,10 @@ void cbmc_parse_optionst::get_command_line_options(optionst &options) options.set_option("trace", true); } + options.set_option("symex-record-coverage", cmdline.get_values ("symex-record-coverage")); if(cmdline.isset("symex-coverage-report")) { + options.set_option("symex-record-coverage", true); options.set_option( "symex-coverage-report", cmdline.get_value("symex-coverage-report")); @@ -372,6 +375,51 @@ void cbmc_parse_optionst::get_command_line_options(optionst &options) parse_solver_options(cmdline, options); } +void set_complexity_graph_options (const cmdlinet &cmdline, optionst &options) { + if (cmdline.isset ("show-complexity-graph-with-symex") + || cmdline.isset ("show-complexity-graph-with-solver")) { + options.set_option("symex-record-coverage", true); + } + + options.set_option ("show-complexity-graph", cmdline.get_values ("show-complexity-graph")); + options.set_option ("show-complexity-graph-with-symex", cmdline.get_values ("show-complexity-graph-with-symex")); + options.set_option ("show-complexity-graph-with-solver", cmdline.get_values ("show-complexity-graph-with-solver")); + if (cmdline.isset ("complexity-graph-root")) + { + std::stringstream stream; + for (const std::string &val : cmdline.get_values("complexity-graph-root")) + { + stream << val << ","; + } + options.set_option ("complexity-graph-roots", stream.str()); + } + + if (cmdline.isset ("complexity-graph-omit-function")) + { + std::stringstream stream; + for (const std::string &val : cmdline.get_values("complexity-graph-omit-function")) + { + stream << val << ","; + } + options.set_option ("complexity-graph-omit-function", stream.str()); + } + + if (cmdline.isset ("complexity-graph-global-scores")) + { + options.set_option ("complexity-graph-global-scores", true); + } + + if (cmdline.isset ("complexity-graph-omit-function-pointers")) + { + options.set_option ("complexity-graph-omit-function-pointers", true); + } + + if (cmdline.isset ("complexity-graph-instructions")) + { + options.set_option ("complexity-graph-instructions", true); + } +} + /// invoke main modules int cbmc_parse_optionst::doit() { @@ -499,6 +547,8 @@ int cbmc_parse_optionst::doit() return CPROVER_EXIT_SUCCESS; } + set_complexity_graph_options (cmdline, options); + int get_goto_program_ret = get_goto_program(goto_model, options, cmdline, ui_message_handler); @@ -635,6 +685,7 @@ int cbmc_parse_optionst::doit() verifier = util_make_unique< all_properties_verifier_with_trace_storaget>( options, ui_message_handler, goto_model); + // TODO: this is the branch triggered by default } } else @@ -894,6 +945,8 @@ void cbmc_parse_optionst::help() " --show-parse-tree show parse tree\n" " --show-symbol-table show loaded symbol table\n" HELP_SHOW_GOTO_FUNCTIONS + HELP_SHOW_COMPLEXITY_GRAPH + HELP_SHOW_COMPLEXITY_GRAPH_CBMC HELP_VALIDATE "\n" "Program instrumentation options:\n" diff --git a/src/cbmc/cbmc_parse_options.h b/src/cbmc/cbmc_parse_options.h index b1a7abd9f27..ea00244d7ce 100644 --- a/src/cbmc/cbmc_parse_options.h +++ b/src/cbmc/cbmc_parse_options.h @@ -51,6 +51,9 @@ class optionst; OPT_SOLVER \ OPT_STRING_REFINEMENT_CBMC \ OPT_SHOW_GOTO_FUNCTIONS \ + "(symex-record-coverage)" \ + OPT_COMPLEXITY_GRAPH \ + OPT_COMPLEXITY_GRAPH_CBMC \ OPT_SHOW_PROPERTIES \ "(show-symbol-table)(show-parse-tree)" \ "(drop-unused-functions)" \ diff --git a/src/goto-checker/multi_path_symex_checker.cpp b/src/goto-checker/multi_path_symex_checker.cpp index 39306b2ad7a..5d00897c859 100644 --- a/src/goto-checker/multi_path_symex_checker.cpp +++ b/src/goto-checker/multi_path_symex_checker.cpp @@ -21,6 +21,8 @@ Author: Daniel Kroening, Peter Schrammel #include "counterexample_beautification.h" #include "goto_symex_fault_localizer.h" +#include + multi_path_symex_checkert::multi_path_symex_checkert( const optionst &options, ui_message_handlert &ui_message_handler, @@ -31,6 +33,103 @@ multi_path_symex_checkert::multi_path_symex_checkert( { } +// generates a complexity graph associated with the given query, +// possibly with symex and SAT formula information. +void generate_goto_dot (const abstract_goto_modelt &goto_model, + const symex_bmct &symex, + const optionst &options, + ui_message_handlert &ui_message_handler, + const goto_symex_property_decidert &property_decider, + const std::string &type) +{ + const std::string path = options.get_option(type); + if (!path.empty()) + { + const auto &goto_functions = goto_model.get_goto_functions(); + + messaget msg(ui_message_handler); + const namespacet ns(goto_model.get_symbol_table()); + const auto sorted = goto_functions.sorted(); + + const symex_coveraget &symex_coverage = symex.get_coverage(); + std::map instr_symex_info; + std::map instr_solver_info; + + // populate instr_symex_info + for(const auto &fun : sorted) + { + const bool has_body = fun->second.body_available(); + + if (has_body) + { + const goto_programt &body = fun->second.body; + forall_goto_program_instructions(from, body) + { + const goto_programt::const_targetst to_list = symex_coverage.coverage_to (from); + size_t total_steps = 0; + double total_duration = 0.0; + for (goto_programt::const_targett to : to_list) + { + total_steps += symex_coverage.num_executions(from, to); + total_duration += symex_coverage.duration(from, to); + } + + symex_infot info; + info.steps = total_steps; + info.duration = total_duration; + + instr_symex_info.insert ({from, info}); + } + } + } + + // populate instr_solver_info + with_solver_hardness( + property_decider.get_decision_procedure(), + [&instr_solver_info](solver_hardnesst &solver_hardness) + { + // source: solver_hardness.cpp solver_hardnesst::produce_report + const std::vector> &hardness_stats = solver_hardness.get_hardness_stats(); + for(std::size_t i = 0; i < hardness_stats.size(); i++) + { + const auto &ssa_step_hardness = hardness_stats[i]; + for(const auto &key_value_pair : ssa_step_hardness) + { + auto const &ssa = key_value_pair.first; + auto const &hardness = key_value_pair.second; + const goto_programt::const_targett target = ssa.pc; + + auto ensure_exists = instr_solver_info.find (target); + if (ensure_exists == instr_solver_info.end()) + { + solver_infot solver_info; + instr_solver_info.insert ({target, solver_info}); + } + + auto entry = instr_solver_info.find (target); + solver_infot &solver_info = entry->second; + solver_info.clauses += hardness.clauses; + solver_info.literals += hardness.literals; + solver_info.variables += hardness.variables.size(); + } + } + }); + + if (type == "show-complexity-graph") + { + show_complexity_graph(options, goto_model, path, ui_message_handler); + } else if (type == "show-complexity-graph-with-symex") + { + show_complexity_graph(options, goto_model, path, ui_message_handler, + instr_symex_info); + } else if (type == "show-complexity-graph-with-solver") + { + show_complexity_graph(options, goto_model, path, ui_message_handler, + instr_symex_info, instr_solver_info); + } + } +} + incremental_goto_checkert::resultt multi_path_symex_checkert:: operator()(propertiest &properties) { @@ -46,7 +145,12 @@ operator()(propertiest &properties) // we haven't got an equation yet if(!equation_generated) { + + generate_goto_dot(goto_model, symex, options, ui_message_handler, property_decider, "show-complexity-graph"); + generate_equation(); + + generate_goto_dot(goto_model, symex, options, ui_message_handler, property_decider, "show-complexity-graph-with-symex"); output_coverage_report( options.get_option("symex-coverage-report"), @@ -67,6 +171,8 @@ operator()(propertiest &properties) run_property_decider(result, properties, solver_runtime); + generate_goto_dot(goto_model, symex, options, ui_message_handler, property_decider, "show-complexity-graph-with-solver"); + return result; } diff --git a/src/goto-checker/symex_bmc.cpp b/src/goto-checker/symex_bmc.cpp index 0fdaa6766ea..0c00a571931 100644 --- a/src/goto-checker/symex_bmc.cpp +++ b/src/goto-checker/symex_bmc.cpp @@ -33,7 +33,7 @@ symex_bmct::symex_bmct( options, path_storage, guard_manager), - record_coverage(!options.get_option("symex-coverage-report").empty()), + record_coverage(options.get_bool_option("symex-record-coverage")), havoc_bodyless_functions( options.get_bool_option("havoc-undefined-functions")), unwindset(unwindset), @@ -74,8 +74,15 @@ void symex_bmct::symex_step( log.statistics() << log.eom; } + std::chrono::duration step_runtime(0); + auto const step_start = std::chrono::steady_clock::now(); + goto_symext::symex_step(get_goto_function, state); + auto const step_stop = std::chrono::steady_clock::now(); + step_runtime = step_stop - step_start; + double duration = step_runtime.count(); + if( record_coverage && // avoid an invalid iterator in state.source.pc @@ -91,9 +98,9 @@ void symex_bmct::symex_step( if( cur_pc->is_goto() && cur_pc->get_target() != state.source.pc && cur_pc->condition().is_true()) - symex_coverage.covered(cur_pc, cur_pc->get_target()); + symex_coverage.covered(cur_pc, cur_pc->get_target(), duration); else if(!state.guard.is_false()) - symex_coverage.covered(cur_pc, state.source.pc); + symex_coverage.covered(cur_pc, state.source.pc, duration); } } @@ -105,8 +112,15 @@ void symex_bmct::merge_goto( const goto_programt::const_targett prev_pc = prev_source.pc; const guardt prev_guard = goto_state.guard; + std::chrono::duration step_runtime(0); + auto const step_start = std::chrono::steady_clock::now(); + goto_symext::merge_goto(prev_source, std::move(goto_state), state); + auto const step_stop = std::chrono::steady_clock::now(); + step_runtime = step_stop - step_start; + double duration = step_runtime.count(); + PRECONDITION(prev_pc->is_goto()); if( record_coverage && @@ -114,7 +128,7 @@ void symex_bmct::merge_goto( !prev_guard.is_false() && !state.guard.is_false() && // branches only, no single-successor goto !prev_pc->condition().is_true()) - symex_coverage.covered(prev_pc, state.source.pc); + symex_coverage.covered(prev_pc, state.source.pc, duration); } bool symex_bmct::should_stop_unwind( @@ -224,3 +238,8 @@ void symex_bmct::no_body(const irep_idt &identifier) log.warning() << log.eom; } } + +const symex_coveraget symex_bmct::get_coverage () const +{ + return symex_coverage; +} diff --git a/src/goto-checker/symex_bmc.h b/src/goto-checker/symex_bmc.h index 5e0958c7e0f..bacf15a0f8e 100644 --- a/src/goto-checker/symex_bmc.h +++ b/src/goto-checker/symex_bmc.h @@ -84,6 +84,8 @@ class symex_bmct : public goto_symext unwindsett &unwindset; + const symex_coveraget get_coverage () const; + protected: /// Callbacks that may provide an unwind/do-not-unwind decision for a loop std::vector loop_unwind_handlers; diff --git a/src/goto-checker/symex_coverage.cpp b/src/goto-checker/symex_coverage.cpp index 4dd4acdf68e..5c107c72548 100644 --- a/src/goto-checker/symex_coverage.cpp +++ b/src/goto-checker/symex_coverage.cpp @@ -438,3 +438,51 @@ bool symex_coveraget::generate_report( return output_report(goto_functions, out); } } + + +const goto_programt::const_targetst symex_coveraget::coverage_to (goto_programt::const_targett from) const +{ + std::list to_list; + const auto &inner = coverage.find(from); + if (inner != coverage.end()) + { + const std::map &to_map = inner->second; + for (auto it = to_map.begin(); it != to_map.end(); ++it) + { + to_list.push_back (it->first); + } + } + + return to_list; +} + + +const int symex_coveraget::num_executions (goto_programt::const_targett from, goto_programt::const_targett to) const +{ + const auto &inner = coverage.find(from); + if (inner != coverage.end()) + { + const auto &inner_ = inner->second.find (to); + if (inner_ != inner->second.end()) + { + const coverage_infot &info = inner_->second; + return info.num_executions; + } + } + return 0; +} + +const double symex_coveraget::duration (goto_programt::const_targett from, goto_programt::const_targett to) const +{ + const auto &inner = coverage.find(from); + if (inner != coverage.end()) + { + const auto &inner_ = inner->second.find (to); + if (inner_ != inner->second.end()) + { + const coverage_infot &info = inner_->second; + return info.duration; + } + } + return 0; +} diff --git a/src/goto-checker/symex_coverage.h b/src/goto-checker/symex_coverage.h index 1ad851c172d..e8b78d90b24 100644 --- a/src/goto-checker/symex_coverage.h +++ b/src/goto-checker/symex_coverage.h @@ -33,19 +33,25 @@ class symex_coveraget } void - covered(goto_programt::const_targett from, goto_programt::const_targett to) + covered(goto_programt::const_targett from, goto_programt::const_targett to, double duration) { std::pair entry = - coverage[from].insert({to, coverage_infot(from, to, 1)}); + coverage[from].insert({to, coverage_infot(from, to, 1, duration)}); - if(!entry.second) + if(!entry.second) { ++(entry.first->second.num_executions); + entry.first->second.duration += duration; + } } bool generate_report( const goto_functionst &goto_functions, const std::string &path) const; + const goto_programt::const_targetst coverage_to (goto_programt::const_targett from) const; + const int num_executions (goto_programt::const_targett from, goto_programt::const_targett to) const; + const double duration (goto_programt::const_targett from, goto_programt::const_targett to) const; + protected: const namespacet &ns; @@ -54,14 +60,16 @@ class symex_coveraget coverage_infot( goto_programt::const_targett _from, goto_programt::const_targett _to, - unsigned _num_executions) - : location(_from), num_executions(_num_executions), succ(_to) + unsigned _num_executions, + double _duration) + : location(_from), num_executions(_num_executions), succ(_to), duration(_duration) { } goto_programt::const_targett location; unsigned num_executions; goto_programt::const_targett succ; + double duration; }; typedef std::map diff --git a/src/goto-instrument/goto_instrument_parse_options.cpp b/src/goto-instrument/goto_instrument_parse_options.cpp index b906ac07ba8..5e0a9937bf0 100644 --- a/src/goto-instrument/goto_instrument_parse_options.cpp +++ b/src/goto-instrument/goto_instrument_parse_options.cpp @@ -689,6 +689,54 @@ int goto_instrument_parse_optionst::doit() return CPROVER_EXIT_SUCCESS; } + + { + optionst options; + options.set_option ("show-complexity-graph", cmdline.get_value ("show-complexity-graph")); + if (cmdline.isset ("complexity-graph-root")) + { + std::stringstream stream; + for (const std::string &val : cmdline.get_values("complexity-graph-root")) + { + stream << val << ","; + } + options.set_option ("complexity-graph-roots", stream.str()); + } + + if (cmdline.isset ("complexity-graph-omit-function")) + { + std::stringstream stream; + for (const std::string &val : cmdline.get_values("complexity-graph-omit-function")) + { + stream << val << ","; + } + options.set_option ("complexity-graph-omit-function", stream.str()); + } + + if (cmdline.isset ("complexity-graph-global-scores")) + { + options.set_option ("complexity-graph-global-scores", true); + } + + if (cmdline.isset ("complexity-graph-omit-function-pointers")) + { + options.set_option ("complexity-graph-omit-function-pointers", true); + } + + if (cmdline.isset ("complexity-graph-instructions")) + { + options.set_option ("complexity-graph-instructions", true); + } + + if (options.get_option ("show-complexity-graph") != "") + { + const std::string path = options.get_option("show-complexity-graph"); + show_complexity_graph(options, goto_model, path, ui_message_handler); + return CPROVER_EXIT_SUCCESS; + } + } + + if(cmdline.isset("list-undefined-functions")) { list_undefined_functions(goto_model, std::cout); @@ -1771,6 +1819,7 @@ void goto_instrument_parse_optionst::help() " --show-symbol-table show loaded symbol table\n" " --list-symbols list symbols with type information\n" HELP_SHOW_GOTO_FUNCTIONS + HELP_SHOW_COMPLEXITY_GRAPH HELP_GOTO_PROGRAM_STATS " --show-locations show all source locations\n" " --dot generate CFG graph in DOT format\n" diff --git a/src/goto-instrument/goto_instrument_parse_options.h b/src/goto-instrument/goto_instrument_parse_options.h index 905e7d191f1..c462ef364c5 100644 --- a/src/goto-instrument/goto_instrument_parse_options.h +++ b/src/goto-instrument/goto_instrument_parse_options.h @@ -22,6 +22,7 @@ Author: Daniel Kroening, kroening@kroening.com #include #include #include +#include #include #include @@ -67,6 +68,7 @@ Author: Daniel Kroening, kroening@kroening.com "(nondet-static-exclude):" \ "(function-enter):(function-exit):(branch):" \ OPT_SHOW_GOTO_FUNCTIONS \ + OPT_COMPLEXITY_GRAPH \ OPT_SHOW_PROPERTIES \ "(drop-unused-functions)" \ "(show-value-sets)" \ diff --git a/src/goto-programs/Makefile b/src/goto-programs/Makefile index 3d6a56f5416..6810c07e94b 100644 --- a/src/goto-programs/Makefile +++ b/src/goto-programs/Makefile @@ -35,6 +35,7 @@ SRC = allocate_objects.cpp \ link_goto_model.cpp \ link_to_library.cpp \ loop_ids.cpp \ + metrics.cpp \ mm_io.cpp \ name_mangler.cpp \ osx_fat_reader.cpp \ @@ -60,6 +61,7 @@ SRC = allocate_objects.cpp \ resolve_inherited_component.cpp \ safety_checker.cpp \ set_properties.cpp \ + show_complexity_graph.cpp \ show_goto_functions.cpp \ show_goto_functions_json.cpp \ show_goto_functions_xml.cpp \ diff --git a/src/goto-programs/complexity_graph.h b/src/goto-programs/complexity_graph.h new file mode 100644 index 00000000000..5d3603d3f9c --- /dev/null +++ b/src/goto-programs/complexity_graph.h @@ -0,0 +1,168 @@ +/*******************************************************************\ + +Module: Provides the graph datatype for the complexity call graph + +Author: Benjamin Quiring + +\*******************************************************************/ + +/// \file +/// Provides the graph datatype for the complexity call graph + +class complexity_grapht +{ + public: + + // a class for nodes + class nodet + { + public: + + // the different types for nodes + enum class node_typet {FUNCTION, LOOP, FUNCTION_POINTER}; + + // the name of this node + irep_idt name; + + // the display name of this node + std::string display_name; + + // the type of this node + node_typet node_type; + + // the properties this node has + std::set properties; + + // the collection of instructions this node owns + std::vector> instructions; + + // the indegree of this node + int indegree; + + // the outdegree of this node + int outdegree; + + // The constructor for nodes. + nodet (const irep_idt name_, const std::string &display_name_, node_typet node_type_) + { + name = name_; + display_name = display_name_; + node_type = node_type_; + } + + // Adds a property to a node. + void add_property (const irep_idt &key) + { + properties.insert (key); + } + + // Checks if a node has a property + const bool has_property (const irep_idt &key) const + { + return properties.find (key) != properties.end(); + } + + }; + + // A class for edges in the graph + class edget + { + + public: + + // the 'from' edge + irep_idt node1; + + // the 'to' edge + irep_idt node2; + + // properties this edge has + std::set properties; + + // Constructor for edges + edget (const irep_idt node1_, const irep_idt node2_) + { + node1 = node1_; + node2 = node2_; + } + + // Adds a property to this edge + void add_property (const irep_idt key) + { + properties.insert (key); + } + + // Checks if this edge has a property + const bool has_property (const irep_idt key) + { + return properties.find (key) != properties.end(); + } + + }; + + // The map from names to nodes for this graph + std::map node_map; + + // The map from names to names to edges for this graph + std::map> edge_map; + + // The dual map of edge_map + std::map> dual_edge_map; + + complexity_grapht () + { + + } + + // Adds a node to the graph + nodet &add_node (const nodet node) + { + node_map.insert ({node.name, node}); + edge_map.insert ({node.name, std::map()}); + dual_edge_map.insert ({node.name, std::map()}); + return node_map.find (node.name)->second; + } + + // Finds a node in the graph + const nodet &find_node (const irep_idt &name) const + { + return node_map.find (name)->second; + } + + // Checks if the graph has a node with this name + const bool has_node (const irep_idt &name) const + { + return (node_map.find (name) != node_map.end()); + } + + // Adds an edge to the graph + edget& add_edge (const irep_idt &name1, const irep_idt &name2) + { + nodet node1 = find_node (name1); + nodet node2 = find_node (name2); + + node1.outdegree++; + node2.indegree++; + + const auto &found = edge_map.find (name1); + std::map &edge_map2 = found->second; + const auto &found2 = edge_map2.find (name2); + if (found2 == edge_map2.end()) + { + edge_map2.insert({name2, edget(name1, name2)}); + } + + { + const auto &found = dual_edge_map.find (name2); + std::map &edge_map2 = found->second; + const auto &found2 = edge_map2.find (name1); + if (found2 == edge_map2.end()) + { + edge_map2.insert({name1, edget(name2, name1)}); + } + } + + return edge_map2.find(name1)->second; + } + +}; diff --git a/src/goto-programs/metrics.cpp b/src/goto-programs/metrics.cpp new file mode 100644 index 00000000000..9270a477d46 --- /dev/null +++ b/src/goto-programs/metrics.cpp @@ -0,0 +1,303 @@ +/*******************************************************************\ + +Module: Show the goto functions as a dot program + +Author: Benjamin Quiring + +\*******************************************************************/ + +/// \file +/// Compute metrics for the CFG + +#include "metrics.h" + +#include +#include + +#include +#include + +size_t num_func_pointer_calls (const instruction_collectiont &instructions) +{ + size_t count = 0; + for (const auto &insts : instructions) + { + for (const auto &target : insts) + { + if(target->is_function_call()) + { + count += (target->call_function().id() == ID_dereference); + } + } + } + + return count; +} + +// number of loops = number of backward jumps +size_t num_loops (const instruction_collectiont &instructions) +{ + size_t num_loops = 0; + + for (const auto &insts : instructions) + { + for (const auto &target : insts) + { + if(target->is_backwards_goto()) + { + const exprt &cond = target->condition(); + const bool degenerate = (cond.id() == ID_notequal) + && (cond.operands()[0].is_zero()) + && (cond.operands()[1].is_zero()); + // ignore instructions such as 'IF 0 != 0 GOTO 1' + if (!degenerate) + { + num_loops = num_loops + 1; + } + } + } + } + + return num_loops; +} + +// compute an integer size for an expr +size_t expr_size (const exprt e) +{ + if (e.has_operands()) + { + const exprt::operandst &ops = e.operands(); + size_t size = 1; + for (const auto &op : ops) + { + size += expr_size (op); + } + return size; + } else + { + return 0; + } +} + +// compute an integer size for a function body +// the size of a function body is the sum of the sizes of all expression right-hand sides, +// excluding assertions and assumptions. +// the size of an expression is equal to the number of non-trivial subexpressions, i.e. +// the number of nodes that have operands. +size_t function_size (const instruction_collectiont &instructions) +{ + size_t size = 0; + for (const auto &insts : instructions) + { + for (const auto &target : insts) + { + if(target->is_function_call()) + { + size += 1 + expr_size (target->call_function()); + for (const auto &arg : target->call_arguments()) + { + size += expr_size (arg); + } + } else if (target->is_assign()) + { + size += expr_size (target->assign_lhs()); + size += expr_size (target->assign_rhs()); + } else if (target->is_assume()) + { + size += expr_size (target->condition()); + } else if (target->is_set_return_value()) + { + size += expr_size (target->return_value()); + } + } + } + return size; +} + +size_t num_complex_user_ops_expr (const exprt &e) +{ + size_t sum = 0; + + if (e.id() == ID_dereference || // pointer dereference + e.id() == ID_pointer_offset || // pointer dereference with offset? + e.id() == ID_field_address || // struct field selection + e.id() == ID_index) // indexing an array + { + sum += 1; + } + + if (e.has_operands()) + { + for (const exprt &oper : e.operands()) + { + sum += num_complex_user_ops_expr (oper); + } + } + + return sum; +} + +size_t num_complex_user_ops (const instruction_collectiont &instructions) +{ + size_t count = 0; + + for (const auto &insts : instructions) + { + for (const auto &target : insts) + { + if (target->is_function_call()) + { + count += num_complex_user_ops_expr (target->call_lhs()); + for (const auto &oper : target->call_arguments()) + { + count += num_complex_user_ops_expr (oper); + } + } else if (target->is_assign()) + { + count += num_complex_user_ops_expr (target->assign_lhs()); + count += num_complex_user_ops_expr (target->assign_rhs()); + } else if (target->is_assert()) + { + count += num_complex_user_ops_expr (target->condition()); + } else if (target->is_assume()) + { + count += num_complex_user_ops_expr (target->condition()); + } else if (target->is_set_return_value()) + { + count += num_complex_user_ops_expr (target->return_value()); + } + } + } + return count; +} + +size_t num_complex_lib_funcs (const instruction_collectiont &instructions, const std::set &lib_funcs) +{ + size_t count = 0; + + for (const auto &insts : instructions) + { + for (const auto &target : insts) + { + if (target->is_function_call() && target->call_function().id() == ID_symbol) + { + irep_idt f = to_symbol_expr(target->call_function()).get_identifier(); + if (lib_funcs.find (f) != lib_funcs.end()) + { + count += 1; + } + } + } + } + return count; +} + +size_t num_complex_cbmc_ops_expr (const exprt &e) +{ + size_t sum = 0; + + const irep_idt &e_id = e.id(); + if (e_id == ID_byte_extract_big_endian || + e_id == ID_byte_extract_little_endian || + e_id == ID_byte_update_big_endian || + e_id == ID_byte_update_little_endian || + e_id == ID_allocate) + { + sum += 1; + } + + if (e.has_operands()) + { + for (const exprt &oper : e.operands()) + { + sum += num_complex_cbmc_ops_expr (oper); + } + } + + return sum; +} + +size_t num_complex_cbmc_ops (const instruction_collectiont &instructions) +{ + size_t count = 0; + for (const auto &insts : instructions) + { + for (const auto &target : insts) + { + if (target->is_function_call()) + { + count += num_complex_cbmc_ops_expr (target->call_lhs()); + for (const auto &oper : target->call_arguments()) + { + count += num_complex_cbmc_ops_expr (oper); + } + } else if (target->is_assign()) + { + count += num_complex_cbmc_ops_expr (target->assign_lhs()); + count += num_complex_cbmc_ops_expr (target->assign_rhs()); + } else if (target->is_assert()) + { + count += num_complex_cbmc_ops_expr (target->condition()); + } else if (target->is_assume()) + { + count += num_complex_cbmc_ops_expr (target->condition()); + } else if (target->is_set_return_value()) + { + count += num_complex_cbmc_ops_expr (target->return_value()); + } + } + } + return count; +} + +const double func_metricst::avg_time_per_symex_step () const +{ + if (symex_info.steps == 0) + { + return 0.0; + } + return (symex_info.duration / (double)symex_info.steps); +} + +const void func_metricst::dump_html (std::ostream &out) const +{ + std::string endline = "
"; + size_t avg_time_per_step = (size_t)avg_time_per_symex_step()/10000; + out << "overall function size: " << function_size; + if (num_complex_user_ops != 0) + { + out << endline << "complex user ops: " << num_complex_user_ops; + } + if (num_complex_lib_funcs != 0) + { + out << endline << "complex library functions: " << num_complex_lib_funcs; + } + if (num_complex_cbmc_ops != 0) + { + out << endline << "complex CBMC ops: " << num_complex_cbmc_ops; + } + if (num_func_pointer_calls != 0) + { + out << endline << "function pointer calls: " << num_func_pointer_calls; + } + if (num_loops != 0) + { + out << endline << "loops: " << num_loops; + } + + if (use_symex_info) + { + out << endline + << "symex steps: " << symex_info.steps << endline + << "symex duration (ms): " << (size_t)(symex_info.duration / 1000000.0) << endline + << "symex avg time per step: " << avg_time_per_step; + } + + if (use_solver_info) + { + out << endline + << "solver clauses: " << solver_info.clauses << endline + << "solver literals: " << solver_info.literals << endline + << "solver variables: " << solver_info.variables; + } +} diff --git a/src/goto-programs/metrics.h b/src/goto-programs/metrics.h new file mode 100644 index 00000000000..6aafbcdf8cc --- /dev/null +++ b/src/goto-programs/metrics.h @@ -0,0 +1,171 @@ +/*******************************************************************\ + +Module: Provides metrics for the complexity call graph + +Author: Benjamin Quiring + +\*******************************************************************/ + +/// \file +/// Provides metrics used for the complexity call graph + +#ifndef CPROVER_COMPLEXITY_GRAPH_METRICS_H +#define CPROVER_COMPLEXITY_GRAPH_METRICS_H + +#include +#include + +class namespacet; +class goto_functionst; + +class solver_infot +{ + public: + // the number of clauses, literals, and variables + size_t clauses = 0; + size_t literals = 0; + size_t variables = 0; + + // Adds together this and another solver_info + // \param other: the other solver_info to add + // \return: this solver_info + solver_infot &operator+= (const solver_infot &other) + { + clauses += other.clauses; + literals += other.literals; + variables += other.variables; + return *this; + } + + // Adds together this and another solver_info + // \param other: the other solver_info to add + // \return: the sum of this solver_info and the other + solver_infot operator+(const solver_infot &other) + { + solver_infot info (*this); + info += other; + return info; + } +}; + +class symex_infot +{ + public: + // number of symex steps + size_t steps = 0; + // duration of symex steps (in nanoseconds) + double duration = 0.0; + + // Adds together this and another symex_info + // \param other: the other solver_info to add + // \return: this solver_info + symex_infot &operator+= (const symex_infot &other) + { + steps += other.steps; + duration += other.duration; + return *this; + } + + // Adds together this and another symex_info + // \param other: the other solver_info to add + // \return: the sum of this solver_info and the other + symex_infot operator+(const symex_infot &other) + { + symex_infot info(*this); + info += other; + return info; + } + +}; + + +class func_metricst +{ + + public: + // how many calls to function pointers are in the function's body + size_t num_func_pointer_calls = 0; + + // sum of the sides of all right-hand sides in the function body + size_t function_size = 0; + + // number of high-complexity primitives in the function's body + // e.g. struct field access, array indexing, pointer dereferencing + size_t num_complex_user_ops = 0; + + // number of standard library functions called + // e.g. memcpy, memmove, memcmp, malloc, free, realloc + size_t num_complex_lib_funcs = 0; + + // number of high-complexity CBMC-internal functions + // e.g. byte_extract_little_endian, + // byte_extract_big_endian, + // byte_update_little_endian, + // byte_update_big_endian, + size_t num_complex_cbmc_ops = 0; + + // number of loops (backwards jumps) in the function's body + size_t num_loops = 0; + + // wheter there is valid symex info + bool use_symex_info = false; + symex_infot symex_info; + + // wheter there is valid solver info + bool use_solver_info = false; + solver_infot solver_info; + + // Dumps an html representation of this metrics + // \param out: the output stream to dump to + const void dump_html (std::ostream &out) const; + + // average symex time per step in nanoseconds/step + const double avg_time_per_symex_step () const; + +}; + +using instruction_collectiont = std::vector>; + +size_t num_loops (const instruction_collectiont &instructions); + +size_t function_size (const instruction_collectiont &instructions); + +size_t num_func_pointer_calls (const instruction_collectiont &instructions); + +size_t num_complex_user_ops (const instruction_collectiont &instructions); + +size_t num_complex_lib_funcs + (const instruction_collectiont &instructions, + const std::set &lib_funcs); + +size_t num_complex_cbmc_ops (const instruction_collectiont &instructions); + +symex_infot aggregate_symex_info + (const instruction_collectiont &instructions, + const std::map &instr_symex_info); + +solver_infot aggregate_solver_info + (const instruction_collectiont &instructions, + const std::map &instr_symex_info); + +template T aggregate_instr_info ( + const instruction_collectiont &instructions, + const std::map &instr_info) +{ + T total; + for (const auto &insts : instructions) + { + for (const auto &target : insts) + { + const auto &info = instr_info.find (target); + if (info != instr_info.end()) + { + const T &other = info->second; + total += other; + } + } + } + return total; +} + +#endif // CPROVER_GOTO_PROGRAMS_PROOF_CFG_METRICS_H diff --git a/src/goto-programs/show_complexity_graph.cpp b/src/goto-programs/show_complexity_graph.cpp new file mode 100644 index 00000000000..30356b4ae2f --- /dev/null +++ b/src/goto-programs/show_complexity_graph.cpp @@ -0,0 +1,1094 @@ +/*******************************************************************\ + +Module: Show the goto functions as a dot program + +Author: Benjamin Quiring + +\*******************************************************************/ + +/// \file +/// Show goto functions + +#include "show_complexity_graph.h" + +#include +#include +#include +#include + +#include +#include +#include +#include + +#include "goto_model.h" +#include "pointer_expr.h" +#include "complexity_graph.h" + +#include + +#include "compute_called_functions.h" +#include +#include + +// is this function a private function? +// does name look like "__CPROVER_file_local_{filename}_c_{name}"; +// \param name: the name to inspect +bool is_private (const irep_idt &name) +{ + + const std::string str (name.c_str(), name.size()); + + return (str.find ("__CPROVER_file_local_") != std::string::npos); +} + +// produces a name without CBMC internal naming effects +// \param name: the name to normalize +std::string normalize_name (const irep_idt &name) +{ + // "__CPROVER_file_local_{filename}_c_{name}"; + + const std::string str (name.c_str(), name.size()); + + if (str.find ("__CPROVER_file_local_") != std::string::npos) + { + return str.substr (str.find ("_c_") + 3); + + } + + return str; +} + +// replaces all substrings in the string with the provided string +// \param str: the string to look in +// \param from: the substring to look for +// \param to: the replacement +void replace_all_substrings (std::string &str, const std::string &from, const std::string &to) +{ + size_t index = 0; + while (true) + { + index = str.find(from, index); + if (index == std::string::npos) break; + str.replace(index, from.length(), to); + index += from.length(); + } +} + +void normalize_html (std::string &str) +{ + // other characters which don't seem to cause a problem: {"$", "&dollar"}, {":", "&colon"} + replace_all_substrings (str, "\"", """); + replace_all_substrings (str, ">", ">"); + replace_all_substrings (str, "<", "<"); + replace_all_substrings (str, "\n", "
"); +} + +// initialize the data structures used for function pointer target detection +void function_pointer_setup (const symbol_tablet &symbol_table, + const namespacet &ns, + const goto_functionst &goto_functions, + std::map &type_map, + std::unordered_set &address_taken) +{ + + for(const auto &s : symbol_table.symbols) + { + compute_address_taken_functions(s.second.value, address_taken); + } + + compute_address_taken_functions(goto_functions, address_taken); + + for(const auto &gf_entry : goto_functions.function_map) + { + type_map.emplace( + gf_entry.first, to_code_type(ns.lookup(gf_entry.first).type)); + } + +} + +// initialize the data structures used for function pointer target detection +// adapted from goto-programs/remove_function_pointers.c +void find_functions_for_function_pointer ( + const symbol_tablet &symbol_table, + const namespacet &ns, + remove_const_function_pointerst::functionst &functions, + message_handlert &message_handler, + const goto_programt::const_targett &target, + const std::map &type_map, + const std::unordered_set &address_taken) +{ + + const auto &function = to_dereference_expr(as_const(*target).call_function()); + + // this better have the right type + code_typet call_type=to_code_type(function.type()); + + // refine the type in case the forward declaration was incomplete + if(call_type.has_ellipsis() && + call_type.parameters().empty()) + { + call_type.remove_ellipsis(); + for(const auto &argument : as_const(*target).call_arguments()) + { + call_type.parameters().push_back(code_typet::parametert(argument.type())); + } + } + + const exprt &pointer = function.pointer(); + remove_const_function_pointerst fpr(message_handler, ns, symbol_table); + bool found_functions=fpr(pointer, functions); + + if(!found_functions) + { + bool return_value_used = as_const(*target).call_lhs().is_not_nil(); + + // get all type-compatible functions + // whose address is ever taken + for(const auto &t : type_map) + { + // address taken? + if(address_taken.find(t.first)==address_taken.end()) + continue; + + // type-compatible? + if(!function_is_type_compatible( + return_value_used, call_type, t.second, ns)) + continue; + + if(t.first=="pthread_mutex_cleanup") + continue; + + symbol_exprt expr(t.first, t.second); + functions.insert(expr); + } + } + +} + + +// populates the nodes of the grapph recursively +void produce_node_rec ( + const goto_functionst &goto_functions, + const namespacet &ns, + const irep_idt name, + complexity_grapht &graph, + const std::set &lib_funcs, + const std::set &omitted_functions, + const bool omit_function_pointers, + std::function find_functions_for_function_pointer) +{ + if (!(graph.has_node (name))) + { + std::string display_name = normalize_name(name); + normalize_html(display_name); + complexity_grapht::nodet &node = graph.add_node (complexity_grapht::nodet (name, display_name, complexity_grapht::nodet::node_typet::FUNCTION)); + + if (lib_funcs.find(name) != lib_funcs.end()) + { + node.add_property ("library"); + } + + if (is_private(name)) + { + node.add_property ("private"); + } + + const auto fun = goto_functions.function_map.find(name); + if (fun != goto_functions.function_map.end()) + { + bool has_body = fun->second.body_available(); + + if (has_body) + { + auto target = fun->second.body.instructions.begin(); + auto end = fun->second.body.instructions.end(); + node.instructions.push_back(std::vector()); + + while (target != end) + { + // FUTURE: can segment function bodies into loops + node.instructions[0].push_back (target); + + if(target->is_function_call()) + { + // normal function calls + if (target->call_function().id() != ID_dereference) + { + const irep_idt call = ns.lookup(to_symbol_expr(target->call_function())).name; + + std::stringstream stream; + stream << call; + std::string str_call = stream.str(); + + if (omitted_functions.find(str_call) == omitted_functions.end()) + { + if (!graph.has_node(call)) + { + produce_node_rec (goto_functions, ns, call, graph, lib_funcs, + omitted_functions, + omit_function_pointers, find_functions_for_function_pointer); + } + graph.add_edge (node.name, call); + } + } else { + // function pointer calls + if (!omit_function_pointers) + { + const exprt &pointer = to_dereference_expr(target->call_function()).pointer(); + std::stringstream stream; + stream << "\"" << format(pointer) << "\""; + + + std::stringstream rhs_stream; + rhs_stream << node.name << "." << stream.str(); + std::string rhs = rhs_stream.str(); + std::string rhs_display = stream.str(); + normalize_html(rhs_display); + + graph.add_node (complexity_grapht::nodet (rhs, rhs_display, complexity_grapht::nodet::node_typet::FUNCTION_POINTER)); + graph.add_edge (node.name, rhs); + + remove_const_function_pointerst::functionst functions; + + find_functions_for_function_pointer (target, functions); + for (const symbol_exprt &function : functions) + { + const irep_idt &name = function.get_identifier(); + if (!graph.has_node (name)) + { + std::string display_name = normalize_name (name); + normalize_html(display_name); + graph.add_node (complexity_grapht::nodet (name, display_name, complexity_grapht::nodet::node_typet::FUNCTION)); + } + + graph.add_edge (rhs, name); + } + } + } + } + + target++; + } + + } else + { + node.add_property ("no_body"); + } + } else + { + node.add_property ("no_definition"); + } + } +} + +// populates the graph with nodes and edges based on the given collection of goto_functions +// \param lib_funcs: a set of library functions +// \param omitted_functions: a set of functions to omit from the graph +// \param omit_function_pointers: whether function pointers should be included in the graph +// \param find_functions_for_function_pointer: produces the targets that +// a function pointer can resolve to +void produce_graph ( + const symbol_tablet &symbol_table, + const namespacet &ns, + message_handlert& message_handler, + const std::map &type_map, + const std::unordered_set &address_taken, + const std::vector &roots, + const goto_functionst &goto_functions, + complexity_grapht &graph, + const std::set &lib_funcs, + const bool omit_function_pointers, + const std::set &omitted_functions) +{ + + std::function find_functions_for_fp = + [&symbol_table, &ns, &message_handler, &type_map, &address_taken] (goto_programt::const_targett &target, remove_const_function_pointerst::functionst &functions) { + find_functions_for_function_pointer (symbol_table, + ns, + functions, + message_handler, + target, + type_map, + address_taken); + }; + + for (const auto &root : roots) + { + const irep_idt &iden = root; + produce_node_rec (goto_functions, ns, iden, graph, lib_funcs, + omitted_functions, omit_function_pointers, + find_functions_for_fp); + } + +} + +// provides a color given the numeric input. +// color ranges from white to red +std::string color_of_score (size_t score) +{ + size_t s = 255 - score; + std::stringstream stream; + // Red + stream << std::hex << 255; + // Green + if (s < 16) { stream << 0 << s; } else { stream << s; } + // Blue + if (s < 16) { stream << 0 << s; } else { stream << s; } + std::string color( stream.str() ); + return color; +} + +// provides a color given the numeric inputs. +// color ranges from white to magenta to yellow +std::string color_of_score (size_t score1, size_t score2) +{ + size_t s1 = 255 - score1; + size_t s2 = 255 - score2; + std::stringstream stream; + // Red + stream << std::hex << 255; + // Green + if (s1 < 16) { stream << 0 << s1; } else { stream << s1; } + // Blue + if (s2 < 16) { stream << 0 << s2; } else { stream << s2; } + std::string color( stream.str() ); + return color; +} + +// populate the metrics for each node in the graph +void compute_metrics (const complexity_grapht &graph, + std::map &metrics, + const std::set &lib_funcs, + const bool use_symex_info, + const std::map &instr_symex_info, + const bool use_solver_info, + const std::map &instr_solver_info) +{ + for (const auto &it : graph.node_map) + { + const complexity_grapht::nodet node = it.second; + func_metricst &m = metrics.find(node.name)->second; + + m.num_func_pointer_calls = num_func_pointer_calls (node.instructions); + m.function_size = function_size (node.instructions); + m.num_complex_user_ops = num_complex_user_ops (node.instructions); + m.num_complex_lib_funcs = num_complex_lib_funcs (node.instructions, lib_funcs); + m.num_complex_cbmc_ops = num_complex_cbmc_ops (node.instructions); + m.num_loops = num_loops (node.instructions); + + if (use_symex_info) + { + m.symex_info = aggregate_instr_info (node.instructions, instr_symex_info); + } + if (use_solver_info) + { + m.solver_info = aggregate_instr_info (node.instructions, instr_solver_info); + } + m.use_symex_info = use_symex_info; + m.use_solver_info = use_solver_info; + } +} + +// normalize the score map +void normalize_scores (std::map &scores) +{ + size_t max = 0; + for (auto it = scores.begin(); it != scores.end(); it++) + { + max = std::max (max, it->second); + } + + for (auto it = scores.begin(); it != scores.end(); it++) + { + // score between [0, inf) + size_t score = it->second; + // normalized score between [0, 255] + float norm_score = 255.0 * ((float) score) / ((float) max); + size_t normalized_score = std::min(255, std::max(0, (int) norm_score)); + it->second = normalized_score; + } +} + +// compute a score for each node based on the metrics +// currently uses a weighted sum +// does not use the solver or symex information at the moment +void compute_scores (std::map &metrics, + std::map &scores, + const bool use_symex_info, + const bool use_solver_info) +{ + size_t w_num_func_pointer_calls = 5; + size_t w_function_size = 1; + size_t w_num_complex_user_ops = 5; + size_t w_num_complex_lib_funcs = 50; + size_t w_num_complex_cbmc_ops = 5; + size_t w_num_loops = 20; + + for (auto it = metrics.begin(); it != metrics.end(); it++) + { + const irep_idt &name = it->first; + const func_metricst &m = it->second; + size_t score = w_num_func_pointer_calls * m.num_func_pointer_calls + + w_function_size * m.function_size + + w_num_complex_user_ops * m.num_complex_user_ops + + w_num_complex_lib_funcs * m.num_complex_lib_funcs + + w_num_complex_cbmc_ops * m.num_complex_cbmc_ops + + w_num_loops * m.num_loops; + scores.find(name)->second = score; + } +} + +void compute_top_sort_rec (const complexity_grapht &graph, + std::vector &top_sort, + std::set &seen, + const irep_idt &node) +{ + if (seen.find(node) == seen.end()) + { + seen.insert (node); + const auto &edge_map2 = graph.edge_map.find (node)->second; + for (const auto &it2 : edge_map2) + { + const irep_idt other = it2.first; + compute_top_sort_rec (graph, top_sort, seen, other); + } + top_sort.push_back (node); + } +} + +// computes a topological sort of the graph, in the case it is acyclic. +void compute_top_sort (const complexity_grapht &graph, + std::vector &top_sort) +{ + std::set seen; + for (const auto &it : graph.edge_map) + { + const irep_idt node = it.first; + if (seen.find(node) == seen.end()) + { + compute_top_sort_rec (graph, top_sort, seen, node); + } + } + std::reverse(top_sort.begin(), top_sort.end()); +} + +// counts the number of paths to each node in the graph, assuming the graph is acyclic. +void count_num_paths (const complexity_grapht &graph, + const std::vector &top_sort, + std::map &num_paths) +{ + for (const irep_idt &node : top_sort) + { + size_t path_count = 0; + for (const auto &it : graph.dual_edge_map.find(node)->second) + { + const irep_idt other = it.first; + // deals with cycles in a degenerate way + path_count += num_paths.find(other) == num_paths.end() ? 0 : num_paths.find(other)->second; + } + // source nodes have a path count of 1 + if (path_count == 0) + { + path_count = 1; + } + num_paths.insert ({node, path_count}); + } +} + +// computes a globalized score for a node dependent on how many paths it took to get there +void globalize_scores (const complexity_grapht &graph, + const std::map &scores, + std::map &global_scores) +{ + std::vector top_sort; + compute_top_sort (graph, top_sort); + std::map num_paths; + count_num_paths (graph, top_sort, num_paths); + + for (const irep_idt &node : top_sort) + { + size_t paths = num_paths.find (node)->second; + size_t base_score = 0; + + for (const auto &it : graph.edge_map.find(node)->second) + { + const irep_idt other = it.first; + base_score += global_scores.find(other)->second; + } + + size_t global_score = base_score * paths; + global_scores.find (node)->second = global_score; + } + +} + +// dumps a valid HTML table entry with the given text to the given output stream. +void dump_html_table_entry (std::ostream &out, const std::string &text, const std::string &color) +{ + out << ""; // style=\"font-family:'Courier', monospace\">"; + out << text; + out << "
"; +} + +// dumps an instruction to the given output stream, if it is determined to have high complexity. +// returns whether it was dumped. +bool dump_instruction + (const irep_idt &name, + const goto_programt::const_targett &target, + const goto_programt &program, + std::ostream &out, + const namespacet &ns, + const bool use_symex_info, + const std::map &instr_symex_info, + const symex_infot &max_symex_info, + const bool use_solver_info, + const std::map &instr_solver_info, + const solver_infot &max_solver_info) +{ + + size_t s_symex = 0; + if (use_symex_info) + { + auto symex_info = instr_symex_info.find (target); + if (symex_info != instr_symex_info.end()) + { + double duration = symex_info->second.duration; + double temp = 255.0 * duration / max_symex_info.duration; + s_symex = std::max(0, std::min (255, temp)); + } + } + + + size_t s_solver = 0; + if (use_solver_info) + { + auto solver_info = instr_solver_info.find (target); + if (solver_info != instr_solver_info.end()) + { + size_t clauses = solver_info->second.clauses; + size_t temp = 255 * clauses / max_solver_info.clauses; + s_solver = std::max(0, std::min (255, temp)); + } + } + std::string color = color_of_score (s_symex, s_solver); + + const goto_programt::instructiont &instruction = *target; + std::stringstream instr_str_stream; + irep_idt empty = ""; + program.output_instruction (ns, empty, instr_str_stream, instruction); + std::string instr_str = instr_str_stream.str(); + normalize_html (instr_str); + + bool use_instruction = (s_symex > 8) || (s_solver > 8); + if (use_instruction) + { + dump_html_table_entry (out, instr_str, color); + } + return use_instruction; + +} + +// dumps a vertex whose label contains all the high-complexity instructions associated with the given node. +// dumps an edge between the source node and the node for the body. +void dump_instructions + (const complexity_grapht::nodet &node, + std::map &dot_node_naming, + const goto_functiont::parameter_identifierst ¶ms, + const goto_programt &program, + const std::vector> &instructions, + std::ostream &out, + const namespacet &ns, + const bool use_symex_info, + const std::map &instr_symex_info, + const symex_infot &max_symex_info, + const bool use_solver_info, + const std::map &instr_solver_info, + const solver_infot &max_solver_info) +{ + + const irep_idt &name = node.name; + + std::stringstream instructions_stream; + bool any_used_instructions = false; + + size_t index = 0; + for (const auto &insts : instructions) + { + for (const auto &target : insts) + { + any_used_instructions = any_used_instructions || + dump_instruction (name, target, program, instructions_stream, ns, + use_symex_info, instr_symex_info, max_symex_info, + use_solver_info, instr_solver_info, max_solver_info); + } + + // dump a '...' line + if (index != instructions.size() - 1) + { + dump_html_table_entry (out, "...", "FFFFFF"); + } + + ++index; + } + + if (any_used_instructions) + { + std::stringstream body_name_stream; + body_name_stream << name << "_body"; + const irep_idt body_name = body_name_stream.str(); + + dot_node_naming.insert ({body_name, dot_node_naming.size()}); + + out << dot_node_naming.find(body_name)->second + << " ["; + + out << "shape=rectangle,"; //plaintext,"; + out << "fontsize=" << 4 << ","; + out << "fontname=\"Courier\","; + out << "label=<"; + out << " "; + out << instructions_stream.str(); + out << "
"; + out << ">]" << ";\n\n"; + + // add an edge between the function node and the body node, so that + // the body gets placed beneath the function node. + out << dot_node_naming.find(name)->second + << " -> " + << dot_node_naming.find(body_name)->second + << " [arrowhead=none]" + << "\n"; + } +} + + +// dumps all of the nodes in the graph to the output stream. +void dump_nodes +(const complexity_grapht &graph, + std::map &dot_node_naming, + std::ostream &out, + const bool include_instructions, + const namespacet &ns, + const goto_functionst &goto_functions, + const std::map &metrics, + const std::map &scores, + const bool use_symex_info, + const std::map &instr_symex_info, + const symex_infot &max_symex_info, + const bool use_solver_info, + const std::map &instr_solver_info, + const solver_infot &max_solver_info) +{ + + for (const auto &it : graph.node_map) + { + const complexity_grapht::nodet &node = it.second; + const irep_idt &name = node.name; + + std::string color = "000000"; + std::string fillcolor = "ffffff"; + std::string size = "8"; + std::string style = "filled"; + std::string shape; + std::stringstream label; + switch (node.node_type) + { + case complexity_grapht::nodet::node_typet::FUNCTION: + shape = node.has_property("library") ? "pentagon" : + node.has_property("no_body") ? "Mdiamond" : + node.has_property("private") ? "ellipse" : + "rect"; + if (node.has_property("library")) + { + color = "FF0000"; + } + fillcolor = color_of_score (scores.find (name)->second); + label << node.display_name; + label << "
"; + if (!node.has_property ("no_body")) + { + metrics.find(name)->second.dump_html (label); + } + break; + case complexity_grapht::nodet::node_typet::FUNCTION_POINTER: + shape = "rarrow"; + fillcolor = "ffffff"; + label << node.display_name; + break; + case complexity_grapht::nodet::node_typet::LOOP: + shape = "doublecircle"; + label << node.display_name; + break; + default: + break; + } + + out << "subgraph {rank=same;color=blue;\n"; + + out << "// " << node.display_name << "\n"; + out << dot_node_naming.find(name)->second + << " [" + << "label=" << "<" << label.str() << ">" << "," + << "shape=" << shape << "," + << "style=" << style << "," + << "color=" << "\"#" << color << "\"," + << "fillcolor=" << "\"#" << fillcolor << "\"," + << "fontsize=" << size + << "];\n\n"; + + + if (include_instructions) + { + const auto fun = goto_functions.function_map.find(node.name); + if (fun != goto_functions.function_map.end() && fun->second.body_available()) + { + const auto &func = fun->second; + const goto_functiont::parameter_identifierst ¶ms = func.parameter_identifiers; + const goto_programt &body = func.body; + dump_instructions(node, dot_node_naming, params, body, node.instructions, out, ns, + use_symex_info, instr_symex_info, max_symex_info, + use_solver_info, instr_solver_info, max_solver_info); + } + } + out << "}\n\n"; + } +} + +// dumps all of the edges in the graph to the output stream. +void dump_edges ( + const complexity_grapht &graph, + std::map &dot_node_naming, + std::ostream &out, + const optionst &options, + const namespacet &ns) +{ + + for (const auto &it : graph.edge_map) + { + complexity_grapht::nodet node1 = graph.find_node (it.first); + const std::map &edge_map2 = it.second; + for (const auto &it2 : edge_map2) + { + complexity_grapht::nodet node2 = graph.find_node (it2.first); + out << "// " << node1.display_name << " -> " << node2.display_name << "\n"; + out << dot_node_naming.find(node1.name)->second << " -> " << dot_node_naming.find(node2.name)->second << ";\n\n"; + } + } + + +} + +// parses a list of strings separated by the delimiter +// \param lst: the string to search through +// \param use: the container for the result +// \param delim: the delimiter to search for in lst +void parse_string_list (std::string lst, std::vector &use, const std::string &delim) +{ + while (lst.length() != 0) + { + const auto idx = lst.find(delim); + const std::string s = lst.substr (0, idx); + const irep_idt irep_s = s; + use.push_back(irep_s); + + lst = lst.substr (idx + delim.length()); + } +} + +// constructs the complexity graph, placing the result in graph +void construct_complexity_graph ( + complexity_grapht &graph, + const namespacet &ns, + const goto_functionst &goto_functions, + const symbol_tablet &symbol_table, + const optionst &options, + const std::set &lib_funcs, + message_handlert &message_handler) +{ + + std::vector roots; + parse_string_list (options.get_option ("complexity-graph-roots"), roots, ","); + + std::vector omitted_functions_lst; + parse_string_list (options.get_option ("complexity-graph-omit-function"), omitted_functions_lst, ","); + std::set omitted_functions; + for (const irep_idt &f : omitted_functions_lst) + { + omitted_functions.insert (f); + } + + + bool omit_function_pointers = options.get_bool_option ("complexity-graph-omit-function-pointers"); + + const auto sorted = goto_functions.sorted(); + + if (roots.size() == 0) + { + for (const auto &fun : sorted) + { + const auto &name = ns.lookup(fun->first).name; + roots.push_back(name); + } + // could also use goto_functions.entry_point() instead + } + + std::map type_map; + std::unordered_set address_taken; + + function_pointer_setup (symbol_table, ns, goto_functions, type_map, address_taken); + + produce_graph (symbol_table, ns, message_handler, type_map, address_taken, + roots, goto_functions, graph, lib_funcs, + omit_function_pointers, omitted_functions); + +} + +void compute_all_scores ( + const complexity_grapht &graph, + const goto_functionst &goto_functions, + const optionst &options, + std::map &metrics, + std::map &scores, + std::map &globalized_scores, + const std::set &lib_funcs, + const bool use_symex_info, + const std::map &instr_symex_info, + symex_infot &max_symex_info, + const bool use_solver_info, + const std::map &instr_solver_info, + solver_infot &max_solver_info) +{ + + for (const auto &it : graph.node_map) + { + const irep_idt &name = it.first; + metrics.insert({name, func_metricst()}); + scores.insert({name, 1}); + globalized_scores.insert({name, 1}); + } + + // initialize scores + compute_metrics (graph, metrics, lib_funcs, + use_symex_info, instr_symex_info, + use_solver_info, instr_solver_info); + compute_scores (metrics, scores, use_symex_info, use_solver_info); + if (options.get_bool_option("complexity-graph-global-scores")) + { + globalize_scores (graph, scores, globalized_scores); + normalize_scores (globalized_scores); + } + normalize_scores (scores); + + const auto sorted = goto_functions.sorted(); + + if (use_symex_info) + { + for(const auto &fun : sorted) + { + forall_goto_program_instructions(target, fun->second.body) + { + const auto &symex_info = instr_symex_info.find (target); + if (symex_info != instr_symex_info.end()) + { + max_symex_info.duration = std::max(max_symex_info.duration, symex_info->second.duration); + max_symex_info.steps = std::max(max_symex_info.steps, symex_info->second.steps); + } + } + } + } + + if (use_solver_info) + { + for(const auto &fun : sorted) + { + forall_goto_program_instructions(target, fun->second.body) + { + const auto &solver_info = instr_solver_info.find (target); + if (solver_info != instr_solver_info.end()) + { + max_solver_info.clauses = std::max(max_solver_info.clauses, solver_info->second.clauses); + max_solver_info.literals = std::max(max_solver_info.literals, solver_info->second.literals); + max_solver_info.variables = std::max(max_solver_info.variables, solver_info->second.variables); + } + } + } + } + +} + +// dumps the complexity graph to the output stream +void dump_complexity_graph ( + const complexity_grapht &graph, + const namespacet &ns, + const goto_functionst &goto_functions, + std::ostream &out, + const optionst &options, + const std::map &metrics, + const std::map &scores, + const std::map &globalized_scores, + const bool use_symex_info, + const std::map &instr_symex_info, + const symex_infot &max_symex_info, + const bool use_solver_info, + const std::map &instr_solver_info, + const solver_infot &max_solver_info) +{ + + std::map dot_node_naming; + { + size_t index = 0; + for (const auto &it : graph.node_map) + { + dot_node_naming.insert ({it.first, index}); + ++index; + } + } + + out << "digraph G {\n\n"; + + out << "rankdir=\"LR\";\n\n"; + + bool include_instructions = options.get_bool_option ("complexity-graph-instructions"); + const std::map &scores_to_use = options.get_bool_option ("complexity-graph-global-scores") ? globalized_scores : scores; + dump_nodes (graph, dot_node_naming, out, include_instructions, ns, goto_functions, + metrics, scores_to_use, + use_symex_info, instr_symex_info, max_symex_info, + use_solver_info, instr_solver_info, max_solver_info); + + dump_edges (graph, dot_node_naming, out, options, ns); + + out << "} // end digraph G\n"; + +} + +void complexity_graph_main( + const optionst &options, + const std::string &path, + const abstract_goto_modelt &goto_model, + message_handlert &message_handler, + const bool use_symex_info, + const std::map &instr_symex_info, + const bool use_solver_info, + const std::map &instr_solver_info) +{ + std::set lib_funcs = { + "strcpy", "strncpy", "__builtin___strcpy_chk", + "strcat", "strncat", "__builtin___strcat_chk", "__builtin___strncat_chk", + "strcmp", "strncmp", "__builtin___strncpy_chk", + "strcasecmp", "strncasecmp", + "strlen", + "strdup", + "strchr", + "strrchr", + "strerror", + + "memset", + "memcpy", "__builtin___memcpy_chk", + "memset", "__builtin_memset", "__builtin___memset_chk", + "memmove", "__builtin___memmove_chk", + "memcmp", + + "malloc", + "realloc", + "free" + }; + + const namespacet ns(goto_model.get_symbol_table()); + const goto_functionst &goto_functions = goto_model.get_goto_functions(); + + complexity_grapht graph; + construct_complexity_graph (graph, ns, goto_functions, goto_model.get_symbol_table(), + options, lib_funcs, message_handler); + + + std::map scores; + std::map globalized_scores; + std::map metrics; + + symex_infot max_symex_info; + solver_infot max_solver_info; + + compute_all_scores (graph, goto_functions, options, + metrics, scores, globalized_scores, + lib_funcs, + use_symex_info, instr_symex_info, max_symex_info, + use_solver_info, instr_solver_info, max_solver_info); + + std::ofstream outf (path.c_str()); + std::ostream &out = outf; + + dump_complexity_graph (graph, ns, goto_functions, + out, options, + metrics, scores, globalized_scores, + use_symex_info, instr_symex_info, max_symex_info, + use_solver_info, instr_solver_info, max_solver_info); + +} + + +void show_complexity_graph( + const optionst &options, + const abstract_goto_modelt &goto_model, + const std::string &path, + message_handlert &message_handler) +{ + const std::map instr_symex_info; + const bool use_symex_info = false; + const std::map instr_solver_info; + const bool use_solver_info = false; + + complexity_graph_main( + options, + path, + goto_model, + message_handler, + use_symex_info, + instr_symex_info, + use_solver_info, + instr_solver_info); +} + +void show_complexity_graph( + const optionst &options, + const abstract_goto_modelt &goto_model, + const std::string &path, + message_handlert &message_handler, + const std::map &instr_symex_info) +{ + const bool use_symex_info = true; + const std::map instr_solver_info; + const bool use_solver_info = false; + complexity_graph_main( + options, + path, + goto_model, + message_handler, + use_symex_info, + instr_symex_info, + use_solver_info, + instr_solver_info); +} + +void show_complexity_graph( + const optionst &options, + const abstract_goto_modelt &goto_model, + const std::string &path, + message_handlert &message_handler, + const std::map &instr_symex_info, + const std::map &instr_solver_info) +{ + const bool use_symex_info = true; + const bool use_solver_info = true; + complexity_graph_main( + options, + path, + goto_model, + message_handler, + use_symex_info, + instr_symex_info, + use_solver_info, + instr_solver_info); +} diff --git a/src/goto-programs/show_complexity_graph.h b/src/goto-programs/show_complexity_graph.h new file mode 100644 index 00000000000..875a7013eb2 --- /dev/null +++ b/src/goto-programs/show_complexity_graph.h @@ -0,0 +1,82 @@ +/*******************************************************************\ + +Module: Show the complexity call graph as a dot program + +Author: Benjamin Quiring + +\*******************************************************************/ + +/// \file +/// Show the complexity call graph, where nodes are colored based on a +/// "proof complexity" weight mechanism and have annotations describing +/// the features of different functions. + +#ifndef CPROVER_COMPLEXITY_GRAPH_SHOW_COMPLEXITY_GRAPH +#define CPROVER_COMPLEXITY_GRAPH_SHOW_COMPLEXITY_GRAPH + +#include +#include +#include +#include +#include +#include "metrics.h" +#include +#include + +class abstract_goto_modelt; +class ui_message_handlert; + +// clang-format off + +// flags usable by both goto-instrument and CBMC +#define HELP_SHOW_COMPLEXITY_GRAPH \ + " --show-complexity-graph show call graph with nodes colored with proof complexity\n" \ + " --complexity-graph-root provides a root for the complexity graph. TODO\n" \ + " --complexity-graph-omit-function omits a function from the complexity graph\n" \ + " --complexity-graph-omit-function-pointers omits function pointers from the complexity graph\n" \ + " --complexity-graph-global-scores uses a globalized notion of complexity for the complexity graph\n" + +// flags only usable with CBMC +#define HELP_SHOW_COMPLEXITY_GRAPH_CBMC \ + "--show-complexity-graph-with-symex show the complexity graph and include symbolic execution information \n" \ + "--show-complexity-graph-with-solver show the complexity graph and include symbolic execution and solver formula information \n" \ + "--complexity-graph-instructions display which GOTO instructions were complex for symbolic execution and generating the solver formula\n" + +// flags usable by both goto-instrument and CBMC +#define OPT_COMPLEXITY_GRAPH \ + "(show-complexity-graph)" \ + "(complexity-graph-root):" \ + "(complexity-graph-omit-function):" \ + "(complexity-graph-omit-function-pointers)" \ + "(complexity-graph-global-scores)" + +// flags only usable with CBMC +#define OPT_COMPLEXITY_GRAPH_CBMC \ + "(show-complexity-graph-with-symex):" \ + "(show-complexity-graph-with-solver):" \ + "(complexity-graph-instructions)" + +// clang-format on + +void show_complexity_graph( + const optionst &options, + const abstract_goto_modelt &, + const std::string &path, + message_handlert &message_handler); + +void show_complexity_graph( + const optionst &options, + const abstract_goto_modelt &, + const std::string &path, + message_handlert &message_handler, + const std::map &instr_symex_info); + +void show_complexity_graph( + const optionst &options, + const abstract_goto_modelt &, + const std::string &path, + message_handlert &message_handler, + const std::map &instr_symex_info, + const std::map &instr_solver_info); + +#endif // CPROVER_COMPLEXITY_GRAPH_SHOW_COMPLEXITY_GRAPH_H diff --git a/src/goto-symex/solver_hardness.cpp b/src/goto-symex/solver_hardness.cpp index ef8fcb4a951..c995ecaa34d 100644 --- a/src/goto-symex/solver_hardness.cpp +++ b/src/goto-symex/solver_hardness.cpp @@ -383,3 +383,8 @@ std::string solver_hardnesst::expr2string(const exprt expr) ss << format(expr); return ss.str(); } + + +const std::vector> solver_hardnesst::get_hardness_stats() const { + return hardness_stats; +} diff --git a/src/goto-symex/solver_hardness.h b/src/goto-symex/solver_hardness.h index 3b80855dcd0..9308a6e6684 100644 --- a/src/goto-symex/solver_hardness.h +++ b/src/goto-symex/solver_hardness.h @@ -129,6 +129,9 @@ struct solver_hardnesst : public clause_hardness_collectort solver_hardnesst &operator=(const solver_hardnesst &) = delete; solver_hardnesst &operator=(solver_hardnesst &&) = default; + const std::vector> get_hardness_stats() const; + + private: // A minor modification of \ref goto_programt::output_instruction static std::string goto_instruction2string(goto_programt::const_targett pc);