[WIP] Refactor .r2 scripts to Python using r2pipe #3685
Closed
Thanks for assigning this issue to me. I'm starting to work on it and will keep this PR's description up to date as I form a plan and make progress.
Original prompt
This section details the original issue you should resolve.
<issue_title>Rewrite all .r2 scripts in the MASTG-DEMO-xxxx folders to use python and r2pipe</issue_title>
<issue_description>ROLE.
You are an expert in radare2 automation with r2pipe for Python. You know radare2 analysis and query commands, you default to JSON-producing commands and r2pipe cmdj parsing, and you design maintainable utilities and minimal per-demo scripts. https://book.rada.re/scripting/r2pipe.html
GOAL.
Refactor the MASTG-DEMOs that currently run radare2 .r2 scripts via run.sh into Python scripts using r2pipe, so each demo can run on any compatible binary without hardcoded addresses by locating targets dynamically through analysis, symbols, imports, strings, xrefs, and call sites. ([rada.re][2])
CONTEXT YOU MUST FOLLOW.
The current scripts are repetitive and hardcode addresses. Each demo should become a tiny config plus a few calls, and all generic logic must move into a shared utility module "r2ooky" (under utils/r2/).
Avoid hardcoding patterns such as function names, class names, and field names. Instead, support configurable matchers (exact match, contains, regex) and allow multiple candidates with ranking and disambiguation output.
Every result must be reproducible and stable across binaries: print what was matched, how it was resolved, and the address or symbol chosen.
example: r2ooky MASTestApp config.json
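One way the configurable matchers and ranked disambiguation output could look is sketched below. The names `Candidate`, `rank_candidates`, and `describe`, the matcher dictionary keys, and the ranking order (exact before contains before regex) are all assumptions made for illustration; none of them come from the issue or from an existing r2ooky module.

```python
import re
from dataclasses import dataclass

# Hypothetical ranking: more specific matcher kinds win over looser ones.
_RANK = {"exact": 0, "contains": 1, "regex": 2}


@dataclass(frozen=True)
class Candidate:
    name: str
    addr: int
    matched_by: str  # matcher kind that accepted this candidate


def _matches(kind, pattern, name):
    if kind == "exact":
        return name == pattern
    if kind == "contains":
        return pattern in name
    if kind == "regex":
        return re.search(pattern, name) is not None
    raise ValueError(f"unknown matcher kind: {kind}")


def rank_candidates(symbols, matchers):
    """symbols: iterable of (name, addr); matchers: list of {"kind", "pattern"}."""
    ordered = sorted(matchers, key=lambda m: _RANK.get(m["kind"], len(_RANK)))
    found = []
    for name, addr in symbols:
        for m in ordered:
            if _matches(m["kind"], m["pattern"], name):
                # Record only the most specific matcher that accepted the name.
                found.append(Candidate(name, addr, m["kind"]))
                break
    # Sorting by rank, then address, then name keeps the order stable.
    return sorted(found, key=lambda c: (_RANK[c.matched_by], c.addr, c.name))


def describe(candidates):
    """Return one line per candidate, marking the first as the chosen one."""
    lines = []
    for i, c in enumerate(candidates):
        tag = "chosen" if i == 0 else "candidate"
        lines.append(f"{tag}: {c.name} @ {c.addr:#x} (matched by {c.matched_by})")
    return lines
```

Sorting on a fixed key rather than on discovery order is what makes the printed result reproducible when the same config is run against the same binary.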
DELIVERABLES.
RADARE2 AND R2PIPE RULES.
Always open radare2 with analysis enabled or run analysis early, then rely on analyzed function lists and xrefs, do not assume fixed base addresses. ([book.rada.re][3])
Prefer JSON output forms and parse with r2pipe cmdj, do not scrape human text output. ([r2wiki.readthedocs.io][1])
Where radare2 offers both text and JSON variants, choose the JSON variant, for example function list JSON, xrefs JSON, imports JSON, strings JSON. ([r2wiki.readthedocs.io][1])
If a command does not have JSON, isolate parsing to one place and wrap it behind a stable Python interface.
The output must be aligned as much as possible with the "hook" entries in https://github.com/OWASP/mastg/blob/master/demos/android/MASVS-CRYPTO/MASTG-DEMO-0058/output.json
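The rules above could be enforced by routing every radare2 query through one small wrapper, roughly as follows. The class name `R2Queries` and its method names are hypothetical. The radare2 commands used (`aaa`, `aflj`, `iij`, `izj`, `axtj`) are the JSON variants of the function list, imports, strings, and xrefs commands, and the wrapper accepts any object exposing `cmd()` and `cmdj()`, such as the handle returned by `r2pipe.open()`.

```python
class R2Queries:
    """Thin wrapper so demo scripts never issue raw radare2 commands."""

    def __init__(self, r2):
        # r2: any object with cmd() and cmdj(), e.g. r2pipe.open(path).
        self._r2 = r2

    def _json(self, command):
        # cmdj can return None when radare2 prints nothing; normalise to a list.
        return self._r2.cmdj(command) or []

    def analyze(self):
        # Run analysis early so function lists and xrefs are populated.
        self._r2.cmd("aaa")

    def functions(self):
        return self._json("aflj")

    def imports(self):
        return self._json("iij")

    def strings(self):
        return self._json("izj")

    def xrefs_to(self, addr):
        return self._json(f"axtj @ {addr:#x}")


# Assumed usage with a real binary (requires radare2 and r2pipe installed):
#   import r2pipe
#   q = R2Queries(r2pipe.open("MASTestApp"))
#   q.analyze()
#   print(len(q.functions()))
```

Keeping the command strings in one class means a command without a JSON form only needs its text parsing written once, behind the same interface.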
ADDRESS ELIMINATION STRATEGY.
Replace axt at a hardcoded address with dynamic resolution of the target address by symbol, import, or string, then compute xrefs to that resolved address, then for each xref compute context, surrounding disassembly, and containing function, then print. ([rada.re][2])
Replace disassembly at a hardcoded address with disassembly at each xref site or at the resolved call site address, and when needed seek to the containing function start and print a window.
Replace reads of magic constants with discovery of the instruction that loads the constant near the call site, then decode the immediate or the referenced memory based on architecture, and print the interpreted value.
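The first two replacements in this strategy could be sketched as below: resolve an import by name, ask for xrefs to the resolved address, then disassemble a small window at each call site. The function names and the shape of the returned dictionary are hypothetical. The JSON field names (`name` and `plt` from `iij`, `from`, `type`, and `fcn_name` from `axtj`, `disasm` and `opcode` from `pdj`) are assumptions about radare2's output and should be checked against the radare2 version in use.

```python
def resolve_import(r2, name):
    """Return the address of the import called `name`, or None if absent."""
    for imp in r2.cmdj("iij") or []:
        if imp.get("name") == name:
            return imp.get("plt")
    return None


def call_sites(r2, target_addr, window=4):
    """List xrefs to target_addr with `window` instructions of context each."""
    sites = []
    for xref in r2.cmdj(f"axtj @ {target_addr:#x}") or []:
        src = xref.get("from")
        if src is None:
            continue
        ops = r2.cmdj(f"pdj {window} @ {src:#x}") or []
        sites.append({
            "address": src,
            "function": xref.get("fcn_name"),
            "type": xref.get("type"),
            "disassembly": [op.get("disasm", op.get("opcode")) for op in ops],
        })
    # Sort by address so repeated runs print call sites in the same order.
    return sorted(sites, key=lambda s: s["address"])


def trace_import(r2, name, window=4):
    """Resolve `name` dynamically and report where it is referenced."""
    addr = resolve_import(r2, name)
    if addr is None:
        return {"target": name, "resolved": None, "call_sites": []}
    return {
        "target": name,
        "resolved": addr,
        "call_sites": call_sites(r2, addr, window),
    }
```

No address appears in the calling code: the only input is the import name, so the same call should work on a rebuilt binary where the layout has changed.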
CONFIG MODEL.
Define a declarative config schema per demo, describing targets and actions. Similar in structure and style to our frida/frooky hooks JSON: https://github.com/OWASP/mastg/blob/master/demos/android/MASVS-CRYPTO/MASTG-DEMO-0058/hooks.json
The per demo Python should only load config, call utilities, and print or write files.
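A declarative config of this kind might be loaded as in the sketch below. The schema shown (a top-level `targets` list whose entries carry `kind`, `match`, `pattern`, and optional `actions`) is invented for illustration and is not derived from the referenced hooks.json; the names `Target` and `load_config` are likewise hypothetical.

```python
import json
from dataclasses import dataclass, field

_REQUIRED = ("kind", "match", "pattern")


@dataclass
class Target:
    kind: str      # hypothetical values: "import", "symbol", "string"
    match: str     # hypothetical values: "exact", "contains", "regex"
    pattern: str
    actions: list = field(default_factory=lambda: ["xrefs"])


def load_config(text):
    """Parse a JSON config string into a list of Target objects."""
    raw = json.loads(text)
    targets = []
    for index, entry in enumerate(raw.get("targets", [])):
        missing = [key for key in _REQUIRED if key not in entry]
        if missing:
            raise ValueError(f"target {index} is missing keys: {missing}")
        targets.append(Target(
            kind=entry["kind"],
            match=entry["match"],
            pattern=entry["pattern"],
            actions=list(entry.get("actions", ["xrefs"])),
        ))
    return targets


# Assumed per-demo usage: read config.json, then hand each target to the
# shared utilities and print or write the results.
#   with open("config.json") as fh:
#       targets = load_config(fh.read())
```

Validating required keys at load time keeps the per-demo script down to loading, calling, and printing, with schema errors reported before radare2 is opened.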
MAPPING THE EXISTING DEMOS.
Provide a section that rewrites each of the following patterns into config plus utility calls, and show the expected printed output shape.
Common...