-
Notifications
You must be signed in to change notification settings - Fork 4
Perform SSA variable renaming. #14
Conversation
@@ -12,7 +12,8 @@ test = false | |||
[dependencies] | |||
rustc = { path = "../librustc" } | |||
rustc_yk_link = { path = "../librustc_yk_link" } | |||
ykpack = { git = "https://github.com/softdevteam/ykpack" } | |||
#ykpack = { git = "https://github.com/softdevteam/ykpack" } | |||
ykpack = { path = "../../../ykpack" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be revised once the ykpack change is in.
I just realised that ykpack is now in the mono repo. We could do with that transition being complete before I raise the companion PR to this PR. |
The transition is complete isn't it? |
@@ -118,6 +118,15 @@ impl<Node: Idx> Dominators<Node> { | |||
self.immediate_dominators[node].unwrap() | |||
} | |||
|
|||
/// Find the children of a node in the dominator tree. | |||
pub fn immediately_dominates(&self, node: Node) -> Vec<Node> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this will be possible (it probably depends on the borrow checker), but it would almost certainly be faster to make this an iterator rather than a producer of a Vec
. You could probably get away with -> impl Iterator<Node>
or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a reason why I didn't do this, but I don't recall what it was. Let me experiment and come back to you.
I don't think so. We have an outstanding PR here: |
src/librustc_yk_sections/emit_tir.rs
Outdated
use rustc_data_structures::fx::FxHashSet; | ||
|
||
const SECTION_NAME: &'static str = ".yk_tir"; | ||
const TMP_EXT: &'static str = ".yk_tir.tmp"; | ||
|
||
/// The pre-SSA return value variable. In the pre-SSA TIR, we need the return value to reside at an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first half of this comment makes sense, but the second half is confusing. Possibly it's just because some sentences are in the wrong order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you expand on that -- it makes sense to me, but then I'm not a good critic for my own work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you just need to try rephrasing it. I can't make sense of the comment as a whole at the moment.
src/librustc_yk_sections/emit_tir.rs
Outdated
@@ -57,25 +72,52 @@ struct ConvCx<'a, 'tcx, 'gcx> { | |||
/// Maps each block to the variable it defines. This is what Appel calls `A_{orig}`. | |||
block_defines: RefCell<IndexVec<BasicBlock, FxHashSet<TirLocal>>>, | |||
/// Monotonically increasing number used to give TIR variables a unique ID. | |||
/// Note that a 0 is reserved for `PRE_SSA_RET_VAR`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"a 0" -> "0"
src/librustc_yk_sections/emit_tir.rs
Outdated
/// From the Appel book: | ||
/// "We consider the start node to contain an implicit definition of every variable, | ||
/// either because the variable may be a formal parameter or to represent the notion of | ||
/// a ← uninitialized without special cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't parse the part of the comment on this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something has gone wrong somewhere. That's supposed to be \bottom. Will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we can delimit the abstract algebra in some quotes for readability. I think the problem is notion of a <- uninitialized
is hard to parse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, maybe this just needs a bit of rephrasing / reformatting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the quotes in the last commit. Does that work for you?
let n_usize = n as usize; | ||
// We have to remember the variables whose stacks we must pop from when we come back from | ||
// recursion. These must be the variables *before* they were renamed. | ||
let mut pop_later = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can a variable appear more than once in here? Does it happen often that variables end up in here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every variable assignment in the block we are processing will push to this vector. The same variable may appear more than once if a block assigns to the same variable many times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this should probably be a set to avoid (I assume pointlessly) poping the same thing over and over again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a variable is assigned to X times, then we need to pop X times.
The the algorithm recursively walks the dominator tree depth-first. When it sees an assignment to a variable, it makes a new SSA variable, rewrites the old variable. and pushes the new variable onto the stack of the corresponding old variable. When we are bubbling up from recursion, we have to revert to the previous SSA versions of the variables. So if a block assigned to a variable z
twice, we'd need to revert to z
as of two assignments ago, thus popping twice.
Hard to explain...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
src/librustc_yk_sections/emit_tir.rs
Outdated
// "Suppose n is the jth predecessor of y". | ||
let j = mir.predecessors_for(y).iter().position(|b| b == &n_idx).unwrap(); | ||
for st in &mut blks[y.as_usize()].stmts { | ||
match st.phi_arg_mut(j) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if let Some(ref mut a) = st.phi_arg_mut(j) { ... }
This should address the comments. Untested. Building now... |
Our yk_tir test passes and manual inspection of the graphs show up nothing dodgy. |
src/librustc_yk_sections/emit_tir.rs
Outdated
/// The pre-SSA return value variable. In MIR, a return terminator implicitly returns variable | ||
/// zero. We can't do this in TIR because TIR is in SSA form and the variable we return depends | ||
/// upon which SSA variable reaches the terminator. So, during initial TIR lowering, we convert the | ||
/// implicit MIR terminator to an explicit TIR terminator returning variable index zero initially. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can remove the last "initially".
None => (), // It wasn't a Phi. Do nothing. | ||
if let Some(ref mut a) = st.phi_arg_mut(j) { | ||
// We only get here if `st` was a Phi. | ||
let i = self.stack[**a as usize].last().cloned().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let i =
is longer than **a =
. Can we use the latter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK.
One commit to address your comment, another to give the instruction that does the SSA definitions a better name. If you are happy, we can squash this ready for merge, but we cannot merge just yet. There's a yk change that I'm about to raise a PR for and that depends on the new tarball PR with cargo. Sadly my stage 2 failed because I forgot to specify the config file on the command line and now I have to rebuild... |
Please squash. |
We perform renaming in-place using a slightly modified version of the algorithm from the book: Modern Compiler Implementation in Java, 2nd edition, Andrew Appel. We experimented with the algorithm from the SSA book by Jeremy Singer but couldn't get it to work. I suspect there is an error in the published algorithm, but I may be wrong. I've email Jeremy. For now this only renames variables for implemented TIR bytecodes, and not yet in terminators (so return terminators currently always return variable 0, which will need to be changed). It may be possible to have the variables numbers start from zero in the resulting bytecode, but I'm keen to get this in first.
squashed. Now just waiting on the ykrustc archives. Once that's done, I'll flip the ykpack dependency over to yk. |
This is now irrelevant. |
vxWorks: set DEFAULT_MIN_STACK_SIZE to 256K and use min_stack to pass initial stack size to rtpSpawn
This is a combination of 18 commits. Commit softdevteam#2: Additional examples and some small improvements. Commit softdevteam#3: fixed mir-opt non-mir extensions and spanview title elements Corrected a fairly recent assumption in runtest.rs that all MIR dump files end in .mir. (It was appending .mir to the graphviz .dot and spanview .html file names when generating blessed output files. That also left outdated files in the baseline alongside the files with the incorrect names, which I've now removed.) Updated spanview HTML title elements to match their content, replacing a hardcoded and incorrect name that was left in accidentally when originally submitted. Commit softdevteam#4: added more test examples also improved Makefiles with support for non-zero exit status and to force validation of tests unless a specific test overrides it with a specific comment. Commit softdevteam#5: Fixed rare issues after testing on real-world crate Commit softdevteam#6: Addressed PR feedback, and removed temporary -Zexperimental-coverage -Zinstrument-coverage once again supports the latest capabilities of LLVM instrprof coverage instrumentation. Also fixed a bug in spanview. Commit softdevteam#7: Fix closure handling, add tests for closures and inner items And cleaned up other tests for consistency, and to make it more clear where spans start/end by breaking up lines. Commit softdevteam#8: renamed "typical" test results "expected" Now that the `llvm-cov show` tests are improved to normally expect matching actuals, and to allow individual tests to override that expectation. Commit softdevteam#9: test coverage of inline generic struct function Commit softdevteam#10: Addressed review feedback * Removed unnecessary Unreachable filter. * Replaced a match wildcard with remining variants. * Added more comments to help clarify the role of successors() in the CFG traversal Commit softdevteam#11: refactoring based on feedback * refactored `fn coverage_spans()`. * changed the way I expand an empty coverage span to improve performance * fixed a typo that I had accidently left in, in visit.rs Commit softdevteam#12: Optimized use of SourceMap and SourceFile Commit softdevteam#13: Fixed a regression, and synched with upstream Some generated test file names changed due to some new change upstream. Commit softdevteam#14: Stripping out crate disambiguators from demangled names These can vary depending on the test platform. Commit softdevteam#15: Ignore llvm-cov show diff on test with generics, expand IO error message Tests with generics produce llvm-cov show results with demangled names that can include an unstable "crate disambiguator" (hex value). The value changes when run in the Rust CI Windows environment. I added a sed filter to strip them out (in a prior commit), but sed also appears to fail in the same environment. Until I can figure out a workaround, I'm just going to ignore this specific test result. I added a FIXME to follow up later, but it's not that critical. I also saw an error with Windows GNU, but the IO error did not specify a path for the directory or file that triggered the error. I updated the error messages to provide more info for next, time but also noticed some other tests with similar steps did not fail. Looks spurious. Commit softdevteam#16: Modify rust-demangler to strip disambiguators by default Commit softdevteam#17: Remove std::process::exit from coverage tests Due to Issue #77553, programs that call std::process::exit() do not generate coverage results on Windows MSVC. Commit softdevteam#18: fix: test file paths exceeding Windows max path len
Please check carefully. I've been working on this code for so long I feel I can barely see the wood for the trees!
There's a ykpack change associated with this PR. Coming soon.
We perform renaming in-place using a slightly modified version of the
algorithm from the book: Modern Compiler Implementation in Java, 2nd
edition, Andrew Appel.
We experimented with the algorithm from the SSA book by Jeremy Singer
but couldn't get it to work. I suspect there is an error in the
published algorithm, but I may be wrong.
For now this only renames variables for implemented TIR bytecodes, and
not yet in terminators (so return terminators currently always return
variable 0, which will need to be changed).
It may be possible to have the variables numbers start from zero in the
resulting bytecode, but I'm keen to get this in first.
Here are some example CFGs: