Skip to content
This repository was archived by the owner on Jan 7, 2022. It is now read-only.

Conversation

vext01
Copy link
Member

@vext01 vext01 commented Apr 23, 2019

Please check carefully. I've been working on this code for so long I feel I can barely see the wood for the trees!

There's a ykpack change associated with this PR. Coming soon.

We perform renaming in-place using a slightly modified version of the
algorithm from the book: Modern Compiler Implementation in Java, 2nd
edition, Andrew Appel.

We experimented with the algorithm from the SSA book by Jeremy Singer
but couldn't get it to work. I suspect there is an error in the
published algorithm, but I may be wrong.

For now this only renames variables for implemented TIR bytecodes, and
not yet in terminators (so return terminators currently always return
variable 0, which will need to be changed).

It may be possible to have the variables numbers start from zero in the
resulting bytecode, but I'm keen to get this in first.

Here are some example CFGs:

image


image

@@ -12,7 +12,8 @@ test = false
[dependencies]
rustc = { path = "../librustc" }
rustc_yk_link = { path = "../librustc_yk_link" }
ykpack = { git = "https://github.com/softdevteam/ykpack" }
#ykpack = { git = "https://github.com/softdevteam/ykpack" }
ykpack = { path = "../../../ykpack" }
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to be revised once the ykpack change is in.

@vext01
Copy link
Member Author

vext01 commented Apr 23, 2019

I just realised that ykpack is now in the mono repo. We could do with that transition being complete before I raise the companion PR to this PR.

@ltratt
Copy link
Member

ltratt commented Apr 23, 2019

I just realised that ykpack is now in the mono repo. We could do with that transition being complete before I raise the companion PR to this PR.

The transition is complete isn't it?

@@ -118,6 +118,15 @@ impl<Node: Idx> Dominators<Node> {
self.immediate_dominators[node].unwrap()
}

/// Find the children of a node in the dominator tree.
pub fn immediately_dominates(&self, node: Node) -> Vec<Node> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this will be possible (it probably depends on the borrow checker), but it would almost certainly be faster to make this an iterator rather than a producer of a Vec. You could probably get away with -> impl Iterator<Node> or similar.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a reason why I didn't do this, but I don't recall what it was. Let me experiment and come back to you.

@vext01
Copy link
Member Author

vext01 commented Apr 23, 2019

The transition is complete isn't it?

I don't think so. We have an outstanding PR here:
ykjit/yk#1

use rustc_data_structures::fx::FxHashSet;

const SECTION_NAME: &'static str = ".yk_tir";
const TMP_EXT: &'static str = ".yk_tir.tmp";

/// The pre-SSA return value variable. In the pre-SSA TIR, we need the return value to reside at an
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first half of this comment makes sense, but the second half is confusing. Possibly it's just because some sentences are in the wrong order?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on that -- it makes sense to me, but then I'm not a good critic for my own work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you just need to try rephrasing it. I can't make sense of the comment as a whole at the moment.

@@ -57,25 +72,52 @@ struct ConvCx<'a, 'tcx, 'gcx> {
/// Maps each block to the variable it defines. This is what Appel calls `A_{orig}`.
block_defines: RefCell<IndexVec<BasicBlock, FxHashSet<TirLocal>>>,
/// Monotonically increasing number used to give TIR variables a unique ID.
/// Note that a 0 is reserved for `PRE_SSA_RET_VAR`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"a 0" -> "0"

/// From the Appel book:
/// "We consider the start node to contain an implicit definition of every variable,
/// either because the variable may be a formal parameter or to represent the notion of
/// a ← uninitialized without special cases.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't parse the part of the comment on this line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something has gone wrong somewhere. That's supposed to be \bottom. Will fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I was wrong. This is verbatim from the book.

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can delimit the abstract algebra in some quotes for readability. I think the problem is notion of a <- uninitialized is hard to parse.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, maybe this just needs a bit of rephrasing / reformatting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the quotes in the last commit. Does that work for you?

let n_usize = n as usize;
// We have to remember the variables whose stacks we must pop from when we come back from
// recursion. These must be the variables *before* they were renamed.
let mut pop_later = Vec::new();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can a variable appear more than once in here? Does it happen often that variables end up in here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every variable assignment in the block we are processing will push to this vector. The same variable may appear more than once if a block assigns to the same variable many times.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this should probably be a set to avoid (I assume pointlessly) poping the same thing over and over again?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a variable is assigned to X times, then we need to pop X times.

The the algorithm recursively walks the dominator tree depth-first. When it sees an assignment to a variable, it makes a new SSA variable, rewrites the old variable. and pushes the new variable onto the stack of the corresponding old variable. When we are bubbling up from recursion, we have to revert to the previous SSA versions of the variables. So if a block assigned to a variable z twice, we'd need to revert to z as of two assignments ago, thus popping twice.

Hard to explain...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.

// "Suppose n is the jth predecessor of y".
let j = mir.predecessors_for(y).iter().position(|b| b == &n_idx).unwrap();
for st in &mut blks[y.as_usize()].stmts {
match st.phi_arg_mut(j) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if let Some(ref mut a) = st.phi_arg_mut(j) { ... }

@vext01
Copy link
Member Author

vext01 commented Apr 24, 2019

This should address the comments.

Untested. Building now...

@vext01
Copy link
Member Author

vext01 commented Apr 24, 2019

Untested. Building now...

Our yk_tir test passes and manual inspection of the graphs show up nothing dodgy.

/// The pre-SSA return value variable. In MIR, a return terminator implicitly returns variable
/// zero. We can't do this in TIR because TIR is in SSA form and the variable we return depends
/// upon which SSA variable reaches the terminator. So, during initial TIR lowering, we convert the
/// implicit MIR terminator to an explicit TIR terminator returning variable index zero initially.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can remove the last "initially".

None => (), // It wasn't a Phi. Do nothing.
if let Some(ref mut a) = st.phi_arg_mut(j) {
// We only get here if `st` was a Phi.
let i = self.stack[**a as usize].last().cloned().unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let i = is longer than **a =. Can we use the latter?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can, but I'd rather keep it as-is for parity with the algorithm in the book:

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

@vext01
Copy link
Member Author

vext01 commented Apr 25, 2019

One commit to address your comment, another to give the instruction that does the SSA definitions a better name.

If you are happy, we can squash this ready for merge, but we cannot merge just yet. There's a yk change that I'm about to raise a PR for and that depends on the new tarball PR with cargo.

Sadly my stage 2 failed because I forgot to specify the config file on the command line and now I have to rebuild...

@ltratt
Copy link
Member

ltratt commented Apr 25, 2019

Please squash.

We perform renaming in-place using a slightly modified version of the
algorithm from the book: Modern Compiler Implementation in Java, 2nd
edition, Andrew Appel.

We experimented with the algorithm from the SSA book by Jeremy Singer
but couldn't get it to work. I suspect there is an error in the
published algorithm, but I may be wrong. I've email Jeremy.

For now this only renames variables for implemented TIR bytecodes, and
not yet in terminators (so return terminators currently always return
variable 0, which will need to be changed).

It may be possible to have the variables numbers start from zero in the
resulting bytecode, but I'm keen to get this in first.
@vext01
Copy link
Member Author

vext01 commented Apr 25, 2019

squashed. Now just waiting on the ykrustc archives. Once that's done, I'll flip the ykpack dependency over to yk.

@vext01
Copy link
Member Author

vext01 commented May 1, 2019

This is now irrelevant.

@vext01 vext01 closed this May 1, 2019
@vext01 vext01 deleted the yk-ssa-rename branch May 1, 2019 09:12
bors bot pushed a commit that referenced this pull request Sep 22, 2019
vxWorks: set DEFAULT_MIN_STACK_SIZE to 256K and use min_stack to pass initial stack size to rtpSpawn
vext01 pushed a commit to vext01/ykrustc that referenced this pull request Oct 12, 2020
This is a combination of 18 commits.

Commit softdevteam#2:

Additional examples and some small improvements.

Commit softdevteam#3:

fixed mir-opt non-mir extensions and spanview title elements

Corrected a fairly recent assumption in runtest.rs that all MIR dump
files end in .mir. (It was appending .mir to the graphviz .dot and
spanview .html file names when generating blessed output files. That
also left outdated files in the baseline alongside the files with the
incorrect names, which I've now removed.)

Updated spanview HTML title elements to match their content, replacing a
hardcoded and incorrect name that was left in accidentally when
originally submitted.

Commit softdevteam#4:

added more test examples

also improved Makefiles with support for non-zero exit status and to
force validation of tests unless a specific test overrides it with a
specific comment.

Commit softdevteam#5:

Fixed rare issues after testing on real-world crate

Commit softdevteam#6:

Addressed PR feedback, and removed temporary -Zexperimental-coverage

-Zinstrument-coverage once again supports the latest capabilities of
LLVM instrprof coverage instrumentation.

Also fixed a bug in spanview.

Commit softdevteam#7:

Fix closure handling, add tests for closures and inner items

And cleaned up other tests for consistency, and to make it more clear
where spans start/end by breaking up lines.

Commit softdevteam#8:

renamed "typical" test results "expected"

Now that the `llvm-cov show` tests are improved to normally expect
matching actuals, and to allow individual tests to override that
expectation.

Commit softdevteam#9:

test coverage of inline generic struct function

Commit softdevteam#10:

Addressed review feedback

* Removed unnecessary Unreachable filter.
* Replaced a match wildcard with remining variants.
* Added more comments to help clarify the role of successors() in the
CFG traversal

Commit softdevteam#11:

refactoring based on feedback

* refactored `fn coverage_spans()`.
* changed the way I expand an empty coverage span to improve performance
* fixed a typo that I had accidently left in, in visit.rs

Commit softdevteam#12:

Optimized use of SourceMap and SourceFile

Commit softdevteam#13:

Fixed a regression, and synched with upstream

Some generated test file names changed due to some new change upstream.

Commit softdevteam#14:

Stripping out crate disambiguators from demangled names

These can vary depending on the test platform.

Commit softdevteam#15:

Ignore llvm-cov show diff on test with generics, expand IO error message

Tests with generics produce llvm-cov show results with demangled names
that can include an unstable "crate disambiguator" (hex value). The
value changes when run in the Rust CI Windows environment. I added a sed
filter to strip them out (in a prior commit), but sed also appears to
fail in the same environment. Until I can figure out a workaround, I'm
just going to ignore this specific test result. I added a FIXME to
follow up later, but it's not that critical.

I also saw an error with Windows GNU, but the IO error did not
specify a path for the directory or file that triggered the error. I
updated the error messages to provide more info for next, time but also
noticed some other tests with similar steps did not fail. Looks
spurious.

Commit softdevteam#16:

Modify rust-demangler to strip disambiguators by default

Commit softdevteam#17:

Remove std::process::exit from coverage tests

Due to Issue #77553, programs that call std::process::exit() do not
generate coverage results on Windows MSVC.

Commit softdevteam#18:

fix: test file paths exceeding Windows max path len
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants