-
Notifications
You must be signed in to change notification settings - Fork 270
transpile
: Support unaligned reads and writes
#1257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
encode_entry(UO, TagUnaryOperator, childIds, [this, UO](CborEncoder *array) { | ||
cbor_encode_string(array, UO->getOpcodeStr(UO->getOpcode()).str()); | ||
cbor_encode_boolean(array, UO->isPrefix()); | ||
|
||
if (UO->getOpcode() == UO_Deref) { | ||
QualType eltTy = UO->getSubExpr()->getType()->getPointeeType(); | ||
CharUnits align = Context->getTypeAlignInChars(eltTy); | ||
|
||
if (auto *TD = eltTy->getAs<TypedefType>()) { | ||
QualType naturalTy = TD->getDecl()->getUnderlyingType(); | ||
CharUnits naturalAlign = Context->getPreferredTypeAlignInChars(naturalTy); | ||
|
||
if (align < naturalAlign) { | ||
cbor_encode_int(array, align.getQuantity()); | ||
} else { | ||
cbor_encode_int(array, -1); | ||
} | ||
|
||
} | ||
} | ||
}); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have LLM'd my way to a solution here, this is probably terrible c++ code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super familiar with libclang either. Looks like it might work? I can look into it more later.
} else if unaligned { | ||
// We should use read_unaligned here: | ||
// mk().method_call_expr(val, "read_unaligned", vec![]); | ||
// but that interferes with `write_unaligned` | ||
|
||
let mut val = | ||
mk().unary_expr(UnOp::Deref(Default::default()), val); | ||
|
||
// If the type on the other side of the pointer we are dereferencing is volatile and | ||
// this whole expression is not an LValue, we should make this a volatile read | ||
if lrvalue.is_rvalue() && cqual_type.qualifiers.is_volatile | ||
{ | ||
val = self.volatile_read(val, cqual_type)? | ||
} | ||
Ok(val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by the value of lrvalue
. I'd expect ptr
to be an lvalue
in e.g. *ptr = value
, and an rvalue
in the case return *ptr
, but both are apparently lvalue
s. Hence, I don't see how to differentiate between reads and writes here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding to my confusion, this will not actually emit a read_volatile
. Maybe I misunderstand what the attribute does here but I think that's a bug?
uint32_t volatile_read(volatile void* ptr) {
return *((volatile unaligned_uint32*)ptr);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm yeah this just seems wrong https://c.godbolt.org/z/sPcMM4b1T. The lrvalue
is always LValue
(I think a pointer dereference is just always an lvalue currently?), so that volatile read is never emitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's a bug, right? I can open an issue for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the volatile code here just for trying to figure out how to emit read_aligned
correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, nevermind, it's already existing code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in return *ptr
, both ptr
and *ptr
are lvalues in C. See https://godbolt.org/z/bxrs7dczs. return *ptr
is the rvalue. So the rvalue part of it comes not from the dereference, but from what's done with the dereference, if that makes sense. Like in *p + 1
, the whole *p + 1
expression is an rvalue, but the *p
itself is still an lvalue because *p
is writable as well, i.e., *p = 2
is well-formed while *p + 1 = 2
is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that is what makes this code confusing: I don't think it can ever be an rvalue
in a way that is relevant, so that branch is unreachable.
Also in general that means we don't have great way of distinguishing whether a *
means a read, write, or both. That is fine when a C *
is translated with a rust *
, but when you want to use the methods, it's a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also there are already a bunch of open issues wrt volatile reads and writes. They just don't actually currently work as far as I can tell.
pub unsafe extern "C" fn unaligned_read(mut ptr: *const std::ffi::c_void) -> uint32_t { | ||
return *(ptr as *const unaligned_uint32); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this is still wrong. I can make this one work, but then the unaligned write below is no longer correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I took a very long time to review this.
if (align < naturalAlign) { | ||
cbor_encode_int(array, align.getQuantity()); | ||
} else { | ||
cbor_encode_int(array, -1); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overalignment can be useful to handle as well, so maybe just cbor_encode_uint
both the actual and natural alignment?
encode_entry(UO, TagUnaryOperator, childIds, [this, UO](CborEncoder *array) { | ||
cbor_encode_string(array, UO->getOpcodeStr(UO->getOpcode()).str()); | ||
cbor_encode_boolean(array, UO->isPrefix()); | ||
|
||
if (UO->getOpcode() == UO_Deref) { | ||
QualType eltTy = UO->getSubExpr()->getType()->getPointeeType(); | ||
CharUnits align = Context->getTypeAlignInChars(eltTy); | ||
|
||
if (auto *TD = eltTy->getAs<TypedefType>()) { | ||
QualType naturalTy = TD->getDecl()->getUnderlyingType(); | ||
CharUnits naturalAlign = Context->getPreferredTypeAlignInChars(naturalTy); | ||
|
||
if (align < naturalAlign) { | ||
cbor_encode_int(array, align.getQuantity()); | ||
} else { | ||
cbor_encode_int(array, -1); | ||
} | ||
|
||
} | ||
} | ||
}); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super familiar with libclang either. Looks like it might work? I can look into it more later.
AddressOf, // &x | ||
Deref { unaligned: bool }, // *x | ||
Plus, // +x | ||
PostIncrement, // x++ | ||
PreIncrement, // ++x | ||
Negate, // -x | ||
PostDecrement, // x-- | ||
PreDecrement, // --x | ||
Complement, // ~x | ||
Not, // !x | ||
Real, // [GNU C] __real x | ||
Imag, // [GNU C] __imag x | ||
Extension, // [GNU C] __extension__ x | ||
Coawait, // [C++ Coroutines] co_await x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The formatting here is annoying 😩. Could you add another commit that just makes all of the same-line //
comments into proper doc comments above the variant with a newline after each one? Or I could if you prefer.
@@ -822,7 +837,7 @@ impl<'c> Translation<'c> { | |||
|
|||
match arg_kind { | |||
// C99 6.5.3.2 para 4 | |||
CExprKind::Unary(_, c_ast::UnOp::Deref, target, _) => { | |||
CExprKind::Unary(_, c_ast::UnOp::Deref { unaligned: _ }, target, _) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CExprKind::Unary(_, c_ast::UnOp::Deref { unaligned: _ }, target, _) => { | |
CExprKind::Unary(_, c_ast::UnOp::Deref { .. }, target, _) => { |
If we don't care about the alignment, could it just be this? Like you did for the other match
.
} else if unaligned { | ||
// We should use read_unaligned here: | ||
// mk().method_call_expr(val, "read_unaligned", vec![]); | ||
// but that interferes with `write_unaligned` | ||
|
||
let mut val = | ||
mk().unary_expr(UnOp::Deref(Default::default()), val); | ||
|
||
// If the type on the other side of the pointer we are dereferencing is volatile and | ||
// this whole expression is not an LValue, we should make this a volatile read | ||
if lrvalue.is_rvalue() && cqual_type.qualifiers.is_volatile | ||
{ | ||
val = self.volatile_read(val, cqual_type)? | ||
} | ||
Ok(val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, that's a bug, right? I can open an issue for it.
} else if unaligned { | ||
// We should use read_unaligned here: | ||
// mk().method_call_expr(val, "read_unaligned", vec![]); | ||
// but that interferes with `write_unaligned` | ||
|
||
let mut val = | ||
mk().unary_expr(UnOp::Deref(Default::default()), val); | ||
|
||
// If the type on the other side of the pointer we are dereferencing is volatile and | ||
// this whole expression is not an LValue, we should make this a volatile read | ||
if lrvalue.is_rvalue() && cqual_type.qualifiers.is_volatile | ||
{ | ||
val = self.volatile_read(val, cqual_type)? | ||
} | ||
Ok(val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the volatile code here just for trying to figure out how to emit read_aligned
correctly?
} else if unaligned { | ||
// We should use read_unaligned here: | ||
// mk().method_call_expr(val, "read_unaligned", vec![]); | ||
// but that interferes with `write_unaligned` | ||
|
||
let mut val = | ||
mk().unary_expr(UnOp::Deref(Default::default()), val); | ||
|
||
// If the type on the other side of the pointer we are dereferencing is volatile and | ||
// this whole expression is not an LValue, we should make this a volatile read | ||
if lrvalue.is_rvalue() && cqual_type.qualifiers.is_volatile | ||
{ | ||
val = self.volatile_read(val, cqual_type)? | ||
} | ||
Ok(val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, nevermind, it's already existing code.
} else if unaligned { | ||
// We should use read_unaligned here: | ||
// mk().method_call_expr(val, "read_unaligned", vec![]); | ||
// but that interferes with `write_unaligned` | ||
|
||
let mut val = | ||
mk().unary_expr(UnOp::Deref(Default::default()), val); | ||
|
||
// If the type on the other side of the pointer we are dereferencing is volatile and | ||
// this whole expression is not an LValue, we should make this a volatile read | ||
if lrvalue.is_rvalue() && cqual_type.qualifiers.is_volatile | ||
{ | ||
val = self.volatile_read(val, cqual_type)? | ||
} | ||
Ok(val) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in return *ptr
, both ptr
and *ptr
are lvalues in C. See https://godbolt.org/z/bxrs7dczs. return *ptr
is the rvalue. So the rvalue part of it comes not from the dereference, but from what's done with the dereference, if that makes sense. Like in *p + 1
, the whole *p + 1
expression is an rvalue, but the *p
itself is still an lvalue because *p
is writable as well, i.e., *p = 2
is well-formed while *p + 1 = 2
is not.
This is very work-in-progress, but I think I need some pointers here. I'll add some inline comments.