`transpile`: Support unaligned reads and writes #1257

folkertdev · 2025-06-30T15:42:41Z

This is very work-in-progress, but I think I need some pointers here. I'll add some inline comments.

folkertdev · 2025-06-30T15:43:06Z

c2rust-ast-exporter/src/AstExporter.cpp

+        encode_entry(UO, TagUnaryOperator, childIds, [this, UO](CborEncoder *array) {
            cbor_encode_string(array, UO->getOpcodeStr(UO->getOpcode()).str());
            cbor_encode_boolean(array, UO->isPrefix());
+
+            if (UO->getOpcode() == UO_Deref) {
+                QualType eltTy = UO->getSubExpr()->getType()->getPointeeType();
+                CharUnits align = Context->getTypeAlignInChars(eltTy);
+
+                if (auto *TD = eltTy->getAs<TypedefType>()) {
+                    QualType naturalTy = TD->getDecl()->getUnderlyingType();
+                    CharUnits naturalAlign = Context->getPreferredTypeAlignInChars(naturalTy);
+
+                    if (align < naturalAlign) {
+                        cbor_encode_int(array, align.getQuantity());
+                    } else {
+                        cbor_encode_int(array, -1);
+                    }
+
+                }
+            }
        });
+


I have LLM'd my way to a solution here; this is probably terrible C++ code.

I'm not super familiar with libclang either. Looks like it might work? I can look into it more later.

folkertdev · 2025-06-30T15:46:19Z

c2rust-transpile/src/translator/operators.rs

+                                } else if unaligned {
+                                    // We should use read_unaligned here:
+                                    // mk().method_call_expr(val, "read_unaligned", vec![]);
+                                    // but that interferes with `write_unaligned`
+
+                                    let mut val = 
+                                        mk().unary_expr(UnOp::Deref(Default::default()), val);
+
+                                    // If the type on the other side of the pointer we are dereferencing is volatile and
+                                    // this whole expression is not an LValue, we should make this a volatile read
+                                    if lrvalue.is_rvalue() && cqual_type.qualifiers.is_volatile
+                                    {
+                                        val = self.volatile_read(val, cqual_type)?
+                                    }
+                                    Ok(val)


I'm confused by the value of lrvalue. I'd expect ptr to be an lvalue in, e.g., *ptr = value, and an rvalue in return *ptr, but both are apparently lvalues. Hence, I don't see how to differentiate between reads and writes here.

Adding to my confusion, this will not actually emit a read_volatile. Maybe I misunderstand what the attribute does here, but I think that's a bug?

uint32_t volatile_read(volatile void* ptr) { return *((volatile unaligned_uint32*)ptr); }

Hmm, yeah, this just seems wrong: https://c.godbolt.org/z/sPcMM4b1T. The lrvalue is always LValue (I think a pointer dereference is just always an lvalue currently?), so that volatile read is never emitted.

Hmm, that's a bug, right? I can open an issue for it.

Is the volatile code here just for trying to figure out how to emit read_unaligned correctly?

Oh, never mind, it's already existing code.

I think in return *ptr, both ptr and *ptr are lvalues in C. See https://godbolt.org/z/bxrs7dczs. return *ptr is the rvalue. So the rvalue part of it comes not from the dereference, but from what's done with the dereference, if that makes sense. Like in *p + 1, the whole *p + 1 expression is an rvalue, but the *p itself is still an lvalue because *p is writable as well, i.e., *p = 2 is well-formed while *p + 1 = 2 is not.

Right, that is what makes this code confusing: I don't think it can ever be an rvalue in a way that is relevant, so that branch is unreachable.

Also, in general that means we don't have a great way of distinguishing whether a * means a read, a write, or both. That is fine when a C * is translated to a Rust *, but when you want to use the methods, it's a problem.

Also, there are already a bunch of open issues wrt volatile reads and writes. They just don't actually work currently, as far as I can tell.
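To make the read/write split concrete, here is a minimal sketch in plain Rust, independent of the transpiler. In C, the same `*` serves for both loads and stores; with the standard library's `ptr::read_unaligned` and `ptr::write_unaligned`, the caller has to know up front which of the two it is emitting. The helper names `load_u32_at` and `store_u32_at` are made up for illustration and are not part of c2rust.

```rust
use std::ptr;

/// Hypothetical helper: loads a `u32` starting at `offset` in `buf`,
/// whatever the alignment of that address happens to be.
fn load_u32_at(buf: &[u8], offset: usize) -> u32 {
    assert!(offset + 4 <= buf.len(), "read out of bounds");
    // SAFETY: the four bytes are in bounds, and `read_unaligned`
    // places no alignment requirement on the pointer.
    unsafe { ptr::read_unaligned(buf.as_ptr().add(offset) as *const u32) }
}

/// Hypothetical helper: stores a `u32` starting at `offset` in `buf`.
fn store_u32_at(buf: &mut [u8], offset: usize, value: u32) {
    assert!(offset + 4 <= buf.len(), "write out of bounds");
    // SAFETY: same reasoning as above, for `write_unaligned`.
    unsafe { ptr::write_unaligned(buf.as_mut_ptr().add(offset) as *mut u32, value) }
}

fn main() {
    let mut buf = [0u8; 8];
    // Offset 1 in a byte buffer cannot be assumed to be 4-byte aligned.
    store_u32_at(&mut buf, 1, 0xDEAD_BEEF);
    assert_eq!(load_u32_at(&buf, 1), 0xDEAD_BEEF);
    // Bytes outside the written range are untouched.
    assert_eq!(buf[0], 0);
    assert_eq!(buf[5], 0);
}
```

A plain `*(p as *const u32)` on such a pointer would be undefined behavior in Rust, which is why the dereference cannot simply be kept as `*` for under-aligned types.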

folkertdev · 2025-06-30T15:46:54Z

c2rust-transpile/tests/snapshots/snapshots__transpile@aligned_read_write.c.snap

+pub unsafe extern "C" fn unaligned_read(mut ptr: *const std::ffi::c_void) -> uint32_t {
+    return *(ptr as *const unaligned_uint32);
+}


So this is still wrong. I can make this one work, but then the unaligned write below is no longer correct.
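For reference, a sketch of what the output could look like once reads and writes can be told apart. This is not actual transpiler output. It assumes `unaligned_uint32` comes from a C typedef with reduced alignment (the C source is not shown in this thread), and the `unaligned_write` function is a guess at the shape of the write case mentioned above.

```rust
use std::ffi::c_void;

#[allow(non_camel_case_types)]
type uint32_t = u32;

// Assumption: stands in for a C typedef with alignment 1. A Rust type alias
// carries no alignment information, so the accessors must handle it.
#[allow(non_camel_case_types)]
type unaligned_uint32 = uint32_t;

pub unsafe extern "C" fn unaligned_read(ptr: *const c_void) -> uint32_t {
    // Read form: pointer method instead of a plain `*` dereference.
    unsafe { (ptr as *const unaligned_uint32).read_unaligned() }
}

// Hypothetical counterpart for the write case.
pub unsafe extern "C" fn unaligned_write(ptr: *mut c_void, value: uint32_t) {
    // Write form: needs the assigned value, so it cannot be built from
    // the dereference expression alone.
    unsafe { (ptr as *mut unaligned_uint32).write_unaligned(value) }
}

fn main() {
    let mut buf = [0u8; 8];
    // Point one byte into the buffer so 4-byte alignment is not guaranteed.
    let p = buf.as_mut_ptr().wrapping_add(1) as *mut c_void;
    unsafe {
        unaligned_write(p, 7);
        assert_eq!(unaligned_read(p as *const c_void), 7);
    }
}
```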

kkysen

Sorry I took a very long time to review this.

kkysen · 2025-07-28T09:05:38Z

c2rust-ast-exporter/src/AstExporter.cpp

+                    if (align < naturalAlign) {
+                        cbor_encode_int(array, align.getQuantity());
+                    } else {
+                        cbor_encode_int(array, -1);
+                    }


Overalignment can be useful to handle as well, so maybe just cbor_encode_uint both the actual and natural alignment?
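If both values were exported, the transpiler side could classify them roughly as follows. This is only a sketch: `AlignKind` and `classify` are hypothetical names, and the two arguments are assumed to be the decoded actual and natural alignments in bytes.

```rust
use std::cmp::Ordering;

/// Hypothetical classification of a pointee type's alignment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AlignKind {
    /// Actual alignment is below the natural one, e.g. an `aligned(1)` typedef.
    Under,
    /// Actual and natural alignment agree.
    Natural,
    /// Actual alignment is above the natural one.
    Over,
}

/// Compares the alignment the type actually has with its natural alignment.
fn classify(actual: u64, natural: u64) -> AlignKind {
    match actual.cmp(&natural) {
        Ordering::Less => AlignKind::Under,
        Ordering::Equal => AlignKind::Natural,
        Ordering::Greater => AlignKind::Over,
    }
}

fn main() {
    assert_eq!(classify(1, 4), AlignKind::Under);
    assert_eq!(classify(4, 4), AlignKind::Natural);
    assert_eq!(classify(16, 4), AlignKind::Over);
}
```

Compared with the -1 sentinel, two unsigned values keep the overaligned case representable without a second wire format change.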

kkysen · 2025-07-28T09:06:27Z

c2rust-ast-exporter/src/AstExporter.cpp

I'm not super familiar with libclang either. Looks like it might work? I can look into it more later.

kkysen · 2025-07-28T09:08:17Z

c2rust-transpile/src/c_ast/mod.rs

+    AddressOf,                 // &x
+    Deref { unaligned: bool }, // *x
+    Plus,                      // +x
+    PostIncrement,             // x++
+    PreIncrement,              // ++x
+    Negate,                    // -x
+    PostDecrement,             // x--
+    PreDecrement,              // --x
+    Complement,                // ~x
+    Not,                       // !x
+    Real,                      // [GNU C] __real x
+    Imag,                      // [GNU C] __imag x
+    Extension,                 // [GNU C] __extension__ x
+    Coawait,                   // [C++ Coroutines] co_await x


The formatting here is annoying 😩. Could you add another commit that just makes all of the same-line // comments into proper doc comments above the variant with a newline after each one? Or I could if you prefer.
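For illustration, the requested layout might look like this for the first few variants. This is a trimmed, standalone sketch, not the actual contents of c_ast/mod.rs, and the derives are an assumption.

```rust
/// Trimmed, hypothetical excerpt of the unary operator enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnOp {
    /// `&x`
    AddressOf,

    /// `*x`
    Deref { unaligned: bool },

    /// `+x`
    Plus,

    /// `x++`
    PostIncrement,
}

fn main() {
    // Adding a field to `Deref` no longer forces every comment to be realigned.
    let op = UnOp::Deref { unaligned: true };
    assert_eq!(op, UnOp::Deref { unaligned: true });
    assert_ne!(op, UnOp::AddressOf);
}
```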

kkysen · 2025-07-28T09:12:45Z

c2rust-transpile/src/translator/operators.rs

@@ -822,7 +837,7 @@ impl<'c> Translation<'c> {

                match arg_kind {
                    // C99 6.5.3.2 para 4
-                    CExprKind::Unary(_, c_ast::UnOp::Deref, target, _) => {
+                    CExprKind::Unary(_, c_ast::UnOp::Deref { unaligned: _ }, target, _) => {


Suggested change

-                    CExprKind::Unary(_, c_ast::UnOp::Deref { unaligned: _ }, target, _) => {
+                    CExprKind::Unary(_, c_ast::UnOp::Deref { .. }, target, _) => {

If we don't care about the alignment, could it just be this? Like you did for the other match.
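A small standalone sketch of why the two patterns are interchangeable here: both match any `Deref` and ignore the field. The enum below is a cut-down stand-in, not the real `c_ast::UnOp`.

```rust
/// Cut-down, hypothetical stand-in for the unary operator enum.
#[derive(Debug, Clone, Copy)]
enum UnOp {
    Deref { unaligned: bool },
    Negate,
}

/// Names the field and discards it explicitly.
fn is_deref_named(op: UnOp) -> bool {
    matches!(op, UnOp::Deref { unaligned: _ })
}

/// Ignores all fields with a rest pattern; stays valid if fields are added.
fn is_deref_rest(op: UnOp) -> bool {
    matches!(op, UnOp::Deref { .. })
}

/// Binds the field only where the alignment actually matters.
fn is_unaligned_deref(op: UnOp) -> bool {
    matches!(op, UnOp::Deref { unaligned: true })
}

fn main() {
    for op in [
        UnOp::Deref { unaligned: true },
        UnOp::Deref { unaligned: false },
        UnOp::Negate,
    ] {
        // Both spellings agree on every input.
        assert_eq!(is_deref_named(op), is_deref_rest(op));
    }
    assert!(is_unaligned_deref(UnOp::Deref { unaligned: true }));
    assert!(!is_unaligned_deref(UnOp::Negate));
}
```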

kkysen · 2025-07-28T09:14:26Z

c2rust-transpile/src/translator/operators.rs

Hmm, that's a bug, right? I can open an issue for it.

kkysen · 2025-07-28T09:15:01Z

c2rust-transpile/src/translator/operators.rs

Is the volatile code here just for trying to figure out how to emit read_unaligned correctly?

kkysen · 2025-07-28T09:15:47Z

c2rust-transpile/src/translator/operators.rs

Oh, never mind, it's already existing code.

kkysen · 2025-07-28T09:39:16Z

c2rust-transpile/src/translator/operators.rs

I think in return *ptr, both ptr and *ptr are lvalues in C. See https://godbolt.org/z/bxrs7dczs. return *ptr is the rvalue. So the rvalue part of it comes not from the dereference, but from what's done with the dereference, if that makes sense. Like in *p + 1, the whole *p + 1 expression is an rvalue, but the *p itself is still an lvalue because *p is writable as well, i.e., *p = 2 is well-formed while *p + 1 = 2 is not.

Support unaligned reads and writes

6f0e229

folkertdev commented Jun 30, 2025

kkysen self-requested a review June 30, 2025 16:59

kkysen reviewed Jul 28, 2025
