Fix alignment for values larger than 32 bit on xtensa targets #195

haileys · 2023-09-22T10:34:18Z

32 bit is the maximum alignment required for any object on xtensa, indeed this is the maximum alignment that esp-idf's heap allocator guarantees.

MabezDev · 2023-09-22T11:05:24Z

Thanks for the PR!

I don't think this is correct, the Xtensa ISA (here, pg 687) dictates that 64bit ints and 64bit floats have 8-byte alignment. For u128, because this isn't a common type in C compilers we defaulted to clang and gcc's current behaviour.

The heap allocator guarantees 32bit word-aligned allocations mostly due to the fact that some address spaces do not offer byte-level access, and it's often required for certain hardware operations, like using the DMA engine.

Could you further explain your reasoning behind the PR?

workingjubilee · 2023-09-22T17:05:17Z

Er. Perhaps I am just being a simpleton here, but how can anyone allocate 64-bit objects if the heap allocator cannot guarantee 64-bit-aligned allocations? I realize Box<i64> is a rare sight, but it doesn't seem like it should be invalid.

MabezDev · 2023-09-22T19:03:26Z

Er. Perhaps I am just being a simpleton here, but how can anyone allocate 64-bit objects if the heap allocator cannot guarantee 64-bit-aligned allocations? I realize Box is a rare sight, but it doesn't seem like it should be invalid.

The heap allocator in question is the esp-idf one (I believe @haileys is using the std sport, please correct me if I'm wrong!), which definitely supports allocating larger than 32bit structures, but it defaults to an alignment of 32bit for various reasons, some I mentioned above.

I think there are two distinctly different topics here, although closely linked. This PR changes the calling convention for Xtensa by changing the required alignment of certain data types, which will in turn, change how these data types are stored in registers at compile time, and how they are stored on the stack. A by-product of this change is that these new primitive types will have different alignments, which will also affect how they should be allocated on the heap which I believe is what @haileys is trying to do? I'd like to know why they wish to change the alignment, and what their end goal is :)

haileys · 2023-09-22T22:39:03Z

Thanks for the fast reply @MabezDev!

I don't think this is correct, the Xtensa ISA (here, pg 687) dictates that 64bit ints and 64bit floats have 8-byte alignment. For u128, because this isn't a common type in C compilers we defaulted to clang and gcc's current behaviour.

I wasn't aware of this, thanks for pointing it out!

I think then there might be a mismatch between what the ISA specifies and what the ESP platform specifies. From my reading of the Espessif docs, it seems like (except for special cases where the hardware calls for much larger alignments), 32 bit is the maximum alignment ever mentioned.

According to page 26 ESP32 technical reference manual:

The CPU can access data bus addresses via aligned or non-aligned
byte, half-word and word read-and-write operations. The CPU can read and write data through the instruction
bus, but only in a word aligned manner; non-word-aligned access will cause a CPU exception.

However, if the ISA spec says one thing, I do agree that we can't just change things here and we need to look for a different solution.

Perhaps a little more about my use case would be helpful in determining what to do here:

I'm using the bytemuck crate to cast a 32-bit aligned byte buffer to a struct. The struct is repr(C) and contains u64 fields. Bytemuck panics when calling from_bytes unless the buffer is 64-bit aligned.

My understanding is that on an esp32, there is no safety issue or performance penalty associated with reading a 64 bit value from a 32 bit aligned address, no matter which type of RAM that data is located in. Additionally, I don't control the allocation of this buffer. The buffer my code receives is guaranteed to be 32 bit aligned, but ensuring 64 bit alignment would require me to reallocate + copy the buffer before reading, which is undesirable.

Bytemuck should panic if this data is not 32-bit aligned, since it can't know if the pointer it's casting is a pointer to IRAM which only allows 32 bit access, but panicking because stricter alignment requirements are not satisfied is not necessary I believe. The panicking code in bytemuck uses align_of, which is what led me here.

I see that LLVM allows both ABI alignment as well as a "preferred" alignment to be set. I'm not familiar with the semantics of that, but could it possibly be helpful in this situation?

If not then I'm left scratching my head, because as you point out the ABI/calling convention is quite clear about alignment requirements, but those requirements are stricter than the alignment requirements for data access. I'd love to find some kind of solution, but I'm also operating at the very edge of my knowledge here so any tips are appreciated also :)

MabezDev · 2023-10-11T12:32:33Z

My understanding is that on an esp32, there is no safety issue or performance penalty associated with reading a 64 bit value from a 32 bit aligned address, no matter which type of RAM that data is located in. Additionally, I don't control the allocation of this buffer. The buffer my code receives is guaranteed to be 32 bit aligned, but ensuring 64 bit alignment would require me to reallocate + copy the buffer before reading, which is undesirable.

On some architectures, this would be possible, but Xtensa (generally, there is an unaligned access extension but no esp32's have it) is a strict alignment architecture. The reason your assumption above wouldn't work would be cases where your struct could fit into the available registers when calling a function. In this case, the u64 would be in registers, not on the stack, and the Xtensa arch defines how u64s are to be stored in registers (it needs to start in an even-numbered register) which is different to just storing a u32< in a single register. This is why we can't change the ABI to solve this issue.

All of the above is a bit orthogonal to the actual issue though. Why can't you ask whatever is allocating to provide memory with a given alignment? Or if you really can't control it, ask for more than needed and align yourself, so that we prevent the re-alloc, at the cost of a bit more memory usage.

ivmarkov · 2024-01-21T11:50:34Z

@MabezDev @haileys As confirmed by @igrr and as evidenced by esp-rs/esp-idf-sys#278 , the Espressif xtensa GCC toolchain is operating with 4 byte alignment for both 64 bit types, as well as 128 bit types. For me this means:

The xtensa ISA variation used in the Espressif chips is probably deviating from the original xtensa ISA spec in that it "tolerates" these types being only 4 byte aligned
We should do the same, or else any interop with C code compiled with GCC is at risk

ivmarkov · 2024-01-21T11:53:35Z

The only difference between this and my other PR is that this one treats the f64 type to the 4 bytes' alignment (as opposed to its standard 8 bytes alignment) as well.

... which I think we should actually do as well. So I'll close my PR in favor of this one.

igrr · 2024-01-22T07:29:48Z

The xtensa ISA variation used in the Espressif chips is probably deviating from the original xtensa ISA spec in that it "tolerates" these types being only 4 byte aligned

I don't think this is the case. Since this is a 32-bit architecture, there are no CPU instructions which deal with larger-than-32-bit values. The CPU simply doesn't care about alignment of 64-bit ints or floats because it never has to deal with them directly — only with one of their 32-bit parts at a time.

I am sorry for setting you in the wrong direction with my comment yesterday @ivmarkov. I have looked at GCC config again and now I see that the 8-byte data types are indeed naturally aligned. This is confirmed by checking the alignment of that structure in GCC (compiler explorer).

So this bug is on esp32-camera side, and I suspect that the similar issue may occur many other places. I don't think we have any kind of a warning or a runtime check in place to check for mismatch between the required structure alignment and the actual pointer value. (Edit: except for UBSAN, which should catch this, but not many people enable it since it increases code size a lot.) As long as the pointer is 32-bit aligned, accessing a larger-than-4-bytes value by this pointer will work, since in practice every operations on such value will be represented by multiple 32-bit operations.

cc: esp-rs/esp-idf-sys#278 cc: esp-rs/rust#195

* Align the frame buffers to the structure alignment cc: esp-rs/esp-idf-sys#278 cc: esp-rs/rust#195 * Include stdalign.h

me-no-dev · 2024-01-22T09:37:16Z

@ivmarkov a fix has been added to esp32-camera here. Can you please confirm that it fixes the issue?

ivmarkov · 2024-01-22T13:32:15Z

The xtensa ISA variation used in the Espressif chips is probably deviating from the original xtensa ISA spec in that it "tolerates" these types being only 4 byte aligned

I don't think this is the case. Since this is a 32-bit architecture, there are no CPU instructions which deal with larger-than-32-bit values. The CPU simply doesn't care about alignment of 64-bit ints or floats because it never has to deal with them directly — only with one of their 32-bit parts at a time.

I am sorry for setting you in the wrong direction with my comment yesterday @ivmarkov. I have looked at GCC config again and now I see that the 8-byte data types are indeed naturally aligned. This is confirmed by checking the alignment of that structure in GCC (compiler explorer).

@igrr

Well, for STD and rustc specifically either way is fine, because it uses ESP IDF's malloc for stuff that needs <= 4 bytes alignment, or else it goes to ESP IDF's memalign. And this is all nicely hidden and abstracted and user should not think or care about that.

But I'm a bit worried that the ESP IDF code might be sprinkled with mallocs left and right and if some of the allocated structures happen to contain i64 or i128 that now means a problem with rustc on non-release mode as it asserts on incorrect alignment, at least since some time.

Isn't it therefore easier and just more correct to fix both the xtensa targets in rustc, as well as the GCC ones to NOT align on more than 4 bytes? After all - and as you say - there are NO i64 and i128 types from the point of view of the CPU anyway, so why - in the first place - we are aligning these to bigger alignments then?

So this bug is on esp32-camera side, and I suspect that the similar issue may occur many other places. I don't think we have any kind of a warning or a runtime check in place to check for mismatch between the required structure alignment and the actual pointer value. (Edit: except for UBSAN, which should catch this, but not many people enable it since it increases code size a lot.) As long as the pointer is 32-bit aligned, accessing a larger-than-4-bytes value by this pointer will work, since in practice every operations on such value will be represented by multiple 32-bit operations.

Ditto.

ivmarkov · 2024-01-22T13:41:16Z

@ivmarkov a fix has been added to esp32-camera here. Can you please confirm that it fixes the issue?

That's probably for @ViktorWb to confirm.

igrr · 2024-01-22T16:35:13Z

Isn't it therefore easier and just more correct to fix both the xtensa targets in rustc, as well as the GCC ones to NOT align on more than 4 bytes?

I can answer about GCC, at least: we have been trying to reduce the number of ESP-specific patches we keep on top of the upstream Xtensa target, so we'll probably not follow this route.

But I'm a bit worried that the ESP IDF code might be sprinkled with mallocs left and right and if some of the allocated structures happen to contain i64

That is a valid concern! As far as I can tell, running tests with UBSAN enabled should help us find these cases. As we already have the minimal UBSAN runtime in IDF, we'll try to enable these checks in CI and will then gradually fix all these violations.

MabezDev · 2024-01-23T09:46:19Z

I don't think we want to deviate from the Xtensa ISA here. We spent a lot of time getting it right in both Xtensa Clang and Rust, and we have a small test suite to ensure it is correct: https://github.com/MabezDev/esp-rustc-abi-bug. We haven't had any ABI-related issues since, and seeing the report come in over the weekend worried me a little 😅. I see that the real issue was inside esp-idf in this case. I understand your concern @ivmarkov about finding other possible breakages like this in the future, but I think @igrr's suggestion of running UBSAN in esp-idf's CI is probably the more appropriate response.

With this in mind, I think I will close this PR. @haileys @ivmarkov please file any heap-allocated alignment issues you find along your travels against esp-idf please.

32 bit is the maximum alignment required on xtensa

cd6cb2f

ivmarkov mentioned this pull request Jan 21, 2024

Fix esp-idf-sys#278 #211

Closed

ivmarkov mentioned this pull request Jan 21, 2024

Misaligned struct pointer from esp camera esp-rs/esp-idf-sys#278

Closed

igrr mentioned this pull request Jan 22, 2024

Ensure cam_frame_t structure is correctly aligned espressif/esp32-camera#622

Closed

me-no-dev added a commit to espressif/esp32-camera that referenced this pull request Jan 22, 2024

Align the frame buffers to the structure alignment

f00ca7d

cc: esp-rs/esp-idf-sys#278 cc: esp-rs/rust#195

me-no-dev mentioned this pull request Jan 22, 2024

Align the frame buffers to the structure alignment espressif/esp32-camera#623

Merged

me-no-dev added a commit to espressif/esp32-camera that referenced this pull request Jan 22, 2024

Align the frame buffers to the structure alignment (#623)

8df17f7

* Align the frame buffers to the structure alignment cc: esp-rs/esp-idf-sys#278 cc: esp-rs/rust#195 * Include stdalign.h

MabezDev closed this Jan 23, 2024

ivmarkov mentioned this pull request May 25, 2025

Share parsed mbedtls_x509_crt and mbedtls_pk_context between sessions esp-rs/esp-mbedtls#75

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix alignment for values larger than 32 bit on xtensa targets #195

Fix alignment for values larger than 32 bit on xtensa targets #195

Uh oh!

haileys commented Sep 22, 2023

Uh oh!

MabezDev commented Sep 22, 2023

Uh oh!

workingjubilee commented Sep 22, 2023 •

edited

Loading

Uh oh!

MabezDev commented Sep 22, 2023 •

edited

Loading

Uh oh!

haileys commented Sep 22, 2023 •

edited

Loading

Uh oh!

MabezDev commented Oct 11, 2023

Uh oh!

ivmarkov commented Jan 21, 2024

Uh oh!

ivmarkov commented Jan 21, 2024

Uh oh!

igrr commented Jan 22, 2024 •

edited

Loading

Uh oh!

me-no-dev commented Jan 22, 2024

Uh oh!

ivmarkov commented Jan 22, 2024

Uh oh!

ivmarkov commented Jan 22, 2024

Uh oh!

igrr commented Jan 22, 2024

Uh oh!

MabezDev commented Jan 23, 2024

Uh oh!

Uh oh!

Fix alignment for values larger than 32 bit on xtensa targets #195

Fix alignment for values larger than 32 bit on xtensa targets #195

Uh oh!

Conversation

haileys commented Sep 22, 2023

Uh oh!

MabezDev commented Sep 22, 2023

Uh oh!

workingjubilee commented Sep 22, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MabezDev commented Sep 22, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

haileys commented Sep 22, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

MabezDev commented Oct 11, 2023

Uh oh!

ivmarkov commented Jan 21, 2024

Uh oh!

ivmarkov commented Jan 21, 2024

Uh oh!

igrr commented Jan 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

me-no-dev commented Jan 22, 2024

Uh oh!

ivmarkov commented Jan 22, 2024

Uh oh!

ivmarkov commented Jan 22, 2024

Uh oh!

igrr commented Jan 22, 2024

Uh oh!

MabezDev commented Jan 23, 2024

Uh oh!

Uh oh!

workingjubilee commented Sep 22, 2023 •

edited

Loading

MabezDev commented Sep 22, 2023 •

edited

Loading

haileys commented Sep 22, 2023 •

edited

Loading

igrr commented Jan 22, 2024 •

edited

Loading