-
Notifications
You must be signed in to change notification settings - Fork 38
Fix alignment for values larger than 32 bit on xtensa targets #195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks for the PR! I don't think this is correct, the Xtensa ISA (here, pg 687) dictates that 64bit ints and 64bit floats have 8-byte alignment. For u128, because this isn't a common type in C compilers we defaulted to clang and gcc's current behaviour. The heap allocator guarantees 32bit word-aligned allocations mostly due to the fact that some address spaces do not offer byte-level access, and it's often required for certain hardware operations, like using the DMA engine. Could you further explain your reasoning behind the PR? |
Er. Perhaps I am just being a simpleton here, but how can anyone allocate 64-bit objects if the heap allocator cannot guarantee 64-bit-aligned allocations? I realize |
The heap allocator in question is the esp-idf one (I believe @haileys is using the std sport, please correct me if I'm wrong!), which definitely supports allocating larger than 32bit structures, but it defaults to an alignment of 32bit for various reasons, some I mentioned above. I think there are two distinctly different topics here, although closely linked. This PR changes the calling convention for Xtensa by changing the required alignment of certain data types, which will in turn, change how these data types are stored in registers at compile time, and how they are stored on the stack. A by-product of this change is that these new primitive types will have different alignments, which will also affect how they should be allocated on the heap which I believe is what @haileys is trying to do? I'd like to know why they wish to change the alignment, and what their end goal is :) |
Thanks for the fast reply @MabezDev!
I wasn't aware of this, thanks for pointing it out! I think then there might be a mismatch between what the ISA specifies and what the ESP platform specifies. From my reading of the Espessif docs, it seems like (except for special cases where the hardware calls for much larger alignments), 32 bit is the maximum alignment ever mentioned. According to page 26 ESP32 technical reference manual:
However, if the ISA spec says one thing, I do agree that we can't just change things here and we need to look for a different solution. Perhaps a little more about my use case would be helpful in determining what to do here: I'm using the bytemuck crate to cast a 32-bit aligned byte buffer to a struct. The struct is My understanding is that on an esp32, there is no safety issue or performance penalty associated with reading a 64 bit value from a 32 bit aligned address, no matter which type of RAM that data is located in. Additionally, I don't control the allocation of this buffer. The buffer my code receives is guaranteed to be 32 bit aligned, but ensuring 64 bit alignment would require me to reallocate + copy the buffer before reading, which is undesirable. Bytemuck should panic if this data is not 32-bit aligned, since it can't know if the pointer it's casting is a pointer to IRAM which only allows 32 bit access, but panicking because stricter alignment requirements are not satisfied is not necessary I believe. The panicking code in bytemuck uses I see that LLVM allows both ABI alignment as well as a "preferred" alignment to be set. I'm not familiar with the semantics of that, but could it possibly be helpful in this situation? If not then I'm left scratching my head, because as you point out the ABI/calling convention is quite clear about alignment requirements, but those requirements are stricter than the alignment requirements for data access. I'd love to find some kind of solution, but I'm also operating at the very edge of my knowledge here so any tips are appreciated also :) |
On some architectures, this would be possible, but Xtensa (generally, there is an unaligned access extension but no esp32's have it) is a strict alignment architecture. The reason your assumption above wouldn't work would be cases where your struct could fit into the available registers when calling a function. In this case, the u64 would be in registers, not on the stack, and the Xtensa arch defines how u64s are to be stored in registers (it needs to start in an even-numbered register) which is different to just storing a u32< in a single register. This is why we can't change the ABI to solve this issue. All of the above is a bit orthogonal to the actual issue though. Why can't you ask whatever is allocating to provide memory with a given alignment? Or if you really can't control it, ask for more than needed and align yourself, so that we prevent the re-alloc, at the cost of a bit more memory usage. |
@MabezDev @haileys As confirmed by @igrr and as evidenced by esp-rs/esp-idf-sys#278 , the Espressif
|
The only difference between this and my other PR is that this one treats the ... which I think we should actually do as well. So I'll close my PR in favor of this one. |
I don't think this is the case. Since this is a 32-bit architecture, there are no CPU instructions which deal with larger-than-32-bit values. The CPU simply doesn't care about alignment of 64-bit ints or floats because it never has to deal with them directly — only with one of their 32-bit parts at a time. I am sorry for setting you in the wrong direction with my comment yesterday @ivmarkov. I have looked at GCC config again and now I see that the 8-byte data types are indeed naturally aligned. This is confirmed by checking the alignment of that structure in GCC (compiler explorer). So this bug is on esp32-camera side, and I suspect that the similar issue may occur many other places. I don't think we have any kind of a warning or a runtime check in place to check for mismatch between the required structure alignment and the actual pointer value. (Edit: except for UBSAN, which should catch this, but not many people enable it since it increases code size a lot.) As long as the pointer is 32-bit aligned, accessing a larger-than-4-bytes value by this pointer will work, since in practice every operations on such value will be represented by multiple 32-bit operations. |
* Align the frame buffers to the structure alignment cc: esp-rs/esp-idf-sys#278 cc: esp-rs/rust#195 * Include stdalign.h
Well, for STD and rustc specifically either way is fine, because it uses ESP IDF's But I'm a bit worried that the ESP IDF code might be sprinkled with Isn't it therefore easier and just more correct to fix both the xtensa targets in rustc, as well as the GCC ones to NOT align on more than 4 bytes? After all - and as you say - there are NO i64 and i128 types from the point of view of the CPU anyway, so why - in the first place - we are aligning these to bigger alignments then?
Ditto. |
I can answer about GCC, at least: we have been trying to reduce the number of ESP-specific patches we keep on top of the upstream Xtensa target, so we'll probably not follow this route.
That is a valid concern! As far as I can tell, running tests with UBSAN enabled should help us find these cases. As we already have the minimal UBSAN runtime in IDF, we'll try to enable these checks in CI and will then gradually fix all these violations. |
I don't think we want to deviate from the Xtensa ISA here. We spent a lot of time getting it right in both Xtensa Clang and Rust, and we have a small test suite to ensure it is correct: https://github.com/MabezDev/esp-rustc-abi-bug. We haven't had any ABI-related issues since, and seeing the report come in over the weekend worried me a little 😅. I see that the real issue was inside esp-idf in this case. I understand your concern @ivmarkov about finding other possible breakages like this in the future, but I think @igrr's suggestion of running UBSAN in esp-idf's CI is probably the more appropriate response. With this in mind, I think I will close this PR. @haileys @ivmarkov please file any heap-allocated alignment issues you find along your travels against esp-idf please. |
32 bit is the maximum alignment required for any object on xtensa, indeed this is the maximum alignment that esp-idf's heap allocator guarantees.