Description
The sync/atomic package has this in the docs in the "Bugs" section: "On both ARM and x86-32, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a global variable or in an allocated struct or slice can be relied upon to be 64-bit aligned." This makes it difficult to use atomic operations in types that are not necessarily at the beginning of an allocated struct or slice. For example, sync.WaitGroup does this:
// 64-bit value: high 32 bits are counter, low 32 bits are waiter count.
// 64-bit atomic operations require 64-bit alignment, but 32-bit
// compilers do not ensure it. So we allocate 12 bytes and then use
// the aligned 8 bytes in them as state.
state1 [12]byte
and this:
func (wg *WaitGroup) state() *uint64 {
	if uintptr(unsafe.Pointer(&wg.state1))%8 == 0 {
		return (*uint64)(unsafe.Pointer(&wg.state1))
	} else {
		return (*uint64)(unsafe.Pointer(&wg.state1[4]))
	}
}
Further, on x86 there are vector instructions that require alignment to 16 bytes, and there are even some instructions (e.g., vmovaps with VEX.256) that require 32-byte alignment. While those instructions are not currently generated by the gc compiler, one can easily imagine using them in assembler code, which will require the values to be appropriately aligned.
To permit programmers to force the desired alignment, I propose that we add new types to the runtime package: runtime.Aligned2, runtime.Aligned4, runtime.Aligned8, runtime.Aligned16, runtime.Aligned32, runtime.Aligned64, runtime.Aligned128. (We could also use bit values, giving us runtime.Aligned16 through runtime.Aligned1024, if that seems clearer.)
These types will be identical to the type struct{} except that they will have the alignment implied by the name. This will make it possible to write a struct as
type vector struct {
	vals [16]byte
	_    runtime.Aligned16
}
and ensure that instances of this struct will always be aligned to a 16 byte boundary.
It will be possible to change sync.WaitGroup to be
type WaitGroup struct {
	noCopy noCopy
	_      runtime.Aligned8
	state  uint64
	sema   uint32
}
simplifying the code.
Although this functionality will not be used widely, it does provide a facility that we need today without requiring awkward workarounds. The drawback is the addition of a new concept to the runtime, though I think it is fairly clear to those people who need to use it.
Another complexity is that we will have to decide whether the size of a value is always a multiple of the alignment of the value. Currently that is the case. It would not be the case for the runtime.AlignedN values. Should it be the case for any struct that contains a field with one of those types? If the size of a value is not always a multiple of the alignment, we will have to modify the memory allocator to support that concept. I don't think that will be particularly difficult, but I haven't really looked.