Skip to content

[#359] extension: fix typing #361

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

gi
Copy link

@gi gi commented Jul 24, 2025

This updates the typing to be more consistent and avoid implicit conversion warnings/errors when compiling on darwin/macOS.

@gi
Copy link
Author

gi commented Jul 24, 2025

I'm still getting a build error when running bundle exec rake compile, but this fixes the build errors in mini_racer_extension.c and serde.c:

cd tmp/arm64-darwin24/mini_racer_extension/3.4.5
/Users/zach.gianos/.local/share/asdf/installs/ruby/3.4.5/bin/ruby -I. ../../../../ext/mini_racer_extension/extconf.rb
checking for -lpthread... yes
checking for -lobjc... yes
checking for whether -Wl,--exclude-libs=ALL  is accepted as LDFLAGS... no
creating Makefile
cd -
cd tmp/arm64-darwin24/mini_racer_extension/3.4.5
/usr/bin/make
compiling ../../../../ext/mini_racer_extension/mini_racer_extension.c
compiling ../../../../ext/mini_racer_extension/mini_racer_v8.cc
make: *** [mini_racer_v8.o] Error 1
rake aborted!

This is fixed.

This updates the typing to be more consistent and avoid implicit conversion
warnings/errors when compiling on darwin/macOS.
@gi gi force-pushed the issue-359-darwin-compile branch from b36908c to a3d5e90 Compare July 25, 2025 23:25
@gi
Copy link
Author

gi commented Jul 26, 2025

The linux-ubuntu-24.04-arm - ruby-3.x - musl jobs are failing on main: see #344.

Otherwise, the changes here pass all tests.

@tisba
Copy link
Collaborator

tisba commented Jul 26, 2025

musl builds are known to be broken currently, don't worry too much about them.

@gi
Copy link
Author

gi commented Jul 26, 2025

musl builds are known to be broken currently, don't worry too much about them.

@tisba Are you good merging this then?

The changes fix the compilation issues I see on my machine. It aligns the types of variables declared in function bodies and in called function signatures or with the Ruby C API.

@@ -470,7 +470,7 @@ static void des_map_end(void *arg)
pop(arg);
}

static void des_object_ref(void *arg, uint32_t id)
static void des_object_ref(void *arg, long id)
Copy link
Author

@gi gi Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ruby's C API rb_ary_entry uses the type long for the offset: (ref)

Since this value is not used anywhere else in this function, it makes sense to forward this type to any callers within this library.

Even if the passed in values will never be negative within this library, the types should still align.

If long is 32 bits, then a uint32_t argument passed through could overflow.

@@ -651,7 +651,7 @@ static int serialize(Ser *s, VALUE v)
return serialize1(s, rb_hash_new(), v);
}

static struct timespec deadline_ms(int ms)
static struct timespec deadline_ms(int64_t ms)
Copy link
Author

@gi gi Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter ms should be of type int64_t to match the type of the arguments from the two invocations which pass in the Context struct's idle_gc and timeout fields, both of which are declare int64_t. This avoids issues when int is 32 bits.

However, this value is used in operations with struct timespec, which declares:

time_t tv_sec;
long tv_nsec;

So, perhaps it might even be declared of type time_t?

@@ -54,13 +55,18 @@ typedef struct Ser {
static const uint8_t the_nan[8] = {0,0,0,0,0,0,0xF8,0x7F}; // canonical nan

// note: returns |v| if v in [0,1,2]
static inline uint32_t next_power_of_two(uint32_t v) {
static inline size_t next_power_of_two(size_t v) {
Copy link
Author

@gi gi Jul 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sole invocation of this function passes in an argument of type size_t, so this function should declare the parameter of the same type to simplify things.

The alternative would be to refactor the invocation, but that has a ripple effect because the value is the parameter to int buf_grow(Buf *b, size_t n), which I am sure we don't want to change the type of n.

@@ -240,24 +246,26 @@ static void ser_num(Ser *s, double v)
}

// ser_bigint: |n| is in bytes, not quadwords
static void ser_bigint(Ser *s, const uint64_t *p, size_t n, int sign)
static void ser_bigint(Ser *s, const void *p, size_t stride, size_t n, int sign)
Copy link
Author

@gi gi Jul 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two invocations of this function:

  • ser_bigint(s, limbs, sizeof(limbs[0]), countof(limbs), sign);
  • ser_bigint(s, &t, sizeof(t), sizeof(t), sign);

limbs is declared unsigned long limbs[65];
t is declared uint64_t t;

So, the types of the pointer arguments to ser_bigint are

  • unsigned long *
  • uint64_t *

If long is 32 bits, then these don't match up and there could be serialization issues.

I saw a few options:

  • try to align the types in each of the callers
  • duplicate the function for each type
  • refactor the function to be agnostic to the pointer's word size

The latter was the easiest and most flexible.

@tisba tisba requested review from bnoordhuis and SamSaffron July 27, 2025 16:44
@tisba
Copy link
Collaborator

tisba commented Jul 27, 2025

@tisba Are you good merging this then?

I'm not really capable of judging if your changes are okay not, sorry 🫣 Builds are green (except the arm+musl ones which are okay to fail currently). I cannot help to confirm if your initial problem got fixed (build issue on macOS, see #359), because I cannot reproduce it.

Maybe @bnoordhuis or @SamSaffron could have a look?

@bnoordhuis
Copy link
Collaborator

I gather these changes are to appease a compiler (?) that you're invoking with -Werror (?) but by and large they're conceptually wrong. For example:

  • an object reference is not long because it can't be < 0 or > UINT32_MAX (and if long is 32 bits it's doubly wrong because values can be > INT32_MAX)

  • changing next_power_of_two's argument type from uint32_t to size_t and all the changes that flow from that makes no sense because arrays in the wire format cannot be > UINT32_MAX; it merely complicates the function and adds dead code

  • the ser_bigint changes look unnecessary; I'm not sure I understand what it's trying to fix

Can I instead suggest simply adding -Wno-shorten-64-to-32 (or -Wno-error=shorten-64-to-32) when supported by the compiler? Or just don't build with -Werror?

@@ -42,7 +43,7 @@ static void des_error_end(void *arg);
// have to worry about allocation failures for small payloads
typedef struct Buf {
uint8_t *buf;
uint32_t len, cap;
size_t len, cap;
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These variables are assigned values of type size_t, so it makes sense to declare them as such. The type of uint32_t could be too narrow (and is on my machine).

@@ -324,26 +332,26 @@ static void ser_object_begin(Ser *s)
}

// |count| is the property count
static void ser_object_end(Ser *s, uint32_t count)
static void ser_object_end(Ser *s, size_t count)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The one invocation of this function in serialize1 passes in a size_t value. It's passed to w_varint which is declared as void w_varint(Ser *s, uint64_t v) with a uint64_t.

Thus, it makes sense to align this function's parameter type with the invocation argument's type of size_t which can easily be widened to a 64-bit uint64_t if size_t happens to be 32 bits.

{
w_byte(s, '{');
w_varint(s, count);
}

static void ser_object_ref(Ser *s, uint32_t id)
static void ser_object_ref(Ser *s, size_t id)
Copy link
Author

@gi gi Jul 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be able to be reverted. However, the invocations of this method are the return value of Ruby'c rb_fix2long function, which obviously returns a long.

Should it be declared as a long instead?
Would you like to cast the invocation arguments to a uint32_t?

Note: the des_object_ref was updated to use long to match Ruby's C API: https://github.com/rubyjs/mini_racer/pull/361/files#r2233122575.

{
w_byte(s, '^');
w_varint(s, id);
}

static void ser_array_begin(Ser *s, uint32_t count)
static void ser_array_begin(Ser *s, size_t count)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same argument as ser_object_end: https://github.com/rubyjs/mini_racer/pull/361/files#r2234665993.

The one invocation of this function [(that is not a constant 2)] in serialize1 passes in a size_t value. It's passed to w_varint which is declared as void w_varint(Ser *s, uint64_t v) with a uint64_t.

Thus, it makes sense to align this function's parameter type with the invocation argument's type of size_t which can easily be widened to a 64-bit uint64_t if size_t happens to be 32 bits.

{
w_byte(s, 'A'); // 'A'=dense, 'a'=sparse
w_varint(s, count); // element count
}

// |count| is the element count
static void ser_array_end(Ser *s, uint32_t count)
static void ser_array_end(Ser *s, size_t count)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same argument as ser_object_end: https://github.com/rubyjs/mini_racer/pull/361/files#r2234665993.

The one invocation of this function [(that is not a constant 2)] in serialize1 passes in a size_t value. It's passed to w_varint which is declared as void w_varint(Ser *s, uint64_t v) with a uint64_t.

Thus, it makes sense to align this function's parameter type with the invocation argument's type of size_t which can easily be widened to a 64-bit uint64_t if size_t happens to be 32 bits.

@@ -569,7 +577,7 @@ static int des1(char (*err)[64], const uint8_t **p, const uint8_t *pe,
return bail(err, "negative zero bigint");
if (pe-*p < (int64_t)u)
goto too_short;
des_bigint(arg, *p, u, 1-2*t);
des_bigint(arg, *p, u, (int) (1-2*t));
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parameter only designates the sign here: 1 or -1, but the type of t is uint64_t. Thus, instead of making the parameter of des_bigint unnecessarily wide, it's easy enough to just cast it down explicitly to avoid any potential compiler warnings that can turn into errors.

@gi
Copy link
Author

gi commented Jul 28, 2025

@bnoordhuis I commented on the changes in the files diff. It might be easier to address each there; however, I will briefly respond here.

  • an object reference is not long because it can't be < 0 or > UINT32_MAX (and if long is 32 bits it's doubly wrong because values can be > INT32_MAX)

Ruby's C API rb_ary_entry uses the type long for the offset, so it makes sense to forward this type requirement to the callers in this library.

See https://github.com/rubyjs/mini_racer/pull/361/files#r2233122575.

  • changing next_power_of_two's argument type from uint32_t to size_t and all the changes that flow from that makes no sense because arrays in the wire format cannot be > UINT32_MAX; it merely complicates the function and adds dead code

The sole invocation of this function passes in an argument of type size_t, so this function should declare the parameter of the same type to simplify things.

See https://github.com/rubyjs/mini_racer/pull/361/files#r2233124981.

  • the ser_bigint changes look unnecessary; I'm not sure I understand what it's trying to fix

See https://github.com/rubyjs/mini_racer/pull/361/files#r2233614647.

@bnoordhuis
Copy link
Collaborator

bnoordhuis commented Jul 28, 2025

The ser/de function types mirror the sizes of fields in the wire format. Replacing e.g. uint32_t with size_t opens up the possibility of passing values > UINT32_MAX and then... well, I don't know exactly what would happen without looking at the code but it most assuredly would go wrong.

If you want to squelch those 64-to-32 warnings, you can narrow-cast at the call sites. r_varint and w_varint currently operate on uint64s but in practice1 most wire format values (with exceptions for e.g. bigints and typed arrays) are 32 bits unsigned values, so you could probably avoid many warnings by introducing r_varint32 and w_varint32.

Having said that, unless you have a good working knowledge of the wire format, there's a pretty good chance you'll break more than you fix, so tread carefully.

edit: although IMO it's all around easier to just disable the warning. Not all compiler warnings are helpful or meaningful.

1 the spec, what there is of it, doesn't say so but in practice V8's serializer only writes 32 bits values in most places

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants