Skip to content

Conversation

@lgritz
Copy link
Collaborator

@lgritz lgritz commented Apr 28, 2025

  • ImageOutput methods that write scanlines, tiles, and images, which are the main customization points for format output implementations, are given additional methods that take image_span in place of raw pointers and strides.

  • Generally, for each, there is a templated version that takes image_span<T> that communicates both the memory area and the data type conversion requested, a "base case" that takes an explicit TypeDesc for the conversion type and a image_view<std::byte> giving the raw memory layout, and a span<T> convenience version for when there are contiguous strides. Note that when reading mixed channel data types in "native" mode (no type conversion, just leave the data in its original types), you have to use the std::byte image_span version, since the idea is not to do any format conversion, and there may not be a single type involved.

  • For now, the default implementations of these new ImageOutput methods are just wrappers that call the old pointer-based ones. One by one, over time, we can swap them, changing the format implementations to have a full implementation of the new bounded versions, and make their raw pointer versions call the wrappers. The raw pointer ones will be understood to be "unsafe", merely assuming that the pointers always refer to appropriately-sized memory areas. Meanwhile, the ones using spans and image_spans will, due to assertions in their implementations, make it easier to verify (at least in debug mode), that we never touch memory outside these bounds.

* ImageOutput methods that write scanlines, tiles, and images, which
  are the main customization points for format output implementations,
  are given additional methods that take `image_view` in place of raw
  pointers and strides.

* Generally, for each, there is a templated version that takes
  `image_view<T>` that communicates both the memory area and the data
  type conversion requested, a "base case" that takes an explicit
  TypeDesc for the conversion type and a `image_view<std::byte>`
  giving the raw memory layout, and a `cspan<T>` convenience version
  for when there are contiguous strides. Note that when reading mixed
  channel data types in "native" mode (no type conversion, just leave
  the data in its original types), you have to use the std::byte
  image_span version, since the idea is not to do any format
  conversion, and there may not be a single type involved.

* For now, the default implementations of these new ImageOutput
  methods are just wrappers that call the old pointer-based ones. One
  by one, over time, we can swap them, changing the format
  implementations to have a full implementation of the new bounded
  versions, and make their raw pointer versions call the wrappers. The
  raw pointer ones will be understood to be "unsafe", still assuming
  that the pointers always refer to appropriately-sized memory areas.
  Meanwhile, the ones using spans and image_spans will, due to
  assertions in their implementations, make it easier to verify (at
  least in debug mode), that we never touch memory outside these
  bounds.

Signed-off-by: Larry Gritz <[email protected]>
@lgritz lgritz added core APIs Affecting public APIs of core functionality classes, such as ImageInput, ImageOutput, ImageBuf. roadmap This is a priority item on the roadmap for the next major release. strategic Strategic initative with far-reaching effects (in time and code space) labels Apr 28, 2025
Copy link
Contributor

@EmilDohne EmilDohne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely a step in the right direction, thanks a lot for writing this! I quite like the convenience methods taking only a span<T> rather than an image_span<T> in the cases were you only care about contiguous regularly strided access.

/// @param data A full description of the pixel data location,
/// dimensions, and strides.
/// @returns `true` upon success, or `false` upon failure.
template<typename T> bool write_scanline(int y, int z, image_span<T> data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[question] Is the template statement being inlined part of the clang-format doing its thing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

being inlined? Yeah, I'm mostly just letting clang-format do its thing and not fighting it too hard unless it makes something confusingly unreadable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that wasnt very clear, I meant

template <typename T> void my_function

Rather than

template <typename T>
void my_function

I've just never seen that convention but if clang tidy decides its good who am I to judge :D

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I usually separate it to 2 lines, as you did, and so does clang-tidy usually! I don't know exactly how it decides in this case, but it's what clang-tidy did. Maybe because it's IN a class definition and is a short enough param list that it all easily fits on one line?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose so yeah!

std::vector<unsigned char>& scratch, unsigned int dither,
int xorigin, int yorigin, int zorigin)
{
// Eventually, we will make a fully save, span-native implementation of
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] save -> safe

}

/// Write the full scanline that includes pixels (*,y,z), taking
/// contiguous values from a `span`. For 2D non-volume images, `z` should
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] Double space span. For

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a double space guy.

sz, data.size_bytes());
return false;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[consider] I am uncertain about having a size check like this only present in debug builds (or if oiio_print_debug is defined). Is the rationale for this performance based (in which case I doubt this will add much/any overhead but would have to test)?

Either way I think this may be a bit of a slippery slope since without this check this function is free to access memory out of bounds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware this is the base case and not directly intended to be used by developers but as long as its in the public API someone will use this :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're probably right here. Let me take another swing at it.

/// contiguous strides in all dimensions. This is a convenience wrapper
/// around the `write_scanlines()` that takes an `image_span<const T>`.
template<typename T>
bool write_scanlines(int ybegin, int yend, int z, TypeDesc format,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[fix] format argument is unused.

[question] Do you happen to know what flags OpenImageIO is compiled with? I'm assuming -Wall but perhaps adding -Wunused-variable (which is by default in -Wextra) would help catch such errors in the future?
If then you still want the unused variable you can use https://en.cppreference.com/w/cpp/language/attributes/maybe_unused

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, that's just an oversight, I think, good catch

auto ispan = image_span<const T>(data.data(), m_spec.nchannels,
m_spec.width, m_spec.height,
m_spec.depth);
OIIO_DASSERT(data.size_bytes() == ispan.size_bytes()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[consider] I'd maybe make this not a debug-assert but rather a contract of the function (via a check and then a return false). Since this is a fairly easy error to make as a consumer of the API

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, yeah

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, no, in this case, ispan is something we compute from the span that the user passes. It's really just a check on our own code, so OIIO_DASSERT is what I wanted here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thats an oversight from my part, all good then!

{
auto ispan = image_span<T>(data.data(), m_spec.nchannels, xend - xbegin,
yend - ybegin, zend - zbegin);
OIIO_DASSERT(data.size_bytes() == ispan.size_bytes()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[consider] making this a contract of the function rather than a debug assert (same as in the write_image function).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will do

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here -- this really should be a OIIO_DASSERT, since it's just checking our own internal logic. It's not a contract because the user can't screw it up, they're passing a span and we're working with only what they told us was ok.

@lgritz
Copy link
Collaborator Author

lgritz commented May 5, 2025

Something I was thinking about this morning... Currently, the base case with untyped memory is presented last, for each triplet of related methods. But do you think it would be more clear if I change the order to:

// Base case -- untyped raw memory + explicit TypeDesc
// Full explanation here of all parameters
bool write(image_span<byte>, TypeDesc);

// Brief explanation: data is all one type, implied by the span
template<typename T>
bool write(image_span<T>);

// Brief explanation: data is all one type and also contiguous, so
// a plain span<T> will do.
template<typename T>
bool write(span<T>);

Do you think this leads to a more clear flow of the explanation?

Signed-off-by: Larry Gritz <[email protected]>
@EmilDohne
Copy link
Contributor

This is something im also often torn about but I'd almost veer towards the one where the base case comes first even though its the least likely you'd use but just to build it up logically. So in short, I'd go with the variant you just proposed!

@lgritz
Copy link
Collaborator Author

lgritz commented May 5, 2025

This is something im also often torn about but I'd almost veer towards the one where the base case comes first even though its the least likely you'd use but just to build it up logically. So in short, I'd go with the variant you just proposed!

Yes, your description is exactly what I was thinking: I started by presenting the functions in the order of most likely to be used. But now I'm thinking, "how can I tell the most logical story that explains all three." Start with the fully general one with all the parameters that explains what's really going on under the hood, then show the shortcuts for common cases.

@lgritz
Copy link
Collaborator Author

lgritz commented May 5, 2025

I posted an update with a rearrangement of the order that the methods are presented. I start with write_image, since it's the most common case. Then within each triplet, start with a full explanation of the generic version using explicit TypeDesc + image_span<bytes>, then briefly explain the two simplified cases of image_span<T> and span<T>.

For each triplet, explain in order of:
  * Generic: TypeDesc + image_span<byte> (full explanation)
  * implied type: image_span<T> (brief explanation)
  * implied type + contiguous: span<T> (brief explanation)

Also, move write_image FIRST, followed by write_scanline(s) and
write_tile(s), since the whole image will be the most common use case.

Signed-off-by: Larry Gritz <[email protected]>
@lgritz lgritz force-pushed the lg-imageio-span-write branch from 44fc36e to 1b94667 Compare May 5, 2025 19:15
Copy link
Contributor

@EmilDohne EmilDohne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! The order definitely reads better like this

@lgritz
Copy link
Collaborator Author

lgritz commented May 7, 2025

I will leave this open for any further comments or guidance until the end of the week, then (assuming no high priority change requests) I will merge and move on to ImageInput.

@jessey-git
Copy link
Contributor

As suspected sonar will complain about image_span for each function Pass large object "data" by reference to const. since it's 64 bytes big. I think other static analysis tools will also complain.

And none of the big 3 compilers generate particularly small code around each call site. Each generates several instructions to do the copy of the struct before making the call.

I don't immediately have a good benchmark locally to see how things would play out if we just pass by reference instead.

@EmilDohne
Copy link
Contributor

I'd be curious about benchmarks too but I very much doubt that copying around 64 bytes would make any significant difference in these functions as compression/write speeds will be the main bottlenecks.

In fact, I wouldnt be surprised to see the copies be faster as the compiler may optimize other parts of the function better

@lgritz
Copy link
Collaborator Author

lgritz commented May 8, 2025

I'd be curious about benchmarks too but I very much doubt that copying around 64 bytes would make any significant difference in these functions as compression/write speeds will be the main bottlenecks.

Yes, but we don't use these ONLY for functions that are doing a bunch of I/O and compression. It's used for all sorts of utility functions that just copy data around in memory, etc.

It's also a pretty big PITA if the static analyzers are constantly yelling, so that's a consideration, too.

I will try to put together some benchmarks of pass by value vs pass by reference.

(I'm curious about what the analyzers are saying about C++23 mdspan which is in many ways similar purpose and similar stack size.)

@EmilDohne
Copy link
Contributor

Actually got around to doing it with a pretty bastardized version of the code:

https://quick-bench.com/q/9LSLiQkta9VPl8-blkmA7sRSrDI
image

I initially had it iterating over 3 channel and 64x64 pixels but I've reduced it to iterating over just 3 elements and it seems the difference is non-existant in both cases.
Is it possible to silence sonar for this?

@lgritz
Copy link
Collaborator Author

lgritz commented May 8, 2025

I did not know about https://quick-bench.com before! (Though it doesn't seem to be able to tell us about MSVS on Windows, or in any way compare different HW architectures.)

It also looks like passing by const ref has no speed penalty, so maybe that's the way to solve the problem?

@EmilDohne
Copy link
Contributor

quick-bench is really nice to quickly test out some minimum reproducible examples but as you said unfortunately it doesn't have msvc which sadly is often also the odd one out when it comes to performance.

Since there is not difference between const ref or copy I don't mind either, if const ref makes sonar happy then that's a good solution :)

@EmilDohne
Copy link
Contributor

(I'm curious about what the analyzers are saying about C++23 mdspan which is in many ways similar purpose and similar stack size.)

I do wonder if the sonar warnings might not be outdated since I (and the compilers too apparently) would certainly not consider 64 bytes too big to copy around, unless you need nanosecond latency/performance.
Judging by this article though they seem to follow the C++ standard guidelines which are fairly loosely defined

@lgritz
Copy link
Collaborator Author

lgritz commented May 8, 2025

Thanks for that link. The blog post makes it sound like Sonar only warns on the declarations, which definitely makes it easier for us to annotate those spots with //NOSONAR or whatever to silence them. I'm always just a bit nervous about silencing warnings and not knowing what future things we may wish to be warned about on those lines.

@EmilDohne
Copy link
Contributor

Im guessing theres no way to silence specific warnings? Im not too familiar with sonar I must admit

@lgritz
Copy link
Collaborator Author

lgritz commented May 8, 2025

You can silence one kind of warning everywhere, or in a list of cpp/h files. Or you can silence all warnings on one line of code with a comment formatted a specific way. There's not a way to say something like "never warn about passing this particular struct by value". And as far as I know, you can't say "ignore only this one kind of warning on this line."

@lgritz
Copy link
Collaborator Author

lgritz commented May 8, 2025

I made some benchmarks.

The task is to sum all values in an image -- "big" is 2048x1536, so heavy on accessing data that the image span describes, and "small" is 16x16, so not much work inside the loops compared to the image_span construction and passing itself.

For each, I pass the image_span by value, and pass by reference, and for each of those choices, using a single already-constructed image_span versus constructing it from the sizes and strides on each call ("imm[ediate]"). And also an old school version where there's no image_span but I pass the raw pointer, sizes, and strides as separate parameters.

Caveats: VERY preliminary results, only running on one configuration (my Intel-based MacbookPro)

  pass by value     (big): 12915.6 us (+/- 835.8us),    0.1 k/s
  pass by value imm (big): 12381.4 us (+/- 697.8us),    0.1 k/s
  pass by ref       (big): 12262.5 us (+/- 238.1us),    0.1 k/s
  pass by ref imm   (big): 12274.8 us (+/- 1223.8us),    0.1 k/s
  pass by ptr       (big): 12519.4 us (+/- 453.1us),    0.1 k/s

  pass by value     (small):  987.5 ns (+/- 47.0ns),    1.0 M/s
  pass by value imm (small):  906.2 ns (+/- 12.1ns),    1.1 M/s
  pass by ref       (small):  915.7 ns (+/- 25.1ns),    1.1 M/s
  pass by ref imm   (small):  898.2 ns (+/- 17.0ns),    1.1 M/s
  pass by ptr       (small):  903.3 ns (+/- 35.9ns),    1.1 M/s

It looks like none of these choices make a consistent big difference -- they are all clustered within the error bars of the trial-to-trial and run-to-run variation.

I will try this on other platforms and alter the CI to output it for every configuration we have, maybe it'll turn something interesting up.

@lgritz
Copy link
Collaborator Author

lgritz commented May 9, 2025

I pushed an update that adds benchmarking of the pass by value vs reference issue.

I had to do some experimentation to get it able to run on the CI machines, I will put that in a different PR to avoid the clutter here (but I will report the results here). Shockingly, running benchmarks on the CI runners is much better and more consistent than I would have guessed.

I'm basically testing several things here at once:

  1. Pass image_span compared to the passing of raw pointers + sizes + strides that we are trying to replace.
  2. Pass image_span by value versus by const reference.
  3. Passing an immediate-constructed image_span on every call versus using the same one over and over.
  4. Passing an image_span describing a big image (lots of per-pixel access via the image_span) versus passing a "small" image_span (more dominated by the assembling and passing of the span itself compared to the pixels).
  5. Whether things things are different per platform.

Linux, Intel CPU, gcc11

  pass by value     (big): 11764.1 us (+/- 8.5us),    0.1 k/s
  pass by value imm (big): 11781.1 us (+/- 11.1us),    0.1 k/s
  pass by ref       (big): 11768.3 us (+/- 9.7us),    0.1 k/s
  pass by ref imm   (big): 11797.6 us (+/- 10.1us),    0.1 k/s
  pass by ptr       (big): 11789.9 us (+/- 10.1us),    0.1 k/s
  pass by value     (small):  899.9 ns (+/- 0.5ns),    1.1 M/s
  pass by value imm (small):  906.4 ns (+/- 0.6ns),    1.1 M/s
  pass by ref       (small):  900.5 ns (+/- 0.9ns),    1.1 M/s
  pass by ref imm   (small):  908.2 ns (+/- 0.3ns),    1.1 M/s
  pass by ptr       (small):  908.4 ns (+/- 0.7ns),    1.1 M/s

The above is from the GitHub runners, and for comparison, here is on my workstation at work, definitely not sharing with any other important process:

  pass by value     (big): 8344.0 us (+/- 3.3us),    0.1 k/s
  pass by value imm (big): 8354.5 us (+/- 9.3us),    0.1 k/s
  pass by ref       (big): 8354.4 us (+/- 10.7us),    0.1 k/s
  pass by ref imm   (big): 8347.8 us (+/- 13.1us),    0.1 k/s
  pass by ptr       (big): 8371.5 us (+/- 41.9us),    0.1 k/s
  pass by value     (small):  639.1 ns (+/- 0.2ns),    1.6 M/s
  pass by value imm (small):  650.1 ns (+/- 0.2ns),    1.5 M/s
  pass by ref       (small):  639.2 ns (+/- 0.1ns),    1.6 M/s
  pass by ref imm   (small):  648.9 ns (+/- 0.2ns),    1.5 M/s
  pass by ptr       (small):  639.8 ns (+/- 0.3ns),    1.6 M/s

the machine is obviously faster, but the results are proportional and we can see that even on the GHA runner, the trial-to-trial timing variation is really not bad at all. I think we can assume that benchmark results of this nature on GHA are reasonably valid.

Mac Intel, clang (GHA)

  pass by value     (big): 12840.2 us (+/- 51.2us),    0.1 k/s
  pass by value imm (big): 12231.8 us (+/- 100.7us),    0.1 k/s
  pass by ref       (big): 12634.1 us (+/- 123.8us),    0.1 k/s
  pass by ref imm   (big): 12339.1 us (+/- 67.3us),    0.1 k/s
  pass by ptr       (big): 12244.8 us (+/- 201.1us),    0.1 k/s
  pass by value     (small):  946.1 ns (+/- 3.1ns),    1.1 M/s
  pass by value imm (small):  914.0 ns (+/- 2.9ns),    1.1 M/s
  pass by ref       (small):  921.1 ns (+/- 20.0ns),    1.1 M/s
  pass by ref imm   (small):  915.2 ns (+/- 5.5ns),    1.1 M/s
  pass by ptr       (small):  901.9 ns (+/- 6.9ns),    1.1 M/s

Mac ARM, clang (GHA)

  pass by value     (big): 11848.1 us (+/- 19.9us),    0.1 k/s
  pass by value imm (big): 11830.0 us (+/- 7.0us),    0.1 k/s
  pass by ref       (big): 11835.2 us (+/- 14.0us),    0.1 k/s
  pass by ref imm   (big): 11862.2 us (+/- 47.6us),    0.1 k/s
  pass by ptr       (big): 11836.9 us (+/- 9.8us),    0.1 k/s
  pass by value     (small):  897.1 ns (+/- 2.2ns),    1.1 M/s
  pass by value imm (small):  830.1 ns (+/- 1.4ns),    1.2 M/s
  pass by ref       (small):  896.5 ns (+/- 1.0ns),    1.1 M/s
  pass by ref imm   (small):  838.3 ns (+/- 12.5ns),    1.2 M/s
  pass by ptr       (small):  839.9 ns (+/- 14.2ns),    1.2 M/s

Windows Intel, MSVS (GHA)

  pass by value     (big): 11782.0 us (+/- 11.0us),    0.1 k/s
  pass by value imm (big): 11818.4 us (+/- 104.2us),    0.1 k/s
  pass by ref       (big): 11877.4 us (+/- 140.0us),    0.1 k/s
  pass by ref imm   (big): 11789.3 us (+/- 14.6us),    0.1 k/s
  pass by ptr       (big): 11776.8 us (+/- 7.0us),    0.1 k/s
  pass by value     (small):  960.4 ns (+/- 2.8ns),    1.0 M/s
  pass by value imm (small):  965.2 ns (+/- 1.1ns),    1.0 M/s
  pass by ref       (small):  964.3 ns (+/- 7.6ns),    1.0 M/s
  pass by ref imm   (small):  959.2 ns (+/- 1.9ns),    1.0 M/s
  pass by ptr       (small):  937.1 ns (+/- 1.4ns),    1.1 M/s

So as far as I can discern, the answers are:

  1. There is no apparent consistent penalty for using image_span versus passing raw pointers + sizes + strides. At least, not big enough to be discernable against the background noise of run-to-run variation.
  2. The timings for passing image_span by value vs by const reference seem to be effectively identical.
  3. Passing an immediate-constructed image_span on every call versus using the same one over and over can sometimes make a difference, but not consistently across (or even within) platforms. Maybe these differences are just statistical noise, or maybe are real but vary too much by specific use case and surrounding code to generalize.
  4. Passing an image_span describing a big image obviously took more time than the small image, but things seemed proportional overall, so I don't think there is any particular concern for either the passing of the image_span itself, or in the accessing of the pixel data via the image_spans, compared to using raw pointers.
  5. I was rather shocked to see that the timings were as similar as they were across the different types of GHA runners. The old Intel Macs were a bit slower than the rest, especially for the "big image" tests (probably older machines with slower memory systems?), but all the rest were close. (Aside: the GHA Linux runners compute at about the same speed as the GHA Windows runners, and are probably the same underlying hardware, so the fact that it takes dramatically longer to build the project on Windows than any of the other platforms seems like it's just MSVS having abysmal compilation performance.)

In conclusion I'm confident that switching from pointers to image_span isn't going to hurt performance, and I don't think it matters at all (for performance) whether we pass image_span by value or by const ref, so if the latter silences the static analysis, I don't have any objections on performance grounds. What do you all think?

@lgritz
Copy link
Collaborator Author

lgritz commented May 10, 2025

@EmilDohne @jessey-git What's your call on what I should do here?

My current thinking is that maybe it's (slightly) a better choice to change to const ref, for the following reasons:

  1. Slightly more idiomatic C++ to pass const Foo& for a struct that has a bunch of fields and is larger than, say, 16 bytes or so.
  2. Silences Sonar. I think we can put //NOSONAR markers on every pass-by-value declaration, but that's a lot of clutter that will make the declarations harder to read, and it will be hard to remember to do in every future case (and would not help other static analysis systems or future compiler warnings).
  3. We've shown that there is no speed penalty.
  4. Speculating, but if there were a CHAIN of calls foo(image_span) calls bar(image_span) calls baz(image_span), or recursion involving an image_span, there may be measurable savings for those all being refs instead of having to make a copy of the struct for every call?

Let me know if you think there's a compelling reason to stick to pass-by-value here. Otherwise, I'll soon change the declarations to pass by const ref and put this PR to bed so we can move to the next thing. (And remember, this is only in main, won't be truly locked down until a fall release, so the penalty for changing our mind in the next few months is very low.)

@jessey-git
Copy link
Contributor

That seems reasonable and yeah git isn't readonly so we can change again if needed. At some tedious busy-work expense that is.

On the monday call I was mainly torn between "'spans' are typically passed by value" and "but it's really kind of chonky". So for now I'd opt for pass-by-ref and prevent the clutter from the //NOSONAR.

@lgritz
Copy link
Collaborator Author

lgritz commented May 10, 2025

I agree, I think we were confused by the fact that a regular span<> is always passed by value, and we were probably inclined to think that since this has a similar name and purpose, maybe it should be passed by value as well. But when I think about it now, there's is no other 10-field, 64-byte structure that I wouldn't obviously pass by const ref.

@lgritz
Copy link
Collaborator Author

lgritz commented May 10, 2025

Updated with a change of convention to passing image_span parameters to functions as const reference.

@EmilDohne
Copy link
Contributor

I'm also perfectly happy with pass by const ref, fine to merge by me!

@lgritz lgritz force-pushed the lg-imageio-span-write branch from eadfc2e to 6dfbf5b Compare May 10, 2025 16:28
@lgritz
Copy link
Collaborator Author

lgritz commented May 10, 2025

Sorry, one more amendment:

A while back, we discussed this and decided to drop the "z" parameter from the new scanline-based read/write methods, since there were no known formats that were both volumetric and scanline oriented. (The few volume formats we support only have tiles.)

My old brain had forgotten that and left the z's in while working on this PR. So I'm dropping those parameters now, before they take root and I forget about them.

@lgritz lgritz requested a review from Copilot May 10, 2025 19:08
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds new image_span‑based overloads for the main ImageOutput methods and updates related tests and utility functions. Key changes include:

  • Adding templated image_span versions of write_image, write_scanline(s), write_tile(s), and write_rectangle.
  • Updating typedesc and span APIs for improved const‐correctness and modern C++ style.
  • Adjusting tests and benchmarks to work with the new span-based interfaces.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated no comments.

Show a summary per file
File Description
testsuite/docs-examples-cpp/src/docs-examples-imageoutput.cpp Updated examples to use std::vector and image_span based API calls
src/libutil/typedesc_test.cpp Extended tests to cover const qualified types
src/libOpenImageIO/imageoutput.cpp Introduced new image_span overloads for image output methods
src/libOpenImageIO/imageio.cpp Updated contiguize and copy_image functions to accept const references
src/include/OpenImageIO/typedesc.h Added const-specializations for BaseTypeFromC and updated BaseTypeFromC_v
src/include/OpenImageIO/span.h and image_span.h Adjusted API signatures and dynamic_extent constant
CMakeLists.txt Bumped version to 3.1.2.0
Comments suppressed due to low confidence (2)

src/include/OpenImageIO/span.h:43

  • Changing dynamic_extent from -1 to the maximum representable value is a significant API change; please confirm that all consumers of this constant have been adjusted accordingly.
inline constexpr span_size_t dynamic_extent = std::numeric_limits<span_size_t>::max();

src/libOpenImageIO/imageoutput.cpp:292

  • The write_rectangle overload currently always returns false; please add a comment or TODO to clarify that this stub is intentional until a full implementation is provided.
bool ImageOutput::write_rectangle(int /*xbegin*/, int /*xend*/, int /*ybegin*/, int /*yend*/, int /*zbegin*/, int /*zend*/, TypeDesc /*format*/, const image_span<const std::byte>& /*data*/)

@lgritz
Copy link
Collaborator Author

lgritz commented May 11, 2025

OK, merging and moving on to the next thing!

@lgritz lgritz merged commit b356152 into AcademySoftwareFoundation:main May 11, 2025
29 of 30 checks passed
@lgritz lgritz deleted the lg-imageio-span-write branch May 11, 2025 15:42
scott-wilson pushed a commit to scott-wilson/OpenImageIO that referenced this pull request May 17, 2025
…Foundation#4727)

* ImageOutput methods that write scanlines, tiles, and images, which are
the main customization points for format output implementations, are
given additional methods that take `image_span` in place of raw pointers
and strides.

* Generally, for each, there is a templated version that takes
`image_span<T>` that communicates both the memory area and the data type
conversion requested, a "base case" that takes an explicit TypeDesc for
the conversion type and a `image_view<std::byte>` giving the raw memory
layout, and a `span<T>` convenience version for when there are
contiguous strides. Note that when reading mixed channel data types in
"native" mode (no type conversion, just leave the data in its original
types), you have to use the std::byte image_span version, since the idea
is not to do any format conversion, and there may not be a single type
involved.

* For now, the default implementations of these new ImageOutput methods
are just wrappers that call the old pointer-based ones. One by one, over
time, we can swap them, changing the format implementations to have a
full implementation of the new bounded versions, and make their raw
pointer versions call the wrappers. The raw pointer ones will be
understood to be "unsafe", merely assuming that the pointers always
refer to appropriately-sized memory areas. Meanwhile, the ones using
spans and image_spans will, due to assertions in their implementations,
make it easier to verify (at least in debug mode), that we never touch
memory outside these bounds.

---------

Signed-off-by: Larry Gritz <[email protected]>
Signed-off-by: Scott Wilson <[email protected]>
scott-wilson pushed a commit to scott-wilson/OpenImageIO that referenced this pull request May 18, 2025
…Foundation#4727)

* ImageOutput methods that write scanlines, tiles, and images, which are
the main customization points for format output implementations, are
given additional methods that take `image_span` in place of raw pointers
and strides.

* Generally, for each, there is a templated version that takes
`image_span<T>` that communicates both the memory area and the data type
conversion requested, a "base case" that takes an explicit TypeDesc for
the conversion type and a `image_view<std::byte>` giving the raw memory
layout, and a `span<T>` convenience version for when there are
contiguous strides. Note that when reading mixed channel data types in
"native" mode (no type conversion, just leave the data in its original
types), you have to use the std::byte image_span version, since the idea
is not to do any format conversion, and there may not be a single type
involved.

* For now, the default implementations of these new ImageOutput methods
are just wrappers that call the old pointer-based ones. One by one, over
time, we can swap them, changing the format implementations to have a
full implementation of the new bounded versions, and make their raw
pointer versions call the wrappers. The raw pointer ones will be
understood to be "unsafe", merely assuming that the pointers always
refer to appropriately-sized memory areas. Meanwhile, the ones using
spans and image_spans will, due to assertions in their implementations,
make it easier to verify (at least in debug mode), that we never touch
memory outside these bounds.

---------

Signed-off-by: Larry Gritz <[email protected]>
Signed-off-by: Scott Wilson <[email protected]>
lgritz added a commit that referenced this pull request May 20, 2025
ImageInput methods that read scanlines, tiles, and image are given new
flavors that take image_span, much like the changes of PR #4727 did for
ImageOutput.

Generally, for each, there are three versions: (1) a low-level virtual
method base case that takes an `image_span<byte>` that describes the
underlying memory, and an explicit TypeDesc that specifies the data type
conversion (including allowing TypeUnknown to indicate keeping channels
in their native per-channel format from the file); (2) a templated
version taking an `image_span<T>` that infers the data type conversion
from `T` (must be a single type for all channels); (3) a templated
version taking a `span<T>` that further implies a contiguous buffer in
all dimensions.

For now, the default implementations of these new ImageInput methods are
just wrappers that call the old pointer-based ones. One by one, over
time, we can swap them, changing the format implementations to have a
full implementation of the new bounded versions, and make their raw
pointer versions call the wrappers. The raw pointer ones will be
understood to be "unsafe", merely assuming that the pointers always
refer to appropriately-sized memory areas. Meanwhile, the ones using
spans and image_spans will, due to assertions in their implementations,
make it easier to verify (at least in debug mode), that we never touch
memory outside these bounds.

---------

Signed-off-by: Larry Gritz <[email protected]>
zachlewis pushed a commit to zachlewis/OpenImageIO that referenced this pull request Aug 1, 2025
…Foundation#4727)

* ImageOutput methods that write scanlines, tiles, and images, which are
the main customization points for format output implementations, are
given additional methods that take `image_span` in place of raw pointers
and strides.

* Generally, for each, there is a templated version that takes
`image_span<T>` that communicates both the memory area and the data type
conversion requested, a "base case" that takes an explicit TypeDesc for
the conversion type and a `image_view<std::byte>` giving the raw memory
layout, and a `span<T>` convenience version for when there are
contiguous strides. Note that when reading mixed channel data types in
"native" mode (no type conversion, just leave the data in its original
types), you have to use the std::byte image_span version, since the idea
is not to do any format conversion, and there may not be a single type
involved.

* For now, the default implementations of these new ImageOutput methods
are just wrappers that call the old pointer-based ones. One by one, over
time, we can swap them, changing the format implementations to have a
full implementation of the new bounded versions, and make their raw
pointer versions call the wrappers. The raw pointer ones will be
understood to be "unsafe", merely assuming that the pointers always
refer to appropriately-sized memory areas. Meanwhile, the ones using
spans and image_spans will, due to assertions in their implementations,
make it easier to verify (at least in debug mode), that we never touch
memory outside these bounds.

---------

Signed-off-by: Larry Gritz <[email protected]>
zachlewis pushed a commit to zachlewis/OpenImageIO that referenced this pull request Aug 1, 2025
…oundation#4748)

ImageInput methods that read scanlines, tiles, and image are given new
flavors that take image_span, much like the changes of PR AcademySoftwareFoundation#4727 did for
ImageOutput.

Generally, for each, there are three versions: (1) a low-level virtual
method base case that takes an `image_span<byte>` that describes the
underlying memory, and an explicit TypeDesc that specifies the data type
conversion (including allowing TypeUnknown to indicate keeping channels
in their native per-channel format from the file); (2) a templated
version taking an `image_span<T>` that infers the data type conversion
from `T` (must be a single type for all channels); (3) a templated
version taking a `span<T>` that further implies a contiguous buffer in
all dimensions.

For now, the default implementations of these new ImageInput methods are
just wrappers that call the old pointer-based ones. One by one, over
time, we can swap them, changing the format implementations to have a
full implementation of the new bounded versions, and make their raw
pointer versions call the wrappers. The raw pointer ones will be
understood to be "unsafe", merely assuming that the pointers always
refer to appropriately-sized memory areas. Meanwhile, the ones using
spans and image_spans will, due to assertions in their implementations,
make it easier to verify (at least in debug mode), that we never touch
memory outside these bounds.

---------

Signed-off-by: Larry Gritz <[email protected]>
zachlewis pushed a commit to zachlewis/OpenImageIO that referenced this pull request Sep 1, 2025
…Foundation#4727)

* ImageOutput methods that write scanlines, tiles, and images, which are
the main customization points for format output implementations, are
given additional methods that take `image_span` in place of raw pointers
and strides.

* Generally, for each, there is a templated version that takes
`image_span<T>` that communicates both the memory area and the data type
conversion requested, a "base case" that takes an explicit TypeDesc for
the conversion type and a `image_view<std::byte>` giving the raw memory
layout, and a `span<T>` convenience version for when there are
contiguous strides. Note that when reading mixed channel data types in
"native" mode (no type conversion, just leave the data in its original
types), you have to use the std::byte image_span version, since the idea
is not to do any format conversion, and there may not be a single type
involved.

* For now, the default implementations of these new ImageOutput methods
are just wrappers that call the old pointer-based ones. One by one, over
time, we can swap them, changing the format implementations to have a
full implementation of the new bounded versions, and make their raw
pointer versions call the wrappers. The raw pointer ones will be
understood to be "unsafe", merely assuming that the pointers always
refer to appropriately-sized memory areas. Meanwhile, the ones using
spans and image_spans will, due to assertions in their implementations,
make it easier to verify (at least in debug mode), that we never touch
memory outside these bounds.

---------

Signed-off-by: Larry Gritz <[email protected]>
zachlewis pushed a commit to zachlewis/OpenImageIO that referenced this pull request Sep 1, 2025
…oundation#4748)

ImageInput methods that read scanlines, tiles, and image are given new
flavors that take image_span, much like the changes of PR AcademySoftwareFoundation#4727 did for
ImageOutput.

Generally, for each, there are three versions: (1) a low-level virtual
method base case that takes an `image_span<byte>` that describes the
underlying memory, and an explicit TypeDesc that specifies the data type
conversion (including allowing TypeUnknown to indicate keeping channels
in their native per-channel format from the file); (2) a templated
version taking an `image_span<T>` that infers the data type conversion
from `T` (must be a single type for all channels); (3) a templated
version taking a `span<T>` that further implies a contiguous buffer in
all dimensions.

For now, the default implementations of these new ImageInput methods are
just wrappers that call the old pointer-based ones. One by one, over
time, we can swap them, changing the format implementations to have a
full implementation of the new bounded versions, and make their raw
pointer versions call the wrappers. The raw pointer ones will be
understood to be "unsafe", merely assuming that the pointers always
refer to appropriately-sized memory areas. Meanwhile, the ones using
spans and image_spans will, due to assertions in their implementations,
make it easier to verify (at least in debug mode), that we never touch
memory outside these bounds.

---------

Signed-off-by: Larry Gritz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core APIs Affecting public APIs of core functionality classes, such as ImageInput, ImageOutput, ImageBuf. roadmap This is a priority item on the roadmap for the next major release. strategic Strategic initative with far-reaching effects (in time and code space)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants