DOC: NEP: array in -> array out #2
==============================
NEP 57 — Array In -> Array Out
==============================

:Author: Matt Haberland <[email protected]>, Add Your Name Here <[email protected]>
:Status: Draft
:Type: Standards Track
:Created: 2025-05-14
:Resolution:

Abstract
--------

:ref:`NEP56` proposed adding nearly full support for the array API standard,
but many operations involving higher rank arrays still return scalars instead
of zero-rank arrays. This NEP would redefine the result of these operations

Comment on lines +15 to +16

Review comment: This suggests that numpy scalars are not compatible with what the array-api defines as an "array object". But is that really the case? In most situations, you can get away with treating a scalar as if it's a full-fledged array, even if you index it as one:

>>> duck = np.array(1)
>>> ducknt = np.int_(-1)
>>> duck[()]
np.int64(1)
>>> ducknt[()]
np.int64(-1)

I noticed that it also has an [...]

Review comment: I think most of this discussion has already played out in scipy/scipy#22947 (comment). So yes, we can play devil's advocate and say that [...]

Review comment: So array-api objects are mutable, and numpy scalars are not. Got it.

Review comment: Hehe, even trickier. Array API is either mutable or immutable. But the abstract class of numpy array|scalar would have undefined mutability.

to be zero-rank arrays.

Review comment: [nitpick] The term "rank" (in the "no. of dimensions" sense) isn't used very often in the numpy docs. Maybe because in linear algebra (and in the UK) "rank" has a different meaning? Either way, I suppose it would be more numpy-esque to e.g. refer to "zero-rank" as [...]

Review comment: NEP 27 proposed the addition of "Zero-rank arrays" to NumPy. That seemed like a fairly appropriate source to borrow the term from. So it's fine if the majority of authors would prefer a different term, but I don't immediately see a reason to change it.

Review comment: I would've said the same thing if I had been around when reviewing NEP 27 🤷🏻.

Motivation and scope
--------------------

The 2024.12 version of the array API standard [1]_ states:

    Apart from array object attributes, such as ``ndim``, ``device``, and
    ``dtype``, all operations in this standard return arrays (or tuples of
    arrays)...

Beginning with :ref:`NEP56` and NumPy 2.0.0, NumPy added nearly full support
for the standard, but explicitly deferred compliance with this aspect.

    We note that one NumPy-specific behavior that remains is returning array
    scalars rather than 0-D arrays in most cases where the standard, and other
    array libraries, return 0-D arrays (e.g., indexing and reductions)...
    There have been multiple discussions over the past year about the
    feasibility of removing array scalars from NumPy, or at least no longer
    returning them by default. However, this would be a large effort with some
    uncertainty about technical risks and impact of the change, and no one has
    taken it on.

This NEP represents an effort to "take it on". It is a worthwhile undertaking:
scalars "basically duck type 0-D arrays", but they do not *fully* duck type
zero-rank arrays, with the most fundamental difference being that scalars are
immutable and zero-rank arrays are not.
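
The mutability difference can be seen in a minimal sketch. The variable names
below are illustrative only, and the behavior shown is that of current NumPy
releases: in-place assignment succeeds on a zero-rank array but raises on a
scalar.

.. code:: python

    import numpy as np

    zero_rank = np.asarray(1.0)  # zero-rank array, created explicitly
    scalar = np.float64(1.0)     # NumPy scalar

    zero_rank[()] = 2.0          # in-place assignment is supported

    try:
        scalar[()] = 2.0         # scalars do not support item assignment
        scalar_assignment_failed = False
    except TypeError:
        scalar_assignment_failed = True
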

Review comment: Maybe not surprising, I don't like the Array API focus, I would much prefer to only mention it as an additional argument.

Review comment: What are "all of the arguments"? It sounds like we have two completely different perspectives about why 0d arrays should be returned instead of scalars. I think we could include both in the NEP so that we have convincing arguments both with and without Array API, but I am less familiar with why scalars are "bad... at numpy".

Review comment: Do what you prefer. I would expect all of your problems while doing Array API work to not actually be related to array api adoption itself? So, I would prefer to start with the issues below, rather than starting with "violating Array API". The 0-D array->scalar conversion creates difficulties when writing functions that work with N-D inputs that may have 0-D intermediates. I.e. we have to add that [...]

Review comment: OK, I'll need to research this "object array problem" more. I haven't run into it, personally. I assume this is "It changes the dtype of 0d object arrays containing array-likes" from numpy#13105.

Probably the fundamental reasons are the same. I've just seen more frequent reminders of the inconsistency.

These are just a few examples off the top of my head. These particular things may not be relevant to everyone in the future, so I wouldn't mention them in the NEP. The point is that the standard prompts code rewrites and reminds us to think about details that might have been ignored before. This inevitably exacerbates the old scalar/0d array problems. One might say that by the time this NEP would take effect, all array API rewrites will be done, so it won't matter any more. I think the same argument can be made about any bug or undesirable API, though - once you've worked around the problem, it stops bothering you... for a while, at least, until the next time you encounter the same problem. I say we take the cue and fix it once and for all.

Review comment: Object 0-D is the same as your mutability issue in the end. It leads to some code in NumPy wrapping

arr0d = np.array(None)
arr0d[()] = np.arange(10)

But of course typically, we just want to avoid the [...] For me the issues around this are the main problem I want to solve.

Yeah, although it seems unrelated? Scalars and 0-D always behaved the same in promotion, only Python scalars changed behavior.

This is something where I am honestly not sure what you actually want ☹. If you don't like the [...] But if NumPy preserves scalars in -> scalars out, that seems not really easier on you, unless you are considering something where [...] So in that case the question is what additional places you want an array return that I bracketed out in the minimal version. My view has always been that the feasible change to me is the one where we don't implicitly convert 0-D arrays to scalars, but [...]

These type of things never go away, and they do have real, if rare, user impact. But of course the question is how much library author pain is OK if the alternative is for some user code to silently start doing the wrong thing.

Review comment: Yes. We should not have to fight NumPy's behavior to make this happen. We should be able to run the code and not have to either [...]

Review comment: Yeah, but these are two distinct annoyances: [...]

So this NEP solves the first issue fully. The second one is still the downstream libraries problem at least in the sense that you must change your code to make it happen. Now, transitioning SciPy depending on the NumPy version having implemented this NEP makes sense to me as it makes the transition clearer and cleaner. (And of course the reason why NumPy doesn't change, is that [...]

Review comment: I think we're on the same page. I understand that with this change, I will generally no longer need to use [...] I understand that removing [...] So if all that makes sense, I think we're on the same page w.r.t. 2, although I'm not certain that means we will diverge from NumPy by only returning arrays. There are a few possibilities.

Review comment: It would also help a lot for static typing. Because, currently, the simplest numpy-esque function is pretty difficult to properly type:

@overload
def noop[...](a: ndarray[Shape0D, dtype[ScalarT]]) -> ScalarT: ...
@overload
def noop[ShapeT: AtLeast1D, ...](a: ndarray[ShapeT, DTypeT]) -> ndarray[ShapeT, DTypeT]: ...
@overload
def noop(a: ArrayLike) -> TheShapeIsNotKnownSoEitherArrayOrScalar: ...

If it always returned an array, this could be simplified a lot. It would also help the users, because unknown "array-likes" will return "array of unknown dtype" instead of "array of unknown dtype or some unknown scalar". The latter often requires users to manually [...] tldr; This NEP would help a lot for static typing.

It may be argued that if instances of array base class ``np.ndarray`` and
scalar base class ``np.generic`` were fully interoperable, together, they
would implement a protocol compatible with the array API standard. Even if this
were the case, this design is complex and leads to confusion and errors due to
self-inconsistency (zero-rank array-like scalars are immutable, but arrays of
other rank are mutable) and inconsistency with all other array API compatible
libraries. In particular, it leads to difficulties in working with vectorized
reducing functions, which begin with arrays of rank :math:`N` and return
objects of rank :math:`M < N`: when :math:`M = 0`, the rules change. This
prompts an unfortunate pattern of calling ``asarray`` on the results of
intermediate array operations to ensure that operations like boolean mask
assignment still work. The inconsistency also presents downstream library
authors with an unfortunate choice: should they maintain consistency with
NumPy and prefer to return scalars when possible (e.g. ``scipy.stats``, which
explicitly uses empty-tuple indexing on all results to *ensure* consistency,
and ``scipy.special``, which relies on NumPy ufunc machinery), or should they
follow the lead of the array API standard and prefer zero-rank arrays (e.g.
``scipy.interpolate``)?
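
As a sketch of this pattern, consider a hypothetical vectorized reducing
function (``clipped_sum`` is an illustrative name, not an existing NumPy or
SciPy function). With current NumPy it needs ``asarray`` on the intermediate
result so that boolean mask assignment works when the input is 1-D, and
empty-tuple indexing on the output if it wants to mimic NumPy's scalar return.

.. code:: python

    import numpy as np

    def clipped_sum(x, axis=-1):
        """Sum along `axis`, replacing negative sums with NaN."""
        total = np.sum(x, axis=axis)
        # Needed with current NumPy: `total` is a scalar when `x` is 1-D,
        # and scalars do not support boolean mask assignment.
        total = np.asarray(total)
        total[total < 0] = np.nan
        # Mimic NumPy: return a scalar when the result is zero-rank.
        return total[()]

    row_sums = clipped_sum(np.array([[1.0, 2.0], [-3.0, 1.0]]))
    single_sum = clipped_sum(np.array([-3.0, 1.0]))

Both the ``asarray`` call and the final ``[()]`` exist only to manage the
special case of a zero-rank result.
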

Usage and impact
----------------

Currently, most operations in NumPy involving zero-rank arrays return scalars.
Likewise, reducing operations and indexing operations that would naturally
result in a zero-rank array actually produce a scalar. The proposal is for
these operations to return zero-rank arrays instead of scalars.

Review comment: This is confusing, the below examples are almost right, but speaking too generally about reductions isn't?

You could try more for either points (I don't really believe in it from a BC stand-point, because those are the two cases where it makes a lot of sense to have code that continues relying on scalars).

Review comment: Must have missed that. I saw: [...]

So I suppose there is a second exception, and the easiest way to summarize it is that indexing behavior is unchanged.

Review comment: 🤷 fair, I wanted to refer to ufuncs/functions there. It does continue with: [...]

In the context of doing the bigger change with [...] If you feel strong about changing indexing, we could still try it out. As you can tell, I am focused on the minimal useful change.

Review comment: I see. And I was hoping to go a little further and follow the array API more closely. We're coming from different perspectives. I think it's fine now that we understand that.

Review comment: As for whether we should try it out - yeah, I think there's value in seeing how much additional trouble it causes. I think the best decision would come from understanding the tradeoff between switching behaviors and downstream pain. Right now, we only know that [...]

Review comment: Yeah, TBH, I just don't expect [...]

.. code:: python

    import numpy as np
    x = np.asarray(1.)
    np.isscalar(x + x)  # True (main), False (NEP)
    np.isscalar(np.exp(x))  # True (main), False (NEP)
    y = np.ones(10)
    np.isscalar(np.sum(y, axis=-1))  # True (main), False (NEP)
    np.isscalar(y[0])  # True (main), False (NEP)

Review comment: The last one doesn't change. In fact, indexing doesn't change at all.

For exceptions to these rules, ask Sebastian.

Review comment: One exception is x = np.asarray([1, 2, 3])
x[0] produces a scalar, too?

Review comment: It makes it much easier because it avoids swaths of BC issues since [...] You can call it a half-measure if you like, but I don't see it as such at all. The point is [...] To home in on what, IMO, is the core of all of this: Make scalar returns predictable and avoidable when working with N-D arrays.

Review comment: I don't see how to sell that, though. Individual NumPy operations are pretty predictable right now - operations other than array creation functions will always return a scalar instead of a 0d array. There may be exceptions to that rule, but doesn't that sum it up, or is there a ton of inconsistencies that this PR resolves? (Can you provide examples of surprises we get right now that are not summed up by the rule above?)

Review comment: You are right, the problem isn't really them being unpredictable, the problem is that there is no good way to avoid the 0-D array -> scalar conversion.

Review comment: OK, so do I understand that somewhere in the code, the result of ufuncs and reducing operations is a 0d array, and then it gets converted (unavoidably) to a scalar. You are reversing that choice except for reducing operations when [...] When we fully index an array (e.g. one integer for a 1d array, 2 integers for a 2d array), is the result naturally a 0-d array or a scalar? Is it like ufuncs in that the result is naturally 0d and it gets converted to a scalar, or is it the other way around - the result is naturally a scalar and if we wanted it to return a 0d array, it would not be so different from just calling [...]

Empty-tuple indexing may still be used to cast any resulting zero-rank array
to the corresponding scalar.

.. code:: python

    import numpy as np
    x = np.asarray(1.)
    np.isscalar(x + x)  # True (main), False (NEP)
    np.isscalar((x + x)[()])  # True (main), True (NEP)

Review comment: Please replace all [...]

Review comment: Well, I can't replace [...]

Review comment: Because it has a very narrow meaning of "scalar" that is just not quite right in this context (honestly it is never quite right, but does a decent enough job for numeric types). So yes, of course I mean [...]

The main impact to users is more predictable results due to improved
consistency within NumPy, between NumPy and the array API standard, and
between NumPy and other array libraries. Working with the results of reducing
functions, in particular, will be easier because return values of any rank
will support boolean indexing assignment.

There is a secondary impact on performance. On typical hardware, the execution
time of conversion from zero-rank arrays to scalars and of elementary
arithmetic operations involving only scalars is on the order of tens of
nanoseconds, whereas operations involving only zero-rank arrays take on the
order of hundreds of nanoseconds. Consequently, some elementary arithmetic
calculations will be slower. On the other hand, conversion from scalars to
zero-rank arrays takes a few hundred nanoseconds, and many operations, such as
ufuncs and operations involving both scalars and zero-rank arrays, require
conversion from scalars to zero-rank arrays. These operations will be faster.
We will not speculate as to whether this will have a net positive or net
negative impact on user applications, but the net impact is expected to be
small since impact on downstream library test suites has been minimal in
testing.
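
These orders of magnitude can be checked locally with a rough micro-benchmark.
The sketch below is illustrative only: it uses the standard-library ``timeit``
module, the variable names are assumptions, the measured times include the
overhead of calling a ``lambda``, and absolute numbers depend on hardware and
NumPy version.

.. code:: python

    import timeit

    import numpy as np

    scalar = np.float64(1.0)
    zero_rank = np.asarray(1.0)
    n = 100_000

    # Average seconds per call for each elementary operation.
    timings = {
        "scalar + scalar": timeit.timeit(lambda: scalar + scalar, number=n) / n,
        "0-D array + 0-D array": timeit.timeit(lambda: zero_rank + zero_rank, number=n) / n,
        "scalar -> 0-D array": timeit.timeit(lambda: np.asarray(scalar), number=n) / n,
        "0-D array -> scalar": timeit.timeit(lambda: zero_rank[()], number=n) / n,
    }

    for label, seconds in timings.items():
        print(f"{label}: {seconds * 1e9:.0f} ns")
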

Backward compatibility
----------------------

The motivation of this proposal is to eliminate the surprises associated with
a scalar result when a zero-rank array would be expected. However, existing
code may rely on the current behavior, and this presents backward compatibility
concerns.

The main concern for user code is that the mutable zero-rank arrays that
replace immutable scalars are no longer hashable. For instance, they cannot
be used directly as keys of dictionaries, as the argument of an
``lru_cache``-decorated function, etc. In all circumstances, the fix is simple:
convert the zero-rank array to a scalar with empty-tuple indexing.
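
A minimal sketch of the failure and the fix follows. Here ``describe`` is a
hypothetical cached function, and ``np.asarray(3.0)`` stands in for a
zero-rank array that an operation would return under this NEP.

.. code:: python

    import functools

    import numpy as np

    @functools.lru_cache(maxsize=None)
    def describe(value):
        return f"value={float(value)}"

    total = np.asarray(3.0)  # stand-in for a zero-rank array result

    try:
        describe(total)      # lru_cache must hash its arguments
        raised = False
    except TypeError:
        raised = True

    label = describe(total[()])     # empty-tuple indexing yields a hashable scalar
    lookup = {total[()]: "a duck"}  # the same fix works for dictionary keys
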

Review comment: I think this is OK for the hashability but needs to be expanded on the mutability. While less common, it does need to be pointed out explicitly because it can lead to incorrect results. (Unlike hashability, which should mostly lead to a relatively clear error.)

Review comment: I thought about this, but I don't really understand how existing code could run into this problem. If you try to mutate a scalar now, it raises an error. Presumably this is not what was intended, so this will not be part of a typical code path. Do you mean that people might be relying on mutating a scalar to raise an error, but with the change it would work as expected, and that would be undesirable? I was assuming this would hold true: [...]

Review comment: It doesn't raise for in-place operators. I actually have problems thinking of an example (which is good!). But a silly one: [...]

The point is, having a 0-D array where code is clearly written for scalars (using in-place operators) is bad news. I could imagine it also for a simplistic [...] Is that common? Hopefully not, and to me if [...]

Review comment: Ah right. I realized that at the conference and recently forgot.
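
The in-place operator concern discussed in this thread can be illustrated with
a small sketch. This is a hypothetical example, not the one from the original
comment, and ``running_total`` is an illustrative name. The same function
leaves a scalar argument untouched but silently mutates a zero-rank array
argument.

.. code:: python

    import numpy as np

    def running_total(start, increments):
        total = start
        for inc in increments:
            total += inc  # rebinds for scalars, mutates in place for arrays
        return total

    scalar_start = np.float64(0.0)
    array_start = np.asarray(0.0)

    scalar_result = running_total(scalar_start, [1.0, 2.0])
    array_result = running_total(array_start, [1.0, 2.0])
    # scalar_start is unchanged; array_start now holds the accumulated total.
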

Running the test suites of dependent libraries against a branch of NumPy that
implements these changes has revealed a few other issues. <Library maintainers,
please summarize these issues here.>

Review comment: One worry I still have -- not sure how big it is -- is the fact that it isn't necessarily trivial for downstream libraries to follow NumPy behavior.

Review comment: I don't see how it is impossible. Even a decorator could be written to detect input type (scalar or array) and ensure that the output types are consistent. It might require an explicit conversion, but aside from the tens to hundreds of nanoseconds, a user would not notice the difference.

They might. This is something SciPy will certainly be discussing for SciPy 2.0.

Review comment: Yes, but we will need public API from NumPy to do this correctly, and that machinery will again not fit well into your Array API code.
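
The decorator idea mentioned in this thread could look roughly like the sketch
below. It is only an illustration under simple assumptions: ``match_input_kind``
and ``double`` are hypothetical names, only the first positional argument is
inspected, and anything that is not an ``np.ndarray`` is treated as a scalar
input. It does not address the concern about needing public NumPy API.

.. code:: python

    import functools

    import numpy as np

    def match_input_kind(func):
        """Return an array for array input and a scalar for scalar input."""
        @functools.wraps(func)
        def wrapper(x, *args, **kwargs):
            input_is_array = isinstance(x, np.ndarray)
            result = np.asarray(func(np.asarray(x), *args, **kwargs))
            if input_is_array:
                return result  # array in -> array out, even when zero-rank
            return result[()]  # scalar in -> scalar out

        return wrapper

    @match_input_kind
    def double(x):
        return x * 2

    from_scalar = double(np.float64(2.0))
    from_zero_rank = double(np.asarray(2.0))
    from_vector = double(np.asarray([1.0, 2.0]))
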

Detailed description
--------------------

The new functionality will be used in much the same way as old functionality,

Review comment: s/same was/same way/

except that sometimes authors will need to convert zero-rank arrays to scalars
rather than converting scalars to zero-rank arrays. For instance, consider the
following:

.. code:: python

    import numpy as np
    rng = np.random.default_rng(85878653462722874072976519960992129768)
    x = rng.standard_normal(size=10)
    y = np.sum(x, axis=-1)
    z = {y: 'a duck'}  # use scalar result as dictionary key
    y = np.asarray(y)  # convert to array to allow mutation
    y[y < 0] = np.nan

The use of ``y`` as a dictionary key would need to become ``y[()]``, but ``y``

Review comment: The [...]

Review comment: Oops; I was messing with the code and forgot to update the text. Thanks!

no longer needs to be explicitly converted to an array.

.. code:: python

    z = {y[()]: 'a duck'}  # extract scalar to use as dictionary key
    y[y < 0] = np.nan  # no conversion to array required

Realistic examples can be found throughout the codebases of dependent
libraries.

Related work
------------

All known libraries that attempt to implement the array API standard
(e.g. ``cupy``, ``torch``, ``jax.numpy``, ``dask.array``) return
zero-rank arrays as specified by the standard.

Implementation
--------------

To implement the NEP, the branch prepared by Sebastian needs to be merged,

Review comment: Should I link https://github.com/seberg/numpy/tree/try-0d-preservation-rebase? (Or will there be a different branch, do you think?)

and dependent libraries will need to prepare releases that adapt to (and take
advantage of) the changes. Branches of libraries including SciPy and Matplotlib

Review comment: @j-bowhay @pfackeldey @ksunden can you provide links to your branches?

Review comment: This is the link to Awkward Array with the NumPy branch provided by @seberg: https://github.com/scikit-hep/awkward/tree/pfackeldey/test_NEP57. There are only 2 failures and they are related to pandas. For Awkward Array the migration seems to be simple.

Review comment: Just catching up on notifications now, happy to provide a branch for SciPy and read over this but will probably need a week or two to do so.

Review comment: What's the status here? I am still happy to work on a branch but don't want to invest the time if things are blocked elsewhere.

Review comment: I have worked on it here: numpy#29067 but there are still some issues, mostly around polynomials I think, where the NumPy python code needs to properly deal with the changes (which is a bit tedious).

Review comment: It's ok to wait, @j-bowhay. We'll ping you when things are closer.

have already been prepared without much difficulty. Initially, users will have
the option of opting into this behavior using an environment variable, so these
releases need to be compatible with both old and new behaviors. To make the new
behavior the default and only behavior, NumPy will need to advise users of the
pending change and give the appropriate notice. The initial draft of this
document does not specify the appropriate timeline and procedures; this line
will be updated to reflect the consensus of the maintainer team.

Comment on lines +177 to +179

Review comment: @seberg should I propose something? What would you suggest?

Review comment: Propose something. I would say: [...]

Or in other words, I would just not say much about the precise path of achieving step 2. That makes sense IMO, because without 1 it is even harder to judge how big of a change it is (not that 1 will be a silver bullet). How exactly you want to write it, I don't care. You could even add a [...]

Alternatives
------------

There are two main alternatives to this proposal.

The alternative suggested by :ref:`NEP56` is to maintain the current behavior
and make NumPy scalars more fully duck-type zero-rank arrays, such as adding
missing behaviors.

While this work would still be valuable, we propose the behavior change to more
fully comply with the standard and to resolve the problems mentioned in the

Review comment: I think this somewhat leaves out a huge part of it which is the non-array API part? The whole "just add methods" works very well for array-api numeric types, but badly for general scalars like object/string dtypes.

Review comment: Could you be more specific about what the "non-array API part" is? From context, I think you are referring to an important argument in favor of the change, but I don't know what argument that is.

Review comment: Could you reply to this part @seberg?

Review comment: I think our discussion above covered this, but I'll try to see if there is more (it may make sense to look at NumPy bugs around this, but SciPy fixes seem actually more interesting for the NEP).

Motivation.

More extreme alternatives are also available, such as eliminating NumPy
scalars entirely. We do not take this approach for two reasons:

1. Scalars still have some advantages as hashable, dtyped objects that support
   very fast elementary arithmetic.
2. The backward compatibility concerns with eliminating scalars entirely are
   much more severe.

A variant of this proposal is to eliminate the exceptional behavior associated
with reducing operations and ``axis=None``. I cannot present any justification
for why we shouldn't do this; I would certainly prefer it because it would be
even more consistent and fully compliant with the standard. Ask Sebastian.

Comment on lines +202 to +205

Review comment: Please replace the sentences after the first for me : )

Review comment: Minimal change, minimal change, minimal change. And maybe also why not? If you change [...] I keep repeating myself, but the point is making the return predictable for N-D arrays, changing [...] Now, "I don't like scalars" is a valid opinion, but I think we established well enough that we won't achieve that easily, so I don't see why we should try to make inroads towards it that we don't need to fix the actual core issue (don't get random scalars when working with N-D arrays). EDIT: To be clear, I am not in love with this rule, also because it is slightly different from the rule for indexing (where adding [...] And yes, this is an opinion that you can reasonably disagree on and the desire for minimal change is at the core of it.

Discussion
----------

This section may just be a bullet list including links to any discussions
regarding the NEP:

- https://github.com/numpy/numpy/issues/24897
- https://github.com/scientific-python/summit-2025/issues/38
- https://github.com/scipy/scipy/pull/22947#discussion_r2080108060

References and footnotes
------------------------

.. [1] `Python array API standard 2024.12 — Array Object`_
.. [2] Each NEP must either be explicitly labeled as placed in the public domain (see
   this NEP as an example) or licensed under the `Open Publication License`_.

.. _Open Publication License: https://www.opencontent.org/openpub/
.. _Python array API standard 2024.12 — Array Object: https://data-apis.org/array-api/latest/API_specification/array_object.html

Copyright
---------

This document has been placed in the public domain. [2]_

Review comment: LMK if you want your name added.