Skip to content

MAINT: Unify casting error creation (outside the iterator)#16133

Merged
mattip merged 2 commits into
numpy:masterfrom
seberg:maint-cast-error
May 13, 2020
Merged

MAINT: Unify casting error creation (outside the iterator)#16133
mattip merged 2 commits into
numpy:masterfrom
seberg:maint-cast-error

Conversation

@seberg

@seberg seberg commented May 1, 2020

Copy link
Copy Markdown
Member

This unifies the casting error machinery a bit (deleting some annoying duplication). In rare cases the error message is fixed (scalar -> array, in copyto). In other cases it will now print array data instead of just array (arr.astype). Overall, it should be a simple code duplication-reduction though.

@eric-wieser eric-wieser May 2, 2020

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python 3 has a much shorter way to write all this, with %R:

Suggested change
errmsg = PyUString_FromString("Cannot cast scalar from ");
PyErr_Format(PyExc_TypeError, "Cannot cast scalar from %R to %R according to the rule %s", ...);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, didn't really try to fix up that code, had just moved it.

@seberg seberg force-pushed the maint-cast-error branch 2 times, most recently from cbc582a to 63f54de Compare May 2, 2020 13:17
@seberg seberg force-pushed the maint-cast-error branch from 63f54de to ffe76ac Compare May 2, 2020 13:27
* PyArray_CanCastArrayTo result).
*/
NPY_NO_EXPORT void
npy_set_invalid_cast_error(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this function is even worth it at this point, given it would be a one-liner at each call site.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, have to admit, I am not sure. On the other hand, that way at least the error message is not spelled out in 5 places.

@eric-wieser eric-wieser May 2, 2020

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed how many times this was used, that seems fair. Maybe replace scalar with char const *what, which makes it possible for the caller to pass more info to put in the message, which will also make the implementation shorter.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be fair, 2 of those would be the concatenate fixup...

@seberg seberg May 2, 2020

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made that simplifictaion for the iterator code (which is arguably very similar), and adds another 1.5 occurances, which then are not covered here (so we have 2 "places" in the code).

The only real reason for this being useful is probably the scalar distinction, which is correct, but a bit wonky admittedly.

EDIT: Trivial duplication is at least super easy to grep for after all...

@seberg seberg force-pushed the maint-cast-error branch from a0895b3 to 97c5604 Compare May 2, 2020 18:46
Comment thread numpy/core/src/multiarray/ctors.c Outdated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this now describe a 0d array as a scalar?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but maybe no... that is what PyArray_CanCastArrayTo does. Its a bit annoying, it can be incorrect from the user-input perspective, but more correct due to value based casting of 0-D arrays. I.e. the 0-D array uses "scalar casting" rules right now, but the original input typically could have been either a scalar or an array.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the code path that hits this error?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much none... The only paths that should hit this, are arguably nonsense:

np.take([2], [1], out=np.array([1], dtype=np.uint64))

but even that example cannot get a "scalar" case.

from numpy.core import _rational_tests as rt                                                                   
np.array(np.array(rt.rational(2)), dtype="M8")  

should hit it IMO, but we currently consider all unsafe casts as possible, and in this case error when we notice that there is no way defined to do the actual casts. That is: In almost all cases we use force-casting, so it just doesn't matter.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably np.array(rt.rational(2), dtype="M8") is a case when the scalar message is justified?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind adding one of these as a test?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add the test, but it is basically adding it for the future (or excersizing a different error). Because "unsafe" casts are currently always considered possible, it errors later with:

ValueError: No cast function available

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test for a value error, I am not sure its useful here to be honest. But likely enough we never had a test for that in any case...

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coverage is always useful!

Comment on lines 1021 to 1032

@eric-wieser eric-wieser May 3, 2020

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, my comment above from mobile (where multi-line comments are not possible) was questioning whether this would be better as:

Suggested change
NPY_CASTING casting, npy_bool scalar)
{
char *msg;
if (!scalar) {
msg = "Cannot cast array data from %R to %R according to the rule %s";
}
else {
msg = "Cannot cast scalar from %R to %R according to the rule %s";
}
PyErr_Format(PyExc_TypeError,
msg, src_dtype, dst_dtype, npy_casting_to_string(casting));
NPY_CASTING casting, char const *what)
{
PyErr_Format(PyExc_TypeError,
"Cannot cast %s from %R to %R according to the rule %s,
what, src_dtype, dst_dtype, npy_casting_to_string(casting));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this is going to be limited to different msg's for scalar vs non scalar, i prefer the current way, where the msg is decided by npy_set_invalid_cast_error. If there are more else if switches added here though, maybe what as the argument would be better.

@eric-wieser eric-wieser left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically look fine, just a few nits. Nice to remove the duplication either way.

@mattip

mattip commented May 6, 2020

Copy link
Copy Markdown
Member

@seberg I think this is ready except for a few small nits. Should we put it in as-is?

@charris

charris commented May 12, 2020

Copy link
Copy Markdown
Member

@seberg ping.

@seberg

seberg commented May 12, 2020

Copy link
Copy Markdown
Member Author

Well, I don't really think there is anything to do, unless you dislike calling a 0D array "scalar", because in theory that is the casting rule we use (and typically we do not know if it used to be a scalar). I honestly do not care either way, and yes, I admit in most of the cases it is either clear cut, or just does not matter, so I can change it.

@seberg seberg force-pushed the maint-cast-error branch from 97c5604 to 5aa0301 Compare May 12, 2020 20:44
@seberg seberg force-pushed the maint-cast-error branch from 5aa0301 to 9cc3af7 Compare May 12, 2020 20:46
@mattip

mattip commented May 13, 2020

Copy link
Copy Markdown
Member

LGTM - less code and more tests.

@mattip mattip merged commit 0a222b8 into numpy:master May 13, 2020
@mattip

mattip commented May 13, 2020

Copy link
Copy Markdown
Member

Thanks @seberg

@seberg seberg deleted the maint-cast-error branch May 13, 2020 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants