Description
First, the behaviour:
URL | host | path |
---|---|---|
http://*/path | | /*/path |
http://./path | . | /path |
http://=/path | | /=/path |
http://-/path | - | /path |
http://0/path | 0 | /path |
http://,/path | | /,/path |
http://@/path | | /path |
http://;/path | | ;/path |
http://[::1]/path | [::1] | /path |
I would expect the path to be `/path` in all of the above. From my point of view, random non-URL-syntax characters are being pushed into the path, and it's pretty surprising.
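For anyone who wants to check the table on their own Node version, here is a minimal sketch that runs each URL through the legacy `url.parse()`. The `samples` array and the `describeUrls` helper are names made up for this illustration, and the exact host/path values may differ between Node releases, so no expected output is listed.

```javascript
// Legacy parser from Node's built-in 'url' module (the subsystem this issue is about).
const url = require('url');

// The URLs from the table above.
const samples = [
  'http://*/path',
  'http://./path',
  'http://=/path',
  'http://-/path',
  'http://0/path',
  'http://,/path',
  'http://@/path',
  'http://;/path',
  'http://[::1]/path',
];

// Hypothetical helper: parse each URL and collect the reported host and path.
// Errors are captured instead of thrown, in case a Node version rejects an input.
function describeUrls(urls) {
  return urls.map((input) => {
    try {
      const parsed = url.parse(input);
      return { input, host: parsed.host, path: parsed.path };
    } catch (err) {
      return { input, error: err.message };
    }
  });
}

const rows = describeUrls(samples);
for (const row of rows) {
  console.log(JSON.stringify(row));
}
```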
There are some statements in the test code that make this appear to be deliberate, but they don't justify the behaviour.
While it is true that `*` is not a valid domain according to the host parsing rules quoted, neither is `-`, `0`, or `.`. I would expect `.` to be treated as `.`, returned in the host string.
I would not expect the URL parser to validate that domain names are well formed, though I would of course expect characters that are defined as part of the URL syntax not to be valid in a host.
I can implement my own URL parser that allows `*`, and I will for backwards compat, but I think this is a bit odd. `url` is generally very lax in its parsing: it gives you the syntactic bits, and you get to validate whether they are correct for your use case. This is the first time it's failed my expectations.
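To make the "lax, purely syntactic" expectation concrete, here is a rough sketch of that kind of split. It is not the `url` module's algorithm and not the parser mentioned above; `splitUrl` is a hypothetical name, and the sketch assumes a `scheme://authority/path` shape where the authority ends at the first `/`, `?`, or `#`.

```javascript
// Hypothetical lax splitter: cuts a URL at its syntax characters only and
// does not validate whether the host is a well-formed domain name.
function splitUrl(input) {
  // scheme "://" authority (up to the first "/", "?" or "#"), then the path.
  const match = /^([a-z][a-z0-9+.-]*):\/\/([^/?#]*)([^?#]*)/i.exec(input);
  if (!match) return null;
  const [, scheme, authority, path] = match;
  // Drop any userinfo ("user:pass@") that precedes the host.
  const at = authority.lastIndexOf('@');
  // Note: a port, if present, is left attached to the host in this sketch.
  const host = at === -1 ? authority : authority.slice(at + 1);
  return { scheme, host, path };
}

console.log(splitUrl('http://*/path'));
// { scheme: 'http', host: '*', path: '/path' }
```

With this approach every URL in the table yields `/path` as its path, and it is left to the caller to decide whether a host such as `*` or `;` is acceptable.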
- Version: 0.10+
- Platform: all
- Subsystem: url