Incorrect quoting path part of URI. #495

@tarhan

I think there is an error in how the target URL is constructed in the request object.
Lines 177-178 of aiohttp/client_reqrep.py (method update_path in ClientRequest) read:

self.path = urllib.parse.urlunsplit(
            ('', '', urllib.parse.quote(path, safe='/%:'), query, fragment))

This quotes the path part while treating the forward slash, percent sign, and colon as safe characters that need no quoting.
But according to RFC 3986 section 2.3, the unreserved characters never need escaping:

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"

Python's urllib.parse.quote does not treat the tilde as an unreserved character, so it has to be included in the safe keyword argument.
Also, according to section 2.2 of the same RFC, there are reserved characters that are used as delimiters or sub-delimiters between the sub-parts of a path. Since the request method receives either a full URL or a URL without a query string, you cannot tell whether a given delimiter is acting as a delimiter or is part of a path subcomponent.
So it must be the library user's responsibility to quote reserved characters in, or exclude them from, the path subcomponents.
Reserved characters (RFC 3986 section 2.2):

reserved    = gen-delims / sub-delims

gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

sub-delims  = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="

Again, by default the only reserved character that urllib.parse.quote leaves unquoted is the forward slash. Since so many reserved characters end up quoted, I think it is a mistake on your side to rely on urllib.parse.quote just to escape UTF-8 characters.
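
To make this concrete, here is a small standard-library check of which RFC 3986 reserved characters get percent-encoded under the safe='/%:' setting quoted above. The variable names are only illustrative and are not part of aiohttp.

```python
from urllib.parse import quote

# Reserved characters as listed in RFC 3986 section 2.2.
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="
RESERVED = GEN_DELIMS + SUB_DELIMS

# Same "safe" set as in the update_path lines quoted above.
SAFE = "/%:"

# Collect every reserved character that quote() turns into a %XX escape.
escaped = [c for c in RESERVED if quote(c, safe=SAFE) != c]

print("left alone:", [c for c in RESERVED if c not in escaped])
print("escaped:   ", escaped)
```

Only the colon and the forward slash survive; every other reserved character, including the comma, is escaped.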

Where is this behavior a problem? Akamai CDNs make heavy use of the comma as a delimiter within the path part of URLs, and there is no way to tell your library not to quote commas in the path.
Example Akamai URL:

http://o2-f.akamaihd.net/z/onthewings/20150903/20150903-onthewings-hd-,150000,300000,500000,800000,1000000,1300000,1500000,2500000,.mp4.csmil

Your library quotes the commas, and the Akamai CDN responds with a 404 for the quoted URL.
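
A minimal reproduction of the quoting step, using only the standard library and the same quote call as in the lines quoted above. No network request is made, and the variable names are illustrative.

```python
from urllib.parse import quote, urlsplit

url = ("http://o2-f.akamaihd.net/z/onthewings/20150903/"
       "20150903-onthewings-hd-,150000,300000,500000,800000,"
       "1000000,1300000,1500000,2500000,.mp4.csmil")

# Extract the path component, then quote it the way update_path does.
path = urlsplit(url).path
quoted = quote(path, safe='/%:')

# Every ',' in the original path comes out as '%2C'.
print(path)
print(quoted)
```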

My suggestions are to either include all reserved and unreserved characters in the safe keyword argument, or provide an additional keyword argument on the request method that disables quoting of the path part of the URI.
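
A rough sketch of what both suggestions could look like. The helper name quote_path and its skip_quoting parameter are hypothetical and are not existing aiohttp API; this only illustrates the idea under those assumptions.

```python
from urllib.parse import quote

# Reserved characters from RFC 3986 section 2.2.
GEN_DELIMS = ":/?#[]@"
SUB_DELIMS = "!$&'()*+,;="

# '%' keeps already-encoded sequences intact. '~' is listed explicitly
# because older Python versions quote it (3.7+ already treat it as unreserved).
PATH_SAFE = GEN_DELIMS + SUB_DELIMS + "%~"


def quote_path(path, skip_quoting=False):
    """Hypothetical helper: quote only what is neither reserved nor unreserved.

    With skip_quoting=True the path is returned untouched, leaving all
    escaping to the caller (the second suggestion).
    """
    if skip_quoting:
        return path
    return quote(path, safe=PATH_SAFE)


# Commas survive, while spaces and non-ASCII characters are still escaped.
print(quote_path("/z/hd-,150000,300000,.mp4.csmil"))
print(quote_path("/some dir/caf\u00e9"))
```

With such a safe set, quoting still takes care of spaces and UTF-8 characters, while delimiters that are meaningful to the server pass through unchanged.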
