
BIP77: Propose to encode + as %2B in pj param of Bitcoin URIs #1885


Open
wants to merge 1 commit into
base: master

Conversation

kumulynja

This PR proposes to percent-encode the + character used in the Payjoin mailbox URI (pj parameter) as %2B when included in a query parameter of a Bitcoin URI.

The reason is the same as the # fragment char that is encoded as %23 to avoid it being parsed as a fragment of the Bitcoin URI instead of the payjoin URI. The + character also has reserved meaning in query parameter contexts (Bitcoin URI), different from in path contexts (payjoin URI), and improper encoding can break parsing in standards-compliant URI parsers.

This change ensures:

  • Compatibility across wallets, libraries, and copy-paste behavior.
  • That the Payjoin endpoint and its session parameters are interpreted as intended.
  • Continued support for QR alphanumeric mode if %2B is used (%2b shouldn't be used as clarified in the change to the bip)
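The encoding proposed above can be sketched in a few lines of Python. This is only an illustration, assuming the standard library's `urllib.parse.quote` is an acceptable stand-in for whatever encoder a wallet uses; the `ADDRESS` placeholder is hypothetical, and the `pj` value is the example endpoint quoted later in this thread.

```python
from urllib.parse import quote

# Characters available in QR alphanumeric mode (note: uppercase only).
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

# Hypothetical placeholder, not a real address.
ADDRESS = "ADDRESS"

# Payjoin mailbox URL with its fragment parameters delimited by '+'.
pj = (
    "HTTPS://PAYJO.IN/TXJCGKTKXLUUZ"
    "#RK1Q0DJS3VVDXWQQTLQ8022QGXSX7ML9PHZ6EDSF6AKEWQG758JPS2EV"
    "+OH1QYPM59NK2LXXS4890SUAXXYT25Z2VAPHP0X7YEYCJXGWAG6UG9ZU6NQ"
    "+EX1WKV8CEC"
)

# Keep ':' and '/' literal; '#' becomes %23 and '+' becomes %2B.
# quote() emits uppercase hex digits, which matters for QR alphanumeric mode.
encoded_pj = quote(pj, safe=":/")
uri = f"bitcoin:{ADDRESS}?pj={encoded_pj}"

# The encoded value stays inside the QR alphanumeric set ...
fits_qr = set(encoded_pj) <= QR_ALPHANUMERIC
# ... whereas a lowercase %2b would not, since 'b' is outside the set.
lowercase_fits_qr = set(encoded_pj.replace("%2B", "%2b")) <= QR_ALPHANUMERIC
```

Only the `pj` value is checked against the alphanumeric set here; the `?` and `=` of the surrounding Bitcoin URI are outside that set regardless of this proposal.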

@kumulynja
Author

@DanGould @nothingmuch Happy to hear your thoughts and discuss.

Also sharing this stackoverflow from @ethicnology which shows the problem with the current unencoded +:
https://stackoverflow.com/questions/79684803/rfc-3986-bip21-how-to-handle-in-query-parameter-containing-a-nested-uri

@murchandamus added the "Proposed BIP modification" and "Pending acceptance" (this BIP modification requires sign-off by the champion of the BIP being modified) labels Jun 30, 2025
Contributor

@murchandamus murchandamus left a comment


I was looking over the commits separately, and it appears that all of these commits amend the same two sections. Please squash your commits into a single commit to reduce confusion.

@nothingmuch
Contributor

nothingmuch commented Jun 30, 2025

NACK, this change is unnecessary.

+ was selected because it's a sub-delimiter. A space was avoided to prevent any conflicts or ambiguities arising from the ad-hoc convention of interpreting + as a space.

It's safe (by design in the context of BIP77) to convert a space back to + before handing off to any mailbox URL parsing code or BIP 21 parsing code (though I don't see why this mangling would be an issue in that setting), or to simply split the fragment on a space instead of + if implementing the fragment parameter parsing independently while relying on a URI library that does this substitution.

The reason this is safe is that the spec avoids + and spaces anywhere else in the entire URL, including the path components.

The problem originates not with the spec but with the specific golang & dart URI libraries, which afaict automatically and destructively modify the pj param to substitute + with a space. As far as I know this substitution rule is not specified anywhere, certainly not in RFC 3986, and is ad hoc behavior from 90s era browsers. Although this is arguably a bug in those libraries, or at least a deviation from the spec, it is a well established convention, so accommodating it is desirable; I believe the spec already accommodates it sufficiently.
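The same substitution can be reproduced in Python, whose standard library offers both behaviors. This is a minimal sketch, not a statement about the golang or dart libraries themselves; the shortened `RK1A`, `OH1B`, `EX1C` parameters are made-up placeholders.

```python
from urllib.parse import parse_qs, unquote

# Query string of a BIP 21 URI with a literal '+' inside the pj value.
query = "pj=HTTPS://PAYJO.IN/X%23RK1A+OH1B+EX1C"

# Form-style parsing: '+' is replaced by a space before percent-decoding.
form_style = parse_qs(query)["pj"][0]

# Plain percent-decoding: '+' is left untouched.
_key, _sep, raw_value = query.partition("=")
percent_decoded = unquote(raw_value)

print(form_style)       # HTTPS://PAYJO.IN/X#RK1A OH1B EX1C
print(percent_decoded)  # HTTPS://PAYJO.IN/X#RK1A+OH1B+EX1C
```

In both cases the whole field is recovered; only the delimiter differs.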

Note that all of the fields are included in queryParameters['pj'] (from the stack overflow question), i.e. the entire field is parsed correctly; it is just then additionally transformed in this (in general destructive) way, which is safe to reverse given the knowledge that mailbox endpoint URLs will not otherwise contain a space:

    print(uri.queryParameters['pj']);
    // HTTPS://PAYJO.IN/Z55YEYZ3N0RFJ#RK1QVWFRUJ48GT052V0VRJF9RR7R8LXYSLWJEGK0JZZ5YYP8WJ87SKH6 OH1QYP87E2AVMDKXDTU6R25WCPQ5ZUF02XHNPA65JMD8ZA2W4YRQN6UUWG EX1Z3EKY6Q
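Reversing the transformation could look like the following Python sketch. The helper name `restore_plus` is hypothetical and not part of any library; it relies on the assumption stated above that a mailbox endpoint URL never contains a legitimate space.

```python
def restore_plus(pj: str) -> str:
    """Undo the '+' -> ' ' substitution applied by a form-style query parser.

    Only safe because the spec avoids '+' and spaces everywhere else in the URL.
    """
    return pj.replace(" ", "+")


# The mangled value printed in the example above.
mangled = (
    "HTTPS://PAYJO.IN/Z55YEYZ3N0RFJ"
    "#RK1QVWFRUJ48GT052V0VRJF9RR7R8LXYSLWJEGK0JZZ5YYP8WJ87SKH6"
    " OH1QYP87E2AVMDKXDTU6R25WCPQ5ZUF02XHNPA65JMD8ZA2W4YRQN6UUWG"
    " EX1Z3EKY6Q"
)

restored = restore_plus(mangled)

# Split off the fragment and recover the three '+'-delimited parameters.
_endpoint, _hash, fragment = restored.partition("#")
params = fragment.split("+")
```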

Also note that it's already valid to encode this as %2B instead of a literal +, and both should be supported by any RFC 3986 compliant implementation. RFC 3986 does not introduce any semantic distinction between these; both would be valid in a BIP 21 URL containing a pj parameter with a URL that contains +. Double encoding should also be valid, i.e. if for some reason the pj parameter URL itself is encoded with %2B, then the % would necessarily need to be escaped as well, so the BIP 21 URI would instead contain %252B. As far as obtaining a fragment string with the parameters fully intact and delimited by a + after unescaping, these should all result in the same parsed parameter values.
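To make the layering concrete, here is a small Python sketch under stated assumptions: the function name `fragment_params` and the shortened `RK1A`, `OH1B`, `EX1C` values are hypothetical, and decoding is done once for the BIP 21 layer and once for the fragment of the inner URL.

```python
from urllib.parse import unquote


def fragment_params(pj_value: str) -> list[str]:
    """Decode a raw pj value from a BIP 21 query and split its fragment."""
    pj_url = unquote(pj_value)                 # undo the BIP 21 layer of escaping
    _endpoint, _hash, fragment = pj_url.partition("#")
    return unquote(fragment).split("+")        # undo escaping inside the pj URL


literal = "HTTPS://PAYJO.IN/X%23RK1A+OH1B+EX1C"          # '+' left as is
single = "HTTPS://PAYJO.IN/X%23RK1A%2BOH1B%2BEX1C"       # '+' escaped once
double = "HTTPS://PAYJO.IN/X%23RK1A%252BOH1B%252BEX1C"   # inner %2B, then '%' escaped

results = [fragment_params(v) for v in (literal, single, double)]
```

All three inputs yield the same list of parameters, because percent-decoding (unlike form-style decoding) never touches a literal `+`.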

So based on all of these arguments, I think it would be better to not change the spec in this way. Requiring that all implementations always escape in this way would impose exactly the same kind of burden of having to work around generic URI libraries that don't implement escaping of + (i suspect most would not), necessitating the transformation of the encoded strings. This seems worse than only requiring implementations that rely on libraries that unconditionally do this mangling to work around that behavior, or switch to a library that more closely follows the spec.

An alternative might be to select a different character that is still in the QR alphanumeric set, perhaps : or -, but I don't think there's sufficient motivation to change that after software has already been released (and the motivation seemed insufficient when we considered this a few months ago).

In the particular example of bb mobile, is there anything that prevents using the payjoin crate's extension of the bitcoin_uri crate (the payjoin extras stuff) in parallel or instead of the generic URI library? If so we should discuss elsewhere (pdk discord or issues), but that seems like the easiest way to avoid any stringly typed workarounds.

@kumulynja
Author

kumulynja commented Jun 30, 2025

Thanks for the thorough explanation. And after going through the RFC3986 specs again, I think I do understand now that + is only specified to have special meaning in the scheme component. But, nevertheless, as you say, a lot of general/standard library Uri parsers in different programming languages wrongly interpret and replace it as a space, so from a practical point of view it would have been nice if those parsers (that are probably used by a lot of wallets) could still be used without any needed extra manipulations. I guess now a lot of users/wallets that do, will bump in an error because of this replacement with spaces, which the pdk sender can not interpret. But I understand the nack.

@nothingmuch
Contributor

I guess now a lot of users/wallets that do, will bump in an error because of this replacement with spaces, which the pdk sender can not interpret. But I understand the nack.

Yeah it's not ideal, and I mainly chose + over the equally plausible alternatives because it looked a bit familiar and seemed the least likely to get % encoded unnecessarily, but that's not a very compelling reason (for context, most of these changes over the previous approach were done to optimize QR encoding for better scanning UX)... We could still bikeshed this to e.g. : or -, and just keep backwards compatible stuff in the PDK implementation to handle +.

We had a similar question with regards to the ordering of these parameters not realizing they should probably have a defined order, and ended up just specifying the order that we implemented without much thought (which turned out to be reverse lexicographical, which is also a bit of a gotcha).
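For illustration only, and assuming "reverse lexicographical" means a plain descending string sort of the parameter strings, the order seen in the example URLs in this thread (RK1, then OH1, then EX1) falls out as follows:

```python
# Fragment parameters from an example URL in this thread, in arbitrary order.
params = [
    "EX1WKV8CEC",
    "RK1Q0DJS3VVDXWQQTLQ8022QGXSX7ML9PHZ6EDSF6AKEWQG758JPS2EV",
    "OH1QYPM59NK2LXXS4890SUAXXYT25Z2VAPHP0X7YEYCJXGWAG6UG9ZU6NQ",
]

# Descending string sort puts RK1... first and EX1... last.
ordered = sorted(params, reverse=True)
fragment = "+".join(ordered)
```

The gotcha is that a default ascending sort would produce the opposite order.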

Taken together maybe these two gotchas motivate a change even though the spec has been merged? The Core implementation is not yet at the stage of dealing with these things so the consequences should be minimal.

@DanGould, what do you think?

@DanGould
Contributor

DanGould commented Jul 1, 2025

If those production implementors are comfortable with another breaking change, especially since we're able to offer a transition period with PDK, I'm ok with making one. It sounds like both Bull and Cake are ok with this path forward.

I would want to make sure the parsers don't complain about : or - before making a commitment. Might you also consider just using the HRP delimiters in the fragment?

For the time being I would recommend they take you up on your suggestion so that at least the current implementations are not broken.

@kumulynja
Author

kumulynja commented Jul 1, 2025

Even though it's a breaking change, my first thought (depends on the implementation though) is that users aren't really affected as they don't process the pj parameter themselves, they just pass it on to PDK as extracted from the Bitcoin URI. So from a user's point-of-view, it doesn't really matter or shouldn't require any changes as long as parsing the Bitcoin URI doesn't break the pj parameter. Of course inconsistencies in versions will make it incompatible, but currently no two production implementations are on the same version anyways.

Both : and - should work, but maybe - is the safest, as this isn't a reserved char in any component, while : is still a reserved char in scheme, port and authority components?
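A quick way to check candidate delimiters against one form-style parser is sketched below in Python. It only exercises Python's `parse_qs`, so it says nothing definitive about the golang or dart libraries; the helper name `survives_query_parsing` and the shortened parameter values are hypothetical.

```python
from urllib.parse import parse_qs

# Characters available in QR alphanumeric mode.
QR_ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def survives_query_parsing(delim: str) -> bool:
    """True if a form-style query parser returns the delimiter unchanged."""
    raw = f"HTTPS://PAYJO.IN/X%23RK1A{delim}OH1B{delim}EX1C"
    expected = f"HTTPS://PAYJO.IN/X#RK1A{delim}OH1B{delim}EX1C"
    return parse_qs(f"pj={raw}")["pj"][0] == expected


candidates = ["+", ":", "-"]
in_qr_set = {d: d in QR_ALPHANUMERIC for d in candidates}
unmangled = {d: survives_query_parsing(d) for d in candidates}
```

With this parser all three candidates are QR alphanumeric, but only `:` and `-` come back unchanged.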

@kumulynja
Author

kumulynja commented Jul 1, 2025

Or, another way to go about it would be to keep the +, but have PDK process a pj URI string delimited by spaces as well as one delimited by +, so that both of the following are accepted as correct instead of throwing an error on the one with spaces:

HTTPS://PAYJO.IN/TXJCGKTKXLUUZ%23RK1Q0DJS3VVDXWQQTLQ8022QGXSX7ML9PHZ6EDSF6AKEWQG758JPS2EV+OH1QYPM59NK2LXXS4890SUAXXYT25Z2VAPHP0X7YEYCJXGWAG6UG9ZU6NQ+EX1WKV8CEC

HTTPS://PAYJO.IN/TXJCGKTKXLUUZ%23RK1Q0DJS3VVDXWQQTLQ8022QGXSX7ML9PHZ6EDSF6AKEWQG758JPS2EV OH1QYPM59NK2LXXS4890SUAXXYT25Z2VAPHP0X7YEYCJXGWAG6UG9ZU6NQ EX1WKV8CEC

Instead of every user having to take care of replacing spaces with + after parsing, PDK could maybe do this itself?

(But yeah, this feels hacky and maybe again error prone, so better to have only one specified way)
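As a rough sketch of what such lenient handling might look like (this is not PDK's API; `split_fragment_params` is a hypothetical helper written in Python purely for illustration):

```python
import re
from urllib.parse import unquote


def split_fragment_params(pj_value: str) -> list[str]:
    """Split the fragment of a pj value on either '+' or a space."""
    pj_url = unquote(pj_value)                      # turns %23 back into '#'
    _endpoint, _hash, fragment = pj_url.partition("#")
    return re.split(r"[ +]", fragment)


with_plus = (
    "HTTPS://PAYJO.IN/TXJCGKTKXLUUZ%23RK1Q0DJS3VVDXWQQTLQ8022QGXSX7ML9PHZ6EDSF6AKEWQG758JPS2EV"
    "+OH1QYPM59NK2LXXS4890SUAXXYT25Z2VAPHP0X7YEYCJXGWAG6UG9ZU6NQ+EX1WKV8CEC"
)
with_spaces = with_plus.replace("+", " ")

same = split_fragment_params(with_plus) == split_fragment_params(with_spaces)
```

Both inputs yield the same three parameters, at the cost of accepting two spellings of the same URI.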

Labels
Pending acceptance This BIP modification requires sign-off by the champion of the BIP being modified Proposed BIP modification

4 participants