Edit report at https://pear.php.net/bugs/bug.php?id=20425&edit=1

  ID: 20425
  Comment by: jan.prachar@gmail.com
  Reported By: jan dot prachar@gmail.com
  Summary: Incomplete percent-encoding of userinfo, path and
  Status: Open
  Type: Bug
  Package: Net_URL2
  Package Version: 2.0.9
  PHP Version: Irrelevant
  Roadmap Versions:
  New Comment:

Do you need any help?

Previous Comments:

[2014-10-10 00:23:36] tkli

At least the documentation problem will be resolved in the next 2.0.10
release (just around the


[2014-10-09 14:24:28] tkli

Colons in the path perhaps shouldn't be translated, for interoperability.



[2014-10-09 14:14:32] tkli

That's good info.

I think we should draw up a matrix specifying which part (userinfo, host,
path, query, fragment) should deal with which characters.

E.g. the Firefox issue you refer to is about the query if I grasped it

We can then put it to a test and have it properly specified. This should
make clear what the intent is and how it was solved.
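
One way such a matrix could be written down is sketched below in Python. This is only an illustration, not Net_URL2 code: the character classes are taken from RFC 3986, the names ALLOWED, needs_encoding and matrix are made up for this sketch, and the host is left out because its rules (IP literals, registered names) do not reduce to a simple character set.

```python
import string

# RFC 3986 character classes.
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
SUB_DELIMS = set("!$&'()*+,;=")
PCHAR = UNRESERVED | SUB_DELIMS | set(":@")

# Characters each component may carry unencoded. "%" is skipped below,
# because it is only valid as the start of a %XX escape.
ALLOWED = {
    "userinfo": UNRESERVED | SUB_DELIMS | set(":"),
    "path": PCHAR | set("/"),
    "query": PCHAR | set("/?"),
    "fragment": PCHAR | set("/?"),
}


def needs_encoding(text, component):
    """Return the distinct characters of text that may not appear raw in
    the given component, in order of first appearance."""
    allowed = ALLOWED[component]
    seen = []
    for ch in text:
        if ch != "%" and ch not in allowed and ch not in seen:
            seen.append(ch)
    return "".join(seen)


def matrix(probe):
    """Map every component to the probe characters it would have to encode."""
    return {component: needs_encoding(probe, component) for component in ALLOWED}


if __name__ == "__main__":
    for component, chars in matrix(' "<>[]\\{}|`^?@:/').items():
        print(component.ljust(8), repr(chars))
```

Each row of the result could then become one test case, so the intended behavior per component is stated explicitly.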


[2014-10-09 13:38:51] pracj3am

I also experimented with different browsers. For example, the following URL
'http://example.com/ "<>[]\{}|`^? "<>[]\{}|`^'

Chromium turns into
GET /%20%22%3C%3E[]/%7B%7D%7C%60%5E?%20%22%3C%3E[]\{}|`^

while Firefox turns it into
GET /%20%22%3C%3E%5B%5D%5C%7B%7D|%60%5E?%20%22%3C%3E[]\{}|%60^

So in the path component Chromium encodes everything except square
brackets and the backslash (which is turned into a slash), while Firefox
encodes everything but |. In the query component both are quite permissive.
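
The comparison can be checked mechanically. The following Python sketch takes the two request targets quoted in this comment and reports which of the probe characters each browser left unencoded in the path and in the query. The names PROBE, CHROMIUM, FIREFOX and left_raw are made up for this illustration.

```python
# The special characters used in the test URL, in their original order.
PROBE = ' "<>[]\\{}|`^'

# Request targets as quoted above.
CHROMIUM = '/%20%22%3C%3E[]/%7B%7D%7C%60%5E?%20%22%3C%3E[]\\{}|`^'
FIREFOX = '/%20%22%3C%3E%5B%5D%5C%7B%7D|%60%5E?%20%22%3C%3E[]\\{}|%60^'


def left_raw(request_target):
    """Return the probe characters that still appear literally in the
    path and in the query of a request target."""
    path, _, query = request_target.partition('?')
    return {
        'path': ''.join(c for c in PROBE if c in path),
        'query': ''.join(c for c in PROBE if c in query),
    }


if __name__ == "__main__":
    print('Chromium', left_raw(CHROMIUM))
    print('Firefox ', left_raw(FIREFOX))
```

Note that the backslash does not show up as raw in the Chromium path because it was rewritten to a slash, not because it was percent-encoded.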

Notice that not encoding square brackets was reported as a bug in Firefox
and fixed recently, see

Anyway, I think you cannot do any harm if you encode all invalid


[2014-10-09 11:46:08] tkli

IIRC that special handling was added to align the handling of wrong input
with how browsers treat URIs. Strictly, Net_URL2 expects those parts to be
correctly encoded already. However, this should make it more robust, so
that Net_URL2 can accept URIs that are acceptable to browsers as well,
without running into double-encode problems:

The example URI you give:

     http://user[1]@example.com/p\s/|" ?{}#^

for example, when entered into Chromium, is turned into the following
effective request URI (the fragment is kept in the client):


This is similar to how Net_URL2 already does it:


The differences I see are with the square brackets, the slash correction
and the pipe symbol.

Angle brackets do not need to be converted, and converting the question
mark would result in data loss (it is a separator).
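
The point about the question mark can be illustrated with Python's standard urllib.parse. The example.com URLs below are made up for this illustration. Once the "?" is percent-encoded it no longer separates path and query, so the query is lost to the parser:

```python
from urllib.parse import urlsplit

# "?" left alone: it separates the path from the query.
raw = urlsplit("http://example.com/p?x=1")

# "?" percent-encoded: it becomes part of the path and the query is empty.
encoded = urlsplit("http://example.com/p%3Fx=1")

if __name__ == "__main__":
    print(repr(raw.path), repr(raw.query))
    print(repr(encoded.path), repr(encoded.query))
```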

There is a documentation problem, however, because the comment in the
docblock of Net_URL2::_encodeData does not cover the userinfo part:

      * Encode characters that might have been forgotten to encode when
      * in an URL. Applied onto Path and Query.

As with any fuzzy logic, this method is a best guess. When I introduced
it, I did check it against browser behavior. Now, re-checking it and
seeing the differences to Chromium, I can't say why I didn't cover square
brackets, for example.
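
For illustration, a best-guess encoder of this kind can be sketched in Python as follows. This is not the Net_URL2 implementation. The set of characters left alone here (RFC 3986 unreserved plus reserved characters) is an assumption made for the sketch, and the name encode_forgotten is made up. Existing %XX escapes are passed through untouched, which is what avoids double-encoding.

```python
import re

# Matches either an existing %XX escape (kept as-is) or a single character
# outside the assumed safe set: RFC 3986 unreserved and reserved characters.
_TOKEN = re.compile(
    r"%[0-9A-Fa-f]{2}|[^A-Za-z0-9._~:/?#@!$&'()*+,;=\[\]-]",
    re.DOTALL,
)


def encode_forgotten(text):
    """Percent-encode characters that were left unencoded, without
    touching escapes that are already present."""
    def fix(match):
        token = match.group(0)
        if len(token) == 3:
            # Already a %XX escape: encoding it again would double-encode.
            return token
        # Encode the character as UTF-8, one %XX triplet per byte.
        return "".join("%{:02X}".format(b) for b in token.encode("utf-8"))

    return _TOKEN.sub(fix, text)


if __name__ == "__main__":
    print(encode_forgotten('/p s/|"'))
    print(encode_forgotten('/a%20b c'))
```

A stray "%" that does not start a valid escape is itself encoded as %25 in this sketch. Whether that is the right choice for a given component is one of the things a per-component specification would have to settle.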

It's perhaps best to research browser behavior again and list the
findings, including the results and the test URIs.

I might still have some notes about that on one computer or another. I
might be able to gather them later on.


The remainder of the comments for this report are too long. To view
the rest of the comments, please view the bug report online at

Discussion Overview
group: pear-bugs @
posted: Nov 24, '14 at 9:14a
active: Nov 24, '14 at 9:14a

1 user in discussion

Jan Prachar: 1 post


