Edit report at https://pear.php.net/bugs/bug.php?id=20425&edit=1

  ID: 20425
  Updated by: tklingenberg@lastflood.net
  Reported By: jan dot prachar@gmail.com
  Summary: Incomplete percent-encoding of userinfo, path and
  Status: Open
  Type: Bug
  Package: Net_URL2
  Package Version: 2.0.9
  PHP Version: Irrelevant
  Roadmap Versions:
  New Comment:

IIRC, that special handling was added to align the handling of incorrect
input with how browsers treat URIs. Strictly speaking, Net_URL2 expects
those parts to be correctly encoded already. However, the special
handling should make it more robust, so that Net_URL2 can accept URIs
that browsers accept as well, without running into double-encoding
problems.

The example URI you give:

     http://user[1]@example.com/p\s/|" ?{}#^

is turned, when entered into Chromium, into the following effective
request URI (the fragment is kept in the client):


This is similar to how Net_URL2 already does it:


The differences I see are in the square brackets, the slash correction
and the pipe symbol.

Angle brackets do not need to be converted, and converting the question
mark would result in data loss, because it acts as a separator.
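
The trade-off described above (encode stray characters, but leave
existing percent-escapes alone to avoid double-encoding) can be sketched
roughly as follows. This is only an illustration, not the code of
Net_URL2::_encodeData: the function name lenientEncodePath is
hypothetical, it assumes the path has already been split off from the
rest of the URI (so no "?" or "#" separators remain in it), and it does
not attempt the backslash-to-slash correction that Chromium performs.

```php
<?php
// Hypothetical helper, NOT part of Net_URL2: percent-encodes bytes that
// RFC 3986 does not allow in a path, while leaving existing "%XX"
// escapes untouched so that nothing is encoded twice.
// Assumption: $path is the path component only, already split off.
function lenientEncodePath(string $path): string
{
    // Match either an existing percent-escape (passed through) or a
    // single byte outside unreserved / sub-delims / ":" / "@" / "/".
    $pattern = '/%[0-9A-Fa-f]{2}|[^A-Za-z0-9\-._~!$&\'()*+,;=:@\/]/';

    return preg_replace_callback(
        $pattern,
        function (array $m): string {
            // A three-character match is an existing escape; keep it.
            return strlen($m[0]) === 3 ? $m[0] : rawurlencode($m[0]);
        },
        $path
    );
}

echo lenientEncodePath('/p\\s/|" {}^'), "\n"; // stray characters get encoded
echo lenientEncodePath('/a%20b c'), "\n";     // existing %20 is kept as-is
```

With this sketch, square brackets and the pipe symbol are encoded as
well, since neither is allowed unescaped in an RFC 3986 path. Whether
that matches what browsers do is exactly the open question here.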

There is a documentation problem, however: the docblock comment of
Net_URL2::_encodeData does not cover the userinfo part:

      * Encode characters that might have been forgotten to encode when
      * in an URL. Applied onto Path and Query.

As with any fuzzy logic, this method is a best guess. When I introduced
it, I did check it against browser behavior. Now, re-checking it and
seeing the differences to Chromium, I can't say why I didn't cover
square brackets, for example.

It's perhaps best to research browser behaviors again and list them,
including the results and the test URIs.

I might still have some notes about that on one computer or another. I
might be able to gather them later on.

Previous Comments:

[2014-10-09 02:23:40] pracj3am

When parsing a URI, invalid characters are percent-encoded in the
userinfo, path and query parts (method _encodeData). But according to
RFC 3986, more characters should be percent-encoded, such as
[ ] | ` { }. Concretely, this is the whole set:

The same characters should also be percent-encoded in the fragment part.

Test script:
echo (new Net_URL2('http://user[1]@example.com/p\s/|"

Expected result:

Actual result:


group: pear-bugs @
posted: Oct 9, '14 at 9:43a
active: Oct 9, '14 at 9:43a
