Since ever people are confused by _GET and _POST superglobals,
because, despite their name, they do not (really) depend on the
request method. Therefore I propose to phase out $_GET and name it
$_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
100% confident with the latter yet, though).

Further, I propose to remove the POST method restriction for handling
request bodies and solely rely on the content type to trigger the
parser(s). (*)

There are already parsers for application/x-www-form-urlencoded and
multipart/form-data in the core. One could think of providing an API
to add content type handlers from extensions, ext/json may be an
example, like it is hacked into pecl_http-v2.

Thoughts, objections, insults?

(*) We'd probably have to revisit all *post* INI variables, though.

--
Regards,
Mike

  • Alexey Zakhlestin at Oct 2, 2013 at 7:17 am

    On 02.10.2013, at 10:59, Michael Wallner wrote:

    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).

    Further, I propose to remove the POST method restriction for handling
    request bodies and solely rely on the content type to trigger the
    parser(s). (*)

    There are already parsers for application/x-www-form-urlencoded and
    multipart/form-data in the core. One could think of providing an API
    to add content type handlers from extensions, ext/json may be an
    example, like it is hacked into pecl_http-v2.

    Thoughts, objections, insults?

    (*) We'd probably have to revisit all *post* INI variables, though.
    So, that is not one, but three proposals:

    1. _GET -> _QUERY, _POST -> _FORM

       I don't think this is really necessary. Names are there historically and changing them will break a lot of stuff.
       +0 on aliasing, and soft-deprecation (via documentation) though

    2. ignore request-method, trigger body-processing solely on Content-type

       +1. makes sense

    3. expose body-parsers via php-level API

       +1. Hell, yes! Something like +1000, actually ;)
  • Michael Wallner at Oct 2, 2013 at 7:24 am

    On 2 October 2013 09:17, Alexey Zakhlestin wrote:

    3. expose body-parsers via php-level API

    +1. Hell, yes! Something like +1000, actually ;)
    Uhmmm... I actually meant an internal API, not userland :)


    --
    Regards,
    Mike
  • Alexey Zakhlestin at Oct 2, 2013 at 8:01 am

    On 02.10.2013, at 11:24, Michael Wallner wrote:
    On 2 October 2013 09:17, Alexey Zakhlestin wrote:

    3. expose body-parsers via php-level API

    +1. Hell, yes! Something like +1000, actually ;)
    Uhmmm... I actually meant an internal API, not userland :)
    well, why not both? :)

    string/stream in, array out
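
    A minimal userland sketch of the "string in, array out" idea. The function name parse_request_body() is hypothetical and not part of PHP; it only wraps the existing parse_str() and json_decode() and returns null for content types it does not know, so the caller can fall back to the raw body.

```php
<?php
// Hypothetical helper, not a PHP built-in: string in, array out.
function parse_request_body(string $body, string $contentType): ?array
{
    // Drop parameters such as "; charset=utf-8" and normalise case.
    $type = strtolower(trim(explode(';', $contentType, 2)[0]));

    switch ($type) {
        case 'application/x-www-form-urlencoded':
            // parse_str() decodes names and values the way $_POST is filled.
            parse_str($body, $fields);
            return $fields;
        case 'application/json':
            $data = json_decode($body, true);
            // Scalars and invalid JSON are not usable as a field array.
            return is_array($data) ? $data : null;
        default:
            // Unknown type: leave the raw body to the caller.
            return null;
    }
}

$form = parse_request_body('a=1&b[]=2', 'application/x-www-form-urlencoded; charset=UTF-8');
$json = parse_request_body('{"x":1}', 'application/json');
```

    Nothing here depends on the request method, which is the point of proposal 2.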
  • Jannik Zschiesche at Oct 2, 2013 at 7:25 am
    Hi,

    wouldn’t $_BODY be better - since it is the request body?
    $_FORM is imho not very clear, since you can send data to $_POST without using a form.


    --
    Cheers
    Jannik
  • Michael Wallner at Oct 2, 2013 at 7:31 am

    On 2 October 2013 09:25, Jannik Zschiesche wrote:
    Hi,

    wouldn’t $_BODY be better - since it is the request body?
    $_FORM is imho not very clear, since you can send data to $_POST without
    using a form.
    I had it, but I'm not sure $_BODY fits either, because it should be an
    array. Currently only form data fits the purpose of de-serialisation
    of a request body.


    --
    Regards,
    Mike
  • Adam Harvey at Oct 2, 2013 at 6:43 pm

    On 02.10.2013, at 10:59, Michael Wallner wrote:
    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).
    I'm not really sure people are confused, actually. There are plenty of
    common gotchas that I see daily in ##php, but that isn't one of them.

    I like how Alexey broke this down, so I'm going to steal it. :)
    On 2 October 2013 00:17, Alexey Zakhlestin wrote:
    1. _GET -> _QUERY, _POST -> _FORM
    As I hinted at above, I'm -1 on this at first blush. I don't think
    it's really that confusing in practice and $_GET and $_POST have a lot
    of history (and muscle memory) behind them at this point.
    2. ignore request-method, trigger body-processing solely on Content-type
    Definite +1 here.
    3. expose body-parsers via php-level API
    +1, particularly if it's also available in userland as Alexey got
    excited about. :)

    Adam
  • Andrea Faulds at Oct 2, 2013 at 9:58 am

    On 2 October 2013 at 07:59, Michael Wallner wrote:

    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).
    Backwards compatibility matters, so we should keep $_GET and $_POST but add
    these as better aliases for them.

    While we're at it, can we remove the quirk that existed due to register_globals
    where periods and such are replaced with underscores? Such that for
    /test.php?php.pecl=3&php_pecl=4 we'd still only have $_GET['php_pecl'] === 4 but
    there would also be $_QUERY['php.pecl'] === 3 and $_QUERY['php_pecl'] === 4, if
    you get where I'm coming from.
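
    The name mangling can be reproduced with parse_str(), which applies the same key rewriting as $_GET. The sketch below also shows a hypothetical parse_query_verbatim() helper (not a PHP function) that keeps keys as sent; it handles flat keys only and ignores the a[]=1 array syntax. Note that values arrive as strings, so a strict comparison would be against '4' rather than 4.

```php
<?php
// parse_str() rewrites "php.pecl" to "php_pecl", so both parameters
// collapse into one key and the later value wins.
parse_str('php.pecl=3&php_pecl=4', $mangled);

// Hypothetical helper, not a PHP built-in: split a query string without
// rewriting the keys. Flat keys only; "a[]=1" nesting is not handled.
function parse_query_verbatim(string $query): array
{
    $result = [];
    foreach (explode('&', $query) as $pair) {
        if ($pair === '') {
            continue; // skip empty segments such as "a=1&&b=2"
        }
        $parts = explode('=', $pair, 2);
        $result[urldecode($parts[0])] = urldecode($parts[1] ?? '');
    }
    return $result;
}

$verbatim = parse_query_verbatim('php.pecl=3&php_pecl=4');
```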
    Further, I propose to remove the POST method restriction for handling
    request bodies and solely rely on the content type to trigger the
    parser(s). (*)
    +1 to this, current behaviour is nonsensical.
    There are already parsers for application/x-www-form-urlencoded and
    multipart/form-data in the core.  One could think of providing an API
    to add content type handlers from extensions, ext/json may be an
    example, like it is hacked into pecl_http-v2.
    +1 Handling JSON would be pretty neat. Maybe we could even support it by default
    (after all, there is the json ext), so long as getting the unparsed request body
    isn't impeded by that.
    --
    Andrea Faulds
    http://ajf.me/
  • Andrea Faulds at Oct 2, 2013 at 10:00 am

    On 2 October 2013 at 10:58, Michael Wallner wrote:
    On 2 October 2013 11:56, Andrea Faulds wrote:
    Backwards compatibility matters, so we should keep $_GET and $_POST but add
    these as better aliases for them.
    That's why I said "phase out"... or is it a German anglicism?
    It's not. But my understanding of "phase out" is to deprecate and eventually
    remove. Considering how much code relies on $_GET and $_POST though, the most
    we'll ever do is probably deprecate.
    --
    Andrea Faulds
    http://ajf.me/
  • Michael Wallner at Oct 2, 2013 at 12:25 pm

    On 2 October 2013 12:00, Andrea Faulds wrote:
    On 2 October 2013 at 10:58, Michael Wallner <mike@php.net> wrote:
    On 2 October 2013 11:56, Andrea Faulds wrote:
    Backwards compatibility matters, so we should keep $_GET and $_POST but add
    these as better aliases for them.
    That's why I said "phase out"... or is it a German anglicism?
    It's not. But my understanding of "phase out" is to deprecate and eventually
    remove. Considering how much code relies on $_GET and $_POST though, the most
    we'll ever do is probably deprecate.
    Never heard of register_long_arrays?
    http://www.php.net/manual/en/ini.core.php#ini.register-long-arrays

    That statement would have been applicable there, too.
    Give it a long enough time frame and it becomes possible.

    --
    Regards,
    Mike
  • Leigh at Oct 2, 2013 at 11:12 am

    On 2 October 2013 07:59, Michael Wallner wrote:

    I propose to phase out $_GET and name it $_QUERY and
    I propose to phase out $_POST and name it $_FORM
    I have to say I'm against this aspect of the proposal. While the names
    may not be 100% accurate, _most_ people are used to their behaviour.
    You certainly won't be able to remove $_GET / $_POST (implied by
    "phase out") in any 5.x release, it's just too big of a BC break.
    Further, I propose to remove the POST method restriction for handling
    request bodies and solely rely on the content type to trigger the
    parser(s). (*)
    This makes more sense, HTTP/1.1 spec states all methods (except TRACE)
    are allowed a body. This is where it could get pretty confusing
    though, since a GET is allowed a body that could populate $_POST. I
    still don't think it justifies the name change though.
    On 2 October 2013 08:31, Michael Wallner wrote:

    I had it, but I'm not sure $_BODY fits either, because it should be an
    array. Currently only form data fits the purpose of de-serialisation
    of a request body.
    Not so sure about that. I don't think there is a rule that says a body
    _has_ to be in query string name=value format, or that multipart
    elements _have_ to have a name=something attribute. I could quite
    easily imagine PUT requests containing a textual body without an
    associated field name (the URI would contain the field name).

    (correct me if I'm wrong)
  • Alexey Zakhlestin at Oct 2, 2013 at 11:37 am

    On 02.10.2013, at 15:12, Leigh wrote:
    On 2 October 2013 08:31, Michael Wallner wrote:

    I had it, but I'm not sure $_BODY fits either, because it should be an
    array. Currently only form data fits the purpose of de-serialisation
    of a request body.
    Not so sure about that. I don't think there is a rule that says a body
    _has_ to be in query string name=value format, or that multipart
    elements _have_ to have a name=something attribute. I could quite
    easily imagine PUT requests containing a textual body without an
    associated field name (the URI would contain the field name).

    (correct me if I'm wrong)

    In these cases, the Content-Type of the body would be different.
    And the proposal says that interpretation should be based strictly on the content type.
  • Michael Wallner at Oct 2, 2013 at 12:27 pm

    On 2 October 2013 13:12, Leigh wrote:
    On 2 October 2013 07:59, Michael Wallner wrote:

    I propose to phase out $_GET and name it $_QUERY and
    I propose to phase out $_POST and name it $_FORM
    I have to say I'm against this aspect of the proposal. While the names
    may not be 100% accurate, _most_ people are used to their behaviour.
    You certainly won't be able to remove $_GET / $_POST (implied by
    "phase out") in any 5.x release, it's just too big of a BC break.
    Right, that's why I said "phase out". Check out register_long_arrays;
    those globals have been deprecated in PHP-5.0 and removed in PHP-5.4.


    --
    Regards,
    Mike
  • Andrea Faulds at Oct 2, 2013 at 1:50 pm

    On 2 October 2013 at 13:27, Michael Wallner wrote:

    On 2 October 2013 13:12, Leigh wrote:
    On 2 October 2013 07:59, Michael Wallner wrote:

    You certainly won't be able to remove $_GET / $_POST (implied by
    "phase out") in any 5.x release, it's just too big of a BC break.
    Right, that's why I said "phase out". Check out register_long_arrays;
    those globals have been deprecated in PHP-5.0 and removed in PHP-5.4.
    Huh, $_GET and the other superglobals were added in 4.1 to replace the
    $HTTP_GET_VARS etc., and then the old way was deprecated in PHP-5.0 and removed
    in PHP-5.4.
    Well, perhaps $_QUERY and $_FORM can be added in 5.6, deprecated in 6.0 and
    removed in 6.4, then? Who knows!
    --
    Andrea Faulds
    http://ajf.me/
  • Johannes Schlüter at Oct 2, 2013 at 2:03 pm

    On Wed, 2013-10-02 at 14:50 +0100, Andrea Faulds wrote:
    On 2 October 2013 at 13:27, Michael Wallner <mike@php.net> wrote:

    On 2 October 2013 13:12, Leigh wrote:
    On 2 October 2013 07:59, Michael Wallner wrote:

    You certainly won't be able to remove $_GET / $_POST (implied by
    "phase out") in any 5.x release, it's just too big of a BC break.
    Right, that's why I said "phase out". Check out register_long_arrays;
    those globals have been deprecated in PHP-5.0 and removed in PHP-5.4.
    Huh, $_GET and the other superglobals were added in 4.1 to replace the
    $HTTP_GET_VARS etc., and then the old way was deprecated in PHP-5.0 and removed
    in PHP-5.4.
    Well, perhaps $_QUERY and $_FORM can be added in 5.6, deprecated in 6.0 and
    removed in 6.4, then? Who knows!
    Also, comparing the removal of $_* with that of $HTTP_*_VARS is misleading, as
    the $HTTP_*_VARS were of limited use. register_globals was primarily used.

    johannes
  • Johannes Schlüter at Oct 2, 2013 at 2:48 pm

    On Wed, 2013-10-02 at 08:59 +0200, Michael Wallner wrote:
    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).
    The latter is certainly misleading. The current naming corresponds to
    HTML forms.

        <form method="GET"> -> $_GET
        <form method="POST"> -> $_POST

    I agree that the naming from an HTTP/REST etc. perspective is misleading,
    but unless we have a clearly better naming I would refrain from changing
    these.

    Changing these leads to an incompatibility which can not be emulated
    (ignoring runkit there is no way for a user to create a custom super
    global)

    In case that is ignored please mind other related areas, e.g.
    filter_input(), to make sure the resulting new language is consistent.

    johannes
  • Michael Wallner at Oct 2, 2013 at 5:59 pm

    On 2 October 2013 16:10, Johannes Schlüter wrote:
    On Wed, 2013-10-02 at 08:59 +0200, Michael Wallner wrote:
    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).
    The latter is certainly misleading. The current naming corresponds to
    HTML forms.

    <form method="GET"> -> $_GET
    <form method="POST"> -> $_POST
    Heh, pretty good observation! Didn't think about that. Still not
    buying. $_FORM is derived from "application/x-www-form-urlencoded"
    resp. "multipart/form-data" and $_QUERY is obvious.

    I agree that the naming from an HTTP/REST etc. perspective is misleading,
    but unless we have a clearly better naming I would refrain from changing
    these.
    Not only for REST, but in general IMHO, e.g: <form method="POST"
    action="?see=gotcha">

    Changing these leads to an incompatibility which can not be emulated
    (ignoring runkit there is no way for a user to create a custom super
    global)
    Valid point. Though, with a long enough time frame it could be done.

    In case that is ignored please mind other related areas, i.e.
    filter_input() to make sure the resulting new language is consistent.
    Yeah, well, there's a lot attached to that cumbersome naming, e.g. all
    *_post_* INI settings etc.
    I should have just resisted proposing that change, but I figured
    testing for backing was cheap.

    --
    Regards,
    Mike
  • Johannes Schlüter at Oct 2, 2013 at 9:40 pm

    On Wed, 2013-10-02 at 19:59 +0200, Michael Wallner wrote:
    On 2 October 2013 16:10, Johannes Schlüter wrote:
    On Wed, 2013-10-02 at 08:59 +0200, Michael Wallner wrote:
    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).
    The latter is certainly misleading. The current naming corresponds to
    HTML forms.

    <form method="GET"> -> $_GET
    <form method="POST"> -> $_POST
    Heh, pretty good observation! Didn't think about that. Still not
    buying. $_FORM is derived from "application/x-www-form-urlencoded"
    resp. "multipart/form-data" and $_QUERY is obvious.
    Out of curiosity I did some browsing:

    - ASP uses Request.QueryString(string name) and Request.Form
    - Java Servlets and JSP use ServletRequest.getParameter() for both
    - Perl CGI.pm has a param hash for both.

    I didn't look really deep though for further specifics :)

    So, ASP seems to agree with you, while I see a difference: In ASP
    those are methods which, when used, are always qualified by the Request
    Object's name.

    A standalone $_QUERY might be confused with the famous $query from
    $query= mysql_query().

    Changing these leads to an incompatibility which can not be emulated
    (ignoring runkit there is no way for a user to create a custom super
    global)
    Valid point. Though, with a long enough time frame it could be done.
    Mind that we are talking about changing each and every PHP application.
    Each and every PHP tutorial. Each and every book. Each and every
    PHP-related tool. Each and every developer's mind. Each and every ...

    Phasing out register_globals took us some time.

        PHP 4.1, 2001-12-10, $_* introduced and advertised
        PHP 4.2, 2002-04-22, register_globals off by default, causing
                             lots of screaming
        PHP 5.4, 2012-03-01, register_globals dropped

    That's a deprecation process of ten years for a feature which was
    relatively easy to emulate (import_request_variables(), extract() etc.)
    In case that is ignored please mind other related areas, i.e.
    filter_input() to make sure the resulting new language is consistent.
    Yeah, well, there's a lot attached to that cumbersome naming, e.g. all
    *_post_* INI settings etc.
    I should have just resisted proposing that change, but I figured
    testing for backing was cheap.
    I think extending the parsing is a good idea and should be done.

    And there certainly are quite a few things we would do differently if
    PHP was done from scratch, but the days when Rasmus could log in to any
    PHP host and apply fixes for new versions are long gone ;-)

    This also is a good reminder to take care when adding new features and
    to really think those through instead of filling PHP with more and more
    stuff (sorry for abusing this thread for the purpose of that rant)

    johannes
  • Christian Stadler at Oct 4, 2013 at 6:40 pm

    On 02.10.2013 23:40, Johannes Schlüter wrote:
    On Wed, 2013-10-02 at 19:59 +0200, Michael Wallner wrote:
    On 2 October 2013 16:10, Johannes Schlüter wrote:
    On Wed, 2013-10-02 at 08:59 +0200, Michael Wallner wrote:
    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).
    The latter is certainly misleading. The current naming corresponds to
    HTML forms.

    <form method="GET"> -> $_GET
    <form method="POST"> -> $_POST
    Heh, pretty good observation! Didn't think about that. Still not
    buying. $_FORM is derived from "application/x-www-form-urlencoded"
    resp. "multipart/form-data" and $_QUERY is obvious.
    Out of curiosity I did some browsing:

    - ASP uses Request.QueryString(string name) and Request.Form
    - Java Servlets and JSP use ServletRequest.getParameter() for both
    - Perl CGI.pm has a param hash for both.

    I didn't look really deep though for further specifics :)

    So, ASP seems to agree with you, while I see a difference: In ASP
    those are methods which, when used, are always qualified by the Request
    Object's name.
    Actually I like the idea of having an API to handle everything important
    for the HTTP-request and respectively for the response.

    e. g.:
    HTTPRequest::getFormData(...), which could possibly be aliased by
    HTTPRequest::getPOSTData(...) or HTTPRequest::getPOST(...);
    as well as
    HTTPRequest::getQueryData(...) aliased by
    HTTPRequest::getGETData(...) or HTTPRequest::getGET(...);
    and HTTPRequest::filterXXX(...) as a replacement for filter_var()
    and so on

    HTTPResponse could include a replacement for htmlspecialchars and
    htmlentities (not quite sure, if this fits better to HTTPRequest or
    probably into both)
    HTTPResponse::buildQuery() would be a replacement for http_build_query
    and so on.

    To make a long story short: IMHO Every function and superglobal related
    to the HTTP-request or -response should be moved to their respective
    classes so everything is under one hood rather than making userland
    guessing which function/superglobal/whatever is for what purpose.
    As it is now, it is kinda chaotic and confusing to me and probably to
    userland, too.

    Just my 2 cents on that topic.

    Regards,
       Christian Stadler
  • Johannes Schlüter at Oct 7, 2013 at 11:21 am

    On Fri, 2013-10-04 at 20:38 +0200, Christian Stadler wrote:
    Actually I like the idea of having an API to handle everything important
    for the HTTP-request and respectively for the response.

    e. g.:
    HTTPRequest::getFormData(...), which could possibly be aliased by
    HTTPRequest::getPOSTData(...) or HTTPRequest::getPOST(...);
    We have filter_input(INPUT_POST, ...);
    as well as
    HTTPRequest::getQueryData(...) aliased by
    HTTPRequest::getGETData(...) or HTTPRequest::getGET(...);
    We have filter_input(INPUT_GET, ...);
    and HTTPRequest::filterXXX(...) as a replacement for filter_var()
    and so on
    Why replace something? Are there flaws which can't be fixed? Adding too
    many ways to do the same thing is confusing for everybody. If you want
    it "object oriented" or such, frameworks do great things. The language
    should offer a good foundation. (It is my strong belief that we should
    move as many "high level" things as possible in libraries)
    HTTPResponse could include a replacement for htmlspecialchars and
    htmlentities (not quite sure, if this fits better to HTTPRequest or
    probably into both)
    HTTPResponse::buildQuery() would be a replacement for http_build_query
    and so on.

    To make a long story short: IMHO Every function and superglobal related
    to the HTTP-request or -response should be moved to their respective
    classes so everything is under one hood rather than making userland
    guessing which function/superglobal/whatever is for what purpose.
    As it is now, it is kinda chaotic and confusing to me and probably to
    userland, too.
    Some of these things might have been named better, back in the past, but
    I see no benefit in making those static methods in classes, except maybe
    that we then need two hash lookups (class and method tables) instead of
    one (function table).

    And btw. that design from above already is flawed too:
    HTTPResponse::buildQuery() - this has nothing to do with an HTTP
    response. It is also needed e.g. for some stream operations. And also
    the escaping is not needed for HTTP, but for HTML, even when creating static
    HTML pages. ;-)

    johannes
  • Rowan Collins at Oct 7, 2013 at 3:03 pm
    Hi All,

    Johannes Schlüter wrote (on 07/10/2013):
    Why replace something? Are there flaws which can't be fixed? Adding too
    many ways to do the same thing is confusing for everybody. If you want
    it "object oriented" or such, frameworks do great things. The language
    should offer a good foundation. (It is my strong belief that we should
    move as many "high level" things as possible in libraries)
    I think this is a strong point: PHP as it is now is a long way from
    being usable as a modern framework in its own right, and building a
    framework-like API for this kind of functionality may open a bigger can
    of worms than was intended. Zend, Symfony, et al have a massive
    head-start on any kind of "PHP native framework".

    Rather than trying to build the framework logic into the core, how about
    exposing the functionality currently in core to user-space so that
    frameworks can wrap it more efficiently? Looking around, we currently
    have the slightly awkward parse_str for query strings and
    application/x-www-form-urlencoded bodies; I imagine there's something in
    pecl_http for multipart/form-data, but I'm not familiar with that
    extension, and it's not part of core.

    If the internal implementations were decoupled from the population of
    $_POST as a consistent set of functions, frameworks could simply wrap
    them alongside custom parsers such as JSON (with appropriate options) or
    XML (restricted to some framework-defined schema). This also allows for
    simulated requests, unit testing, etc, and passing in
    file_get_contents('php://input') should allow parsing any request body.
    If this was done as some consistent "serialization/deserialization"
    module, it could expose the session serialization formats as well,
    replacing the horrible environment-clobbering of session_decode().
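
    A rough sketch of how a framework might wrap such decoupled parsers. The BodyParserRegistry class and its methods are hypothetical; only parse_str() and json_decode() are existing PHP functions. In a web request the inputs would typically be $_SERVER['CONTENT_TYPE'] and file_get_contents('php://input'); a literal string stands in here, which is also what a unit test or simulated request would pass.

```php
<?php
// Hypothetical class; nothing like this ships with PHP.
final class BodyParserRegistry
{
    /** @var array<string, callable> parsers keyed by lower-case media type */
    private $parsers = [];

    public function register(string $contentType, callable $parser): void
    {
        $this->parsers[strtolower($contentType)] = $parser;
    }

    public function parse(string $contentType, string $body): ?array
    {
        // Ignore parameters such as "; charset=utf-8".
        $type = strtolower(trim(explode(';', $contentType, 2)[0]));
        if (!isset($this->parsers[$type])) {
            return null; // no parser registered: caller keeps the raw body
        }
        $result = ($this->parsers[$type])($body);
        return is_array($result) ? $result : null;
    }
}

$registry = new BodyParserRegistry();
$registry->register('application/x-www-form-urlencoded', function (string $body): array {
    parse_str($body, $fields);
    return $fields;
});
$registry->register('application/json', function (string $body) {
    return json_decode($body, true);
});

// Simulated request body instead of php://input.
$fields = $registry->parse('application/json; charset=utf-8', '{"name":"mike"}');
```

    Because parsing is opt-in per content type, an application only exposes the parsers it registers.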

    Regards,
    --
    Rowan Collins
    [IMSoP]
  • Larry Garfield at Oct 9, 2013 at 9:20 pm

    The Framework Interoperability Group has been discussing a unified
    request/response spec on and off for a while. There's definite interest
    in having one, especially if it can support HTTP clients (Guzzle and
    Buzz, etc.) with the same interface as the main PHP process.

    I'd actually recommend those who are interested work on it over there,
    let it happen in PHP user space, and then later backport it to C code if
    it seems to work out well.

    --Larry Garfield
  • Nikita Popov at Oct 2, 2013 at 3:15 pm

    On Wed, Oct 2, 2013 at 8:59 AM, Michael Wallner wrote:

    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).
    I don't think this kind of change is worth it if you just rename two very
    heavily used variables. If something in this direction is changed the
    change should be more thorough (including getting away from superglobals
    and representing the request state by an immutable object).

    There are already parsers for application/x-www-form-urlencoded and
    multipart/form-data in the core. One could think of providing an API
    to add content type handlers from extensions, ext/json may be an
    example, like it is hacked into pecl_http-v2.
    I would *strongly* recommend against adding additional body parsers that
    are automatically invoked based on the content type. Adding additional
    parsers creates a high security risk. E.g. exposing ext/json as it is now
    would open you to a denial of service attack (if I'm not mistaken). There
    has been a long history of security vulnerabilities (both DOS and RCE)
    related to unnecessary or incorrect exposure of request body parsers. A
    prominent recent example are the RCE vulnerabilities in Rails caused by the
    exposure of YAML and JSON parsers.
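
    Independent of which specific attack is meant here, one common way to reduce exposure when a JSON body is decoded is to cap both payload size and nesting depth. The sketch below is only an illustration under that assumption: decode_json_body() and its default limits are hypothetical, and it does not claim to cover every risk mentioned above.

```php
<?php
// Hypothetical helper, not a PHP built-in: decode a JSON body only
// within explicit limits. The default limits are arbitrary examples.
function decode_json_body(string $body, int $maxBytes = 65536, int $maxDepth = 16): ?array
{
    if (strlen($body) > $maxBytes) {
        return null; // refuse oversized payloads before parsing
    }
    // Second argument: decode objects as associative arrays.
    // Third argument: nesting limit; deeper input makes json_decode() return null.
    $data = json_decode($body, true, $maxDepth);

    return is_array($data) ? $data : null;
}

$ok      = decode_json_body('{"a":{"b":1}}');
$tooDeep = decode_json_body('[[[1]]]', 65536, 2);
$tooBig  = decode_json_body('[1,2,3,4,5]', 5);
```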

    Nikita
  • Michael Wallner at Oct 2, 2013 at 6:02 pm

    On 2 October 2013 17:15, Nikita Popov wrote:
    On Wed, Oct 2, 2013 at 8:59 AM, Michael Wallner wrote:

    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefore I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).

    I don't think this kind of change is worth it if you just rename two very
    heavily used variables. If something in this direction is changed the change
    should be more thorough (including getting away from superglobals and
    representing the request state by an immutable object).
    Well, what I want and what is in core may diverge.
    May I suggest you take a look at pecl_http-v2, I'd greatly appreciate
    any feedback.
    There are already parsers for application/x-www-form-urlencoded and
    multipart/form-data in the core. One could think of providing an API
    to add content type handlers from extensions, ext/json may be an
    example, like it is hacked into pecl_http-v2.

    I would *strongly* recommend against adding additional body parsers that are
    automatically invoked based on the content type. Adding additional parsers
    creates a high security risk. E.g. exposing ext/json as it is now would open
    you to a denial of service attack (if I'm not mistaken). There has been a
    long history of security vulnerabilities (both DOS and RCE) related to
    unnecessary or incorrect exposure of request body parsers. A prominent
    recent example are the RCE vulnerabilities in Rails caused by the exposure
    of YAML and JSON parsers.
    Pointers, references, evidence?

    --
    Regards,
    Mike
  • Nikita Popov at Oct 2, 2013 at 6:40 pm

    On Wed, Oct 2, 2013 at 8:02 PM, Michael Wallner wrote:

    There are already parsers for application/x-www-form-urlencoded and
    multipart/form-data in the core. One could think of providing an API
    to add content type handlers from extensions, ext/json may be an
    example, like it is hacked into pecl_http-v2.

    I would *strongly* recommend against adding additional body parsers that are
    automatically invoked based on the content type. Adding additional parsers
    creates a high security risk. E.g. exposing ext/json as it is now would open
    you to a denial of service attack (if I'm not mistaken). There has been a
    long history of security vulnerabilities (both DOS and RCE) related to
    unnecessary or incorrect exposure of request body parsers. A prominent
    recent example are the RCE vulnerabilities in Rails caused by the exposure
    of YAML and JSON parsers.
    Pointers, references, evidences?
    The Rails RCE (remote code execution) vulnerability I'm referring to is
    https://groups.google.com/forum/?fromgroups=#!topic/rubyonrails-security/61bkgvnSGTQ,
    which is caused by exposing YAML and XML parsers. There have been several
    subsequent vulnerabilities in this area, e.g.
    https://groups.google.com/forum/?fromgroups=#!topic/rubyonrails-security/1h2DR63ViGo,
    which involves exposing a JSON parser that happened to operate on YAML
    internally. You'll find that similar vulnerabilities turned up in various
    web frameworks over time (one other case I remember off the top of my head
    is a parameter parsing vulnerability in Apache Struts2 related to OGNL).

    The DOS vulnerability that would turn up by directly exposing json_decode
    as a body parser is the standard HashDOS vulnerability (which exploits
    collisions in the array key hashing functions).

    Even without exposing additional parsers, PHP has already had its share of
    vulnerabilities in this area (e.g. HashDOS and the subsequent RCE that its
    fix caused).

    I'm not saying that adding additional parsers is bad *per se*, I'm just
    saying that you need to be very careful what you add here. The more
    automatic body parsers you have the larger the attack surface becomes.
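
    As a rough illustration of the opt-in alternative to automatic parsing, the
    sketch below decodes a JSON body only when the application asks for it, and
    only within explicit size and nesting limits. The helper name
    decodeJsonBody() and both limits are assumptions made for this example, not
    anything proposed in the thread. The limits only bound how much
    attacker-controlled input reaches the parser; they do not remove
    hash-collision risk.

    ```php
    <?php
    // Hypothetical helper: decode a JSON request body on demand, within explicit limits.
    function decodeJsonBody(string $rawBody, int $maxBytes = 65536, int $maxDepth = 16): ?array
    {
        if (strlen($rawBody) > $maxBytes) {
            return null; // refuse oversized payloads before parsing
        }
        // json_decode() returns null on failure, including when $maxDepth is exceeded.
        $data = json_decode($rawBody, true, $maxDepth);
        return is_array($data) ? $data : null;
    }

    // In a web request the raw body would come from php://input (assumed usage):
    // $payload = decodeJsonBody(file_get_contents('php://input'));
    $ok = decodeJsonBody('{"name":"mike","tags":["a","b"]}');
    $tooDeep = decodeJsonBody('[[[1]]]', 65536, 2); // three levels of nesting, limit of two
    ```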

    Nikita
  • Pierre Joye at Oct 3, 2013 at 9:18 am
    hi!
    On Wed, Oct 2, 2013 at 8:15 AM, Nikita Popov wrote:
    On Wed, Oct 2, 2013 at 8:59 AM, Michael Wallner wrote:

    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefor I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).
    I don't think this kind of change is worth it if you just rename two very
    heavily used variables. If something in this direction is changed the
    change should be more thorough (including getting away from superglobals
    and representing the request state by an immutable object).

    I totally agree with you here. Unlike the other related changes, the
    impact on existing code is much larger here, for little gain. I am in
    favor of not touching them, at all.
    There are already parsers for application/x-www-form-urlencoded and
    multipart/form-data in the core. One could think of providing an API
    to add content type handlers from extensions, ext/json may be an
    example, like it is hacked into pecl_http-v2.
    I would *strongly* recommend against adding additional body parsers that
    are automatically invoked based on the content type.
    Again, I totally agree.
    Adding additional
    parsers creates a high security risk. E.g. exposing ext/json as it is now
    would open you to a denial of service attack (if I'm not mistaken). There
    has been a long history of security vulnerabilities (both DOS and RCE)
    related to unnecessary or incorrect exposure of request body parsers. A
    prominent recent example are the RCE vulnerabilities in Rails caused by the
    exposure of YAML and JSON parsers.
    However, I could imagine some new ways to deal with input that are
    friendlier to REST/HTTP services. But it is a tricky area, from both an
    API design and a security point of view.

    Cheers,
    --
    Pierre

    @pierrejoye | http://www.libgd.org
  • Daniel Lowrey at Oct 2, 2013 at 3:29 pm
    The superglobals are a hopelessly poor abstraction. Can we stop trying to
    put the proverbial gold ring in the pig's snout on this?

    While a change to `$_QUERY` and `$_BODY` would undoubtedly be an
    improvement I don't think the massive BC breaks that would result are
    justified by simply improving variable names. The problem isn't
    nomenclature; it's globally mutable data structures whose key-value nature
    isn't a good fit for the unrestricted possibilities of HTTP. You can't
    efficiently model an HTTP request with associative arrays. Period.

    The other elephant in the room is that your average WordPress developer
    will continue to use `$_GET` and `$_POST` regardless. So any improvements
    that are made should be targeted at professional developers. The obvious
    thing to do here is to introduce an entirely separate abstraction that can
    be used by professional developers without eliminating the exceedingly
    simple superglobal abstraction (hello, BC easily retained).

    Something like the following would be an infinitely superior solution:

    interface HttpRequest {
         function getMethod();
         function getContentType();
         function getContentLength();
         function getBody();
         function getBodyStream();
         function hasHeader($fieldName);
         function getHeader($fieldName);
         function hasFormField($fieldName);
         function getFormField($fieldName);
         function hasCookie($fieldName);
         function getCookie($fieldName);
    }

    By adding a global function such as `get_request()` that returns the
    immutable request object you get all the functionality you could want:

    1. Entity body parsing (for cookies or form data) could be done JIT
    2. Userland code can easily typehint against the request
    3. No more mutability, eminently testable
    4. Eliminates the need for `$_GET`, `$_POST`, `$_COOKIE`, etc ...

    This is simply an example of a much better way to model HTTP requests --
    it's not a suggestion for a final implementation. But IMO if we're going to
    fix this then we should really fix it and not continuously tickle the same
    broken abstraction.
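
    A reduced userland sketch of the same idea, assuming a hypothetical class
    name ImmutableRequest and covering only a few of the methods listed above.
    Values are fixed at construction and exposed only through getters. It is
    meant to show the shape of such an object, not a final design.

    ```php
    <?php
    // Hypothetical, cut-down take on the HttpRequest interface sketched above.
    final class ImmutableRequest
    {
        private $method;
        private $headers;
        private $body;

        public function __construct(string $method, array $headers, string $body)
        {
            $this->method = strtoupper($method);
            // Header field names are case-insensitive, so normalise them once.
            $this->headers = array_change_key_case($headers, CASE_LOWER);
            $this->body = $body;
        }

        public function getMethod(): string { return $this->method; }
        public function getBody(): string { return $this->body; }

        public function hasHeader(string $fieldName): bool
        {
            return array_key_exists(strtolower($fieldName), $this->headers);
        }

        public function getHeader(string $fieldName): ?string
        {
            return $this->headers[strtolower($fieldName)] ?? null;
        }
    }

    $request = new ImmutableRequest('put', ['Content-Type' => 'application/json'], '{"id":1}');
    ```

    Because there are no setters and the properties are private, code that
    receives $request can read it but cannot change what other code sees.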

    Some other thoughts ...
    While we're at it, can we remove the quirk that existed due to
    register_globals where periods and such are replaced with underscores?
    I think "fixing" this is a bad idea. You simply can't allow invalid
    variable name characters as part of these keys. That's asking for trouble.
    This problem is completely nullified with an immutable request object,
    though.
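
    For readers who have not run into the quirk under discussion, it can be
    reproduced with parse_str(), which applies the same key handling as the
    superglobals. The field names are made up for the example.

    ```php
    <?php
    // Dots and spaces in field names are rewritten to underscores.
    parse_str('user.name=mike&first+name=Michael', $fields);
    // $fields now has the keys "user_name" and "first_name".
    ```
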
    I would *strongly* recommend against adding additional body parsers that
    are automatically invoked based on the content type. Adding additional
    parsers creates a high security risk.
    Agreed. It's best to retain the thoroughly tested existing parser
    functionality for things like multipart, url-encoded forms and cookies.
    These can easily be reused as part of a JIT object-based solution.
    Everything else should be the user's responsibility (until proven so
    ubiquitous as to be useful at the language level). Manual userland parsing
    should be trivial as long as the raw entity body is available.
  • Florian Anderiasch at Oct 2, 2013 at 3:47 pm

    On 02.10.2013 17:29, Daniel Lowrey wrote:
    The superglobals are a hopelessly poor abstraction. Can we stop trying to
    put the proverbial gold ring in the pig's snout on this?

    [...]

    Something like the following would be an infinitely superior solution:

    interface HttpRequest {
    function getMethod();
    [...]
    }

    By adding a global function such as `get_request()` that returns the
    immutable request object you get all the functionality you could want:

    1. Entity body parsing (for cookies or form data) could be done JIT
    2. Userland code can easily typehint against the request
    3. No more mutability, eminently testable
    4. Eliminates the need for `$_GET`, `$_POST`, `$_COOKIE`, etc ...

    This is simply an example of a much better way to model HTTP requests --
    it's not a suggestion for a final implementation. But IMO if we're going to
    fix this then we should really fix it and not continuously tickle the same
    broken abstraction.
    While I totally agree with your point, I don't see why this needs to be
    in core. This can be done perfectly well in userspace, even the immutable part.

    The only benefit of putting that in core would be the standard-defining
    part, but not much else. Plus it's not minimizing any abuse, as long as
    the superglobals are still available (and I don't think renaming or
    removing them would be a good idea).

    Greetings,
    Florian
  • Johannes Schlüter at Oct 2, 2013 at 4:10 pm

    On Wed, 2013-10-02 at 17:47 +0200, Florian Anderiasch wrote:
    On 02.10.2013 17:29, Daniel Lowrey wrote:
    The superglobals are a hopelessly poor abstraction. Can we stop trying to
    put the proverbial gold ring in the pig's snout on this?

    [...]

    Something like the following would be an infinitely superior solution:

    interface HttpRequest {
    function getMethod();
    [...]
    }

    By adding a global function such as `get_request()` that returns the
    immutable request object you get all the functionality you could want:

    1. Entity body parsing (for cookies or form data) could be done JIT
    2. Userland code can easily typehint against the request
    3. No more mutability, eminently testable
    4. Eliminates the need for `$_GET`, `$_POST`, `$_COOKIE`, etc ...

    This is simply an example of a much better way to model HTTP requests --
    it's not a suggestion for a final implementation. But IMO if we're going to
    fix this then we should really fix it and not continuously tickle the same
    broken abstraction.
    While I totally agree with your point, I don't see why this needs to be
    in core. This is perfectly done in userspace, even the immutable part.

    The only benefit of putting that in core would be the standard-defining
    part, but not much else. Plus it's not minimizing any abuse, as long as
    the superglobals are still available (and I don't think renaming or
    removing them would be a good idea).
    Also mind that filter_input() exists as an immutable API.
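
    A small sketch of what "immutable" means here: filter_input() reads the
    value PHP received with the request rather than the current contents of
    $_GET, so later writes to the superglobal do not affect it. The parameter
    name "page" is only an example.

    ```php
    <?php
    $_GET['page'] = '999'; // userland modification of the superglobal

    $page = filter_input(INPUT_GET, 'page', FILTER_VALIDATE_INT);
    // Under the CLI there is no query string, so $page is null despite the write above.
    // Under a web SAPI with ?page=3 it would be the integer 3.
    ```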

    johannes
  • Rowan Collins at Oct 7, 2013 at 3:50 pm

    Daniel Lowrey wrote (on 02/10/2013):
    Something like the following would be an infinitely superior solution:

    interface HttpRequest {
    While having a quick look for userland parsing functions earlier, I came
    upon the PECL http extension, which includes this all-singing object:

    http://www.php.net/manual/en/class.httprequest.php

    As for this:
    You can't efficiently model an HTTP request with associative arrays. Period.
    The fact is that for 99% of use cases, yes you can, and developers
    happily do so. PHP even allows the convenient field_name[]= and
    field_name[key]= notations for building multi-dimensional arrays.
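
    The bracket notations can be tried out with parse_str(), which follows the
    same rules PHP applies to query strings and form bodies. The field names
    below are invented for the example.

    ```php
    <?php
    // "[]" appends to a list, "[key]" sets a named entry, and the two can be nested.
    parse_str('tags[]=php&tags[]=http&user[name]=rowan&user[langs][]=en', $fields);
    ```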

    This is all a convenience wrapper, and a consistent low-level API would
    be good, but alternative high-level APIs can be built from a few
    fundamental building blocks (e.g. getting the basic raw request parts as
    strings, parsing strings in various form encodings) without building a
    whole HTTP framework into the core.

    Regards,
    --
    Rowan Collins
    [IMSoP]
  • Dave at Oct 2, 2013 at 5:28 pm
    Further, I propose to remove the POST method restriction for handling
    request bodies and solely rely on the content type to trigger the
    parser(s). (*)

    +1
    This would solve the problem with parsing multipart form data in PUT requests (and
    possibly any future method types), thus enabling full REST support :)
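
    Until something like that exists in core, a userland approximation is to
    choose a parser from the content type alone and ignore the request method.
    The sketch below assumes a hypothetical helper parseBodyByContentType() and
    handles only application/x-www-form-urlencoded; multipart/form-data is not
    covered, since parse_str() cannot read it.

    ```php
    <?php
    // Hypothetical helper: select a body parser by Content-Type, not by method.
    function parseBodyByContentType(string $contentType, string $rawBody): ?array
    {
        // Drop parameters such as "; charset=UTF-8" and normalise case.
        $mediaType = strtolower(trim(explode(';', $contentType)[0]));

        if ($mediaType === 'application/x-www-form-urlencoded') {
            parse_str($rawBody, $fields);
            return $fields;
        }
        return null; // no parser for this type; the caller keeps the raw body
    }

    // In a real PUT handler (assumed usage):
    // $fields = parseBodyByContentType($_SERVER['CONTENT_TYPE'] ?? '', file_get_contents('php://input'));
    $fields = parseBodyByContentType('application/x-www-form-urlencoded; charset=UTF-8', 'title=Hello&draft=1');
    ```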

    On 2 October 2013 00:59, Michael Wallner wrote:

    Since ever people are confused by _GET and _POST superglobals,
    because, despite their name, they do not (really) depend on the
    request method. Therefor I propose to phase out $_GET and name it
    $_QUERY and I propose to phase out $_POST and name it $_FORM (I'm not
    100% confident with the latter yet, though).

    Further, I propose to remove the POST method restriction for handling
    request bodies and solely rely on the content type to trigger the
    parser(s). (*)

    There are already parsers for application/x-www-form-urlencoded and
    multipart/form-data in the core. One could think of providing an API
    to add content type handlers from extensions, ext/json may be an
    example, like it is hacked into pecl_http-v2.

    Thoughts, objections, insults?

    (*) We'd probably have to revisit all *post* INI variables, though.

    --
    Regards,
    Mike

    --
    PHP Internals - PHP Runtime Development Mailing List
    To unsubscribe, visit: http://www.php.net/unsub.php
  • Daniel Lowrey at Oct 4, 2013 at 10:45 am
    Uhmmm... I actually meant an internal API not userland :)
    Hehe, I'd be really excited to see this in userland too. Happy to help make
    this happen unless people have good reasons not to expose the parsers ...
  • Daniel Lowrey at Oct 7, 2013 at 5:19 pm
    You can't efficiently model an HTTP request with associative arrays.
    Period.
    The fact is that for 99% of use cases, yes you can, and developers
    happily do so.
    Leaky abstraction is leaky. If this is truly an efficient model of the HTTP
    request then why do we fragment it out into $_SERVER and $_COOKIE and
    $_FILES and $_POST and $_GET and php://input? I don't know what your
    definition of "efficiently model" is, but it must be different from mine.
    Array Oriented Programming !== design.
    without building a whole HTTP framework into the core.
    I'm devoutly anti-framework and I don't think anyone is actually advocating
    this. If you read closely I'm arguing *against* adding more superglobals.
    I'm also against changing the names of the existing arrays as any
    improvement is far outweighed by the massive BC implications of such a
    change. I personally don't care whether or not a better request model is
    implemented in core -- I've already unlearned the bad-habits ingrained at
    the language-level.
  • Rowan Collins at Oct 7, 2013 at 8:28 pm

    On 07/10/2013 18:19, Daniel Lowrey wrote:
    You can't efficiently model an HTTP request with associative
    arrays. Period.
    The fact is that for 99% of use cases, yes you can, and developers
    happily do so.
    Leaky abstraction is leaky. If this is truly an efficient model of the
    HTTP request then why do we fragment it out into $_SERVER and
    $_COOKIE and $_FILES and $_POST and $_GET and php://input? I don't
    know what your definition of "efficiently model" is, but it must be
    different from mine. Array Oriented Programming !== design.
    Ah, OK, I was only really talking about $_POST and $_GET, since they
    were the topic of this thread. I'm not quite sure why breaking different
    aspects of the HTTP request into different interfaces is fundamentally a
    problem, but I agree that you can't model /all/ aspects of an HTTP
    request as associative arrays.

    To go through the relevant superglobals in turn:

    $_GET: The param=value&... format for query strings is so universal,
    whether or not generated by a form, that it can largely be taken for
    granted. Within that format, the only thing that can't be handled as a
    hash is a repeated key; PHP takes the approach that foo[]=bar always
    creates an array, and foo=bar never does, with later values "winning".
    This covers 99% of what people need to do with query strings.
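
    The repeated-key case can be checked with parse_str(). For the remaining
    1%, a few lines of manual parsing keep every value. The helper name
    parseRepeatedKeys() is hypothetical, and it deliberately ignores the
    bracket notation.

    ```php
    <?php
    parse_str('foo=1&foo=2', $last); // without "[]" the later value wins

    // Hypothetical helper that keeps every value per key without requiring "[]".
    function parseRepeatedKeys(string $query): array
    {
        $result = [];
        foreach (explode('&', $query) as $pair) {
            if ($pair === '') {
                continue; // skip empty segments such as a trailing "&"
            }
            $parts = explode('=', $pair, 2);
            $key = urldecode($parts[0]);
            $value = urldecode($parts[1] ?? '');
            $result[$key][] = $value;
        }
        return $result;
    }

    $multi = parseRepeatedKeys('foo=1&foo=2&bar=x');
    ```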

    The mutability and globalness aren't great, but any method
    ->getQueryStringParam('foo') would be indistinguishable from
    ->getQueryStringHash()['foo']

    $_POST: Deals with the two generally accepted form encodings for POST
    requests, in a way that matches $_GET, but while allowing programmers to
    distinguish the two rather than clobbering them into one array.

    $_COOKIE: Again, the structure of the Cookie: header in PHP is
    fundamentally a set of name=value pairs, making a hash a perfectly
    reasonable structure. The asymmetry with setcookie() is unfortunate,
    although some asymmetry is inevitable given the underlying headers. At
    least it's better than the abomination that is document.cookie :P

    $_REQUEST: This is an unnecessary bit of redundancy, although it reminds
    me that if you do want to merge query string and posted data, having
    them as hashes is very handy.

    $_FILES: A bit awkward. I can guess the argument for splitting it from
    $_POST, but it's crying out for a more OO representation of the
    individual entries. They're still name-value pairs though.

    $_SERVER: This is the only one that really doesn't work at all. I was
    going to mention it earlier, but didn't want to drift off-topic in my
    earlier messages. It's an awful jumble of HTTP headers, PHP-specific
    data, and arbitrary environment variables which happen to have come
    through from the SAPI. It contains the requested URL in various parts
    with inconsistent names, and I refuse to go near it without a sane
    wrapper class.

    HTTP headers are fundamentally key-value pairs, although they can
    repeat, so more like key -> array of values. Environment variables are
    key-value too. The rest of it, along with
    file_get_contents('php://input'), is odds and ends that need a separate
    abstraction.
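
    A sketch of the kind of wrapper meant here, assuming a hypothetical helper
    headersFromServer() that picks the HTTP_* entries out of a $_SERVER-style
    array and stores them as name => list of values. Splitting values on commas
    is a simplification and is wrong for headers whose values may themselves
    contain commas.

    ```php
    <?php
    // Hypothetical helper: extract request headers from a $_SERVER-style array.
    function headersFromServer(array $server): array
    {
        $headers = [];
        foreach ($server as $key => $value) {
            // Only HTTP_* entries are request headers; Content-Type and Content-Length
            // arrive as CONTENT_TYPE and CONTENT_LENGTH and are not handled here.
            if (strpos((string) $key, 'HTTP_') !== 0) {
                continue;
            }
            // HTTP_ACCEPT_LANGUAGE -> Accept-Language
            $words = ucwords(strtolower(str_replace('_', ' ', substr($key, 5))));
            $name = str_replace(' ', '-', $words);
            $headers[$name] = array_map('trim', explode(',', (string) $value));
        }
        return $headers;
    }

    $headers = headersFromServer([
        'HTTP_ACCEPT_LANGUAGE' => 'en, de',
        'HTTP_HOST' => 'example.org',
        'SERVER_NAME' => 'example.org',
        'PATH' => '/usr/bin',
    ]);
    ```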

    Well, that's the way I see it anyway. The superglobals themselves aren't
    great, but outside of APIs (which will generally involve a bit of
    framework-y-ness anyway) the associative array interface is what most
    people will end up wanting anyway.

    Regards,
    --
    Rowan Collins
    [IMSoP]
