FAQ
Hi!

As many probably know, unserialize() has a security issue following from
the fact that you can create objects with data from unserialize(), and
these object may have behavior that is invoked automatically - namely
__destruct - that can result in unintended results. See e.g.
http://heine.familiedeelstra.com/security/unserialize among others for
more detailed description.

So I propose a modification to unserialize():
https://wiki.php.net/rfc/secure_unserialize

that would make one of the common cases - serializing data to be stored
on user side or user-accessible side - more secure by avoiding
instantiating all object (or all objects not belonging to a whitelist)
and keeping them as incomplete objects instead.

Comments and suggestions welcome,
--
Stanislav Malyshev, Software Architect
SugarCRM: http://www.sugarcrm.com/
(408)454-6900 ext. 227

Search Discussions

  • Anthony Ferrara at Mar 31, 2013 at 3:05 am
    Stas,

    Serious question: Why bother trying to clean this up? Why not just
    recommend using JSON or some other generic serialization without tieing
    into specific objects, and pushing the creation logic into userland (where
    it belongs IMHO, at least from a security perspective). At least for any
    times where serialized data may come into contact with a user...

    I'm not saying that this isn't a useful addition, I'm just wondering if it
    goes down the right path of what we should be recommending to users as the
    best practice. I wonder if it'd be better to simply recommend a generic
    serialization (JSON, XML, basically something with no class-type
    information other than array, object) instead for any use-case where end
    users might even remotely be able to tamper with the data.

    That's not to say serialization would be useless, not at all. But more that
    serialization (as it stands) would be recommended for server-side only
    (which it should be anyway)...

    I just fear this may let some people think that serialized data is OK to
    pass to the client. Which is only true with this patch if it's implemented
    well and correctly...

    Just my $0.02... Thoughts?

    Anthony

    On Sat, Mar 30, 2013 at 10:54 PM, Stas Malyshev wrote:

    Hi!

    As many probably know, unserialize() has a security issue following from
    the fact that you can create objects with data from unserialize(), and
    these object may have behavior that is invoked automatically - namely
    __destruct - that can result in unintended results. See e.g.
    http://heine.familiedeelstra.com/security/unserialize among others for
    more detailed description.

    So I propose a modification to unserialize():
    https://wiki.php.net/rfc/secure_unserialize

    that would make one of the common cases - serializing data to be stored
    on user side or user-accessible side - more secure by avoiding
    instantiating all object (or all objects not belonging to a whitelist)
    and keeping them as incomplete objects instead.

    Comments and suggestions welcome,
    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227

    --
    PHP Internals - PHP Runtime Development Mailing List
    To unsubscribe, visit: http://www.php.net/unsub.php
  • Sherif Ramadan at Mar 31, 2013 at 3:12 am

    On Sat, Mar 30, 2013 at 11:05 PM, Anthony Ferrara wrote:

    Stas,

    Serious question: Why bother trying to clean this up? Why not just
    recommend using JSON or some other generic serialization without tieing
    into specific objects, and pushing the creation logic into userland (where
    it belongs IMHO, at least from a security perspective). At least for any
    times where serialized data may come into contact with a user...

    I'm not saying that this isn't a useful addition, I'm just wondering if it
    goes down the right path of what we should be recommending to users as the
    best practice. I wonder if it'd be better to simply recommend a generic
    serialization (JSON, XML, basically something with no class-type
    information other than array, object) instead for any use-case where end
    users might even remotely be able to tamper with the data.

    That's not to say serialization would be useless, not at all. But more that
    serialization (as it stands) would be recommended for server-side only
    (which it should be anyway)...

    I just fear this may let some people think that serialized data is OK to
    pass to the client. Which is only true with this patch if it's implemented
    well and correctly...

    Just my $0.02... Thoughts?

    Anthony
    I agree with Anthony on the note that serialize really only should be used
    where the serialized data is being stored server-side.

    I already deal with users that believe the solution to storing compound
    data in cookies, for example, is OK to do with serialize. Unfortunately,
    for those that take on this practice and write poorly though-out code they
    are susceptible to these kinds of security vulnerabilities. Same goes for
    those who chose to continue using things like register_globals (which
    luckicly we have removed), but my point is we can't fix poor mentality. We
    can only educate others on the risks.

    I think Stas proposes a solution to the problem and I think Anthony
    proposes a viable alternative. I would say that Anthony has found the
    shortest distance between the two points (the problem and the solution),
    however.


    On Sat, Mar 30, 2013 at 10:54 PM, Stas Malyshev <smalyshev@sugarcrm.com
    wrote:
    Hi!

    As many probably know, unserialize() has a security issue following from
    the fact that you can create objects with data from unserialize(), and
    these object may have behavior that is invoked automatically - namely
    __destruct - that can result in unintended results. See e.g.
    http://heine.familiedeelstra.com/security/unserialize among others for
    more detailed description.

    So I propose a modification to unserialize():
    https://wiki.php.net/rfc/secure_unserialize

    that would make one of the common cases - serializing data to be stored
    on user side or user-accessible side - more secure by avoiding
    instantiating all object (or all objects not belonging to a whitelist)
    and keeping them as incomplete objects instead.

    Comments and suggestions welcome,
    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227

    --
    PHP Internals - PHP Runtime Development Mailing List
    To unsubscribe, visit: http://www.php.net/unsub.php
  • Stas Malyshev at Mar 31, 2013 at 3:27 am
    Hi!

    I think Stas proposes a solution to the problem and I think Anthony
    proposes a viable alternative. I would say that Anthony has found the
    shortest distance between the two points (the problem and the solution),
    however.
    The fact is that people do use serialize() for data that may be
    accessible by the user (or made accessible by unrelated security issues,
    which may be upgraded in severity by this - e.g. from SQL injection to
    persistent code backdoor on the server). There are many ways to do
    things differently, and they are known. However, as I said, the fact is
    people do use serialize() and may not even realize the data aren't as
    secure as they are. That's why many security tools flag any object with
    dtor in application using unserialize as insecure. This is not a good
    situation, and presently there are no way to avoid it except dropping
    serialize() completely - which may not be an option is some cases and in
    any case would require serious changes to the production code.

    This enhancement is to fix this problem. It is not to change best
    practices or give advice on how to write the most secure system - it is
    to make existing code more secure easily.
    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227
  • Sanford Whiteman at Mar 31, 2013 at 5:05 am

    This is not a good situation, and presently there are no way to
    avoid it except dropping serialize() completely - which may not be
    an option is some cases and in any case would require serious
    changes to the production code.
    And what about automatic un/serialize() of objects in $_SESSION?
    People don't even see those function calls in their code, so dropping
    the function/ality would be a wildly drastic move.

    IMO, there's a minefield of "most surprise" to worry about unless you
    tread gently, as in your suggestion of an extra param. And probably
    want two optional PHP.INI settings: one for when unserialize() is
    called automatically (so you can't pass it anything), and one for when
    unserialize() is called in user code without a second argument but you
    want a default whitelist to be applied (say, to instantly "harden" a
    codebase and sort out consequences later).

    -- S.
  • Stas Malyshev at Mar 31, 2013 at 5:42 am
    Hi!
    And what about automatic un/serialize() of objects in $_SESSION?
    People don't even see those function calls in their code, so dropping
    the function/ality would be a wildly drastic move.
    Nothing about it, the change is for unserialize() function.
    tread gently, as in your suggestion of an extra param. And probably
    want two optional PHP.INI settings: one for when unserialize() is
    As we learned many times in the past, behavior-changing ini settings are
    not a good idea. We have to get away from mentality of "if we need to
    modify some behavior, we just put a variable in global state to control
    it". Global state is the last resort, not the first one. Variables that
    have local influence should have local scope.
    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227
  • Sanford Whiteman at Mar 31, 2013 at 7:59 am

    And what about automatic un/serialize() of objects in $_SESSION?
    People don't even see those function calls in their code, so dropping
    the function/ality would be a wildly drastic move.
    Nothing about it, the change is for unserialize() function.
    OK. I thought of this as one core security issue with multiple
    possible ways of getting a payload to the internal (C) unserialize
    code. (If not, guess I could draw up an RFC for the other vector.)

    It is harder to inject arbitrary objects into session storage than to
    exploit blind-request-variable-unserialization type stuff (though the
    latter can be a stepping stone to the former). But the potential
    payoff in $_SESSION is so huge, I think having "secure unserialize"
    for sessions is fully justified. Otherwise you're saying "I can't
    guarantee objects with killer wakeups/dtors were not injected via one
    of the apps on my server, and I have no way to stop them from
    instantiating magically provided they get through the right way."
    We have to get away from mentality of "if we need to modify some
    behavior, we just put a variable in global state to control it".
    Global state is the last resort, not the first one.
    I guess it could be another argument to session_start() instead.

    -- S.


    P.S. Sure you'll shoot down this idea as well, but I think it would be
    good if Filters had a corresponding validator, such as:

    FILTER_VALIDATE_UNSERIALIZED: detect strings in PHP bytestream format.
    Flags FILTER_ALLOW_SERIALIZED_SCALAR,
    FILTER_ALLOW_SERIALIZED_NONOBJECT to fine-tune.

    Otherwise, if you are still expecting bytestream format from the
    client and want to detect tampering on input, you have to write a
    best-guess regex to try to differentiate between 'Some:free { text;
    }"' and 'O:8:"class":0:{}' and 'S:12...' etc.

    .
  • Anthony Ferrara at Mar 31, 2013 at 12:04 pm
    Stas,


    The fact is that people do use serialize() for data that may be
    accessible by the user (or made accessible by unrelated security issues,
    which may be upgraded in severity by this - e.g. from SQL injection to
    persistent code backdoor on the server). There are many ways to do
    things differently, and they are known. However, as I said, the fact is
    people do use serialize() and may not even realize the data aren't as
    secure as they are. That's why many security tools flag any object with
    dtor in application using unserialize as insecure. This is not a good
    situation, and presently there are no way to avoid it except dropping
    serialize() completely - which may not be an option is some cases and in
    any case would require serious changes to the production code.

    This enhancement is to fix this problem. It is not to change best
    practices or give advice on how to write the most secure system - it is
    to make existing code more secure easily.

    I definitely see your point, and don't disagree with it at all. Again, my
    concern is that people will then be tempted to use serialization to the
    client as it's "safe" (with these modifications). Which I think we should
    discourage for new code....

    So what if we did this: We implement your RFC, but also put a warning in
    the docs that serialize() shouldn't be used in places where a user or third
    party can modify the output (to use json_encode() for those areas). That
    way we're not encouraging serialize to be used in places it shouldn't, but
    also give those with legacy codebases or really awkward use-cases the
    ability to be "more secure"...

    Thoughts?
  • Nikita Popov at Mar 31, 2013 at 6:36 pm

    On Sun, Mar 31, 2013 at 5:27 AM, Stas Malyshev wrote:

    I think Stas proposes a solution to the problem and I think Anthony
    proposes a viable alternative. I would say that Anthony has found the
    shortest distance between the two points (the problem and the solution),
    however.
    The fact is that people do use serialize() for data that may be
    accessible by the user

    Yeah, well, the people who do that are also the ones that are unlikely to
    make use of the new parameters to secure themselves. In order to make them
    use of the new feature they have to be explicitly educated about it and in
    that case we can just as well educate them to use json_encode. In that
    regard, I don't think this proposal is particularly useful.

    JSON and serialize() are (inherently) different serialization formats with
    different use-cases. The former is rather restricted and as such safe to be
    provided by the user. The latter on the other hand aims at exactly
    replicating the structure. Using the latter format for the former task is
    in any case a bad idea. It's not like unserializing objects with dtors is
    the only issue that can turn up. serialize() also supports references and
    object-references, so one could probably use it quite easily to trigger
    some kind of infinite loop/recursion in the application.

    So, I personally don't see much value in this addition. Rather it could
    provide people an excuse to use the function on user-provided data, which
    is, as already mentioned, a bad idea. Even with this additional protection.
    Also I'd like to point out that unserialize() in the past had a relatively
    large number of different security vulnerabilities, so one should really,
    really not trust it with data of unknown origin. Internal classes can quite
    commonly be made to segfault with specially crafted serialization input.
    E.g. the user might think that, hey, DateTime is a safe class, let's allow
    unserializing that. Sounds legit doesn't it? Then let's remember those
    various serialization bugs that were recently fixed in DateTime (or the
    related classes, didn't really look at it). And then we would have a
    potentially exploitable segmentation fault :)

    Nikita
  • Stas Malyshev at Apr 1, 2013 at 7:18 pm
    Hi!
    Yeah, well, the people who do that are also the ones that are unlikely
    to make use of the new parameters to secure themselves. In order to make
    Why? Making use of one parameter is orders of magnitude easier than
    refactoring your whole application to not use serialize() (especially if
    you need object support).
    exactly replicating the structure. Using the latter format for the
    former task is in any case a bad idea. It's not like unserializing
    You completely ignore situations where two could mix - i.e. where you
    could store data that can potentially be not reproducible by JSON and
    still may - due to some or other oversight - be controlled by the user.
    That's why security has layers - so that if you make mistake on one
    layer, other layer would limit the exposure. That's why you don't run
    you webserver as root - even though you could just say "we shouldn't
    have remote execution holes, so it doesn't matter under which user the
    webserver runs". You seem to be under impression that if particular
    technique solves particular problem, there shouldn't be any security
    measures protecting you if you do not use that technique. I think it is
    plain wrong approach to security.
    doesn't it? Then let's remember those various serialization bugs that
    were recently fixed in DateTime (or the related classes, didn't really
    look at it). And then we would have a potentially exploitable
    segmentation fault :)
    I don't see how this has anything to do with anything. There are tons of
    different bugs, and I never claimed this patch protects against all of
    them, so if you think somehow finding a scenario that this change does
    not protect from invalidates it, then by the same logic we should never
    have any security features at all that do not make the application
    perfectly secure in the hands of most ignorant user.
    The point is not to make perfect protection, the point is to make common
    scenario safer and more resistant to common attacks that happen to
    existing applications all the time now.
    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227
  • Steve Clay at Apr 1, 2013 at 7:46 pm

    On 4/1/13 3:18 PM, Stas Malyshev wrote:
    Why? Making use of one parameter is orders of magnitude easier than
    refactoring your whole application to not use serialize() (especially if
    you need object support).
    Under the RFC, unserialize could potentially create __PHP_Incomplete_Class objects
    (including nested ones), so new handling code would be needed to care for these cases and
    I wouldn't describe that as a simple fix depending on how the app wants to deal with these
    cases.

    IMO these projects would be better off creating drop-in replacements for (un)serialize
    that wrap in HMAC to secure the data channel. Trivially done in userland.

    The next best thing would be an unserialize that would simply fail if a non-whitelisted
    class was found.

    Steve Clay
  • Steve Clay at Mar 31, 2013 at 6:46 pm
    I think this RFC would worsen the problem of misusing the serialize round-trip.

    Even if we make the docs clearer, we'd still be sending the message that there's a new
    "safer" unserialize and some would certainly use that new feature to be more lax about
    guarding serialized structures.

    It also sets up a situation where altering either the whitelist or one of the classes in
    the whitelist could open a vulnerability that's not obvious.

    I'm also not convinced that this feature would spur developers to fix insecure code.

    But setting my arguments against, if the goal is to make unserialize() secure, then it
    should behave like a tripwire: fail loudly if a non-whitelisted class object is found. I
    think returning partially-usable values would gives devs more rope to hang themselves with.

    Re the 2nd arg, I'd make only two cases:
    * null is given: default behavior
    * non-null given: cast to array and that's the class whitelist.


    If the overall goal is to make the serialize/unserialize round-trip tamper-proof, we could
    build HMAC right into the API: add secret key args to both functions. No doubt the Suoshin
    patch already uses HMAC during encryption of the session data.

    Steve Clay
  • ALeX at Mar 31, 2013 at 9:18 pm
    JSON and serialize() are (inherently) different serialization formats with different use-cases [...]
    Yes, and json requires that all strings (including the keys) has to be
    valid utf-8, and I'm sure that's not always the case (serialize can
    use binary data in both places).
  • Ángel González at Mar 31, 2013 at 9:28 pm

    On 31/03/13 23:18, ALeX wrote:
    JSON and serialize() are (inherently) different serialization formats with different use-cases [...]
    Yes, and json requires that all strings (including the keys) has to be
    valid utf-8, and I'm sure that's not always the case (serialize can
    use binary data in both places).
    Yes, it is a problem.
    var_dump(json_encode("\xe1 - \xc3\xa1"));
    PHP Warning: json_encode(): Invalid UTF-8 sequence in argument in php
    shell code on line 1
    string(4) "null"
    In a perfect world, all your input is utf-8, but sometimes what you get
    is in a different encoding...
    (and you still want to store it as-it-came in the first layer)
  • Michael Gauthier at Apr 2, 2013 at 6:04 pm

    On 2013-03-31 00:27, Stas Malyshev wrote:
    Hi!

    I think Stas proposes a solution to the problem and I think Anthony
    proposes a viable alternative. I would say that Anthony has found the
    shortest distance between the two points (the problem and the solution),
    however.
    The fact is that people do use serialize() for data that may be
    accessible by the user (or made accessible by unrelated security issues,
    which may be upgraded in severity by this - e.g. from SQL injection to
    persistent code backdoor on the server). There are many ways to do
    things differently, and they are known. However, as I said, the fact is
    people do use serialize() and may not even realize the data aren't as
    secure as they are. That's why many security tools flag any object with
    dtor in application using unserialize as insecure. This is not a good
    situation, and presently there are no way to avoid it except dropping
    serialize() completely - which may not be an option is some cases and in
    any case would require serious changes to the production code.

    This enhancement is to fix this problem. It is not to change best
    practices or give advice on how to write the most secure system - it is
    to make existing code more secure easily.
    An alternative to whitelisting or forbidding unserialization of classes
    is to sign your serialized data. We do this in a couple of places where
    users have access to serialized data (cookies, forms). There's a private
    key on the server and serialized data is encoded with a signature at the
    end. Users cannot directly modify the serialized data because they can't
    resign it without the key. The server rejects invalid signatures before
    unserializing.

    Maybe PHP core could add signed_serialize($data, $key) and
    signed_unserialize($data, $key).

    Cheers,
    Mike

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupphp-internals @
categoriesphp
postedMar 31, '13 at 2:54a
activeApr 2, '13 at 6:04p
posts15
users9
websitephp.net

People

Translate

site design / logo © 2021 Grokbase