Hi,

I would sincerely invite everyone to the resurrected str_size_and_int64
RFC discussion.

I propose the discussion to last for one week as allowed by the voting RFC
because this topic has already been discussed to death previously (any
objections?). As no userland change is introduced, the discussion would
end and the voting would start on May 13th with 50%+1 votes requirement.

https://wiki.php.net/rfc/size_t_and_int64_next

Best regards

Anatol


  • Andrea Faulds at May 6, 2014 at 8:07 am

    On 6 May 2014, at 09:01, Anatol Belski wrote:

    https://wiki.php.net/rfc/size_t_and_int64_next
    Am I reading this incorrectly, or is this suggesting switching to 64-bit types for int(), but only for 64-bit builds? I would prefer to switch to 64-bit on 32-bit platforms as well, for the sake of consistency. PHP code should behave identically on 32-bit and 64-bit systems.

    --
    Andrea Faulds
    http://ajf.me/
  • Derick Rethans at May 6, 2014 at 9:16 am

    On Tue, 6 May 2014, Andrea Faulds wrote:

    On 6 May 2014, at 09:01, Anatol Belski wrote:

    https://wiki.php.net/rfc/size_t_and_int64_next
    Am I reading this incorrectly, or is this suggesting switching to
    64-bit types for int(), but only for 64-bit builds? I would prefer to
    switch to 64-bit on 32-bit platforms as well, for the sake of
    consistency. PHP code should behave identically on 32-bit and 64-bit
    systems.
    I would agree to that. And to counteract "this is going to be slower",
    well, perhaps, but then again, AMD64 processors have been out for more
    than 10 years. And there are even ARM64 processors now, and you'd hardly
    use ARM processors for running PHP at full speed. If you're still on
    32bit, maybe it's time to upgrade.

    cheers,
    Derick

    --
    http://derickrethans.nl | http://xdebug.org
    Like Xdebug? Consider a donation: http://xdebug.org/donate.php
    twitter: @derickr and @xdebug
    Posted with an email client that doesn't mangle email: alpine
  • Anatol Belski at May 6, 2014 at 11:22 am
    Hi Andrea,

    On Tue, May 6, 2014 10:07, Andrea Faulds wrote:
    On 6 May 2014, at 09:01, Anatol Belski wrote:

    Am I reading this incorrectly, or is this suggesting switching to 64-bit
    types for int(), but only for 64-bit builds? I would prefer to switch to
    64-bit on 32-bit platforms as well, for the sake of consistency. PHP code
    should behave identically on 32-bit and 64-bit systems.
    This was not implemented. It would increase the complexity of the patch
    as well as the risk of it failing. You can see this as an intermediate
    stage: without this change, one would have to start over to get int64 on
    32 bit. I have to mention, though, that there's no plan to implement this
    yet; please also note the performance concerns. I can see that more
    people are interested in this; if such a project exists later, I would
    participate in it.

    Regards

    Anatol
  • Pierre Joye at May 6, 2014 at 8:09 am

    On Tue, May 6, 2014 at 10:01 AM, Anatol Belski wrote:
    Hi,

    I would sincerely invite everyone to the resurrected str_size_and_int64
    RFC discussion.

    I propose the discussion to last for one week as allowed by the voting RFC
    because this topic has already been discussed to death previously (any
    objections?). As no userland change is introduced, the discussion would
    end and the voting would start on May 13th with 50%+1 votes requirement.

    https://wiki.php.net/rfc/size_t_and_int64_next
    Before anyone jumps to conclusions: we will obviously work with
    the phpng developers to figure out the best way to get these two
    proposals into core in a smooth and conflict-free way. However, we really
    want to get a decision, as it costs us quite some time to maintain this
    branch outside the main development streams while still working on all
    other parts of PHP.

    --
    Pierre

    @pierrejoye | http://www.libgd.org
  • Derick Rethans at May 6, 2014 at 9:13 am

    On Tue, 6 May 2014, Anatol Belski wrote:

    I would sincerely invite everyone to the resurrected
    str_size_and_int64 RFC discussion.

    I propose the discussion to last for one week as allowed by the voting
    RFC because this topic has already been discussed to death previously
    (any objections?). As no userland change is introduced, the discussion
    would end and the voting would start on May 13th with 50%+1 votes
    requirement.

    https://wiki.php.net/rfc/size_t_and_int64_next
    I see this still says "The usage of long datatype continues on 32 bit
    platforms." Do I understand correctly that on 32bit platforms there
    still won't be a 64bit integer in PHP, and that this "only" makes 64bit
    integers work on the Windows LLP64 model? Or are they still 32bit
    integers there?

    Under "Accepting values with zend_parse_parameters()"

    Old “l”, new “i”: to accept integer argument, the internal var has to be declared as php_int_t (inside PHP) or zend_int_t (inside Zend)
    Old “L”, new “I”: to accept integer argument with range check, the internal var has to be declared as php_int_t (inside PHP) or zend_int_t (inside Zend)

    Using l and I is going to be majorly confusing and incredibly hard to
    spot.

    "'l', 'L', 's', 'p' parameter formats aren't available anymore"

    That means that every zpp call now needs to have an #ifdef around it to
    support post-this-patch and pre-this-patch PHP versions, as well as
    around their variable declarations. I'm afraid that will result in a
    huge mess, so we should look at this again.

    Under "Example on accepting parameters with zpp"

    You have:
      zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "iISP"

    But why is both i and I allowed (or did I misread the Old/New before?)
    If so, then my comment on i, l, I is already valid :-)

    Under "Example on printf specs usage"

    You have:
      php_error_docref(NULL TSRMLS_CC, E_WARNING, "Value '" ZEND_INT_FMT "' is out of range", i0);

    Using that macro there feels really unnatural, and many C developers
    will not bother to figure out that they should use it, of course, because
    they are familiar with "%d" and "%ld".

    And similarly under "Example on printf specs usage (no BC)" for:

      "Value '%pd' is out of range", i0

    The new macros that you use in the example for "char *dup_substr(zval
    *s, zval *i)" also imply that all those accesses need to have a big
    #ifdef around them to support multiple versions. Again, this makes it a
    major pain for extension developers to support pre- and post-patch PHP
    versions.

    I'm afraid with all those concerns, I will have to vote "no".

    cheers,
    Derick

    --
    http://derickrethans.nl | http://xdebug.org
    Like Xdebug? Consider a donation: http://xdebug.org/donate.php
    twitter: @derickr and @xdebug
    Posted with an email client that doesn't mangle email: alpine
  • Anatol Belski at May 6, 2014 at 11:03 am
    Hi Derick,
    On Tue, May 6, 2014 11:13, Derick Rethans wrote:
    On Tue, 6 May 2014, Anatol Belski wrote:

    I would sincerely invite everyone to the resurrected
    str_size_and_int64 RFC discussion.

    I propose the discussion to last for one week as allowed by the voting
    RFC because this topic has already been discussed to death previously
    (any objections?). As no userland change is introduced, the discussion
    would end and the voting would start on May 13th with 50%+1 votes
    requirement.

    https://wiki.php.net/rfc/size_t_and_int64_next
    I see this still says "The usage of long datatype continues on 32 bit
    platforms." — do I understand correctly that on 32bit platforms, there
    won't still be a 64bit integer in PHP and that this "only" makes 64bit
    integers work on the Windows LLP64 model? Or are they still 32bit integers
    there?
    Yes, this adds int64 to the Windows builds and size_t everywhere, and
    hence makes it work consistently on all 64 bit platforms. Regarding int64
    on 32 bit platforms: no, this is not implemented. There are various
    reasons for that, the main one being the complexity. When starting such a
    huge undertaking, the danger of it never being finished is big. You can
    see this as a stage which makes int64 on 32 bit platforms at least
    hypothetically possible. That might not be very visible in the external
    APIs, but internally, using int64 on 32 bit platforms would at least
    double the complexity of the patch itself (which is already not a light
    one). Also, without it we can't move further with the overall
    improvement. And we can't achieve everything in one single patch,
    obviously.
    Under "Accepting values with zend_parse_parameters()"


    “l” “i” to accept integer argument, the internal var has to be declared
    as php_int_t (inside PHP) or zend_int_t (inside Zend) “L” “I” to accept
    integer argument with range check, the internal var has to be declared as
    php_int_t (inside PHP) or zend_int_t (inside Zend)

    Using l and I is going to be major confusing and incredibly hard to
    spot.

    "'l', 'L', 's', 'p' parameter formats aren't available anymore"


    That means that every zpp call now needs to have an #ifdef around it to
    support post-this-patch and pre-this-patch PHP versions, as well as around
    their variable declarations. I'm afraid that will result in a huge mess,
    so we should look at this again.

    Under "Example on accepting parameters with zpp"


    You have:
    zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "iISP"

    But why is both i and I allowed (or did I misread the Old/New before?)
    If so, then my comment on i, l, I is already valid :-)
    The syntax shown there is purely the new syntax. Please take a look at
    the migration stuff here
    http://git.php.net/?p=php-src.git;a=tree;f=compat;hb=refs/heads/str_size_and_int64
    (also mentioned in the RFC). It's still relatively fresh, but it already
    offers some automation and will definitely keep getting better. The
    compat.h header defines ZPP_FMT_COMPAT macros, and there's also the clang
    zpp checker, which will reveal any incompatibility. I think you mistook
    the "big I" for a "small l", which can also happen the other way round :)

    Under "Example on printf specs usage"


    You have:
    php_error_docref(NULL TSRMLS_CC, E_WARNING, "Value '" ZEND_INT_FMT "' is
    out of range", i0);

    Using that macro there feels really unnatural, and many C developers
    will not bother to figure out that they use that, of course ,because they
    are familiar with "%d" and "%l".

    And similarly under "Example on printf specs usage (no BC)" for:


    "Value '%pd' is out of range", i0


    The new macros that you use in the example for "char *dup_substr(zval
    *s, zval *i)" also infer that all those accesses need to have a big
    #ifdef around stuff to support multiple versions. Again, making it a
    major pain for extension developers to support pre- and post-patch PHP
    versions.
    For these two I would mention the migration path as well. ZEND_INT_FMT
    exists exactly for the purpose of printing a zend_int_t, be it 64 or 32
    bit, while taking care of BC. For older PHP versions, compat.h defines it
    to the right format. The %d is kept for pure 32 bit int usage.
    Previously it was %ld, but that can't work anymore in the new code, as it
    would mean different things with consistent int64 support. The %pd
    variant, however, can be used in code that doesn't need BC.

    As for the new macro names: they're all aliased in compat.h for the
    older versions. So the general idea of the migration path is to use the
    new clean syntax for development (like calling a cat a cat) while
    retaining the compat layer for the older versions. Once migrated, that
    can be carried forward.
    I'm afraid with all those concerns, I will have to vote "no".
    Of course that would make me sad. Not only for the work done on that, but
    also for the future prospects and for the number of users who want such
    a thing.

    Cheers

    Anatol
  • Lester Caine at May 6, 2014 at 1:11 pm

    On 06/05/14 11:58, Anatol Belski wrote:
    I'm afraid with all those concerns, I will have to vote "no".
    Of course that would make me sad. Not only for the work done on that, but
    also for the future perspectives and for the quantity of the users who
    wish such thing.
    Having just been going through an exercise to convert Windows XP sites
    to something M$ will allow to run without warning messages, the switch
    to a more modern windows would seem sensible. Except we have been
    restricted to using 32bit builds of windows for compatibility with the
    hardware. While parts of PHP already require 64 bit integers which are
    handled in an even less efficient way, switching to a clean 64 bit
    integer across all platforms really is essential. Having to manage the
    difference in the user code base is simply not an option :(

    The switch to a cleaner implementation base should allow proper planning
    for that requirement?

    --
    Lester Caine - G8HFL
    -----------------------------
    Contact - http://lsces.co.uk/wiki/?page=contact
    L.S.Caine Electronic Services - http://lsces.co.uk
    EnquirySolve - http://enquirysolve.com/
    Model Engineers Digital Workshop - http://medw.co.uk
    Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
  • Anatol Belski at May 7, 2014 at 10:07 pm
    Hi Lester,
    On Tue, May 6, 2014 15:14, Lester Caine wrote:
    On 06/05/14 11:58, Anatol Belski wrote:

    I'm afraid with all those concerns, I will have to vote "no".
    Of course that would make me sad. Not only for the work done on that,
    but also for the future perspectives and for the quantity of the users
    who wish such thing.
    Having just been going through an exercise to convert Windows XP sites
    to something M$ will allow to run without warning messages, the switch to a
    more modern windows would seem sensible. Except we have been restricted to
    using 32bit builds of windows for compatibility with the hardware. While
    parts of PHP already require 64 bit integers which are handled in an even
    less efficient way, switching to a clean 64 bit integer across all
    platforms really is essential. Having to manage the difference in the user
    code base is simply not an option :(

    The switch to a cleaner implementation base should allow proper planning
    for that requirement?
    Absolutely. Furthermore, Windows Server 2012 (and probably any upcoming
    version) is no longer sold in a 32-bit edition. For 32 bit PHP there,
    that means a possible performance penalty. So using true 64 bit PHP
    fully makes sense there.

    Regards

    Anatol
  • Christopher Jones at May 7, 2014 at 5:32 pm

    On 5/6/14, 1:01 AM, Anatol Belski wrote:
    Hi,

    I would sincerely invite everyone to the resurrected str_size_and_int64
    RFC discussion.

    I propose the discussion to last for one week as allowed by the voting RFC
    because this topic has already been discussed to death previously (any
    objections?). As no userland change is introduced, the discussion would
    end and the voting would start on May 13th with 50%+1 votes requirement.

    https://wiki.php.net/rfc/size_t_and_int64_next

    Best regards

    Anatol
    It seems the "shiny" phpng is attracting all the attention and you are
    unlikely to get much discussion: we had so much last time anyway.

    Can you add the proposed voting options to the RFC? Also if you can
    resolve the open issue on dead SAPIs first, that would be great.
    Then let's vote.

    Chris
  • Anatol Belski at May 7, 2014 at 11:32 pm
    Hi Chris,

    On Wed, May 7, 2014 19:32, Christopher Jones wrote:
    On 5/6/14, 1:01 AM, Anatol Belski wrote:

    Hi,


    I would sincerely invite everyone to the resurrected str_size_and_int64
    RFC discussion.


    I propose the discussion to last for one week as allowed by the voting
    RFC
    because this topic has already been discussed to death previously (any
    objections?). As no userland change is introduced, the discussion would
    end and the voting would start on May 13th with 50%+1 votes
    requirement.

    https://wiki.php.net/rfc/size_t_and_int64_next


    Best regards


    Anatol
    It seems the "shiny" phpng is attracting all the attention and you are
    unlikely to get much discussion: we had so much last time anyway.

    Can you add the proposed voting options to the RFC? Also if you can
    resolve the open issue on dead SAPIs first, that would be great. Then lets
    vote.
    Actually, this RFC is even being partly discussed in the parallel phpng
    thread and on IRC :) These two RFCs definitely intersect and are probably
    the most significant changes proposed for PHP next. As the voting rule
    says the discussion "should be at least a week", IMHO let's keep it that
    way (it's not long till the 13th anyway).

    The vote option would be a simple yes/no choice to accept the RFC for the
    next major PHP version; I just added it.

    As for the SAPI RFC, it probably can't be resolved in a couple of days
    (which is why it was separated from the main one), as it involves trying
    all the corresponding servers, mailing the authors and waiting for their
    reaction. It now also looks like this topic concerns phpng as well. Thus
    it makes sense to finish the essential change first. The research and
    the subsequent removal are routine work.

    Best

    Anatol
  • Anatol Belski at May 13, 2014 at 8:51 pm
    Hi,

    As announced previously, the vote starts on May 13th and ends on May 20th.

    https://wiki.php.net/rfc/size_t_and_int64_next#vote

    The RFC is considered approved with 50%+1 acceptance. Happy voting :)

    Best regards

    Anatol
  • Dmitry Stogov at May 13, 2014 at 10:52 pm
    Anatol,

    We discussed your patch in private and I showed you the big penalty it
    incurs...
    I really don't see what you would like to achieve by initiating voting
    right after that. :(

    I've just taken a quick look over your initial patch for phpng at
    https://gist.github.com/weltling/a941d8cf6c731640b51f

    Actually, I would support only one idea from your patch: making IS_LONG
    64-bit on _WIN64.

    Your zend_size_t related changes, in my opinion, make little sense and
    actually do more harm. I recompiled phpng with your patch on Linux
    x86_64 and got the following numbers:

    zend_string size increased from 24 to 32 bytes
    HashTable size increased from 56 to 72 bytes
    zend_op_array size increased from 248 to 264 bytes
    zend_class_entry size increased from 512 to 568 bytes
    size of each opcode sizeof(zend_op) from 48 to 56 bytes

    Anyone may recompile phpng (with and without the patch) and then get
    these numbers by running
    $ gdb sapi/cli/php
    p sizeof(zend_string)
    ...

    Can you imagine memory consumption difference on a large application?
    More memory usage => more CPU cache misses => worse performance.

    And what are the advantages? Strings and class names > 2GB?
    For me it's too high a price for a useless feature.

    -1

    Thanks. Dmitry.




    --
    PHP Internals - PHP Runtime Development Mailing List
    To unsubscribe, visit: http://www.php.net/unsub.php
  • Pierre Joye at May 14, 2014 at 4:45 am
    hi Dmitry.
    On Wed, May 14, 2014 at 12:52 AM, Dmitry Stogov wrote:
    Anatol,

    We discussed your patch in private and I showed you the big penalty it
    makes...
    We discussed how we could best cooperate to get phpng and this patch
    together to ease everyone's work.
    I really, don't see, what do you like to achieve initiating voting right
    after that. :(
    Moving forward, as you rejected the whole idea for no good reason or
    based on numbers that cannot be taken as valid at this stage.
    I've just take a quick look over your initial patch for phpng at
    https://gist.github.com/weltling/a941d8cf6c731640b51f

    Actually, I would support only one idea from your patch - make IS_LONG to
    be 64-bit on _WIN64.
    Again. This patch is about a clean, safe, standard implementation for
    64bit platforms and modern compilers, by far not only for Windows
    (which has one more gain with this patch, portability). It brings a
    lot of guards, as well as safe and clean implementations. This is
    something we should have done a long time ago. PHP is one of the only
    OSS languages I know of still relying on old types and using integers
    for buffer lengths. This is bad practice and a well-known source of
    trouble. To reduce it to only Windows is hardly a good move.
    Your zend_size_t related changes, in my opinion, makes little sense and
    actually makes more harm. I recompiled phpng with your patch on Linux
    x86_64 and got the following numbers:

    zend_string size increased from 24 to 32 bytes
    HashTable size increased from 56 to 72 bytes
    zend_op_array size increased from 248 to 264 bytes
    zend_class_entry size increased from 512 to 568 bytes
    size of each opcode sizeof(zend_op) from 48 to 56 bytes
    These numbers cannot be taken as valid or seriously at this stage.
    Restructuring these structs will certainly reduce the delta. I did not
    have the time to analyze other possible improvements in phpng, but I
    will do it soon, and will also check with other people from the compiler
    team if we can improve it a bit more as well.

    But rejecting a safe and clean 64bit support for a prototype, even
    promising, is wrong, in so many ways. See below.
    Anyone may recompile phpng (with and without patch) and then get these
    number running
    $ gdb sapi/cli/php
    p sizeof(zend_string)
    ...

    Can you imagine memory consumption difference on a large application?
    More memory usage => more CPU cache misses => worse performance.
    I can imagine a lot of things, but at this point phpng is a very good
    prototype with promising results. There is a good base idea and a lot
    of changes in many places improving performance. However, it is very
    far away from being ready; the APIs are very inconsistent and painful to
    use (mix of zend_string and char* usage, renaming or removal which may
    not make sense or make the code way too complicated).
    and what are the advantages? strings and class names > 2GB?
    For me it's too big payment for useless feature.
    I do not think it is that useful for either of us to redo the
    discussions about this patch and its goal. It is a necessary step for
    64bit support. It is also very hard to use a prototype, developed
    privately for months and still in a very early pre-alpha/testing phase,
    as an argument to reject these changes. However, as I told you, it would
    be way better to do both together, for everyone. But you reject the
    idea. We have to move forward to php-next and I do not think we can
    afford to wait months until phpng is remotely usable or stable (at
    least API-wise).

    In any case, if the 64bit patch is accepted we will support you and
    others with phpng as we always do, even before its RFC or proposal;
    this is what I call cooperation and teamwork.

    Cheers.
    Pierre
  • Nikita Popov at May 14, 2014 at 5:30 am

    On Wed, May 14, 2014 at 6:44 AM, Pierre Joye wrote:

    hi Dmitry.
    On Wed, May 14, 2014 at 12:52 AM, Dmitry Stogov wrote:
    Anatol,

    We discussed your patch in private and I showed you the big penalty it
    makes...
    We discussed how we could best cooperate to get phpng and this patch
    together to ease everyone's work.
    I really, don't see, what do you like to achieve initiating voting right
    after that. :(
    Moving forward as you rejected the whole idea for no good reason or
    based on numbers that cannot be taken as valid as this stage.
    Your zend_size_t related changes, in my opinion, makes little sense and
    actually makes more harm. I recompiled phpng with your patch on Linux
    x86_64 and got the following numbers:

    zend_string size increased from 24 to 32 bytes
    HashTable size increased from 56 to 72 bytes
    zend_op_array size increased from 248 to 264 bytes
    zend_class_entry size increased from 512 to 568 bytes
    size of each opcode sizeof(zend_op) from 48 to 56 bytes
    These numbers cannot be taken as valid or seriously at this stage.
    Restructuring these structs will certainly reduce the delta. I did not
    have the time to analyze phpng possible other improvements but I will
    do it soon, and will also check with other people from the compiler
    team if we can improve it a bit more as well.
    Sorry, what did I miss here? Why cannot the phpng numbers be taken as
    "valid"? The very same issue also exists in our current implementation. In
    phpng the relative hit is just larger, because the structures are more
    optimized.

    I think you shouldn't dismiss Dmitry's point just like that. Having support
    for 64 bit integers on Windows and other LLP64 architectures - that's
    great. Making string lengths unsigned - that's great as well. But
    supporting strings larger than 4G or arrays with more than 4 billion
    elements - that does not seem very useful and unlike the other two changes,
    hurts memory usage. I wonder how many people would prefer having lower
    memory usage over having the ability to create arrays with 4 billion
    elements.

    Independently of that: In a lot of the previous discussion people have
    many, many, many times asked that this patch be implemented without all
    those macro renames and zpp changes. I still have a hard time seeing the
    benefit of doing that. The zpp changes also conflict with phpng, because S
    has a different meaning (and imho for no good reason - it could just as
    well stay at s).

    Nikita
  • Terry Ellison at May 14, 2014 at 5:44 am

    On 14/05/14 06:30, Nikita Popov wrote:
    Sorry, what did I miss here? Why cannot the phpng numbers be taken as
    "valid"? The very same issue also exists in our current implementation. In
    phpng the relative hit is just larger, because the structures are more
    optimized.

    I think you shouldn't dismiss Dmitry's point just like that. Having support
    for 64 bit integers on Windows and other LLP64 architectures - that's
    great. Making string lengths unsigned - that's great as well. But
    supporting strings larger than 4G or arrays with more than 4 billion
    elements - that does not seem very useful and unlike the other two changes,
    hurts memory usage. I wonder how many people would prefer having lower
    memory usage over having the ability to create arrays with 4 billion
    elements.

    Independently of that: In a lot of the previous discussion people have
    many, many, many times asked that this patch be implemented without all
    those macros renames and zpp changes. I still have a hard time seeing the
    benefit of doing that. The zpp changes also conflict with phpng, because S
    has a different meaning (and imho for no good reason - it could just as
    well stay at s).

    Nikita
    I don't have a vote on the RFC but I still have to say +1 on this one.
    It makes sense to me to fix stuff that people might need in the next
    5-10 years, but beyond that ???

    IMO, Nikita has summarized the sensible threshold well.

    Regards Terry
  • Pierre Joye at May 14, 2014 at 5:46 am

    On Wed, May 14, 2014 at 7:30 AM, Nikita Popov wrote:

    Sorry, what did I miss here? Why cannot the phpng numbers be taken as
    "valid"?
    "At this stage" is the key part of that sentence.
    The very same issue also exists in our current implementation. In
    phpng the relative hit is just larger, because the structures are more
    optimized.

    I think you shouldn't dismiss Dmitry's point just like that.
    I do not and did not. I would love to see the same from the phpng
    side, but it failed, for no valid reason.
    Having support
    for 64 bit integers on Windows and other LLP64 architectures - that's great.
    Making string lengths unsigned - that's great as well. But supporting
    strings larger than 4G or arrays with more than 4 billion elements
    Very large strings or arrays are only side effects; they do not change
    the goals and benefits of these changes.
    - that
    does not seem very useful and unlike the other two changes, hurts memory
    usage. I wonder how many people would prefer having lower memory usage over
    having the ability to create arrays with 4 billion elements.
    This is a biased argument, you know it, I know it. The key point is
    not the new maximum size of an array or string but the long overdue
    clean and safe 64bit implementation, following well-known good
    practices (as can be seen in almost all other OSS projects out there)
    and standards.
    Independently of that: In a lot of the previous discussion people have many,
    many, many times asked that this patch be implemented without all those
    macros renames and zpp changes. I still have a hard time seeing the benefit
    of doing that. The zpp changes also conflict with phpng, because S has a
    different meaning (and imho for no good reason - it could just as well stay
    at s).
    This can be adapted, this is a detail. It is also why I have tried
    to get phpng and this patch to go along together and get both teams to
    work together. Cooperation in this case will benefit PHP as a whole,
    as more optimization can be achieved while keeping the safe and clean
    implementation.

    As of now, phpng has been worked on for the last months, totally
    privately. And even if it looks promising, it is still not remotely
    ready to be actually proposed. However, that does not prevent you from
    using it to stop other improvements, which have been worked on for
    months, publicly, with continuous tests, status updates, etc. I am not
    sure what is happening here is good for PHP.


    Cheers,
    --
    Pierre

    @pierrejoye | http://www.libgd.org
  • Lester Caine at May 14, 2014 at 8:13 am

    On 14/05/14 06:46, Pierre Joye wrote:
    Independently of that: In a lot of the previous discussion people have many,
    many, many times asked that this patch be implemented without all those
    macros renames and zpp changes. I still have a hard time seeing the benefit
    of doing that. The zpp changes also conflict with phpng, because S has a
    different meaning (and imho for no good reason - it could just as well stay
    at s).
    This can be adapted, this is a details. It is also why I have tried
    to get phpng and this patch along together and get both teams work
    together. Cooperation in this case will be benefit for php as a whole
    as more optimization can be achieve while keeping the safe&clean
    implementation.

    As of now, phpng has been worked on for the last months, totally
    privately. And even if it looks promising it is still not remotely
    ready to be actually proposed. However it does not prevent you to use
    it to stop other improvements, which have been worked on for months,
    publically, with continuous tests, status updates, etc. I am not sure
    what is happening here is good for PHP.
    My personal impression is that phpng is yet another independent port of
    php just like HHVM and the like. These all target a particular area of
    PHP use and may not be suitable for 'home users'. As an alternative base
    for PHPNext it may have a better pedigree and to that end a decision
    needs to be made for the path forward. What seems totally out of place
    here is a vote on something which has no real target yet? Has phpng
    already been accepted as PHPNext? That PHPNext will be 64bit is a given?
    So what is the need for a vote on a 'detail that can be changed'? It's
    the detail elements that need to be agreed on ... not the principle of
    64bit!

    Hopefully there is no plan to backport this to the PHP5 builds?

    --
    Lester Caine - G8HFL
    -----------------------------
    Contact - http://lsces.co.uk/wiki/?page=contact
    L.S.Caine Electronic Services - http://lsces.co.uk
    EnquirySolve - http://enquirysolve.com/
    Model Engineers Digital Workshop - http://medw.co.uk
    Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
  • Ferenc Kovacs at May 14, 2014 at 8:22 am

    On Wed, May 14, 2014 at 10:16 AM, Lester Caine wrote:
    On 14/05/14 06:46, Pierre Joye wrote:

    Independently of that: In a lot of the previous discussion people have
    many,
    many, many times asked that this patch be implemented without all those
    macros renames and zpp changes. I still have a hard time seeing the benefit
    of doing that. The zpp changes also conflict with phpng, because S has a
    different meaning (and imho for no good reason - it could just as well stay
    at s).
    This can be adapted; this is a detail. It is also why I have tried
    to get phpng and this patch to work together and to get both teams
    working together. Cooperation in this case will benefit PHP as a whole,
    as more optimization can be achieved while keeping the safe and clean
    implementation.

    As of now, phpng has been worked on for the last few months, totally
    privately. And even if it looks promising, it is still not remotely
    ready to be actually proposed. However, that does not prevent you from
    using it to stop other improvements, which have been worked on for
    months, publicly, with continuous tests, status updates, etc. I am not
    sure what is happening here is good for PHP.
    My personal impression is that phpng is yet another independent port of
    php just like HHVM and the like. These all target a particular area of PHP
    use and may not be suitable for 'home users'. As an alternative base for
    PHPNext it may have a better pedigree and to that end a decision needs to
    be made for the path forward. What seems totally out of place here is a
    vote on something which has no real target yet? Has phpng already been
    accepted as PHPNext? That PHPNext will be 64bit is a given? So what is the
    need for a vote on a 'detail that can be changed'? It's the detail elements
    that need to be agreed on ... not the principle of 64bit!

    Hopefully there is no plan to backport this to the PHP5 builds?
    Both the phpng and the size_t RFCs are targeting the next major version,
    neither of them is accepted yet, and both would be suitable for
    'home users'.
    But this is all public information, stated in the RFCs and discussed on
    internals@, which you seem to be subscribed to based on your replies to
    the list, so I'm not sure where the confusion comes from.

    --
    Ferenc Kovács
    @Tyr43l - http://tyrael.hu
  • Lester Caine at May 14, 2014 at 8:36 am

    On 14/05/14 09:22, Ferenc Kovacs wrote:
    Hopefully there is no plan to backport this to the PHP5 builds?

    Both the phpng and the size_t RFCs are targeting the next major
    version, neither of them is accepted yet, and both would be
    suitable for 'home users'.
    But this is all public information, stated in the RFCs and discussed
    on internals@, which you seem to be subscribed to based on your replies
    to the list, so I'm not sure where the confusion comes from.
    Since original discussions were being targeted against PHP5 picking out
    where discussions have moved on is not easily obvious. I seem to recall
    this was originally targeted against PHP5?

    Have we reached a point where PHP5 has been 'frozen' and so PHPNext is
    the more detailed target today?

    --
    Lester Caine - G8HFL
    -----------------------------
    Contact - http://lsces.co.uk/wiki/?page=contact
    L.S.Caine Electronic Services - http://lsces.co.uk
    EnquirySolve - http://enquirysolve.com/
    Model Engineers Digital Workshop - http://medw.co.uk
    Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
  • Ferenc Kovacs at May 14, 2014 at 8:46 am

    On Wed, May 14, 2014 at 10:39 AM, Lester Caine wrote:
    On 14/05/14 09:22, Ferenc Kovacs wrote:

    Hopefully there is no plan to backport this to the PHP5 builds?

    Both the phpng and the size_t RFCs are targeting the next major
    version, neither of them is accepted yet, and both would be
    suitable for 'home users'.
    But this is all public information, stated in the RFCs and discussed
    on internals@, which you seem to be subscribed to based on your replies
    to the list, so I'm not sure where the confusion comes from.
    Since original discussions were being targeted against PHP5 picking out
    where discussions have moved on is not easily obvious. I seem to recall
    this was originally targeted against PHP5?

    Have we reached a point where PHP5 has been 'frozen' and so PHPNext is the
    more detailed target today?
    I guess you are referring to the previous size_t RFC, which was
    targeting 5.6 and failed?
    https://wiki.php.net/rfc/size_t_and_int64
    Many of the no voters mentioned that they would support the RFC but not in
    a minor version, so the authors accepted that and are now trying to get it
    into the next major, and here we are atm.

    --
    Ferenc Kovács
    @Tyr43l - http://tyrael.hu
  • Pierre Joye at May 14, 2014 at 8:49 am

    On Wed, May 14, 2014 at 10:46 AM, Ferenc Kovacs wrote:
    On Wed, May 14, 2014 at 10:39 AM, Lester Caine wrote:
    On 14/05/14 09:22, Ferenc Kovacs wrote:

    Hopefully there is no plan to backport this to the PHP5 builds?

    Both the phpng and the size_t RFCs are targeting the next major
    version, neither of them is accepted yet, and both would be
    suitable for 'home users'.
    But this is all public information, stated in the RFCs and discussed
    on internals@, which you seem to be subscribed to based on your replies
    to the list, so I'm not sure where the confusion comes from.
    Since original discussions were being targeted against PHP5 picking out
    where discussions have moved on is not easily obvious. I seem to recall
    this was originally targeted against PHP5?

    Have we reached a point where PHP5 has been 'frozen' and so PHPNext is the
    more detailed target today?
    I guess you are referring to the previous size_t RFC, which was
    targeting 5.6 and failed?
    https://wiki.php.net/rfc/size_t_and_int64
    Many of the no voters mentioned that they would support the RFC but not in
    a minor version, so the authors accepted that and are now trying to get it
    into the next major, and here we are atm.
    Exactly. And we would not have done it if we had known that Zend was
    working on a total revamp. We would rather have focused on getting
    phpng done faster, from a usability and design point of view, while
    adding the necessary steps for safe, clean and actual 64-bit support.
    But we did not know; nobody knew until last week. But now we are
    asked to trash months of work, openness and tests. Sorry, no.
    --
    Pierre

    @pierrejoye | http://www.libgd.org
  • Dmitry Stogov at May 14, 2014 at 8:53 am
    phpng is based on the same sources and 99% compatible.
    We are just changing the base.
    It must be the base for PHPNext, but we didn't start any discussions
    about that.
    We actually have a lot of work to do and spend most of the time doing our best.
    We have no plans to backport it into PHP-5.6.

    PHP has supported 64-bit for ages, and this proposal has nothing in common
    with 64-bit support in general.
    It allows 2GB strings, but can you imagine a web application that needs them?
    However, each big PHP site will have to "pay" for it.

    Thanks. Dmitry.



  • Andrea Faulds at May 14, 2014 at 8:58 am

    On 14 May 2014, at 09:53, Dmitry Stogov wrote:

    PHP has supported 64-bit for ages, and this proposal has nothing in common
    with 64-bit support in general.
    It allows 2GB strings, but can you imagine a web application that needs them?
    However, each big PHP site will have to "pay" for it.

    FWIW, it does concern general 64-bit support. It makes PHP act consistently across all 64-bit platforms.

    Also, switching to size_t for string length does not just mean ‘2GB strings’. It means using the right tool for the job. Even the C standard library uses size_t for string length!

    --
    Andrea Faulds
    http://ajf.me/
  • Andrey Hristov at May 14, 2014 at 5:11 pm

    On 14.05.2014 11:57, Andrea Faulds wrote:
    On 14 May 2014, at 09:53, Dmitry Stogov wrote:

    PHP has supported 64-bit for ages, and this proposal has nothing in common
    with 64-bit support in general.
    It allows 2GB strings, but can you imagine a web application that needs them?
    However, each big PHP site will have to "pay" for it.

    FWIW, it does concern general 64-bit support. It makes PHP act consistently across all 64-bit platforms.

    Also, switching to size_t for string length does not just mean ‘2GB strings’. It means using the right tool for the job. Even the C standard library uses size_t for string length!
    This is purely academic. And the standard library has to support
    everything; it's the standard library. PHP is on its own, and if an
    addition is of little use to most developers/scripts, why the
    heck should it be in, or be the default?
    A good solution is to typedef a php_size_t, leave it as uint32_t, and
    those who need more than 4GB in strings and elements can just
    build with size_t as the definition. Offer the choice, don't force.
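
    A build-time switch of this kind could be sketched as follows. This is only an illustration of the proposal, assuming a C99 compiler; the PHP_WIDE_SIZES flag and the php_size_max() helper are hypothetical names invented here, not anything in the patch or in PHP.

```c
#include <stddef.h>
#include <stdint.h>

/* PHP_WIDE_SIZES is a hypothetical build flag invented for this sketch,
 * e.g. set with -DPHP_WIDE_SIZES on the compiler command line. */
#ifdef PHP_WIDE_SIZES
typedef size_t php_size_t;      /* opt-in: full platform range */
#else
typedef uint32_t php_size_t;    /* default: lengths up to 4 GB - 1 */
#endif

/* Largest length representable under the chosen definition
 * (unsigned conversion of -1 yields the maximum value). */
static php_size_t php_size_max(void)
{
    return (php_size_t)-1;
}
```

    Every structure and function using php_size_t then changes width with the flag, which is also why two differently built binaries would behave differently for very long strings.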


    Andrey
  • Andrea Faulds at May 14, 2014 at 5:13 pm

    On 14 May 2014, at 18:10, Andrey Hristov wrote:

    This is purely academic. And the standard library has to support everything; it's the standard library. PHP is on its own, and if an addition is of little use to most developers/scripts, why the heck should it be in, or be the default?
    A good solution is to typedef a php_size_t, leave it as uint32_t, and those who need more than 4GB in strings and elements can just build with size_t as the definition. Offer the choice, don't force.
    It is not just “purely academic”. Here, let me quote Pierre (in 'Re: [PHP-DEV] [VOTE] [RFC] 64 bit platform improvements for string length and integer’, just now) quoting Anthony:
    This thread has been pointed out to me by a few people. As the
    originator of this patch and concept I feel that I should clarify a
    few points.

    # Rationale

    The reason that I originally started this patch was to clean up and
    standardize the underlying types. This is to introduce predictability,
    portability and type sanity into the engine and entire cphp
    implementation.

    ## Rationale for Int64:

    Without this patch, the size of integers (longs) varies based on which
    compiler you use. This means that even for identical target
    architectures behavior can change with respect to userland code.
    Refactoring this allows for consistent sizes that can be relied upon
    by the programmer. This is an effort to make it a bit easier to rely
    on integer width as a developer.

    And ideally this is a free cost to most implementations, since ints
    are already 64 bits wide, so there is no memory overhead. And
    performance stays the same as well.
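
    The compiler dependence described here can be checked directly. A minimal sketch, assuming a C99 compiler with <stdint.h>; the two helper names are hypothetical and not part of the patch:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bits of the platform's `long`. This is 64 on LP64 targets
 * (typical 64-bit Linux builds) and 32 on LLP64 targets (64-bit
 * Windows), even though both can run on the same CPU family. */
static size_t long_width_bits(void)
{
    return sizeof(long) * CHAR_BIT;
}

/* Width in bits of the fixed-width type: 64 wherever int64_t exists,
 * because exact-width types have no padding bits. */
static size_t int64_width_bits(void)
{
    return sizeof(int64_t) * CHAR_BIT;
}
```

    Code that stores integers in `long` inherits the first answer; code that stores them in a fixed 64-bit type gets the second on every 64-bit platform.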

    ## Rationale for size_t (string lengths):

    This has significant advantages. There are some costs to doing it, but
    they are not as significant as they may appear on the surface. Let's
    dive into it:

    ### It's The Correct Data Type

    The C89 spec indicates in 3.3.3.4 (
    http://port70.net/~nsz/c/c89/rationale/c3.html#size-95t-3-3-3-4 ) that
    the size_t type was created specifically for usage in this context. It
    is always, 100% guaranteed to be able to hold the bounds of every
    possible array element. Strings in C are simply char arrays.
    Therefore, the correct data type to use for string sizes (which really
    are just an offset qualifier) is size_t.

    Additionally, calloc, malloc, etc all expect parameters of type size_t
    for exactly this reason.

    Another good reference on it: http://www.viva64.com/en/a/0050/
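
    As a small illustration of the point about the standard library, here is a sketch in which the length never leaves size_t. The function name dup_with_len is hypothetical; strlen, malloc and memcpy are the standard C functions.

```c
#include <stdlib.h>
#include <string.h>

/* Copy a NUL-terminated string, reporting its length as size_t.
 * strlen() returns size_t and malloc() takes size_t, so no narrowing
 * or widening conversion happens anywhere in between.
 * Returns NULL on allocation failure; the caller frees the result. */
static char *dup_with_len(const char *src, size_t *len_out)
{
    size_t len = strlen(src);
    char *copy = malloc(len + 1);   /* +1 for the terminating NUL */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, len + 1);
    *len_out = len;
    return copy;
}
```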

    ### It's The Secure Data Type

    size_t (and ptrdiff_t) are the only C89 types that are 100% guaranteed
    to be able to hold the size of any possible object that the compiler
    will support. Other types will vary depending on the data model that
    the compiler supports, as the spec only defines minimum widths.

    This is so important that CERT issued a coding standard for it:
    INT01-C ( https://www.securecoding.cert.org/confluence/display/seccode/INT01-C.+Use+rsize_t+or+size_t+for+all+integer+values+representing+the+size+of+an+object
    ).

    One of the reasons is that it's difficult to do overflow checks in a
    portable way. See VU#162289: https://www.kb.cert.org/vuls/id/162289 .
    In there, they recommend using the C99 uintptr_t type, but suggest
    using size_t for platforms that don't have uintptr_t support (and
    since we target C89 for the engine, that's out).

    Apple's Secure Coding Guide's section on Avoiding Integer Overflows
    and Underflows says the same thing:
    https://developer.apple.com/library/mac/documentation/security/conceptual/securecodingguide/Articles/BufferOverflows.html
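
    One common portable form of such an overflow check is sketched below. It assumes C99's SIZE_MAX from <stdint.h> (on a strict C89 compiler, (size_t)-1 gives the same value); checked_size_add is a hypothetical helper name, not a PHP API.

```c
#include <stddef.h>
#include <stdint.h>

/* Add two sizes, detecting wraparound before it happens.
 * Unsigned arithmetic wraps with defined behaviour, so comparing
 * against SIZE_MAX is portable. Returns 0 on success, -1 on overflow. */
static int checked_size_add(size_t a, size_t b, size_t *sum)
{
    if (a > SIZE_MAX - b) {
        return -1;          /* a + b would not fit in size_t */
    }
    *sum = a + b;
    return 0;
}
```

    With a signed int length the equivalent test is harder to write correctly, because signed overflow is undefined behaviour and the compiler may optimize a naive check away.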

    ### About Long Strings

    The fact that changing to size_t allows strings (and arrays) to be >
    4GB is a side effect. A welcome one, but a side effect nonetheless.
    The primary reason to use it is that it's the correct data type, and
    gives you the most safety and security.

    # Response To Concerns Mentioned

    I'll respond here to some of the concerns mentioned in this thread:

    ## size_t uses more memory and will result in more CPU cache misses,
    which will result in worse performance

    Well, size_t will use more memory. No doubt about that.

    But the performance side is more nuanced. And as several benchmarks in
    this thread indicate, there isn't a practical difference. Heck, the
    benchmarks on Windows show an improvement in some cases.

    And there is a reason for that. Since a pointer is a 64 bit data type,
    and an int is a 32 bit data type, any time you add the two, extra CPU
    cycles are needed for the cast. This can be clearly seen by
    analyzing a simple malloc call with an int vs a size_t param. Here's
    the diff:

    < movl $5, -12(%rbp)
    < movl -12(%rbp), %eax
    < cltq
    ---
    > movq $5, -16(%rbp)
    > movq -16(%rbp), %rax

    Now, a cache miss is much more expensive than a cast, but we don't
    have proof that cache misses will actually occur.
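
    C source along the following lines would be expected to give a listing like the one quoted above on x86-64 with optimization disabled. This is a reconstruction for illustration, not the code that was actually compiled, and both function names are hypothetical.

```c
#include <stdlib.h>

/* Length held in a 32-bit int: it must be sign-extended to 64 bits
 * (the cltq instruction) before malloc() receives it as size_t. */
static void *alloc_with_int_len(void)
{
    int len = 5;
    return malloc(len);
}

/* Length held in size_t: already the width malloc() expects,
 * so no conversion instruction is needed. */
static void *alloc_with_size_len(void)
{
    size_t len = 5;
    return malloc(len);
}
```

    Compiling both with something like `cc -O0 -S` and diffing the output is one way to reproduce the comparison; with optimization enabled the constant is usually folded and the difference disappears.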

    In fact, in the benchmarks, the worst difference is 2%. Which is
    hardly significant (as indicated by several people here). But also
    notice that in both benchmarks (those done by Microsoft, and those
    done by Dmitry), some specific tests actually executed **faster** with
    the size_t transforms (namely Hello World, Wordpress, etc). So to say
    even 2% is not really the full story.

    We'll come back to the memory thing in a bit.

    ## Macro Renames and ZPP changes

    This was my idea, and I don't think it's been properly justified.

    ### ZPP Changes

    The ZPP changes are critical. The reason is that varargs is casting an
    arbitrary block of memory to a type, and then writing to it. So
    existing code that does zpp("s", str, &int_len) would wind up with a
    buffer overflow. Because zpp would be trying to write a 64 bit value
    to a 32 bit container. The other 32 bits would fall off the end, into
    who knows what. At BEST this can result in a segfault. At worst,
    memory corruption and MASSIVE security vulnerabilities.

    Also note that the compiler *can't* and actively doesn't catch these
    types of errors. That means that it's largely luck and testing that
    will lead to finding them.

    So, I chose to break BC and rename the ZPP symbols. Because that WILL
    error, and provide the developer with a meaningful indication that an
    improper data type was provided. I considered a fatal error for an
    invalid type a better way of telling the developer
    "HEY, THIS NEEDS TO BE CHANGED ASAP" than just letting
    them hit random segfaults at runtime.

    If there is a way to get around this by giving the compiler more
    information, then do it. But to just leave the types there, and leave
    it to chance if a buffer overflow occurs, is dangerous. Which is why I
    made the call that the ZPP types **needed** to be changed.
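
    The varargs hazard can be sketched with a toy parser. mini_parse below is a hypothetical stand-in written for this illustration; it is not zend_parse_parameters and does not mirror its real signature.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a zpp-style parser, not the real PHP API.
 * For "s" it expects two out-pointers: const char ** and size_t *.
 * va_arg() trusts the caller completely: nothing checks that the
 * pointer really refers to a size_t-sized object. */
static int mini_parse(const char *input, const char *spec, ...)
{
    va_list ap;
    int ok = 0;

    va_start(ap, spec);
    if (spec[0] == 's' && spec[1] == '\0') {
        const char **str = va_arg(ap, const char **);
        size_t *len = va_arg(ap, size_t *);  /* stores sizeof(size_t) bytes */
        *str = input;
        *len = strlen(input);
        ok = 1;
    }
    va_end(ap);
    return ok;
}

/* Correct caller: the length variable matches what the parser writes. */
static size_t parsed_length(const char *input)
{
    const char *str = NULL;
    size_t len = 0;
    /* Passing `&int_len` with `int int_len;` here would still compile,
     * yet on LP64 the parser would store 8 bytes into a 4-byte object:
     * undefined behaviour, typically corrupting a neighbouring variable. */
    if (!mini_parse(input, "s", &str, &len)) {
        return 0;
    }
    return len;
}
```

    Because the variadic arguments carry no type information, renaming the format specifier is one of the few ways to force every call site to be revisited.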

    ### Macro Renames

    The reason for the rename is largely the same as with the ZPP changes.
    The severity of not changing is less (since the compiler will warn and
    do an implicit cast for you). But it's still there. Which is why I
    chose to change it. This is less critical, but was done to better
    indicate to the developer what needs to change to properly support the
    new system.

    ## Memory Overhead

    This is definitely a concern. There is a potential to double the
    amount of memory that PHP takes. Which on the surface looks enormous.
    And if we stop at the surface, we definitely shouldn't do it!

    But as we look deeper, we see that in actuality, the difference is not
    double. In fact, most data structures, as identified by Dmitry
    himself, only increase by between 6% (zend_op_array) and 50%
    (zend_string's size). So that "double" figure quickly drops.
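
    Why the growth varies per structure can be seen with two toy headers. These structs are hypothetical and deliberately much simpler than the real zend_string or zend_op_array; the sketch only shows that widening one field changes the total by an amount that depends on the other fields and on padding.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical string headers for illustration only. */
struct str_hdr_int   { uint32_t refcount; int    len; };
struct str_hdr_sizet { uint32_t refcount; size_t len; };

/* Bytes added by switching the length field to size_t. On a typical
 * LP64 build this includes alignment padding after `refcount`; on a
 * 32-bit build, where size_t is as wide as int, it is usually zero. */
static size_t hdr_growth_bytes(void)
{
    return sizeof(struct str_hdr_sizet) - sizeof(struct str_hdr_int);
}
```

    A large structure with a single length field grows by only a few percent, while a small header dominated by its length field grows far more, which matches the spread of figures quoted above.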

    But that's at the structure level. Let's look at what actually happens
    in practice. Dmitry himself also provides these answers. The average
    memory increase is 8% for Wordpress, and 6% for ZF1.

    Let's put that 8% in context. Wordpress used 12MB, and now it uses
    13MB. 1MB more. That's not overly significant. ZF used 29MB. Now it
    uses 31MB. Still not overly significant.

    Don't get me wrong, it's still more. And more is bad. But it's not
    nearly as bad as it's being played out to be.

    To put this into context, 5.4 saved up to 50% memory from 5.3
    (depending on benchmark). 8 << 50.

    Now, I'm not saying that memory should be thrown around willy-nilly.
    But given the rationale that I gave above, I think the benefits of
    sanity, portability and security clearly are significant enough for
    the relatively small cost in memory.
    --
    Andrea Faulds
    http://ajf.me/
  • Ferenc Kovacs at May 14, 2014 at 5:28 pm
    On Wed, May 14, 2014 at 7:13 PM, Andrea Faulds wrote:
    On 14 May 2014, at 18:10, Andrey Hristov wrote:

    This is purely academic. And the standard library has to support
    everything; it's the standard library. PHP is on its own, and if an
    addition is of little use to most developers/scripts, why the
    heck should it be in, or be the default?
    A good solution is to typedef a php_size_t, leave it as uint32_t, and
    those who need more than 4GB in strings and elements can just build
    with size_t as the definition. Offer the choice, don't force.

    It is not just “purely academic”. Here, let me quote Pierre (in 'Re:
    [PHP-DEV] [VOTE] [RFC] 64 bit platform improvements for string length and
    integer’, just now) quoting Anthony:
    I'm fairly sure that anybody reading this thread will also read the other,
    so there is no reason to copy-paste the mail.
    If you think it is important, you can quote the relevant parts, or link to
    the mail if you want to refer to the whole thing.
    Other than that, I'm not sure that you understood what Andrey wrote: his
    idea would allow use of the size_t types, but it would be a build-time
    opt-in feature.
    Personally I don't like the idea, as we would have to support both types,
    and as this also affects userland (max length of strings, integer size
    on Windows, etc.) it would be a heavy debugging/maintenance burden.

    --
    Ferenc Kovács
    @Tyr43l - http://tyrael.hu
  • Andrey Hristov at May 14, 2014 at 5:28 pm

    On 14.05.2014 20:13, Andrea Faulds wrote:
    On 14 May 2014, at 18:10, Andrey Hristov wrote:

    This is purely academic. And the standard library has to support
    everything; it's the standard library. PHP is on its own, and if an
    addition is of little use to most developers/scripts, why
    the heck should it be in, or be the default?
    A good solution is to typedef a php_size_t, leave it as uint32_t, and
    those who need more than 4GB in strings and elements can
    just build with size_t as the definition. Offer the choice, don't force.
    It is not just “purely academic”. Here, let me quote Pierre (in 'Re:
    [PHP-DEV] [VOTE] [RFC] 64 bit platform improvements for string length
    and integer’, just now) quoting Anthony:
    PHP is not a general-purpose library for writing applications; it's an
    environment on its own with its own specifics. For a general-purpose
    library size_t is the only way to go. Using size_t for a C-based chat
    application that exchanges 140-byte messages is not needed. The
    MySQL C/S protocol uses length encoding to lower the memory usage.
    The API can take size_t, but increasing the size of a very core and very
    often allocated structure is rarely a good thing. Those who want to be
    fully compliant can build and stay with php_size_t being an alias of size_t,
    but those who don't need it (think big installations) have the choice
    not to bloat their machines.

    Choice...

    Andrey
  • Pierre Joye at May 14, 2014 at 5:41 pm

    On May 14, 2014 7:29 PM, "Andrey Hristov" wrote:
    On 14.05.2014 20:13, Andrea Faulds wrote:


    On 14 May 2014, at 18:10, Andrey Hristov <php@hristov.com> wrote:
    PHP is not general purpose library for writing applications, it's an
    environment on its own with its own specifics.

    Wrong. It is a web programming language providing consistent behavior and
    a safe implementation in all supported environments.
  • Pierre Joye at May 14, 2014 at 9:01 am

    On Wed, May 14, 2014 at 10:53 AM, Dmitry Stogov wrote:
    phpng is based on the same sources and 99% compatible.
    You surely meant 99% incompatible right?

    It changes literally every single line of code related to hashtable,
    zval or related areas. It adds dozens of possible issues that are not
    easy to catch, because the types change for numerous widely used APIs,
    besides being inconsistent (yet, that's on the todos).


    We are just changing the basement.
    it must be the basement for PHPNext, but we didn't start any discussions
    about that.
    No, it is one task for php-next, one upcoming RFC (well, more in a few
    months than next week).
    We actually have a lot of work to do and spend most the time doing our best.
    We have no plans to backport it into PHP-5.6.
    So we do, and we do it now for phpng as well. As coop is the only way to go.

    PHP supports 64bit for ages, and this proposal has nothing common with
    64bit support in general.
    This statement is wrong, you know it. I may re post common good
    practices and recommendations for actual 64bit support but that will
    be doubled.
    It allows 2GB strings, but do you imagine a web application that need them?
    As I already said numerous times, 2GB+ is a side effect, not a goal.
    However, each big PHP site will have to "pay" for it.
    and how many times do we have to pay for Zend total lack of clue about
    open processes and communications?


    Cheers,
    --
    Pierre

    @pierrejoye | http://www.libgd.org
  • Stas Malyshev at May 14, 2014 at 6:20 pm
    Hi!
    and how many times do we have to pay for Zend total lack of clue about
    open processes and communications?
    Really, this just derails the discussion and does not help anything. Can
    we please stay in the bounds of professional conduct here?

    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227
  • Andrey Hristov at May 14, 2014 at 6:35 pm

    On 14.05.2014 12:01, Pierre Joye wrote:
    On Wed, May 14, 2014 at 10:53 AM, Dmitry Stogov wrote:
    phpng is based on the same sources and 99% compatible.
    You surely meant 99% incompatible right?

    It changes literally every single line of code related to hashtable,
    zval or related areas. It adds dozen of possible issues not easily
    catch-able because of types change for numerous widely used APIs,
    besides being inconsistent (yet, that's on the todos).


    We are just changing the basement.
    it must be the basement for PHPNext, but we didn't start any discussions
    about that.
    No, it is one task for php-next, one upcoming RFC (well, more in a few
    months than next week).
    We actually have a lot of work to do and spend most the time doing our best.
    We have no plans to backport it into PHP-5.6.
    So we do, and we do it now for phpng as well. As coop is the only way to go.

    PHP supports 64bit for ages, and this proposal has nothing common with
    64bit support in general.
    This statement is wrong, you know it. I may re post common good
    practices and recommendations for actual 64bit support but that will
    be doubled.
    It allows 2GB strings, but do you imagine a web application that need them?
    As I already said numerous times, 2GB+ is a side effect, not a goal.
    However, each big PHP site will have to "pay" for it.
    and how many times do we have to pay for Zend total lack of clue about
    open processes and communications?
    Pierre, you are getting things wrong. Changing the types is a
    straightforward thing to do, so it can be done in an open branch. What
    the Zend guys did was a try-and-see approach with their ideas. If the
    results had been bad, you would probably never have heard of phpng.
    This kind of thing is sometimes not open from the very beginning, to
    limit the noise.


    I hope I'm wrong, but you are probably hurt because the size_t patch did
    not get accepted and somehow phpng now gets a lot of attention.
    Cheers,
    Andrey
  • Stas Malyshev at May 14, 2014 at 6:16 pm
    Hi!
    This is a biased argument, you know it, I know it. The key point is
    not about the new maximum size of an array or string but the long due
    clean and safe 64bit implementation, following well known good
    practice (can be seen in almost all other OSS projects out there) and
    standards.
    I personally am still confused about why clean and safe 64-bit
    implementation requires 64-bit string lengths.
    As of now, phpng has been worked on for the last months, totally
    privately. And even if it looks promising it is still not remotely
    ready to be actually proposed. However, that does not prevent you from
    using it to stop other improvements, which have been worked on for
    months, publicly, with continuous tests, status updates, etc. I am not
    sure what is happening here is good for PHP.
    Nobody is proposing to stop other improvements. What people (including
    myself) are concerned about is that change to 64-bit strings, due to the
    increase in memory usage, may negate the very perceivable benefits of
    the phpng and impact performance, all while not being beneficial for 99%
    of the users. This is something that can not be fixed by saying "phpng
    is not production quality yet". It is not, but we have an issue here
    that does not depend on quality of phpng and we need to find a
    resolution for it. Maybe if we have types cleaned up and
    compartmentalized, we could have it working with both options and then
    make a decision based on later performance tests?

    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227
  • Zeev Suraski at May 14, 2014 at 7:52 am
    Well put Nikita!

    Guys - we're in a bit of a ridiculous situation where the key low-level
    engine maintainers are saying this patch is unacceptable, yet it may pass
    due to the low level of overall interest and the lack of special rules to
    govern low-level changes like that (where with all due respect, I think the
    main maintainers of the engine should get much stronger power).

    The patch the way it is now should be discarded; All the macro changes, as
    well as any data structure change except for IS_LONG should be removed.

    Given that the current RFC doesn't thoroughly represent the performance hit
    (in terms of memory footprint, as well as resulting performance hit -
    especially when using phpng), I recommend the following:

    * Add the relevant performance feedback from Dmitry to the RFC, as well as
    his concerns as the chief performance guy php.net has
    * Provide an option for people to vote 'yes' for the IS_LONG size part only

    If the authors of the RFC object, my request from everyone who has voting
    rights here is to vote 'no' and we can create a separate RFC for the IS_LONG
    part only. Forcing this change against the explicit concerns of Dmitry and
    Nikita, who worked their rear ends off to squeeze every bit of performance
    in phpng (along with Xinchen, and now others joining in) - is, well,
    ridiculous IMHO.

    Zeev
    Sorry, what did I miss here? Why cannot the phpng numbers be taken as
    "valid"? The very same issue also exists in our current implementation.
    In phpng the relative hit is just larger, because the structures are
    more optimized.

    I think you shouldn't dismiss Dmitry's point just like that. Having
    support for 64 bit integers on Windows and other LLP64 architectures -
    that's great. Making string lengths unsigned - that's great as well.
    But supporting strings larger than 4G or arrays with more than 4
    billion elements - that does not seem very useful and, unlike the other
    two changes, hurts memory usage. I wonder how many people would prefer
    having lower memory usage over having the ability to create arrays
    with 4 billion elements.
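
    As a small illustration of the LLP64 point: the width of plain long is
    decided by the platform's data model, while a fixed-width type is not.
    The helper names below are hypothetical, and this is only a sketch of
    the underlying C behaviour, not code from the patch.

```c
#include <limits.h>
#include <stdint.h>

/* Width in bits of plain `long` on whatever platform compiles this.
 * Under LLP64 (64-bit Windows) this is 32, under LP64 it is 64. */
int long_bits(void)
{
    return (int)(sizeof(long) * CHAR_BIT);
}

/* Width in bits of a fixed-width 64-bit integer: the same everywhere. */
int fixed_bits(void)
{
    return (int)(sizeof(int64_t) * CHAR_BIT);
}
```

    An integer type built on long therefore differs between 64-bit Windows
    and 64-bit Linux, which is the inconsistency the 64-bit integer part
    of the patch is meant to remove.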

    Independently of that: In a lot of the previous discussion people have
    many, many, many times asked that this patch be implemented without all
    those macros renames and zpp changes. I still have a hard time seeing
    the benefit of doing that. The zpp changes also conflict with phpng,
    because S has a different meaning (and imho for no good reason - it
    could just as well stay at s).

    Nikita
  • Christian Stoller at May 14, 2014 at 8:08 am

    I am not a developer, but I think the approach of the phpng developers is not fair. The 64 bit topic and its RFC have been worked on and discussed for weeks or months, and now there's suddenly phpng and all the former work should be thrown away?

    I have not followed the whole discussion about 64 bit/size_t, etc, so I just want to ask if you, Nikita, Dmitry or Zeev have mentioned your view and the issues before?

    Christian
  • Ferenc Kovacs at May 14, 2014 at 8:19 am

    On Wed, May 14, 2014 at 10:08 AM, Christian Stoller wrote:

    I am not a developer, but I think the approach of the phpng developers is
    not fair. The 64 bit topic and its RFC have been worked on and discussed
    for weeks or months, and now there's suddenly phpng and all the former
    work should be thrown away?

    I have not followed the whole discussion about 64 bit/size_t, etc, so I
    just want to ask if you, Nikita, Dmitry or Zeev have mentioned your view
    and the issues before?

    Christian
    Yeah, I think it would have been better to announce/share the phpng work a
    bit sooner, but I think there is more than that in the current arguments:
    the two RFCs have a conflict of interest. phpng is focused on
    performance improvements through smaller memory usage, while the size_t
    patch increases memory usage, and the changes in phpng made the impact
    bigger. So if we include both of them in their current form, the best
    scenario we can end up with is clearing up the type system and
    introducing phpng without any visible performance impact.
    Of course this is only my understanding of the situation.

    --
    Ferenc Kovács
    @Tyr43l - http://tyrael.hu
  • Dmitry Stogov at May 14, 2014 at 8:21 am
    Yes. We discussed that patch with Pierre for hours, and I always said
    that I was afraid of the memory consumption overhead.
    My tests showed it clearly.

    phpng was started as a closed project, because we didn't know whether
    we would be able to succeed at all (it's not the first attempt) and we
    wanted to move fast. Once we got useful results, we opened it.
    See https://wiki.php.net/phpng#performance_evaluation

    Thanks. Dmitry.

  • Pierre Joye at May 14, 2014 at 8:46 am

    On Wed, May 14, 2014 at 10:21 AM, Dmitry Stogov wrote:
    yes. We discussed that patch with Pierre for hours,
    one hour at best, and the memory consumption impact was minimal when we
    benchmarked it using 5.6 as the base. It can still be minimal using
    phpng, as we have not yet done the necessary work there, obviously
    (remember? you guys announced it last week).
    and I always told that
    I afraid about memory consumption overhead.
    my tests showed it clearly

    phpng was started as closed project, because we didn't know if we'll able
    to succeed at all (it's not a first attempt) and we liked to move fast.
    That's your mistake. We are an open source project and ideas must be
    (I repeat, must be) discussed early instead of very late. It works for
    every other healthy OSS project and I totally fail to see why Zend
    and related cannot do it in a community friendly way.


    Cheers,
    --
    Pierre

    @pierrejoye | http://www.libgd.org
  • Lester Caine at May 14, 2014 at 9:10 am

    On 14/05/14 09:21, Dmitry Stogov wrote:
    yes. We discussed that patch with Pierre for hours, and I always told that
    I afraid about memory consumption overhead.
    my tests showed it clearly

    phpng was started as closed project, because we didn't know if we'll able
    to succeed at all (it's not a first attempt) and we liked to move fast.
    Once, we got useful results we opened it.
    see https://wiki.php.net/phpng#performance_evaluation
    Dmitry
    As all of my systems are based on Firebird, I can't currently test even
    if I did have the time. One area that I have problems with is the fact
    that 64 bit numbers are a key element of SEQUENCE/GENERATOR values which
    PHP does not support well and which I'd like to see fixed, but I can
    understand your concern over 64bit strings increasing memory
    consumption. This is why I think that lumping several 64bit related
    items together does not make sense? Looking at this from the hardware
    side, some 64bit processes will be faster on a 64bit machine, but also
    supporting 32bit builds simply adds to the overheads. I'd prefer to see
    better coverage of benchmarking since I don't think limited testing
    gives the whole picture? But I don't see 64/32 changes having a
    substantial effect. Do we need 64bit long strings - not generally - but
    the facility should be covered properly while still retaining 32bit
    operations?

    --
    Lester Caine - G8HFL
    -----------------------------
    Contact - http://lsces.co.uk/wiki/?page=contact
    L.S.Caine Electronic Services - http://lsces.co.uk
    EnquirySolve - http://enquirysolve.com/
    Model Engineers Digital Workshop - http://medw.co.uk
    Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
  • Dmitry Stogov at May 14, 2014 at 9:24 am
    Ah, you are on Windows and lack 64-bit IS_LONG.
    This is the part of the patch that should be accepted.
    I mentioned it in the original email.

    The "bad" thing that this patch did is that it changed all C data
    structures to use 64-bit string lengths, which means that each such
    data structure takes more memory. Even zend_op becomes bigger, and as
    it's used for the VM byte-code representation you may just multiply
    the difference by the number of opcodes in the application (that might
    be millions).
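
    A rough sketch of that multiplication, using two made-up record
    layouts. These are NOT the real zend_op fields; op_narrow, op_wide and
    extra_bytes() are hypothetical names chosen only to show how widening
    a length field scales with the number of records.

```c
#include <stddef.h>

/* Hypothetical record with 32-bit lengths (not the real zend_op). */
struct op_narrow {
    void *handler;
    unsigned int a_len;
    unsigned int b_len;
};

/* The same hypothetical record after widening the lengths to size_t. */
struct op_wide {
    void *handler;
    size_t a_len;
    size_t b_len;
};

/* Extra memory needed for `opcode_count` records after the widening. */
size_t extra_bytes(size_t opcode_count)
{
    return opcode_count * (sizeof(struct op_wide) - sizeof(struct op_narrow));
}
```

    On a typical LP64 build the two layouts would differ by 8 bytes per
    record, so a million records would cost roughly 8 MB more; on a 32-bit
    build the two layouts are the same size and the difference vanishes.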

    Unfortunately, phpng doesn't support Firebird yet and it's not on our
    priority list.

    Thanks. Dmitry.

  • Lester Caine at May 14, 2014 at 7:36 pm

    On 14/05/14 10:24, Dmitry Stogov wrote:
    Ah, you are on windows and lack 64-bit IS_LONG.
    This is the part of the patch that should be accepted.
    I mentioned it on original email.
    Not used windows for a number of years. 64 bit builds on Linux ...
    The "bad" thing that this patch did is that it changed all C data
    structures to use 64-bit string lengths, which means that each such
    data structure takes more memory. Even zend_op becomes bigger, and as
    it's used for the VM byte-code representation you may just multiply
    the difference by the number of opcodes in the application (that might
    be millions).
    Actually I do agree that this may not be ideal ...
    Unfortunately, phpng don't support firebird yet and it's not in our
    priority list.
    Then there is no way that I can get involved at the present time :)

    --
    Lester Caine - G8HFL
    -----------------------------
    Contact - http://lsces.co.uk/wiki/?page=contact
    L.S.Caine Electronic Services - http://lsces.co.uk
    EnquirySolve - http://enquirysolve.com/
    Model Engineers Digital Workshop - http://medw.co.uk
    Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
  • Dmitry Stogov at May 14, 2014 at 7:48 pm

    On Wed, May 14, 2014 at 11:39 PM, Lester Caine wrote:
    On 14/05/14 10:24, Dmitry Stogov wrote:

    Ah, you are on windows and lack 64-bit IS_LONG.
    This is the part of the patch that should be accepted.
    I mentioned it on original email.
    Not used windows for a number of years. 64 bit builds on Linux ...

    then you already have 64-bit long, or do I miss something?


    The "bad" thing that this patch did is that it changed all C data
    structures to use 64-bit string lengths, which means that each such
    data structure takes more memory. Even zend_op becomes bigger, and as
    it's used for the VM byte-code representation you may just multiply
    the difference by the number of opcodes in the application (that might
    be millions).
    Actually I do agree that this may not be ideal ...


    Unfortunately, phpng don't support firebird yet and it's not in our
    priority list.
    Then there is no way that I can get involved at the present time :)

    Maybe you could help in porting ext/interbase and ext/pdo_firebird :)

    I really don't know a lot about Firebird, and I'm afraid even proper
    configuration might take us significant time.

    Thanks. Dmitry.


  • Lester Caine at May 14, 2014 at 8:38 pm

    On 14/05/14 20:48, Dmitry Stogov wrote:
    Not used windows for a number of years. 64 bit builds on Linux ...
    then you already have 64-bit long, or do I miss something?
    BIGINT is not handled as a simple integer? But I do need a consistent
    result if users are on windows which is an area where inconsistency has
    been causing problems in the past.
    Unfortunately, phpng don't support firebird yet and it's not in our
    priority list.

    Then there is no way that I can get involved at the present time :)
    may be help in porting ext/interbase and ext/pdo_firebird :)

    I really don't know a lot about Firebird and afraid even proper
    configuration might take us significant time.
    Looks like you don't support Postgres either yet?
    MySQL is the worst possible choice for any database application, so many
    of us will not be able to follow you until these are supported, and
    consistent handling of long strings/blobs and large integers is part of
    the problem here.

    While I'm trying to keep in touch with these new developments, until
    I've managed to get all of my client base off PHP5.2 and up to 5.4 I
    don't have time to spend on extra work. :(

    ( God I HATE the navigation crap on the top of the manual pages - this
    should be a PROPER hierarchy of the page you are looking at not random
    stuff you visited )

    --
    Lester Caine - G8HFL
    -----------------------------
    Contact - http://lsces.co.uk/wiki/?page=contact
    L.S.Caine Electronic Services - http://lsces.co.uk
    EnquirySolve - http://enquirysolve.com/
    Model Engineers Digital Workshop - http://medw.co.uk
    Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
  • Anatol Belski at May 17, 2014 at 11:00 am
    Hi,

    from the previous episodes

    Dmitry
    http://grokbase.com/p/php/php-internals/141qd2k0ct/php-dev-rfc-64-bit-platform-improvements-for-string-length-and-integer
    "After some thoughts I think that usage of "size_t" is a good thing for
    the future support of X32 ABI."

    Stas
    http://grokbase.com/p/php/php-internals/1421be3ywy/php-dev-vote-64-bit-platform-improvements-for-string-length-and-integer
    " I can't speak for others but I can speak for myself in saying
    that these concerns are not just kicking the can down the road in hope
    that eventually this patch dies off. "

    Nikita:
    http://www.serverphorums.com/read.php?7,862033,862730#msg-862730
    "I fully support merging this into master, as the first change for PHP
    6."

    That's pretty sad, guys. Applause for playing me for a fool (well, probably
    not only me). On the other hand, I'd be honored to be as stupid as those
    old farts from ANSI who wrote the specs.

    Why can't you just work together instead of looking for a hair in the soup?

    Best regards

    Anatol

    On Wed, May 14, 2014 21:48, Dmitry Stogov wrote:
    On Wed, May 14, 2014 at 11:39 PM, Lester Caine wrote:

    On 14/05/14 10:24, Dmitry Stogov wrote:

    Ah, you are on windows and lack 64-bit IS_LONG.
    This is the part of the patch that should be accepted.
    I mentioned it on original email.
    Not used windows for a number of years. 64 bit builds on Linux ...

    then you already have 64-bit long, or do I miss something?


    The "bad" thing that this patch did is that it changed all C data
    structures to use 64-bit string lengths, which means that each such data
    structure would take more memory. Even zend_op becomes bigger, and as it's
    used for the VM byte-code representation you may just multiply the
    difference by the number of opcodes in the application (that might be
    millions).
    Actually I do agree that this may not be ideal ...
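
A minimal C sketch of the memory argument above, under stated assumptions: `op_narrow`, `op_wide` and `extra_bytes` are hypothetical names invented for illustration and do not reflect the real zend_op layout. It only shows how widening length fields grows a structure, and how that difference scales with the number of records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with two 32-bit lengths (NOT the real zend_op). */
struct op_narrow {
    uint32_t name_len;
    uint32_t value_len;
    void *handler;
};

/* The same hypothetical record with both lengths widened to 64 bits. */
struct op_wide {
    uint64_t name_len;
    uint64_t value_len;
    void *handler;
};

/* Compile-time check that widening really grows this particular layout. */
static_assert(sizeof(struct op_wide) > sizeof(struct op_narrow),
              "widening the length fields should grow the record");

/* Extra bytes needed when `count` records move from narrow to wide. */
size_t extra_bytes(size_t count)
{
    return count * (sizeof(struct op_wide) - sizeof(struct op_narrow));
}
```

On a typical 64-bit Linux build these two structs would be 16 and 24 bytes, so a million records would cost roughly 8 MB more. Alignment padding matters here: a struct holding one pointer and a single 32-bit length is usually already padded to 16 bytes on such a build, so widening that one field may cost nothing.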



    Unfortunately, phpng doesn't support firebird yet and it's not on our
    priority list.
    Then there is no way that I can get involved at the present time :)

    maybe help in porting ext/interbase and ext/pdo_firebird :)

    I really don't know a lot about Firebird and am afraid even proper
    configuration might take us significant time.

    Thanks. Dmitry.





    --
    PHP Internals - PHP Runtime Development Mailing List
    To unsubscribe, visit: http://www.php.net/unsub.php

  • Zeev Suraski at May 17, 2014 at 11:16 am
    For the benefit of everyone, this is all from January. Dmitry's,
    Stas's and Nikita's positions in the actual patch in question can be
    found on this thread.

  • Pierre Joye at May 17, 2014 at 12:02 pm

    On Sat, May 17, 2014 at 1:15 PM, Zeev Suraski wrote:
    For the benefit of everyone, this is all from January. Dmitry's,
    Stas's and Nikita's positions in the actual patch in question can be
    found on this thread.
    Stas's position is not settled.

    And in the meantime, from somewhere after January until last week, you
    guys worked on a patch privately and did not consider for a single second
    that it could be a good thing for PHP to inform others about your work,
    in progress or not. The memory increase is minimal against 5.6 (almost
    within the measurement error margin) and we do not have any numbers against
    phpng, and won't have for months, until the actual refactoring and
    improvements are done. So excuse me Zeev, but this is not my vision
    for the PHP core, and will never be. We, even as a team working under
    a company umbrella, always follow community rules and try to cooperate
    as much as we can, even working on things which are by no means a
    priority for us. I would love to see and hear the same from you.

    Now. I am not saying that the changes in the core types will not
    affect their respective sizes; they will, obviously (maybe less than
    expected if we take a closer look at them). However, the work on phpng
    is at an early stage: the conceptual ideas have been proven good, but the
    implementation is not yet finished and we have room for other
    improvements. That makes the final impact of the 64bit patch, from 5.x
    to 6.x, much less important than it looks now.

    Cheers,
    --
    Pierre

    @pierrejoye | http://www.libgd.or
  • Johannes Schlüter at May 19, 2014 at 12:21 pm

    On Sat, 2014-05-17 at 14:15 +0300, Zeev Suraski wrote:
    For the benefit of everyone, this is all from January. Dmitry's,
    Stas's and Nikita's positions in the actual patch in question can be
    found on this thread.
    It should be known that I usually won't support Pierre, but the way this
    discussion goes is really bad. There was a long-debated and quite openly
    developed patch. When it was initially proposed, from my perspective the
    result was "this might be good, timing is bad, we can't put it in 5.6".
    Then some who participated in that debate had an idea and implemented it,
    ignoring the existence of the publicly known work while working on
    something contrary, and none found the time to (publicly) discuss this.
    This is really bad.

    On the other side the attempt to push the size_t/int64 thing now looks
    like an attempt to push it through now as a reaction, more for
    "political" than technical reasons. This is bad, too. (While I
    understand how ridiculous it is to first argue "now it is too late for
    5.6" and then "oh, now it is too early for PHP.Next" as I'm essentially
    doing)

    What we as a community (both (engine) maintainers as well as user
    communities) have to decide is how much performance loss we are willing
    to accept for "clean" and "quality" code and how to find the balance.
    Maybe an approach might be not to use a single zend_string type everywhere
    but to have different types, like zend_small_string for usage in opcode
    members (class names, function names, ... whose lengths would almost
    certainly fit into an (unsigned) char and which are for the most part
    engine internal, so places for extra range checks should be minimal).
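
To make the small-string idea above concrete, here is a minimal C sketch. Everything in it is an assumption for illustration: `small_string`, `SMALL_STRING_MAX` and `small_string_init` are hypothetical names, not part of the Zend Engine, and the fixed inline buffer is just the simplest layout that lets the length live in one `unsigned char`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define SMALL_STRING_MAX 255

/* The maximum length must be representable in the 1-byte length field. */
static_assert(SMALL_STRING_MAX <= UCHAR_MAX,
              "length must fit in an unsigned char");

/* Hypothetical short-string type for engine-internal names. */
typedef struct {
    unsigned char len;              /* string length, 0..SMALL_STRING_MAX */
    char val[SMALL_STRING_MAX + 1]; /* bytes plus terminating NUL */
} small_string;

/*
 * Copies src into dst. This is the one range check the approach needs:
 * returns 0 on success, -1 if src_len does not fit into the small type.
 */
int small_string_init(small_string *dst, const char *src, size_t src_len)
{
    if (src_len > SMALL_STRING_MAX) {
        return -1;
    }
    memcpy(dst->val, src, src_len);
    dst->val[src_len] = '\0';
    dst->len = (unsigned char)src_len;
    return 0;
}
```

A caller would fall back to the general string type whenever `small_string_init` returns -1, which keeps the range check in one place.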

    So please, let's try to avoid confrontation and let's try to define our
    common goals and then find a compromise for the technical questions.

    johannes
  • Dmitry Stogov at May 19, 2014 at 1:31 pm
    We had a long discussion with Pierre on IRC and probably came to some
    agreement.
    However, we always talk like a blind man with a deaf one, so I can't be
    completely sure :)

    We agreed to make 64-bit IS_LONG and 64-bit string length on all 64-bit
    systems, but don't change other core data structures (as Nikita suggested).

    The main idea of the changes on top of phpng may be seen at
    https://gist.github.com/dstogov/07fcbb60b1b585bcd290

    I checked the patch on 64-bit Linux. The memory consumption grows, but not
    so dramatically. On the wordpress home page without opcache,
    memory_get_peak_usage() showed 11043920 bytes instead of 10959456, which
    means ~0.8%, which we can probably afford. Note that the original patch on
    the same test showed a ~8% memory consumption increase on master.
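
The ~0.8% figure can be rechecked from the two byte counts quoted above. The helper below is a small sketch for that arithmetic only; `percent_increase` is a hypothetical name, not a PHP or Zend API.

```c
#include <assert.h>

/*
 * Relative growth from `before` to `after`, in percent.
 * `before` must be non-zero.
 */
double percent_increase(unsigned long long before, unsigned long long after)
{
    assert(before != 0);
    return ((double)after - (double)before) * 100.0 / (double)before;
}
```

Called as `percent_increase(10959456ULL, 11043920ULL)`, this gives a difference of 84464 bytes over 10959456, or about 0.77%, which rounds to the ~0.8% quoted.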

    To integrate the patch into phpng we have to re-check and re-change every
    place where zend_long, zend_ulong and string length are involved.

    Pierre, if you agree with this proposal just change the RFC accordingly,
    and most people won't object.

    I'm not sure if we need the zend_ulong -> zend_int_t, IS_LONG -> IS_INT,
    size_t -> zend_size_t renaming in thousands of places. In my opinion this
    work is useless, but I won't object to the patch just because of names. I
    would suggest just 3 options for voting - "no", "yes with old names", "yes
    with new names".

    Thanks. Dmitry.


  • Andrea Faulds at May 19, 2014 at 1:57 pm

    On 19 May 2014, at 14:31, Dmitry Stogov wrote:

    We agreed to make 64-bit IS_LONG and 64-bit string length on all 64-bit
    systems, but don't change other core data structures (as Nikita suggested).
    Does this mean we’re not switching to unsigned ints for other lengths? 64-bit or no, we ought to do that.

    --
    Andrea Faulds
    http://ajf.me/
  • Dmitry Stogov at May 19, 2014 at 2:03 pm
    size_t is unsigned.

    what do you mean by "other lengths"? (in phpng most data structures use
    zend_string instead of char*/length, so we need to change it in one place).

    Maybe some other places need to be changed as well, but I don't know about
    them. If you are talking about stream related structures, that should not
    cause any problems, because at run time we will have just a few such
    structures.

    Thanks. Dmitry.




  • Andrea Faulds at May 19, 2014 at 3:51 pm

    On 19 May 2014, at 15:03, Dmitry Stogov wrote:

    size_t is unsigned.

    what do you mean by "other lengths"? (in phpng most data structures use zend_string instead of char*/length, so we need to change it in one place).
    Never mind, looks like I was mistaken and we’re already using unsigned lengths everywhere.
    --
    Andrea Faulds
    http://ajf.me/
