[PHP-INTERNALS] Introducing "Array Of" RFC

Philip Sturgeon
Jan 15, 2014 at 6:17 pm
Hey,

This is my first RFC so give me a little leeway if I get things wrong.

https://wiki.php.net/rfc/arrayof

The implementation has been written by Joe Watkins, and I will be
handling the RFC process for him.

It is aimed at PHP 5.6, and so far the release managers seem to be ok
with the idea of this potentially being merged after the alpha
release, so this should not be considered an issue.

Everything is open for discussion, especially the current error
messages. They are not perfect, so let us know if you have better
ideas.

118 responses

  • Andrea Faulds at Jan 15, 2014 at 6:21 pm

    On 15/01/14 18:17, Philip Sturgeon wrote:
    This is my first RFC so give me a little leeway if I get things wrong.

    https://wiki.php.net/rfc/arrayof

    The implementation has been written by Joe Watkins, and I will be
    handling the RFC process for him.

    It is aimed at PHP 5.6, and so far the release managers seem to be ok
    with the idea of this potentially being merged after the alpha
    release, so this should not be considered an issue.

    Everything is open for discussion, especially the current error
    messages. They are not perfect, so let us know if you have better
    ideas.
    I think this is a great proposal.

    I have one question, however. Though I doubt it would be a frequent
    usage, would { function (StdClass[][] $x) {} } work, for representing an
    array of arrays?
    --
    Andrea Faulds
    http://ajf.me/
  • Robert Stoll at Jan 15, 2014 at 7:24 pm
    Hey,
    -----Original Message-----
    From: Philip Sturgeon
    Sent: Wednesday, January 15, 2014 7:18 PM
    To: int...@...net
    Subject: [PHP-DEV] Introducing "Array Of" RFC

    Hey,

    This is my first RFC so give me a little leeway if I get things wrong.

    https://wiki.php.net/rfc/arrayof

    The implementation has been written by Joe Watkins, and I will be
    handling the RFC process for him.

    It is aimed at PHP 5.6, and so far the release managers seem to be ok
    with the idea of this potentially being merged after the alpha
    release, so this should not be considered an issue.

    Everything is open for discussion, especially the current error
    messages. They are not perfect, so let us know if you have better
    ideas.

    --
    PHP Internals - PHP Runtime Development Mailing List
    To unsubscribe, visit: http://www.php.net/unsub.php
    First of all, I really like the RFC, good idea, well done.

    A few remarks on the details, though. I am not sure I understood the implementation correctly (I haven't made any code contributions so far, so
    I am not really familiar with the source code), but it seems that a NULL check is implemented for the actual
    parameter, which is perfectly fine (it is done for other type hints as well). However, it also checks that no member of the array is
    NULL, and the check fails if a NULL is detected. I think that's wrong; NULL is a perfectly valid entry in an
    array.

    Furthermore, what about type hints like the following ones:
    function foo(int[] $qualities, callable[] $callables){}

    They do not seem to be supported, but it may be that I am wrong and did not understand the code.
  • Andrea Faulds at Jan 15, 2014 at 7:27 pm

    On 15/01/14 19:24, Robert Stoll wrote:
    But about some details. I am not sure if I got the implementation right (haven't done any code contributions so far, so
    I am not really familiar with the source code) but it seems that a check for NULL is implemented for the actual
    parameter which is perfectly fine (done for other type hints as well) but it also checks if all members of an array are
    not NULL and the check returns a failure if NULL is detected. I think that's wrong, NULL is a perfect valid entry in an
    array.
    I agree with this. NULL should be considered a valid substitute for a
    value of any type.

    --
    Andrea Faulds
    http://ajf.me/
  • Robert Stoll at Jan 15, 2014 at 7:38 pm

    Furthermore, what about type hints like the following ones:
    function foo(int[] $qualities, callable[] $callables){}

    Seems like they are not supported. But might be that I am totally wrong and did not understand the code
    Please forget my question about the int[] type hint; I forgot for a second that PHP does not support scalar type
    hints.
    However, the question about callable[] remains.
  • Philip Sturgeon at Jan 15, 2014 at 8:09 pm

    On Wed, Jan 15, 2014 at 2:24 PM, Robert Stoll wrote:
    Hey,
    -----Original Message-----
    From: Philip Sturgeon
    Sent: Wednesday, January 15, 2014 7:18 PM
    To: int...@...net
    Subject: [PHP-DEV] Introducing "Array Of" RFC

    Hey,

    This is my first RFC so give me a little leeway if I get things wrong.

    https://wiki.php.net/rfc/arrayof

    The implementation has been written by Joe Watkins, and I will be
    handling the RFC process for him.

    It is aimed at PHP 5.6, and so far the release managers seem to be ok
    with the idea of this potentially being merged after the alpha
    release, so this should not be considered an issue.

    Everything is open for discussion, especially the current error
    messages. They are not perfect, so let us know if you have better
    ideas.

    First of all, I really like the RFC, good idea, well done.

    But about some details. I am not sure if I got the implementation right (haven't done any code contributions so far, so
    I am not really familiar with the source code) but it seems that a check for NULL is implemented for the actual
    parameter which is perfectly fine (done for other type hints as well) but it also checks if all members of an array are
    not NULL and the check returns a failure if NULL is detected. I think that's wrong, NULL is a perfect valid entry in an
    array.

    Furthermore, what about type hints like the following ones:
    function foo(int[] $qualities, callable[] $callables){}

    Seems like they are not supported. But might be that I am totally wrong and did not understand the code
    Ok, ignoring the int stuff as PHP doesn't generally do that. We don't
    want to broach that topic here.

    As for allowing null, this feature is currently intended as a
    syntactic representation of:

    foreach ($foos as $foo) {
         if (! $foo instanceof Face) {
             throw new Exception ('AAAGGGGGHHH!');
         }
    }

    You are suggesting:

    foreach ($foos as $foo) {
         if (! is_null($foo) and ! $foo instanceof Face) {
             throw new Exception ('AAAGGGGGHHH!');
         }
    }

    How do people feel about that?
  • Kristopher at Jan 15, 2014 at 8:19 pm

    On Wed, Jan 15, 2014 at 3:09 PM, Philip Sturgeon wrote:
    Ok, ignoring the int stuff as PHP doesn't generally do that. We don't
    want to broach that topic here.

    As for allowing null, this feature is currently intended as a
    syntactic representation of:

    foreach ($foos as $foo) {
    if (! $foo instanceof Face) {
    throw new Exception ('AAAGGGGGHHH!');
    }
    }

    You are suggesting:

    foreach ($foos as $foo) {
    if (! is_null($foo) and ! $foo instanceof Face) {
    throw new Exception ('AAAGGGGGHHH!');
    }
    }

    How do people feel about that?
    PHP disallows passing NULL in a function like:

    function foo(MyClass $obj) {}

    Unless you explicitly allow it to be null:

    function foo(MyClass $obj = null) { }

    The first instance will throw an error:

    PHP Catchable fatal error: Argument 1 passed to foo() must be an instance
    of MyClass, null given


    I think this implementation (MyClass[]) should be consistent with this when
    passing null.
  • Marco Pivetta at Jan 15, 2014 at 8:25 pm

    On 15 January 2014 21:19, Kristopher wrote:

    I think this implementation (MyClass[]) should be consistent this when
    passing null.
    Even without the argument of consistency, the new syntax would be pretty
    much useless if `NULL` was allowed, since every method would have to start
    with a call to `array_filter`.

    Marco Pivetta

    http://twitter.com/Ocramius

    http://ocramius.github.com/
  • Robert Stoll at Jan 15, 2014 at 8:54 pm

    -----Original Message-----
    From: Marco Pivetta
    Sent: Wednesday, January 15, 2014 9:25 PM
    To: Kristopher
    Cc: Philip Sturgeon; Robert Stoll; int...@...net
    Subject: Re: [PHP-DEV] Introducing "Array Of" RFC
    On 15 January 2014 21:19, Kristopher wrote:

    I think this implementation (MyClass[]) should be consistent this when
    passing null.
    Even without the argument of consistency, the new syntax would be pretty
    much useless if `NULL` was allowed, since every method would have start
    with a call to `array_filter`.

    Marco Pivetta
    You could also look at it differently. NULL is basically a valid entry in an array. If you apply the restriction as implemented in the pull request, then one cannot use the new syntax for arrays that allow NULL, which seems odd to me as well, especially since class/interface types are reference types and I regard NULL as something equivalent to a null pointer.
    Furthermore, anyone who currently does not want to allow NULL in an array probably already uses a custom collection implementation that triggers an error or throws an exception when one tries to add NULL to the collection.

    It seems the pull request does not yet support multi-dimensional arrays (e.g. Foo[][]), but if we stick to the restriction now, then it should consistently apply to multi-dimensional arrays as well. In my opinion, we would spoil valid use cases this way, both for this RFC and for future ones (if multi-dimensional arrays are introduced; other concepts could be affected as well).

    Thus I would suggest removing this check and instead providing an additional SPL data structure that does not allow NULL entries, if this feature is really needed. That seems more appropriate to me than building a restriction into the language itself.
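
A minimal userland sketch of the kind of collection described above, which validates each entry once, when it is added, instead of on every function call. The class name `TypedCollection`, its constructor parameters, and the `add()` method are hypothetical and chosen for illustration only; no such SPL class is implied. The return type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical sketch: TypedCollection is NOT an existing SPL class.
final class TypedCollection implements IteratorAggregate, Countable
{
    /** @var string class or interface name every entry must satisfy */
    private $type;
    /** @var bool whether NULL entries are accepted */
    private $allowNull;
    /** @var array entries that have already been validated */
    private $items = array();

    public function __construct($type, $allowNull = false)
    {
        $this->type = $type;
        $this->allowNull = $allowNull;
    }

    public function add($item)
    {
        // The type check happens exactly once per entry, at insertion time.
        $type = $this->type;
        $ok = ($item === null) ? $this->allowNull : ($item instanceof $type);
        if (!$ok) {
            throw new InvalidArgumentException('Expected ' . $type);
        }
        $this->items[] = $item;
    }

    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->items);
    }

    public function count(): int
    {
        return count($this->items);
    }
}
```

A function that accepts `TypedCollection $foos` then only pays for a single class check on the argument, and whether NULL entries are allowed is decided by whoever constructs the collection.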
  • Philip Sturgeon at Jan 15, 2014 at 9:24 pm

    On Wed, Jan 15, 2014 at 3:54 PM, Robert Stoll wrote:
    -----Original Message-----
    From: Marco Pivetta
    Sent: Wednesday, January 15, 2014 9:25 PM
    To: Kristopher
    Cc: Philip Sturgeon; Robert Stoll; int...@...net
    Subject: Re: [PHP-DEV] Introducing "Array Of" RFC
    On 15 January 2014 21:19, Kristopher wrote:

    I think this implementation (MyClass[]) should be consistent this when
    passing null.
    Even without the argument of consistency, the new syntax would be pretty
    much useless if `NULL` was allowed, since every method would have start
    with a call to `array_filter`.

    Marco Pivetta
    You could also think in a different way. NULL is basically a valid entry in an array. If you apply the restriction as implemented in the pull request then one cannot use the new syntax for arrays which allow NULL, that seems odd to me as well. Especially since class/interface-types are reference types and I regard NULL as something equal to a null-pointer.
    Furthermore, if one does not want to support NULL in an array currently, then one probably already uses an own collection implementation which already triggers and error, throws an exception respectively, when one tries to add NULL to the collection.

    It seems the pull request does not yet support multidimensional arrays (e.g. Foo[][]) but if we would stick to the restriction now then the consequence should be that this restriction also applies to multi-dimensional arrays. In my opinion, we would spoil valid use cases this way for RFC and for future ones (in the case multi-dimensional arrays will be introduced, other concepts could be affected as well).

    Thus I would suggest to remove this check and provide a further SPL data structure implementation instead which does not allow null as entry if this feature is really needed. Seems more appropriate to me than making a restriction in the language itself.

    I would assume that the majority of people will be using this syntax
    to ensure the types of objects.

    In English:

    "I would like everything in this array to be Foo."

    That is much more programmatically useful in my opinion than:

    "I would like everything in this array to be Foo or maybe null."

    If we go with the latter then I still have no idea what is in this
    variable, and have to do a whole bunch of checking or
    array_filter()ing, otherwise I am doing this:

    $possibleNullValue->getInfo(); // BOOM!

    Even if there are use-cases for allowing them I would suggest that
    there are probably more use-cases for denying them.
  • Robert Stoll at Jan 15, 2014 at 10:00 pm

    -----Original Message-----
    From: Philip Sturgeon
    Sent: Wednesday, January 15, 2014 10:25 PM
    To: Robert Stoll
    Cc: Marco Pivetta; Kristopher; int...@...net
    Subject: Re: [PHP-DEV] Introducing "Array Of" RFC
    On Wed, Jan 15, 2014 at 3:54 PM, Robert Stoll wrote:

    -----Original Message-----
    From: Marco Pivetta
    Sent: Wednesday, January 15, 2014 9:25 PM
    To: Kristopher
    Cc: Philip Sturgeon; Robert Stoll; int...@...net
    Subject: Re: [PHP-DEV] Introducing "Array Of" RFC
    On 15 January 2014 21:19, Kristopher wrote:

    I think this implementation (MyClass[]) should be consistent this when
    passing null.
    Even without the argument of consistency, the new syntax would be pretty
    much useless if `NULL` was allowed, since every method would have start
    with a call to `array_filter`.

    Marco Pivetta
    You could also think in a different way. NULL is basically a valid entry in an array. If you apply the restriction
    as
    implemented in the pull request then one cannot use the new syntax for arrays which allow NULL, that seems odd to me as
    well. Especially since class/interface-types are reference types and I regard NULL as something equal to a
    null-pointer.
    Furthermore, if one does not want to support NULL in an array currently, then one probably already uses an own
    collection implementation which already triggers and error, throws an exception respectively, when one tries to add NULL
    to the collection.
    It seems the pull request does not yet support multidimensional arrays (e.g. Foo[][]) but if we would stick to the
    restriction now then the consequence should be that this restriction also applies to multi-dimensional arrays. In my opinion,
    we would spoil valid use cases this way for RFC and for future ones (in the case multi-dimensional arrays will be
    introduced,
    other concepts could be affected as well).
    Thus I would suggest to remove this check and provide a further SPL data structure implementation instead which does
    not allow null as entry if this feature is really needed. Seems more appropriate to me than making a restriction in the
    language itself.
    I would assume that the majority of people will be using this syntax
    to ensure the types of objects.

    In English:

    "I would like everything in this array to be Foo."

    That is much more programatically useful in my opinion than:

    "I would like everything in this array to be Foo or maybe null."

    If we go with the latter then I still have no idea what is in this
    variable, and have to do a whole bunch of checking or
    array_filter()ing, otherwise I am doing this:

    $possibleNullValue->getInfo(); // BOOM!

    Even if there are use-cases for allowing them I would suggest that
    there are probably more use-cases for denying them.

    Fair enough, the example with the English sentence makes a lot of sense from an average user's point of view, and thus
    most PHP users would probably agree with you (don't take this the wrong way, I am not implying that you are average or anything similar).
    However, in my opinion this decision should not be based on how many people prefer it one way or the other,
    since the restriction is going to make it impossible to use the syntax for the use cases where NULL is a valid
    entry.
    So we should ask ourselves: do we really want to exclude a whole bunch of valid use cases, or should we drop the
    restriction and put more effort into providing alternatives for non-null-containing collections?

    See, if we do not introduce the restriction and provide a non-null-containing collection, then all use cases can be
    supported. That is the way to go IMO.
    Or do you have another idea how one can specify that the array can contain null?
    And please consider: if you were to suggest that "function foo(Foo[] $foo=null)" implies that $foo can also contain NULL, then
    we would have another vague concept in PHP with a double meaning -> $foo can be NULL and entries in $foo can be NULL. But
    maybe one does not want to allow NULL for $foo but only for members of $foo.
  • Crypto Compress at Jan 15, 2014 at 11:18 pm
    Hello,

    since there is an option for nullable type-hinted variables, it may be
    useful to *explicitly* allow null values in a type-hinted array too.

    e.g.: function foo(Bar[null] = null)

    to consider:
    - null is not an object
    - null is not "nothing"
    - null is not "not defined"
    - null is not a/any type (even documentation states so)

    Java: only reference types are nullable
    C#: only if explicitly defined
    SQL: only if explicitly defined

    cryptocompress
  • Andrea Faulds at Jan 15, 2014 at 11:54 pm

    On 15/01/14 23:18, Crypto Compress wrote:
    e.g.: function foo(Bar[null] = null)
    I'd prefer the ? syntax for nullables that C# has. I'm imagining
    something like this:

    function foo(Bar? $a) {} // $a is Bar/null
    function foo(Bar?[] $a) {} // $a is array of Bars/nulls
    function foo(Bar[]? $a) {} // $a is array of Bars, or null
    function foo(Bar?[]? $a) {} // $a is array of Bars/nulls, or null

    --
    Andrea Faulds
    http://ajf.me/
  • Robert Stoll at Jan 16, 2014 at 12:09 am

    -----Original Message-----
    From: Andrea Faulds
    Sent: Thursday, January 16, 2014 12:54 AM
    To: Crypto Compress; int...@...net
    Subject: Re: [PHP-DEV] Introducing "Array Of" RFC


    On 15/01/14 23:18, Crypto Compress wrote:
    e.g.: function foo(Bar[null] = null)
    I'd prefer the ? syntax for nullables that C# has. I'm imagining
    something like this:

    function foo(Bar? $a) {} // $a is Bar/null
    function foo(Bar?[] $a) {} // $a is array of Bars/nulls
    function foo(Bar[]? $a) {} // $a is array of Bars, or null
    function foo(Bar?[]? $a) {} // $a is array of Bars/nulls, or null

    --
    Andrea Faulds
    http://ajf.me/

    --
    I like Andrea's proposal, this way we can support all scenarios.
    It is possible to define that NULL is a valid entry in an array and we would stick to the same decision as for actual
    parameters, namely that one has to define explicitly that null is a valid entry. This would make me happy and I suppose
    also Philip.

    Just as a side note, this syntax would also give PHP the chance to get rid of the double meaning of "=null" in a
    later version. I am talking about the following case:
    function foo(Foo $a=null, $b=1){}
    Here one might not want to declare $a as optional, but cannot express that differently at the moment. With the additional
    ? that could change :)
  • Martin Keckeis at Jan 16, 2014 at 9:18 am
    2014/1/16 Robert Stoll <ph...@...ch>
    -----Original Message-----
    From: Andrea Faulds
    Sent: Thursday, January 16, 2014 12:54 AM
    To: Crypto Compress; int...@...net
    Subject: Re: [PHP-DEV] Introducing "Array Of" RFC


    On 15/01/14 23:18, Crypto Compress wrote:
    e.g.: function foo(Bar[null] = null)
    I'd prefer the ? syntax for nullables that C# has. I'm imagining
    something like this:

    function foo(Bar? $a) {} // $a is Bar/null
    function foo(Bar?[] $a) {} // $a is array of Bars/nulls
    function foo(Bar[]? $a) {} // $a is array of Bars, or null
    function foo(Bar?[]? $a) {} // $a is array of Bars/nulls, or null

    --
    Andrea Faulds
    http://ajf.me/

    --
    I like Andrea's proposal, this way we can support all scenarios.
    It is possible to define that NULL is a valid entry in an array and we
    would stick to the same decision as for actual
    parameters, namely that one has to define explicitly that null is a valid
    entry. This would make me happy and I suppose
    also Philip.

    Just as side notice, this syntax would also give the chance for PHP to get
    rid of the double meaning of "=null" in a
    later version. I am talking about the following case:
    function foo(Foo $a=null, $b=1){}
    Where one might not want to specify that $a is optional but cannot do it
    differently at the moment - with the additional
    ? it could change :)


    + 1 for the RFC

    But if I'm not completely wrong, there is a performance decrease instead of
    an increase in your tests?
    https://gist.github.com/krakjoe/8444591

    [joe@fiji php-src]$ sapi/cli/php isTest.php 10000
    objects: 10000
    arrayof: 0.00645113
    instanceof: 0.00560713
  • Php at Jan 16, 2014 at 10:01 am

    + 1 for the RFC

    But if i'm not completely wrong, there is performance decrease instead of
    increase in your tests?
    https://gist.github.com/krakjoe/8444591

    [joe@fiji php-src]$ sapi/cli/php isTest.php 10000
    objects: 10000
    arrayof: 0.00645113
    instanceof: 0.00560713
    I gave the RFC some more thought and came to the conclusion that it
    does not fit PHP (due to the performance penalties involved).
    The syntax merely hides an ugly design. If you wrote the same thing in
    plain PHP, probably everyone would agree that it is bad design.
    The following example (it doesn't really make sense, but it outlines the bad design):

    function foo(Foo[] $foos){
         if(count($foos) > 0){
            $a = array_pop($foos);
            //do something
           foo($foos);
         }
    }

    Is equivalent to

    function foo(array $foos){
         // the check that the Foo[] hint performs implicitly on every call
         foreach($foos as $v){
            if(!($v instanceof Foo)){
               //trigger_error
            }
         }
         if(count($foos) > 0){
            $a = array_pop($foos);
            //do something
            foo($foos);
         }
    }

    The check adds runtime cost to every call of the function, which
    is obviously not necessary (once would be good enough). I doubt
    that users would actually make sure that the check is only performed
    once, and even if you argue that one should write it as follows:

    function bar(Foo[] $foos){
         foo($foos);
    }
    function foo(array $foos){
         if(count($foos) > 0){
            $a = array_pop($foos);
            //do something
           foo($foos);
         }
    }

    Then it is still bad design compared to a collection which actually
    does the check during the addition of elements.
    Although I like the effort to make PHP somewhat more type safe, I am
    -1 on this RFC, or rather on its implementation. If we want typed
    arrays, then the array needs to carry the type to be efficient.
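
To make the cost argument concrete, here is a small sketch that counts how many element checks the recursive example performs. Since `Foo[]` is only proposed syntax, the per-call check is emulated in userland; `Foo`, `checkAllFoo()` and `consume()` are hypothetical names used for illustration, and the array size of 100 is arbitrary.

```php
<?php
// Hypothetical stand-ins: Foo, checkAllFoo() and consume() are illustrative only.
class Foo {}

// Emulates the per-call check of the proposed Foo[] hint.
// $checks counts how many instanceof tests were executed in total.
function checkAllFoo(array $foos, &$checks)
{
    foreach ($foos as $foo) {
        $checks++;
        if (!($foo instanceof Foo)) {
            throw new InvalidArgumentException('Expected Foo');
        }
    }
}

// Mirrors the recursive foo(): validate everything, pop one entry, recurse on the rest.
function consume(array $foos, &$checks)
{
    checkAllFoo($foos, $checks);
    if (count($foos) > 0) {
        array_pop($foos);
        consume($foos, $checks);
    }
}

$checks = 0;
$foos = array();
for ($i = 0; $i < 100; $i++) {
    $foos[] = new Foo();
}
consume($foos, $checks);
echo $checks, "\n"; // 100 + 99 + ... + 1 = 5050 checks for 100 entries
```

For n entries the recursion performs n(n+1)/2 element checks, whereas validating once up front, or at insertion time, needs only n.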
  • Matthieu Napoli at Jan 16, 2014 at 10:49 am

    Le 16/01/2014 11:01, ph...@...ch a écrit :
    + 1 for the RFC

    But if i'm not completely wrong, there is performance decrease instead of
    increase in your tests?
    https://gist.github.com/krakjoe/8444591

    [joe@fiji php-src]$ sapi/cli/php isTest.php 10000
    objects: 10000
    arrayof: 0.00645113
    instanceof: 0.00560713
    I gave the RFC a few more thoughts and came to the conclusion that it
    does not fit to PHP (due to the performance penalties involved).
    The performance penalty is the same as if you were type-checking yourself.

    And in the linked gist we can see that the results vary a lot (probably not
    enough iterations), so it's easy to pick only the worst one. Here,
    I'll take the one where the numbers are reversed to "prove" my point:

    [joe@fiji php-src]$ sapi/cli/php isTest.php 1000
    objects: 1000
    arrayof: 0.00080991
    instanceof: 0.00086308
    The syntax merely hide an ugly design. If you would write the same in
    normal PHP then probably everyone would agree that it is bad design.
    How is type checking a bad design?

    Between:

    function (array $foos) {
          foreach ($foos as $foo) {
              if (! $foo instanceof Foo) {
                  throw new InvalidArgumentException("...");
              }
              $foo->bar();
          }
    }

    and:

    function (array $foos) {
          foreach ($foos as $foo) {
              $foo->bar(); // BOOM, fatal error if not correct object given
          }
    }

    which one is worse?

    It's the same problem as this:

    function ($foo) {
          $foo->bar(); // fatal error if not correct object given
    }

    and this:

    function (Foo $foo) {
          $foo->bar();
    }
    Following an example (doesn't really make sense but outlines the bad
    design)

    function foo(Foo[] $foos){
    if(count($foos) > 0){
    $a = array_pop($foos);
    //do something
    foo($foos);
    }
    }

    Is equivalent to

    function foo(array $arr){
    foreach($arr as $v){
    if(!($v instanceof Foo)){
    //trigger_error
    }
    }
    if(count($foos) > 0){
    $a = array_pop($foos);
    //do something
    foo($foos);
    }
    }

    The check adds runtime complexity to the function for each call which is
    obviously not necessary (once would be good enough). And I doubt that
    users would actually make sure that the check is only performed once and
    even if you would argue that one should write it as follows:

    function bar(Foo[] $foos){
    foo($foos);
    }
    function foo(array $foos){
    if(count($foos) > 0){
    $a = array_pop($foos);
    //do something
    foo($foos);
    }
    }

    Then it is still bad design compared to a collection which actually does
    the check during the addition of elements.
    Although, I like the effort to make PHP somewhat more type safe I am -1
    for this RFC, the implementation respectively. If we want typed arrays
    then the array needs to carry the type to be efficient.
    Matthieu
  • Yasuo Ohgaki at Jan 16, 2014 at 10:55 am
    Hi all,
    On Thu, Jan 16, 2014 at 7:49 PM, Matthieu Napoli wrote:

    The syntax merely hide an ugly design. If you would write the same in
    normal PHP then probably everyone would agree that it is bad design.
    How is type checking a bad design?

    Type checking is not bad.

    Alternatively, DbC (design by contract) may be useful.
    Cleaner and could be faster.

    Regards,

    --
    Yasuo Ohgaki
    yoh...@...net
  • Chris Wright at Jan 16, 2014 at 11:12 am

    On 16 January 2014 10:49, Matthieu Napoli wrote:
    The performance penalty is the same as if you were type-checking yourself.
    (Sorry if I'm stepping on your toes here Joe, since it was you who
    pointed this out to me):

    This is not completely true. With arrayof, there are two iterations of
    the passed value - one in C and one in PHP, vs only one iteration in
    the foreach/instanceof approach.

    However:

    - as you point out, the performance penalty for the check is
    negligible if it even exists
    - if you don't use the syntax there is no penalty whatsoever

    so for me this is a complete non-issue. The concise syntax and strong
    semantic possibilities this brings far outweigh a few microseconds of
    slowdown.

    Thanks, Chris
  • Martin Keckeis at Jan 16, 2014 at 11:31 am
    2014/1/16 Matthieu Napoli <mat...@...fr>
    Le 16/01/2014 11:01, ph...@...ch a écrit :

    + 1 for the RFC
    But if i'm not completely wrong, there is performance decrease instead of
    increase in your tests?
    https://gist.github.com/krakjoe/8444591

    [joe@fiji php-src]$ sapi/cli/php isTest.php 10000
    objects: 10000
    arrayof: 0.00645113
    instanceof: 0.00560713
    I gave the RFC a few more thoughts and came to the conclusion that it
    does not fit to PHP (due to the performance penalties involved).
    The performance penalty is the same as if you were type-checking yourself.

    And on the gist linked, we can see that the results vary a lot (not enough
    iterations probably), it's easy to take only the worst one. Here, I'll take
    the one where the numbers are inversed to "prove" my point:


    [joe@fiji php-src]$ sapi/cli/php isTest.php 1000
    objects: 1000
    arrayof: 0.00080991
    instanceof: 0.00086308
    Yes, I saw the good result too. I just wondered why native PHP was slower
    sometimes...
  • Robert Stoll at Jan 16, 2014 at 4:44 pm

    How is type checking a bad design?

    Between:

    function (array $foos) {
    foreach ($foos as $foo) {
    if (! $foo instanceof Foo) {
    throw new InvalidArgumentException("...");
    }
    $foo->bar();
    }
    }

    and:

    function (array $foos) {
    foreach ($foos as $foo) {
    $foo->bar(); // BOOM, fatal error if not correct object given
    }
    }

    which one is worse?
    You got me wrong. I don't think that type checking as such is bad, but type checking the way the RFC promotes it is bad design IMO. If I expect an array with members of a certain type (regardless of whether null is included or not), then I would not write any of the above code, because this is exactly the bad design I was writing about.
    Wouldn't you agree with me that it would be better to type check the entries when they are added rather than during each function call?
    Imagine function foo(Foo[] $foos){} which calls bar and baz, which in turn call several other functions that all have the same type hint Foo[].
    The result would be that you have increased your runtime complexity by a factor of the number of function calls. That's really ugly and I don't think that the language itself should promote such bad design. If you go further and support multi-dimensional arrays the same way, then you would add a complexity of n*m for each function call with such a type hint.

    However, I like the basic idea of the RFC, namely typed arrays. I think a better implementation would be for an array to hold the defined type (visible only internally at first; it could be exposed to userland later) and to perform the necessary checks during addition, so that the runtime complexity of the type hint checks would no longer be linear but constant. Yet it is probably better to introduce another internal data structure for this purpose, so that the plain "array" type hint does not suffer from the additional checks.
    An alternative approach, without changing the language, would be to provide an additional SPL data structure that performs the type check when an entry is added.

    Nevertheless, the implementation of the RFC as such is not useless. We could introduce a function is_arrayOf(array $arr, $type, $nullable=false) which performs the check as implemented in the pull request (including the nullable option). I suppose you can come up with valid use cases where such a check actually makes sense and is worth the new function.
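
A userland sketch of what such a helper could look like, using the signature suggested above. This is an assumption about its behaviour, not the pull request's C implementation, and `is_arrayOf()` is not a built-in PHP function; `ArrayObject` merely serves as a convenient example class.

```php
<?php
// Userland sketch of the suggested helper; NOT a built-in PHP function.
function is_arrayOf(array $arr, $type, $nullable = false)
{
    foreach ($arr as $value) {
        if ($value === null) {
            // NULL entries are only accepted when explicitly requested.
            if (!$nullable) {
                return false;
            }
            continue;
        }
        // instanceof yields false for non-objects and for objects of other types.
        if (!($value instanceof $type)) {
            return false;
        }
    }
    return true;
}

var_dump(is_arrayOf(array(new ArrayObject(), null), 'ArrayObject'));       // bool(false)
var_dump(is_arrayOf(array(new ArrayObject(), null), 'ArrayObject', true)); // bool(true)
```

In this sketch an empty array passes the check, and callers decide explicitly when the linear scan is worth its cost.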
  • Julien Pauli at Jan 18, 2014 at 3:52 am

    On Thu, Jan 16, 2014 at 5:44 PM, Robert Stoll wrote:
    How is type checking a bad design?

    Between:

    function (array $foos) {
    foreach ($foos as $foo) {
    if (! $foo instanceof Foo) {
    throw new InvalidArgumentException("...");
    }
    $foo->bar();
    }
    }

    and:

    function (array $foos) {
        foreach ($foos as $foo) {
            $foo->bar(); // BOOM, fatal error if not correct object given
        }
    }

    which one is worse?
    You got me wrong. I don't think that type checking as such is bad, but type checking in the way the RFC promotes it is bad design IMO. If I expect an array with members of a certain type (regardless of whether null is allowed), then I would not write any of the above code, because this is exactly the bad design I was writing about.
    Wouldn't you agree with me that it would be better to type check the entries when they are added rather than during each function call?
    Imagine function foo(Foo[] $foos){} which calls bar and baz which in turn call several other functions which all have the same type hint Foo[].
    The result would be that you have increased your runtime complexity by a factor equal to the number of function calls. That's really ugly, and I don't think the language itself should promote such bad design. If you go further and support multi-dimensional arrays the same way, then you would add a complexity of n*m for each function call with such a type hint.
    I'm absolutely +1 with such a statement.
    However, I like the basic idea of the RFC, namely typed arrays. I think a better implementation would have an array hold the defined type (visible only internally; it could be exposed to userland later) and perform the necessary checks on addition, so that the runtime complexity of the type hint checks would be constant rather than linear. Yet it is probably better to introduce another internal data structure for this purpose, so that the type hint "array" does not suffer from the additional checks.
    An alternative approach, without changing the language, would be to provide an additional SPL data structure instead which performs the type check during the addition of an entry.
    Yes, yes and yes again. +1

    This RFC shows that we lack "array of" as a whole structure in PHP,
    like Java has Vectors. We already talked about this in the past.
    I'm in favor of designing a new "typed array" structure, instead of
    adding a new syntax that checks things at runtime on every function
    call, mainly by iterating over the array.
    I don't think we can accept such a penalty, particularly in the case of
    nested function calls with the same signature, all requiring the
    engine to iterate for type checking.


    Julien.P
  • Chris Wright at Jan 18, 2014 at 12:36 pm

    On 18 Jan 2014 03:53, "Julien Pauli" wrote:

    I'm in favor of designing a new "typed array" structure, instead of
    adding a new syntax that checks things at runtime on every function
    call, mainly by iterating over the array.
    This, while useful, doesn't solve the same problem as this RFC is solving.
    You wouldn't be able to type hint for the type contained within the
    structure without some form of new syntax.

    What if we did both? So that the new syntax actually expects an
    SplTypedArray (or whatever you want to call it) and if an instance of this
    structure is passed the class entry attached to the structure can be used
    as an O(1) shortcut, but if an array is passed it is run through
    SplTypedArray::fromArray(), allowing users to pass an array and take the
    performance hit if they want to.
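
    A rough userland sketch of that hybrid, under stated assumptions: TypedList stands in for the hypothetical SplTypedArray (PHP ships no such class), and coerce_to_list_of() imitates what the new syntax might do on entry to a function. Both names are made up for illustration.

    ```php
    <?php
    // Hypothetical stand-in for the "SplTypedArray" idea.
    final class TypedList implements IteratorAggregate
    {
        private $class;
        private $items = [];

        public function __construct($class)
        {
            $this->class = $class;
        }

        // Pays the O(n) check once, while converting a plain array.
        public static function fromArray($class, array $items)
        {
            $list = new self($class);
            foreach ($items as $item) {
                $list->add($item);
            }
            return $list;
        }

        public function add($item)
        {
            $class = $this->class;
            if (!($item instanceof $class)) {
                throw new InvalidArgumentException("Expected an instance of $class");
            }
            $this->items[] = $item;
        }

        public function getClass()
        {
            return $this->class;
        }

        public function getIterator(): Traversable
        {
            return new ArrayIterator($this->items);
        }
    }

    // Imitates the argument check for a Foo[] hint under this scheme.
    function coerce_to_list_of($class, $value)
    {
        if ($value instanceof TypedList) {
            // O(1): compare the stored class name against the hinted one.
            if (!is_a($value->getClass(), $class, true)) {
                throw new InvalidArgumentException("Expected a list of $class");
            }
            return $value;
        }
        if (!is_array($value)) {
            throw new InvalidArgumentException("Expected an array or a TypedList");
        }
        // Plain array: take the O(n) hit once.
        return TypedList::fromArray($class, $value);
    }

    class Foo {}

    $list = coerce_to_list_of('Foo', [new Foo, new Foo]); // scans the array once
    $same = coerce_to_list_of('Foo', $list);              // no scan needed
    var_dump($same === $list);                            // bool(true)
    ```

    Passing the already-built list down through nested calls costs one class comparison per call, while callers who prefer plain arrays still pay for the scan each time.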

    Thanks, Chris
  • Chris Wright at Jan 18, 2014 at 1:24 pm

    On 18 Jan 2014 12:40, "Mark Tomlin" wrote:
    When you say SplTypedArray, do you mean a single-datatype array? Or would
    it go so far as adding SplStructs? To me that would be VERY handy
    if it added some speed to parsing binary packets.

    I just inferred a single datatype array (i.e. "members must all be
    instances of the FQ class name passed as a constructor arg") from previous
    comments, although I suppose some more complex extended variants are
    possible. I would certainly like the simplest possible variant to be
    present, but I'm not against more complex structures extending from that
    base.
  • Sherif Ramadan at Jan 19, 2014 at 6:16 am
    This RFC seems to offer little in the way of improvement. I feel it's more of a
    variation than an improvement on the status quo. We've had a number of
    these RFCs over the years. I'd be much more interested to see RFCs that
    introduce an actual lacking feature into the language like the exponent
    operator RFC rather than all of these OOP syntax sugar RFCs that are
    cropping up by the minute. I'm not trying to bash anyone's efforts, but I
    really feel that these kinds of RFCs are mostly fluff that we could easily
    do without.

    To focus specifically on this RFC I find its use cases to be extremely
    limiting and mildly beneficial. PHP arrays are very generalized so that a
    developer could easily have an array of integers, floats, strings, and any
    other primitive in the same storage unit. To limit the type hinting just to
    user-defined objects and then limit it even further to a single dimension
    means chopping the number of use cases down to an 8th then chopping that
    number down by another fraction. The result is such a narrow use case that
    it becomes almost useless in practice.

    The idea of better type hinting in PHP is not inherently a bad one. It's
    just that many of the proposed implementations seem to veer off into
    fragile use cases that really shouldn't require a language change. What's
    worse is that the next RFC to come after it usually makes it worse as it
    tries to pull the language into an equally different set of fragile use
    cases. Ultimately the user just gets confused and doesn't tend to find
    these limiting syntax sugars all that useful in practice. Even though they
    appear interesting in theory, they're not really so great that we couldn't
    do with a pure PHP implementation that doesn't require a language change.

    Having looked over the RFC and the actual patch I find that refactoring
    code which relies on the function or method itself to handle type checks
    (in situations where it will cause an actual broken design by contract
    effect), makes the patch a worse scenario for both the developer and the
    language. It doesn't actually make the code any easier to read and doesn't
    make it any easier to debug. Code that already relies on PHP implemented
    type checks is better off not using the new syntax. Code that uses the new
    syntax tends to get trapped into too many edge cases. So I really can't see
    the upside here.
  • Philip Sturgeon at Jan 19, 2014 at 5:13 pm

    On Sun, Jan 19, 2014 at 1:16 AM, Sherif Ramadan wrote:
    This RFC seems to offer little in the way of improvement. I feel it's more of a
    variation than an improvement on the status quo. We've had a number of
    these RFCs over the years. I'd be much more interested to see RFCs that
    introduce an actual lacking feature into the language like the exponent
    operator RFC rather than all of these OOP syntax sugar RFCs that are
    cropping up by the minute. I'm not trying to bash anyone's efforts, but I
    really feel that these kinds of RFCs are mostly fluff that we could easily
    do without.

    To focus specifically on this RFC I find its use cases to be extremely
    limiting and mildly beneficial. PHP arrays are very generalized so that a
    developer could easily have an array of integers, floats, strings, and any
    other primitive in the same storage unit. To limit the type hinting just to
    user-defined objects and then limit it even further to a single dimension
    means chopping the number of use cases down to an 8th then chopping that
    number down by another fraction. The result is such a narrow use case that
    it becomes almost useless in practice.

    The idea of better type hinting in PHP is not inherently a bad one. It's
    just that many of the proposed implementations seem to veer off into
    fragile use cases that really shouldn't require a language change. What's
    worse is that the next RFC to come after it usually makes it worse as it
    tries to pull the language into an equally different set of fragile use
    cases. Ultimately the user just gets confused and doesn't tend to find
    these limiting syntax sugars all that useful in practice. Even though they
    appear interesting in theory, they're not really so great that we couldn't
    do with a pure PHP implementation that doesn't require a language change.

    Having looked over the RFC and the actual patch I find that refactoring
    code which relies on the function or method itself to handle type checks
    (in situations where it will cause an actual broken design by contract
    effect), makes the patch a worse scenario for both the developer and the
    language. It doesn't actually make the code any easier to read and doesn't
    make it any easier to debug. Code that already relies on PHP implemented
    type checks is better off not using the new syntax. Code that uses the new
    syntax tends to get trapped into too many edge cases. So I really can't see
    the upside here.
    I'm a little stunned by this response. By no means is this RFC fluff,
    and simply because you do not personally see a use case does not mean
    that they do not exist.

    Currently I am working on an HTTP Message PSR, where we would like an
    array of instances of the HeaderValueInterface to be sent through to a
    method. Currently we have to type-hint against "array" and hope the
    user knows what they are supposed to send through with documentation.
    Interfaces are meant to be a contractual obligation, not guesswork.

    That might seem like an edge case, but this happens all the time.
    PyroCMS wants an array of Widgets, or ModuleDetails, or anything.
    Boilerplate code for these occasions is always required and it's a pain
    in the ass having to write it every time.

    To me it is confusing that there is a separation in the "usefulness"
    between type hinting a single item, or type hinting an array of items.
    For the same reason I want to be certain that an argument is a Foo, I
    want to know if I am being given multiple Foo's.

    "PHP arrays are very generalized so that a developer could easily have
    an array of integers, floats, strings, and any other primitive in the
    same storage unit. To limit the type hinting just to user-defined
    objects and then limit it even further to a single dimension means
    chopping the number of use cases down to an 8th then chopping that
    number down by another fraction. The result is such a narrow use case
    that it becomes almost useless in practice."

    It's wonderful that PHP allows arrays to be so vague. It's truly
    useful sometimes. Sometimes it's not, and I want to know exactly what is
    in there, as is the right of the implementor of a function or method.
    When I want to know that everything is a Foo it would be great to
    ensure that.

    Again I say I am confused how this is a marginal requirement, this is
    incredibly standard stuff.

    "Ultimately the user just gets confused and doesn't tend to find these
    limiting syntax sugars all that useful in practice."

    Who is the user? The user implementing the type hint? If they want it,
    they will use it. If they don't want it, they won't use it.

    Somebody calling said function or method is going to get an error if
    they don't pass in the correct data, just like any other type hint. So
    again, there is not a lot of room for confusion there.

    "It doesn't actually make the code any easier to read and doesn't make
    it any easier to debug"

    I strongly disagree with this statement.

    Currently if a function is expecting Foo's you need to either read a
    DocBlock, external documentation or read the code. Seeing Foo[] lets
    you know that you need an array of Foo's, right there in the
    declaration. That is what I consider to be readable, but I understand
    you might say the square brackets could cause confusion to the user.

    I know you can't google search for "[]", but the docs would be right
    there on the "type-hinting documentation" so that solves that nicely.

    "Code that uses the new syntax tends to get trapped into too many edge cases"

    Not sure what you mean here. If I want a bag of spanners I know that
    I'm getting a bag of spanners. That does not trap me, it is a safety
    catch, to make sure when I put my hand in the bag it is always going
    to be a spanner, and I don't accidentally pull out a bear.

    I think mostly I would just like you to quantify some of your
    statements, as you seem to be saying:

    1. It is only useful to people called Jeremy on a Tuesday afternoon
    2. It is single-handedly responsible for the death of Steve Jobs.
    3. Explosions

    I think it's a nice logical improvement which syntactically offers a
    replacement for looney boilerplate code, exactly like Variadics which
    a LOT of people are excited about.
  • Sherif Ramadan at Jan 19, 2014 at 6:12 pm
    On Sun, Jan 19, 2014 at 12:13 PM, Philip Sturgeon wrote:

    1. It is only useful to people called Jeremy on a Tuesday afternoon
    Yes, the implementation of this type hint is very limiting in nature. It's
    not that I don't like the idea of an array type hint. I just don't like
    this implementation. I'd much rather see something like the HHVM
    implementation Sara spoke of that actually doesn't constrain the
    implementation to such limiting use cases as those you have exhibited.

    2. It is single-handedly responsible for the death of Steve Jobs.
    No idea what you're talking about. I never said anything of the sort.

    3. Explosions
    What?
  • Rasmus Lerdorf at Jan 20, 2014 at 4:20 am

    On 1/19/14, 9:13 AM, Philip Sturgeon wrote:
    I think it's a nice logical improvement which syntactically offers a
    replacement for looney boilerplate code, exactly like Variadics which
    a LOT of people are excited about.
    Actually variadics solves a problem that can't be solved other ways,
    namely by-ref for an unknown number of arguments. There is no userspace
    boilerplate code, looney or otherwise, that can do that.
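
    For illustration, a small sketch of by-reference variadics as introduced in PHP 5.6; the function name increment_all() is made up:

    ```php
    <?php
    // Requires PHP 5.6 or later: "&...$numbers" collects any number of
    // arguments, each bound by reference.
    function increment_all(&...$numbers)
    {
        foreach ($numbers as &$n) {
            $n++;
        }
        unset($n); // drop the dangling loop reference
    }

    $a = 1;
    $b = 10;
    $c = 100;
    increment_all($a, $b, $c);
    var_dump($a, $b, $c); // int(2) int(11) int(101)
    ```

    func_get_args() returns copies, so a function written that way cannot modify the caller's variables.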

    I won't argue whether the base need is great enough here. I am sure it
    is to some people. I do have an issue with the idea that the right
    solution is a full hash table scan on every function call. I have a
    feeling that if this makes it in, this is one of those features that will
    be high on my list for optimization/refactoring when I run across it in
    code. As in, replace the per-call hash scan with a single userspace scan
    that is only done once earlier in the stack. From a performance
    perspective the only-done-once userspace version is extremely likely to
    outperform this built-in per-call check, assuming more than a trivial
    number of calls to the function in question.
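
    A sketch of that refactoring, with made-up function names: the scan runs once at the entry point, and the inner functions hint only "array" and trust their caller.

    ```php
    <?php
    class Foo
    {
        public function bar()
        {
            return 1;
        }
    }

    // The single userspace scan, done once at the boundary.
    function assert_all_foo(array $items)
    {
        foreach ($items as $key => $item) {
            if (!($item instanceof Foo)) {
                throw new InvalidArgumentException("Element at key $key is not a Foo");
            }
        }
    }

    // Inner functions do no per-call element checks.
    function total(array $foos)
    {
        $sum = 0;
        foreach ($foos as $foo) {
            $sum += $foo->bar();
        }
        return $sum;
    }

    function report(array $foos)
    {
        return 'total: ' . total($foos);
    }

    function handle(array $foos)
    {
        assert_all_foo($foos); // O(n), once
        return report($foos);  // nested calls add no further scans
    }

    echo handle([new Foo, new Foo]), "\n"; // total: 2
    ```

    With Foo[] hints on total() and report() as well, each level of the call chain would repeat the same scan.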

    -Rasmus
  • Larry Garfield at Jan 20, 2014 at 6:21 am

    On 01/19/2014 10:19 PM, Rasmus Lerdorf wrote:
    On 1/19/14, 9:13 AM, Philip Sturgeon wrote:
    I think it's a nice logical improvement which syntactically offers a
    replacement for looney boilerplate code, exactly like Variadics which
    a LOT of people are excited about.
    Actually variadics solves a problem that can't be solved other ways,
    namely by-ref for an unknown number of arguments. There is no userspace
    boilerplate code, looney or otherwise, that can do that.

    I won't argue whether the base need is great enough here. I am sure it
    is to some people. I do have an issue with the idea that the right
    solution is a full hash table scan on every function call. I have a
    feeling that if this makes it in, this is one of those features that will
    be high on my list for optimization/refactoring when I run across it in
    code. As in, replace the per-call hash scan with a single userspace scan
    that is only done once earlier in the stack. From a performance
    perspective the only-done-once userspace version is extremely likely to
    outperform this built-in per-call check, assuming more than a trivial
    number of calls to the function in question.

    -Rasmus
    Which I think goes to the earlier point about it being better to do at
    add-time than read-time, since add-time is O(n) and read-time is O(n*m).

    Stepping back a bit, the underlying feature request, essentially, is "I
    want to ensure I have a collection of X without having to write a custom
    collection class every frickin' time". (Writing a FooCollection class
    whose add() method only accepts Foo objects is very straightforward, but
    generally needless boilerplate.) That can be checked at add time or
    read-time. I think that's an entirely valid use case in modern code,
    and can help lead to more self-documenting code, too.
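
    The boilerplate in question looks roughly like this; a minimal sketch, with class and function names chosen only for illustration:

    ```php
    <?php
    class Foo {}

    class FooCollection implements IteratorAggregate, Countable
    {
        private $items = [];

        // The ordinary single-object type hint does the check at add time.
        public function add(Foo $foo)
        {
            $this->items[] = $foo;
            return $this;
        }

        public function getIterator(): Traversable
        {
            return new ArrayIterator($this->items);
        }

        public function count(): int
        {
            return count($this->items);
        }
    }

    // Consumers hint the collection and get the element guarantee for free.
    function process(FooCollection $foos)
    {
        return count($foos);
    }

    $foos = (new FooCollection)->add(new Foo)->add(new Foo);
    echo process($foos), "\n"; // 2
    ```

    A second element type needs a second class like this one, and the array_* functions do not accept the collection directly; it has to be converted first, for example with iterator_to_array().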

    The patch in the RFC handles that with arrays as the collection, and
    run-time checking via type hints. My initial thought was "yes please!",
    but from the discussion here I can see where that's sub-optimal for
    performance reasons as Rasmus notes.

    Generics or something in SPL would allow for simpler add-time checking,
    which is what we probably want. However, if we did that then all of the
    various array functions don't work. That's hardly a new problem for
    PHP, of course.

    So how could we add "collection of just X type of things" support to PHP
    without running afoul of the "ArrayObject vs. Array" problem?

    I'm not sure, but I think this is another case where "arrays as the
    uber-data structure" gets in the way of more robust programming
    methodologies.

    --Larry Garfield
  • Patrick Schaaf at Jan 20, 2014 at 7:31 am
    Regarding the N*M checking complexity, couldn't a little bit of memory
    alleviate the problem?

    For each array, have a "member type" pointer, initially NULL.

    When the member type needs to be checked for the first time, and that
    "member type" pointer is not already set, do an O(N) scan of the array,
    testing compatibility of all members against the hinted type. Error on
    mismatch, but when all members match, remember the type behind the "member
    type" pointer.

    When the member type needs to be checked and the "member type" pointer _is_
    set, just compare it to the hinted type O(1).

    There must be some overhead when elements are added to an array. Either
    "when member type is set, forget it", or even "when member type is already
    set, check compatibility of the newly added element with the member type,
    and then remove member type on mismatch or keep it on match".

    Finally when arrays are copied, and "member type" is already set, copy it
    along with the array. Optionally some of the array functions like
    array_merge might be made aware of the scheme, too.

    BTW, I'm undecided regarding the need for the overall feature, from my
    rather limited personal coding practice it seems to me that whenever I am
    interested in member type checking I'm already looping over the array and
    can easily add up-front instanceof checks to the loops with suitable
    explicit error-out. Which I generally prefer over automatic errors,
    especially fatal ones, because _I_ control the error case behaviour.

    But I think _if_ such a scheme were added, the implementation described
    above would solve the O(N*M) problem with acceptable overall cost.
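
    The scheme can be simulated in userland to make the steps concrete. This is only a sketch: the proposal concerns the engine's hash table, and the class CachedTypeArray with its scans counter is invented for illustration. It implements the "check on add" variant.

    ```php
    <?php
    // Userland simulation only; names are hypothetical.
    final class CachedTypeArray
    {
        private $items = [];
        private $memberType = null; // cached type name, null while unknown
        public $scans = 0;          // number of full scans, for illustration

        public function append($value)
        {
            // Keep the cached type when the new element matches it,
            // forget it on mismatch.
            $type = $this->memberType;
            if ($type !== null && !($value instanceof $type)) {
                $this->memberType = null;
            }
            $this->items[] = $value;
        }

        public function isArrayOf($type)
        {
            if ($this->memberType === $type) {
                return true; // O(1): cache hit
            }
            $this->scans++;  // O(N): full scan
            foreach ($this->items as $item) {
                if (!($item instanceof $type)) {
                    return false;
                }
            }
            $this->memberType = $type; // remember for next time
            return true;
        }
    }

    class Foo {}

    $a = new CachedTypeArray;
    $a->append(new Foo);
    $a->append(new Foo);
    var_dump($a->isArrayOf('Foo')); // bool(true), after a full scan
    var_dump($a->isArrayOf('Foo')); // bool(true), from the cache
    var_dump($a->scans);            // int(1)
    $a->append('oops');             // mismatch clears the cache
    var_dump($a->isArrayOf('Foo')); // bool(false)
    ```

    Note that every append() now carries one extra comparison, whether or not isArrayOf() is ever called.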

    best regards
       Patrick
  • Rasmus Lerdorf at Jan 20, 2014 at 8:05 am

    On 1/19/14, 11:31 PM, Patrick Schaaf wrote:
    Regarding the N*M checking complexity, couldn't a little bit of memory
    alleviate the problem?

    For each array, have a "member type" pointer, initially NULL.

    When the member type needs to be checked for the first time, and that
    "member type" pointer is not already set, do an O(N) scan of the array,
    testing compatibility of all members against the hinted type. Error on
    mismatch, but when all members match, remember the type behind the "member
    type" pointer.

    When the member type needs to be checked and the "member type" pointer _is_
    set, just compare it to the hinted type O(1).

    There must be some overhead when elements are added to an array. Either
    "when member type is set, forget it", or even "when member type is already
    set, check compatibility of the newly added element with the member type,
    and then remove member type on mismatch or keep it on match".
    This means every HT becomes larger now and every time you write to a HT
    you at the very least need to do one extra comparison. We use HTs
    everywhere, so this would have a performance impact on every script even
    if you have absolutely no intention of ever using this feature.

    -Rasmus
  • Pierre Joye at Jan 20, 2014 at 8:46 am
    hi,
    On Mon, Jan 20, 2014 at 9:05 AM, Rasmus Lerdorf wrote:
    On 1/19/14, 11:31 PM, Patrick Schaaf wrote:
    Regarding the N*M checking complexity, couldn't a little bit of memory
    alleviate the problem?

    For each array, have a "member type" pointer, initially NULL.

    When the member type needs to be checked for the first time, and that
    "member type" pointer is not already set, do an O(N) scan of the array,
    testing compatibility of all members against the hinted type. Error on
    mismatch, but when all members match, remember the type behind the "member
    type" pointer.

    When the member type needs to be checked and the "member type" pointer _is_
    set, just compare it to the hinted type O(1).

    There must be some overhead when elements are added to an array. Either
    "when member type is set, forget it", or even "when member type is already
    set, check compatibility of the newly added element with the member type,
    and then remove member type on mismatch or keep it on match".
    This means every HT becomes larger now and every time you write to a HT
    you at the very least need to do one extra comparison. We use HTs
    everywhere, so this would have a performance impact on every script even
    if you have absolutely no intention of ever using this feature.
    As I like this RFC for other reasons (type hinting extension), I tend
    to think that the use cases presented here can be done in a more
    efficient way. Someone else mentioned "collections" and it is exactly
    what it is all about. A collection could be an instance of a given
    class, which stores only one class type. It avoids checks on every
    usage or function call; checks are done only when adding a member to
    the collection.

    On another note, and only as a note here as I really do not want to
    hijack this thread with another topic, the more I see this kind of
    RFCs the more I feel like we are heading toward general or global type
    hinting support. I think it could be the right time to actually
    decide what we want. It is getting critical as we keep adding stuff
    about type hinting, one after another, which may lead to
    inconsistencies and will prevent us from having a good type hinting
    mechanism in PHP.

    Cheers,
    --
    Pierre

    @pierrejoye | http://www.libgd.org
  • Lester Caine at Jan 20, 2014 at 10:55 am

    Pierre Joye wrote:
    As I like this RFC for other reasons (type hinting extension), I tend
    to think that the use cases presented here can be done in a more
    efficient way. Someone else mentioned "collections" and it is exactly
    what it is all about. A collection could be an instance of a given
    class, which stores only one class type. It avoids checks on every
    usage or function call; checks are done only when adding a member to
    the collection.
    To my mind this is a lot more PHP friendly? ... If I am building an array of
    data then I have the option to create each element as a 'new' construct, or
    simply cache the data in a manner that a single instance of the object can handle
    multiple sets of data. A collection of data could potentially be managed in a
    single instance?

    --
    Lester Caine - G8HFL
    -----------------------------
    Contact - http://lsces.co.uk/wiki/?page=contact
    L.S.Caine Electronic Services - http://lsces.co.uk
    EnquirySolve - http://enquirysolve.com/
    Model Engineers Digital Workshop - http://medw.co.uk
    Rainbow Digital Media - http://rainbowdigitalmedia.co.uk
  • Lazare Inepologlou at Jan 20, 2014 at 9:54 am
    Hello,

    on 2014/1/16 Sara Golemon wrote:

    * Soft types - Any type preceded by an at sign is hinted as that type,
    but not checked:
    function foo(@Bar $IHopeYoureABarObject) {...}

    Nobody has commented so far about this feature of HHVM, which, in my
    opinion is a win-win:

    + Clarifies the contract between the function designer and the function
    user.
    + Makes the life easy for IDEs and static analysis tools.
    + No run-time penalty.

    Thoughts?


    --

    Lazare INEPOLOGLOU
    Ingénieur Logiciel
  • Robert Stoll at Jan 20, 2014 at 9:58 am
    Hey Lazare
    -----Original Message-----
    From: Lazare Inepologlou
    Sent: Monday, January 20, 2014 10:54 AM
    Cc: internals
    Subject: Re: [PHP-DEV] Introducing "Array Of" RFC

    Hello,

    on 2014/1/16 Sara Golemon wrote:

    * Soft types - Any type preceded by an at sign is hinted as that type,
    but not checked:
    function foo(@Bar $IHopeYoureABarObject) {...}

    Nobody has commented so far about this feature of HHVM, which, in my
    opinion is a win-win:

    + Clarifies the contract between the function designer and the function
    user.
    + Makes the life easy for IDEs and static analysis tools.
    + No run-time penalty.

    Thoughts?
    I think it suits PHP very well but that would definitely be a different RFC.
    And btw. "soft" type hints would not work for typed arrays (since this can only be checked during runtime).
  • Nikita Nefedov at Jan 20, 2014 at 12:08 pm

    On Mon, 20 Jan 2014 11:31:17 +0400, Patrick Schaaf wrote:

    Regarding the N*M checking complexity, couldn't a little bit of memory
    alleviate the problem?

    For each array, have a "member type" pointer, initially NULL.

    When the member type needs to be checked for the first time, and that
    "member type" pointer is not already set, do an O(N) scan of the array,
    testing compatibility of all members against the hinted type. Error on
    mismatch, but when all members match, remember the type behind the
    "member type" pointer.

    When the member type needs to be checked and the "member type" pointer
    _is_ set, just compare it to the hinted type O(1).

    There must be some overhead when elements are added to an array. Either
    "when member type is set, forget it", or even "when member type is
    already set, check compatibility of the newly added element with the
    member type, and then remove member type on mismatch or keep it on match".

    Finally when arrays are copied, and "member type" is already set, copy it
    along with the array. Optionally some of the array functions like
    array_merge might be made aware of the scheme, too.

    BTW, I'm undecided regarding the need for the overall feature, from my
    rather limited personal coding practice it seems to me that whenever I am
    interested in member type checking I'm already looping over the array and
    can easily add up-front instanceof checks to the loops with suitable
    explicit error-out. Which I generally prefer over automatic errors,
    especially fatal ones, because _I_ control the error case behaviour.

    But I think _if_ such a scheme were added, the implementation described
    above would solve the O(N*M) problem with acceptable overall cost.

    best regards
    Patrick
    Hi Patrick,

    I had this cache idea too, but it will fail to get you any speedup if you
    check for an array of interfaces (if you want to store objects of different
    classes, all of which implement one interface). And this is the
    same reason why this proposal is very different from having a typed array
    structure (because, as far as I know, the type of a typed array is always the
    concrete class of the objects that will be stored in it). So if we are
    proposing to add some new SPL structure, keep that in mind.

    And again, a reference to an element in the array can be used to change
    that element without affecting (and thus without calling any of the
    functions related to) the HT.
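
    The reference case is easy to reproduce with a plain array. A minimal sketch, where all_foo() is a made-up userland check standing in for whatever validation happened earlier:

    ```php
    <?php
    class Foo {}

    function all_foo(array $items)
    {
        foreach ($items as $item) {
            if (!($item instanceof Foo)) {
                return false;
            }
        }
        return true;
    }

    $foos = [new Foo, new Foo];
    $first = &$foos[0];       // reference to an array element

    var_dump(all_foo($foos)); // bool(true)

    $first = 'not a Foo';     // ordinary variable assignment, no array write
    var_dump(all_foo($foos)); // bool(false)
    ```

    The second assignment never touches $foos by name, yet the array no longer holds only Foo instances, so a type remembered at the time of the first check would be stale.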
  • Joe Watkins at Jan 20, 2014 at 9:33 am

    On 01/20/2014 04:19 AM, Rasmus Lerdorf wrote:
    I do have an issue with the idea that the right
    solution is a full hash table scan on every function call.
    Rasmus,

      The better solution here appears (to everyone else) to be generics,
    which you actually said you didn't want to see anywhere near PHP. Could
    you please communicate your reasons for that?

      I have concerns ...

      I'm concerned that type hinting isn't robust or complete enough to
    complement using generics.
      I'm wary of the inherent complexity that generics carry, both in their
    implementation and usage.
      I'm concerned that we already have many different kinds of collection,
    and they have introduced inconsistencies (Traversable != array). I don't
    really see a way to add another, and such a complex one, without
    introducing more, or alternatively impacting everything.
      It's a bit crazy to implement a whole paradigm because we wanted a
    specific kind of type hint.

      Is it any of them? Is it something I, or any of the others thinking
    about generics, don't see?

      Do you, or anyone else, anyone at all, see a way to perform the simple
    type hint we set out to perform without incurring the overhead of a full
    table scan?

      I don't want to set out to do anything that is bad, however, I think
    the performance thing is not really a massive concern, after all it's
    not very difficult for something innocent to incur a measurable overhead:

      https://eval.in/91920

      And we're not all going around re-factoring our code to avoid that,
    because it's "normal"; people know that passing a huge array around
    increases memory usage because it's documented, so I don't see the
    problem with documenting how this would work such that those that need
    to avoid the overhead, can.

      Also, variadics incur the same overhead:

      http://lxr.php.net/xref/PHP_5_6/Zend/zend_vm_def.h#3448

      The idea to change the hashtable wouldn't work anyway (for the OP of
    the idea): you can have a reference to a variable that is a member of an
    array and change its type without executing any opcodes related to
    hashtable manipulation.
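
    A small illustration of that last point, as seen from PHP code (the variable names are made up for the example). The array is never written to by name, yet one of its elements changes type:

    ```php
    <?php
    $ints = [1, 2, 3];

    // Take a reference to one element of the array.
    $ref = &$ints[0];

    // A plain variable assignment: $ints does not appear on this line,
    // yet the element that $ref aliases changes type.
    $ref = 'not an int';

    var_dump(is_int($ints[0])); // bool(false)
    var_dump($ints[0]);         // string(10) "not an int"
    ```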

    Cheers
    Joe
  • Robert Stoll at Jan 20, 2014 at 3:34 pm
    Hey Joe,
    I don't want to set out to do anything that is bad, however, I think
    the performance thing is not really a massive concern, after all it's
    not very difficult for something innocent to incur a measurable overhead:

    https://eval.in/91920
    It can be fairly fast for one call only, but imagine you have those type hints all over the place.
    I really don't think it is a good idea to introduce something which we both know is far from ideal. Yet, I like the idea
    behind this RFC and think we should think it over and come up with a better implementation.
    Your last comment was actually thought-provoking
    The idea to change the hashtable wouldn't work anyway (for the OP of
    the idea): you can have a reference to a variable that is a member of an
    array and change its type without executing any opcodes related to
    hashtable manipulation.
    So I sat down and coded an example in PHP. Maybe it can help this RFC to go further. You can find it here:
    https://github.com/robstoll/phputils/blob/master/TypedArray.php

    Seems like the implementation is robust enough for the dirty reference hack (as mentioned by you) by chance. Yet, one
    could still inject invalid types via Reflection. It's not possible to prohibit that as far as I know.

    Brief overview of the implementation
    - works for all types (bool, int, float, string, array, resource, callable, and class/interface)
    - initialisation in two different ways
      $foos = new TypedArray(Types::T_CLASS_OR_INTERFACE, "Foo");
      $bars = TypedArray::createFromArray(Types::T_INT, [1,2,3, 'a'=>5, 'a'=>9]);
    - type hinting can be used as follows:
      /**
       * @param TypedArray<Foo> $arr
       */
      function foo(TypedArray $arr){
          $arr->checkIsSameType(Types::T_CLASS_OR_INTERFACE, "Foo");
      }

    I have not yet implemented Iterator as interface but that could be added easily

    This implementation has of course drawbacks compared to the RFC:
    - it is not obvious what type TypedArray contains just from reading the signature (needs the additional documentation)
    - needs one extra line of code to check if actually the given TypedArray is of the correct type

    And advantages:
    + is already possible with the current PHP version
    + type hint check O(1)
    + add element check O(n)
    + one can still pass a normal array with the initialise syntax (does more or less the same as your implementation):
      foo(TypedArray::createFromArray(Types::T_INT, $arr));


    If we would substitute the one additional line with some syntactic sugar then we would have already a nice
    implementation IMO.
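
    To make the trade-off concrete, here is a much smaller sketch of the same idea. It is not the TypedArray class linked above: TypedList, fromArray(), isOf() and sumInts() are hypothetical names, and only 'int' and class names are handled. Elements are validated as they are added, so the check a consumer performs is a single string comparison rather than a scan.

    ```php
    <?php
    // Hypothetical sketch, not the TypedArray class linked above.
    final class TypedList
    {
        private $type;
        private $items = [];

        public function __construct($type)
        {
            $this->type = $type;
        }

        // Build from a plain array: each element is checked once, here.
        public static function fromArray($type, array $values)
        {
            $list = new self($type);
            foreach ($values as $value) {
                $list->add($value);
            }
            return $list;
        }

        public function add($value)
        {
            if (!$this->accepts($value)) {
                throw new InvalidArgumentException("Expected {$this->type}");
            }
            $this->items[] = $value;
        }

        // The consumer's check compares one string; it does not scan the elements.
        public function isOf($type)
        {
            return $this->type === $type;
        }

        public function toArray()
        {
            return $this->items;
        }

        private function accepts($value)
        {
            if ($this->type === 'int') {
                return is_int($value);
            }
            $class = $this->type;
            return $value instanceof $class;
        }
    }

    function sumInts(TypedList $numbers)
    {
        if (!$numbers->isOf('int')) {
            throw new InvalidArgumentException('Expected a TypedList of int');
        }
        return array_sum($numbers->toArray());
    }

    echo sumInts(TypedList::fromArray('int', [1, 2, 3])), "\n"; // prints 6
    ```

    The extra isOf() line in the consumer is the "one additional line" that syntactic sugar would have to replace.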

    Cheers,
    Robert
  • Philip Sturgeon at Jan 20, 2014 at 6:43 pm
    Hey,

    So I have not been as active in this thread as I would have liked, but
    life has a tendency to happen, and it's been happening a lot.

    I would just like to cover multiple topics that have popped up, in a
    bid to get things back on track.


    Scalar Type Hints
    -----------------

    A lot of people obviously feel strongly that scalar type hints are
    lacking here, and that while we are discussing type hints now would
    be a good time to talk about that. It isn't.

    There is an RFC for scalars and I would love to discuss it another time.

    https://wiki.php.net/rfc/scalar_type_hinting_with_cast

    This "array of" RFC will be made better by scalar types happening, but
    they do not have to happen first so let's not worry about them now.


    Generics
    -------------

    As Joe has pointed out, this RFC has nothing to do with generics. They
    are a whole different kettle of fish.

    1. Rasmus hates them

    http://comments.gmane.org/gmane.comp.php.devel/76495


    2. I hear a lot of people saying "PHP is turning into Java" every time
    anything vaguely OOP happens (which is frustrating, and... factually
    inaccurate) but adding generics would be INCREDIBLY Java-esque.

    http://en.wikipedia.org/wiki/Generics_in_Java

    Those who are interested in it should discuss it, but I do not know
    why we are talking about them here. Maybe a new RFC and a new thread
    for Generics can occur, but they are not alternatives to the same
    solution.

    Use-cases
    --------------

    If you don't see the use, maybe ask about the use, don't assume there
    is no use. I will happily repeat this on every single RFC thread in
    the future, because it happens every single time.


    "Type-checking is not PHP"
    --------------------------------------

    PHP is a loosely typed language. It allows arrays to contain a mixture
    of content. It is very vague. GREAT.

    Type-hints are lovely for when you do not want to be vague.

    A plain-old function declaration:

    1. "Give me a bag of stuff, and I'll try using everything like it's a
    spanner, but it could be a banana and... well I guess my wing-nuts
    will get covered in banana gunk, but whatever, YAY WEAK TYPING."

    or

    2. "Give me a bag of stuff and I'll fish around in there trying to
    find spanners, and ignore everything else"

    This RFC optionally allows developers to say:

    3. "Give me a bag of spanners, and break if they put anything else in there."

    Optional, opt-in, simple, logical, useful.
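
    The three positions can be sketched in today's PHP. The class and function names are hypothetical, and option 3 is written out by hand as a stand-in for what "function (Spanner[] $bag)" would express under the RFC:

    ```php
    <?php
    class Spanner
    {
        public function turn()
        {
            return 'turned';
        }
    }

    // 1. Weak typing: treat everything in the bag as a spanner.
    function turnEverything(array $bag)
    {
        $results = [];
        foreach ($bag as $thing) {
            $results[] = $thing->turn(); // fails here if $thing has no turn() method
        }
        return $results;
    }

    // 2. Fish out the spanners and silently ignore the rest.
    function turnSpannersOnly(array $bag)
    {
        $results = [];
        foreach ($bag as $thing) {
            if ($thing instanceof Spanner) {
                $results[] = $thing->turn();
            }
        }
        return $results;
    }

    // 3. Hand-written stand-in for the proposed hint:
    //    reject the whole bag up front if anything else is in there.
    function turnStrict(array $bag)
    {
        foreach ($bag as $thing) {
            if (!($thing instanceof Spanner)) {
                throw new InvalidArgumentException('Expected an array of Spanner');
            }
        }
        return turnEverything($bag);
    }

    $mixed = [new Spanner(), 'banana'];

    var_dump(count(turnSpannersOnly($mixed))); // int(1)

    try {
        turnStrict($mixed);
    } catch (InvalidArgumentException $e) {
        echo $e->getMessage(), "\n"; // Expected an array of Spanner
    }
    ```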


    Performance
    -------------------

    This adds overhead when used. Like anything, doing more stuff takes
    longer. Check.

    The overhead is small, and the use-cases where I see this stuff
    happening makes it unlikely for usage to be nested in such a way that
    this would run multiple times.

    That said it is known that the typing system needs an overhaul. Joe
    has said that while this does add a trivial overhead, it can easily be
    taken care of in future versions with a general drive-by on types.

    It's great to see people discussing caches and other performance
    improvements, it certainly does not seem impossible to speed this up a
    little, but it's hardly slow enough to impact anything unless you have
    hundreds of these things handling thousands of records. Regardless, if
    you are actually doing that then you need to stop and think about your
    actions.


    Enforcing OOP
    ---------------------

    Not the case. As has been said, this is equally useful for scalar type
    hinting - when that gets done. You can type-hint anything, not just
    interfaces. Even currently while only array, callable and OOP stuff is
    hintable, this optional feature does not enforce OOP any more than
    current type hinting.


    Doesn't Handle Traversable
    ----------------------------------------

    Right now the typing system cannot handle a hint for both "foo(array
    $bar)" and "foo(Traversable $bar)", so it is purely a consistency thing
    that makes this RFC not try to attack that either. The overhaul of the
    typing system mentioned in the Performance section could approach
    this, but this has nothing to do with the RFC in question and should
    be considered off topic. I did explain that on the RFC itself.


    "Just use collections"
    ------------------------------

    Something that I have seen before in these discussions is this
    conversational approach:

    "Why add a feature to do this in a nice, neat, logical fashion, when
    you could just throw loads of code at it?!"

    This time it is happening with the suggestion of using Collections,
    instead of defining the content at a declaration level.

    If I want a bag of spanners, then I can absolutely define
    "SpannerCollection" and require that on the type-hint level. Sure.

    Personally I hate the idea of forcing the implementor of a method (who
    may not be the same as the developer of said method) to change their
    code on the outside, purely so I can have the convenience of knowing
    the contents of the argument on the inside.

    I have the overhead of autoloading, finding, populating and generally
    f**king around with that collection, when all I want or need is the
    ability to be certain that the argument only contains spanners, and
    nothing else.

    Suggesting that throwing more code at this is a viable solution is
    madness, and this specific example is just suggesting a weak form of
    generics should be used. Again, this is not generics.
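
    As a rough sketch of what the collection route asks of the caller (SpannerCollection and tighten() are hypothetical, and real collection classes tend to carry far more code than this): a caller who already holds a plain array has to repack it before the call.

    ```php
    <?php
    class Spanner {}

    // Hypothetical collection class: the type guarantee lives in add().
    class SpannerCollection implements IteratorAggregate, Countable
    {
        private $spanners = [];

        public function add(Spanner $spanner)
        {
            $this->spanners[] = $spanner;
        }

        public function getIterator(): Iterator
        {
            return new ArrayIterator($this->spanners);
        }

        public function count(): int
        {
            return count($this->spanners);
        }
    }

    function tighten(SpannerCollection $spanners)
    {
        return count($spanners);
    }

    // Caller side: an existing plain array has to be repacked first.
    $plain = [new Spanner(), new Spanner()];

    $collection = new SpannerCollection();
    foreach ($plain as $spanner) {
        $collection->add($spanner);
    }

    echo tighten($collection), "\n"; // prints 2
    ```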


    Syntax
    ---------

    There have been a lot of people suggesting various types of syntax. As
    Joe said, using generics syntax for not generics would be a travesty,
    and an overcomplication of what should be a simple feature.

    For those who hate OOP, array of would be a lovely way to ask for an
    array of callables. Trying to make generics happen is a great way to
    force not only new syntax, but a brand new OOP paradigm that will be
    new for EVERYONE, so ignoring that syntax and letting our functional
    and OOP folks have a nice thing shouldn't be considered a negative.

    Once we decide that this is not generics, we can discuss syntax much more easily.

    So.

    function (Foo[] $foo) - No nulls.

    This is something I hope we can all agree on. It is by no means
    confusing, but I would be happy to run a poll and get NetTuts to tweet
    it, to see if the average user is confused by this syntax. I've
    tweeted about it and had 1 out of 50ish replies saying they weren't
    sure. The confusion of 2% is something that can easily be fixed with
    documentation and time.

    Again, I'll be blogging about 5.6 features on NetTuts so those same
    beginner level users will know all about it, and our documentation
    will explain it for everyone else, meaning a 2% sub-section of users
    will be EASY to fix.


    Allowing Nulls
    --------------------

    Some folks are concerned that forcing this feature to not consider
    null as a valid array entry is somehow a loss to the feature. We can
    potentially fix this with MOAR SYNTAX. I offer 3 options here:

    1. function (Foo[]? $foo)

    Maybe. Not sure we even care about this, and gets a little confusing
    with &Foo[]? but it could be considered.

    2. function (Foo|null[] $foo)

    Fits in line with DocBlock syntax, could allow multiple types, but
    folks would assume that function (Foo|null $foo) is also ok.

    My suggestion:

    3. Do nothing

    We implement this feature, as is, then we have a follow-up for
    allowing type-hinting on multiples:

    The new RFC at the same time would implement

    function (Foo|Bar $foo)
    function (Foo|null[] $foo)
    function (Foo|Bar[] $foo) or... function (Foo[]|Bar[] $foo) I guess?

    Being able to say "I would like a spanner or a monkey-wrench" is a
    perfectly valid use-case, just as "I would like a spanner or an 'I owe
    you'" is another valid use-case. But to shove nulls in with something
    that specifically expects a spanner is clearly a really weird thing to
    do.
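
    Whatever syntax ends up being chosen, the difference in behaviour is small. A userland sketch, where isArrayOf() is a hypothetical helper and the $allowNull flag merely stands in for the Foo[]? or Foo|null[] forms above:

    ```php
    <?php
    class Foo {}

    // Hypothetical helper: $allowNull stands in for the nullable syntax options.
    function isArrayOf(array $items, $class, $allowNull = false)
    {
        foreach ($items as $item) {
            if ($item === null && $allowNull) {
                continue; // nulls are tolerated only when explicitly requested
            }
            if (!($item instanceof $class)) {
                return false;
            }
        }
        return true;
    }

    var_dump(isArrayOf([new Foo(), null], 'Foo'));       // bool(false)
    var_dump(isArrayOf([new Foo(), null], 'Foo', true)); // bool(true)
    ```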

    Either way, there are a LOT of topics being discussed in this thread
    that are not super relevant to what is actually being suggested.

    I'd really like it if we could discuss just this feature, and keep the
    generics, OOP arguments, anti-type-checking, and "bemoaning of
    edge-case performance issues on type system that needs some love
    anyway" conversations for other threads.

    Thanks! :)
  • Sara Golemon at Jan 20, 2014 at 7:17 pm
    Short version: "This RFC is not about Scalar Type Hints or Generics"
    But it is.

    "ArrayOf", whatever form it takes, is incomplete without scalar
    specializations. Is implementing half-a-feature (arrayof without
    scalars) good enough? Maybe, but that doesn't mean we shouldn't be
    having that discussion.
    There have been a lot of people suggesting various types of syntax. As
    Joe said, using generics syntax for not generics would be a travesty,
    and an overcomplication of what should be a simple feature.
    ArrayOf is not a separate topic from Generics, it is by definition a
    narrowly-scoped form of generics. "ArrayOf Foo" is a "Foo"
    specialization of the array generic (even if it's not labeled as such
    due to not having other types of generics). "This is an array, but
    it's an array just for Foos". Can arrayof be implemented in a way
    which hides this heritage? Sure, but should it?

    I'd really like it if we could discuss just this feature, and keep the
    generics, OOP arguments, anti-type-checking, and "bemoaning of
    edge-case performance issues on type system that needs some love
    anyway" conversations for other threads.
    Just to reinforce my points above, I disagree with this paragraph.
    Looking at specific cases while ignoring the larger picture of the
    language's design is what gets us into ugly corners (inconsistently
    named functions, parameter ordering, multiple autoloader mechanisms,
    etc...)

    Ignoring elements which fit closely together is a short-sighted mistake.

    -Sara
  • Andrea Faulds at Jan 20, 2014 at 7:24 pm

    On 20/01/14 19:17, Sara Golemon wrote:
    ArrayOf is not a separate topic from Generics, it is by definition a
    narrowly-scoped form of generics. "ArrayOf Foo" is a "Foo"
    specialization of the array generic (even if it's not labeled as such
    due to not having other types of generics). "This is an array, but
    it's an array just for Foos". Can arrayof be implemented in a way
    which hides this heritage? Sure, but should it?
    This is a quite good point. If we're introducing limited-scope Generics
    here, what's wrong with using Generics syntax, especially since it will
    be familiar to C#, C++ and Java users?
    --
    Andrea Faulds
    http://ajf.me/
  • Crypto Compress at Jan 20, 2014 at 7:42 pm
    Hi,

    somewhat off topic, sorry. Yet, generics are *not* the same as the proposed
    syntax:

    protected void Foo(int[] foo) {}
    protected void Foo(Dictionary<int, string> foo) {}
    protected void Foo(Dictionary<int, string>[] foo) {}

    cryptocompress
  • Andrea Faulds at Jan 20, 2014 at 7:46 pm

    On 20/01/14 19:37, Philip Sturgeon wrote:
    On Mon, Jan 20, 2014 at 2:24 PM, Andrea Faulds wrote:
    This is a quite good point. If we're introducing limited-scope Generics
    here, what's wrong with using Generics syntax, especially since it will be
    familiar to C#, C++ and Java users?
    If we use the generics syntax for not-generics, then we can't EVER
    implement generics properly, so that would be an incredibly bad idea.
    But these *are* generics. If we add "proper generics" in future,
    Array<int> would keep meaning the same thing, though I suppose we'd need
    to allow implicit casting of Arrays of ints. That's all that'd change.
    This doesn't block adding proper generics, it adds a weak and limited
    form of them for arrays only, which could be improved in performance
    with "proper" generics later if we so wished.

    --
    Andrea Faulds
    http://ajf.me/
  • Andrea Faulds at Jan 20, 2014 at 8:18 pm

    On 20/01/14 19:54, Philip Sturgeon wrote:
    No, they are not generics. They do not require setup or definition of
    ANY kind, they are just arrays of content.
    Right. Implicit Array of int to Array<int> cast. These are generics.
    Used purely as an example (not as statement of love for Java), Java
    does have both. They are not alternatives to the same goal.
    Java arrays and PHP arrays are hardly comparable.

    --
    Andrea Faulds
    http://ajf.me/
  • Sara Golemon at Jan 20, 2014 at 8:17 pm

    This is a quite good point. If we're introducing limited-scope Generics
    here, what's wrong with using Generics syntax, especially since it will be
    familiar to C#, C++ and Java users?
    If we use the generics syntax for not-generics, then we can't EVER
    implement generics properly, so that would be an incredibly bad idea.
    That's an excellent point, and all the better reason to discuss their
    overlap NOW, rather than further down the road when we can't change
    arrayof syntax.

    To borrow from another message in this thread:

    Dictionary<int, string>[] foo <-- Is that *really* what you want an
    array of int->string dictionaries to look like? Really?

    -Sara
  • Crypto Compress at Jan 20, 2014 at 8:53 pm

    Am 20.01.2014 20:55, schrieb Sara Golemon:
    Dictionary<int, string>[] foo <-- Is that *really* what you want an
    array of int->string dictionaries to look like? Really?
    Yes, i like this clean separation between base/scalar types and
    generics. We can have this RFC *and* generics. Can't see any problems
    here. This is not critique of your work or generics!

    The performance penalty under discussion got in a long time before this RFC:
    http://3v4l.org/os5Gg

    The only thing to solve is the nullable-issue...

    cryptocompress
  • Sara Golemon at Jan 20, 2014 at 9:03 pm

    Dictionary<int, string>[] foo <-- Is that *really* what you want an
    array of int->string dictionaries to look like? Really?
    Yes, i like this clean separation between base/scalar types and generics. We
    can have this RFC *and* generics. Can't see any problems here. This is not
    critique of your work or generics!
    Eh... Makes one of us then. I think that syntax looks horrendous.
    Agree to disagree.
    The performance penalty under discussion got in a long time before this RFC:
    http://3v4l.org/os5Gg
    That's not a discussion, that's a contrived benchmark, and a deeply
    flawed one since it amplifies general function call overhead and
    doesn't look at type-checking arbitrarily large arrays, let alone
    nested arrays. That benchmark was designed to lose any perf hit in
    the noise.
    The only thing to solve is the nullable-issue...
    Apart from performance and syntax.

    -Sara
  • Stas Malyshev at Jan 20, 2014 at 9:01 pm
    Hi!
    Dictionary<int, string>[] foo <-- Is that *really* what you want an
    array of int->string dictionaries to look like? Really?
    In every (popular) dynamic language on the planet (Perl, Python, Ruby,
    Javascript, Lua, Groovy) it looks roughly like that (within syntax
    variations):

    foo = [Dictionary()]

    I wonder why it is enough for them? Are they all missing something very
    important?
    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227
  • Sara Golemon at Jan 20, 2014 at 9:06 pm

    Dictionary<int, string>[] foo <-- Is that *really* what you want an
    array of int->string dictionaries to look like? Really?
    In every (popular) dynamic language on the planet (Perl, Python, Ruby,
    Javascript, Lua, Groovy) it looks roughly like that (within syntax
    variations):

    foo = [Dictionary()]

    I wonder why it is enough for them? Are they all missing something very
    important?
    To quote someone whose skill and abilities I respect: "Just because
    they do it some way in X, doesn't mean it belongs in PHP."

    Hint, I'm talking about you.

    -Sara
  • Stas Malyshev at Jan 20, 2014 at 9:27 pm
    Hi!
    I wonder why it is enough for them? Are they all missing something very
    important?
    To quote someone whose skill and abilities I respect: "Just because
    they do it some way in X, doesn't mean it belongs in PHP."
    That is true. But this is not "just because" - I'm not arguing about any
    specific syntax or even semantics. I'm talking about all of them not
    even going in that general direction and still being fine. That makes me
    question if the need for generics is that great and if they actually are
    a good fit for a dynamic language, at least in a way that they are
    presented - i.e. as type checking mechanism, not as code generation
    mechanism. I could repeat the regular arguments why I think they are not
    if you'd like - I did it on this list like 100 times and everybody is
    tired and hating me for that already, but I could do it again on popular
    request. So it's not the only argument. It is an additional argument
    providing some perspective.

    --
    Stanislav Malyshev, Software Architect
    SugarCRM: http://www.sugarcrm.com/
    (408)454-6900 ext. 227
  • Sara Golemon at Jan 20, 2014 at 9:37 pm

    To quote someone whose skill and abilities I respect: "Just because
    they do it some way in X, doesn't mean it belongs in PHP."
    That is true. But this is not "just because" - I'm not arguing about any
    specific syntax or even semantics. I'm talking about all of them not
    even going in that general direction and still being fine. That makes me
    question if the need for generics is that great and if they actually are
    a good fit for a dynamic language, at least in a way that they are
    presented - i.e. as type checking mechanism, not as code generation
    mechanism. I could repeat the regular arguments why I think they are not
    if you'd like - I did it on this list like 100 times and everybody is
    tired and hating me for that already, but I could do it again on popular
    request. So it's not the only argument. It is an additional argument
    providing some perspective.
    This is an issue that keeps coming up. Calling for typed arrays to
    look like generics is not the same as calling for generics.

    It's an acknowledgement that there is a fundamental shared behavior
    between the two. A generic implementing ArrayAccess and Traversable
    is not different **FROM A USERSPACE POINT OF VIEW** from a typed array
    (apart from object pseudoreference semantics and the ability to check
    on input). Given that they look the same **FROM A USERSPACE POINT OF
    VIEW**, they should share common syntax.
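
    A sketch of that equivalence and of the one visible difference. FooList and appendFoo() are hypothetical, the class is hard-wired to Foo rather than being an actual generic, and the code assumes PHP 8.0 or later for the "mixed" return type:

    ```php
    <?php
    class Foo {}

    // Hypothetical typed container that can be used with array syntax.
    class FooList implements ArrayAccess, IteratorAggregate
    {
        private $items = [];

        public function offsetExists($offset): bool
        {
            return isset($this->items[$offset]);
        }

        public function offsetGet($offset): mixed
        {
            return $this->items[$offset];
        }

        public function offsetSet($offset, $value): void
        {
            if (!($value instanceof Foo)) {
                throw new InvalidArgumentException('FooList only accepts Foo');
            }
            if ($offset === null) {
                $this->items[] = $value; // $list[] = ...
            } else {
                $this->items[$offset] = $value;
            }
        }

        public function offsetUnset($offset): void
        {
            unset($this->items[$offset]);
        }

        public function getIterator(): Iterator
        {
            return new ArrayIterator($this->items);
        }
    }

    // Works on both a plain array and a FooList: the syntax is identical.
    function appendFoo($target)
    {
        $target[] = new Foo();
    }

    $array = [new Foo()];
    $list = new FooList();
    $list[] = new Foo();

    appendFoo($array);
    appendFoo($list);

    // The difference: arrays are passed by value, objects by handle.
    var_dump(count($array));         // int(1): the callee modified a copy
    var_dump(iterator_count($list)); // int(2): the callee modified the same object
    ```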

    This does not mean that we have to add (or even agree to add) generics
    in order to adopt typed arrays. This does not mean the implementation
    of typed arrays need share anything in common with generics should
    they someday be voted in.

    This is just thinking of the consistency of the language.
    This is about how the user interacts with the syntax.

    -Sara
