FAQ

[scala-internals] Seq aliasing scala.collection.Seq considered dangerous

Heiko Seeberger
Oct 22, 2012 at 4:44 pm
Hi,

Many Scala users, even very advanced ones, don't know that Seq (as defined in the package object scala) is *not* an alias for scala.collection.immutable.Seq like many other collection aliases (Set, Map, etc.). Actually Seq is an alias for scala.collection.Seq which could be immutable or mutable.

Therefore, if one does not explicitly import scala.collection.immutable.Seq, mutability can leak into APIs which are considered immutable. This is in contradiction to Scala making it easy to write immutable code. And it simply is dangerous. Any chance this can be fixed?

Thanks
Heiko

--

Heiko Seeberger
Twitter: @hseeberger
Blog: heikoseeberger.name
Company: Typesafe - The software stack for applications that scale
Author of "Durchstarten mit Scala, a tutorial-style Scala book"
reply

Search Discussions

65 responses

  • Alex Cruise at Oct 22, 2012 at 5:43 pm

    On Mon, Oct 22, 2012 at 9:44 AM, Heiko Seeberger wrote:

    Many Scala users, even very advanced ones, don't know that Seq (as defined
    in the package object scala) is *not* an alias for
    scala.collection.immutable.Seq like many other collection aliases (Set,
    Map, etc.). Actually Seq is an alias for scala.collection.Seq which could
    be immutable or mutable.

    Therefore, if one does not explicitly
    import scala.collection.immutable.Seq, mutability can leak into APIs which
    are considered immutable. This is in contradiction to Scala making it easy
    to write immutable code. And it simply is dangerous. Any chance this can be
    fixed?
    Wow, that's... Surprising.

    -0xe1a
  • Daniel Sobral at Oct 22, 2012 at 6:04 pm

    On Mon, Oct 22, 2012 at 2:44 PM, Heiko Seeberger wrote:
    Hi,

    Many Scala users, even very advanced ones, don't know that Seq (as defined
    in the package object scala) is *not* an alias for
    scala.collection.immutable.Seq like many other collection aliases (Set, Map,
    etc.). Actually Seq is an alias for scala.collection.Seq which could be
    immutable or mutable.

    Therefore, if one does not explicitly import scala.collection.immutable.Seq,
    mutability can leak into APIs which are considered immutable. This is in
    contradiction to Scala making it easy to write immutable code. And it simply
    is dangerous. Any chance this can be fixed?
    It would prevent Array from being passed as Seq, which would be very
    annoying when doing things around varargs.

    --
    Daniel C. Sobral

    I travel to the future all the time.
  • Paul Phillips at Oct 22, 2012 at 6:29 pm

    On Mon, Oct 22, 2012 at 10:56 AM, Daniel Sobral wrote:

    It would prevent Array from being passed as Seq, which would be very
    annoying when doing things around varargs.
    There is a cleanup available here which addresses more than one problem.
    This has always horrified me:

    scala> val xs = Array(1, 2, 3)
    xs: Array[Int] = Array(1, 2, 3)

    scala> case class Foo(args: Int*) { }
    defined class Foo

    scala> Foo(xs: _*)
    res0: Foo = Foo(WrappedArray(1, 2, 3))

    scala> xs(0) = 123

    scala> res0
    res2: Foo = Foo(WrappedArray(123, 2, 3))

    Treating varargs this way, with no way to exclude mutable inputs, taints
    varargs for all uses. This is especially annoying because case classes
    privilege varargs such that you can pattern match with unapplySeq semantics.

    scala> res0 match { case Foo(123, xs @ _*) => xs }
    res3: Seq[Int] = WrappedArray(2, 3)

    So to me, varargs should require immutable.Seq, just as it would if the
    arguments were being passed individually. That would mean a defensive copy
    of the array has to be made. I would consider this a feature. If that
    were to be a performance problem in some contexts, then "don't do that."

    Undoubtedly controversial, but an implicit from Array -> immutable.Seq
    (which copies) would be easy enough to offer.
  • Jonas Bonér at Oct 22, 2012 at 6:38 pm

    On Mon, Oct 22, 2012 at 8:29 PM, Paul Phillips wrote:
    On Mon, Oct 22, 2012 at 10:56 AM, Daniel Sobral wrote:

    It would prevent Array from being passed as Seq, which would be very
    annoying when doing things around varargs.

    There is a cleanup available here which addresses more than one problem.
    This has always horrified me:

    scala> val xs = Array(1, 2, 3)
    xs: Array[Int] = Array(1, 2, 3)

    scala> case class Foo(args: Int*) { }
    defined class Foo

    scala> Foo(xs: _*)
    res0: Foo = Foo(WrappedArray(1, 2, 3))

    scala> xs(0) = 123

    scala> res0
    res2: Foo = Foo(WrappedArray(123, 2, 3))

    Treating varargs this way, with no way to exclude mutable inputs, taints
    varargs for all uses. This is especially annoying because case classes
    privilege varargs such that you can pattern match with unapplySeq semantics.

    scala> res0 match { case Foo(123, xs @ _*) => xs }
    res3: Seq[Int] = WrappedArray(2, 3)

    So to me, varargs should require immutable.Seq, just as it would if the
    arguments were being passed individually. That would mean a defensive copy
    of the array has to be made. I would consider this a feature. If that were
    to be a performance problem in some contexts, then "don't do that."

    Undoubtedly controversial, but an implicit from Array -> immutable.Seq
    (which copies) would be easy enough to offer.
    I like that.

    >



    --
    Jonas Bonér
    Phone: +46 733 777 123
    Home: http://jonasboner.com
    Twitter: @jboner
  • Martin odersky at Oct 22, 2012 at 7:53 pm

    On Mon, Oct 22, 2012 at 8:30 PM, Jonas Bonér wrote:
    On Mon, Oct 22, 2012 at 8:29 PM, Paul Phillips wrote:
    On Mon, Oct 22, 2012 at 10:56 AM, Daniel Sobral wrote:

    It would prevent Array from being passed as Seq, which would be very
    annoying when doing things around varargs.

    There is a cleanup available here which addresses more than one problem.
    This has always horrified me:

    scala> val xs = Array(1, 2, 3)
    xs: Array[Int] = Array(1, 2, 3)

    scala> case class Foo(args: Int*) { }
    defined class Foo

    scala> Foo(xs: _*)
    res0: Foo = Foo(WrappedArray(1, 2, 3))

    scala> xs(0) = 123

    scala> res0
    res2: Foo = Foo(WrappedArray(123, 2, 3))

    Treating varargs this way, with no way to exclude mutable inputs, taints
    varargs for all uses. This is especially annoying because case classes
    privilege varargs such that you can pattern match with unapplySeq
    semantics.
    scala> res0 match { case Foo(123, xs @ _*) => xs }
    res3: Seq[Int] = WrappedArray(2, 3)

    So to me, varargs should require immutable.Seq, just as it would if the
    arguments were being passed individually. That would mean a defensive copy
    of the array has to be made. I would consider this a feature. If that were
    to be a performance problem in some contexts, then "don't do that."

    Undoubtedly controversial, but an implicit from Array -> immutable.Seq
    (which copies) would be easy enough to offer.
    I like that.
    I am not sure. Java uses mutable arrays. We will no doubt lose considerable
    performance in the interest of purity. Can we afford it? Right now, "slow
    collections" is one of the main arguments Scala detractors put forward.

    And, it seems nobody in Java land cares one jota about loss of purity
    anyway. But they will hang us on the performance argument.

    I am not yet decided on this, but just wanted to raise some doubts.

    Cheers

    - Martin



    --
    Jonas Bonér
    Phone: +46 733 777 123
    Home: http://jonasboner.com
    Twitter: @jboner


    --
    Martin Odersky
    Prof., EPFL <http://www.epfl.ch> and Chairman, Typesafe<http://www.typesafe.com>
    PSED, 1015 Lausanne, Switzerland
    Tel. EPFL: +41 21 693 6863
    Tel. Typesafe: +41 21 691 4967
  • Daniel Spiewak at Oct 22, 2012 at 8:57 pm
    As someone who cares very deeply about collection performance, I still fall
    on the "make type Seq = immutable.Seq" side of the fence. The reason for
    this is I have no expectation of performance when I'm using varargs
    functions. Seriously! I will never, ever use a varargs function in a hot
    path, regardless of whether or not I am passing an Array as a Seq (which,
    incidentally, is something I would never do anyway). Is this a regression
    from Java's varargs semantics? Maybe, but it's something I can live with.

    As has been pointed out, Java's varargs are broken. We shouldn't preserve
    an extremely dangerous type semantic merely to attempt the preservation of
    Java's brokenness.

    If Scala varargs performance were really a legitimate issue, the very first
    thing that should be done is create specialized "SeqN" collections for
    common arities. In my testing, this results in an order of magnitude
    improvement, both in space and time. Since this hasn't been done (and is
    not even on the radar), my assumption is that varargs are not considered
    (in general) to be a performance-sensitive feature, and therefore we
    shouldn't use varargs performance as a reason to restrict improvement in
    other areas.

    Daniel
    On Mon, Oct 22, 2012 at 1:46 PM, martin odersky wrote:


    On Mon, Oct 22, 2012 at 8:30 PM, Jonas Bonér wrote:

    On Mon, Oct 22, 2012 at 8:29 PM, Paul Phillips <pau...@...org>
    wrote:
    On Mon, Oct 22, 2012 at 10:56 AM, Daniel Sobral <dcs...@...com>
    wrote:
    It would prevent Array from being passed as Seq, which would be very
    annoying when doing things around varargs.

    There is a cleanup available here which addresses more than one problem.
    This has always horrified me:

    scala> val xs = Array(1, 2, 3)
    xs: Array[Int] = Array(1, 2, 3)

    scala> case class Foo(args: Int*) { }
    defined class Foo

    scala> Foo(xs: _*)
    res0: Foo = Foo(WrappedArray(1, 2, 3))

    scala> xs(0) = 123

    scala> res0
    res2: Foo = Foo(WrappedArray(123, 2, 3))

    Treating varargs this way, with no way to exclude mutable inputs, taints
    varargs for all uses. This is especially annoying because case classes
    privilege varargs such that you can pattern match with unapplySeq
    semantics.
    scala> res0 match { case Foo(123, xs @ _*) => xs }
    res3: Seq[Int] = WrappedArray(2, 3)

    So to me, varargs should require immutable.Seq, just as it would if the
    arguments were being passed individually. That would mean a defensive copy
    of the array has to be made. I would consider this a feature. If that were
    to be a performance problem in some contexts, then "don't do that."

    Undoubtedly controversial, but an implicit from Array -> immutable.Seq
    (which copies) would be easy enough to offer.
    I like that.
    I am not sure. Java uses mutable arrays. We will no doubt lose
    considerable performance in the interest of purity. Can we afford it? Right
    now, "slow collections" is one of the main arguments Scala detractors put
    forward.

    And, it seems nobody in Java land cares one jota about loss of purity
    anyway. But they will hang us on the performance argument.

    I am not yet decided on this, but just wanted to raise some doubts.

    Cheers

    - Martin



    --
    Jonas Bonér
    Phone: +46 733 777 123
    Home: http://jonasboner.com
    Twitter: @jboner


    --
    Martin Odersky
    Prof., EPFL <http://www.epfl.ch> and Chairman, Typesafe<http://www.typesafe.com>
    PSED, 1015 Lausanne, Switzerland
    Tel. EPFL: +41 21 693 6863
    Tel. Typesafe: +41 21 691 4967
  • Rex Kerr at Oct 22, 2012 at 10:03 pm

    On Mon, Oct 22, 2012 at 3:55 PM, Daniel Spiewak wrote:

    As someone who cares very deeply about collection performance, I still
    fall on the "make type Seq = immutable.Seq" side of the fence. The reason
    for this is I have no expectation of performance when I'm using varargs
    functions. Seriously!

    Orders of magnitude still matter. Making a defensive immutable copy of
    arrays (and other mutable collections) on Seq is a great way to force
    people to optimize code paths that otherwise wouldn't have required any
    attention.

    I will never, ever use a varargs function in a hot path, regardless of
    whether or not I am passing an Array as a Seq (which, incidentally, is
    something I would never do anyway). Is this a regression from Java's
    varargs semantics? Maybe, but it's something I can live with.
    But you will have to redefine what your "hot path" is in some cases. If
    you're using arrays, you already know not to mutate them out from under
    yourself. Why double-to-(O(1)->O(n)) your runtime when you are already
    aware that you're using arrays and that you have to keep their mutability
    in mind?

    As has been pointed out, Java's varargs are broken. We shouldn't preserve
    an extremely dangerous type semantic merely to attempt the preservation of
    Java's brokenness.
    They are efficient and work as designed. You can use Java varargs for
    moderately (but not extremely) performance-critical stuff. I don't know
    how that is broken except inasmuch as everything mutable is broken (i.e.
    breakable).

    If Scala varargs performance were really a legitimate issue, the very
    first thing that should be done is create specialized "SeqN" collections
    for common arities.

    Agreed. That would be 3-4x faster than the Java varargs method for small
    arities when dynamic dispatch could be skipped (and assuming
    specialization, or that you wouldn't use it for primitives).

    In my testing, this results in an order of magnitude improvement, both in
    space and time.

    Not over Java varargs. The space usage is comparable to an array. Time is
    ~3-4x. Not that you're likely to run out of memory because of varargs
    usage in any case.

    Since this hasn't been done (and is not even on the radar), my assumption
    is that varargs are not considered (in general) to be a
    performance-sensitive feature, and therefore we shouldn't use varargs
    performance as a reason to restrict improvement in other areas.
    Okay, but that doesn't mean it's wise to degrade performance by another
    factor of two-to-N to save you from a problem that you already should be
    well-aware of. Let's not make varargs usage the hotspot sooner than we
    need to.

    --Rex

    P.S. We'd also need to add a ClonedArray class in the immutable part of the
    hierarchy, or the performance could get much, much worse as we run into
    unavoidable boxing of primitives. The only reason it's not much, much
    worse now is that WrappedArray.ofX is effectively specialized.
  • Paul Phillips at Oct 23, 2012 at 11:34 pm
    On Mon, Oct 22, 2012 at 3:03 PM, Rex Kerr wrote:
    If Scala varargs performance were really a legitimate issue, the very
    first thing that should be done is create specialized "SeqN" collections
    for common arities.

    Agreed. That would be 3-4x faster than the Java varargs method for small
    arities when dynamic dispatch could be skipped (and assuming
    specialization, or that you wouldn't use it for primitives).
    Can you guys give me some code examples of exactly what you're talking
    about with this, ideally with highlighted sections labeled "what it does
    now" and "what it should do" ?
  • Rex Kerr at Oct 24, 2012 at 4:08 pm
    On Tue, Oct 23, 2012 at 7:34 PM, Paul Phillips wrote:
    On Mon, Oct 22, 2012 at 3:03 PM, Rex Kerr wrote:


    If Scala varargs performance were really a legitimate issue, the very
    first thing that should be done is create specialized "SeqN" collections
    for common arities.

    Agreed. That would be 3-4x faster than the Java varargs method for small
    arities when dynamic dispatch could be skipped (and assuming
    specialization, or that you wouldn't use it for primitives).
    Can you guys give me some code examples of exactly what you're talking
    about with this, ideally with highlighted sections labeled "what it does
    now" and "what it should do" ?
    It's just like Sets, with Set1 through Set4--that's done to improve
    performance of sets; it's a huge win there, but there are still pretty big
    wins for other collections in certain use cases. That's really the key.
    The upside is speed for tiny collections, but the downside is that you
    might have to do dynamic dispatch on more code paths than before. I don't
    have time to test that exhaustively, but here's a highlight of the
    differences:


    def ptime[A](a: => A) = {
    val t0 = System.nanoTime
    val ans = a
    println("Elapsed: %.3f".format((System.nanoTime-t0)*1e-9))
    ans
    }
    def repeat[A](n: Int)(a: => A): A = if (n<=1) a else { a; repeat(n-1)(a) }

    trait ClassImpl { def apply(i: Int): Int }
    class FourInts(a: Int, b: Int, c: Int, d: Int) extends ClassImpl {
    def apply(i: Int) = if (i<2) if (i<1) a else b else if (i<3) c else d
    }
    class SixInts(a: Int, b: Int, c: Int, d: Int, e: Int, f: Int) extends
    ClassImpl {
    def apply(i: Int) = if (i<3) { if (i<1) a else if (i<2) b else c } else
    if (i<5) { if (i<4) d else e } else f
    }

    def classimpl = {
    var i,s = 0
    while (i<1000000) {
    val cia = if ((i&1)==1) new SixInts(i-2,i-1,i,i+1,i+2,i+3) else new
    FourInts(i,i+1,i+2,i+3)
    s += cia(3)
    i += 1
    }
    s
    }

    def vararg(xs: Int*) = xs
    def varargsimpl = {
    var i,s = 0
    while (i<1000000) {
    val va = if ((i&1)==1) vararg(i-2,i-1,i,i+1,i+2,i+3) else
    vararg(i,i+1,i+2,i+3)
    s += va(3)
    i += 1
    }
    s
    }

    def arrayimpl = {
    var i,s = 0
    while (i<1000000) {
    val ar = (
    if ((i&1)==1) {
    val a = new Array[Int](6); a(0) = i-2; a(1) = i-1; a(2) = i; a(3)
    = i+1; a(4) = i+2; a(5) = i+3; a
    } else {
    val a = Array(i,i+1,i+2,i+3); a(0) = i; a(1) = i+1; a(2) = i+2;
    a(3) = i+3; a
    }
    )
    s += ar(3)
    i += 1
    }
    s
    }

    def vectorimpl = {
    var i,s = 0
    while (i<1000000) {
    val vb = Vector.newBuilder[Int]
    val v = (
    if ((i&1)==1) { vb+=i-2; vb+=i-1; vb+=i; vb+=i+1; vb+=i+2; vb+=i+3 }
    else { vb+=i; vb+=i+1; vb+=i+2; vb+=i+3 }
    ).result
    s += v(3)
    i += 1
    }
    s
    }

    If you run ptime(repeat(100){ classimpl }) it's about 2x faster than
    ptime(repeat(100){ varargsimpl }) on 2.10. It's 4x faster than arrayimpl,
    which really just points that we need to improve our array handling (it's
    8x faster than the naive Array-by-using-companion-apply, not shown here).
    It's 12x faster than vectorimpl. (6x faster than listimpl, not shown here,
    but which you can imagine; 45x faster than collection.immutable.Seq(...),
    55x faster than collection.immutable.IndexedSeq(...).)

    Anyway, the current varargs implementation is pretty good--so good that it
    shows that the array implementation is embarrassingly bad right now. It
    could be better yet with custom classes.

    --Rex

    P.S. Just to show why Daniel's "I don't use varargs" can be a good point,
    try this one out:

    def func4(a: Int, b: Int, c: Int, d: Int) = c
    def func6(a: Int, b: Int, c: Int, d: Int, e: Int, f: Int) = c
    def funcimpl = {
    var i,s = 0
    while (i<1000000) {
    s += (if ((i&1)==1) func6(i-2,i-1,i,i+1,i+2,i+3) else
    func4(i,i+1,i+2,i+3))
    i += 1
    }
    s
    }

    This is ~7x faster than classimpl, and 14x faster than what you can get
    with varargs now. So, indeed, in a tight inner loop, *don't use varargs*!
    But making it several times slower again is still probably foolish, and if
    you've already got a Seq, it shouldn't really hurt to pass it along to a
    varargs function instead of one that explicitly takes a collection
    (multiple dispatch being a possible exception).
  • Paul Phillips at Oct 24, 2012 at 4:30 pm
    Thanks a lot for that. The main reason I asked is that the place I could
    see where this would matter is a method like this, which indeed was your
    example:

    def vararg(xs: Int*) = xs

    And I never encounter methods with signatures like that. So I was wondering
    why this would be a subject of great interest. The signature I encounter a
    lot is this:

    def vararg[T](xs: T*) = xs

    This could well be because I don't get out much.
  • Rex Kerr at Oct 24, 2012 at 4:47 pm

    On Wed, Oct 24, 2012 at 12:30 PM, Paul Phillips wrote:

    Thanks a lot for that. The main reason I asked is that the place I could
    see where this would matter is a method like this, which indeed was your
    example:


    def vararg(xs: Int*) = xs

    And I never encounter methods with signatures like that. So I was
    wondering why this would be a subject of great interest. The signature I
    encounter a lot is this:

    def vararg[T](xs: T*) = xs

    This could well be because I don't get out much.
    Sure, that matters less, relatively, because the overhead of the other
    stuff you're doing is greater. Then varargs is only about 1.5x slower than
    classimpl (and classimpl is only about 1.4x slower than funcimpl):

    class X(val x: Int) {
    var next: X = null
    }
    var x0 = new X(0); var x1 = new X(1); var x2 = new X(2); var x3 = new
    X(3); var x4 = new X(4); var x5 = new X(5)
    x0.next = x1; x1.next = x2; x2.next=x3; x3.next=x4; x4.next=x5;
    x5.next=x0;
    var x = x0
    def nx = {x=x.next; x}

    trait ClassImpl[A] { def apply(i: Int): A }
    class Four[A](a: A, b: A, c: A, d: A) extends ClassImpl[A] {
    def apply(i: Int) = if (i<2) if (i<1) a else b else if (i<3) c else d
    }
    class Six[A](a: A, b: A, c: A, d: A, e: A, f: A) extends ClassImpl[A] {
    def apply(i: Int) = if (i<3) { if (i<1) a else if (i<2) b else c } else
    if (i<5) { if (i<4) d else e } else f
    }

    def classimpl = {
    var i,s = 0
    while (i<1000000) {
    val cia = if ((i&1)==1) new Six(nx,nx,nx,nx,nx,nx) else new
    Four(nx,nx,nx,nx)
    s += cia(3).x
    i += 1
    }
    s
    }

    def vararg[A](as: A*) = as
    def varargsimpl = {
    var i,s = 0
    while (i<1000000) {
    val va = if ((i&1)==1) vararg(nx,nx,nx,nx,nx,nx) else
    vararg(nx,nx,nx,nx)
    s += va(3).x
    i += 1
    }
    s
    }

    --Rex
  • Paul Phillips at Oct 24, 2012 at 5:16 pm
    Okay, using this as my test:

    object Test {
    def ptime[A](a: => A) = {
    val t0 = System.nanoTime
    val ans = a
    println("Elapsed: %.3f".format((System.nanoTime-t0)*1e-9))
    ans
    }
    def repeat[A](n: Int)(a: => A): A = if (n<=1) a else { a; repeat(n-1)(a) }

    def vararg1(xs: Int*) = xs
    def vararg2[T](xs: T*) = xs

    def varargsimpl1 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg1(i-2,i-1,i,i+1,i+2,i+3) else
    vararg1(i,i+1,i+2,i+3)
    s += va(3)
    i += 1
    }
    s
    }
    def varargsimpl2 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg2(i-2,i-1,i,i+1,i+2,i+3) else
    vararg2(i,i+1,i+2,i+3)
    s += va(3)
    i += 1
    }
    s
    }

    def varargsimpl3 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg2("1", "2", "3", "4", "5", "6") else
    vararg2("1", "2", "3", "4")
    s += va(3).toInt
    i += 1
    }
    s
    }

    def main(args: Array[String]): Unit = {
    ptime(repeat(200)(varargsimpl1))
    ptime(repeat(200)(varargsimpl2))
    ptime(repeat(200)(varargsimpl3))
    }
    }

    I moved the line this far with type-and-arity-specialized sequences for
    varargs:

    % qscalac -d /tmp test/files/run/tspecialize-varargs.scala && qscala -cp
    /tmp Test
    Elapsed: 1.916
    Elapsed: 1.832
    Elapsed: 3.419

    % scalac3 -d /tmp test/files/run/tspecialize-varargs.scala && scala3 -cp
    /tmp Test
    Elapsed: 2.646
    Elapsed: 2.507
    Elapsed: 4.266

    If you think it should have moved further, or would like to improve the
    test, I'm all ears.
  • Rex Kerr at Oct 24, 2012 at 5:45 pm
    Looks good to me--about what I'd have expected, and pushes the "you must
    optimize" boundary a nice bit further away. Maybe Daniel can comment on
    the "order of magnitude" thing vs. the observed ~4/3.

    --Rex

    P.S. Crikey--how'd you get that done so fast?

    On Wed, Oct 24, 2012 at 1:16 PM, Paul Phillips wrote:

    Okay, using this as my test:

    object Test {
    def ptime[A](a: => A) = {
    val t0 = System.nanoTime
    val ans = a
    println("Elapsed: %.3f".format((System.nanoTime-t0)*1e-9))
    ans
    }
    def repeat[A](n: Int)(a: => A): A = if (n<=1) a else { a; repeat(n-1)(a)
    }

    def vararg1(xs: Int*) = xs
    def vararg2[T](xs: T*) = xs

    def varargsimpl1 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg1(i-2,i-1,i,i+1,i+2,i+3) else
    vararg1(i,i+1,i+2,i+3)
    s += va(3)
    i += 1
    }
    s
    }
    def varargsimpl2 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg2(i-2,i-1,i,i+1,i+2,i+3) else
    vararg2(i,i+1,i+2,i+3)
    s += va(3)
    i += 1
    }
    s
    }

    def varargsimpl3 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg2("1", "2", "3", "4", "5", "6") else
    vararg2("1", "2", "3", "4")
    s += va(3).toInt
    i += 1
    }
    s
    }

    def main(args: Array[String]): Unit = {
    ptime(repeat(200)(varargsimpl1))
    ptime(repeat(200)(varargsimpl2))
    ptime(repeat(200)(varargsimpl3))
    }
    }

    I moved the line this far with type-and-arity-specialized sequences for
    varargs:

    % qscalac -d /tmp test/files/run/tspecialize-varargs.scala && qscala -cp
    /tmp Test
    Elapsed: 1.916
    Elapsed: 1.832
    Elapsed: 3.419

    % scalac3 -d /tmp test/files/run/tspecialize-varargs.scala && scala3 -cp
    /tmp Test
    Elapsed: 2.646
    Elapsed: 2.507
    Elapsed: 4.266

    If you think it should have moved further, or would like to improve the
    test, I'm all ears.
  • Daniel Spiewak at Oct 24, 2012 at 5:49 pm
    The "order of magnitude" was going off of some benchmarking I did a while
    back with Anti-XML and small collections. We discovered that by
    specializing Vector for cases 0-4, we were able to obtain a massive overall
    improvement, and an order of magnitude at some call sites. I hadn't done
    benchmarking with varargs specifically, and a number of things have changed
    since I did the original testing with Anti-XML (in the 2.8.x timeline).
    ~4/3 sounds within the margin of expectation, given how I was carrying
    results over from a slightly-related benchmark.

    So…can we fix the Seq alias now? :-)

    Daniel
    On Wed, Oct 24, 2012 at 11:45 AM, Rex Kerr wrote:

    Looks good to me--about what I'd have expected, and pushes the "you must
    optimize" boundary a nice bit further away. Maybe Daniel can comment on
    the "order of magnitude" thing vs. the observed ~4/3.

    --Rex

    P.S. Crikey--how'd you get that done so fast?


    On Wed, Oct 24, 2012 at 1:16 PM, Paul Phillips wrote:

    Okay, using this as my test:

    object Test {
    def ptime[A](a: => A) = {
    val t0 = System.nanoTime
    val ans = a
    println("Elapsed: %.3f".format((System.nanoTime-t0)*1e-9))
    ans
    }
    def repeat[A](n: Int)(a: => A): A = if (n<=1) a else { a;
    repeat(n-1)(a) }

    def vararg1(xs: Int*) = xs
    def vararg2[T](xs: T*) = xs

    def varargsimpl1 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg1(i-2,i-1,i,i+1,i+2,i+3) else
    vararg1(i,i+1,i+2,i+3)
    s += va(3)
    i += 1
    }
    s
    }
    def varargsimpl2 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg2(i-2,i-1,i,i+1,i+2,i+3) else
    vararg2(i,i+1,i+2,i+3)
    s += va(3)
    i += 1
    }
    s
    }

    def varargsimpl3 = {
    var i,s = 0
    while (i < 1000000) {
    val va = if ((i&1)==1) vararg2("1", "2", "3", "4", "5", "6") else
    vararg2("1", "2", "3", "4")
    s += va(3).toInt
    i += 1
    }
    s
    }

    def main(args: Array[String]): Unit = {
    ptime(repeat(200)(varargsimpl1))
    ptime(repeat(200)(varargsimpl2))
    ptime(repeat(200)(varargsimpl3))
    }
    }

    I moved the line this far with type-and-arity-specialized sequences for
    varargs:

    % qscalac -d /tmp test/files/run/tspecialize-varargs.scala && qscala -cp
    /tmp Test
    Elapsed: 1.916
    Elapsed: 1.832
    Elapsed: 3.419

    % scalac3 -d /tmp test/files/run/tspecialize-varargs.scala && scala3 -cp
    /tmp Test
    Elapsed: 2.646
    Elapsed: 2.507
    Elapsed: 4.266

    If you think it should have moved further, or would like to improve the
    test, I'm all ears.
  • Alex Cruise at Oct 24, 2012 at 6:16 pm

    On Wed, Oct 24, 2012 at 10:48 AM, Daniel Spiewak wrote:

    So…can we fix the Seq alias now? :-)

    Does this have the potential to break existing code?

    -0xe1a
  • Daniel Spiewak at Oct 24, 2012 at 6:17 pm
    Yes, but arguably any such existing code was dubious to begin with.

    Daniel
    On Wed, Oct 24, 2012 at 12:16 PM, Alex Cruise wrote:
    On Wed, Oct 24, 2012 at 10:48 AM, Daniel Spiewak wrote:

    So…can we fix the Seq alias now? :-)

    Does this have the potential to break existing code?

    -0xe1a
  • Rex Kerr at Oct 24, 2012 at 9:15 pm

    On Wed, Oct 24, 2012 at 1:48 PM, Daniel Spiewak wrote:

    So…can we fix the Seq alias now? :-)
    Er, no? You were saying that we should fix Seq because who cares if it's
    slower because varargs is already way too slow. Paul just made varargs
    faster. That is the opposite of making it a better idea to fix the Seq
    alias, unless we have benchmarks showing that the immutable Seq solution
    isn't too slow. I didn't write any such benchmarks--I only wrote those for
    tiny collections with varargs since you claimed they were slow.

    If Paul's speedup change is also entirely immutable (I didn't quite catch
    whether that was the case), then we just need one more benchmark to check
    the large-existing-collection case and see how bad it is. Then we can make
    an informed decision. If it's not entirely immutable, we'd also need to
    know how fast or slow the immutable solution was.

    --Rex
  • Daniel Spiewak at Oct 24, 2012 at 9:27 pm
    My point is that the question of performance is an artificial conflation of
    concerns. I spend a good deal of time working on super-tight, performance
    sensitive code in Scala. If performance is even a question in my mind, I'm
    simply not going to use varargs in *any* form. Whether we fix the Seq
    alias or not is irrelevant to such code.

    In my experience, and in my day job, the Seq alias change is a question of
    pure correctness and usability of the language. I raised the point about
    small varargs sizes in an attempt to illustrate how tangential the entire
    performance question really is in this discussion. If varargs performance
    were such a serious concern before, someone would have sat down and made an
    attempt to optimize some of these common cases. If it wasn't a serious
    concern before, why should it be a concern now? Much less a concern
    affecting a language decision which has essentially no impact on serious
    performance-sensitive code?

    There are a *lot* of things in Scala and in the standard library that are
    just horrendously slow. There are obvious things like structural types,
    but also somewhat less obvious things like BitSet. Despite the performance
    problems, these features still exist and are the way they are for a good
    reason. People using the language just have to be aware of the
    implications of these features and design accordingly. My argument is that
    varargs has traditionally fallen (and should continue to fall) in the same
    category: a convenience feature that is used judiciously with eyes open.

    Fixing the Seq alias neither rectifies nor exacerbates this situation.

    Daniel
    On Wed, Oct 24, 2012 at 3:15 PM, Rex Kerr wrote:
    On Wed, Oct 24, 2012 at 1:48 PM, Daniel Spiewak wrote:

    So…can we fix the Seq alias now? :-)
    Er, no? You were saying that we should fix Seq because who cares if it's
    slower because varargs is already way too slow. Paul just made varargs
    faster. That is the opposite of making it a better idea to fix the Seq
    alias, unless we have benchmarks showing that the immutable Seq solution
    isn't too slow. I didn't write any such benchmarks--I only wrote those for
    tiny collections with varargs since you claimed they were slow.

    If Paul's speedup change is also entirely immutable (I didn't quite catch
    whether that was the case), then we just need one more benchmark to check
    the large-existing-collection case and see how bad it is. Then we can make
    an informed decision. If it's not entirely immutable, we'd also need to
    know how fast or slow the immutable solution was.

    --Rex
  • Rex Kerr at Oct 24, 2012 at 10:01 pm

    On Wed, Oct 24, 2012 at 5:27 PM, Daniel Spiewak wrote:

    My point is that the question of performance is an artificial conflation
    of concerns. I spend a good deal of time working on super-tight,
    performance sensitive code in Scala.

    Yes, I do too.

    If performance is even a question in my mind, I'm simply not going to
    use varargs in *any* form. Whether we fix the Seq alias or not is
    irrelevant to such code.
    It is irrelevant to the most-heavily-executed bits of code. But there's a
    lot _more_ code, the not-the-inner-loop-but-still-executed-kind-of-a-lot
    code, that one has to worry about. These things add up, as you have
    probably noticed from your profiling. Do I use a for loop or do I use
    while? Well, that depends. I'd rather use the for loop, but in the
    innermost loop of time-sensitive code I don't even think about it: it's
    while or recursion, only. In the next loop out, if the inner loop is
    small, I _still_ can't use for. The slower for is, the more inner
    iterations I need before I can decide to not worry about it and use the
    compact expressive form instead of the highest-performance form.

    Surely you do this also.

    If you speed up the library, then you get to use compact expressive code in
    more places. If you slow it down, you get to use it in fewer places (or
    your program just gets slower).

    In my experience, and in my day job, the Seq alias change is a question of
    pure correctness and usability of the language. I raised the point about
    small varargs sizes in an attempt to illustrate how tangential the entire
    performance question really is in this discussion. If varargs performance
    were such a serious concern before, someone would have sat down and made an
    attempt to optimize some of these common cases. If it wasn't a serious
    concern before, why should it be a concern now? Much less a concern
    affecting a language decision which has essentially no impact on serious
    performance-sensitive code?
    Because as you saw, the change _wasn't actually that big_. Varargs already
    were pretty good. If they suddenly got 10x or 100x slower, it might
    require optimization of code that didn't need it before.

    Likewise with anything else taking a collection.Seq that would now have to
    take a collection.immutable.Seq. I really don't want to have to make a
    defensive copy of every array I pass into something annotated with Seq, I
    don't think. Maybe a defensive wrapper is okay--but then collection.Seq
    already *is* a defensive wrapper (well, defensive upcast) because it
    doesn't have any mutating methods.

    There are a *lot* of things in Scala and in the standard library that are
    just horrendously slow. There are obvious things like structural types,
    but also somewhat less obvious things like BitSet.

    Yes. Many maps too.

    What I conclude is that things need work, like immutable RB trees did.
    They're much improved in 2.10.

    I don't conclude that we should make more things slow, just because some
    other things are slow.

    Despite the performance problems, these features still exist and are the
    way they are for a good reason. People using the language just have to be
    aware of the implications of these features and design accordingly.

    Agreed, and implementing "just-Seq is immutable.Seq" changes all these
    implications that were hopefully thought about before when it mattered.
    It's not something to do lightly if you care about performance. If there
    are really no substantive downsides, then I'm all for it. I prefer
    immutability by default, mutability by request. But I'm not at all
    convinced yet that there are no substantive downsides.

    Let me give you an example of something that *might* make a difference for
    immutable Seq in general (not varargs), picked at random out of a grep
    "Seq". I have a line of code that looks like

    def decode(encoded: Seq[String], format: String)

    in a little custom base64 library. I'd get the Seq[String] as s.split('
    '), which means I've got an array. Adding the cost of a single-object
    wrapper isn't too bad. What about copying the entire array? Is that
    enough to matter for my use case (which is decoding a bunch of small binary
    data packed into a string)? Don't know. Probably not, but I might have to
    check again if parsing performance goes down too much, especially in the
    case where I'm only trying to extract a subset of the items (which I can
    control by altering the format string).

    My argument is that varargs has traditionally fallen (and should continue
    to fall) in the same category: a convenience feature that is used
    judiciously with eyes open.
    I think we should fix these things so we can use them as injudiciously as
    possible.

    Fixing the Seq alias neither rectifies nor exacerbates this situation.
    I think it does/can exacerbate, for the reasons described above. Again,
    it's not a no, but it's a proceed-with-care. I don't think we've done
    quite enough due diligence to even change the vararg to immutable.Seq, let
    alone the default Seq to immutable.Seq.

    --Rex
  • Roland Kuhn at Oct 25, 2012 at 5:03 am

    25 okt 2012 kl. 00:01 skrev Rex Kerr:

    In my experience, and in my day job, the Seq alias change is a question of pure correctness and usability of the language. I raised the point about small varargs sizes in an attempt to illustrate how tangential the entire performance question really is in this discussion. If varargs performance were such a serious concern before, someone would have sat down and made an attempt to optimize some of these common cases. If it wasn't a serious concern before, why should it be a concern now? Much less a concern affecting a language decision which has essentially no impact on serious performance-sensitive code?

    Because as you saw, the change _wasn't actually that big_. Varargs already were pretty good. If they suddenly got 10x or 100x slower, it might require optimization of code that didn't need it before.

    Likewise with anything else taking a collection.Seq that would now have to take a collection.immutable.Seq. I really don't want to have to make a defensive copy of every array I pass into something annotated with Seq, I don't think. Maybe a defensive wrapper is okay--but then collection.Seq already *is* a defensive wrapper (well, defensive upcast) because it doesn't have any mutating methods.
    You are arguing besides the main point: all data structures which are not specially imported from collection.mutable are names for the immutable version, except for Seq.

    def makeDecider(trapExit: Seq[Class[_ <: Throwable]]): Decider =
    { case x ⇒ if (trapExit exists (_ isInstance x)) Restart else Escalate }

    This seemingly correct code (incidentally written by YT) is in fact a bad bug, since you could pass in an Array[] and happily observe all the breakage which ensues after changing the array’s content afterwards (thinking that the immutable Decider must surely be okay).

    collection.Seq not having mutators is not at all a valid defense!

    Now ask yourself how many Scala developers would actually have noticed this bug during a review (quite a few very experienced ones actually missed it). Set, Seq, Map, …

    The very same argument holds for all the other odd ones (TraverableOnce, Traversable, Iterable, IndexedSeq). Correctness trumps performance, anytime.

    Regards,

    Roland Kuhn
    Typesafe – The software stack for applications that scale.
    twitter: @rolandkuhn
  • Erik Osheim at Oct 25, 2012 at 5:12 am

    On Thu, Oct 25, 2012 at 07:03:09AM +0200, Roland Kuhn wrote:
    You are arguing besides the main point: all data structures which are
    not specially imported from collection.mutable are names for the
    immutable version, except for Seq.
    So if Seq were changed to forbid Array, then what type would one use to
    generalize across List, Vector, and Array? Or is it the idea that this
    generalization exists that you object to?

    -- Erik
  • Naftoli Gugenheim at Oct 25, 2012 at 5:21 am

    On Thu, Oct 25, 2012 at 1:12 AM, Erik Osheim wrote:
    On Thu, Oct 25, 2012 at 07:03:09AM +0200, Roland Kuhn wrote:
    You are arguing besides the main point: all data structures which are
    not specially imported from collection.mutable are names for the
    immutable version, except for Seq.
    So if Seq were changed to forbid Array, then what type would one use to
    generalize across List, Vector, and Array?

    collection.Seq

    Or is it the idea that this
    generalization exists that you object to?

    -- Erik
  • Roland Kuhn at Oct 25, 2012 at 5:23 am

    On 25 okt 2012, at 07:21, Naftoli Gugenheim wrote:


    On Thu, Oct 25, 2012 at 1:12 AM, Erik Osheim wrote:
    On Thu, Oct 25, 2012 at 07:03:09AM +0200, Roland Kuhn wrote:
    You are arguing besides the main point: all data structures which are
    not specially imported from collection.mutable are names for the
    immutable version, except for Seq.
    So if Seq were changed to forbid Array, then what type would one use to
    generalize across List, Vector, and Array?
    collection.Seq
    Precisely! Allowing mutable collections should be a conscious choice.

    Regards,

    Roland
    Or is it the idea that this
    generalization exists that you object to?

    -- Erik
  • Heiko Seeberger at Oct 25, 2012 at 5:34 am

    On Oct 25, 2012, at 7:03 AM, Roland Kuhn wrote:

    Correctness trumps performance, anytime.
    Correctness is the exact and only reason why I started this thread.
    Amen

    Heiko
  • Kevin Wright at Oct 25, 2012 at 5:57 am
    +1
    On 25 Oct 2012 06:03, "Roland Kuhn" wrote:


    25 okt 2012 kl. 00:01 skrev Rex Kerr:
    In my experience, and in my day job, the Seq alias change is a question
    of pure correctness and usability of the language. I raised the point
    about small varargs sizes in an attempt to illustrate how tangential the
    entire performance question really is in this discussion. If varargs
    performance were such a serious concern before, someone would have sat down
    and made an attempt to optimize some of these common cases. If it wasn't a
    serious concern before, why should it be a concern now? Much less a
    concern affecting a language decision which has essentially no impact on
    serious performance-sensitive code?

    Because as you saw, the change _wasn't actually that big_. Varargs
    already were pretty good. If they suddenly got 10x or 100x slower, it
    might require optimization of code that didn't need it before.
    Likewise with anything else taking a collection.Seq that would now have
    to take a collection.immutable.Seq. I really don't want to have to make a
    defensive copy of every array I pass into something annotated with Seq, I
    don't think. Maybe a defensive wrapper is okay--but then collection.Seq
    already *is* a defensive wrapper (well, defensive upcast) because it
    doesn't have any mutating methods.
    You are arguing besides the main point: all data structures which are not
    specially imported from collection.mutable are names for the immutable
    version, except for Seq.
    def makeDecider(trapExit: Seq[Class[_ <: Throwable]]): Decider =
    { case x ⇒ if (trapExit exists (_ isInstance x)) Restart else
    Escalate }
    This seemingly correct code (incidentally written by YT) is in fact a bad
    bug, since you could pass in an Array[] and happily observe all the
    breakage which ensues after changing the array’s content afterwards
    (thinking that the immutable Decider must surely be okay).
    collection.Seq not having mutators is not at all a valid defense!

    Now ask yourself how many Scala developers would actually have noticed
    this bug during a review (quite a few very experienced ones actually missed
    it). Set, Seq, Map, …
    The very same argument holds for all the other odd ones (TraverableOnce,
    Traversable, Iterable, IndexedSeq). Correctness trumps performance, anytime.
    Regards,

    Roland Kuhn
    Typesafe – The software stack for applications that scale.
    twitter: @rolandkuhn
  • Seth Tisue at Oct 25, 2012 at 11:37 am
    I'm with Heiko on this. It's hard to say which is worse, the
    unsafeness or the inconsistency. Combine them, and it's like being
    stabbed by your two best friends from two different directions.

    This has been bugging me for years. It nags at me practically every
    time I type S-e-q, and that's a lot.

    Seriously, I'd love it if this were changed.

    --
    Seth Tisue | Northwestern University | http://tisue.net
    developer, NetLogo: http://ccl.northwestern.edu/netlogo/
  • √iktor Ҡlang at Oct 25, 2012 at 11:47 am

    On Thu, Oct 25, 2012 at 1:37 PM, Seth Tisue wrote:

    I'm with Heiko on this. It's hard to say which is worse, the
    unsafeness or the inconsistency. Combine them, and it's like being
    stabbed by your two best friends from two different directions.

    This has been bugging me for years. It nags at me practically every
    time I type S-e-q, and that's a lot.

    Seriously, I'd love it if this were changed.
    Also related:

    collection.Seq.to[immutable.Seq] doesn't short-circuit if the
    collection.Seq IS-A immutable.Seq already

    --
    Seth Tisue | Northwestern University | http://tisue.net
    developer, NetLogo: http://ccl.northwestern.edu/netlogo/


    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for applications
    that scale

    Twitter: @viktorklang
  • Josh Suereth at Oct 25, 2012 at 12:43 pm
    .to never short circuits.
    On Oct 25, 2012 7:47 AM, "√iktor Ҡlang" wrote:


    On Thu, Oct 25, 2012 at 1:37 PM, Seth Tisue wrote:

    I'm with Heiko on this. It's hard to say which is worse, the
    unsafeness or the inconsistency. Combine them, and it's like being
    stabbed by your two best friends from two different directions.

    This has been bugging me for years. It nags at me practically every
    time I type S-e-q, and that's a lot.

    Seriously, I'd love it if this were changed.
    Also related:

    collection.Seq.to[immutable.Seq] doesn't short-circuit if the
    collection.Seq IS-A immutable.Seq already

    --
    Seth Tisue | Northwestern University | http://tisue.net
    developer, NetLogo: http://ccl.northwestern.edu/netlogo/


    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for applications
    that scale

    Twitter: @viktorklang
  • Roland Kuhn at Oct 25, 2012 at 12:52 pm
    Yes, but why is this desirable?

    25 okt 2012 kl. 14:43 skrev Josh Suereth:
    .to never short circuits.

    On Oct 25, 2012 7:47 AM, "√iktor Ҡlang" wrote:


    On Thu, Oct 25, 2012 at 1:37 PM, Seth Tisue wrote:
    I'm with Heiko on this. It's hard to say which is worse, the
    unsafeness or the inconsistency. Combine them, and it's like being
    stabbed by your two best friends from two different directions.

    This has been bugging me for years. It nags at me practically every
    time I type S-e-q, and that's a lot.

    Seriously, I'd love it if this were changed.

    Also related:

    collection.Seq.to[immutable.Seq] doesn't short-circuit if the collection.Seq IS-A immutable.Seq already


    --
    Seth Tisue | Northwestern University | http://tisue.net
    developer, NetLogo: http://ccl.northwestern.edu/netlogo/



    --
    Viktor Klang

    Akka Tech Lead
    Typesafe - The software stack for applications that scale

    Twitter: @viktorklang
    Roland Kuhn
    Typesafe – The software stack for applications that scale.
    twitter: @rolandkuhn
  • Josh Suereth at Oct 25, 2012 at 12:56 pm
    Never said it was desirable. It's more a limitation currently. If you
    have a mechanism/fix for that, please submit! I have ideas for how to make
    it so, but none of my initial attempts panned out. Remember the method
    was initially called 'buildTo' to denote this. The other to methods can
    short circuit because they use the class hierarchy. The to method will
    have to use some fancy type tricks on top of the existing fancy type tricks.
    On Thu, Oct 25, 2012 at 8:52 AM, Roland Kuhn wrote:

    Yes, but why is this desirable?

    25 okt 2012 kl. 14:43 skrev Josh Suereth:

    .to never short circuits.
    On Oct 25, 2012 7:47 AM, "√iktor Ҡlang" wrote:


    On Thu, Oct 25, 2012 at 1:37 PM, Seth Tisue wrote:

    I'm with Heiko on this. It's hard to say which is worse, the
    unsafeness or the inconsistency. Combine them, and it's like being
    stabbed by your two best friends from two different directions.

    This has been bugging me for years. It nags at me practically every
    time I type S-e-q, and that's a lot.

    Seriously, I'd love it if this were changed.
    Also related:

    collection.Seq.to[immutable.Seq] doesn't short-circuit if the
    collection.Seq IS-A immutable.Seq already

    --
    Seth Tisue | Northwestern University | http://tisue.net
    developer, NetLogo: http://ccl.northwestern.edu/netlogo/


    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for
    applications that scale

    Twitter: @viktorklang
    Roland Kuhn
    Typesafe <http://typesafe.com/> – The software stack for applications
    that scale.
    twitter: @rolandkuhn <http://twitter.com/#!/rolandkuhn>
  • √iktor Ҡlang at Oct 25, 2012 at 1:01 pm

    On Thu, Oct 25, 2012 at 2:56 PM, Josh Suereth wrote:

    Never said it was desirable. It's more a limitation currently. If you
    have a mechanism/fix for that, please submit! I have ideas for how to make
    it so, but none of my initial attempts panned out. Remember the method
    was initially called 'buildTo' to denote this. The other to methods can
    short circuit because they use the class hierarchy. The to method will
    have to use some fancy type tricks on top of the existing fancy type tricks.

    Haven't thought much about it, but instinctively I fee like it should be a
    method on CanBuildFrom, or we reify To and check if this isInstanceOf To

    On Thu, Oct 25, 2012 at 8:52 AM, Roland Kuhn wrote:

    Yes, but why is this desirable?

    25 okt 2012 kl. 14:43 skrev Josh Suereth:

    .to never short circuits.
    On Oct 25, 2012 7:47 AM, "√iktor Ҡlang" wrote:


    On Thu, Oct 25, 2012 at 1:37 PM, Seth Tisue wrote:

    I'm with Heiko on this. It's hard to say which is worse, the
    unsafeness or the inconsistency. Combine them, and it's like being
    stabbed by your two best friends from two different directions.

    This has been bugging me for years. It nags at me practically every
    time I type S-e-q, and that's a lot.

    Seriously, I'd love it if this were changed.
    Also related:

    collection.Seq.to[immutable.Seq] doesn't short-circuit if the
    collection.Seq IS-A immutable.Seq already

    --
    Seth Tisue | Northwestern University | http://tisue.net
    developer, NetLogo: http://ccl.northwestern.edu/netlogo/


    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for
    applications that scale

    Twitter: @viktorklang
    Roland Kuhn
    Typesafe <http://typesafe.com/> – The software stack for applications
    that scale.
    twitter: @rolandkuhn <http://twitter.com/#!/rolandkuhn>

    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for applications
    that scale

    Twitter: @viktorklang
  • Josh Suereth at Oct 25, 2012 at 1:37 pm
    On Thu, Oct 25, 2012 at 9:01 AM, √iktor Ҡlang wrote:
    On Thu, Oct 25, 2012 at 2:56 PM, Josh Suereth wrote:

    Never said it was desirable. It's more a limitation currently. If you
    have a mechanism/fix for that, please submit! I have ideas for how to make
    it so, but none of my initial attempts panned out. Remember the method
    was initially called 'buildTo' to denote this. The other to methods can
    short circuit because they use the class hierarchy. The to method will
    have to use some fancy type tricks on top of the existing fancy type tricks.

    Haven't thought much about it, but instinctively I fee like it should be a
    method on CanBuildFrom, or we reify To and check if this isInstanceOf To
    We could reify the ClassTag, but types are not in the standard library. My
    instinct said it belongs in CanBuildFrom as well, but that's part of a much
    more aggressive collection refactoring.

    - Josh
  • Simon Ochsenreither at Oct 25, 2012 at 1:50 pm
    We could reify the ClassTag, but types are not in the standard library.
    I have thought a lot about this for a long time ... any reason why we
    shouldn't move over to Type/Class-tagged collections by default? 4/8 bytes
    overhead per collection instance vs. primitive arrays and more intuitive
    pattern matching semantics?
  • Josh Suereth at Oct 25, 2012 at 1:53 pm
    For Types, it's because Types are a lot of overhead (the whole
    scala-reflect).

    For Classes, it's not a bad idea except for collections of things that
    aren't quite a class.

    val x: Seq[{ def close(): Unit }] = Seq(System.in, System.out)


    - Josh
    On Thu, Oct 25, 2012 at 9:50 AM, Simon Ochsenreither wrote:


    We could reify the ClassTag, but types are not in the standard library.
    I have thought a lot about this for a long time ... any reason why we
    shouldn't move over to Type/Class-tagged collections by default? 4/8 bytes
    overhead per collection instance vs. primitive arrays and more intuitive
    pattern matching semantics?
  • Simon Ochsenreither at Oct 25, 2012 at 2:03 pm
    Could the compiler choose a parent type that's a valid Java class?
  • √iktor Ҡlang at Oct 25, 2012 at 2:06 pm

    On Thu, Oct 25, 2012 at 4:03 PM, Simon Ochsenreither wrote:

    Could the compiler choose a parent type that's a valid Java class?
    How would it handle: def x[T](f: T) = Seq(f)


    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for applications
    that scale

    Twitter: @viktorklang
  • Matthew Pocock at Oct 25, 2012 at 2:39 pm
    On 25 October 2012 14:37, Josh Suereth wrote:
    On Thu, Oct 25, 2012 at 9:01 AM, √iktor Ҡlang wrote:



    Haven't thought much about it, but instinctively I fee like it should be
    a method on CanBuildFrom, or we reify To and check if this isInstanceOf To
    We could reify the ClassTag, but types are not in the standard library.
    My instinct said it belongs in CanBuildFrom as well, but that's part of a
    much more aggressive collection refactoring.

    - Josh
    Yeah, I think this is part of a more aggressive collection refactoring. My
    first thought about how to support coll.to[C1, C2] was that there should be
    a typeclass that captures this operation, CollectionTo, which
    coll.todelegates on to. One instance would require an <:< instance and
    do the
    cast, another instances would delegate to a CanBuildFrom instance and
    handle a generic collection to a buildable type, and then perhaps we'd
    provide hand-rolled ones to handle things like [SortedSet, SortedSet] that
    needs to pipe the ordering correctly, and so on. This would let us get the
    right behaviour for the different contexts without runtime instanceof
    checks or relying on implementors overriding the correct magical hooks or
    trusting to the vagaries of trait linearisation.

    This ship has sailed for 2.10, but it would be really nice to revisit
    collections once value classes and macros have settled a bit, with a view
    to getting a clean separation between the different collections interfaces
    and the data structures that underpin these.

    Matthew

    --
    Dr Matthew Pocock
    Integrative Bioinformatics Group, School of Computing Science, Newcastle
    University
    mailto: tur...@...com
    mailto: mat...@...uk
    gchat: tur...@...com
    msn: mat...@...uk
    irc.freenode.net: drdozer
    skype: matthew.pocock
    tel: (0191) 2566550
    mob: +447535664143
  • Stefan Zeiger at Oct 26, 2012 at 1:43 pm

    On 2012-10-25 15:37, Josh Suereth wrote:
    Haven't thought much about it, but instinctively I fee like it
    should be a method on CanBuildFrom, or we reify To and check if
    this isInstanceOf To

    We could reify the ClassTag, but types are not in the standard
    library. My instinct said it belongs in CanBuildFrom as well, but
    that's part of a much more aggressive collection refactoring.
    In that case, I'll just point to my instinct from 4 months ago when this
    was discussed before:
    https://groups.google.com/d/msg/scala-internals/q7tVEOItmP8/ADqNrTipxTkJ
  • √iktor Ҡlang at Oct 25, 2012 at 12:55 pm

    On Thu, Oct 25, 2012 at 2:43 PM, Josh Suereth wrote:

    .to never short circuits.
    The usefulness of it is extremely diminished if I need to write this every
    time:

    def smth(x: Seq[Int]) {
    val realX = x match {
    case x: immutable.Seq[_] => x.asInstanceOf[immutable.Seq[Whathaveyou]]
    case other => other.to[immutable.Seq]
    }

    }


    On Oct 25, 2012 7:47 AM, "√iktor Ҡlang" wrote:


    On Thu, Oct 25, 2012 at 1:37 PM, Seth Tisue wrote:

    I'm with Heiko on this. It's hard to say which is worse, the
    unsafeness or the inconsistency. Combine them, and it's like being
    stabbed by your two best friends from two different directions.

    This has been bugging me for years. It nags at me practically every
    time I type S-e-q, and that's a lot.

    Seriously, I'd love it if this were changed.
    Also related:

    collection.Seq.to[immutable.Seq] doesn't short-circuit if the
    collection.Seq IS-A immutable.Seq already

    --
    Seth Tisue | Northwestern University | http://tisue.net
    developer, NetLogo: http://ccl.northwestern.edu/netlogo/


    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for
    applications that scale

    Twitter: @viktorklang

    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for applications
    that scale

    Twitter: @viktorklang
  • Rex Kerr at Oct 25, 2012 at 2:11 pm

    On Thu, Oct 25, 2012 at 1:03 AM, Roland Kuhn wrote:
    25 okt 2012 kl. 00:01 skrev Rex Kerr:

    In my experience, and in my day job, the Seq alias change is a question of
    pure correctness and usability of the language. I raised the point about
    small varargs sizes in an attempt to illustrate how tangential the entire
    performance question really is in this discussion. If varargs performance
    were such a serious concern before, someone would have sat down and made an
    attempt to optimize some of these common cases. If it wasn't a serious
    concern before, why should it be a concern now? Much less a concern
    affecting a language decision which has essentially no impact on serious
    performance-sensitive code?
    Because as you saw, the change _wasn't actually that big_. Varargs
    already were pretty good. If they suddenly got 10x or 100x slower, it
    might require optimization of code that didn't need it before.

    Likewise with anything else taking a collection.Seq that would now have to
    take a collection.immutable.Seq. I really don't want to have to make a
    defensive copy of every array I pass into something annotated with Seq, I
    don't think. Maybe a defensive wrapper is okay--but then collection.Seq
    already *is* a defensive wrapper (well, defensive upcast) because it
    doesn't have any mutating methods.


    You are arguing besides the main point: all data structures which are not
    specially imported from collection.mutable are names for the immutable
    version, except for Seq.
    I'm arguing a different aspect of the point, I think. I agree that ideally
    the defaults would be consistent and would be immutable. The question is
    how big of an impact it is to change at this point. In particular, is it
    okay to just switch from collection.Seq to collection.immutable.Seq at some
    point, or does one need to figure out how to have a deprecation period? If
    it's an inconsequential change, you just fix it. My point is that it is
    _not_ necessarily an inconsequential change for people who have
    high-performance code, and that the fix is less trivial than many things
    that are deprecated (like removing a method) because it will presumably
    massively affect the library interface also, and thus force people to
    re-optimize if they want to maintain performance.

    def makeDecider(trapExit: Seq[Class[_ <: Throwable]]): Decider =
    { case x => if (trapExit exists (_ isInstance x)) Restart elseEscalate }

    This seemingly correct code (incidentally written by YT) is in fact a bad
    bug, since you could pass in an Array[] and happily observe all the
    breakage which ensues after changing the array's content afterwards
    (thinking that the immutable Decider must surely be okay).

    collection.Seq not having mutators is not at all a valid defense!
    Agreed. I wasn't really thinking it was enough defense, just that it's as
    much defense as you can get without having to think about performance a lot.

    If you ask whether I would have noticed the code that you wrote above as a
    bug, the answer is "probably not". But if you ask instead whether I would
    have caught a bug where you took an array, passed it in somewhere, and then
    modified it: probably yes! If you spend much time working with mutable
    data structures, it's a good idea to get used to being careful about
    mutating the data out from under you. It is kind of a gotcha if you don't
    use mutable data structures very often, I'll agree, and that's why I agree
    it would have been better to have Seq be immutable.Seq by default.

    The very same argument holds for all the other odd ones (TraverableOnce,
    Traversable, Iterable, IndexedSeq). Correctness trumps performance, anytime.
    As long as there is a high-performance route, I agree. Otherwise, no
    that's not a given; people are willing to spend some extra thought to gain
    extra performance if they can, sometimes. That the
    computation-speed-minded people are to be ignored in favor of the
    coding-speed-minded people is not a given. One of Scala's claims to fame
    is that it can be used to write high-performance code. I agree that a bias
    towards correctness is good, but one has to ask what the tradeoff really is.

    In this particular case, I think high-performance options can still exist
    since collection.Seq still exists. But do you think we can change the
    interface of the ~215 instances of Seq in defs in the Scala library without
    thinking about the impact on performance, or deciding which of those really
    need to be collection.Seq and which need to be collection.immutable.Seq?

    Also, I'm not sure I agree about TraversableOnce etc.--there is something
    to be said for easy abstraction as well as correctness.

    --Rex
  • √iktor Ҡlang at Oct 25, 2012 at 3:07 pm
    On Thu, Oct 25, 2012 at 4:11 PM, Rex Kerr wrote:
    On Thu, Oct 25, 2012 at 1:03 AM, Roland Kuhn wrote:


    25 okt 2012 kl. 00:01 skrev Rex Kerr:

    In my experience, and in my day job, the Seq alias change is a question
    of pure correctness and usability of the language. I raised the point
    about small varargs sizes in an attempt to illustrate how tangential the
    entire performance question really is in this discussion. If varargs
    performance were such a serious concern before, someone would have sat down
    and made an attempt to optimize some of these common cases. If it wasn't a
    serious concern before, why should it be a concern now? Much less a
    concern affecting a language decision which has essentially no impact on
    serious performance-sensitive code?
    Because as you saw, the change _wasn't actually that big_. Varargs
    already were pretty good. If they suddenly got 10x or 100x slower, it
    might require optimization of code that didn't need it before.

    Likewise with anything else taking a collection.Seq that would now have
    to take a collection.immutable.Seq. I really don't want to have to make a
    defensive copy of every array I pass into something annotated with Seq, I
    don't think. Maybe a defensive wrapper is okay--but then collection.Seq
    already *is* a defensive wrapper (well, defensive upcast) because it
    doesn't have any mutating methods.


    You are arguing besides the main point: all data structures which are not
    specially imported from collection.mutable are names for the immutable
    version, except for Seq.
    I'm arguing a different aspect of the point, I think. I agree that
    ideally the defaults would be consistent and would be immutable.
    Yeah

    In predef:

    type Map[A, +B] = immutable.Map[A, B]
    type Set[A] = immutable.Set[A]

    exercise for the reader:

    type Seq[A] = ???

    The question is how big of an impact it is to change at this point. In
    particular, is it okay to just switch from collection.Seq to
    collection.immutable.Seq at some point, or does one need to figure out how
    to have a deprecation period? If it's an inconsequential change, you just
    fix it. My point is that it is _not_ necessarily an inconsequential change
    for people who have high-performance code, and that the fix is less trivial
    than many things that are deprecated (like removing a method) because it
    will presumably massively affect the library interface also, and thus force
    people to re-optimize if they want to maintain performance.

    def makeDecider(trapExit: Seq[Class[_ <: Throwable]]): Decider =
    { case x => if (trapExit exists (_ isInstance x)) Restart elseEscalate }

    This seemingly correct code (incidentally written by YT) is in fact a bad
    bug, since you could pass in an Array[] and happily observe all the
    breakage which ensues after changing the array's content afterwards
    (thinking that the immutable Decider must surely be okay).

    collection.Seq not having mutators is not at all a valid defense!
    Agreed. I wasn't really thinking it was enough defense, just that it's as
    much defense as you can get without having to think about performance a lot.

    If you ask whether I would have noticed the code that you wrote above as a
    bug, the answer is "probably not". But if you ask instead whether I would
    have caught a bug where you took an array, passed it in somewhere, and then
    modified it: probably yes! If you spend much time working with mutable
    data structures, it's a good idea to get used to being careful about
    mutating the data out from under you. It is kind of a gotcha if you don't
    use mutable data structures very often, I'll agree, and that's why I agree
    it would have been better to have Seq be immutable.Seq by default.

    The very same argument holds for all the other odd ones (TraverableOnce,
    Traversable, Iterable, IndexedSeq). Correctness trumps performance, anytime.
    As long as there is a high-performance route, I agree. Otherwise, no
    that's not a given; people are willing to spend some extra thought to gain
    extra performance if they can, sometimes. That the
    computation-speed-minded people are to be ignored in favor of the
    coding-speed-minded people is not a given. One of Scala's claims to fame
    is that it can be used to write high-performance code. I agree that a bias
    towards correctness is good, but one has to ask what the tradeoff really is.

    In this particular case, I think high-performance options can still exist
    since collection.Seq still exists. But do you think we can change the
    interface of the ~215 instances of Seq in defs in the Scala library without
    thinking about the impact on performance, or deciding which of those really
    need to be collection.Seq and which need to be collection.immutable.Seq?

    Also, I'm not sure I agree about TraversableOnce etc.--there is something
    to be said for easy abstraction as well as correctness.

    --Rex

    --
    Viktor Klang

    Akka Tech Lead
    Typesafe <http://www.typesafe.com/> - The software stack for applications
    that scale

    Twitter: @viktorklang
  • Heiko Seeberger at Oct 25, 2012 at 3:34 pm
    OK, it is an official issue now: https://issues.scala-lang.org/browse/SI-6570

    IMHO we should generate some deprecation or warning for Scala 2.10 (if this is possible and in line with the release policy) and apply the change in 2.11.

    Heiko
    On Oct 25, 2012, at 5:07 PM, √iktor Ҡlang wrote:



    On Thu, Oct 25, 2012 at 4:11 PM, Rex Kerr wrote:


    On Thu, Oct 25, 2012 at 1:03 AM, Roland Kuhn wrote:

    25 okt 2012 kl. 00:01 skrev Rex Kerr:
    In my experience, and in my day job, the Seq alias change is a question of pure correctness and usability of the language. I raised the point about small varargs sizes in an attempt to illustrate how tangential the entire performance question really is in this discussion. If varargs performance were such a serious concern before, someone would have sat down and made an attempt to optimize some of these common cases. If it wasn't a serious concern before, why should it be a concern now? Much less a concern affecting a language decision which has essentially no impact on serious performance-sensitive code?

    Because as you saw, the change _wasn't actually that big_. Varargs already were pretty good. If they suddenly got 10x or 100x slower, it might require optimization of code that didn't need it before.

    Likewise with anything else taking a collection.Seq that would now have to take a collection.immutable.Seq. I really don't want to have to make a defensive copy of every array I pass into something annotated with Seq, I don't think. Maybe a defensive wrapper is okay--but then collection.Seq already *is* a defensive wrapper (well, defensive upcast) because it doesn't have any mutating methods.
    You are arguing besides the main point: all data structures which are not specially imported from collection.mutable are names for the immutable version, except for Seq.

    I'm arguing a different aspect of the point, I think. I agree that ideally the defaults would be consistent and would be immutable.

    Yeah

    In predef:

    type Map[A, +B] = immutable.Map[A, B]
    type Set[A] = immutable.Set[A]

    exercise for the reader:

    type Seq[A] = ???

    The question is how big of an impact it is to change at this point. In particular, is it okay to just switch from collection.Seq to collection.immutable.Seq at some point, or does one need to figure out how to have a deprecation period? If it's an inconsequential change, you just fix it. My point is that it is _not_ necessarily an inconsequential change for people who have high-performance code, and that the fix is less trivial than many things that are deprecated (like removing a method) because it will presumably massively affect the library interface also, and thus force people to re-optimize if they want to maintain performance.


    def makeDecider(trapExit: Seq[Class[_ <: Throwable]]): Decider =
    { case x ⇒ if (trapExit exists (_ isInstance x)) Restart else Escalate }

    This seemingly correct code (incidentally written by YT) is in fact a bad bug, since you could pass in an Array[] and happily observe all the breakage which ensues after changing the array’s content afterwards (thinking that the immutable Decider must surely be okay).

    collection.Seq not having mutators is not at all a valid defense!

    Agreed. I wasn't really thinking it was enough defense, just that it's as much defense as you can get without having to think about performance a lot.

    If you ask whether I would have noticed the code that you wrote above as a bug, the answer is "probably not". But if you ask instead whether I would have caught a bug where you took an array, passed it in somewhere, and then modified it: probably yes! If you spend much time working with mutable data structures, it's a good idea to get used to being careful about mutating the data out from under you. It is kind of a gotcha if you don't use mutable data structures very often, I'll agree, and that's why I agree it would have been better to have Seq be immutable.Seq by default.

    The very same argument holds for all the other odd ones (TraverableOnce, Traversable, Iterable, IndexedSeq). Correctness trumps performance, anytime.

    As long as there is a high-performance route, I agree. Otherwise, no that's not a given; people are willing to spend some extra thought to gain extra performance if they can, sometimes. That the computation-speed-minded people are to be ignored in favor of the coding-speed-minded people is not a given. One of Scala's claims to fame is that it can be used to write high-performance code. I agree that a bias towards correctness is good, but one has to ask what the tradeoff really is.

    In this particular case, I think high-performance options can still exist since collection.Seq still exists. But do you think we can change the interface of the ~215 instances of Seq in defs in the Scala library without thinking about the impact on performance, or deciding which of those really need to be collection.Seq and which need to be collection.immutable.Seq?

    Also, I'm not sure I agree about TraversableOnce etc.--there is something to be said for easy abstraction as well as correctness.

    --Rex




    --
    Viktor Klang

    Akka Tech Lead
    Typesafe - The software stack for applications that scale

    Twitter: @viktorklang
  • Paolo G. Giarrusso at Oct 25, 2012 at 4:57 pm
    In this code:

    def foo(varargs: Any*) = println(varargs.getClass())
    foo(1, 2, 3) //prints "class scala.collection.mutable.WrappedArray$ofRef".

    the compiler could wrap Array(1, 2, 3) in a different, immutable wrapper
    without any extra copy.
    And I'd really like the compiler to do that.
    Discussion below.

    Il giorno mercoledì 24 ottobre 2012 23:01:06 UTC+1, Rex Kerr ha scritto:
    If you speed up the library, then you get to use compact expressive code
    in more places. If you slow it down, you get to use it in fewer places (or
    your program just gets slower).
    I see that argument a lot and usually like it. But I like when it does not
    mean worsening the semantics: the right tradeoff is to give the programmer
    the right tools (think garbage collection) while using very fancy
    techniques (say, the result of 40 years of research) for making it fast. If
    you apply this indefinitely, you get something like Self (a much better and
    extremely cool Smalltalk).

    I guess here we might not have those techniques (but see below), and maybe
    what you suggest is the right decision, but this argument is still much
    weaker.

    In my experience, and in my day job, the Seq alias change is a question of
    pure correctness and usability of the language. I raised the point about
    small varargs sizes in an attempt to illustrate how tangential the entire
    performance question really is in this discussion. If varargs performance
    were such a serious concern before, someone would have sat down and made an
    attempt to optimize some of these common cases. If it wasn't a serious
    concern before, why should it be a concern now? Much less a concern
    affecting a language decision which has essentially no impact on serious
    performance-sensitive code?
    Because as you saw, the change _wasn't actually that big_. Varargs
    already were pretty good. If they suddenly got 10x or 100x slower, it
    might require optimization of code that didn't need it before.

    Likewise with anything else taking a collection.Seq that would now have to
    take a collection.immutable.Seq. I really don't want to have to make a
    defensive copy of every array I pass into something annotated with Seq, I
    don't think. Maybe a defensive wrapper is okay--but then collection.Seq
    already *is* a defensive wrapper (well, defensive upcast) because it
    doesn't have any mutating methods.
    Do you pass many arrays as varargs explicitly? In my code, there's many
    more arrays being passed _implicitly_, which is extremely annoying, because
    the compiler uses Array by default:

    def foo(varargs: Any*) = println(varargs.getClass())
    foo(1, 2, 3) //prints "class scala.collection.mutable.WrappedArray$ofRef".

    Now, the hot path we're talking about does not include mutating varargs.
    I'm sure there can be cases when you do that, and maybe important cases,
    but that's definitely not the most common choice. So, why does the compiler
    automatically create a _mutable_ wrapper to the array? It could create an
    immutable wrapper without introducing any copy in the hot path. However,
    creating such a wrapper is an "unsafe" operation: you can call it only if
    you know that the raw array is not used afterward and cannot be extracted
    from the wrapper. But in the above code, Array(1, 2, 3) is not going to be
    aliased, so you don't need the copy.

    Today, otherwise immutable case classes become mutable because of varargs,
    and that's really bad and serves little performance purpose. And I think
    this should change even if Seq keeps pointing to collection.Seq.

    That's the only thing I take issue with. If you have a mutable sequence, I
    accept that you really want to pass it to a variadic function.

    Paolo
  • Pavel Pavlov at Oct 27, 2012 at 4:22 pm

    On Thursday, October 25, 2012 11:57:12 PM UTC+7, Paolo G. Giarrusso wrote:

    Do you pass many arrays as varargs explicitly? In my code, there's many
    more arrays being passed _implicitly_, which is extremely annoying, because
    the compiler uses Array by default:

    def foo(varargs: Any*) = println(varargs.getClass())
    foo(1, 2, 3) //prints "class scala.collection.mutable.WrappedArray$ofRef".

    Now, the hot path we're talking about does not include mutating varargs.
    I'm sure there can be cases when you do that, and maybe important cases,
    but that's definitely not the most common choice. So, why does the compiler
    automatically create a _mutable_ wrapper to the array? It could create an
    immutable wrapper without introducing any copy in the hot path. However,
    creating such a wrapper is an "unsafe" operation: you can call it only if
    you know that the raw array is not used afterward and cannot be extracted
    from the wrapper. But in the above code, Array(1, 2, 3) is not going to be
    aliased, so you don't need the copy.
    Why not pass this way to the end and not to create a fully functional
    immutable array class in the standard library? Something like this:

    ========================
    class ImmArray[+T] private(data: Array[T]) extends immutable.IndexedSeq[T] {
    def apply(i: Int) = data(i)
    def toArray = data.clone()
    ...
    }

    object ImmArray {
    private[collection] def makeNoCopy[T](arr: Array[T]) = new ImmArray(arr)
    def make[T](xs: collection.Seq[T]): ImmArray[T] = xs match {
    case immArray: ImmArray[T] => immArray
    case imm: immutable.Seq[T] => makeNoCopy(imm.toArray)
    case _ => makeNoCopy(xs.toArray.clone())
    }

    def apply[T](xs: T*) = make(xs)

    def fill[T](n: Int)(elem: => T) = makeNoCopy(Array.fill(n)(elem))
    def tabulate[T](n: Int)(f: Int => T) = makeNoCopy(Array.tabulate(n)(f))
    def iterate[T](start: T, len: Int)(f: T => T) =
    makeNoCopy(Array.iterate(start, len)(f))

    def newBuilder[T: ClassTag]: Builder[T, ImmArray[T]] =
    ArrayBuilder.make[T]() mapResult makeNoCopy
    ...
    }
    ========================

    With it the only thing compiler should do when passing varargs to method is
    to wrap collected varargs array with `ImmArray.makeNoCopy`.

    I believe this class will be very usable besides varargs processing:
    - it can have public creation methods symmetric to those in Array object:
    apply, fill, tabulate, iterate, etc.
    - it can be "specialized" to avoid boxing of primitives (like
    WrappedArray.ofXXX)
    - ImmArray can be made covariant on its element type
    - ImmArrayBuilder that works like ListBuffer (i.e. copy exported
    collection lazily) can be implemented; or better, method `toImmArray` with
    the same semantics can be added to ArrayBuffer
    - some collection implementations can use `ImmArray.makeNoCopy` internally
    to export internal arrays they do not mutate
    - (arguably) `makeNoCopy` can be made accessible to those users who need
    most performance while using "trust me, nobody would dare alter this
    array"-style interfaces (e.g. for Java interop or legacy code).

    So, with this class we'll get fastest/smallest random-access immutable
    covariant collection backed by an array.

    What do you think?


    Regards,
    Pavel
  • Paul Phillips at Oct 27, 2012 at 3:14 pm

    On Sat, Oct 27, 2012 at 4:00 AM, Pavel Pavlov wrote:

    What do you think?
    I've thought about implementing an immutable array a number of times; it
    does seem to fill a need. I only wish we had some kind of uniqueness
    capability which would allow for promoting an existing array to an
    immutable array with verification that there are no more references to it
    running around elsewhere. But it sounds plenty useful just the way you
    describe it.
  • Pavel Pavlov at Oct 27, 2012 at 3:25 pm
    2012/10/27 Paul Phillips <pau...@...org>
    I only wish we had some kind of uniqueness capability which would allow
    for promoting an existing array to an immutable array with verification
    that there are no more references to it running around elsewhere.
    Then you need something like C++ rvalue references, which will (greatly?
    somewhat?) complicate type system. Moreover, AFAIU to ensure that
    uniqueness property holds for a reference the compiler must employ some
    form of escape analysis (C++ doesn't check this and allow user manually
    declare object reference/pointer as "safe" with `std::move`).

    BTW, what people think of the great immutability hole also known as
    `ListBuffer.readOnly` method? (This method destroy guarantees that any
    external List you use is immutable).
  • Paul Phillips at Oct 27, 2012 at 6:00 pm

    On Sat, Oct 27, 2012 at 8:25 AM, Pavel Pavlov wrote:

    BTW, what people think of the great immutability hole also known as
    `ListBuffer.readOnly` method? (This method destroy guarantees that any
    external List you use is immutable).
    Yikes. I hadn't really noticed it. Seems like a serious bug to me.

    scala> val buf = scala.collection.mutable.ListBuffer[Int](1, 2)
    buf: scala.collection.mutable.ListBuffer[Int] = ListBuffer(1, 2)

    scala> def f(): List[Int] = buf.readOnly
    f: ()List[Int]

    scala> val innocent = f()
    innocent: List[Int] = List(1, 2)

    scala> buf ++= (1 to 100)
    res4: buf.type = ListBuffer(1, 2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
    13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
    32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
    51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
    70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
    89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)

    scala> innocent
    res5: List[Int] = List(1, 2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
    15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
    34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
    53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71,
    72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90,
    91, 92, 93, 94, 95, 96, 97, 98, 99, 100)
  • Daniel Spiewak at Oct 28, 2012 at 1:33 am
    This doesn't surprise me, give how ListBuffer is implemented. Really the only way to avoid this situation while preserving a O(1) immutable copy in the common case would be to change the state of the ListBuffer such that subsequent writes lazily rebuild the internal structure.

    Daniel
    On Oct 27, 2012, at 10:13 AM, "Paul Phillips" wrote:


    On Sat, Oct 27, 2012 at 8:25 AM, Pavel Pavlov wrote:
    BTW, what people think of the great immutability hole also known as `ListBuffer.readOnly` method? (This method destroy guarantees that any external List you use is immutable).
    Yikes. I hadn't really noticed it. Seems like a serious bug to me.

    scala> val buf = scala.collection.mutable.ListBuffer[Int](1, 2)
    buf: scala.collection.mutable.ListBuffer[Int] = ListBuffer(1, 2)

    scala> def f(): List[Int] = buf.readOnly
    f: ()List[Int]

    scala> val innocent = f()
    innocent: List[Int] = List(1, 2)

    scala> buf ++= (1 to 100)
    res4: buf.type = ListBuffer(1, 2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)

    scala> innocent
    res5: List[Int] = List(1, 2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100)
  • Paul Phillips at Oct 28, 2012 at 2:02 am

    On Sat, Oct 27, 2012 at 6:33 PM, Daniel Spiewak wrote:

    This doesn't surprise me, give how ListBuffer is implemented. Really the
    only way to avoid this situation while preserving a O(1) immutable copy in
    the common case would be to change the state of the ListBuffer such that
    subsequent writes lazily rebuild the internal structure.
    It already does that. The issue is that readOnly offers a means to get a
    List, but does not mark it as exported so it can be mutated after that
    point. If you call toList, you get the same list as you do with readOnly,
    but the buffer is marked dirty.

    It's completely unnecessary to expose readOnly as far as I can see.
  • Martin odersky at Oct 28, 2012 at 10:44 am
    On Sun, Oct 28, 2012 at 3:02 AM, Paul Phillips wrote:
    On Sat, Oct 27, 2012 at 6:33 PM, Daniel Spiewak wrote:

    This doesn't surprise me, give how ListBuffer is implemented. Really the
    only way to avoid this situation while preserving a O(1) immutable copy in
    the common case would be to change the state of the ListBuffer such that
    subsequent writes lazily rebuild the internal structure.
    It already does that. The issue is that readOnly offers a means to get a
    List, but does not mark it as exported so it can be mutated after that
    point. If you call toList, you get the same list as you do with readOnly,
    but the buffer is marked dirty.

    It's completely unnecessary to expose readOnly as far as I can see.
    Is it used somewhere in the standard library? I agree that if possible we
    should deprecate it.

    Cheers

    - Martin

Related Discussions