FAQ
Reviewers: golang-dev1,

Message:
Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com),

I'd like you to review this change to
https://code.google.com/p/go/


Description:
    sort: provide different stable sort algorithms

NOT FOR SUBMISSION

This CL is for testing and discussing different variants
of a stable sort algorithm.

This CL provides three different implementations for
stable inplace sorting. Actually the sorting is done
by a standard merge sort, only the inplace merge
algorithms differ. The following three variants are
compared:
   - gcc-4.6.3's merge_without_buffer which is used by
     libstdc++'s inplace_stable_sort function.
   - SymMerge from Pok-Son Kim and Arne Kutzner, "Stable
     minimum storage merging ba symetric comparisons"
     (Algorithms - ESA 2004, Volume 3221 of Lecture Notes
     in Computer Sience, Springer 2004, p. 714-723)
   - SplitMerge, also from Kim and Kutzern, "A simple
     algorithm for stable minimum storage merging."

All three merging algorithm are stable and the recursion
depth is bound by log(n), so that merge-sorting with them
requieres only logarithmic additional stack space.
The main helper function to all merge algorithms is a
block rotation, the implementation is a direct translation
of libstdc++'s __rotate function.

The cutoffs when switching from merge- to insertion sort
where determined on a dualcore Intel i7 2.67GHz CPU.

Benchmarking shows that sorting with SymMerge and
SplitMerge (SymMergeSort and SplitMergeSort) are comparable
in speed while using MergeWithoutBuffer (InplaceStableSort)
is considerable slower.

Measuring the number of operations (swaps and comparisons)
performed seems to indicate that all three algorithms
perform O(n log n) swaps and O(n log n) compares.
InplaceStableSort does more swaps and comparisons than
Sym/SplitMergeSort - and is thus slower.

SymMergeSort uses only 1.5n compares on an already sorted
sequence whereas SplitMergeSort needs 2n comparisons and
InplaceStableSortNeeds 5n. SplitMerge sort and SymMergeSort
also differ a bit on saw-tooth like data: SymMergeSort uses
less operations

For the standard benchmarks (data of the form i % 0x2cc)
SymMergeSort compares well with the standard Sort implementation
wheras for random data the standard quicksort is faster by
a factor of 3 to 9.

This data (at least on 64 bit Intel i7) suggests to use
SymMerge. Thus something like

    // StableSort sorts data in a stable way.
    // It makes one call to data.Len to determine n, and O(n*log(n))
    // calls to data.Less and data.Swap. The sort is guaranteed to
    // be stable. StableSort has the same asymptotic runtime bounds
    // as Sort but larger factors hidden in the big O and will take
    // considerable more time on random data than Sort.
    func StableSort(data Interface) {
        // merge-sort using symMerge to merge blocks
    }



Benchmarks ("StandardSort" is the unstable Sort):

BenchmarkSearchWrappers 10000000 281 ns/op
BenchmarkSortString1K 5000 342614 ns/op
BenchmarkSortInt1K 10000 163881 ns/op
BenchmarkSortInt64K 100 14672172 ns/op

BenchmarkSymMergeSortInt1K 50000 50043 ns/op
BenchmarkSplitMergeSortInt1K 50000 50677 ns/op
BenchmarkInplaceStableSortInt1K 10000 153956 ns/op
BenchmarkStandardSortInt1K 10000 156901 ns/op

BenchmarkSymMergeSortInt64K 500 4085378 ns/op
BenchmarkSplitMergeSortInt64K 500 4112881 ns/op
BenchmarkInplaceStableSortInt64K 100 12846001 ns/op
BenchmarkStandardSortInt64K 100 14329672 ns/op

BenchmarkSymMergeSortInt4M 5 318262644 ns/op
BenchmarkSplitMergeSortInt4M 5 318633667 ns/op
BenchmarkInplaceStableSortInt4M 1 1006823241 ns/op
BenchmarkStandardSortInt4M 1 1216140021 ns/op

BenchmarkSymMergeSortString1K 10000 178198 ns/op
BenchmarkSplitMergeSortString1K 10000 196506 ns/op
BenchmarkInplaceStableSortString1K 5000 349352 ns/op
BenchmarkStandardSortString1K 5000 332424 ns/op

BenchmarkSymMergeSortedInt64K 1000 1646250 ns/op
BenchmarkSplitMergeSortedInt64K 1000 1733652 ns/op
BenchmarkInplaceStableSortedInt64K 500 4065249 ns/op
BenchmarkStandardSortedInt64K 200 7937829 ns/op

BenchmarkSymMergeReversedInt64K 500 7451392 ns/op
BenchmarkSplitMergeReversedInt64K 500 7429676 ns/op
BenchmarkInplaceStableReversedInt64K 100 19820718 ns/op
BenchmarkStandardSortReversedInt64K 200 8690838 ns/op

BenchmarkSymMergeSortUniqe 1 1594279671 ns/op
BenchmarkSplitMergeSortUniqe 1 1596697063 ns/op
BenchmarkInplaceStableSortUniqe 1 1662573134 ns/op
BenchmarkStandardSortUniqe 5 347079470 ns/op

BenchmarkSymMergeSortSubset 1 1237720688 ns/op
BenchmarkSplitMergeSortSubset 1 1207237700 ns/op
BenchmarkInplaceStableSortSubset 1 1349776742 ns/op
BenchmarkStandardSortSubset 10 172742441 ns/op

Please review this at https://codereview.appspot.com/9612044/

Affected files:
    M src/pkg/sort/sort.go
    M src/pkg/sort/sort_test.go


--

---
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Search Discussions

  • Dr Volker Dobler at May 21, 2013 at 9:15 am
    Please do not focus on the details like
    comments or separators, variable names (taken from
    the source and publications) or that like but on
    the algorithms themselves. Even the function signatures
    are not final (e.g. measuring recursion depth won't
    make it into the final CL). Lots of functions are
    exported to test them, but won't stay exported.

    This CL makes it easy to run the algorithms on
    architectures I do not have access to and play
    with them. Nothing else. It'll be abandoned and
    replaced with a suitable implementation of the
    chosen algorithm.


    https://codereview.appspot.com/9612044/

    --

    ---
    You received this message because you are subscribed to the Google Groups "golang-dev" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/groups/opt_out.
  • Brad Fitzpatrick at May 21, 2013 at 3:31 pm
    Excellent, thanks for working on this!

    I'm worried about GPL infection. Could you compare stl_algo.h from an SGI
    or HP STL, before it went into GNU's libstdc++ to make sure it's the same,
    back before it got a GPL sticker on it too?


    On Tue, May 21, 2013 at 2:09 AM, wrote:

    Reviewers: golang-dev1,

    Message:
    Hello golang-dev@googlegroups.com (cc: golang-dev@googlegroups.com),

    I'd like you to review this change to
    https://code.google.com/p/go/


    Description:
    sort: provide different stable sort algorithms

    NOT FOR SUBMISSION

    This CL is for testing and discussing different variants
    of a stable sort algorithm.

    This CL provides three different implementations for
    stable inplace sorting. Actually the sorting is done
    by a standard merge sort, only the inplace merge
    algorithms differ. The following three variants are
    compared:
    - gcc-4.6.3's merge_without_buffer which is used by
    libstdc++'s inplace_stable_sort function.
    - SymMerge from Pok-Son Kim and Arne Kutzner, "Stable
    minimum storage merging ba symetric comparisons"
    (Algorithms - ESA 2004, Volume 3221 of Lecture Notes
    in Computer Sience, Springer 2004, p. 714-723)
    - SplitMerge, also from Kim and Kutzern, "A simple
    algorithm for stable minimum storage merging."

    All three merging algorithm are stable and the recursion
    depth is bound by log(n), so that merge-sorting with them
    requieres only logarithmic additional stack space.
    The main helper function to all merge algorithms is a
    block rotation, the implementation is a direct translation
    of libstdc++'s __rotate function.

    The cutoffs when switching from merge- to insertion sort
    where determined on a dualcore Intel i7 2.67GHz CPU.

    Benchmarking shows that sorting with SymMerge and
    SplitMerge (SymMergeSort and SplitMergeSort) are comparable
    in speed while using MergeWithoutBuffer (InplaceStableSort)
    is considerable slower.

    Measuring the number of operations (swaps and comparisons)
    performed seems to indicate that all three algorithms
    perform O(n log n) swaps and O(n log n) compares.
    InplaceStableSort does more swaps and comparisons than
    Sym/SplitMergeSort - and is thus slower.

    SymMergeSort uses only 1.5n compares on an already sorted
    sequence whereas SplitMergeSort needs 2n comparisons and
    InplaceStableSortNeeds 5n. SplitMerge sort and SymMergeSort
    also differ a bit on saw-tooth like data: SymMergeSort uses
    less operations

    For the standard benchmarks (data of the form i % 0x2cc)
    SymMergeSort compares well with the standard Sort implementation
    wheras for random data the standard quicksort is faster by
    a factor of 3 to 9.

    This data (at least on 64 bit Intel i7) suggests to use
    SymMerge. Thus something like

    // StableSort sorts data in a stable way.
    // It makes one call to data.Len to determine n, and O(n*log(n))
    // calls to data.Less and data.Swap. The sort is guaranteed to
    // be stable. StableSort has the same asymptotic runtime bounds
    // as Sort but larger factors hidden in the big O and will take
    // considerable more time on random data than Sort.
    func StableSort(data Interface) {
    // merge-sort using symMerge to merge blocks
    }



    Benchmarks ("StandardSort" is the unstable Sort):

    BenchmarkSearchWrappers 10000000 281 ns/op
    BenchmarkSortString1K 5000 342614 ns/op
    BenchmarkSortInt1K 10000 163881 ns/op
    BenchmarkSortInt64K 100 14672172 ns/op

    BenchmarkSymMergeSortInt1K 50000 50043 ns/op
    BenchmarkSplitMergeSortInt1K 50000 50677 ns/op
    BenchmarkInplaceStableSortInt1**K 10000 153956 ns/op
    BenchmarkStandardSortInt1K 10000 156901 ns/op

    BenchmarkSymMergeSortInt64K 500 4085378 ns/op
    BenchmarkSplitMergeSortInt64K 500 4112881 ns/op
    BenchmarkInplaceStableSortInt6**4K 100 12846001 ns/op
    BenchmarkStandardSortInt64K 100 14329672 ns/op

    BenchmarkSymMergeSortInt4M 5 318262644 ns/op
    BenchmarkSplitMergeSortInt4M 5 318633667 ns/op
    BenchmarkInplaceStableSortInt4**M 1 1006823241 ns/op
    BenchmarkStandardSortInt4M 1 1216140021 ns/op

    BenchmarkSymMergeSortString1K 10000 178198 ns/op
    BenchmarkSplitMergeSortString1**K 10000 196506 ns/op
    BenchmarkInplaceStableSortStri**ng1K 5000 349352 ns/op
    BenchmarkStandardSortString1K 5000 332424 ns/op

    BenchmarkSymMergeSortedInt64K 1000 1646250 ns/op
    BenchmarkSplitMergeSortedInt64**K 1000 1733652 ns/op
    BenchmarkInplaceStableSortedIn**t64K 500 4065249 ns/op
    BenchmarkStandardSortedInt64K 200 7937829 ns/op

    BenchmarkSymMergeReversedInt64**K 500 7451392 ns/op
    BenchmarkSplitMergeReversedInt**64K 500 7429676 ns/op
    BenchmarkInplaceStableReversed**Int64K 100 19820718 ns/op
    BenchmarkStandardSortReversedI**nt64K 200 8690838 ns/op

    BenchmarkSymMergeSortUniqe 1 1594279671 ns/op
    BenchmarkSplitMergeSortUniqe 1 1596697063 ns/op
    BenchmarkInplaceStableSortUniq**e 1 1662573134 ns/op
    BenchmarkStandardSortUniqe 5 347079470 ns/op

    BenchmarkSymMergeSortSubset 1 1237720688 ns/op
    BenchmarkSplitMergeSortSubset 1 1207237700 ns/op
    BenchmarkInplaceStableSortSubs**et 1 1349776742 ns/op
    BenchmarkStandardSortSubset 10 172742441 ns/op

    Please review this at https://codereview.appspot.**com/9612044/<https://codereview.appspot.com/9612044/>

    Affected files:
    M src/pkg/sort/sort.go
    M src/pkg/sort/sort_test.go


    --

    ---You received this message because you are subscribed to the Google
    Groups "golang-dev" group.
    To unsubscribe from this group and stop receiving emails from it, send an
    email to golang-dev+unsubscribe@**googlegroups.com<golang-dev%2Bunsubscribe@googlegroups.com>
    .
    For more options, visit https://groups.google.com/**groups/opt_out<https://groups.google.com/groups/opt_out>
    .

    --

    ---
    You received this message because you are subscribed to the Google Groups "golang-dev" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/groups/opt_out.
  • Ian Lance Taylor at May 21, 2013 at 4:54 pm

    On Tue, May 21, 2013 at 8:31 AM, Brad Fitzpatrick wrote:
    I'm worried about GPL infection. Could you compare stl_algo.h from an SGI
    or HP STL, before it went into GNU's libstdc++ to make sure it's the same,
    back before it got a GPL sticker on it too?
    The GPL is a copyright license and as such only applies to copyright.
    It does not apply to reimplementation of the algorithm in a different
    language. In any case I'm sure the algorithm is well described
    elsewhere.

    --

    ---
    You received this message because you are subscribed to the Google Groups "golang-dev" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/groups/opt_out.
  • Jan Mercl at May 22, 2013 at 9:11 am

    On Tue, May 21, 2013 at 6:54 PM, Ian Lance Taylor wrote:
    On Tue, May 21, 2013 at 8:31 AM, Brad Fitzpatrick wrote:

    I'm worried about GPL infection. Could you compare stl_algo.h from an SGI
    or HP STL, before it went into GNU's libstdc++ to make sure it's the same,
    back before it got a GPL sticker on it too?
    The GPL is a copyright license and as such only applies to copyright.
    It does not apply to reimplementation of the algorithm in a different
    language.
    That's true for a clean-room (re)implementation. A transliteration of
    existing code into another language is a different case which retains
    the original license, I think.

    -j

    --

    ---
    You received this message because you are subscribed to the Google Groups "golang-dev" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/groups/opt_out.
  • Ian Lance Taylor at May 22, 2013 at 1:31 pm

    On Wed, May 22, 2013 at 2:07 AM, Jan Mercl wrote:
    On Tue, May 21, 2013 at 6:54 PM, Ian Lance Taylor wrote:
    On Tue, May 21, 2013 at 8:31 AM, Brad Fitzpatrick wrote:

    I'm worried about GPL infection. Could you compare stl_algo.h from an SGI
    or HP STL, before it went into GNU's libstdc++ to make sure it's the same,
    back before it got a GPL sticker on it too?
    The GPL is a copyright license and as such only applies to copyright.
    It does not apply to reimplementation of the algorithm in a different
    language.
    That's true for a clean-room (re)implementation. A transliteration of
    existing code into another language is a different case which retains
    the original license, I think.
    It depends. There is no one answer in this area. I wouldn't worry
    about this case.

    Ian

    --

    ---
    You received this message because you are subscribed to the Google Groups "golang-dev" group.
    To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com.
    For more options, visit https://groups.google.com/groups/opt_out.

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-dev @
categoriesgo
postedMay 21, '13 at 9:09a
activeMay 22, '13 at 1:31p
posts6
users4
websitegolang.org

People

Translate

site design / logo © 2022 Grokbase