FAQ
Hi all,

I got two questions when I try to play with slice..

1. Suppose we have the following case:
we have a slice that contains 1million number, from 1 to 1000000. Then I
need to iterate the slice and delete the number that meet some requirement.
Is slice is the perfect data structure to do that, or any other data
structure is better than slice? I tried list, but it's too slow.

2. Is the slice's implementation guarantee concurrent write, or the user is
required to implement the concurrent write action?

Thanks.
Robert

--

Search Discussions

  • Erwin at Oct 15, 2012 at 6:52 pm
    regarding 1)
    if the ordering of elements in the slice doesn't matter, a fast deletion
    can be done by placing the last element of the slice in the position of the
    element to be deleted, followed by shortening the slice length by 1 (slice
    = slice[:len(slice) - 1])

    regarding 2)
    slices need protection when two ore more concurrent writes might happen to
    the same slice element

    --
  • Vanja Pejovic at Oct 15, 2012 at 6:52 pm
    1. If you are going to be deleting many numbers, you probably want to make
    a new slice and put the numbers you want to keep in it. If you want to do
    this concurrently consider having an input channel for all the 1 million
    numbers and an output channel for the numbers you want to keep. Look at
    implementations of the prime sieve in go for inspiration:
    https://www.google.com/search?q=prime+sieve+go&aq=f&oq=prime+sieve+go&sugexp=chrome,mod=0&sourceid=chrome&ie=UTF-8

    2. Not sure exactly what your question is, but I'm pretty sure that writing
    a value to an index in a slice is guaranteed to leave the slice and the
    value at the index in a consistent state if the value is smaller than a
    word. Deleting from a slice is not at all atomic though.

    On Mon, Oct 15, 2012 at 2:34 PM, Robert Sandra
    wrote:
    Hi all,

    I got two questions when I try to play with slice..

    1. Suppose we have the following case:
    we have a slice that contains 1million number, from 1 to 1000000. Then I
    need to iterate the slice and delete the number that meet some requirement.
    Is slice is the perfect data structure to do that, or any other data
    structure is better than slice? I tried list, but it's too slow.

    2. Is the slice's implementation guarantee concurrent write, or the user
    is required to implement the concurrent write action?

    Thanks.
    Robert

    --

    --
  • Kevin Gillette at Oct 15, 2012 at 10:15 pm
    For such a computationally simple task, doing this concurrently doesn't
    really improve the structure of the code, and the cost using channels or
    mutexes would outweigh the benefit of trying to parallelize the operation
    using an append methodology (which does require locking -- it'd take _much_
    longer to lock/unlock or do channel ops then it would to do the actual
    'useful' work).

    If you have to scan the entire slice anyway, and order of elements doesn't
    matter, I'd suggest notnot's approach, as it involves the fewest number of
    writes. It can be done concurrently (and in parallel) as suggested by
    others here, by giving each worker a non-overlapping slice, which can
    either use notnot's approach, or if order is needed, you can just do a
    manual copy of each element as it's encountered to fill the beginning of
    the gap as it grows (it will effectively never shrink) -- each subslice can
    be processed by each worker in one pass, and a post-process step using
    copy() can be done to fill the inter-subslice gaps, with a new,
    precisely-sized backing array if desired.
    On Monday, October 15, 2012 12:53:16 PM UTC-6, Vanja Pejovic wrote:

    1. If you are going to be deleting many numbers, you probably want to make
    a new slice and put the numbers you want to keep in it. If you want to do
    this concurrently consider having an input channel for all the 1 million
    numbers and an output channel for the numbers you want to keep. Look at
    implementations of the prime sieve in go for inspiration:
    https://www.google.com/search?q=prime+sieve+go&aq=f&oq=prime+sieve+go&sugexp=chrome,mod=0&sourceid=chrome&ie=UTF-8
    --
  • Dumitru Ungureanu at Oct 15, 2012 at 7:18 pm
    And I imagine things will repeat after a while, since we're working with
    the decimal system. I suspect 99 would pretty much be the same with 9999
    and so on.

    The way I see it: a simple number generator/increment, processing, writing
    the "right" numbers in a file.
    Use go routines and range number generators to concurrently solve more than
    one increment.

    But I'd look for patterns, not for brute force attack.

    --
  • Dumitru Ungureanu at Oct 15, 2012 at 7:20 pm
    1. Working with so big hunks of data is never a good idea: divide et
    impera. Ranges: sl1[1:10.000], sl2[10.001:20.000] or something like that.

    2. Everything is memory, you can concurrently write in memory. But you need
    to implement the write.

    --
  • Bryanturley at Oct 15, 2012 at 9:49 pm

    1. Working with so big hunks of data is never a good idea: divide et
    impera. Ranges: sl1[1:10.000], sl2[10.001:20.000] or something like that.
    You can't slice fractionally, sl2[10.001:20.000] doesn't make sense to me,
    perhaps I missed something here...

    http://golang.org/ref/spec#Slice_types
    "The elements can be addressed by integer indices 0 through len(s)-1"

    AND THEN I realized I live in a country where 1,000 is a thousand and you
    probably live in one where 1.000 is a thousand ;)
    Still I don't think you can slice with commas or periods 1000 not 1,000 or
    1.000

    --
  • Dumitru Ungureanu at Oct 15, 2012 at 9:04 pm
    On Monday, October 15, 2012 11:54:30 PM UTC+3, bryanturley wrote:

    1. Working with so big hunks of data is never a good idea: divide et
    impera. Ranges: sl1[1:10.000], sl2[10.001:20.000] or something like that.
    You can't slice fractionally, sl2[10.001:20.000] doesn't make sense to me,
    perhaps I missed something here...
    http://play.golang.org/p/9d287iKxi5


    AND THEN I realized I live in a country where 1,000 is a thousand and you
    probably live in one where 1.000 is a thousand ;)
    Yes, you're right. Romania is my homeland.

    --
  • Bryanturley at Oct 15, 2012 at 9:11 pm
    Heh, no I thought you where trying to slice like x[0 : float32(3) /
    float32(4)], because of the "." and "," mix up.
    Wrote that way to verbosely because of the aforementioned mix up.



    --
  • Bryanturley at Oct 15, 2012 at 8:38 pm
    Depends on what your doing. If you have a slice that contains 1 to 1mil
    and you want to delete entries from it...

    Why not just have a slice that has the numbers you don't want from 1 to
    1mil in it? I bet in the end it will be shorter.

    Also perhaps write your own bitmap (not the misnamed graphics one...)
    which wikipedia seems to think is called a
    http://en.wikipedia.org/wiki/Bit_array
    If your range is always 1 to 1mil you could pack 64 of those numbers into a
    single uint64 and test for them.
    Would also take 1/64th of the memory. Though i would consider making your
    range 0-1mil for simplicity, even if you never use 0.


    If you used atomic operations on the bitmap it could be thread safe,
    assuming it never grows or moves somewhere else.

    --
  • Dumitru Ungureanu at Oct 15, 2012 at 9:12 pm
    ... and I used 10.000 number notation for readability.

    --
  • Dumitru Ungureanu at Oct 15, 2012 at 9:17 pm
    My best bet, to eliminate or to select, are still patterns.

    But, if it's a must, then something like ten go routines 1-100.000 (or
    100,000) through 900.001-1.000.000, to iterate the numbers for processing,
    all buffering to the same channel responsible for writing the right numbers
    in a file.

    --

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
groupgolang-nuts @
categoriesgo
postedOct 15, '12 at 6:34p
activeOct 15, '12 at 10:15p
posts12
users6
websitegolang.org

People

Translate

site design / logo © 2022 Grokbase