Simple algorithm question - how to reorder a sequence economically
What is the easiest way to reorder a sequence pseudo-randomly?

That is, for a sequence 1,2,3,4 to produce an arbitrary ordering (eg
2,1,4,3) that is different each time.

I'm writing a simulation and would like to visit all the nodes in a
different order at each iteration of the simulation to remove the risk
of a fixed order introducing spurious evidence of correlation.

## Discussion

• at May 24, 2013 at 8:37 am

On Fri, May 24, 2013 at 6:14 PM, Peter Brooks wrote:
What is the easiest way to reorder a sequence pseudo-randomly?

That is, for a sequence 1,2,3,4 to produce an arbitrary ordering (eg
2,1,4,3) that is different each time.

I'm writing a simulation and would like to visit all the nodes in a
different order at each iteration of the simulation to remove the risk
of a fixed order introducing spurious evidence of correlation.

Permuting a sequence iteratively to cover every possibility? Good fun.

Here's one way of looking at it. Imagine the index of the permutation
written in a mixed-radix ("factorial base") form, where each digit selects
one of the elements still remaining:

[a, b, c, d] --> index = a + b*4 + c*(4*3) + d*(4*3*2)

Then iterate up to the highest possible value (ie 4*3*2*1), picking
indices for each accordingly. I don't know how efficient this will be,
but here's a simple piece of code to do it:

def permute(lst, pos):
    lst = lst[:]  # Take a copy so the caller's list isn't modified
    ret = [None] * len(lst)
    for i in range(len(lst)):
        pos, idx = divmod(pos, len(lst))
        ret[i] = lst[idx]
        del lst[idx]
    return ret

for i in range(4*3*2*1):
    print permute([10,20,30,40], i)

[10, 20, 30, 40]
[20, 10, 30, 40]
[30, 10, 20, 40]
[40, 10, 20, 30]
[10, 30, 20, 40]
[20, 30, 10, 40]
[30, 20, 10, 40]
[40, 20, 10, 30]
[10, 40, 20, 30]
[20, 40, 10, 30]
[30, 40, 10, 20]
[40, 30, 10, 20]
[10, 20, 40, 30]
[20, 10, 40, 30]
[30, 10, 40, 20]
[40, 10, 30, 20]
[10, 30, 40, 20]
[20, 30, 40, 10]
[30, 20, 40, 10]
[40, 20, 30, 10]
[10, 40, 30, 20]
[20, 40, 30, 10]
[30, 40, 20, 10]
[40, 30, 20, 10]

It works, it produces a unique list for any given index provided, but
it's not the cleanest or most efficient. But I know someone'll improve
on it... or tell me I'm an idiot for not taking a more obvious
approach :)

ChrisA
• at May 24, 2013 at 8:47 am

On 24 May 2013 09:41, "Chris Angelico" wrote:
...
It works, it produces a unique list for any given index provided, but
it's not the cleanest or most efficient. But I know someone'll improve
on it... or tell me I'm an idiot for not taking a more obvious
approach :)

ChrisA

I think that is pretty much itertools.permutations from the standard
library. The OP should check it out.
• at May 24, 2013 at 9:11 am

On Fri, May 24, 2013 at 6:47 PM, Fábio Santos wrote:
I think that is pretty much itertools.permutations from the standard
library. The OP should check it out.

That works if all the permutations are wanted at once. Is there a way,
short of iterating over it N times, to request permutation #N? Or
maybe I'm misreading the OP and that's not a requirement.

ChrisA
• at May 24, 2013 at 2:33 pm

On 24/05/13 10:11, Chris Angelico wrote:
That works if all the permutations are wanted at once. Is there a way,
short of iterating over it N times, to request permutation #N? Or
maybe I'm misreading the OP and that's not a requirement.

ChrisA

A long time ago I wrote some code to do that.

import gmpy

def LexPermFromIndex(items, index):
    n = len(items)
    inds = range(n)
    perm = []
    for i in range(1, n+1):
        r, index = divmod(index, gmpy.fac(n-i))
        r = int(r)
        perm.append(inds[r])
        inds = inds[:r] + inds[r+1:]

    return [items[i] for i in perm]

>>> LexPermFromIndex([1,2,3,4], 0)
[1, 2, 3, 4]
>>> LexPermFromIndex([1,2,3,4], 1)
[1, 2, 4, 3]
>>> LexPermFromIndex([1,2,3,4], 10)
[2, 4, 1, 3]

I can't remember exactly why I wrote it. But I also have something for
generating a permutation's index and similar functions for combinations.

Duncan
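Duncan's code is Python 2 (it slices the list returned by range) and uses gmpy only for fac; with math.factorial the same idea needs no third-party library. A Python 3 sketch of the approach, with names of my own choosing:

```python
from math import factorial

def lex_perm_from_index(items, index):
    """Return permutation number `index` (0-based) of `items`,
    counting in lexicographic order -- same idea as LexPermFromIndex."""
    items = list(items)
    perm = []
    for i in range(len(items), 0, -1):
        # Each factorial-base digit selects one remaining element.
        r, index = divmod(index, factorial(i - 1))
        perm.append(items.pop(r))
    return perm

print(lex_perm_from_index([1, 2, 3, 4], 10))  # [2, 4, 1, 3]
```

Iterating the index from 0 to n!-1 visits the permutations in the same order as itertools.permutations on sorted input.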
• at May 24, 2013 at 10:10 am

On 5/24/2013 4:14 AM, Peter Brooks wrote:
What is the easiest way to reorder a sequence pseudo-randomly?

That is, for a sequence 1,2,3,4 to produce an arbitrary ordering (eg
2,1,4,3) that is different each time.

I'm writing a simulation and would like to visit all the nodes in a
different order at each iteration of the simulation to remove the risk
of a fixed order introducing spurious evidence of correlation.

The random module has a shuffle function, I believe.
• at May 24, 2013 at 10:52 am

On Fri, 24 May 2013 01:14:45 -0700, Peter Brooks wrote:

What is the easiest way to reorder a sequence pseudo-randomly?

import random
random.shuffle(sequence)

The sequence is modified in place, so it must be mutable. Lists are okay,
tuples are not.
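A related stdlib idiom, for when the input is immutable or must not be modified: random.sample(seq, len(seq)) returns a new shuffled list rather than shuffling in place. A minimal sketch:

```python
import random

original = (1, 2, 3, 4)  # a tuple: cannot be shuffled in place
shuffled = random.sample(original, len(original))  # new list; original untouched
print(shuffled)
```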

That is, for a sequence 1,2,3,4 to produce an arbitrary ordering (eg
2,1,4,3) that is different each time.

You can't *guarantee* that it will be different each time. With a four-
item list, there are only 4! = 24 permutations, so on average you'll get
the original order one time in 24. For a ten-item list, that is once
every 3628800 times, and for a twenty-item list, once in
2432902008176640000 times. But of course these are *random*, and there's
always a chance of this:

http://dilbert.com/strips/comic/2001-10-25

--
Steven
• at May 24, 2013 at 11:26 am

On 5/24/2013 6:52 AM, Steven D'Aprano wrote:
On Fri, 24 May 2013 01:14:45 -0700, Peter Brooks wrote:

What is the easiest way to reorder a sequence pseudo-randomly?
import random
random.shuffle(sequence)

The sequence is modified in place, so it must be mutable. Lists are okay,
tuples are not.

That is, for a sequence 1,2,3,4 to produce an arbitrary ordering (eg
2,1,4,3) that is different each time.
You can't *guarantee* that it will be different each time. With a four-
item list, there are only 4! = 24 permutations, so on average you'll get
the original order one time in 24. For a ten-item list, that is once
every 3628800 times, and for a twenty-item list, once in
2432902008176640000 times. But of course these are *random*, and there's
always a chance of this:

http://dilbert.com/strips/comic/2001-10-25

Also, heed the note in the docs: "Note that for even rather small
len(x), the total number of permutations of x is larger than the
period of most random number generators; this implies that most
permutations of a long sequence can never be generated." The default
random number generator has a period of 2**19937-1, which means that
lists longer than 2080 have more permutations than the period of the
generator (because factorial(2081) > 2**19937). Most shuffles involve
lists far shorter than 2080, but it's still good to keep it in mind.
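That 2080 threshold is easy to verify with nothing but integer arithmetic (a quick check, not part of the original post):

```python
# Find the smallest n such that n! exceeds the Mersenne Twister
# period 2**19937 - 1 (the default generator in Python's random module).
PERIOD = 2**19937 - 1

f, n = 1, 0
while f <= PERIOD:
    n += 1
    f *= n  # f is now n!

print(n)  # smallest list length whose permutation count exceeds the period
```

So lists of 2081 or more elements have more orderings than the generator can ever produce.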

--Ned.
• at May 24, 2013 at 1:23 pm
Thank you all for those most helpful suggestions! random.shuffle does
precisely the job that I need quickly. Thank you for introducing me to
itertools, though, I should have remembered APL did this in a symbol
or two and I'm sure that itertools will come in handy in future.

Thanks for the warnings about random numbers too - I hope my lists
will be short enough for the permutations of the function to be
irrelevant. I don't need every single sequence to be unique, only that
the same sequence only occurs occasionally. My lists might get up to
the ~2k length one day, and I'll keep in mind the need, at that stage,
to use a different pseudo-random generator. Actually, thinking about
it, there is probably a source of non-algorithmically-derived 'random'
numbers somewhere on the net that would do the job nicely.
• at May 24, 2013 at 1:57 pm

On Fri, 24 May 2013 06:23:14 -0700, Peter Brooks wrote:

Thanks for the warnings about random numbers too - I hope my lists will
be short enough for the permutations of the function to be irrelevant. I
don't need every single sequence to be unique, only that the same
sequence only occurs occasionally. My lists might get up to the ~2k
length one day, and I'll keep in mind the need, at that stage, to use a
different pseudo-random generator. Actually, thinking about it, there is
probably a source of non-algorithmically-derived 'random' numbers
somewhere on the net that would do the job nicely.

That's massive overkill.

Take a closer look at what Ned wrote:

"The default random number generator has a period of 2**19937-1"

and consider the numbers.

For a list with 3000 items, there are 3000! possible permutations. That
is approximately 10**9131, that is, a number with 9131 digits, or
somewhere around a trillion trillion trillion ... trillion trillion,
where there are about 760 "trillions".

If you could generate a million permutations a second, it would take you
on average around 10**9115 centuries before getting the original
permutation again. That's the expected result you would get with a source
of true randomness.

Instead, with Python's default pseudo-random number generator, you cannot
get the full 3000! permutations, but only 2**19937-1 of them (about
10**6002). Using the same calculation as above, that means that you will
have to generate a million permutations per second for "only" about
10**5986 centuries before getting the original permutation again.

There are uses where being able to generate any possible permutation is
important, and the default PRNG is not sufficient. But merely shuffling
your data around to avoid spurious correlations is not one of them. Save
yourself a lot of development effort, and debugging, and just use
random.shuffle.

--
Steven
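These magnitudes can be checked exactly, since Python integers are unbounded; a quick sanity check on the two numbers involved:

```python
from math import factorial

digits_fact = len(str(factorial(3000)))   # decimal digits in 3000!
digits_period = len(str(2**19937 - 1))    # decimal digits in the Mersenne Twister period
print(digits_fact, digits_period)
```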
• at Jun 12, 2013 at 1:14 pm

On 2013-05-24 14:43, Chris Angelico wrote:
On Fri, May 24, 2013 at 11:23 PM, Peter Brooks
wrote:
it, there is probably a source of non-algorithmically-derived 'random'
numbers somewhere on the net that would do the job nicely.
True entropy is usually provided by a source such as /dev/random (on
Unix systems). It's sometimes referred to as "cryptographic"
randomness, due to its necessity in secure encryption work. There are
various ways to get this in a cross-platform way.

os.urandom(), particularly.
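For what it's worth, the stdlib already wraps this source: random.SystemRandom draws from os.urandom() and supports shuffle directly, so no network entropy source is needed. A minimal sketch:

```python
import random

sysrand = random.SystemRandom()  # backed by os.urandom(); cannot be seeded reproducibly
nodes = list(range(10))
sysrand.shuffle(nodes)           # shuffle in place using OS entropy
print(nodes)
```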

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
an underlying truth."
-- Umberto Eco
• at May 24, 2013 at 5:33 pm

On Friday, May 24, 2013 3:52:18 AM UTC-7, Steven D'Aprano wrote:
On Fri, 24 May 2013 01:14:45 -0700, Peter Brooks wrote:

That is, for a sequence 1,2,3,4 to produce an arbitrary ordering (eg
2,1,4,3) that is different each time.
You can't *guarantee* that it will be different each time.

Well, within limits, you can guarantee a LONG TIME between repeats. And that may be good enough for Peter's purpose.

When the number of elements in your sequence is short enough (the right value of "short enough" is at least partially a matter of opinion, and the amount of RAM available, but I would guess n < 10 myself), use itertools.permutations to generate all the possibilities at once -- call this list p. Store p in memory. Then shuffle p, and use its elements one at a time. If you get to the end of p and need to keep working, it's up to you whether to shuffle p a second time. If you don't reshuffle p, it guarantees the maximum interval between reusing the same permutation.

Use random.shuffle on your sequence directly, when your sequence has a larger number of elements. This doesn't guarantee that two successive permutations will not be identical, but the probability is of course very low.
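The precompute-and-shuffle scheme described above can be sketched in a few lines (the function name is mine, not from the thread):

```python
import itertools
import random

def permutation_stream(seq):
    """Yield every permutation of seq once, in random order, then repeat.

    Feasible only for short sequences (roughly n < 10, as suggested
    above), since all n! permutations are held in memory.
    """
    perms = list(itertools.permutations(seq))
    random.shuffle(perms)
    while True:          # cycling without reshuffling guarantees the
        for p in perms:  # maximum interval between repeats of any ordering
            yield list(p)
```

Within each block of n! draws, every ordering appears exactly once.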
• at May 26, 2013 at 1:25 am

On Friday, May 24, 2013 10:33:47 AM UTC-7, Yours Truly wrote:
If you don't reshuffle p, it guarantees the maximum interval between reusing
the same permutation.

Of course, that comes at a certain price. Given two permutations p[x] and p[x+1], they will ALWAYS be adjacent, in every repetition of the list, unless you reshuffle. So the "spurious correlation" problem that Peter is worrying about would still exist, albeit at a meta-level.

You could play clever games, such as splitting p in half once you get to the end of it, shuffling each half independently, and then concatenating the halves. This algorithm scrambles the p[x]'s and p[x+1]'s pretty well, at the cost of cutting the average interval between repeats of a given p[x] from len(p) to something closer to len(p)/2.

Because someone's got to say it... "The generation of random numbers is too important to be left to chance." -- Robert R. Coveyou
• at May 26, 2013 at 1:49 am

Because someone's got to say it... "The generation of random numbers is too
important to be left to chance." -- Robert R. Coveyou

Absolutely. I know just enough about random number generation to
understand that I don't really know anything about it :-)

That being said, people who really care about random numbers tend to
rely on some sort of physical process instead of computer algorithms. A
classic example would be /dev/random. A somewhat more fun example is a
Geiger counter, which is a good source of randomness (the time intervals
between decay events).
• at May 24, 2013 at 3:00 pm
----------------------------------------
Date: Fri, 24 May 2013 01:14:45 -0700
Subject: Simple algorithm question - how to reorder a sequence economically
From: peter.h.m.brooks at gmail.com
To: python-list at python.org

What is the easiest way to reorder a sequence pseudo-randomly?

That is, for a sequence 1,2,3,4 to produce an arbitrary ordering (eg
2,1,4,3) that is different each time.

I'm writing a simulation and would like to visit all the nodes in a
different order at each iteration of the simulation to remove the risk
of a fixed order introducing spurious evidence of correlation.
--
http://mail.python.org/mailman/listinfo/python-list

I don't know what "spurious evidence of correlation" is. Can you give a mathematical definition?

Here's a snippet for creating a random shuffle by the Fisher-Yates algorithm:

def FY_shuffle(l):
    from random import randint
    for i in range(len(l)-1, 0, -1):
        j = randint(0, i)
        l[j], l[i] = l[i], l[j]

It looks just like random.shuffle() mentioned before, but you can change it as you see fit.

If you can afford to test all permutations you can iterate over it by doing:

>>> from itertools import permutations
>>> l = range(4)
>>> l
[0, 1, 2, 3]
>>> for i in permutations(l): print i
...
(0, 1, 2, 3)
(0, 1, 3, 2)
(0, 2, 1, 3)
[...]
(3, 1, 2, 0)
(3, 2, 0, 1)
(3, 2, 1, 0)

Note that 'i' is a tuple.

If you need a list of all permutations to make a selection:

>>> l = range(4)
>>> l
[0, 1, 2, 3]
>>> [list(i) for i in permutations(l)]
[[0, 1, 2, 3], [0, 1, 3, 2], [0, 2, 1, 3], [0, 2, 3, 1], [0, 3, 1, 2], [0, 3, 2, 1], [1, 0, 2, 3], [1, 0, 3, 2], [1, 2, 0, 3], [1, 2, 3, 0], [1, 3, 0, 2], [1, 3, 2, 0], [2, 0, 1, 3], [2, 0, 3, 1], [2, 1, 0, 3], [2, 1, 3, 0], [2, 3, 0, 1], [2, 3, 1, 0], [3, 0, 1, 2], [3, 0, 2, 1], [3, 1, 0, 2], [3, 1, 2, 0], [3, 2, 0, 1], [3, 2, 1, 0]]

This will produce big lists:
- for 10 elements (l=range(10)) the size of the list is about 30MB (on Windows).
- for 11 elements (l=range(11)) the size of the list is about 335MB (on Windows). It took more than 7GB for CPython 2.7.5 to create that list.

Didn't try after that.
• at May 24, 2013 at 7:01 pm

On May 24, 5:00 pm, Carlos Nepomuceno wrote:

I don't know what "spurious evidence of correlation" is. Can you give a mathematical definition?
If I run the simulation with the same sequence, then, because event E1
always comes before event E2, somebody might believe that there is a
causative connection between them in the world that's being simulated,
when, in fact, they only correlate in this way because the sequence is
not being shuffled. That's what it means.

Actually it'll be a bit more subtle than that, because each iteration
of the simulation updates all nodes in one time interval, the events
will not usually show the order of iteration - but, where there are
any secondary effects, that are related to the order in which the
nodes are updated, these will always happen the same way, which is my
concern.
• at May 24, 2013 at 9:33 pm

On Fri, 24 May 2013 12:01:35 -0700, Peter Brooks wrote:
On May 24, 5:00 pm, Carlos Nepomuceno wrote:

I don't know what "spurious evidence of correlation" is. Can you give a mathematical definition?
If I run the simulation with the same sequence, then, because event E1
always comes before event E2, somebody might believe that there is a
causative connection between them in the world that's being simulated,
when, in fact, they only correlate in this way because the sequence is
not being shuffled. That's what it means.

Correlation does not imply causation. If "somebody" is an expert system and you want to avoid its recognition and/or triggering of some kind, and you can't or don't want to change its behavior, you may take the random way because it's cheaper.

Actually it'll be a bit more subtle than that, because each iteration
of the simulation updates all nodes in one time interval, the events
will not usually show the order of iteration - but, where there are
any secondary effects, that are related to the order in which the
nodes are updated, these will always happen the same way, which is my
concern.

You should have a more precise understanding of the dependence of the variables you are taking into consideration before randomizing the series of events you are using for tests.

I suggest you start by computing the PPMCC (Pearson product-moment correlation coefficient). If it's close to zero or negative you won't have to worry about it! ;)
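For reference, PPMCC is Pearson's r; a dependency-free sketch (the function name is mine, not from the thread):

```python
def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient (PPMCC) of two
    equal-length numeric series; the result lies in [-1.0, 1.0]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0 (perfectly correlated)
```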
• at May 25, 2013 at 12:28 am

On May 24, 11:33 pm, Carlos Nepomuceno wrote:
You should have a more precise understanding of the dependence of the variables you are taking into consideration before randomizing the series of events you are using for tests.
If the scenario could be modelled mathematically, then there'd be no
point in writing the simulation.

## Discussion Overview

group python-list · categories python · posted May 24, '13 at 8:14a · active Jun 12, '13 at 1:14p · posts 18 · users 11 · website python.org
