FAQ
In the following code I create the graph with vertices
sgb-words.txt (the file of 5 letter words from the
stanford graphbase), and an edge if two words differ
by one letter. The two methods I wrote seem to me to
likely perform the same computations, the list comprehension
is faster though (281 seconds VS 305 seconds on my dell mini).

Is the right interpretation of this timing difference
that the comprehension is performed in the lower level
C code?

As this time I have no other conjecture about the cause.

---------------------------------------------------------
import time
import copy

data = map (lambda x: x.strip(), open('sgb-words.txt').readlines())

def d (w1, w2):
count = 0
for idx in range(0,5):
if w1[idx] != w2[idx]:
count += 1
return count

print "creating graph"
t0 = time.clock ()
graph = [[a,b] for a in data for b in data if d(a,b) ==1 and a < b]
t1 = time.clock ()
print "took " + str (t1 - t0) + " seconds."

t0 = time.clock ()
graph2 = []
for i in range (0, len(data)):
for j in range(0,len(data)):
if d(data[i],data[j]) == 1 and i < j:
graph2.append ([i,j])
t1 = time.clock ()
print "took " + str (t1 - t0) + " seconds."

## Search Discussions

• at Sep 2, 2011 at 1:12 am ⇧ On 02/09/2011 01:35, Bart Kastermans wrote:
In the following code I create the graph with vertices
sgb-words.txt (the file of 5 letter words from the
stanford graphbase), and an edge if two words differ
by one letter. The two methods I wrote seem to me to
likely perform the same computations, the list comprehension
is faster though (281 seconds VS 305 seconds on my dell mini).

Is the right interpretation of this timing difference
that the comprehension is performed in the lower level
C code?

As this time I have no other conjecture about the cause.

---------------------------------------------------------
import time
import copy

data = map (lambda x: x.strip(), open('sgb-words.txt').readlines())

def d (w1, w2):
count = 0
for idx in range(0,5):
if w1[idx] != w2[idx]:
count += 1
return count

print "creating graph"
t0 = time.clock ()
graph = [[a,b] for a in data for b in data if d(a,b) ==1 and a< b]
t1 = time.clock ()
print "took " + str (t1 - t0) + " seconds."

t0 = time.clock ()
graph2 = []
for i in range (0, len(data)):
for j in range(0,len(data)):
if d(data[i],data[j]) == 1 and i< j:
graph2.append ([i,j])
t1 = time.clock ()
print "took " + str (t1 - t0) + " seconds."
Are they actually equivalent? Does graph == graph2?

The first version (list comprehension) creates a list of pairs of
values:

[a, b]

whereas the second version (for loops) creates a list of pairs of
indexes:

[i, j]

The second version has subscripting ("data[i]" and "data[j]"), which
will slow it down.
• at Sep 2, 2011 at 1:54 pm ⇧ MRAB <python at mrabarnett.plus.com> writes:
On 02/09/2011 01:35, Bart Kastermans wrote:

graph = [[a,b] for a in data for b in data if d(a,b) ==1 and a< b]
graph2 = []
for i in range (0, len(data)):
for j in range(0,len(data)):
if d(data[i],data[j]) == 1 and i< j:
graph2.append ([i,j])
Are they actually equivalent? Does graph == graph2?

The first version (list comprehension) creates a list of pairs of
values:

[a, b]

whereas the second version (for loops) creates a list of pairs of
indexes:

[i, j]

The second version has subscripting ("data[i]" and "data[j]"), which
will slow it down.
You are absolutely right. I had changed the code from the
equivalent:

graph2 = []
for i in range (0, len(data)):
for j in range(0,len(data)):
if d(data[i],data[j]) == 1 and i < j:
graph2.append ([data[i],data[j]])

But then also tried the equivalent

for a in data:
for b in data:
if d(a,b) == 1 and a < b:
graph2.append([a,b])

Which does away with the indexing, and is just about exactly as
fast as the list comprehension.

That'll teach me; I almost didn't ask the question thinking it might
be silly. And it was, but I thought it for the wrong reason. I tell my
students there are no stupid questions, I should listen to myself
more when I do. Thanks!
• at Sep 2, 2011 at 2:50 pm ⇧ On Sep 2, 9:54?am, Bart Kastermans wrote:
if d(a,b) == 1 and a < b:
It will probably be faster if you reverse the evaluation order of that
expression.

if a<b and d(a,b)==1:

That way the d() function is called less than half the time. Of course
this assumes that a<b is a faster evaluation than d(a,b), but I think
--
// T.Hsu
• at Sep 3, 2011 at 2:15 am ⇧ ting at thsu.org writes:
On Sep 2, 9:54?am, Bart Kastermans wrote:
if d(a,b) == 1 and a < b:
It will probably be faster if you reverse the evaluation order of that
expression.

if a<b and d(a,b)==1:

That way the d() function is called less than half the time. Of course
this assumes that a<b is a faster evaluation than d(a,b), but I think
--
// T.Hsu
Indeed makes quite a difference, goes from 275 seconds down to
153 seconds.

## Related Discussions

Discussion Overview
 group python-list categories python posted Sep 2, '11 at 12:35a active Sep 3, '11 at 2:15a posts 5 users 3 website python.org

### 3 users in discussion

Content

People

Support

Translate

site design / logo © 2022 Grokbase