Here is a function which takes any list and creates a frequency table,
which can be printed unsorted, sorted by cases (counts, highest first)
or sorted by items. It's supposed to mirror PROC FREQ in SAS.

Dirk

def freq(seq,order='unsorted',prin=True):
    #order can be unsorted, cases, items

    freq={}
    for s in seq:
        if s in freq:
            freq[s]+=1
        else:
            freq[s]=1
    if prin==True:
        print 'Items=',len(seq),'Cases=',len(freq)
        print '------------------------'
        if order=='unsorted':
            for k in freq.keys():
                print k,freq[k],float(freq[k])/len(seq)
        elif order=='cases':
            #http://blog.client9.com/2007/11/sorting-python-dict-by-value.html
            freq2=sorted(freq.iteritems(), key=lambda (k,v): (v,k),reverse=True)
            for f in freq2:
                print f[0],f[1],float(f[1])/len(seq)
        elif order=='items':
            for k in sorted(freq.iterkeys()):
                print k,freq[k],float(freq[k])/len(seq)
        print '------------------------'
    return freq

#test

import random

rand=[]
for i in range(10000):
    rand.append(str(int(100*random.random())))

fr=freq(rand)
fr2=freq(rand,order='items')
fr2=freq(rand,order='cases')
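
For readers on Python 3, where the print statement, `iteritems()` and tuple-unpacking lambdas used above no longer work, here is a minimal sketch of the same idea. It is not the poster's code: the names `freq_table` and `print_freq` are hypothetical, counting is delegated to `collections.Counter` from the standard library, and ties in the 'cases' ordering are broken by item in ascending order, which differs from the original.

```python
from collections import Counter


def freq_table(seq, order="unsorted"):
    """Return (item, count, share) rows; order is 'unsorted', 'cases' or 'items'."""
    seq = list(seq)  # allow any iterable, not only lists
    counts = Counter(seq)
    if order == "cases":
        # Highest count first; equal counts are ordered by item, ascending.
        pairs = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    elif order == "items":
        pairs = sorted(counts.items())
    elif order == "unsorted":
        pairs = list(counts.items())
    else:
        raise ValueError("order must be 'unsorted', 'cases' or 'items'")
    total = len(seq)
    return [(item, n, n / total) for item, n in pairs]


def print_freq(rows):
    """Print rows in roughly the layout used by the original function."""
    print("Items=", sum(n for _, n, _ in rows), "Cases=", len(rows))
    print("-" * 24)
    for item, n, share in rows:
        print(item, n, share)
    print("-" * 24)


sample = ["b", "a", "b", "c", "a", "b"]
print_freq(freq_table(sample, order="cases"))
```

Returning the rows and printing them in a separate function replaces the `prin` flag, so the table can be reused without any output.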


  • Shashwat Anand at Aug 22, 2010 at 8:16 am

    On Sun, Aug 22, 2010 at 1:31 PM, Dirk Nachbar wrote:

    Here is a function which takes any list and creates a freq table,
    which can be printed unsorted, sorted by cases or items. It's supposed
    to mirror the proc freq in SAS.

    Dirk

    def freq(seq,order='unsorted',prin=True):
        #order can be unsorted, cases, items

        freq={}
        for s in seq:
            if s in freq:
                freq[s]+=1
            else:
                freq[s]=1

    The above code can be replaced with this:

        freq = {}
        for s in seq:
            freq[s] = freq.get(s,0) + 1

    I feel the code you wrote is a bit bloated. You should definitely give
    it another try and improve it.



    --
    ~l0nwlf
  • Chris Rebert at Aug 22, 2010 at 8:34 am

    On Sun, Aug 22, 2010 at 1:16 AM, Shashwat Anand wrote:
    On Sun, Aug 22, 2010 at 1:31 PM, Dirk Nachbar wrote:
    Here is a function which takes any list and creates a freq table,
    which can be printed unsorted, sorted by cases or items. It's supposed
    to mirror the proc freq in SAS.

    Dirk
    <snip>
        freq={}
        for s in seq:
            if s in freq:
                freq[s]+=1
            else:
                freq[s]=1
    The above code can be replaced with this:
        freq = {}
        for s in seq:
            freq[s] = freq.get(s,0) + 1
    Which can be further replaced by:

    from collections import Counter
    freq = Counter(seq)

    collections.Counter was added in Python 2.7; on older versions,
    collections.defaultdict is another possibility.

    Cheers,
    Chris
    --
    It really bothers me that Counter isn't a proper Bag.
    http://blog.rebertia.com
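
The two alternatives mentioned in the reply above can be compared side by side. This is a small sketch in Python 3 syntax, not code from the thread: the helper name `count_with_defaultdict` and the sample data are made up for illustration.

```python
from collections import Counter, defaultdict


def count_with_defaultdict(seq):
    """Count items using defaultdict(int); missing keys start at 0."""
    counts = defaultdict(int)
    for item in seq:
        counts[item] += 1
    return counts


sample = ["a", "b", "a", "c", "a", "b"]

counter_counts = Counter(sample)
dd_counts = count_with_defaultdict(sample)

# Both hold the same item -> count mapping.
print(dict(counter_counts) == dict(dd_counts))

# Counter additionally offers most_common() for the "sorted by count" view.
print(counter_counts.most_common(2))
```

`Counter` needs no explicit loop and ships helpers such as `most_common()`, while `defaultdict(int)` keeps the loop but removes the membership test from the original function.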
  • Peter Otten at Aug 22, 2010 at 12:29 pm

    Dirk Nachbar wrote:

    Here is a function which takes any list and creates a freq table,
    which can be printed unsorted, sorted by cases or items. It's supposed
    to mirror the proc freq in SAS.

    Dirk

    def freq(seq,order='unsorted',prin=True):
        #order can be unsorted, cases, items

        freq={}
        for s in seq:
            if s in freq:
                freq[s]+=1
            else:
                freq[s]=1
        if prin==True:
            print 'Items=',len(seq),'Cases=',len(freq)
            print '------------------------'
            if order=='unsorted':
                for k in freq.keys():
                    print k,freq[k],float(freq[k])/len(seq)
            elif order=='cases':
                #http://blog.client9.com/2007/11/sorting-python-dict-by-value.html
                freq2=sorted(freq.iteritems(), key=lambda (k,v): (v,k),reverse=True)
    Sorting in two steps gives a slightly better result when there are items
    with equal frequencies. Compare
    >>> freq = {"a": 2, "b": 1, "c": 1, "d": 2}
    >>> sorted(freq.iteritems(), key=lambda (k, v): (v, k), reverse=True)
    [('d', 2), ('a', 2), ('c', 1), ('b', 1)]

    with

    >>> freq2 = sorted(freq.iteritems(), key=lambda (k, v): k)
    >>> freq2.sort(key=lambda (k, v): v, reverse=True)
    >>> freq2
    [('a', 2), ('d', 2), ('b', 1), ('c', 1)]

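    Here the keys within each group of equal frequency are in ascending
    rather than reversed order, because Python's sort is stable: the second
    sort keeps the key order established by the first.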
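
The comparison above uses Python 2 syntax (`iteritems()` and tuple-unpacking lambdas). As a sketch of the same technique in Python 3, the snippet below repeats both variants and adds a single-pass alternative that negates the count. The variable names are illustrative, and the negation trick is a substitute technique that only works because the counts are numbers.

```python
freq = {"a": 2, "b": 1, "c": 1, "d": 2}

# One pass: sort by (count, key) and reverse the whole ordering,
# so keys within equal counts also come out reversed.
one_step = sorted(freq.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)

# Two passes: keys ascending first, then a stable sort by count, descending.
two_step = sorted(freq.items(), key=lambda kv: kv[0])
two_step.sort(key=lambda kv: kv[1], reverse=True)

# Single-pass alternative: negate the count so only the count is descending.
negated = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))

print(one_step)   # [('d', 2), ('a', 2), ('c', 1), ('b', 1)]
print(two_step)   # [('a', 2), ('d', 2), ('b', 1), ('c', 1)]
print(negated == two_step)  # True
```

The two-pass form relies only on sort stability, so it also works when the primary sort value cannot be negated, such as a string.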

Discussion Overview
group: python-list
categories: python
posted: Aug 22, '10 at 8:01a
active: Aug 22, '10 at 12:29p
posts: 4
users: 4
website: python.org
