
On Feb 5, 7:12 am, Peter Otten wrote:
Slafs wrote:
Hi there!
I'm having trouble wrapping my brain around this kind of problem:
What I have:
  1) a list of dicts
  2) a list of keys that I would like to use as grouping arguments for
the elements from 1)
  3) a list of keys that I would like to "aggregate" over the elements
of 1) with some function, e.g. sum
For instance I have:
1) [ { 'g1' : 1, 'g2' : 8, 's_v1' : 5.0, 's_v2' : 3.5 },
     { 'g1' : 1, 'g2' : 9, 's_v1' : 2.0, 's_v2' : 3.0 },
     { 'g1' : 2, 'g2' : 8, 's_v1' : 6.0, 's_v2' : 8.0 }, ... ]
2) ['g1', 'g2']
3) ['s_v1', 's_v2']
To be precise, 1) is the result of a values_list method call on a
QuerySet in Django; 2) holds the arguments to that method; 3) holds the
annotation keys. So 1) is the result of:
    qs.values_list('g1', 'g2').annotate(s_v1=Sum('v1'), s_v2=Sum('v2'))
What I want to have is:
a "big" nested dictionary with 'g1' values as 1st level keys and a
dictionary of aggregates and "subgroups" in it.
In my example it would be something like this:
{
    1 : {
        's_v1' : 7.0,
        's_v2' : 6.5,
        'g2' : {
            8 : {
                's_v1' : 5.0,
                's_v2' : 3.5 },
            9 : {
                's_v1' : 2.0,
                's_v2' : 3.0 }
        }
    },
    2 : {
        's_v1' : 6.0,
        's_v2' : 8.0,
        'g2' : {
            8 : {
                's_v1' : 6.0,
                's_v2' : 8.0 }
        }
    },
...
}
# notice the summed values of s_v1 and s_v2 when g1 == 1
I was looking for a solution that would let me do this kind of
grouping with variable lists for 2) and 3), e.g. also having 'g3' as a
grouping element, so that the 'g2' dicts could have their own
"subgroups" and be nested even more deeply.
I was trying something with itertools.groupby and updating nested
dicts, but as I was writing the code it started to feel too verbose to
me :/
Do you have any hints, maybe? Because I'm kind of stuck :/
Regards
Sławek
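
For readers wondering what the itertools.groupby route mentioned in the question might look like, here is a minimal recursive sketch. It is not taken from the thread: the function name nest is made up for illustration, and it assumes every row carries all grouping and aggregate keys and that the grouping values are sortable.

```python
from itertools import groupby
from operator import itemgetter


def nest(rows, group_by, sum_over):
    """Group rows by group_by[0], sum the sum_over keys, recurse on the rest.

    'nest' is a hypothetical helper name, not something from the thread.
    """
    key, rest = group_by[0], group_by[1:]
    get_key = itemgetter(key)
    result = {}
    # groupby only merges adjacent rows, so the rows are sorted by the key first.
    for value, chunk in groupby(sorted(rows, key=get_key), key=get_key):
        chunk = list(chunk)
        # Aggregates for this group.
        node = {so: sum(row[so] for row in chunk) for so in sum_over}
        if rest:
            # Subgroups are stored under the name of the next grouping key.
            node[rest[0]] = nest(chunk, rest, sum_over)
        result[value] = node
    return result


data = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
]
result = nest(data, ['g1', 'g2'], ['s_v1', 's_v2'])
```

This version sorts and re-scans the rows once per nesting level. The reply below takes a different route and needs no sorting at all.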
Not super-efficient, but simple:

$ cat sumover.py
data = [ { 'g1' : 1, 'g2' : 8, 's_v1' : 5.0, 's_v2' : 3.5 },
         { 'g1' : 1, 'g2' : 9, 's_v1' : 2.0, 's_v2' : 3.0 },
         { 'g1' : 2, 'g2' : 8, 's_v1' : 6.0, 's_v2' : 8.0 } ]
sum_over = ["s_v1", "s_v2"]
group_by = ["g1", "g2"]

wanted = {
    1 : {
        's_v1' : 7.0,
        's_v2' : 6.5,
        'g2' : {
            8 : {
                's_v1' : 5.0,
                's_v2' : 3.5 },
            9 : {
                's_v1' : 2.0,
                's_v2' : 3.0 }
        }
    },
    2 : {
        's_v1' : 6.0,
        's_v2' : 8.0,
        'g2' : {
            8 : {
                's_v1' : 6.0,
                's_v2' : 8.0 }
        }
    },

}

def calc(data, group_by, sum_over):
    tree = {}
    group_by = group_by + [None]
    for item in data:
        d = tree
        for g in group_by:
            for so in sum_over:
                d[so] = d.get(so, 0.0) + item[so]
            if g:
                d = d.setdefault(g, {}).setdefault(item[g], {})
    return tree

got = calc(data, group_by, sum_over)[group_by[0]]
assert got == wanted
$ python sumover.py
$

Untested.
Very clever. I didn't understand how it worked until I rewrote it like
this:

def calc(data, group_by, sum_over):
    tree = {}
    group_by = [None] + group_by
    for item in data:
        d = tree
        for g in group_by:
            if g:
                d = d.setdefault(g, {}).setdefault(item[g], {})
            for so in sum_over:
                d[so] = d.get(so, 0.0) + item[so]
    return tree


Processing "None" in the last round of the loop was throwing me off.
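
To see that the reordered loop also covers the deeper nesting asked about in the original question, here is a small sketch with a third grouping key. The 'g3' column and its values are made up for illustration and are not part of the original data. The only change to the function is `if g is not None` in place of `if g`, on the assumption that a falsy key name (such as an empty string) should not be skipped.

```python
def calc(data, group_by, sum_over):
    tree = {}
    for item in data:
        d = tree
        # The leading None stands for the root level: totals are added there
        # before descending into any group.
        for g in [None] + group_by:
            if g is not None:
                # Step down into the sub-dict for this key and this item's value.
                d = d.setdefault(g, {}).setdefault(item[g], {})
            for so in sum_over:
                d[so] = d.get(so, 0.0) + item[so]
    return tree


# Hypothetical rows: 'g3' does not appear in the original example.
data = [
    {'g1': 1, 'g2': 8, 'g3': 'a', 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 8, 'g3': 'b', 's_v1': 1.0, 's_v2': 0.5},
    {'g1': 2, 'g2': 8, 'g3': 'a', 's_v1': 6.0, 's_v2': 8.0},
]
tree = calc(data, ['g1', 'g2', 'g3'], ['s_v1', 's_v2'])

print(tree['s_v1'])                       # total over all rows
print(tree['g1'][1]['g2'][8]['s_v1'])     # total for g1 == 1, g2 == 8
print(tree['g1'][1]['g2'][8]['g3']['b'])  # leaf for g3 == 'b'
```

Every level, including the root, carries its own totals, so `tree['g1']` gives the nested dictionary the question asked for and `tree` itself also holds the grand totals.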

Discussion Overview
group: python-list
categories: python
posted: Feb 4, '11 at 11:14p
active: Feb 7, '11 at 4:52p
posts: 6
users: 5
website: python.org
