
On Fri, 04 Feb 2011 15:14:24 -0800, Slafs wrote:

> Hi there!
>
> I'm having trouble wrapping my brain around this kind of problem:

Perhaps you should consider backing up and starting from somewhere else
with different input data, or changing the requirements. Just a thought.

> What I have:
> 1) a list of dicts
> 2) a list of keys that I would like to use as grouping arguments for
>    the elements of 1)
> 3) a list of keys that I would like to "aggregate" over the elements
>    of 1) with some function, e.g. sum
>
> dicts = [ {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
>           {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
>           {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0} ]

It sometimes helps me to think about data structures by drawing them out.
In this case, you have what is effectively a two-dimensional table:

g1   g2   s_v1   s_v2
===  ===  =====  =====
1    8    5.0    3.5
1    9    2.0    3.0
2    8    6.0    8.0

Nice and simple. But the result you want is a bit more complex -- it's a
dict of dicts of dicts:

{1: {'s_v1': 7.0, 's_v2': 6.5,
     'g2': {8: {'s_v1': 5.0, 's_v2': 3.5},
            9: {'s_v1': 2.0, 's_v2': 3.0}
           }},
 2: {'s_v1': 6.0, 's_v2': 8.0,
     'g2': {8: {'s_v1': 6.0, 's_v2': 8.0}
           }}}

(I quote from the Zen of Python: "Flat is better than nested." Hmmm.)

which is equivalent to a *four* dimensional table, which is a bit hard to
write out :)
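
If changing the requirements is an option, one flatter layout is a single dict keyed by tuples of grouping values. The following is only a sketch of that idea, not something from the original request: the function name flat_sums and the tuple-prefix keys are assumptions made for illustration.

```python
from collections import defaultdict

def flat_sums(dicts, group_keys, sum_keys):
    """Sum sum_keys for every prefix of group_keys (hypothetical helper)."""
    totals = defaultdict(lambda: dict.fromkeys(sum_keys, 0.0))
    for d in dicts:
        # (g1,) holds the totals for g1; (g1, g2) holds the subgroup totals.
        for depth in range(1, len(group_keys) + 1):
            prefix = tuple(d[k] for k in group_keys[:depth])
            for s in sum_keys:
                totals[prefix][s] += d[s]
    return dict(totals)

dicts = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
]
flat = flat_sums(dicts, ['g1', 'g2'], ['s_v1', 's_v2'])
print(flat[(1,)])    # totals for g1 == 1
print(flat[(1, 9)])  # totals for g1 == 1, g2 == 9
```

Every group, at any depth, is then one lookup away, at the cost of no longer mirroring the nesting in the data structure itself.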

Here's a two-dimensional projection of a single slice with key = 1:

s_v1 s_v2 g2
===== ===== =====
7.0 6.5 | s_v1 s_v2
---------------
8 | 5.0 3.5
9 | 2.0 3.0

Does that help you work out how to go from there?

[...]
> I was looking for a solution that would let me do that kind of grouping
> with variable lists of 2) and 3), i.e. also having 'g3' as a grouping
> element, so the 'g2' dicts could have their own "subgroup" and be
> nested even more deeply. I was trying something with itertools.groupby
> and updating nested dicts, but as I was writing the code it started to
> feel too verbose to me :/

I don't think groupby is the tool you want. It groups *consecutive* items
in sequences:

>>> from itertools import groupby
>>> for key, it in groupby([1,1,1,2,3,4,3,3,3,5,1]):
...     print(key, list(it))
...
1 [1, 1, 1]
2 [2]
3 [3]
4 [4]
3 [3, 3, 3]
5 [5]
1 [1]

Except for the name, I don't see any connection between this and what you
want to do.
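
To make the "consecutive" point concrete for dicts like the ones above, here is a small sketch. The rows are deliberately reordered so that the two g1 == 1 dicts are not adjacent, and run_keys is a hypothetical helper, not part of itertools.

```python
from itertools import groupby
from operator import itemgetter

# Same rows as above, reordered so equal g1 values are not adjacent.
dicts = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
]

def run_keys(rows, key):
    """Return the key of each consecutive run that groupby() reports."""
    return [k for k, _ in groupby(rows, key=itemgetter(key))]

unsorted_runs = run_keys(dicts, 'g1')
sorted_runs = run_keys(sorted(dicts, key=itemgetter('g1')), 'g1')
print(unsorted_runs)  # g1 == 1 shows up as two separate runs
print(sorted_runs)    # after sorting, one run per distinct g1
```

So groupby only gives one group per key if the input is first sorted on that key, and nested grouping would need that at every level.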

The approach I would take is a top-down approach:

dicts = [ ... ]  # list of dicts, as above.
result = {}
for d in dicts:
    # process each dict in isolation
    temp = process(d)
    merge(result, temp)

merge() should hopefully be straightforward, and process() only needs to
look at one dict at a time.
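
One possible way to fill in process() and merge() is sketched below. The bodies, the extra group_keys and sum_keys parameters, and the wrapper group_and_sum are assumptions made for illustration; the outline above only fixes the overall loop. The sketch also assumes at least one grouping key and that no grouping value collides with a key name.

```python
def process(d, group_keys, sum_keys):
    """Turn one flat dict into a nested dict containing a single path."""
    sums = {k: d[k] for k in sum_keys}
    node = None
    child_name = None
    # Build from the innermost grouping key outwards.
    for key in reversed(group_keys):
        level = dict(sums)
        if node is not None:
            # Hang the deeper level under the name of its grouping key.
            level[child_name] = node
        node = {d[key]: level}
        child_name = key
    return node

def merge(result, temp):
    """Fold temp into result in place: add numbers, recurse into dicts."""
    for key, value in temp.items():
        if isinstance(value, dict):
            merge(result.setdefault(key, {}), value)
        else:
            result[key] = result.get(key, 0) + value

def group_and_sum(dicts, group_keys, sum_keys):
    """Hypothetical wrapper around the process-then-merge loop."""
    result = {}
    for d in dicts:
        merge(result, process(d, group_keys, sum_keys))
    return result

dicts = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
]
result = group_and_sum(dicts, ['g1', 'g2'], ['s_v1', 's_v2'])
print(result)
```

Because process() walks whatever list of grouping keys it is given, adding 'g3' should only mean passing a longer list. Aggregating with something other than sum would mean changing the addition in merge().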

--
Steven
