
On Fri, 04 Feb 2011 15:14:24 -0800, Slafs wrote:

> Hi there!
>
> I'm having trouble to wrap my brain around this kind of problem:

Perhaps you should consider backing up and starting from somewhere else
with different input data, or changing the requirements. Just a thought.

> What I have:
> 1) list of dicts
> 2) list of keys that I would like to be my grouping arguments of
> elements from 1)
> 3) list of keys that I would like to do "aggregation" on the elements
> of 1) with some function e.g. sum

You start with data:

dicts = [ {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
          {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
          {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0} ]


It sometimes helps me to think about data structures by drawing them out.
In this case, you have what is effectively a two-dimensional table:

g1   g2   s_v1   s_v2
===  ===  =====  =====
1    8    5.0    3.5
1    9    2.0    3.0
2    8    6.0    8.0


Nice and simple. But the result you want is a bit more complex -- it's a
dict of dicts of dicts:

{1: {'s_v1': 7.0, 's_v2': 6.5,
     'g2': {8: {'s_v1': 5.0, 's_v2': 3.5},
            9: {'s_v1': 2.0, 's_v2': 3.0}
           }},
 2: {'s_v1': 6.0, 's_v2': 8.0,
     'g2': {8: {'s_v1': 6.0, 's_v2': 8.0}
           }}}


(I quote from the Zen of Python: "Flat is better than nested." Hmmm.)

which is equivalent to a *four* dimensional table, which is a bit hard to
write out :)

Here's a two-dimensional projection of a single slice with key = 1:

s_v1   s_v2   g2
=====  =====  ===================
7.0    6.5       |  s_v1   s_v2
                 ----------------
               8 |  5.0    3.5
               9 |  2.0    3.0


Does this help you to either (1) redesign your data structures, or (2)
work out how to go from there?
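For option (1), here is one possible flatter layout, purely as a sketch and not something from the original question: keep a single dict keyed by tuples of grouping values, one entry per prefix of the grouping keys. The function name flat_totals and its parameters are made up for illustration.

```python
from collections import defaultdict

dicts = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
]

def flat_totals(rows, group_keys, sum_keys):
    """Hypothetical helper: sum sum_keys for every prefix of group_keys.

    The result is one flat dict keyed by tuples such as (1,) or (1, 8).
    """
    # Each new tuple key starts with a fresh dict of zeroed totals.
    totals = defaultdict(lambda: dict.fromkeys(sum_keys, 0.0))
    for row in rows:
        for depth in range(1, len(group_keys) + 1):
            prefix = tuple(row[k] for k in group_keys[:depth])
            for k in sum_keys:
                totals[prefix][k] += row[k]
    return dict(totals)

flat = flat_totals(dicts, ['g1', 'g2'], ['s_v1', 's_v2'])
print(flat[(1,)])    # totals for g1 == 1
print(flat[(1, 8)])  # totals for g1 == 1, g2 == 8
```

Adding 'g3' to the grouping list would only make the tuples longer; the nesting depth stays at one.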

[...]
> I was looking for a solution that would let me do that kind of grouping
> with variable lists of 2) and 3) i.e. having also 'g3' as grouping
> element so the 'g2' dicts could also have their own "subgroup" and be
> even more nested then. I was trying something with itertools.groupby and
> updating nested dicts, but as i was writing the code it started to feel
> too verbose to me :/

I don't think groupby is the tool you want. It groups *consecutive* items
in sequences:

>>> from itertools import groupby
>>> for key, it in groupby([1,1,1,2,3,4,3,3,3,5,1]):
...     print(key, list(it))
...
1 [1, 1, 1]
2 [2]
3 [3]
4 [4]
3 [3, 3, 3]
5 [5]
1 [1]


Except for the name, I don't see any connection between this and what you
want to do.
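The same behaviour shows up on the sample data. As a small illustration (the choice of 'g2' as the key is only for the demo), grouping the three dicts by 'g2' in their given order yields three runs rather than two groups, because the two rows with g2 == 8 are not adjacent:

```python
from itertools import groupby
from operator import itemgetter

dicts = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
]

# Record each run as (key, number of rows in the run).
runs = [(key, len(list(group)))
        for key, group in groupby(dicts, key=itemgetter('g2'))]
print(runs)  # [(8, 1), (9, 1), (8, 1)]
```

Sorting the input by the same key first would join the runs, but that is an extra pass over the data for every grouping level.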

The approach I would take is a top-down approach:

dicts = [ ... ]  # list of dicts, as above.
result = {}
for d in dicts:
    # process each dict in isolation
    temp = process(d)
    merge(result, temp)


merge() should hopefully be straightforward, and process() only needs to
look at one dict at a time.
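To make that concrete, here is one way process() and merge() could be filled in. Treat it as a sketch under assumptions, not the definitive version: the extra group_keys and sum_keys parameters are my addition (the outline above passes only the dict), and the aggregation is hard-wired to addition.

```python
dicts = [
    {'g1': 1, 'g2': 8, 's_v1': 5.0, 's_v2': 3.5},
    {'g1': 1, 'g2': 9, 's_v1': 2.0, 's_v2': 3.0},
    {'g1': 2, 'g2': 8, 's_v1': 6.0, 's_v2': 8.0},
]

def process(d, group_keys, sum_keys):
    """Turn one flat dict into a nested dict with a single branch."""
    first, rest = group_keys[0], group_keys[1:]
    node = {k: d[k] for k in sum_keys}
    if rest:
        # The next grouping level hangs off the node under its key name.
        node[rest[0]] = process(d, rest, sum_keys)
    return {d[first]: node}

def merge(result, temp, sum_keys):
    """Fold a single-branch nested dict into the running result, in place."""
    for group_value, node in temp.items():
        if group_value not in result:
            # process() builds fresh dicts, so adopting the node is safe.
            result[group_value] = node
            continue
        target = result[group_value]
        for k, v in node.items():
            if k in sum_keys:
                target[k] += v  # assumption: aggregation is a plain sum
            else:
                # k names the next grouping level; v is its sub-dict.
                merge(target.setdefault(k, {}), v, sum_keys)

group_keys = ['g1', 'g2']
sum_keys = ['s_v1', 's_v2']

result = {}
for d in dicts:
    temp = process(d, group_keys, sum_keys)
    merge(result, temp, sum_keys)

print(result)
```

Because process() recurses over the grouping list, adding 'g3' to group_keys should produce one more level of nesting without other changes, provided every dict has that key.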



--
Steven

Discussion Overview
group: python-list
categories: python
posted: Feb 4, '11 at 11:14p
active: Feb 7, '11 at 4:52p
posts: 6
users: 5
website: python.org
