Group a list into chunks in Python

More than once I've had to take a list of items and group them. For instance, maybe I have a flat list of key-value pairs in alternation, like this:

[key1, value1, key2, value2, key3, value3]

It could have been that they were easier to enter that way (fewer parentheses or brackets), or maybe that's just how they come from an outside source.

Nevertheless, to actually use them, it would be better to have them in tuples:

[(key1, value1), (key2, value2), (key3, value3)]

That way, I could make them into a dictionary, urlencode them, etc.

I looked through Python's entire itertools module for a pre-existing solution, but didn't find one. itertools does have the reverse operation, chain, but not this one.

So, to save you the trouble, here's the code:

group_iter.py (license: public domain)
def group_iter(iterator, n=2, strict=False):
    """ Transforms a sequence of values into a sequence of n-tuples.
    e.g. [1, 2, 3, 4, ...] => [(1, 2), (3, 4), ...] (when n == 2)
    If strict, then it will raise ValueError if there is a group of fewer
    than n items at the end of the sequence. """
    accumulator = []
    for item in iterator:
        accumulator.append(item)
        if len(accumulator) == n: # tested as fast as separate counter
            yield tuple(accumulator)
            accumulator = [] # tested faster than accumulator[:] = []
            # and tested as fast as re-using one list object
    if strict and len(accumulator) != 0:
        raise ValueError("Leftover values")

Prev: YouTube video embedding with XSLT

Share this content

Comments

Avatar picture for Jonathan I was thinking of just using (the less generic solution) of:

zip(arr[::2], arr[1::2])

But alas, there is a solution in itertools:

The left-to-right evaluation order of the iterables is guaranteed. This makes possible an idiom for clustering a data series into n-length groups using izip(*[iter(s)]*n).

So group_iter could be simplified to:

zip(*[iter(iterator)]*n)
Avatar picture for Jason Stitt Jonathan: Two interesting solutions! I'm going to have to remember the second one, which I missed because it's stuck into a note under izip. I think it would be better off as a named function, though, as you have to stare at it for a minute if you're not primed by knowing what it's for.
Avatar picture for Mich Van Hellooo. Question..

I tried your code. but when i try to print it, the result is always <generator object at *blahblah*>.

Why is that so? How can I remedy that? Thank youuu

Post Comment

All comments are personally reviewed and must be:

  • On-topic
  • Courteous
  • Not self-linking or spam
(Optional. This is your one self-link.)