Advent of Code 2020: Introduction and Day 4 reflection
This year’s Advent of Code is just around the corner! I had a lot of fun with these puzzles last year, so in honor of the occasion I thought I would share some of the things I learned over the course of the challenge.
For the unfamiliar, Advent of Code is an Advent calendar where a new programming puzzle unlocks each day of December (from the 1st through the 25th). Each puzzle has a part 1 and a part 2, where part 2 is usually a more complicated variation on the first part. Puzzles involve writing some code to compute an answer (usually based on some input file data).
I don’t intend to talk about every single puzzle, or even to give complete solutions to a few, but rather to share a few things I found interesting.
I used Python for all of last year’s puzzles. As the high-level structure of each puzzle was fairly similar (read input, compute something for part 1, compute something else for part 2), I got in the habit of starting each day from a skeleton file that looked something like this:
import sys
def parse(lines):
pass
def part1(p):
pass
def part2(p):
pass
def main():
parsed_input = parse(sys.stdin.readlines())
print(part1(parsed_input))
print(part2(parsed_input))
Usually, I would want to take the raw input and translate it into some other
format that would be more convenient, so this is what the parse()
method was
for: maybe I needed to convert from strings to numbers, or maybe build a map
of values indexed by (X, Y) coordinates.
When I got to Day 4, I stumbled a bit
with the parse()
method. Most of the input files so far involved a list of
items, with one item per line. This puzzle’s input, however, groups of
key:value
pairs that could span multiple lines, with a blank line in between
separate items.
This was the example input:
ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm
iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929
hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm
hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in
(These were supposed to represent passport data fields: ecl
was “eye color”,
pid
was “passport ID number”, and so on.)
To make the actual puzzle solving code easier, I wanted to represent each
passport as a dictionary, mapping keys (e.g. 'ecl'
, 'pid'
) to their values
('gry'
, '860033327'
).
My parse()
method started with a list of input lines, but I needed to make
sure to group these lines together up to the next blank line. I found myself
wanting to use something like Python’s string
split()
method.
This method takes a string and “splits” it at any whitespace, returning
a new list of strings.
For example:
>>> 'Well, hello there'.split()
['Well,', 'hello', 'there']
I wanted to take my list of input lines and “split” it at every blank line;
something like lines.split('\n')
. But lines
was a list, not a string, and
I didn’t see any method like this in the Python standard library.
After some Googling, I found this Stack Overflow
question
wondering the same thing: is there a split()
for lists?
One of the answers pointed out that we can use itertools.groupby()
for this,
something like this:
def listsplit(l, delim):
return [list(group) for is_delim, group in
itertools.groupby(l, key=lambda e: e == delim) if not is_delim]
It took me a minute to understand how this worked, so I thought I’d try to explain it here.
What does itertools.groupby()
do?
The Python documentation has this to say:
itertools.groupby(iterable, key=None)
Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is
None
, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.The operation of
groupby()
is similar to theuniq
filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.The returned group is itself an iterator that shares the underlying iterable with
groupby()
. Because the source is shared, when thegroupby()
object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.
I found this a little hard to follow, especially because for our use case we don’t need or want to sort the input.
If we ignore the key
parameter for now, the groupby()
method essentially
takes a sequence of elements and looks for consecutive elements that are the
same. It returns an iterator of these individual groups.
A diagram might help. Say we start with a list with some repeated elements. These could be strings, numbers, or what have you, so I’ll use labeled boxes to represent the elements:
Calling groupby()
with this list would group the elements like this:
(Here I’ve used angle brackets to represent an iterator that will return the enclosed items.)
Note that even though B appears twice, because they aren’t consecutive in the input, they are part of separate groups in the output.
However, this isn’t quite an accurate picture. The groupby()
method doesn’t
just return an iterator of each group of elements. It actually returns an
iterator of tuples, where each tuple contains the grouped element followed by
the group iterator itself. So I guess the output diagram should really look
something like this:
That’s a lot of brackets….
But let’s come back to that key
function. This function lets us group
elements in different ways. Let’s say we had a function that returned 1 for
a C element and 0 for any other element. If we called groupby()
with this
key function, it would apply this function to each element and then group
according to the results. We could represent that something like this:
Now the first B is grouped with all of the A’s, because the key function gave the same result for all of these elements. (And note that the first element of each tuple is now the 0 or 1 from the key function.)
Back to listsplit()
For the list-splitting case, the key insight (ha!) is that we can use a key
function that returns True
if an element matches our delimiter, and False
otherwise. This way groupby()
will return all consecutive non-delimiter
elements in one chunk.
Unfortunately, all of our delimiters get returned as an output group too. But, we can filter out these group by checking the key function output values.
Finally, we also need to capture all of the elements from the group iterators, which we can do by making a new list out of each iterator.
Putting it all together, this brings us to the listsplit()
method from before:
And then using this method, the passport parsing routine is a snap:
def parse(lines):
return [parse_passport(' '.join(p)) for p in listsplit(lines, '\n')]
def parse_passport(p):
return dict(kv.split(':') for kv in p.split())
Now we have a list of dictionaries that are easy to query for individual data fields.
Addendum
The real moral of this story might be that it’s important not to get too focused on one approach.
As I was writing up this reflection, I realized that I don’t really even need a
listsplit()
method at all.
If I hadn’t had readlines()
in my solution skeleton already, I might have seen
that we can go straight to the input chunks I wanted by reading the entire
file at once and using split('\n\n')
to divide the input at blank lines.
I could replace the whole parse()
method with something like this:
def parse(input):
return [dict(kv.split(':') for kv in p.split()) for p in input.split('\n\n')]