Kenneth Jenkins

Advent of Code 2020: Day 5 reflection

Python’s str.translate() and str.maketrans()

The day 5 puzzle from last year’s Advent of Code talks about airplane seat numbering using “binary space partitioning”.

From the puzzle description:

Instead of zones or groups, this airline uses binary space partitioning to seat people. A seat might be specified like FBFBBFFRLR, where F means “front”, B means “back”, L means “left”, and R means “right”.

The first 7 characters will either be F or B; these specify exactly one of the 128 rows on the plane (numbered 0 through 127). Each letter tells you which half of a region the given seat is in. Start with the whole list of rows; the first letter indicates whether the seat is in the front (0 through 63) or the back (64 through 127). The next letter indicates which half of that region the seat is in, and so on until you’re left with exactly one row.

[…]

The last three characters will be either L or R; these specify exactly one of the 8 columns of seats on the plane (numbered 0 through 7). The same process as above proceeds again, this time with only three steps. L means to keep the lower half, while R means to keep the upper half.

[…]

Every seat also has a unique seat ID: multiply the row by 8, then add the column.

(from https://adventofcode.com/2020/day/5)

After absorbing all of that explanation, it turns out that parsing a “seat specification” is way less complicated than it might appear. Converting to row and column is just like parsing a binary number: the first character is just like the most significant bit of a 7-bit binary number (the 64s place, if you will). As the rows are numbered with 0 in the front and 127 in the back, an F is like a 0 bit and a B like a 1.

Similarly, the last three characters are the binary representation of the column number, with L like a 0 and R like a 1.

Let’s take the example FBFBBFFRLR. The row and column break down like so:

seat description example

And then the final seat ID is 44 × 8 + 5 = 357.

But multiplying by 8 is the same as shifting the binary digits left by three places. (The 1s place turns into the 8s place, the 2s place into the 16s place, and so on.)

So there’s no need to distinguish between row and column at all! We can just replace all the Fs and Ls with 0s, and all the Bs and Rs with 1s, and then parse the whole thing as an 11 bit binary number.

(And in Python, parsing a string s representation of a binary number is as simple as int(s, 2).)

So my quick Python code to parse a list of seat descriptions looked something like this:

def parse(lines):
    return [int(l.replace('F', '0').replace('B', '1')
                 .replace('L', '0').replace('R', '1'), 2) for l in lines]

This worked just fine, but all those replace() calls feel a little silly. I thought that there must be a less-verbose way to do this.

And sure enough, there is!

translate() and maketrans()

The Python str.translate() method does just this:

str.translate(table)

Return a copy of the string in which each character has been mapped through the given translation table.

[…]

You can use str.maketrans() to create a translation map from character-to-character mappings in different formats.

(from https://docs.python.org/3/library/stdtypes.html#str.translate)

And from the description of str.maketrans():

static str.maketrans(x[, y[, z]])

This static method returns a translation table usable for str.translate().

[…]

If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y.

(from https://docs.python.org/3/library/stdtypes.html#str.maketrans)

We can use these methods to simplify our parse() method like so:

t = str.maketrans('FBLR', '0101')

def parse(lines):
    return [int(l.translate(t), 2) for l in lines]

Then part 1 of the puzzle (finding the highest seat ID in the input file) is trivial:

def part1(ids):
    return max(ids)

But the puzzle isn’t really the point of this post.

It’s more that I hadn’t ever used Python’s translate() method before, so I wanted to share in case it’s new to others as well.

Addendum

Fun tidbit: the Python translate() method is basically what the Unix tr command does. We can solve part 1 of the puzzle (find the highest seat ID out of all the input seat descriptions) in a one-liner like this:

$ <input05.txt tr FBLR 0101 | sort | tail -n1 | echo "ibase=2; $(cat)" | bc

Because all the seat descriptions are the same length, we can sort them as text strings and we’ll get the same order as if we sorted as numbers.1 Then we can use tail -n1 to take just the last line (the maximum value), and run it through bc (the basic calculator) to convert from binary to decimal. (We just need to prepend an ibase=2 to get bc to parse its input as binary.)


  1. If we tried to use plain sort after converting from binary to decimal, we could run into problems like 99 sorting after 100. But, we can use the -n option to sort the input numerically, like so:

    $ <input05.txt tr FBLR 0101 | echo "ibase=2; $(cat)" | bc | sort -n | tail -n1
    
     ↩︎