Moving right on to Day 3! At first glance, I think this one is going to be fun. We’re dealing with collections of simple vectors identified by a direction and a magnitude. For example “R10” specifies 10 units to the right. A collection of these vectors will specify the path that a wire takes. The problem involves two wires that start at the same location and our job is to determine at which points the wires collide and which of those collisions is closest to the start location or “central port”. The instructions specify that this should be measured using Manhattan distance.
The wires twist and turn, but the two wires occasionally cross paths. To fix the circuit, you need to find the intersection point closest to the central port. Because the wires are on a grid, use the Manhattan distance for this measurement.
Our input file has two lines, one for each wire, containing comma-separated values:
We’ll use a few familiar functions to parse this input into lists again:
This will read each line of the file into a list of the form
("R1005,U563,...", "L1003,D878,..."). From there our
mapcar lambda will split the comma-separated values transforming that into nested lists with individual values as strings
(("R1005", "U563", ...), ("L1003", "D878", ...)).
This is a good first step, but next we’ll want to do something with those vector strings in order to make them a little easier to work with. Before we handle the full lists, let’s write a helper that can convert a single value:
Since strings are sequences, we can use
subseq to perform a substring operation and set
direction to the letter in the vector (R, L, U, or D) and
magnitude to the number of units. We then generate a list containing a symbol identifying direction and the magnitude as an actual integer. This will turn something like
With our helper written it is very easy to convert an entire path into this format using
Things get a little more complicated from here. We want to be able to collect the coordinates a wire visits as we apply each vector so let’s see if we can write a function to do that for a single vector from a specific point. Essentially we want to go from something like
(0 0) (right 5) to
((1 0) (2 0) (3 0) (4 0) (5 0)). We should be able to implement this as a recursive function that walks each point.
We’ll define some simple functions for walking one space in each direction and then a higher-level function that accepts those walk functions and applies them a specific number of times:
The single-step walk functions simply return a new coordinate while incrementing or decrementing one of the values. The generic
walk function accepts one of these functions as a parameter and applies it to get the next coordinate in the path. From there, it conses that coordinate with a recursive call to
walk in order to build up the path.
Now we can define a function that takes a starting coordinate and a vector and returns all the coordinates that were visited:
Next we can write another function that consumes this one in order to collect all of the coordinates in the path:
Then I could use
intersection to find the points that exist in both. However, once I got to this point I realized two separate issues.
First, the above function is pretty slow. For the first path in the input file, which ends up visiting ~150k coordinates, it takes about six seconds to execute. With the
intersection on top of the two
get-coordinates-visited calls it took almost five minutes to complete.
Second, after waiting for it to complete I was surprised to see only a single intersection,
(0 0). Huh. It turns out that
intersection function does not work on nested lists by default. I’m guessing it uses some sort of object equality which causes
(0 0) to still match because it is seeded as the initial value for the
It appears we can address the equality check pretty easily by setting the
:test keyword on the
intersection call. Here is what the function looks like when that is all put together:
Running this (and waiting 5 minutes) does get us the correct answer. Most of the time is spent finding the intersection since the collections of points are so large. Also, the
get-coordinates-visited function relies on
last which has O(n) time complexity. This gets called after every single walk so it really adds up.
One option we have for speeding this up would be to use hash tables instead of lists. We would process the first path and build up a hash table of all of its coordinates, and then instead of doing a set intersection we would get the second paths coordinates and do lookups in the table, which would be O(1) on average. I’m not entirely sure what is going on under the hood during the
intersection call but based on my understanding of how lists are constructed, I’m guessing it is O(n) at best.
As far as eliminating the
last call, my best bet would be to pass the hash table along to walk so it no longer needs to return the walked coordinates (it can just add them to the table immediately) and can instead return the last coordinate walked, eliminating the need to use
last to find it.
Anyways, I’m not going to pursue those improvements right now. I’ll wait until I see the next exercise to decide whether it is worth the re-write.
My full solution can be found on Github.