Python!

[Image: screenshot of the code discussed in this post. The link to the full text is included below.]

Over the last couple days, I completed Google’s Python Class. I have tried so many learning methods, from books to MOOCs and everything in between, and this really hit a sweet spot for me. I did not watch the videos; I found the written info + exercises to be enough. The difficulty setting on this class was just right; it wasn’t starting from 0, as so many Python tutorials do (Which is fine! Those kinds of learning materials are so important!). Also, I found the exercises really engaging, particularly the more comprehensive ones towards the end.

And with those more advanced exercises in mind, I thought it might be interesting to take a look at my solution for the final exercise and compare it to their provided solution. Both solutions are at this gist, with mine on top and Google’s on the bottom, should you be interested in checking them out in full. Here are the program’s requirements.

The first difference that jumped out at me was a stylistic one: the solution code is so much more terse than mine. Their variable names are things like f and match, whereas I prefer to be a little more descriptive with my variables, opting for file and url_match for my corresponding variables. I’m also a fan of creating intermediate semantic variables rather than chaining methods together. The step where we had to create an index.html file provides a good study in contrast. First, Google’s code:

index = file(os.path.join(dest_dir, 'index.html'), 'w')

I feel this one line is doing a lot of work, possibly too much. The interpreter (Compiler? What’s the deal in Python?) can certainly handle it, but I find it difficult to parse as a human. Here’s my code:

index_path = os.path.join(dest_dir, 'index.html')
index_file = open(index_path, 'w')

I prefer this style because my general rule of thumb (stolen from Kyle Simpson) is to err on the side of optimizing for human readability. The variable names give me an idea of what is stored in them, whereas index doesn’t tell me much.
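One aside on both snippets: file() in Google's line is a Python 2 built-in that Python 3 removed, while open() works in both. As a minimal sketch (not something either solution does), the two-variable version can also be wrapped in a with block so the file gets closed automatically. The write_index helper, its parameters, and the HTML it writes are hypothetical, made up for illustration:

```python
import os


def write_index(dest_dir, image_names):
    """Write a bare-bones index.html into dest_dir and return its path."""
    index_path = os.path.join(dest_dir, 'index.html')
    # The with block closes index_file even if one of the writes raises.
    with open(index_path, 'w') as index_file:
        index_file.write('<html><body>\n')
        for name in image_names:
            index_file.write('<img src="%s">' % name)
        index_file.write('\n</body></html>\n')
    return index_path
```

The intermediate index_path variable earns its keep here too, since the function can hand it back to the caller.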

Interestingly, there’s another example where the script (see what I did there) is flipped. When sorting the image URLs, the solution file defines a function called url_sort_key(url) and then passes a reference to it as an argument to sorted():

def url_sort_key(url):
  """Used to order the urls in increasing order by 2nd word if present."""
  match = re.search(r'-(\w+)-(\w+)\.\w+', url)
  if match:
    return match.group(2)
  else:
    return url

# ...and later, inside another function:
return sorted(url_dict.keys(), key=url_sort_key)

As a JavaScript dev, I am used to seeing anonymous functions in these contexts, so I looked into how Python would achieve such a thing. Turns out, Python has something called lambdas that are basically the same thing, although I can’t find a way to give a lambda a name the way you can with an “anonymous” function expression in JavaScript. Here’s my analogous code:

sorted_jpg_files = sorted(jpg_files, key=lambda f: re.search(sort_pattern, f).group())

So, it’s interesting to me that the solution author opted for a more verbose code expression here, though I’m guessing they probably didn’t want to delve into lambdas in a fairly introductory course.
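For comparison, here is a small sketch of the two styles side by side. The file names and sort_pattern below are made up for illustration, and the fallback to the whole string when nothing matches mirrors what url_sort_key does. It also shows that a lambda can be assigned to a variable, but the function object itself still reports its name as <lambda>:

```python
import re

# Hypothetical pattern and file names, for illustration only.
sort_pattern = r'-(\w+)\.jpg$'
jpg_files = ['p-bbb-zzz.jpg', 'p-ccc-aaa.jpg', 'plain.jpg']


def sort_key(name):
    """Named key function: sort by the last word, or the whole name if no match."""
    match = re.search(sort_pattern, name)
    return match.group(1) if match else name


# The same logic squeezed into a lambda. A lambda is limited to a single
# expression, so the search runs twice to cover the no-match case.
lambda_key = lambda f: (re.search(sort_pattern, f).group(1)
                        if re.search(sort_pattern, f) else f)

by_def = sorted(jpg_files, key=sort_key)
by_lambda = sorted(jpg_files, key=lambda_key)

print(by_def == by_lambda)   # both key functions produce the same ordering
print(sort_key.__name__)     # sort_key
print(lambda_key.__name__)   # <lambda>
```

Once the key needs a None check, the named function starts to look like the more readable option.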

Let’s look at some other differences. Here’s a great example where my verbosity did not enhance the readability of the code. The goal was to extract a hostname from the filename of a provided log file, and the filenames all followed the same convention: someword_hostname. For the life of me, I could not remember the word “host,” which is how I ended up with the variable name base_url. I opted for a regular expression here, which works, but is ultimately too much of a muchness, I think:

url_match = re.search(r'\w+_(\S+)', filename)
base_url = ''
if url_match:
  base_url = url_match.group(1)

It’s clear that this code is doing too much, particularly when you compare it to the elegance of the provided solution:

underbar = filename.index('_')
host = filename[underbar + 1:]

Looking at my code and then looking at their code gives me the same feeling of tension and relief I experience when using my asthma inhaler. Like, oh, that’s what oxygen feels like.
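The two approaches are not quite interchangeable, which a quick sketch makes visible. The function names and sample filenames here are hypothetical; the third variant, str.partition, is an extra option that neither solution uses:

```python
import re


def host_with_regex(filename):
    """Regex approach from above; returns '' when nothing matches."""
    url_match = re.search(r'\w+_(\S+)', filename)
    return url_match.group(1) if url_match else ''


def host_with_index(filename):
    """Slicing approach; raises ValueError if there is no underscore."""
    underbar = filename.index('_')
    return filename[underbar + 1:]


def host_with_partition(filename):
    """Splits on the first '_' and returns '' if there is none."""
    return filename.partition('_')[2]


sample = 'animal_code.google.com'
print(host_with_regex(sample))      # code.google.com
print(host_with_index(sample))      # code.google.com
print(host_with_partition(sample))  # code.google.com
```

One subtle difference: \w also matches underscores, so on a name like a_b_c.com the greedy \w+ swallows the first underscore and the regex returns c.com, while the slicing version returns b_c.com.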

Another interesting difference is that the solution code iterates over the log file line by line and searches each line for the magic sauce, whereas I used findall() on the whole file string. I suppose I felt comfortable using findall() because I knew that the log files were fairly small. But in a real-world situation, where a log could be too big to read into memory comfortably, it would probably make more sense to go line by line.
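Here is a rough sketch of the two strategies. The pattern and function names are hypothetical stand-ins, not the ones from either solution:

```python
import re

# Hypothetical pattern: the path of any .jpg requested with GET.
JPG_PATTERN = re.compile(r'GET (\S+\.jpg)')


def paths_whole_file(log_path):
    """Read the entire log into memory and search it in one go."""
    with open(log_path) as log_file:
        return JPG_PATTERN.findall(log_file.read())


def paths_line_by_line(log_path):
    """Iterate over the file object so only one line is in memory at a time."""
    paths = []
    with open(log_path) as log_file:
        for line in log_file:
            match = JPG_PATTERN.search(line)
            if match:
                paths.append(match.group(1))
    return paths
```

For a log with one request per line, both return the same list. The line-by-line version as written keeps only the first match on each line, which is fine for that kind of log.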

One of the subgoals of this exercise was to remove any possible duplicate URLs from the final list. The provided solution used a dict, and I wish I had gone that route. Instead, I used this rather hacky strategy:

jpg_files = list(dict.fromkeys(jpg_files))

It works, but it’s doing a lot with one line, and it isn’t semantic. Thank u, next.
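To spell out what that one line is doing, here is a tiny sketch with a made-up list. For contrast it also shows set(), a different approach that is not what either solution uses:

```python
# Hypothetical list with one repeated entry.
jpg_files = ['/img/b.jpg', '/img/a.jpg', '/img/b.jpg']

# dict keys are unique and (since Python 3.7) keep insertion order,
# so this drops repeats while preserving first-seen order.
deduped = list(dict.fromkeys(jpg_files))
print(deduped)  # ['/img/b.jpg', '/img/a.jpg']

# A set also drops repeats but promises no particular order,
# so sort afterwards if the order matters.
deduped_sorted = sorted(set(jpg_files))
print(deduped_sorted)  # ['/img/a.jpg', '/img/b.jpg']
```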

Some more differences I learned from:

  • The Google code provided better feedback to the user.
  • They also changed the names of the image files locally, which made them more semantic to the user.

Getting to read/write/create files in this way was so fun, and it’s something I haven’t had an opportunity to do much of in my career up until now. I learned so much from this course, and I look forward to doing much more with Python.

Color Generator: Middleware

I had been struggling with the architecture of my project. I knew I wanted a separate web API that was secured with an API key. Where I was sort of lost was in figuring out where to go from there.

I knew that I didn’t want my Vue app to talk directly to my API because I didn’t want to expose my API key. So my next thought was to route my API requests through Express.

I set about creating two Express routes, one for each API endpoint. This was easy enough. Both functions look almost the same, so here is the one for /generate-color:

app.get("/generate-color", function(req, res, next) {
    var options = {
        host: API_HOST,
        path: "/api/v1/color"
    };

    var request = https.get(options, function(response) {
        var bodyChunks = [];
        response.on("data", function(chunk) {
            // A chunk may hold only part of the JSON body, so just collect it here.
            bodyChunks.push(chunk);
        }).on("end", function() {
            // All chunks have arrived: join them and parse the complete body.
            var body = Buffer.concat(bodyChunks);
            var parsed;
            try {
                parsed = JSON.parse(body);
            } catch (err) {
                return next(err);
            }
            res.status(200).send({
                status: "success",
                message: "here is your color!",
                color: parsed.color
            });
            console.log(parsed);
        }).on("error", function(err) {
            console.log(err);
        });
    });

    request.on("error", function(e) {
        // Hand the failure to Express so the client is not left waiting.
        console.log(e.message);
        next(e);
    });
});

The code is fairly self-contained, but there are a few things to note. First, API_HOST appears to come out of nowhere, but is actually stored in an environment variable. The dotenv package makes it nice and easy to use environment variables in a Node project; you define your variables in a .env file, and these two magical lines make them available in your app code:

const dotenv = require("dotenv");
dotenv.config();

Currently, my API doesn’t implement authentication, but if it did, I would also have an API_KEY variable in my .env file.

Second, if you’re not familiar with how requests and responses work in Node, this code may look a little odd. What is being pushed to the bodyChunks array? Why is there a bodyChunks array in the first place?

As you may know, Node.js is event-based. That means there is always an event loop running that checks which events have taken place, if any. In this example, we can see three events being used: data, end, and error. We’ll zoom in on the data event. When Node receives an HTTP response, the body arrives as a stream. A stream delivers small chunks of information at a time, and each time it does so, it emits a data event. In my code, we listen for that data event, and when it fires, we grab that chunk and push it onto our bodyChunks array. We call it that because the array comprises the chunks of our response body from the server.

Once our stream sends its last chunk of information, it emits the end event. Then, it’s time to put all of our chunks together into a Buffer and turn it into something we can use: an object! That’s what we send back in our response.

Next, we’ll talk about how I set up my Vue client.