Recently, I had a great interaction with one of my coworkers that I think is worth sharing, with the hope you may learn a bit about refactoring and python.
My colleague came to me to help him think through a problem that surfaced with a change to a project. The code in question sends a file to a remote storage service. It looked like this:
This method will read the file at file_path
and send it to the file service, where it will be named file_name
, and it would be placed in the folder remote_path
Looking at it now, it’s not the best signature for a function. It would be much better as (local_path, remote_path)
. It sure would be nice to fix that. But that’ll be a different post. .
The change that my colleague was working through is that before now, this method would expect remote_path
to be one of a few predefined folders in the remote service. Now, the requirements changed so that we might need to create a arbitrarily-named sub-folder of a known folder.
He came to me, explained this situation, and showed me code that changed the signature of the function to something like this:
His reasoning is because we would always know that the path in remote_path
would always exist, and that any subfolder may or may not need to be created. The problem he wanted my help thinking through was: how can we check to make sure that remote_path
/ sub_folder
exists without having to check in with the remote server every time we send a file, since this function could very likely be called to upload several files in short succession.
Ask for forgiveness
As we discussed our options, I realized that we were thinking according to a programming pattern that is called “Look before you leap” (LBYL).
This kind of flow would ask the server, “Does this folder exist?”, and wait for it’s response. If the response is, “No, it doesn’t,” it’ll tell the server, “Okay. Create this folder for me,” and wait for the server to respond. Then it would send the file.
Coding in this style tends to produce more if
statements, and is more common in other programming languages, like C.
It looks something like this:
This could be non-trivial overhead for something that has such a delay as communicating with a remote server.Especially this server. This has a funky inconsistent connection that is a pain to deal with at times. But I digress. It would be better to find a way that would optimize around interacting with the server once for each file.
It turns out that this is not only possible, but even preferred. Python’s documentation prefers the “Easier to ask for forgiveness than permission” (EAFP) style of programming.
In this style, you assume that things are in place to do what you need to do and catch the exception if things aren’t.
Coding in this style tends to produce more try
/ except
blocks, tends to run faster, and tends to not have as many problems with race conditions.
It looks something like this:
Coding in this style takes a little more mental effort at first, but it is worth practicing.
Changing signatures
As my colleague and I continued to discuss how this function could improve, we also realized that we didn’t need to add the sub_folder
argument to the function.
If you removed the mental requirement that remote_path
had to be one of a few specific folders, all that’s left are two path strings that have a direct relationship to each other. The sub_string
would always be a direct child under remote_path
, meaning that joining those two strings would always represent the end path, as well as the simplest way to represent that path.
On top of that, it enables the end user of the code to pass in the full remote path, instead of having to think about what part of the remote path is one of the pre-defined folders, and where and how to split the remote file path.
To let this function work with the EAFP style, we determined to try to write the file. If it failed, we would try to make the folder it is supposed to be in. If that failed, we would do the same for it’s parent, and it’s parent, and it’s parent, until either something works or the programs fails because we don’t have permission to write on this server, and that’s a perfectly reasonable error to let the user know.
It’s your turn
The shift from thinking “look before you leap” to “easier to ask forgiveness than permission” takes some effort. But I think you should give it a try!
The three pointers I have at this time are:
- Keep an eye out for errors that are thrown as you are running your code as you develop it. These may be great places to put in a
try
/except
block. - Keep the number of lines between the
try
andexcept
lines as small as possible. You want it to be as clear as possible to see what threw the error and why. - When you are adding an
if
statement, take a moment to consider if you could write the code in a way that will smoothly execute for the most common situation and throw a specific error if it need some adjusting.