This is an opinion article on improving the readability of conditional statements through the use of meaningful variable names.
Self-documenting code
To discuss what I mean by "self-documenting conditional statements", we first need to know what self-documenting code means. Self-documenting code is simply a formal name for the practice of giving your variables, functions, and classes meaningful names. Advocates of self-documenting code generally take this one step further and assert that code comments should be kept to an absolute minimum. Comments should only be used where the intent of a block of code can't be conveyed through identifier names alone, and they should describe that intent (the "why?") instead of summarizing "what" the code is doing.
The primary argument for self-documenting code is the generalization that comments are likely to become out of sync with the code they are commenting on. In other words, as code changes, someone will eventually forget to update the comment to reflect the changes in the code.
Example
Here's a simple example to demonstrate the principle of self-documenting code.
Non self-documenting:
    import json
    import requests

    def current_city():
        '''
        Returns the current city of the machine running the script as a string.
        The city is determined from the IP address.
        '''
        # Use the freegeoip geolocation API to get current information based on
        # the IP address of the machine running the program
        r = requests.get("https://freegeoip.net/json/")
        # Convert the geolocation json response data
        return json.loads(r.content)['city']
Self-documenting:
    import json
    import requests

    def get_city_by_current_ip():
        geolocation_api_url = "https://freegeoip.net/json/"
        r = requests.get(geolocation_api_url)
        geo_data = json.loads(r.content)
        return geo_data['city']
Comments that summarized what the code did were removed. New identifiers were introduced as needed in order to convey the information the comments once presented. Naturally, self-documenting code has the tendency to introduce more identifiers, along with longer identifier names.
Self-documenting conditional statements
Now that we know what self-documenting code is all about, let's talk about conditional statements.
A conditional statement is a Boolean expression used within a language construct to make a decision. For example,

    if x < 5:
        # do something
        pass
    else:
        # don't do anything
        pass
The same applies to while loops:

    while response != 'q':
        # do something
        pass
Expressions such as response != 'q' and x < 5 are known as relational expressions. Relational expressions evaluate to either Boolean True or Boolean False. When you introduce logical AND or logical OR into the expression, the relational expression becomes a compound relational expression. For example,

    if x > 5 and x % 2 == 0:
        # do something
        pass
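As a quick illustration, a relational expression is an ordinary value of type bool, so it can be printed or assigned like any other value. The value of x below is arbitrary and chosen only for demonstration.

```python
x = 8

# A relational expression evaluates to a plain bool.
greater_than_five = x > 5
print(type(greater_than_five))

# Joining two relational expressions with `and` gives a compound
# relational expression, which here is also a bool.
compound = x > 5 and x % 2 == 0
print(compound)
```

With x set to 8, both comparisons hold, so the compound expression is True.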
It's fairly common for compound relational expressions to be used as conditions. However, over the years I've observed a tendency for these relational expressions within conditions to not convey the intent of the expression very well.
A simple example
Consider a common scenario in my life where I decide if I want to go outside. If it's cold outside and there's a high chance of rain, you'll have to drag me, kicking and screaming, to get me out of the house. I'm fine with just a high chance of rain, or I'm fine with it just being cold, but I can't handle them both! Here's what that might look like expressed as an if statement:

    if temp < 75 and precip_chance > .5:
        # I'm staying inside
        pass
    else:
        # I'll go out!
        pass
While this example is fairly easy to follow, I believe there is an unnecessary mixing of responsibilities occurring within the conditional statement. The relational expression temp < 75 represents the idea of "cold", and precip_chance > .5 represents the idea of "high chance of rain". When software developers familiarize themselves with new source code, they mentally emulate how the machine will journey through the program under different conditions. It is much easier to consider these conditions when developers can focus on the ideas represented by the expressions rather than the logic of the expressions. In other words, it is much easier to reason through a program thinking "If it's cold but not likely to rain, then..." as opposed to "if temp < 75 but precip_chance <= .5, then..."
Consequently, I propose writing conditional statements like this one in the following form:

    is_cold = temp < 75
    likely_to_rain = precip_chance > .5
    if is_cold and likely_to_rain:
        # I'm staying inside
        pass
    else:
        # I'll go out!
        pass
Now the intended path of the if statement is extremely obvious. The logic for determining is_cold or likely_to_rain is now separate from the organization of the if statement, allowing us to focus on the core ideas as opposed to the raw numbers.
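For anyone who wants to run the idea, here is a minimal sketch that wraps the same named conditions in a function. The function name and the two constants are hypothetical, introduced only for this sketch; the thresholds are the ones used in the example above.

```python
# Thresholds from the example above. The constant and function names
# are hypothetical, chosen only for this sketch.
COLD_BELOW_TEMP = 75
RAIN_LIKELY_ABOVE = 0.5


def should_stay_inside(temp, precip_chance):
    """Return True only when it is both cold and likely to rain."""
    is_cold = temp < COLD_BELOW_TEMP
    likely_to_rain = precip_chance > RAIN_LIKELY_ABOVE
    return is_cold and likely_to_rain


print(should_stay_inside(60, 0.8))  # True: cold and likely to rain
print(should_stay_inside(60, 0.2))  # False: cold, but rain is unlikely
print(should_stay_inside(80, 0.8))  # False: rain is likely, but it is warm
```

Naming the thresholds as well is optional; the point is only that the condition in the return statement reads as the idea it represents.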
A real world example
Django, the most widely used Python web development framework, has a template tag called include which lets you include one template in another. It can be called with a string literal

    {% include "foo/bar.html" %}

or with a string variable

    {% include template_name_variable %}

Within Django's source for the include tag, it has to determine whether the argument it received was a string literal or a variable. Here's how it accomplishes this:

    if path[0] in ('"', "'") and path[-1] == path[0]:
        # do stuff knowing the argument is a string literal
        pass
    else:
        # do stuff knowing the argument is a variable
        pass
To summarize, if the first character of the argument passed to include is equal to a single quote or a double quote, and if the first and the last character of the argument match (to ensure there isn't a single quote/double quote mismatch), assume the argument provided is a string literal.
In my opinion, the logic for how we determine if the argument is a string literal should not be within the condition itself. The developers reading the source should easily be able to tell the path the program will take if the argument is a string literal. How that is determined is irrelevant when we simply intend to trace the path of the program.
One solution might be to leave a comment:

    # The argument is a string literal if the first character is a single/double
    # quote, and the first and last characters of the argument match
    if path[0] in ('"', "'") and path[-1] == path[0]:
        # do stuff knowing the argument is a string literal
        pass
    else:
        # do stuff knowing the argument is a variable
        pass
However, to avoid the issue of excess comments simply explaining "what" the code does while still conveying the intent of the condition, we can easily adjust the code to be self-documenting:
    arg_is_string_literal = path[0] in ('"', "'") and path[-1] == path[0]
    if arg_is_string_literal:
        # do stuff knowing the argument is a string literal
        pass
    else:
        # do stuff knowing the argument is a variable
        pass
The basic idea of "If this, do this" is much easier to follow when excess logic is extracted from the conditional statement.
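The same extraction can be pushed one step further by moving the check into a small function and naming each half of the condition. This is only a sketch: is_string_literal is a hypothetical helper, not Django's actual code, and the length guard is my own addition. It means that an empty argument or a lone quote character is treated as not being a literal, which differs from the original one-line condition.

```python
def is_string_literal(arg):
    """Return True if arg looks like a quoted string literal.

    Hypothetical helper for illustration; not part of Django.
    """
    # Assumption: anything shorter than two characters cannot hold both
    # an opening and a closing quote, so it is not treated as a literal.
    if len(arg) < 2:
        return False
    starts_with_quote = arg[0] in ('"', "'")
    quotes_match = arg[-1] == arg[0]
    return starts_with_quote and quotes_match


print(is_string_literal('"foo/bar.html"'))          # True
print(is_string_literal('template_name_variable'))  # False
print(is_string_literal('"foo/bar.html\''))         # False: mismatched quotes
```

Whether a function is worth it over a single well-named variable depends on how often the check is reused; for a one-off condition, the variable alone is usually enough.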
Don't get crazy!
Like any programming strategy or pattern, you can take this idea too far! I'm not advocating that all conditional statements should be converted into variable expressions. Conditions involving multiple ranges of numbers, for example, are easy enough to follow. I'm simply advocating that when it's possible to easily extract logic from a conditional statement, you can often improve the readability and organization of your code by storing relational expressions in well-named variables.
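As a sketch of a case where I would leave the conditions alone, consider a chain of simple range checks. The function name, the labels, and the 32-degree cutoff are hypothetical and chosen only for illustration; the 75-degree cutoff is borrowed from the earlier weather example.

```python
def describe_temperature(temp):
    # Each branch is a plain range check that already reads clearly,
    # so pulling it out into a named variable would add little.
    if temp < 32:
        return "freezing"
    elif 32 <= temp < 75:
        return "cold"
    else:
        return "comfortable"


print(describe_temperature(20))  # freezing
print(describe_temperature(60))  # cold
print(describe_temperature(80))  # comfortable
```

Here the numbers themselves are the idea being expressed, so extra identifiers would mostly add noise.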