-
- Python Scripting for Maya Artists
- Writing Clean Code
Writing clean, readable code is something we should all strive for. If you’re
just learning to program, you may not be paying attention to how clean your
code is, but after a while when you start working with other people with a
shared code base you may begin to recognize the importance of writing clean
code. Code is read a lot more than it is written and sloppy code just
depresses and frustrates people. Being sloppy up front when you are under
pressure actually slows you down in the long term. You will end up creating
more bugs which leads to more maintenance.
In this section, I’ll give brief pointers on how to make your code cleaner and
easier to read. For more in-depth discussions on writing clean code, I
recommend the excellent book, Clean Code: A Handbook of Agile Software
Craftsmanship, by Robert C Martin, and the Pluralsight course,
Clean Code: Writing Code for
Humans by Cory
House.
The DRY Principle
DRY stands for Don’t Repeat Yourself. You should state a piece of logic once
and only once. If you find yourself copying and pasting chunks of code
multiple times, that should be a signal that you are repeating yourself.
Repeating code just leads to more code to maintain and debug. For example, the
following code has repeated code:
sphere = create_poly_sphere(name='left_eye')
assign_shader(sphere, 'blinn1')
parent_constrain(head, sphere)
sphere = create_poly_sphere(name='right_eye')
assign_shader(sphere, 'blinn2')
parent_constrain(head, sphere)
This code should be written like the following:
def create_eye(name, shader):
sphere = create_poly_sphere(name=name)
assign_shader(sphere, shader)
parent_constrain(head, sphere)
create_eye('left_eye', 'blinn1')
create_eye('right_eye', 'blinn2')
The second code example is easier to maintain. If an update needs to be
implemented, we only have to update code in one place rather than multiple
places.
Use Clean Names
The names of your classes, variables, and functions contribute greatly to how
readable and clean your code is. Take this code snippet from an actual VFX
studio pipeline tool:
curr= os.environ.get('CURRENT_CONTEXT')
if curr:
cl= curr.split('/')
self.__curr= [None] * 6
self.setType( cl[0] )
self.setSeq( cl[1] )
if len( cl ) > 3:
self.setSubseq( cl[2] )
self.setShot( '/'.join( cl[2:] ) )
else:
self.setShot( cl[-1] )
if wa: self.__wa= wa
else: self.__wa= os.environ.get('CURRENT_WORKAREA')
Can you tell what this code is doing? If you’re familiar with writing pipeline
environment tools, you might recognize what it is trying to do. What is
self.__curr? Why is it a list of 6 values? What do the elements of cl
repreprent? Why does the length of cl being greater than 3 differentiate one
block of the if statement from the other?
A cleaner implementation would look something like this:
context_type, sequence, subsequence, shot = self.get_current_context()
self.set_type(context_type)
self.set_sequence(sequence)
if subsequence:
self.set_subsequence(subsequence)
if shot:
self.set_shot(shot)
self.__work_area = work_area if work_area else self.get_current_work_area()
The above code cleans up all the list indices and string manipulations to make
the code easier to read and understand. The individual list elements have been
assigned meaningful names. Also the environment variable accesses have been
extracted away into new methods. This makes the code easier to maintain. What
happens if we want to change the name of the environment variables or maybe we
want to read the values from a configuration file on disk? Extracting those
values to functions would let us update the code in one place rather than
multiple direct accesses to the environment variable.
Naming Classes
Class names should be a noun because they represent objects. The name should
be as specific as possible. If the name cannot be specific, it may be a sign
that the class needs to be split into smaller classes. Classes should have a
single responsibility.
Bad class names include:
- ShapeIE
- Utility
- Common
- MyFunctions
- DansUtils
- ShapeClass
Good class names include:
- ShapeExporter
- RigPublisher
- Project
- User
Naming Methods
Method names should be verbs because they perform actions. There should be no
need to read the contents of a method if the name accurately describes what
the method does. If the function is doing one thing (as it should) it should
be easy to name. If not, split the code into smaller functions. Sometimes
explaining the code out loud and help you name the function. If you say “And”,
“If”, or “Or” it is a warning sign that the method is doing too much.
Bad method names include:
- proc_new
- pending
- process1
- process2
Good method names include:
- create_process
- is_pending
- send_notification
- import_mesh
- calculate_rivet_matrix
Methods should only perform the actions described by the name. Any other
actions are called side effects and they can confuse people using your code.
For example, a method called validate_form should not save the form. A method
called publish_model should not smooth the normals. A method called
prune_weights should not remove unused influences.
Avoid Abbreviations
Abbreviated text may be easier to type, but code is read more than it is
written. When people talk about code or read it silently, it is harder to say
the abbreviations. There are also no standards when referring to
abbreviations.
Bad names:
- sjData
- jid
- sjid
- nm
- sjState
Good names:
- subjob_data
- job_id
- subjob_id
- name
- subjob_state
Naming Booleans
Boolean values should be able to fit in an actual sentence of saying something
is True or False.
Bad boolean names:
- open
- status
- login
Good Boolean names:
- is_open
- logged_in
- is_valid
- enabled
- done
Symmetrical Names
When names have a corresponding opposite, be consistent and always use the
same opposite.
Bad naming
- on/disabled
- quick/slow
- lock/open
- low/max
Good naming
- on/off
- fast/slow
- lock/unlock
- min/max
Working with Booleans
When comparing Booleans, compare them implicitly:
# Don't do this
if (is_valid == True):
# do something
# Instead do this
if is_valid:
# do something
When assigning booleans, assign them implicitly:
# Don't do this
if len(items) == 0:
remove_entry = True
else:
remove_entry = False
# Instead do this
remove_entry = len(items) == 0
Avoid using booleans that represent negative values. This leads to double
negatives, which end up confusing people:
# Don't do this
if not not_valid:
pass
# Instead do this
if valid:
pass
Use Ternaries
Ternaries are ways of assigning a value to a variable depending on if some
condition is True or False. For example:
# Don't do this
if height > height_threshold:
category = 'giant'
else:
category = 'hobbit'
# Instead do this
category = 'giant' if height > height_threshold else 'hobbit'
Don’t Use String as Types
You may have encountered code similar to the following:
if component_type == 'arm':
# do something
elif component_type == 'leg':
# do something else
This is considered bad form for various reasons. If we end up wanting to
change the value of one of these types, we have to change it in all the places
that it is used. It can also lead to typos and inconsistencies. A better
approach would be:
class ComponentType(object):
arm = 'arm'
leg = 'leg'
if component_type == ComponentType.arm:
# do something
elif component_type == ComponentType.leg:
# do something else
The new code provides one place to change and update values (the DRY
principle). It also provides auto-completion support and is more searchable if
you are using an IDE like PyCharm or Eclipse.
Don’t Use Magic Numbers
Magic numbers are numeric values that seemed to have been pulled out of
nowhere. For example, the following code was pulled from an actual VFX
pipeline tool:
if run_mode < 3:
run_mode = 5
elif run_mode == 3:
run_mode = 4
What do these numbers mean? You would need to search all over code that could
span multiple files to figure out what these numbers represent. A better
approach would be:
class JobStatus(object):
waiting = 1
starting = 2
running = 3
aborting = 4
done = 5
def __init__(self, value=JobStatus.waiting):
self.status = value
def not_yet_running(self):
return self.status < JobStatus.running
def abort(self):
if self.not_yet_running():
self.status = JobStatus.done
elif self.status == JobStatus.running:
self.status = JobStatus.aborting
# job_status is the new run_mode
job_status.abort()
Encapsulate Complex Conditionals
Sometimes you may have conditionals with many comparisons chained together. At
some point, this is going to get hard to read. For example:
# Instead of this
if (obj.component.partial_path.startswith('model') and
namespace == ‘GEOM’ and
has_rigging_publish(obj.child) and
edits_path):
# Encapsulate the complex conditional in a function or variable
def is_model_only_publish(obj):
return (obj.component.partial_path.startswith('model') and
namespace == 'GEOM' and
has_rigging_publish(obj.child) and
edits_path)
if is_model_only_publish(obj):
Writing Clean Functions
Functions should be created in order to help convey intent. They should do one
thing and one thing only as this aids the reader, promotes reuse, eases
testing, and avoids side effects. Strive for a function to only have 0-3
parameters with a max of 7-9 parameters. Anything longer makes it harder for
readers to keep track of all the parameters while running through the code in
their head. Functions should be relatively short, maybe no more than 100 or so
lines. If a function is longer, it may be time to refactor (update) the code
into smaller functions.
Extracting a Method
If you find your code 3 or 4 indentation levels deep, it may be time to
extract some of that code into a separate function. For example:
# Instead of this
if something:
if something_else:
while some_condition:
# do something complicated
# Do this instead
if something and something_else:
do_complicated_things()
def do_complicated_things():
while some_condition:
# do something complicated
Return Early
People can usually only keep track of a handful of trains of thought at a
time. Therefore, we should try to organize our code in as many discrete
independent chunks as possible. For example:
# Instead of this
def validate_mesh(mesh):
result = False
if has_uniform_scale(mesh):
if has_soft_normal(mesh):
if name_is_alphanumeric(mesh):
result = name_is_unique(mesh)
return result
# Do this
def validate_mesh(mesh):
if not has_uniform_scale(mesh):
return False
if not has_normal(mesh):
return False
if not name_is_alphanumeric(mesh):
return False
return name_is_unique(mesh)
This is not a strict rule. Like everything listed so far, use it when it
enhances readability.
Signs Your Function is Too Long
Functions should hardly ever be over 100 lines. Longer functions are harder to
test, debug, and maintain since users need to keep track if updates at the
start of the function affect areas and the end of the function. Here are some
simple rules to determine if a function is too long:
- You separate sections of code in a function with whitespace and/or comments
- Scrolling is required to view all the code.
- The function is hard to name.
- There are conditionals several levels deep.
- There are more than 7 parameters or variables in scope at a time.
Writing Clean Classes
Classes are like headings in a book, there should be multiple layers of
abstraction going from high level ideas to more detailed lower level ideas:
Chapter
- Heading 1
- Paragraph 1
- Paragraph 2
- Heading 2
- Paragraph 1
- Heading 1
Module
- Class 1
- Method 1
- Method 2
- Class 2
- Method 1
- Class 1
High Cohesion
Cohesion is the fact of forming a united whole. When a class is said to have
high cohesion, all of its functionality is closely related. We should strive
to create classes with high cohesion. High cohesion not only enhances
readability; it also increases the likelihood of reusing the class. Signs that
a class does not have high cohesion are:
- The class has methods that don’t interact with the rest of the class.
- The class has fields only used by one method.
- The class changes often.
For example:
# Low cohesion class
class Vehicle(object):
def edit_options():
pass
def update_pricing():
pass
def schedule_maintenance():
pass
def send_maintenance_reminder():
pass
def select_financing():
pass
def calculate_monthly_payment():
pass
The Vehicle class contains many unrelated methods. This makes it harder to use
and maintain because parts of unrelated code are intertwined together. A
better approach would be to split this class up into smaller classes:
# High cohesion classes
class Vehicle(object):
def __init__(self)
def edit_options():
pass
def update_pricing():
pass
class VehicleMaintainer(object):
def schedule_maintenance():
pass
def send_maintenance_reminder():
pass
class VehicleFinancer(object)
def select_financing():
pass
def calculate_monthly_payment():
pass
Method Proximity
Code should read top to bottom and related methods should be kept together:
def add_take():
if not validate_take(): # First method referenced should be directly below
raise ValueError('Take is not valid')
save_take() # Second method referenced should be below first
def validate_take():
return take.endswith(‘.mov’)
def save_take():
# save in database
Collapsed code should read like an outline. Strive for multiple layers of
abstraction:
- Class
- Method 1
- Method 1a
- Method 1ai
- Method 1aii
- Method 1b
- Method 1c
- Method 1a
- Method 2
- Method 2a
- Method 1
Comments
Comments should only be used to explain ideas and assumptions not already
apparent by reading the code.
Redundant Comments
The comments in this code do not add anything the user could not have figured
out by reading the code.
# Clear the node combo box then add items
self.node_combobox.clear()
if nodes:
# Sort the nodes
nodes.sort()
# Check to see if there is a shape controller associated with the node
self.find_shape_controllers(nodes)
# Now add the list of nodes to the combo box
self.node_combobox.addItems(nodes)
# If a node is specified set the combo box
if node:
# Find the node's index
index = self.node_combobox.findText(
node,
QtCore.Qt.MatchExactly | QtCore.Qt.MatchCaseSensitive)
self.node_combobox.setCurrentIndex(index)
Divider Comments
If you see divider comments, it’s a sign you need to extract the code into its
own function:
# Now create the new group object and insert it into the table
# ----------------------------------------------------------------------------------
# Create the group object
group = slidergroup.SliderGroup(name)
self._slider_groups[name] = group
# Tell the group what its start row is
group.setRow(row)
# Apply color
if color:
group.setColor(color)
# Generate sliders from the attributes attached to the group
# ----------------------------------------------------------------------------------
row_index = row + 1
sliders_to_add = []
for attr in attributes:
if cmds.objExists(attr):
slider = self.add_slider(attr, rowIndex, group)
row_index += 1
Zombie Comments
Zombie comments are large sections of commented out code. People often do this
because they think they might need the code in the future. This is unnecessary
because version control systems like git, svn, and perforce perform this exact
functionality. People looking at code with large commented out portions will
be confused. Why is the code commented out? Is it important?
Clean Code Conclusion
In this section, I covered a quick overview on writing clean code. There are
many other great resources that go in a lot more detail of writing clean code
including explanations why it is considered clean code. As I mentioned at the
start of the section, for more in-depth discussions on writing clean code,
refer to Clean Code: A Handbook of Agile Software
Craftsmanship, by Robert C Martin, and the Pluralsight course,
Clean Code: Writing Code for
Humans by Cory
House.