In a world that runs on code, writing high-quality code keeps getting more important. Simply put, clean code is essential for building software that is reliable, efficient, and easy to work with.
In this blog, we will delve into some principles and practices of clean code and how you can apply them in your own projects. Whether you are a developer working hard to make a tool production-ready or a data scientist working on a prototype (which should eventually end up in production!), understanding and practicing clean code is crucial for building software that stands the test of time.
If you want to improve your coding skills and make your software better, this post is for you!
You’re not done when it works, you are done when it is right
On the road to writing better software, a sensible starting point is studying and applying the fundamentals found in Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin. This handbook contains a set of principles and rules based on years of experience in writing software.
In the book, ‘Uncle Bob’ explains why writing clean code is important. He emphasizes that code is read more often than it is written: "You may think that your job is to get code to work. That is not true; it’s only half of your job. The other half of your job is to write code that other people can maintain, can use, and can make work."
Uncle Bob advocates code that looks like it was written by someone who cares about everyone else who is going to read the code. We should stop writing code that grows more tangled and corrupted with each passing day. And we should stop creating a bug list that is a thousand pages long.
Software craftsmanship is about being a real professional. Musicians study and practice the seemingly trivial details of their disciplines, and programmers should do the same.
By ensuring your code stays clean, you will not only spare the reader of your code (and yourself) unnecessary frustration, but you will also contribute to the main benefit of writing clean code: a reduction of technical debt.
Technical debt, or tech debt for short, is the implied cost of additional rework caused by choosing a short-term solution (a quick fix) when writing software. Even though it can only be coarsely quantified, tech debt is estimated to be immense.
Based on research done by Stripe, tech debt leads to an estimated global loss of roughly $85 billion per year. In day-to-day terms, Stripe estimates that developers spend on average 23 to 33% of their time dealing with technical debt. An example of code that creates tech debt, which Uncle Bob calls ‘rude’, is copying and pasting the same piece of code multiple times instead of wrapping it in a function.
The problem is that no solution is as permanent as a temporary solution; neglected or even completely skipped tasks will not be revised once code is in production. Even though it may seem impossible to ensure high code quality in the face of tight deadlines, it is still the responsibility of every developer to prioritize quality. Also, Gartner found that by actively managing and reducing technical debt, it is possible to achieve at least a 50% faster service delivery time to the business.
In a nutshell, tidying up code after making it work, adding tests to prove that not a single part is failing, and reviewing the code before merging are the most important solutions when it comes to reducing and combatting technical debt.
We selected some principles and rules recommended by The Clean Code Handbook that will make your life, and that of the people who read your code, easier. Thijs, a Data Scientist at Xomnia, and a strong believer in writing readable code, has studied The Clean Code Handbook in great detail.
After completing a code review on a simple script that was created in a Jupyter notebook, he answered the most important questions on how to start writing clean code.
Don’t worry about determining which parts of your codebase need improvement. Thijs advocates the “clean as you code” method: Whenever new code is written or existing code is touched up, make sure it meets the new standards. This prevents you from investing time in irrelevant parts of your code and will decrease your technical debt over time.
Thijs reviewed the following script, which uses a dataset containing the top 100 songs on Spotify to generate a few plots. Thijs explains how to improve the script to increase its readability and efficiency:
Comments are a tool that’s often misused to cover up bad coding practices. Comments in your code can be a great tool to communicate to the people who are going to read your code. However, they come with some pitfalls.
Avoid the following types of comments:
Obsolete comments: Even though modern IDEs make it easy to adjust code, comments are not updated along with those edits. The developer has to adjust them separately, which is easy to forget and will therefore be forgotten at times. Keeping comments that contain information that is no longer relevant only leads to confusion, so it’s better to simply remove them.
Code review: These comments were used to structure the program before it was written out. They’re no longer relevant, so it is better to remove them.
Redundant comments: Another common pitfall when writing comments is repeating what is already stated in the code. When the code has been changed and the comments haven't, this can become quite confusing. Therefore, a clean code rule of thumb is: What can be stated in names of functions or variables should not be stated in comments.
Code review: The comments say exactly what the code does. Wrapping the code in functions or variables with descriptive names should prevent the need for comments like this.
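The rule of thumb above can be sketched as follows. This is a minimal illustration, not the reviewed script: the song records, the `duration_seconds` field, and the 240-second threshold are all assumptions made up for the example.

```python
# Hypothetical cut-off, chosen only for illustration.
LONG_SONG_THRESHOLD_SECONDS = 240


def f(d):
    # loop over the songs and keep the ones longer than 240
    r = []
    for s in d:
        if s["duration_seconds"] > 240:
            r.append(s)
    return r


def select_long_songs(songs):
    # Same behaviour as f, but the names carry what the comment used to say.
    return [
        song for song in songs
        if song["duration_seconds"] > LONG_SONG_THRESHOLD_SECONDS
    ]


# Hypothetical sample records.
songs = [
    {"title": "Song A", "duration_seconds": 200},
    {"title": "Song B", "duration_seconds": 260},
]
long_songs = select_long_songs(songs)
```

In the first version the comment has to be kept in sync with the code by hand. In the second version there is nothing left to keep in sync.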
Commented-out code: Even though we understand you don’t want to remove your ‘old’ code immediately because you might need it in the future, we strongly recommend against commenting out code in order to preserve it. Nowadays, tools like git keep a very descriptive history of the code. It is better to keep the current version clean and revert to an earlier commit when necessary than to keep code around in comments, where it will eventually rot. For example, when you rename a variable and forget to apply that change to the code you commented out, that piece of code becomes obsolete.
Code review: Remove the code that you are not actually using.
Another issue is picking the right name for objects in your code. Names are the main way to communicate functionality to the person who comes after you. This makes it very important to put time into precisely naming your variables and functions.
Many small, aptly named functions reduce the need for comments, because a precise and concise name already describes what is going on in a piece of code. An added advantage is that extracting such functions reduces the amount of duplication and repetition in code. But again, names come with pitfalls.
Avoid the following types of code naming conventions:
Code review: So, in our code, we can change the variable “fig”, which serves a very specific purpose, as we see below:
… to a more informative name that better reflects what the code is there for:
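The same kind of rename applies to any variable, not only to “fig”. A minimal sketch, assuming hypothetical song records with a `streams` field; the names and the constant are invented for illustration and are not taken from the reviewed script:

```python
# Hypothetical sample records.
songs = [
    {"title": "Song A", "streams": 900},
    {"title": "Song B", "streams": 1500},
    {"title": "Song C", "streams": 1200},
]

# Before: a single-letter name and a bare number the reader has to decode.
x = sorted(songs, key=lambda s: s["streams"], reverse=True)[:2]

# After: the constant names the number, the variables name the results.
TOP_SONG_COUNT = 2
songs_by_streams_descending = sorted(
    songs, key=lambda song: song["streams"], reverse=True
)
most_streamed_songs = songs_by_streams_descending[:TOP_SONG_COUNT]
```

Both versions compute the same list. Only the second one tells the reader what the list contains and why it has two entries.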
Even though writing a method is preferred over copying and pasting code over and over again, a few rules should be considered when writing these methods. Because they also come with pitfalls, avoid the following types of methods:
Code review: Instead of copying and pasting code every time you want to create and save a plot …
… use a function with an informative name and call it when you want to apply it:
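A minimal sketch of that refactoring, under stated assumptions: to stay free of third-party dependencies it renders a plain-text bar chart, whereas in the notebook the function body would hold the plotting calls instead. The function names, the counts, and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def render_bar_chart(title, value_by_label):
    """Return a plain-text bar chart; expects whole-number values, touches no files."""
    lines = [title]
    for label, value in value_by_label.items():
        lines.append(f"{label:<10} {'#' * value} {value}")
    return "\n".join(lines) + "\n"


def save_chart(chart_text, output_path):
    """Write an already rendered chart to disk; this is the only side effect."""
    output_path.write_text(chart_text, encoding="utf-8")


# Hypothetical counts; a real script would derive them from the dataset.
songs_per_genre = {"pop": 5, "rock": 3}
songs_per_decade = {"2000s": 2, "2010s": 6}

output_directory = Path(tempfile.mkdtemp())
genre_chart = render_bar_chart("Songs per genre", songs_per_genre)
decade_chart = render_bar_chart("Songs per decade", songs_per_decade)
save_chart(genre_chart, output_directory / "songs_per_genre.txt")
save_chart(decade_chart, output_directory / "songs_per_decade.txt")
```

Rendering and saving are kept in separate functions so that each has a single responsibility, and the file write is the only side effect in the sketch.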
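Even though our code review covers a Jupyter notebook that is not a part of a bigger software system, it is best practice to write tests for every piece of code that you create. An interesting way of developing code is by creating the necessary tests first and then filling in the code. This is called test-driven development (TDD).
According to the three laws of test-driven development, you are not allowed to: write any production code until you have written a failing unit test; write more of a unit test than is sufficient to fail (and not compiling counts as failing); or write more production code than is sufficient to pass the currently failing test.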
Go back and forth between your production code and your unit tests. If you work this way, you will spend far less time debugging. Debugging is not a desirable skill; the time is better spent learning a new language. Also, if you want to know how a system works, read the tests. They give you all the detailed information you need. Although this development method may be extreme, it is recommended to write a lot of tests. This considerably shortens the feedback loop on the quality of your work, and thus increases your efficiency and productivity.
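The test-first rhythm might look like the sketch below. The function `average_duration_seconds` and the `duration_seconds` field are hypothetical; the point is only the order of work, with the tests written before the code that satisfies them.

```python
# Step 1: write the tests first. At this point they fail, because
# average_duration_seconds does not exist yet.
def test_average_duration_of_two_songs():
    songs = [{"duration_seconds": 200}, {"duration_seconds": 300}]
    assert average_duration_seconds(songs) == 250


def test_average_duration_of_no_songs_is_zero():
    assert average_duration_seconds([]) == 0


# Step 2: write just enough production code to make the tests pass.
def average_duration_seconds(songs):
    if not songs:
        return 0
    return sum(song["duration_seconds"] for song in songs) / len(songs)


# Step 3: run the tests again; both should now pass silently.
test_average_duration_of_two_songs()
test_average_duration_of_no_songs_is_zero()
```

In a real project a test runner would discover and run the test functions. Calling them directly keeps the sketch self-contained.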
To put this all together, technical debt is an expensive problem that occurs when choosing a quick and limited solution over a better approach that takes longer. Clean code is a set of rules and principles you can follow to reduce tech debt, and we discussed a few of those principles.
Fixing tech debt you find while working on a codebase, a principle known as “clean as you code”, can improve not only the quality of your own work, but also your productivity. “Clean as you code” rules can be applied to both the code you create and the code you touch during normal work, in order to improve the existing codebase slowly but steadily.
This blog also went over rules about commenting: Remove comments that are obsolete or redundant. We also went over rules for naming: Use short, descriptive names and define “magic numbers” in constants. Last but not least, we went over rules about how to use functions: Use the single responsibility principle, avoid side effects, and use many small descriptive functions.
Now, all you have to do is get started with writing clean code!