Signs of Triviality

Opinions, mostly my own, on the importance of being and other things.
[homepage] [index] [jschauma@netmeister.org] [@jschauma] [RSS]

Recommendations To Write (Slightly More) Readable And (Thus) Robust Code

December 21st, 2020

Python Logo + text 'Readability Counts' Code is communication. You are using one language to tell a computer what to do, translating your thoughts, perhaps even after translating from your non-English primary language into the heavily English influenced syntax of a programming language.

You are also communicating with your colleagues, current and future, who need to be able to make sense of your code. You are communicating with your own future self who will inevitably look at this mess and think "Just what did this idiot think when they wrote this?".

You can make your and their life easier by writing code that is easy to read. Over the years, I've read my share of "other people's code" -- my own, weeks, months, or years after I first wrote it; my students' homework assignments; my colleagues' code; code on GitHub and StackOverflow, the code of various Open Source packages that I had to debug, patch, merge, or integrate into another code base, or the proprietary code from third party providers.

All of that code -- regardless of the language it was written in -- could have benefited from an application of the Zen of Python, and so here are a few recommendations based on what I observed to be the most common mistakes:


  1. Don't write the damn code in the first place.
  2. Use descriptive variable and function names. You are not charged per character, and vowels don't cost extra.
  3. Don't squish your code as tight as possible. Use spaces to increase readability, newlines to group small logical blocks.
  4. Don't mix spaces and tabs for indentation. Pick either one* and then be consistent in their use.
    *Just kidding. Use tabs.
  5. Exit early to avoid waterfall code. Especially for longer code blocks, this minimizes cognitive overhead.
    Don't: Do:
    if (condition) {
    	if condition2() {
    		if (condition3 || condition4) {
    			doStuff();
    		} else {
    			error("oh no!");
    			return;
    		}
    	} else {
    		warn("beware!");
    		return;
    	}
    } else {
    	printf("Some information here.\n");
    	return;
    }
    
    if (!condition1) {
    	printf("Some information here.\n");
    	return;
    }
    
    if (!condition2) {
    	warn("beware!");
    	return;
    }
    
    if (condition3 || condition4) {
    	error("oh no!");
    	return;
    }
    
  6. Use defaults to avoid unnecessary else-clauses.
    Don't: Do:
    if (condition) {
    	var = value1;
    } else {
    	var = value2;
    }
    
    var = value2;
    if (condition) {
    	var = value1;
    }
    
  7. Use a sensible terminal window size (< 100x50 max) as a guide to refactor code:
    • if your code doesn't fit into < pages, you likely can't keep it in your head -- refactor
    • if your code is indented such that it wraps, you're in too deep -- refactor
  8. Using < 4 space indentation is cheating and makes your code too dense to read easily.
    (Again: use tabs. And yes, 8 space tabstops. :-)
  9. Declare all variables at function start; that way, you can use introduction of a new variable as a hint to create a subroutine. Similarly, declare variables only in the scope they are needed.
  10. Allocate and release resources within the same indentation level/scope to avoid e.g., fd- or memory leaks, use-after-free, etc.
  11. Don't repeat the same or almost the same code; abstract the difference and create a function.
  12. Comments should explain why you're doing something, not what you're doing -- your code should be readable enough all by itself. See #1 above. (previously)
  13. Be able to explain your code.
    There's a lot of code snippets on StackOverflow that are incomplete, make assumptions, or are simply incorrect -- never copy anything without understanding exactly what it does.
  14. Write boring code.
    print foo.join(map(lambda x: str(f(x)), g(y)))
    looks leet, but requires a lot of cognitive overhead compared to a simple 'for' loop.
    See also: Murder your darlings.
  15. Don't use arbitrary limits and avoid magic numbers. Define meaningful constants with commentary explaining the chosen value.
  16. Check the return value of every function that may fail. (Yes, that includes malloc(3).)
  17. Use braces even if optional on single statements. (Remember goto fail?)
    Don't: Do:
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;
    
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
        goto fail;
    }
        goto fail;  /* beeeep, this is now a lot more obvious */
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) {
        goto fail;
    }
  18. Write your functions / modules such that there are no side effects. Have each do only one specific thing. That makes them independently testable and forces you to clarify your data models.
  19. Handle I/O in chunks: don't try to read all data into a single buffer. Somebody* will give you input larger than you can handle.
    *Ok, I.
  20. Assume hostile input and usage. Always validate all input from outside your program. (Goes hand in hand with #15.)
    'perl -T' is a great mechanism that I miss in many other languages.
  21. When dealing with external input, an allowlist beats a denylist any day. You'll never think of all the weird things your users* will throw at you.
    *Ok, that's me again.
  22. Don't shell out; use exec(3) or similar. Things like popen(3) or system(3) can easily lead to arbitrary code execution via shell command injection. Parametrized execution of commands avoids that.
    Still combine with #19 for good measure.
  23. Don't assume you can write to the current working directory. Don't assume your program is run from any specific directory.
  24. Avoid interactive usage; prefer command-line flags, environment variables, or config files instead. (Although avoid the complexity of parsing a config file if you can.)
  25. Avoid temporary files wherever possible. If you must use a temporary file, set a umask, use mktemp(3), and unlink via an exit handler.
    See this blog post for a longer discussion.
  26. Separate stdout and stderr. Don't print any output that's not required or requested. Favor syslog(3) over handling your own logging I/O logic.
  27. Ensure your error messages are meaningful and include a useful diagnostic message, preferably via strerror(errno), perror(3), err(3), ...
  28. Enable warnings in your compiler/interpreter, turn warnings into errors. Go does this by default, for C use "-Wall -Werror -Wextra" at a minimum), perl 'use warnings; use strict;', shell 'set -eu' etc.
  29. Avoid XXX and TODO -- you'll never go back and fix these.
  30. Be willing to rip out and throw away your code if you find an existing library or a better way. (See also: 14 above, re murdering your darlings, and, of course, #0.)
    "One of my most productive days was throwing away 1000 lines of code."
    -- Ken Thompson

Code is communication. Readability counts. Perhaps more than most other properties of your code.

December 21st, 2020


See also:


[CPU Pinning and CPU Sets] [Index]