December 21st, 2020
Code is communication. You are using
one language to tell a computer what to do,
translating your thoughts, perhaps even after
translating from your non-English primary language
into the heavily English influenced syntax of a
programming language.
You are also communicating with your colleagues,
current and future, who need to be able to make sense
of your code. You are communicating with your own
future self who will inevitably look at this mess and
think "Just what did this idiot think when they wrote
this?".
You can make your and their life easier by writing
code that is easy to read. Over the years, I've read
my share of "other people's code" -- my own, weeks,
months, or years after I first wrote it; my
students' homework assignments; my colleagues'
code; code on GitHub and StackOverflow, the code of
various Open Source packages that I had to debug,
patch, merge, or integrate into another code base, or
the proprietary code from third party providers.
All of that code -- regardless of the language it
was written in -- could have benefited from an
application of the Zen
of Python, and so here are a few recommendations
based on what I observed to be the most common
mistakes:
- Don't write the damn code in the first
place.
- Use descriptive variable and function names. You
are not charged per character, and vowels
don't cost extra.
- Don't squish your code as tight as possible. Use
spaces to increase readability, newlines to group
small logical blocks.
- Don't mix spaces and tabs for indentation. Pick
either one* and then be consistent in their
use.
*Just kidding. Use tabs.
- Exit early to avoid waterfall code. Especially for
longer code blocks, this minimizes cognitive overhead.
Don't: |
Do: |
if (condition) {
if condition2() {
if (condition3 || condition4) {
doStuff();
} else {
error("oh no!");
return;
}
} else {
warn("beware!");
return;
}
} else {
printf("Some information here.\n");
return;
}
|
if (!condition1) {
printf("Some information here.\n");
return;
}
if (!condition2) {
warn("beware!");
return;
}
if (condition3 || condition4) {
error("oh no!");
return;
}
|
- Use defaults to avoid unnecessary else-clauses.
Don't: |
Do: |
if (condition) {
var = value1;
} else {
var = value2;
}
|
var = value2;
if (condition) {
var = value1;
}
|
- Use a sensible terminal window size (< 100x50 max) as a guide to refactor code:
- if your code doesn't fit into < pages, you
likely can't keep it in your head -- refactor
- if your code is indented such that it wraps, you're in too deep -- refactor
- Using < 4 space indentation is cheating and makes your code too dense to read easily.
(Again: use tabs. And yes, 8 space tabstops. :-)
- Declare all variables at function start; that way,
you can use introduction of a new variable as a hint
to create a subroutine. Similarly, declare variables
only in the scope they are needed.
- Allocate and release resources within the same
indentation level/scope to avoid e.g., fd- or memory
leaks, use-after-free, etc.
- Don't repeat the same or almost the same code;
abstract the difference and create a function.
- Comments should explain why you're doing something, not what you're doing -- your code should be readable enough all by itself. See #1 above. (previously)
- Be able to explain your code.
There's a lot of code snippets on StackOverflow that
are incomplete, make assumptions, or are simply
incorrect -- never copy anything without understanding
exactly what it does.
- Write boring code.
print foo.join(map(lambda x: str(f(x)), g(y)))
looks leet, but requires a lot of cognitive overhead
compared to a simple 'for' loop.
See also: Murder your darlings.
- Don't use arbitrary limits and avoid magic
numbers. Define meaningful constants with commentary
explaining the chosen value.
- Check the return value of every function that may
fail. (Yes, that includes malloc(3).)
- Use braces even if optional on single statements.
(Remember goto fail?)
Don't: |
Do: |
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
goto fail;
goto fail;
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
goto fail;
|
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
goto fail;
}
goto fail; /* beeeep, this is now a lot more obvious */
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) {
goto fail;
}
|
- Write your functions / modules such that there are
no side effects. Have each do only one specific thing.
That makes them independently testable and forces you
to clarify your data models.
- Handle I/O in chunks: don't try to read all
data into a single buffer. Somebody* will give you
input larger than you can handle.
*Ok, I.
- Assume hostile input and usage. Always validate
all input from outside your program. (Goes hand in
hand with #15.)
'perl -T' is a great mechanism that I miss in many
other languages.
- When dealing with external input, an allowlist beats a denylist any day. You'll never think of all the weird things your users* will throw at you.
*Ok, that's me again.
- Don't shell out; use exec(3) or similar.
Things like popen(3) or system(3)
can easily lead to arbitrary code execution via shell
command injection. Parametrized execution of commands
avoids that.
Still combine with #19 for good measure.
- Don't assume you can write to the current working
directory. Don't assume your program is run from
any specific directory.
- Avoid interactive usage; prefer command-line
flags, environment variables, or config files instead.
(Although avoid the complexity of parsing a config
file if you can.)
- Avoid temporary files wherever possible. If you
must use a temporary file, set a umask, use mktemp(3),
and unlink via an exit handler.
See this blog post for a
longer discussion.
- Separate stdout and stderr. Don't print any output
that's not required or requested. Favor
syslog(3) over handling your own logging I/O
logic.
- Ensure your error messages are meaningful and
include a useful diagnostic message, preferably via
strerror(errno), perror(3),
err(3), ...
- Enable warnings in your compiler/interpreter, turn
warnings into errors. Go does this by default, for C
use "-Wall -Werror -Wextra" at a minimum),
perl 'use warnings; use strict;', shell
'set -eu' etc.
- Avoid XXX and TODO -- you'll
never go back and fix these.
- Be willing to rip out and throw away your code if
you find an existing library or a better way. (See
also: 14 above, re murdering your darlings, and, of
course, #0.)
"One of my most productive days was throwing away 1000
lines of code." -- Ken Thompson
Code is communication. Readability counts. Perhaps more than most other
properties of your code.
December 21st, 2020
See also:
|