Signs of Triviality

Opinions, mostly my own, on the importance of being and other things.
[homepage] [index] [jschauma@netmeister.org] [@jschauma] [RSS]

less bug, more /dev/null

May 26, 2013

Cicada As a System Administrator (with various job titles tip-toeing around this baseline) I've been using Unix for quite some time now, and have encountered a number of wonderful bugs. There's this sudo(8) bug I wrote about a while ago, but one of my favorites remains the following:

At one place of employment we had a tool that would add users to a given host. It was called adduser.pl (which was itself entertainingly misleading, see below), and it accepted as input a list of usernames as well as a list of hostnames. It would then gather the login information for all the users, ssh(1) to each host and update /etc/passwd using the host's native tools (such as useradd(8), etc.). Many a system administrator has written a similar tool, and for the most part, this program worked reasonably well. (For the purpose of this discussion, let us not debate how user management could be handled better.)

But all software fails eventually, and for one reason or another I had to debug the tool. Groveling through a few hundred lines of perl that had clearly been written by different people in different stages of their career, I came across a code block that, more or less, read something like this:


# Some hosts have broken /dev/null permissions.
# While we're here, let's fix that.
chmod(0666, "/dev/null");

This was several years before the accepted standard reaction to a situation like this had become "wat.", so I went "buh?". It wasn't quite so surprising that this tool that was written to add users did a few other things entirely unrelated to this, but how does a host get into a state with /dev/null not being readable or writable by everybody? And in order for this fix to have ended up in this tool, wouldn't this error condition have to have been encountered quite often?

Fast typing. Asking around, nobody really know how this happened, only that it was not entirely uncommon. Speculation amongst the sysadmins was that less clueful users encountered a world-writable file and said to themselves "Oh noes! A security hazard! I fix!" before pounding on the keyboard and by coincidence yielding the key sequence sudo chmod go-rw /dev/null.

If /dev/null is not readable, interesting things happen. And sure enough, every now and then a ticket would be filed with the system administrators' team that showed that certain programs had all sort of weird errors (the more helpful one being, ironically, adduser.pl: Can't open /dev/null: Permission denied"). But somehow the sudo(8) logs never showed just who would be so silly to change the permissions on this device.

One day, however, one of us was able to witness a host breaking in this fashion. That is, one second the permissions were fine, and the next they were broken. But the only activity by a user with sufficient privileges to make such a change was a sudo(8) invocation of less(1). Hmmm....

Boar Now less(1) makes use of a history file (LESSHISTFILE, defaulting to ~/.lesshst), to remember search commands between invocations, much like your shell may use a history file (such as ~/.bash_history). less(1) is aware of the fact that the history file might contain sensitive information, and it actively changes the permissions on that file to only allow read/write access to the owner:


$ ls -l .lesshst
-rw------- 1 jschauma users  45 May 25 00:06 .lesshst
$ chmod a+r .lesshst
-rw-r--r-- 1 jschauma users  45 May 25 00:06 .lesshst
$ less .bashrc
[...]
$ ls -l .lesshst
-rw------- 1 jschauma users  45 May 25 00:06 .lesshst
$ 

But for the root user, we explicitly symlinked all common such history files to /dev/null to avoid accidental leaking of secrets to disk. That is, the relevant files looked something like this:


$ ls -lU .bash_history .lesshst /dev/null
lrwxrwxrwx 1 root  root     9 Mar 25 10:02 .bash_history -> /dev/null
lrwxrwxrwx 1 root  root     9 Mar 25 23:49 .lesshst -> /dev/null
crw-rw-rw- 1 root  root  1, 3 Mar 26 06:35 /dev/null
$ 

I'm sure you can see where this is going... That's right: if less(1) was invoked by the super user (for example via sudo(8)), it would check the permissions on the file /root/.lesshst, follow the redirection to /dev/null find that they're not 0600 and call chmod(2), yielding


$ ls -lU .lesshst /dev/null
lrwxrwxrwx 1 root  root     9 Mar 25 23:49 .lesshst -> /dev/null
crw------- 1 root  root  1, 3 Mar 26 06:35 /dev/null
$ 

Digging around, we find that the change log for less includes the following entry and relevant diff between version 424 and 429:

* Don't change permissions on history file if it is not a regular file.

--- less-424/cmdbuf.c	2008-06-09 12:45:30.000000000 -0400
+++ less-429/cmdbuf.c	2009-03-30 15:45:51.000000000 -0400
@@ -1469,8 +1469,19 @@
 	if (f == NULL)
 		return;
 #if HAVE_FCHMOD
+{
 	/* Make history file readable only by owner. */
+	int do_chmod = 1;
+#if HAVE_STAT
+	struct stat statbuf;
+	int r = fstat(fileno(f), &statbuf);
+	if (r <  0 || !S_ISREG(statbuf.st_mode))
+		/* Don't chmod if not a regular file. */
+		do_chmod = 0;
+#endif
+	if (do_chmod)
 	fchmod(fileno(f), 0600);
+}
 #endif
 
 	fprintf(f, "%s\n", HISTFILE_FIRST_LINE);

Mystery solved! Despite any Sysadmin's inclination to blame clueless users, nobody did in fact go out and run sudo chmod a-rw /dev/null or anything like that.

I like this bug because it shows how something as simple as changing a file to a symbolic link can cause quite interesting and unpredictable side effects. Tracking down the actual cause of the problem isn't easy by any means, and did involve quite a good amount of luck in this case, as well as the willingness to dig deep into code that appears to be unrelated to the issue, while at the same time being aware of the many different factors that come together to trigger the bug in such a way as to cause this particular failure scenario.


Implied Facepalm

Here's another interesting thing about the adduser.pl script:

$ file `which adduser.pl`
/usr/local/bin/adduser.pl: POSIX shell script text executable
$ 

Feel free to go 'wat.' (I went 'buh?'). The code changed regularly, and in order for users to get the most recent bugs (fixed), it needed to be checked out of CVS. Users tended to forget this, and so a wrapper was added that automatically updated the script and some of its data files. (Self-updating tools are another scary anti-pattern, as far as I'm concerned, but that kind of thing happens to system tools that are maintained by different people through out the years.)

Because the tool was widely used, it was deemed undesirable to retrain all users to now type "adduser.sh" instead of "adduser.pl", and so the shell script wrapper was called "adduser.pl" for "backwards compatibility". I now use this example frequently to advocate naming your tools without any particular ending suggesting what language a tool might be implemented in.


May 26, 2013


Updated July 26, 2013 based on feedback by @markkawika, who originally added the chmod calls and who helped clarify the history of the shell wrapper.


Related:


[Syncing the NIST National Vulnerability Database to Sqlite3] [Index] [Syncing NIST's National Vulnerability Database with Jira]