
sudo: unable to execute <command>: success

You've probably facepalmed upon encountering this error yourself:

$ sudo rm *
sudo: unable to execute /bin/rm: Success
$ echo $?
127
$ 

A clearly rather odd definition of "Success". Searching around your local intertubes, you have probably eventually arrived at the conclusion that the problem is due to there being a large number of files to be removed, meaning the shell will glob * into more than ARG_MAX arguments before invoking rm(1), causing that to fail. So far, so good. But why doesn't sudo(8) say so instead of lying and calling it a "Success"?
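
(Aside: if you want to trigger that failure mode in isolation, here is a little stand-alone C sketch -- purely illustrative, not taken from sudo(8) or the shell -- that builds an argv far larger than ARG_MAX and hands it to execve(2); the call should come back with E2BIG:)

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Illustrative only: build an argv whose combined size far exceeds
 * ARG_MAX and watch execve(2) fail with E2BIG. */
int
main(void) {
    long argmax = sysconf(_SC_ARG_MAX);
    size_t nargs = (size_t)(10 * argmax) / 64;  /* ~10 x ARG_MAX worth of args */
    char **args = calloc(nargs + 2, sizeof(char *));
    char *big = malloc(64);
    char *const envp[] = { NULL };
    size_t i;

    memset(big, 'x', 63);
    big[63] = '\0';

    args[0] = "ls";
    for (i = 1; i <= nargs; i++)
        args[i] = big;
    args[nargs + 1] = NULL;

    execve("/bin/ls", args, envp);
    /* execve(2) only returns if it failed. */
    fprintf(stderr, "execve: %s\n", strerror(errno));
    return 127;
}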

Even more confusing, suppose you then try to verify your theory by using a less destructive (and hence repeatable) command and compare the invocation with and without sudo(8):

$ mkdir /tmp/d
$ cd /tmp/d
$ for i in $(seq 32768); do touch $i; done
$ ls *
1      1268   15360  18041  20722  23403  26085  28767  31447  4637  7318
10     12680  15361  18042  20723  23404  26086  28768  31448  4638  7319
100    12681  15362  18043  20724  23405  26087  28769  31449  4639  732
1000   12682  15363  18044  20725  23406  26088  2877   3145   464   7320
[...]
12677  15358  18039  2072   23400  26082  28764  31444  4634   7315  9998
12678  15359  1804   20720  23401  26083  28765  31445  4635   7316  9999
12679  1536   18040  20721  23402  26084  28766  31446  4636   7317
$ echo $?
0
$ sudo ls *
Password:
sudo: unable to execute /bin/ls: Success
$ echo $?
127
$ 

(Being a smart little unix cookie, you are of course aware that 'ls *' is technically a useless use of ls(1); running just ls would have had the same effect, but you wanted to explicitly test the limits of ARG_MAX, so you needed the shell to expand the * and feed the args to ls(1). Likewise, you are of course aware that 'echo *' would not have been a good candidate to test your theory, since echo is a shell built-in (well, yes, there is a /bin/echo for POSIX compliance, well done) and thus wouldn't have caused your current shell to fork a new process and attempt to exec the tool with the given args.)

So, useless use of ls(1) aside, clearly the number of arguments cannot be the limiting factor -- without sudo(8), you are able to run the same command. Also, if you were in fact running into the maximum number of arguments for a new process problem, shouldn't sudo(8) reply with:

sudo: unable to execute /bin/ls: Argument list too long

...ie, the suitable string representation of E2BIG (yes, yes, "she said")?
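
(For reference, those error strings are nothing more than strerror(3) applied to errno. A trivial sketch -- and note, foreshadowing things to come, what errno 0 maps to on glibc:)

#include <errno.h>
#include <stdio.h>
#include <string.h>

int
main(void) {
    printf("E2BIG -> %s\n", strerror(E2BIG)); /* "Argument list too long" */
    printf("0     -> %s\n", strerror(0));     /* "Success" (on glibc) */
    return 0;
}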

This is exactly the kind of odd behaviour and inconclusive, unsatisfactory or misleading error reporting that really gets to me, and so I actually did spend a little bit of time tracking down and understanding this error.

First of all, what is the reason sudo(8) fails to invoke the command? Let's look at the code:

$ cd /tmp
$ wget http://www.sudo.ws/sudo/dist/sudo-1.8.3p1.tar.gz
$ tar zxf sudo-1.8.3p1.tar.gz
$ cd sudo-1.8.3p1
$ more src/exec.c
[...]
/*
 * Like execve(2) but falls back to running through /bin/sh
 * ala execvp(3) if we get ENOEXEC.
 */
int
my_execve(const char *path, char *const argv[], char *const envp[])
{
    execve(path, argv, envp);
    if (errno == ENOEXEC) {
        int argc;
        char **nargv;

        for (argc = 0; argv[argc] != NULL; argc++)
            continue;
        nargv = emalloc2(argc + 2, sizeof(char *));
        nargv[0] = "sh";
        nargv[1] = (char *)path;
        memcpy(nargv + 2, argv + 1, argc * sizeof(char *));
        execve(_PATH_BSHELL, nargv, envp);
        efree(nargv);
    }
    return -1;
}
[...]
$ 
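
(The ENOEXEC fallback in there, by the way, is the usual execvp(3) convention: if the target is, say, a script without a #! line, execve(2) fails with ENOEXEC and the command gets re-run via /bin/sh. A minimal sketch of that pattern, with a made-up script name:)

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void) {
    /* "noshebang.sh" is a made-up example: an executable script without
     * a #! line, for which execve(2) fails with ENOEXEC. */
    char *const argv[] = { "noshebang.sh", NULL };
    char *const envp[] = { NULL };

    execve("./noshebang.sh", argv, envp);
    if (errno == ENOEXEC) {
        /* Same trick as sudo's my_execve(): hand it to /bin/sh instead. */
        char *const nargv[] = { "sh", "./noshebang.sh", NULL };
        execve("/bin/sh", nargv, envp);
    }
    fprintf(stderr, "execve: %s\n", strerror(errno));
    return 127;
}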

This looks like a good place to start. Let's build this little fucker with debugging symbols and give it a go.

$ CFLAGS="-g" ./configure
[...]
$ make
[...]
$ sudo chown root src/sudo
$ sudo chmod u+s src/sudo
$ # let's first see if this fails in the same way
$ ./src/sudo ls /tmp/d/*
Password:
$ echo $?
1
$ 

Wonderful, this version of sudo(8) fails as well, but in a different way. I'm not sure if I prefer "unable to execute: Success" over "" (ie, no error message at all), but it fails nicely enough for our purposes. So let's move ahead in our efforts to identify why this fails.

$ sudo gdb ./src/sudo
(gdb) set follow-fork-mode child
(gdb) break my_execve
Breakpoint 1 at 0x4032f4: file ./exec.c, line 89.
(gdb) run ls /tmp/d/*
Starting program: /var/tmp/sudo-1.8.3p1/src/sudo ls /tmp/d/*
[Switching to process 22282]

Breakpoint 1, my_execve (path=0x195fbee8 "/bin/ls", argv=0x2b94a9c4e010,
envp=0x195f46c0)
    at ./exec.c:89
89          execve(path, argv, envp);
(gdb) p errno
$1 = 0
(gdb) s
90          if (errno == ENOEXEC) {
(gdb) p errno
$2 = 7
(gdb) 

Well, so at least we were not on crack. An errno of "7" indicates E2BIG (yes, yes, like your mom). So execve(2) is in fact failing as we thought. Let's take a look at the arguments given:

(gdb) p *argv@10
$4 = {0x7fffdacd67e1 "ls", 0x7fffdacd67e4 "/tmp/d/1", 0x7fffdacd67ed
"/tmp/d/10",  0x7fffdacd67f7 "/tmp/d/100", 0x7fffdacd6802 "/tmp/d/1000",
0x7fffdacd680e "/tmp/d/10000",  0x7fffdacd681b "/tmp/d/10001",
0x7fffdacd6828 "/tmp/d/10002", 0x7fffdacd6835 "/tmp/d/10003",
0x7fffdacd6842 "/tmp/d/10004"}
(gdb) 

Yupp, there they are, all of the arguments that the shell so nicely supplied when we asked it to glob *. Sweet! But... if we traced our shell executing our non-sudo invocation, we'd see the same thing, and yet it wouldn't fail. So why does this fail? Let's go and find out what our ARG_MAX is, anyway:

$ cat >> /tmp/a.c <<EOF
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char **argv) {
        printf("_SC_ARG_MAX says : %ld.\n", sysconf(_SC_ARG_MAX));
        return 0;
}
EOF
$ cc /tmp/a.c
$ ./a.out
_SC_ARG_MAX says : 131072.
$ getconf ARG_MAX
131072
$ 

Yay, two out of two dentists agree, the maximum number of arguments (not to be confused with the argument of the maximum or ARG-MAX®, by the way) is 131072. No, wait a second, the what? The "maximum number of arguments"? That is not actually what POSIX specifies ARG_MAX to be, now is it? POSIX/SUSv4 actually says:

{ARG_MAX}
    Maximum length of argument to the exec functions including environment
    data.

Aha! ARG_MAX includes the environment data. That is, it does not specify how many arguments you can feed to a shell command (as it is commonly interpreted to mean), but rather the size of the arguments and the environment data combined. But if that were the reason that ls * works but sudo ls * doesn't, that'd mean that the sudo(8) environment is just about large enough to push us across the ARG_MAX limit. Let's take a look, going back to our gdb-traced invocation:

(gdb) bt
#0  my_execve (path=0x195fbee8 "/bin/ls", argv=0x2b94a9c4e010,
        envp=0x195f46c0) at ./exec.c:90
#1  0x00000000004034fe in fork_cmnd (details=0x7fffdac95270,
        sv=0x7fffdac950c0) at ./exec.c:143
#2  0x00000000004038a2 in sudo_execve (details=0x7fffdac95270,
        cstat=0x7fffdac95140) at ./exec.c:284
#3  0x000000000040a977 in run_command (details=0x7fffdac95270)
        at ./sudo.c:1065
#4  0x0000000000408912 in main (argc=32770, argv=0x7fffdac95448,
        envp=0x7fffdacd5460) at ./sudo.c:295
(gdb) frame 0
#0  my_execve (path=0x195fbee8 "/bin/ls", argv=0x2b94a9c4e010,
        envp=0x195f46c0) at ./exec.c:90
90          if (errno == ENOEXEC) {
(gdb) li
85       */
86      int
87      my_execve(const char *path, char *const argv[], char *const
envp[])
88      {
89          execve(path, argv, envp);
90          if (errno == ENOEXEC) {
91              int argc;
92              char **nargv;
93
94              for (argc = 0; argv[argc] != NULL; argc++)
(gdb)  p *envp@20
$6 = {0x7fffdad3bc82 "HOSTNAME=t.raptor.pool.spa.yahoo.com",
  0x7fffdad3bec6 "PATH=/home/y/bin64:/home/y/bin:/usr/kerberos/bin:
/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/bin:/usr/bin",
  0x7fffdad3bf68 "LANG=en_US.UTF-8", 0x195f45f0 "SHELL=/bin/bash",
  0x195f44c0 "LOGNAME=root", 0x195f44e0 "USER=root", 0x195f4500 
  "USERNAME=root", 0x195f2540 "MAIL=/var/mail/root", 0x195f4520
  "HOME=/root", 0x2b94a9cf5010 "SUDO_COMMAND=/bin/ls /tmp/d/1
/tmp/d/10 /tmp/d/100  /tmp/d/1000 /tmp/d/10000 /tmp/d/10001
/tmp/d/10002 /tmp/d/10003 /tmp/d /10004 /tmp/d/10005 /tmp/d/10006
/tmp/d/10007 /tmp/d/10008 /tmp/d/10009 /tmp/d/"...,
  0x195f4540 "SUDO_USER=root", 0x195f4560 "SUDO_UID=0",
  0x195f4580 "SUDO_GID=0", 0x0, 0x0, 0x0, 0x0, 0x0}
(gdb) 

Here we see that not only does our argv include the shell-expanded list of files, but our envp also includes all that gunk in SUDO_COMMAND! So that must be the reason. And it is. But, as I found out, not directly in the way you'd think. Because if this were the reason, then we should only be able to use roughly ARG_MAX/2 worth of arguments when using sudo(8) (the same bytes would count once in argv and once again inside SUDO_COMMAND), and if you play around with trying to nail down that number, you'll find that this is not actually the case.

Now the ARG_MAX discussion has been had numerous times, and a useful summary can be found on this page. In particular, the section on the effectively usable space suggests that the space we have ought to be around:

$  expr `getconf ARG_MAX` - `env|wc -c` - `env|wc -l` \* 4 - 2048
127991
$ 
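
(If you'd rather count those bytes in-process than via env | wc, here's a rough sketch along the same lines -- just summing the environment strings, their terminating NULs, and a pointer's worth of overhead per variable:)

#include <stdio.h>
#include <string.h>
#include <unistd.h>

extern char **environ;

int
main(void) {
    size_t used = 0;
    char **ep;

    for (ep = environ; *ep != NULL; ep++)
        used += strlen(*ep) + 1 + sizeof(char *);  /* string + NUL + pointer */

    printf("env uses roughly %zu of %ld (ARG_MAX) bytes\n",
        used, sysconf(_SC_ARG_MAX));
    return 0;
}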

Now let's pretend we're in the sudo environment, where we find SUDO_COMMAND="/bin/ls /tmp/d/1 /tmp/d/10 /tmp/d/100 /tmp/d/1000 [...]":

$ export SUDO_COMMAND="/bin/ls $(ls /tmp/d/*)"
$ echo $SUDO_COMMAND
[...]
$ echo $SUDO_COMMAND | wc -c
-sh: /usr/bin/wc: Argument list too long
-sh: echo: write error: Broken pipe
$ 

Now what? We set SUDO_COMMAND and can echo it (shell built-in, remember?), but the shell immediately fails to exec(3) wc(1) to feed it the output of echo. This initially seems to confirm our thesis of the environment being too big -- but only until we remember that our thesis was that the args plus the environment combined were too big. So it seems the environment is too big all by itself.

So far, so good. How big of an environment can we actually build? Let's slowly reduce the length of the SUDO_COMMAND variable until we can actually exec(3) something again:

$ unset SUDO_COMMAND # built-in, fixes everything
$ export SUDO_COMMAND=$(dd if=/dev/urandom | tr -dC '[0-9a-z]' | \
        head -c $(( $(getconf ARG_MAX) - $(env | wc -c ) - 13 )) )
$ echo $SUDO_COMMAND | wc -c
130111
$ env | wc -c
131073
$ 

We generate a stream of random (trivially printable) characters and extract just enough to fill our total env up to ARG_MAX, by subtracting 13 characters (strlen("SUDO_COMMAND=")) as well as the number of characters in the current env (env | wc -c) from ARG_MAX. Then we confirm that our total env is exactly the size of ARG_MAX plus one byte: the old env, the variable name, the '=' and the value add up to exactly ARG_MAX, and env(1) prints one extra trailing \n for the new variable, hence 131073.

So the theory is that if our total env was just a single byte bigger, we couldn't exec(3) anything. But that theory fails to hold up to the test.

$ unset SUDO_COMMAND
$ export SUDO_COMMAND=$(dd if=/dev/urandom | tr -dC '[0-9a-z]' | \
        head -c $(( $(getconf ARG_MAX) - $(env | wc -c ) - 12 )) )
$ env | wc -c
131074
$ unset SUDO_COMMAND
$ export SUDO_COMMAND=$(dd if=/dev/urandom | tr -dC '[0-9a-z]' | \
        head -c $(( $(getconf ARG_MAX) - $(env | wc -c ) )) )
$ env | wc -c
131086
$ unset SUDO_COMMAND
$ export SUDO_COMMAND=$(dd if=/dev/urandom | tr -dC '[0-9a-z]' | \
        head -c $(( $(getconf ARG_MAX) - 14 )) )
$ env | wc -c
132021
$ export SUDO_COMMAND=$(dd if=/dev/urandom | tr -dC '[0-9a-z]' | \
        head -c $(( $(getconf ARG_MAX) - 13 )) )
$ env | wc -c
-sh: /usr/bin/wc: Argument list too long
-sh: /bin/env: Argument list too long
$ 

So it seems we can set a single environment variable to a value of up to ARG_MAX - strlen("SUDO_COMMAND=") - 1 bytes -- that is, until the full "SUDO_COMMAND=value" string, including its terminating NUL, is exactly ARG_MAX bytes long. That should fill our env completely, and we then shouldn't be able to add any other variables. But we still can -- and thus actually increase the size of our env and still be able to exec(3) other commands:

$ env | wc -c
132021
$ export FOO=1
$ env | wc -c
132027
$ 

In fact, we can increase the size of our env significantly. We can add a large number of variables, as well as other variables that are themselves very large:

$ for n in $(seq 2048); do export FOO$n=$n; done
$ env | wc -c
156437
$ env | grep -c FOO
2049
$ export VAL=$SUDO_COMMAND
$ export VAL2=$SUDO_COMMAND
$ env | wc -c
418564
$ for n in $(seq 2048); do export FOO$n=$SUDO_COMMAND; done
$ env | wc -c
-sh: fork: Cannot allocate memory
$ # various divide and conquer commands go here
$ env | wc -c
2621411
$ export B=
$ env | wc -c
-sh: /usr/bin/wc: Argument list too long
$ 

So. Now we've finally gotten somewhere. Our ARG_MAX limitation is, apparently, not on the combined size of argv + envp, but on a single environment variable. (This appears to contradict POSIX!) The total size of our env then is (eventually) limited by available memory, which makes sense since our env needs to fit on top of the stack, but we obviously can't just use all available space. It seems that on this system, we can have a total env of max size 2621411.

(Let me preempt comments regarding how Linux changed this limit to no longer be bound by _SC_ARG_MAX in versions >= 2.6.23 by noting that the system in question happens to be a 2.6.18 kernel.)

(Likewise, this apparent contradiction of POSIX appears to be a Linux-only thing -- NetBSD and FreeBSD both refuse to let you add new variables to the env once you've reached ARG_MAX. Ie, there the max really does apply to the full env, not to an individual variable, as is apparently the case on Linux.)
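
(To see the per-variable effect outside of the shell, here's a hedged little sketch: hand execve(2) an environment containing a single variable whose value alone is ARG_MAX bytes, with an otherwise tiny argv. This should fail with E2BIG -- though where exactly the limit bites will, as noted above, depend on your kernel:)

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int
main(void) {
    long argmax = sysconf(_SC_ARG_MAX);
    /* One environment string: "BIG=" followed by ARG_MAX bytes of 'x'. */
    char *var = malloc((size_t)argmax + sizeof("BIG=") + 1);
    char *const argv[] = { "true", NULL };
    char *const envp[] = { var, NULL };

    strcpy(var, "BIG=");
    memset(var + 4, 'x', (size_t)argmax);
    var[4 + argmax] = '\0';

    execve("/bin/true", argv, envp);
    fprintf(stderr, "execve: %s\n", strerror(errno)); /* Argument list too long */
    return 127;
}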

Aaaaaanyway... so the reason that sudo(8) failed (remember, that's how we started down this road in the first place?) is that it adds the globbed full command into the environment, and this single variable happens to be larger than ARG_MAX. Yay, figured out that part. But... why does sudo(8) give us an error of "Success"?

To understand that, we go back to reading the code we extracted earlier. Unfortunately, that does not appear to contain the error message we are looking for at all, and in fact, when we built and ran that version, it failed without any useful error message whatsoever. So let's go and hunt down the specific version of sudo(8) that we have installed on our system (1.6.9p17, as it happens). Unfortunately, that version is not available in the sudo(8) source archive. Grrr.

Let's get the source RPM from which this version supposedly was built instead, extract the sources and look at them:

$ wget -q http://vault.centos.org/5.4/os/SRPMS/sudo-1.6.9p17-5.el5.src.rpm
$ rpm2cpio sudo-1.6.9p17-5.el5.src.rpm | cpio -idm
$ tar zxf sudo-1.6.9p17.tar.gz
$ cd sudo-1.6.9p17
$ ./configure >/dev/null 2>&1
$ make >/dev/null 2>&1
$ sudo chown root sudo
$ sudo chmod u+s sudo
$ ./sudo ls /tmp/d/*
sudo: unable to execute /bin/ls: Argument list too long
$ 

Gaaaah! This version prints the actually helpful message, yet our system version does not. Well, the RPM contained a few more files: RedHat/CentOS patches sudo(8), and if we look at sudo-1.6.9p13-audit.patch, we find:

@@ -458,10 +477,16 @@ main(argc, argv, envp)
            NewArgv[1] = safe_cmnd;
            execve(_PATH_BSHELL, NewArgv, environ);
        }
+#ifdef WITH_AUDIT
+       audit_logger(AUDIT_USER_CMD,  safe_cmnd, user_args, 0);
+#endif 
        warn("unable to execute %s", safe_cmnd);
        exit(127);
     } else if (ISSET(validated, FLAG_NO_USER) || (validated & FLAG_NO_HOST)) {
        log_auth(validated, 1);
+#ifdef WITH_AUDIT
+       audit_logger(AUDIT_USER_CMD,  safe_cmnd, user_args, 0);
+#endif 
        exit(1);
     } else if (ISSET(validated, VALIDATE_NOT_OK)) {
        if (def_path_info) { 

Hey now, look at this. Right after we execve(2) (ie, in the case that that call fails) and right before we call warn(3) (which prints a useful error message based on the value of errno), we call audit_logger. Now if audit_logger makes any calls that cause errno to be changed, then... exactly. errno gets reset, those calls succeed and when we get to the warn(3), we call it with an errno of zero, which yields the "Success" message.
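
(Here's a contrived little sketch of that bug pattern -- not the actual RedHat code, and the "audit logger" below is just a stand-in -- showing how an intervening call between the failed execve(2) and warn(3) turns a real error into "Success":)

#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Stand-in for audit_logger(): does some successful work of its own and,
 * for the sake of a deterministic demo, explicitly leaves errno at 0. */
static void
pretend_audit_logger(void) {
    int fd = open("/dev/null", O_WRONLY);
    if (fd != -1) {
        (void)write(fd, "audit record\n", 13);
        close(fd);
    }
    errno = 0;
}

int
main(void) {
    char *const argv[] = { "command", NULL };
    char *const envp[] = { NULL };

    execve("/nonexistent/command", argv, envp);
    /* errno now holds the real reason (ENOENT here, E2BIG in our case)... */
    pretend_audit_logger();
    /* ...but by the time warn(3) looks at it, it is 0 again: */
    warn("unable to execute %s", "/nonexistent/command"); /* ": Success" */
    return 127;
}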

And there you have it. A much too long deep dive into why sudo(8) apparently occasionally prints "Success" when in fact it failed, and a summary of why the command failed in the first place even though the same non-sudo(8) invocation succeeds just fine.

December 4, 2011

