Signs of Triviality

Opinions, mostly my own, on the importance of being and other things.
[homepage]  [blog]  [jschauma@netmeister.org]  [@jschauma]  [RSS]

strlcat(3) > strncat(3)

December 30th, 2021

In my Advanced Programming in the UNIX Environment class, I frequently observe common anti-patterns that appear all too easy for students to adopt. One of those is the unsafe use of strncat(3), which I'll demonstrate here together with an explanation of why you should use strlcat(3) instead.

When handling strings in C, buffer overflows are an all too common occurrence. Beginner programmers often use the rather convenient, but unbound strcpy(3) / strcat(3) functions. Let's illustrate by a simple example, appending a filename to a directory name to build a full pathname:

$ cat unsafe.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SIZE 32

int
main(int argc, char **argv) {
	char buf[SIZE] = { 0 };
	int i = 42;

	if (argc != 3) {
		fprintf(stderr, "Usage: %s dirname filename\n", argv[0]);
		exit(EXIT_FAILURE);
		/* NOTREACHED */
	}

	(void)strcpy(buf, argv[1]);
	(void)strcat(buf, argv[2]);

	(void)printf("buf (length %ld) is: %s\n", strlen(buf), buf);
	(void)printf("i is: %d\n", i);
	return 0;
}
$ cc -g -Wall -Werror -Wextra unsafe.c
$ ./a.out /tmp/ file
buf (length 9) is: /tmp/file
i is: 42
$ 

Wonderful. But as you can trivially spot, there's an obvious buffer overflow vulnerability whereby my writing data into buf can overwrite i (and ultimately lead to a segfault):

$ ./a.out `yes a | head -39 | tr -d '\n'` /file
buf (length 44) is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/file
i is: 0             # --- terminating NULL now value of 'i' ----^
$ ./a.out `yes a | head -40 | tr -d '\n'` /file
buf (length 45) is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/file
i is: 101           # --- ASCII 'e' = decimal 101 --------------^
$ ./a.out `yes a | head -51 | tr -d '\n'` /file
buf (length 56) is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/file
i is: 1633771873    # now we trampled all over the stack and overwrote our return address!
[2]   Segmentation fault (core dumped) ./a.out $(...) file
$ 

Right. So you may have instructed your students to be careful about this sort of thing, and they may have adopted the practice of using strncpy(3) / strncat(3) instead. But here lies a common misunderstanding or misuse of the function: new programmers tend to give as the length argument the length of the string that is being appended:

$ diff unsafe.c unsafe2.c
18,19c18,19
<       (void)strcpy(buf, argv[1]);
<       (void)strcat(buf, argv[2]);
---
>       (void)strncpy(buf, argv[1], strlen(argv[1]));
>       (void)strncat(buf, argv[2], strlen(argv[2]));
$ 

Ok, rookie mistake, I get it. We didn't change anything there: we're still copying / appending the input unchecked into a fixed-size buffer, overflowing it just like before. But this is a very common usage pattern, so don't dismiss it too quickly.

Now the other approach students take to fix the problem is by passing the total size of the destination buffer as the limiting argument, and initially that even looks a lot better:

$ diff unsafe2.c unsafe3.c
18,19c18,19
<       (void)strncpy(buf, argv[1], strlen(argv[1]));
<       (void)strncat(buf, argv[2], strlen(argv[2]));
---
>       (void)strncpy(buf, argv[1], SIZE);
>       (void)strncat(buf, argv[2], SIZE);
$ cc -g -Wall -Werror -Wextra unsafe3.c 
$ ./a.out `yes a | head -51 | tr -d '\n'` file
buf (length 37) is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/file
i is: 42
$ 

Yay, no segfault! But... we still overflowed the buffer. Note that buf has a length of 37. We just got lucky here, because the second string we appended was short. What if it's longer?

$ long=`yes a | head -51 | tr -d '\n'` 
$ ./a.out $long /$long
buf (length 64) is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i is: 1633771873
[3]   Segmentation fault (core dumped) ./a.out ${long} ${long}
$ 

Yep, just like before. The problem is that we told strncat(3) to not copy more than SIZE bytes, without consideration of whether buf can hold that many bytes at the time of the concatenation. What we should have done instead:

$ diff unsafe3.c ok.c
18,19c18,20
<       (void)strncpy(buf, argv[1], SIZE);
<       (void)strncat(buf, argv[2], SIZE);
---
>       (void)strncpy(buf, argv[1], sizeof(buf) - 1);
>       buf[sizeof(buf) - 1] = '\0';
>       (void)strncat(buf, argv[2], sizeof(buf) - 1 - strlen(buf));
$ cc -g -Wall -Werror -Wextra ok.c 
$ ./a.out $long /$long
buf (length 31) is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i is: 42
$ 

..., which, incidentally, is exactly what the strncat(3) manual page suggests, but of course this is neither intuitive, nor convenient, which is why so many students get this wrong. (It also relies on sizeof, which students also tend to get wrong, because sizeof != strlen.)

So what's the better practice to ingrain in your mind? strlcat(3), which handles all that for you:

$ diff ok.c better.c
18,20c18,19
<       (void)strncpy(buf, argv[1], sizeof(buf) - 1);
<       buf[sizeof(buf) - 1] = '\0';
<       (void)strncat(buf, argv[2], sizeof(buf) - 1 -
strlen(buf));
---
>       (void)strlcpy(buf, argv[1], SIZE);
>       (void)strlcat(buf, argv[2], SIZE);
$ ./a.out $long /$long
buf (length 31) is: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
i is: 42
$ 

There you go. strlcat(3) / strlcpy(3) have been around since they were introduced in OpenBSD 2.4 back in 1998, and it speaks to the persistent problems endemic in the C programming language that well over 20 years later, new programmers still struggle to adopt them. Perhaps this article helps some people remember to use them...

December 30th, 2021


Links:


Previous: [Open Source Security Process Wishlist]  -- Next: [WHOIS: Fragile, unparseable, obsolete... and universally relied upon]
[homepage]  [blog]  [jschauma@netmeister.org]  [@jschauma]  [RSS]