ACME Updates

20Oct2015 self-overlapping strcpy()

I recently got a bug report about mini_sendmail involving some string corruption. The reporter had done an excellent job of tracking down the likely cause: using strcpy() on two strings that overlap. Turns out this is undefined. See the C89/ANSI C standard, section

If copying takes place between objects that overlap, the behavior is undefined.
and also section A.6.2, in a huge long list of undefined behaviors:
- An attempt is made to copy an object to an overlapping object by use of a library function other than memmove
My code uses overlapping strcpy() all over the place, typically for eliding a substring. E.g. to remove the first character:
strcpy( str, str + 1 );
Apparently I'm supposed to use this instead:
memmove( str, str + 1, strlen( str + 1 ) + 1 );
The memmove() version is ugly and takes twice as many CPU cycles, but it's "standard".
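
One way to hide the ugliness is to wrap the memmove() call once and use the wrapper everywhere a substring gets elided. The following is only a sketch: the name str_elide() and its parameters are hypothetical and are not taken from mini_sendmail or any of the other packages mentioned here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (assumed name and signature): remove `count`
 * characters starting at offset `start` by sliding the tail of the
 * string down over them. memmove() is specified to handle source and
 * destination regions that overlap. */
static void str_elide(char *str, size_t start, size_t count)
{
    size_t len;

    assert(str != NULL);
    len = strlen(str);
    if (start >= len)
        return;                 /* nothing to remove */
    if (count > len - start)
        count = len - start;    /* clamp to the end of the string */
    /* Tail length plus one, so the terminating NUL moves too. */
    memmove(str + start, str + start + count, len - start - count + 1);
}
```

With this, removing the first character becomes str_elide( str, 0, 1 ), which is equivalent to the memmove() line above.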

It's likely that the string corruption is not actually due to a strcpy() implementation copying the bytes in some weird order. It's implausible that a strcpy() implementation has ever been written or will ever be written that doesn't just copy bytes from the beginning of the source string until it finds a NUL. I suspect the problem here is more like a compiler optimization taking advantage of the guarantee that strcpy()'s args do not alias each other.

Self-overlapping strcpy()'s behavior is not actually undefined; it's precisely defined. It always works when copying backwards (destination before the source, as in the example above), and it always fails when copying forwards (destination after the source). Every version of strcpy() behaves this way, and always will. Calling it undefined behavior doesn't improve anything, and does break lots of existing code going back decades. A typical standards committee botch.
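
To make the backwards case concrete, here is a hand-written model of the simple byte-at-a-time loop described above. It is an illustration only: naive_copy() is a hypothetical name, and the code says nothing about what any particular C library's strcpy() actually does. Because it is an ordinary function with no aliasing restrictions on its arguments, calling it with overlapping pointers is well-defined.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of a byte-at-a-time copy: start at the beginning
 * of src and copy until the NUL has been copied. Not the real strcpy(). */
static char *naive_copy(char *dst, const char *src)
{
    char *ret = dst;

    assert(dst != NULL && src != NULL);
    /* When dst is below src in the same buffer, every byte is read
     * before the write position reaches it, so the copy is intact. */
    while ((*dst++ = *src++) != '\0')
        ;
    return ret;
}
```

Calling naive_copy( str, str + 1 ) on "hello" leaves "ello" in str. The forwards case, such as naive_copy( str + 1, str ), is deliberately not exercised: the loop overwrites the terminating NUL before it ever reads it, so it would run off the end of the buffer.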

Anyway, I grepped through a couple hundred kilolines of code for self-overlapping strcpy() calls, and found & fixed a couple dozen. I've stored out new versions of mini_sendmail, thttpd, mini_httpd, coords, sfcmilter, and xml2c.
