Text manipulation is a large part of user-facing programs, since text must be stored, retrieved, and displayed. To accomplish those actions, strings may need to be copied from one location to another. We can already find the length of a string with strlen and could pass that value along to the memcpy or memmove functions. However, the programmer must always remember to copy one extra character: the null terminator. To relieve the programmer of this error-prone requirement, C provides the strcpy function:

char *
strcpy(char *s1, const char *s2);

An (Almost) Easy Win

As mentioned, one could accomplish a string copy with a memory copying function. The Standard does not require strcpy to support copying between overlapping strings, but we can easily support this by using memmove:

memmove(dst, src, strlen(src) + 1);

Assuming that strlen works, it follows that a null character will be at the end of the string. Thus, we can copy the source length, plus one, in order to get the entire string along with the null terminator.
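
As a minimal sketch of this idea, the helper below wraps the single memmove call. The name copyWithTerminator is hypothetical and used only for illustration; it assumes the source is null terminated and that the destination has room for strlen(src) + 1 bytes.

```c
#include <string.h>

/* Hypothetical helper, not the final strcpy: copies src, including its
 * null terminator, into dst with a single memmove call.
 * Assumes src is null terminated and dst holds strlen(src) + 1 bytes.
 */
static char *
copyWithTerminator(char *dst, const char *src)
{
    /* strlen counts the characters before '\0'; the + 1 picks up '\0' */
    memmove(dst, src, strlen(src) + 1);

    return dst;
}
```

Copying "hello" this way writes six bytes: the five characters followed by the null terminator. Any bytes in the destination past that point are left untouched.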

If the source string runs off the end of memory and the program doesn't crash during strlen (which is a valid possibility with undefined behavior), strlen would report a length of 0 and we would attempt to copy a single character from the source. As long as the source pointer isn't the last addressable byte, this would look valid to memmove, which would subsequently copy a single character into the destination... without the null terminator. This is not the correct behavior, so we must handle it appropriately.

Local Variables

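First, we'll store the length of the string so we can use it to perform error checking before attempting to copy. Note that this calls strlen before s2 has been checked for NULL, so it assumes a strlen which tolerates a null pointer; the Standard does not promise that of strlen.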

size_t len = strlen(s2);

Parameter Validation

We need to perform the same kinds of checks that memmove does, so we'll borrow them, exchanging only n for our local variable len:

if ( !s1 ||
     !s2 ||
    /* Check for wrapping while copying */
    ((((unsigned long) -1) - ((unsigned long) s1)) < len) ||
    ((((unsigned long) -1) - ((unsigned long) s2)) < len))
{
    return s1;
}

When the length is 0, the wrapping checks don't perform any meaningful verification, since no unsigned value is less than 0. When the length is positive, however, they guarantee that the copy, terminator included, never runs past the end of memory and leaves behind a string which is not null terminated.
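
To make the arithmetic concrete, here is a small sketch of the wrap test on its own. The function name wouldWrap is hypothetical; like the check above, it assumes an unsigned long can hold a pointer value without truncation.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the wrap check above: returns 1 when the
 * bytes addr .. addr + len do not all fit below the top of the address
 * space, and 0 otherwise.
 */
static int
wouldWrap(unsigned long addr, size_t len)
{
    /* (unsigned long) -1 is the largest address value; subtracting addr
     * gives the number of bytes which lie above addr.
     */
    return (((unsigned long) -1) - addr) < len;
}
```

For the address (unsigned long) -5 there are four bytes above it, so a length of 4 passes (four characters, with the terminator landing on the last address) while a length of 5 is rejected. A length of 0 is never rejected, whatever the address.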

Implementation

If the parameters are valid, then we are assured that a valid copy may take place. However, if we're copying from a string which isn't null terminated (for which strlen would report a length of 0), the memmove call shown earlier would still copy a single character and no null terminator. To prevent this, we can call memmove with the string length and null terminate the destination manually. This guarantees that the destination will always become null terminated, even if the source string wasn't. In other words, an invalid source string will result in an empty string being copied.

memmove(s1, s2, len);
s1[len] = '\0';

return s1;
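
The effect of separating the copy from the termination can be sketched in isolation. The helper below is hypothetical (copyAndTerminate is not part of the implementation) and takes the length as a parameter so that the zero-length case can be exercised directly; it assumes dst has room for len + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies len characters and then writes the null
 * terminator explicitly, so dst is always terminated regardless of what
 * follows the first len bytes of src.
 * Assumes dst has room for len + 1 bytes.
 */
static char *
copyAndTerminate(char *dst, const char *src, size_t len)
{
    /* memmove tolerates overlapping source and destination */
    memmove(dst, src, len);
    dst[len] = '\0';

    return dst;
}
```

With a length of 0, nothing is copied and the destination becomes an empty string. Because memmove handles overlap, copying the last five characters of an array holding "hello, world" onto the start of that same array leaves it holding "world".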

Testing

Aside from passing NULL pointers to strcpy, we also need to verify copies from an empty string along with a normal copy from a string into an appropriately sized array. We still can't accurately test non-null terminated strings at the end of memory because that would invoke undefined behavior; however, we can attempt to copy to the end of memory.

int
strcpyTest(void)
{
    int ret = -1;
    char *str1 = NULL;
    char *str2 = "hello, world";
    char *str3 = "";
    char str4[] = "hello, world";
    char str5[15] = { 0 };
    char str6[] = "a";

    do
    {
        /* Wrap checking relies on an unsigned long holding a pointer value
         * without truncation.
         */
        if (sizeof(unsigned long) < sizeof(void *))
        {
            break;
        }

        /* Copy into NULL, may crash */
        strcpy(str1, str2);

        /* Copy from NULL, may crash */
        strcpy(str2, str1);

        if (0 != memcmp(str2, str4, strlen(str4) + 1))
        {
            break;
        }

        /* Copy to destination which would cause wrap, may crash */
        strcpy((char *) ((unsigned long) -5), str2);

        /* Copy an empty string over an array */
        strcpy(str4, str3);

        if (0 != memcmp(str4, str3, strlen(str3) + 1))
        {
            break;
        }

        /* Copy a single character string */
        strcpy(str5, str6);

        if (0 != memcmp(str5, str6, strlen(str6) + 1))
        {
            break;
        }

        /* Copy a string into an array */
        strcpy(str5, str2);

        if (0 != memcmp(str5, str2, strlen(str2) + 1))
        {
            break;
        }

        ret = 0;
    } while (0);

    return ret;
}

Conclusion

Our implementation of strcpy appropriately handles invalid inputs as well as copying from non-null terminated strings at the end of memory.

char *
strcpy(char *s1, const char *s2)
{
    /* Runs before the NULL check, so strlen must tolerate a NULL s2 */
    size_t len = strlen(s2);

    if ( !s1 ||
         !s2 ||
        /* Check for wrapping while copying */
        ((((unsigned long) -1) - ((unsigned long) s1)) < len) ||
        ((((unsigned long) -1) - ((unsigned long) s2)) < len))
    {
        return s1;
    }

    memmove(s1, s2, len);
    s1[len] = '\0';

    return s1;
}