Comparing memory regions is an incredibly useful tool when writing C code. It can be used to verify that two regions contain the same data or to compare user-provided data against an expected value. The memcmp function provides this functionality and has the following prototype

int
memcmp(const void *s1, const void *s2, size_t n);

The return value is an integer that indicates whether the first region is greater than, equal to, or less than the second region: positive, zero, or negative, respectively. Since we only need to read the data, it makes sense that the pointers are const. This is an extra safety measure to ensure we don't modify what is being compared.
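Because only the sign of the result is specified, callers should compare it against zero rather than against particular values. As a minimal sketch, the hypothetical helper compareSign below (it is not part of this implementation) wraps the standard library's memcmp and collapses its result to -1, 0, or 1.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reduce memcmp's result to -1, 0, or 1 */
int
compareSign(const void *s1, const void *s2, size_t n)
{
    int ret;

    /* The standard memcmp requires valid pointers */
    assert(s1 && s2);

    ret = memcmp(s1, s2, n);

    /* (ret > 0) and (ret < 0) each evaluate to 0 or 1 */
    return (ret > 0) - (ret < 0);
}
```

For example, compareSign("abc", "abd", 3) yields -1 because the first differing pair is 'c' and 'd'.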

Local Variables

As with memcpy and memmove, we need two local pointers that enable us to perform comparisons. We will also use an integer to hold the value we will be returning.

const unsigned char *pStr1  = s1;
const unsigned char *pStr2  = s2;
int ret                     = 0;

Parameter Validation

The only validation we need to perform is on the pointer values. The n parameter is of type size_t, an unsigned type, which means it will always be greater than or equal to zero. This brings up an interesting question: what should the return value be when no memory is compared? The Standard provides no direct answer to this, but we can infer one from section 7.11.4, "Comparison functions", which states

"The sign of a nonzero value returned by the comparison functions memcmp, strcmp, and strncmp is determined by the sign of the difference between the values of the first pair of characters (both interpreted as unsigned char) that differ in the objects being compared."

So, if we were to return a nonzero value we would need to be able to perform a comparison which we can't do if n is zero. As such, if n is zero then the return value will also be zero. Therefore, we don't need to perform any validation on n since all values of n are valid.
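The phrase "both interpreted as unsigned char" in the quoted text matters for bytes with the high bit set. As a rough illustration, the hypothetical helper byteDifference below (a name introduced here, not taken from the implementation) computes the difference for a single pair of bytes. It assumes int is wider than unsigned char, which holds on typical platforms.

```c
#include <assert.h>

/* Hypothetical helper: difference between two bytes read as unsigned char */
int
byteDifference(const void *a, const void *b)
{
    const unsigned char *pA = a;
    const unsigned char *pB = b;

    assert(pA && pB);

    /* Both operands are promoted to int before the subtraction */
    return *pA - *pB;
}
```

With the bytes 0x80 and 0x7F the result is 1. If the same bytes were read through a signed 8-bit char, 0x80 would typically become -128 and the sign of the difference would flip.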

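Lastly, we need to verify that the comparison will not cause the pointers to wrap around memory. This will be done in the same manner as memmove. Conceptually, we want to know whether adding n to each pointer would produce a result less than the pointer itself. To avoid performing an addition that might itself overflow, the check subtracts each pointer value from the largest unsigned long value and tests whether the remaining distance is less than n.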

Input errors will be indicated with a value of 1 since no comparison will be made and we cannot otherwise determine the correct sign of the difference according to the rules above.

if ( !s1 ||
     !s2 ||
    /* Check for wrapping while comparing */
    ((((unsigned long) -1) - ((unsigned long) s1)) < n) ||
    ((((unsigned long) -1) - ((unsigned long) s2)) < n))
{
    return 1;
}
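The wrap check can be looked at in isolation. The sketch below pulls it into a hypothetical helper named wouldWrap (not part of the original listing) and assumes, as the check above does, that unsigned long is wide enough to hold a pointer value. That holds on common LP64 systems but not on 64-bit Windows, where unsigned long is 32 bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero when n exceeds the distance from s to the top of the address range */
int
wouldWrap(const void *s, size_t n)
{
    /* Assumption: unsigned long can represent any pointer value */
    assert(sizeof(unsigned long) >= sizeof(void *));

    /* (unsigned long) -1 is the largest unsigned long value */
    return (((unsigned long) -1) - ((unsigned long) s)) < n;
}
```

A pointer value of (unsigned long) -5 leaves a distance of 4 by this measure, so an n of 22 is rejected while an n of 4 is accepted.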

Implementation

A similar approach to memmove will be taken where we loop over each character, decrementing n each iteration, but we will compare values rather than copy them. We can store the compared value so that we can check it and break out of the loop upon inequality like so

while (n-- > 0)
{
    ret = *(pStr1++) - *(pStr2++);

    if (0 != ret)
    {
        break;
    }
}

However, this adds the overhead of storing that value for every iteration of the loop. We only care about that value when it is not equal to zero, so we should only store it when that's the case. This requires us to back up the pointers one step since they were incremented in the condition of the if statement.

while (n-- > 0)
{
    if (*(pStr1++) - *(pStr2++))
    {
        ret = *(--pStr1) - *(--pStr2);
        break;
    }
}

This implementation is \(O(n)\) which is as good as you can do while staying within the constraints of The Standard. The multi-character copy approach could also be used for comparisons but that's still \(O(n)\) and it violates The Standard.
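The step that backs up the pointers in the loop above can be avoided by testing for inequality before advancing. The following is one possible alternative, offered as a sketch rather than as the approach this article settles on. The name compareBytes is hypothetical, and input validation is reduced to an assertion to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant: advance the pointers only after a pair matches */
int
compareBytes(const void *s1, const void *s2, size_t n)
{
    const unsigned char *p1 = s1;
    const unsigned char *p2 = s2;

    assert(p1 && p2);

    while (n-- > 0)
    {
        if (*p1 != *p2)
        {
            /* The pointers still address the differing pair */
            return *p1 - *p2;
        }

        p1++;
        p2++;
    }

    return 0;
}
```

This version is also linear in n. It trades the stored result for an early return.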

Testing

Aside from testing the obvious cases like equal regions and differing regions (both to get a greater than and less than result), we also need to test that our input validation works as expected. This includes passing in NULL pointers, providing 0 for n, and intentionally attempting to compare regions which will cause a wrap to occur. We'll order the tests such that we only verify correct region comparisons after we verify that input validation works appropriately.

int
memcmpTest(void)
{
    int ret         = 1;
    char pStr1[]    = "This is a Test string";
    char pStr2[]    = "This is a Test string";
    char pStr3[]    = "This is a Lesser string";
    char pStr4[]    = "This is a greater string";
    char *pStr5     = NULL;

    do
    {
        /* NULL s2 pointer */
        if (1 != memcmp(pStr1, pStr5, sizeof(pStr1)))
        {
            break;
        }

        /* NULL s1 pointer */
        if (1 != memcmp(pStr5, pStr1, sizeof(pStr1)))
        {
            break;
        }

        /* Cause a wrap from s1 */
        if (1 != memcmp((void *) ((unsigned long) -5), pStr1, sizeof(pStr1)))
        {
            break;
        }

        /* Cause a wrap from s2 */
        if (1 != memcmp(pStr1, (void *) ((unsigned long) -5), sizeof(pStr1)))
        {
            break;
        }

        /* Compare no characters */
        if (0 != memcmp(pStr1, pStr2, 0))
        {
            break;
        }

        /* Compare no characters with an invalid s2 pointer */
        if (0 == memcmp(pStr1, pStr5, 0))
        {
            break;
        }

        /* Compare no characters with an invalid s1 pointer */
        if (0 == memcmp(pStr5, pStr1, 0))
        {
            break;
        }

        /* Test equality */
        if (0 != memcmp(pStr1, pStr2, sizeof(pStr1)))
        {
            break;
        }

        /* First string greater than second string */
        if (0 >= memcmp(pStr1, pStr3, sizeof(pStr1)))
        {
            break;
        }

        /* First string less than second string */
        if (0 <= memcmp(pStr1, pStr4, sizeof(pStr1)))
        {
            break;
        }

        ret = 0;
    } while (0);

    return ret;
}

Conclusion

This function follows the same pattern that we've seen in memcpy and memmove with the only differences being that a comparison is performed instead of a copy and that we don't need to worry about comparing in a certain direction since data is never written. Remember that comparing zero bytes of memory will produce a return value indicating that the regions are equal. Now that we have memcmp, we can use it to test future functions like strcpy or strcat.

int
memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *pStr1  = s1;
    const unsigned char *pStr2  = s2;
    int ret                     = 0;

    if ( !s1 ||
         !s2 ||
        /* Check for wrapping while comparing */
        ((((unsigned long) -1) - ((unsigned long) s1)) < n) ||
        ((((unsigned long) -1) - ((unsigned long) s2)) < n))
    {
        return 1;
    }

    while (n-- > 0)
    {
        if (*(pStr1++) - *(pStr2++))
        {
            ret = *(--pStr1) - *(--pStr2);
            break;
        }
    }

    return ret;
}
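As the conclusion notes, memcmp is handy for testing functions like strcpy. A minimal sketch of that idea follows. It uses the standard library's strcpy, strlen, and memcmp, and the helper name copyIsExact is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: nonzero when strcpy reproduced src in dst, terminator included */
int
copyIsExact(char *dst, const char *src)
{
    size_t len;

    assert(dst && src);

    /* Include the terminating null character in the comparison */
    len = strlen(src) + 1;

    strcpy(dst, src);

    return 0 == memcmp(dst, src, len);
}
```

The caller is responsible for making dst large enough to hold src and its terminator.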