memcmp
Comparing memory regions is an incredibly useful tool when writing C code. It
can be used to verify that two regions contain the same data or to compare
user-provided data against an expected value. The memcmp function provides this
functionality and has the following prototype:
int
memcmp(const void *s1, const void *s2, size_t n);
The return value is an integer which indicates whether the first region is
greater than, equal to, or less than the second region. Since we only need to
read the data, it makes sense that the pointers point to const data, which is an
extra safety measure to ensure we don't modify what is being compared.
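To make the return value concrete, here is a small sketch that calls the standard library's memcmp from <string.h>. The names compareSign and compareSignDemo are hypothetical and exist only for this illustration. The helper collapses the result to -1, 0, or 1, since only the sign of a nonzero result is meaningful.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reduce memcmp's result to -1, 0, or 1 */
static int
compareSign(const void *s1, const void *s2, size_t n)
{
    int ret = memcmp(s1, s2, n);

    return (ret > 0) - (ret < 0);
}

/* Hypothetical demo exercising all three outcomes */
static void
compareSignDemo(void)
{
    const char low[]  = "abc";
    const char high[] = "abd";

    assert(-1 == compareSign(low, high, sizeof(low)));  /* 'c' < 'd' */
    assert( 0 == compareSign(low, low, sizeof(low)));   /* identical */
    assert( 1 == compareSign(high, low, sizeof(low)));  /* 'd' > 'c' */
}
```

Callers should test the result against zero rather than against a specific value such as 1 or -1, because the magnitude of a nonzero result is not specified.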
Local Variables
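As with memcpy and memmove, we need two local pointers that enable us to
perform comparisons. They point to unsigned char because The Standard requires
the values to be compared as unsigned char. We will also use an integer to hold
the value we will be returning.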
const unsigned char *pStr1 = s1;
const unsigned char *pStr2 = s2;
int ret = 0;
Parameter Validation
The only validation we need to perform is on the pointer values. The n
parameter is of type size_t, which is unsigned, so it will always be greater
than or equal to zero. This brings up an interesting question: what should the
return value be when no memory is compared? The Standard provides no direct
answer to this, but we can infer one from section 7.11.4, "Comparison
functions", which
states:
"The sign of a nonzero value returned by the comparison functions memcmp,
strcmp, and strncmp is determined by the sign of the difference between the
values of the first pair of characters (both interpreted as unsigned char) that
differ in the objects being compared."
So, if we were to return a nonzero value we would need to be able to perform a
comparison, which we can't do if n is zero. As such, if n is zero then the
return value will also be zero. Therefore, we don't need to perform any
validation on n since all values of n are valid.
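Both points from the quoted text can be checked against the standard library's memcmp. The sketch below is only illustrative, and unsignedCompareDemo is a hypothetical name. It shows that bytes are ordered as unsigned char values and that a zero-length comparison reports equality even when the underlying data differs.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo of the unsigned char rule and the n == 0 case */
static void
unsignedCompareDemo(void)
{
    const unsigned char high[] = { 0x80 };
    const unsigned char low[]  = { 0x7F };

    /* As unsigned char, 0x80 (128) is greater than 0x7F (127) */
    assert(memcmp(high, low, 1) > 0);
    assert(memcmp(low, high, 1) < 0);

    /* No pair of characters is compared, so the result is zero */
    assert(0 == memcmp(high, low, 0));
}
```

On platforms where plain char is signed, comparing these bytes as char would give the opposite ordering, which is why the local pointers are declared as unsigned char.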
Lastly, we need to verify that the comparison will not cause the pointers to
wrap around memory. This will be done in the same manner as memmove, where we
check whether adding n to each pointer would carry it past the highest
representable address. To avoid performing an addition that could itself wrap,
the check subtracts each pointer from the maximum unsigned long value and
compares the remaining room against n. Input errors will be indicated with a
value of 1 since no comparison will be made and we cannot otherwise determine
the correct sign of the difference according to the rules above.
if ( !s1 ||
     !s2 ||
     /* Check for wrapping while comparing */
     ((((unsigned long) -1) - ((unsigned long) s1)) < n) ||
     ((((unsigned long) -1) - ((unsigned long) s2)) < n))
{
    return 1;
}
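The wrap check can be easier to reason about when pulled out on its own. The following is a minimal sketch, assuming (as the code above does) that an unsigned long is wide enough to hold a pointer value. The names wouldWrap and wouldWrapDemo are hypothetical and are not part of the implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the wrap check used above */
static int
wouldWrap(unsigned long addr, size_t n)
{
    /* Distance from addr to the largest unsigned long value */
    unsigned long room = ((unsigned long) -1) - addr;

    return room < n;
}

/* Hypothetical demo of the helper */
static void
wouldWrapDemo(void)
{
    /* Only 4 bytes of room remain, so 22 bytes would wrap */
    assert(1 == wouldWrap((unsigned long) -5, 22));

    /* Exactly as much room as requested is accepted */
    assert(0 == wouldWrap((unsigned long) -5, 4));

    /* A low address has plenty of room */
    assert(0 == wouldWrap(4096UL, 22));

    /* Zero bytes never wrap */
    assert(0 == wouldWrap((unsigned long) -1, 0));
}
```

Because the subtraction from the maximum value can never underflow, the check itself is safe for any address.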
Implementation
A similar approach to memmove will be taken where we loop over each character,
decrementing n each iteration, but we will compare values rather than copy
them. We can store the compared value so that we can check it and break out of
the loop upon inequality like so:
while (n-- > 0)
{
    ret = *(pStr1++) - *(pStr2++);
    if (0 != ret)
    {
        break;
    }
}
However, this adds the overhead of storing that value for every iteration of the
loop. We only care about that value when it is not equal to zero, so we should
only store it when that's the case. This requires us to back up the pointers one
step since they were incremented within the condition of the if statement.
while (n-- > 0)
{
    if (*(pStr1++) - *(pStr2++))
    {
        ret = *(--pStr1) - *(--pStr2);
        break;
    }
}
This implementation is \(O(n)\), which is as good as you can do while staying
within the constraints of The Standard. The multi-character copy approach could
also be used for comparisons, but that's still \(O(n)\) and it violates The
Standard.
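One possible variation on the loop, offered only as a sketch: test the two bytes for inequality before incrementing, so the pointers never need to be backed up. The names compareBytesSketch and compareBytesDemo are hypothetical, and input validation is omitted for brevity. The result is still \(O(n)\).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variation: compare first, then advance the pointers */
static int
compareBytesSketch(const void *s1, const void *s2, size_t n)
{
    const unsigned char *pStr1 = s1;
    const unsigned char *pStr2 = s2;
    int ret = 0;

    while (n-- > 0)
    {
        if (*pStr1 != *pStr2)
        {
            /* Pointers still reference the differing pair */
            ret = *pStr1 - *pStr2;
            break;
        }

        pStr1++;
        pStr2++;
    }

    return ret;
}

/* Hypothetical demo of the variation */
static void
compareBytesDemo(void)
{
    assert(compareBytesSketch("abc", "abd", 3) < 0);
    assert(compareBytesSketch("abd", "abc", 3) > 0);

    /* The difference lies beyond the first two bytes */
    assert(0 == compareBytesSketch("abc", "abd", 2));

    /* Nothing is compared when n is zero */
    assert(0 == compareBytesSketch("abc", "xyz", 0));
}
```

Whether this is measurably faster than the version above depends on the compiler and target, so treat it as a readability choice rather than an optimization.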
Testing
Aside from testing the obvious cases like equal regions and differing regions
(both to get a greater than and less than result), we also need to test that our
input validation works as expected. This includes passing in NULL pointers,
providing 0 for n, and intentionally attempting to compare regions which will
cause a wrap to occur. We'll order the tests such that we only verify correct
region comparisons after we verify that input validation works appropriately.
int
memcmpTest(void)
{
    int ret = 1;
    char pStr1[] = "This is a Test string";
    char pStr2[] = "This is a Test string";
    char pStr3[] = "This is a Lesser string";
    char pStr4[] = "This is a greater string";
    char *pStr5 = NULL;

    do
    {
        /* NULL s2 pointer */
        if (1 != memcmp(pStr1, pStr5, sizeof(pStr1)))
        {
            break;
        }

        /* NULL s1 pointer */
        if (1 != memcmp(pStr5, pStr1, sizeof(pStr1)))
        {
            break;
        }

        /* Cause a wrap from s1 */
        if (1 != memcmp((void *) ((unsigned long) -5), pStr1, sizeof(pStr1)))
        {
            break;
        }

        /* Cause a wrap from s2 */
        if (1 != memcmp(pStr1, (void *) ((unsigned long) -5), sizeof(pStr1)))
        {
            break;
        }

        /* Compare no characters */
        if (0 != memcmp(pStr1, pStr2, 0))
        {
            break;
        }

        /* Compare no characters with an invalid s2 pointer */
        if (0 == memcmp(pStr1, pStr5, 0))
        {
            break;
        }

        /* Compare no characters with an invalid s1 pointer */
        if (0 == memcmp(pStr5, pStr1, 0))
        {
            break;
        }

        /* Test equality */
        if (0 != memcmp(pStr1, pStr2, sizeof(pStr1)))
        {
            break;
        }

        /* First string greater than second string */
        if (0 >= memcmp(pStr1, pStr3, sizeof(pStr1)))
        {
            break;
        }

        /* First string less than second string */
        if (0 <= memcmp(pStr1, pStr4, sizeof(pStr1)))
        {
            break;
        }

        ret = 0;
    } while (0);

    return ret;
}
Conclusion
This function follows the same pattern that we've seen in memcpy and memmove,
with the only differences being that a comparison is performed instead of a copy
and that we don't need to worry about comparing in a certain direction since
data is never written. Remember that comparing zero bytes of memory will produce
a return value indicating that the regions are equal. Now that we have memcmp,
we can use it to test future functions like strcpy or strcat. The complete
implementation follows.
int
memcmp(const void *s1, const void *s2, size_t n)
{
    const unsigned char *pStr1 = s1;
    const unsigned char *pStr2 = s2;
    int ret = 0;

    if ( !s1 ||
         !s2 ||
         /* Check for wrapping while comparing */
         ((((unsigned long) -1) - ((unsigned long) s1)) < n) ||
         ((((unsigned long) -1) - ((unsigned long) s2)) < n))
    {
        return 1;
    }

    while (n-- > 0)
    {
        if (*(pStr1++) - *(pStr2++))
        {
            ret = *(--pStr1) - *(--pStr2);
            break;
        }
    }

    return ret;
}
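As a rough sketch of how memcmp might help test a function such as strcpy, the example below copies a string and then confirms that the destination matches the source byte for byte. It uses the standard library versions of both functions, and strcpyCheck is a hypothetical name chosen for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical check: verify a strcpy result with memcmp */
static void
strcpyCheck(void)
{
    const char src[] = "This is a Test string";
    char dst[sizeof(src)];

    strcpy(dst, src);

    /* sizeof(src) includes the terminating null character */
    assert(0 == memcmp(dst, src, sizeof(src)));
}
```

Comparing sizeof(src) bytes rather than strlen(src) bytes also confirms that the terminating null character was copied.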