The strstr function lets us search for the first occurrence of one string in another string.

char *
strstr(const char *s1, const char *s2);

If the string is found, a pointer to the beginning of its first occurrence is returned. If the string being found is empty, a pointer to the beginning of the string being searched is returned. Otherwise, NULL is returned.

Local Variables

We'll reference the length of the string being found many times during the main loop of this function, so it's best to store it in a variable rather than calling strlen(s2) over and over again.

size_t s2_len = 0;

Error Checking

As usual, we first check whether either string pointer is NULL.

if (!s1 || !s2)
{
    return NULL;
}

Next, we'll calculate and store the length of the string being found. If it's empty (has zero length), we return a pointer to the first string.

s2_len = strlen(s2);
if (0 == s2_len)
{
    return (char *) s1;
}

Implementation

As a precursor to the main search logic, if the string to be found is longer than the string being searched then a match will never occur and we can bail early.

/* A longer string will never be found */
if (s2_len > strlen(s1))
{
    return NULL;
}

The general idea will be to focus on the first character in the string being found and skip along each occurrence in the string being searched. Since this is a prerequisite of matching the entire string, it will prevent us from performing unnecessary comparisons. As we visit each instance of that character we'll then compare the full string being found - if a match is identified then we can stop searching and return the matching location.

First, we'll want to loop until s1 points to the terminating null character.

while (*s1)
{
    /* Perform search */
}

Next, we want to skip to the next occurrence of the first character of the string that we want to find. strchr does exactly what we need, and we update our search pointer with its result since characters we have already passed can no longer be part of a match. If the first character isn't found then strchr returns NULL, and this is the same value we return to the caller since the string can't be found either.

while (*s1)
{
    /* Jump to the next match on the first character */
    s1 = strchr(s1, *s2);
    if (!s1)
    {
        break;
    }

    /* Perform comparison */
}

return (char *) s1;

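If we do get a match on the first character then we perform a string comparison to check for a full match. However, since the string being searched may be longer than the string being found, we need to compare only the characters in the string being found, which is done through the use of strncmp. If the comparison succeeds, we return the current location right away.

Finally, if no match was found then we increment our search pointer and try again. Note that the loop can now also end because s1 has advanced to the terminating null character. This happens when the last character of the string being searched matches the first character of the string being found but the full comparison fails (for example, searching for "bc" in "aab"). At that point s1 is not NULL, so returning it would incorrectly report a match at the terminator. The function therefore returns NULL once the loop ends.

while (*s1)
{
    /* Jump to the next match on the first character */
    s1 = strchr(s1, *s2);
    if (!s1)
    {
        break;
    }

    if (0 == strncmp(s1, s2, s2_len))
    {
        /* Full match found */
        return (char *) s1;
    }

    s1++;
}

/* Reached the end of s1 without a match */
return NULL;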

Testing

For testing, we'll ensure that we pass NULL pointers to validate error checking, search for a string which is longer than the string being searched, attempt some near matches, and finally search for matches that exist at the beginning, middle, and end of the string being searched.

int
strstrTest(void)
{
    int ret     = 1;
    char str[]  = "Hello, world!\n";
    char str2[] = "H";

    do
    {
        /* Invalid search string */
        if (NULL != strstr(NULL, str))
        {
            break;
        }

        /* Invalid substring */
        if (NULL != strstr(str, NULL))
        {
            break;
        }

        /* Empty substring */
        if (str != strstr(str, ""))
        {
            break;
        }

        /* No match */
        if (NULL != strstr(str, "goodbye"))
        {
            break;
        }

        /* No match on single character */
        if (NULL != strstr(str, "z"))
        {
            break;
        }

        /* Search for string that is longer but matches initially */
        if (NULL != strstr(str2, "Hello"))
        {
            break;
        }

        /* Mismatch on last character */
        if (NULL != strstr(str, "Hellp"))
        {
            break;
        }

        /* First character matches only at the very end */
        if (NULL != strstr(str, "\nx"))
        {
            break;
        }

        /* Find itself */
        if (str != strstr(str, str))
        {
            break;
        }

        /* Find match at beginning */
        if (str != strstr(str, "Hello"))
        {
            break;
        }

        /* Find match in middle */
        if (strchr(str, 'w') != strstr(str, "wo"))
        {
            break;
        }

        /* Find match at end */
        if (strchr(str, 'w') != strstr(str, "world!\n"))
        {
            break;
        }

        /* Find match of single character */
        if (strchr(str, ' ') != strstr(str, " "))
        {
            break;
        }

        /* Match a single character only */
        if (str2 != strstr(str2, "H"))
        {
            break;
        }

        ret = 0;
    } while (0);

    return ret;
}

Conclusion

This function is a little larger than the other ones we've covered so far, but still boils down to straightforward logic of jumping to the next match of the first character and then performing a full comparison of the string being found.

char *
strstr(const char *s1, const char *s2)
{
    size_t s2_len = 0;

    if (!s1 || !s2)
    {
        return NULL;
    }

    s2_len = strlen(s2);
    if (0 == s2_len)
    {
        return (char *) s1;
    }

    /* A longer string will never be found */
    if (s2_len > strlen(s1))
    {
        return NULL;
    }

    while (*s1)
    {
        /* Jump to the next match on the first character */
        s1 = strchr(s1, *s2);
        if (!s1)
        {
            break;
        }

        if (0 == strncmp(s1, s2, s2_len))
        {
            /* Full match found */
            return (char *) s1;
        }

        s1++;
    }

    /* Reached the end of s1 without a match */
    return NULL;
}