Standard Library Functions - C Interview Questions III


Q. What standard functions are available to manipulate strings?
Short answer: the functions in <string.h>.
C doesn't have a built-in string type. Instead, C programs use char arrays, terminated by the NUL ('\0') character.
C programs (and C programmers) are responsible for ensuring that the arrays are big enough to hold all that will be put in them. There are three approaches:
1. Set aside a lot of room, assume that it will be big enough, and don't worry what happens if it's not big enough (efficient, but this method can cause big problems if there's not enough room).
2. Always allocate and reallocate the necessary amount of room (not too inefficient if done with realloc; this method can take lots of code and lots of runtime).
3. Set aside what should be enough room, and stop before going beyond it (efficient and safe, but you might lose data).
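The second approach can be sketched as a small helper that grows a heap buffer before appending to it. This is only an illustration: append_str is a hypothetical name, not a standard function, and the sketch assumes the destination is either NULL or a NUL-terminated string obtained from malloc or realloc.

```c
#include <stdlib.h>
#include <string.h>

/*
 * append_str is a hypothetical helper (not a standard function).
 * It grows the heap buffer *dst as needed and appends src to it.
 * *dst must be NULL or a NUL-terminated string obtained from
 * malloc/realloc. Returns 0 on success, or -1 if realloc fails
 * (in which case *dst is left unchanged).
 */
int append_str(char **dst, const char *src)
{
        size_t old_len = (*dst != NULL) ? strlen(*dst) : 0;
        size_t add_len = strlen(src);
        char *p = realloc(*dst, old_len + add_len + 1);

        if (p == NULL)
                return -1;
        memcpy(p + old_len, src, add_len + 1); /* copies the NUL too */
        *dst = p;
        return 0;
}
```

A caller would start with a NULL pointer, call append_str(&s, "dog") and then append_str(&s, "wood"), check each return value, and eventually pass s to free. The extra code and the repeated reallocation are the cost this approach pays for never running out of room.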
There are two sets of functions for C string programming. One set (strcpy, strcat, and so on) works with the first and second approaches. This set copies or uses as much as it's asked to, and there had better be room for it all, or the program will write past the end of the array. Those are the functions most C programmers use. The other set (strncpy, strncat, and so on) takes the third approach. This set needs to know how much room there is, and it never goes beyond that, ignoring everything that doesn't fit.
The "n" (third) argument means different things to strncpy and strncat:
To strncpy, it means there is room for only "n" characters, including any NUL character at the end. strncpy copies exactly "n" characters. If the second argument doesn't have that many, strncpy copies extra NUL characters. If the second argument has more characters than that, strncpy stops before it copies any NUL character. That means, when using strncpy, you should always put a NUL character at the end of the string yourself; don't count on strncpy to do it for you.
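The advice about terminating the string yourself might look like the sketch below. copy_truncated is a hypothetical helper name, not a standard function, and the sketch assumes the destination has room for at least one character.

```c
#include <string.h>

/*
 * copy_truncated is a hypothetical helper (not a standard function).
 * It copies src into dst, which has room for size chars, and always
 * leaves dst NUL-terminated. size must be at least 1.
 * Returns 1 if src had to be truncated, 0 if it fit.
 */
int copy_truncated(char *dst, const char *src, size_t size)
{
        strncpy(dst, src, size - 1); /* may leave dst unterminated */
        dst[size - 1] = '\0';        /* so terminate it ourselves */
        return strlen(src) >= size;
}
```

With a four-character array, copying "dogwood" this way leaves "dog" in the array and reports truncation. Copying "a" leaves "a" followed by NUL characters in every remaining position, because strncpy pads short strings.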
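To strncat, it means to append up to "n" characters from the second argument, always followed by a terminating NUL character. The destination therefore needs room for what it already holds plus up to "n" + 1 more characters. Because what you really know is how many characters the destination can store, you usually need to use strlen to calculate how many characters you can copy.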
The difference between strncpy and strncat is "historical." (That's a technical term meaning "It made sense to somebody, once, and it might be the right way to do things, but it's not obvious why right now.")
The following program is an example of the "string-n" functions.
#include <stdio.h>
#include <string.h>
/*
Normally, a constant like MAXBUF would be very large, to
help ensure that the buffer doesn't overflow.  Here, it's very
small, to show how the "string-n" functions prevent it from
ever overflowing.
*/
#define MAXBUF 16
int
main(int argc, char** argv)
{
        char buf[MAXBUF];
        int i;
        buf[MAXBUF - 1] = '\0';
        strncpy(buf, argv[0], MAXBUF-1);
        for (i = 1; i < argc; ++i) {
                strncat(buf, " ",
                  MAXBUF - 1 - strlen(buf));
                strncat(buf, argv[i],
                  MAXBUF - 1 - strlen(buf));
        }
        puts(buf);
        return 0;
}
strcpy and strncpy copy a string from one array to another. The string given as the second (right) argument is copied into the array given as the first (left) argument; think of the order as being the same as that for assignment.
strcat and strncat "concatenate" one string onto the end of another. For example, if a1 is an array that holds "dog" and a2 is an array that holds "wood", after calling strcat(a1, a2), a1 would hold "dogwood" (provided a1 has room for at least eight characters).
strcmp and strncmp compare two strings. The return value is negative if the left argument is less than the right, zero if they're the same, and positive if the left argument is greater than the right. There are two common idioms for equality and inequality:
if (strcmp(s1, s2)) {
    /* s1 != s2 */
}
and
if (! strcmp(s1, s2)) {
    /* s1 == s2 */
}
This code is not incredibly readable, perhaps, but it's perfectly valid C code and quite common; learn to recognize it. If you need to take into account the current locale when comparing strings, use strcoll.
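One way to make the comparison idioms easier to read is to wrap them in small, clearly named functions. The names str_equal and has_prefix below are hypothetical helpers for illustration, not standard functions.

```c
#include <string.h>

/* Hypothetical helper: returns nonzero if s1 and s2 hold the same string. */
int str_equal(const char *s1, const char *s2)
{
        return strcmp(s1, s2) == 0;
}

/*
 * Hypothetical helper: returns nonzero if s begins with prefix.
 * strncmp compares at most strlen(prefix) characters and stops
 * early if it reaches the end of either string.
 */
int has_prefix(const char *s, const char *prefix)
{
        return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

Writing the test as strcmp(s1, s2) == 0 rather than !strcmp(s1, s2) is a matter of taste; both mean the same thing to the compiler.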
A number of functions search in a string. (In all cases, it's the "left" or first argument being searched in.) strchr and strrchr look for (respectively) the first and last occurrence of a character in a string. (memchr and, where it is available, the nonstandard memrchr are the closest things to "n" equivalents of strchr and strrchr.) strspn, strcspn (the "c" stands for "complement"), and strpbrk look for substrings consisting of certain characters or separated by certain characters:
n = strspn("Iowa", "AEIOUaeiou");
/* n = 2; "Iowa" starts with 2 vowels */
n = strcspn("Hello world", " \t");
/* n = 5; white space after 5 characters */
p = strpbrk("Hello world", " \t");
/* p points to blank */
strstr looks for one string in another:
p = strstr("Hello world", "or");
/* p points to the second "o" */
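strspn and strcspn can also be combined to walk through a string one word at a time without modifying it. The sketch below counts words that way; count_words is a hypothetical helper name, not a standard function.

```c
#include <string.h>

/*
 * count_words is a hypothetical helper (not a standard function).
 * It returns the number of runs of non-delimiter characters in s,
 * where delims lists the delimiter characters. s is not modified.
 */
size_t count_words(const char *s, const char *delims)
{
        size_t count = 0;

        s += strspn(s, delims);          /* skip leading delimiters */
        while (*s != '\0') {
                ++count;
                s += strcspn(s, delims); /* skip the word itself */
                s += strspn(s, delims);  /* skip delimiters after it */
        }
        return count;
}
```

For example, count_words("Hello world", " \t") returns 2, and a string made up only of delimiters yields 0.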
strtok breaks a string into tokens, which are separated by characters given in the second argument. strtok is "destructive"; it sticks NUL characters in the original string. (If the original string should not be changed, it should be copied, and the copy should be passed to strtok.) Also, strtok is not "reentrant"; it can't be called from a signal-handling function, because it "remembers" some of its arguments between calls. strtok is an odd function, but very useful for pulling apart data separated by commas or white space.
The following simple program uses strtok to break up the words in a sentence.
/* An example of using strtok. */
#include <stdio.h>
#include <string.h>
static char buf[] = "Now is the time for all good men ...";
int main(void)
{
        char* p;
        p = strtok(buf, " ");
        while (p) {
                printf("%s\n", p);
                p = strtok(NULL, " ");
        }
        return 0;
}
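When the original string must stay intact, strtok can be run on a private copy instead. The sketch below does that to count tokens; count_tokens is a hypothetical helper name, not a standard function, and the copy is made with malloc and memcpy so that only standard C functions are used.

```c
#include <stdlib.h>
#include <string.h>

/*
 * count_tokens is a hypothetical helper (not a standard function).
 * It counts the tokens in s without modifying s, by letting strtok
 * chop up a heap-allocated copy instead.
 * Returns -1 if memory for the copy can't be allocated.
 */
long count_tokens(const char *s, const char *delims)
{
        size_t size = strlen(s) + 1;
        char *copy = malloc(size);
        char *p;
        long count = 0;

        if (copy == NULL)
                return -1;
        memcpy(copy, s, size); /* includes the terminating NUL */
        for (p = strtok(copy, delims); p != NULL; p = strtok(NULL, delims))
                ++count;
        free(copy);
        return count;
}
```

Because the tokens point into the copy, they stop being valid once the copy is freed; a caller that needs the tokens themselves would have to keep the copy alive for as long as it uses them.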
