CISC 3142
Programming Paradigms in C++
C-Style Strings

Motivation

Prior to the C++ string class (i.e., in C), strings were represented in a relatively low-level but quite natural manner: as arrays of ASCII characters, typed as char [] or (nearly equivalently) char *.

Overview

A C string is:

String Literals

char [] vs char *

Recall that although an array is passed to a function as a pointer, there is a difference between declaring an array and a pointer:

char []

char *
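As a rough illustration of the difference between the two declarations, the sketch below contrasts an array initialized from a literal with a pointer to a literal. The function names are hypothetical and exist only for this example.

```cpp
#include <cstddef>

// Hypothetical helper: an array declaration reserves storage for the characters.
std::size_t array_storage_size() {
    char a[] = "hello";       // array of 6 chars: 'h','e','l','l','o','\0'
    return sizeof(a);         // size of the whole array, null byte included
}

// Hypothetical helper: a pointer declaration reserves storage for an address only.
std::size_t pointer_storage_size() {
    const char *p = "hello";  // points at a string literal stored elsewhere
    return sizeof(p);         // size of the pointer itself (platform dependent)
}

// Hypothetical helper: the array holds its own modifiable copy of the literal.
char first_char_after_edit() {
    char a[] = "hello";
    a[0] = 'j';               // fine: a is our own storage
    return a[0];
}
```

Writing through a pointer to a string literal, by contrast, is undefined behavior, which is why the pointer above is declared const char *.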

Working With C Strings

There are several issues related to working with C strings, especially for Java and C++ programmers who are accustomed to using a self-managed string class. In summary, there is a lot of 'stuff' to deal with, which is one of the major reasons a string class is so desirable.

To help manage these issues and limit the consequences of overlooking them, C (and thus C++) provides a library of functions for processing C strings. In C++ these are accessed via the cstring header file (string.h in C).
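A minimal sketch of the library in use follows. The helper make_greeting and its parameters are hypothetical. Only strlen, strcpy, and strcat come from cstring. The capacity check is there because the library functions do not know how large the destination buffer is.

```cpp
#include <cstring>

// Hypothetical helper: builds "Hello, <name>" in a caller-supplied buffer.
// Returns false (and writes nothing) if the buffer is too small.
bool make_greeting(char *buffer, std::size_t capacity, const char *name) {
    const char *prefix = "Hello, ";
    // +1 leaves room for the terminating null byte
    if (std::strlen(prefix) + std::strlen(name) + 1 > capacity) return false;
    std::strcpy(buffer, prefix);   // copy the prefix, including its '\0'
    std::strcat(buffer, name);     // append name, overwriting that '\0'
    return true;
}
```

A caller would typically declare something like char buf[32]; and pass sizeof buf as the capacity.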

Examining the Functions in the cstring C String Library

Understanding the String Idioms

There are several fairly unique C-originated idioms that arise when working with null-terminated C strings. These are based on the following C'isms:

Checking for a null byte

Again, processing C-strings is like performing trailer-value-based input: one uses a while loop (a conditional loop) with the null byte ('\0') as the terminating condition (the trailer value). The most straightforward way to check for the null byte (ASCII 0, '\0') is to write:
*s == '\0'	
Similarly, the condition for a character NOT the null byte is:
*s != '\0'	
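Because the null byte has the value 0, and 0 is treated as false in a condition, the explicit comparison can also be written in a terser form. The two hypothetical helpers below are a small sketch of that equivalence.

```cpp
// Hypothetical helper: explicit comparison against the null byte.
bool is_empty_explicit(const char *s) {
    return *s == '\0';
}

// Hypothetical helper: the same test, relying on '\0' being zero (false).
bool is_empty_terse(const char *s) {
    return !*s;
}
```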

Iterating Through a C-String

Iterating through s is accomplished with the following pattern:
while (*s) {		// while not yet at the null byte
	…		// process the current character (*s);
	s++;			// go to next character
}
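As one possible instance of the iteration pattern, the sketch below counts how often a character occurs in a C string. The function count_char is hypothetical and not part of cstring.

```cpp
// Hypothetical helper: counts occurrences of target in the C string s.
int count_char(const char *s, char target) {
    int count = 0;
    while (*s) {                    // while not yet at the null byte
        if (*s == target) count++;  // process the current character
        s++;                        // go to next character
    }
    return count;
}
```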

Recursing Through a C-String

Similarly, recursing through s uses the following pattern:
void f(char *s) {
	if (!*s) return;		// null byte is escape clause
	…				// process current character (*s)
	f(s+1);				// recurse to next character
}	
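A small sketch of the recursive pattern applied to a concrete task: testing whether a character appears anywhere in a C string. The function contains_rec is a hypothetical name chosen for this example.

```cpp
// Hypothetical helper: true if target occurs somewhere in the C string s.
bool contains_rec(const char *s, char target) {
    if (!*s) return false;               // null byte is the escape clause
    if (*s == target) return true;       // process the current character
    return contains_rec(s + 1, target);  // recurse to next character
}
```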

Copying a sequence of successive characters

Here is the straightforward way of moving through s2 and copying each character to the corresponding position pointed at by s1. After each character is copied, both pointers are incremented to the next position; the sequence terminates when the null byte is encountered in s2.
while (*s2) {
	*s1 = *s2;
	s1++;
	s2++;
}
*s1 = '\0';		// it ain't a C-string without the null byte
The above can be rewritten in the highly terse, yet elegant code:
while (*s1++ = *s2++) 		// assigns the character pointed to by s2 to the location pointed to by s1, and bumps up both pointers; stops once '\0' has been copied
	;
One common issue when copying is maintaining a pointer to the beginning of the string. Note how s1 and s2 both move down the string; if the beginning of either string is needed afterwards, its location must be saved before the loop. This will be seen in the examples below.

Implementing the String Functions Using the Various Idioms

strlen

int strlen_1(const char *s) {
	int count = 0;
	while (*s != '\0') {
		count++;
		s++;
	}
	return count;
}
Notes
int strlen_2(const char s[]) {
	int i = 0;
	while (s[i] != '\0')
		i++;
	return i;
}
Notes
int strlen_3(const char *s) {
	int i; 
	for (i = 0; s[i] != '\0'; i++)
		;
	return i;
}
Notes
int strlen_4(const char *s) {
	const char *p = s;
	while (*p++)
		;
	return p - s - 1;
}
Notes
int strlen_5(const char *s) {
	const char *p;
	for (p = s; *p; p++)
		;
	return p - s;
}
Notes
int strlen_rec1(const char *s) {
        if (!*s) return 0;
        return strlen_rec1(s+1) + 1;
}
Notes
int strlen_rec2(const char *s) {
        return *s ? strlen_rec2(s+1) + 1 : 0;
}
Notes

strcpy

char *strcpy_1(char *to, const char *from) {
	char *originalTo = to;
	while (*from) {
		*to = *from;
		to++;
		from++;
	}
	*to = '\0';
	return originalTo;
}
Notes
char *strcpy_2(char *to, const char *from) {
	char *originalTo = to;
	while (*to++ = *from++)
		;
	return originalTo;
}
Notes
char *strcpy_3(char *to, const char *from) {
        int i = 0;
        while (from[i]) {
                to[i] = from[i];
                i++;
        }
        to[i] = '\0';
        return to;
}
Notes
char *strcpy_rec(char *to, const char *from) {
        *to = *from;
        if (*from) strcpy_rec(to+1, from+1);
        return to;
}
Notes

Command Line Arguments