What does ctype.h contain?

C programming: ctype.h

Overview

The header ctype.h contains various functions for testing or converting individual characters. Each test function returns a nonzero value if the character c passed to it satisfies the condition; otherwise it returns 0:

  • int isalnum(int c): tests for alphanumeric characters (a-z, A-Z, 0-9)
  • int isalpha(int c): tests for letters (a-z, A-Z)
  • int iscntrl(int c): tests for control characters (\f, \n, \t ...)
  • int isdigit(int c): tests for decimal digits (0-9)
  • int isgraph(int c): tests for printable characters, excluding the space
  • int islower(int c): tests for lowercase letters (a-z)
  • int isprint(int c): tests for printable characters, including the space
  • int ispunct(int c): tests for printable punctuation marks
  • int isspace(int c): tests for whitespace characters (' ', '\f', '\n', '\r', '\t', '\v')
  • int isupper(int c): tests for uppercase letters (A-Z)
  • int isxdigit(int c): tests for hexadecimal digits (0-9, a-f, A-F)
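The test functions can be combined to classify a character. The following is only a minimal sketch: the enum char_kind and the function classify() are hypothetical names chosen for illustration, and the results assume the default "C" locale. Note that the tests only promise a nonzero value on success, not necessarily 1, so the results should be used as truth values rather than compared with 1.

```c
#include <ctype.h>

/* Hypothetical categories used only for this illustration. */
enum char_kind {
    KIND_DIGIT, KIND_UPPER, KIND_LOWER, KIND_SPACE,
    KIND_PUNCT, KIND_CONTROL, KIND_OTHER
};

/* Hypothetical helper: c must be EOF or representable as unsigned char. */
enum char_kind classify(int c)
{
    if (isdigit(c)) return KIND_DIGIT;
    if (isupper(c)) return KIND_UPPER;
    if (islower(c)) return KIND_LOWER;
    if (isspace(c)) return KIND_SPACE;    /* checked before iscntrl: '\n' is both */
    if (ispunct(c)) return KIND_PUNCT;
    if (iscntrl(c)) return KIND_CONTROL;
    return KIND_OTHER;
}
```

Because isspace() is checked before iscntrl(), a newline is reported as whitespace here even though it is also a control character; the order of the checks is a design choice of this sketch.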

In addition, two functions are defined for converting to upper and lower case letters:

  • int tolower(int c): converts uppercase letters to lowercase
  • int toupper(int c): converts lowercase letters to uppercase
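As a small illustration of the conversion functions, the following sketch converts a string to upper case in place. The function name str_to_upper() is hypothetical, and the behavior shown assumes the default "C" locale. Characters that are not lowercase letters are returned unchanged by toupper().

```c
#include <ctype.h>

/* Hypothetical helper: converts a string to upper case in place. */
void str_to_upper(char *s)
{
    for (; *s != '\0'; s++) {
        /* The cast keeps the argument within the unsigned char range. */
        *s = (char)toupper((unsigned char)*s);
    }
}
```

Calling str_to_upper() on a writable buffer containing "abc 1x" would leave "ABC 1X" in it; string literals must not be passed, because they may not be modified.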

Frequently made mistakes

All of these functions take a parameter of type int, although you might expect it to be a char. After all, the functions work with characters.

The reason for this lies in the C standard itself. According to the C standard, c must either be "representable as an unsigned char or equal the value of the macro EOF". Otherwise the behavior is undefined. The standard defines EOF as a negative int value, but unsigned char can never hold negative values. To satisfy the standard, the parameter type must therefore be able to represent both the value range of unsigned char and negative int values. That type is the basic data type int.
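The rule can be made concrete with a short sketch. Both function names below, is_valid_ctype_arg() and count_digits_in_stream(), are hypothetical and exist only for illustration. The first spells out which int values are legal arguments; the second shows why an int parameter is convenient: getc() returns each byte as an unsigned char converted to int, or EOF, so its result may be passed on without a cast.

```c
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if c is a legal argument for the
   <ctype.h> functions, i.e. EOF or a value representable as unsigned char. */
int is_valid_ctype_arg(int c)
{
    return c == EOF || (c >= 0 && c <= UCHAR_MAX);
}

/* Hypothetical helper: counts the decimal digits read from a stream. */
long count_digits_in_stream(FILE *fp)
{
    long n = 0;
    int c;      /* int, not char, so that EOF stays distinguishable */

    while ((c = getc(fp)) != EOF) {
        if (isdigit(c))     /* c is already in the valid range */
            n++;
    }
    return n;
}
```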

The int parameter alone would not be a problem. But C has three different character types: char, signed char and unsigned char. In a two's complement environment where a char is 8 bits wide (yes, there are also wider ones), signed char typically ranges from -128 to +127 and unsigned char from 0 to 255; whether plain char is signed or unsigned is implementation-defined. If you now assume that the character set is ISO-8859-1 (Latin-1) or Unicode in UTF-8, and that plain char is signed, every byte above 127 becomes a negative value. You must therefore not pass the characters of a string that may contain umlauts directly to these functions. A typical example where this happens anyway is:

    #include <ctype.h>

    int all_spaces(const char *s)
    {
        while (*s != '\0') {
            if (!isspace(*s))   /* ERROR: *s may be negative */
                return 0;
            s++;
        }
        return 1;
    }

A call such as all_spaces("Hallöle") can then lead to undefined behavior, because on an implementation where plain char is signed the umlaut is stored as a negative value. To avoid that, you have to convert the argument of isspace to unsigned char. It works like this, for example:

    isspace((unsigned char)*s)
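Put together, a corrected version of the function might look like the following sketch. It assumes, going by the name, that all_spaces() is meant to return 1 only if every character of the string is whitespace, so the test is negated here.

```c
#include <ctype.h>

/* Returns 1 if s consists only of whitespace characters (or is empty),
   otherwise 0. The negated test is an assumption based on the name. */
int all_spaces(const char *s)
{
    while (*s != '\0') {
        /* The cast guarantees a value in the range of unsigned char. */
        if (!isspace((unsigned char)*s))
            return 0;
        s++;
    }
    return 1;
}
```

With this version, a string containing bytes above 127 is handled without undefined behavior, regardless of whether plain char is signed on the target platform.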