What is C++ Programming Language? C++ Character Set, C++ Tokens

What is C++ Programming Language?

The C++ language can be used to practice various programming concepts such as sequence, selection, and iteration. Just like any other language, learning C++ begins with becoming familiar with its basic symbols, called characters.

The learning hierarchy proceeds through words, phrases (expressions), statements, etc. Let us begin with the learning of characters.

C++ was created by Bjarne Stroustrup, beginning in 1979. The development and refinement of C++ was a major effort, spanning the 1980s and most of the 1990s. Finally, in 1998 an ANSI/ISO standard for C++ was adopted. In general terms, C++ is the object-oriented version of C.

It soon expanded into a programming language in its own right. Today, C++ is nearly twice the size of the C language. Needless to say, C++ is one of the most powerful computer languages ever devised.

C++ Character Set

As we know, the study of any language, such as English, Malayalam, or Hindi, begins with the alphabet. Similarly, the C++ language also has its own alphabet. In a programming language, the alphabet is known as the character set.

It is the set of valid symbols, called characters, that a language can recognize. A character can be a letter, a digit, or any other symbol. Characters are the fundamental units of a language, and the set of valid characters is collectively known as its character set.

The character set of C++ is categorized as follows:

S.No.  Category            Characters
1.     Letters             A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
                           a b c d e f g h i j k l m n o p q r s t u v w x y z
2.     Digits              0 1 2 3 4 5 6 7 8 9
3.     Special characters  + - * / ^ ( ) [ ] { } = < > . ' " $ , ; : % ! & ? _ (underscore) # @
4.     White spaces        Space bar (blank space), Horizontal Tab (→), Carriage Return (↵), Newline, Form feed
5.     Other characters    C++ can also process any of the 256 possible 8-bit character codes (ASCII and its extensions) as data or as literals.
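
The categories in the table can be checked in code. The following is a minimal sketch, not part of the original text: the function names `classify` and `code_of` are hypothetical, and the numeric codes it reports assume an ASCII-compatible system.

```cpp
#include <cctype>
#include <string>

// Classifies a character into one of the categories from the table above.
// (Hypothetical helper, written only for illustration.)
std::string classify(char ch)
{
    // The <cctype> functions expect a value representable as unsigned char.
    const unsigned char c = static_cast<unsigned char>(ch);
    if (std::isalpha(c)) return "letter";
    if (std::isdigit(c)) return "digit";
    if (std::isspace(c)) return "white space";
    return "special or other character";
}

// Returns the numeric code under which a character is stored.
// On an ASCII-compatible system (an assumption), 'A' is stored as 65.
int code_of(char ch)
{
    return static_cast<int>(static_cast<unsigned char>(ch));
}
```

For instance, `classify('7')` reports a digit and `classify('\t')` reports white space, while a symbol such as `+` falls into the last category.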

C++ Tokens

After learning the alphabet the second stage is learning words constituted by the alphabet (or characters). The term ‘token’ in the C++ language is similar to the term ‘word’ in natural languages. Tokens are the fundamental building blocks of the program.

They are also known as lexical units. C++ has five types of tokens as listed below:

  1. Keywords
  2. Identifiers
  3. Literals
  4. Punctuators
  5. Operators
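
To see how the five token types combine, here is a small sketch in which each comment names the tokens on that line. The function `add_bonus` and its names are made up for illustration; the classification follows the list above.

```cpp
// Each comment lists the tokens that appear on the same line.
int add_bonus(int marks)     // keywords: int, int | identifiers: add_bonus, marks | punctuators: ( )
{                            // punctuator: {
    int total = marks + 5;   // keyword: int | identifiers: total, marks | operators: = + | literal: 5 | punctuator: ;
    return total;            // keyword: return | identifier: total | punctuator: ;
}                            // punctuator: }
```

Calling `add_bonus(10)` adds the literal 5 to the argument and returns the result.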


Keywords

The words (tokens) that convey a specific meaning to the language compiler are called keywords. They are also known as reserved words because the language reserves them for special purposes and they cannot be redefined for any other purpose.

The 48 keywords covered here are listed in the table below (the C++ standard defines additional ones, such as bool, true, false, and namespace). Their meanings will be explained in due course:

asm       continue  float     new        signed    try
auto      default   for       operator   sizeof    typedef
break     delete    friend    private    static    union
case      do        goto      protected  struct    unsigned
catch     double    if        public     switch    virtual
char      else      inline    register   template  void
class     enum      int       return     this      volatile
const     extern    long      short      throw     while
C++ Keywords


Identifiers

In day-to-day life we assign names to places, people, objects, etc. to tell them apart. In C++ we use identifiers for this purpose. Identifiers are the user-defined words that name different program elements such as memory locations, statements, functions, objects, and classes.

The identifiers of memory locations are called variables. The identifiers assigned to statements are called labels. The identifiers used to refer to a set of statements are called function names.

While constructing identifiers certain rules are to be strictly followed for their validity in the program. The rules are as follows:

  1. An identifier is an arbitrarily long sequence of letters, digits, and underscores ( _ ).
  2. The first character must be a letter or underscore ( _ ).
  3. White space and special characters are not allowed.
  4. Keywords cannot be used as identifiers.
  5. Upper and lower case letters are treated differently, i.e. C++ is case sensitive.

Some examples of valid identifiers are Count, Sumof2numbers, Average_Height, _1stRank, Main, and FOR. (FOR is valid because identifiers are case sensitive, so it is distinct from the keyword for.)

The following identifiers are invalid for the reasons given:

Identifier      Reason it is invalid
Sum of Digits   Blank space is used.
1styear         Digit is used as the first character.
First.Jan       Special character ( . ) is used.
for             It is a keyword.
C++ Identifiers
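
The five rules above can be expressed as a small checker. This is only a sketch under stated assumptions: `is_valid_identifier` is a hypothetical helper, it tests against a short sample of keywords rather than the full table, and it treats only the letters recognized by the default locale as letters.

```cpp
#include <cctype>
#include <set>
#include <string>

// Returns true if `name` satisfies the identifier rules listed above.
// Assumption: only a small sample of keywords is checked, for brevity.
bool is_valid_identifier(const std::string& name)
{
    static const std::set<std::string> sample_keywords = {
        "for", "if", "int", "while", "return", "class"
    };

    if (name.empty()) return false;

    // Rule 2: the first character must be a letter or an underscore.
    const unsigned char first = static_cast<unsigned char>(name[0]);
    if (!std::isalpha(first) && name[0] != '_') return false;

    // Rules 1 and 3: only letters, digits, and underscores are allowed.
    for (char ch : name) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (!std::isalnum(c) && ch != '_') return false;
    }

    // Rule 4: keywords cannot be used. Rule 5 (case sensitivity) follows
    // from the exact comparison, so "FOR" is not treated as the keyword "for".
    return sample_keywords.count(name) == 0;
}
```

With this sketch, `Count` and `FOR` are accepted, while `Sum of Digits`, `First.Jan`, and `for` are rejected for the reasons given in the table.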


Literals

Consider the Single Window System for the admission of Plus One students. You may have given your date of birth in the application form. As an applicant, your date of birth remains the same throughout your life. Data items like this never change once they are assigned their initial values.

In mathematics, we know that the value of π is a constant, and in physics the acceleration due to gravity, g, is usually taken as a fixed 9.8 m/s². In the same way, C++ uses tokens called literals to represent data items that never change their value during the program run.

They are often referred to as constants. Literals can be divided into four types as follows:

  1. Integer literals
  2. Floating point literals
  3. Character literals
  4. String literals

Integer Literals

Consider the numbers 1776, 707, and -273. They are integer constants that represent decimal integer values. Tokens made up only of digits are called integer literals; they are whole numbers without fractional parts. (Strictly speaking, the minus sign in -273 is an operator applied to the literal 273.)
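
A short sketch of integer literals in use follows. The function names are hypothetical and chosen only for illustration.

```cpp
// Adds up the three integer values mentioned in the text.
int sum_of_samples()
{
    int a = 1776;   // integer literal
    int b = 707;    // integer literal
    int c = -273;   // the literal 273 with a minus sign applied to it
    return a + b + c;
}

// Integer values have no fractional part, so dividing two integers
// discards the remainder: 7 / 2 gives 3, not 3.5.
int half_of_seven()
{
    return 7 / 2;
}
```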

Floating Point Literals

You may have come across numbers like 3.14159, 3.0 × 10⁸, 1.6 × 10⁻¹⁹, and 3.0 during your course of study. These are four valid numbers.

The first number is π (Pi), the second is the speed of light in metres per second, and the third is the electric charge of an electron (an extremely small number); all three are approximations. The last one is the number three expressed as a floating-point numeric literal.

Floating-point literals, also known as real constants, are numbers that have fractional parts. They can be written in one of two forms: fractional form (for example, 3.14159) or exponential form (for example, 3.0E8, which means 3.0 × 10⁸).
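
The two forms can be compared side by side. In this sketch the function names are made up for illustration; the values are the ones quoted above.

```cpp
// Fractional form: digits with a decimal point.
double pi_value()
{
    return 3.14159;
}

// Exponential form: 3.0e8 means 3.0 x 10^8.
double speed_of_light()
{
    return 3.0e8;
}

// A negative exponent gives a very small number: 1.6 x 10^-19.
double electron_charge()
{
    return 1.6e-19;
}

// The whole number three written as a floating-point literal.
double three()
{
    return 3.0;
}
```

Both forms denote ordinary floating-point values, so `3.0e8` and `300000000.0` are the same number written in two ways.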

Character Literals

When we want to store a letter code for gender, we usually use ‘f’ or ‘F’ for Female and ‘m’ or ‘M’ for Male. Similarly, we may use the letter ‘y’ or ‘Y’ to indicate Yes and the letter ‘n’ or ‘N’ to indicate No. These are single characters.

When we refer to a single character enclosed in single quotes that never changes its value during the program run, we call it a character literal or character constant.
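
The gender and yes/no codes described above might be handled as in the following sketch. The function names `gender_from_code` and `is_yes` are hypothetical.

```cpp
#include <string>

// Interprets a one-letter gender code written as a character literal.
std::string gender_from_code(char code)
{
    if (code == 'f' || code == 'F') return "Female";
    if (code == 'm' || code == 'M') return "Male";
    return "Unknown";
}

// Returns true when the answer is the character literal 'y' or 'Y'.
bool is_yes(char answer)
{
    return answer == 'y' || answer == 'Y';
}
```

Note that `'f'` and `'F'` are different character literals, which is why both are compared explicitly.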

String Literals

Nandana is a student and she lives in Bapuji Nagar. Here, “Nandana” is the name of a girl and “Bapuji Nagar” is the name of a place. These kinds of data may need to be processed with the help of programs. Such data are considered string constants and they are enclosed within double quotes.

A sequence of zero or more characters enclosed within a pair of double quotes is called a string constant. For instance, "Hello friends", "123", "C++", and "Baby\'s Day Out" are valid string constants, and so is the empty string "".
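
The string constants mentioned above can be tried out as follows. This is a small sketch with hypothetical function names; it uses `std::string` from the standard library to hold the text.

```cpp
#include <cstddef>
#include <string>

// A string literal is written between double quotes.
std::string greeting()
{
    return "Hello friends";
}

// \' is an escape sequence for a single quote. Inside double quotes
// a plain ' would also work.
std::string movie_title()
{
    return "Baby\'s Day Out";
}

// "123" is text made of three characters, not the number 123.
std::size_t length_of_digits()
{
    return std::string("123").size();
}
```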


Punctuators

In languages like English and Malayalam, punctuation marks are used to make sentences grammatically complete. Consider the sentence: Who developed C++? Here ‘?’ is the punctuation mark that tells us the sentence is a question.

Similarly, at the end of each sentence, we put a full stop (.). In the same way, C++ also has some special symbols that have syntactic or semantic meaning to the compiler.

These are called punctuators. Examples are: # ; ' " ( ) [ ] { }. The purpose of each punctuator will be discussed later.


Operators

When we have to add 5 and 3, we express it as 5 + 3. Here + is an operator that represents the addition operation. Similarly, C++ has a rich collection of operators. An operator is a symbol that tells the compiler to perform a specific operation.

Operators are the tokens that trigger some kind of operation. An operator is applied to one or more data items called operands. C++ provides different types of operators, such as arithmetic, relational, logical, assignment, and conditional operators.
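
One operator from each of the kinds named above is shown in the sketch below. The function names are invented for illustration, and each kind has more operators than the single one used here.

```cpp
// Arithmetic operator: + adds its two operands.
int sum(int a, int b)
{
    return a + b;
}

// Relational operator: > compares its two operands.
bool is_greater(int a, int b)
{
    return a > b;
}

// Logical operator: && is true only when both conditions are true.
bool both_positive(int a, int b)
{
    return a > 0 && b > 0;
}

// Assignment operator: = stores the value on its right in the variable on its left.
int doubled(int a)
{
    int result;
    result = a * 2;
    return result;
}

// Conditional operator: ?: picks one of two values based on a condition.
int larger_of(int a, int b)
{
    return (a > b) ? a : b;
}
```

For example, `sum(5, 3)` evaluates the expression 5 + 3 from the text, with 5 and 3 as the operands.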