(This post is part of the learning c series.)
Learning C (Part 2): Types and Expressions
In this article, I will note what I feel is the non-obvious information related to C types, operators, and expressions.
Data Types
- `char`: a single byte. Holds a single character.
- `int`: an integer
- `float`: single-precision floating point
- `double`: double-precision floating point
Qualifiers
The qualifiers `short` or `long` can be applied to an integer.
- `short` and `int` must be at least 16 bits in size, but `int` may be larger than `short` depending upon hardware and implementation. Also, `short <= int`.
- `long` is at least 32 bits. Also, `int <= long`.
Usage can be seen below:
```c
#include <stdio.h>

int main(){
    short int x = 10;
    // int can actually be omitted here
    short y = 10;

    // zu is size_t type specifier
    printf("%zu\n", sizeof x);
    printf("%zu\n", sizeof y);

    long int lx = 10;
    long ly = 10;
    printf("%zu\n", sizeof lx);
    printf("%zu\n", sizeof ly);
    return 0;
}
```

```
2
2
8
8
```
Note that the output is in bytes, so on this machine `sizeof x` is 16 bits and `sizeof lx` is 64 bits.
The qualifiers `signed` or `unsigned` can be used with a `char` or any integer type.
- `unsigned` numbers are always ≥ 0.
- `unsigned` numbers obey the laws of arithmetic modulo $2^n$, where `n` is the number of bits in the type.
  - For example, for an 8-bit `unsigned char` type, the possible values are 0 to 255.
  - For an 8-bit `signed char` type, the possible values are typically -128 to 127.
- `long double` specifies extended floating point precision. Also, `float <= double <= long double`.
  - The sizes of the above could be distinct or identical. It's implementation-defined.
Here's an example experimenting with these qualifiers.
```c
#include <stdio.h>

int main(){
    // --------------------------------------------------
    // Messing with size qualifiers
    // --------------------------------------------------
    short sx = 10;
    int ix = 10;
    long lx = 10;
    char *fmt1 = "short: %zu, int: %zu, long: %zu\n";
    printf(fmt1, sizeof sx, sizeof ix, sizeof lx);

    float fy = 10.0;
    double dy = 10.0;
    long double ldy = 10.0;
    char *fmt2 = "float: %zu, double: %zu, long double: %zu\n";
    printf(fmt2, sizeof fy, sizeof dy, sizeof ldy);

    // --------------------------------------------------
    // Messing with signed/unsigned qualifiers
    // --------------------------------------------------
    unsigned char c = -1;
    unsigned long lc = -1;
    printf("c: %u, lc: %lu\n", c, lc);
    return 0;
}
```

```
short: 2, int: 4, long: 8
float: 4, double: 8, long double: 16
c: 255, lc: 18446744073709551615
```
Constants and literals
- `int`: `1234`
- `long`: `123456789L`
- `unsigned`: `123U`
- `unsigned long`: `123UL`
```c
#include <stdio.h>

int main(){
    int theint = 123;
    printf("%d\n", theint);

    long thelong = 123456789L;
    printf("%ld\n", thelong);

    unsigned theuint = 123U;
    printf("%u\n", theuint);

    unsigned long theulong = 123456789UL;
    printf("%lu\n", theulong);

    double thedouble = 123e-2;
    printf("%f\n", thedouble);

    int thehex = 0xA88BCF;
    printf("%x\n", thehex);

    char thechar = 'A'; // single, not double quotes
    printf("c as char: %c\n", thechar);
    printf("c as int: %d\n", thechar);
    return 0;
}
```

```
123
123456789
123
123456789
1.230000
a88bcf
c as char: A
c as int: 65
```
String literals have a null character (`'\0'`) at the end.
```c
#include <stdio.h>

void print_str(const char *s);

int main(){
    const char *mystr = "Hello";
    print_str(mystr);
    return 0;
}

void print_str(const char *s){
    char c;
    // Copy each character into c; the loop ends once c is '\0'
    while((c = *s++))
        printf("%c\n", c);

    if(c == '\0')
        // '\0' is not a printable character, so we need to print something
        // ourselves if we want to see it.
        printf("\\0\n");
}
```

```
H
e
l
l
o
\0
```
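Note: `'x'` != `"x"`. The first is a single character constant; the second is a string literal holding two bytes, `'x'` followed by `'\0'`.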
Enums for symbol comparison
Enums are a great way to use symbols to represent constant, related data without the use of a define. Enums allow for semantic, readable comparisons with symbols when their value doesn't necessarily matter.
These concepts are described in the following example:
```c
#include <stdio.h>

enum category { DOG, HUMAN, CAT };

struct mammal {
    char *name;
    enum category c;
};

void find_the_mammal(struct mammal *mammals, enum category mammal, size_t n);

int main(){
    struct mammal mammals[3];
    mammals[0] = (struct mammal){"Joseph", HUMAN};
    mammals[1] = (struct mammal){"Odin", CAT};
    mammals[2] = (struct mammal){"LUCY", DOG};

    find_the_mammal(mammals, HUMAN, 3);
    find_the_mammal(mammals, CAT, 3);
    find_the_mammal(mammals, DOG, 3);
    return 0;
}

// Print the name of every mammal in the array whose category matches
void find_the_mammal(struct mammal *mammals, enum category mammal, size_t n){
    size_t i; // same type as n, avoiding a signed/unsigned comparison
    for(i=0; i<n; i++){
        if(mammals[i].c == mammal)
            printf("%s\n", mammals[i].name);
    }
}
```

```
Joseph
Odin
LUCY
```
Implicit and explicit type conversions
When performing an operation where the items being operated on are of different types, they are converted to a common type before the operation occurs.
For example, take the following simple program:
```c
#include <stdio.h>

int main(){
    int x = 10;
    double y = 20.0;
    printf("%f\n", x+y);
    return 0;
}
```

```
30.000000
```
The integer value of `x` was implicitly converted to a `double` when added to the `double` `y`. This conversion was "temporary" in that the variable `x` was not affected at all. `x` still remains an integer after the `printf()` statement has finished.
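In general, implicit conversions such as the one above will make the smaller data type match the larger one. This is because going from an `int` to a floating-point type usually loses nothing (a `double` typically represents every 32-bit `int` exactly, though a `float` cannot represent every large `int`), but there is definitely data loss going from `float` to `int` via truncation!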
```c
#include <stdio.h>
#include <math.h>

int main(){
    // You need a grade average of 70 after rounding to graduate
    int graduation_threshold = 70;

    float your_grade = 69.9999;
    float your_grade_rounded = round(your_grade);
    printf("Your rounded grade as a double: %.2f\n", your_grade_rounded);

    int your_grade_as_int = your_grade;
    int your_grade_as_int_rounded = round(your_grade_as_int);
    if(your_grade_as_int_rounded >= graduation_threshold){
        printf("You graduated. Congrats!\n");
    }
    else{
        printf("You failed! Oh no!\n");
    }
    printf("Your final grade: %d\n", your_grade_as_int_rounded);
    return 0;
}
```

```
Your rounded grade as a double: 70.00
You failed! Oh no!
Your final grade: 69
```
Chars are just numbers!
```c
#include <stdio.h>

int main(){
    char c = 'A';
    c++;
    printf("%c\n", c);
    return 0;
}
```

```
B
```
Implicit conversions
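Suppose you have an operator that takes two numeric operands. The following rules for conversions are applied in order; the rest of this section refers to them as rules 1 through 8. (This wording follows the ANSI C (C89) formulation; later standards express the same idea in terms of integer conversion rank.)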
- First, if either operand is long double, the other is converted to long double.
- Otherwise, if either operand is double, the other is converted to double.
- Otherwise, if either operand is float, the other is converted to float.
- Otherwise, the integral promotions are performed on both operands; then, if either operand is unsigned long int, the other is converted to unsigned long int.
- Otherwise, if one operand is long int and the other is unsigned int, the effect depends on whether a long int can represent all values of an unsigned int; if so, the unsigned int operand is converted to long int; if not, both are converted to unsigned long int.
- Otherwise, if one operand is long int, the other is converted to long int.
- Otherwise, if either operand is unsigned int, the other is converted to unsigned int.
- Otherwise, both operands have type int.
Comparisons between signed and unsigned values are machine-dependent, because the outcome depends on the sizes of the integer types involved.
Note: "integral promotion" is the process by which a `char`, `short`, or `enum` object is promoted to either `int` or `unsigned int` when used in an expression. If an `int` can represent every value of the original type, the value is converted to an `int` for the expression. Otherwise, it is converted to `unsigned int`.
Rules 1-3, and 6-8 were demonstrated by example 5 above.
The rules concerning unsigned integers are much more interesting, and they have very important implications! Consider a machine with a 32-bit `int` and a 64-bit `long`, along with the following example:
```c
#include <stdio.h>

int main(){
    // On my computer, this prints 32 and 64. It may be different on your
    // computer!
    printf("sizeof unsigned int: %zu\n", sizeof(unsigned int)*8);
    printf("sizeof long: %zu\n", sizeof(long)*8);

    // Note that long/short, when by themselves, act as shorthand for
    // "long int" and "short int" respectively.
    printf("sizeof long int: %zu\n", sizeof(long int)*8);

    if(-1L < 1U){
        printf("-1 < 1. Duh!\n");
    }
    else{
        printf("-1 > 1?! WTF?!\n");
    }

    if(-1L < 1UL){
        printf("-1 < 1. Duh!\n");
    }
    else{
        printf("-1 > 1?! WTF?!\n");
    }
    return 0;
}
```

```
sizeof unsigned int: 32
sizeof long: 64
sizeof long int: 64
-1 < 1. Duh!
-1 > 1?! WTF?!
```
Let's explain this behavior using the rules listed above.
-1L < 1U
In the first comparison, we are comparing a long with an unsigned int. Rule 5 tells us what we should expect when an unsigned int and a long are operands together:
Otherwise, if one operand is long int and the other is unsigned int, the effect depends on whether a long int can represent all values of an unsigned int; if so, the unsigned int operand is converted to long int; if not, both are converted to unsigned long int.
So the question becomes: can a 64-bit `long` represent all the values of a 32-bit `unsigned int`? A `long` has to be able to represent both positive and negative numbers, whereas an `unsigned int` only has to represent non-negative numbers. So if the largest possible `long` value is greater than or equal to the largest possible `unsigned int` value, the standard says the `unsigned int` will be converted to `long`.
Let's ask `<limits.h>` whether a 64-bit `long` can represent all the values of a 32-bit `unsigned int`.
```c
#include <stdio.h>
#include <limits.h>

// This is by definition
#define UINT_MIN 0

int main(){
    printf("UINT_MIN: %u\n", UINT_MIN);
    printf("UINT_MAX: %u\n", UINT_MAX);
    printf("LONG_MAX: %ld\n", LONG_MAX);
    printf("LONG_MIN: %ld\n", LONG_MIN);

    if(LONG_MIN <= UINT_MIN && LONG_MAX >= UINT_MAX){
        printf("`long` can represent all values of `unsigned int`\n");
    }
    return 0;
}
```

```
UINT_MIN: 0
UINT_MAX: 4294967295
LONG_MAX: 9223372036854775807
LONG_MIN: -9223372036854775808
`long` can represent all values of `unsigned int`
```
So the answer is yes, a `long` can represent all values of an `unsigned int`, thus the less-than operator in `-1L < 1U` will cause the `unsigned int` operand to be converted to a `long` during the comparison. Let's see what the result of that conversion would be.
```c
#include <stdio.h>

int main(){
    long l = (long) 1U;
    printf("%ld\n", l);
    return 0;
}
```

```
1
```
This behavior isn't anything unexpected! But now, it's time for the fun part.
-1L > 1UL
Rule 4 in the conversion steps says
Otherwise, the integral promotions are performed on both operands; then, if either operand is unsigned long int, the other is converted to unsigned long int.
So during the evaluation of the expression `-1L > 1UL`, the integral promotions are applied first. Remember, integral promotion just takes the smaller integer types, such as `char` and `short`, and converts them to `int` or `unsigned int`. Since we are dealing with larger integer types, integral promotion changes nothing in this example.
The second part of Rule 4, though, states that if either operand is an `unsigned long`, the other operand is converted to an `unsigned long`. Thus, in the expression `-1L > 1UL`, the operand `-1L` will be converted to an `unsigned long`. Let's see what converting a negative `long` to an `unsigned long` looks like.
```c
#include <stdio.h>
#include <limits.h>

int main(){
    unsigned long ul = (unsigned long) -1L;
    printf("%lu\n", ul);
    // Compare
    printf("%lu\n", ULONG_MAX);
    return 0;
}
```

```
18446744073709551615
18446744073709551615
```
The C standard actually mandates that `-nUL == ULONG_MAX - n + 1` holds for every integer `n` from 1 to `ULONG_MAX` on any standards-compliant compiler; the rule itself is not machine-dependent, even though the value of `ULONG_MAX` is. Thus, converting a negative, signed `long` to an `unsigned long` yields a VERY big number. Identical behavior occurs for `unsigned int` and `UINT_MAX`.
So, for the comparison of `-1L > 1UL`, Rule 4 states that `-1L` is converted to an `unsigned long`, and we have seen that such a conversion results in a very large number. So now, it makes sense that `-1L > 1UL`, since the largest `unsigned long` is certainly larger than the number 1!
Explicit conversions
As you may have noticed in previous examples, we can explicitly convert data via a "cast". You simply prefix an expression with `(thetype)`, and that expression will be converted to that type.
Consider this example, which produces a compile-time warning due to a format specifier mismatch:
```c
#include <stdio.h>

int main(){
    int i = 15;
    printf("The value: %.2f\n", i);
    return 0;
}
```

```
ex13.c: In function ‘main’:
ex13.c:4:5: warning: format ‘%f’ expects argument of type ‘double’, but argument 2 has type ‘int’ [-Wformat]
The value: 0.00
```
Instead of creating another variable to facilitate an implicit `int` to `float` conversion, we can explicitly convert the data ourselves.
```c
#include <stdio.h>

int main(){
    int i = 15;
    printf("The value: %.2f\n", (float)i);
    return 0;
}
```

```
The value: 15.00
```