
10.14 Positional Number Systems

DigSys_Lecture01.pdf

Positional number systems: decimal and binary

Decimal = base 10, Binary = base 2

In both cases digits are joined together to form numbers.

A number is written as a vector of digits.

An N-digit decimal number represents one of \(10^N\) possibilities: 0, 1, 2, 3, ..., \(10^N-1\). This is called the range of the number.

For example, a three-digit decimal number represents one of 1000 possibilities in the range of 0 to 999.

An N-bit binary number represents one of \(2^N\) possibilities: 0, 1, 2, 3, ..., \(2^N-1\).

Example and Notation

For example, \(9472_{10} = 9\cdot10^3 + 4\cdot10^2 + 7\cdot10^1 + 2\cdot10^0\), and likewise \(1011_2 = 1\cdot2^3 + 0\cdot2^2 + 1\cdot2^1 + 1\cdot2^0 = 11_{10}\). The subscript indicates the base.

Binary Numbers and their Decimal Equivalents

1-Bit Binary   2-Bit Binary   3-Bit Binary   4-Bit Binary   Decimal Equivalent
0              00             000            0000           0
1              01             001            0001           1
               10             010            0010           2
               11             011            0011           3
                              100            0100           4
                              101            0101           5
                              110            0110           6
                              111            0111           7
                                             1000           8
                                             1001           9
                                             1010           10
                                             1011           11
                                             1100           12
                                             1101           13
                                             1110           14
                                             1111           15

Hexadecimal numbers (base 16)

Digits are 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F

  • Equivalent to decimal values from 0 to 15.

Four bits are equivalent to a single hexadecimal digit.

  • So the digits are equivalent to the binary patterns \(0000_2\) up to \(1111_2\).
  • Hexadecimal is a convenient way of writing binary values.
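
As a rough illustration, here is a minimal C sketch (the byte value is chosen arbitrarily) showing that each hexadecimal digit of a byte corresponds to one group of four bits:

#include <stdio.h>

int main(void) {
    unsigned char b = 0xA7;                  /* binary 1010 0111 */
    printf("hex: %02X\n", b);                /* prints A7 */
    printf("high nibble: %X\n", b >> 4);     /* 1010 -> A */
    printf("low nibble:  %X\n", b & 0x0F);   /* 0111 -> 7 */
    return 0;
}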


Hexadecimal editors/dumpers

$ hexedit a.out

Tools to read any file as a sequence of bytes.

Left column shows the offset (position from the beginning of the file).

Middle columns are the actual bytes of the file as hexadecimal digits.

Right columns are the same bytes shown in ASCII.

Other commands:

  • xxd
  • hexdump

Terminal Output (hexedit a.out):

00001F7C  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00001F8C  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00001F9C  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00001FAC  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00001FBC  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00001FCC  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00001FDC  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00001FEC  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
00001FFC  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
0000200C  00 00 00 00  01 00 02 00  48 65 6C 6C  6F 20 47 54  ........Hello GT
0000201C  10 F0 FF FF  68 00 00 00  01 1B 03 3B  34 00 00 00  ....h.......;4..
0000202C  40 F0 FF FF  A8 00 00 00  30 F0 FF FF  90 00 00 00  @.......0.......
0000203C  39 F1 FF FF  C0 00 00 00  05 00 00 00  01 7A 52 00  9............zR.
0000204C  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  ................
0000205C  90 01 00 00  14 00 00 00  01 78 10 01  1B 0C 07 08  ......x.........

Bits, bytes, nibbles, words ...

A bit has a value 1 or 0

A byte is a group of 8 bits

A nibble is a group of 4 bits

A word is the amount of data processed in one instruction step by a microprocessor. On a 64-bit architecture, a word is a group of 64 bits.

The bit in the rightmost column is the least significant bit (lsb).

The bit in the leftmost column is the most significant bit (msb).


Kilo and Mega

Since \(2^{10} = 1024 \approx 1000\), a kilobyte (1 KB) is \(2^{10}\) bytes.

A megabyte (1 MB) is \(2^{20} \approx 10^6\) bytes.

A gigabyte (1 GB) is \(2^{30} \approx 10^9\) bytes.

These approximations are useful for estimating powers of two. Example: find the approximate value of \(2^{24}\):

  • \(2^{24} = 2^{20} \times 2^4\)
  • \(2^{20}\) is \(\sim 1\) million
  • \(2^4\) is \(16\)
  • Hence \(2^{24}\) is \(\sim 16\) million (exactly \(16\,777\,216\))
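
A small C check of this estimate (the printed values are shown in the comments):

#include <stdio.h>

int main(void) {
    unsigned long exact    = 1UL << 24;           /* 2^24 = 2^20 * 2^4 */
    unsigned long estimate = 16UL * 1000000UL;    /* ~16 million */
    printf("2^24 exactly:   %lu\n", exact);       /* 16777216 */
    printf("2^24 estimated: %lu\n", estimate);    /* 16000000 */
    return 0;
}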

Integer Representations

Let us describe two different ways a vector of bits can be interpreted as an integer value:

  • one is to interpret it as a nonnegative number (unsigned encoding)
  • another is to interpret it as a negative, zero, or positive number (signed encoding)

Unsigned Encoding

For a \(w\)-bit unsigned number:

  • The least value is given by bit vector \([00...0]\), having integer value \(0\).
  • The greatest value is given by bit vector \([11...1]\), having integer value \(2^w-1\).
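
As a sketch, the unsigned interpretation \(\text{B2U}_{w}(x)=\sum_{i=0}^{w-1}x_{i}2^{i}\) can be written directly in C (the helper name b2u and the example bit vector are ours):

#include <stdio.h>

/* Sketch: interpret bit vector x[w-1..0] as an unsigned value,
   where x[i] holds bit i. */
unsigned b2u(const int x[], int w) {
    unsigned val = 0;
    for (int i = 0; i < w; i++)
        val += (unsigned)x[i] << i;   /* x_i * 2^i */
    return val;
}

int main(void) {
    int x[4] = {1, 0, 1, 1};          /* x0=1, x1=0, x2=1, x3=1, i.e. 1101_2 */
    printf("%u\n", b2u(x, 4));        /* prints 13 */
    return 0;
}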


Two's Complement Encoding

  • The most common computer representation of signed numbers.
  • Idea: interpret the most significant bit to have negative weight.
  • Function \(\text{B2T}_{w}\) ("Binary To Two's-complement", length \(w\)):
  • for bit vector \(x=[x_{w-1},...,x_0]\): \(\text{B2T}_{w}(x)= -x_{w-1}2^{w-1}+ \sum_{i=0}^{w-2}x_{i}2^{i}\)

Two's Complement Numbers Examples

Examples for \(w=4\):

  • Bit 3 serves as the sign bit: when set to 1, it contributes \(-2^3 = -8\) to the value.
  • The least value is given by bit vector \([10...0]\), having integer value \(-2^{w-1}\).
  • The greatest value is given by bit vector \([01...1]\), having integer value \(2^{w-1}-1\).
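
A minimal C sketch of \(\text{B2T}_{w}\) for small \(w\) (the helper name b2t and the test vectors are ours):

#include <stdio.h>

/* Sketch: B2T_w(x) = -x_{w-1} * 2^{w-1} + sum over i < w-1 of x_i * 2^i,
   where x[i] holds bit i. */
int b2t(const int x[], int w) {
    int val = -(x[w - 1] << (w - 1));   /* the sign bit has negative weight */
    for (int i = 0; i < w - 1; i++)
        val += x[i] << i;
    return val;
}

int main(void) {
    int x[4] = {0, 0, 0, 1};            /* bit vector 1000_2, w = 4 */
    int y[4] = {1, 1, 1, 0};            /* bit vector 0111_2, w = 4 */
    printf("%d %d\n", b2t(x, 4), b2t(y, 4));   /* prints -8 7 */
    return 0;
}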


Remarks

Every number between 0 and \(2^w-1\) has a unique unsigned encoding as a w-bit value.

Every number between \(-2^{w-1}\) and \(2^{w-1}-1\) has a unique two's complement encoding as a w-bit value.

  • The two's complement range is asymmetric: it is \([-2^{w-1},\,2^{w-1}-1]\).

  • There is no positive counterpart to the lowest value, since \(2^{w-1}\) is not in the range.

  • This is because half the bit patterns represent negative numbers, while the other half represent nonnegative numbers (including 0).

Important Numbers

Value                       w = 8    w = 16     w = 32            w = 64
\(UMax_{w}=11...1_{2}\)     0xFF     0xFFFF     0xFFFFFFFF        0xFFFFFFFFFFFFFFFF
Unsigned maximum            255      65,535     4,294,967,295     18,446,744,073,709,551,615
\(TMin_{w}=100...0_2\)      0x80     0x8000     0x80000000        0x8000000000000000
Two's complement minimum    -128     -32,768    -2,147,483,648    -9,223,372,036,854,775,808
\(TMax_{w}=0111...1_2\)     0x7F     0x7FFF     0x7FFFFFFF        0x7FFFFFFFFFFFFFFF
Two's complement maximum    127      32,767     2,147,483,647     9,223,372,036,854,775,807
\(-1\)                      0xFF     0xFFFF     0xFFFFFFFF        0xFFFFFFFFFFFFFFFF
\(0\)                       0x00     0x0000     0x00000000        0x0000000000000000
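
These constants are available from the standard headers <limits.h> and <stdint.h>; a short C sketch, assuming a platform where int is 32 bits (as in the w = 32 column):

#include <stdio.h>
#include <limits.h>
#include <stdint.h>

int main(void) {
    /* Where int is 32 bits, these match the w = 32 column above. */
    printf("UMax_32 = %u\n", UINT_MAX);            /* 4294967295 */
    printf("TMin_32 = %d\n", INT_MIN);             /* -2147483648 */
    printf("TMax_32 = %d\n", INT_MAX);             /*  2147483647 */
    /* Fixed-width 64-bit limits from <stdint.h>: */
    printf("TMin_64 = %lld\n", (long long)INT64_MIN);
    printf("TMax_64 = %lld\n", (long long)INT64_MAX);
    return 0;
}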

Unsigned Addition

With a 4-bit word size, the sum could require 5 bits.

Each argument ranges from 0 to 15, but the sum ranges from 0 to 30.

Usually, programming languages support fixed-size arithmetic, so operations such as addition and multiplication differ from their counterparts over the mathematical integers.



Define the operation \(+_w^u\) for arguments \(x\) and \(y\), where \(0 \leq x, y < 2^w\), as the result of truncating the integer sum \(x + y\) to \(w\) bits and then viewing the result as an unsigned number.

This can be characterized as a form of modular arithmetic, computing the sum modulo \(2^w\).

For example, \(15_{10}+3_{10}=1111_2+0011_2=10010_2\); truncating to 4 bits discards the leading \(1\) and leaves \(0010_2 = 2_{10}\).
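
A small C sketch of this truncation for \(w = 4\), using a bit mask to keep only the low \(w\) bits:

#include <stdio.h>

int main(void) {
    unsigned w = 4;
    unsigned x = 15, y = 3;
    unsigned sum  = x + y;                     /* 18 = 10010_2, needs 5 bits */
    unsigned tsum = sum & ((1u << w) - 1);     /* keep only the low w bits */
    printf("%u\n", tsum);                      /* prints 2, i.e. (15 + 3) mod 16 */
    return 0;
}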


Unsigned Addition Overflow

We can characterize operation \(+_w^u\) as follows:

\[ x+_w^u y = \begin{cases} x+y, & x+y < 2^w \quad \text{Normal} \\ x+y-2^w, & 2^w \leq x+y < 2^{w+1} \quad \text{Overflow} \end{cases} \]

An arithmetic operation is said to overflow when the full integer result cannot fit within the word size limits of the data type.

Overflow can be detected at the programming-language level: let \(s = x+_w^u y\); the computation of \(s\) overflowed if, and only if, \(s < x\) (or equivalently, \(s < y\)), as in the C example below.

#include <stdio.h>

int main(void) {
    unsigned char x = 150, y = 160, z;   /* e.g. values read with scanf */

    z = x + y;                 /* 310 truncated to 8 bits: 310 mod 256 = 54 */
    if (z < x)
        printf("overflow\n");
    else
        printf("no overflow\n");
    return 0;
}

Two's Complement Addition

At bit level, two's complement addition is the same operation as unsigned addition.

With two's-complement addition, we must decide what to do when the result is either too large (positive) or too small (negative) to represent.

Given integer values \(x\) and \(y\) in range \(-2^{w-1}\leq x, y \leq 2^{w-1}- 1\), their sum is in range \(-2^{w} \leq x + y \leq 2^{w} - 2\), potentially requiring \(w+1\) bits to represent exactly.

As before, we truncate the representation to \(w\) bits. What would be the definition of the resulting operation?


Define \(x +_w^t y\) as the result of truncating the integer sum \(x + y\) to \(w\) bits and then viewing the result as a two's-complement number:

\[ x +_w^t y = \begin{cases} x + y - 2^w, & 2^{w-1} \le x + y \quad \text{Positive overflow} \\ x + y, & -2^{w-1} \le x + y < 2^{w-1} \quad \text{Normal} \\ x + y + 2^w, & x + y < -2^{w-1} \quad \text{Negative overflow} \end{cases} \]
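
A small C sketch of \(+_4^t\) (the helper name tadd4 is ours): truncate the sum to 4 bits, then give bit 3 its negative weight:

#include <stdio.h>

/* Sketch for w = 4, assuming a two's-complement machine. */
int tadd4(int x, int y) {
    int s = (x + y) & 0xF;            /* keep only the low 4 bits */
    if (s & 0x8)                      /* bit 3 set: it carries weight -8 */
        s -= 16;
    return s;
}

int main(void) {
    printf("%d\n", tadd4(5, 6));      /* 11 overflows: prints -5  (11 - 16) */
    printf("%d\n", tadd4(-8, -8));    /* -16 overflows: prints 0  (-16 + 16) */
    printf("%d\n", tadd4(-3, 2));     /* normal: prints -1 */
    return 0;
}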


Two's Complement Negation

Every number \(x\) has an additive inverse under \(+_{w}^{t}\), which we denote \(-_{w}^{t} x\):

\[ -_{w}^{t} x=\begin{cases} TMin_{w}, & x=TMin_{w} \\ -x, & x>TMin_{w} \end{cases} \]

Observe that \(TMin_{w}+TMin_{w}=-2^{w-1}+(-2^{w-1})=-2^{w}\).

This would cause negative overflow, and hence \(TMin_{w}+_{w}^{t} TMin_{w}=-2^{w}+2^{w}=0\).

To perform negation at the bit level: invert all bits and then add 1 to the result.

Reason: \(x + \bar{x} = 11...1_2 = -1\), so \(x + (\bar{x} + 1) = 0\); hence "\(-x\)" \(= \bar{x} + 1\).

The one exception is \(TMin_w\): its positive counterpart \(2^{w-1}\) lies outside the range \([-2^{w-1},2^{w-1}-1]\), so the negation overflows back to \(TMin_w\). For \(w=4\): \(1000_2 = -8_{10}\) \(\to\) invert \(\to\) \(0111_2\) \(\to\) add 1 \(\to\) \(1000_2 = -8_{10}\).

Example

\(0101_2 = 5_{10}\) \(\to\) \(1010\) \(\to\) \(1011_2 = -5_{10}\)

\(1001_2 = -7_{10}\) \(\to\) \(0110\) \(\to\) \(0111_2 = 7_{10}\)
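
The same rule in C, using the fixed-width 8-bit type from <stdint.h> (the example values are ours):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    int8_t x = 5;
    int8_t neg = (int8_t)(~x + 1);           /* invert all bits, then add 1 */
    printf("%d\n", neg);                     /* prints -5 */

    int8_t tmin = INT8_MIN;                  /* -128, the 8-bit TMin */
    int8_t negtmin = (int8_t)(~tmin + 1);    /* 128 does not fit; wraps back to -128 on typical machines */
    printf("%d\n", negtmin);
    return 0;
}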

Guaranteed ranges for C integral data types

The C standards require that the data types have at least these ranges of values:

C data type     Minimum                        Maximum
[signed] char   -127                           127
unsigned char   0                              255
short           -32,767                        32,767
unsigned short  0                              65,535
int             -32,767                        32,767
unsigned        0                              65,535
long            -2,147,483,647                 2,147,483,647
unsigned long   0                              4,294,967,295
int32_t         -2,147,483,648                 2,147,483,647
uint32_t        0                              4,294,967,295
int64_t         -9,223,372,036,854,775,808     9,223,372,036,854,775,807
uint64_t        0                              18,446,744,073,709,551,615

In modern compilers (gcc, clang, ...), an int is 4 bytes, so its range is -2,147,483,648 to 2,147,483,647, and an unsigned int ranges from 0 to 4,294,967,295.
But if you want to write portable code, you should not rely on the size of the int type!
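
A short C sketch of the portable approach: check the sizes your compiler actually uses, and prefer the fixed-width types from <stdint.h> when the exact width matters:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    /* sizeof(int) is not fixed by the standard; print what this compiler uses. */
    printf("sizeof(int)     = %zu bytes\n", sizeof(int));
    printf("sizeof(long)    = %zu bytes\n", sizeof(long));
    /* Fixed-width types have a guaranteed size on every platform. */
    printf("sizeof(int32_t) = %zu bytes\n", sizeof(int32_t));
    printf("sizeof(int64_t) = %zu bytes\n", sizeof(int64_t));
    return 0;
}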

Final Comments

The "integer" arithmetic performed by computers is actually a form of modular arithmetic.

The finite word size used to represent numbers limits the range of possible values, and the resulting operations can overflow.

The two's-complement representation provides a clever way to represent both negative and positive values, while using the same bit-level implementations as unsigned arithmetic.