Skip to content

12.2 Arrays, random number generation

Pseudorandom Number Generators (PRNG)

  • A PRNG is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers.
  • The PRNG-generated sequence is not truly random, because it is completely determined by an initial value, called the PRNG's seed.
  • The seed may be taken by an actual random source (hardware/outside world).
  • PRNGs are important in practice for their speed in number generation and their reproducibility.
  • PRNGs have applications in simulation, electronic games, and cryptography.

PRNGs in C's Standard Library

  • stdlib.h​ provides srand()​ and rand()​ functions.
  • time.h​ provides time()​ (useful for setting the seed).
  • A sample use:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    srand(time(NULL));
    printf("%d\n", rand());
    printf("%d\n", rand());
    printf("%d\n", rand());
    return 0;
}

Get a Random Number into a Certain Range

There are ways to do this in a statistically sound way, but in this course, we just want to go the simple way:

int x;

...

x = 50 + rand() % 51;  // x in range [50, 100]

Good Practices of srand()​ and rand()

  • srand()​ must be called only once in the program, otherwise the sequence resets.
  • srand()​ must be called before all uses of rand() ​.
  • srand(expr)​ must be provided a value expr​ that changes for each execution of the program:

  • Unless we want to enable reproducing some sequences.

  • This is useful for debugging.

Randomness Example: Guess the Number Game

Think of a game in which the computer randomly chooses a number between 1 and 100, and the player tries to guess it.

The program structure should be:

  • Computer chooses a random value between 1 and 100, stores it in variable r​.
  • Player inputs some value u​ to guess the number.
  • Computer prints a message to tell whether u​ is smaller or greater than r​.
  • Player has only N​ tries to find the number; otherwise, they lose.

Randomness Example: Password Generator

Consider a program that generates a random password according to the following rules:

  • There are 4 categories of characters: a-z​, A-Z​, 0-9​, and "special" characters from value 35 (#)​ to 47 (/)​.
  • To generate a character:

  • Randomly choose a category.

  • Then, randomly choose a character inside that category.
  • When choosing the last character of the password:

  • If no "special" character has been previously chosen, then randomly choose one special character.

Some output examples: image

Randomness Example: Random Walk Simulation

  • Random walk on the integer number line:

  • Starts at 0.

  • At each step, moves +1 or -1 with equal probability.
  • Consider a program that:

  • Simulates N random walks of M steps each.

  • Prints how many times the value reached is greater in absolute value than THRESHOLD.
  • You can think of variants of random walk in higher dimensions.
  • Random walk has many scientific and engineering applications.

image


Arrays

  • An array can be thought of as the definition of a family of variables:

  • All the variables of the family share the same name prefix.

  • Each variable of the family has its own suffix, called the index or subscript.
  • The brackets []​ are used to contain the array subscripts/indexes.
  • To use temp[0]​, temp[1]​, and temp[2]​ in a program, we would declare:

int temp[3];
* The integer 3 in the declaration represents the number of elements in the array. * The indexing of array elements starts at 0.


Array Declaration

  • An array is a sequential collection of homogeneous data.

  • All data items have the same type.

  • Its declaration in C requires:

  • The type of the array items.

  • An identifier (the array name).
  • A bracketed constant integral expression (the size of the array).
  • The lower bound of the array subscripts is 0, and the upper bound is (size - 1)​.

Thus, the following relationships hold:

int a[size]; /* space for a[0], ..., a[size-1] allocated */

lower bound = 0
upper bound = size - 1
size = upper bound + 1

Use of Constants

It is good practice to avoid so-called magic numbers. This includes, in particular, the array sizes, which many times can be defined using a symbolic constant:

#include <stdio.h>
#define SIZE 100

int main() {
    int a[SIZE];
    // some code to get values for a
    for (int i = 0; i < SIZE; ++i) {
        printf("%d\n", a[i]);
    }
    return 0;
}

Initialization

  • Consider the example:
    int f[5] = {0,1,2,3,4};

This initializes f[0]​ to 0, f[1]​ to 1, and so on.

  • When the list of values is shorter than the size of the array, the remaining elements are initialized to zero. For example:
int a[100] = {0};

This initializes all the elements of a​ to zero (first element only is explicitly set to zero).


Initialization (continued)

  • If an array is declared without a size and initialized to a series of values, it is implicitly given the size of the number of initializers. Thus:
int a[] = {2,3,5,-7};

is equivalent to:

int a[4] = {2,3,5,-7};

Subscripting

  • If a​ is an array, we can write a[expr]​, where expr​ is an integral expression, to access an element of the array. We call expr​ a subscript, or index, of a​.
  • Assume that the declarations: int i; int a[N];

have been made, where N​ is a symbolic constant.

  • A single array element a[i]​ is accessed when i​ has a value greater than or equal to 0​ and less than or equal to N - 1​.
  • If i​ is outside this range, a run-time error may occur when a[i]​ is accessed.

Out of Bounds Error

  • Suppose we access a[i]​ with value i​ being out of bounds.
  • Even on the same computer and operating system, the effect of the error may vary:

  • One frequent result is that the value of an unrelated variable will be returned.

  • Another result is that the operating system terminates the execution of the program for reading an unauthorized memory address (Segmentation Fault).
  • It is the responsibility of the programmer to ensure that all subscripts to arrays stay within bounds.

Arrays vs Non-Array Variables

  • A non-array variable is treated as a reference to a memory address that contains a value that can be read and written into. Its syntax is:

  • declaration: int a

  • read: a
  • write: a = b
  • get address: &a​ (required for scanf​)
  • An array variable is also a reference to a memory address that contains consecutive values that can be read and written to. However, the syntax is different:

  • declaration: int a[10]

  • read: a[5]​ (only for a single element)
  • write: a[5] = b​ (same as a non-array variable)
  • get address: a​ (equivalent to &a[0]​)

Applications of Arrays: Collections of Values

An array can be seen as a collection of related values. Typical operations that one can implement over collections are:

  • Total sum
  • Average
  • Find maximum/minimum
  • Apply an operation to each element
  • ...

Applications of Arrays: Sequential Collections

  • Arrays can inherently represent sequential collections of data, where the order of the elements matters.
  • Arrays are a fundamental data structure in computing, with many applications.

Examples of operations on sequential collections:

  • Median
  • Local minima/maxima
  • Sorting

How do we conveniently load values into arrays?

A way to load user-specific data into arrays is by getting the values from standard input:

while(scanf("%d", &a[i]) == 1) {
    // Code to process each input value
}
  • This includes reading from files via stdin redirection