12.23 Some fundamental algorithms on sequences
Algorithms on sequences (arrays)
- Sequence shuffling
- Sequence sorting
- Sequence searching
- Sequence merging
Algorithms and Pseudo-Code
- It is a finite sequence of rigorous well-defined instructions, whose aim is to solve a specific problem.
-
Algorithms describe computations.
-
They can be implemented as programs in a programming language.
-
They can also be abstractly described using pseudo-code.
- Pseudo-code is an informal notation to abstractly describe programs.
- Strong syntactic rules of programming languages are omitted.
- Types, operators, and control structures are used in a flexible way.
- Subtasks can be referred to as functions/procedures (straightforward tasks, or complex tasks to be further refined/implemented later on).
Pseudo-Code Example
isPrime(int x) -> boolean {
assume x is positive
if x is 1, then return false
else {
for each i in [2, 3, ..., (x-1)] do {
if i divides x then return false
}
}
return true
}
Shuffling a Sequence: The Fisher-Yates Algorithm
Problem: Randomly shuffle the elements of a sequence.
- Input: A sequence
s = [e1, ..., en]
of elements. - Output: A random permutation of
s
.
Fisher-Yates Algorithm:
- Produces an unbiased random permutation (every possible output permutation is equally likely).
- Shuffles sequence elements "in place" (without the need for additional space).
Shuffling a Sequence: The Fisher-Yates Algorithm
Pseudo-Code:
shuffle(seq s) {
for i from length(s)-1 to 1 do {
choose random j in [0..i]
swap s[i] and s[j]
}
}
C Implementation:
void shuffle(int array[], int size) {
srand(time(NULL));
for (int i = 0; i < size; i++) {
int j = (rand() % ((size - 1) - i + 1)) + i;
int aux = array[i];
array[i] = array[j];
array[j] = aux;
}
}
Myself version:
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int read(int seq[]) {
int i = 0;
while(scanf("%d", &seq[i])) {
i++;
}
return i;
}
void shuff(int seq[], int size) {
srand(time(NULL));
for(int i = 0; i < size; i++) {
int j = rand() % (size - i) + i;
int temp = seq[i];
seq[i] = seq[j];
seq[j] = temp;
}
}
void print(int seq[], int size) {
for(int i = 0; i < size; i++) {
printf("%d", seq[i]);
}
}
int main() {
int seq[128];
int size = read(seq);
shuff(seq, size);
print(seq, size);
return 0;
}
Sorting a Sequence: An Important CS Problem
- Problem: Sorting a sequence of elements.
- Input: A sequence
s = [e1, ..., en]
of sortable elements, i.e., elements for which there exists a defined order. - Output: A permutation of
s
that is increasingly sorted. -
There exist many alternative algorithms to solve this problem, with different characteristics:
-
Efficiency
- Memory requirements
- Performance on particular data structures
- Performance on partially sorted sequences, etc.
Sorting a Sequence with Insertion Sort
-
Key Idea:
-
Maintain a sorted prefix of the sequence.
- At each step, insert the first element of the unsorted part into its sorted position of the sorted prefix, thus extending the sorted prefix by one.
- When the unsorted part becomes empty, the sequence is sorted.
Sorting a Sequence with Insertion Sort
Pseudo-Code:
insertionSort(seq s) {
for each i in [1..length(s)-1] do {
j = i
while (j > 0 and s[j-1] > s[j]) {
swap s[j] and s[j-1]
j = j - 1
}
}
}
C Implementation:
void insertionSort(int array[], int size) {
for (int i = 1; i < size; i++) {
int j = i;
while (j > 0 && array[j - 1] > array[j]) {
int aux = array[j];
array[j] = array[j - 1];
array[j - 1] = aux;
j--;
}
}
}
Sorting a Sequence with Selection Sort
Key Idea:
- Maintain a sorted prefix of the sequence, with all elements smaller than the unsorted part.
- At each step, select the minimum of the unsorted part, and exchange it with the first element in the unsorted part, thus extending the sorted prefix by one.
- When the unsorted part becomes empty, the sequence is sorted.
Pseudo-Code:
selectionSort(seq s) {
for each i in [0..length(s)-2] do {
min_index = i
for j from i+1 to length(s)-1 do {
if s[j] < s[min_index] then {
min_index = j
}
}
swap s[i] and s[min_index]
}
}
Sorting a Sequence with Selection Sort
C Implementation:
void selectionSort(int array[], int size) {
for (int i = 0; i < size - 1; i++) {
int min_index = i;
for (int j = i + 1; j < size; j++) {
if (array[j] < array[min_index]) {
min_index = j;
}
}
int aux = array[i];
array[i] = array[min_index];
array[min_index] = aux;
}
}
Searching in a Sequence
- Problem: Given a sequence
s
and an elemente
, check ife
belongs tos
. - Input: A sequence
s = [e1, ..., en]
of elements and an elementx
. -
Output:
-
If
x
belongs tos
, the indexi
such thatei
isx
. -
-1
ifx
does not belong tos
.
Searching in a Sequence: Linear Search
Pseudo-Code:
linearSearch(seq s, elem x) -> int {
x_index = -1
i = 0
while x_index == -1 and i < length(s) {
if s[i] == x then x_index = i
i = i + 1
}
return x_index
}
Searching in a Sorted Sequence
- Problem: Given a sorted sequence
s
and an elemente
, check ife
belongs tos
. - Input: A sorted sequence
s = [e1, ..., en]
of elements and an elementx
. -
Output:
-
If
x
belongs tos
, the indexi
such thatei
isx
. -
-1
ifx
does not belong tos
. -
Observation:
-
Linear search still works!
- But can we exploit the fact that
s
is sorted to speed up the search?
Searching in a Sorted Sequence: Binary Search
Key Idea:
-
Look up the mid-point of the sequence
s
. -
If the element in the mid-point of
s
isx
, return the mid-point. - If the element is smaller than the mid-point of
s
, continue searching in the first half. - If the element is greater than the mid-point of
s
, continue searching in the second half. - Repeat the process until the element is found, or the subsequence to explore is empty.
Pseudo-Code:
binarySearch(seq s, elem x) -> int {
low = 0
high = length(s) - 1
while low <= high {
mid = (high - low) / 2
if s[mid] == x then return mid
else {
if s[mid] < x then low = mid + 1
else high = mid - 1
}
}
return -1
}
Merging Sorted Sequences
- Problem: Given two sorted sequences
s1
ands2
, create a sorted sequences3
that mergess1
ands2
. - Input: Sorted sequences
s1
ands2
. - Output: A sorted sequence
s3
that is the permutation of the concatenations1 + s2
.
Merging Sorted Sequences
-
Naïve algorithm:
-
Copy both
s1
ands2
intos3
in a contiguous way (i.e.,s3
is the concatenation ofs1
ands2
). - Sort
s3
. -
Key question:
-
Can we take advantage of the fact that
s1
ands2
are already sorted to make the process more efficient?
Merging Sorted Sequences: Linear Merge
Key Idea:
- Maintain in
s3
a sorted merge of two prefixes ofs1
ands2
. -
s1
ands2
have corresponding "non-treated" parts. -
At each step:
-
Compare the first elements of the non-treated parts of
s1
ands2
. - Extend
s3
with the smaller of the two, and reduce the untreated part of the sequence contributing the element tos3
.
Pseudo-Code:
merge(seq s1, seq s2) -> seq {
output = []
i = 0
j = 0
while i < length(s1) and j < length(s2) {
if s1[i] <= s2[j] then {
output = output + s1[i]
i = i + 1
} else {
output = output + s2[j]
j = j + 1
}
}
while i < length(s1) {
output = output + s1[i]
i = i + 1
}
while j < length(s2) {
output = output + s2[j]
j = j + 1
}
return output
}