Lecture 10.13 Introduction + Sorting
01 - Introduction + Sorting.pdf
Concepts
Functions Contracts(What do you want your function to do)
- Pre- and post-conditions for functions
-
Loop invariants
-
Algorithms
-
Abstract Data types (ADT)
-
Interfaces
-
Data Structures (implementation of ADTs)
-
Representation for ADTs
- Algorithms
Contracts
Contracts: An agreement between a programmer of a function and a user
- A function name and parameters:
E.g., sqr_root(float n): float - Precondition: what the programmer assumes about the input data
- Conditions over the function arguments
-
Facilitates the task of the programmer
Only cares for the cases when precondition is true - E.g.: \(n \ge 0\)
As a programmer I only care for \(n>0\) (rights)
As a user I need to provide \(n>0\) (obligation) - Postcondition: properties of the output the programmer must ensure
-
Specify what (not how) function will compute if the the precondition holds
-
E.g., result * result = n
As a programmer I have to provide the sqr_root (obligation)
As a user I get the sqr_root (rights)
Taking advantages of contracts
Problem: given a sequence s and an element x, check if x belongs to s.
Input: a sequence s = [e1, ..., en] of elements and an element x
Output: if x belongs to s, index i such that \(e_i\) is x; -1 if x does not belong to s
linearSearch(seq s, elem x) -> int {
x_index = -1
i = 0
while index == -1 and i < length(s) {
if s[i] == x then x_index = i
i = i + 1
}
return x_index
}
Then if we know s is sorted (precondition) , we'll have a faster way compared to linear search, that is Binary Search
binarySearch(seq s, elem x) -> int {
low = 0
high = length(s) - 1
while low <= high {
mid = (high-low)/2
if s[mid] == x then return mid
else {
if s[mid] < x then low = mid + 1
else high = mid - 1
}
}
return -1
}
Taking advantages of contracts
- linear Search
If s is an unsorted sequence ✅
sorted sequence ✅
- Binary Search (It assumes that s sorted)
If s is an unsorted sequence ❌ Violates precondition
sorted sequence ✅
Algorithm
A finite sequence of rigorous well-defined instructions, whose aim is to solve a specific problem specified in a contract
- Algorithms describe computations
- The contract can be formally or informally given
Important properties of an algorithm
- Correctness:
Does the algorithm effectively solve the intended problem? - Complexity:
- How good is a solution?
- Time complexity: cost of execution (e.g., required steps)
- Space complexity: memory usage
- Complexity depends on the used data structures
Data structure: a way to store and organize data to facilitate access and modifications.
Abstract data type (ADT)
A conceptual model for data types where the implementation details are hidden from the user. It defines the set of elements of the type and the set of operations (i.e., functions) that can be performed on those elements, without any implementation detail.
Implementation of an ADT
A data structure provides a specific implementation of an ADT:
- The representation of the elements of the data type
- The implementation of the operations that operate on the representation.
They can be subject to the analysis of their running time and space properties.
A data structure is a high-level (i.e., design level) implementation of an abstract data type.
Relation ADT and Data Structure
An ADT typically admit multiple alternative implementations (Imp1, Imp2, ..., ImpN) :
Each implementation
- relies on a data structure
- can be subject to the analysis of their properties: time and space complexity
We can select the most convenient implementation by analyzing their properties
graph BT
ADT("An ADT")
Imp1("Imp1")
Imp2("Imp2")
Dots("...")
ImpN("ImpN")
Imp1 --> ADT
Imp2 --> ADT
ImpN --> ADT
Popular ADTs
Lists
Sets
Dictionaries (Maps)
Queues
Multisets (Bags)
Trees
Graphs
They are collections of elements organized in some specific way
ADT List
Elements: a linearly organized collection of elements
(Usual) Operations:
- access(L, i) : access an element at the valid position i from the list L
- insert(L, e, i) : adds an element e at the specified position i in the list L
- remove(L, i) : removes the element at the valid position i from the list L
- belongs(L, e) : checks if the element e is present in the list L. If it is, the function returns the position of e in the L
Alternative implementations: arrays, singly and doubly linked list, ...
There can be other operations!!! isEmpty?, size?, first, last, ...
The set of operations is part of the definition of the ADT
ADT Set
Elements: unordered collection of elements (multiplicity is irrelevant)
Operations:
- init/empty: creates an empty set
- singleton(a) : creates a set containing just the element a
- belongs(A, a) : returns true if and only if \(a \in A\)
- union(A, B) : adds the elements of B to A (A becomes \(A \cup B\))
- intersection (A, B) : removes from A all those elements that are not in B (A becomes \(A \cap B\))
Alternative implementations: arrays, linked list, bitset, binary search trees, hashes, ...
ADT Dict(ionary) or Map
Elements: collection of key/value pairs
Operations:
- init/empty: creates an empty dictionary
- defined?(D, k) : returns true if k is a key defined in the dictionary D
- insert(D, k, v) : add the key-value pair (k,v) to the dictionary D. If the key k is defined in D, its previous value is replaced
- get(D, k) : returns the value associated to the given key k (which is assumed to be defined)
- delete(D, k) : removes the key/value pair corresponding to the given key k
Alternative implementations: arrays, linked list, binary search trees, hashes, ...
Algorithm
A good algorithm for sorting a small number of elements.
It works the way you might sort a hand of playing cards.
First example: insertion sort
We start with an empty left hand and the cards face down on the table.
We then remove one card at a time from the table and insert it into the correct position in the left hand.
To find the correct position for a card, we compare it with each of the cards already in the hand, from right to left.
At all times, the cards held in the left hand are sorted, and these cards were originally the top cards of the pile on the table.
Executing insertion sort algorithm
- Note that the prefix (yellow cells) is always sorted
- The remaining cells corresponds to the original elements of the array
- That property is essential to focus only on inserting the new element in the right place.
Insertion sort ()
INSERTION-SORT(A,n)
for i = 2 to n
key = A[i]
// Insert A[i] into the sorted subarray A[1: i - 1].
j = i - 1
while j > 0 and A[j] > key
A[j + 1] = A[j]
j = j - 1
A[j + 1] = key
Remarks about previous pseudocode
Following Cormen et al. array indexes start at 1
- Watch out! In the C/C++/Java array indexes start at 0!
The array A is sorted in place:
- the elements of the array are rearranged within the array
- there is a constant number of elements outside the array at any time (one, the value in variable key).
Contract of insertion sort
Precondition
- Pc = {\(|A| = n\)}
Notation: \(|A|\) stands for the length of A - It can be used for sorting any array
Postcondition
- Qc = perm(A,A0) && isSorted(A)
- A contains the same elements of the original array A0 in different order (perm(A,A0)); and
- A is sorted in ascending order (isSorted(A))
Correctness of insertion sort
- In order to prove that the algorithm is correct we need to show that every execution of the algorithm that starts from a state satisfying Pc terminates in a state that satisfies Qc
- How do we reason about loops?
Loop invariant
Loop Invariant A (logical) condition that must holds at each iteration
Loop invariants help us understand why a loop is correct
- It encodes the assumptions we made when writing the loop
- It explains the progress expected to be made by each iteration of the loop
Loop invariant
- Initialization: It is true prior to the first iteration of the loop
- Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
- Termination: When the loop terminates, the invariant helps show that the algorithm is correct.
Inv
while (B) {
Inv && B
loop body
Inv
}
Inv && not B
Loop invariant theorem
Theorem: If the following conditions hold
-
Initialization: Pc \(\Rightarrow\) Inv
-
Maintenance: Assuming Inv && B holds, the execution of the loop body makes Inv true again
-
Termination: Inv && not B \(\Rightarrow\) Qc
Then, if the loop terminates, then loop transforms a state satisfying Pc into another one that satisfies Qc
Notation
Pc: Condition on the program state that holds before the loop
Qc: Condition expected after the loop
Inv: Loop invariant
Pc --precondition
Inv
while (B) {
Inv && B
loop body
Inv
}
Inv && not B
Qc -- postcondition
Using loop invariants is like mathematical induction
To prove a property by induction
-
Prove a base case and an inductive step
-
Showing that the invariant holds before the first iteration is like the base case
- Showing that the invariant holds from iteration to iteration is like the inductive step
-
The termination part differs from mathematical induction, in which the inductive step is used infinitely
-
We stop when the loop terminates
- The three parts can be showed in any order
Simple example
Then we use theorem to prove it
-
Initialization \({\color{red}\text{Pc'}}\implies {\color{green}\text{Inv}}\)
From Pc', i = 0. Then, \(0 \le i\).
From Pc', n \(\ge\) 0. Then, i = \(0 \le n\). Hence, \(0 \le i \le n\)
From Pc', res = 1 = 0! = i! -
Maintenance: Assume \({\color{green}\text{Inv } \&\&\text{ not B}}\) Show \(\color{green}\text{Inv}\)
Name \(i_0\) and \(res_0\) the old values of i and res
Inv && B = { \(0 \leq i_0 < n\) && \(res_0 = i_0!\) }
\(i = i_0 + 1\) and \(res = res_0 * i\)
Since, \(0 \leq i_0 < n\), then \(0 \leq i_0 + 1 \leq n\).
Hence, \(0 \leq i \leq n\)
We know that \(res = res_0 * i\). Then,
\(res = res_0 * (i_0 + 1) = i_0! * (i_0 + 1) = (i_0 + 1)! = i!\)
-
Termination \({\color{green}\text{Inv } \&\&\text{ not B}}\Rightarrow{\color{red}\text{Qc'}}\)
not \((i < n)\) implies \((i \geq n)\).
From Inv, \(i \leq n\).
Hence, \(i = n\).
From Inv, res = \(i!\). Hence res = \(i! = n!\)