Skip to content

Lecture 10.13 Introduction + Sorting

01 - Introduction + Sorting.pdf

Concepts

Functions Contracts(What do you want your function to do)

  • Pre- and post-conditions for functions
  • Loop invariants

  • Algorithms

  • Abstract Data types (ADT)

  • Interfaces

  • Data Structures (implementation of ADTs)

  • Representation for ADTs

  • Algorithms

Contracts

Contracts: An agreement between a programmer of a function and a user

  • A function name and parameters:

E.g., sqr_root(float n): float - Precondition: what the programmer assumes about the input data

  • Conditions over the function arguments
  • Facilitates the task of the programmer

    Only cares for the cases when precondition is true - E.g.: \(n \ge 0\)

    As a programmer I only care for \(n>0\) (rights)

    As a user I need to provide \(n>0\) (obligation) - Postcondition: properties of the output the programmer must ensure

  • Specify what (not how) function will compute if the the precondition holds

  • E.g., result * result = n

    As a programmer I have to provide the sqr_root (obligation)

    As a user I get the sqr_root (rights)

Taking advantages of contracts

Problem: given a sequence s and an element x, check if x belongs to s.

Input: a sequence s = [e1, ..., en] of elements and an element x

Output: if x belongs to s, index i such that \(e_i\) is x; -1 if x does not belong to s

linearSearch(seq s, elem x) -> int {
  x_index = -1
  i = 0
  while index == -1 and i < length(s) {
    if s[i] == x then x_index = i
    i = i + 1
  }
  return x_index
}

Then if we know s is sorted (precondition) , we'll have a faster way compared to linear search, that is Binary Search

binarySearch(seq s, elem x) -> int {
  low = 0
  high = length(s) - 1
  while low <= high {
    mid = (high-low)/2
    if s[mid] == x then return mid
    else {
      if s[mid] < x then low = mid + 1
      else high = mid - 1
    }
  }
  return -1
}

Taking advantages of contracts

  • linear Search

If s is an unsorted sequence ✅
sorted sequence ✅ - Binary Search (It assumes that s sorted)

If s is an unsorted sequence ❌ Violates precondition
sorted sequence ✅

Algorithm

A finite sequence of rigorous well-defined instructions, whose aim is to solve a specific problem specified in a contract

  • Algorithms describe computations
  • The contract can be formally or informally given

Important properties of an algorithm

  • Correctness:

Does the algorithm effectively solve the intended problem? - Complexity:

  • How good is a solution?
  • Time complexity: cost of execution (e.g., required steps)
  • Space complexity: memory usage
  • Complexity depends on the used data structures

Data structure: a way to store and organize data to facilitate access and modifications.

Abstract data type (ADT)

A conceptual model for data types where the implementation details are hidden from the user. It defines the set of elements of the type and the set of operations (i.e., functions) that can be performed on those elements, without any implementation detail.

Implementation of an ADT

A data structure provides a specific implementation of an ADT:

  • The representation of the elements of the data type
  • The implementation of the operations that operate on the representation.

They can be subject to the analysis of their running time and space properties.

A data structure is a high-level (i.e., design level) implementation of an abstract data type.

Relation ADT and Data Structure

An ADT typically admit multiple alternative implementations (Imp1, Imp2, ..., ImpN) :

Each implementation

  • relies on a data structure
  • can be subject to the analysis of their properties: time and space complexity

We can select the most convenient implementation by analyzing their properties

graph BT
    ADT("An ADT")

    Imp1("Imp1")
    Imp2("Imp2")
    Dots("...")
    ImpN("ImpN")

    Imp1 --> ADT
    Imp2 --> ADT
    ImpN --> ADT

Lists

Sets

Dictionaries (Maps)

Queues

Multisets (Bags)

Trees

Graphs

They are collections of elements organized in some specific way

ADT List

Elements: a linearly organized collection of elements

(Usual) Operations:

  • access(L, i) : access an element at the valid position i from the list L
  • insert(L, e, i) : adds an element e at the specified position i in the list L
  • remove(L, i) : removes the element at the valid position i from the list L
  • belongs(L, e) : checks if the element e is present in the list L. If it is, the function returns the position of e in the L

Alternative implementations: arrays, singly and doubly linked list, ...

There can be other operations!!! isEmpty?, size?, first, last, ...

The set of operations is part of the definition of the ADT

ADT Set

Elements: unordered collection of elements (multiplicity is irrelevant)

Operations:

  • init/empty: creates an empty set
  • singleton(a) : creates a set containing just the element a
  • belongs(A, a) : returns true if and only if \(a \in A\)
  • union(A, B) : adds the elements of B to A (A becomes \(A \cup B\))
  • intersection (A, B) : removes from A all those elements that are not in B (A becomes \(A \cap B\))

Alternative implementations: arrays, linked list, bitset, binary search trees, hashes, ...

ADT Dict(ionary) or Map

Elements: collection of key/value pairs

Operations:

  • init/empty: creates an empty dictionary
  • defined?(D, k) : returns true if k is a key defined in the dictionary D
  • insert(D, k, v) : add the key-value pair (k,v) to the dictionary D. If the key k is defined in D, its previous value is replaced
  • get(D, k) : returns the value associated to the given key k (which is assumed to be defined)
  • delete(D, k) : removes the key/value pair corresponding to the given key k

Alternative implementations: arrays, linked list, binary search trees, hashes, ...

Algorithm

A good algorithm for sorting a small number of elements.

It works the way you might sort a hand of playing cards.

First example: insertion sort

We start with an empty left hand and the cards face down on the table.

We then remove one card at a time from the table and insert it into the correct position in the left hand.

To find the correct position for a card, we compare it with each of the cards already in the hand, from right to left.

At all times, the cards held in the left hand are sorted, and these cards were originally the top cards of the pile on the table.

Executing insertion sort algorithm

image

  • Note that the prefix (yellow cells) is always sorted
  • The remaining cells corresponds to the original elements of the array
  • That property is essential to focus only on inserting the new element in the right place.

Insertion sort ()

INSERTION-SORT(A,n)
    for i = 2 to n
        key = A[i]
        // Insert A[i] into the sorted subarray A[1: i - 1].
        j = i - 1
        while j > 0 and A[j] > key
            A[j + 1] = A[j]
            j = j - 1
        A[j + 1] = key

Remarks about previous pseudocode

Following Cormen et al. array indexes start at 1

  • Watch out! In the C/C++/Java array indexes start at 0!

The array A is sorted in place:

  • the elements of the array are rearranged within the array
  • there is a constant number of elements outside the array at any time (one, the value in variable key).

Contract of insertion sort

Precondition

  • Pc = {\(|A| = n\)}

Notation: \(|A|\) stands for the length of A - It can be used for sorting any array

Postcondition

  • Qc = perm(A,A0) && isSorted(A)
  • A contains the same elements of the original array A0 in different order (perm(A,A0)); and
  • A is sorted in ascending order (isSorted(A))

Correctness of insertion sort

  • In order to prove that the algorithm is correct we need to show that every execution of the algorithm that starts from a state satisfying Pc terminates in a state that satisfies Qc
  • How do we reason about loops?

Loop invariant

Loop Invariant A (logical) condition that must holds at each iteration

Loop invariants help us understand why a loop is correct

  • It encodes the assumptions we made when writing the loop
  • It explains the progress expected to be made by each iteration of the loop

Loop invariant

  • Initialization: It is true prior to the first iteration of the loop
  • Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
  • Termination: When the loop terminates, the invariant helps show that the algorithm is correct.
Inv
while (B) {
    Inv && B
    loop body
    Inv
}
Inv && not B

Loop invariant theorem

Theorem: If the following conditions hold

  1. Initialization: Pc \(\Rightarrow\) Inv

  2. Maintenance: Assuming Inv && B holds, the execution of the loop body makes Inv true again

  3. Termination: Inv && not B \(\Rightarrow\) Qc

Then, if the loop terminates, then loop transforms a state satisfying Pc into another one that satisfies Qc

Notation
Pc: Condition on the program state that holds before the loop
Qc: Condition expected after the loop
Inv: Loop invariant

Pc --precondition
Inv
while (B) {
  Inv && B
  loop body
  Inv
}
Inv && not B
Qc -- postcondition

Using loop invariants is like mathematical induction

To prove a property by induction

  • Prove a base case and an inductive step

  • Showing that the invariant holds before the first iteration is like the base case

  • Showing that the invariant holds from iteration to iteration is like the inductive step
  • The termination part differs from mathematical induction, in which the inductive step is used infinitely

  • We stop when the loop terminates

  • The three parts can be showed in any order

Simple example

image

Then we use theorem to prove it

  1. Initialization \({\color{red}\text{Pc'}}\implies {\color{green}\text{Inv}}\)
    From Pc', i = 0. Then, \(0 \le i\).
    From Pc', n \(\ge\) 0. Then, i = \(0 \le n\). Hence, \(0 \le i \le n\)
    From Pc', res = 1 = 0! = i!

  2. Maintenance: Assume \({\color{green}\text{Inv } \&\&\text{ not B}}\) Show \(\color{green}\text{Inv}\)

    Name \(i_0\) and \(res_0\) the old values of i and res

    Inv && B = { \(0 \leq i_0 < n\) && \(res_0 = i_0!\) }

    \(i = i_0 + 1\) and \(res = res_0 * i\)

    Since, \(0 \leq i_0 < n\), then \(0 \leq i_0 + 1 \leq n\).

    Hence, \(0 \leq i \leq n\)

    We know that \(res = res_0 * i\). Then,

    \(res = res_0 * (i_0 + 1) = i_0! * (i_0 + 1) = (i_0 + 1)! = i!\)

  3. Termination \({\color{green}\text{Inv } \&\&\text{ not B}}\Rightarrow{\color{red}\text{Qc'}}\)

    not \((i < n)\) implies \((i \geq n)\).

    From Inv, \(i \leq n\).

    Hence, \(i = n\).

    From Inv, res = \(i!\). Hence res = \(i! = n!\)

image