#!/usr/bin/env python
# coding: utf-8

# In[4]:


# setup
from IPython.core.display import display, HTML
display(HTML(''))
display(HTML(open('rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
get_ipython().run_line_magic('matplotlib', 'inline')
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize': (12, 6)})


# # CMPS 2200
# # Introduction to Algorithms
# 
# ## Greedy Algorithms, Unit-Task Scheduling, Knapsack

# Today's agenda:
# 
# - Framework for greedy algorithms
# - Unit-task scheduling
# - The Knapsack problem

# We previously looked at the Divide and Conquer framework. In that strategy, we used recurrences to give bounds on work/span and induction to prove correctness.
# 
# We will now look at *greedy* algorithms. The greedy framework is very simple:
# 
# - Let $\mathcal{X}$ be the set of possible choices for the solution. Initialize the solution $S=\emptyset$.
# - Select $x\in\mathcal{X}$ according to a greedy criterion $C(x)$ and set $S := S \cup \{x\}$, $\mathcal{X} := \mathcal{X} - \{x\}$.
# - Repeat until the solution is complete.

# Selection Sort was an example of a greedy strategy. What was the greedy criterion? Why was our algorithm correct?

# For Selection Sort, we can see that swapping the minimum element into the current position is a correct choice for the "optimal" solution (a sorted list). Moreover, selecting a minimum and then sorting the rest of the list recursively is also correct.

# In[5]:


def selection_sort(L):
    # greedy criterion: select the minimum of the remaining elements
    if len(L) <= 1:
        return L
    else:
        m = L.index(min(L))
        print('selecting minimum %s' % L[m])
        L[0], L[m] = L[m], L[0]
        print('recursively sorting L=%s\n' % L[1:])
        return [L[0]] + selection_sort(L[1:])

selection_sort([2, 1, 999, 4, 3])


# When would the greedy strategy yield a correct solution?
# 
# 1. **Optimal substructure**: An optimal solution can be constructed from optimal solutions of smaller subproblems.
# 2. **Greedy choice**: A greedy choice must be in some optimal solution (of a given size).
# 
# Put together, these two properties guarantee that the iterative strategy above constructs an optimal solution.
# 
# Due to their simplicity, greedy algorithms are easy to implement. However, proving the optimal substructure and greedy choice properties requires a problem-specific argument and can be tricky.

# ### Unit Task Scheduling
# 
# We already saw how a greedy scheduler could be used to schedule concurrent tasks. However, that greedy algorithm was not optimal. Let's consider the **unit task scheduling problem**, where we are given a set of $n$ tasks $A = \{a_0, \ldots, a_{n-1}\}$. Each task $a_i$ has a start time and a finish time, written $(s_i, f_i)$. The goal is to select a subset $S$ of tasks with no overlaps that is as large as possible.
# 

# Example: Let the tasks be $a_0=(0, 2), a_1=(2, 4), a_2=(4, 6), a_3=(2, 8)$. What is the largest set of non-overlapping tasks?
# 
# Here, we should choose $S = \{a_0, a_1, a_2\}$, since choosing $a_3$ "blocks" a large part of the solution.
# 
# Is there a greedy algorithm for the unit task scheduling problem?
# 
# Does this problem have optimal substructure? What should the greedy choice be?
# 
# **Optimal Substructure:** Since the tasks in a solution are non-overlapping, once we identify one task in an optimal solution, we can eliminate every task that overlaps it and recursively solve the problem on the remaining tasks.
# 
# Is there a greedy criterion with the greedy choice property?
# 
# What if we chose the shortest task first?
# 
# What if we chose the task that started earliest?
# 
# What if we successively chose the task with the fewest overlaps?
# 
# Each of these criteria has a counterexample. What if we chose the task that finished earliest?
# 
# $a_0=(0, 2), a_1=(2, 4), a_2=(4, 6), a_3=(2, 8)$
# 
# This works on all of our counterexamples.
# 
# What would the greedy choice property say here?
# 
# **Greedy Choice Property**: For any set of tasks, the task with the earliest finish time is in some optimal solution.
# 
# **Proof**: Given a set of tasks $A$, let $G$ be the greedy solution and let $O$ be an optimal solution, and suppose that $G\neq O$. Sort the tasks in each solution by finish time and let $a\in G$ and $a'\in O$ be the first pair of tasks with $a\neq a'$. So we have $G = \langle G_1, a, G_2\rangle$ and $O = \langle O_1, a', O_2\rangle$, where $O_1 = G_1$ since $a$ and $a'$ are the first tasks that differ. Because the greedy algorithm chooses the compatible task with the earliest finish time, $a$ finishes no later than $a'$. Now consider the solution $X = \langle G_1, a, O_2\rangle$. Since $O$ is a valid solution and $a$ finishes no later than $a'$, every task in $O_2$ is compatible with $\langle G_1, a\rangle$, so $X$ is a valid solution with $|X| = |O|$. Thus $X$ is also optimal and it contains $a$, which proves that $a$ is in some optimal solution. $\blacksquare$
# 
# So we've proven that the earliest-finish-time-first strategy produces an optimal solution. What about work/span?
# 
# Given a set of tasks, we can sort by finish time in $O(n\log n)$ work and $O(\log^2 n)$ span. Then we sequentially step through the tasks, adding a task to the solution if it doesn't overlap with the tasks already chosen. This takes $O(n)$ work/span. (A sketch implementation appears at the end of these notes.)

# ### The Knapsack Problem
# 
# Let's look at another optimization problem. Suppose there are $n$ objects, each with a *value* $v_i$ and a *weight* $w_i$. You have a "knapsack" of capacity $W$ and want to fill it with a set of objects $X \subseteq [n]$ so that $w(X) := \sum_{i\in X} w_i \leq W$ and $v(X) := \sum_{i\in X} v_i$ is maximized.
# 
# Example: Suppose you have 3 objects with values/weights $(10, 5), (6, 3), (6, 2)$ and $W=5$.
# 
# |value|weight|
# |-----|------|
# |10   |5     |
# |6    |3     |
# |6    |2     |
# 
# What if we greedily took the most valuable object first? (In this example, we would take only the first object, for a value of $10$.)
# 
# What if we greedily took the object with the highest per-unit value (i.e., maximum value/weight ratio)?
# 
# In the example above, we'd achieve a value of $12$, which is optimal. Does this always work?
# 
# Unfortunately, no. Consider this counterexample with 2 objects that have values/weights $(10, 5), (9.999, 3)$:
# 
# |value|weight|
# |-----|------|
# |10   |5     |
# |9.999|3     |
# 
# The ratio criterion takes the second object (value $9.999$), leaving no room for the first, while the optimal solution takes only the first object (value $10$).
# 
# Is there another greedy criterion?
# 
# We don't know of one (we'll see more of this problem in the next module). The problem is that a greedy algorithm, regardless of the criterion, selects objects one at a time without factoring the remaining capacity into future decisions.
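# Before turning to a variant of the problem, the next cell gives a quick numerical check of the two examples above. It is a sketch of our own (the helper name `greedy_01_by_ratio` is not from these notes): a 0/1 greedy that takes whole objects in decreasing order of value/weight ratio, skipping any object that no longer fits.

# In[ ]:


def greedy_01_by_ratio(items, W):
    """items: list of (value, weight) pairs; W: knapsack capacity.
    Greedily take whole items in decreasing order of value/weight ratio."""
    total = 0
    remaining = W
    for v, w in sorted(items, key=lambda x: x[0] / x[1], reverse=True):
        if w <= remaining:  # take the object only if it fits entirely
            total += v
            remaining -= w
    return total

# first example: the ratio greedy achieves the optimal value 12
print(greedy_01_by_ratio([(10, 5), (6, 3), (6, 2)], W=5))

# counterexample: the ratio greedy gets 9.999, but taking the first object alone gives 10
print(greedy_01_by_ratio([(10, 5), (9.999, 3)], W=5))
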
# What if we wanted to solve the **fractional** version of the problem, where we could take arbitrary amounts of each item?
# 
# Does our value/weight greedy criterion satisfy the greedy choice property here?
# 
# **Greedy Choice Property for Fractional Knapsack**: The item $j$ with the largest ratio $v_j/w_j$ is in some optimal solution to the fractional knapsack problem.
# 
# **Proof**: Suppose we are given $n$ objects with their values and weights, and let item $j$ have the maximum value/weight ratio. Let $O$ be an optimal solution and suppose $j\not\in O$. Let item $i$ have the maximum ratio $v_i/w_i$ among the items in $O$, with an $\alpha ~(0 < \alpha \leq 1)$ fraction present. Since $v_i/w_i \leq v_j/w_j$, we can substitute the $\alpha$ fraction of item $i$ with an equal weight of item $j$ (or with all of item $j$, if $\alpha w_i > w_j$) to obtain a solution $O'$ with $v(O') \geq v(O)$. Since $O$ is optimal, $O'$ is also optimal, and it contains item $j$. Thus item $j$ is in some optimal solution. $\blacksquare$
# 
# How would we implement our solution? We sort by value/weight ratio and take items in decreasing order of ratio until the knapsack is full, taking a fractional amount of the last item if necessary. The work is $O(n\lg n + n) = O(n\lg n)$ and the span is $O(\lg^2 n + n) = O(n)$.

# In[ ]:
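# The two cells below are sketch implementations (our own code, not part of the original notes; the helper names `fractional_knapsack` and `schedule_earliest_finish` are made up). The first implements the fractional knapsack greedy just described: take items in decreasing order of value/weight ratio, splitting the last item if it doesn't fit.

# In[ ]:


def fractional_knapsack(items, W):
    """items: list of (value, weight) pairs; W: knapsack capacity.
    Returns the best total value when arbitrary fractions of items may be taken."""
    total = 0.0
    remaining = W
    # take items in decreasing order of value/weight ratio
    for v, w in sorted(items, key=lambda x: x[0] / x[1], reverse=True):
        if remaining <= 0:
            break
        take = min(w, remaining)  # whole item if it fits, otherwise a fraction
        total += v * (take / w)
        remaining -= take
    return total

# the 0/1 counterexample from above: the fractional greedy takes all of the second
# object and 2/5 of the first, for a value of 9.999 + 4 = 13.999
print(fractional_knapsack([(10, 5), (9.999, 3)], W=5))


# The second cell is a sketch of the earliest-finish-time-first scheduler from the unit task scheduling section: sort the tasks by finish time, then take each task whose start time is no earlier than the finish time of the last task chosen.

# In[ ]:


def schedule_earliest_finish(tasks):
    """tasks: list of (start, finish) pairs.
    Returns a maximum-size set of non-overlapping tasks, chosen greedily by earliest finish time."""
    chosen = []
    last_finish = float('-inf')
    # O(n log n) work to sort by finish time, then a single O(n) pass
    for s, f in sorted(tasks, key=lambda t: t[1]):
        if s >= last_finish:  # compatible with every task chosen so far
            chosen.append((s, f))
            last_finish = f
    return chosen

# the example from above: a_0=(0, 2), a_1=(2, 4), a_2=(4, 6), a_3=(2, 8)
print(schedule_earliest_finish([(0, 2), (2, 4), (4, 6), (2, 8)]))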