Algorithms Midterm
Study Guide

Every sub-question from every problem, with complete worked answers.

Problem 1
Time Complexity & Master Theorem
Time Analysis
Problem 2
Matrix Chain Multiplication
Dynamic Programming
Problem 3
0/1 Knapsack
Dynamic Programming
Problem 4
Huffman Encoding
Greedy
Problem 5
Fractional Knapsack
Greedy
Exam rules: 3 of 5 problems will appear. Show all work — no work shown = max 70/100. Paper only, no laptops.

Problem 1 — Time Complexity of Matrix Multiplication

TIME ANALYSIS Three algorithms · Big-Theta · Master Theorem

Core Concept — What is Time Complexity?

Time complexity describes how the number of operations an algorithm performs scales with input size n. We use Big-Theta (Θ) to give a tight bound — meaning the runtime grows at exactly that rate, not faster, not slower, up to constant factors.

The key question for any algorithm is: what is the most deeply nested operation, and how many times does it execute? Everything else is secondary — we drop lower-order terms and constants because for large n, they don't matter.

Setup

You will analyze multiplying two n × n matrices using three different algorithms. For each, the goal is to count operations precisely, then express it in Big-Theta notation.

Part A — Iterative Matrix Multiplication

How iterative matrix multiplication works

To multiply two n×n matrices A and B into result matrix C, each entry C[i][j] is computed as the dot product of row i of A with column j of B. That dot product requires n multiplications (one per element pair). We do this for every (i,j) entry in C, of which there are n×n = n² entries.

The algorithm uses three nested loops: the outer two iterate over every (i,j) position in C, and the inner loop computes the dot product by accumulating n multiply-add operations. The innermost line is what we count.

The Algorithm
for i = 1 to n                                  ← outer loop: select each row i of result matrix C (runs n times)
    for j = 1 to n                              ← middle loop: select each column j of result C (runs n times per i → n² pairs)
        C[i][j] = 0                             ← zero the accumulator before computing the dot product for this entry
        for k = 1 to n                          ← inner loop: step through the shared dimension (runs n times per (i,j) → n³ total)
            C[i][j] = C[i][j] + A[i][k] * B[k][j]   ← multiply A's row-i element by B's col-j element and add to accumulator ← count this

Time: Θ(n³) — three nested loops of size n give n³ multiply-add operations. Space: Θ(n²) — result matrix C stores n² values; input matrices are read-only.
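The pseudocode above runs essentially unchanged in Python. A minimal sketch (illustration only, not exam material) with an operation counter that confirms the n³ multiply-add count:

```python
def matmul(A, B):
    """Multiply two n x n matrices with the triple loop; also count multiplications."""
    n = len(A)
    C = [[0] * n for _ in range(n)]   # accumulators start at zero
    ops = 0
    for i in range(n):                # outer loop: rows of C
        for j in range(n):            # middle loop: columns of C
            for k in range(n):        # inner loop: the length-n dot product
                C[i][j] += A[i][k] * B[k][j]   # innermost multiply-add: runs n^3 times
                ops += 1
    return C, ops

# 2x2 sanity check: identity times A returns A, and ops == n^3 = 8
A = [[1, 2], [3, 4]]
I = [[1, 0], [0, 1]]
C, ops = matmul(I, A)
print(C, ops)  # [[1, 2], [3, 4]] 8
```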

A·Q1
How many times does the innermost multiplication operation execute?

Work through the loops from outside in. Each loop multiplies the count:

  • Outer loop (i) runs n times
  • Middle loop (j) runs n times per i iteration → n² total (i,j) pairs
  • Inner loop (k) runs n times per (i,j) pair
  • Total = n × n × n = n³ multiplications

Intuition: there are n² entries in C, each requiring a dot product of length n. That's n² × n = n³.

A·Q2
Express the running time as a function of n.

Each loop iteration performs constant work (one multiply-add). Since the innermost line runs n³ times total, and the initialization C[i][j] = 0 adds only n² work (dominated), the running time is T(n) = n³ up to lower-order terms.

A·Q3
Give the final time complexity using Big-Theta notation.
T(n) = Θ(n³)
Why Big-Theta and not Big-O?

Big-O is an upper bound. Big-Theta is a tight bound — it says the runtime is exactly cubic growth. For iterative matrix multiplication, the loops always run to completion, so the best and worst cases are the same. Theta is the right notation.


Part B — Recursive Divide and Conquer

The divide and conquer idea

Instead of computing each entry one-by-one, divide and conquer breaks the n×n matrices into four (n/2 × n/2) blocks. The product can then be expressed in terms of products of these sub-blocks. This recursion bottoms out at 1×1 matrices (a single scalar multiply).

The key insight: to compute the four blocks of the result matrix C using this block formulation, you need 8 recursive multiplications of (n/2 × n/2) matrices, plus matrix addition work to combine results. Addition of two n×n matrices costs Θ(n²) because you touch every entry exactly once.

The question is: does splitting into subproblems actually reduce the total work? The Master Theorem will tell us.

Setup

Matrices are split into four (n/2 × n/2) blocks and multiplied recursively. This requires 8 recursive multiplications plus addition work costing Θ(n²).

B·Q1
How many recursive matrix multiplications occur?

Each n×n matrix is divided into four (n/2 × n/2) blocks. To compute the product using block multiplication, 8 recursive multiplications are performed — this comes directly from the block matrix multiplication formula (each of the 4 result blocks requires 2 sub-multiplications).

B·Q2
What is the size of each recursive subproblem?

Each recursive call works on matrices of size n/2 × n/2. We halved both dimensions, so the problem size is n/2.

B·Q3
Write the recurrence relation T(n).
T(n) = 8T(n/2) + Θ(n²)
Reading the recurrence

T(n) is the total work for size n. The "8T(n/2)" term accounts for making 8 recursive calls on half-size problems. The "Θ(n²)" term is the work done at this level: adding/subtracting the sub-matrices to combine results. Every recurrence for divide and conquer follows this pattern: subproblem work + combine work.
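The block formulation behind this recurrence can be sketched in Python. This is an illustration assuming n is a power of two; the helper names `split` and `add` are mine, not from the problem. Note the 8 recursive calls (the 8T(n/2) term) and the Θ(n²) addition work (the combine term):

```python
def block_matmul(A, B):
    """Divide-and-conquer n x n multiply (n a power of two): 8 half-size products."""
    n = len(A)
    if n == 1:                        # base case: 1x1 matrices are scalars
        return [[A[0][0] * B[0][0]]]
    h = n // 2

    def split(M):                     # four (n/2 x n/2) blocks of M
        return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
                [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

    def add(X, Y):                    # matrix addition: the Θ(n²) combine work
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # each of the 4 result blocks needs 2 sub-multiplications → 8 recursive calls
    C11 = add(block_matmul(A11, B11), block_matmul(A12, B21))
    C12 = add(block_matmul(A11, B12), block_matmul(A12, B22))
    C21 = add(block_matmul(A21, B11), block_matmul(A22, B21))
    C22 = add(block_matmul(A21, B12), block_matmul(A22, B22))
    # stitch the four blocks back into one n x n matrix
    return [r1 + r2 for r1, r2 in zip(C11, C12)] + \
           [r1 + r2 for r1, r2 in zip(C21, C22)]
```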

B·Q4
Use the Master Theorem to solve the recurrence. Show a, b, f(n), and which case applies.

Master Theorem — T(n) = aT(n/b) + f(n)

The Master Theorem solves recurrences of this form by comparing f(n) (the work done outside recursion) against n^(log_b a) (which represents the total work done at the leaves of the recursion tree).

  • a = number of subproblems spawned per call
  • b = factor by which input size shrinks
  • f(n) = non-recursive work per level
  • c = log_b(a) = critical exponent, the "balance point"
  • a = 8 — 8 recursive subproblems
  • b = 2 — problem size halves each time
  • f(n) = Θ(n²) — the matrix addition work

Critical exponent: c = log₂(8) = 3


Compare f(n) = n² with n³: f(n) grows slower than n³ → Case 1 applies

Master Theorem Case 1

If f(n) = O(n^(c−ε)) for some ε > 0, where c = log_b(a), then T(n) = Θ(n^c). This case means the recursion tree's leaves dominate — most of the work happens at the bottom of the tree, not at the top.

B·Q5
State the final time complexity.
T(n) = Θ(n³)
Key Insight

Exactly the same as the brute-force iterative algorithm. Dividing into subproblems didn't help here because we still spawn 8 subproblems — the recursion tree has the same total leaf work as just doing it iteratively. The overhead of divide and conquer didn't buy us anything when a=8.


Part C — Strassen's Algorithm

The algebraic trick behind Strassen

Strassen's key insight is that you can compute the 4 block sub-matrices of C using only 7 multiplications instead of 8, by cleverly reusing intermediate products. This requires computing 7 "helper" matrices (M₁–M₇) using sums/differences of blocks first, then combining them. Each of those 7 helper computations is a single (n/2 × n/2) matrix multiply.

The extra additions and subtractions needed to set up M₁–M₇ and combine them still cost Θ(n²). So the recurrence looks almost identical — except a drops from 8 to 7. That one change has a surprisingly large effect on the exponent.

Setup

Strassen's algorithm reduces recursive multiplications from 8 to 7 via algebraic manipulation. Matrix additions/subtractions still cost Θ(n²).

C·Q1
Write the recurrence relation for Strassen's algorithm.
T(n) = 7T(n/2) + Θ(n²)
Only change from Part B

a = 7 instead of 8. Everything else is identical. Saving one recursive call changes the leaf-level work count from 8^depth to 7^depth — which drastically reduces the total work for large n.

C·Q2
Identify the parameters a, b, and f(n) for the Master Theorem.
  • a = 7 — 7 recursive subproblems (Strassen's trick)
  • b = 2 — problem size still halves each time
  • f(n) = Θ(n²) — matrix additions/subtractions at each level
C·Q3
Apply the Master Theorem. Compute the critical exponent and determine which case applies.

Critical exponent: c = log₂(7) ≈ 2.807


Compare f(n) = n² with n^2.807: f(n) = n² grows slower than n^2.807 → Case 1 applies.

Because the leaf-level work dominates f(n), T(n) = Θ(n^log₂7).

C·Q4
Determine the final time complexity.
T(n) = Θ(n^log₂7) ≈ Θ(n^2.807)
Key Insight

One fewer multiplication changed the exponent from 3.0 to ≈2.807. This might seem small, but for large n the difference is enormous. For n = 1000, n³ = 10⁹ while n^2.807 ≈ 2.6 × 10⁸ — nearly 4× fewer operations. Strassen is asymptotically faster than standard matrix multiplication, which is why it matters theoretically even though constants make it impractical for small n.
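A quick numeric check of the exponent comparison (illustrative only):

```python
import math

c = math.log2(7)                      # Strassen's critical exponent log₂7
print(round(c, 3))                    # 2.807

n = 1000
standard = n ** 3                     # scale of the Θ(n³) operation count
strassen = n ** c                     # scale of the Θ(n^log₂7) operation count
print(f"{standard:.1e} vs {strassen:.1e} (ratio {standard / strassen:.1f}x)")
```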

Master Theorem Cheat Sheet

T(n) = aT(n/b) + f(n), let c = log_b(a)

  • Case 1: f(n) = O(n^(c−ε)) for some ε > 0 → leaves dominate → T(n) = Θ(n^c)
  • Case 2: f(n) = Θ(n^c) → all levels equal → T(n) = Θ(n^c log n)
  • Case 3: f(n) = Ω(n^(c+ε)) for some ε > 0, plus the regularity condition a·f(n/b) ≤ k·f(n) for some k < 1 → top level dominates → T(n) = Θ(f(n))
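The cheat sheet can be encoded as a tiny answer-checking helper (my own utility, not part of the course material; it handles only polynomial driving functions f(n) = Θ(n^k)):

```python
import math

def master(a, b, k):
    """Classify T(n) = a·T(n/b) + Θ(n^k); returns (case number, solution)."""
    c = math.log(a, b)                     # critical exponent log_b(a)
    if math.isclose(k, c):
        return 2, f"Θ(n^{k} log n)"        # Case 2: all levels contribute equally
    if k < c:
        return 1, f"Θ(n^{round(c, 3)})"    # Case 1: leaves dominate
    return 3, f"Θ(n^{k})"                  # Case 3: root dominates (regularity holds for n^k)

print(master(8, 2, 2))  # Part B: Case 1
print(master(7, 2, 2))  # Strassen: Case 1
print(master(2, 2, 1))  # merge sort: Case 2
```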

Problem 2 — Matrix Chain Multiplication

DYNAMIC PROGRAMMING Bottom-up · m and s tables · Parenthesization

Core Concept — Why Parenthesization Matters

Matrix multiplication is associative: (A×B)×C = A×(B×C). But the cost — number of scalar multiplications — depends heavily on the order you parenthesize. Different orderings can differ by orders of magnitude in cost.

This is a dynamic programming problem because: (1) it has optimal substructure — the optimal solution to the full chain includes optimal solutions to sub-chains, and (2) sub-chains overlap — the same sub-chain appears in many different full-chain calculations, so we cache results in the m table rather than recomputing.

We fill the table in order of increasing chain length, so that when we compute m[i][j], all shorter sub-chains are already solved.

Problem Statement

Given: A₁(7×2), A₂(2×9), A₃(9×6), A₄(6×5)
Find the minimum scalar multiplications to compute A₁×A₂×A₃×A₄ and the optimal parenthesization.
Fill tables: m[i][j] (min cost) and s[i][j] (split point k).

STEP 1
Define the dimensions array p[].
Matrix Aᵢ has dimensions p[i−1] × p[i]. What is p[] for these four matrices?
p = [7, 2, 9, 6, 5]
Why?

p[0]=7, p[1]=2 → A₁ is 7×2  |  p[1]=2, p[2]=9 → A₂ is 2×9  |  p[2]=9, p[3]=6 → A₃ is 9×6  |  p[3]=6, p[4]=5 → A₄ is 6×5. Notice how adjacent dimensions share a value — this is what makes the matrices compatible for multiplication.

STEP 2
Initialize the m[i][j] and s[i][j] tables.
What values go on the diagonal? Why?

Set m[i][i] = 0 for all i. A single matrix requires no multiplication — there is nothing to multiply. The cost of "multiplying" a chain of length 1 is zero by definition. All other cells start empty (to be filled).

m[i][j] | j=1 | j=2 | j=3 | j=4
  i=1   |  0  |  ?  |  ?  |  ?
  i=2   |     |  0  |  ?  |  ?
  i=3   |     |     |  0  |  ?
  i=4   |     |     |     |  0
STEP 3
State the recurrence formula for m[i][j]. What does k represent?
m[i][j] = min over all k where i ≤ k < j of:
m[i][k] + m[k+1][j] + p[i−1] × p[k] × p[j]

What k means:

  • k is the split point — where you divide the chain Aᵢ...Aⱼ into two halves
  • Left side: Aᵢ...Aₖ → already optimally computed, cost = m[i][k]
  • Right side: Aₖ₊₁...Aⱼ → already optimally computed, cost = m[k+1][j]
  • Merge cost: multiplying the two resulting matrices costs p[i−1] × p[k] × p[j] (dimensions of the left result × right result)
STEP 4a
Compute chain length = 2: fill m[1][2], m[2][3], m[3][4] and the corresponding s values.

m[1][2]: Only split is k=1
= m[1][1] + m[2][2] + p[0]×p[1]×p[2] = 0 + 0 + 7×2×9 = 126 → s[1][2] = 1


m[2][3]: Only split is k=2
= 0 + 0 + p[1]×p[2]×p[3] = 2×9×6 = 108 → s[2][3] = 2


m[3][4]: Only split is k=3
= 0 + 0 + p[2]×p[3]×p[4] = 9×6×5 = 270 → s[3][4] = 3

m[i][j] | j=1 | j=2 | j=3 | j=4
  i=1   |  0  | 126 |  ?  |  ?
  i=2   |     |  0  | 108 |  ?
  i=3   |     |     |  0  | 270
  i=4   |     |     |     |  0
STEP 4b
Compute chain length = 3: fill m[1][3] and m[2][4]. Show all k values tried.

m[1][3] — try k=1 and k=2:

  • k=1: m[1][1] + m[2][3] + p[0]×p[1]×p[3] = 0 + 108 + 7×2×6 = 108 + 84 = 192
  • k=2: m[1][2] + m[3][3] + p[0]×p[2]×p[3] = 126 + 0 + 7×9×6 = 126 + 378 = 504

Min = 192 at k=1 → m[1][3] = 192, s[1][3] = 1


m[2][4] — try k=2 and k=3:

  • k=2: m[2][2] + m[3][4] + p[1]×p[2]×p[4] = 0 + 270 + 2×9×5 = 270 + 90 = 360
  • k=3: m[2][3] + m[4][4] + p[1]×p[3]×p[4] = 108 + 0 + 2×6×5 = 108 + 60 = 168

Min = 168 at k=3 → m[2][4] = 168, s[2][4] = 3

m[i][j] | j=1 | j=2 | j=3 | j=4
  i=1   |  0  | 126 | 192 |  ?
  i=2   |     |  0  | 108 | 168
  i=3   |     |     |  0  | 270
  i=4   |     |     |     |  0
STEP 4c
Compute chain length = 4: fill m[1][4]. Show all k values tried.

m[1][4] — try k=1, k=2, k=3:

  • k=1: m[1][1] + m[2][4] + p[0]×p[1]×p[4] = 0 + 168 + 7×2×5 = 168 + 70 = 238
  • k=2: m[1][2] + m[3][4] + p[0]×p[2]×p[4] = 126 + 270 + 7×9×5 = 396 + 315 = 711
  • k=3: m[1][3] + m[4][4] + p[0]×p[3]×p[4] = 192 + 0 + 7×6×5 = 192 + 210 = 402

Min = 238 at k=1 → m[1][4] = 238, s[1][4] = 1

Minimum scalar multiplications = 238
FINAL
Write the completed m[i][j] and s[i][j] tables.

m[i][j] — minimum cost:

m[i][j] | j=1 | j=2 | j=3 | j=4
  i=1   |  0  | 126 | 192 | 238
  i=2   |     |  0  | 108 | 168
  i=3   |     |     |  0  | 270
  i=4   |     |     |     |  0

s[i][j] — optimal split point:

s[i][j] | j=2 | j=3 | j=4
  i=1   |  1  |  1  |  1
  i=2   |     |  2  |  3
  i=3   |     |     |  3
STEP 5
Use the s table to construct the optimal parenthesization step-by-step.

Read the s table recursively: s[i][j] tells you where to split the chain from i to j.

  1. Full chain A₁–A₄: s[1][4] = 1 → split after A₁ → (A₁) × (A₂A₃A₄)
  2. Right subchain A₂–A₄: s[2][4] = 3 → split after A₃ → (A₂A₃) × A₄
  3. Left subchain A₂–A₃: s[2][3] = 2 → split after A₂ → (A₂) × (A₃)
Optimal Parenthesization

( A₁ × ((A₂ × A₃) × A₄) ) — total cost: 238 scalar multiplications

Pseudocode
for len = 2 to n:                      ← iterate over chain lengths 2..n (shortest sub-chains first)
    for i = 1 to n-len+1:              ← for each valid starting index i of a chain of this length
        j = i + len - 1                ← compute the ending index j of the sub-chain [i..j]
        m[i][j] = ∞                    ← init cost to infinity; we will minimize over all split points k
        for k = i to j-1:              ← try every split point k: left side is [i..k], right side is [k+1..j]
            q = m[i][k] + m[k+1][j] + p[i-1]·p[k]·p[j]   ← cost = left sub-chain + right sub-chain + cost to merge the two resulting matrices
            if q < m[i][j]:            ← if this split is cheaper than the current best...
                m[i][j] = q            ← ...update the minimum cost for chain [i..j]
                s[i][j] = k            ← ...record the optimal split point for traceback

Time: Θ(n³) — three nested loops (chain length, start index, split point) each up to n → O(n³) total. Space: Θ(n²) — two n×n tables, m[i][j] for costs and s[i][j] for split points.
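The pseudocode above, together with the STEP 5 traceback, runs as Python (the function names matrix_chain and parens are mine). On the exam's p = [7, 2, 9, 6, 5] it reproduces the cost 238 and the optimal parenthesization:

```python
import math

def matrix_chain(p):
    """Bottom-up MCM: p is the dimensions array, matrix A_i is p[i-1] x p[i]."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][j]: min scalar multiplications
    s = [[0] * (n + 1) for _ in range(n + 1)]   # s[i][j]: optimal split point k
    for length in range(2, n + 1):              # chain lengths 2..n, shortest first
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):               # try every split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return m, s

def parens(s, i, j):
    """Rebuild the optimal parenthesization by reading the s table recursively."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return f"({parens(s, i, k)} x {parens(s, k + 1, j)})"

m, s = matrix_chain([7, 2, 9, 6, 5])
print(m[1][4], parens(s, 1, 4))  # 238 (A1 x ((A2 x A3) x A4))
```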

Problem 3 — 0/1 Knapsack

DYNAMIC PROGRAMMING Include/exclude decision · Build table row by row · Traceback

Core Concept — Why DP is Necessary

In 0/1 Knapsack, each item is either taken whole or not at all. A greedy approach (take best ratio first) does not work here — it can get stuck with a suboptimal combination when partial items aren't allowed.

The DP approach builds a table P[i][w] that answers: "what is the maximum profit using only items 1 through i, with total weight at most w?" By building this up from i=0 and w=0, each cell can be computed using previously solved sub-problems in O(1) time, giving an overall O(nW) algorithm.

At each cell, you make one binary decision: include item i (if it fits) or exclude it. You take the max of the two choices.

Problem Statement

Knapsack capacity: W = 8 lbs, n = 4 items. Maximize profit. Each item taken whole or not at all.

Item | Profit | Weight
  1  |   1    |   2
  2  |   2    |   3
  3  |   5    |   4
  4  |   6    |   5
STEP 1
Define the profit and weight arrays. How large is the DP table?

Profit = [1, 2, 5, 6]    Weight = [2, 3, 4, 5]


Table P[i][w]: i = items considered (0 to n=4), w = capacity (0 to W=8)

Dimensions: (n+1) × (W+1) = 5 × 9

Initialize all to 0. Row 0 (no items) is trivially all zeros — with no items available, profit is always 0.

STEP 2
State the recurrence formula. When do you include vs. exclude an item?
Recurrence

If Weight[i] > w (item too heavy):   P[i][w] = P[i−1][w]

Otherwise:   P[i][w] = max( P[i−1][w] , Profit[i] + P[i−1][w − Weight[i]] )

Exclude: take best profit from row above (same capacity) = P[i−1][w]. We pretend item i doesn't exist.

Include: take this item's profit, then look up the best profit achievable with the remaining capacity = Profit[i] + P[i−1][w − Weight[i]]. We "use up" Weight[i] capacity and get Profit[i] in return.

Take the max of the two options — DP always records the best achievable.

STEP 3a
Fill Row 1 — Item 1 (Profit=1, Weight=2). Show your calculations.

w=0,1: Weight[1]=2 > w → can't fit → copy 0 from row above

w=2: max(P[0][2], 1 + P[0][0]) = max(0, 1+0) = 1

w=3: max(P[0][3], 1 + P[0][1]) = max(0, 1+0) = 1

w=4–8: same logic, always max(0, 1) = 1

i\w |  0  1  2  3  4  5  6  7  8
 0  |  0  0  0  0  0  0  0  0  0
 1  |  0  0  1  1  1  1  1  1  1
STEP 3b
Fill Row 2 — Item 2 (Profit=2, Weight=3). Show your calculations.

w=0,1,2: Weight[2]=3 > w → copy row above: 0, 0, 1

w=3: max(P[1][3], 2+P[1][0]) = max(1, 2+0) = 2

w=4: max(P[1][4], 2+P[1][1]) = max(1, 2+0) = 2

w=5: max(P[1][5], 2+P[1][2]) = max(1, 2+1) = 3

w=6: max(P[1][6], 2+P[1][3]) = max(1, 2+1) = 3

w=7: max(P[1][7], 2+P[1][4]) = max(1, 2+1) = 3

w=8: max(P[1][8], 2+P[1][5]) = max(1, 2+1) = 3

i\w |  0  1  2  3  4  5  6  7  8
 1  |  0  0  1  1  1  1  1  1  1
 2  |  0  0  1  2  2  3  3  3  3
STEP 3c
Fill Row 3 — Item 3 (Profit=5, Weight=4). Show your calculations.

w=0–3: Weight[3]=4 > w → copy row above: 0,0,1,2

w=4: max(P[2][4], 5+P[2][0]) = max(2, 5+0) = 5

w=5: max(P[2][5], 5+P[2][1]) = max(3, 5+0) = 5

w=6: max(P[2][6], 5+P[2][2]) = max(3, 5+1) = 6

w=7: max(P[2][7], 5+P[2][3]) = max(3, 5+2) = 7

w=8: max(P[2][8], 5+P[2][4]) = max(3, 5+2) = 7

i\w |  0  1  2  3  4  5  6  7  8
 2  |  0  0  1  2  2  3  3  3  3
 3  |  0  0  1  2  5  5  6  7  7
STEP 3d
Fill Row 4 — Item 4 (Profit=6, Weight=5). Show your calculations. What is the maximum profit?

w=0–4: Weight[4]=5 > w → copy row above: 0,0,1,2,5

w=5: max(P[3][5], 6+P[3][0]) = max(5, 6+0) = 6

w=6: max(P[3][6], 6+P[3][1]) = max(6, 6+0) = 6

w=7: max(P[3][7], 6+P[3][2]) = max(7, 6+1) = 7

w=8: max(P[3][8], 6+P[3][3]) = max(7, 6+2) = 8 ← final answer

i\w |  0  1  2  3  4  5  6  7  8
 0  |  0  0  0  0  0  0  0  0  0
 1  |  0  0  1  1  1  1  1  1  1
 2  |  0  0  1  2  2  3  3  3  3
 3  |  0  0  1  2  5  5  6  7  7
 4  |  0  0  1  2  5  6  6  7  8
Maximum Profit = P[4][8] = 8
STEP 4
Perform the traceback. Starting from P[4][8], determine which items were included.

Traceback rule: at P[i][w], if P[i][w] ≠ P[i-1][w], then item i was included. Subtract its weight and move to row i-1. If equal, item i was excluded — move up a row, same column.

  1. P[4][8]=8 vs P[3][8]=7 → differentItem 4 INCLUDED. w = 8−5 = 3. Go to P[3][3].
  2. P[3][3]=2 vs P[2][3]=2 → sameItem 3 NOT included. Stay at w=3. Go to P[2][3].
  3. P[2][3]=2 vs P[1][3]=1 → differentItem 2 INCLUDED. w = 3−3 = 0. Go to P[1][0].
  4. P[1][0]=0 → base case reached. Done.
Final Answer

Item 2 (Profit=2, Weight=3) + Item 4 (Profit=6, Weight=5)
Total Weight = 8 lbs ✓   Total Profit = 8

Pseudocode
for i = 1 to n:                        ← consider each item i in turn (row by row)
    for w = 0 to W:                    ← fill every capacity column 0..W for this row
        if Weight[i] > w:              ← item i is too heavy to fit in a knapsack of capacity w
            P[i][w] = P[i-1][w]        ← exclude item i: best profit is same as without it
        else:                          ← item i could fit; decide whether to include it
            P[i][w] = max(P[i-1][w],                         ← option 1 — exclude: carry forward best profit without item i
                          Profit[i] + P[i-1][w - Weight[i]]) ← option 2 — include: gain Profit[i], use remaining capacity w-Weight[i]

Time: Θ(nW) — two nested loops: n items × W+1 capacities, each cell filled in O(1). Space: Θ(nW) — the full (n+1)×(W+1) table P; reducible to Θ(W) if only the final value is needed (not the traceback).
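The table fill plus the STEP 4 traceback as runnable Python (a sketch; arrays are 0-indexed internally, but items are reported 1-indexed to match the tables above):

```python
def knapsack01(profit, weight, W):
    """DP table P[i][w] plus traceback; returns (max profit, included items)."""
    n = len(profit)
    P = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            if weight[i - 1] > w:                    # too heavy: must exclude
                P[i][w] = P[i - 1][w]
            else:                                    # max(exclude, include)
                P[i][w] = max(P[i - 1][w],
                              profit[i - 1] + P[i - 1][w - weight[i - 1]])
    items, w = [], W
    for i in range(n, 0, -1):                        # traceback from P[n][W]
        if P[i][w] != P[i - 1][w]:                   # value changed → item i was taken
            items.append(i)
            w -= weight[i - 1]
    return P[n][W], sorted(items)

print(knapsack01([1, 2, 5, 6], [2, 3, 4, 5], 8))  # (8, [2, 4])
```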

Problem 4 — Huffman Encoding

GREEDY Frequency count · Min-heap tree · Variable-length codes · Compression

Core Concept — Variable-Length Encoding

In a fixed-length encoding, every character gets the same number of bits (e.g., ASCII uses 8). Huffman encoding assigns shorter codes to more frequent characters and longer codes to rare ones, minimizing the total bits needed to encode a message.

It's greedy: at each step, merge the two lowest-frequency nodes in the priority queue into a new internal node. This builds a binary tree where the path from root to a leaf is the code for that character. The greedy choice (always merge the two smallest) is provably optimal.

The result is a prefix-free code — no code is a prefix of another — which means encoded bitstrings can be decoded unambiguously without delimiters.

Problem Statement

Apply Huffman compression to the string "balloon" (case-insensitive).
Steps: (a) count frequencies, (b) build tree, (c) assign codes, (d) encode string, (e) compare with fixed-length encoding.

Part a
a
Count the frequency of each character in "balloon" (case-insensitive).
Character | Frequency
    b     |     1
    a     |     1
    l     |     2
    o     |     2
    n     |     1
Totals

7 total characters, 5 unique characters. b-a-l-l-o-o-n: b=1, a=1, l=2, o=2, n=1.

Part b
b
Build the Huffman tree. Show each merge step using the min-heap (priority queue).

Start: insert all as leaf nodes

Queue: b(1), a(1), n(1), l(2), o(2)


Merge 1: Take two smallest → b(1) + n(1) → internal node bn(2)

Queue: a(1), bn(2), l(2), o(2)


Merge 2: Take two smallest → a(1) + bn(2) → internal node abn(3)

Queue: l(2), o(2), abn(3)


Merge 3: Take two smallest → l(2) + o(2) → internal node lo(4)

Queue: abn(3), lo(4)


Merge 4: abn(3) + lo(4) → root(7)

Final Tree Structure
          [7]
         /   \
      [4]     [3]
      / \     /  \
    l:2 o:2 a:1  [2]
                 /  \
               b:1  n:1
Why this shape?

l and o have the highest frequency (2), so they end up closest to the root and get the shortest codes. b and n are deepest, getting the longest codes. This assignment minimizes total bits: frequent chars pay less per occurrence.

Part c
c
Assign binary codes to each character. (left = 0, right = 1)

Traverse from root: left = 0, right = 1

  • root → left → left → l = 00
  • root → left → right → o = 01
  • root → right → left → a = 10
  • root → right → right → left → b = 110
  • root → right → right → right → n = 111
Char | Freq | Code | Bits
  l  |  2   |  00  |  2
  o  |  2   |  01  |  2
  a  |  1   |  10  |  2
  b  |  1   | 110  |  3
  n  |  1   | 111  |  3
Part d
d
Encode the string "balloon" using your Huffman codes. What is the encoded bitstring and total bit count?

b → 110

a → 10

l → 00

l → 00

o → 01

o → 01

n → 111

110 10 00 00 01 01 111 = 1101000000101111
Total Bits = 16
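The code table and the 16-bit total can be verified mechanically. A short sketch that checks the prefix-free property and reproduces the part-d bitstring from the part-c codes:

```python
codes = {"l": "00", "o": "01", "a": "10", "b": "110", "n": "111"}

# prefix-free check: no code may be a prefix of another character's code
for c1, x in codes.items():
    for c2, y in codes.items():
        assert c1 == c2 or not y.startswith(x), "not prefix-free"

encoded = "".join(codes[ch] for ch in "balloon")
print(encoded, len(encoded))  # 1101000000101111 16
```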
Part e
e
Compare with fixed-length encoding. How many bits does fixed-length need? How much does Huffman save?

Fixed-length formula: ⌈log₂(N)⌉ bits per character, where N = number of unique characters


5 unique characters → ⌈log₂(5)⌉ = ⌈2.32⌉ = 3 bits per character


Total: 7 characters × 3 bits = 21 bits

Method       | Total Bits
Fixed-length | 21 bits
Huffman      | 16 bits
Savings      | 5 bits (≈24% smaller)
Conclusion

Huffman reduces size from 21 to 16 bits because l, o, and a get 2-bit codes instead of 3-bit ones: l and o each save 1 bit on 2 occurrences (4 bits), and a saves 1 bit on its single occurrence, which accounts for exactly the 5-bit improvement.

Pseudocode
Huffman(C):
    n = |C|                          ← number of unique characters (leaf nodes to create)
    Q = priority_queue(C)            ← insert all characters as leaf nodes; min-heap ordered by frequency
    for i = 1 to n-1:                ← perform n-1 merges to reduce n nodes down to 1 root
        z = new_node()               ← allocate a new internal (non-leaf) node
        z.left = x = Extract-Min(Q)  ← pull the lowest-frequency node; it becomes the left child
        z.right = y = Extract-Min(Q) ← pull the next lowest-frequency node; it becomes the right child
        z.freq = x.freq + y.freq     ← internal node's frequency = sum of its two children's frequencies
        Insert(Q, z)                 ← push the merged node back into the heap for future merges
    return Extract-Min(Q)            ← only one node remains — it is the root of the Huffman tree

Time: Θ(n log n) — n−1 iterations, each with two Extract-Min and one Insert on a heap of size ≤n → O(log n) each. Space: Θ(n) — the tree has 2n−1 nodes total (n leaves + n−1 internal nodes).
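The pseudocode maps directly onto Python's heapq. The compact variant below is my own simplification: it carries code dictionaries through the heap instead of explicit tree nodes. Tie-breaking in the heap may produce different individual codes than part c, but the total encoded length is still the optimal 16 bits:

```python
import heapq

def huffman_codes(freq):
    """Build Huffman codes with a min-heap; returns {char: bitstring}."""
    # each heap entry: (frequency, unique tiebreaker, codes-so-far for this subtree)
    heap = [(f, i, {ch: ""}) for i, (ch, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)                      # unique ids so dicts are never compared
    while len(heap) > 1:                     # n-1 merges down to a single root
        f1, _, c1 = heapq.heappop(heap)      # lowest-frequency node → left (prefix 0)
        f2, _, c2 = heapq.heappop(heap)      # next lowest → right (prefix 1)
        merged = {ch: "0" + code for ch, code in c1.items()}
        merged.update({ch: "1" + code for ch, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

freq = {"b": 1, "a": 1, "l": 2, "o": 2, "n": 1}   # "balloon"
codes = huffman_codes(freq)
total_bits = sum(freq[ch] * len(codes[ch]) for ch in freq)
print(total_bits)  # 16
```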

Problem 5 — Fractional Knapsack

GREEDY Profit/weight ratio · Sort descending · Take fractions when needed

Core Concept — Why Greedy Works Here

Unlike 0/1 Knapsack, you can take any fraction of an item. This changes everything: it means you never "waste" capacity. Once you fill the knapsack optimally up to the last item, you fill the remainder with a fraction of the next best item.

The greedy strategy — sort by profit/weight ratio and take the best available — is provably optimal. The exchange argument: if an optimal solution spent capacity on a lower-ratio item while leaving some of a higher-ratio item behind, shifting that weight to the higher-ratio item would not decrease (and could increase) the profit. Therefore, always taking the highest-ratio item first is never wrong.

Time complexity is O(n log n) due to sorting; the greedy selection itself is O(n).

Problem Statement

Capacity: 15 lbs, n=7 items. You may take fractions of items. Maximize total profit using a greedy strategy.

Item   |  1 |  2 |  3 |  4 |  5 |  6 |  7
Profit | 12 |  8 | 20 | 10 |  7 | 25 |  4
Weight |  3 |  4 |  8 |  7 |  1 |  5 |  2
STEP 1
Calculate the profit-to-weight ratio for each item.
Item | Profit | Weight | Ratio (P/W)
  1  |   12   |   3    |    4.00
  2  |    8   |   4    |    2.00
  3  |   20   |   8    |    2.50
  4  |   10   |   7    |    1.43
  5  |    7   |   1    |    7.00
  6  |   25   |   5    |    5.00
  7  |    4   |   2    |    2.00
STEP 2
Sort items by profit-to-weight ratio in descending order.
Rank | Item | Profit | Weight | Ratio
1st  |  5   |    7   |   1    | 7.00
2nd  |  6   |   25   |   5    | 5.00
3rd  |  1   |   12   |   3    | 4.00
4th  |  3   |   20   |   8    | 2.50
5th  |  2   |    8   |   4    | 2.00
6th  |  7   |    4   |   2    | 2.00
7th  |  4   |   10   |   7    | 1.43
STEP 3a
Take Item 5 (Profit=7, Weight=1, Ratio=7.00). Does it fit? What are the running totals?

Remaining capacity = 15. Item 5 weighs 1 lb. 1 ≤ 15 → fits entirely. Take 100%.

Total Weight = 1 lb   Total Profit = 7

STEP 3b
Take Item 6 (Profit=25, Weight=5, Ratio=5.00). Does it fit? What are the running totals?

Remaining capacity = 14. Item 6 weighs 5 lbs. 5 ≤ 14 → fits entirely. Take 100%.

Total Weight = 6 lbs   Total Profit = 7 + 25 = 32

STEP 3c
Take Item 1 (Profit=12, Weight=3, Ratio=4.00). Does it fit? What are the running totals?

Remaining capacity = 9. Item 1 weighs 3 lbs. 3 ≤ 9 → fits entirely. Take 100%.

Total Weight = 9 lbs   Total Profit = 32 + 12 = 44

STEP 3d
Take Item 3 (Profit=20, Weight=8, Ratio=2.50). Does it fit entirely? If not, what fraction do you take and how much profit does that add?

Remaining capacity = 6 lbs. Item 3 weighs 8 lbs. Does NOT fit entirely.


Fraction = remaining / weight = 6/8 = 0.75

Profit from fraction = 20 × 0.75 = 15


Total Weight = 9 + 6 = 15 lbs (knapsack full)

Total Profit = 44 + 15 = 59

STEP 4
State the final answer. List all items taken, their amounts, and total profit.
Item | Fraction Taken | Weight | Profit
  5  |      100%      | 1 lb   |   7
  6  |      100%      | 5 lbs  |  25
  1  |      100%      | 3 lbs  |  12
  3  |       75%      | 6 lbs  |  15
Maximum Profit = 59
Final Answer

Total Weight = 15 lbs ✓   Total Profit = 59 units

Key Difference from 0/1 Knapsack

Greedy works here because fractions are allowed — you never waste capacity. In 0/1 Knapsack, greedy fails because taking the best-ratio item whole can leave capacity that no remaining item fits into optimally. Fractional knapsack is O(n log n); 0/1 knapsack requires O(nW) DP.

Pseudocode
Sort items by p[i]/w[i] descending       ← greedy order: best value-per-pound items come first (O(n log n))
remaining = W                            ← track how much capacity is left in the knapsack
total_profit = 0                         ← accumulate the total profit as we fill the knapsack
for each item i (sorted order):          ← greedily consider each item from highest to lowest ratio
    if w[i] <= remaining:                ← item fits entirely in the remaining capacity
        total_profit += p[i]             ← take the whole item; gain its full profit
        remaining -= w[i]                ← reduce remaining capacity by the item's full weight
    else:                                ← item does not fit whole; take the largest fraction that fits
        fraction = remaining / w[i]      ← fraction = what portion of the item fills the leftover space
        total_profit += p[i] * fraction  ← gain proportional profit for that fraction of the item
        break                            ← knapsack is now exactly full; stop

Time: Θ(n log n) — dominated by the sort; the greedy selection loop is O(n). Space: Θ(n) — stores the ratio and sorted order for all n items.
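The greedy pseudocode as runnable Python (items are reported 1-indexed to match the tables above); on the exam's data it reproduces the profit of 59:

```python
def fractional_knapsack(profit, weight, W):
    """Greedy by profit/weight ratio; returns (total profit, [(item, fraction)])."""
    order = sorted(range(len(profit)),
                   key=lambda i: profit[i] / weight[i], reverse=True)
    remaining, total, taken = W, 0.0, []
    for i in order:                          # highest ratio first
        if weight[i] <= remaining:           # whole item fits
            taken.append((i + 1, 1.0))
            total += profit[i]
            remaining -= weight[i]
        elif remaining > 0:                  # take the fraction that fills the rest
            frac = remaining / weight[i]
            taken.append((i + 1, frac))
            total += profit[i] * frac
            break                            # knapsack is now exactly full

    return total, taken

profit = [12, 8, 20, 10, 7, 25, 4]
weight = [3, 4, 8, 7, 1, 5, 2]
print(fractional_knapsack(profit, weight, 15))
# (59.0, [(5, 1.0), (6, 1.0), (1, 1.0), (3, 0.75)])
```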