Prokudin-Gorskii Colorization

CS180, Joshua Liao

Background

Sergei Mikhailovich Prokudin-Gorskii was a Russian photographer who worked before color printing was available.
He took three exposures of each scene, one each through a red, a green, and a blue filter. Using image processing techniques, we can align the three glass plate images and stack them as channels to produce a color image!
Below, you can see three examples of these RGB glass plate negatives.

Basic Alignment

Here is a comparison of brute force alignment searches on small images; every image was searched over shifts of [-15, 15] pixels in each axis.
The left is a naive combination of channels with no alignment.
The middle scores each candidate shift of the green and red channels against the blue channel using Euclidean distance.
The right crops the borders before scoring, which makes the alignment more robust to noise at the edges.
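The search described above can be sketched roughly as follows. This is a minimal illustration, not the exact code behind the results on this page: the function names (`crop_border`, `l2_score`, `align`), the use of `np.roll` (which wraps pixels around the border), and the (dy, dx) ordering of the returned shift are all assumptions made for the example.

```python
import numpy as np

def crop_border(img, frac):
    """Trim `frac` of the height and width from each side (frac=0 keeps everything)."""
    h, w = img.shape
    dh, dw = int(h * frac), int(w * frac)
    return img[dh:h - dh, dw:w - dw]

def l2_score(a, b):
    """Euclidean distance between two equally sized images; lower is better."""
    diff = a.astype(float) - b.astype(float)
    return np.sqrt(np.sum(diff ** 2))

def align(channel, reference, radius=15, crop=0.05):
    """Try every shift in [-radius, radius] on both axes; return the best (dy, dx)."""
    ref = crop_border(reference, crop)
    best_shift, best_score = (0, 0), np.inf
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # np.roll wraps around the edges; cropping hides most of that wraparound.
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = l2_score(crop_border(shifted, crop), ref)
            if score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift
```

Setting `crop=0` gives the uncropped variant (the middle column), while `crop=0.05` corresponds to the 5% crop (the right column).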

G(0, 0), R(0, 0).
No alignment.

G(1, -1), R(7, -1)
Using L2 Euclidean distance with no cropping.

G(5, 2), R(12, 3)
Scoring using a 5% crop.

G(0, 0), R(0, 0).
No alignment.

G(-6, 0), R(9, 1)
Using L2 Euclidean distance with no cropping.

G(-3, 2), R(3, 2)
Scoring using a 5% crop.

G(0, 0), R(0, 0).
No alignment.

G(3, 2), R(6, 3)
Using L2 Euclidean distance with no cropping.

G(3, 2), R(6, 3)
Scoring using a 5% crop.
Cropping didn't change the alignment here, presumably because this image has less noise at the edges.

Cathedral: Different p-norm Scoring

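Euclidean distance is the L2-norm, i.e. \( \left( \sum_i |x_i|^2 \right)^{\frac{1}{2}} \). More generally, the p-norm is \( \left( \sum_i |x_i|^p \right)^{\frac{1}{p}} \). What happens if you use different p-norms?
Well, it turns out not that much: only p=0.5 gave a different alignment. These images were generated with 5% cropping.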
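A p-norm score could be written along these lines. This is a sketch with a hypothetical helper name (`p_norm_score`), assuming the score is applied to the pixel-wise difference between two equally sized channels. Note that for p < 1 the formula is no longer a true norm, though it can still be used as a score.

```python
import numpy as np

def p_norm_score(a, b, p=2.0):
    """(sum |a - b|^p)^(1/p) over all pixels; p=2 is the Euclidean distance."""
    diff = np.abs(a.astype(float) - b.astype(float))
    return np.sum(diff ** p) ** (1.0 / p)
```

Because the outer 1/p power is monotonic, it does not change which shift scores lowest; it only rescales the scores.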

G(4, 2), R(10, 8).
p=0.5

G(5, 2), R(12, 3).
p=2, or the basic Euclidean norm.

G(5, 2), R(12, 3).
p=5. Same results as p=2.

Size and Image Pyramids

Note that the images above are relatively small: the cathedral dimensions are 390 x 341 = 132,990 pixels. Consider the large image of Emir: its dimensions are 3,702 x 3,209 = 11,879,718 pixels. Not only is this approximately 90 times as many pixels (meaning that each brute-forced alignment does 90 times the work!), but a brute force search also needs to cover more alignments. Before, the small images were searched over [-15, 15], roughly a 4% shift in each direction, which works out to about 900 different alignments (31 x 31 = 961, counting both endpoints).
A 4% shift on Emir would be [-150, 150], working out to about 90,000 different alignments, roughly 100 times as many!
This is the same idea as the "square-cube" law, just one dimension down (length to area rather than area to volume). When the sides of the image grow by 10 times, the number of pixels grows by 100 times.
In this case, we're affected even more! A 10x increase in side length works out to 10,000x the work (100x the alignments, each of which compares 100x the pixels).
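The arithmetic can be checked with a few lines of Python. The helper `search_cost` is a made-up name for this illustration, and it assumes every candidate shift compares every pixel once, ignoring cropping.

```python
def search_cost(n_pixels, radius):
    """Return (candidate shifts, total pixel comparisons) for an exhaustive search."""
    shifts = (2 * radius + 1) ** 2  # both endpoints of [-radius, radius] included
    return shifts, shifts * n_pixels

small_shifts, small_work = search_cost(390 * 341, 15)
large_shifts, large_work = search_cost(3702 * 3209, 150)

print(small_shifts, large_shifts)  # 961 90601
print(large_work / small_work)     # how many times more work the large image needs
```

With the exact sizes, the ratio comes out somewhat below the round 10,000x figure, because Emir's sides are a little under 10 times the cathedral's.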