{ "cells": [ { "cell_type": "raw", "id": "1bc5cc4b", "metadata": {}, "source": [ "---\n", "title: \"Day 22: Clearing the Runway\"\n", "disable_content_template: 1\n", "tags:\n", " - 'image processing'\n", " - 'computer vision'\n", "author: Zaki Mughal\n", "data:\n", " bio: zmughal\n", " description: Hough transform for line extraction\n", "images:\n", " banner:\n", " src: 'Runway_34,_Nagoya_Airfield_(3937428018).jpg'\n", " alt: 'Runway 34, Nagoya Airfield'\n", " data:\n", " attribution: |-\n", " Runway 34, Nagoya Airfield, by Kentaro Iemoto from Tokyo, Japan, CC BY-SA 2.0, via Wikimedia Commons\n", "---" ] }, { "cell_type": "markdown", "id": "9d225ec4", "metadata": {}, "source": [ "The North Pole Workshop's logistics department has been busy this year with\n", "upgrading the sleigh's dash-cam with a new more powerful computer. They've\n", "asked the research and development department if there is anything that can be\n", "done with the extra cycles to make the journey safer. After bouncing around a\n", "couple ideas, they came up with a plan for an autoland system…but since Santa\n", "and the reindeer do not operate in typical environments (such as roofs), this\n", "would not be a typical autoland system (which use microwave/radio guidance) so\n", "they couldn't select anything off-the-shelf (and they do know their shelves).\n", "\n", "---\n", "\n", "To help iterate over the requirements for this new system, the elves knew that\n", "PDL could help them process images and try different approaches quickly. This\n", "iterative process will start with simpler scenarios in order to ensure the\n", "algorithms are on the right track. We'll start with an airport runway as a test\n", "image.\n", "\n", "# Setup\n", "\n", "First we need to setup the environment to do everything we need: read images,\n", "processing them with some filters, and then plot the results. Since image\n", "processing requires frequent visualisation, this work is being done in Jupyter\n", "Notebook via [`Devel::IPerl`](https://p3rl.org/Devel::IPerl), but the setup\n", "below allows for display of plots both in a notebook, but also as a regular\n", "Perl script. You can [download the notebook here](20241222-hough-lines.ipynb)." ] }, { "cell_type": "code", "execution_count": null, "id": "5665996e-85ed-498e-a21e-ca0056566ead", "metadata": {}, "outputs": [], "source": [ "use v5.36;\n", "use utf8;\n", "use feature qw(signatures postderef);\n", "\n", "use constant IN_IPERL => !! $ENV{PERL_IPERL_RUNNING};\n", "no if IN_IPERL, warnings => 'redefine'; # fewer messages when re-running cells\n", "no if ! 
IN_IPERL, warnings => 'void'; # fewer messages for variable at end of cell outside of IPerl\n", "\n", "use PDL 2.095;\n", "use PDL::Constants qw(PI);\n", "use PDL::IO::Pic;\n", "use PDL::Image2D;\n", "use PDL::ImageRGB;\n", "\n", "use SVG;\n", "use MooX::Struct ();\n", "use Path::Tiny qw(path);\n", "use List::Util ();\n", "use Encode qw(encode_utf8);\n", "\n", "use Data::Printer;\n", "use Data::Printer::Filter::PDL ();\n", "\n", "use PDL::Graphics::Gnuplot qw(gpwin);\n", "\n", "# Make PDL::Graphics::Gnuplot compatible with IPerl\n", "use if IN_IPERL, 'Devel::IPerl::Plugin::PDLGraphicsGnuplot';\n", "if( IN_IPERL ) {\n", "\tIPerl->load_plugin('PDLGraphicsGnuplot');\n", "}\n", "\n", "sub PDL::Graphics::Gnuplot::fancy_display {\n", "\tif(IN_IPERL){\n", "\t\tIPerl->display($_[0]);\n", "\t} else {\n", "\t\t#sleep 1; $_[0]->close;\n", "\t\t$_[0]->pause_until_close;\n", "\t}\n", "}\n", "\n", "my $gp = gpwin();" ] }, { "cell_type": "markdown", "id": "aad7adf7", "metadata": {}, "source": [ "# Image processing for feature extraction\n", "\n", "The elves know that runways are characterized by strong straight lines: the\n", "edges of the runway, centerline markings, and threshold stripes. If they can\n", "reliably detect these lines in images from the sleigh's camera, they'll have a\n", "key piece of their autoland system.\n", "\n", "But how do you find straight lines in an image? Let's break this down into steps:\n", "\n", "1. First, we need to find places in the image where there are sharp changes in\n", "   brightness — these are edges that might be part of the lines we're looking\n", "   for:\n", "\n", "   ```mermaid\n", "   graph LR\n", "   A[Color Image] -->|Convert to greyscale| B[Greyscale]\n", "   B -->|Smooth noise| C[Blurred]\n", "   C -->|Find edges| D[Edge points]\n", "   ```\n", "\n", "2. But just having edge points isn't enough - we need to figure out which ones\n", "   form straight lines. This is tricky because:\n", "   - edges might be broken (gaps in runway lines);\n", "   - there's usually noise (not all edges are runway lines); and\n", "   - lines might be partial (only part of the runway visible).\n", "\n", "3. This is where the Hough transform comes in. Instead of trying to connect\n", "   edge points directly, it:\n", "   1. Takes each edge point.\n", "   2. Finds all possible lines through that point.\n", "   3. Lets the points \"vote\" on which lines they might be part of.\n", "   4. Finds the lines that got the most votes.\n", "\n", "The rest of this notebook shows how we implement this pipeline using PDL, step\n", "by step.\n", "\n", "# Prepare image\n", "\n", "After selecting a test image\n", "(Runway 34, Nagoya Airfield,\n", "by Kentaro Iemoto from Tokyo, Japan,\n", "CC BY-SA 2.0, via Wikimedia Commons),\n", "we need to read it in using `rpic()`\n", "and take a look at the dimensions."
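, "\n", "\n", "As a quick aside, if you want to pull the dimensions out programmatically\n", "rather than reading them off the dump in the next cell, a minimal sketch along\n", "these lines should do it (assuming the setup cell above has already run; the\n", "`$img` variable and the `dims` call here are only for illustration and are not\n", "used in the rest of the notebook):\n", "\n", "```perl\n", "# Sketch only: read the image and unpack its dimensions.\n", "my $img = rpic('Runway_34,_Nagoya_Airfield_(3937428018).jpg');\n", "my ($channels, $w, $h) = $img->dims;   # an RGB JPEG comes in as [3 W H]\n", "say \"channels=$channels width=$w height=$h\";\n", "```"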
] }, { "cell_type": "code", "execution_count": null, "id": "e9522290", "metadata": { "tags": [ "hide-output-execute_result" ] }, "outputs": [], "source": [ "# https://commons.wikimedia.org/wiki/File:Runway_34,_Nagoya_Airfield_(3937428018).jpg\n", "my $runway = rpic('Runway_34,_Nagoya_Airfield_(3937428018).jpg');\n", "\n", "say np $runway;" ] }, { "cell_type": "markdown", "id": "ac139252", "metadata": {}, "source": [ "and then view the image in Gnuplot:" ] }, { "cell_type": "code", "execution_count": null, "id": "3abe81c7", "metadata": {}, "outputs": [], "source": [ "$gp->image( $runway );\n", "$gp->fancy_display;" ] }, { "cell_type": "markdown", "id": "a0d0dc13", "metadata": {}, "source": [ "Note the dimensions are `[3 W H]` where 3 is the number of channels, `W` is the\n", "width (x-extent), and `H` is the height (y-extent).\n", "\n", "We can't process the color channels in the image directly, so we need to\n", "convert it to a single greyscale (luminance) channel. We can use `rgbtogr`\n", "which uses a standard formula." ] }, { "cell_type": "code", "execution_count": null, "id": "024b7d7c", "metadata": {}, "outputs": [], "source": [ "my $grey = rgbtogr($runway)->double / 255;\n", "$gp->image($grey,\n", "\t{ title => 'Greyscale image',\n", "\t clut => 'grey' });\n", "$gp->fancy_display;" ] }, { "cell_type": "markdown", "id": "978b7b02", "metadata": {}, "source": [ "# Gaussian blur\n", "\n", "We need to find edges in the image, but all real-world images have some amount\n", "of noise (e.g., from random variation in brightness from pixel to pixel).\n", "This noise can interfere with further processing such as edge\n", "detection such that the noise may be picked up as spurious edges.\n", "\n", "To smooth out this noise, we apply a 2D Gaussian filter to the image.\n", "The filter:\n", "\n", "1. Looks at each pixel and its neighbors.\n", "2. Creates a weighted average where:\n", " - The center pixel counts most;\n", " - Nearby pixels count less;\n", " - Far pixels barely count at all;\n", " - The weights follow a bell curve (Gaussian) shape.\n", "\n", "The parameter σ (sigma) controls how much smoothing we do:\n", "\n", "- Small σ: less smoothing, keeps more detail.\n", "- Large σ: more smoothing, might blur real edges.\n", "\n", "The way we can perform this weighted average is by performing convolution\n", "with the `conv2d()` function.\n", "\n", "