Commit 62bdc67

image tensors - wip
1 parent 4a8a43d commit 62bdc67

1 file changed

Lines changed: 53 additions & 56 deletions

src/dtype_next/image_analysis.clj

@@ -1,13 +1,13 @@
 ^{:kindly/hide-code true
-  :clay {:title "Functional Image Analysis with dtype-next"
+  :clay {:title "Image Processing with dtype-next"
          :quarto {:author :daslu
                   :draft true
                   :type :post
                   :date "2025-12-07"
                   :category :data
                   :tags [:dtype-next :tensors :image-processing :computer-vision :tutorial]}}}
 (ns dtype-next.image-analysis
-  "Learn dtype-next by building practical image analysis tools.
+  "Learn dtype-next by building practical image processing tools.

   We'll explore quality metrics, enhancement pipelines, accessibility features,
   and edge detection—all with functional idioms and zero-copy operations."
@@ -34,13 +34,12 @@

 ;; ## What We'll Build

-;; 1. **Image Statistics** — channel means, ranges, distributions, histograms
-;; 2. **Spatial Analysis** — gradients, edge detection, sharpness metrics
-;; 3. **Enhancement Pipeline** — white balance, contrast adjustment
-;; 4. **Accessibility** — color blindness simulation
-;; 6. **Convolution & Filtering** — blur, sharpen, Sobel edge detection
-;; 7. **Reshape & Downsampling** — pyramids, multi-scale processing
-;; 8. **Batch Processing** — stacking and workflows
+;; - **Image Statistics** — channel means, ranges, distributions, histograms
+;; - **Spatial Analysis** — gradients, edge detection, sharpness metrics
+;; - **Enhancement Pipeline** — white balance, contrast adjustment
+;; - **Accessibility** — color blindness simulation
+;; - **Convolution & Filtering** — blur, sharpen, Sobel edge detection
+;; - **Reshape & Downsampling** — pyramids, multi-scale processing

 ;; Each section demonstrates core dtype-next concepts with immediate practical value.

@@ -88,7 +87,7 @@ original-tensor

 ;; ---

-;; # Part 1: Image Statistics
+;; # Image Statistics

 ;; Let's analyze image properties using **reduction operations**.

@@ -97,8 +96,8 @@ original-tensor
 ;; Use `tensor/select` to slice out individual channels (zero-copy views):

 (defn extract-channels
-  "Extract R, G, B, A channels from RGBA tensor.
-  Returns map with :r, :g, :b, :a tensors (each [H W])."
+  "Extract R, G, B channels from RGB tensor.
+  Returns map with :red, :green, :blue tensors (each [H W])."
   [img-tensor]
   {:red (tensor/select img-tensor :all :all 0)
    :green (tensor/select img-tensor :all :all 1)
@@ -196,7 +195,11 @@ original-tensor
     (tensor/reshape [height width 3])
     bufimg/tensor->image)

-;; ## Simple Histograms
+;; ## Histograms
+
+;; A [histogram](https://en.wikipedia.org/wiki/Image_histogram) shows the distribution
+;; of pixel values. It's essential for understanding image brightness, contrast, and
+;; exposure. Peaks indicate common values; spread indicates dynamic range.

 ;; To draw the histograms, we can use a pivot transformation:

@@ -225,9 +228,11 @@ original-tensor

 ;; ---

-;; # Part 2: Spatial Analysis — Edges and Gradients
+;; # Spatial Analysis — Edges and Gradients

-;; Now we'll analyze spatial structure using **gradient operations**.
+;; Analyze spatial structure using [gradient](https://en.wikipedia.org/wiki/Image_gradient)
+;; operations. Gradients are fundamental to [edge detection](https://en.wikipedia.org/wiki/Edge_detection),
+;; which identifies boundaries between regions in an image.

 ;; ## Computing Gradients

@@ -304,14 +309,16 @@ edges

 ;; ---

-;; # Part 3: Enhancement Pipeline
+;; # Enhancement Pipeline

 ;; Build composable image enhancement functions. Each transformation is
 ;; verifiable through numeric properties we can check in the REPL.

 ;; ## Auto White Balance

-;; Adjust channels so their means are equal (removes color casts).
+;; [White balance](https://en.wikipedia.org/wiki/Color_balance) adjusts colors to
+;; appear neutral under different lighting conditions. We scale RGB channels to have
+;; equal means, removing color casts.

 (defn auto-white-balance
   "Scale RGB channels to have equal means.
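
Note on the hunk above: the new comment describes scaling each channel so the channel means become equal. The body of `auto-white-balance` is not visible in this diff, so the following is only a minimal plain-Clojure sketch of that idea. The name `white-balance-gains` and the choice of target (the average of the three channel means) are assumptions for illustration, not the author's implementation.

```clojure
;; Hypothetical helper: per-channel gains that move every channel mean
;; to the average of the three means (assumes no channel mean is zero).
(defn white-balance-gains
  [channel-means]
  (let [target (/ (reduce + 0.0 channel-means) (count channel-means))]
    (mapv #(/ target %) channel-means)))

;; A reddish cast: red mean is highest, blue mean is lowest.
(def gains (white-balance-gains [120.0 100.0 80.0]))

;; Multiplying each channel by its gain brings every mean to the target.
(def balanced-means (mapv * [120.0 100.0 80.0] gains))
```

After scaling, all three means land at (approximately) the same value, which is a property that can be checked numerically in the REPL.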
@@ -357,7 +364,9 @@ edges

 ;; ## Contrast Enhancement

-;; Amplify deviation from the mean to increase contrast.
+;; [Contrast](https://en.wikipedia.org/wiki/Contrast_(vision)) enhancement amplifies
+;; the difference between light and dark regions. We amplify each pixel's deviation
+;; from the mean, making bright pixels brighter and dark pixels darker.

 (defn enhance-contrast
   "Increase image contrast by amplifying deviation from mean.
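
Note on the hunk above: "amplify each pixel's deviation from the mean" can be written as `mean + factor * (pixel - mean)`. The sketch below applies that formula to a plain vector of pixel values. The function name and the clamp to the 0 to 255 range are assumptions for illustration; the actual `enhance-contrast` body is outside this diff.

```clojure
;; Hypothetical sketch: scale each value's distance from the mean by
;; `factor`, then clamp to the 8-bit range (clamping is an assumption).
(defn enhance-contrast-sketch
  [pixels factor]
  (let [m (/ (reduce + 0.0 pixels) (count pixels))]
    (mapv (fn [x]
            (-> (+ m (* factor (- x m)))
                (max 0.0)
                (min 255.0)))
          pixels)))

;; Mean is 120, so 100 moves down to 90 and 140 moves up to 150.
(def stretched (enhance-contrast-sketch [100.0 120.0 140.0] 1.5))
```

The mean itself is unchanged while the spread around it grows by `factor`, as long as no value hits the clamp.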
@@ -411,7 +420,7 @@ edges

 ;; ---

-;; # Part 4: Accessibility — Color Blindness Simulation
+;; # Accessibility — Color Blindness Simulation

 ;; Use matrix transformations to simulate how images appear to people with
 ;; different types of color vision deficiency. This demonstrates dtype-next's
@@ -421,7 +430,8 @@ edges

 ;; ## Color Blindness Matrices

-;; These matrices are from established research on color vision deficiency:
+;; These matrices simulate [color blindness](https://en.wikipedia.org/wiki/Color_blindness)
+;; (color vision deficiency). Different types affect perception of red, green, or blue:

 (def color-blindness-matrices
   {:protanopia [[0.567 0.433 0.000] ; Red-blind
@@ -496,16 +506,18 @@ edges

 ;; ---

-;; # Part 6: Advanced — Convolution & Filtering
+;; # Advanced — Convolution & Filtering

 ;; Convolution is the fundamental operation behind image filters, from blur to edge
 ;; detection. We'll build a reusable convolution engine and apply various kernels,
 ;; demonstrating `tensor/compute-tensor` for windowed operations and nested iterations.

 ;; ## Understanding Convolution

-;; A **kernel** (or filter) is a small matrix that slides over the image. At each
-;; position, we multiply kernel values by corresponding pixel values and sum the result.
+;; [Convolution](https://en.wikipedia.org/wiki/Kernel_(image_processing)) is a
+;; fundamental operation in image processing. A **kernel** (or filter) is a small
+;; matrix that slides over the image. At each position, we multiply kernel values
+;; by corresponding pixel values and sum the result.

 ;; Example: 3×3 box blur kernel (all pixels weighted equally):
 ;; ```
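
Note on the hunk above: the sliding multiply-and-sum described in the new comment can be sketched on plain nested vectors, without dtype-next. Everything here (`convolve-2d-sketch`, `box-3x3`, skipping border pixels) is an assumption for illustration and is not the notebook's `convolve-2d`. Like the description in the comment, it does not flip the kernel, which makes no difference for symmetric kernels such as a box blur.

```clojure
;; Hypothetical sketch: slide `kernel` over `img` (both vectors of row
;; vectors). Border pixels are skipped, so the output shrinks by
;; (kernel size - 1) in each dimension.
(defn convolve-2d-sketch
  [img kernel]
  (let [kh    (count kernel)
        kw    (count (first kernel))
        out-h (inc (- (count img) kh))
        out-w (inc (- (count (first img)) kw))]
    (vec
     (for [y (range out-h)]
       (vec
        (for [x (range out-w)]
          (reduce +
                  (for [ky (range kh)
                        kx (range kw)]
                    (* (get-in kernel [ky kx])
                       (get-in img [(+ y ky) (+ x kx)]))))))))))

;; 3x3 box blur: every weight is 1/9.
(def box-3x3 (vec (repeat 3 (vec (repeat 3 (/ 1.0 9.0))))))

;; A 3x3 image yields a single output value: the average of its pixels.
(def blurred (convolve-2d-sketch [[1 2 3] [4 5 6] [7 8 9]] box-3x3))
```

For a 3x3 input the result is one value close to 5, the mean of the nine pixels.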
@@ -570,7 +582,9 @@ kernel-3x3

 ;; ## Gaussian Blur

-;; Gaussian kernels weight center pixels more heavily than edge pixels:
+;; [Gaussian blur](https://en.wikipedia.org/wiki/Gaussian_blur) uses a kernel based
+;; on the Gaussian (normal) distribution. It weights center pixels more heavily than
+;; edge pixels, producing a smooth, natural-looking blur without artifacts.

 (defn gaussian-kernel
   "Create NxN Gaussian kernel with given sigma."
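
Note on the hunk above: the body of `gaussian-kernel` is cut off by the diff context. As a rough illustration of "a kernel based on the Gaussian distribution", the sketch below samples exp(-(dx² + dy²) / (2σ²)) on a grid and normalizes the weights to sum to 1. The function name, the odd-size assumption, and the normalization step are assumptions, not the author's code.

```clojure
;; Hypothetical sketch: size x size Gaussian weights, normalized so they
;; sum to 1. `size` is assumed to be odd so there is a center cell.
(defn gaussian-kernel-sketch
  [size sigma]
  (let [center (quot size 2)
        weight (fn [y x]
                 (let [dy (- y center)
                       dx (- x center)]
                   (Math/exp (- (/ (+ (* dx dx) (* dy dy))
                                   (* 2.0 sigma sigma))))))
        raw    (vec (for [y (range size)]
                      (vec (for [x (range size)]
                             (weight y x)))))
        total  (reduce + (flatten raw))]
    (mapv (fn [row] (mapv #(/ % total) row)) raw)))

(def g5 (gaussian-kernel-sketch 5 1.0))
```

Normalizing keeps overall brightness roughly unchanged after blurring, and the center weight is the largest, matching the description in the comment.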
@@ -599,7 +613,9 @@ gaussian-5x5

 ;; ## Sharpen Filter

-;; Sharpen enhances edges by amplifying high-frequency details.
+;; [Unsharp masking](https://en.wikipedia.org/wiki/Unsharp_masking) sharpens images
+;; by enhancing edges. We subtract a blurred version from the original to extract
+;; high-frequency details, then add them back amplified.
 ;; Method: original + strength × (original - blur)

 (defn sharpen
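
Note on the hunk above: the stated method, original + strength × (original - blur), can be checked on a tiny one-dimensional example. The sketch below takes the blurred signal as an input rather than computing it; the function name and sample values are assumptions for illustration.

```clojure
;; Hypothetical sketch of the formula in the comment:
;; original + strength * (original - blur), applied element-wise.
(defn unsharp-mask-sketch
  [original blurred strength]
  (mapv (fn [o b] (+ o (* strength (- o b)))) original blurred))

;; A step edge (10 -> 50) and a smoothed version of it.
(def sharpened-row
  (unsharp-mask-sketch [10.0 10.0 50.0 50.0]
                       [10.0 20.0 40.0 50.0]
                       1.5))
```

Flat regions are unchanged because original and blur agree there, while values next to the edge overshoot (here to -5 and 65). That overshoot is why results usually need clamping before being stored as 8-bit pixels.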
@@ -626,7 +642,7 @@ gaussian-5x5
 (-> {:original grayscale
      :box (convolve-2d grayscale kernel-3x3)
      :gaussian (convolve-2d grayscale gaussian-5x5)
-     :sharpented (sharpen grayscale 1.5)}
+     :sharpened (sharpen grayscale 1.5)}
     (update-vals
      (fn [t]
        (dfn/mean (edge-magnitude
@@ -636,6 +652,10 @@ gaussian-5x5

 ;; ## Sobel Edge Detection

+;; The [Sobel operator](https://en.wikipedia.org/wiki/Sobel_operator) is a classic
+;; edge detection method that uses specialized kernels to compute gradients in X and Y
+;; directions. It's more robust to noise than simple finite differences.
+
 ;; Sobel kernels detect edges in X and Y directions:

 (def sobel-x-kernel
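
Note on the hunk above: the kernel values in `sobel-x-kernel` are cut off by the diff context. For orientation, the sketch below uses the standard textbook Sobel kernels and evaluates them on a single 3x3 patch. The names `sobel-x`, `sobel-y`, `patch-response`, and `sobel-magnitude` are hypothetical and may differ from the notebook's definitions.

```clojure
;; Standard textbook Sobel kernels (an assumption; the notebook's own
;; values are not visible in this diff).
(def sobel-x [[-1 0 1] [-2 0 2] [-1 0 1]])
(def sobel-y [[-1 -2 -1] [0 0 0] [1 2 1]])

;; Weighted sum of one 3x3 patch with one 3x3 kernel.
(defn patch-response [kernel patch]
  (reduce + (map * (flatten kernel) (flatten patch))))

;; Gradient magnitude at the patch center: sqrt(gx^2 + gy^2).
(defn sobel-magnitude [patch]
  (let [gx (patch-response sobel-x patch)
        gy (patch-response sobel-y patch)]
    (Math/sqrt (+ (* gx gx) (* gy gy)))))

;; A vertical edge: dark on the left, bright on the right.
(def edge-strength (sobel-magnitude [[0 0 10] [0 0 10] [0 0 10]]))
(def flat-strength (sobel-magnitude [[7 7 7] [7 7 7] [7 7 7]]))
```

The vertical edge produces a strong X response and no Y response, while the flat patch produces zero.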
@@ -685,7 +705,7 @@ gaussian-5x5

 ;; ---

-;; # Part 7: Reshape & Downsampling
+;; # Reshape & Downsampling

 ;; Explore multi-scale image processing through downsampling and pyramids.
 ;; We'll demonstrate `tensor/reshape` for zero-copy view transformations and
@@ -709,7 +729,9 @@ gaussian-5x5

 ;; ## Downsampling by 2×

-;; We can downsample by selecting every other pixel:
+;; [Downsampling](https://en.wikipedia.org/wiki/Downsampling_(signal_processing))
+;; (decimation) reduces image resolution by discarding pixels. We select every other
+;; pixel in each dimension, creating a half-size image.

 (defn downsample-2x [img-2d]
   (let [[h w] (dtype/shape img-2d)]
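
Note on the hunk above: "select every other pixel in each dimension" has a compact plain-Clojure equivalent on nested vectors. This is only a sketch with a hypothetical name; the notebook's `downsample-2x` works on tensors and its body is not shown in this diff.

```clojure
;; Hypothetical sketch: keep rows 0, 2, 4, ... and, within each kept row,
;; columns 0, 2, 4, ...
(defn downsample-2x-sketch [rows]
  (mapv #(vec (take-nth 2 %)) (take-nth 2 rows)))

(def small
  (downsample-2x-sketch [[1  2  3  4]
                         [5  6  7  8]
                         [9  10 11 12]
                         [13 14 15 16]]))
```

A 4x4 input becomes 2x2, keeping the values at even row and column indices.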
@@ -734,8 +756,9 @@ gaussian-5x5

 ;; ## Image Pyramid

-;; An image pyramid contains multiple scales of the same image,
-;; useful for multi-scale analysis:
+;; An [image pyramid](https://en.wikipedia.org/wiki/Pyramid_(image_processing)) contains
+;; the same image at multiple scales. This is essential for multi-scale analysis, feature
+;; detection at different sizes, and efficient image processing algorithms.

 (defn build-pyramid [img-2d levels]
   (loop [pyramid [img-2d]
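
Note on the hunk above: the `loop` body of `build-pyramid` is truncated here. One way to picture a pyramid is as repeated halving of the same image, sketched below with `iterate` on nested vectors. The names `halve` and `build-pyramid-sketch`, and the reading of `levels` as "number of images including the original", are assumptions rather than the author's definitions.

```clojure
;; Hypothetical helper: keep every other row and column.
(defn halve [rows]
  (mapv #(vec (take-nth 2 %)) (take-nth 2 rows)))

;; Hypothetical sketch: the original image followed by successively
;; halved copies, `levels` images in total.
(defn build-pyramid-sketch [rows levels]
  (vec (take levels (iterate halve rows))))

;; Shapes of a 3-level pyramid built from an 8x8 image.
(def pyramid-shapes
  (mapv (fn [level] [(count level) (count (first level))])
        (build-pyramid-sketch (vec (repeat 8 (vec (range 8)))) 3)))
```

Each level has half the height and width of the previous one. Pyramids used in practice often blur before each halving to limit aliasing, which this sketch leaves out.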
@@ -799,32 +822,6 @@ gaussian-5x5

 ;; ---

-;; # Part 8: Batch Processing & Workflows
-
-;; Process multiple images efficiently by stacking them into higher-dimensional
-;; tensors. This demonstrates dtype-next's support for 4D tensors and batch operations.
-
-;; ## Image Stacking
-
-;; Stack multiple images along a new dimension for parallel processing:
-
-(defn stack-images
-  "Stack images into a single 4D tensor [N H W C].
-  All images must have same dimensions."
-  [images]
-  (let [[h w c] (dtype/shape (first images))
-        n (count images)]
-    ;; Verify all images have same shape
-    (assert (every? #(= [h w c] (vec (dtype/shape %))) images)
-            "All images must have same dimensions")
-    (tensor/compute-tensor
-     [n h w c]
-     (fn [i y x ch]
-       (tensor/mget (nth images i) y x ch))
-     :uint8)))
-
-;; ---
-
 ;; # Conclusion: The dtype-next Pattern

 ;; We've built a complete image analysis toolkit demonstrating core dtype-next concepts:
