Image-to-Polygon Conversion

Automated Shape Extraction for WebGL Obstacles

This post explores building a pipeline that extracts shapes from images and outputs them as triangle definitions for the fluid simulation, using only documented algorithms and real contour data.

The goal: take any image (like an "M" logo), trace its boundary, simplify the vertices, and output JavaScript code that can be pasted directly into the config.js file used to control the fluid simulation. No manual vertex placement, no approximations—just pure algorithm-driven extraction.

The Problem

The fluid simulation's obstacle system uses a binary mask texture. Obstacles are rasterized from geometric primitives (circles, rectangles, triangles, polygons) into a grid of pixels, which the GPU samples to determine where fluid can flow.

Initially, the "M" logo was constructed from 15+ rectangles stacked in a staircase pattern to approximate diagonal edges. This looked blocky and required tedious manual coordinate calculation. The challenge: can we extract clean diagonal shapes directly from a reference image?

Algorithm Selection

The pipeline needs three core algorithms working in sequence:

1. Contour Extraction: Moore-Neighbor Tracing

After thresholding the image to binary (black shape on white background), we need to trace the boundary. Moore-Neighbor tracing is a simple boundary-following algorithm that walks around the shape using 8-connectivity.

Contour Tracing Algorithm

function traceContour(bitmap, width, height) {
    // Direction vectors for 8-connectivity (clockwise from right)
    const dx = [1, 1, 0, -1, -1, -1, 0, 1];
    const dy = [0, 1, 1, 1, 0, -1, -1, -1];
    
    // Find starting point (first black pixel)
    let startX = -1, startY = -1;
    for (let y = 0; y < height; y++) {
        for (let x = 0; x < width; x++) {
            if (bitmap[y * width + x] === 0) {  // Black
                startX = x;
                startY = y;
                break;
            }
        }
        if (startX !== -1) break;
    }
    
    const contour = [];
    let currentX = startX, currentY = startY;
    let direction = 0;  // Start looking right
    
    do {
        contour.push({x: currentX, y: currentY});
        
        // Search from (direction + 6) % 8 (back-left)
        let searchDir = (direction + 6) % 8;
        
        for (let i = 0; i < 8; i++) {
            const checkDir = (searchDir + i) % 8;
            const nx = currentX + dx[checkDir];
            const ny = currentY + dy[checkDir];
            
            if (/* in bounds and black */) {
                currentX = nx;
                currentY = ny;
                direction = checkDir;
                break;
            }
        }
    } while (currentX !== startX || currentY !== startY);
    
    return contour;
}

This produces a sequence of pixel coordinates following the shape's edge. For the "M" logo at 1024x1024 resolution, this yields hundreds of boundary points—far too many for efficient rendering.

2. Simplification: Ramer-Douglas-Peucker

The Ramer-Douglas-Peucker (RDP) algorithm reduces the number of vertices while preserving the shape's overall appearance. It works recursively by finding the point farthest from the line connecting the endpoints. If that distance exceeds a threshold (epsilon), the point is kept and the algorithm recurses on both segments.

RDP Simplification

function rdpSimplify(points, epsilon) {
    if (points.length <= 2) return points;
    
    const start = points[0];
    const end = points[points.length - 1];
    
    let maxDist = 0;
    let maxIdx = 0;
    
    // Find point farthest from line
    for (let i = 1; i < points.length - 1; i++) {
        const dist = perpendicularDistance(points[i], start, end);
        if (dist > maxDist) {
            maxDist = dist;
            maxIdx = i;
        }
    }
    
    if (maxDist > epsilon) {
        // Recurse on both segments
        const left = rdpSimplify(points.slice(0, maxIdx + 1), epsilon);
        const right = rdpSimplify(points.slice(maxIdx), epsilon);
        return [...left.slice(0, -1), ...right];
    }
    
    return [start, end];
}

The epsilon parameter controls the trade-off: higher values yield fewer vertices but lose detail. For the M logo, epsilon=5 pixels reduced hundreds of points to just 14 vertices while preserving all the important corners.

3. Triangulation: Earcut

The simplified contour is likely non-convex (e.g., the inner "V" of the "M"). Since the obstacle manager uses a simple triangle-fan algorithm for convex polygons, we need proper triangulation.

Earcut is a fast, robust triangulation library that handles non-convex polygons, holes, and even self-intersections. It outputs a flat array of triangle indices:

Earcut Triangulation

import earcut from 'earcut';

const flatCoords = normalized.flatMap(p => [p.x, p.y]);
const triangleIndices = earcut(flatCoords);

// Convert indices to triangle definitions
const triangles = [];
for (let i = 0; i < triangleIndices.length; i += 3) {
    triangles.push({
        type: 'triangle',
        v0: normalized[triangleIndices[i]],
        v1: normalized[triangleIndices[i + 1]],
        v2: normalized[triangleIndices[i + 2]]
    });
}

For the M logo, 14 vertices were triangulated into 10 triangles—ready to paste into the config file.

The Pipeline

The complete image-to-polygon.js script combines these algorithms into a single CLI tool:

Usage

node tools/image-to-polygon.js logo.png --epsilon 5 --scale 0.35

The script follows this flow:

Load image using Jimp
Convert to grayscale and apply threshold (default: 128)
Trace contour with Moore-Neighbor algorithm
Simplify with RDP (configurable epsilon)
Normalize coordinates to [0,1] range
Triangulate with Earcut
Output JavaScript array ready for config.js

Coordinate Normalization

A subtle but important step is converting pixel coordinates to the [0, 1] normalized space used by the simulation. Images have their origin at the top-left, while WebGL uses bottom-left:

Flipping Y-Axis

const normalized = simplified.map(p => ({
    x: centerX + (p.x / width - 0.5) * scale * aspectRatio,
    y: centerY + (0.5 - p.y / height) * scale  // Flip Y
}));

The scale parameter controls the final size (default 1.0 = full canvas width), and centerX/centerY allow positioning the shape anywhere on screen.

Results

Running the pipeline on the reference M logo (epsilon=5, scale=0.35) produces:

Metric	Value
Input Resolution	1024×1024 pixels
Raw Contour Points	~800 pixels
After RDP (ε=5)	14 vertices
Triangulation Output	10 triangles

The generated triangles perfectly match the reference image when rendered in the simulation. Fluid flows smoothly around the diagonal edges with no penetration or gaps.

Key Parameters

The pipeline exposes several tunable parameters for different use cases:

Parameter	Default	Effect
`--epsilon`	2	RDP tolerance in pixels (higher = fewer vertices)
`--threshold`	128	Binary threshold (0-255) for shape detection
`--scale`	1.0	Output size as fraction of canvas width
`--invert`	false	Treat white as shape instead of black

Technical Details

Language: Node.js with ES modules
Dependencies: Jimp (image loading), Earcut (triangulation)
RDP Implementation: Custom recursive implementation
Contour Tracing: Moore-Neighbor with 8-connectivity
Output Format: JavaScript array of triangle objects

Edge Cases

The pipeline handles several tricky scenarios:

Anti-aliased edges: Threshold converts gray pixels to binary
Multiple contours: Currently extracts the first found (could be extended)
Degenerate triangles: Earcut is robust to edge cases
Very small shapes: Warning if grid area < 3 cells

Future Improvements

Possible extensions to the pipeline:

Multi-contour support: Extract holes (e.g., inner regions of "O" or "P")
Convex decomposition: Output polygons instead of triangles when possible
Adaptive epsilon: Automatically tune based on shape complexity
Preview mode: Render extracted shape before exporting

Try It Yourself

View on GitHub