Your CPU is a genius. It can branch, speculate, reorder instructions, manage complex state machines, and juggle a hundred different tasks in the time it takes you to blink. Your GPU is not a genius. Your GPU is a factory floor staffed by thousands of workers who are each capable of exactly one thing: following a short list of instructions very, very fast.

That’s what a shader is. A short list of instructions that runs on every single pixel (or vertex) of your screen, simultaneously, across thousands of cores. My site’s cosmic background? That’s a fragment shader running on every pixel of your viewport, sixty times per second. Each pixel gets the same program but different coordinates, and the result is a procedurally generated nebula that never repeats.

This post is about how that works. Not a beginner tutorial. More of a mental model for how to think about GPU programming, using my actual shaders as examples. I promise to make it fun.


The Parallel Mindset

Here’s the thing that trips up every programmer the first time they write a shader: you can’t talk to your neighbors.

On a CPU, if you want pixel (400, 300) to know what color pixel (401, 300) is, you just… read it. Array lookup. Done. On a GPU, that’s not how it works. Your shader runs on pixel (400, 300) and it has no idea what’s happening at (401, 300), because that pixel is being processed by a different core at the exact same time. There’s no shared state. There’s no “wait for the other pixel to finish.” Each invocation of your shader is an island. A very lonely, very fast island.

This constraint is also the GPU’s superpower. Because no pixel’s result depends on any other’s, the hardware can run every invocation in parallel. A modern GPU has thousands of shader cores. A 1920x1080 viewport has about 2 million pixels. Your fragment shader runs 2 million times per frame, 60 frames per second, and the GPU just… handles it. That’s over 120 million executions per second of your little program. 🤯

Once you internalize this, a lot of shader conventions start making sense.


The Data Pipeline: Uniforms, Attributes, and Varyings

There are exactly three ways to get data into a shader, and understanding them is the whole game.

Uniforms are values that the CPU sends to the GPU, typically once per frame. They’re constant for the entire draw call: the same for every pixel and every vertex. Think of them as global constants that change over time. In my cosmic background shader, the uniforms look like this:

uniform float uTime;
uniform vec2 uMouse;
uniform vec2 uResolution;

uTime is a clock that ticks up every frame. uMouse is the normalized mouse position. uResolution is the viewport size in pixels. Every single pixel gets the same values for these. The CPU updates them each frame with gl.uniform2f() or similar calls, and the GPU reads them. Simple enough, right?
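
To make this concrete, here’s a minimal fragment shader that uses nothing but those three uniforms. It’s an illustration rather than code from my site: just a pulsing glow that follows the cursor.

precision mediump float;

uniform float uTime;
uniform vec2 uMouse;
uniform vec2 uResolution;

void main() {
  // Normalize pixel coordinates to the 0..1 range using the viewport size.
  vec2 uv = gl_FragCoord.xy / uResolution;
  // Pulse brightness over time, and brighten pixels near the cursor.
  float pulse = 0.5 + 0.5 * sin(uTime);
  float glow = max(1.0 - distance(uv, uMouse), 0.0);
  gl_FragColor = vec4(vec3(pulse * glow), 1.0);
}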

Attributes are per-vertex data: every vertex gets its own value. In a mesh, these are typically positions, normals, and texture coordinates. In my constellation text renderer, each point particle has its own attributes:

attribute vec2 position;
attribute float pointIndex;
attribute float birthTime;

Every point in the constellation gets a unique position, a unique pointIndex (used to stagger animations), and a unique birthTime (so letters fade in sequentially). This data lives in GPU buffers that the CPU uploads once; the vertex shader then reads it one vertex at a time.

Varyings are the bridge between vertex shaders and fragment shaders, and they’re the cleverest part of the pipeline. A varying is a value that the vertex shader outputs and the fragment shader receives, but here’s the trick: the GPU interpolates it across the surface of the triangle. If one vertex sets vAlpha = 1.0 and the adjacent vertex sets vAlpha = 0.0, then a fragment halfway between them will receive vAlpha = 0.5. The hardware does this interpolation for free. I love free things!

In my cosmic background, the vertex shader computes UV coordinates and passes them to the fragment shader as a varying:

attribute vec2 position;
varying vec2 vUv;
void main() {
  vUv = position * 0.5 + 0.5;
  gl_Position = vec4(position, 0.0, 1.0);
}

The vertex shader runs on a full-screen quad (two triangles covering the viewport), and vUv smoothly interpolates from (0,0) at the bottom-left to (1,1) at the top-right. The fragment shader receives a unique UV coordinate for each pixel without anyone having to calculate it explicitly. The rasterizer just does it. It’s honestly kind of beautiful.
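
On the other side of the pipeline, the fragment shader just declares the same varying and reads it. A minimal sketch (visualizing the interpolated UVs directly, rather than doing any nebula math):

precision mediump float;
varying vec2 vUv;
void main() {
  // Every fragment receives its own interpolated UV: red ramps left to right,
  // green ramps bottom to top.
  gl_FragColor = vec4(vUv, 0.0, 1.0);
}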


Noise: Why Randomness Isn’t Random

The cosmic background gets its organic, nebula-like appearance from noise functions. Specifically, simplex noise. If you’ve ever used Math.random() and wondered why it looks terrible for procedural generation, the answer is that white noise has no spatial coherence. Adjacent pixels get completely unrelated values, and the result looks like TV static. Not the vibe!

Simplex noise solves this by being smooth. Nearby coordinates produce nearby values. The function is deterministic (same input, same output, always) but it looks random. Here’s a simplified version of what the math is doing:

  1. Take your input coordinate (say, a 3D point in space).
  2. Skew it onto a simplex grid (triangles in 2D, tetrahedra in 3D). This is more efficient than the regular grid that classic Perlin noise uses.
  3. For each corner of the simplex that contains your point, compute a gradient vector (derived from a hash function, which is where those mod289 and permute functions come in).
  4. Dot each gradient with the offset vector from that corner to your point.
  5. Sum the contributions, each weighted by a smooth radial falloff kernel.

The result is a value between -1 and 1 that changes smoothly across space. My implementation uses the standard Ashima Arts simplex noise, which is the one you’ll find in roughly 90% of all WebGL projects. If it ain’t broke.

float snoise(vec3 v) {
  const vec2 C = vec2(1.0 / 6.0, 1.0 / 3.0);
  vec3 i = floor(v + dot(v, C.yyy));
  vec3 x0 = v - i + dot(i, C.xxx);
  // ... 40 more lines of coordinate hashing
  // and gradient interpolation
  vec4 m = max(0.6 - vec4(dot(x0,x0), dot(x1,x1),
                           dot(x2,x2), dot(x3,x3)), 0.0);
  m = m * m;
  return 42.0 * dot(m*m, vec4(dot(p0,x0), dot(p1,x1),
                               dot(p2,x2), dot(p3,x3)));
}

That final 42.0 scaling constant isn’t a Hitchhiker’s Guide reference, unfortunately. I checked. It normalizes the output to roughly the -1 to 1 range. The repeated squaring (m * m, then m * m again inside the dot product) produces a quartic falloff, which is where simplex noise gets its characteristic smooth, blobby appearance. Each corner’s contribution fades to zero within a fixed radius, before it can reach neighboring simplices, so there are no discontinuities at cell boundaries.


FBM: Noise on Top of Noise

One layer of simplex noise looks like blurry blobs. Interesting, but not very natural. Nature has detail at every scale: large cloud formations contain smaller wisps, which contain even finer tendrils. Fractional Brownian motion (fbm) fakes this by layering noise at increasing frequencies and decreasing amplitudes:

float fbm(vec3 p) {
  float v = 0.0, a = 0.5;
  for (int i = 0; i < 5; i++) {
    v += a * snoise(p);
    p *= 2.0;
    a *= 0.5;
  }
  return v;
}

Five iterations. Each one doubles the frequency (p *= 2.0) and halves the amplitude (a *= 0.5). The first layer creates large-scale structure. The second adds medium detail. By the fifth layer, you’re adding tiny high-frequency variation that gives the result a natural, cloudy quality. Nine lines of code to simulate nature. Not bad.

This is the same math behind terrain generation, cloud rendering, and procedural textures in basically every game made after 2003. It’s also cheap enough to run per-pixel at 60fps on integrated graphics, which matters when your site needs to work on a 2019 MacBook Air. Not everyone has a 4090, surprisingly.


Painting the Nebula

Here’s the actual color logic from my cosmic background’s main() function. Three layers of fbm noise, each contributing a different color:

vec3 col = vec3(0.006, 0.004, 0.015); // near-black base

float n1 = fbm(vec3((uv + mouseOffset * 0.5) * 1.5, t));
col += vec3(0.4, 0.08, 0.6) * pow(n1 * 0.5 + 0.5, 2.2) * 0.45;

float n2 = fbm(vec3((uv + mouseOffset) * 2.0 + 5.0, t * 0.7));
col += vec3(0.02, 0.28, 0.45) * pow(n2 * 0.5 + 0.5, 2.5) * 0.35;

float n3 = fbm(vec3((uv + mouseOffset * 1.5) * 3.0 + 10.0, t * 0.5));
col += vec3(0.65, 0.12, 0.45) * smoothstep(0.1, 0.7, n3) * 0.2;

Each layer samples fbm at a different scale and offset. The + 5.0 and + 10.0 prevent the layers from correlating with each other. The mouseOffset (derived from uMouse) shifts the UV coordinates slightly based on cursor position, creating that subtle parallax effect when you move your mouse around. Different layers respond to the mouse at different intensities (0.5, 1.0, 1.5), so they shift at different rates. It’s a cheap trick that creates surprising depth.

The pow(n, 2.2) and pow(n, 2.5) calls push the noise distribution darker, concentrating the color into bright wisps against a mostly-dark background. Without the power curve, the nebula would look like an even wash of color. With it, you get the “bright clouds against dark space” look. A single exponent change and the whole mood shifts!

The time uniform t is uTime * 0.025, so the noise field drifts very slowly. Fast enough to feel alive, slow enough that you don’t notice it’s moving unless you stare. Go ahead, stare. I’ll wait.
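
One detail I haven’t shown is the end of main(). Assuming no extra post-processing, the frame boils down to this skeleton:

float t = uTime * 0.025;              // slow drift for the noise field
vec3 col = vec3(0.006, 0.004, 0.015); // near-black base
// ... the three fbm layers above accumulate into col ...
gl_FragColor = vec4(col, 1.0);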


Constellation Text: Instanced Points with Glow

The constellation text effect takes a different approach. Instead of running a shader per pixel on a full-screen quad, it renders thousands of individual point particles, each one a dot in a pointillist rendering of text. ✨

At build time (well, at page load), I rasterize text to a hidden canvas, scan the pixel data, and extract the positions of all visible pixels. Each one becomes a particle. Then I upload all those positions as vertex attributes and draw them in a single draw call using gl.drawArrays(gl.POINTS, ...). One draw call for thousands of glowing dots. The GPU doesn’t even break a sweat.

The vertex shader handles the animation: staggered fade-in based on each point’s index, size pulsing over time:

float fadeIn = smoothstep(0.0, 0.3, uTime - birthTime - staggerDelay);
gl_PointSize = uPointSize * (0.8 + 0.4 * sin(uTime * 3.0 + pointIndex * 0.5));

The pointIndex * 0.5 phase offset means adjacent points pulse at slightly different times, creating a shimmer that ripples across the text. The birthTime and staggerDelay mean that when a new word appears, its dots don’t all pop in at once; they cascade from one end to the other. It looks SO cool!!
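
To sketch how that wiring might fit together in the vertex shader (the stagger constant, the vAlpha varying, and the clip-space assumption are illustrative, not my literal code):

attribute vec2 position;
attribute float pointIndex;
attribute float birthTime;
uniform float uTime;
uniform float uPointSize;
varying float vAlpha;

void main() {
  // Later points start their fade slightly later, so each word cascades in.
  float staggerDelay = pointIndex * 0.02; // illustrative constant
  float fadeIn = smoothstep(0.0, 0.3, uTime - birthTime - staggerDelay);
  // Hand the fade to the fragment shader, which folds it into the glow's alpha.
  vAlpha = fadeIn;
  gl_PointSize = uPointSize * (0.8 + 0.4 * sin(uTime * 3.0 + pointIndex * 0.5));
  gl_Position = vec4(position, 0.0, 1.0); // assuming positions are already in clip space
}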

The fragment shader creates the glow effect. gl_PointCoord gives you coordinates within the point sprite (0 to 1 on each axis), so you can treat each point as a tiny canvas:

vec2 center = gl_PointCoord - 0.5;
float dist = length(center);
float alpha = 1.0 - smoothstep(0.3, 0.5, dist);

This is a distance field. dist is how far the current fragment is from the center of the point. smoothstep(0.3, 0.5, dist) creates a soft falloff: fully opaque inside a radius of 0.3, fully transparent outside 0.5, with a smooth gradient between. The result is a soft, glowing dot instead of a hard-edged square. Hard-edged squares are so 1992.

The color cycling uses a simplified HSV-to-RGB conversion locked to the cyan-blue-purple range, with each point’s pointIndex determining its base hue. The whole constellation shimmers through cool tones without ever touching red or yellow, which would clash with the background palette.
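
Conceptually, the vertex shader picks a hue somewhere in that cool band and hands the resulting color down as a varying. A simplified sketch of the idea, not my exact conversion:

// Sweep each point's hue back and forth through the cyan-to-purple band
// (roughly 0.5 to 0.78 on the hue wheel), offset per point so neighbors differ.
float hue = 0.5 + 0.28 * (0.5 + 0.5 * sin(uTime * 0.5 + pointIndex * 0.1));
// Compact hue-to-RGB conversion at full saturation and brightness.
vec3 color = clamp(abs(mod(hue * 6.0 + vec3(0.0, 4.0, 2.0), 6.0) - 3.0) - 1.0, 0.0, 1.0);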


The Flying Icons: Bezier Curves on the GPU

The flying particle icons do something I’m particularly fond of: cubic Bezier interpolation in the vertex shader. Each particle follows a curved path defined by four control points, and the GPU evaluates the curve:

float bezier(float t, vec4 p) {
  float t2 = t * t;
  float t3 = t2 * t;
  float mt = 1.0 - t;
  float mt2 = mt * mt;
  float mt3 = mt2 * mt;
  return mt3*p.x + 3.0*mt2*t*p.y + 3.0*mt*t2*p.z + t3*p.w;
}

That’s the standard cubic Bernstein polynomial. t is the progress along the path (0 to 1), and p contains the four control points packed into a vec4. The CPU generates random Bezier curves at init time and uploads them as per-instance attributes. The vertex shader advances t each frame based on uTime, and the particle glides along the curve with no CPU involvement after setup. Set it and forget it. The GPU does all the work. 😄
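
To make the per-axis part explicit, here’s a sketch of how the vertex shader could evaluate the curve. The attribute names and the timing are illustrative, not the literal setup:

uniform float uTime;
attribute vec4 curveX; // x coordinates of the four control points
attribute vec4 curveY; // y coordinates of the four control points

void main() {
  // Looping progress along the path (illustrative timing).
  float t = fract(uTime * 0.1);
  // Evaluate the cubic once per axis, reusing the bezier() helper above.
  vec2 pos = vec2(bezier(t, curveX), bezier(t, curveY));
  gl_Position = vec4(pos, 0.0, 1.0);
  gl_PointSize = 16.0;
}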

The glow in the fragment shader uses exp(-dist * 2.5), which is exponential falloff; it drops steeply near the center but never quite reaches zero, so there’s always a faint halo around each particle. Combined with the tighter smoothstep core, you get a bright center with a soft atmospheric glow.

float core = 1.0 - smoothstep(0.0, 0.3, dist);
float glow = exp(-dist * 2.5) * 0.8;
float alpha = core + glow;

Two components, added together: a solid core and a diffuse glow. Simple. Looks great.


No Three.js

All of this is raw WebGL2. No Three.js, no Babylon, no abstractions. I wrote the buffer management, the shader compilation, the uniform uploads, the render loop. Is that a good idea for a production application? Probably not. Is it a good idea for learning how GPUs actually work? Absolutely.

When you use Three.js, you write new THREE.ShaderMaterial({ uniforms, vertexShader, fragmentShader }) and things happen. When you write raw WebGL, you write gl.createShader(), gl.shaderSource(), gl.compileShader(), gl.attachShader(), gl.linkProgram(), gl.getUniformLocation(), gl.uniform2f(), and you understand every single step because you had to type every single step. The API is verbose to the point of comedy, but there’s no magic. Just you and the GPU, having a very verbose conversation.


Why Shaders Are the Most Fun You Can Have Programming

I’ve written a lot of code. Backend services, compilers, build systems, infrastructure automation. Shaders are different. The feedback loop is instant and visual. You change a number, save the file, and the entire screen transforms. You’re not debugging log output or stepping through breakpoints. You’re watching math become light.

There’s also something deeply satisfying about the constraints. You can’t use complex data structures. You can’t call external APIs. You can’t even read what the pixel next to you is doing. All you have is math, and with just math (some noise functions, a few smoothstep calls, a handful of mix and pow operations) you can render nebulae and constellations and glowing particles drifting along invisible curves. ✨

GPUs are the most powerful hardware most programmers never learn to use. If you’ve ever been curious, just open a shader editor (I like Shadertoy), type fragColor = vec4(fragCoord / iResolution.xy, 0.0, 1.0); into the default template, and watch your screen turn into a red-green gradient. That’s your first shader. Everything else is just adding more math. And honestly? That’s the best part.