The Weather App That Became an AI Art Pipeline

This started with a spark: wouldn't it be cool if my kids could glance at the weather and know what to wear?

Not read a forecast. Not parse wind speed. Just look.

I wanted a small character on my TRMNL that could show up in different weather, wearing different clothes, with enough consistency that my kids would recognize the pattern. Raincoat means rain. Hoodie means cool. Winter gear means do not leave the house dressed like they're going to the pool.

The fun part was obvious immediately: use GPT Image 2 to generate the character art.

The less obvious part was that "generate a cute dragon(?) character" is not a product spec. It is barely even a prompt. It is a wish.

The first versions were not great: some good vibes, some weird artifacts, some grayscale mush, some characters that looked related only in the sense that they had wings and could all be called dragons.

That is where the project became interesting.

TRMNL is a small e-ink display. It does not care how charming an image looks in a browser tab if the final thing collapses into gray oatmeal after preprocessing. The display pushed the prompting away from "make a nice picture" and toward something much more specific: shapes big enough to read at a glance, hard black linework, sparse gray fills, contrast that survives 2-bit dithering, weather visible in the scene itself, and a composition designed for what the image becomes after preprocessing, not for what the model hands back.

That last point mattered a lot. I was not prompting for a normal illustration. I was prompting for an image that could be crushed down to a tiny grayscale display and still communicate.

The prompt stopped being decorative. It became part art direction, part print-production note, part accessibility constraint.

Instead of:

draw a cute dragon in rainy weather

the useful prompt shape was closer to:

Make a grayscale storybook-comic weather scene for non-backlit e-ink.
Use large readable shapes, black linework, sparse gray fills, and strong
contrast after 2-bit dithering. Preserve a calm text-safe area on the
left. Put the full character and visible weather action on the right.
Rain must be visible in the environment, not only implied by clothing.
No text, signs, labels, panels, borders, UI, logos, or dense texture.

That sounds fussy because it needed to be fussy.

Image models are happy to satisfy the emotional intent while missing the production intent. "Rainy" might mean a character holding an umbrella in a dry scene. "Kid-readable" might produce childish props. "Grayscale" might produce beautiful tonal fog that becomes useless once pushed through a 2-bit display pipeline.

I had to learn to prompt for the thing after the next transformation, not just the thing the model returns.

That was the core loop:

  1. Generate.
  2. Inspect.
  3. Notice what failed after sizing, cropping, and grayscale treatment.
  4. Tighten the prompt.
  5. Regenerate.
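
A tiny preview script made step 3 much faster: crush the candidate the way the display would, then look. This is a minimal sketch, assuming Pillow; the panel size, the crude 4-level quantization, and the paths are illustrative stand-ins, not the actual TRMNL pipeline.

from PIL import Image

EINK_SIZE = (800, 480)  # assumed panel resolution, purely illustrative

def preview_for_eink(src: str, dst: str) -> None:
    """Roughly simulate the display: shrink, drop color, snap to 4 gray levels."""
    img = Image.open(src).convert("L")          # grayscale first
    img = img.resize(EINK_SIZE)                 # then panel size
    img = img.point(lambda p: (p // 64) * 85)   # crude 2-bit quantization (the real pipeline dithers)
    img.save(dst)

preview_for_eink("out/dragon_rain.png", "preview/dragon_rain_eink.png")

If the character still reads after that, it has a chance on the real screen.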

The cost of that loop was shockingly low. All in, including experimentation and regeneration, I spent under $20 on GPT Image 2.

That changes the feel of building. I could be playful without being precious. A failed generation was not a tragedy. It was feedback. Try a different contrast instruction. Ban a new artifact. Make the rain more environmental. Pull back on texture. Preserve the silhouette. Regenerate.

The other half of the project was agentic coding.

TRMNL has a UI-based deployment model. It is intuitive, which is good. It is also clunky when you are iterating quickly, which is less good. I did not want to click around a dashboard every time I changed layout, prompts, asset paths, or weather rendering.

So I moved the iteration loop into code. The agents helped build the local generator, plugin HTML, tests, asset paths, and docs around it. TRMNL stayed the target display, but the fast feedback loop lived in the repo.

That made a huge difference. The image model handled visual candidates. The coding agents handled the boring glue. I got to keep steering.

At first, one character was enough. A dragon can carry a lot of charm.

Then I added more characters.

That is when consistency became the real problem.

One generated dragon in one weather scene can look great. Eighteen characters across day, night, rain, fog, snow, hot, cold, and everything in between is a different beast. Prompt packets help, but they do not magically make an image model preserve identity across hundreds of outputs.

The failure mode was not dramatic. It was worse: it was subtle.

A character would keep the same broad idea but lose the face. Or the silhouette. Or the clothing logic. Or the value map. One robot would start wearing human rain boots. One mascot would drift toward a different species. A cold-weather variant would hide the very features that made the character recognizable.

That pushed the project from "better prompts" into a more disciplined reference workflow.

Each character needed a canonical reference sheet. The weather images needed to use that reference as the identity source. Regenerating the reference should invalidate old approvals. Approval needed to bind to the actual image bytes and the prompt hash, not just a filename sitting around looking official.
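
Here is a minimal sketch of what that binding can look like, assuming Python and a simple JSON manifest; the field names and file layout are my own invention, not the project's actual scheme.

import hashlib
import json
from pathlib import Path

def file_hash(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def approve(image: Path, prompt: str, reference: Path, manifest: Path) -> None:
    """Record an approval keyed to image bytes, prompt text, and the reference sheet."""
    approvals = json.loads(manifest.read_text()) if manifest.exists() else {}
    approvals[image.name] = {
        "image_sha256": file_hash(image),
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "reference_sha256": file_hash(reference),
    }
    manifest.write_text(json.dumps(approvals, indent=2))

def still_approved(image: Path, prompt: str, reference: Path, manifest: Path) -> bool:
    """Approval dies if the image bytes, the prompt, or the reference sheet change."""
    if not manifest.exists():
        return False
    record = json.loads(manifest.read_text()).get(image.name)
    return record == {
        "image_sha256": file_hash(image),
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "reference_sha256": file_hash(reference),
    }

Renaming a file proves nothing here; if the reference sheet is regenerated, its hash changes and every approval that pointed at it quietly stops counting.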

That sounds like overkill for a cute weather app until you have looked at a grid of generated characters and realized the model is quietly improvising.

The funny thing is that the whole project still feels small.

It is a weather app for a tiny e-ink screen. It tells my kids what kind of clothes the day wants. It has dragons and robots and little weather scenes. There is no grand platform hiding underneath it.

But the path from idea to working thing felt newly available.

I did not need to hire an illustrator for a full production set before proving the concept. I did not need to manually wire every asset by hand through a deployment UI. I did not need to accept whatever the first model output gave me. I could generate, critique, tighten, regenerate, and let agents turn that loop into something repeatable.

That is the part I keep coming back to.

There is still so much room to build.

Not because AI magically finishes products. It does not. The model gives you an output, and then you still have to have taste. You still have to notice what is wrong. You still have to decide what matters. You still have to turn a spark into a system.

But the distance between "wouldn't it be cool if..." and "I can try that this weekend" is collapsing.

That is a big deal.

Cool things are still just a spark of imagination away. Now the spark has better tools.