Bringing a 12-Year-Old Idea to Life with AI

Chris Weiher
Jun 2

When I started CLEAVER back in 2012, I had a very specific image in my head: a butcher's cleaver slamming down through a strip of film.

The Initial Roadblocks

Back then, I realized that executing this idea could mean investing in resources that just didn't make sense for a new company. The barriers included:

A somewhat expensive 3D animation.
Hiring a small team.
Extensive render times.
A budget that was simply out of reach for a startup.

So, the idea stayed in my head. We eventually got really busy with work and never ended up making the animation.

Testing the Concept with Modern AI

Last year, I decided to finally give this idea a try using the two biggest text-to-video AI tools that were available.

The results were eye-opening. They taught me exactly where AI was headed, and the output was really impressive.

The Brief

The concept was never complicated: a photorealistic butcher's cleaver cuts through a spool of movie film with real force. Weight. Impact. Cause and effect.

The challenge wasn't generating pretty visuals. It was whether AI could understand intent — physics, momentum, and the cinematic logic of a single violent action.

Sora: Stunning to Look At, Confused by Physics

I started with a straightforward prompt inside Sora:

"Photorealistic cleaver cutting film on a cutting board."

The lighting was immediately impressive. Image quality was strong. Renders came back fast.

But the action fell apart.

Sora initially interpreted "film" too loosely, and the cleaver barely interacted with the object at all. I refined the prompt:

"Photorealistic cleaver cuts through a long spool of movie film with great force."

Better object recognition — but the physics still didn't hold. The knife moved limply. There was no sense of impact or weight transfer between objects.

I tried a different approach: generating a still image in Adobe Firefly first, then using it as the animation base. This got visually closer to my original concept. But then the film started drifting across the table like a ribbon in a breeze, with the cleaver awkwardly chasing it.

When I told it to stop moving the film, Sora replaced the cleaver with something featuring saw blades.

Throughout all of this, the visual quality stayed excellent. The logic never did.

Check my original LinkedIn post here.

Kling: Better Physics, Better Comprehension

Moving to Kling felt like a different conversation.

Using the same source image and core concept, Kling produced an animation that actually respected the physics of the scene. The film behaved like film. The knife-to-object interaction made visual sense. The motion had directionality.

It wasn't perfect — the cleaver's force still felt understated, and I'd likely reverse the clip in Premiere to sharpen the final result. But Kling understood the assignment in a way Sora didn't.

The difference wasn't image quality. It was comprehension.

Why Physics Is AI Video's Hardest Problem

Generative AI video has advanced fast, but physics simulation remains the field's most stubborn challenge. According to OpenAI's own technical research on Sora, current video generation models still struggle to accurately simulate complex physical interactions — a limitation the company openly acknowledges. Researchers at Stanford's Human-Centered AI Institute have similarly flagged object permanence, realistic force transfer, and spatial consistency as core unsolved problems in generative video.

This is why AI videos can look spectacular in a still frame and fall apart the moment things start colliding. Humans have deeply internalized how gravity, momentum, and impact feel. When those relationships are wrong — even slightly — we notice immediately. That instinct is nearly impossible to fool, a challenge NVIDIA's generative AI rendering research has described as one of the fundamental gaps between visual plausibility and physical believability.

Why This Moment Matters for Creative Businesses

The short version: what used to cost thousands of dollars and weeks of production time can now be prototyped in an afternoon.

According to HubSpot's State of AI Marketing Report, more than 80% of marketers now use AI-assisted tools somewhere in their workflow. Video continues to outperform almost every other content format in engagement and retention, as Wyzowl's annual video marketing research consistently shows. And per McKinsey's Generative AI industry analysis, global investment in generative AI has surged dramatically — signaling these tools are moving from novelty to infrastructure.

This is the same pattern we've seen before. When Pixar released Toy Story in 1995, high-quality 3D animation required massive infrastructure and highly specialized talent, as the studio's own history documents. Only a handful of organizations could access those resources. Today, independent creators approximate that same quality with accessible software. YouTube collapsed the cost of video distribution, as the platform's creator economy research has tracked over the years. And when Photoshop became widely accessible, Adobe's creative trends reporting shows it fundamentally shifted who could compete visually, giving small studios the reach of large ones overnight.

Text-to-video AI is the next compression of that gap — potentially removing entire stages of production, not just reducing their cost.
