September 29, 2025
The following is an excerpt from the forthcoming book Vibe Coding: Building Production-Grade Software With GenAI, Chat, Agents, and Beyond by Gene Kim and Steve Yegge.
The vibe coding loop looks similar to the traditional developer loop. But when you’re coding with AI, every step becomes critical. As you’ll see soon, you can’t fall asleep at the wheel. If you do, you’ll soon wind up with frustrating and expensive rework, a theme we continue to explore throughout Part 2 of the book. Let’s talk about the vibe coding loop. Here’s what it can look like:
By the way, once you’re somewhat experienced with this vibe coding loop, there is one more critical step to add: notice the repetitive, manual steps (the toil) in your own loop and automate them.
Automating this toil will not only make you faster but will speed up your ability to experiment and innovate. We’ll talk more about the unexpectedly high benefits of this later in the book. (Hint: It’s the O in FAAFO). And if, at any time, you’re typing a lot or manually searching through data structures, stop and ask yourself: “Could I ask AI to help with this?” The answer is usually yes, and you’ll be faster and have more fun.
For the past fifteen years, Gene has been taking screenshots whenever he finds something interesting in podcasts or YouTube videos, hoping to revisit those moments eventually, maybe to write about them someday or to research an interesting fact further. In practice, he rarely used them. It was too tedious to search through the screenshots, locate the original content, and find the exact quote he needed. The juice didn’t seem worth the squeeze. Optimistically, he held out hope that it might be worth it someday, and he kept taking screenshots. For fifteen years! We mentioned this story briefly in the Preface, but now we’ll show the details of how Gene was able to vibe code his way to success.
In our first vibe coding pairing session together, we set out to build something that could create video excerpts (clips) of YouTube videos directly from Gene’s screenshots. He would be able to dig up a picture and, with the click of a button, post that excerpt from the video. His new tool would also use the video transcript to add overlaid captions (subtitles) onto the clips.
We used ffmpeg, a super-powerful command-line tool that can process, convert, and manipulate video and audio files in almost any format. It’s notorious for having extremely complex command-line options and syntax, which makes the operations difficult to write and almost impossible to read afterward. With this complexity in mind, we were going to find out if AI could come to the rescue.
In the following sections, we’ll walk you through how Gene went through the vibe coding loop multiple times, using a chat assistant to build what he wanted. We recorded the forty-seven minutes it took for him to build it.
First, Gene explained to Steve what he was trying to build. He needed a tool to automate the process of creating a “highlights reel” from his extensive collection of video highlights, which were video screenshots he had taken on his phone. Before starting our session, he had converted those screenshots into the following data: the YouTube channel and video, as well as the start and end times of the video clip he wanted generated. He also had movie files and transcripts of those YouTube videos.
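To make that concrete, here is a hedged illustration of what one screenshot-derived highlight entry might look like; the field names are our assumptions, not the actual schema Gene used.

;; Hypothetical example of one highlight entry derived from a screenshot.
;; Field names are illustrative assumptions, not the book's actual schema.
{:channel   "Example YouTube Channel"
 :video-id  "abc123XYZ"
 :start-sec 105
 :end-sec   152}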
He aimed to create captioned video .mp4 files, with the transcript converted into subtitles that showed up in the video frame, so he could share them on social media. Gene felt his thousands of screenshots were a treasure trove of the wisdom of others, of interesting research material, and of miscellaneous topics that people would be interested in. This tool would finally let him start sharing that accumulated wisdom.
Given the objective, Gene now needed to decompose his problem into tasks that he could implement with AI. He came up with the following tasks, which could be implemented and validated using AI:
- Extract the specified segment of the source video file.
- Extract the portions of the transcript that fall within the highlight’s start and end times.
- Convert those transcript portions into a caption file.
- Overlay the captions onto the video clip.
For this project, Gene chose to use Claude via the Sourcegraph AI assistant inside his IntelliJ IDE, though any assistant (and any model) would have worked. This session occurred before autonomous agents, so he was vibe coding using regular chat, a skill that remains useful today with agents because some problems will always be best solved with chat.
Gene’s vibe coding loop looked like this: He would type his prompt in the assistant window. AI would generate some code in the chat. Gene would copy and paste that answer into his editor, or in some cases, smart-apply it directly into the code with a button click. Ask, answer, integrate, over and over. And it worked! Boy, did it ever. As we shall see.
Gene’s first task was to extract a segment of the source video file. Here was his starting prompt:
Given an excerpt beginning and end (in seconds), give me the ffmpeg command to extract that portion of the video. Go ahead and shell out and put that into a file /tmp/output.mp4.
A short prompt, but it got the job done. No need to look up any ffmpeg documentation, no need to learn the command-line arguments, no need to learn time unit conventions. AI handled all the details. Within minutes, Gene and Steve had working code that could extract video clips. He opened the video file, and it looked great. Given the simple nature of this task, Gene decided tests were not needed. He was convinced that we could rely on ffmpeg working correctly, so we moved on to the next task. (You decide whether that was a good decision.)
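To show roughly what that step looks like in code, here is a minimal sketch of a Clojure function that shells out to ffmpeg; the function name and argument handling are ours, not what AI actually generated in the session.

(ns clipper.extract
  (:require [clojure.java.shell :as shell]))

;; Minimal sketch: copy the segment between start-sec and end-sec from the
;; source video into /tmp/output.mp4 by shelling out to ffmpeg.
(defn extract-clip!
  [video-path start-sec end-sec]
  (shell/sh "ffmpeg" "-y"
            "-i" video-path
            "-ss" (str start-sec)
            "-to" (str end-sec)
            "-c" "copy"
            "/tmp/output.mp4"))

One caveat worth knowing: -c copy cuts on keyframes, so the clip boundaries can be slightly imprecise; dropping -c copy and re-encoding is the usual fix when exact timing matters.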
Next, Gene moved on to processing the transcript data. Given the start and end time of the highlight, he needed to extract the relevant transcript portions. Here was the prompt he used:
Here’s the video transcript (it’s a JSON array of objects). Write a function that, given a list of start and end ranges, extracts all the relevant entries in the transcript.
AI generated the function, which Gene copied into his Clojure code base. Although it ran correctly, this was a nontrivial function, so we needed test cases. This function computed intersections of time ranges in the transcript and seemed to have lots of places where the code might go wrong.
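As a rough illustration of the kind of function involved, here is our own sketch, not the generated code; it assumes transcript entries are maps with :start, :end, and :text, with times in seconds.

;; Sketch: keep every transcript entry that overlaps at least one
;; [range-start range-end] pair. The entry shape is an assumption.
(defn overlaps?
  [{:keys [start end]} [range-start range-end]]
  (and (< start range-end)
       (> end range-start)))

(defn entries-in-ranges
  [transcript ranges]
  (filter (fn [entry] (some #(overlaps? entry %) ranges))
          transcript))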
Gene gave our AI assistant another prompt: “Write some tests.” It generated several interesting test cases, exercising the different ways that time ranges might overlap. And indeed, one test case failed.
This was a genuine teachable moment for both of us. Our AI assistant was sure that the failed case was due to an off-by-one error in the code. But we discovered the code itself was correct; it was the generated test cases that were wrong. So much for tests that “look good.”
This reminded us that AI is not always reliable. We had to stay vigilant and verify its answers—especially because AI almost always sounds confident and correct and explains why it’s correct in lengthy detail. In this case, it was right when it generated the initial code but completely wrong in guessing why the tests were failing.
We soon had a tested function, which, given a list of transcript start/end ranges, would correctly extract the text for that part of the transcript. So far, so good.
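To give a flavor of the overlap cases such tests exercise, here is a hedged sketch using clojure.test against the entries-in-ranges sketch above; these cases are our illustrations, not the ones AI generated.

(require '[clojure.test :refer [deftest is]])

(deftest overlap-cases
  (let [transcript [{:start 0 :end 5 :text "a"}
                    {:start 5 :end 10 :text "b"}
                    {:start 10 :end 15 :text "c"}]]
    ;; range fully inside a single entry
    (is (= ["b"] (map :text (entries-in-ranges transcript [[6 7]]))))
    ;; range spanning two entries
    (is (= ["a" "b"] (map :text (entries-in-ranges transcript [[4 6]]))))
    ;; a zero-length range on a boundary matches nothing
    (is (empty? (entries-in-ranges transcript [[5 5]])))))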
Finally, we needed to add captions. This meant taking the transcript file and inserting it as captions that could be seen in the video frames. This was a large enough task that we decomposed it into the following subtasks:
First, we asked ChatGPT what caption formats ffmpeg supports. (Answer: SRT and ASS formats, which neither Gene nor Steve knew about before. And now we do!)
Gene then asked ChatGPT, “Give examples of SRT and ASS transcript files.” Gene chose the SRT transcript format because it had fewer fields and looked simpler to implement. Again, there is no need to become an SRT file format specialist. We then asked ChatGPT to generate the SRT file from the transcript segments.
Gene wrote this prompt:
Write a function to transform my list of transcript entries (a JSON array) into an SRT file.
Our AI assistant generated the code to do it, and it chose a great function name (which is sometimes more difficult than writing the function). Finally, we needed the subtitle text to be placed into the video frames. We learned that ffmpeg calls these “captions.”
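Before moving on, here is a hedged sketch of what an SRT-generating function can look like (our illustration, not the generated code). An SRT file is just numbered blocks, each with an HH:MM:SS,mmm --> HH:MM:SS,mmm time range followed by the caption text.

(require '[clojure.string :as str])

;; Sketch: render transcript entries (:start/:end in seconds, :text) as SRT.
(defn sec->srt-time [s]
  (let [ms     (long (* 1000 s))
        h      (quot ms 3600000)
        m      (quot (mod ms 3600000) 60000)
        sec    (quot (mod ms 60000) 1000)
        millis (mod ms 1000)]
    (format "%02d:%02d:%02d,%03d" h m sec millis)))

(defn entries->srt [entries]
  (->> entries
       (map-indexed
        (fn [i {:keys [start end text]}]
          (str (inc i) "\n"
               (sec->srt-time start) " --> " (sec->srt-time end) "\n"
               text "\n")))
       (str/join "\n")))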
Modify the ffmpeg command to generate captions, using the specified SRT caption file.
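Here is a hedged guess at the shape of the resulting code (ours, not the session's); burning subtitles into the frames uses ffmpeg's subtitles filter, which requires re-encoding, so the earlier -c copy goes away.

(require '[clojure.java.shell :as shell])

;; Sketch: extract the clip and burn the SRT captions into the video frames.
(defn extract-captioned-clip!
  [video-path srt-path start-sec end-sec]
  (shell/sh "ffmpeg" "-y"
            "-i" video-path
            "-ss" (str start-sec)
            "-to" (str end-sec)
            "-vf" (str "subtitles=" srt-path)
            "/tmp/output.mp4"))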
If you watch the session recording, you can hear Gene gasp the moment he opens the video and sees the video excerpt with overlaid captions. We had not been vibe coding for long, barely over half an hour. And we hadn’t written many prompts. On the recording, Gene declared, “This is freaking incredible,” plus lots of expletives we had to censor out.
In a total of forty-seven minutes of pair programming using vibe coding techniques with chat, Gene had built a working video clip generator that achieved his goal: captioned video excerpts, generated from his screenshot data, ready to share on social media.
Not bad for an hour’s work. It turned into an hour because, upon closer inspection, Gene and Steve noticed that two lines of captions were being displayed, and there was something wrong with the caption timing. They spent a few minutes trying to fix it, and then Gene promised to work on it that evening.
The next day, after Gene got his code working, he texted Steve: “Holy cow, I got this running! I had so much fun generating and posting excerpts, extracting every quote I found inspiring.” Steve had not expected that Gene—who is not a professional programmer—would have accomplished this in under an hour. Gene had finally created a way to plunder his fifteen-year-old treasure trove.
What’s better is that it turns out the video Gene was using for testing the code was a talk by Dr. Erik Meijer (whom you may recall from Part 1). When Gene posted a twelve-part series of his favorite quotes from that talk on social media, Dr. Meijer responded: “This looks amazing. Thanks for doing this. It helps grasp the talk even faster than just watching at 2x speed.”
Gene’s tweet got nearly a quarter million views. Clearly others were finding his treasure trove and excerpt format valuable. This is the kind of impact vibe coding can unlock.
Okay, if you’re super experienced, Gene’s programming feat might sound mundane. It’s mostly new code in a small code base, and the final product was smaller than what some professional developers might commit multiple times a day. Some of you could have written this whole program in a quarter of the time it took us pairing with vibe coding.
That’s fair. But it’s also not the point. The takeaway here is not “Oh ho, ha ha. AIs will never replace real programmers.” The point is that we were able to build it at all. The program never would have been written the old way, but Gene did it in under an hour (fast) with AI.
For Gene, this was a life-changing experience. Gene achieved FAAFO. He had considered this so far out of reach that he had never bothered trying (ambitious). After creating this program, he used it several times a week because it unlocked the value of thousands of interesting moments he had captured while listening to podcasts. Best of all, it was fun, and it set in motion the creation of tons of other utilities, some of which he uses multiple times daily.
Here are some other takeaways from this early vibe coding session:
We did this little test in September 2024 (almost prehistoric AI times). Given all the advances in coding agents, we know we could complete this project today in a fraction of the time. A coding agent could doubtless have solved this problem in a couple of minutes. As AI improves, it will be able to handle larger and larger tasks. It’s possible that Gene’s video excerpting program could have been implemented in one shot—if not today, sometime in the future. But like when giving tasks to humans, the larger the task you have AI take on, the more that can go wrong.
The relevant skill is no longer code generation (i.e., typing out code by hand), but being able to articulate your goals clearly and create good specifications that AI can implement. Because of this, the principles here continue to apply to larger projects as AI’s capabilities scale up.
In the Preface, Gene mentioned that he had his first inkling of how powerful chat programming could be as early as February 2024. While we’re talking about chat programming, here is a slightly expanded explanation of what happened.
For the non-iOS YouTube screenshots, he could ask the new ChatGPT-4 vision model to extract the current playback time displayed in the video player controls (e.g., “1:45”). But screenshots from the iOS YouTube app were different. They only showed a red progress bar with no visible time stamp. Without that timing information, he couldn’t automatically determine where in the video to create his excerpts.
On a whim, Gene typed into ChatGPT: “Here’s a YouTube screenshot. There’s a red progress bar under the video player window. Write a Clojure function that analyzes the image. March up the left side of the image to find the red progress bar.” The AI-generated code used Java 2D graphics libraries—ImageIO, BufferedImage, Color classes—which Gene had never used before. Gene hadn’t used bitmap functions since writing Microsoft C++ code in 1995. When the function correctly identified the progress bar on row 798 of the image on the first try, Gene sat slack-jawed.
Next, he extended the solution. “On that row, march right until you see a non-red pixel,” he prompted, and AI delivered code that calculated the exact playback percentage from the progress bar’s position. What would have taken him days of studying graphics APIs—if he’d attempted it at all—was working in under an hour. This code transformed thousands of iOS screenshots from unusable artifacts into valuable time stamps.
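For the curious, here is a hedged sketch of that kind of pixel-marching code in Clojure using the same Java classes; the function names, the scan direction, and the "red" thresholds are our assumptions, not the code AI produced.

(import '[javax.imageio ImageIO]
        '[java.awt Color]
        '[java.io File])

;; Sketch: find the row containing the red progress bar by scanning the left
;; edge, then march right along that row until the pixel is no longer red;
;; the ratio of that column to the image width approximates the playback
;; percentage.
(defn red-pixel? [img x y]
  (let [c (Color. (.getRGB img x y))]
    (and (> (.getRed c) 180)
         (< (.getGreen c) 80)
         (< (.getBlue c) 80))))

(defn playback-fraction [path]
  (let [img    (ImageIO/read (File. path))
        width  (.getWidth img)
        height (.getHeight img)
        bar-y  (first (filter #(red-pixel? img 0 %) (range height)))]
    (when bar-y
      (let [bar-end (or (first (remove #(red-pixel? img % bar-y) (range width)))
                        width)]
        (/ (double bar-end) width)))))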
That’s what changed Gene’s life in 2024 and set the stage for his exciting adventure with Steve a year and a half later. Truly, FAAFO.
Gene’s video excerpting tool shows the vibe coding loop in action. By breaking down a complex task, collaborating with AI through conversation, and iteratively building a solution, Gene accomplished in under an hour what might never have happened otherwise.
But, as valuable as this chat-based approach proved to be, it only scratches the surface of what’s possible with vibe coding. Later in the book, we’ll examine the prompts that Gene used and show what made them effective.
Before we do that, we’ll look at what we can do with autonomous, agentic coding assistants, or “coding agents,” and how they alter the vibe coding loop.
Stay tuned for more exclusive excerpts from the upcoming book Vibe Coding: Building Production-Grade Software With GenAI, Chat, Agents, and Beyond by Gene Kim and Steve Yegge on this blog or by signing up for the IT Revolution newsletter.
Gene Kim has been studying high-performing technology organizations since 1999. He was the founder and CTO of Tripwire, Inc., an enterprise security software company, where he served for 13 years. His books have sold over 1 million copies—he is the WSJ bestselling author of Wiring the Winning Organization, The Unicorn Project, and co-author of The Phoenix Project, The DevOps Handbook, and the Shingo Publication Award-winning Accelerate. Since 2014, he has been the organizer of DevOps Enterprise Summit (now Enterprise Technology Leadership Summit), studying the technology transformations of large, complex organizations.
Steve Yegge is an American computer programmer and blogger known for writing about programming languages, productivity, and software culture for two decades. He has spent over thirty years in the industry, split evenly between dev and leadership roles, including nineteen years combined at Google and Amazon. Steve has written over a million lines of production code in a dozen languages, has helped build and launch many large production systems at big tech companies, has led multiple teams of up to 150 people, and has spent much of his career relentlessly focused on making himself and other developers faster and better. He is currently an Engineer at Sourcegraph working on AI coding assistants.