October 27, 2022

Visualizing The Book Writing Process: And Help I’m Looking For (Vega and Vega-Lite, SVG, and ideas?)

By Gene Kim

Back in 2019, after The Unicorn Project was published, I published a GitHub repo of code (written in Clojure, of course) that creates a graph by analyzing the git repository where I committed the daily updates to the book manuscript. The graph below shows the word count over time and, by parsing the diffs, the region of the manuscript that was modified.
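One way to find the region of the manuscript a commit touched is to read the hunk headers of a unified diff. The sketch below is in Python rather than the Clojure used in the repo, and it is an illustration only, not the repo's actual code: the function name `changed_regions` and the sample diff are made up for the example.

```python
import re

# Unified diff hunk header: @@ -old_start[,old_count] +new_start[,new_count] @@
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def changed_regions(diff_text):
    """Return (new_start, new_count) for each hunk in a unified diff."""
    regions = []
    for line in diff_text.splitlines():
        m = HUNK_RE.match(line)
        if m:
            new_start = int(m.group(3))
            # An omitted count means the hunk covers exactly one line.
            new_count = int(m.group(4)) if m.group(4) is not None else 1
            regions.append((new_start, new_count))
    return regions


# Hypothetical diff text, shaped like the output of `git diff`.
sample = """\
--- a/manuscript.md
+++ b/manuscript.md
@@ -10,2 +10,3 @@ # Chapter 1
 context
+new line
 context
@@ -40 +41 @@
-old
+new
"""

print(changed_regions(sample))  # [(10, 3), (41, 1)]
```

By default, hunks include a few lines of unchanged context, so these ranges slightly overstate what was edited. Running `git diff -U0` drops the context lines and gives tighter ranges.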

What I loved about it was that it shows the sequential process of working through a manuscript, from top to bottom, several times, restructuring and refining the words. It helped me understand how one spends the three years between the first words written and handing off the finished manuscript.

Generating this graph was a first stab at doing something I’ve wanted to do for nearly a decade, which is to visualize the way a book manuscript changes over time.

But what I really wanted to do was something like Gource: the video below shows a representation of the evolution of the Python code base (from Aug 1990 to Jun 2012).

Earlier this year, three years after I wrote the code above, I dug it back out. My goal was to see if I could use it to help track progress and visualize work on the current book I’m working on with my mentor, Dr. Steven Spear, which we have until April to get done. 

Here’s a modification of that graph for that manuscript, showing where all the manuscript adds/modifies/deletes are.

But this was a far cry from what the Gource visualization shows.  

I was truly inspired by one of the Glamorous Toolkit / Smalltalk pairing sessions with Tudor Girba and Eric Normand. I decided to write a recursive descent Markdown parser to generate a Vega tree diagram and try animating it over time / commits.

This is an early prototype, but I was pleased with the output. Shown below are 103 frames of animation showing the structure of the manuscript, in both a radial tree view and tree view, showing the evolution of The Unicorn Project manuscript.

I’m hoping to combine it with the word count graph shown above.

Help I’m Looking For From Anyone!

(Jack Rusher: I’ve been thinking of you for the last year as I’ve been working on this, thinking you’d have some awesome insights and ideas — I’d love any impressions or advice!)

Questions off the top of my head, for anyone with opinions and ideas!

  • I’d love to show the Vega radial tree and tree diagrams side by side, and have them be about the same size — but using hconcat or similar operations requires them to share the same data set…  So that won’t work?  
  • I convert the SVG diagrams into PNG files, but it causes them to get super fuzzy — are there better ways to do this?  (Jack, I’m using your awesome darkstar library — and I have a proposed doc change to disclaim that the call into GraalJS has to be single-threaded. I spent a couple of hours yesterday trying to make a multi-threaded version — any chance anyone wants to pair on that together to get that done?) (PS: GraalJS is amazing. And I’d love to try running this inside of GraalVM to see how much JIT helps performance.)
  • Maybe the better approach is to take the parsed Markdown and split it into directories and files so that Gource can generate the visualization?
  • But the more useful approach would be a JS app, which could take all 103 frames, and use a slider bar to show each frame.
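On the fuzzy PNG question above: one thing that might help is rasterizing each SVG at a larger scale, so the PNG has more pixels to work with. The Python sketch below only builds an `rsvg-convert` command line using its `--zoom`, `--format`, and `--output` options. The helper name `rsvg_command` and the file names are hypothetical, and whether this fixes the fuzziness in this pipeline is an untested assumption.

```python
def rsvg_command(svg_path, png_path, zoom=2.0):
    """Build an rsvg-convert command that renders at `zoom` times the SVG's natural size."""
    return [
        "rsvg-convert",
        "--zoom", str(zoom),   # scale factor applied to both axes
        "--format", "png",
        "--output", png_path,
        svg_path,
    ]


# Hypothetical frame names, for illustration only.
cmd = rsvg_command("frame-001.svg", "frame-001.png", zoom=3)
print(" ".join(cmd))

# To actually run it (requires rsvg-convert on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```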

Methodology

  • get all commits from repo
  • identify markdown file to extract
  • parse markdown into tree
  • overlay git add/modify/delete onto tree
  • convert into vega tree or radial tree
  • convert vega diagram into SVG diagram (using darkstar)
  • convert SVG into PNG (using macOS RSVG program)
  • convert PNG files into GIF (using ImageMagick)
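The "parse markdown into tree" and "overlay" steps can be sketched roughly as follows. This is Python, not the Clojure used in the project, and it swaps in a simple heading scan with a stack in place of a recursive descent parser. It only looks at ATX headings (`#`, `##`, ...) and would be fooled by `#` lines inside code blocks. The `id` / `parent` / `name` row shape matches what Vega's `stratify` transform consumes in the standard tree examples; the function names and the `line` and `changed` fields are assumptions made for this illustration.

```python
import json
import re

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*$")


def markdown_to_rows(text, root_name="manuscript"):
    """Flatten Markdown headings into id/parent rows suitable for a stratify transform."""
    rows = [{"id": 0, "parent": None, "name": root_name, "line": 0}]
    stack = [(0, 0)]  # (heading level, row id); the root counts as level 0
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = HEADING_RE.match(line)
        if not m:
            continue
        level = len(m.group(1))
        # Pop until the top of the stack is a shallower heading: that is the parent.
        while stack[-1][0] >= level:
            stack.pop()
        row_id = len(rows)
        rows.append({"id": row_id, "parent": stack[-1][1],
                     "name": m.group(2), "line": lineno})
        stack.append((level, row_id))
    return rows


def mark_changed(rows, changed_lines):
    """Flag the section whose heading most closely precedes each changed line."""
    for row in rows:
        row["changed"] = False
    for n in changed_lines:
        owner = max((r for r in rows if r["line"] <= n), key=lambda r: r["line"])
        owner["changed"] = True
    return rows


# Hypothetical manuscript fragment; line 7 ("more") is treated as modified.
doc = "# Part 1\n\ntext\n\n## Chapter 1\n\nmore\n\n## Chapter 2\n\n# Part 2\n"
rows = mark_changed(markdown_to_rows(doc), changed_lines=[7])
print(json.dumps(rows, indent=2))
```

In a Vega spec, rows like these would be the `values` of a data set, followed by a `stratify` transform with `key` set to `id` and `parentKey` set to `parent`. The `changed` field could then drive node color in each frame.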

I couldn’t have done any of this without Cursive / IntelliJ IDE, Clojure REPL, and the fantastic clerk notebooks (that enable “moldable development”, as coined by Tudor Girba) and portal (which can render vega diagrams).

I plan on making a video of how these tools make for such a fantastic development environment and workflow.

- About The Authors
Gene Kim

Gene Kim has been studying high-performing technology organizations since 1999. He was the founder and CTO of Tripwire, Inc., an enterprise security software company, where he served for 13 years. His books have sold over 1 million copies—he is the WSJ bestselling author of Wiring the Winning Organization, The Unicorn Project, and co-author of The Phoenix Project, The DevOps Handbook, and the Shingo Publication Award-winning Accelerate. Since 2014, he has been the organizer of DevOps Enterprise Summit (now Enterprise Technology Leadership Summit), studying the technology transformations of large, complex organizations.
