October 27, 2022

Visualizing The Book Writing Process: And Help I’m Looking For (Vega and Vega-Lite, SVG, and ideas?)

By Gene Kim

Back in 2019, after The Unicorn Project was published, I published a GitHub repo of code (written in Clojure, of course) that generates a graph by analyzing the git repository where I committed the daily updates to the book manuscript. The graph below shows the word count over time and, by parsing the diffs, the region of the manuscript that was modified.
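For readers who want to try the diff-parsing idea, here is a minimal sketch in Python. It is not the Clojure code from the repo. It assumes the diffs are in git's unified format, and the function names `changed_regions` and `word_count` are made up for illustration.

```python
import re

# Matches unified-diff hunk headers such as "@@ -12,3 +12,5 @@".
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def changed_regions(diff_text):
    """Return (start_line, line_count) pairs for the new side of each hunk."""
    regions = []
    for line in diff_text.splitlines():
        m = HUNK_RE.match(line)
        if m:
            start = int(m.group(3))
            # When the count is omitted, the hunk covers exactly one line.
            count = int(m.group(4)) if m.group(4) is not None else 1
            regions.append((start, count))
    return regions


def word_count(text):
    """Whitespace-delimited word count, the simplest possible measure."""
    return len(text.split())
```

Running `git diff -U0` between two commits drops the context lines, so each hunk header then describes only the lines that actually changed.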

What I loved about it was that it shows the sequential process of working through a manuscript, from top to bottom, several times, restructuring and refining the words. It helped me understand how one spends the three years between the first words written and handing off the finished manuscript.

Generating this graph was a first stab at doing something I’ve wanted to do for nearly a decade, which is visualize the way a book manuscript changes over time.  

But what I really wanted to do was something like Gource. The video below shows a Gource representation of the evolution of the Python code base from Aug 1990 to Jun 2012.

Earlier this year, three years after I wrote the code above, I dug it back out. My goal was to see if I could use it to help track progress and visualize work on the current book I’m working on with my mentor, Dr. Steven Spear, which we have until April to get done. 

Here’s a modification of that graph for that manuscript, showing where in the manuscript all the adds, modifies, and deletes occurred.

But this was a far cry from what the Gource visualization shows.  

I was truly inspired by one of the Glamorous Toolkit / Smalltalk pairing sessions with Tudor Girba and Eric Normand. I decided to write a recursive descent Markdown parser to generate a Vega tree diagram and try animating it over time / commits.
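To make the idea concrete, here is a rough Python sketch of turning Markdown headings into rows that Vega's `stratify` transform can consume. It is a simplification, not the parser described above: it uses a stack over ATX (`#`) headings rather than recursive descent, ignores everything that is not a heading, and does not skip `#` lines inside fenced code blocks. The names `headings_to_rows` and `to_vega_data` are hypothetical.

```python
import json
import re

# ATX headings only: one to six '#' characters, whitespace, then the title.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def headings_to_rows(markdown, root_name="manuscript"):
    """Flatten the heading hierarchy into id/name/parent rows."""
    rows = [{"id": 0, "name": root_name, "parent": None}]
    stack = [(0, 0)]  # (heading level, row id); level 0 is a synthetic root
    for line in markdown.splitlines():
        m = HEADING_RE.match(line)
        if not m:
            continue
        level = len(m.group(1))
        # Pop until the top of the stack is a strictly shallower heading.
        while stack[-1][0] >= level:
            stack.pop()
        row_id = len(rows)
        rows.append({"id": row_id, "name": m.group(2), "parent": stack[-1][1]})
        stack.append((level, row_id))
    return rows


def to_vega_data(rows):
    """Wrap the rows in a Vega data entry that builds the hierarchy."""
    return {
        "name": "tree",
        "values": rows,
        "transform": [{"type": "stratify", "key": "id", "parentKey": "parent"}],
    }
```

Calling `json.dumps(to_vega_data(rows))` produces a data entry that could be dropped into a Vega tree spec. The layout transforms and marks would still need to be added to that spec.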

This is an early prototype, but I was pleased with the output. Shown below are 103 frames of animation showing the structure of the manuscript, in both a radial tree view and tree view, showing the evolution of The Unicorn Project manuscript.

I’m hoping to combine it with the word count graph shown above.

Help I’m Looking For From Anyone!

(Jack Rusher: I’ve been thinking of you for the last year as I’ve been working on this, thinking you’d have some awesome insights and ideas — I’d love any impressions or advice!)

Questions off the top of my head, for anyone with opinions and ideas!

  • I’d love to show the Vega radial tree and tree diagrams side by side, and have them be about the same size — but using hconcat or similar operations requires them to share the same data set…  So that won’t work?  
  • I convert the SVG diagrams into PNG files, but it causes them to get super fuzzy — are there better ways to do this?  (Jack, I’m using your awesome darkstar library — and I have a proposed doc change to disclaim that the call into GraalJS has to be single-threaded. I spent a couple of hours yesterday trying to make a multi-threaded version — any chance anyone wants to pair on that together to get that done?) (PS: GraalJS is amazing. And I’d love to try running this inside of GraalVM to see how much JIT helps performance.)
  • Maybe the better approach is to take the parsed Markdown and split it into directories and files so that Gource can generate the visualization?
  • But the more useful approach would be a JS app, which could take all 103 frames, and use a slider bar to show each frame.
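On the Gource idea: Gource can read a pipe-delimited custom log (`timestamp|username|type|file`, where type is A, M, or D), so one option is to treat each manuscript section as a pseudo-file and never write anything to disk. The Python sketch below assumes that approach. The event tuples and the function names `gource_log_line` and `gource_log` are invented for illustration.

```python
def gource_log_line(timestamp, author, change, section_path):
    """Format one event as a Gource custom-log line.

    section_path is a list of heading titles from the root down,
    e.g. ["Part One", "Chapter 1"], used as a pseudo file path.
    """
    if change not in ("A", "M", "D"):
        raise ValueError("change must be 'A', 'M', or 'D'")
    # '|' delimits fields and '/' delimits path segments, so neither
    # may appear inside a name; replace them with a dash.
    parts = [seg.replace("|", "-").replace("/", "-") for seg in section_path]
    safe_author = author.replace("|", "-")
    return f"{int(timestamp)}|{safe_author}|{change}|{'/'.join(parts)}"


def gource_log(events):
    """Render events as a log, sorted by timestamp as Gource expects."""
    ordered = sorted(events, key=lambda e: e[0])
    return "\n".join(gource_log_line(*e) for e in ordered)
```

The resulting text can be saved to a file and passed to `gource --log-format custom`.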

Methodology

  • get all commits from repo
  • identify markdown file to extract
  • parse markdown into tree
  • overlay git add/modify/delete onto tree
  • convert into vega tree or radial tree
  • convert vega diagram into SVG diagram (using darkstar)
  • convert SVG into PNG (using macOS RSVG program)
  • convert PNG files into GIF (using ImageMagick)
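The overlay step can be sketched in Python as follows. It assumes each section is known by the line where its heading starts and each change is a `(start_line, line_count)` range taken from a diff. It only counts touched lines per section and does not distinguish adds from modifies or deletes. The name `overlay_changes` is hypothetical.

```python
from bisect import bisect_right


def overlay_changes(section_starts, changed_regions):
    """Count changed lines per section.

    section_starts: list of (start_line, name), sorted by start_line.
    changed_regions: iterable of (start_line, line_count) ranges.
    """
    starts = [start for start, _ in section_starts]
    touched = {}
    for start, count in changed_regions:
        # A zero-length range (a pure deletion) is still attributed
        # to the section containing its start line.
        for line in range(start, start + max(count, 1)):
            idx = bisect_right(starts, line) - 1
            if idx < 0:
                continue  # line falls before the first heading
            name = section_starts[idx][1]
            touched[name] = touched.get(name, 0) + 1
    return touched
```

The per-section counts could then be attached to the tree nodes, for example as a field that drives node color in the diagram.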

I couldn’t have done any of this without the Cursive / IntelliJ IDE, the Clojure REPL, the fantastic Clerk notebooks (which enable “moldable development,” a term coined by Tudor Girba), and Portal (which can render Vega diagrams).

I plan on making a video of how these tools make for such a fantastic development environment and workflow.

- About The Authors
Gene Kim

Gene Kim is a Wall Street Journal bestselling author, researcher, and multiple award-winning CTO. He has been studying high-performing technology organizations since 1999 and was the founder and CTO of Tripwire for 13 years. He is the author of six books, including The Unicorn Project (2019), and co-author of the Shingo Publication Award-winning Accelerate (2018), The DevOps Handbook (2016), and The Phoenix Project (2013). Since 2014, he has been the founder and organizer of DevOps Enterprise Summit, studying the technology transformations of large, complex organizations.
