SLML Part 1 - Why I Decided to Train SketchRNN on My Drawings

Categories: singleline, machinelearning, slml

Using visual embeddings to filter thousands of images.

Author: Andrew Look

Published: October 27, 2023

This post is part 1 of “SLML” - Single-Line Machine Learning.

If you want some context on how I got into single-line drawings, check out part 0.

Next in the series is part 2, where I assemble the training data and filter it using visual embeddings.

Discovering SketchRNN

David Ha and Doug Eck from the Magenta team at Google had crowdsourced over 100,000 drawings using a game called “Quick, Draw!”. The game would give users a prompt such as “Yoga” or “Pig”, and users would have 20 seconds to draw it in the browser. This produced a dataset of line drawings for each category. Then, they trained a model called SketchRNN on these drawings. They trained a separate model for each category that users were prompted to draw, so there’s a separate model trained for “yoga”, “pig”, etc.
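For context, the per-category data Magenta released for SketchRNN is, to my understanding, distributed as .npz files of variable-length stroke sequences split into train/valid/test arrays. Here’s a minimal sketch of peeking at one, assuming a category file like yoga.npz has already been downloaded locally (the filename is just an example):

```python
import numpy as np

# Assumes a per-category SketchRNN dataset file (e.g. yoga.npz) is on disk.
# Each split holds variable-length drawings; each drawing is an array of
# (dx, dy, pen_lift) rows.
data = np.load("yoga.npz", allow_pickle=True, encoding="latin1")
train = data["train"]

print(len(train), "training drawings")
print(train[0][:5])  # first few pen offsets of the first drawing
```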

I was fascinated by Magenta’s SketchRNN demo, where you start a drawing and the SketchRNN model continuously tries to complete it in new ways. It makes me think about the conditional probability that evolves as a drawing progresses. Given that one line exists on the page, how does that influence what line the artist is most likely to draw next?

My favorite example is to select the yoga model, and then draw a rhombus shape simulating a yoga mat. I love watching stick figures emerge in various poses, all centered on and interacting with the yoga mat.

magenta-demo-01-draw-and-gen

Some of the same authors from the Magenta paper collaborated with distill.pub to publish an explorable explanation, Four Experiments in Handwriting with a Neural Network, applying the same kind of model to a handwriting dataset.

Why train a model on my single-line drawings?

I wanted to apply SketchRNN to my own single-line drawings.

I’d considered training an ML model on my single-line drawings before, but the pixel-based GANs and VAEs available in 2017/2018, when I started thinking about this, didn’t seem like they’d yield good results on images that are mostly white pixels with narrow black lines interspersed. Since SketchRNN produces sequences instead, the generated results could be animated. I see single-line drawing as a sort of performance: a video of me drawing a single line is inherently proof that I didn’t lift my pen from the page.
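To make the “sequences” point concrete, here’s a minimal sketch of the stroke-3 representation SketchRNN works with and how it lends itself to animation: each row is a pen offset (dx, dy) plus a flag marking whether the pen lifts after that point. The tiny example drawing and the replay helper are hypothetical, not code from this project.

```python
import numpy as np

# Hypothetical stroke-3 drawing: rows of (dx, dy, pen_lift).
# dx, dy are offsets from the previous pen position; pen_lift=1 means the
# pen leaves the page after that point. A single-line drawing only lifts
# the pen on the very last row.
drawing = np.array([
    [0.0,  0.0, 0],
    [10.0, 5.0, 0],
    [8.0, -3.0, 0],
    [12.0, 7.0, 0],
    [5.0, -9.0, 1],  # final point: pen lifts, drawing ends
])

def replay(strokes):
    """Yield the cumulative path one point at a time, e.g. one animation
    frame per pen movement."""
    x, y = 0.0, 0.0
    path = []
    for dx, dy, pen_lift in strokes:
        x, y = x + dx, y + dy
        path.append((x, y))
        yield list(path)
        if pen_lift:
            path = []  # a new stroke would start here

for frame in replay(drawing):
    print(frame)
```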

It struck me that training a model could also give me a new window into my own drawing style. If my drawing style evolves slowly over time, would I be able to notice the difference between a model trained on drawings I made recently and one trained on drawings I made several years ago?

How cool would it be to have an interface, similar to Andy Matuschak’s scrying pen, where I could start drawing and see live completions, showing me how the probability space of subsequent strokes in my drawing is changing?

scrying pen demo

What new ideas might I get from turning the “variability” up, like the slider in distill.pub’s handwriting demo, and generating new drawings based on my old ones?

distill-variation

Or running stroke prediction along the length of a drawing:

distill-vary-strokes

distill-vary-strokes-legend

Getting Started

When I reached out to the authors of SketchRNN, they estimated that I’d need several thousand examples in order to train a model. I only had a few hundred at the time. But I kept making new drawings in my sketchbooks, and numbering the pages and the sketchbooks so that I could scan them and store them in a consistent way. In the back of my mind, I held on to the idea that one day I’d have enough drawings to train a model on them.

Several years went by. More sketchbooks accumulated.

Eventually, I ran a file count and saw a few thousand distinct drawings. I was finally ready to get started. I was going to need:

  1. at least a thousand drawings
  2. a way to convert my scanned drawings into stroke-3 format (a rough sketch of what that conversion might look like follows this list)
  3. to train a model on my drawings, and experiment until I found a configuration producing results that I liked.
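As a rough illustration of item 2, here’s what the conversion might look like once a scan has already been traced into an ordered list of (x, y) points. The actual vectorization of the JPEGs is the subject of part 2, and the helper below is a hypothetical sketch rather than the code I ended up with.

```python
import numpy as np

def polyline_to_stroke3(points):
    """Turn an ordered list of absolute (x, y) points into stroke-3 rows
    of (dx, dy, pen_lift). A single-line drawing is one continuous stroke,
    so the pen only lifts on the last row."""
    pts = np.asarray(points, dtype=float)
    deltas = np.diff(pts, axis=0)          # offsets between consecutive points
    pen_lift = np.zeros((len(deltas), 1))
    pen_lift[-1, 0] = 1                    # lift the pen at the end
    return np.hstack([deltas, pen_lift])

# Hypothetical traced path from one scanned page
path = [(120, 340), (132, 335), (140, 350), (155, 348)]
print(polyline_to_stroke3(path))
```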

Next in my SLML series is part 2, where I convert my JPEG scans into vector representations in preparation for training SketchRNN.