A Generative Audio Summary of Summer Posts
An AI audio analysis of the last four months of writing - what came through, what was right, and what was wrong.
So… I really didn’t realize that my writing these last four months has been so heavily influenced by themes of grief and loss. I honestly would not say that is what the Illustrated Life Substack is about. This AI experiment was an insightful (and addictive) process.
(Above: Version four of an AI-generated summary of posts from the last four months, explained below. A few other iterations are embedded below, too.)
(If you are anti-AI, please skip this post. This post is not a personal essay and not an example of my Substack as a whole. I trust the authenticity of my writing voice, and I definitely don’t want AI to do the things I do, but I am keeping up with AI to see ways I can use it effectively, and I do think, long term, that AI can be helpful in terms of aggregation and summarization of data. That’s where today comes in.)
A New Feature
I answer a lot of Substack questions at Reddit. I enjoy helping with random technical questions. It’s a good counterbalance to the “real” work I do at a STEM education nonprofit.
While browsing the subreddit today, I saw something highlighted that was too tempting not to try. The premise is that you can feed an AI tool a set of documents, and it will generate a two-host podcast based on those files. In other words, the two hosts will talk "about" my work.
I was incredibly curious to see what they might say.
Setting Up My Experiment
This experiment uses NotebookLM, a Google tool.[1]
I offered a set of 13 posts from summer (after June 5, 2024). I didn't include the "Zines" post or the most recent "Loneliness" post. I thought the handful I picked was a good limited set because I know a lot of these posts circle around personal loss, but they are also about journaling, about life, about meaning, and about creative habit.
The posts I included:
Victorian Puzzle Note Folding (because it has a philosophical angle on folding)
Making and Moving Piles - A Tower of Hanoi View of Decluttering
Not all stories have a plot—Not Reading, But Didion and Jansson
A Proactive Birthday List (version 1 only)
Trial 1
When I turned on the generated audio, I found that the "hosts" had fixated on a post about my year-long birthday list. They hadn't really understood my year list (which started with a project I called my 50 Before 50), but they picked up on the story about a "week of" list I did with my mom a few years ago when she turned 70. They particularly enjoyed that "Eat Cheetos" was on the list.
I was listening to them chat about doing small and basic things, and then I heard them say that having done this list was even more important to me as my mom's health started to decline. What?
I kept listening and discovered that my mom died. (Sorry, Mom.)
They went on to talk about my grief.
They also talked about my illustrated journal, about pen pals, and about the search for meaning.
They did like the fact that I was drawing teddy bears (although there was a level of "surprise" to the mention that made me think they saw that discussion, which I think of as incredibly sentimental and serious, as somehow silly or, worse, cute).
I was working (my actual job that pays me more than the few pennies an hour I make writing), and I tuned out a bit after my mom died, but I heard a few other bits and pieces that were clearly misinterpreted. (I always worry that this happens with readers, too. I'm always trying to be "more" clear even when being deliberately roundabout in some of what I discuss. But I do know that we always risk being completely misunderstood.)
Trial 2
I re-ran this whole generative AI experiment, removing the birthday post to see what might happen.
The first audio was 7 minutes. It was definitely two hosts chatting about my work.
The second audio was 22 minutes. It started out oddly, though. The audio just starts in the middle of nowhere, and there is some odd sense that they are talking "to" me, not about me. It felt like I had two people sitting in front of me analyzing what I had written and simultaneously explaining it to me and asking me questions. (It felt a bit too much like I imagine a therapy session might, except I didn’t have to say anything.)
Trial 3
Confused about that shift, that difference between the first and the second in terms of the audience, I ran the experiment again.
I used the same source files. The result was a 4-minute audio with a completely different angle on my work.
Trial 3 focused on the lighthouse, and I really liked that. The "wish" had been my mom's, but the symbolism was still there. This version also includes the Victorian pocket purse, holding space, and the “mystery tool.”
They got some things right in this version. Other things are definitely skewed, and the way they seem to suggest grief has rendered me basically incapable of doing regular things is definitely exaggerated. (I think they have completely made up an odd statement about doing dishes, too.)
Overall though, it was interesting to hear the analysis, to know what comes out as a "synthesis" of even a few posts.
Here's the audio for test three.
Trial 4
I ran the experiment one more time today, with the same files. That version is about 7 minutes. The audio for test four appears at the top of this post.
Trial 4 might be the best of them, although my partner was male in their rendition. They focus on the teddy bears in this version, but they don’t really understand it as a process of documentation and a precursor to getting rid of things.
Trial 5
It’s hard to stop. They are so different that the “next” one might be just right. When I first started listening, I thought that Trial 5, at 8 minutes, might possibly be one of the best, but it is also somehow uncomfortable…partly because I think it’s just not “quite” right.
I didn’t even plan to run the fifth trial, but the others had been so wildly different that I had to. “She’s not offering these cheesy solutions or anything…” —indeed!
This version does feel a bit too superficial. It feels like they project quite a bit onto the grief story that isn’t really there, and this version doesn’t do a good job referencing actual details from the posts (which the other versions all did).
I was surprised to find out I had to learn how to appreciate quiet (which I think I’ve been talking about for years as important), but then I heard that I moved to a new city. What? (For the record, I haven’t moved.)
Spin the Wheel Again!
The process can be addictive because, like a kaleidoscope, every time you click to generate, you will get something different. There are no controls or guides to help clue the "hosts" in on what to focus on. You can’t specify keywords or offer even a grounding statement on which they might build and shape the analysis.
I feel sure those kinds of features will be coming.
I did give it a text-based prompt to see if it would generate the audio, but it gave me a text-based version of a two-host discussion. Generating the audio, right now, appears to be a black-box, push-button process.
Despite the lack of controls, overall, it’s a mind-blowing process.
As someone who has podcasted for many years, I know how much time can be involved in the recording and editing. That AI can generate reasonable, human-sounding people, with really good cadence, tone, pacing, and back-and-forth interaction, in a matter of minutes is amazing. (It’s a bit depressing, too.)
I think there is something to be learned in hearing what kind of summary is drawn from a body of work. Not only can this process help highlight themes over the whole, but it can also highlight themes and cycles over specific periods of time.
I would love to see how summaries differ over my year-and-a-half here, for example, compared to the last year, compared to the last few months. Or... what I would really love... a real deep-dive summary of the 17 years of my podcast.
[1] I have used ChatGPT quite a bit, as well as Microsoft Copilot (to get access to the newer model for free). I haven’t experimented extensively with Google’s tool.
I find the whole idea totally creepy. But then I think about how we put out essays or podcasts or TikToks, whatever, and people read them. Strangers. Sometimes a handful and sometimes thousands. And they can infer things, get things wrong, make totally wrong assumptions, make connections, misinterpret, or hit something on the nose that we didn't see. So what's the difference? Does it matter if it's a human brain or AI? I don't have an answer. I'm not sure what I think or how I feel about it, honestly. Very cool post. Thank you for sharing this.