
A new project out of UC Santa Cruz is using artificial intelligence to generate a podcast. The AI generates both the narrative and the voices that read it. The storyline is set in a fictional American county called Sheldon County; the AI creates unique characters who live there and weaves together their individual experiences. While the full podcast won’t be ready until 2019, you can listen to bits of the initial episodes (see above link).

For now, this project is just one storyline, but the PhD student behind it plans to have the AI generate completely personalized podcast series—entirely different for each individual who wants to listen.

Photo by Maria Fernanda Gonzalez on Unsplash

You may have heard of Lyrebird.ai, a Montreal-based company that’s creating customized voices for personal assistants. It can mimic your voice: all you need to do is record one minute of your dulcet tones on their website, and it will create an eerily similar copy (how similar depends on the sample’s audio quality).
Once you’ve created your “digital voice,” you can use it to read your audiobook, your text messages, or any number of other things. Does anyone else find this a bit narcissistic? Narcissistic, and yet very intriguing. It’s currently available only for English, preferably with an American accent, but once the tech is in place for English, it’s only a matter of time before it expands to other languages.

One side offering worth mentioning (in case this tech creeps you out): Lyrebird.ai says you can send them audio clips whose veracity you question, and they’ll check the authenticity for you. Little glitches in the audio are clues that a clip is a fake.

Finally, because it’s a super interesting factoid: the company is named after the Australian bird that mimics sounds it hears, including car alarms, chainsaws, and camera shutter clicks (we’ll let the venerable Sir David Attenborough explain it to you in the video). An apropos name choice.

Photo Found Here: https://www.pexels.com/photo/hand-metal-music-musician-33779/

Chalk another one up to deep learning from Google. With a tag team of two deep learning networks, one to translate text into a spectrogram (a representation of audio frequencies across time) and another to generate speech from that spectrogram, Google can generate a voice that is purportedly indistinguishable from a human’s. If you don’t think you can be fooled, judge for yourself.
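
If you’re curious what that intermediate representation looks like, here is a rough sketch of how a basic magnitude spectrogram can be computed. To be clear, this is not Google’s code, and their networks may well use a more refined variant; the `spectrogram` function, the frame sizes, and the 440 Hz test tone are all assumptions made up for illustration. It is written in plain Python so it runs without extra packages, which makes it slow; real audio work would normally lean on a numerical library instead.

```python
import cmath
import math


def spectrogram(signal, frame_size=256, hop=128):
    """Return a magnitude spectrogram as a list of frames.

    Each frame is a list of magnitudes, one per frequency bin, running
    from 0 Hz up to half the sample rate. (Hypothetical helper, written
    for illustration only.)
    """
    # A Hann window tapers the edges of each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    n_bins = frame_size // 2 + 1
    frames = []
    # Slide over the signal in overlapping steps of `hop` samples.
    for start in range(0, len(signal) - frame_size + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_size)]
        # Naive discrete Fourier transform of one windowed frame.
        frame = [
            abs(sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                    for n in range(frame_size)))
            for k in range(n_bins)
        ]
        frames.append(frame)
    return frames


SAMPLE_RATE = 8000  # Hz, an arbitrary choice for this demo
FRAME_SIZE = 256

# A quarter second of a pure 440 Hz tone standing in for real speech.
tone = [math.sin(2 * math.pi * 440.0 * n / SAMPLE_RATE) for n in range(2000)]
spec = spectrogram(tone, frame_size=FRAME_SIZE)

# Find the strongest frequency bin in the first frame and convert it to Hz.
first = spec[0]
peak_bin = max(range(len(first)), key=first.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames,", len(first), "bins, peak near", peak_hz, "Hz")
```

Each row of `spec` is a slice of time and each column is a frequency band, so a steady tone shows up as one bright column. Speech produces a much busier picture, and that picture is the kind of thing the first network would have to produce from text and the second would have to turn back into sound.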