Machine Learning

From Sound to Images, Part 1: A deep dive on spectrogram creation.

By Benjamin Hoffman and Grant Van Horn
In our first post, we described the idea of using a computer vision model to identify bird vocalizations. But how does a computer vision model “listen” to a sound? For Sound ID, we use the short-time Fourier transform (STFT) to convert the raw waveform (which tracks air pressure as a function of time) into an…

Behind the Scenes of Sound ID in Merlin

By Benjamin Hoffman and Grant Van Horn
What is Sound ID? Today we announced one of our biggest breakthroughs to date: Sound ID, a new feature in the Merlin Bird ID app and a major leap forward in sound identification and machine learning. Sound ID lets people use their phone to listen to the birds around them, and see live predictions of who’s singing. Currently,…