Bass + AI: Improvisation (Python, Unity3D, and Kyma)

An improvised duet(?) with an AI agent trained on the “Embodied Musicking Dataset.”

In this performance, Python listens to live audio input from the bass, and, based on models trained with the dataset, sends out data to Unity3D and Kyma. Unity3D creates the visuals (the firework), and Kyma processes the audio from the bass.

First, though, the dataset used for training was collected from several pianists in the US and UK. As pianists played, we recorded multiple aspects of their performance: audio, video of their hands, EEG, skeletal data, and galvanic skin response. After playing, pianists listened to their own performance and were asked to record their state of “flow” over the course of the performance. Because all of these dimensions of data are aligned in time, neural networks can be trained to learn associations between them.

This demonstration uses the trained models from Craig Vear’s Jess+ project to generate XY coordinates (learned from the skeletal data) and “flow” values from the amplitude of the input. The XY coordinates, “flow”, and amplitude are sent out from Python as OSC data, which is received by both Unity3D (for visuals) and Kyma (for audio).
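
For anyone curious what "sent out from Python as OSC data" looks like on the wire, here is a minimal sketch using only the standard library. It is not the code from the performance: the address `/agent/frame`, the idea of packing all four values into one message, and the ports in the usage comment are placeholders I made up for illustration. The encoding itself follows the OSC 1.0 format (null-terminated strings padded to 4 bytes, then big-endian 32-bit floats).

```python
import socket
import struct


def osc_string(text: str) -> bytes:
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    raw = text.encode("ascii") + b"\x00"
    return raw + b"\x00" * (-len(raw) % 4)


def osc_message(address: str, *values: float) -> bytes:
    """Build an OSC message whose arguments are all 32-bit big-endian floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", v) for v in values)
    return osc_string(address) + osc_string(type_tags) + payload


def send_frame(sock, targets, x, y, flow, amplitude):
    """Send one frame of values to every (host, port) pair in targets over UDP."""
    # "/agent/frame" is a hypothetical address, not the one used in the piece.
    packet = osc_message("/agent/frame", x, y, flow, amplitude)
    for host, port in targets:
        sock.sendto(packet, (host, port))
    return packet


# Example usage (hosts and ports are placeholders for the Unity3D and Kyma receivers):
#   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   send_frame(sock, [("127.0.0.1", 9000), ("127.0.0.1", 8000)], 0.4, 0.6, 0.5, 0.2)
```

Sending the same packet to two destinations is the simplest way to have both programs react to identical data; a library such as python-osc would handle the encoding for you.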

In Unity, the XY data moves the “firework” around the screen. Flow data affects its color, and amplitude affects its size. Audio in Kyma is a bit more sophisticated, but X position is left/right pan, and the flow data affects the delay, reverb, and live granulation.

As you can see, the amplitude-to-XY mapping is limited, with the firework moving along a kind of diagonal. Possible next steps would be to extract more features of the audio (e.g. pitch, spectral complexity, or delta values), and train with those.
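
As a rough idea of what "more features" could mean in code, here is a small standard-library sketch that turns each frame of audio samples into a feature vector. The function names are hypothetical, and zero-crossing rate is only a crude stand-in I chose for pitch or spectral content, not a real pitch tracker or a measure of spectral complexity. The delta value is simply the change in RMS amplitude from the previous frame.

```python
import math


def rms(frame):
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs that change sign (a crude brightness proxy)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def feature_vectors(frames):
    """Return [rms, zero_crossing_rate, delta_rms] for each frame.

    delta_rms is the change in RMS from the previous frame (0.0 for the first frame).
    """
    features = []
    previous_level = None
    for frame in frames:
        level = rms(frame)
        delta = 0.0 if previous_level is None else level - previous_level
        features.append([level, zero_crossing_rate(frame), delta])
        previous_level = level
    return features
```

With three inputs instead of one, a model would at least have the chance to separate a loud low note from a loud bright one, which amplitude alone cannot do.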

Applying this data trained on pianists to a bass performance (in a different genre) does not have the same goals as music-generation AI such as MusicGen or MusicLM. Instead of automatically generating music, the AI becomes a partner in performance: sometimes unpredictable, but not random, since its behavior is based on rules.

Pd Machine Learning Fail

A failed attempt at machine learning for real-time sound design in Pure Data Vanilla.

I’ve previously shown artificial neurons and neural networks in Pd, but here I attempted to take things to the next step and make a cybernetic system that demonstrates machine learning. It went okay, not great.

This system has a “target” waveform (what we’re trying to produce). The neuron takes in several different waveforms, combines them (with a nonlinearity), and then compares the result to the target waveform, and attempts to adjust accordingly.
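
The same idea can be sketched outside of Pd. The Python below is not a translation of the patch: the tanh nonlinearity, the mean-squared-error comparison, and the gradient-descent weight update are assumptions I picked to make the loop concrete, and all function names are made up. It combines several input waveforms into one output, compares that output to the target, and nudges the weights to shrink the difference.

```python
import math


def make_wave(freq, n=64):
    """One cycle-aligned sine wave of n samples at the given number of cycles."""
    return [math.sin(2 * math.pi * freq * i / n) for i in range(n)]


def neuron_output(weights, inputs):
    """Weighted sum of the input waveforms, passed through tanh, sample by sample."""
    n = len(inputs[0])
    return [
        math.tanh(sum(w * wave[i] for w, wave in zip(weights, inputs)))
        for i in range(n)
    ]


def mean_squared_error(output, target):
    """Average squared difference between the output and target waveforms."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)


def train_step(weights, inputs, target, rate=0.1):
    """One gradient-descent update of the weights (an assumed learning rule)."""
    output = neuron_output(weights, inputs)
    n = len(target)
    new_weights = []
    for w, wave in zip(weights, inputs):
        # d(error)/dw = mean of 2 * (o - t) * (1 - o^2) * x, since tanh'(z) = 1 - tanh(z)^2
        grad = sum(
            2 * (o - t) * (1 - o * o) * x for o, t, x in zip(output, target, wave)
        ) / n
        new_weights.append(w - rate * grad)
    return new_weights
```

A quick way to try it: build the target from known weights, for example `target = neuron_output([0.8, -0.3, 0.5], inputs)` with `inputs = [make_wave(1), make_wave(2), make_wave(3)]`, start from all-zero weights, and call `train_step` in a loop while watching `mean_squared_error`. A target that the inputs cannot actually produce (say, a square wave) is where a single neuron like this would be expected to struggle.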

While it fails to reproduce the waveform in most cases, the resulting audio of a poorly-designed AI failing might still hold expressive possibilities.

0:00 Intro / Concept
1:35 Single-Neuron Patch Explanation
3:23 The “Learning” Part
5:46 A (Moderate) Success!
7:00 Trying with Multiple Inputs
10:07 Neural Network Failure
12:20 Closing Thoughts, Next Steps

More music and sound with neurons and neural networks here: