Building a Music Production System Between My Phone and Computer

Every music producer has a graveyard of voice memos on their phone: hummed melodies, tapped rhythms and ambient noise recordings that never make it to the DAW (digital audio workstation, the software where tracks actually get built).

For years, I struggled with this exact problem. I rotate hobbies depending on the time of year, and music production always seems to come back in winter. As soon as the days get shorter, I start wanting to make tracks again. Recently, after a two-year break to focus on my health, I was inspired by a new group of friends who were incredibly serious about their craft. It was enough to wake up my creative drive.

But as soon as I started, I hit the same wall I always do: Where do my ideas actually go?

Here is how I built a cross-platform music production system using Soundtrap, yt-dlp, Demucs, Google Colab, and Google Drive to ensure my ideas survive the jump from mobile capture to desktop production.

The Core Problem: Momentum vs. Barriers

The actual problem was never a lack of inspiration. The problem was continuity.

I do not get ideas on command. Most of them show up when I am away from my computer. Usually, I record a voice note on my phone, maybe hum a melody, or note down an arrangement idea, and tell myself I will deal with it later.

"Later" was where the process kept breaking. My phone and my computer were living in separate worlds. Ideas started on the phone and got stranded there. Moving material between the two devices felt annoying enough to slow me down, and small amounts of distraction are all it takes to kill momentum when you are trying to be creative.

I needed a single environment where ideas could survive the transfer. I call this the Mobputer approach: making my devices feel less separate and more like synchronized parts of the same working environment.

Starting With a Conceptual Theme

Before I built the system, I needed a target. This time, I wanted to work around specific concepts instead of just opening a DAW and waiting for something to happen.

I chose "The Vatican" as a theme. Not the literal city in a documentary sense, but the atmosphere I associate with it: space, ritual, stone, distance, silence, crowds, echo and ceremony.

My process is usually a mix of:

  • Melodies from a MIDI keyboard
  • External MP3 sounds
  • Free samples and textures (like crowd noise or room ambience)

You take this source material, change it, cut it, stretch it, combine it with something else and push it far enough that it becomes its own thing. Sometimes you make it better, sometimes you ruin it.

Either way, once you turn it into something personal, it becomes yours. That is what I did with this track:

Vatican, by Ivan Vulović

But to make that process work smoothly, I needed a very specific tech stack.

My Cross-Platform Music Tech Stack

I tried a range of different programs before settling on a system that connects my Android phone with my desktop. Here is exactly what I use.

1. The Cloud DAW: Soundtrap

At first, I didn't trust Soundtrap. Spotify had owned it, and I assumed it was an acquisition destined to be quietly killed off. But after seeing it re-acquired by its original owners, I felt much more comfortable investing my time in it.

More importantly, it actually solved my problem. The free version does exactly what I need:

  • Cross-device accessibility: I can move from Android to the web browser smoothly.
  • Quick sketching: It has a simple production environment perfect for rough drafts.
  • Version history: I can move backward and forward without fear of ruining a project.

GarageBand was never the issue in terms of quality, but it lacked cross-platform support. If a tool traps me on one machine, it breaks how I work. Soundtrap keeps everything in the cloud.

2. Sourcing Material Cleanly: yt-dlp

Once I had my DAW environment, I needed a clean way to collect and process source audio for my textures and samples. I wanted to avoid every shady, ad-filled converter site on the internet.

I opted for yt-dlp, a command-line tool that is direct, fast and predictable. It avoids the poor audio quality and suspicious behavior of browser-based downloaders and lets me pull reference audio directly to my machine.
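
To make that concrete, here is a minimal sketch of how such a download could be scripted. The function names, the "sources" folder and the placeholder URL are my own assumptions for illustration. The flags are standard yt-dlp options (-x to extract audio, --audio-format to pick the format, -o for the output template), and audio extraction needs ffmpeg installed.

```python
import subprocess
from pathlib import Path


def build_ytdlp_command(url: str, out_dir: str = "sources") -> list[str]:
    """Build a yt-dlp command that keeps only the audio and saves it as WAV."""
    # %(title)s and %(ext)s are yt-dlp output-template fields.
    template = str(Path(out_dir) / "%(title)s.%(ext)s")
    return [
        "yt-dlp",
        "-x",                     # extract audio only (requires ffmpeg)
        "--audio-format", "wav",  # convert the result to WAV
        "-o", template,           # where to write the file
        url,
    ]


def download_audio(url: str, out_dir: str = "sources") -> None:
    """Run the command; raises CalledProcessError if yt-dlp exits with an error."""
    subprocess.run(build_ytdlp_command(url, out_dir), check=True)


if __name__ == "__main__":
    # Placeholder URL (hypothetical): print the command instead of downloading.
    print(" ".join(build_ytdlp_command("https://example.com/watch?v=VIDEO_ID")))
```

Keeping the command builder separate from the function that runs it means I can check what would be executed before anything touches the network.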

3. AI Stem Separation: Demucs

To effectively sample external sounds, I needed to split audio into usable parts. I had used Spleeter in the past, but Demucs ended up fitting my workflow much better.

Demucs provides a practical way to separate a track into vocals, drums, bass, and the remaining instrumental layers. This level of control is vital. Once a sound is separated, it becomes much easier to isolate a texture or rebuild the emotional shape of a sample, rather than just dragging a dense, full stereo file into a project and hoping it fits the mix.
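
As a rough sketch of what a separation run can look like, the snippet below builds a Demucs command and predicts where the stems should land. The helper names and the example file are hypothetical. The model name "htdemucs" and the output layout (output folder / model name / track name / stem.wav) reflect my understanding of Demucs' defaults, so treat them as assumptions and check them against your installed version.

```python
from pathlib import Path

# The four stems a standard Demucs model produces.
STEMS = ("vocals", "drums", "bass", "other")


def build_demucs_command(track: str, out_dir: str = "separated",
                         model: str = "htdemucs") -> list[str]:
    """Build a Demucs CLI call: -n selects the model, -o the output folder."""
    return ["demucs", "-n", model, "-o", out_dir, track]


def expected_stem_paths(track: str, out_dir: str = "separated",
                        model: str = "htdemucs") -> dict[str, Path]:
    """Predict stem locations, assuming the out_dir/model/track_name layout."""
    base = Path(out_dir) / model / Path(track).stem
    return {stem: base / f"{stem}.wav" for stem in STEMS}


if __name__ == "__main__":
    # Hypothetical input file, used only to show the shape of the output.
    print(" ".join(build_demucs_command("sources/choir.wav")))
    for stem, path in expected_stem_paths("sources/choir.wav").items():
        print(stem, "->", path)
```

Knowing the expected paths up front makes it easy to grab just the one stem I care about, such as the "other" layer for a texture.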

4. The Glue: Google Colab & Google Drive

The funny part is that Google Colab ended up being one of the most useful parts of this entire setup.

I had used Colab for small Python scripts before, but never for music. Then I realized: I could run the Demucs stem separation script directly in the browser, process the heavy audio files using Google's cloud GPUs, and send the separated stem results straight into my Google Drive.

This changed everything. It removed a massive, annoying transfer step. Because Soundtrap and my Android phone can both easily access Google Drive, Drive became the central hub where all my files lived and moved.
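
For readers who want to try the Drive part, this is a minimal sketch of a notebook cell that picks an output folder. In Colab it mounts Google Drive with Colab's own drive.mount helper. Anywhere else it falls back to a local folder so the same code still runs. The function name, the "music/stems" subfolder and the fallback folder are assumptions of mine, not a fixed convention.

```python
from pathlib import Path


def stems_output_dir(fallback: str = "stems_out",
                     project: str = "music/stems") -> Path:
    """Return a folder for separated stems, inside Drive when running in Colab."""
    try:
        # google.colab only exists inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        root = Path(fallback)  # local run: use a plain folder instead
    else:
        drive.mount("/content/drive")  # asks for authorization in the notebook
        root = Path("/content/drive/MyDrive")
    target = root / project  # hypothetical subfolder layout
    target.mkdir(parents=True, exist_ok=True)
    return target


if __name__ == "__main__":
    print("Stems will be written to:", stems_output_dir())
```

Passing the returned folder to Demucs as its output directory is what lets the stems land in Drive without a separate upload step.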

The System as a Whole

Once I connected Soundtrap, yt-dlp, Demucs, Colab, and Drive, the creative process stopped feeling fragmented.

Now, the workflow looks like this:

  1. Capture: Catch a melody or field recording on my phone.
  2. Source: Download interesting audio references via yt-dlp.
  3. Process: Run the audio through Demucs in Google Colab to extract specific stems.
  4. Store: Auto-save those stems into Google Drive.
  5. Produce: Open Soundtrap on my phone or PC, pull files from Drive and start arranging.
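
Step 4 above can be sketched as a small helper that copies finished stems into one folder per idea, so everything for a track sits in the same place when I open Soundtrap. The function name and the idea/stems folder layout are hypothetical, and the Drive root is whatever path Drive is mounted or synced at on your machine.

```python
import shutil
from pathlib import Path


def store_stems(stem_dir, drive_root, idea: str) -> list[Path]:
    """Copy every .wav in stem_dir into drive_root/idea/stems and return the copies."""
    target = Path(drive_root) / idea / "stems"
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    # Sorted so the result is predictable; non-audio files are ignored.
    for wav in sorted(Path(stem_dir).glob("*.wav")):
        copied.append(Path(shutil.copy2(wav, target / wav.name)))
    return copied
```

For example, store_stems("separated/htdemucs/choir", "/content/drive/MyDrive/music", "vatican") would collect that track's stems under a "vatican/stems" folder, assuming those paths exist.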

I got exactly what I wanted from the beginning: continuity.

This setup is not about turning a smartphone into a perfect, professional studio. It is about closing the gap between mobile and desktop so ideas do not get lost in the process. A system only matters if it actually helps you do the work. By removing the barriers between my devices, I can focus on shaping the track.