beastyrabbit/infinitune
INFINITUNE

Infinite Generative Music

Describe a vibe. Get an endless stream of original AI-generated songs — lyrics, cover art, and audio, all created on the fly.


Features · Screenshots · How It Works · Tech Stack · Quick Start · Architecture


Features

  • Endless Generation — describe a mood, genre, or artist and songs keep appearing in real-time
  • Prompt Steering — change direction mid-stream without losing history
  • One-Off Requests — drop in a specific song idea and it gets generated next
  • Album Mode — generate an entire album from a single track
  • Oneshot Mode — generate a single standalone song with full control
  • Song Library — browse all generated songs with genre, mood, energy, and era filters
  • Playlist Management — star favorites, search, filter by mode (endless/oneshot)
  • Multi-Device Rooms — synchronized playback across devices (Sonos-style)
  • Terminal Daemon Control — run local playback as a background daemon and control it with infi commands
  • Gapless Playback — next song preloads in background, zero gaps between tracks
  • Rating & Feedback — thumbs up/down to influence future generation
  • Cover Art — AI-generated vinyl-style album covers for every song
  • Configurable AI — switch between local (Ollama), cloud (OpenRouter), and OpenAI Codex (ChatGPT subscription) for LLM generation

Screenshots

Player View

The main player with now-playing display, generation controls, prompt steering, and the song queue.

Player with queue and generation controls

Song Library

Browse all generated songs with cover art. Filter by genre, mood, energy level, and era.

Song library with cover art and filters

Playlist Management

Star your favorites, search by name or prompt, filter by endless or oneshot mode.

Playlist management with starring and filters

Landing Page

Describe your music, pick a provider and model, and start listening.

Landing page — describe your music

Oneshot Mode

Generate a single standalone song with full prompt control and advanced settings.

Oneshot single-song generator

Worker Queue

Live dashboard showing LLM, image, and audio pipeline status with active/waiting/error counts.

Worker queue dashboard

More screenshots

Settings

Configure service endpoints (Ollama, ACE-Step, ComfyUI), API keys, and model preferences.

Settings page

Rooms

Create rooms for synchronized multi-device playback. Name your devices, join as player or controller.

Rooms for multi-device sync

How It Works

1. Describe your music — "2010 techno beats with English lyrics, S3RL energy, heavy 808 bass"

2. Hit Start — the unified backend kicks off the pipeline: LLM writes metadata + lyrics, ComfyUI renders cover art, ACE-Step synthesizes audio

3. Listen endlessly — songs appear in real-time. Rate them up/down to steer the direction. Request one-offs or generate entire albums from a single track.

Song Generation Pipeline

Each song flows through: pending → generating_metadata → metadata_ready → submitting_to_ace → generating_audio → saving → ready → played

The unified server runs a per-song worker pipeline with concurrency queues managing throughput across three lanes: LLM (metadata/lyrics), Image (cover art), and Audio (ACE-Step synthesis).
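The three-lane throttling described above can be pictured as a small per-lane concurrency queue. This is a conceptual sketch only: the lane names come from the description, but the class shape and the per-lane limits are assumptions, not Infinitune's actual implementation.

```typescript
// Sketch of a per-lane concurrency queue (illustrative; limits are assumed).
type Lane = "llm" | "image" | "audio";

class LaneQueue {
  private active = 0;
  private waiting: Array<() => void> = [];
  constructor(private limit: number) {}

  // Runs the task once a slot is free; at most `limit` tasks run at a time.
  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    }
    this.active++;
    try {
      return await task();
    } finally {
      this.active--;
      this.waiting.shift()?.(); // wake one queued task, preserving the limit
    }
  }
}

// Hypothetical lane limits — the real configuration may differ.
const lanes: Record<Lane, LaneQueue> = {
  llm: new LaneQueue(2),   // metadata + lyrics
  image: new LaneQueue(1), // cover art
  audio: new LaneQueue(1), // ACE-Step synthesis
};
```

Each song's worker submits its steps to the matching lane, so a burst of new songs queues up behind the limits instead of overwhelming the external services.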

Multi-Device Playback

Infinitune includes integrated room management for synchronized playback — think Sonos or Spotify Connect, but for AI-generated music.

  • Roles — devices join as player (outputs audio) or controller (remote control only)
  • Sync — all players stay locked to the same song and position
  • Per-device control — adjust volume or pause individual players independently
  • Clock sync — NTP-style ping/pong calibration keeps players within ~50 ms of each other on a LAN
  • Gapless — next song preloads in background while current one plays
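The ping/pong calibration above follows the classic NTP offset math. Here is a conceptual sketch of that formula; the room protocol's actual message fields are not reproduced here.

```typescript
// NTP-style clock offset from one ping/pong round trip (conceptual sketch).
// t0: client send time, t1: server receive time,
// t2: server reply time, t3: client receive time (all in ms).
function clockOffset(t0: number, t1: number, t2: number, t3: number): number {
  // Positive offset means the server clock is ahead of the client clock.
  return ((t1 - t0) + (t2 - t3)) / 2;
}

function roundTripDelay(t0: number, t1: number, t2: number, t3: number): number {
  // Network time only, with the server's processing time subtracted.
  return (t3 - t0) - (t2 - t1);
}
```

Averaging several rounds and discarding high-delay samples is the usual way to keep the estimate stable enough for sub-100 ms playback sync.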

Terminal Daemon (infi)

Use the terminal daemon when you want room playback without keeping the browser open.

# Start daemon manually
pnpm infi daemon start

# Pick playlist + play (auto-creates room when needed)
pnpm infi play

# Playback controls
pnpm infi stop
pnpm infi skip
pnpm infi volume up
pnpm infi volume down --step 0.1
pnpm infi mute

# Interactive selectors
pnpm infi room pick
pnpm infi song pick

# Status
pnpm infi status

# Persist CLI defaults (server/device/step)
pnpm infi config --server http://localhost:5175
pnpm infi config --device-name "DESK SPEAKER"
pnpm infi config --daemon-host 127.0.0.1 --daemon-port 17653
pnpm infi config

# Daemon HTTP endpoints (for Waybar/custom scripts)
curl -s http://127.0.0.1:17653/status | jq
curl -s http://127.0.0.1:17653/queue | jq
curl -s http://127.0.0.1:17653/waybar | jq

Install a local command wrapper:

pnpm infi install-cli

Install the CLI man page:

pnpm infi install-man

Then use:

infi play
infi stop
infi man
man infi

Install daemon as a systemd user service:

pnpm infi service install
pnpm infi service restart
pnpm infi service uninstall

Tech Stack

Layer            Technology
Frontend         React 19 · TanStack Router · React Query · Tailwind CSS 4
Backend          Hono (unified server — API + worker + rooms on one port)
Database         SQLite (better-sqlite3, WAL mode) · Drizzle ORM
Rooms            Integrated WebSocket room service · multi-device sync · REST API
Worker Pipeline  Event-driven background pipeline · per-song workers · concurrency queues
Audio            ACE-Step 1.5 (text-to-music synthesis)
Cover Art        ComfyUI (image generation)
LLM              Vercel AI SDK (Ollama/OpenRouter) · Codex App Server (openai-codex, ChatGPT subscription auth)
Build            Vite 7 · TypeScript 5.7 · Biome (lint/format) · pnpm monorepo

Quick Start

# Install dependencies
pnpm install

# Start everything (web + unified server) with Portless stable local domains
pnpm dev:all

# Fixed-port fallback (Vite :5173, server :5175)
pnpm dev:all:fallback

Default local dev uses Portless for stable local domains: web at http://web.localhost:1355 and the backend API at http://api.localhost:1355, with VITE_API_URL and APP_ORIGIN set automatically by the scripts. In fallback mode (pnpm dev:all:fallback), web runs on :5173 and the unified backend on :5175 (VITE_API_URL=http://localhost:5175). For backend-only local dev, use pnpm dev:server; plain pnpm server is a pnpm built-in command name, not a reliable script entry point.

T3Code Worktrees

Set T3Code's "Run automatically on worktree creation" command to:

bash scripts/t3code-worktree-setup.sh

Prerequisites

Infinitune requires external AI services running on your network:

Service                 Role                                              Default Port
ACE-Step 1.5            Text-to-music synthesis                           :8001
Ollama                  Local LLM (metadata, lyrics)                      :11434
ComfyUI                 Cover art generation                              :8188
OpenRouter (optional)   Cloud LLM access
Codex CLI (optional)    OpenAI Codex provider bridge (codex app-server)

Environment Variables

Configure in apps/server/.env.local:

# AI service endpoints (replace with your server addresses)
OLLAMA_URL=http://<your-server>:11434
ACE_STEP_URL=http://<your-server>:8001
COMFYUI_URL=http://<your-server>:8188

# Optional — cloud LLM via OpenRouter
OPENROUTER_API_KEY=sk-or-v1-...

# Optional — override Codex turn timeout (default: 360000 / 6 minutes)
CODEX_TURN_TIMEOUT_MS=360000

# Where to store generated audio files
MUSIC_STORAGE_PATH=/path/to/your/music/storage

OpenAI Codex (ChatGPT Subscription) Setup

Use this when you want LLM generation to run through your ChatGPT subscription instead of API-key billing.

  1. Install the Codex CLI and verify it is on your PATH (codex --version).
  2. Open Settings → Network → OPENAI CODEX (CHATGPT SUBSCRIPTION).
  3. Click START DEVICE AUTH, open the verification URL, and enter the one-time code.
  4. Wait for status Authenticated with ChatGPT.
  5. Pick OPENAI CODEX as provider in playlist creation or oneshot mode, then select a Codex model.

Notes:

  • This project uses codex app-server for the openai-codex provider (not the Vercel AI SDK transport).
  • openai-codex covers text generation (metadata, lyrics, persona). Cover art and audio still use ComfyUI + ACE-Step.

Playlist Lifecycle

  • Endless playlists move from active → closing after ~90s without a heartbeat.
  • Opening an endless playlist page sends a heartbeat, reactivates both closing and closed playlists, and refills the song buffer.
  • Oneshot playlists remain closed after completion and are not auto-reactivated by heartbeats.
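The lifecycle rules above can be sketched as a small state function. This is illustrative only; the status values and field names are assumptions inferred from this description, not the project's actual schema.

```typescript
// Sketch of the heartbeat-driven playlist lifecycle (names are assumptions).
type PlaylistStatus = "active" | "closing" | "closed";

interface Playlist {
  mode: "endless" | "oneshot";
  status: PlaylistStatus;
  lastHeartbeatAt: number; // epoch ms
}

const HEARTBEAT_TIMEOUT_MS = 90_000; // ~90s, per the description above

// Periodic check: endless playlists without a recent heartbeat start closing.
function tick(p: Playlist, now: number): PlaylistStatus {
  if (
    p.mode === "endless" &&
    p.status === "active" &&
    now - p.lastHeartbeatAt > HEARTBEAT_TIMEOUT_MS
  ) {
    p.status = "closing";
  }
  return p.status;
}

// A heartbeat revives endless playlists; oneshot playlists stay closed.
function onHeartbeat(p: Playlist, now: number): void {
  p.lastHeartbeatAt = now;
  if (p.mode === "endless" && p.status !== "active") {
    p.status = "active";
  }
}
```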

Architecture

Browser (React 19 + TanStack Router + React Query)
  ↕ HTTP fetch + WebSocket event invalidation (/ws)
  ↕ WebSocket room protocol (/ws/room)
Unified Server (Hono on :5175)
  ├── SQLite (better-sqlite3, WAL mode)
  ├── In-memory typed event bus
  ├── Service layer (song, playlist, settings)
  ├── Event-driven worker (metadata → cover → audio pipeline)
  ├── Room manager (multi-device playback)
  ├── WebSocket bridge → Browser (event invalidation)
  └── External services:
      ├── LLM (Ollama/OpenRouter via Vercel AI SDK + OpenAI Codex via Codex App Server)
      ├── ComfyUI → cover art
      └── ACE-Step 1.5 → audio synthesis

One server process handles everything: API routes, worker pipeline, room management, event broadcasting. No message queues. No inter-process HTTP. Single port.

Event-driven: Service mutations emit events → worker handlers react instantly → no polling. Song completion triggers buffer deficit check → creates new pending songs → triggers metadata generation → self-sustaining loop.
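The self-sustaining loop can be sketched with a minimal typed event bus. This is conceptual only: the event name, buffer size, and handler shape are assumptions, not Infinitune's real internals.

```typescript
// Conceptual sketch of the event-driven refill loop (names are illustrative).
type Events = {
  "song:ready": { playlistId: string };
};

class EventBus {
  private handlers = new Map<keyof Events, Array<(payload: any) => void>>();

  on<K extends keyof Events>(ev: K, fn: (payload: Events[K]) => void): void {
    const list = this.handlers.get(ev) ?? [];
    list.push(fn);
    this.handlers.set(ev, list);
  }

  emit<K extends keyof Events>(ev: K, payload: Events[K]): void {
    for (const fn of this.handlers.get(ev) ?? []) fn(payload);
  }
}

const TARGET_BUFFER = 3; // assumed buffer target, not the real value
const pendingByPlaylist = new Map<string, number>();

const bus = new EventBus();
bus.on("song:ready", ({ playlistId }) => {
  // A completed song exposes a buffer deficit; top it back up with
  // new pending songs, which would in turn trigger metadata generation.
  const pending = pendingByPlaylist.get(playlistId) ?? 0;
  if (pending < TARGET_BUFFER) {
    pendingByPlaylist.set(playlistId, pending + 1);
  }
});
```

The point of the pattern is that nothing polls: each mutation emits an event, and the handler that reacts to it produces the next mutation, keeping the loop alive as long as there is a deficit.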

Project Structure

infinitune/
  packages/
    shared/            # @infinitune/shared — types, protocol, pick-next-song
    room-client/       # @infinitune/room-client — room hooks
  apps/
    web/               # React frontend (Vite + TanStack)
    server/            # Unified backend (Hono — API + worker + rooms)
Built with mass GPU cycles and human curiosity.
