A local-first creative suite for image, audio, and video generation — everything running on your own GPU, no API keys for the core workflows. This guide covers install, configuration, and the small number of concepts that separate "it's generating garbage" from "that looks like a finished asset."
Wyltek Studio is a local generative-AI creative suite built around ComfyUI and a small set of purpose-built studio pages. The single Python server coordinates:
It isn't a cloud SaaS, and it isn't a thin wrapper around DALL-E or Midjourney. Every generative step runs on your hardware. The trade-off is that you need a capable GPU (at least 8 GB of VRAM for SDXL) and a few gigabytes of disk space for checkpoints.
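To check whether a machine clears the 8 GB bar before installing, one option is to ask `nvidia-smi` for each GPU's total memory. The following is a minimal sketch, assuming an NVIDIA card with the driver tools on the PATH. The helper names and the threshold constant are illustrative and are not part of Wyltek Studio.

```python
import shutil
import subprocess

# Assumption: a nominal "8 GB" card may report slightly under 8192 MiB,
# so the threshold leaves a little slack.
MIN_VRAM_MIB = 7900


def parse_vram_mib(smi_output: str) -> list[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi's noheader,nounits CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_vram_mib() -> list[int]:
    """Return total VRAM per GPU in MiB, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []


def meets_sdxl_minimum(vram_mib: list[int]) -> bool:
    """True if at least one GPU reaches the (slack-adjusted) 8 GB floor."""
    return any(v >= MIN_VRAM_MIB for v in vram_mib)
```

Calling `meets_sdxl_minimum(query_vram_mib())` gives a quick yes/no. An empty result means only that `nvidia-smi` could not be queried, not that no GPU is present.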
git clone https://github.com/toastmanAu/wyltek-studio ~/open-palette
cd ~/open-palette
./install.sh
The install script walks you through:
Then launch:
cd ~/open-palette
./run.sh # or: python server.py
Studio opens at http://localhost:7860. Config lives at config.yaml in the project root.
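If you script around the studio, it can help to confirm that the server is answering before sending work to it. The sketch below polls the address given above using only the standard library. The function names are hypothetical, and it checks only that something responds over HTTP on that port, not that the responder is Wyltek Studio.

```python
import time
import urllib.error
import urllib.request

STUDIO_URL = "http://localhost:7860"  # address from the launch step

# Bypass any system proxy, since the target is a local server.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def studio_is_up(url: str = STUDIO_URL, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at `url`, False if the connection fails."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status; it is still running.
        return True
    except OSError:
        # Connection refused, timeout, or DNS failure (URLError subclasses OSError).
        return False


def wait_for_studio(attempts: int = 30, delay: float = 1.0, url: str = STUDIO_URL) -> bool:
    """Poll until the server responds or the attempts run out."""
    for _ in range(attempts):
        if studio_is_up(url):
            return True
        time.sleep(delay)
    return False
```

A typical use is to start `./run.sh` in the background and call `wait_for_studio()` before opening the browser or calling any endpoint.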
Main page. Prompt → image, with LoRA, IP-Adapter, reference images, upscaler.
Compare mode renders the same prompt through two models side-by-side.
"OP my prompt" button next to every prompt field. Uses a local Ollama LLM to enrich your prompt and suggest a negative prompt.
Per-prompt model override via dropdown.
rembg background removal (5 models), SAM click-to-segment, manual mask tools (brush / rect / ellipse / lasso).
Extract PNG frames from any video file. Frame-by-frame scrubber. Send directly to Image Tools.
Piper (fast CPU), Kokoro (expressive), XTTS v2 (voice cloning), Bark (emotions).
MusicGen (text-to-music), beat builder, sample packs.
AnimateDiff Lightning (fast text-to-video), Stable Video Diffusion, timeline compositor.
SDXL + pixel-art LoRA batch sprite generation. Auto-tile for game assets.
Shared asset + timeline system. Any output from any tool lands here with metadata for later assembly.
Wyltek Studio's model catalog (Settings → Models) lists what's available and what's installed. Pick based on intent:
| Intent | Model recommendation | Why |
|---|---|---|
| Photoreal portraits / architecture | RealVisXL V5.0 | Heavily finetuned photo SDXL. Excellent skin tones, interiors, landscapes. |
| Versatile "all-rounder" | juggernautXL v9 | Photoreal-leaning but comfortable with illustration. Good with LoRAs. |
| Illustration / fantasy | DreamShaper XL Turbo | Only needs 6–10 steps (Turbo distilled). Great for concept art iteration. |
| Neutral "canvas" for LoRA testing | SDXL base 1.0 (Q4) | Doesn't fight LoRAs the way heavily-finetuned bases do. Best for evaluating new LoRAs. |
| State-of-the-art quality | Flux dev or FLUX.2-klein | Best prompt following and coherence. Slow + heavy (16 GB VRAM class). |
| Fast iteration | Flux Schnell or SDXL + Lightning LoRA | 4-step distilled. Draft-quality previews in seconds. |
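For scripted workflows, the table above can be carried around as a small lookup. This is an illustrative sketch only: the intent keys and the `recommend` helper are made up for the example, and the model names are copied from the table rather than read from the studio's catalog.

```python
from typing import Optional

# Model names copied from the recommendation table; the keys are hypothetical.
RECOMMENDED_MODELS = {
    "photoreal": ["RealVisXL V5.0"],
    "all_rounder": ["juggernautXL v9"],
    "illustration": ["DreamShaper XL Turbo"],
    "lora_testing": ["SDXL base 1.0 (Q4)"],
    "best_quality": ["Flux dev", "FLUX.2-klein"],
    "fast_iteration": ["Flux Schnell", "SDXL + Lightning LoRA"],
}


def recommend(intent: str, installed: Optional[set[str]] = None) -> list[str]:
    """Return the table's picks for an intent, optionally limited to installed models."""
    if intent not in RECOMMENDED_MODELS:
        known = ", ".join(sorted(RECOMMENDED_MODELS))
        raise ValueError(f"unknown intent {intent!r}; expected one of: {known}")
    picks = RECOMMENDED_MODELS[intent]
    if installed is None:
        return list(picks)
    return [name for name in picks if name in installed]
```

Passing the set of installed models filters the recommendations to what is already on disk, so an empty result signals that a download is needed first.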
The single biggest lever on image quality isn't the model — it's the prompt. "A cat" gives SDXL nothing; "A detailed, whimsical cat character, vibrant colors, professional illustration, trending on ArtStation" gives it a direction. The Prompt Optimizer (OP my prompt) does that transformation via a local Ollama LLM.
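To make the transformation concrete, here is a rough sketch of asking a local Ollama server to enrich a prompt through its `/api/generate` endpoint. It is not the studio's actual implementation: the system prompt, the function names, and the `llama3.2` model tag are placeholders chosen for the example, and the port is Ollama's default.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

# Hypothetical instruction text; the studio's real optimizer prompt is not documented here.
SYSTEM_PROMPT = (
    "Rewrite the user's image prompt with concrete subject, style, lighting and "
    "quality descriptors. Reply with the rewritten prompt only."
)


def build_request(prompt: str, model: str) -> dict:
    """Build the JSON body for a non-streaming /api/generate call."""
    return {"model": model, "prompt": prompt, "system": SYSTEM_PROMPT, "stream": False}


def optimize_prompt(prompt: str, model: str = "llama3.2") -> str:
    """Send the prompt to a local Ollama server and return the enriched text."""
    body = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        # With "stream": False, Ollama returns one JSON object whose
        # "response" field holds the generated text.
        return json.loads(resp.read())["response"].strip()
```

With Ollama running and the model pulled, `optimize_prompt("A cat")` returns an expanded prompt. The exact wording varies by model and run.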
The dropdown next to the OP button lets you swap Ollama models per call. Your selection persists in localStorage across sessions; the server default (configured in Settings → Prompt Optimizer) is the fallback.
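The precedence described above can be written as a small resolver. This sketch assumes the browser sends its stored dropdown choice along with each request and that an empty value means nothing was chosen. The function name and arguments are hypothetical.

```python
from typing import Optional


def resolve_optimizer_model(dropdown_choice: Optional[str], server_default: str) -> str:
    """Prefer the user's saved dropdown choice; fall back to the server default."""
    choice = (dropdown_choice or "").strip()
    return choice if choice else server_default
```

Treating blank and missing values the same way means a cleared localStorage entry quietly falls back to the default configured in Settings.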
Style LoRAs transform the aesthetic of your output — voxel, pixel art, watercolour, crayon, etc. Wyltek Studio ships with six style LoRAs preconfigured. Three things matter when using them:
Every image, audio clip, and video frame you generate lands in Wyltek Studio's Projects system. Each project has a shared assets folder and a timeline where you arrange generations in sequence for export. The flow:
This is what separates Wyltek Studio from a loose collection of AI tools — you can iterate on a music track, generate matching frames, run TTS for narration, and assemble them all in one place without leaving the app.
Wyltek Studio ships several skills for Claude Code that treat the studio as a callable service. Relevant ones:
- /cut-subject: background-removal skill that wraps Wyltek Studio's rembg + SAM endpoints. Run /cut-subject ~/photos/x.jpg and Claude picks the right model, calls Wyltek Studio, and returns the transparent PNG.
- /ingest-doc: extracts text / frames / audio from any document and pushes it through Wyltek Studio's processing pipeline.
- /quote-meme: quote-card meme generator using Wyltek Studio's image tools.

See the Skills guide for the full list.