
🎨 LoRAs, Prompt Optimizer & Style Control

A practical guide to getting sharp, intentional output from Wyltek Studio's image generator — prompt enhancement via local LLMs, style LoRAs, strength settings, and the trigger words that make them click.

Contents

1. Prompt Optimizer ("OP my prompt")
2. Which OP model should I use?
3. Style LoRAs — the fundamentals
4. Trigger words (and why your LoRA is doing nothing)
5. LoRA strength — finding the sweet spot
6. Three production-ready presets
7. Reproduce these findings on your own hardware

1. Prompt Optimizer — "OP my prompt"

The most common reason an image comes out disappointing isn't the model, it's the prompt. Four words of crude human input ("a pokemon style cat") give Stable Diffusion nothing to anchor on. What you actually want is something like:

A highly detailed, adorable Pokémon-style cat character design,
vibrant colors, professional character sheet illustration,
dynamic pose, volumetric lighting, clean vector art style,
trending on ArtStation.

The Prompt Optimizer (reachable via the "OP my prompt" button next to every prompt field) does that transformation for you — a local Ollama LLM enhances your shorthand into a prompt that actually tells SDXL what you meant.

The enhancer also suggests a negative prompt — descriptors to avoid — which usually improves output more than people expect. That's auto-populated into the Negative prompt field when OP fires.
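As a rough sketch of that last step: the enhancer's reply has to be split into a positive and a negative prompt before the two fields can be filled. The JSON shape used here (`prompt` and `negative_prompt` keys) and the function name `parse_enhancer_reply` are assumptions for illustration, not Wyltek Studio's actual schema.

```python
import json


def parse_enhancer_reply(raw: str, fallback_prompt: str) -> dict:
    """Split an enhancer reply into positive and negative prompts.

    Assumes (hypothetically) that the model was asked to answer with a JSON
    object holding "prompt" and "negative_prompt" keys. If the reply is not
    valid JSON, or has no usable "prompt", the user's original text is kept.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}

    prompt = data.get("prompt")
    negative = data.get("negative_prompt")
    has_prompt = isinstance(prompt, str) and prompt.strip() != ""
    return {
        "prompt": prompt.strip() if has_prompt else fallback_prompt,
        "negative_prompt": negative.strip() if isinstance(negative, str) else "",
    }
```

Falling back to the original text rather than raising means a model that returns malformed output costs you an enhancement, not a generation.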

Why local, not GPT-4? Latency, privacy, cost. A good 8B Ollama model on CPU gives you rich prompt expansion in ~10–30 s with no API keys, no rate limits, and your creative work never leaves the machine. See the Local LLM Inference guide for hardware recommendations.

2. Which OP model should I use?

Wyltek Studio ships with a per-prompt model dropdown next to the OP button, plus a global default you set in Settings. You can switch any time — different models produce genuinely different prompt styles, and different subjects benefit from different styles.

We benchmarked every model we had on disk (3B–35B parameters) on CPU-only inference, warm cache, using the server's default system prompt. Here's what held up:

Model	Size	Typical latency (CPU)	Verdict
`gemma3n:e4b`	~7B Q4	~10 s	Speed champ. Terse 20-word enhancements, well-chosen style tokens.
`gemma4:latest`	~8B Q4	~10 s	Recommended default. Rich descriptive vocabulary ("volumetric lighting", "trending on ArtStation"), stable across creative subjects.
`qwen2.5:14b`	14B Q4	~40 s	Consistent, wordy (40+ word outputs). Good fallback if gemma4 misses the subject.
`qwen3.5:35b-a3b`	35B MoE	30–220 s	Highest-quality prompts when it behaves, but latency explodes on creative subjects (reasoning-mode engages unpredictably). Keep for tricky prompts, not daily driver.
`qwen3:8b`, `qwen3:14b`	8B / 14B	30–120 s	Reasonable, but variance is high. qwen3:14b consistently runs 2× slower than qwen2.5:14b at the same size.
`qwen3.5:9b`, `carnice-9b`	9B	200 s+ or timeout	Skip. Both exhibit runaway thinking-mode or don't produce valid JSON output.

num_gpu matters even on CPU-only Ollama

Our server passes num_gpu: -1 to Ollama (auto-offload all GPU layers that fit). Historically this was set to 0 to avoid VRAM contention with the image generator. On hardware without a GPU-capable Ollama daemon, num_gpu: -1 is still ~3× faster than 0 because it unblocks Ollama's normal multi-threaded CPU path. The 0 value forced a strict fallback path that left performance on the floor.
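For reference, `num_gpu` travels in the `options` object of an Ollama generate request. The sketch below builds such a request against Ollama's default local endpoint; the helper names `build_payload` and `generate` are made up for this example, and how Wyltek Studio's server actually assembles its request is not shown in this guide.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(model: str, prompt: str, num_gpu: int = -1) -> dict:
    """Build a non-streaming generate request with an explicit num_gpu option."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a token stream
        "options": {"num_gpu": num_gpu},  # -1 lets Ollama decide; 0 offloads no layers
    }


def generate(payload: dict, url: str = OLLAMA_URL, timeout: float = 300.0) -> str:
    """POST the payload to a running Ollama daemon and return the generated text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

To check the speed difference on your own machine, time `generate(build_payload(model, prompt, num_gpu=0))` against the same call with `num_gpu=-1`, after one warm-up request so that model loading is not included.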

Our heuristic for picking an OP model

Daily driver: gemma4:latest. Fast enough to use on every gen.
Testing / critical output: switch to qwen3.5:35b-a3b for one call, accept the 30–220 s wait.
Brief / terse prompts: gemma3n:e4b — smallest model, tightest output.

3. Style LoRAs — the fundamentals

A LoRA (Low-Rank Adaptation) is a small fine-tune, typically 40–500 MB, that nudges a base diffusion model toward a specific style without retraining the whole thing. Wyltek Studio ships with six style LoRAs (five style transformers and one detailer) plus two speed LoRAs:

LoRA	Type	What it does
`pixel-art-xl`	Style	Pixel / 8-bit / retro-game art
`voxel-xl`	Style	Minecraft-style blocky 3D geometry
`crayon-style-xl`	Style	Crayon / wax drawing aesthetic
`watercolor-xl`	Style	Soft watercolour painting
`sticker-style-xl`	Style	Die-cut vinyl sticker look
`anime-detailer-xl`	Detailer	Sharpens anime features. Needs an anime subject + "anime" in the prompt to engage.
`sdxl_lightning_4step / 8step`	Speed	Distillation LoRAs. Let you run SDXL at 4–8 steps instead of 25–30. Pair with any style LoRA if you have the code to stack them.

Picking a LoRA tells the model what style you want. But there are two hidden controls that make or break the result: trigger words and strength.

4. Trigger words — and why your LoRA is doing nothing

Every style LoRA was trained with specific activator tokens in its training captions. If those tokens don't appear in your prompt, the LoRA barely fires. You'll get a tint toward the style rather than the style itself.

LoRA	Trigger word(s)
`pixel-art-xl`	pixel art, pixel-art, 8-bit
`voxel-xl`	voxel art, voxel
`crayon-style-xl`	crayon drawing, crayon, wax crayon
`watercolor-xl`	watercolor painting, watercolor, watercolour
`sticker-style-xl`	die-cut sticker, sticker
`anime-detailer-xl`	anime, anime style

Wyltek Studio auto-injects triggers for you. When you pick a style LoRA and your prompt doesn't already contain a trigger or an explicit style cue (like "photo" or "oil painting"), Wyltek Studio silently appends the LoRA's canonical trigger before submitting. The injection is recorded in the image's JSON sidecar as original_prompt + trigger_injected so you always know what was added. This is the "hybrid" strategy — auto-help for bare prompts, hands-off for deliberate ones.
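The hybrid rule can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped implementation: the `STYLE_CUES` list is a two-item stand-in for whatever the real cue list is, treating the first trigger in each row as the "canonical" one is a guess, and matching is a plain case-insensitive substring test.

```python
# Trigger words copied from the table above; the first entry is assumed canonical.
TRIGGERS = {
    "pixel-art-xl": ["pixel art", "pixel-art", "8-bit"],
    "voxel-xl": ["voxel art", "voxel"],
    "crayon-style-xl": ["crayon drawing", "crayon", "wax crayon"],
    "watercolor-xl": ["watercolor painting", "watercolor", "watercolour"],
    "sticker-style-xl": ["die-cut sticker", "sticker"],
    "anime-detailer-xl": ["anime", "anime style"],
}

# Hypothetical list of explicit style cues; only these two are named in the guide.
STYLE_CUES = ["photo", "oil painting"]


def inject_trigger(prompt: str, lora: str) -> dict:
    """Append the LoRA's trigger unless the prompt already states a style."""
    lowered = prompt.lower()
    triggers = TRIGGERS.get(lora, [])
    has_trigger = any(t in lowered for t in triggers)
    has_style_cue = any(c in lowered for c in STYLE_CUES)

    if not triggers or has_trigger or has_style_cue:
        # Deliberate prompt (or unknown LoRA): leave it untouched.
        return {"prompt": prompt, "original_prompt": prompt, "trigger_injected": None}

    canonical = triggers[0]
    return {
        "prompt": f"{prompt.rstrip(' ,')}, {canonical}",
        "original_prompt": prompt,      # sidecar field named in the guide
        "trigger_injected": canonical,  # sidecar field named in the guide
    }
```

For example, `inject_trigger("a pokemon style cat", "voxel-xl")` yields the prompt `a pokemon style cat, voxel art`, while `inject_trigger("photo of a cat", "crayon-style-xl")` returns the prompt unchanged.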

Empirical proof from our own testing

In a 48-image strength × trigger sweep, the same LoRA at the same strength consistently produced dramatically more recognisable style when its trigger word was present. Voxel-xl is the clearest example:

Without trigger: a subtle geometric flattening, almost imperceptible.
With "voxel art" appended: fully blocky 3D rendering — looks like actual Minecraft-style game art.

No amount of strength alone produces the trigger effect. Strength and triggers are multiplicative — you need both.

5. LoRA strength — finding the sweet spot

Each LoRA exposes two strength knobs:

Model strength: how much the LoRA's weights modify the base model's UNet (the thing that renders pixels)
CLIP strength: how much the LoRA modifies the text encoder (the thing that interprets your prompt)

Conventional wisdom: model strength should be slightly higher than CLIP strength. Style LoRAs want model ≈ 0.8–1.2 with clip ≈ 0.5–0.9. Below 0.7 on model, the style is a faint tint. Above 1.2, most LoRAs start to oversaturate or fry details.

Our strength sweep — per-LoRA recommendations

These are our own results running each LoRA on a pokemon-style-cat subject, juggernautXL_v9 base, gemma4-enhanced prompt, with the trigger word appended. "Sweet spot" means where the LoRA first produces its full aesthetic without visible burn:

LoRA	Sweet spot (model/clip)	Notes
`voxel-xl`	1.2 / 0.9 +trigger	Needs max strength + trigger to produce actual blocky 3D art. Perfect for game-ready character assets.
`pixel-art-xl`	1.0 / 0.7 +trigger	Needs 1.0+ to override base model's photoreal tendency. At 0.8, you get hints of pixel style; at 1.0+, full 8-bit.
`crayon-style-xl`	1.0 / 0.7 +trigger	Robust — works well at all strengths. At 1.0+ with trigger it looks like elite hand-drawn crayon illustration. Usable at 0.6+ for subtle styling.
`watercolor-xl`	0.8–1.0 / 0.5–0.7 +trigger	Charming at lower strengths with trigger. Burns to muddy at 1.2.
`sticker-style-xl`	1.0 / 0.7 +trigger	Needs trigger + standard strength to produce the die-cut vinyl look.
`anime-detailer-xl`	Only useful on anime subjects	A detailer, not a style transformer. On a pokemon cat it's largely a no-op regardless of strength. Use with anime-native prompts.

A caveat about subject matter. These strength recommendations are from a pokemon-cat subject on juggernautXL base. LoRA behaviour varies by subject (abstract backgrounds don't exercise style LoRAs well) and by base model (photoreal-heavy bases like RealVisXL fight style LoRAs; neutral bases like SDXL 1.0 give them more room). If your results don't match these numbers, sweep the strength yourself — see section 7.

6. Three production-ready presets

Three combinations that came out of our testing as ready-to-use. Each is a specific {base, LoRA, strength_model, strength_clip, trigger} tuple that reliably produces its named aesthetic:

🎮 Voxel Game Character

Base model: juggernautXL_v9.safetensors
LoRA: voxel-xl
Strength: model 1.2 / clip 0.9
Trigger: append `, voxel art` to the prompt
Steps / CFG: 30 / 7.0
Output looks like: Pre-rendered Minecraft / voxel-game character art. Scale-down friendly — outputs slot directly into 2D game assets without further processing.

✏️ Elite Hand-Drawn (Crayon)

Base model: juggernautXL_v9.safetensors
LoRA: crayon-style-xl
Strength: model 1.0 / clip 0.7
Trigger: append `, crayon drawing` to the prompt
Steps / CFG: 30 / 7.0
Output looks like: High-quality crayon / wax illustration with visible tooth and deliberate hand-drawn linework. Works across subjects — people, animals, objects.

🎨 Watercolour Illustration

Base model: juggernautXL_v9.safetensors
LoRA: watercolor-xl
Strength: model 0.8–1.0 / clip 0.5–0.7
Trigger: append `, watercolor painting` to the prompt
Steps / CFG: 30 / 7.0
Output looks like: Soft watercolour aesthetic with paper-texture charm. Don't push above 1.0 — watercolour LoRAs burn to muddy at high strength.
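If you drive generation from your own scripts, the three presets fit naturally into a small lookup table. The sketch below is one way to encode them; the `Preset` class, the preset keys, and `apply_preset` are names invented for this example, and the watercolour entry uses the midpoint of the recommended ranges, which is an arbitrary choice.

```python
from dataclasses import dataclass

BASE = "juggernautXL_v9.safetensors"


@dataclass(frozen=True)
class Preset:
    base: str
    lora: str
    strength_model: float
    strength_clip: float
    trigger: str
    steps: int = 30
    cfg: float = 7.0


PRESETS = {
    "voxel-game-character": Preset(BASE, "voxel-xl", 1.2, 0.9, "voxel art"),
    "elite-hand-drawn": Preset(BASE, "crayon-style-xl", 1.0, 0.7, "crayon drawing"),
    # The guide gives ranges (0.8-1.0 / 0.5-0.7); the midpoints are an assumption.
    "watercolour-illustration": Preset(BASE, "watercolor-xl", 0.9, 0.6, "watercolor painting"),
}


def apply_preset(prompt: str, name: str) -> dict:
    """Return generation settings for a preset, appending its trigger if missing."""
    p = PRESETS[name]
    text = prompt if p.trigger in prompt.lower() else f"{prompt}, {p.trigger}"
    return {
        "prompt": text,
        "base": p.base,
        "lora": p.lora,
        "strength_model": p.strength_model,
        "strength_clip": p.strength_clip,
        "steps": p.steps,
        "cfg": p.cfg,
    }
```

Keeping the trigger inside the preset means a preset cannot be applied without it, which matches the finding that strength alone does not produce the style.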

7. Reproduce these findings on your own hardware

Every claim in this guide was produced by Python scripts that ship with Wyltek Studio: three test scripts plus a gallery renderer. Running them on your own setup lets you verify the conclusions against your actual LoRAs, GPU, and Ollama models:

OP-model benchmark

Measures cold-load and warm inference latency for each Ollama model you have installed. Useful for picking a daily-driver OP model.

cd ~/open-palette
python scripts/op_latency_bench.py gemma4:latest qwen3.5:35b-a3b \
    --trials 2 --num-gpu -1

Full model × LoRA matrix

Enhances the same prompt with every OP model you have, then renders each result against each style LoRA. Outputs a gallery you can scroll side by side. Takes ~30 minutes for a full matrix on a consumer GPU.

python scripts/lora_matrix_test.py "a pokemon style cat"
python scripts/matrix_gallery.py    # renders HTML gallery of latest run

Strength + trigger sweep

Holds the prompt and OP model fixed, varies LoRA strength (4 tiers) and trigger presence (A/B). Answers "what strength should I run this LoRA at?" empirically. ~10 minutes for 48 images.

python scripts/lora_strength_sweep.py "a pokemon style cat" --with-trigger-ab
python scripts/matrix_gallery.py
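The 48-image figure is consistent with six LoRAs × four strength tiers × two trigger states. The sketch below enumerates such a grid; it is not the contents of `lora_strength_sweep.py`, and the tier values are assumptions, since the guide only says there are four tiers.

```python
from itertools import product

LORAS = [
    "pixel-art-xl", "voxel-xl", "crayon-style-xl",
    "watercolor-xl", "sticker-style-xl", "anime-detailer-xl",
]

# Assumed (model, clip) strength tiers; the real script's values may differ.
STRENGTH_TIERS = [(0.6, 0.4), (0.8, 0.5), (1.0, 0.7), (1.2, 0.9)]


def sweep_grid(loras=LORAS, tiers=STRENGTH_TIERS):
    """Yield one job per (LoRA, strength tier, trigger on/off) combination."""
    for lora, (model, clip), with_trigger in product(loras, tiers, (False, True)):
        yield {
            "lora": lora,
            "strength_model": model,
            "strength_clip": clip,
            "with_trigger": with_trigger,
        }


grid = list(sweep_grid())
print(len(grid))  # 6 LoRAs x 4 tiers x 2 trigger states = 48
```

Holding the prompt, seed, and OP model fixed across the grid is what makes rows and columns comparable; vary only the three axes above.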

After either image run, matrix_gallery.py writes a browseable HTML gallery with hover tooltips showing the exact prompt, LoRA, and strength for every image. Compare rows/columns to find your own sweet spots.

See also

Using Wyltek Studio — install, configuration, full feature map
Local LLM Inference Setup — picking Ollama models for your hardware
Wyltek Studio source on GitHub