โ† Back to AI Hub

🎨 LoRAs, Prompt Optimizer & Style Control

A practical guide to getting sharp, intentional output from Wyltek Studio's image generator: prompt enhancement via local LLMs, style LoRAs, strength settings, and the trigger words that make them click.


1. Prompt Optimizer: "OP my prompt"

The most common reason an image comes out disappointing isn't the model, it's the prompt. Four words of crude human input ("a pokemon style cat") give Stable Diffusion nothing to anchor on. What you actually want is something like:

A highly detailed, adorable Pokémon-style cat character design,
vibrant colors, professional character sheet illustration,
dynamic pose, volumetric lighting, clean vector art style,
trending on ArtStation.

The Prompt Optimizer (reachable via the "OP my prompt" button next to every prompt field) does that transformation for you: a local Ollama LLM enhances your shorthand into a prompt that actually tells SDXL what you meant.

The enhancer also suggests a negative prompt (descriptors to avoid), which usually improves output more than people expect. That's auto-populated into the Negative prompt field when OP fires.
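A minimal sketch of how an enhancer reply could be turned into the two prompt fields. It assumes the LLM is asked to answer with a JSON object; the key names `prompt` and `negative_prompt` and the helper `parse_enhancement` are illustrative assumptions, not Wyltek Studio's actual schema.

```python
import json


def parse_enhancement(raw_reply: str, original_prompt: str) -> dict:
    """Parse an enhancer reply, falling back to the user's own prompt.

    Assumes (hypothetically) that the model answers with a JSON object
    holding "prompt" and "negative_prompt".
    """
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        data = {}  # the model did not return valid JSON
    if not isinstance(data, dict):
        data = {}  # valid JSON, but not an object
    prompt = str(data.get("prompt") or "").strip()
    negative = str(data.get("negative_prompt") or "").strip()
    return {
        # Keep the user's words if the model produced nothing usable.
        "prompt": prompt or original_prompt,
        "negative_prompt": negative,
    }
```

Falling back to the original prompt means a model that emits invalid JSON degrades to "no enhancement" rather than an empty prompt field.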

Why local, not GPT-4? Latency, privacy, cost. A good 8B Ollama model on CPU gives you rich prompt expansion in ~10–30 s with no API keys, no rate limits, and your creative work never leaves the machine. See the Local LLM Inference guide for hardware recommendations.

2. Which OP model should I use?

Wyltek Studio ships with a per-prompt model dropdown next to the OP button, plus a global default you set in Settings. You can switch any time: different models produce genuinely different prompt styles, and different subjects benefit from different styles.

We benchmarked every model we had on disk (3B–35B parameters) on CPU-only inference, warm cache, using the server's default system prompt. Here's what held up:

| Model | Size | Typical latency (CPU) | Verdict |
|---|---|---|---|
| gemma3n:e4b | ~7B Q4 | ~10 s | Speed champ. Terse 20-word enhancements, well-chosen style tokens. |
| gemma4:latest | ~8B Q4 | ~10 s | Recommended default. Rich descriptive vocabulary ("volumetric lighting", "trending on ArtStation"), stable across creative subjects. |
| qwen2.5:14b | 14B Q4 | ~40 s | Consistent, wordy (40+ word outputs). Good fallback if gemma4 misses the subject. |
| qwen3.5:35b-a3b | 35B MoE | 30–220 s | Highest-quality prompts when it behaves, but latency explodes on creative subjects (reasoning-mode engages unpredictably). Keep for tricky prompts, not daily driver. |
| qwen3:8b, qwen3:14b | 8B / 14B | 30–120 s | Reasonable, but variance is high. qwen3:14b consistently runs 2× slower than qwen2.5:14b at the same size. |
| qwen3.5:9b, carnice-9b | 9B | 200 s+ or timeout | Skip. Both exhibit runaway thinking-mode or don't produce valid JSON output. |

num_gpu matters even on CPU-only Ollama

Our server passes num_gpu: -1 to Ollama (auto-offload all GPU layers that fit). Historically this was set to 0 to avoid VRAM contention with the image generator. On hardware without a GPU-capable Ollama daemon, num_gpu: -1 is still ~3× faster than 0 because it unblocks Ollama's normal multi-threaded CPU path. The 0 value forced a strict fallback path that left performance on the floor.
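As a rough sketch, this is how a request carrying that option could be built against Ollama's local `/api/generate` endpoint. The helper names, the default timeout, and the choice to request JSON output are assumptions for illustration; the server's real system prompt and request code are not shown in this guide.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_op_request(model: str, user_prompt: str, system_prompt: str,
                     num_gpu: int = -1) -> dict:
    """Build the JSON body for a single non-streaming generate call."""
    return {
        "model": model,
        "system": system_prompt,   # hypothetical: the real system prompt is server-side
        "prompt": user_prompt,
        "stream": False,           # one complete reply instead of chunks
        "format": "json",          # ask Ollama to constrain output to JSON
        "options": {"num_gpu": num_gpu},  # -1 as described above; 0 was the old setting
    }


def send_op_request(payload: dict, timeout_s: float = 240.0) -> str:
    """POST the payload and return the model's text reply (needs a running daemon)."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read())["response"]
```

Building the payload separately from sending it makes it easy to compare `num_gpu=-1` against `num_gpu=0` on your own machine by changing a single argument.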

Our heuristic for picking an OP model

3. Style LoRAs: the fundamentals

A LoRA (Low-Rank Adaptation) is a small fine-tune, typically 40–500 MB, that nudges a base diffusion model toward a specific style without retraining the whole thing. Wyltek Studio ships with six style LoRAs plus two speed LoRAs:

| LoRA | Type | What it does |
|---|---|---|
| pixel-art-xl | Style | Pixel / 8-bit / retro-game art |
| voxel-xl | Style | Minecraft-style blocky 3D geometry |
| crayon-style-xl | Style | Crayon / wax drawing aesthetic |
| watercolor-xl | Style | Soft watercolour painting |
| sticker-style-xl | Style | Die-cut vinyl sticker look |
| anime-detailer-xl | Detailer | Sharpens anime features. Needs an anime subject + "anime" in the prompt to engage. |
| sdxl_lightning_4step / 8step | Speed | Distillation LoRAs. Let you run SDXL at 4–8 steps instead of 25–30. Pair with any style LoRA if you have the code to stack them. |

Picking a LoRA tells the model what style you want. But two hidden controls make or break the result: trigger words and strength.

4. Trigger words, and why your LoRA is doing nothing

Every style LoRA was trained with specific activator tokens in its training captions. If those tokens don't appear in your prompt, the LoRA barely fires. You'll get a tint toward the style rather than the style itself.

| LoRA | Trigger word(s) |
|---|---|
| pixel-art-xl | pixel art, pixel-art, 8-bit |
| voxel-xl | voxel art, voxel |
| crayon-style-xl | crayon drawing, crayon, wax crayon |
| watercolor-xl | watercolor painting, watercolor, watercolour |
| sticker-style-xl | die-cut sticker, sticker |
| anime-detailer-xl | anime, anime style |

Wyltek Studio auto-injects triggers for you. When you pick a style LoRA and your prompt doesn't already contain a trigger or an explicit style cue (like "photo" or "oil painting"), Wyltek Studio silently appends the LoRA's canonical trigger before submitting. The injection is recorded in the image's JSON sidecar as original_prompt + trigger_injected so you always know what was added. This is the "hybrid" strategy: auto-help for bare prompts, hands-off for deliberate ones.
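The hybrid rule described above can be sketched roughly as follows. This is not Wyltek Studio's actual code: the function name, the plain substring matching, the two-item style-cue list, and the choice of the first listed trigger as the canonical one are all assumptions made for illustration.

```python
# Trigger words per LoRA, as listed in the table above.
TRIGGERS = {
    "pixel-art-xl": ["pixel art", "pixel-art", "8-bit"],
    "voxel-xl": ["voxel art", "voxel"],
    "crayon-style-xl": ["crayon drawing", "crayon", "wax crayon"],
    "watercolor-xl": ["watercolor painting", "watercolor", "watercolour"],
    "sticker-style-xl": ["die-cut sticker", "sticker"],
    "anime-detailer-xl": ["anime", "anime style"],
}

# Assumption: only the two cues this guide names; the real list is likely longer.
STYLE_CUES = ["photo", "oil painting"]


def inject_trigger(prompt: str, lora: str) -> tuple[str, dict]:
    """Return (final_prompt, sidecar_fields) following the hybrid strategy."""
    lowered = prompt.lower()
    triggers = TRIGGERS.get(lora, [])
    has_trigger = any(t in lowered for t in triggers)
    has_style_cue = any(c in lowered for c in STYLE_CUES)
    if not triggers or has_trigger or has_style_cue:
        # Deliberate prompt (or unknown LoRA): leave it untouched.
        return prompt, {"original_prompt": prompt, "trigger_injected": None}
    canonical = triggers[0]  # assumption: first listed trigger is the canonical one
    final = f"{prompt.rstrip().rstrip(',')}, {canonical}"
    return final, {"original_prompt": prompt, "trigger_injected": canonical}
```

For example, `inject_trigger("a pokemon style cat", "voxel-xl")` yields the prompt `a pokemon style cat, voxel art`, while a prompt that already mentions "voxel" or "oil painting" passes through unchanged.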

Empirical proof from our own testing

In a 48-image strength × trigger sweep, the same LoRA at the same strength consistently produced dramatically more recognisable style when its trigger word was present. Voxel-xl is the clearest example.

No amount of strength alone produces the trigger effect. Strength and triggers are multiplicative: you need both.

5. LoRA strength: finding the sweet spot

Each LoRA exposes two strength knobs: model strength and CLIP strength.

Conventional wisdom: model strength should be slightly higher than CLIP strength. Style LoRAs want model ≈ 0.8–1.2 with clip ≈ 0.5–0.9. Below 0.7 on model, the style is a faint tint. Above 1.2, most LoRAs start to oversaturate or fry details.
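To make the strength number concrete, here is a toy sketch of the standard low-rank update, W' = W + s · (B · A), with the strength s as a scalar multiplier on the LoRA's delta. It uses plain Python lists and tiny matrices for illustration only; real loaders do this per layer on tensors, and (as an assumption about the usual setup) apply the model strength to the diffusion network's weights and the CLIP strength to the text encoder's weights.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_up, lora_down, strength):
    """Return weight + strength * (lora_up @ lora_down).

    lora_up is (out x rank), lora_down is (rank x in); their product has
    the same shape as the base weight.
    """
    delta = matmul(lora_up, lora_down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


base = [[1.0, 0.0], [0.0, 1.0]]   # toy 2x2 base weight
up = [[1.0], [2.0]]               # rank-1 factors
down = [[3.0, 4.0]]

faint = apply_lora(base, up, down, 0.5)   # half of the style delta is added
full = apply_lora(base, up, down, 1.0)    # the whole delta is added
```

At strength 0 the base weight is returned unchanged, which is why low values read as a faint tint: only a fraction of the learned delta reaches the model.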

Our strength sweep: per-LoRA recommendations

These are our own results running each LoRA on a pokemon-style-cat subject, juggernautXL_v9 base, gemma4-enhanced prompt, with the trigger word appended. "Sweet spot" means where the LoRA first produces its full aesthetic without visible burn:

| LoRA | Sweet spot (model/clip) | Notes |
|---|---|---|
| voxel-xl | 1.2 / 0.9 +trigger | Needs max strength + trigger to produce actual blocky 3D art. Perfect for game-ready character assets. |
| pixel-art-xl | 1.0 / 0.7 +trigger | Needs 1.0+ to override base model's photoreal tendency. At 0.8, you get hints of pixel style; at 1.0+, full 8-bit. |
| crayon-style-xl | 1.0 / 0.7 +trigger | Robust: works well at all strengths. At 1.0+ with trigger it looks like elite hand-drawn crayon illustration. Usable at 0.6+ for subtle styling. |
| watercolor-xl | 0.8–1.0 / 0.5–0.7 +trigger | Charming at lower strengths with trigger. Burns to muddy at 1.2. |
| sticker-style-xl | 1.0 / 0.7 +trigger | Needs trigger + standard strength to produce the die-cut vinyl look. |
| anime-detailer-xl | Only useful on anime subjects | A detailer, not a style transformer. On a pokemon cat it's largely a no-op regardless of strength. Use with anime-native prompts. |

A caveat about subject matter. These strength recommendations are from a pokemon-cat subject on juggernautXL base. LoRA behaviour varies by subject (abstract backgrounds don't exercise style LoRAs well) and by base model (photoreal-heavy bases like RealVisXL fight style LoRAs; neutral bases like SDXL 1.0 give them more room). If your results don't match these numbers, sweep the strength yourself (see section 7).

6. Three production-ready presets

Three combinations that came out of our testing as ready-to-use. Each is a specific {base, LoRA, strength_model, strength_clip, trigger} tuple that reliably produces its named aesthetic:

🎮 Voxel Game Character

Base model: juggernautXL_v9.safetensors
LoRA: voxel-xl
Strength: model 1.2 / clip 0.9
Trigger: append ", voxel art" to prompt
Steps / CFG: 30 / 7.0
Output looks like: Pre-rendered Minecraft / voxel-game character art. Scale-down friendly: outputs slot directly into 2D game assets without further processing.

โœ๏ธ Elite Hand-Drawn (Crayon)

Base model
juggernautXL_v9.safetensors
LoRA
crayon-style-xl
Strength
model 1.0 / clip 0.7
Trigger
append , crayon drawing to prompt
Steps / CFG
30 / 7.0
Output looks like
High-quality crayon / wax illustration with visible tooth and deliberate hand-drawn linework. Works across subjects โ€” people, animals, objects.

🎨 Watercolour Illustration

Base model: juggernautXL_v9.safetensors
LoRA: watercolor-xl
Strength: model 0.8–1.0 / clip 0.5–0.7
Trigger: append ", watercolor painting" to prompt
Steps / CFG: 30 / 7.0
Output looks like: Soft watercolour aesthetic with paper-texture charm. Don't push above 1.0, because watercolour LoRAs burn to muddy at high strength.
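If you want to keep these presets in code, one possible shape is a small frozen dataclass per tuple. The class name, field layout, dictionary keys, and the use of (low, high) ranges for strength are assumptions for illustration; only the values come from the presets above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StylePreset:
    name: str
    base: str
    lora: str
    strength_model: tuple  # (low, high); equal values mean a single setting
    strength_clip: tuple   # (low, high)
    trigger: str
    steps: int = 30
    cfg: float = 7.0

    def build_prompt(self, prompt: str) -> str:
        """Append the preset's trigger to a prompt."""
        return f"{prompt}, {self.trigger}"


BASE = "juggernautXL_v9.safetensors"

PRESETS = {
    "voxel": StylePreset("Voxel Game Character", BASE, "voxel-xl",
                         (1.2, 1.2), (0.9, 0.9), "voxel art"),
    "crayon": StylePreset("Elite Hand-Drawn (Crayon)", BASE, "crayon-style-xl",
                          (1.0, 1.0), (0.7, 0.7), "crayon drawing"),
    "watercolour": StylePreset("Watercolour Illustration", BASE, "watercolor-xl",
                               (0.8, 1.0), (0.5, 0.7), "watercolor painting"),
}
```

Storing strength as a range keeps the watercolour preset's 0.8–1.0 guidance intact instead of forcing one number.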

7. Reproduce these findings on your own hardware

Every claim in this guide was produced by Python scripts that ship with Wyltek Studio. Running them on your own setup lets you verify the conclusions against your actual LoRAs, GPU, and Ollama models:

OP-model benchmark

Measures cold-load and warm inference latency for each Ollama model you have installed. Useful for picking a daily-driver OP model.

cd ~/open-palette
python scripts/op_latency_bench.py gemma4:latest qwen3.5:35b-a3b \
    --trials 2 --num-gpu -1
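For intuition about what a cold versus warm measurement involves, here is a minimal timing sketch. It is not the logic of op_latency_bench.py; the function name and result keys are hypothetical, and `run_once` stands in for whatever call sends one prompt to a model.

```python
import time
from statistics import mean


def bench(run_once, trials: int = 2) -> dict:
    """Time one initial call plus `trials` further calls of `run_once`.

    The first call is treated as the cold measurement, which only holds
    if the model was not already loaded when the benchmark started.
    """
    if trials < 1:
        raise ValueError("trials must be at least 1")
    timings = []
    for _ in range(trials + 1):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    return {"cold_s": timings[0], "warm_mean_s": mean(timings[1:])}
```

Separating the first call matters because model loading can dominate it, while the warm figure is what you feel in day-to-day use.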

Full model × LoRA matrix

Enhances the same prompt with every OP model you have, then renders it against each style LoRA. Outputs a gallery you can scroll side by side. ~30 minutes for a full matrix on a consumer GPU.

python scripts/lora_matrix_test.py "a pokemon style cat"
python scripts/matrix_gallery.py    # renders HTML gallery of latest run

Strength + trigger sweep

Holds the prompt and OP model fixed, varies LoRA strength (4 tiers) and trigger presence (A/B). Answers "what strength should I run this LoRA at?" empirically. ~10 minutes for 48 images.

python scripts/lora_strength_sweep.py "a pokemon style cat" --with-trigger-ab
python scripts/matrix_gallery.py
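A sketch of how such a sweep grid could be enumerated. The guide states 4 strength tiers, trigger A/B, and 48 images; that the remaining factor is the six style LoRAs is an inference, and the tier values below (in particular the 0.6 / 0.4 pair) are hypothetical, so this is not a description of what lora_strength_sweep.py actually does.

```python
from itertools import product

# Canonical trigger per LoRA (first trigger listed in section 4).
CANONICAL_TRIGGER = {
    "pixel-art-xl": "pixel art",
    "voxel-xl": "voxel art",
    "crayon-style-xl": "crayon drawing",
    "watercolor-xl": "watercolor painting",
    "sticker-style-xl": "die-cut sticker",
    "anime-detailer-xl": "anime",
}

# Hypothetical (model, clip) tiers; the script's real values may differ.
STRENGTH_TIERS = [(0.6, 0.4), (0.8, 0.5), (1.0, 0.7), (1.2, 0.9)]


def build_sweep(prompt: str) -> list[dict]:
    """Enumerate every LoRA x strength tier x trigger on/off combination."""
    jobs = []
    for lora, (s_model, s_clip), with_trigger in product(
        CANONICAL_TRIGGER, STRENGTH_TIERS, (False, True)
    ):
        text = f"{prompt}, {CANONICAL_TRIGGER[lora]}" if with_trigger else prompt
        jobs.append({
            "lora": lora,
            "strength_model": s_model,
            "strength_clip": s_clip,
            "with_trigger": with_trigger,
            "prompt": text,
        })
    return jobs


jobs = build_sweep("a pokemon style cat")
```

With six LoRAs, four tiers, and two trigger states, the grid has 6 × 4 × 2 = 48 entries, which is consistent with the image count quoted above.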

Each script writes a browseable HTML gallery with hover tooltips showing the exact prompt, LoRA, and strength for every image. Compare rows/columns to find your own sweet spots.
