
Image generation

Text-to-image generation using Stable Diffusion.

Overview

Image generation uses qvac-ext-stable-diffusion.cpp as the inference engine. Load a supported model with modelType: "diffusion", then provide a text prompt describing the image to generate.

diffusion() returns one or more PNG images as Uint8Array buffers. Use progressStream to track generation progress step-by-step.
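Because the returned buffers are raw PNG bytes, a cheap signature check before writing to disk can catch a failed or truncated generation early. This is a minimal, SDK-independent sketch — the isPng helper is hypothetical and not part of @qvac/sdk:

```javascript
// Every valid PNG file starts with the same fixed 8-byte signature.
// Checking it is a cheap sanity test on a generated Uint8Array
// before saving it to disk.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

function isPng(buffer) {
    return buffer.length >= 8 &&
        PNG_SIGNATURE.every((byte, i) => buffer[i] === byte);
}
```

For example, `isPng(buffers[0])` should hold for any buffer awaited from `outputs` before it is written to a `.png` file.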

Functions

Use the following sequence of function calls:

  1. loadModel()
  2. diffusion()
  3. unloadModel()

For how to use each function, see SDK — API reference.

Models

Supported model families and their file layouts:

  • SD1.x, SD2.x: single all-in-one *.gguf file. No companion files needed.
  • SDXL, SD3/3.5: may require separate CLIP/T5 text encoder files (clipLModelSrc, clipGModelSrc, t5XxlModelSrc) in modelConfig depending on the model variant.
  • FLUX.2-klein: split layout — diffusion model *.gguf + LLM text encoder *.gguf (via llmModelSrc) + VAE *.safetensors (via vaeModelSrc).

For models available as constants, see SDK — Models.
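For SDXL and SD3/3.5 variants that ship their text encoders separately, the companion files go into modelConfig. The following is a hedged sketch of the load options: the file paths are placeholders, not real model constants; only the key names (clipLModelSrc, clipGModelSrc, t5XxlModelSrc) come from the layout described above:

```javascript
// Hypothetical load options for an SD3.5 variant with external
// text encoders. The file paths are placeholders; the modelConfig
// keys clipLModelSrc, clipGModelSrc, and t5XxlModelSrc carry the
// companion encoder files.
const sd35LoadOptions = {
    modelSrc: "./sd3.5_medium-Q8_0.gguf",
    modelType: "diffusion",
    modelConfig: {
        clipLModelSrc: "./clip_l.safetensors",
        clipGModelSrc: "./clip_g.safetensors",
        t5XxlModelSrc: "./t5xxl_fp16.safetensors",
    },
};
```

These options would be passed to loadModel() exactly as in the single-file case; the rest of the load/generate/unload sequence is unchanged.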

Examples

Stable Diffusion

The following script shows a minimal text-to-image generation example using a single all-in-one SD 2.1 model:

diffusion-simple.js
import { loadModel, unloadModel, diffusion, SD_V2_1_1B_Q8_0 } from "@qvac/sdk";
import fs from "fs";
// Minimal diffusion example — single GGUF model, no companion files needed.
// Works with SD 1.x / 2.x all-in-one models.
const modelSrc = process.argv[2] || SD_V2_1_1B_Q8_0;
const prompt = process.argv[3] || "a photo of a cat sitting on a windowsill";
const modelId = await loadModel({
    modelSrc,
    modelType: "diffusion",
    modelConfig: { prediction: "v" }, // this SD 2.1 checkpoint uses v-prediction
});
const { outputs } = diffusion({ modelId, prompt });
const buffers = await outputs;
fs.writeFileSync("output.png", buffers[0]);
console.log("Saved: output.png");
await unloadModel({ modelId, clearStorage: false });
process.exit(0);

FLUX.2-klein

The following script shows text-to-image generation using FLUX.2-klein with its split-layout model (separate diffusion model, LLM text encoder, and VAE):

diffusion-flux2-klein.js
import { loadModel, unloadModel, diffusion, FLUX_2_KLEIN_4B_Q4_0, FLUX_2_KLEIN_4B_VAE, QWEN3_4B_Q4_K_M } from "@qvac/sdk";
import fs from "fs";
import path from "path";
// FLUX.2 [klein] uses a split layout: separate diffusion model + LLM text encoder + VAE
const diffusionModelSrc = process.argv[2] || FLUX_2_KLEIN_4B_Q4_0;
const llmModelSrc = process.argv[3] || QWEN3_4B_Q4_K_M;
const vaeModelSrc = process.argv[4] || FLUX_2_KLEIN_4B_VAE;
const prompt = process.argv[5] || "a futuristic city at sunset, photorealistic";
const outputDir = process.argv[6] || ".";
console.log("Loading FLUX.2 [klein] split-layout model...");
const modelId = await loadModel({
    modelSrc: diffusionModelSrc,
    modelType: "diffusion",
    modelConfig: {
        device: "gpu",
        threads: 4,
        llmModelSrc,
        vaeModelSrc,
    },
    onProgress: (p) => console.log(`Loading: ${p.percentage.toFixed(1)}%`),
});
console.log(`Model loaded: ${modelId}`);
console.log(`\nGenerating: "${prompt}"`);
const { progressStream, outputs, stats } = diffusion({
    modelId,
    prompt,
    width: 512,
    height: 512,
    steps: 20,
    guidance: 3.5,
    seed: -1,
});
for await (const { step, totalSteps } of progressStream) {
    process.stdout.write(`\rStep ${step}/${totalSteps}`);
}
console.log();
const buffers = await outputs;
for (let i = 0; i < buffers.length; i++) {
    const outputPath = path.join(outputDir, `flux2_${i}.png`);
    fs.writeFileSync(outputPath, buffers[i]);
    console.log(`Saved: ${outputPath}`);
}
console.log("\nStats:", await stats);
await unloadModel({ modelId, clearStorage: false });
console.log("Done.");
process.exit(0);

Tip: all examples throughout this documentation are self-contained and runnable. For instructions on how to run them, see SDK quickstart.
