Stability

The Stability provider includes processors for models from Stability AI hosted on DreamStudio.

Text2Image

The Text2Image processor generates images from text prompts.

engine_id: The Stability AI Text2Image model to use.
height: The height of the generated image in pixels.
width: The width of the generated image in pixels.
cfg_scale: A parameter that controls how closely the engine attempts to match a generation to the provided prompt.
sampler: The sampling engine to use.
steps: The number of diffusion steps performed on the requested generation.
seed: A seed for random latent noise generation.
num_samples: The number of images to generate.
guidance_preset: A guidance preset to use for image generation.
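
The parameters above can be collected into a single configuration object before being handed to the processor. The following is a minimal sketch only: the `Text2ImageConfig` class, its `to_payload` method, the default values, and the `"example-engine"` identifier are all hypothetical and are not part of the Stability provider. Only the field names come from the list above.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class Text2ImageConfig:
    """Hypothetical container mirroring the Text2Image parameters listed above."""

    engine_id: str
    height: int = 512                     # assumed default, in pixels
    width: int = 512                      # assumed default, in pixels
    cfg_scale: float = 7.0                # assumed default
    sampler: Optional[str] = None         # None means "not specified"
    steps: int = 30                       # assumed default
    seed: int = 0
    num_samples: int = 1
    guidance_preset: Optional[str] = None

    def to_payload(self) -> dict:
        """Return a plain dict, omitting parameters that were left unset."""
        return {k: v for k, v in asdict(self).items() if v is not None}


# "example-engine" is a placeholder, not a real engine identifier.
config = Text2ImageConfig(engine_id="example-engine", height=768, width=768, num_samples=2)
payload = config.to_payload()
print(payload)
```

Keeping unset optional parameters out of the payload leaves the choice of defaults to whatever consumes it.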

Image2Image

The Image2Image processor generates images from an initial image and a text prompt.

init_image: The initial image to generate from. This can be any image, such as a photo, drawing, or painting.
prompt: A description of the desired changes to the image. The prompt can be as short as a few words or as long as a paragraph of text.

engine: The Stability AI Image2Image model to use. The available models differ in their strengths and weaknesses.
height: The height of the generated image in pixels.
width: The width of the generated image in pixels.
cfg_scale: How closely the engine attempts to match a generation to the provided prompt. v2-x models respond well to lower values (4-8), whereas v1-x models respond well to a higher range (e.g., 7-14).
sampler: The sampling engine to use. If no sampler is specified, an appropriate default for the selected inference engine is applied automatically.
steps: The number of diffusion steps performed on the requested generation.
seed: The seed for random latent noise generation. Results are deterministic unless the seed is used together with CLIP guidance. If the seed is not specified, or is set to 0, a random value is used.
num_samples: The number of images to generate. Values greater than 1 produce a batch of images.
guidance_preset: The CLIP guidance preset. Use it with an ancestral sampler for best results.
guidance_strength: How strictly the diffusion process adheres to the prompt text. Higher values keep the image closer to the prompt.
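
Two of the rules above, the seed fallback and the suggested `cfg_scale` ranges, can be illustrated in a few lines. This is a sketch under stated assumptions, not how the service implements them: the helper names `resolve_seed` and `cfg_in_suggested_range`, the `"v1"`/`"v2"` keys, and the 32-bit upper bound for random seeds are hypothetical. Only the ranges (4-8 and 7-14) and the "missing or 0 means random" rule come from the descriptions above.

```python
import random
from typing import Optional

# Ranges quoted in the cfg_scale description; the dict and its keys are illustrative.
SUGGESTED_CFG_RANGES = {"v1": (7.0, 14.0), "v2": (4.0, 8.0)}


def resolve_seed(seed: Optional[int] = None) -> int:
    """Return the seed unchanged, or draw a random one if it is missing or 0."""
    if seed is None or seed == 0:
        # Upper bound is an assumption (unsigned 32-bit range).
        return random.randint(1, 2**32 - 1)
    return seed


def cfg_in_suggested_range(model_family: str, cfg_scale: float) -> bool:
    """Check a cfg_scale value against the suggested range for a model family."""
    low, high = SUGGESTED_CFG_RANGES[model_family]
    return low <= cfg_scale <= high


print(resolve_seed(42))                   # an explicit seed is kept as-is
print(cfg_in_suggested_range("v2", 6.0))  # inside the 4-8 range
print(cfg_in_suggested_range("v1", 5.0))  # below the 7-14 range
```

Passing an explicit non-zero seed is what makes a generation repeatable, provided CLIP guidance is not in use.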