Stability
The Stability provider includes processors for models from Stability AI hosted on DreamStudio.
Text2Image
The Text2Image processor generates images from text prompts.
Input
prompt: A list of prompts to describe the image to generate.
Configuration
engine_id: The Stability AI Text2Image model to use.
height: The height of the generated image in pixels.
width: The width of the generated image in pixels.
cfg_scale: A parameter that controls how closely the engine attempts to match a generation to the provided prompt.
sampler: The sampling engine to use.
steps: The number of diffusion steps performed on the requested generation.
seed: A seed for random latent noise generation.
num_samples: The number of images to generate.
guidance_preset: A guidance preset to use for image generation.
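As a rough illustration of how these options fit together, the sketch below assembles a Text2Image configuration as a plain Python dictionary. The helper name `build_text2image_config` is hypothetical and not part of any library; only the option names come from the list above. Options left as `None` are dropped, on the assumption that the processor then applies its own defaults.

```python
# Option names taken from the Configuration list above.
TEXT2IMAGE_OPTIONS = {
    "engine_id", "height", "width", "cfg_scale", "sampler",
    "steps", "seed", "num_samples", "guidance_preset",
}


def build_text2image_config(**options):
    """Hypothetical helper: validate option names and drop unset values."""
    unknown = set(options) - TEXT2IMAGE_OPTIONS
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")
    # Assumption: omitting a key lets the processor fall back to its default.
    return {key: value for key, value in options.items() if value is not None}


# Illustrative values only; check the provider for supported sizes and ranges.
config = build_text2image_config(height=512, width=512, steps=30, seed=None)
```

Rejecting unknown names early catches typos such as `cfgscale` before a request is sent.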
Output
answer: A list of generated images as base64 encoded strings.
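Because the output is a list of base64 strings rather than image files, a caller usually has to decode it before use. The following is a minimal sketch using only the Python standard library. The function names, the `.png` extension, and the handling of a `data:` URI prefix are assumptions made for illustration; the source does not state the image format or whether such a prefix is present.

```python
import base64
from pathlib import Path


def decode_images(answer):
    """Decode a list of base64 strings into raw image bytes."""
    decoded = []
    for item in answer:
        # Assumption: a string may carry a data URI header such as
        # "data:image/png;base64,". Strip it if present.
        if item.startswith("data:") and "," in item:
            item = item.split(",", 1)[1]
        decoded.append(base64.b64decode(item))
    return decoded


def save_images(answer, out_dir, stem="image", ext="png"):
    """Write each decoded image to out_dir and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, data in enumerate(decode_images(answer)):
        path = out / f"{stem}_{index}.{ext}"
        path.write_bytes(data)
        paths.append(path)
    return paths
```

With `num_samples` greater than one, `save_images(answer, "out")` would write `out/image_0.png`, `out/image_1.png`, and so on, one file per generated image.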
Image2Image
Input
init_image: The initial image to generate from. This can be any image, such as a photo, drawing, or painting.
prompt: The prompt to describe the desired changes to the image. The prompt can be as simple as a few words or as complex as a paragraph of text.
Configuration
engine: The Stability AI Image2Image model to use. Several models are available, each with its own strengths and weaknesses.
height: The height of the generated image in pixels.
width: The width of the generated image in pixels.
cfg_scale: Dictates how closely the engine attempts to match a generation to the provided prompt. v2-x models respond well to a lower CFG range (4-8), whereas v1-x models respond well to a higher range (7-14).
sampler: The sampling engine to use. If no sampler is declared, an appropriate default sampler for the declared inference engine is applied automatically.
steps: The number of diffusion steps performed on the requested generation.
seed: The seed for random latent noise generation. Results are deterministic unless the seed is used together with CLIP Guidance. If the seed is not specified, or is set to 0, a random value is used.
num_samples: The number of images to generate. Allows for batch image generation.
guidance_preset: The CLIP guidance preset. Use with an ancestral sampler for best results.
guidance_strength: How strictly the diffusion process adheres to the prompt text. Higher values keep the image closer to the prompt.
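The `cfg_scale` ranges and the `seed` rule described above can be expressed as two small helpers. This is only a sketch: the family labels `"v1"` and `"v2"`, the function names, and the upper bound used when drawing a random seed are assumptions made here, and the actual random seed is chosen by the service, not by client code.

```python
import random

# Ranges quoted in the cfg_scale description; the keys are hypothetical labels.
CFG_RANGES = {"v1": (7, 14), "v2": (4, 8)}


def clamp_cfg_scale(cfg_scale, family):
    """Pull cfg_scale into the range suggested for the model family."""
    low, high = CFG_RANGES[family]
    return max(low, min(high, cfg_scale))


def resolve_seed(seed=None, rng=None):
    """Mirror the documented rule: a missing or zero seed means 'random'."""
    if seed:
        return seed
    rng = rng or random.Random()
    # Assumption: any non-zero 32-bit value is an acceptable seed.
    return rng.randint(1, 2**32 - 1)
```

For example, `clamp_cfg_scale(12, "v2")` yields `8`, the top of the range suggested for v2-x models, while `resolve_seed(42)` returns `42` unchanged so that a run can be repeated.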
Output
images: An array of generated images as base64 encoded strings.