NVIDIA RTX and TensorRT speed up Black Forest Labs’ latest image generation and editing model; plus, Gemma 3n is now accelerated by RTX and NVIDIA Jetson, and the G-Assist Plug-In Hackathon continues.
Black Forest Labs, one of the world’s leading AI research labs, just changed the game for image generation.
The lab’s FLUX.1 image models have earned global attention for delivering high-quality visuals with exceptional prompt adherence. Now, with its new FLUX.1 Kontext model, the lab is fundamentally changing how users can guide and refine the image generation process.

To get their desired results, AI artists today often use a combination of models and ControlNets — AI models that help guide the outputs of an image generator. This commonly involves combining multiple ControlNets or using advanced techniques like the one used in the NVIDIA AI Blueprint for 3D-guided image generation, where a draft 3D scene is used to determine the composition of an image.
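For illustration, here is a rough sketch of that kind of multi-ControlNet workflow using the Hugging Face diffusers library. The model IDs, conditioning images and prompt are placeholders, and the exact models and guidance strengths an artist chooses will vary.

```python
# Illustrative multi-ControlNet sketch with Hugging Face diffusers.
# Model IDs, conditioning images and prompt are placeholders, not a prescribed workflow.
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

# Two ControlNets: one guided by edge maps (canny), one by depth maps.
controlnets = [
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16),
    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16),
]

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",  # base model ID is illustrative
    controlnet=controlnets,
    torch_dtype=torch.float16,
).to("cuda")

canny_map = load_image("canny_edges.png")  # placeholder conditioning image
depth_map = load_image("depth_map.png")    # placeholder conditioning image

result = pipe(
    prompt="a futuristic living room, warm lighting",
    image=[canny_map, depth_map],                 # one conditioning image per ControlNet
    controlnet_conditioning_scale=[0.8, 0.6],     # per-ControlNet guidance strength
).images[0]
result.save("guided_output.png")
```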
The new FLUX.1 Kontext model simplifies this by providing a single model that can perform both image generation and editing, using natural language.

NVIDIA has collaborated with Black Forest Labs to optimize FLUX.1 Kontext [dev] for NVIDIA RTX GPUs using the NVIDIA TensorRT software development kit and quantization to deliver faster inference with lower VRAM requirements.
For creators and developers alike, TensorRT optimizations mean faster edits, smoother iteration and more control — right from their RTX-powered machines.
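The optimized path described above uses TensorRT engines with quantized weights. As a loose approximation of the VRAM-savings idea that can be tried from Python today, the sketch below loads the FLUX.1 Kontext [dev] transformer with 4-bit quantization through diffusers and bitsandbytes. This is not the TensorRT pipeline itself, and the FluxKontextPipeline class assumes a recent diffusers release.

```python
# Sketch only: approximates the "quantize to cut VRAM" idea with diffusers + bitsandbytes.
# It does not build TensorRT engines; requires recent diffusers, bitsandbytes and accelerate.
import torch
from diffusers import BitsAndBytesConfig, FluxKontextPipeline, FluxTransformer2DModel

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load only the large transformer in 4-bit to reduce peak VRAM.
transformer = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # offloads idle submodules to further reduce VRAM
```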
The FLUX.1 Kontext [dev] Flex: In-Context Image Generation
Black Forest Labs in May introduced the FLUX.1 Kontext family of image models, which accept both text and image prompts.

These models allow users to start from a reference image and guide edits with simple language, without the need for fine-tuning or complex workflows with multiple ControlNets.
FLUX.1 Kontext is an open-weight generative model built for image editing using a guided, step-by-step generation process that makes it easier to control how an image evolves, whether refining small details or transforming an entire scene. Because the model accepts both text and image inputs, users can easily reference a visual concept and guide how it evolves in a natural and intuitive way. This enables coherent, high-quality image edits that stay true to the original concept.
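A minimal editing sketch, assuming the FluxKontextPipeline interface available in recent Hugging Face diffusers releases; the reference image, prompt and output path are placeholders.

```python
# Minimal FLUX.1 Kontext [dev] editing sketch with diffusers (interface assumed from
# recent releases). Reference image, prompt and output path are placeholders.
import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev",
    torch_dtype=torch.bfloat16,
).to("cuda")

reference = load_image("my_scene.png")  # placeholder reference image

# Guide the edit with plain language instead of ControlNets or fine-tuning.
edited = pipe(
    image=reference,
    prompt="keep the same room and lighting, but make the sofa emerald green",
    guidance_scale=2.5,
).images[0]
edited.save("edited_scene.png")
```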