An implementation of: Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
You need GPT4-Azure or Gemini Pro to use it. Local LLM support is still being worked on.
Neat! I’ve known that Regional Prompter is powerful, but it’s too much of a pain for me to bother using. Hopefully this makes it easier.