Introduction
Generative AI (GenAI) lets anyone create new images from a text prompt. A machine learning model generates a unique image that matches the prompt.
Entering a prompt is all it takes, so no design skills are required. GenAI image generation models are trained on large datasets and draw on what they have learned to produce new visuals.
Depending on the model, the underlying technique may be a generative adversarial network (GAN) or, as in the model used below, a diffusion model. This capability has applications across many fields, including website graphics, marketing materials, and advertising campaigns.
This approach is a fast way to generate images, and the results are often impressive in both quality and relevance to the prompt.
Implementation
We will use ByteDance/SDXL-Lightning, which applies a diffusion distillation method that achieves a new state of the art in one-step/few-step 1024px text-to-image generation based on Stable Diffusion XL (SDXL). It combines progressive and adversarial distillation to balance quality and mode coverage.
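Because the model is distilled for a fixed number of steps, the checkpoint that is loaded and the number of inference steps passed to the pipeline have to agree. The sketch below is one way to keep the two in sync. Only the 4-step file name, sdxl_lightning_4step_unet.safetensors, appears in this article's script; the 2- and 8-step names are assumed to follow the same pattern, and `checkpoint_for_steps` is a hypothetical helper, not part of any library.

```python
# Hypothetical helper: map a step count to an SDXL-Lightning UNet checkpoint name.
# Only the 4-step name is taken from the script in this article; the other
# names are assumed to follow the same naming pattern.
SUPPORTED_STEPS = (2, 4, 8)  # assumption: step counts with a "<n>step_unet" file


def checkpoint_for_steps(steps: int) -> str:
    """Return the assumed checkpoint file name for the given step count."""
    if steps not in SUPPORTED_STEPS:
        raise ValueError(f"no checkpoint assumed for {steps} steps")
    return f"sdxl_lightning_{steps}step_unet.safetensors"


if __name__ == "__main__":
    steps = 4
    # Use the same value for ckpt and for num_inference_steps.
    print(steps, checkpoint_for_steps(steps))
```

Deriving both values from a single `steps` variable avoids loading one checkpoint and sampling with a different step count.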
Getting Started
First, set up a new Python environment in VSCode. Open a PowerShell terminal within VSCode and use the command ->
python -m venv .venv
to create a virtual environment. Activate it from the same terminal using ->
.venv\Scripts\Activate.ps1
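To confirm that the activation worked, a quick check from Python can help. This is a minimal sketch that relies on the standard `sys.prefix` and `sys.base_prefix` attributes; the function name `in_virtualenv` is only illustrative.

```python
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base interpreter.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter:", sys.executable)
```

If the first line prints False, the packages installed in the next step would go into the global interpreter instead of the project environment.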
Now, install the required libraries using pip->
pip install accelerate==0.28.0
pip install certifi==2024.2.2
pip install charset-normalizer==3.3.2
pip install colorama==0.4.6
pip install diffusers==0.26.3
pip install filelock==3.13.1
pip install fsspec==2024.2.0
pip install huggingface-hub==0.21.4
pip install idna==3.6
pip install importlib_metadata==7.0.2
pip install Jinja2==3.1.3
pip install MarkupSafe==2.1.5
pip install mpmath==1.3.0
pip install networkx==3.2.1
pip install numpy==1.26.4
pip install packaging==24.0
pip install pillow==10.2.0
pip install pip==24.0
pip install psutil==5.9.8
pip install PyYAML==6.0.1
pip install regex==2023.12.25
pip install requests==2.31.0
pip install safetensors==0.4.2
pip install setuptools==58.1.0
pip install sympy==1.12
pip install tokenizers==0.15.2
pip install tqdm==4.66.2
pip install transformers==4.38.2
pip install typing_extensions==4.10.0
pip install urllib3==2.2.1
pip install zipp==3.17.0
pip install torch torchvision torchaudio -f https://download.pytorch.org/whl/cu121/torch_stable.html
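After installation, it can be useful to verify that the versions in the environment match the pins. The sketch below checks a handful of the packages from the list above using the standard library's `importlib.metadata`; the `PINS` subset and the `check_pins` helper are illustrative choices, not part of the original project.

```python
from importlib.metadata import PackageNotFoundError, version

# A few of the pins from the install list; extend as needed.
PINS = {
    "diffusers": "0.26.3",
    "transformers": "4.38.2",
    "accelerate": "0.28.0",
    "safetensors": "0.4.2",
    "huggingface-hub": "0.21.4",
}


def check_pins(pins: dict[str, str]) -> dict[str, str]:
    """Return a status per package: 'ok', 'missing', or a mismatch message."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if installed == wanted:
            report[name] = "ok"
        else:
            report[name] = f"installed {installed}, expected {wanted}"
    return report


if __name__ == "__main__":
    for name, status in check_pins(PINS).items():
        print(f"{name}: {status}")
```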
Install the relevant CUDA drivers from NVIDIA.
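Before running the script, it is worth checking that PyTorch can actually see the GPU, since the script moves the model to "cuda". A minimal sketch, assuming only the public `torch.cuda.is_available()` and `torch.cuda.get_device_name()` calls; `cuda_status` is a hypothetical helper name.

```python
def cuda_status() -> str:
    """Describe whether PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # Device 0 is the default GPU used by .to("cuda").
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "torch is installed, but no CUDA device is visible"


if __name__ == "__main__":
    print(cuda_status())
```

If no CUDA device is reported, the likely causes are a CPU-only torch build or missing drivers.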
Code
Create a new Python file named ImageGeneration.py ->
import torch
from diffusers import (
    StableDiffusionXLPipeline,
    UNet2DConditionModel,
    EulerDiscreteScheduler,
)
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file
import datetime
import os
base = "stabilityai/stable-diffusion-xl-base-1.0"
repo = "ByteDance/SDXL-Lightning"
ckpt = "sdxl_lightning_4step_unet.safetensors" # Use the correct ckpt for your step setting!
# Load model.
unet = UNet2DConditionModel.from_config(base, subfolder="unet").to(
    "cuda", torch.float16
)
unet.load_state_dict(load_file(hf_hub_download(repo, ckpt), device="cuda"))
pipe = StableDiffusionXLPipeline.from_pretrained(
    base, unet=unet, torch_dtype=torch.float16, variant="fp16"
).to("cuda")
# Ensure sampler uses "trailing" timesteps.
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)
# Create the output folder up front; saving fails if it does not exist.
os.makedirs("./output", exist_ok=True)
while True:
    prompt = input("Enter your description for generating the image:\n")
    # Build the file name from the prompt (spaces removed) plus a timestamp.
    file_name = (
        "./output/"
        + prompt.replace(" ", "").strip()
        + datetime.datetime.now().strftime("%d_%m_%y_%H_%M_%S")
        + ".png"
    )
    # Ensure using the same inference steps as the loaded model and CFG set to 0.
    pipe(prompt, num_inference_steps=4, guidance_scale=0).images[0].save(file_name)
    print("Generated Image saved at : ", os.path.abspath(file_name), "\n")
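The script builds the file name directly from the prompt, removing only spaces. Prompts that contain characters such as "?" or ":" would then produce names that Windows rejects. A possible alternative is sketched below; `output_file_name` is a hypothetical helper, and the 80-character cap and the underscore before the timestamp are arbitrary choices, so the resulting names differ slightly from those the script produces.

```python
import datetime
import re


def output_file_name(
    prompt: str, now: datetime.datetime, out_dir: str = "./output"
) -> str:
    # Keep ASCII letters, digits, "-" and "_"; drop everything else
    # (spaces, ".", "?", ":" and similar).
    slug = re.sub(r"[^A-Za-z0-9_-]", "", prompt)
    # Cap the length so long prompts stay within common path limits,
    # and fall back to a fixed name if nothing is left.
    slug = slug[:80] or "image"
    stamp = now.strftime("%d_%m_%y_%H_%M_%S")
    return f"{out_dir}/{slug}_{stamp}.png"


if __name__ == "__main__":
    print(output_file_name("A cat: looking?", datetime.datetime.now()))
```

Passing the timestamp in as an argument keeps the function deterministic, which makes it easy to test.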
Sample dataset could be downloaded from here.
If everything is configured correctly, the output should look something like ->
(.venv) PS C:\Code\Python\Environment\ImageGeneration> python .\ImageGeneration.py
Loading pipeline components...: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:02<00:00, 2.81it/s]
Enter your description for generating the image:
A cat looking into the mirror imagining itself as a tiger.
0%| | 0/4 [00:00<?, ?it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [04:10<00:00, 62.69s/it]
Generated Image saved at : C:\Code\Python\Environment\ImageGeneration\output\Acatlookingintothemirrorimaginingitselfasatiger.15_03_24_03_05_09.png
Enter your description for generating the image:
Sir Isaac Newton sitting under the Apple Tree.
0%| | 0/4 [00:00<?, ?it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [04:30<00:00, 67.52s/it]
Generated Image saved at : C:\Code\Python\Environment\ImageGeneration\output\SirIsaacNewtonsittingundertheAppleTree.15_03_24_03_30_07.png
Enter your description for generating the image:
Man on walking on the Moon watches the Earth.
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [04:08<00:00, 62.14s/it]
Generated Image saved at : C:\Code\Python\Environment\ImageGeneration\output\ManonwalkingontheMoonwatchestheEarth.15_03_24_04_00_03.png
Generated Images Sample ->
GitHub: https://github.com/threadwaiting/ImageGenerationUsingGenAI