Install Stable Diffusion and Generate Breathtaking Art with AI

Cyberpunk Daenerys Targaryen - Ryan Gordon via Stable Diffusion 

Stable Diffusion is the latest in a series of mind-blowing art-generating AI software. Many people are familiar with OpenAI's DALL·E, but it's not the only player in the market.

Unlike DALL·E, which requires payment to generate images, Stable Diffusion is an open-source project available on GitHub, which means it can be run locally on your own computer. That's right, you can generate mind-blowing works of art that could rival greats such as Picasso or da Vinci, all without having to leave the comfort of your chair. What a world to live in, right?

Getting Stable Diffusion running can be a bit daunting if you aren't already technically inclined. Most guides focus on how to get it set up on Linux, and don't fully explain the steps. This guide will be focusing on Windows, but anyone who knows Linux will be able to easily substitute the equivalent Linux commands and follow along without a problem.

For those who are technically inclined, the primary reason I am focusing on Windows here is that a) sharing GPU resources with a VM is hard and often infeasible without full passthrough, and b) there is no real reason we need to run this on Linux in the first place.


First Things First...

The very first thing to do is to figure out which version of Stable Diffusion you want to run. This will depend entirely on how much Video RAM (VRAM) your graphics card has.

To check, press CTRL+ALT+DELETE and select Task Manager. Then, click on the Performance tab, scroll down to GPU, and click on it.

Task Manager GPU View

At the bottom of the GPU pane, there is a field labeled Dedicated GPU Memory. The total amount of memory shown here (the number on the right) is how much VRAM you have.
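If you prefer the command line and have an NVIDIA card, you can also get a rough read on this by running nvidia-smi in a terminal (this assumes your NVIDIA drivers are installed; the Task Manager method works regardless of vendor):

nvidia-smi

Your card's total memory is the right-hand number in the Memory-Usage column, shown as used MiB / total MiB.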

To Optimize or Not to Optimize?

Now that you know how much VRAM you have available to you, it's time to decide which version of Stable Diffusion you should download. There are two primary versions (repositories) to consider: the main one, or the optimized fork that is separately maintained.

Luckily, this is a simple decision. If you have >=10GB VRAM, then you'll use the main repository here:

GitHub - CompVis/stable-diffusion

If you have <10GB VRAM, then you'll need to use the optimized fork of Stable Diffusion here:

GitHub - basujindal/stable-diffusion

There's not much difference between the two: the optimized fork merely contains some scripts that have been reworked to suit GPUs with less than 10GB of VRAM. The tradeoff is that this version takes longer to generate images.

Preparation - Installing Miniconda

Stable Diffusion is written in the programming language Python, and has been prepared for deployment to end users (people like you and me) using a Python platform called Anaconda. In order to get Stable Diffusion set up, we'll install a small, lightweight version of it called Miniconda:

Miniconda — conda documentation

Head over to the page linked above and download the latest Windows installer. Go ahead and click through the installer with its default values. Once it's done, you should be able to find it by searching for "Anaconda" in your search bar.

Downloading Stable Diffusion

Once you've chosen which repository you want to go with, there are two primary ways to actually get Stable Diffusion on to your PC. I'll cover both.

Option 1: Downloading the ZIP

This is the 'simpler' of the two options, and the one less tech-savvy individuals should probably opt for. Head over to the GitHub repository and click the green Code button. Then, click the Download ZIP option.

This will download the entire repository as a ZIP file.

Then, open up the ZIP you just downloaded, and extract it to wherever you want to keep the Stable Diffusion files. This could be your desktop or your documents folder; it doesn't matter.

Option 2: Cloning the GitHub Repository

This is arguably the better option if you already have an idea of what you're doing. Cloning the GitHub repository will let you quickly and easily update your Stable Diffusion files as new changes get pushed to the codebase.

To clone the GitHub repository, simply download Git from the following page:

Git - Downloading Package

Click through the installation prompts until the installer is finished. Then, in your file explorer, navigate to where you want to store Stable Diffusion, SHIFT + RIGHT-CLICK the window, and select either Open in Terminal or Open PowerShell window here to open up a terminal window.

If you stumbled upon this guide looking for a beginner oriented article, and aren't computer savvy, you might not be very familiar with the terminal. Don't worry, you aren't going to accidentally hack into the NSA or download the world's entire malware population.

Go ahead and run this in the terminal window: git clone <link_to_repository> – obviously replacing <link_to_repository> with the URL of the repository you've chosen. If you used the main, CompVis repository, you'd run the following:

git clone https://github.com/CompVis/stable-diffusion

This will create a new folder in the directory you're in called stable-diffusion.
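This is where the update perk mentioned earlier comes in: whenever new changes land in the repository, you can bring your local copy up to date by running git pull from inside that folder:

cd stable-diffusion
git pull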

Downloading the Latest Checkpoint

To use Stable Diffusion, we need to download the latest weights / checkpoint. The details of what this file is and what it does are fairly technical, but essentially this is the file that contains the trained weights for the AI model; it's what configures it.

Generally speaking, the newer the version of the checkpoint, the more data it'll have been trained on, and the better it will perform. You can download v1.4 (the latest weight as of publication of this article) directly from this link:

https://www.googleapis.com/storage/v1/b/aai-blog-files/o/sd-v1-4.ckpt?alt=media

Alternatively, you can download it from CompVis at HuggingFace, though keep in mind you'll need to register an account.

Place this file into the stable-diffusion folder that you created earlier. Then, right-click and copy it. Go into the models/ldm folder, make a new directory called stable-diffusion-v1, and go into it. Finally, paste the copy of the weights file into this folder, and rename it to model.ckpt.
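If you'd rather do the same thing from a terminal, a rough equivalent in PowerShell (assuming you're inside your stable-diffusion folder and the downloaded file is named sd-v1-4.ckpt) would be:

mkdir models\ldm\stable-diffusion-v1
copy sd-v1-4.ckpt models\ldm\stable-diffusion-v1\model.ckpt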

Setting up the Conda Environment

This is the final step before we get to the fun part. Open up the Miniconda Prompt you installed earlier, and you'll be presented with a command-line interface.

Change your directory to the folder you installed Stable Diffusion to using the cd command (cd <path to switch to>). Then, set up the Conda environment by running the following:

conda env create -f environment.yaml

This will create a specialized Python environment that contains all the requirements needed for Stable Diffusion to run properly. All the information telling Conda what it needs to do is stored in that environment.yaml file, which you downloaded earlier along with the rest of the Stable Diffusion repository.

Once it's finished, you need to activate / switch over to the environment (which has been named ldm) by running this command:

conda activate ldm

Note that every time you want to use Stable Diffusion, you'll need to open the Miniconda terminal, cd to your stable-diffusion folder, and run conda activate ldm. If you don't remember to activate the ldm environment, you'll run into errors.
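In other words, a typical session starts something like this (the path here is just a placeholder for wherever you extracted or cloned the repository):

cd C:\path\to\stable-diffusion
conda activate ldm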

Using Stable Diffusion to Generate Images

Finally! The fun part. We can actually start generating some images now! There are a number of tools included with Stable Diffusion by default, but the biggest ones are txt2img and img2img.

The first lets you enter a prompt and generates images based on it. The second lets you input an existing image alongside a prompt, and it will alter that image based on your prompt.

Text-to-Image

The command to run txt2img (again, via Miniconda, with the ldm environment activated, inside the Stable Diffusion folder) is the following:

python scripts/txt2img.py --prompt "Cyberpunk Daenerys Targaryen"

There are a number of options you can specify, as well:

--outdir	Directory to write results to (defaults to output folder)
--n_iter	How many iterations of the script should be run
--n_samples	How many samples to generate per iteration
--H		Height of the image (in pixels)
--W		Width of the image (in pixels)
--ddim_steps	Number of sampling steps (50 is ideal, more = better + slower)

It should go without saying that the time it takes to run increases with the number of samples you generate at once, the image dimensions, and the number of ddim_steps taken per image.
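To make that concrete, here's roughly what an invocation combining several of these flags could look like; the prompt, sizes, and output folder are just placeholders to adjust to your liking:

python scripts/txt2img.py --prompt "Cyberpunk Daenerys Targaryen" --n_samples 2 --n_iter 2 --W 512 --H 512 --ddim_steps 50 --outdir outputs/my-samples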

By default, files are stored in stable-diffusion/outputs/txt2img-samples/<your_prompt>/

Note: if you chose the optimized version of Stable Diffusion, your command will be python optimizedSD/optimized_txt2img.py --prompt "Cyberpunk Daenerys Targaryen"

Here are some images I've created through this process:

Cyberpunk Daenerys 
Cyberpunk Circuitscape

Image-to-Image

This is one of the coolest functions of Stable Diffusion. You can provide a base image, and then a prompt, and Stable Diffusion will change the image based on the prompt. For example, you can turn old sprites into photorealistic portraits, or silly drawings into something you would actually want to hang on your wall.

The command is similar to before:

python .\scripts\img2img.py --init-img "<PATH TO YOUR BASE IMAGE>" --prompt "Your Prompt" --strength 0.7 --n_samples 2 --n_iter 20 --W 800 --H 512

There are two new options we passed the script: --init-img and --strength. The first is rather self-explanatory, but the second determines how much Stable Diffusion is allowed to deviate from the original image. A strength of 1 allows it to roam free with no regard for preserving the original, whereas a strength of 0.3 would only allow modest alterations.
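If you want a feel for the difference, try running the same base image through twice with only the strength changed (the file name and prompt below are placeholders):

python scripts/img2img.py --init-img "my_sketch.png" --prompt "an astronaut floating in space" --strength 0.3
python scripts/img2img.py --init-img "my_sketch.png" --prompt "an astronaut floating in space" --strength 0.9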

Here's a drawing I did in Microsoft Paint, which took about 2 minutes:

Ned Stark as an Astronaut in Space, Holding a Sword

Here's what I was able to get out of img2img with just one pass:

The difference is astounding, but these aren't even particularly impressive. By playing with the prompt, or running the output back into img2img, you can adjust the result, add details, change the style; the sky is the limit.

So, dear reader, now that you know how, what will you create?