Stable Video Diffusion : Install
2024/02/27


Install [Stable Video Diffusion], a deep learning model that generates video from an image.


[1] Install Python first; refer to the separate Python installation guide.


[2] Install CUDA; refer to the separate CUDA installation guide.

[3] Run PowerShell with administrator privileges for all of the following steps.
Install [FFmpeg] first.
Windows PowerShell
PS C:\Users\Administrator> Invoke-WebRequest -Uri "" -OutFile "" 

PS C:\Users\Administrator> Expand-Archive -Path ./ 

PS C:\Users\Administrator> Move-Item ./ffmpeg-master-latest-win64-gpl/ffmpeg-master-latest-win64-gpl "C:/Program Files/FFmpeg" 

# set Path
PS C:\Users\Administrator> $currentPath = [Environment]::GetEnvironmentVariable("Path", "Machine") 
PS C:\Users\Administrator> $currentPath += ";C:\Program Files\FFmpeg\bin" 
PS C:\Users\Administrator> [Environment]::SetEnvironmentVariable("Path", $currentPath, "Machine") 
PS C:\Users\Administrator> $env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [System.Environment]::GetEnvironmentVariable("Path","User") 

PS C:\Users\Administrator> ffmpeg -version 
ffmpeg version N-113824-ga3ca4beeaa-20240226 Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 13.2.0 (crosstool-NG
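
The Path update above appends the FFmpeg directory unconditionally, so running it a second time would add a duplicate entry. As a minimal sketch of an idempotent alternative, the Python helper below works on the Path value as a plain string; the function name `append_to_path` is hypothetical and not part of any tool used in this guide.

```python
def append_to_path(path_value: str, new_dir: str, sep: str = ";") -> str:
    """Return path_value with new_dir appended, unless it is already listed."""

    def normalize(entry: str) -> str:
        # Windows paths are case-insensitive and may carry a trailing slash.
        return entry.strip().rstrip("\\/").lower()

    # Drop empty entries left behind by doubled or trailing separators.
    entries = [e for e in path_value.split(sep) if e.strip()]
    if normalize(new_dir) in {normalize(e) for e in entries}:
        return sep.join(entries)
    return sep.join(entries + [new_dir])


current = r"C:\Windows\system32;C:\Windows"
updated = append_to_path(current, r"C:\Program Files\FFmpeg\bin")
print(updated)
# A second call with the same directory leaves the value unchanged.
print(append_to_path(updated, r"c:\program files\ffmpeg\bin") == updated)
```

The helper only computes the new string; writing it back to the machine-level variable still has to be done with the PowerShell commands shown above.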
[4] Install [Git].
PS C:\Users\Administrator> Invoke-WebRequest -Uri "" -OutFile "Git-2.43.0-64-bit.exe" 

# install Git in silent mode
PS C:\Users\Administrator> ./Git-2.43.0-64-bit.exe /silent 

# while the installer is running, its processes are listed
PS C:\Users\Administrator> Get-Process -Name "Git*", "setup*" 

Handles  NPM(K)    PM(K)      WS(K)     CPU(s)     Id  SI ProcessName
-------  ------    -----      -----     ------     --  -- -----------
    134      11     4820       7880       0.16   2284   0 Git-2.43.0-64-bit
    243      19    74912      89144      12.83   4256   0 Git-2.43.0-64-bit.tmp

# once the installation finishes, the processes above are gone
PS C:\Users\Administrator> Get-Process -Name "Git*", "setup*" 

# reload environment variables
PS C:\Users\Administrator> $env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [System.Environment]::GetEnvironmentVariable("Path","User") 

PS C:\Users\Administrator> git -v 
git version
[5] Install [Stable Video Diffusion].
# add firewall rule for Stable Video Diffusion
PS C:\Users\Administrator> New-NetFirewallRule `
-Name "Stable Video Diffusion Server Port" `
-DisplayName "Stable Video Diffusion Server Port" `
-Description 'Allow Stable Video Diffusion Server Port' `
-Profile Any `
-Direction Inbound `
-Action Allow `
-Protocol TCP `
-Program Any `
-LocalAddress Any `
-LocalPort 8501 

PS C:\Users\Administrator> git clone 
PS C:\Users\Administrator> cd generative-models 
PS C:\Users\Administrator\generative-models> pip3 install 

Successfully installed cmake-3.28.3 triton-2.0.0

PS C:\Users\Administrator\generative-models> pip3 install -r ./requirements/pt2.txt 

Successfully installed appdirs-1.4.4 black-23.7.0 blinker-1.7.0 braceexpand-0.1.7 cachetools-5.3.3 chardet-5.1.0 docker-pycreds-0.4.0 einops-0.7.0 fairscale-0.4.13 fire-0.5.0 invisible-watermark-0.2.0 jedi-0.19.1 kornia-0.6.9 markdown-it-py-3.0.0 mdurl-0.1.2 mypy-extensions-1.0.0 natsort-8.4.0 ninja- numpy-1.26.4 omegaconf-2.3.0 opencv-python- parso-0.8.3 pathspec-0.12.1 pudb-2024.1 pyarrow-15.0.0 pydeck-0.8.1b0 pygments-2.17.2 pyre-extensions-0.0.29 pytorch-lightning-2.0.1 rich-13.7.0 sentry-sdk-1.40.5 setproctitle-1.3.3 streamlit-1.31.1 streamlit-keyup-0.2.0 tenacity-8.2.3 tensorboardx-2.6 termcolor-2.4.0 tokenizers-0.12.1 toml-0.10.2 torchaudio-2.0.2 torchdata-0.6.1 tornado-6.4 transformers-4.19.1 typing-inspect-0.9.0 tzlocal-5.2 urllib3-1.26.18 urwid-2.6.5 urwid-readline-0.14 validators-0.22.0 wandb-0.16.3 watchdog-4.0.0 webdataset-0.2.86 wheel-0.42.0

PS C:\Users\Administrator\generative-models> pip3 install ./ 

Successfully built sgm
Installing collected packages: sgm
Successfully installed sgm-0.1.0

# lowvram_mode ⇒ change the value to True if your graphics card has little memory
# * the default value [False] did not work on an RTX 3060 with 12 GB of VRAM
PS C:\Users\Administrator\generative-models> Get-Content ./scripts/demo/ | Select-String "^lowvram_mode" 
lowvram_mode = False
PS C:\Users\Administrator\generative-models> (Get-Content ./scripts/demo/) | foreach { $_ -replace "lowvram_mode = False","lowvram_mode = True" } | Set-Content ./scripts/demo/ 
PS C:\Users\Administrator\generative-models> Get-Content ./scripts/demo/ | Select-String "^lowvram_mode" 
lowvram_mode = True
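
The replacement above rewrites the string wherever it appears in the file. As a rough sketch of a stricter variant, the Python function below only touches an assignment that starts at the beginning of a line, and raises an error if none is found. The name `set_lowvram_mode` is hypothetical, and the sample text stands in for the real script, whose file name is not reproduced here.

```python
import re


def set_lowvram_mode(source: str, enabled: bool) -> str:
    """Rewrite a top-level `lowvram_mode = True/False` line in Python source text."""
    # [ \t]* (not \s*) so that the trailing newline is never consumed.
    pattern = re.compile(r"^lowvram_mode[ \t]*=[ \t]*(True|False)[ \t]*$", re.MULTILINE)
    new_source, count = pattern.subn(f"lowvram_mode = {enabled}", source)
    if count == 0:
        raise ValueError("no top-level lowvram_mode assignment found")
    return new_source


# Stand-in for the contents of the demo script.
sample = "import torch\nlowvram_mode = False\n\ndef load():\n    pass\n"
print(set_lowvram_mode(sample, True))
```

To apply it to a real file, read the file into a string, pass it through the function, and write the result back.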

# download a model, models are here
PS C:\Users\Administrator\generative-models> mkdir ./checkpoints 
PS C:\Users\Administrator\generative-models> Invoke-WebRequest -Uri "" -OutFile "./checkpoints/svd.safetensors" 

# run as a server
PS C:\Users\Administrator\generative-models> Copy-Item ./scripts/demo/ ./ 
PS C:\Users\Administrator\generative-models> streamlit run --server.address= 

      Welcome to Streamlit!

      If you'd like to receive helpful onboarding emails, news, offers, promotions,
      and the occasional swag, please enter your email address below. Otherwise,
      leave this field blank.


  You can find our privacy policy at

  - This open source library collects usage statistics.
  - We cannot see and do not store information contained inside Streamlit apps,
    such as text, charts, images, etc.
  - Telemetry data is stored in servers in the United States.
  - If you'd like to opt out, add the following to %userprofile%/.streamlit/config.toml,
    creating that file if necessary:

    gatherUsageStats = false

  You can now view your Streamlit app in your browser.
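
If the browser cannot reach the server, it helps to confirm first that the TCP port accepts connections from the client machine. The following is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper name, and `127.0.0.1` is a placeholder that should be replaced with the address of the host running Streamlit.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8501 is the port opened by the firewall rule earlier in this guide.
    print(port_is_open("127.0.0.1", 8501))
```

A result of False from a remote client, while the same check succeeds on the server itself, suggests that the firewall rule or the address passed to `--server.address` is the cause.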

[6] Open port 8501 of the server, as shown on your command line, in a web browser to use [Stable Video Diffusion].
Check the [Load Model] box.
[7] The initial loading takes quite a while.
Once loading finishes, the following screen is displayed. You can ignore the error shown there.
Next, click [Browse files] and select the image you want to convert to video.
[8] Once your image is loaded, it is shown on the screen.
Incidentally, the cat image below was generated with Stable Diffusion.

[9] Scroll down the page and adjust the values.
At a minimum, the following values need to be changed.

the values [H] and [W] ⇒ change them to the size of the image
the value [T] (frames) ⇒ the default is 14 for [svd]; reduce it if the app runs out of memory
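
The values above can be worked out ahead of time. The sketch below rests on two assumptions that are not stated in this guide: that [H] and [W] should be multiples of 64, and that playback runs at about 7 frames per second. Both helper names are hypothetical, so adjust the `multiple` and `fps` arguments to match your setup.

```python
def snap_size(width: int, height: int, multiple: int = 64) -> tuple[int, int]:
    """Round width and height down to the nearest multiple (assumed: 64)."""
    if width < multiple or height < multiple:
        raise ValueError(f"image must be at least {multiple}x{multiple} pixels")
    return width - width % multiple, height - height % multiple


def clip_seconds(frames: int, fps: float) -> float:
    """Length of the generated clip in seconds for a given frame count."""
    return frames / fps


print(snap_size(1000, 600))   # (960, 576)
print(clip_seconds(14, 7))    # 2.0
```

Under these assumptions, the default of 14 frames gives a clip of about two seconds, and lowering [T] shortens the clip in proportion.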

After adjusting the values, click the [Sample] button to generate the video.
[10] Once the video has been generated successfully, it is displayed on the screen.
With an RTX 3060 (12 GB of VRAM), the best I could do was generate a 2-second video with [svd].
