Download the Image to Video Workflow from ComfyUI Examples and Turn an Image into a Video!

8 min read · Aug 16, 2024

Contents

⦿ ComfyUI
⦿ Video Example
⦿ svd.safetensors
⦿ ComfyUI Manager
⦿ Something Went Wrong!
⦿ Results

ComfyUI

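As mentioned in an earlier post, ComfyUI is an easy-to-use web interface: once you load an underlying model, usually Stable Diffusion or one of its derivatives, you can do text to image. It is similar to Open WebUI, which gives us a more intuitive way to chat with a model such as Llama through a web interface, except that there the models are managed by Ollama.

In short, when we interact with locally hosted models we do not need a CLI (Terminal on a Mac). Open WebUI can also bring in text to image: it first uses text to text to produce the prompt for text to image.

In the web UI, once the image prompt has been generated, click the generate-image button below it.

Before that prompt is produced, the following instruction is what actually goes through text to text: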

Craft a detailed and nuanced image generation prompt based on this brief: 
"['賽博龐克, 仿生機械, 未來世界']".
 Focus on expanding the visual elements, incorporating specific textures, colors, and atmosphere. The resulting prompt should be precise and rich in imagery, capturing the essence of the brief without any introductory or explanatory text.

Here, cyberpunk (賽博龐克), bionic machinery (仿生機械), and future world (未來世界) are my original hints.
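The way the seed keywords are dropped into that instruction can be sketched in a few lines of Python. This is only an illustration: the template text is copied from the instruction quoted above, but the names TEMPLATE and build_expansion_prompt are hypothetical and are not part of Open WebUI.

```python
# Template text copied from the instruction quoted in this post.
# TEMPLATE and build_expansion_prompt are hypothetical names for illustration.
TEMPLATE = (
    "Craft a detailed and nuanced image generation prompt based on this brief: \n"
    '"{brief}".\n'
    " Focus on expanding the visual elements, incorporating specific textures, "
    "colors, and atmosphere. The resulting prompt should be precise and rich in "
    "imagery, capturing the essence of the brief without any introductory or "
    "explanatory text."
)


def build_expansion_prompt(keywords):
    """Join the seed keywords into one brief string and fill the template."""
    # The quoted instruction shows the brief as a one-element list, so mimic that.
    brief = str([", ".join(keywords)])
    return TEMPLATE.format(brief=brief)


prompt = build_expansion_prompt(["賽博龐克", "仿生機械", "未來世界"])
print(prompt)
```

The printed text is what would be sent to the text to text model; its reply then becomes the text to image prompt.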

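This suggests that multimodality is already the direction in which future AGI is heading. Having seen OpenAI's Sora demo videos, we know image to video is also a coming trend. So is there a way to do it in ComfyUI?

There is. Let me walk you through using ComfyUI to turn an image into a video, step by step!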


Video Example

The earlier workflow was text to image, which does not work for image to video, so we need to swap out the workflow. First, go to the page below.

https://comfyanonymous.github.io/ComfyUI_examples/video/

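The first example on this page is the 14 frame model, the more basic of the models, so we start our tests with it.

Download the "Workflow in Json format" file below it to your computer.

This JSON file holds the ComfyUI workflow settings. Drag it into http://127.0.0.1:8188/.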
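Before dragging the file in, it can help to peek at what the workflow contains. The sketch below assumes the exported file is a JSON object with a top-level "nodes" list whose entries carry a "type" field; that layout is an assumption about the UI export format, and the sample dictionary is a hand-written stand-in, not the real downloaded workflow.

```python
import json
from collections import Counter


def load_workflow(path):
    """Read a workflow JSON file from disk into a Python dict."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def count_node_types(workflow):
    """Count nodes by their "type" field (assumed layout: {"nodes": [{"type": ...}]})."""
    return Counter(node.get("type", "?") for node in workflow.get("nodes", []))


# Hypothetical miniature workflow, standing in for the downloaded file.
sample = {
    "nodes": [
        {"id": 1, "type": "LoadImage"},
        {"id": 2, "type": "KSampler"},
        {"id": 3, "type": "LoadImage"},
    ]
}
counts = count_node_types(sample)
print(counts)
```

With a real file you would call count_node_types(load_workflow("your_workflow.json")), where the filename is whatever you saved the download as.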

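The layout is now different from the original one: the Group in the middle has become an image to video flow. Even with an image loaded, it cannot produce a video yet, because the corresponding model has not been downloaded.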


svd.safetensors

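Because the workflow above is for the 14 frame model, the model we need is svd.safetensors. Download it and, as before, put it in the ComfyUI => models => checkpoints folder. ComfyUI then has to be reloaded before the model can be selected locally.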
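A quick way to confirm the file landed in the right folder is a small path check. This is a minimal sketch that assumes the ComfyUI => models => checkpoints layout described above; the helper names are made up for illustration.

```python
from pathlib import Path


def checkpoint_path(comfy_root, filename="svd.safetensors"):
    """Build the expected location of a checkpoint under a ComfyUI install."""
    return Path(comfy_root) / "models" / "checkpoints" / filename


def is_installed(comfy_root, filename="svd.safetensors"):
    """Return True only if the checkpoint file actually exists on disk."""
    return checkpoint_path(comfy_root, filename).is_file()


# Replace "ComfyUI" with the path to your own install.
print(checkpoint_path("ComfyUI"))
print(is_installed("ComfyUI"))
```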

Downloading this way is not very convenient, though. We want a manager to handle the downloading, refreshing, and restarting each time.

This is where ComfyUI Manager comes in!

ComfyUI Manager

First, change into the custom_nodes folder. On my computer the install path is:

chengchunli@MacBook-Pro-M2 custom_nodes % pwd
/Users/chengchunli/Desktop/ComfyUI/custom_nodes

Inside custom_nodes, run git clone https://github.com/ltdrdata/ComfyUI-Manager.git, then restart ComfyUI. A Manager button now appears in the web interface.
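If you would rather script this step, the same clone can be driven from Python. The sketch below only wraps the git clone command given above; the function names are hypothetical, and it assumes git is on your PATH.

```python
import subprocess
from pathlib import Path

# Repository URL taken from the git clone command in this post.
MANAGER_REPO = "https://github.com/ltdrdata/ComfyUI-Manager.git"


def manager_dir(comfy_root):
    """Folder that the clone creates inside custom_nodes."""
    return Path(comfy_root) / "custom_nodes" / "ComfyUI-Manager"


def clone_command(comfy_root):
    """Build the git command as an argument list (no shell involved)."""
    return ["git", "clone", MANAGER_REPO, str(manager_dir(comfy_root))]


def install_manager(comfy_root):
    """Clone ComfyUI-Manager unless the folder already exists."""
    if manager_dir(comfy_root).exists():
        return False  # already there, nothing to do
    subprocess.run(clone_command(comfy_root), check=True)
    return True
```

Calling install_manager with the path to your ComfyUI folder performs the clone; ComfyUI still has to be restarted afterwards.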

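Click Manager at the bottom.

Then click Model Manager to search the available resources for the model you want. Typing Video, for example, finds the same svd.safetensors we just downloaded by hand, so you no longer have to hunt for the link yourself.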

svd.safetensors is 9.56 GB, so keep an eye on your disk space. After the download finishes, click Restart and reload the page; the model can then be selected in the workflow!
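Checking free space before a download of that size takes only the standard library. This sketch treats 1 GB as 10^9 bytes and adds a 1 GB safety margin; both choices are assumptions, as are the helper names.

```python
import shutil

MODEL_SIZE_GB = 9.56  # size of svd.safetensors quoted in this post


def free_gb(path="."):
    """Free space on the disk holding `path`, in GB (1 GB = 10**9 bytes here)."""
    return shutil.disk_usage(path).free / 1e9


def has_room(path=".", needed_gb=MODEL_SIZE_GB, margin_gb=1.0):
    """True if the disk can hold the model plus a small safety margin."""
    return free_gb(path) >= needed_gb + margin_gb


print(f"free: {free_gb():.1f} GB, enough for the model: {has_room()}")
```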


Something Went Wrong!

Because I am on an Apple laptop, the following error appeared while the model was running.

8%|█████▍                                                           | 1/12 [05:06<56:15, 306.88s/it]

RuntimeError: MPS backend out of memory (MPS allocated: 7.21 GB, other allocations: 10.17 GB, max allowed: 18.13 GB). Tried to allocate 843.75 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

Prompt executed in 327.84 seconds
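The numbers in that message can be checked by hand: 7.21 GB plus 10.17 GB is 17.38 GB already in use, and the extra 843.75 MB request pushes the total to roughly 18.20 GB, just past the 18.13 GB cap. The short calculation below reproduces this; converting MB to GB by dividing by 1024 is an assumption about how the message reports its units.

```python
# Figures copied from the RuntimeError message above.
allocated_gb = 7.21      # "MPS allocated"
other_gb = 10.17         # "other allocations"
limit_gb = 18.13         # "max allowed"
request_gb = 843.75 / 1024  # "Tried to allocate 843.75 MB" (assuming 1 GB = 1024 MB)

in_use = allocated_gb + other_gb
total = in_use + request_gb

print(f"in use before the request: {in_use:.2f} GB")
print(f"in use after the request:  {total:.2f} GB (limit {limit_gb:.2f} GB)")
print("over the limit:", total > limit_gb)
```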

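The RuntimeError hit at 8%, and it points at MPS (Metal Performance Shaders): PyTorch tried to use more memory than the limit it is allowed. Here is how to work around it:

As before, inside the virtual environment, try entering these two lines: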

export PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0

export PYTORCH_ENABLE_MPS_FALLBACK=1

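The first line removes the upper limit on memory allocations on Apple M-series chips (the error message itself warns that this may cause system failure). The second lets PyTorch fall back to the CPU for operations the MPS backend does not support, only it will run slower, much slower.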
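The same two variables can also be set from Python when launching ComfyUI as a child process, which avoids having to remember the export lines in every new shell. This is a minimal sketch: the helper name is hypothetical, and the launch command in the note after the code assumes ComfyUI is started with main.py from its install folder.

```python
import os

# The two settings from the export lines above.
MPS_ENV = {
    "PYTORCH_MPS_HIGH_WATERMARK_RATIO": "0.0",  # disable the MPS allocation cap
    "PYTORCH_ENABLE_MPS_FALLBACK": "1",         # CPU fallback for unsupported MPS ops
}


def env_with_mps_overrides(base=None):
    """Return a copy of the environment with the two MPS settings applied."""
    env = dict(os.environ if base is None else base)
    env.update(MPS_ENV)
    return env


env = env_with_mps_overrides({"PATH": "/usr/bin"})
print(env)
```

To use it, pass the result to a launcher, for example subprocess.run([sys.executable, "main.py"], cwd=comfy_root, env=env_with_mps_overrides()), where comfy_root is your own install path. The variables must be in place before PyTorch starts, which is why they are set on the child process rather than inside an already running session.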

Results

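Finally, with these set in the virtual environment, the run finished, after a very long wait... Amitabha.

The cat picture at the top has been turned into an animated cat!

Fun, isn't it?!

That's all for this time. Thank you for reading.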


Reference:

stabilityai/stable-video-diffusion-img2vid · Hugging Face (huggingface.co)

GitHub - thecooltechguy/ComfyUI-Stable-Video-Diffusion: ComfyUI nodes for Stable Video Diffusion (github.com)