Added support for Moore Threads GPUs #8011

Draft: wants to merge 6 commits into master

Conversation

hben35096

System Configuration:

Architecture: x86_64
Operating System: Linux
GPU: Moore Threads MTT S4000
VRAM: 48GB
Python version: 3.10.8
torch version: 2.2.0a0+git8ac9b20
torchvision version: 0.17.2+c1d70fe
torchaudio version: 2.2.2+cefdb36
torch_musa version: 1.3.0

Project Objective:

I have added code to ComfyUI to enable support for Moore Threads GPUs, leveraging the MUSA backend provided by torch_musa. I would like to contribute this enhancement to the main branch of the ComfyUI repository.
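
The exact changes are in the commits; in outline, the device detection added to model_management.py looks like this (a simplified sketch, not the literal diff, assuming torch_musa's torch.musa namespace, which mirrors torch.cuda):

    import torch

    try:
        import torch_musa  # registers the "musa" device type and the torch.musa namespace
        MUSA_AVAILABLE = torch.musa.is_available()
    except ImportError:
        MUSA_AVAILABLE = False

    def get_torch_device():
        # Prefer a Moore Threads GPU when the MUSA backend is importable and usable.
        if MUSA_AVAILABLE:
            return torch.device("musa", torch.musa.current_device())
        if torch.cuda.is_available():
            return torch.device("cuda", torch.cuda.current_device())
        return torch.device("cpu")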

Launch Information:

/root/autodl-tmp/ComfyUI
[START] Security scan
[DONE] Security scan
## ComfyUI-Manager: installing dependencies done.
** ComfyUI startup time: 2025-05-09 01:47:55.170
** Platform: Linux
** Python version: 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0]
** Python executable: /root/miniconda3/bin/python
** ComfyUI Path: /root/autodl-tmp/ComfyUI
** ComfyUI Base Folder Path: /root/autodl-tmp/ComfyUI
** User directory: /root/autodl-tmp/ComfyUI/user
** ComfyUI-Manager config path: /root/autodl-tmp/ComfyUI/user/default/ComfyUI-Manager/config.ini
** Log path: /root/autodl-tmp/ComfyUI/user/comfyui.log

Prestartup times for custom nodes:
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/rgthree-comfy
   2.6 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI-Manager

Warning, you are using an old pytorch version and some ckpt/pt files might be loaded unsafely. Upgrading to 2.4 or above is recommended.
/root/miniconda3/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
MUSA device detected: MTT S4000
Total VRAM 49062 MB, total RAM 1031698 MB
pytorch version: 2.2.0
Set vram state to: NORMAL_VRAM
Device: musa
/root/miniconda3/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
Using split optimization for attention
Python version: 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0]
ComfyUI version: 0.3.32
ComfyUI frontend version: 1.18.9
[Prompt Server] web root: /root/miniconda3/lib/python3.10/site-packages/comfyui_frontend_package/static
[/root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux] | INFO -> Using ckpts path: /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts
[/root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux] | INFO -> Using symlinks: False
[/root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider', 'CoreMLExecutionProvider']
/root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/node_wrappers/dwpose.py:26: UserWarning: DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly
  warnings.warn("DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly")
### Loading: ComfyUI-Manager (V3.31.13)
[ComfyUI-Manager] network_mode: public
### ComfyUI Version: v0.3.32-8-gc7c025b8 | Released on '2025-05-08'

[rgthree-comfy] Loaded 42 extraordinary nodes. 🎉


Import times for custom nodes:
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/websocket_image_save.py
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/aigodlike-comfyui-translation
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/rgthree-comfy
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus
   0.1 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI-Manager
   0.5 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux

Starting server

To see the GUI go to: http://127.0.0.1:6006/
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
got prompt
model weight dtype torch.float16, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
VAE load device: musa:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: musa:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load SD1ClipModel
loaded completely 47836.78046875 235.84423828125 True
INFO: Clip Vision model loaded from /root/autodl-tmp/ComfyUI/models/clip_vision/CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors
INFO: IPAdapter model loaded from /root/autodl-tmp/ComfyUI/models/ipadapter/ip-adapter_sd15_light_v11.bin
INFO: the IPAdapter reference image is not a square, CLIPImageProcessor will resize and crop it at the center. If the main focus of the picture is not in the middle the result might not be what you are expecting.
Requested to load CLIPVisionModelProjection
loaded completely 47552.3732421875 1208.09814453125 True
Requested to load BaseModel
loaded completely 46277.04658203125 1639.406135559082 True
 30%|█████████████▏                              | 6/20 [00:01<00:02,  4.73it/s]FETCH ComfyRegistry Data: 5/84
100%|███████████████████████████████████████████| 20/20 [00:04<00:00,  4.84it/s]
Requested to load AutoencoderKL
loaded completely 44292.29931640625 159.55708122253418 True
Prompt executed in 7.16 seconds
FETCH ComfyRegistry Data: 10/84
FETCH ComfyRegistry Data: 15/84
FETCH ComfyRegistry Data: 20/84
got prompt
INFO: IPAdapter model loaded from /root/autodl-tmp/ComfyUI/models/ipadapter/ip-adapter_sd15.safetensors
INFO: the IPAdapter reference image is not a square, CLIPImageProcessor will resize and crop it at the center. If the main focus of the picture is not in the middle the result might not be what you are expecting.
Requested to load BaseModel
 15%|██████▌                                     | 3/20 [00:00<00:02,  5.82it/s]FETCH ComfyRegistry Data: 25/84
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.80it/s]
Prompt executed in 3.87 seconds
FETCH ComfyRegistry Data: 30/84
FETCH ComfyRegistry Data: 35/84
FETCH ComfyRegistry Data: 40/84
FETCH ComfyRegistry Data: 45/84
FETCH ComfyRegistry Data: 50/84
FETCH ComfyRegistry Data: 55/84
FETCH ComfyRegistry Data: 60/84
FETCH ComfyRegistry Data: 65/84
FETCH ComfyRegistry Data: 70/84
FETCH ComfyRegistry Data: 75/84
FETCH ComfyRegistry Data: 80/84
FETCH ComfyRegistry Data [DONE]
[ComfyUI-Manager] default cache updated: https://api.comfy.org/nodes
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json [DONE]
[ComfyUI-Manager] All startup tasks have been completed.
got prompt
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.81it/s]
Prompt executed in 3.62 seconds
got prompt
model weight dtype torch.float16, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
VAE load device: musa:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: musa:0, offload device: cpu, current: cpu, dtype: torch.float16
model_path is /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/lllyasviel/Annotators/150_16_swin_l_oneformer_coco_100ep.pth
/root/miniconda3/lib/python3.10/site-packages/torch/functional.py:507: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /home/pytorch/aten/src/ATen/native/TensorShape.cpp:3549.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[DetectionCheckpointer] Loading from /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/lllyasviel/Annotators/150_16_swin_l_oneformer_coco_100ep.pth ...
model_path is /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/LayerNorm/DensePose-TorchScript-with-hint-image/densepose_r50_fpn_dl.torchscript
Requested to load SD1ClipModel
loaded completely 47579.58125 235.84423828125 True
Requested to load BaseModel
Requested to load ControlNet
loaded completely 47234.53916015625 1639.406135559082 True
loaded completely 45578.70859375 689.0852355957031 True
100%|███████████████████████████████████████████| 20/20 [00:05<00:00,  3.81it/s]
Requested to load AutoencoderKL
loaded completely 44167.8798828125 159.55708122253418 True
Prompt executed in 13.93 seconds

[Screenshots: PixPin_2025-05-09_01-59-15, PixPin_2025-05-09_01-58-40]

hben35096 requested a review from comfyanonymous as a code owner May 8, 2025 18:00
@hben35096
Author

I have modified two files:

  • ComfyUI/comfy/model_management.py
  • ComfyUI/comfy/clip_vision.py

This is my first time creating a pull request, and I am not very familiar with the process. If there are any mistakes, please forgive me.

In the ComfyUI/comfy/clip_vision.py file, I changed:
image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)

to:

image = torch.nn.functional.interpolate(image, size=scale_size, mode="bilinear")

This modification allowed the IPAdapter workflow to run successfully, but I am not sure whether it has other negative effects.
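
For reference, the availability of each resize path on the MUSA backend can be probed directly (a sketch, assuming torch_musa is installed and a MUSA device is present):

    import torch
    import torch_musa  # registers the "musa" device type

    # Try each interpolate configuration on a small tensor and report which
    # ones have MUSA kernels.
    x = torch.rand(1, 3, 512, 512, device="musa")
    for mode, aa in [("bilinear", False), ("bicubic", False), ("bicubic", True)]:
        try:
            torch.nn.functional.interpolate(x, size=(224, 224), mode=mode, antialias=aa)
            print(f"mode={mode}, antialias={aa}: OK")
        except NotImplementedError:
            print(f"mode={mode}, antialias={aa}: missing MUSA kernel")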

@hben35096
Author

Please wait. There is an error with the Flux model, and it is being resolved.

@hben35096
Author

It has been solved.
[Screenshot: PixPin_2025-05-09_23-29-53]

@@ -29,7 +29,7 @@ def clip_preprocess(image, size=224, mean=[0.48145466, 0.4578275, 0.40821073], s
     else:
         scale_size = (size, size)

-    image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)
+    image = torch.nn.functional.interpolate(image, size=scale_size, mode="bilinear")
Owner

This cannot be changed because bicubic is the correct way to downscale images for the clip vision models.
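
(For context, the gap between the two resize modes is easy to measure; a minimal CPU-only sketch, not from this PR:)

    import torch

    # Compare antialiased bicubic (what clip_preprocess uses) against plain
    # bilinear on random high-frequency content at the CLIP input size.
    x = torch.rand(1, 3, 1024, 1024)
    ref = torch.nn.functional.interpolate(x, size=(224, 224), mode="bicubic", antialias=True)
    alt = torch.nn.functional.interpolate(x, size=(224, 224), mode="bilinear")
    print((ref - alt).abs().max())  # this deviation feeds straight into the CLIP vision embedding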

Author

@hben35096 May 9, 2025

If I use the following on a computer with a Moore Threads GPU:
image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)
I receive the following error:

VAE load device: musa:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: musa:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load SD1ClipModel
loaded completely 47836.78046875 235.84423828125 True
FETCH ComfyRegistry Data: 15/84
INFO: Clip Vision model loaded from /root/autodl-tmp/ComfyUI/models/clip_vision/CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors
INFO: IPAdapter model loaded from /root/autodl-tmp/ComfyUI/models/ipadapter/ip-adapter_sd15.safetensors
INFO: the IPAdapter reference image is not a square, CLIPImageProcessor will resize and crop it at the center. If the main focus of the picture is not in the middle the result might not be what you are expecting.
Requested to load CLIPVisionModelProjection
loaded completely 47552.3732421875 1208.09814453125 True
!!! Exception during processing !!! Could not run 'aten::_upsample_bicubic2d_aa.out' with arguments from the 'musa' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_upsample_bicubic2d_aa.out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, AutocastPrivateUse1, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at /home/pytorch/build/aten/src/ATen/RegisterCPU.cpp:31411 [kernel]
Meta: registered at /home/pytorch/build/aten/src/ATen/RegisterMeta.cpp:26984 [kernel]
BackendSelect: fallthrough registered at /home/pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at /home/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:154 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /home/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:498 [backend fallback]
Functionalize: registered at /home/pytorch/build/aten/src/ATen/RegisterFunctionalization_0.cpp:21981 [kernel]
Named: registered at /home/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at /home/pytorch/aten/src/ATen/ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at /home/pytorch/aten/src/ATen/native/NegateFallback.cpp:19 [backend fallback]
ZeroTensor: registered at /home/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: registered at /home/pytorch/torch/csrc/autograd/generated/ADInplaceOrViewType_0.cpp:4832 [kernel]
AutogradOther: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradCPU: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradCUDA: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradHIP: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradXLA: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMPS: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradIPU: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradXPU: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradHPU: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradVE: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradLazy: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMTIA: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse1: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse2: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse3: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMeta: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradNestedTensor: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
Tracer: registered at /home/pytorch/torch/csrc/autograd/generated/TraceType_0.cpp:17001 [kernel]
AutocastCPU: fallthrough registered at /home/pytorch/aten/src/ATen/autocast_mode.cpp:378 [backend fallback]
AutocastCUDA: fallthrough registered at /home/pytorch/aten/src/ATen/autocast_mode.cpp:244 [backend fallback]
AutocastPrivateUse1: fallthrough registered at /home/torch_musa/torch_musa/csrc/amp/autocast_mode.cpp:412 [backend fallback]
FuncTorchBatched: registered at /home/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:720 [backend fallback]
BatchedNestedTensor: registered at /home/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:746 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /home/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at /home/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at /home/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /home/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:203 [backend fallback]
PythonTLSSnapshot: registered at /home/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:162 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /home/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:494 [backend fallback]
PreDispatch: registered at /home/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:166 [backend fallback]
PythonDispatcher: registered at /home/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:158 [backend fallback]

Traceback (most recent call last):
  File "/root/autodl-tmp/ComfyUI/execution.py", line 347, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/root/autodl-tmp/ComfyUI/execution.py", line 222, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/root/autodl-tmp/ComfyUI/execution.py", line 194, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/root/autodl-tmp/ComfyUI/execution.py", line 183, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/IPAdapterPlus.py", line 752, in apply_ipadapter
    return ipadapter_execute(model.clone(), ipadapter['ipadapter']['model'], ipadapter['clipvision']['model'], **ipa_args)
  File "/root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/IPAdapterPlus.py", line 376, in ipadapter_execute
    img_cond_embeds = encode_image_masked(clipvision, image, batch_size=encode_batch_size, tiles=enhance_tiles, ratio=enhance_ratio, clipvision_size=clipvision_size)
  File "/root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/utils.py", line 242, in encode_image_masked
    embeds = encode_image_masked_(clip_vision, image, mask, batch_size, clipvision_size=clipvision_size)
  File "/root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/utils.py", line 293, in encode_image_masked_
    pixel_values = clip_preprocess(img, size=clipvision_size).float()
  File "/root/autodl-tmp/ComfyUI/comfy/clip_vision.py", line 32, in clip_preprocess
    image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/functional.py", line 4059, in interpolate
    return torch._C._nn._upsample_bicubic2d_aa(input, output_size, align_corners, scale_factors)
NotImplementedError: Could not run 'aten::_upsample_bicubic2d_aa.out' with arguments from the 'musa' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_upsample_bicubic2d_aa.out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, AutocastPrivateUse1, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

So I used "bilinear" to make it run, and judging from the results it seems acceptable, haha.
Of course, it would be even better to choose the interpolation mode based on the device in use, or to find another solution.

[Screenshots: PixPin_2025-05-10_03-09-1822, PixPin_2025-05-10_03-09-182]

I’m not very familiar with programming; I can only use the most amateur methods. I hope you can understand.

Author

You're right, I changed it to:

        if image.device.type == 'musa':
            # aten::_upsample_bicubic2d_aa has no MUSA kernel, so run the
            # antialiased bicubic resize on the CPU, then move it back to the
            # original device (preserving the device index, e.g. musa:0).
            device = image.device
            image = torch.nn.functional.interpolate(image.cpu(), size=scale_size, mode="bicubic", antialias=True)
            image = image.to(device)
        else:
            image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)
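
A slightly more general shape for the same workaround, as a sketch rather than part of this PR, would wrap the CPU round trip in a helper so any call site hitting the missing MUSA kernel can reuse it (the helper name is hypothetical):

    import torch

    def interpolate_with_cpu_fallback(image, **kwargs):
        # Hypothetical helper, not in this PR: antialiased bicubic resizing
        # (aten::_upsample_bicubic2d_aa) has no MUSA kernel, so fall back to
        # the CPU and return the result on the original device.
        if image.device.type == "musa":
            return torch.nn.functional.interpolate(image.cpu(), **kwargs).to(image.device)
        return torch.nn.functional.interpolate(image, **kwargs)

    # Usage inside clip_preprocess:
    # image = interpolate_with_cpu_fallback(image, size=scale_size, mode="bicubic", antialias=True)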

hben35096 marked this pull request as draft May 10, 2025 06:08