Added support for Moore Threads GPUs #8011

Draft: wants to merge 6 commits into master

Conversation

hben35096

System Configuration:

Architecture: x86_64
Operating System: Linux
GPU: Moore Threads MTT S4000
VRAM: 48GB
Python version: 3.10.8
torch version: 2.2.0a0+git8ac9b20
torchvision version: 0.17.2+c1d70fe
torchaudio version: 2.2.2+cefdb36
torch_musa version: 1.3.0

Project Objective:

I have added code to ComfyUI to enable support for Moore Threads GPUs, leveraging the MUSA backend provided by torch_musa. I would like to contribute this enhancement to the main branch of the ComfyUI repository.
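
The exact changes are in the commits; in outline, the device detection added to model_management.py looks like this (a simplified sketch, not the literal diff, assuming torch_musa's torch.musa namespace, which mirrors torch.cuda):

    import torch

    try:
        import torch_musa  # registers the "musa" device type and the torch.musa namespace
        MUSA_AVAILABLE = torch.musa.is_available()
    except ImportError:
        MUSA_AVAILABLE = False

    def get_torch_device():
        # Prefer a Moore Threads GPU when the MUSA backend is importable and usable.
        if MUSA_AVAILABLE:
            return torch.device("musa", torch.musa.current_device())
        if torch.cuda.is_available():
            return torch.device("cuda", torch.cuda.current_device())
        return torch.device("cpu")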

Launch Information:

/root/autodl-tmp/ComfyUI
[START] Security scan
[DONE] Security scan
## ComfyUI-Manager: installing dependencies done.
** ComfyUI startup time: 2025-05-09 01:47:55.170
** Platform: Linux
** Python version: 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0]
** Python executable: /root/miniconda3/bin/python
** ComfyUI Path: /root/autodl-tmp/ComfyUI
** ComfyUI Base Folder Path: /root/autodl-tmp/ComfyUI
** User directory: /root/autodl-tmp/ComfyUI/user
** ComfyUI-Manager config path: /root/autodl-tmp/ComfyUI/user/default/ComfyUI-Manager/config.ini
** Log path: /root/autodl-tmp/ComfyUI/user/comfyui.log

Prestartup times for custom nodes:
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/rgthree-comfy
   2.6 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI-Manager

Warning, you are using an old pytorch version and some ckpt/pt files might be loaded unsafely. Upgrading to 2.4 or above is recommended.
/root/miniconda3/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
MUSA device detected: MTT S4000
Total VRAM 49062 MB, total RAM 1031698 MB
pytorch version: 2.2.0
Set vram state to: NORMAL_VRAM
Device: musa
/root/miniconda3/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: ''If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
Using split optimization for attention
Python version: 3.10.8 (main, Nov 24 2022, 14:13:03) [GCC 11.2.0]
ComfyUI version: 0.3.32
ComfyUI frontend version: 1.18.9
[Prompt Server] web root: /root/miniconda3/lib/python3.10/site-packages/comfyui_frontend_package/static
[/root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux] | INFO -> Using ckpts path: /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts
[/root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux] | INFO -> Using symlinks: False
[/root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider', 'CoreMLExecutionProvider']
/root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/node_wrappers/dwpose.py:26: UserWarning: DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly
  warnings.warn("DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly")
### Loading: ComfyUI-Manager (V3.31.13)
[ComfyUI-Manager] network_mode: public
### ComfyUI Version: v0.3.32-8-gc7c025b8 | Released on '2025-05-08'

[rgthree-comfy] Loaded 42 extraordinary nodes. 🎉


Import times for custom nodes:
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/websocket_image_save.py
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/aigodlike-comfyui-translation
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/rgthree-comfy
   0.0 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus
   0.1 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI-Manager
   0.5 seconds: /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux

Starting server

To see the GUI go to: http://127.0.0.1:6006/
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
got prompt
model weight dtype torch.float16, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
VAE load device: musa:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: musa:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load SD1ClipModel
loaded completely 47836.78046875 235.84423828125 True
INFO: Clip Vision model loaded from /root/autodl-tmp/ComfyUI/models/clip_vision/CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors
INFO: IPAdapter model loaded from /root/autodl-tmp/ComfyUI/models/ipadapter/ip-adapter_sd15_light_v11.bin
INFO: the IPAdapter reference image is not a square, CLIPImageProcessor will resize and crop it at the center. If the main focus of the picture is not in the middle the result might not be what you are expecting.
Requested to load CLIPVisionModelProjection
loaded completely 47552.3732421875 1208.09814453125 True
Requested to load BaseModel
loaded completely 46277.04658203125 1639.406135559082 True
 30%|█████████████▏                              | 6/20 [00:01<00:02,  4.73it/s]FETCH ComfyRegistry Data: 5/84
100%|███████████████████████████████████████████| 20/20 [00:04<00:00,  4.84it/s]
Requested to load AutoencoderKL
loaded completely 44292.29931640625 159.55708122253418 True
Prompt executed in 7.16 seconds
FETCH ComfyRegistry Data: 10/84
FETCH ComfyRegistry Data: 15/84
FETCH ComfyRegistry Data: 20/84
got prompt
INFO: IPAdapter model loaded from /root/autodl-tmp/ComfyUI/models/ipadapter/ip-adapter_sd15.safetensors
INFO: the IPAdapter reference image is not a square, CLIPImageProcessor will resize and crop it at the center. If the main focus of the picture is not in the middle the result might not be what you are expecting.
Requested to load BaseModel
 15%|██████▌                                     | 3/20 [00:00<00:02,  5.82it/s]FETCH ComfyRegistry Data: 25/84
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.80it/s]
Prompt executed in 3.87 seconds
FETCH ComfyRegistry Data: 30/84
FETCH ComfyRegistry Data: 35/84
FETCH ComfyRegistry Data: 40/84
FETCH ComfyRegistry Data: 45/84
FETCH ComfyRegistry Data: 50/84
FETCH ComfyRegistry Data: 55/84
FETCH ComfyRegistry Data: 60/84
FETCH ComfyRegistry Data: 65/84
FETCH ComfyRegistry Data: 70/84
FETCH ComfyRegistry Data: 75/84
FETCH ComfyRegistry Data: 80/84
FETCH ComfyRegistry Data [DONE]
[ComfyUI-Manager] default cache updated: https://api.comfy.org/nodes
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json [DONE]
[ComfyUI-Manager] All startup tasks have been completed.
got prompt
100%|███████████████████████████████████████████| 20/20 [00:03<00:00,  5.81it/s]
Prompt executed in 3.62 seconds
got prompt
model weight dtype torch.float16, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
VAE load device: musa:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: musa:0, offload device: cpu, current: cpu, dtype: torch.float16
model_path is /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/lllyasviel/Annotators/150_16_swin_l_oneformer_coco_100ep.pth
/root/miniconda3/lib/python3.10/site-packages/torch/functional.py:507: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /home/pytorch/aten/src/ATen/native/TensorShape.cpp:3549.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[DetectionCheckpointer] Loading from /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/lllyasviel/Annotators/150_16_swin_l_oneformer_coco_100ep.pth ...
model_path is /root/autodl-tmp/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/LayerNorm/DensePose-TorchScript-with-hint-image/densepose_r50_fpn_dl.torchscript
Requested to load SD1ClipModel
loaded completely 47579.58125 235.84423828125 True
Requested to load BaseModel
Requested to load ControlNet
loaded completely 47234.53916015625 1639.406135559082 True
loaded completely 45578.70859375 689.0852355957031 True
100%|███████████████████████████████████████████| 20/20 [00:05<00:00,  3.81it/s]
Requested to load AutoencoderKL
loaded completely 44167.8798828125 159.55708122253418 True
Prompt executed in 13.93 seconds

[Screenshots: PixPin_2025-05-09_01-59-15, PixPin_2025-05-09_01-58-40]

hben35096 requested a review from comfyanonymous as a code owner May 8, 2025 18:00
@hben35096
Author

I have modified two files:

  • ComfyUI/comfy/model_management.py
  • ComfyUI/comfy/clip_vision.py

This is my first time creating a pull request, and I am not very familiar with the process. If there are any mistakes, please forgive me.

In the ComfyUI/comfy/clip_vision.py file, I changed:
image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)

to:

image = torch.nn.functional.interpolate(image, size=scale_size, mode="bilinear")

This modification allowed the IPAdapter workflow to run successfully, but I am not sure whether it has other negative effects.
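
For reference, the availability of each resize path on the MUSA backend can be probed directly (a sketch, assuming torch_musa is installed and a MUSA device is present):

    import torch
    import torch_musa  # registers the "musa" device type

    # Try each interpolate configuration on a small tensor and report which
    # ones have MUSA kernels.
    x = torch.rand(1, 3, 512, 512, device="musa")
    for mode, aa in [("bilinear", False), ("bicubic", False), ("bicubic", True)]:
        try:
            torch.nn.functional.interpolate(x, size=(224, 224), mode=mode, antialias=aa)
            print(f"mode={mode}, antialias={aa}: OK")
        except NotImplementedError:
            print(f"mode={mode}, antialias={aa}: missing MUSA kernel")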

@hben35096
Author

Please wait. There is an error with the Flux model, and it is being resolved.

@hben35096
Author

It has been solved.
[Screenshot: PixPin_2025-05-09_23-29-53]

@@ -29,7 +29,7 @@ def clip_preprocess(image, size=224, mean=[0.48145466, 0.4578275, 0.40821073], s
     else:
         scale_size = (size, size)

-    image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)
+    image = torch.nn.functional.interpolate(image, size=scale_size, mode="bilinear")
Owner

This cannot be changed because bicubic is the correct way to downscale images for the clip vision models.
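
(For context, the gap between the two resize modes is easy to measure; a minimal CPU-only sketch, not from this PR:)

    import torch

    # Compare antialiased bicubic (what clip_preprocess uses) against plain
    # bilinear on random high-frequency content at the CLIP input size.
    x = torch.rand(1, 3, 1024, 1024)
    ref = torch.nn.functional.interpolate(x, size=(224, 224), mode="bicubic", antialias=True)
    alt = torch.nn.functional.interpolate(x, size=(224, 224), mode="bilinear")
    print((ref - alt).abs().max())  # this deviation feeds straight into the CLIP vision embedding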

Author

@hben35096 May 9, 2025

If I use the following on a computer with a Moore Threads GPU:
image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)
I receive the following error:

VAE load device: musa:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: musa:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load SD1ClipModel
loaded completely 47836.78046875 235.84423828125 True
FETCH ComfyRegistry Data: 15/84
INFO: Clip Vision model loaded from /root/autodl-tmp/ComfyUI/models/clip_vision/CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors
INFO: IPAdapter model loaded from /root/autodl-tmp/ComfyUI/models/ipadapter/ip-adapter_sd15.safetensors
INFO: the IPAdapter reference image is not a square, CLIPImageProcessor will resize and crop it at the center. If the main focus of the picture is not in the middle the result might not be what you are expecting.
Requested to load CLIPVisionModelProjection
loaded completely 47552.3732421875 1208.09814453125 True
!!! Exception during processing !!! Could not run 'aten::_upsample_bicubic2d_aa.out' with arguments from the 'musa' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_upsample_bicubic2d_aa.out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, AutocastPrivateUse1, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

CPU: registered at /home/pytorch/build/aten/src/ATen/RegisterCPU.cpp:31411 [kernel]
Meta: registered at /home/pytorch/build/aten/src/ATen/RegisterMeta.cpp:26984 [kernel]
BackendSelect: fallthrough registered at /home/pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Python: registered at /home/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:154 [backend fallback]
FuncTorchDynamicLayerBackMode: registered at /home/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:498 [backend fallback]
Functionalize: registered at /home/pytorch/build/aten/src/ATen/RegisterFunctionalization_0.cpp:21981 [kernel]
Named: registered at /home/pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
Conjugate: registered at /home/pytorch/aten/src/ATen/ConjugateFallback.cpp:17 [backend fallback]
Negative: registered at /home/pytorch/aten/src/ATen/native/NegateFallback.cpp:19 [backend fallback]
ZeroTensor: registered at /home/pytorch/aten/src/ATen/ZeroTensorFallback.cpp:86 [backend fallback]
ADInplaceOrView: registered at /home/pytorch/torch/csrc/autograd/generated/ADInplaceOrViewType_0.cpp:4832 [kernel]
AutogradOther: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradCPU: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradCUDA: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradHIP: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradXLA: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMPS: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradIPU: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradXPU: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradHPU: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradVE: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradLazy: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMTIA: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse1: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse2: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradPrivateUse3: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradMeta: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
AutogradNestedTensor: registered at /home/pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:19039 [autograd kernel]
Tracer: registered at /home/pytorch/torch/csrc/autograd/generated/TraceType_0.cpp:17001 [kernel]
AutocastCPU: fallthrough registered at /home/pytorch/aten/src/ATen/autocast_mode.cpp:378 [backend fallback]
AutocastCUDA: fallthrough registered at /home/pytorch/aten/src/ATen/autocast_mode.cpp:244 [backend fallback]
AutocastPrivateUse1: fallthrough registered at /home/torch_musa/torch_musa/csrc/amp/autocast_mode.cpp:412 [backend fallback]
FuncTorchBatched: registered at /home/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:720 [backend fallback]
BatchedNestedTensor: registered at /home/pytorch/aten/src/ATen/functorch/LegacyBatchingRegistrations.cpp:746 [backend fallback]
FuncTorchVmapMode: fallthrough registered at /home/pytorch/aten/src/ATen/functorch/VmapModeRegistrations.cpp:28 [backend fallback]
Batched: registered at /home/pytorch/aten/src/ATen/LegacyBatchingRegistrations.cpp:1075 [backend fallback]
VmapMode: fallthrough registered at /home/pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
FuncTorchGradWrapper: registered at /home/pytorch/aten/src/ATen/functorch/TensorWrapper.cpp:203 [backend fallback]
PythonTLSSnapshot: registered at /home/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:162 [backend fallback]
FuncTorchDynamicLayerFrontMode: registered at /home/pytorch/aten/src/ATen/functorch/DynamicLayer.cpp:494 [backend fallback]
PreDispatch: registered at /home/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:166 [backend fallback]
PythonDispatcher: registered at /home/pytorch/aten/src/ATen/core/PythonFallbackKernel.cpp:158 [backend fallback]

Traceback (most recent call last):
  File "/root/autodl-tmp/ComfyUI/execution.py", line 347, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/root/autodl-tmp/ComfyUI/execution.py", line 222, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/root/autodl-tmp/ComfyUI/execution.py", line 194, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/root/autodl-tmp/ComfyUI/execution.py", line 183, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/IPAdapterPlus.py", line 752, in apply_ipadapter
    return ipadapter_execute(model.clone(), ipadapter['ipadapter']['model'], ipadapter['clipvision']['model'], **ipa_args)
  File "/root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/IPAdapterPlus.py", line 376, in ipadapter_execute
    img_cond_embeds = encode_image_masked(clipvision, image, batch_size=encode_batch_size, tiles=enhance_tiles, ratio=enhance_ratio, clipvision_size=clipvision_size)
  File "/root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/utils.py", line 242, in encode_image_masked
    embeds = encode_image_masked_(clip_vision, image, mask, batch_size, clipvision_size=clipvision_size)
  File "/root/autodl-tmp/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/utils.py", line 293, in encode_image_masked_
    pixel_values = clip_preprocess(img, size=clipvision_size).float()
  File "/root/autodl-tmp/ComfyUI/comfy/clip_vision.py", line 32, in clip_preprocess
    image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)
  File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/functional.py", line 4059, in interpolate
    return torch._C._nn._upsample_bicubic2d_aa(input, output_size, align_corners, scale_factors)
NotImplementedError: Could not run 'aten::_upsample_bicubic2d_aa.out' with arguments from the 'musa' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::_upsample_bicubic2d_aa.out' is only available for these backends: [CPU, Meta, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradHIP, AutogradXLA, AutogradMPS, AutogradIPU, AutogradXPU, AutogradHPU, AutogradVE, AutogradLazy, AutogradMTIA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, AutogradMeta, AutogradNestedTensor, Tracer, AutocastCPU, AutocastCUDA, AutocastPrivateUse1, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

So I used "bilinear" to make it run, and judging from the results it seems acceptable, haha.
Of course, it would be even better to choose the interpolation mode based on the device in use, or to find another solution.

[Screenshots: PixPin_2025-05-10_03-09-1822, PixPin_2025-05-10_03-09-182]

I’m not very familiar with programming; I can only use the most amateur methods. I hope you can understand.

Author

You're right, I changed it to:

        if image.device.type == 'musa':
            # aten::_upsample_bicubic2d_aa has no MUSA kernel, so run the
            # antialiased bicubic resize on the CPU, then move it back to the
            # original device (preserving the device index, e.g. musa:0).
            device = image.device
            image = torch.nn.functional.interpolate(image.cpu(), size=scale_size, mode="bicubic", antialias=True)
            image = image.to(device)
        else:
            image = torch.nn.functional.interpolate(image, size=scale_size, mode="bicubic", antialias=True)
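
A slightly more general shape for the same workaround, as a sketch rather than part of this PR, would wrap the CPU round trip in a helper so any call site hitting the missing MUSA kernel can reuse it (the helper name is hypothetical):

    import torch

    def interpolate_with_cpu_fallback(image, **kwargs):
        # Hypothetical helper, not in this PR: antialiased bicubic resizing
        # (aten::_upsample_bicubic2d_aa) has no MUSA kernel, so fall back to
        # the CPU and return the result on the original device.
        if image.device.type == "musa":
            return torch.nn.functional.interpolate(image.cpu(), **kwargs).to(image.device)
        return torch.nn.functional.interpolate(image, **kwargs)

    # Usage inside clip_preprocess:
    # image = interpolate_with_cpu_fallback(image, size=scale_size, mode="bicubic", antialias=True)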

hben35096 marked this pull request as draft May 10, 2025 06:08