==> Building on volcanion ==> Checking for remote environment... ==> Syncing package to remote host... sending incremental file list created directory packages/composable-kernel ./ .SRCINFO 658 100% 0.00kB/s 0:00:00 (xfr#1, to-chk=3/5) .nvchecker.toml 115 100% 112.30kB/s 0:00:00 (xfr#2, to-chk=2/5) PKGBUILD 1,347 100% 1.28MB/s 0:00:00 (xfr#3, to-chk=1/5) composable-kernel-6.0.2-1.log 500 100% 488.28kB/s 0:00:00 (xfr#4, to-chk=0/5) sent 1,824 bytes received 144 bytes 3,936.00 bytes/sec total size is 2,316 speedup is 1.18 ==> Patching arch to riscv64... ==> Running extra-riscv64-build -- -d /home/felix/packages/riscv64-pkg-cache:/var/cache/pacman/pkg -l felix2 on remote host... :: Synchronizing package databases... core downloading... extra downloading... :: Starting full system upgrade... resolving dependencies... looking for conflicting packages... Package (2) Old Version New Version Net Change Download Size core/texinfo 7.1-2 7.1.1-1 0.07 MiB 1.68 MiB core/tzdata 2024a-2 2024b-1 -0.06 MiB 0.34 MiB Total Download Size: 2.02 MiB Total Installed Size: 11.83 MiB Net Upgrade Size: 0.00 MiB :: Proceed with installation? [Y/n] :: Retrieving packages... texinfo-7.1.1-1-riscv64 downloading... tzdata-2024b-1-riscv64 downloading... checking keyring... checking package integrity... loading package files... checking for file conflicts... :: Processing package changes... upgrading tzdata... upgrading texinfo... :: Running post-transaction hooks... (1/1) Updating the info directory file... ==> Building in chroot for [extra] (riscv64)... ==> Synchronizing chroot copy [/var/lib/archbuild/extra-riscv64/root] -> [felix2]...done ==> Making package: composable-kernel 6.0.2-1 (Mon Sep 9 03:19:51 2024) ==> Retrieving sources...  -> Downloading composable-kernel-6.0.2.tar.gz... 
% Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0 100 188k 0 188k 0 0 77293 0 --:--:-- 0:00:02 --:--:-- 200k 100 2303k 0 2303k 0 0 678k 0 --:--:-- 0:00:03 --:--:-- 1252k ==> Validating source files with sha256sums... composable-kernel-6.0.2.tar.gz ... Passed ==> Making package: composable-kernel 6.0.2-1 (Mon Sep 9 03:20:37 2024) ==> Checking runtime dependencies... ==> Installing missing dependencies... resolving dependencies... looking for conflicting packages... warning: dependency cycle detected: warning: libglvnd will be installed before its mesa dependency Package (32) New Version Net Change Download Size extra/comgr 6.0.2-1 153.33 MiB 45.29 MiB extra/default-cursors 2-2 0.00 MiB extra/hsa-rocr 6.0.2-2.1 1.36 MiB 0.40 MiB extra/hsakmt-roct 6.0.0-2 0.26 MiB 0.09 MiB core/hwdata 0.387-1 9.23 MiB core/kmod 33-3 0.27 MiB extra/libdrm 2.4.123-1 1.18 MiB core/libedit 20240517_3.1-1 0.24 MiB extra/libglvnd 1.7.0-1 3.72 MiB extra/libomxil-bellagio 0.9.3-5 0.55 MiB extra/libpciaccess 0.18.1-2 0.05 MiB extra/libx11 1.8.10-1 9.73 MiB extra/libxau 1.0.11-3 0.02 MiB extra/libxcb 1.17.0-1 3.69 MiB extra/libxdmcp 1.1.5-1 0.13 MiB extra/libxext 1.3.6-1 0.29 MiB extra/libxfixes 6.0.1-2 0.03 MiB extra/libxshmfence 1.3.2-2 0.01 MiB extra/libxxf86vm 1.1.5-2 0.03 MiB extra/llvm-libs 18.1.8-4.1 121.19 MiB extra/lm_sensors 1:3.6.0.r41.g31d1f125-3 0.42 MiB extra/mesa 1:24.2.1-1 88.25 MiB extra/numactl 2.0.18-1 0.19 MiB core/pciutils 3.13.0-2 0.34 MiB extra/rocm-device-libs 6.0.2-1 3.13 MiB 0.52 MiB extra/rocm-llvm 6.0.2-1 6901.19 MiB 1187.28 MiB extra/rocminfo 6.0.2-1 0.41 MiB 0.12 MiB extra/wayland 1.23.0-1 0.79 MiB extra/xcb-proto 1.17.0-2 1.02 MiB extra/xorgproto 2024.1-2 1.46 MiB 
extra/hip-runtime-amd 6.0.2-4 7.85 MiB 1.55 MiB extra/rocm-core 6.0.2-2 0.00 MiB 0.00 MiB Total Download Size: 1235.25 MiB Total Installed Size: 7310.35 MiB :: Proceed with installation? [Y/n] :: Retrieving packages... rocm-llvm-6.0.2-1-riscv64 downloading... comgr-6.0.2-1-riscv64 downloading... hip-runtime-amd-6.0.2-4-riscv64 downloading... rocm-device-libs-6.0.2-1-riscv64 downloading... hsa-rocr-6.0.2-2.1-riscv64 downloading... rocminfo-6.0.2-1-riscv64 downloading... hsakmt-roct-6.0.0-2-riscv64 downloading... rocm-core-6.0.2-2-riscv64 downloading... checking keyring... checking package integrity... loading package files... checking for file conflicts... :: Processing package changes... installing rocm-core... installing numactl... installing libpciaccess... installing libdrm... Optional dependencies for libdrm cairo: needed for modetest tool installing xcb-proto... installing xorgproto... installing libxdmcp... installing libxau... installing libxcb... installing libx11... installing libxext... installing libglvnd... installing libxfixes... installing libxshmfence... installing libxxf86vm... installing libedit... installing llvm-libs... installing lm_sensors... Optional dependencies for lm_sensors rrdtool: for logging with sensord perl: for sensor detection and configuration convert [installed] installing default-cursors... Optional dependencies for default-cursors adwaita-cursors: default cursor theme installing wayland... installing libomxil-bellagio... installing mesa... Optional dependencies for mesa opengl-man-pages: for the OpenGL API man pages installing rocm-device-libs... installing comgr... installing hwdata... installing kmod... installing pciutils... Optional dependencies for pciutils which: for update-pciids [installed] grep: for update-pciids [installed] curl: for update-pciids [installed] installing hsakmt-roct... installing hsa-rocr... installing rocminfo... installing rocm-llvm... installing hip-runtime-amd... 
Optional dependencies for hip-runtime-amd inetutils: Print hostname in hipconfig ==> Checking buildtime dependencies... ==> Installing missing dependencies... resolving dependencies... looking for conflicting packages... Package (13) New Version Net Change Download Size extra/cppdap 1.58.0-2 1.48 MiB extra/hicolor-icon-theme 0.18-1 0.05 MiB extra/jsoncpp 1.9.5-3 3.13 MiB extra/libuv 1.48.0-2 0.56 MiB extra/perl-error 0.17029-7 0.04 MiB extra/perl-mailtools 2.21-9 0.10 MiB extra/perl-timedate 2.33-7 0.08 MiB extra/rhash 1.4.4-1 0.31 MiB extra/cmake 3.30.3-1 68.24 MiB extra/git 2.46.0-2 26.98 MiB extra/ninja 1.12.1-1 0.29 MiB extra/openmp 18.1.8-1 26.08 MiB 0.87 MiB extra/rocm-cmake 6.0.2-1 0.11 MiB 0.03 MiB Total Download Size: 0.90 MiB Total Installed Size: 127.45 MiB :: Proceed with installation? [Y/n] :: Retrieving packages... openmp-18.1.8-1-riscv64 downloading... rocm-cmake-6.0.2-1-any downloading... checking keyring... checking package integrity... loading package files... checking for file conflicts... :: Processing package changes... installing perl-error... installing perl-timedate... installing perl-mailtools... installing git... Optional dependencies for git tk: gitk and git gui openssh: ssh transport and crypto perl-libwww: git svn perl-term-readkey: git svn and interactive.singlekey setting perl-io-socket-ssl: git send-email TLS support perl-authen-sasl: git send-email TLS support perl-mediawiki-api: git mediawiki support perl-datetime-format-iso8601: git mediawiki support perl-lwp-protocol-https: git mediawiki https support perl-cgi: gitweb (web interface) support python: git svn & git p4 [installed] subversion: git svn org.freedesktop.secrets: keyring credential helper libsecret: libsecret credential helper [installed] installing cppdap... installing hicolor-icon-theme... installing jsoncpp... Optional dependencies for jsoncpp jsoncpp-doc: documentation installing libuv... installing rhash... installing cmake... 
Optional dependencies for cmake make: for unix Makefile generator [installed] ninja: for ninja generator [pending] qt6-base: cmake-gui installing ninja... installing rocm-cmake... installing openmp... Optional dependencies for openmp cuda: offloading to NVIDIA GPUs hsa-rocr: offloading to AMD GPUs [installed] :: Running post-transaction hooks... (1/1) Warn about old perl modules ==> Retrieving sources...  -> Found composable-kernel-6.0.2.tar.gz ==> WARNING: Skipping all source file integrity checks. ==> Extracting sources...  -> Extracting composable-kernel-6.0.2.tar.gz with bsdtar ==> Starting prepare()... ==> Starting build()... -- The C compiler identification is GNU 14.2.1 -- The CXX compiler identification is Clang 17.0.0 -- Detecting C compiler ABI info -- Detecting C compiler ABI info - done -- Check for working C compiler: /usr/bin/cc - skipped -- Detecting C compile features -- Detecting C compile features - done -- Detecting CXX compiler ABI info -- Detecting CXX compiler ABI info - done -- Check for working CXX compiler: /opt/rocm/bin/hipcc - skipped -- Detecting CXX compile features -- Detecting CXX compile features - done -- Found Git: /usr/bin/git (found version "2.46.0") fatal: not a git repository (or any of the parent directories): .git GPU_TARGETS= checking which targets are supported -- Performing Test COMPILER_HAS_TARGET_ID_gfx908 -- Performing Test COMPILER_HAS_TARGET_ID_gfx908 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx90a -- Performing Test COMPILER_HAS_TARGET_ID_gfx90a - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx940 -- Performing Test COMPILER_HAS_TARGET_ID_gfx940 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx941 -- Performing Test COMPILER_HAS_TARGET_ID_gfx941 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx942 -- Performing Test COMPILER_HAS_TARGET_ID_gfx942 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1030 - Success -- Performing Test 
COMPILER_HAS_TARGET_ID_gfx1100 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1100 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1101 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1101 - Success -- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 -- Performing Test COMPILER_HAS_TARGET_ID_gfx1102 - Success Supported GPU_TARGETS= gfx908;gfx90a;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101;gfx1102 Building CK for the following targets: gfx908;gfx90a;gfx940;gfx941;gfx942;gfx1030;gfx1100;gfx1101;gfx1102 -- Performing Test CMAKE_HAVE_LIBC_PTHREAD -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success -- Found Threads: TRUE -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS -- Performing Test HIP_CLANG_SUPPORTS_PARALLEL_JOBS - Success hip_version_flat=600000000 Adding the fno-offload-uniform-block compiler flag CMAKE_CXX_COMPILER_ID: Clang OpenMP_CXX_LIB_NAMES: libomp;libgomp;libiomp5 OpenMP_gomp_LIBRARY: OpenMP_pthread_LIBRARY: OpenMP_CXX_FLAGS: -fopenmp=libomp -Wno-unused-command-line-argument -- Build with HIP 6.0.0 -- Clang tidy found: 17.0.0git -- Clang tidy checks: 
*,-abseil-*,-android-cloexec-fopen,-cert-msc30-c,-bugprone-exception-escape,-bugprone-macro-parentheses,-cert-env33-c,-cert-msc32-c,-cert-msc50-cpp,-cert-msc51-cpp,-cert-dcl37-c,-cert-dcl51-cpp,-clang-analyzer-alpha.core.CastToStruct,-clang-analyzer-optin.performance.Padding,-clang-diagnostic-deprecated-declarations,-clang-diagnostic-extern-c-compat,-clang-diagnostic-unused-command-line-argument,-cppcoreguidelines-avoid-c-arrays,-cppcoreguidelines-avoid-magic-numbers,-cppcoreguidelines-explicit-virtual-functions,-cppcoreguidelines-init-variables,-cppcoreguidelines-macro-usage,-cppcoreguidelines-non-private-member-variables-in-classes,-cppcoreguidelines-pro-bounds-array-to-pointer-decay,-cppcoreguidelines-pro-bounds-constant-array-index,-cppcoreguidelines-pro-bounds-pointer-arithmetic,-cppcoreguidelines-pro-type-member-init,-cppcoreguidelines-pro-type-reinterpret-cast,-cppcoreguidelines-pro-type-union-access,-cppcoreguidelines-pro-type-vararg,-cppcoreguidelines-special-member-functions,-fuchsia-*,-google-explicit-constructor,-google-readability-braces-around-statements,-google-readability-todo,-google-runtime-int,-google-runtime-references,-hicpp-vararg,-hicpp-braces-around-statements,-hicpp-explicit-conversions,-hicpp-named-parameter,-hicpp-no-array-decay,-hicpp-avoid-c-arrays,-hicpp-signed-bitwise,-hicpp-special-member-functions,-hicpp-uppercase-literal-suffix,-hicpp-use-auto,-hicpp-use-equals-default,-hicpp-use-override,-llvm-header-guard,-llvm-include-order,-llvmlibc-restrict-system-libc-headers,-llvmlibc-callee-namespace,-llvmlibc-implementation-in-namespace,-llvm-else-after-return,-llvm-qualified-auto,-misc-misplaced-const,-misc-non-private-member-variables-in-classes,-misc-no-recursion,-modernize-avoid-bind,-modernize-avoid-c-arrays,-modernize-pass-by-value,-modernize-use-auto,-modernize-use-default-member-init,-modernize-use-equals-default,-modernize-use-trailing-return-type,-modernize-use-transparent-functors,-performance-unnecessary-value-param,-readability
-braces-around-statements,-readability-else-after-return,-readability-function-cognitive-complexity,-readability-isolate-declaration,-readability-magic-numbers,-readability-named-parameter,-readability-uppercase-literal-suffix,-readability-convert-member-functions-to-static,-readability-qualified-auto,-readability-redundant-string-init,-bugprone-narrowing-conversions,-cppcoreguidelines-narrowing-conversions,-altera-struct-pack-align,-cppcoreguidelines-prefer-member-initializer CMAKE_CXX_FLAGS: instance should be built for all types! adding instance device_avg_pool3d_bwd_instance instance should be built for all types! adding instance device_batched_gemm_instance instance should be built for all types! adding instance device_batched_gemm_add_relu_gemm_add_instance instance should be built for all types! adding instance device_batched_gemm_bias_permute_instance instance should be built for all types! adding instance device_batched_gemm_gemm_instance instance should be built for all types! Found only dl instances, but DL_KERNELS is not set. Skipping. instance should be built for all types! adding instance device_batched_gemm_reduce_instance instance should be built for all types! adding instance device_batched_gemm_softmax_gemm_instance instance should be built for all types! adding instance device_batched_gemm_softmax_gemm_permute_instance instance should be built for all types! adding instance device_batchnorm_instance instance should be built for all types! adding instance device_column_to_image_instance instance should be built for all types! adding instance device_contraction_bilinear_instance instance should be built for all types! adding instance device_contraction_scale_instance instance should be built for all types! adding instance device_conv1d_bwd_data_instance instance should be built for all types! 
adding instance device_conv2d_bwd_data_instance removing dl instance device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f32_instance.cpp removing dl instance device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_f16_instance.cpp removing dl instance device_conv2d_bwd_data_dl_nhwc_kyxc_nhwk_int8_instance.cpp instance should be built for all types! adding instance device_conv2d_fwd_instance instance should be built for all types! adding instance device_conv2d_fwd_bias_relu_instance instance should be built for all types! adding instance device_conv2d_fwd_bias_relu_add_instance instance should be built for all types! adding instance device_conv3d_bwd_data_instance instance should be built for all types! adding instance device_elementwise_instance instance should be built for all types! adding instance device_elementwise_normalization_instance instance should be built for all types! adding instance device_gemm_instance removing dl instance device_gemm_dl_f32_f32_f32_mk_kn_mn_instance.cpp removing dl instance device_gemm_dl_f32_f32_f32_mk_nk_mn_instance.cpp removing dl instance device_gemm_dl_f32_f32_f32_km_kn_mn_instance.cpp removing dl instance device_gemm_dl_f32_f32_f32_km_nk_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_mk_kn_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_mk_kn_mn_irregular_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_mk_nk_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_mk_nk_mn_irregular_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_km_kn_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_km_kn_mn_irregular_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_km_nk_mn_instance.cpp removing dl instance device_gemm_dl_f16_f16_f16_km_nk_mn_irregular_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_mk_kn_mn_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_mk_kn_mn_irregular_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_mk_nk_mn_instance.cpp 
removing dl instance device_gemm_dl_i8_i8_i8_mk_nk_mn_irregular_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_km_kn_mn_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_km_kn_mn_irregular_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_km_nk_mn_instance.cpp removing dl instance device_gemm_dl_i8_i8_i8_km_nk_mn_irregular_instance.cpp instance should be built for all types! adding instance device_gemm_add_add_fastgelu_instance instance should be built for all types! adding instance device_gemm_add_fastgelu_instance instance should be built for all types! adding instance device_gemm_add_multiply_instance instance should be built for all types! adding instance device_gemm_add_relu_add_layernorm_instance instance should be built for all types! adding instance device_gemm_bias_add_reduce_instance instance should be built for all types! adding instance device_gemm_bilinear_instance instance should be built for all types! adding instance device_gemm_fastgelu_instance instance should be built for all types! adding instance device_gemm_multiply_add_instance instance should be built for all types! adding instance device_gemm_reduce_instance instance should be built for all types! adding instance device_gemm_splitk_instance instance should be built for all types! adding instance device_gemm_streamk_instance instance should be built for all types! adding instance device_grouped_conv1d_bwd_weight_instance instance should be built for all types! adding instance device_grouped_conv1d_fwd_instance instance should be built for all types! adding instance device_grouped_conv2d_bwd_data_instance instance should be built for all types! adding instance device_grouped_conv2d_bwd_weight_instance instance should be built for all types! 
adding instance device_grouped_conv2d_fwd_instance removing dl instance dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f16_instance.cpp removing dl instance dl/device_grouped_conv2d_fwd_dl_gnhwc_gkyxc_gnhwk_f32_instance.cpp removing dl instance dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f16_instance.cpp removing dl instance dl/device_grouped_conv2d_fwd_dl_nhwgc_gkyxc_nhwgk_f32_instance.cpp instance should be built for all types! adding instance device_grouped_conv3d_bwd_data_instance instance should be built for all types! adding instance device_grouped_conv3d_bwd_weight_instance instance should be built for all types! adding instance device_grouped_conv3d_fwd_instance instance should be built for all types! adding instance device_grouped_gemm_instance instance should be built for all types! adding instance device_grouped_gemm_bias_instance instance should be built for all types! adding instance device_grouped_gemm_fastgelu_instance instance should be built for all types! adding instance device_grouped_gemm_fixed_nk_instance instance should be built for all types! adding instance device_image_to_column_instance instance should be built for all types! adding instance device_max_pool_bwd_instance instance should be built for all types! adding instance device_normalization_instance instance should be built for all types! adding instance device_pool3d_fwd_instance instance should be built for all types! 
adding instance device_quantization_instance removing dl instance conv2d_fwd/device_conv2d_dl_perlayer_quantization_int8_instance.cpp removing dl instance conv2d_fwd/device_conv2d_dl_perchannel_quantization_int8_instance.cpp removing dl instance conv2d_fwd/device_conv2d_dl_bias_perlayer_quantization_int8_instance.cpp removing dl instance conv2d_fwd/device_conv2d_dl_bias_perchannel_quantization_int8_instance.cpp removing dl instance gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_kn_mn_instance.cpp removing dl instance gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_km_nk_mn_instance.cpp removing dl instance gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_kn_mn_instance.cpp removing dl instance gemm/device_gemm_quantization_dl_c_shuffle_i8_i8_i8_mk_nk_mn_instance.cpp instance should be built for all types! adding instance device_reduce_instance instance should be built for all types! adding instance device_softmax_instance -- Configuring done (262.8s) -- Generating done (3.0s) -- Build files have been written to: /build/composable-kernel/src/build [1/451] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f16_instance.cpp.o [2/451] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_bf16_instance.cpp.o [3/451] Building CXX object library/src/tensor_operation_instance/gpu/avg_pool3d_bwd/CMakeFiles/device_avg_pool3d_bwd_instance.dir/device_avg_pool3d_bwd_ndhwc_f32_instance.cpp.o [4/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f16_f16_f16_gkm_gkn_gmn_instance.cpp.o [5/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_bf16_bf16_bf16_gmk_gnk_gmn_instance.cpp.o 
[6/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f16_f16_f16_gmk_gnk_gmn_instance.cpp.o [7/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_bf16_bf16_bf16_gmk_gkn_gmn_instance.cpp.o [8/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f16_f16_f16_gkm_gnk_gmn_instance.cpp.o [9/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f16_f16_f16_gmk_gkn_gmn_instance.cpp.o [10/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_bf16_bf16_bf16_gkm_gkn_gmn_instance.cpp.o [11/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f32_f32_f32_gmk_gkn_gmn_instance.cpp.o [12/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_bf16_bf16_bf16_gkm_gnk_gmn_instance.cpp.o [13/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f32_f32_f32_gkm_gkn_gmn_instance.cpp.o [14/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f32_f32_f32_gmk_gnk_gmn_instance.cpp.o [15/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_f32_f32_f32_gkm_gnk_gmn_instance.cpp.o [16/451] Building CXX object 
library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_int8_int8_int8_gmk_gnk_gmn_instance.cpp.o [17/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_int8_int8_int8_gmk_gkn_gmn_instance.cpp.o [18/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_int8_int8_int8_gkm_gnk_gmn_instance.cpp.o [19/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm/CMakeFiles/device_batched_gemm_instance.dir/device_batched_gemm_xdl_int8_int8_int8_gkm_gkn_gmn_instance.cpp.o [20/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_add_relu_gemm_add/CMakeFiles/device_batched_gemm_add_relu_gemm_add_instance.dir/device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [21/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_add_relu_gemm_add/CMakeFiles/device_batched_gemm_add_relu_gemm_add_instance.dir/device_batched_gemm_add_relu_gemm_add_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp.o [22/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_gemm/CMakeFiles/device_batched_gemm_gemm_instance.dir/device_batched_gemm_gemm_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [23/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_reduce/CMakeFiles/device_batched_gemm_reduce_instance.dir/device_batched_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_gmk_gnk_gmn_instance.cpp.o [24/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_gemm/CMakeFiles/device_batched_gemm_gemm_instance.dir/device_batched_gemm_gemm_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gon_gmo_instance.cpp.o [25/451] Building CXX object 
library/src/tensor_operation_instance/gpu/batched_gemm_reduce/CMakeFiles/device_batched_gemm_reduce_instance.dir/device_batched_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_gmk_gkn_gmn_instance.cpp.o [26/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_bias_permute/CMakeFiles/device_batched_gemm_bias_permute_instance.dir/device_batched_gemm_bias_permute_m2_n3_k1_xdl_c_shuffle_f16_f16_f16_f16_instance.cpp.o [27/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_reduce/CMakeFiles/device_batched_gemm_reduce_instance.dir/device_batched_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_gkm_gkn_gmn_instance.cpp.o [28/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_reduce/CMakeFiles/device_batched_gemm_reduce_instance.dir/device_batched_gemm_reduce_xdl_cshuffle_f16_f16_f16_f32_f32_gkm_gnk_gmn_instance.cpp.o [29/451] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f16_instance.cpp.o [30/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute/CMakeFiles/device_batched_gemm_softmax_gemm_permute_instance.dir/device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [31/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute/CMakeFiles/device_batched_gemm_softmax_gemm_permute_instance.dir/device_batched_gemm_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp.o [32/451] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f32_instance.cpp.o [33/451] Building CXX object 
library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm/CMakeFiles/device_batched_gemm_softmax_gemm_instance.dir/device_batched_gemm_softmax_gemm_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [34/451] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o FAILED: library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o /opt/rocm/bin/hipcc -DCK_ENABLE_BF16 -DCK_ENABLE_BF8 -DCK_ENABLE_FP16 -DCK_ENABLE_FP32 -DCK_ENABLE_FP64 -DCK_ENABLE_FP8 -DCK_ENABLE_INT8 -DINSTANCES_ONLY -DUSE_PROF_API=1 -D__HIP_PLATFORM_AMD__=1 -D__HIP_PLATFORM_HCC__=1 -I/build/composable-kernel/src/composable_kernel-rocm-6.0.2/library/include -I/build/composable-kernel/src/composable_kernel-rocm-6.0.2/include -I/build/composable-kernel/src/build/include -O3 -DNDEBUG -std=c++17 -fPIC -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Werror -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-missing-field-initializers -Wno-deprecated-declarations -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Werror -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-command-line-argument -Wno-weak-vtables 
-Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-bit-int-extension -fno-offload-uniform-block -x hip --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx940 --offload-arch=gfx941 --offload-arch=gfx942 --offload-arch=gfx1030 --offload-arch=gfx1100 --offload-arch=gfx1101 --offload-arch=gfx1102 -MD -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o -MF library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o.d -o library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o -c /build/composable-kernel/src/composable_kernel-rocm-6.0.2/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_f32_instance.cpp free(): invalid pointer PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. 
Program arguments: /opt/rocm/lib/llvm/bin/clang-17 -cc1 -triple amdgcn-amd-amdhsa -aux-triple riscv64-unknown-linux-gnu -emit-obj -disable-free -clear-ast-before-backend -main-file-name device_batchnorm_backward_f32_instance.cpp -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=none -fno-rounding-math -mconstructor-aliases -aux-target-cpu generic-rv64 -aux-target-feature +m -aux-target-feature +a -aux-target-feature +f -aux-target-feature +d -aux-target-feature +c -aux-target-feature +zicsr -aux-target-feature -e -aux-target-feature -h -aux-target-feature -zihintpause -aux-target-feature -zfhmin -aux-target-feature -zfh -aux-target-feature -zfinx -aux-target-feature -zdinx -aux-target-feature -zhinxmin -aux-target-feature -zhinx -aux-target-feature -zba -aux-target-feature -zbb -aux-target-feature -zbc -aux-target-feature -zbs -aux-target-feature -zbkb -aux-target-feature -zbkc -aux-target-feature -zbkx -aux-target-feature -zknd -aux-target-feature -zkne -aux-target-feature -zknh -aux-target-feature -zksed -aux-target-feature -zksh -aux-target-feature -zkr -aux-target-feature -zkn -aux-target-feature -zks -aux-target-feature -zkt -aux-target-feature -zk -aux-target-feature -zmmul -aux-target-feature -v -aux-target-feature -zvl32b -aux-target-feature -zvl64b -aux-target-feature -zvl128b -aux-target-feature -zvl256b -aux-target-feature -zvl512b -aux-target-feature -zvl1024b -aux-target-feature -zvl2048b -aux-target-feature -zvl4096b -aux-target-feature -zvl8192b -aux-target-feature -zvl16384b -aux-target-feature -zvl32768b -aux-target-feature -zvl65536b -aux-target-feature -zve32x -aux-target-feature -zve32f -aux-target-feature -zve64x -aux-target-feature -zve64f -aux-target-feature -zve64d -aux-target-feature -zicbom -aux-target-feature -zicboz -aux-target-feature -zicbop -aux-target-feature -zifencei -aux-target-feature -zawrs -aux-target-feature -svnapot -aux-target-feature -svpbmt -aux-target-feature -svinval -aux-target-feature 
-xtheadba -aux-target-feature -xtheadbb -aux-target-feature -xtheadbs -aux-target-feature -xtheadcmo -aux-target-feature -xtheadcondmov -aux-target-feature -xtheadfmemidx -aux-target-feature -xtheadmac -aux-target-feature -xtheadmemidx -aux-target-feature -xtheadmempair -aux-target-feature -xtheadsync -aux-target-feature -xtheadvdot -aux-target-feature -xventanacondops -aux-target-feature -experimental-zihintntl -aux-target-feature -experimental-zca -aux-target-feature -experimental-zcb -aux-target-feature -experimental-zcd -aux-target-feature -experimental-zcf -aux-target-feature -experimental-zfa -aux-target-feature -experimental-zicond -aux-target-feature -experimental-zvfh -aux-target-feature -experimental-ztso -aux-target-feature -experimental-zvkb -aux-target-feature -experimental-zvkg -aux-target-feature -experimental-zvkn -aux-target-feature -experimental-zvknha -aux-target-feature -experimental-zvknhb -aux-target-feature -experimental-zvkned -aux-target-feature -experimental-zvks -aux-target-feature -experimental-zvksed -aux-target-feature -experimental-zvksh -aux-target-feature +relax -aux-target-feature -save-restore -fcuda-is-device -mllvm -amdgpu-internalize-symbols -fcuda-allow-variadic-functions -fvisibility=hidden -fapply-global-visibility-to-externs -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/hip.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/ocml.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/ockl.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/oclc_daz_opt_off.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/oclc_unsafe_math_off.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/oclc_finite_only_off.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/oclc_correctly_rounded_sqrt_on.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/oclc_wavefrontsize64_off.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/oclc_isa_version_1030.bc -mlink-builtin-bitcode /opt/rocm/amdgcn/bitcode/oclc_abi_version_500.bc -target-cpu gfx1030 
-debugger-tuning=gdb -resource-dir /opt/rocm/lib/llvm/lib/clang/17.0.0 -dependency-file library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o.d -MT library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f32_instance.cpp.o -sys-header-deps -internal-isystem /opt/rocm/lib/llvm/lib/clang/17.0.0/include/cuda_wrappers -idirafter /opt/rocm/include -include __clang_hip_runtime_wrapper.h -c-isystem /opt/rocm/llvm/include/gpu-none-llvm -isystem /opt/rocm/include -D CK_ENABLE_BF16 -D CK_ENABLE_BF8 -D CK_ENABLE_FP16 -D CK_ENABLE_FP32 -D CK_ENABLE_FP64 -D CK_ENABLE_FP8 -D CK_ENABLE_INT8 -D INSTANCES_ONLY -D USE_PROF_API=1 -D __HIP_PLATFORM_AMD__=1 -D __HIP_PLATFORM_HCC__=1 -I /build/composable-kernel/src/composable_kernel-rocm-6.0.2/library/include -I /build/composable-kernel/src/composable_kernel-rocm-6.0.2/include -I /build/composable-kernel/src/build/include -D NDEBUG -internal-isystem /usr/lib/gcc/riscv64-unknown-linux-gnu/14.2.1/../../../../include/c++/14.2.1 -internal-isystem /usr/lib/gcc/riscv64-unknown-linux-gnu/14.2.1/../../../../include/c++/14.2.1/riscv64-unknown-linux-gnu -internal-isystem /usr/lib/gcc/riscv64-unknown-linux-gnu/14.2.1/../../../../include/c++/14.2.1/backward -internal-isystem /usr/lib/gcc/riscv64-unknown-linux-gnu/14.2.1/../../../../include/c++/14.2.1 -internal-isystem /usr/lib/gcc/riscv64-unknown-linux-gnu/14.2.1/../../../../include/c++/14.2.1/riscv64-unknown-linux-gnu -internal-isystem /usr/lib/gcc/riscv64-unknown-linux-gnu/14.2.1/../../../../include/c++/14.2.1/backward -internal-isystem /opt/rocm/lib/llvm/lib/clang/17.0.0/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/riscv64-unknown-linux-gnu/14.2.1/../../../../riscv64-unknown-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem 
/opt/rocm/lib/llvm/lib/clang/17.0.0/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/riscv64-unknown-linux-gnu/14.2.1/../../../../riscv64-unknown-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -source-date-epoch 1725823174 -O3 -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Werror -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Wno-missing-field-initializers -Wno-deprecated-declarations -Wall -Wextra -Wcomment -Wendif-labels -Wformat -Winit-self -Wreturn-type -Wsequence-point -Wswitch -Wtrigraphs -Wundef -Wuninitialized -Wunreachable-code -Wunused -Wno-reserved-identifier -Werror -Wno-option-ignored -Wsign-compare -Wno-extra-semi-stmt -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-conversion -Wno-double-promotion -Wno-exit-time-destructors -Wno-extra-semi -Wno-float-conversion -Wno-gnu-anonymous-struct -Wno-gnu-zero-variadic-macro-arguments -Wno-missing-prototypes -Wno-nested-anon-types -Wno-padded -Wno-return-std-move-in-c++11 -Wno-shorten-64-to-32 -Wno-sign-conversion -Wno-unknown-warning-option -Wno-unused-command-line-argument -Wno-weak-vtables -Wno-covered-switch-default -Wno-unsafe-buffer-usage -Wno-bit-int-extension -std=c++17 -fdeprecated-macro -fno-autolink -fdebug-compilation-dir=/build/composable-kernel/src/build -ferror-limit 19 -fhip-new-launch-api -fno-signed-char -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -vectorize-loops -vectorize-slp -mllvm -amdgpu-early-inline-all=true -mllvm -amdgpu-function-calls=false -cuid=541c1f117c52005c -fcuda-allow-variadic-functions -fno-offload-uniform-block -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/device_batchnorm_backward_f32_instance-gfx1030-18415e.o -x hip 
/build/composable-kernel/src/composable_kernel-rocm-6.0.2/library/src/tensor_operation_instance/gpu/batchnorm/device_batchnorm_backward_f32_instance.cpp 1. parser at end of file 2. Optimizer #0 0x0000003670ea2274 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/opt/rocm/lib/llvm/bin/clang-17+0x235b274) #1 0x0000003670e9fcc0 (/opt/rocm/lib/llvm/bin/clang-17+0x2358cc0) #2 0x000000353d55e5b0 (linux-vdso.so.1+0x5b0) #3 0x000000353cef52d4 (/usr/lib/libc.so.6+0x782d4) #4 0x000000353ceb5f6e raise (/usr/lib/libc.so.6+0x38f6e) #5 0x000000353cea4448 abort (/usr/lib/libc.so.6+0x27448) #6 0x000000353ceeb804 __libc_fatal (/usr/lib/libc.so.6+0x6e804) #7 0x000000353cefddf0 (/usr/lib/libc.so.6+0x80df0) #8 0x000000353ceffc66 (/usr/lib/libc.so.6+0x82c66) #9 0x000000353cf01c74 free (/usr/lib/libc.so.6+0x84c74) #10 0x0000003670c9a2dc llvm::GVNPass::ValueTable::clear() (/opt/rocm/lib/llvm/bin/clang-17+0x21532dc) #11 0x0000003670c9a528 llvm::GVNPass::cleanupGlobalSets() (/opt/rocm/lib/llvm/bin/clang-17+0x2153528) #12 0x0000003670cab010 llvm::GVNPass::runImpl(llvm::Function&, llvm::AssumptionCache&, llvm::DominatorTree&, llvm::TargetLibraryInfo const&, llvm::AAResults&, llvm::MemoryDependenceResults*, llvm::LoopInfo*, llvm::OptimizationRemarkEmitter*, llvm::MemorySSA*) (/opt/rocm/lib/llvm/bin/clang-17+0x2164010) #13 0x0000003670cabc64 llvm::GVNPass::run(llvm::Function&, llvm::AnalysisManager&) (/opt/rocm/lib/llvm/bin/clang-17+0x2164c64) #14 0x00000036710a5564 (/opt/rocm/lib/llvm/bin/clang-17+0x255e564) #15 0x000000366f92428c (/opt/rocm/lib/llvm/bin/clang-17+0xddd28c) #16 0x000000366ffda232 llvm::CGSCCToFunctionPassAdaptor::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) (/opt/rocm/lib/llvm/bin/clang-17+0x1493232) #17 0x000000366f908f0a (/opt/rocm/lib/llvm/bin/clang-17+0xdc1f0a) #18 0x000000366ffd4fac llvm::PassManager, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&>::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, 
llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) (/opt/rocm/lib/llvm/bin/clang-17+0x148dfac) #19 0x0000003671f782f6 (/opt/rocm/lib/llvm/bin/clang-17+0x34312f6) #20 0x000000366ffdb478 llvm::DevirtSCCRepeatedPass::run(llvm::LazyCallGraph::SCC&, llvm::AnalysisManager&, llvm::LazyCallGraph&, llvm::CGSCCUpdateResult&) (/opt/rocm/lib/llvm/bin/clang-17+0x1494478) #21 0x0000003671f78344 (/opt/rocm/lib/llvm/bin/clang-17+0x3431344) #22 0x000000366ffd6a5a llvm::ModuleToPostOrderCGSCCPassAdaptor::run(llvm::Module&, llvm::AnalysisManager&) (/opt/rocm/lib/llvm/bin/clang-17+0x148fa5a) #23 0x000000367212735c llvm::ModuleInlinerWrapperPass::run(llvm::Module&, llvm::AnalysisManager&) (/opt/rocm/lib/llvm/bin/clang-17+0x35e035c) #24 0x0000003671f77a02 (/opt/rocm/lib/llvm/bin/clang-17+0x3430a02) #25 0x00000036708ff13a llvm::PassManager>::run(llvm::Module&, llvm::AnalysisManager&) (/opt/rocm/lib/llvm/bin/clang-17+0x1db813a) #26 0x00000036710b54f6 (/opt/rocm/lib/llvm/bin/clang-17+0x256e4f6) #27 0x00000036710b7f2a clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr, std::unique_ptr>) (/opt/rocm/lib/llvm/bin/clang-17+0x2570f2a) #28 0x0000003671f1f270 (/opt/rocm/lib/llvm/bin/clang-17+0x33d8270) #29 0x0000003672d0dccc clang::ParseAST(clang::Sema&, bool, bool) (/opt/rocm/lib/llvm/bin/clang-17+0x41c6ccc) #30 0x000000367181d298 clang::FrontendAction::Execute() (/opt/rocm/lib/llvm/bin/clang-17+0x2cd6298) #31 0x00000036717b84ce clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/opt/rocm/lib/llvm/bin/clang-17+0x2c714ce) #32 0x00000036718d5444 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/opt/rocm/lib/llvm/bin/clang-17+0x2d8e444) #33 0x000000366f8ddd12 cc1_main(llvm::ArrayRef, char const*, void*) (/opt/rocm/lib/llvm/bin/clang-17+0xd96d12) #34 0x000000366f8daa96 
(/opt/rocm/lib/llvm/bin/clang-17+0xd93a96) #35 0x000000366f8dc870 clang_main(int, char**, llvm::ToolContext const&) (/opt/rocm/lib/llvm/bin/clang-17+0xd95870) #36 0x000000366f81ce6c main (/opt/rocm/lib/llvm/bin/clang-17+0xcd5e6c) #37 0x000000353cea498e (/usr/lib/libc.so.6+0x2798e) #38 0x000000353cea4a3a __libc_start_main (/usr/lib/libc.so.6+0x27a3a) #39 0x000000366f8d5f18 _start (/opt/rocm/lib/llvm/bin/clang-17+0xd8ef18) clang: error: unable to execute command: Aborted (core dumped) clang: error: clang frontend command failed due to signal (use -v to see invocation) clang version 17.0.0 Target: riscv64-unknown-linux-gnu Thread model: posix InstalledDir: /opt/rocm/llvm/bin clang: note: diagnostic msg: Error generating preprocessed source(s). [35/451] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_bf16_instance.cpp.o [36/451] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_forward_f64_instance.cpp.o [37/451] Building CXX object library/src/tensor_operation_instance/gpu/batchnorm/CMakeFiles/device_batchnorm_instance.dir/device_batchnorm_backward_f16_instance.cpp.o [38/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute/CMakeFiles/device_batched_gemm_softmax_gemm_permute_instance.dir/device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_f16_f16_f16_f16_gmk_gnk_gno_gmo_instance.cpp.o [39/451] Building CXX object library/src/tensor_operation_instance/gpu/batched_gemm_softmax_gemm_permute/CMakeFiles/device_batched_gemm_softmax_gemm_permute_instance.dir/device_batched_gemm_bias_softmax_gemm_permute_xdl_cshuffle_bf16_bf16_bf16_bf16_gmk_gnk_gno_gmo_instance.cpp.o ninja: build stopped: subcommand failed. ==> ERROR: A failure occurred in build().  Aborting... 
==> ERROR: Build failed, check /var/lib/archbuild/extra-riscv64/felix2/build receiving incremental file list composable-kernel-6.0.2-1-riscv64-build.log composable-kernel-6.0.2-1-riscv64-prepare.log sent 62 bytes received 6,714 bytes 13,552.00 bytes/sec total size is 36,534 speedup is 5.39