CI/BUG: add native jobs for s390x, fix bug in pack_inner#30819

Open
AlekseiNikiforovIBM wants to merge 10 commits into numpy:main from AlekseiNikiforovIBM:s390x_ci

@AlekseiNikiforovIBM
Contributor

@AlekseiNikiforovIBM AlekseiNikiforovIBM commented Feb 11, 2026

Rename workflow linux-ppc64le.yml to linux-actionspz.yml.
Copy and update ppc64le workflow for s390x.
Disable qemu-based workflows for s390x.

Fix big endian issue in pack_inner.

Closes: #30817

@mkumatag @rgommers
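
For readers unfamiliar with this class of bug, here is a minimal Python sketch of how a byte-order problem can arise when a bitmask held in a 64-bit integer is written out through its in-memory bytes. It is an illustration only and not the code changed in this PR; the function names are hypothetical.

```python
import struct


def bytes_via_native_store(mask, n, byteorder):
    """Emulate storing a 64-bit integer to memory and reading its first n bytes.

    byteorder is "little" or "big" and stands in for the host CPU's layout.
    """
    fmt = "<Q" if byteorder == "little" else ">Q"
    return struct.pack(fmt, mask)[:n]


def bytes_via_shifts(mask, n):
    """Extract the n lowest-order bytes arithmetically, independent of host layout."""
    return bytes((mask >> (8 * j)) & 0xFF for j in range(n))


mask = 0x0201  # two packed output bytes: 0x01 first, then 0x02

print(bytes_via_native_store(mask, 2, "little"))  # low-order bytes come first
print(bytes_via_native_store(mask, 2, "big"))     # high-order (zero) bytes come first
print(bytes_via_shifts(mask, 2))                  # same result on either layout
```

The shift-based variant gives the same bytes regardless of host layout, which is the property an endian-safe store needs.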

@AlekseiNikiforovIBM AlekseiNikiforovIBM marked this pull request as draft February 11, 2026 15:05
@mkumatag

@sandeepgupta12 ptal

@AlekseiNikiforovIBM
Contributor Author

AlekseiNikiforovIBM commented Feb 11, 2026

So, currently clang tests are failing on ppc64le and s390x, and clang and gcc tests with forced vectorization fail on s390x.

Can we proceed with these workflows now or should tests be fixed first? I'm already looking into s390x failures.

@AlekseiNikiforovIBM AlekseiNikiforovIBM marked this pull request as ready for review February 11, 2026 16:39
@AlekseiNikiforovIBM
Contributor Author

This change fixes s390x tests with forced zvector:

663107d

I could also split it into a separate PR if you'd like.

Now there are only some failures when clang is used.

Member

@rgommers rgommers left a comment

Thanks @AlekseiNikiforovIBM. This is a bit tricky to review (let's blame the GitHub UI), so it'd be great if you could revert the name change and leave it at linux-ppc64le.yml for now. That will make the diff much smaller, and it'll be visible what you tweaked in the ppc64le job.

fail-fast: false
matrix:
  config:
    - name: "GCC (ppc64le)"
Member

This whole job matrix is ppc64le, so this name change seems a bit unnecessary.

Contributor Author

It looks like the displayed name is built from the title of the workflow file followed by the job name. Without this change there would be two jobs named "Native ppc64le and s390x Linux Test / GCC", one for ppc64le and one for s390x.
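
A small Python sketch of that collision, assuming the displayed name is simply the workflow title, " / ", and the job name, as described above. The helper names and the "GCC (s390x)" job name are hypothetical and only serve the illustration.

```python
from collections import Counter


def display_names(workflow_title, job_names):
    # Assumed composition rule: "<workflow title> / <job name>".
    return [f"{workflow_title} / {job}" for job in job_names]


def duplicates(names):
    # Names that occur more than once cannot be told apart in the checks list.
    return sorted(name for name, count in Counter(names).items() if count > 1)


title = "Native ppc64le and s390x Linux Test"
ambiguous = display_names(title, ["GCC", "GCC"])
distinct = display_names(title, ["GCC (ppc64le)", "GCC (s390x)"])

print(duplicates(ambiguous))
print(duplicates(distinct))
```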

Member

Good catch, export doesn't survive across job steps

- name: "GCC with zvector (s390x)"
  args: "-Dallow-noblas=false -Dcpu-baseline=vx"
- name: "clang with zvector (s390x)"
  args: "-Dallow-noblas=false -Dcpu-baseline=vx"
Member

We had two jobs under QEMU; four seems a little excessive. Can you do GCC and Clang with zvector here (or vice versa)?

Contributor Author

They produce different results, so I think it's better to have all 4 of them. Considering the issues with clang, maybe the clang jobs should be dropped until all issues with clang builds are resolved?

Contributor

With clang 18, ppc64le tests are passing. See here: https://github.com/numpy/numpy/actions/runs/21948675141/job/63393053952

Contributor Author

@AlekseiNikiforovIBM AlekseiNikiforovIBM Feb 12, 2026

I don't see clang-18 being used there. I see gcc. From linked build log:

C compiler for the host machine: cc (gcc 13.3.0 "cc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0")
C linker for the host machine: cc ld.bfd 2.42
C++ compiler for the host machine: c++ (gcc 13.3.0 "c++ (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0")
C++ linker for the host machine: c++ ld.bfd 2.42

Clang-20 logs for comparison:

C compiler for the host machine: clang-20 (clang 20.1.2 "Ubuntu clang version 20.1.2 (0ubuntu1~24.04.2)")
C linker for the host machine: clang-20 ld.bfd 2.42
C++ compiler for the host machine: clang++-20 (clang 20.1.2 "Ubuntu clang version 20.1.2 (0ubuntu1~24.04.2)")
C++ linker for the host machine: clang++-20 ld.bfd 2.42
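
The same check can be scripted. Below is a minimal Python sketch that pulls the compiler family and version out of "compiler for the host machine" lines like the ones quoted above. The regular expression is fitted to these quoted lines only; the helper name is hypothetical, and other Meson versions may format the line differently.

```python
import re

# Matches e.g.: C compiler for the host machine: cc (gcc 13.3.0 "...")
_COMPILER_LINE = re.compile(
    r"\b(?P<lang>C\+\+|C) compiler for the host machine: "
    r"(?P<exe>\S+) \((?P<family>\w+) (?P<version>[\d.]+)"
)


def compilers_from_log(log_text):
    """Return {language: (family, version)} for each compiler line found."""
    found = {}
    for line in log_text.splitlines():
        match = _COMPILER_LINE.search(line)
        if match:
            found[match["lang"]] = (match["family"], match["version"])
    return found


sample = (
    'C compiler for the host machine: clang-20 (clang 20.1.2 '
    '"Ubuntu clang version 20.1.2 (0ubuntu1~24.04.2)")\n'
    "C linker for the host machine: clang-20 ld.bfd 2.42\n"
)
print(compilers_from_log(sample))
```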

@rgommers rgommers added the 03 - Maintenance, component: CI, and component: SIMD labels Feb 11, 2026
@rgommers rgommers requested a review from seiko2plus February 11, 2026 20:25
@rgommers
Member

> I could also split it into a separate PR if you'd like.

A single PR seems fine to me, thanks.

@rgommers rgommers changed the title Implement workflows for s390x native CI CI/BUG: Implement workflows for s390x native CI Feb 12, 2026
@rgommers rgommers changed the title CI/BUG: Implement workflows for s390x native CI CI/BUG: add native jobs for s390x, fix bug in pack_inner Feb 12, 2026
Add parameters for runner and compiler
Only 16-byte vectorization path is tested for big endian.
It contains fixes for numpy testsuite, compared to default clang-18
Too many tests still fail and need investigation.
@AlekseiNikiforovIBM
Contributor Author

I've disabled clang builds and tests for s390x while there are still too many issues to be fixed. I think enabling them later would be a good choice.

Newer clang from Fedora 43 builds numpy and passes tests better than clang from Ubuntu 24.04
AT_HWCAP is now a hex number and needs to be decoded.
https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=e5893e6349541d871e8a25120bca014551d13ff5

Use /proc/cpuinfo to obtain the same information.
Too many failing tests which need to be looked at.
@AlekseiNikiforovIBM
Contributor Author

I've updated the PR.

I've used Docker and the fedora:43 image to build and test numpy with clang. Fedora has a newer clang, and maybe clang is patched differently there, so the testsuite detects no miscompilations.

Only 1 test failed and needed to be fixed: test_cpu_features.py::Test_ZARCH_Features::test_features.

AT_HWCAP is no longer decoded in glibc output:
https://sourceware.org/git/?p=glibc.git;a=commit;h=e5893e6349541d871e8a25120bca014551d13ff5
Now it's provided as a hex number:

$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
AT_HWCAP:             0x67ffff

Luckily, on s390x the same feature list is available in /proc/cpuinfo, and I've switched to using it:

$ grep feature /proc/cpuinfo 
features        : esan3 zarch stfle msa ldisp eimm dfp edat etf3eh highgprs te vx vxd vxe gs vxe2 vxp sort dflt pcimio sie 
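
A minimal Python sketch of reading that line, assuming the `features` key and the space-separated layout shown above. The function names are hypothetical, and this is not the code the test in this PR uses.

```python
from pathlib import Path


def parse_cpuinfo_features(text):
    """Return the feature names from the first 'features' line, or [] if absent."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "features":
            return value.split()
    return []


def read_cpu_features(path="/proc/cpuinfo"):
    # Only meaningful on Linux; the 'features' key is what s390x prints above.
    return parse_cpuinfo_features(Path(path).read_text())


sample = "features        : esan3 zarch stfle msa vx vxd vxe \n"
print(parse_cpuinfo_features(sample))
```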

Unfortunately, I don't think similar output is available for ppc64le. Once the same glibc change shows up there, the hex AT_HWCAP value would need to be decoded for ppc64le, or some other approach would have to be used.
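
To show what decoding the hex value involves, here is a minimal Python sketch using s390x, since both forms are quoted above. The bit-to-name table is an assumption: it is ordered to match the /proc/cpuinfo line above, with `vxp2` and `nnpa` filled in from memory for the two unset bits, and should be checked against the kernel headers before any real use. A ppc64le table would be different and is not attempted here.

```python
# Assumed s390x HWCAP bit order (bit 0 first); verify against kernel headers.
S390X_HWCAP_NAMES = (
    "esan3", "zarch", "stfle", "msa", "ldisp", "eimm", "dfp", "edat",
    "etf3eh", "highgprs", "te", "vx", "vxd", "vxe", "gs", "vxe2",
    "vxp", "sort", "dflt", "vxp2", "nnpa", "pcimio", "sie",
)


def parse_auxv_hwcap(text):
    """Return AT_HWCAP as an int from LD_SHOW_AUXV=1 output, or None if not hex."""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "AT_HWCAP":
            try:
                return int(value.strip(), 16)
            except ValueError:
                return None  # older glibc prints decoded names instead
    return None


def decode_hwcap(hwcap, names=S390X_HWCAP_NAMES):
    """Map each set bit in hwcap to its name, lowest bit first."""
    return [name for bit, name in enumerate(names) if hwcap & (1 << bit)]


hwcap = parse_auxv_hwcap("AT_HWCAP:             0x67ffff\n")
print(" ".join(decode_hwcap(hwcap)))
```

With this table, 0x67ffff decodes to the same names, in the same order, as the /proc/cpuinfo line quoted above.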

I've also bumped the zvector build to enable everything up to VXE2.

There were still 65 failing tests for ppc64le when building with clang:
https://github.com/numpy/numpy/actions/runs/21990958320/job/63537634918?pr=30819

2026-02-13T14:52:11.1751562Z = 65 failed, 47875 passed, 344 skipped, 2843 deselected, 34 xfailed, 3 xpassed in 256.65s (0:04:16) =
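
For anyone tracking these numbers across runs, a minimal Python sketch that extracts the counts from a pytest summary line like the one above. The function name is hypothetical and the pattern is fitted to this line's format only.

```python
import re

# Matches "<number> <outcome>" pairs in a pytest summary line.
_COUNT = re.compile(
    r"(\d+) (failed|passed|skipped|deselected|xfailed|xpassed|errors?|warnings?)\b"
)


def summary_counts(line):
    """Return {outcome: count} parsed from a pytest summary line."""
    return {kind: int(num) for num, kind in _COUNT.findall(line)}


line = (
    "2026-02-13T14:52:11.1751562Z = 65 failed, 47875 passed, 344 skipped, "
    "2843 deselected, 34 xfailed, 3 xpassed in 256.65s (0:04:16) ="
)
counts = summary_counts(line)
print(counts.get("failed", 0), "failed of", sum(counts.values()), "collected")
```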

I've disabled the ppc64le clang build for now. It'd be better to have those tests fixed or disabled first, or to use some other build environment.

@rgommers please take one more look.

Development

Successfully merging this pull request may close these issues.

CI: use IBM self-hosted runner for native s390x tests

4 participants