GCC Git Adjusts Unaligned Load/Store Costs For AMD Zen 4 & Zen 5

([GNU] 6 Hours Ago znver4 + znver5 Tuning)


Following a recent investigation into a GCC compiler [1]regression on Zen 4, it turned out that the unaligned load/store costs for the Zen 4 and Zen 5 targets were inaccurate. They have now been corrected in GCC Git.

GCC compiler expert Richard Biener at SUSE reported the original Zen 4 regression and went on to analyze and fix the issue. In updating the unaligned load/store costs for the "znver4" compiler target, he [2]explained:

"Fixup unaligned load/store cost for znver4

Currently unaligned YMM and ZMM load and store costs are cheaper than aligned which causes the vectorizer to purposely mis-align accesses by adding an alignment prologue. It looks like the unaligned costs were simply left untouched from znver3 where they equate the aligned costs when tweaking aligned costs for znver4. The following makes the unaligned costs equal to the aligned costs.

This avoids the miscompile seen in PR115843 but it's of course not a real fix for the issue uncovered there. But it makes it qualify as a regression fix."

It's not the first time AMD Zen targets have turned out a bit hairy: a new target typically starts as a copy of the prior Zen revision, and the cost tables are not always updated accurately afterwards.

Similarly, the Zen 5 (znver5) target was originally copied over too and thus also needed [3]updating:

"Currently unaligned YMM and ZMM load and store costs are cheaper than aligned which causes the vectorizer to purposely mis-align accesses by adding an alignment prologue. It looks like the unaligned costs were simply copied from the bogus znver4 costs. The following makes the unaligned costs equal to the aligned costs like in the fixed znver4 version."

Since they are being treated as a "regression fix", these AMD Zen tuning patches should be picked up in time for the upcoming GCC 14.2 point release.



[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115843

[2] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1e3aa9c9278db69d4bdb661a750a7268789188d6

[3] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=896393791ee34ffc176c87d232dfee735db3aaab


