
AMD has a fix coming for Ryzen bug that hard-locks PCs

By Joel Hruska
Not long after AMD launched Ryzen, a group of forum users at HWBot.org began noticing something strange. A benchmark designed to measure a CPU’s performance in floating point operations (FLOPS) that made use of the FMA3 instruction set was hard-locking on Ryzen systems. The application typically hung when it reached the “Single-Precision – 128-bit FMA3 – Fused Multiply Add:” section of the test. Given that the test is open-source and hosted on GitHub, there’s no reason to suspect bad faith or improper coding, and AMD has actually confirmed that there’s a problem.
The good news is, this isn’t the kind of problem that’s going to cripple the CPU’s performance or be a permanent, ongoing problem the way the original Phenom’s L2 TLB bug was. The bug, according to the HWBot thread, is tied to AGESA — the AMD Generic Encapsulated Software Architecture, which handles the bootstrap protocol and initializes system devices. This piece of software initializes CPU cores, memory, and likely AMD’s Infinity Fabric as well, though that’s speculation on our part.
BIOS updates are said to be in the works at all major board vendors; ExtremeTech recommends keeping an eye out for them as they become available. The likelihood of encountering this bug in the wild is low; I’ve personally tested applications like Prime95, which uses FMA3 instructions, and it ran for an hour on Ryzen with no problems. Problems with FMA3 aren’t exactly novel, either: Intel had its own issues with FMA3 code on Skylake, as we reported last year.
For those of you wanting a more technical explanation: FMA stands for Fused Multiply-Add. FMA3 instructions are supported by both AMD and Intel and have three operands. The classic example operation is d = round(a × b + c). In FMA3, “d” must be the same register as a, b, or c. FMA4, which only AMD supports (in Bulldozer and later processors), allows a, b, c, and d to all be stored in different registers.
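To make the "fused" part concrete, here is a minimal Python sketch of why d = round(a × b + c) with a single rounding step can differ from a separate multiply followed by an add. It does not execute a hardware FMA instruction; it emulates the single rounding with exact rational arithmetic, and the helper name `fused_multiply_add` and the input values are our own illustrative choices.

```python
from fractions import Fraction

def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate d = round(a * b + c) with one rounding step.

    The product and sum are computed exactly as rationals, then
    converted to a double once. This models the arithmetic of an FMA,
    not the hardware instruction itself.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # single rounding to the nearest double

# Illustrative inputs: the exact product is 1 - 2**-60, which is too
# fine-grained for a double and rounds to 1.0 on its own.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = a * b + c                   # two roundings: multiply, then add
fused = fused_multiply_add(a, b, c)   # one rounding at the very end

print("unfused:", unfused)
print("fused:  ", fused)
```

The unfused version loses the small term when the product is rounded, while the fused version keeps it. That extra accuracy, along with doing two operations in one instruction, is why FLOPS-oriented benchmarks lean on FMA.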
FMA3 is considered simpler to implement and produces shorter code, while FMA4 offers more flexibility. Not many applications critically depend on FMA3 because the majority of CPUs in use today (as opposed to new chips being sold) don’t support it. Intel’s support for FMA3 only dates to Haswell, and AMD’s FMA4 never gained much traction.
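The trade-off between the two operand forms can be sketched with a toy register model. This is an illustration only: the register names and the `fma3` and `fma4` helpers are hypothetical, they are not real instruction encodings, and integers are used so that rounding does not enter the picture.

```python
def fma3(regs, dst, src1, src2):
    """Three-operand form: dst is both an input and the output,
    so its previous value is overwritten (dst = dst * src1 + src2)."""
    regs[dst] = regs[dst] * regs[src1] + regs[src2]

def fma4(regs, dst, a, b, c):
    """Four-operand form: the result goes to a separate register,
    so all three inputs survive (dst = a * b + c)."""
    regs[dst] = regs[a] * regs[b] + regs[c]

# With the four-operand form, one step computes d and keeps a, b, c.
regs4 = {"r0": 3, "r1": 5, "r2": 7, "r3": 0}
fma4(regs4, "r3", "r0", "r1", "r2")

# With the three-operand form, keeping all inputs takes an extra copy,
# because the destination must double as one of the sources.
regs3 = {"r0": 3, "r1": 5, "r2": 7, "r3": 0}
regs3["r3"] = regs3["r0"]          # extra register move to preserve r0
fma3(regs3, "r3", "r1", "r2")

print(regs4["r3"], regs3["r3"])
```

Both paths produce the same result; the difference is that the three-operand form needs the extra copy whenever all inputs must be preserved, while each instruction has one fewer operand to encode.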
Given that this bug appeared in a low-level benchmark specifically designed for FLOPS testing, and we aren’t aware of any problems in any shipping applications, we’d keep an eye out for a motherboard update. But we aren’t pulling our general recommendation of the chip. All three Ryzen CPUs released to date and all known motherboards do suffer from this bug, however.
