There is a widespread idea that modern high-performance x86 processors work by decoding the "complex" x86 instructions into "simple" RISC-like instructions that the rest of the pipeline then operates on. But how close is this idea to how the processors actually work internally?
To answer this question, let's analyze how different x86 processors, ranging from the first "modern" Intel microarchitecture, P6, to their current designs, handle the following simple loop (the code is 32-bit just to allow us to discuss very old x86 processors):
Read the full article…
add [edx], eax
add edx, 4
sub eax, 1