Security Response

x86 Fetch-Decode Anomalies

Created: 19 Feb 2007 08:00:00 GMT • Updated: 23 Jan 2014 18:52:30 GMT
Peter Ferrie

A colleague of mine came to see me one morning recently with an unusual result. For reasons that he didn't explain to me (he called it "a secret project"), he had intentionally placed a particular encoding of an invalid instruction near the end of a valid page, next to an unallocated page, then executed that instruction. However, instead of seeing the expected invalid opcode exception, he was seeing a page fault. Initially, I thought that it was related to the unexpected LOCK exception bug in Windows that I documented here, but it turned out to be something else entirely.

It turns out that the CPU performs a complete fetch, including parsing the ModR/M byte, prior to performing any kind of decoding. Thus, because of the instruction encoding that he had used, the CPU was attempting to retrieve all of the necessary bytes first, before it knew that the instruction was invalid. The emphasis on the word "all" is intentional – the CPU will fetch all of the necessary bytes, even when the instruction will exceed the maximum instruction length of 15 bytes as a result. The result is that 14 prefix bytes prior to a multi-byte opcode at the end of a page will trigger a page fault instead of a general protection exception. The CPU also knows when surrounding instructions in the same range contain immediate bytes (e.g., C6/C7), and will fetch those bytes, too.
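To make the fetch arithmetic concrete, here is a minimal Python sketch of how many bytes a ModR/M byte pulls in under 32-bit addressing, and how many of them land on the following page. It only models the standard ModR/M/SIB length rules; the function names and the 4 KB page size are assumptions for illustration, and it says nothing about how a real CPU implements its fetch unit.

```python
PAGE_SIZE = 4096  # assumed 4 KB pages


def modrm_tail_length32(modrm, sib=None):
    """Bytes following a ModR/M byte (SIB + displacement), 32-bit addressing.

    If a SIB byte is required but not supplied, only the SIB byte itself
    is counted, since the displacement cannot be known without it.
    """
    mod = modrm >> 6
    rm = modrm & 7
    if mod == 3:
        return 0  # register operand, nothing more to fetch
    length = 0
    if rm == 4:
        length += 1  # SIB byte
        if mod == 0 and sib is not None and (sib & 7) == 5:
            length += 4  # SIB with no base register: disp32
    elif mod == 0 and rm == 5:
        length += 4  # disp32 only
    if mod == 1:
        length += 1  # disp8
    elif mod == 2:
        length += 4  # disp32
    return length


def bytes_past_page_end(start_offset, length):
    """How many bytes of an instruction fall on the next page."""
    return max(0, start_offset + length - PAGE_SIZE)


# FF B8: one opcode byte, one ModR/M byte (mod=10, reg=111, rm=000), disp32.
total = 2 + modrm_tail_length32(0xB8)
spill = bytes_past_page_end(PAGE_SIZE - 2, total)
print(total, spill)

# 14 prefix bytes plus a two-byte opcode already exceed the 15-byte limit.
print(14 + 2 > 15)
```

With FF B8 placed in the last two bytes of a page, the four displacement bytes all belong to the next page, which is where the page fault comes from.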

For his results, my colleague used the FF B8 opcode, but I ran some tests and found quite a large list of opcodes that perform in the same way. The list follows, and assumes an Intel CPU with SSE3 technology (i.e., Pentium 4 CPUs since early 2004) using 32-bit encoding to produce the SIB effects.

8c 34
8c 35
8c 3c
8c 3d
8c 70-7f
8c b0-bf
8e 0c
8e 0d
8e 34
8e 35
8e 3c
8e 3d
8e 48-4f
8e 70-7f
8e 88-8f
8e b0-bf
8f 0c
8f 0d
8f 14
8f 15
8f 1c
8f 1d
8f 24
8f 25
8f 2c
8f 2d
8f 34
8f 35
8f 3c
8f 3d
8f 48-7f
8f 88-bf
c6 08-3f
c6 48-7f
c6 88-bf
c6 c8-ff
c7 08-3f
c7 48-7f
c7 88-bf
c7 c8-ff
d9 0c
d9 0d
d9 48-4f
d9 88-8f
db 24
db 25
db 34
db 35
db 60-67
db 70-77
db a0-a7
db b0-b7
dd 2c
dd 2d
dd 68-6f
dd a8-af
fe 14
fe 15
fe 1c
fe 1d
fe 24
fe 25
fe 2c
fe 2d
fe 34
fe 35
fe 3c
fe 3d
fe 50-7f
fe 90-bf
ff 3c
ff 3d
ff 78-7f
ff b8-bf
0f 00 2c
0f 00 2d
0f 00 70-7f
0f 00 b0-bf
0f 01 2c
0f 01 2d
0f 01 68-6f
0f 01 a8-af
0f 50 04
0f 50 05
0f 50 0c
0f 50 0d
0f 50 14
0f 50 15
0f 50 1c
0f 50 1d
0f 50 24
0f 50 25
0f 50 2c
0f 50 2d
0f 50 34
0f 50 35
0f 50 3c
0f 50 3d
0f 50 40-bf
0f 5d 04
0f 5d 05
0f 5d 0c
0f 5d 0d
0f 5d 14
0f 5d 15
0f 5d 1c
0f 5d 1d
0f 5d 24
0f 5d 25
0f 5d 2c
0f 5d 2d
0f 5d 34
0f 5d 35
0f 5d 3c
0f 5d 3d
0f 5d 40-bf
0f 71 00-cf
0f 71 d8-df
0f 71 e8-ef
0f 71 f8-ff
0f 72 00-cf
0f 72 d8-df
0f 72 e8-ef
0f 72 f8-ff
0f 73 00-cf
0f 73 d8-df
0f 73 e0-ef
0f 73 f8-ff
0f ae 24
0f ae 25
0f ae 2c
0f ae 2d
0f ae 34
0f ae 35
0f ae 60-77
0f ae a0-b7
0f ba 00-1f
0f ba 40-5f
0f ba 80-9f
0f ba c0-df
0f c4 00-bf
0f c5 00-bf
0f c7 04
0f c7 05
0f c7 14
0f c7 15
0f c7 1c
0f c7 1d
0f c7 24
0f c7 25
0f c7 2c
0f c7 2d
0f c7 34
0f c7 35
0f c7 3c
0f c7 3d
0f c7 40-47
0f c7 50-87
0f c7 90-bf
0f d6 04
0f d6 05
0f d6 0c
0f d6 0d
0f d6 14
0f d6 15
0f d6 1c
0f d6 1d
0f d6 24
0f d6 25
0f d6 2c
0f d6 2d
0f d6 34
0f d6 35
0f d6 3c
0f d6 3d
0f d6 40-bf
0f d7 04
0f d7 05
0f d7 0c
0f d7 0d
0f d7 14
0f d7 15
0f d7 1c
0f d7 1d
0f d7 24
0f d7 25
0f d7 2c
0f d7 2d
0f d7 34
0f d7 35
0f d7 3c
0f d7 3d
0f d7 40-bf
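For anyone who wants to check a byte sequence against the list above, the following is a small Python sketch that parses one entry in the notation used here (fixed leading bytes, then either a single final byte or an inclusive range) and tests whether some code bytes match it. The helper names are hypothetical, and the sketch assumes the code bytes carry no prefixes.

```python
def parse_entry(entry):
    """Turn an entry such as '0f 71 00-cf' into (leading_bytes, low, high)."""
    *head, last = entry.split()
    leading = bytes(int(h, 16) for h in head)
    if "-" in last:
        low, high = (int(x, 16) for x in last.split("-"))
    else:
        low = high = int(last, 16)
    return leading, low, high


def matches(entry, code):
    """True if `code` starts with the leading bytes and its next byte is in range."""
    leading, low, high = parse_entry(entry)
    n = len(leading)
    return len(code) > n and code[:n] == leading and low <= code[n] <= high


print(matches("ff b8-bf", bytes.fromhex("ffb800000000")))  # my colleague's FF B8
print(matches("0f 73 e0-ef", bytes.fromhex("0f73d0")))     # 0f 73 d0 is not listed
```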

The anomalies can be demonstrated in 16-bit mode, too. The only change is that the x4/x5/xc/xd entries are replaced with x6/xe. 64-bit mode should behave in the same way, since it's only a matter of a prefix to switch from 32 to 64 bits. For earlier Intel CPUs, there are additional opcodes that behave in the same way. CPUs with VT-x technology actually reduce the gap in the 0f c7 range. CPUs with SSSE3 technology (i.e., Core 2 and later CPUs) potentially introduce gaps in the 0f 38 range. Presumably AMD CPUs will behave in the same way.
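The x6/xe substitution follows from the 16-bit ModR/M rules, where mod=00 with r/m=110 is the only mod=00 form that carries extra bytes (a 16-bit displacement) and there is no SIB byte. A minimal Python sketch, with a function name chosen here for illustration, shows which low nibbles of a mod=00 ModR/M byte pull in extra bytes:

```python
def modrm_tail_length16(modrm):
    """Displacement bytes following a ModR/M byte under 16-bit addressing."""
    mod = modrm >> 6
    rm = modrm & 7
    if mod == 3:
        return 0                      # register operand
    if mod == 0:
        return 2 if rm == 6 else 0    # disp16 only for r/m = 110
    return 1 if mod == 1 else 2       # disp8 or disp16


# Low nibbles of mod=00 ModR/M bytes (0x00-0x3f) that need extra bytes.
nibbles16 = sorted({m & 0xF for m in range(0x40) if modrm_tail_length16(m) > 0})
print([hex(n) for n in nibbles16])
```

The same enumeration under 32-bit addressing would select r/m=100 (SIB) and r/m=101 (disp32), which gives the x4/x5/xc/xd entries in the list.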

An interesting bug was also revealed in Windows NT – an invalid opcode that causes a page fault actually triggers a blue screen crash instead. However, since Microsoft no longer supports Windows NT, this will probably never be fixed.

Some people might point to the Intel documentation, which says that the page fault has a higher priority than the invalid opcode exception, so of course it would happen that way. Yes, that's what the documentation says, but no, that's not what it says. "Priority" is for servicing the exception, not for raising the exception. Anyway, I'm not saying that it's a bug, I'm saying that it's an anomaly. Intel, on the other hand, apparently considers it a bug, at least for the Core Duo processor, where the specification update notes (vaguely) this behaviour, though even the Pentium 3 demonstrates this behaviour.

There's another anomaly, when we play with these opcodes:

0f 20
0f 21
0f 22
0f 23

They are documented as accepting only register encodings ("The 2 bits in the mod field are always 11B"). Therefore, anything else should cause an exception, but that's not what happens. In fact, they support the full range of encodings, but in a special way. The quote should actually say, "The 2 bits in the mod field are always interpreted as 11B". That is, no matter what value is in the mod field, the instruction always decodes to a register access, not a memory access. So, for those opcodes, there is no ModR/M parsing, no fetch of additional bytes, and no page fault.
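The "always interpreted as 11B" behaviour can be sketched as a decoder that simply discards the mod bits. The Python below is an illustration only: the function name is hypothetical, the operand order follows the documented MOV to/from control and debug register forms, and no check is made that the selected control or debug register actually exists.

```python
GPR32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_crdr(opcode2, modrm):
    """Decode 0F 20..23 as described above: the mod bits are ignored."""
    if opcode2 not in (0x20, 0x21, 0x22, 0x23):
        raise ValueError("not a MOV CRn/DRn opcode")
    reg = (modrm >> 3) & 7   # control or debug register number
    rm = modrm & 7           # general-purpose register, whatever mod says
    special = ("cr" if opcode2 in (0x20, 0x22) else "dr") + str(reg)
    gpr = GPR32[rm]
    if opcode2 in (0x20, 0x21):
        return f"mov {gpr}, {special}"   # read from CRn/DRn
    return f"mov {special}, {gpr}"       # write to CRn/DRn


# The same reg and r/m fields with mod=11 and mod=00 decode identically.
print(decode_mov_crdr(0x20, 0xC0))
print(decode_mov_crdr(0x20, 0x00))
```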

Here's another anomaly: 0f 18 (prefetch) is undocumentedly fully allocated. Only the first four entries are documented, but the other four also execute without exception. I don't know how to test what they are doing, though.

Finally, 0f 1f (multi-byte NOP) is also undocumentedly fully allocated. Interestingly, despite its name, it does access memory if the ModR/M byte tells it to, so this "No OPeration" can cause page faults. Not quite a NOP after all.