[[Category:Research]] {{DISPLAYTITLE:nopl}}
During research for my [[AMD Geode]] projects, I found an amazing saga based around a CPU instruction. Nobody else has written this up from what I can see, so here's my take.


== Background ==
In 1998 Christian Ludloff documented in his [https://web.archive.org/web/19981205142152/http://sandpile.org:80/80x86/opcodes2.shtml updated map of 2 byte x86 opcodes] that the opcodes 0F 18 through 0F 1F were hinting NOPs. The first of these, 0F 18, maps to the PREFETCHh instructions. I believe this information was first documented in the [https://www.cs.cmu.edu/afs/cs/academic/class/15213-s01/docs/intel-opt.pdf Intel Architecture Optimization Reference Manual].
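
As a rough illustration of the range described above, here is a minimal C sketch that checks whether a two-byte opcode falls between 0F 18 and 0F 1F. The function name is made up for this example, and the check only classifies opcode bytes; it says nothing about how a particular CPU treats them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the two-byte opcode (0F xx) lies in the
 * 0F 18..0F 1F hinting NOP range. This looks at the opcode bytes only and
 * does not tell you whether a given CPU executes them as NOPs. */
bool is_hint_nop_opcode(uint8_t escape, uint8_t opcode)
{
    return escape == 0x0F && opcode >= 0x18 && opcode <= 0x1F;
}
```

For example, <code>is_hint_nop_opcode(0x0F, 0x1E)</code> is true, while <code>is_hint_nop_opcode(0x0F, 0x20)</code> is false.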


Later in 2003 Christian Ludloff clarified in an email thread [https://web.archive.org/web/20041106070621/http://www.sandpile.org/post/msgs/20004129.htm Undocumented opcodes (HINT_NOP)] that these hinting NOPs were declared by Intel in their 1995 patent [https://patents.google.com/patent/US5701442A/en US5701442]. From my reading, the idea behind this patent is that you can encode a program written in another ISA as a series of opcodes that run as NOPs on older machines and as the new ISA on newer machines.


I'm not sure why, but third party x86 CPUs aside from AMD didn't implement these NOPs. Perhaps Intel kept this patent close to their heart? Or maybe it's just not worth spending silicon and research on NOPs that nobody used?
After the first bug report, [https://bugs.llvm.org/show_bug.cgi?id=11212 X86AsmBackend::WriteNopData uses long nops unconditionally] was filed upstream to LLVM.


Later in 2012 [https://github.com/llvm/llvm-project/commit/5dd4ccb4020173a569bc54ba559232b5be2cef01 LLVM r164132] was committed, adding a 'geode' CPU target to LLVM that didn't use multi-byte NOPs. This meant building for i686 without using multi-byte NOPs required building for Geode CPUs. Not very useful for generic i686 releases or for i586 and older machines that weren't supposed to support multi-byte NOPs.


In 2014 [https://github.com/llvm/llvm-project/commit/1b8bfdaae3264efdba964321956965a6ab47540a LLVM r195679] was committed to flat out avoid using multi-byte NOPs on i686, i586 and specific non-Intel and non-AMD CPU models that didn't support multi-byte NOPs.
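
To make the choice concrete, here is a minimal C sketch of padding a buffer with either single-byte NOPs or long NOPs. It is not LLVM's code: the function name and the <code>allow_long_nops</code> flag are made up for this example, and the byte sequences are the multi-byte NOP encodings that Intel's manuals recommend.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* NOP encodings indexed by (length - 1). Entries 3..9 bytes long use the
 * 0F 1F multi-byte NOP opcode; entries 1 and 2 only use the classic 0x90. */
static const uint8_t nop_table[9][9] = {
    {0x90},
    {0x66, 0x90},
    {0x0F, 0x1F, 0x00},
    {0x0F, 0x1F, 0x40, 0x00},
    {0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x44, 0x00, 0x00},
    {0x0F, 0x1F, 0x80, 0x00, 0x00, 0x00, 0x00},
    {0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
    {0x66, 0x0F, 0x1F, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00},
};

/* Hypothetical helper: fill buf with count bytes of padding. When
 * allow_long_nops is zero, only single-byte 0x90 NOPs are written, which is
 * the safe choice for CPUs that lack the multi-byte NOP. */
void write_nop_padding(uint8_t *buf, size_t count, int allow_long_nops)
{
    if (!allow_long_nops) {
        memset(buf, 0x90, count);
        return;
    }
    while (count > 0) {
        size_t n = count > 9 ? 9 : count; /* longest encoding is 9 bytes */
        memcpy(buf, nop_table[n - 1], n);
        buf += n;
        count -= n;
    }
}
```

Padding 12 bytes with long NOPs allowed produces one 9-byte NOP followed by the 3-byte 0F 1F 00; with them disallowed it produces twelve 0x90 bytes.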
 
== Emulators ==
It's not just hardware that implements the i686 instruction set; software emulators can too. So which emulators support multi-byte NOPs?
 
In 2006 [https://sourceforge.net/p/bochs/code/7216 Bochs r7216] was committed, adding support for the multi-byte NOP opcode as long as Bochs was compiled to emulate an i686 or newer. Later in 2007 [https://sourceforge.net/p/bochs/code/7973 Bochs r7973] was committed, marking 0F 19 through 0F 1E as multi-byte NOPs based on AMD documentation. They didn't link to the documentation but it makes sense to me.
 
In 2006 [https://git.qemu.org/?p=qemu.git;a=commitdiff;h=e17a36ce41bc76abeceb QEMU r2145] was committed and made all hinting NOPs execute as multi-byte NOPs. This made it into QEMU 0.9.0, which makes it surprising that the Debian bug report describes QEMU 0.9.1 crashing due to NOPs. Furthermore, these NOPs are available on every emulated x86 CPU, 32-bit or 64-bit, regardless of whether that CPU should have them.
 
In 2007 [https://github.com/mirror/vbox/commit/cb39b37cad08c79c5096fcd5dd69ad6997ee418b VirtualBox r2422] imported QEMU's i386 interpreter and gained multi-byte NOP support.
 
In 2020 [https://github.com/sarah-walker-pcem/pcem/commit/b973755ca376dbb47c3a8c85a53f4058f0ccc54d Add hintable NOPs for Pentium Pro and II.] was committed to PCem.
 
In 2022 [https://github.com/joncampbell123/dosbox-x/pull/3390 src/cpu: Implement hinting NOPs] was merged to DOSBox-X, the only DOSBox variant that supports Pentium Pro and newer CPUs.
 
== Intel CET ==
In 2016 Intel announced [https://web.archive.org/web/20160614162220/http://blogs.intel.com/evangelists/2016/06/09/intel-release-new-technology-specifications-protect-rop-attacks/ Control-flow Enforcement Technology] and released the [https://web.archive.org/web/20170320213641/https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf Intel CET specification]. These CPU extensions run not just in 64-bit mode but also in 32-bit mode. While shadow stack management uses new instructions, the ENDBRANCH instruction, which is intended to be compiled into user space code, re-uses the hinting NOP 0F 1E.
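
For reference, ENDBR32 encodes as F3 0F 1E FB and ENDBR64 as F3 0F 1E FA. The following C sketch counts those byte patterns in a buffer of machine code. The function name is made up for this example, and because it scans raw bytes instead of decoding instructions, it can report false positives when the pattern appears inside other instructions or data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: count ENDBR32 (F3 0F 1E FB) and ENDBR64 (F3 0F 1E FA)
 * byte patterns in code[0..len). This is a plain byte scan, not a
 * disassembler, so treat the result as a hint only. */
size_t count_endbr(const uint8_t *code, size_t len)
{
    size_t hits = 0;
    for (size_t i = 0; i + 4 <= len; i++) {
        if (code[i] == 0xF3 && code[i + 1] == 0x0F && code[i + 2] == 0x1E &&
            (code[i + 3] == 0xFB || code[i + 3] == 0xFA)) {
            hits++;
        }
    }
    return hits;
}
```

A scan like this over a binary's code section gives a quick indication of whether it was built with CET branch markers.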
 
Unlike the multi-byte NOP there's no indication in the specifications that these instructions are limited to Pentium Pro or newer CPUs.
 
In 2017 [https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=603555e563725616246912711419637add54c961 Add support for Intel CET instructions] was committed to the GNU Assembler.
 
Later in 2017 [https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=2a25448c490b16eea276521d818640bcaca75e35 Update x86 backend to enable Intel CET.] was committed to GCC.
 
Even later in 2017 [https://github.com/llvm/llvm-project/commit/fec21ec0c6257eb24290c483b03b4fd9e6a9d0d1 LLVM r318995] added support for CET. As far as I can tell this doesn't limit the use of these CET instructions.
 
In 2021 [https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98667 gcc generates endbr32 invalid opcode on -march=i486] was reported to GCC. The next day [https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=77d372abec0fbf2cfe922e3140ee3410248f979e x86: Error on -fcf-protection with incompatible target] was committed to GCC. This patch limits CET to architectures with CMOV. That's a safe bet, but it seems like it would break on the Geode LX800 and other i686-compatibles that lack multi-byte NOPs.
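
For context on what a CMOV check can and can't tell you, CMOV support is reported as bit 15 of EDX in CPUID leaf 1. The C sketch below tests that bit on a value the caller has already read from CPUID; the function and macro names are made up for this example. Note that it only answers the CMOV question and says nothing about whether the CPU handles hinting NOPs.

```c
#include <stdbool.h>
#include <stdint.h>

/* CMOV feature flag: CPUID leaf 1, EDX bit 15. */
#define CPUID_1_EDX_CMOV (UINT32_C(1) << 15)

/* Hypothetical helper: takes the EDX value returned by CPUID leaf 1 and
 * reports whether the CMOV bit is set. It does not detect multi-byte or
 * hinting NOP support. */
bool edx_reports_cmov(uint32_t cpuid_1_edx)
{
    return (cpuid_1_edx & CPUID_1_EDX_CMOV) != 0;
}
```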
 
In 2022 [https://github.com/rust-lang/rust/issues/93059 i586-unknown-linux-gnu target generates binaries containing Intel CET opcodes which are illegal on i586 processors] was reported to the Rust bug tracker. A day or so later Gentoo committed [https://github.com/gentoo/gentoo/commit/bff66eedb4ae530ef21187d617daeba5472320a1 dev-lang/rust: pass -fcf-protection=none on i586] despite Rust not being available on i586 yet. It's unclear how much things will break if someone gets an actual i686 build of Rust going.
 
Rust uses LLVM so this might indicate that LLVM doesn't check if an architecture supports CET before adding its instructions.
 
As of early 2022 Intel CET support is not in the Linux kernel yet.
 
== Conclusions ==
I have a few takeaways from this slow motion train wreck:
 
* Intel's documentation only applies to Intel CPUs
* Developers don't really question retroactive additions to instruction sets
* To some, i686 is the Pentium Pro
* To others, i686 is a baseline for various 32-bit x86 processors
 
Something else to just tack on here is that I spent a non-trivial amount of time trying to dig up old copies of Intel web pages and documentation. By the way Intel: When you make a new revision of a document you don't have to destroy the old ones.