ModR/M - w0rth.dev
n3tw0rth a.k.a 0xbyt3z
~/notes-from-the-terminal

ModR/M

Same opcode, different ModR/M bytes: 89 45 c8 (write to memory) vs 89 c7 (register to register). Same operands, different opcode: 89 45 c8 vs 8b 45 c8 (write vs read). A look at how one byte changes what an instruction means.

Finding good examples for technical topics is tough, but I recently came across one that’s worth writing about.

While reversing a binary, I found a spot where an 8-byte time_t value is narrowed to a 4-byte int. It’s a neat example for explaining how CISC instructions use the ModR/M byte to encode operands, and how the same byte can be interpreted differently depending on the instruction context.

Here’s the disassembled code. It calls time() to get a seed from the current time, then passes that seed to srand() to initialize the RNG. The three instructions between the two calls explain ModR/M particularly well.

e807feffff call sym.imp.time ; time_t time(time_t *timer)
8945c8 mov dword [seed], eax
8b45c8 mov eax, dword [seed]
89c7 mov edi, eax ; int seed
e8dafdffff call sym.imp.srand ; void srand(int seed)
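The narrowing itself is easy to model. Here is a minimal sketch in Python, assuming an 8-byte time_t as described above; `narrow_to_int32` is a hypothetical helper of mine, not something from the binary. It keeps only the low 32 bits (which is what storing eax does) and reinterprets them as a signed int.

```python
import time

def narrow_to_int32(value: int) -> int:
    """Keep the low 32 bits of value and reinterpret them as a signed 32-bit int."""
    low = value & 0xFFFFFFFF                  # the part that lives in eax
    return low - (1 << 32) if low & 0x80000000 else low

now = int(time.time())                        # stands in for the 8-byte time_t
seed = narrow_to_int32(now)
print(hex(now), "->", hex(seed & 0xFFFFFFFF))
```

For present-day timestamps the upper 32 bits are zero, so nothing is lost yet. The truncation only changes the value once the timestamp no longer fits in 32 bits.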

Two scenarios here.

89 45 c8 vs 89 c7 : same opcode, different mod field. Different operand types (memory/register). Both are 2-operand instructions. Only the addressing mode changes.

  1. 89 45 c8: mov [ebp-0x38], eax (register-to-memory)
  2. 89 c7: mov edi, eax (register-to-register)

89 45 c8 vs 8b 45 c8 : same operands (ebp-0x38, eax), different opcode. Different direction.

  • 89 45 c8: mov [ebp-0x38], eax (write to memory)
  • 8b 45 c8: mov eax, [ebp-0x38] (read from memory)

What ModR/M actually is

The ModR/M byte usually follows the opcode and encodes operand information. Depending on the instruction, it can specify registers, memory addressing modes, or even extend the opcode itself. The concept is straightforward, but it can be a little confusing at first.

Format:

Bit:    7 6 | 5 4 3 | 2 1 0
Field:  Mod |  Reg  |  R/M
  • Mod (2 bits): addressing mode. Memory, memory+displacement, or register-direct.
  • Reg (3 bits): a register or, for some opcodes, an extension digit instead of a register.
  • R/M (3 bits): the other operand. Mod decides whether this is a register or a memory reference.
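Pulling the three fields apart takes two shifts and three masks. A small sketch (the name `split_modrm` is my own, for illustration only), run on the two ModR/M bytes from the listing:

```python
def split_modrm(byte: int) -> tuple:
    """Return (mod, reg, rm) extracted from a single ModR/M byte."""
    mod = (byte >> 6) & 0b11      # bits 7-6
    reg = (byte >> 3) & 0b111     # bits 5-3
    rm = byte & 0b111             # bits 2-0
    return mod, reg, rm

for b in (0x45, 0xC7):
    mod, reg, rm = split_modrm(b)
    print(f"{b:02x} = {mod:02b} {reg:03b} {rm:03b}")
# 45 = 01 000 101
# c7 = 11 000 111
```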

Mod values, 32-bit mode:

Mod  Meaning
00   Memory, no displacement (exceptions: r/m=100 means a SIB byte follows; r/m=101 means disp32 with no base register, which becomes RIP-relative in 64-bit mode)
01   Memory + 8-bit displacement
10   Memory + 32-bit displacement
11   Register-direct, no memory access

That’s the whole trick. Mod=11 turns R/M into a plain register. Anything else turns it into a memory address.
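The mod field also tells the decoder how many more bytes to consume. A rough sketch of that bookkeeping, assuming 32/64-bit addressing; `extra_bytes` is a hypothetical name, and the sketch deliberately ignores one corner case (a SIB byte with base=101 under mod=00 also carries a disp32).

```python
def extra_bytes(modrm: int) -> dict:
    """Describe what follows a ModR/M byte: register vs memory, SIB, displacement size.

    Simplified: does not inspect the SIB byte itself.
    """
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod == 0b11:
        return {"kind": "register", "sib": False, "disp": 0}
    sib = rm == 0b100                      # r/m=100: a SIB byte follows
    if mod == 0b00:
        disp = 4 if rm == 0b101 else 0     # r/m=101: disp32, no base register
    elif mod == 0b01:
        disp = 1                           # disp8
    else:
        disp = 4                           # disp32
    return {"kind": "memory", "sib": sib, "disp": disp}

print(extra_bytes(0x45))   # {'kind': 'memory', 'sib': False, 'disp': 1}
print(extra_bytes(0xC7))   # {'kind': 'register', 'sib': False, 'disp': 0}
```

That matches the listing: 89 45 c8 is three bytes (opcode, ModR/M, disp8) while 89 c7 is only two.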

Same opcode, different ModR/M

89 is MOV r/m32, r32: move a register into r/m. Direction is fixed: reg → r/m. What changes is where r/m points.

89 45 c8:

45 = 01 000 101
mod = 01 → memory, 8-bit displacement follows
reg = 000 → eax (source)
r/m = 101 → ebp base (rbp when the code runs in 64-bit mode)
disp8 = c8 → -0x38

Result: mov [ebp-0x38], eax

89 c7:

c7 = 11 000 111
mod = 11 → register-direct
reg = 000 → eax (source)
r/m = 111 → edi (destination)

Result: mov edi, eax

Same opcode byte. Same instruction meaning. Mod flips the destination from a stack slot to a register.
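Both decodes can be reproduced with a tiny decoder. This is only a sketch: `decode_mov_89` and the `REG32` table are my own names, it handles just mod=01 (without SIB) and mod=11, and it prints 32-bit register names to match the notation used in this post.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_mov_89(code: bytes) -> str:
    """Decode `89 /r` (MOV r/m32, r32) for mod=01 and mod=11 only."""
    if code[0] != 0x89:
        raise ValueError("not opcode 89")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    src = REG32[reg]                          # for 89, reg is always the source
    if mod == 0b11:
        return f"mov {REG32[rm]}, {src}"      # r/m names a register
    if mod == 0b01 and rm != 0b100:           # r/m=100 would need a SIB byte
        disp = int.from_bytes(code[2:3], "little", signed=True)  # sign-extend disp8
        sign = "-" if disp < 0 else "+"
        return f"mov [{REG32[rm]}{sign}{abs(disp):#x}], {src}"
    raise NotImplementedError("only mod=01 (no SIB) and mod=11 are sketched")

print(decode_mov_89(bytes.fromhex("8945c8")))  # mov [ebp-0x38], eax
print(decode_mov_89(bytes.fromhex("89c7")))    # mov edi, eax
```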

Different opcode, same ModR/M

89 and 8b are a matched pair: same encoding, opposite direction.

  • 89 = MOV r/m, r : reg is the source, r/m is the destination.
  • 8b = MOV r, r/m : reg is the destination, r/m is the source.

Both 89 45 c8 and 8b 45 c8 decode the same ModR/M byte: mod=01, reg=000 (eax), r/m=101+disp8 (ebp-0x38). Same operands. The opcode alone decides which one is read and which one is written.

89 45 c8: mov [ebp-0x38], eax

8b 45 c8: mov eax, [ebp-0x38]
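To make the swap concrete, here is a sketch that decodes the ModR/M byte once and lets the opcode pick the operand order. The name `decode_mov` is hypothetical, and the function only covers opcodes 89 and 8b with a mod=01, no-SIB ModR/M byte.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_mov(code: bytes) -> str:
    """Decode opcodes 89 / 8b followed by a mod=01 (disp8, no SIB) ModR/M byte."""
    opcode, modrm = code[0], code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode not in (0x89, 0x8B) or mod != 0b01 or rm == 0b100:
        raise NotImplementedError("outside the scope of this sketch")
    disp = int.from_bytes(code[2:3], "little", signed=True)
    mem = f"[{REG32[rm]}{disp:+#x}]"          # same memory operand either way
    r = REG32[reg]
    # 89: r/m is the destination. 8b: reg is the destination.
    dst, src = (mem, r) if opcode == 0x89 else (r, mem)
    return f"mov {dst}, {src}"

print(decode_mov(bytes.fromhex("8945c8")))  # mov [ebp-0x38], eax
print(decode_mov(bytes.fromhex("8b45c8")))  # mov eax, [ebp-0x38]
```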

Follow this link (selections are highlighted) for better understanding.

In the table, take the register from the matching column as operand 1 and the r/m value from the matching row as operand 2. Then check the direction of the opcode to decide whether to swap the two operands or keep them as they are.

To check the direction, I compared the two opcodes in binary:

>>> bin(0x89)
'0b10001001'
>>> bin(0x8b)
'0b10001011'

The two opcodes differ only in bit 1, the direction (d) bit: it is 0 in 89 and 1 in 8b. So yes, the operands do swap places when the opcode is 8b.
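The same check can be written as a one-liner. A sketch, with `direction_bit` as a made-up helper name; the d-bit reading applies to this family of two-operand opcodes (such as 89/8b), not to every x86 opcode.

```python
def direction_bit(opcode: int) -> int:
    """Bit 1 of the opcode: 0 = reg field is the source, 1 = reg field is the destination."""
    return (opcode >> 1) & 1

print(bin(0x89 ^ 0x8B))      # 0b10, so only bit 1 differs
print(direction_bit(0x89))   # 0
print(direction_bit(0x8B))   # 1
```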

Takeaway

One byte, three fields, and the meaning of “operand” shifts under your feet depending on mod. Add the opcode’s d-bit on top and the same ModR/M payload can read or write, touch memory or skip it entirely. That’s the whole CISC operand-encoding trick in one byte.

ref:

  • http://ref.x86asm.net/coder64.html#modrm_byte_32_64
  • https://en.wikipedia.org/wiki/ModR/M