Strange page fault issue #1

@khale

Overview

The page fault handler in Nautilus somehow clobbers %rax, resulting in a bad address reference on return from the handler.

Suspected Cause

The stack is going away or being manipulated during the page fault handler, before %rax is restored by the low-level exception handler code.

  42911e:       e8 6d d6 01 00          callq  446790 <__mmap>
  429123:       48 83 f8 ff             cmp    $0xffffffffffffffff,%rax
  429127:       0f 84 1e fc ff ff       je     428d4b <_int_malloc+0xa4b>
  42912d:       4c 89 ea                mov    %r13,%rdx
  429130:       48 83 ca 02             or     $0x2,%rdx
  429134:       48 89 50 08             mov    %rdx,0x8(%rax) <--- page fault here: %rax has been clobbered, most likely by the Nautilus page fault handler


[ 8922.615162] palacios (pcore 0 vm hvm vcore 1): DEBUG: VM_CONSOLE>Forwarding page fault to ROS: rip=0x004290f4 addr=0x7f8c0c245008 err=0x2 (w=1)
[ 8922.616325] palacios (pcore 0 vm hvm vcore 1): DEBUG: hvm: ROS event request
[ 8922.616906] palacios (pcore 0 vm hvm vcore 1): DEBUG: hvm: copying ros event size 80
[ 8922.617994] palacios (pcore 0 vm hvm vcore 1): DEBUG: hvm: copied new ROS event (type=page fault)
[ 8922.618027] palacios (pcore 1 vm hvm vcore 0): DEBUG: hvm: completion of ROS event (rc=0x0)
[ 8922.620584] palacios (pcore 0 vm hvm vcore 1): DEBUG: VM_CONSOLE>Nautilus returning from HVM page fault. CS is 0x30, SS is 0x28 (returning 0x0)
[ 8922.621754] palacios (pcore 0): zone=ffff88020cacee80, order=12
[ 8922.622329] palacios (pcore 0): Order iter=12
[ 8922.622873] palacios (pcore 0): pool=ffff88020d1287e0, block=ffff8800de987000, order=12, j=12
[ 8922.624129] palacios (pcore 0 vm hvm vcore 1): DEBUG: VM_CONSOLE>current CS=0x30 SS=0x28
[ 8922.625604] palacios (pcore 0 vm hvm vcore 1): DEBUG: VM_CONSOLE>Forwarding page fault to ROS: rip=0x004290f4 addr=0x00000038 err=0x2 (w=1)

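Notice that we return from the first page fault with %rax corrupted. The address of the second fault, 0x38, is suspicious because it is 8(0x30), that is, 0x30 + 8, which is what the store to 0x8(%rax) produces when %rax holds 0x30. 0x30 is a typical value for %cs, and it is the CS value reported in the log above. Coincidence?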

NOTE: This bug is relatively uncommon

Labels: bug