Picking through disassembled MBR code can be a little disorienting at first. This guide walks through creating a prototypical MBR routine that prints “Hello, world!”.
This tutorial assumes you have a basic understanding of assembly, setting up virtual machines and doing things from the Linux command line.
Create your own Hello World MBR program:
- Install NASM (and a virtualization platform, if you don’t have one already).
- Rewrite or copy/paste the MBR source code.
- Figure out what the MBR code does.
- Compile the code and write it to the first sector of a boot disk (Virtualization recommended!).
- Boot the machine with your MBR code.
Warning: overwriting a computer’s MBR will make it unable to boot normally. Before altering a computer’s MBR, be sure you know what you’re doing. Only make changes to a computer or virtual machine that you don’t care about. Follow this tutorial at your own risk.
When you power on your computer, the BIOS performs a “Power-on Self Test” and briefly initializes the system. When the initialization routine is complete, the BIOS copies the first sector of your boot drive into memory and hands over control of the computer to the code it just copied. The first sector of a bootable disk, and the code it contains, are known as the Master Boot Record, or MBR.
With the recent renaissance of bootkits and MBR malware, the MBR has come back into public focus. Malicious code loaded by the MBR can take complete control of a system and set itself up before antivirus software or the operating system has the chance to lock things down. Because MBR malware can make changes to a system before anything else runs, MBR rootkits can make malware nearly undetectable. This was the case with the Mebroot bootkit, which was used to hide the Torpig botnet’s client programs.
512 bytes isn’t much space for a program. Due to space limitations, the MBR is usually just a short routine that establishes the location of the boot partition, then loads the boot partition’s Volume Boot Record (VBR), which then loads the operating system’s bootloader.
1. Install NASM
In order to compile the MBR code, we’ll need NASM. If you want to test your compiled MBR code without sacrificing a real computer’s hard drive, you’ll also need a virtualization platform and a bootable guest OS. I used VirtualBox and the Slitaz live CD. You can use whatever live CD or virtualization platform you prefer, as long as it has dd and a way to get the compiled MBR file from the host OS. On Ubuntu:
$ sudo apt-get install virtualbox nasm
You can download the latest Slitaz ISO from here. I picked Slitaz because it’s very small (under 35 MB!) and it boots quickly.
2. Copy or write out the source code for the MBR
The NASM source for the “Hello World” MBR program is available for download here.
org 7c00h ; Tell NASM that the code's base will be at 7c00h.
; Otherwise it will assume offset 0 when calculating
; addresses.
jmp short start ; jump over data
message:
db "Hello, world!", 0 ; null-terminated message
start:
xor ax, ax ; clear ax
mov ds, ax ; ds needs to be 0 for lodsb
cld ; clear direction flag for lodsb
main:
mov si, message ; move the message's address into si for lodsb
jmp string ; jump to the string routine
; Displays a character
; int 0x10 ah=e
; al = character, bh = page number
char:
mov bx, 0x1 ; bh = page 0, bl = color 1 (bl is only used in graphics modes)
mov ah, 0xe ; select the teletype output function
int 0x10 ; call the BIOS video service
; print a string
string:
lodsb ; lodsb loads the byte at ds:si into al and increments si
cmp al, 0x0
jnz char ; display character if not null
; jmp short main ; uncomment to repeat infinitely
; infinite loop that does nothing
end:
jmp short end
times 0200h - 2 - ($ - $$) db 0 ; NASM directive: zerofill up to 510 bytes
dw 0AA55h ; Magic boot sector signature
3. Figure out what the MBR code does
The “org 7c00h” directive tells NASM that our program will be loaded at 0000:7c00, which is where the BIOS copies the boot sector and transfers control once the POST is complete.
The first instruction in our program jumps over the data (here, just message:) to the start: routine, which zeroes AX and DS and clears the direction flag so that lodsb reads the string from segment 0, moving forward.
The main: routine loads the address of the message string into the SI or “Source Index” register before jumping to the string: routine.
The lodsb (load string) instruction in string: then loads a single character from the address in SI and increments SI so that it points at the next character in the string. string: then checks to make sure that the loaded character isn’t null before jumping to the char: routine, which will print the character.
If the character loaded by lodsb is null, we know we’ve reached the end of the string so we don’t jump to char:. Instead, we let the program go on to end:, which is an infinite loop that does nothing.
If you want the program to run in a loop, printing message over and over, uncomment the ; jmp short main line.
A note on BIOS Interrupt Calls
During the boot process, the BIOS sets up some useful interrupt calls, like INT 10h (display routines) and INT 13h (disk access routines). A complete list of BIOS interrupt calls is available on Wikipedia. BIOS interrupt calls are selected by an interrupt number (e.g. 10h) and a function number in register AH. For instance, INT 13h, AH=00h resets a disk drive, INT 13h, AH=01h checks the status of a drive and INT 13h, AH=02h reads sectors from a drive. Different BIOSes may implement interrupts slightly differently.
In the char: routine, our program uses INT 10h, AH=0Eh, the teletype (TTY) output function. It displays the character in register AL on the screen and advances the cursor. The character in AL is the one most recently loaded from the message string by lodsb in the string: routine. After the INT 10h call returns, execution falls through into string: again and the process repeats until the end of the message string is reached.
The last part of the source file is a little tricky. Every MBR needs the two-byte boot signature 55h AAh in its last two bytes, at offsets 1FEh and 1FFh (510 and 511). dw 0AA55h produces exactly those bytes because x86 stores words little-endian, low byte first. If the signature isn’t present, the BIOS will typically treat the disk as unbootable. times 0200h - 2 - ($ - $$) db 0 is a NASM directive that fills the space from the end of our program up to byte 510 (512 - 2) with zeroes.
For clarity, in NASM the single dollar sign ($) refers to the address of the current line being assembled, and the double dollar sign ($$) refers to the address of the start of the current section. ($ - $$) is equal to the length of our compiled program so far, and 0200h - 2 - ($ - $$) equals the number of zeroes we’ll need between the end of our program and the address of the MBR signature.
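To make the arithmetic concrete, here is a small shell sketch of the padding calculation and of the order in which a 16-bit word ends up on disk. The function names pad_bytes and le16_bytes are made up for this illustration, and the 40-byte program length is a hypothetical figure, not the measured size of the compiled program.

```shell
# pad_bytes CODE_LEN
# Number of zero bytes needed so that the two signature bytes land at
# offsets 510 and 511 of a 512-byte (0x200) sector.
pad_bytes() {
    code_len=$(( $1 ))
    echo $(( 0x200 - 2 - code_len ))
}

# le16_bytes WORD
# Print the two bytes of a 16-bit word in the order x86 stores them:
# low byte first, then high byte.
le16_bytes() {
    word=$(( $1 ))
    printf '%02x %02x\n' $(( word & 0xFF )) $(( (word >> 8) & 0xFF ))
}

pad_bytes 40        # hypothetical 40-byte program: prints 470
le16_bytes 0xAA55   # prints 55 aa
```

The second call shows why the source says 0AA55h while a hex dump of the sector ends in 55 AA.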
4. Compiling the MBR and running it (or writing it to a disk’s MBR)
I saved my program into a file named hello.nasm. To compile it, I typed
$ nasm hello.nasm
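which created a binary file named “hello”. (NASM’s default output format is a flat binary, and the default output name is the source file’s name with the extension removed.)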
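Before booting anything, it can be worth checking that the output really looks like a boot sector. The following is a minimal sketch, assuming a POSIX shell with wc, tail, od and tr available; check_mbr is a hypothetical helper name, not part of NASM or any other tool.

```shell
# check_mbr FILE
# Succeeds only if FILE is exactly 512 bytes long and its last two
# bytes are the boot signature 55 AA.
check_mbr() {
    file="$1"
    # The arithmetic expansion strips any padding some wc versions print.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -ne 512 ]; then
        echo "size is $size bytes, expected 512" >&2
        return 1
    fi
    # Dump the last two bytes as hex with no offsets and no spacing.
    sig=$(tail -c 2 "$file" | od -An -tx1 | tr -d ' \n')
    if [ "$sig" != "55aa" ]; then
        echo "signature is $sig, expected 55aa" >&2
        return 1
    fi
    echo "$file looks like a boot sector"
}
```

With the file from this tutorial, the call would be check_mbr hello. A size other than 512 usually means the times line or the dw line is missing from the source.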
Testing your MBR in QEMU
When I first wrote this post, I used VirtualBox to emulate a computer. A little while later, I was doing some ARM development and testing in QEMU and I realized that QEMU provides a much simpler way to test MBR code. QEMU is convenient, but the VirtualBox method mimics the process of installing the MBR on an actual PC.
Installing qemu-system provides a bunch of different systems that each serve as complete virtual computers. You can install qemu-system on Ubuntu with sudo apt-get install qemu-system. Then simply run qemu-system-i386 hello to boot the virtual i386 system from your compiled MBR.
Testing your MBR in VirtualBox (simulating an actual PC)
To install and run the compiled MBR in VirtualBox, I created a new Linux 2.6 machine with a blank hard drive and put the Slitaz live CD ISO into its CD drive. I booted the machine and opened a terminal in Slitaz. I entered the following commands to copy the compiled MBR to the virtual machine:
$ su
Password: (the root password is "root")
root@slitaz:home/tux# scp user@host:path/to/hello hello
root@slitaz:home/tux# dd if=hello of=/dev/sda
1+0 records in
1+0 records out
Replace “user” with your username and “host” with the IP address of your host OS. If you don’t have sshd running on your host machine, you’ll have to find another way to get the compiled MBR code onto your virtual machine. Slitaz has an ssh server you can start, but VirtualBox networking must be configured to use a bridged adapter instead of the default NAT if you want to ssh into the virtual machine. You can also use wget or something along those lines.
If everything has worked so far, # dd if=hello of=/dev/sda writes the compiled MBR code to the first sector of your machine’s disk.
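On a blank virtual disk the plain dd command is all you need. On a disk that already holds data, note that bytes 446 to 509 of the first sector hold the partition table, so writing all 512 bytes replaces it. The sketch below is one cautious way to do the write, assuming a dd that supports bs, count and conv=notrunc (GNU and BusyBox dd both do); write_mbr and the backup file name are hypothetical names chosen for this example.

```shell
# write_mbr IMAGE TARGET BACKUP
# Saves TARGET's current first sector to BACKUP, then copies the first
# 512 bytes of IMAGE over it. All three arguments are placeholders.
write_mbr() {
    image="$1"
    target="$2"
    backup="$3"
    # Keep a copy of the sector we are about to overwrite.
    dd if="$target" of="$backup" bs=512 count=1 2>/dev/null || return 1
    # conv=notrunc matters when TARGET is a regular file such as a disk
    # image: without it dd would truncate the file to 512 bytes.
    dd if="$image" of="$target" bs=512 count=1 conv=notrunc 2>/dev/null
}
```

On the Slitaz guest this could be called as write_mbr hello /dev/sda sda-mbr.bak, and the old sector could later be restored with dd if=sda-mbr.bak of=/dev/sda bs=512 count=1.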
Once the MBR had been written successfully, I issued the following commands:
# eject
# reboot
# eject ejects the Slitaz Live CD and # reboot restarts the machine, which should then boot from our compiled MBR.
The first time I tried to run my compiled MBR, my machine hung because I forgot to include the end: routine and the machine tried to execute the nulls following my program. Remember that processors are dumb, and a processor will always try to load and execute the next instruction, whether you want it to or not.
The result should be the message “Hello, world!” printed on the virtual machine’s screen.
If everything worked, congratulations! If something didn’t work, it’s probably because there’s a difference between your machine and mine. I worked this out on a laptop running Ubuntu 12.04 64-bit. Commands and output may vary slightly on different architectures or virtual machines.
- Master boot record – Wikipedia
- INT 10h – Wikipedia
- INT 13h – Wikipedia
- X86 Assembly/Bootloaders – Wikibooks (an alternative guide with a different approach)
- Win95B/98/SE/ME MBR Revealed
- eEye RootBoot: Stealth MBR rootkit