Hello World MBR Tutorial

Picking through disassembled MBR code can be a little disorienting at first. This guide outlines the process of creating a prototypical MBR routine that prints “Hello World!”

This tutorial assumes you have a basic understanding of assembly, of setting up virtual machines and of working from the Linux command line.

Create your own Hello World MBR program:

  1. Install NASM (and a virtualization platform, if you don’t have one already).
  2. Rewrite or copy/paste the MBR source code.
  3. Figure out what the MBR code does.
  4. Compile the code and write it to the first sector of a boot disk (Virtualization recommended!).
  5. Boot the machine with your MBR code.

Warning: overwriting a computer’s MBR will make it unable to boot normally. Before altering a computer’s MBR, be sure you know what you’re doing. Only make changes to a computer or virtual machine that you don’t care about. Follow this tutorial at your own risk.

Background

When you power on your computer, the BIOS performs a “Power-on Self Test” and briefly initializes the system. When the initialization routine is complete, the BIOS copies the first sector of your boot drive into memory and hands over control of the computer to the code it just copied. The first sector of a bootable disk, and the code it contains, are known as the Master Boot Record, or MBR.

With the recent renaissance of bootkits and MBR malware, the MBR has come back into public focus. Malicious code that is loaded by the MBR code can take complete control of a system and set itself up before antivirus software or the operating system has a chance to lock things down. Since MBR malware can make changes to a system before anything else runs, MBR rootkits can make malware nearly undetectable. This was the case with the Mebroot bootkit, which was used to hide the Torpig botnet’s client programs.

512 bytes isn’t much space for a program. Due to space limitations, the MBR is usually just a short routine that establishes the location of the boot partition, then loads the boot partition’s Volume Boot Record (VBR), which then loads the operating system’s bootloader.

1. Install NASM

In order to compile the MBR code, we’ll need NASM. If you want to test your compiled MBR code without sacrificing a real computer’s hard drive, you’ll also need a virtualization platform and a bootable guest OS. I used VirtualBox and the Slitaz live CD. You can use whatever live CD or virtualization platform you prefer, as long as it has dd and a way to get the compiled MBR file from the host OS. On Ubuntu, NASM can be installed with sudo apt-get install nasm.

You can download the latest Slitaz ISO from here. I picked Slitaz because it’s very small (under 35 MB!) and it boots quickly.

2. Copy or write out the source code for the MBR

The NASM source for the “Hello World” MBR program is available for download here.
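In case the download isn’t available, the listing below is a minimal sketch reconstructed from the explanation in step 3. The label names (message:, start:, main:, string:, char:, end:), the commented-out jmp short main and the times line come from the text; the exact instructions, their order and the registers zeroed in start: are assumptions and may differ from the original file. The snippet uses a quoted shell here-document so that $ and $$ reach NASM unchanged, and writes the source to hello.nasm, the filename used in step 4.

```shell
# Write a reconstructed "Hello World" boot sector source to hello.nasm.
# The quoted 'EOF' keeps the shell from expanding $ and $$.
cat > hello.nasm <<'EOF'
bits 16                         ; the BIOS starts the boot sector in 16-bit real mode
org 0x7c00                      ; the BIOS loads the boot sector at 0000:7c00

        jmp short start         ; jump over the data

message: db "Hello World!", 0   ; null-terminated string

start:
        xor ax, ax              ; AX = 0
        mov ds, ax              ; DS = 0, so DS:SI matches the org address
        mov bx, ax              ; BH = 0 (display page) for INT 10h
        cld                     ; make lodsb move forward through memory

main:
        mov si, message         ; SI = address of the string

string:
        lodsb                   ; AL = byte at DS:SI, then SI = SI + 1
        cmp al, 0               ; end of string?
        jne char                ; no: print the character
        ; jmp short main        ; uncomment to print the message forever

end:
        jmp short end           ; infinite loop that does nothing

char:
        mov ah, 0x0e            ; INT 10h, AH=0Eh: teletype output of AL
        int 0x10
        jmp short string        ; fetch the next character

times 0200h - 2 - ($ - $$) db 0 ; pad with zeroes up to offset 510
dw 0xaa55                       ; signature, stored as bytes 55h AAh
EOF
```

Keeping end: directly after string: means a null character simply falls through into the idle loop, which matches the control flow described in the explanation.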

3. Explanation

The “org 0x7c00” directive tells NASM that our program will be loaded at 0000:7c00, which is where the BIOS copies the boot sector and transfers control once the POST is complete.

The first instruction in our program jumps over the data (here, just message:) to the start: routine, which zeroes out some of the registers we’ll be using later.

The main: routine loads the address of the message string into the SI or “Source Index” register before jumping to the string: routine.

The lodsb (load string byte) instruction in string: then loads a single character from the address in DS:SI into AL and increments SI (assuming the direction flag is clear) so that it points at the next character in the string. string: then checks that the loaded character isn’t null before jumping to the char: routine, which prints the character.

If the character loaded by lodsb is null, we know we’ve reached the end of the string so we don’t jump to char:. Instead, we let the program go on to end:, which is an infinite loop that does nothing.

If you want the program to run in a loop, printing message over and over, uncomment the ; jmp short main line.

A note on BIOS Interrupt Calls

During the boot process, the BIOS sets up some useful interrupt calls, like INT 10h (display routines) and INT 13h (disk access routines). A complete list of BIOS interrupt calls is available on Wikipedia. A BIOS interrupt call is selected by an interrupt number (e.g. 10h) and a function number in register AH. For instance, INT 13h, AH=00h resets a disk drive, INT 13h, AH=01h checks the status of a drive and INT 13h, AH=02h reads sectors from a drive. Different BIOSes may implement interrupts differently.

In the char: routine, our program uses INT 10h, AH=0Eh, which invokes the function for TTY (teletype) output. INT 10h, AH=0Eh displays the character in register AL on the screen and advances the cursor. The character in AL is the one most recently loaded from the message string by lodsb in the string: routine. After the INT 10h BIOS call returns, the string: routine is executed again and the process repeats until the end of the message string is reached.

The last part of the source file is a little tricky. Every MBR needs to have the magic signature bytes 55h AAh stored at offset 1FEh (decimal 510, the last two bytes of the 512-byte sector). Because x86 is little-endian, the same two bytes written as a single 16-bit word are AA55h. If the magic signature isn’t present, the BIOS will typically refuse to treat the disk as bootable. times 0200h - 2 - ($ - $$) db 0 is a NASM directive to fill the space from the end of our program up to offset 510 (512 - 2) with zeroes.

For clarity, in NASM the single dollar sign ($) refers to the address of the current line being assembled, and the double dollar sign ($$) refers to the address of the start of the current section. ($ - $$) is therefore the number of bytes assembled so far, which is the length of our compiled program, and 0200h - 2 - ($ - $$) equals the number of zeroes we’ll need between the end of our program and the address of the MBR signature.
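The padding arithmetic and the signature layout can be checked from the shell without NASM. The sketch below is only an illustration: code_len is a hypothetical program length, the 90h filler bytes stand in for real code, and sector.bin is a made-up filename. It repeats the calculation from the times line, builds a stand-in sector, then prints the total size and the last two bytes.

```shell
# Hypothetical length of the assembled code and data, in bytes (illustration only).
code_len=38

# Same arithmetic as: times 0200h - 2 - ($ - $$) db 0
pad_len=$(( 0x200 - 2 - code_len ))
echo "padding bytes: $pad_len"

# Build a stand-in sector: placeholder "code", zero padding, then the signature.
head -c "$code_len" /dev/zero | LC_ALL=C tr '\0' '\220' > sector.bin  # 90h filler bytes
head -c "$pad_len" /dev/zero >> sector.bin                            # the zero padding
printf '\125\252' >> sector.bin                                       # octal escapes for 55h AAh

wc -c < sector.bin                  # total size of the sector in bytes
tail -c 2 sector.bin | od -An -tx1  # the two bytes at offset 1FEh
```

Running the same two checks (wc and tail piped into od) against a real assembled boot sector is a quick way to confirm that the padding and signature ended up where the BIOS expects them.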

4. Compiling the MBR and running it (or writing it to a disk’s MBR)

I saved my program into a file named hello.nasm and compiled it with NASM, which created a binary file named “hello”.
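The exact command isn’t shown here. A plausible invocation, offered as an assumption rather than the author’s own command, uses NASM’s flat binary output format (-f bin) and names the output file explicitly (-o). The guard only keeps the snippet from failing when nasm or hello.nasm is missing.

```shell
# Assumed invocation: flat binary output (-f bin) written to a file named "hello".
if command -v nasm >/dev/null 2>&1 && [ -f hello.nasm ]; then
    nasm -f bin -o hello hello.nasm
    # A boot sector is exactly one 512-byte sector, so this is a quick sanity check.
    wc -c < hello
else
    echo "need nasm installed and hello.nasm in the current directory"
fi
```

If the reported size is not 512, the times padding line or the signature line in the source is the first place to look.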

Testing your MBR in QEMU

When I first wrote this post, I used VirtualBox to emulate a computer. A little while later, I was doing some ARM development and testing in QEMU and I realized that QEMU provides a much simpler way to test MBR code. QEMU is convenient, but the VirtualBox method mimics the process of installing the MBR on an actual PC.

Installing qemu-system provides a bunch of different systems that each serve as complete virtual computers. You can install qemu-system on Ubuntu with sudo apt-get install qemu-system. Then simply run qemu-system-i386 hello to boot the virtual i386 system from your compiled MBR.

Testing your MBR in VirtualBox (simulating an actual PC)

To install and run the compiled MBR in VirtualBox, I created a new Linux 2.6 machine with a blank hard drive and put the Slitaz live CD ISO into its CD drive. I booted the machine and opened a terminal in Slitaz. I entered the following commands to copy the compiled MBR to the virtual machine:

Replace “user” with your username and “host” with the ip of your host OS. If you don’t have sshd running on your host machine, you’ll have to figure out a way to get the compiled MBR code onto your virtual machine. Slitaz has an ssh server you can start, but you need to have VirtualBox networking configured to use a bridged adapter instead of the default NAT if you want to ssh into the virtual machine. You can also use wget or something along those lines.

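If everything has worked so far, # dd if=hello of=/dev/sda writes the compiled MBR code to the first sector of your machine’s disk. Because the file is a full 512 bytes, this overwrites the whole sector, including the partition table.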
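The same write can be rehearsed against a file-backed disk image before touching a real device. This is a sketch under assumptions: disk.img and mbr-backup.bin are made-up filenames, and if no compiled hello file is present a 512-byte stand-in is generated so the commands can still run. It keeps a copy of the original first sector, writes the new one with conv=notrunc, and then checks the result.

```shell
# Stand-in boot sector, created only if no compiled "hello" exists yet.
[ -f hello ] || { head -c 510 /dev/zero; printf '\125\252'; } > hello

# A blank 1000-sector disk image to practise on (hypothetical filename).
dd if=/dev/zero of=disk.img bs=512 count=1000 2>/dev/null

# Keep a copy of the original first sector so it can be restored later.
dd if=disk.img of=mbr-backup.bin bs=512 count=1 2>/dev/null

# Write the boot sector; conv=notrunc stops dd from truncating the image to 512 bytes.
dd if=hello of=disk.img bs=512 count=1 conv=notrunc 2>/dev/null

# The image should still be 1000 sectors long and start with the contents of "hello".
wc -c < disk.img
head -c 512 disk.img | cmp - hello && echo "first sector matches"
```

On a block device such as /dev/sda truncation is not a concern, but taking the one-sector backup first is still worthwhile, since writing it back with dd restores the original MBR and partition table.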

Once the MBR had been written successfully, I issued two commands: # eject, which ejects the Slitaz live CD, and # reboot, which restarts the machine. The machine should then boot from our compiled MBR.

The first time I tried to run my compiled MBR, my machine hung because I forgot to include the end: routine and the machine tried to execute the nulls following my program. Remember that processors are dumb, and a processor will always try to load and execute the next instruction, whether you want it to or not.

The result should look something like this:

[Screenshot: Hello World MBR]

If everything worked, congratulations! If something didn’t work, it’s probably because there’s a difference between your machine and mine. I worked this out on a laptop running Ubuntu 12.04 64-bit. Commands and output may vary slightly on different architectures or virtual machines.


8 thoughts on “Hello World MBR Tutorial”

  1. I’m going to run a test: convert an MBR to .com and run it. I suppose an MBR can be straight assembly without a boot signature if you want it to run continuously.

    • If you don’t jump over the message, the computer will try to interpret the text as executable instructions.

  2. Thanks a lot for your quick response! However, it raises other questions.

    message: db "Hello, world!", 0

    1) Isn’t this line “similar” to declaring and initializing a variable?
    2) If that is the case, won’t executing it just define it in memory?
    3) If it is not like a variable, how does the compiler know it is “declared” if the line was jumped over?

    Sorry for going deeper into this, but I have asked these questions several times on forums and usually nobody answers!

    PS: I agree with your previous response and I know it works. I am just trying to understand WHY it works.

  3. OK, I have a theory about how it works. Please tell me if I am right or wrong!

    The whole program is compiled into machine code, including the db label directives. During compilation, any use of a db label is replaced with its actual memory location or value as needed.
    Before execution, the whole machine code, including the memory information defined by db labels, is stored in RAM.
    During program execution there is no need to execute the db labels, since the information they represent is already stored in RAM.

    If I am right, then it means that higher-level languages like C and C# actually encapsulate a lot more than I originally imagined.

  4. Wicked awesome tutorial, had no idea it was so simple. I’ve got an old laptop I’m eager to try this on, thanks a lot!

    artificer: I think you’ve got the general idea. Despite how we normally think of it, an assembly source file is a set of instructions to the assembler, not to the processor. Your source file essentially describes to the assembler how to produce the binary output file. The code instructions (a mnemonic and its arguments, like mov ds, ax) tell the assembler to generate the corresponding machine code at the current location in the output file. Other instructions to the assembler, like the db directive, simply tell it to write specific byte values to the output file. This is how you initialize variables or any other memory addresses in assembly. As the author mentioned, if you have data like this at the beginning of the program and do not jump over it, then the processor will try to execute it as machine code.

    As for declaring variables, the lines with colons (like message:) are called labels: a label is just a way to refer to a specific (but generally unknown) location in the generated output file (i.e., a memory address relative to the start of the output). In other words, it tells the assembler “anytime I use ‘message’, I’m actually referring to this address.” It makes your code much more readable than using numerical address values, and it saves you from having to figure out what the address actually is, because the assembler can do that for you. So if you put a label like this before a block of data (for instance, one that is allocated with db), then it works a lot like a variable (more specifically, a pointer).

    And you’re right, high-level compiled languages like C encapsulate a significant amount of detail, beyond just converting expressions into assembly instructions. Anytime you have an initialized or static variable, the compiler will use something along the lines of a db directive to allocate and optionally initialize memory for it. It may also produce some startup code which is actually executed before any of your code.

  5. Here is a simpler way to do this in virtualbox:

    First compile the MBR:
    ./nasm/nasm -f bin -o mbr.bin mbr.asm

    Then convert the output to a VDI image. The UUID can be any valid UUID, but it matters for the later steps:
    VBoxManage convertfromraw --format vdi --uuid \{9dc197b0-8d9e-4ad6-879b-c87d14940715\} mbr.bin mbr.vdi

    Now create a new VM in VirtualBox, using the output VDI file as its hard disk. When the VM starts up, it boots into the MBR we just compiled.

    To replace the MBR, just redo the steps above, using the same UUID. After that, start the VM again, and the new MBR will be loaded.
    If you do not specify a UUID, a new random one is created and VirtualBox won’t start the VM.

    You can also start the vm with VBoxManage.

  6. Following are the steps for doing it on Linux using qemu (without VirtualBox):
    Create a hard disk image for qemu using the dd command:
    dd if=/dev/zero of=./out bs=512 count=1000
    Copy the MBR to the first sector of the image:
    dd if=hello of=./out bs=512 count=1 conv=notrunc
    Run qemu to boot from the image:
    qemu ./out
    This should boot the disk image and print “Hello World”.