This second part also involves no coding, but it does require you to have your xv6 kernel up and running. That is, you must complete Part 0 in order to complete Part 1. In this part, you will use GDB to walk through the execution of the fork system call and answer a few questions as you do. When you are done, you will submit a text file with your answers.

Part 1: How processes are born

15 points

To be completed individually

In this part, you will use GDB to walk through the xv6 code as it creates a new process. We will provide the GDB commands for you. All you have to do is follow the instructions and answer the questions that we ask along the way.

To get started, open two terminals and make sure you are in the xv6 directory in both. In one terminal, run xv6 with remote debugging enabled by typing:

$ make qemu-nox-gdb

In the other terminal, run GDB as a remote debugger by typing:

$ gdb kernel

This terminal (your GDB terminal) should now be connected to your xv6 terminal. Meanwhile, in the other terminal, xv6 is currently stalling the boot-up process while it waits for you to give commands via GDB. Tell GDB to allow execution to continue by typing:

(gdb) c

Wait for xv6 to boot up until it reaches the shell prompt. Then in your GDB terminal, interrupt xv6 by hitting:

control+c

You should see GDB give the following output:

Program received signal SIGINT, Interrupt.
The target architecture is assumed to be i386

GDB will also print out an address and a line number of the exact spot in code that is currently being executed. This will vary because the kernel's scheduler is looping through the process table looking for processes to execute. You don't have to worry about that for now. The only process we currently care about is the shell process that is sitting there, waiting for input.

We want to set a breakpoint in the shell program, but GDB currently has the symbol table for the kernel loaded, so we have to tell it to load the symbol file for the shell program by typing:

(gdb) symbol-file sh.o

When it asks if you want to load a new symbol table, type 'y'. We can now set a breakpoint in the shell program by typing:

(gdb) b sh.c:160

This command sets a breakpoint (b) in the file sh.c at line 160. Read the code at line 160 in sh.c. Take a quick look at the while loop that this line sits within. Now return control to xv6 by typing 'c' in your GDB terminal.

Switch over to your xv6 terminal. We want to run a program that will trigger the breakpoint we just set. Let's run the echo program. Type:

$ echo hello world

You should see that your breakpoint has been hit. In GDB you will see this output:

Breakpoint 1, main () at sh.c:160

GDB will also print the code on line 160. In GDB, type 'n', which should bring you to line 167. Make sure you understand why you didn't enter the if block on lines 161-165. On line 167, we are about to execute a function called fork1(). We want to watch that function execute, so set a breakpoint by typing:

(gdb) b fork1

Then type 'c' to continue. You are now inside the fork1() function. Pretty much the first thing this function does is call a function called fork(). Don't be fooled by how similar the name is. fork() is a system call and will invoke the execution of kernel code.

Try to set a breakpoint in the fork syscall by typing:

(gdb) b fork

GDB will give the following warning:

Function "fork" not defined.
Make breakpoint pending on future shared library load? (y or [n])

The reason for this warning is that we currently have the symbol table for sh loaded, but the code for the fork syscall is in the kernel. Type 'y', then load the symbol table for the kernel by typing:

(gdb) symbol-file kernel

Type 'y' again to confirm that you want to load the new symbol table. You should see the following warnings in GDB:

Error in re-setting breakpoint 1: No source file named sh.c.
Error in re-setting breakpoint 2: Function "fork1" not defined.

These messages tell us that the breakpoints we set earlier are no longer valid under the new symbol table. We have to delete them to avoid any trouble:

(gdb) d 1

(gdb) d 2

These commands tell GDB to delete (d) breakpoints #1 and #2.

Now we should also confirm that our fork breakpoint was automatically set after the kernel symbol table loaded. To get info about your current breakpoints, type:

(gdb) info b

You should see a list of your breakpoints (just one breakpoint in the list for now). Make sure that it looks like this:

Num Type       Disp Enb Address    What
3   breakpoint keep y   0x80103ab5 in fork at proc.c:185
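The Address column is printed in hexadecimal. As a rough illustration of how to read such a value, the sketch below converts a hexadecimal string to a number and compares it against 0x80000000. That constant is assumed here to match KERNBASE in xv6's memlayout.h, and the names parse_hex_address, is_kernel_address, and DEMO_KERNBASE are hypothetical. The address on your machine may differ from the one shown above.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the kernel is mapped at and above this virtual address. */
#define DEMO_KERNBASE 0x80000000u

/* Convert text such as "0x80103ab5" (the 0x prefix is optional) to a number. */
static uint32_t parse_hex_address(const char *text)
{
    return (uint32_t)strtoul(text, NULL, 16);
}

/* Return 1 if the address lies in the assumed kernel range, 0 otherwise. */
static int is_kernel_address(uint32_t addr)
{
    return addr >= DEMO_KERNBASE;
}
```

For the example address above, parse_hex_address("0x80103ab5") yields 2148547253 in decimal, and is_kernel_address reports that it falls in the kernel range under the stated assumption.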

Type 'c' in GDB and you should hit your fork breakpoint. Go back to your text editor or GitHub, open the file proc.c, and look at line 185. Indeed, you are inside the fork function. To see how we got here, let's inspect the call stack. In GDB, type:

(gdb) bt

On the top of the stack in spot #0 is the fork() function we are currently sitting in. You may have expected that this function was called directly by the fork1() function we were looking at above, but instead, in spot #1, we see that this fork() function was actually called by syscall(), which was called by trap(), which was called by some assembly code in trapasm.S. These functions are evidence of the work done by the kernel "behind the curtain" whenever a user program makes a syscall.
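One common way for a kernel to get from a generic syscall() entry point to a specific handler such as fork() is a table of function pointers indexed by system call number. The sketch below is a simplified stand-alone model of that idea, not the xv6 source. Every name in it (demo_dispatch, demo_table, the SYS_DEMO_* numbers, and the stand-in handlers with their return values) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical syscall numbers for this sketch only. */
enum { SYS_DEMO_FORK = 1, SYS_DEMO_GETPID = 2 };

/* Stand-in handlers; a real kernel's handlers do the actual work. */
static int demo_fork(void)   { return 42; }  /* made-up return value */
static int demo_getpid(void) { return 7; }   /* made-up return value */

/* Table indexed by syscall number; entry 0 is left empty (NULL). */
static int (*const demo_table[])(void) = {
    [SYS_DEMO_FORK]   = demo_fork,
    [SYS_DEMO_GETPID] = demo_getpid,
};

/*
 * Model of a dispatcher: look the number up in the table and call
 * the handler, or return -1 for an unknown system call.
 */
static int demo_dispatch(int num)
{
    size_t count = sizeof demo_table / sizeof demo_table[0];

    if (num > 0 && (size_t)num < count && demo_table[num] != NULL)
        return demo_table[num]();
    return -1;
}
```

In this model the user program never calls demo_fork directly. It only supplies a number, and the dispatcher decides which kernel function runs, which is consistent with the handler's caller in the backtrace being the dispatcher rather than the user-level wrapper.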

Now let's look more closely at the fork() function we are still sitting in. Read the comments above the function to get an idea of what it does. Then read through the function code itself and try to understand how it works.

Question 1: What is the type of variable np?

Question 2: What does np represent? In other words, what do the letters 'n' and 'p' stand for?

Question 3: What does the variable curproc represent? Hint: You can find its type by looking at its declaration in proc.h.

If you can't answer these questions definitively right now, that's OK; we're going to step through the code. In GDB, type 'n' to execute line 185. Then type:

(gdb) print np

GDB outputs the type of the variable and the value. The type is your answer to question 1 above. The value is actually an address. Since np is a pointer, it stores the address of an object that it is pointing to. This leads to our next question...

Question 4: What is the address of the object that np is pointing to?

Now have a look at the object itself by typing:

(gdb) print *np
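The difference between printing a pointer and printing what it points to is ordinary C pointer behavior. The sketch below uses a made-up struct demo_record (it is not the structure used in the kernel) to show both views: the pointer's value is an address, and dereferencing it gives the fields of the object stored there. All names here are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Made-up structure for illustration only. */
struct demo_record {
    int  id;
    char name[16];
};

/* A single statically allocated object for the pointer to refer to. */
static struct demo_record demo_slot;

/* Fill in the object and hand back a pointer to it. */
static struct demo_record *demo_alloc(void)
{
    demo_slot.id = 3;
    strcpy(demo_slot.name, "demo");
    return &demo_slot;
}

/* Show both views: the address held by the pointer, then the object itself. */
static void demo_show(const struct demo_record *rec)
{
    printf("pointer value: %p\n", (const void *)rec);   /* like: print rec  */
    printf("object: id=%d name=%s\n", rec->id, rec->name); /* like: print *rec */
}
```

Calling demo_show(demo_alloc()) prints an address on the first line, which varies from run to run and machine to machine, and the object's fields on the second.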

We'll actually want to keep an eye on this object because its attributes will gradually be initialized throughout the fork syscall. We can tell GDB to display this object automatically by typing:

(gdb) display *np

Then type 'n' to execute the next line of code and notice that GDB automatically displays the np object again. Look at the line of code that just executed (line 193), then look at which attributes of the np object have changed as a result.

Continue stepping through the fork function, watching as the values within the np object continue to change. Make sure you stop before the function returns on line 221.

Question 5: What is the name of the child process that was created? Note: Although you may not believe your own answer at first, you can trust that GDB is not lying to you. The child process will later be renamed and repurposed.

Now take a look at the value of pid, which the fork syscall is about to return, by typing:

(gdb) print pid

Question 6: What is the value of pid?

Question 7: In which process, the parent or the child, will this value be returned?

Resources

Some useful GDB and QEMU commands: https://pdos.csail.mit.edu/6.828/2018/labguide.html
xv6 textbook: https://pdos.csail.mit.edu/6.828/2018/xv6/book-rev11.pdf
xv6 code on GitHub: https://github.com/starzia/xv6
Hexadecimal converter: https://www.mathsisfun.com/binary-decimal-hexadecimal-converter.html