开始日期：22.5.3

操作系统：Ubuntu20.0.4

my github repository: duilec/MITS6.081-fall2021/tree/thread

Lab Multithreading

写在前面

多线程这一部分以课程为主会好很多，教材的解释太繁琐了。

Uthread: switching between threads

实现用户态的线程切换，同内核态的线程切换比较，步骤减少了许多，不用多考虑陷阱帧（trapframe）的切换问题。只需要考虑栈指针、返回地址以及被调用者寄存器（callee register）。记得这是在一个进程内创建多个线程，所以contxt是自定义的，并没有实际涉及到硬件。

这里实际上涉及到了用户级线程（ULT）的概念，这些线程由用户进程直接管理和调度，与内核无关。

整个过程：第一次调用thread_schedule();之后，会离开thread[0]（也即是进程main），之后就一直在三个线程thread[1] ~ thread[3]之间一直互相切换，直到exit(0)。

Makefile插入_uthread

UPROGS=\
	$U/_cat\
	$U/_echo\
	...
	$U/_uthread\ # insert _uthread

线程切换时需要保存/恢复被调用者寄存器，参照swthc.S即可

注意context的定义必须在thread之前，否则会报错incomplete type is not allowed

/* we must defind it before defining thread */
/* otherwise, we will meet "incomplete type is not allowed" */
struct context {
  uint64 ra;
  uint64 sp;

  // callee-saved
  uint64 s0;
  uint64 s1;
  uint64 s2;
  uint64 s3;
  uint64 s4;
  uint64 s5;
  uint64 s6;
  uint64 s7;
  uint64 s8;
  uint64 s9;
  uint64 s10;
  uint64 s11;
};

struct thread {
  char       stack[STACK_SIZE]; /* the thread's stack */
  int        state;             /* FREE, RUNNING, RUNNABLE */
  struct context context;       /* Saved registers for user context switches. */
};

	/* uthread_switch.S */
	.text

	/*
         * save the old thread's registers,
         * restore the new thread's registers.
         */

	.globl thread_switch
thread_switch:
	/* YOUR CODE HERE */
	sd ra, 0(a0)
	sd sp, 8(a0)
	sd s0, 16(a0)
	sd s1, 24(a0)
	sd s2, 32(a0)
	sd s3, 40(a0)
	sd s4, 48(a0)
	sd s5, 56(a0)
	sd s6, 64(a0)
	sd s7, 72(a0)
	sd s8, 80(a0)
	sd s9, 88(a0)
	sd s10, 96(a0)
	sd s11, 104(a0)

	ld ra, 0(a1)
	ld sp, 8(a1)
	ld s0, 16(a1)
	ld s1, 24(a1)
	ld s2, 32(a1)
	ld s3, 40(a1)
	ld s4, 48(a1)
	ld s5, 56(a1)
	ld s6, 64(a1)
	ld s7, 72(a1)
	ld s8, 80(a1)
	ld s9, 88(a1)
	ld s10, 96(a1)
	ld s11, 104(a1)
	
	ret    /* return to ra */

参考kernel/proc.c/allocproc()，使线程恢复被调用者寄存器之后，能够直接跳到函数头部，同时，栈恢复为对应线程的栈。

uthread.c中可以看到：*it needs a stack so that the first thread_switch() can save thread 0’s state.*，说明stack是用来存放state的，但笔者没有搞清楚是怎么存放的。
笔者搞情况怎么存放的了，

Swtch(kernel/swtch.S:3)saves only callee-saved registers; the C compiler generates code in the caller to save caller-saved registers on the stack.

也就是说线程切换时，C编译器会把caller-saved registers存放在当前线程的stack中，而state是存放在一个caller-saved register当中的。

void 
thread_create(void (*func)())
{
  struct thread *t;

  for (t = all_thread; t < all_thread + MAX_THREAD; t++) {
    if (t->state == FREE) break;
  }
  t->state = RUNNABLE;
  // YOUR CODE HERE
  // from kernel/proc.c/allocproc(), we know we need get thread's address of ret and thread's stack pointer
  t->context.ra = (uint64)func;
  t->context.sp = (uint64)(t->stack + STACK_SIZE);
  // I know we need stack to store state, but how to use it to store 'state'?
}

从当前线程的callee register切换到下一个线程的callee register

void 
thread_schedule(void)
{
...
  if (current_thread != next_thread) {         /* switch threads?  */
    next_thread->state = RUNNING;
    t = current_thread;
    current_thread = next_thread;
    /* YOUR CODE HERE
     * Invoke thread_switch to switch from t to next_thread:
     * thread_switch(??, ??);
     */
    // calling the first 'thread_schedule()' from main
    // when first 'thread_switch()' reture, we will go to the address of thread_a(equal calling thread_a)
    // then go to thread_b->_c->_a->_b->...->exit(0)
    thread_switch((uint64)&t->context, (uint64)&next_thread->context);
  }else
    next_thread = 0;
}

接下来的实验都是在ubuntu上做的了，实验文件也都在文件夹notxv6中，需要用到linux的不少接口，不过，都大同小异。

Using threads

在ph.c前部声明一个全局锁

...
int keys[NKEYS];
int nthread = 1;

pthread_mutex_t lock;            // declare a lock
...

在main()中把锁给初始化

int
main(int argc, char *argv[])
{
  pthread_t *tha;
  void *value;
  double t1, t0;
  pthread_mutex_init(&lock, NULL); // initialize the lock
  ...

修改put()

第一对锁（先上锁再解锁）是防止双线程产生竞争（同时修改key和value）
第二对锁（先解锁再上锁）是用来加速put()的，因为使用第二对锁的这段代码不需要修改key或者value，只是访问了key，而不是修改

static 
void put(int key, int value)
{
  pthread_mutex_lock(&lock);       // first_mutex: acquire lock
  int i = key % NBUCKET;

  // is the key already present?
  // we don't need modify key or value in this 'for() loop' 
  pthread_mutex_unlock(&lock);       // second_mutex: release lock
  struct entry *e = 0;
  for (e = table[i]; e != 0; e = e->next) {
    if (e->key == key)
      break;
  }
  pthread_mutex_lock(&lock);       // second_mutex: acquire lock
  if(e){
    // update the existing key.
    e->value = value;
  } else {
    // the new is new.
    insert(key, value, &table[i], table[i]);
  }
  pthread_mutex_unlock(&lock);       // first_mutex: release lock
}

Barrier

整体思路是：每一轮都有同样数目的线程来到，我们会阻塞所有线程，直到这一轮的最后一个线程进来，这时，我们将这一轮除最后一个线程外的所有线程唤醒，然后进入下一轮。

进入下一轮之前记得将bstate.nthread = 0，使得下一轮能够重新计数
执行pthread_cond_wait()之后也需要解锁，否则会产生死锁

go to sleep on cond, releasing lock mutex, acquiring upon wake up

进入睡眠之后，在pthread_cond_wait()中线程会释放锁，当线程被唤醒后会立刻获得锁，如果进入下一轮前没释放掉当前这一轮的锁，在下一轮就会一直请求锁，产生死锁

static void 
barrier()
{
  // YOUR CODE HERE
  //
  // Block until all threads have called barrier() and
  // then increment bstate.round.
  //
  pthread_mutex_lock(&bstate.barrier_mutex);  // acquire lock

  /* next thread */
  bstate.nthread++;
  
  /* next round */
  /* the last thread coming, we go to next round and wake up all threads */
  /* we must set 'bstate.nthread = 0' to call barrier() for all threads again */
  if (bstate.nthread == nthread){
    bstate.nthread = 0;
    bstate.round++;
    pthread_cond_broadcast(&bstate.barrier_cond); // wake up every thread sleeping on cond
  }

  /* block all thread by pthread_cond_broadcast()*/
  else{
     // when we not use 'else', 'pthread_cond_wait()' will be executed after 'pthread_cond_broadcast()'
    pthread_cond_wait(&bstate.barrier_cond, &bstate.barrier_mutex);  // go to sleep on cond, releasing lock mutex, acquiring upon wake up
  }  

  pthread_mutex_unlock(&bstate.barrier_mutex);  // release lock
}

./grade-lab-thread

duile@LAPTOP-II21PK0U:xv6-labs-2021$ ./grade-lab-thread
make: 'kernel/kernel' is up to date.
== Test uthread == uthread: OK (2.1s)
== Test answers-thread.txt == answers-thread.txt: OK
== Test ph_safe == gcc -o ph -g -O2 -DSOL_THREAD -DLAB_THREAD notxv6/ph.c -pthread
ph_safe: OK (10.4s)
== Test ph_fast == make: 'ph' is up to date.
ph_fast: OK (25.5s)
== Test barrier == gcc -o barrier -g -O2 -DSOL_THREAD -DLAB_THREAD notxv6/barrier.c -pthread
barrier: OK (3.2s)
== Test time ==
time: OK
Score: 60/60

总结

完成日期：2022.5.4
第一个实验想到了要跳转到func()，但忘记了是给context.ra使用函数地址func，在切换完context之后，以ret的方式跳转到func()
第三个实验没考虑到else的使用，如果没有else，if之后仍会执行pthread_cond_wait()，导致最后一个线程睡眠。
这是做实验以来最快完成的一次，代码量很少
最近听音乐都是用随机模式，下意识以为的一首歌竟然不是，很有意思

Memory Dot

MIT6.S081 Lab Multithreading

Lab Multithreading

写在前面

Uthread: switching between threads

Using threads

Barrier

总结