
MIT6.S081 Lab Lock

Start date: 2022.6.16

OS: Ubuntu 20.04

Link: Lab Lock

my github repository: duilec/MITS6.081-fall2021/tree/lock

Lab Lock

Preface

This part does not touch implementation details, so feel free to read on.

In summary: the first exercise, kalloctest, uses spinlocks and an array of free lists;
the second, bcachetest, uses spinlocks, hash buckets (hash-linked lists), and LRU implemented with timestamps.

Pitfalls

  • Do not use a sleep lock for cache.lock

    Using a sleep lock for cache.lock is a trick that passes the tests, but it only shifts the contention from the spinlock over to the sleep lock; the contention is still there, which defeats the purpose of the lab (to reduce contention as much as possible).

  • Do not use fixed lists of bufs in a fixed array; use a fixed number of hash buckets but distribute the bufs dynamically

    I started with the former and could not pass the tests. My guess is that a few of the lists are accessed very frequently, but because each list's bufs were fixed (at most 4 bufs per list in my partition) and could not grow, contention was barely reduced.

  • Following the 2020 fall version, change FSSIZE in param.h from 1000 to 10000

    Otherwise you will hit panic: balloc: out of blocks

  • The manywrite test in usertests triggers a bug: panic: freeing free block

    This is most likely caused by incorrect locking in bget. bget must be atomic; otherwise the same disk block can end up cached twice, which leads to a double free.

    The hints point this out as well:

    It is OK to serialize eviction in bget (i.e., the part of bget that selects a buffer to re-use when a lookup misses in the cache).

  • For bcachetest, it is best to run make clean before make qemu

    Otherwise bugs from the previous build can linger in bcachetest.

review

  • bcache itself is a data structure in kernel virtual address space: a range of virtual addresses created when the kernel starts (kernel/main.c). It lives in kernel virtual space and is direct-mapped to physical memory. We do not need to know its exact physical location; we just use it to refer to disk blocks, so it behaves like a cache. (2022.8.12)


Lab content

Warning: this part covers implementation details

Memory allocator

Implement a memory allocator that reduces contention. The main idea: the original design has a single freelist with a single lock that all eight CPUs fight over; we redesign it into eight freelists and eight locks, one per CPU.
One question comes up along the way: what should a CPU do when its own freelist has no free pages left? The lab description suggests stealing from another CPU's freelist. My implementation uses a for loop over the array of freelists (kmems[NCPU]) to find a free page; if no CPU has a free page left, it can only return 0.

Along the way I also ran into a C-language question: how is a struct declaration named and used?
After looking things up and testing for myself, I understood that the name before the { is the type name, used for defining further variables, while the name after the } is a variable name, used directly.
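
A minimal sketch of that distinction, using a hypothetical struct Demo (not from the lab code):

// "Demo" before the '{' is the type name; it can be reused to declare
// further variables, e.g. an array like kmems[NCPU].
// "single" after the '}' is a variable name created by this declaration.
struct Demo {
  int value;
} single;            // one variable of type "struct Demo"

struct Demo many[8]; // an array of the same type, like kmems[NCPU]

int main(void) {
  single.value = 1;
  many[0].value = 2;
  return 0;
}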

access code

// Physical memory allocator, for user processes,
// kernel stacks, page-table pages,
// and pipe buffers. Allocates whole 4096-byte pages.

#include "types.h"
#include "param.h"
#include "memlayout.h"
#include "spinlock.h"
#include "riscv.h"
#include "defs.h"

void freerange(void *pa_start, void *pa_end);

extern char end[]; // first address after kernel.
                   // defined by kernel.ld.

struct run {
  struct run *next;
};

// we need the type name (Keme) so we can declare the array of structs (kmems[NCPU])
struct Keme{
  struct spinlock lock;
  struct run *freelist;
};

struct Keme kmems[NCPU];

void
kinit()
{
  // set up eight "kmem"s, one per CPU
  for(int i = 0; i < NCPU; i++){
    initlock(&kmems[i].lock, "kmem");
    // initially build the whole freelist under kmems[0]
    if (i == 0)
      freerange(end, (void*)PHYSTOP);
  }
}

void
freerange(void *pa_start, void *pa_end)
{
  char *p;
  p = (char*)PGROUNDUP((uint64)pa_start);
  for(; p + PGSIZE <= (char*)pa_end; p += PGSIZE)
    kfree(p);
}

// Free the page of physical memory pointed at by v,
// which normally should have been returned by a
// call to kalloc(). (The exception is when
// initializing the allocator; see kinit above.)
void
kfree(void *pa)
{
  struct run *r;

  if(((uint64)pa % PGSIZE) != 0 || (char*)pa < end || (uint64)pa >= PHYSTOP)
    panic("kfree");

  // Fill with junk to catch dangling refs.
  memset(pa, 1, PGSIZE);

  r = (struct run*)pa;

  // get current cpu_id
  push_off();
  int cpu_id = cpuid();
  pop_off();

  // return the page to cpu_id's freelist
  acquire(&kmems[cpu_id].lock);
  r->next = kmems[cpu_id].freelist;
  kmems[cpu_id].freelist = r;
  release(&kmems[cpu_id].lock);
}

// Allocate one 4096-byte page of physical memory.
// Returns a pointer that the kernel can use.
// Returns 0 if the memory cannot be allocated.
void *
kalloc(void)
{
  struct run *r;

  // get current cpu_id
  push_off();
  int cpu_id = cpuid();
  pop_off();

  acquire(&kmems[cpu_id].lock); // 'acquire' also disables interrupts

  r = kmems[cpu_id].freelist;

  // if the current cpu's freelist has a free page, use it
  if(r){
    kmems[cpu_id].freelist = r->next;
    release(&kmems[cpu_id].lock);
    memset((char*)r, 5, PGSIZE); // fill with junk
    return (void*)r;
  }

  // otherwise steal a free page from another cpu's freelist
  else{
    release(&kmems[cpu_id].lock);
    for(int i = 0; i < NCPU; i++){
      // skip our own freelist; we already know it is empty
      if (i != cpu_id){
        acquire(&kmems[i].lock);

        // the last list is also empty, so return 0
        if(i == NCPU - 1 && kmems[i].freelist == 0){
          release(&kmems[i].lock);
          return (void*)0;
        }

        // it is OK to allocate a free page by stealing it from another cpu
        if(kmems[i].freelist){
          struct run *to_alloc = kmems[i].freelist;
          kmems[i].freelist = to_alloc->next;
          release(&kmems[i].lock);
          memset((char*)to_alloc, 5, PGSIZE); // fill with junk
          return (void*)to_alloc;
        }

        // nothing to steal here, so release this lock before moving on
        release(&kmems[i].lock);
      }
    }
  }

  // Returns 0 if the memory cannot be allocated.
  return (void*)0;
}

Buffer cache

Implement concurrency for the buffer cache while preserving an invariant: at any time, only one CPU or process uses a given buffer, which is what atomicity means here. At the same time, do not make all other processes or CPUs wait on a single cache lock (which hints at using cache.lock less, or not using it at all).

The main idea is to split the 30 bufs (NBUF == 30) across 13 hash buckets, each with its own spinlock (bcache.bucket). When reusing or evicting a buf, take the spinlock of the bucket that buf belongs to, and only release it after the reuse or eviction is complete. Following this rule is what makes the invariant hold.
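
In code, the shape of this design is roughly the following (a sketch that simply mirrors the definitions in the full listing further below; struct spinlock and struct buf come from the xv6 kernel headers):

#define NBUK 13
#define hash(dev, blockno) ((dev * blockno) % NBUK) // map (dev, blockno) to a bucket

struct bucket {
  struct spinlock lock; // protects this bucket's list and its bufs' metadata
  struct buf head;      // dummy head of this bucket's singly linked list
};

struct {
  struct spinlock lock;        // only used to serialize eviction
  struct buf buf[NBUF];        // the 30 underlying bufs
  struct bucket buckets[NBUK]; // 13 buckets, each with its own lock
} bcache;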

  • At initialization, first set each bucket's head.next to null, then hand all 30 bufs to bucket[0]. Afterwards bufs are redistributed dynamically according to the buk_id computed by the hash function, again by stealing, except that the free buf picked must satisfy LRU.

  • In bget(), the buf's "identity", namely dev and blockno, comes in as arguments; after the hash function computes the buf's buk_id, there are two cases.

    • bucket[buk_id] already caches the corresponding buf: just reuse it.

    • It does not: then eviction is the only option. Traverse all buckets, find the buf that satisfies LRU, steal it out of its current bucket, cache it into the target bucket[buk_id], update its fields, and it is ready to use.

      • There is a double-caching problem here that causes panic: freeing free block: if two processes find the same buf at the same time and both cache it into the same bucket[buk_id], the block is cached twice and eventually freed twice.

        The fix is to re-check, right after inserting into the target bucket[buk_id], whether the block has already been cached there by another process; if it has, use that buf, and only otherwise update the stolen buf's fields and use it.

    • To guarantee atomicity, during the three steps of "looking up the corresponding buf", "finding the LRU buf and stealing it", and "caching it into the target bucket", the bucket being searched or used must stay locked.

  • LRU is implemented with timestamps:

    • At initialization, every timestamp is set to 0.
    • When a buf is released and its reference count drops to 0, its timestamp is updated.
    • When looking for a free buf to evict, pick the one whose reference count is 0 and whose timestamp is the largest.

brelse(), bpin(), and bunpin() do not need cache.lock at all; cache.lock is in fact optional. I only added cache.lock around the eviction path to make the tests pass; in practice the 13 bucket.lock's are enough to guarantee atomicity.

Accessing a buf's struct fields does not require taking or releasing its sleeplock ("buffer"), but it does require holding bucket.lock. Only when using the buf's data contents do you need to lock and unlock the sleeplock.

chapter 8.3
The buffer cache uses a per-buffer sleep-lock to ensure that only one thread at a time uses each buffer (and thus each disk block); bread() returns a locked buffer, and brelse() releases the lock.

So it is exactly to ensure that only one thread owns a buffer at a time that bget() returns a buf with its sleep-lock held (bread() calls bget()).

Questions

  • Why does setting the number of buckets to a prime reduce collisions when accessing the buckets? This touches on hash-table theory, which I did not pursue further; see the sketch after this list.
  • A related question: why not give every buffer its own lock? Each buffer already has a sleep-lock, but that lock is only used when reading or modifying the buf's data contents. So why not also give every buf its own spinlock? My guess is that 30 per-buf spinlocks would be too many and would noticeably hurt performance (it can still pass the tests, though), whereas 13 buckets strike a better balance between performance and cost.
  • Taking the address of bucket.head before acquire works fine,
    but reading bucket.head.next before acquire does not and leads to all kinds of bugs.
    I am still not sure why; perhaps it breaks atomicity? (Most likely yes: head.next is shared mutable state, so reading it without holding the lock races with concurrent insertions and removals, whereas the address of head itself never changes.)
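
On the prime-bucket question above, here is a small self-contained illustration (not part of the xv6 code): when the keys being hashed share a common factor with the bucket count, a composite count such as 12 maps them onto only a few buckets, while a prime count such as 13 spreads them over all buckets.

#include <stdio.h>

// Count how blocknos that happen to be multiples of 4 spread over 12 vs. 13 buckets.
int main(void) {
  int hits12[12] = {0}, hits13[13] = {0};
  for (int blockno = 0; blockno < 1000; blockno += 4) {
    hits12[blockno % 12]++;
    hits13[blockno % 13]++;
  }
  int used12 = 0, used13 = 0;
  for (int i = 0; i < 12; i++) if (hits12[i]) used12++;
  for (int i = 0; i < 13; i++) if (hits13[i]) used13++;
  printf("buckets actually used: %d/12 vs %d/13\n", used12, used13); // prints 3/12 vs 13/13
  return 0;
}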

access code

First, some definitions:

/* param.h */
...
#define FSSIZE 10000 // size of file system in blocks

/* buf.h */
struct buf {
  ...
  uchar data[BSIZE];
  uint timestamp; // the timestamp of current block
};

Then the implementation:

// Buffer cache.
//
// The buffer cache is a linked list of buf structures holding
// cached copies of disk block contents. Caching disk blocks
// in memory reduces the number of disk reads and also provides
// a synchronization point for disk blocks used by multiple processes.
//
// Interface:
// * To get a buffer for a particular disk block, call bread.
// * After changing buffer data, call bwrite to write it to disk.
// * When done with the buffer, call brelse.
// * Do not use the buffer after calling brelse.
// * Only one process at a time can use a buffer,
// so do not keep them longer than necessary.


#include "types.h"
#include "param.h"
#include "spinlock.h"
#include "sleeplock.h"
#include "riscv.h"
#include "defs.h"
#include "fs.h"
#include "buf.h"

#define NBUK 13
#define hash(dev, blockno) ((dev * blockno) % NBUK) // we use "mod" as the hash function

struct bucket{
  struct spinlock lock;
  struct buf head; // the head of the current bucket
};

struct {
  struct spinlock lock;
  struct buf buf[NBUF]; // the cache has 30 bufs
  struct bucket buckets[NBUK]; // the cache has 13 buckets
} bcache;

void
binit(void)
{
  struct buf *b;
  struct buf *prev_b;

  initlock(&bcache.lock, "bcache");

  for(int i = 0; i < NBUK; i++){
    initlock(&bcache.buckets[i].lock, "bcache.bucket");
    bcache.buckets[i].head.next = (void*)0; // every bucket starts out empty
    // hand all bufs to buckets[0] first
    if (i == 0){
      prev_b = &bcache.buckets[i].head;
      for(b = bcache.buf; b < bcache.buf + NBUF; b++){
        if(b == bcache.buf + NBUF - 1) // buf[29]
          b->next = (void*)0; // terminate buckets[0]'s list
        prev_b->next = b;
        b->timestamp = ticks; // at kernel initialization, ticks == 0
        initsleeplock(&b->lock, "buffer");
        prev_b = b; // link the next buf from here
      }
    }
  }
}

// Look through buffer cache for block on device dev.
// If not found, allocate a buffer.
// In either case, return locked buffer.
static struct buf*
bget(uint dev, uint blockno)
{
  struct buf *b;
  int buk_id = hash(dev, blockno); // get buk_id from the hash mapping

  // Is the block already cached?
  // atomic: each bucket.lock is held by only one process at a time
  acquire(&bcache.buckets[buk_id].lock);
  b = bcache.buckets[buk_id].head.next; // the first buf in buckets[buk_id]
  while(b){
    if(b->dev == dev && b->blockno == blockno){
      b->refcnt++;
      release(&bcache.buckets[buk_id].lock);
      acquiresleep(&b->lock);
      return b;
    }
    b = b->next;
  }
  release(&bcache.buckets[buk_id].lock);

  // Not cached.
  // Recycle the least recently used (LRU) unused buffer.
  int max_timestamp = 0;
  int lru_buk_id = -1; // the bucket id holding the current LRU candidate
  int is_better = 0; // did this bucket yield a better candidate?
  struct buf *lru_b = (void*)0;
  struct buf *prev_lru_b = (void*)0;

  // atomic: keep buckets[lru_buk_id] locked so no other process can take the candidate buf
  // find lru_buk_id: in each bucket[i], look for a buf with refcnt == 0 and the max timestamp
  struct buf *prev_b = (void*)0;
  for(int i = 0; i < NBUK; i++){
    prev_b = &bcache.buckets[i].head;
    acquire(&bcache.buckets[i].lock);
    while(prev_b->next){
      // we must use ">=" because max_timestamp == 0 the first time through
      if(prev_b->next->refcnt == 0 && prev_b->next->timestamp >= max_timestamp){
        max_timestamp = prev_b->next->timestamp;
        is_better = 1;
        prev_lru_b = prev_b; // remember the node before the candidate
        // no break: keep scanning for the max timestamp all the way to the tail of buckets[12]
      }
      prev_b = prev_b->next;
    }
    if(is_better){
      if(lru_buk_id != -1)
        release(&bcache.buckets[lru_buk_id].lock); // release the old buckets[lru_buk_id]
      lru_buk_id = i; // record the new lru_buk_id and keep it locked
    }
    else
      release(&bcache.buckets[i].lock); // no better candidate here, so release bucket[i]
    is_better = 0; // reset is_better before the next bucket[i]
  }

  // get lru_b (only if a candidate was found)
  if(prev_lru_b)
    lru_b = prev_lru_b->next;

  // steal lru_b from buckets[lru_buk_id] via prev_lru_b
  // and release buckets[lru_buk_id].lock
  if(lru_b){
    prev_lru_b->next = prev_lru_b->next->next;
    release(&bcache.buckets[lru_buk_id].lock);
  }

  // cache lru_b into buckets[buk_id]
  // atomic: each bucket.lock is held by only one process at a time
  acquire(&bcache.lock);
  acquire(&bcache.buckets[buk_id].lock);
  if(lru_b){
    lru_b->next = bcache.buckets[buk_id].head.next;
    bcache.buckets[buk_id].head.next = lru_b;
  }

  // if two processes race for the same block (same blockno), one of them may have
  // already cached it into buckets[buk_id]; re-check and reuse that buf if so.
  // otherwise the block would be cached twice and later freed twice:
  // "panic: freeing free block"
  b = bcache.buckets[buk_id].head.next; // the first buf in buckets[buk_id]
  while(b){
    if(b->dev == dev && b->blockno == blockno){
      b->refcnt++;
      release(&bcache.buckets[buk_id].lock);
      release(&bcache.lock);
      acquiresleep(&b->lock);
      return b;
    }
    b = b->next;
  }

  // no free buf was found in any bucket
  if (lru_b == 0)
    panic("bget: no buffers");

  lru_b->dev = dev;
  lru_b->blockno = blockno;
  lru_b->valid = 0;
  lru_b->refcnt = 1;
  release(&bcache.buckets[buk_id].lock);
  release(&bcache.lock);
  acquiresleep(&lru_b->lock);
  return lru_b;
}

// Return a locked buf with the contents of the indicated block.
struct buf*
bread(uint dev, uint blockno)
{
  struct buf *b;

  b = bget(dev, blockno);
  if(!b->valid) {
    virtio_disk_rw(b, 0);
    b->valid = 1;
  }
  return b;
}

// Write b's contents to disk. Must be locked.
void
bwrite(struct buf *b)
{
  if(!holdingsleep(&b->lock))
    panic("bwrite");
  virtio_disk_rw(b, 1);
}

// Release a locked buffer.
// Just update the timestamp; eviction uses it to pick a buffer.
void
brelse(struct buf *b)
{
  if(!holdingsleep(&b->lock))
    panic("brelse");

  releasesleep(&b->lock);

  int buk_id = hash(b->dev, b->blockno);
  acquire(&bcache.buckets[buk_id].lock);
  b->refcnt--;
  // update the timestamp when the buf becomes free (b->refcnt == 0)
  if(b->refcnt == 0)
    b->timestamp = ticks;
  release(&bcache.buckets[buk_id].lock);
}

void
bpin(struct buf *b) {
  int buk_id = hash(b->dev, b->blockno);
  acquire(&bcache.buckets[buk_id].lock);
  b->refcnt++;
  release(&bcache.buckets[buk_id].lock);
}

void
bunpin(struct buf *b) {
  int buk_id = hash(b->dev, b->blockno);
  acquire(&bcache.buckets[buk_id].lock);
  b->refcnt--;
  release(&bcache.buckets[buk_id].lock);
}

result

make[1]: Leaving directory '/home/duile/xv6-labs-2021'
== Test running kalloctest ==
$ make qemu-gdb
(97.6s)
== Test kalloctest: test1 ==
kalloctest: test1: OK
== Test kalloctest: test2 ==
kalloctest: test2: OK
== Test kalloctest: sbrkmuch ==
$ make qemu-gdb
kalloctest: sbrkmuch: OK (12.7s)
== Test running bcachetest ==
$ make qemu-gdb
(12.7s)
== Test bcachetest: test0 ==
bcachetest: test0: OK
== Test bcachetest: test1 ==
bcachetest: test1: OK
== Test usertests ==
$ make qemu-gdb
usertests: OK (187.3s)
== Test time ==
time: OK
Score: 70/70
duile@LAPTOP-II21PK0U:xv6-labs-2021$

Summary

  • Completion date: 2022.6.23
  • At first I wasn't clear on the concept of a hash table, so my struct definitions were wrong, and nothing after that could be right. I only figured it out after reading 《算法精解》.
  • Summary comments really matter: I finished kalloctest on the 16th, and by the 18th, when I wrote this blog post, I could barely remember the details.
  • For the second exercise I started with the wrong approach: I used a sleeplock and thought I had it right (wry smile). It then took a long time (about 30 hours) of debugging to learn how to use hash buckets and how to keep the "find the LRU buf and steal it" step atomic. Along the way I wrote a bug that passed the tests but made manywrite loop forever, and I had no idea why until I rewrote part of the code.
  • I learned to debug with panic and printf, which was great fun.
  • Recently listening to: Sunshine Girl by moumoon