
MIT6.S081 Lab Lock

Start date: 2022.6.16

OS: Ubuntu 20.04

Link: Lab Lock

my github repository: duilec/MITS6.081-fall2021/tree/lock

Lab Lock

Preface

This part does not touch implementation details, so feel free to read on.

In summary: the first exercise, kalloctest, uses spinlocks and an array of free lists;
the second, bcachetest, uses spinlocks, hash buckets (hash-linked lists), and LRU implemented with timestamps.

Pitfalls

  • Do not use a sleep lock for cache.lock

    Using a sleep lock for cache.lock is a trick that passes the tests, but it only shifts the contention from the spinlock over to the sleep lock; the contention is still there, which defeats the purpose of the lab (to reduce contention as much as possible).

  • Do not use fixed lists of bufs in a fixed array; use a fixed number of hash buckets but distribute the bufs dynamically

    I started with the former and could not pass the tests. My guess is that a few of the lists are accessed very frequently, but because each list's bufs were fixed (at most 4 bufs per list in my partition) and could not grow, contention was barely reduced.

  • Following the 2020 fall version, change FSSIZE in param.h from 1000 to 10000

    Otherwise you will hit panic: balloc: out of blocks

  • The manywrite test in usertests triggers a bug: panic: freeing free block

    This is most likely caused by incorrect locking in bget. bget must be atomic; otherwise the same disk block can end up cached twice, which leads to a double free.

    The hints point this out as well:

    It is OK to serialize eviction in bget (i.e., the part of bget that selects a buffer to re-use when a lookup misses in the cache).

  • For bcachetest, it is best to run make clean before make qemu

    Otherwise bugs from the previous build can linger in bcachetest.

review

  • bcache itself is a data structure in kernel virtual address space: a range of virtual addresses created when the kernel starts (kernel/main.c). It lives in kernel virtual space and is direct-mapped to physical memory. We do not need to know its exact physical location; we just use it to refer to disk blocks, so it behaves like a cache. (2022.8.12)


Lab content

Warning: this part covers implementation details

Memory allocator

Implement a memory allocator that reduces contention. The main idea: the original design has a single freelist with a single lock that all eight CPUs fight over; we redesign it into eight freelists and eight locks, one per CPU.
One question comes up along the way: what should a CPU do when its own freelist has no free pages left? The lab description suggests stealing from another CPU's freelist. My implementation uses a for loop over the array of freelists (kmems[NCPU]) to find a free page; if no CPU has a free page left, it can only return 0.

Along the way I also ran into a C-language question: how is a struct declaration named and used?
After looking things up and testing for myself, I understood that the name before the { is the type name, used for defining further variables, while the name after the } is a variable name, used directly.
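
A minimal sketch of that distinction, using a hypothetical struct Demo (not from the lab code):

// "Demo" before the '{' is the type name; it can be reused to declare
// further variables, e.g. an array like kmems[NCPU].
// "single" after the '}' is a variable name created by this declaration.
struct Demo {
  int value;
} single;            // one variable of type "struct Demo"

struct Demo many[8]; // an array of the same type, like kmems[NCPU]

int main(void) {
  single.value = 1;
  many[0].value = 2;
  return 0;
}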

access code

// Physical memory allocator, for user processes,
// kernel stacks, page-table pages,
// and pipe buffers. Allocates whole 4096-byte pages.

#include "types.h"
#include "param.h"
#include "memlayout.h"
#include "spinlock.h"
#include "riscv.h"
#include "defs.h"

void freerange(void *pa_start, void *pa_end);

extern char end[]; // first address after kernel.
                   // defined by kernel.ld.

struct run {
  struct run *next;
};

// we need the type name (Keme) so we can declare the array of structs (kmems[NCPU])
struct Keme{
  struct spinlock lock;
  struct run *freelist;
};

struct Keme kmems[NCPU];

void
kinit()
{
  // set up eight "kmem"s, one per CPU
  for(int i = 0; i < NCPU; i++){
    initlock(&kmems[i].lock, "kmem");
    // initially build the whole freelist under kmems[0]
    if (i == 0)
      freerange(end, (void*)PHYSTOP);
  }
}

void
freerange(void *pa_start, void *pa_end)
{
  char *p;
  p = (char*)PGROUNDUP((uint64)pa_start);
  for(; p + PGSIZE <= (char*)pa_end; p += PGSIZE)
    kfree(p);
}

// Free the page of physical memory pointed at by v,
// which normally should have been returned by a
// call to kalloc(). (The exception is when
// initializing the allocator; see kinit above.)
void
kfree(void *pa)
{
  struct run *r;

  if(((uint64)pa % PGSIZE) != 0 || (char*)pa < end || (uint64)pa >= PHYSTOP)
    panic("kfree");

  // Fill with junk to catch dangling refs.
  memset(pa, 1, PGSIZE);

  r = (struct run*)pa;

  // get current cpu_id
  push_off();
  int cpu_id = cpuid();
  pop_off();

  // return the page to cpu_id's freelist
  acquire(&kmems[cpu_id].lock);
  r->next = kmems[cpu_id].freelist;
  kmems[cpu_id].freelist = r;
  release(&kmems[cpu_id].lock);
}

// Allocate one 4096-byte page of physical memory.
// Returns a pointer that the kernel can use.
// Returns 0 if the memory cannot be allocated.
void *
kalloc(void)
{
  struct run *r;

  // get current cpu_id
  push_off();
  int cpu_id = cpuid();
  pop_off();

  acquire(&kmems[cpu_id].lock); // 'acquire' also disables interrupts

  r = kmems[cpu_id].freelist;

  // if the current cpu's freelist has a free page, use it
  if(r){
    kmems[cpu_id].freelist = r->next;
    release(&kmems[cpu_id].lock);
    memset((char*)r, 5, PGSIZE); // fill with junk
    return (void*)r;
  }

  // otherwise steal a free page from another cpu's freelist
  else{
    release(&kmems[cpu_id].lock);
    for(int i = 0; i < NCPU; i++){
      // skip our own freelist; we already know it is empty
      if (i != cpu_id){
        acquire(&kmems[i].lock);

        // the last list is also empty, so return 0
        if(i == NCPU - 1 && kmems[i].freelist == 0){
          release(&kmems[i].lock);
          return (void*)0;
        }

        // it is OK to allocate a free page by stealing it from another cpu
        if(kmems[i].freelist){
          struct run *to_alloc = kmems[i].freelist;
          kmems[i].freelist = to_alloc->next;
          release(&kmems[i].lock);
          memset((char*)to_alloc, 5, PGSIZE); // fill with junk
          return (void*)to_alloc;
        }

        // nothing to steal here, so release this lock before moving on
        release(&kmems[i].lock);
      }
    }
  }

  // Returns 0 if the memory cannot be allocated.
  return (void*)0;
}

Buffer cache

Implement concurrency for the buffer cache while preserving an invariant: at any time, only one CPU or process uses a given buffer, which is what atomicity means here. At the same time, do not make all other processes or CPUs wait on a single cache lock (which hints at using cache.lock less, or not using it at all).

The main idea is to split the 30 bufs (NBUF == 30) across 13 hash buckets, each with its own spinlock (bcache.bucket). When reusing or evicting a buf, take the spinlock of the bucket that buf belongs to, and only release it after the reuse or eviction is complete. Following this rule is what makes the invariant hold.
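
In code, the shape of this design is roughly the following (a sketch that simply mirrors the definitions in the full listing further below; struct spinlock and struct buf come from the xv6 kernel headers):

#define NBUK 13
#define hash(dev, blockno) ((dev * blockno) % NBUK) // map (dev, blockno) to a bucket

struct bucket {
  struct spinlock lock; // protects this bucket's list and its bufs' metadata
  struct buf head;      // dummy head of this bucket's singly linked list
};

struct {
  struct spinlock lock;        // only used to serialize eviction
  struct buf buf[NBUF];        // the 30 underlying bufs
  struct bucket buckets[NBUK]; // 13 buckets, each with its own lock
} bcache;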

  • At initialization, first set each bucket's head.next to null, then hand all 30 bufs to bucket[0]. Afterwards bufs are redistributed dynamically according to the buk_id computed by the hash function, again by stealing, except that the free buf picked must satisfy LRU.

  • In bget(), the buf's "identity", namely dev and blockno, comes in as arguments; after the hash function computes the buf's buk_id, there are two cases.

    • bucket[buk_id] already caches the corresponding buf: just reuse it.

    • It does not: then eviction is the only option. Traverse all buckets, find the buf that satisfies LRU, steal it out of its current bucket, cache it into the target bucket[buk_id], update its fields, and it is ready to use.

      • There is a double-caching problem here that causes panic: freeing free block: if two processes find the same buf at the same time and both cache it into the same bucket[buk_id], the block is cached twice and eventually freed twice.

        The fix is to re-check, right after inserting into the target bucket[buk_id], whether the block has already been cached there by another process; if it has, use that buf, and only otherwise update the stolen buf's fields and use it.

    • To guarantee atomicity, during the three steps of "looking up the corresponding buf", "finding the LRU buf and stealing it", and "caching it into the target bucket", the bucket being searched or used must stay locked.

  • LRU is implemented with timestamps:

    • At initialization, every timestamp is set to 0.
    • When a buf is released and its reference count drops to 0, its timestamp is updated.
    • When looking for a free buf to evict, pick the one whose reference count is 0 and whose timestamp is the largest.

brelse(), bpin(), and bunpin() do not need cache.lock at all; cache.lock is in fact optional. I only added cache.lock around the eviction path to make the tests pass; in practice the 13 bucket.lock's are enough to guarantee atomicity.

Accessing a buf's struct fields does not require taking or releasing its sleeplock ("buffer"), but it does require holding bucket.lock. Only when using the buf's data contents do you need to lock and unlock the sleeplock.

chapter 8.3
The buffer cache uses a per-buffer sleep-lock to ensure that only one thread at a time uses each buffer (and thus each disk block); bread() returns a locked buffer, and brelse() releases the lock.

So it is exactly to ensure that only one thread owns a buffer at a time that bget() returns a buf with its sleep-lock held (bread() calls bget()).

Questions

  • Why does setting the number of buckets to a prime reduce collisions when accessing the buckets? This touches on hash-table theory, which I did not pursue further; see the sketch after this list.
  • A related question: why not give every buffer its own lock? Each buffer already has a sleep-lock, but that lock is only used when reading or modifying the buf's data contents. So why not also give every buf its own spinlock? My guess is that 30 per-buf spinlocks would be too many and would noticeably hurt performance (it can still pass the tests, though), whereas 13 buckets strike a better balance between performance and cost.
  • Taking the address of bucket.head before acquire works fine,
    but reading bucket.head.next before acquire does not and leads to all kinds of bugs.
    I am still not sure why; perhaps it breaks atomicity? (Most likely yes: head.next is shared mutable state, so reading it without holding the lock races with concurrent insertions and removals, whereas the address of head itself never changes.)
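
On the prime-bucket question above, here is a small self-contained illustration (not part of the xv6 code): when the keys being hashed share a common factor with the bucket count, a composite count such as 12 maps them onto only a few buckets, while a prime count such as 13 spreads them over all buckets.

#include <stdio.h>

// Count how blocknos that happen to be multiples of 4 spread over 12 vs. 13 buckets.
int main(void) {
  int hits12[12] = {0}, hits13[13] = {0};
  for (int blockno = 0; blockno < 1000; blockno += 4) {
    hits12[blockno % 12]++;
    hits13[blockno % 13]++;
  }
  int used12 = 0, used13 = 0;
  for (int i = 0; i < 12; i++) if (hits12[i]) used12++;
  for (int i = 0; i < 13; i++) if (hits13[i]) used13++;
  printf("buckets actually used: %d/12 vs %d/13\n", used12, used13); // prints 3/12 vs 13/13
  return 0;
}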

access code

First, some definitions:

/* param.h */
...
#define FSSIZE 10000 // size of file system in blocks

/* buf.h */
struct buf {
  ...
  uchar data[BSIZE];
  uint timestamp; // the timestamp of current block
};

Then the implementation:

// Buffer cache.
//
// The buffer cache is a linked list of buf structures holding
// cached copies of disk block contents. Caching disk blocks
// in memory reduces the number of disk reads and also provides
// a synchronization point for disk blocks used by multiple processes.
//
// Interface:
// * To get a buffer for a particular disk block, call bread.
// * After changing buffer data, call bwrite to write it to disk.
// * When done with the buffer, call brelse.
// * Do not use the buffer after calling brelse.
// * Only one process at a time can use a buffer,
// so do not keep them longer than necessary.


#include "types.h"
#include "param.h"
#include "spinlock.h"
#include "sleeplock.h"
#include "riscv.h"
#include "defs.h"
#include "fs.h"
#include "buf.h"

#define NBUK 13
#define hash(dev, blockno) ((dev * blockno) % NBUK) // we use "mod" as the hash function

struct bucket{
  struct spinlock lock;
  struct buf head; // the head of the current bucket
};

struct {
  struct spinlock lock;
  struct buf buf[NBUF]; // the cache has 30 bufs
  struct bucket buckets[NBUK]; // the cache has 13 buckets
} bcache;

void
binit(void)
{
  struct buf *b;
  struct buf *prev_b;

  initlock(&bcache.lock, "bcache");

  for(int i = 0; i < NBUK; i++){
    initlock(&bcache.buckets[i].lock, "bcache.bucket");
    bcache.buckets[i].head.next = (void*)0; // every bucket starts out empty
    // hand all bufs to buckets[0] first
    if (i == 0){
      prev_b = &bcache.buckets[i].head;
      for(b = bcache.buf; b < bcache.buf + NBUF; b++){
        if(b == bcache.buf + NBUF - 1) // buf[29]
          b->next = (void*)0; // terminate buckets[0]'s list
        prev_b->next = b;
        b->timestamp = ticks; // at kernel initialization, ticks == 0
        initsleeplock(&b->lock, "buffer");
        prev_b = b; // link the next buf from here
      }
    }
  }
}

// Look through buffer cache for block on device dev.
// If not found, allocate a buffer.
// In either case, return locked buffer.
static struct buf*
bget(uint dev, uint blockno)
{
  struct buf *b;
  int buk_id = hash(dev, blockno); // get buk_id from the hash mapping

  // Is the block already cached?
  // atomic: each bucket.lock is held by only one process at a time
  acquire(&bcache.buckets[buk_id].lock);
  b = bcache.buckets[buk_id].head.next; // the first buf in buckets[buk_id]
  while(b){
    if(b->dev == dev && b->blockno == blockno){
      b->refcnt++;
      release(&bcache.buckets[buk_id].lock);
      acquiresleep(&b->lock);
      return b;
    }
    b = b->next;
  }
  release(&bcache.buckets[buk_id].lock);

  // Not cached.
  // Recycle the least recently used (LRU) unused buffer.
  int max_timestamp = 0;
  int lru_buk_id = -1; // the bucket id holding the current LRU candidate
  int is_better = 0; // did this bucket yield a better candidate?
  struct buf *lru_b = (void*)0;
  struct buf *prev_lru_b = (void*)0;

  // atomic: keep buckets[lru_buk_id] locked so no other process can take the candidate buf
  // find lru_buk_id: in each bucket[i], look for a buf with refcnt == 0 and the max timestamp
  struct buf *prev_b = (void*)0;
  for(int i = 0; i < NBUK; i++){
    prev_b = &bcache.buckets[i].head;
    acquire(&bcache.buckets[i].lock);
    while(prev_b->next){
      // we must use ">=" because max_timestamp == 0 the first time through
      if(prev_b->next->refcnt == 0 && prev_b->next->timestamp >= max_timestamp){
        max_timestamp = prev_b->next->timestamp;
        is_better = 1;
        prev_lru_b = prev_b; // remember the node before the candidate
        // no break: keep scanning for the max timestamp all the way to the tail of buckets[12]
      }
      prev_b = prev_b->next;
    }
    if(is_better){
      if(lru_buk_id != -1)
        release(&bcache.buckets[lru_buk_id].lock); // release the old buckets[lru_buk_id]
      lru_buk_id = i; // record the new lru_buk_id and keep it locked
    }
    else
      release(&bcache.buckets[i].lock); // no better candidate here, so release bucket[i]
    is_better = 0; // reset is_better before the next bucket[i]
  }

  // get lru_b (only if a candidate was found)
  if(prev_lru_b)
    lru_b = prev_lru_b->next;

  // steal lru_b from buckets[lru_buk_id] via prev_lru_b
  // and release buckets[lru_buk_id].lock
  if(lru_b){
    prev_lru_b->next = prev_lru_b->next->next;
    release(&bcache.buckets[lru_buk_id].lock);
  }

  // cache lru_b into buckets[buk_id]
  // atomic: each bucket.lock is held by only one process at a time
  acquire(&bcache.lock);
  acquire(&bcache.buckets[buk_id].lock);
  if(lru_b){
    lru_b->next = bcache.buckets[buk_id].head.next;
    bcache.buckets[buk_id].head.next = lru_b;
  }

  // if two processes race for the same block (same blockno), one of them may have
  // already cached it into buckets[buk_id]; re-check and reuse that buf if so.
  // otherwise the block would be cached twice and later freed twice:
  // "panic: freeing free block"
  b = bcache.buckets[buk_id].head.next; // the first buf in buckets[buk_id]
  while(b){
    if(b->dev == dev && b->blockno == blockno){
      b->refcnt++;
      release(&bcache.buckets[buk_id].lock);
      release(&bcache.lock);
      acquiresleep(&b->lock);
      return b;
    }
    b = b->next;
  }

  // no free buf was found in any bucket
  if (lru_b == 0)
    panic("bget: no buffers");

  lru_b->dev = dev;
  lru_b->blockno = blockno;
  lru_b->valid = 0;
  lru_b->refcnt = 1;
  release(&bcache.buckets[buk_id].lock);
  release(&bcache.lock);
  acquiresleep(&lru_b->lock);
  return lru_b;
}

// Return a locked buf with the contents of the indicated block.
struct buf*
bread(uint dev, uint blockno)
{
  struct buf *b;

  b = bget(dev, blockno);
  if(!b->valid) {
    virtio_disk_rw(b, 0);
    b->valid = 1;
  }
  return b;
}

// Write b's contents to disk. Must be locked.
void
bwrite(struct buf *b)
{
  if(!holdingsleep(&b->lock))
    panic("bwrite");
  virtio_disk_rw(b, 1);
}

// Release a locked buffer.
// Just update the timestamp; eviction uses it to pick a buffer.
void
brelse(struct buf *b)
{
  if(!holdingsleep(&b->lock))
    panic("brelse");

  releasesleep(&b->lock);

  int buk_id = hash(b->dev, b->blockno);
  acquire(&bcache.buckets[buk_id].lock);
  b->refcnt--;
  // update the timestamp when the buf becomes free (b->refcnt == 0)
  if(b->refcnt == 0)
    b->timestamp = ticks;
  release(&bcache.buckets[buk_id].lock);
}

void
bpin(struct buf *b) {
  int buk_id = hash(b->dev, b->blockno);
  acquire(&bcache.buckets[buk_id].lock);
  b->refcnt++;
  release(&bcache.buckets[buk_id].lock);
}

void
bunpin(struct buf *b) {
  int buk_id = hash(b->dev, b->blockno);
  acquire(&bcache.buckets[buk_id].lock);
  b->refcnt--;
  release(&bcache.buckets[buk_id].lock);
}

result

make[1]: Leaving directory '/home/duile/xv6-labs-2021'
== Test running kalloctest ==
$ make qemu-gdb
(97.6s)
== Test kalloctest: test1 ==
kalloctest: test1: OK
== Test kalloctest: test2 ==
kalloctest: test2: OK
== Test kalloctest: sbrkmuch ==
$ make qemu-gdb
kalloctest: sbrkmuch: OK (12.7s)
== Test running bcachetest ==
$ make qemu-gdb
(12.7s)
== Test bcachetest: test0 ==
bcachetest: test0: OK
== Test bcachetest: test1 ==
bcachetest: test1: OK
== Test usertests ==
$ make qemu-gdb
usertests: OK (187.3s)
== Test time ==
time: OK
Score: 70/70
duile@LAPTOP-II21PK0U:xv6-labs-2021$

Summary

  • Completion date: 2022.6.23
  • At first I wasn't clear on the concept of a hash table, so my struct definitions were wrong, and nothing after that could be right. I only figured it out after reading 《算法精解》.
  • Summary comments really matter: I finished kalloctest on the 16th, and by the 18th, when I wrote this blog post, I could barely remember the details.
  • For the second exercise I started with the wrong approach: I used a sleeplock and thought I had it right (wry smile). It then took a long time (about 30 hours) of debugging to learn how to use hash buckets and how to keep the "find the LRU buf and steal it" step atomic. Along the way I wrote a bug that passed the tests but made manywrite loop forever, and I had no idea why until I rewrote part of the code.
  • I learned to debug with panic and printf, which was great fun.
  • Recently listening to: Sunshine Girl by moumoon