0%

MIT6.S081 Lab FS

开始日期:22.07.05

操作系统:Ubuntu20.0.4

Link:Lab file system

my github repository: duilec/MITS6.081-fall2021/tree/fs

Lab Lock

写在前面

此部分不涉及实现细节,可以放心阅读

bigfile主要采用了二级链表实现

symlink需要清楚iput()open()create()unlink()namei()等函数是如何交互的,它们又是怎么让inode在cache中释放存储的。

踩坑

  • symlinktest中使用存储路径名字的方式失败
    笔者一开始采用的是symlink存储路径名字(字符指针),但在测试执行if(write(fd1, buf, sizeof(buf)) != 4)之后,路径名字会直接丢失,无法引用,导致失败。具体原因没有分析出来。
  • iput
    iput的作用是将ip->ref减一,如果ip->ref == 1就直接将其从cache中释放掉
  • fs.c的注释
    做symlinktest前,请务必通读fs.c的大段注释(大约在105 line
  • writebig: panic freeing free block
    如果完成了symlinktest,在执行usertests之前,得先执行一次bigfile。
    否则会导致出错,因为wirtebif要写和MAXFILE一样大的文件,而只有执行bigfile之后我们才有修改后的MAXFILE
  • make clean
    如果发现测试怎么样都通过不了,觉得思路也没错,那可能是之前版本有问题遗留可以尝试make clean

参考材料

实验内容

警告:此部分涉及实现细节

bigfile

实验要求:实现磁盘的二级映射,将磁盘空间由268 blocks扩大为65803 blocks,约为原来的256倍

实现思路:将原来的12个直接映射+1个一级间接映射改为11个直接映射+1个一级间接映射+一个二级映射

具体而言就是链表多链接一层即可,再改一下参数,写之前最好完完全全看懂原版的bmap()函数。
注意在进入二级映射之前,记得减掉一个NINDIRECT,保证获取的中间地址号不会多1

access code

首先是修改一些定义

1
2
3
4
5
6
7
8
9
10
11
/* fs.h */
#define NDIRECT 11
#define NINDIRECT (BSIZE / sizeof(uint))
#define MAXFILE (NDIRECT + NINDIRECT + (NINDIRECT*NINDIRECT))

/* file.h */
// in-memory copy of an inode
struct inode {
...
uint addrs[NDIRECT + 2]; // 11 direct blocks, 1 singly indirect block and 1 double indirect block
};

然后是具体实现

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
static uint
bmap(struct inode *ip, uint bn)
{
uint addr, *a;
struct buf *bp;

// 11 direct block[0 ~ 10] => [0 ~ 10]
if(bn < NDIRECT){
// Load direct block, allocating if necessary.
if((addr = ip->addrs[bn]) == 0)
ip->addrs[bn] = addr = balloc(ip->dev);
return addr;
}
bn -= NDIRECT;

// NDIRECT == 11
// 1 singly direct block[11] => [0 ~ 255]
if(bn < NINDIRECT){
// Load indirect block, allocating if necessary.
if((addr = ip->addrs[NDIRECT]) == 0)
ip->addrs[NDIRECT] = addr = balloc(ip->dev);

bp = bread(ip->dev, addr);
a = (uint*)bp->data;

if((addr = a[bn]) == 0){
a[bn] = addr = balloc(ip->dev);
log_write(bp);
}

brelse(bp);
return addr;
}
// go to double direct block[12]
bn -= NINDIRECT;

// 1 double direct block[12] => 256 * [0 ~ 255]
if(bn < NINDIRECT * NINDIRECT){
int mid_addr_num = bn / NINDIRECT; // get mid_addr_num and c-style: divide(/) to down
bn = bn % NINDIRECT; // get real bn

// Load indirect block, allocating if necessary.
if((addr = ip->addrs[NDIRECT + 1]) == 0)
ip->addrs[NDIRECT + 1] = addr = balloc(ip->dev);

// get the addr of mid_addr_num
bp = bread(ip->dev, addr);
a = (uint*)bp->data;
if((addr = a[mid_addr_num]) == 0){
a[mid_addr_num] = addr = balloc(ip->dev);
log_write(bp);
}
brelse(bp);

// get the addr of real bn
bp = bread(ip->dev, addr);
a = (uint*)bp->data;
if((addr = a[bn]) == 0){
a[bn] = addr = balloc(ip->dev);
log_write(bp);
}
brelse(bp);

return addr;
}
panic("bmap: out of range");
}

// Truncate inode (discard contents).
// Caller must hold ip->lock.
void
itrunc(struct inode *ip)
{
int i, j, k;
struct buf *bp;
uint *a, *botton_a;

// 11 direct block[0 ~ 10] => [0 ~ 10]
for(i = 0; i < NDIRECT; i++){
if(ip->addrs[i]){
bfree(ip->dev, ip->addrs[i]);
ip->addrs[i] = 0;
}
}

// NDIRECT == 11
// 1 singly direct block[11] => [0 ~ 255]
if(ip->addrs[NDIRECT]){
bp = bread(ip->dev, ip->addrs[NDIRECT]);
a = (uint*)bp->data;
for(j = 0; j < NINDIRECT; j++){
if(a[j])
bfree(ip->dev, a[j]);
}
brelse(bp);
bfree(ip->dev, ip->addrs[NDIRECT]);
ip->addrs[NDIRECT] = 0;
}

// 1 double direct block[12] => 256 * [0 ~ 255]
if(ip->addrs[NDIRECT + 1]){
// get a of mid_addr
bp = bread(ip->dev, ip->addrs[NDIRECT + 1]);
a = (uint*)bp->data;
brelse(bp);

for(j = 0; j < NINDIRECT; j++){
// get botton_a of botton_addr
// exist block?
if(a[j]){
bp = bread(ip->dev, a[j]);
botton_a = (uint*)bp->data;
brelse(bp);
// free content of botton_a
for(k = 0; k < NINDIRECT; k++)
if(botton_a[k])
bfree(ip->dev, botton_a[k]);
}
}
bfree(ip->dev, ip->addrs[NDIRECT + 1]);
ip->addrs[NDIRECT + 1] = 0;
}

ip->size = 0;
iupdate(ip);
}

实验要求:实现软连接(symboil/soft link)
软连接不同于硬连接(hard link),可以在不同的磁盘块下连接起来。比如,我想D盘执行一个程序,但程序文件的位于C盘。那么通过软连接,即可在D盘执行C盘的程序,而硬连接则不行。

实现思路:实现syccall: sys_symlink将目标文件inode指针存储到路径文件的inode结构中,之后即可通过路径文件来访问目标文件

具体实现过程中,需要注意路径文件本身可能存在也可能不存在,所以需要检查,根据情况进行创建。这需要处理两个情况,即如果路径文件本身不存在,创建出来后是带锁的,那么自然不需要上锁;但如果路径文件本身就存在,那么它是没有锁的,那么我们就需要使用if(!holdingsleep(&ip->lock))进行检查。

同时,软连接到一个不存在的目标文件是被允许的,也就是说目标文件和路径文件一样,可能存在也可能不存在,所以需要检查,根据情况进行创建。这就会导致一个情况,就是如果带着O_CREATE参数去open()目标文件时,目标文件可能已经在symlink()被创建,如果再创建就是重复创建了,解决方法是进行检查,如果是T_SYMLINK,则不用新创建,改下参数即可;如果不是,则需要将其释放掉,然后再去创建。

open()函数中循环查找时记得解锁,再上锁,保证原子性,查找结束后,需要检查一下目标文件是否真的存在,存在的情况是nlink大于0

无论是目标文件还是路径文件,只要是T_SYMLINK类型,当被unlink()时,就需要执行iput()将其引用减一,让其有机会被释放掉,否则T_SYMLINK类型的inode就会遗留在cache,无法释放,数量一多,就会导致cache中没有空闲的inode可以存储了。(这点是依据报错和测试程序推测出来的)

注意,我们不需要处理链接对象是direcitory的情况

access code

首先是syccall: sys_symlink的注册流程,我这里就不详细写了,按照提示走即可。

然后是实现sys_symlink

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
uint64
sys_symlink(void)
{
char target[MAXPATH], path[MAXPATH];
struct inode *ip, *ip_target;

// get the parameter of target and path
if(argstr(0, target, MAXPATH) < 0 || argstr(1, path, MAXPATH) < 0)
return -1;

begin_op();

// Symlinking to nonexistent file should succeed
if((ip_target = namei(target)) == 0){
if((ip_target = create(target, T_SYMLINK, 0, 0)) == 0){ // diff char* path
end_op();
return -1;
}
}

// create ip if not exist
if((ip = namei(path)) == 0){ // diff char* path
if((ip = create(path, T_SYMLINK, 0, 0)) == 0){
end_op();
return -1;
}
}

// not have to handle symbolic links to directories for this lab.
if(ip_target){
if(ip_target->type == T_DIR){
end_op();
return -1;
}
}

// inode(ip) with lock return from create(path)
// inode(ip) without lock return from namei(path)
// so we need to check and maybe ilock it
if(!holdingsleep(&ip->lock)){
ilock(ip);
}
ip->sym_link = ip_target;
iupdate(ip);
iunlock(ip);
end_op();
return 0;
}

修改open()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
// Modify the open system call to handle the case where the path refers to a symbolic link. 
// If the file does not exist, open must fail.
// When a process specifies O_NOFOLLOW in the flags to open,
// open should open the symlink (and not follow the symbolic link).
uint64
sys_open(void)
{
char path[MAXPATH];
int fd, omode;
struct file *f;
struct inode *ip;
int n;

if((n = argstr(0, path, MAXPATH)) < 0 || argint(1, &omode) < 0)
return -1;

begin_op();

if(omode & O_CREATE){
ip = namei(path);
if(ip){
// if it is T_SYMLINK, we just change it type
if(ip->type == T_SYMLINK){
ip->type = T_FILE;
ip->major = ip->minor = 0;
}
else{
// if not T_SYMLINK we need free it, then create a inode
// becasue it is a new path so we can sure 'ip->ref == 1'
// 'ip->ref == 1'(namei()=>namex()=>idup()=>ip->ref++=>1)
iput(ip);
ip = create(path, T_FILE, 0, 0);
if(ip == 0){
end_op();
return -1;
}
}
}
else{
ip = create(path, T_FILE, 0, 0);
if(ip == 0){
end_op();
return -1;
}
}
}
else{
if((ip = namei(path)) == 0){
end_op();
return -1;
}
ilock(ip);
if(ip->type == T_DIR && omode != O_RDONLY){
iunlockput(ip);
end_op();
return -1;
}
}

if(ip->type == T_SYMLINK){
// if((omode & O_NOFOLLOW)){
// // do nothing
// }

// FOLLOW => follow symlink
if(!(omode & O_NOFOLLOW)){
// we set threshold = 10, so we only check 10 times
int threshold = 10;
for(int i = 0; i < threshold; i++){
iunlockput(ip);
ip = ip->sym_link;
ilock(ip);
if(ip->type != T_SYMLINK){
break;
}
}

// check: file really exists?
if(ip->nlink == 0){
iunlockput(ip);
end_op();
return -1;
}
}
}

if(ip->type == T_DEVICE && (ip->major < 0 || ip->major >= NDEV)){
iunlockput(ip);
end_op();
return -1;
}

if((f = filealloc()) == 0 || (fd = fdalloc(f)) < 0){
if(f)
fileclose(f);
iunlockput(ip);
end_op();
return -1;
}

if(ip->type == T_DEVICE){
f->type = FD_DEVICE;
f->major = ip->major;
} else {
f->type = FD_INODE;
f->off = 0;
}
f->ip = ip;
f->readable = !(omode & O_WRONLY);
f->writable = (omode & O_WRONLY) || (omode & O_RDWR);

if((omode & O_TRUNC) && ip->type == T_FILE){
itrunc(ip);
}

iunlock(ip);
end_op();

return fd;
}

最后是修改unlink()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
uint64
sys_unlink(void)
{
struct inode *ip, *dp;
struct dirent de;
char name[DIRSIZ], path[MAXPATH];
uint off;

if(argstr(0, path, MAXPATH) < 0)
return -1;

begin_op();
if((dp = nameiparent(path, name)) == 0){
end_op();
return -1;
}

ilock(dp);

// Cannot unlink "." or "..".
if(namecmp(name, ".") == 0 || namecmp(name, "..") == 0)
goto bad;

if((ip = dirlookup(dp, name, &off)) == 0)
goto bad;
ilock(ip);

if(ip->nlink < 1)
panic("unlink: nlink < 1");
if(ip->type == T_DIR && !isdirempty(ip)){
iunlockput(ip);
goto bad;
}

memset(&de, 0, sizeof(de));
if(writei(dp, 0, (uint64)&de, off, sizeof(de)) != sizeof(de))
panic("unlink: writei");
if(ip->type == T_DIR){
dp->nlink--;
iupdate(dp);
}
iunlockput(dp);

ip->nlink--;
iupdate(ip);
iunlockput(ip);

// we need to ip->ref-- or free symlink when we call unlink
if(ip->type == T_SYMLINK){
iput(ip);
}

end_op();

return 0;

bad:
iunlockput(dp);
end_op();
return -1;
}

result

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
make[1]: Leaving directory '/home/duile/xv6-labs-2021'
== Test running bigfile ==
$ make qemu-gdb
running bigfile: OK (170.2s)
(Old xv6.out.bigfile failure log removed)
== Test running symlinktest ==
$ make qemu-gdb
(0.9s)
== Test symlinktest: symlinks ==
symlinktest: symlinks: OK
== Test symlinktest: concurrent symlinks ==
symlinktest: concurrent symlinks: OK
== Test usertests ==
$ make qemu-gdb
usertests: OK (294.8s)
== Test time ==
time: OK
Score: 100/100

总结

  • 完成日期:22.07.07
  • 耗时15h,也是完全没看任何参考答案完成,bigfile用了3小时,2小时在看材料,看代码,1小时在写代码,实际上可能都没一个小时;symlink用了12小时,3小时看材料,查资料,看代码,9小时在调代码
  • 一开始写symlink的时候和bigfile区别和大,bigfile是我完全知道自己要干什么,非常清晰,而symlink是毫无头绪,查了不少资料也没有完全懂,后面是模仿着link()才写出个大概,然后一直调一直调一直调。不断猜测题目想要我写的是什么。
    从这我得出经验,真正明白编程所运用的知识才能写得流畅
  • 写symlink到四分之一的时候想出两个方案一个是存储路径名字指针,一个是存储路径文件inode指针,一开始用前者,失败了,但没找出原因,然后用后者,但遇到测试open("/testsymlink/4", O_CREATE | O_RDWR)时发现重复创建了文件,出现矛盾,想不出好办法解决。于是,又折回去实现名字指针,这回发现原因了,名字指针丢失,但找不出原理,对于我是个黑箱问题,无法解决。最后,折回去实现inode指针,想办法修改明确有问题的open(),解决了重复创建的问题。
    这个过程告诉我,当编程遇到明显问题的时候,有些看似无法解决,应该先休息,而不是立马放弃,调整好状态再思考。同时,面对黑箱问题和白箱问题,即使前者实现起来更漂亮简洁,还不如后者更实在靠谱
  • open()解决重复创建的那段我觉得我写的不够简洁,或许还能优化下,但加个函数就太做作了。
  • printf真好用,但用多了,就想着或许打断点更方便,只是我不熟悉打断点
  • 最近在听 I Don’t Want To Say Goodbye Teddy Thompson