-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathnotes
196 lines (123 loc) · 7.57 KB
/
notes
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
get is not working.
Because the in-memory hashmap has to be re-loaded from disk when the program is started again.
Have a load command? Otherwise, must not overwrite existing segment files.
the bytes appended at the end are system-dependent.
strace:
strace: Process 94085 attached
read(0, "set 1 a\n", 1024) = 8
openat(AT_FDCWD, "segment0", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
lseek(3, 0, SEEK_END) = 74
lseek(3, 0, SEEK_CUR) = 74
fstat(3, {st_mode=S_IFREG|0644, st_size=74, ...}) = 0
write(3, "\4\0\0\0\31\0\0\0\1\0\0\0a\0\317\25zU\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 37) = 37
close(3) = 0
write(1, "offset: 74\n", 11) = 11
write(1, "Executed\n", 9) = 9
write(1, "bitcask-lite> ", 14) = 14
read(0, ".exit\n", 1024) = 6
write(1, "You forgot to cleanup\n", 22) = 22
exit_group(0) = ?
+++ exited with 0 +++
Make a makefile
to compile and also
to clear the segments, as part of make clean
I didn't initialize the keyvalue struct to 0s,
so garbage bytes were being copied by fwrite,
since it copies over the entire buffer (command.keyvalue.value).
Current question:
Why is only 1 byte written with 04 at the beginning of the file?
It's good, but I don't get why it's happening (and not 8 bytes).
new error message:
bitcask-lite> get 2
bc(28809,0x1053165c0) malloc: *** error for object 0x61: pointer being realloc'd was not allocated
bc(28809,0x1053165c0) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6
^Above was likely b/c I was writing to a char* that was not initialized.
(Exactly why is that error produced, why is it wrong?)
Current problem:
If a value is inserted, it isn't displayed upon being retrieved in the same session.
The offset is wrong
It's when you're uncertain about things
at two different levels of abstraction
that things get difficult (and you make mistakes).
Problem while reading:
strace: (not as useful b/c reads are buffered)
read(0, "get 1\n", 1024) = 6
write(1, "remembered offset is: 0\n", 24) = 24
openat(AT_FDCWD, "segment0", O_RDONLY) = 7
fstat(7, {st_mode=S_IFREG|0644, st_size=148, ...}) = 0
lseek(7, 0, SEEK_SET) = 0
read(7, "\4\0\0\0\31\0\0\0\1\0\0\0a\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 148
write(1, "size_v: 25\n", 11) = 11
read(7, 0x7ffdc4b6fa68, 94523640246272) = -1 EFAULT (Bad address)
write(1, "error while reading segment file"..., 34) = 34
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV +++
ltrace: (useful)
strcmp("get 1", ".exit") = 57
strncmp("get 1", "set ", 4) = -12
strncmp("get 1", "get ", 4) = 0
strtok("get 1", " ") = "get"
strtok(nil, " ") = "1"
atoi(0x5634572f48a4, 4, 12, 1) = 1
printf("remembered offset is: %ld\n", 0) = 24
fopen("segment0", "r") = 0x5634572f4d30
fseek(0x5634572f4d30, 0, 0, 0) = 0
fread(0x7ffc0e847990, 4, 1, 0x5634572f4d30) = 1
fread(0x7ffc0e847988, 4, 1, 0x5634572f4d30) = 1
printf("size_k: %d\n", 4) = 10
fread(0x7ffc0e847984, 4, 1, 0x5634572f4d30) = 1
printf("size_v: %d\n", 25) = 11
fread(0x7ffc0e847960, 94781338288153, 1, 0x5634572f4d30) = 0
puts("error while reading segment file"...) = 34
--- SIGSEGV (Segmentation fault) ---
+++ killed by SIGSEGV +++
The last read, the second arg is wrong. Why? I am printing it out right before.
Can I learn to examine the assembly code?
Can you trace the assembly, just like you can trace system calls?
gdb does this
Okay, lesson learned, actually pay atention to the warnings that tell you to use the appropriate printf format things for various things like size_t.
size_t is being read incorrectly.
is it b/c it's not initialized?
and only the low-order bits are being written?
Yes, that's exactly what happened.
If declaring variables of a potentially bigger type,
and you're using them or writing to them with functions that treat them as ints or types with a smaller width,
make sure you initialize them,
or garbage bytes could end up making them wrong/really big.
Eventually use TSAN and ASAN on the binary.
So overwriting the keyvalue (using same key but different value) already works?
Yes, because the file is read forward, so the most recent offset for the key is always stored into the hashmap last.
Keeping a singleton hashmap around in the program? That's the best way right? Should I make it more explicit?
Get feedback on this somehow? The current prototype version.
Who would be really good. I don't trust random redditors/internet people.
Is return -1 in main() equiv. to exit(EXIT_FAILURE) ?
same for return 0 and exit(EXIT_SUCCESS)
size_t is unsigned and therefore cannot (return -1 in order to) indicate an error.
ssize_t is a signed version of this.
Next, learn how to read the disassembled output.
elf files? x86 assembly? linkers?
Need to fclose the stream at some point.
How big does a size_t have to get before it fucks shit up by being treated as ssize_t?
-> write some simple programs to verify this
if size_t is 8 bytes (verify), then as soon as it goes over 2^(31) it should get negative.
do i expect any files to be of size 2^31? other way around, that is an upper bound for the segment file size.
this is ~2GB (or exactly 2 GiB)
what should the size limit on a segment file be?
Let's make it 100 bytes for now, so we can easily test things.
Going with writing current max ID to a file. Shouldn't be race conditions since
only one thing at a time will be creating a new segmetn file. It is another point of failure,
and more complexity, but seems nicer than having a linear scan take place, or using timestamps.
Can you write to a file atomically with fopen?
We don't want to create it atomically, we want to /write/ to it atomically
1. Do you need a terminal null when supplying a string pathname to fopen?
-> almost certainly
2. Does sprintf add a terminal byte?
-> looks like it does https://github.com/torvalds/linux/blob/v5.4/arch/alpha/boot/stdio.c#L289
actually, that's not added to the buf... but the str pointer is pointing to buf, and it is modified, so ok
3. What happens if the buffer you gave to sprintf is too big?
-> should be okay, as the null byte is appended.
What is this errno thing?
When do we write to the segment_id file?
Need to do a bunch of that bookkeeping work--update segment id file when closing a segment,
Also store the segment file ID inside the keydir (hashmap).