Review
Know the Concepts
• What does it mean if a line in the program starts with the ’#’ character?
The octothorpe, or hash symbol, is used to mark the start of a code comment.
• What is the difference between an assembly language file and an object code
file?
Assembly language files are source code: they are human-readable and use mnemonic operation names like "add" or "mov" (for move), but an assembly file cannot be executed on its own.
In order to convert the source code into an executable file it must be translated into machine code by the assembler.
Machine code is made up of 1's and 0's (binary) and is not easily readable by humans, because it is the form the CPU executes directly.
Object code is the machine code produced by assembling a program's source code; it usually also contains metadata that will be used by the linker.
This metadata (such as symbol names and relocation entries) tells the linker where each piece of code and data is defined and which references between files still need to be resolved.
Object code gets turned into an executable after the linking process.
Assembly Source Code -> Assembler -> Object Code -> Linker -> Executable.
• What does the linker do?
The linker combines one or more object files into a single executable.
Typically, an object file will contain references to other object files or libraries;
the linker resolves all these references and assigns final addresses, producing one executable program.
• How do you check the result status code of the last program you ran?
echo $?
• What is the difference between movl $1, %eax and movl 1, %eax?
movl $1, %eax - This instruction uses immediate addressing (the $ prefix) to move the value 1 into the accumulator (%eax register).
movl 1, %eax - This instruction uses direct addressing to move the value stored at memory address 1 into the accumulator.
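The difference can be modeled with a short Python sketch. The memory dictionary and the value 42 are made up for illustration; only the contrast between "the operand is the value" and "the operand is an address" comes from the answer above.

```python
# Hypothetical memory contents: the value 42 is stored at address 1.
memory = {1: 42}


def movl_immediate(operand):
    """Model movl $1, %eax: the operand itself is the value loaded."""
    return operand


def movl_direct(address):
    """Model movl 1, %eax: the operand is an address; load what is stored there."""
    return memory[address]


print(movl_immediate(1))  # the number 1 itself
print(movl_direct(1))     # whatever is stored at address 1 (42 here)
```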
• Which register holds the system call number?
%eax is the register responsible for storing the system call number.
• What are indexes used for?
Indexes are used to keep track of where we are or where we want to be in an array.
• Why do indexes usually start at 0?
Zero-based indexing makes pointer arithmetic simpler to use.
To find an element, we multiply the index by the size of each element in bytes and add the result to the starting address.
For instance, if we wanted to grab the third byte (index 2) starting from memory address 2700 (where our pointer is) we
would simply do 2700 + 2 * 1 (2702).
If we wanted the first element (index 0), we would do 2700 + 0 * 1 (2700).
The formula used:
address + index * bytes = effective address
If we started counting at 1 instead, we would need to account for this change by subtracting 1 from the index every time we wanted to point to anything.
Zero-based indexing allows for cleaner and simpler arithmetic.
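The formula can be checked with a short Python sketch. The function names are illustrative, and the one-based variant is included only to show the extra subtraction it would need on every access.

```python
def effective_address(base, index, size):
    """Zero-based indexing: address + index * bytes."""
    return base + index * size


def effective_address_one_based(base, position, size):
    """One-based counting needs an extra subtraction on every access."""
    return base + (position - 1) * size


base = 2700  # starting address used in the example above

print(effective_address(base, 0, 1))            # index 0, the first byte
print(effective_address(base, 2, 1))            # index 2
print(effective_address_one_based(base, 1, 1))  # position 1, the first byte again
```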
• If I issued the command movl data_items(,%edi,4), %eax and
data_items was address 3634 and %edi held the value 13, what address would
you be using to move into %eax?
We can simplify the instruction by plugging in the values above, so it'll be easier to understand.
movl data_items(,%edi,4), %eax --> movl 3634(,13,4), %eax (not valid syntax, since the index must be a register, but it shows the numbers involved).
Now we can use the following formula to get the address that the value will be read from.
address + index * bytes = effective address
3634 + (13 * 4) = 3686.
After the instruction runs, %eax will hold the 4-byte value currently stored at location 3686 in memory.
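As a rough model of this load, the sketch below treats memory as a flat Python bytearray. The memory size and the marker value 0xCAFEBABE are assumptions made for the demonstration; only the address 3634, the index 13, and the multiplier 4 come from the question.

```python
import struct

DATA_ITEMS = 3634          # address of data_items, from the question
MEMORY = bytearray(4096)   # hypothetical flat memory, large enough for the example

# Place a recognizable 32-bit little-endian value where we expect to read.
struct.pack_into("<I", MEMORY, 3686, 0xCAFEBABE)


def indexed_load(base, index, multiplier):
    """Model movl base(,index,multiplier), %eax: compute the address, read 4 bytes."""
    address = base + index * multiplier
    (value,) = struct.unpack_from("<I", MEMORY, address)
    return address, value


address, eax = indexed_load(DATA_ITEMS, 13, 4)
print(address, hex(eax))
```

The address is computed first and the value is read from it second, which mirrors the two steps in the answer.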
• List the general-purpose registers.
EAX - (Extended Accumulator)
EBX - (Extended Base)
ECX - (Extended Counter)
EDX - (Extended Data)
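ESI - (Extended Source Index)
EDI - (Extended Destination Index)
EBP - (Extended Base Pointer), normally reserved for the stack frame
ESP - (Extended Stack Pointer), normally reserved for the top of the stack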
• What is the difference between movl and movb?
movl - (Move Long) moves 32 bits (4 bytes).
movb - (Move Byte) moves 8 bits (1 byte).
The difference between movl and movb is how much data they move.
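The size difference can be illustrated in Python by reading the same bytes two ways. The four byte values are hypothetical; x86 stores multi-byte values in little-endian order, which is why the long is decoded with "little".

```python
# Hypothetical contents of four consecutive bytes in memory.
memory = bytes([0x78, 0x56, 0x34, 0x12])

# movl-style read: all 4 bytes combined into one little-endian 32-bit value.
long_value = int.from_bytes(memory[0:4], "little")

# movb-style read: only the single byte at the starting address.
byte_value = memory[0]

print(hex(long_value))
print(hex(byte_value))
```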
• What is flow control?
Flow control refers to the order in which a program's instructions are executed.
Control of a program's flow comes into play with conditionals, loops, etc.
In a simple program where every instruction is executed in order, no matter what,
there is no control over its flow.
Pseudo-Code for program with no flow control:
a = 1
b = 2
c = 3
print a
print b
print c
Pseudo-Code for program with flow control:
a = 1
b = 2
c = 3
if a equals b
    print a
else
    print c
• What does a conditional jump do?
A conditional jump moves to another part of a program (jumps) only if a condition holds.
The condition is read from the flags in the %eflags register, which were set by a previous instruction (usually a comparison such as cmpl).
If the condition does not hold, execution simply continues with the next instruction.
It is used as a control flow mechanism.
For example, you could return to the beginning of a loop if a number is not found, or
exit the loop if the number is found in the current iteration.
Pseudo-Code for conditional jump:
NUMBERS = 1, 2, 3, 4, 5
NUMBER = 0
START_OF_LOOP
    if NUMBERS[NUMBER] = 4
        exit
    else
        NUMBER = NUMBER + 1
        Jump to START_OF_LOOP
In assembly, common instructions for conditional jumps include:
JE  : Jump if Equal
JL  : Jump if Less
JGE : Jump if Greater or Equal
JLE : Jump if Less or Equal
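One detail that is easy to get backwards in AT&T syntax is the operand order: after cmpl src, dst the jump tests dst against src. The Python sketch below models that with a plain signed comparison. The function name is made up, and real hardware reaches the same decision through the flags in %eflags rather than by comparing the operands again.

```python
def jump_taken(mnemonic, src, dst):
    """Return True if the jump following `cmpl src, dst` would be taken.

    Simplified model: compares dst against src as signed integers.
    """
    conditions = {
        "je": dst == src,
        "jl": dst < src,
        "jge": dst >= src,
        "jle": dst <= src,
    }
    return conditions[mnemonic.lower()]


# cmpl %ebx, %eax with %ebx = 5 and %eax = 3, followed by jle:
print(jump_taken("jle", 5, 3))  # taken, because 3 <= 5
# Same comparison followed by jge:
print(jump_taken("jge", 5, 3))  # not taken, because 3 >= 5 is false
```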
• What things do you have to plan for when writing a program?
The purpose of the program.
A general idea of how we may solve whatever problem the program is designed to solve.
How much memory will be needed.
• Go through every instruction and list what addressing mode is being used for
each operand.
movl $0, %edi                   --> IMMEDIATE ADDRESSING MODE, REGISTER MODE
movl data_items(,%edi,4), %eax  --> INDEXED ADDRESSING MODE, REGISTER MODE
movl %eax, %ebx                 --> REGISTER MODE, REGISTER MODE
start_loop:                         (label, no operands)
cmpl $0, %eax                   --> IMMEDIATE ADDRESSING MODE, REGISTER MODE
je loop_exit                    --> RELATIVE ADDRESSING
incl %edi                       --> REGISTER MODE
movl data_items(,%edi,4), %eax  --> INDEXED ADDRESSING MODE, REGISTER MODE
cmpl %ebx, %eax                 --> REGISTER MODE, REGISTER MODE
jle start_loop                  --> RELATIVE ADDRESSING
movl %eax, %ebx                 --> REGISTER MODE, REGISTER MODE
jmp start_loop                  --> RELATIVE ADDRESSING
loop_exit:                          (label, no operands)
movl $1, %eax                   --> IMMEDIATE ADDRESSING MODE, REGISTER MODE
int $0x80                       --> IMMEDIATE ADDRESSING MODE
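To see what the instructions listed above accomplish as a whole, here is a line-by-line Python rendering of the same loop. The variable names mirror the registers, the function name is made up, and the sample list is hypothetical. Like the assembly, it assumes the list ends with a 0.

```python
def find_maximum(data_items):
    """Mirror the assembly loop; the list must end with the sentinel 0."""
    edi = 0                      # movl $0, %edi
    eax = data_items[edi]        # movl data_items(,%edi,4), %eax
    ebx = eax                    # movl %eax, %ebx
    while eax != 0:              # cmpl $0, %eax / je loop_exit
        edi += 1                 # incl %edi
        eax = data_items[edi]    # movl data_items(,%edi,4), %eax
        if eax > ebx:            # cmpl %ebx, %eax / jle start_loop
            ebx = eax            # movl %eax, %ebx
    return ebx                   # %ebx is what the program hands back as its status


sample = [4, 8, 3, 233, 13, 172, 0]  # hypothetical data, terminated by 0
print(find_maximum(sample))
```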
Use the Concepts
• Modify the first program to return the value 3.
Refer to exit-3.s for solution.
• Modify the maximum program to find the minimum instead.
Refer to minimum.s for solution.
• Modify the maximum program to use the number 255 to end the list rather than
the number 0.
Refer to maximum-3.s for solution.
• Modify the maximum program to use an ending address rather than the number
0 to know when to stop.
Refer to maximum-4.s for solution.
• Modify the maximum program to use a length count rather than the number 0 to
know when to stop.
Refer to maximum-5.s for solution.
• What would the instruction movl _start, %eax do? Be specific, based on
your knowledge of both addressing modes and the meaning of _start. How
would this differ from the instruction movl $_start, %eax?
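movl _start, %eax: This instruction uses direct addressing. It treats _start as a memory address and loads the 4 bytes stored there into the %eax register.
Since the _start label marks the memory address of a program's starting point, those 4 bytes are the beginning of the machine code for the program's first instruction.
movl $_start, %eax: This instruction uses immediate addressing. It moves the value of the _start label itself into the %eax register.
That value is the memory address of a program's starting point, so %eax ends up holding the address rather than what is stored at it.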
Going Further
• Modify the first program to leave off the int instruction line. Assemble, link,
and execute the new program. What error message do you get. Why do you
think this might be?
Error Message: Segmentation fault (core dumped)
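Without a system call to exit, the CPU does not stop at the end of our code. It keeps going and tries to execute whatever bytes come next in memory.
Those bytes are not part of the program, so execution soon hits an invalid instruction or an address the program is not allowed to access.
The operating system then kills the program, which is typically reported as a segmentation fault.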
• So far, we have discussed three approaches to finding the end of the list - using
a special number, using the ending address, and using the length count. Which
approach do you think is best? Why? Which approach would you use if you
knew that the list was sorted? Why?
There are pros and cons to the three approaches mentioned above.
Using a special number to designate the end of an array is a quick and dirty method.
It works and it's simple, but problems arise when the same value is needed as part of the actual data.
In this scenario, the program will interpret the data value as the end of the array.
For example, if the special number is 0 but the list of data items is [4, 8, 3, 233, 13, 172, 0, 52, 37, 2, 0],
the program will stop at the first 0 and never evaluate the numbers 52, 37, and 2.
Using a memory address as a stopping point ensures that we always stop at a precise location, rather than at any
location that happens to contain the value we are looking for.
The downside to this method is that we need to calculate the address at the end of an array.
With simple arrays this may be trivial, but as programs become more complex this process becomes tedious.
The length count approach is both simple and elegant.
Once you've calculated the length of an array, all you need is a counter that you can check against the length.
This method also comes in handy when trying to implement a binary search algorithm.
With a sorted list we can check which half of an array contains the target element and discard the other half.
This process repeats until the target element is found.
A known length also means the largest element of a list sorted in ascending order can be read directly from index length - 1, with no loop at all.
Therefore, the best method for searching sorted arrays is the length count approach.
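The three stopping rules can be compared side by side in a Python sketch. The function names and the sample list are hypothetical, and the end index stands in for an ending address, since Python does not expose raw addresses. The sample deliberately contains a 0 in the middle to show how the special-number approach stops early.

```python
def max_with_sentinel(items, sentinel=0):
    """Stop at the first element equal to the special number."""
    best = items[0]
    i = 0
    while items[i] != sentinel:
        best = max(best, items[i])
        i += 1
    return best


def max_with_end_index(items, end):
    """Stop when the cursor reaches `end` (stand-in for an ending address)."""
    best = items[0]
    cursor = 0
    while cursor != end:
        best = max(best, items[cursor])
        cursor += 1
    return best


def max_with_length(items, length):
    """Stop after `length` elements have been examined."""
    best = items[0]
    for count in range(length):
        best = max(best, items[count])
    return best


data = [4, 8, 0, 52, 37, 2]  # hypothetical list with a 0 in the middle

print(max_with_sentinel(data))             # stops at the 0, misses 52
print(max_with_end_index(data, len(data)))
print(max_with_length(data, len(data)))
```

In Python the last two look nearly identical; in assembly they differ in whether the loop compares an address or a counter on each pass.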