ALP-notes1.txt
Assembly Language Instructions:
======================
1. An assembly language instruction has 2 parts:
Part1: Defines the operation to be performed and is represented by the mnemonic of the instruction:
example: movl %eax, %ebx
In the above example movl is the operation to be performed
Part2: Defines the operands for the instruction
2. Operands of instructions are of 2 types:
Type1: Source operands: provide the input values to the operation to be performed.
Type2: Destination operands: the results of the operation are stored in destination operands.
3. In an instruction the destination & source operands:
A. May be the same : addl %ebx, %ebx (adds the current value of %ebx to itself and stores the result in %ebx)
B. May be different : addl %ebx, %ecx (adds the values in ebx and ecx and stores the result in ecx)
4. As instructions are assembled, the assembler keeps track of the memory address of each instruction.
5. Some instructions such as "jmp" or "call" take the memory address of another instruction as an argument. It is difficult for the programmer to keep track of
these addresses, so instructions are labelled, and the label is referred to in other instructions:
1. .section .data
2. .section .text
3. .globl _start
4. _start:
5. movl $1, %eax
6. movl $100, %ebx
7. int $0x80
In the above code, the label "_start" marks the instructions on lines 5, 6 and 7:
6. _start defines the address of the instruction in line 5:
Example:
08048054 <_start>:
8048054: b8 01 00 00 00 mov $0x1,%eax
8048059: bb 64 00 00 00 mov $0x64,%ebx
804805e: cd 80 int $0x80
7. An assembly program also contains several non-processor instructions, which are called pseudo-ops or assembler directives.
example: ".section .data" , ".section .text" etc.
8. _start is a symbol that defines the address of the instruction in the program that is executed first. Thus the execution of the program always starts from location "_start"
and carries on till the program makes the exit system call in GNU/Linux
Example:
08048054 <_start>:
8048054: b8 01 00 00 00 mov $0x1,%eax
8048059: bb 64 00 00 00 mov $0x64,%ebx
804805e: cd 80 int $0x80
As you see from the above example: the _start symbol points to "08048054", which is the starting address of the first instruction to be executed, that is Line 5
Operand Sizes:
==========
1. In IA-32 Architecture: instructions can have operands of variable sizes.
Example:
i) 8-bit operands known as: byte
ii) 16-bit operands known as: word
iii) 32-bit operands known as: long
2. Most instructions in the IA32 architecture take 0, 1 or 2 operands
3. Operands may be:
i) Registers Example: movl %eax, %ebx
ii) Memory Example: movl myval, %ebx
iii) Constants Example: addl $5, %ebx
4. Register operands can be:
8-bit registers: al,ah,bl,bh,cl,ch,dl,dh
16-bit registers: ax,bx,cx,dx,di,si,sp,bp
32-bit registers: eax,ebx,ecx,edx,edi,esi,esp,ebp
5. Instructions can have:
i) No operand
ii) Multiple Operands
6. An instruction can have at most 1 memory operand.
examples:
A. if instruction has only 1 operand:
incl myval (value in myval will be incremented by 1)
B. Multiple operands:
addl %ebx, myval (value in %ebx is added to myval)
7. Instructions with 2 operands can have 1 memory operand; the other must be a constant or a register
examples:
addl %ebx, myval
addl $5, myval
Memory Model:
===========
1. Memory in a computer system can be thought as an Array of bytes:
|--------------------------------------------------------------------------------------------------------------------------------------------------|
| A | B | C | 1 | 2 | 3 | 4 | @ | S | 8 | 11 | 1 | 13 | 15 | H | E | L | L | O | B | Y | E |
|---0-----1-----2------3-----4-----5-----6------7-----8----9-----10----11----12---13---14----15----16----17---18----19---20---21--|
The indices of this array (for example 0,1,2,3) are known as addresses.
The values of the array elements (A, B, C) are known as the contents of the memory locations.
So at address 0, the value A is stored
At address 1, the value B is stored
At address 17, the value L is stored.
2. Instructions may have operands in memory, so the memory addresses of the operands are specified in the instruction. The ways to specify an address in an instruction will
be dealt with later
3. IA32 supports 2 kinds of Addressing mechanism
1. 16-bit wide:
Leave this, Not Important
2. 32-bit wide addressing
4. An OS uses only 1 kind of addressing mechanism. GNU/Linux uses the 32-bit addressing mechanism. In this addressing mechanism, the range of addressable memory
is 2^32 bytes
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| A | B | C | 1 | 2 | 3 | 4 | @ | S | 8 | 11 | 1 | 13 | 15 | H | E | L | L | O | B | Y | E | | | | | |
|---0-----1-----2------3-----4-----5-----6------7-----8----9-----10----11----12---13---14----15----16----17---18----19---20---21------------------------------2^32-
5. Operands can be 1 or more bytes in size:
If the operand is 1 byte: 1 memory location is used
If the operand is more than 1 byte, multiple consecutive memory locations are used.
Example: a 16-bit operand will use 2 memory locations.
Example: In the above diagram, to store HELLO, 5 memory locations (5 bytes) are used.
6. The memory operands are specified using 2 Attributes:
A. Start Address (Specifically Mechanism to compute the start Address)
B. size
7. A 16-bit operand has 2 bytes: the most significant byte (MSB) and the least significant byte (LSB).
8. IA32 uses little-endian byte order: the LSB is stored at the lower address.
9. Example:
16-bit number: 0x1245 , So 0x45 will be stored at the lower address 0x23C8 , and 0x12 will be stored at 0x23C9
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| A | B | C | 1 | 2 | 3 | 4 | @ | S | 8 | 11 | 1 | 13 | 15 | H | E | L | L | O | B | Y | E | | | | | |
|---0-----1-----2------3-----4-----5-----6------7-----8----9-----10----11----12---13---14----15----16----17---18----19---20---21------------------------------2^32-
In the above representation
If the 16-bit number 0x1513 has to be stored at address 12:
Lowest address: 12: will have 13 (the LSB)
Next higher address: 13: will have 15 (the MSB)
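The little-endian rule above can be sketched in C. This is only an illustration, assuming memory is modelled as a plain byte array; the names store16_le and load16_le are made up for this sketch and are not part of any library.

```c
#include <stdint.h>

/* Store a 16-bit value at mem[addr] in little-endian order:
   the least significant byte goes to the lower address. */
void store16_le(uint8_t *mem, uint32_t addr, uint16_t value)
{
    mem[addr]     = (uint8_t)(value & 0xFF);        /* LSB -> lower address  */
    mem[addr + 1] = (uint8_t)((value >> 8) & 0xFF); /* MSB -> higher address */
}

/* Read the 16-bit value back from the same two memory locations. */
uint16_t load16_le(const uint8_t *mem, uint32_t addr)
{
    return (uint16_t)(mem[addr] | (mem[addr + 1] << 8));
}
```

Calling store16_le(mem, 12, 0x1513) on a zeroed array leaves 0x13 at address 12 and 0x15 at address 13, matching the diagram.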
===================================================================================================
Operand Addressing:
1. Instructions in IA32 processors operate on operands & generate at most 1 result. The result is then stored in memory or in a register, as specified by the instruction
Instruction----->Processor->(Get operands, as per instruction & do operation)
| Output
____|__________
| |
Memory Registers
2. Instructions can have :
A. No destination operand
B. Implied destination operand
C. 1 destination Operand
3. Instructions can have:
A. 0 Source operand
B. 1 Source operand
C. 2 source operands
4. The mechanism of addressing operands indicates the way operands are found during the execution of instructions. It is also known as the addressing mode
of the processor
5. Addressing modes are encoded in the instruction. During execution of the instruction the actual values are fetched using the addressing modes specified:
6. Categories of operands:
A. Constant
Example: add $5, %eax
All constants have $ sign prefixed
B. Registers
examples: incl %eax
add %ebx, %ecx
movl %eax, %ebx
All registers in the instruction are prefixed with %
C. Memory
Examples:
movl data_items(,%ecx,4), %eax
movl data_items, %eax
Different Addressing modes
------------------------------------
1. Immediate Addressing : All constant operands of an instruction are specified using the immediate addressing mode. Constant operands are
prefixed with the $ sign
Examples:
addl $5, %eax
In the above example 5 is added to the value stored in %eax and result is saved in eax register
cmpl $5, %eax
In the above example 5 is compared with value stored in eax register and eflags register is used to store the result
2. Register Addressing mode:
Register operands (%eax, %ebx) are specified in instruction by name of registers.
examples:
addl %eax, %ebx
cmpl %eax, %ebx
movl %eax, %ebx
32-bit registers: eax,ebx,ecx,edx,esi,edi,esp & ebp
16-bit: ax,bx,cx,dx,si,di,sp & bp
8-bit: al,ah,bl,bh,cl,ch,dl,dh
3. Memory Addressing:
i) When an operand of the instruction is stored in memory, it is read from (or written to) memory during the execution
ii) The instruction should also specify the method to compute the memory address, also known as the effective address
Methods to compute the address:
3.1: Direct Addressing:
------------------------------
1. The simplest method of providing the address is to specify that address in the instruction:
Example: addl 0x8048054, %eax
In the above example 0x8048054 is the address in memory holding a value (5, for example) which will be added to the contents of eax, and the resultant
value is stored in eax
2. Since it's difficult to memorize addresses, we use symbolic names for addresses. The linker then assigns the final address before execution
Example: addl data_items, %eax
data_items is a symbolic name for an address that contains 5 (for example) that will be added to eax
3. Example: addl 20, %eax
In the above example there is no $ prefixed to 20, so it's treated as an address. Since the second operand is %eax, which is a 32-bit register, the first operand
is also treated as a 4-byte operand, so the values stored at addresses 20,21,22,23 are fetched.
4. While using memory addressing, the effective address provides just the starting address of the operand in memory.
5. The instruction should also provide the size of the operand. In most cases the size of the operand is implicit
Examples:
addl 20, %eax (In this case the size of the memory operand is implicit: 4 bytes, based on the 2nd operand and the "l" suffix)
movb 20, %al (In this case the size of the memory operand is 1 byte)
incl 20 (In this case the size of the memory operand is 4 bytes starting from address 20)
incb 20 (The size of the memory operand is 1 byte at address 20)
Components of Effective Address
------------------------------------------
1. Effective address can be provided by specifying 4 different components:
A. 2 Registers
B. 2 constants
2. Two Registers are used as "base" & "index" component , while the constants are used as "scale" & "displacement" .
base(Register) , Index (Register), Scale (Constant), Displacement (Constant)
3. Scale can have values: 1,2,4,8
4. Displacement can have values of : 8-bit or 32-bit
5. From the 4 values we defined above (base, index, scale, displacement) up to 3 can be omitted
6. Direct Addressing is an example where the effective address is computed by omitting base, index & scale, with only a 32-bit displacement being used.
7. Any of the 8 general purpose registers can be used for the "base" component (eax,ebx,ecx,edx,esi,edi,esp,ebp)
8. For the index only 7 general purpose registers can be used : eax,ebx,ecx,edx,esi,edi,ebp (esp cannot be an index).
9. Displacement is a signed number: an 8-bit displacement ranges from -128 to +127, a 32-bit displacement from -2^31 to 2^31-1
10. In GNU/Linux Assembly language Memory operand can be computed using below syntax:
Effective address: displacement(base,index,scale)
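The arithmetic behind displacement(base,index,scale) can be sketched in C as below. This is a rough model only, assuming 32-bit wrap-around arithmetic and that an omitted base or index is passed as 0 (and an omitted scale as 1); the function name effective_address is hypothetical.

```c
#include <stdint.h>

/* Effective address for the operand form displacement(base, index, scale).
   Omitted components are passed as 0 (scale as 1). */
uint32_t effective_address(int32_t displacement, uint32_t base,
                           uint32_t index, uint32_t scale)
{
    /* 1 multiplication and 2 additions, wrapping at 32 bits */
    return base + index * scale + (uint32_t)displacement;
}
```

For example, effective_address(0, 100, 3, 4) gives 112, and a negative displacement such as effective_address(-4, 100, 0, 1) gives 96.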
3.2 Indirect Addressing: (Base/Registers) Or Register Indirect method
------------------------------------------------------------------------------------------
1. If the memory operand is specified using just the base component (i.e. only a register), the register holds the address of the operand
example:
movl (%eax), %ebx
In the above example the memory address is stored in eax, and the size of the operand is 32 bits
addl %eax, (%ecx)
Note: addl (%ecx), (%ebx) is not allowed, as an assembly instruction can have only 1 memory operand
Assume the memory image:
| 0x50BF | 0x50C0 | 0x50C1 | 0x50C2 | 0x50C3 |
-------------------------------------------------------------------
| 0x10 | 0xE3 |0x20 | 0x00 | 0x10 |
-------------------------------------------------------------------
Consider:
eax: 0x20
ecx: 0x50C0
Instruction addl %eax, (%ecx)
The above causes the 32-bit number starting at 0x50C0 to be read, 0x20 to be added to that value, and the result to be stored back starting at 0x50C0.
Since it's 32-bit, 0x50C0 to 0x50C3 are read (little-endian value 0x100020E3) and the same locations are used to store the result (0x10002103).
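The read-modify-write done by addl %eax, (%ecx) can be modelled in C roughly as follows. It is a sketch, assuming memory is a byte array indexed by address; load32_le, store32_le and addl_reg_to_mem are illustrative names, not real APIs.

```c
#include <stdint.h>

/* Read 4 bytes starting at addr as one little-endian 32-bit value. */
uint32_t load32_le(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}

/* Write a 32-bit value to 4 bytes starting at addr, LSB first. */
void store32_le(uint8_t *mem, uint32_t addr, uint32_t value)
{
    for (int i = 0; i < 4; i++)
        mem[addr + i] = (uint8_t)(value >> (8 * i));
}

/* Model of "addl %eax, (%ecx)": ecx holds the address of the operand. */
void addl_reg_to_mem(uint8_t *mem, uint32_t eax, uint32_t ecx)
{
    store32_le(mem, ecx, load32_le(mem, ecx) + eax);
}
```

With the memory image above, eax = 0x20 and ecx = 0x50C0, the bytes at 0x50C0..0x50C3 become 0x03, 0x21, 0x00, 0x10, while the byte at 0x50BF is untouched.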
3.3 Base + Displacement:
-------------------------------------
1. If the memory operand is computed using base & displacement, then
effective address: displacement + base
2. Example: addl 2(%ebp), %eax
In the above example: displacement=2, base is ebp
3. Consider the below memory Image:
| 25 | 26 | 27 | 28 | 29 |
-------------------------------------------------------------------
| 0x10 | 0xE3 |0x20 | 0x00 | 0x10 |
-------------------------------------------------------------------
If ebp has 25, then 25 is added to 2, so the effective address is 27. Since the instruction is addl (32-bit), 4 bytes starting from 27 are read.
Consider example:
addb $10, 2(%ebp)
In the above example: the effective address of the 2nd operand is (if ebp is 25) 27 (25 + 2). Since the instruction is "addb", only 1 byte is read starting from
27, 10 is added to it, and the result is stored back at address 27
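Example Code 1:
.section .data
var1:
.int 25,27,28,29,30
.section .text
.globl _start
_start:
movl $var1, %ecx # %ecx = address of var1[0]
movl 8(%ecx), %ebx # displacement 8 + base %ecx: %ebx = var1[2] = 28
movl $1, %eax # system call number 1 = exit
int $0x80 # exit status is the value in %ebx (28)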
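Example Code 2:
----------------------
.section .data
var1:
.long 25,26,27,28
.section .text
.globl _start
_start:
movl $var1, %edi # %edi = address of var1[0]
movl (%edi), %eax # %eax = var1[0] = 25
movl 4(%edi), %ebx # %ebx = var1[1] = 26
movl 8(%edi), %ecx # %ecx = var1[2] = 27
movl 12(%edi), %edx # %edx = var1[3] = 28
movl $1, %eax # system call number 1 = exit (overwrites the 25 in %eax)
int $0x80 # exit status is the value in %ebx (26)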
====================================================================================================
Index*Scale+ displacement:
1. If the memory operand is specified using index, scale & displacement components, the effective address is computed by multiplying the index by the scale
& then adding the displacement.
2. Scale can take values: 1,2,4,8
3. Index can be any general purpose register except "esp"
4. If an array is used where the size of each element is 1,2,4 or 8 bytes, then the elements of the array can be accessed using this mode.
Consider Example:
# Registers:
# %edi points to starting address of an array
# %esi will be used as index register
.section .data
var1:
.int 10,11,12,13
.section .text
.globl _start
_start:
movl $var1, %edi #we are passing address of var1[0] to %edi
#Assume the address of var1[0]=134516928/0x80490c0
movl $0, %esi #Index must be a register; to access the first element var1[0], %esi gets 0
addl $1, var1(,%esi,4) #index(%esi)=0, scale=4, because each element of the array takes 4 bytes
#displacement=var1 i.e. starting address(134516928)
# var1(,%esi,4) ==>> starting address(,0,4)
#=>> displacement + 0*4 => starting address(134516928), var1[0] becomes 10+1=11
movl (%edi), %eax # here %edi=0x80490c0/134516928 , %eax will have 11
movl $1, %esi #We need to move to next array element %esi=1
addl $1, var1(,%esi,4) #displacement + 1*4 => 134516928 + 4 = 134516932
# Value stored at address 134516932 is 11, add 1 to 11 and store it back at 134516932,
# so 12 is stored
movl 4(%edi), %ebx #copy the result to %ebx, edi=134516928, 4+134516928=134516932, value at 134516932 is 12
movl $2, %esi #we move to var1[2] now, i.e 3rd element.
addl $1, var1(,%esi,4) #displacement + 2*4 => 134516928 + 8 => 134516936
# Value stored at 134516936 is 12, add 1 to 12 and the result 13 is stored back at 134516936
movl 8(%edi), %ecx # Access 3rd element and store it in ecx. %edi=134516928, 134516928+8=134516936, i.e. get value 13 and store it in ecx
movl $3, %esi # we move to var1[3] now
addl $1, var1(,%esi,4) # displacement=var1 (starting address), %esi=3(index), scale=4
# effective address=displacement+[base+index*scale]
# EA=displacement+[0+3*4]
# EA=displacement+12 => 134516928 + 12=134516940
# Value stored at 134516940 is 13, add 1 to 13 and 14 is stored back
movl 12(%edi), %edx # %edx gets var1[3] = 14
movl $1, %eax
int $0x80
Another Example:
We add all the values in the array var1 and store the sum in %ebx. var1 is accessed using displacement(,index,scale)
.section .data
var1:
.int 10,11,12,13
.section .text
.globl _start
_start:
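movl $var1, %edi # %edi = address of var1[0], advanced by 4 on every pass
movl %edi, %ecx
movl $0, %esi #Since Index should be a register, %esi will be index register
addl $12, %ecx # %ecx = address of the last element var1[3]; we need the ending address to come out of the loop
movl (%edi), %ebx # the sum starts with var1[0] = 10
start_loop:
cmpl %edi, %ecx # has %edi reached the last element?
je loop_exit
incl %esi # next index
addl var1(,%esi,4), %ebx # add var1[%esi] to the sum
addl $4, %edi
jmp start_loop
loop_exit:
movl $1, %eax # exit; the exit status is the sum in %ebx (46)
int $0x80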
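For comparison, the same summation can be written in C with the element address formed explicitly as start + index * 4, which is what var1(,%esi,4) computes. This is only a sketch under that assumption; sum_array is a made-up name and the loop is not a line-by-line translation of the assembly above.

```c
#include <stdint.h>
#include <stddef.h>

/* Sum 'count' 32-bit elements. Each element address is computed as
   start + index * 4, mirroring displacement(,index,scale) with scale=4. */
int32_t sum_array(const int32_t *base, size_t count)
{
    const unsigned char *start = (const unsigned char *)base; /* displacement */
    int32_t total = 0;

    for (size_t index = 0; index < count; index++) {
        const int32_t *element =
            (const int32_t *)(start + index * sizeof(int32_t));
        total += *element;
    }
    return total;
}
```

For the array 10,11,12,13 used above, sum_array returns 46, the same value the assembly program leaves in %ebx.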
===================================================================================================
Base+index*scale:
1. In this addressing mode, 3 components (base, index & scale) are provided, and the effective address base + index*scale is computed by the processor while executing the instruction.
2. The two components base & index are provided by registers; scale is a constant
Example:
.section .data
var1:
.int 10,11,12,13
.section .text
.globl _start
_start:
movl $var1, %edi #we copy the starting address in %edi
movl $0, %esi #Index must be a register; for the first element var1[0], %esi gets 0
addl $1, (%edi,%esi,4) #index(%esi)=0, scale=4, because each element of the array takes 4 bytes
#displacement=0 (omitted)
# (%edi,%esi,4) ==>> starting address + 0*4
#=> starting address => %edi
movl (%edi), %eax
movl $1, %esi #We need to move to next array element
addl $1, (%edi,%esi,4) #[starting address+1*4]=>(starting address+4), now adding 1 to the value
# stored in (starting address+4)
movl 4(%edi), %ebx #copy the result to %ebx
movl $2, %esi #we move to var1[2] now
addl $1, (%edi,%esi,4) #starting address+2*4
movl 8(%edi), %ecx
movl $3, %esi # we move to var1[3] now
addl $1, (%edi,%esi,4)
movl 12(%edi), %edx
movl $1, %eax
int $0x80
====================================================================================================
Base+Index*scale+Displacement:
1. This mode of addressing is the most powerful mode in IA32 processors. In this mode all four components are specified & the effective address is computed
using 2 additions & 1 multiplication
2. This addressing mode involves 2 registers and 2 constants
3. The registers are used for address information (base and index)
4. The constants are used for the displacement & scale
5. This addressing mode can be used to address an element of a 2-D array
Example:
Consider a 2 x 4 matrix as below:
j 0 1 2 3
-----------------------------------------------------
0 | 12 | 13 | 14 | 15 |
i ------------------------------------------------------
1 | 16 | 17 | 18 | 19 |
------------------------------------------------------
Size of the array => 2 x 4 (2 rows, n=4 columns)
In memory the above array is stored in row-major format, in which the array elements are stored row by row. Example as below:
matrix value Address;
----------------------------------------------
A0,0 | 12 | 20
A0,1 | 13 | 21
A0,2 | 14 | 22
A0,3 | 15 | 23
A1,0 | 16 | 24
A1,1 | 17 | 25
A1,2 | 18 | 26
A1,3 | 19 | 27
Assume the size of each element is denoted by "s", the number of columns by "n", and the address of A[0][0] by "A".
If A=20 and the size is 1 byte, then the address of A[0][1] = A + s ===> 20 + 1 = 21
Address of A[0][j] is A + j*s, so the address of A[0][n-1] or A[0][4-1] ==> A[0][3] = A + 3*s => 20 + 3*1 = 23
Each row of the array occupies n*s bytes, which implies that in the above example each row occupies 4 bytes
Address of row 1 (i=1) will be A + n*s => 20 + 4*1 = 24
Address of the first element of row i is A + i*n*s
Generalized form of the address of A[i][j] => A + i*n*s + j*s
The base register is used to keep i*n*s; this provides the offset of the start of row i
The index register is used to keep j
s or size is represented by the scale
The starting address of the array (A) can be given by the displacement
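The row-major formula above maps directly on to displacement + base + index*scale. A small C sketch of that mapping, assuming the same symbols (A, i, j, n, s) as the notes; the function name element_address is hypothetical.

```c
#include <stdint.h>

/* Address of A[i][j] in a row-major array with n columns and
   element size s, where a is the address of A[0][0]. */
uint32_t element_address(uint32_t a, uint32_t i, uint32_t j,
                         uint32_t n, uint32_t s)
{
    uint32_t base = i * n * s;  /* offset of row i: kept in the base register */
    return a + base + j * s;    /* displacement + base + index * scale */
}
```

With A=20, n=4 and s=1 as in the table, element_address(20, 1, 2, 4, 1) gives 26, the address of A1,2.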
====================================================================================================
Functions:
------------
1. To assist programmers in working together in groups, it is necessary to break programs apart into separate pieces, which communicate with each
other through well-defined interfaces
2. Functions are units of code that do a defined piece of work on specified types of data.
3. The data items (inputs) a function is given to process are called its parameters
4. A typical program is composed of thousands of functions, each with a small, well-defined task to perform. However, ultimately there are things that
one cannot write functions for, which must be provided by the system. Those are called primitive functions. They are the basics which everything else is built
from.
Functions are composed of below Elements:
--------------------------------------------------------
A. Function Name:
B. Function Parameters
C. Local variables
D. Static variables
E. Global Variables
F. Return Address
G. Return Value
---------------------------
| |
| Functions |
----------------------------
|
|---------------------------------------------------------------------------------------------------------------------------------------------------------------|
| | | | | | |
Function Name Function local static Global Return Address Return Value
Parameters variables variables variables
A. Function name:
-----------------------
i. A function's name is a symbol that represents the address where the function's code starts
ii. In ALP a symbol is defined by typing the function's name followed by a colon immediately before the function's code.
Example:
.section .text
.globl _start
_start: <-------------------_start is a symbol for the address of the instruction "movl $1, %eax"
movl $1, %eax
movl $100, %ebx
int $0x80
objdump -d ./exit
./exit: file format elf32-i386
Disassembly of section .text:
08048054 <_start>:
8048054: b8 01 00 00 00 mov $0x1,%eax
8048059: bb 64 00 00 00 mov $0x64,%ebx
804805e: cd 80 int $0x80
As we can see from the above disassembly, _start points to address 8048054, which is the address of the instruction "mov $0x1,%eax"
B. Function Parameters
----------------------------
i. A function's parameters are the data items that are explicitly given to the function for processing
example:
sine(2) = 0.909 (with 2 in radians)
sine is the name of the function
2 is the parameter
ii. Some functions have many parameters
iii. Some functions have no parameters
C. local variables
-----------------------
i. Local variables are data storage that a function uses while processing
ii. This data storage is thrown away when the function's execution is completed (i.e. when it returns)
iii. It's like a sheet of rough paper which we get every time the function is invoked. Once the function completes, the rough paper is discarded.
iv. Local variables of a function cannot be accessed by any other function
D. Static variables
-------------------------
i. Static variables are data storage that a function uses while processing
ii. This data storage is not thrown away when the function completes its execution
iii. This data storage is re-used every time the function is activated
iv. This data storage cannot be accessed outside of the function.
v. This data storage should be used with caution
E. Global Variables
------------------------
i. Global variables are data storage that a function uses while processing
ii. This data storage is managed outside of the function's execution, i.e. it is available to other functions also
iii. This data storage is thrown away only when the whole program completes execution
Examples:
i. A simple text editor may put the entire contents of the file it is working on in a global variable, so it doesn't have to be passed to every function that
operates on it.
ii. Configuration values are often stored in global variables.
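The difference between local, static and global variables is easy to see in a high-level language. The C sketch below is only an illustration; the names call_total and next_id are invented for this example.

```c
int call_total = 0;          /* global: visible to every function,
                                lives until the program ends */

int next_id(void)
{
    static int last_id = 0;  /* static: keeps its value between calls,
                                but is not accessible outside next_id */
    int step = 1;            /* local: created afresh on every call and
                                thrown away when the function returns */

    last_id += step;
    call_total++;
    return last_id;
}
```

Calling next_id() twice returns 1 and then 2, because last_id survives between calls, while step is re-created each time. Any other function can read call_total.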
F. Return Address
-----------------------
i. The return address is an "invisible" parameter
ii. The return address is not directly used by the function during execution
iii. The return address is used to find where the processor should resume executing after the function completes its execution
iv. This is needed because a function can be called from different parts of the program; once the function completes execution, it should
go back to where it was called from.
v. In most programming languages, this parameter is passed automatically when the function is called
G. Return Value
---------------------
i. The return value is the main method of transferring data back to the main program
ii. Most programming languages allow only a single return value for a function
5. The above pieces of a function are mostly the same in different programming languages. How each piece is specified differs in each one.
6. The way variables (local, global & static), function parameters & return values are transferred between functions varies from language to language. This variance
is called the "calling convention"
7. A calling convention describes how functions expect to receive & return data when they are called
8. Assembly language can use any calling convention
9. The C calling convention is the most widely used
10. To understand how functions work, it's important to understand how the "stack" works.
11. Each computer program that runs uses a region of memory called the stack to enable functions to work properly.
Stack:
--------
A. IA32 processors implement the stack in memory.
B. A stack is a data structure in which data items are added and removed in last in, first out (LIFO) order
Example:
Function f1 Function f3 Function f2 functionf4
----------- |-->---------- |->----------- |->-----------
statement1 | statement 1 | statement1 | statement1
statement2 | statement 2 | statement2 | statement2
... | End function | call functionf4------| statement3
---- | ^ | Statement X<----------- end function
.. | | | ......
call functionf2:<---|---|------------| Statement N
statement X <-------|---|--------------end function
call to functionf3--| |
statement N |
end function<-----------|
->In the above example, function f1 makes 2 calls, to function f2 & function f3 respectively
->Function f2 in turn makes one call to function f4
->The execution starts with function f1; execution continues till the call is made to f2
->At this time, the execution control is transferred to function f2
->After execution of f2 is completed, execution control is transferred back to f1
->The same semantics is maintained when function f2 makes a call to function f4.
->While executing function f4, the execution path can be traced as:
f1->f2->f4
->When function f4 finishes, execution control is returned back to f2
->Similarly when function f2 completes, execution control is handed back to f1
->There is a natural order of passing control upon calls to functions and the reverse order
of passing control upon return from the functions. A stack is an obvious choice here.
C. Each time a function is called, the return address is pushed on to the stack.
D. The top of the stack therefore provides the address within the calling function to which control must return when execution of
the called function is over
E. IA32 processors provide a mechanism to implement the stack and several instructions that use this stack implicitly
F. The actual storage of the stack is created in memory along with other items such as program instructions,
data etc.
G. The top of the stack is represented by register esp, which stores the memory address of the location where the last
data item was added.
H. Upon each push, register esp is first decremented by the number of bytes in the data item. The value of the data item is
then stored in the memory location whose address is available in register esp
Example:
1. If we wish to push the contents of register eax on to the stack, register esp will be decremented by 4 and then the value of register
eax will be stored in memory at the address given in esp.
Example:
0x11BC 0x11BD 0x11BE 0X11BF 0X11C0
Assume the current value of %esp is 0x11C0,
Before push
%eax: 0x13579BDF
%esp: 0x11C0
After push operation
0x11BC 0x11BD 0x11BE 0X11BF 0X11C0
0XDF 0X9B 0X57 0x13
%eax: 0x13579BDF
%esp: 0x11BC
The contents of memory after the push operation are shown above (LSB 0xDF at the lowest address 0x11BC)
I. The pop operation is the reverse of the push operation. In the case of a pop, first the appropriate number of bytes are read
from the memory addressed by register esp and then register esp is incremented by the size of the data operand.
J. The machine stack in IA32 processors is very powerful and can be used in several ways:
-> The most important use of the stack is to keep the return address for function calls
-> Another use of the stack is to temporarily save the contents of registers & recover them later
-> Implementations of high-level programming languages typically use the stack to pass parameters to functions
-> Local variables of a high-level language function are typically allocated on the stack.
-> Execution control of a recursive function is implemented using the stack, where the functions return in the reverse
order of the calls. The function that is called last returns first, and the function invocation that is made first
is concluded last.
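The push and pop rules in H and I can be sketched as a small C model. This is only an illustration under stated assumptions: the Machine struct, the 64 KiB memory size and the helper names pushl/popl are hypothetical stand-ins for the real instructions, not part of IA32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: 64 KiB of byte-addressed memory plus a stack pointer. */
enum { MEM_SIZE = 0x10000 };

typedef struct {
    uint8_t  mem[MEM_SIZE];
    uint32_t esp;            /* address of the last item pushed */
} Machine;

/* Models pushl: decrement esp by 4 first, then store the value,
   least significant byte at the lowest address (little-endian). */
static void pushl(Machine *m, uint32_t value) {
    assert(m->esp >= 4 && m->esp <= MEM_SIZE);
    m->esp -= 4;
    for (int i = 0; i < 4; i++)
        m->mem[m->esp + i] = (uint8_t)(value >> (8 * i));
}

/* Models popl: read 4 bytes at esp first, then increment esp by 4. */
static uint32_t popl(Machine *m) {
    assert(m->esp <= MEM_SIZE - 4);
    uint32_t value = 0;
    for (int i = 0; i < 4; i++)
        value |= (uint32_t)m->mem[m->esp + i] << (8 * i);
    m->esp += 4;
    return value;
}
```

Starting with esp = 0x11C0 and calling pushl(&m, 0x13579BDF) leaves esp at 0x11BC with the bytes DF 9B 57 13 at 0x11BC..0x11BF, matching the example above. A following popl(&m) returns 0x13579BDF and restores esp to 0x11C0.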
i. A stack is like a pile of papers on a desk, to which more papers can keep being added.
ii. The thing we are currently working on is always at the top.
iii. Computers have a stack too. The computer's stack lives at the very top addresses of memory.
iv. We can add (push) values on to the stack using the instruction "pushl".
v. pushl pushes either a register (eax, ebx, ecx, etc.) or a value on to the top of the stack.
vi. The "top" of the stack, in the nomenclature used here, is actually (physically) the bottom of the stack's memory.
________________________________________________________________________________________________
|29 |28 |27 |26 |25 |24 |
| ---------- | ---------- | -------- | ---------- | ---------- | ---------- | <--------Stack
|23 |22 |21 |20 |19 |18 |<--------------------------------Top
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
|17 @ |16 ! |15 F |14 ) |13 G |12 H |
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
|11 H |10 E |9 L |8 L |7 O |6 W |
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
|5 O |4 R |3 L |2 D |1 ! |0 ! |
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
_________________________________________________________________________________________________
vii. From the above memory representation we can see that
A. Memory addresses start at 29 and run down to 0 (in a real machine, more like 0xffffffff down to 0x00000000)
B. The top of the stack is at the bottom of the stack's memory
C. The stack grows downwards, towards lower addresses
viii. When we refer to the top of the stack, we are actually referring to the lowest address the stack occupies
ix. We can also remove values from the stack, which is called a "pop"; the instruction used is "popl"
x. When we push a value on to the stack, the top of the stack moves to accommodate the additional value.
xi. We can keep pushing on to the stack until we hit our code or data.
xii. To know where the "top" of the stack is, we use the stack pointer register %esp, which always points to the current top of the stack
Examples:
-----------------------------------------------------------------------------------------------
|29 H |28 |27 |26 |25 |24 |
| ---------- | ---------- | -------- | ---------- | ---------- | ---------- | <--------Stack
|23 |22 |21 |20 |19 |18 |
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
In the above example we added H to the stack, so now the top (%esp) points to address 29
------------------------------------------------------------------------------------------------
|29 H |28 E |27 |26 |25 |24 |
| ---------- | ---------- | -------- | ---------- | ---------- | ---------- | <--------Stack
|23 |22 |21 |20 |19 |18 |
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
We added (pushed) E to the stack, so now the top (%esp) points to address 28 (the value E)
------------------------------------------------------------------------------------------------
|29 H |28 E |27 L |26 L |25 O |24 W |
| ---------- | ---------- | -------- | ---------- | ---------- | ---------- | <--------Stack
|23 O |22 R |21 L |20 D |19 |18 |
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
As you can see from the above example, we kept pushing characters on to the stack.
Now the top (%esp) of the stack points to address 20 (the value D)
xiii. The pushl instruction pushes a 4-byte value on to the stack. Since the stack moves down in memory, pushing a value means subtracting 4 from %esp:
when we do pushl, %esp is decremented by 4 and the value is stored at the new top.
note: pushl (push long, i.e. 4 bytes)
xiv. To remove something from the stack, we simply popl it, which copies the top value to the destination and then adds 4 to %esp
xv. To access the current (topmost) value on the stack without removing it, we can use the %esp register in indirect addressing mode
example:
movl (%esp), %eax
The above copies whatever value is at the top of the stack into the %eax register; %esp itself is unchanged
xvi. If we do the operation,
movl %esp, %eax
the address that the top currently points to (the value of %esp itself) is copied into the %eax register.
xvii. In the C language calling convention, the stack is the key element for implementing a function's local variables, parameters & return address.
xviii. Before executing a function, the program pushes all the parameters for the function on to the stack in the reverse order that they are documented.
Example: if we call sum(a,b), then
------------------------------------------------------------------------------------------------
|29 b |28 a |27 |26 |25 |24 |
| ---------- | ---------- | -------- | ---------- | ---------- | ---------- | <--------Stack
|23 |22 |21 |20 |19 |18 |
| ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
As we can see, b is pushed first and then a, so now the top points to 28 (i.e. the esp register holds address 28)
xix. The program then uses the call instruction, indicating which function to start. call pushes the return address on to the stack and transfers control to the function.
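The parameter-passing steps above can be sketched in C. This is a rough model, not the actual machine behavior: the Stack struct, the 64-word size, and the helpers pushl, peek, call_sum and sum_body are hypothetical names, and each slot stands for one 4-byte stack entry rather than the 1-byte cells drawn in the diagrams.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_WORDS = 64 };

typedef struct {
    uint32_t slot[STACK_WORDS]; /* slot[i] models the 4 bytes at address 4*i */
    uint32_t esp;               /* byte address of the current top */
} Stack;

/* Models pushl: subtract 4 from esp, then store at the new top. */
static void pushl(Stack *s, uint32_t value) {
    assert(s->esp >= 4 && s->esp % 4 == 0 && s->esp <= 4 * STACK_WORDS);
    s->esp -= 4;
    s->slot[s->esp / 4] = value;
}

/* Models movl offset(%esp), %eax: read a word without moving esp. */
static uint32_t peek(const Stack *s, uint32_t offset) {
    assert((s->esp + offset) / 4 < STACK_WORDS);
    return s->slot[(s->esp + offset) / 4];
}

/* Caller side of sum(a, b): push b, push a, then the return
   address, which is what the call instruction would push. */
static void call_sum(Stack *s, uint32_t a, uint32_t b, uint32_t return_addr) {
    pushl(s, b);
    pushl(s, a);
    pushl(s, return_addr);
}

/* Callee side on entry: (%esp) is the return address,
   4(%esp) is a and 8(%esp) is b. */
static uint32_t sum_body(const Stack *s) {
    return peek(s, 4) + peek(s, 8);
}
```

Because the parameters are pushed in reverse order, the first parameter ends up closest to the top of the stack, just above the return address.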
Stack related data Movements:
------------------------------
1. The stack related data movement instructions in IA32 processors implement push and pop operations on the stack.
2. The syntax of push and pop is:
push src
pop dest
3. The src operand in the case of the "push" instruction can be
->Immediate constant
->Register
->Memory variable
4. The size of the operand can only be 16 bits or 32 bits; 8-bit operands cannot be pushed.
5. If the operand of the "push" instruction is specified using immediate addressing or memory addressing, the size
of the operand is not implicitly known.
6. In order to specify the size of the operand, the "push" instruction can be suffixed with "w" or "l" to indicate a 16-bit or
32-bit operation respectively (ex: pushl, pushw).
7. If the operand is an immediate constant or a memory variable, the w or l suffix (pushw or pushl) must be used to
provide the size of the operand.
8. In programs, several registers are often saved on the stack at the same time & later restored. IA32 provides another
set of push and pop instructions for this purpose.
9. "pusha" pushes all eight general purpose registers on to the stack.
10. The "popa" instruction restores the 32-bit registers from the stack in the reverse order of the push.
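11. The "pushf" instruction pushes the 32-bit eflags register on to the stack, and the "popf" instruction pops a 32-bit value from the stack into eflags. popf is not an exact inverse of pushf: some flag bits (for example the privileged ones) may be left unchanged.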
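As a rough illustration of items 9 and 10, the C sketch below models the 32-bit pusha and popa. It assumes the usual IA32 push order (eax, ecx, edx, ebx, the value esp had before the instruction, ebp, esi, edi). The Cpu struct and the helper names push32, pop32, pusha and popa are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
    uint32_t stack[32];   /* stack[i] models the 4 bytes at address 4*i */
} Cpu;

static void push32(Cpu *c, uint32_t v) {
    assert(c->esp >= 4 && c->esp % 4 == 0 && c->esp <= sizeof c->stack);
    c->esp -= 4;
    c->stack[c->esp / 4] = v;
}

static uint32_t pop32(Cpu *c) {
    assert(c->esp % 4 == 0 && c->esp < sizeof c->stack);
    uint32_t v = c->stack[c->esp / 4];
    c->esp += 4;
    return v;
}

/* Models pusha: saves all eight registers; the esp slot holds the
   value esp had before the instruction started. */
static void pusha(Cpu *c) {
    uint32_t old_esp = c->esp;
    push32(c, c->eax); push32(c, c->ecx);
    push32(c, c->edx); push32(c, c->ebx);
    push32(c, old_esp);
    push32(c, c->ebp); push32(c, c->esi); push32(c, c->edi);
}

/* Models popa: restores in reverse order; the saved esp slot is
   skipped rather than loaded back into esp. */
static void popa(Cpu *c) {
    c->edi = pop32(c); c->esi = pop32(c); c->ebp = pop32(c);
    (void)pop32(c);
    c->ebx = pop32(c); c->edx = pop32(c);
    c->ecx = pop32(c); c->eax = pop32(c);
}
```

A pusha followed later by a popa brings every register back to its saved value and leaves esp where it was before the pusha, which is why the pair is convenient for saving and restoring registers around a block of code.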
Handling Signed & Unsigned Numbers:
-----------------------------------
1. IA32 architecture supports both unsigned & signed numbers.
2. For the signed numbers, 2's complement notation is used.
3. In 2's complement representation the most significant bit of an operand is 1 for negative
numbers and 0 for positive numbers
4. The IA32 architecture provides several instructions that change the size of signed numbers.
Ex: an 8-bit data item can be converted in to a 16-bit data item using the instruction "cbw"
5. While converting a data item to a larger size, the sign bit is duplicated in to all the extra bits.
Example: consider the byte value 0x08; the equivalent 16-bit value is 0x0008, which is obtained by duplicating
the sign bit (0) in to all the extra bits of the 16-bit number.
If the number was originally negative, all the extra bits are set to 1. For example, the byte 0xF8 (-8) becomes 0xFFF8.
6. Following instructions in IA32 architecture are provided to convert data sizes for the signed numbers:
cbtw
cwtd
cwtl
cltd
movsbw src, dest
movsbl src, dest
movswl src, dest
7. cbtw (also known as cbw): converts the 8-bit signed number stored in al to a 16-bit signed number and stores it in ax
8. cwtd (cwd): converts the 16-bit signed number stored in ax to a 32-bit number. The most significant 16 bits of the resultant 32-bit number are returned in
register dx while register ax holds the lower 16 bits
9. cwtl (cwde): performs a similar operation to cwtd except that it returns the 32-bit number in register eax
10. cltd (cdq): converts a 32-bit number to a 64-bit number. The input 32-bit number is in eax and the output goes in edx (upper 32 bits) and eax (lower 32 bits)
11. The movs.. instructions are generic, as they convert 8-bit to 16-bit (movsbw), 8-bit to 32-bit (movsbl) and 16-bit to 32-bit (movswl)
12. There are a few more instructions in the IA32 architecture to convert data sizes for unsigned numbers; these fill the extra bits with zeros
movzbw src, dest
movzbl src, dest
movzwl src, dest
src can be register or memory
dest can only be a register
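The difference between sign extension (movs.., cbtw, cltd) and zero extension (movz..) can be sketched in C. The function names below simply borrow the instruction mnemonics for readability; they are hypothetical helpers that model the results, not the instructions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low from_bits of value to 32 bits by copying
   the sign bit in to all the extra bits. */
static uint32_t sign_extend(uint32_t value, unsigned from_bits) {
    assert(from_bits >= 1 && from_bits < 32);
    uint32_t low_mask = (1u << from_bits) - 1u;
    uint32_t sign_bit = 1u << (from_bits - 1);
    value &= low_mask;
    return (value & sign_bit) ? (value | ~low_mask) : value;
}

/* Zero-extend: the extra bits are always 0. */
static uint32_t zero_extend(uint32_t value, unsigned from_bits) {
    assert(from_bits >= 1 && from_bits < 32);
    return value & ((1u << from_bits) - 1u);
}

static uint16_t movsbw(uint8_t src)  { return (uint16_t)sign_extend(src, 8); }
static uint32_t movsbl(uint8_t src)  { return sign_extend(src, 8); }
static uint32_t movswl(uint16_t src) { return sign_extend(src, 16); }
static uint32_t movzbl(uint8_t src)  { return zero_extend(src, 8); }

/* Models the edx half of cltd: 32 copies of the sign bit of eax. */
static uint32_t cltd_edx(uint32_t eax) {
    return (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
}
```

With these helpers, the byte 0x08 extends to 0x0008 either way, while the byte 0xF8 becomes 0xFFFFFFF8 under movsbl but 0x000000F8 under movzbl. The two families therefore differ only for values whose most significant bit is 1.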
====================================================================================================
Chapter 3: Basic Data Manipulation
-----------------------------------------------------
1. IA32 instructions can be divided in several categories:
A. Data manipulation
B. Data Movement etc.
2. The most commonly used instruction in IA32 processors is "mov" instruction.
3. "mov" instruction can be used for following purposes:
-> initialize a register
-> initialize a memory location
-> copy a value from register to memory
-> copy value from memory to register
-> copy from register to register
4. Syntax of "mov" instruction is as follows:
mov src, dest
5. During the execution of the above instruction the value of the src operand is copied to "dest" operand.
6. "src" operand may be immediate constant in which case it causes initialization of "dest" operand
7. IA32 processors also have a restriction on the type of operands: an instruction can have only one memory operand. So "mov" cannot copy a value
from a memory variable to another memory location.
8. To perform such an operation (memory to memory), more than one instruction has to be used.
Example1:
1 movl 0x100, %eax
2 movl 0x200, %ebx
3 movl %ebx, 0x100
4 movl %eax, 0x200
Assume %eax had the value: 0xabcdabcd, %ebx had 0xdeafdeaf prior to execution
Memory Image Before execution:
0x100 0x101 0x102 0x103
0x10002000 <-----------Value
0x200 0x201 0x202 0x203
0x30004000 <----------Value
After execution of line1: %eax = 0x10002000
0x100 0x101 0x102 0x103
0x10002000 <-----------Value
0x200 0x201 0x202 0x203
0x30004000 <----------Value
After Execution of line2: %ebx = 0x30004000
0x100 0x101 0x102 0x103
0x10002000 <-----------Value
0x200 0x201 0x202 0x203