This repository has been archived by the owner on Aug 17, 2022. It is now read-only.

Disassembler issue #171

Open
meghamegha opened this issue May 24, 2019 · 7 comments

Comments

@meghamegha

meghamegha commented May 24, 2019

Hello,
We found these issues with disassembler.

Details:

riscv64-unknown-elf-objdump --v
GNU objdump (GNU Binutils) 2.31.1

disasm-error.zip

$ riscv64-unknown-elf-gcc -g errors.c -o errors
$ riscv64-unknown-elf-objdump -D -S errors > errors_disasm.txt

Testcase:

#include <stdio.h>
#include <stdlib.h>
volatile int n=80;
void main()
{
    typedef struct myerr
    {
        int error;
        const char *detail;
    }
    myerr;
    static myerr data[] =
    {
        { 1, "Pass" },
        { 2, "Fail" },
        { -1, ((void *)(0)) }
    };

    const signed char c_char[64] =
    {
        'A', 'B', 'C', 'D', 'E', 'F', 'G',
        'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O',
        'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W',
        'X', 'Y', 'Z'
    };
    const signed char *p_ptr = c_char;
    int i=0;

    for (i=0; i<n ; i++)
        printf("Hello World");
    p_ptr = c_char;
}

Result:

Issue1:
We are getting 0xffff in the .rodata section when the code above is disassembled. Is that correct? It is trying to decode constants as instruction code.
symtab:
0000000000019db0 l O .rodata 0000000000000022 bmask

0000000000019db0 <bmask>:
19db0: ffff 0xffff
19db2: fffe sd t6,504(sp)

Issue2:
for (i=0; i<n ; i++)
1029a: fe042623 sw zero,-20(s0)
1029e: a819 j 102b4 <main+0x11e>
printf("Hello World");
..
for (i=0; i<n ; i++)

There are two for loops emitted.

@jim-wilson
Collaborator

Yes, it is expected that there may be illegal instructions in the data section, because the data section does not contain instructions, it contains data.

The solution is to stop using objdump -D and use objdump -d instead. Objdump -D is only useful in obscure situations, and is not a normally recommended way to run objdump.
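To see why data bytes can come out looking like real instructions, here is a minimal sketch. It is not objdump's code: it is a hand-written decoder for a single RV64C encoding (c.sdsp, which objdump prints as sd rs2,offset(sp)), and the function name decode_c_sdsp is made up for this example. It assumes the c.sdsp bit layout given in the RISC-V ISA manual.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 * Try to decode a 16-bit halfword as the RV64C instruction c.sdsp.
 * Returns 1 and fills *rs2 and *offset on a match, 0 otherwise.
 *
 * Assumed layout (RV64C, quadrant 2):
 *   bits 15:13  funct3 = 111
 *   bits 12:10  offset[5:3]
 *   bits  9:7   offset[8:6]
 *   bits  6:2   rs2
 *   bits  1:0   op = 10
 */
int decode_c_sdsp(uint16_t hw, unsigned *rs2, unsigned *offset)
{
    if ((hw & 0x3u) != 0x2u)            /* not quadrant 2 */
        return 0;
    if (((hw >> 13) & 0x7u) != 0x7u)    /* funct3 is not 111 */
        return 0;

    *rs2 = (hw >> 2) & 0x1fu;
    *offset = (((hw >> 10) & 0x7u) << 3)   /* offset[5:3] */
            | (((hw >> 7) & 0x7u) << 6);   /* offset[8:6] */
    return 1;
}
```

Fed the halfword 0xfffe from the dump above, this gives rs2 = 31 (t6) and offset = 504, which is the sd t6,504(sp) that objdump printed. The halfword 0xffff has low bits 11, so it is not a 16-bit encoding at all, and objdump prints it as a raw value.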

@meghamegha
Author

Hi jim,

Issue 2 is not resolved. I am using -d instead of -D, but I am still getting multiple source lines.
for (i=0; i<n ; i++)
1029a: fe042623 sw zero,-20(s0)
1029e: a819 j 102b4 <main+0x11e>
printf("Hello World");
102a0: 67e9 lui a5,0x1a
102a2: a4078513 addi a0,a5,-1472 # 19a40 <__clzdi2+0x32>
102a6: 1ca000ef jal ra,10470
for (i=0; i<n ; i++)

Two for loops are emitted

@jim-wilson
Collaborator

If you compile with a high optimization level, then the compiler may duplicate lines of code, or mix up the assembly instructions for multiple lines of code, etc. This makes it difficult to impossible to provide a one to one mapping of disassembly back to original source lines of code. One solution is to just emit original source lines multiple times. Another solution is to just emit source lines once, and then pretend that other assembly code for that source line belongs to some other source line. The result depends on what tools you are using to map assembly back to source, and how they display info.

Anyways, this is expected at high optimization levels.

@meghamegha
Author

Ok.
But we see the same result with the default options and with -O0 -g:
riscv64-unknown-elf-gcc -g test.c
and
riscv64-unknown-elf-gcc -g -O0 test.c

Is it an issue with the disassembler/objdump?

@jim-wilson
Collaborator

It is an issue with how compilers work. Each line of source code may be converted to more than one assembly instruction. And some lines of code may emit assembly instructions at different places in the output. A for loop in particular needs to emit code at the beginning of the loop, and at the end of the loop, with the loop body in the middle, so there are actually (at least) two disjoint places in the output that map back to the original for source line. So it is always wrong to assume that you can do a one to one mapping between source code and assembly code. This mapping is easier when you use -O0, but it is still not a one to one mapping. Some tools will hide this problem when they produce intermixed C/asm. Some tools don't hide it.
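As a rough illustration of the two disjoint places, the sketch below writes the same loop twice: once as a normal for, and once hand-lowered with goto in the order the disassembly in this thread suggests (init and a jump at the top, increment and test at the bottom). The function names are made up for this example, and the lowered layout is an assumption based on that disassembly, not a statement of what every compiler emits.

```c
/* The loop as written in the source: one for line. */
int loop_with_for(int n)
{
    int calls = 0;
    int i;

    for (i = 0; i < n; i++)
        calls++;            /* stands in for the printf call */
    return calls;
}

/* The same loop, hand-lowered to show where each piece of the for line lands. */
int loop_lowered(int n)
{
    int calls = 0;
    int i;

    i = 0;                  /* for line, first place: the init */
    goto check;             /* ...and the jump to the test */
body:
    calls++;                /* loop body (the printf line) */
    i++;                    /* for line, second place: the increment */
check:
    if (i < n)              /* ...and the condition */
        goto body;
    return calls;
}
```

Both functions return the same count for any n. In the lowered form, the code belonging to the for line sits on both sides of the body, which is why a tool that prints the source line in front of each group of instructions prints the for line twice.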

If you want a better answer, you need to provide a fully self contained testcase that demonstrates the problem, and which I can use to reproduce exactly what you are seeing. Otherwise, I'm just guessing.

@meghamegha
Author

Ok,

I am compiling the C file errors.c that I attached in the previous message.

Executing with default optimization option:
riscv64-unknown-elf-gcc -g errors.c -o error-default
riscv64-unknown-elf-objdump -d -S error-default > error_default.txt
error_default.txt

Executing with -O0 option:
riscv64-unknown-elf-gcc -O0 -g errors.c -o error-with-opt
riscv64-unknown-elf-objdump -d -S error-with-opt > error-with-opt.txt
error-with-opt.txt

I am still getting multiple source lines for the for loop.

@jim-wilson
Collaborator

The output looks correct to me. The first for line is the top of the loop, the second for line is the bottom of the loop. Note that a for loop is actually 3 statements separated by semicolons, and those 3 statements are emitted in different places in the code. If you don't like this, try rewriting the for as 3 lines of code:
for (i = 0;
i < n;
i++)
