pasim: Bundled pointer dereference followed by multiply doesn't correctly dereference #41

Closed
Emoun opened this issue Nov 4, 2018 · 8 comments

@Emoun
Member

Emoun commented Nov 4, 2018

If we load a label into a register as part of a bundle, then dereference the register, followed by using a multiply on the resulting value, it does not execute correctly on pasim.

Take this program (main.c):

#include <stdio.h>
volatile int _1 = 1;
int init_func(){
  int x;
  asm volatile(
    "{add $r3 = $r0, _1\n"
    "nop}\n"
    "lwc $r5 = [$r3]\n"
    "li $r4 = 2\n"
    "mul $r4, $r5\n"
    "nop\n"
    "mfs %[x] = $s2\n"
    :[x] "=r" (x)
    :
    :"$r3", "$r4", "$r5", "$r4", "$s2", "$s1"
  );
  return x;
}
// Should print "2" for correct execution
int main(){
  printf("%d\n", init_func());
}

Looking at the inline assembly, we start by loading the address of the label _1 into r3. We then dereference r3 into r5, which means r5 = 1. We then load r4 = 2 and multiply r4 and r5, which should leave the value 2 in the special register s2 (the lower half of the mul result).
The code compiles successfully using patmos-clang main.c, but running it in pasim (using pasim a.out) results in the program printing 0, where we would expect 2.

This error is very specific. If we do any one of the following, the code will execute correctly in pasim:

  • Take the add out of the bundle. No matter which other instruction shares the bundle with it, the add will not succeed.
  • Use any other operation instead of the mul. E.g. if we exchange the mul for an add the correct result will be printed.
  • Use a specific immediate value in the add, or use a register, instead of a label.

Looking at pasim debug prints, I suspect the problem lies in the stalling of the regular pipeline by the multiply pipeline, but I am not sure. I will investigate.

@schoeberl
Member

The mul operation has some latency (it does NOT stall the pipeline). How many cycles is not even documented in the handbook :-(

I assume that this mul latency is the reason why you are not getting the result. Add some more nops and it will be fine. Furthermore, please test this also against the emulator. That represents the real hardware.

@Emoun
Member Author

Emoun commented Nov 10, 2018

I have tried adding several hundred nops without any difference. Also, this error doesn't appear if the initial add isn't part of a bundle, which means the multiply's latency can't be the problem.
I.e. this code will execute correctly:

#include <stdio.h>
volatile int _1 = 1;
int init_func(){
  int x;
  asm volatile(
    "add $r3 = $r0, _1\n"
    "lwc $r5 = [$r3]\n"
    "li $r4 = 2\n"
    "mul $r4, $r5\n"
    "nop\n"
    "mfs %[x] = $s2\n"
    :[x] "=r" (x)
    :
    :"$r3", "$r4", "$r5", "$r4", "$s2", "$s1"
  );
  return x;
}
// Should print "2" for correct execution
int main(){
  printf("%d\n", init_func());
}

I have tested the emulator patemu, and it appears to have the same error.
Next time I'm at the office, I'll try to get an FPGA to run this example on. Until then I will look at pasim's debug output to see what is going on.

@Emoun
Member Author

Emoun commented Nov 10, 2018

debug_cleaned.txt is a cleaned version of the debug output from pasim for the init_func function.
We can see that _1 gets loaded into r3 correctly (at cycle 24677), but the dereference into r5 never happens.

@Emoun
Member Author

Emoun commented Nov 10, 2018

The problem is not isolated to multiplies, but seems to be a problem with the load. Using an add instead of a mul also doesn't execute correctly:

#include <stdio.h>
volatile int _1 = 1;
int init_func(){
  int x;
  asm volatile(
    "add $r3 = $r0, _1\n"
    "lwc $r5 = [$r3]\n"
    "li $r4 = 2\n"
    "add %[x] = $r4, $r5\n"
    :[x] "=r" (x)
    :
    :"$r3", "$r4", "$r5"
  );
  return x;
}
// Should print "3" for correct execution
int main(){
  printf("%d\n", init_func());
}

The above code should print 3 but prints 2. I had tried this before but had not noticed that the output was wrong.
Adding nops between the add and the load does not alleviate the problem.

@Emoun
Member Author

Emoun commented Nov 11, 2018

I cannot recreate the problem with automatic testing, i.e. creating a test in simulator/tests and running it gives the correct result no matter what I do.
The tests use paasm to build the program, which produces a binary format, while I'm using patmos-llc, which produces an ELF format. This makes me think it's a problem with pasim's ELF loader.

@ghost

ghost commented Nov 12, 2018 via email

@Emoun
Member Author

Emoun commented Nov 12, 2018

@flopsi you are right.
I had checked the value of r3 against the value of _1 in the objdump, but hadn't thought that it would be truncated silently, which means the value in the objdump is wrong. I compared the code with and without the bundles and the immediate value is truncated in the bundled version.
I will try to see if I can update patmos-llc to throw an error in cases like this.

@Emoun
Member Author

Emoun commented Nov 14, 2018

I will close this issue, as it is not an issue with patmos or the simulator. It will be fixed with t-crest/patmos-llvm#15.

@Emoun Emoun closed this as completed Nov 14, 2018