forked from staticafi/llvm2c
masterThesis #1
Open: vmihalko wants to merge 107 commits into master from vmihalko-devel
void foo(int a) {
    int a; // remove this line
}
1. generate random test.c file [csmith]
2. compile test.c to binary [clang]
3. modify generated test.c file [fix-csmi.sh]
   - Here we replace the csmith.h header with only the necessary parts, to avoid header expansion while compiling to LLVM IR in the next step
4. compile to LLVM IR test.ll [clang]
5. run llvm2c and generate decompiled.c [llvm2c]
6. modify decompiled.c file [decom-fix-csmi.sh]
   - Here we add the csmith.h include and fix types for functions from csmith.h
7. compile decompiled.c to binary [clang]
8. run compiled binaries and compare their outputs

If something goes wrong at any point, all generated files are copied to tmp (for debugging purposes). If an exception is caught after running the compiled test.c, we continue to the next step.
Parse basic types:
- int
- char
- short
- long
- float
- double
- long double

and correctly recognize whether a type is signed or unsigned.
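A minimal sketch of this kind of parsing, assuming the type names arrive as C-style spellings with an optional leading `unsigned ` or `signed `. The names `BaseType`, `ParsedType` and `parse_basic_type` are hypothetical and are not taken from llvm2c.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result types for illustration only. */
typedef enum {
    BT_UNKNOWN, BT_INT, BT_CHAR, BT_SHORT, BT_LONG,
    BT_FLOAT, BT_DOUBLE, BT_LONG_DOUBLE
} BaseType;

typedef struct {
    BaseType base;
    bool is_unsigned;
} ParsedType;

/* Parse a basic type name; returns BT_UNKNOWN for anything unrecognized. */
ParsedType parse_basic_type(const char *name) {
    static const struct { const char *text; BaseType base; } table[] = {
        {"int", BT_INT}, {"char", BT_CHAR}, {"short", BT_SHORT},
        {"long", BT_LONG}, {"float", BT_FLOAT}, {"double", BT_DOUBLE},
        {"long double", BT_LONG_DOUBLE},
    };
    ParsedType out = { BT_UNKNOWN, false };

    /* Strip an optional signedness prefix and remember it. */
    if (strncmp(name, "unsigned ", 9) == 0) {
        out.is_unsigned = true;
        name += 9;
    } else if (strncmp(name, "signed ", 7) == 0) {
        name += 7;
    }

    /* Exact match against the supported base names. */
    for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i) {
        if (strcmp(name, table[i].text) == 0) {
            out.base = table[i].base;
            break;
        }
    }

    /* Floating-point types have no unsigned variant: reject them. */
    if (out.is_unsigned &&
        (out.base == BT_FLOAT || out.base == BT_DOUBLE ||
         out.base == BT_LONG_DOUBLE)) {
        out.base = BT_UNKNOWN;
    }
    return out;
}
```

For example, `parse_basic_type("unsigned int")` yields `BT_INT` with `is_unsigned` set. The sketch treats plain `char` as not unsigned, although its signedness is implementation-defined in C.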
from metadataTypeInfo
E.g. `int *` or `unsigned int*`
Which prints strings to llvm::errs().
From https://lists.llvm.org/pipermail/cfe-dev/2013-January/027302.html:

When a function has a struct parameter or return type, Clang may lower a struct parameter into...
- a "byval" pointer (for a struct with several different members)
- a vector (for a struct with a few float members)
- two doubles (for a struct with two double members)
- an i64 (for a struct with two i32 members)

... and possibly more variations.

But there is no information in the metadata about types created in this way. Therefore, we detect the use of the struct type as an argument or return value of a function and do not reconstruct these types from the metadata.
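The struct shapes listed in the quoted post can be sketched in C as follows. The struct and function names are hypothetical, and which lowering Clang actually picks depends on the target ABI, so the comments describe what may happen rather than what must.

```c
#include <stdint.h>

/* Two double members: may be passed as two separate doubles. */
struct two_doubles { double x, y; };

/* Two i32 members: may be coerced into a single i64. */
struct two_ints { int32_t a, b; };

/* Several different members: may be passed through a "byval" pointer. */
struct mixed { char tag; double value; int32_t ids[8]; };

double sum_doubles(struct two_doubles p) { return p.x + p.y; }

int32_t sum_ints(struct two_ints p) { return p.a + p.b; }

int32_t first_id(struct mixed m) { return m.ids[0]; }
```

Compiling such a file with `clang -S -emit-llvm` shows the parameter types that appear in the IR signatures, which need not match the struct types recorded in the debug metadata.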
Enable all (loop) passes from https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/IPO/PassManagerBuilder.cpp#L353-#L375:

```c
if (EnableSimpleLoopUnswitch) {
  // The simple loop unswitch pass relies on separate cleanup passes. Schedule
  // them first so when we re-process a loop they run before other loop
  // passes.
  MPM.add(createLoopInstSimplifyPass());
  MPM.add(createLoopSimplifyCFGPass());
}
// Rotate Loop - disable header duplication at -Oz
MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));
MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));
if (EnableSimpleLoopUnswitch)
  MPM.add(createSimpleLoopUnswitchLegacyPass());
else
  MPM.add(createLoopUnswitchPass(SizeLevel || OptLevel < 3, DivergentTarget));
// FIXME: We break the loop pass pipeline here in order to do full
// simplify-cfg. Eventually loop-simplifycfg should be enhanced to replace the
// need for this.
MPM.add(createCFGSimplificationPass());
addInstructionCombiningPass(MPM);
// We resume loop passes creating a second loop pipeline here.
MPM.add(createIndVarSimplifyPass()); // Canonicalize indvars
MPM.add(createLoopIdiomPass());      // Recognize idioms like memset.
```

Test:

```bash
clang -S -emit-llvm -Xclang -disable-O0-optnone simple-for-loop-second-latch.c -o simple-for-loop-second-latch-noopt.ll
optpassPasses simple-for-loop-second-latch-noopt --loop-simplify --simplifycfg --loop-rotate --lcssa --licm --loop-unswitch --simplifycfg --instcombine --indvars
old_llvm2c simple-for-loop-second-latch-noopt-opt == new_llvm2c simple-for-loop-second-latch-noopt
```
1. map LOOP with BRANCH instruction (condition)
2. transform BRANCH inst. to if's or doWhile constructs
First:

```c
goto head;
head:
    ...
do
    goto head;
while( C );
```

transform into

```c
do
    head
while( C );
```

Second: Cache the result of loopInfoAnalysis for the particular function
and prepare for anonymous structs/unions

Thanks @lzaoral for help!
vmihalko force-pushed the vmihalko-devel branch from 7d7841c to fd4978d on February 13, 2024 15:36
vmihalko force-pushed the vmihalko-devel branch from a530ecb to dc96e47 on March 14, 2024 14:51
e.g., a function type occurs as an argument type or return type
I wrongly assumed that LLVM always generates positive loop conditions (if this is true, then iterate). After -O3 optimisations, however, the loop condition might be negated (if this is false, then continue to the next iteration). This new function reverses the loop condition when that situation occurs.

This might need a proper solution. The current approach is hackery, because we do not replace the whole expression, only the printed operator, e.g. "<" -> ">=".
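The operator swap described above can be sketched as a small lookup. This is only an illustration of the idea; `negate_cmp_op` is a hypothetical name and not the function added in this commit.

```c
#include <stddef.h>
#include <string.h>

/* Return the printed operator that negates `op`, or NULL if `op` is not
 * one of the six C comparison operators. */
const char *negate_cmp_op(const char *op) {
    static const char *const pairs[][2] = {
        {"<", ">="}, {">=", "<"},
        {">", "<="}, {"<=", ">"},
        {"==", "!="}, {"!=", "=="},
    };
    for (size_t i = 0; i < sizeof pairs / sizeof pairs[0]; ++i) {
        if (strcmp(op, pairs[i][0]) == 0)
            return pairs[i][1];
    }
    return NULL;
}
```

Swapping the operator this way is only a valid negation for integer comparisons. For floating-point operands, `!(a < b)` and `a >= b` differ when either operand is NaN, which is one reason replacing the whole expression would be the more robust fix.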
x = phi i32 [0, %beforeLoop] [%y, %fromLoop] ; coming from %fromLoop means we are in a next iteration

before this commit:

x = 0;
do {
    x = y;
    loopBody(y, ...);
} while ( cond )

after this commit:

x = 0;
do {
    loopBody(y, ...);
    x = y;
} while ( cond )
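To see why the placement matters, here is a small sketch with a made-up loop body: it adds `x` to a running sum, computes `y = x + 1`, and repeats while `y < n`. The function names and the `stale_y` parameter (standing in for the not-yet-computed value of `y`) are assumptions for illustration.

```c
/* Phi assignment at the top of the body: the initial x = 0 is overwritten
 * with a stale y before the first iteration can use it. */
int sum_phi_at_top(int n, int stale_y) {
    int x = 0, y = stale_y, sum = 0;
    do {
        x = y;          /* clobbers the value coming from the preheader */
        sum += x;
        y = x + 1;
    } while (y < n);
    return sum;
}

/* Phi assignment at the end of the body: x takes the value of y only when
 * control flows back from the loop, as the phi node specifies. */
int sum_phi_at_end(int n) {
    int x = 0, y, sum = 0;
    do {
        sum += x;
        y = x + 1;
        x = y;          /* value for the next iteration */
    } while (y < n);
    return sum;
}
```

With `n = 3`, `sum_phi_at_end` adds 0, 1 and 2, while `sum_phi_at_top` starts from whatever `stale_y` happens to hold.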
vmihalko force-pushed the vmihalko-devel branch from dc96e47 to 0e68bc6 on March 14, 2024 14:59
…eturn the m{ax,in}imum of the two operands.

My idea comes from https://reviews.llvm.org/D9293?id=&download=true

The expression:

call i8 @llvm.umin.i8(i8 %a, i8 %b)

is equivalent to

%1 = icmp ult i8 %a, %b
%2 = select i1 %1, i8 %a, i8 %b

This is what llvm2c outputs:

a < b ? a : b
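As a sketch of the same idea in plain C, the ternary form can be wrapped in helpers. The function names are hypothetical; the point is that the comparison must use the signedness the intrinsic implies (`ult` for `umin`, `slt` for `smin`).

```c
#include <stdint.h>

/* Mirrors llvm.umin.i8: unsigned compare, then select. */
uint8_t umin_u8(uint8_t a, uint8_t b) { return a < b ? a : b; }

/* Mirrors llvm.umax.i8: unsigned compare, then select. */
uint8_t umax_u8(uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Mirrors llvm.smin.i8: the same bit patterns compared as signed values. */
int8_t smin_i8(int8_t a, int8_t b) { return a < b ? a : b; }
```

The bit pattern 0xC8 is 200 as an unsigned byte but -56 as a signed one, so `umin_u8(200, 1)` and `smin_i8(-56, 1)` pick different operands even though the bits are identical.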
…om the preheader block
This commit is probably the correct fix for what fd4978d was trying to address.
https://llvm.org/docs/LangRef.html#sext-to-instruction

This is based on the fact that if you sext an i1 comparison result to i32, you get either -1 or 0.
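A short C sketch of that fact, with hypothetical helper names: sign-extending a one-bit value replicates its single bit into every position of the wider type, whereas zero-extending pads with zeros.

```c
#include <stdbool.h>
#include <stdint.h>

/* sext i1 -> i32: a set bit becomes all ones (-1), a clear bit stays 0. */
int32_t sext_i1_to_i32(bool bit) { return bit ? -1 : 0; }

/* zext i1 -> i32, for contrast: a set bit becomes 1. */
int32_t zext_i1_to_i32(bool bit) { return bit ? 1 : 0; }
```

In C a comparison such as `3 < 5` evaluates to 1, so decompiled code has to negate it (or use a ternary as above) to reproduce the sext result.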
vmihalko force-pushed the vmihalko-devel branch from 8e48a0e to b10037a on March 14, 2024 15:23
Signed-off-by: Andrew V. Teylu <andrew.teylu@vector.com>
No description provided.