We want to skip the analysis of lib calls, too expensive. If we are going to create synthetic ssa for lib functions, we start here.
- write a native.xml file with all irs we want
- preload all irs in native.xml
- when reach here @ssa/create.go CreatePackage(), we check if it is in preload, if yes, use this synthetic
Key files: ssa/create.go ssa/builder.go
Standard libraries: https://pkg.go.dev/std
====================================================================================
Git clone from https://github.com/golang/tools, start from commit 146a0deefdd11b942db7520f68c117335329271a (around v0.5.0-pre1).
The default go pointer analysis algorithm (v0.5.0-pre1) is at go_tools/go/pointer_default
.
For any panic, please submit an issue with copy/paste crash stack. Thanks.
Go to go_tools/main
, and run go build
. Then, run ./main
with the following flags and
the directory of the go project that you want to analyze.
It will go through all of your main files and analyze them one by one.
- path: default value = "", Designated project filepath.
- doLog: default value = false, Do a log to record all procedure, so verbose.
- doCompare: default value = false, Do comparison with default pta about performance and result.
For example,
./main -doLog -doCompare ../grpc-go/benchmark/server
This will run the origin-sensitive pointer analysis on all main files under directory ../grpc-go/benchmark/server
,
as well as generate a full log and a comparison with the default algorithm about performance and result.
Note that -doLog
is very verbose and significantly slowdown the analysis.
Go to https://github.com/april1989/origin-go-tools/main/main.go, check how to use the callgraph and queries.
We treat a go routine instruction as an origin entry point, and all variables/function calls inside this go rountine share the same context as their belonging go routine.
Instead of pre-computing all cgnodes and their constraints before actually propagating changes among points-to constraints,
we start from the reachable cgnodes init
and main
and gradually compute reachable cgnodes and their constraints.
- Create k-callsite-sensitive contexts for static/invoke calls
- Generate constraints/cgnode online for invoke calls and targets when it is necessary
- Currently, skip the creation of reflection and dynamic calls due to the huge number
========================================================================
The most recent doc is https://pkg.go.dev/golang.org/x/tools/go/pointer#pkg-overview, quoted:
"SOUNDNESS
The analysis is fully sound when invoked on pure Go programs that do not use reflection or unsafe.Pointer conversions. In other words, if there is any possible execution of the program in which pointer P may point to object O, the analysis will report that fact."
However, over soundness is unnecessary.
========================================================================
All example below based on race_checker/tests/cg.go
There are two pointers involved in one constraint, both of which are stored under the same ssa.Value in my queries.
Default has tracked less types than mine, of which constraints and invoke calls are missing in the default result. Hence, it has empty pts while mine has non-empty pts. For example,
SSA: &t57[41:int]
My Query: (#obj: 1 )
n597&[0:shared contour; ] : [slicelit[*]]
n2266&[0:shared contour; ] : []
Default Query: (#obj: 0 )
n36563 : []
In default log:
; t181 = &t57[41:int]
localobj[t181] = n37034
type not tracked: *strconv.leftCheat
In my log:
; t181 = &t57[41:int]
localobj[t181] = n2266
addr n597 <- {&n2266}
Due to the default algorithm (pre-compute all constraints for all functions), it generates a lot of unreachable functions/cgnodes (they have no callers), as well as their constraints. This also affect the pts of the reachable part in cg and pts, since they may be polluted. For example,
SSA: (*internal/reflectlite.rtype).Size //-> (*internal/reflectlite.rtype).Size is not reachable function
My Query: (#obj: 0 )
n8971&(Global/Local) : []
Default Query: (#obj: 1 )
n6354 : [(*internal/reflectlite.rtype).Size]
Default does not create queries for those types (not tracked types). For example,
654.
SSA: io.ErrClosedPipe
My Query: (#obj: 1 )
n4448&(Global/Local) : [makeinterface:*errors.errorString]
Default Query: nil)
In default log:
; *ErrClosedPipe = t10
copy n19413 <- n39471
In my log:
; *ErrClosedPipe = t10
create n4448 error for global
globalobj[io.ErrClosedPipe] = n4448
copy n4448 <- n4431
IDK. For example,
SSA: &r.peekRune [#4]
My Query: nil
Default Query: (#obj: 0 )
n44378 : []
and
SSA: ssa:wrapnilchk(v, "internal/reflectl...":string, "IsNil":string)
My Query: nil)
Default Query: (#obj: 0 )
n44378 : []
and
val[t115] = n44377 (*ssa.FieldAddr)
create n44378 *[16]byte for query
copy n44378 <- n44377
All missing objs in my pts are due to objs and constraints introduced by unreachable functions. This is the pollution we mentioned before. For example,
SSA: *t49
My Query: (#obj: 27 )
n11815&[0:shared contour; ] : [makeinterface:int makeinterface:[]int makeinterface:int makeinterface:*internal/reflectlite.ValueError makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string]
Default Query: (#obj: 79 )
n18076 : [makeinterface:int makeinterface:[]int makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:*internal/reflectlite.ValueError makeinterface:*internal/reflectlite.ValueError makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:string makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:int makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:fmt.scanError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:*errors.errorString makeinterface:string makeinterface:string makeinterface:*internal/reflectlite.ValueError makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string makeinterface:string]
The default algorithm create cgnodes for functions that are not reachable from the main entry.
For example, when analyzing the main entry google.golang.org/grpc/benchmark/server
,
the default algorithm pre-generate constraints and cgnodes for function:
(*google.golang.org/grpc/credentials.tlsCreds).ServerHandshake
which is not reachable from the main entry (it has no caller in cg).
This can be reflected in the default analysis data:
Call Graph: (function based)
#Nodes: 14740
#Edges: 45550
#Unreach Nodes: 7698
#Reach Nodes: 7042
#Unreach Functions: 7698
#Reach Functions: 7042
Done -- PTA/CG Build; Using 5m13.058385739s .
Default generates 14740 functions and their constraints, however, only 7042 (at most) of them can be reachable from the main.
While my analysis data is:
Call Graph: (cgnode based: function + context)
#Nodes: 6306
#Edges: 23291
#Unreach Nodes: 39
#Reach Nodes: 6267
#Unreach Functions: 39
#Reach Functions: 5870
#Unreach Nodes from Pre-Gen Nodes: 39
#Unreach Functions from Pre-Gen Nodes: 39
#(Pre-Gen are created for reflections)
Done -- PTA/CG Build; Using 10.279403083s.
My analysis traverse 6267 functions that can be reached after extended the traced types.
This not only introduce differences in cg, but also unreachable constraints and objs, which can be propagated to the cgnodes and constraints that can be reached from the main entry. This causes false call edges and callees in default cg.
Most CG DIFFs from comparing mine with default result are due to this reason.
This is because the default algorithm creates nodes and constraints for all methods of all types that are dynamically accessible via reflection or interfaces (no matter it will be reached or not).
Because default algo has less type tracked than mine (no constraints generated for them and hence no propagation), Hence, some invoke calls has no base instance that will exist if we track those types. Consequently, no callee functions/cgs generated as well as constraints.