make memorynew intrinsic #55913
@gbaraldi so with LLVM assertions enabled I'm getting
which is on the line that does
I'd print everyone involved here with the way I showed you yesterday
This now works! For simple examples like
As an example of what is possible, allocopt was able to go from
define i64 @julia_f_769() #0 !dbg !5 {
top:
%pgcstack = call ptr @julia.get_pgcstack()
%current_task1 = getelementptr inbounds i8, ptr %pgcstack, i64 -112, !dbg !14
%memoryref_mem = call dereferenceable(40) ptr addrspace(10) @julia.gc_alloc_obj(ptr nonnull %current_task1, i64 40, ptr addrspace(10) addrspacecast (ptr @"+Core.GenericMemory#771.jit" to ptr addrspace(10))), !dbg !14
%0 = addrspacecast ptr addrspace(10) %memoryref_mem to ptr addrspace(11), !dbg !14
%1 = getelementptr inbounds { i64, ptr }, ptr addrspace(11) %0, i64 0, i32 1, !dbg !14
%2 = call nonnull ptr @julia.pointer_from_objref(ptr addrspace(11) %0) #4, !dbg !14
%3 = getelementptr inbounds i8, ptr %2, i64 16, !dbg !14
store ptr %3, ptr addrspace(11) %1, align 8, !dbg !14
store i64 3, ptr addrspace(11) %0, align 8, !dbg !14
%memoryref_data4 = call ptr addrspace(13) @julia.gc_loaded(ptr addrspace(10) %memoryref_mem, ptr %3), !dbg !15
store i64 2, ptr addrspace(13) %memoryref_data4, align 8, !dbg !15, !tbaa !20, !alias.scope !24, !noalias !27
%memoryref_data11 = getelementptr inbounds i8, ptr addrspace(13) %memoryref_data4, i64 8, !dbg !32
store i64 4, ptr addrspace(13) %memoryref_data11, align 8, !dbg !32, !tbaa !20, !alias.scope !24, !noalias !27
%memoryref_data18 = getelementptr inbounds i8, ptr addrspace(13) %memoryref_data4, i64 16, !dbg !34
store i64 5, ptr addrspace(13) %memoryref_data18, align 8, !dbg !34, !tbaa !20, !alias.scope !24, !noalias !27
ret i64 11, !dbg !36
}
to the following, removing the allocation, which would likely allow it to just return the 11:
define i64 @julia_f_769() #0 !dbg !5 {
top:
%memoryref_mem = alloca [40 x i8], align 16
%pgcstack = call ptr @julia.get_pgcstack()
%current_task1 = getelementptr inbounds i8, ptr %pgcstack, i64 -112, !dbg !14
call void @llvm.lifetime.start.p0(i64 40, ptr %memoryref_mem)
%0 = freeze [40 x i8] undef, !dbg !14
store [40 x i8] %0, ptr %memoryref_mem, align 1, !dbg !14
%1 = getelementptr inbounds { i64, ptr }, ptr %memoryref_mem, i64 0, i32 1, !dbg !14
%2 = getelementptr inbounds i8, ptr %memoryref_mem, i64 16, !dbg !14
store ptr %2, ptr %1, align 8, !dbg !14
store i64 3, ptr %memoryref_mem, align 8, !dbg !14
%memoryref_data4 = call ptr addrspace(13) @julia.gc_loaded(ptr addrspace(10) null, ptr %2), !dbg !15
store i64 2, ptr addrspace(13) %memoryref_data4, align 8, !dbg !15, !tbaa !20, !alias.scope !24, !noalias !27
%memoryref_data11 = getelementptr inbounds i8, ptr addrspace(13) %memoryref_data4, i64 8, !dbg !32
store i64 4, ptr addrspace(13) %memoryref_data11, align 8, !dbg !32, !tbaa !20, !alias.scope !24, !noalias !27
%memoryref_data18 = getelementptr inbounds i8, ptr addrspace(13) %memoryref_data4, i64 16, !dbg !34
store i64 5, ptr addrspace(13) %memoryref_data18, align 8, !dbg !34, !tbaa !20, !alias.scope !24, !noalias !27
ret i64 11, !dbg !36
}
Can you please add an llvm pass test for #56030 (comment) (removing all memory for a simple case where the Memory object doesn't escape)?
Do you want an actual LLVM pass, or can I just write a test for 0 allocations?
I think an llvm test would be more robust, but probably a simple zero-allocation test would do the job as well.
LOL. This test is so good it broke a doctest in performance tips. We're testing to show that you get allocations if you have "bad" code that allocates arrays, but now it doesn't allocate :laughing:
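For illustration, a zero-allocation check of the kind discussed above might look roughly like the sketch below. The function `f` is hypothetical: it is reconstructed from the stores of 2, 4 and 5 and the `ret i64 11` in the IR earlier in the thread, not taken from the PR's test suite. Whether the measured count is actually zero depends on running a Julia build that includes this change.

```julia
# Hypothetical function: the Memory is created, filled and read locally,
# so it never escapes `f`.
function f()
    m = Memory{Int}(undef, 3)
    @inbounds m[1] = 2
    @inbounds m[2] = 4
    @inbounds m[3] = 5
    return @inbounds m[1] + m[2] + m[3]
end

f()                      # warm up so compilation is not measured
allocs = @allocated f()  # bytes allocated by one call
println("bytes allocated by f(): ", allocs)

# In a test suite running on a build with this change, one would then
# assert something like:
#     @test allocs == 0
```

An LLVM pass test, as suggested in the thread, would instead check the IR directly and so would not depend on how `@allocated` counts.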
This is now on top of #55995 (to figure out why we weren't optimizing correctly), but other than that, I think this is good to go!
Maybe a test of no allocations in simple cases as discussed above? 🙂
The Value printer LLVM uses just prints the kind of instruction so it just shows call.
@@ -905,6 +906,12 @@ macro goto(name::Symbol)
end

# linear indexing
function getindex(A::GenericMemory, i::Int)
    @_noub_if_noinbounds_meta
    @boundscheck ult_int(bitcast(UInt, sub_int(i, 1)), bitcast(UInt, A.length)) || throw_boundserror(A, (i,))
Can we just make this the bootstrap method too? The `@boundscheck` macro simply expands to `@_boundscheck &&`, so this seems compatible.
sounds reasonable
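As a side note on the bounds check in the hunk above: it folds the two conditions `1 <= i` and `i <= length` into a single unsigned comparison. A minimal sketch of the same idea using ordinary Base operators is shown below; `in_bounds` is a hypothetical helper name, not something defined in the PR.

```julia
# Hypothetical helper mirroring the intrinsic-based check in the diff.
# `x % UInt` reinterprets the bits of an Int as a UInt, so when i <= 0
# the value (i - 1) becomes a huge unsigned number and the comparison fails.
in_bounds(i::Int, len::Int) = ((i - 1) % UInt) < (len % UInt)

println(in_bounds(1, 3))  # first valid index
println(in_bounds(3, 3))  # last valid index
println(in_bounds(0, 3))  # below range: (0 - 1) wraps around
println(in_bounds(4, 3))  # above range
```

One comparison instead of two keeps the check cheap, which matters for a method as hot as `getindex`.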
Co-authored-by: Jameson Nash <vtjnash@gmail.com>
This speeds up making new `Memory`s and allows the compiler to better understand what's going on, allowing for LLVM-level escape analysis in some cases. There is more room to grow this: currently this only optimizes fairly small `Memory`s, since bigger ones would require writing some more LLVM code, and we probably want a size limit on putting a `Memory` on the stack to avoid stack overflow. For larger ones, we could potentially inline the `free` so the `Memory` doesn't have to be swept by the GC, etc.

Benchmarks: