[RFC] IR: Define noalias.addrspace metadata
This is intended to solve a problem with lowering atomics in
OpenMP and C++ common to AMDGPU and NVPTX.

In OpenCL and CUDA, it is undefined behavior for an atomic instruction
to modify an object in thread private memory. In OpenMP, it is defined.
Correspondingly, the hardware does not handle this case correctly: on AMDGPU,
32-bit atomics work but 64-bit atomics are silently dropped. We therefore
need to codegen this by inserting a runtime address space check, performing
the private case without atomics, and falling back to issuing the real atomic
otherwise. This metadata allows us to avoid this extra check and branch.
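
As a rough illustration of the lowering described above, the following C++ sketch models the runtime check in ordinary host code. It is not the AMDGPU or NVPTX codegen: `Space`, `GenericPtr`, and `fetchAddGeneric` are hypothetical names invented for this example, and the address space tag is carried next to the pointer only because host C++ has no generic address space to query.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical model of the address space a generic pointer resolves to.
enum class Space { Global, Private };

// Hypothetical stand-in for a generic pointer. Exactly one of the two
// pointers is meaningful, selected by `space`.
struct GenericPtr {
  Space space;
  std::uint64_t *plain;               // used when space == Space::Private
  std::atomic<std::uint64_t> *shared; // used otherwise
};

// Models "atomicrmw add" on a generic pointer and returns the old value.
std::uint64_t fetchAddGeneric(GenericPtr p, std::uint64_t v) {
  if (p.space == Space::Private) {
    // Private memory is only visible to the current thread, so a plain
    // read-modify-write is sufficient here.
    std::uint64_t old = *p.plain;
    *p.plain = old + v;
    return old;
  }
  // Every other address space gets the real atomic operation.
  return p.shared->fetch_add(v, std::memory_order_seq_cst);
}
```

If an access is known never to resolve to private memory, only the final `fetch_add` would be needed, which is the check and branch the commit message wants to avoid.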

Handle this by introducing metadata intended to be applied to atomicrmw
instructions, indicating that they cannot access the forbidden address space.
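
To make the intended use more concrete, here is a small self-contained C++ sketch of how a lowering might consult such metadata. It assumes the operands are read as half-open `[low, high)` pairs, as range metadata is, and that a pair with `low > high` wraps around. `AddrSpaceRanges`, `excludesAddrSpace`, and `needsPrivateCheck` are hypothetical helpers for this example, not LLVM APIs.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Each pair models one (low, high) operand pair of the metadata node.
using AddrSpaceRanges = std::vector<std::pair<std::uint32_t, std::uint32_t>>;

// Returns true if address space `as` falls inside any listed range.
bool excludesAddrSpace(const AddrSpaceRanges &ranges, std::uint32_t as) {
  for (const auto &r : ranges) {
    if (r.first == r.second)
      continue; // assumption: degenerate pairs are rejected before this point
    const bool inRange = r.first < r.second
                             ? (as >= r.first && as < r.second)  // normal
                             : (as >= r.first || as < r.second); // wrapped
    if (inRange)
      return true;
  }
  return false;
}

// The runtime check is still required when there is no metadata, or when
// the metadata does not rule out the private address space.
bool needsPrivateCheck(const AddrSpaceRanges *md, std::uint32_t privateAS) {
  return md == nullptr || !excludesAddrSpace(*md, privateAS);
}
```

With operands `!{i32 5, i32 6}` and a private address space numbered 5 (the value used in the LangRef example in this commit), `needsPrivateCheck` returns false, so the atomic could be emitted without the extra branch.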
arsenm committed Aug 8, 2024
1 parent 1139dee commit e403d08
Showing 6 changed files with 237 additions and 6 deletions.
36 changes: 36 additions & 0 deletions llvm/docs/LangRef.rst
@@ -8015,6 +8015,42 @@ it will contain a list of ids, including the ids of the callsites in the
full inline sequence, in order from the leaf-most call's id to the outermost
inlined call.


'``noalias.addrspace``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``noalias.addrspace`` metadata is used to identify memory
operations which cannot access a range of address spaces. It is
attached to memory instructions, including :ref:`atomicrmw
<i_atomicrmw>`, :ref:`cmpxchg <i_cmpxchg>`, and :ref:`call <i_call>`
instructions.

This follows the same form as :ref:`range metadata <range-metadata>`,
except the field entries must be of type ``i32``. The values are
interpreted as the same numeric address spaces that apply to IR values.

Example:

.. code-block:: llvm

    ; %ptr cannot point to an object allocated in addrspace(5)
    %rmw.valid = atomicrmw and ptr %ptr, i64 %value seq_cst, !noalias.addrspace !0

    ; Undefined behavior. The underlying object is allocated in one of the listed
    ; address spaces.
    %alloca = alloca i64, addrspace(5)
    %alloca.cast = addrspacecast ptr addrspace(5) %alloca to ptr
    %rmw.ub = atomicrmw and ptr %alloca.cast, i64 %value seq_cst, !noalias.addrspace !0

    !0 = !{i32 5, i32 6}


This is intended for use on targets with a notion of generic address
spaces, which at runtime resolve to different physical memory
spaces. The interpretation of the address space values is target
specific. The behavior is undefined if the runtime memory address does
resolve to an object defined in one of the indicated address spaces.


Module Flags Metadata
=====================

2 changes: 2 additions & 0 deletions llvm/docs/ReleaseNotes.rst
@@ -53,6 +53,8 @@ Changes to the LLVM IR
* The ``x86_mmx`` IR type has been removed. It will be translated to
the standard vector type ``<1 x i64>`` in bitcode upgrade.

* Introduced ``noalias.addrspace`` metadata.

Changes to LLVM infrastructure
------------------------------

1 change: 1 addition & 0 deletions llvm/include/llvm/IR/FixedMetadataKinds.def
@@ -52,3 +52,4 @@ LLVM_FIXED_MD_KIND(MD_pcsections, "pcsections", 37)
LLVM_FIXED_MD_KIND(MD_DIAssignID, "DIAssignID", 38)
LLVM_FIXED_MD_KIND(MD_coro_outside_frame, "coro.outside.frame", 39)
LLVM_FIXED_MD_KIND(MD_mmra, "mmra", 40)
LLVM_FIXED_MD_KIND(MD_noalias_addrspace, "noalias.addrspace", 41)
34 changes: 28 additions & 6 deletions llvm/lib/IR/Verifier.cpp
@@ -515,8 +515,9 @@ class Verifier : public InstVisitor<Verifier>, VerifierSupport {
   void visitFunction(const Function &F);
   void visitBasicBlock(BasicBlock &BB);
   void verifyRangeMetadata(const Value &V, const MDNode *Range, Type *Ty,
-                           bool IsAbsoluteSymbol);
+                           bool IsAbsoluteSymbol, bool IsAddrSpaceRange);
   void visitRangeMetadata(Instruction &I, MDNode *Range, Type *Ty);
+  void visitNoaliasAddrspaceMetadata(Instruction &I, MDNode *Range, Type *Ty);
   void visitDereferenceableMetadata(Instruction &I, MDNode *MD);
   void visitProfMetadata(Instruction &I, MDNode *MD);
   void visitCallStackMetadata(MDNode *MD);
@@ -760,7 +761,7 @@ void Verifier::visitGlobalValue(const GlobalValue &GV) {
     if (const MDNode *AbsoluteSymbol =
             GO->getMetadata(LLVMContext::MD_absolute_symbol)) {
       verifyRangeMetadata(*GO, AbsoluteSymbol, DL.getIntPtrType(GO->getType()),
-                          true);
+                          true, false);
     }
   }

@@ -4128,7 +4129,8 @@ static bool isContiguous(const ConstantRange &A, const ConstantRange &B) {
 /// Verify !range and !absolute_symbol metadata. These have the same
 /// restrictions, except !absolute_symbol allows the full set.
 void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range,
-                                   Type *Ty, bool IsAbsoluteSymbol) {
+                                   Type *Ty, bool IsAbsoluteSymbol,
+                                   bool IsAddrSpaceRange) {
   unsigned NumOperands = Range->getNumOperands();
   Check(NumOperands % 2 == 0, "Unfinished range!", Range);
   unsigned NumRanges = NumOperands / 2;
@@ -4145,8 +4147,14 @@ void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range,

     Check(High->getType() == Low->getType(), "Range pair types must match!",
           &I);
-    Check(High->getType() == Ty->getScalarType(),
-          "Range types must match instruction type!", &I);
+
+    if (IsAddrSpaceRange) {
+      Check(High->getType()->isIntegerTy(32),
+            "noalias.addrspace type must be i32!", &I);
+    } else {
+      Check(High->getType() == Ty->getScalarType(),
+            "Range types must match instruction type!", &I);
+    }

     APInt HighV = High->getValue();
     APInt LowV = Low->getValue();
@@ -4185,7 +4193,14 @@ void Verifier::verifyRangeMetadata(const Value &I, const MDNode *Range,
 void Verifier::visitRangeMetadata(Instruction &I, MDNode *Range, Type *Ty) {
   assert(Range && Range == I.getMetadata(LLVMContext::MD_range) &&
          "precondition violation");
-  verifyRangeMetadata(I, Range, Ty, false);
+  verifyRangeMetadata(I, Range, Ty, false, false);
+}
+
+void Verifier::visitNoaliasAddrspaceMetadata(Instruction &I, MDNode *Range,
+                                             Type *Ty) {
+  assert(Range && Range == I.getMetadata(LLVMContext::MD_noalias_addrspace) &&
+         "precondition violation");
+  verifyRangeMetadata(I, Range, Ty, false, true);
 }

 void Verifier::checkAtomicMemAccessSize(Type *Ty, const Instruction *I) {
@@ -5177,6 +5192,13 @@ void Verifier::visitInstruction(Instruction &I) {
     visitRangeMetadata(I, Range, I.getType());
   }

+  if (MDNode *Range = I.getMetadata(LLVMContext::MD_noalias_addrspace)) {
+    Check(isa<LoadInst>(I) || isa<StoreInst>(I) || isa<AtomicRMWInst>(I) ||
+              isa<AtomicCmpXchgInst>(I) || isa<CallInst>(I),
+          "noalias.addrspace are only for memory operations!", &I);
+    visitNoaliasAddrspaceMetadata(I, Range, I.getType());
+  }
+
   if (I.hasMetadata(LLVMContext::MD_invariant_group)) {
     Check(isa<LoadInst>(I) || isa<StoreInst>(I),
           "invariant.group metadata is only for loads and stores", &I);
110 changes: 110 additions & 0 deletions llvm/test/Assembler/noalias-addrspace-md.ll
@@ -0,0 +1,110 @@
; RUN: llvm-as < %s | llvm-dis | FileCheck %s

define i64 @atomicrmw_noalias_addrspace__0_1(ptr %ptr, i64 %val) {
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__0_1(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META0:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !0
ret i64 %ret
}

define i64 @atomicrmw_noalias_addrspace__0_2(ptr %ptr, i64 %val) {
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__0_2(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META1:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !1
ret i64 %ret
}

define i64 @atomicrmw_noalias_addrspace__1_3(ptr %ptr, i64 %val) {
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__1_3(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META2:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !2
ret i64 %ret
}

define i64 @atomicrmw_noalias_addrspace__multiple_ranges(ptr %ptr, i64 %val) {
; CHECK-LABEL: define i64 @atomicrmw_noalias_addrspace__multiple_ranges(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = atomicrmw add ptr [[PTR]], i64 [[VAL]] seq_cst, align 8, !noalias.addrspace [[META3:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !3
ret i64 %ret
}

define i64 @load_noalias_addrspace__5_6(ptr %ptr) {
; CHECK-LABEL: define i64 @load_noalias_addrspace__5_6(
; CHECK-SAME: ptr [[PTR:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = load i64, ptr [[PTR]], align 4, !noalias.addrspace [[META4:![0-9]+]]
; CHECK-NEXT: ret i64 [[RET]]
;
%ret = load i64, ptr %ptr, align 4, !noalias.addrspace !4
ret i64 %ret
}

define void @store_noalias_addrspace__5_6(ptr %ptr, i64 %val) {
; CHECK-LABEL: define void @store_noalias_addrspace__5_6(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL:%.*]]) {
; CHECK-NEXT: store i64 [[VAL]], ptr [[PTR]], align 4, !noalias.addrspace [[META4]]
; CHECK-NEXT: ret void
;
store i64 %val, ptr %ptr, align 4, !noalias.addrspace !4
ret void
}

define { i64, i1 } @cmpxchg_noalias_addrspace__5_6(ptr %ptr, i64 %val0, i64 %val1) {
; CHECK-LABEL: define { i64, i1 } @cmpxchg_noalias_addrspace__5_6(
; CHECK-SAME: ptr [[PTR:%.*]], i64 [[VAL0:%.*]], i64 [[VAL1:%.*]]) {
; CHECK-NEXT: [[RET:%.*]] = cmpxchg ptr [[PTR]], i64 [[VAL0]], i64 [[VAL1]] monotonic monotonic, align 8, !noalias.addrspace [[META4]]
; CHECK-NEXT: ret { i64, i1 } [[RET]]
;
%ret = cmpxchg ptr %ptr, i64 %val0, i64 %val1 monotonic monotonic, align 8, !noalias.addrspace !4
ret { i64, i1 } %ret
}

declare void @foo()

define void @call_noalias_addrspace__5_6(ptr %ptr) {
; CHECK-LABEL: define void @call_noalias_addrspace__5_6(
; CHECK-SAME: ptr [[PTR:%.*]]) {
; CHECK-NEXT: call void @foo(), !noalias.addrspace [[META4]]
; CHECK-NEXT: ret void
;
call void @foo(), !noalias.addrspace !4
ret void
}

define void @call_memcpy_intrinsic_addrspace__5_6(ptr %dst, ptr %src, i64 %size) {
; CHECK-LABEL: define void @call_memcpy_intrinsic_addrspace__5_6(
; CHECK-SAME: ptr [[DST:%.*]], ptr [[SRC:%.*]], i64 [[SIZE:%.*]]) {
; CHECK-NEXT: call void @llvm.memcpy.p0.p0.i64(ptr [[DST]], ptr [[SRC]], i64 [[SIZE]], i1 false), !noalias.addrspace [[META4]]
; CHECK-NEXT: ret void
;
call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 %size, i1 false), !noalias.addrspace !4
ret void
}

declare void @llvm.memcpy.p0.p0.i64(ptr noalias nocapture writeonly, ptr noalias nocapture readonly, i64, i1 immarg) #0

attributes #0 = { nocallback nofree nounwind willreturn memory(argmem: readwrite) }

!0 = !{i32 0, i32 1}
!1 = !{i32 0, i32 2}
!2 = !{i32 1, i32 3}
!3 = !{i32 4, i32 6, i32 10, i32 55}
!4 = !{i32 5, i32 6}
;.
; CHECK: [[META0]] = !{i32 0, i32 1}
; CHECK: [[META1]] = !{i32 0, i32 2}
; CHECK: [[META2]] = !{i32 1, i32 3}
; CHECK: [[META3]] = !{i32 4, i32 6, i32 10, i32 55}
; CHECK: [[META4]] = !{i32 5, i32 6}
;.
60 changes: 60 additions & 0 deletions llvm/test/Verifier/noalias-addrspace.ll
@@ -0,0 +1,60 @@
; RUN: not llvm-as < %s -o /dev/null 2>&1 | FileCheck %s

; CHECK: It should have at least one range!
; CHECK-NEXT: !0 = !{}
define i64 @noalias_addrspace__empty(ptr %ptr, i64 %val) {
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, !noalias.addrspace !0
ret i64 %ret
}

; CHECK: Unfinished range!
; CHECK-NEXT: !1 = !{i32 0}
define i64 @noalias_addrspace__single_field(ptr %ptr, i64 %val) {
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, !noalias.addrspace !1
ret i64 %ret
}

; CHECK: Range must not be empty!
; CHECK-NEXT: !2 = !{i32 0, i32 0}
define i64 @noalias_addrspace__0_0(ptr %ptr, i64 %val) {
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, !noalias.addrspace !2
ret i64 %ret
}

; CHECK: noalias.addrspace type must be i32!
; CHECK-NEXT: %ret = atomicrmw add ptr %ptr, i64 %val seq_cst, align 8, !noalias.addrspace !3
define i64 @noalias_addrspace__i64(ptr %ptr, i64 %val) {
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, !noalias.addrspace !3
ret i64 %ret
}

; CHECK: The lower limit must be an integer!
define i64 @noalias_addrspace__fp(ptr %ptr, i64 %val) {
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, !noalias.addrspace !4
ret i64 %ret
}

; CHECK: The lower limit must be an integer!
define i64 @noalias_addrspace__ptr(ptr %ptr, i64 %val) {
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, !noalias.addrspace !5
ret i64 %ret
}

; CHECK: The lower limit must be an integer!
define i64 @noalias_addrspace__nonconstant(ptr %ptr, i64 %val) {
%ret = atomicrmw add ptr %ptr, i64 %val seq_cst, !noalias.addrspace !6
ret i64 %ret
}

@gv0 = global i32 0
@gv1 = global i32 1

!0 = !{}
!1 = !{i32 0}
!2 = !{i32 0, i32 0}
!3 = !{i64 1, i64 5}
!4 = !{float 0.0, float 2.0}
!5 = !{ptr null, ptr addrspace(1) null}
!6 = !{i32 ptrtoint (ptr @gv0 to i32), i32 ptrtoint (ptr @gv1 to i32) }
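
The first three negative tests above map onto simple structural rules. The sketch below restates them as standalone C++ over a plain list of `uint32_t` operands. It is a simplified model, not the `Verifier::verifyRangeMetadata` implementation, and it does not cover the operand type checks or any rules about how multiple ranges relate to each other. `checkAddrSpaceOperands` is a hypothetical name; the diagnostic strings are copied from the CHECK lines.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Returns an empty string if the operand list passes the modeled checks,
// otherwise the diagnostic text for the first failing check.
std::string checkAddrSpaceOperands(const std::vector<std::uint32_t> &ops) {
  // Operands come in (low, high) pairs.
  if (ops.size() % 2 != 0)
    return "Unfinished range!";
  // An empty node carries no information and is rejected.
  if (ops.empty())
    return "It should have at least one range!";
  // A pair with equal bounds does not describe a usable range.
  for (std::size_t i = 0; i < ops.size(); i += 2)
    if (ops[i] == ops[i + 1])
      return "Range must not be empty!";
  return "";
}
```

Running this over the operand lists of `!0`, `!1`, and `!2` from the test reproduces the three corresponding diagnostics, while a list such as `{5, 6}` is accepted.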

