Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYCL][ESIMD] Introduce rdtsc API #12315

Merged
merged 5 commits into from
Jan 10, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion llvm/lib/SYCLLowerIR/ESIMD/LowerESIMD.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -660,7 +660,8 @@ class ESIMDIntrinDescTable {
{"addc", {"addc", {l(0)}}},
{"subb", {"subb", {l(0)}}},
{"bfn", {"bfn", {a(0), a(1), a(2), t(0)}}},
{"srnd", {"srnd", {a(0), a(1)}}}};
{"srnd", {"srnd", {a(0), a(1)}}},
{"timestamp",{"timestamp",{}}}};
}
// clang-format on

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -152,6 +152,9 @@ __ESIMD_INTRIN __ESIMD_raw_vec_t(sycl::half, N)
__ESIMD_DNS::vector_type_t<uint16_t, N> src2)
__ESIMD_INTRIN_END;

__ESIMD_INTRIN __ESIMD_raw_vec_t(uint32_t, 4)
__esimd_timestamp() __ESIMD_INTRIN_END;

#undef __ESIMD_raw_vec_t
#undef __ESIMD_cpp_vec_t

Expand Down
7 changes: 7 additions & 0 deletions sycl/include/sycl/ext/intel/experimental/esimd/math.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -1723,6 +1723,13 @@ __ESIMD_API std::enable_if_t<__ESIMD_DNS::is_esimd_scalar<T>::value &&
return __ESIMD_NS::bfn<FuncControl>(src0, src1, src2);
}

/// rdtsc - get the value of rdtsc register.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rdtsc stands for ReaD TimeStamp Counter (I think). Can you please make it more clear in the comment here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenX returns 4 integers (which is probably the content of the drtsc register (if it is a register).
The function only returns uint64 (part of what GenX can return), please don't use the word register in this decription.

/// \return the current value of rdtsc
ESIMD_INLINE uint64_t rdtsc() {
v-klochkov marked this conversation as resolved.
Show resolved Hide resolved
__ESIMD_NS::simd<uint32_t, 4> retv = __esimd_timestamp();
return retv.template bit_cast_view<uint64_t>()[0];
}

/// @} sycl_esimd_logical

} // namespace ext::intel::experimental::esimd
Expand Down
78 changes: 78 additions & 0 deletions sycl/test-e2e/ESIMD/rdtsc.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
// RUN: %{build} -o %t.out
// RUN: %{run} %t.out
//==- rdtsc.cpp - Test to verify rdtsc0 and sr0 functionlity----------------==//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

// This is basic test to validate rdtsc function.

#include <cmath>
#include <iostream>
#include <sycl/ext/intel/esimd.hpp>
#include <sycl/ext/intel/esimd/simd.hpp>
#include <sycl/sycl.hpp>
#include <vector>

int ErrCnt = 0;
template <typename DataT>
using shared_allocator = sycl::usm_allocator<DataT, sycl::usm::alloc::shared>;
template <typename DataT>
using shared_vector = std::vector<DataT, shared_allocator<DataT>>;

int test_rdtsc_sr0() {
sycl::queue Queue;
shared_allocator<uint64_t> Allocator(Queue);
constexpr int32_t SIZE = 32;

shared_vector<uint64_t> VectorOutputRDTSC(SIZE, -1, Allocator);

auto GlobalRange = sycl::range<1>(SIZE);
sycl::range<1> LocalRange{1};
sycl::nd_range<1> Range(GlobalRange, LocalRange);

{
Queue.submit([&](sycl::handler &cgh) {
uint64_t *VectorOutputRDTSCPtr = VectorOutputRDTSC.data();

auto Kernel = ([=](sycl::nd_item<1> ndi) [[intel::sycl_explicit_simd]] {
using namespace sycl::ext::intel::esimd;
auto Idx = ndi.get_global_id(0);
simd<uint64_t, SIZE> DummyVector;
uint64_t StartCounter = sycl::ext::intel::experimental::esimd::rdtsc();
DummyVector.copy_from(VectorOutputRDTSCPtr);
uint64_t EndCounter = sycl::ext::intel::experimental::esimd::rdtsc();

simd<uint64_t, 1> VectorResultRDTSC = EndCounter - StartCounter;
v-klochkov marked this conversation as resolved.
Show resolved Hide resolved

VectorResultRDTSC.copy_to(VectorOutputRDTSCPtr + Idx);
});

cgh.parallel_for(Range, Kernel);
});
Queue.wait();
}

int Result = 0;

// Check if returned values are positive
Result |= std::any_of(VectorOutputRDTSC.begin(), VectorOutputRDTSC.end(),
[](uint64_t v) { return v <= 0; });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if v is unsigned, then how it can be less than zero?
The VectorOutputRDTSC needs to be signed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it will be signed there will be risk of overflow. I reworked the logic to reduce the chances of overflow

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If suppose 2Ghz, then 2 secs = 32-bit. the timestamp returns 64-bit, which is huge (2 sec * 4Bil, so, there was no risk of overflow). The updated solution is good, except it needs 1 minor fix to avoid DCE-ing the load between rdtsc calls.


return Result;
}

int main() {

int TestResult = 0;

TestResult |= test_rdtsc_sr0();

if (!TestResult) {
std::cout << "Pass" << std::endl;
}
return TestResult;
}