Fix UE 4.27 Compatibility #38

khalidbourr · 2025-12-08T18:39:57Z

The original convertDepth function had two issues:

1. _mm_div_epi16 doesn't exist - there is no SSE/AVX integer division intrinsic
2. Logic was incorrect - dividing float16-encoded bits as integers corrupts the values
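
To illustrate the second point, here is a minimal sketch for a single sample. It assumes the intent was to divide by 100 (cm to m); the helper name `divide_bits_as_integer` is hypothetical and not taken from the plugin. The constants are standard IEEE 754 binary16 encodings.

```cpp
#include <cassert>
#include <cstdint>

// IEEE 754 binary16 bit patterns (1 sign bit, 5 exponent bits, 10 mantissa bits).
constexpr std::uint16_t kHalf100 = 0x5640; // encodes 100.0 (e.g. 100 cm)
constexpr std::uint16_t kHalf1   = 0x3C00; // encodes 1.0 (the expected 1 m)

// Hypothetical stand-in for what a 16-bit integer division would compute:
// it treats the encoded bits as a plain integer.
constexpr std::uint16_t divide_bits_as_integer(std::uint16_t half_bits,
                                               std::uint16_t divisor)
{
    return static_cast<std::uint16_t>(half_bits / divisor);
}

// 0x5640 is 22080 as an integer, and 22080 / 100 == 220 == 0x00DC.
// 0x00DC has an all-zero exponent field, so read back as binary16 it is a
// subnormal (roughly 1.3e-5), nowhere near the encoding of 1.0.
static_assert(divide_bits_as_integer(kHalf100, 100) == 0x00DC,
              "integer division of the encoded bits");
static_assert(divide_bits_as_integer(kHalf100, 100) != kHalf1,
              "the result is not the binary16 encoding of 1.0");
```

The values have to be decoded to float first and scaled afterwards, which is what the change in this PR does.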

This PR uses UE4's built-in FFloat16 class for portable float16→float32 conversion, then scales by 0.01 to convert cm→m.

With this change:
- No compilation errors
- Correct depth values
- No F16C CPU support or manual UE4 recompilation required
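
For readers outside UE4, the decode-then-scale approach can be sketched in plain C++. This is a standalone stand-in, not UE4's FFloat16 implementation: `half_to_float` and `convert_depth_cm_to_m` are hypothetical names, and the decoder is a generic IEEE 754 binary16 routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Decode one IEEE 754 binary16 value to float using only integer operations.
inline float half_to_float(std::uint16_t h)
{
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    std::uint32_t exponent = (h >> 10) & 0x1Fu;
    std::uint32_t mantissa = h & 0x3FFu;
    std::uint32_t bits = 0;

    if (exponent == 0x1Fu) {
        // Infinity or NaN: all-ones float exponent, payload preserved.
        bits = sign | 0x7F800000u | (mantissa << 13);
    } else if (exponent != 0) {
        // Normal number: rebias the exponent from 15 to 127 (+112).
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    } else if (mantissa != 0) {
        // Subnormal half: shift until the implicit leading bit appears.
        exponent = 113; // 127 - 15 + 1
        while ((mantissa & 0x400u) == 0) {
            mantissa <<= 1;
            --exponent;
        }
        bits = sign | (exponent << 23) | ((mantissa & 0x3FFu) << 13);
    } else {
        bits = sign; // signed zero
    }

    float out;
    std::memcpy(&out, &bits, sizeof out); // bit cast without aliasing issues
    return out;
}

// Decode each half-float depth sample, then scale centimetres to metres.
inline void convert_depth_cm_to_m(const std::uint16_t* in, float* out,
                                  std::size_t count)
{
    assert(count == 0 || (in != nullptr && out != nullptr));
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = half_to_float(in[i]) * 0.01f;
    }
}
```

Because the loop is purely scalar, it needs no particular CPU extension, at the cost of converting one pixel per iteration.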

…at16 conversion
- _mm_div_epi16 doesn't exist in any SSE/AVX instruction set
- Original code incorrectly performed integer division on float16-encoded bits
- Use UE4's FFloat16 for portable float16->float32 conversion
- Removes mandatory F16C CPU requirement
- Fixes depth unit conversion (cm to m) to work correctly

Sanic · 2025-12-08T22:37:48Z

Hi @khalidbourr Thanks for the PR!
I can't validate this functionality on my machine right now, but I was wondering whether it has a negative impact on compute time. Have you tested how fast it is in comparison? Will it slow down the overall image capturing compared to the old version?

khalidbourr · 2025-12-09T00:36:50Z

Dear @Sanic, I haven’t evaluated it from that perspective yet. However, I did encounter build errors in Unreal Engine 4.27 on Linux, and this is the error message I received:

FStaticMeshLODResources &LODModel = StaticMesh->RenderData->LODResources[PaintingMeshLODIndex];
^
/home/vampiro/UnrealEngine-4.27/Engine/Source/Runtime/Engine/Classes/Engine/StaticMesh.h:519:2: note: 'RenderData' has been explicitly marked deprecated here
UE_DEPRECATED(4.27, "Please do not access this member directly; use UStaticMesh::GetRenderData() or UStaticMesh::SetRenderData().")
^
/home/vampiro/UnrealEngine-4.27/Engine/Source/Runtime/Core/Public/Misc/CoreMiscDefines.h:234:43: note: expanded from macro 'UE_DEPRECATED'
#define UE_DEPRECATED(Version, Message) [[deprecated(Message " Please update your code to the new API before upgrading to the next release, otherwise your project will no longer compile.")]]
^
In file included from /home/vampiro/Documents/Unreal Projects/AI4FOREST/Plugins/ROSIntegrationVision/Intermediate/Build/Linux/B4D820EA/UE4Editor/Development/ROSIntegrationVision/Module.ROSIntegrationVision.cpp:6:
/home/vampiro/Documents/Unreal Projects/AI4FOREST/Plugins/ROSIntegrationVision/Source/ROSIntegrationVision/Private/VisionComponent.cpp:754:4: error: use of undeclared identifier '_mm_div_epi16'; did you mean '_mm_min_epi16'?
_mm_div_epi16(
^~~~~~~~~~~~~
_mm_min_epi16
/home/vampiro/UnrealEngine-4.27/Engine/Extras/ThirdPartyNotUE/SDKs/HostLinux/Linux_x64/v19_clang-11.0.1-centos7/x86_64-unknown-linux-gnu/lib/clang/11.0.1/include/emmintrin.h:2412:1: note: '_mm_min_epi16' declared here
_mm_min_epi16(__m128i __a, __m128i __b)

Sanic · 2025-12-09T14:25:21Z

Alright.
Can you see in your Log what the typical tick rate / delay is? There should be some debug outputs telling you how long generating and sending one Sensor image tuple took.

khalidbourr · 2025-12-10T13:30:30Z

Once I do that I'll inform you.

khalidbourr · 2025-12-11T01:30:36Z

I tested the VisionComponent tick timing on Linux (Ubuntu, Intel i7 7th Gen, GTX 1050, UE4.27, ROS Melodic).

Initially, F16C was not enabled. I confirmed this with objdump -d libUE4Editor-ROSIntegrationVision.so | grep -i vcvtph2ps, which produced no output. I then enabled F16C by adding -mf16c to LinuxToolChain.cs and modified convertDepth() to use the hardware intrinsic (_mm_cvtph_ps()) instead of the software FFloat16::GetFloat() loop. After rebuilding, objdump shows vcvtph2ps instructions, confirming that F16C code is being generated.

However, performance remains at ~1000ms per tick (~1 FPS). Interestingly, the first ticks before ROS publishing starts are fast (~50ms), but once publishing starts, the rate drops to ~1 FPS. I think the bottleneck may be in ReadPixels, ROS network I/O, or thread synchronization rather than in the depth conversion itself. I will investigate further; for now, my modification does resolve the build issue.
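
One way to narrow this down would be to time each stage of the tick separately. The helper below is only a sketch: `time_stage_ms` is a hypothetical name, it prints with printf rather than through UE4's logging, and it is not part of the plugin.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>

// Hypothetical helper: runs one stage, prints how long it took, and returns
// the elapsed wall-clock time in milliseconds.
template <typename Fn>
double time_stage_ms(const char* label, Fn&& stage)
{
    const auto start = std::chrono::steady_clock::now();
    stage();
    const auto stop = std::chrono::steady_clock::now();

    const double ms =
        std::chrono::duration<double, std::milli>(stop - start).count();
    std::printf("%s: %.3f ms\n", label, ms);
    return ms;
}
```

Wrapping the pixel readback, the convertDepth() call, and the ROS publish step each in their own `time_stage_ms(...)` call would show which of them accounts for the ~1000ms.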

khalidbourr · 2025-12-11T01:32:19Z

This is the current implementation of convertDepth I am using; it is not pushed yet.

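void UVisionComponent::convertDepth(const uint16_t *in, __m128 *out) const
{
    // Four pixels are converted per iteration, so Width * Height is assumed
    // to be a multiple of 4; any remaining pixels would be skipped.
    const size_t size = (Width * Height) / 4;
    const __m128 scale = _mm_set1_ps(0.01f); // cm -> m

    for (size_t i = 0; i < size; ++i, in += 4, ++out)
    {
        // Load four half-floats (64 bits) into the low half of an SSE register.
        const __m128i half4 = _mm_loadl_epi64(reinterpret_cast<const __m128i *>(in));
        // F16C: convert four half-floats to four floats in one instruction (vcvtph2ps).
        const __m128 depth = _mm_cvtph_ps(half4);
        *out = _mm_mul_ps(depth, scale);
    }
}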
