Hi,
I ran the profiling on a proprietary microcontroller and simulator (not Arduino).
My analysis shows that the bslice_4d function accounts for too much of the total inference time.
Is it possible to develop a "bslice_4d-less" algorithm, or a more efficient bslice_4d function?