I tried running several matrix multiplication algorithms on a Verilated model of Vicuna. They execute successfully with the compact configuration, but not with the dual- or triple-pipeline configurations: the application hangs and never reaches the end.
I also tried a matrix convolution algorithm, which executed properly in the compact configuration but failed to execute in the dual- and triple-pipeline configurations.
Could you suggest how to identify the root cause of this problem, or a possible solution?