RISE RP005 QEMU weekly report 2025-01-15

Milestone 2

  • Patch 1 (small vectors unit stride loads/stores).

    • COMPLETE.
  • Patch 2 (large vectors unit stride loads/stores).

Milestone 3

  • Generate TCG Ops for vector whole word load/store.

    • IN PROGRESS.
    • Latest version of the patch.
    • Performance improves in all cases except large data with large vectors.
    • We are identifying the cut-off point beyond which helper functions should be used instead; once it is found, we will submit the final version of the patch.
    • The key limit on performance is the need to maintain the vstart CSR.
  • Improve first-fault handling for vector load/store helper functions.

    • IN PROGRESS.
    • No new work to report this week.
  • Improve strided load/store helper functions.

    • IN PROGRESS.
    • No new work to report this week.
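
For the whole word load/store item above, one way the cut-off point might be identified from benchmark data is sketched below. This is only an illustration under stated assumptions: the function name `find_cutoff`, the timing dictionaries, and all numbers are hypothetical and do not come from the patch or from QEMU. It assumes one timing per data size for each code path, and it would need to be repeated for each vector length of interest.

```python
from typing import Dict, Optional


def find_cutoff(tcg_ns: Dict[int, float], helper_ns: Dict[int, float]) -> Optional[int]:
    """Return the smallest measured data size from which the helper path is at
    least as fast as the TCG op path for that size and every larger size.

    Returns None if the TCG op path wins at the largest measured size.
    Both arguments map data size (bytes) to a timing (hypothetical units: ns).
    """
    # Only compare sizes that were measured for both code paths.
    sizes = sorted(set(tcg_ns) & set(helper_ns))
    cutoff = None
    # Walk down from the largest size while the helper path keeps winning.
    for size in reversed(sizes):
        if helper_ns[size] <= tcg_ns[size]:
            cutoff = size
        else:
            break
    return cutoff


# Hypothetical timings, invented purely for illustration.
tcg = {64: 10.0, 256: 30.0, 1024: 130.0, 4096: 600.0}
helper = {64: 25.0, 256: 45.0, 1024: 120.0, 4096: 500.0}

print(find_cutoff(tcg, helper))
```

Scanning from the largest size downwards avoids picking a cut-off from a single noisy measurement at a small size where the helper path happens to win once.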

Statistics

Instruction timing for TCGOp generation for whole word load/store

We used a simple assembler benchmark to time each instruction, and from those timings derived the per-instruction speedup from TCGOp generation. The graphs show the results of 10 separate runs, with standard error bars.
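
As a rough illustration of how a speedup and its standard error could be derived from repeated runs, here is a minimal sketch. The report does not state how its error bars were computed, so the first-order error propagation used here is an assumption, and the function names and sample timings are hypothetical.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_and_stderr(samples: Sequence[float]) -> Tuple[float, float]:
    """Return the sample mean and the standard error of that mean."""
    mean = statistics.fmean(samples)
    # Standard error = sample standard deviation / sqrt(number of runs).
    stderr = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, stderr


def speedup(baseline: Sequence[float], tcgop: Sequence[float]) -> Tuple[float, float]:
    """Return (speedup, standard error of speedup) for two sets of timings.

    Speedup is mean baseline time divided by mean TCGOp time. The error is
    propagated to first order, assuming the two sets of runs are independent.
    """
    b_mean, b_se = mean_and_stderr(baseline)
    t_mean, t_se = mean_and_stderr(tcgop)
    ratio = b_mean / t_mean
    relative_error = math.sqrt((b_se / b_mean) ** 2 + (t_se / t_mean) ** 2)
    return ratio, ratio * relative_error


# Hypothetical timings (arbitrary units), invented for illustration only.
baseline_runs = [2.0, 2.2, 1.8, 2.0]
tcgop_runs = [1.0, 1.1, 0.9, 1.0]

ratio, err = speedup(baseline_runs, tcgop_runs)
print(f"speedup = {ratio:.2f} +/- {err:.2f}")
```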

[Figure: whole word load speedups from TCGOp generation]

[Figure: whole word store speedups from TCGOp generation]

SiFive benchmarks: TCGOp generation for memcpy (V3)

This uses the variant implementation of memcpy using whole word load/store (see the report from 18 Dec 2024). There is no update from last week's report.

Actions

Other

Next meeting 22 January 2025.