From 370858ef8578fb870b625dd4f4cf57e387c245b9 Mon Sep 17 00:00:00 2001
From: Damien Pretet
Date: Mon, 26 Jun 2023 18:26:31 +0200
Subject: [PATCH] Prepare release v1.4.0

---
 doc/project_mgt_hw.md | 11 +++++++----
 doc/release_v1.4.0.md | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+), 4 deletions(-)
 create mode 100644 doc/release_v1.4.0.md

diff --git a/doc/project_mgt_hw.md b/doc/project_mgt_hw.md
index 8850d97..8980ba2 100644
--- a/doc/project_mgt_hw.md
+++ b/doc/project_mgt_hw.md
@@ -35,7 +35,6 @@ N.B. :
 
 Cache Stage Enhancement:
 
-- [ ] Walk-thru FIFO to reduce latency on jump
 - [ ] AXI4 + Wrap mode for read
 - [ ] Support prefetch: if no jump/branch detected in fetched instructions grab the next line, else
       give a try to fetch the branch address. AXI hint?
@@ -105,7 +104,6 @@ Processing:
 
 https://www.youtube.com/channel/UCPSsA8oxlSBjidJsSPdpjsQ/videos
 
-- [ ] Scheduler to run multiple operations in parallel
 - [ ] Memfy: Manage RRESP/BRESP
 - [ ] Support F extension
 - [ ] Division
@@ -155,8 +153,13 @@ Hardware Test:
 
 # DONE
 
-- [X] CSR executes in a single cycle
-- [X] Enhance Memfy outstanding request support
+- [X] v1.4.0
+    - [X] Rework Control for faster jump.
+    - [X] Rework iCache block fetcher to simplify it
+    - [X] Block fetcher: pass-thru front-end FIFO to reduce latency on jump
+    - [X] Scheduler to run multiple operations in parallel. ALU can run along LD/ST if no hazard
+    - [X] CSR executes in a single cycle
+    - [X] Enhance Memfy outstanding request support
 - [X] Add Zihpm
 - [X] Fix TX read of UART which is blocking
 - [X] Develop dCache
diff --git a/doc/release_v1.4.0.md b/doc/release_v1.4.0.md
new file mode 100644
index 0000000..8e2bf99
--- /dev/null
+++ b/doc/release_v1.4.0.md
@@ -0,0 +1,32 @@
+# v1.4.0
+
+This release has been initiated to boost performance by reworking the control and the block fetch
+stage. Result: CPI went from 3.67 to 3.57. Not as good as expected, but future cache enhancements
+will improve it further.
+
+Control:
+- front-end read data channel can be removed now
+- the sequencer FSM has been simplified and now avoids the RELOAD state. Requests can be issued
+  without stall time and reboot is faster
+- the FSM has been split, CSR management is done in a dedicated process
+- flush_reqs is asserted along with a new request
+- deactivating flush_reqs makes the performance very bad
+
+iCache block fetcher:
+- FSM has been simplified and replaced by simpler logic
+- Front-end FIFO was first removed then put back because it really enhances the performance
+- Less OR if front-end FIFO is removed enhances the performance
+- latency is lower by 1 cycle. Flow-thru option is better and can balance performance
+- a FIFO has been placed on read data channel to increase performance. Drastically better when
+  control data path FIFO is removed
+- flush_reqs reboots the circuit but a request can be served along with this assertion
+- cache miss fetch stage has been moved to a dedicated module. To be enhanced later to increase
+  performance
+
+CSR:
+- CSR is always ready and instructions execute in one cycle
+- new custom register to measure performance
+
+Processing:
+- Allow multiple instructions in parallel if no hazards occur
+- Memfy: enhance outstanding request performance