project_mgt_hw.md

DOING

  • Kernel-capable Hart
    • Support core configuration sets in the testbench.
    • Support U-mode
    • Support PMP/PMA
    • https://github.com/eembc/coremark
    •  Advanced Interrupt controller
      •  AXI ERR handling
      • AXI EXOKAY handling
    • Atomic operations
      • stage to execute the instruction, controlling ldst Stages
      • memfy exposes two interfaces for requests.
      • memfy drives back response along register write channel
      • memfy supports two tags, one for regular, one for exclusive
      • support exclusive access in cache
        • support in-order for same tag (don't always substitute)
        • exclusive can't be cacheable
      • support exclusive access in memory, track the ID with a LUT
    •  Zc extension
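
The exclusive-access items above (track the ID with a LUT, EXOKAY handling) can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the class `ExclusiveMonitor`, its methods and the exact-address matching are hypothetical and are not taken from the memfy or cache RTL. The response encodings are the standard AXI ones (OKAY = 0b00, EXOKAY = 0b01).

```python
from dataclasses import dataclass, field

OKAY, EXOKAY = 0b00, 0b01  # AXI xRESP encodings


@dataclass
class ExclusiveMonitor:
    """Hypothetical memory-side monitor: a LUT keyed by AXI ID holds the reserved address."""
    lut: dict = field(default_factory=dict)
    mem: dict = field(default_factory=dict)

    def read(self, axi_id, addr, exclusive=False):
        """Return (data, resp). An exclusive read records a reservation for this ID."""
        if exclusive:
            self.lut[axi_id] = addr
            return self.mem.get(addr, 0), EXOKAY
        return self.mem.get(addr, 0), OKAY

    def write(self, axi_id, addr, data, exclusive=False):
        """Return resp. A failed exclusive write answers OKAY and leaves memory untouched."""
        if exclusive and self.lut.get(axi_id) != addr:
            return OKAY
        self.mem[addr] = data
        # Assumption: any performed write drops every reservation on that address.
        self.lut = {i: a for i, a in self.lut.items() if a != addr}
        return EXOKAY if exclusive else OKAY
```

In this model an LR/SC pair maps to `read(..., exclusive=True)` followed by `write(..., exclusive=True)`; the store-conditional succeeds only when the response is `EXOKAY`.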

BACKLOG

Any new feature should be carefully studied to ensure proper exception and interrupt handling.

Memory

  • Better manage ACACHE attribute
    • Correct value driven from memfy
    • Use it correctly across the cache
    • Read/write allocate based on memory map
    • Check impossible combination
    • IO map bufferable / non-bufferable
  • Make memory mapping of the core with:
    • Normal vs device
    • Inst vs data zone for cacheability / executability
    • Shareable for L2 cache
    • Support exception code for memory access error
    • Manage write response from cache or interconnect, don’t wait for the endpoint
    • Raise exception also from cache
  • Support AXI response
    • drive APROT with priv_mode
    • raise an exception (which one? a custom mcause?)
    • test with mapping outside interconnect memory region
    • manage it in a CLIC controller to avoid a custom spec implementation; it could be reused for other purposes later
  • Support fine-grain permission over memory range
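
As an illustration of the memory-map items above (normal vs device, bufferable vs non-bufferable IO, APROT driven from the privilege mode), here is a minimal sketch of how the AXI attributes could be derived from a region table. The region list, field names and helper functions are hypothetical. Only the bit meanings come from the AXI4 specification: AxPROT[0] is privileged, AxPROT[2] is instruction, AxCACHE[0] is bufferable and AxCACHE[1] is modifiable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    device: bool      # device (IO) vs normal memory
    bufferable: bool  # write response may come from an intermediate point
    executable: bool


# Hypothetical map: one normal RAM zone and one non-bufferable IO zone.
MEMORY_MAP = (
    Region(0x0000_0000, 0x0001_0000, device=False, bufferable=True, executable=True),
    Region(0x0001_0000, 0x0000_1000, device=True, bufferable=False, executable=False),
)


def find_region(addr):
    """Return the region containing addr, or None (candidate for an access fault)."""
    for region in MEMORY_MAP:
        if region.base <= addr < region.base + region.size:
            return region
    return None


def axprot(privileged, instruction):
    """AxPROT[0]=privileged, AxPROT[1]=non-secure (left at 0 here), AxPROT[2]=instruction."""
    return (int(instruction) << 2) | int(privileged)


def axcache(region):
    """Non-cacheable encodings only: bit0=bufferable, bit1=modifiable, allocate bits at 0."""
    modifiable = not region.device
    return (int(modifiable) << 1) | int(region.bufferable)
```

With this table, an M-mode data access to the IO zone yields `axprot(True, False) == 0b001` and `axcache(...) == 0b0000` (Device Non-bufferable), while an address outside every region returns `None`, which is the case the "mapping outside interconnect memory region" test would exercise.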

Cache Stages

  • AXI4 + Wrap mode for read
  • Support datapath adaptation from memory controller
    •  Narrow transfer support?
    •  Gather/merge multiple continuous transactions?
  • Clearly define the write-through no-allocate policy
  • New cache associativity (2 / 4 / 8 ways configurable)
  • OoO read: miss could be stacked and served later waiting for cache fill and continue reading the next address
  •  Fully concurrent read / write access (Issue #1)
    • Split memfy in load unit & store unit
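
For the "AXI4 + Wrap mode for read" item, the beat addresses of a WRAP burst follow the rule given in the AXI4 specification: the burst wraps at a boundary aligned on its total size. The sketch below computes that sequence; the function and parameter names are illustrative only and do not come from the cache RTL.

```python
def wrap_burst_addresses(start, size_bytes, length):
    """Return the byte address of each beat of an AXI4 WRAP burst.

    start      -- address of the first beat, aligned on size_bytes
    size_bytes -- bytes per beat
    length     -- number of beats, must be 2, 4, 8 or 16 for WRAP
    """
    if length not in (2, 4, 8, 16):
        raise ValueError("WRAP bursts must be 2, 4, 8 or 16 beats long")
    if start % size_bytes:
        raise ValueError("WRAP start address must be aligned on the beat size")
    total = size_bytes * length
    lower = (start // total) * total  # wrap boundary
    return [lower + (start - lower + i * size_bytes) % total for i in range(length)]
```

For a 32-byte cache line fetched over a 64-bit bus, `wrap_burst_addresses(0x28, 8, 4)` gives `[0x28, 0x30, 0x38, 0x20]`: the missing word arrives first and the rest of the line follows.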

Miscellaneous

  •  Add test for vector table
  • Test MSTATUS.TW
  •  mcountinhibit: stop a specific counter
  •  Machine Environment Configuration Registers (menvcfg and menvcfgh)
  • Machine Configuration Pointer Register (mconfigptr)
  • Create a HW test platform
    •  Analogue pocket
    • [C] Cloud
  • Add registers to configure the core in platform (use custom CSR)
    • Caches
    • Interconnect
    • processing: scheduling, hazard detection
  • Support CLIC controller
  • Random peripheral
  • UART: Support 9/10 bits & parity
  •  Deactivate the core with WFI (clock gating)
  •  Security Extension
    • Custom pmpsec CSR
      • priv/non-priv
      • cacheability
      • shareability
      • io/mem
    • HW isolation by CPU / Thread IDs

Next-Gen Architecture

AXI4 Infrastructure

  •  Detect address collision in memfy for better performance
    • support concurrent r/w in dCache
    • merge memfy_opt for memfy core update
  • Support different clock for AXI4 memory interface, cache and internal core
  • Support ECC bits in core/crossbar
  • Rework GPIOs sub-system
    • Reduce latency in switching logic
    • Add PERROR on the APB, to log on the error reporting bus
    • Rework IO APB interconnect
      • Fix IO subsystem misrouted
      • Fix IO subsystem bridge
  • Out of order support in AXI (memfy if not using cache)

Control / CSR

Processing

Verification/Validation

Hardware Tests

Ideas / Applications

DONE

  • v1.6.0: User Mode
    • Design
      •  Support U-mode:
        • Previous privilege mode interrupt is stored in xPP to support nested trap
        • Ecall moves to M-mode
        • Mret moves to U-mode
      •  Support exceptions
        • M-mode instructions executed in U-mode must raise an illegal instruction exception
        • Access to M-mode only registers must raise an illegal instruction exception
        • ecall code when coming from U-mode in mcause
      •  Support PMP (Physical Memory Protection)
        • Instruction read or data R/W access are checked against PMP to secure the hart
        • Address is checked with CSRs pmpcfg
        • Up to 16 zones can be defined
        • A zone can be readable, writable, executable
        • PMP checks are applied to all accesses whose effective privilege mode is S or U, including instruction fetches and data accesses in S and U mode, and data accesses in M-mode when the MPRV bit in mstatus is set and the MPP field in mstatus contains S or U (page 56 & page 23)
      •  Study PMA (Physical Memory Attribute) (section 3.6)
        • define R/W/X and the address matching
        • PMA does not allow defining IO zones and/or whether a region can be coherent
      •  WFI:
        • if MIE/SIE=1, wait for one of them and trap to M-mode. Resume at mepc=pc+4
        • if MIE/SIE=0, wait for any interrupt and move forward
        •  Support MSTATUS.TW (timeout platform-dependent)
      • add FIFO for memory exceptions
      • Drive aprot[0] based on privilege mode
      •  mcounteren: accessibility to lower privilege modes
        • Bit x = 1, lower privilege mode can read the counter
        • Bit x = 0, lower privilege mode access is forbidden and raises an illegal instruction exception
    • Testcases
      • Vary the period of the EIRQ
      • U-mode:
        • pass from/to m-mode/u-mode
        • try mret in u-mode, needs to fail
        • try to access m-mode only CSRs
      • Traps:
        • Do something within a loop with interrupt enabled, data needs to be OK
        • WFI in u-mode, interrupt enabled, trapped in m-mode
        • WFI in u-mode, interrupt disabled, NOP
        • Test load/store misaligned exceptions
      • MPU:
        • configure registers
        • all region configuration modes: NA4 / NAPOT / TOR
        • multiple mixed region types and sizes
        • [-] Access exceptions
          • execute instruction outside allowed regions (U-mode)
          • write data in U-mode
          • read data in U-mode
          • read data in M-mode with MPRV=1 + MPP=U-mode
          • write data in M-mode with MPRV=1 + MPP=U-mode
          • execute in M-mode without X + locked region
        • locked access to change configuration
      • MCOUNTEREN:
        • Bit x = 1, lower privilege mode can read the counter
        • Bit x = 0, lower privilege mode access is forbidden and raises an illegal instruction exception
  • v1.5.1: maintenance
    • Preload jal even if processing is busy
    • Print the failing tests, one by one, in the bash output
    • Join errors after a test status
    • Review readme files
    • Review all the parameters of each instance and document them
  • v1.5.0: Performance measurement and improvement
    • Print and save the CSRs for each test, keep track of performance in Git
    • IP to measure the bandwidth of the different buses
    • CPI measure in benchmark
    • Increase the maximum number of outstanding requests (OR) in dCache
    •  Prefetch read request
    •  Optimize write pusher to save a cycle
    •  Optimize Memfy dead cycle (RD write comb & pending request =0 if == 1 & valid)
    • Enhance read outstanding requests in MemFy
    • No more pending flags in caches, BCH / RCH handshake is used to manage reordering in Memfy
    • Enhance completion in OoO
    • Save a cycle on RD write in Memfy
    • Pending flag to deassert on completion if or=1
    • OoO write completion, response needs to come from the destination if IO write
    • Support prefetch: if no jump/branch detected in fetched instructions grab the next line, else give a try to fetch the branch address. AXI hint?
    • Reduce cache jump
  • v1.4.0
    • Rework Control for faster jump.
    • Rework iCache block fetcher to simplify it
    • Block fetcher: pass-thru front-end FIFO to reduce latency on jump
    •  Scheduler to run multiple operations in parallel. ALU can run along LD/ST if no hazard
    • CSR executes in a single cycle
    • Enhance Memfy outstanding request support
  • Add Zihpm
  • Fix TX read of UART which is blocking
  • Develop dCache
    • Uncacheable access for the IO region
    • Derive from iCache
    • Add pusher stage for write access
    • APROT[2] for instruction or data hint
  • Develop dCache testbench
  • Fix lint error code management in CI
  • Memfy:
    • Support outstanding read/write request
    • Don’t block write if AW / W are ready
    • Don’t block write until BCH but block any further read if pending write (in-order only)
  • Testcase WFI
  • Testcase outstanding requests
  • Testcase Zicntr
  • Add Zicntr
  • Rework trace among the modules
  • Deactivate trace with define for every module
  • AXI RAM model: add a performance mode
  • Add unsupported cache setup in core checkers
  • Add Github actions
  • Support unaligned address in APB sub-system
  • Add Clint peripheral
  • Output ISA regs on top level for debug purpose
  • Create a testbench for iCache
  • Support script in App interactive testsuite
  • Add C testsuite
  • Add Apps testsuite, interactive tb with UART link from Verilator
  • Add almost empty/full flags to scfifo
  • Ensure interrupt and trap are correctly supported
  • Update SVUT to pass extra string to vvp for VPI
  • Review flush/reboot in fetcher & memctrl
  • Enhance cache reboot when ARID changes. Today just flush the FIFO, could restart the whole fetcher stages
  • Make AXI4-lite RAM throttling
  • Enhance processing unit (CANCELLED: implementation is too big for too few benefits)
    • control checks the registers in use by an instruction and knows if it can branch, LUI, AUIPC
    • processing clears the tickets once the instruction is finished
    • processing knows if an ALU can be used based on the targeted register
  • Support multiple ALUs in parallel, different extensions (float, mult/div, ...)
  • Better print control status when branching and trapping (MCAUSE info)
  • Add Github Actions and deploy CI flow
  • Support both Icarus and Verilator in simulation flow
  • Add M extension
  • Share common sources between ASM and Compliance testsuite
  • Testbench supports both CORE and platform configuration
  • Develop FRISCV platform including the core, an AXI4 crossbar and peripherals
  • Simplify CSR r/w, save one cycle to execute an op
  • Option to read ISA registers on falling edge, not combinatorial read
  • Design a generic pipeline stage for processing front-end
  • Support traps and interrupts
  • Add clint controller
  • First documentation
  • Add external IRQ
  • Add software IRQ
  • Add timer IRQ
  • Parse doc and verify the trap handling (MCAUSE / ... fields)
  • Support traps
  • Convert the ASM testsuite to the riscv-tests format
  • Clean up repo after
  • Fix an issue when rebooting the cache: it issued an addr=0 request
  • Better handle traps on bad instruction
  • Support AXI4-lite for data interface
  • Pass RISCV compliance
  • Study how to use CSR
  • Share the testbenches and scripts between the environments (use Verilator?)
  • Define for SVLogger
  • Rename sources to remove rv32 mentions
  • Add parameter checks in the top level
  • Print state with function and a verbosity level
  • Implement a generic logger
  • Implement instruction cache
    • Support AXI4-lite from control unit
    • Pipeline operations
    • Support outstanding requests
    • Bundle fetcher in a dedicated module
    • Configure the testbench for command line
      • cache line width define, used to setup bin2hex.py
      • cache enabling
    • Debug the core
    • Reboot fetcher if a new ID comes in
  • Write-thru FIFO: If pull and empty, write directly the output not the RAM
  • Use AXI4-lite to fetch instruction
  • Always forward and define addressing in byte
  • Move CSR out of control unit
  • Synthesis session: OK for Yosys, needs another tool or lib to map async/sync reset FFD.
  • Add a debug interface (UART, JTAG) + DPI
  • Add GPIOs
  • Implement in-house profiler to check branching, stall time, ...
  • Develop top testbench to use asm programs and rely only on RAM to drive instructions and data into the core
    • Develop a unit test framework for ASM
    • Test memfy
    • Test processing + memory
    • Test processing vs JAL/JALR
    • Test JAL/JALR vs CSRs vs Processing
  •  Define the architecture of the first testbench. Goal: use C/asm to produce a RAM init file to drive the testcases. Break with a EBREAK instruction
  • Understand the VexRiscv and picorv32 Makefiles
  •  Write a simple program, compile it, understand the asm
  •  transform object into hex file
  •  Understand the toolchain
    • Understand the linker description to be able to initialize the processor instruction memory
  • Read the RISC-V unprivileged specification
  • Implement control unit and its testbench
    • be able to handle ALU halts for long instruction execution
    • support branching / system instructions
    • support pc correctly
  • Implement ALU
  •  Populate modules' unit tests (control & alu)
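
The PMP behaviour recorded under v1.6.0 (NA4 / NAPOT / TOR regions, R/W/X permissions, locked regions, M-mode vs U-mode checks) can be summarised with a short reference model. This is a sketch under stated assumptions, not the core's implementation: the function names are hypothetical, accesses are treated as single-byte (no partial-match handling), and `machine_mode` stands for the effective privilege, so MPRV/MPP resolution is left to the caller. The field encodings follow the RISC-V privileged specification.

```python
OFF, TOR, NA4, NAPOT = 0, 1, 2, 3      # pmpcfg A field encodings
R, W, X, L = 0x01, 0x02, 0x04, 0x80    # pmpcfg permission and lock bits


def region_bounds(cfg, pmpaddr, prev_pmpaddr):
    """Return the [lo, hi) byte range of one PMP entry, or None if it is OFF."""
    mode = (cfg >> 3) & 0x3
    if mode == OFF:
        return None
    if mode == TOR:
        return prev_pmpaddr << 2, pmpaddr << 2
    if mode == NA4:
        return pmpaddr << 2, (pmpaddr << 2) + 4
    # NAPOT: n trailing ones in pmpaddr encode a region of 2^(n+3) bytes
    ones = 0
    while (pmpaddr >> ones) & 1:
        ones += 1
    base = (pmpaddr >> ones << ones) << 2
    return base, base + (1 << (ones + 3))


def pmp_check(addr, access, machine_mode, entries):
    """entries: list of (cfg, pmpaddr) in index order; access is R, W or X."""
    prev = 0
    for cfg, pmpaddr in entries:
        bounds = region_bounds(cfg, pmpaddr, prev)
        prev = pmpaddr
        if bounds and bounds[0] <= addr < bounds[1]:
            # Lowest-numbered matching entry decides the access.
            if machine_mode and not cfg & L:
                return True  # unlocked entries do not constrain M-mode
            return bool(cfg & access)
    # No match: M-mode succeeds, lower privilege modes fail.
    return machine_mode
```

As a usage example, a 4 KiB read/execute NAPOT region at 0x8000_0000 is encoded as `pmpaddr = (0x8000_0000 >> 2) | ((0x1000 >> 3) - 1)`, i.e. `0x200001FF`, with `cfg = (NAPOT << 3) | R | X`. A U-mode write inside that region is then rejected, while the same write from M-mode passes because the entry is not locked.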