2021.1.0

@davide-giri davide-giri released this 23 Jan 00:12

Added

  • Accelerator design flows
    • Keras/PyTorch/ONNX with hls4ml
      • Accelerator templates
      • Accelerator and test applications generation with AccGen
      • Tutorial
    • C/C++ with Xilinx Vivado HLS
      • Accelerator templates
      • Accelerator and test applications skeleton generation with AccGen
      • Tutorial
      • Sample accelerators: adder (element-wise addition)
    • C/C++ with Mentor Catapult HLS
    • SystemC with Cadence Stratus HLS
      • Accelerator templates (includes, skeleton templates)
      • Accelerator and test applications skeleton generation with AccGen
      • Tutorial
      • Sample accelerators: dummy (identity mapping), fft (Fast Fourier Transform 1D), sort, spmv (sparse matrix-vector multiplication), synth (synthetic traffic generator), nightvision (night-vision kernels), vitbfly2 (Viterbi butterfly), vitdodec (Viterbi decoder)
    • Chisel
      • Accelerator templates
      • Sample accelerators: adder (element-wise addition), counter, fft (Fast Fourier Transform 1D)
  • Third-party accelerator integration flow
    • Supported accelerator interfaces: AXI for the memory interface, AXI-Lite and APB for the configuration interface
    • Tutorial
    • Sample accelerators: NVDLA
  • SoC design flow
    • High-level SoC configuration (batch or GUI)
    • Automatic SoC generation
    • Push-button full-system RTL simulation of bare-metal programs
      • Supported simulators: Mentor ModelSim SE, Cadence Incisive, Cadence Xcelium
    • Push-button FPGA bitstream generation
      • Supported FPGA tools: Xilinx Vivado
  • Architecture
    • NoC
      • Packet-switched NoC with lookahead routing, single-cycle hop, and configurable bitwidth
      • ESP SoCs use 6 bidirectional physical NoC planes
        • 3 for cache coherence messages (32 or 64 bits wide, depending on the processor architecture)
        • 2 for DMA messages (32 or 64 bits wide, depending on the processor architecture)
        • 1 32-bit plane for the other messages (interrupts, memory-mapped IO and configuration registers)
    • Processor tile
      • Processor
        • Available options: 32-bit Leon3 (SPARC V8) with ESP FPU, 64-bit Ariane (RISC-V), 32-bit Ibex (RISC-V)
      • L2 private cache (optional)
      • Bus
        • Memory request bus options: AXI, AHB
        • Memory-mapped IO requests bus options: APB
      • Support for SoCs with multiple processor tiles
    • Accelerator tile
      • Accelerator (see accelerator design flow options above)
      • Accelerator socket
        • Accelerator configuration registers (default registers + user-defined registers)
        • Miss-free accelerator TLB for low-overhead virtual memory support
        • Accelerator DMA engine
        • Private cache (optional)
          • Same as the L2 private cache in the processor tile
        • Cache coherence
          • Supported options: coherent with private cache, coherent DMA, LLC-coherent DMA, non-coherent DMA
          • Configurable at run-time
        • Point-to-point accelerator communication
          • Configurable at run-time
      • Support for SoCs with multiple accelerator tiles
    • Third-party accelerator tile
      • Accelerator socket
        • Bus-to-NoC bridges
          • Memory requests bus options: AXI
          • Memory-mapped IO requests bus options: AXI-Lite, APB
      • Support for SoCs with multiple third-party accelerator tiles
    • Memory tile
      • Last-level cache slice (optional)
        • Directory-based MESI protocol over the NoC
        • Support for coherent DMA and LLC-coherent DMA
        • Available implementations: SystemVerilog, SystemC
      • Memory channel
        • Optionally include AHB bus and memory controller in the memory tile
      • Memory simulation model for full-system RTL simulation
      • Support for all accelerator cache coherence options
      • Support for SoCs with multiple memory tiles
        • Up to 2 memory tiles on proFPGA Virtex7 XC7V2000T FPGA module and up to 4 memory tiles on proFPGA Virtex UltraScale XCVU440 FPGA module
    • Auxiliary tile
      • Peripherals: Ethernet, UART, DVI (only on proFPGA FPGA modules with DVI interface board)
      • ESP Link debug unit
      • SoC initialization unit
      • Interrupt controller: Leon3 multiprocessor interrupt controller or RISC-V platform interrupt controller
      • Timer: GRLIB general-purpose timer or RISC-V core-local interrupt controller
      • Frame buffer
    • Scratchpad (shared-local memory) tile
      • Shared software-managed addressable memory
      • Support for multiple SLM tiles
      • SLM can replace external memory when configuring ESP with no memory tiles and selecting the Ibex core
    • Additional SoC services
      • ESP tile CSRs: memory mapped and accessible from software
        • Configuration registers: pad configuration, clock generator configuration, tile ID configuration, core ID configuration (processor tile only), Ethernet and UART scaler configuration (auxiliary tile only), soft reset
        • Performance counters: accelerator activity, cache hit and miss rates, memory accesses, NoC router traffic, dynamic voltage-frequency scaling operation
        • With proFPGA FPGA modules, performance counters can also be accessed via Ethernet through an MMI64-based monitor interface (see ESP software tools below)
      • NoC adapters: AXI (to-NoC), AHB (to-NoC, from-NoC), APB (to-NoC, from-NoC), DMA (to/from-NoC), interrupt line (to-NoC, from-NoC)
      • Other adapters: APB-to-AXI-Lite, custom memory link for ESP instances without an integrated DDR controller (link-to-AHB, cache/DMA-to-link)
      • NoC queues in every tile (processor, accelerator, memory, auxiliary, scratchpad)
      • Dynamic Voltage-Frequency Scaling controller in every tile
      • Single-tile test unit in every tile
  • ESP software stack
    • Support for Ariane, Leon3, and Ibex processors
    • Linux SMP support (Ariane and Leon3 only)
    • Bare-metal support
    • Multi-core support (Leon3 only)
      • Leon3 bare-metal multi-core test suite
    • Accelerator-specific software
      • ESP accelerator device driver
      • LibESP: the ESP accelerator invocation API
        • 3 functions: esp_alloc, esp_run, esp_free
        • Manage the execution of multiple accelerators in parallel and/or in a pipeline
      • Bare-metal unit-test sample applications for accelerators
      • Linux unit-test sample applications for accelerators
      • Multi-accelerator Linux application examples
  • ESP software tools
    • AccGen: accelerator skeleton generator, including testbench, device driver and test applications
    • PLMGen: multi-port and multi-bank memory generator for SystemC accelerators
    • SoCGen: configure and generate an ESP SoC (batch or GUI)
    • SocketGen: generate the RTL for some of the ESP tile sockets
    • ESPLink: debug link via Ethernet from a host machine
    • ESPMon: collection of hardware performance monitors accessed via Ethernet through the proFPGA MMI64 interface (batch or GUI)
  • Supported FPGA development boards
    • Xilinx Virtex UltraScale+ FPGA VCU118
    • Xilinx Virtex UltraScale+ FPGA VCU128
    • Xilinx Virtex-7 FPGA VC707
    • proFPGA Virtex7 XC7V2000T
    • proFPGA Virtex UltraScale XCVU440
    • Xilinx Zynq UltraScale+ MPSoC ZCU102 (WIP)
    • Xilinx Zynq UltraScale+ MPSoC ZCU106 (WIP)
  • Supported OS
    • CentOS 7 (recommended)
    • Red Hat Enterprise Linux 7.8
    • Ubuntu 18.04 (Cadence Stratus HLS not fully supported)