AI Accelerator Chip

This chip uses a tiled architecture to compute many fixed-point multiply-and-accumulate operations in parallel. In other words, the chip takes an input vector and applies a series of matrix multiplications to it, which simulates the layer computations in Deep Learning.
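
As a rough illustration of what a fixed-point multiply-and-accumulate looks like, the sketch below models it in Python. The Q8.8 format (`FRAC_BITS = 8`) and the helper names `to_fixed`, `to_float`, `mac`, and `matvec` are assumptions made for this example; the README does not state the bit widths or rounding behavior used by the hardware.

```python
FRAC_BITS = 8  # assumed Q8.8 format; the actual widths are not given in this README


def to_fixed(x: float) -> int:
    """Convert a real number to a fixed-point integer."""
    return int(round(x * (1 << FRAC_BITS)))


def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a real number."""
    return q / (1 << FRAC_BITS)


def mac(inputs, weights):
    """Multiply-and-accumulate over two fixed-point vectors."""
    acc = 0
    for a, w in zip(inputs, weights):
        acc += a * w  # each product carries 2 * FRAC_BITS fractional bits
    return acc >> FRAC_BITS  # rescale the sum back to FRAC_BITS fractional bits


def matvec(weight_rows, inputs):
    """One matrix-vector product: one MAC per row of the weight matrix."""
    return [mac(inputs, row) for row in weight_rows]


x = [to_fixed(v) for v in (1.0, 0.5)]
W = [[to_fixed(v) for v in row] for row in ((2.0, 4.0), (-1.0, 0.5))]
y = [to_float(q) for q in matvec(W, x)]
```

Here each row of `W` plays the role of one neuron's weights, so `y` holds one layer's output for the input `x`.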

Architecture

The tiled architecture refers to the chip being composed of a grid of autonomous tiles, with each tile connected to its adjacent tiles. Each tile is analogous to an artificial neuron in Deep Learning, and each row of the grid is analogous to a layer. Information propagates from top to bottom: the input vector is passed into the top layer of the grid, and each layer's result is passed downwards to the next layer. The final result is the output of the bottom layer of the grid. Each tile communicates bidirectionally with its left and right neighbors to assemble the output of the entire layer, which is then propagated downwards as the input to the next layer.
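
The top-to-bottom data flow can be sketched as a small behavioral model. This is only an illustration under assumptions: `run_grid` and `grid_weights` are hypothetical names, plain integers stand in for fixed-point values, and no activation function is applied because the README does not mention one.

```python
def dot(xs, ws):
    """Dot product of two equal-length vectors."""
    return sum(x * w for x, w in zip(xs, ws))


def run_grid(grid_weights, input_vector):
    """Propagate a vector through the grid, one row (layer) at a time.

    grid_weights[r][c] is the weight vector of the tile in row r, column c.
    """
    vector = list(input_vector)
    for row in grid_weights:  # top layer first, bottom layer last
        # Every tile in the row sees the same input vector and contributes
        # one element of the row's output vector.
        vector = [dot(vector, tile_weights) for tile_weights in row]
    return vector


grid = [
    [[1, 0], [0, 1]],   # top row: passes the input through unchanged
    [[1, 1], [1, -1]],  # bottom row: sum and difference of the two inputs
]
result = run_grid(grid, [3, 4])
```

The list returned by `run_grid` corresponds to the output of the bottom layer of the grid.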

Core

Each tile contains an outer core and an inner core. The inner core computes the dot product of the input vector with its weight vector, which is the output of one neuron. The inner core sends this result to the outer core, which communicates with the tile's neighbors to assemble the entire layer's output. The output of the outer core is therefore the output of an entire layer.
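
The division of labor between the two cores might be modeled as follows. The function signatures, the `column` index, and the `neighbor_outputs` mapping are hypothetical and chosen only for this sketch; they are not taken from `inner_core.sv` or `outer_core.sv`.

```python
def inner_core(input_vector, weight_vector):
    """Compute one neuron's output: the dot product of input and weights."""
    return sum(x * w for x, w in zip(input_vector, weight_vector))


def outer_core(column, own_output, neighbor_outputs, width):
    """Assemble the layer's output vector.

    column           -- this tile's position within its row
    own_output       -- the value produced by this tile's inner core
    neighbor_outputs -- mapping of column index to the value received
                        from the tile in that column
    width            -- number of tiles in the row
    """
    layer = [None] * width
    layer[column] = own_output
    for col, value in neighbor_outputs.items():
        layer[col] = value
    return layer


own = inner_core([1, 2, 3], [4, 5, 6])
layer_output = outer_core(1, own, {0: 10, 2: -5}, width=3)
```

In this model the middle tile of a three-tile row contributes its own value at index 1 and fills the remaining positions with the values received from its neighbors.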

Tile

The tile is a wrapper module around the outer core. It handles serial communication and daisy-chaining with neighboring tiles.

Communication

Communication happens in a daisy-chained manner, with each tile sending its own output as well as any outputs it receives from its neighbors. This allows the output of a single neuron to propagate to all of the other neurons in the layer, so that each tile can output the result of the whole layer to the tile below it. This is necessary because each tile expects an entire vector (the output of the previous layer) as input.
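
The daisy-chaining described above can be sketched as a step-by-step simulation of one row of tiles. This is a simplified model under assumptions: every tile forwards everything it knows to both neighbors once per step, and the name `daisy_chain` is hypothetical. The actual packet format, serial timing, and buffering are not described in this README.

```python
def daisy_chain(outputs):
    """Simulate one row of tiles sharing their outputs with their neighbors.

    outputs[i] is the value computed by tile i. Returns the layer vector as
    seen by each tile, plus the number of steps needed to fill them all.
    """
    n = len(outputs)
    known = [{i: outputs[i]} for i in range(n)]  # each tile starts with its own value
    steps = 0
    while any(len(k) < n for k in known):
        snapshot = [dict(k) for k in known]  # all tiles send at the same time
        for i in range(n):
            for j in (i - 1, i + 1):  # left and right neighbors
                if 0 <= j < n:
                    known[j].update(snapshot[i])
        steps += 1
    vectors = [[k[c] for c in range(n)] for k in known]
    return vectors, steps


vectors, steps = daisy_chain([5, 7, 9, 11])
```

In this model a row of four tiles needs three steps, because the value from one end of the row has to cross three links to reach the other end. After that, every tile holds the same complete layer vector.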

Simulation

This project was built with Intel Quartus and simulated with ModelSim. Each module has testbenches to check the correctness of the component. The outer_core_tb testbench simulates a 4x4 grid of cores using non-serial communication. In the screenshot of the waveform simulation, the outputs of the last layer (/outer_tb/layer_outg[3]) are consistent with the expected results.
