Skip to content

A Generic hardware solution to a PopulationCount algorithm ( popcnt ), written in vhdl. Testbench and diagrams included

Notifications You must be signed in to change notification settings

BarakBinyamin/PopulationCount

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

PopulationCount

A Generic hardware solution to a PopulationCount algorithm, written in VHDL. Testbench and diagrams included

Description

This device counts the number of "1"'s in a binary word, it has many applications including crytography and error detection/correction in communications

The structure of this design was inspired by a Divide and Conquer Algorithm explained on stack overflow

Here is a waveform showing what it does:


Figure 1: Behavioral Simulation Waveform

You can find an explanation of this waveform in the testing section

Developers

Quickstart

EDA

The source is shared on EDAplayground, where it's ready for simulation (sign in and hit the run button). A waveform should pop up after the simulation

Be sure to check the settings on the left panel, they should look like:

  • Testbench + Design: VHDL
  • Top entity: count_onesTB
  • Tools and Simulators
    • Aldec Rivera Pro
      • Compile Options: 2019 -o
    • [ x ] Open EPWave after run

When the EPwave opens:

  • Select Get Signals on the top left, a panel should pop up
  • Select count_onesTB under scope on the left
  • Select Append All on the bottom right

For more information on how to use EDAplayground please refer to the EDAplayground documentation

CLI_HDL

Compile VHDL code using ghdl from the command line and simulate using gtkwave.

  1. Install ghdl
  2. Structure your source code in 'work' into hirearchical directories
  3. Use make to compile your code
  4. Use TESTBENCH=tst/... make test to view a Gtkwave simulation like this one

Gtkwave popcount

Project Directory

Name Purpose
img Screen captures showing design & functionality
src The source code
tst Tests

Components Available For Individual Download:

Theory

The structure of this design was inspired by a Divide and Conquer Algorithm explained on stack overflow, written in C

unsigned int count_bit(unsigned int x)
{
  x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
  x = (x & 0x0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F);
  x = (x & 0x00FF00FF) + ((x >> 8) & 0x00FF00FF);
  x = (x & 0x0000FFFF) + ((x >> 16)& 0x0000FFFF);
  return x;
}

Posted by user abcdabcd987 on Stack Overflow

To approach creating a scalable hardware equivalent, this algorithm can be broken up into 4 distinct problems:

  1. Generating the Bitmasks
  2. Generating a circuit to AND two sets of bits
  3. Generating a circuit to ADD two sets of bits
  4. Generating a circuit to arithmetically SHIFT(keep the sign bit when shifting) two sets of bits

Only the generation of the bitmasks will be discussed in this README, see this link for a project dedicated to the addition unit.

Once a unit can be made for one set of these operations (one line), then these units can be sequenced together.


Figure 2: Schematic For a Two-bit Population Count Circuit (equivelent to one line of c code)

Figure 3: Schematic For a Four-bit Population Count Circuit (equivelent to three lines of c code)

This design uses registers between the OP units to synchronize the circuits signals and limit timing issues such as propogation delay.

Generating the Bitmasks

Let's inspect if there is a pattern in the bitmasks:

Mask number HEX MASK BIT MASK
1 0x55555555 01010101010101010101010101010101
2 0x33333333 00110011001100110011001100110011
3 0x0F0F0F0F 00001111000011110000111100001111
4 0x00FF00FF 00000000111111110000000011111111
5 0x0000FFFF 00000000000000001111111111111111

From the table, we can see:

  1. The length of the mask is the length of the intended X input
  2. The number of masks corresponds with the what power of 2 creates the length of the number, in this case 2^5 = 32
  3. Each mask oscillates between 2^(MASK_NUMBER-1) 1's and 0's

Lets examine Mask number 3:

0000 1111 0000 1111 0000 1111 0000 1111     

For the first 12 bits (from right to left), we find that indeces 0,1,2,3,8,9,10,11 all contain '1' values.

Searching this sequence on the Online Encyclopedia of Integer Sequences, we find that it is equivelent to numbers that are congruent to {0, 1, 2, 3} mod 8.

With a little more inspection the general rule looks to be that the ones are every index in which index MOD 2^(MASK_NUMBER) <= 2^(MASK_NUMBER-1)-1

Now we can generate our 2d array of masks:

gen_masks: for MASK_NUMBER in 1 to LOG_BIT_WIDTH generate

    gen_bits: for BIT in (2**LOG_BIT_WIDTH)-1 downto 0 generate
        one_bits: if BIT mod 2**MASK_NUMBER <= 2**(MASK_NUMBER-1)-1 generate
                mask(MASK_NUMBER-1)(BIT) <= '1';
        end generate one_bits;

        zero_bits: if BIT mod 2**MASK_NUMBER > 2**(MASK_NUMBER-1)-1 generate
                mask(MASK_NUMBER-1)(BIT) <= '0';
        end generate zero_bits;
       
    end generate gen_bits;  
end generate gen_masks;

Testing

Tests for VHDL designs are usually refered to as a "Testbench". It can be thought of as a wrapper component used to drive the inputs and tap the outputs of the Unit Under Test (UUT)

We can simulate the testbench and inspect the inputs and outputs using tools available through Vivado or EDAPlayground

Designs can get large and visual inpection of a waveform can become time consuming. Thankfully a testbench can automatically check if the outputs match expected values at the right time with assert statements. If the outputs don't match expectations, we can raise alerts which will show up in the logs of the simulation tool


Figure 1: Behavioral Simulation Waveform

Figure 1 shows three scaled simulations of the design.

  • A 2 bit unit
  • A 4 bit unit
  • A 32 bit unit

The test inputs are labeled Xone-Xthree
The test outputs are labeled COUNTone-COUNTthree

Motivation

I was looking for a hardware/software internship in 2018 and my brother @ErezBinyamin offered to share my resume with his work on the condition I could pass two challanges. One was solving the first crackme challange using Ghidra, and the other was to devlop a harware equivalent of a population count algorithm from stack overflow

Both were actually really fun and I'm glad I got the chance to build something and learn along the way

Resources

About

A Generic hardware solution to a PopulationCount algorithm ( popcnt ), written in vhdl. Testbench and diagrams included

Topics

Resources

Stars

Watchers

Forks