Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Himbaechel xilinx : Add support of DSP packing for 7-series #1363

Merged
merged 11 commits into from
Sep 24, 2024

Conversation

marzoul
Copy link
Contributor

@marzoul marzoul commented Sep 10, 2024

Hi,

This pull request imports the DSP packing implementation from project nextpnr-xilinx.
https://github.com/gatecat/nextpnr-xilinx/
Only minimal changes done to the main code to make it fit for Himbaechel code base.

This should address issue #1361
(but not make it work altogether as routing fails later in the flow).

One point to highlight is the addition of rbegin() to class indexed_store, because as far as I understand the use of end() will always generate an out-of-range exception because of calling at() with an index equal to the vector size.
So if existing code makes use of end() should probably be inspected.

Regards,
Adrien

@marzoul
Copy link
Contributor Author

marzoul commented Sep 11, 2024

Changes done and tested, it still works :-)

@marzoul
Copy link
Contributor Author

marzoul commented Sep 12, 2024

I pushed the extra checks.
Now the routing starts, but does not succeed.

Info: Routing..
Info: Setting up routing queue.
Info: Routing 1919 arcs.
Info:            |   (re-)routed arcs  |   delta    | remaining|       time spent     |
Info:    IterCnt |  w/ripup   wo/ripup |  w/r  wo/r |      arcs| batch(sec) total(sec)|
Warning: Failed to find a route for arc 100 of net $PACKER_GND_NET.
Info: Running post-routing legalisation...
ERROR: Expecting string value but got integer 1380270932.
1 warning, 1 error

Worth noting is that it is related to net $PACKER_GND_NET that is explicitly used in DSP packing.
But this is beyond what I can analyze.

@marzoul
Copy link
Contributor Author

marzoul commented Sep 12, 2024

Found the option --debug-router..
The routing issue (which is just a warning) as about that arc :

Routing constant arc 100 on net $PACKER_GND_NET (1905 arcs total):
  value ... GND
  sink ..... X49Y57/DSP48_X0Y1.CARRYCASCIN
  total number of visited nodes: 5
  no route found for this arc
Warning: Failed to find a route for arc 100 of net $PACKER_GND_NET.

Indeed CARRYCASCIN input is not reachable with routing.
But the crash happens after :

Info: Running post-routing legalisation...
ERROR: Expecting string value but got integer 1380270932.
1 warning, 1 error

According to gdb stacktrace, it is indeed linked to DSPs:

(gdb) bt
#0  nextpnr_himbaechel::log_error (format=0x5555557319c8 "Expecting string value but got integer %d.\n") at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/common/kernel/log.cc:184
#1  0x00005555555c8cc6 in nextpnr_himbaechel::str_or_default<nextpnr_himbaechel::IdString> (ct=..., key=..., def="DIRECT") at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/common/kernel/util.h:65
#2  0x00005555556b29ce in nextpnr_himbaechel::(anonymous namespace)::FasmBackend::write_dsp_cell (this=0x7fffffffad00, ci=<optimized out>) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/himbaechel/uarch/xilinx/fasm.cc:1524
#3  nextpnr_himbaechel::(anonymous namespace)::FasmBackend::write_ip (this=0x7fffffffad00) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/himbaechel/uarch/xilinx/fasm.cc:1634
#4  nextpnr_himbaechel::(anonymous namespace)::FasmBackend::write_fasm (this=0x7fffffffad00) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/himbaechel/uarch/xilinx/fasm.cc:1648
#5  nextpnr_himbaechel::XilinxImpl::write_fasm (this=<optimized out>, filename=...) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/himbaechel/uarch/xilinx/fasm.cc:1661
#6  0x00005555556e3af7 in nextpnr_himbaechel::XilinxImpl::postRoute (this=0x5555557b0190) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/himbaechel/uarch/xilinx/xilinx.cc:273
#7  0x000055555564e16b in nextpnr_himbaechel::Arch::route (this=0x5555557b3860) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/himbaechel/arch.cc:243
#8  0x00005555556fef69 in nextpnr_himbaechel::CommandHandler::executeMain(std::unique_ptr<nextpnr_himbaechel::Context, std::default_delete<nextpnr_himbaechel::Context> >) [clone .constprop.0] (this=this@entry=0x7fffffffc590, 
    ctx=std::unique_ptr<nextpnr_himbaechel::Context> = {...}) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/common/kernel/command.cc:687
#9  0x00005555555917e3 in nextpnr_himbaechel::CommandHandler::exec (this=0x7fffffffc590) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/common/kernel/command.cc:759
#10 main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/nextpnr-himbaechel-git/nextpnr-himbaechel/himbaechel/main.cc:107
(gdb) 

The code at fasm.cc:1524 :
auto ainput = str_or_default(ci->params, ctx->id("A_INPUT"), "DIRECT");

@marzoul
Copy link
Contributor Author

marzoul commented Sep 13, 2024

Maybe ci->params stores the string values as string IDs I reckon ?
So the variant int_or_default could be more appropriate for these operations, the stored value is just compared anyway right after.

@gatecat
Copy link
Member

gatecat commented Sep 13, 2024

No, strings that come from Yosys should be stored as strings in the JSON and therefore kept as string type property values.

The problem is that GHDL doesn't preserve them as strings, so you get something like this in the Verilog/JSON .A_INPUT(48'b010001000100100101010010010001010100001101010100) - nextpnr can't handle this with str_or_default which expects a string to be preserved (I'm not certain if Vivado would accept this, either, other vendor tools generally expect to see strings as strings).

With a normal Verilog flow, this should work fine.

@marzoul
Copy link
Contributor Author

marzoul commented Sep 13, 2024

I did not notice this ghdl generation issue. Thank you.
With a little post-processing of generated verilog I can have the types expected by nextpnr, that'll do for now.

But nextpnr expects MASK and PATTERN to strings, where the reference instantiation templates from ug953 looks more like a bit vector. So the format generated by ghdl should be a better fit ?

    .MASK(48'b001111111111111111111111111111111111111111111111),
    .PATTERN(48'b000000000000000000000000000000000000000000000000),

Routing still fails, though. But this time no exploitable stack trace. So I guess the warning has to be fixed solidly first.

Warning: Failed to find a route for arc 100 of net $PACKER_GND_NET.
Info: Running post-routing legalisation...
ERROR: Routing design failed.

@marzoul
Copy link
Contributor Author

marzoul commented Sep 13, 2024

I have updated the archive of my reproducer with DSP instantiation compatible with nextpnr.
https://cloud.univ-grenoble-alpes.fr/s/pTHEn8ngLgSTpxg

@marzoul
Copy link
Contributor Author

marzoul commented Sep 13, 2024

I added a commit to disconnect more cascaded input ports so these are not routed.
Routing now succeeds \o/

@gatecat
Copy link
Member

gatecat commented Sep 16, 2024

But nextpnr expects MASK and PATTERN to strings, where the reference instantiation templates from ug953 looks more like a bit vector. So the format generated by ghdl should be a better fit ?

This sounds like a problem in nextpnr, they should be handled as bitvectors (probably, as these parameters aren't usually specified, this was never caught before).

@marzoul
Copy link
Contributor Author

marzoul commented Sep 17, 2024

The issue is now with cascaded DSPs, this is used for multiplications larger than one DSP.
Reproducer design is in public archive :
https://cloud.univ-grenoble-alpes.fr/s/NpQT72p49kR5mCb

// In case no bel was found, emit a warning instead of continuing to a crash
// This helps to investigate and to devise a workaround in user design
if (bel == BelId()) {
log_warning("No bel found to bind cell '%s' of type '%s'\n", cell->name.c_str(ctx), cell->type.c_str(ctx));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should just be a warning

@marzoul
Copy link
Contributor Author

marzoul commented Sep 19, 2024

Disabling DSP clustering makes placement and routing succeed.
And it seems the current clustering of DSPs should be revised (see comments in commits).

It seems the results are valid solutions, though I can't guarantee that given my little knowledge of placer behaviour.
But at least, this it makes nextpnr produce results. I think this is valuable for both end users (perf metrics give directions to continue their design development), and nextpnr developers (large designs enable to work on performance of algorithms).

This no-clustering behavior also corresponds to how BRAMs are handled.

@marzoul
Copy link
Contributor Author

marzoul commented Sep 19, 2024

Finally, I pushed a commit to be more flexible about types of DSP parameters, for interoperability purposes.
Xilinx doc ug953 shows bitvector type for PATTERN and MASK, but yosys seems to generate strings only, and ghdl generates bitvectors.
Xilinx doc ug953 even shows inconsistent type for USE_DPORT.

@gatecat
Copy link
Member

gatecat commented Sep 23, 2024

Unfortunately I haven't had much time to look into this, but I'm somewhat surprised that just disabling clustering is enough to make this work - my understanding was the DSP cascade routing is dedicated, so if the DSPs aren't placed correctly then routing will fail. BRAMs are different because we don't ever actually use any of the functionality that requires dedicated cascade routing for those.

A "proper", albeit somewhat more complex solution, if I understand this correctly, would be to override getClusterPlacement to account for the gaps in the column structure and that DSPs don't always have fixed offsets.

@marzoul
Copy link
Contributor Author

marzoul commented Sep 24, 2024

Indeed I'm not saying that this is fixed, only that there is no crash on my experiment design.
Obtaining a correct placement by chance is nice but not relevant for now. For my humble objective, potentially non-functional results is still better than crash because I can keep going.
A strong warning is certainly desirable.

Are you meaning that placement of cascaded BRAMs is not supposed to be handled properly ?

Some interesting words about cascaded BRAMs : I am testing also testing with 2 cascaded BRAMs, and perhaps also by chance these happen to be placed correctly (runtime not tested). Yosys does generate cascaded BRAMs for a 64k x 1b sync memories so this is easily reproducible (I can provide testcase if this is useful).
However, nextpnr-xilinx crashes in the placer code on cascaded BRAMs. I had to add more constant cascaded port disconnection in nextpnr-xilinx to make it work (pull request to come, for discussion).

@marzoul
Copy link
Contributor Author

marzoul commented Sep 24, 2024

Now I need some visibility : What should this PR feature to hope for integration ?
Adding a warning for cascaded DSPs is doable.
But in making the placer correctly support it could be future works (limited time and knowledge of intricacies of this code).

@marzoul
Copy link
Contributor Author

marzoul commented Sep 24, 2024

I'm wrong : the DSP clustering code really seems correct.
And I can get the placer crash on a cluster created from a carry chain.
So the placer behavior about clusters must be inspected deeper.

@gatecat
Copy link
Member

gatecat commented Sep 24, 2024

Yeah, this is definitely a placer bug, I will try and find some time to look into it if you don't get there first (realistically Thursday/Friday).

I didn't realise Yosys was generating BRAM cascades now, it's been a while since I've looked at that code. This will need to be properly supported too. Beware that blindly disconnecting cascade ports that are not connected to constants in order to not need cluster placements is likely to break (in some cases subtly) design behaviour.

@marzoul
Copy link
Contributor Author

marzoul commented Sep 24, 2024

Finally found the root cause : The root DSP was not marked as being in a cluster !
It all seems functional now.

@gatecat
Copy link
Member

gatecat commented Sep 24, 2024

Nice, thanks! This was probably missed when porting the code to the new cluster API at some point.

@gatecat gatecat merged commit 437fb70 into YosysHQ:master Sep 24, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants