Files
PLFM_RADAR/9_Firmware
Jason cc6691dec9 perf(fpga): move CIC comb stages to fabric — 80→70 DSPs (-10)
Strip the explicit DSP48E1 instance from comb stage 0 and the
(* use_dsp = "yes" *) attribute from comb stages 1-4. The combs are
gated by data_valid_comb_pipe (fires once every 4 clk_400m cycles
post-decimation), so a multicycle path of 4 -setup / 3 -hold scoped
to the comb registers in xc7a50t_ftg256.xdc gives STA 10 ns of slack
for fabric carry-chain to close 28-bit subtracts comfortably.

Pipeline depth and bit-widths unchanged: the new fabric model mirrors
the prior CREG+AREG+BREG+PREG structure exactly, so data_valid_comb_0_out
alignment and downstream stages 1-4 see bit-identical samples. CIC
behavioral simulation model now lives outside the SIMULATION ifdef
branch (used unconditionally) since there is no longer a synthesis-only
DSP48E1 to replace.

50T post-impl results (Vivado 2025.2):
  DSPs:         80 → 70 / 120 (66.7% → 58.3%, freed 10)
  LUTs:         22114 / 32600 (67.8%)
  BRAM:         55.5 / 75 (74.0%, unchanged)
  adc_dco_p WNS: +0.022 ns → +0.906 ns (margin improved)
  All clocks meet timing, 0 failing endpoints.

Local regression: 32/34 PASS — same as baseline; the two failures
(Receiver Integration, Matched Filter Chain) are pre-existing
RX-NEW-3 (FFT throughput) and unaffected by this change. Bit-exact
through DDC chain (NCO→CIC→FIR) and MF cosim verified.

Cumulative DSP savings today: 112 → 70 (freed 42), enough headroom
for Xilinx LogiCORE FFT Pipelined Streaming swap (~33 DSPs for the
3-instance matched-filter chain) with 17 DSPs to spare.
2026-04-23 11:32:03 +05:45
..