From 30b31509c197b79ffd8dbb50905fb05e5c0fc6c9 Mon Sep 17 00:00:00 2001
From: canisio
Date: Mon, 30 Mar 2026 09:30:16 -0300
Subject: [PATCH] Updated README

---
 README.md | 329 ++++++++++---------------------------------------
 1 file changed, 59 insertions(+), 270 deletions(-)

diff --git a/README.md b/README.md
index 75577a3..674d5a3 100644
--- a/README.md
+++ b/README.md
@@ -1,308 +1,97 @@
-# RFSoC Channelizer RX Architecture (ZCU111)
+# 📡 RFSoC Channelizer + PS Processing (R-ESM Prototype)
 
-> **Project Context:** This design is part of a prototype **R-ESM (Radar Electronic Support Measures) receiver**, implemented on the ZCU111 RFSoC platform.
-> The project was initiated from the RFSoC reference template provided by MATLAB/Simulink SoC Blockset and is being incrementally analyzed and modified.
+## Overview
+
+This project is based on the RFSoC SoC Blockset reference design, adapted as a prototype for a Radar Electronic Support Measures (R-ESM) receiver.
+
+### Current Status
+
+- Tx subsystem: simple tone generator (to be replaced by LFM pulse generator)
+- Rx subsystem: fully functional channelizer pipeline (PFB-based)
+- PL → PS interface: AXI4-Stream + DMA working
+- PS processing: frame-based algorithm (RMS + peak detection)
 
 ---
 
-## System Overview
+## System Architecture
 
-* The **TX subsystem** is currently used for test purposes:
-
- * Generates a **single-tone signal via NCO**
- * Future work: implement an **LFM pulse generator**
-
-* The **RX subsystem** (focus of this document):
-
- * Acquires RF data via ADC
- * Performs **channelization (PFB)**
- * Buffers and serializes data
- * Streams data to processor (PS) via AXI DMA
+ADC → Channelizer (PFB, 512 bins)
+→ FFT_Capture (frame control)
+→ FIFO Serializer (4 FIFOs → 1 stream)
+→ AXI4-Stream (uint64)
+→ DMA (S2MM)
+→ PS Memory
+→ Processor Algorithm (frame-based)
 
 ---
 
-## System Configuration
+## Key Parameters
 
-* ADC Sampling Rate: **4096 MSPS**
-* Decimation: **×8 → 512 MSPS effective bandwidth**
-* FPGA Fabric Clock: **128 MHz**
-* Samples per Clock: **4 complex samples**
+- ADC Sampling Rate: 4096 MSPS
+- Decimation: 8
+- Effective BW: 512 MHz
+- Channels (FFT size): 512
+- Samples per clock: 4
+- FPGA clock: 128 MHz
+- Frame size (PS): 512 samples
 
 ---
 
-## Channelizer (PFB)
+## DMA (PL → PS)
 
-* Type: Polyphase Filter Bank (PFB)
-* Number of Channels: **512**
-* Taps per Channel: **16**
-* Output per Clock:
+- Data type: uint64
+- Frame size: 512
+- Buffers: 16
+- Memory: PS DDR
 
- * `4 complex samples` (vectorized)
- * `valid`, `SOF`, `EOF`
-
-### Frame Structure
-
-* Total bins per frame: **512**
-* Samples per clock: **4**
-* → **128 clock cycles per frame**
-
-### Time-Multiplexed Output
-
-Each clock produces consecutive frequency bins:
-
-```id="y7l3sj"
-clk 0 → bins 0–3
-clk 1 → bins 4–7
-...
-clk 127 → bins 508–511
-```
+Each TLAST corresponds to one DMA frame.
 
 ---
 
-## Data Representation
+## Processor (PS)
 
-* Input to channelizer: **16-bit complex**
-* Output: **25-bit complex**
-* Per sample: **50 bits (Re + Im)**
-
-Data is later packed into **uint64** for AXI compatibility.
+- Event-driven execution (triggered by DMA)
+- No task queueing
+- Frames may be dropped if processing is slower than input rate
 
 ---
 
-## FIFO Architecture (Banked Design)
+## Data Path in PS
 
-### Structure
-
-* **4 independent FIFOs**
-* One FIFO per lane (sample index)
-
-```id="k6d9c7"
-Lane 0 → FIFO1
-Lane 1 → FIFO2
-Lane 2 → FIFO3
-Lane 3 → FIFO4
-```
-
-### Depth
-
-* Each FIFO depth: **128**
-* Total frame: **512 samples**
-* → 512 / 4 = 128 samples per FIFO
+- Stream Read → uint64[512]
+- Bit extraction → real/imag
+- Conversion → complex vector
+- Processing → RMS + peak detection
 
 ---
 
-## Why 4 FIFOs (Critical Design Choice)
+## Performance Notes
 
-### Hardware Constraint
-
-FPGA BRAM:
-
-* Max **2 ports**
-* Cannot support **4 simultaneous writes**
-
-### Input Requirement
-
-* 4 samples per clock → **4 writes per clock**
-
-### Solution
-
-→ **Banked memory (4 FIFOs)**
-
-Each FIFO:
-
-* 1 write per clock
-* Fully compatible with BRAM architecture
+- Bottleneck: unpacking + type conversion
+- PS cannot keep up with full-rate stream
+- Frames are skipped under load
 
 ---
 
-## Data Organization Across FIFOs
+## FrFT Integration Plan
 
-Each FIFO stores a decimated sequence of bins:
-
-```id="ehy2k6"
-FIFO1: bins 0, 4, 8, ...
-FIFO2: bins 1, 5, 9, ...
-FIFO3: bins 2, 6, 10, ...
-FIFO4: bins 3, 7, 11, ...
-```
+- Replace Processor Algorithm with FrFT
+- Keep all other components unchanged
+- Input: complex single [512x1]
+- Accept dropped frames initially
 
-This is a **lane-based de-interleaving** of the channelizer output.
 ---
 
-## Serialization (Parallel → Stream Conversion)
+## Roadmap
 
-### Input
-
-* 4 samples per clock (parallel)
-
-### Output
-
-* 1 sample per clock (AXI stream)
-
-### Mechanism
-
-A **FIFO Sequencer** performs round-robin reads:
-
-```id="b42g8p"
-Cycle 0 → FIFO1
-Cycle 1 → FIFO2
-Cycle 2 → FIFO3
-Cycle 3 → FIFO4
-(repeat)
-```
-
-### Result
-
-Reconstructed stream:
-
-```id="9f8x7x"
-0, 1, 2, 3, 4, 5, ..., 511
-```
+1. Functional FrFT (PS)
+2. Profiling
+3. NEON optimization
+4. Throughput tuning
+5. PL acceleration
 
 ---
 
-## Throughput Behavior
+## Key Takeaway
 
-* Write side: **4 samples/clk**
-* Read side: **1 sample/clk**
-
-Over 4 cycles:
-
-* 4 samples written → 4 samples read
-
-→ **No data loss (rate preserved over time)**
-
----
-
-## TLAST Handling (Frame Boundary)
-
-* TLAST is embedded in FIFO data:
-
- * LSB of FIFO word carries TLAST flag
-
-```id="l6g0vn"
-[Data (50 bits)] + [TLAST (1 bit)]
-```
-
-* Extracted after FIFO mux and sent to AXI
-
-### Behavior
-
-* TLAST asserted **once per frame**
-* Typically associated with final bin (e.g., bin 511)
-
----
-
-## AXI4-Stream Interface
-
-Output signals:
-
-* `tdata` → uint64 packed data
-* `tvalid` → data valid
-* `tready` → backpressure from DMA
-* `tlast` → frame boundary
-
-### Data Path
-
-```id="9m7iqk"
-PL → AXI4-Stream → AXI DMA (S2MM) → DDR → PS
-```
-
----
-
-## Backpressure Handling
-
-AXI backpressure (`tready`) propagates upstream:
-
-* If `tready = 0`:
-
- * FIFO reads pause
- * Data accumulates in FIFOs
-
-### Protection Mechanism
-
-* FIFO_Sequencer only reads when:
-
- * AXI is ready
- * Data available in all FIFOs
-
----
-
-## Triggered Capture Mechanism
-
-### Trigger Source
-
-* Software writes to register
-* Generates 1-cycle pulse (`TriggerCapture`)
-
-### FFT_Capture Behavior
-
-State machine:
-
-```id="27qf3x"
-IDLE → wait trigger
-ARMED → wait SOF
-CAPTURE → collect 128 cycles
-DONE → assert TLAST
-```
-
-### Key Property
-
-Capture is **frame-aligned** (starts at SOF)
-
----
-
-## Architectural Pattern
-
-The system implements:
-
-```id="2q6v6x"
-Parallel Stream (4 samples/clk)
- ↓
-Banked Memory (4 FIFOs)
- ↓
-Round-Robin Serialization
- ↓
-AXI Stream (1 sample/clk)
-```
-
-This is a standard FPGA pattern:
-
-> **Lane-based parallelism + memory banking + time-multiplexed output**
-
----
-
-## Notes for Future Work (FrFT Integration)
-
-### Recommended Insertion Points
-
-**Option A (Preferred):**
-
-```id="bb7jbp"
-FIFO output → FrFT → MUX → AXI
-```
-
-**Option B:**
-
-```id="o5o0qz"
-MUX → FrFT → AXI
-```
-
-### Avoid
-
-```id="q8k3yo"
-Before FIFOs (requires 4-sample parallel processing)
-```
-
----
-
-## Key Takeaways
-
-* Multiple FIFOs are required due to **memory port limitations**
-* Serialization is done via **deterministic round-robin scheduling**
-* AXI backpressure is safely absorbed using FIFO buffering
-* Frame integrity is guaranteed via **SOF-aligned capture + TLAST**
-* Architecture is scalable and suitable for further DSP insertion (e.g., FrFT)
-
----
+First make it work end-to-end, then make it fast.
\ No newline at end of file
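
The "Data Path in PS" steps listed in the new README (stream read, bit extraction, conversion to complex, RMS and peak detection) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the project's code: the patch only says that 25-bit real and imaginary parts (50 bits per sample) are packed into a uint64, so the bit layout used here (real part in bits 0 to 24, imaginary part in bits 25 to 49, both two's complement) is assumed, and the helper names `sign_extend`, `pack_word`, `unpack_word` and `process_frame` are hypothetical.

```python
import math

BITS = 25                 # assumed width of each real/imag field
MASK = (1 << BITS) - 1


def sign_extend(value, bits=BITS):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pack_word(re, im):
    """Build a 64-bit word in the assumed layout (used only to make test data)."""
    return (re & MASK) | ((im & MASK) << BITS)


def unpack_word(word):
    """Bit extraction: split one uint64 word into a complex sample."""
    return complex(sign_extend(word), sign_extend(word >> BITS))


def process_frame(words):
    """Return (rms, peak_index, peak_magnitude) for one frame of packed words."""
    samples = [unpack_word(w) for w in words]
    mags = [abs(s) for s in samples]
    rms = math.sqrt(sum(m * m for m in mags) / len(mags))
    peak_index = max(range(len(mags)), key=mags.__getitem__)
    return rms, peak_index, mags[peak_index]


# Example: a 512-word frame that is zero everywhere except bin 100.
frame = [pack_word(0, 0)] * 512
frame[100] = pack_word(3, -4)
rms, peak_index, peak_mag = process_frame(frame)
```

Unpacking every word in an interpreted loop like this is the kind of per-sample work the "Performance Notes" section names as the bottleneck, which is why the roadmap defers optimization until after a functional version exists.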