Updated README

2026-03-30 09:30:16 -03:00
parent 10644b0475
commit 30b31509c1
1 changed files with 59 additions and 270 deletions
--- a/README.md
+++ b/README.md
@@ -1,308 +1,97 @@
-# RFSoC Channelizer RX Architecture (ZCU111)
+# 📡 RFSoC Channelizer + PS Processing (R-ESM Prototype)
-> **Project Context:** This design is part of a prototype **R-ESM (Radar Electronic Support Measures) receiver**, implemented on the ZCU111 RFSoC platform.
+## Overview
-> The project was initiated from the RFSoC reference template provided by MATLAB/Simulink SoC Blockset and is being incrementally analyzed and modified.
+
 This project is based on the RFSoC SoC Blockset reference design, adapted as a prototype for a Radar Electronic Support Measures (R-ESM) receiver.
 ### Current Status
 - Tx subsystem: simple tone generator (to be replaced by LFM pulse generator)
 - Rx subsystem: fully functional channelizer pipeline (PFB-based)
 - PL → PS interface: AXI4-Stream + DMA working
 - PS processing: frame-based algorithm (RMS + peak detection)
 ---
-## System Overview
+## System Architecture
-* The **TX subsystem** is currently used for test purposes:
+ADC → Channelizer (PFB, 512 bins)
-
+→ FFT_Capture (frame control)
-  * Generates a **single-tone signal via NCO**
+→ FIFO Serializer (4 FIFOs → 1 stream)
-  * Future work: implement an **LFM pulse generator**
+→ AXI4-Stream (uint64)
-
+→ DMA (S2MM)
-* The **RX subsystem** (focus of this document):
+→ PS Memory
-
+→ Processor Algorithm (frame-based)
  * Acquires RF data via ADC
  * Performs **channelization (PFB)**
  * Buffers and serializes data
  * Streams data to processor (PS) via AXI DMA
 ---
-## System Configuration
+## Key Parameters
-* ADC Sampling Rate: **4096 MSPS**
+- ADC Sampling Rate: 4096 MSPS
-* Decimation: **×8 → 512 MSPS effective bandwidth**
+- Decimation: 8
-* FPGA Fabric Clock: **128 MHz**
+- Effective BW: 512 MHz
-* Samples per Clock: **4 complex samples**
+- Channels (FFT size): 512
 - Samples per clock: 4
 - FPGA clock: 128 MHz
 - Frame size (PS): 512 samples
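The parameters above are mutually consistent, which a short calculation can confirm. The sketch below uses only the values listed in this README; the channel spacing additionally assumes a critically sampled channelizer, which the README does not state.

```python
# Values taken from the parameter list above.
ADC_RATE_MSPS = 4096
DECIMATION = 8
SAMPLES_PER_CLOCK = 4
NUM_CHANNELS = 512

# Complex sample rate after decimation (MSPS).
effective_rate_msps = ADC_RATE_MSPS / DECIMATION

# Fabric clock needed to carry that rate at 4 samples per clock (MHz).
fabric_clock_mhz = effective_rate_msps / SAMPLES_PER_CLOCK

# Clock cycles needed to emit one 512-bin frame.
cycles_per_frame = NUM_CHANNELS // SAMPLES_PER_CLOCK

# Assumption: critically sampled PFB, so spacing = rate / channel count (MHz).
channel_spacing_mhz = effective_rate_msps / NUM_CHANNELS

# Duration of one frame on the fabric clock (microseconds).
frame_period_us = cycles_per_frame / fabric_clock_mhz

print(effective_rate_msps, fabric_clock_mhz, cycles_per_frame)
print(channel_spacing_mhz, frame_period_us)
```

Under these assumptions the PL emits one 512-bin frame per 128 fabric clock cycles, which is the rate the PS would have to sustain to avoid dropping frames.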
 ---
-## Channelizer (PFB)
+## DMA (PL → PS)
-* Type: Polyphase Filter Bank (PFB)
+- Data type: uint64
-* Number of Channels: **512**
+- Frame size: 512
-* Taps per Channel: **16**
+- Buffers: 16
-* Output per Clock:
+- Memory: PS DDR
-  * `4 complex samples` (vectorized)
+Each TLAST corresponds to one DMA frame.
  * `valid`, `SOF`, `EOF`
 ### Frame Structure
 * Total bins per frame: **512**
 * Samples per clock: **4**
 * → **128 clock cycles per frame**
 ### Time-Multiplexed Output
 Each clock produces consecutive frequency bins:
 ```
 clk 0 → bins 0–3
 clk 1 → bins 4–7
 ...
 clk 127 → bins 508–511
 ```
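The clock-to-bin mapping listed above can be written as a small helper. This is an illustrative sketch; `bins_for_clock` is a hypothetical name and not part of the design.

```python
SAMPLES_PER_CLOCK = 4
NUM_BINS = 512
CYCLES_PER_FRAME = NUM_BINS // SAMPLES_PER_CLOCK  # 128


def bins_for_clock(clk: int) -> list[int]:
    """Return the frequency-bin indices emitted on cycle `clk` of a frame."""
    if not 0 <= clk < CYCLES_PER_FRAME:
        raise ValueError(f"clk must be in 0..{CYCLES_PER_FRAME - 1}")
    first = clk * SAMPLES_PER_CLOCK
    return list(range(first, first + SAMPLES_PER_CLOCK))


print(bins_for_clock(0))
print(bins_for_clock(127))
```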
 ---
-## Data Representation
+## Processor (PS)
-* Input to channelizer: **16-bit complex**
+- Event-driven execution (triggered by DMA)
-* Output: **25-bit complex**
+- No task queueing
-* Per sample: **50 bits (Re + Im)**
+- Frames may be dropped if processing is slower than input rate
 Data is later packed into **uint64** for AXI compatibility.
 ---
-## FIFO Architecture (Banked Design)
+## Data Path in PS
-### Structure
+- Stream Read → uint64[512]
-
+- Bit extraction → real/imag
-* **4 independent FIFOs**
+- Conversion → complex vector
-* One FIFO per lane (sample index)
+- Processing → RMS + peak detection
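The PS data path steps listed above (bit extraction, conversion, RMS + peak detection) can be sketched in plain Python. The bit layout inside each uint64 is an assumption made for illustration (real part in bits [24:0], imaginary part in bits [49:25], two's complement); the README only states that each component is 25 bits wide. The names `unpack_word` and `process_frame` are hypothetical, and samples are treated as raw integers because the fixed-point scaling is not given.

```python
import math

SAMPLE_BITS = 25  # width of each real/imag component (from the README)
MASK = (1 << SAMPLE_BITS) - 1


def sign_extend(value: int, bits: int = SAMPLE_BITS) -> int:
    """Interpret the low `bits` of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def unpack_word(word: int) -> complex:
    """Assumed layout: real in bits [24:0], imag in bits [49:25]."""
    re = sign_extend(word & MASK)
    im = sign_extend((word >> SAMPLE_BITS) & MASK)
    return complex(re, im)


def process_frame(words):
    """Return (rms, peak_bin, peak_magnitude) for one frame of packed words."""
    mags = [abs(unpack_word(w)) for w in words]
    rms = math.sqrt(sum(m * m for m in mags) / len(mags))
    peak_bin = max(range(len(mags)), key=mags.__getitem__)
    return rms, peak_bin, mags[peak_bin]


# Example frame: all zeros except bin 100, which holds 3 + 4j.
frame = [0] * 512
frame[100] = (4 << SAMPLE_BITS) | 3
print(process_frame(frame))
```

A per-sample Python loop like this is far too slow for the real stream; it only makes the unpacking arithmetic explicit.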
 ```
 Lane 0 → FIFO1
 Lane 1 → FIFO2
 Lane 2 → FIFO3
 Lane 3 → FIFO4
 ```
 ### Depth
 * Each FIFO depth: **128**
 * Total frame: **512 samples**
 * → 512 / 4 = 128 samples per FIFO
 ---
-## Why 4 FIFOs (Critical Design Choice)
+## Performance Notes
-### Hardware Constraint
+- Bottleneck: unpacking + type conversion
-
+- PS cannot keep up with full-rate stream
-FPGA BRAM:
+- Frames are skipped under load
 * Max **2 ports**
 * Cannot support **4 simultaneous writes**
 ### Input Requirement
 * 4 samples per clock → **4 writes per clock**
 ### Solution
 → **Banked memory (4 FIFOs)**
 Each FIFO:
 * 1 write per clock
 * Fully compatible with BRAM architecture
 ---
-## Data Organization Across FIFOs
+## FrFT Integration Plan
-Each FIFO stores a decimated sequence of bins:
+- Replace Processor Algorithm with FrFT
-
+- Keep all other components unchanged
-```
+- Input: complex single [512x1]
-FIFO1: bins 0, 4, 8, ...
+- Accept dropped frames initially
 FIFO2: bins 1, 5, 9, ...
 FIFO3: bins 2, 6, 10, ...
 FIFO4: bins 3, 7, 11, ...
 ```
 This is a **lane-based de-interleaving** of the channelizer output.
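As a software model of this de-interleaving, a strided slice per lane reproduces the FIFO contents listed above. The function name `split_into_lanes` is hypothetical.

```python
NUM_LANES = 4


def split_into_lanes(frame):
    """De-interleave a frame so that lane k holds bins k, k+4, k+8, ..."""
    return [frame[lane::NUM_LANES] for lane in range(NUM_LANES)]


fifos = split_into_lanes(list(range(512)))
for index, contents in enumerate(fifos, start=1):
    print(f"FIFO{index}:", contents[:3], "...", "depth", len(contents))
```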
 ---
-## Serialization (Parallel → Stream Conversion)
+## Roadmap
-### Input
+1. Functional FrFT (PS)
-
+2. Profiling
-* 4 samples per clock (parallel)
+3. NEON optimization
-
+4. Throughput tuning
-### Output
+5. PL acceleration
 * 1 sample per clock (AXI stream)
 ### Mechanism
 A **FIFO Sequencer** performs round-robin reads:
 ```
 Cycle 0 → FIFO1
 Cycle 1 → FIFO2
 Cycle 2 → FIFO3
 Cycle 3 → FIFO4
 (repeat)
 ```
 ### Result
 Reconstructed stream:
 ```
 0, 1, 2, 3, 4, 5, ..., 511
 ```
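The round-robin read order can be checked with a short model that drains four banked queues in rotation and confirms the original bin order is recovered. This is a behavioral sketch only; `serialize_round_robin` is a hypothetical name and no handshaking is modeled.

```python
from collections import deque


def serialize_round_robin(fifos):
    """Yield samples by reading one FIFO per cycle in a fixed rotation."""
    queues = [deque(f) for f in fifos]
    lane = 0
    while queues[lane]:
        yield queues[lane].popleft()
        lane = (lane + 1) % len(queues)


# Banked contents as described earlier: FIFO k holds bins k, k+4, k+8, ...
banked = [list(range(lane, 512, 4)) for lane in range(4)]
stream = list(serialize_round_robin(banked))
print(stream[:8], "...", stream[-1])
```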
 ---
-## Throughput Behavior
+## Key Takeaway
-* Write side: **4 samples/clk**
+First make it work end-to-end, then make it fast.
 * Read side: **1 sample/clk**
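 Per triggered capture:
 * 512 samples are written in 128 cycles and read out over 512 cycles
 → **No data loss within a captured frame** (each FIFO is deep enough for its 128 samples), although continuous full-rate streaming cannot be sustained at 1 sample/clk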
 ---
 ## TLAST Handling (Frame Boundary)
 * TLAST is embedded in FIFO data:
  * LSB of FIFO word carries TLAST flag
 ```
 [Data (50 bits)] + [TLAST (1 bit)]
 ```
 * Extracted after FIFO mux and sent to AXI
 ### Behavior
 * TLAST asserted **once per frame**
 * Typically associated with final bin (e.g., bin 511)
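The TLAST embedding described above amounts to a one-bit shift. The sketch below treats the 50-bit sample as an opaque payload and places TLAST in the LSB, as stated; the function names are hypothetical.

```python
DATA_BITS = 50  # 25-bit real + 25-bit imaginary


def pack_fifo_word(data: int, tlast: bool) -> int:
    """Build a 51-bit FIFO word: payload in the upper bits, TLAST in the LSB."""
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("data must fit in 50 bits")
    return (data << 1) | int(tlast)


def unpack_fifo_word(word: int):
    """Split a FIFO word back into (data, tlast)."""
    return word >> 1, bool(word & 1)


# One frame: TLAST is set only on the final bin (index 511).
frame_words = [pack_fifo_word(i, i == 511) for i in range(512)]
print(unpack_fifo_word(frame_words[0]), unpack_fifo_word(frame_words[511]))
```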
 ---
 ## AXI4-Stream Interface
 Output signals:
 * `tdata` → uint64 packed data
 * `tvalid` → data valid
 * `tready` → backpressure from DMA
 * `tlast` → frame boundary
 ### Data Path
 ```
 PL → AXI4-Stream → AXI DMA (S2MM) → DDR → PS
 ```
 ---
 ## Backpressure Handling
 AXI backpressure (`tready`) propagates upstream:
 * If `tready = 0`:
  * FIFO reads pause
  * Data accumulates in FIFOs
 ### Protection Mechanism
 * FIFO_Sequencer only reads when:
  * AXI is ready
  * Data available in all FIFOs
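The read-gating rule can be illustrated with a cycle-by-cycle model. The README does not say exactly when the "data available in all FIFOs" check is evaluated; the sketch below assumes it gates the start of each 4-read group, and that reads within a group only need `tready` and a non-empty current FIFO. `run_sequencer` is a hypothetical name.

```python
from collections import deque


def run_sequencer(fifos, tready_pattern):
    """Simulate one read decision per clock; return the samples sent to AXI."""
    queues = [deque(f) for f in fifos]
    lane = 0
    sent = []
    for tready in tready_pattern:
        # Assumption: the all-FIFOs check applies when a new group starts
        # (lane 0); mid-group, only the current FIFO must hold data.
        data_ok = all(queues) if lane == 0 else bool(queues[lane])
        if tready and data_ok:
            sent.append(queues[lane].popleft())
            lane = (lane + 1) % len(queues)
        # Otherwise the read pauses and data simply stays in the FIFOs.
    return sent


fifos = [[0, 4], [1, 5], [2, 6], [3, 7]]
tready = [1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1]
print(run_sequencer(fifos, tready))
```

Stalled cycles delay the stream but, in this model, never drop or reorder samples.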
 ---
 ## Triggered Capture Mechanism
 ### Trigger Source
 * Software writes to register
 * Generates 1-cycle pulse (`TriggerCapture`)
 ### FFT_Capture Behavior
 State machine:
 ```
 IDLE → wait trigger
 ARMED → wait SOF
 CAPTURE → collect 128 cycles
 DONE → assert TLAST
 ```
 ### Key Property
 Capture is **frame-aligned** (starts at SOF)
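A behavioral model of the capture state machine makes the frame alignment explicit. The exact cycle on which TLAST is asserted is not specified here; this sketch assumes it coincides with the 128th captured cycle, after which the machine passes through DONE and returns to IDLE. All names are hypothetical.

```python
from enum import Enum, auto

CYCLES_PER_FRAME = 128


class State(Enum):
    IDLE = auto()
    ARMED = auto()
    CAPTURE = auto()
    DONE = auto()


def simulate_capture(trigger, sof):
    """Return per-clock (capture_enable, tlast) lists for the given inputs."""
    state, count = State.IDLE, 0
    capture, tlast = [], []
    for trig, start in zip(trigger, sof):
        cap = last = False
        if state is State.IDLE:
            if trig:
                state = State.ARMED          # wait for the next SOF
        elif state is State.ARMED:
            if start:
                cap, count = True, 1         # first captured cycle is the SOF cycle
                state = State.CAPTURE
        elif state is State.CAPTURE:
            cap = True
            count += 1
            if count == CYCLES_PER_FRAME:
                last = True                  # assumed: TLAST on the final cycle
                state = State.DONE
        else:                                # DONE: re-arm requires a new trigger
            state = State.IDLE
        capture.append(cap)
        tlast.append(last)
    return capture, tlast


n = 400
sof = [i % CYCLES_PER_FRAME == 0 for i in range(n)]
trigger = [i == 50 for i in range(n)]
capture, tlast = simulate_capture(trigger, sof)
print(capture.index(True), sum(capture), tlast.index(True))
```

A trigger arriving mid-frame (cycle 50 here) does not start capture immediately; the machine waits for the next SOF, so every captured frame starts at bin 0.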
 ---
 ## Architectural Pattern
 The system implements:
 ```
 Parallel Stream (4 samples/clk)
        ↓
 Banked Memory (4 FIFOs)
        ↓
 Round-Robin Serialization
        ↓
 AXI Stream (1 sample/clk)
 ```
 This is a standard FPGA pattern:
 > **Lane-based parallelism + memory banking + time-multiplexed output**
 ---
 ## Notes for Future Work (FrFT Integration)
 ### Recommended Insertion Points
 **Option A (Preferred):**
 ```
 FIFO output → FrFT → MUX → AXI
 ```
 **Option B:**
 ```
 MUX → FrFT → AXI
 ```
 ### Avoid
 ```
 Before FIFOs (requires 4-sample parallel processing)
 ```
 ---
 ## Key Takeaways
 * Multiple FIFOs are required due to **memory port limitations**
 * Serialization is done via **deterministic round-robin scheduling**
 * AXI backpressure is safely absorbed using FIFO buffering
 * Frame integrity is guaranteed via **SOF-aligned capture + TLAST**
 * Architecture is scalable and suitable for further DSP insertion (e.g., FrFT)
 ---