Updated README

2026-03-30 09:30:16 -03:00
parent 10644b0475
commit 30b31509c1
1 changed files with 59 additions and 270 deletions
--- a/README.md
+++ b/README.md
@@ -1,308 +1,97 @@
-# RFSoC Channelizer RX Architecture (ZCU111)
+# 📡 RFSoC Channelizer + PS Processing (R-ESM Prototype)

-> **Project Context:** This design is part of a prototype **R-ESM (Radar Electronic Support Measures) receiver**, implemented on the ZCU111 RFSoC platform.
-> The project was initiated from the RFSoC reference template provided by MATLAB/Simulink SoC Blockset and is being incrementally analyzed and modified.
+## Overview
+
+This project builds on the MATLAB/Simulink SoC Blockset RFSoC reference design, adapted as a prototype Radar Electronic Support Measures (R-ESM) receiver.
+
+### Current Status
+
+- Tx subsystem: simple tone generator (to be replaced by an LFM pulse generator)
+- Rx subsystem: fully functional channelizer pipeline (PFB-based)
+- PL → PS interface: AXI4-Stream + DMA working
+- PS processing: frame-based algorithm (RMS + peak detection)

 ---

-## System Overview
+## System Architecture

-* The **TX subsystem** is currently used for test purposes:
-
-  * Generates a **single-tone signal via NCO**
-  * Future work: implement an **LFM pulse generator**
-
-* The **RX subsystem** (focus of this document):
-
-  * Acquires RF data via ADC
-  * Performs **channelization (PFB)**
-  * Buffers and serializes data
-  * Streams data to processor (PS) via AXI DMA
+ADC → Channelizer (PFB, 512 bins)
+→ FFT_Capture (frame control)
+→ FIFO Serializer (4 FIFOs → 1 stream)
+→ AXI4-Stream (uint64)
+→ DMA (S2MM)
+→ PS Memory
+→ Processor Algorithm (frame-based)

 ---

-## System Configuration
+## Key Parameters

-* ADC Sampling Rate: **4096 MSPS**
-* Decimation: **×8 → 512 MSPS effective bandwidth**
-* FPGA Fabric Clock: **128 MHz**
-* Samples per Clock: **4 complex samples**
+- ADC Sampling Rate: 4096 MSPS
+- Decimation: 8
+- Effective BW: 512 MHz
+- Channels (FFT size): 512
+- Samples per clock: 4
+- FPGA clock: 128 MHz
+- Frame size (PS): 512 samples

 ---

-## Channelizer (PFB)
+## DMA (PL → PS)

-* Type: Polyphase Filter Bank (PFB)
-* Number of Channels: **512**
-* Taps per Channel: **16**
-* Output per Clock:
+- Data type: uint64
+- Frame size: 512
+- Buffers: 16
+- Memory: PS DDR

-  * `4 complex samples` (vectorized)
-  * `valid`, `SOF`, `EOF`
-
-### Frame Structure
-
-* Total bins per frame: **512**
-* Samples per clock: **4**
-* → **128 clock cycles per frame**
-
-### Time-Multiplexed Output
-
-Each clock produces consecutive frequency bins:
-
-```id="y7l3sj"
-clk 0 → bins 0–3
-clk 1 → bins 4–7
-...
-clk 127 → bins 508–511
-```
+Each TLAST assertion marks the end of one DMA frame.

 ---

-## Data Representation
+## Processor (PS)

-* Input to channelizer: **16-bit complex**
-* Output: **25-bit complex**
-* Per sample: **50 bits (Re + Im)**
-
-Data is later packed into **uint64** for AXI compatibility.
+- Event-driven execution (triggered by DMA)
+- No task queueing
+- Frames may be dropped if processing is slower than the input rate

 ---

-## FIFO Architecture (Banked Design)
+## Data Path in PS

-### Structure
-
-* **4 independent FIFOs**
-* One FIFO per lane (sample index)
-
-```id="k6d9c7"
-Lane 0 → FIFO1
-Lane 1 → FIFO2
-Lane 2 → FIFO3
-Lane 3 → FIFO4
-```
-
-### Depth
-
-* Each FIFO depth: **128**
-* Total frame: **512 samples**
-* → 512 / 4 = 128 samples per FIFO
+- Stream Read → uint64[512]
+- Bit extraction → real/imag
+- Conversion → complex vector
+- Processing → RMS + peak detection

 ---

-## Why 4 FIFOs (Critical Design Choice)
+## Performance Notes

-### Hardware Constraint
-
-FPGA BRAM:
-
-* Max **2 ports**
-* Cannot support **4 simultaneous writes**
-
-### Input Requirement
-
-* 4 samples per clock → **4 writes per clock**
-
-### Solution
-
-→ **Banked memory (4 FIFOs)**
-
-Each FIFO:
-
-* 1 write per clock
-* Fully compatible with BRAM architecture
+- Bottleneck: unpacking + type conversion
+- The PS cannot keep up with the full-rate stream
+- Frames are skipped under load

 ---

-## Data Organization Across FIFOs
+## FrFT Integration Plan

-Each FIFO stores a decimated sequence of bins:
-
-```id="ehy2k6"
-FIFO1: bins 0, 4, 8, ...
-FIFO2: bins 1, 5, 9, ...
-FIFO3: bins 2, 6, 10, ...
-FIFO4: bins 3, 7, 11, ...
-```
-
-This is a **lane-based de-interleaving** of the channelizer output.
+- Replace Processor Algorithm with FrFT
+- Keep all other components unchanged
+- Input: complex single [512x1]
+- Accept dropped frames initially

 ---

-## Serialization (Parallel → Stream Conversion)
+## Roadmap

-### Input
-
-* 4 samples per clock (parallel)
-
-### Output
-
-* 1 sample per clock (AXI stream)
-
-### Mechanism
-
-A **FIFO Sequencer** performs round-robin reads:
-
-```id="b42g8p"
-Cycle 0 → FIFO1
-Cycle 1 → FIFO2
-Cycle 2 → FIFO3
-Cycle 3 → FIFO4
-(repeat)
-```
-
-### Result
-
-Reconstructed stream:
-
-```id="9f8x7x"
-0, 1, 2, 3, 4, 5, ..., 511
-```
+1. Functional FrFT (PS)
+2. Profiling
+3. NEON optimization
+4. Throughput tuning
+5. PL acceleration

 ---

-## Throughput Behavior
+## Key Takeaway

-* Write side: **4 samples/clk**
-* Read side: **1 sample/clk**
-
-Over 4 cycles:
-
-* 4 samples written → 4 samples read
-
-→ **No data loss (rate preserved over time)**
-
---
-
-## TLAST Handling (Frame Boundary)
-
-* TLAST is embedded in FIFO data:
-
-  * LSB of FIFO word carries TLAST flag
-
-```id="l6g0vn"
-[Data (50 bits)] + [TLAST (1 bit)]
-```
-
-* Extracted after FIFO mux and sent to AXI
-
-### Behavior
-
-* TLAST asserted **once per frame**
-* Typically associated with final bin (e.g., bin 511)
-
---
-
-## AXI4-Stream Interface
-
-Output signals:
-
-* `tdata` → uint64 packed data
-* `tvalid` → data valid
-* `tready` → backpressure from DMA
-* `tlast` → frame boundary
-
-### Data Path
-
-```id="9m7iqk"
-PL → AXI4-Stream → AXI DMA (S2MM) → DDR → PS
-```
-
---
-
-## Backpressure Handling
-
-AXI backpressure (`tready`) propagates upstream:
-
-* If `tready = 0`:
-
-  * FIFO reads pause
-  * Data accumulates in FIFOs
-
-### Protection Mechanism
-
-* FIFO_Sequencer only reads when:
-
-  * AXI is ready
-  * Data available in all FIFOs
-
---
-
-## Triggered Capture Mechanism
-
-### Trigger Source
-
-* Software writes to register
-* Generates 1-cycle pulse (`TriggerCapture`)
-
-### FFT_Capture Behavior
-
-State machine:
-
-```id="27qf3x"
-IDLE → wait trigger
-ARMED → wait SOF
-CAPTURE → collect 128 cycles
-DONE → assert TLAST
-```
-
-### Key Property
-
-Capture is **frame-aligned** (starts at SOF)
-
---
-
-## Architectural Pattern
-
-The system implements:
-
-```id="2q6v6x"
-Parallel Stream (4 samples/clk)
-        ↓
-Banked Memory (4 FIFOs)
-        ↓
-Round-Robin Serialization
-        ↓
-AXI Stream (1 sample/clk)
-```
-
-This is a standard FPGA pattern:
-
-> **Lane-based parallelism + memory banking + time-multiplexed output**
-
---
-
-## Notes for Future Work (FrFT Integration)
-
-### Recommended Insertion Points
-
-**Option A (Preferred):**
-
-```id="bb7jbp"
-FIFO output → FrFT → MUX → AXI
-```
-
-**Option B:**
-
-```id="o5o0qz"
-MUX → FrFT → AXI
-```
-
-### Avoid
-
-```id="q8k3yo"
-Before FIFOs (requires 4-sample parallel processing)
-```
-
---
-
-## Key Takeaways
-
-* Multiple FIFOs are required due to **memory port limitations**
-* Serialization is done via **deterministic round-robin scheduling**
-* AXI backpressure is safely absorbed using FIFO buffering
-* Frame integrity is guaranteed via **SOF-aligned capture + TLAST**
-* Architecture is scalable and suitable for further DSP insertion (e.g., FrFT)
-
---
+First make it work end-to-end, then make it fast.
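
The "Data Path in PS" steps added by this commit (stream read, bit extraction, conversion to a complex vector, RMS + peak detection) can be sketched in plain Python as below. This is only an illustration under stated assumptions, not the project's implementation: the README does not say how the two 25-bit components are placed inside each uint64 word, so the layout used here (real part in bits 0..24, imaginary part in bits 25..49, two's complement) is a guess, and the helper names `sign_extend`, `unpack_word`, `pack_word`, and `process_frame` are hypothetical. Peak detection is taken to mean "bin with the largest magnitude", which is also an assumption.

```python
import math

SAMPLE_BITS = 25   # per-component width ("25-bit complex" in the previous README)
FRAME_SIZE = 512   # uint64 words per DMA frame

# ASSUMED layout (not confirmed by the README):
#   bits  0..24  real part, two's complement
#   bits 25..49  imaginary part, two's complement
#   bits 50..63  unused


def sign_extend(value, bits=SAMPLE_BITS):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def unpack_word(word):
    """Bit extraction: one uint64 word -> one complex sample."""
    re = sign_extend(word)
    im = sign_extend(word >> SAMPLE_BITS)
    return complex(re, im)


def pack_word(re, im):
    """Inverse of unpack_word; only used here to build test frames."""
    mask = (1 << SAMPLE_BITS) - 1
    return (re & mask) | ((im & mask) << SAMPLE_BITS)


def process_frame(words):
    """Return (rms, peak_bin, peak_magnitude) for one 512-word frame."""
    if len(words) != FRAME_SIZE:
        raise ValueError(f"expected {FRAME_SIZE} words, got {len(words)}")
    samples = [unpack_word(w) for w in words]
    power = [s.real * s.real + s.imag * s.imag for s in samples]
    rms = math.sqrt(sum(power) / len(power))
    # Assumed peak criterion: the bin with the largest magnitude.
    peak_bin = max(range(len(power)), key=power.__getitem__)
    return rms, peak_bin, math.sqrt(power[peak_bin])


if __name__ == "__main__":
    # Synthetic frame: a single non-zero bin, all other bins zero.
    frame = [0] * FRAME_SIZE
    frame[37] = pack_word(300, -400)
    rms, peak_bin, peak_mag = process_frame(frame)
    print(f"rms={rms:.3f} peak_bin={peak_bin} peak_mag={peak_mag:.1f}")
```

A per-sample Python loop like this is far too slow for the real stream; it only pins down the arithmetic. The README's performance notes name unpacking and type conversion as the bottleneck, and this is the step where vectorized shifts and masks (or the NEON work on the roadmap) would apply.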