docs: update documentation for capture redesign and validation

canisio
2026-04-29 10:03:34 -03:00
parent b3ba729f8b
commit 19b0513809
3 changed files with 160 additions and 123 deletions


@@ -11,22 +11,35 @@ The system implements a high-throughput signal chain in the FPGA (PL) and perfor
## Current Status ## Current Status
- Tx subsystem: LFM pulse generator (DDS-based, complex output) - Tx subsystem: LFM pulse generator (DDS-based, complex output)
- Rx subsystem: fully functional channelizer pipeline (PFB-based) - Rx subsystem: fully functional channelizer pipeline (PFB-based) or bypass
- PL → PS interface: AXI4-Stream + DMA operational - PL → PS interface: AXI4-Stream + DMA operational
- PS processing: frame-based algorithm (RMS + peak detection) - PS processing: frame-based algorithm on a Data Process Window (DPW)
--- ---
## System Architecture ## System Architecture
ADC → Channelizer (PFB, 512 bins) Tx (PL)
FFT_Capture (frame control) Waveform Generator (LFM / CW / Pulsed)
FIFO Serializer (4 FIFOs → 1 stream) DAC
AXI4-Stream (uint64) RF Loopback / Input
Rx (PL)
→ ADC
→ Channelizer (PFB, 512 bins) / Bypass / Counter
→ Capture (frame control)
→ AXI4-Stream (128-bit, 4 samples/clock)
→ DMA (S2MM) → DMA (S2MM)
→ PS Memory → PS Memory
→ Processor Algorithm → Processor Algorithm
Post Processing (PS)
→ Triggered Capture
→ Sample Unpacking (I/Q)
→ Data Reshaping → [FrameSize x nFrames x nTriggers]
→ Host Communication / Processing / Visualization
→ One DPW is a window of FrameSize x nFrames samples
--- ---
## Key Parameters ## Key Parameters


@@ -6,11 +6,9 @@
## Overview ## Overview
The Rx subsystem implements a **polyphase filter bank (PFB) channelizer** followed by FFT processing. The Rx subsystem implements a **polyphase filter bank (PFB) channelizer** followed by FFT processing, a **bypass path**, and a **multi-frame capture pipeline**.
It converts wideband ADC input into frequency-domain channels and streams the result to the PS. It converts wideband ADC input into frequency-domain channels (or raw samples via bypass) and streams the result to the PS.
A **bypass path** is also available for raw data inspection and debugging.
--- ---
@@ -24,11 +22,9 @@ PFB Channelizer (Decimation + Filtering)
FFT (512 bins) FFT (512 bins)
FFT Capture Capture (frame control)
FIFO Serializer (4 → 1) AXI4-Stream (128-bit, 4 samples/clock)
AXI4-Stream
DMA DMA
@@ -40,45 +36,26 @@ ADC
Bypass Path Bypass Path
FIFO / Serializer Capture (frame control)
AXI4-Stream AXI4-Stream (128-bit, 4 samples/clock)
DMA DMA
--- ---
## Bypass Functionality ## Capture Pipeline
The bypass allows direct observation of the input signal without channelization. - Multi-frame acquisition (configurable nFrames)
- Frame size: 512 samples
### Purpose - Supports asynchronous capture start (not frame-aligned)
- TLAST asserted at frame boundaries
- Debugging and validation
- Access to raw ADC-domain data
- Comparison with channelized output
- Verification of downstream processing
---
### Behavior ### Behavior
- Input data is routed directly to output - First frame may be partial
- No filtering or FFT applied - Frames may contain ≤ 2 frame indices (expected)
- Maintains same output interface (AXI4-Stream) - DPW spans nFrames frames but covers nFrames + 1 frame regions
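
The "nFrames + 1 frame regions" behavior can be illustrated with a little index arithmetic. This is only a sketch: `frame_regions_touched` and `start_offset` are hypothetical names, and it assumes the capture start is expressed as a sample offset inside the source frame that is running when capture begins.

```python
def frame_regions_touched(start_offset, n_frames, frame_size=512):
    """Count how many source frame regions one DPW overlaps.

    start_offset is the (assumed) sample position inside the current
    source frame at which the asynchronous capture begins.
    """
    if not 0 <= start_offset < frame_size:
        raise ValueError("start_offset must lie inside one frame")
    total_samples = n_frames * frame_size
    first_region = start_offset // frame_size                    # always 0 here
    last_region = (start_offset + total_samples - 1) // frame_size
    return last_region - first_region + 1


aligned = frame_regions_touched(0, n_frames=8)      # capture starts on a frame boundary
unaligned = frame_regions_touched(100, n_frames=8)  # capture starts mid-frame
```

With an aligned start the DPW overlaps exactly `n_frames` regions. Any non-zero offset makes the first region partial and pushes the tail of the DPW into one extra region, which is also why a single captured frame can carry two frame indices.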
---
### Selection Mechanism
A selector signal chooses between:
- Channelizer output (normal operation)
- Bypass output (raw data)
Implementation typically uses:
- Parallel paths
- Output switching logic
--- ---
@@ -86,22 +63,19 @@ Implementation typically uses:
### ADC Input ### ADC Input
- Sampling rate: 4096 MSPS - Sampling rate: 4096 MSPS
- Data type: **fixdt(1,16,15)** (Q1.15 format) - Data type: **fixdt(1,16,15)** (Q1.15)
### PFB Channelizer ### PFB Channelizer
- Decimation: 8 - Decimation: 8
- Effective bandwidth: 512 MHz - Effective bandwidth: 512 MHz
- Input and internal scaling aligned to Q1.15 domain
### FFT ### FFT
- Size: 512 - Size: 512
- Produces frequency bins - Produces frequency bins
### FFT Capture ### Capture
- Controls frame boundaries - Defines frame boundaries (512 samples)
- Generates TLAST
### FIFO Serializer
- Converts parallel streams into single stream
--- ---
@@ -109,62 +83,57 @@ Implementation typically uses:
### System Standardization ### System Standardization
The signal chain was standardized to a **Q1.15 fixed-point format (fixdt(1,16,15))**: - End-to-end Q1.15 (**fixdt(1,16,15)**)
- DAC output uses Q1.15
- ADC input is reinterpreted as Q1.15 (Same Stored Integer)
- Channelizer input operates in this normalized domain
---
### Channelizer Output Scaling ### Channelizer Output Scaling
- Native channelizer output: **sFix25_En23** - Native: **sFix25_En23**
- Rescaled and quantized to: **fixdt(1,16,15)** - Quantized to: **fixdt(1,16,15)** (round + saturate)
This conversion:
- Preserves signal dynamic range
- Maximizes fractional precision
- Uses rounding and saturation
- Aligns with system-wide numeric format
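
As a rough illustration of the sFix25_En23 to fixdt(1,16,15) conversion, the sketch below works on stored integers. The function name is hypothetical, and the rounding mode (round to nearest, ties toward positive infinity) is an assumption; the source only states "round + saturate".

```python
EN23_FRAC_BITS = 23   # fractional bits of the native channelizer output
Q15_FRAC_BITS = 15    # fractional bits of fixdt(1,16,15)
Q15_MIN, Q15_MAX = -32768, 32767


def quantize_en23_to_q15(stored):
    """Convert an sFix25_En23 stored integer to a Q1.15 stored integer."""
    shift = EN23_FRAC_BITS - Q15_FRAC_BITS           # 8 fractional bits are dropped
    # Add half an output LSB, then floor via arithmetic shift
    # (assumed rounding mode: nearest, ties toward +infinity).
    rounded = (stored + (1 << (shift - 1))) >> shift
    # Saturate: the 25-bit input covers [-2, 2) while Q1.15 covers [-1, 1).
    return max(Q15_MIN, min(Q15_MAX, rounded))


half = quantize_en23_to_q15(1 << 22)         # 0.5 in En23
clipped = quantize_en23_to_q15(1 << 23)      # 1.0 in En23, not representable in Q1.15
```

Saturation matters here because the native format has one more integer bit than Q1.15, so values at or above 1.0 would otherwise wrap.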
--- ---
### Data Width Reduction ## Data Packing (Updated)
- Previous format: **50 bits per complex sample** (25 bits real + 25 bits imag) - 4 samples per clock
- New format: **32 bits per complex sample** (16 bits real + 16 bits imag) - Each sample: complex (16-bit real + 16-bit imag)
- Packed into **128-bit AXI4-Stream word**
Benefits: Benefits:
- Matches datapath parallelism
- Reduced AXI bandwidth - Efficient DMA transfers
- Reduced FIFO usage - Eliminates need for serializer stage
- More efficient DMA transfers
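
A minimal sketch of unpacking one 128-bit word on the host side follows. The byte layout is an assumption that the source does not specify: little-endian, sample 0 in the lowest 32 bits, and real before imag within each sample. `unpack_word` is a hypothetical helper name.

```python
import struct

Q15_SCALE = 1.0 / 32768.0  # Q1.15 stored integer -> real value


def unpack_word(word):
    """Split one 16-byte (128-bit) stream word into 4 complex samples."""
    if len(word) != 16:
        raise ValueError("expected exactly 16 bytes")
    # Assumed layout: eight little-endian signed 16-bit fields,
    # ordered [re0, im0, re1, im1, re2, im2, re3, im3].
    fields = struct.unpack("<8h", word)
    return [
        complex(fields[2 * i] * Q15_SCALE, fields[2 * i + 1] * Q15_SCALE)
        for i in range(4)
    ]


example_word = struct.pack("<8h", 16384, -16384, 0, 32767, -32768, 0, 1, -1)
samples = unpack_word(example_word)
```

If the hardware places imag before real, or orders the samples from the most significant end, only the index mapping inside the list comprehension needs to change.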
--- ---
## AXI4-Stream Output ## AXI4-Stream Output
- Data type: uint32 (packed complex: 16-bit real + 16-bit imag) - Width: 128 bits
- Contains 4 complex samples per cycle
- TLAST = frame boundary - TLAST = frame boundary
--- ---
## Data Format ## Debug / Validation Features
- Frame size: 512 samples A counter-based debug mode is implemented:
- Complex samples packed into 32-bit words
- Real part → sample counter (0..511)
- Imag part → frame index
Used to validate:
- Sample continuity
- Frame boundaries
- DMA ordering and integrity
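
One way to use the counter pattern on the host is sketched below. It assumes the samples have already been unpacked into integer `(real, imag)` pairs, that the frame index increments by one each time the sample counter wraps, and that the index itself wraps at 16 bits. `check_counter_stream` is a hypothetical name.

```python
def check_counter_stream(samples, frame_size=512, index_modulus=1 << 16):
    """Return the position of the first discontinuity, or None if consistent.

    samples is a sequence of (sample_counter, frame_index) integer pairs.
    The stream may start mid-frame, so only sample-to-sample steps are checked.
    """
    for i in range(1, len(samples)):
        prev_counter, prev_frame = samples[i - 1]
        expected_counter = (prev_counter + 1) % frame_size
        # The frame index is assumed to advance exactly when the counter wraps.
        if expected_counter == 0:
            expected_frame = (prev_frame + 1) % index_modulus
        else:
            expected_frame = prev_frame
        if tuple(samples[i]) != (expected_counter, expected_frame):
            return i
    return None


# Synthetic stream that starts at counter 500 of frame 3 (asynchronous start).
stream = [((500 + k) % 512, 3 + (500 + k) // 512) for k in range(1024)]
dropped = stream[:100] + stream[101:]  # simulate one lost sample
```

A `None` result on `stream` and a reported position on `dropped` is the kind of outcome this mode is meant to distinguish.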
--- ---
## Key Characteristics ## Key Characteristics
- Fully streaming pipeline - Fully streaming pipeline
- High throughput
- Deterministic latency - Deterministic latency
- Consistent fixed-point scaling (Q1.15 end-to-end) - High throughput (4 samples/clock)
- Supports dual-mode operation (channelizer / bypass) - Dual-mode operation (channelizer / bypass)
- Validated up to nFrames = 1024
--- ---


@@ -1,4 +1,4 @@
# 🧠 PS Subsystem (Control + Processing) # 🧠 PS Subsystem (Control + Capture + Processing)
[🏠 Project Home](../README.md) [🏠 Project Home](../README.md)
@@ -8,73 +8,128 @@
The PS subsystem is responsible for: The PS subsystem is responsible for:
- System initialization
- Configuring PL subsystems - Configuring PL subsystems
- Triggering captures
- Receiving data via DMA - Receiving data via DMA
- Performing frame-based processing - Preparing data for processing and visualization
The current implementation acts as a **placeholder for post-processing**, focusing on reliable data acquisition and host interaction.
--- ---
## Responsibilities ## Responsibilities
### Control ### Control & Initialization
- Writes parameters to PL registers: - Configure PL parameters:
- Tx generator configuration - Tx waveform configuration
- Generates TxPulseStart trigger - Capture parameters (nFrames, etc.)
- Initialize DMA and memory buffers
- Manage system startup
---
### Trigger & Capture
- Generates capture trigger (software-controlled)
- Controls DPW acquisition timing
- Each trigger initiates one DPW capture
--- ---
### DMA Handling ### DMA Handling
- AXI4-Stream → DMA (S2MM) - AXI4-Stream → DMA (S2MM)
- Data stored in PS DDR - Receives **128-bit stream** (4 samples per clock)
- Stores data in PS DDR memory
Configuration: Configuration:
- Frame size: 512 - Frame size: 512 samples
- Buffers: 16 - nFrames: configurable (validated up to 1024)
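
For buffer sizing, the DMA transfer length per DPW follows from the figures above (512 samples per frame, 32 bits per complex sample, 4 samples per 128-bit word). The helper below is a hypothetical sketch of that arithmetic and does not reflect any particular DMA driver API.

```python
BYTES_PER_SAMPLE = 4   # 16-bit real + 16-bit imag
SAMPLES_PER_WORD = 4   # one 128-bit AXI4-Stream word


def dpw_transfer_size(n_frames, frame_size=512):
    """Return sample, stream-word and byte counts for one DPW."""
    samples = n_frames * frame_size
    if samples % SAMPLES_PER_WORD:
        raise ValueError("DPW must contain a whole number of 128-bit words")
    return {
        "samples": samples,
        "words": samples // SAMPLES_PER_WORD,
        "bytes": samples * BYTES_PER_SAMPLE,
    }


largest_validated = dpw_transfer_size(1024)
```

At the largest validated setting (nFrames = 1024) this works out to 2 MiB per trigger, which the PS DDR buffer has to hold for each DPW.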
--- ---
### Processing Pipeline ## Data Format
DMA → uint64[512] ### Raw DMA Data
→ unpack real/imag
→ convert to complex - Packed complex samples
→ RMS + peak detection - 16-bit real + 16-bit imag per sample
- 4 samples per 128-bit word
---
### Processing Representation
Data is unpacked and reshaped into:
```
[FrameSize x nFrames x nTriggers]
```
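
The reshape can be sketched in plain Python as below. It assumes the flat capture is ordered with the sample index varying fastest, then frame, then trigger, which matches a column-major `[FrameSize x nFrames x nTriggers]` array. The function name is hypothetical, and the nested lists are indexed as `cube[trigger][frame][sample]`.

```python
def reshape_capture(flat, frame_size, n_frames, n_triggers):
    """Group a flat sample list into cube[trigger][frame][sample]."""
    expected = frame_size * n_frames * n_triggers
    if len(flat) != expected:
        raise ValueError(f"expected {expected} samples, got {len(flat)}")
    cube = []
    for t in range(n_triggers):
        frames = []
        for f in range(n_frames):
            # Assumed ordering: sample fastest, then frame, then trigger.
            start = (t * n_frames + f) * frame_size
            frames.append(flat[start:start + frame_size])
        cube.append(frames)
    return cube


# Small synthetic example: 4 samples per frame, 3 frames, 2 triggers.
cube = reshape_capture(list(range(24)), frame_size=4, n_frames=3, n_triggers=2)
```

With an array library, the same grouping is a single column-major reshape; the explicit loops are used here only to make the assumed ordering visible.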
---
## Processing Pipeline (Current)
DMA
→ Unpack samples (I/Q separation)
→ Convert to complex representation
→ Reshape into 3D structure
→ Visualization / basic analysis
---
## Validation Support
Uses counter-based validation:
- Real part → sample counter
- Imag part → frame index
Enables verification of:
- Data continuity
- Frame alignment
- Correct ordering from DMA
--- ---
## Execution Model ## Execution Model
- Event-driven (DMA trigger) - Triggered (event-based)
- No buffering queue - Burst capture (DPW)
- Frames may be dropped - Not continuous real-time streaming
--- ---
## Performance Notes ## Performance Notes
- Bottleneck: unpacking + conversion - Designed for correctness and validation (not optimized)
- Cannot sustain full-rate input - Bottleneck: unpacking + data movement
- Full-rate continuous processing not supported
--- ---
## Interaction with PL ## Role in System
### Tx Control The PS currently serves as:
- Low-rate trigger (~Hz)
- Starts burst generation
### Rx Data - Control interface
- Continuous high-rate stream - Data acquisition manager
- Pre-processing stage
Future implementations will replace the current processing with advanced algorithms (e.g., FrFT).
--- ---
## Future Work ## Future Work
- Replace processing with FrFT - FrFT-based processing
- NEON optimization - Timestamp integration
- Throughput improvements - UDP streaming
- Optimization (NEON / vectorization)
- Metadata extraction (move complexity to PL)
--- ---