
Sequential DOA Trajectory Estimation using Deep Complex Network and Residual Signals

Abstract:

We propose a data-driven method for direction-of-arrival (DOA) trajectory estimation. We use a deep complex architecture which leverages complex-valued representations to capture both magnitude and phase information in the received sensor array data. The network is designed to output the DOA trajectory parameters and amplitudes of the strongest source. Deviating from conventional methods, which attempt to estimate parameters for all sources simultaneously -- leading to assignment ambiguity and uncertain output dimensions -- we adopt a sequential approach. The estimated source signal contribution is subtracted from the input to obtain a residual signal. This residual is then fed back into the network to identify the next strongest source, and so on, making the proposed network reusable. We evaluate the network on simulated data of varying complexity. The results demonstrate the feasibility of such a reusable network; potential improvements can be explored in future work.
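The sequential estimate-and-subtract loop described in the abstract can be sketched as follows. This is a minimal sketch, not the actual pipeline: `estimate_strongest` is a hypothetical stand-in for the trained network, returning the estimated steering matrix and diagonal amplitude matrix of the strongest remaining source (notation from the signal model below).

```python
import numpy as np

def sequential_doa_estimation(Y, estimate_strongest, n_sources):
    """Sequential loop sketch: estimate the strongest source, subtract its
    contribution, and feed the residual back into the same network.

    estimate_strongest(residual) is assumed to return (A_k, X_k), where
    A_k (N x L) stacks the steering vectors a(theta_k^l) over snapshots and
    X_k (L x L) is the diagonal matrix of the source's amplitudes."""
    residual = Y.copy()
    estimates = []
    for _ in range(n_sources):
        A_k, X_k = estimate_strongest(residual)
        estimates.append((A_k, X_k))
        # Column l of A_k @ X_k is a(theta_k^l) * s_k^l, the per-snapshot
        # contribution of source k; subtracting it yields the residual.
        residual = residual - A_k @ X_k
    return estimates, residual
```

With an oracle estimator the residual of a single-source scene collapses to the noise term, which is the behavior the TL-CBF residual spectra illustrate.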


Sequential DOA trajectory estimation: TL-CBF spectrum of the original signal (a), and of the residual signals (b)-(d) obtained after removal of each additional source. True sources are indicated by red crosses. The estimated source trajectories are indicated by red circles in panels (a)-(c) and are partially or fully removed in the subsequent spectra (b)-(d). The average power per sensor per snapshot (P) is indicated.

Signal Model:

Consider a uniform linear array (ULA) with $N$ sensors. Assuming linear source motion, the DOA $\theta^l_k$ for $k{\text{-th}}$ source at $l{\text{-th}}$ snapshot is,

$$\theta^l_k = \phi_k + \frac{l-1}{L-1}\alpha_k, \quad l=1,2, \cdots, L.$$

The parameters $(\phi_k,\alpha_k)$ model the DOA trajectory of the $k{\text{-th}}$ source. For $K$ linearly moving sources,

$$\mathbf{Y} = \tilde{\mathbf{A}}\tilde{\mathbf{X}} + \mathbf{W}$$

where

  • $\mathbf{Y} = [\mathbf{y}_1 \cdots \mathbf{y}_L] \in \mathbb{C}^{N \times L}$ is the $L$-snapshot measurement matrix;
  • $\tilde{\mathbf{A}} = [\tilde{\mathbf{A}}_1(\phi_1, \alpha_1) \cdots \tilde{\mathbf{A}}_K(\phi_K, \alpha_K)] \in \mathbb{C}^{N \times KL}$ contains the variable DOA steering vectors for each of the $K$ sources;
  • $$\tilde{\mathbf{A}}_k(\phi_k, \alpha_k) = [ \mathbf{a}(\theta^1_k) \cdots \mathbf{a}(\theta^L_k)]$$ is the steering matrix for the $k{\text{-th}}$ source, where $\mathbf{a}(\theta_k^l) = [ 1 \quad e^{j2\pi \frac{d}{\lambda}\text{sin}(\theta_k^l)} \quad \cdots \quad e^{j2\pi(N-1)\frac{d}{\lambda}\text{sin}(\theta_k^l)} ]^T$ is the steering vector for DOA $\theta_k^l$;
  • $d$ is sensor separation and $\lambda$ is observation wavelength;
  • $\tilde{\mathbf{X}} = [\tilde{\mathbf{X}}_1 \cdots \tilde{\mathbf{X}}_K]^T \in \mathbb{C}^{KL \times L}$ with $\tilde{\mathbf{X}}_k = \text{diag}(\mathbf{x}_k) \in \mathbb{C}^{L \times L}$, where $\mathbf{x}_k = [s_k^1 \cdots s_k^L]$ are $L$ amplitudes of the $k{\text{-th}}$ source;
  • $\mathbf{W}=[\mathbf{w}_1 \cdots \mathbf{w}_L] \in \mathbb{C}^{N \times L}$ represents the additive noise.
Figure: a linearly moving source, and its DOA as a function of snapshots.
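Under the signal model above, the measurements $\mathbf{Y}$ can be simulated as follows. This is a minimal sketch; `simulate_measurements` and its defaults (e.g. half-wavelength spacing $d/\lambda = 0.5$) are assumptions for illustration, not the repository's data generator.

```python
import numpy as np

def steering_vector(theta, N, d_over_lambda=0.5):
    # a(theta) = [1, e^{j 2 pi (d/lambda) sin(theta)}, ...,
    #             e^{j 2 pi (N-1) (d/lambda) sin(theta)}]^T
    return np.exp(1j * 2 * np.pi * d_over_lambda * np.arange(N) * np.sin(theta))

def simulate_measurements(phis, alphas, amps, N, L, noise_std=0.0, rng=None):
    """Generate Y = A_tilde X_tilde + W for K linearly moving sources.
    phis, alphas are per-source trajectory parameters in radians;
    amps is a (K, L) array of complex amplitudes s_k^l."""
    rng = np.random.default_rng(rng)
    Y = np.zeros((N, L), dtype=complex)
    for phi, alpha, s in zip(phis, alphas, amps):
        for l in range(L):
            # theta_k^l = phi_k + (l-1)/(L-1) * alpha_k, here with l = 0..L-1
            theta = phi + (l / (L - 1)) * alpha
            Y[:, l] += s[l] * steering_vector(theta, N)
    # Circularly symmetric complex Gaussian noise W
    W = noise_std * (rng.standard_normal((N, L))
                     + 1j * rng.standard_normal((N, L))) / np.sqrt(2)
    return Y + W
```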

Architecture:

Deep complex network -- feature extractor (shaded pink), amplitude estimator (shaded blue), and trajectory estimator (shaded peach). Every block has complex weights except for the LSTM and dense layers in the trajectory estimator. The architecture has 548,152 parameters.

The overall network architecture is shown above. It consists of three main parts: a feature extractor (shaded pink), an amplitude estimator (shaded blue), and a trajectory estimator (shaded peach). The feature extractor processes the input $\mathbf{Y}$ (defined in the signal model above) and outputs complex features that serve as input to the two estimators. The amplitude estimator estimates the source amplitudes $\mathbf{x}$, while the trajectory estimator estimates the trajectory parameters $(\phi,\alpha)$, both for the same (strongest) source. Each convolutional block performs a complex convolution, followed by complex batch normalization (BN) and an activation function.

Complex-valued Convolutional Block: Complex-valued convolution can be described as follows. Let $\mathbf{W}=\mathbf{A}+i\mathbf{B}$ be the complex-valued convolutional kernel characterized by real-valued matrices $\mathbf{A}$ and $\mathbf{B}$, and let $\mathbf{H}=\mathbf{P}+i\mathbf{Q}$ be the complex-valued input feature to the convolutional block. The complex-valued convolution is then $\mathbf{W}*\mathbf{H}= (\mathbf{A}*\mathbf{P}-\mathbf{B}*\mathbf{Q}) + i(\mathbf{A}*\mathbf{Q} + \mathbf{B}*\mathbf{P})$, where $*$ denotes the convolution operation. The figure below provides a visual representation of the complex convolution operation. For implementation, see modules.
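The four-real-convolution identity above can be checked numerically. The sketch below uses 1-D `np.convolve` for illustration, whereas the network applies 2-D complex convolutions with learned kernels.

```python
import numpy as np

def complex_conv1d(W, H):
    """Complex convolution via four real convolutions:
    W*H = (A*P - B*Q) + i(A*Q + B*P), with W = A + iB and H = P + iQ."""
    A, B = W.real, W.imag
    P, Q = H.real, H.imag
    real = np.convolve(A, P) - np.convolve(B, Q)
    imag = np.convolve(A, Q) + np.convolve(B, P)
    return real + 1j * imag
```

The result matches convolving the complex arrays directly, which is exactly what the identity states.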

Network Flow and Description:

  • Feature Extractor: It comprises convolutional blocks, followed by alternating double inception blocks and a convolutional downsampling block. Inception blocks are used to capture and process information at multiple scales (receptive field sizes). Each is constructed with five distinct convolutional blocks, each characterized by a different kernel size: (1, 1), (1, 3), (3, 1), (3, 3), and (5, 1). The number of output channels for the different inception blocks is given in the config file. The stride is 1, and the padding is adjusted based on the kernel size so that the output size matches the input size. We downsample the input using a dilated convolutional block to extract essential information; the kernel size, dilation rate, stride, and padding for this operation are (3, 1), (2, 1), (1, 1), and (1, 0), respectively. ResNet blocks process and downsample the features from the inception blocks prior to the skip connections in the amplitude estimator.
  • Amplitude Estimator: The output of the feature extractor is passed through two inception blocks configured with the same hyperparameters as described above. This output is then concatenated (via a first skip connection) with the features processed by the ResNet blocks. It then undergoes down-sampling before being concatenated, via a second skip connection, with another set of features processed through a second ResNet block. The combined output is further processed by complex convolutional blocks with a kernel size of (2, 1), a stride of 1, and zero padding, then passed through a Squeeze-and-Excitation (SE) block, followed by a final complex convolution with kernel size 1. The SE block comprises two primary operations: squeeze and excitation. The squeeze aggregates feature maps across their spatial dimensions to generate a channel descriptor; the excitation takes this descriptor as input and generates a set of per-channel modulation weights, which are applied to the feature maps to produce the SE block's output.
  • Trajectory Estimator: Here, the output of the feature extractor is passed through two inception blocks, without skip connections. This output is then down-sampled twice. To facilitate sequential processing across the $L$ snapshots, an LSTM with a hidden size of 64 is used. Its output is passed through a fully connected dense layer followed by a Tanh activation function.
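As a rough illustration of the SE block's squeeze and excitation steps, here is a minimal real-valued sketch on a (C, H, W) feature map. The weights `W1`, `W2` and the reduction ratio are hypothetical placeholders; the network's actual SE block operates on complex features with learned parameters.

```python
import numpy as np

def se_block(features, W1, W2):
    """Squeeze-and-Excitation sketch on a real feature map of shape (C, H, W).
    Squeeze: global average pool over spatial dims -> channel descriptor z (C,).
    Excitation: bottleneck dense layer with ReLU, then a dense layer with
    sigmoid -> per-channel gates s (C,), used to reweight the feature maps.
    W1 has shape (C//r, C) and W2 has shape (C, C//r) for reduction ratio r."""
    z = features.mean(axis=(1, 2))            # squeeze: (C,)
    h = np.maximum(W1 @ z, 0.0)               # bottleneck + ReLU
    s = 1.0 / (1.0 + np.exp(-(W2 @ h)))       # sigmoid gates: (C,)
    return features * s[:, None, None]        # channel-wise reweighting
```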

Code:

For generating training and test data.

To generate ground truth DOA parameters

$ cd dataset_folder
$ python utils.py

To generate the received signal, ground-truth signal amplitudes, and TL-CBF spectrum.

$ cd dataset_folder
$ python main_datagen.py

To generate test data.

$ cd dataset_folder
$ python generate_testdata.py

For training the proposed network.

Change the dataset_path, label_path, and sigdata_path according to your directory.

$ cd Gridless
$ sbatch batch.sh

To get test results,

$ python testdata_loss_acc.py

For training the Gridbased network.

$ cd Gridbased

Change the dataset_path, label_path in main.py file.
For checkpointpath=f'./saved_models/exp18/gaussian_rmse/l2' in main.py, make sure reg_parameter=None, weight_decay=0.
For checkpointpath=f'./saved_models/exp18/gaussian_rmse/l2_l12' in main.py, make sure reg_parameter=5e-4, weight_decay=0.

$ sbatch batch.sh

To get test results,

$ python loss_acc.py

For plots.

$ cd Gridless
$ python plots_results.py

Results:

Accuracy (in %) of various algorithms on different test datasets (η = 2.4).

| Dataset | TL-CBF | U-Net-l2 | U-Net-l21 | Proposed |
| --- | --- | --- | --- | --- |
| TestData | 64.79 | 85.02 | 85.11 | 88.93 |
| TestData-1 | 16.175 | 78.48 | 87.6 | 90.87 |
| TestData-2 | 69.97 | 70.76 | 68.76 | 95.22 |
| TestData-3 | 48.49 | 76.47 | 76.2 | 63.70 |
| TestData-4 | 57.3 | 69.63 | 72.7 | 65.97 |

Accuracy (in %) for the two-source scenarios (a, b) and the three-source scenarios (c, d) as SNR and Φ0 vary.


Amplitude relative error (in %) for two-source (a, b) and three-source (c, d) scenarios as SNR and Φ0 vary.

Observations:

  • We experimented with different skip connection configurations, including no skip connection, skip connection only to the trajectory estimator, and separate skip connections to both estimators. However, these configurations did not perform well.
  • In some three-source scenarios, the network initially estimated one of the two closely positioned sources rather than the third (stronger) source. In the subsequent iteration, the network estimated the globally strongest source.

    TL-CBF spectrum of original signal (a) at 6 dB SNR, and of residual signals (b)-(d) obtained after removal of every additional source. True sources are indicated by red cross. Source trajectories are S1-(-60.5, 2.5) (strongest), S2-(5.5, 1.2) and S3-(-7.5, 3.5) (weakest). The estimated source trajectories are indicated by red circle in panels (a)-(c), which are partially or fully removed in subsequent spectra (b)-(d). The average power per sensor per snapshot (P) is indicated.
    TL-CBF spectrum of original signal (a) at 5 dB SNR, and of residual signals (b)-(d) obtained after removal of every additional source. True sources are indicated by red cross. Source trajectories are S1-(60.5, 2.5) (strongest), S2-(5.5, 1.2) and S3-(0, 3.8) (weakest). The estimated source trajectories are indicated by red circle in panels (a)-(c), which are partially or fully removed in subsequent spectra (b)-(d). The average power per sensor per snapshot (P) is indicated.
  • Errors in the estimated source amplitudes and trajectory parameters propagate into the residual and can cause trajectory estimation errors when processing it. For example, in some scenarios, partial removal of a source in the first iteration caused the network to estimate the same source again, instead of estimating the other sources.


    TL-CBF spectrum of original signal (a) at 3 dB SNR, and of residual signals (b)-(d) obtained after removal of every additional source. True sources are indicated by red cross. Source trajectories are S1-(-60.5, 2.5) (strongest), S2-(5.5, 1.2) and S3-(-7.5, 3.5) (weakest). The estimated source trajectories are indicated by red circle in panels (a)-(c), which are partially or fully removed in subsequent spectra (b)-(d). The average power per sensor per snapshot (P) is indicated.
    TL-CBF spectrum of original signal (a) at 10 dB SNR, and of residual signals (b, c) obtained after removal of every additional source. True sources are indicated by red cross. Source trajectories are S1-(20.4, 3.5) (strongest) and S2-(18, -2.5) (weakest). The estimated source trajectories are indicated by red circle in panels (a) and (b), which are partially or fully removed in subsequent spectra (b, c). The average power per sensor per snapshot (P) is indicated.
  • As sources are removed, partially or fully, in each iteration, the average power per sensor per snapshot (P) of the residual decreases. Additionally, as SNR increases, the residual power P decreases, indicating that at higher SNR levels the method removes the sources more effectively.

    TL-CBF spectrum of original signal (a) at 0 dB SNR, and of residual signals (b, c) obtained after removal of every additional source. True sources are indicated by red cross. Source trajectories are S1-(60.4, 3.5) (strongest) and S2-(50.5, -2.5) (weakest). The estimated source trajectories are indicated by red circle in panels (a) and (b), which are partially or fully removed in subsequent spectra (b, c). The average power per sensor per snapshot (P) is indicated.
    TL-CBF spectrum of original signal (a) at 19 dB SNR, and of residual signals (b, c) obtained after removal of every additional source. True sources are indicated by red cross. Source trajectories are S1-(60.4, 3.5) (strongest) and S2-(50.5, -2.5) (weakest). The estimated source trajectories are indicated by red circle in panels (a) and (b), which are partially or fully removed in subsequent spectra (b, c). The average power per sensor per snapshot (P) is indicated.
    TL-CBF spectrum of original signal (a) at 3 dB SNR, and of residual signals (b)-(d) obtained after removal of every additional source. True sources are indicated by red cross. Source trajectories are S1-(-60.5, 2.5) (strongest), S2-(5.5, 1.2) and S3-(-7.5, 3.5) (weakest). The estimated source trajectories are indicated by red circle in panels (a)-(c), which are partially or fully removed in subsequent spectra (b)-(d). The average power per sensor per snapshot (P) is indicated.
    TL-CBF spectrum of original signal (a) at 20 dB SNR, and of residual signals (b)-(d) obtained after removal of every additional source. True sources are indicated by red cross. Source trajectories are S1-(60.5, 2.5) (strongest), S2-(5.5, 1.2) and S3-(0, 3.8) (weakest). The estimated source trajectories are indicated by red circle in panels (a)-(c), which are partially or fully removed in subsequent spectra (b)-(d). The average power per sensor per snapshot (P) is indicated.
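For reference, one plausible definition of the average power per sensor per snapshot P reported in the captions above (an assumption for illustration, not taken from the repository code):

```python
import numpy as np

def avg_power_per_sensor_per_snapshot(Y):
    """P = ||Y||_F^2 / (N * L): total power of the N x L measurement (or
    residual) matrix, averaged over sensors and snapshots. As source
    contributions are removed from the residual, P should decrease."""
    N, L = Y.shape
    return np.sum(np.abs(Y) ** 2) / (N * L)
```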