Design and Implementation of a CFAR Processor for Target Detection

César Torres-Huitzil, Rene Cumplido-Parra, Santos López-Estrada
Computer Science Department, INAOE, Apdo. Postal 51 & 216, Tonantzintla, Puebla, México

Abstract. Real-time performance of adaptive digital signal processing algorithms is required in many applications, but it often imposes a high computational load on conventional processors. In this paper, we present a configurable hardware architecture for adaptive processing of noisy signals for target detection based on Constant False Alarm Rate (CFAR) algorithms. The architecture has been designed for parallel/pipeline processing and can be configured for three versions of CFAR algorithms: the Cell-Average, the Max and the Min CFAR. The architecture has been implemented on a Field Programmable Gate Array (FPGA) and processes a 4096×4096 data set in 140 milliseconds, against 1.2 seconds for a software implementation on a 2.4 GHz Pentium IV. Results are presented and discussed.

1 Introduction

The extraction of targets from signals is a complex task because environmental conditions are uncontrolled and noisy. Adaptive digital signal processing techniques are often used to remove noise and to enhance the detectability of targets. For instance, in radar applications, the backscattering amplitude of the radar signal is used for target detection, and it is usually assumed that a high backscattering magnitude comes from targets [1]. Since the background is not uniform and its backscattering amplitude fluctuates due to noise, an adaptive scheme is required to extract targets against a varying reference threshold while maintaining a constant false alarm rate. CFAR algorithms have been widely used to extract targets from noisy backgrounds in application areas such as image processing, medical engineering, power quality analysis, and sonar and surveillance systems, among others [2][3]. Although the theory of CFAR detection is well developed, few practical hardware implementations exist because of the high computational requirements of applications such as radar signal processing.

The rest of the paper is organized as follows. Section 2 provides the theoretical foundation of the CFAR algorithm. Section 3 presents a data parallelism analysis of CFAR algorithms and details of the proposed hardware architecture. Section 4 presents the FPGA implementation and experimental results. Section 5 briefly discusses the performance improvements. Finally, section 6 presents the concluding remarks.


2 CFAR Algorithm

The Cell-Average CFAR (CA-CFAR) is the most common CFAR detector used for target detection. The CA-CFAR detector regulates the false alarm probability to a desired level in varying background environments through averaging. Figure 1 shows a block diagram of the CA-CFAR algorithm structure. In CA-CFAR detectors, a reference window of N samples surrounding the cell under test is used to compute the average value, and some guard cells are inserted between the cell under test and the reference cells so that targets close to each other do not corrupt the noise estimate [3][4].

Fig. 1. Block diagram of a CA-CFAR processor. The main components of the processor are registers, a multiplier, an average computation module and a comparator.

The average computation module independently sums the data samples on each side of the cell under test and computes their averages, SL and SR, for the left and right side respectively. Both averages are combined to estimate the local noise level in the signal. Three modalities, the average, the maximum and the minimum, are used for this purpose, and they are defined in equation 1. The noise estimate YAC is multiplied by a scaling factor T and finally compared to the value of the cell under test Y. If the value of the cell under test is greater than or equal to the scaled estimate, a target detection is declared. In this way the CFAR detector adapts the threshold automatically to the local background noise. The scaling factor T sets the desired false alarm probability and depends on the noise distribution in the environment.
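As a point of reference, the decision rule described above (equation 1) can be written as a direct, non-incremental software routine. The following Python sketch is illustrative only and is not the authors' implementation: the function name `cfar_reference`, the even split of reference and guard cells between the two sides, and the choice to leave border cells undecided are all assumptions made for the example.

```python
def cfar_reference(x, n_ref, n_guard, t, mode="CA"):
    """Direct CFAR detector returning a list of 0/1 decisions.

    x       : sequence of non-negative samples
    n_ref   : total number of reference cells (assumed split evenly)
    n_guard : total number of guard cells (assumed split evenly)
    t       : scaling factor T
    mode    : "CA" (average), "MAX" or "MIN"
    Cells whose window does not fit inside x are left at 0 (assumption).
    """
    half_ref, half_guard = n_ref // 2, n_guard // 2
    reach = half_ref + half_guard
    combine = {
        "CA": lambda s_l, s_r: 0.5 * (s_l + s_r),
        "MAX": max,
        "MIN": min,
    }[mode]
    decisions = [0] * len(x)
    for i in range(reach, len(x) - reach):
        # Reference cells on each side, skipping the guard cells.
        left = x[i - reach : i - half_guard]
        right = x[i + half_guard + 1 : i + reach + 1]
        s_l = sum(left) / half_ref
        s_r = sum(right) / half_ref
        y_ac = combine(s_l, s_r)          # local noise estimate
        decisions[i] = 1 if x[i] >= t * y_ac else 0
    return decisions


# A flat background with one strong sample at index 10.
signal = [1] * 20
signal[10] = 20
hits = [i for i, d in enumerate(cfar_reference(signal, 8, 2, 4.0)) if d]
print(hits)  # -> [10]
```

Every window is summed from scratch here, which is the cost that the data reuse scheme of the hardware architecture is meant to avoid.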

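\[
Y_{AC} =
\begin{cases}
\tfrac{1}{2}\,(S_L + S_R) \\
\max(S_L, S_R) \\
\min(S_L, S_R)
\end{cases}
\qquad
e(Y) =
\begin{cases}
1, & \text{if } Y \ge T \times Y_{AC} \\
0, & \text{if } Y < T \times Y_{AC}
\end{cases}
\tag{1}
\]

3 CFAR Hardware Implementation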

Let X be the raw data samples of the signal to be processed and n the number of reference cells in the CFAR detector. For the sake of simplicity, and without loss of generality, guard cells are not included in the explanation. Consider the sequences of reference data samples around a cell under test as one-dimensional windows, shown as rectangles in figure 2. Each rectangle includes data from the reference cells around Xi, Xi+1, and Xi+2. As shown in figure 2, the windows share data: each time the window slides leftwards by one position, all of its data samples except those located on the edges belong to the domain of the next window. These data dependencies can therefore be exploited to reuse previous partial results. Once a window has been processed, its result can be used to compute the result of the next window, without recalculating partial results over the entire domain, just by inserting and deleting values at the window boundaries.

Fig. 2. A graphical view of the data dependencies for three adjacent reference data sets. A data set is obtained by sliding the previous data set and by inserting and deleting one data sample at the edges of the previous data set.
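The reuse of partial results can be illustrated with a running sum: only the first window is summed in full, and every later window is obtained by adding the sample that enters and subtracting the sample that leaves. The sketch below is a minimal illustration under assumptions of its own; the name `sliding_sums` is hypothetical, and the window is moved towards increasing indices, although the direction does not affect the argument.

```python
def sliding_sums(x, n):
    """Sum of every length-n window of x, reusing the previous sum."""
    if n <= 0 or n > len(x):
        return []
    acc = sum(x[:n])              # only the first window is summed in full
    sums = [acc]
    for k in range(n, len(x)):
        acc += x[k] - x[k - n]    # insert the new edge sample, delete the old one
        sums.append(acc)
    return sums


data = [3, 1, 4, 1, 5, 9, 2, 6]
print(sliding_sums(data, 3))  # -> [8, 6, 10, 15, 16, 17]
```

Each update costs one addition and one subtraction regardless of the window length, whereas recomputing the sum costs n - 1 additions per window.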

If guard cells are included, it is still possible to exploit the data sharing through parallelism, since the data sets on each side of the cell under test exhibit the same data sharing and both can be processed concurrently by two processing elements. Because of the regularity of the computations, the data sharing in CFAR algorithms can be handled efficiently through pipelining and systolic processing [4]. A block diagram of the proposed architecture is shown in figure 3. The main components of the architecture are a shift register, two processing elements that accumulate partial results, called APEs, and a processing element that performs the thresholding, called CTPE. The length of the shift register is equal to the number of reference cells NRC plus the number of guard cells NGC plus one cell under test.

Fig. 3. Block diagram of the main core of the CFAR architecture. Two APEs compute the average of the reference cells and a CTPE computes the thresholding operation.

Figure 4 shows a block diagram of the internal structure of the processing elements APE and CTPE. The APE is composed of an accumulator and a subtractor. The two APEs compute the accumulation of the values of the reference cells, SL or SR in equation 1. The APE has three inputs: XR, the data that is inserted into a new reference window; XD, the data that is deleted from the previous accumulation; and E, the signal that inhibits the activity of the accumulator during the latency period. The CTPE is composed of an ALU-like sub-module, a multiplier and a comparator. The ALU-like sub-module provides three modalities for computing the threshold: the average, the maximum and the minimum of the partial sums SL and SR. Thus, the architecture performs three modalities of CFAR algorithms: the CA-CFAR and the so-called MAX family of CFAR detectors [3]. A multiplier scales the ALU result by a fixed threshold factor T, and the comparator decides whether a target is present or absent.

Fig. 4. Block diagram of the internal structure of the Processing Elements, a) main components of the APE and b) main components of the CTPE.

In the proposed architecture, data moves rightwards on each clock cycle and, after a latency period, the APEs accumulate the data of the reference cells. At the start of processing the APEs are inactive, but after NRC/2+NGC+1 clock cycles they operate continuously. NRC and NGC stand for the number of reference cells and the number of guard cells employed in the current architecture, respectively.
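The behaviour of the shift register, the two APEs and the CTPE can be mimicked cycle by cycle in software. The following Python model is a hedged sketch, not the VHDL design: the names `APE`, `ctpe` and `cfar_pipeline`, the register layout (newest sample at index 0), the zero-cleared register at reset, the division of the sums by NRC/2 before thresholding, and the reading of E as an active-high enable are all assumptions. Under this layout the lagging APE sees real data only after NRC/2+NGC+1 cycles, so that count is used to gate its E input.

```python
class APE:
    """Accumulating processing element: a subtractor feeding an accumulator."""

    def __init__(self):
        self.acc = 0

    def step(self, xr, xd, enable=True):
        # xr enters the reference window, xd leaves it.
        # enable models signal E (assumed active-high).
        if enable:
            self.acc += xr - xd
        return self.acc


def ctpe(s_l, s_r, y, t, mode):
    """Thresholding element: ALU-like select, multiplier, comparator."""
    if mode == "CA":
        level = (s_l + s_r) / 2
    elif mode == "MAX":
        level = max(s_l, s_r)
    elif mode == "MIN":
        level = min(s_l, s_r)
    else:
        raise ValueError("mode must be 'CA', 'MAX' or 'MIN'")
    return 1 if y >= t * level else 0


def cfar_pipeline(samples, n_ref, n_guard, t, mode="CA"):
    """Stream samples through the register; one decision per cycle once full.

    Assumes n_ref >= 2 and that n_ref and n_guard are even.
    """
    h, g = n_ref // 2, n_guard // 2
    length = 2 * h + 2 * g + 1      # reference + guard + cell under test
    cut = h + g                     # register index of the cell under test
    reg = [0] * length              # assumed cleared at reset
    lead, lag = APE(), APE()
    out = []
    for cycle, x in enumerate(samples, start=1):
        # Values crossing the window boundaries on this shift.
        s_lead = lead.step(xr=x, xd=reg[h - 1])
        # Before this cycle count the lagging window holds only reset zeros,
        # so gating it with E does not change the result in this model.
        s_lag = lag.step(xr=reg[cut + g], xd=reg[length - 1],
                         enable=cycle > h + 2 * g + 1)
        reg = [x] + reg[:-1]        # shift the register by one position
        if cycle >= length:         # the register must be full to output
            out.append(ctpe(s_lead / h, s_lag / h, reg[cut], t, mode))
    return out


signal = [1] * 20
signal[10] = 20
print(cfar_pipeline(signal, 8, 2, 4.0))  # -> [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
```

The first decision corresponds to the sample at index NRC/2+NGC/2, the first cell with a complete window on both sides, so the single detection at position 5 of the output is the sample at index 10 of the input.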

4 FPGA Implementation and Results

The proposed architecture was modeled in the VHDL hardware description language [5] and synthesized with Xilinx ISE, targeting an XC2V250 Virtex-II device. Table 1 summarizes the FPGA hardware resource utilization and timing performance. The default configuration of the CFAR processor uses 12-bit data, 32 reference cells and 8 guard cells, a common configuration in radar-based applications with a good performance-accuracy trade-off [1]. The internal temporary data in the accumulator uses 18-bit precision, established for the worst case.
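The accumulator width can be sanity-checked with a short calculation. The paper does not say how the worst case was derived, so the sketch below assumes unsigned 12-bit samples all at full scale and counts the bits needed to hold the sum of one side (16 cells) and of both sides (32 cells); the helper name `bits_needed` is hypothetical.

```python
def bits_needed(max_value):
    """Number of bits required to represent an unsigned integer."""
    return max(1, max_value.bit_length())


DATA_BITS = 12
N_REF = 32

max_sample = (1 << DATA_BITS) - 1          # 4095, assumed unsigned full scale
per_side = (N_REF // 2) * max_sample       # 16 cells on one side
both_sides = N_REF * max_sample            # all 32 reference cells

print(bits_needed(per_side), bits_needed(both_sides))  # -> 16 17
```

Under these assumptions an 18-bit accumulator cannot overflow and leaves at least one spare bit.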

5 Discussion

The proposed architecture produces an output result on each clock cycle after the latency period and performs seven arithmetic operations concurrently. The latency period is proportional to the number of reference cells NRC and the number of guard cells NGC around the cell under test. The latency arises at the start of processing, since the pipeline (the shift register) must be full before a result can be output. At the maximum clock frequency of 120 MHz, the architecture provides a throughput of 840 million operations per second (MOPS). For instance, the execution time of the architecture for CFAR processing in radar-based applications is 140 milliseconds on a data set of 4096×4096 samples, using 32 reference cells and 8 guard cells [1]. This is nearly 18 times faster than the required theoretical processing time of 2.5 seconds.

The software implementation of the CFAR algorithm was written in Visual C++ for a personal computer with a Pentium IV processor running at 2.4 GHz and 512 Mbytes of main memory. The processing time for the CFAR algorithm on this platform is 1.2 seconds. The software implementation used a data reuse and optimization scheme similar to that of the FPGA implementation. The CFAR algorithm, coded in C, was also targeted to a TMS320C6203 DSP device from Texas Instruments; the processing time obtained for the DSP implementation was about 420 milliseconds. The proposed architecture is therefore about 8.6 times faster than the software implementation (1.2 s versus 140 ms), and the improvement is smaller, about 3 times, with respect to the DSP implementation.

Table 1. Synthesis summary and timing for the FPGA implementation of the CFAR processor, targeted for a XC2V250-6FG456 Virtex-II device.

Resource                      Value
Number of slices              331 (21%)
Number of 4-input LUTs        177 (5%)
Number of flip-flops          540 (17%)
FPGA occupation percentage    21%
Maximum clock frequency       120 MHz
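The figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below assumes one sample is processed per clock cycle at the maximum frequency of Table 1 and ignores the start-up latency; the variable and function names are illustrative.

```python
CLOCK_HZ = 120e6            # maximum clock frequency from Table 1
OPS_PER_CYCLE = 7           # concurrent arithmetic operations
SAMPLES = 4096 * 4096       # size of the data set

mops = CLOCK_HZ * OPS_PER_CYCLE / 1e6
fpga_time_s = SAMPLES / CLOCK_HZ        # one sample per cycle, latency ignored


def speedup(other_time_s, fpga_s=0.140):
    """Ratio of another platform's time to the reported FPGA time."""
    return other_time_s / fpga_s


print(mops)                              # 840.0 MOPS
print(round(fpga_time_s * 1000))         # about 140 ms
print(round(speedup(1.2), 2))            # PC software: about 8.57
print(round(speedup(0.420), 2))          # DSP: about 3.0
print(round(speedup(2.5), 2))            # theoretical requirement: about 17.86
```

The 140 ms execution time is thus consistent with a 120 MHz clock delivering one result per cycle on 4096×4096 samples.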

6 Conclusions

In this work, an efficient hardware implementation of a class of CFAR processors for adaptive signal processing and target detection was presented. The high performance of the architecture was made possible by a parallel processing model and by the arithmetic logic and parallel structures provided by FPGAs. The proposed architecture efficiently implements a class of related CFAR algorithms: the CA-CFAR, MAX-CFAR and MIN-CFAR algorithms. The architecture exploits the parallelism inherent in CFAR signal processing, and it can be extended to more complex CFAR algorithms such as order-statistic algorithms.

References

1. Merrill Ivan Skolnik, Introduction to Radar Systems, McGraw-Hill, 2000.
2. Pearse A. Ffrench, James R. Zeidler, and Walter H. Ku, “Enhanced Detectability of Small Objects in Correlated Clutter Using an Improved 2-D Adaptive Lattice Algorithm”, IEEE Transactions on Image Processing, Vol. 6, No. 3, March 1997.
3. Panagiotis Tsakalides, Filippo Trinci, and Chrysostomos L. Nikias, “Performance Assessment of CFAR Processors in Pearson-Distributed Clutter”, IEEE Transactions on Aerospace and Electronic Systems, Vol. 36, No. 4, October 2000, pp. 1377-1386.
4. Lei Zhao, Wexian Liu, Xin Wu, and Jeffrey S. Fu, “A Novel Approach for CFAR Processor Design”, 2001 IEEE Radar Conference, pp. 284-288.
5. Stefan Sjoholm and Lennart Lindh, VHDL for Designers, Prentice Hall, First Edition, 1997.
