# Design of Pipeline Based Low Power and Area efficient FFT for MIMO-OFDM System

C.Lakshmi<sup>#1</sup>, Dr.P.Jesu Jayarin<sup>\*2</sup>

<sup>#1</sup>Research Scholar, Faculty of Electronics, Sathyabama Institute of Science and Technology, Chennai, Tamilnadu, India
<sup>\*2</sup> Professor, Department of Information Technology, Jeppiar Engineering College, Tamilnadu, India.

c.lakshmichandrasekar@gmail.com

Abstract: Modern communication systems widely use MIMO-OFDM based wireless channels. Hardware-based encryption and decryption techniques are implemented in the wireless channel to increase security. The processor contains complex units to perform these operations, which increases hardware complexity and power utilization. The Fast Fourier Transform (FFT) used for signal processing in the MIMO-OFDM channel involves a complex design and requires considerable hardware, which increases both processor complexity and latency. The proposed pipelined FFT/IFFT architecture is simulated on a Xilinx Virtex-7 FPGA; it consumes 87.8298% less power and occupies less area than the existing technique.

**Key words:** *MIMO-OFDM*, *Wireless Channel*, *FFT/IFFT*, *Security*, *Power utilization*, *Hardware utilization* 

## I. INTRODUCTION

In communication systems, data integrity and authentication are important considerations. Wireless transmission systems are more vulnerable to security issues. In a wireless transmission system the signal is sent from transmitter to receiver using antennas. In a SISO system the communication channel has a single input and a single output, so there is only one intended receiver, whereas a MIMO system has multiple inputs and multiple outputs, so it is difficult to direct the signal to one intended receiver. In wireless systems radio signals act as the carrier, and each transmitter and receiver is tuned to a particular frequency using the frequency division multiple access (FDMA) technique. An OFDM-based system uses multiple frequencies to transmit and receive signals.

Various security techniques are implemented in the communication channel to enhance security, using either a software-based or a hardware-based approach. The hardware-based approach provides more security than the software-based approach, but designing the hardware processor is more complex [1] because it contains various computation units. The computation consumes considerable power, whereas only limited power is available in a wireless channel. In MIMO systems, which have multiple inputs and multiple outputs, the FFT complexity increases as the number of antennas increases [2]. The FFT module therefore needs to be optimized to reduce hardware complexity.

There are two basic FFT architectures: memory-based and pipelined. The memory-based architecture has higher latency than the pipelined architecture. The proposed FFT module uses a three-stage pipelined architecture that increases throughput and reduces area. Pipelined architectures are further classified into feedback and feedforward architectures. The single-path delay feedback (SDF) architecture is the widely used feedback architecture, and the multipath delay commutator (MDC) is the widely used feedforward architecture. The SDF architecture is memory efficient, but its latency is high compared with the MDC.

## II. MIMO-OFDM OVERVIEW

The MIMO-OFDM system shown in fig 1 consists of nT transmitting antennas and nR receiving antennas. At the transmitter, the signal is first forwarded to the channel coder, then passes through serial-to-parallel conversion and on to the modulator. At the receiver, the cyclic prefix is removed from the signals picked up by the receiving antennas, which then pass through the parallel-to-serial converters to the demodulators. The Fast Fourier Transform (FFT) converts the signal from the time domain into the frequency domain. The result is forwarded to MIMO channel estimation to recover the original data. The data stream is decoded on the receiver side by space-time decoding and demodulation. Fig 2 shows a 2x2 MIMO-OFDM system, which uses two transmitter antennas and two receiver antennas.
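The IFFT/FFT and cyclic-prefix steps of this chain can be illustrated with a minimal Python sketch for a single antenna path. It assumes an ideal, noiseless channel, an 8-subcarrier symbol and a 2-sample cyclic prefix; the function names and these sizes are illustrative choices and are not taken from the proposed design.

```python
import cmath

def dft(x):
    """Direct DFT: time domain -> frequency domain (receiver side)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Direct inverse DFT: frequency domain -> time domain (transmitter side)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * t * k / n) for k in range(n)) / n
            for t in range(n)]

def add_cyclic_prefix(symbol, cp_len):
    """Copy the last cp_len samples of the symbol to its front."""
    return symbol[len(symbol) - cp_len:] + symbol

def remove_cyclic_prefix(samples, cp_len):
    """Drop the first cp_len samples at the receiver."""
    return samples[cp_len:]

def ofdm_round_trip(subcarriers, cp_len=2):
    """Transmit one OFDM symbol over an ideal channel and recover it."""
    tx = add_cyclic_prefix(idft(subcarriers), cp_len)
    rx = remove_cyclic_prefix(tx, cp_len)  # assumption: channel adds nothing
    return dft(rx)

# Hypothetical QPSK values on 8 subcarriers
qpsk = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]
recovered = ofdm_round_trip(qpsk)
```

With an ideal channel the recovered subcarrier values match the transmitted ones up to floating-point rounding; a real receiver would additionally need channel estimation and equalization, as described above.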

In the proposed MIMO-OFDM system secret data is transmitted, so the channel encoder is combined with LDPC. The LDPC block has check node units (CNUs) and variable node units (VNUs). The proposed FFT/IFFT module is implemented in the CNU of the LDPC block to reduce the area and power utilization of the processor. The system uses a large number of check node and variable node units, and each check node unit uses an FFT module and an IFFT module, which causes system overhead. A low-power, area-efficient FFT module is therefore proposed to reduce the processor overhead. The check node unit is shown in fig 3.

Multiple-input multiple-output (MIMO) processing involves multiple antennas at both the transmitting end and the receiving end. The performance of the FFT plays a vital role in the processor's computation speed and hardware complexity [1]. A MIMO system uses space-time signals, processing the space-domain signal and the time-domain signal simultaneously. A MIMO system exploits a large number of antennas on the transmitter and receiver sides [2]; as the number of antennas increases, the number of FFT modules also increases.



Fig 3: Implementation of FFT in Check Node Unit of LDPC

## III. RELATED WORKS

An OFDM-based MIMO system uses the FFT and IFFT to compute frequency-domain signals. The FFT block consumes considerable area and power in the processor, so optimization techniques are used to reduce both.

Laxman P. Thakare and Dr. A. Y. Deshmukh [3] proposed an area-efficient, reconfigurable N-point FFT for the MIMO-OFDM channel. An OFDM-based MIMO system uses multiple frequencies to transmit data, which increases hardware complexity. Their area-efficient FFT module is based on the multipath delay commutator architecture, but no information on power is given.

Mahdavi *et al.* [4] proposed a low-latency FFT module based on a pipelined architecture. Their architecture uses a single-path delay commutator and reduces latency by 42% compared with the existing module. A 2048-point FFT is implemented in 28 nm CMOS technology. Latency is reduced by reducing the number of butterfly operations and by using a low-latency reordering circuit.

Antony Xavier Glittas *et al.* [5] proposed a multipath delay commutator based radix-2 FFT for MIMO systems. In this method the bit-reversal operation is performed by the internal architecture itself. The data scheduling registers are reused as bit-reversal registers, so fewer registers are required. Two data streams are processed simultaneously, so latency is reduced compared with prior designs.

Mohammed Dali *et al.* [6] proposed an FFT module based on single-path delay feedback (SDF), implemented on a Virtex-5 FPGA board. The hardware multiplier is shared between two data streams, which reduces hardware utilization.

G. R. Locharla *et al.* [7] proposed an architecture based on a multipath delay commutator (MDC) data reordering circuit for the FFT/IFFT module used in the MIMO-OFDM channel. This method reduces the power consumption of the reordering circuit from 17.44 mW to 11.64 mW. It is implemented in TSMC 65 nm CMOS technology.

## IV. PROPOSED STRUCTURE OF FFT

An efficient multipath delay commutator based architecture is proposed in [8]. Fig 4 shows the radix-2 FFT operation, and the multipath delay commutator (MDC) architecture used by the proposed FFT module is shown in fig 5. In the MDC architecture the output of one stage is fed to the input of the next stage, and parallel processing is possible. The existing architecture [9] uses the Viterbi algorithm, whereas the proposed method combines Alamouti encoding with LDPC as the channel coding algorithm. In the proposed architecture the parameters are reordered to reduce latency. The four-stage pipelined architecture [10] consists of a store stage, a parse stage, an FFT stage and a post stage, and its detailed architecture is analysed. OFDM needs a data rate of 80 bits/sec; to sustain this data rate the FFT process must complete within 4 µsec. The area-efficient FFT is designed to reduce system complexity, and it consumes less power than the existing R2MDC [9] FFT module.



Fig 4: Radix-2 16 point FFT





The N-point DFT computed by the radix-2 FFT is given by

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W^{nk}, \quad k = 0, 1, \dots, N-1 \tag{1}$$
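For reference, Eq. (1) is usually read with the twiddle factor $W$ defined as below. The definition and the two identities are the standard ones for the DFT; they are stated here as an assumption, since the text does not define $W$ explicitly.

$$W = e^{-j 2\pi / N}, \qquad W^{k+N} = W^{k}, \qquad W^{k+N/2} = -W^{k}$$

The periodicity and the sign change after half a period are what allow a radix-2 butterfly to produce two outputs, $X(k)$ and $X(k+N/2)$, from a single complex multiplication.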



Fig 6: FFT used in MIMO-OFDM system

Fig 6 depicts the resource utilization of the FFT module. The Fast Fourier Transform core in Xilinx ISE [11] implements the Cooley-Tukey FFT algorithm, a computationally efficient method for calculating the Discrete Fourier Transform (DFT). Cooley and Tukey proposed the algorithm for fast computation of the DFT based on the choice of N. The computation is simplest when $N=2^m$. In this case the input is divided into two sequences of N/2 points each.
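The divide-by-two step can be sketched in Python as a recursive radix-2 decimation-in-time FFT. This is a textbook software formulation given for illustration only, with the assumed twiddle factor $W = e^{-j2\pi/N}$; it is not the Xilinx core or the proposed hardware, and the name `fft_radix2` is hypothetical.

```python
import cmath

def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # N/2-point DFT of the even-indexed samples
    odd = fft_radix2(x[1::2])   # N/2-point DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor W^k applied to the odd half
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t           # X(k)
        out[k + n // 2] = even[k] - t  # X(k + N/2), using W^(k+N/2) = -W^k
    return out

# Example: 8-point FFT of a short real sequence padded with zeros
spectrum = fft_radix2([1, 2, 3, 4, 0, 0, 0, 0])
```

Each level of recursion halves the problem, so an $N = 2^m$ point transform needs $m$ butterfly stages, which is the property the pipelined architectures in this paper map onto hardware stages.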

To reduce latency, proper reordering of the input signals is implemented in the FPGA module. Parallelism also plays an important role in achieving speed.

Parallel extensions [12] have been proposed to reduce latency in the SDF architecture, but because of some limitations the MDC is more widely used in OFDM-based MIMO channels. Twiddle factor generation [12] plays a vital role in FFT computation. In pipelined architectures, look-up tables and recursive multiplication are used; recursive multiplication reduces the number of memory elements. In the proposed architecture a recursive multiplier is used in the check node unit of the LDPC block to reduce the number of memory elements. The MDC architecture uses no feedback, so the feedback delay is eliminated. Fig 7 shows the butterfly used in the MDC architecture. In this butterfly a delay is introduced for some samples, but they are processed in the same clock cycle.
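The two twiddle-generation options mentioned above can be contrasted with a short Python sketch. It assumes the standard twiddle factor $W^k = e^{-j2\pi k/N}$ and floating-point arithmetic; the function names are hypothetical, and a hardware implementation would use fixed-point values.

```python
import cmath

def twiddles_lut(n):
    """Look-up-table style: all N/2 twiddle factors are stored explicitly."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

def twiddles_recursive(n):
    """Recursive style: store only W^1 and form W^(k+1) = W^k * W^1."""
    step = cmath.exp(-2j * cmath.pi / n)  # the single stored constant
    out, w = [], 1 + 0j                   # W^0 = 1
    for _ in range(n // 2):
        out.append(w)
        w *= step                         # one complex multiply per new factor
    return out

lut = twiddles_lut(64)
rec = twiddles_recursive(64)
max_error = max(abs(a - b) for a, b in zip(lut, rec))
```

The recursive form trades memory for one extra multiplication per factor. Rounding error accumulates along the recursion, which is negligible in this floating-point sketch but would need checking at the word lengths used in hardware.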



Fig 7: Butterfly architecture without feedback line

Fig 8: MDC based 8 Point DFT

In fig 8 the input data is fed to Butterfly I; in the second stage the output of Butterfly I is fed to Butterfly II; in the third stage the output of Butterfly II is fed to Butterfly III, which produces the output. An efficient MDC architecture with an input reordering circuit for the OFDM channel is proposed in [13]; it also uses the Cooley-Tukey algorithm to compute the FFT and is a radix-2 MDC architecture for a 64-point FFT. However, its multiplier unit is not optimized, which increases complexity. We introduce the flexible multiplier designed in [14] to reduce this complexity. In the FFT module, the store stage has a convolutional encoder for error correction, along with a puncturer and an interleaver. We use the 256-QAM modulation scheme. The proposed FFT is implemented in the FFT stage shown in fig 9.
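The feed-forward flow through the three butterfly stages can be modelled behaviourally in Python. The sketch below assumes a radix-2 decimation-in-time schedule with bit-reversed input reordering; it does not model the delay elements, commutators or clock timing of an MDC pipeline, and the placement of twiddle factors in the proposed hardware may differ. All names are hypothetical.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def butterfly_stage(data, span):
    """One radix-2 stage: each butterfly pairs samples `span` apart."""
    out = list(data)
    for start in range(0, len(data), 2 * span):
        for j in range(span):
            w = cmath.exp(-2j * cmath.pi * j / (2 * span))  # stage twiddle
            a = data[start + j]
            b = data[start + j + span] * w
            out[start + j] = a + b
            out[start + j + span] = a - b
    return out

def fft8_three_stage(x):
    """8-point FFT as three stages, each fed only by the previous one."""
    if len(x) != 8:
        raise ValueError("this sketch handles exactly 8 points")
    reordered = [x[bit_reverse(i, 3)] for i in range(8)]  # input reordering
    s1 = butterfly_stage(reordered, 1)  # Butterfly I
    s2 = butterfly_stage(s1, 2)         # Butterfly II
    s3 = butterfly_stage(s2, 4)         # Butterfly III
    return s3

result = fft8_three_stage([1, 2, 3, 4, 5, 6, 7, 8])
```

Because each stage reads only the previous stage's output, the three stages can work on three different data blocks at once in hardware, which is the source of the throughput gain of a pipelined design.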



Fig 9: Proposed Pipelined FFT module

Fig 10: Internal Architecture of Butterfly I

Fig 11: Internal Architecture of Butterfly II

The internal structures of Butterfly I and Butterfly II are shown in fig 10 and fig 11. The real and imaginary parts computed in stage 1 are fed to stage 2, and the twiddle factors are calculated.
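How a butterfly operates on separate real and imaginary datapaths can be sketched as follows. The sketch assumes a generic radix-2 butterfly that outputs $a + wb$ and $a - wb$, with the complex multiplication expanded into four real multiplications; the internal structure of Butterfly I and Butterfly II in figs 10 and 11 may differ, and the function name is hypothetical.

```python
def butterfly_real_imag(ar, ai, br, bi, wr, wi):
    """Radix-2 butterfly on separate real/imaginary datapaths.

    Inputs are a = ar + j*ai, b = br + j*bi and twiddle w = wr + j*wi.
    Returns ((re, im) of a + w*b, (re, im) of a - w*b).
    """
    # Complex product w*b using four real multipliers and two adders
    prod_re = br * wr - bi * wi
    prod_im = br * wi + bi * wr
    top = (ar + prod_re, ai + prod_im)      # adder path
    bottom = (ar - prod_re, ai - prod_im)   # subtractor path
    return top, bottom

# Example with twiddle w = -j
top, bottom = butterfly_real_imag(1.0, 2.0, 3.0, -1.0, 0.0, -1.0)
```

When the twiddle factor is 1, -1, j or -j the multiplications reduce to sign changes and swaps, which is why the early stages of a small FFT can be built from adders and subtractors alone.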

## V. PERFORMANCE COMPARISON OF FFT

The proposed RAPS based MIMO-OFDM system consists of several functional modules, among which the FFT plays a major role. The FFT in the proposed RAPS based MIMO-OFDM system, which uses an LDPC coder, is compared with the existing R2MDC FFT in a MIMO-OFDM system [9]. The existing R2MDC FFT uses 198 slice registers and 152 slice LUTs. The FFT in the RAPS based MIMO-OFDM design on a Virtex-7 (XC7VX330T-3FFG1157) FPGA uses 179 slice registers, 123 slice LUTs and 137 LUT-FF pairs. The performance comparison is summarized in Table 1, which shows that the hardware utilization of the FFT in the RAPS based MIMO-OFDM system is lower than that of the existing R2MDC FFT.

Similarly, the power consumption of the existing R2MDC FFT in the MIMO-OFDM system is 1.175 W, whereas that of the FFT in the RAPS based MIMO-OFDM system is 0.143 W. As Table 1 shows, the proposed FFT consumes 87.8298% less power than the existing R2MDC FFT.

#### TABLE I

#### PERFORMANCE COMPARISON OF FFT

| Performance metrics     | FFT in RAPS based MIMO-OFDM | R2MDC FFT in MIMO-OFDM [9] |
|-------------------------|-----------------------------|----------------------------|
| Slice Registers         | 179                         | 198                        |
| Slice LUTs              | 123                         | 152                        |
| LUT and FF pairs        | 137                         | -                          |
| Power consumption (W)   | 0.143                       | 1.175                      |
| Maximum frequency (MHz) | 596.730                     | -                          |
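The reductions implied by the table can be cross-checked with a few lines of Python. The register and LUT percentages are derived here from the tabulated counts and are not quoted in the text. The cycle count further assumes that the FFT is clocked at the reported maximum frequency and must finish inside the 4 µs window mentioned in Section IV, which the paper does not state explicitly.

```python
def percent_reduction(existing, proposed):
    """Relative saving of the proposed design over the existing one, in percent."""
    return (existing - proposed) / existing * 100.0

power_saving = percent_reduction(1.175, 0.143)   # power consumption in W
register_saving = percent_reduction(198, 179)    # slice registers
lut_saving = percent_reduction(152, 123)         # slice LUTs

# Whole clock cycles that fit in a 4 microsecond window at 596.730 MHz
# (assumption: the design runs at its reported maximum frequency)
cycles_in_window = int(596.730e6 * 4e-6)

print(f"power saving:    {power_saving:.4f} %")
print(f"register saving: {register_saving:.2f} %")
print(f"LUT saving:      {lut_saving:.2f} %")
print(f"cycles in 4 us:  {cycles_in_window}")
```

The power figure reproduces the 87.8298% quoted in the text. The area savings are smaller, so the main benefit reported for the proposed design is in power.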

## VI. SIMULATION RESULTS

The proposed architecture is simulated, and the resulting test bench waveform is shown in fig 16.



Fig 12: RTL Schematic of FFT

1) Stage 1

Input: in1, in2, in3, in4, in5, in6, in7, in8

Output: stage1\_1, stage1\_2, stage1\_3, stage1\_4, stage1\_5, stage1\_6, stage1\_7, stage1\_8

2) Stage 2

Input: stage1\_1, stage1\_2, stage1\_3, stage1\_4, stage1\_5, stage1\_6, stage1\_7, stage1\_8

Output: stage2\_1, stage2\_2, stage2\_3, stage2\_4, stage2\_5, stage2\_6, stage2\_7, stage2\_8

3) Stage 3

Input: stage2\_1, stage2\_2, stage2\_3, stage2\_4, stage2\_5, stage2\_6, stage2\_7, stage2\_8

Output: stage3\_1, stage3\_2, stage3\_3, stage3\_4, stage3\_5, stage3\_6, stage3\_7, stage3\_8

The RTL schematic of the FFT module is shown in figure 12 and figure 13. The delay is minimized by grouping the adders and subtractors. In stage 1 the four adders (A1, A2, A3, A4) are grouped together and the four subtractors (S1, S2, S3, S4) are grouped together. In stage 2 adders A5 and A7 are grouped together, as are adders A6 and A8; similarly, subtractors S5 and S7 are grouped together, as are subtractors S6 and S8. In stage 3 the four adders (A9, A10, A11, A12) are grouped together and the four subtractors (S9, S10, S11, S12) are grouped together. Each adder/subtractor has 8 internal blocks (c\_0, c\_1, c\_2, c\_3, c\_4, c\_5, c\_6, c\_7). The adders and subtractors are designed with a pipelined architecture.



Fig 13: Internal architecture of 3-stage FFT

The RTL schematic of the internal architecture of the butterfly computation is shown in fig 15.



Fig 14: Comparison between proposed FFT architecture and Existing R2MDC FFT architecture

Fig 14 shows that the proposed architecture uses less hardware than the existing R2MDC architecture.

Fig 15: RTL schematic of Internal Architecture of Butterfly structure

#### International Journal of Engineering Trends and Technology (IJETT) – Volume 68 Issue 9 - Sep 2020

[Waveform figure: time axis from 0 ps to 800,000 ps; traces for the outputs stage3\_1[14:0] to stage3\_8[14:0], the clock, and the inputs in1[7:0] to in8[7:0]. The inputs switch from 00000000 to in1 = 00100100, in2 = 10000010, in3 = 00001011, in4 = 01100110, in5 = 00010001, in6 = 10010010, in7 = 01101011, in8 = 00011001.]

Fig 16: Test Bench Waveform of FFT module

## VII. CONCLUSION

This paper studied various FFT architectures and proposed an area-efficient, low-power FFT. The proposed architecture is implemented in a MIMO-OFDM channel and uses less power and area than the existing module. This FFT module reduces latency and area in the MIMO-OFDM communication channel. Finally, the simulation results show that the proposed FFT based MIMO-OFDM system improves on existing techniques in hardware utilization, power consumption and maximum operating frequency.

## REFERENCES

- [1] J. Zhang, Y. Liu, H. Rashvand, P. Deng, G. Xie and J. Mao, "Taylor approximation pricing for K-user multiple-input multiple-output (MIMO) interference channels", IET Communications, vol. 6, no. 17, pp. 2957-2967, 2012.
- [2] X. Kuai, L. Chen, X. Yuan and A. Liu, "Structured Turbo Compressed Sensing for Downlink Massive MIMO-OFDM Channel Estimation," in IEEE Transactions on Wireless Communications, vol. 18, no. 8, pp. 3813-3826, Aug. 2019, doi: 10.1109/TWC.2019.2917905.
- [3] L. P. Thakare and A. Y. Deshmukh, "Area Efficient FFT/IFFT Processor Design for MIMO OFDM System in Wireless Communication," 2015 7th International Conference on Emerging Trends in Engineering & Technology (ICETET), Kobe, 2015, pp. 10-13, doi: 10.1109/ICETET.2015.25.
- [4] M. Mahdavi, O. Edfors, V. Öwall and L. Liu, "A Low Latency FFT/IFFT Architecture for Massive MIMO Systems Utilizing OFDM Guard Bands," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 66, no. 7, pp. 2763-2774, July 2019, doi: 10.1109/TCSI.2019.2896042.
- [5] A. X. Glittas, M. Sellathurai and G. Lakshminarayanan, "A Normal I/O Order Radix-2 FFT Architecture to Process Twin Data Streams for MIMO," in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 6, pp. 2402-2406, June 2016, doi: 10.1109/TVLSI.2015.2504391.
- [6] M. Dali, R. M. Gibson, A. Amira, A. Guessoum and N. Ramzan, "An efficient MIMO-OFDM radix-2 Single-Path Delay Feedback FFT implementation on FPGA," 2015 NASA/ESA Conference on Adaptive Hardware and Systems (AHS), Montreal, QC, 2015, pp. 1-7, doi: 10.1109/AHS.2015.7231171.
- [7] G. R. Locharla, K. K. Mahapatra and S. Ari, "Variable length mixed radix MDC FFT/IFFT processor for MIMO-OFDM application," in IET Computers & Digital Techniques, vol. 12, no. 1, pp. 9-19, 2018, doi: 10.1049/iet-cdt.2017.0018.

- [8] N. Ganesamoorthy, S. Deivasigamani and K. Balasubadra, "An Efficient Multi-Path Delay Commutator Architecture," International Journal of Computer Applications, vol. 98, pp. 21-23, 2014, doi: 10.5120/17275-7704.
- [9] N. Kirubanandasarathy and K. Karthikeyan, "Design of pipeline R2MDC FFT for implementation of MIMO OFDM transceivers using FPGA," Telecommunication Systems, vol. 63, no. 3, pp. 465-471, 2016.
- [10] J. S. Park and T. Ogunfunmi, "Efficient FPGA-Based Implementations of MIMO-OFDM Physical Layer," Circuits Syst Signal Process, vol. 31, pp. 1487–1511, 2012, https://doi.org/10.1007/s00034-012-9411-4
- [11] A. Ahashie, E. Arkoh and F. Anokye, "Minimizing FFT Hardware Resources on FPGA," 2017, doi: 10.13140/RG.2.2.14508.10881.
- [12] B. W. Dickson, "Parallel Extensions to Single-Path Delay-Feedback FFT Architectures," 2014.
- [13] S. Mookherjee, L. Debrunner and V. Debrunner, "A low power radix-2 FFT accelerator for FPGA," 2015, doi: 10.1109/ACSSC.2015.7421167.
- [14] C. Lakshmi and P. Jesu Jayarin, "Reconfigurable Design of Low Power Hybrid Crypto Processor using Signcryption for Wireless Networks," International Journal of Advanced Trends in Computer Science and Engineering, vol. 9, no. 3, pp. 4030-4036, May-June 2020.