ISSN: 1992-8645

<u>www.jatit.org</u>



# REVIEW ARTICLE: EFFICIENT MULTIPLIER ARCHITECTURE IN VLSI DESIGN

# M. JEEVITHA<sup>1</sup>, R.MUTHAIAH<sup>2</sup>, P.SWAMINATHAN<sup>3</sup>

<sup>1</sup> P.G. Scholar, School of Computing, SASTRA University, Tamilnadu, INDIA

<sup>2</sup>Assoc. Prof., School of Computing, SASTRA University, Tamilnadu, INDIA

<sup>3</sup>Dean, School of Computing, SASTRA University, Tamilnadu, INDIA

E-mail: <sup>1</sup>m.jeevitha.cse@gmail.com, <sup>2</sup>sjamuthaiah@core.sastra.edu, <sup>3</sup>deanpsw@sastra.edu

#### ABSTRACT

The design of high-speed multipliers with low power consumption and a regular layout is of substantial research interest. Multipliers are compared on the basis of three performance parameters: area, speed, and power consumption and dissipation. Multipliers are an important component in DSP applications such as filters, so a low-power multiplier is a necessity for the design and implementation of efficient power-aware devices. In this paper we analyze and review a few multiplier architectures based on their working principle, speed, and power efficiency.

**Keywords:** Multiplier, Modified Booth-Encoding, Carry Save Array Multiplier, Partial Products, Wallace Tree Multiplier.

## 1. INTRODUCTION

Multiplication is a basic arithmetic operation that is present in many parts of a digital computer, especially in signal processing systems such as graphics and computation systems. It requires more hardware resources and processing time than addition and subtraction. As VLSI technologies continue to develop, the need for process-independent chip design tools is growing [1], [2], [3]. Using a hardware description language (HDL) for integrated circuit design helps keep the design independent of process design rules at the early stages of design [4]. Multipliers are a basic component of single-chip digital information processors [5], [6]. With advances in technology, many researchers have tried, and are still trying, to design multipliers that offer high speed, low power consumption, layout regularity (and hence less area), or some combination of these. The multiplier compilers described in [7] and [8] generate parameterizable layouts for MOS technology, making them suitable for various high-speed, low-power, and compact VLSI implementations. However, area and speed are conflicting constraints, so improving speed generally results in a larger area. The number of gates per chip area keeps increasing, while the gate switching energy does not decrease at the same rate. Power dissipation therefore rises, and heat removal becomes difficult and expensive. The dynamic power of CMOS circuits is thus becoming a major concern in device design. Multiplier structures can be classified as serial multipliers, parallel multipliers, array multipliers, tree multipliers, and so on. They are categorized according to their architecture, their application, and the way they generate and sum partial products to produce the final result.

## 2. WALLACE TREE MULTIPLIER

For real-time signal processing, a high-speed, high-throughput multiplier-accumulator (MAC) is key to achieving high performance in a digital signal processing system. The main consideration in MAC design is to increase its speed, and the well-known Wallace tree multiplier achieves this. Wallace introduced a parallel multiplier architecture [9], [10] to achieve high speed. The Wallace tree algorithm reduces the number of sequential adding stages. The speed advantage becomes more pronounced for multipliers with operands wider than 16 bits. The Wallace tree is constructed using carry-save adders to reduce an N-row bit product matrix to an equivalent two-row matrix, which is then fed into a carry-propagating adder that sums the two rows to produce the product. The carry-save adders are conventional full adders [11] whose carries are not connected, so that three input bits are taken in and two output bits are produced. Instead of plain carry-save adders, 4:2 compressors and 3:2 compressors built from full adders and half adders can be used in the reduction phase.

30th April 2012. Vol. 38 No.2

© 2005 - 2012 JATIT & LLS. All rights reserved.

E-ISSN: 1817-3195

The Wallace tree can be constructed in several ways. One way is to consider all the bits in a column and produce two output bits for that column. Another is to take the first four bits of a column and produce two bits using 4:2 compressors. A third is to take the first three bits of a column and reduce them using 3:2 compressors. The Wallace tree multiplier has an irregular structure [12]. Many different adder tree structures have been used to reduce the computation time of multipliers. The computation time of the Wallace tree achieves the lower bound of $O(\log_{3/2} N)$. For an n-bit Wallace tree multiplier, the number of steps needed is $\log_{3/2}(n/2) + 1$. Wallace tree multipliers have significant complexity and timing advantages over traditional matrix multipliers.
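The column-wise reduction with 3:2 compressors described above can be sketched in Python as follows. This is a minimal behavioural model, not a gate-level design: the function names, the choice to pass leftover bits of a column through unchanged, and the `levels` counter are assumptions made for illustration.

```python
def full_adder(a, b, c):
    """3:2 compressor: three bits of equal weight in, (sum, carry) out."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def wallace_multiply(x, y, n):
    """Multiply two unsigned n-bit integers by column-wise carry-save reduction.

    Returns (product, levels), where levels is the number of reduction
    stages needed before only two rows remain.
    """
    # columns[k] holds every partial-product bit of weight 2**k
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append(((x >> i) & 1) & ((y >> j) & 1))

    levels = 0
    while any(len(col) > 2 for col in columns):
        # one extra column so a carry out of the top column has a place to go
        nxt = [[] for _ in range(len(columns) + 1)]
        for k, col in enumerate(columns):
            i = 0
            while len(col) - i >= 3:
                s, c = full_adder(col[i], col[i + 1], col[i + 2])
                nxt[k].append(s)        # sum keeps the same weight
                nxt[k + 1].append(c)    # carry moves one column up
                i += 3
            nxt[k].extend(col[i:])      # leftover bits pass through unchanged
        columns = nxt
        levels += 1

    # Two rows remain; a carry-propagating adder produces the final product.
    row0 = sum(col[0] << k for k, col in enumerate(columns) if len(col) >= 1)
    row1 = sum(col[1] << k for k, col in enumerate(columns) if len(col) == 2)
    return row0 + row1, levels


product, levels = wallace_multiply(0x2AC9, 0x006A, 16)
print(hex(product), levels)
```

Every full adder preserves the weighted sum of its column bits, so the two remaining rows always add up to the exact product; only the number of stages depends on how the bits are grouped.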

The main disadvantage of Wallace tree multipliers is their irregular structure, which makes layout difficult; in addition, all adder blocks are active regardless of multiplicand size.

Fig.1 Wallace tree multiplier

The delay is $O(\log n)$. The Wallace tree multiplier has irregular interconnections, which occupy more area on the wafer [13], [14], [15] and require more cell interconnection wiring. Since interconnection plays an important role in IC technologies, this factor makes the Wallace tree inappropriate for certain circuits [16], [17]. The main advantage of this multiplier is its logarithmic circuit delay. In many FPGAs, Wallace trees do not provide any advantage over ripple adder trees: because of their irregular routing, they may actually be slower and are certainly more difficult to route. The adder structure also grows as the operand width increases.

## 3. MODIFIED BOOTH MULTIPLIER

Booth encoding is a method of reducing the number of partial products required to produce the multiplication result. To achieve high-speed multiplication, algorithms that use parallel counters, such as the modified Booth algorithm, have been proposed and used. This type of fast multiplier operates much faster than an array multiplier for longer operands because its computation time is proportional to the logarithm of the operand word length. By recoding the numbers to be multiplied, the modified Booth multiplier allows smaller, faster multiplication circuits. Booth recoding halves the number of partial products [18]. The reduction in the number of partial products depends on how many bits are recoded and on how the bits are grouped.



The multiplier bits are grouped three at a time, starting from the LSB. The first group contains only two multiplier bits, with an implied zero appended to the right of the LSB as its third bit. Each subsequent group contains three bits, one of which overlaps the previous group.

Fig.3 Grouping of bits from the multiplier term

Each group of bits is recoded into one of five signed digits: -2, -1, 0, +1, or +2.

| Block | Recoded digit | Operation (multiple of multiplicand) |
|-------|---------------|--------------------------------------|
| 000   | 0             | 0                                    |
| 001   | +1            | +1                                   |
| 010   | +1            | +1                                   |
| 011   | +2            | +2                                   |
| 100   | -2            | -2                                   |
| 101   | -1            | -1                                   |
| 110   | -1            | -1                                   |
| 111   | 0             | 0                                    |

Table 1. Recoding Table
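The grouping and the recoding table can be expressed compactly in code. The following Python sketch is illustrative only: the function name `booth_radix4_digits` and its interface are assumptions, and the multiplier is treated as an n-bit two's-complement value with n even.

```python
# Recoding table: 3-bit block -> signed digit, as in Table 1.
BOOTH_TABLE = {
    0b000: 0, 0b001: +1, 0b010: +1, 0b011: +2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_radix4_digits(multiplier, n):
    """Recode an n-bit two's-complement multiplier (n even) into n/2 digits.

    The digits are returned least-significant group first; digit k has
    weight 4**k.
    """
    # Append the implied 0 to the right of the LSB.
    extended = (multiplier & ((1 << n) - 1)) << 1
    # Slide a 3-bit window two positions at a time, so that adjacent
    # groups overlap by one bit.
    return [BOOTH_TABLE[(extended >> i) & 0b111] for i in range(0, n, 2)]


digits = booth_radix4_digits(0x006A, 16)
print(digits)
# The weighted sum of the digits reproduces the multiplier value.
print(sum(d * 4**k for k, d in enumerate(digits)))
```

A 16-bit multiplier yields 8 digits, which is where the halving of the partial-product count comes from.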

The advantage of this method is that grouping halves the number of partial products relative to the word length of the multiplier term. The main disadvantage of the modified Booth multiplier is the complexity of the circuit that generates the partial products.

| Term | Binary value |
|------|--------------|
| Multiplicand (2AC9) | 0010101011001001 |
| Multiplier (006A), with a 0 appended after the LSB | 00000000011010100 |
| Recoded digits, most significant group first | +0 +0 +0 +0 +2 -1 -1 -2 |
| PP0 | 11010101001101110 |
| PP1 | 11101010100110111 |
| PP2 | 11101010100110111 |
| PP3 | 00101010110010010 |
| PP4 | 00000000000000000 |
| PP5 | 00000000000000000 |
| PP6 | 00000000000000000 |
| PP7 | 00000000000000000 |
| Product (11B73A) | 00000000000100011011011100111010 |

Each partial product PPk is shown as a 17-bit two's-complement value and is shifted left by 2k bit positions before the partial products are summed.
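The worked example above (2AC9 × 006A) can be reproduced with a short Python sketch. The helper name `booth_partial_products` is hypothetical, the multiplicand is treated as a non-negative integer, and the 17-bit display width is chosen only to match the example.

```python
def booth_partial_products(multiplicand, multiplier, n):
    """Return (digit, partial product) pairs for radix-4 Booth multiplication.

    Pair k corresponds to PPk and has weight 4**k, i.e. it is shifted
    left by 2*k bits when the partial products are summed.
    """
    table = {0b000: 0, 0b001: 1, 0b010: 1, 0b011: 2,
             0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0}
    extended = (multiplier & ((1 << n) - 1)) << 1   # implied 0 after the LSB
    digits = [table[(extended >> i) & 0b111] for i in range(0, n, 2)]
    return [(d, d * multiplicand) for d in digits]


pps = booth_partial_products(0x2AC9, 0x006A, 16)
for k, (digit, pp) in enumerate(pps):
    # Mask to 17 bits to display negative values in two's complement.
    print(f"PP{k} (digit {digit:+d}): {pp & 0x1FFFF:017b}")

product = sum(pp << (2 * k) for k, (_, pp) in enumerate(pps))
print(hex(product))  # 0x11b73a
```

Only four of the eight partial products are non-zero here, because the upper byte of the multiplier is zero.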



## 4. MIXED STYLE MULTIPLIER

Low-power VLSI design is necessary to keep pace with Moore's law and to produce consumer electronics with low power consumption and long battery life. Dynamic power is dominant in technologies above 0.1 µm, while in smaller technologies leakage power is more important. Dynamic power dissipation occurs as a result of charging the load capacitances in a circuit. The dynamic power consumption of a CMOS IC is calculated by adding the transient power consumption (PT) and the capacitive-load power consumption (PL). Over the past few years, different architectural optimizations have been applied to minimize dynamic power dissipation in arithmetic circuits, and especially in digital multipliers [19], [20].
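As a rough guide to why reducing switching activity saves power, the capacitive-load term is commonly approximated by the standard CMOS switching-power expression below. The symbols $\alpha$ (switching activity factor), $C_L$ (load capacitance), $V_{DD}$ (supply voltage) and $f$ (clock frequency) are notation assumed here, not taken from the source.

$$P_{dyn} = P_T + P_L, \qquad P_L \approx \alpha \, C_L \, V_{DD}^2 \, f$$

Under this approximation, a technique that lowers the switching activity $\alpha$ lowers the capacitive-load power proportionally.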



Fig.5 The FAB Cell

The Mixed Style Multiplier consists of two parts: an array part and a tree part. The array part is a carry-save array multiplier, which is considered to give good results when combined with the bypassing technique. Low-power combinational circuits can be designed by bypassing logic blocks when their function is not required. Bypassing is carried out using components with low delay and area overhead, such as transmission gates. The technique offers large power savings because it avoids switching activity in the circuit, which in turn saves dynamic power. Bypassing is applied in the array part of the multiplier; array multipliers are chosen because their regular interconnection makes it easy to skip unneeded blocks. Array multipliers have linear circuit delay [21].

Fig.6 The 4x4 Carry-Save Array multiplier with bypass

In the carry-save array multiplier, the operands $X = (x_{n-1}, \ldots, x_1, x_0)$ and $Y = (y_{m-1}, \ldots, y_1, y_0)$ are fed into the FAB cells. When the multiplier bit y = 0, the transmission gates in the FAB cell lock the inputs of the full adder to prevent any switching, and the multiplexer propagates the sum input (sin) to the sum output (sout). When y = 1, the full adder operates and produces sout. The second part is a tree multiplier, which has the advantage of logarithmic circuit delay and is considered to produce results faster [21].
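The row-bypass behaviour can be illustrated with a behavioural Python sketch. It is not a gate-level model of the FAB cell: the function name `csa_bypass_multiply`, the assumption that the carry vector is forwarded unchanged along with the sum vector in a bypassed row, and the count of n active cells per row are all simplifications introduced here.

```python
def csa_bypass_multiply(x, y, n, m):
    """Unsigned multiply of n-bit x by m-bit y in carry-save form with row bypass.

    The running result is held as two vectors (s, c) whose sum is the
    accumulated value. Returns (product, active_cells).
    """
    s, c = 0, 0
    active_cells = 0
    for j in range(m):
        if (y >> j) & 1:
            # y_j = 1: the full adders of row j operate (bitwise 3:2 compression).
            pp = x << j
            s, c = s ^ c ^ pp, ((s & c) | (s & pp) | (c & pp)) << 1
            active_cells += n
        # y_j = 0: adder inputs are held and the row is bypassed, so
        # (s, c) pass to the next row unchanged and no cell switches.
    # The final vector-merging adder combines the sum and carry vectors.
    return s + c, active_cells


product, active = csa_bypass_multiply(0x2AC9, 0x006A, 16, 16)
print(hex(product), active, "of", 16 * 16, "cells active")
```

In this model the number of active cells is proportional to the number of 1 bits in the multiplier, which is the reason an operand with many 0 bits benefits most from bypassing.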



Fig.7 A 32 bit multiplication split in parts

The great timing advantage of the Wallace tree multiplier and the great power advantage of the bypass scheme in the carry-save array multiplier can be combined in a mixed multiplier architecture. Two 32-bit values can be multiplied by splitting each into two 16-bit halves. If the first 32-bit value is (X, Y) and the second is (W, Z), four 32-bit partial products are generated: A = X × Z, B = Y × Z, C = X × W, and D = Y × W. These four partial products are shifted and added together to produce the final 64-bit multiplication result [22]. The main idea behind operand splitting is to use a different multiplier architecture for each partial multiplication. Performance is gained from A = X × Z and C = X × W, while power is saved in B = Y × Z and D = Y × W. If one half of one or both operands usually contains more 0s than 1s, that half should be passed through the bypass array multiplier for greater power savings.
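The shift-and-add recombination of the four partial products can be sketched as follows. The sketch assumes that X and W are the upper 16-bit halves and Y and Z the lower halves, which the text does not state explicitly; the function name `split_multiply_32` is hypothetical, and Python's built-in `*` stands in for whichever 16×16 multiplier architecture handles each product.

```python
def split_multiply_32(p, q):
    """Multiply two unsigned 32-bit values using four 16x16 partial products."""
    mask = 0xFFFF
    X, Y = (p >> 16) & mask, p & mask   # assumed: X = upper half, Y = lower half
    W, Z = (q >> 16) & mask, q & mask   # assumed: W = upper half, Z = lower half

    A = X * Z   # per the text, handled by the fast (tree) multiplier
    B = Y * Z   # per the text, handled by the bypass array multiplier
    C = X * W   # per the text, handled by the fast (tree) multiplier
    D = Y * W   # per the text, handled by the bypass array multiplier

    # p*q = (X*2^16 + Y) * (W*2^16 + Z) = C*2^32 + (A + D)*2^16 + B
    return (C << 32) + ((A + D) << 16) + B


print(hex(split_multiply_32(0x12345678, 0x9ABCDEF0)))
```

The recombination needs only shifts and additions, so the choice of architecture for each 16×16 product does not affect the final summation.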

This multiplier can give good results in power consumption and dynamic power savings, but it is not certain to give good results in other parameters such as time and area. Moreover, the Wallace tree is hard to implement in an FPGA because of its irregular and complex interconnection. Replacing the Wallace tree multiplier with a fast, efficient multiplier that has a more regular interconnection model could give better results than this architecture.

## 5. PROPOSED ARCHITECTURE

The carry-save array multiplier is considered to achieve good dynamic power savings by applying the bypassing technique across its full adders. Low-power components such as transmission gates and multiplexers take over from the full adder when it is not needed, which reduces power consumption and avoids unwanted switching activity, and this in turn reduces dynamic power dissipation. Large power savings can therefore be expected in the carry-save array multiplier: in all cases, the bypass architecture offers power savings ranging from 20% up to almost 60%.

A new architecture can be proposed by combining the bypass carry-save array multiplier with a Modified Booth-Wallace tree multiplier, which may produce results much faster. The Modified Booth-Wallace multiplier is considered to have the best area-delay² product (AD²) and Delay Product (DP) among several multipliers. Using a Modified Booth-Wallace tree multiplier in place of the Wallace tree multiplier may therefore give good, fast results and reduce power dissipation.


## 6. RECOMMENDATION & LIMITATIONS

A multiplier should be chosen that addresses all the parameters: time, area, and power. Changing the combination of multipliers may give better results. Since each multiplier has its own advantages depending on the designer's priorities, it is better to use multipliers in combination than to use a single multiplier for larger operand widths. Therefore, a new multiplier architecture has been proposed that combines two multipliers based on their efficiency.

## 7. CONCLUSION & FUTURE WORK

Considering all the facts about the multipliers above, combinations of multipliers can give good results for operands with a greater number of bits. The dynamic power savings are a main advantage in low-power VLSI design, where long battery life matters. This work can be extended with an analysis of power and area for an ASIC implementation.

## REFERENCES:

- [1] S. M. Aziz, Iftekhar Ahmed, "Easily Testable Array Multiplier Design Using VHDL", Malaysian Journal of Computer Science, Vol. 11 No. 2, December 1998, pp. 1-7.
- [2] D. D. Gajski, N. D. Dutt, C. H. Wu and Y. L. Lin, High-Level Synthesis: Introduction to Chip and System Design, Kluwer Academic Publishers, 1991.

- [3] Z. Navabi, VHDL Analysis and Modeling of Digital Systems, New York, McGraw-Hill, Inc., 1993.
- [4] J. R. Armstrong, Chip Level Modeling with VHDL, Englewood Cliffs, New Jersey, Prentice-Hall International, 1989.
- [5] Frank P. J. M. Welton, Antoine Delaruelle, "A 2-µm CMOS 10-MHz Microprogrammable Signal Processing Core With an On-Chip Multiport Memory Bank", IEEE J. of Solid-State Circuits, Vol. SC-20, No. 3, June 1985, pp. 754-760.
- [6] K. Takeda, F. Ishino, and Y. Ito et al., "A Single-Chip 80-bit Floating Point Processor", IEEE J. of Solid-State Circuits, Vol. SC-20, No. 5, October 1985, pp. 986-991.
- [7] N. F. Benschop, "Layout Compilers for Variable Array Multipliers", in Proc. of Custom Integrated Circuits Conf., May 1983, pp. 336-339.
- [8] K. C. Chu and R. Sharma, "A Technology Independent MOS Multiplier Generator", in 21<sup>st</sup> Design Automation Conf., 1984, pp. 90-97.
- [9] S. Shah, A. J. Al-Khalili, D. Al-Khalili, "Comparison of 32-bit Multipliers for Various Performance Measures", The 12th International Conference on Microelectronics, Tehran, Oct. 31 - Nov. 2, 2000.
- [10] C. S. Wallace, "A suggestion for a fast multiplier", IEEE Trans. Electron. Comput., vol. EC-13, pp. 14-17, Feb. 1964.
- [11] L. Dadda, "Some Schemes for Parallel Multipliers", in Proc. Alta Frequenza, Vol. 19, pp. 349-356, March 1965.
- [12] Mahmoud A. Al-Qutayri, Hassan R. Barada and Ahmed Al-Kindi, "Comparison of Multipliers Architectures through Emulation and Handel-C FPGA Implementation", Etisalat University College, Sharjah, UAE.
- [13] P. Meier, R. A. Rutenbar, and L. R. Carley, "Exploring multiplier architecture for low power", in Custom Integrated Circuits Conference. IEEE, 1996, pp. 513-516.
- [14] G. Economakos and K. Anagnostopoulos, "Bit level architectural exploration technique for the design of low power multipliers", in International Symposium on Circuits and Systems. IEEE, 2006.
- [15] T. Sakuta, W. Lee, and P. Balsara, "Delay balanced multipliers for low power/low voltage DSP core", in Symposium on Low Power Electronics. IEEE, 1995, pp. 36-37.

- [16] J. Rabaey, A. Chandrakasan, and B. Nikolic, Digital Integrated Circuits: A Design Perspective, Second Edition, Prentice Hall, 2003.
- [17] A. Bellaouar and M. Elmasry, Low Power Digital VLSI Design: Circuits and Systems. Kluwer Academic Publishers, 1995.
- [18] C. N. Marimuthu, P. Thangaraj, "Low Power High Performance Multiplier", ICGST-PDCS, Volume 8, Issue 1, December 2008.
- [19] M. Karlsson, "A generalized carry-save adder array for digital signal processing", in 4th Nordic Signal Processing Symposium. IEEE, 2000, pp. 287-290.
- [20] S. M. Lee, J. H. Chung, H. S. Yoon, and M. M. O. Lee, "High speed and ultra low power 16x16 MAC design using TG techniques for web-based multimedia system", in Asia South Pacific Design Automation Conference. ACM/IEEE, 2000, pp. 17-18.
- [21] N. Weste and D. Harris, CMOS VLSI Design: A Circuits and Systems Perspective, Third Edition. Addison-Wesley, 2004.
- [22] Dimitris Bekiaris, George Economakos and Kiamal Pekmestzi, "A Mixed Style Multiplier Architecture for Low Dynamic and Leakage Power Dissipation", National Technical University of Athens. IEEE, 2010.