Modified CSKA Application in the Floating Point Adder using Carry Skip Adder Hybrid Structure

Mr. N. Md. Mohasinul Huq¹ | Mr. M. Mahaboob Basha² | Subbamma Kalingiri³

¹Associate Professor, Department of ECE, SVR Engineering College, Nandyal.
²Associate Professor, Department of ECE, SVR Engineering College, Nandyal.
³PG Scholar, Department of ECE, SVR Engineering College, Nandyal.

ABSTRACT
In this paper, we present a carry skip adder (CSKA) structure that has a higher speed yet lower energy consumption compared with the conventional one. The speed enhancement is achieved by applying concatenation and incrementation schemes to improve the efficiency of the conventional CSKA (Conv-CSKA) structure. In addition, instead of utilizing multiplexer logic, the proposed structure makes use of AND-OR-Invert (AOI) and OR-AND-Invert (OAI) compound gates for the skip logic. The structure may be realized with both fixed stage size and variable stage size styles, wherein the latter further improves the speed and energy parameters of the adder. Finally, a hybrid variable latency extension of the proposed structure, which lowers the power consumption without considerably impacting the speed, is presented.

This extension utilizes a modified parallel structure for increasing the slack time, and hence, enabling further voltage reduction. The proposed structures are assessed by comparing their speed, power, and energy parameters with those of other adders using a 45-nm static CMOS technology for a wide range of supply voltages. The results that are obtained using HSPICE simulations reveal, on average, 44% and 38% improvements in the delay and energy, respectively, compared with those of the Conv-CSKA. In addition, the power–delay product was the lowest among the structures considered in this paper, while its energy–delay product was almost the same as that of the Kogge–Stone parallel prefix adder with considerably smaller area and power consumption. Simulations on the proposed hybrid variable latency CSKA reveal reduction in the power consumption compared with the latest works in this field while having a reasonably high speed.

KEYWORDS: Carry skip adder, hybrid structure, modified CSKA, Application in the floating point adder.
minimum gate count. The CBL works as a full adder. Using the modified CSKA, an efficient floating point adder can be optimized. The results also suggest that the CSKA structure is a very good adder for applications where both speed and area consumption are critical. In this paper, given the attractive features of the CSKA structure, the modified CSKA increases the speed considerably while maintaining the low area and delay of the CSKA.

Hence, the contributions of this paper can be summarized as follows.
1) Proposing a modified CSKA structure that combines a parallel prefix network with an optimized RCA to enhance the speed of the adder. The modification allows the use of a simple Ladner-Fischer adder and an optimized RCA network.
2) Proposing a modified variable latency CSKA structure, based on an extension of the suggested CSKA, in which some of the middle stages are replaced with a PPA that is modified in this paper.
3) Using an optimized RCA with a minimum gate count.
4) Applying the modified structure to a floating point adder unit.

### II. Related Work

#### A. Modifying CSKAs for Improving Speed:
Alioto and Palumbo [19] propose a simple strategy for the design of a single-level CSKA. The method is based on the variable stage size (VSS) technique, where the near-optimal numbers of full adders (FAs) per stage are determined from the skip time (the delay of the multiplexer) and the ripple time (the time required by a carry to ripple through an FA). The goal of this method is to decrease the critical path delay by considering a non-integer ratio of the skip time to the ripple time, in contrast to most of the previous works, which considered an integer ratio [17], [20]. In all of the works reviewed so far, the focus was on speed, while the power consumption and area usage of the CSKAs were not considered. Even for speed, the delay of the skip logic, which is based on multiplexers and forms a large part of the adder critical path delay [19], has not been reduced.
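The dependence of the critical path on the stage sizes, the skip time, and the ripple time can be illustrated with a small delay estimate. The following is a minimal sketch under a simplified model that is our assumption, not the method of [19]: a carry ripples through the stage where it is generated, skips every stage in between, and ripples through the stage where it is absorbed. The function name, the parameters `t_ripple` and `t_skip`, and the example stage sizes are hypothetical.

```python
from itertools import combinations


def cska_worst_case_delay(stage_sizes, t_ripple=1.0, t_skip=1.0):
    """Estimate the worst-case carry delay of a single-level carry skip adder.

    Simplified model (an assumption): the slowest carry starts at the bottom
    of stage i, ripples through all of stage i, skips every stage strictly
    between i and j, and ripples through all of stage j.
    """
    # A carry that never leaves its own stage only ripples.
    worst = max(stage_sizes) * t_ripple
    for i, j in combinations(range(len(stage_sizes)), 2):
        skipped_stages = j - i - 1
        delay = (stage_sizes[i] + stage_sizes[j]) * t_ripple
        delay += skipped_stages * t_skip
        worst = max(worst, delay)
    return worst


# Two hypothetical 32-bit partitions: fixed stage size vs. variable stage size.
fss_sizes = [4] * 8
vss_sizes = [1, 2, 3, 4, 5, 6, 5, 3, 2, 1]

print("FSS delay:", cska_worst_case_delay(fss_sizes))
print("VSS delay:", cska_worst_case_delay(vss_sizes))
# A non-integer skip-to-ripple ratio changes which partition is preferable.
print("VSS delay, t_skip = 1.5:", cska_worst_case_delay(vss_sizes, t_skip=1.5))
```

Under this model, small stages at both ends and larger stages in the middle shorten the worst path, which is the intuition behind variable stage sizing.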

#### B. Improving Efficiency of Adders at Low Supply Voltages:
To improve the performance of adder structures at low supply voltage levels, several methods have been proposed. An adaptive clock stretching operation has been suggested. The method is based on the observation that the critical paths in adder units are rarely activated. Therefore, the slack time between the critical paths and the off-critical paths may be used to reduce the supply voltage. Notice that the voltage reduction must not increase the delays of the noncritical timing paths beyond the clock period, which allows the original clock frequency to be kept at a reduced supply voltage level. When the critical timing paths in the adder are activated, the structure uses two clock cycles to complete the operation. In this way, the power consumption is reduced considerably at the cost of a rather small throughput degradation.
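The observation that long carry paths are rarely activated can be checked with a short experiment. The sketch below is an illustration under our own assumptions, not the mechanism of any cited design: it measures the longest carry chain for random operands and counts how often it exceeds a hypothetical limit, `max_single_cycle_chain`, standing in for the longest chain that still fits in one clock period at the reduced supply voltage.

```python
import random


def longest_carry_chain(a, b, width):
    """Longest run of bit positions that a single carry travels through."""
    longest = 0
    run = 0  # length of the carry chain currently alive, 0 if there is none
    for i in range(width):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        if ai & bi:        # generate: a new carry starts here
            run = 1
        elif ai ^ bi:      # propagate: an incoming carry (if any) travels on
            run = run + 1 if run else 0
        else:              # kill: any incoming carry is absorbed
            run = 0
        longest = max(longest, run)
    return longest


def two_cycle_fraction(width, max_single_cycle_chain, trials=10_000, seed=0):
    """Fraction of random additions that would need a second clock cycle."""
    rng = random.Random(seed)
    slow = 0
    for _ in range(trials):
        a, b = rng.getrandbits(width), rng.getrandbits(width)
        if longest_carry_chain(a, b, width) > max_single_cycle_chain:
            slow += 1
    return slow / trials


for limit in (4, 8, 16):
    fraction = two_cycle_fraction(32, limit)
    # Average latency in cycles is 1 for fast additions and 2 for slow ones.
    print(f"limit={limit:2d}  two-cycle fraction={fraction:.4f}  "
          f"average cycles={1 + fraction:.4f}")
```

Uniformly random operands are an assumption as well; real workloads may activate long carry chains more or less often.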

The efficiency of this method for reducing the power consumption of the RCA structure has been demonstrated. The CSLA structure in [28] was enhanced to use the adaptive clock stretching operation, and the enhanced structure was called cascade CSLA (C2SLA). Compared with the common CSLA structure, C2SLA uses more RCA blocks of different sizes. Since the slack time between the critical timing paths and the longest off-critical path was small, the supply voltage scaling, and hence the power reduction, was limited. Finally, the use of a hybrid structure to improve the effectiveness of the adaptive clock stretching operation has been investigated.

In that hybrid structure, the KSA has been used in the middle part of the C2SLA, a combination that increases the positive slack time. However, the C2SLA and its hybrid version are not good candidates for low-power ALUs. This is because the logic duplication in this type of adder keeps the power consumption and the PDP high even at low supply voltages. The CSKA may be implemented using a fixed stage size (FSS) or a variable stage size (VSS), where the highest speed may be obtained for the VSS structure. Here, the stage size is the same as the RCA block size. In Sections III-A and III-B, these two different implementations of the CSKA adder are described in more detail.
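As a functional reference for the FSS and VSS styles, a conventional CSKA can be modeled at the bit level with one RCA block and one 2:1 multiplexer per stage. The following is a minimal behavioral sketch, not a gate-level or timing-accurate description; the function names and the stage-size lists are hypothetical.

```python
def ripple_carry_block(a_bits, b_bits, carry_in):
    """Add two little-endian bit lists with a chain of full adders."""
    sum_bits = []
    carry = carry_in
    for ai, bi in zip(a_bits, b_bits):
        sum_bits.append(ai ^ bi ^ carry)
        carry = (ai & bi) | (carry & (ai ^ bi))
    return sum_bits, carry


def conv_cska_add(a, b, stage_sizes, carry_in=0):
    """Behavioral model of a conventional carry skip adder.

    stage_sizes lists the RCA block size of each stage, least significant
    stage first. Equal sizes give an FSS adder, unequal sizes a VSS adder.
    Returns (sum, carry_out).
    """
    result = 0
    carry = carry_in
    pos = 0
    for size in stage_sizes:
        a_bits = [(a >> (pos + k)) & 1 for k in range(size)]
        b_bits = [(b >> (pos + k)) & 1 for k in range(size)]
        sum_bits, ripple_carry = ripple_carry_block(a_bits, b_bits, carry)
        # The block propagates its carry-in only if every bit propagates.
        block_propagate = all(x ^ y for x, y in zip(a_bits, b_bits))
        # 2:1 multiplexer skip logic: bypass the block when it propagates.
        carry = carry if block_propagate else ripple_carry
        for k, s in enumerate(sum_bits):
            result |= s << (pos + k)
        pos += size
    return result, carry


print(conv_cska_add(200, 100, [2, 2, 2, 2]))   # FSS, 8 bits
print(conv_cska_add(200, 100, [1, 2, 3, 2]))   # VSS, 8 bits
```

Both partitions compute the same sum; the choice of stage sizes affects only the delay, which this functional model does not capture.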

### III. Implementation

#### A. Proposed CSKA Structure:

The structure is based on combining the concatenation and the incrementation schemes [13] with the Conv-CSKA structure, and hence, is denoted by CI-CSKA. It provides us with the ability to use simpler carry skip logic. The logic replaces 2:1 multiplexers with AOI/OAI compound gates (Fig. 2). These gates, which consist of fewer transistors, have lower delay, smaller area, and lower power consumption than the 2:1 multiplexer [37]. Note that, in this structure, the carry becomes complemented as it propagates through the skip logic. Therefore, at the output of the skip logic of even stages, the complement of the carry is generated. The structure has a considerably lower propagation delay with a slightly smaller area compared with those of the conventional one. Note that while the power consumption of the AOI (or OAI) gate is smaller than that of the multiplexer, the power consumption of the proposed CI-CSKA is a little higher than that of the conventional one. This is due to the increase in the number of gates, which imposes a higher wiring capacitance (in the noncritical paths).
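The concatenation and incrementation idea can be sketched functionally as follows. This is a minimal behavioral model under our own assumptions, not the circuit of Fig. 2: each RCA block adds its operand slices with a zero carry-in, an incrementation step then adds the stage carry-in to the intermediate sum, and the stage carry-out is formed from the block generate and block propagate signals. The carry is kept in true polarity throughout for readability, and all function names are hypothetical.

```python
def rca_zero_carry_in(a_bits, b_bits):
    """RCA block with its carry-in tied to zero (concatenation scheme).

    Returns the intermediate sum bits and the block carry-out, which acts
    as the block generate signal.
    """
    carry = 0
    sums = []
    for ai, bi in zip(a_bits, b_bits):
        sums.append(ai ^ bi ^ carry)
        carry = (ai & bi) | (carry & (ai ^ bi))
    return sums, carry


def increment(bits, carry_in):
    """Add a single carry-in bit to a little-endian bit list."""
    out = []
    carry = carry_in
    for bit in bits:
        out.append(bit ^ carry)
        carry &= bit
    return out


def ci_cska_add(a, b, stage_sizes, carry_in=0):
    """Behavioral sketch of a concatenation/incrementation CSKA."""
    result = 0
    carry = carry_in
    pos = 0
    for size in stage_sizes:
        a_bits = [(a >> (pos + k)) & 1 for k in range(size)]
        b_bits = [(b >> (pos + k)) & 1 for k in range(size)]
        # All RCA blocks can work in parallel: none waits for a carry-in.
        intermediate, block_generate = rca_zero_carry_in(a_bits, b_bits)
        block_propagate = int(all(x ^ y for x, y in zip(a_bits, b_bits)))
        final_bits = increment(intermediate, carry)
        # Skip logic in true polarity: generate, or propagate the carry-in.
        carry = block_generate | (block_propagate & carry)
        for k, s in enumerate(final_bits):
            result |= s << (pos + k)
        pos += size
    return result, carry


print(ci_cska_add(0b10110111, 0b01001001, [2, 2, 2, 2]))
```

Because the stage carry-out no longer depends on a carry rippling through the RCA block, only the skip logic and the incrementation step remain on the carry path in this model.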

![Fig. 2. Proposed CI-CSKA Structure.](image)

The reason for using both AOI and OAI compound gates as the skip logic is the inverting function of these gates in standard cell libraries. In this way, the need for an inverter gate, which would increase the power consumption and delay, is eliminated. As shown in Fig. 2, if an AOI gate is used as the skip logic of one stage, the next skip logic should use an OAI gate. Note also that using the proposed skipping structure in the Conv-CSKA structure increases the delay of the critical path considerably. This originates from the fact that, in the Conv-CSKA, the skip logic (AOI or OAI compound gates) is not able to bypass a zero carry input until that zero carry input has propagated through the corresponding RCA block.
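The alternation of AOI and OAI gates can be verified at the logic level. The sketch below assumes that an AOI stage receives the carry and the block signals in true polarity, and that the following OAI stage receives the complemented carry together with complemented block propagate and generate signals; how those complemented signals are produced in the actual circuit is not modeled here. All function names are hypothetical.

```python
def aoi21(a, b, c):
    """AND-OR-Invert gate: NOT((a AND b) OR c)."""
    return 1 - ((a & b) | c)


def oai21(a, b, c):
    """OR-AND-Invert gate: NOT((a OR b) AND c)."""
    return 1 - ((a | b) & c)


def skip_chain(stages, carry_in):
    """Carry through alternating AOI/OAI skip logic.

    stages is a list of (block_propagate, block_generate) pairs, least
    significant stage first. Returns the true-polarity carry-out of every
    stage, recovered from the physically travelling signal.
    """
    carries = []
    signal = carry_in   # value on the carry wire
    inverted = False    # True when the wire holds the complemented carry
    for p, g in stages:
        if not inverted:
            # True carry in, complemented carry out: NOT(p*c + g).
            signal = aoi21(p, signal, g)
        else:
            # Complemented carry in, true carry out:
            # NOT((NOT p + NOT c) * NOT g) = p*c + g by De Morgan's laws.
            signal = oai21(1 - p, signal, 1 - g)
        inverted = not inverted
        carries.append(1 - signal if inverted else signal)
    return carries


def reference_chain(stages, carry_in):
    """Reference carries using the textbook relation c_out = g + p * c_in."""
    carries = []
    carry = carry_in
    for p, g in stages:
        carry = g | (p & carry)
        carries.append(carry)
    return carries


example = [(1, 0), (1, 0), (0, 1), (0, 0)]
print(skip_chain(example, 1), reference_chain(example, 1))
```

No inverter appears between the stages: each gate's built-in inversion is absorbed by the opposite gate type in the next stage.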

### IV. Experimental Results

The design proposed in this paper has been developed using MODEL SIMULATOR. Adders are a key building block in arithmetic and logic units (ALUs), and low-power arithmetic circuits have become very important in the VLSI industry. The adder is the main building block of a DSP processor and the main component of its arithmetic unit, and a complex DSP system involves several adders. Many design styles of adders exist.

### V. Conclusion

In this paper, a static CMOS CSKA structure called CI-CSKA was proposed, which exhibits a higher speed and lower energy consumption compared with those of the conventional one. The speed enhancement was achieved by modifying the structure through the concatenation and incrementation techniques. In addition, AOI and OAI compound gates were exploited for the carry skip logic. The efficiency of the proposed structure for both FSS and VSS was studied by comparing its power and delay with those of the Conv-CSKA, RCA, CIA, SQRT-CSLA, and KSA structures. The results revealed a considerably lower PDP for the VSS implementation of the CI-CSKA structure over a wide range of voltages, from super-threshold to near-threshold. The results also suggested the CI-CSKA structure as a very good adder for applications where both speed and energy consumption are critical. In addition, a hybrid variable latency extension of the structure was proposed. It exploited a modified parallel adder structure at the middle stage to increase the slack time, which provided the opportunity to lower the energy consumption by reducing the supply voltage.

The efficacy of this structure was compared with that of the variable latency RCA, C2SLA, and hybrid C2SLA structures. Again, the suggested structure showed the lowest delay and PDP, making it a better candidate for high-speed low-energy applications.

**References**


