[See page 337 for Figures 4, 7.]

## SESSION IV: HIGH-SPEED CIRCUIT TECHNOLOGY

## WAM 4.2: A 30ns-32b Programmable Arithmetic Operator

Gerald Boudon, Pierre Mollier, Jean P. Nuez, Franck Wallart

IBM Component Development Laboratory, Corbeil-Essonnes, France

AN EXPERIMENTAL 30ns CMOS programmable arithmetic operator, fabricated on a high speed CMOS  $50 \text{mm}^2$  gate array, will be reported. Performance is twice as good as that of a previous bipolar version. This was possible through the use of a half-micron channel length CMOS, affording a 200ps delay for a 2-way NAND gate. The unit dissipates less than 1W, with a 3.3V power supply.

The operator comprises three basic components (Figure 1): 1-the sequencer to control instruction memory, 2-the address generator to compute the data memory address, and 3-the computer section, composed of a 32b ALU and a 16x16b array multiplier based on the modified Booth algorithm. Chip operation is controlled by 170 microinstructions. Microinstructions and data are stored in external ROM and RAM chips.
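As an illustration of the modified Booth algorithm named above, the following is a minimal software sketch of radix-4 Booth recoding for a signed 16x16b multiply. It assumes the usual textbook meaning of "modified Booth" (overlapping 3-bit groups recoded to digits in {-2, -1, 0, 1, 2}); the function names `booth_digits` and `booth_multiply` are hypothetical, and the sketch does not attempt to model the array structure of the actual chip.

```python
def booth_digits(b: int, bits: int = 16) -> list[int]:
    """Recode a signed `bits`-wide multiplier into radix-4 Booth digits.

    Digit k is -2*b[2k+1] + b[2k] + b[2k-1], with b[-1] taken as 0,
    so a 16b multiplier yields 8 digits, each in {-2, -1, 0, 1, 2}.
    """
    if bits % 2:
        raise ValueError("bits must be even for radix-4 recoding")
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= b <= hi:
        raise ValueError("multiplier does not fit in a signed word")

    pattern = b & ((1 << bits) - 1)  # two's-complement bit pattern

    def bit(i: int) -> int:
        return 0 if i < 0 else (pattern >> i) & 1

    return [-2 * bit(2 * k + 1) + bit(2 * k) + bit(2 * k - 1)
            for k in range(bits // 2)]


def booth_multiply(a: int, b: int, bits: int = 16) -> int:
    """Multiply two signed `bits`-wide integers by summing Booth partial products."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= a <= hi:
        raise ValueError("multiplicand does not fit in a signed word")

    product = 0
    for k, digit in enumerate(booth_digits(b, bits)):
        # Each partial product is 0, +/-a or +/-2a, weighted by 4**k.
        product += (digit * a) << (2 * k)
    return product
```

The point of the recoding is that a 16b multiplier produces 8 partial products instead of 16, which halves the number of rows a multiplier array has to sum.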

The gate array selectively uses $0.5\mu m$ channel length FETs, with $1.5\mu m$ minimum geometry rules for the other features; Figure 2. To improve the breakdown and punch-through voltages, the so-called DI-LDD technology (double implant lightly doped Drain/Source<sup>1</sup>) is necessary; Figure 3. This structure also decreases short channel and hot electron effects<sup>1</sup>. Titanium silicide has been used to reduce the resistance of the poly gate and of the drain/source diffusion. Deep trench isolation avoids latch-up problems.

A 6.9 x 7.1mm gate array chip contains 6,596 cells of 3 P and 3 N transistors plus 103 I/O cells in the periphery; Figure 7. The $28.8\mu m \times 115\mu m$ cells are separated horizontally by 29 wiring tracks on first-level metal. Additional global wiring is available on second-level metal over the entire chip.
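The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the stated chip dimensions, cell count and cell size (taking the cell as 28.8µm by 115µm); the variable names are illustrative, and the area fraction is a derived estimate rather than a number given in the paper.

```python
# Values quoted in the text
CHIP_W_MM, CHIP_H_MM = 6.9, 7.1
CELLS = 6596
TRANSISTORS_PER_CELL = 3 + 3          # 3 P + 3 N devices per cell
CELL_W_UM, CELL_H_UM = 28.8, 115.0

chip_area_mm2 = CHIP_W_MM * CHIP_H_MM                 # about 49 mm^2
transistors = CELLS * TRANSISTORS_PER_CELL            # core transistors, I/O cells excluded
cell_area_um2 = CELL_W_UM * CELL_H_UM                 # area of one cell
cell_area_total_mm2 = CELLS * cell_area_um2 / 1e6     # um^2 -> mm^2
cell_fraction = cell_area_total_mm2 / chip_area_mm2   # share of the die covered by cells

print(f"chip area: {chip_area_mm2:.2f} mm^2")
print(f"core transistors: {transistors}")
print(f"total cell area: {cell_area_total_mm2:.2f} mm^2 ({cell_fraction:.0%} of die)")
```

The chip area of roughly 49 mm² agrees with the "50 mm²" gate array size, and the cells account for somewhat less than half of the die, with the remainder left for wiring channels and the peripheral I/O cells.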

FET gate widths are $32.2\mu m$ for the P and $38.8\mu m$ for the N devices. To improve circuit density, the gate isolation technique has been used instead of regular oxide isolation; Figure 4.

Metal wiring was kept similar to that of existing hardware already proven functional in a conventional CMOS technology, both to demonstrate compatibility with current metallurgy processing steps and to shorten the logical and physical design time.

High speed is provided by a 120mA/V/mm transconductance measured on the $0.5\mu m$ device. Circuit simulations show, for a 2-way NAND, a nominal delay of 0.3ns with a typical fan-out of 2.5; Figure 5. The internal critical path delay, including the multiplier, shows a speed improvement factor of 3.5, when compared to a conventional $1.5\mu m$ CMOS. Hardware measurements on a ring oscillator and on the product confirm the simulations; Figure 6. The power dissipation is less than 1W at maximum speed, as opposed to 2.5W for the bipolar version.

<sup>1</sup>Codella, C. and Ogura, S., "Halo Doping Effect in Submicron DI-LDD Device Design", *IEDM*; Dec., 1985.
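To put the power and speed figures side by side, the sketch below estimates energy per cycle for the CMOS and bipolar versions. It rests on two assumptions that the text does not state explicitly: that "twice as good" performance means the bipolar cycle time was about twice the 30ns CMOS cycle, and that both parts run at their quoted power (1W is an upper bound for CMOS). The result is a rough comparison, not a measured figure.

```python
CMOS_CYCLE_NS = 30.0
CMOS_POWER_W = 1.0        # quoted as "less than 1W", so an upper bound
BIPOLAR_POWER_W = 2.5
PERFORMANCE_RATIO = 2.0   # assumption: read as a 2x cycle-time ratio

# W * ns = nJ
cmos_energy_nj = CMOS_POWER_W * CMOS_CYCLE_NS
bipolar_cycle_ns = CMOS_CYCLE_NS * PERFORMANCE_RATIO
bipolar_energy_nj = BIPOLAR_POWER_W * bipolar_cycle_ns
energy_ratio = bipolar_energy_nj / cmos_energy_nj

print(f"CMOS:    <= {cmos_energy_nj:.0f} nJ per cycle")
print(f"bipolar: ~ {bipolar_energy_nj:.0f} nJ per cycle")
print(f"energy-per-cycle ratio: at least {energy_ratio:.0f}x")
```

Under these assumptions the CMOS operator spends at most about a fifth of the energy per cycle of the bipolar version.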

External connections are provided by 103 driver-receiver cells, which are LVTTL compatible and can be personalized in tri-state, push-pull or open-drain configurations. They have been designed to meet the noise tolerance constraints of 36 simultaneous switchings. Logic design has been implemented with automatic placement and wiring programs on the 6,600 cell gate array, using 92% of the cells.
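The utilization figure translates into an approximate count of cells actually used by the design. The short sketch below takes the exact cell count of 6,596 quoted earlier and the 92% utilization; since the percentage is rounded in the text, the results are approximate.

```python
TOTAL_CELLS = 6596
UTILIZATION = 0.92            # rounded figure from the text
TRANSISTORS_PER_CELL = 6      # 3 P + 3 N

used_cells = round(TOTAL_CELLS * UTILIZATION)
unused_cells = TOTAL_CELLS - used_cells
used_transistors = used_cells * TRANSISTORS_PER_CELL

print(f"cells used: ~{used_cells}, unused: ~{unused_cells}")
print(f"transistors in used cells: ~{used_transistors}")
```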

## Acknowledgments

The authors wish to thank N. Rovedo, S. Ogura, C. Codella and P. Kroesen for technology definition and processing in East Fishkill, and the Essonnes Testing department and V. Vallet for technical assistance. Fabrication and engineering by the Essonnes silicon gate manufacturing line, the mask house and the physical and reliability analysis department are gratefully acknowledged.

| Feature                                | Value                                      |
|----------------------------------------|--------------------------------------------|
| Organization:                          | 32b ALU and 16b multiplier                 |
| Programming:                           | 170 instructions                           |
| Power Supply:                          | 3.3V - 1W                                  |
| Logic Interface:                       | low voltage CMOS/TTL                       |
| Cycle Time:                            | 30ns                                       |
| Technology:                            | Single Poly Si / Double Al,<br>CMOS DI-LDD |
| Gate Length:                           | 0.5µm                                      |
| Geometry rules other than gate length: | $1.5 \mu m$                                |
| Typical loaded gate delay:             | 300ps                                      |
| Chip size:                             | 6.9 x 7.1mm <sup>2</sup>                   |

TABLE 1-Features and performance.


## ISSCC 87 / WEDNESDAY, FEBRUARY 25, 1987 / RHINELANDER SOUTH / WAM 4.2



FIGURE 1-Block diagram of programmable arithmetic operator.



FIGURE 2–SEM microphotograph of  $0.5\mu m$  FET.











[Figure caption truncated in source] ... ring oscillators.

DIGEST OF TECHNICAL PAPERS


A 30ns 32b Programmable Arithmetic Operator (Continued from Page 55)



FIGURE 7-The 6.9 x 7.1mm CMOS gate array chip at first level of metal.
