WITHIN DIE THERMAL GRADIENT IMPACT ON CLOCK-SKEW: A NEW TYPE OF DELAY-FAULT MECHANISM

S.A. Bota, M. Rosales, J.L. Roselló, A. Keshavarzi* and J. Segura

Univ. de les Illes Balears, Palma de Mallorca, Spain
* Circuit Research Labs., Intel Corporation, Portland, OR, USA

Abstract: As chips become faster, the need to test them at their intended speed of operation has been recognized. High-speed operation, together with the higher switching activity typically induced during test, can result in a die thermal distribution significantly different from that reached during normal operation. Differences between the normal-mode and test-mode thermal maps have a non-uniform impact on the relative path delays within logic blocks. Test-induced hot spots may artificially slow down non-critical paths, or speed up critical ones, with respect to the clock, causing the die to fail (pass) delay testing even though it is a good (bad) part. The non-uniform thermally induced delay is especially important for the clock circuitry, the most critical block, which is affected even if exact zero-skew clock routing algorithms are adopted. In this work we analyze the impact of thermal-map temperature changes on the clock delay and identify a new delay-fault mechanism. We propose a technique that minimizes the impact of the difference between test-mode and normal-mode thermal maps by making the clock-tree speed independent of temperature gradients. This technique allows delay test patterns to be applied with confidence regardless of the test-induced modification of the thermal map.

Index Terms: Clock skew, clock distribution networks, temperature gradient, interconnect delay.

1. Introduction

The ever aggressive increase in performance of very large scale integration (VLSI) chips is leading to higher power dissipation and higher operating temperatures. Management of thermally related issues is rapidly becoming one of the most challenging efforts in high-performance IC design. At the circuit level, thermal problems have important implications for performance and reliability [1]. Furthermore, it has been reported that significant temperature gradients can occur on the silicon substrate due to different activity and/or different sleep modes of the various functional blocks in high-performance chips [2].

In the past, power dissipation during test was only a minor issue, since the test was performed at a speed much lower than the normal operating speed. Today, in contrast, circuits are tested at higher clock rates, if possible at the normal clock rate of the circuit [3], to detect timing failures [4,5]. Since test-induced activity can be much higher than the activity in normal operation, power dissipation during test can be much larger [6]. Power constraints are determined for normal operation during design, and are set assuming that random logic blocks will have about 20% to 30% activity with respect to the clock signal. This activity factor represents a power demand that is smaller than the power consumed during test mode. Therefore, power constraints can easily be exceeded during test [7], which may induce delay-related failures as well as reduced reliability.

Different strategies have been proposed to limit the test-induced power excess by controlling either the peak power or the average power. Some works propose a proper selection of test vectors [8], showing how to reduce power dissipation and energy consumption while achieving high fault coverage. Many of these techniques rely on power-constrained test-scheduling algorithms and focus on keeping the circuit power consumption within safe operating margins. Such methods do not pursue a uniform power distribution over the die, and therefore do not guarantee a uniform thermal map. Moreover, power consumption cannot be directly translated into a temperature increase, since temperature is related to power density, and also because the time constant for temperature variation is much larger than that for power. Therefore, the techniques developed to contain power consumption during testing cannot be directly applied to obtain uniform thermal maps.

Paper 45.2 1276
ITC INTERNATIONAL TEST CONFERENCE 0-7803-8580-2/04 $20.00 Copyright 2004 IEEE

Power consumption has two main components: the dynamic power, PAC, and the leakage power, Pleak. The short-circuit power component can be included in the dynamic one, and is important for gates whose input transition time is larger than the output one [9]. The PAC component is proportional to the switching activity and is responsible for thermal gradients, since circuit activity changes from block to block. Within-die temperature gradients in high-performance ICs can easily reach 40 ºC to 50 ºC during normal operation [10]. Because test-mode activity is higher than normal-mode activity, these gradients tend to increase during test.

In synchronous circuits, the clock network constitutes one of the most critical sub-blocks and has a significant impact on speed, area, and power dissipation. The clock signal defines a time window for data movement within the IC. Given its importance, much attention has been given to clock signal design and distribution. Clock signals are typically heavily loaded, travel over the whole die, and operate at the highest possible speed. Furthermore, clock signals are particularly affected by technology scaling, as they become the interconnect lines most susceptible to inductive noise and are subjected to more complex noise mechanisms, because the global metal layers carrying the clock signal are getting closer to the substrate [11]. Due to the well-known impact of temperature on delay, this trend also makes the effect of non-uniform substrate temperature on the clock skew more critical. Poor control of clock skew can severely limit the maximum performance of the entire system and may also lead to catastrophic race conditions resulting in setup or hold time violations [12].

Therefore, if the effect of the higher activity during test mode and its impact on the clock network are not properly considered, a given percentage of dies may fail during test due to the test-induced thermal map modification. This would cause an increased yield loss, since the impact of the normal-operation thermal map on the path delay is in general different from that induced during test mode.
In this work we analyze the impact of within-die thermal gradients on the clock skew, considering the impact of temperature on both the active devices and the interconnect system. We show experimentally that critical path ordering may change with temperature. This effect, coupled with the fact that the test-induced thermal map may differ from the normal-mode one, motivates a careful consideration of the impact of temperature gradients on delay during test. Finally, we present a low-power clock-tree design strategy that, by using an adequate biasing voltage, VDDopt, for the clock tree, reduces the temperature-related clock skew problems.

2. Clock Networks and Skew

One of the most important constraints when designing the clock signal distribution network is to maintain zero skew throughout the whole IC. Different clock signal paths may exhibit different delays for a variety of reasons. The following causes of clock skew have been identified [11]:

• Differences in line lengths from the clock source to the clocked register.
• Differences in delays of any active buffers (e.g., distributed buffers) within the clock distribution network.
• Differences in passive interconnect parameters, such as line resistivity, dielectric constant and thickness, via/contact resistance, line and fringing capacitance, and line dimensions.

The clock period (tclock) constraints are related to the clock skew (s), the worst-case data path delay (dmax), and an offset constant (t0) that includes data setup time, latch active time, and other possible offset factors such as safety margins:

$$t_{clock} = s + d_{max} + t_0 \qquad (1)$$

It is clear from Eq. (1) that clock period reduction requires not only minimizing the worst-case data path delay, but also reducing clock skew. Zero-skew clock tree synthesis has been widely researched [13]. Many different approaches, from ad hoc to algorithmic, have been developed for designing clock distribution networks in synchronous digital ICs. Most clock distribution schemes are built on the premise that the absolute delay from a central clock source to the clocking elements is irrelevant; only the relative phase between two clocking points is important. Therefore, one common approach to distributing a clock is to use balanced paths (called trees). The most common and general approach is the use of buffered trees and symmetric trees, such as H-trees.

The H-tree clock topology consists of trunks (vertical stripes) and branches (horizontal stripes), as depicted in Figure 1. In non-buffered structures, the top-level segments of the tree are wider than the lower-level segments. Furthermore, the top-level global segments are assigned to the upper metal layers, and the low-level local segments are routed using the lower metal layers.

In addition to zero skew, a second requirement for a clock tree is that the clock edge must be sharp, i.e. have a high slew rate. This requires the insertion of buffers within the clock network to isolate the downstream capacitance, thus reducing the transition times (multistage clock tree). Clock networks with several buffer stages are common in high-performance ICs [2]. Clock buffers are a primary source of the total clock skew within a well-balanced clock distribution network, since the active device characteristics vary much more than the passive device characteristics.
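As a numerical illustration of Eq. (1), the short sketch below evaluates the minimum clock period with and without skew. The function name and all numeric values (data path delay, offset, skew) are hypothetical placeholders chosen only to show the arithmetic; they are not taken from the measurements or simulations in this paper.

```python
def min_clock_period(skew, d_max, t_offset):
    """Eq. (1): t_clock = s + d_max + t0, with all quantities in seconds."""
    return skew + d_max + t_offset


# Hypothetical values, for illustration only.
D_MAX = 800e-12      # worst-case data path delay
T_OFFSET = 100e-12   # setup time, latch active time, safety margins
SKEW = 18e-12        # clock skew between two clocking points

period_zero_skew = min_clock_period(0.0, D_MAX, T_OFFSET)
period_with_skew = min_clock_period(SKEW, D_MAX, T_OFFSET)

# Fraction of the achievable clock frequency lost to the skew term alone.
frequency_loss = 1.0 - period_zero_skew / period_with_skew

print(f"period without skew: {period_zero_skew * 1e12:.1f} ps")
print(f"period with skew:    {period_with_skew * 1e12:.1f} ps")
print(f"relative frequency loss: {frequency_loss:.2%}")
```

Because the skew enters Eq. (1) additively, any skew component added by a thermal gradient translates directly into lost timing margin.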


Figure 1. Typical symmetric H-tree clock distribution net.

3. Impact of temperature on critical path ordering

The algorithms used to obtain clock networks with zero skew by construction do not take temperature into account as a cause of clock skew. One could think of describing the non-uniform impact of temperature on delay through variations of either passive or device parameters. This approach is not appropriate: for a particular circuit, the impact of parameter variations on the delay is very similar in normal-mode and test-mode operation, whereas the thermal map is closely related to the circuit activity and can therefore be very different between normal-mode and test-mode operation.

The CMOS device parameters that are most affected by temperature are the threshold voltage (VT), the mobility (µ), and the energy bandgap of silicon (Eg). The dependence of the first two can be expressed as:

$$V_T(T) = V_T(T_0) - \kappa\,(T - T_0) \qquad (2)$$

$$\mu(T) = \mu(T_0)\left(\frac{T}{T_0}\right)^{-M} \qquad (3)$$

where T0 is the room temperature (T0 = 300 K), κ is the threshold voltage temperature coefficient, whose typical value is 2.5 mV/K, and M is the temperature exponent, whose typical value is 1.5. The thermal variations of the energy bandgap are usually small and not comparable to the threshold voltage and mobility variations. Eqs. (2) and (3) reveal that both the device threshold voltage and the carrier mobility decrease as temperature increases. These dependencies have compensating effects, since the VT reduction tends to increase the transistor current, while the mobility reduction tends to decrease the on-state current. The net effect of the mobility and threshold voltage temperature variation is a reduction of the transistor saturation current when temperature increases, with a direct impact on the gate delay. Figure 2 illustrates this effect by showing the variation of the high-to-low and low-to-high propagation delay of an inverter in a 0.12 µm CMOS technology, obtained using HSPICE and assuming a temperature-independent interconnect line.

We also carried out simulations for different logic gates and found that the variation of the gate delay with temperature depends on the logic gate type, its number of inputs, and the position of the switching device within the stack of transistors that compose the gate. This non-uniform temperature dependency may give rise to a critical path ordering that depends on the IC operating temperature, since different logic gates will experience a delay variation that depends on the relative position of the switching transistor in the stack and on the gate topology.

Figure 2. Single inverter propagation delay for input high-to-low (gray) and low-to-high (black) transitions as a function of temperature, from simulations for a 0.12 µm technology. (Axes: delay in s versus temperature.)
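The two temperature dependencies of Eqs. (2) and (3) can be tabulated with a few lines of code. The sketch below uses the typical values quoted in the text (T0 = 300 K, κ = 2.5 mV/K, M = 1.5); the threshold voltage at T0 and the function names are assumptions introduced only for illustration.

```python
T0 = 300.0       # room temperature in K (from the text)
KAPPA = 2.5e-3   # threshold-voltage temperature coefficient in V/K (typical value from the text)
M = 1.5          # mobility temperature exponent (typical value from the text)
VT_AT_T0 = 0.35  # threshold voltage at T0 in V (assumed, not given in the text)


def threshold_voltage(temp_k, vt_at_t0=VT_AT_T0):
    """Eq. (2): V_T(T) = V_T(T0) - kappa * (T - T0)."""
    return vt_at_t0 - KAPPA * (temp_k - T0)


def mobility_ratio(temp_k):
    """Eq. (3) normalized to the room-temperature value: mu(T) / mu(T0)."""
    return (temp_k / T0) ** (-M)


for temp_k in (250.0, 300.0, 350.0, 400.0):
    print(
        f"T = {temp_k:5.1f} K: "
        f"V_T = {threshold_voltage(temp_k):.3f} V, "
        f"mu/mu0 = {mobility_ratio(temp_k):.3f}"
    )
```

Both quantities fall as the temperature rises, which is the pair of opposing effects on the drive current discussed above: a lower V_T raises the current while a lower mobility reduces it.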

To investigate this effect we designed and fabricated a test chip (Figure 3) with several chains composed of different basic gate configurations (inverter, NAND and NOR), with different numbers of inputs and different positions of the switching devices in the gate transistor stack. The circuit was fabricated in a 0.35 µm technology and had fifteen different chains (labeled C1 to C15). A reference chain consisting of a metal connection was used to determine the delay induced by the intermediate auxiliary circuitry. The delay of this reference chain was subtracted from the measured chain delays at each temperature. We measured the delay of all chains at three temperatures: 25 ºC, 45 ºC, and 95 ºC.

Figure 3. Test chip (top) with several scan chains (bottom) to investigate the dependency of the critical path on temperature.

Figure 4 shows the chain delay measured for four chains: C1, C2, C3 and C4. Chains C1 and C2 had 100 gates connected in series following a 3NAND-3NOR sequence, while chains C3 and C4 had 100 3NOR gates connected in series. In chain C1 the switching input was connected to the device closer to ground for the 3NAND nMOS stack, and to the device closer to VDD for the 3NOR pMOS stack. Chain C2 had the switching devices closer to the output for each transistor stack. For chains C3 and C4 the switching devices were those closer to the output and to the rails, respectively.

Figure 4 shows that at 25 ºC chain C1 would determine the critical path of the circuit. At 95 ºC the situation is different, since the delays of C1 and C2 are very similar and C2 is now the slower chain. This result illustrates that in a real circuit the critical path ordering may vary with temperature.

Figure 4. Chain delay variation with temperature for 3NAND-3NOR chains (C1 and C2) and 3NOR-3NOR chains (C3 and C4). The chains have different switching device positions in the internal transistor stack of the series-connected gates. (Axes: chain delay in ns versus temperature in ºC.)

We simulated the behavior of several chains for a 0.12 µm technology, obtaining similar results. Table 1 shows an example of two chains composed of 8 2NOR and 12 2NAND gates respectively, for which the critical path ordering changed when the temperature varied from 27 ºC to 80 ºC.

Table 1. Delay simulation comparison for two chains of 8 2NOR and 12 2NAND gates respectively, for a 0.12 µm technology. At 27 ºC the 2NOR-gate chain exhibits the larger delay, while at 80 ºC the 2NAND-gate chain is slower.

Delay (ps)                      27 ºC    80 ºC
8 2NOR   (HLin => HLout)        346.8    369.6
8 2NOR   (LHin => LHout)        321.3    344.7
12 2NAND (HLin => HLout)        341.7    369.0
12 2NAND (LHin => LHout)        346.3    374.0

These examples illustrate the need to consider temperature as a parameter when computing the circuit critical path, and also when developing a delay-test strategy.
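The ordering change reported in Table 1 can be checked mechanically. The sketch below copies the four simulated delays per temperature from Table 1 and, as an assumption made here for illustration, takes the slower of the two transitions as the delay that decides which chain is critical. The dictionary layout and function names are hypothetical.

```python
# Delays in ps copied from Table 1: {chain: {temperature in ºC: (HL delay, LH delay)}}
TABLE_1 = {
    "8 x 2NOR": {27: (346.8, 321.3), 80: (369.6, 344.7)},
    "12 x 2NAND": {27: (341.7, 346.3), 80: (369.0, 374.0)},
}


def worst_case_delay(chain, temp_c):
    """Slower of the two transitions (assumed criterion for criticality)."""
    return max(TABLE_1[chain][temp_c])


def critical_chain(temp_c):
    """Chain with the largest worst-case delay at the given temperature."""
    return max(TABLE_1, key=lambda chain: worst_case_delay(chain, temp_c))


for temp_c in (27, 80):
    chain = critical_chain(temp_c)
    print(f"{temp_c} ºC: critical chain is {chain} "
          f"({worst_case_delay(chain, temp_c):.1f} ps)")
```

A path ranking computed at a single temperature is therefore not guaranteed to hold at the temperature reached during test.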

4. Effect of Non-uniform Substrate Temperature on Clock Skew

Environmental variations are one of the most significant contributors to skew and jitter. The two major sources of environmental variations are power supply fluctuations and temperature. Power-supply variations are the major source of jitter, since the supply voltage has a strong impact on circuit delay and the characteristic time of power supply fluctuations (especially those due to di/dt noise) is of the same order of magnitude as the clock period of the circuit. Temperature is considered a skew source because the typical time constants for temperature changes are in the range of milliseconds, far longer than a clock period. Figure 5 shows a typical self-heating profile for a single transistor measured at three different temperatures, showing that the device temperature reaches a stable value after a time interval of about 100 ms.

Figure 5. Transistor self-heating experiment showing the time-scale range for temperature stabilization. The y-axis shows a voltage that is linearly related to temperature.

The main sources of temperature elevation in the IC are the switching activity of the blocks, which generates heat at the substrate, and the Joule heating of the interconnect due to the current flowing through the metal. In a high-performance IC the substrate can have thermal gradients of more than 40 ºC and reach a peak temperature of up to 120 ºC [14], while Joule heating can contribute further to the overall interconnect temperature [15]. The non-uniform thermal profile results in different signal delays at the end of different clock-tree branches, inducing non-zero skew. Therefore, temperature gradients may create a scenario where the H-tree symmetry cannot guarantee zero skew.

4.1. Effects on Interconnect delay

Consider an interconnect line of length L and uniform width W between two consecutive clock buffers. The line is driven by a buffer having on-resistance Rd and parasitic output capacitance Cp, and is terminated by a buffer with load capacitance CL (see Figure 6). The interconnect resistance per unit length has a linear relationship with temperature that can be described as:

$$R(x) = r_0\,\bigl(1 + \beta\,T(x)\bigr) \qquad (4)$$

where r0 is the resistance per unit length at 0 ºC, and β is the temperature coefficient of resistance (ºC^-1). Using a distributed RC Elmore delay model, the delay D of a signal passing through the line can be written as [16]:

$$D = R_d\left(C_p + C_L + \int_0^L c_0(x)\,dx\right) + \int_0^L R(x)\left(\int_x^L c_0(\tau)\,d\tau + C_L\right)dx \qquad (5)$$

Assuming that the capacitance per unit length c0 is constant along the line and does not change with temperature, substituting (4) into (5) leads to:

$$D = D_0 + (C_L + c_0 L)\,r_0\,\beta \int_0^L T(x)\,dx - c_0\,r_0\,\beta \int_0^L x\,T(x)\,dx \qquad (6)$$

where D0 is the Elmore delay at 0 ºC:

$$D_0 = R_d\,(C_p + C_L + c_0 L) + \left(c_0\,r_0\,\frac{L^2}{2} + r_0\,L\,C_L\right)$$

Note that a non-uniform interconnect thermal distribution T(x) can impact the delay significantly across the clock tree. According to the parameter values for a deep submicron technology, there is roughly a 5% to 6% increase in the Elmore delay for each 20 ºC increase in line temperature [17].


Figure 6. Distributed RC interconnect line driven by buffer i and terminated at buffer i+1. (Diagram labels: Buffer Stage i as driver, distributed capacitances C, Buffer Stage i+1 as load.)
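The Elmore delay expressions of Eqs. (4) to (6) can be cross-checked numerically. The sketch below integrates Eq. (5) directly and compares it with the closed form of Eq. (6) for a few temperature profiles. All line and buffer parameters, the temperature profiles, and the function names are assumptions chosen for illustration; they are not the values used by the authors or by [16,17].

```python
def trapezoid(func, start, stop, steps=2000):
    """Composite trapezoidal rule for the integral of func over [start, stop]."""
    width = (stop - start) / steps
    total = 0.5 * (func(start) + func(stop))
    for i in range(1, steps):
        total += func(start + i * width)
    return total * width


# Assumed, illustrative parameters (not taken from the paper).
R_D = 100.0     # driver on-resistance Rd, ohm
C_P = 5e-15     # driver parasitic output capacitance Cp, F
C_L = 20e-15    # load capacitance CL, F
LENGTH = 2e-3   # line length L, m
R0 = 5e4        # resistance per unit length at 0 ºC, ohm/m
C0 = 2e-10      # capacitance per unit length, F/m
BETA = 3e-3     # temperature coefficient of resistance, 1/ºC


def delay_eq5(temp):
    """Eq. (5) with R(x) from Eq. (4) and constant c0, integrated numerically."""
    def integrand(x):
        r_x = R0 * (1.0 + BETA * temp(x))          # Eq. (4)
        return r_x * (C0 * (LENGTH - x) + C_L)     # downstream capacitance seen at x
    return R_D * (C_P + C_L + C0 * LENGTH) + trapezoid(integrand, 0.0, LENGTH)


def delay_eq6(temp):
    """Eq. (6): delay at 0 ºC plus the two temperature-dependent integrals."""
    d0 = R_D * (C_P + C_L + C0 * LENGTH) + (C0 * R0 * LENGTH**2 / 2 + R0 * LENGTH * C_L)
    int_t = trapezoid(temp, 0.0, LENGTH)
    int_xt = trapezoid(lambda x: x * temp(x), 0.0, LENGTH)
    return d0 + (C_L + C0 * LENGTH) * R0 * BETA * int_t - C0 * R0 * BETA * int_xt


def uniform(x):
    return 50.0                               # ºC everywhere


def hot_near_driver(x):
    return 90.0 - 40.0 * x / LENGTH           # 90 ºC at the driver, 50 ºC at the load


def hot_near_load(x):
    return 50.0 + 40.0 * x / LENGTH           # same average, gradient reversed


for name, profile in (("uniform", uniform),
                      ("hot near driver", hot_near_driver),
                      ("hot near load", hot_near_load)):
    print(f"{name:16s} Eq.(5): {delay_eq5(profile) * 1e12:.3f} ps  "
          f"Eq.(6): {delay_eq6(profile) * 1e12:.3f} ps")
```

Under these assumptions, the two gradient profiles have the same average temperature but different delays, because Eq. (6) weights the temperature near the driver more heavily. Two geometrically identical branches exposed to mirrored gradients would therefore show non-zero skew.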

4.2. Effects of the Non-uniform Substrate Temperature on Buffer Delay

The goal of buffer insertion is to find the number, size, and exact location of the buffers inserted along the line. In general, clock buffers are distributed over the entire die and cover large distances. As a result, two or more buffers at the same distance from the clock generator within the clock tree may be at significantly different temperatures because their neighboring blocks have different local activities. This results in a non-uniform impact on the clock-tree skew that depends on the local and time-varying activity of the blocks surrounding the clock tree. These variations are not systematic and are difficult to characterize. In general it is observed that the delay degradation caused by the effect of temperature on the driver on-resistance is much more severe than that caused by the thermal dependency of the interconnect resistance [16].

4.3. Impact of Thermal Gradients on Clock Skew

We simulated the impact of thermal gradients on the clock skew by comparing the relative delay of two clock-tree branches for a 0.12 µm technology. In one branch (branch A) the temperature was assumed to be uniform at 50 ºC. The relative delay of the other branch (branch B) was analyzed assuming that a number of its buffers were influenced by a hot spot. The impact of the hot spot size on the relative delay was characterized by changing the number of buffers in the branch that were set at a different temperature. The hot spot temperature was assumed uniform and was varied from -10 ºC to 60 ºC with respect to the uniform temperature of branch A (50 ºC). Figure 7 shows the dependency of clock skew on the relative temperature increase between the two clock-tree branches for several hot spot sizes (described through the number n of stages at a different temperature). As the temperature difference increases the skew becomes larger, and is roughly proportional to n·(∆T). This result shows the importance of a temperature-aware clock tree, especially for high-performance designs where thermal gradients of 40 ºC to 50 ºC are usual. This trend is predicted to increase for future scaled technologies [10].

A comparison between Figures 2 and 7 reveals that the skew induced by the thermal gradient can be as high as the delay of one inverter in the chain. This result illustrates how the increased activity induced during testing may lead to a significant increase in clock skew if one portion of a clock-tree branch is affected by a hot spot. This effect is much more likely to occur in test mode than in normal operation, since the test-induced activity is significantly higher. Therefore, a methodology to minimize the impact of non-uniform die thermal distributions on the clock delay during test application is required. One possible solution is proposed in the following section.

Figure 7. Skew related to the temperature difference between two clock paths. The buffers are biased at the nominal supply voltage (1.2 V). We have assumed that a hot spot affects n buffers of the second path (for n = 2, 4, 8, 12). Simulations have been done choosing 50 ºC as the reference temperature. (Axes: skew in s versus temperature increase in ºC.)

5. Dual Supply-Voltage Clock Tree

It is well known that the rate of the driver resistance variation with temperature depends strongly on the power supply voltage. A well-known simplified expression for the delay of a CMOS gate, obtained by neglecting the contribution of the interconnect resistance, is:

$$t_D = \frac{C_L\,V_{DD}}{2\,I_{av}} \qquad (7)$$

where CL is the load capacitance, VDD the buffer supply voltage, and Iav the average current supplied by the gate to the load. The average current of a short-channel MOSFET can be approximated by [18]:

$$I_{av} \propto \mu(T)\,\bigl(V_{DD} - V_T(T)\bigr)^{\alpha} \qquad (8)$$

where α is the velocity saturation index of the alpha-power law model [18], and the dependence of the threshold voltage and the channel mobility on temperature are given in (2) and (3). For the gate delay to be insensitive to temperature we solve:

$$\frac{\partial t_D}{\partial T} = 0 \qquad (9)$$

The optimum supply voltage can be obtained using Eqs. (2), (3) and (7) to (9):

$$V_{DDopt} = V_T(T_0) + \frac{\alpha\,\kappa\,T_0}{M} \qquad (10)$$
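Eq. (10) can be checked numerically against Eqs. (2), (3), (7) and (8). The sketch below evaluates the delay model in arbitrary units and estimates its temperature sensitivity by a central difference around T0. The values of κ, M and T0 are the typical ones quoted in the text; α and the threshold voltage at T0 are assumed placeholders, so the resulting voltage is not expected to match the 0.70 V reported for the authors' 0.12 µm process. Function names are hypothetical.

```python
T0 = 300.0        # room temperature in K (from the text)
KAPPA = 2.5e-3    # V/K, typical value quoted in the text
M = 1.5           # mobility temperature exponent, typical value quoted in the text
ALPHA = 1.3       # alpha-power-law exponent (assumed, not given in the text)
VT_AT_T0 = 0.30   # threshold voltage at T0 in V (assumed, not given in the text)


def threshold_voltage(temp_k):
    """Eq. (2)."""
    return VT_AT_T0 - KAPPA * (temp_k - T0)


def mobility_ratio(temp_k):
    """Eq. (3), normalized to mu(T0)."""
    return (temp_k / T0) ** (-M)


def gate_delay(temp_k, vdd, c_load=1.0):
    """Eq. (7) with Eq. (8); the proportionality constant is set to 1 (arbitrary units)."""
    i_av = mobility_ratio(temp_k) * (vdd - threshold_voltage(temp_k)) ** ALPHA
    return c_load * vdd / (2.0 * i_av)


def vdd_opt():
    """Eq. (10): supply voltage at which the delay is temperature insensitive at T0."""
    return VT_AT_T0 + ALPHA * KAPPA * T0 / M


def delay_sensitivity(vdd, temp_k=T0, step=0.01):
    """Relative sensitivity (1/t_D) * dt_D/dT in 1/K, by central difference."""
    up = gate_delay(temp_k + step, vdd)
    down = gate_delay(temp_k - step, vdd)
    return (up - down) / (2.0 * step * gate_delay(temp_k, vdd))


v_opt = vdd_opt()
for vdd in (v_opt - 0.20, v_opt, v_opt + 0.25):
    print(f"VDD = {vdd:.2f} V: (1/tD) dtD/dT = {delay_sensitivity(vdd):+.2e} 1/K")
```

In this simplified model the delay increases with temperature above the computed voltage, where the mobility term dominates, and decreases with temperature below it, where the threshold voltage term dominates. The sensitivity vanishes at the voltage given by Eq. (10).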

The definition of this temperature-insensitive cross-point voltage is well known, and its value is required to obtain a delay-insensitive circuit structure for a given block. We verified this result through HSPICE simulations for a 0.12 µm process. Figure 8 shows simulation results of delay variation versus VDD for the temperature range between 0 ºC and 125 ºC. At VDDopt (0.70 V) the gate delay is insensitive to temperature variations. The results shown in Figure 8 correspond to an inverter; simulation results for multiple-input gates revealed that the VDDopt value depends on VT0 (the threshold voltage at T0), as shown in Table 2.

Figure 8. High-to-low propagation time vs VDD. (Axes: time in s versus supply voltage in V; curves for T = 0, 25, 75 and 125; regions labeled "VDDopt Region" and "VDD Region".)

Table 2. VDDopt computed for different logic gates

Logic gate    VDDopt
Inverter      0.70 V
Nand2         0.66 V
Nor2          0.73 V

In this work, we propose a dual-VDD scheme where the clock supply is adjusted to its VDDopt value during delay test application. This setting makes the clock-tree delay insensitive to the test-induced thermal map. Clock schemes with multiple supply voltages have been proposed previously to provide savings in total IC power dissipation [19,20], so the second supply would not be required for test purposes only.

A dual-voltage clock scheme is illustrated in Figure 9, where a high-to-low converter (HLconverter) is a buffer that converts the clock signal entering the chip from a high-voltage swing to a lower-voltage one. The structure of the HLconverter is relatively straightforward: to convert the clock swing from a high voltage range to a lower voltage range, a conventional buffer supplied at VDDopt is used. The clock signal is then transmitted across the chip as a low-voltage signal. Before it reaches a flip-flop, the clock signal is converted back to the original supply voltage by a low-to-high converter (LHconverter) block, so that it has the same voltage swing as the logic network. The structure of the LHconverter is more involved, as reported in [19,20].

Figure 9. Typical dual-voltage clock scheme. (Diagram labels: HLconverter, LHconverter.)

6. Results And Discussion

We verified the benefits of a dual supply-voltage structure whose clock bias voltage is selected to compensate the temperature-related delay fluctuations during delay-test application. Figure 10 shows that when the clock buffers are biased at VDDopt = 0.7 V, the skew is reduced by up to one order of magnitude with respect to the skew observed at the nominal supply, shown in Figure 7.

Figure 10. Skew vs. temperature increase for a clock tree biased at VDDopt (VDDopt = 0.70 V). Results are relative to a reference temperature of 50 ºC, assuming that the hot spot affects different numbers of buffer stages (n = 2, 4, 8, 12). (Axes: skew in s versus temperature increase in ºC.)

A positive side effect of this clock scheme is the reduction in power consumption during test application, with the consequent benefit in terms of circuit reliability [19]. However, a lower clock-tree voltage may increase the relative impact of noise on the supply network. Additional research is needed to analyze the benefits of a separate supply grid in terms of noise. Routing a separate supply grid for the clock tree has the benefit of intrinsic supply-noise isolation, since IR drop and di/dt noise are decoupled between the regular logic circuitry and the clock-tree buffers. A tradeoff analysis is required to determine the final impact of noise on the clock circuitry.

A lower supply voltage for the clock-tree circuitry results in a larger clock period. Such a relative speed reduction with respect to the logic circuitry does not represent a limitation in terms of delay-test application, since neither the launch-from-test nor the launch-from-capture technique is affected. In any case, the nominal clock period increase obtained when lowering the supply voltage is on the order of a factor of two, and affects all the clock buffers. This is not relevant if the main contribution to the clock period, tclock, comes from long critical paths, i.e. a large dmax in Eq. (1).

7. Conclusions

Clock skew has as much impact on overall parametric yield as any propagation delay. Large clock skews can cause timing violations, due to the erosion of either setup time or hold time. It has been reported that process parameter variations, parasitics, and crosstalk affect the delay of each clock tree branch. Therefore a small timing margin must be maintained throughout the circuit to guarantee correct functionality with acceptable yield. In this work we have shown that temperature gradients may also be an important source of clock skew. Thermal gradients induced by test application may have an “artificial” impact on the appearance of delay faults, since the switching activity is higher during test. We have presented a method to minimize the impact of thermal gradients on the clock skew, based on a two-supply-voltage clock scheme. Results show that the proposed scheme reduces the thermal gradient component of the skew by one order of magnitude. The adoption of a dedicated supply voltage for the clock tree has also been reported to be beneficial for power reduction, so it would not be required for testing purposes only. Further analysis is required to investigate the tradeoff between temperature-insensitive clock operation and noise impact.

Acknowledgements

This work has been partially supported by the Spanish Ministry of Science and Technology, the Regional European Development Funds (FEDER) from the European Union project TIC2002-01238, the CAIB project FPI01-43086696, and an Intel Laboratories-CRL research grant.

References

[1] K. Banerjee et al., “On thermal effects in deep submicron VLSI interconnects”, 36th ACM Design Automation Conference, pp. 885-891, 1999.
[2] J.M. Rabaey, A. Chandrakasan, B. Nikolic, Digital Integrated Circuits: A Design Perspective, Prentice-Hall, 2nd edition, Chapt. 10, 2003.
[3] M.L. Bushnell, V.D. Agrawal, Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits, Kluwer Academic Publishers, 2000.
[4] P. Franco, S. Ma, J. Chang, C. Yi-Chin, S. Wattal, E.J. McCluskey, R.L. Stokes, and W.D. Farwell, “Analysis and detection of timing failures in an experimental test chip”, IEEE Int. Test Conference, pp. 691-700, Oct. 1996.
[5] E.J. McCluskey, Chao-Wen Tseng, “Stuck-fault tests vs. actual defects”, IEEE Int. Test Conference, 3-5 Oct. 2000, pp. 336-342.
[6] S. Wang, S.K. Gupta, “DS-LFSR: a BIST TPG for low switching activity”, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 21, pp. 842-851, 2002.
[7] P. Girard, C. Landrault, S. Pravossoudovitch, A. Virazel, H.J. Wunderlich, “High defect coverage with low-power test sequences in a BIST environment”, IEEE Design & Test of Computers, Vol. 19, pp. 44-52, 2002.
[8] X. Zhang, W. Shan, and K. Roy, “Low-Power Weighted Random Pattern Testing”, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 19, No. 11, pp. 1389, November 2000.
[9] J. Rosselló and J. Segura, “Charge-based analytical model for the evaluation of power consumption in submicron CMOS buffers”, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 21, No. 4, pp. 433-443, April 2002.
[10] S. Borkar et al., “Parameter variations and impact on circuits and microarchitecture”, Design Automation Conference, DAC'03, pp. 338-342, 2003.
[11] P.J. Restle et al., “A Clock Distribution Network for Microprocessors”, IEEE Journal of Solid-State Circuits, Vol. 36, pp. 792-797, 2001.
[12] R.-S. Tsay, “An Exact Zero-Skew Clock Routing Algorithm”, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 12, pp. 242-249, 1993.
[13] S. Tam, R.D. Limaye, U.N. Desai, “Clock generation and distribution for the 130-nm Itanium 2 Processor with 6-MB On-Die L3 Cache”, IEEE Journal of Solid-State Circuits, Vol. 39, pp. 636-642, 2004.
[14] D. Chen, E. Li, E. Rosenbaum, S.M. Kang, “Interconnect thermal modeling for accurate simulation of circuit timing and reliability”, IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Vol. 19, pp. 197-205, 2000.
[15] A.H. Ajami, M. Pedram, K. Banerjee, “Effects of Non-uniform Substrate Temperature on the Clock Signal Integrity in High Performance Designs”, IEEE Custom Integrated Circuits Conference, pp. 233-237, 2001.
[16] A.H. Ajami, K. Banerjee, M. Pedram, “Analysis of Substrate Thermal Gradient Effects on Optimal Buffer Insertion”, IEEE International Conference on Computer Aided Design (ICCAD), pp. 44-48, 2001.
[17] A.H. Ajami, K. Banerjee, M. Pedram, L.P. van Ginneken, “Analysis of Non-uniform Temperature-dependent Interconnect Performance in High Performance ICs”, Design Automation Conference, pp. 567-572, 2001.
[18] T. Sakurai and A.R. Newton, “Alpha-power law MOSFET model and its applications to CMOS inverter delay and other formulas”, IEEE Journal of Solid-State Circuits, Vol. 25, pp. 584-594, April 1990.
[19] M. Igarashi et al., “A low-power design method using multiple supply voltages”, Int. Symp. Low Power Design, pp. 36-41, 1997.
[20] J. Pangjun, S. Sapatnekar, “Low Power Clock Distribution Using Multiple Voltages and Reduced Swings”, IEEE Trans. on Very Large Scale Integration Systems, Vol. 10, No. 3, pp. 309-318, 2002.
[21] E. Malavasi, S. Zanella, M. Cao, J. Uschersohn, M. Misheloff and C. Guardiani, “Impact Analysis of Process Variability on Clock Skew”, International Symposium on Quality Electronic Design (ISQED'02), 2002.
