
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 27, NO. 4, APRIL 2008

Power Grid Analysis and Optimization Using Algebraic Multigrid

Cheng Zhuo, Student Member, IEEE, Jiang Hu, Senior Member, IEEE, Min Zhao, and Kangsheng Chen

Abstract—This paper presents a class of power grid analysis and optimization techniques, all of which are based on the algebraic-multigrid (AMG) method. First, a new AMG-based reduction scheme is proposed to improve the efficiency of reducing the problem size for power grid analysis and optimization. Next, with the proposed reduction technique, a fast transient-analysis method is developed and extended to an accurate solver with error control mechanism. After that, the scope of this method is further broadened for handling the analysis of the modified grid. Finally, a fast decap-allocation (DA) scheme based on AMG is suggested. Experimental results show that these techniques not only achieve a significant speedup over reported industrial methods but also enhance the quality of solutions. By using the proposed techniques, transient analysis with 200 time steps on a 1.6-M-node power grid can be completed in less than 5 min; dc analysis on the same circuit can reach an accuracy of 1 × 10^−6 in about 141 s. Our DA can process a circuit with up to one million nodes in about 11 min.

Index Terms—Capacitance, multigrid, optimization, power grid, simulation.

Manuscript received October 31, 2006; revised March 30, 2007 and July 26, 2007. This work was supported by the Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China, under Grant 20060335065. This paper was recommended by Associate Editor A. Raghunathan. C. Zhuo was with the Department of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China. He is now with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109 USA (e-mail: [email protected]). J. Hu is with the Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843 USA (e-mail: jianghu@ece.tamu.edu). M. Zhao is with Magma Design Automation, Inc., Austin, TX 78759 USA (e-mail: [email protected]). K. Chen is with the Department of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China (e-mail: chenks@zju.edu.cn). Digital Object Identifier 10.1109/TCAD.2008.917587

I. INTRODUCTION

THE TECHNOLOGY advance toward the nanometer regime has brought on-chip power integrity issues into the spotlight [1]. In modern very large scale integration (VLSI) design, a robust on-chip power supply network is indispensable for ensuring system performance [2], [3]. A poorly designed power grid may easily lead to extra logic delays, signal integrity problems, and even functional failures. To overcome the increasing IR drop, electromigration, and simultaneous switching noise, a robust power-grid design is becoming more and more important [4].

For a high-performance chip, power-grid design is an iterative procedure [3]. From the early stage to the postlayout stage, it requires multiple iterations of planning, resource allocation, and refinement. To ensure the robustness of the design, it is necessary to have the following: 1) fast and accurate power grid analysis and 2) an efficient power-grid optimization method [5]. The main challenge of power grid analysis or optimization is its huge size, typically millions of nodes. Due to the tremendous number of variables, using a general-purpose simulator, such as SPICE, is no longer feasible in practice. Therefore, new techniques with high computational efficiency, in terms of both execution time and memory, are in high demand.

The challenge of huge problem size in power grid analysis and optimization has resulted in many research works from both academia and industry [2], [3], [6]–[24]. Among them, some techniques achieve a relatively low time/space complexity, such as the hierarchical macromodels [9], the random-walk-based method [10], and the 2-D/3-D transmission-line methods [12], [13]. Due to the intrinsic similarity between the power grid and the discretized structure of smooth partial differential equations, Kozhaya et al. [14] apply a geometric-based multigrid-like technique to power grid analysis. In order to further handle irregular power-grid structures, algebraic-multigrid (AMG)-based techniques were also developed in [15]–[17]. Each of the aforementioned methods aims at reducing problem complexity to gain speedup with little or no accuracy loss.

In power-grid design, decoupling capacitance (decap) is a very effective technique for suppressing transient noise. On-chip decap allocation (DA) is also a very difficult problem, not only due to its huge size but also because of the nonlinear nature of its constraints. In [18], a charge-based model is developed to roughly estimate the decap size for each individual module. Other works [19]–[23] use the adjoint sensitivity technique to guide the solution search in nonlinear optimization.
Even though the number of transient simulations is greatly decreased by the merged adjoint method [21], [22] or greedy search [22], [23], the sheer size of the problem still implies a huge computation cost and, therefore, needs to be reduced directly. In [23], the problem-size reduction is achieved by divide-and-conquer, assuming that the boundary voltages of each partition do not change during the decap optimization. The work of [24] uses a geometric-multigrid (GMG) technique [14], [25] to reduce the problem size, the effectiveness of which is mainly restricted to regular power grids [15], [16].

In this paper, we propose an AMG-based reduction for fast power grid analysis and decap optimization. Multigrid can reduce the system size by pruning out a large number of variables. In VLSI design, its efficiency depends on the smoothness of the current distribution, the memory and runtime of keeping track of circuit geometry, and the complex procedure of deciding the interpolation operator. This paper addresses these problems in order to obtain a practical and efficient implementation of the AMG-based method for power grid analysis and optimization. The AMG-based method is very general and, therefore, can be applied to any power grid, regular or irregular. It is also very flexible and can be easily combined with other techniques such as the conjugate-gradient (CG) method, the charge-based technique [18], or the partitioning method [23] for further speedup. Our approach is composed of the following steps.

1) Use a dynamic AMG-based reduction strategy to reduce the grid size.
2) Obtain the corresponding restriction and interpolation operators.
3) Simulate or optimize on the coarsest grid.
4) Map the solution back to the finest grid.

In our approach, the first two steps are the same for both the power grid analysis and the decap optimization. By noting the inherent connection between the analysis and the optimization, we can reuse the information contained in the analysis to gain a further speedup in optimization.

The rest of this paper is organized as follows. Section II gives an overview of the multigrid method. Next, in Section III, we provide a brief review of power-grid modeling. Following that, an improved AMG-based reduction scheme is discussed in Section IV. Additionally, several AMG-based methods for power grid analysis are demonstrated in Section V. In Section VI, we present the algorithm that uses AMG for DA. Section VII reports the performance of our algorithms on a set of benchmarks. Finally, we present concluding remarks in Section VIII. This paper is an extended description of the work in [17] and [26]. As in other literature, we use h to indicate the fine grid and H the coarse grid.

II. OVERVIEW OF MULTIGRID METHOD

Multigrid is a method to accelerate the convergence of solving differential equations numerically [25], [27]. The main component of solving differential equations is to solve linear systems by using iterative methods such as Gauss–Seidel. The iterative methods can smooth out high-frequency errors rapidly but are usually slow in removing low-frequency errors.
The basic idea of multigrid is to use a projection of the fine-grid problem on a coarser grid to remove the hard-to-damp low-frequency errors, which is called coarse-grid correction [25]. The high-frequency errors are removed with those iterative techniques, which is called smoothing [25]. The coarse-grid correction and smoothing complement each other, i.e., the errors that are not damped by the smoothing will be damped by the coarse-grid correction and vice versa. The basic multigrid operations include the following:

1) reduction, which maps the problem to the coarse level with the restriction operator;
2) interpolation, which maps the problem back to the fine level with the interpolation operator;
3) smoothing, which uses iterative methods to smooth out high-frequency errors;
4) coarse-grid correction, which eliminates low-frequency errors on coarse grids.

A classical multigrid solver repeatedly applies the aforementioned operations, reduces the error in every step, and finally converges to the solution.

The multigrid method can be categorized into GMG and AMG. GMG is relatively straightforward in that the reduction and interpolation operations are based on predefined grid hierarchies of the problem, preferably regular structures. Fixing the coarse-grid correction puts a more complex requirement on the choice of smoothers to maintain fast convergence [25], [27]. In contrast, AMG does not require a predefined grid. Instead, it fixes its smoother to some simple method and carries out its reduction and interpolation based only on the information contained in the underlying matrix. Therefore, AMG is more flexible in handling general structures that may be irregular [27].

Recently, multigrid has been widely used in power grid analysis [14]–[16] and optimization [24]. The multigrid method in [14] uses an AMG-like interpolation procedure. However, regarding the selection of the coarser grids, this method is still geometrically based. Hence, it requires keeping track of the geometry change, which may degrade the efficiency due to the irregularity. The work in [15] emphasizes the system reduction part of AMG so that fast computation is achieved. However, it neglects the smoothing steps, which results in nontrivial accuracy degradation. The AMG-based power grid analysis of [16] follows the complete AMG procedure with a smoothing operation at every level. This approach can achieve better solution accuracy, but the runtime improvement is smaller. The work of [24] is based on GMG and, therefore, is restricted to regular power grids. In practice, a power grid is often irregular [15], [16] due to the usage of IP cores or system-on-chip designs. Furthermore, it is hard to estimate the violation accurately using only linear programming (LP) as in [24].

III. BRIEF REVIEW OF POWER-GRID MODELING

Before presenting our AMG-based approach, we give a brief review of power-grid modeling and simulation. There are two supply grids in VLSI design: the power and ground grids. The two grids influence each other, and therefore, a simultaneous simulation is preferred.
However, if we take advantage of the fact that the power and ground grids are often symmetric, the combined power/ground grids can be reduced back to a single power grid [6]. A power grid is usually a metal mesh where each edge can be modeled as a resistor. Each node of the mesh has a ground capacitance consisting of parasitic capacitance and decap. Active devices, which are modeled as time-varying current sources, are connected to the mesh nodes. Some nodes are also connected to power pads that can be treated as ideal voltage sources [28]. Hence, with a modified nodal analysis, the linear system can be represented with the following system of differential equations:

    G \cdot x(t) + C \cdot x'(t) = u(t)    (1)

where G is the conductance matrix, C is the admittance matrix resulting from capacitive elements (or inductive elements, if the metal mesh is linked to pads with RL elements), x(t) is the vector including the node voltages, voltage sources, and corresponding branch currents, and u(t) is the vector of independent time-varying current sources [9], [14], [29]. By using the backward Euler method, this system can be discretized to a linear algebraic system

    (G + C/h) \cdot x(t) = u(t) + (C/h) \cdot x(t - h).    (2)
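As an illustration of (2), the following minimal Python sketch steps a hypothetical two-node RC circuit with backward Euler. The element values, the constant load current, and the helper names `solve2` and `backward_euler` are assumptions made for this example and are not taken from the benchmarks of this paper. A real power grid would rely on a sparse factorization of G + C/h that is computed once and reused at every time step.

```python
def solve2(a, b):
    """Solve a 2x2 linear system a @ x = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - b[0] * a[1][0]) / det
    return [x0, x1]


def backward_euler(G, C_diag, u, x0, h, steps):
    """Apply (G + C/h) x(t) = u(t) + (C/h) x(t - h) for a diagonal C."""
    # The system matrix A = G + C/h stays fixed when the time step h is fixed.
    A = [[G[r][c] + (C_diag[r] / h if r == c else 0.0) for c in range(2)]
         for r in range(2)]
    x = list(x0)
    history = [x]
    for k in range(1, steps + 1):
        rhs = [u(k * h)[r] + C_diag[r] / h * x[r] for r in range(2)]
        x = solve2(A, rhs)
        history.append(x)
    return history


# Hypothetical circuit: pad (1.0 V) -- g_pad -- node 0 -- g01 -- node 1 -- load.
VDD, G_PAD, G01, I_LOAD = 1.0, 10.0, 5.0, 0.1
G = [[G_PAD + G01, -G01], [-G01, G01]]
C_DIAG = [1e-3, 1e-3]


def u(t):
    # Norton equivalent of the pad at node 0; the load draws current at node 1.
    return [G_PAD * VDD, -I_LOAD]


trace = backward_euler(G, C_DIAG, u, x0=[VDD, VDD], h=1e-4, steps=200)
final_v0, final_v1 = trace[-1]
```

With these assumed values, the trace settles toward the dc solution of G x = u, which makes it easy to check the time stepping against a hand calculation.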

With a fixed time step, we may rewrite (2) as A^h · x(t) = b(t), where A^h = G + C/h is the system matrix, and b(t) = u(t) + (C/h) · x(t − h). If x(t) only consists of node voltages, then it represents an RC network whose matrix A^h is symmetric and positive definite (SPD) [14]. If inductive elements are taken into consideration, the use of the K matrix [30] still keeps the system matrix SPD, whereas (2) becomes two coupled iterative equations [8].

IV. IMPROVED AMG-BASED POWER GRID REDUCTION

Reduction is a critical step in AMG methods. In this step, we can reduce the power grid to an easy-to-handle size and obtain the corresponding interpolation operator. Since the system matrix A^h is SPD, the restriction operator is simply the transpose of the interpolation operator [27]. A good choice of the interpolation operator will greatly improve the convergence of the AMG [27]. This section focuses on an improved AMG-based reduction scheme. The improved reduction procedure to obtain the interpolation operator is discussed in Section IV-A. Then, a dynamic-threshold control mechanism for speeding up the reduction is presented in Section IV-B.

A. Interpolation Operator

In this section, we introduce an interpolation operator that can provide a desired compromise between accuracy and runtime for power grid analysis. We define the following notations for describing the reduction (coarsening) from a certain granularity level to a coarser grained level:

    C = {the set of nodes kept in the coarse grid}
    F = {the set of nodes removed from the fine grid in coarsening}
    N_i = {the set of nodes neighboring node i}
    S_i = {the set of nodes strongly connected to node i}.

Fig. 1. Small example of the RC model.

In the process of reduction or coarsening, some nodes (or variables) are removed, whereas the others are retained in the coarsened grid. At the beginning, C = F = ∅. The nodes are visited in a prefixed order. Usually, critical nodes, such as power pads, critical loads, and corner nodes, are kept in the coarse grid or added into set C [15]. Whether or not a noncritical node is kept depends on the connection strength between nodes [15]. As in [15], the connection strength between nodes j and i is defined as

    Strength = |a_{ij}/a_{ii} + a_{ij}/a_{jj}| / 2    (3)

where a_ij is the element at the ith row and the jth column of the system matrix. For a node i, its neighboring nodes with a connection strength greater than a certain threshold φ (usually in the range 0.1–0.3) [15], [27] form the set S_i. If a node is removed during coarsening, its smoothing error can be interpolated from its strongly connected nodes [27]. Using too few neighboring nodes for interpolation may degrade accuracy, whereas using too many nodes may incur large runtime and memory overhead.

In [15], only one neighboring node, which has the strongest connection, is utilized for interpolation. This method depends on the assumption that the neighboring node with the strongest connection plays the dominant role in interpolation and suppresses the impacts of all the other nodes in S_i. However, this assumption is not true for every node, particularly when the visited node has some neighboring nodes with similar connection strength. Hence, it is unfair to arbitrarily choose one of the neighboring nodes and discard the others, as their contributions to the interpolated voltage are almost the same. The classical AMG method uses all the nodes in S_i [27] for interpolation. If any node j in S_i has already been removed, multipass interpolation is performed, in which the nodes in S_j are used to replace the impact of node j [14], [27]. As the reduction procedure continues, it may take several passes to find the interpolation weights for a specific node. Thus, it becomes complex and time consuming to decide the interpolation operator. Besides that, using too many nodes for interpolation will increase the number of nonzero elements (NNZs) of the coarsened system matrix and degrade the efficiency of matrix factorization. In order to achieve a better compromise between accuracy and computation cost, we propose to perform interpolation based on a redefined S_i as

    S_i = \{ j \mid Strength \ge \max( \phi, \max_{N_i}\{Strength\} - \epsilon ) \}    (4)

where φ is the threshold for reduction, and ε is an empirically chosen constant with a small value of about 0.001–0.005. The connection strength of the selected nodes is close to max_{N_i}{Strength}, so these nodes have the most significant influence on the removed node. Since ε is small, the interpolation weight can be simply set as 1/|S_i|, where |S_i| is the number of nodes in the set S_i. This scheme simplifies the interpolation-weight computation and works particularly well when many nodes have similar “strength”. Then, F = F ∪ {i}, and C = C ∪ N_i. For a removed node, all of its neighboring nodes will be kept in the coarse grid.

We show a small example to illustrate the details of the interpolation operation in Fig. 1. Suppose that all of the four neighboring nodes j1, j2, j3, and j4 are strongly connected to node i, and the connection strength of j1 and j3 to node i is much larger than that of the other two nodes. Such a case is often met when j1, j3, and i are on one metal layer, whereas the other two nodes are on another layer. We can obtain the interpolation operator in Fig. 2, in which nodes j1 and j3 are selected for interpolation. Unlike our scheme, the method in [15] arbitrarily chooses the first-visited one in {j1, j3} and discards the other one with a similar strength. The classical AMG uses all the nodes for interpolation, decides whether there exists a multipass, and then computes the interpolation weights.
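The selection rule of (3) and (4) can be sketched in a few lines of Python. The matrix below is a hypothetical five-node example in the spirit of Fig. 1 (node 0 plays the role of node i, nodes 1 and 3 stand for j1 and j3, and nodes 2 and 4 for j2 and j4). The numerical values, the function names, and the defaults φ = 0.2 and ε = 0.005 are assumptions chosen for illustration only.

```python
def strength(A, i, j):
    """Connection strength of (3): |a_ij/a_ii + a_ij/a_jj| / 2."""
    return abs(A[i][j] / A[i][i] + A[i][j] / A[j][j]) / 2.0


def interpolation_set(A, i, phi=0.2, eps=0.005):
    """Redefined S_i of (4), returned with the uniform weights 1/|S_i|."""
    neighbors = [j for j in range(len(A)) if j != i and A[i][j] != 0.0]
    s = {j: strength(A, i, j) for j in neighbors}
    # Keep only neighbors that are both strong and close to the strongest one.
    cutoff = max(phi, max(s.values()) - eps)
    selected = [j for j in neighbors if s[j] >= cutoff]
    weight = 1.0 / len(selected) if selected else 0.0
    return {j: weight for j in selected}


# Hypothetical system matrix: node 0 has two strong same-layer neighbors
# (1 and 3) and two weaker cross-layer neighbors (2 and 4).
A = [[12.0, -4.0, -2.0, -4.0, -2.0],
     [-4.0, 8.0, 0.0, 0.0, 0.0],
     [-2.0, 0.0, 4.0, 0.0, 0.0],
     [-4.0, 0.0, 0.0, 8.0, 0.0],
     [-2.0, 0.0, 0.0, 0.0, 4.0]]

weights = interpolation_set(A, i=0)
```

In this example all four neighbors exceed φ, yet only nodes 1 and 3 fall within ε of the maximum strength, so each receives the weight 1/2.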


Fig. 2. Interpolation operator.

Based on the interpolation operation described above, we can obtain the overall interpolation operator P_H^h = P_{H(1)}^h P_{H(2)}^h ⋯ P_{H(n)}^h as well as the coarsened system matrix (P_H^h)^T A^h P_H^h. This procedure is performed iteratively until the matrix is small enough for a direct solve.

It can be observed that for our interpolation operator P_{H(i)}^h at any grid level i, the sum of any row is one because there are n_k NNZs in the kth row, each with the value 1/n_k. Therefore, we have the following lemma for the overall interpolation operator P_H^h.

Lemma 1: The sum of any row in the interpolation operator P_H^h (an n × m matrix with n > m) is one.

Proof: The sum of any row in the interpolation operator P_{H(i)}^h at level i is one, and the overall interpolation operator is the product of the interpolation operators at all levels, i.e., P_H^h = P_{H(1)}^h P_{H(2)}^h ⋯ P_{H(n)}^h.

Let us consider the product P_{H(i)}^h P_{H(i+1)}^h, where P_{H(i)}^h is an n_i × m_i matrix, P_{H(i+1)}^h is an n_{i+1} × m_{i+1} matrix, and m_i = n_{i+1}. Hence, the kth row of the product is P_{H(i)}^h(k, :) P_{H(i+1)}^h, where P_{H(i)}^h(k, :) is the kth row of P_{H(i)}^h. The sum of this row can be rewritten as

    \sum_{l=1}^{m_{i+1}} \sum_{j=1}^{m_i} P_{H(i)}^h(k, j) P_{H(i+1)}^h(j, l) = \sum_{j=1}^{m_i} P_{H(i)}^h(k, j) \sum_{l=1}^{m_{i+1}} P_{H(i+1)}^h(j, l).    (5)

Since \sum_{l=1}^{m_{i+1}} P_{H(i+1)}^h(j, l) = 1 and \sum_{j=1}^{m_i} P_{H(i)}^h(k, j) = 1, the sum in (5) is one. By repeating the aforementioned procedure, it can be seen that the sum of any row in P_H^h is one. Moreover, since the restriction operator R_h^H is the transpose of P_H^h, the sum of any column in R_h^H is also one. Q.E.D.

B. Dynamic Reduction Threshold

The reduction rate of coarsening heavily depends on the threshold φ. The reduction rate can be quantified as the reduction ratio between two consecutive levels

    ratio = \frac{\text{number of nodes at previous grid level}}{\text{number of nodes at current grid level}}.    (6)

If the threshold φ is too low, the reduction ratio may be very high during the first few iterations, and the accuracy is degraded. In later iterations, the node degree increases rapidly due to the aggressive reduction. Consequently, the matrix quickly becomes very dense, and the reduction dramatically slows down. Therefore, a threshold that is too low may hurt both accuracy and convergence rate. If the threshold is too high, it is quite likely that the sets S_i for some nodes are empty and that very few nodes are removed. Previous AMG-based power-grid-analysis works [15], [16] use a constant threshold throughout all levels of coarsening. When the system matrix changes during coarsening, the constant threshold may sometimes be too low or too high. In order to solve this problem, we propose a dynamic-threshold mechanism such that a stable reduction rate is retained in all levels of coarsening.

At the beginning, the threshold is set to 0.2, which is an empirically good value employed in previous work [15], [27]. After the first level of coarsening, the threshold φ is determined by an empirical function

    \phi_{iter} = f(ratio_{iter-1}) =
    \begin{cases}
      0.1, & \phi_{iter-1} < 0.1 \\
      0.3, & \phi_{iter-1} > 0.3 \\
      [u(ratio_{iter-1} - 1.5) - 0.5] \times k_1 \times (e^{k_2 |ratio_{iter-1} - 1.5|} - 1) + 0.2, & \text{otherwise}
    \end{cases}    (7)

where u(x) is a unit step function, and k_1 and k_2 are empirically chosen as 0.001 and 12, respectively. By doing so, the reduction ratio can be stabilized throughout all levels of coarsening. This empirical function is designed so that the threshold becomes smaller when the reduction ratio is too low and larger when the ratio is too high. We consider 1.5 a good reduction ratio for each level. In order to keep the ratio stable across levels, a two-sided exponential function is suggested as the basic structure of the equation. We fitted the equation and determined each parameter approximately through numerical experiments. Thus, such a strategy can be applied to a series of different cases.

V. AMG-BASED METHODS FOR POWER GRID ANALYSIS

This section begins with a fast AMG-based approximation method for power grid transient analysis in Section V-A. After that, by combining it with an error control mechanism, the proposed AMG is extended to an accurate solver in Section V-B. Section V-C proposes an interpolation-operator refinement mechanism and applies the AMG method to analyze a grid with small modifications.

A. Improved AMG-Based Approximation Solver

Following the discussion in Section IV, we can easily obtain the interpolation/restriction operators and map the problem from the fine grid to the coarse grid. However, if the complementary smoothing operation is skipped, as in [15], great accuracy degradation may occur. On the other hand, directly applying a general-purpose AMG solver to power grid analysis is not computationally efficient because the pre- and postsmoothing at each coarse-grid correction level are very time consuming. Instead, we propose a customized AMG method for power grid analysis, which is summarized in Fig. 3. In this new approach, a weighted-Jacobi-based presmoothing is performed only once at the beginning. This presmoothing can significantly improve solution accuracy, while a single iteration of presmoothing has a limited impact on runtime. After reduction, the linear system at the coarsest grid can be solved directly. Successive solutions at each time point then involve only inexpensive forward and backward substitution procedures.


Fig. 3. Improved AMG-based approximation solver for power grid transient analysis.
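The flow described in Section V-A can be sketched for a single right-hand side as follows. This is a minimal two-level illustration under stated assumptions, not a reproduction of the algorithm in Fig. 3: the five-node resistor chain, the damping factor ω = 2/3, the dense helper routines, and the function name `two_level_solve` are all hypothetical. In a transient run, the coarse matrix would be factorized once and reused at every time point, whereas this sketch simply solves it directly.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def matmul(A, B):
    """Dense matrix-matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def solve(A, b):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(A)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def two_level_solve(A, b, P, x0, omega=2.0 / 3.0):
    """One weighted-Jacobi presmoothing sweep, then a coarse-grid correction."""
    r = [bi - yi for bi, yi in zip(b, matvec(A, x0))]
    x = [x0[k] + omega * r[k] / A[k][k] for k in range(len(b))]
    R = [list(col) for col in zip(*P)]      # restriction R = P^T
    A_H = matmul(matmul(R, A), P)           # coarse matrix P^T A P
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    e_H = solve(A_H, matvec(R, r))          # direct solve on the coarse grid
    return [xi + ei for xi, ei in zip(x, matvec(P, e_H))]


# Hypothetical 5-node resistor chain; node 0 is tied to a 1.0 V pad.
A = [[2.0, -1.0, 0.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0, 0.0],
     [0.0, -1.0, 2.0, -1.0, 0.0],
     [0.0, 0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, 0.0, -1.0, 1.0]]
b = [1.0, -0.01, -0.01, -0.01, -0.01]
# Nodes 0, 2, 4 are kept; nodes 1 and 3 are interpolated with weights 1/|S_i|.
P = [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]

x = two_level_solve(A, b, P, x0=[1.0] * 5)
residual = [bi - yi for bi, yi in zip(b, matvec(A, x))]
coarse_residual = matvec([list(col) for col in zip(*P)], residual)
```

After the correction, the restricted residual vanishes up to rounding, which is the property that makes the remaining error invisible to the coarse grid and motivates the error control of Section V-B.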

B. AMG-Based Solver With Error Control Mechanism

The convergence of the multigrid method is affected by the smoothness of the current distribution. If a circuit has a severely uneven current distribution, a single presmoothing is not adequate for eliminating high-frequency errors, particularly in dc analysis. In [17], a W-cycle-like postsmoothing scheme is performed to control the error. With multiple iterations, this scheme can keep the error at a tolerable level. However, that scheme is not completely consistent with the proposed coarse-grid correction operator. By noting that the proposed AMG is designed to reduce the error, we may consider it as an approximation of the inverse of A^h and, hence, use it as a preconditioner for a Krylov subspace solver, such as the CG method. The AMG-preconditioned CG method is described in Fig. 4. By employing CG as the error control mechanism, we can reduce the error that is damped poorly by the proposed multigrid and obtain a more robust method with good convergence.

C. Incremental Update of the Interpolation Operator for the Analysis of the Modified Grid

If some changes are made to a chip design, its power grid often needs to be modified accordingly. In this scenario, the power grid analysis can be performed incrementally based on the results of the previous design instead of being carried out from scratch. We extend our AMG-based method for the fast analysis of the modified power grid.

By the AMG reduction, the nodes on the coarsened grid can be considered as the representative nodes of the fine grid. To analyze the modified grid, we just need to add the nodes in the modified regions into the original coarse grid. After that, the updated coarse grid can represent the modified fine grid and minimize the impact of the faraway pads in the wire-bond technology. To include those modified regions, we define the “local window,” which is the smallest window covering the modified region. If there are several modified regions, several corresponding windows are built.

TABLE I
NOTATIONS USED IN THE INTERPOLATION-OPERATOR INCREMENTAL UPDATE

Fig. 4. AMG-based solver with error control mechanism.
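The idea of Section V-B, using a coarse-grid operator as a preconditioner inside CG, can be sketched as follows. The sketch assumes a simple additive two-level preconditioner (a damped Jacobi term plus a coarse-grid correction), chosen here only because it is symmetric positive definite by construction; it is not claimed to be the operator of Fig. 4. The test system, the helper routines, and the names `make_preconditioner` and `pcg` are likewise hypothetical.

```python
import math


def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matmul(A, B):
    cols = list(zip(*B))
    return [[dot(row, col) for col in cols] for row in A]


def solve(A, b):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(A)
    M = [list(A[r]) + [b[r]] for r in range(n)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def make_preconditioner(A, P, omega=2.0 / 3.0):
    """Additive two-level operator: z = omega * D^-1 r + P (P^T A P)^-1 P^T r."""
    R = [list(col) for col in zip(*P)]
    A_H = matmul(matmul(R, A), P)

    def apply(r):
        coarse = matvec(P, solve(A_H, matvec(R, r)))
        return [omega * r[k] / A[k][k] + coarse[k] for k in range(len(r))]

    return apply


def pcg(A, b, apply_M, x0, tol=1e-10, max_iter=50):
    """Preconditioned conjugate gradient for an SPD matrix A."""
    x = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    z = apply_M(r)
    p = list(z)
    rz = dot(r, z)
    for it in range(max_iter):
        if math.sqrt(dot(r, r)) < tol:
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        z = apply_M(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical 5-node resistor chain; node 0 is tied to a 1.0 V pad.
A = [[2.0, -1.0, 0.0, 0.0, 0.0],
     [-1.0, 2.0, -1.0, 0.0, 0.0],
     [0.0, -1.0, 2.0, -1.0, 0.0],
     [0.0, 0.0, -1.0, 2.0, -1.0],
     [0.0, 0.0, 0.0, -1.0, 1.0]]
b = [1.0, -0.01, -0.01, -0.01, -0.01]
P = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.0], [0.0, 1.0, 0.0],
     [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]]

x, iterations = pcg(A, b, make_preconditioner(A, P), x0=[1.0] * 5)
max_residual = max(abs(bi - yi) for bi, yi in zip(b, matvec(A, x)))
```

The CG loop drives the residual below the tolerance even though a single application of the two-level operator is only approximate, which is the role of the error control mechanism.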

In the AMG-based method, constructing the interpolation operator is the most time-consuming step, as it includes multiple matrix multiplications. In a single dc analysis by the AMG-preconditioned CG method, the interpolation-operator construction accounts for about 40% of the entire computation cost. Therefore, in the analysis, we focus on how to incrementally update the interpolation operator for the modified power grid by partially reusing the interpolation operator of the original grid. The notations used in the methodology are shown in Table I.

Step 1) Decide the node sets S_del and S_add for removed and inserted wire segments.
Step 2) Decide the local window for each modified region. S_local contains the nodes in all these windows. For a node i in S_del ∩ {Nodes of PG_0^H}, its local window also includes the nodes whose voltages are interpolated from node i.
Step 3) Find the node set S_rmv.
Step 4) Obtain the new interpolation node set S_interp for the modified grid PG_m^h.
Step 5) Obtain the node set S_new for PG_m^h.
Step 6) Clear the interpolation weights for nodes in S_rmv to zero. Set their interpolation weights to one.


Step 7) Obtain the interpolation operator on PG_m^h. Remove the |S_del| rows corresponding to S_del from the interpolation operator P_H^h on PG_0^h. Insert |S_add| rows corresponding to S_add in P_H^h. Insert |S_rmv| columns corresponding to S_rmv in P_H^h. For a node i in S_rmv, the interpolation weight is at (x_i, y_i) of the new interpolation operator, where x_i is node i's index in S_new, and y_i is node i's index in S_interp.

For the aforementioned methodology, the adjustment of wires/decaps corresponds to column insertion in the interpolation operator, whereas wire-segment removal/insertion corresponds to row removal/insertion. We use the neighboring nodes and some strongly connected nodes to alleviate the impact of the modified regions on the interpolation operator. The methodology does not perform any matrix operation except row/column insertion/removal. Thus, it is faster than constructing a completely new interpolation operator. With the node voltage V_0(t) for PG_0^h as the initial guess for each time step, we can apply the updated interpolation operator to the same algorithm in Fig. 3. This approach can handle multiple modified regions on very large power grids and makes the modified power grid analysis more efficient.

VI. FAST DA USING AMG

In this section, we introduce a fast DA method based on AMG. An overview of the algorithm is given in Section VI-A, including the problem formulation and related power-grid-reduction issues. In Section VI-B, an error-compensation mechanism is proposed to reduce the errors resulting from algebraically nonsmooth structure. Several customized speedup techniques for sequential quadratic programming (SQP) are discussed in Section VI-C. Finally, a charge-based backmapping flow is demonstrated in Section VI-D.

A. Overview of DA Using AMG

The size of the decap at each node is a decision variable in the DA problem, in which we attempt to minimize the total area of decaps while the voltage at each node is no less than a certain threshold at any time point. The lower and upper bounds for the allowed decap size are represented as l_b and u_b, respectively. The DA problem is formulated as the following nonlinear programming (NLP) problem, where C^h is the decap vector, and C_i^h is the ith element of C^h.

    DA:  Minimize  \sum_{i \in PG^h} C_i^h    (8)

    Subject to:  ceq(C^h) = \sum_{i \in PG^h} s_i = 0    (9)

                 l_b < C^h < u_b    (10)

where s_i = \int_0^T |\min(V_i(t) - V_{th}, 0)| \, dt = \int_{t_1}^{t_2} (V_{th} - V_i(t)) \, dt, and [t_1, t_2] is the time interval in which the violation occurs, as shown in Fig. 5. This voltage-drop noise metric is adopted from [19].

Fig. 5. Illustration of voltage drop.

Due to the huge size of the power grid, solving the NLP DA directly is extremely difficult. Therefore, we propose to reduce the problem size using the AMG-based technique described in Section IV. The reduced problem on the coarse grid is solved directly, and then, the solution is mapped back to the original fine grid. Here, we use the conductance matrix G on the original fine grid as the system matrix A^h to obtain the interpolation operator. Following the techniques in Section IV, we can obtain the overall interpolation operator P_H^h for the underlying matrix A^h. Moreover, the overall restriction operator is R_h^H = (P_H^h)^T. Therefore, the system A^h can be reduced to a coarser grid A^H = R_h^H A^h P_H^h. Since A^h is just the conductance matrix G, A^H can be considered as the conductance matrix corresponding to a more complex but coarsened grid.

Once the interpolation operators are obtained, the current-source vector on the coarse grid can be obtained by

    I^H(t) = R_h^H I^h(t)    (11)

where I^h(t) is the current-source vector on the fine grid. Correspondingly, the bounds for decap sizes are also updated to l_b^H < C^H < u_b^H, where l_b^H = R_h^H l_b, and u_b^H = R_h^H u_b [24]. In our AMG-based method, the voltage-source nodes are retained. The parasitic capacitance is used as the lower bound of decap optimization instead of being stamped in the capacitance matrix for the convenience of computation. If we directly include the capacitive elements in the system matrix to compute the interpolation operator, the capacitance matrix on the coarse grid is no longer a diagonal matrix. The fill-ins at the off-diagonal positions denote the cross capacitance between nodes. Such a capacitance matrix may increase the number of variables in the NLP and make it unsolvable. It is also impractical to map the optimized cross capacitance back to the grounded decaps on the fine grid. Therefore, the capacitance on the coarse grid is obtained by

    C^H = R_h^H C^h.    (12)

A flow of the proposed DA algorithm is shown in Fig. 6. One transient analysis is performed on the original power grid, based on the interpolation operator P_H^h and the restriction operator R_h^H previously obtained, to precondition the system matrix such that we can obtain further speedup. Other techniques will be discussed in the following sections.
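The restriction steps in (11) and (12) can be illustrated with a short Python sketch. The interpolation operator, the current values (in mA), and the capacitance values (in fF) below are hypothetical numbers chosen so that the arithmetic is exact; the function name `restrict` is likewise an assumption. Because every row of the interpolation operator sums to one (Lemma 1), applying its transpose preserves the total current and the total capacitance.

```python
def restrict(P, v):
    """Apply R = P^T to a fine-grid vector v, as in (11) and (12)."""
    out = [0.0] * len(P[0])
    for row, value in zip(P, v):
        for c, w in enumerate(row):
            out[c] += w * value
    return out


# Hypothetical interpolation operator: fine nodes 0, 2, 4 are kept, and
# fine nodes 1 and 3 are interpolated from their two neighbors (weights 1/2).
P = [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0]]

I_fine = [0.0, 2.0, 1.0, 4.0, 3.0]        # load currents at one time point, mA
C_fine = [10.0, 20.0, 10.0, 40.0, 10.0]   # grounded capacitances, fF

I_coarse = restrict(P, I_fine)   # [1.0, 4.0, 5.0]
C_coarse = restrict(P, C_fine)   # [20.0, 40.0, 30.0]
```

Each removed node spreads its current and capacitance over the coarse nodes it is interpolated from, so the coarse vectors stay diagonal quantities and no cross capacitance appears.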

B. Error Compensation

Obviously, the power grid reduction causes some information loss, which is an inevitable price paid in exchange for the improvement in computation speed. In other words, the credibility of the optimization solution depends on the discrepancy between the transient response on the coarse grid and that on the fine grid. Based on that, we propose a compensation technique to reduce the error due to this discrepancy.

Fig. 6. Flow of DA using AMG.

Fig. 7. Voltage-drop noise metrics of V(t) and P_H^h V^H(t).

Fig. 8. Raise of the threshold voltage.

... its violation area is roughly equal to ceq(C^h). For example, in Fig. 8, we raise the threshold voltage for the interpolated voltage from V_th to V_th1 to make S1 ≈ S2. Thus, we have

    ceq(C^h) = \sum_{i \in PG^h} \int_{t_1}^{t_2} (V_{th} - V_i(t)) \, dt
             \ldots \sum_{i \in PG^h} \int^{t_6} \ldots

Suppose that the NLP on the coarse grid is

    DA coarse:  Minimize  \sum_{i \in PG^H} C_i^H    (13)

    Subject to:  ceq^H(C^H) = \sum s_i^H = 0

lH b

t2

H