## A New Optimization Cost Model for VLSI Standard Cell Placement P.Y.S. Cheung, C.S.K. Yeung, S.K. Tse, C.K. Yuen and W.L. Ko Electrical & Electronic Engineering, The University of Hong Kong Abstract -- In this paper, we propose a new optimization cost model for VLSI placement. It distinguishes itself from the traditional wirelength cost model[2][3][6] by having direct impact on the quality of the detailed routing phase. We also extend the well-known simulated annealing standard cell placement algorithm by applying our new cost model. Experimental results shows that we got 13% layout area reduction compared to traditional wire length model, 11% reduction to commercial tool. #### I. INTRODUCTION The VLSI placement came as one of the earliest problems needed to be fully automated as circuitry complexity has been doubled every two years predicted by the Morre's law[5]. In the placement stage, location of an individual block is assigned on a rectangular image to facilitate routing. Interconnections to external leads, called pins, are made in a manner to satisfy various physical constraints in routing phase. It is ideally the case that a given placement will achieve 100% routing within a given area. However, we are unable to know whether a placement solution is satisfactory until a later routing stage. So, we propose a new objective function hoping that it will lead to a good solution in the final layout, i.e., a high routability result. A far simpler objective which most placement algorithms used is to minimize the total wire-length parameter over all possible assignment of the cells. The calculation of the total wire length is the summation of the estimated length of each net which is usually obtained from the half-perimeter bounding box model[6] or the minimum steiner tree of the nets[2]. Even then, the problem has been found NP-complete and in general can only solved sub-optimally using heuristics. Obviously, the abstract model above is hypothetical and by no means reflect the true final routing wire length and even the optimal solution bear little implication to the final layout. Moreover, the wire length model only focuses on the amount of wiring but do not consider where the distribution. As a result, it can lead congestion and routing failures. Such a situation is illustrated in Figure 1. In this paper, we propose a new optimization cost model called <u>Net Crossings</u>, which distinguishes itself by having impact on the quality of the routing phase and hopefully leading more compact layout. We also extend the well-known simulated annealing standard cell placement algorithm with our new cost model. The remainder of this paper is organized as follows. We address standard definition and our new optimization model in section II. A new standard cell placement algorithm is presented in section III. Experimental results are discussed in Section IV and finally, the conclusion and discussion are in section V. vertical gird, spacing=1 unit vertical gird, spacing=1 unit - (a) Shorter wire length Two tracks required - (b) Longer wire length One tracks required Figure 1 An example shows that placement minimizes wire length can induce more area required for routing. # II. STANDARD DEFINITION AND OUR OPTIMIZATION MODEL The definition and notation are given as follows: A standard cell is defined by an ordered tuple $\delta$ =( $t_1$ $t_2$ ... $t_k$ )where a non-zero $t_j$ denotes an input or output terminal on the horizontal boundary of the cell, and a zero $t_j$ denotes an unused terminal. The terminals are assumed to be lying on the equally-spaced vertical grids and the number of grids for a cell is simply the cardinal $|\delta| = k$ which is then directly proportional to the cell width. The mirror $\delta$ of a cell is a transformation on $\delta$ by reversing the terminal ordering, i.e. $$\delta = (t_k ... t_2 t_1)$$ A standard cell library $\zeta = \{ \delta_1, \delta_2, \dots \delta_\pi \}$ is a set of distinct cells and for notational convenience, we also define $\widetilde{\zeta} = \{ \ \widetilde{\delta}_1, \ \widetilde{\delta}_2, \ \widetilde{\delta}_\pi \}$ A <u>row</u> $\overline{\gamma} = (\gamma_1 \gamma_2 ... \gamma_\mu)$ is ordered tuple where $\gamma_\rho \in \zeta$ or $\gamma_\rho \in \widetilde{\zeta}$ for each $\rho$ . The size of a row is the summation of the individual cell: $|\gamma| = \sum_{\rho} |\gamma_{\rho}|$ . Notice that the cells in a row may not be unique. A <u>channel</u> consists of an upper row $\overline{\alpha} = (\alpha_1 \ \alpha_2 \ \dots \ \alpha_{\mu})$ and a lower row $\overline{\beta} = (\beta_1 \ \beta_2 \ \dots \ \beta_{\pi})$ With respect to a channel, terminal $t_j$ in cell $\alpha_{\rho j}$ (or $\beta_{\rho j}$ respectively). Then the set of all terminals C is simply: $$\begin{split} C = \{ \text{ all } \alpha_{\rho j} \text{ 's and all } \beta_{\rho j} \text{ 's } \} \text{ where} \\ 1 \leq \rho \leq \mu \text{ , } 1 \leq j \leq \mid \alpha_{\rho} \mid \text{or } 1 \leq \rho \leq \pi \text{, } 1 \leq j \leq \mid \beta_{\rho} \mid \end{split}$$ A <u>net</u> $n \subseteq C$ specifies a set of terminals to be interconnected and a connectivity list $N = \{ n_1, n_2, \dots n_z \}$ is a collection of nets. The following definitions related to our definitions of *net crossings*. A <u>crossing</u> represents the situation where two nets intersect when individually connected. This is shown in Figure 2 below. among nets in a channel would be ideal, formulating an algorithm for such a complexity is basically unfeasible. It should be remembered that the module placement is a two-dimensional problem, e.g., in standard cell layout style, the layout area consists of many rows and just to compute the crossings number itself is far from trivial. Moreover, the actual number of crossings in a channel is still an unknown parameter while the locations and orientations of cells are not fixed. Rather then concentrating on the nets in a channel as a whole, we propose to narrow down the definition of the number of crossings $\chi$ from the perspective of a cell. Figure 3 illustrates the basic idea. Assuming that the shaded cell C with Although minimizing the number of crossings terminals $(t_1, t_2 \dots t_5)$ is centered at the origin of the Cartesian Plane and five other cells are connected to each of the terminals as shown. In Figure 3(a), no nets are crossed and so $\chi_c$ is simply zero optimally. This would not be so if cell $C_2$ and $C_4$ are interchanged and in this case $\chi_c$ become one as in Figure 3(b). As in Figure 3, suppose that five cells C<sub>1</sub> ... C<sub>5</sub> randomly placed around cell C, they can be partitioned into two ordered sets U and L; those reside on the upper half of the plane (consisting the 4th and 3rd quadrants) and those on the lower half plane(the 1st and the 2nd quadrants), both arranged $C_2$ ) and $L = (C_1, C_3, C_5)$ . Notice that the order of the terminals in C connecting to U, i.e., $(t_4,t_2)$ in anti-clockwise, which follow the order of U, and similarly but on the other side, L follows the proper terminals order of C. However, this in not the case in Figure 3, while $U = (C_2, C_4)$ is on the contrary of the terminal order of C. It should be noted that a net with both sides of terminals on the same vertical grid line, as the situation in Figure 4, induces no net crossings with other nets. Indeed, this can be proved to be a necessary and sufficient condition for $\chi_c = 0$ . Now given a placement configuration of all cells, the total number of net crossings $\gamma$ can be computed by: $\gamma$ = $\sum_{c} \chi_{c}$ , where c is the total number of cells. By minimizing the net crossings $\chi_c$ for each cell, the total number of net crossing y would also be minimized. Here, we look back the situation in Figure 1, in 1(a), $\chi = 1$ and half premier wire length is 4 while in 1(b), $\chi = 0$ and wire length is 5. Moreover, we can see two advantages of our model over the traditional wire-length model. First, $\chi_c$ can always be made to less than one-half of the maximum value since a simple side-mirror operation on cell c reverse the order of the terminals connecting to c. Second, the theoretical lower bound for $\chi$ is always zero if there is no multi-terminals net, while there is no such lower bound in the wire-length model. We can also observe that electrical equivalent swappable terminals can further reduced $\chi_c$ . So, our optimization model is also applicable to the following physical design process after placement, i.e. channel pin assignment and global routing. # III. STANDARD CELL PLACEMENT ALGORITHM In this section, we extend the simulated annealing placement algorithm[7] by applying our optimization model described in section II. The reason for such an extension is that we want to see the direct effect of our model and avoid the effect of the algorithm. Here is our extension. The standard cell placement algorithm consists of two stages. In stage1, modules are allowed to be moved and interchanged within the *range limiter*, which is a rectangular window and shrinks as the temperature decreases. When the temperature is reduced to a point at which the window size becomes too small to allow the modules to switch rows, stage 2 begins. First, all the modules overlaps are removed and then the annealing process continues with two operations: interchanges of adjacent modules and changes of a module's orientation, i.e. side-mirror of the module in x -direction. In this stage, we apply our optimization model -- net crossings $\chi$ , as the objective function, instead of the half perimeters of the bounding rectangles of all the nets. Because our definition of $\chi$ is focused on the module as described above, it should be noted that we regard the two adjacent cells as "one module" and the interchange operation changes the terminals permutation of "the module" as in Figure 5. The calculation of $\chi$ is the number of net crossings on "this module". Figure5 ### IV EXPERIMENTAL RESULTS We evaluated our optimization model and algorithm on two test benchmarks. The first set is two benchmark circuits from 1990 International Workshop on Layout Synthesis and the second is two real circuits generated by Mentor Graphics<sup>TM</sup> using VHDL. The parameters of the two test bench is shown in Table I and II respectively. Table I The first set of test bench | Circuit | #modules | #nets | #pin | #rows | |----------|----------|-------|-------|-------| | Primary1 | 752 | 904 | 5526 | 20 | | Primary2 | 2907 | 3029 | 18407 | 30 | Table II The second set of test bench | Circuit | # modules | #nets | #rows | |-------------|-----------|-------|-------| | 4-bit adder | 30 | 163 | 2 | | 16-bit ALU | 369 | 1867 | 6 | In the first test benchmark, an initial placement solution generated by *Min -cut* algorithm[1][4] is improved by the TimberWolf standard cell placement algorithm with traditional half perimeter bounding box model and our extension in Section III. After that, we use the AutoCells<sup>TM</sup> platform to do global, detailed routing and chip tiling. We compare the final layout area as in Table III. We also compare the placement results which the full process generated by AutoCells<sup>TM</sup> in Table IV. Table III. Layout Area Comparison. Traditional | <u> </u> | Model v.s. our model | | | |----------|-----------------------------|-------------------------|-------------------------| | Circuit | Chip area traditional model | Chip area our extension | Chip<br>area<br>reduced | | Primary1 | 17.28 | 15.71 | -9% | | Primary2 | 61.72 | 53.67 | -13% | Table III. Layout Area Comparison. AutoCells™ | <u>v</u> | <u>.s. our model</u> | | | |----------|-------------------------|----------------------------|-------------------| | Circuit | Chip area<br>AutoCells™ | Chip area<br>our extension | Chip area reduced | | Primary1 | 17.25 | 15.71 | -9% | | Primary2 | 60.25 | 53.67 | -11% | In the second test benchmark, we import our placement result to Mentor Graphics<sup>TM</sup> platform to finish the layout process. We compare the final chip area to the full process by Mentor Graphics<sup>TM</sup>. The results are in Table IV. We also compare the vias in the layout in Table V. The technology we used is CMOSN 1.2 micro. Table IV Chip Area Comparison. Mentor Graphics v.s. our | model | | | | | |-------------|----------------------------------|-------------------------------|-------------------|--| | Circuit | Chip area<br>Mentor<br>Graphics™ | Chip area<br>our<br>extension | Chip area reduced | | | 4-bit adder | 659994 | 624438 | -5% | | | 16 bit ALU | 3891810 | 3740028 | -4% | | Table V Vias Comparison. Mentor Graphics v.s. our model | Circuit | Vias by<br>Mentor<br>Graphics™ | Vias by our extension | Vias<br>reduced | |-------------|--------------------------------|-----------------------|-----------------| | 4-bit adder | 289 | 272 | -6% | | 16-bit ALU | 1761 | 1710 | -3% | ### V. CONCLUSION In this paper, we propose a new optimization model, net crossings, for VLSI standard cell placement. Our model distinguishes itself from the traditional half perimeter bounding box wire-length model by having direct impact on the quality of the channel routing phase and leads more compact layout. We prove our claims by implementing our model with a simulated annealing standard cells placement algorithm. We also compare with Commercial Tools AutoCells<sup>TM</sup> and Mentor Graphics<sup>TM</sup> and our model and algorithm generate more compact layout in all cases. It is obvious that our optimization model can be applied to other layout style such as building blocks and mixed beside standard cell architecture. Moreover, it is also applicable to channel pin assignment and global routing problem. We believe that further refinement and modification on the model can lead more compact layout. #### REFERENCE - [1] Breuer, M. A., "A class of min-cut placement algorithms," in *Proc. of the 14<sup>th</sup> Design Automation Conf.*, pp.284-290, 1977. - [2] Hanan, M., "On Steiner's problem with rectilinear distance," *SIAM Journal on Applied Mathematics*, vol. 14, no.2, pp. 255-265, March 1966. - [3] Hwang, F.K., "On Steiner minimal trees with rectilinear distance," *SIAM Journal on Applied Mathematics*, vol. 30, no.1, pp. 104-114. January 1976. - [4] Kernighan, B. W. and S. Lin, "An efficient heuristic procedure for partitioning graphs," Bell System Technical Journal, vol. 49, no. 2, pp. 291-307, February 1970. - [5] R. N. Noyce, "Microelectronics", Scientific American, 237:62-69, September 1977. - [6] Schweikert, D.G., "A 2-dimensional placement algorithm for the layout of electrical circuits," in *Proc. of 13<sup>th</sup> Design Automation Conf.*, pp. 408-416,1976. - [7] C. Sechen, D. Braun, A. Sangiovanni Vincentelli, "ThunderBird: a complete standard cell layout package," in IEEE Journal of Solid-State Circuits, vol. 23, no. 2, pp. 410- 420, April 1988.