













- Layout density control
  - density rules minimize yield impact
  - uniform density achieved by post-processing, insertion of dummy features
- Performance verification (PV) flow implications
  - accurate estimation of filling is needed in PD, PV tools (else broken performance analysis flow)
  - filling geometries affect capacitance extraction by > 50%
  - is a multilayer problem (coupling to critical nets, contacting restrictions, active layers, other interlayer dependencies)



- Modern foundry rules specify layout density bounds to minimize impact of CMP on yield
- Density rules control local feature density for w×w windows
  - e.g., on each metal layer, every 2000 μm $\times$ 2000 μm window must be between 35% and 70% filled
- Filling = insertion of "dummy" features to improve layout density
  - typically via layout post-processing in PV / TCAD tools
    - boolean operations on layout data
  - affects vital design characteristics (e.g., RC extraction)
  - accurate knowledge of filling is required during physical design and verification
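
The window rule above can be checked mechanically once the layout is summarized as feature area per tile. The following is a minimal sketch, assuming a fixed dissection in which each window is made of r × r square tiles; the function names, the grid values, and the 35%/70% defaults (taken from the example bullet) are illustrative only.

```python
def window_densities(tile_area, r, tile_size):
    """Density of every standard w x w window (r x r tiles) in a tile grid.

    tile_area[i][j] is the feature area inside tile (i, j); w = r * tile_size.
    """
    rows, cols = len(tile_area), len(tile_area[0])
    window_area = float((r * tile_size) ** 2)
    result = {}
    for i in range(rows - r + 1):
        for j in range(cols - r + 1):
            filled = sum(tile_area[i + di][j + dj]
                         for di in range(r) for dj in range(r))
            result[(i, j)] = filled / window_area
    return result


def density_violations(tile_area, r, tile_size, lower=0.35, upper=0.70):
    """Windows whose density falls outside [lower, upper]."""
    return {pos: d
            for pos, d in window_densities(tile_area, r, tile_size).items()
            if d < lower or d > upper}


# Hypothetical 3 x 3 grid of 1000 um tiles; each window is 2 x 2 tiles (2000 um).
tiles = [
    [500_000, 500_000, 100_000],
    [500_000, 500_000, 100_000],
    [900_000, 900_000, 900_000],
]
bad = density_violations(tiles, r=2, tile_size=1000)
print(bad)  # only the window anchored at tile (0, 1) is underfilled
```

Windows flagged as underfilled are the candidates for dummy fill. Note that this only inspects windows aligned to the tile grid.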















- If all *w* × *w* windows of a fixed *r*-dissection have density ≤ *U*, there may still be a *floating* *w* × *w* window with density as high as min{1, *U* + 1/*r* − 1/(4*r*<sup>2</sup>)}
- Fixed-dissection algorithm is inaccurate
- Exact algorithm is slow: O(*k*<sup>2</sup>)
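
To get a feel for how loose the fixed-dissection guarantee is, the bound above can be evaluated for a few values of r. This is a small sketch; the function name and the choice U = 0.70 are illustrative assumptions.

```python
def floating_density_bound(U, r):
    """Worst-case floating-window density when every fixed r-dissection
    window has density <= U: min{1, U + 1/r - 1/(4 r^2)}."""
    return min(1.0, U + 1.0 / r - 1.0 / (4.0 * r * r))


for r in (2, 4, 8, 16):
    print(r, floating_density_bound(0.70, r))
```

With U = 0.70, the bound is still about 0.93 at r = 4 and about 0.82 at r = 8, so a coarse dissection can miss substantial violations.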



### **Multilevel Approach**

### Estimation:

- max floating window density  $\leq$  max bloated window density
- min floating window density ≥ min shrunk window density

### Zooming:

- remove standard windows in underfilled bloated windows
- subdivide remaining tiles and find area of new bloated windows
- Terminate subdivision when either:
  - # of rectangles is small (run exact density analysis), or
  - (max bloated density)/(max standard density) ≤ 1 + ε (say, ε = 1%)

### **Multilevel Algorithm**

- **Tiles** = list of all windows (r = 1)
- **Accuracy** = $\infty$
- While **Accuracy** > 1 + $\epsilon$
  - find area in each bloated and standard window
  - MAX = max area of standard window
  - BMAX = max area of bloated window
  - refine **Tiles** = list of tiles from bloated windows of area $\geq$ MAX
  - subdivide each tile in **Tiles** into 4 subtiles
  - **Accuracy** = BMAX / MAX
- Output max standard window density = MAX / w<sup>2</sup>
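
The refinement loop can be illustrated with a simplified sketch. It assumes axis-aligned, non-overlapping rectangles `(x0, y0, x1, y1)` inside an n × n region with n a multiple of w, and it deliberately omits the tile-pruning step: every window is rescanned at each level, so it shows the accuracy test rather than the speedup. All names, and the `max_r` cap, are hypothetical.

```python
def overlap_area(rect, x0, y0, x1, y1):
    """Area of rect = (rx0, ry0, rx1, ry1) inside the box [x0, x1] x [y0, y1]."""
    dx = min(rect[2], x1) - max(rect[0], x0)
    dy = min(rect[3], y1) - max(rect[1], y0)
    return dx * dy if dx > 0 and dy > 0 else 0.0


def max_window_areas(rects, n, w, r):
    """Return (MAX, BMAX) for the fixed r-dissection of an n x n region.

    MAX  = largest feature area in a standard w x w window (r x r tiles).
    BMAX = largest feature area in a bloated window (one extra ring of tiles).
    """
    t = w / r                               # tile side length
    steps = int(round((n - w) / t)) + 1     # window positions per axis
    best = best_bloated = 0.0
    for i in range(steps):
        for j in range(steps):
            x, y = i * t, j * t
            best = max(best, sum(
                overlap_area(q, x, y, x + w, y + w) for q in rects))
            best_bloated = max(best_bloated, sum(
                overlap_area(q, x - t, y - t, x + w + t, y + w + t)
                for q in rects))
    return best, best_bloated


def multilevel_max_density(rects, n, w, eps=0.01, max_r=64):
    """Refine r = 1, 2, 4, ... until BMAX / MAX <= 1 + eps.

    Returns (lower bound, upper bound, r) on the max floating-window density.
    """
    r = 1
    while True:
        MAX, BMAX = max_window_areas(rects, n, w, r)
        if BMAX <= (1 + eps) * MAX or r >= max_r:
            return MAX / w ** 2, BMAX / w ** 2, r
        r *= 2


# One 2 x 2 feature straddling the corner shared by four r = 1 windows.
lo, hi, r = multilevel_max_density([(3, 3, 5, 5)], n=8, w=4)
print(lo, hi, r)
```

In this example the r = 1 dissection sees only a quarter of the feature in any window, while the bloated-window bound signals that refinement is needed; one subdivision closes the gap.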



- Given design rule-correct layout of k disjoint rectilinear features in n×n region
- Find design rule-correct filled layout
  - no fill geometry is added within distance B of any layout feature
  - no fill is added into any window that has density  $\geq U$
  - minimum window density in the filled layout is maximized (or has density ≥ lower bound L)

# Filling Problem in Fixed-Dissection Regime

Given

- fixed *r*-dissection of layout
- feature area[T] in each tile T
- slack[T] = area available for filling in T
- maximum window density U

Find total fill area p[T] to add in each tile T such that every *w* × *w* window W has density ≤ U and min<sub>W</sub> ∑<sub>T ∈ W</sub> (area[T] + p[T]) is maximized
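
One way to read this problem statement is as a linear program over the per-tile fill amounts. The formulation below is a sketch under that reading: the auxiliary variable $M$ (the minimum window area being maximized) is introduced here for illustration, and the bound $p[T] \le \text{slack}[T]$ follows from the definition of slack.

$$
\begin{aligned}
\text{maximize}\quad & M \\
\text{subject to}\quad & 0 \le p[T] \le \text{slack}[T] && \text{for every tile } T \\
& \sum_{T \in W} \left(\text{area}[T] + p[T]\right) \le U \cdot w^2 && \text{for every window } W \\
& M \le \sum_{T \in W} \left(\text{area}[T] + p[T]\right) && \text{for every window } W
\end{aligned}
$$

The density bounds are written in terms of area because a window of side $w$ with density $U$ contains feature area $U \cdot w^2$.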
















# Subwavelength Optical Lithography — Technology Limits

- Implications of Moore's Law for feature sizes
- Steppers not available; WYSIWYG (layout = mask = wafer) fails after 0.35 μm generation
- Optical lithography
  - circuit patterns optically projected onto wafer
  - feature size limited by diffraction effects
  - Rayleigh limits
    - resolution R proportional to  $\lambda$  / NA
    - depth of focus DOF proportional to  $\lambda$  / NA$^2$
- Available knobs
  - amplitude (aperture): OPC
  - phase: PSM
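
The Rayleigh limits above are usually written R = k1·λ/NA and DOF = k2·λ/NA². The sketch below evaluates them for a 248 nm source; the process factors k1 = k2 = 0.5 and the NA values are assumptions chosen for illustration, not figures from the slides.

```python
def rayleigh_resolution(wavelength_nm, na, k1=0.5):
    """Minimum resolvable feature size, R = k1 * lambda / NA (in nm)."""
    return k1 * wavelength_nm / na


def depth_of_focus(wavelength_nm, na, k2=0.5):
    """Depth of focus, DOF = k2 * lambda / NA^2 (in nm)."""
    return k2 * wavelength_nm / na ** 2


for na in (0.5, 0.6, 0.7):
    print(na, round(rayleigh_resolution(248, na), 1),
          round(depth_of_focus(248, na), 1))
```

Raising NA improves resolution linearly but shrinks depth of focus quadratically, which is why aperture alone cannot close the subwavelength gap.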

# Next-Generation Lithography and the Subwavelength Gap

- EUV
- X-rays
- E-beams
- All at least 10 years away; require significant R&D, major infrastructure changes
- > 30 years of infrastructure and experience supporting optical lithography









### **OPC** Issues

- WYSIWYG broken  $\rightarrow$  (mask) verification bottleneck
- Pass functional intent down to OPC insertion
  - make corrections that win \$\$\$, reduce performance variation
  - OPC insertion is for predictable circuit performance, function
- Pass limits of manufacturing up to layout
  - don't make corrections that can't be manufactured or verified
  - Mask Error Enhancement Factor, etc.
- Layout needs models of OPC insertion process
  - geometry effects on cost of required OPC to yield function
  - costs of breaking hierarchy (beyond known verification, characterization costs)







# Phase Shifting Masks

- no phase shifting: poor contrast due to diffraction
- phase shifting by 180°: reverse electric field on mask, destructive interference yields zero-intensity on wafer (high contrast)
- Background
  - invented in 1982 by Levenson at IBM
  - interest in early 1990s, but near wavelength  $\rightarrow$  no pressing need
- Many forms of phase-shifting proposed
- Key issues: manufacturability, design tools
- Today: subwavelength gap forces PSM into every process (example: Motorola 90nm gates using 248nm stepper, announced in early 1999)











# Phase Conflict and the Conflict Graph






- Self-consistent phase assignment is not possible if there is an odd cycle in the conflict graph
- Phase-assignable = bipartite = no odd cycles
  - this is a global issue!
  - features on one side of chip can affect features on the other side
- Breaking odd cycles: must change the layout!
  - change feature dimensions, and/or change spacings
  - degrees of freedom include layer reassignment for interconnects
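
The bipartite test above can be sketched as a breadth-first 2-coloring of the conflict graph. This is a minimal illustration, assuming features are numbered 0..n-1 and conflicts are given as index pairs; the function name and the 0/180 encoding are choices made here.

```python
from collections import deque


def assign_phases(num_features, conflict_edges):
    """Return a list of phases (0 or 180), or None if an odd cycle exists."""
    adj = [[] for _ in range(num_features)]
    for u, v in conflict_edges:
        adj[u].append(v)
        adj[v].append(u)

    phase = [None] * num_features
    for start in range(num_features):
        if phase[start] is not None:
            continue
        phase[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if phase[v] is None:
                    phase[v] = 180 - phase[u]   # opposite phase
                    queue.append(v)
                elif phase[v] == phase[u]:
                    return None                 # odd cycle: not assignable
    return phase


print(assign_phases(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # even cycle
print(assign_phases(3, [(0, 1), (1, 2), (2, 0)]))          # odd cycle
```

A single odd cycle anywhere in a connected component makes the whole component unassignable, which is the sense in which the problem is global.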

### **Conflict Graph**

- Dark Field: build graph over feature regions
  - edge between two features whose separation is < B
- Bright Field: build graph over shifter regions
  - two edge types
  - <u>adjacency edge</u> between overlapping phase regions : endpoints must have <u>same</u> phase
    - essentially, these regions must be "merged" into single phase shifter
    - DRC-like (gap, notch type) local rules must likely be applied to such "merging"
  - <u>conflict edge</u> between shifters on opposite side of critical feature: endpoints must have <u>opposite</u> phase
  - Step 3: simple reduction to previous (dark-field) T-join solution: each dotted edge becomes a 2-chain (introduce one extra vertex)
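
The 2-chain reduction can be sketched as follows, assuming the dotted edges are the same-phase (adjacency) edges: a path of length two through a fresh vertex forces its endpoints to the same color, so the result is a graph with only opposite-phase edges. Names are hypothetical, and a compact 2-coloring check is included only to keep the example self-contained.

```python
def to_conflict_only(num_shifters, conflict_edges, adjacency_edges):
    """Replace each same-phase edge (u, v) by a 2-chain u - x - v."""
    edges = list(conflict_edges)
    n = num_shifters
    for u, v in adjacency_edges:
        edges.append((u, n))    # n is the extra vertex for this edge
        edges.append((n, v))
        n += 1
    return n, edges


def is_bipartite(n, edges):
    """Depth-first 2-coloring; True iff the graph has no odd cycle."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]
                    stack.append(v)
                elif color[v] == color[u]:
                    return False
    return True


# Consistent: 0 != 1, 1 == 2, 0 != 2.
ok = is_bipartite(*to_conflict_only(3, [(0, 1), (0, 2)], [(1, 2)]))
# Inconsistent: 0 != 1, but 1 == 2 and 0 == 2.
broken = is_bipartite(*to_conflict_only(3, [(0, 1)], [(1, 2), (0, 2)]))
print(ok, broken)
```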





























- How to delete **minimum-cost** set of edges from conflict graph G to eliminate odd cycles?
- Construct geometric dual graph D=dual(G)
- Find odd-degree vertices T in D
- Solve the T-join problem in D:
  - find min-weight edge set J in D such that
    - all T-vertices have odd degree in J
    - all other vertices have even degree in J
- Solution J corresponds to desired min-cost edge set in conflict graph G
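
The T-join definition can be made concrete with a brute-force search on a tiny graph. This is only an illustration of the definition, assuming non-negative weights; it enumerates all edge subsets and is exponential, so it is not the method used for real layouts. The function name and the example graph are hypothetical.

```python
from itertools import combinations


def min_t_join_bruteforce(num_vertices, weighted_edges, T):
    """Return (cost, edge indices) of a min-weight T-join, or None.

    weighted_edges is a list of (u, v, weight). A T-join is an edge set J
    in which exactly the vertices of T have odd degree.
    """
    T = set(T)
    best = None
    for size in range(len(weighted_edges) + 1):
        for subset in combinations(range(len(weighted_edges)), size):
            degree = [0] * num_vertices
            for idx in subset:
                u, v, _ = weighted_edges[idx]
                degree[u] += 1
                degree[v] += 1
            odd = {v for v in range(num_vertices) if degree[v] % 2 == 1}
            if odd == T:
                cost = sum(weighted_edges[idx][2] for idx in subset)
                if best is None or cost < best[0]:
                    best = (cost, subset)
    return best


# 4-cycle with one expensive edge; T = {0, 3}.
square = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 5)]
print(min_t_join_bruteforce(4, square, {0, 3}))
```

Here the cheapest T-join is the three-edge path 0-1-2-3 (cost 3) rather than the direct edge (cost 5).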













|           | Layout1  |         | Layout2  |         | Layout3  |         |
|-----------|----------|---------|----------|---------|----------|---------|
| Testcase  | polygons | edges   | polygons | edges   | polygons | edges   |
|           | 3769     | 12442   | 9775     | 26520   | 18249    | 51402   |
| Algorithm | edges    | runtime | edges    | runtime | edges    | runtime |
| Greedy    | 2650     | 0.56    | 2722     | 3.66    | 6180     | 5.38    |
| GW        | 1612     | 3.33    | 1488     | 5.77    | 3280     | 14.47   |
| Exact     | 1468     | 19.88   | 1346     | 16.67   | 2958     | 74.33   |

- Runtimes in CPU seconds on Sun Ultra-10
- Greedy = breadth-first-search bicoloring (similar to Ooi et al.)
- GW = Goemans-Williamson95 heuristic
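
A baseline in the spirit of the breadth-first-search bicoloring can be sketched as follows: color the graph greedily by BFS, then report every edge whose endpoints received the same color as an edge to delete. This is an assumption about how such a heuristic might look, not the code behind the table; names are hypothetical.

```python
from collections import deque


def greedy_bicolor(num_vertices, edges):
    """BFS 2-coloring; returns (colors, edges with same-colored endpoints)."""
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    color = [None] * num_vertices
    for start in range(num_vertices):
        if color[start] is not None:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:        # first visit fixes the color
                    color[v] = 1 - color[u]
                    queue.append(v)

    deleted = [(u, v) for u, v in edges if color[u] == color[v]]
    return color, deleted


# Triangle 0-1-2 with a pendant vertex 3.
colors, deleted = greedy_bicolor(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
print(colors, deleted)
```

Removing the reported edges leaves a graph that the computed coloring satisfies, but nothing guarantees the deleted set is minimum, which is consistent with the gap between the Greedy and Exact rows.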







- PSM must be "transparent" to ASIC auto-P&R
  - "free composability" is the cornerstone of the cell-based methodology!
- focus on poly layer → we are concerned with placer, not router
- Competitive context for placer
  - extremely competitive runtime regimes (e.g., 10<sup>6</sup> cells detail-placed in 20 min); faster runtimes needed in RTL-planning methodologies (Nano/PKS, Tera)
  - any nontrivial cost of checking placement phase-assignability is unacceptable
- Iteration between placer and a separate tool is unacceptable
  - interface to auto-P&R tools is bulky (e.g., 100s of MB for DEF), slow
  - no known convergent method for post-P&R phase-assignability checks to drive P&R to guaranteed correct solution (very difficult!)
- <u>P&R tool MUST deliver guaranteed phase-assignable poly layer</u>

### Guidelines

- Placer
  - no re-entry into placer from an external tool
    - any needed extra functionality must be built directly into QP
    - placer must guarantee a phase-assignable poly when finished
  - polygon layout information currently not in placer vocabulary
    - available relevant abstractions: pin EEQs/LEQs, overlap layer geometries
    - side files or LEF extensions needed for, e.g., capturing versioning or phase shifters near left/right cell boundaries
- Cell layout
  - cell layouts and phase shifters are assumed fixed during library creation
    - on-the-fly cell layout synthesis or layout perturbations generally not allowed
  - 2<sup>k</sup> possible versions (i.e., distinct phase bindings) are available for a given master cell with k connected components in its phase conflict graph, k' < k of which contain critical poly at cell boundary
  - impractical to use EEQs to capture versioning within iterative improvement

ICCAD Tutorial: November 11, 1999. © Andrew B. Kahng, Majid Sarrafzadeh



- Same-row composability
  - any cell can be placed immediately adjacent (in the same row) to any other cell
- Adj-row composability
  - any cell can be placed in an adjacent cell row to any other cell, with the two cells having intersecting x-spans
- Four cases of <u>cell libraries</u> (G = guaranteed; NG = not guaranteed)
  - Case 1: adj-G, same-G
    - most-constrained cell layout; most transparent to placer
  - Case 2: adj-G, same-NG
  - Case 3: adj-NG, same-G
  - Case 4: adj-NG, same-NG
    - least-constrained cell layout; least transparent to placer









### Case 1: Adj-G, Same-G

- <u>Solution 1</u>: "no restrictions on the cell layout"
  - create cell abstractions such that placer runs in "normal" mode
  - e.g., pre-bloat (by 1 site) cells that have critical poly near left/right boundary
  - e.g., create overlap layer obstacles corresponding to critical poly near top/bottom boundary
- <u>Solution 2</u>: smart rules to restrict cell layout
  - e.g., every pair of boundary-CP features from the same cell must be non-interfering
    - definition: two features are non-interfering if they are in different connected components of the cell's phase conflict graph
  - no boundary-CP feature is "near" two different sides of its cell
  - these two restrictions → composability guaranteed (no odd cycles possible)
- <u>Solution 3</u>: dumb rules to restrict cell layout
  - all cells have 250nm-wide 0-phase boundary (IBM style)



- M = number of master cells in library
- C<sub>i</sub> = i<sup>th</sup> master cell, i = 1, ..., M
- w<sub>i</sub> = width of i<sup>th</sup> master cell, i = 1, ..., M
- V<sub>i</sub> = number of versions of the i<sup>th</sup> master cell, i = 1, ..., M
- C<sub>ik</sub> = k<sup>th</sup> version of i<sup>th</sup> master cell, i = 1, ..., M and k = 1, ..., V<sub>i</sub>
- N = number of movable cells in the row of interest
- R<sub>h</sub> = h<sup>th</sup> cell in the row of interest
- $S_h =$  master cell corresponding to the h<sup>th</sup> cell in the row of interest
- boundary-CP = critical poly feature "near" the cell boundary















# **Implications of Technology**

### Hard IP reuse is difficult

- divergent foundry processes
- design- and context-specific variants of cells, macros, cores
  - filling densities
  - thermal, noise sensitivity contexts
  - layer usage and local region porosity constraints, physical access
  - incompatibility of separate phase solutions, or phase solutions + local routing
  - tool-specific variants (e.g., for different auto-routers)
  - diffusion sharing, continuous device sizing, tuning (dual Vt, multiple supply voltages (thermal, IR drop contexts), different input arrival times/slews,...)
- Hard-reuse: An ideal that must be tempered (abandoned?)
- Custom-on-the-fly is natural consequence of tuning, perf opts, migration, soft- and firm-IP reuse
