[Slides] Digital ASIC Backend Process Tutorial for Frontend Engineers




Standard cell based IC design with hard macros such as analog PLL/ADC and memories (CPU is implemented in standard cells too)
GDS (Graphic Data System, in practice the GDSII stream format): the chip layout sent to the foundry. It is a binary file describing the geometry of transistors and interconnects/wires.
From foundry to customer – wafer, die, and chip
Zoom in: 3D drawing of transistor/base layer and metal layers

Overall ASIC design flow from frontend RTL design, verification to backend process (Synthesis, floorplan, PnR, timing)
Synthesis from RTL to Netlist
Synthesis Timing
Use wire-load model for timing calculation
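As a rough sketch of what a wire-load model does: before placement there is no wire geometry, so the wire length of a net is looked up from its fanout and scaled by per-unit resistance and capacitance. The table values and the names `WIRE_LOAD_TABLE`, `SLOPE`, `R_PER_UNIT`, and `C_PER_UNIT` below are made up for illustration and are not taken from any real library.

```python
# Hypothetical wire-load table: fanout -> estimated wire length (arbitrary units).
WIRE_LOAD_TABLE = {1: 10.0, 2: 22.0, 3: 35.0, 4: 50.0}
SLOPE = 16.0          # assumed extra length per fanout beyond the table
R_PER_UNIT = 0.002    # assumed resistance per unit length (kOhm)
C_PER_UNIT = 0.0015   # assumed capacitance per unit length (pF)


def estimate_wire_rc(fanout):
    """Return (resistance_kohm, capacitance_pf) estimated from fanout only."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    max_fanout = max(WIRE_LOAD_TABLE)
    if fanout in WIRE_LOAD_TABLE:
        length = WIRE_LOAD_TABLE[fanout]
    else:
        # Extrapolate linearly past the last table entry.
        length = WIRE_LOAD_TABLE[max_fanout] + SLOPE * (fanout - max_fanout)
    return length * R_PER_UNIT, length * C_PER_UNIT


r, c = estimate_wire_rc(6)
print(f"fanout 6: R = {r:.3f} kOhm, C = {c:.3f} pF")
```

Because the estimate depends only on fanout, two nets with the same fanout get the same RC regardless of where their cells end up, which is why timing must be re-checked after placement and routing.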
RTL vs Netlist Equivalence Check
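The idea behind an equivalence check can be illustrated with a toy example: a behavioral 2-to-1 mux and a gate-level version are compared over every input combination. This is only a sketch. Production equivalence checkers use formal methods rather than exhaustive simulation, and the function names here are hypothetical.

```python
from itertools import product


def rtl_mux(sel, a, b):
    """Behavioral (RTL-style) description: a 2-to-1 multiplexer."""
    return a if sel else b


def netlist_mux(sel, a, b):
    """Gate-level version built from NOT, AND and OR gates."""
    n_sel = 1 - sel   # inverter
    t0 = a & sel      # AND gate
    t1 = b & n_sel    # AND gate
    return t0 | t1    # OR gate


def equivalent(f, g, n_inputs):
    """Exhaustively compare two functions over all 0/1 input combinations."""
    return all(f(*bits) == g(*bits) for bits in product((0, 1), repeat=n_inputs))


print(equivalent(rtl_mux, netlist_mux, 3))
```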
Physical Process Flow
Backend process, place and route, from netlist to GDS
Synthesis: Synopsys DC/PC
PnR: Synopsys Astro, Cadence Encounter/icfb
DRC&LVS: Cadence Dracula, Mentor Calibre
STA: Synopsys PrimeTime
Physical Implementation
From netlist to GDS
Standard Cell based ASIC Design
Row-based Std Cell Layout
One chip may contain cells of different heights, but each height is placed in its own rows. Taller/larger std cells tend to be faster but consume more area and power.
Power and ground are connected through abutment in a row
Cells come from a sizeable library, also called the technology library
The technology library is normally developed and distributed by the foundry, but large companies often have a library group that develops their own customized library.
Standard Cell
Example to show a gate schematic and its std-cell implementation
Standard cell
In older technology, there are explicit spaces between std cell rows as routing channels
In modern technology, routing is done OTC (over the cell) with multiple metal layers
Routing over macros may be restricted. Sometimes designers need to pre-allocate routing channels from blocks to macros
Row-based Std Cell Layout
Power distribution – power ring/finger globally and power rail/abutment in std cell rows
Standard Cell
Std cell power and ground
CEL View vs. FRAM View
CEL view: all physical information about the cell. Used to output GDS and not used in PnR.
FRAM view: contains pin location, shape, and routing blockage info. Used in PnR.
Inverter example
CEL View vs. FRAM View
SRAM example
Technology File (tf)
Technology file is unique to each technology
Contains metal layer parameters
Number, name designations, physical and electrical characteristics, design rules, etc., for each layer/via
Floorplan – set target
Size of chip
Place main blocks, macros, pins, and IO ports
Normally hard macros such as memories are placed near the edges of the core or chip
Power planning to give enough power to gates
Type of clock distribution
A good floorplan minimizes the connections between logic groups/blocks

A typical floorplan flow
Aspect ratio (AR)
Ratio of chip width to length
Take routing resources into account: if there are more horizontal layers, the rectangle should be long (wider than tall), and vice versa. Normally metal1 is used by std cells. Typically odd-numbered metal layers are horizontal and even-numbered layers are vertical.
Utilization
The percentage of area used. At chip level it is (std-cell-area + macro-area + pad-area)/chip-area
Standard cell blockage (no std cell allowed), non buffer blockage (only buffer allowed), and blockage below power lines
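The utilization formula and the effect of aspect ratio on the floorplan shape can be sketched as follows. This assumes aspect ratio is defined as width divided by height (tools and texts differ on the convention), and the function names are hypothetical.

```python
from math import sqrt


def utilization(std_cell_area, macro_area, pad_area, chip_area):
    """Chip-level utilization: fraction of the chip area that is occupied."""
    return (std_cell_area + macro_area + pad_area) / chip_area


def floorplan_size(occupied_area, target_utilization, aspect_ratio):
    """Return (width, height) of a rectangle that holds occupied_area at the
    target utilization. aspect_ratio is assumed to mean width / height."""
    total_area = occupied_area / target_utilization
    width = sqrt(total_area * aspect_ratio)
    height = sqrt(total_area / aspect_ratio)
    return width, height


print(utilization(3.0, 1.0, 1.0, 10.0))
print(floorplan_size(50.0, 0.5, 4.0))
```

Lowering the target utilization grows the rectangle and leaves more room for routing and later buffer insertion, at the cost of die area.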

Placement: place the gates (std cells)
Coarse placement – approximate locations that may not be legal
Detail placement – legalize placement
Timing driven placement vs Congestion driven placement
TDP: optimizes critical path timing, shortens nets
CDP: tends to spread cells and thus lengthen nets
Iterative placement, set different opt effort based on congestion severity
Timing opt: add/delete buffers, resize gates, swap pins, move instances, etc.
Minimize area and power
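To make the coarse versus detail placement distinction concrete, here is a minimal greedy legalizer. It snaps each cell to the nearest row and site and pushes it right if the site is already taken. It is a sketch only: real legalizers also respect row width, blockages, and timing, and `ROW_HEIGHT` and `SITE_WIDTH` are assumed values.

```python
ROW_HEIGHT = 10   # assumed row height, arbitrary units
SITE_WIDTH = 2    # assumed placement site width, arbitrary units


def legalize(cells, num_rows):
    """cells: list of (name, x, y, width_in_sites) from coarse placement.
    Returns {name: (x, y)} with every cell on a row and site, no overlaps."""
    next_free = [0] * num_rows   # next free site index in each row
    placed = {}
    # Process cells left to right so earlier cells keep their position.
    for name, x, y, width in sorted(cells, key=lambda c: c[1]):
        row = min(max(round(y / ROW_HEIGHT), 0), num_rows - 1)
        site = max(round(x / SITE_WIDTH), next_free[row])
        placed[name] = (site * SITE_WIDTH, row * ROW_HEIGHT)
        next_free[row] = site + width
    return placed


coarse = [("u1", 3.1, 9.0, 2), ("u2", 4.2, 11.0, 3), ("u3", 20.5, 1.0, 1)]
print(legalize(coarse, num_rows=2))
```

In this example `u1` and `u2` land in the same row and would overlap after snapping, so `u2` is shifted to the first free site to the right of `u1`.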

Place_opt needs route timing to optimize placement. But routing has not started yet, so how does it get route timing?
ICC uses virtual routing (Manhattan geometry: horizontal and vertical only, no diagonal routing) to estimate routing length and shape, then estimates RC parameters based on the TLU+ model
The same way is used for clock tree synthesis (CTS)
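A simple way to picture a Manhattan-style estimate is the half-perimeter wire length (HPWL) of a net's pin bounding box, scaled by per-unit RC. This is a common textbook estimator, used here as a stand-in; it is not a description of ICC's actual algorithm, and the constant per-unit values replace the table-based TLU+ lookup purely for illustration.

```python
def manhattan(p, q):
    """Horizontal plus vertical distance between two points (no diagonals)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def hpwl(pins):
    """Half-perimeter of the bounding box enclosing all pins of a net."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def estimate_net_rc(pins, r_per_um, c_per_um):
    """Estimate wire R and C from HPWL with assumed per-micron constants."""
    length = hpwl(pins)
    return length * r_per_um, length * c_per_um


net = [(0, 0), (30, 40), (10, 5)]
print(hpwl(net))
print(estimate_net_rc(net, r_per_um=0.5, c_per_um=0.2))
```

For a two-pin net, HPWL equals the Manhattan distance between the pins; for nets with more pins it is a lower bound on the routed length.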
Clock Tree Synthesis (CTS)
During placement, clock is assumed to be ideal – clock tree insertion delay is 0 from clock port to FF clock pin.
CTS builds clock tree and routes clock nets from port to FF pins
Need to optimize timing after CTS (called postCTS Opt) including setup and hold
Clock Tree Synthesis (CTS)
Ideal clock assumed before CTS
Clock Tree Synthesis
Routed clocks after CTS.
Clock buffers added
Congestion may increase, other cells may be moved
Normally CTS needs multiple iterations
CTS needs frontend inputs
Clock Tree Synthesis (CTS)
Main parameters: skew, delay, transition time
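Skew and insertion delay can be computed directly from the clock arrival time at each flip-flop clock pin. The sketch below assumes arrival times measured from the clock port in nanoseconds; the pin names and values are hypothetical.

```python
def clock_tree_metrics(arrival_ns):
    """arrival_ns: {ff_clock_pin: arrival time from the clock port, in ns}.
    Insertion delay is the port-to-pin latency; skew is the spread
    between the latest and earliest arrival."""
    latest = max(arrival_ns.values())
    earliest = min(arrival_ns.values())
    return {
        "max_insertion_delay": latest,
        "min_insertion_delay": earliest,
        "skew": latest - earliest,
    }


arrivals = {"ff1/CK": 0.50, "ff2/CK": 0.75, "ff3/CK": 0.625}
print(clock_tree_metrics(arrivals))
```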
Clock Tree Synthesis (CTS)
Route clock nets before signal nets (done in routing stage). Green colored nets are clock nets.
Clock Tree Synthesis (CTS)
Example of clock tree in a single processor project
Routing
Use metal/vias to connect the pins of std cells and macros, and to finish up power and ground connections
Output is a routed design which is DRC and timing clean
Global routing
Plans the overall connections between blocks and determines the routing topology, such as the channels or routing regions each net goes through. Goals: maximize the number of nets routed, minimize routing area, minimize total wire length, minimize delay
Detailed/local routing
Makes the actual connections, creating the actual vias and metal wires. Goals: minimize area, wire length, and delay
Timing check after routing with accurate parasitic RC extraction of nets
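Once parasitic R and C values are extracted, a first-order wire delay can be estimated with the Elmore model. The sketch below is one simple option, not the method any particular signoff tool uses. It treats the net as a single RC ladder from driver to sink, with each segment's capacitance lumped at its far end.

```python
def elmore_delay(segments, load_cap):
    """segments: list of (R, C) pairs ordered from driver to sink.
    Delay = sum over nodes of (total resistance from the driver to that
    node) * (capacitance at that node). With kOhm and pF the result is ns."""
    delay = 0.0
    r_upstream = 0.0
    last = len(segments) - 1
    for i, (r, c) in enumerate(segments):
        r_upstream += r
        node_cap = c + (load_cap if i == last else 0.0)
        delay += r_upstream * node_cap
    return delay


# Two identical wire segments driving a sink pin capacitance of 1.0.
print(elmore_delay([(1.0, 2.0), (1.0, 2.0)], load_cap=1.0))
```

The model shows why long wires hurt: doubling the length doubles both R and C, so the wire's own delay grows roughly quadratically with length.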
Design Rule Check (DRC)
Design rule examples:
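Two of the most common rule types are minimum width and minimum spacing for shapes on the same metal layer. The sketch below checks both for axis-aligned rectangles. The rule values `MIN_WIDTH` and `MIN_SPACING` are assumed, not taken from any real technology, and real rule decks contain many more rule types.

```python
from math import hypot

MIN_WIDTH = 2.0     # assumed rule value, arbitrary units
MIN_SPACING = 2.0   # assumed rule value, arbitrary units


def width_ok(rect):
    """rect = (x1, y1, x2, y2); the narrower dimension must meet MIN_WIDTH."""
    x1, y1, x2, y2 = rect
    return min(x2 - x1, y2 - y1) >= MIN_WIDTH


def spacing(a, b):
    """Gap between two rectangles; 0.0 if they touch or overlap."""
    dx = max(b[0] - a[2], a[0] - b[2], 0.0)
    dy = max(b[1] - a[3], a[1] - b[3], 0.0)
    return hypot(dx, dy)


def spacing_ok(a, b):
    """Touching or overlapping shapes are treated as one merged shape."""
    gap = spacing(a, b)
    return gap == 0.0 or gap >= MIN_SPACING


wire_a = (0, 0, 10, 2)
wire_b = (0, 3, 10, 5)   # only 1 unit above wire_a
print(width_ok(wire_a), spacing_ok(wire_a, wire_b))
```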
Chip Finish
To improve yield, so called DFM (Design for Manufacturing)
Wire spreading
Redundant via insertion
Filler cell insertion
Metal fill insertion
Metal slotting
Wire Spreading / Widening
Redundant Via Insertion
Filler Cell Insertion
Fill up empty rows/sites that contain no std cells
Make Nwell/Pwell continuous
Can be used as decoupling cap cell
Metal Fill
Issues caused by low metal density in manufacturing
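Metal density is usually checked per window: the fraction of the window area covered by metal must stay above a minimum. A minimal sketch, assuming non-overlapping rectangles on one layer and a made-up `MIN_DENSITY` threshold:

```python
MIN_DENSITY = 0.30   # assumed minimum density, not from a real rule deck


def metal_density(rects, window):
    """Fraction of the window covered by metal.
    rects and window are (x1, y1, x2, y2); rects must not overlap each other."""
    wx1, wy1, wx2, wy2 = window
    covered = 0.0
    for x1, y1, x2, y2 in rects:
        # Clip each rectangle to the window before adding its area.
        w = min(x2, wx2) - max(x1, wx1)
        h = min(y2, wy2) - max(y1, wy1)
        if w > 0 and h > 0:
            covered += w * h
    return covered / ((wx2 - wx1) * (wy2 - wy1))


def needs_metal_fill(rects, window):
    """True if the window falls below the assumed minimum density."""
    return metal_density(rects, window) < MIN_DENSITY


shapes = [(0, 0, 10, 2), (5, 5, 20, 10)]
print(metal_density(shapes, (0, 0, 10, 10)))
```

Windows that fail the check are the ones where dummy metal fill would be inserted.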
Metal Slot
Power Signoff – IR Drop
IR Drop
Static IR drop
3% for VDD + VSS (Flip chip)
5% for VDD + VSS (wire bond)
Dynamic IR drop
Around 5x in signoff constraint
Scan mode IR drop
Peak power usually around clock edge
Analyzing IR drop during small timing window when FFs are switching at the same time
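The static budgets above can be checked with plain Ohm's law: the combined VDD plus VSS drop is the current times the total resistance of the supply and ground paths. The sketch below models the whole power grid as two lumped resistors, which is a strong simplification; the function and parameter names are hypothetical.

```python
# Static IR drop budgets from the list above, as a fraction of VDD.
STATIC_BUDGET = {"flip_chip": 0.03, "wire_bond": 0.05}


def static_ir_drop(vdd, current_a, r_vdd_ohm, r_vss_ohm, package):
    """Return (drop_v, limit_v, passes) for the combined VDD + VSS drop.
    The grid is reduced to one lumped resistance per rail (an assumption)."""
    drop = current_a * (r_vdd_ohm + r_vss_ohm)
    limit = STATIC_BUDGET[package] * vdd
    return drop, limit, drop <= limit


print(static_ir_drop(1.0, 1.0, 0.02, 0.02, "flip_chip"))
print(static_ir_drop(1.0, 1.0, 0.02, 0.02, "wire_bond"))
```

With the same current and grid resistance, the design in this example misses the 3% flip-chip budget but meets the 5% wire-bond budget.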
Power Signoff
Cadence EPS
Synopsys PrimeRail


We are a small team of ASIC and FPGA design engineers with combined >40 years of experience. We successfully led several chips through the whole design to TO process. Familiar with Xilinx FPGA, we designed complicated system of using multiple FPGAs to verify complicated ASIC and also for the final products. We are interested in working as independent contractor for your projects. Feel free to contact us.

©2021  ValPont.com