Logical Effort


Introduction to CMOS VLSI Design
Lecture 7B: Logical Effort

Lecture by Jay Brockman, University of Notre Dame, Fall 2008
Modified by Peter Kogge, Fall 2009
Based on lecture slides by David Harris, Harvey Mudd College
http://www.cmosvlsi.com/coursematerials.html

Outline
❑ Introduction
❑ Delay in a Logic Gate
❑ Multistage Logic Networks
❑ Choosing the Best Number of Stages
❑ Example
❑ Summary

Review
❑ Assume gate G1 is driving some number of other gates G2
   – Fanout = number of such gates being driven
❑ Delay of gate G1 = parasitic delay + effort delay
❑ Parasitic delay = delay if gate G1 is driving zero load
   – Function of the diffusion capacitance in the gate
   – Delay seen when G1 drives no other circuits
❑ Effort delay: delay due to the capacitance of the circuits driven by G1; a function of
   – the number of gates of type G2 being driven
   – the input capacitance presented by a G2 gate

CMOS VLSI Design

5: Logical Effort


Normalized Delay
❑ Normalized delay: gate delay relative to the delay of
   – a “fanout of 1” inverter (one that drives one inverter)
   – with no parasitic capacitance
   – of value 3RC
      • R = equivalent resistance of a unit nmos transistor in saturation
      • C = equivalent capacitance of the gate of a unit nmos transistor


Normalized Parasitic Delay
❑ Inverter has 3 units of diffusion capacitance
   – 2 from pmos
   – 1 from nmos
❑ Parasitic delay of inverter is τ = 3RC
❑ Normalized parasitic delay of a gate is p/3RC
   – p is the parasitic delay from the Elmore model
❑ To convert to psec, multiply by the parasitic delay of an inverter in the chosen technology


Normalized Effort Delay
❑ h = fanout or electrical effort = property of circuit
   – = number of equivalent G1 gates being driven by G1
   – = Cout/Cin where
      • Cout = total load capacitance presented by the G2 inputs
      • Cin = input capacitance G1 presents to its sources
   – When the gates G2 (those being driven) are of type G1, then
      • h = number of copies
❑ g = logical effort to drive a gate of type G2
   – effort required to drive one gate vs. a perfect inverter
   – how many equivalent inverters one G2 gate looks like
   – = (input cap of G2 gate)/(input cap of inverter)
   – = (input cap of G2 gate)/3
❑ Inverter has logical effort = 1

Summary
[Figure: gate G1 modeled as a resistance R driving its own diffusion capacitance Cdiff plus a load hCin, where h = # of loads (# of G2s) and Cin = input cap per load. Rise time is shown; fall time is similar.]
delay = RCdiff + hRCin
relative-delay = (RCdiff + hRCin)/3RC = pG1 + h*gG2
   – h = # of G2s
   – gG2 = logical effort of type G2

Common Gates

g = Logical Effort

Gate type         Number of inputs
                  1     2      3          4              n
Inverter          1
NAND                    4/3    5/3        6/3            (n+2)/3
NOR                     5/3    7/3        9/3            (2n+1)/3
Tristate / mux    2     2      2          2              2
XOR, XNOR               4, 4   6, 12, 6   8, 16, 16, 8

p = Normalized Parasitic

Gate type         Number of inputs
                  1     2      3          4              n
Inverter          1
NAND                    2      3          4              n
NOR                     2      3          4              n
Tristate / mux    2     4      6          8              2n
XOR, XNOR               4      6          8
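As a minimal sketch of how the Common Gates entries feed the single-gate delay d = p + g*h, the Python below encodes the inverter, NAND, NOR and tristate rows. The function names `logical_effort`, `parasitic_delay` and `gate_delay` are hypothetical helpers introduced here for illustration, not part of the lecture.

```python
from fractions import Fraction

def logical_effort(gate: str, n: int = 1) -> Fraction:
    """Logical effort g for an n-input gate, using the table entries."""
    if gate == "inv":
        return Fraction(1)
    if gate == "nand":
        return Fraction(n + 2, 3)
    if gate == "nor":
        return Fraction(2 * n + 1, 3)
    if gate == "tristate":
        return Fraction(2)
    raise ValueError(f"unsupported gate type: {gate}")

def parasitic_delay(gate: str, n: int = 1) -> Fraction:
    """Normalized parasitic delay p for an n-input gate, using the table entries."""
    if gate == "inv":
        return Fraction(1)
    if gate in ("nand", "nor"):
        return Fraction(n)
    if gate == "tristate":
        return Fraction(2 * n)
    raise ValueError(f"unsupported gate type: {gate}")

def gate_delay(gate: str, n: int, h) -> Fraction:
    """Normalized delay d = p + g*h, where h is the electrical effort (fanout)."""
    return parasitic_delay(gate, n) + logical_effort(gate, n) * Fraction(h)

fo4 = gate_delay("inv", 1, 4)      # fanout-of-4 inverter: 1 + 1*4
nand2 = gate_delay("nand", 2, 4)   # NAND2 at the same fanout: 2 + (4/3)*4
print(fo4, nand2)
```

Exact fractions are used so the results can be compared directly with table entries such as 4/3 and 7/3.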

n-input NAND Gates

parasitic delay = $(N^2/2 + 5N/2)\,RC$, which grows with the square of the number of inputs N. SLOW!!!!!!
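To see how quickly this grows, the sketch below normalizes the NAND expression by 3RC (the inverter parasitic delay used throughout these slides) and prints it next to the simple table value p = n. The helper name `nand_elmore_parasitic` is an assumption made for this illustration.

```python
def nand_elmore_parasitic(n: int) -> float:
    """(n^2/2 + 5n/2) RC divided by 3RC, i.e. in units of the inverter parasitic delay."""
    return (n * n / 2 + 5 * n / 2) / 3

# Compare with the table estimate p = n for a few gate widths.
for n in (1, 2, 4, 8):
    print(n, n, round(nand_elmore_parasitic(n), 2))
```

For n = 1 the expression reduces to 1, the inverter value, which is a quick consistency check on the formula.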

MultiStage Logic Networks
[Figure: chain of gates annotated with relative input capacitance (based on gate design & transistor size)]
❑ gi = logical effort to drive a gate of type i = input cap/cap of inverter
❑ hi = fanout of gates of type i = load cap/input cap
❑ Question: if delay thru one gate is p + hg, can we write delay thru a multistage path as some P + HG?

Scaling Transistors
❑ What if all transistors in gate G got wider by a factor of k?
   – Denote as gate “G(k)”
❑ Parasitic delay of G(k): delay of unloaded gate
   – Diffusion capacitance increases by k
   – Resistance decreases by k
   – Result: no change
❑ Effort delay: ratio of load cap to input cap
   – If G(k) drives the same # of G(k) as before, no change
   – If G(k) drives the same # of G(1) as before, effort delay decreases by a factor of k
❑ Result: the fanout to type G(1) gates that G(k) can support increases by k


Overall Delay
❑ delay = ∑delay(i)
   – where delay(i) = delay thru the i’th “stage” of logic
❑ delay(i) = pi + hi * gi
   – pi is a function only of the gate type at stage i
   – gi is a function only of the gate type at stage i
      • input cap/cap of inverter
   – hi depends on the gates at stage i+1
      • total load on gate i/input cap of gate i
❑ delay = ∑(pi + hi * gi) = ∑(pi) + ∑(hi * gi)
❑ Can we write ∑(hi * gi) as some H*G?


Definitions
❑ Path Logical Effort: $G = \prod g_i$
❑ Path Electrical Effort: $H = \frac{C_{out\text{-}path}}{C_{in\text{-}path}}$
❑ Path Effort: $F = \prod f_i = \prod g_i h_i$
[Figure: four-stage path with input capacitance 10 and load 20; the stage input capacitances are 10, x, y, z, with g1 = 1, h1 = x/10; g2 = 5/3, h2 = y/x; g3 = 4/3, h3 = z/y; g4 = 1, h4 = 20/z]
Question: Can we write F = GH?

Paths that Branch
❑ No! Consider paths that branch:
[Figure: an inverter with input capacitance 5 drives a branch point feeding two inverters of input capacitance 15; each of those drives a load of 90]
❑ Individual terms
   g1 = g2 = 1 (inverters)
   h1 = (15 + 15) / 5 = 6
   h2 = 90 / 15 = 6
❑ Path terms
   G = Πgi = 1x1 = 1
   H = Cout/Cin = 90 / 5 = 18
   GH = 18
   F = g1h1g2h2 = 36 = 2GH != GH

Branching Effort
❑ Introduce Branching Effort
   – Accounts for branching between stages in path
   $b = \frac{C_{on\,path} + C_{off\,path}}{C_{on\,path}}$
   $B = \prod b_i$
   Note: $\prod h_i = BH$
❑ Now we compute the Path Effort
   – F = GBH

Multistage Delays
❑ Path Effort Delay: $D_F = \sum f_i$
❑ Path Parasitic Delay: $P = \sum p_i$
❑ Path Delay: $D = \sum d_i = D_F + P$
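The branching example from the "Paths that Branch" slide (input capacitance 5, two branches of 15, loads of 90) can be rechecked numerically. The sketch below is only an illustration with variable names chosen here; it confirms that the product of stage efforts equals GBH rather than GH.

```python
from math import prod, isclose

g = [1, 1]                      # both stages are inverters
h = [(15 + 15) / 5, 90 / 15]    # stage electrical efforts: 6 and 6
b = [(15 + 15) / 15, 1]         # b = (Con-path + Coff-path) / Con-path per stage

G = prod(g)                     # path logical effort
H = 90 / 5                      # path electrical effort = Cout-path / Cin-path
B = prod(b)                     # path branching effort
F = prod(gi * hi for gi, hi in zip(g, h))   # path effort from the stage efforts

print(G, H, B, F, G * B * H)
```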

Designing Fast Circuits
   $D = \sum d_i = D_F + P$
❑ Delay is smallest when each stage bears same effort
   $\hat{f} = g_i h_i = F^{1/N}$
❑ Thus minimum delay of N stage path is
   $D = N F^{1/N} + P$
❑ This is a key result of logical effort
   – To find fastest possible delay
   – Doesn’t require calculating gate sizes

Gate Sizes
❑ How wide should the gates be for least delay?
   $\hat{f} = g h = g \frac{C_{out}}{C_{in}} \;\Rightarrow\; C_{in_i} = \frac{g_i C_{out_i}}{\hat{f}}$
❑ Working backward, apply capacitance transformation to find input capacitance of each gate given load it drives.
❑ Check work by verifying input cap spec is met.

Example: 3-stage path
❑ Select gate sizes x and y for least delay from A to B
[Figure: three-stage path from A to B. The first stage has input capacitance 8 and drives three gates of size x; the on-path gate of size x drives two gates of size y; each gate of size y drives a load of 45.]

Logical Effort       G = (4/3)*(5/3)*(5/3) = 100/27
Electrical Effort    H = 45/8
Branching Effort     B = 3*2 = 6
Path Effort          F = GBH = 125
Best Stage Effort    $\hat{f} = \sqrt[3]{F} = 5$
Parasitic Delay      P = 2 + 3 + 2 = 7
Delay                D = 3*5 + 7 = 22 = 4.4 FO4

❑ Work backward for sizes
   y = 45 * (5/3) / 5 = 15
   x = (15*2) * (5/3) / 5 = 10
[Figure: the same path annotated with transistor widths. First stage (input capacitance 8): P: 4, N: 4. Second stage (x = 10): P: 4, N: 6. Third stage (y = 15): P: 12, N: 3.]

Choosing Best # of Stages
❑ Goal: estimate delay & choose transistor sizes
❑ Many different topologies (combinations of gate types) that implement same function
❑ We know in general
   – NANDs better than NORs
   – Gates with fewer inputs better than more inputs
❑ Typical shortcut: estimate delay by # of stages
   – Assuming constant “gate delay”
   – and thus shorter paths are faster
❑ THIS IS NOT ALWAYS TRUE!
   – Adding inverters at end with increasing sizes can speed up circuit, esp. when high load
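The arithmetic of the 3-stage path example (input capacitance 8, loads of 45, branching of 3 and then 2) can be replayed in a few lines. This is a sketch under the values stated on the slides; the variable names are chosen here, and the last line checks that working backward reproduces the input capacitance of 8.

```python
from fractions import Fraction
from math import isclose

# Per-stage values read from the example (stage 1 to stage 3).
g = [Fraction(4, 3), Fraction(5, 3), Fraction(5, 3)]   # logical efforts
p = [2, 3, 2]                                           # parasitic delays
c_in, c_load = 8, 45

G = g[0] * g[1] * g[2]          # 100/27
H = Fraction(c_load, c_in)      # 45/8
B = 3 * 2                       # branching after stage 1 and after stage 2
F = G * B * H                   # path effort, exactly 125
f_hat = float(F) ** (1 / 3)     # best stage effort, about 5
D = 3 * f_hat + sum(p)          # path delay in units of tau, about 22

# Work backward from the load: C_in = g * C_out / f_hat.
y = float(g[2]) * c_load / f_hat        # about 15
x = float(g[1]) * (2 * y) / f_hat       # about 10
check = float(g[0]) * (3 * x) / f_hat   # should come back to the input spec of 8

print(F, round(f_hat, 2), round(D, 1), round(x, 1), round(y, 1), round(check, 1))
```

Dividing D by 5, the delay of a fanout-of-4 inverter (p + g*h = 1 + 4), gives the 4.4 FO4 figure quoted in the example.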

Best Number of Stages (p. 178)
❑ How many stages should a path use?
   – Minimizing number of stages is not always fastest
❑ Example: drive 64-bit datapath with unit inverter
   $D = N F^{1/N} + P = N (64)^{1/N} + N$
[Figure: an initial driver of size 1 feeding inverter chains of sizes (1), (1, 8), (1, 4, 16) and (1, 2.8, 8, 23), each driving a datapath load of 64]

N:   1     2     3     4
f:   64    8     4     2.8
D:   65    18    15    15.3

Fastest: N = 3 (D = 15)

General Derivation
❑ Consider adding inverters to end of n1 stage path
   – How many give least delay?
[Figure: logic block of n1 stages with path effort F, followed by N − n1 extra inverters]
   – N total stages with (N − n1) inverters
      • do not change logical effort
      • do add parasitic delay
   $D = N F^{1/N} + \sum_{i=1}^{n_1} p_i + (N - n_1)\, p_{inv}$
   $\frac{\partial D}{\partial N} = -F^{1/N} \ln F^{1/N} + F^{1/N} + p_{inv} = 0$
❑ Define best stage effort $\rho = F^{1/N}$
   $p_{inv} + \rho (1 - \ln \rho) = 0$

Best Stage Effort
❑ $p_{inv} + \rho (1 - \ln \rho) = 0$ has no closed-form solution
❑ Neglecting parasitics (pinv = 0), we find ρ = 2.718 (e)
❑ For pinv = 1, solve numerically for ρ = 3.59
❑ Again,
   – these ρ values are the best stage effort
   – when you have $\hat{N} = \log_\rho F$ stages
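Both results above are easy to reproduce. The sketch below tabulates D = N*64^(1/N) + N for the inverter-chain example and solves pinv + ρ(1 − ln ρ) = 0 by bisection. The function names and the bracketing interval [1, 20] are choices made for this illustration, not something the lecture prescribes.

```python
import math

def chain_delay(F: float, N: int, p_inv: float = 1.0) -> float:
    """Minimum delay of an N-stage inverter chain with path effort F."""
    return N * F ** (1.0 / N) + N * p_inv

def best_stage_effort(p_inv: float) -> float:
    """Solve p_inv + rho*(1 - ln(rho)) = 0 for rho > 1 by bisection."""
    def f(rho: float) -> float:
        return p_inv + rho * (1.0 - math.log(rho))
    lo, hi = 1.0, 20.0   # f(lo) > 0 and f(hi) < 0 for small non-negative p_inv
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

delays = {N: round(chain_delay(64, N), 1) for N in range(1, 5)}
print(delays)
print(round(best_stage_effort(0.0), 3), round(best_stage_effort(1.0), 2))
```

Bisection is used because the left-hand side decreases monotonically for ρ > 1, so a sign change brackets the single root.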

Sensitivity Analysis
❑ How sensitive is delay to using exactly the best number of stages?
[Figure: D(N)/D(N̂) plotted against N/N̂ (actual N vs optimal N) from 0.5 to 2.0; labeled curve values are 1.15, 1.26 and 1.51, with the points ρ = 2.4 and ρ = 6 marked]
❑ 2.4 < ρ < 6 gives delay within 15% of optimal
   – We can be sloppy!
   – I like ρ = 4

1st Example, Revisited
❑ Ben Bitdiddle is the memory designer for the Motoroil 68W86, an embedded automotive processor. Help Ben design the decoder for a register file.
[Figure: true and complementary address inputs A[3:0] feed a 4:16 decoder, which drives the 16 row lines of a register file of 16 words by 32 bits]
❑ Decoder specifications:
   – 16 word register file
   – Each word is 32 bits wide
   – Each bit presents load of 3 unit-sized transistors
   – True and complementary address inputs A[3:0]
   – Each input may drive 10 unit-sized transistors
❑ Ben needs to decide:
   – How many stages to use?
   – How large should each gate be?
   – How fast can decoder operate?

What Does This Mean?
❑ 16 word register file
   – There are 16 separate row lines
   – Branching factor of 16 at end
❑ Each word is 32 bits wide & each bit presents load of 3 unit-sized transistors
   – The load on each row line is 32*3
❑ True and complementary address inputs A[3:0]
   – Each address input (true or complement) is needed by only 8 of the 16 row lines
❑ Each input may drive 10 unit-sized transistors
   – Total input capacitance from 1st stage gates on inputs = 10

Number of Stages
❑ Decoder effort is mainly electrical and branching
   Electrical Effort:   H = (32*3) / 10 = 9.6
   Branching Effort:    B = 8
❑ If we neglect logical effort (assume G = 1)
   Path Effort:         F = GBH = 76.8
   Number of Stages:    N = log4F = 3.1
❑ Try a 3-stage design

3 Stage Gate Sizes & Delay
[Figure: 4:16 decoder built from three stages. Each true and complementary address input A[3:0] presents a capacitance of 10; the second-stage gates have size y and the third-stage gates have size z; each output word[0] ... drives 96 units of wordline capacitance.]
   Logical Effort:   G = 1 * 6/3 * 1 = 2
   Path Effort:      F = GBH = 154
   Stage Effort:     $\hat{f} = F^{1/3} = 5.36$
   Path Delay:       $D = 3\hat{f} + 1 + 4 + 1 = 22.1$
   Gate sizes:       z = 96*1/5.36 = 18
                     y = 18*2/5.36 = 6.7
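The decoder numbers can be recomputed directly from the specification (32 bits of 3 units each per row line, 10 units per address input, branching of 8). The sketch below assumes the three stages have logical efforts 1, 6/3 and 1 and parasitic delays 1, 4 and 1, as in the 3-stage calculation; the variable names are chosen here for illustration.

```python
import math

c_wordline = 32 * 3          # load on each row line
c_input = 10                 # capacitance allowed on each address input
H = c_wordline / c_input     # electrical effort = 9.6
B = 8                        # each address input is needed by 8 row lines

n_est = math.log(B * H, 4)   # stage estimate with G = 1, about 3.1

g = [1, 6 / 3, 1]            # assumed three-stage design with G = 2
p = [1, 4, 1]
F = math.prod(g) * B * H     # 153.6, rounded to 154 on the slide
f_hat = F ** (1 / 3)         # about 5.36
D = 3 * f_hat + sum(p)       # about 22.1

z = c_wordline * g[2] / f_hat      # last-stage input capacitance, about 18
y = z * g[1] / f_hat               # middle-stage input capacitance, about 6.7
first = (B * y) * g[0] / f_hat     # first-stage input capacitance, should be 10

print(round(n_est, 1), round(F, 1), round(f_hat, 2), round(D, 1),
      round(z, 1), round(y, 1), round(first, 1))
```

The final value is the check from the Gate Sizes slide: working backward should land on the input capacitance spec.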

Comparison
❑ Compare many alternatives with a spreadsheet

Design                         N    G       P    D
NAND4-INV                      2    2       5    29.8
NAND2-NOR2                     2    20/9    4    30.1
INV-NAND4-INV                  3    2       6    22.1
NAND4-INV-INV-INV              4    2       7    21.1
NAND2-NOR2-INV-INV             4    20/9    6    20.5
NAND2-INV-NAND2-INV            4    16/9    6    19.7
INV-NAND2-INV-NAND2-INV        5    16/9    7    20.4
NAND2-INV-NAND2-INV-INV-INV    6    16/9    8    21.6

Review of Definitions

Term                Stage                                  Path
number of stages    1                                      N
logical effort      g                                      G = ∏ gi
electrical effort   h = Cout/Cin                           H = Cout-path/Cin-path
branching effort    b = (Con-path + Coff-path)/Con-path    B = ∏ bi
effort              f = gh                                 F = GBH
effort delay        f                                      DF = ∑ fi
parasitic delay     p                                      P = ∑ pi
delay               d = f + p                              D = ∑ di = DF + P
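The spreadsheet comparison of decoder designs can be mimicked with a short script. The sketch below takes N, G and P for each design from the Comparison table and evaluates D = N*F^(1/N) + P with F = G*B*H, using B = 8 and H = 9.6 from the decoder example; the data layout and the name `path_delay` are assumptions for illustration.

```python
from fractions import Fraction

BH = 8 * (32 * 3) / 10   # branching effort times electrical effort = 76.8

# (design, number of stages N, path logical effort G, path parasitic delay P)
designs = [
    ("NAND4-INV", 2, Fraction(2), 5),
    ("NAND2-NOR2", 2, Fraction(20, 9), 4),
    ("INV-NAND4-INV", 3, Fraction(2), 6),
    ("NAND4-INV-INV-INV", 4, Fraction(2), 7),
    ("NAND2-NOR2-INV-INV", 4, Fraction(20, 9), 6),
    ("NAND2-INV-NAND2-INV", 4, Fraction(16, 9), 6),
    ("INV-NAND2-INV-NAND2-INV", 5, Fraction(16, 9), 7),
    ("NAND2-INV-NAND2-INV-INV-INV", 6, Fraction(16, 9), 8),
]

def path_delay(N: int, G: Fraction, P: int, bh: float = BH) -> float:
    """Least delay of an N-stage path: D = N * F**(1/N) + P with F = G*B*H."""
    return N * (float(G) * bh) ** (1.0 / N) + P

results = {name: round(path_delay(N, G, P), 1) for name, N, G, P in designs}
for name, d in results.items():
    print(f"{name:30s} {d}")
print("fastest:", min(results, key=results.get))
```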

Method of Logical Effort
1) Compute path effort: F = GBH
2) Estimate best number of stages: N = log4 F
3) Sketch path with N stages
4) Estimate least delay: $D = N F^{1/N} + P$
5) Determine best stage effort: $\hat{f} = F^{1/N}$
6) Find gate sizes: $C_{in_i} = \frac{g_i C_{out_i}}{\hat{f}}$
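The steps of the method can be strung together for a path whose gate types are already chosen (step 3 is the manual sketch). The function below is a hypothetical helper written for this illustration; it takes per-stage lists of logical effort, branching effort and parasitic delay, and it is exercised on the 3-inverter chain driving a load of 64 from the Best Number of Stages example.

```python
import math

def logical_effort_method(g, b, p, c_in, c_load):
    """Return (F, n_hat, f_hat, D, sizes) for a fixed N-stage path.

    g, b, p: per-stage logical effort, branching effort, parasitic delay.
    c_in:    input capacitance of the first stage.
    c_load:  capacitance driven by the last stage.
    """
    N = len(g)
    F = math.prod(g) * math.prod(b) * (c_load / c_in)   # 1) path effort F = GBH
    n_hat = math.log(F, 4)                              # 2) stage estimate log4(F)
    D = N * F ** (1.0 / N) + sum(p)                     # 4) least delay
    f_hat = F ** (1.0 / N)                              # 5) best stage effort
    sizes, load = [], c_load                            # 6) sizes, working backward
    for gi, bi in zip(reversed(g), reversed(b)):
        load = gi * (bi * load) / f_hat                 # C_in = g * C_out / f_hat
        sizes.append(load)
    sizes.reverse()
    return F, n_hat, f_hat, D, sizes

F, n_hat, f_hat, D, sizes = logical_effort_method(
    g=[1, 1, 1], b=[1, 1, 1], p=[1, 1, 1], c_in=1, c_load=64)
print(F, round(n_hat, 2), round(f_hat, 2), round(D, 1), [round(s, 1) for s in sizes])
```

The first entry of `sizes` should match the given `c_in`, which is the check recommended on the Gate Sizes slide.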

Summary
❑ Logical effort is useful for thinking of delay in circuits
   – Numeric logical effort characterizes gates
   – NANDs are faster than NORs in CMOS
   – Paths are fastest when effort delays are ~4
   – Path delay is weakly sensitive to stages, sizes
   – But using fewer stages doesn’t mean faster paths
   – Delay of path is about log4F FO4 inverter delays
   – Inverters and NAND2 best for driving large caps
❑ Provides language for discussing fast circuits
   – But requires practice to master


Limits of Logical Effort
❑ Chicken and egg problem
   – Need path to compute G
   – But don’t know number of stages without G
❑ Simplistic delay model
   – Neglects input rise time effects
❑ Interconnect
   – Iteration required in designs with wire
❑ Maximum speed only
   – Not minimum area/power for constrained delay
