Author: Koren I.
Tags: mathematics algorithms computer science computer technology
ISBN: 1-56881-160-8
Year: 2002
Israel Koren
COMPUTER
ARITHMETIC
ALGORITHMS
Second Edition
Israel Koren
University of Massachusetts, Amherst
A K Peters
Natick, Massachusetts
This book is dedicated to my wife, Zahava,
my sons, Yuval and Yaron,
and to the memory of my parents,
Jacob and Dvora.

Editorial, Sales, and Customer Service Office
A K Peters, Ltd.
63 South Avenue
Natick, MA 01760
www.akpeters.com
Copyright © 2002 by A K Peters, Ltd.
All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.

Library of Congress Cataloging-in-Publication Data
Koren, Israel, 1945-
Computer arithmetic algorithms / Israel Koren.-2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN 1-56881-160-8
1. Computer arithmetic. 2. Computer algorithms. I. Title.
QA76.9.C62 K67 2001
005.1-dc21
2001045837

Figures 4.3, 4.4, 4.5 and 4.7 are reprinted from D.J. Kuck, The Structure of Computers and Computations, Vol. 1 (copyright © 1978 John Wiley) by permission of John Wiley & Sons, Inc. Table 2.2 and figures 5.32, 6.11, 6.13, 6.20, 6.23, 7.5, 7.7, 7.13, and 10.2 are reprinted by permission of the Institute of Electrical and Electronics Engineers (IEEE). Table 4.3 is reprinted by permission of the IBM Systems Journal.

The first edition of this book was published by Prentice-Hall, Inc.
Printed in Canada
06 05 04 03 02
10 9 8 7 6 5 4 3 2 1
CONTENTS

FOREWORD TO THE SECOND EDITION
PREFACE

1 CONVENTIONAL NUMBER SYSTEMS
1.1 The Binary Number System
1.2 Machine Representations of Numbers
1.3 Radix Conversions
1.4 Representations of Negative Numbers
1.5 Addition and Subtraction
1.6 Arithmetic Shift Operations
1.7 Exercises
1.8 References

2 UNCONVENTIONAL FIXED-RADIX NUMBER SYSTEMS
2.1 Negative-Radix Number Systems
2.2 A General Class of Fixed-Radix Number Systems
2.3 Signed-Digit Number Systems
2.4 Binary SD Numbers
2.5 Exercises
2.6 References

3 SEQUENTIAL ALGORITHMS FOR MULTIPLICATION AND DIVISION
3.1 Sequential Multiplication
3.2 Sequential Division
3.3 Nonrestoring Division
3.4 Square Root Extraction
3.5 Exercises
3.6 References

4 BINARY FLOATING-POINT NUMBERS
4.1 Preliminaries
4.2 Floating-Point Operations
4.3 Choice of Floating-Point Representation
4.4 The IEEE Floating-Point Standard
4.5 Round-off Schemes
4.6 Guard Digits
4.7 Floating-Point Adders
4.8 Exceptions
4.9 Round-off Errors and Their Accumulation
4.10 Exercises
4.11 References

5 FAST ADDITION
5.1 Ripple-Carry Adders
5.2 Carry-Look-Ahead Adders
5.3 Conditional Sum Adders
5.4 Optimality of Algorithms and Their Implementations
5.5 Carry-Look-Ahead Addition Revisited
5.6 Prefix Adders
5.7 Ling Adders
5.8 Carry-Select Adders
5.9 Carry-Skip Adders
5.10 Hybrid Adders
5.11 Carry-Save Adders
5.12 Pipelining of Arithmetic Operations
5.13 Exercises
5.14 References

6 HIGH-SPEED MULTIPLICATION
6.1 Reducing the Number of Partial Products
6.2 Implementing Large Multipliers Using Smaller Ones
6.3 Accumulating the Partial Products
6.4 Alternative Techniques for Partial Product Accumulation
6.5 Fused Multiply-Add Unit
6.6 Array Multipliers
6.7 Optimality of Multiplier Implementations
6.8 Exercises
6.9 References

7 FAST DIVISION
7.1 SRT Division
7.2 High-Radix Division
7.3 Speeding Up the Division Process
7.4 Array Dividers
7.5 Fast Square Root Extraction
7.6 Exercises
7.7 References

8 DIVISION THROUGH MULTIPLICATION
8.1 Division by Convergence
8.2 Division by Reciprocation
8.3 Exercises
8.4 References

9 EVALUATION OF ELEMENTARY FUNCTIONS
9.1 The Exponential Function
9.2 The Logarithm Function
9.3 The Trigonometric Functions
9.4 The Inverse Trigonometric Functions
9.5 The Hyperbolic Functions
9.6 Bounds on the Approximation Error
9.7 Speed-up Techniques
9.8 Other Techniques for Evaluating Elementary Functions
9.9 Exercises
9.10 References

10 LOGARITHMIC NUMBER SYSTEMS
10.1 Sign-Logarithm Number Systems
10.2 Arithmetic Operations
10.3 Comparison to Binary Floating-Point Numbers
10.4 Conversions to/from Conventional Representations
10.5 Exercises
10.6 References

11 THE RESIDUE NUMBER SYSTEM
11.1 Preliminaries
11.2 Arithmetic Operations
11.3 The Associated Mixed-Radix System
11.4 Conversion of Numbers from/to the Residue System
11.5 Selecting the Moduli
11.6 Error Detection and Correction
11.7 Exercises
11.8 References

INDEX
FOREWORD TO THE SECOND EDITION

This edition includes several new sections as well as many amendments and corrections made since the first edition in 1993. The new sections include floating-point adders, floating-point exceptions, general carry-look-ahead adders, prefix adders, Ling adders, and fused multiply-add units. New algorithms and implementations have been added to almost all chapters. My thanks to the students and readers whose discoveries of errors, and general comments, are reflected in this volume.

Since the first edition, a web page was created for Computer Arithmetic Algorithms, which contains updates on the book, solutions to selected problems, and relevant links. It can be found at http://www.ecs.umass.edu/ece/koren/arith.

Additionally, there is now an on-line, JavaScript-based simulator for many of the algorithms contained in this book. A useful tool for both students and practitioners in the field, the Java-based simulator can be found at http://www.ecs.umass.edu/ece/koren/arith/simulator.

I very much welcome further comments and suggestions; please e-mail them to me at koren@ecs.umass.edu.

Israel Koren
Amherst, Massachusetts
July 2001
PREFACE
The goal of this textbook is to explain the fundamental principles of algorithms available for performing arithmetic operations in digital computers. These include basic arithmetic operations like addition, subtraction, multiplication, and division in fixed-point and floating-point number systems, and more complex operations, such as square root extraction and evaluation of exponential, logarithmic, and trigonometric functions, and so on. Even the seemingly simple arithmetic operations turn out to be more complex than one expects when attempting to implement them. The descriptions found in many excellent books on computer architecture do not provide the level of detail required, and there is a need for a book that is completely devoted to digital computer arithmetic.

Designs that include arithmetic units have proliferated in recent years. The progress already made in integrated circuit technology, and further advances expected in the near future, have had a significant impact on the design of new arithmetic processors. The higher density of integrated circuits now enables the design and implementation of sophisticated arithmetic processors employing algorithms that were considered prohibitively complex in the past. Consequently, methods that were previously considered unconventional should be examined, since they may now be attractive alternatives. Furthermore, the ease of designing application-specific integrated circuits currently allows engineers to design arithmetic units tailored to their special needs, rather than having to use general-purpose circuits. These specialized arithmetic units can achieve a higher speed of operation for the particular application being considered.
This book describes the principles of computer arithmetic algorithms independently of any particular technology employed for their implementation. The existence of several implementation technologies and the rapid changes in these make a detailed description of any implementation almost immediately obsolete. The book includes numerical examples to illustrate the working of the algorithms presented and explains the concepts behind the algorithms without relying on gate diagrams. Such diagrams are usually straightforward, and any reader with a sufficiently good background in digital design, as is expected from the reader of this book, can draw his/her own. These diagrams are in many cases useless for the practitioner, since the technology that he/she plans to use imposes its constraints on the implementation, and, more importantly, they do not provide any additional insight.

All the algorithms in this book are described within the same framework, so that the similarities between different algorithms become evident, and, consequently, the basic principles behind these algorithms can be easily identified. This should help the reader to better understand the currently available algorithms, know how to select the most appropriate algorithm to match a given technology, and even be able to develop new algorithms if the need arises.

This book is intended to be used as a textbook in a senior-level or first-year graduate-level course in computer arithmetic and as a reference book for practicing engineers. The reader's expected background includes a basic knowledge of digital design and the principles of digital computer organization. The book includes 11 chapters; each chapter has a list of relevant references and a set of exercises. A separate solution manual is available from the publisher upon request.

This book does not include all the algorithms that have ever been suggested, and is meant only to serve as a solid introduction to this rich field, which is continuously evolving. Readers who are interested in further details on a particular topic should consult the list of references at the end of each chapter. Excellent sources for additional information are the proceedings of the biannual IEEE Symposium on Computer Arithmetic and periodicals such as the IEEE Journal on Solid-State Circuits or the IEEE Transactions on Computers. The latter had several special issues devoted to computer arithmetic. Other sources are the several books on computer arithmetic already available, which are listed at the end of Chapter 1.

This textbook evolved from lecture notes prepared for courses in computer arithmetic that I have taught at the University of California at Santa Barbara, the University of Southern California, the Technion in Israel, the University of California at Berkeley, and the University of Massachusetts at Amherst. The order of topics in the book follows their order in my lectures. However, several instructors who used a preliminary version of the book did not experience any difficulties when covering the chapters out of order.

Many people have contributed in different ways to this book. Prof. James Howard from the University of California at Santa Barbara was the first to suggest the writing of this book. Prof. William Kahan from the University of California at Berkeley, with whom I had long discussions in 1983, influenced my views on many topics. Moshe Gavrielov from LSI Logic read much of the book, and his many suggestions helped to improve the presentation. Prof. Mary Jane Irwin from the Pennsylvania State University and Prof. Behrooz Parhami from the University of California at Santa Barbara agreed to use a preliminary version of this book in their classes and provided many helpful comments and suggestions.

Several others reviewed parts of the book and gave me very important feedback. These include Prof. Dan Atkins from the University of Michigan at Ann Arbor, Prof. Milos Ercegovac from the University of California at Los Angeles, Prof. Earl Swartzlander, Tom Callaway, and Michael Schulte from the University of Texas at Austin, and Dr. George Taylor from Sun Microsystems.

The graduate students who took my course in the previously mentioned campuses made many contributions through their questions and suggestions. In particular, I wish to acknowledge the contributions made by Sachin Ghanekar, who prepared the solution manual, and by Ofra Zinaty, Moshe Gavrielov, and Susan Morin.

Last but not least, I wish to thank my wife, Zahava, and my sons, Yuval and Yaron, who provided moral support as well as editorial assistance.

Israel Koren
Amherst, Massachusetts
1 CONVENTIONAL NUMBER SYSTEMS
1.1 THE BINARY NUMBER SYSTEM
In conventional digital computers, integers are represented as binary numbers of fixed length. A binary number of length $n$ is an ordered sequence

$$(x_{n-1}, x_{n-2}, \ldots, x_1, x_0)$$

of binary digits where each digit $x_i$ (also known as a bit) can assume one of the values 0 or 1. The length $n$ of the sequence is of significance, since binary numbers in digital computers are stored in registers of a fixed length, $n$. The above sequence of $n$ digits (or $n$-tuple) represents the integer value

$$X = x_{n-1} 2^{n-1} + x_{n-2} 2^{n-2} + \cdots + x_1 2 + x_0 = \sum_{i=0}^{n-1} x_i 2^i. \tag{1.1}$$

Upper case letters are used in this book to represent numerical values or sequences of digits, while lower case letters, usually indexed, represent individual digits. The weight of the digit $x_i$ in (1.1) is the $i$th power of 2, which is called the radix of the number system. The interpretation rule in Equation (1.1) is similar to the rule used for the ordinary decimal numbers. There are, however, two differences between these interpretation rules. First, the radix 10 is used instead of 2 in Equation (1.1) and, consequently, the allowed digits in the decimal case are $x_i \in \{0, 1, 2, \ldots, 9\}$ instead of $x_i \in \{0, 1\}$. We call the decimal numbers radix-10 numbers and the binary numbers radix-2 numbers. We indicate the radix to be used when interpreting a given sequence of digits by writing it as a subscript. Thus, the sequence $(101)_{10}$ represents the decimal value 101, while the sequence $(101)_2$ represents the decimal value 5.
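As a small illustration of Equation (1.1), the following Python sketch evaluates a digit sequence in a given radix. The function name `digits_to_value` and the convention of listing the most significant digit first are assumptions made for this example; they are not notation from the text.

```python
def digits_to_value(digits, radix=2):
    """Evaluate (x_{n-1}, ..., x_1, x_0) in the given radix, as in Equation (1.1).

    `digits` lists the most significant digit first (an illustrative convention).
    """
    value = 0
    for x in digits:
        if not 0 <= x < radix:
            raise ValueError(f"digit {x} is not allowed in radix {radix}")
        # Horner's rule: moving one position to the left multiplies by the radix.
        value = value * radix + x
    return value


print(digits_to_value([1, 0, 1], radix=2))   # (101)_2  -> 5
print(digits_to_value([1, 0, 1], radix=10))  # (101)_10 -> 101
```

The same digit sequence yields different values under different radices, which is why the radix is written as a subscript.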
Since operands and results in an arithmetic unit are stored in registers of a fixed length, there is a finite number of distinct values that can be represented within an arithmetic unit. Let $X_{min}$ and $X_{max}$ denote the smallest and largest representable values, respectively. We say that $[X_{min}, X_{max}]$ is the range of the representable numbers. Any arithmetic operation that attempts to produce a result larger than $X_{max}$ (or smaller than $X_{min}$) will produce an incorrect result. In such cases the arithmetic unit should indicate that the generated result is in error. This is usually called an overflow indication.
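A minimal Python sketch of this behavior for unsigned integers held in an $n$-bit register follows. The function `add_unsigned` and its returned overflow flag are hypothetical names chosen for illustration; an actual arithmetic unit signals overflow through hardware-specific means.

```python
def add_unsigned(x, y, n=4):
    """Add two unsigned n-bit integers as a fixed-length register would.

    Returns (stored_result, overflow), where overflow is True when the true
    sum falls outside the range [X_min, X_max] = [0, 2**n - 1].
    """
    x_max = 2**n - 1
    if not (0 <= x <= x_max and 0 <= y <= x_max):
        raise ValueError("operands must fit in n bits")
    true_sum = x + y
    stored = true_sum % 2**n      # only the n least significant bits are kept
    overflow = true_sum > x_max   # a discarded carry-out signals overflow
    return stored, overflow


print(add_unsigned(5, 9))    # (14, False)
print(add_unsigned(11, 7))   # (2, True), since 18 mod 16 = 2
```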
Example 1.1
If the conventional binary number system is employed to represent unsigned integers using four binary digits (bits), then $X_{max} = (15)_{10}$ is represented by $(1111)_2$ and $X_{min} = (0)_{10}$ is represented by $(0000)_2$. Increasing $X_{max}$ by 1 results in $(16)_{10} = (10000)_2$ out of which, in a 4-bit representation, only the last four digits are retained, yielding $(0000)_2 = (0)_{10}$. In general, a number $X$ that is not in the range $[X_{min}, X_{max}] = [0, 15]$ is represented by $X$ modulus 16, or $X \bmod 16$, which is the remainder when dividing $X$ by 16. Such a situation can arise, for example, when two operands $X$ and $Y$ are added and their sum exceeds $X_{max}$. In this case, the final result $S$ satisfies $S = (X + Y) \bmod 16$. For example,

$$\begin{array}{rcccccr}
X & & 1 & 0 & 1 & 1 & 11 \\
+\,Y & & 0 & 1 & 1 & 1 & 7 \\ \hline
 & 1 & 0 & 0 & 1 & 0 & 18
\end{array}$$

Since the final result has to be stored in a 4-bit register, the most significant bit (whose weight is $2^4 = 16$) is discarded, yielding $(0010)_2 = 2 = 18 \bmod 16$. □

1.2 MACHINE REPRESENTATIONS OF NUMBERS

The conventional binary system is a specific example of a number system that can be used to represent numerical values in an arithmetic unit. A number system is in general defined by the set of values that each digit can assume and by an interpretation rule that defines the mapping between the sequences of digits and their numerical values. We distinguish between conventional number systems like the binary system described in the previous section (or the commonly used decimal system) and unconventional systems like the signed-digit number system (to be presented in Chapter 2).

The conventional number systems are nonredundant, weighted, and positional number systems. In a nonredundant number system every number has a unique representation; in other words, no two sequences have the same numerical value. The term weighted number system means that there is a sequence of weights

$$w_{n-1}, w_{n-2}, \ldots, w_1, w_0$$

that determines the value of the given $n$-tuple $(x_{n-1}, x_{n-2}, \ldots, x_0)$ by the equation

$$X = \sum_{i=0}^{n-1} x_i w_i. \tag{1.2}$$

Thus, $w_i$ is the weight assigned to the digit in the $i$th position, $x_i$. Finally, in a positional number system, the weight $w_i$ depends only on the position $i$ of the digit $x_i$. In the conventional number systems the weight $w_i$ is the $i$th power of a fixed integer $r$, which is the radix of the number system. In other words, $w_i = r^i$. Therefore, these number systems are also called fixed-radix systems. Since the weight assigned to the digit $x_i$ is $r^i$, this digit has to satisfy $0 \le x_i \le r - 1$. Otherwise, if $x_i \ge r$ is allowed, then

$$x_i r^i = (x_i - r) r^i + 1 \cdot r^{i+1},$$

resulting in two machine representations for the same value: $(\ldots, x_{i+1}, x_i, \ldots)$ and $(\ldots, x_{i+1} + 1, x_i - r, \ldots)$. In other words, allowing $x_i \ge r$ introduces redundancy into the fixed-radix number system.

A sequence of $n$ digits in a register does not necessarily have to represent an integer. We may use such a sequence to represent a mixed number that has a fractional part as well as an integral part. This is done by partitioning the $n$ digits into two sets: $k$ digits in the integral part and $m$ digits in the fractional part, satisfying $k + m = n$. The value of an $n$-tuple with a radix point between the $k$ most significant digits and the $m$ least significant digits

$$(\underbrace{x_{k-1} x_{k-2} \cdots x_1 x_0}_{\text{integral part}} \; . \; \underbrace{x_{-1} x_{-2} \cdots x_{-m}}_{\text{fractional part}})_r$$

is

$$X = x_{k-1} r^{k-1} + x_{k-2} r^{k-2} + \cdots + x_1 r + x_0 + x_{-1} r^{-1} + \cdots + x_{-m} r^{-m} = \sum_{i=-m}^{k-1} x_i r^i. \tag{1.3}$$
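Equation (1.3) can be checked with a short Python sketch. The function `fixed_point_value`, its argument order, and the use of exact `Fraction` arithmetic are choices made for this illustration only.

```python
from fractions import Fraction


def fixed_point_value(digits, k, radix=2):
    """Value of an n-digit fixed-point sequence with k integral digits (Eq. 1.3).

    `digits` lists x_{k-1} first and x_{-m} last, so m = len(digits) - k.
    """
    value = Fraction(0)
    for position, x in enumerate(digits):
        i = k - 1 - position              # exponent of the radix for this digit
        value += x * Fraction(radix) ** i
    return value


# The binary sequence 101110.011: k = 6 integral digits, m = 3 fractional digits.
x = fixed_point_value([1, 0, 1, 1, 1, 0, 0, 1, 1], k=6)
print(x, float(x))   # 371/8 46.375
```

Exact rational arithmetic is used here so that negative powers of the radix introduce no rounding of their own.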
The radix point is not stored in the register but is understood to be in a fixed position between the $k$ most significant digits and the $m$ least significant digits. For this reason we call such representations fixed-point representations. A programmer of a digital computer is not necessarily restricted to the use of numbers having the predetermined position of the radix point but can properly scale the operands. As long as the same scaling factor is used for all operands, the add and subtract operations yield the correct results, since $aX \pm aY = a(X \pm Y)$, where $a$ is the scaling factor. However, corrections are required when performing multiplication and division, since $aX \cdot aY = a^2 XY$ and $aX / aY = X / Y$. Commonly used positions for the radix point are at the rightmost side of the number (i.e., pure integers, $m = 0$) and at the leftmost side of the number (i.e., pure fractions, $k = 0$).

Given the length $n$ of the operands, the weight $r^{-m}$ of the least significant digit indicates the position of the radix point. To simplify our discussion from this point on and to avoid the need to distinguish between the different partitions of numbers (into fractional and integral parts), we introduce the notion of a unit in the last position (ulp), which is the weight of the least significant digit:

$$ulp = r^{-m}. \tag{1.4}$$

1.3 RADIX CONVERSIONS

Radix conversion is the translation of a number $X$ represented in one radix number system (to be called the source number system) to its representation in another number system (called the destination number system). The major reason for such conversions is the fact that most arithmetic units operate on binary numbers while their users are more accustomed to decimal numbers, which also require a smaller number of digits. We will therefore emphasize conversions between the decimal and the binary number systems but will present the algorithms in a more general form.

Given a number $X$, we wish to find its representation in the destination number system with radix $r_D$. For convenience, we distinguish between the conversion of the integral part $X_I$ and that of the fractional part $X_F$. Starting with the integral part, its representation $(x_{k-1} x_{k-2} \cdots x_1 x_0)_{r_D}$ is sought. We can rewrite the appropriate part of Equation (1.3) as follows:

$$X_I = \{(\cdots(x_{k-1} r_D + x_{k-2}) r_D + \cdots + x_2) r_D + x_1\} r_D + x_0, \tag{1.5}$$

where $0 \le x_i < r_D$. Thus, if we divide $X_I$ by $r_D$ we obtain $x_0$ as the remainder and

$$\{(\cdots(x_{k-1} r_D + x_{k-2}) r_D + \cdots + x_2) r_D + x_1\}$$

as the quotient. If we now divide the above quotient by $r_D$, we obtain $x_1$ as the remainder. We may therefore divide the quotients by $r_D$ repeatedly, retaining the remainders as the required digits until a zero quotient is reached.

To find the representation $(x_{-1} x_{-2} \cdots x_{-m})_{r_D}$ of the fractional part $X_F$, we rewrite the appropriate part of Equation (1.3) as follows:

$$X_F = r_D^{-1} \{x_{-1} + r_D^{-1} [x_{-2} + r_D^{-1} (x_{-3} + \cdots)]\}. \tag{1.6}$$

If we multiply $X_F$ by $r_D$, we obtain a mixed number with $x_{-1}$ as its integral part and

$$r_D^{-1} [x_{-2} + r_D^{-1} (x_{-3} + \cdots)]$$

as the fractional part. We may therefore multiply the fractional parts by $r_D$ repeatedly, retaining the generated integers as the required digits. However, unlike the algorithm for the integral part, this algorithm is not guaranteed to terminate, since a finite fraction (one that needs a finite number of nonzero digits) in one number system may correspond to an infinite fraction in another. This does not constitute a problem in practice, since the process can be terminated after $m$ steps (or a few additional ones if rounding is desired).

Example 1.2
The decimal mixed number $X = 46.375_{10}$ is to be converted to binary form. Starting with $X_I = 46$ we obtain the following quotients and remainders when repeatedly dividing by 2:

Quotient    Remainder
23          0 = $x_0$
11          1 = $x_1$
5           1 = $x_2$
2           1 = $x_3$
1           0 = $x_4$
0           1 = $x_5$

We now convert the fractional part $X_F = 0.375$ and obtain the following integers and fractions when repeatedly multiplying by 2:

Integer part      Fractional part
0 = $x_{-1}$      .75
1 = $x_{-2}$      .5
1 = $x_{-3}$      .0

Thus, the final result is $46.375_{10} = 101110.011_2$.

If the fractional part of the given decimal number was $X_F = 0.3$, the above algorithm would never terminate, since the decimal fraction $0.3_{10}$ is the infinite binary fraction $(0.0100110011\ldots)_2$. □
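The two conversion procedures of this section, repeated division for the integral part and repeated multiplication for the fractional part, can be sketched in Python as follows. The function names and the `max_digits` cutoff are assumptions made for this illustration; the cutoff stands in for terminating the process after a fixed number of steps.

```python
from fractions import Fraction


def convert_integer(x_int, radix):
    """Digits of a nonnegative integer, most significant first (repeated division)."""
    if x_int == 0:
        return [0]
    digits = []
    while x_int > 0:
        x_int, remainder = divmod(x_int, radix)
        digits.append(remainder)          # remainders appear as x_0, x_1, ...
    return digits[::-1]


def convert_fraction(x_frac, radix, max_digits):
    """Up to max_digits digits x_{-1}, x_{-2}, ... of a fraction 0 <= x_frac < 1."""
    digits = []
    while x_frac != 0 and len(digits) < max_digits:
        x_frac *= radix
        integer_part = int(x_frac)        # the generated integer is the next digit
        digits.append(integer_part)
        x_frac -= integer_part
    return digits


print(convert_integer(46, 2))                      # [1, 0, 1, 1, 1, 0]
print(convert_fraction(Fraction(3, 8), 2, 10))     # [0, 1, 1]
print(convert_fraction(Fraction(3, 10), 2, 10))    # [0, 1, 0, 0, 1, 1, 0, 0, 1, 1]
```

Passing the fraction as an exact `Fraction` avoids mixing the conversion with binary floating-point rounding. The last call shows that the decimal fraction 0.3 does not terminate in binary and is simply cut off after ten digits.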
All the arithmetic operations in the above conversion algorithms were performed in the source number system, which was the decimal system. To convert a binary number to the decimal system, we may either execute the above algorithm in the source binary system or, more conveniently, perform the conversion in the destination decimal system using Equation (1.3).

1.4 REPRESENTATIONS OF NEGATIVE NUMBERS

For fixed-point numbers in a radix $r$ system, we have to determine the way negative numbers are represented. Two different forms are commonly used:

1. Sign and magnitude representation, which is also called the signed-magnitude method
2. Complement representation, which comprises two alternatives

1. Signed-magnitude: Here the sign and magnitude are represented separately. The first digit is the sign digit and the remaining $(n - 1)$ digits represent the magnitude. In the binary case, the sign bit is normally selected to be 0 for positive numbers and 1 for negative ones. In the nonbinary case, the values 0 and $(r - 1)$ are assigned to the sign digit of positive and negative numbers, respectively. Notice that in this case only $2 r^{n-1}$ out of the $r^n$ possible sequences are utilized. This will be discussed further later on.

Let the $(n - 1)$ digits representing the magnitude be partitioned into $(k - 1)$ and $m$ digits in the integral and fractional parts, respectively. The largest representable value is then

$$X_{max} = r^{k-1} - ulp, \quad \text{where } ulp = r^{-m},$$

and the corresponding representation is $0(r-1)\cdots(r-1)$. Thus, the range of positive numbers is $[0, r^{k-1} - ulp]$. The range of negative numbers is similarly $[-(r^{k-1} - ulp), 0]$, represented by $(r-1)(r-1)\cdots(r-1)$ to $(r-1)0\cdots0$. We therefore have two representations for zero, one positive and one negative. This is inconvenient when implementing an arithmetic unit, since a test-for-zero operation must generate the same indication for the two different representations of zero.

Example 1.3
In the binary case all $2^n$ sequences are utilized. The $2^{n-1}$ sequences from $00\cdots0$ to $01\cdots1$ represent positive numbers, while the remaining $2^{n-1}$ sequences from $10\cdots0$ to $11\cdots1$ represent negative numbers. If $k = n$ (and therefore, $m = 0$ and $ulp = 2^0 = 1$) the range of positive numbers is $[0, 2^{n-1} - 1]$ and the range of negative numbers is $[-(2^{n-1} - 1), 0]$. □

A major disadvantage of the signed-magnitude representation is that the operation to be performed may depend on the signs of the operands. For example, when adding a positive number $X$ and a negative number $-Y$ (i.e., $Y$ is the absolute value of the second operand), we need to perform the calculation $X + (-Y)$. If $Y > X$, we should obtain as a final result $-(Y - X)$. We therefore need to first calculate $Y - X$, i.e., switch the order of the operands and perform subtraction rather than addition, and then attach the minus sign. This results in a sequence of decisions that have to be made, costing excess control logic and execution time. This is avoided in the complement representation methods.

2. Complement representations: There are two alternatives:
(i) Radix complement (also called two's complement in the binary system)
(ii) Diminished-radix complement (called one's complement in the binary system)

In both complement methods, a positive number is represented in the same way as in the signed-magnitude method, whereas a negative number, $-Y$, is represented by $(R - Y)$ where $R$ is a constant whose value we will determine next. Such a representation satisfies the basic identity

$$-(-Y) = Y, \tag{1.7}$$

since the complement of $(R - Y)$ is $R - (R - Y) = Y$. One of the major advantages of a complement representation (regardless of the exact value of $R$) is that no decisions have to be made before executing an addition or subtraction. In the previous example, where a positive and a negative number are to be added, the second operand is represented by $(R - Y)$. Therefore, the addition to be performed is

$$X + (R - Y) = R - (Y - X).$$

If $Y > X$ then the negative result $-(Y - X)$ is already represented in the same complement form, i.e., as $R - (Y - X)$, and there is no need to make any special decisions like interchanging the order of the two operands. However, if $X > Y$, the correct result should be $(X - Y)$ while $X + (R - Y) = R + (X - Y)$. The additional term, $R$, must be discarded, and the value of $R$ should be selected to simplify or even completely eliminate this correction step.

Another requirement on the value selected for $R$ is that the calculation of the complement $(R - Y)$ of a given number $Y$ be a simple operation that can be done at a high speed. Before deciding on the value of $R$ we define the complement of a single digit $x_i$, denoted by $\bar{x}_i$, as

$$\bar{x}_i = (r - 1) - x_i. \tag{1.8}$$
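The digit complement of Equation (1.8) is simple enough to state directly in code. The Python sketch below applies it to every digit of a sequence; the function name `digit_complement` is an assumption made for this illustration.

```python
def digit_complement(digits, radix):
    """Complement every digit x_i of a sequence as (r - 1) - x_i (Equation 1.8)."""
    for x in digits:
        if not 0 <= x < radix:
            raise ValueError(f"digit {x} is not allowed in radix {radix}")
    return [(radix - 1) - x for x in digits]


print(digit_complement([0, 1, 1, 0], radix=2))   # [1, 0, 0, 1]
print(digit_complement([3, 4, 5], radix=10))     # [6, 5, 4]
```

In the binary case the operation reduces to inverting each bit, which is one reason it can be carried out at high speed.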
\Ve dellote I,y ..\ the fI-tuplp (Xk-I,.f.k-2' . . " .i -m) obtained after cumpleml.'lIting
('vcry digit in t.he St'<IIIl'I1CC corresponding to X. \VI" IIOW adel X to X nlld. hIL>;l.d
R
1, Conventional Number Systems
1 .4 Representations of Negatave Numbers
o
X + A + uip = r".
( 1.9)
Sequencc Two's cOlllplpment 0111"5 rornplcmcnt Siguf'd-m8Rn it ude
0111 7 7 7
0110 (j 6 6
0101 5 5 5
0100 4 .. ..
UOli 3 3 3
DO 10 2 2 2
0001 1 1 1
0000 0 0 0
1111 -1 0 -7
1110 2 -1 6
1101 -3 -2 -I)
1100 - -3 -4
1011 -5 - I - 3
1010 -6 -5 -2
1001 -; -6 -1
1000 -8 -7 -0
on Equatioll (1.8), we obtain Xi + X, = (r - 1), indppf'I1lI"nt of the t'xact vailI('
of Xi. WI' thpll add nip to t.lIP sum of .\ aud X, yif'lding
\ XIe-t Xk-2
+1" 3'1e-1 XI:-2
(r - 1) (r - 1)
+ ul l)
1 0 0
X- rn
-m
(r - 1)
1
o
= r le
The abO\" calculutlon can bf' rt'writtf'n as
Nutt' I hat when till' above re:sult is stored into a regish>r of I('ngth n (n = k + m),
th(' most significant, digit is discarded alld the final nsult is zero. In gpner<tJ, stor-
illg t,he rl':>ult of any arithmetic operation into a ft.xed-Iellgth register is tX}uiv.,"'nt
to t.akillg the renmindt'r after dividing by r".
Rearranging the lPrlllS ill the pre\'ious equation re:sults in
TABLE 1.1 Three representation methods of binary numbers with k. n. 4
r" - X = X + ulp.
Ie -
R - X = r - X = ,,\ + ulp.
( 1.10)
additional sequence thal start:. with a 1, namely, 1000, which has no cor-
rp()llding positive IIInnbcr. It reprents tilt> neative number (-8ho.
Therf'fnrf', tlU' range of binary mnnbl'rs ill the two's complement method
with k = n =.. is - 8 X 7. Th(' two's complement repl'f'Sf'ntations
of "111 values within thi.. r8llge are shown in Tahle 1.1.
To illustrate the :.implicity of executing the operation ..\ + (-}') with
}' > X, consid('r thl' additioll of thl' lIumbers 2 and 7, represent'-'tl by
0010 and 1001, respectivcl)':
o 0 1 0 2
+ I 0 0 1 -7
1 0 I 1 -5
This is the correct (('Mdt rt>prf'Sf'llted in the two's COlllplplUent methJd,
alld there is 110 IIl'<.'d (or any preliminary decisions or post corr,-'C.tions.
Even whl'lJ X > }', tin: expected r('Sult is calculated without requirin
any corrN'tiolls. For example, when adding 7 8nd -2, reJlrt'S('lIted by
0111 ami 1110. flospecti\'cl)', Wt' obtain
Omsequently, if we splcet. thc vahlf' r le for R w(' obt.ain
Th(' calculation of the complement (R - X) of a ivl'n lJumber X as df'fineO
abO\'e is quitR simpl(' and if> iudept>ndt>nt of the value of k. \\"c c.all this radlX-
compkmcnt representation. No corw('tioll is needed for it wht'n thc r('5ult of the
prc\1ous operation. X + (R - Y), is positive (i.e., when X > Y), since R = r k
is dibcarded when calculatillg R + (X - V).
Example 1.4
For r = 2 and k = n = 4 (alld ronsequellt-ly, m = 0 and uip = 2 0 = 1) the
radix compleJllf'nt (also call('(1 thp two's complf'lIll'llt in the binary case)
of a numhpr X equal.. 2 4 - X but can instead be calculated. according to
E<juatioll (1.10), by X + 1. In this case, the scqUf'llces 0000 to 0111 repr('-
sent thp positive Ilumbers 010 to 7 1 0, resl)('ctively. The two's comph'lIlent
of the largest positive number is 1000+ 1 = 1001 alld it repn":Sent:. the valu('
(-7ho. 1'11£' two's complenlPnt of .tero is 1111 + 1 = 10000 = 0 mod z1;
i.e., there is a single representation of i:cro. Thus, f'adl positive number
has a corrl'Spoudillg negdtive numlwr that, starts with a 1. There is an
o
+ 1
1 0
III
1 1 0
101
7
-2
5
Ouly the last four lea.."t siRnific3nt digits arc retaint't.l, yit>lding 0101. 0
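As an illustration of Equations (1.8) and (1.10), the following minimal Python sketch forms the radix complement of an integer digit sequence by complementing every digit and then adding ulp. It assumes $m = 0$ (so ulp = 1) and digits listed most significant first; the function names are hypothetical and are not taken from the text.

```python
def digit_complement(digits, r):
    """Complement every digit: x -> (r - 1) - x, as in Equation (1.8)."""
    return [(r - 1) - x for x in digits]


def radix_complement(digits, r):
    """Radix complement of an integer digit sequence (most significant digit first).

    Computes X-bar + ulp with ulp = 1 (assumption: no fractional digits) and
    keeps only len(digits) digits, i.e. the carry out of the most significant
    position is discarded, as it would be in a fixed-length register.
    """
    out = digit_complement(digits, r)
    carry = 1  # adding ulp
    for i in range(len(out) - 1, -1, -1):
        total = out[i] + carry
        out[i] = total % r
        carry = total // r
    return out


print(radix_complement([0, 1, 1, 1], 2))  # two's complement of 0111
print(radix_complement([0, 0, 0, 0], 2))  # two's complement of zero
```

With $r = 2$ the two calls reproduce the cases of Example 1.4: the complement of 0111 is 1001, and the complement of zero is again 0000 once the carry-out is dropped.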
A second possible choice for $R$ is $R = r^k - ulp$. This is the diminished-radix complement for which, according to Equation (1.9),

$$R - X = (r^k - ulp) - X = \bar{X}. \tag{1.11}$$

Here, the derivation of the complement is even simpler than that of the radix complement. All the digit-complements $\bar{x}_i$ can be calculated in parallel, leading to a fast computation of $\bar{X}$. On the other hand, a correction step is needed when the result $R + (X - Y)$ is obtained and $(X - Y)$ is positive, as will be explained later.

Example 1.5
For $r = 2$ and $k = n = 4$, the diminished-radix complement (also called the one's complement in the binary case) of a number $X$ equals $(2^4 - 1) - X$, which also equals $\bar{X}$, the sequence of digit complements, according to Equation (1.11). As in the previous example, the sequences 0000 to 0111 represent the positive numbers 0 to 7, respectively. The one's complement of the largest positive number is 1000, representing the value $(-7)_{10}$. The one's complement of zero is 1111; i.e., there are two representations of zero. In summary, the range of binary numbers in the one's complement method with $k = n = 4$ is $-7 \le X \le 7$. The different representations of positive and negative numbers with $k = n = 4$ in the three methods are compared in Table 1.1. □

In the binary case, the most significant digit can assume only two values and is thus a "true" sign digit. This holds for all three representation methods as shown in Table 1.1, and the distinction between positive and negative numbers is greatly simplified. In the nonbinary case, restricting the most significant digit to two values only (0 and $(r - 1)$) would considerably reduce the percentage of utilized sequences; only $2 \cdot r^{n-1}$ out of $r^n$ (or 2 out of $r$) would be used. To make up for this, we can let the most significant digit assume all its possible values and partition the total number, $r^n$, of sequences equally (or almost equally) between positive and negative values. In general, a given number $X$ is represented in a complement system by $X$ if it is positive or by $R - |X|$ if it is negative. To have unambiguous representations, the regions for positive and negative numbers should not overlap. In other words, the inequality

$$|X| \le R/2$$

must be satisfied. If the value $X = R/2 + 1$ is allowed to be included in the region of representable numbers, then the negative number $-X$ is represented by $R - X = R/2 - 1$, which is identical to the representation of the positive number $R/2 - 1$. Nonoverlapping regions for positive and negative numbers can be achieved easily if the radix $r$ is an even number. In this case, in order to satisfy the inequality $|X| \le R/2 = \frac{r}{2} r^{n-1}$ (for an integer-only representation), the values $0, 1, \ldots, \frac{r}{2} - 1$ for the most significant digit would correspond to positive numbers, while the values $\frac{r}{2}, \ldots, r - 1$ would correspond to negative numbers. If, however, the radix is odd, then the sequences from 0 up to about $r^n/2$ must be made positive, and the remaining ones negative, making it more difficult to distinguish between positive and negative numbers.

Example 1.6
In the radix-complement decimal system the most significant digit can assume any of its 10 possible values. Thus, all sequences with a leading digit of 0, 1, 2, 3, or 4 represent positive numbers, while those having a leading digit of 5, 6, 7, 8, or 9 represent negative ones. For $n = 4$, the largest positive number is 4999 and the sequences 5000 through 9999 represent negative numbers with values of -5000 through -1, respectively. The range is therefore $-5000 \le X \le 4999$ and is shown in the diagram below.

[Diagram: the sequences 0000, 0001, ..., 4999 represent the values 0 through 4999, and the sequences 5000, 5001, ..., 9999 represent the values -5000 through -1.]

Let $Y = 2345$; to find the representation of $-Y = -2345$ (i.e., the radix complement $R - Y$ with $R = 10^4$) we use the expression $R - Y = \bar{Y} + ulp$. The digit complement in this case is the nine's complement, yielding $\bar{Y} = 7654$ and thus, $\bar{Y} + 1 = 7655$. We can verify this result by adding $Y$ to $-Y$, obtaining $2345 + 7655 = 10^4 = 0 \bmod 10^4$. □

The reader should realize by now that algorithms for arithmetic operations can be developed for various fixed-radix number systems and for different partitions of mixed numbers (into their integral and fractional parts). To simplify our discussion from this point on, we restrict ourselves to binary integers; i.e., $r = 2$ and $k = n$. The extension of what follows to the general case of binary numbers with $m \ne 0$ or radix $r$ numbers with $r \ne 2$ is, in most cases, straightforward.
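The partition of sequences between positive and negative values for an even radix can be sketched in a few lines of Python. This is only an illustration under stated assumptions: integer-only sequences, digits listed most significant first, and a hypothetical function name.

```python
def radix_complement_value(digits, r):
    """Signed value of an integer digit sequence in radix-complement form.

    Assumes an even radix r, so that a leading digit of r/2 or more marks a
    negative number, represented by R - |X| with R = r**n.
    """
    if r % 2 != 0:
        raise ValueError("this sketch only handles an even radix")
    n = len(digits)
    value = 0
    for d in digits:
        value = value * r + d  # plain positional value of the sequence
    if digits[0] >= r // 2:
        value -= r ** n        # negative region: subtract R
    return value


print(radix_complement_value([7, 6, 5, 5], 10))  # the sequence 7655 from Example 1.6
print(radix_complement_value([1, 0, 0, 1], 2))   # the sequence 1001 from Example 1.4
```

For the decimal sequence 7655 the sketch returns -2345, and for the binary sequence 1001 it returns -7, in agreement with the two examples.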
1.4.1 The Two's Complement Representation
The range of numbers in the two's complement system for $k = n$ is

$$-2^{n-1} \le X \le (2^{n-1} - ulp) \quad \text{with } ulp = 2^0 = 1.$$

This range is slightly asymmetric as there is one more negative number than there are positive numbers. The binary number $-2^{n-1}$ (represented by 10...0) does not have a positive equivalent. Consequently, if a complement operation for this number is attempted, an overflow indication must be generated. On the other hand, there is a unique representation for 0, as shown in Table 1.1.

Given a representation $(x_{n-1}, x_{n-2}, \ldots, x_0)$ in two's complement, we can use the following procedure to find its numerical value $X$: If $x_{n-1} = 0$, then $X = \sum_{i=0}^{n-1} x_i 2^i$, while if $x_{n-1} = 1$, the given sequence represents a negative number whose absolute value can be obtained by complementing the given sequence and then employing the previous equation.

Example 1.7
Given the 4-tuple 1011, we can find its value by first complementing it to obtain 0100 + 1 = 0101, then calculating the value of the sequence 0101, which is 5. This indicates that the value of the original sequence is -5. □

Instead of the above procedure we can use the expression

$$X = -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i. \tag{1.12}$$

Using Equation (1.12) for 1011 we obtain -8 + 2 + 1 = -5.

To prove the validity of Equation (1.12) note first that, if $x_{n-1} = 0$, the exact same positive value is calculated. If $x_{n-1} = 1$, the value of the given representation is

$$\begin{aligned}
-\left[\bar{X} + ulp\right] &= -\left[\sum_{i=0}^{n-2} \bar{x}_i 2^i + 1\right] = -\left[\sum_{i=0}^{n-2} (1 - x_i) 2^i + 1\right] \\
&= -\left[\sum_{i=0}^{n-2} 2^i - \sum_{i=0}^{n-2} x_i 2^i + 1\right] = -\left[(2^{n-1} - 1) - \sum_{i=0}^{n-2} x_i 2^i + 1\right] \\
&= -2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i,
\end{aligned}$$

which is exactly the value of the right-hand side of Equation (1.12) for $x_{n-1} = 1$.

1.4.2 The One's Complement Representation
The range of representable numbers in the one's complement system for $k = n$ is symmetric, and equals

$$-(2^{n-1} - ulp) \le X \le (2^{n-1} - ulp) \quad \text{with } ulp = 2^0 = 1.$$

As a result, there are two representations of zero, a positive zero represented by 000...0, and a negative zero represented by 111...1, as shown in Table 1.1.

For the one's complement system we have an equation similar to that of the two's complement system:

$$X = -x_{n-1}(2^{n-1} - ulp) + \sum_{i=0}^{n-2} x_i 2^i. \tag{1.13}$$

For example, the 4-tuple 1010 represents the value $-(2^3 - 1) + 2 = -5$. The proof of Equation (1.13) is left to the reader as an exercise.

The derivation of the one's complement is simpler than that of the two's complement. For each digit we have to calculate $\bar{x}_i = 1 - x_i$, which is the Boolean complement and can be done in parallel for all digits.

1.5 ADDITION AND SUBTRACTION
When adding or subtracting numbers represented in the signed-magnitude representation, only the magnitude bits participate in the arithmetic operation, while the sign bits are treated separately. Consequently, a carry-out (or borrow-out) indicates overflow. For example,

    Sign   Magnitude
      0      1 0 1 1       11
    + 0      0 1 1 0        6
      0    1 0 0 0 1        1    Carry-out

The final result is positive (the sum of two positive numbers), but its magnitude in four bits is erroneously obtained as 1 instead of 17, since 1 = 17 mod 16.

In both complement systems, all digits, including the sign digit, participate in the add or subtract operation. A carry-out is therefore not necessarily an indication of an overflow in the system. For example, when adding the two numbers $X = 13$ and $Y = -8$ represented in the two's complement method, we obtain

        0 1 1 0 1      13
      + 1 1 0 0 0      -8
      1 0 0 1 0 1       5    Carry-out, but no overflow

The carry-out is discarded and does not indicate overflow. In general, if $X$ and $Y$ have opposite signs, no overflow can occur regardless of whether there is a carry-out or not, as illustrated in the following two examples:

        0 0 1 0 1       5
      + 1 0 1 1 0     -10
        1 1 0 1 1      -5    No carry-out

        0 1 0 1 0      10
      + 1 1 0 1 1      -5
      1 0 0 1 0 1       5    Carry-out
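Equations (1.12) and (1.13) and the two's complement additions shown so far can be checked with a short Python sketch. Bit lists are given most significant bit first, and the function names are hypothetical choices made for this illustration only.

```python
def twos_value(bits):
    """Value of a two's complement bit list, following Equation (1.12)."""
    n = len(bits)
    magnitude = sum(b * 2 ** (n - 2 - i) for i, b in enumerate(bits[1:]))
    return -bits[0] * 2 ** (n - 1) + magnitude


def ones_value(bits):
    """Value of a one's complement bit list, following Equation (1.13) with ulp = 1."""
    n = len(bits)
    magnitude = sum(b * 2 ** (n - 2 - i) for i, b in enumerate(bits[1:]))
    return -bits[0] * (2 ** (n - 1) - 1) + magnitude


def twos_add(a, b):
    """Add two equal-length two's complement bit lists.

    Returns (sum_bits, carry_out); the carry-out is not part of the sum,
    which models storing the result in a register of the same length.
    """
    n = len(a)
    out = [0] * n
    carry = 0
    for i in range(n - 1, -1, -1):
        total = a[i] + b[i] + carry
        out[i] = total % 2
        carry = total // 2
    return out, carry


total, carry = twos_add([0, 1, 1, 0, 1], [1, 1, 0, 0, 0])  # 13 + (-8)
print(total, carry, twos_value(total))
```

The call adds 13 and -8 in five bits; the sum bits 00101 evaluate to 5 while the carry-out of 1 is simply dropped, matching the worked example.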
If $X$ and $Y$ have the same sign and the sign of the result is different from that of the two operands, then an overflow occurs. For example,

        1 1 0 0 1      -7
      + 1 0 1 1 0     -10
      1 0 1 1 1 1      15    Carry-out and overflow

        0 0 1 1 1       7
      + 0 1 0 1 0      10
        1 0 0 0 1     -15    No carry-out but overflow

In the one's complement system a carry-out is an indication that a correction step is needed. For example, when adding a positive number $X$ and a negative number $-Y$ represented in one's complement, the result is

$$X + (2^n - ulp) - Y = (2^n - ulp) + (X - Y)$$

and if $X > Y$ we should obtain $(X - Y)$. The term $2^n$ represents the carry-out bit, which is discarded since the final result should be stored into a register of length $n$. The result is therefore $(X - Y - ulp)$, and the necessary correction is made by adding $ulp$. For example,

        0 1 0 1 0      10
      + 1 1 0 1 0      -5
      1 0 0 1 0 0
    +           1      ulp (correction)
        0 0 1 0 1       5

The generated carry-out is called end-around carry, meaning that the carry-out is an indication that a 1 should be added to the least significant position. If there is no carry-out then no correction is needed. This is the case when $X < Y$ and thus the result, $-(Y - X)$, is negative and should be represented by $(2^n - ulp) - (Y - X)$. For example,

        0 0 1 0 1       5
      + 1 0 1 0 1     -10
        1 1 0 1 0      -5    No carry-out and hence no correction

As we have seen before, no such correction is necessary in two's complement addition.

In both complement systems, a subtract operation, $X - Y$, is performed by adding the complement of $Y$ to $X$. In the one's complement system this means adding $\bar{Y}$ to $X$; in other words, $X - Y = X + \bar{Y}$. In the two's complement system we perform $X - Y = X + (\bar{Y} + ulp)$. This still requires only a single adder operation, since $ulp$ is added through the forced carry input to the binary adder. This will be further explained in Chapter 5.

1.6 ARITHMETIC SHIFT OPERATIONS
Another way of distinguishing among the three methods for representing negative numbers is to consider the infinite extensions to the right and to the left of a given number. In the signed-magnitude method, the magnitude $x_{n-2}, \ldots, x_0$ can be viewed as the infinite sequence

$$\ldots, 0, 0, \{x_{n-2}, \ldots, x_0\}, 0, 0, \ldots$$

If any arithmetic operation results in a nonzero prefix, then this constitutes an overflow.

In the radix-complement scheme the infinite extension is

$$\ldots, x_{n-1}, x_{n-1}, \{x_{n-1}, \ldots, x_0\}, 0, 0, \ldots$$

where $x_{n-1}$ is the sign digit. Finally, for the diminished-radix complement scheme the sequence is

$$\ldots, x_{n-1}, x_{n-1}, \{x_{n-1}, \ldots, x_0\}, x_{n-1}, x_{n-1}, \ldots$$

Example 1.8
The sequences 1011., 11011.0, and 111011.00 all represent the value -5 in the two's complement method. Similarly, the sequences 1010., 11010.1, and 111010.11 all represent the value -5 in the one's complement method. The general proof is left as an exercise for the reader. □

These extensions are useful when adding operands with different numbers of bits. The shorter operand must be extended to the length of the longer one before being added. Based on these extensions we can derive the rules for arithmetic shift operations, where a left and a right shift are equivalent to multiplication and division by 2, respectively.
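Two of the one's complement rules described in Sections 1.5 and 1.6, the end-around carry and the extension of a shorter operand by repeating its sign bit, can be combined in one small Python sketch. It is only an illustration: integers only, bit lists most significant bit first, and hypothetical function names.

```python
def ones_extend(bits, length):
    """Extend a one's complement integer to the left by repeating its sign bit."""
    return [bits[0]] * (length - len(bits)) + list(bits)


def _ripple_add(x, y, carry):
    """Plain binary addition of equal-length bit lists; returns (bits, carry_out)."""
    out = [0] * len(x)
    for i in range(len(x) - 1, -1, -1):
        total = x[i] + y[i] + carry
        out[i] = total % 2
        carry = total // 2
    return out, carry


def ones_add(a, b):
    """Add two one's complement integers, applying the end-around carry."""
    n = max(len(a), len(b))
    a, b = ones_extend(a, n), ones_extend(b, n)
    total, carry = _ripple_add(a, b, 0)
    if carry:
        # A carry-out signals that ulp must be added back at the least significant position.
        total, _ = _ripple_add(total, [0] * n, 1)
    return total


print(ones_add([0, 1, 0, 1, 0], [1, 1, 0, 1, 0]))  # 10 + (-5), carry-out corrected
print(ones_add([0, 0, 1, 0, 1], [1, 0, 1, 0, 1]))  # 5 + (-10), no carry-out
```

The two calls reproduce the worked one's complement additions of Section 1.5. Overflow detection is deliberately left out of this sketch.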
Example 1.9
In two's complement,

$$\begin{aligned}
\mathrm{Sh.L}\{00101_2 = 5\} &= 01010_2 = 10 \\
\mathrm{Sh.R}\{00101_2 = 5\} &= 00010_2 = 2 \\
\mathrm{Sh.L}\{11011_2 = -5\} &= 10110_2 = -10 \\
\mathrm{Sh.R}\{11011_2 = -5\} &= 11101_2 = -3
\end{aligned}$$

In one's complement,

$$\begin{aligned}
\mathrm{Sh.L}\{11010_2 = -5\} &= 10101_2 = -10 \\
\mathrm{Sh.R}\{11010_2 = -5\} &= 11101_2 = -2
\end{aligned}$$
□
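The shift rules that follow from the infinite extensions can be sketched in Python as below. The scheme names "twos" and "ones" and the function names are assumptions made for this illustration, and overflow on a left shift is not checked.

```python
def shift_left(bits, scheme):
    """Arithmetic left shift by one position (bit list, most significant bit first).

    The vacated rightmost position is filled from the right-hand extension:
    0 in two's complement, the sign bit in one's complement.
    """
    fill = bits[0] if scheme == "ones" else 0
    return list(bits[1:]) + [fill]


def shift_right(bits):
    """Arithmetic right shift by one position.

    In both complement schemes the left-hand extension repeats the sign bit.
    """
    return [bits[0]] + list(bits[:-1])


print(shift_left([1, 1, 0, 1, 1], "twos"))  # -5 shifted left in two's complement
print(shift_left([1, 1, 0, 1, 0], "ones"))  # -5 shifted left in one's complement
print(shift_right([1, 1, 0, 1, 1]))         # -5 shifted right in two's complement
```

The three calls give 10110, 10101, and 11101, the same sequences as in Example 1.9. Note that the right shift of an odd negative number rounds differently in the two schemes (-3 versus -2) even though the resulting bit pattern here is the same.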
Arithmetic shift operations are very useful in many algorithms for multiplication and division, as will be described in Chapter 3.

1.7 EXERCISES

1.1. (a) Find the fixed-point representations of the two values $(+41.25)_{10}$ and $(-41.25)_{10}$ in the radix complement and diminished-radix complement systems if the radix $r$ is 8, $k = 3$ integer digits, and $m = 1$ fraction digit.
(b) Repeat (a) for $r = 2$, $k = 8$ and $m = 2$.

1.2. Verify the correctness of the following procedure to obtain the two's complement, $(y_{n-1}, y_{n-2}, \ldots, y_0)$, of a given sequence $(x_{n-1}, x_{n-2}, \ldots, x_0)$. Start from the rightmost bit $x_0$. For each 0 in $X$ set the corresponding bit in $Y$ to 0 until you reach the first 1. For this 1 set the corresponding bit to 1. From this point on set $y_i = \bar{x}_i$.

1.3. Show that approximately $3.3n$ bits are needed to represent an $n$-digit decimal number.

1.4. In this problem we attempt to find the most "efficient" fixed-radix number system. Assume that $N$ different values need to be represented and search for the radix $r$ that minimizes the product $E = n \cdot r$ where $n$ is the number of radix $r$ digits that are required to represent $N$ values; i.e., $n = \log_r N$. Thus, the function $E = r \log_r N = (r / \log_2 r) \log_2 N$ must be minimized.
(a) Justify the selection of the objective function $E$.
(b) Calculate the values of $r / \log_2 r$ for $r = 2, e, 3, 4, 10$. What are the best choices for $r$?

1.5. Prove the extensions of two's complement and one's complement numbers to infinite sequences. In particular, based on Equation (1.12) show that the representation of a signed binary number in $(n + 1)$ bits in the two's complement method may be derived from its representation in $n$ bits by repeating the sign bit.

1.6. Given an $n$-tuple $X = (x_{n-1}, \ldots, x_0)$ find the value of $\mathrm{Sh.R}\{X\} + \mathrm{Sh.R}\{-X\}$ and of $\mathrm{Sh.L}\{X\} + \mathrm{Sh.L}\{-X\}$ using either the diminished-radix complement or the radix complement representation.

1.7. What are the rules for overflow detection in add/subtract operations when using two's complement or one's complement representations, assuming that carry-in and carry-out signals for the sign digit are available?

1.8. Prove that the value of any number in the one's complement system can be calculated, for any $k$ and $m$, using the formula

$$X = -x_{k-1}(2^{k-1} - ulp) + \sum_{i=-m}^{k-2} x_i 2^i.$$

1.9. Can the equation $X = -x_{n-1} 2^{n-1} + \sum x_i 2^i$ be extended to the radix complement representation for any positive radix $r$?

1.10. The difference $X - Y$ can be formed by adding the complement of $X$ to $Y$ and then complementing the result. Prove that this is true for any complement scheme and any nonmixed-radix number system. Is it useful?

1.11. Show that a carry-out in any add or subtract operation of one's complement numbers indicates that an end-around carry must be added to correct the final result. Pay particular attention to the case where two negative numbers need to be added.

1.12. Can the correction step in one's complement addition by adding an end-around carry generate another end-around carry?

1.13. In the signed-magnitude representation there is another way to perform the operation $X + (-Y)$ when $Y > X$. Instead of changing the order of operands and calculating $-(Y - X)$, we can simply subtract $Y$ from $X$. However, this will result in a need for a correction. For example,

    Sign   Magnitude
      0     1 0 1 1      +11
      1     1 1 0 1      -13
      1     1 1 1 0      -14

while the correct result is 1 0010 = $(-2)_{10}$. Show that the required correction can be done by taking the two's complement of the result $X - Y$. Explain why the complement operation provides the necessary correction.
2
UNCONVENTIONAL FIXED-RADIX NUMBER SYSTEMS

Although the conventional binary number system with two's complement representation of negative numbers is commonly used in arithmetic units, there are several other number systems which have proven to be useful for certain applications. These include the negative-radix and signed-digit number systems, both of which are described in this chapter. Other unconventional number systems, including the sign-logarithm number system and the residue number system, are discussed in Chapters 10 and 11, respectively.
2.1 NEGATIVE-RADIX NUMBER SYSTEMS
Conventional number systems are fixed-radix systems for which the weight $w_i$ of the $i$th digit, $x_i$, is $r^i$, the range of each digit is $\{0, 1, \ldots, r - 1\}$, and the interpretation rule for calculating the numerical value of the sequence $(x_{n-1}, x_{n-2}, \ldots, x_0)$ is

$$X = \sum_{i=0}^{n-1} x_i w_i = \sum_{i=0}^{n-1} x_i r^i. \tag{2.1}$$

The radix $r$ is normally selected to be a positive integer. However, it is not necessary to restrict $r$ to positive values, and we may select $r = -\beta$, where $\beta$ is a positive integer. The digit set remains the same; i.e., $x_i \in \{0, 1, \ldots, \beta - 1\}$. The value of the $n$-tuple $(x_{n-1}, x_{n-2}, \ldots, x_0)$ in this negative-radix system is

$$X = \sum_{i=0}^{n-1} x_i (-\beta)^i. \tag{2.2}$$

In other words, the weight $w_i$ satisfies

$$w_i = \begin{cases} \beta^i & \text{if } i \text{ is even} \\ -\beta^i & \text{if } i \text{ is odd.} \end{cases}$$
Example 2.1
The negative-radix number system with $\beta = 10$ is called the nega-decimal system. Consider the following three-digit nega-decimal numbers:

$$(192)_{-10} = 100 - 90 + 2 = 12,$$
$$(012)_{-10} = -10 + 2 = -8.$$

The largest positive value that can be represented as a 3-tuple $(x_2, x_1, x_0)_{-10}$ is $(909)_{-10} = 909_{10}$ while the smallest is $(090)_{-10} = -90_{10}$. Thus, the range of values represented as 3-tuples $(x_2, x_1, x_0)$ in the nega-decimal system is $-90 \le X \le 909$. This range is asymmetric, since there are approximately 10 times as many positive numbers as negative ones. This is always true for odd values of $n$. If $n$ is even then the opposite is true. For example, the range for $n = 4$ is $-9090 \le X \le 909$. □
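Equation (2.2) and the ranges quoted in Example 2.1 can be reproduced with a short Python sketch. Digits are listed most significant first, and both function names are hypothetical.

```python
def negative_radix_value(digits, beta):
    """Value of a digit sequence in radix -beta, as in Equation (2.2)."""
    value = 0
    for d in digits:
        if not 0 <= d < beta:
            raise ValueError("digit outside {0, ..., beta - 1}")
        value = value * (-beta) + d  # Horner evaluation with radix -beta
    return value


def negative_radix_range(n, beta):
    """Smallest and largest values representable with n digits in radix -beta."""
    # Position i (0 = least significant) has a positive weight when i is even.
    largest = [(beta - 1) if (n - 1 - j) % 2 == 0 else 0 for j in range(n)]
    smallest = [(beta - 1) if (n - 1 - j) % 2 == 1 else 0 for j in range(n)]
    return negative_radix_value(smallest, beta), negative_radix_value(largest, beta)


print(negative_radix_value([1, 9, 2], 10))
print(negative_radix_range(3, 10))
print(negative_radix_range(4, 10))
```

The calls evaluate $(192)_{-10}$ and the three- and four-digit nega-decimal ranges discussed in the example.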
In the negative-radix number system there is no need for a separate sign digit, and consequently there is no need for a special method, like the radix-complement method, to represent negative numbers. The sign of the number is determined by the first nonzero digit. The fact that there is no distinction between positive number and negative number representations makes the arithmetic operations indifferent to the sign of the number. However, the algorithms for the basic arithmetic operations in the negative-radix number system are slightly more complex than their counterparts for the conventional number systems, as illustrated in the following example.
Example 2.2
Consider negative-radix numbers of length $n = 4$ with $\beta = 2$. This number system is called the nega-binary system. The range for 4-bit nega-binary numbers is

$$(-10)_{10} = (1010)_{-2} \le X \le (0101)_{-2} = (+5)_{10}.$$

When adding nega-binary numbers the carry bits can be either positive or negative, as illustrated in the following addition, where the weights of the different bit positions are shown in the top row:

     -8   +4   -2   +1
      0    1    0    1       5
    + 1    1    0    1      -3
      0    1    1    0       2

Note that in the -2 column there is a carry-in whose weight is +2, and the only way to handle it is to convert it into +4 - 2; i.e., produce a sum bit whose weight is -2 and a positive carry-out bit to the +4 column. Also note that in the +4 column, a carry-out is generated with weight +8. This carry bit and the operand bit in the -8 column cancel each other out to produce a zero sum bit. □
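The signed carries of nega-binary addition can be made explicit in a small Python sketch. This is one possible formulation rather than a description of any particular hardware: each column produces a sum bit in {0, 1} and a carry in {-1, 0, 1} that is passed to the next more significant column. The function name is hypothetical and bit lists are most significant bit first.

```python
def negabinary_add(a, b):
    """Add two equal-length nega-binary bit lists.

    Returns (sum_bits, carry_out). In each column s = a_i + b_i + carry_in is
    rewritten as s = bit - 2 * carry_out, because the next column's weight is
    -2 times the current one. A nonzero final carry means the true sum does
    not fit in the given number of bits.
    """
    n = len(a)
    out = [0] * n
    carry = 0
    for i in range(n - 1, -1, -1):
        s = a[i] + b[i] + carry      # s ranges over -1 .. 3
        out[i] = s % 2               # Python's % gives 0 or 1 even when s == -1
        carry = (out[i] - s) // 2    # exact division; result is -1, 0, or 1
    return out, carry


print(negabinary_add([0, 1, 0, 1], [1, 1, 0, 1]))  # 5 + (-3)
```

Running it on the operands of Example 2.2 yields the sum bits 0110 with a zero carry-out; the intermediate carries are -1, +1, and -1, which correspond to the weights +2, +4, and +8 mentioned in the example.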
The nega-binary number system has been proposed for several signal processing applications, and algorithms for all arithmetic operations in this number system have been devised. However, its use has been limited and there are currently no standard integrated circuits (ICs) that perform arithmetic operations in this system. One of the reasons for this is the fact that it is not superior, in principle, to the more well-established two's complement system. As shown in the next section, the two are members of a larger group of number systems with very similar properties.
2.2 A GENERAL CLASS OF FIXED-RADIX NUMBER SYSTEMS
The negative-radix, and many other fixed-radix number systems, are members of a broad class of nonredundant number systems. In this class, each $n$-digit number system is characterized by a positive radix $\beta$ and a vector $\Lambda$ of length $n$, $\Lambda = (\lambda_{n-1}, \lambda_{n-2}, \ldots, \lambda_0)$ where $\lambda_i \in \{-1, 1\}$. Such a system, with a standard digit set, $\{0, 1, \ldots, \beta - 1\}$, can be identified by the triplet $\langle n, \beta, \Lambda \rangle$. The value $X$ of an $n$-tuple $(x_{n-1}, x_{n-2}, \ldots, x_0)$ in the system $\langle n, \beta, \Lambda \rangle$ is given by

$$X = \sum_{i=0}^{n-1} \lambda_i x_i \beta^i. \tag{2.3}$$

The multiplying factor $\lambda_i$ allows us to select between the two possible weights $\beta^i$ and $-\beta^i$, individually, for every digit position $i$.

For any given radix $\beta$ there are in this class $2^n$ distinct number systems corresponding to the different values of $\Lambda$. Among them is the positive-radix number system, for which $\lambda_i = +1$ for every $i$, and the negative-radix number system, for which $\lambda_i = (-1)^i$ for every $i$. Also included is the radix-complement number system with $x_{n-1}$ as a "true" sign digit (i.e., $x_{n-1} \in \{0, 1\}$) and the characterizing vector $\Lambda = (-1, 1, 1, \ldots, 1)$.

We will now examine some of the properties of this general class of number systems. Let $P$ and $N$ denote the largest and smallest representable integers, respectively, in the general system $\langle n, \beta, \Lambda \rangle$. The digits of $P = (p_{n-1}, p_{n-2}, \ldots, p_0)$ satisfy
22
2. Unconventional FlXed-Radlx Number Systems
2.3 Signed-Digit Number Systems
23
{ 11-1
p, = 0
if >.j = + 1
oth('rwi
i = 0, 1, . . . , n - 1
Th(, compl"IIII'lIt X of II llulIIbpr X in a Rys't'm < 71, {3, A > is ,h,tilJ{'(l
ha..'i4'd 011 till> digit ("()lIJplf'IOf'lIt of .1:, whirh is .£; = (fJ - 1) - :r i. as follows:
A mor(' rOllvpnif'nt. (''Cprpssicl!I for P, is P. = teA; + 1)(/ -I), ha.';('I1 (Ill whirh tht'
\'Dille of till' '1-tllpll' (p"-..I'n-2,... Po) i&
n-I n-I
X = L X;..\i{j' = L >',(/1
1-=0 ia:O
n-I
1)/3i - L ..\,.c, , I' = Q - X
iO
(2.7)
II PllrI',
r =
n-I 1 [ "-I "-I ]
L (A, + 1)({3 - l)f;I' = 2 L >'i{j3 - 1)1 1 ' + L(I1- 1)1 1 '
icO ,.0 ,o
.!.IQ + (,j" - 1») (2.4)
2
-X = X - (J = X + (-Q). (2.8)
III other words, the ddditive ill\'('rsc of ..\ can bp fonnf'fl by addiug th(' ndditiv('
illVl'rse of Q to thl' mmplpJI1l'nt X .
=
EXlimple 2.4
III thf> two's cOlllpll'l11('nt Syst'lII, Q = -/lIp and 'herefor... -x = X +
""}. In thl' upgd-hinary Systl'JU, Q = (. ",1,1.1,1) = -(... ,n,1.0, 1) nnd
h('llce, Q = (... ,0.1. 0,1). Tbe additiv(' invpl'SC uf (01011)_2 = (-9ho,
for example, it;
where $Q$ is the value of the $n$-tuple $(\beta-1, \beta-1, \ldots, \beta-1)$ in $\langle n, \beta, \Lambda \rangle$. As will become clear later on, $Q$ is a very significant quantity. Similarly, the digits of $N = (y_{n-1}, y_{n-2}, \ldots, y_0)$, the smallest representable number, satisfy

$$y_i = \begin{cases} \beta - 1 & \text{if } \lambda_i = -1 \\ 0 & \text{otherwise} \end{cases} \qquad i = 0, 1, \ldots, n-1$$

and its value is

$$N = \sum_{i=0}^{n-1} \tfrac{1}{2}(\lambda_i - 1)(\beta - 1)\beta^i = \tfrac{1}{2}\left[Q - (\beta^n - 1)\right]. \tag{2.5}$$

The number of integers in the range $N \le X \le P$ is $P - N + 1 = \beta^n$, and the range is, in general, asymmetric. A measure of the asymmetry can be the difference between the absolute values of the largest and smallest numbers:

$$P - |N| = P + N = Q. \tag{2.6}$$

Example 2.3
The negative-radix system for which $\Lambda = (\ldots, -1, +1, -1, +1)$ has an asymmetric range, and for an even value of $n$ there are $\beta$ times as many negative numbers as positive ones. If we prefer to have more positive numbers, we can instead use the system $\langle n, \beta, \Lambda = (\ldots, +1, -1, +1, -1) \rangle$.
Two of the binary systems are nearly symmetric: the two's complement system, $\langle n, \beta = 2, \Lambda = (-1, 1, 1, \ldots, 1) \rangle$, for which $P + N = Q = -ulp$, and the system $\langle n, \beta = 2, \Lambda = (+1, -1, -1, \ldots, -1) \rangle$, for which $P + N = Q = +ulp$. □

$$\begin{array}{rcccccl}
 & 16 & -8 & +4 & -2 & +1 & \\
\overline{X} & 1 & 0 & 1 & 0 & 0 & \\
-Q & 1 & 0 & 1 & 0 & 1 & \\ \hline
 & 1 & 1 & 0 & 0 & 1 & = 9_{10}
\end{array}$$

where the addition is performed according to the rules of adding nega-binary numbers. This result can be verified by adding the original number to its additive inverse. Again following the rules of nega-binary addition,
$(01011)_{-2} + (11001)_{-2} = (00000)_{-2}$. □

The additive inverse may be employed in subtraction by using the equation

$$X - Y = X + \overline{Y} + (-Q). \tag{2.9}$$

Applying this equation may require two add operations with carry propagation, which is time-consuming. An alternate expression is

$$X - Y = \overline{\overline{X} + Y}. \tag{2.10}$$

Two digit-complement operations, and only one addition with carry propagation, are required here.
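The range quantities defined above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `value`, `extremes`, and `complement` are illustrative assumptions, and digit vectors are assumed to be given most significant digit first.

```python
def value(digits, beta, signs):
    """Value of an MSB-first digit vector in <n, beta, Lambda>.

    `signs` holds the lambda_i (+1 or -1), also MSB first (assumed layout).
    """
    n = len(digits)
    return sum(s * d * beta ** (n - 1 - k)
               for k, (d, s) in enumerate(zip(digits, signs)))


def extremes(beta, signs):
    """Return (P, N, Q): largest value, smallest value, value of (beta-1, ..., beta-1)."""
    top = beta - 1
    p = value([top if s == 1 else 0 for s in signs], beta, signs)
    n_min = value([top if s == -1 else 0 for s in signs], beta, signs)
    q = value([top] * len(signs), beta, signs)
    return p, n_min, q


def complement(digits, beta):
    """Digit-wise complement: each digit x_i becomes (beta - 1) - x_i."""
    return [beta - 1 - d for d in digits]


if __name__ == "__main__":
    nega_binary = [1, -1, 1, -1, 1]          # weights 16, -8, +4, -2, +1
    p, n_min, q = extremes(2, nega_binary)
    print(p, n_min, q)                       # P + N equals Q, P - N + 1 equals 2**5
    x = [0, 1, 0, 1, 1]
    print(value(x, 2, nega_binary), value(complement(x, 2), 2, nega_binary))
```

Since a number and its digit-wise complement always add up to $Q$, the complement has the value $Q - X$, which is the identity that Equations (2.9) and (2.10) rely on.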
2.3 SIGNED-DIGIT NUMBER SYSTEMS
In all fixed-radix systems that we have examined so far, the digit set has been restricted to $\{0, \ldots, r-1\}$. However, we can allow the following digit set:

$$x_i \in \{\overline{(r-1)}, \overline{(r-2)}, \ldots, \bar{1}, 0, 1, \ldots, (r-1)\}, \tag{2.11}$$

where $\bar{i}$ equals $-i$ and not $(r-1) - i$ as before. We use here the same notation as before since this is done commonly in the technical literature. Each digit is either positive or negative, so there is no need for a separate sign digit. The resulting number system is called the signed-digit (SD) system.
Example 2.5
For $r = 10$ the allowed digits are $\{\bar{9}, \bar{8}, \ldots, \bar{1}, 0, 1, \ldots, 9\}$ and, if $n = 2$, the range is $\bar{9}\bar{9} \le X \le 99$, which includes 199 numbers. However, with two digits $(x_1, x_0)$ each having 19 possibilities, there are $19^2 = 361$ representations, and hence some numbers have more than one representation. The number system is therefore redundant. For example, $(01) = (1\bar{9}) = 1$; $(0\bar{2}) = (\bar{1}8) = -2$. The representation of 0, however, is unique and so is the representation of 10. Out of the 361 representations, $361 - 199 = 162$ are redundant, and thus there is 81% redundancy. The reader can verify that each number in this range has at most two representations. □

As will become apparent later on, adding some redundancy in a number system can be very beneficial. On the other hand, a high level of redundancy might be too costly, since a larger digit set requires a larger number of bits to represent each digit. We may reduce the amount of redundancy by restricting the digit set to

$$x_i \in \{\bar{a}, \overline{a-1}, \ldots, \bar{1}, 0, 1, \ldots, a\}$$

with

$$\left\lceil \frac{r-1}{2} \right\rceil \le a \le r - 1 \tag{2.12}$$

where the ceiling $\lceil x \rceil$ of a number $x$ is the smallest integer that is larger than or equal to $x$. At least $r$ different digits are needed to represent a number in a radix $r$ system, and with $\bar{a} \le x_i \le a$ we have $2a + 1$ digits. Therefore, the inequality $2a + 1 \ge r$ must be satisfied, and the lower bound in inequality (2.12) follows.

Example 2.6
For $r = 10$ the range of $a$ is $5 \le a \le 9$. If we select $a = 6$, then for $n = 2$ there are 133 numbers in the range $\bar{6}\bar{6} \le X \le 66$. Each digit has 13 possible values for a total of $13^2 = 169$ representations; i.e., there is a 27% redundancy. Notice that 1 now has only one representation, namely $(01)$. $(1\bar{9})$ is not a valid representation, since $\bar{9}$ is illegal. However, 4 still has two representations: $(04)$ and $(1\bar{6})$. □

SD representations are useful when developing algorithms for multiplication and division, as described in Chapters 6 and 7. However, the original motivation for introducing SD numbers was to eliminate carry propagation chains in addition and subtraction. Consider the following operation:

$$(x_{n-1}, \ldots, x_0) \pm (y_{n-1}, \ldots, y_0) = (s_{n-1}, \ldots, s_0)$$

We want to break the carry chains by having the sum digit $s_i$ depend only on the four operand digits $x_i$, $y_i$, $x_{i-1}$ and $y_{i-1}$. If this can be achieved, then the addition time becomes independent of the length of the operands. An addition algorithm that can achieve this independence consists of two steps:

Step 1: Compute an interim sum $u_i$ and a carry digit $c_i$:

$$u_i = x_i + y_i - r c_i$$

where

$$c_i = \begin{cases} 1 & \text{if } (x_i + y_i) \ge a \\ \bar{1} & \text{if } (x_i + y_i) \le \bar{a} \\ 0 & \text{if } |x_i + y_i| < a \end{cases} \tag{2.13}$$

Step 2: Calculate the final sum $s_i = u_i + c_{i-1}$.

Example 2.7
If we select $a = 6$ for $r = 10$, then $x_i \in \{\bar{6}, \ldots, 0, 1, \ldots, 6\}$. Step 1 in the above addition algorithm then becomes $u_i = (x_i + y_i) - 10 c_i$ and

$$c_i = \begin{cases} 1 & \text{if } (x_i + y_i) \ge 6 \\ \bar{1} & \text{if } (x_i + y_i) \le \bar{6} \\ 0 & \text{otherwise} \end{cases}$$

Thus, instead of performing the addition of two decimal numbers, such as 4536 and 1466, in the conventional decimal number system

$$\begin{array}{rcccc}
 & 4 & 5 & 3 & 6 \\
+ & 1 & 4 & 6 & 6 \\ \hline
 & 6 & 0 & 0 & 2
\end{array}$$

in which the carry propagates from the least significant digit to the most significant one, we have no carry propagation chain in

$$\begin{array}{rcccccl}
 & & 4 & 5 & 3 & 6 & \\
+ & & 1 & 4 & 6 & 6 & \\ \hline
 & 0 & 1 & 1 & 1 & & c_i \\
+ & & 5 & \bar{1} & \bar{1} & 2 & u_i \\ \hline
 & & 6 & 0 & 0 & 2 & s_i
\end{array}$$

Note that the carry bits were shifted to the left to simplify the execution of the second step of the algorithm. □
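The two-step addition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sd_add` and `sd_value` are hypothetical, digits are stored most significant first, and the digit bound `a` is assumed to be large enough that the second step creates no new carry (the sketch uses $r = 10$, $a = 6$ as in Example 2.7).

```python
def sd_value(digits, r=10):
    """Integer value of an MSB-first signed-digit vector in radix r."""
    total = 0
    for d in digits:
        total = total * r + d
    return total


def sd_add(x, y, r=10, a=6):
    """Two-step SD addition of equal-length MSB-first digit lists.

    Step 1 computes the carry c_i and interim sum u_i = x_i + y_i - r*c_i.
    Step 2 forms s_i = u_i + c_{i-1}; no carry travels further than one position.
    Returns n + 1 digits (the extra leading digit is the last carry).
    """
    assert len(x) == len(y)
    carries, interim = [], []
    for xi, yi in zip(x, y):
        t = xi + yi
        c = 1 if t >= a else -1 if t <= -a else 0
        carries.append(c)
        interim.append(t - r * c)
    n = len(x)
    result = [carries[0]]                      # most significant digit
    for k in range(n):
        # In MSB-first order, position i-1 is the element to the right.
        carry_in = carries[k + 1] if k + 1 < n else 0
        result.append(interim[k] + carry_in)
    return result


if __name__ == "__main__":
    print(sd_add([4, 5, 3, 6], [1, 4, 6, 6]))  # [0, 6, 0, 0, 2]
```

Every `result` digit depends only on the operand digits at its own position and at the position to its right, so all positions could be computed in parallel in hardware; the loop here is sequential only for readability.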
In the last example the operands were selected so that they could be viewed either as conventional decimal numbers or as SD decimal numbers with the digit set $\{\bar{6}, \ldots, 0, 1, \ldots, 6\}$. In general, a conventional decimal number may use digits like 7, 8, and 9, which are out of the allowed digit set. The previous addition algorithm can be used for converting a conventional decimal number to SD form by artificially considering each digit as the sum $(x_i + y_i)$ in Equation (2.13).

Example 2.8
To find the SD representation of the decimal number 27956 we apply the previous algorithm, resulting in

$$\begin{array}{l|ccccc}
x_i + y_i & 2 & 7 & 9 & 5 & 6 \\
c_i & 0 & 1 & 1 & 0 & 1 \\
u_i & 2 & \bar{3} & \bar{1} & 5 & \bar{4} \\
s_i & 3 & \bar{2} & \bar{1} & 6 & \bar{4}
\end{array}$$

To convert a number in SD representation to conventional representation one can subtract the digits with negative weight from the digits with positive weight. For $3\bar{2}\bar{1}6\bar{4}$ we obtain

$$\begin{array}{rccccc}
 & 3 & 0 & 0 & 6 & 0 \\
- & 0 & 2 & 1 & 0 & 4 \\ \hline
 & 2 & 7 & 9 & 5 & 6
\end{array}$$
□

To guarantee that no new carry will be generated, the sum digit $s_i$, calculated from $u_i + c_{i-1}$, must satisfy $|s_i| \le a$. Since $|c_{i-1}| \le 1$, the condition $|u_i| \le a - 1$ has to be satisfied for all possible values of $x_i$ and $y_i$. For example, the largest value that $x_i + y_i$ can assume is $2a$, for which $c_i = 1$ and $u_i = 2a - r$. The inequality $u_i = 2a - r \le a - 1$ is clearly satisfied, since $a \le r - 1$. However, if $x_i + y_i = a$, which is the smallest value for which $c_i$ is still 1, then $u_i = a - r < 0$. Substituting $|u_i| = r - a$ into $|u_i| \le (a - 1)$ yields the inequality $2a \ge r + 1$. Hence, the selected digit set must satisfy

$$\left\lceil \frac{r+1}{2} \right\rceil \le a \le r - 1. \tag{2.14}$$

We have considered so far only the two extreme values of $x_i + y_i$ for which $c_i = 1$. However, the reader can verify that for all other possible values of $x_i + y_i$, the condition $|u_i| \le a - 1$ is satisfied if $a \le r - 1$.

Example 2.9
SD decimal numbers must satisfy $a \ge 6$ to guarantee that no new carries will be generated in the previous algorithm. □

2.4 BINARY SD NUMBERS

For $r = 2$, there is only one possible digit set; namely, $\{\bar{1}, 0, 1\}$. In other words, $a$ must equal 1. The interim sum and carry in the addition algorithm from Section 2.3 are

$$u_i = (x_i + y_i) - 2 c_i$$

and

$$c_i = \begin{cases} 1 & \text{if } (x_i + y_i) \ge 1 \\ \bar{1} & \text{if } (x_i + y_i) \le \bar{1} \\ 0 & \text{if } (x_i + y_i) = 0. \end{cases}$$

These rules are summarized in Table 2.1. This table does not include the combinations $x_i y_i = 10$, $x_i y_i = \bar{1}0$, and $x_i y_i = \bar{1}1$, since the addition $x_i + y_i$ is a commutative operation.

$$\begin{array}{l|cccccc}
x_i y_i & 00 & 01 & 0\bar{1} & 11 & \bar{1}\bar{1} & 1\bar{1} \\ \hline
c_i & 0 & 1 & \bar{1} & 1 & \bar{1} & 0 \\
u_i & 0 & \bar{1} & 1 & 0 & 0 & 0
\end{array}$$

TABLE 2.1 The rules for adding binary SD numbers.

Note that in the binary case the condition $a \ge \lceil (r+1)/2 \rceil = 2$ cannot be satisfied, and consequently there is no guarantee that a new carry will not be generated in the second step of the algorithm.
Still, if the operands to be added do not contain the digit $\bar{1}$, new carries will not be generated. Consider, for example, the addition of the following two numbers, which, in the conventional representation, will generate a carry that will propagate from the least significant position to the most significant position:

$$\begin{array}{rcccccl}
 & & 1 & 1 & 1 & 1 & \\
+ & & 0 & 0 & 0 & 1 & \\ \hline
 & 1 & 1 & 1 & 1 & & c_i \\
 & & \bar{1} & \bar{1} & \bar{1} & 0 & u_i \\ \hline
 & 1 & 0 & 0 & 0 & 0 & s_i
\end{array}$$

Here, no carry propagation chain exists. However, if SD numbers with $\bar{1}$ digits are added, new carries may occur. For example, if $x_{i-1} y_{i-1} = 01$, then $c_{i-1} = 1$; and if $x_i y_i = 0\bar{1}$, then $u_i = 1$, yielding $s_i = u_i + c_{i-1} = 1 + 1$. Thus, a new carry is generated, as illustrated in the following addition:
$$\begin{array}{rcccccccl}
 & & 0 & \bar{1} & 1 & \bar{1} & 1 & 1 & \\
+ & & 1 & 0 & 0 & \bar{1} & 0 & 1 & \\ \hline
 & 1 & \bar{1} & 1 & \bar{1} & 1 & 1 & & c_i \\
 & & \bar{1} & 1 & \bar{1} & 0 & \bar{1} & 0 & u_i \\ \hline
 & & * & * & * & 1 & 0 & 0 & s_i
\end{array}$$

The stars indicate positions where new carries are generated and must be allowed to propagate.
Examining the rules in Table 2.1, one can verify that the combination $c_{i-1} = u_i = 1$ occurs when $x_i y_i = 0\bar{1}$ and $x_{i-1} y_{i-1}$ equals either 11 or 01. We can avoid setting $u_i = 1$ in these cases by selecting $c_i = 0$ and therefore making $u_i$ equal $\bar{1}$. We should not, however, change the entry for $x_i y_i = 0\bar{1}$ in Table 2.1 to read $c_i = 0$ and $u_i = \bar{1}$, since for $x_{i-1} y_{i-1} = \bar{1}\bar{1}$, which results in $c_{i-1} = \bar{1}$, we still have to set $c_i = \bar{1}$ and $u_i = 1$. Similarly, the combination $c_{i-1} = u_i = \bar{1}$ occurs when $x_i y_i = 01$ and $x_{i-1} y_{i-1}$ equals either $\bar{1}\bar{1}$ or $0\bar{1}$. We can avoid setting $u_i = \bar{1}$ by selecting in these cases (and in these cases only) $c_i = 0$ and therefore $u_i = 1$. In summary, we can ensure that no new carries will be generated by examining the two bits to the right, $x_{i-1} y_{i-1}$, when determining $u_i$ and $c_i$, arriving at the rules shown in Table 2.2. Observe that we can still calculate $c_i$ and $u_i$ for all bit positions in parallel.

$$\begin{array}{l|ccccccc}
x_i y_i & 00 & 01 & 01 & 0\bar{1} & 0\bar{1} & 11 & \bar{1}\bar{1} \\
x_{i-1} y_{i-1} & - & \text{neither is } \bar{1} & \text{at least one is } \bar{1} & \text{neither is } \bar{1} & \text{at least one is } \bar{1} & - & - \\ \hline
c_i & 0 & 1 & 0 & 0 & \bar{1} & 1 & \bar{1} \\
u_i & 0 & \bar{1} & 1 & \bar{1} & 1 & 0 & 0
\end{array}$$

TABLE 2.2 Modified rules for adding binary SD numbers [8].

Example 2.10
Repeating the previous example we obtain

$$\begin{array}{rcccccccl}
 & & 0 & \bar{1} & 1 & \bar{1} & 1 & 1 & \\
+ & & 1 & 0 & 0 & \bar{1} & 0 & 1 & \\ \hline
 & 0 & 0 & 0 & \bar{1} & 1 & 1 & & c_i \\
 & & 1 & \bar{1} & 1 & 0 & \bar{1} & 0 & u_i \\ \hline
 & & 1 & \bar{1} & 0 & 1 & 0 & 0 & s_i
\end{array}$$

Note that direct summation of the two operands will result in $1\bar{1}1\bar{1}00$. This, and also 010100, are equivalent, and all represent the value $20_{10}$. □

Binary SD numbers are particularly useful in the development of fast algorithms for multiplication and division, which are discussed in Chapters 6 and 7. In these cases we will be interested in minimal SD representations; i.e., representations that include a minimal number of nonzero digits. Nonzero digits will correspond to add/subtract operations, and the number of these should be minimized, while zero digits will correspond to shift-only operations.

Example 2.11
For $X = 7$ we have the following representations:

$$\begin{array}{cccc}
8 & 4 & 2 & 1 \\ \hline
0 & 1 & 1 & 1 \\
1 & \bar{1} & 1 & 1 \\
1 & 0 & \bar{1} & 1 \\
1 & 0 & 0 & \bar{1}
\end{array}$$

Out of these, $100\bar{1}$ is the minimal representation. The canonical recoding algorithm generates minimal SD representations of given binary numbers and is presented in Chapter 6. □

Eliminating carry propagation when adding binary numbers can speed up operations like multiplication and division, whose execution usually includes a large number of add/subtract operations. Two concerns arise when SD representations of binary numbers are used in an arithmetic unit. The first one is the exact encoding of three values, namely 0, 1 and $-1$, using binary signals. The second one is the need to convert the result in SD representation to its conventional two's complement representation. Out of the $4! = 24$ possible ways to encode the three values of a binary signed digit $x$ using two bits, $x^h$ and $x^l$ ($h$ and $l$ for high and low, respectively), only nine are distinct encodings under permutation and logical negation. Out of these nine encodings two have been used in practice. These are shown in Table 2.3.

$$\begin{array}{c|c|c}
 & \text{Encoding 1} & \text{Encoding 2} \\
x & x^h\, x^l & x^h\, x^l \\ \hline
0 & 00 & 00 \\
1 & 01 & 01 \\
\bar{1} & 10 & 11
\end{array}$$

TABLE 2.3 Two encodings for binary SD numbers.
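The carry-free addition rules of Table 2.2 can be exercised with a short sketch. The function names `bsd_add` and `bsd_value` are illustrative assumptions, digit lists are most significant digit first, the position to the right of the least significant digit is taken to hold zeros, and the combination $1\bar{1}$ (not listed in Table 2.2) is handled as in Table 2.1 with $c_i = u_i = 0$.

```python
def bsd_value(digits):
    """Integer value of an MSB-first binary signed-digit vector."""
    total = 0
    for d in digits:
        total = 2 * total + d
    return total


def bsd_add(x, y):
    """Add two equal-length binary SD numbers (digits in {-1, 0, 1}, MSB first).

    c_i and u_i are chosen by looking at x_i + y_i and at whether either
    digit of the position to the right is -1, as in Table 2.2.
    Returns n + 1 digits, each of which is again in {-1, 0, 1}.
    """
    n = len(x)
    assert len(y) == n
    c = [0] * n
    u = [0] * n
    for k in range(n):
        t = x[k] + y[k]
        right = (x[k + 1], y[k + 1]) if k + 1 < n else (0, 0)
        right_has_minus_one = -1 in right
        if t == 2:
            c[k], u[k] = 1, 0
        elif t == -2:
            c[k], u[k] = -1, 0
        elif t == 1:
            # The incoming carry cannot be +1 if a -1 sits to the right.
            c[k], u[k] = (0, 1) if right_has_minus_one else (1, -1)
        elif t == -1:
            c[k], u[k] = (-1, 1) if right_has_minus_one else (0, -1)
        # t == 0 leaves c_i = u_i = 0
    return [c[0]] + [u[k] + (c[k + 1] if k + 1 < n else 0) for k in range(n)]


if __name__ == "__main__":
    x = [0, -1, 1, -1, 1, 1]    # -9
    y = [1, 0, 0, -1, 0, 1]     # 29
    print(bsd_add(x, y))        # [0, 1, -1, 0, 1, 0, 0], i.e. 20
```

The operands are those of Example 2.10, and the printed digits match its sum row with one extra leading zero for the final carry.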
Encoding 2 can be viewed as a two's complement representation of the signed digit. Encoding 1 is sometimes preferable since it satisfies the simple relation

$$x = x^l - x^h,$$

and consequently, the combination 11 has a valid numerical value of 0. This encoding also simplifies the conversion of a number from the SD to the two's complement representation. This conversion is done by subtracting the sequence $x^h_{n-1}, x^h_{n-2}, \ldots, x^h_0$ from the sequence $x^l_{n-1}, x^l_{n-2}, \ldots, x^l_0$ using two's complement arithmetic.
There exists another conversion algorithm whose implementation requires a circuit simpler than a complete binary adder [7], [9]. In this algorithm the binary signed digits are examined from right to left, one digit at a time. The algorithm removes all occurrences of $\bar{1}$ digits and "forwards" the negative sign to the most significant bit, the only bit with a negative weight in the two's complement representation. The rightmost $\bar{1}$ digit is replaced by a 1 and the negative sign is forwarded to the left, replacing 0's by 1's until a 1 is reached, which "consumes" the negative sign and is replaced by a 0. If a 1 is not reached, then the 0 in the most significant position is turned into a 1, becoming the negative sign bit of the two's complement representation. If a second $\bar{1}$ is encountered before a 1 is, it is replaced by a 0 and the forwarding of the negative sign continues. The negative sign is forwarded with the aid of a "borrow" bit which equals 1 as long as a $\bar{1}$ is being forwarded, and equals 0 otherwise. The rules of this algorithm are summarized in Table 2.4, where $y_i$ is the $i$th digit of the SD number, $z_i$ is the $i$th bit of the corresponding two's complement representation, $c_i$ is the previous "borrow" and $c_{i+1}$ is the next "borrow." For the least significant digit we assume $c_0 = 0$. This algorithm performs the inverse operation to that performed by the canonical recoding algorithm presented in Section 6.1. It satisfies the arithmetic equation

$$y_i - c_i = z_i - 2 c_{i+1}.$$

By rearranging terms in the above equation we can obtain the arithmetic equation satisfied by the canonical recoding algorithm:

$$z_i + c_i = y_i + 2 c_{i+1}.$$

Example 2.12
To convert the SD representation of the number $-10_{10}$ to two's complement we apply the previous algorithm, resulting in

$$\begin{array}{l|ccccc}
y_i & 0 & \bar{1} & 0 & \bar{1} & 0 \\
c_i & 1 & 1 & 1 & 0 & 0 \\
z_i & 1 & 0 & 1 & 1 & 0
\end{array}$$

Since the range of representable numbers in the SD method is almost double that of the two's complement method, an $n$-digit SD number must be converted to an $(n+1)$-bit two's complement representation, as illustrated below.

$$\begin{array}{l|ccccccc}
y_i & & 0 & 1 & 0 & 1 & 0 & \bar{1} \\
c_i & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\
z_i & & 0 & 1 & 0 & 0 & 1 & 1
\end{array}$$

Without the extra bit position the number 19 would be converted to $-13$. □

The use of Encoding 2 in Table 2.3 also has some advantages. Since the value of the operand digit $x_i$ is given by $-2x_i^h + x_i^l$, there are two bits, namely $x_i^l$ and $x_{i-1}^h$, in two adjacent digit positions which have the same weight. We can then regroup the bits $x_i^l$ and $x_{i-1}^h$ to form a new digit $\hat{x}_i$. Performing such equal-weight grouping [6] allows us to derive new addition rules for $c_i$ and $u_i$ which may lead to simpler implementations.
Note that in Table 2.5 the bits $x^l_{i-1}$ and $y^l_{i-1}$ are used only when $\hat{x}_i \hat{y}_i = 01$ but not when $\hat{x}_i \hat{y}_i = 0\bar{1}$. This may lead to a simpler implementation compared to Table 2.2, where the information about the previous digits is required also for the case $x_i y_i = 0\bar{1}$.

$$\begin{array}{cc|cc}
y_i & c_i & z_i & c_{i+1} \\ \hline
0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 \\
\bar{1} & 0 & 1 & 1 \\
\bar{1} & 1 & 0 & 1
\end{array}$$
i,lit 00 01 (II 111 OJ 11 11
.r" y" both hol h t,1 lr'lt, -
1 . I -
,
IlrI' II "rr'O on I' iR 1
:r. I lIr- I - Ilt. h'lt huth - -
\Jill' iH I lirl' 0
"1 0 I 0 I U I I
() 1 1 J 1 () 0
/I.
TABLE 2.4 An algorithm for converting SD to two's complement representation.
TABLE 2.5 Rules for adding binary SD numbers with equal-weight grouping [6].
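The right-to-left conversion rules of Table 2.4 can be written as a small lookup. This is a sketch under stated assumptions, not the circuit from [7] or [9]: the names `CONVERSION_RULES`, `sd_to_twos_complement`, and `twos_complement_value` are illustrative, digits are stored most significant first, and a leading 0 digit is appended so that an $n$-digit SD number yields $n + 1$ bits.

```python
# (y_i, c_i) -> (z_i, c_{i+1}), transcribed from Table 2.4
CONVERSION_RULES = {
    (0, 0): (0, 0),
    (0, 1): (1, 1),
    (1, 0): (1, 0),
    (1, 1): (0, 0),
    (-1, 0): (1, 1),
    (-1, 1): (0, 1),
}


def sd_to_twos_complement(y):
    """Convert an MSB-first binary SD number to an (n+1)-bit two's complement list."""
    borrow = 0                              # c_0 = 0
    bits = []
    for digit in reversed([0] + list(y)):   # scan right to left, extra 0 on top
        z, borrow = CONVERSION_RULES[(digit, borrow)]
        bits.append(z)
    bits.reverse()
    return bits


def twos_complement_value(bits):
    """Integer value of an MSB-first two's complement bit list."""
    value = -bits[0]
    for b in bits[1:]:
        value = 2 * value + b
    return value


if __name__ == "__main__":
    print(sd_to_twos_complement([0, -1, 0, -1, 0]))   # [1, 1, 0, 1, 1, 0], i.e. -10
    print(sd_to_twos_complement([1, 0, 1, 0, -1]))    # [0, 1, 0, 0, 1, 1], i.e. 19
```

The first call reproduces Example 2.12 with one additional sign bit; the second shows why the extra bit is needed, since dropping the leading 0 would leave 10011, which reads as $-13$.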
2.5 EXERCISES

2.1. Find the fixed-point representations of the two values $(13.25)_{10}$ and $(-13.25)_{10}$ in the following radix-$r$ number systems. Each number consists of $k$ integer digits and $m$ fraction digits.
(a) Signed-digit representation with $r = 4$, $k = 4$, $m = 2$, and the digit set $\{\bar{2}, \bar{1}, 0, 1, 2\}$. At least one negative digit must appear in the representation.
(b) A nega-decimal system with $r = -10$, $k = 3$, and $m = 2$.
2.2. Given the value $(-14)_{10}$, the word length $n = 6$, and the digit set $\{\bar{1}, 0, 1\}$, find all the possible six-digit SD representations of the given value. Indicate the minimal representation. Is this representation unique?
2.3. Verify that the representation of 0 in an SD number system is unique.
2.4. Prove that any fixed-radix number system $\langle n, \beta, \Lambda \rangle$ is nonredundant.
2.5. Calculate the difference between the two numbers 0010 and 0101 (in the conventional binary system) in two ways: first, in the traditional way by adding the two's complement of 0101 to 0010, then by using Equation (2.10). Is there an advantage to using one of the two methods?
2.6. Show that if $a \ge \lceil (r+1)/2 \rceil$ no new carry will be generated when adding SD numbers.
2.7. Find all the values of the radix $r$ for which the two-step algorithm for adding SD numbers will not generate new carries if and only if the maximum redundancy is followed.
2.8. Show that in the negative-radix system with $\beta > 2$ the additive inverse of $Q$ is $-Q = (\ldots, 2, 2, 2, 1)$. Find the additive inverse of 019 in the nega-decimal system. Verify your result by adding it to 19.
2.9. Can modified rules such as those in Table 2.2 be derived for $r > 2$ so that less redundancy (e.g., $a = \lceil (r-1)/2 \rceil$) will be needed?
2.10. Are the modified rules for adding binary SD numbers in Table 2.2 unique? In other words, can you find another set of rules that will guarantee that no new carries will be generated in the second step of the addition?
2.11. An imaginary-radix number system can be defined as follows. Let the radix $r$ have the form $r = j\sqrt{\beta}$ where $j = \sqrt{-1}$ and $\beta$ is a positive integer. Let the digit set be $\{0, 1, \ldots, \beta - 1\}$. Show that all even-position digits represent a real number $Y$ in the negative-radix $(-\beta)$ system and all odd-position digits represent an imaginary number $j\sqrt{\beta}\,Z$ in the same negative-radix number system. As a result, we can write $X = Y + j\sqrt{\beta}\,Z$, representing a complex number by a single sequence. Therefore, instead of performing four multiplications and two additions,

$$(Y_1 + j\sqrt{\beta}\,Z_1) \times (Y_2 + j\sqrt{\beta}\,Z_2) = (Y_1 Y_2 - \beta Z_1 Z_2) + j\sqrt{\beta}\,(Y_1 Z_2 + Y_2 Z_1)$$

we can directly multiply two complex numbers $X_1$ and $X_2$. Will the use of this imaginary-radix number system speed up the multiplication of complex numbers?
2.12. Write the Boolean equations for a circuit implementing the conversion algorithm in Table 2.4 using Encoding 1 from Table 2.3. Point out the similarities between the resulting circuit and a full-adder (FA, see Section 5.1). Discuss the possibility of employing any speedup techniques used for binary addition.
2.13. Show that the algorithm, summarized in Table 2.4, for converting an SD to a two's complement representation of a binary number, can be performed by forming the sequence $0, |y_{n-1}|, |y_{n-2}|, \ldots, |y_1|, |y_0|$ and then performing a bit-wise exclusive-OR operation with the sequence $c_n, c_{n-1}, \ldots, c_1, 0$, which is obtained according to the rules in Table 2.4. This version of the conversion algorithm was presented in [7].
2.14. A hybrid SD number system was presented in [5]. In this system only some digit positions are signed while the rest remain unsigned. Develop two sets of rules for adding the corresponding digits of two hybrid numbers, similar to those in Table 2.2. The first table will indicate the rules for selecting the carry $c_i$ and intermediate sum $u_i$ for two signed digits $x_i$ and $y_i$ with $a_{i-1}$ and $b_{i-1}$ being the lower order two unsigned bits. The second table will indicate the rules for selecting the carry $c_i$ and intermediate sum $e_i$ for two unsigned digits $a_i$ and $b_i$ with an incoming carry $c_{i-1}$ that can assume the values 0, 1, and $-1$. Show that if $d$ is the longest distance between neighboring signed digits then the maximum carry propagation chain is of length $(d + 1)$.
2.15. Show that the addition rules in Table 2.5 guarantee that no new carries will be generated in the second step of the addition.

2.6 REFERENCES

[1] A. AVIZIENIS, "Signed-digit number representations for fast parallel arithmetic," IRE Trans. on Elec. Comput., EC-10 (Sept. 1961), 389-400.
[2] H. L. GARNER, "Number systems and arithmetic," in Advances in Computers, vol. 6, F. L. Alt and M. Rubinoff (Eds.), Academic, New York, 1965, pp. 131-194.
[3] I. KOREN and Y. MALINIAK, "On classes of positive, negative and imaginary radix number systems," IEEE Trans. on Computers, C-30 (May 1981), 312-317.
[4] B. PARHAMI, "Generalized signed-digit number systems: A unifying framework for redundant number representations," IEEE Trans. on Computers, 39 (Jan. 1990), 89-98.
[5] D. S. PHATAK and I. KOREN, "Hybrid signed-digit number systems: A unified framework for redundant number representations with bounded carry propagation chains," IEEE Trans. on Computers, 43 (August 1994), 880-891.
[6] D. S. PHATAK, T. GOFF, and I. KOREN, "Constant-time addition and simultaneous format conversion based on redundant binary representations," IEEE Trans. on Computers, 50 (2001).
[7] H. R. SRINIVAS and K. K. PARHI, "A fast VLSI adder architecture," IEEE J. of Solid-State Circuits, 27 (May 1992), 761-767.
[8] N. TAKAGI, H. YASUURA, and S. YAJIMA, "High-speed VLSI multiplication algorithm with a redundant binary addition tree," IEEE Trans. on Computers, C-34 (Sept. 1985), 789-796.
[9] S. M. YEN, C. S. LAIH, C. H. CHEN, and J. Y. LEE, "An efficient redundant-binary number to binary number converter," IEEE J. of Solid-State Circuits, 27 (Jan. 1992), 109-112.

3
SEQUENTIAL ALGORITHMS
FOR MULTIPLICATION
AND DIVISION
This chapter presents the basic sequential algorithms for multiplication, division, and square root extraction. Algorithms for high-speed multiplication are described in Chapter 6. Chapters 7 and 8 include algorithms for fast division and high-speed calculation of square roots.
3.1 SEQUENTIAL MULTIPLICATION
Let the multiplier and multiplicand be denoted by $X$ and $A$, respectively, with the following sequences of digits:

$$X = x_{n-1} x_{n-2} \cdots x_1 x_0, \qquad A = a_{n-1} a_{n-2} \cdots a_1 a_0$$

where $x_{n-1}$ and $a_{n-1}$ are the sign digits in either the signed-magnitude or the complement methods.
The sequential algorithm for multiplication consists of $n - 1$ steps, where in step $j$ the multiplier bit $x_j$ is examined and the product $x_j A$ is added to the previously accumulated partial product, denoted by $P^{(j)}$. The appropriate expression for this recursive procedure is

$$P^{(j+1)} = (P^{(j)} + x_j \cdot A) \cdot 2^{-1}; \qquad j = 0, 1, 2, \ldots, n-2 \tag{3.1}$$

where in the first step $P^{(0)} = 0$. Multiplying the sum $(P^{(j)} + x_j \cdot A)$ by $2^{-1}$ shifts it by one position to the right to align $P^{(j+1)}$ before adding the next product $x_{j+1} A$. This alignment is necessary, since the weight of $x_{j+1}$ is double that of
$x_j$. To prove that the above procedure calculates the product of $A$ and $X$, we repeatedly substitute into the recursive Equation (3.1), yielding

$$\begin{aligned}
P^{(n-1)} &= (P^{(n-2)} + x_{n-2} \cdot A) \cdot 2^{-1} \\
&= ((P^{(n-3)} + x_{n-3} \cdot A) \cdot 2^{-1} + x_{n-2} \cdot A) \cdot 2^{-1} = \cdots \\
&= (x_{n-2} 2^{-1} + x_{n-3} 2^{-2} + \cdots + x_0 2^{-(n-1)}) \cdot A \\
&= \Big(\sum_{j=0}^{n-2} x_j 2^{-(n-1-j)}\Big) \cdot A = 2^{-(n-1)} \Big(\sum_{j=0}^{n-2} x_j 2^j\Big) \cdot A
\end{aligned}$$

If both operands are positive (i.e., $x_{n-1} = a_{n-1} = 0$), the product $U$ is obtained from

$$U = 2^{n-1} \cdot P^{(n-1)} = \Big(\sum_{j=0}^{n-2} x_j 2^j\Big) \cdot A = X \cdot A \tag{3.2}$$

The result is a product consisting of $2(n - 1)$ bits for its magnitude. To prove this, note that the maximum value of $U$ is obtained when $A$ and $X$ assume their maximum value. Therefore,

$$U_{max} = (2^{n-1} - 1)(2^{n-1} - 1) = 2^{2n-2} - 2^n + 1 = 2^{2n-3} + (2^{2n-3} - 2^n + 1) \tag{3.3}$$

Since the last term in Equation (3.3) is positive for $n \ge 3$, the following inequality holds:

$$2^{2n-3} < U_{max} < 2^{2n-2}; \qquad n \ge 3 \tag{3.4}$$

Thus, $(2n - 2)$ bits are required to represent the value, producing a total of $(2n - 1)$ bits when added to the sign bit.
For signed-magnitude numbers we multiply the two magnitudes using the above algorithm and generate the sign of the result separately (it is positive if both operands have the same sign and negative otherwise). For two's and one's complement representations we should distinguish between multiplication with a negative multiplicand $A$ and multiplication with a negative multiplier $X$. If only the multiplicand is negative, there is no need to change the previous algorithm. We only have to add some multiple of a negative number that is represented in either two's or one's complement. This is illustrated in the next example.

Example 3.1
In the following multiply operation, the multiplicand $A$ is a negative number represented in the two's complement method, while the multiplier $X$ is positive. Both are four bits long and the final product therefore has seven bits, including the sign bit. In an arithmetic unit for 4-bit operands, all registers are four bits long, and consequently a double-length register is required for storing the final product. The vertical line in the table below separates the most significant half of the product, which can be stored in a single-length register (four bits long), from the least significant half, which can be stored in a second single-length register.

$$\begin{array}{lccccc|cccr}
A & & 1 & 0 & 1 & 1 & & & & -5 \\
X & \times & 0 & 0 & 1 & 1 & & & & 3 \\ \hline
P^{(0)} = 0 & & 0 & 0 & 0 & 0 & & & & \\
x_0 = 1 \Rightarrow \text{Add } A & + & 1 & 0 & 1 & 1 & & & & \\ \hline
 & & 1 & 0 & 1 & 1 & & & & \\
\text{Shift} & & 1 & 1 & 0 & 1 & 1 & & & \\
x_1 = 1 \Rightarrow \text{Add } A & + & 1 & 0 & 1 & 1 & & & & \\ \hline
 & & 1 & 0 & 0 & 0 & 1 & & & \\
\text{Shift} & & 1 & 1 & 0 & 0 & 0 & 1 & & \\
x_2 = 0 \Rightarrow \text{Shift only} & & 1 & 1 & 1 & 0 & 0 & 0 & 1 & -15
\end{array}$$

The three bits of the multiplier, $x_2$, $x_1$, and $x_0$, are examined one bit at a time, starting with the least significant bit $x_0$. An add-and-shift or shift-only operation is then performed accordingly. The final result is negative and is properly represented in two's complement. Note that the partial product bits in the least significant half do not participate in the add operation, and that all four bit positions in the first register (holding the most significant half of the final product) are utilized. However, only three bit positions in the second register are utilized, leaving the least significant bit position unused. This need not necessarily be the final arrangement. The three bits in the second register can afterwards be stored in the three rightmost positions, and the sign bit of the second register can then be set according to one of the following two possibilities: (1) Always set the sign bit to 0, irrespective of the sign of the product, since it is the least significant part of the result; (2) Set the sign bit equal to the sign bit of the first register. Another possible arrangement is to use all four bit positions in the second register for the four least significant bits of the product, use the rightmost two bit positions in the first register, and insert two copies of the sign bit into the remaining bit positions. □

The situation, however, is different when the multiplier is negative. Here, we consider each bit separately, and the sign bit (which has a negative weight) cannot be treated in the same way as the other bits. First consider two's complement numbers, which satisfy

$$X = -x_{n-1} \cdot 2^{n-1} + \tilde{X} \tag{3.5}$$

where $\tilde{X} = \sum_{j=0}^{n-2} x_j 2^j$.
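The recursion of Equation (3.1), as used in Example 3.1 with a negative multiplicand and a positive multiplier, can be followed literally with exact rational arithmetic. This is only a sketch: the names `sequential_multiply` and `twos_complement_bits` are illustrative assumptions, `fractions.Fraction` stands in for the right-shifted double-length register, and a negative multiplier is deliberately not handled here.

```python
from fractions import Fraction


def sequential_multiply(a, x, n=4):
    """Multiply a signed multiplicand `a` by a non-negative multiplier `x`.

    Follows P(j+1) = (P(j) + x_j * A) / 2 for j = 0 .. n-2, then rescales by
    2**(n-1) to obtain the product U. The sign bit x_{n-1} must be 0.
    """
    if not 0 <= x < 2 ** (n - 1):
        raise ValueError("multiplier must be non-negative and fit in n-1 bits")
    p = Fraction(0)                      # P(0) = 0
    for j in range(n - 1):
        x_j = (x >> j) & 1               # examine multiplier bit j
        p = (p + x_j * a) / 2            # add (or not), then shift right
    return int(p * 2 ** (n - 1))         # U = 2**(n-1) * P(n-1)


def twos_complement_bits(value, width):
    """Two's complement bit string of `value` using `width` bits."""
    return format(value & ((1 << width) - 1), f"0{width}b")


if __name__ == "__main__":
    u = sequential_multiply(-5, 3, n=4)
    print(u, twos_complement_bits(u, 7))   # -15 1110001
```

The seven-bit pattern printed for $-15$ agrees with the final row of the table in Example 3.1.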
If the sign bit of the multiplier in the previously presented procedure is ignored, then the final result $U$ satisfies

$$U = \tilde{X} \cdot A = (X + x_{n-1} \cdot 2^{n-1}) \cdot A = X \cdot A + A \cdot x_{n-1} \cdot 2^{n-1}. \tag{3.6}$$

The term $X \cdot A$ is the desired product and hence, if $x_{n-1} = 1$, the following correction is necessary:

$$X \cdot A = U - A \cdot x_{n-1} \cdot 2^{n-1} \tag{3.7}$$

In other words, if $x_{n-1} = 1$, we must subtract the multiplicand $A$ from the most significant half of $U$.

Example 3.2
The multiplier and multiplicand in this example are both negative numbers in the two's complement representation:

$$\begin{array}{lccccc|cccr}
A & & 1 & 0 & 1 & 1 & & & & -5 \\
X & \times & 1 & 1 & 0 & 1 & & & & -3 \\ \hline
x_0 = 1 \Rightarrow \text{Add } A & & 1 & 0 & 1 & 1 & & & & \\
\text{Shift} & & 1 & 1 & 0 & 1 & 1 & & & \\
x_1 = 0 \Rightarrow \text{Shift only} & & 1 & 1 & 1 & 0 & 1 & 1 & & \\
x_2 = 1 \Rightarrow \text{Add } A & + & 1 & 0 & 1 & 1 & & & & \\ \hline
 & & 1 & 0 & 0 & 1 & 1 & 1 & & \\
\text{Shift} & & 1 & 1 & 0 & 0 & 1 & 1 & 1 & \\
x_3 = 1 \Rightarrow \text{Correct} & + & 0 & 1 & 0 & 1 & & & & \\ \hline
 & & 0 & 0 & 0 & 1 & 1 & 1 & 1 & +15
\end{array}$$

In the correction step, the subtraction of the multiplicand is performed by adding its two's complement. □

Similarly, when multiplying one's complement numbers, which satisfy

$$X = -x_{n-1}(2^{n-1} - ulp) + \tilde{X} \tag{3.8}$$

we obtain

$$X \cdot A = U - x_{n-1} \cdot 2^{n-1} \cdot A + x_{n-1} \cdot ulp \cdot A. \tag{3.9}$$

Thus, if $x_{n-1} = 1$, we start with $P^{(0)} = A$, which takes care of the second correction term, namely, $x_{n-1} \cdot ulp \cdot A$, and at the end of the process, we subtract the first correction term, $A \cdot x_{n-1} \cdot 2^{n-1}$.

Example 3.3
The product of 5 and $-3$ in one's complement representation is

$$\begin{array}{lccccc|cccr}
A & & 0 & 1 & 0 & 1 & & & & 5 \\
X & \times & 1 & 1 & 0 & 0 & & & & -3 \\ \hline
x_3 = 1 \Rightarrow P^{(0)} = A & & 0 & 1 & 0 & 1 & & & & \\
x_0 = 0 \Rightarrow \text{Shift} & & 0 & 0 & 1 & 0 & 1 & & & \\
x_1 = 0 \Rightarrow \text{Shift} & & 0 & 0 & 0 & 1 & 0 & 1 & & \\
x_2 = 1 \Rightarrow \text{Add } A & + & 0 & 1 & 0 & 1 & & & & \\ \hline
 & & 0 & 1 & 1 & 0 & 0 & 1 & & \\
\text{Shift} & & 0 & 0 & 1 & 1 & 0 & 0 & 1 & \\
x_3 = 1 \Rightarrow \text{Correct} & + & 1 & 0 & 1 & 0 & 1 & 1 & 1 & \\ \hline
 & & 1 & 1 & 1 & 0 & 0 & 0 & 0 & -15
\end{array}$$

As in the previous example, the subtraction of the (first) correction term is accomplished by adding its one's complement. However, unlike the previous example, the one's complement has to be expanded to double size using the sign digit (see Section 1.6). This implies that a double-length binary adder is needed. □

3.2 SEQUENTIAL DIVISION

Division is the most complex of the four basic arithmetic operations and, consequently, the most time-consuming. Unlike the other three operations, division, in general, has a result consisting of two components. Given a dividend $X$ and a divisor $D$, a quotient $Q$ and a remainder $R$ have to be calculated so as to satisfy

$$X = Q \cdot D + R \quad \text{with} \quad R < D. \tag{3.10}$$

We will assume at first, for simplicity, that the operands $X$ and $D$ and the results $Q$ and $R$ are positive numbers.
In many fixed-point arithmetic units, a double-length product is available after a multiply operation, and we wish to allow the use of this result in a subsequent divide operation. Thus, $X$ may occupy a double-length register, while all other operands are stored in single-length registers. Consequently, we have to make sure that the resulting quotient $Q$ is smaller than or equal to the largest number that we can store in a single-length register. If $n$ is the number of bits in a single-length register, then every single-length integer is smaller than $2^{n-1}$. Therefore, to ensure that the quotient is a single-length integer (i.e., the inequality $Q < 2^{n-1}$ is satisfied), we must require that

$$X < 2^{n-1} D.$$
If this condition is not satisfied, an overflow indication should be produced by the arithmetic unit. One should be aware that the above condition can always be satisfied by preshifting one of the operands X or D (or both). This preshifting is especially simple to apply when the operands are floating-point numbers. Another condition that has to be checked is that $D \neq 0$. If this is not the case, a divide by zero indication should be generated by the arithmetic unit. Unlike the previous condition, no corrective action can be taken when D = 0.
The presentation of algorithms for division is simpler when the dividend and divisor, as well as the quotient and remainder, are interpreted as fractions. In this case, the divide overflow condition becomes X < D to ensure that the quotient is a fraction. The division procedure that is presented next assumes that all operands and results are fractions, but is clearly also valid for integers, as will become apparent later on.
To obtain the fractional (positive) quotient $Q = 0.q_1 \cdots q_m$ (where m = n - 1), we perform the division as a sequence of subtractions and shifts. In step i of the process, the remainder is compared to the divisor D. If the remainder is the larger of the two, then the quotient bit $q_i$ is set to 1. If not, it is set to 0. The equation for the ith step is

Division steps of Example 3.4:

    r0 = X              0 .1 0 0 0 0 0
    2r0               0 1 .0 0 0 0 0      set q1 = 1
    Add -D        +   1 1 .0 1 0
    r1 = 2r0 - D      0 0 .0 1 0 0 0
    2r1               0 0 .1 0 0 0        set q2 = 0
    r2 = 2r1          0 0 .1 0 0 0
    2r2               0 1 .0 0 0          set q3 = 1
    Add -D        +   1 1 .0 1 0
    r3 = 2r2 - D      0 0 .0 1 0
Note that the generation of $2r_0$ should not result in an overflow indication (multiplying a positive number by 2 should result in a positive number), since the quotient and remainder are within the proper range for the given dividend and divisor. Hence, an extra bit position in the arithmetic unit is needed.
The final results are $Q = (0.101)_2 = 5/8$ and $R = r_m 2^{-m} = r_3 2^{-3} = 1/4 \cdot 2^{-3} = 1/32$. (The precise quotient is the infinite binary fraction 2/3 = 0.1010101....) The quotient and final remainder satisfy the equation $X = Q \cdot D + R = 5/8 \cdot 3/4 + 1/32 = 16/32 = 1/2$. □
$r_i = 2r_{i-1} - q_i \cdot D; \quad i = 1, 2, \ldots, m$    (3.11)
Exactly the same procedure should be followed if the operands and results are integers. In this case we may rewrite Equation (3.10) as follows:
where $r_i$ is the new remainder and $r_{i-1}$ is the previous remainder. The first remainder is $r_0 = X$. Thus, $q_i$ is determined by comparing $2r_{i-1}$ to D. This comparison is the most complicated operation in the division process.
We will now prove that the above procedure indeed calculates the quotient and the final remainder. The remainder in the last step is $r_m$ and repeated substitution of Equation (3.11) yields
$2^{2n-2} X_F = 2^{n-1} Q_F \cdot 2^{n-1} D_F + 2^{n-1} R_F$    (3.13)
where $X_F$, $D_F$, $Q_F$, and $R_F$ are fractions. Dividing Equation (3.13) by $2^{2n-2}$ yields
$r_m = 2r_{m-1} - q_m \cdot D$
$\quad = 2(2r_{m-2} - q_{m-1} \cdot D) - q_m \cdot D = \cdots$
$\quad = 2^m r_0 - (q_m + 2q_{m-1} + \cdots + 2^{m-1} q_1) \cdot D.$
Substituting $r_0 = X$ and dividing both sides by $2^m$ results in
$r_m 2^{-m} = X - (q_1 2^{-1} + q_2 2^{-2} + \cdots + q_m 2^{-m}) \cdot D;$
$X_F = Q_F \cdot D_F + 2^{-(n-1)} R_F.$    (3.14)
The above mentioned condition $X < 2^{n-1} D$, when divided by $2^{2n-2}$, now takes the form $X_F < D_F$.
Example 3.5
We repeat the previous example with all operands and results being integers. In this case the double-length dividend is $X = 0100000_2 = 32$, and the divisor is $D = 0110_2 = 6$. The overflow condition $X < 2^{n-1} D$ is tested by comparing the most significant half of X, 0100, to D, 0110. The results of the division are $Q = 0101_2 = 5$ and $R = 0010_2 = 2$. Observe that in the final step of the process the true remainder R is generated and, as can be verified from Equation (3.14), there is no need to further multiply it by $2^{-(n-1)}$. □
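The integer case of Example 3.5 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the function name `restoring_divide_int` and the parameter `n` (the single-length register width, sign bit included) are hypothetical, both operands are assumed nonnegative, and Python integers stand in for the double-length and single-length registers.

```python
def restoring_divide_int(x, d, n):
    """Restoring division of a double-length integer x by a single-length d.

    n is the assumed single-length register width (sign bit included),
    so the quotient has m = n - 1 bits. Returns (quotient, remainder).
    """
    if d == 0:
        raise ZeroDivisionError("divide by zero indication")
    if x >= (d << (n - 1)):
        # Overflow condition X < 2^(n-1) * D is violated.
        raise OverflowError("divide overflow indication")
    m = n - 1
    aligned_d = d << m      # divisor aligned with the upper half of x
    r, q = x, 0
    for _ in range(m):
        r <<= 1             # form 2 * r_(i-1)
        q <<= 1
        if r >= aligned_d:  # compare, then subtract only if it fits
            r -= aligned_d
            q |= 1
    return q, r >> m        # r_m is a multiple of 2^m; shift out the scaling


print(restoring_divide_int(32, 6, 4))
```

Comparing before subtracting is one way to realize the "retain a copy of the previous remainder" option; a hardware unit would typically subtract first and restore on a negative result.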
hence
$r_m 2^{-m} = X - Q \cdot D$    (3.12)
as required. Note that the true final remainder is $R = r_m 2^{-m}$.
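The recurrence of Equation (3.11) and the identity (3.12) can be checked numerically. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions` module; the function name `restoring_divide` is hypothetical, and a tentative remainder of exactly zero is treated here as nonnegative (so the quotient bit is 1), which is an assumption about the tie case.

```python
from fractions import Fraction


def restoring_divide(x, d, m):
    """Restoring division of fractions with 0 <= x < d, producing m quotient bits.

    Returns (bits, Q, R) where R = r_m * 2^-m is the true final remainder.
    """
    x, d = Fraction(x), Fraction(d)
    if d == 0:
        raise ZeroDivisionError("divide by zero indication")
    if not (0 <= x < d):
        raise OverflowError("divide overflow: X < D is required")
    r = x
    bits = []
    for _ in range(m):
        trial = 2 * r - d      # tentative remainder 2r_(i-1) - D
        if trial >= 0:
            bits.append(1)
            r = trial
        else:
            bits.append(0)
            r = 2 * r          # restore: keep the shifted previous remainder
    q = sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))
    return bits, q, r / 2 ** m


bits, q, rem = restoring_divide(Fraction(1, 2), Fraction(3, 4), 3)
assert Fraction(1, 2) == q * Fraction(3, 4) + rem   # X = Q * D + R
```

With X = 1/2, D = 3/4 and m = 3, the sketch reproduces the values quoted in Example 3.4.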
Example 3.4
Let $X = (0.100000)_2 = 1/2$ and $D = (0.110)_2 = 3/4$. The dividend occupies a double-length register. The condition X < D is clearly satisfied,
FIGURE 3.1 Restoring division (diagram of $r_i$ versus $2r_{i-1}$).
The most difficult step in the division procedure is the comparison between the divisor and the remainder to determine the quotient bit. If this is done by subtracting D from $2r_{i-1}$, then in the case of a negative result we set $q_i = 0$, and we must restore the remainder to its previous value. This method is therefore called restoring division, and can be diagrammed as shown in Figure 3.1. Such a diagram is sometimes called a Robertson diagram [7].
This diagram illustrates the fact that if $r_{i-1} < D$, $q_i$ should be selected so as to ensure $r_i < D$. Since $r_0 = X < D$, we are guaranteed to obtain R < D.
In summary, a division performed by the restoring method uses m subtractions, m shift operations, and an average of m/2 restore operations. The latter can be implemented either by adding D or by retaining a copy of the previous remainder, thus avoiding the time penalty involved in the restore operations.
FIGURE 3.2 Nonrestoring division.
instead of $q_i q_{i+1} = 10$ (which is too large), we would get $q_i q_{i+1} = 1\bar{1} = 01$. Further correction, if needed, would be done in the next steps. Consequently, the quotient bit is determined in the nonrestoring scheme by the following rule:
$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge 0 \\ \bar{1} & \text{if } 2r_{i-1} < 0 \end{cases}$    (3.15)
3.3 NONRESTORING DIVISION
This rule is simpler (and faster to execute) than the selection rule for restoring division since it requires the comparison of $2r_{i-1}$ to 0 rather than D. The remainder is computed using the same equation
An alternative scheme for sequential division is the nonrestoring division algorithm, in which the quotient bit is not corrected and the remainder is not restored immediately if it is negative. These corrections are instead postponed to later steps. In the restoring method, if $2r_{i-1} - D$ is negative, the remainder is restored to $2r_{i-1}$. It is then shifted and D is once again subtracted, obtaining $4r_{i-1} - D$. This process is repeated as long as the remainder is negative. In the nonrestoring method we avoid the restore operation, stay with a negative remainder $2r_{i-1} - D < 0$, shift it, and then attempt to correct it by adding D, obtaining $2(2r_{i-1} - D) + D = 4r_{i-1} - D$. Thus, this algorithm produces a remainder equal to the one we would generate using restoring division.
Consider now the resulting quotient. To enable the correction of a "wrong" selection of the quotient bit in step i, we must allow the next quotient bit, $q_{i+1}$, to assume a negative value. In other words, the allowed values for $q_i$ are 1 and $\bar{1}$, where $\bar{1}$ represents -1. If $q_i$ was incorrectly set to 1, resulting in a negative remainder, we would then select $q_{i+1} = \bar{1}$ and add D to the remainder. Hence,
$r_i = 2r_{i-1} - q_i \cdot D;$    (3.16)
in other words, subtract the divisor D if $2r_{i-1}$ is positive and add it otherwise. The nonrestoring division is diagrammed in Figure 3.2. Here, $|r_{i-1}| < D$ and $q_i$ is selected to ensure $|r_i| < D$. Note that $q_i \neq 0$ and therefore, at each step, either an addition or subtraction is performed. This is not an SD representation, and there is no redundancy in the representation of the quotient in the nonrestoring division. In summary, the nonrestoring method requires exactly m add/subtract and shift operations. Its main advantage is its simpler selection rule.
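Selection rule (3.15) together with recurrence (3.16) can be sketched as follows for positive operands. The function name `nonrestoring_divide` is hypothetical and exact rational arithmetic is assumed. The sketch applies no final correction step, so it only illustrates the basic iteration.

```python
from fractions import Fraction


def nonrestoring_divide(x, d, m):
    """Nonrestoring division for positive fractions x < d, m quotient digits.

    Each digit is +1 or -1 (written 1 and 1-bar in the text).
    Returns (digits, Q, R) with R = r_m * 2^-m.
    """
    r = Fraction(x)
    d = Fraction(d)
    digits = []
    for _ in range(m):
        digit = 1 if 2 * r >= 0 else -1   # rule (3.15): compare 2r_(i-1) to 0
        r = 2 * r - digit * d             # recurrence (3.16): subtract or add D
        digits.append(digit)
    q = sum(Fraction(dg, 2 ** (i + 1)) for i, dg in enumerate(digits))
    return digits, q, r / 2 ** m


digits, q, rem = nonrestoring_divide(Fraction(1, 2), Fraction(3, 4), 3)
assert Fraction(1, 2) == q * Fraction(3, 4) + rem
```

Every iteration performs exactly one add or subtract, in contrast to the restoring scheme, where a failed trial subtraction must be undone.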
Example 3.6
Let $X = (0.100)_2 = 1/2$, and $D = (0.110)_2 = 3/4$, as in Example 3.4.
The final remainder is the same as before, and the quotient is $Q = 0.11\bar{1} = 0.101_2 = 5/8$. □
Example 3.7
Let $X = (0.100)_2 = 1/2$ and $D = (1.010)_2 = -3/4$.

    r0 = X          0 .1 0 0
    2r0           0 1 .0 0 0      set q1 = -1
    Add D     +   1 1 .0 1 0
    r1            0 0 .0 1 0
    2r1           0 0 .1 0 0      set q2 = -1
    Add D     +   1 1 .0 1 0
    r2            1 1 .1 1 0
    2r2           1 1 .1 0 0      set q3 = 1
    Add -D    +   0 0 .1 1 0
    r3            0 0 .0 1 0
    (1)  r0 = X          0 .1 0 0
    (2)  2r0           0 1 .0 0 0      set q1 = 1
    (3)  Add -D    +   1 1 .0 1 0
    (4)  r1            0 0 .0 1 0
    (5)  2r1           0 0 .1 0 0      set q2 = 1
    (6)  Add -D    +   1 1 .0 1 0
    (7)  r2            1 1 .1 1 0
    (8)  2r2           1 1 .1 0 0      set q3 = -1
    (9)  Add D     +   0 0 .1 1 0
    (10) r3            0 0 .0 1 0
The nonrestoring division process in the previous example can be represented graphically using a diagram similar to the one depicted in Figure 3.2. The resulting diagram is shown in Figure 3.3. The horizontal lines correspond to the Add ±D operation in lines 3, 6 and 9 in Example 3.6, and the diagonal lines correspond to the Multiply by 2 operation in lines 2, 5 and 8.
A very important feature of nonrestoring division is that it can easily be extended to two's complement negative numbers. The generalized selection rule for $q_i$ is
Finally, $Q = 0.\bar{1}\bar{1}1 = 0.\bar{1}0\bar{1} = -(0.101)_2 = -5/8$, or in two's complement, 1.011. Note that the final remainder is 1/32 and has the same sign as the dividend X. □
$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \text{ and } D \text{ have the same sign} \\ \bar{1} & \text{if } 2r_{i-1} \text{ and } D \text{ have opposite signs} \end{cases}$    (3.17)
By definition, the sign of the final remainder must equal that of the dividend. For example, when dividing 5 by 3 we should obtain a quotient of 1 and a final remainder of 2, and not a quotient of 2 and a final remainder of -1, although this remainder still satisfies $|R| < D$. Consequently, if the sign of the final remainder is different from that of the dividend, a correction of both the final remainder and quotient is needed. This situation, requiring a correction step, arises since the quotient digits in the nonrestoring division algorithm are restricted to $\{1, \bar{1}\}$. The last digit cannot be set to 0 and therefore an even quotient cannot be generated.
Since the remainder changes signs during the process, there is nothing special about a negative dividend X. The following example illustrates the case of a negative divisor in two's complement.
Example 3.8
Let $X = (0.101)_2 = 5/8$, and $D = (0.110)_2 = 3/4$. Then

    r0 = X          0 .1 0 1
    2r0           0 1 .0 1 0      set q1 = 1
    Add -D    +   1 1 .0 1 0
    r1            0 0 .1 0 0
    2r1           0 1 .0 0 0      set q2 = 1
    Add -D    +   1 1 .0 1 0
    r2            0 0 .0 1 0
    2r2           0 0 .1 0 0      set q3 = 1
    Add -D    +   1 1 .0 1 0
    r3            1 1 .1 1 0

FIGURE 3.3 The nonrestoring division in Example 3.6.
The final remainder is negative, while the dividend is positive. We must correct the final remainder by adding D to $r_3$, yielding 1.110 + 0.110 = 0.100, and then correct the quotient by subtracting ulp from Q = 0.111, and therefore $Q_{corrected} = 0.110_2 = 3/4$. □

3.3.1 Generating a Two's Complement Quotient
The nonrestoring division, as previously presented, generates a quotient that uses the digits 1 and $\bar{1}$ and might therefore be incompatible with the representation used for the dividend and divisor. If X and D are represented in two's complement, then there is a need for a conversion from the above representation to two's complement. We may, in principle, use one of the algorithms presented in Section 2.4 for converting an SD number to its two's complement representation. These algorithms, however, require that all the digits of the quotient be known before the conversion can be performed, thus increasing the total execution time of the divide operation. We prefer therefore to employ an algorithm that performs the conversion on the fly, as the digits of the quotient become available, in a serial fashion from the most to the least significant digit. Such an on-the-fly conversion algorithm from SD to two's complement representation has been presented in [3].
We can, however, take advantage of the fact that the quotient digit in the nonrestoring division can assume only the values 1 and $\bar{1}$ (i.e., $q_i \neq 0$) and derive a simpler algorithm that requires a less complex circuit for its implementation. Since the quotient digit can assume only two values, a single bit is sufficient for representing it, and we may assign the digits 0 and 1 to the values $\bar{1}$ and 1, respectively. Let the resulting binary number be denoted by $(0.p_1 \cdots p_m)$ where $p_i = \frac{1}{2}(q_i + 1)$. This number can be converted to two's complement using the following algorithm:
Step 1: Shift the given number one bit position to the left.
Step 2: Complement the most significant bit.
Step 3: Shift a 1 into the least significant position.
The result of this algorithm is the sequence
$(1 - p_1) . p_2 p_3 \cdots p_m 1.$
We will now prove that the above sequence, when interpreted as a number in two's complement, has the same numerical value as the original quotient Q. The value of the above sequence in two's complement is
$Q_{corrected} = Q - ulp$
In general, if the final remainder and the dividend have opposite signs, a correction step is needed. If the dividend and divisor have the same sign, then the remainder $r_m$ is corrected by adding D and the quotient is corrected by subtracting ulp. If the dividend and divisor have opposite signs, we subtract D from $r_m$ and correct the quotient by adding ulp.
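The generalized selection rule (3.17) and the sign-based correction step can be sketched together. This is an illustrative sketch, not a definitive implementation: the name `signed_nonrestoring_divide` is hypothetical, a zero value of $2r_{i-1}$ is treated as positive (sign bit 0), a zero final remainder is assumed to need no sign correction, and the separate correction for a zero intermediate remainder is deliberately not handled.

```python
from fractions import Fraction


def signed_nonrestoring_divide(x, d, m):
    """Nonrestoring division of signed fractions with the final sign correction.

    Returns (Q, R) where R is the true final remainder r_m * 2^-m.
    """
    x, d = Fraction(x), Fraction(d)
    r = x
    q = Fraction(0)
    for i in range(1, m + 1):
        # Rule (3.17): digit is 1 when 2r_(i-1) and D agree in sign, else -1.
        digit = 1 if (2 * r < 0) == (d < 0) else -1
        r = 2 * r - digit * d
        q += Fraction(digit, 2 ** i)
    ulp = Fraction(1, 2 ** m)
    if r != 0 and (r < 0) != (x < 0):
        if (x < 0) == (d < 0):
            r += d          # dividend and divisor have the same sign
            q -= ulp
        else:
            r -= d          # dividend and divisor have opposite signs
            q += ulp
    return q, r * ulp


for x, d in [(Fraction(5, 8), Fraction(3, 4)), (Fraction(1, 2), Fraction(-3, 4))]:
    q, rem = signed_nonrestoring_divide(x, d, 3)
    assert x == q * d + rem
```

The first pair requires the correction step, while the second (negative divisor) ends with a remainder that already has the sign of the dividend.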
Another consequence of the fact that 0 is not an allowed digit in nonrestoring division is the need for a correction if a zero remainder is generated in an intermediate step. This case is illustrated in the next example.
Example 3.9
Let $X = (1.101)_2 = -3/8$ and $D = (0.110)_2 = 3/4$. The correct result of this division is Q = -1/2 with a zero remainder.

    r0 = X          1 .1 0 1
    2r0           1 1 .0 1 0      set q1 = -1
    Add D     +   0 0 .1 1 0
    r1            0 0 .0 0 0      zero remainder
    2r1           0 0 .0 0 0      set q2 = 1
    Add -D    +   1 1 .0 1 0
    r2            1 1 .0 1 0
    2r2           1 0 .1 0 0      set q3 = -1
    Add D     +   0 0 .1 1 0
    r3            1 1 .0 1 0
Note that although the final remainder $r_3$ and the dividend X have the same sign, a correction step is needed, since the quotient we get is $Q = 0.\bar{1}1\bar{1} = 0.\bar{1}01_2 = -3/8$ instead of -1/2. We must therefore detect the occurrence of a zero intermediate remainder and correct the final remainder (to obtain a zero remainder):
$-(1 - p_1) 2^0 + \sum_{i=2}^{m} p_i 2^{-i+1} + 2^{-m}.$    (3.18)
$r_{3(corrected)} = r_3 + D = 1.010 + 0.110 = 0.000$
Substituting $p_i = \frac{1}{2}(q_i + 1)$ yields
$q_1 2^{-1} - 2^{-1} + \sum_{i=2}^{m} (q_i + 1) 2^{-i} + 2^{-m}$
$\quad = q_1 2^{-1} - (2^{-1} - 2^{-m}) + \sum_{i=2}^{m} q_i 2^{-i} + \sum_{i=2}^{m} 2^{-i}.$
We then have to correct the quotient $Q = 0.\bar{1}1\bar{1} = 0.\bar{1}01$ by subtracting ulp, yielding $Q_{corrected} = 0.\bar{1}00_2 = -1/2$. □
The last term equals $(2^{-1} - 2^{-m})$ and therefore, the value of the sequence is
$q_1 2^{-1} + \sum_{i=2}^{m} q_i 2^{-i} = \sum_{i=1}^{m} q_i 2^{-i} = Q.$
To prove that the above procedure yields the required square root, we repeatedly substitute Equation (3.19) in the expression for $r_m$, obtaining
The above conversion algorithm can be executed in a bit-serial fashion; that is, we can generate the appropriate bit of the quotient, when represented in two's complement, at each step of the nonrestoring division. For example, in the last division with X = 1.101 and D = 0.110, instead of generating the quotient bits $.\bar{1}1\bar{1}$, we can generate the bits (1 - 0).101 = 1.101. After the correction step we obtain (Q - ulp) = 1.100, which is the correct representation of -1/2 in two's complement. The same on-the-fly conversion algorithm can be derived from the general SD to two's complement conversion algorithm presented in Section 2.4. This is left as an exercise for the reader.
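The three-step conversion (shift left, complement the most significant bit, shift in a 1) can be checked with a short sketch. The helper names `to_twos_complement` and `twos_complement_value` are hypothetical; digits are given as a Python list of +1 and -1 values, most significant first.

```python
from fractions import Fraction


def to_twos_complement(digits):
    """Convert nonrestoring quotient digits (+1 / -1) to two's complement bits.

    Returns [sign bit, fraction bits...], i.e. (1 - p1) . p2 p3 ... pm 1.
    """
    p = [(q + 1) // 2 for q in digits]   # p_i = (q_i + 1) / 2, so -1 -> 0, 1 -> 1
    return [1 - p[0]] + p[1:] + [1]


def twos_complement_value(bits):
    """Numerical value of a two's complement fraction given as [sign, b1, b2, ...]."""
    value = Fraction(-bits[0])
    for i, b in enumerate(bits[1:], start=1):
        value += Fraction(b, 2 ** i)
    return value


digits = [-1, 1, -1]                     # quotient digits of Example 3.9
bits = to_twos_complement(digits)
sd_value = sum(Fraction(q, 2 ** (i + 1)) for i, q in enumerate(digits))
assert twos_complement_value(bits) == sd_value
```

Because each output bit depends only on the current digit (apart from the fixed trailing 1), the bits can be produced one per division step.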
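$r_m = 2r_{m-1} - q_m (2Q_{m-1} + q_m 2^{-m})$
$\quad = 2^2 r_{m-2} - 2q_{m-1}(2Q_{m-2} + q_{m-1} 2^{-(m-1)}) - q_m (2Q_{m-1} + q_m 2^{-m})$
$\quad = \cdots$
$\quad = 2^m r_0 - 2^m \left[ (q_1 2^{-1})^2 + (q_2 2^{-2})^2 + \cdots + (q_m 2^{-m})^2 \right]$
$\qquad - 2^m \left[ 2 q_2 2^{-2} q_1 2^{-1} + \cdots + 2 q_m 2^{-m} \sum_{j=1}^{m-1} q_j 2^{-j} \right]$
$\quad = 2^m X - 2^m \left( \sum_{i=1}^{m} q_i 2^{-i} \right)^2 = 2^m (X - Q^2).$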
3.4 SQUARE ROOT EXTRACTION
$r_1 = 2r_0 - q_1 (0 + q_1 2^{-1}) = 2X - q_1 (0 + q_1 2^{-1})$    (3.20)
Dividing by $2^m$ results in the expected relation with $r_m 2^{-m}$ as the final remainder.
Example 3.10
Let $X = 0.1011_2 = 11/16 = 176/256$. Its square root is calculated as follows:

    r0 = X                0 .1 0 1 1
    2r0                 0 1 .0 1 1 0
    -(0 + 2^-1)         0 0 .1 0 0 0
    r1                  0 0 .1 1 1 0      set q1 = 1, Q1 = 0.1
    2r1                 0 1 .1 1 0 0
    -(2Q1 + 2^-2)       0 1 .0 1 0 0
    r2                  0 0 .1 0 0 0      set q2 = 1, Q2 = 0.11
    2r2                 0 1 .0 0 0 0      is smaller than (2Q2 + 2^-3) = 1.101
    r3 = 2r2            0 1 .0 0 0 0      set q3 = 0, Q3 = 0.110
    2r3                 1 0 .0 0 0 0      still a positive number
    -(2Q3 + 2^-4)       0 1 .1 0 0 1
    r4                  0 0 .0 1 1 1      set q4 = 1, Q4 = 0.1101
The conventional "completing the square" method for square root extraction is conceptually similar to the restoring division scheme. Let the given radicand X be a positive fraction, and let $Q = (0.q_1 q_2 \cdots q_m)$ denote its square root. The bits of Q are generated in m steps, one bit per step. We use the notation
$Q_i = \sum_{k=1}^{i} q_k 2^{-k}$
for the partially developed root at step i. Thus, $Q_m = Q$. We also denote the remainder in step i by $r_i$. The next remainder, in general, is calculated from
$r_i = 2r_{i-1} - q_i \cdot (2Q_{i-1} + q_i 2^{-i}).$    (3.19)
Comparing the above equation to Equation (3.11) suggests that the square root extraction can be viewed as division with a changing divisor, i.e., $D_i = (2Q_{i-1} + q_i 2^{-i})$.
In the first step the remainder is the radicand X and $Q_0 = 0$. The performed calculation is therefore
To determine the square root digit $q_i$ in the restoring scheme, a tentative remainder,
$2r_{i-1} - (2Q_{i-1} + 2^{-i})$
is calculated. Note that the term $(2Q_{i-1} + 2^{-i})$ is equal to $(q_1.q_2 \cdots q_{i-1} 0 1)_2$ and is very simple to calculate. If the above tentative remainder is positive, we store its value in $r_i$ and set $q_i$ equal to 1. Otherwise, we set $r_i = 2r_{i-1}$ and $q_i = 0$.
Finally, $Q = 0.1101_2 = 13/16$ and the final remainder is $2^{-4} r_4 = 7/256 = X - Q^2 = (176 - 169)/256$. □
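The restoring square root procedure illustrated in Example 3.10 can be sketched as below. The function name `restoring_sqrt` is hypothetical and exact rational arithmetic is assumed. A tentative remainder of exactly zero is treated as acceptable (the root bit is set to 1), which is an assumption about the tie case.

```python
from fractions import Fraction


def restoring_sqrt(x, m):
    """Restoring square root of a positive fraction x, producing m root bits.

    Returns (bits, Q, R) with R = r_m * 2^-m = x - Q^2.
    """
    r = Fraction(x)
    q_partial = Fraction(0)                  # Q_(i-1), the partially developed root
    bits = []
    for i in range(1, m + 1):
        weight = Fraction(1, 2 ** i)
        trial = 2 * r - (2 * q_partial + weight)   # tentative remainder
        if trial >= 0:
            bits.append(1)
            r = trial
            q_partial += weight
        else:
            bits.append(0)
            r = 2 * r                        # restore: r_i = 2r_(i-1)
    return bits, q_partial, r / 2 ** m


bits, q, rem = restoring_sqrt(Fraction(11, 16), 4)
assert rem == Fraction(11, 16) - q * q
```

For X = 11/16 and m = 4 the sketch yields the root and remainder stated at the end of Example 3.10.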
The above procedure is similar to the restoring division algorithm. A method similar to the nonrestoring division algorithm can be employed with the following selection rule for $q_i$:
$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge 0 \\ \bar{1} & \text{if } 2r_{i-1} < 0 \end{cases}$    (3.21)
This algorithm is illustrated in the next example.
Example 3.11
Let $X = 0.011001_2 = 25/64$.

    r0 = X                0 .0 1 1 0 0 1
    2r0                   0 .1 1 0 0 1 0      set q1 = 1, Q1 = 0.1
    -(0 + 2^-1)           0 .1 0 0 0 0 0
    r1                    0 .0 1 0 0 1 0
    2r1                   0 .1 0 0 1 0 0      set q2 = 1, Q2 = 0.11
    -(2Q1 + 2^-2)       0 1 .0 1 0 0 0 0
    r2                  1 1 .0 1 0 1 0 0
    2r2                 1 0 .1 0 1 0 0 0      set q3 = -1, Q3 = 0.11(-1)
    +(2Q2 - 2^-3)   +   0 1 .0 1 1 0 0 0
    r3                  0 0 .0 0 0 0 0 0

The square root is $Q = 0.11\bar{1} = 0.101_2 = 5/8$. □
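Selection rule (3.21) combined with recurrence (3.19) gives the nonrestoring square root. The following sketch assumes exact rational arithmetic; the function name `nonrestoring_sqrt` is hypothetical, and no final correction step is included.

```python
from fractions import Fraction


def nonrestoring_sqrt(x, m):
    """Nonrestoring square root of a positive fraction x with digits +1 / -1.

    Returns (digits, Q, R) with R = r_m * 2^-m = x - Q^2.
    """
    r = Fraction(x)
    q_partial = Fraction(0)
    digits = []
    for i in range(1, m + 1):
        digit = 1 if 2 * r >= 0 else -1                 # rule (3.21)
        step = Fraction(digit, 2 ** i)                  # q_i * 2^-i
        r = 2 * r - digit * (2 * q_partial + step)      # recurrence (3.19)
        q_partial += step
        digits.append(digit)
    return digits, q_partial, r / 2 ** m


digits, q, rem = nonrestoring_sqrt(Fraction(25, 64), 3)
assert rem == Fraction(25, 64) - q * q
```

For X = 25/64 and m = 3 the remainder ends at zero, consistent with the example, since 25/64 is a perfect square.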
The digits of the square root Q can be converted to two's complement representation by the same method used for the quotient in the nonrestoring division algorithm. Faster algorithms for square root extraction have been developed and implemented. Some of them are introduced in Chapter 7.
3.5 EXERCISES
3.1. Given the following three pairs of binary multiplicand and multiplier:
(i) +.1001 and -.0101 (ii) -.1001 and +.0101 (iii) -.1001 and -.0101.
(a) Represent the numbers in the two's complement form and multiply them. Check your results.
(b) Repeat (a) for the one's complement form.
3.2. Can the sequential multiplication algorithm be modified so that the multiplier bits are examined starting with the most significant bit? What might be a major disadvantage of the modified algorithm?
3.3. Multiply the binary SD numbers $X = 10\bar{1}01$ (the multiplicand) and 01101 (the multiplier). Perform all intermediate steps in SD arithmetic.
3.4. Given the following three pairs of binary dividend and divisor:
(i) +.1010 and -.1101 (ii) -.1010 and +.1101 (iii) -.1010 and -.1101.
Represent the numbers in the two's complement form and perform the division by the nonrestoring method. The quotient should also be represented in two's complement.
3.5. Devise an algorithm for dividing numbers in one's complement representation. Illustrate your algorithm using the three pairs of numbers in problem 3.4.
3.6. Write the rules for nonrestoring division for decimal fractions. Illustrate the procedure using a positive dividend and positive and negative divisors.
3.7. Explain the need for a correction step in the nonrestoring division if a zero remainder is encountered.
3.8. Show that if the quotient bits $q_i$ (i = 1, 2, ..., m) in the nonrestoring division are set according to the rule
$q_i = \begin{cases} 1 & \text{if the signs of the remainder and divisor agree} \\ 0 & \text{if the signs of the remainder and divisor differ} \end{cases}$
and subtraction (addition) is performed when $q_i = 1$ ($q_i = 0$), then the correction term $(1 + 2^{-m})$ has to be added to $q_1.q_2 \cdots q_m$ to obtain a quotient represented in two's complement.
3.9. Can the algorithm for converting the quotient bits generated in nonrestoring division into two's complement representation be modified for converting binary SD numbers to two's complement? Explain.
3.10. To speed up the nonrestoring division, it has been suggested to allow 0 to be a quotient bit for which no add/subtract operation is needed in order to calculate a new remainder. The modified selection rule is
$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge D \\ \bar{1} & \text{if } 2r_{i-1} < -D \\ 0 & \text{otherwise} \end{cases}$
Apply this new algorithm to calculate the quotient of the dividend X = 0.101 and the divisor D = 0.110. Would you recommend the use of this new algorithm? Explain.
3.11. The on-the-fly conversion algorithm in Subsection 3.3.1 is a special case of the SD-to-two's complement conversion algorithm presented in Section 2.4. Since just the values 1 and $\bar{1}$ are allowed we need only use the last four rows in Table 2.4. The resulting table for converting the quotient $0.p_1 p_2 \cdots p_m$ to its two's complement equivalent $z_0.z_1 z_2 \cdots z_m$ is shown below, with the indices changed to match the different indexing used here.

    p_i   c_i   z_i   c_{i-1}
     1     0     1     0
     1     1     0     0
     0     0     1     1
     0     1     0     1
Show that $z_1 z_2 \cdots z_m = p_2 p_3 \cdots p_m 1$. Also show that, based on the first two rows in Table 2.4, $z_0 = 1 - p_1$.
3.12. Find the square root of 0.011111 using the nonrestoring algorithm.
3.6 REFERENCES
[1] J. J. F. CAVANAGH, Digital computer arithmetic: Design and implementation, McGraw-Hill, New York, 1984.
[2] Y. CHU, Computer organization and microprogramming, Prentice-Hall, Englewood Cliffs, NJ, 1972, chap. 5.
[3] M. D. ERCEGOVAC and T. LANG, "On-the-fly conversion of redundant into conventional representations," IEEE Trans. on Computers, C-36 (July 1987), 895-897.
[4] K. HWANG, Computer arithmetic: Principles, architecture, and design, Wiley, New York, 1978.
[5] O. L. MACSORLEY, "High-speed arithmetic in binary computers," Proc. of IRE, 49 (Jan. 1961), 67-91.
[6] G. W. REITWIESNER, "Binary arithmetic," in Advances in Computers, vol. 1, F. L. Alt (Editor), Academic, New York, 1960, pp. 231-308.
[7] J. E. ROBERTSON, "A new class of digital division methods," IRE Trans. on Electronic Computers, EC-7 (Sept. 1958), 218-222.
[8] N. R. SCOTT, Computer number systems and arithmetic, Prentice-Hall, Englewood Cliffs, NJ, 1985.
[9] C. TUNG, "Arithmetic," in Computer science, A. F. Cardenas et al. (Eds.), Wiley-Interscience, New York, 1972, chap. 3.
[10] S. WASER and M. J. FLYNN, Introduction to arithmetic for digital systems designers, Holt, Rinehart, Winston, New York, 1982.

4
BINARY FLOATING-POINT NUMBERS
4.1 PRELIMINARIES
To obtain a dynamic range of representable real numbers without having to scale the operands, we use floating-point numbers instead of fixed-point ones. The representation of floating-point numbers is similar to the commonly used scientific notation and consists of two parts, the significand (or mantissa) M and the exponent (or characteristic) E. The floating-point number F represented by the pair (M, E) has the value
$F = M \cdot \beta^E$
where β is the base of the exponent. This base is common to all floating-point numbers in a given system. It is therefore not included in the representation of a floating-point number, but is rather implied.
Thus, the n bits that represent a floating-point number are partitioned into two parts, one holding the significand M and the other the exponent E. The range of representable floating-point numbers is larger than that of fixed-point representation, but the precision is smaller. The total number of different values (representable in n bits) is still $2^n$, and since the range between the smallest and the largest representable values increases, the distance between any two consecutive values must increase as well. Floating-point numbers are thus sparser than fixed-point numbers, resulting in a lower precision. Any real number whose value lies between two consecutive floating-point numbers is mapped onto one
of these two numbers. Therefore, a larger distance between the two consecutive numbers results in a lower precision of representation. A more detailed discussion on the precision of representation appears in Section 4.3.
The significand M and the exponent E of a floating-point number F are both signed quantities. The exponent is usually a signed integer, while the significand is usually represented in one of two ways: one as a pure fraction, and the other as a number in the range [1, 2) (for β = 2). In addition, different schemes for representing negative values can be employed for each of the two parts of the floating-point number. Until 1980 there was no standard for floating-point numbers and almost every computer system had its own representation method. This made the transportation of scientific programs and data between two different machines very difficult. At that time the IEEE standard 754 [9] was formulated; it is used in most floating-point arithmetic units designed in recent years. The details of this standard are presented in Section 4.4.
Although only very few computer systems still use their own floating-point format rather than the IEEE standard format, it is important to understand some of the prior formats. These prior formats greatly influenced the decisions made by the IEEE floating-point standard committee and studying them allows a better understanding of the IEEE standard format. These formats differ in the partitioning of the n bits between the significand and exponent fields, in the representation method used for each of the two parts, and in the value of the base β. In what follows we will consider only a few prior formats; the others can be analyzed in a similar manner.
We start with the significand field and examine the common case where the significand is a signed-magnitude fraction. The floating-point format in such a case consists of a sign bit S, e bits of an exponent E, and m bits of an unsigned fraction M, satisfying m + e + 1 = n, as shown below:
the exponent (and vice versa) at the same time, so that the value of the floating-point number remains unchanged. Whenever an arithmetic operation results in a significand larger than the maximum allowed value of $M_{max} = 1 - ulp$, we must reduce the significand to be in the allowable range. We have to simultaneously increase the exponent so that the value of the floating-point number stays the same. The smallest increase in E, being an integer, is by 1. Therefore, we should use the following relation when the need to reduce the significand arises:
$M \cdot \beta^E = (M/\beta) \cdot \beta^{E+1}$
The divide operation in M/β turns into a simple arithmetic shift right operation if β is an integral power of the radix. If β = r = 2, then shifting the significand to the right by a single position must be compensated by adding 1 to the exponent.
Example 4.1
Suppose that an arithmetic operation yields the result $01.10100 \cdot 2^{100}$, which has a significand larger than $M_{max}$. We should reduce the significand by shifting it one position to the right and increase the exponent by 1, yielding $0.11010 \cdot 2^{101}$. If $\beta = 2^k$, then changing the exponent by 1 is equivalent to shifting the significand by k positions. Consequently, only k-position shifts are allowed; e.g., for $\beta = 4 = 2^2$, $01.10100 \cdot 4^{010} = 0.01101 \cdot 4^{011}$. □
$F = (-1)^S \cdot M \cdot \beta^E$    (4.1)
In general, the representation of numbers in a floating-point form is not unique. For example, $0.11010 \cdot 2^{101} = 0.01101 \cdot 2^{110}$. The same value can also be represented with an exponent of 111, but if the significand field is only 5 bits long, the resulting significand would be 0.00110, so we would lose a significant digit. Out of all possible representations we therefore prefer the one with no leading zeros, allowing us to retain the maximum number of significant digits. We call this the normalized form. The normalized form also simplifies the comparison between floating-point numbers. A larger exponent indicates a larger overall number and so the significands have to be compared only for equal exponents. Notice that, since in the case of $\beta = 2^k$ the significand can be shifted only by k (or a multiple of k) positions, the significand is considered normalized if there is a nonzero bit in the first k positions. For example, the normalized form of the number $0.00000110 \cdot 16^{101}$ is $0.01100000 \cdot 16^{100}$, and we could not eliminate the single leading 0.
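The normalization rule for $\beta = 2^k$ can be sketched as a pair of loops that shift the significand by k bit positions at a time while adjusting the exponent. This is a minimal sketch under stated assumptions: the function name `normalize` is hypothetical, the significand is a nonnegative exact fraction (so no bits are lost on right shifts, unlike a fixed-width field), and exponent overflow or underflow is not checked.

```python
from fractions import Fraction


def normalize(significand, exponent, k=1):
    """Normalize M * beta**E for beta = 2**k so that 1/beta <= M < 1.

    Returns (M, E) representing the same value; zero is returned with E = 0.
    """
    beta = 2 ** k
    m = Fraction(significand)
    if m == 0:
        return m, 0
    while m >= 1:                    # significand too large: shift right k bits
        m /= beta
        exponent += 1
    while m < Fraction(1, beta):     # no nonzero bit in the first k positions
        m *= beta
        exponent -= 1
    return m, exponent


# 01.10100 * 2^100 (binary exponent 4) becomes 0.11010 * 2^101.
print(normalize(Fraction(13, 8), 4, k=1))
# 0.00000110 * 16^101 becomes 0.01100000 * 16^100; the leading 0 remains.
print(normalize(Fraction(6, 256), 5, k=4))
```

The second call shows why a base-16 significand may keep leading zero bits: a further left shift would have to be by four positions, which would push the significand past 1.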
If the fraction is normalized, the range of the significand is smaller than [0, 1 - ulp]. The smallest and largest allowable values are instead
    | S | Exponent E | Unsigned Significand M |

The value of such a floating-point number (S, E, M) is given by Equation (4.1), since $(-1)^0 = 1$ and $(-1)^1 = -1$. The maximal value of the fractional significand, denoted by $M_{max}$, equals $M_{max} = 1 - ulp$, where ulp is the weight of the least-significant bit of the fractional significand. Usually, but, as we will see later, not always, $ulp = 2^{-m}$.
Next we discuss the selection of a value for the implied base β of the exponent. For practical purposes, the base β is restricted to an integer power of the radix r = 2. In other words, $\beta = 2^k$ where k = 1, 2, .... The reason for this is that it provides a simple method of decreasing the significand and increasing
$M_{min} = 1/\beta$ and $M_{max} = 1 - ulp$
Note that the range of normalized fractions does not include the value zero. Hence, a special representation is needed for zero. A possible representation for zero consists of M = 0 and any exponent E. Usually, E = 0 is preferred, since the representation of zero in floating-point is then identical to its representation in fixed-point, simplifying the execution of a test-for-zero instruction.
Finally, we discuss the way exponents are represented. The most common representation is as a biased exponent, according to which

$$E = E^{true} + bias,$$

where the bias is a constant and $E^{true}$ is the true value of the exponent represented in two's complement. The range for $E^{true}$ using the e bits of the exponent field is

$$-2^{e-1} \le E^{true} \le 2^{e-1} - 1.$$

The bias is usually selected as the magnitude of the most negative exponent; i.e., $2^{e-1}$, yielding

$$0 \le E \le 2^e - 1.$$

In this case, we say that the exponent is represented in the excess $2^{e-1}$ method. The advantage of this scheme is that when comparing two exponents (as is needed in add/subtract operations) we may ignore the sign bits and compare them as if they were unsigned numbers. As a result, if a floating-point format has the S, E, and M components in this order, we may compare the floating-point numbers as though they were binary integers in signed-magnitude representation. Another advantage is that the smallest representable number has the same exponent as zero, namely, 0.

Example 4.2
For e = 7 the range of exponents in two's complement representation is $-64 \le E^{true} \le 63$, with 1000000 and 0111111 representing the values −64 and 63, respectively. When adding the bias of 64, the true value −64 is represented by 0000000 and the true value 63 is represented by 1111111. This representation is called the excess 64 method. □

Note that the excess $2^{e-1}$ representation can be obtained by simply inverting the sign bit of the two's complement representation. In the excess representation, the values 0 and 1 of the sign bit indicate negative and positive numbers, respectively.

The complete range of normalized floating-point numbers includes identical subranges for positive floating-point numbers, denoted by $F^+$, and negative floating-point numbers, denoted by $F^-$. The range of positive floating-point numbers is

$$M_{min} \cdot \beta^{E_{min}} \le F^+ \le M_{max} \cdot \beta^{E_{max}},$$

where $E_{min}$ is the smallest exponent and $E_{max}$ is the largest. An identical range exists for negative numbers. Whenever the exponent of the result of some arithmetic operation is larger than $E_{max}$, an exponent overflow indication should be generated. Similarly, an exponent smaller than $E_{min}$ should generate an exponent underflow indication. These flags make the programmer aware of the situation and allow him/her to take the necessary steps. Since the significand is always kept normalized, any overflow will be reflected through the exponent. In some machines, when an exponent overflow occurs, a special representation of infinity is used for the result. Two other possibilities are stopping the computation and interrupting the processor, or, the least recommended, setting the result to the largest representable number. If an exponent underflow occurs, the representation of zero is used for the result in some cases, but the proper exponent underflow flag is still raised. Setting the result to zero allows the computations to proceed, if appropriate, without any interruptions. A detailed discussion of the way such exceptions are handled in the IEEE standard for floating-point numbers appears in Section 4.8.

The complete range of floating-point numbers is shown in the following schematic diagram. Notice that zero is not included in the range of either $F^+$ or $F^-$.

[Schematic diagram: the number line from −∞ to +∞ with 0 at the center and an exponent overflow region at each end.]

Example 4.3
The short floating-point format in the IBM 370 system consists of 32 bits partitioned as follows:

S | 7 bits: excess 64 exponent E | 24 bits: unsigned fractional significand M

The base of these floating-point numbers is $\beta = 16$, and hence

$$F = (-1)^S \cdot M \cdot 16^{E-64}.$$

Here $E_{min}$ is represented by 0000000 and has the value −64, while $E_{max}$, represented by 1111111, has the value +63. Since $\beta = 16$, it is convenient to consider the significand as consisting of six hexadecimal digits. The normalized significand therefore satisfies

$$M_{min} = 16^{-1} \le M \le M_{max} = 1 - 16^{-6} = 1 - 2^{-24}.$$

Consequently, $F^+_{max} = (1 - 16^{-6}) \cdot 16^{63} \approx 7.23 \cdot 10^{75}$ and $F^+_{min} = (16^{-1}) \cdot 16^{-64} \approx 5.4 \cdot 10^{-79}$.
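The excess 64 encoding of Example 4.2 and the range limits of the short IBM format quoted in Example 4.3 can be checked with a few lines of Python. This is a sketch for illustration: the names `to_excess`, `to_excess_by_sign_flip`, `F_MAX` and `F_MIN` are hypothetical and chosen only for this example.

```python
from fractions import Fraction

E_BITS = 7
BIAS = 2 ** (E_BITS - 1)          # 64, the magnitude of the most negative exponent


def to_excess(e_true: int) -> int:
    """Biased (excess 64) exponent: E = E_true + bias."""
    assert -BIAS <= e_true <= BIAS - 1
    return e_true + BIAS


def to_excess_by_sign_flip(e_true: int) -> int:
    """Same result, obtained by inverting the sign bit of the 7-bit
    two's complement representation of E_true."""
    twos_complement = e_true & (2 ** E_BITS - 1)
    return twos_complement ^ BIAS


# Range of positive normalized numbers for beta = 16, six hexadecimal digits.
M_MIN = Fraction(1, 16)
M_MAX = 1 - Fraction(1, 16 ** 6)
F_MAX = M_MAX * Fraction(16) ** 63       # M_max * beta**E_max
F_MIN = M_MIN * Fraction(16) ** -64      # M_min * beta**E_min

print(f"F+max is about {float(F_MAX):.3e}")
print(f"F+min is about {float(F_MIN):.3e}")
```

Exact rational arithmetic is used so that the limits are computed without any rounding before the final conversion to a decimal approximation.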
For example, let (S, E, M) = $(C1200000)_{16}$ be a floating-point number in the short IBM format; then the first byte consists of $(11000001)_2$. The sign bit is S = 1; i.e., the number is negative. The exponent is $41_{16}$ and, with a bias of $64_{10} = 40_{16}$, $E^{true}$ is $(41 - 40)_{16} = 1$. Finally, $M = (0.2)_{16}$, hence $F = (-0.0010)_2 \cdot 16^1 = (-2)_{10}$.
The resolution of this floating-point representation, defined as the distance between two consecutive significands, is equal to the weight of the least-significant bit of the significand. Thus, the resolution of this representation is $ulp = 16^{-6} = 2^{-24} \approx 0.6 \cdot 10^{-7}$. We say that the short format has approximately seven significant decimal digits. Should a higher precision be desired, the IBM system provides a long floating-point format, in which the significand is extended by adding to it a second 32-bit word. This format is

S | 7 bits: excess 64 exponent | 56 bits: unsigned fractional significand

The range is roughly the same, but the resolution is now $ulp = 16^{-14} = 2^{-56} \approx 10^{-17}$; i.e., 17, instead of 7, significant decimal digits. □

Table 4.1 compares the floating-point format in IBM computers to those used in DEC/VAX and CDC/Cyber 70 computers. The hidden bit in the DEC/VAX is a scheme to increase the number of significant bits in the significand and thus increase the precision. For a base $\beta$ of 2 the normalized significand will always have a leading 1. This bit can be eliminated, allowing the inclusion of an extra bit. As a result, the resolution becomes $ulp = 2^{-24}$ instead of $2^{-23}$. The value of a floating-point number (S, f, E) in the short DEC format is therefore

$$(-1)^S \cdot 0.1f \cdot 2^{E-128}$$

where f is the pattern of 23 bits in the significand field. In this case, a zero significand field (f = 0) represents the fraction $0.10_2 = 1/2$. Consider now the case where f = 0 and E = 0. With a hidden bit, this may represent the value $0.1_2 \cdot 2^{0-128} = 2^{-129}$. However, the floating-point number f = E = 0 is still expected to represent 0, a representation which does not use a hidden bit. We clearly cannot allow f = E = 0 to represent two different values. To prevent this from happening we must restrict the use of the exponent E = 0 and reserve it for representing the value zero only. Consequently, the smallest exponent allowed for nonzero numbers is E = 1. Therefore, the smallest positive number in the DEC/VAX system is $F^+_{min} = \frac{1}{2} \cdot 2^{1-128} = 2^{-128}$. The largest positive number is $F^+_{max} = (1 - 2^{-24}) \cdot 2^{255-128} = (1 - 2^{-24}) \cdot 2^{127}$.

                             IBM/370                  DEC/VAX                  CDC/Cyber 70
Word length (double)         32 (64) bits             32 (64) bits             60 bits
Significand + {hidden bit}   24 (56) bits             23 + 1 (55 + 1) bits     48 bits
Exponent                     7 bits                   8 bits                   11 bits
Bias                         64                       128                      1024
Base                         16                       2                        2
Range of M                   1/16 ≤ M < 1             1/2 ≤ M < 1              1 ≤ M < 2
Representation of M          Signed-magnitude         Signed-magnitude         One's complement
Approximate range            16^63 ≈ 7 · 10^75        2^127 ≈ 1.7 · 10^38      2^1023 ≈ 10^307
Approximate resolution       2^-24 ≈ 10^-7 (10^-17)   2^-24 ≈ 10^-7 (10^-17)   2^-48 ≈ 10^-14

TABLE 4.1 The floating-point formats of three machines.

4.2 FLOATING-POINT OPERATIONS
The way floating-point operations are executed depends on the specific format used for representing the operands. In what follows we will assume that the significands are normalized fractions in signed-magnitude representation and that the exponents are biased. Given two numbers, $F_1 = (-1)^{S_1} \cdot M_1 \cdot \beta^{E_1 - bias}$ and $F_2 = (-1)^{S_2} \cdot M_2 \cdot \beta^{E_2 - bias}$, we need to calculate the result of a basic arithmetic operation yielding $F_3 = (-1)^{S_3} \cdot M_3 \cdot \beta^{E_3 - bias}$. We start with multiplication and division, since these are easier to follow than addition and subtraction, which will be described later on.

Multiplication. The significands of the two operands are to be multiplied as if they were fixed-point numbers. The exponents of the operands are to be added. These two operations can be done in parallel. The sign $S_3$ is positive if the signs $S_1$ and $S_2$ are equal and is negative otherwise. When adding the two exponents $E_1 = E_1^{true} + bias$ and $E_2 = E_2^{true} + bias$, the bias should be subtracted once to obtain the correct exponent. For $bias = 2^{e-1}$ (which in binary is represented as 100...0), subtracting the bias is equivalent to adding the bias, and is accomplished by complementing the sign bit. If the resulting exponent $E_3$ is larger than $E_{max}$, an overflow indication must be generated. If the exponent $E_3$ is negative and is smaller than $E_{min}$, then an underflow indication must be generated.
When multiplying the significands we have to make sure that $M_3$ is a normalized significand. Since each operand's significand satisfies $1/\beta \le M_i < 1$ (i = 1, 2), the product of the two significands satisfies

$$1/\beta^2 \le M_1 \cdot M_2 < 1.$$

Consequently, we may need to shift the significand one position to the left in order to normalize it. This is achieved by performing one base-$\beta$ left shift operation; i.e., k base-2 shifts for $\beta = 2^k$, at the same time reducing the exponent by 1. This is called the postnormalization step. After this step is executed, the exponent
may become smaller than $E_{min}$, and an exponent underflow indication should be generated.

Division. The significands of the two operands are to be divided and the exponents subtracted. Here, we have to add the bias to the difference $E_1 - E_2$. If the resulting exponent is out of the range, an overflow or underflow indication should be generated. The resultant significand satisfies

$$1/\beta \le M_1/M_2 < \beta.$$

Therefore, a single base-$\beta$ shift right of the significand, accompanied by an increase of one in the exponent, may be required in the postnormalization step. The exponent increase may, in turn, lead to an overflow.
If the divisor is zero, an overflow occurs, and a special indication of division by zero should be generated and the quotient can be set to ±∞. If both divisor and dividend are zero, the result is undefined and in the IEEE 754 standard such a quantity has a special representation called not a number (NaN). NaN also represents uninitialized variables and the result of $0 \cdot \infty$. These will be discussed further in Section 4.4.

Remainder. Unlike fixed-point division, floating-point division does not generate a final remainder. The fixed-point remainder, denoted by R, is defined as $X - QD$ where X, Q, and D are the dividend, quotient and divisor, respectively (see Chapter 3). This remainder satisfies the inequality $|R| \le |D|$ and is a byproduct of the direct division algorithms, like the restoring and nonrestoring algorithm. The situation is different in floating-point division. The floating-point remainder, denoted by $F_1 \; REM \; F_2$, is defined as $F_1 - F_2 \cdot Int(F_1/F_2)$, where $Int(F_1/F_2)$ is the quotient $F_1/F_2$ converted to an integer. The conversion of the quotient to an integer can be performed either through truncation (i.e., removing the fractional part) or through rounding-to-nearest. The IEEE standard uses the round-to-nearest-even mode which is defined in Section 4.5. In this case, the following inequality is satisfied:

$$|F_1 \; REM \; F_2| \le |F_2|/2.$$

Careful examination of the expression for $F_1 \; REM \; F_2$ reveals the higher complexity involved in calculating the floating-point remainder compared to that of the fixed-point remainder. An algorithm for floating-point division will generate a quotient represented as a floating-point number and will not generate the integer $Int(F_1/F_2)$, which can be as large as $\beta^{E_{max} - E_{min}}$. Therefore, we must calculate the floating-point remainder separately so that we can perform this time-consuming calculation only when it is required. The floating-point remainder is needed, for example, when performing an argument reduction for periodic functions like the trigonometric functions sine and cosine.
A brute-force method would be to continue the execution of a direct division algorithm, employed for calculating $M_1/M_2$, for $E_1 - E_2$ steps, even if $E_1 - E_2$ is much greater than the number of steps needed to generate the m bits of the quotient's significand. In practice, this is not an acceptable solution since the execution of the floating-point remainder operation may take an arbitrary number of clock cycles. As a result, the floating-point remainder is often calculated in software rather than in hardware. An alternative solution is to define a REM-step operation, $X \; REM \; F_2$, which performs a pre-specified maximum number of divide steps, such as the number of divide steps required in a regular divide operation. Initially, X is equal to $F_1$; afterwards it is made equal to the remainder of the previous REM-step operation. Such a REM-step operation can be repeated until a remainder that is smaller than $F_2/2$ is obtained [7].

Addition/Subtraction. These operations require that the exponents of both operands be equal before adding or subtracting the significands. Only when $E_1 = E_2$ can the term $\beta^{E_1}$ be factored out and the two significands $M_1$ and $M_2$ be added. To achieve this we align the significands by shifting the significand of the smaller operand to the right, increasing its exponent at the same time, until it equals the other exponent. In other words, the significand of the smaller number (i.e., the number with the smaller exponent) is shifted $|E_1 - E_2|$ base-$\beta$ positions to the right. For example, if $E_1 \ge E_2$, then

$$F_1 \pm F_2 = \left( (-1)^{S_1} \cdot M_1 \pm (-1)^{S_2} \cdot M_2 \cdot \beta^{-(E_1 - E_2)} \right) \cdot \beta^{E_1 - bias}. \qquad (4.2)$$

Note that we do not decrease the exponent of the larger number to make it equal to the other exponent, since this will result in a significand larger than 1, and a larger significand adder will be required.
If, based on the two sign bits and the originally required operation, an addition is performed, then the resultant significand, denoted by M (obtained by adding the two aligned significands), is in the range

$$1/\beta \le M < 2.$$

If the significand M is greater than 1, a postnormalization step is required. This consists of shifting the significand to the right to yield $M_3$ and increasing the exponent by one. At this point an exponent overflow may occur. In summary, the following steps are required when adding or subtracting two floating-point numbers:
Step 1: Calculate the difference d of the two exponents, $d = |E_1 - E_2|$.
Step 2: Shift the significand of the smaller number by d base-$\beta$ positions to the right.
Step 3: Add the aligned significands and set the exponent of the result equal to the exponent of the larger operand.
Step 4: Normalize the resultant significand and adjust the exponent if necessary.
Step 5: Round the resultant significand and adjust the exponent if necessary.
If the final operation called for is subtraction, then the resultant significand satisfies

$$0 \le |M| < 1$$

and a postnormalization step is required if the resultant significand is smaller than $1/\beta$. This step consists of shifting the significand to the left and decreasing the exponent simultaneously, which may lead to an exponent underflow. In extreme cases, the postnormalization step may require a shift left operation over all bits in the significand, yielding a zero result.

Example 4.4
Let $F_1 = (0.100000)_{16} \cdot 16^3$ and $F_2 = (0.FFFFFF)_{16} \cdot 16^2$ be two numbers in the short IBM format, to be subtracted. The significand of the smaller one (i.e., $F_2$) has to be shifted to the right, resulting in the loss of the least-significant digit.

                      Significand    Exponent
F_1                   0.100000       16^3
F_2 aligned           0.0FFFFF       16^3
F_1 − F_2             0.000001       16^3
Postnormalization     0.100000       16^-2

Not only is this a time-consuming postnormalization step (shifting over five hexadecimal digits), but the final result is in error. The correct result (with an "unlimited" number of significand digits) is

F_1                   0.1000000      16^3
F_2 aligned           0.0FFFFFF      16^3
F_1 − F_2             0.0000001      16^3
Postnormalization     0.1000000      16^-3

The error (also called loss of significance) is $0.1 \cdot 16^{-2} - 0.1 \cdot 16^{-3} = 0.F \cdot 16^{-3}$. A solution to this problem would be to have guard digits; i.e., additional digits to the right of the significand to hold the shifted-out digits. In the above example, a single (hexadecimal) guard digit is sufficient. The guard digit will be discussed in Section 4.5. □

A simplified block diagram of the circuitry required to perform the addition or subtraction of floating-point numbers is depicted in Figure 4.1. The block diagram shows two separate data paths, with the left one for the exponents and the right one for significands. Exp. Adder #1 computes the difference between the exponents of the two operands. The resulting exponent difference (Exp. Diff. in the figure) determines the amount of shift right positions that the significand of the smaller operand must go through in order to be aligned with the other significand. Mux #1 is a multiplexor (also known as data selector) which routes one of its two inputs to its single output. The selection of the input significand to be routed to the Right Shifter is determined by the sign of the exponent difference. This sign also controls Mux #2, which selects the significand of the larger operand and routes it to the Significand Adder. In the next step the addition or subtraction of the now aligned significands is performed and the result stored in a register. Then, a special circuit, Leading 0s Detector, examines the leading bits of the resulting significand and determines the type of shift operation needed for the postnormalization step and the corresponding adjustment of the exponent. This adjustment is performed using Exp. Adder #2 whose second input is the exponent of the larger operand. This, in turn, is selected by Mux #3 which is again controlled by the sign of the exponent difference. Finally, the Incrementer executes the rounding, if necessary, according to the rules explained in the next section.

[Figure: block diagram with three stages labeled exponent comparison and significand alignment; significand addition-subtraction; postnormalization and rounding. Block labels include Significand #2, Register, Leading 0s Detector and R/L Shifter.]
FIGURE 4.1 Floating-point adder/subtractor.
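The five steps of floating-point subtraction, and the effect of a guard digit on Example 4.4, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a model of the hardware in Figure 4.1: the function names `fp_subtract` and `value` are hypothetical, significands are held as integers of six hexadecimal digits, both operands are assumed positive with $F_1 \ge F_2$ and $E_1 \ge E_2$, exponents are unbiased, and rounding is plain truncation.

```python
from fractions import Fraction

DIGITS = 6  # hexadecimal digits in the short-format significand


def fp_subtract(m1: int, e1: int, m2: int, e2: int, guard: int = 0):
    """Return (significand, exponent) of F1 - F2 for base 16.

    Assumes F1 >= F2 > 0 and e1 >= e2 (assumptions of this sketch).
    `guard` is the number of extra hexadecimal digits kept on the right.
    """
    d = e1 - e2                                   # Step 1: exponent difference
    a = m1 << (4 * guard)
    b = (m2 << (4 * guard)) >> (4 * d)            # Step 2: align, dropping shifted-out digits
    diff = a - b                                  # Step 3: subtract aligned significands
    assert diff >= 0, "sketch assumes F1 >= F2"
    exponent = e1
    if diff == 0:
        return 0, 0
    while diff < 16 ** (DIGITS + guard - 1):      # Step 4: postnormalize (left shifts)
        diff <<= 4
        exponent -= 1
    return diff >> (4 * guard), exponent          # Step 5: truncate the guard digits


def value(m: int, e: int) -> Fraction:
    """Exact value of the fraction 0.m (six hex digits) times 16**e."""
    return Fraction(m, 16 ** DIGITS) * Fraction(16) ** e


no_guard = fp_subtract(0x100000, 3, 0xFFFFFF, 2)            # Example 4.4 as executed
one_guard = fp_subtract(0x100000, 3, 0xFFFFFF, 2, guard=1)  # with one guard digit
exact = value(0x100000, 3) - value(0xFFFFFF, 2)
```

Without a guard digit the sketch returns the erroneous result with exponent −2, while a single guard digit recovers the exact difference with exponent −3, in agreement with the two computations shown in the example.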
Out of the functional units included in Figure 4.1 the two shifters deserve a separate discussion. The first shifter should be capable of performing right (alignment) shifts only, while the second one should be able to perform either right or left (postnormalization) shifts. More importantly, the two shifters must be capable of performing large shift operations, as large as the number of digits in the significand field. The overall performance of the floating-point add/subtract unit is highly dependent on the speed of these two shifters. Consequently, they are usually implemented as combinatorial shifters rather than shift registers, which would require a large and variable number of clock cycles to complete the shift. A combinatorial shifter generates all possible shifted patterns but only one is provided at the output according to some control bits. Since, in general, such combinatorial shifters are capable of performing circular shifts (rotates) as well, they are commonly known as barrel shifters.
A barrel shifter can be implemented as a single level array where each input bit is directly connected to m (and even more) output lines. For m = 53 (the number of significand bits in the IEEE double-precision format, see Section 4.4) the large number of connections (and the resulting large electrical load) make this an undesirable solution, although the overall design is conceptually simple [14]. One alternative is a two-level array. We can implement a two-level combinatorial shifter for 53 bits by having the first level shift the bits by 0, 1, 2 or 3 bit positions, and letting the second level shift the bits by multiples of 4 (i.e., 0, 4, 8, ..., 52). In this way, shifts from length 0 to 53 can be performed. We call this two-level shifter a radix-4 shifter. An example of a two-level radix-4 shifter for 16 bits is shown in Figure 4.2. In the first level of the 53-bit radix-4 shifter each bit has four destinations, while in the second level each has 14 destinations. A more balanced two-level shifter for 53 bits would be the radix-8 shifter, where the first level shifts from 0 to 7 bit positions and the second level shifts by multiples of 8 (i.e., 0, 8, 16, 24, 32, 40 and 48). Thus, each bit in the first level has 8 destinations and 7 in the second level.

[Figure: two levels of shifting between the input bits $x_{15}, \ldots, x_0$ and the numbered output bits.]
FIGURE 4.2 A two-level radix-4 shifter for 16 bits.

4.3 CHOICE OF FLOATING-POINT REPRESENTATION
Although the IEEE standard 754 described in the next section is now commonly used in newly designed arithmetic units, it is important to understand the implications of selecting a particular format and, as a result, understand the reasons behind the adopted standard.
When designing a format for floating-point numbers we are given n, the total number of bits, and we have to determine the length of the significand and exponent fields, denoted by m and e, respectively (satisfying m + e + 1 = n), and the value of the exponent base $\beta$. One goal that we would like to achieve is having a small representation error, which is the error made when representing a high-precision real number in a finite-length floating-point format. Let x be a real number and Fl(x) be its machine representation. $Fl(x) - x$ is called the absolute representation error. For every real number x that is within the range of the floating-point numbers, there are two consecutive representations $F_1$ and $F_2$ satisfying $F_1 \le x \le F_2$. Thus, we have the choice of setting Fl(x) equal to either $F_1$ or $F_2$. If $F_1 = M \beta^E$, then $F_2 = (M + ulp) \beta^E$, and the maximum absolute error is half the distance between $F_1$ and $F_2$, a distance which in turn equals $ulp \cdot \beta^E$. Unlike the distance between two consecutive significands (ulp), which was discussed in the previous section, the distance between two consecutive floating-point numbers ($ulp \cdot \beta^E$) is clearly not a constant but varies, becoming larger as the exponent increases, as shown in the following diagram:

[Diagram: number line marked at successive powers of $\beta$, with the spacing between consecutive floating-point numbers growing from one interval to the next.]

Usually, the absolute size of the error is less important than its relative size (compared to the original value x). Thus, a more important and commonly used measure for the representation error is $\delta(x) = (Fl(x) - x)/x$, which is the relative representation error. To measure the "accuracy" of the representation we may use the maximum relative representation error (MRRE), which is an upper bound of $\delta(x)$. This upper bound can be obtained as follows:

$$\delta(x) \le \frac{\frac{1}{2}\, ulp \cdot \beta^E}{M \cdot \beta^E} = \frac{1}{2} \cdot \frac{ulp}{M} \le \frac{1}{2} \cdot \frac{ulp}{1/\beta} = \frac{1}{2}\, ulp \cdot \beta. \qquad (4.3)$$

Thus, the MRRE $= \frac{1}{2}\, ulp \cdot \beta$ increases with the exponent base $\beta$ but decreases as ulp becomes smaller; that is, as the number of significand bits, m, increases.
The MRRE would provide an acceptable measure for the accuracy of the representation if the operands in floating-point computations were more or
less uniformly distributed. However, as has been observed, the distribution of floating-point operands is not uniform but approaches the reciprocal distribution with the following density function [20]:

$$\frac{1}{M \ln \beta}; \qquad \frac{1}{\beta} \le M < 1.$$

In other words, larger significands are less likely to occur than smaller significands. For example, the first digit of a decimal floating-point operand will most likely be a 1; 2 is the second most likely and so on.
This nonuniform distribution is taken into account in the second measure proposed for representation error. This measure is the average relative representation error (ARRE). The maximum value of the absolute error is, as has been shown above, $\frac{1}{2}\, ulp \cdot \beta^E$, and since the minimum error is zero, the average absolute error is $\frac{1}{4}\, ulp \cdot \beta^E$. The corresponding relative representation error is therefore $\frac{1}{4}\, ulp / M$ and, consequently,

$$ARRE = \int_{1/\beta}^{1} \frac{1}{M \ln \beta} \cdot \frac{ulp}{4M} \, dM = \frac{\beta - 1}{\ln \beta} \cdot \frac{ulp}{4}. \qquad (4.4)$$

Another factor to be considered when selecting a floating-point format is the size of the range. The range of the positive floating-point numbers, for example, is approximately equal to the largest positive number representable, since the smallest positive number is very close to zero. We will therefore use the expression $\beta^{E_{max}}$ for the range. Thus, to obtain a large range we should increase $\beta$ and/or the number of exponent bits e. Increasing the latter implies fewer bits for the significand field within the floating-point format and a higher value of ulp, resulting in a higher representation error. A similar effect is experienced if, instead of increasing e, we increase the exponent base $\beta$. Consequently, there is a tradeoff between the range and the representation error.
We may consider several floating-point representations that have the same range and select the one with the smallest MRRE or ARRE. Another possibility is to consider several representations with the same MRRE (or ARRE) and select the one with the largest range. For example, for a 32-bit word m + e = 31 (one bit is reserved for the sign), and we may use one of the representations shown in Table 4.2 [4]. All three have nearly the same range. The base 16 representation is inferior to the other two representations with $\beta = 2$ and $\beta = 4$ if the MRRE is selected as a measure for the representation error. Base 4 turns out to produce the lowest ARRE for this particular example. If base 2 is selected, and a hidden bit is used, then the MRRE and ARRE are reduced by a factor of 2, making this format the one with the smallest representation error.

β     e    m     Range                         MRRE                          ARRE
2     9    22    2^(2^8 − 1) = 2^255           0.5 · 2^-22 · 2 = 2^-22       0.180 · 2^-21
4     8    23    4^(2^7 − 1) = 2^254           0.5 · 2^-23 · 4 = 2^-22       0.135 · 2^-21
16    7    24    16^(2^6 − 1) = 2^252          0.5 · 2^-24 · 16 = 2^-21      0.169 · 2^-21

TABLE 4.2 Range, MRRE, and ARRE of three 32-bit floating-point formats [4].

A different goal that one might consider when deciding upon a floating-point format is the execution time of floating-point operations. Two time-consuming steps are the aligning of the significands before add/subtract operations and postnormalization in any floating-point operation. It has been observed that a larger exponent base yields a higher probability of equal exponents in add/subtract operations (in which case an alignment step is unnecessary) and a lower probability that a postnormalization step will be needed. Statistical analysis of a number of programs has provided the results included in Table 4.3 [19]. This table shows the percentage of cases in which no alignment shifts were needed, those in which a single alignment shift was required, and those in which a larger (two or more positions) shift was needed. Also, postnormalization was not required in 59.4% of the cases for $\beta = 2$, while it was not needed in 82.4% of the cases for $\beta = 16$. It is interesting to note that even for $\beta = 2$, no postnormalization step is necessary in most cases. The above consideration is of limited practical significance when a barrel shifter, as described in Section 4.2, is used.

Alignment shift    β = 16    β = 2
0                  52.1%     32.6%
1                  21.2%     12.1%
≥ 2                26.7%     55.3%

TABLE 4.3 The probability of alignment shifts of different sizes [19].
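The range, MRRE and ARRE entries of Table 4.2 follow directly from $\beta^{E_{max}}$, (4.3) and (4.4), and can be recomputed with a short Python sketch. The function names `mrre`, `arre` and `range_log2` are hypothetical and introduced only for this illustration; the sketch assumes $ulp = 2^{-m}$ (no hidden bit) and $E_{max} = 2^{e-1} - 1$.

```python
import math


def mrre(beta: int, m: int) -> float:
    """Maximum relative representation error, (1/2) * ulp * beta, as in (4.3)."""
    ulp = 2.0 ** -m
    return 0.5 * ulp * beta


def arre(beta: int, m: int) -> float:
    """Average relative representation error, (beta - 1)/ln(beta) * ulp/4, as in (4.4)."""
    ulp = 2.0 ** -m
    return (beta - 1) / math.log(beta) * ulp / 4


def range_log2(beta: int, e: int) -> float:
    """log2 of the range beta**E_max, with E_max = 2**(e - 1) - 1."""
    e_max = 2 ** (e - 1) - 1
    return e_max * math.log2(beta)


# The three 32-bit formats of Table 4.2: (beta, e, m) with m + e + 1 = 32.
for beta, e, m in [(2, 9, 22), (4, 8, 23), (16, 7, 24)]:
    print(
        f"beta={beta:2d}  range=2^{range_log2(beta, e):.0f}  "
        f"MRRE={mrre(beta, m) * 2**22:.0f} * 2^-22  "
        f"ARRE={arre(beta, m) * 2**21:.3f} * 2^-21"
    )
```

Expressing the errors as multiples of $2^{-22}$ and $2^{-21}$ makes the comparison with the table entries immediate, and the loop can be extended with other (β, e, m) combinations to explore the tradeoff between range and representation error.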
4.4 THE IEEE FLOATING-POINT STANDARD
The IEEE floating-point standard defines four formats for floating-point numbers. The first two are the basic single-precision 32-bit format and the double-precision 64-bit format. The other two are the extended formats, to be used for intermediate results. The single extended format should have at least 44 bits, and the double extended format should have at least 80 bits. The extended formats have a higher precision and a higher range than the corresponding 32- and 64-bit formats.
4.4.1 Single-Precision Format
The most important objective for the 32-bit format is precision of representation. Hence, base 2 was selected, allowing the use of a hidden bit to further increase the precision. As a result, the suggested format is similar to the DEC format
68
4, Binary Floating-Point Numbers
4.4 The IEEE Floating-Point Standard
69
('''(. rubll' 4.1), hut thl'rl' !UI' "oml' diffl'n'nc('s, I"" ilillil'ntNllwlow. III orcl"r tu
hR\,(, R rl'l\..;onahlt, rRuge, nil I'XJllIlIl'lIf fil'lt! of Ipngt h 8 hits WI\." SI'IN.t('d, )'iddin
t hI' fnllnwin fOfllmt:
F = (_1)8 1./ 2 E - 127 .
(4.5)
This is 1'0111(" imr.s eXpre:sl)4d as (- 1)'" 0./ 2 1 - 1l7 to hR\,(, till' !18mI' hil\."1 as nor-
muli./l'd lIIunhers, Note tlmt tlenormali71'd nmnhl'rs II/we no hidcll'n oit sinc{'
the sluificnllds should lIot 01' nornMIi"e<I, AI:;o, although the trnt' value (Jf the
t'XI)(IlIl'lIt should haw lu'('n 0 - 127 = -127, the vahlp -126 was Sf'lt'Ch'd, sincp
t hI' snmllpst lIorlllalizt'd lIurnbpr i!> Fin = I. 2-1.l6. \Vit h r!l'nornmli1t'(llIInnlwrs
thl' slIlull('.st. rt'prl'Sl'ntabll' nnrnht'r is 2-23.2-126 = 2- 1 . 111 instl'l1d of 2- U6. Tit£>
addit.ion of t!cllormali7A'tllllllllb(>rs hI\." b(,(,11 h'rmNI grallnal underflow or rul,'ful
11I1llt'rflow, It dot>:. not dimillate nndl'rflov. I hnt it slIb"ltaut,lIIlIy rpdnc tlX' ap
l)('twI'('1I the slllall,'st rf'prpSl!ntabll' nllmhl'r nnd ero. This gap, of si' (> 2-14'1, it!
l'<lual to t.hl' distall<'t. b('tw('('n mlY two l'OnS('('ut.ivt. t!,'normalize<1 nUl1lht'r!> Rnd
is also thp distnnet' bl,twl'('n UII.\' two consc<"utivl' nOfllmli71'(1 llIunhl'rlo with fl\l'
smalll'..-t plliisibll' expOll<'nt. (1 - 127) = 126, liS iII\lst.rdtf'(1 in t hI' following
diagrmn:
S S hit.s - bi8S('(f "'\III)U<,ut F 2:1 bil.... - ulisij.,<JIl'li fral"titlU /
Ont of t ht, 256 combilU,t ions of t hI' I'XPOIII'lIt fidd, two mp rf'SI'r\'ed for pf'('ial
vahll'S. F = 0 is n,,",'r\'I'{1 for 'pro (with fraetioll / = 0) ulld deullrmnli7M
IIIllIIhl'fS (with fra('tion / I: 0). E = 255 is n'Sf'r\'NI for cx: (with fradion
/ = 0) anti :\iaN. (with fmetion / f. 0). fhes' spf'{'ial rl'prc,st'ntntiolls ar<'
furt h...r diS("lIssl'(l h,'low. For thl' rcmaillinJ!; C.XPOlll'lIts (i.t'., I $ E $ 254), the
\"Rh\l' of t he floating-point nlllllbt'r is in'n hy
TllI'rf' art' tWII lliffl'f('nrt.>s bet\\' "'II thl' IF FE sillgll' prt'('\Slon forlllat mid the
DFC short format. Thl' I'XpOIll'nt oil\.... is 12i instl'all of 2"-1 = 2 7 = 128. This
pro\'ido>s n larJ!;,'r ml\.ximlllll \-ahll' of th,' trlll' I'XpOIll'nt, 254 - 12i = 12; illstrad
of 254 - 128 = 126. yit>ltling a lar('r rallgl'. A similar I'lf{'(,t is 1Ic1lil'\'f'd bJ IIsillg
8 siJ!;lIificand (If 1./ insteatl of 0./, sineI.' this Rllds 1 to till' exponcnt. As a n'sult,
t hI' largt'St. and sm,IIIC'St. positi\l' nlllllhr[5 are
Denormalized
numbers
. . .
o
2- 126
2- 125
2- 121
F = ( 1)5 0 ./ 2- l26 .
( '1.6)
Dl'normnli"ed nlllnhrrs haw not bl'l'n included in all t lIP d!'.Signs of d[ith-
IIIl'tie units t.hat follow th(> IEEE standard. flUs is IIIwnly Ilut' to thr high cost
n.,..odat.cd wit.h their IInplellll'ntation, sill<'(' (,hI' rl'prt'Sf'utation of dcnommli.lt,cI
IIlllllbl'rs is different from that of norllll\lillod IIUlllbl'rs, ft'<)lIirillJ!; n morc compl(>x
tlpsigll and po..,..ibly a 10nJ!;er o\ll'rnll px{'('ntioll t.ime. E\'l'n dl'si,:!;ns lhnt impl£'-
IIIl'nt dpnortlld1i71'd II\lIIlbl'rs allow tht' programmer to avoid t Ill'ir use if fdStI'I'
('XI'('lIt iOIl is df'irNi.
Thl' IEEE stalldnrd also drfin, a sillgl{'-rxh'ndf'd forlllRt tu hp I'lIIplo)'r'd
wlll'n caknblting illtermecliat.t. r('sults within th.. evaluatioll of complex Ii.III(tions
like (,hc tnmscclldt'lltal und power fllllet.ion. The single-,>xtl'lldt'(l forlllat l'!Xtcnds
tlu' cxpolll'nt field from 8 to II bit.s and thl' signifknlld liPid frolll 23+ I to 32
or more bits (wit,hout a hitldl'n bit). Thus, t.he total length of a siuIt'-,'xtt'ndd
floating-point number is at. I"n....t 1+11+32-1.1 bits.
Thf'rr arl' two kinds of NaN (Not a Nlllul>t'r), till' sign.,ling (or trappillg)
NaN, sud t.ht' quiet (or nontrapping) NnN. Na!':s ure r£'prespnh'd in th(' sinh'-
prl'cision fOfllUlt h)' E ..255 and / 1:0 allnwing n large numlwr of pns...ihh' vahu's.
The most sinific8l1t hits of th.' fraction can be uS('{1 to distinguish betwl,,",u tht>
two kiuds of NuNs, Thl' rt'mnilling hits IIIn)' colltaill system-(h'pI'ndl'nt inform!,.
tion. All pxnmplp of a signulinl!; NnN is an uninit.i.llill'd \'uriublc. A silUuing
NaI\ srts t.hl' Invlllid operat.ion t'x(.l'ption l1al!; (s('(' Sf'{.tion .1.8) wht'llt'vpr lm)'
arit hmetk OIH'ration \\ it h this NaN as IUI opt.rlllid. is nttt'mpf£'11. In l'ontra...'it,
n quil't NaN does not. sd t.he Invalid opt'mt.illll ,'xCI'ption flag whell inYlIlv('d in
f '+ - ( ? _ ?-23 ) . ,,25-1-127 _ (1 ,,-24 ) ?12t1
rJ'1GZ - -.. - - - - . ..
and
F+. = I O . 2 1-127 _ ?-126
".In. - ..
('ompun'(l tll F.o:r = (1- 2- 2 .').2 127 find Fin = 2- 128 , n'SI".,-t.h'e(y, in the DEC
forlllat. The "XPUIII'lIt bias and significant! rallb(' "C[{' s'll'CtL-d so liS to allow
th£' rt'Ciproeal of allnormalizl'd lIlunbt>rs (in partkular, F,;;",) to be reprt'sl'nt('(1
without o\'l'rf1ov.'. This r('(luin'lIIt'lIt is not satisfi('(1 for F';;in in till' DEC format.
Finally. a fl'v.' eomml'nts :ibollt th(' spl'Ciul \'alu{'S that can bp r('I}((>sl'n(,('d
ill till' IFEF format. dnd which are summarized ill the followill hlble.
/=0 /iO
£=0 0 Dl'normaIi71't1
E = 255 f:oc NnN
Opl'rst.iollS dpilling v. ith thr \'nhlf' I"X! that ar(' reprejl'lItt,<1 by / = 0, E =
255, and S = 0.1 1I111"'t. ohey till' tradit.iollal 1116thematical con\'l'ntiolls such a.:;.
F + oc = , FIx: = 0, ,.te. Tht> dl'nnrmnli7..ed numlu'rs provide repn"st'ntlltions
for vallll'.'i slIIall..r thall the smdllcst lIonnali1(d IIlIInOt'r, I"wpring thl' prohability
of an eXp"llI'nt uut!I'rllllw. DI'lIormali/l'<I I\\lInbprs 8r,' fl'pr'>SI'ntl'd hy F = O.
IIlId their \'alu' is \'t'n b)o'
70
4. Binary Floot1ng-PoInt Numbers
4.5 Round-off Schemes
71
an aritlmwt.ic operation. A signaling NaN hlrtlS into a quid NaN wh(n uscd us
an olwrand for I\n arithl1lf'tk operation if t hc In\'alid 0p('rat.ion t rap is ,1ilbhl,
to avoid sl'Uing thr Invalid 0pl'ration fll\g 8J!:/lin latN on. A qnipt NaN is al9)
proch..."" wlll'n an iJl\'i\li,1 opl'ratioll slIch as 0, OQ is '1ttempted, sillCt> this 0l)pm-
t.ion hnd alrl'ady Sf't thf' 11I\'i\lid o!)pmtion flng OIlCI'. Th,> fraction '11'111 ill a qlli,'t
NaN may contaill a poilltl'r to till' offf'ndillg 1i11l' of code, A qllil't Na:\, whcn
lIsed a.s an opl'rand of an arit.lmwtic opNation will producl' tlw samf' quipt Na.'IJ
8.'< a result anel will not s<>t any eXl'l'pt ion IIag. Fur ('xampl(" u!'\ +5=NaN. If
bot h op,'rands of 8n aritlllllctic opf'ration arl' quipt NnNs, t hI' rC'Sult will p'llml
lIll' NaN with th!' smalll'st sigmfkand,
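The field layout and the value rules of Equations (4.5) and (4.6) can be checked with a short script. The following is a minimal sketch and not part of the standard: the function name `decode_single` and the shape of its result are illustrative choices, and Python's `struct` module is assumed as a convenient way to obtain the 32-bit pattern of a value.

```python
import struct

def decode_single(x):
    """Split x, stored as an IEEE single, into (S, E, f) and classify it."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    s = bits >> 31                 # 1 sign bit
    e = (bits >> 23) & 0xFF        # 8-bit biased exponent E
    f = bits & 0x7FFFFF            # 23-bit unsigned fraction f
    if e == 0:
        kind = "zero" if f == 0 else "denormalized"
        value = (-1) ** s * (f / 2**23) * 2.0 ** -126           # Eq. (4.6)
    elif e == 255:
        kind = "infinity" if f == 0 else "NaN"
        value = None               # no numeric value is reconstructed here
    else:
        kind = "normalized"
        value = (-1) ** s * (1 + f / 2**23) * 2.0 ** (e - 127)  # Eq. (4.5)
    return s, e, f, kind, value
```

For instance, `decode_single(2.0 ** -149)` reports the smallest denormalized number ($E = 0$, $f = 1$), and `decode_single(2.0 ** -126)` reports the smallest normalized number ($E = 1$, $f = 0$).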
4.4.2 Double-Precision Format
The main consideration for the double-precision format is range. Consequently, the exponent field is increased to 11 bits, yielding the following format:
S | 11 bits - biased exponent $E$ | 52 bits - unsigned fraction $f$
The extreme values of $E$, i.e., 0 and 2047, are reserved for the same purposes as in the single-precision format. The value of a floating-point number with an exponent $E$ in the range $1 \le E \le 2046$ is
$$F = (-1)^S \; 1.f \cdot 2^{E-1023}. \tag{4.7}$$
This format, as well as the single-precision format, is summarized in Table 4.4. A double-extended format is also defined in the IEEE standard. It extends the exponent field from 11 to 15 bits and the significand field from 52+1 to 64 or more bits (i.e., without a hidden bit), and consequently the total number of bits in the double-extended floating-point format is at least 1+15+64=80. The interested reader is referred to several articles describing the details of the IEEE floating-point standard that appear in the IEEE Computer Magazine, Vol. 14, March 1981.
                             Single                                  Double
Word length                  32 bits                                 64 bits
Fraction + hidden bit        23 + 1 bits                             52 + 1 bits
Exponent                     8 bits                                  11 bits
Bias                         127                                     1023
Approximate range            $2^{128} \approx 3.4 \cdot 10^{38}$     $2^{1024} \approx 1.8 \cdot 10^{308}$
Smallest normalized number   $2^{-126} \approx 10^{-38}$             $2^{-1022} \approx 10^{-308}$
Approximate resolution       $2^{-23} \approx 10^{-7}$               $2^{-52} \approx 10^{-15}$
TABLE 4.4 The single and double IEEE floating-point formats.
4.5 ROUND-OFF SCHEMES
The accuracy of results obtained in a floating-point arithmetic unit is limited even if the intermediate results calculated in the arithmetic unit are accurate. The number of computed digits may exceed the total number of digits allowed by the format, and we have to dispose of the extra digits before the final results are stored in a user-accessible register or in the memory. For example, when multiplying two significands each of length $m$, a product of length $2m$ is generated and we must round it off to $m$ digits.
When selecting a round-off scheme we need to consider the following:
1. Accuracy of results (numerical considerations).
2. Cost of implementation and speed (machine considerations).
Let $x$ and $y$ be real numbers and let $Fl$ be the set of machine representations in a given floating-point format. Denote by $Fl(x)$ the machine representation of $x$. When rounding real numbers to machine representations the following conditions should be satisfied:
1. $Fl(x) \le Fl(y)$ whenever $x \le y$.
2. If $x \in Fl$ then $Fl(x) = x$.
3. If $F_1$ and $F_2$ are two consecutive numbers in $Fl$ such that $F_1 \le x \le F_2$, then either $Fl(x) = F_1$ or $Fl(x) = F_2$.
Let $d$ denote the number of extra digits that are kept in the arithmetic unit (in addition to the $m$ significand digits) before rounding is performed. For convenience, let us assume that there is a radix point between the $m$ most significant digits (of the significand) and the $d$ extra digits. Thus, we will investigate ways to round numbers like $2.99_{10}$ and obtain an integer.
The simplest scheme is called truncation or chopping and is illustrated in Figure 4.3 [11]. We remove the $d$ extra digits with no change in the $m$ remaining digits. For a given $F_1 \le x \le F_2$, $Trunc(x)$ results in rounding toward zero, yielding the smaller of $F_1$ and $F_2$. For example, the decimal number $x = 2.99$, when rounded toward zero, yields 2. This is a fast method that does not require any extra hardware, but its numerical performance is very poor. The error introduced by truncation can be almost as large as ulp (the weight of the least-significant digit of the significand).
FIGURE 4.3 The truncation scheme. [Plot of $Trunc(x)$ against $x$, for $x$ from 00.0 to 11.1.]
The curve for $Trunc(x)$ lies entirely below the ideal line (the dotted line in Figure 4.3) which provides infinite precision. We say that truncation has a negative bias where the bias, in general, measures the tendency of a round-off scheme to favor errors of a particular sign. Clearly, we would like to have a round-off scheme that is unbiased, or has a very small bias. To compare the bias of truncation to that of other rounding schemes quantitatively, we define the bias for a given $d$ as the average error for a set of $2^d$ consecutive numbers, where $Error = Trunc(x) - x$ and a uniform distribution for the significands is assumed.
For example, the rounding errors when truncating with $d = 2$ are shown in Table 4.5. In this table, $X$ is any significand. The sum of errors for all $2^d = 4$ consecutive numbers is $-3/2$. Therefore, the bias for $d = 2$, which is the average error, equals $-3/8$.
Number   Trunc(x)   Error
X.00     X.         0
X.01     X.         -1/4
X.10     X.         -1/2
X.11     X.         -3/4
TABLE 4.5 The rounding errors for the truncation scheme with $d = 2$.
A more accurate scheme is the round-to-nearest scheme (commonly known as just rounding), which for $F_1 \le x \le F_2$ yields the nearer to $x$ out of $F_1$ and $F_2$. It is obtained by adding $0.1_2$ (or in general, adding half a ulp) and retaining only the integer part of the sum (chopping the fraction). For example, to round off the decimal number $x = 2.99$, we add 0.5 and chop off the fractional part of 3.49, obtaining 3. The maximum error would occur when $x = 2.50$. Rounding it off yields $2.50 + 0.50 = 3.00$, finally resulting in 3, with an error of 0.5. Round-to-nearest is used in many arithmetic units and its curve is shown in Figure 4.4 [11]. Note that for performing the round-to-nearest scheme, a single extra digit (i.e., $d = 1$) is sufficient.
FIGURE 4.4 The round-to-nearest scheme. [Plot of $Round(x)$ against $x$, for $x$ from 00.0 to 11.1.]
The curve of $Round(x)$ is nearly symmetric with respect to the ideal line, which is a substantial improvement over truncation. However, in the point $X.10$ we would always round up (this is indicated in Figure 4.4 by the heavy dots), and over a large set of operations we may get a slight positive bias. Exactly how big can be calculated from Table 4.6. The sum of errors for all $2^d = 4$ numbers is $1/2$, and thus the bias is $1/8$, which is smaller than the bias of the truncation scheme. The same sum of errors (i.e., $1/2$) is obtained for $d > 2$ and consequently the bias, in general, is $\frac{1}{2} \cdot 2^{-d}$. This indicates that the positive bias is due to the rounding up of $X.10$. In order to obtain an unbiased rounding we round, in case of a tie (i.e., $X.10$), to the one out of $F_1$ and $F_2$ whose least-significant bit is 0 (i.e., the even one). This way we would alternately round up and down. The obtained scheme is called the round-to-nearest-even scheme and is illustrated in Figure 4.5 [11].
Number   Round(x)   Error
X.00     X.         0
X.01     X.         -1/4
X.10     X.+1       +1/2
X.11     X.+1       +1/4
TABLE 4.6 The rounding errors for the round-to-nearest scheme with $d = 2$.
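The bias figures quoted for these schemes can be reproduced by enumerating the extra-digit patterns. The sketch below is illustrative only. It assumes that a number $x$ with $d$ extra bits is held as a nonnegative integer `n` with $x = n / 2^d$; the function names are hypothetical, and exact rational arithmetic from Python's `fractions` module is used so that no rounding error enters the averages.

```python
from fractions import Fraction

def trunc(n, d):
    return n >> d                       # drop the d extra bits

def round_nearest(n, d):
    return (n + (1 << (d - 1))) >> d    # add half a ulp, then chop

def round_nearest_even(n, d):
    q, r = n >> d, n & ((1 << d) - 1)   # integer part and extra bits
    half = 1 << (d - 1)
    if r > half or (r == half and q & 1):
        q += 1                          # a tie is rounded up only if q is odd
    return q

def bias(scheme, d, groups=2):
    """Average of scheme(x) - x over `groups` runs of 2**d consecutive values."""
    count = groups * 2**d
    total = sum(Fraction(scheme(n, d)) - Fraction(n, 2**d) for n in range(count))
    return total / count
```

Two groups are averaged by default because the tie rule of round-to-nearest-even depends on whether the retained part is even or odd. With $d = 2$ the sketch gives $-3/8$ for truncation, $+1/8$ for round-to-nearest, and 0 for round-to-nearest-even.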
FIGURE 4.5 The round-to-nearest-even scheme. [Plot of the round-to-nearest-even result against $x$, for $x$ from 00.0 to 11.1.]
To compute the bias of the round-to-nearest-even scheme for $d = 2$ we have to consider two groups of size $2^d = 4$ as shown in Table 4.7. The sum of errors is $-1/2$ for the first group and $+1/2$ for the second. Thus, the average bias is 0. The round-to-nearest-even scheme is mandatory in the IEEE floating-point standard. Another possible modification to the round-to-nearest scheme, which also yields an unbiased rounding, is, in case of a tie, to choose out of $F_1$ and $F_2$ the one whose least-significant bit is 1. This is known as the round-to-nearest-odd scheme.
Number   Round(x)   Error      Number   Round(x)   Error
X0.00    X0.        0          X1.00    X1.        0
X0.01    X0.        -1/4       X1.01    X1.        -1/4
X0.10    X0.        -1/2       X1.10    X1.+1      +1/2
X0.11    X1.        +1/4       X1.11    X1.+1      +1/4
TABLE 4.7 The rounding errors for the round-to-nearest-even scheme with $d = 2$.
Although the round-to-nearest schemes have a good numerical "performance," their main disadvantage is that they require a complete add operation, since the carry from the least-significant bit may propagate across the entire significand. To avoid this time-consuming carry-propagation, it has been suggested [10] to use a ROM (read-only memory), which would hold a look-up table for rounded results. For example, a ROM with $l$ address lines would have as input the $(l - 1)$ least significant bits (out of the $m$ bits) of the significand, and only the most significant bit out of the $d$ extra bits, as depicted in Figure 4.6.
FIGURE 4.6 An implementation of the ROM rounding scheme. [Block diagram: the $(l - 1)$ low-order significand bits and the most significant extra bit address a ROM of $2^l \times (l - 1)$ bits, whose output replaces the $(l - 1)$ low-order bits.]
The ROM in this figure has $2^l$ rows of $(l - 1)$ bits each, containing the correctly rounded $(l - 1)$-bit result except when all $(l - 1)$ low-order bits of the significand are 1's. In this case, the ROM returns all 1's (thus effectively truncating the result instead of rounding it) and avoids the full addition. If, for example, $l = 8$, the table look-up would be very fast and yet 255 out of 256 cases would be correctly rounded. This rounding, for $l = 3$, is illustrated in Figure 4.7 [11].
FIGURE 4.7 The ROM rounding scheme with $l = 3$. [Plot of $ROM(x)$ against $x$, for $x$ from 00.0 to 11.1.]
The bias of ROM rounding for $l = 3$ and $d = 1$ can be calculated from Table 4.8. The sum of errors is $+1/2$ for the first three groups of size $2^d = 2$ and is $-1/2$ for the fourth group. The average bias is therefore $1/8$. In the general case, for given values of $d$ and $l$, the average bias is $\frac{1}{2}\left[\left(\frac{1}{2}\right)^d - \left(\frac{1}{2}\right)^{l-1}\right]$. This bias converges to $\frac{1}{2}\left(\frac{1}{2}\right)^d$ (e.g., $\frac{1}{4}$ for $d = 1$) when $l$ is large enough, since ROM rounding converges to the round-to-nearest scheme. Hence, if the round-to-nearest-even modification is adopted, the bias of the modified ROM rounding converges to zero.
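The ROM rounding scheme described above can be sketched as a table look-up. This is a minimal illustration under stated assumptions, not a description of any particular hardware design: a number with $d$ extra bits is held as a nonnegative integer `n` with $x = n / 2^d$, the ROM address is formed from the $(l - 1)$ low-order significand bits followed by the most significant extra bit, and all names are hypothetical.

```python
from fractions import Fraction

def build_rom(l):
    """ROM with 2**l rows of (l-1) bits; address = (l-1 low bits, round bit)."""
    ones = (1 << (l - 1)) - 1
    rom = []
    for addr in range(1 << l):
        low, round_bit = addr >> 1, addr & 1
        # All 1's cannot be incremented without a carry, so that row truncates.
        rom.append(low if low == ones else low + round_bit)
    return rom

def rom_round(n, d, l, rom):
    """Round n / 2**d to an integer using only the look-up table."""
    sig = n >> d                            # the significand bits
    round_bit = (n >> (d - 1)) & 1          # most significant extra bit
    low = sig & ((1 << (l - 1)) - 1)        # (l-1) low-order significand bits
    return (sig - low) | rom[(low << 1) | round_bit]

def rom_bias(d, l):
    """Average error over one full period of the low-order bit patterns."""
    rom = build_rom(l)
    count = 1 << (l - 1 + d)
    total = sum(Fraction(rom_round(n, d, l, rom)) - Fraction(n, 2**d)
                for n in range(count))
    return total / count
```

For $l = 3$ and $d = 1$ the sketch returns a bias of $1/8$, and for $l = 8$ and $d = 1$ it differs from round-to-nearest in exactly one of the 256 address patterns.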
Number   ROM(x)   Error      Number   ROM(x)   Error
X00.0    X00.     0          X10.0    X10.     0
X00.1    X01.     +1/2       X10.1    X11.     +1/2
X01.0    X01.     0          X11.0    X11.     0
X01.1    X10.     +1/2       X11.1    X11.     -1/2
TABLE 4.8 The rounding errors for the ROM rounding scheme with $l = 3$ and $d = 1$.
The IEEE floating-point standard [9] includes four rounding modes. The default is the round-to-nearest-even mode. The remaining three are directed roundings: round toward zero (truncate), round toward $\infty$, and round toward $-\infty$. The round toward $\pm\infty$ modes are useful for Interval Arithmetic in which each real number $a$ is represented by two floating-point numbers $a_1$ and $a_2$ providing lower and upper bounds, respectively, for the real value. Thus, an operand is represented by an interval and all arithmetic operations operate on intervals [1]. For example, the add, subtract and multiply operations in interval arithmetic are defined as follows:
$$[a_1, a_2] + [b_1, b_2] = [a_1 + b_1, \; a_2 + b_2]$$
$$[a_1, a_2] - [b_1, b_2] = [a_1 - b_2, \; a_2 - b_1]$$
$$[a_1, a_2] \times [b_1, b_2] = [\min\{a_1 b_1, a_1 b_2, a_2 b_1, a_2 b_2\}, \; \max\{a_1 b_1, a_1 b_2, a_2 b_1, a_2 b_2\}]$$
In these computations, the lower bound (e.g., $a_1 + b_1$ in addition) is rounded toward $-\infty$ while the upper bound ($a_2 + b_2$ in addition) is rounded toward $\infty$. The intervals calculated for the final results will provide an estimate on the accuracy of the computation.
4.6 GUARD DIGITS
When multiplying two significands, we obtain a double-length result, and clearly not all the extra digits are needed for proper rounding. A similar situation arises when adding or subtracting two numbers that do not have the same exponent. The question now is, What is the smallest number of extra digits that we need within the arithmetic unit? These extra digits are used for rounding and for postnormalization in the case when leading zeros are obtained. In what follows, we will first consider multiply/divide operations, and then add/subtract operations, since the former are simpler to handle.
As we have seen in Section 4.2, if signed-magnitude fractions are used as significands, then no extra digits for postnormalization are needed for division. A single right shift operation may, however, be required. When multiplying two normalized fractional significands, at most one shift left is needed if $\beta = 2$ ($k$ positions shift when $\beta = 2^k$). Therefore, one guard digit (in the radix $\beta$ number system) is sufficient for postnormalization. A second guard digit is needed for rounding if the round-to-nearest scheme is adopted. Thus, a total of two guard digits suffice. These two digits are called the G (guard) and the R (round) digits. The same conclusion is reached when the significand is a signed-magnitude number in the range $[1, 2)$, as is the case in the IEEE standard. The proof of this statement is left as an exercise for the reader.
Implementing the round-to-nearest-even scheme requires an indicator to point out whether all the additional digits that were generated in the multiply operation are zero, in order to detect a tie. This indicator can be implemented as a single bit, which is the logical OR of all additional bits, and is known as the sticky bit. Thus, three bits, namely, G, R, and S (sticky) are sufficient, even for the round-to-nearest-even scheme. Computing the sticky bit when multiplying two significands does not require the generation of all the least significant bits of the product. The number of trailing zeros in the product of two binary significands is equal to the number of trailing zeros in the multiplier plus the number of trailing zeros in the multiplicand. Thus, the sticky bit should be set to 1 only if the expected number of trailing zeros in the double-length product is smaller than the number of the least significant product bits that are discarded. Other techniques for computing the sticky bit for the product are presented in [15].
The case of addition/subtraction of floating-point numbers is more complicated, especially when the final operation called for (after examining the sign bits) is subtraction. As before, we assume that the significands of the operands are normalized signed-magnitude fractions. Revisiting the subtract example in Section 4.2, it seems that all shifted-out digits of the subtrahend must be kept, and must participate in the subtract operation, in order to have all the necessary digits for the postnormalization step. This will require as many guard digits as the number of digits in the significand field, doubling the size of the significand adder/subtractor. However, if the significand of the subtrahend is shifted by more than one position to the right in the pre-alignment step, the resulting difference will have at most one leading zero. This implies that at most one of the shifted-out digits may be required for the postnormalization step.
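The trailing-zero rule for the sticky bit of a product, stated earlier in this section, can be sketched as follows. This is only an illustration under the assumption that the two significands are held as positive integers; the function names and the `discarded` parameter (the number of low-order product bits that are ORed into the sticky bit) are hypothetical.

```python
def trailing_zeros(n):
    """Number of trailing zero bits of a positive integer."""
    return (n & -n).bit_length() - 1

def product_sticky(a, b, discarded):
    """Sticky bit for the `discarded` low-order bits of a * b,
    obtained without forming the product itself."""
    return int(trailing_zeros(a) + trailing_zeros(b) < discarded)
```

A brute-force comparison against the low-order bits of the full product, for all pairs of small operands, is one way to gain confidence in the rule.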
Example 4.5
In the following subtract operation the significands of the two operands $A$ and $B$ are 12 bits long and the base is 2. Assume that the difference between the exponents is $E_A - E_B = 2$, requiring a 2-bit position shift of the subtrahend $B$ in the pre-alignment step.
A                   0.100000101100
B aligned           0.001100000001 10
A - B               0.010100101010 10
Postnormalization   0.101001010101
Exactly the same result is obtained even if only one guard bit participates in the subtraction and generates the necessary borrow. □
The situation is different if the most significant shifted-out bit is equal to zero, as illustrated in the following example.
Example 4.6
Consider the same two significands as in the previous example, but now assume that the difference between the exponents is $E_A - E_B = 6$. Thus, $B$'s significand is shifted by six positions to the right to align it with $A$'s significand.
A                   0.100000101100 000000
B aligned           0.000000110000 000110
A - B               0.011111111011 111010
Postnormalization   0.111111110111
If only one guard bit participated in the subtraction, then the four least significant bits of the result after the postnormalization step would be 1000 instead of 0111. □
In the above subtraction, a long sequence of borrows is obtained, and it seems that we may need all the additional digits in $B$ to guarantee that a borrow is generated. We might conclude that, in the worst case, we must double the number of digits. However, a careful analysis leads to the conclusion that it suffices to distinguish between two cases: (1) all additional bits (not including the guard bit) are zero, and (2) at least one of the additional bits is 1. To prove this, notice that all the extra digits in $A$ are zeros ($A$ was not preshifted); hence, the resulting three least significant bits in $A - B$ (011 in the above example) are independent of the exact position of the 1's in the extra digits of $B$. The only thing we need to know is whether a 1 was shifted out or not, and the sticky bit can be used for this purpose. If a 1 is shifted into it during the alignment, it will be set to 1; otherwise, it remains zero. It is therefore again the logical OR of all the extra bits of $B$. The sticky bit participates in the subtraction and generates the necessary borrow. Thus, using the two digits, G and S, the previous subtraction looks like
                                   G S
A                   0.100000101100 0 0
B aligned           0.000000110000 0 1
A - B               0.011111111011 1 1
Postnormalization   0.111111110111
The two bits, G and S, are sufficient for postnormalization. If, however, the round-to-nearest scheme is followed, an additional accurate bit is required. The sticky bit, which is only an indicator, cannot serve this purpose. Thus, three bits, namely, G, R, and S are required, as illustrated in the next example.
Example 4.7
Consider the following two significands, which are almost the same as those in the previous example, except for one bit in $B$ (the second of the shifted-out bits). As before, assume that the difference between the exponents is $E_A - E_B = 6$. The correct result should be
A                   0.100000101100 000000
B aligned           0.000000110000 010110
A - B               0.011111111011 101010
Postnormalization   0.111111110111
However, if we use only the G and S bits, we obtain
                                   G S
A                   0.100000101100 0 0
B aligned           0.000000110000 0 1
A - B               0.011111111011 1 1
Postnormalization   0.111111110111 1
The round bit after the postnormalization step should be zero, and the sticky bit cannot be used for rounding. We must use the three bits, G, R, and S, as shown below:
                                   G R S
A                   0.100000101100 0 0 0
B aligned           0.000000110000 0 1 1
A - B               0.011111111011 1 0 1
Postnormalization   0.111111110111 0
The correct R bit, with a value of 0, is now available, and can be used in the round-to-nearest scheme. Note that if the round-to-nearest-even method is followed, the sticky bit, which is needed to detect a tie, is already available. This bit, therefore, serves two purposes. □
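The three examples above can be replayed with a few lines of integer arithmetic. The following is a sketch under stated assumptions rather than a model of a real datapath: significands are held as $m$-bit integers, `keep` is the number of extra bit positions retained after alignment, the last retained position acts as the sticky bit when `sticky` is true, and the function names are hypothetical. It also assumes that the aligned $B$ is smaller than $A$.

```python
def align(b, shift, keep, sticky=True):
    """Shift b right by `shift`, retaining `keep` extra bit positions."""
    wide = b << keep
    aligned = wide >> shift
    lost = wide & ((1 << shift) - 1)       # bits shifted past the last position
    if sticky and lost:
        aligned |= 1                       # OR them into the sticky position
    return aligned

def subtract(a, b, shift, m=12, keep=3, sticky=True):
    """Return (significand, round bit, OR of remaining extra bits) of A - B."""
    diff = (a << keep) - align(b, shift, keep, sticky)
    while diff and diff < (1 << (m + keep - 1)):
        diff <<= 1                         # postnormalization
    extra = diff & ((1 << keep) - 1)
    round_bit = extra >> (keep - 1)
    rest = extra & ((1 << (keep - 1)) - 1)
    return diff >> keep, round_bit, int(rest != 0)

A = 0b100000101100
B_EX46 = 0b110000000110    # operand B of Examples 4.5 and 4.6
B_EX47 = 0b110000010110    # operand B of Example 4.7
```

Calling `subtract(A, B_EX47, 6, keep=2)` yields a round bit of 1, whereas `subtract(A, B_EX47, 6, keep=3)` yields the correct round bit of 0, which is the point made in Example 4.7. Setting `keep` equal to the shift amount with `sticky=False` keeps every shifted-out bit and serves as the exact reference.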
If no postnormalization is required then, for rounding purposes, we do not need the three bits G, R, and S, but should use only two bits, a round bit and a sticky bit. The original G bit can serve as an R bit, and the original R and S bits must be ORed in order to generate a new sticky bit. We are then left with two bits, which we call R and S, for the round-to-nearest-even procedure. This situation is illustrated in the following example.
Example 4.8
In the following subtraction, no postnormalization is needed. The difference between the exponents is $E_A - E_B = 6$.
A                        0.100001010100
B                        0.110000011001
                                        G R S
A                        0.100001010100 0 0 0
B aligned                0.000000110000 0 1 1
A - B                    0.100000100011 1 0 1
                                        R S
Before rounding          0.100000100011 1 1
After round-to-nearest   0.100000100100
□
In the procedure above, if the new R bit is zero then no rounding is required and the sticky bit only indicates whether the final result is exact (if $S = 0$) or inexact (if $S = 1$). If $R = 1$ then the operation to be performed in the round-to-nearest-even procedure depends on the sticky bit and on the least-significant bit (denoted by $L$) of the resultant significand. If the sticky bit equals 1 then rounding must be performed by adding a ulp to the significand. If the sticky bit equals 0 then this is the tie case and only if $L = 1$ is rounding necessary. The above rules can be stated more succinctly by saying that the round-to-nearest-even scheme requires the addition of a ulp to the significand if the Boolean expression
$$R \cdot S + R \cdot \bar{S} \cdot L = R \cdot (S + L)$$
equals 1.
The addition of a ulp to the significand is sometimes needed even when a directed rounding mode is followed. For example, in the round toward $+\infty$ mode, a ulp must be added to the significand if the result is positive and either R or S equals 1. A similar situation occurs in the round toward $-\infty$ mode when the result is negative and the Boolean expression $R + S$ equals 1. The rules for all four rounding modes mandated by the IEEE floating-point standard are shown in Table 4.9.
(a) Round-to-nearest-even scheme
LSB   R   S   Operation   Error
0     0   0   +0          0
0     0   1   +0          -0.25 ulp
0     1   0   +0          -0.50 ulp
0     1   1   +0.5 ulp    +0.25 ulp
1     0   0   +0          0
1     0   1   +0          -0.25 ulp
1     1   0   +0.5 ulp    +0.50 ulp
1     1   1   +0.5 ulp    +0.25 ulp
Average error             0
(b) Round-to-zero scheme
R   S   Operation   Error
0   0   +0          0
0   1   +0          -0.25 ulp
1   0   +0          -0.50 ulp
1   1   +0          -0.75 ulp
Average error       -0.375 ulp
(c) Round-to-plus-infinity scheme
Sign   R   S   Operation
+      0   0   +0
+      0   1   +1 ulp
+      1   0   +1 ulp
+      1   1   +1 ulp
-      0   0   +0
-      0   1   +0
-      1   0   +0
-      1   1   +0
(d) Round-to-minus-infinity scheme
Sign   R   S   Operation
-      0   0   +0
-      0   1   +1 ulp
-      1   0   +1 ulp
-      1   1   +1 ulp
+      0   0   +0
+      0   1   +0
+      1   0   +0
+      1   1   +0
TABLE 4.9 The rules of all four rounding schemes.
The addition of a ulp that comes after the two significands have been added substantially increases the execution time of a floating-point add/subtract operation. The extra delay due to rounding can be avoided because all three guard bits are known before the significands are added. Thus, the addition of 1 to $L$ can be done at the same time that the significands are added. The exact position of the $L$ bit is not known yet, since a postnormalization may be required. However, the $L$ bit has only two possible positions and two adders can therefore be used in parallel, providing the correctly rounded results for both cases [7]. This can also be achieved using one adder as described in the next section.
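The rounding rules of Table 4.9 reduce to a small decision function on the sign and on the $L$, $R$, and $S$ bits. The sketch below is illustrative: the mode names are hypothetical labels, `sign` is 0 for a positive result and 1 for a negative one, and the function only decides whether a ulp is added to the significand magnitude.

```python
def needs_ulp(mode, sign, L, R, S):
    """Return 1 if a ulp must be added to the significand, else 0."""
    if mode == "nearest-even":
        return R & (S | L)                  # R.S + R.S'.L = R.(S + L)
    if mode == "toward-zero":
        return 0                            # truncation never increments
    if mode == "toward+inf":
        return (R | S) if sign == 0 else 0
    if mode == "toward-inf":
        return (R | S) if sign == 1 else 0
    raise ValueError(mode)
```

In Example 4.8 the significand before rounding is 0.100000100011 with $R = 1$ and $S = 1$, so `needs_ulp("nearest-even", 0, 1, 1, 1)` is 1 and the rounded significand becomes 0.100000100100.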
4.7 FLOATING-POINT ADDERS
The pr(1I "durl' folluwed wlll'\1 'L(lding two flontin-J>o\l1t nllUlb p r8, dl"pi('tl'(l ill
Figure 1.1, indudCtl 8 larf!;t" numher of sttJps whidl .Lr' ex(" "IIt.d Sf'(jl\l'\J1 iaUy.
82
4 Bi"lary FJoatfng-Point Numbers
4 7 FlOating-poInt adders
83
However, a careful examination of the procedure reveals that not all of these
steps must be executed for each operation, and that some of the steps can be
executed in parallel.
To this end, we distinguish between effective addition and effective
subtraction [2]. The effective operation depends on the sign bits of the two
operands and the instruction to be executed. For effective addition, we first
calculate the exponent difference to determine the alignment shift. We then
shift the significand of the smaller operand, and add the aligned significands.
The result of the addition can overflow by at most one bit position. Thus, a
long postnormalization shift, which would be time-consuming, is not needed. The
single bit overflow can be easily detected and, if it is found, a 1-bit
normalization of the result is performed using a multiplexor.
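The two ideas of this paragraph can be sketched in a few lines of Python. This
is a minimal illustration under stated assumptions: significands are modeled
as unsigned integers of `width` bits, the bits shifted out during alignment are
simply dropped (no guard bits, no rounding), and the function names are
hypothetical.

```python
def effective_operation(sign_a: int, sign_b: int, instruction: str) -> str:
    """Combine the two sign bits with the instruction ('add' or 'sub')."""
    if instruction not in ("add", "sub"):
        raise ValueError(instruction)
    # Differing signs turn an add into a subtract and vice versa.
    flip = (sign_a ^ sign_b) ^ (instruction == "sub")
    return "sub" if flip else "add"


def effective_add(sig_a: int, exp_a: int, sig_b: int, exp_b: int, width: int):
    """Effective addition of two width-bit significands (guard bits ignored)."""
    if exp_a < exp_b:                      # make operand a the larger one
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    shift = exp_a - exp_b                  # alignment shift
    total = sig_a + (sig_b >> shift)       # add the aligned significands
    if total >> width:                     # single-bit overflow detected
        total >>= 1                        # 1-bit normalization (multiplexor)
        exp_a += 1
    return total, exp_a
```

For example, `effective_operation(0, 1, "sub")` is an effective addition, and
`effective_add(0b1000, 3, 0b1000, 3, 4)` overflows by one bit and returns
`(0b1000, 4)`.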
To eliminate the need for an increment operation in the rounding step,
the above-mentioned significand adder is designed to produce two simultaneous
results, sum and sum+1. An adder capable of producing the two results sum and
sum+1 is sometimes called a compound adder [13], and it can be implemented in
various ways, including carry-look-ahead and conditional sum (see Chapter 5).
For the IEEE round-to-nearest-even rounding mode, we use the rounding bits
to determine which of these two results should be selected. Note that these two
results (i.e., sum and sum+1) are sufficient even if a single bit overflow occurs.
In case of an overflow, the 1 will be added in the R bit position (instead of in
the least significant bit (LSB) position), and since R = 1 if rounding is needed,
a carry will be propagated into the LSB position to generate the correct value
of sum+1. However, for the two directed rounding modes (round to $\pm\infty$), the
R bit is not necessarily 1 and, as a result, sum+2 may be needed in the case of
the 1-bit overflow.
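The selection between sum and sum+1 can be illustrated with a short Python
sketch. It covers only the round-to-nearest-even mode without the 1-bit
overflow case; the function names are hypothetical, and the two results that
hardware would produce in parallel are simply computed one after the other.

```python
def compound_add(a: int, b: int) -> tuple[int, int]:
    """Model of a compound adder: returns (sum, sum + 1)."""
    s = a + b
    return s, s + 1


def round_nearest_even_sum(a: int, b: int, R: int, S: int) -> int:
    """Select the properly rounded result from the two precomputed sums.

    a and b are aligned significands truncated at the LSB position;
    R and S are the round and sticky bits lying below the LSB.
    """
    total, total_plus_1 = compound_add(a, b)
    L = total & 1                          # LSB of the unrounded sum
    return total_plus_1 if R & (S | L) else total
```

With a = 1010 and b = 0001 the unrounded sum 1011 is odd, so a tie (R = 1,
S = 0) selects sum+1 = 1100; with b = 0010 the sum 1100 is already even and
the same tie selects sum.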
In effective subtraction, massive cancellation of the most significant bits
may occur, resulting in the need for a lengthy postnormalization step. However,
this can happen only when the exponents of the two operands are close (i.e., the
difference is less than or equal to 1), in which case the pre-alignment step can be
eliminated. It therefore makes sense to implement effective subtraction as two
separate procedures: one for the case where the exponents are close, and one for
the case where the exponent difference is greater than 1. In the CLOSE case
only a postnormalization shift may be required, while in the FAR case only a
pre-alignment shift may be needed. The required steps in the CLOSE and FAR
cases are shown in Table 4.10.
Step  CLOSE                              FAR
1     Predict exponent                   Subtract exponents
2     Subtract significands              Align significands
      Predict number of leading zeros
3     Postnormalization                  Subtract significands
4     Select properly rounded result     Select properly rounded result
      or negate result
TABLE 4.10 The steps in the CLOSE and FAR cases.
In the CLOSE case we first predict the exponent difference, based on the
two least significant bits of the operands, to allow the subtraction of the
significands to start as soon as possible. If the predicted difference is zero,
a subtract will be executed with no alignment. If the predicted difference is
$\pm 1$, the significand of the smaller operand is shifted once to the right (using
a multiplexor) and then subtracted from the other significand. At the same time,
the true exponent difference is calculated. If this difference is greater than 1,
the above procedure is aborted and the FAR procedure is followed. If, however,
the magnitude of the true exponent difference is less than or equal to 1, the
CLOSE procedure is continued. In parallel with the subtraction, the number of
leading zero bits is predicted, in order to determine the number of shift
positions in the postnormalization step. The normalization of the significand
and the corresponding exponent adjustment are done next. Finally, rounding is
performed in step 4. As mentioned above, rounding can be accomplished by
precomputing sum and sum+1 and then selecting the one which is properly rounded.
In step 4 a negation of the result may also be needed. Since the subtract
operation is almost always executed so that the smaller (pre-shifted) significand
is subtracted from the larger one, the result of the subtraction is usually
positive and negation is not required. Note that the sign of the final result is
determined by the sign of the largest operand. Only in the case where the
exponents of the two operands are equal, the result of the significand
subtraction may be negative (represented in two's complement), requiring a
negation step. However, in such a case, no pre-alignment is performed, and
consequently, no guard bits are generated (the result is exact) and hence
no rounding is necessary. Thus, the negation and rounding steps are mutually
exclusive.
In the FAR case the exponent difference is calculated first, and then the
significand of the smaller operand is shifted to the right to align it with the
other significand. The shifted-out bits are used to set the sticky bit. The smaller
significand is now subtracted from the larger significand, with the result being
either normalized or requiring a single-bit-position left-shift, which is
accomplished using a multiplexor. In step 4, rounding is performed.
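The pre-alignment shift of the FAR case, together with the generation of the
sticky bit, can be sketched as follows. This is a simplified Python model: all
shifted-out bits are ORed into one sticky bit (a real datapath would keep the
first shifted-out bits as separate guard and round bits), and the function name
is an assumption made for illustration.

```python
def align_right(significand: int, shift: int) -> tuple[int, int]:
    """Shift a significand right by `shift` positions.

    Returns the shifted significand and a sticky bit that is 1 if any
    of the shifted-out bits was nonzero.
    """
    if shift <= 0:
        return significand, 0
    shifted_out = significand & ((1 << shift) - 1)
    return significand >> shift, int(shifted_out != 0)
```

For example, `align_right(0b101101, 2)` returns `(0b1011, 1)` because the
discarded bits 01 are not all zero.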
We conclude this section with a brief description of the leading zeroes
prediction circuit. This circuit should predict the position of the leading
non-zero bit in the result of the subtract operation before the subtraction is
completed. This would allow us to execute the postnormalization shift immediately
following the subtraction. One way to achieve this is to examine the bits of the
two operands (of the subtract operation) in a serial fashion, starting with the
most significant bits, to determine the position of the first 1 [ ]. This serial
operation can be accelerated using a parallel scheme similar to the
carry-look-ahead technique (see Chapter 5).
Another way to predict the position of the leading 1 is to generate in
parallel a set of intermediate bits $e_i$ such that $e_i = 1$ if the corresponding
bits $a_i$ and $b_i$ of the two operands are identical and the previous bits, i.e.,
$a_{i-1}$ and $b_{i-1}$, allow the propagation of the expected carry (i.e., at least
one of the bits $a_{i-1}$ and $b_{i-1}$ equals 1). A carry is expected since the
subtract operation is executed by forming the one's complement of the subtrahend
and forcing a carry into the least significant position. (In the notation used in
Chapter 5, the Boolean expression for $e_i$ is
$e_i = \overline{a_i \oplus b_i}\,(a_{i-1} + b_{i-1})$ where $b_i$ is the complement
of the original subtrahend bit.) In other words, $e_i = 1$ if a carry is allowed to
propagate to position i. The corresponding ith bit of the correct result will also
be equal to 1 unless the forced carry from the least significant position did not
propagate to position i-1; in such a case, the correct result will have a 1 in
position i-1 instead. Thus, the position of the leading 1 in the result is either
identical to the position of the leading 1 in the sequence of the $e_i$ bits, or it
is one position to the right. We may therefore count the number of leading zeroes
in the sequence of the $e_i$ bits and provide this count to the barrel shifter
executing the postnormalization shift. After this, at most one bit position
correction shift (to the left) will be required [18]. A comparison of several
methods for predicting the position of the leading bit appears in [16].
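The $e_i$ bits can be modeled with ordinary integer bit operations. The Python
sketch below is an illustration under assumptions: the operands are n-bit
unsigned integers with a > b, the forced carry is treated as the "previous bit"
of position 0, and the function name is hypothetical.

```python
def predict_leading_one(a: int, b: int, n: int) -> int:
    """Predicted position of the leading 1 of a - b for n-bit a > b >= 0."""
    mask = (1 << n) - 1
    b_c = ~b & mask                        # one's complement of the subtrahend
    same = ~(a ^ b_c) & mask               # bit i set when a_i equals b_c_i
    # Bit i set when a_(i-1) or b_c_(i-1) is 1; position 0 sees the forced carry.
    carry_possible = (((a | b_c) << 1) | 1) & mask
    e = same & carry_possible              # the e_i bits
    return e.bit_length() - 1              # position of the leading 1 of e
```

For a = 1000 and b = 0001 the prediction is position 3 while the true difference
0111 has its leading 1 in position 2, which is the one-position correction
mentioned in the text; for a = 1000 and b = 0111 the prediction (position 0) is
exact.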
4.8 EXCEPTIONS
We will concentrate in this section on the exceptions specified in the IEEE
standard [9]. The 754 standard defines five types of exceptions: overflow,
underflow, division-by-zero, invalid operation and inexact result. The first
three exceptions are found in almost all floating-point systems. Only the last
two are peculiar to the IEEE standard.
When an exception occurs, a status flag is set and a specified result is
generated (e.g., a correctly signed $\infty$ when a division-by-zero occurs). The
status flag should remain set until explicitly cleared. The IEEE standard
recommends the implementation of a separate trap-enable bit for each exception.
If this bit is on when the corresponding exception occurs then, in addition to
setting the status flag, the user trap handler is called. Sufficient information
must be provided by the floating-point unit to the trap handler to allow it to
take the appropriate action, e.g., exact identification of the operation which
caused the exception.
Overflow. The overflow exception flag is set whenever the exponent of
the result exceeds the largest value allowed in the result's format. For example,
in the single-precision format an overflow occurs if E > 254. The result, when
the corresponding trap is disabled, is determined by the sign of the intermediate
(overflowed) result and the rounding mode as follows:
1. In the round-to-nearest-even mode an $\infty$ with the sign of the intermediate
result is generated.
2. In the round toward 0 mode the largest representable number with the sign
of the intermediate result is generated.
3. In the round toward $-\infty$ mode the largest representable number with a
plus sign is generated if the intermediate result is positive. Otherwise the
final result is set to $-\infty$.
4. In the round toward $+\infty$ mode the largest representable number with a
minus sign is generated if the intermediate result is negative. Otherwise
the final result is set to $+\infty$.
If the overflow trap is enabled then the trap handler receives the
intermediate result divided by $2^a$ and then rounded, where a is 192 or 1536 for
the single- and double-precision format, respectively. This scaling adjustment was
chosen in order to translate the overflowed result as nearly as possible to the
middle of the exponent range so that it can be used in subsequent operations
with less risk of causing further exceptions. For example, when multiplying
the number $2^{127}$ (for which E = 254 in the single-precision format) by $2^{127}$,
the overflowed product has an exponent of E = 254 + 254 - 127 = 381 after
being adjusted by 127, as is normally done for the multiply operation. This
result clearly overflows since E > 254. If this product is then scaled (multiplied)
by $2^{-192}$, the resulting exponent becomes E = 381 - 192 = 189 which
represents the "true" value of 189 - 127 = 62. This scaled intermediate value has a
smaller risk of causing further exceptions. Consider now the case where relatively
"small" operands result in an overflow. If we multiply the number $2^{64}$ (for which
E = 191 in the single-precision format) by $2^{65}$ (E = 192), the overflowed
product has an exponent of E = 191 + 192 - 127 = 256. If this exponent is then
adjusted by 192, we obtain E = 256 - 192 = 64 which represents the "true"
value of 64 - 127 = -63. Conversions (e.g., from decimal to binary) are handled
differently, see [9].
Underflow. The conditions under which the underflow flag is set depend
on whether the underflow trap is enabled or disabled. If the underflow trap
is enabled the underflow exception flag is set whenever the result is a nonzero
number between $-2^{E_{min}}$ and $2^{E_{min}}$. Recall that $E_{min}$ is -126 for the
single-precision format, and -1022 for the double-precision format. The intermediate
result delivered to the underflow trap handler is the infinitely precise result
multiplied by $2^a$ and then rounded. As for overflow, a equals 192 or 1536 for the
single- or double-precision format, respectively. Conversions are also handled
differently [9].
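The default overflow results and the exponent scaling used for the trap
handler can be sketched in Python for the single-precision format. This is an
illustrative model only: the function names and mode strings are assumptions,
and Python doubles are used merely to hold the single-precision values.

```python
import math

BIAS = 127                                  # single-precision exponent bias
ADJUST = 192                                # scaling exponent a (single precision)
MAX_SINGLE = (2 - 2.0**-23) * 2.0**127      # largest finite single-precision value


def overflow_result(negative: bool, mode: str) -> float:
    """Result delivered on overflow when the overflow trap is disabled."""
    sign = -1.0 if negative else 1.0
    if mode == "nearest_even":
        return sign * math.inf
    if mode == "toward_zero":
        return sign * MAX_SINGLE
    if mode == "toward_minus_inf":
        return -math.inf if negative else MAX_SINGLE
    if mode == "toward_plus_inf":
        return -MAX_SINGLE if negative else math.inf
    raise ValueError(f"unknown rounding mode: {mode}")


def product_exponent(e1: int, e2: int) -> int:
    """Biased exponent of a product before any overflow handling."""
    return e1 + e2 - BIAS


def trapped_overflow_exponent(e: int) -> int:
    """Biased exponent handed to the overflow trap handler."""
    return e - ADJUST
```

The two worked examples of the text follow directly: `product_exponent(254, 254)`
gives 381, which is scaled to 189, and `product_exponent(191, 192)` gives 256,
which is scaled to 64. For a trapped underflow the same constant would be added
instead of subtracted.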
If, however, the underflow trap is disabled, denormalized numbers are
allowed. The underflow exception flag is then set only when an extraordinary
loss of accuracy occurs while representing the intermediate result (which has a
nonzero value between $\pm 2^{E_{min}}$) as a denormalized number. Such a loss of
accuracy occurs when either the guard bit or the sticky bit is nonzero. These indicate
an inexact result. In an arithmetic unit where denormalized numbers are not
implemented the delivered result is either zero or $\pm 2^{E_{min}}$.
When both the guard bit and the sticky bit are zero, no precision was lost when
performing the rounding. The purpose of the inexact result flag is to allow
integer calculations to be performed in a floating-point unit.
4.9 ROUND-OFF ERRORS AND THEIR ACCUMULATION
The need to perform rounding in floating-point operations, even with the best
rounding schemes, results in errors that tend to accumulate as the number of
operations increases. The relative round-off error in a floating-point operation
is denoted by $\epsilon$ and is defined by the equation
Example 4.9
Suppose the underflow exception trap is disabled and denormalized numbers
have been implemented. If we multiply $2^{-65}$ by $2^{-65}$ the resulting
exponent is E = (127 - 65) + (127 - 65) - 127 = -3. Since E < 1 we cannot
represent the product as a normalized number. Instead, we represent the
result $2^{-130}$ as the denormalized number $0.0001 \cdot 2^{-126}$, i.e., f = .0001 and
E = 0. No underflow exception flag is set.
If the second operand is $(1 + ulp) \cdot 2^{-65}$ (rather than $2^{-65}$), the correct
product is $(1 + ulp) \cdot 2^{-130}$ which, when converted to a denormalized number,
yields f = .0001 and E = 0 as before, but now the sticky bit is equal to 1.
Therefore, this is an inexact result and the underflow exception flag is set.
□
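Example 4.9 can be reproduced with Python's standard `struct` module, which
converts a double to the IEEE single format. The sketch below assumes the
default round-to-nearest conversion; the helper names are hypothetical.

```python
import struct


def to_single(x: float) -> float:
    """Round a double to IEEE single precision and return it as a double."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def fields(x: float) -> tuple[int, int, int]:
    """Sign, biased exponent and fraction fields of x in single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


# Both products are exact in double precision.
exact = 2.0**-65 * 2.0**-65                          # 2^-130
inexact = ((1 + 2.0**-23) * 2.0**-65) * 2.0**-65     # (1 + ulp) 2^-130

# A denormalized single has E = 0; f = .0001 sets bit 19 of the fraction.
print(fields(exact), fields(inexact))
# The conversion loses information only for the second product.
print(to_single(exact) == exact, to_single(inexact) == inexact)
```

Both products map to the same denormalized bit pattern, but only the second
conversion discards nonzero bits, which is the condition under which the
underflow flag is set in the example.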
$$\epsilon = \frac{Fl(x * y) - (x * y)}{(x * y)} \qquad (4.8)$$
where $*$ is any one of the floating-point arithmetic operations +, -, ×, or /, and
$Fl(x * y)$ is the correctly rounded or truncated result of the operation $(x * y)$. A
more convenient form of Equation (4.8) is
$$Fl(x * y) = (x * y) \cdot (1 + \epsilon). \qquad (4.9)$$
Upper bounds for the relative error $\epsilon$ can be derived for the different
round-off schemes. For truncation, the absolute error can be almost as large as
the least-significant digit of the significand. The worst case for the relative error
is when the normalized result assumes its smallest possible value. Therefore,
$$|\epsilon_{trunc}| < \frac{2^{-m}}{1/2} = 2^{-m+1}.$$
For the round-to-nearest scheme, the maximum absolute error is only half of ulp,
and consequently
$$|\epsilon_{round}| \le \frac{1}{2} \cdot 2^{-m+1} = 2^{-m}.$$
Division by zero. The division-by-zero exception flag is set whenever
the divisor is zero and the dividend is a finite nonzero number. When the
corresponding trap is disabled the result must be a correctly signed $\infty$.
Invalid operation. The invalid operation flag is set if an operand is invalid
for the operation to be performed. The result, when the invalid operation trap
is disabled, is a quiet NaN. Examples of invalid operations are [9]:
1. Multiplying 0 by $\infty$.
2. Dividing 0 by 0 or $\infty$ by $\infty$.
3. Adding $+\infty$ and $-\infty$.
4. Finding the square root of a negative operand.
5. Calculating the remainder x REM y where y is zero or x is infinite.
6. Any operation on a signaling NaN.
Inexact result. The inexact result flag is set if the rounded result is not
exact or if it overflows without an overflow trap. A rounded result is exact only
when both the guard bit and the sticky bit are equal to zero.
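Some of the invalid operations enumerated in this section can be observed
directly in Python, whose `float` type is an IEEE double on common platforms
(an assumption of this sketch). Note that Python itself raises an exception for
`0.0 / 0.0` and for `math.sqrt(-1.0)` instead of returning a NaN, so only the
operations that yield a quiet NaN are shown.

```python
import math

inf = math.inf

# Each of these invalid operations delivers a quiet NaN.
results = {
    "0 * inf": 0.0 * inf,
    "inf / inf": inf / inf,
    "inf + (-inf)": inf + (-inf),
}

for name, value in results.items():
    print(name, "->", value)

# A quiet NaN propagates through subsequent operations.
propagated = results["0 * inf"] + 1.0
```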
The above formulas provide only upper bounds for the relative error.
What might be more important is the exact distribution of the relative error
within its bounds. This distribution has been studied (e.g., in [20]) and the
following density function of $\epsilon_{trunc}$ (Figure 4.8(a)) has been derived:
$$f_{\epsilon_{trunc}}(\epsilon) = \begin{cases} 2^{m-1}/\ln 2 & \text{if } 0 \le \epsilon < 2^{-m} \\ \left(\frac{1}{\epsilon} - 2^{m-1}\right)/\ln 2 & \text{if } 2^{-m} \le \epsilon < 2^{-m+1} \end{cases} \qquad (4.10)$$
In other words, the relative errors for truncation are uniformly distributed in
the region $[0, 2^{-m}]$ and reciprocally distributed in the region $[2^{-m}, 2^{-m+1}]$.
A more severe case of error accumulation may occur in the subtract
operation $A_1 - A_2$. The relative error in this case, under the same assumption as
above, equals
$$\frac{A_1^c}{A_1^c - A_2^c}\,\epsilon_1 - \frac{A_2^c}{A_1^c - A_2^c}\,\epsilon_2 .$$
If the operands $A_1$ and $A_2$ are positive numbers close to each other, the
accumulated relative error could increase significantly, especially if the two relative
errors $\epsilon_1$ and $\epsilon_2$ have opposite signs, resulting in a very substantial inaccuracy.
The accumulation of errors when a sequence of several floating-point
operations are performed depends on the particular set of operations performed; in
other words, it depends on the specific application. Therefore, the accumulation
of errors in the general case cannot be analyzed and simulative studies must be
used instead (e.g., [10], [12]). As might be expected, these studies have shown
that in most cases, the accumulated relative error when truncation is used is
higher than that when the round-to-nearest scheme is employed.
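The amplification of the relative error in a subtraction of nearly equal
operands can be checked with exact rational arithmetic. The Python sketch below
is only an illustration; the function name and the sample values are assumptions
chosen here, and no new rounding error is introduced by the subtraction itself.

```python
from fractions import Fraction


def difference_relative_error(a1c, a2c, e1, e2):
    """Relative error of A1 - A2 given correct values and relative errors."""
    a1 = a1c * (1 + e1)                    # A1 = A1^c (1 + e1)
    a2 = a2c * (1 + e2)                    # A2 = A2^c (1 + e2)
    return ((a1 - a2) - (a1c - a2c)) / (a1c - a2c)


a1c, a2c = Fraction(1), Fraction(999, 1000)
e1, e2 = Fraction(1, 10**6), Fraction(-1, 10**6)
err = difference_relative_error(a1c, a2c, e1, e2)
print(err)    # 1999/1000000
```

Two operand errors of one part in a million with opposite signs produce a
relative error of about two parts in a thousand in the difference, a growth by a
factor of roughly 2000.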
FIGURE 4.8 The density function of the relative error for (a) truncation, and
(b) rounding.
The average value of the relative error can now be calculated. This calculation
results in
$$\bar{\epsilon}_{trunc} = \frac{2^{-m-1}}{\ln 2} \approx 0.72 \cdot 2^{-m}.$$
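The average value can be checked numerically from the density function of the
truncation error, which equals $2^{m-1}/\ln 2$ for $0 \le \epsilon < 2^{-m}$ and
$(1/\epsilon - 2^{m-1})/\ln 2$ for $2^{-m} \le \epsilon < 2^{-m+1}$. The Python
sketch below uses a simple midpoint rule; the value m = 8, the step count and
the function names are arbitrary choices made for illustration.

```python
import math


def f_trunc(eps: float, m: int) -> float:
    """Density of the relative truncation error for an m-bit significand."""
    if 0 <= eps < 2.0**-m:
        return 2.0**(m - 1) / math.log(2)
    if 2.0**-m <= eps < 2.0**(-m + 1):
        return (1 / eps - 2.0**(m - 1)) / math.log(2)
    return 0.0


def integrate(g, lo: float, hi: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))


m = 8
upper = 2.0**(-m + 1)
total = integrate(lambda e: f_trunc(e, m), 0.0, upper)
mean = integrate(lambda e: e * f_trunc(e, m), 0.0, upper)
print(total, mean / 2.0**-m)
```

The density integrates to 1 and the mean divided by $2^{-m}$ comes out close to
$1/(2 \ln 2) \approx 0.72$, in agreement with the expression above.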
The density function of $\epsilon_{round}$ (see Figure 4.8(b)) is
$$f_{\epsilon_{round}}(\epsilon) = \begin{cases} 2^{m-1}/\ln 2 & \text{if } -2^{-m-1} \le \epsilon \le 2^{-m-1} \\ \left(\frac{1}{2|\epsilon|} - 2^{m-1}\right)/\ln 2 & \text{if } 2^{-m-1} < |\epsilon| < 2^{-m} \end{cases} \qquad (4.11)$$
The relative errors, when rounding, are uniformly distributed in the region
$[-2^{-m-1}, 2^{-m-1}]$ and reciprocally distributed elsewhere. Unlike Equation (4.10),
the density function in Equation (4.11) is symmetric with respect to $\epsilon = 0$ and,
as a result, the average relative error is 0. The analytically derived expressions
in Equations (4.10) and (4.11) were shown to be in very good agreement with
empirical results.
The above analysis concentrated on the round-off errors occurring in a
single floating-point operation. It did not indicate how these errors could
accumulate in subsequent operations. Consider, for example, two intermediate
results $A_1$ and $A_2$, which are to be added. Denote by $\epsilon_1$ and $\epsilon_2$ their
corresponding relative errors, satisfying
$$A_1 = A_1^c (1 + \epsilon_1), \qquad A_2 = A_2^c (1 + \epsilon_2)$$
where $A_1^c$ and $A_2^c$ denote the "correct" values of $A_1$ and $A_2$, respectively. If we
focus on the accumulated error and assume that no new error will be introduced
in the addition, then the relative error of the sum $A_1 + A_2$ becomes
$$\frac{A_1 + A_2}{A_1^c + A_2^c} - 1 = \frac{A_1^c}{A_1^c + A_2^c}\,\epsilon_1 + \frac{A_2^c}{A_1^c + A_2^c}\,\epsilon_2 .$$
Thus, the relative error of the sum is a weighted average of the relative errors
of the operands. If the two operands are positive, then the above error will be
dominated by the error in the larger operand.
4.10 EXERCISES
4.1. Consider the following 40-bit floating-point format: A sign bit, a 29-bit
normalized (fractional) significand in two's complement form, a 10-bit excess 512
exponent, and base 2 for the exponent. Determine the bit patterns of the
smallest and largest positive and negative numbers, and find their value. Also, find
the distance between any two consecutive numbers. How many different values
can you represent using this format?
4.2. Find the normalized internal machine representations of the following
floating-point numbers in the short IBM format, in the short DEC format and in the
short IEEE format: (a) $+0.25 \cdot 2^{+31}$ (b) $-31.75 \cdot 2^{-7}$
4.3. (a) Show that there are 1.875 times as many normalized floating-point numbers
with an exponent base of 16 as those with an exponent base of 2. Both use the
same number of bits in the exponent and significand fields.
(b) What is the ratio of representable normalized numbers for $\beta = 4$?
4.4. How many denormalized numbers are there in the short (32-bit) IEEE format,
and what is their range? Compare it to the number of normalized values with
the fixed exponent 1 (including bias).
4.5. Which of the following properties are satisfied in systems with denormalized
numbers, and which are satisfied even in systems with no denormalized numbers:
(a) $x \ne y$ implies $x - y \ne 0$. (b) $(x - y) + y \approx x$ (with a rounding error).
(c) For a normalized number x, if $1/x \ne 0$ then $1/(1/x) \approx x$ (with a rounding error).
4.6. Prove that the average bias of the ROM rounding scheme is
$\frac{1}{2}\left[\left(\frac{1}{2}\right)^d - \left(\frac{1}{2}\right)^{l-1}\right]$.
4.7. Four rounding schemes are supported by the IEEE standard: Round-to-nearest-
even, round toward zero (truncate), round toward infinity and round toward
$-\infty$. Calculate the bias of all four rounding schemes.
4.8. For the four rounding schemes supported by the IEEE standard (see problem 7)
show the final rounded results in the following three cases:
s  exponent  fraction                 guard
0  00011111  11111111111111111111111  1
0  11111110  11111111111111111111111  1
1  11111110  11111111111111111111111  1
4.9. Based on the results of problem 8, what is the advantage of the round-to-nearest-
even scheme? What is the disadvantage of the round-to-nearest-even scheme
(when implementation is considered), which can be avoided if a round-to-nearest-
odd scheme is adopted?
4.10. Write down the postnormalization steps that might be needed when performing
addition, subtraction, multiplication, and division with two floating-point
operands in the IEEE short format. Indicate how many guard digits are needed
in each case.
4.11. Show the result of the following operations on numbers in the IEEE short format
in all four rounding schemes (see problem 7). The operands are given in the
hexadecimal notation.
(a) 3F80 0000 + 0080 0000 (b) 3F80 0000 - 3F7F FFFF
(c) 3F80 0000 + 3380 0000 (d) 3F80 0000 0080 0000
(e) 4000 0001 x 4000 0001 (f) 4000 0000 :13800000
4.12. Two normalized floating-point numbers A and B in the short IEEE format were
added, and the result was equal to A. Does this imply that B = 0?
4.13. Given a floating-point number A with an exponent $E_A$ (in any format), its
successor has either the same exponent or the exponent $E_A + 1$. Is the distance
between A and its successor the same in both cases?
4.14. (a) Compare the error involved in the serial evaluation of the product of four
numbers, performed as $(((A_1 \times A_2) \times A_3) \times A_4)$ to that of its parallel evaluation
performed as $((A_1 \times A_2) \times (A_3 \times A_4))$. Decide whether one of the methods has
a smaller upper bound for the error when forming the product of n numbers.
(b) Repeat (a) for the sum of four numbers, then n numbers. Can we get lower
error bounds if we know that the numbers are in some order; e.g., ascending
order?
4.15. Prove that the optimal way to implement a two-level combinatorial shifter for
k bits, where $k = m^2$, is for the first level to shift by multiples of m, and the
second level to shift from 0 to m. Assume that the delay is proportional to the
number of destinations for each line in the two levels. Can you generalize this
result for any value of k?
4.16. Another way to implement a radix-4 combinatorial shifter for 53 bits is by
restricting the number of destinations for every bit in each level to 4 but allow more
than two levels. How many levels will such a combinatorial shifter have? How
many levels will a radix-r combinatorial shifter for m bits have if the number of
destinations for every bit in each level is restricted to r?
4.11 REFERENCES
[1] G. ALEFELD and J. HERZBERGER, Introduction to interval computations,
Academic Press, NY, 1983.
[2] B. J. BENSCHNEIDER et al., "A pipelined 50-MHz CMOS 64-bit floating-point
arithmetic processor," IEEE Journal of Solid-State Circuits, 24 (October 1989),
1317-1323.
[3] R. P. BRENT, "On the precision attainable with various floating-point number
systems," IEEE Trans. on Computers, C-22 (June 1973), 601-607.
[4] W. J. CODY, JR., "Static and dynamic numerical characteristics of floating-point
arithmetic," IEEE Trans. on Computers, C-22 (June 1973), 598-601.
[5] W. J. CODY, JR., "Analysis of proposals for the floating-point standard,"
Computer, (March 1981), 63-69.
[6] G. EVEN and P.-M. SEIDEL, "A comparison of three rounding algorithms for
IEEE floating-point multiplication," IEEE Trans. on Computers, 49 (July 2000),
638-650.
[7] D. GOLDBERG, "Computer arithmetic," in Computer architecture: A quantitative
approach, D. A. Patterson and J. L. Hennessy, Morgan Kaufmann, CA, 1996.
[8] E. HOKENEK, R. MONTOYE and P. COOK, "Second-generation RISC floating-
point with multiply-add fused," IEEE Journal of Solid-State Circuits, 25 (October
1990), 1207-1213.
[9] "IEEE standard for binary floating-point arithmetic," ANSI/IEEE 754-1985, also
in Computer, 14 (March 1981), 51-62.
[10] D. J. KUCK et al., "Analysis of rounding methods in floating-point arithmetic,"
IEEE Trans. on Computers, C-26 (7) (July 1977), pp. 643-650.
[11] D. J. KUCK, The structure of computers and computations, vol. 1, Wiley, New
York, 1978, chap. 3.
[12] J. D. MARASA and D. W. MATULA, "A simulative study of correlated error
propagation in various finite-precision arithmetics," IEEE Trans. on Computers,
C-22 (June 1973), 587-597.
[13] S. F. OBERMAN, H. AL-TWAIJRY, and M. J. FLYNN, "The SNAP project: Design
of floating-point arithmetic units," Proc. of 13th Symp. on Computer Arithmetic
(July 1997), 156-165.
[14] V. PENG, S. SAMUDRALA and M. GAVRIELOV, "On the implementation of shifters,
multipliers and dividers in VLSI floating-point units," Proc. of 8th Symp. on
Computer Arithmetic (May 1987), 95-102.
[15] M. R. SANTORO, G. BEWICK and M. A. HOROWITZ, "Rounding algorithms for
IEEE multipliers," Proc. 9th Symp. on Computer Arithmetic, 1989, 176-183.
[16] M. S. SCHMOOKLER and K. J. NOWKA, "Leading zero anticipation and detection
- A comparison of methods," Proc. 15th Symp. on Comp. Arithmetic, 2001, 7-12.
[17] P. H. STERBENZ, Floating-point computation, Prentice Hall, Englewood Cliffs,
NJ, 1974.
[18] H. SUZUKI, H. MORINAKA, et al., "Leading-zero anticipatory logic for high-speed
floating point addition," IEEE Journal of Solid-State Circuits, 31 (August 1996),
1157-1164.
[19] D. W. SWEENEY, "An analysis of floating-point addition," IBM Systems Journal,
4 (1965), 31-42.
[20] N. TSAO, "On the distribution of significant digits and roundoff errors,"
Communications of the ACM, 17 (May 1974), 269-271.
[21] J. M. YOHE, "Rounding in floating-point arithmetic," IEEE Trans. on Computers,
C-22 (June 1973), 577-586.
5
FAST ADDITION
5.1 RIPPLE-CARRY ADDERS
The addition of two operands is the most frequent operation in almost any
arithmetic unit. A two-operand adder is used not only when performing additions
and subtractions, but also often employed when executing more complex
operations like multiplication and division. Consequently, a fast two-operand adder
is essential.
The most straightforward implementation of a parallel adder for two
operands $x_{n-1}, x_{n-2}, \ldots, x_0$ and $y_{n-1}, y_{n-2}, \ldots, y_0$ is through the use of n basic
units called full adders. A full adder (FA) is a logical circuit that accepts two
operand bits, say $x_i$ and $y_i$, and an incoming carry bit, denoted by $c_i$, and then
produces the corresponding sum bit, denoted by $s_i$, and an outgoing carry bit,
denoted by $c_{i+1}$. As this notation suggests, the outgoing carry $c_{i+1}$ is also the
incoming carry for the subsequent FA, which has $x_{i+1}$ and $y_{i+1}$ as input bits. The FA
is a combinational digital circuit implementing the binary addition of three bits
through the following Boolean equations:
$$s_i = x_i \oplus y_i \oplus c_i \qquad (5.1)$$
where $\oplus$ is the exclusive-or operation, and
$$c_{i+1} = x_i \cdot y_i + c_i \cdot (x_i + y_i) \qquad (5.2)$$
where $x_i \cdot y_i$ is the AND operation, $x_i \wedge y_i$, and $x_i + y_i$ is the OR operation,
$x_i \vee y_i$.
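Equations (5.1) and (5.2) translate directly into code. The following Python
sketch models a full adder and chains n of them so that each outgoing carry
feeds the next position; the function names and the choice of bit lists with
the least significant bit first are assumptions made for illustration.

```python
def full_adder(x: int, y: int, c: int) -> tuple[int, int]:
    """Return (s_i, c_{i+1}) according to Equations (5.1) and (5.2)."""
    s = x ^ y ^ c                          # sum bit: exclusive-or of the three inputs
    c_out = (x & y) | (c & (x | y))        # outgoing carry
    return s, c_out


def ripple_carry_add(x_bits, y_bits, c0: int = 0):
    """Add two bit lists (least significant bit first) with n chained FAs."""
    sums = []
    carry = c0
    for x, y in zip(x_bits, y_bits):
        s, carry = full_adder(x, y, carry)  # the carry moves on to the next FA
        sums.append(s)
    return sums, carry


# 1111 + 0001, written least significant bit first
print(ripple_carry_add([1, 1, 1, 1], [1, 0, 0, 0]))
```

In this model the carries are evaluated one position after another, which
mirrors the order in which they become valid in the hardware chain.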
[Figure 5.1: four full adders (FA) in positions 3 to 0; the FA in position i
has inputs $x_i$, $y_i$ and $c_i$ and produces the sum bit $s_i$ and the outgoing carry.]
FIGURE 5.1 A 4-bit ripple-carry adder.
A parallel adder consisting of FAs for n = 4 is depicted in Figure 5.1. In
a parallel arithmetic unit, all 2n input bits ($x_i$ and $y_i$) are usually available to
the adder at the same time. However, the carries have to propagate from the
FA in position 0 (the position of the FA whose inputs are $x_0$ and $y_0$) to position
i in order for the FA in that position to produce the correct sum and carry-out
bits. In other words, we need to wait until the carries ripple through all n FAs
before we can claim that the sum outputs are correct and may be used in further
calculations. Because of this, the parallel adder shown in Figure 5.1 is called a
ripple-carry adder. Note that the FA in position i, being a combinatorial circuit,
will see an incoming carry $c_i = 0$ at the beginning of the operation, and will
accordingly produce a sum bit $s_i$. The incoming carry $c_i$ may change later on,
resulting in a corresponding change in $s_i$. Thus, a ripple effect can be observed
at the sum outputs of the adder as well, continuing until the carry propagation is
complete. Also, notice that in an add operation, the incoming carry in position
0, $c_0$, is always zero and as a result the FA in this position can be replaced by a
simpler unit capable of adding only two bits. Such a circuit exists and is called
a half adder (HA), and its Boolean equations can be obtained from Equations
(5.1) and (5.2) by setting $c_i$ equal to 0. Still, an FA is frequently used to enable
us to add a 1 in the least-significant position (ulp). This is needed to implement
a subtract operation in the two's complement method. Here, the subtrahend is
complemented and then added to the minuend. This is accomplished by taking
the one's complement of the subtrahend and adding a forced carry to the FA in
position 0 by setting $c_0 = 1$.
Example 5.1
Consider the following two operands for the adder in Figure 5.1: $x_3, x_2, x_1,
x_0$ = 1111 and $y_3, y_2, y_1, y_0$ = 0001. $\Delta_{FA}$ denotes the operation time
(delay) of an FA, assuming that the delays associated with generating the
sum output and the carry-out are equal. This may be the case if, for
example, both circuits use a two-level gate implementation. The following
diagram shows the sum and carry signals as a function of the time, T,
measured in $\Delta_{FA}$ units:
T = 0                     1111
                        + 0001
T = $\Delta_{FA}$     Carry   0001
                  Sum     1110
T = $2\Delta_{FA}$    Carry   0011
                  Sum     1100
T = $3\Delta_{FA}$    Carry   0111
                  Sum     1000
T = $4\Delta_{FA}$    Carry   1111
                  Sum     0000
This is the longest carry propagation chain that can occur when adding
two 4-bit numbers. In synchronous arithmetic units, the time allowed for
the adder's operation must be the worst-case delay, which is, in the general
case, $n \cdot \Delta_{FA}$. This means that the adder is assumed to produce the correct
sum after this fixed delay regardless of the actual carry propagation time,
which might be very short, as in 0101+0010.
Consider now the subtract operation 0101 - 0010, which is performed by
adding the two's complement of the subtrahend to the minuend. The two's
complement is formed by taking the one's complement of 0010, 1101, and
setting the forced carry $c_0$ to 1, yielding 0011. □
It is clear that the long carry propagation chains must be dealt with in
order to speed up the addition. Two main approaches can be envisioned: One
is to reduce the carry propagation time; the other is to detect the completion of
the carry propagation and avoid wasting time while waiting for the fixed delay
(of $n \cdot \Delta_{FA}$ for ripple-carry adders) unless absolutely necessary. Clearly, the
second approach leads to a variable addition time, which may be inconvenient
in a synchronous design. We will therefore concentrate on the first approach
and study several schemes for accelerating carry propagation. The technique for
detection of carry completion is left as an exercise to the reader.
5.2 CARRY-LOOK-AHEAD ADDERS
The most commonly used scheme for accelerating carry propagation is the carry-
look-ahead scheme. The main idea behind carry-look-ahead addition is an at-
tempt to generate all incoming carries in parallel (for all the $n - 1$ high order
FAs) and avoid the need to wait until the correct carry propagates from the
stage (FA) of the adder where it has been generated. This can be accomplished
in principle, since the carries generated and the way they propagate depend only
on the digits of the original numbers $x_{n-1} x_{n-2} \cdots x_0$ and $y_{n-1} y_{n-2} \cdots y_0$. These
digits are available simultaneously to all stages of the adder and, consequently,
each stage can have all the information it needs in order to calculate the correct
value of the incoming carry and compute the sum bit accordingly. This, however,
would require an inordinately large number of inputs to each stage of the adder,
rendering this approach impractical.
One may reduce the number of inputs at each stage by extracting the
information needed from the input digits to determine whether new carries will
be generated and whether they will be propagated. To this end, we will study
in detail the generation and propagation of carries.
There are stages in the adder in which a carry-out is generated regardless of
the incoming carry, and as a result, no additional information on previous input
digits is required. These are the stages for which $x_i = y_i = 1$. There are other
stages that are only capable of propagating the incoming carry; i.e., $x_i y_i = 10$,
or $x_i y_i = 01$. Only a stage in which $x_i = y_i = 0$ cannot propagate a carry. To
assimilate the information regarding the generation and propagation of carries,
we define the following logic functions using the AND and OR operations. Let
$G_i = x_i \cdot y_i$ denote the generated carry and let $P_i = x_i + y_i$ denote the propagated
carry. As a result, the Boolean expression (5.2) for the carry-out can be rewritten
as

$$c_{i+1} = x_i y_i + c_i (x_i + y_i) = G_i + c_i P_i.$$
Substituting $c_i = G_{i-1} + c_{i-1} P_{i-1}$ in the above expression yields

$$c_{i+1} = G_i + G_{i-1} P_i + c_{i-1} P_{i-1} P_i.$$

Further substitutions result in

$$c_{i+1} = G_i + G_{i-1} P_i + G_{i-2} P_{i-1} P_i + c_{i-2} P_{i-2} P_{i-1} P_i = \cdots = G_i + G_{i-1} P_i + G_{i-2} P_{i-1} P_i + \cdots + c_0 P_0 P_1 \cdots P_i. \tag{5.3}$$

This type of expression allows us to calculate all the carries in parallel from the
original digits $x_{n-1} x_{n-2} \cdots x_0$ and $y_{n-1} y_{n-2} \cdots y_0$ and the forced carry $c_0$. For
example, for a 4-bit adder, the carries are

$$\begin{aligned}
c_1 &= G_0 + c_0 P_0, \\
c_2 &= G_1 + G_0 P_1 + c_0 P_0 P_1, \\
c_3 &= G_2 + G_1 P_2 + G_0 P_1 P_2 + c_0 P_0 P_1 P_2, \\
c_4 &= G_3 + G_2 P_3 + G_1 P_2 P_3 + G_0 P_1 P_2 P_3 + c_0 P_0 P_1 P_2 P_3.
\end{aligned} \tag{5.4}$$

If this is done for all stages of the adder, then for each stage a $\Delta_G$ delay
is required to generate all $P_i$ and $G_i$, where $\Delta_G$ is the delay of a single gate.
A delay of $2\Delta_G$ is then needed to generate all $c_i$ (assuming a two-level gate
implementation) and another $2\Delta_G$ to generate the sum digits, $s_i$, in parallel
(again, assuming a two-level gate implementation). Hence, a total of $5\Delta_G$ time
units is needed, regardless of $n$, the number of bits in each operand. However,
for a large value of $n$, say, $n = 32$, an extremely large number of gates is needed
and, more importantly, gates with a very large fan-in are required (fan-in is the
number of gate inputs, and is equal to $n + 1$ in this case). Therefore, we must
reduce the span of the look-ahead at the expense of speed. We may divide the
$n$ stages into groups and have a separate carry-look-ahead in each group. The
groups can then be interconnected by the ripple-carry method. Dividing the
adder into equal-sized groups has the additional benefit of modularity, requiring
the detailed design of only a single integrated circuit. A group size of 4 has been
commonly used, and ICs capable of adding two sequences, each consisting of
four digits with carry-look-ahead, are available. Size 4 was selected because it is
a common factor of most word sizes, and also because of technology-dependent
constraints (e.g., the available number of input/output pins).

For $n$ bits and groups of size 4, there are $n/4$ groups. To propagate a
carry through a group once the $P_i$'s, $G_i$'s, and $c_0$ are available, we need $2\Delta_G$
time units. Thus, $\Delta_G$ is needed to generate all $P_i$ and $G_i$, $(n/4) \cdot 2\Delta_G$ are
needed to propagate the carry through all bits, and an additional delay of $2\Delta_G$
is needed to generate the sum outputs, for a total of $(2 \cdot n/4 + 3)\Delta_G = (n/2 + 3)\Delta_G$.
This is almost a fourfold reduction in delay compared to the $2n\Delta_G$ delay of a
ripple-carry adder.

We may further speed up the addition by providing carry-look-ahead
over groups in addition to the internal look-ahead within the group. We define
a group-generated carry, $G^*$, and a group-propagated carry, $P^*$, for a group of
size 4 as follows: $G^* = 1$ if a carry-out (of the group) is generated internally
and $P^* = 1$ if a carry-in (to the group) is propagated internally to produce a
carry-out (of the group). The Boolean equations for these carries are

$$\begin{aligned}
G^* &= G_3 + G_2 P_3 + G_1 P_2 P_3 + G_0 P_1 P_2 P_3, \\
P^* &= P_0 P_1 P_2 P_3.
\end{aligned} \tag{5.5}$$

The group-generated and group-propagated carries for several groups can now
be used to generate group carry-ins in a manner similar to single-bit carry-ins in
Equation (5.4). A combinatorial circuit implementing these equations is available
as a separate and standard IC. This IC is called a carry-look-ahead generator,
and its use is illustrated in the following example.
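The look-ahead carry expression of Equation (5.3) and the group signals $G^*$ and $P^*$ can be checked with a small Python model. This is a sketch under stated assumptions, not a gate-level design: the helper names (`generate_propagate`, `lookahead_carry`, `group_signals`) are hypothetical, and the loops merely enumerate the product terms that hardware would evaluate in parallel.

```python
def to_bits(value, n):
    """Little-endian list of the n low-order bits (index 0 = position 0)."""
    return [(value >> i) & 1 for i in range(n)]


def generate_propagate(x, y, n):
    """Per-stage signals G_i = x_i AND y_i and P_i = x_i OR y_i."""
    xb, yb = to_bits(x, n), to_bits(y, n)
    G = [a & b for a, b in zip(xb, yb)]
    P = [a | b for a, b in zip(xb, yb)]
    return G, P


def lookahead_carry(G, P, c0, i):
    """c_{i+1} from the expanded sum of products; uses only G, P and c0."""
    carry = 0
    product = 1                      # running product P_{k+1} ... P_i
    for k in range(i, -1, -1):
        carry |= G[k] & product      # term G_k P_{k+1} ... P_i
        product &= P[k]
    return carry | (c0 & product)    # term c_0 P_0 P_1 ... P_i


def group_signals(G, P):
    """Group-generated and group-propagated carries (G*, P*) of one group."""
    g_star = 0
    product = 1
    for k in range(len(G) - 1, -1, -1):
        g_star |= G[k] & product
        product &= P[k]
    return g_star, product
```

For a 4-bit group, `group_signals` evaluates the two expressions for $G^*$ and $P^*$, and the carry out of the group equals $G^* + P^* c_0$, which agrees with $c_4$ computed by `lookahead_carry(G, P, c0, 3)`.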
FIGURE 5.2 A 16-bit two-level carry-look-ahead adder. (The notation $x_{3-0}$
means $x_3, x_2, x_1, x_0$.)

Example 5.2
For $n = 16$ there are four groups, with outputs $G_0^*, G_1^*, G_2^*, G_3^*$ and $P_0^*$,
$P_1^*, P_2^*, P_3^*$. These serve as inputs to a carry-look-ahead generator, whose
outputs are denoted by $c_4$, $c_8$, and $c_{12}$, satisfying

$$\begin{aligned}
c_4 &= G_0^* + c_0 P_0^*, \\
c_8 &= G_1^* + G_0^* P_1^* + c_0 P_0^* P_1^*, \\
c_{12} &= G_2^* + G_1^* P_2^* + G_0^* P_1^* P_2^* + c_0 P_0^* P_1^* P_2^*.
\end{aligned} \tag{5.6}$$
A 16-bit adder with four groups, each with internal carry-look-ahead and
an additional carry-look-ahead generator, is depicted in Figure 5.2. The
operation of this adder consists of the following four steps:
1. All groups generate in parallel bit-carry-generate, $G_i$, and bit-carry-
propagate, $P_i$.
2. All groups generate in parallel group-carry-generate, $G_j^*$, and group-
carry-propagate, $P_j^*$.
3. The carry-look-ahead generator produces the carries $c_4$, $c_8$, and $c_{12}$
into the groups.
4. The groups calculate their individual sum bits (in parallel) with inter-
nal carry-look-ahead. In other words, they first generate the internal
carries according to Equation (5.4) and then the sum bits.
The minimum time delay associated with steps (1)-(4) (assuming a mini-
mum number of gate levels in all circuits) is $\Delta_G$ for step 1, $2\Delta_G$ for step
2, $2\Delta_G$ for step 3, and $4\Delta_G$ for step 4. Thus, the total addition time is
$9\Delta_G$ instead of $11\Delta_G$, which is the addition time if the external carry-
look-ahead generator is not used and the carry ripples among the groups.
These calculations yield only theoretical estimates for the addition time.
In practice, one has to use the typical delays associated with the particular
integrated circuit employed in order to calculate the addition time more
accurately (see any integrated circuit databook). □
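The four steps of the two-level adder can be mirrored in a functional Python model. This is a sketch under assumptions: the names `cla16`, `two_level_delay`, and `group_ripple_delay` are hypothetical, the recurrences used in steps 3 and 4 are logically equivalent to (but not structured like) the two-level expanded equations, and the carry $c_{16}$ is derived here by the same pattern as $c_4$, $c_8$, and $c_{12}$. Gate delays are not simulated; the delay functions only restate the counts given in the text, in units of $\Delta_G$.

```python
GROUP_SIZE = 4


def to_bits(value, n):
    """Little-endian list of the n low-order bits (index 0 = position 0)."""
    return [(value >> i) & 1 for i in range(n)]


def group_signals(G, P):
    """Group-generated and group-propagated carries (G*, P*) of one group."""
    g_star, product = 0, 1
    for k in range(len(G) - 1, -1, -1):
        g_star |= G[k] & product
        product &= P[k]
    return g_star, product


def cla16(x, y, c0=0):
    """Functional model of a 16-bit two-level carry-look-ahead adder."""
    xb, yb = to_bits(x, 16), to_bits(y, 16)
    # Step 1: bit-carry-generate and bit-carry-propagate for every position.
    G = [a & b for a, b in zip(xb, yb)]
    P = [a | b for a, b in zip(xb, yb)]
    # Step 2: group-carry-generate and group-carry-propagate per 4-bit group.
    groups = [group_signals(G[j:j + GROUP_SIZE], P[j:j + GROUP_SIZE])
              for j in range(0, 16, GROUP_SIZE)]
    # Step 3: carries into the groups (c0, c4, c8, c12) and the carry-out c16.
    carry_in = [c0]
    for g_star, p_star in groups:
        carry_in.append(g_star | (p_star & carry_in[-1]))
    # Step 4: internal carries and sum bits inside each group.
    total = 0
    for j, c in enumerate(carry_in[:4]):
        for i in range(j * GROUP_SIZE, (j + 1) * GROUP_SIZE):
            total |= (xb[i] ^ yb[i] ^ c) << i
            c = G[i] | (P[i] & c)
    return total, carry_in[4]


def two_level_delay():
    """Steps 1-4 in gate delays: 1 + 2 + 2 + 4."""
    return 1 + 2 + 2 + 4


def group_ripple_delay(n, group_size=GROUP_SIZE):
    """Carry rippling between groups: 1 for P/G, 2 per group, 2 for the sums."""
    return 1 + 2 * (n // group_size) + 2
```

With these definitions `two_level_delay()` gives 9 and `group_ripple_delay(16)` gives 11, the two figures compared in the example.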
As shown in Figure 5.2, the carry-look-ahead generator produces two ad-
ditional outputs, $G^{**}$ and $P^{**}$, whose Boolean equations are similar to those in
Equation (5.5). These new outputs are called section-carry generate and section-
carry propagate, respectively, where a section, in this case, is a set of four groups
and consists of 16 bits. As before, the number of groups in a section is commonly
set at four because of implementation-related considerations, and not because of
any limitation of the underlying algorithm.

If the number of bits to be added is larger than 16, say, 64, we may use
either four circuits, each similar to the one shown in Figure 5.2, with a ripple-
carry between adjacent sections, or use another level of carry-look-ahead, and
achieve a faster execution of addition. This is exactly the same circuit as above,
accepting the four pairs of section-carry-generate and section-carry-propagate,
and producing the carries $c_{16}$, $c_{32}$, and $c_{48}$.

As the number of bits, $n$, increases, more levels of carry-look-ahead gen-
erators can be added in order to speed up the addition. The required number
of levels (for maximum speed up) approaches $\log_b n$, where $b$ is the blocking fac-
tor, i.e., the number of bits in a group, the number of groups in a section, and
so on. The blocking factor is 4 in the conventional implementation depicted in
Figure 5.2. The overall addition time of a carry-look-ahead adder is therefore
proportional to $\log_b n$.
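The growth of the number of look-ahead levels with $n$ can be illustrated by repeatedly grouping by the blocking factor. The function name `lookahead_levels` is an assumption made for this sketch; it simply counts how many times $n$ positions can be collected into blocks of size $b$ until a single block remains.

```python
def lookahead_levels(n, b=4):
    """Levels of grouping needed to cover n bits with blocking factor b."""
    levels = 0
    while n > 1:
        n = -(-n // b)   # ceiling of n / b: number of blocks at this level
        levels += 1
    return levels
```

For the cases mentioned in the text, 16 bits with $b = 4$ need two levels (groups, then one section) and 64 bits need three.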
5.3 CONDITIONAL SUM ADDERS
Another scheme for fast addition that provides a logarithmic speed-up is the
conditional sum adder [29]. The principle behind this scheme is to generate two
sets of outputs for a given group of operand bits, say, $k$ bits. Each set includes
$k$ sum bits and an outgoing carry. One set assumes that the eventual incoming
carry will be zero, while the other assumes that it will be one. Once the incoming
carry is known, we need only to select the correct set of outputs (out of the two
sets) without waiting for the carry to further propagate through the $k$ positions
(see Figure 5.3). Clearly, we should not apply this idea to all $n$ operand bits
at the beginning of the add operation, since we will then have to wait until
the carry propagates through all $n$ positions before making the selection. We
need, therefore, to divide the given $n$ bits into smaller groups and apply the
above idea to each of them separately. In this way, the serial carry-propagation
inside the separate groups can be done in parallel, reducing the overall execution
time. These groups can, in turn, be further divided into subgroups, for which
the carry-propagation time is even smaller. The outputs of the subgroups are
then combined to generate the output of the groups.

A natural division of the $n$ operand bits would be into two groups of size
$n/2$ bits each. Each one of these can be further divided into two groups of size
$n/4$ bits each. This process can, in principle, be continued until a group of size
1 is reached if $n$ is an integer power of 2. In this case, $\log_2 n$ steps are needed
in the process, where in step 1 we deal with single bit positions, in step 2 pairs
of bits are handled, and so on. Notice, however, that a given group does not
necessarily have to be divided into equal-sized subgroups. Thus, the conditional
sum scheme can be applied even if the number of bits is not a power of 2.

FIGURE 5.3 Selecting the correct set of sum bits and carry-out.
Example 5.3
In this example we illustrate the way groups containing single bits are
combined into pairs of bits. We use here the following notation: $s_i^0$ denotes
the sum bit at position $i$ under the assumption that the incoming carry
into the currently considered group is 0; $s_i^1$ is defined similarly, and so
are the outgoing carries (from the group), $c_{i+1}^0$ and $c_{i+1}^1$. We will first
consider two adjacent bit positions, and, in step 1, each constitutes a
separate group:

    i                   7   6
    x_i                 1   0
    y_i                 0   0
    s_i^0               1   0     Assuming incoming carry = 0
    c_{i+1}^0           0   0
    s_i^1               0   1     Assuming incoming carry = 1
    c_{i+1}^1           1   0

In step 2 the two bit positions are combined (using data selectors) into
one group of size 2:

    i, i-1              7, 6
    x_i, x_{i-1}        1 0
    y_i, y_{i-1}        0 0
    s_i^0, s_{i-1}^0    1 0       Assuming incoming carry = 0
    c_{i+1}^0           0
    s_i^1, s_{i-1}^1    1 1       Assuming incoming carry = 1
    c_{i+1}^1           0

Note that the carry-out from position 6 becomes an internal (to the group)
carry and consequently, we can select the appropriate set of outputs for
position 7. □
Example 5.4
We now apply the conditional sum method to the addition of two 8-bit
operands (Figure 5.4). The process has $\log_2 8 = 3$ steps.
Notice that the forced carry (which equals 0 in this example) is available
at the beginning of the operation. Therefore, only one set of outputs needs
to be generated for the rightmost group at each step. □

              i            7  6  5  4  3  2  1  0
              x_i          1  0  1  1  0  1  1  0
              y_i          0  0  1  0  1  1  0  1

    Step 1    s_i^0        1  0  0  1  1  0  1  1
              c_{i+1}^0    0  0  1  0  0  1  0  0
              s_i^1        0  1  1  0  0  1  0
              c_{i+1}^1    1  0  1  1  1  1  1

    Step 2    s_i^0        1  0  0  1  0  0  1  1
              c_{i+1}^0    0     1     1     0
              s_i^1        1  1  1  0  0  1
              c_{i+1}^1    0     1     1

    Step 3    s_i^0        1  1  0  1  0  0  1  1
              c_{i+1}^0    0           1
              s_i^1        1  1  1  0
              c_{i+1}^1    0

    Result    sum          1  1  1  0  0  0  1  1
              carry-out    0

FIGURE 5.4 Conditional sum addition of two 8-bit numbers.
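The conditional sum principle (compute both outcomes per group, then select) can be sketched recursively in Python. This is an illustrative model only: the names `conditional_sum` and `conditional_sum_add` are hypothetical, the recursion splits each group into two halves (which need not be equal), and the dictionary lookup `high[low_carry]` plays the role of the multiplexer that selects one of the two precomputed sets.

```python
def conditional_sum(x_bits, y_bits):
    """Both outcomes for a group of little-endian bits.

    Returns {0: (sum_bits, carry_out), 1: (sum_bits, carry_out)}, keyed by
    the assumed incoming carry.
    """
    n = len(x_bits)
    if n == 1:
        x, y = x_bits[0], y_bits[0]
        return {0: ([x ^ y], x & y), 1: ([1 - (x ^ y)], x | y)}
    half = n // 2                      # the halves need not be equal in size
    low = conditional_sum(x_bits[:half], y_bits[:half])
    high = conditional_sum(x_bits[half:], y_bits[half:])
    result = {}
    for carry_in in (0, 1):
        low_sum, low_carry = low[carry_in]
        high_sum, high_carry = high[low_carry]   # select one precomputed set
        result[carry_in] = (low_sum + high_sum, high_carry)
    return result


def conditional_sum_add(x, y, n, c0=0):
    """Add two n-bit integers; returns (sum, carry_out)."""
    x_bits = [(x >> i) & 1 for i in range(n)]
    y_bits = [(y >> i) & 1 for i in range(n)]
    sum_bits, carry = conditional_sum(x_bits, y_bits)[c0]
    return sum(b << i for i, b in enumerate(sum_bits)), carry
```

Applied to the operands of Example 5.4, `conditional_sum_add(0b10110110, 0b00101101, 8)` returns `(0b11100011, 0)`. Unlike the hardware scheme, this sketch also computes both sets for the rightmost group, even though the forced carry is known in advance.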
A variation of the conditional sum adder is the carry-select adder. As in
the conditional sum adder, the $n$ bits are divided into groups (but not necessarily
of the same size), and each group generates two sets of sum bits and an outgoing
carry bit. The incoming carry selects one of these two sets. Unlike the conditional
sum adder, each group is not further divided into smaller subgroups. The carry-
select adder and some of its variations are described in Section 5.8.

Comparing the conditional sum and the carry-look-ahead schemes, we see
that the two have about the same speed. The design of a conditional sum adder
is, however, less modular than that of a carry-look-ahead adder, and this is
the main reason for the much higher popularity of the latter. The next section
includes a more general discussion on the optimality of algorithms for addition
and their implementations.
5.4 OPTIMALITY OF ALGORITHMS AND THEIR IMPLEMENTATIONS

Numerous algorithms for fast addition, as well as other arithmetic operations,
have been developed and implemented since the early days of digital computers,
and newer ones are still being proposed. The main reason for the continuing
research and development of new algorithms for the basic arithmetic operations
is the rapid change in the technology that is used to implement them. An
algorithm that can be optimally implemented in one technology is not necessarily
the best in a different technology. Consequently, designers need to continuously
reevaluate the available algorithms for a certain arithmetic operation and their
suitability to the current technology.

In addition to the dependence on the technology employed for the im-
plementation of the algorithm, the performance of a given algorithm is heavily
affected by the unique features of the algorithm itself and/or the number sys-
tem used to represent the operands and results. Thus, many studies have been
performed that compare various algorithms in an effort to determine which will
perform better, preferably independently of the technology used for the imple-
mentation. More importantly, the objective of some studies was to find the limit
(bound) on the performance of any algorithm in executing a given arithmetic
operation.

The execution time of addition, being highly dependent on the way carries
are propagated, can reach its minimum in algorithms which avoid the propaga-
tion of carries altogether, or incur a very limited carry-propagation. Therefore,
number systems that are characterized by almost carry-free addition can pro-
vide "optimal" algorithms for addition. One such number system is the residue
number system (described in Chapter 11); another is the SD number system
(described in Chapter 2). One should, however, be aware of the fact that these
number systems are not frequently used in practice. Consequently, in order to
take advantage of these fast algorithms for addition, we need to perform conver-
sions between number systems. These conversions may turn out to be as complex
as (or even more complex than) the addition itself and unless a fast conversion
algorithm is available, this approach is of limited practical value.

A theoretical model to determine a lower bound on the speed of addition
has been proposed by Winograd [35], Spira [30], and others. This is an idealized
model, whose purpose is the derivation of a bound independent of the implemen-
tation technology. It assumes that the circuit for addition is realized using only
one type of gate, the $(f, r)$ gate, where $r$ is the radix of the number system used
and $f$ is the fan-in of the gate; i.e., the maximum number of inputs to the gate.
All $(f, r)$ gates are assumed to be capable of computing any $r$-valued function
of $f$ (or less) arguments in exactly the same time period. This fixed time period
is defined as the unit delay, and the computation time of the adder circuit is
measured in these unit delays.

FIGURE 5.5 A partial diagram of a circuit implemented with $(f, r)$ gates.

Since an $(f, r)$ gate can compute any function of $f$ arguments, we need to
find out only how many such gates are required and how many circuit levels are
needed in order to properly connect these gates. Thus, a circuit for adding two
radix-$r$ operands with $n$ digits each has $2n$ inputs and produces $n + 1$ outputs.
Consider the output that requires all $2n$ inputs for its calculation. These $2n$
inputs can be reduced to a smaller number of arguments by using $\lceil 2n/f \rceil$ such
$(f, r)$ gates (where the ceiling $\lceil x \rceil$ of a number $x$ is the smallest integer that
is larger than or equal to $x$). These gates belong to the same logic level and
therefore operate in parallel. The resulting number of intermediate arguments is
$\lceil 2n/f \rceil$, and this number can be further reduced through a second level of $(f, r)$
gates. A schematic diagram of the resulting circuit is depicted in Figure 5.5.
The total number of levels in such a tree constructed of $(f, r)$ gates is at least
$\lceil \log_f 2n \rceil$. Note that the indicated number of $(f, r)$ gates at each level is only
a lower bound, since it assumes that no argument is needed as input to more
than one $(f, r)$ gate. The resulting number of levels is consequently also a lower
bound. Therefore, the lower bound on the time to perform addition is

$$T_{add} \ge \lceil \log_f 2n \rceil \tag{5.7}$$

measured in units of $(f, r)$ gate delay.

There are several assumptions underlying this model that make it an ide-
alized model. First, only the fan-in limitation is taken into account, while the
fan-out constraint is ignored. The fan-out of a gate is the ability of its output
to drive a number of inputs to similar gates in the next level. In practice, the
fan-out of a gate is constrained. Second and more important, the model assumes
that any $r$-valued function of $f$ arguments can be calculated by a single $(f, r)$
gate in one unit delay. In practice, only a small number of such functions can
be implemented by a single gate requiring the smallest possible delay. Many
functions may require either a more complex gate (with a longer delay) or need
to be implemented using several simple gates organized in two or more levels.

The bound in Equation (5.7) assumes that there is at least one output
digit that depends on all $2n$ input digits. If the addition technique (or the
particular number system employed) is such that not all $2n$ digits are needed to
determine any output digit, then a lower value for the bound can be established.
Instead of having trees with $\lceil \log_f 2n \rceil$ levels, smaller trees (with fewer inputs)
can be used. This occurs if a carry cannot propagate from the least-significant
position to the most-significant position, since such a long carry propagation
implies that the most-significant output digit is dependent on all $2n$ inputs. For
example, if only $x_i$, $y_i$, $x_{i-1}$ and $y_{i-1}$ are needed to determine the sum digit $s_i$,
then

$$T_{add} \ge \lceil \log_f 4 \rceil.$$

If the conventional binary number system is used, a carry can propagate
through all $n$ positions and $\lceil \log_f 2n \rceil$ is still a lower bound on the addition
time. The two addition algorithms described in the previous sections (namely,
the carry-look-ahead and the conditional sum) have an execution time that is
proportional to $\log n$ and can, in theory, approach the above bound.

However, when comparing two or more algorithms with the same theoret-
ical bound for the execution time, some objective function related to the cost of
implementation should also be taken into account. The type of the additional
objective function used depends on the technology employed. For example, when
discrete gates are used to implement the circuit, the number of such gates may
serve as an objective function measuring the implementation cost. The number
of gates along the critical (longest) path (in other words, the number of circuit
levels) determines the execution time of the algorithm. If full custom VLSI tech-
nology is used then the exact number of gates has a very limited effect on the
implementation cost. Instead, regularity of the design and the length of intercon-
nections are considerably more important, since they affect both the silicon area
used by the adder and the design time. The two factors (i.e., implementation
cost and speed) do not necessarily achieve their minimum values in the same
design. Thus, a tradeoff between these two might have to be found.

If performance is more important than implementation cost, then the
carry-look-ahead adder is very attractive. Still, the implementation cost can
be reduced especially when full custom VLSI is employed, and, as a result, reg-
ularity of the design and size of the required area determine the implementation
cost. This can be achieved by taking advantage of the available degree of free-
dom in the design; namely, the blocking factor. The blocking factor is always
bounded by the fan-in constraint. In some cases it might also be bounded by
additional constraints, such as the number of pins. However, the largest possible
value for the blocking factor is not necessarily the best choice, and there is a need
to evaluate its effect on execution time and implementation cost. For example,
a blocking factor of 2 results in a very regular layout of binary trees with up
to $\log_2 n$ levels, requiring a total area of approximately $n \log_2 n$. Further
details on the structure of the carry-look-ahead adder with a blocking factor of 2
are provided in the next section.

If lower implementation cost is required, then the carry-look-ahead
scheme might be inappropriate. The ripple-carry method can be used together
with some speed-up techniques, which depend on the technology. One
such technique is the Manchester adder [14], whose schematic diagram is shown
in Figure 5.6. This diagram includes switches that can be realized using pass
transistors or similar devices in various technologies. The switches in unit
$i$ are controlled by the signals $P_i$, $G_i$, and $K_i$, where $P_i = x_i \oplus y_i$ is the carry-
propagate signal, $G_i = x_i y_i$ is the carry-generate signal, and $K_i = \bar{x}_i \bar{y}_i$ is the
carry-kill signal. The three signals are defined so that one, and only one, of the
corresponding switches is closed at any time. Thus, the equation $P_i = x_i \oplus y_i$ is
used instead of $P_i = x_i + y_i$ as before. If $G_i = 1$, an outgoing carry is generated
irrespective of the incoming carry. If $K_i = 1$, any incoming carry is "killed" and
not allowed to continue its propagation. If $P_i = 1$, however, an incoming carry is
allowed to propagate.

FIGURE 5.6 A Manchester adder.

All switches in units 0 through $n - 1$ are set simultaneously, and, as a result,
the propagating carry experiences only a single switch delay per stage. The
number of carry-propagate switches that can be cascaded is limited in practice,
and this limit depends on the specific technology employed. Thus, there is a
need to partition the $n$ units into groups and insert separating devices (buffers)
between them. In theory, the execution time of this adder is linearly proportional
to the number of bits, $n$. However, the ratio between its execution time and that
of another adder (e.g., the carry-look-ahead adder) depends on the particular
technology. In any technology that is employed to realize the Manchester adder,
the implementation cost, measured in size of area and/or regularity of design, is
expected to be lower than that of a carry-look-ahead adder [25].
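The role of the three switch signals in the Manchester carry chain can be illustrated with a short behavioral model. This is a sketch, not a transistor-level description: the names `switch_signals` and `manchester_add` are assumptions, and the sequential loop stands in for a chain whose switches are all set simultaneously.

```python
def switch_signals(xi, yi):
    """Propagate, generate and kill signals of unit i (exactly one equals 1)."""
    P = xi ^ yi
    G = xi & yi
    K = (1 - xi) & (1 - yi)
    return P, G, K


def manchester_add(x, y, n, c0=0):
    """Behavioral model of an n-bit Manchester carry chain."""
    carry = c0
    total = 0
    for i in range(n):
        P, G, K = switch_signals((x >> i) & 1, (y >> i) & 1)
        total |= (P ^ carry) << i      # sum bit s_i = P_i XOR c_i
        if G:
            carry = 1                  # generate switch closed
        elif K:
            carry = 0                  # kill switch closed
        # otherwise P = 1: the propagate switch passes the carry unchanged
    return total, carry
```

Because the propagate signal is defined with exclusive-OR, the three cases are mutually exclusive, which is what allows each unit to close exactly one switch.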
5.5 CARRY-LOOK-AHEAD ADDITION REVISITED

In this section we derive the equations for carry-look-ahead addition in a more
general way. This will allow us to consider various implementations of the carry-
look-ahead adder rather than being restricted to a predetermined blocking fac-
tor. It will also provide a general framework for deriving expressions for other
techniques for fast addition, including carry-select and carry-skip adders.

We first introduce the following notation. Let $P_{i:j}$ and $G_{i:j}$ denote the
group-propagated carry and the group-generated carry functions, respectively,
for the group of bit positions $i, i-1, \ldots, j$ (with $i \ge j$), as shown in Figure 5.7.
$P_{i:j}$ equals 1 when an incoming carry into the least significant position
$j$, $c_j$, is allowed to propagate through all $i - j + 1$ bit positions. $G_{i:j}$ equals 1
when a carry is generated in at least one of the bit positions from $j$ to $i$ inclusive
and propagates to bit position $i + 1$, i.e., the outgoing carry $c_{i+1}$ equals 1.
These definitions generalize those in Equation (5.5) and include as a special case
the single bit-position carry propagate and generate functions $P_i$ and $G_i$. The
two group-carry functions can be calculated recursively using the two Boolean
equations

$$P_{i:j} = \begin{cases} P_i & \text{if } i = j \\ P_i \cdot P_{i-1:j} & \text{if } i > j; \end{cases} \tag{5.8}$$

$$G_{i:j} = \begin{cases} G_i & \text{if } i = j \\ G_i + P_i \cdot G_{i-1:j} & \text{if } i > j. \end{cases} \tag{5.9}$$

Note that the notations $P_{i:i}$ and $P_i$ (and similarly $G_{i:i}$ and $G_i$) are equivalent.
The recursive equations (5.8) and (5.9) can be further generalized to

$$\begin{aligned}
P_{i:j} &= P_{i:m} \cdot P_{m-1:j}, \\
G_{i:j} &= G_{i:m} + P_{i:m} \cdot G_{m-1:j},
\end{aligned} \tag{5.10}$$

where $i \ge m \ge j + 1$. This is the same generalization that was employed in
Section 5.2 to derive the section-carry propagate and generate functions, $P^{**}$
and $G^{**}$. It can be formally proven by induction on $m$.

Instead of dealing with each of the group-carry functions separately we
introduce the pair $(P_{i:j}, G_{i:j})$ and define a new Boolean operator, called the
fundamental carry operator and denoted by $\circ$, as follows [2]:
$$(P, G) \circ (\tilde{P}, \tilde{G}) = (P \cdot \tilde{P},\; G + P \cdot \tilde{G}). \tag{5.11}$$

Using this fundamental carry operator we can rewrite the recursive Equation
(5.10) as

$$(P_{i:j}, G_{i:j}) = (P_{i:m}, G_{i:m}) \circ (P_{m-1:j}, G_{m-1:j}), \tag{5.12}$$

where $i \ge m \ge j + 1$. The fundamental carry operation has been shown to be
associative [2], i.e., for $i \ge m > v > j$ (or, more accurately, $i \ge m \ge v + 1 \ge j + 2$)

$$\bigl((P_{i:m}, G_{i:m}) \circ (P_{m-1:v}, G_{m-1:v})\bigr) \circ (P_{v-1:j}, G_{v-1:j})
= (P_{i:m}, G_{i:m}) \circ \bigl((P_{m-1:v}, G_{m-1:v}) \circ (P_{v-1:j}, G_{v-1:j})\bigr). \tag{5.13}$$

From the definition in Equation (5.11) the fundamental carry operation is an
idempotent operation

$$(P, G) \circ (P, G) = (P \cdot P,\; G + P \cdot G) = (P, G) \tag{5.14}$$

and consequently, the most general form of Equation (5.12) is

$$(P_{i:j}, G_{i:j}) = (P_{i:m}, G_{i:m}) \circ (P_{v:j}, G_{v:j}), \tag{5.15}$$

where $i \ge m$ and $v \ge j$ but $v$ is not necessarily equal to $m - 1$; it is only required
that $v \ge m - 1$.

FIGURE 5.7 A group consisting of $i - j + 1$ bit positions ($i \ge j$).
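The fundamental carry operator lends itself to a direct sketch in Python. The code below is illustrative only: the names `carry_op`, `bit_signals`, and `group_pg` are hypothetical, single-bit pairs are formed with $P_k = x_k + y_k$ (OR), and the left operand of `carry_op` is always the more significant subgroup.

```python
from functools import reduce


def carry_op(left, right):
    """Fundamental carry operator: (P, G) o (P~, G~) = (P P~, G + P G~)."""
    P, G = left
    P_t, G_t = right
    return (P & P_t, G | (P & G_t))


def bit_signals(x, y, n):
    """Single-bit pairs (P_k, G_k) with P_k = x_k OR y_k, G_k = x_k AND y_k."""
    return [(((x >> k) | (y >> k)) & 1, ((x >> k) & (y >> k)) & 1)
            for k in range(n)]


def group_pg(signals, i, j):
    """(P_{i:j}, G_{i:j}), combining single-bit pairs from position i down to j."""
    return reduce(carry_op, (signals[k] for k in range(i, j - 1, -1)))
```

Because the operator is associative, the pairs may be combined in any grouping (a chain as in `group_pg`, or a balanced tree), and because it is idempotent, two subgroups that overlap still combine to the correct group pair, e.g. positions 7 down to 3 combined with positions 4 down to 0.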
108
5 Fast Addlt on
56 Prefix Adders
!Of)
The above equations indicate how the (propagate and generate) group carries $P_{i:j}$ and $G_{i:j}$ can be calculated from subgroup carries, where the two or more subgroups are of arbitrary size and may even overlap. Eventually, we would like to use these group and subgroup carries in order to calculate the individual bit carries $c_{i+1}, c_i, \ldots, c_{j+1}$ and sum outputs $s_i, s_{i-1}, \ldots, s_j$. To this end we must take into account the "external" carry $c_j$ (see Figure 5.7). For the $m$th bit position, $i \ge m > j$, we have

$$c_m = G_{m-1:j} + P_{m-1:j} \cdot c_j, \tag{5.16}$$

which can also be rewritten as

$$(P_{m-1:j}, G_{m-1:j}) \circ (1, c_j),$$

where the second component of the resulting pair is $c_m$, and if $P_m = x_m \oplus y_m$ then

$$s_m = c_m \oplus P_m. \tag{5.17}$$

However, if $P_m = x_m + y_m$ then $s_m = c_m \oplus (x_m \oplus y_m)$. An alternative way to deal with the incoming carry into the group, $c_j$, is to modify the equation for $G_j$ from $x_j y_j$ to $x_j y_j + P_j \cdot c_j$. Then, the equation for $c_m$ becomes

$$c_m = G_{m-1:j}. \tag{5.18}$$

The latter "forces" the carry $c_j$ to propagate through the group while the former allows it to "skip" the group. We will further elaborate on this in the discussion of carry-skip adders.

Equations (5.15)-(5.18) can be used to derive various implementations of adders including ripple-carry, carry-look-ahead, carry-select, carry-skip and others. A 5-bit ripple-carry adder corresponds to the case where all subgroups consist of a single bit position and the computation starts at position 0, proceeds to position 1 and so on:

$$(P_4, G_4) \circ \{(P_3, G_3) \circ [(P_2, G_2) \circ ((P_1, G_1) \circ \{(P_0, G_0) \circ (1, c_0)\})]\}. \tag{5.19}$$

A 16-bit carry-look-ahead adder with four groups of size 4 (i.e., blocking factor of 4) and a ripple-carry among groups corresponds to the following expression:

$$(P_{15:12}, G_{15:12}) \circ \{(P_{11:8}, G_{11:8}) \circ [(P_{7:4}, G_{7:4}) \circ \{(P_{3:0}, G_{3:0}) \circ (1, c_0)\}]\}. \tag{5.20}$$

We next introduce a variant of a carry-look-ahead adder that was proposed in [2]. This variant uses a blocking factor of 2, resulting in a very regular layout of a binary tree with $\log_2 n$ levels, requiring a total area of approximate size $n \cdot \log_2 n$. To illustrate the design of the adder, consider the calculation of $c_{16}$, the incoming carry at stage 16 in a 17-bit (or more) adder, and suppose that $G_0 = x_0 y_0 + P_0 \cdot c_0$.

The tree structure for calculating $c_{16}$ is depicted in Figure 5.8. The part of the tree structure that generates $(P_{7:0}, G_{7:0})$ corresponds to the expression

$$
\begin{aligned}
(P_{7:0}, G_{7:0}) &= (P_{7:4}, G_{7:4}) \circ (P_{3:0}, G_{3:0}) \\
&= \{(P_{7:6}, G_{7:6}) \circ (P_{5:4}, G_{5:4})\} \circ \{(P_{3:2}, G_{3:2}) \circ (P_{1:0}, G_{1:0})\} \\
&= \{[(P_7, G_7) \circ (P_6, G_6)] \circ [(P_5, G_5) \circ (P_4, G_4)]\} \\
&\quad \circ \{[(P_3, G_3) \circ (P_2, G_2)] \circ [(P_1, G_1) \circ (P_0, G_0)]\}.
\end{aligned} \tag{5.21}
$$

FIGURE 5.8 A tree structure for calculating $c_{16}$ (each line except $c_{16}$ represents two signals that are either $x_m$ and $y_m$ or $P_{v:m}$ and $G_{v:m}$).

All the circuits in the second through the fifth levels in Figure 5.8 implement the fundamental carry operation and are therefore identical. $c_{16}$ is equal to $G_{15:0}$ (compare to Equation (5.18)) and, since $P_m = x_m \oplus y_m$, we can calculate the sum bit $s_{16}$ using $s_{16} = c_{16} \oplus P_{16}$. The tree structure in Figure 5.8 also generates the carries $c_2$, $c_4$ and $c_8$. The carry bits for the remaining bit positions can be calculated through extra subtree structures that can be added to the binary tree shown in Figure 5.8. Once all the carries are known, the corresponding sum bits can be computed using Equation (5.17).

In the above design the blocking factor always equals 2. However, the blocking factor does not have to be the same for all levels of the carry generation tree. Different values of the blocking factor may lead to a more efficient use of space and/or shorter interconnections [22].
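As a behavioral illustration of Equations (5.16), (5.17) and (5.21), the sketch below computes every carry from group signals, forms the sum bits, and evaluates $(P_{7:0}, G_{7:0})$ in the balanced tree order. It assumes $P_m = x_m \oplus y_m$; the helper names (`bit_signals`, `group_signals`, `add`, `tree_7_0`) are hypothetical, and the code models logic values only, not gate delays.

```python
def carry_op(hi, lo):
    """Fundamental carry operator (5.11); hi is the more significant pair."""
    return (hi[0] & lo[0], hi[1] | (hi[0] & lo[1]))

def bit_signals(x, y, n):
    """Per-bit (P_m, G_m), with P_m = x_m XOR y_m and G_m = x_m AND y_m."""
    return [(((x ^ y) >> m) & 1, ((x & y) >> m) & 1) for m in range(n)]

def group_signals(pg, i, j):
    """(P_{i:j}, G_{i:j}) by repeated application of Equation (5.12)."""
    acc = pg[j]
    for m in range(j + 1, i + 1):
        acc = carry_op(pg[m], acc)
    return acc

def add(x, y, n, c0=0):
    """n-bit addition using Equations (5.16) and (5.17) with j = 0."""
    pg = bit_signals(x, y, n)
    carries = [c0]
    for m in range(1, n + 1):
        p, g = group_signals(pg, m - 1, 0)
        carries.append(g | (p & c0))            # c_m, Equation (5.16)
    total = 0
    for m in range(n):
        total |= (carries[m] ^ pg[m][0]) << m   # s_m, Equation (5.17)
    return total, carries[n]

def tree_7_0(pg):
    """(P_{7:0}, G_{7:0}) evaluated in the balanced order of Equation (5.21)."""
    pairs = [carry_op(pg[m + 1], pg[m]) for m in range(0, 8, 2)]   # 1:0, 3:2, 5:4, 7:6
    quads = [carry_op(pairs[1], pairs[0]), carry_op(pairs[3], pairs[2])]
    return carry_op(quads[1], quads[0])
```

Because the operator is associative, `tree_7_0` and the serial evaluation in `group_signals` return the same pair; only the depth of the computation differs.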
5.6 PREFIX ADDERS
The adder shown in Figure 5.8 may be viewed as a parallel prefix circuit. A parallel prefix circuit is a combinational circuit with $n$ inputs $x_1, x_2, \ldots, x_n$ producing the outputs $x_1, x_2 \circ x_1, \ldots, x_n \circ x_{n-1} \circ \cdots \circ x_1$, where $\circ$ is an associative binary operation. The first stage of the adder in Figure 5.8 generates the individual $P_i$ and $G_i$ signals. The remaining stages constitute the parallel prefix circuit, with the fundamental carry operation serving as the associative binary operation $\circ$. This part of the adder tree can be designed in many different ways. The particular way this part is implemented within the 16-bit Brent-Kung adder [2] in Figure 5.8 is shown in Figure 5.9. The bullets implement the fundamental carry operation while the empty circles at the top generate the individual $P_i$ and $G_i$ signals. Note that Figure 5.8, which generates $G_{15:0}$, shows only the top four stages of the complete parallel prefix graph in Figure 5.9, which uses seven stages to also generate all the intermediate carries $G_{i:0}$ ($i = 14, 13, \ldots, 1$).

FIGURE 5.9 The Brent-Kung [2] parallel prefix graph.

The number of stages, and consequently the total delay of the adder, can be reduced by modifying the structure of the parallel prefix graph. The minimum number of stages for a parallel prefix graph is $\log_2 n$, which for $n = 16$ is equal to 4, while the number of stages in a Brent-Kung parallel prefix graph is $2\log_2 n - 1$. One way to implement a four-stage parallel prefix graph has been proposed in [17] and is shown in Figure 5.10. Note that, unlike Figure 5.9, the Ladner-Fischer adder employs fundamental carry operators with a fan-in value higher than 2, i.e., the blocking factor varies from 2 to $n/2$. Such an implementation also implies a fan-out of up to $n/2$, requiring buffers which add to the overall delay.

Another parallel prefix graph, which also uses only $\log_2 n$ stages but has lower fan-in and fan-out requirements, has been proposed in [16] and is shown in Figure 5.11. This adder still has a higher number of lateral wires with a longer span than the Brent-Kung adder, and such wires usually require some buffering, resulting in additional delay. Several other variants of parallel prefix graphs have been proposed (e.g., [15]), illustrating that, in general, smaller adder delay can be achieved in exchange for higher overall area and/or power. Compromises between simplicity of wiring and overall delay have also been suggested. For example, a hybrid design combining stages from the Brent-Kung and Kogge-Stone adders was proposed in [12] and is shown in Figure 5.12. It has five rather than four stages, with the middle three resembling the Kogge-Stone structure, but its wires have a shorter span than those in the Kogge-Stone adder.

FIGURE 5.11 The Kogge-Stone [16] parallel prefix graph.
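The difference between a serial prefix computation and a $\log_2 n$-stage one can be made concrete with a small behavioral sketch. The second function below uses a recursive-doubling schedule (position $m$ absorbs position $m - d$, with $d$ doubling every stage), which is the dependence pattern usually associated with the Kogge-Stone graph; this correspondence, the function names, and the choice $P_m = x_m \oplus y_m$ are assumptions of the sketch, and fan-out, wiring and buffering are not modeled.

```python
def carry_op(hi, lo):
    """Fundamental carry operator (5.11); hi is the more significant pair."""
    return (hi[0] & lo[0], hi[1] | (hi[0] & lo[1]))

def bit_signals(x, y, n):
    """Per-bit (P_m, G_m), with P_m = x_m XOR y_m and G_m = x_m AND y_m."""
    return [(((x ^ y) >> m) & 1, ((x & y) >> m) & 1) for m in range(n)]

def prefix_serial(pg):
    """All (P_{m:0}, G_{m:0}) computed one position at a time (n - 1 stages)."""
    out = [pg[0]]
    for m in range(1, len(pg)):
        out.append(carry_op(pg[m], out[-1]))
    return out

def prefix_recursive_doubling(pg):
    """Same outputs in ceil(log2 n) stages; at distance d, position m absorbs m - d."""
    cur, d, stages = list(pg), 1, 0
    while d < len(cur):
        cur = [carry_op(cur[m], cur[m - d]) if m >= d else cur[m]
               for m in range(len(cur))]
        d *= 2
        stages += 1
    return cur, stages
```

For 16 inputs the second function needs four stages, matching the $\log_2 n$ lower bound quoted above, and it relies only on the associativity of the operator.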
FIGURE 5.10 The Ladner-Fischer [17] parallel prefix graph.

FIGURE 5.12 The Han-Carlson [12] parallel prefix graph.
5.7 LING ADDERS

Ling adders [19] are a variation of the carry-look-ahead adders. They use a simpler version of the group-generated carry signal and thus provide an opportunity to reduce the associated delay. We will see the principle behind Ling adders through a simple example. Suppose that we start with a carry-look-ahead adder with groups of size 2, i.e., we produce the signals $G_{1:0}$, $P_{1:0}$, $G_{3:2}$, $P_{3:2}$ and so on. The outgoing carry for position 3 can be expressed as

$$c_4 = G_{3:0} = G_{3:2} + P_{3:2} G_{1:0}, \tag{5.22}$$

where

$$G_{3:2} = G_3 + P_3 G_2, \quad G_{1:0} = G_1 + P_1 G_0, \quad \text{and} \quad P_{3:2} = P_3 P_2.$$

We either assume that $c_0 = 0$ or set $G_0 = x_0 y_0 + P_0 c_0$. Expressing $G_{3:0}$ in terms of the individual carry generate and propagate signals we obtain

$$G_{3:0} = G_3 + P_3 G_2 + P_3 P_2 (G_1 + P_1 G_0). \tag{5.23}$$

Since $G_3 P_3 = G_3$ (which holds for $P_3 = x_3 + y_3$), all terms in the above equation have $P_3$ as a common factor, and we can therefore rewrite the expression for $G_{3:0}$ as

$$G_{3:0} = P_3 H_{3:0},$$

where $H_{3:0}$ is defined as

$$H_{3:0} = H_{3:2} + P_{2:1} H_{1:0}, \tag{5.24}$$

and

$$H_{3:2} = G_3 + G_2, \quad H_{1:0} = G_1 + G_0.$$

Note that $P_{2:1}$ is used in (5.24), in contrast to $P_{3:2}$ which is used in (5.22). Equation (5.24) defines $H$ as an alternative to the carry generate $G$ and shows that $H$ can be calculated in a similar manner. However, $H$ does not have a simple interpretation like $G$. On the other hand, $H$ is simpler to calculate. For example, expressing $H_{3:0}$ in terms of the individual bit signals yields

$$H_{3:0} = G_3 + G_2 + P_2 P_1 (G_1 + G_0),$$

which can be simplified to

$$H_{3:0} = G_3 + G_2 + P_2 G_1 + P_2 P_1 G_0, \tag{5.25}$$

while Equation (5.23) can be rewritten as

$$G_{3:0} = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0. \tag{5.26}$$

The maximum fan-in in (5.25) is smaller than that in (5.26), leading to simpler and usually faster circuits. Other variations of the expression for the group-generated carry $G$ have corresponding variations for $H$. For example, for the equation $G_{3:0} = G_3 + P_3 G_{2:0}$, the corresponding equation for $H$ is $H_{3:0} = G_3 + T_2 H_{2:0}$ where $T_2 = x_2 + y_2$. A more general expression for $H$ is $H_{i:0} = G_i + T_{i-1} H_{i-1:0}$ where $T_{i-1} = x_{i-1} + y_{i-1}$.

The calculation of the sum bits in a Ling adder is slightly more involved than that for the carry-look-ahead adder. To illustrate this calculation consider $s_3$:

$$
\begin{aligned}
s_3 &= c_3 \oplus (x_3 \oplus y_3) = (P_2 H_{2:0}) \oplus (x_3 \oplus y_3) \\
&= \overline{H_{2:0}}\,(x_3 \oplus y_3) + H_{2:0}\,(P_2 \oplus (x_3 \oplus y_3)).
\end{aligned} \tag{5.27}
$$

The calculation of $H_{2:0}$ is faster than that of $c_3$, reducing the delay associated with generating $s_3$.

Three other variations of the carry-look-ahead adder which have properties similar to those of Ling's adder have been presented in [7]. An implementation of a 32-bit Ling adder is described in [8]. A multiplexer-based implementation of Ling adders is presented in [26].

5.8 CARRY-SELECT ADDERS

In a carry-select adder the $n$ bits are divided into nonoverlapping groups of possibly different lengths. The underlying strategy is similar to that of the conditional-sum adder described in Section 5.3. Each group generates two sets of sum bits and an outgoing carry. One set assumes that the incoming carry into the group is 0, the other assumes that it is 1. When the incoming carry into the group is assigned its final value, it selects one of the two sets, as is shown in Figure 5.3. Figure 5.13 is a more detailed version of Figure 5.3 depicting the $l$th group, which consists of $k$ bit positions starting with bit position $j$ and ending with bit position $i$, where $i = j + k - 1$.

The outputs of the group are the sum bits $s_i, s_{i-1}, \ldots, s_j$ and the $l$th group outgoing carry $c_{i+1}$. The corresponding Boolean equations are

$$s_m = s_m^0 \cdot \overline{c_j} + s_m^1 \cdot c_j; \quad m = j, j+1, \ldots, i, \tag{5.28}$$

and

$$c_{i+1} = c_{i+1}^0 \cdot \overline{c_j} + c_{i+1}^1 \cdot c_j, \tag{5.29}$$

where $s_m^0$ is the $m$th sum bit under the condition that the incoming carry into the $l$th group is 0. This is the same notation that we have used for the conditional-sum adder. The notations $s_m^1$, $c_{i+1}^0$ and $c_{i+1}^1$ are defined similarly.

The two separate sets of outputs can be calculated in a ripple-carry manner. Thus, for bit position $m$ we calculate $c_m^0$ and $c_m^1$ from $G_{m-1:j}^0$ and $G_{m-1:j}^1$,
respectively, which in turn are calculated from (see Equation (5.19))

$$(P_{m-1:j}, G_{m-1:j}^0) = (P_{m-1}, G_{m-1}) \circ (P_{m-2}, G_{m-2}) \circ \cdots \circ (P_j, G_j) \tag{5.30}$$

and

$$(P_{m-1:j}, G_{m-1:j}^1) = (P_{m-1:j}, G_{m-1:j}^0) \circ (1, 1) = (P_{m-1:j},\; G_{m-1:j}^0 + P_{m-1:j}). \tag{5.31}$$

Note that $P_{m-1:j}$ has no superscript since it is independent of the incoming carry. Once the individual bit carries have been calculated, the corresponding sum bits are

$$s_m^0 = c_m^0 \oplus P_m \quad \text{and} \quad s_m^1 = c_m^1 \oplus P_m.$$

Since $c_{i+1}^0$ implies $c_{i+1}^1$ (i.e., if $c_{i+1}^0 = 1$ then $c_{i+1}^1$ must equal 1), we can simplify Equation (5.29) to read

$$c_{i+1} = c_{i+1}^0 + c_{i+1}^1 \cdot c_j. \tag{5.32}$$

FIGURE 5.13 The $l$th group, consisting of the $k$ bit positions $j, j+1, \ldots, i$, in a carry-select adder.

The sizes of the separate groups can be either different (e.g., [33]), or they can all be equal to $k$ (e.g., [1]), with possibly one group of size smaller than $k$. In the first case the size of the $l$th group is chosen so as to equalize the delay of the ripple-carry within the group and the delay of the carry-select chain from group 1 to group $l$. The above two delays depend on the technology employed and the exact implementation. If, for example, we assume a simple two-level gate implementation for the multiplexing circuit corresponding to Equation (5.32), then the delay associated with the carry-select chain through the preceding $l - 1$ groups is $(l - 1) \cdot 2\Delta_G$, where $\Delta_G$ is the delay of a single gate. The delay of the ripple-carry chain through the $k_l$ bit positions in the $l$th group, when again we assume a simple two-level gate implementation, is $k_l \cdot 2\Delta_G$. Equalizing these two delays results in

$$k_l = l - 1 \quad \text{with} \quad k_l \ge 1; \quad l = 1, 2, \ldots, L, \tag{5.33}$$

where $L$ is the number of groups as shown below.

$$k_L \;\big|\; \cdots \;\big|\; k_l \;\big|\; \cdots \;\big|\; k_2 \;\big|\; k_1$$

In other words, the group lengths should follow the simple arithmetic progression 1, 1, 2, 3, ..., and the total number of bits, $n$, must satisfy

$$1 + L(L - 1)/2 \ge n, \tag{5.34}$$

and consequently

$$L(L - 1) \ge 2(n - 1). \tag{5.35}$$

As a result, the size of the largest group and the execution time of the carry-select adder are of the order of $\sqrt{n}$. For example, for $n = 32$, based on Equation (5.35), nine groups are required. One possible choice for their sizes is 1, 1, 2, 3, 4, 5, 6, 7 and 3. The total carry propagation time, under the assumption of a two-level gate implementation, is $18\Delta_G$, instead of $62\Delta_G$ for the ripple-carry adder.

If the lengths of all $L$ groups are equal, the carry-select chain (i.e., generating the Group_l_Carry-out from the Group_l_Carry-in, see Figure 5.13) does not necessarily have to be of the ripple-carry type. Instead, a single or even multiple-level carry-look-ahead can be employed [1].

Compared to the ripple-carry adder, the carry-select adder requires a duplicate carry-chain logic and additional carry-select logic. However, this logic circuit overhead can be reduced by observing that the two separate carry-chains in Figure 5.13 and the multiplexing circuitry can be combined to yield a simpler implementation [33]. In such an implementation each bit position includes the following logic circuitry:

$$G_m = x_m y_m; \quad P_m = x_m \oplus y_m; \quad P_{m:0} = P_m \cdot P_{m-1:0},$$

and

$$c_m^0 = G_m + P_m \cdot c_{m-1}^0; \quad s_m = P_m \oplus (c_{m-1}^0 + c_j \cdot P_{m-1:0}). \tag{5.36}$$

The proof of these equations is left to the reader as an exercise. Other variations of the carry-select adder have been proposed and implemented, with some of these described in Section 5.10.
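The group-size rule of Equation (5.35) and the selection step of Equations (5.28) and (5.32) can be illustrated with a short behavioral sketch. It is only a functional model under stated assumptions: each group's two candidate results are obtained with ordinary integer addition rather than two ripple-carry chains, and the names `min_groups` and `carry_select_add` are hypothetical.

```python
def min_groups(n):
    """Smallest L satisfying L(L - 1) >= 2(n - 1), Equation (5.35)."""
    L = 1
    while L * (L - 1) < 2 * (n - 1):
        L += 1
    return L

def carry_select_add(x, y, sizes, c0=0):
    """Add x and y; sizes lists the group lengths, least significant group first."""
    result, carry, pos = 0, c0, 0
    for k in sizes:
        mask = (1 << k) - 1
        xg, yg = (x >> pos) & mask, (y >> pos) & mask
        t0, t1 = xg + yg, xg + yg + 1            # candidates for carry-in 0 and 1
        s0, c_out0 = t0 & mask, t0 >> k
        s1, c_out1 = t1 & mask, t1 >> k
        result |= (s1 if carry else s0) << pos   # selection, Equation (5.28)
        carry = c_out0 | (c_out1 & carry)        # simplified form, Equation (5.32)
        pos += k
    return result, carry
```

With `sizes = [1, 1, 2, 3, 4, 5, 6, 7, 3]`, the 32-bit organization quoted above, the function returns the same sum and carry-out as ordinary addition, and `min_groups(32)` reproduces the nine groups required by Equation (5.35).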
5.9 CARRY-SKIP ADDERS

A carry-skip adder reduces the time needed to propagate the carry by skipping over groups of consecutive adder stages. As such, the carry-skip adder generalizes the idea behind the Manchester adder described in Section 5.4. The carry-skip adder illustrates the dependence of the "optimal" algorithm for addition on the available technology. Although the carry-skip algorithm has been known for many years, it has become popular only recently. In VLSI technology the carry-skip adder is comparable in speed to the carry-look-ahead technique (for commonly used word lengths, but not necessarily in the asymptotic sense), but it requires less chip area and consumes less power.

The carry-skip adder is based on the following observation. The carry propagation process can skip any adder stage for which $x_m \ne y_m$ (or in other words, $P_m = x_m \oplus y_m = 1$). Several consecutive stages can be skipped if all satisfy $x_m \ne y_m$. Thus, an adder consisting of $n$ stages is divided into groups of consecutive stages with a simple ripple-carry scheme used in each group. Every group also generates a group-carry-propagate signal that equals 1 if all stages internal to the group satisfy $P_m = 1$. This signal can be used to allow an incoming carry into the group to "skip" all the stages within the group and generate a group-carry-out. Let a particular group, say, group $l$, consist of the $k$ bit positions $j, j+1, \ldots, j+k-1$ (with $i = j + k - 1$) as shown in Figure 5.14. Based on Equation (5.16), the Boolean expression for Group_l_Carry-out is

$$\text{Group\_}l\text{\_Carry-out} = G_{i:j} + P_{i:j} \cdot \text{Group\_}l\text{\_Carry-in},$$

where $G_{i:j}$ equals 1 when a carry is generated internal to the group and is allowed to propagate through all the remaining bit positions including $i$. $P_{i:j}$ equals 1 when the $i - j + 1$ bit positions allow the incoming carry $c_j$ to propagate to the next bit position, $i + 1$. The buffers shown in the figure realize the OR operation in the above Boolean expression.

FIGURE 5.14 The $l$th group consisting of bit positions $j, j+1, \ldots, i$ in a carry-skip adder.

Figure 5.15 depicts a 15-bit carry-skip adder consisting of three groups, each of size 5. Notice that the signals $P_{i:j}$ for all groups can be generated simultaneously, allowing a fast skip of groups which satisfy $P_{i:j} = 1$.

FIGURE 5.15 A 15-bit carry-skip adder.

We wish to determine the optimal size of the group, $k$. This optimal size depends on the ratio between the carry-ripple time through a single stage, denoted by $t_r$, and the time it takes to skip a group of size $k$, denoted by $t_s(k)$. The latter is, for most implementations, independent of $k$.

Assume first that all groups are of the same size $k$ and, for simplicity, assume further that $n/k$ is an integer. The group size $k$ should be selected so that the time for the longest carry-propagation chain is minimized. The longest carry-propagation chain occurs when a carry is generated in stage 0 and then propagates all the way to stage $n - 1$. This means that the carry will ripple through stages $1, 2, \ldots, k - 1$ within group 1, skip groups $2, 3, \ldots, (n/k - 1)$, then ripple through group $n/k$. The overall carry-propagation time is in this case

$$T_{carry} = (k - 1) \cdot t_r + t_b + (n/k - 2) \cdot (t_s + t_b) + (k - 1) \cdot t_r, \tag{5.37}$$

where $t_b$ is the delay associated with the buffer (which implements the OR operation) between two groups, as shown in Figure 5.15. If, for example, a straightforward two-level gate implementation is employed for both the ripple-carry circuit and the carry-skip circuit, then $t_r$ would equal $t_s + t_b = 2\Delta_G$, yielding (taking $t_b = \Delta_G$)

$$T_{carry} = (4k + 2n/k - 7) \cdot \Delta_G.$$

Differentiating $T_{carry}$ with respect to $k$ and equating the derivative to 0 results in

$$k_{opt} = \sqrt{n/2}.$$

As for the carry-select adder, the group size and the carry propagation time are proportional to $\sqrt{n}$. For example, for $n = 32$, eight groups of size $k_{opt} = \sqrt{32/2} = 4$ will provide the best design with $T_{opt} = 25\Delta_G$, instead of $62\Delta_G$ for the ripple-carry adder.
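The equal-size analysis can be reproduced numerically. The sketch below evaluates Equation (5.37) for every group size that divides $n$ and picks the fastest one. Delays are expressed in units of $\Delta_G$, and the split $t_s = t_b = \Delta_G$ is an assumption chosen to be consistent with the constant term in $(4k + 2n/k - 7) \cdot \Delta_G$; the function names are illustrative.

```python
import math

def t_carry_equal(n, k, t_r, t_s, t_b):
    """Equation (5.37): worst-case carry time for n/k groups of equal size k."""
    return (k - 1) * t_r + t_b + (n // k - 2) * (t_s + t_b) + (k - 1) * t_r

def best_equal_group_size(n, t_r, t_s, t_b):
    """Brute-force search over divisors of n that leave at least two groups."""
    candidates = [k for k in range(1, n // 2 + 1) if n % k == 0]
    return min(candidates, key=lambda k: t_carry_equal(n, k, t_r, t_s, t_b))

if __name__ == "__main__":
    n = 32
    k = best_equal_group_size(n, t_r=2, t_s=1, t_b=1)
    print(k, t_carry_equal(n, k, 2, 1, 1), math.isqrt(n // 2))
```

For $n = 32$ the search selects $k = 4$ with a delay of 25 gate delays, in agreement with $k_{opt} = \sqrt{n/2}$. A brute-force search is used instead of the derivative so that group sizes stay integral.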
Revisiting the previous analysis, one should realize that further speed-up can be achieved if we make the size of the first and last groups even smaller than the fixed size $k$, and in this way reduce the ripple-carry delay through these groups. Also, we may increase the size of the center group, since the skip time is usually independent of group size. Another way to reduce $T_{carry}$ is to design a second level of skip circuitry that would allow skipping two or more groups in one step. Additional levels can also be envisioned.

The idea of using unequal group sizes has already been suggested in the past [18]. Only recently have several algorithms been developed for deriving the optimal group sizes for different technologies and implementations (i.e., different values of the ratio $(t_s + t_b)/t_r$).

We will now formulate the problem and illustrate its solution through several examples. Note first that, unlike the simple analysis for the equal-sized group case above, we cannot restrict ourselves to the analysis of the worst case for carry propagation. This may lead to the trivial conclusion that the first and last groups should consist of a single stage, while all remaining $n - 2$ stages should constitute a single center group. In this design, a carry generated at the beginning of the center group may ripple through all the other $n - 3$ stages, becoming the worst case. We therefore need to consider all possible carry-propagation chains that may start at an arbitrary bit position $a$ (for which $x_a = y_a$) and stop at the next position $b$ that also satisfies $x_b = y_b$, where a new carry-propagation chain (independent of the previous one) may start.

Let $k_1, k_2, \ldots, k_L$ denote the sizes of the $L$ different groups starting at position 0. Then

$$\sum_{l=1}^{L} k_l = n.$$

In the most general case, a carry-propagation chain starts at some position within group $u$, ends at some position within group $v$, and skips the groups $u + 1, u + 2, \ldots, v - 1$. In the worst case, the carry will be generated in the first position within group $u$, and will stop in the last position within group $v$. The overall carry-propagation time, denoted by $T_{carry}(u, v)$, is

$$T_{carry}(u, v) = (k_u - 1) \cdot t_r + t_b + \sum_{l=u+1}^{v-1} (t_s(k_l) + t_b) + (k_v - 1) \cdot t_r. \tag{5.38}$$

The set of group sizes $k_1, k_2, \ldots, k_L$ should be selected so that the longest carry-propagation chain is minimized:

$$\text{minimize} \left[ \max_{1 \le u \le v \le L} T_{carry}(u, v) \right].$$

To solve this optimization problem, the size of the groups, as well as the number of groups, $L$, must be determined. Algorithms for solving this problem have been developed, relying on either geometrical interpretations (e.g., [13]) or dynamic programming [4].

Example 5.5

The optimal organization for a 32-bit carry-skip adder with a single level of carry-skip has been derived in several different ways. This optimal organization includes $L = 10$ groups with sizes $k_1, k_2, \ldots, k_{10} = 1, 2, 3, 4, 5, 6, 5, 3, 2, 1$ for $t_s + t_b = t_r$, yielding $T_{carry} \le 9 \cdot t_r$ [13]. If $t_r = 2\Delta_G$, then $T_{carry} \le 18\Delta_G$, instead of $25\Delta_G$ as in the equal-size group case. The reader can verify that any two bit positions in any two groups $u$ and $v$ ($1 \le u \le v \le 10$) satisfy $T_{carry}(u, v) \le 9 \cdot t_r$. □

The similarities between the carry-skip and carry-select adders and their carry propagation times should not come as a surprise. Although the strategies behind the two schemes sound different, the equations relating the group-carry-out with the group-carry-in are, in both cases, variations of the same basic Equation (5.16). Only the details of the implementation vary, in particular the calculation of the sum bits. Even this difference is reduced when the multiplexing circuitry is merged into the summation logic according to Equation (5.36).

5.10 HYBRID ADDERS

Hybrid adders are adders which use a combination of two or more of the previously described methods for addition. A common approach to the design of hybrid adders is to choose one method for carry propagation and another method for sum calculation. The two hybrid adders presented in this section combine some variation of a carry-select adder for calculating the sum and a modified Manchester adder for carry propagation. Both divide the operands into groups of equal size, 8 bits each.

The first hybrid adder [20] employs the carry-select method for calculating the sum for each group of 8 bits separately, as shown in Figure 5.16. The group carry-in signal that selects one out of the two sets of sum bits is not generated in a ripple-carry manner as shown in Figure 5.13. Instead, the carries into the 8-bit groups are generated by a carry-look-ahead tree as proposed in [1]. In the case of a 64-bit adder these are $c_8$, $c_{16}$, $c_{24}$, $c_{32}$, $c_{40}$, $c_{48}$ and $c_{56}$ (see Figure 5.16).

The structure of a carry-look-ahead tree for generating these carries would be similar but not necessarily identical to that shown in Figure 5.8. The differences between such structures stem from variations in the blocking factor at each level of the tree and the exact implementation of a module for calculating the fundamental carry operator. If we restrict ourselves to a fixed blocking factor, the natural choices for groups of size 8 bits include 2 (as in Figure 5.8), 4 or 8.
FIGURE 5.17 A Manchester carry module for calculating a group propagate (circuit a) and generate (circuit b) for a group of size 4.
The first choice results in the largest number of levels in the tree while the last one results in complex modules for the fundamental carry operator with a high delay. A blocking factor of 4 represents a reasonable compromise and has been selected in [20]. A Manchester carry propagate/generate module (MCC in Figure 5.16) with a blocking factor of four is depicted in Figure 5.17.

In the most general case the Manchester carry module in Figure 5.17 accepts four pairs of inputs: $(P_{i_1:i_0}, G_{i_1:i_0})$, $(P_{j_1:j_0}, G_{j_1:j_0})$, $(P_{k_1:k_0}, G_{k_1:k_0})$ and $(P_{l_1:l_0}, G_{l_1:l_0})$, where $i_1 \ge i_0$, $j_1 \ge j_0$, $k_1 \ge k_0$ and $l_1 \ge l_0$. It produces three pairs of outputs: $(P_{j_1:i_0}, G_{j_1:i_0})$, $(P_{k_1:i_0}, G_{k_1:i_0})$ and $(P_{l_1:i_0}, G_{l_1:i_0})$, under the conditions $i_1 \ge j_0 - 1$, $j_1 \ge k_0 - 1$ and $k_1 \ge l_0 - 1$. These conditions allow overlap among the input subgroups following Equation (5.15). A schematic diagram showing the operation of the carry module is depicted in Figure 5.18.
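The claim that overlapping input subgroups are permitted can be checked with a small behavioral model of the general-case module. The sketch below is not a transistor-level model of the Manchester circuit; it only applies the fundamental carry operator cumulatively, and the names `carry_module` and `group_signals`, the example subgroup boundaries, and the choice $P_m = x_m \oplus y_m$ are assumptions made for illustration.

```python
def carry_op(hi, lo):
    """Fundamental carry operator (5.11); hi is the more significant pair."""
    return (hi[0] & lo[0], hi[1] | (hi[0] & lo[1]))

def bit_signals(x, y, n):
    """Per-bit (P_m, G_m), with P_m = x_m XOR y_m and G_m = x_m AND y_m."""
    return [(((x ^ y) >> m) & 1, ((x & y) >> m) & 1) for m in range(n)]

def group_signals(pg, i, j):
    """(P_{i:j}, G_{i:j}) by repeated application of Equation (5.12)."""
    acc = pg[j]
    for m in range(j + 1, i + 1):
        acc = carry_op(pg[m], acc)
    return acc

def carry_module(inputs):
    """Cumulative (P, G) groups; inputs ordered from least to most significant.

    With four inputs i, j, k, l this returns the three output pairs
    (j1:i0), (k1:i0) and (l1:i0) of the general-case module.
    """
    outputs, acc = [], inputs[0]
    for nxt in inputs[1:]:
        acc = carry_op(nxt, acc)
        outputs.append(acc)
    return outputs

# Example subgroups (high, low): 3:0 and 7:2 overlap in bits 2-3, 15:9 and 9:8 in bit 9.
RANGES = [(3, 0), (7, 2), (9, 8), (15, 9)]

def module_outputs(x, y):
    pg = bit_signals(x, y, 16)
    return carry_module([group_signals(pg, hi, lo) for hi, lo in RANGES])
```

For these boundaries the three outputs equal $(P_{7:0}, G_{7:0})$, $(P_{9:0}, G_{9:0})$ and $(P_{15:0}, G_{15:0})$ computed directly, which is the idempotency argument behind Equation (5.15) in action.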
FIGURE 5.16 A schematic diagram of a 64-bit hybrid adder [20].

FIGURE 5.18 A schematic diagram describing the operation of the Manchester carry module in Figure 5.17 in the general case (inputs $i_1:i_0$, $j_1:j_0$, $k_1:k_0$, $l_1:l_0$; outputs $j_1:i_0$, $k_1:i_0$, $l_1:i_0$).
FIGURE 5.19 The available inputs and required outputs at the third level of the carry-look-ahead tree. (A dotted line represents an optional dependence.)
1'11(' fir.-I 1(,\'\'1 uf thl' l"Iury-lllnk-ahl'ad tre,' fllr (I (.i I-hit nddl'r indud,
1-1 M/Uldlt'Sh>r cnrry IUOlhlll's mlti cakulnt(':; l1Jo.GJ'o). (P;..., G r :..), .
lPs,:,>::u. G':\:\::>2), i.t'., lluly 11.(' I)Utputs r.w 111111 0'3:0 in Figurl' 5.1;- 1m' utililt'C,l
III th,' ('('('Ind 1('\,,1 tlf I ht> cnrr'-lol)k-l\he:'UtI t t\'t' ,'Ucll ll\udll':'lt'r ('urry 1II0tlul,>
g('lll'rah>s tWolmi of l1utput$. ....)rr'':ip'ln,ling fl' (/.o, GI.O) m\(1 \l'l.O, G 1 _ 0 ) iu
Figurt' 5.1;, TIIU, t.11l' S\'('onll It'\'('1 of lIIodull's I'ft 1 \"idl'S I hI' \1Ulll>S (11;:0, G.: o ).
(P I :\:O,Gli!I:l1), li21 16,G 2 : I . W ). (PJI:16,GJI:16), (r.'!}:3.l.G"I}.31), (r...: 32 ,G,.::u).
und (:..s, G5:IS) (1'\'1(> F4,'<\trl. 5.16), This 1('\"(>1 J;plIt'rnh'S th,' ,"urrit's ('1\ untl ('16.
which 1'I()lIal <:.:0 Nill G I5 ,O. 1't-'S1""(.ti\,l.ly, Silll'\'I\) is I\lft'udy illl''\npurI\lt'll into th,>
I1IUdlll,' thul g"ut'rat \PJ;O, G3:0) in till' tln-:t It>\'\'1 1201 (s,'C,' E')lll\tinn \5.1S).
I'h,' n\'nilahlf' illputs, I'l"l)uiml outputs alld tht' tll'p('lltl('u("{' mllong I ht'm
ill th" third 1l'",1 of thl' curr'.look-llllt'u,1 tn'l(' art" shown ill Fun' S.W. (''nrly
two 1l\lIdll"St(>r carr)' Ulodul,>s IU'l> suffici"1I1 tu luodUl'\' Ih,' t\'C,luin...1 l.utput:;
in Figut\' 5.1.. Om> such modull' cnn iml)I,'lIIt>nt th,' rdntil 1 11ships indllllt'C,l in
t hl' dn...h,"<! ho." in t hi tlgut\'. This Ulo,lull' will gt'lIl'rnh' till> carries C'2.., ('31
anel ('...0. A 1itx'oml ml),lul(' cnn I>ru,hl('1.' I h,> two n'lIIluliing pl\i of rt'<}lIin'l.l
out puis with th,' inputs com>spnnlling t.o 55:-IS, .I'i:a:!. n:lIi un,l 15:0. rla
mod III,' will g.>m>n'tp the ('unil's C..g I\n,l 6. Nutil'(' in pnrtintlar, tlll\t thl' tin-:t
ntl.dul(' d('Scribt'll by till' da....hl'd box in Figure:' 5.1) must impl"IIII>lIt 111l' I\\'u
r!oU.{'(\ Iin,"S from 23: 16 ill IIII' ti'1lrl' sitll'\' 21: 16 is (t'C,luin'll for gl'IIl'ral in :.!3:0.
rht> /Ibm'" dl':inibt'(1 iUlI}lpml'lIt/lt iun of u I; I-hil, nddl'r ('\llIIhinillj: t h,' cl\rr'-
sl,lcct SI'lll'lUt" for gt'm'rRtillg Ih(' sum bils IUld 1\ lnlld1th'r-bn...;.'(I,'nrly-k..)k.
nhl>Rei tn'\' for 'nl'natill th(' nU'ril's illt!) thl' flJUJIS i lII)t ullilJlll' alld dol'
not 1Il'('t"'l'rily minimizp t hl' Iw,'mll"",'C,'utiun tillll'. Altc,rtlnh' impl(,lIlI'lItnt ions
including vuriablt. siz,' of thl' l"I,rry-sl'l('('1 roul)S mid of thl' i\llUldu'Stl'r mrry
IlIo,lulo>s al thl" ,lilli'rt"lIt I('\'I'L..; ,}f I Ill' I n'l' m(lY pron' It) Udlil'\"> n II'Wl'r ,'Xl'CUtiOIl
t imp
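The three levels of the tree described above can be traced with a behavioral sketch that produces the carries into the 8-bit groups. It follows the grouping listed in the text but models only the logic values, not the Manchester circuits or their delays; the function names are hypothetical, and $c_0$ is folded into $G_0$ as suggested by Equation (5.18).

```python
def carry_op(hi, lo):
    """Fundamental carry operator (5.11); hi is the more significant pair."""
    return (hi[0] & lo[0], hi[1] | (hi[0] & lo[1]))

def carry_module(inputs):
    """Cumulative (P, G) groups; inputs ordered from least to most significant."""
    outputs, acc = [], inputs[0]
    for nxt in inputs[1:]:
        acc = carry_op(nxt, acc)
        outputs.append(acc)
    return outputs

def group_carries_64(x, y, c0=0):
    """Return {8: c8, 16: c16, ..., 56: c56} for a 64-bit addition."""
    pg = [(((x ^ y) >> m) & 1, ((x & y) >> m) & 1) for m in range(56)]
    p0, g0 = pg[0]
    pg[0] = (p0, g0 | (p0 & c0))              # fold c0 into G_0 (see Equation (5.18))

    # Level 1: fourteen 4-bit groups (3:0), (7:4), ..., (55:52).
    l1 = [carry_module(pg[b:b + 4])[-1] for b in range(0, 56, 4)]

    # Level 2: each module reports its 2-input and 4-input cumulative groups.
    def level2(parts):
        out = carry_module(parts)
        return out[0], out[-1]
    g7_0, g15_0 = level2(l1[0:4])
    g23_16, g31_16 = level2(l1[4:8])
    g39_32, g47_32 = level2(l1[8:12])
    g55_48 = carry_module(l1[12:14])[-1]

    # Level 3: two modules; the first one combines the overlapping inputs 23:16 and 31:16.
    g23_0, g31_0, g39_0 = carry_module([g15_0, g23_16, g31_16, g39_32])
    _, g47_0, g55_0 = carry_module([g15_0, g31_16, g47_32, g55_48])

    groups = [g7_0, g15_0, g23_0, g31_0, g39_0, g47_0, g55_0]
    return {8 * (t + 1): g for t, (_, g) in enumerate(groups)}
```

Each returned carry can then act as the select signal of the corresponding 8-bit carry-select group. The first third-level module works only because overlapping subgroups are allowed by Equation (5.15).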
The 64-bit adder described in [6] divides the 64 bits into two sets of size 32 bits. Each set of 32 bits is, in turn, further divided into four groups of 8 bits. For every group of eight bits, two sets of conditional sum outputs are generated separately. The two most significant groups are then combined into a single larger group of size 16. This larger group is further combined with the next group of size 8 to form an even larger group of size 24 bits, and so on, following the principle of conditional-sum addition. However, the way the input carries for the basic 8-bit groups are generated is completely different from the method described in Section 5.3. A schematic diagram depicting the low-order half of the 64-bit adder is shown in Figure 5.20. The Manchester carry chain unit in this figure generates the $P_m$, $G_m$ and $K_m$ signals for the individual bits and calculates the $c_{out}^0$ and $c_{out}^1$ outputs for the assumed incoming carry of 0 and 1, respectively. These conditional carry-out signals control the multiplexers and determine which conditional sum bits are forwarded to the next level of multiplexing. The two sets of dual multiplexers (of size 8 and 16 bits) and the single regular multiplexer of size 24 bits are implemented using the circuits shown in Figure 5.21. The high-order half of the 64-bit adder has a structure similar to that in Figure 5.20.

FIGURE 5.20 A schematic diagram of a 32-bit hybrid adder.
124 5. Fost Addition
s:" s2, I s
'"
...
c_. C;;
Clu.
,J- Ct
SO S,,,
on
(a) (b)
5 11 Cony-Save Adders
125
z
u
=
(3 2) Counl
e 11
F1GURE 5.22 A (3,2) counter.
F1GURE 5.21 The basic circuit for a duol mult1plexer (0) and a single multiplexer
(b). The signals cr and cr are generated by the Manchester corry unit of the
preceding group of 8 bits.
carn'-savc addition, we let t 1)(> ("'\fry propagatt" ollly in the last step, while in all
the tber steps we genf'rate a partidl slim and a 8('(Iu p nce of carries $'parat€ly.
Thu'i. the basic carry-save adder (CSA) accepts three n-bit oppralllis and genpr-
ates two n-bit results, an n-bit partial sum. and an n-bit c.arry. A :.econd CSA
dc('ept tbese two bit-Sl'<Juent't'$ and another input opt'nnd. and gelwrates a II'JW
partial sum and C'Uf)'. A CSA is then'fore, capable of reducing the numbpr of
operand" to be added from 3 to 2. without any carry propagation.
A carry-save adder may be implemented in sc\('ral different waYS. In thf"
simplest implementation, the basic clement of the carry-save adder is a full adder
with three input, r, y, and z. whose arithmetic operatiun can be dpscribf"<1 by
Dlain diffel"t"'nce is that the incoming c.'\fry, C32, is calculated by a separate carry-
look-ahead circuit wlaOSt' inputs an? th{' conditional carry-out signals gt3lt'rated
by the four Mancbt':;ter carry units in Figure 5.20. This allows the opt>ratiul of
the brgh-order balf of tbe 64-bit adder to o\"t'rlap the operation of the low-<>rder
half. III :;\Iuuuary. this adder combines variants of thn>e diff,'rent techniques for
fast addition: Manclu'.:Ster carry generation. carry-select and conditioual-sum.
Othl'r designs of hybrid adders can bE> ('nvisioned, combining variations of
tilt' ba.."ic methods for fast addition, possibly implementing for eXNnplc, groups
with UlleqUal number of bits. Olle :>uch adder, a Manche.;tt'r adder \\ ith v8riabl
(group size) carry-skip, hdS bt.."'n propo.. and a.llalyZt'd in (3). The "optimal-
ity" of h.rbrid 8dders is higbl}' dependent 011 the available tedmology and its
particular delay p.uamctcrs.
x + Y + z = 2c + s.
(5.39)
where sand c are the sum and carry output.s, respecti\.ely. Th£'ir valul'S arc
s = (x + y + z) mod 2
and
c =
(x + Y + z) - s
2
(5.-10)
\\ hen three or more operands are to be added simultaneously (e.g., in multipli-
cation) using lwo-operand adders, the timf'ooonsuming carf)'-propagation mnst
be repe3l('<l scveral times. If t he number of operands is k, t hen carrit haw to
propagate (k - 1) times. Several tt.'dmiqu for multiple op,'rand addition that
&UE>Wpt to lower th carry-propagation pmalty baw b>n propo.. and imple-
mented. Th", teclmiquc that is most commonly u.'i<'d is carry-save addrtlon. In
The outputs arf" t be WE>ighted binar) reprf'SE'ntat.ion of t hp mWlber of l's in the
inputs. We therefore call the FA a (3.2) counter, 8.., shown in Figurf' 5.22. An
n-bit CSA consists of n (3,2) c.ount.{'rs operating ill paralM with no carry link:.
inh'rconnecting them.
A carf)'-save adder for four .t-bit opE>rands X, } . Z, and W, is shuwlI in
Figure 5.23. Thp upper two levels arp .I-bit ('SAs. whilE> th€' third le\('1 is a -I-bit
carry-propagating adder (CPA). Th\" latter is a rippl(,-ccU'ry add('r, but. ma)' be
replaced by a carry-look-ahpad adder or any othpr f8....t CPA. One :;hould note
that partial sum bits and carry bits ari' interconnected to uarNltf'(' that rnly
bits having the same wpiht lire added by a.ny (3.2) c()unr_
In order to add thp k op.'rands X I, .\'"2, .... X le we m..ro (k - 2) (,SA WI it:.
and OIiP CPA. If th\" CSAs are arrangro ill a ca.';;('adl, as in Figure 5.23. then the
time to add the k operands is
5.11 CARRY-SAVE ADDERS
(k - 2). Tes." + TCPA,
126
5. Fast AddItiOn
5.11 ConV,SOV'& AddElrs
r Sf] :s
r:: =.
Z'I JtI =,
x,
\J
\,
56
54
So
5"
51
.s,
AGURE 5.23 A carry-sove odder for four operands
wht'N'Tcp.-\ is tbl' opt.>ration tim4" of a C'P.\ and Tcs.\ is th(' op(,rE\tion tilU(' of a
CSA, which equals tbe dd8' of a full .uld('r. f''''' Tht' latt('r i.. ut It'<\S! 2 . G,
whert 11c is th .1t>18)' of a singlt' at. Nott' that. tilt' tillnl rt'nh nmy rmdl a
lengtb of '1 + f/o!l",1k' bit.s, sinl th(' sum of k opl'r8nds. of si.l n bits t'ad,. l'lU}
be as large as (2 n - l)k.
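
The cascade described above can be sketched in a few lines of Python. This is only an illustrative model, not a circuit description: the names `csa` and `carry_save_sum` are hypothetical, operands are modeled as unbounded Python integers, and the final `+` stands in for the single carry-propagating adder.

```python
def csa(x, y, z):
    """One carry-save level: three operands in, (partial sum, carry) out."""
    s = x ^ y ^ z                              # per bit: s = (x + y + z) mod 2
    c = ((x & y) | (x & z) | (y & z)) << 1     # per bit: carry, shifted to the next weight
    return s, c


def carry_save_sum(operands):
    """Add k >= 3 operands using a cascade of k - 2 CSAs and one CPA."""
    s, c = csa(operands[0], operands[1], operands[2])
    for w in operands[3:]:
        s, c = csa(s, c, w)                    # each extra operand costs one more CSA
    return s + c                               # the only carry-propagating addition


ops = [0b1011, 0b0110, 0b1111, 0b0101]
total = carry_save_sum(ops)
```

For the four 4-bit operands above, two CSA levels and one final addition are used, matching the (k - 2) CSA units and one CPA of the cascade.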
A better way to organize the CSAs, and reduce the operation time, is in
the form of a tree commonly called a Wallace tree [34]. A six-operand Wallace
tree is illustrated in Figure 5.24. The left arrows on the carry outputs of the
CSAs indicate that these outputs have to be shifted to the left before being
added to the sum bits, as shown in Figure 5.23. In this tree, the number of
operands is reduced by a factor of 2/3 at each level. Thus,

$$k \cdot \left(\frac{2}{3}\right)^{l} \le 2,$$

where l is the number of levels required. Consequently,

$$\text{Number of levels} \approx \frac{\log(k/2)}{\log(3/2)}. \tag{5.41}$$

FIGURE 5.24 (a) A CSA tree for six operands. (b) An implementation of a
6-input bit-slice of the tree in (a).

Equation (5.41) provides only an estimate of the number of levels, since at
each level the number of operands must be an integer. Thus, if $N_i$ is the number
of operands at level i, then the number of operands at the level (i + 1) above can
be at most $\lfloor N_i \cdot 3/2 \rfloor$ (where the floor $\lfloor x \rfloor$ of a number x is the largest integer
that is smaller than or equal to x). The number of operands at the bottom
level (i.e., level 0) is 2, so that the maximum number of operands at level 1 is
3 and the maximum number of operands at level 2 is $\lfloor 9/2 \rfloor = 4$. The resulting
sequence of numbers is 2, 3, 4, 6, 9, 13, 19, 28, etc. Starting with five operands, we
still need three levels, as we do for six operands. The entries in Table 5.1 were
generated using similar arguments. This table shows the exact number of levels
required for up to 63 operands.

Example 5.6
For k = 12, five levels are needed, resulting in a delay of $5 \cdot T_{CSA}$ instead
of $10 \cdot T_{CSA}$, which is the delay for a linear cascade of 10 CSAs. □

Examining Table 5.1, we may note that the most economical implementation
(in terms of number of levels) is achieved when the number of operands
is an element of the series 3, 4, 6, 9, 13, 19, 28, .... Thus, for a given number of
operands, say k, which is not an element of this series, we need to use only
enough CSAs to reduce k to the closest (and smaller than k) element in the
above series. For example, for k = 27, we may use 8 CSAs (with 24 inputs)
rather than 9 CSAs in the top level, so that the number of operands in the next
level will be $8 \cdot 2 + 3 = 19$, which is an element of the series. The remaining part
of the tree will have its operands follow the series.
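
The level-counting argument can be checked with a short Python sketch. The function names `max_operands`, `csa_tree_levels`, and `estimated_levels` are illustrative choices, not from the text; the recurrence is the floor rule stated above, starting from two operands at level 0.

```python
import math


def max_operands(levels):
    """Largest number of operands that a CSA tree with `levels` levels reduces to 2."""
    n = 2                          # level 0 holds the two final results
    for _ in range(levels):
        n = (n * 3) // 2           # N_{i+1} = floor(N_i * 3 / 2)
    return n


def csa_tree_levels(k):
    """Exact number of CSA levels needed for k >= 3 operands."""
    levels = 0
    while max_operands(levels) < k:
        levels += 1
    return levels


def estimated_levels(k):
    """Real-valued estimate from Equation (5.41)."""
    return math.log(k / 2) / math.log(3 / 2)


series = [max_operands(i) for i in range(10)]
levels_for_12 = csa_tree_levels(12)
```

For k = 12 the estimate of Equation (5.41) is about 4.4, while the exact count, which respects the integer constraint at each level, is 5, as in Example 5.6.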
Number of operands      Number of levels
        3                      1
        4                      2
    5 ≤ k ≤ 6                  3
    7 ≤ k ≤ 9                  4
   10 ≤ k ≤ 13                 5
   14 ≤ k ≤ 19                 6
   20 ≤ k ≤ 28                 7
   29 ≤ k ≤ 42                 8
   43 ≤ k ≤ 63                 9

TABLE 5.1 The number of levels in a CSA tree for k operands.

The idea of using a (3,2) counter to form multi-operand adders can be
extended to a (7,3) counter, whose three outputs represent the number of 1's in
its seven inputs. Another example is the (15,4) counter or, in general, any (k, m)
counter where k and m satisfy

$$2^m - 1 \ge k \quad\text{or}\quad m \ge \lceil \log_2(k + 1) \rceil.$$

FIGURE 5.25 A (7,3) counter using (3,2) counters.
A (7,3) counter, for example, can be implemented using (3,2) counters as shown
in Figure 5.25, where intermediate results are added according to their weight.
However, this implementation requires four (3,2) counters arranged in three levels
and therefore provides no speed-up compared to an implementation based
on (3,2) counters. A (7,3) counter can also be implemented directly as a multilevel
circuit that may have a smaller overall delay, depending on the particular
technology employed [21]. Since the number of interconnections that a circuit
requires greatly affects its silicon area, a (7,3) counter is preferable to a (3,2)
counter. A (7,3) counter has ten connections and removes four bits, while a (3,2)
counter has five connections and removes only one bit. Another implementation of the
(7,3) counter is through a ROM of size $2^7 \times 3 = 128 \times 3$ bits. The access
time of this ROM is unlikely to be smaller than the delay associated with the
implementation in Figure 5.25. However, a speed-up may be achieved if a ROM
implementation is used for a (k, m) counter with higher values of k and m.

When several (7,3) counters (in parallel) are used to add seven operands,
we obtain three results, and a second level of (3,2) counters is needed to reduce
these to two results (sum and carry) to be added by a CPA. A similar situation
arises when (15,4) or more complex counters are used, generating more than two
results and consequently requiring a second level of counters. In some cases,
the additional level of counters can be combined with the first level of counters,
resulting in a more convenient implementation.

In what follows, we show how the (7,3) counter can be combined with
a (3,2) counter. We call the combined counter a (7;2) compressor. In general,
a (k; m) compressor is a variant of a counter with k primary inputs, all of the
same weight, say $2^i$, and m primary outputs of weights $2^i, 2^{i+1}, \ldots, 2^{i+m-1}$ [9].
In addition, the compressor has several incoming carries, all of weight $2^i$, from
previous compressors, and several outgoing carries of weights $2^{i+1}$ and up. The
6-input bit-slice shown in Figure 5.24(b) is a trivial example of a (6;2) compressor
where all outgoing carries have the same weight $2^{i+1}$ and the number of these
carries equals the number of incoming carries and is also equal, in general, to
k - 3.

A straightforward implementation of a (7;2) compressor is shown in Figure
5.26, where the bottom right (3,2) counter is the additional (3,2) counter, while
the remaining four (3,2) counters constitute the ordinary (7,3) counter that is
depicted in Figure 5.25. A (7;2) compressor in column i has seven primary inputs
of weight $2^i$ and two carry inputs from columns (i - 1) and (i - 2). It generates
two primary outputs, denoted by $S_{2^i}$ and $S_{2^{i+1}}$, reflecting their weights, and
two outgoing carries $C_{2^{i+1}}$ and $C_{2^{i+2}}$, to columns (i + 1) and (i + 2), respectively.
Note that the input carries to the (7;2) compressor in Figure 5.26 do not
participate in the generation of the two output carries, in order to avoid a slow
carry-propagation chain. Also notice that this compressor is not a (9,4) counter,
since it has two outputs ($S_{2^{i+1}}$ and $C_{2^{i+1}}$) with the same weight. The implementation
depicted in Figure 5.26 does not offer any speed-up. A different, possibly
multilevel logic, implementation may yield a smaller overall delay as long as the
generation of the outgoing carries remains independent of the incoming carries.
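
As a small check of the claim that four (3,2) counters in three levels form a (7,3) counter, the following Python sketch wires four full adders together and verifies the result exhaustively. The wiring shown is one arrangement consistent with the description (two counters in the first level, one in each of the next two); it is an assumption and may differ in detail from Figure 5.25. The names `full_adder` and `counter_7_3` are illustrative.

```python
from itertools import product


def full_adder(x, y, z):
    """A (3,2) counter: returns (sum, carry) with x + y + z = 2*carry + sum."""
    total = x + y + z
    return total % 2, total // 2


def counter_7_3(bits):
    """A (7,3) counter built from four (3,2) counters in three levels."""
    x1, x2, x3, x4, x5, x6, x7 = bits
    s_a, c_a = full_adder(x1, x2, x3)       # level 1
    s_b, c_b = full_adder(x4, x5, x6)       # level 1
    s0, c_c = full_adder(s_a, s_b, x7)      # level 2: bits of weight 1
    s1, s2 = full_adder(c_a, c_b, c_c)      # level 3: bits of weight 2
    return s2, s1, s0


# Exhaustive check over all 128 input combinations.
all_correct = all(
    4 * s2 + 2 * s1 + s0 == sum(bits)
    for bits in product((0, 1), repeat=7)
    for (s2, s1, s0) in [counter_7_3(bits)]
)
```

Only bits of equal weight ever meet in the same full adder, which is the same rule the text states for interconnecting partial sum and carry bits.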
FIGURE 5.26 A (7;2) compressor with two incoming carries and two outgoing
carries, in bit position i.

All the previously described counters are single-column counters. For
multi-operand addition we can generalize these single-column counters into multiple-column
counters. We define a generalized parallel counter as a counter that adds
l input columns and produces an m-bit output [31]. The notation we use for
such a counter is

$$(k_{l-1}, k_{l-2}, \ldots, k_0, m),$$

where $k_i$ is the number of input bits in the i-th column with weight $2^i$. Clearly,
a (k, m) counter is a special case of this generalized counter. The number of
outputs m must satisfy

$$2^m - 1 \ge \sum_{i=0}^{l-1} k_i 2^i. \tag{5.42}$$

If all l columns have the same height k (i.e., $k_0 = k_1 = \cdots = k_{l-1} = k$), then
the inequality that has to be satisfied is

$$2^m - 1 \ge k \cdot (2^l - 1). \tag{5.43}$$

A simple example of these counters is the (5,5,4) counter shown in Figure
5.27. For this counter, k = 5, l = 2, and m = 4, and inequality (5.43) turns
into an equality, implying that all 16 combinations of the output bits are useful.
(5,5,4) counters can be used to reduce five operands (of any length) to two results
that can then be added with a CPA. The length of the operands will determine the
number of (5,5,4) counters in parallel.

FIGURE 5.27 A (5,5,4) counter. The dots represent input or output bits.

A reasonable way of implementing generalized counters is by using ROMs.
For example, the (5,5,4) counter shown in Figure 5.27 can be realized with a
$2^{5+5} \times 4$ ROM (i.e., 1024 x 4).

The (5,5,4) counters conveniently reduce the input operands to two intermediate
results, requiring only one CPA to produce the final sum. In the
general case, a string of $(k_{l-1} = k, \ldots, k_0 = k, m)$ counters may generate more
than two intermediate results, requiring additional reduction before a CPA can
be used. To find out the number of intermediate results generated by a set
of (k, k, ..., k, m) counters, consider the following. A set of (k, k, ..., k, m)
counters, with l columns each, produces m-bit outputs at intervals of l bits.
Any column has at most $\lceil m/l \rceil$ output bits. Thus, k operands can be reduced
to $s = \lceil m/l \rceil$ operands. If s = 2, a single CPA can generate the final sum.
Otherwise, further reduction, from s to 2, is needed.

Example 5.7
If the number of bits per column in a two-column counter (k, k, m) is
increased beyond 5, then $m \ge 5$ and, as a result, $s = \lceil m/2 \rceil > 2$. For
example, if k = 7, the inequality that must be satisfied is $2^m - 1 \ge
7 \cdot 3 = 21$, and therefore m = 5. A set of (7,7,5) counters will generate
s = 3 operands, and consequently another set of (3,2) counters is needed
in order to reduce the number of operands to 2. □
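
The two counting rules for generalized counters, inequality (5.42) and $s = \lceil m/l \rceil$, can be evaluated with a short Python sketch. The helper names `min_outputs` and `intermediate_results` are hypothetical; the column heights are passed with $k_0$ first, which is a convention chosen here for convenience.

```python
def min_outputs(columns):
    """Smallest m satisfying 2**m - 1 >= sum(k_i * 2**i), with columns[i] = k_i."""
    max_value = sum(k << i for i, k in enumerate(columns))
    return max_value.bit_length()      # smallest m with 2**m > max_value


def intermediate_results(k, l):
    """For (k, ..., k, m) counters with l columns, return (m, s = ceil(m / l))."""
    m = min_outputs([k] * l)
    s = -(-m // l)                     # integer ceiling division
    return m, s


m_554, s_554 = intermediate_results(5, 2)    # the (5,5,4) counter
m_775, s_775 = intermediate_results(7, 2)    # the counter of Example 5.7
```

For k = 5 and l = 2 this gives m = 4 and s = 2, so one CPA suffices; for k = 7 it gives m = 5 and s = 3, so an extra level of (3,2) counters is needed, in agreement with Example 5.7.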
The hardware complexity of a carry-save adder for a large number of
operands might be prohibitive, independent of the particular type of parallel
counters employed. One way to reduce the hardware complexity is to design
a smaller carry-save tree and use it iteratively. The n operands are divided
into $\lceil n/j \rceil$ groups of j operands each, and a tree for j + 2 operands with two
feedback paths and a CPA is designed, as shown in Figure 5.28. The two feedback
paths make it necessary to complete the first pass through the CSA tree before
the second set of j operands is applied. This slows down the execution of the
multiple-operand addition, since pipelining is not possible. In the next section
we discuss pipelining in general and describe ways to modify the tree structure
in Figure 5.28 to support pipelining.

FIGURE 5.28 A CSA tree with two feedback paths and j new operands.

5.12 PIPELINING OF ARITHMETIC OPERATIONS

Pipelining is a very well known technique for accelerating the execution of successive
identical operations. Instead of designing a circuit capable of executing
a single operation on one set of operands at a time, we design one that is partitioned
into several subcircuits that can operate independently on consecutive
sets of operands. This way the executions of several successive operations overlap,
and the rate at which results are produced is considerably higher than that
of a nonpipelined design.

To allow pipelining, the algorithm is divided into several steps, and a
suitable circuit is designed for each of these steps. The separate circuits, which
are called pipeline stages, must be allowed to operate independently on different
sets of operands. To achieve this goal, storage elements (latches) must be added
between adjacent stages, so that when a stage works on one set of operands, the
preceding stage can work on the next set of operands.

An example of a pipeline for addition (or any other two-operand operation)
consisting of three stages is depicted in Figure 5.29. Here, the addition of the
two operands X and Y is performed in three steps. The latch between stage
1 and stage 2 stores the intermediate results of step 1, which are then used by
stage 2 to execute step 2 of the algorithm, while stage 1 can start the execution
of step 1 on the next set of operands X and Y.

FIGURE 5.29 A three-stage pipeline.

A common way to illustrate the way a pipeline operates is through a
timing diagram like the one in Figure 5.30, which shows the exact timing of four
successive additions with operands $X_1$ & $Y_1$, $X_2$ & $Y_2$, $X_3$ & $Y_3$, and $X_4$ & $Y_4$,
producing the results $Z_1$, $Z_2$, $Z_3$, and $Z_4$, respectively.

FIGURE 5.30 A timing diagram of a three-stage pipeline.

Let $\tau_i$ denote the execution time of stage i and let $\tau_l$ denote the time
needed to store new data into a latch. In general, the delays associated with
the different pipeline stages are not identical, and faster stages must wait until
the slowest stage completes its task before they can all switch to the next task.
Therefore, the time interval between two successive final results being produced
by the pipeline is

$$\tau = \max_{1 \le i \le k} \{\tau_i\} + \tau_l, \tag{5.44}$$

where k is the number of stages in the general case. The time interval $\tau$ is also
called the pipeline period, and $1/\tau$ is called the pipeline rate or bandwidth.
The clock signal synchronizing the pipeline's operation must be set so that
the clock period is equal to or larger than $\tau$. Figure 5.30 shows the case where
the clock period equals $\tau$. Here, after a latency of $3\tau$, new results are produced
at the rate of $1/\tau$.

An important design decision is the partitioning of the given algorithm
into steps that will be executed by the separate stages of the pipeline. These
steps should preferably be defined so that they have similar execution times,
since the pipeline rate is determined by the execution time of the slowest step.
The number of steps must then be determined. As this number increases, the
pipeline period decreases but the number of latches goes up (increasing the cost
of implementation), and so does the latency of the pipeline. The latency is the
time elapsed until the first result is produced. This is especially important when
only a single pass through the pipeline is required (e.g., addition of only one pair
of operands). Thus, there is a tradeoff between the latency and implementation
cost on one hand and the pipeline rate on the other hand. The extra delay due
to the latches, $\tau_l$, can be lowered by using special circuits like the Earle latch [11].
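
The timing relations above can be illustrated with a small Python sketch. The function names and the stage delays (4, 5, and 3 time units, with a latch delay of 1) are made-up values for illustration only; the clock period is assumed to equal the pipeline period, as in Figure 5.30.

```python
def pipeline_period(stage_times, latch_time):
    """Equation (5.44): the slowest stage plus the latch delay."""
    return max(stage_times) + latch_time


def pipeline_latency(stage_times, latch_time):
    """Time until the first result appears: one period per stage."""
    return len(stage_times) * pipeline_period(stage_times, latch_time)


def time_for_operations(stage_times, latch_time, num_ops):
    """Time until the last of num_ops successive results is produced."""
    tau = pipeline_period(stage_times, latch_time)
    return (len(stage_times) + num_ops - 1) * tau


stages = [4, 5, 3]          # hypothetical stage delays
latch = 1                   # hypothetical latch delay
tau = pipeline_period(stages, latch)
latency = pipeline_latency(stages, latch)
four_ops = time_for_operations(stages, latch, 4)
```

With these values the period is 6, the first result appears after 18 time units, and four results are complete after 36, i.e., after six periods, which corresponds to the three-stage, four-operation pattern of Figure 5.30. A single unpipelined addition would take only 12 time units, which illustrates the latency side of the tradeoff.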
5.12.1 Pipelining of Adders

The relative simplicity of two-operand adders usually does not justify their implementation
as pipelines. However, in special-purpose designs, when many successive
additions are needed, such implementations are justifiable. The conditional-sum
adder can be easily implemented as a pipeline. One way of doing this is
to have $\log_2 n$ stages corresponding to the $\log_2 n$ steps in the conditional-sum
algorithm. This allows us to overlap the execution of up to $\log_2 n$ additions.
However, the required number of latches may be excessive. Two (or even more)
steps can be combined to form a single stage in the pipeline, reducing the latches'
overhead and the latency.

The carry-look-ahead adder, described in Section 5.2, cannot be pipelined,
since some carry signals must propagate backward (see, e.g., Figure 5.2). However,
different designs of the carry-look-ahead adder, following the approach
described in Section 5.5, can be pipelined. Here, the final carries and the carry-propagate
signals (implemented as $P_i = x_i \oplus y_i$ rather than $P_i = x_i + y_i$) can be
used to calculate the sum bits, eliminating the need for feedback connections.

Clearly, pipelining is more beneficial in the case of multiple-operand adders,
like the carry-save adders described in Section 5.11. Modifying the implementation
of CSA trees (see, for example, Figure 5.24) to form a pipeline is straightforward
and requires only the addition of latches. These can be added at each
level of the tree if maximum bandwidth is desired, or two (or more) levels of the
tree can be combined to form a single stage of the pipeline, reducing the overall
number of latches and the pipeline latency.

If the hardware complexity of the CSA tree for a large number of operands
is prohibitive, a partial tree like the one shown in Figure 5.28 can be designed.
FIGURE 5.31 A CSA tree allowing overlap between iterations.
However, as pointed out earlier, the two feedback connections prevent pipelining.
This can be rectified by modifying these feedback connections. Instead of
connecting the two intermediate results of the CSA tree to its inputs, we can
connect them to the bottom level of the CSA tree structure, as shown in Figure
5.31. The modified structure now consists of a smaller tree with j inputs at the
top, two separate CSAs, and a set of latches at the bottom. The two separate
CSAs and latches assimilate the two intermediate results and form a pipeline
stage. This way, the top CSA tree for j operands can be pipelined too, and the
overall time needed to add all n operands is reduced considerably.
5.13 EXERCISES

5.1. Compare the two alternatives for the Boolean expression of the propagated carry,
namely, $P_i = x_i \oplus y_i$ and $P_i = x_i + y_i$. What might be the benefits and drawbacks
of each expression?

5.2. (a) A carry-completion adder [10] detects the completion of the carry propagation
and generates a signal CC (carry complete), indicating that the sum bits are
steady and can be used. The basic unit in this adder is a modified full adder
with the same inputs $x_i$ and $y_i$, and output $s_i$. The carries are different, though:
instead of a single incoming carry line $c_i$, there are two incoming carry lines (and,
similarly, two outgoing carry lines) denoted by $c_i^0$ and $c_i^1$. $c_i^0 = 1$ if it is already
known that the incoming carry is zero, and $c_i^1 = 1$ if it is already known that
the incoming carry is one. The two outgoing carries, $c_{i+1}^0$ and $c_{i+1}^1$, are defined
similarly.
Note that an adder stage with inputs $x_i = y_i = 0$ can never generate a carry-out
and therefore can produce $c_{i+1}^0 = 1$ immediately, without waiting for any
carry propagation. Similarly, an adder stage with inputs $x_i = y_i = 1$ will always
generate a carry-out and can therefore produce $c_{i+1}^1 = 1$ immediately, without
waiting for any carry propagation. An adder stage with inputs $x_i y_i = 01$ or
$x_i y_i = 10$ will initially set both carry-out signals to $c_{i+1}^0 = c_{i+1}^1 = 0$ and will
wait until one (and exactly one) of its carry-in signals becomes 1. Only then
either $c_{i+1}^0$ or $c_{i+1}^1$ is set to 1. Write Boolean equations for the three outputs $s_i$,
$c_{i+1}^0$, and $c_{i+1}^1$ as functions of the four inputs $x_i$, $y_i$, $c_i^0$, and $c_i^1$. Define a signal
$CC_i = c_{i+1}^0 + c_{i+1}^1$, explain its meaning, and show the Boolean equation for the
carry-completion signal CC.
(b) It has been shown that the average length of the longest carry-propagation
chain when two n-bit operands are added is approximately $\log_2(5n/4)$. Estimate
the average addition time and compare it to the addition time of a ripple-carry
adder. Take n = 64 as a numerical example. What is the drawback of this
carry-completion adder?

5.3. Verify that the organization with 10 groups of size $k_1, k_2, \ldots, k_{10}$ = 1, 2, 3, 4, 5, 6,
5, 3, 2, 1 for a 32-bit carry-skip adder with a single level of carry-skip and $t_s + t_b = t_r$
satisfies $T_{carry} \le 9 \cdot t_r$ [13]. You may either analyze all important cases or
write a program that enumerates all cases.

5.4. Estimate the carry-propagation time in a two-level carry-skip adder for 32 bits
that are divided into 14 groups of size 1,2,(1,2,3),(2,3,3),(3,3,2),(3,2),2, where
every set of parentheses signifies a single group in the second level. The first five
groups, of size 1,2,(1,2,3), are shown in Figure 5.32 [32]. Assume that $t_s + t_b = t_r$
and compare your estimate to the carry-propagation delay of the single-level
carry-skip adder in Exercise 5.3. Is the second level of carry-skip justified?

FIGURE 5.32 A two-level carry-skip adder.

5.5. Estimate the addition time of an 80-bit carry-look-ahead adder constructed of
the ICs 74181 and 74182 for various adder configurations, including a ripple-carry
between 74181 ICs and the maximum number of levels of 74182 ICs. Draw
a block diagram for each configuration. Base your estimation on typical delays
from input signals to output signals, which can be found in any IC data book.

5.6. Using a table similar to the one in Figure 5.4, show the various steps of the
conditional-sum addition of the following two numbers, each consisting of 24
bits:
x = 000101101100101101001111
y = 001001110000011110110111

5.7. (a) Show an implementation of a 16-bit conditional-sum adder using full adders
and data selectors. Indicate how many ICs are needed. Repeat the above for
a 24-bit adder.
(b) Estimate the execution time of the conditional-sum adders in (a) and compare
it to the execution time of carry-look-ahead adders. Base your estimation
on typical delays from input signals to output signals, which can be found in any
IC data book.

5.8. (a) Design a 32-bit conditional-sum adder that starts off with groups of size 4.
The two possible sets of outputs for each group of size 4 are generated using 74181
ICs with internal carry-look-ahead. Use appropriate data selectors in addition
to the 74181 ICs.
(b) Estimate the execution time of the conditional-sum adder in (a) and compare
it to the execution time of a carry-look-ahead adder.

5.9. Add to the tree structure in Figure 5.5 the minimum necessary circuitry to generate
the carries CIII, Cl4, . . . ca.

5.10. Prove the expression for estimating the number of levels in a Wallace tree for k
operands.

5.11. Design a 4-bit counter capable of reducing the number of operands from 4 to 2
with a carry whose propagation is limited to one position.

5.12. A set of n/2 (5,5,4) counters can be used to reduce 5 operands (n-bit each) to 2
operands.
(a) What kind of counter is needed to reduce 7 operands to 2?
(b) Repeat (a) for 9 operands.
(c) Repeat (a) for k operands.

5.13. One level of (5,5,4) counters can be used to reduce 5 operands to 2 operands.
What is the maximum number of operands that can be reduced to 2 when two
levels of (5,5,4) counters are used?

5.14. Show an implementation of a (5,5,4) counter using (3,2) counters.

5.15. Design a (7;2) compressor with 7 inputs and 2 outputs (as in Figure 5.26) that has
4 input carries from position (i - 1) and generates 4 output carries to position
(i + 1). Use (3,2) counters and make sure that no long carry-propagation chains
are generated. Compare this design to the one shown in Figure 5.26, considering
speed of operation and other factors.

5.16. Estimate the time needed to add n operands using the CSA tree for j new
operands shown in Figure 5.28, and compare it to the time needed using the CSA
tree shown in Figure 5.31 with overlap between successive iterations. Assume
that n/j is an integer. Use n = 24 and j = 6 as a numerical example.

5.17. A CSA tree for 6 operands of length 32 bits each includes 4 CSA units. What
should the length of each of these CSA units be?

5.18. Prove Equation (5.36) for bit position m in the reduced-complexity carry-select
adder.
5..19 .; bow to ....Ulv<-a te tbe Qt ID tbe \I cart"\ moduIP sbow11 in
F" 5..1..
5 20. ()nn; a diagram Iik.e tbe in F agure 5.1- for tbeS-bit Manchester
Carry C'hain U!II!'d in figure 5.20.. f« each bit pOI!itMlo m (m = O. I. -- - . 1) t
are thJ-fto S90; P.. G. aDd 1\.. ExplaiD wh is 1\. whiJe it is DOt
rt'qUired in F 5.1..
5.14 REFERENCES
[1] O. J. BEDRIJ, "Carry-select adder," IRE Trans. on Electron. Computers, EC-11
(June 1962), 340-346.
[2] R. P. BRENT and H. T. KUNG, "A regular layout for parallel adders," IEEE
Trans. on Computers, C-31 (March 1982), 260-264.
[3] P. K. CHAN and M. D. F. SCHLAG, "Analysis and design of CMOS Manchester
adders with variable carry-skip," IEEE Trans. on Computers, 39 (August 1990),
983-992.
[4] P. K. CHAN, M. D. F. SCHLAG, C. D. THOMBORSON and V. G. OKLOBDZIJA,
"Delay optimization of carry-skip adders and block carry-look-ahead adders
using multidimensional dynamic programming," IEEE Trans. on Computers, 41
(August 1992), 920-930.
[5] L. DADDA, "Some schemes for parallel multipliers," Alta Frequenza, 34 (March
1965), 346-356.
[6] D. W. DOBBERPUHL et al., "A 200-MHz 64-b dual-issue CMOS microprocessor,"
IEEE J. of Solid-State Circuits, 27 (Nov. 1992), 1555-1567.
[7] R. W. DORAN, "Variants of an improved carry look-ahead adder," IEEE Trans.
on Computers, 37 (Sept. 1988), 1110-1113.
[8] M. J. FLYNN and S. F. OBERMAN, Advanced computer arithmetic design, Wiley,
New York, 2001.
[9] D. D. GAJSKI, "Parallel compressors," IEEE Trans. on Computers, C-29 (May
1980), 393-398.
[10] B. GILCHRIST, J. POMERENE and S. Y. WONG, "Fast carry logic for digital
computers," IRE Trans. on Electron. Computers, EC-4 (Dec. 1955), 133-136.
[11] T. G. HALLIN and M. J. FLYNN, "Pipelining of arithmetic functions," IEEE
Trans. on Computers, C-21 (August 1972).
[12] T. HAN and D. A. CARLSON, "Fast area-efficient VLSI adders," Proc. 8th Symp.
on Computer Arithmetic, 1987, 49-56.
[13] V. KANTABUTRA, "Designing optimum carry-skip adders," Proc. 10th Symp. on
Computer Arithmetic, 1991, 146-153.
[14] T. KILBURN, D. B. G. EDWARDS and D. ASPINALL, "A parallel arithmetic unit
using a saturated transistor fast-carry circuit," Proc. of IEE, Pt. B, 107 (Nov.
1960), 573-584.
[15] S. KNOWLES, "A family of adders," Proc. 14th Symp. on Computer Arithmetic,
1999, 30-34.
[16] P. M. KOGGE and H. S. STONE, "A parallel algorithm for the efficient solution
of a general class of recurrence equations," IEEE Trans. on Computers, C-22
(August 1973), 786-793.
[17] R. E. LADNER and M. J. FISCHER, "Parallel prefix computation," J. of ACM, 27
(October 1980), 831-838.
[18] M. LEHMAN and N. BURLA, "Skip techniques for high-speed carry propagation in
binary arithmetic units," IRE Trans. on Electron. Computers, EC-10 (Dec. 1961),
691-698.
[19] H. LING, "High speed binary adder," IBM J. Res. Develop., 25 (May 1981),
156-166.
[20] T. LYNCH and E. E. SWARTZLANDER, JR., "A spanning tree carry lookahead
adder," IEEE Trans. on Computers, 41 (August 1992), 931-939.
[21] M. MEHTA, V. PARMAR and E. SWARTZLANDER, "High-speed multiplier design
using multi-input counter and compressor circuits," Proc. 10th Symp. on
Computer Arithmetic, 1991.
[22] T. F. NGAI, M. J. IRWIN and S. RAWAT, "Regular area-time efficient
carry-lookahead adders," J. of Parallel and Distributed Computing, 3 (1986), 92-105.
[23] V. G. OKLOBDZIJA, "Design and analysis of fast carry-propagate adder under
non-equal input signal arrival profile," Proc. Asilomar Conference, (1994),
1398-1401.
[24] V. G. OKLOBDZIJA and E. R. BARNES, "On implementing addition in VLSI
technology," J. of Parallel and Distributed Computing, 5 (1988), 716-728.
[25] S. ONG and D. E. ATKINS, "A comparison of ALU structures for VLSI
technology," Proc. 6th Symp. on Computer Arithmetic, (June 1983), 10-16.
[26] D. S. PHATAK and I. KOREN, "Intermediate variable encodings that enable
multiplexor-based implementations of two operand addition," Proc. 14th IEEE
Symp. on Computer Arithmetic, 1999, 22-29.
[27] D. S. PHATAK, T. GOFF and I. KOREN, "Constant-time addition and simultaneous
format conversion based on redundant binary representations," IEEE Trans. on
Computers, 50 (2001).
[28] S. SINGH and R. WAXMAN, "Multiple operand addition and multiplication," IEEE
Trans. on Computers, C-22 (1973), 113-120.
[29] J. SKLANSKY, "Conditional-sum addition logic," IRE Trans., EC-9 (June 1960),
226-231.
[30] P. S. SPIRA, "Computation times of arithmetic and Boolean functions in (d, r)
circuits," IEEE Trans. on Computers, C-22 (June 1973).
[31] W. J. STENZEL, W. J. KUBITZ and G. H. GARCIA, "A compact high-speed parallel
multiplication scheme," IEEE Trans. on Computers, C-26 (Oct. 1977), 948-957.
[32] S. TURRINI, "Optimal group distribution in carry-skip adders," Proc. 9th Symp.
on Computer Arithmetic, 1989, 96-103.
[33] A. TYAGI, "A reduced-area scheme for carry-select adders," IEEE Trans. on
Computers, 42 (October 1993), 1163-1170.
[34] C. S. WALLACE, "A suggestion for a fast multiplier," IEEE Trans. on Electron.
Computers, EC-13 (February 1964), 14-17.
[35] S. WINOGRAD, "On the time required to perform addition," J. of the ACM, 12
(1965), 277-285.

6 HIGH-SPEED MULTIPLICATION
Multiplication involves two basic operations: the generation of partial products
and their accumulation. Consequently, there are two ways to speed up
multiplication: reduce the number of partial products or accelerate their
accumulation. Clearly, a smaller number of partial products also reduces the
complexity and, as a result, reduces the time needed to accumulate the partial
products.

High-speed multipliers can be classified into three general types. The first
generates all partial products in parallel, and then uses a fast multi-operand
adder for their accumulation. This is known as a parallel multiplier. The
second, known as a high-speed sequential multiplier, generates the partial
products sequentially and adds each newly generated product to the previously
accumulated partial product. The third is made up of an array of identical cells
that generate new partial products and accumulate them simultaneously. Thus,
there are no separate circuits for partial product generation and for their
accumulation. This is known as an array multiplier, and it tends to have a
reduced execution time, at the expense of increased hardware complexity.
6.1 REDUCING THE NUMBER OF PARTIAL PRODUCTS
To reduce the number of partial products (and hence reduce the amount of
hardware involved and the execution time) we may examine two or more bits
of the multiplier at a time. However, this scheme requires the generation of
the multiples $A$, $2A$, and $3A$, where $A$ is the multiplicand, as in Chapter 3.
This reduces the number of partial products to $n/2$, but each step becomes
more complex. Various algorithms for reducing the number of partial products
without increasing the complexity of generating each partial product have been
proposed. One of the first such algorithms was Booth's algorithm [3].

Booth's algorithm, as well as many other algorithms, is based on the fact
that fewer partial products have to be generated for groups of consecutive zeros
and ones. For a group of consecutive zeros in the multiplier there is no need to
generate any new partial product. We only need to shift the previously
accumulated partial product one bit position to the right for every 0 in the
multiplier. For a group of, say $m$, consecutive 1's in the multiplier,
$\cdots 0\{11 \cdots 11\}0 \cdots$, fewer than $m$ new partial products can be
generated. The above sequence equals the difference between the following two
bit sequences, each having a single nonzero bit:

$$\cdots 0\{11 \cdots 11\}0 \cdots = \cdots 1\{00 \cdots 00\}0 \cdots - \cdots 0\{00 \cdots 01\}0 \cdots$$

Using SD (signed-digit) notation, discussed in Chapter 2, the above can be
written as

$$\cdots 1\{00 \cdots 0\bar{1}\}0 \cdots$$

For example, $\cdots 0\{1111\}0 \cdots = \cdots 1\{0000\}0 \cdots - \cdots 0\{0001\}0 \cdots
= \cdots 1\{000\bar{1}\}0 \cdots$ or, in decimal notation, $15 = 16 - 1$. Thus, instead of
generating all $m$ partial products, we may generate only two partial products,
with the second being complemented. In other words, the first partial product is
added, while the second is subtracted. Note that the required number of
single-bit shift-right operations is still $m$.

We call this operation recoding the multiplier in SD code. The simplest
recoding scheme is the original Booth's algorithm, summarized in Table 6.1. In
this algorithm, the current bit $x_i$ and the previous bit $x_{i-1}$ of the multiplier
$x_{n-1} x_{n-2} \cdots x_1 x_0$ are examined in order to generate the $i$th bit $y_i$ of the
recoded multiplier $y_{n-1} y_{n-2} \cdots y_1 y_0$. The previous bit, $x_{i-1}$, serves here only
as a reference bit. At its turn, $x_{i-1}$ will be recoded to yield $y_{i-1}$, with $x_{i-2}$
serving as a reference bit. For $i = 0$, we define the reference bit $x_{-1}$ to be zero.
A simple way of computing the recoded bit is through $y_i = x_{i-1} - x_i$.

The recoding of the multiplier bits need not be done in any predetermined
order (from the most significant bit to the least significant bit or vice versa) and
can even be done in parallel for all bit positions.

Example 6.1
The multiplier 0011110011(0) is recoded as $01000\bar{1}010\bar{1}$, requiring 4,
instead of 6, add/subtract operations. The zero in parentheses is the
reference bit $x_{-1}$ for $x_0$. □

        $x_{n-1}$   $x_{n-2}$   $y_{n-1}$
  (1)      1         0        $\bar{1}$
  (2)      1         1         0
TABLE 6.2 Recoding the sign bit in Booth's algorithm.
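The recoding rule $y_i = x_{i-1} - x_i$ can be checked directly in software. The following Python sketch is illustrative only; the function names `booth_recode` and `sd_value` are hypothetical and do not come from the text. It recodes a bit string written most significant bit first and then evaluates the resulting SD digits.

```python
def booth_recode(bits):
    """Radix-2 Booth recoding of a bit string (most significant bit first).

    Returns the SD digits y_i = x_{i-1} - x_i, with the reference bit
    x_{-1} taken as 0, in the same most-significant-first order.
    """
    x = [int(b) for b in bits]
    refs = x[1:] + [0]          # the bit to the right of each x_i; x_{-1} = 0
    return [ref - cur for cur, ref in zip(x, refs)]


def sd_value(digits):
    """Value of a signed-digit number given most significant digit first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


if __name__ == "__main__":
    y = booth_recode("0011110011")          # the multiplier of Example 6.1
    nonzero = sum(d != 0 for d in y)        # number of add/subtract operations
    print(y, nonzero, sd_value(y))
```

Because every digit depends only on two adjacent multiplier bits, the list comprehension could be evaluated in any order, which mirrors the remark that the recoding can be done in parallel for all bit positions.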
When the multiplier and multiplicand are represented in two's complement,
Booth's algorithm yields the correct product if the sign bit $x_{n-1}$ participates
in the process. For this sign bit, we need to decide whether to perform
an add or subtract operation. However, no shift operation (of the accumulated
partial product) is required, since this shift operation serves only as preparation
for the next step. Clearly, the correctness of the last statement has to be verified
only for negative values of $X$ (for which $x_{n-1} = 1$). Thus, there are two cases
that need to be examined and they are shown in Table 6.2. In both cases the
required product is, as we have seen in Chapter 3,

$$A \cdot X = A \cdot \tilde{X} - A \cdot x_{n-1} \cdot 2^{n-1} \quad \text{where} \quad \tilde{X} = \sum_{j=0}^{n-2} x_j 2^j. \qquad (6.1)$$

In case (1), $y_{n-1}$ calls for the subtraction of $A$, which is done after the partial
product has been shifted $(n-1)$ times. Hence, the necessary correction is made.
In case (2), without considering the sign bit we are scanning over a string of 1's
and we need to perform an addition for position $(n-1)$. When $x_{n-1}$, which
equals 1, is also considered, the required addition is not done. This is equivalent
to subtracting $A \cdot 2^{n-1}$, which is the necessary correction term.

  $x_i$   $x_{i-1}$   Operation            Comments                        $y_i$
   0      0      shift only           string of zeros                  0
   1      1      shift only           string of ones                   0
   1      0      subtract and shift   beginning of a string of ones    $\bar{1}$
   0      1      add and shift        end of a string of ones          1
TABLE 6.1 Booth's algorithm.

Example 6.2
The following sequential multiplication illustrates case (2) in Table 6.2:

  A                  1 0 1 1             -5
  X            x     1 1 0 1             -3
  Y                  0 $\bar{1}$ 1 $\bar{1}$             recoded multiplier
  Add -A             0 1 0 1
  Shift              0 0 1 0  1
  Add A        +     1 0 1 1
                     1 1 0 1  1
  Shift              1 1 1 0  1 1
  Add -A       +     0 1 0 1
                     0 0 1 1  1 1
  Shift              0 0 0 1  1 1 1

This multiplication starts from the least significant bit of the multiplier. If
started from the most significant bit, a longer adder/subtractor would be
needed to allow for carry propagation. Also, note that there is no need to
generate the recoded SD multiplier that would require two bits per digit, if
generated. Instead, the bits of the original multiplier can be scanned, and
appropriate control signals for the adder/subtractor can be generated. □
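Example 6.2 can be reproduced with ordinary integer arithmetic. The sketch below is a software model under stated assumptions, not a description of the hardware: instead of shifting the accumulated partial product to the right, it shifts the multiplicand to the left, which gives the same product. The name `booth_multiply` is hypothetical. As in the example, the multiplier bits are scanned directly and no recoded SD multiplier is stored.

```python
def booth_multiply(a, x, n):
    """Multiply a by the n-bit two's complement multiplier x using Table 6.1.

    Returns (product, number of add/subtract operations). The multiplier
    bits are scanned from the least significant end; x_{-1} is taken as 0.
    """
    pattern = x & ((1 << n) - 1)        # two's complement bit pattern of x
    product = 0
    prev = 0                            # reference bit x_{i-1}
    operations = 0
    for i in range(n):
        cur = (pattern >> i) & 1
        if (cur, prev) == (1, 0):       # beginning of a string of ones
            product -= a << i
            operations += 1
        elif (cur, prev) == (0, 1):     # end of a string of ones
            product += a << i
            operations += 1
        prev = cur                      # (0,0) and (1,1): shift only
    return product, operations


if __name__ == "__main__":
    print(booth_multiply(-5, -3, 4))    # operands of Example 6.2
```

The sign bit takes part in the loop like any other bit, so negative multipliers need no separate correction step in this model.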
Booth's algorithm can handle two's complement multipliers properly, and
consequently if unsigned numbers are to be multiplied, we must add a zero to
the left of the multiplier (i.e., $x_n = 0$) to ensure the correctness of the result.

There are two drawbacks to Booth's algorithm. The first is that the
number of add/subtract operations is variable, and so is the number of shift
operations between two consecutive add/subtract operations. These are very
inconvenient when designing a synchronous multiplier. Second, this algorithm
becomes inefficient when there are isolated 1's; for example, 001010101(0) is
recoded as $01\bar{1}1\bar{1}1\bar{1}1\bar{1}$, requiring eight, instead of four, operations.

The situation can be improved by examining three bits of $X$ at a time
rather than two [11]. The bits $x_i$ and $x_{i-1}$ are recoded into $y_i$ and $y_{i-1}$, while
$x_{i-2}$ serves as a reference bit. In a separate step, $x_{i-2}$ and $x_{i-3}$ are recoded
into $y_{i-2}$ and $y_{i-3}$, with $x_{i-4}$ serving as a reference bit. Thus, the groups of
three bits each overlap, with the rightmost being $x_1 x_0 (x_{-1})$, the next one being
$x_3 x_2 (x_1)$, and so on as shown below:

  $x_7 \; x_6$ | $x_5 \; x_4$ | $x_3 \; x_2$ | $x_1 \; x_0$ | $(x_{-1})$
  $y_7 \; y_6$ | $y_5 \; y_4$ | $y_3 \; y_2$ | $y_1 \; y_0$

The rules for this radix-4 modified Booth's algorithm are shown in Table
6.3 for all odd values of $i$, namely, $i = 1, 3, 5, \ldots$. Comparing this algorithm
to the original Booth's algorithm in Table 6.1, we see that the radix-4 algorithm
handles an isolated 1 or 0 more efficiently. If $x_{i-1}$ is an isolated 1, $y_{i-1} = 1$, so
only a single operation is needed. A similar simplification occurs if $x_{i-1}$ is an
isolated 0 in a string of 1's. In this case, $\cdots 10(1) \cdots$ is recoded into
$\cdots \bar{1}1 \cdots$, or, more conveniently, into $\cdots 0\bar{1} \cdots$, and again only a single
operation is performed. A simple way to find the required operation is to calculate

$$x_{i-1} + x_{i-2} - 2x_i$$

for odd values of $i$ and represent the result as a 2-bit binary number $y_i y_{i-1}$ in SD
notation. The verification of this statement is left as an exercise for the reader.

  $x_i$  $x_{i-1}$  $x_{i-2}$   $y_i$  $y_{i-1}$   operation   comments
   0     0      0       0    0       +0       string of zeros
   0     1      0       0    1       +A       a single 1
   1     0      0       $\bar{1}$    0       -2A      beginning of 1's
   1     1      0       0    $\bar{1}$       -A       beginning of 1's
   0     0      1       0    1       +A       end of 1's
   0     1      1       1    0       +2A      end of 1's
   1     0      1       0    $\bar{1}$       -A       a single 0
   1     1      1       0    0       +0       string of 1's
TABLE 6.3 A radix-4 modified Booth's algorithm.

For 01|01|01|01|(0) we now obtain 01|01|01|01, and the number of
operations remains four, which is the minimum. However, for 00|10|10|10|(0) we
get $01|0\bar{1}|0\bar{1}|\bar{1}0$, requiring four, instead of three, operations. Still, compared to
the radix-2 Booth's algorithm in Table 6.1, the number of patterns for which the
number of partial products is increased, rather than decreased, is smaller. Also,
the increase in the number of operations is smaller. In any case, we may design
an $n$-bit synchronous multiplier that generates exactly $n/2$ partial products.
Here, too, two's complement multipliers are handled correctly, but we have to
make sure that $n$ is even. Otherwise, an extension of the sign bit is required.
Also, we need to add a zero to the left of the multiplier if unsigned numbers are
multiplied and $n$ is odd. Two zeros must be added if $n$ is an even number.

Example 6.3

  A                        01 00 01        17
  X            x           11 01 11        -9
  Y                        $0\bar{1}$ 10 $0\bar{1}$        recoded multiplier
                           -A +2A -A       operation
  Add -A       +         1 10 11 11
  2-bit Shift            1 11 10 11 11
  Add 2A       +         0 10 00 10
                         0 01 11 01 11
  2-bit Shift            0 00 01 11 01 11
  Add -A       +         1 10 11 11
                         1 11 01 10 01 11  -153

There are $n/2 = 3$ steps in this multiplication and in each step two
multiplier bits are dealt with. As a result, all shift operations are two bit
position shifts. Also, note that an additional bit for storing the correct
sign is required to properly handle the addition of $2A$. □
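The expression $x_{i-1} + x_{i-2} - 2x_i$ gives a compact way to generate the operations of Table 6.3. The Python sketch below is only an illustration; the names `radix4_booth_digits` and `radix4_booth_multiply` are hypothetical, and the digits are returned least significant first as integers in the range -2 to 2 rather than as 2-bit SD pairs.

```python
def radix4_booth_digits(x, n):
    """Radix-4 modified Booth digits of the n-bit two's complement x.

    One digit per odd i, equal to x_{i-1} + x_{i-2} - 2*x_i, so each digit
    selects the multiple of A (0, +-A or +-2A) to be added in that step.
    Digits are returned least significant first; n must be even.
    """
    if n % 2:
        raise ValueError("n must be even; extend the sign bit first")
    pattern = x & ((1 << n) - 1)

    def bit(k):
        # x_{-1} is the reference bit for the rightmost group and equals 0.
        return (pattern >> k) & 1 if k >= 0 else 0

    return [bit(i - 1) + bit(i - 2) - 2 * bit(i) for i in range(1, n, 2)]


def radix4_booth_multiply(a, x, n):
    """Accumulate the n/2 partial products, each weighted by a power of 4."""
    product = 0
    for j, digit in enumerate(radix4_booth_digits(x, n)):
        product += (digit * a) << (2 * j)
    return product


if __name__ == "__main__":
    print(radix4_booth_digits(-9, 6))        # multiplier of Example 6.3
    print(radix4_booth_multiply(17, -9, 6))
```

For the multiplier of Example 6.3 the digits correspond, from right to left, to the operations -A, +2A and -A listed in the example.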
It is possible to extend the above recoding to three bits at a time, and
have overlapping groups of four bits each. The resulting algorithm is called
the radix-8 modified Booth's algorithm. In this algorithm, only $n/3$ partial
products are generated, but the multiple $3A$ is needed, adding complexity to the
basic step. For example, recoding 010(1) yields $y_i y_{i-1} y_{i-2} = 011$. A technique
for simplifying the generation and accumulation of the multiples $\pm 3A$ has been
presented in [7].
An interesting question now arises: what is the minimal number of
add/subtract operations required for a given multiplier? To answer this, we have
to find the minimal SD representation of the multiplier; i.e., the one with the
smallest number of nonzero digits, $\min \sum_{i=0}^{n-1} |y_i|$. It has been shown [20]
that a sequence $y_{n-1} y_{n-2} \cdots y_0$ is a minimal representation of an SD number if
$y_i \cdot y_{i-1} = 0$ for $1 \le i \le n-1$, given that the most significant bits can satisfy
$y_{n-1} \cdot y_{n-2} \ne 1$. To see the reason behind this condition consider, for example,
the representation of 7 with only three bits; here 111 is a minimal representation
although $y_i \cdot y_{i-1} \ne 0$. In practice, for any multiplier $X$, we can always add a 0
to its left to make sure that the above condition is satisfied.

The algorithm for obtaining the minimal representation of $X$ is described
next. The multiplier bits are examined from right to left, one bit at a time with
the next bit to the left (i.e., $x_{i+1}$) serving as a reference bit. To correctly handle
a single 0 within a string of 1's (and similarly, a single 1 within a string of 0's)
we need information on the kind of string that exists to the right of the current
position. For this purpose we use a "carry" bit (0 for 0's and 1 for 1's). This
algorithm is called canonical recoding and its rules are summarized in Table 6.4,
where $c_i$ is the previous "carry" and $c_{i+1}$ is the next "carry."

As before, the recoded multiplier (after canonical recoding) can be used
without any correction steps if the original multiplier is represented in two's
complement. Here, we have to extend the sign bit $x_{n-1}$, obtaining
$x_{n-1} x_{n-1} x_{n-2} \cdots x_0$. Canonical recoding can also be expanded to generate two or
more bits at a time. The multiples of $A$ needed in the case of two bits are $\pm A$
and $\pm 2A$.
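The canonical recoding rules can be condensed into two arithmetic relations, which is an observation made here for illustration and not a statement from the text: the next "carry" is 1 when at least two of $x_{i+1}$, $x_i$ and $c_i$ equal 1, and $y_i = x_i + c_i - 2c_{i+1}$. The Python sketch below applies these relations; the name `canonical_recode` is hypothetical and the digits are returned least significant first.

```python
def canonical_recode(x, n):
    """Canonical SD recoding of the n-bit two's complement number x.

    The sign bit is extended by one position, so n + 1 digits are produced,
    least significant digit first. Each digit is -1, 0 or 1.
    """
    pattern = x & ((1 << n) - 1)
    sign = pattern >> (n - 1)

    def bit(k):
        # Positions at or beyond n hold copies of the sign bit.
        return (pattern >> k) & 1 if k < n else sign

    digits = []
    carry = 0                                   # c_0 = 0
    for i in range(n + 1):
        next_carry = 1 if bit(i + 1) + bit(i) + carry >= 2 else 0
        digits.append(bit(i) + carry - 2 * next_carry)
        carry = next_carry
    return digits


if __name__ == "__main__":
    y = canonical_recode(0b01101110, 8)
    print(y[::-1])                              # most significant digit first
```

A quick way to gain confidence in such a sketch is to check, for every multiplier of a given length, that the digits reproduce the value of the multiplier and that no two adjacent digits are both nonzero.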
  $x_{i+1}$  $x_i$  $c_i$   $y_i$  $c_{i+1}$   Comments
    0     0    0     0     0      string of 0's
    0     1    0     1     0      a single 1
    1     0    0     0     0      string of 0's
    1     1    0     $\bar{1}$     1      beginning of 1's
    0     0    1     1     0      end of 1's
    0     1    1     0     1      string of 1's
    1     0    1     $\bar{1}$     1      a single 0
    1     1    1     0     1      string of 1's
TABLE 6.4 Canonical recoding.

The main disadvantage of canonical recoding is that the bits of the
multiplier are generated sequentially, while in the original and modified Booth's
algorithms we may generate the bits simultaneously (there is no "carry"
propagation). This implies that in the latter case, we can generate all partial
products in parallel, and then use a fast multi-operand adder.

Another drawback to canonical recoding is that, like Booth's algorithm, in
order to take full advantage of the minimum number of add/subtract operations
the number of these operations must be variable, as must be the length of the
shift operations. This is difficult to implement, and we would prefer to have
uniform shifts. This implies that the number of partial products will always be
$n/2$, although canonical recoding can lead to a much smaller number of operations.

The radix-4 modified Booth's algorithm in Table 6.3 is not the only way
of reducing the number of partial products, while still having uniform shifts of
two bits each. Instead of using the next bit to the right ($x_{i-2}$) as a reference
bit when examining $x_i x_{i-1}$, we can use the next bit to the left ($x_{i+1}$). The
rules for this multiplication algorithm are summarized in Table 6.5 where, as
before, $i$ is an odd number. The multiples of $A$ that are needed are $\pm 2A$
and $\pm 4A$, and they can be easily generated using shifts. The multiple $4A$ must
be generated when $(x_{i+1}) x_i x_{i-1} = (0)11$ to take care of the end of a group of 1's.
This cannot be done at the time when the bits $(x_{i+3}) x_{i+2} x_{i+1}$ are examined,
since they have a zero in the rightmost position. As a result, this algorithm is
not a recoding of the multiplier, as we cannot express 4 in two bits. The number
of partial products is always $n/2$. As for canonical recoding, two's complement
multipliers can be handled by extending the sign bit. Also, if unsigned numbers
are multiplied, one or two zeros must be added to the left of the multiplier.

  $x_{i+1}$  $x_i$  $x_{i-1}$   Operation   Comments
    0     0     0       +0       string of 0's
    0     0     1       +2A      end of 1's
    0     1     0       +2A      a single 1
    0     1     1       +4A      end of 1's
    1     0     0       -4A      beginning of 1's
    1     0     1       -2A      a single 0
    1     1     0       -2A      beginning of 1's
    1     1     1       +0       string of 1's
TABLE 6.5 An alternate 2-bit-at-a-time multiplication algorithm.

Example 6.4
For the multiplier 01101110, the following partial products are generated:

  (0)   01    10    11    10
       +2A   -2A   +4A   -2A

This translates to the SD number $010\bar{1}100\bar{1}0$, which is not a minimal
representation, since it includes two adjacent nonzero digits. Employing
the canonical recoding summarized in Table 6.4 yields $0100\bar{1}00\bar{1}0$, which
is a minimal representation. □
For the rightmost pair $x_1 x_0$, if $x_0 = 1$ it is considered a continuation of
a string of 1's that never really started, and therefore no subtraction took place.
For example, the multiplier 01110111 results in the following partial products:

    01    11    01    11
   +2A    +0   -2A    +0

instead of

    01    11    01    11
   +2A    +0   -2A    -A

This can be corrected by setting the initial partial product to be $-A$ instead
of 0 whenever $x_0 = 1$. All four possible cases are listed in Table 6.6.

  $x_2$  $x_1$  $x_0$   Operation
   0    0    1    +2A - A = A
   0    1    1    +4A - A = 3A
   1    0    1    -2A - A = -3A
   1    1    1     0 - A = -A
TABLE 6.6 The handling of $x_1 x_0$ with $x_0 = 1$ in the algorithm of Table 6.5.
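The rules of Table 6.5 together with the initial correction can be modeled in a few lines. In the Python sketch below, the expression $2x_{i-1} + 2x_i - 4x_{i+1}$ is used as a compact restatement of the eight table rows; this restatement and the names `left_reference_steps` and `left_reference_multiply` are assumptions made for illustration. The multiplier is treated as an $n$-bit two's complement number whose sign bit is extended to supply the leftmost reference bit.

```python
def left_reference_steps(x, n):
    """Operations of the Table 6.5 algorithm for the n-bit multiplier x.

    Returns (initial, multiples): initial is -1 when x_0 = 1 (the initial
    partial product is then -A) and 0 otherwise; multiples[j] is the
    multiple of A used for the pair x_i x_{i-1} with i = 2j + 1.
    """
    if n % 2:
        raise ValueError("n must be even")
    pattern = x & ((1 << n) - 1)
    sign = pattern >> (n - 1)

    def bit(k):
        # The reference bit to the left of the top pair is the extended sign.
        return (pattern >> k) & 1 if k < n else sign

    initial = -bit(0)
    multiples = [2 * bit(i - 1) + 2 * bit(i) - 4 * bit(i + 1)
                 for i in range(1, n, 2)]
    return initial, multiples


def left_reference_multiply(a, x, n):
    """Accumulate the initial value and the n/2 uniformly shifted multiples."""
    initial, multiples = left_reference_steps(x, n)
    product = initial * a
    for j, m in enumerate(multiples):
        product += (m * a) << (2 * j)
    return product


if __name__ == "__main__":
    print(left_reference_steps(0b01101110, 8))   # multiplier of Example 6.4
    print(left_reference_multiply(17, -9, 6))
```

Without the initial $-A$ the accumulated result would be $(X + x_0) \cdot A$, which is why the correction is needed exactly when $x_0 = 1$.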
Example 6.5
We repeat the previous example (where the radix-4 modified Booth's
algorithm was used) and obtain

  A                          01 00 01        17
  X            x        (1)  11 01 11        -9
  Operation                   0 -2A  0
  Initial -A               1 10 11 11
  Add 0        +           0 00 00 00
                           1 10 11 11
  2-bit Shift              1 11 10 11 11
  Add -2A      +           1 01 11 10
                           1 01 10 01 11
  2-bit Shift              1 11 01 10 01 11
  Add 0        +           0 00 00 00
                           1 11 01 10 01 11  -153

Note that the multiplier's sign bit had to be extended in order to decide
that no operation is needed for the first pair of multiplier bits. Also, as
in the previous example, an additional bit for holding the correct sign is
needed, because of multiples like $-2A$. □
The method summarized in Table 6.5 can also be extended to three bits
or more at each step. However, as in the radix-8 modified Booth's algorithm,
multiples of $A$ like $3A$ or even $6A$ are needed, and unless those are prepared in
advance and stored somewhere, we have to perform two additions in a single
step. For example, for (0)101 we need $8 - 2 = 6$, and for (1)001, $-8 + 2 = -6$.
6.2 IMPLEMENTING LARGE MULTIPLIERS USING SMALLER ONES
If an $n \times n$ bit multiplier is implemented as a single integrated circuit, we can use
several such circuits for implementing larger multipliers. A $2n \times 2n$ bit multiplier
can be constructed out of four $n \times n$ bit multipliers. This is based on the following
equation:

$$A \cdot X = (A_H \cdot 2^n + A_L) \cdot (X_H \cdot 2^n + X_L) = A_H \cdot X_H \cdot 2^{2n} + (A_H \cdot X_L + A_L \cdot X_H) \cdot 2^n + A_L \cdot X_L \qquad (6.2)$$

where $A_H$ and $A_L$ are the most and least significant halves of $A$, respectively,
and $X_H$ and $X_L$ are likewise for $X$.
The four partial products of length $2n$ bits each should be correctly aligned
before being added, as shown in Figure 6.1(a). A more convenient arrangement
is shown in Figure 6.1(b). This last arrangement gives the minimum height of
the matrix of numbers to be added, requiring one level of carry-save addition
and a CPA (carry-propagating adder). Note that the $n$ least significant bits are
already bits of the final product, and no further addition is needed. The $2n$ bits
in the center have to be added by a $2n$-bit CSA, whose outputs are connected
to a CPA. The $n$ most significant bits have to be connected to the same CPA,
since the center bits may generate a carry into the most significant bits. Thus,
a $3n$-bit CPA is needed.

  (a)                          | A_L x X_L |
                         | A_H x X_L |
                         | A_L x X_H |
                   | A_H x X_H |

  (b)              | A_H x X_H | A_L x X_L |
                         | A_H x X_L |
                         | A_L x X_H |
FIGURE 6.1 Aligning the four partial products in Equation (6.2).

FIGURE 6.2 Aligning the 16 partial products.
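Equation (6.2) and the arrangement of Figure 6.1(b) can be illustrated with unsigned integers. In the Python sketch below, the function name `multiply_2n` and the parameter `mul_n`, which stands in for an available $n \times n$ bit multiplier, are hypothetical. The products $A_H \cdot X_H$ and $A_L \cdot X_L$ are concatenated into one $4n$-bit row, and the two cross products are added in the center.

```python
def multiply_2n(a, x, n, mul_n=lambda p, q: p * q):
    """Multiply two unsigned 2n-bit numbers using four n x n multiplications.

    mul_n models the small building-block multiplier; by default it is the
    ordinary integer product of two n-bit operands.
    """
    mask = (1 << n) - 1
    a_h, a_l = a >> n, a & mask            # most and least significant halves
    x_h, x_l = x >> n, x & mask

    hh = mul_n(a_h, x_h)
    hl = mul_n(a_h, x_l)
    lh = mul_n(a_l, x_h)
    ll = mul_n(a_l, x_l)

    # Figure 6.1(b): hh and ll do not overlap, so they share a single row.
    top_row = (hh << (2 * n)) | ll
    # The two cross products are aligned n positions to the left.
    return top_row + ((hl + lh) << n)


if __name__ == "__main__":
    print(multiply_2n(0xB7, 0x5C, 4), 0xB7 * 0x5C)
```

Since the cross products are shifted by $n$ positions, the $n$ least significant bits of the result come unchanged from $A_L \cdot X_L$, in line with the remark that they need no further addition.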
The idea of decomposing a large multiplier into smaller ones can be further
extended. First, the basic multiplier used as a building block can be an $n \times m$
bit multiplier, with $n \ne m$. Second, multipliers larger than $2n \times 2m$ can be
implemented. For example, a $4n \times 4n$ bit multiplier can be implemented using
available $n \times n$ bit multipliers. A $4n \times 4n$ bit multiplier requires four $2n \times 2n$
bit multipliers, which in turn require four $n \times n$ bit multipliers each, for a total
of 16 $n \times n$ bit multipliers. The 16 partial products generated this way have to
be aligned before being added, as shown in Figure 6.2. Similar arrangements of
partial products can be drawn for any $kn \times kn$ bit multiplier with an integer $k$.

After aligning the 16 products, as shown in Figure 6.2, we have up to
seven bits in one column that need to be added. To add seven operands we
may use a set of (7,3) counters, which generate three operands, to be added by
another set of (3,2) counters. These will generate two operands, to be added by
a CPA. Another possibility is to combine the two sets of counters into a set of
(7;2) compressors, depicted in Figure 5.26. The task of selecting an economical
multi-operand adder is discussed next.
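The column heights that arise in such a decomposition can be counted mechanically. The Python sketch below is illustrative and uses the hypothetical name `block_column_heights`; it assumes that the product of the $i$th $n$-bit slice of $A$ and the $j$th $n$-bit slice of $X$ is $2n$ bits long and therefore occupies the $n$-bit wide columns $i + j$ and $i + j + 1$.

```python
def block_column_heights(k):
    """Heights of the n-bit wide columns for a kn x kn bit multiplier.

    Each of the k*k products of n x n bit multipliers is 2n bits long and
    contributes one n-bit slice to column i + j and one to column i + j + 1.
    Column 0 is the least significant one.
    """
    heights = [0] * (2 * k)
    for i in range(k):
        for j in range(k):
            heights[i + j] += 1          # least significant half of the product
            heights[i + j + 1] += 1      # most significant half of the product
    return heights


if __name__ == "__main__":
    print(block_column_heights(2))       # the four products of Equation (6.2)
    print(block_column_heights(4))       # the 16 products of Figure 6.2
```

For $k = 2$ the center columns have height three, which matches the single level of carry-save addition mentioned for Figure 6.1(b), and for $k = 4$ the tallest columns hold seven bits.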
6.3 ACCUMULATING THE PARTIAL PRODUCTS
After generating the partial products either through one of the algorithms
discussed in Section 6.1 or by using smaller multipliers, as in Section 6.2, we must
accumulate all the partial products to obtain the final product. If a high-speed
accumulation of partial products is desired, a fast multi-operand adder should be
employed. Such multi-operand adders, using different types of parallel counters,
have been described in Chapter 5. We should, however, take advantage of the
particular form of the partial products to be added and reduce the hardware
complexity of the multi-operand adder. The partial products to be added have a
smaller number of bits than the final product, and they have to be aligned before
being added. Thus, we can expect to see many columns that include fewer bits
than the total number of partial products, requiring simpler counters for their
addition.

 10  9  8  7  6  5  4  3  2  1  0
                 o  o  o  o  o  o
              o  o  o  o  o  o
           o  o  o  o  o  o
        o  o  o  o  o  o
     o  o  o  o  o  o
  o  o  o  o  o  o
(a) Original matrix of 36 bits.

 10  9  8  7  6  5  4  3  2  1  0
  o  o  o  o  o  o  o  o  o  o  o
     o  o  o  o  o  o  o  o  o
        o  o  o  o  o  o  o
           o  o  o  o  o
              o  o  o
                 o
(b) Reorganized matrix of bits.
FIGURE 6.3 Six partial products to be added.
Consider, for example, the six partial products that are generated when
multiplying two unsigned operands of length 6 bits each, using the simple
one-bit-at-a-time algorithm. The matrix of partial product bits to be added is shown
in Figure 6.3(a). These six operands can be added using the three-level
carry-save tree shown in Figure 5.24. The number of (3,2) counters can, however, be
substantially reduced by taking advantage of the fact that all columns but one
in Figure 6.3(a) contain fewer than six bits. To simplify the task of deciding
how many counters are needed we can redraw the matrix of bits to be added, as
depicted in Figure 6.3(b).
To further reduce the hardware complexity we also allow the use of half
adders (HAs) in addition to full adders (FAs). An HA, which can be called a
(2,2) counter, has a lower hardware complexity than an FA. Figure 6.4 depicts
the (3,2) and (2,2) counters that can be used in order to reduce the number of
operands from 6 to 2. These two operands are then added through a CPA. In
this figure, a vertical block containing three bits represents a (3,2) counter, while
a vertical block containing two bits represents a (2,2) counter. The horizontal
blocks in Figure 6.4(b) show the outputs of the (3,2) and (2,2) counters in Figure
6.4(a). For example, the horizontal block in columns 2 and 3 contains the two
outputs of the (3,2) counter in column 2 of Figure 6.4(a). The number of levels
in the carry-save addition is still 3, but the number of counters is substantially
smaller than that needed in the general case (see Figure 5.24).
The number of counters can be further reduced by employing the idea
mentioned in Chapter 5 of reducing the number of bits in each column to the
closest element of the series 3,4,6,9,13,19,... (01). This is shown in Figure 6.5
where, for example, in Figure 6.5(a), column 5, the smallest number of counters
which will reduce the number of bits to four is used. Overall, the scheme in
Figure 6.5 requires fifteen (3,2) counters and five (2,2) counters, compared to
the sixteen (3,2) counters and nine (2,2) counters needed in Figure 6.4. The
savings are even more substantial when larger multipliers are designed.
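The counter totals quoted for the alternate scheme can be reproduced by a small simulation. The Python sketch below is an assumption-laden model and not a transcription of Figure 6.5: it reduces every column, from right to left, to the next lower element of the series 2, 3, 4, 6, 9, 13, ... using as few counters as possible, with a (2,2) counter only when a column exceeds the target by exactly one bit. The names `column_heights`, `reduction_targets` and `count_counters` are hypothetical, and the exact placement of counters in the figure may differ.

```python
def column_heights(n):
    """Column heights of the partial product matrix of an n x n multiplier."""
    return [min(c, 2 * n - 2 - c) + 1 for c in range(2 * n - 1)]


def reduction_targets(max_height):
    """Elements of the series 2, 3, 4, 6, 9, ... below max_height, largest first."""
    targets = [2]
    while targets[-1] * 3 // 2 < max_height:
        targets.append(targets[-1] * 3 // 2)
    return targets[::-1]


def count_counters(heights):
    """Return the numbers of (3,2) and (2,2) counters and the final heights."""
    heights = list(heights) + [0]        # room for a carry out of the top column
    full = half = 0
    for target in reduction_targets(max(heights)):
        carry_in = 0
        for c in range(len(heights)):
            h = heights[c] + carry_in    # bits present, including new carries
            carry_out = 0
            while h > target:
                if h == target + 1:
                    half += 1            # (2,2) counter removes one bit
                    h -= 1
                else:
                    full += 1            # (3,2) counter removes two bits
                    h -= 2
                carry_out += 1           # each counter sends one carry leftward
            heights[c] = h
            carry_in = carry_out
    return full, half, heights


if __name__ == "__main__":
    print(count_counters(column_heights(6))[:2])
```

For the six-bit case the first reduction target is four bits per column, which agrees with the treatment of column 5 described for Figure 6.5(a).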
The above discussion is restricted to unsigned numbers. If some of the
partial products are negative numbers represented in two's complement, we need
to modify the matrix of bits shown in Figure 6.3(a). Specifically, all sign bits
must be properly extended before the addition of the partial products takes place,
yielding the matrix shown in Figure 6.6. The number of bits in row 1 is now
11 instead of 6, and so on. This extension significantly increases the hardware
complexity of the multi-operand adder required. If two's complement numbers
are obtained by generating the one's complement and then adding a carry to the
least significant bit, the matrix will have to be increased even further.
FIGURE 6.4 Reduction of the six partial products: (a) Level 1 carry-save
addition. (b) Results of level 1. (c) Level 2 carry-save addition. (d) Level 3
carry-save addition.

FIGURE 6.5 An alternate scheme for reduction of the six partial products:
(a) Level 1 carry-save addition. (b) Results of level 1. (c) Level 2 carry-save
addition. (d) Level 3 carry-save addition.

 10  9  8  7  6  5  4  3  2  1  0
  ●  ●  ●  ●  ●  ○  ○  ○  ○  ○  ○
  ●  ●  ●  ●  ○  ○  ○  ○  ○  ○
  ●  ●  ●  ○  ○  ○  ○  ○  ○
  ●  ●  ○  ○  ○  ○  ○  ○
  ●  ○  ○  ○  ○  ○  ○
  ○  ○  ○  ○  ○  ○
FIGURE 6.6 Six signed partial products to be added (full circles indicate
extended sign bits).

We may minimize the increase in complexity by realizing that the two's
complement number

$$s \; s \; s \; s \; s \; s \; z_4 \; z_3 \; z_2 \; z_1 \; z_0$$

whose value is

$$-s \cdot 2^{10} + s \cdot 2^9 + s \cdot 2^8 + s \cdot 2^7 + s \cdot 2^6 + s \cdot 2^5 + z_4 \cdot 2^4 + z_3 \cdot 2^3 + z_2 \cdot 2^2 + z_1 \cdot 2^1 + z_0$$

can be replaced by

$$0 \; 0 \; 0 \; 0 \; 0 \; (-s) \; z_4 \; z_3 \; z_2 \; z_1 \; z_0$$

since

$$-s \cdot 2^{10} + s \cdot (2^9 + 2^8 + 2^7 + 2^6 + 2^5) = -s \cdot 2^{10} + s \cdot (2^{10} - 2^5) = -s \cdot 2^5.$$

To represent the value $-s$ in column 5 (in Figure 6.6), we complement the
original sign digit to obtain $(1 - s)$, and add 1. We get the $-s$ as required
along with a carry of 1 into column 6. The latter will serve as the extra 1 needed
in column 6 to deal with the sign bit of the second partial product. Another
carry-out will be generated in column 6 and so on. The resulting matrix of bits is
shown in Figure 6.7. This new matrix has fewer bits than that in Figure 6.6 but
has a higher maximum height (7 instead of 6, in column 5). We can eliminate the
extra 1 in column 5 if we place the two sign bits $s_1$ and $s_2$ in the same column,
since

$$(1 - s_1) + (1 - s_2) = 2 - s_1 - s_2.$$

The 2 is carried out to the next column, leaving behind $-s_1$ and $-s_2$. An extra
1 in this column is no longer required. Placing the two sign bits in the same
column can be achieved by first extending the sign bit $s_1$ in one position, as
shown in Figure 6.8. The maximum column height is now back to 6.

  10   9   8   7   6   5   4   3   2   1   0
                       1
                     ~s1   o   o   o   o   o
                 ~s2   o   o   o   o   o
             ~s3   o   o   o   o   o
         ~s4   o   o   o   o   o
     ~s5   o   o   o   o   o
 ~s6   o   o   o   o   o
FIGURE 6.7 The modified array of six signed partial products (~s_i denotes the
complement $\bar{s}_i$).

If the negative partial products are obtained by first generating the one's
complement and then adding a carry to the least significant bit, these extra
carries can then be added to the matrix, as shown in Figure 6.9. The full circles
indicate that the complements of the corresponding bits are taken whenever
$s_i = 1$. The extra $s_6$ in column 5 increases the maximum column height to
7. However, if the last partial product is always positive (i.e., the multiplier is
always positive), this $s_6$ can be eliminated.

  10   9   8   7   6   5   4   3   2   1   0
                 ~s1  s1   ●   ●   ●   ●   ●
                 ~s2   ●   ●   ●   ●   ●  s1
             ~s3   ●   ●   ●   ●   ●  s2
         ~s4   ●   ●   ●   ●   ●  s3
     ~s5   ●   ●   ●   ●   ●  s4
 ~s6   ●   ●   ●   ●   ●  s5
                      s6
FIGURE 6.9 The modified array with negative partial products represented in
one's complement (~s_i denotes the complement $\bar{s}_i$).

Example 6.6
In this example, the negative partial products are generated as a result of
a recoded multiplier using canonical recoding. If all sign bits are extended,
the following matrix is obtained, where all replicated sign bits are shown
in bold face:
A               0 1 0 1 1 0     22
X     x         0 0 1 0 1 1     11
Y               0 1 0 1̄ 0 1̄     Recoded multiplier
      1 1 1 1 1 1 0 1 0 1 0
      0 0 0 0 0 0 0 0 0 0
      1 1 1 1 0 1 0 1 0
      0 0 0 0 0 0 0 0
      0 0 1 0 1 1 0
      0 0 0 0 0 0
      0 0 0 1 1 1 1 0 0 1 0
FIGURE 6.8 Further modified array of six signed partial products.
A smaller matrix of bits to be added is obtained if we follow the scheme illustrated in Figure 6.8:
10  9  8  7  6  5  4  3  2  1  0
             0  1  0  1  0  1  0
             1  0  0  0  0  0
          0  0  1  0  1  0
       1  0  0  0  0  0
    1  1  0  1  1  0
 1  0  0  0  0  0
 0  0  0  1  1  1  1  0  0  1  0
If the negative partial products are generated using one's complement and a carry into the least significant position, then the resulting matrix becomes
10  9  8  7  6  5  4  3  2  1  0
             0  1  0  1  0  0  1
             1  0  0  0  0  0  1
          0  0  1  0  0  1  0
       1  0  0  0  0  0  1
    1  1  0  1  1  0  0
 1  0  0  0  0  0  0
                0
 0  0  0  1  1  1  1  0  0  1  0
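The reduced matrices of Example 6.6 can be checked numerically. The following Python sketch is illustrative only: the names `row_fields`, `accumulate` and `RECODED_11` are hypothetical, and the code assumes 6-bit operands, a multiplicand in the range 0 to 31, and recoded digits in {-1, 0, 1}. Each row is built as in Figure 6.8 (complemented sign bit, with the first sign bit extended by one position); optionally, negative rows are formed in one's complement with the missing 1 added as a separate carry, as in Figure 6.9.

```python
N = 6                          # operand width in bits, as in Example 6.6
WIDTH = 2 * N - 1              # product columns 10..0
LOW_MASK = (1 << (N - 1)) - 1  # the N-1 bits below the sign bit


def row_fields(multiple, ones_complement):
    """Return (sign bit, low N-1 bits, extra carry) of one N-bit partial product."""
    if multiple >= 0:
        return 0, multiple & LOW_MASK, 0
    if ones_complement:
        # One's complement of the magnitude; the missing +1 is returned as a carry.
        return 1, ~(-multiple) & LOW_MASK, 1
    return 1, multiple & LOW_MASK, 0


def accumulate(a, digits, ones_complement=False):
    """Add the rows of the reduced matrix; digits[0] is the least significant digit."""
    total = 0
    for i, d in enumerate(digits):
        s, low, carry = row_fields(d * a, ones_complement)
        if i == 0:
            # First row: complemented sign, original sign, then the N-1 low bits.
            row = ((1 - s) << N) | (s << (N - 1)) | low
        else:
            # Other rows: complemented sign followed by the N-1 low bits.
            row = ((1 - s) << (N - 1)) | low
        total += (row << i) + (carry << i)
    return total % (1 << WIDTH)   # the carry out of column 10 is discarded


# Canonical recoding of the multiplier 11 = 16 - 4 - 1, least significant digit first.
RECODED_11 = [-1, 0, -1, 0, 1, 0]
```

With the operands of the example, `accumulate(22, RECODED_11)` and `accumulate(22, RECODED_11, ones_complement=True)` both give 242, the product 22 x 11.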
If we use the modified radix-4 Booth algorithm for generating the partial products, the resulting matrices, corresponding to Figures 6.7, 6.8 and 6.9, are shown in Figure 6.10(a), (b) and (c), respectively. Note that in Figure 6.10(a), the carry generated by adding 1 to $\bar{s}_1 = (1 - s_1)$ for the first partial product is not positioned in the same column as that of $\bar{s}_2 = (1 - s_2)$, the complement of the sign bit of the second partial product. We need, therefore, to put an extra 1 in column 7 which, together with the carry generated by column 6, produces the necessary 1 in column 8.
FIGURE 6.10 Three schemes for an array of three (radix-4 modified Booth algorithm) partial products. (Panels: Scheme (a), Scheme (b), Scheme (c).)
Example 6.7
We repeat the multiplication from the previous example but now use the radix-4 modified Booth's algorithm resulting in three partial products. The recoded multiplier turns out in this case to be exactly the same; i.e., 0 1 0 1̄ 0 1̄. The second scheme (see Figure 6.10(b)) results in
9  8  7  6  5  4  3  2  1  0
      0  1  1  0  1  0  1  0
   1  0  0  1  0  1  0
1  1  0  1  1  0
0  0  1  1  1  1  0  0  1  0
If the third scheme (see Figure 6.10(c)) is followed, the resulting matrix is
9  8  7  6  5  4  3  2  1  0
      0  1  1  0  1  0  0  1
   1  0  0  1  0  0  1     1
1  1  0  1  1  0     1
               0
0  0  1  1  1  1  0  0  1  0
6.4 ALTERNATIVE TECHNIQUES FOR PARTIAL PRODUCT ACCUMULATION
Several modifications to the basic tree structure for partial product accumulation have been suggested and implemented. The purpose of these techniques is to reduce the number of levels in the tree (and, as a result, speed up the accumulation) and/or achieve a more regular design. Tree structures usually have very irregular interconnects. This irregularity complicates the implementation and, more importantly, irregular structures result in area-inefficient layouts, especially when a rectangular-shaped layout is sought. Notice also that a smaller number of levels results in less irregularity.
The number of levels in the tree can be lowered by using a reduction rate higher than 3:2. A reduction rate of 2:1 can be achieved if the carry-save adders are replaced by adders for binary SD numbers described in Section 2.4. Like the carry-save adder, the SD adder generates the sum of its two operands in
constant time (independent of the number of bits), since the carry is allowed to propagate at most one position. The number of levels in the SD adder tree is smaller and, in addition, the tree produces a single result rather than the two results of the ordinary CSA tree (see, for example, Figure 5.24). However, the result of the SD adder tree is still in SD representation and, consequently, in most cases, a conversion to two's complement representation is needed. This conversion is done by forming two sequences. The first sequence, denoted by $Z^+$, is created by replacing each negative digit of the SD number by zero. The second sequence, denoted by $Z^-$, replaces each negative digit of the original SD number with its absolute value, and each positive digit by zero. Then, the difference $Z^+ - Z^-$ is found by adding the two's complement of $Z^-$ to $Z^+$ using a carry-propagating adder. Hence, a final stage of a CPA is needed here, as it is needed in the ordinary CSA tree.
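The conversion just described can be sketched in a few lines of Python. This is a minimal illustration, not a hardware model: the function name `sd_to_int` is hypothetical, the digits are assumed to be given most significant first, and one extra bit is assumed for the sign of the two's complement result.

```python
def sd_to_int(digits):
    """Convert a binary SD number (digits in {-1, 0, 1}, most significant
    first) to a signed integer via the two sequences Z+ and Z-."""
    z_plus = 0
    z_minus = 0
    for d in digits:
        z_plus = (z_plus << 1) | int(d == 1)     # negative digits become 0
        z_minus = (z_minus << 1) | int(d == -1)  # |negative digits|, positives become 0

    width = len(digits) + 1                      # one extra bit for the sign
    mask = (1 << width) - 1
    neg_z_minus = (~z_minus + 1) & mask          # two's complement of Z-
    result = (z_plus + neg_z_minus) & mask       # the carry-propagate addition

    # Interpret the bit pattern as a two's complement number.
    if result >> (width - 1):
        result -= 1 << width
    return result
```

For instance, the recoded multiplier of Example 6.6, with digits 0, 1, 0, -1, 0, -1, converts back to 11.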
Another advantage of an SD adder tree over a CSA tree is that there is no need for a sign bit extension when negative partial products are to be added. SD numbers simply do not require a separate sign bit. The major disadvantage of the SD adder is that its design is more complex, consuming more gates and consequently a larger chip area, since each signed digit requires two ordinary bits (or a multiple-valued logic implementation that can provide three values per digit, corresponding to -1, 0, and 1). As a result, a more careful comparison between the CSA tree and the SD adder tree for the particular given technology must be performed before deciding which to employ.
Example 6.8
A 32 x 32 multiplier based on the radix-4 modified Booth's algorithm generates 16 partial products to be accumulated and consequently requires a CSA tree with six levels (see Table 5.1), but needs an SD adder tree with only four levels. Some sophisticated logic design techniques and layout schemes can be employed, resulting in less area-consuming implementations [10].
The same reduction rate of 2:1 can be achieved without resorting to SD representations by using (4;2) compressors, shown in Figure 6.11. Similarly to the (7;2) compressor in Figure 5.26, the (4;2) compressor must be designed so that $C_{out}$ is not a function of $C_{in}$, in order to avoid a ripple-carry effect. Also, the (4;2) compressor may be implemented as a multi-level circuit with a smaller overall delay compared to the implementation based on two (3,2) counters, as in Figure 6.11. One such implementation is shown in Figure 6.12 with a delay of three exclusive-or gates between the inputs ($x_1$, $x_2$, $x_3$ and $x_4$) and the output S.
FIGURE 6.11 A (4;2) compressor.
Implementing the (4;2) compressor with two (3,2) counters, as shown in Figure 6.11, will result in a delay of four exclusive-or gates. Thus, the delay of the implementation in Figure 6.12 is expected to be 25% lower than that of the implementation in Figure 6.11.
Other multi-level implementations of a (4;2) compressor are possible. All such implementations must satisfy the following arithmetic equation:

$x_1 + x_2 + x_3 + x_4 + C_{in} = S + 2(C + C_{out}),$

and $C_{out}$ should not depend on $C_{in}$, to avoid horizontal rippling of carries. The truth table for such implementations is summarized in Table 6.7, where a, b, c, d, e and f are Boolean variables. The implementation in Figure 6.12 corresponds to the setting a = b = c = 1 and d = e = f = 0.
FIGURE 6.12 An implementation of a (4;2) compressor.
x1 x2 x3 x4   Cout   C     S         x1 x2 x3 x4   Cout   C     S
 0  0  0  0    0     0     Cin        1  0  0  0    0     Cin   C̄in
 0  0  0  1    0     Cin   C̄in        1  0  0  1    d     d̄     Cin
 0  0  1  0    0     Cin   C̄in        1  0  1  0    e     ē     Cin
 0  0  1  1    a     ā     Cin        1  0  1  1    1     Cin   C̄in
 0  1  0  0    0     Cin   C̄in        1  1  0  0    f     f̄     Cin
 0  1  0  1    b     b̄     Cin        1  1  0  1    1     Cin   C̄in
 0  1  1  0    c     c̄     Cin        1  1  1  0    1     Cin   C̄in
 0  1  1  1    1     Cin   C̄in        1  1  1  1    1     1     Cin
TABLE 6.7 The truth table for a (4;2) compressor.
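The truth table can be checked against the arithmetic equation of the (4;2) compressor for every choice of the free variables. The Python sketch below is only an illustration; the names `TWO_ONES` and `compressor_4_2` are hypothetical, and the table is encoded by the number of 1's among the four inputs, which is an equivalent restatement of Table 6.7 rather than a gate-level design.

```python
from itertools import product

# Input combinations with exactly two 1's, and the free variable of Table 6.7
# that decides whether the pair is reported through Cout or through C.
TWO_ONES = {
    (0, 0, 1, 1): "a", (0, 1, 0, 1): "b", (0, 1, 1, 0): "c",
    (1, 0, 0, 1): "d", (1, 0, 1, 0): "e", (1, 1, 0, 0): "f",
}


def compressor_4_2(x, c_in, free):
    """Return (Cout, C, S) for inputs x = (x1, x2, x3, x4) and carry-in c_in.

    `free` maps the names 'a'..'f' to 0 or 1, as in Table 6.7.
    """
    ones = sum(x)
    if ones == 0:
        return 0, 0, c_in
    if ones == 1:
        return 0, c_in, 1 - c_in
    if ones == 3:
        return 1, c_in, 1 - c_in
    if ones == 4:
        return 1, 1, c_in
    v = free[TWO_ONES[x]]          # exactly two 1's: (Cout, C) = (v, not v)
    return v, 1 - v, c_in


def table_is_consistent(free):
    """Check the arithmetic equation and the independence of Cout from Cin."""
    for x in product((0, 1), repeat=4):
        outs = [compressor_4_2(x, c_in, free) for c_in in (0, 1)]
        for c_in, (c_out, c, s) in zip((0, 1), outs):
            if sum(x) + c_in != s + 2 * (c + c_out):
                return False
        if outs[0][0] != outs[1][0]:   # Cout must not change with Cin
            return False
    return True
```

Under these assumptions, `table_is_consistent` holds for all 64 settings of a through f, including the setting a = b = c = 1, d = e = f = 0 that the text associates with Figure 6.12.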
Several other techniques have been suggested to modify the structure of CSA trees which use (3,2) counters, in order to achieve a more regular and less area-consuming layout. Such modified tree structures may require a somewhat larger number of CSA levels with a larger overall delay. Two such techniques are described next. The first one defines balanced delay trees [21] (also [19]) while the second one defines overturned-stairs trees [15]. Figure 6.13 illustrates the structure of the bit-slices for these two techniques and compares them to the corresponding Wallace tree bit slice. All the bit-slices in Figure 6.13 are for 18 operands (partial products) which may be generated by a high radix multiplication algorithm (e.g., a radix-4 modified Booth's algorithm). In this case, the 18 downward triangles in Figure 6.13 represent multiplexers that select the suitable multiple of the multiplicand. The rectangles represent (3,2) counters, and the numbers on these counters indicate the delay experienced by the input operands. Thus, after $6\Delta_{FA}$ two results are produced by the Wallace and the overturned-stairs trees; the balanced tree requires $7\Delta_{FA}$.
Note that all three tree structures have fifteen outgoing carries and fifteen incoming carries, and each outgoing carry is aligned with its corresponding incoming carry (from the previous bit slice), so that adjacent bit-slices abut. The incoming carries are routed to different (3,2) counters so that all the inputs to a counter are valid before or at the necessary time. Only for the balanced tree are all fifteen incoming carries generated exactly when they are required, since all paths are balanced. In the other two trees, there are counters for which not all incoming carries are generated simultaneously. For example, the bottom counter in the overturned-stairs tree has incoming carries whose associated delays are $4\Delta_{FA}$ and $5\Delta_{FA}$.
The three tree structures also differ in the number of required wiring tracks between adjacent bit-slices; these, in turn, affect the layout area. The Wallace tree requires six wiring tracks; the overturned-stairs and the balanced tree require three and two tracks, respectively. Note the inherent tradeoff between size and speed. A Wallace tree guarantees the lowest overall delay but requires the highest number of wiring tracks (on the order of log N, where N is the number of inputs). The balanced tree, on the other hand, requires the smallest number of wiring tracks but has the highest overall delay.
The balanced and overturned-stairs trees have a regular structure and can be designed in a systematic way. This is difficult to see from Figure 6.13, but it can be concluded from Figure 6.14, which shows the complete structure of these two trees as well as that of the corresponding Wallace tree. The building blocks of the balanced and overturned-stairs trees are indicated with dotted lines in Figure 6.14. The exact details of the recursive construction of the two types of regular trees, and some variations of them, can be found in [24] and [15].
An adder tree that uses (4;2) compressors will have a more regular structure and may have a lower delay than an ordinary CSA tree made of (3,2) counters. Table 6.8 compares the delays of carry-save trees using either (3,2) counters or (4;2) compressors. Since the delay of a (4;2) compressor is 1.5 times that of a (3,2) counter, the number of levels of (4;2) compressors in column 3 is multiplied by 1.5 to yield the equivalent delay in column 4. Note that the equivalent delay of a carry-save tree using (4;2) compressors (column 4) is not always smaller than that of a carry-save tree using (3,2) counters (column 2). For example, for nine partial products, (3,2) counters will yield a carry-save tree with an overall lower delay. Various other counters and compressors can be employed in the implementation of the addition tree for the partial product accumulation; for example, (7,3) counters [13].
Number of    Number of levels    Number of levels    Equivalent
operands     using (3,2)         using (4;2)         delay
3            1                   1                   1.5
4            2                   1                   1.5
5 - 6        3                   2                   3
7 - 8        4                   2                   3
9            4                   3                   4.5
10 - 13      5                   3                   4.5
14 - 16      6                   3                   4.5
17 - 19      6                   4                   6
20 - 28      7                   4                   6
29 - 32      8                   4                   6
33 - 42      8                   5                   7.5
TABLE 6.8 Comparing the delays of carry-save adders using either (3,2) counters or (4;2) compressors.
FIGURE 6.13 Three CSA tree bit-slices for 18 operands (downward triangles are multiplexers): Wallace tree bit slice, overturned-stairs bit slice, balanced tree bit slice.
FIGURE 6.14 Wallace, overturned-stairs and balanced trees for 18 operands (downward triangles are multiplexers): (a) Wallace tree; (b) overturned-stairs tree; (c) balanced tree.
When determining the final layout of a CSA tree, care must be taken to make sure that wires connecting the inputs to a carry-save adder have roughly the same length; otherwise the delay-balanced paths will no longer be balanced. Consider, for example, a CSA tree for 27 operands (27 partial products obtained from a 53-bit multiplier using the radix-4 modified Booth algorithm). A CSA tree constructed out of (4;2) compressors is shown in Figure 6.15(a), and the corresponding layout is shown in Figure 6.15(b) [25]. Note that the bottom compressor (#13) is located in the middle so that compressors #11 and #12 are roughly at the same distance from it. Compressor #11 in turn has equal length wires from #8 and #9, and so on.
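The level counts in Table 6.8 can be reproduced with a short calculation. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, a level of (3,2) counters is assumed to turn every complete group of three operands into two and pass the rest through, and a level of (4;2) compressors is assumed to halve the number of operands (rounding up). The factor 1.5 is the relative compressor delay quoted in the text.

```python
def levels_3_2(n):
    """Number of (3,2) counter levels needed to reduce n operands to two."""
    levels = 0
    while n > 2:
        n = 2 * (n // 3) + n % 3   # each full group of three becomes two
        levels += 1
    return levels


def levels_4_2(n):
    """Number of (4;2) compressor levels needed to reduce n operands to two."""
    levels = 0
    while n > 2:
        n = (n + 1) // 2           # reduction rate of 2:1, rounded up
        levels += 1
    return levels


def equivalent_delay_4_2(n):
    """Delay of the (4;2) tree expressed in (3,2) counter delays."""
    return 1.5 * levels_4_2(n)
```

For nine operands this gives four levels of (3,2) counters against an equivalent delay of 4.5 for (4;2) compressors, which is the exception pointed out in the text; for the 27 partial products of Figure 6.15 it gives four compressor levels.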
6.5 FUSED MULTIPLY-ADD UNIT
A fused multiply-add unit performs the multiplication A x B followed immediately by an addition of the product and a third operand C so that the calculation of A x B + C is done as a single and indivisible operation. Clearly, such a unit is capable of performing multiply only, by setting C = 0, and add (or subtract) only by setting, for example, B = 1.
A fused multiply-add unit can reduce the overall execution time of chained multiply and then add/subtract operations. An example of a case when such chained multiply and add operations are useful is in the evaluation of a polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ through $[(a_n x + a_{n-1})x + a_{n-2}]x + \cdots$. On the other hand, independent multiply and add operations cannot be performed in parallel.
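The nested form of the polynomial is a chain of multiply-add steps, one per coefficient. A minimal Python sketch follows; the function name `horner` and the coefficient ordering (highest degree first) are assumptions made for this illustration.

```python
def horner(coefficients, x):
    """Evaluate a_n*x**n + ... + a_0 using one multiply-add per coefficient.

    `coefficients` lists a_n, a_(n-1), ..., a_0 (highest degree first).
    """
    acc = 0
    for a in coefficients:
        acc = acc * x + a      # the chained A x B + C step
    return acc
```

For example, `horner([2, -3, 0, 5], 3)` evaluates 2x^3 - 3x^2 + 5 at x = 3 as ((2·3 - 3)·3 + 0)·3 + 5.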
Another advantage of a fused multiply-add unit, compared to a separate multiplier and adder, arises when executing floating-point operations, since rounding is performed only once for the result A x B + C rather than twice (for the multiply and then for the add). Since rounding may introduce computation errors, reducing the number of roundings may have a positive effect on the overall error. In the design reported in [14], this additional accuracy was helpful when producing a correctly rounded quotient in the divide by reciprocation algorithm (see Section 8.2).
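The effect of rounding once instead of twice can be seen on ordinary double-precision numbers. The Python sketch below is only an emulation for illustration: the function names are hypothetical, and the single rounding is imitated by computing A x B + C exactly with rational arithmetic and converting to a float at the end, which is an assumption about how to model the fused unit rather than a description of any hardware.

```python
from fractions import Fraction


def separate_multiply_add(a, b, c):
    """Multiply, round, then add and round again (two roundings)."""
    return a * b + c


def fused_multiply_add(a, b, c):
    """Emulate a fused unit: exact A x B + C, then a single rounding."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Operands chosen so that the exact product 1 - 2**-60 rounds to 1.0.
A = 1.0 + 2.0 ** -30
B = 1.0 - 2.0 ** -30
C = -1.0
```

With these operands the separately rounded computation returns 0.0, because the product is rounded to 1.0 before the addition, whereas the singly rounded result is -2^-60.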
Figure 6.16 shows an implementation of a fused multiply-add unit for floating-point computations. Here, A, B and C are the significands while $E_A$, $E_B$ and $E_C$ are the exponents of the operands, respectively. The CSA tree generates all the partial products and performs their carry-save accumulation to produce two results which are then added with the properly aligned operand C. The adder accepts three operands and, therefore, must first reduce them to two (using (3,2) counters) and then perform carry-propagate addition. The steps of post-normalization and rounding are executed next.
The design illustrated in Figure 6.16 employs two techniques in order to reduce the overall execution time. First, the leading zero anticipator circuit uses the propagate and generate signals produced by the adder (see Section 5.2), to predict the type of shift which will be needed in the post-normalization step.
FIGURE 6.15 A (4;2) compressor tree for 27 partial products and its layout: (a) (4;2) compressor tree; (b) simplified layout of the (4;2) compressor tree.
This circuit operates in parallel to the addition itself so that the delay of the normalization step is shorter. Second, and more importantly, the alignment of the significand C, according to $E_A + E_B - E_C$, is done in parallel to the multiplication of A and B. Normally, in a floating-point addition, we align the significand of the smaller operand (i.e., the operand with the smaller exponent). This will imply that if the product A x B is smaller than C, we will have to shift the product after it has been generated, introducing additional delay. We prefer instead to always align C even if it is larger than A x B, to allow the shift to be performed in parallel to the multiplication. To achieve this, we must allow C to shift either to the right (as is traditionally done) or to the left, the direction dictated by whether the result of $E_A + E_B - E_C$ is positive or negative, respectively. If we allow C to be shifted to the left we must increase the total number of bits in the adder. For example, if all operands are floating-point numbers in the long IEEE format, the possible range of C relative to the product A x B is shown as follows:
[Diagram: the product A x B (53 + 53 bits) and the range of the addend C (53 bits) relative to it.]
This is the range for $53 \ge E_A + E_B - E_C \ge -53$. If $E_A + E_B - E_C \ge 54$, the bits of C which are shifted further to the right will be replaced by a sticky bit, and if $E_A + E_B - E_C \le -54$, all the bits of A x B will be replaced by a sticky bit. The overall penalty is thus a 50% increase in the width of the adder which, in turn, will increase the execution time of the adder. Note, however, that the top 53 bits of the adder need only be capable of incrementing the original contents of the 53 bits if a carry propagates from the lower 106 bits.
The path from the output of the rounding circuit in Figure 6.16 to the multiplexer on the right is used when performing a calculation like (X x Y + Z) + A x B. The path from the output of the normalization circuit to the multiplexer on the left is used when performing a calculation like (X x Y + Z) x B + C. In this case, the rounding step for (X x Y + Z) is performed at the same time as the multiplication by B, by adding the partial product Incr. x B to the CSA tree.
FIGURE 6.16 Fused floating-point multiply-add unit.
6.6 ARRAY MULTIPLIERS
The two basic operations, the generation of partial products and their summation, may be merged. In this way, we avoid the overhead that is due to the separate controls of these two operations, and we thus speed up the multiplication.
Such multipliers, which consist of identical cells, each capable of forming a new partial product and adding it to the previously accumulated partial product, are called iterative array multipliers or simply array multipliers. Clearly, any gain in speed is obtained at the expense of extra hardware. Another important characteristic of array multipliers is that they can be implemented so as to support a high rate of pipelining.
                                   a4     a3     a2     a1     a0
                            x      x4     x3     x2     x1     x0
                                   a4·x0  a3·x0  a2·x0  a1·x0  a0·x0
                            a4·x1  a3·x1  a2·x1  a1·x1  a0·x1
                     a4·x2  a3·x2  a2·x2  a1·x2  a0·x2
              a4·x3  a3·x3  a2·x3  a1·x3  a0·x3
       a4·x4  a3·x4  a2·x4  a1·x4  a0·x4
P9     P8     P7     P6     P5     P4     P3     P2     P1     P0
FIGURE 6.17 The partial products generated in a 5 x 5 multiplication.
To illustrate the operation of an array multiplier, examine the 5 x 5 parallelogram shown in Figure 6.17, which contains all 25 partial product bits of the form $a_i \cdot x_j$ properly aligned. A straightforward implementation of an array multiplier adds the first two partial products (i.e., a4·x0, a3·x0, ..., a0·x0 and a4·x1, a3·x1, ..., a0·x1) in row one after proper alignment. The results of the first row are then added to a4·x2, a3·x2, ..., a0·x2 in the second row, and so on. The basic cell for such an array multiplier is an FA accepting one bit of the new partial product ($a_i \cdot x_j$), one bit of the previously accumulated partial product, and a carry-in bit. A block diagram of a 5 x 5 array multiplier for unsigned numbers is depicted in Figure 6.18. In the first four rows there is no horizontal carry propagation. In other words, a carry-save type addition is performed in these rows, and the accumulated partial product consists of intermediate sum and carry bits. Only in the last row is a horizontal carry propagation allowed. The last row of cells in this figure is a ripple-carry adder that can be replaced by a fast two-operand adder (e.g., a carry-look-ahead adder) if a shorter overall execution time is desired.
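The carry-save rows followed by a final carry-propagating row can be imitated at the bit level. The Python sketch below is a simplified model and not the exact cell wiring of Figure 6.18: the function names are hypothetical, every partial product (including the first) is assumed to be added by its own carry-save row, and the last stage is modelled as a plain ripple-carry addition of the sum and carry vectors.

```python
def full_adder(x, y, z):
    """Return (sum, carry) of three bits."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)


def array_multiply(a, x, n=5):
    """Multiply two unsigned n-bit numbers with carry-save rows and one CPA."""
    width = 2 * n
    a_bits = [(a >> i) & 1 for i in range(n)]
    x_bits = [(x >> j) & 1 for j in range(n)]

    sums = [0] * width       # intermediate sum bits, sums[k] has weight 2**k
    carries = [0] * width    # intermediate carry bits, same weights

    for j in range(n):       # one carry-save row per multiplier bit
        new_sums = [0] * width
        new_carries = [0] * width
        for k in range(width):
            pp = a_bits[k - j] & x_bits[j] if 0 <= k - j < n else 0
            s, c = full_adder(sums[k], carries[k], pp)
            new_sums[k] = s
            if k + 1 < width:
                new_carries[k + 1] = c   # carry moves one column to the left
        sums, carries = new_sums, new_carries

    # Final row: horizontal carry propagation (ripple-carry adder).
    result, carry = 0, 0
    for k in range(width):
        s, carry = full_adder(sums[k], carries[k], carry)
        result |= s << k
    return result
```

Only the last loop propagates a carry horizontally; within the carry-save rows each column is computed independently, which is why the delay of those rows does not grow with the operand width.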
FIGURE 6.18 An array multiplier for unsigned numbers.
The array multiplier in Figure 6.18 has to be modified in order to allow multiplication of signed numbers in two's complement notation, since product bits like a4·x0 and a0·x4 have a negative weight and should be subtracted rather than added. One way to handle the eight negatively weighted partial product bits properly, in a 5 x 5 bit multiplication, is depicted in Figure 6.19. Bits with negative weight are marked with a small circle instead of an arrow. Such bits have to be subtracted instead of being added. The cells with three positive inputs are ordinary FAs and are marked by I in the figure. The cells with a single negative input and two positive inputs are marked by II. The sum of the three inputs of a type II cell can vary from -1 to 2. This requires the diagonal output c to have a weight of +2, and the vertical output s to have a weight of -1. The arithmetic operation of a type II cell is described by the equation

$x + y - z = 2c - s.$   (6.3)

The values of the s and c outputs are given by

$s = (x + y - z) \bmod 2$   and   $c = \frac{(x + y - z) + s}{2}.$   (6.4)

Cells with two negative inputs and one positive input are marked by II'. The sum of their inputs can vary from -2 to 1. Hence, their c output should have a weight of -2 and their s output should have a weight of +1. Finally, a cell with all its inputs negative is marked by I' (see Figure 6.23) and has negatively weighted c and s outputs. This cell counts the number of (-1)'s at its inputs and represents this number through the c and s outputs. Logically, its operation is the same as that of the type I cell and, therefore, their gate implementations are identical. This explains the reason for marking them I and I'. Similarly, the gate implementations of type II and type II' cells are identical.
Another approach to the design of an array multiplier for two's complement operands is to employ Booth's algorithm. A multiplier based on this algorithm consists of n rows of basic cells, where n is the number of multiplier bits. Each row is capable of either adding a properly aligned multiplicand to, or subtracting it from, the previously accumulated partial product. The cells in row i perform an add, subtract or transfer-only operation, depending on the value of $x_i$ and the appropriate reference bit. Such a multiplier is shown in Figure 6.20 for four-bit operands. The basic cell in this multiplier is a controlled add/subtract/shift (CASS) circuit, depicted in Figure 6.20(a) [12]. The H and D signals are control signals indicating the type of operation to be performed by the corresponding row of CASS cells. If H is 0, no arithmetic operation is done, and therefore the new partial product bit, denoted by $P_{out}$, is equal to the previous one, $P_{in}$.
FIGURE 6.19 An array multiplier for two's complement numbers.
(a) Controlled add/subtract/shift (CASS) cell.
FIGURE 6.20 A Booth's algorithm array multiplier.
If H = 1, an arithmetic operation is performed, generating a new $P_{out}$. The type of arithmetic operation is indicated by the D signal. If D = 0 then the multiplicand bit, denoted by a, is added to $P_{in}$ with $C_{in}$ as an incoming carry bit from the adjacent cell to the right. The cell then generates $P_{out}$ and $C_{out}$ as the outgoing carry to the next cell on the left. If D = 1 then the multiplicand bit, a, is subtracted from $P_{in}$, with $C_{in}$ as an incoming borrow and $C_{out}$ as the outgoing borrow. Thus, the logic equations for $P_{out}$ and $C_{out}$ are [12]

$P_{out} = P_{in} \oplus (a \cdot H) \oplus (C_{in} \cdot H),$
$C_{out} = (P_{in} \oplus D) \cdot (a + C_{in}) + a \cdot C_{in}.$

An alternate approach to the design of a CASS cell is as a combination of a multiplexer (selecting among 0, +a and -a) and an FA. The control signals H and D for row i are generated by a CTRL circuit (shown in Figure 6.20(b)) based on the multiplier bit $x_i$ and the reference bit $x_{i-1}$, following the rules of Booth's algorithm from Table 6.1. The first row corresponds to the most significant bit of the multiplier. Hence, the resulting partial product needs to be shifted to the left before we add to it (or subtract from it) the next multiple of the multiplicand. To achieve this, a new cell with input $P_{in} = 0$ is added (at the right end) to the second row, and to each row afterward. Since the number of bits in the partial product increases by one in each row, we need to expand the multiplicand before adding it to (or subtracting it from) the partial product. This is accomplished by replicating the sign bit of the multiplicand as shown in Figure 6.20(b).
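The two logic equations of the CASS cell can be exercised directly. The Python sketch below is illustrative: the function names `cass_cell` and `cass_row` are hypothetical, and the row model only chains the carry (or borrow) from right to left over a fixed width, leaving out the shifting and sign replication of the full array in Figure 6.20(b).

```python
def cass_cell(p_in, a, c_in, h, d):
    """One controlled add/subtract/shift cell; all arguments are single bits."""
    p_out = p_in ^ (a & h) ^ (c_in & h)
    c_out = ((p_in ^ d) & (a | c_in)) | (a & c_in)
    return p_out, c_out


def cass_row(p, a, h, d, width=4):
    """Apply a row of CASS cells to the width-bit words p and a."""
    result, carry = 0, 0
    for k in range(width):                 # from the rightmost cell to the left
        p_out, carry = cass_cell((p >> k) & 1, (a >> k) & 1, carry, h, d)
        result |= p_out << k
    return result
```

With H = 1 and D = 0 the row adds, with H = 1 and D = 1 it subtracts (both modulo 2^width), and with H = 0 it passes the partial product through unchanged.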
Note that we cannot take advantage of strings of 0's or 1's in this implementation, since we cannot eliminate or skip rows. Thus, the only advantage in this implementation is the ability to multiply negative numbers in two's complement with no need for any correction step. Also note that the operation in row i need not be delayed until all the upper (i - 1) rows have completed their operation. Thus, the least significant bit of the product, $P_0$, will be generated after one CASS cell delay (in addition to the delay of a CTRL circuit), $P_1$ will be generated after two CASS cell delays, and the most significant bit, $P_{2n-2}$, will be generated after (2n - 1) CASS cell delays.
In a similar way we can implement higher-radix multiplication schemes, which require fewer rows in the array, by employing, for example, the radix-4 algorithms shown in Tables 6.3 and 6.5 or similar radix-8 algorithms. These, too, can handle negative multipliers in two's complement representation. The building block of such multipliers is a multiplexer-adder circuit that selects the correct multiple of the multiplicand A and adds it to the previously accumulated partial product to produce a new accumulated partial product.
FIGURE 6.21 The basic units of the pipelined array multiplier [16]: latched full adder with an AND gate; latched half adders.
FIGURE 6.22 A pipelined 5 x 5 array multiplier for unsigned numbers.
An important characteristic of array multipliers is that they allow a pipelining mode of operation, where the execution of separate multiplications overlaps. If this mode of operation is desired, the long delay associated with the carry-propagating addition performed in the last row of the array (e.g., see Figure 6.18) should be minimized, since it determines the throughput of the pipeline. This can be achieved by replacing the CPA with several additional rows that, like the first rows in the array, allow a carry propagation of only one position between any two consecutive rows. Five such rows are needed in the 5 x 5 array multiplier for unsigned numbers in Figure 6.18, with 4, 4, 3, 2, and 1 cells, respectively. These rows are shown in a pipelined version of the 5 x 5 array multiplier, depicted in Figure 6.22. The basic cells employed in this multiplier are shown in Figure 6.21 [16]. The FA in Figure 6.21 includes an AND gate that generates the product bit $a_i x_j$. This product bit is added to the incoming bits b and d to produce the output bits S and $C_{out}$. The modified FA also propagates the bits $a_i$ and $x_j$ to neighboring cells. The two versions of the HA in the same figure are used in the bottom five rows where the cells add only two input bits each.
In order to support pipelining, all cells in the array must include latches, so that each row can handle a separate multiplier-multiplicand pair. Also, registers are needed to propagate the multiplier bits to their destination, and to propagate the product bits that have been completed, which is done in parallel with the generation of new product bits.
Up to 10 consecutive multiply operations can be executed simultaneously in the multiplier depicted in Figure 6.22. The maximum rate at which multiply operations can be completed is determined by the delay associated with the modified FA, including the latch. This rate might be, in practice, too high to be used as the clock rate of the circuit. However, other implementations of the 5 x 5 pipelined multiplier with a lower rate are possible. For example, two rows can be combined to form a single pipeline stage with a lower rate but with fewer latches (less circuitry overall).
Also, if t he rc",idnf' number systt'm L" f'mploycd, smaller c'ircuits with ff"wf'f input!;
arp required and, conseqlll'nt.ly, thl' lowf'r bonnd i
Tmull pogJ2ml,
(6.6)
1:null rlogJ2nl.
(6.5)
wl1Prf' J1I is t.he Jlumher of digits t.hat are needed t.o rf'prf'S('nl t.hp largP"il m'Hlulus
in the rl'siduc uumber systcm, as explained in Chaptf'r 11. and wumlly rn « n.
\Vhl'u arclling fl}r an optimal implementation of a multiplier in Hie mn.
\'t'utional binary number systf'm t w(' nCf'(1 to comparl' t.he perfornnnC'e (p.xecntion
timc) and implementation costs (e.g.. rl'gularit.y of thf' df"!;ign, totkl urt'a, pt.e.)
of the previously rlpscribpd algoritluns for multiplic'ation. \Ylwn both f'xu"uriou
time snd implcmentdtion cost, say, area, nl'f'd to be t.ak('n into iUTount, an lib..
j('ctive funct.ion like A. T Can be used, where A denotf' the area and T denotf'S
t he execution time. A more gl'neral form of an objective fuuftion is A. T" . whell'
0' can be I'ither smallpr or Inrer t.han 1.
In what follows we compare several multipliers, some of which were presented
in previous sections. The simple array multiplier depicted in Figure 6.18
has a very regular structure. It can be implemented easily as a rectangular-shaped
array, with no waste of chip area. The $n$ least significant bits of the
final product are then produced on the right side of the rectangle, while the $n$
most significant product bits are the outputs of the bottom row of the rectangle,
which constitutes a CPA. Although this implementation is highly regular and its
design and layout are very simple, it possesses two major drawbacks: First, it
requires a very large area, proportional to $n^2$, since it contains about $n^2$ FAs and
AND gates. Second, it has a long execution time $T$ of about $2n\Delta_{FA}$ ($\Delta_{FA}$ is
the delay of an FA). More precisely, $T$ consists of $(n-1)\Delta_{FA}$ for the first $(n-1)$
rows and an additional $(n-1)\Delta_{FA}$ for the CPA if implemented as a ripple-carry
adder, as shown in Figure 6.18. Thus, an objective function of the form $A \cdot T$ is
directly proportional to $n^3$.
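The cubic growth of $A \cdot T$ for the simple array multiplier can be checked with a small Python sketch. It is only a cost model under stated assumptions: area is counted as $n^2$ cells, the delay is $2(n-1)\Delta_{FA}$, and the function name and the unit FA delay are hypothetical.

```python
def array_multiplier_cost(n, fa_delay=1.0):
    """Return (area, delay, area * delay) for an n x n array multiplier model."""
    area = n * n                     # about n^2 full adders (plus AND gates)
    row_delay = (n - 1) * fa_delay   # carry-save rows: first (n - 1) rows
    cpa_delay = (n - 1) * fa_delay   # ripple-carry CPA in the bottom row
    delay = row_delay + cpa_delay
    return area, delay, area * delay


for n in (8, 16, 32, 64):
    area, delay, objective = array_multiplier_cost(n)
    print(n, area, delay, objective)
```

Doubling $n$ multiplies the objective by a factor that approaches 8, consistent with $A \cdot T$ being proportional to $n^3$.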
If a highly pipelined version of this array multiplier is desired, the required
area increases even further (since the CPA must be replaced) as does the latency
of a single multiply operation. However, the resulting pipeline period, which
determines the pipelining rate, is shorter.
An implementation of the Booth's algorithm array multiplier, depicted in
Figure 6.20, offers no advantage over the previous multiplier when performance
and area are considered, since the area $A$ is of the order of $n^2$ and $T$ is linear
in $n$. The modified radix-4 Booth algorithm (see Table 6.3) can potentially
result in a better implementation, since it requires only $n/2$ rows of cells. This
reduction in the number of rows could, in principle, reduce the delay ($T$) and
the implementation cost ($A$) by a factor of two, decreasing the objective function
$A \cdot T$ to a fourth of its previous value. However, a more detailed examination of
the design reveals that the actual delay and area gains are less than expected.
The recoding logic and, more importantly, the partial product selectors, add
6.7 OPTIMALITY OF MULTIPLIER IMPLEMENTATIONS
Bounds on the performance of algorithms for multiplication have been derived in
a way similar to that of the bounds for addition that were described in Section
5.4. It is interesting to note that the theoretical bounds for multiplication are
similar to those for addition, although, in practice, multiplication is more time-consuming
than addition. Thus, if we adopt the idealized model, which assumes
that all circuits are implemented using $(f, r)$ gates, the execution time of a
multiply circuit for two operands with $n$ bits each must satisfy the lower bound (6.5).
6.2. Prove that no correction step is needed when using the multiplication algorithm
in Table 6.1 with a negative multiplier represented in two's complement. Repeat
this for the algorithm in Table 6.5 with a sign bit extension.
6.3. Verify that the new partial product in the radix-4 modified Booth algorithm is
$(x_{i-1} + x_{i-2} - 2x_i) \cdot A$ for odd values of $i$. Use this expression to formally prove
the correctness of the algorithm.
6.4. Write down the rules for a radix-8 modified Booth's algorithm or, in other words,
a 3-bit version of the algorithm in Table 6.3.
6.5. (a) Verify that the 74261 chip, which is called a 2-bit by 4-bit parallel multiplier,
implements the algorithm in Table 6.3.
(b) How many such chips are needed to construct a 12 x 12 bit two's complement
multiplier? Show how these chips should be interconnected.
(c) Explain how the $Q_4$ output signals of the 74261 chip are used to generate
the sign bit of the partial product.
(d) What type of carry-save adder is needed?
6.6. In case 101 of Table 6.6 we need two forced carries into the adder. To avoid this,
we may force $x_0$ to 0 if it equals 1 and set the initial partial product to be $+A$
instead of $-A$. Show that the correct partial product is always obtained.
6.7. Design a 3n x 3n bit multiplier out of n x n bit multipliers. Find the number of
n x n bit multipliers that are needed and show how the partial products should
be aligned. What type of counters are needed to add the partial products? Can
(5,5,4) counters be useful?
6.8. Write the truth table of a type II cell used in two's complement array multipliers
and obtain the Boolean equations for the $c$ and $s$ outputs. Repeat this for type
II' cells.
6.9. Can the four cells in the last row in Figure 6.19 be made into type II by defining
the rightmost zero carry as having a positive weight?
6.10. The idea behind the array multiplier in Figure 6.19 was first proposed by Pezaris
[19], who has shown a slightly different organization of the multiplier, as depicted
in Figure 6.23. Explain why the $P_4$ output in Figure 6.23 is connected to the cell
on its left side.
6.11. Design an array multiplier for two 5-bit negabinary operands; for example, $X =
\sum_{i=0}^{4} x_i (-2)^i$. What is the range of each operand and of the product? Draw the
multiplier, indicate how many different types of 1-bit cells are needed, and write
the truth table for each type.
6.12. In this question you are asked to estimate the execution time of an array multiplier
like the one shown in Figure 6.18. Denote by $\Delta_s$ and $\Delta_c$ the delays
associated with the sum and carry outputs of the basic cells, respectively, and
assume that they satisfy $\Delta_s > \Delta_c$. Find the critical path in the array multiplier
assuming all product bits $a_i x_j$ are available simultaneously. Estimate the execution
time of an n x n bits multiplication. Can you suggest ways to speed up the
operation?
complexity to the circuit and result in a larger number of interconnections and
a longer delay per row. Also, since the relative shift between any two adjacent
rows is two bit positions, we must allow the carry to propagate horizontally in
these bit positions. This can be achieved either locally or at the last row of the
array multiplier. After that, a carry propagation through $(2n - 1)$ bits (instead
of $n - 1$) is required [18]. The exact overall reduction in the objective function
depends on the details of the design and the technology used.
Similar problems are encountered when implementing the radix-8 modified
Booth's algorithm in the form of an array multiplier. In addition, the partial
product $3A$ should be precalculated. Consequently, the reduction in delay and
area may be far less than the expected factor of 1/3. Still, the implementation of
the radix-8 algorithm might be cost-effective in certain technologies and design
styles.
Irrespective of the way partial products are generated, they can be accumulated
either through a cascade interconnection (as in Figures 6.18 and 6.20)
or through a tree structure (e.g., a CSA tree, in Section 5.11, or some variation
of it, as in Section 6.4). The number of levels in a CSA tree for $k$ partial
products is of the order of $\log k$ rather than being linear in $k$ (as in a cascade
interconnection), resulting in a much shorter execution time (the number of partial
products, $k$, can be $n$, $n/2$, $n/3$, etc., where $n$ is the number of bits). However,
CSA tree structures have irregular interconnects, making it difficult to find an
area-efficient layout with a rectangular shape. Moreover, an overall width of $2n$
is required in most cases. This may result in a multiplier area of the order of
$2n \log k$. The objective function $A \cdot T$ may, consequently, increase as $2n \log^2 k$.
The balanced delay tree in Figure 6.13 has a more regular structure. The
increments in the number of operands in the balanced delay tree are 3, 3, 5, 7, 9, ....
The sum of the elements in this series is of the order of $j^2$, where $j$ is the number
of elements in the series. The number of levels, which determines the overall
delay, increases linearly with $j$. As a result, the overall delay of a balanced delay
tree is of the order of $\sqrt{k}$, where $k = j^2$ is the number of operands. This needs
to be compared to $\log k$, which is the number of levels in the complete binary
tree. The detailed proof is left to the reader as an exercise. One should be
aware that general expressions for the complexity of either the execution time
or the area, like the ones above, have theoretical importance, but only limited
practical significance. For any given technology, a more detailed examination of
the alternative designs is necessary before final conclusions can be drawn.
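The $\sqrt{k}$ versus $\log k$ growth rates can be illustrated with a short Python sketch. Two assumptions are made here that are not spelled out in the text: the CSA tree is modeled by repeatedly replacing every full group of three operands with two, and the balanced delay tree is represented only by the count $j$ of series elements 3, 3, 5, 7, 9, ... needed to reach $k$ operands (the delay is proportional to $j$, not necessarily equal to it). Both function names are hypothetical.

```python
def csa_tree_levels(k):
    """Levels of (3,2) counters needed to reduce k operands to 2."""
    levels = 0
    while k > 2:
        k -= k // 3      # every full group of three operands becomes two
        levels += 1
    return levels


def balanced_tree_elements(k):
    """Number j of series elements 3, 3, 5, 7, 9, ... whose sum reaches k."""
    count, total, step = 0, 0, 3
    while total < k:
        total += step
        count += 1
        step = 3 if count == 1 else step + 2
    return count


for k in (9, 27, 64, 10000):
    print(k, csa_tree_levels(k), balanced_tree_elements(k))
```

For small $k$ the two counts are comparable, while for large $k$ the series length grows like $\sqrt{k}$ and clearly exceeds the logarithmic number of CSA levels.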
6.8 EXERCISES
6.1. Show that Booth's algorithm can be used to convert a number in two's complement
representation to its SD representation.
FIGURE 6.23 The array multiplier for two's complement numbers suggested in [19].
FIGURE 6.24 Partial products for a 5 x 5 two's complement multiplication [1].
FIGURE 6.25 Modified array of partial products for a 5 x 5 two's complement multiplication.
6.13. Prove that the arrangement of partial product bits shown in Figure 6.24 produces
the correct product of two 5-bit two's complement operands, where $\bar{x}_i = 1 - x_i$
and similarly $\bar{a}_i = 1 - a_i$. This arrangement was suggested by Baugh and Wooley
[1]. Compare this multiplier to the two's complement array multipliers shown in
Figures 6.19 and 6.20, considering the amount of hardware and execution time.
6.14. The Booth's algorithm multiplier in Figure 6.20 starts with the most significant
bit of the multiplier. Redesign the multiplier starting with the least significant
bit of X. Compare the execution time and the required hardware of the two
alternatives.
6.15. Show a block diagram of a 6 x 6 bit two's complement multiplier constructed out
of multiplexer-adder circuits based on the radix-4 modified Booth's algorithm in
Table 6.3.
6.16. Prove that the delay of the balanced delay tree shown in Figure 6.13 is proportional
to $\sqrt{k}$, where $k$ is the number of operands.
6.17. Explain why the HAs in the leftmost column of the array multiplier in Figure
6.22 have no carry output.
6.18. Verify that the implementation in Figure 6.12 corresponds to the setting $a = b =
c = 1$ and $d = e = f = 0$ of the variables in Table 6.7.
6.19. Find the values of $a, b, c, d, e$ and $f$ in Table 6.7 which will yield an expression
with the smallest number of literals (a literal is any appearance of either $x_i$ or
$\bar{x}_i$) for $C_{out}$.
6.20. Prove that the following modification (see Figure 6.25) of the arrangement of
partial products (for two's complement operands) suggested in [1], produces the
correct final product. Compare this arrangement to the original one shown in
Figure 6.24.
6.9 REFERENCES
[1] C.R. Baugh and B.A. Wooley, "A two's complement parallel array multiplication algorithm," IEEE Trans. on Computers, C-22 (Dec. 1973), 1045-1047.
[2] K.C. Bickerstaff, M.J. Schulte and E.E. Swartzlander, "Parallel reduced area multipliers," Journal of VLSI Signal Processing, 9 (1995), 181-191.
[3] A.D. Booth, "A signed binary multiplication technique," Quart. J. Mech. Appl. Math., 4, Part 2, 1951, 236-240.
[4] L. Dadda, "Some schemes for parallel multipliers," Alta Frequenza, 34 (March 1965), 349-356.
[5] L. Dadda, "On parallel digital multipliers," Alta Frequenza, 45 (1976), 574-580.
[6] J. Deverell, "Pipeline iterative arithmetic arrays," IEEE Trans. on Computers, C-24 (March 1975), 317-322.
[7] M.J. Flynn and S.F. Oberman, Advanced computer arithmetic design, Wiley, New York, 2001.
[8] J.A. Gibson and R.W. Gibbard, "Synthesis and comparison of two's complement parallel multipliers," IEEE Trans. on Computers, C-24 (Oct. 1975), 1020-1027.
[9] A. Habibi and P.A. Wintz, "Fast multipliers," IEEE Trans. on Computers, C-19 (Feb. 1970), 153-157.
[10] Y. Harata et al., "A high-speed multiplier using a redundant binary adder tree," IEEE J. of Solid-State Circuits, SC-22 (Feb. 1987), 28-33.
[11] O.L. MacSorley, "High-speed arithmetic in binary computers," Proc. of IRE, 49 (Jan. 1961), 67-91.
[12] J.C. Majithia and R. Kitai, "An iterative array for multiplication of signed binary numbers," IEEE Trans. on Computers, C-20 (Feb. 1971), 214-216.
[13] M. Mehta, V. Parmar and E. Swartzlander, "High-speed multiplier design using multi-input counter and compressor circuits," Proc. 10th Symp. on Computer Arithmetic (1991), 43-50.
[14] R.K. Montoye, E. Hokenek and S.L. Runyon, "Design of the IBM RISC System/6000 floating-point unit," IBM Journal of Research and Development, 34 (January 1990), 59-67.
[15] Z.J. Mou and F. Jutand, "Overturned-stairs adder trees and multiplier design," IEEE Trans. on Computers, 41 (August 1992), 940-948.
[16] T.G. Noll et al., "A pipelined 330-MHz multiplier," IEEE Journal of Solid-State Circuits, SC-21 (June 1986), 411-416.
[17] V.G. Oklobdzija and D. Villeger, "Improving multiplier design by using improved column compression tree and optimized final adder in CMOS technology," IEEE Trans. on VLSI Systems, 3 (June 1995), 292-301.
[18] V. Peng, S. Samudrala and M. Gavrielov, "On the implementation of shifters, multipliers and dividers in VLSI floating-point units," Proc. of 8th Symp. on Computer Arithmetic (May 1987), 95-102.
[19] S.D. Pezaris, "A 40-ns 17-bit by 17-bit array multiplier," IEEE Trans. on Computers, C-20 (April 1971), 442-447.
[20] G.W. Reitwiesner, "Binary arithmetic," in Advances in computers, vol. 1, F.L. Alt (Editor), Academic, New York, 1960, pp. 231-308.
[21] L.P. Rubinfield, "A proof of the modified Booth's algorithm for multiplication," IEEE Trans. on Computers, C-24 (Oct. 1975), 1014-1015.
[22] M.R. Santoro and M.A. Horowitz, "SPIM: A pipelined 64 x 64 iterative multiplier," IEEE Journal of Solid-State Circuits, 24 (April 1989), 487-493.
[23] P.F. Stelling, C.U. Martel, V.G. Oklobdzija and R. Ravi, "Optimal circuits for parallel multipliers," IEEE Trans. on Computers, 47 (March 1998), 273-285.
[24] D. Zuras and W.H. McAllister, "Balanced delay trees and combinatorial division in VLSI," IEEE Journal of Solid-State Circuits, SC-21 (Oct. 1986), 814-819.
[25] R.K. Yu and G.B. Zyner, "167 MHz radix-4 floating point multiplier," Proc. of the 12th Symp. on Computer Arithmetic (July 1995), 149-154.
7
FAST DIVISION
There are two different approaches to the development of algorithms for high-speed
division. The more conventional approach uses add/subtract and shift
operations, while the second relies on multiplication. The operation count in the
first approach is linearly proportional to the word size, $n$. The number of steps
in the second approach is logarithmically proportional to $n$, but each individual
step is more complex. The first approach is discussed in this chapter while the
second is presented in Chapter 8.
7.1 SRT DIVISION
The most well known division algorithm of the first type is the SRT division,
named after Sweeney, Robertson, and Tocher ([11], [15], [19]), each of whom
developed it independently at around the same time. The motivation behind
the SRT algorithm was an attempt to speed up the nonrestoring division (which
consists of $n$ add/subtract operations and is presented in Chapter 3) by allowing
0 to be a quotient digit for which no add/subtract operation is needed. In principle,
we can change the rule for selecting the quotient digit in the nonrestoring
division to
$$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge D \\ 0 & \text{if } -D \le 2r_{i-1} < D \\ \bar{1} & \text{if } 2r_{i-1} < -D \end{cases} \tag{7.1}$$
and the corresponding new remainder is
$$r_i = 2r_{i-1} - q_i \cdot D. \tag{7.2}$$
FIGURE 7.1 Nonrestoring division with $q_i = 0$ (plot of $r_i$ versus $2r_{i-1}$).
FIGURE 7.2 SRT division (plot of $r_i$ versus $2r_{i-1}$).
This modified nonrestoring division is diagrammed in Figure 7.1. The difficulty
with this new selection rule is that a full comparison of $2r_{i-1}$ with either $D$ or $-D$
is required. If we restrict $D$ to be a normalized fraction satisfying $1/2 \le |D| < 1$,
we may reduce the region of $2r_{i-1}$, for which $q_i = 0$, as follows:
$$-D \le -\frac{1}{2} \le 2r_{i-1} < \frac{1}{2} \le D \tag{7.3}$$
The advantage of this is that now we can compare the partial remainder
$2r_{i-1}$ to either 1/2 or $-1/2$, instead of $D$ or $-D$. A binary fraction
is larger than or equal to 1/2 if, and only if, it starts with 0.1. Similarly, a
binary fraction is smaller than $-1/2$ if and only if it starts with 1.0 (in two's
complement representation). Consequently, only two bits of $2r_{i-1}$ have to be
examined, instead of a full-length comparison between $2r_{i-1}$ and $D$. In some
cases (e.g., when the dividend $X$ is larger than 1/2) the shifted partial remainder
needs an integer bit in addition to the sign bit, and thus, three bits of $2r_{i-1}$ must
be examined. The rule for selecting the quotient digit becomes
$$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge 1/2 \\ 0 & \text{if } -1/2 \le 2r_{i-1} < 1/2 \\ \bar{1} & \text{if } 2r_{i-1} < -1/2. \end{cases} \tag{7.4}$$
The resulting algorithm is called SRT division, and it can be diagrammed as
shown in Figure 7.2. This diagram shows the quotient digits that must be
selected in order to satisfy the condition $|r_i| \le |D|$, guaranteeing the convergence
of the division procedure with a final remainder smaller than $|D|$.
The SRT division process starts off with a normalized divisor and has the
effect of normalizing the partial remainder by shifting over leading 0's if it is
positive, and leading 1's if it is negative. For example, if $2r_{i-1} = 0.001xxxx$
(where $x$ is any binary digit), then $2r_{i-1} < 1/2$ and we set $q_i$ to 0, obtaining
$2r_i = 0.01xxxx$ and so on. Similarly, if $2r_{i-1} = 1.110xxxx$ then $2r_{i-1} > -1/2$
and we set $q_i = 0$ obtaining $2r_i = 1.10xxxx$ and so on. We say, therefore, that
SRT division is nonrestoring division with a normalized divisor and remainder.
SRT division, as nonrestoring division, can be extended to include negative
divisors in two's complement. The selection rule for $q_i$ then becomes
$$q_i = \begin{cases} 0 & \text{if } |2r_{i-1}| < 1/2 \\ 1 & \text{if } |2r_{i-1}| \ge 1/2 \text{ and } r_{i-1} \text{ and } D \text{ have the same sign} \\ \bar{1} & \text{if } |2r_{i-1}| \ge 1/2 \text{ and } r_{i-1} \text{ and } D \text{ have opposite signs.} \end{cases} \tag{7.5}$$
Example 7.1
Let the dividend $X$ be equal to $(0.0101)_2 = 5/16$ and the divisor $D$ be
$(0.1100)_2 = 3/4$. Applying the SRT algorithm yields
    r_0 = X          0.0101
    2r_0             0.1010    $\ge 1/2$, set $q_1 = 1$
    Add -D         + 1.0100
    r_1              1.1110
    2r_1 = r_2       1.1100    $\ge -1/2$, set $q_2 = 0$
    2r_2 = r_3       1.1000    $\ge -1/2$, set $q_3 = 0$
    2r_3             1.0000    $< -1/2$, set $q_4 = \bar{1}$
    Add D          + 0.1100
    r_4              1.1100    negative remainder and positive X
    Add D          + 0.1100    correction
    r_4              0.1000    corrected final remainder
The quotient generated before the correction is $Q = 0.100\bar{1}$. This is a
minimal representation of $Q = 0.0111$ in SD form. In other words, a
minimal number of add/subtract operations is performed. After correction,
$Q$ becomes $0.0111 - ulp = 0.0110_2 = 3/8$, and the final remainder is
$1/2 \cdot 2^{-4} = 1/32$.
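The selection rule (7.4) and the recursion (7.2) can be traced with a minimal Python sketch that uses exact rational arithmetic. It is an illustration under stated assumptions, not the hardware algorithm: it compares full values instead of inspecting two or three leading bits, it handles only a positive dividend and a positive normalized divisor, and the function name `srt_divide` is hypothetical.

```python
from fractions import Fraction


def srt_divide(x, d, n):
    """Radix-2 SRT division sketch for 0 <= x < d and 1/2 <= d < 1.

    Returns (signed digits, corrected quotient, final remainder).
    """
    half = Fraction(1, 2)
    r = Fraction(x)
    digits = []
    for _ in range(n):
        p = 2 * r                 # shifted partial remainder 2r_{i-1}
        if p >= half:
            q = 1
        elif p < -half:
            q = -1                # the signed digit written as 1-bar in the text
        else:
            q = 0                 # no add/subtract operation needed
        r = p - q * d             # r_i = 2r_{i-1} - q_i * D
        digits.append(q)
    quotient = sum(Fraction(q, 2 ** (i + 1)) for i, q in enumerate(digits))
    if r < 0:                     # correction step for a positive dividend
        r += d
        quotient -= Fraction(1, 2 ** n)
    return digits, quotient, r / 2 ** n


print(srt_divide(Fraction(5, 16), Fraction(3, 4), 4))
```

Run on the operands of Example 7.1, the sketch produces the digits 1, 0, 0, $\bar{1}$, the corrected quotient 3/8 and the final remainder 1/32.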
Example 7.2
Let $X = (0.00111111)_2 = 63/256$ and $D = (0.1001)_2 = 9/16$.
    r_0 = X          0.00111111
    2r_0             0.01111110    $< 1/2$, set $q_1 = 0$
    2r_1             0.11111100    $\ge 1/2$, set $q_2 = 1$
    Add -D         + 1.0111
    r_2              0.01101100
    2r_2             0.11011000    $\ge 1/2$, set $q_3 = 1$
    Add -D         + 1.0111
    r_3              0.01001000
    2r_3             0.10010000    $\ge 1/2$, set $q_4 = 1$
    Add -D         + 1.0111
    r_4              0.00000000    zero final remainder
$Q = 0.0111_2 = 7/16$. This is not a minimal representation of the quotient
in SD form.
Based on the last example, we may conclude that it is possible to further
reduce the number of add/subtract operations. Simulations and statistical
analysis studying the efficiency of the SRT method have been performed [9], and
the conclusions were:
1. The average "shift" in the SRT method is 2.67, meaning that for a dividend
of length $n$ we need, on the average, $n/2.67$ operations. For example, for
$n = 24$, on the average, $24/2.67 = 8.9 \approx 9$ operations are required.
2. The actual number of operations needed depends upon the divisor $D$. The
smallest number is achieved when $17/28 \le D \le 3/4$ (or, approximately
when $3/5 \le D \le 3/4$), with an average shift of 3.
Hence, in order to reduce the number of add/subtract operations, we should
modify the SRT method when the divisor happens to be out of the optimal
range ($3/5 \le D \le 3/4$). Two ways of achieving this are described below:
1. Examine the possibility of using a multiple of $D$ like $2D$ if $D$ is too small,
or $D/2$ if $D$ is too large, in some of the steps during the division. Notice
that subtracting $2D$ ($D/2$) instead of $D$ is equivalent to performing the
subtraction one position earlier (later).
2. Change the comparison constant $K = 1/2$ if $D$ is outside the optimal range.
Such a change is allowed because the ratio $D/K$ is what really matters,
since we compare the partial remainder to $K$, not $D$.
The idea behind scheme (1) is that whenever $D$ is small we may end up
generating a sequence of 1's in the quotient one bit at a time, requiring a subtract
operation per each bit, as in the last example. In such cases, subtracting $2D$
instead of $D$ (which is equivalent to subtracting $D$ in the previous step) might
generate a negative partial remainder, allowing us to generate a sequence of 0's
as quotient bits while normalizing the partial remainder.
Example 7.3
Repeating the previous example we obtain
    r_0 = X          0.00111111
    2r_0             0.01111110    $< 1/2$, set $q_1 = 0$
    2r_1            00.11111100    subtract 2D
    Add -2D       + 10.1110        instead of D
    r_2             11.11011100    set $q_1 = 1$ and $q_2 = 0$
    2r_2             1.10111000    set $q_3 = 0$
    2r_3             1.01110000    $< -1/2$, set $q_4 = \bar{1}$
    Add D          + 0.1001
    r_4              0.00000000    zero final remainder
$Q = 0.100\bar{1}_2 = 7/16$ and this is a minimal representation of the quotient
in SD form.
If $D$ is large, a single 0 within a sequence of 1's in the quotient may
result in two consecutive add/subtract operations, instead of one. Performing an
addition of $D/2$ instead of $D$ for the last 1 before the single 0 (which is equivalent
to performing the addition one position later) may generate a negative partial
remainder that will allow us to properly handle the single 0, and then continue
normalizing the partial remainder until the end of the sequence of 1's is reached.
Example 7.4
Let $X = (0.01100)_2 = 3/8$ and $D = (0.11101)_2 = 29/32$. The correct 5-bit
quotient is $Q = 0.01101_2 = 13/32$. Applying the basic SRT algorithm
results in $Q = 0.10\bar{1}1\bar{1}$, and the single 0 within the group of 1's in $Q$ is
not handled in the most efficient way. If we use the multiple $D/2$ we obtain
    r_0 = X          0.01100
    2r_0             0.11000     $\ge 1/2$, set $q_1 = 1$
    Add -D         + 1.00011
    r_1              1.11011
    2r_1             1.10110     set $q_2 = 0$
    2r_2             1.011000    add D/2
    Add D/2        + 0.011101    instead of D
    r_3              1.110101    set $q_3 = 0$ and $q_4 = \bar{1}$
    2r_3             1.10101
    2r_4             1.01010     $< -1/2$, set $q_5 = \bar{1}$
    Add D          + 0.11101
    r_5              0.00111     final remainder $= 7/32 \cdot 2^{-5}$
$Q = 0.100\bar{1}\bar{1}_2 = 13/32$; i.e., the single 0 is handled properly.
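The different signed-digit quotients in Examples 7.2 through 7.4 can be compared with a small Python helper. It assumes that every nonzero quotient digit costs one add/subtract operation, which matches the traces above; the function names are hypothetical and the digit $\bar{1}$ is written as -1.

```python
from fractions import Fraction


def sd_value(digits):
    """Value of the signed-digit fraction 0.d1 d2 d3 ... (digits in {-1, 0, 1})."""
    return sum(Fraction(d, 2 ** (i + 1)) for i, d in enumerate(digits))


def add_subtract_count(digits):
    """Number of nonzero digits, i.e. add/subtract operations in the trace."""
    return sum(1 for d in digits if d != 0)


# Examples 7.2 and 7.3: two representations of 7/16.
print(sd_value([0, 1, 1, 1]), add_subtract_count([0, 1, 1, 1]))
print(sd_value([1, 0, 0, -1]), add_subtract_count([1, 0, 0, -1]))

# Example 7.4: basic SRT versus the version that uses D/2.
print(sd_value([1, 0, -1, 1, -1]), add_subtract_count([1, 0, -1, 1, -1]))
print(sd_value([1, 0, 0, -1, -1]), add_subtract_count([1, 0, 0, -1, -1]))
```

Each pair has the same value, but the second representation in each pair has fewer nonzero digits.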
To implement this scheme, two adders are needed. One adder will always
add or subtract $D$, while the other will add/subtract $2D$ if $D$ is too small (i.e.,
$D$ starts with 0.10 in its true form) or add/subtract $D/2$ if $D$ is too large (i.e.,
$D$ starts with 0.11 in its true form) [11]. The output of the primary adder
is normally used, unless the output of the alternate adder results in a larger
normalizing shift.
The idea of using multiples of $D$ can be extended to the use of $3D/2$ and
$3D/4$ in addition to $D$ itself. These provide an even higher overall average shift
(of about 3.7), but require a more complex implementation [11].
Scheme (2) is based on the fact that for $K = 1/2$, the ratio $D/K$ in the
optimal range $3/5 \le D \le 3/4$ is
$$\frac{6}{5} \le \frac{D}{K} = \frac{D}{1/2} \le \frac{3}{2} \quad \text{or} \quad \frac{6}{5} K \le D \le \frac{3}{2} K. \tag{7.6}$$
If the given $D$ is not in the optimal range for $K = 1/2$, we can choose a different
comparison constant $K$. Consequently, for different ranges of $D$ there are
different values of $K$.
A numerical search for $K$ [13] has shown that satisfactory results can
be obtained if the region $1/2 \le |D| < 1$ is divided into five (not equally sized)
subregions, each having a different comparison constant $K_i$, as depicted in Figure
7.3. Note that four bits of the divisor have to be examined in order to select the
comparison constant, which in turn has only four bits to be compared to the four
most significant bits of the remainder. The determination of the subregions for
the divisor and the corresponding comparison constants has to be done through
a numerical search. This is because both should be binary fractions, with a small
number of bits in order to simplify the resulting division algorithm.
    Divisor subregion                   Comparison constant
    1/2 (.1000)   to 9/16 (.1001)       K_1 = 3/8  (.0110)
    9/16 (.1001)  to 5/8 (.1010)        K_2 = 7/16 (.0111)
    5/8 (.1010)   to 3/4 (.1100)        K_3 = 1/2  (.1000)
    3/4 (.1100)   to 15/16 (.1111)      K_4 = 5/8  (.1010)
    15/16 (.1111) to 1.0                K_5 = 3/4  (.1100)
FIGURE 7.3 The values of the comparison constant for the five divisor subregions.
Example 7.5
We repeat the division in Example 7.2 with $X = (0.00111111)_2 = 63/256$
and $D = (0.1001)_2 = 9/16$. The appropriate comparison constant for
this divisor is $K_2 = 7/16 = 0.0111_2$ (see Figure 7.3). If the remainder is
negative, it should be compared to the two's complement of $K_2$, which is
$1.1001_2$.
    r_0 = X          0.00111111
    2r_0             0.01111110    $\ge 0.0111$, set $q_1 = 1$
    Add -D         + 1.0111
    r_1              1.11101110
    2r_1 = r_2       1.11011100    $\ge 1.1001$, set $q_2 = 0$
    2r_2 = r_3       1.10111000    $\ge 1.1001$, set $q_3 = 0$
    2r_3             1.01110000    $< 1.1001$, set $q_4 = \bar{1}$
    Add D          + 0.1001
    r_4              0.00000000    zero final remainder
The quotient $Q = 0.100\bar{1} = 0.0111_2 = 7/16$ is represented in a minimal
SD form.
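Scheme (2) can be sketched in Python by combining the subregion table of Figure 7.3 with the SRT recursion. Several details here are assumptions made for illustration: each subregion is taken to include its lower endpoint (Example 7.5 confirms this only for $D = 9/16$), full-precision comparisons replace the four-bit comparisons of the text, only a positive divisor is handled, and the names `SUBREGIONS`, `comparison_constant` and `srt_with_constant` are hypothetical.

```python
from fractions import Fraction as F

# (lower end of divisor subregion, comparison constant K_i), read off Figure 7.3.
SUBREGIONS = [
    (F(1, 2), F(3, 8)),
    (F(9, 16), F(7, 16)),
    (F(5, 8), F(1, 2)),
    (F(3, 4), F(5, 8)),
    (F(15, 16), F(3, 4)),
]


def comparison_constant(d):
    """Pick K for a normalized positive divisor 1/2 <= d < 1."""
    if not F(1, 2) <= d < 1:
        raise ValueError("divisor must satisfy 1/2 <= d < 1")
    constant = None
    for lower, k in SUBREGIONS:
        if d >= lower:            # assumed: subregions are closed on the left
            constant = k
    return constant


def srt_with_constant(x, d, n):
    """SRT digit selection using K instead of 1/2; returns (digits, last r)."""
    k = comparison_constant(d)
    r, digits = F(x), []
    for _ in range(n):
        p = 2 * r
        q = 1 if p >= k else (-1 if p < -k else 0)
        r = p - q * d
        digits.append(q)
    return digits, r


print(srt_with_constant(F(63, 256), F(9, 16), 4))
```

For the operands of Example 7.5 this yields the digits 1, 0, 0, $\bar{1}$ and a zero remainder, using two add/subtract operations instead of the three needed with $K = 1/2$ in Example 7.2.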
7.2 HIGH-RADIX DIVISION
The number of add/subtract operations required by the radix-2 SRT algorithm
and its variations is data-dependent. Thus, an asynchronous circuit must be
designed in order to take advantage of the reduced number of nonzero bits in the
quotient. Consequently, attempts to increase the number of zeros in the quotient
have, in the currently available technology, very limited practical significance.
The number of add/subtract operations in the division process can be
reduced and still be data-independent by increasing the radix $\beta$ for the process,
where selecting $\beta = 2^m$ allows the generation of $m$ quotient bits at each step.
In this way, the number of steps is reduced to $\lceil n/m \rceil$. The recursive equation
for the remainder is now
$$r_i = \beta r_{i-1} - q_i \cdot D \tag{7.7}$$
where the multiplication by $\beta = 2^m$ is achieved by shifting the remainder $m$ bit
positions to the left. The digit set for the quotient is $\{0, 1, \ldots, (\beta - 1)\}$ for the
restoring division and can be as large as $\{\overline{(\beta - 1)}, \ldots, \bar{1}, 0, 1, \ldots, (\beta - 1)\}$ for the
high-radix SRT division.
A radix higher than 2 can, in principle, be used for any of the previously
mentioned division algorithms. For restoring division, this means that we start
with the initial guess $q_i = 1$ and, if the resulting remainder $\beta r_{i-1} - D$ is positive,
we increase it to $q_i = 2$ and subtract $D$ from the temporary remainder, obtaining
$\beta r_{i-1} - 2D$. The process is repeated until we reach the value $q_i = j$, for which
the temporary remainder is negative. We then restore the remainder by adding
$D$, obtaining $\beta r_{i-1} - (j - 1)D$, and set $q_i = j - 1$. This sequential procedure
can be very time-consuming, making its advantage over the binary algorithm
questionable. It can be replaced by a parallel process if several comparison
circuits comparing $\beta r_{i-1}$ to multiples of the divisor, $jD$, are included in the
division unit. The comparison circuit producing the smallest positive remainder
points to the correct quotient digit. Clearly, this implementation requires a
substantial hardware investment. Similar changes can be introduced into the
binary nonrestoring division algorithm.
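The sequential digit search for high-radix restoring division can be sketched in Python as follows. This is an illustration under stated assumptions: it tests whether another subtraction would keep the remainder nonnegative before performing it, which gives the same digit as subtracting until the remainder turns negative and then restoring; it assumes $0 \le X < D$; and the function names are hypothetical.

```python
from fractions import Fraction


def restoring_digit(shifted_r, d, beta):
    """Find q in {0, ..., beta - 1} with 0 <= shifted_r - q * d < d."""
    q, temp = 0, shifted_r
    while q < beta - 1 and temp - d >= 0:
        temp -= d                 # one more multiple of D fits
        q += 1
    return q, temp


def high_radix_restoring(x, d, m, steps):
    """Generate `steps` radix-2**m quotient digits for 0 <= x < d."""
    beta = 2 ** m
    r, digits = Fraction(x), []
    for _ in range(steps):
        q, r = restoring_digit(beta * r, d, beta)   # r_i = beta*r_{i-1} - q_i*D
        digits.append(q)
    return digits, r


# Operands of Example 7.2 in radix 4: 7/16 = (0.13) in radix 4.
print(high_radix_restoring(Fraction(63, 256), Fraction(9, 16), 2, 2))
```

The inner loop may run up to $\beta - 1$ times per digit, which is the sequential cost that the parallel comparison circuits mentioned above are meant to avoid.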
In what follows we describe the high-radix SRT algorithm. It is possible
to implement a high-radix SRT division circuit that is faster than its binary
version. The quotient digit $q_i$ in such an algorithm is a signed digit in the
range $\{\bar{\alpha}, \overline{\alpha - 1}, \ldots, \bar{1}, 0, 1, \ldots, \alpha\}$, where $\lceil \frac{1}{2}(\beta - 1) \rceil \le \alpha \le (\beta - 1)$ (see
Chapter 2).
To find out the possible choices for $\alpha$ in the high-radix division algorithm
consider the following. The quotient digit $q_i$ is ordinarily selected so that
$|r_i| < |D|$; otherwise, the next quotient digit might have to be $\beta$ or larger. This
guarantees the convergence of the division procedure. The above condition implies
that for the maximal remainder $\beta r_{i-1} = \beta(D - ulp)$ and a positive divisor
$D$, selecting the largest value for the quotient digit $q_i = \alpha$ should be sufficient to
yield a remainder $r_i$ in the allowable region. Therefore, the following inequality
should hold:
$$r_i = \beta(D - ulp) - \alpha D \le D - ulp \tag{7.8}$$
Dividing Equation (7.8) by $D$ reveals that we may select for $\alpha$ only the maximum
value $\alpha = (\beta - 1)$. It is reasonable, however, to consider division techniques
for which $|r_i| \le k|D|$, where $k$ is a fraction, since this reduces the size of the
allowable region for the partial remainder, as shown in Figure 7.4. Equation
(7.8) now takes the form
$$r_i = \beta k(D - ulp) - \alpha D \le k(D - ulp). \tag{7.9}$$
Hence, $\alpha \ge k(\beta - 1)$, and if we want to allow the selection of any value of $\alpha$ in
the range $\lceil \frac{1}{2}(\beta - 1) \rceil \le \alpha \le (\beta - 1)$, $k$ should satisfy $1/2 \le k \le 1$, since $\frac{1}{2} \le \frac{\alpha}{\beta - 1}$.
The smaller the value of $k$, the smaller the redundancy is in the number system
for the quotient.
FIGURE 7.4 The allowable region for the partial remainder ($k \le 1$).
Example 7.6
Let $\beta = 4$ and $\alpha = 2$. We set $k = \alpha/(\beta - 1) = 2/3$. For this selection,
$|r_i| \le kD = \frac{2}{3}D$, and $|\beta r_{i-1}| = |4r_{i-1}| \le \frac{8}{3}D$, or $|r_i/D| \le 2/3$ and $|4r_{i-1}/D| \le 8/3$.
The digit set for $q_i$ is $\{\bar{2}, \bar{1}, 0, 1, 2\}$. The region in which a specific value
$q$ can be selected is given by
$$-\frac{2}{3} \le \frac{4r_{i-1}}{D} - q \le \frac{2}{3}$$
and consequently,
$$-\frac{2}{3} + q \le \frac{4r_{i-1}}{D} \le \frac{2}{3} + q.$$
For example, the value $q_i = 2$ may be selected in the region $\frac{4}{3} \le \frac{4r_{i-1}}{D} \le \frac{8}{3}$,
since $-\frac{2}{3} \le (\frac{4r_{i-1}}{D} - 2) \le \frac{2}{3}$. Similarly, we may select $q_i = 1$ for
$\frac{1}{3} \le \frac{4r_{i-1}}{D} \le \frac{5}{3}$. The different regions for selecting the quotient digits are
shown in Figure 7.5. In the overlapping region, namely, $\frac{4}{3} \le \frac{4r_{i-1}}{D} \le \frac{5}{3}$,
we can select either $q_i = 1$ or $q_i = 2$. Similar overlapping regions exist for
$q_i = 0$ and $q_i = 1$, for $q_i = 0$ and $q_i = \bar{1}$, and also for $q_i = \bar{1}$ and $q_i = \bar{2}$.
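The selection regions of Example 7.6 follow directly from $-k + q \le \beta r_{i-1}/D \le k + q$ with $k = \alpha/(\beta - 1)$. The Python sketch below tabulates them; the function names are hypothetical and the intervals are expressed in units of $D$.

```python
from fractions import Fraction


def selection_intervals(beta, alpha):
    """Map each digit q to its interval (q - k, q + k) for beta*r_{i-1}/D."""
    k = Fraction(alpha, beta - 1)
    return {q: (q - k, q + k) for q in range(-alpha, alpha + 1)}


def overlap(intervals, q):
    """Region where either digit q or digit q + 1 may be selected."""
    return intervals[q + 1][0], intervals[q][1]


radix4 = selection_intervals(4, 2)      # beta = 4, alpha = 2, k = 2/3
print(radix4[2], radix4[1], overlap(radix4, 1))

maximal = selection_intervals(4, 3)     # beta = 4, alpha = 3, k = 1
print(maximal[2], maximal[1], overlap(maximal, 1))
```

For $\alpha = 2$ the overlap between $q_i = 1$ and $q_i = 2$ has width 1/3, while for $\alpha = 3$ it has width 1, showing how the overlap grows with the redundancy ratio $k$.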
In general, the ratio $k = \alpha/(\beta - 1)$ is a measure of the redundancy in
the representation of the quotient digits. The larger this ratio, the larger the
overlap regions are in the plot of $r_i/D$ versus $\beta r_{i-1}/D$. For example, if we set
$\alpha = \beta - 1 = 3$, then $k$ equals 1, which corresponds to the maximum redundancy.
In this case, the region for $q_i = 1$ is $0 \le \frac{4r_{i-1}}{D} \le 2$, and, for $q_i = 2$, it is
$1 \le \frac{4r_{i-1}}{D} \le 3$. Thus, the overlapping region where either $q_i = 1$ or $q_i = 2$ can be
selected is $1 \le \frac{4r_{i-1}}{D} \le 2$, which is larger than the corresponding overlap region
for $\alpha = 2$ (see Figure 7.5).
FIGURE 7.5 The quotient digits for $\beta = 4$ and $\alpha = 2$ (plot of $r_i/D$ versus $\beta r_{i-1}/D$, extending from $(-8/3, -2/3)$ to $(8/3, 2/3)$).
The implication of having an overlap region is that we have a choice of
values, of both the partial remainder and the divisor, that will eventually separate
the two adjacent regions corresponding to two consecutive values of the quotient
digit $q$. The selected values of the partial remainder and divisor separating the
adjacent regions of $q$ will serve as comparison constants during the execution
of the divide operation. We may therefore select these comparison constants
so that they require as few digits as possible. Such a selection will reduce the
execution time of the comparison step when determining the quotient digit.
Clearly, a larger overlap region (corresponding to a higher value of $\alpha$) may
allow us to select comparison constants with fewer digits. On the other hand, a
higher value of $\alpha$ means having to produce more multiples of the form $\alpha \cdot D$,
requiring extra hardware and/or time.
For a given $\alpha$ we need to determine the number of bits of the partial
remainder and the divisor that must be examined in order to select the quotient
digit. This is the most difficult step when developing a high-radix SRT algorithm.
It can be accomplished numerically, analytically, or graphically. A combination
of these techniques can also be employed.
To graphically determine the required number of bits of the partial remainder
and the divisor we use the P-D or partial remainder versus divisor plot
([2, 9]), like the one shown in Figure 7.6. The purpose of the P-D plot is to
indicate the regions in which given values of $q_i$ may be selected. To determine
the region for a given value $q$ of the quotient digit, consider again the basic
equation for the partial remainder, rewritten as

$$\beta r_{i-1} = r_i + q \cdot D. \tag{7.10}$$

FIGURE 7.6 A P-D plot. (Horizontal axis: $D$, marked at $D_{min}$, $D_1$, $D_2$, and $D_{max}$; the lines
shown are $P = (k + j + 1) \cdot D$, $P = (k + j) \cdot D$, $P = (-k + j + 1) \cdot D$, and $P = (-k + j) \cdot D$.)

To simplify the notation from this point on, we denote the previous partial
remainder $\beta r_{i-1}$ by $P$, as shown in Figure 7.6. The maximum value of $P$ for
which the value $q$ can be selected depends on the maximum allowed value for $r_i$
and since

$$-kD \le r_i \le kD, \tag{7.11}$$

we obtain an upper limit for $P$, for which we may select the value $q$ for the
quotient digit:

$$P_{max} = (k + q) \cdot D. \tag{7.12}$$

Similarly, the lower limit for $P$ is

$$P_{min} = (-k + q) \cdot D. \tag{7.13}$$

Equations (7.12) and (7.13) for a specific value of $q$, say, $q = j$, are represented
by two lines in Figure 7.6.

At each point in the region between these two lines we may select the
value $j$ for the quotient digit $q$. Due to the redundancy in representing the
quotient, the regions for $q = j$ and $q = j + 1$ overlap. The overlapping region
is between the upper line for $q = j$ and the lower line for $q = j + 1$, as shown
in Figure 7.6. Note that Figure 7.6 includes only positive values of the divisor
and partial remainder, and thus constitutes only one-quarter of the complete P-D
plot. However, the complete P-D plot is symmetric about both axes, and, in
most cases, it is sufficient to analyze one-quarter of it. Note also that only values
of $|D|$ in the range $[D_{min}, D_{max})$ are of interest. Examples for this range are
$[0.5, 1)$ and $[1, 2)$. The latter is applicable when dividing floating-point numbers
in the IEEE standard (see Chapter 4).

For the above-mentioned overlapping region, we have to determine the
value of $P$, which will eventually separate the selection regions of $q = j$ and
$q = j + 1$. This value will serve as a comparison constant, and the number of bits
required to represent it will determine the necessary precision when examining
the partial remainder in order to select $q$. The line separating the regions of the
partial remainder may be a straight horizontal line (one that is independent of
$D$) or a stairstep function, partitioning the range of the divisor $[D_{min}, D_{max})$
into several intervals. We may have a single horizontal line $P = c$, where $c$ is a
constant, if and only if there is a value $c$ satisfying

$$(k + j) \cdot D_{min} \ge c \ge (-k + j + 1) \cdot D_{max}. \tag{7.14}$$
This implies that the selection of $q$ will be independent of $D$ and will depend
only on $P$. If this inequality is not satisfied, we must divide $[D_{min}, D_{max})$ into
several smaller intervals. The "stepping" points determine the precision (i.e., the
number of digits) at which we have to examine $D$, while the height of the steps
determines the precision at which the partial remainder has to be examined. The
maximum width of a step between $D_1$ and $D_2$, denoted by $\Delta X$, is the horizontal
distance between the two lines defining the overlap region (see Figure 7.6). The
expression for this horizontal distance is, in general,

$$\Delta X = D_2 - D_1 = \frac{P}{-k + j + 1} - \frac{P}{k + j} = P \cdot \frac{2k - 1}{j(j + 1) + k(1 - k)}. \tag{7.15}$$

The horizontal distance $\Delta X$ is minimal when $j$ is maximal and $P$ is minimal.
The maximum value of $(j + 1)$ is $\alpha$; hence $j = \alpha - 1$; $P$ is minimal when
$D_1 = D_{min}$. Thus,

$$\Delta X_{min} = D_{min} \cdot (k + \alpha - 1) \cdot \frac{2k - 1}{\alpha(\alpha - 1) + k(1 - k)}. \tag{7.16}$$

The maximum height $\Delta Y$ of the step, or the vertical distance between the
two lines, is

$$\Delta Y = (k + j)D - (-k + j + 1)D = (2k - 1) \cdot D. \tag{7.17}$$

This vertical distance is a minimum when $D = D_{min}$. Consequently, to determine
the precision at which the partial remainder and divisor have to be
examined, it is sufficient to consider the overlapping region between $q = \alpha$ and
$q = \alpha - 1$ near $D_{min}$.

Let $N_P$ denote the number of bits of the partial remainder $P$ that have to
be examined in order to determine the correct value $q$ of the quotient digit. $N_D$
is defined similarly for the divisor. The selection of the value $q$ can be done by
a look-up table implemented, for example, in a PLA (programmable logic array)
with $N_P + N_D$ inputs. Our objective is to minimize the size of the look-up table,
which, in turn, will speed up the division process.

Let $\epsilon_P$ and $\epsilon_D$ denote the number of fractional bits within $N_P$ and $N_D$,
respectively. The precision at which the partial remainder is examined is thus
$2^{-\epsilon_P}$, and similarly $2^{-\epsilon_D}$ is the precision of the "truncated" divisor. Clearly,
these two must satisfy

$$2^{-\epsilon_D} \le \Delta X_{min} \quad \text{and} \quad 2^{-\epsilon_P} \le \Delta Y_{min}. \tag{7.18}$$

However, the inequalities provide only upper bounds for determining the precision
at which the divisor and remainder should be examined, since the two
extreme points of the interval $\Delta X$ ($\Delta Y$) may require a higher precision; i.e.,
more than $\epsilon_D$ ($\epsilon_P$) fractional bits. To check whether the computed values of $\epsilon_D$
and $\epsilon_P$ are sufficient, or whether a higher precision is required, we may use the
P-D plot. This plot can also be used to decide on the value of $q$ for each pair
of values of $P$ and $D$ when truncated to the most significant $N_P$ and $N_D$ bits,
respectively.

When making these decisions we must take into account the limited precision
of the divisor and the partial remainder. As a result, each point with the
coordinates $(P, D)$ in the P-D plot represents all the partial remainder-divisor
pairs with

$$P \le \text{partial remainder} \le P + 2^{-\epsilon_P},$$
$$D \le \text{divisor} \le D + 2^{-\epsilon_D}.$$

Therefore, we must select for the pair $(P, D)$ a value of $q$ which is legitimate for
all the pairs in the above range.

Consider, in particular, point A in Figure 7.6. For a divisor that equals
$D_2$ we may select $q = j + 1$, but for a divisor that equals $D_2 + 2^{-\epsilon_D}$ we must
select $q = j$. Consequently, we should not select $q = j + 1$ for point A or for
any other point in the overlap region whose horizontal distance from the line
$(-k + j + 1)D$ is smaller than $2^{-\epsilon_D}$.

Example 7.7
The P-D plot for $\beta = 4$, $\alpha = 2$, and $D \in [0.5, 1)$ is shown in Figure 7.7.
In this figure, the overlapping region for $q = 1$ and $q = 2$ lies between the
lines

(1) $P = (k + \alpha - 1)D = 5/3 \cdot D$, and
(2) $P = (-k + \alpha)D = 4/3 \cdot D$.

We first check the possibility of a single horizontal line, as in Equation
(7.14). $(k + 1)D_{min} = 5/3 \cdot 0.5$ and $(-k + 2)D_{max} = 4/3 \cdot 1$, and
since $5/6 < 4/3$, a single line is impossible and we must have several
divisor intervals. Next, we calculate the smallest horizontal and vertical
distances:

$$\Delta X_{min} = D_{min} \cdot \frac{5}{3} \cdot \frac{3}{20} = \frac{5}{6} \cdot \frac{3}{20} = \frac{1}{8} = 2^{-3}, \quad \text{hence } \epsilon_D \ge 3;$$

$$\Delta Y_{min} = D_{min} \cdot 1/3 = 1/6, \quad \text{hence } \epsilon_P \ge 3.$$

The P-D plot in Figure 7.7(a) includes a grid with $\epsilon_P = \epsilon_D = 3$. A precise
examination of the overlapping region between $q = 1$ and $q = 2$ in Figure
7.7(a) reveals that for the partial remainder-divisor pair (0.110, 0.100) we
cannot select a value of $q$ which will be legitimate for all the points in the
corresponding rectangle. Furthermore, it is apparent that by increasing
$\epsilon_D$ to 4 the above problem is resolved. The corresponding grid is shown in
Figure 7.7(b). We now have to decide on the value of $q$ for each rectangle
in this grid. Figure 7.7(b) shows all the possible selections of a value for $q$
for all the pairs $(P, D)$ within the overlapping regions. The heavy lines in
the figure show one possible set of lines separating the regions for different
values of $q$. Clearly, this is one out of many possible solutions, allowing
the designer to select a solution that will, for example, minimize the PLA
which implements the look-up table for $q$. Such a PLA will have $N_D + N_P$
inputs where $N_D = 4$ and $N_P = 6$, since three more bits are needed for
the integer part of the remainder and its sign ($-8/3 \le P \le 8/3$). The
number of inputs to this PLA can be reduced to $N_P + N_D - 1 = 9$ by
taking advantage of the fact that the most significant bit of $D$ is always 1
and can therefore be omitted.

Note that theoretically, for the overlapping region between $q = 1$ and
$q = 0$ we could use a single horizontal line, since $2/3 \cdot 1/2 \ge 1/3$. This,
however, will require a high-precision comparison of the partial remainder,
since $c = 1/3$ requires that all the fractional bits in the partial remainder
be compared. We therefore partition the divisor interval into two
subintervals, as shown in Figure 7.7(b). □
FIGURE 7.7 The P-D plot for $\beta = 4$, $\alpha = 2$, and $D \in [0.5, 1)$: (a) $\epsilon_P = \epsilon_D = 3$; (b) $\epsilon_P = 3$, $\epsilon_D = 4$.
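The analytic part of this procedure, namely the single-line test (7.14) and the bounds obtained from (7.16) and (7.17), can be evaluated directly. The following Python sketch does so with exact fractions; the function names are illustrative, and $D_{max}$ is passed as the open upper end of the divisor range, which is an assumption of this sketch.

```python
from fractions import Fraction

def single_line_possible(k, j, d_min, d_max):
    """Inequality (7.14): a constant c exists iff (k+j)*D_min >= (-k+j+1)*D_max."""
    return (k + j) * d_min >= (-k + j + 1) * d_max

def min_distances(beta, alpha, d_min):
    """Smallest step width (7.16) and smallest step height (7.17 at D = D_min)."""
    k = Fraction(alpha, beta - 1)
    dx = d_min * (k + alpha - 1) * (2 * k - 1) / (alpha * (alpha - 1) + k * (1 - k))
    dy = (2 * k - 1) * d_min
    return dx, dy

def min_fraction_bits(distance):
    """Smallest eps with 2**-eps <= distance (only a lower bound on the precision)."""
    eps = 0
    while Fraction(1, 2 ** eps) > distance:
        eps += 1
    return eps

dx, dy = min_distances(4, 2, Fraction(1, 2))
print(dx, dy)                                        # 1/8 1/6
print(min_fraction_bits(dx), min_fraction_bits(dy))  # 3 3
```

For $\beta = 4$, $\alpha = 2$, $D_{min} = 1/2$ this reproduces the values of Example 7.7. As the text notes, these are only starting values; the grid still has to be checked, which is why $\epsilon_D$ had to be raised to 4.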
Example 7.8
Let $X = (0.00111111)_2 = 63/256$ and $D = (0.1001)_2 = 9/16$ as in Example 7.5.
For this divisor, the comparison constants for the partial remainder
are, according to Figure 7.7(b), 1/4 (0.010) and 7/8 (0.111).

    r_0 = X      0 0 . 0 0 1 1 1 1 1 1
    4r_0         0 0 . 1 1 1 1 1 1        >= 7/8     set q_1 = 2
    Add -2D      1 0 . 1 1 1 0
    r_1          1 1 . 1 1 0 1 1 1
    4r_1         1 1 . 0 1 1 1            < -1/4     set q_2 = -1
    Add D        0 0 . 1 0 0 1
    r_2          0 0 . 0 0 0 0            zero final remainder

The resulting quotient is $Q = (0.2\bar{1})_4 = (0.100\bar{1})_2 = (0.0111)_2 = 7/16$. □

The entries of the look-up table for the above algorithm can also be calculated
numerically (instead of graphically) with the number of inputs to the
look-up table determined by a trial-and-error numerical search. Suppose, for
example, that we start with an initial guess of $\epsilon_D = 3$ and $\epsilon_P = 3$ and we
attempt to calculate the value of $q$ for $D = 0.100$ and $P = 0.110$. Since we
truncated the divisor, we need to consider divisors from 0.100 to 0.101. Similarly,
the partial remainder could have a value from 0.110 to 0.111. Thus, $P/D$
could be as small as 0.110/0.101 or as large as 0.111/0.100. The first equals
0.110/0.101 = 1.2 and, according to Figure 7.5, requires $q = 1$, while the second
equals 0.111/0.100 = 1.75, requiring $q = 2$. We therefore conclude that the
above precision of $P$ and $D$ is insufficient. We must increase the number of bits
of either the divisor or the partial remainder and try again. A simple program
could be prepared to perform this numerical search and determine the value of
$q$ for each $(P, D)$ pair [6].

To illustrate the lower precision of comparison needed for a higher value
of $\alpha$ (i.e., a higher level of redundancy) consider the following example.

Example 7.9
For $\beta = 4$ and $\alpha = 3$, $k$ is $\alpha/(\beta - 1) = 1$. The region for $q = 2$ is between
the lines $P = (k + q)D = 3D$ and $P = (-k + q)D = D$, while the region
for $q = 3$ is between the lines $P = 4D$ and $P = 2D$. Thus, the overlapping
region for $q = 2$ and $q = 3$ is between the lines $P = 3D$ and $P = 2D$,
as shown in Figure 7.8. For $D \in [1, 2)$, as in the IEEE floating-point
standard, we have the following inequalities:

$$\Delta X_{min} = D_{min} \cdot 3 \cdot \frac{1}{6} = \frac{3}{6} = 2^{-1}, \quad \text{hence } \epsilon_D \ge 1;$$

$$\Delta Y_{min} = D_{min} \cdot 1 = 1, \quad \text{hence } \epsilon_P \ge 0.$$

To obtain the values of the comparison constants we have to examine the
diagram in Figure 7.8. We conclude that $\epsilon_D = 1$ and $\epsilon_P = 0$ and therefore,
$N_D = 2$ and $N_P = 4$, instead of $N_D = 4$ and $N_P = 6$ for $\alpha = 2$. A less
complicated quotient selection logic is needed here, but the multiple $3D$
is required, which is costly to generate. □

FIGURE 7.8 The P-D plot for $\beta = 4$, $\alpha = 3$, and $D \in [1, 2)$.

Example 7.10
Let $X = (01.0101)_2 = 21/16$ and $D = (01.1110)_2 = 15/8$. For this divisor,
the partial remainder comparison constants are, according to Figure 7.8,
1, 2 and 4.

    r_0 = X          0 1 . 0 1 0 1
    4r_0         0 1 0 1 . 0 1 0         >= 4.0     set q_1 = 3
    Add -3D      1 0 1 0 . 0 1 1
    r_1          1 1 1 1 . 1 0 1
    4r_1         1 1 1 0 . 1 0 0         >= -2.0    set q_2 = -1
    Add D        0 0 0 1 . 1 1 1
    r_2          0 0 0 0 . 0 1 1         final remainder = 3/8 · 2^{-4}

The quotient is $Q = (0.3\bar{1})_4 = (0.1011)_2 = 11/16$. We verify the result
of the divide operation through $X = Q \cdot D + R = 11/16 \cdot 15/8 + 3/128
= 168/128 = 21/16$. □
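Examples 7.8 and 7.10 can be replayed with a small behavioral model of radix-4 division driven by comparison constants. The Python sketch below is only an illustration under stated assumptions: it compares the exact partial remainder (not a truncated two's complement value) against the constants, and it treats negative remainders by symmetry. The names `select_digit` and `srt_divide` are hypothetical.

```python
from fractions import Fraction

def select_digit(p, constants):
    """constants: (threshold, digit) pairs for P >= 0, largest threshold first.
    Negative P is handled by symmetry, an assumption of this sketch."""
    sign, mag = (1, p) if p >= 0 else (-1, -p)
    for threshold, digit in constants:
        if mag >= threshold:
            return sign * digit
    return 0

def srt_divide(x, d, constants, steps, beta=4):
    """Return (digits, quotient, final remainder scaled by beta**-steps)."""
    r, digits = x, []
    for _ in range(steps):
        p = beta * r                      # P = beta * r_{i-1}
        q = select_digit(p, constants)
        r = p - q * d                     # r_i = beta * r_{i-1} - q_i * D
        digits.append(q)
    quotient = sum(Fraction(q, beta ** (i + 1)) for i, q in enumerate(digits))
    return digits, quotient, r * Fraction(1, beta ** steps)

# Example 7.8: alpha = 2, constants 7/8 and 1/4 for D = 9/16
print(srt_divide(Fraction(63, 256), Fraction(9, 16),
                 [(Fraction(7, 8), 2), (Fraction(1, 4), 1)], steps=2))
# Example 7.10: alpha = 3, constants 4, 2 and 1 for D = 15/8
print(srt_divide(Fraction(21, 16), Fraction(15, 8),
                 [(4, 3), (2, 2), (1, 1)], steps=2))
```

The first call yields the digits 2, -1 with quotient 7/16 and a zero remainder; the second yields 3, -1 with quotient 11/16 and remainder 3/128, so that $X = Q \cdot D + R$ holds in both cases.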
7.3 SPEEDING UP THE DIVISION PROCESS
A major reason for the low speed of the division process is the fact that we have to
complete the $i$th step before continuing to the $(i+1)$th step. The multiply process
can be accelerated easily by generating several partial products simultaneously,
since they are independent. In contrast, the steps in the division process are
dependent, and we cannot start a new step before the current remainder is
known and a new quotient digit is selected. Each step in the division consists
of two substeps. First, a quotient digit is selected, and then the new partial
remainder is calculated.
We can speed up the high-radix division process described in Section 7.2
in one of two ways. One way is to overlap the full-precision calculation of the
partial remainder in step $i$ with the selection process of the quotient digit in
step $(i + 1)$. This overlapping is possible, since not all bits of the new partial
remainder must be known in order to select the next quotient digit. Another way
to speed up this process is to replace the carry-propagate add/subtract operation
for calculating the new partial remainder by a carry-save operation.
In the first method, a truncated approximation of the new partial remainder
is calculated in parallel to the full-precision calculation of the partial
remainder. This approximation can be obtained at a high speed, enabling us
to prepare for the new step (i.e., determine the quotient digit) even before the
current step is completed.
Therefore, instead of first completing the calculation of the partial remainder
$r_{i-1}$ (with complete carry propagation) in step $(i - 1)$, and then inputting the
$N_P$ most significant bits to the PLA to determine $q_i$ in step $i$, we can use a small,
fast adder that has as inputs the most significant bits of the previous partial
remainder, denoted by $\widehat{\beta r_{i-2}}$, and the most significant bits of the corresponding
multiple of the divisor, denoted by $\widehat{q_{i-1} D}$, as depicted in Figure 7.9.

FIGURE 7.9 A quotient digit selection logic.
This approximate partial remainder (APR) adder produces an approximation
of the required $N_P$ most significant bits of the new partial remainder,
denoted by $\widehat{r_{i-1}}$, before the full-precision add/subtract operation ($r_{i-1} =
\beta r_{i-2} - q_{i-1} D$) is completed. This allows us to perform a look-ahead quotient
digit selection, and we can select $q_i$ in parallel with the full-precision calculation
of the partial remainder $r_{i-1}$. Clearly, the size of this APR adder should be
determined so that sufficiently accurate $N_P$ bits will be generated. Since the
uncertainty in the result of this adder is larger than the uncertainty in the truncated
previous partial remainder $\beta r_{i-2}$, we may need additional input bits to
the quotient digit look-up table.
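The larger uncertainty mentioned above arises because both adder inputs are truncated before they are added. The Python sketch below illustrates this with exact fractions; `truncate` and `apr_add` are hypothetical helper names, and the operand values and the choice of three fractional bits are arbitrary illustrations rather than the parameters of any particular design.

```python
from fractions import Fraction
from math import floor

def truncate(x, frac_bits):
    """Drop all bits below 2**-frac_bits (two's complement truncation = floor)."""
    scale = 2 ** frac_bits
    return Fraction(floor(x * scale), scale)

def apr_add(p_prev, multiple, frac_bits):
    """Add only the leading bits of beta*r_{i-2} and of the multiple -q_{i-1}*D."""
    return truncate(p_prev, frac_bits) + truncate(multiple, frac_bits)

p_prev = Fraction(63, 64)          # illustrative value of beta * r_{i-2}
multiple = -Fraction(39, 32)       # illustrative value of -q_{i-1} * D
exact = p_prev + multiple
approx = apr_add(p_prev, multiple, frac_bits=3)
print(exact - approx)              # 9/64, more than one unit of 2**-3
```

Truncating a single value loses less than $2^{-f}$ for $f$ retained fractional bits, whereas the sum of two truncated values can be off by almost $2 \cdot 2^{-f}$, as the printed error shows.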
Example 7.11
For $\beta = 4$, $\alpha = 2$, and $D \in [1, 2)$, the P-D plot is shown in Figure 7.10.
An APR adder of the convenient size of eight bits is sufficient to generate
the necessary inputs to the quotient selection PLA [17]. The horizontal
lines in Figure 7.10 were determined to reduce the complexity of the PLA.
Only three divisor bits are needed as inputs to the PLA, since the most
significant bit of the divisor is always 1. As for the partial remainder,
out of the outputs of the APR adder, five bits (including the sign bit) are
sufficient in most cases. For a positive partial remainder, only in three
cases (marked by a, b, and c in Figure 7.10) is an additional bit required.

In case a, $D = 1.001$ and $P = 1.1$ and the single fractional bit of $P$ is
insufficient. The divisor can assume any value from 1.001 to 1.010. The
partial remainder can have a value from 1.1 to 10.0. Thus, the range for
$P/D$ is from 1.1/1.010 = 1.2 to 10.0/1.001 = 1.77. The first requires
$q = 1$, while the second requires $q = 2$ (see Figure 7.5). Adding a second
fractional bit to $P$ solves the problem, allowing us to select $q = 1$ for
$P = 1.10$ and $q = 2$ for $P = 1.11$.

In cases b and c, the extra fractional bit of $P$ is required, since the 8-bit
APR adder may introduce an additional truncation error, further increasing
the range of $P/D$. Consider, for example, case b, where $D = 1.100$
and $P = 10.0$. If no APR adder is used, the range for $P/D$ is from
10.0/1.101 = 1.23 to 10.1/1.100 = 1.66 and $q = 1$ can be selected. The
8-bit APR adder introduces an error of up to $2^{-6}$ in $r_{i-1}$, which increases
to $2^{-4}$ after the multiplication by 4. This additional error increases the
maximum value of $P/D$ from 1.66 to 1.7, requiring $q$ to be 2. An extra
fractional bit of $P$ solves this problem.

For a negative partial remainder represented in two's complement, there
are six cases where 1 or even 2 additional output bits of the APR adder
are required to guarantee the correct selection of the quotient digit [17].
□
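The reasoning used in case a, and in the trial-and-error numerical search of Section 7.2, amounts to asking which digits are legitimate over the whole rectangle of values that a truncated pair $(P, D)$ can represent. A minimal Python sketch of that test follows; it covers only $P \ge 0$ and $D > 0$, treats the rectangle as closed (a slightly conservative assumption), and `legal_digits` is an illustrative name.

```python
from fractions import Fraction

def legal_digits(p, d, eps_p, eps_d, beta=4, alpha=2):
    """Digits q valid for every pair in [p, p + 2**-eps_p] x [d, d + 2**-eps_d]."""
    k = Fraction(alpha, beta - 1)
    p_hi = p + Fraction(1, 2 ** eps_p)
    d_hi = d + Fraction(1, 2 ** eps_d)
    # q is legal iff (q - k) * D <= P <= (q + k) * D at the worst corners.
    return [q for q in range(0, alpha + 1)
            if (q - k) * d_hi <= p and p_hi <= (q + k) * d]

# Case a of Example 7.11: D = 1.001, P = 1.1 with one fractional bit of P
print(legal_digits(Fraction(3, 2), Fraction(9, 8), eps_p=1, eps_d=3))   # []
# With a second fractional bit of P
print(legal_digits(Fraction(3, 2), Fraction(9, 8), eps_p=2, eps_d=3))   # [1]
print(legal_digits(Fraction(7, 4), Fraction(9, 8), eps_p=2, eps_d=3))   # [2]
```

An empty list signals insufficient precision. The same call with $P = 0.110$, $D = 0.100$ and $\epsilon_P = \epsilon_D = 3$ also returns an empty list, while $\epsilon_D = 4$ returns `[2]`, in line with Example 7.7.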
FIGURE 7.10 The P-D plot for $\beta = 4$, $\alpha = 2$, and $D \in [1, 2)$.

In the scheme described above, the time needed for each step of the division
is primarily determined by the time required to perform the add/subtract
operation for calculating the new partial remainder, since the quotient digit was
selected in the previous step. The second method for speeding up the division
process avoids the time-consuming carry propagation when calculating the new
partial remainder. Since a truncated partial remainder is sufficient for selecting
the next quotient digit, there is no need to complete the calculation of the partial
remainder at any intermediate step in the division. Thus, instead of using
a carry-propagating adder to calculate the new partial remainder, we can use a
carry-save adder and represent the partial remainder in a redundant form using
two sequences of intermediate sum bits and carry bits. These should be stored
in two separate registers as shown in Figure 7.11. Only the most significant sum
and carry bits must be assimilated using the APR adder in order to generate an
approximate partial remainder and allow the selection of the quotient digit. In
this case, the calculation of the approximate partial remainder and the selection
of the quotient digit are the most time-consuming operations. Thus, in each step
of the division process a carry-save adder calculates the partial remainder, and
the APR adder then accepts the most significant sum and carry bits of the partial
remainder and generates the required inputs to the quotient selection PLA.
As in the first method, the number of inputs to the PLA and its entries need
to be calculated, taking into account the uncertainty in the sum and carry bits
representing the truncated remainder.

FIGURE 7.11 An SRT divider with redundant remainder.
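The redundant sum/carry form and the assimilation of its leading bits can be modeled on integer words. The Python sketch below is a behavioral illustration only: the word width, the number of assimilated bits, and the function names are assumptions made for the example, not parameters of the divider in Figure 7.11.

```python
def carry_save_add(a, b, c, width):
    """3-to-2 compression of three width-bit two's complement words."""
    mask = (1 << width) - 1
    s = (a ^ b ^ c) & mask                               # bitwise sum, no carry propagation
    carry = (((a & b) | (a & c) | (b & c)) << 1) & mask  # carries, moved one position left
    return s, carry

def to_signed(word, width):
    """Interpret a width-bit word as a two's complement integer."""
    return word - (1 << width) if word >> (width - 1) else word

def approx_top_bits(s, carry, width, top):
    """Assimilate only the `top` most significant bits of the sum and carry words."""
    shift = width - top
    return to_signed(((s >> shift) + (carry >> shift)) & ((1 << top) - 1), top)

width = 8
a, b = 61, (-40) & 0xFF                  # 61 and -40 as 8-bit words
s, carry = carry_save_add(a, b, 0, width)
print(to_signed((s + carry) & 0xFF, width))            # 21
print(approx_top_bits(s, carry, width, top=4))         # 1, i.e. about 16 of the 21
```

The full value is recovered only by a carry-propagating addition of the two words, while the short adder on the leading bits underestimates it by less than two units of the last assimilated position.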
Example 7.12
An algorithm for high-speed division with $\beta = 4$, $\alpha = 2$, and $D \in [1, 2)$
has been presented in [7]. The partial remainder is calculated in a carry-save
manner and, consequently, two registers are needed to store the sum
bits and the carry bits separately, resulting in a somewhat more complex
design. An 8-bit APR adder is used to generate the most significant partial
remainder bits that are needed as inputs to the quotient selection PLA.
The inputs to this APR adder are the eight most significant sum bits
and carry bits in the redundant representation of the partial remainder.
The outputs of the APR adder are then converted to a sign-magnitude
representation and, as a result, only four bits of the approximate partial
remainder are needed in most cases. Only in four cases is an additional
bit required, yielding a very simple PLA. □
Further speed-up of the SRT division can be achieved by increasing the
radix $\beta$ of the algorithm to 8 or even higher. This reduces the number of steps
to $\lceil n/3 \rceil$ or lower. Several such radix-8 SRT dividers have been implemented
[8, 10]. The main disadvantage of the radix-8 (or higher) SRT algorithm is the
high complexity of the quotient selection PLA, which then becomes the most
time-consuming unit of the divider in Figure 7.11.
One way to avoid the need for a very complex quotient selection PLA
is to implement a radix-$2^m$ SRT unit as a set of $m$ overlapping radix-2 SRT
stages. The radix-2 SRT requires a very simple quotient selection logic since $q_i$
($q_i \in \{-1, 0, 1\}$) is solely determined by the remainder and is independent of
the divisor. We must, however, overlap the quotient selections for the $m$ bits so
that all $m$ quotient bits will be generated in one step of the process. Figure 7.12
depicts two overlapping radix-2 SRT stages which generate two quotient bits ($q_i$
and $q_{i+1}$) in one step, implementing a radix-4 division.
Based on the most significant bits of the two remainder sequences (the
sum and carry sequences), a value for $q_i$ is generated using a Qsel unit. In
parallel, all three possible values of $q_{i+1}$ are generated using three Qsel units.
These values correspond to the three possible intermediate remainders, namely,
$2r_{i-1} - D$, $2r_{i-1}$ and $2r_{i-1} + D$. Note, however, that only the most significant bits
of these three remainders have to be generated. Once $q_i$ is known, the correct
value of $q_{i+1}$ is selected. This value is then used to select the correct multiple of
the divisor to form the new remainder which will be stored in the two registers
(for the sum and carry sequences). The overall delay of the radix-4 circuit in
Figure 7.12 is determined by the delay of a Qsel, the delay of two multiplexor
units and the delay of the final CSA unit. This delay may be shorter than the
delay of a radix-4 stage due to the higher complexity of the radix-4 quotient
selection PLA [14].
Extending the above technique to radix-8 SRT division necessitates a more
complex quotient selection circuit since three quotient digits (namely, $q_i$, $q_{i+1}$
and $q_{i+2}$) must be generated in parallel. For generating $q_{i+1}$ the speculative
remainders $2r_{i-1} - D$, $2r_{i-1}$ and $2r_{i-1} + D$ have to be calculated. For generating
$q_{i+2}$ the speculative remainders $4r_{i-1} - 3D$, $4r_{i-1} - 2D$, $4r_{i-1} - D$, $4r_{i-1}$,
$4r_{i-1} + D$, $4r_{i-1} + 2D$, and $4r_{i-1} + 3D$ must be calculated (again, only the most
significant bits of these seven remainders). This implies that seven Qsel units are
required with multiplexors (controlled by $q_i$ and $q_{i+1}$) to select the correct value
of $q_{i+2}$ [14]. Extending the above to four overlapping radix-2 stages to obtain a
radix-16 divider will result in an increase in the total number of Qsel units from
11 to 26, making this approach more costly. Another alternative for a radix-16
divider is to use two overlapping radix-4 SRT stages [18].

FIGURE 7.12 Two overlapping radix-2 SRT stages.
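The speculative selection performed by the overlapped stages can be sketched behaviorally. The Python fragment below assumes the usual radix-2 SRT rule with thresholds $\pm 1/2$ for a divisor in $[1/2, 1)$ and works on exact remainders rather than on a few leading sum and carry bits; the function names are illustrative.

```python
from fractions import Fraction

def qsel(two_r):
    """Radix-2 SRT selection on the shifted remainder 2r (divisor in [1/2, 1))."""
    half = Fraction(1, 2)
    if two_r >= half:
        return 1
    if two_r < -half:
        return -1
    return 0

def overlapped_radix4_step(r, d):
    """Produce (q_i, q_{i+1}, new remainder) from two overlapped radix-2 stages."""
    q_first = qsel(2 * r)
    # Speculate q_{i+1} for all three candidate remainders 2r - D, 2r and 2r + D,
    speculative = {q: qsel(2 * (2 * r - q * d)) for q in (1, 0, -1)}
    q_second = speculative[q_first]          # then pick one by multiplexing on q_i.
    return q_first, q_second, 4 * r - (2 * q_first + q_second) * d

def qsel_units(m):
    """Qsel units for m overlapped radix-2 stages: 1 + 3 + 7 + ... + (2**m - 1)."""
    return sum(2 ** s - 1 for s in range(1, m + 1))

r, d = Fraction(63, 256), Fraction(9, 16)
q1, q2, r = overlapped_radix4_step(r, d)
q3, q4, r = overlapped_radix4_step(r, d)
print((q1, q2, q3, q4), r)                 # (0, 1, 1, 1) 0
print(qsel_units(3), qsel_units(4))        # 11 26
```

The four bits 0, 1, 1, 1 form the quotient 7/16 of Example 7.8, obtained here in two radix-4 steps, and the unit counts match the figures of 11 and 26 quoted above.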
7.4 ARRAY DIVIDERS
All algorithms for division can be implemented using an array of cells where
each step of the algorithm is executed by a separate row of cells. Thus, $n$ rows of
cells with $n$ cells per row are required to implement a radix-2 division algorithm.
If the restoring algorithm is selected for implementation, then the difference in
each row between the previous partial remainder and the divisor is formed, and
the quotient bit is generated according to the sign of this difference. There is no
need, however, to restore the partial remainder if the quotient bit is determined
to be 0. Instead, according to the generated quotient bit, either the previous
partial remainder or the difference (which constitutes the new partial remainder)
is transferred to the next row. If a ripple-carry scheme is employed in every row,
then it takes $n$ steps to propagate the carry in a single row and, since there are
$n$ rows, the total execution time is of the order of $n^2$.
Similarly, we can implement a nonrestoring division array. It has about
the same speed as the restoring array, and its only advantage is its ability to handle
negative operands in a simple way. On the other hand, the final remainder
may be incorrect, having a sign opposite to that of the dividend. An example
of such an array divider is shown in Figure 7.13, where $x_0.x_1 \cdots x_6$ is the dividend,
$d_0.d_1 d_2 d_3$ is the divisor, $q_0.q_1 q_2 q_3$ is the quotient, and $r_0.r_1 r_2 r_3$ is the final
remainder [3].

The operation (addition or subtraction) to be performed in a given row is
controlled by the signal $T$ (see Figure 7.13(a)). If $T = 0$ an addition is performed,
while if $T = 1$, a subtraction is executed by adding the two's complement of the
divisor, which is assumed (in the implementation depicted in Figure 7.13) to be
positive (i.e., $d_0 = 0$). The latter is done by forming the one's complement of the
divisor and forcing a carry of 1 into the rightmost cell by connecting it to $T$. The
generated quotient and remainder are represented in two's complement, but the
final remainder is not always correct. To show that the quotient generated by the
array divider in Figure 7.13 is correct, note that in each step of the nonrestoring
division, the partial remainder and the multiple of the divisor ($\pm D$) always have
opposite signs. Therefore, the carry-out from the leftmost cell equals 1 if the
sign of the new partial remainder is 0, and vice versa. Hence, $c_{out} = 1$ implies
that the operation in the next row should be subtraction ($T = 1$), since the
divisor is assumed to be positive. Similarly, $c_{out} = 0$ generates $T = 0$, so an
add operation should be performed in the next row. The values $c_{out} = 1$ and
$c_{out} = 0$ for row $i$ correspond to $q_{i+1} = 1$ and $q_{i+1} = \bar{1}$, respectively, where $q_{i+1}$
is the $(i + 1)$th quotient bit. This is identical to the relationship between $p$ and $q$
in the algorithm for converting the representation of the quotient that uses the
digit set $\{\bar{1}, 1\}$ to the equivalent two's complement representation, as described
in Section 3.3.
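The row-by-row control described above can be mimicked in a few lines. The Python sketch below is a behavioral model under simplifying assumptions: it works on exact fractions instead of bit-level CAS cells, shifts the remainder before every row, and takes $T = 1$ in the first row for a nonnegative dividend. The name `nonrestoring_rows` is illustrative.

```python
from fractions import Fraction

def nonrestoring_rows(x, d, rows):
    """x: dividend with |x| < d, d: positive divisor, rows: number of array rows."""
    r, t, digits = x, (1 if x >= 0 else 0), []
    for _ in range(rows):
        r = 2 * r - d if t else 2 * r + d    # T = 1: subtract, T = 0: add
        digits.append(1 if t else -1)         # quotient digit applied in this row
        t = 1 if r >= 0 else 0                # c_out of this row drives T of the next
    q = sum(Fraction(dig, 2 ** (i + 1)) for i, dig in enumerate(digits))
    return digits, q, r * Fraction(1, 2 ** rows)

print(nonrestoring_rows(Fraction(63, 256), Fraction(9, 16), 4))
print(nonrestoring_rows(Fraction(1, 4), Fraction(3, 4), 3))
```

The first call produces the digits 1, -1, 1, 1, that is, the quotient 7/16 with a zero remainder. The second call returns the quotient 3/8 with remainder -1/32, a case in which the final remainder has a sign opposite to that of the dividend even though $X = Q \cdot D + R$ still holds.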
In the two previously presented array dividers a complete add/subtract
operation with carry-propagation is performed by each row in the array. In the
nonrestoring division, only the sign bit of the partial remainder is needed to select
the quotient bit. This sign bit can be generated by using a fast carry-look-ahead
circuitry, while the other bits of the partial remainder can be generated using
carry-save adders. Each cell generates a $P_i$ and $G_i$ output (carry-propagate and
carry-generate, respectively, as in a carry-look-ahead adder) in addition to the
ordinary sum and carry outputs. The $P_i$ and $G_i$ outputs of all cells in the same
row are connected to a carry-look-ahead circuit, which generates the quotient bit.
The execution time of such an array divider is of the order of $n \log n$, compared
to $n^2$ for the previous two array dividers [3].
In a similar way, we can implement a high-radix division array with carry-save
addition. Here, a small carry-look-ahead adder is used to determine the
correct most significant bits of the partial remainder in order to select the quotient
digit. Due to the similarity between the basic cell in array multipliers and
array dividers, a controlled array multiplier/divider can be designed. Several
such circuits have been described; e.g., [1].

FIGURE 7.13 A nonrestoring array divider: (a) controlled add/subtract (CAS) cell; (b) the array.
7.5 FAST SQUARE ROOT EXTRACTION

As pointed out in Section 3.4, the similarities between square root extraction and the divide operation allow the adaptation of almost all the algorithms that have been developed for division to the calculation of the square root, with only some minor modifications. Consequently, small extensions to the hardware designed for a division unit enable the calculation of the square root (e.g., [7], [17], [21]).

The nonrestoring algorithm for square root extraction that has been presented in Section 3.4 allows the use of the digits 1 and $\bar{1}$ for $q_i$, where $Q = 0.q_1 q_2 \cdots q_m$ is the calculated square root. Allowing $q_i$ to assume the value 0 has two important advantages. First, a shift-only operation is required when $q_i = 0$, reducing the number of add/subtract operations that must be performed. Second, having an overlap between the region of the remainder $r_i$ where $q_i = 1$ is selected and the region where $q_i = 0$ is selected leads to a reduced precision inspection of the remainder. In the nonrestoring scheme, we must identify the case of $r_i \ge 0$ in order to correctly set the bit $q_i$. This requires precise determination of the sign bit of $r_i$. If $q_i = 0$ is allowed, a lower precision comparison is sufficient, enabling the use of carry-save adders in the calculation of the remainder $r_i$. In this case, the remainder is represented as two sequences, a partial sum sequence and a sequence of carries. Only a few high-order bits of these two sequences must be examined in order to select the bit $q_i$.

To decide on the region of $r_i$ where $q_i = 0$ can be selected, we basically follow the same idea as that behind the SRT division algorithm. We restrict the square root $Q$ to be a normalized fraction, $1/2 \le Q < 1$, with $q_1$ always equal to 1. Consequently, the radicand should satisfy $1/4 \le X < 1$. In this case, the remainder $r_{i-1}$ (for $i \ge 2$) satisfies the condition [12]

$$-2(Q_{i-1} - 2^{-i}) \le r_{i-1} \le 2(Q_{i-1} + 2^{-i}),$$

where $Q_{i-1}$ is the partially calculated root at step $(i-1)$, i.e., $Q_{i-1} = 0.q_1 q_2 \cdots q_{i-1}$. In step $i \ge 2$, we may therefore set $q_i$ equal to 0, which results in $r_i = 2r_{i-1}$, whenever $r_{i-1}$ is in the range $[-(Q_{i-1} - 2^{-i-1}), (Q_{i-1} + 2^{-i-1})]$. Hence, a possible selection rule for $q_i$ is

$$q_i = \begin{cases} 1 & \text{if } r_{i-1} \ge (Q_{i-1} + 2^{-i-1}) \\ 0 & \text{if } -(Q_{i-1} - 2^{-i-1}) \le r_{i-1} \le (Q_{i-1} + 2^{-i-1}) \\ \bar{1} & \text{if } r_{i-1} \le -(Q_{i-1} - 2^{-i-1}). \end{cases} \tag{7.19}$$

Since $(Q_{i-1} + 2^{-i-1})$ and $(Q_{i-1} - 2^{-i-1})$ are in the range $[1/2, 1]$, we may replace (7.19) by the following selection rule, which avoids a high-precision comparison:

$$q_i = \begin{cases} 1 & \text{if } 1/2 \le 2r_{i-1} \le 2 \\ 0 & \text{if } -1/2 \le 2r_{i-1} < 1/2 \\ \bar{1} & \text{if } -2 \le 2r_{i-1} < -1/2. \end{cases} \tag{7.20}$$

This selection rule is similar to the SRT rule in Equation (7.4).

Example 7.13
Let $X = 0.0111101_2 = 61/128$:

    r_0 = X                  0 0.0111101
    2r_0                     0 0.1111010    set q_1 = 1, Q_1 = 0.1
    -(0 + 2^{-1})          - 0 0.1000000
    r_1                      0 0.0111010
    2r_1                     0 0.1110100    set q_2 = 1, Q_2 = 0.11
    -(2Q_1 + 2^{-2})       - 0 1.0100000
    r_2                      1 1.1010100
    2r_2                     1 1.0101000    set q_3 = \bar{1}, Q_3 = 0.101
    +(2Q_2 - 2^{-3})       + 0 1.0110000
    r_3                      0 0.1011000
    2r_3                     0 1.0110000    set q_4 = 1, Q_4 = 0.1011
    -(2Q_3 + 2^{-4})       - 0 1.0101000
    r_4                      0 0.0001000
    2r_4                     0 0.0010000    set q_5 = 0, Q_5 = 0.10110
    r_5                      0 0.0010000
    2r_5                     0 0.0100000    set q_6 = 0, Q_6 = 0.101100
    r_6                      0 0.0100000
    2r_6                     0 0.1000000    set q_7 = 1, Q_7 = 0.1011001
    -(2Q_6 + 2^{-7})       - 0 1.0110001
    r_7                      1 1.0001111

The square root is $Q = 0.1011001_2 = 89/128$. The final remainder is $2^{-7} r_7 = -113/2^{14} = X - Q^2 = (7808 - 7921)/2^{14}$. □

The high-radix algorithms for division can also be modified to calculate the square root. The general equation for computing the new remainder is

$$r_i = \beta r_{i-1} - q_i \cdot (2Q_{i-1} + q_i \beta^{-i}), \tag{7.21}$$

where $\beta$ is the radix and the digit set for $q_i$ is $\{\bar{\alpha}, \overline{\alpha - 1}, \ldots, \bar{1}, 0, 1, \ldots, \alpha\}$, as it is for division.

For example, when $\beta = 4$, the digit set $\{\bar{2}, \bar{1}, 0, 1, 2\}$ is preferable, since it eliminates the need to generate the multiple $3Q_{i-1}$. The generation of the root multiple $q_i \cdot (2Q_{i-1} + q_i 4^{-i})$ makes the square root extraction somewhat more complex than the equivalent division. However, a careful examination of the required multiples shows that these can be easily calculated. For the positive values of $q_i$, namely, 1 and 2, we have to subtract the sequences $Q001_2$ and $Q010_2$, respectively. When $q_i = \bar{1}$, we have to add the sequence $Q00\bar{1}_2$, which has the same value as $(Q - 1)111_2$ [7]. Similarly, when $q_i$ is $\bar{2}$, we must add the sequence $Q0\bar{1}0_2$, which is equivalent to $(Q - 1)110_2$. Thus, having two registers with the values $Q$ and $Q - 1$, updated at every step, greatly simplifies the execution of the square root algorithm [7].
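
The bit-concatenation view of the radix-4 root multiples can be checked numerically. The Python sketch below is illustrative only: the function names and the encoding of $Q_{i-1}$ as an integer holding its $2(i-1)$ fraction bits are assumptions made here, not part of the text. It builds $|q_i| \cdot (2Q_{i-1} + q_i 4^{-i})$ from two registers holding $Q$ and $Q - 1$ by appending the patterns 001, 010, 111, or 110, and a second function computes the same quantity directly from Equation (7.21) for comparison.

```python
from fractions import Fraction

def root_multiple_from_registers(q, qm, i, digit):
    """Return |digit| * (2*Q + digit * 4**-i) built by bit concatenation.

    q  : Q_{i-1} held as an integer with 2*(i-1) fraction bits (assumed encoding)
    qm : the companion register, always equal to q - 1
    """
    if digit == 0:
        return Fraction(0)
    if digit > 0:
        pattern = (q << 3) | digit          # Q001 for digit 1, Q010 for digit 2
    else:
        pattern = (qm << 3) | (8 + digit)   # (Q-1)111 for -1, (Q-1)110 for -2
    # The appended pattern has weight 4**-i; digits +2 and -2 need one extra
    # left shift, expressed here as a multiplication by |digit|.
    return Fraction(abs(digit) * pattern, 4 ** i)

def root_multiple_direct(q, i, digit):
    """The same quantity computed directly from the formula."""
    q_frac = Fraction(q, 4 ** (i - 1))
    return abs(digit) * (2 * q_frac + digit * Fraction(1, 4 ** i))
```

Because the two registers are only concatenated with a fixed three-bit pattern, no carry-propagate subtraction is needed to form the multiple for a negative digit.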
Since only a low-precision comparison of the remainder is needed in order to select the quotient digit, we may perform the add/subtract operation in Equation (7.21) in a carry-save manner, and use a small carry-propagate adder to calculate the most significant bits of $r_i$. Those will then provide some of the inputs to a PLA for selecting the square root digit $q_i$, similarly to division. The other inputs to the PLA are the most significant bits of the root multiple. Several rules for selecting $q_i$ have been proposed ([4], [7], and [21]). In these rules, the intervals of the remainder determine the size of the carry-propagating adder (between 7 and 9 bits for the base-4 algorithm with the digits $\{\bar{2}, \bar{1}, 0, 1, 2\}$) and the exact PLA entries. The selected digit $q_i$ depends on the truncated remainder and the truncated root multiple. In the first step, however, no estimated root is available. Consequently, a separate PLA for predicting the first few bits of the root may be necessary.
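
For the radix-2 case, the recurrence and the selection rule of Equation (7.20) are compact enough to be sketched in software. The Python fragment below is a minimal sketch that uses exact rational arithmetic instead of carry-save hardware; the function name `srt_square_root` and its interface are illustrative assumptions. It assumes $1/4 \le X < 1$ and returns the signed digits, the root, and the final remainder $r_m$.

```python
from fractions import Fraction

def srt_square_root(x, m):
    """Radix-2 square root with digits {-1, 0, 1} chosen by rule (7.20)."""
    r = Fraction(x)          # r_0 = X
    root = Fraction(0)       # Q_0 = 0
    digits = []
    half = Fraction(1, 2)
    for i in range(1, m + 1):
        two_r = 2 * r
        if two_r >= half:
            digit = 1
        elif two_r >= -half:
            digit = 0        # shift only, no add/subtract
        else:
            digit = -1
        weight = Fraction(1, 2 ** i)
        # r_i = 2 r_{i-1} - q_i (2 Q_{i-1} + q_i 2^{-i})
        r = two_r - digit * (2 * root + digit * weight)
        root += digit * weight
        digits.append(digit)
    return digits, root, r
```

With $X = 61/128$ and $m = 7$ the sketch produces the digits 1, 1, -1, 1, 0, 0, 1, the root 89/128, and the final remainder $-113/128$, in agreement with Example 7.13.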
Example 7.14
The radix-4 divider (for double-precision floating-point numbers in the IEEE standard) reported in [7] is capable of calculating the square root as well. The P-D plot in Figure 7.10 was also used for square root extraction. Therefore, the same PLA (with 19 product terms) is used for predicting the next quotient digit and the next root digit. A separate PLA (with 28 product terms) was added to the arithmetic unit in order to provide the five most significant bits of the root. The inputs to this PLA are the six most significant bits of the significand and the least significant bit of the exponent. The latter indicates whether the exponent is odd or even. This is necessary in order to find the square root, according to the following equation:
{ vfOTl.2(E-I)/2-1023 if E is odd
v L / .2 E - 102 .i = (7.22)
..;oJ5fJ.2E/2-1-1023 if E is evcn
Note that the resulting radicand ($0.1f$ or $0.01f$) is in the range $[1/4, 1]$, yielding a square root in the required range, $[1/2, 1]$. □
7.6 EXERCISES
7.1. Show that, for a divisor $D = 3/4$, the SRT algorithm will always generate the minimal SD representation of the quotient.

7.2. Given a dividend $X = 0.1001$ and a divisor $D = 0.1010$, perform the divide operation using (a) the standard binary SRT algorithm, (b) the modified binary SRT algorithm allowing the use of the multiples $D/2$ and $2D$, and (c) the modified binary SRT algorithm with five different comparison constants.

7.3. Check whether the divisor is within the favorable region (requiring fewer add/subtract operations) for all five comparison constants $K_1, \ldots, K_5$.

7.4. Analyze the possibility of using the first two bits of the divisor as a comparison constant $K$ to speed up the SRT algorithm for a divisor in the range $[0.5, 1)$. For example, for $D = 0.1101$, the most significant bits of the partial remainder will be compared to $K = 0.11$.
(a) Find whether the number of add/subtract operations in the suggested method will be smaller than, equal to, or greater than the number of these operations in the original SRT method (with $K = 0.1$) in the following three cases:
(i) $D = 0.1001$ and $X = 0.10000111$
(ii) $D = 0.1101$ and $X = 0.01011011$
(iii) $D = 0.1111$ and $X = 0.01001011$
(b) Would you recommend the use of the above algorithm?

7.5. Derive a variation of the binary SRT division algorithm that uses several comparison constants similar to the one described in Section 7.1. Allow the comparison constants to have at most three fractional bits. The divisor domain $[0.5, 1)$ should be partitioned into subregions so that only the three most significant fractional bits of the divisor will be needed to determine the selected comparison constant.

7.6. Show that, for $\alpha = \beta - 1$, the high-radix SRT algorithm includes as a special case the high-radix nonrestoring division algorithm.

7.7. The following algorithm can be used to convert a quotient $q_1 q_2 \cdots q_n$, given in SD representation with the radix $r$ and the allowed digit set $\{\overline{(r-1)}, \ldots, \bar{1}, 1, \ldots, (r-1)\}$, to its equivalent representation $y_0 y_1 y_2 \cdots y_n$ in radix complement:
Step 1: If $q_1 < 0$ then set $y_0 = r - 1$ and $y_1 = r + q_1$
        else set $y_0 = 0$ and $y_1 = q_1$.
Step 2: Set $j = 1$.
Step 3: If $q_{j+1} < 0$ then set $y_j = y_j - 1$ and $y_{j+1} = r + q_{j+1}$
        else set $y_{j+1} = q_{j+1}$.
Step 4: If $j = n - 1$ then stop
        else set $j = j + 1$ and go to step 3.
(a) Use the above algorithm to convert the SD binary number 1111 to two's complement representation.
(b) Compare this algorithm to the one described in Section 3.3.
(c) Can the subtraction in step 3 generate a borrow that may propagate to higher-order bits?

7.8. Show the P-D plot for SRT division with $\beta = 4$, $\alpha = 3$ and $D \in [0.5, 1)$. Determine the regions for the different values of the quotient digit and the precision of the partial remainder and the divisor that is needed.

7.9. Show the second part of the P-D plot depicted in Figure 7.10 for a negative partial remainder in two's complement.

7.10. Redraw the P-D plot shown in Figure 7.10, allowing the required precision of the divisor and of the partial remainder to be 2- in most cases. Indicate the few cases where a higher precision is required.

7.11. Verify that the carry-out bits of the leftmost CASs in Figure 7.13 generate the correct quotient bits for $X = 0.011111$ and $D = 0.110$. Is the generated final remainder correct?

7.12. Can the array divider in Figure 7.13 be pipelined? What would the rate of such a pipeline be?

7.13. Show the block diagram of a binary nonrestoring array divider (with the same number of bits for the dividend and divisor, as in Figure 7.13) that generates the sign bit of the partial remainder using a carry-look-ahead circuit, while generating all other bits using carry-save addition. Design the basic cell(s) of the divider. Compare the execution time and implementation complexity of your array divider to those of the divider in Figure 7.13.

7.14. Show that exactly 11 radix-2 Qsel units are required when implementing a radix-8 divider using three overlapping radix-2 SRT stages. Estimate the delay of these three overlapping stages.

7.7 REFERENCES

[1] D. P. AGRAWAL, "High-speed arithmetic arrays," IEEE Trans. on Computers, C-28 (March 1979), 215-221.
[2] D. E. ATKINS, "Higher-radix division using estimates of the divisor and partial remainders," IEEE Trans. on Computers, C-17 (Oct. 1968), 925-934.
[3] M. CAPPA and V. C. HAMACHER, "An augmented iterative array for high-speed binary division," IEEE Trans. on Computers, C-22 (Feb. 1973), 172-175.
[4] L. CIMINIERA and P. MONTUSCHI, "Higher radix square rooting," IEEE Trans. on Computers, 39 (Oct. 1990), 1220-1231.
[5] M. D. ERCEGOVAC and T. LANG, Division and square root: Digit-recurrence algorithms and implementations, Kluwer Academic Publishers, Norwell, 1994.
[6] D. GOLDBERG, "Computer arithmetic," in Computer architecture: A quantitative approach, D. A. Patterson and J. L. Hennessy, Morgan Kaufmann, San Mateo, CA, 1990.
[7] J. FANDRIANTO, "Algorithm for high speed shared radix 4 division and radix 4 square root," Proc. of 8th Symp. on Computer Arithmetic (1987), 73-79.
[8] J. FANDRIANTO, "Algorithm for high speed shared radix 8 division and radix 8 square root," Proc. of 9th Symp. on Computer Arithmetic (1989), 68-75.
[9] C. V. FREIMAN, "Statistical analysis of certain binary division algorithms," Proc. of IRE, 49 (Jan. 1961), 91-103.
[10] R. F. HOBSON and M. W. FRASER, "An efficient maximum-redundancy radix-8 SRT division and square-root method," IEEE Journal of Solid-State Circuits, 30 (Jan. 1995), 29-38.
[11] O. L. MACSORLEY, "High-speed arithmetic in binary computers," Proc. of IRE, 49 (Jan. 1961), 67-91.
[12] S. MAJERSKI, "Square rooting algorithms for high-speed digital circuits," IEEE Trans. on Computers, C-34 (Aug. 1985), 724-733.
[13] G. METZE, "A class of binary divisions yielding minimally represented quotients," IRE Trans., EC-11 (Dec. 1961), 761-764.
[14] J. A. PRABHU and G. B. ZYNER, "167 MHz radix-8 divide and square root using overlapped radix-2 stages," Proc. of the 12th Symp. on Computer Arithmetic (July 1995), 155-162.
[15] J. E. ROBERTSON, "A new class of digital division methods," IRE Trans. on Electronic Computers, EC-7 (Sept. 1958), 218-222.
[16] J. E. ROBERTSON, "The correspondence between methods of digital division and multiplier recoding procedures," IEEE Trans. on Computers, C-19 (Aug. 1970), 692-701.
[17] G. S. TAYLOR, "Compatible hardware for division and square root," Proc. of 5th Symp. on Computer Arithmetic (1981), 127-134.
[18] G. S. TAYLOR, "Radix 16 SRT dividers with overlapped quotient selection stages," Proc. of 7th Symp. on Computer Arithmetic (June 1985), 64-71.
[19] K. D. TOCHER, "Techniques of multiplication and division for automatic binary computers," Quart. J. Mech. Appl. Math., 11, Pt. 3 (1958), 364-384.
[20] J. B. WILSON and R. S. LEDLEY, "An algorithm for rapid binary division," IRE Trans. on Electronic Computers, EC-10 (1961), 662-670.
[21] J. H. P. ZURAWSKI and J. B. GOSLING, "Design of a high-speed square root multiply and divide unit," IEEE Trans. on Computers, C-36 (Jan. 1987), 13-23.

8 DIVISION THROUGH MULTIPLICATION

The number of steps in the previously described division methods was linearly proportional to the number of bits, $n$, while in the division-by-convergence schemes, which will be described in this chapter, the number of steps is proportional to $\log_2 n$. However, the basic operation in the division-by-convergence schemes is not an add/subtract operation but the usually slower multiply operation. Hence, a fast parallel multiplier is necessary to successfully implement these schemes.

8.1 DIVISION BY CONVERGENCE
Let the divisor $D$ and the dividend $N$ be considered the denominator and numerator, respectively, of the quotient $Q$. Thus, $Q = N/D$. This holds if we multiply both numerator and denominator by the same factor $R_0$, or even by $m$ factors $R_0, R_1, \ldots, R_{m-1}$. If the factors $R_i$ are selected so that the denominator converges to 1, the numerator will converge to $Q$:

$$Q = \frac{N}{D} = \frac{N \cdot R_0 R_1 \cdots R_{m-1}}{D \cdot R_0 R_1 \cdots R_{m-1}} \rightarrow \frac{Q}{1}. \tag{8.1}$$

In Equation (8.1) only the quotient is calculated, and a separate computation is necessary if the remainder is needed. Therefore, this division scheme is more suitable for floating-point computations.

The essential step in this method is the selection of the factors to ensure the convergence of the denominator to 1. This selection is based on the following observation: Let the divisor be a normalized binary fraction $0.1xxxx$ (where each $x$ is either 0 or 1). Therefore, $1/2 \le D < 1$ and $D = 1 - y$, where $y \le 1/2$. If we select $R_0$ to be $1 + y$, then the new denominator is

$$D_1 = D \cdot R_0 = (1 - y) \cdot (1 + y) = 1 - y^2, \tag{8.2}$$

and since $y^2 \le 1/4$, the new denominator $D_1$ satisfies $D_1 \ge 3/4$, and is therefore closer to 1 than $D$. In binary notation, the new denominator has the form $D_1 = 0.11xxxx$.

In step 2 we select $R_1 = 1 + y^2$, obtaining

$$D_2 = D_1 \cdot R_1 = (1 - y^2) \cdot (1 + y^2) = 1 - y^4. \tag{8.3}$$

Now, $y^4 \le 1/16$ and, therefore, $D_2$ takes on the form $0.1111xxxx$, and is closer to 1 than $D_1$. In general, in the $(i+1)$th step the divisor $D_i$ has the form $D_i = 1 - y_i$, where $y_i = y^{2^i}$, and since $y \le 2^{-1}$, $D_i$ has at least $2^i$ leading 1's. The next multiplying factor is $R_i = 1 + y_i$ and, as a result, $D_{i+1} = D_i R_i$ will have $2^{i+1}$ leading 1's.

To show formally that the denominator converges to 1, note that

$$(1 - y) \cdot [(1 + y)(1 + y^2)(1 + y^4) \cdots] = (1 + y) \cdot [(1 - y)(1 + y^2)(1 + y^4) \cdots]. \tag{8.4}$$

The term within the brackets on the right-hand side of Equation (8.4) is the series expansion of $1/(1 + y)$ for $0 \le y \le 1/2$, hence

$$\lim_{i \to \infty} D_i = (1 + y) \cdot \frac{1}{1 + y} = 1. \tag{8.5}$$

The process of multiplying the numerator and denominator by $R_i$ is repeated until $D_i$ converges to 1, or, more precisely, to $0.11\cdots1$ (which equals $1 - ulp$). The number of leading 1's in $D_i$ is doubled at each step, and therefore the number of iterations is $m = \lceil \log_2 n \rceil$, and we say that the convergence is quadratic. The multiplying factor in each step has to be generated and then multiplied by the numerator and denominator. The multiplying factor $R_i$ has to be obtained from $D_i$. Fortunately, the relation between these two is simple: $R_i$ equals $2 - D_i$; i.e., $R_i$ is the two's complement of the fraction $D_i$. Thus, each step consists of the two multiplications

$$D_{i+1} = D_i \cdot R_i \quad \text{and} \quad N_{i+1} = N_i \cdot R_i \tag{8.6}$$

and a two's complement operation,

$$R_{i+1} = 2 - D_{i+1}. \tag{8.7}$$

Example 8.1
For the 15-bit numbers $N = 0.011\,010\,000\,000\,000 = 0.40625_{10}$ and $D = 0.110\,000\,000\,000\,000 = 0.75_{10}$ the quotient is calculated as follows: The first multiplying factor is $R_0 = 2 - D = 1.010\,000\,000\,000\,000$, and as a result, the new numerator and denominator are $N_1 = N \cdot R_0 = 0.100\,000\,100\,000\,000$ and $D_1 = D \cdot R_0 = 0.111\,100\,000\,000\,000$. The second multiplying factor is $R_1 = 2 - D_1 = 1.000\,100\,000\,000\,000$, and the next numerator and denominator are $N_2 = N_1 \cdot R_1 = 0.100\,010\,100\,010\,000$ and $D_2 = D_1 \cdot R_1 = 0.111\,111\,110\,000\,000$. Note that the number of leading 1's in the denominator has doubled from 4 to 8. The third multiplying factor is $R_2 = 2 - D_2 = 1.000\,000\,010\,000\,000$, and then $N_3 = N_2 \cdot R_2 = 0.100\,010\,101\,010\,101$ and $D_3 = D_2 \cdot R_2 = 0.111\,111\,111\,111\,111$. Convergence has been achieved ($D_3 = 1 - ulp$) in three steps and the resulting quotient is $Q = N_3 = 0.54165_{10}$. The exact result is the infinite fraction $0.54166\ldots_{10}$. □

The total number of steps (order of $\log_2 n$) is smaller than that required by the algorithms based on add/subtract operations, for which the number of steps is linear in $n$. However, each step involves two multiplications, which are more time-consuming than add/subtract operations. Thus, there is a need to further reduce the number of steps. One way to achieve this is to speed up the first few steps, where the convergence is very slow. After step 1, only two leading 1's are guaranteed, and after step 2 is completed only four 1's are guaranteed. A speed up of the first steps can substantially reduce the required number of multiplications. Instead of selecting the first multiplier to be $R_0 = 1 + y$, we may use a look-up table that provides a multiplier that ensures a denominator $D_1$ with $k$ leading 1's, where $k$ is at least 3. The next denominator will then have $2k$ leading 1's, and so on. The size of this table, which can be stored in a ROM, increases exponentially with $k$, the desired number of leading 1's. The number $k$ therefore has to be determined so that the required size of the table is still reasonable.

Example 8.2
The division scheme described above was first implemented in the IBM 360/91 for floating-point division [1]. The long format of floating-point numbers in the IBM system consists of 64 bits partitioned as follows:

| S | 7 bits - biased exponent | 56 bits - unsigned fractional significand |

The operands for the division are therefore 56-bit long fractions. If the first multiplying factor is $R_0 = 1 + y$, then $\lceil \log_2 56 \rceil = 6$ steps are needed, requiring 12 multiplications of 56 bits each. Actually, only 11 multiplications are needed, since there is no need to calculate $D_6$ in the last step; we know that it equals 1 and there is no need to calculate another multiplying factor. To reduce the number of multiplications we can use a look-up table for the first factor $R_0$ so that $D_1$ will have at least $k = 7$ leading 1's, since it leads to the sequence

$$1 \rightarrow 7 \rightarrow 14 \rightarrow 28 \rightarrow 56$$

of leading 1's. Consequently, we need only four steps, requiring seven multiplications. For this $k$, it has been determined that 7 bits of $D$ are needed and 10 bits of the factor $R_0$ have to be stored at each location, requiring a ROM of size $128 \times 10$. The contents of the row that corresponds to a given denominator $D = 1 - y$ is the approximated value of $(1 + y)(1 + y^2)(1 + y^4)$ truncated to 10 bits. At its highest precision, such a multiplier, $R_0$, would guarantee eight leading 1's. It should be clear that no error is introduced by using a multiplier out of a table, since it is used to multiply both numerator and denominator, and the previous convergence scheme is initiated at this point. □

Another way to further reduce the execution time of the division algorithm is to speed up the multiplications by using shorter multipliers, since the multiplication time increases linearly with the number of multiplier bits. Instead of using a multiplier of length $n$ bits for all multiplications, we can use a truncated multiplier for some of the products. Using a truncated multiplier will not introduce errors into the division process, since the numerator and denominator are multiplied by the same factor. Clearly, we cannot use a truncated multiplier for the last product because a high accuracy is needed at this point.

At step $(i + 1)$ we want to generate $D_{i+1}$ with $a$ ($a \ge 2^{i+1}$) leading 1's by multiplying the denominator $D_i$, which has $a/2$ leading 1's, by $R_i$. Originally, $R_i = 2 - D_i = 1 + y_i$ (where $D_i = 1 - y_i$). Instead of using $R_i$, we use a truncated multiplier $R_{iT}$ by forming the two's complement of only the first $a$ bits of $D_i$, which constitute the truncated denominator $D_{iT}$. In other words, $R_{iT} = 2 - D_{iT}$. If we use the notation $R_{iT} = 1 + y_T$, then the error in the truncated multiplier is $\delta = y_T - y_i$. This "error" is always positive and satisfies the inequality $0 \le \delta < 2^{-a}$. When multiplying the truncated multiplier by the untruncated denominator we obtain

$$D_{i+1} = D_i \cdot R_{iT} = (1 - y_i) \cdot (1 + y_T) = 1 + y_T - y_i - y_i y_T. \tag{8.8}$$

Substituting $y_T = y_i + \delta$ yields

$$D_{i+1} = 1 + \delta - y_i(y_i + \delta) = 1 - y_i^2 + \delta(1 - y_i). \tag{8.9}$$

The "error" in $D_{i+1}$ is $\delta(1 - y_i)$, which satisfies

$$0 \le \delta(1 - y_i) \le \delta < 2^{-a},$$

and we can therefore still expect to obtain $a$ leading 1's in $D_{i+1}$. The only difference is that now $D_{i+1}$ may converge toward 1 from either below (i.e., $D_{i+1} = 0.11\cdots1xxxx$) or above (i.e., $D_{i+1} = 1.00\cdots0xxxx$), since the "error" is always positive (see Exercise 8.1).

In each of the truncated multiplication factors, the first half of the bits are identical, either all 0's or all 1's. If the multiplier $R_i$ is recoded using SD representation, then these leading 0's or 1's will not generate nonzero partial products and the execution time of the multiplications will be further reduced.

Example 8.3
The fast multiplier in the floating-point arithmetic unit in the IBM 360/91 computer [1] receives operands of length 56 bits each and uses the algorithm outlined in Table 6.5 to generate partial products of the form 0, $\pm 2A$, and $\pm 4A$, where $A$ is the multiplicand. The resulting 28 partial products require 26 carry-save adders. To reduce the amount of hardware needed, a smaller carry-save addition tree for eight operands was designed. This tree is capable of accepting six new partial products and adding them to the two previous intermediate results (intermediate sum and carry), which are connected through two feedback paths as shown in Figure 5.31. This tree must be used five times in order to accumulate all 28 partial products. The carry-save tree was designed as a pipeline to allow overlapping between consecutive sets of six partial products so that the accumulation of all 28 partial products would take only six clock cycles. This overlapping is possible among the sets of partial products that correspond to the same multiply operation. Complete overlapping between two different multiply operations is also achievable as long as the number of generated partial products is less than or equal to six. In such cases, the carry-save tree is passed only once per multiply operation, and there is no need to use the feedback connections. Consequently, limiting the number of partial products to six or less can significantly speed up the execution of the several consecutive multiplications needed in the division-by-convergence algorithm.

To achieve the sequence of denominators $D_i$ with 7, 14, 28, and 56 leading ones (or zeros), we need multipliers $R_i$ with 10, 14, 28, and 56 bits, respectively. The first multiplier is read out of the ROM and generates only five partial products. The other three multipliers contain 7, 14, and 28 leading zeros (or ones) which can be skipped, and there is no need to generate any partial products for them. We only need to identify the first and last bits of such a group of identical bits. Therefore, the second multiplier, of length 14 bits, generates only the five partial products

    0. 11 11 11 1x xx xx xx

and the feedback connections in the carry-save tree are not used. The third multiplier, of length 28 bits, generates nine partial products, requiring the use of the feedback connection in the carry-save tree.

This can be avoided by additional truncation of the multiplier. To the 14 leading identical bits, we may add nine bits (for a total of 23 bits), and still generate only six partial products:

    0. 1 11 11 11 11 11 11 1x xx xx xx xx

This, however, results in a new denominator guaranteed to have only $14 + 9 = 23$ leading identical bits, instead of 28 (the proof of this is left as an exercise for the reader). The next multiplier will therefore have only 23 leading identical bits, and again we may add nine extra bits without requiring the use of the feedback connections. This multiplier will result in a denominator with $23 + 9 = 32$ leading identical bits. Forming the two's complement of this denominator will give us the next multiplier with 32 leading identical bits, which can increase the number of leading identical bits in the denominator up to 64 and achieve convergence. Since this is the last multiply operation within the division, we can afford to use the feedback connections, so there is no need to limit the number of multiplier bits and all available 56 bits can be used. The sequence of multiplication factors now contains five multipliers of lengths of 10, 14, 23, 32, and 56 bits, increasing the number of multiply operations from 7 to 9. However, all these multiplications can be overlapped, leading to a total execution time of 18 clock cycles [1]. □
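
The basic iteration of Equations (8.6) and (8.7) can be sketched with fixed-point integers. The Python code below is a minimal illustration and not the IBM 360/91 implementation: it uses full-length multipliers, truncates every product to the working word length, and omits the look-up table and the truncated multipliers. The function name `divide_by_convergence` and the scaling convention (operands held as integers scaled by $2^{bits}$) are assumptions made for this sketch.

```python
def divide_by_convergence(n, d, bits):
    """Divide n by d, both fractions scaled by 2**bits, with 1/2 <= d/2**bits < 1.

    Returns (quotient, final denominator, number of steps); the first two
    values are scaled by 2**bits.
    """
    one = 1 << bits
    if not (one >> 1) <= d < one:
        raise ValueError("divisor must be a normalized fraction")
    steps = 0
    while d < one - 1:                # stop once D_i = 0.11...1 = 1 - ulp
        r = 2 * one - d               # R_i = 2 - D_i (two's complement of D_i)
        n = (n * r) >> bits           # N_{i+1} = N_i * R_i, truncated
        d = (d * r) >> bits           # D_{i+1} = D_i * R_i, truncated
        steps += 1
    return n, d, steps
```

For the 15-bit operands of Example 8.1, `divide_by_convergence(13312, 24576, 15)` returns `(17749, 32767, 3)`, that is, $N_3 = 0.100\,010\,101\,010\,101$ and $D_3 = 1 - ulp$ after three steps.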

8.2 DIVISION BY RECIPROCATION

A somewhat different approach to performing division through multiplication is to first calculate the reciprocal of the divisor $D$ and then multiply it by the dividend to form the final quotient. The reciprocal of $D$ can be calculated using the Newton-Raphson iteration method. This is a method of finding the zero of a given function $f(x)$, where a zero of $f(x)$ is the solution of $f(x) = 0$. Let $x_0$ be the first approximation and let $x_i$ be the estimate for the zero at the $i$th step. The next estimate, $x_{i+1}$, is calculated from

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}, \tag{8.10}$$

where $f'(x_i)$ is the derivative of $f(x)$ with respect to $x$. For the function $f(x) = 1/x - D$, which has a zero at $x = 1/D$, $f'(x) = -1/x^2$, yielding

$$x_{i+1} = x_i(2 - D \cdot x_i). \tag{8.11}$$

$x_{i+1}$ converges to the reciprocal of $D$ and the convergence of this scheme is quadratic. To prove this, let $\delta_i$ denote the error in the $i$th step; i.e., $\delta_i = 1/D - x_i$. Simple algebraic manipulations show that $\delta_{i+1} = D\delta_i^2$. If $D$ is a normalized fraction (i.e., $1/2 \le D < 1$), then $\delta_i \le 1$ and the error decreases quadratically.

If the first approximation is $x_0 = 1$, then $x_1 = (2 - D)$, and

$$x_2 = (2 - D) \cdot [2 - D(2 - D)] = (2 - D) \cdot [1 + (D - 1)^2]. \tag{8.12}$$

Repeatedly substituting expression (8.11) results in

$$x_i = (2 - D)(1 + (D - 1)^2)(1 + (D - 1)^4) \cdots (1 + (D - 1)^{2^{i-1}}) = (1 - (D - 1))(1 + (D - 1)^2)(1 + (D - 1)^4) \cdots (1 + (D - 1)^{2^{i-1}}). \tag{8.13}$$

If $D$ is a normalized fraction then $(1 - D)$ is a fraction $y$ satisfying $0 < y \le 1/2$. Therefore, the binomial series in Equation (8.13) is identical to the one used in Equation (8.4) and it converges to

$$\lim_{i \to \infty} x_i = \frac{1}{1 + (D - 1)} = \frac{1}{D}. \tag{8.14}$$

As in Section 8.1, we may reduce the required number of steps by reading the first approximation out of a table, rather than setting $x_0$ equal to 1. This table, stored in a ROM, accepts the $j$ most significant digits of $D$ (except for the first digit, which is always 1), and produces an approximation $x_0$. The range $[0.5, 1)$ is divided into $2^j$ intervals (of size $\Delta = 0.5 \cdot 2^{-j}$ each) and it can be shown that the optimum value of $x_0$ for the $k$th interval ($k = 1, 2, \ldots, 2^j$) is the reciprocal of the number corresponding to the middle point of the interval (see Exercise 8.7). This middle point is $\frac{1}{2} + (k - \frac{1}{2}) \cdot \Delta$, and thus

$$x_0(k) = \frac{2^{j+1}}{2^j + k - \frac{1}{2}}. \tag{8.15}$$

This stepwise approximation for $j = 2$ is shown in Figure 8.1.

Instead of the stepwise approximation to $1/D$, a piecewise linear approximation can be employed. Its generation is more complicated but its accuracy is higher [3].
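
The table-driven start of Equation (8.15) and the iteration of Equation (8.11) can be sketched as follows. This Python fragment uses exact rational arithmetic, so it illustrates the convergence behaviour rather than any finite-precision hardware; the function names `initial_reciprocal` and `reciprocal` are illustrative assumptions, and the table is computed on the fly instead of being read from a ROM.

```python
from fractions import Fraction

def initial_reciprocal(d, j):
    """x_0(k) of Equation (8.15) for a normalized divisor 1/2 <= d < 1."""
    d = Fraction(d)
    # Interval index k = 1 .. 2**j; each interval has size 2**-(j + 1).
    k = int((d - Fraction(1, 2)) * 2 ** (j + 1)) + 1
    return Fraction(2 ** (j + 1)) / (2 ** j + k - Fraction(1, 2))

def reciprocal(d, j, iterations):
    """Approximate 1/d with Newton-Raphson steps x <- x * (2 - d * x)."""
    d = Fraction(d)
    x = initial_reciprocal(d, j)
    for _ in range(iterations):
        x = x * (2 - d * x)
    return x
```

For $D = 3/4$ and $j = 2$ the divisor falls in the third interval, whose middle point is 13/16, so the starting value is 16/13; each subsequent step then satisfies $\delta_{i+1} = D\delta_i^2$ exactly.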

FIGURE 8.1 A stepwise approximation for the reciprocal $1/D$ ($j = 2$).

Example 8.4
The implementation of a division-by-reciprocal-approximation algorithm was proposed for the ZS-1 64-bit computer [5]. This computer uses the IEEE floating-point standard and, as a result, the significand $d$ of the divisor is within the range $1 \le d < 2$. The 15 most significant bits of the divisor (excluding the hidden bit), $1.d_1 d_2 \cdots d_{15}$, are used to address a ROM look-up table for the initial approximation $x_0$. The table is of size 32K $\times$ 16 bits, producing an initial approximation in the range $0.5 \le x_0 < 1$. $x_0$ therefore has the binary form $0.1y_2 y_3 \cdots y_{16}$. This approximation is calculated by taking the reciprocal of the mid-point between $1.d_1 d_2 \cdots d_{15}$ and its successor, where the mid-point is $1.d_1 d_2 \cdots d_{15}1$. The reciprocal of the mid-point is rounded by adding $2^{-17}$, and the result is truncated to yield the required 16 bits, $0.1y_2 y_3 \cdots y_{16}$.

The precision of this initial approximation was shown to be $|x_0 - 1/d| < 1.5 \cdot 2^{-16}$ [5]. Therefore, based on the quadratic convergence of the Newton-Raphson method, only two iterations are needed to achieve the required precision of 53 significand bits. Two iterations of the form of Equation (8.11) involve four multiplications and two complement operations. As was done in the implementation of the division-by-convergence scheme (in the IBM 360/91 system), additional simplifications were performed in order to reduce the overall execution time. In the first iteration that calculates $x_1 = x_0 \cdot (2 - d \cdot x_0)$, $d$ and $x_0$ are multiplied and the complement $(2 - d \cdot x_0)$ is formed. To speed up the multiplication, the 16 bits of $x_0$ are multiplied by the 32 most significant bits of $d$, and the result is rounded to 32, instead of 48, bits. In the complement operation, only the one's complement is performed to avoid the carry-propagation in the two's complement calculation. This introduces an error of size $2^{-31}$. $x_1$ is then computed by performing the multiplication $x_1 = x_0 \cdot (2 - d \cdot x_0)$ with 16 bits of $x_0$ multiplied by the 32 most significant bits of the multiplicand, forming a 32-bit product. The resulting $x_1$ is accurate to approximately 31 bits. In the second iteration, a similar set of operations is repeated to calculate the final approximation $x_2$. In $x_1 \cdot d$, only 64 bits are generated and then only the one's complement is calculated. Next, the multiplication $x_1 \cdot (2 - d \cdot x_1)$ is performed, producing the approximated value of $1/d$. This value is then multiplied by the dividend $N$ and the resulting approximated quotient $Q'$ is rounded according to any one of the four rounding schemes supported by the IEEE floating-point standard (round-to-nearest-even, round to 0, round to $\pm\infty$). This final rounding does not guarantee an accurately rounded result for all values of the operand $d$. □

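
The construction of the initial approximation described in Example 8.4 (reciprocal of the mid-point, add $2^{-17}$, truncate to 16 bits) can be reproduced with exact rational arithmetic. The Python sketch below is an illustration under stated assumptions: the function name `zs1_initial_approximation` is hypothetical, the 15 address bits are passed as an integer, and nothing here is taken from the actual ZS-1 ROM contents.

```python
from fractions import Fraction

def zs1_initial_approximation(index):
    """Initial reciprocal estimate for the table entry addressed by d1...d15.

    index is the 15-bit address (0 .. 2**15 - 1); the divisor significand lies
    in [1 + index * 2**-15, 1 + (index + 1) * 2**-15).
    """
    midpoint = 1 + Fraction(2 * index + 1, 2 ** 16)   # 1.d1 d2 ... d15 1
    rounded = 1 / midpoint + Fraction(1, 2 ** 17)     # add 2**-17 before truncating
    return Fraction(int(rounded * 2 ** 16), 2 ** 16)  # keep 16 fraction bits
```

Evaluating the function at both ends of a table interval gives an error below $1.5 \cdot 2^{-16}$, which is consistent with the bound quoted in the example.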
A major disadvantage of most implementations of the division-by-reciprocation algorithm is that the accuracy is smaller than that achieved by the add/subtract type of division algorithms described in Chapter 7. Corrective actions can be taken to guarantee that the least significant bit is correctly rounded. However, the additional computation usually slows down the division. Thus, when deciding between a division algorithm that is based on multiplications and an algorithm based on add/subtract operations, precision, speed, and cost trade-offs must be taken into account. The final decision depends on the technology available. The precision required by the IEEE standard has been achieved at a reasonable cost and speed in the IBM RISC/6000 [7]. A double-width datapath is used there to implement the division-by-reciprocation algorithm. All operations are done in a fused multiply-add unit (see Section 6.5), resulting in a double-length estimate $Q'$ of the quotient. A remainder is then calculated (using the fused multiply-add),

$$R = N - D \times Q'.$$

This remainder is used to compute a properly rounded result (again using the fused multiply-add unit) in the desired rounding mode,

$$Q = Q' + R \times \frac{1}{D},$$

where $\frac{1}{D}$ is the result of the Newton-Raphson iterations.
A different solution is to estimate the error in $Q'$ by calculating $N' = D \cdot Q'$ [6]. If $Q'$ is sufficiently accurate (at least to $n+1$ bits, where $n$ is the number of bits in the significand), the least significant bits of $N'$ provide the direction of the error in $Q'$. Based on this information and the desired rounding mode, $Q'$ can be corrected by either adding or subtracting 1 at the $(n+1)$th bit position or by truncating it.
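The table look-up followed by two Newton-Raphson iterations can be sketched as below. This is a minimal illustration, not the hardware scheme of [5]: it uses exact rational arithmetic (Python's `fractions.Fraction`) instead of the truncated 32-bit products described above, and the function names `initial_reciprocal` and `reciprocal` are hypothetical.

```python
from fractions import Fraction

def initial_reciprocal(d: Fraction) -> Fraction:
    """16-bit initial approximation of 1/d for 1 <= d < 2 (what the ROM would store)."""
    k = int((d - 1) * 2**15)                # table address: the bits d1 ... d15
    mid = 1 + Fraction(2 * k + 1, 2**16)    # mid-point 1.d1 ... d15 1
    x = 1 / mid + Fraction(1, 2**17)        # round by adding 2^-17 ...
    return Fraction(int(x * 2**16), 2**16)  # ... then truncate to 16 fractional bits

def reciprocal(d: Fraction, iterations: int = 2) -> Fraction:
    """Newton-Raphson iteration x <- x * (2 - d * x), in exact arithmetic (an assumption)."""
    x = initial_reciprocal(d)
    for _ in range(iterations):
        x = x * (2 - d * x)
    return x

d = Fraction(3, 2)
print(float(initial_reciprocal(d)), float(reciprocal(d)))
```

With exact products, the bound $|x_0 - 1/d| < 1.5 \cdot 2^{-16}$ squares twice, which leaves an error well below $2^{-53}$ after two iterations; a real datapath with truncated products needs the rounding corrections discussed above.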
8. Division through Multiplication
8.3 EXERCISES
8.1. Show that if a truncated multiplier is used then the denominator $D_i$ will converge toward 1 from below or above.

8.2. Prove that a normalized denominator $D_i$ in the division-by-convergence algorithm with $a$ identical leading bits, when multiplied by its two's complement $R_i$ with $a$ identical leading bits and only $b$ extra bits ($b \le a$), will generate a new denominator $D_{i+1}$ with at least $a + b$ leading identical bits.

8.3. Prove that a ROM of size $2^7 \times 10$ can always provide a multiplying factor $R_0$ such that $D_1 = D \cdot R_0$ has at least seven leading 1's when $D$ is a normalized fraction ($0.5 \le D < 1$) in the division-by-convergence algorithm. Can the size of this table be further reduced?

8.4. Evaluate the overall time needed to accumulate 28 partial products using a pipelined partial CSA tree as shown in Figure 5.:U, capable of accepting six new partial products at once. Compare it to the time needed if no overlapping is allowed.

8.5. Prove that $\delta_{i+1} = D \delta_i^2$ in the division-by-reciprocation algorithm.

8.6. A different representation of the algorithm for calculating the reciprocal is obtained by introducing a new variable, $z_i$, which equals $z_i = D x_i$. Show that the resulting equation for $z_i$ is

$$z_{i+1} = z_i (2 - z_i).$$

Show also that from $z_i$ we can calculate $x_i$ using the equation $x_{i+1} = x_i (2 - z_i)$. If the initial value for $x_i$ is $x_0 = 1$, then the initial value for $z_i$ is $z_0 = D$. When $x_i$ converges to $1/D$, $z_i$ converges to 1. Thus, the comparison of $z_i$ to 1 allows us to determine whether convergence has been achieved. Write the above scheme as a ratio between $x_0$ and $z_0$ similar to Equation (8.1). What will be the form of the multiplication factors $R_i$?

8.7. Show that the optimal initial value $x_0$ for the $k$th interval (k = 1, 2, ..., 2') in the division-by-reciprocation algorithm is the reciprocal of the number corresponding to its middle point.

8.8. To calculate the square root of a given operand $N$ it has been suggested to apply the Newton-Raphson method with $f(x) = 1 - 1/(N x^2)$. Show the corresponding iteration rule and prove that the convergence is quadratic. Apply this procedure to calculate $\sqrt{3}$ to five decimal places.

8.9. Given an operand $A$ that is a normalized fraction ($0.5 \le A < 1$), what function $z = f(A)$ will the following procedure calculate?

(i) $x_{i+1} = x_i \cdot r_i^2$ with $x_0 = A$.
(ii) $z_{i+1} = z_i \cdot r_i$ with $z_0 = A$,

with $r_i = 1 + (1 - x_i)/2$.
How many iterations are needed in this calculation and what operations are executed in each iteration?
How is the multiplying coefficient calculated?
Estimate the error in the final result.
Can you suggest ways to speed up the calculation?
Given $A = 0.11100001$, calculate the -bit result $z = f(A)$ using the above procedure and compare it to the correct result.
8.4 REFERENCES
[1] S. F. ANDERSON et al., "The IBM system/360 model 91: Floating-point execution unit," IBM Journal Res. and Dev., 11 (Jan. 1967), 34-53.
[2] M. D. ERCEGOVAC, T. LANG, J.-M. MULLER, and A. TISSERAND, "Reciprocation, square root, inverse square root, and some elementary functions using small multipliers," IEEE Trans. on Computers, 49 (July 2000), 628-637.
[3] D. FERRARI, "A division method using a parallel multiplier," IEEE Trans. on Computers, EC-16 (April 1967), 224-226.
[4] M. J. FLYNN, "On division by functional iteration," IEEE Trans. on Computers, C-19 (Aug. 1970), 702-706.
[5] D. L. FOWLER and J. E. SMITH, "An accurate high speed implementation of division by reciprocal approximation," Proc. of 9th Symp. on Computer Arithmetic (1989), 60-67.
[6] H. KABUO et al., "Accurate rounding scheme for the Newton-Raphson method using redundant binary arithmetic," IEEE Trans. on Computers, 43 (January 1994), 43-51.
[7] P. W. MARKSTEIN, "Computation of elementary functions on the IBM RISC system/6000 processor," IBM Journal Res. and Dev., 34 (Jan. 1990), 111-119.
[8] S. F. OBERMAN and M. J. FLYNN, "Division algorithms and implementations," IEEE Trans. on Computers, 46 (August 1997), 833-854.
[9] P. SODERQUIST and M. LEESER, "Area and performance tradeoffs in floating-point divide and square-root implementations," ACM Computing Surveys, 28 (September 1996), 518-564.
[10] C. S. WALLACE, "A suggestion for a fast multiplier," IEEE Trans. Elect. Comp., EC-13 (Feb. 1964), 14-17.
9
EVALUATION OF
ELEMENTARY FUNCTIONS
Our objective in this chapter is to find efficient algorithms for evaluating elementary functions (like $e^x$, $\ln x$, $\sin x$, $\cos x$, etc.) in hardware. These algorithms should be accurate and sufficiently fast in comparison to alternative algorithms implemented in software. Such algorithms are especially useful when designing scientific hand-held calculators and numerical (usually floating-point) processors ([12]).

One straightforward method is the use of a look-up table. To evaluate $y = F(x)$ where $x$ and $y$ are numbers of length $n$ bits, a ROM of size $2^n \times n$ is required. For $n \ge 20$, this size is prohibitive ($\ge 16$M). Another method is to use a Taylor series expansion; e.g.,

$$e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}. \tag{9.1}$$

This series converges rapidly for a fractional argument $x$. However, for $x$ close to 1, a large number of steps is needed. In addition, its hardware implementation is complex, since separate logical networks are needed for different elementary functions. Even for software implementations there exist more efficient algorithms requiring a smaller number of computational steps for the same precision. The most commonly used are based on either polynomial or rational approximations.
Consider, for example, the exponential function $e^x$. We can express it in the form $e^x = 2^{x \log_2 e}$ and partition the exponent $x \log_2 e$ into its integer part $I$ and its fractional part $f$, i.e., $x \log_2 e = I + f$. Thus,

$$e^x = 2^I \cdot 2^f. \tag{9.2}$$

The incorporation of the factor $2^I$ in either fixed-point or floating-point calculations is straightforward. To evaluate $2^f$ one can use a rational approximation, which is the ratio between a numerator polynomial and a denominator polynomial. One such approximation is based on two degree-5 polynomials

$$2^f = \frac{((((a_5 f + a_4) f + a_3) f + a_2) f + a_1) f + a_0}{((((b_5 f + b_4) f + b_3) f + b_2) f + b_1) f + 1} \tag{9.3}$$

where $a_i$ and $b_i$ ($i = 0, 1, \ldots, 5$) are known constants [7]. To evaluate this approximation we need 10 multiplications, 10 additions, and 1 division. Other approximations requiring a smaller execution time are available.

9.1 THE EXPONENTIAL FUNCTION

There are alternative methods for evaluating these elementary functions that are more readily implemented in hardware. Most existing algorithms for evaluating elementary functions are similar to the division-by-convergence algorithm (see Chapter 8). They involve two (or more) recursive formulas related in such a way that when one formula is forced to a constant value the other formula yields the desired result.

For example, to evaluate the exponential function $y = e^{x_0}$ for a fractional argument $x_0$, we may use the formulas

$$\begin{aligned} \text{(i)}\quad & x_{i+1} = x_i - \ln b_i, \\ \text{(ii)}\quad & y_{i+1} = y_i \cdot b_i. \end{aligned} \tag{9.4}$$

The $b_i$'s are selected in such a way that the elements of the sequence $x_0, x_1, \ldots, x_m$ approach 0, where $m$ is the number of iterations needed to ensure the convergence to zero, i.e., $x_m = 0$. As will become evident later, the convergence is linear. In other words, $m$ is a linear function of the number of bits, $n$.

To find the value approached by the corresponding sequence $y_0, y_1, \ldots, y_m$, note that

$$y_{i+1} \cdot e^{x_{i+1}} = y_i \cdot b_i \cdot e^{x_i - \ln b_i} = y_i \cdot e^{x_i}. \tag{9.5}$$

In particular,

$$y_m \cdot e^{x_m} = y_0 \cdot e^{x_0}, \tag{9.6}$$

and since $x_m = 0$, $y_m$ yields the desired result,

$$y_m = y_0 \cdot e^{x_0}. \tag{9.7}$$

The similarity between the above algorithm for calculating $e^{x_0}$ and the division-by-convergence algorithm is now apparent. In the division-by-convergence algorithm the ratio $N_i / D_i$ is kept constant while here the product $y_i \cdot e^{x_i}$ remains constant independent of the specific values of the $b_i$'s.

For the above method to be efficient, we need a simple way of selecting the $b_i$'s to ensure the convergence of $x_{i+1}$ to 0. Also, the multiplication by $b_i$ should be simple and not overly time-consuming.

To simplify the multiplication, the $b_i$'s are given the form $b_i = (1 + s_i 2^{-i})$, where $s_i \in \{-1, 0, 1\}$. This way, the multiplication is reduced to shift and add operations. Also, the term $\ln b_i = \ln(1 + s_i 2^{-i})$ in formula (i) is either positive or negative. This is necessary in order to ensure the convergence of either a positive or negative $x_{i+1}$ to 0. Clearly, all possible values of $\ln(1 \pm 2^{-i})$ must be precalculated, since their online calculation will slow down the execution enough to render the algorithm useless. These precalculated quantities are stored in a look-up table; e.g., a ROM of size $2n \times n$.

Substituting $b_i = (1 + s_i 2^{-i})$ in Equation (9.4), we obtain

$$\begin{aligned} \text{(i)}\quad & x_{i+1} = x_i - \ln(1 + s_i 2^{-i}), \\ \text{(ii)}\quad & y_{i+1} = y_i \cdot (1 + s_i 2^{-i}). \end{aligned} \tag{9.8}$$

To calculate the exponential function we have to find the vector $s = \{s_0, s_1, \ldots, s_{m-1}\}$ so that

$$\sum_{l=0}^{m-1} \ln(1 + s_l 2^{-l}) = x_0.$$

If $s_l$ is restricted to $\{-1, 0, 1\}$, then the smallest and largest values of $x_0$ that can be represented in such a way correspond to all $s_l = -1$ and all $s_l = 1$, respectively, and consequently,

$$\sum_{l=1}^{m-1} \ln(1 - 2^{-l}) \le x_0 \le \sum_{l=0}^{m-1} \ln(1 + 2^{-l}). \tag{9.9}$$

For every $x_0$ in this interval there exists a vector $s$ such that the convergence of $x_m$ to 0 is guaranteed. For large enough values of $m$ ($m \ge 20$) the bounds in inequality (9.9) converge, yielding the following domain for $x_0$:

$$-1.24 \le x_0 \le 1.56. \tag{9.10}$$

If we restrict the argument $x_0$ to positive fractions, we may use a simple scheme for selecting $s_i$. This is a one-sided selection rule; i.e., $s_i \in \{0, 1\}$, similar to the quotient-bit selection rule in the restoring division algorithm. In step $(i + 1)$ we form the difference $D = x_i - \ln(1 + 2^{-i})$. If $D$ is positive or zero, we set $s_i = 1$ and $x_{i+1} = D$; if $D$ is negative, we set $s_i = 0$ and $x_{i+1} = x_i$.

We may analyze the possible outcomes of the subtract operation in formula (i) by examining the Taylor series expansion of $\ln(1 + s_i 2^{-i})$:

$$\ln(1 + s_i 2^{-i}) = s_i 2^{-i} - \frac{(s_i)^2 \, 2^{-2i}}{2} + \frac{(s_i)^3 \, 2^{-3i}}{3} - \cdots \tag{9.11}$$
 i    1 + 2^-i         ln(1 + 2^-i)     1 - 2^-i        ln(1 - 2^-i)
 0   10.00000 00000    0.10110 00110    0                    -
 1    1.10000 00000    0.01100 11111    0.10000 00000   -.10110 00110
 2    1.01000 00000    0.00111 00100    0.11000 00000   -.01001 00111
 3    1.00100 00000    0.00011 11001    0.11100 00000   -.00100 01001
 4    1.00010 00000    0.00001 11110    0.11110 00000   -.00010 00010
 5    1.00001 00000    0.00001 00000    0.11111 00000   -.00001 00001
 6    1.00000 10000    0.00000 10000    0.11111 10000   -.00000 10000
 7    1.00000 01000    0.00000 01000    0.11111 11000   -.00000 01000
 8    1.00000 00100    0.00000 00100    0.11111 11100   -.00000 00100
 9    1.00000 00010    0.00000 00010    0.11111 11110   -.00000 00010
10    1.00000 00001    0.00000 00001    0.11111 11111   -.00000 00001

TABLE 9.1 The value of ln(1 ± 2^-i) with 10-bit precision.
At step $i$ we can cancel the bit whose weight is $2^{-i}$ by subtracting $\ln(1 + s_i 2^{-i})$ from $x_i$. This implies that we may need up to $n$ steps to ensure the convergence of an $n$-bit fraction to zero. The convergence is therefore linear and $m = n$. We can rely on this fact and slightly modify the selection rule to improve the performance of the algorithm. If the bit in the $i$th position is 1, we select $s_i = 1$ and calculate $x_{i+1} = x_i - \ln(1 + s_i 2^{-i})$. If the bit is 0, we select $s_i = 0$ and go on to the next step without performing any subtraction. As a result, we have the capability of skipping over zeros. More complex selection rules can be used as well [3].
Example 9.1
To calculate $e^{0.25}$ in 10-bit precision we need a table of $\ln(1 \pm 2^{-i})$ for $i = 0, 1, 2, \ldots, 10$ with each entry having 10 fractional bits. The required entries have been calculated and are shown in Table 9.1. The calculated entries have been rounded-to-nearest rather than truncated. As a result, we obtain $\ln(1 \pm 2^{-i}) = \pm 2^{-i}$ for $i \ge 6$.

Suppose that we use the one-sided selection rule. Starting with $x_0 = 0.25_{10} = 0.01000\,00000_2$, we must select $s_0 = s_1 = 0$, since $x_0 < \ln(1 + 2^{-1})$, and thus $x_2 = x_0$. We then select $s_2 = 1$ and obtain $x_3 = x_2 - 0.00111\,00100 = 0.00000\,11100$. We then set $s_3 = s_4 = 0$, and at this point it becomes evident that we must select $s_5 = 0$, $s_6 = s_7 = s_8 = 1$, and $s_9 = s_{10} = 0$, yielding $x_{11} = 0$. In parallel, we calculate $y_{11} = y_0 \cdot (1 + 2^{-2})(1 + 2^{-6})(1 + 2^{-7})(1 + 2^{-8})$, yielding $y_{11} = 1.01001\,00011_2 = 1.28418_{10}$. The exact value with a precision of five decimal digits is $1.28403_{10}$. The approximation error equals $0.158 \cdot 2^{-10}$. The exact steps of the calculation are summarized in the table below:

 i    x_i              y_i              s_i
 0    0.01000 00000    1.00000 00000    0
 1    0.01000 00000    1.00000 00000    0
 2    0.01000 00000    1.00000 00000    1
 3    0.00000 11100    1.01000 00000    0
 4    0.00000 11100    1.01000 00000    0
 5    0.00000 11100    1.01000 00000    0
 6    0.00000 11100    1.01000 00000    1
 7    0.00000 01100    1.01000 10100    1
 8    0.00000 00100    1.01000 11110    1
 9    0.00000 00000    1.01001 00011    0
10    0.00000 00000    1.01001 00011    0
11    0.00000 00000    1.01001 00011

If we allow $s_i$ to assume the value $-1$, then the selected values are $(s_0, s_1, s_2, s_3, s_4, s_5, s_6, s_7, s_8, s_9, s_{10}) = (0, 1, -1, 1, 0, 0, 0, 1, 1, 1, 1)$, resulting in $x_{11} = 0$ and $y_{11} = 1.01001\,00010_2$. The approximation error equals $0.842 \cdot 2^{-10}$.
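The one-sided rule of Example 9.1 can be replayed in integer arithmetic as a check on the table entries. This is a minimal sketch under stated assumptions: all quantities are integers scaled by $2^{10}$, the products $y_i \cdot 2^{-i}$ are truncated (which reproduces the $y_i$ column of the example), and the names `LN_TABLE` and `exp_one_sided` are hypothetical.

```python
import math

FRAC_BITS = 10
ONE = 1 << FRAC_BITS  # fixed-point representation of 1.0

# ln(1 + 2^-i) with 10 fractional bits, rounded to nearest as in Table 9.1.
LN_TABLE = [round(math.log1p(2.0 ** -i) * ONE) for i in range(FRAC_BITS + 1)]

def exp_one_sided(x0: int):
    """Additive normalization with s_i in {0, 1}; x0 and the results are scaled by 2^10."""
    x, y, s = x0, ONE, []
    for i, t in enumerate(LN_TABLE):
        d = x - t              # D = x_i - ln(1 + 2^-i)
        if d >= 0:             # D positive or zero: accept the step
            x = d
            y += y >> i        # y * (1 + 2^-i), shifted term truncated
            s.append(1)
        else:                  # D negative: keep x_i and y_i unchanged
            s.append(0)
    return x, y, s

x, y, s = exp_one_sided(ONE // 4)   # x0 = 0.25
print(s, format(y, "b"), y / ONE)
```

The final `y` is the integer 1315, i.e. $1.01001\,00011_2$, matching the last row of the table in Example 9.1.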
The domain for the argument of the exponential function can be extended by writing the argument as $x = x \log_2 e \cdot \ln 2$, then partitioning the first term into its integer part and its fractional part; i.e., $x \log_2 e = I + f$, where $I$ is the integer part and $f$ is a fraction ($0 \le f < 1$). We need, therefore, to calculate

$$y = e^x = e^{(I + f) \ln 2} = e^{I \ln 2 + f \ln 2} = 2^I \cdot e^{f \ln 2}.$$

We set $x_0 = f \ln 2$, and consequently $0 \le x_0 < \ln 2 = 0.693$. The factor $2^I$ is easily dealt with, either by incorporating it into the exponent part of the floating-point number, or through a shift operation if fixed-point arithmetic is used.
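The domain extension combines naturally with the additive normalization iteration. The sketch below is an illustration in double-precision floating point rather than a hardware description; the function name `exp_additive` and the choice of $m = 40$ steps are assumptions made for the example.

```python
import math

def exp_additive(x: float, m: int = 40) -> float:
    """e^x via x*log2(e) = I + f, x0 = f*ln 2, and the one-sided rule s_i in {0, 1}."""
    t = x * math.log2(math.e)
    I = math.floor(t)              # integer part
    f = t - I                      # fraction, 0 <= f < 1
    x_i = f * math.log(2.0)        # 0 <= x0 < ln 2
    y = 1.0
    for i in range(m):
        d = x_i - math.log1p(2.0 ** -i)   # D = x_i - ln(1 + 2^-i)
        if d >= 0.0:
            x_i = d
            y += math.ldexp(y, -i)        # y * (1 + 2^-i): shift and add
    return math.ldexp(y, I)               # apply the factor 2^I to the result

print(exp_additive(3.7), math.exp(3.7))
```

In hardware the logarithms come from the look-up table and the factor $2^I$ goes into the exponent field; here `math.log1p` and `math.ldexp` stand in for both.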
9.2 THE LOGARITHM FUNCTION

The procedure for calculating $e^x$ in the previous section is based on continued summation of terms of the form $\ln(1 + s_i 2^{-i})$ to force the convergence of $x_i$ to 0. This type of procedure is called additive normalization. In a similar way, we may define a multiplicative normalization procedure as one in which $x_i$ is forced to 1 (or some other nonzero constant) by continued multiplication with precalculated factors.

The set of recursive formulas for multiplicative normalization has the following general form:

$$\begin{aligned} \text{(i)}\quad & x_{i+1} = x_i \cdot b_i, \\ \text{(ii)}\quad & y_{i+1} = y_i - g(b_i). \end{aligned} \tag{9.12}$$

$b_i$ is selected so that $x_{i+1}$ approaches 1, i.e., $x_{i+1} = x_0 \prod_{l=0}^{i} b_l \to 1$, and thus, after $m$ steps

$$\prod_{l=0}^{m-1} b_l = \frac{1}{x_0} \quad \text{(the multiplicative inverse)}. \tag{9.13}$$
The multiplication factor $b_i$ is again given the form $(1 + s_i 2^{-i})$, and for $s_i \in \{-1, 0, 1\}$ the following inequality determines the domain of the algorithm:

$$\prod_{l=1}^{m-1} (1 - 2^{-l}) \le \prod_{l=0}^{m-1} (1 + s_l 2^{-l}) \le \prod_{l=0}^{m-1} (1 + 2^{-l}). \tag{9.14}$$

These bounds can be calculated from the corresponding bounds in inequality (9.9) and, for large enough values of $m$, they converge to

$$0.29 \le \prod_{l=0}^{m-1} (1 + s_l 2^{-l}) \le 4.77.$$

Consequently, the domain for $x_0$ is

$$0.21 \le x_0 \le 3.45. \tag{9.15}$$

Therefore, positive normalized fractions are within the domain.

If the argument for the logarithm function is not already represented as a normalized floating-point number, we may rewrite it as such, obtaining $x = x_0 \cdot 2^{E_x}$, where $0.5 \le x_0 < 1$. Thus, $\ln x = \ln x_0 + E_x \ln 2$.

A simple one-sided selection rule with $s_i \in \{0, 1\}$ can be employed here as well. If $x_i$ already has $i$ leading 1's (i.e., $x_i = 0.11 \cdots 10zz \cdots z$), then the multiplication

$$x_{i+1} = x_i b_i = x_i + x_i s_i 2^{-i}$$

with $s_i = 1$ will produce $(i + 1)$ leading 1's in $x_{i+1}$.

Formula (ii) in Equation (9.12) results in $y_{i+1} = y_0 - \sum_{l=0}^{i} g(b_l)$. If we select $g(b_l) = \ln b_l$, then

$$y_{i+1} = y_0 - \sum_{l=0}^{i} \ln b_l = y_0 - \ln \prod_{l=0}^{i} b_l.$$

Therefore, when $x_{i+1}$ approaches 1, $y_{i+1}$ converges to

$$y_m = y_0 - \ln \frac{1}{x_0} = y_0 + \ln x_0. \tag{9.16}$$

Thus, the multiplicative normalization algorithm in (9.12) with $g(b_i) = \ln(1 + s_i 2^{-i})$ can be employed to calculate the natural logarithm function $F(x_0) = \ln x_0$. Notice that the same table of $\ln(1 + s_i 2^{-i})$ that was used for evaluating the exponential function is needed here. Also notice that we may replace the logarithm base $e$ by any other base. Of particular interest are the base 10 logarithm and exponential functions. These are useful in hand-held calculators.

Finally note that, in a manner analogous to Equation (9.5), we may write

$$y_{i+1} + \ln x_{i+1} = (y_i - \ln b_i) + (\ln x_i + \ln b_i) = y_i + \ln x_i, \tag{9.17}$$

indicating that $y_{i+1}$ approaches $y_0 + \ln x_0$ when $x_{i+1}$ approaches 1.

Example 9.2
To calculate $\ln(0.50)$ in 10-bit precision we use the entries of Table 9.1. The steps of the calculation are shown in the table below:

 i    x_i              y_i               s_i
 0    0.10000 00000     0.00000 00000    0
 1    0.10000 00000     0.00000 00000    1
 2    0.11000 00000    -0.01100 11111    1
 3    0.11110 00000    -0.10100 00011    0
 4    0.11110 00000    -0.10100 00011    1
 5    0.11111 11100    -0.10110 00001    0
 6    0.11111 11100    -0.10110 00001    0
 7    0.11111 11100    -0.10110 00001    0
 8    0.11111 11100    -0.10110 00001    1
 9    1.00000 00000    -0.10110 00101    0
10    1.00000 00000    -0.10110 00101    0
11    1.00000 00000    -0.10110 00101

Thus, $y_{11} = -0.10110\,00101_2 = -0.69238_{10}$. The exact result (in five decimal digits) is $-0.69315$ and the approximation error is $0.783 \cdot 2^{-10}$.

For multiplicative normalization we can also consider the case where the operation in formula (ii), Equation (9.12), is multiplication instead of subtraction. Selecting, for example, $g(b_i) = b_i^k$ (where $k$ is any real number), formula (ii) yields

$$y_m = y_0 \cdot \prod_{l=0}^{m-1} g(b_l) = y_0 \cdot \left( \prod_{l=0}^{m-1} b_l \right)^k \to y_0 \cdot \left( \frac{1}{x_0} \right)^k. \tag{9.18}$$

For example, if $k = 1$, then $y_{i+1} \to y_0 / x_0$ and a divide operation is executed. If $k = 1/2$, $y_{i+1} \to y_0 / \sqrt{x_0}$ and the reciprocal of the square root is calculated, and if we set $y_0 = x_0$, we obtain $y_{i+1} \to \sqrt{x_0}$. Here, $g(b_i) = \sqrt{b_i}$ and to simplify the multiplication in formula (ii) we may select $\sqrt{b_i} = (1 + s_i 2^{-i})$. Consequently,

$$b_i = (1 + s_i 2^{-i})^2 = 1 + 2 s_i 2^{-i} + s_i^2 2^{-2i} = 1 + s_i 2^{-i+1} + s_i^2 2^{-2i}. \tag{9.19}$$

The multiplication by $b_i$ in formula (i) is now more complex and, as a result, the efficiency of the method described above for square root extraction is questionable.
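The logarithm calculation of Example 9.2 can be replayed with integers scaled by $2^{10}$. The selection test used here (accept $s_i = 1$ whenever the exact product $x_i (1 + 2^{-i})$ stays below 1) and the round-to-nearest update of $x_i$ are assumptions chosen because they reproduce the rows of the example; the text states the rule in terms of leading 1's. The names `LN_TABLE` and `ln_one_sided` are hypothetical.

```python
import math

FRAC_BITS = 10
ONE = 1 << FRAC_BITS  # fixed-point representation of 1.0

# ln(1 + 2^-i) with 10 fractional bits, rounded to nearest as in Table 9.1.
LN_TABLE = [round(math.log1p(2.0 ** -i) * ONE) for i in range(FRAC_BITS + 1)]

def ln_one_sided(x0: int):
    """Multiplicative normalization with s_i in {0, 1}; values are scaled by 2^10."""
    x, y, s = x0, 0, []
    for i, t in enumerate(LN_TABLE):
        # Exact integer test of x * (1 + 2^-i) < 1 (assumed selection rule).
        if x * ((1 << i) + 1) < (ONE << i):
            x += (x + ((1 << i) >> 1)) >> i   # add x * 2^-i, rounded to nearest
            y -= t                            # y_{i+1} = y_i - ln(1 + 2^-i)
            s.append(1)
        else:
            s.append(0)
    return x, y, s

x, y, s = ln_one_sided(ONE // 2)   # x0 = 0.5
print(s, y / ONE, math.log(0.5))
```

The run ends with `x` equal to 1024 (that is, 1.0) and `y` equal to -709, which is $-0.10110\,00101_2$ as in the example.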
9.3 THE TRIGONOMETRIC FUNCTIONS

To evaluate the trigonometric functions we use the well-known relation between the exponential function and the trigonometric functions

$$e^{jx} = \cos x + j \sin x \tag{9.20}$$

where $j = \sqrt{-1}$. For the exponential function we used the recursive formulas for additive normalization, which have the following general form:

$$\begin{aligned} \text{(i)}\quad & x_{i+1} = x_i - g(b_i), \\ \text{(ii)}\quad & y_{i+1} = y_i \cdot b_i. \end{aligned} \tag{9.21}$$

To calculate the trigonometric functions we select $b_i = (1 + j s_i 2^{-i})$. This complex number can be written in the form

$$\sqrt{1^2 + (s_i 2^{-i})^2} \cdot e^{j\theta_i} = \sqrt{1 + (s_i 2^{-i})^2} \cdot (\cos \theta_i + j \sin \theta_i)$$

where $\theta_i = \tan^{-1}(s_i 2^{-i})$. This is the polar form of a complex number which, in general, is

$$A + jB = \sqrt{A^2 + B^2} \cdot e^{j\theta} = \sqrt{A^2 + B^2} \cdot (\cos \theta + j \sin \theta) \tag{9.22}$$

where $\theta = \tan^{-1}(B/A)$.

The $s_i$'s are selected so that the elements of the sequence $x_0, x_1, \ldots, x_m$ approach 0. To find the value approached by the corresponding sequence $y_0, y_1, \ldots, y_m$ we form an expression analogous to that in Equation (9.5):

$$y_{i+1} \cdot e^{j x_{i+1}} = y_i \cdot b_i \cdot e^{j(x_i - g(b_i))} = y_i \cdot e^{j x_i} \sqrt{1 + (s_i 2^{-i})^2} \cdot e^{j\theta_i} e^{-j g(b_i)}.$$

Setting

$$g(b_i) = \theta_i = \tan^{-1}(s_i 2^{-i}) \tag{9.23}$$

yields

$$y_{i+1} \cdot e^{j x_{i+1}} = y_i \cdot e^{j x_i} \sqrt{1 + (s_i 2^{-i})^2}. \tag{9.24}$$

Note that if it were not for the term $\sqrt{1 + (s_i 2^{-i})^2}$ the product $y_i \cdot e^{j x_i}$ would have been a constant. Equation (9.24) results in

$$y_m \cdot e^{j x_m} = y_0 \cdot e^{j x_0} \prod_{l=0}^{m-1} \sqrt{1 + (s_l 2^{-l})^2}.$$

Denote $K = \prod_{l=0}^{m-1} \sqrt{1 + (s_l 2^{-l})^2}$; then for $x_m = 0$ we obtain

$$y_m = y_0 \cdot K \cdot e^{j x_0} = y_0 \cdot K \cdot (\cos x_0 + j \sin x_0). \tag{9.25}$$

The convergence domain for this algorithm for large enough values of $m$ ($m \ge 20$) is

$$-1.743 = \sum_{l=0}^{m-1} \tan^{-1}(-2^{-l}) \le x_0 \le \sum_{l=0}^{m-1} \tan^{-1}(2^{-l}) = 1.743 \tag{9.26}$$

which includes the useful domain $0 \le x_0 \le \pi/2 = 1.57$.

The $x_i$'s are real numbers, but the $y_i$'s are complex numbers. To separate the real and imaginary parts of $y_i$, let $y_i$ equal $Z_i + jW_i$ where $Z_i$ is the real part of $y_i$ and $W_i$ is its imaginary part. Formula (ii) now takes the form

$$\text{(ii)}\quad y_{i+1} = Z_{i+1} + jW_{i+1} = (Z_i + jW_i)(1 + j s_i 2^{-i}). \tag{9.27}$$

Therefore, the recursive formulas for $Z_{i+1}$ and $W_{i+1}$ are

$$\begin{aligned} \text{(ii)}\quad & Z_{i+1} = Z_i - s_i 2^{-i} \cdot W_i, \\ \text{(ii)}'\quad & W_{i+1} = W_i + s_i 2^{-i} \cdot Z_i. \end{aligned} \tag{9.28}$$

The initial value $y_0$ can also, in general, be a complex number, $y_0 = Z_0 + jW_0$, but if we set $W_0$ equal to 0, making $Z_0$ equal to $y_0$, we obtain the desired values, which according to Equation (9.25) are

$$\begin{aligned} Z_m &= y_0 \cdot K \cdot \cos x_0, \\ W_m &= y_0 \cdot K \cdot \sin x_0. \end{aligned} \tag{9.29}$$

The recursive formulas for the calculation of $\sin x$ and $\cos x$ were developed somewhat differently by Volder [17] through rotations in a polar system. The resulting implementation was named CORDIC (COordinate Rotation DIgital Computer). This formulation was later extended to include division and the evaluation of hyperbolic functions [18].

A possible selection rule for $s_i$ to ensure the convergence of $x_{i+1}$ to 0 when $x_0$ is in the domain presented in Equation (9.26) is

$$s_i = \begin{cases} 1 & \text{if } x_i > \tan^{-1}(2^{-i}) \\ 0 & \text{if } |x_i| \le \tan^{-1}(2^{-i}) \\ -1 & \text{if } x_i < -\tan^{-1}(2^{-i}). \end{cases} \tag{9.30}$$

Since $\tan^{-1}(-\theta) = -\tan^{-1}(\theta)$, $\tan^{-1}(s_i 2^{-i}) = s_i \tan^{-1}(2^{-i})$, and thus only $n$ constants, rather than $2n$, must be stored in a ROM. With this set of values for $s_i$, $\{-1, 0, 1\}$, we obtain
$$K = \prod_{i=0}^{m-1} \sqrt{1 + (s_i 2^{-i})^2}. \tag{9.31}$$

Therefore, $K$ is not a constant but depends on the vector $s$, which in turn depends upon $x_0$. In addition, the above selection rule requires a full-length comparison between $x_i$ and $\tan^{-1}(2^{-i})$. If we restrict $s_i$ to assume only the values $\{-1, 1\}$, then

$$K = \prod_{i=0}^{m-1} \sqrt{1 + 2^{-2i}},$$

and it is a constant that can be precalculated. For $m > 16$, $K = 1.6468$. We may then set $y_0 = 1/K$ and, as a result, obtain $Z_m = \cos x_0$ and $W_m = \sin x_0$. The selection rule for $s_i$ now becomes

$$s_i = \begin{cases} 1 & \text{if } x_i \ge 0 \\ -1 & \text{if } x_i < 0. \end{cases} \tag{9.32}$$

This is similar to the selection rule for nonrestoring division, and a full-length comparison is not needed. To examine the rate of convergence of $x_{i+1}$ to 0, consider the Taylor series expansion of $\tan^{-1}(s_i 2^{-i})$ (in radians):

$$\tan^{-1}(s_i 2^{-i}) = s_i 2^{-i} - s_i \frac{2^{-3i}}{3} + \cdots \tag{9.33}$$

Again, linear convergence is achieved. Also, for $i > n/3$, all terms except the first in the above series are negligible and, as a result, a considerable reduction in the size of the ROM is possible.

 i    2^-i            arctan(2^-i)
 0    1.0000000000    0.1100100100
 1    0.1000000000    0.0111011011
 2    0.0100000000    0.0011111011
 3    0.0010000000    0.0001111111
 4    0.0001000000    0.0001000000
 5    0.0000100000    0.0000100000
 6    0.0000010000    0.0000010000
 7    0.0000001000    0.0000001000
 8    0.0000000100    0.0000000100
 9    0.0000000010    0.0000000010
10    0.0000000001    0.0000000001

TABLE 9.2 The value of arctan(2^-i) with 10-bit precision.

Example 9.3
To calculate $\sin(\pi/4)$ in 10-bit precision we need a table of $\arctan(2^{-i})$ for $i = 0, 1, 2, \ldots, 10$ with each entry having 10 fractional bits. The required entries have been calculated and are shown in Table 9.2. The calculated entries have been rounded-to-nearest rather than truncated. As a result, we obtain $\arctan(2^{-i}) = 2^{-i}$ for $i \ge 4$.

The input operand is $\pi/4 = 0.785398_{10} = 0.1100100100_2$. Initially we set $Z_0 = 1/K$, which in 10-bit precision is $0.1001101110$ ($= 0.6074_{10}$). We then set $s_i = \mathrm{sign}(x_i)$ per Equation (9.32). The final results are $Z_{11} = 0.1011010011_2$ and $W_{11} = 0.1011010100_2$ while the "exact" results in 10-bit precision are $0.1011010100 = 0.7071_{10}$ and $0.1011010100 = 0.7071_{10}$, respectively. The approximation error is thus not larger than $2^{-10}$. The exact steps of the calculation are summarized in the table below.

 i     x_i             Z_i             W_i            s_i
 0     0.1100100100    0.1001101110    0.0000000000     1
 1     0.0000000000    0.1001101110    0.1001101110     1
 2    -0.0111011011    0.0100110111    0.1110100101    -1
 3    -0.0011100000    0.1000100000    0.1101010111    -1
 4    -0.0001100001    0.1010001011    0.1100010011    -1
 5    -0.0000100001    0.1010111100    0.1011101010    -1
 6    -0.0000000001    0.1011010011    0.1011010100    -1
 7     0.0000001111    0.1011011110    0.1011001001     1
 8     0.0000000111    0.1011011000    0.1011001111     1
 9     0.0000000011    0.1011010101    0.1011010010     1
10     0.0000000001    0.1011010100    0.1011010011     1
11     0.0000000000    0.1011010011    0.1011010100

9.4 THE INVERSE TRIGONOMETRIC FUNCTIONS

To evaluate the inverse trigonometric functions we use the multiplicative normalization algorithm for the inverse exponential function; i.e., the function $\ln x$ described in Section 9.2. The general form of multiplicative normalization is

$$\begin{aligned} \text{(i)}\quad & x_{i+1} = x_i \cdot b_i, \\ \text{(ii)}\quad & y_{i+1} = y_i + g(b_i). \end{aligned} \tag{9.34}$$

If we set $b_i = (1 + j s_i 2^{-i})$ as in Section 9.3, then formula (i) yields

$$x_m = x_0 \cdot \prod_{l=0}^{m-1} b_l = x_0 \cdot \prod_{l=0}^{m-1} (1 + j s_l 2^{-l}) = x_0 \cdot K e^{j\theta}, \tag{9.35}$$
where

$$\theta = \sum_{l=0}^{m-1} \theta_l, \quad \theta_l = \tan^{-1}(s_l 2^{-l}) \quad \text{and} \quad K = \prod_{l=0}^{m-1} \sqrt{1 + (s_l 2^{-l})^2}.$$

Similarly, if we set $g(b_i) = \theta_i = \tan^{-1}(s_i 2^{-i})$ and $y_0 = 0$, then formula (ii) results in

$$y_m = \sum_{l=0}^{m-1} \theta_l = \theta. \tag{9.36}$$

Here, $x_i$ is a complex variable in the form $U_i + jV_i$. To obtain the recursive formulas for the real and imaginary parts, consider again formula (i):

$$\text{(i)}\quad U_{i+1} + jV_{i+1} = (U_i + jV_i) \cdot (1 + j s_i 2^{-i}).$$

Separating the real and imaginary parts, we obtain

$$\begin{aligned} \text{(i)}\quad & U_{i+1} = U_i - s_i 2^{-i} \cdot V_i, \\ \text{(i)}'\quad & V_{i+1} = V_i + s_i 2^{-i} \cdot U_i. \end{aligned} \tag{9.37}$$

To find the relationship between the values of $U_m$, $V_m$, and $y_m$ we form the following expression:

$$x_{i+1} \cdot e^{-j y_{i+1}} = x_i \cdot b_i \cdot e^{-j y_i} e^{-j g(b_i)}. \tag{9.38}$$

Substituting the selected values of $b_i$ and $g(b_i)$ yields

$$x_{i+1} \cdot e^{-j y_{i+1}} = x_i \cdot e^{-j y_i} \cdot \sqrt{1 + (s_i 2^{-i})^2},$$

resulting in

$$x_m \cdot e^{-j y_m} = x_m \cdot e^{-j\theta} = x_0 \cdot K, \tag{9.39}$$

where

$$K = \prod_{l=0}^{m-1} \sqrt{1 + (s_l 2^{-l})^2}$$

as in Section 9.3.

Replacing $x_m$ and $x_0$ in Equation (9.39) by the corresponding real and imaginary parts yields

$$(U_m + jV_m) \cdot (\cos \theta - j \sin \theta) = K \cdot (U_0 + jV_0). \tag{9.40}$$

Thus,

$$\begin{aligned} U_m \cdot \cos \theta + V_m \cdot \sin \theta &= K \cdot U_0, \\ -U_m \cdot \sin \theta + V_m \cdot \cos \theta &= K \cdot V_0. \end{aligned} \tag{9.41}$$

Setting $V_0 = 0$ and $U_0 = 1/K$, we obtain

$$U_m = \cos \theta \quad \text{and} \quad V_m = \sin \theta.$$

Recall that $K$ is a constant if we restrict $s_i$ to the values $\{-1, 1\}$. If, for a given argument $C$, we need to calculate $\phi = \cos^{-1} C$, we select the $s_i$'s in a way that $U_m \to C$ and, as a result, $\theta \to \phi$. The angle $\theta$ is obtained from $y_m = \theta$, where

$$\theta = \sum_{l=0}^{m-1} \tan^{-1}(s_l 2^{-l}) = \sum_{l=0}^{m-1} s_l \tan^{-1}(2^{-l}).$$

Consequently, a table of $\tan^{-1}(2^{-i})$ is needed, as it is for sine and cosine. Similarly, to calculate $\phi = \sin^{-1} D$ for a given $D$ we select the $s_i$'s so that $V_m \to D$ and again $\theta \to \phi$.

To evaluate the inverse tangent, $\tan^{-1} C$, let $V_m \to 0$, and from Equation (9.41) we obtain $\tan \theta = -V_0/U_0$ and $\theta = -\tan^{-1}(V_0/U_0)$. We set $V_0 = C$, $U_0 = 1$ and obtain $\tan^{-1} C = -\theta$. In summary, the iterative formulas to be calculated are

$$\begin{aligned} \text{(i)}\quad & U_{i+1} = U_i - s_i 2^{-i} \cdot V_i; & U_0 &= 1, \\ \text{(i)}'\quad & V_{i+1} = V_i + s_i 2^{-i} \cdot U_i; & V_0 &= C, \\ \text{(ii)}\quad & y_{i+1} = y_i - s_i \tan^{-1}(2^{-i}); & y_0 &= 0, \end{aligned} \tag{9.42}$$

with $s_i \in \{-1, 1\}$ set according to the signs of $U_i$ and $V_i$ so that $V_{i+1}$ is closer to 0.

Example 9.4
To calculate $\tan^{-1}(1.0)$ in 10-bit precision we use the entries of Table 9.2. The steps of the calculation are shown in the table below. The final result is $y_{11} = 0.1100100100_2$, which is equal to the "exact" result in 10-bit precision.

 i    U_i              V_i              y_i             s_i
 1    1.0000000000     1.0000000000    0.0000000000    -1
 2    1.1000000000     0.1000000000    0.0111011011    -1
 3    1.1010000000     0.0010000000    0.1011010110    -1
 4    1.1010010000    -0.0001010000    0.1101010101     1
 5    1.1010010101     0.0000011001    0.1100010101    -1
 6    1.1010010110    -0.0000011100    0.1100110101     1
 7    1.1010010110    -0.0000000010    0.1100100101     1
 8    1.1010010110     0.0000001011    0.1100011101    -1
 9    1.1010010110     0.0000000100    0.1100100001    -1
10    1.1010010110     0.0000000001    0.1100100011    -1
11    1.1010010110    -0.0000000001    0.1100100100
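The two CORDIC iterations with $s_i \in \{-1, 1\}$, Equations (9.28) with rule (9.32) for sine and cosine and Equation (9.42) for the inverse tangent, can be sketched in floating point as follows. This is an illustration, not a bit-accurate model of Examples 9.3 and 9.4: the step count `M = 40`, the function names, and the use of `math.atan` in place of a ROM table are all assumptions.

```python
import math

M = 40                                              # number of steps (assumed)
ATAN = [math.atan(2.0 ** -i) for i in range(M)]     # stands in for the ROM table
K = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(M))   # about 1.6468

def cordic_sin_cos(x0: float):
    """Force the angle x to 0; with Z0 = 1/K and W0 = 0, (W, Z) -> (sin x0, cos x0)."""
    x, z, w = x0, 1.0 / K, 0.0
    for i in range(M):
        s = 1.0 if x >= 0.0 else -1.0               # selection rule (9.32)
        z, w = z - s * math.ldexp(w, -i), w + s * math.ldexp(z, -i)
        x -= s * ATAN[i]
    return w, z

def cordic_atan(c: float) -> float:
    """Force V to 0 starting from U0 = 1, V0 = c; y accumulates tan^-1(c)."""
    u, v, y = 1.0, c, 0.0
    for i in range(M):
        s = -1.0 if v >= 0.0 else 1.0               # U stays positive, so sign(V) decides
        u, v = u - s * math.ldexp(v, -i), v + s * math.ldexp(u, -i)
        y -= s * ATAN[i]
    return y

print(cordic_sin_cos(math.pi / 4), cordic_atan(1.0))
```

Both loops use the same shift-and-add structure and the same table of $\tan^{-1}(2^{-i})$; only the quantity whose sign drives $s_i$ differs.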
238
9. Evaluation of Elementory Functions
I
9.5 THE HYPERBOLIC FUNCTIONS
To c"\lculatt' t.lu> hypt'rbolic functions
:-inh TO = .!. (e ZO - ("-XO)
2
and
1
coshxo = '2 (e ZO + e- ZO ),
WC IIS(' an algorithm similar to that used for the exponential function; i.e.,
(i) Ti+l = Xi - g(b.) ,
(ii) Yi+} = y.' b. ,
b. = 1 + 8i2-',
We first. (('writt' b. as
1 + 8.T' = V I - (8;2-.)2. exp(tanh- I (8i 2 - i »
which is based on tht' ide-ntity
1 +.E = VI - x 2 . exp(tauh- 1 x)
if Ixi ¥- 1.
Thc proof of this identity is left to the reade-r as an cxercise.
Formula (ii) in Equation (9.44) results in
Yrn = Yo 'if b, = Yo ( if V I- (8,2-')2 ) . exp{'f tanh-'(s,T'»
'=0 ',.,0 '<>0
rn-I
= YO' k . e,xp( L tanh- I (8,T'»
'",0
where
m-l
i< = I1 V I - (s,2-')2,
1..0
(9.43)
(9.44)
(9.45 )
(9.46)
(9.47)
This is a constant factor (K = 1.205) if we restrict. B, to {-I,I}, as we did for
tht' trigonometric functions.
Unlike- t.ht' algorithm for c Z , we select here g(b,) = tanh -I (8i2-i) so
L tanh- l (s,2-') -+ Xo wh(,11 Xi+} -+ O.
'''''0
Thus, from Equation (9.17), y", = YO' i<. e Zo . We now dpfill(' a new variahle tIt
which is calculatcd recnr..ivt'ly t.hrough the cquation t i +} = ti' (1- 8,2-'). As in
9.6 Bounds an the Approximation Error
239
Equation (9.47), this new variable conv('rg('s to to' A . e- ro . Let Zi = t. (YI + t.)
aud Wi = ' (Yi - f.). ForlUuln (ii) is r('placed by th,' f..!lowing hw, forlllulas:
(ii)
( ii)'
Z.+I
""HI
= Z. + 8.2- 1 . "'"
= 'Vi + s.2- i . Z,.
(9..18)
Thes ' variables convergf> to
Zm
1.1-
'2 YoKe zo + '2 fo J(t,-r o
I - e Zo + e- zo 1 _ e Zo _ e- zo
= 2(yO + to) . 1< . 2 + 2(YO - to) . h . 2
= Zo.j(. coshxo + ""0. i<, sillh.xo
(!J.-i9 )
=
and
1 J - ro 1 K -. -zo
= '2YO \e - '2fo C
= ZO' i< . sinh Xo + 11'0 . 1\ . cosh TO
(9.50)
tr m
Now. setting UfO = 0 and Zo = 1/ k yields Zm = cosh Xo aud W m = sinh xo.
The resulting formulas (i), (ii), and (ii)' are similar to those for the trigonometric functions. There is, however, a major difference between the convergence of $x_{i+1}$ to 0 in these two schemes. For the trigonometric functions the relationship

$$\tan^{-1}\left(2^{-(i+1)}\right) > \tfrac{1}{2}\cdot\tan^{-1}\left(2^{-i}\right)$$

holds. As a result, even if we obtain $x_i = 0$ and $|x_{i+1}|$ becomes $\tan^{-1}(2^{-i})$ (since $s_i$ is restricted to $\{-1, 1\}$), we can still expect $x_m$ to converge to 0. However, for hyperbolic functions we have

$$\tanh^{-1}\left(2^{-(i+1)}\right) < \tfrac{1}{2}\cdot\tanh^{-1}\left(2^{-i}\right)$$

and the convergence of $x_m$ to 0 is not guaranteed unless several steps are repeated twice. For example, if $n = 24$ then steps 3, 4, 7, 12, 13, 18, and 21 must be repeated twice [8].
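The recurrences (i), (ii), and (ii)' can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation of [8]: it uses floating-point arithmetic instead of fixed-point hardware, starts the iteration at $i = 1$ (because $\tanh^{-1}(1)$ is unbounded), and repeats steps 4 and 13, the commonly cited repetition sequence for hyperbolic iterations, rather than the step list quoted above. The function name `hyperbolic` and its parameters are hypothetical.

```python
import math

def hyperbolic(x0, n=24, repeats=(4, 13)):
    """Return (cosh x0, sinh x0) using s_i in {-1, +1}.

    Assumption: steps listed in `repeats` are executed twice so that
    x_i can still be driven to zero; valid for roughly |x0| < 1.1.
    """
    # Build the step sequence 1..n with the repeated steps inserted.
    steps = []
    for i in range(1, n + 1):
        steps.append(i)
        if i in repeats:
            steps.append(i)

    # Scale factor: product of sqrt(1 - 2^(-2i)) over the executed steps.
    K = 1.0
    for i in steps:
        K *= math.sqrt(1.0 - 2.0 ** (-2 * i))

    x, Z, W = x0, 1.0 / K, 0.0          # Z_0 = 1/K, W_0 = 0
    for i in steps:
        s = 1.0 if x >= 0 else -1.0     # choose s_i to drive x toward 0
        t = s * 2.0 ** (-i)
        x -= math.atanh(t)              # formula (i) with g = atanh(s_i 2^-i)
        Z, W = Z + t * W, W + t * Z     # formulas (ii) and (ii)'
    return Z, W
```

Calling `hyperbolic(0.5)` should agree with `math.cosh(0.5)` and `math.sinh(0.5)` to roughly the resolution implied by `n`.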
9.6 BOUNDS ON THE APPROXIMATION ERROR
In this section we estimate the maximum error that is to be expected when evaluating elementary functions using either additive normalization or multiplicative normalization.
In the additive normalization procedure, if $x_0$ is an $n$-bit fraction, then $\sum_{i=0}^{m-1} g(b_i)$ approaches $x_0$ with an error of $\epsilon = x_0 - \sum_{i=0}^{m-1} g(b_i)$, which satisfies $|\epsilon| \le 2^{-n}$. At the same time, we attempt to evaluate $y = F(x_0)$ by calculating

$$y = F\left(\sum_{i=0}^{m-1} g(b_i)\right) = F(x_0 - \epsilon).$$

The corresponding Taylor series expansion is

$$y = F(x_0 - \epsilon) = F(x_0) - \epsilon\cdot\left.\frac{dF}{dx}\right|_{x_0} + \frac{\epsilon^2}{2}\cdot\left.\frac{d^2F}{dx^2}\right|_{x_0} + O(\epsilon^3). \tag{9.51}$$

Since $\epsilon$ is of the order of $2^{-n}$, the last two terms in the above expansion are negligible. As a result, the error $\delta$ in $y$ is of the order of $\epsilon\cdot\left.\frac{dF}{dx}\right|_{x_0}$. The magnitude of this error depends on the specific function $F(x_0)$. For example, for the exponential function $F(x_0) = e^{x_0}$, $\left.\frac{dF}{dx}\right|_{x_0} = e^{x_0}$, and thus $|\delta| \le 2^{-n}\cdot e^{\ln 2} = 2^{-(n-1)}$ for $x \le \ln 2$. The maximum error in the function has double the size of the error in the argument. This error can be reduced by increasing the number of bits in $x_i$ by 1.
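In the multiplicative normalization procedure,

$$x_0\prod_{i=0}^{m-1} b_i \to 1,$$

and the error is

$$\epsilon = 1 - x_0\prod_{i=0}^{m-1} b_i \quad\text{or}\quad x_0\prod_{i=0}^{m-1} b_i = 1 - \epsilon.$$

Instead of $F(x_0)$, we calculate

$$y = F\left(\frac{1}{\prod_{i=0}^{m-1} b_i}\right) = F\left(\frac{x_0}{1 - \epsilon}\right) = F\left(x_0[1 + \epsilon + \epsilon^2 + \cdots]\right) \approx F(x_0 + x_0\epsilon).$$

The Taylor series expansion of the last expression is

$$F(x_0 + x_0\epsilon) = F(x_0) + x_0\epsilon\cdot\left.\frac{dF}{dx}\right|_{x_0} + \frac{1}{2}(x_0\epsilon)^2\cdot\left.\frac{d^2F}{dx^2}\right|_{x_0} + O(\epsilon^3). \tag{9.52}$$

For example, for the natural logarithm function $F(x_0) = \ln x_0$ the derivatives are $\left.\frac{dF}{dx}\right|_{x_0} = \frac{1}{x_0}$ and $\left.\frac{d^2F}{dx^2}\right|_{x_0} = -\frac{1}{x_0^2}$. Therefore, $y = F(x_0) + \epsilon + O(\epsilon^2)$. Hence, the error $\delta$ in $y$ satisfies $|\delta| \approx |\epsilon| \le 2^{-n}$, and the precision obtained is satisfactory.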
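The two bounds can be checked numerically. The sketch below is only an illustration: it takes the worst-case argument error $\epsilon = 2^{-n}$ and evaluates the resulting function error directly in floating point. The helper names `exp_error` and `ln_error` are hypothetical.

```python
import math

def exp_error(x0, n):
    """Error in e^x when the argument is off by eps = 2^-n (additive case)."""
    eps = 2.0 ** -n
    return abs(math.exp(x0 - eps) - math.exp(x0))

def ln_error(x0, n):
    """Error in ln x when x0 * prod(b_i) = 1 - eps (multiplicative case)."""
    eps = 2.0 ** -n
    return abs(math.log(x0 / (1.0 - eps)) - math.log(x0))
```

For $n = 12$ and $x_0 = \ln 2$, `exp_error` stays just below $2^{-11}$, twice the argument error, while `ln_error` is very close to $2^{-12}$ for any $x_0$, in line with Equations (9.51) and (9.52).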
9.7 SPEED-UP TECHNIQUES
Consider the exponential function $F(x_0) = e^{x_0}$, for which the precalculated constants $\ln(1 + s_i 2^{-i})$ have the following Taylor series expansion:

$$\ln(1 + s_i 2^{-i}) = s_i 2^{-i} - \tfrac{1}{2}\,s_i^2\,2^{-2i} + \tfrac{1}{3}\,s_i^3\,2^{-3i} - \cdots$$

For $i > n/2$, the above expression is approximately $s_i 2^{-i}$. As a result, not only can we reduce the size of the ROM required but, more importantly, the last $n/2$ steps can be replaced by a single operation, as shown below. In the steps prior to step $i$, we have already canceled at least the first $(i - 1)$ bits in $x_i$. $x_i$ thus has the following form:

$$x_i = 0.\underbrace{0\cdots00}_{i-1}\,z_i z_{i+1}\cdots$$

where $z_k$ ($k = i, i+1, \ldots$) is a single bit. To cancel the remaining nonzero bits in $x_i$, for $i \ge n/2$, we should select $s_k = z_k$ ($k \ge i$); i.e., all the remaining $s_k$'s can be predicted ahead of time. Based on this knowledge, we may speed up the execution. In the last steps we need to calculate (for $i \ge n/2$)

$$y_m = y_i\prod_{k=i}^{m-1}(1 + s_k 2^{-k}) \approx y_i\left(1 + s_i 2^{-i} + s_{i+1} 2^{-(i+1)} + s_{i+2} 2^{-(i+2)} + \cdots\right), \tag{9.53}$$

and, since we select $s_k = z_k$ for $k \ge i$, the last term in Equation (9.53) equals $(1 + x_i)$. Thus,

$$y_m = y_i(1 + x_i). \tag{9.54}$$
If a fast multiplier is available, the overall execution time can be reduced. This is called a termination algorithm [2] (or terminal linear approximation). Even if a fast multiplier is not available, we can still take advantage of the ability to predict the $s_i$'s by performing all the additions of the products in Equation (9.53) in a carry-save manner, avoiding the time-consuming carry-propagation.
We may arrive at the same expression for the terminal approximation in a different way based on the Taylor series expansion

$$F(x_0) = F((x_0 - x_i) + x_i) = F(x_0 - x_i) + x_i\cdot\left.\frac{dF}{dx}\right|_{(x_0 - x_i)} + \frac{1}{2}x_i^2\cdot\left.\frac{d^2F}{dx^2}\right|_{(x_0 - x_i)} + O(x_i^3) \tag{9.55}$$

for $|x_i| < 2^{-n/2}$. It can be shown that $F(x_0 - x_i) = \left.\frac{dF}{dx}\right|_{(x_0 - x_i)} = \left.\frac{d^2F}{dx^2}\right|_{(x_0 - x_i)} = y_i$, and therefore, $F(x_0) = y_i(1 + x_i) + \delta$, where $\delta \approx \tfrac{1}{2}x_i^2 y_i \le \tfrac{1}{2}\cdot 2^{-n}\cdot e^{\ln 2} = 2^{-n}$.
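The termination step of Equation (9.54) can be sketched as follows. This is a minimal illustration, assuming $s_i \in \{0, 1\}$, an argument in the range $0 \le x_0 < 0.868$ (the sum of all constants $\ln(1 + 2^{-i})$ for $i \ge 1$), and floating-point arithmetic in place of the ROM and shift-and-add hardware. The function name `exp_with_termination` is hypothetical.

```python
import math

def exp_with_termination(x0, n=24):
    """Approximate e^x0 with n/2 normalization steps plus one final multiply."""
    half = n // 2
    x, y = x0, 1.0
    for i in range(1, half + 1):
        c = math.log1p(2.0 ** -i)      # ROM constant ln(1 + 2^-i)
        if x >= c:                     # s_i = 1: subtract constant, scale y
            x -= c
            y *= 1.0 + 2.0 ** -i       # a shift and an add in hardware
        # otherwise s_i = 0 and nothing changes in this step
    # After n/2 steps the remainder satisfies 0 <= x < 2^-(n/2),
    # so the remaining steps collapse into y * (1 + x), Equation (9.54).
    return y * (1.0 + x)
```

With `n = 24` the relative error of the single terminal multiplication is bounded by about $x_i^2/2 < 2^{-25}$, consistent with the bound derived from Equation (9.55).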
Thus, the bound on the error when the terminal approximation is used is half its value without the terminal approximation, providing higher precision.
A termination algorithm can also be applied in the calculation of the natural logarithm function, $F(x_0) = \ln x_0$. In this case, $x_i$ has the form

$$1 - x_i = 0.\underbrace{0\cdots00}_{i-1}\,z_i z_{i+1}\cdots$$

or

$$x_i = 1 - z_i 2^{-i} - z_{i+1} 2^{-(i+1)} - \cdots = 1 - \sum_{k=i}^{m-1} z_k 2^{-k}.$$

The formula for $x_{i+1}$ yields

$$x_{i+1} = x_i\cdot(1 + s_i 2^{-i}) = 1 + s_i 2^{-i} - z_i 2^{-i} - \cdots$$

For $x_{i+1}$ to approach 1, we should therefore select $s_k = z_k$ for $k \ge i$. In parallel, we expect to calculate in the remaining steps

$$y_m = y_i - \sum_{k=i}^{m-1}\ln(1 + s_k 2^{-k}).$$

Consequently, based on the Taylor series expansion for $\ln(1 + s_k 2^{-k})$, we obtain the following terminal approximation for $i \ge n/2$:

$$y_m \approx y_i - \sum_{k=i}^{m-1} s_k 2^{-k} = y_i - (1 - x_i). \tag{9.56}$$

The evaluation of the trigonometric functions can also be accelerated by predicting the $s_i$'s based on the series expansion in Equation (9.33). Previously, in order to obtain a constant value for $K$ independent of the selected $s_i$'s, we have restricted $s_i$ to be in $\{-1, 1\}$. However, if

$$K = \prod_{i=0}^{m-1}\sqrt{1 + 2^{-2i}}$$

is replaced by

$$K = \prod_{i=0}^{n/2}\sqrt{1 + 2^{-2i}},$$

the deviation from the exact value is less than $2^{-n}$ [1]. Therefore, for $i > n/2$, we may select $s_k$ ($k \ge i$) from the set $\{0, 1\}$ and predict $s_k = z_k$, where the $z_k$ are the bits in the least significant portion of $x_i$:

$$x_i = 0.\underbrace{0\cdots00}_{i-1}\,z_i z_{i+1}\cdots$$

Being able to predict the $s_i$'s allows us to perform the additions in the equations for $x_i$, $Z_i$, and $W_i$ in a carry-save manner [1]. Reexamining the series expansion of the arctangent in Equation (9.33), we conclude that we may, in principle, start the prediction process even earlier and predict the $s_k$'s for $i \le k \le 3i$ all at once. The corresponding terms can then be added with a carry-save adder and a single pass through a carry-propagate adder. However, for $i < n/2$ we still need to use the set $\{-1, 1\}$. This can be done by properly recoding the bits in $x_i$ using the set $\{-1, 1\}$.
We would also like to speed up the computation for the first steps, $0 \le i \le n/2$. This can be done by using a radix higher than 2; e.g., $b_i = 1 + s_i r^{-i}$, where $r$ is some power of 2, say $r = 2^q$, and $s_i \in \{-(r-1), -(r-2), \ldots, -1, 0, 1, \ldots, (r-1)\}$. This way, we may handle $q$ bits of $x$ in a single step, but with an increased complexity of each step, since $s_i$ has a larger range. Still, some improvement in speed is possible [4].
Another method is to allow $s_i$ to assume the values in $\{-1, 0, 1\}$ and modify the selection rule in such a way that the probability of selecting $s_i = 0$ is maximized. This is similar to the idea behind the variations of the SRT division algorithm. This approach has been analyzed in [3] but has not gained much popularity.
9.8 OTHER TECHNIQUES FOR EVALUATING ELEMENTARY FUNCTIONS
Many calculators and floating-point processors employ some variations of the previously presented algorithms for the evaluation of elementary functions. Usually, each one of these has a particular implementation that depends on the precision and speed requirements and the area constraints. Still, several other methods for calculating elementary functions have been proposed, and some have been implemented; e.g., [5]. In [9] an evaluation of elementary functions based on rational approximations is proposed. This method, which is commonly used when evaluating elementary functions in software, can become very cost-effective for hardware implementation. This is the case when a fast adder and multiplier are available and when high precision is required; e.g., the arguments are extended double-precision floating-point numbers in the IEEE standard. In such a situation, hardware implementation of rational approximations can successfully compete with the methods based on continued summations and multiplications (whose convergence is linear in the number of bits) when execution time and chip area are considered.
A somewhat different approach combines polynomial approximations with a look-up table [10, 16]. Here, the domain of the argument $x$ of the elementary function $f(x)$ is divided into smaller intervals (usually of equal size), and the values of $f(x_i)$ for the boundary points $x_i$ between the intervals are kept in the look-up table. Then, the value of $f(x)$ at the given point $x$ is calculated
based on the value $f(x_i)$, which is read from the table where $x_i$ is the closest boundary point to $x$, and a polynomial approximation $p(x - x_i)$ for $f(x - x_i)$. Since the distance $(x - x_i)$ is small, a very simple polynomial can be employed, requiring significantly less time than a rational approximation for $f(x)$ on the entire domain. The overall algorithm therefore has three steps:
1. Find the closest boundary point $x_i$ and calculate the "reduction transformation," which is usually the distance $d = x - x_i$.
2. Calculate $p(d)$.
3. Combine $f(x_i)$ with $p(d)$ to calculate $f(x)$.
Example 9.5
The following algorithm can be employed for calculating $2^x$ on $[-1, 1]$ [16]: 32 boundary points of the form $x_i = i/32$, $i = 0, 1, \ldots, 31$, are defined, and the values of $2^{x_i}$ are precalculated and stored in a look-up table. In step 1, we search for an $x_i$ such that $|x - m - x_i| \le 1/64$, where $m = -1$, 0, or 1. Then, we calculate $d = (x - m - x_i)\cdot\ln 2$. This "distance" satisfies $e^d = 2^{x - m - x_i}$, and $|d| \le (\ln 2)/64$. In step 2 we calculate an approximation for $e^d - 1$ using a polynomial $p(d) = d + p_2 d^2 + p_3 d^3 + \cdots + p_k d^k$, where $p_2, p_3, \ldots, p_k$ are precalculated coefficients of the polynomial approximation of the function $e^d - 1$ on the interval $[-(\ln 2)/64, (\ln 2)/64]$. In step 3 we reconstruct $2^x$ using

$$2^x = 2^{m + x_i}\cdot e^d = 2^m\left(2^{x_i} + 2^{x_i}\cdot(e^d - 1)\right) \approx 2^m\left(2^{x_i} + 2^{x_i}\cdot p(d)\right)$$

where $2^{x_i}$ is read out of the look-up table. A detailed error analysis for this algorithm for IEEE double-precision arguments shows that the error is bounded by 0.556 ulp, where ulp for this format equals $2^{-52}$.
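The three steps of Example 9.5 can be sketched in Python. This is an illustration under assumptions, not the algorithm of [16] itself: the polynomial $p(d)$ is a truncated Taylor series of $e^d - 1$ rather than the precalculated coefficients $p_2, \ldots, p_k$, so the 0.556 ulp bound is not claimed for it. The names `TABLE` and `pow2` are hypothetical.

```python
import math

# Look-up table of 2^(i/32) for the 32 boundary points x_i = i/32.
TABLE = [2.0 ** (i / 32) for i in range(32)]

def pow2(x):
    """Approximate 2^x for x in [-1, 1] by table look-up plus a short polynomial."""
    # Step 1: nearest multiple of 1/32, split into m in {-1, 0, 1} and i in 0..31.
    j = round(x * 32)
    m, i = divmod(j, 32)                     # floor division keeps 0 <= i < 32
    d = (x - m - i / 32) * math.log(2)       # |d| <= (ln 2)/64

    # Step 2: p(d) approximates e^d - 1 (Taylor coefficients, an assumption).
    p = d + d**2 / 2 + d**3 / 6 + d**4 / 24 + d**5 / 120

    # Step 3: 2^x ~ 2^m * (2^(x_i) + 2^(x_i) * p(d)).
    return math.ldexp(TABLE[i] + TABLE[i] * p, m)
```

Because $|d|$ is at most about 0.011, the first neglected Taylor term $d^6/720$ is far below double-precision resolution, which is why such a short polynomial suffices.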
9.9 EXERCISES
9.1. Apply the procedure in Section 9.1 to calculate $e^{0.5}$. Assume that the argument $x_0 = 0.5$ and all intermediate results have 12 fractional bits. Prepare a table of all terms of the form $\ln(1 \pm 2^{-i})$ with 12-bit precision. Compare your result to the exact value of $\sqrt{e}$ and compare the error to $2^{-12}$.
9.2. Repeat problem 1, applying the procedure in Section 9.2 for calculating $\ln 0.5$.
9.3. Prove identity (9.46).
9.4. Prove that Equation (9.55) yields the same expression for the termination algorithm (for $e^x$) as Equation (9.54).
9.5. Write a procedure for calculating $y_0/x_0$ using multiplicative normalization. Devise a termination algorithm and discuss its effectiveness.
9.6. The reciprocal of the square root of a given $n$-bit operand can be calculated in $n$ steps using the following two equations:

$$\text{(i)}\;\; x_{i+1} = x_i\cdot(1 + s_i 2^{-i})^2 = x_i\cdot(1 + 2 s_i 2^{-i} + s_i^2 2^{-2i})$$
$$\text{(ii)}\;\; y_{i+1} = y_i\cdot(1 + s_i 2^{-i})$$

where the $s_i$'s are selected so that $x_{i+1} \to 1$ and consequently $y_{i+1} \to y_0/\sqrt{x_0}$. We want to examine the possibility of employing a termination algorithm in the $k$th step, in order to speed up the computation by replacing the remaining $(n - k)$ steps by a single operation. In step $k$ we have

$$1 - x_k = 1 - x_0\prod_{i=0}^{k-1}(1 + s_i 2^{-i})^2 = 0.0\cdots0\,z_k z_{k+1}\cdots$$

and to obtain the final value of $y$, we calculate $y_k\prod_{i \ge k}(1 + s_i 2^{-i})$. For what values of $k$ can the termination algorithm be used? Write in detail the computation needed in this termination algorithm.
9.7. Verify that if the factor $K = \prod_{i=0}^{m-1}\sqrt{1 + 2^{-2i}}$ in the calculation of sine/cosine is replaced by $K = \prod_{i=0}^{n/2}\sqrt{1 + 2^{-2i}}$, then the deviation from the exact value is less than $2^{-n}$.
9.8. Find the reduction procedure that should be used when calculating $\sin(x)$ for $x \ge \pi/2$ using the algorithm in Section 9.3.
9.9. Show that selecting $g(b_i) = b_i$, as in Equation (9.18), yields

$$\frac{y_{i+1}}{x_{i+1}} = \frac{y_i}{x_i}.$$

9.10. After the first step in Example 9.J we already obtain $x = 0$. Why do we have to execute the remaining nine steps?
9.10 REFERENCES
[1] P. W. BAKER, "Suggestion for a fast binary sine/cosine generator," IEEE Trans. on Computers, C-25 (Nov. 1976), 1131-1137.
[2] T. C. CHEN, "Automatic computation of exponentials, logarithms, ratios, and square roots," IBM Journal Res. and Dev., 16 (July 1972), 380-388.
[3] B. G. DELUGISH, "A class of algorithms for automatic evaluation of certain elementary functions," Dept. Comp. Sci., Univ. of Illinois, Rep. 399, June 1970.
[4] M. D. ERCEGOVAC, "Radix-16 evaluation of certain elementary functions," IEEE Trans. on Computers, C-22 (June 1973), 561-566.
[5] P. M. FARMWALD, "High bandwidth evaluation of elementary functions," Proc. of 5th Symp. on Computer Arithmetic (May 1981), 139-142.
[6] D. GOLDBERG, private communication.
[7] J. F. HART et al., Computer approximations, Wiley, New York, 1968.
[8] G. L. HAVILAND and A. A. TUSZYNSKI, "A CORDIC arithmetic processor chip," IEEE Trans. on Computers, C-29 (Feb. 1980), 68-79.
[9] I. KOREN and O. ZINATY, "Evaluating elementary functions in a numerical coprocessor based on rational approximations," IEEE Trans. on Computers, 39 (Aug. 1990), 1030-1037.
[10] P. W. MARKSTEIN, "Computation of elementary functions on the IBM RISC System/6000 processor," IBM Journal Res. and Dev., 34 (Jan. 1990), 111-119.
[11] J.-M. MULLER, Elementary functions: Algorithms and implementation, Birkhauser, 1997.
[12] R. NAVE, "Implementation of transcendental functions on a numeric processor," Microprocessing and Microprogramming, 11 (1983), 221-225.
[13] M. J. SCHULTE and E. E. SWARTZLANDER, "Hardware designs for exactly rounded elementary functions," IEEE Trans. on Computers, 43 (August 1994), 961-973.
[14] W. H. SPECKER, "A class of algorithms for ln x, exp x, sin x, cos x, tan^-1 x, cot^-1 x," IEEE Trans. on Electron. Computers, EC-14 (Feb. 1965), 85-86.
[15] S. STORY and P. T. P. TANG, "New algorithms for improved transcendental functions on IA-64," Proc. of 14th Symp. on Computer Arithmetic (April 1999), 1-11.
[16] P. T. P. TANG, "Table-lookup algorithms for elementary functions and their error analysis," Proc. of 10th Symp. on Computer Arithmetic, (1991), 232-236.
[17] J. E. VOLDER, "The CORDIC trigonometric computing technique," IRE Trans. on Electron. Computers, EC-8 (Sept. 1959), 330-334.
[18] J. S. WALTHER, "A unified algorithm for elementary functions," Spring Joint Computer Conf., Proc., 38 (1971), 379-385.
10
LOGARITHMIC
NUMBER SYSTEMS
A number system based on logarithms can simplify multiplication, division, roots, and powers. When logarithms are used, multiplication and division are reduced to addition and subtraction, respectively, and powers and roots are reduced to multiplication and division, respectively. On the other hand, add and subtract operations become more complex. Another major problem is deriving logarithms and antilogarithms quickly and accurately enough to allow conversions to and from the conventional number representations. These conversions always involve approximations, resulting in inaccuracies. Therefore, binary logarithms can be useful only in arithmetic units dedicated to special applications where very few conversions are required but many multiplications and divisions are executed; e.g., real-time digital filters.
10.1 SIGN-LOGARITHM NUMBER SYSTEMS
Let a number $A$ be represented by a sign digit $S_A$ and a logarithm $E_A$ that includes an integer part and a fractional part,

$$S_A E_A = S_A\,a_{k-1}a_{k-2}\cdots a_1 a_0\,.\,a_{-1}a_{-2}\cdots a_{-l}, \tag{10.1}$$

requiring a total of $n = 1 + k + l$ bits. The sign $S_A$ is set to 0 if $A$ is positive, and to 1 if $A$ is negative. $E_A$ is the logarithm of the absolute value of $A$; i.e., $E_A = \log_2|A|$. The interpretation rule for $S_A E_A$ is thus

$$A = (-1)^{S_A}\cdot 2^{E_A}. \tag{10.2}$$

The base of the exponent may, in general, be different from 2.
To represent numbers smaller than 1, negative logarithms are needed. For this purpose, we may use the two's complement representation or a biased representation, which takes the general form

$$E_A = (a_{k-1}\cdots a_0\,.\,a_{-1}\cdots a_{-l})_2 - \text{Bias}.$$

Commonly used values for the bias are $2^{k-1}$ or $2^{k-1} - 1$.
Examining Equation (10.2), one should realize that this is an extreme case of the familiar floating-point system with the significand always equal to 1. As a result, the exponent $E_A$ is allowed to be a mixed number rather than an integer, in order to enable the representation of numbers that are not integral powers of the base. In this number system, as in the floating-point system, zero is not included in the ordinary range of values. If biased logarithms are used, we may decide that $E_A = 0$ represents 0 instead of $2^{-\text{Bias}}$. Another way to represent zero is to have a special bit in the format indicating that the value is 0, regardless of the value of $E_A$ [8].
Figure 10.1 depicts the relationship, for $|A| \ge 1$, between the real number $A$ and its representation $(-1)^{S_A}E_A$ in the sign-logarithm system. It is evident from this figure that $(-1)^{S_A}E_A$ is monotonic in $A$, and comparison is therefore straightforward. Given two numbers $A = (-1)^{S_A}\cdot 2^{E_A}$ and $B = (-1)^{S_B}\cdot 2^{E_B}$, we first compare their signs $S_A$ and $S_B$; then, if the signs are equal, we compare their logarithms $E_A$ and $E_B$.
FIGURE 10.1 The relationship between $A$ and its representation $(-1)^{S_A}E_A$ in the sign-logarithm system (for $|A| \ge 1$).
Example 10.1
Let $k$ be 4 and $l$ equal 3. Consequently, $n$ equals 8. Suppose that the logarithm $E_A$ is represented in the two's complement method. The following are two representations within the range:
00001.010 represents the numerical value $+2^{(1 + \frac{1}{4})} = +2.378_{10}$
01110.100 represents the numerical value $+2^{-(1 + \frac{1}{2})} = +0.35355_{10}$
The largest positive number is 00111.111 $= 2^{(8 - \frac{1}{8})} = 234.7530_{10}$. The smallest positive number is 01000.000 $= 2^{-8} = 0.003906_{10}$. There is no representation for zero.
10.2 ARITHMETIC OPERATIONS
In a logarithmic system multiplication and division are simpler operations than addition and subtraction. To calculate the product $P$ of two operands $A$ and $B$ we add their logarithms

$$E_P = E_A + E_B, \tag{10.3}$$

and if these logarithms are biased, we have to then subtract the bias. As with conventional signed-magnitude systems, the sign bit of the product is determined by the modulo 2 addition of the operands' sign bits, $S_P = S_A \oplus S_B$. Similarly, when calculating the quotient $Q = A/B$ we subtract the logarithms

$$E_Q = E_A - E_B, \tag{10.4}$$

and if these logarithms are biased, we must add the bias. The sign bit of the quotient, $S_Q$, is determined in exactly the same way as is done for the product; i.e., $S_Q = S_A \oplus S_B$. Comparing the multiply and divide operations to their counterparts in a floating-point system, one should note that in the logarithmic system they are exact operations, and no rounding is required. Thus, these two operations do not contribute to the accumulation of computation errors. Overflow and underflow may occur when executing the add or subtract operations in Equations (10.3) and (10.4), but these are easily detected.
In contrast, addition and subtraction of operands that are given by their logarithms are complicated. One brute-force way to perform these operations is through the use of a complete look-up table. However, the size of such a table ($2^{2n} \times n$ bits) is prohibitive for any reasonable value of $n$, the number of bits in an operand. We may therefore use one of the following two alternatives, which still require tables, but of smaller size:
1. Use an antilogarithm table, add and then use a logarithm table. This requires a total of $3\cdot 2^n \times n$ bits in tables (three tables if the antilogarithms of the two operands are to be read simultaneously).
2. Calculate directly the approximated sum (or difference). This method employs smaller look-up tables and is therefore the one most commonly used.
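As a hedged illustration of the format used in Example 10.1 and of the multiply and divide rules in Equations (10.3) and (10.4), the sketch below decodes a sign-logarithm bit string with $k = 4$ and $l = 3$ and operates on (sign, logarithm) pairs. The helper names `decode`, `multiply`, and `divide` are hypothetical, the logarithm is held as an exact `Fraction` in two's complement interpretation, and overflow detection is omitted.

```python
from fractions import Fraction

K_BITS = 4   # integer bits of the logarithm (including its own sign bit)
L_BITS = 3   # fractional bits of the logarithm

def decode(bits):
    """Decode a string such as '00001.010' into (sign, E_A, numerical value)."""
    sign = int(bits[0])
    raw = int(bits[1:].replace('.', ''), 2)
    width = K_BITS + L_BITS
    if raw >= 1 << (width - 1):          # two's complement: negative logarithm
        raw -= 1 << width
    e = Fraction(raw, 1 << L_BITS)
    return sign, e, (-1) ** sign * 2.0 ** float(e)

def multiply(a, b):
    """Product of two (sign, E) pairs: XOR the signs, add the logarithms."""
    return a[0] ^ b[0], a[1] + b[1]

def divide(a, b):
    """Quotient of two (sign, E) pairs: XOR the signs, subtract the logarithms."""
    return a[0] ^ b[0], a[1] - b[1]
```

For instance, `decode('01110.100')` yields the logarithm $-3/2$ and a value of about 0.35355, matching the second representation in Example 10.1. Because both operations are exact additions on the logarithms, no rounding step appears in the sketch.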
When calculating the sum (or difference)

$$C = A \pm B = (-1)^{S_C}\cdot 2^{E_C}$$

according to the second method, we distinguish between two cases. In the first, $|A| > |B|$, and we rewrite the expression for $C$ as

$$C = A \pm B = A\left(1 \pm \frac{B}{A}\right).$$

We set $S_C$ equal to $S_A$ and

$$E_C = \log_2\left|A\left(1 \pm \frac{B}{A}\right)\right| = \log_2|A| + \log_2\left|1 \pm \frac{B}{A}\right| = E_A + \Phi(E_A - E_B), \tag{10.5}$$

where the function $\Phi(E_A - E_B)$ is defined as

$$\Phi(E_A - E_B) = \log_2\left|1 \pm \frac{B}{A}\right| = \log_2\left|1 \pm 2^{-(E_A - E_B)}\right|. \tag{10.6}$$

The value of $\Phi(x)$, where $x = E_A - E_B > 0$, must be precalculated and stored in a table. For convenience, two separate tables are commonly used, one for $\Phi^+(x) = \log_2(1 + 2^{-x})$ and a second for $\Phi^-(x) = \log_2(1 - 2^{-x})$. Each table can be implemented in a ROM of a size not larger than $2^n \times n$. The size of the table for $\Phi^+(x)$ may be reduced to $2^n \times l$, since $\Phi^+(x) \le 1$ for $x \ge 0$. In other words, $\Phi^+(x)$ is always a fraction and will never require more than $l$ bits.
In the second case, $|A| < |B|$, so $S_C = S_B$ and $E_C = E_B + \Phi(E_B - E_A)$. Consequently, the steps in both cases are:
1. Compare $A$ and $B$ to determine the larger of the two.
2. Calculate $x = E_A - E_B$ or $E_B - E_A$.
3. Read $\Phi^+(x)$ or $\Phi^-(x)$ from the appropriate table and add it to either $E_A$ or $E_B$.
The first two steps can be executed in parallel if the data flow depicted in Figure 10.2 is adopted [7]. As a result, the total time needed is approximately

$$T_{ADD/SUB} \approx 2\cdot T_{ADDER} + T_{ROM}. \tag{10.7}$$

FIGURE 10.2 Adder/subtracter for sign-logarithm numbers.
The only source of error when performing an addition (or subtraction) is the rounding of $\Phi^+$ or $\Phi^-$. The values stored in the ROM should be rounded (e.g., to nearest, see Chapter 4) rather than truncated, and the error introduced will therefore be no larger than $\tfrac{1}{2}\cdot 2^{-l}$.
The size of the above two tables for $\Phi^+$ and $\Phi^-$ is the major obstacle when attempting to implement an arithmetic unit that operates on sign-logarithm numbers. For $n \ge 20$, the required ROM becomes prohibitively large. Therefore, several strategies for reducing the size of the look-up tables have been suggested and implemented. One approach is to partition the table of size $2^n \times n$ into several smaller tables [7]. Since $\Phi(x)$ decreases rapidly with increasing $x$, the size of the corresponding partial tables can be substantially reduced. Another approach is to use a combination of linear approximation and a look-up table [8]. In either case, there is no need to generate $\Phi(x)$ for very large values of $x$, since the value of $\Phi(x)$ becomes smaller than the resolution of the system.
Example 10.2
A 20-bit logarithmic number system processor has been designed and implemented as a single VLSI chip [7]. The 20 bits include a sign bit and 19 bits for an exponent in two's complement representation. These 19 bits are partitioned into a sign bit, a 6-bit integer part, and a 12-bit fractional part. The two look-up tables for $\Phi^+$ and $\Phi^-$, if fully implemented in ROM, would require $2^{18}\cdot 12 + 2^{18}\cdot 18 = 7.8$ Mbits. The size of these tables can be substantially reduced as described next. First, there is no need to generate values that are smaller than $2^{-12}$, the resolution of the selected number system. The solution of

$$\Phi^+(x) = \log_2(1 + 2^{-x}) = 2^{-12}$$

is $x = 12.52$, and the solution of $\Phi^-(x) = \log_2(1 - 2^{-x}) = -2^{-12}$ is also $x = 12.52$. Thus, no look-up table entries are required for $x \ge 12.52$. Consequently, there is no need to provide more than 4 bits in the integer part of $x$ as inputs to the look-up tables, allowing $x \le 15$. These four bits, together with the 12 fractional bits, constitute the required 16 inputs to the ROM instead of 18. The remaining range [0, 15] is then divided into 11 smaller intervals: [0.0, 0.5], [0.5, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0], [5.0, 6.0], [6.0, 7.0], [7.0, 8.0], [8.0, 9.0], and [9.0, 15.0]. For the first 10 intervals ROM 1 through ROM 10 are used, while for the last one a PLA implementation has proven to be more economical, providing values for the subinterval [9, 12.52]. The reason behind the partition of the interval [0.0, 9.0] into 10 smaller intervals is that the graph for $\Phi(x)$ becomes flat for large values of $x$. As a result, the number of input bits to the ROM and the number of output bits decrease rapidly. The total ROM space employed is 83.55 Kbits.
This approach, if applied to a 30-bit format, would require a ROM space of 70 Mbits [8]. Therefore, a different approach, using a linear approximation, was proposed to further reduce the size of the look-up tables. With this approach, the size of the ROM decreases. However, the execution time increases.
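The three-step add/subtract procedure based on Equations (10.5) and (10.6) can be sketched as follows. This is a simplified illustration: `phi_plus` and `phi_minus` are computed with `math.log2` where a hardware unit would read rounded values from the $\Phi^+$ and $\Phi^-$ tables, subtraction is handled by flipping the sign of the second operand, and all names are hypothetical.

```python
import math

def phi_plus(x):
    """Phi+(x) = log2(1 + 2^-x), a table look-up in hardware."""
    return math.log2(1.0 + 2.0 ** -x)

def phi_minus(x):
    """Phi-(x) = log2(1 - 2^-x), defined for x > 0."""
    return math.log2(1.0 - 2.0 ** -x)

def log_add(a, b, subtract=False):
    """Return the (sign, E) pair of a + b or a - b for (sign, E) operands."""
    sa, ea = a
    sb, eb = b
    if subtract:
        sb ^= 1                       # A - B is treated as A + (-B)
    if ea < eb:                       # step 1: the larger magnitude goes first
        sa, ea, sb, eb = sb, eb, sa, ea
    x = ea - eb                       # step 2: nonnegative difference
    if sa == sb:                      # step 3: magnitudes add
        return sa, ea + phi_plus(x)
    if x == 0:
        raise ValueError("exact cancellation: zero has no ordinary representation")
    return sa, ea + phi_minus(x)      # step 3: magnitudes subtract
```

As a usage example, adding the pairs for 2.5 and 3.7 gives a logarithm of $\log_2 6.2$, and subtracting them gives sign 1 with logarithm $\log_2 1.2$.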
An important advantage of the logarithmic system is that several other arithmetic operations can be executed in a straightforward manner. For example, the reciprocal of a given number $A = (-1)^{S_A}2^{E_A}$ is simply $A^{-1} = (-1)^{S_A}2^{-E_A}$, and only the two's complement of $E_A$ needs to be calculated. Squaring a given number is accomplished by $A^2 = 2^{2E_A}$, requiring a shift left operation. If the logarithm is biased, the bias must be subtracted. The square root of $A$ is given by $\sqrt{A} = 2^{E_A/2}$, requiring a shift right operation. Here, if the logarithm is biased, the bias must be added first. Exponentiation is also simplified, since $A^B = 2^{B\cdot E_A}$, and only a fixed-point multiplication is required.
10.3 COMPARISON TO BINARY FLOATING-POINT NUMBERS
As previously noted, the logarithmic system is an extreme case of the conventional floating-point system. Therefore, a more detailed comparison between the two should be made. Two important characteristics that need to be analyzed are the range and the accuracy of representation.
The range of logarithms $E_A$ using $k + l$ bits is

$$-2^{k-1} \le E_A \le 2^{k-1} - 2^{-l} \approx 2^{k-1}.$$

Therefore, the range for positive numbers in the sign-logarithm system is

$$2^{-2^{k-1}} \le A^+ \le 2^{2^{k-1}}. \tag{10.8}$$

This range should be compared to the range of positive normalized floating-point numbers with $\beta = 2$ and $n = e + m + 1$, where $e$ and $m$ are the number of bits in the exponent and the normalized fractional significand, respectively. The latter range is, as shown in Chapter 4,

$$\tfrac{1}{2}\cdot 2^{-2^{(e-1)}} \le F^+ \le (1 - 2^{-m})\cdot 2^{2^{(e-1)} - 1}. \tag{10.9}$$

If we set $e = k$ and, consequently, $m = l$, the ranges remain about the same.
To measure the accuracy of representation in the logarithmic system, we calculate the relative step size, defined as

$$\frac{2^{(E_A + 2^{-l})} - 2^{E_A}}{2^{E_A}} = 2^{2^{-l}} - 1. \tag{10.10}$$

The maximum relative representation error, which equals half the relative step size, is thus a constant independent of $E_A$.
For the floating-point system the relative step size is

$$\frac{(M + 2^{-m})\,2^E - M\cdot 2^E}{M\cdot 2^E} = \frac{2^{-m}}{M}. \tag{10.11}$$

To compare the two step sizes, assume that $l = m$. Since

$$\lim_{l\to\infty}\frac{2^{2^{-l}} - 1}{2^{-l}} = 0.693,$$

and for normalized fractions $1 \le 1/M \le 2$, the following inequality holds:

$$2^{2^{-l}} - 1 < \frac{2^{-l}}{M}.$$

As a result, the representation error in the logarithmic system is slightly lower than that in the corresponding floating-point system (with $m = l$), especially for small numbers. Numbers with the same exponent are equally spaced in the floating-point number system, while in the sign-logarithm system smaller numbers are denser.
Example 10.3
The 20-bit logarithmic processor reported in [7] has the range $2^{-64} \le A^+ < 2^{64}$ with a precision of $2^{-12.52}$. The range of this system is twice that of a 20-bit floating-point format with a 12-bit significand and 7-bit exponent. The precision is slightly better, $2^{-12.52}$ versus $2^{-12}$.
10.4 CONVERSIONS TO/FROM CONVENTIONAL REPRESENTATIONS
Conversions to the logarithmic number system from either a fixed-point system or a floating-point system require the calculation of logarithms. The opposite conversions require the calculation of antilogarithms. For example, to convert the floating-point number $(-1)^S\cdot M\cdot 2^E$ to the logarithmic system we must calculate $F = \log_2 M$ to obtain

$$(-1)^S\cdot M\cdot 2^E = (-1)^S\cdot 2^F\cdot 2^E = (-1)^S\cdot 2^{E+F}.$$
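The floating-point to logarithmic conversion $(-1)^S\cdot M\cdot 2^E = (-1)^S\cdot 2^{E+F}$ with $F = \log_2 M$ can be sketched with the standard library. This is only an illustration: `math.frexp` supplies a significand $M$ in $[0.5, 1)$, `math.log2` stands in for the logarithm table or approximation a hardware unit would use, the result is not rounded to $l$ fractional bits, and zero is excluded because it has no ordinary representation. The function names are hypothetical.

```python
import math

def float_to_log(value):
    """Convert a nonzero float to a (sign, logarithm) pair."""
    if value == 0:
        raise ValueError("zero has no ordinary sign-logarithm representation")
    sign = 0 if value > 0 else 1
    m, e = math.frexp(abs(value))     # abs(value) = m * 2**e with 0.5 <= m < 1
    f = math.log2(m)                  # F = log2(M), so -1 <= F < 0
    return sign, e + f                # exponent of the result is E + F

def log_to_float(sign, exponent):
    """Inverse conversion: an antilogarithm followed by the sign."""
    return (-1) ** sign * 2.0 ** exponent
```

For example, `float_to_log(-7.25)` returns sign 1 and a logarithm of about 2.85798, and `log_to_float` maps that pair back to approximately -7.25.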
254
I
0.8
O.G
log2(1 + x)
0.4
0.2
10. Logarithmic Number Systems
10.5 ExercISeS
255
0.2
0.4
0.6
0.8
I
Example 10.4
Let N = 0111.01 = 7.15 1 0 and IJ = 3. W' shift out "\ sillle z..ro, anti thpn
a 1. Hence, t = 2 and at th.' end the rcistcr contains 1101, i'Ih'rprpt('d
as .1101 = .8125 10 . Therefore, lug 7.25 10.BOll = 2.812510. Tlw
accurate value is 2.8579810'
A similar approximation can be used Cor tht' antilogarithm. Gi\Pu
t + y, wllPrl' y is a fraction. we need to raklll.\I.{' N = 21+11 = 2'2"
2'(1 + y). This is the sam(' approxim.ltion IIsro beCurf'. 11 ::::: log2(1 + y).
This ran b.. irnplf'lIlt'lItf'd in hardwal1' by placing a 1 in pO$ition t .me!
pl<lCing the fract.ion y next t.o it. For f'xamplt', givf'n 100.1010 = 1.1)25 10 ,
t = 4 aud t.ht:' approximat('(1 antilogarithm is 11010.002 = 2610. Thp
correct value is 2. 1 . 625 = 2,1.67537 1 0, 0
x
FIGURE 10.3 A linear approximatIon tor log2(1 + x).
TI\(' logarit,hm of a given operand call be found in a look-up tahle whoS('
sizl' grows expOlIPntially with t.he number of opl'rand bits. Another way to find
it is to cakulatf' an approximation to the logarithm. Let N be a binar:y nUlIlber
Zu Zu-I . . . ZO . Z-I . . . z-'" and let z, be t hI' most significant nonzero bit of Ill.
Thf' valu.. of this numbcr can be written as
A pil'cewis(' linear approximation has also bE"t'n suggMtf'<1 (II. The intf'[\..tI
(0.1] is divide{l into four equal subintervals. and a linear approximation of t.hp
form x + a . f(x) + b is ud for each subinterval, where f(:r) is eitht'r x or .i,
the one's complement of x. The l.ont.ants 'I and b arp sdC<'tPd so as to minimi7f'
the error and be fractions with p(lwers of 2 as denominators. The resulting
expression i.., (I)
1--\'
i=-ll
log,(l + x) '" j x+Ji. x O$x<t
16
(10.12) x+Ji. !<.z:<!
64 4 - 2 (10.11)
+! + 3 1<x<1
x !.'IX m 2 -
:r + !;]; $.z:<1
..
'-I '-I
N=2'+ L 2'zi=2'(I+ L 2 i -'zi)=2'(I+x)
where x is a fraction. 0 $ :r < I. Clearly,
log2 N = t + log2(1 + x)
(10.13)
The approximation error is $-0.006 \le \epsilon \le 0.008$. The total error range, 0.014, is
lower than that of the linear approximation by a factor of 6.
Higher precision can be achieved by using a look-up table implemented
in a ROM. However, the size of this ROM is prohibitively large for reasonable
values of the number of operand bits. A more economical implementation based
on a PLA has been suggested in [4].
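As a rough numerical check of the four-segment approximation described above, the following sketch evaluates it in floating point and samples the error $\log_2(1 + x)$ minus the approximation. It assumes that $\bar{x}$ can be modeled as $1 - x$ (for an $n$-bit fraction the one's complement is $1 - x - 2^{-n}$), and the function names are illustrative rather than taken from [1].

```python
import math


def log2_1p_piecewise(x: float) -> float:
    """Four-segment linear approximation of log2(1 + x) for 0 <= x < 1."""
    if not 0.0 <= x < 1.0:
        raise ValueError("x must satisfy 0 <= x < 1")
    x_bar = 1.0 - x  # assumption: stands in for the one's complement of x
    if x < 0.25:
        return x + (5 / 16) * x
    if x < 0.5:
        return x + 5 / 64
    if x < 0.75:
        return x + x_bar / 8 + 3 / 128
    return x + x_bar / 4


def error_range(samples: int = 4096) -> tuple:
    """Smallest and largest sampled error log2(1 + x) - approximation."""
    errors = [
        math.log2(1 + i / samples) - log2_1p_piecewise(i / samples)
        for i in range(samples)
    ]
    return min(errors), max(errors)
```

Under these assumptions the sampled error should stay inside roughly the band quoted in the text; every coefficient has a power of 2 as its denominator, so a hardware version needs only shifts and additions.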
where $t$ is the characteristic of the logarithm and $\log_2(1 + x)$ constitutes the
mantissa.
A linear approximation for $\log_2(1 + x)$, suggested in [5], uses only the linear
term in the Taylor series; i.e., $\log_2(1 + x) \approx x$. This approximation is shown in
Figure 10.3, and its error is $\epsilon(x) = \log_2(1 + x) - x$. The maximum approximation
error is found by differentiating $\epsilon(x)$, obtaining $\text{Max } \epsilon(x) = 0.08639$ for $x =
0.44269$.
The hardware implementation of this linear approximation is very simple.
The operand $N$ is stored in a shift register and a counter with an initial value
of $u$ is used. $N$ is shifted to the left repeatedly until a 1 is shifted out, and at
the same time, the counter is decremented once for every shift operation. The
contents of the counter at the end of the operation are $u - (u - t) = t$, and the
contents of the shift register are the approximated mantissa $x$.
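The shift-and-count scheme can be mimicked in software. The sketch below is a minimal model, not the hardware design of [5]: it takes the operand as a binary string, finds the most significant 1 (the characteristic $t$), and reads the remaining bits as the mantissa $x$. The helper names and the use of exact fractions are assumptions made for illustration. The second function applies the same idea in reverse for the antilogarithm, $2^{t+y} \approx 2^t(1 + y)$.

```python
import math
from fractions import Fraction


def approx_log2(bits: str) -> Fraction:
    """Linear approximation log2(N) ~ t + x for a binary string like '0111.01'."""
    int_part, _, frac_part = bits.partition(".")
    digits = int_part + frac_part
    u = len(int_part) - 1            # exponent of the leftmost stored bit
    first_one = digits.find("1")     # number of zeros shifted out before the 1
    if first_one < 0:
        raise ValueError("the logarithm of zero is undefined")
    t = u - first_one                # characteristic
    mantissa = digits[first_one + 1:]  # bits left in the register after the 1
    x = Fraction(int(mantissa, 2), 2 ** len(mantissa)) if mantissa else Fraction(0)
    return t + x


def approx_antilog2(value: Fraction) -> Fraction:
    """Linear approximation 2**(t + y) ~ 2**t * (1 + y), with 0 <= y < 1."""
    t = math.floor(value)
    y = value - t
    return (1 + y) * Fraction(2) ** t
```

For the operands used in the worked example, `approx_log2("0111.01")` gives 45/16 = 2.8125 and `approx_antilog2(Fraction(37, 8))` gives 26.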
10.5 EXERCISES
10.1. (a) For a sign-logarithm system with $n = 16$, $k = 6$, and $l = 9$, find the
smallest and largest positive numbers, assuming base 2 and that the two's
complement method is used to represent negative logarithms. Calculate the
maximum relative representation error.
(b) Repeat (a) for base 10.
10.2. Given the sign-logarithm system defined in problem 1, show the representation
of the two operands $X = 2.5$ and $Y = 3.7$ in this system and perform the
operations $X + Y$, $X - Y$, $X \cdot Y$, $X / Y$, $1/X$, $X^2$, $\sqrt{X}$, and $X^Y$. Calculate the
entries of $\Phi^+$ and $\Phi^-$ that are needed.
10.3. Write a Boolean expression for the signal selecting the table of $\Phi^+(x)$ given the
sign bits of the two operands, $S_X$ and $S_Y$, and the signals ADD and SUB(TRACT)
indicating the type of operation being executed.
10.4. A 32-bit format for the sign-logarithm system has been suggested with $k = 8$,
$l = 23$, a base 2, and a bias of 127 [3]. This results in a format that is very close
to the single-precision format of the IEEE standard (see Chapter 4). Write down
an expression for the value of a nonzero number $X$ given the sign bit $S_X$ and the
logarithm $E_X$. Use the notation $E_X = I + F$, where $I$ is the $k$-bit integer and $F$
is the $l$-bit fraction. Compare the range of these two number representations and
write the rule for converting an IEEE floating-point number to the sign-logarithm
system. Estimate the conversion error and suggest a way to reduce it.
10.5. Determine the minimum number of inputs and outputs needed for ROM6, which
corresponds to the range (4.0, 5.0) of $\Phi^+(x)$ for the 20-bit logarithmic processor
described in the text [7].
10.6. Show that the maximum error of the approximation $\log_2(1 + x) \approx x$ is 0.08639.
Suppose that the approximation $\log_2(1 + x) \approx x + c_i$ is used instead, where the
interval [0, 1] is divided into four subintervals, as in Equation (10.14), and $c_i$ is a
constant employed for the $i$th subinterval ($i = 1, 2, 3, 4$). Find the best values for
the $c_i$'s so as to minimize the error, and calculate the resulting maximum error.
10.7. Write an expression for the distance between two adjacent numbers in the sign-
logarithm system and compare it to that of the corresponding floating-point
system. Show that smaller numbers are denser in the sign-logarithm system.
10.8. To represent values in the range $|A|$ 1 we may restrict $E_A = \log_b |A|$ to positive
integers. What is the range of values that the base $b$ may assume?
10.6 REFERENCES
[1] M. COMBET, H. VAN ZONNEVELD, and L. VERBEEK, "Computation of the base two
logarithm of binary numbers," IEEE Trans. on Elect. Computers, EC-14 (Dec.
1965), 863-867.
[2] A. D. EDGAR and S. C. LEE, "FOCUS microcomputer number system," Commu-
nications of the ACM, 22 (March 1979), 166-177.
[3] F. S. LAI and C. E. WU, "A hybrid number system processor with geometric
and complex arithmetic capabilities," IEEE Trans. on Computers, 40 (Aug. 1991),
952-961.
[4] H. Y. LO and Y. AOKI, "Generation of a precise binary logarithm with differ-
ence grouping programmable logic array," IEEE Trans. on Computers, C-34 (Aug.
1985), 681-691.
[5] J. N. MITCHELL, JR., "Computer multiplication and division using binary loga-
rithms," IRE Trans. on Elect. Computers, EC-11 (Aug. 1962), 512-517.
[6] E. E. SWARTZLANDER, JR., and A. G. ALEXOPOULOS, "The sign/logarithm number
system," IEEE Trans. on Computers, C-24 (Dec. 1975), 1238-1242.
[7] F. J. TAYLOR et al., "A 20-bit logarithmic number system processor," IEEE Trans.
on Computers, 37 (Feb. 1988), 190-199.
[8] L. K. YU and D. M. LEWIS, "A 30-bit integrated logarithmic number system
processor," IEEE J. of Solid-State Circuits, 26 (Oct. 1991), 1433-1440.
11
THE RESIDUE
NUMBER SYSTEM
The residue number system is an integer number system whose most important
property is that additions, subtractions, and multiplications are inherently
carry-free. As a result we may add, subtract, or multiply numbers in one step
regardless of the length of the numbers involved. Unfortunately, other arithmetic
operations, like division, comparison, and sign detection, are very complex
and slow. Another problem with the residue number system is that it is an
integer number system and, as a result, it is very inconvenient to represent
fractions. Consequently, the residue system has not been seriously considered for use
in general-purpose computers. However, for some special-purpose applications
such as many types of digital filters [6], in which the number of additions and
multiplications is substantially greater than the number of invocations of
magnitude comparison, overflow detection, division, and alike, the residue system can
be very attractive.
11.1 PRELIMINARIES
A residue number system is characterized by a base that is not a single radix but
an $N$-tuple of integers $(m_N, m_{N-1}, \ldots, m_1)$. Each of these $m_i$ ($i = 1, 2, \ldots, N$)
is called a modulus. An integer $X$ is represented in the residue number system
by an $N$-tuple $(x_N, x_{N-1}, \ldots, x_1)$ where $x_i$ is a nonnegative integer satisfying
$$X = m_i \cdot q_i + x_i \tag{11.1}$$
where $q_i$ is the largest integer such that $0 \le x_i \le (m_i - 1)$. $x_i$ is known as the
residue of $X$ modulo $m_i$, and the notations $X \bmod m_i$ and $|X|_{m_i}$ are commonly
used.

Example 11.1
Consider a two-modulus system with the moduli $m_2 = 3$ and $m_1 = 2$.
The representation of $X = 5$ in this residue system is $(x_2, x_1)$ where
$x_2 = |5|_3 = 2$, since $5 = 3 \cdot 1 + 2$,
$x_1 = |5|_2 = 1$, since $5 = 2 \cdot 2 + 1$.
Therefore, the residue representation of 5 is (2,1). The number $X$ does
not necessarily have to be a positive integer. For example, if $X = -2$, then
$x_1 = |-2|_2 = 0$, since $-2 = 2 \cdot (-1) + 0$. Also, $-2 = 3 \cdot (-1) + 1$, yielding
$x_2 = 1$. Note that $x_i$ is by definition nonnegative. Thus, $-2$ is represented by
$(x_2, x_1) = (1,0)$.
Table 11.1 includes the representation of integers in the range [-4, 8]
in the $(m_2, m_1) = (3,2)$ residue number system. As is apparent from
Table 11.1, the residue representation of a number is unique. However, the
converse is not true, and two or more numbers may have the same
representation. For example, 1 and 7 are represented by (1,1). Consequently,
we must limit the range of the numbers to be represented. As we can see
from Table 11.1, the residue representation is periodic and the period in
this case is 6. There are only six different representations in the residue
system with the moduli $(m_2, m_1) = (3,2)$, since $x_1$ can assume two
possible values and $x_2$ can assume three. We must therefore limit the range
to include only six numbers. Two such possible ranges are marked in the
table. One is $-3 \le X \le 2$, and the other is $0 \le X \le 5$. □

      X    x2   x1    marked ranges
     -4     2    0
     -3     0    1    a
     -2     1    0    a
     -1     2    1    a
      0     0    0    a  b
      1     1    1    a  b
      2     2    0    a  b
      3     0    1       b
      4     1    0       b
      5     2    1       b
      6     0    0
      7     1    1
      8     2    0
    (a: -3 <= X <= 2;  b: 0 <= X <= 5)

TABLE 11.1 The representation of numbers in the $(m_2, m_1) = (3,2)$ residue
system.

It has been shown [7] that, in general, the number of different representations
and, as a result, the number of elements in the useful range of the residue
system is the least common multiple of the moduli, denoted by
$$M = \text{l.c.m}(m_1, m_2, \ldots, m_N) \tag{11.2}$$
The least common multiple of the moduli is the smallest integer that has all the
values of $m_i$ as divisors. In the above example $M = \text{l.c.m}(2, 3) = 2 \cdot 3 = 6$, but
for $m_1 = 2$ and $m_2 = 4$, $M = \text{l.c.m}(2, 4) = 4$. Hence, in order to get the largest
possible range
$$M = \prod_{i=1}^{N} m_i \tag{11.3}$$
we must select moduli that are pairwise relatively prime. Two moduli $m_i$ and
$m_j$ are said to be relatively prime if 1 is their greatest common divisor. This is
usually written as $\text{g.c.d}(m_i, m_j) = 1$. For example, 4 and 9 are relatively prime,
although neither in itself is prime.
For a given $M$, if only nonnegative integers are needed, the range can be
set to $[0, M - 1]$. If, on the other hand, negative numbers are also desired, then
the range can be set to $[-(M-1)/2, (M-1)/2]$ if $M$ is odd, or $[-M/2, (M/2 - 1)]$
if $M$ is even.
Examining the entries of Table 11.1 in the range [0, 5] we should realize
that a magnitude comparison between two numbers is not simple. For example,
(2,1) represents a number that is larger than the number represented by (2,0),
but (1,1) represents a number smaller than that represented by (0,1). This stems
from the fact that, unlike the conventional number systems, the residue system
is not weighted. Also, if negative numbers are included in the range then the
sign of a number is not apparent from its residue representation.
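The definitions of this section translate almost directly into code. The sketch below is only illustrative (the function names are not from the text): it forms the residue digits of an integer, computes the number of distinct representations as the least common multiple of the moduli, and returns the symmetric range used when negative numbers are wanted.

```python
from functools import reduce
from math import gcd


def to_residues(x: int, moduli: tuple) -> tuple:
    """Residue digits (x_N, ..., x_1) for moduli given as (m_N, ..., m_1)."""
    # Python's % returns a value in 0..m-1 for a positive modulus, even when
    # x is negative, which matches the definition of a residue.
    return tuple(x % m for m in moduli)


def range_size(moduli: tuple) -> int:
    """Number of distinct representations: the l.c.m. of the moduli."""
    return reduce(lambda a, b: a * b // gcd(a, b), moduli, 1)


def signed_range(moduli: tuple) -> tuple:
    """Inclusive (low, high) bounds when negative numbers are needed."""
    m = range_size(moduli)
    if m % 2:
        return -(m - 1) // 2, (m - 1) // 2
    return -m // 2, m // 2 - 1
```

With the moduli (3, 2), `to_residues(5, (3, 2))` is (2, 1), `to_residues(-2, (3, 2))` is (1, 0), and 1 and 7 map to the same pair (1, 1), which is why the range has to be limited to six consecutive integers.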
11.2 ARITHMETIC OPERATIONS
The basis for performing an addition in the residue system is the identity
$$|X + Y|_{m_i} = \bigl|\, |X|_{m_i} + |Y|_{m_i} \bigr|_{m_i} = |x_i + y_i|_{m_i} \tag{11.4}$$
or, in general, when adding the $k$ operands $X_1, X_2, \ldots, X_k$,
$$\Bigl| \sum_{j=1}^{k} X_j \Bigr|_{m_i} = \Bigl| \sum_{j=1}^{k} |X_j|_{m_i} \Bigr|_{m_i} \tag{11.5}$$
Similarly, the identity for multiplication is
$$|X \cdot Y|_{m_i} = \bigl|\, |X|_{m_i} \cdot |Y|_{m_i} \bigr|_{m_i} = |x_i \cdot y_i|_{m_i} \tag{11.6}$$
or, in general,
$$\Bigl| \prod_{j=1}^{k} X_j \Bigr|_{m_i} = \Bigl| \prod_{j=1}^{k} |X_j|_{m_i} \Bigr|_{m_i} \tag{11.7}$$

Example 11.2
We add the numbers $X = 1$ and $Y = 2$ in the $(m_2, m_1) = (3,2)$ residue
system. The representations for $X$ and $Y$ are (1,1) and (2,0), respectively.
Therefore,
$|x_2 + y_2|_{m_2} = |1 + 2|_3 = 0$
$|x_1 + y_1|_{m_1} = |1 + 0|_2 = 1$
The final result is thus (0,1), which represents the value 3. Multiplying
the two numbers $X$ and $Y$ yields
$|x_2 \cdot y_2|_{m_2} = |1 \cdot 2|_3 = 2$
$|x_1 \cdot y_1|_{m_1} = |1 \cdot 0|_2 = 0$
so that $X \cdot Y = (2,0)$, representing the value 2. □

For subtraction we define the additive inverse of a number $c$ modulo $m_i$
as follows:
$$|-c|_{m_i} = |m_i - c|_{m_i} \quad (\text{since } |m_i|_{m_i} = 0) \tag{11.8}$$
For example, $|-2|_3 = |3 - 2|_3 = 1$. In other words, the inverse of a number may
be formed by "complementing" each residue with respect to its modulus. As for
addition, the equation for subtraction is
$$|X - Y|_{m_i} = \bigl|\, |X|_{m_i} - |Y|_{m_i} \bigr|_{m_i} = |x_i - y_i|_{m_i} \tag{11.9}$$
Using the definition of the additive inverse, the term $|x_i - y_i|_{m_i}$ can be replaced
by $\bigl| x_i + |-y_i|_{m_i} \bigr|_{m_i}$. For example, subtracting $Y = 3$ from $X = 5$ in the
$(m_2, m_1) = (3,2)$ residue system yields
$|x_2 - y_2|_{m_2} = |2 - 0|_3 = |2 + 0|_3 = 2$
$|x_1 - y_1|_{m_1} = |1 - 1|_2 = |1 + 1|_2 = 0$
so that $X - Y = (2,0)$, representing the value 2.

Example 11.3
Consider the residue number system with the set of four moduli $(m_4, m_3,
m_2, m_1) = (7, 5, 3, 2)$. These moduli are pairwise relatively prime and
therefore
$$M = \text{l.c.m}(m_1, \ldots, m_4) = \prod_{i=1}^{4} m_i = 210.$$
Performing the addition and multiplication of the two operands $X = 3$
and $Y = 4$, represented by (3,3,0,1) and (4,4,1,0), respectively, yields

            (7 5 3 2)               (7 5 3 2)
      3     (3 3 0 1)         3     (3 3 0 1)
      4   + (4 4 1 0)         4   x (4 4 1 0)
      7     (0 2 1 1)        12     (5 2 0 0)

One can verify that the results (0,2,1,1) and (5,2,0,0) represent the
expected values 7 and 12, respectively. But, when the following addition is
performed,

                      (7 5 3 2)
          206         (3 1 2 0)
            7       + (0 2 1 1)
    should be 213     (3 3 0 1)

the result (3,3,0,1) represents the value 3, which satisfies $3 = |213|_{210}$, and
we clearly have an overflow situation, which is difficult to identify. □

The proof of the above equations is straightforward [7].

11.2.1 Multiplicative Inverse
The multiplicative inverse of a number $c$ modulo $m$ is a number $b$, $0 \le b \le (m - 1)$,
satisfying $|cb|_m = 1$. $b$ is denoted by $|1/c|_m$. Any number $c$ has an additive inverse
$|-c|_m$, but the multiplicative inverse $|1/c|_m$ does not always exist.
The inverse $|1/c|_m$ exists if and only if $\text{g.c.d}(c, m) = 1$ and $|c|_m \ne 0$. If these
conditions are satisfied then $|1/c|_m$ is unique. For example,

         m = 5                  m = 6
      c    |1/c|_5           c    |1/c|_6
      1       1              1       1
      2       3              2     None    g.c.d(2,6) = 2
      3       2              3     None    g.c.d(3,6) = 3
      4       4              4     None    g.c.d(4,6) = 2
                             5       5

If $m$ is a prime number, then for every possible value $c$ satisfying $1 \le c \le
m - 1$, $\text{g.c.d}(c, m) = 1$ and the multiplicative inverse exists.
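The digit-wise nature of residue arithmetic and the search for a multiplicative inverse can be sketched as follows. This is an illustrative model with hypothetical function names; the inverse is found by brute force over $0 \le b \le m - 1$, which follows the definition directly and is adequate only for small moduli.

```python
def to_residues(x: int, moduli: tuple) -> tuple:
    """Residue digits of x, one per modulus, in the order the moduli are given."""
    return tuple(x % m for m in moduli)


def add(x: tuple, y: tuple, moduli: tuple) -> tuple:
    """Digit-wise addition; no carry passes between digits."""
    return tuple((a + b) % m for a, b, m in zip(x, y, moduli))


def subtract(x: tuple, y: tuple, moduli: tuple) -> tuple:
    """Digit-wise subtraction, done by adding the additive inverse m - b."""
    return tuple((a + (m - b)) % m for a, b, m in zip(x, y, moduli))


def multiply(x: tuple, y: tuple, moduli: tuple) -> tuple:
    """Digit-wise multiplication."""
    return tuple((a * b) % m for a, b, m in zip(x, y, moduli))


def multiplicative_inverse(c: int, m: int):
    """Return b with |c*b|_m = 1 and 0 <= b <= m - 1, or None if none exists."""
    for b in range(m):
        if (c * b) % m == 1:
            return b
    return None
```

With the moduli (7, 5, 3, 2), adding the representations of 206 and 7 gives (3, 3, 0, 1), the same tuple as the representation of 3: nothing in the digits themselves signals that the result left the range of 210 values.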
11.3 THE ASSOCIATED MIXED-RADIX SYSTEM
Magnitude comparison, sign detection, and overflow detection for the residue
number system can be facilitated by converting the given residue representations
into the associated mixed-radix number system. This is a weighted number
system, with the representation for a number $X$ given by
$$X = a_N \cdot (m_{N-1} \cdot m_{N-2} \cdots m_1) + \cdots + a_3 \cdot (m_2 \cdot m_1) + a_2 \cdot m_1 + a_1 \tag{11.10}$$
with the digits $a_i$ satisfying
$$0 \le a_i < m_i; \quad i = 1, 2, \ldots, N. \tag{11.11}$$
Being a weighted number system implies that magnitude comparison is
straightforward. For example, the values 0, 1, 2, 3, 4 and 5 in the mixed-radix system
associated with the (3,2) residue system (see Table 11.1) are represented by (0,0),
(0,1), (1,0), (1,1), (2,0), and (2,1), respectively. The value of a pair $(a_2, a_1)$ in
this mixed-radix system is $2 \cdot a_2 + a_1$.

Example 11.4
In the mixed-radix system associated with the $(m_4, m_3, m_2, m_1) = (7,5,3,2)$
residue system, a number $X$ is represented by $(a_4, a_3, a_2, a_1)$, where
$$X = 30 \cdot a_4 + 6 \cdot a_3 + 2 \cdot a_2 + a_1$$
and the digits $a_i$ satisfy $0 \le a_4 < 7$, $0 \le a_3 < 5$, $0 \le a_2 < 3$ and $0 \le a_1 < 2$.
The numbers 43 and 37 are represented by (1,3,1,1) and (2,2,1,1) in the
given residue system, respectively. The corresponding representations in
the associated mixed-radix system are (1,2,0,1) and (1,1,0,1), respectively.
These last two representations can be compared, indicating that 43 is
greater than 37. □

Any two numbers in a given residue system can be compared by converting
them into the associated mixed-radix system. Converting a number represented
by $(x_N, x_{N-1}, \ldots, x_1)$ in the residue system to the associated mixed-radix
representation $(a_N, a_{N-1}, \ldots, a_1)$ is performed using the following equations [7]:
$$\begin{aligned}
a_1 &= X \bmod m_1 = x_1 \\
a_2 &= (X - a_1) \left| \frac{1}{m_1} \right| \bmod m_2 \\
a_3 &= \Bigl( (X - a_1) \left| \frac{1}{m_1} \right| - a_2 \Bigr) \left| \frac{1}{m_2} \right| \bmod m_3
\end{aligned} \tag{11.12}$$
This calculation can be done in residue arithmetic, as can be easily verified
through the following representation of the procedure in Equation (11.12):
$$Y_{i+1} = (Y_i - a_i) \left| \frac{1}{m_i} \right| \ \text{ with } \ Y_1 = X; \qquad a_i = Y_i \bmod m_i \tag{11.13}$$

Example 11.5
To convert a number $X$ represented by $(x_4, x_3, x_2, x_1)$ in the residue system
with the moduli $(m_4, m_3, m_2, m_1) = (7,5,3,2)$ to the associated
mixed-radix system, the following equations can be used:
$a_1 = X \bmod 2 = x_1$,
$a_2 = (X - a_1) \left| \frac{1}{2} \right| \bmod 3$,
$a_3 = \bigl( (X - a_1) \left| \frac{1}{2} \right| - a_2 \bigr) \left| \frac{1}{3} \right| \bmod 5$,
$a_4 = \bigl( \bigl( (X - a_1) \left| \frac{1}{2} \right| - a_2 \bigr) \left| \frac{1}{3} \right| - a_3 \bigr) \left| \frac{1}{5} \right| \bmod 7$.
It is more convenient to follow the algorithm in Equation (11.13) and
execute the conversion in the residue system. For example, we convert the
number 43 represented by (1,3,1,1) as follows:
$Y_1 = (1,3,1,1)$ and therefore, $a_1 = Y_1 \bmod 2 = x_1 = 1$.
To obtain $Y_2$ we first subtract $a_1$ from $Y_1$, yielding (0,2,0,-). Note
that only the first three digits in $Y_2$ are of interest, since $a_1$ is already
known. We then multiply by $\left| \frac{1}{2} \right|$, which equals (4,3,2,-), obtaining
$Y_2 = (0,1,0,-)$. Thus, $a_2 = Y_2 \bmod 3 = 0$. Subtracting $a_2 = 0$ yields
(0,1,-,-). Next we multiply by $\left| \frac{1}{3} \right|$, which equals (5,2,-,-), yielding
$Y_3 = (0,2,-,-)$. Therefore, $a_3 = Y_3 \bmod 5 = 2$. Subtracting $a_3 = 2$ we
get (5,-,-,-). We then multiply by $\left| \frac{1}{5} \right|_7 = 3$, yielding $Y_4 = (1,-,-,-)$.
Thus, $a_4 = 1$ and the representation of 43 in the mixed-radix system is
$(a_4, a_3, a_2, a_1) = (1,2,0,1)$. □

The mixed-radix system is useful for overflow detection as well. For this
purpose, we should add a redundant modulus $m_{N+1}$ to the basic set of $N$ moduli.
Here, the term redundant modulus means that we use only the range determined
by the original $N$ moduli. For overflow detection we convert the given
representation $(x_{N+1}, x_N, \ldots, x_1)$ to the associated mixed-radix system. If $a_{N+1} \ne 0$
then an overflow has occurred.
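The sequential conversion to mixed-radix digits, and its use for comparison and overflow detection, might be sketched as below. This is an illustrative model, not code from [7]: the function names are hypothetical, tuples are ordered most significant digit first as in the text, and the inverse $|1/m_i|$ is found by brute force separately for each remaining modulus.

```python
def inverse_mod(c: int, m: int) -> int:
    """Multiplicative inverse of c modulo m (brute force, small moduli only)."""
    for b in range(m):
        if (c * b) % m == 1:
            return b
    raise ValueError("no inverse: the moduli must be pairwise relatively prime")


def to_mixed_radix(residues: tuple, moduli: tuple) -> tuple:
    """Convert (x_N, ..., x_1) with moduli (m_N, ..., m_1) to (a_N, ..., a_1)."""
    y = list(reversed(residues))   # y[0] belongs to m_1
    m = list(reversed(moduli))
    digits = []
    for i in range(len(m)):
        a_i = y[i]                 # a_i = Y_i mod m_i
        digits.append(a_i)
        # Y_{i+1} = (Y_i - a_i) * |1/m_i|, carried out on the remaining digits
        for j in range(i + 1, len(m)):
            y[j] = ((y[j] - a_i) * inverse_mod(m[i] % m[j], m[j])) % m[j]
    return tuple(reversed(digits))


def mixed_radix_value(digits: tuple, moduli: tuple) -> int:
    """Weighted value a_N*(m_{N-1}...m_1) + ... + a_2*m_1 + a_1."""
    value, weight = 0, 1
    for a, m in zip(reversed(digits), reversed(moduli)):
        value += a * weight
        weight *= m
    return value


def overflowed(residues: tuple, moduli: tuple) -> bool:
    """moduli[0] is the redundant modulus; overflow iff its digit is nonzero."""
    return to_mixed_radix(residues, moduli)[0] != 0
```

Because the mixed-radix system is weighted, the digit tuples returned for 43 and 37, (1, 2, 0, 1) and (1, 1, 0, 1), can be compared position by position from the left. The redundant modulus 11 used in the test is an arbitrary choice for illustration; any modulus relatively prime to the others would do.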
11.4 CONVERSION OF NUMBERS FROM/TO THE RESIDUE SYSTEM
If the moduli $m_i$ are pairwise relatively prime we can use the Chinese Remainder
Theorem in order to convert a number in the residue system to the conventional
number system. This theorem states that
$$|X|_M = \Bigl| \sum_{j=1}^{N} \hat{m}_j \left| \frac{x_j}{\hat{m}_j} \right|_{m_j} \Bigr|_M \tag{11.14}$$
where $\hat{m}_j = \frac{M}{m_j}$, $M = \prod_{j=1}^{N} m_j$, and all the values of $m_j$ are pairwise relatively
prime. The proof is found in [7].

Example 11.6
For the residue number system with the three pairwise relatively prime
moduli $(m_3, m_2, m_1) = (7, 3, 2)$, the range includes $M = 42$ numbers.
Given a representation $(x_3, x_2, x_1) = (0, 2, 1)$, we wish to find $X$.
$$\hat{m}_1 = \frac{M}{m_1} = \frac{42}{2} = 21; \quad \hat{m}_2 = \frac{42}{3} = 14; \quad \hat{m}_3 = \frac{42}{7} = 6$$
$$\left| \frac{1}{\hat{m}_1} \right|_{m_1} = \left| \frac{1}{21} \right|_2 = 1, \quad
\left| \frac{1}{\hat{m}_2} \right|_{m_2} = \left| \frac{1}{14} \right|_3 = 2, \quad
\left| \frac{1}{\hat{m}_3} \right|_{m_3} = \left| \frac{1}{6} \right|_7 = 6$$
Therefore, $|X|_{42} = |36 \cdot x_3 + 28 \cdot x_2 + 21 \cdot x_1|_{42} = |36 \cdot 0 + 28 \cdot 2 + 21 \cdot 1|_{42} =
|77|_{42} = 35$. The coefficients 36, 28, and 21 are constants that can be
computed once and stored. □

An alternate form for the equation for converting a number in the residue
system to decimal is
$$|X|_M = |A_3 x_3 + A_2 x_2 + A_1 x_1|_M \tag{11.15}$$
where $A_i$ is the "weight" of the digit $x_i$. Therefore, $A_3$ is the value of (1,0,0),
$A_2$ is the value of (0,1,0) and $A_1$ is the value of (0,0,1). For the residue system
$(m_3, m_2, m_1) = (7, 3, 2)$ these values are 36, 28, and 21, respectively, yielding the
exact same expression as in the previous example.

11.4.1 Conversion from Binary to the Residue System
If the operands are given in the conventional binary system, we can convert
them directly into the residue system. Given $X = \sum_{j=0}^{n-1} x_j 2^j$ with $x_j \in \{0, 1\}$,
then
$$|X|_m = \Bigl| \sum_{j=0}^{n-1} x_j\, |2^j|_m \Bigr|_m \tag{11.16}$$
The terms $|2^j|_m$ can be precalculated and stored in a table.

Example 11.7
To find $|1101101|_3$, we first generate a table of $|2^j|_3$, yielding
$|2^0|_3 = 1$,
$|2^1|_3 = 2$,
$|2^2|_3 = 1$,
$|2^3|_3 = 2$, ...
Therefore, $|1101101|_3 = |2^6 + 2^5 + 2^3 + 2^2 + 2^0|_3 = |1 + 2 + 2 + 1 + 1|_3 = 1$. □

11.5 SELECTING THE MODULI
We may have different objectives when selecting the moduli. If our objective
is to reduce the execution time of addition and multiplication, then a large
number of small moduli is desirable, since the execution time of these operations
is determined by the largest modulus. However, a large number of small moduli
will lengthen the time for converting residue numbers to the associated mixed-
radix system, since this conversion is a sequential procedure in which the number
of steps is determined by the number of moduli. Such conversions are necessary
for magnitude comparison, sign detection, or overflow detection.
Another consideration when selecting moduli is the fact that the residues
would normally be coded in some binary code, and the arithmetic operations on
the residues would be executed on their corresponding binary representations.
We therefore have the following objectives:
1. Efficient binary representation to minimize the total number of bits.
2. Convenient binary coding to simplify the execution of arithmetic
operations.
The smallest number of bits needed to represent the residue digit for the
modulus $m_i$ is $\lceil \log_2 m_i \rceil$. Hence, to maximize the representation (storage)
efficiency, we prefer to select an $m_i$ that equals $2^k$ or is very close to it; e.g., $(2^k - 1)$.
Clearly, we can select only one $m_i$ of the form $2^k$ and still have relatively prime
moduli. We may then, in addition to $2^k$, select $(2^k - 1)$ and a few other moduli
of the form $(2^l - 1)$. However, not all terms of the form $(2^l - 1)$ may be selected,
since $2^k - 1 = (2^{k/2} - 1)(2^{k/2} + 1)$ for even values of $k$. Thus, $(2^k - 1)$ and
$(2^{k/2} - 1)$ are not relatively prime. $(2^k - 1)$ is also factorable for some odd values
of $k$. The selected moduli should be as close as possible to one another to avoid
very large moduli, which would increase the execution time.
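The selection criteria above can be checked mechanically for a candidate set of moduli. The following sketch uses hypothetical helper names and, as sample data, the set 32, 31, 15, 7 built from $2^5$, $2^5 - 1$, $2^4 - 1$ and $2^3 - 1$; it tests pairwise relative primality and counts the bits needed when each residue digit is stored in $\lceil \log_2 m_i \rceil$ bits.

```python
from itertools import combinations
from math import gcd, prod


def pairwise_coprime(moduli: tuple) -> bool:
    """True if every pair of moduli has 1 as its greatest common divisor."""
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))


def residue_bits(moduli: tuple) -> int:
    """Total storage when each digit uses ceil(log2(m)) bits (m >= 2)."""
    # (m - 1).bit_length() equals ceil(log2(m)) without floating point.
    return sum((m - 1).bit_length() for m in moduli)


def dynamic_range(moduli: tuple) -> int:
    """Number of representable values M for pairwise relatively prime moduli."""
    return prod(moduli)
```

For the sample set the digits occupy 17 bits and the range is 104160 values, which exceeds $2^{16}$, so little storage is wasted. By contrast, `pairwise_coprime((15, 3))` is false, illustrating why $2^k - 1$ and $2^{k/2} - 1$ cannot both be chosen.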
Example 11.8
Consider the four moduli $32 = 2^5$, $31 = (2^5 - 1)$, $15 = (2^4 - 1)$, and
$7 = (2^3 - 1)$. The total number of bits required for their representation
is 5 + 5 + 4 + 3 = 17 bits. These four moduli are relatively prime, and thus,
$$M = 2^5 (2^5 - 1)(2^4 - 1)(2^3 - 1) = 2^{17} - 2^{14} - 2^{13} - \cdots > 2^{16}$$
Any binary coding of $2^{16}$ numbers requires at least 16 bits. Therefore,
these four moduli yield a very efficient coding. □

For moduli of the form $2^k$, an ordinary binary adder can be used, in
which case the additive inverse is simply the two's complement. For $(2^k - 1)$, an
adder with end-around carry can be used, and the additive inverse is the one's
complement.

Example 11.9
If $m = 2^l - 1$, the additive inverse of the digit $c$ is $m - c = (2^l - 1) - c$,
which equals the one's complement of $c$. Suppose $l = 3$ and, as a result,
the modulus is 7. Also assume the conventional binary coding for the
residue digits. If we wish to subtract 4 from 6, we instead add the one's
complement of 4 to 6, yielding

        110     6
      + 011     one's complement of 100 = 4_10
      1 001
          1     End-around carry
        010
□

For moduli different from $2^k$ or $(2^k - 1)$, look-up tables must be used. For
example, the addition and multiplication tables for $m = 5$ are depicted in Table
11.2. Each of these tables is of size $2^6 \times 3 = 64 \times 3$ bits.

      +   0  1  2  3  4          x   0  1  2  3  4
      0   0  1  2  3  4          0   0  0  0  0  0
      1   1  2  3  4  0          1   0  1  2  3  4
      2   2  3  4  0  1          2   0  2  4  1  3
      3   3  4  0  1  2          3   0  3  1  4  2
      4   4  0  1  2  3          4   0  4  3  2  1

TABLE 11.2 Modulo 5 addition and multiplication tables.

In most cases, the conventional binary coding for the digit is used. This
is not really necessary and, for $m = 5$, for example, we may select the coding
shown in Table 11.3. The pairs 1 and 4, and 2 and 3, are additive inverse pairs
and also one's complements.

      Digit    Binary Code
        0      000 or 111
        1      001
        2      010
        3      101
        4      110

TABLE 11.3 Alternate binary coding for residues modulo 5.

11.6 ERROR DETECTION AND CORRECTION
Two subjects are discussed in this section. The first is the use of residue
arithmetic to detect and possibly correct errors when performing arithmetic
operations on numbers represented in the conventional number systems. The second
is the use of redundant moduli in a residue system to allow detection and
possibly correction of errors while performing arithmetic operations in the residue
system.

11.6.1 Error Codes for Conventional Number Systems
Arithmetic error codes are those codes that are preserved under arithmetic
operations. This property enables the detection of errors immediately after the
completion of the arithmetic operation. Such concurrent error detection can
always be attained by duplicating the arithmetic processor. This method, however,
is too costly.
We say that an error code is preserved under an arithmetic operation $\star$ if
for any two operands $X$ and $Y$, and the corresponding encoded entities $X'$ and
$Y'$, there is an operation $\circledast$ for the coded operands satisfying
$$X' \circledast Y' = (X \star Y)' \tag{11.17}$$
Error codes to be used in an arithmetic unit should be examined for
implementation costs and effectiveness. By costs we mean both hardware cost and
execution time cost (the additional delay due to the need to encode the operands
and check the result). By effectiveness we mean fault coverage, which is defined
as the percentage of possible faults that will be detected (weighted percentage
considering the probability of the different faults). Single-bit faults clearly have
a higher probability than multiple-bit faults, and we would like to make sure that
all of them are detected by the checking scheme. Note, however, that a single
error in an operand or an intermediate result may cause a multiple-digit error in
the final result. For example, when adding two binary numbers, if stage $i$ of the
adder is faulty, all the remaining $(n - i)$ higher order digits may be erroneous.
There are two classes of arithmetic codes: the separate codes and the
nonseparate codes. In the separate codes the data and check bits are completely
separated, allowing us to use the data bits immediately with no decoding. We
start with the simplest nonseparate codes, which are the AN-codes [1]. These
codes are formed by multiplying the operands by a constant $A$. In other words,
$X'$ in Equation (11.17) is $A \cdot X$, and the operations $\star$ and $\circledast$ are identical. For
example, if $A = 3$ we multiply each operand by 3 (obtained as $2X + X$) and check
the result of any arithmetic operation to see whether it is an integer multiple of
3. All error magnitudes that are multiples of $A$ are undetectable. Therefore, we
should not select a value of $A$ that is a power of the radix 2. An odd value of
$A$ will detect every single digit fault, since such an error has a magnitude of $2^i$.
$A = 3$ provides the least expensive AN code that still enables the detection of
all single errors.

Example 11.10
The number $0110_2 = 6_{10}$ is represented in the AN code with $A = 3$ by
$010010_2 = 18_{10}$. A fault in bit position $2^3$ may result in the erroneous
number $011010_2 = 26_{10}$. This error is easily detectable, since 26 is not a
multiple of 3. □

The simplest separate codes are the residue code and the inverse residue
code. In each of these we attach a separate check symbol $C(X)$ to every operand
$X$. For the residue code, $C(X) = X \bmod A$, where $A$ is called the check modulus.
For the inverse residue code, $C(X) = A - (X \bmod A)$. For both separate codes,
Equation (11.17) is replaced by
$$C(X) \circledast C(Y) = C(X \star Y). \tag{11.18}$$
This equation clearly holds for addition, multiplication, and subtraction (see
Equations (11.4), (11.6), and (11.9), respectively). For division, the equation
$X - S = Q \cdot D$ is satisfied where $X$ is the dividend, $D$ is the divisor, $Q$ is the
quotient, and $S$ is the remainder. The corresponding residue check is therefore
$$\bigl|\, |X|_A - |S|_A \bigr|_A = \bigl|\, |Q|_A \cdot |D|_A \bigr|_A.$$
For example, if $A = 3$, $X = 7$ and $D = 5$, the results are $Q = 1$ and $S = 2$. The
corresponding residue check is: $\bigl|\, |7|_3 - |2|_3 \bigr|_3 = \bigl|\, |5|_3 \cdot |1|_3 \bigr|_3 = 2$.
A residue code with $A$ as a check modulus has the same exact undetectable
error magnitudes as the corresponding AN code. For example, if $A = 3$, only
errors that modify the result by some multiple of 3 will go undetected, and
consequently, single-bit errors are always detectable. In addition, the checking
algorithms for the AN code and the residue code are the same; in both we have
to compute the residue of the result modulo $A$. Even the increase in word length,
$\lceil \log_2 A \rceil$, is the same for both codes. The most important difference is due to the
property of separateness. The arithmetic unit for the check symbol $C(X)$ in the
residue code is completely separate from the main unit operating on $X$, while
only a single unit (of a higher complexity) exists in the case of the AN code.
An adder with a residue code is depicted in Figure 11.1. In the error detection
block shown in this figure the residue modulo $A$ of the $X + Y$ input is calculated
and compared to the other input. A mismatch results in an error indication.

FIGURE 11.1 An adder with a separate residue check.

The AN and residue codes with $A = 3$ are the simplest examples of a class
of arithmetic (separate and nonseparate) codes which use a value of $A$ of the form
$A = 2^a - 1$, with $a$ being an integer [1]. This choice simplifies the calculation of
the remainder when dividing by $A$ (which is needed for the checking algorithm),
and it is the reason that these codes are called low-cost arithmetic codes. The
calculation of the remainder when dividing by $2^a - 1$ is simple, because the
equation
$$|z_i r^i|_{r-1} = |z_i|_{r-1}, \quad r = 2^a, \tag{11.19}$$
allows the use of modulo $(2^a - 1)$ summation of the groups of size $a$ bits that
compose the number (each group has a value $0 \le z_i \le 2^a - 1$).

Example 11.11
To calculate the remainder when dividing the number $X = 11110101011$
by $A = 7 = 2^3 - 1$, we partition $X$ into groups of size 3, starting with the
least significant bit. This yields $X = (z_3, z_2, z_1, z_0) = (11, 110, 101, 011)$.
We then add these groups modulo 7; i.e., we "cast out" 7's and add the
end-around-carry whenever necessary. A carry-out has a weight of 8 and
since $|8|_7 = 1$ we must add an end-around-carry whenever there is a carry-
out as illustrated below.

         11     z3
      + 110     z2
      1 001
      +   1     end-around carry
        010
      + 101     z1
        111
      + 011     z0
      1 010
      +   1     end-around carry
        011

The residue modulo 7 of $X$ is 3, which is the correct remainder of $X =
1963_{10}$ when divided by 7. □
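The group-summation procedure used for low-cost codes can be modeled in a few lines. The sketch below is an illustration with a hypothetical function name: it adds the $a$-bit groups of $X$ one at a time and folds every carry-out back in as an end-around carry, relying on $|2^a|_{2^a - 1} = 1$.

```python
def residue_mod_2a_minus_1(x: int, a: int) -> int:
    """Residue of a nonnegative x modulo 2**a - 1 via end-around-carry sums."""
    if x < 0 or a < 1:
        raise ValueError("expects x >= 0 and a >= 1")
    mask = (1 << a) - 1            # all-ones group; also the modulus 2**a - 1
    total = 0
    while x:
        total += x & mask          # add the next a-bit group z_i
        if total > mask:           # carry-out of weight 2**a, worth 1 mod 2**a - 1
            total = (total & mask) + 1
        x >>= a
    # The all-ones pattern is the second representation of zero.
    return 0 if total == mask else total
```

For the operand of the worked example, `residue_mod_2a_minus_1(0b11110101011, 3)` returns 3. The final test for the all-ones pattern is needed because an end-around-carry adder has two codes for zero.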
may modify the procedure so that two's complement (with $R = 2^n$) can also be
employed:
$$(2^n - X) \bmod A = (2^n - 1 - X + 1) \bmod A = (2^n - 1 - X) \bmod A + |1|_A \tag{11.20}$$
We, therefore, need to add a correction term $|1|_A$ to the residue code when forming
the two's complement. Note that $A$ must still be a factor of $2^n - 1$. A similar
correction is needed when we add operands represented in two's complement and
a carry-out (of weight $2^n$) is generated in the main adder. Such a carry-out is
discarded according to the rules of two's complement arithmetic. To compensate
for this, we need to subtract $|2^n|_A$ from the residue check. Since $A$ is a factor of
$(2^n - 1)$, the term $|2^n|_A$ is equal to $|1|_A$.
These modifications result in an interdependence between the main
arithmetic unit and the check unit that operates on the residues. Such an
interdependence may cause a situation where an error from the main unit propagates
to the check unit and the effect of the fault is masked. However, it has been
proven in [2] that the occurrence of a single-bit error is always detectable.
Both separate and nonseparate codes are preserved when we perform
arithmetic operations on unsigned operands. If we wish to include signed operands as
well, we must require that the code be complementable with respect to $R$ where
$R$ is either $2^n$ or $2^n - 1$ (where $n$ is the number of bits in the encoded operand).
The selected $R$ will determine whether two's complement or one's complement
arithmetic will be employed. The original operand is complementable with
respect to $M$ (as before, $M$ is either $2^m$ or $2^m - 1$, but with $m < n$). Hence, for
the AN code, the equation $R - AX = A(M - X)$ must be satisfied, yielding
$R = AM$; i.e., $A$ must be a factor of $R$. If we insist on $A$ being odd, it excludes
the choice $R = 2^n$. Thus, only one's complement can be used, with $A$ being a
factor of $2^n - 1$.
Example 11.12
For $n = 4$, $R$ is equal to $2^n - 1 = 15$ for one's complement, and is divisible
by $A$ for the AN code with $A = 3$. The number $X = 0110$ is represented
by $3X = 010010$, and its one's complement $101101$ ($= 45_{10}$) is divisible
by 3. However, the two's complement of $3X$ is $101110$ ($= 46_{10}$) and is
not divisible by 3. If $n = 5$, then for one's complement $R$ is equal to 31,
which is not divisible by $A$. The number $X = 00110$ is represented by
$3X = 0010010$, and its one's complement is $1101101$ ($= 109_{10}$), which is
not divisible by 3. □
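The divisibility checks in Example 11.12 can be reproduced with a short sketch. The function name `complement_divisibility` is hypothetical, and the sketch makes one assumption about how the example is meant to be read: the complements are taken over the codeword widths that appear in the example (6 and 7 bits).

```python
def complement_divisibility(x: int, a_factor: int, width: int):
    """Complement the AN codeword a_factor * x over `width` bits.

    Returns (ones, ones_ok, twos, twos_ok), where the *_ok flags say
    whether each complement is still a multiple of a_factor.
    """
    mask = (1 << width) - 1
    coded = a_factor * x
    if coded > mask:
        raise ValueError("codeword does not fit in the given width")
    ones = coded ^ mask        # one's complement: (2**width - 1) - coded
    twos = (-coded) & mask     # two's complement: 2**width - coded
    return ones, ones % a_factor == 0, twos, twos % a_factor == 0


# 6-bit codeword 010010: only the one's complement stays a multiple of 3.
print(complement_divisibility(0b0110, 3, 6))   # (45, True, 46, False)
# 7-bit codeword 0010010: the one's complement 109 is not a multiple of 3.
print(complement_divisibility(0b00110, 3, 7))  # (109, False, 110, False)
```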
Example 11.13
For the residue code with $A = 7$ and $n = 6$, $R = 2^6 = 64$ for two's
complement and $R - 1 = 63$ is divisible by 7. The number $001010 = 10_{10}$
has the residue 3 modulo 7. The two's complement of $001010$ is $110110$.
The complement of $|3|_7$ is $|4|_7$ and adding the correction term $|1|_7$ yields
5, which is the correct residue modulo 7 of $110110$ ($= 54_{10}$).
If we now add to $X = 110110$ (in two's complement) the number
$Y = 001101$, a carry-out is generated and discarded. We must therefore
subtract the correction term $|2^6|_7 = |1|_7$ from the residue check with the
modulus $A = 7$, obtaining

       110110 = X
   +   001101 = Y
     1 000011

          101 = |X|_7
   +      110 = |Y|_7
        1 011
   +        1    end-around carry
          100
   -        1    correction term
          011

3 is the correct residue of the result $000011$ modulo 7. □
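The addition in Example 11.13 can be mirrored by the following sketch, in which the main adder and the residue checker are modeled separately. The function name `checked_twos_add` is hypothetical, and the sketch assumes that $A = 2^a - 1$ is a factor of $2^n - 1$, so that a discarded carry-out of weight $2^n$ corresponds to subtracting 1 from the check.

```python
def checked_twos_add(x: int, y: int, n: int, a: int):
    """Add two n-bit two's complement words and update the residue check.

    Returns (result, check, consistent) where `check` is the residue
    predicted by the check unit and `consistent` tells whether it agrees
    with the residue of the main adder's n-bit result.
    """
    A = (1 << a) - 1
    total = x + y
    carry_out = total >> n                 # 1 if a carry-out was generated
    result = total & ((1 << n) - 1)        # the carry-out is discarded
    check = (x % A + y % A) % A            # residue addition modulo A
    if carry_out:
        check = (check - 1) % A            # subtract |2**n|_A = |1|_A
    return result, check, result % A == check


print(checked_twos_add(0b110110, 0b001101, 6, 3))  # (3, 3, True)
```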
For the residue code with the check modulus $A$, the equation $A - C(X) =
(R - X) \bmod A$ must be satisfied. This implies that $R$ must be an integer multiple
of $A$, again allowing only one's complement arithmetic to be used. However, we
Error correction can be achieved by using two or more residue checks.
The simplest case is the biresidue code, which consists of two residue checks
$A_1$ and $A_2$. If $A_1 = 2^a - 1$ and $A_2 = 2^b - 1$ are two low-cost residue checks with
$n = \mathrm{l.c.m.}(a, b)$, where $n$ is the number of bits in the operands, then any single-bit
error can be corrected [5].
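A small sketch may help to see why two low-cost checks suffice. It is not the construction of [5], only an illustration under stated assumptions: the names `build_syndrome_table` and `correct_single_bit` are hypothetical, the two check symbols are assumed to be error-free, and a single-bit error is modeled as adding $+2^i$ or $-2^i$ to the operand. The sketch tabulates the pair of residues (the syndrome) of every such error and refuses to continue if two errors share a syndrome.

```python
from math import gcd


def build_syndrome_table(a: int, b: int):
    """Map each syndrome to the single-bit error (+/- 2**i) that causes it."""
    A1, A2 = (1 << a) - 1, (1 << b) - 1
    n = a * b // gcd(a, b)                  # operand length n = l.c.m.(a, b)
    table = {}
    for i in range(n):
        for err in (1 << i, -(1 << i)):     # bit flipped 0->1 or 1->0
            syndrome = (err % A1, err % A2)
            if syndrome in table:
                raise ValueError("two single-bit errors share a syndrome")
            table[syndrome] = err
    return A1, A2, n, table


def correct_single_bit(received: int, c1: int, c2: int, a: int, b: int) -> int:
    """Correct at most one flipped bit in `received` using checks c1 and c2."""
    A1, A2, _, table = build_syndrome_table(a, b)
    syndrome = ((received - c1) % A1, (received - c2) % A2)
    if syndrome == (0, 0):
        return received                     # no error detected
    return received - table[syndrome]       # KeyError if not a single-bit error


x = 0b101100111010                          # 12-bit operand, a = 3, b = 4
c1, c2 = x % 7, x % 15
print(correct_single_bit(x ^ (1 << 5), c1, c2, 3, 4) == x)  # True
```

With $a = 3$ and $b = 4$ the table holds 24 distinct syndromes, one for each of the 12 bit positions and each direction of the flip.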
11.6.2 Error Codes for the Residue Number System
11.7. Given a number $X$ and its residue modulo 3, $C(X) = |X|_3$. How will the residue
change when $X$ is shifted by one bit position to the left if the shifted-out bit is
0? Repeat this for the case where the shifted-out bit is 1. Verify your rule for
$X = 011101$ shifted five times to the left.
The residue system is inherently more fault-tolerant than the conventional number
system. The lack of interaction among the residue digits (no carry-propagation)
implies that a fault in a single digit will not result in errors in other digits. This
desirable fault isolation property is preserved while performing addition,
subtraction, and multiplication, but is not preserved while other operations are
performed. Another consequence of the fault isolation property is that a
consistently erroneous residue digit, when identified, can be disconnected and the rest
of the residue arithmetic unit can still be used.
The fault-tolerance feature will be manifested only if redundant moduli
are added to the original set of moduli, allowing the detection of errors and
even the identification of faulty residue digit circuits. The resulting system is
called a redundant residue number system and is defined by a set of $N + L$
moduli $(m_{N+L}, \ldots, m_{N+1}, m_N, \ldots, m_1)$. The $L$ moduli $m_{N+L}, \ldots, m_{N+1}$ are
the redundant moduli, implying that out of the total range $[0, M_T - 1]$ (where
$M_T = \prod_{i=1}^{N+L} m_i$), $[0, M - 1]$ (with $M = \prod_{i=1}^{N} m_i$) is the legitimate range. The
range $[M, M_T - 1]$ is called the illegitimate range. It has been shown [6] that a
single error always moves the operand from the legitimate range to the illegitimate
range and can therefore be easily identified.
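The detection property can be tried out exhaustively on a small system. The sketch below is an illustration only: the function names and the moduli $(3, 4, 5)$ with one redundant modulus 7 are assumptions chosen for the example, and the redundant modulus is deliberately taken to be larger than every other modulus. The conversion back to an integer uses a simple successive search rather than an optimized Chinese Remainder Theorem formula.

```python
def to_residues(x, moduli):
    """Residue digits of x for the given moduli."""
    return [x % m for m in moduli]


def from_residues(residues, moduli):
    """Smallest non-negative integer with the given residue digits.

    Assumes the moduli are pairwise relatively prime.
    """
    x, step = 0, 1
    for r, m in zip(residues, moduli):
        while x % m != r:      # step is relatively prime to m, so this ends
            x += step
        step *= m
    return x


def single_errors_detected(info_moduli, redundant_moduli):
    """True if every single-digit error leaves the legitimate range [0, M - 1]."""
    moduli = list(info_moduli) + list(redundant_moduli)
    M = 1
    for m in info_moduli:
        M *= m
    for x in range(M):
        digits = to_residues(x, moduli)
        for pos, m in enumerate(moduli):
            for wrong in range(m):
                if wrong == digits[pos]:
                    continue
                corrupted = digits[:pos] + [wrong] + digits[pos + 1:]
                if from_residues(corrupted, moduli) < M:
                    return False   # error landed inside the legitimate range
    return True


print(single_errors_detected((3, 4, 5), (7,)))  # True
```

If the redundant modulus is too small, the check fails; for instance `single_errors_detected((5, 7), (2,))` returns `False`, because an error in the digit for modulus 7 can turn 0 into 10, which is still below $M = 35$.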
11.8. The calculation of the remainder when dividing by $A = 2^a - 1$ can be done in
parallel rather than in serial manner. Show a block diagram of such a parallel
circuit for 32-bit long numbers and $A = 15$.
11.9. Show that a residue check with the modulus $A = 2^a - 1$ can detect all errors in
a group of $a - 1$ (or less) adjacent bits. Such errors are called burst errors of
length $a - 1$ (or less) and they may occur when shifting an operand by several
bit positions.
11.10. Prove that $|z r^i|_{r-1} = |z|_{r-1}$ for $r = 2^a$ and $0 \le z \le 2^a - 1$.
11.11. When performing a divide operation with AN coded operands the quotient $Q$
must be multiplied by $A$. Find a simple algorithm for executing this multiplication
when $A = 2^a - 1$. Illustrate your algorithm for $A = 15$.
11.8 REFERENCES
11.1. Verify that there are only four different representations in the residue number
system with $(m_2, m_1) = (4, 2)$.
11.2. Given the set of moduli (7, 5, 3), find: (a) the range $M$, (b) the coefficients for
the Chinese Remainder Theorem and the value represented by (2, 3, 2), (c) the
corresponding mixed-radix representation, (d) the representation of 20 in the
residue system and in the mixed-radix system.
11.3. Prove the Chinese Remainder Theorem by calculating the remainder modulo $m_i$
of Equation (11.14), knowing that every number $X$ has a unique representation
in the residue system.
11.4. Write the subtraction table for $m = 5$ first in decimal representation and then
in binary representation using (a) the ordinary binary coding with 000 through
100 (for the digits 0 through 4, respectively), (b) the binary code in Table 11.3.
What is the advantage of the second scheme?
11.5. Write the rule for converting a decimal number to the residue system using a
table of $|10^j|_m$. How is this rule simplified for the case $m = 9$?
11.6. Divide 35 by 5 in the residue system with $(m_3, m_2, m_1) = (7, 3, 2)$. Can you
divide 35 by 7? 31 by 5? What are the conditions under which division can
easily be carried out?
[1] A. AVIZIENIS, "Arithmetic error codes: Cost and effectiveness studies for
application in digital system design," IEEE Trans. on Computers, C-20 (Nov. 1971),
1322-1331.
[2] A. AVIZIENIS, "Arithmetic algorithms for error-coded operands," IEEE Trans. on
Computers, C-22 (June 1973), 567-572.
[3] Y. MA, IEEE Trans. on Computers, 47 (March 1998), 333-337.
[4] F. POURBIGHARAZ and H. M. YASSINE, "A signed-digit architecture for residue to
binary transformation," IEEE Trans. on Computers, 46 (Oct. 1997), 1146-1150.
[5] T. R. N. RAO, "Biresidue error-correcting codes for computer arithmetic," IEEE
Trans. on Computers, C-19 (May 1970), 398-402.
[6] M. A. SODERSTRAND, W. K. JENKINS, G. A. JULLIEN, and F. J. TAYLOR, Residue
number system arithmetic: Modern applications in digital signal processing, IEEE
Press, New York, 1986.
[7] N. S. SZABO and R. I. TANAKA, Residue arithmetic and its applications to computer
technology, McGraw-Hill, New York, 1967.
[8] F. J. TAYLOR, "Residue arithmetic: A tutorial with examples," IEEE Computer,
17 (May 1984), 50-62.
[9] R. ZIMMERMANN, "Efficient VLSI implementation of modulo ($2^n \pm 1$) addition
and multiplication," Proc. of 14th Symp. on Computer Arithmetic (April 1999),
158-167.
11.7 EXERCISES