

COMPUTER ARITHMETIC ALGORITHMS
Second Edition

Israel Koren
University of Massachusetts, Amherst

A K Peters
Natick, Massachusetts
Editorial, Sales, and Customer Service Office
A K Peters, Ltd.
63 South Avenue
Natick, MA 01760
www.akpeters.com

Copyright © 2002 by A K Peters, Ltd.

All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.

Library of Congress Cataloging-in-Publication Data
Koren, Israel, 1945-
Computer arithmetic algorithms / Israel Koren.-2nd ed.
p. cm.
Includes bibliographical references and index.
ISBN 1-56881-160-8
1. Computer arithmetic. 2. Computer algorithms. I. Title.
QA76.9.C62 K67 2001
005.1-dc21
2001045837

Figures 4.3, 4.4, 4.5, and 4.7 are reprinted from D.J. Kuck, The Structure of Computers and Computations, Vol. 1 (copyright © 1978 John Wiley) by permission of John Wiley & Sons, Inc. Table 2.2 and Figures 5.32, 6.11, 6.13, 6.20, 6.23, 7.5, 7.7, 7.13, and 10.2 are reprinted by permission of the Institute of Electrical and Electronics Engineers (IEEE). Table 4.3 is reprinted by permission of the IBM Systems Journal.

The first edition of this book was published by Prentice-Hall, Inc.

Printed in Canada
06 05 04 03 02    10 9 8 7 6 5 4 3 2 1

This book is dedicated to my wife, Zahava, my sons, Yuval and Yaron, and to the memory of my parents, Jacob and Dvora.
CONTENTS

FOREWORD TO THE SECOND EDITION
PREFACE

1 CONVENTIONAL NUMBER SYSTEMS
  1.1 The Binary Number System
  1.2 Machine Representations of Numbers
  1.3 Radix Conversions
  1.4 Representations of Negative Numbers
  1.5 Addition and Subtraction
  1.6 Arithmetic Shift Operations
  1.7 Exercises
  1.8 References

2 UNCONVENTIONAL FIXED-RADIX NUMBER SYSTEMS
  2.1 Negative-Radix Number Systems
  2.2 A General Class of Fixed-Radix Number Systems
  2.3 Signed-Digit Number Systems
  2.4 Binary SD Numbers
  2.5 Exercises
  2.6 References

3 SEQUENTIAL ALGORITHMS FOR MULTIPLICATION AND DIVISION
  3.1 Sequential Multiplication
  3.2 Sequential Division
  3.3 Nonrestoring Division
  3.4 Square Root Extraction
  3.5 Exercises
  3.6 References

4 BINARY FLOATING-POINT NUMBERS
  4.1 Preliminaries
  4.2 Floating-Point Operations
  4.3 Choice of Floating-Point Representation
  4.4 The IEEE Floating-Point Standard
  4.5 Round-off Schemes
  4.6 Guard Digits
  4.7 Floating-Point Adders
  4.8 Exceptions
  4.9 Round-off Errors and Their Accumulation
  4.10 Exercises
  4.11 References

5 FAST ADDITION
  5.1 Ripple-Carry Adders
  5.2 Carry-Look-Ahead Adders
  5.3 Conditional Sum Adders
  5.4 Optimality of Algorithms and Their Implementations
  5.5 Carry-Look-Ahead Addition Revisited
  5.6 Prefix Adders
  5.7 Ling Adders
  5.8 Carry-Select Adders
  5.9 Carry-Skip Adders
  5.10 Hybrid Adders
  5.11 Carry-Save Adders
  5.12 Pipelining of Arithmetic Operations
  5.13 Exercises
  5.14 References

6 HIGH-SPEED MULTIPLICATION
  6.1 Reducing the Number of Partial Products
  6.2 Implementing Large Multipliers Using Smaller Ones
  6.3 Accumulating the Partial Products
  6.4 Alternative Techniques for Partial Product Accumulation
  6.5 Fused Multiply-Add Unit
  6.6 Array Multipliers
  6.7 Optimality of Multiplier Implementations
  6.8 Exercises
  6.9 References

7 FAST DIVISION
  7.1 SRT Division
  7.2 High-Radix Division
  7.3 Speeding Up the Division Process
  7.4 Array Dividers
  7.5 Fast Square Root Extraction
  7.6 Exercises
  7.7 References

8 DIVISION THROUGH MULTIPLICATION
  8.1 Division by Convergence
  8.2 Division by Reciprocation
  8.3 Exercises
  8.4 References
9 EVALUATION OF ELEMENTARY FUNCTIONS
  9.1 The Exponential Function
  9.2 The Logarithm Function
  9.3 The Trigonometric Functions
  9.4 The Inverse Trigonometric Functions
  9.5 The Hyperbolic Functions
  9.6 Bounds on the Approximation Error
  9.7 Speed-up Techniques
  9.8 Other Techniques for Evaluating Elementary Functions
  9.9 Exercises
  9.10 References

10 LOGARITHMIC NUMBER SYSTEMS
  10.1 Sign-Logarithm Number Systems
  10.2 Arithmetic Operations
  10.3 Comparison to Binary Floating-Point Numbers
  10.4 Conversions to/from Conventional Representations
  10.5 Exercises
  10.6 References

11 THE RESIDUE NUMBER SYSTEM
  11.1 Preliminaries
  11.2 Arithmetic Operations
  11.3 The Associated Mixed-Radix System
  11.4 Conversion of Numbers from/to the Residue System
  11.5 Selecting the Moduli
  11.6 Error Detection and Correction
  11.7 Exercises
  11.8 References

INDEX

FOREWORD TO THE SECOND EDITION

This edition includes several new sections as well as many amendments and corrections made since the first edition in 1993. The new sections include floating-point adders, floating-point exceptions, general carry-look-ahead adders, prefix adders, Ling adders, and fused multiply-add units. New algorithms and implementations have been added to almost all chapters. My thanks to the students and readers whose discoveries of errors, and general comments, are reflected in this volume.

Since the first edition, a web page was created for Computer Arithmetic Algorithms, which contains updates on the book, solutions to selected problems, and relevant links. It can be found at http://www.ecs.umass.edu/ece/koren/arith. Additionally, there is now an on-line, JavaScript-based simulator for many of the algorithms contained in this book. A useful tool for both students and practitioners in the field, the Java-based simulator can be found at http://www.ecs.umass.edu/ece/koren/arith/simulator. I very much welcome further comments and suggestions; please e-mail them to me at koren@ecs.umass.edu.

Israel Koren
Amherst, Massachusetts
July 2001
PREFACE

The goal of this textbook is to explain the fundamental principles of algorithms available for performing arithmetic operations in digital computers. These include basic arithmetic operations like addition, subtraction, multiplication, and division in fixed-point and floating-point number systems, and more complex operations, such as square root extraction and evaluation of exponential, logarithmic, and trigonometric functions. Even the seemingly simple arithmetic operations turn out to be more complex than one expects when attempting to implement them. The descriptions found in many excellent books on computer architecture do not provide the level of detail required, and there is a need for a book that is completely devoted to digital computer arithmetic.

Designs that include arithmetic units have proliferated in recent years. The progress already made in integrated circuit technology, and further advances expected in the near future, have had a significant impact on the design of new arithmetic processors. The higher density of integrated circuits now enables the design and implementation of sophisticated arithmetic processors employing algorithms that were considered prohibitively complex in the past. Consequently, methods that were previously considered unconventional should be examined, since they may now be attractive alternatives. Furthermore, the ease of designing application-specific integrated circuits currently allows engineers to design arithmetic units tailored to their special needs, rather than having to use general-purpose circuits. These specialized arithmetic units could achieve a higher speed of operation for the particular application being considered.
This book describes the principles of computer arithmetic algorithms independently of any particular technology employed for their implementation. The existence of several implementation technologies and the rapid changes in these make a detailed description of any implementation almost immediately obsolete. The book includes numerical examples to illustrate the working of the algorithms presented and explains the concepts behind the algorithms without relying on gate diagrams. Such diagrams are usually straightforward, and any reader with a sufficiently good background in digital design, as is expected from the reader of this book, can draw his/her own. These diagrams are in many cases useless for the practitioner, since the technology that he/she plans to use imposes its constraints on the implementation, and, more importantly, they do not provide any additional insight.

All the algorithms in this book are described within the same framework, so that the similarities between different algorithms become evident, and, consequently, the basic principles behind these algorithms can be easily identified. This should help the reader to better understand the currently available algorithms, know how to select the most appropriate algorithm to match a given technology, and even be able to develop new algorithms if the need arises.

This book is intended to be used as a textbook in a senior-level or first-year graduate-level course in computer arithmetic and as a reference book for practicing engineers. The reader's expected background includes a basic knowledge of digital design and the principles of digital computer organization.
The book includes 11 chapters; each chapter has a list of relevant references and a set of exercises. A separate solution manual is available from the publisher upon request.

This book does not include all the algorithms that have ever been suggested, and is meant only to serve as a solid introduction to this rich field, which is continuously evolving. Readers who are interested in further details on a particular topic should consult the list of references at the end of each chapter. Excellent sources for additional information are the proceedings of the biannual IEEE Symposium on Computer Arithmetic and periodicals such as the IEEE Journal on Solid-State Circuits or the IEEE Transactions on Computers. The latter had several special issues devoted to computer arithmetic. Other sources are the several books on computer arithmetic already available, which are listed at the end of Chapter 1.

This textbook evolved from lecture notes prepared for courses in computer arithmetic that I have taught at the University of California at Santa Barbara, the University of Southern California, the Technion in Israel, the University of California at Berkeley, and the University of Massachusetts at Amherst. The order of topics in the book follows their order in my lectures. However, several instructors who used a preliminary version of the book did not experience any difficulties when covering the chapters out of order.

Many people have contributed in different ways to this book. Prof. James Howard from the University of California at Santa Barbara was the first to suggest the writing of this book. Prof. William Kahan from the University of California at Berkeley, with whom I had long discussions in 1983, influenced my views on many topics.
Moshe Gavrielov from LSI Logic read much of the book, and his many suggestions helped to improve the presentation. Prof. Mary Jane Irwin from the Pennsylvania State University and Prof. Behrooz Parhami from the University of California at Santa Barbara agreed to use a preliminary version of this book in their classes and provided many helpful comments and suggestions. Several others reviewed parts of the book and gave me very important feedback. These include Prof. Dan Atkins from the University of Michigan at Ann Arbor, Prof. Milos Ercegovac from the University of California at Los Angeles, Prof. Earl Swartzlander, Tom Callaway, and Michael Schulte from the University of Texas at Austin, and Dr. George Taylor from Sun Microsystems.

The graduate students who took my course in the previously mentioned campuses made many contributions through their questions and suggestions. In particular, I wish to acknowledge the contributions made by Sachin Ghanekar, who prepared the solution manual, and by Ofra Zinaty, Moshe Gavrielov, and Susan Morin.

Last but not least, I wish to thank my wife, Zahava, and my sons, Yuval and Yaron, who provided moral support as well as editorial assistance.

Israel Koren
Amherst, Massachusetts
1 CONVENTIONAL NUMBER SYSTEMS

1.1 THE BINARY NUMBER SYSTEM

In conventional digital computers, integers are represented as binary numbers of fixed length. A binary number of length $n$ is an ordered sequence $(x_{n-1}, x_{n-2}, \ldots, x_1, x_0)$ of binary digits where each digit $x_i$ (also known as a bit) can assume one of the values 0 or 1. The length $n$ of the sequence is of significance, since binary numbers in digital computers are stored in registers of a fixed length, $n$. The above sequence of $n$ digits (or $n$-tuple) represents the integer value

\[ X = x_{n-1}2^{n-1} + x_{n-2}2^{n-2} + \cdots + x_1 2 + x_0 = \sum_{i=0}^{n-1} x_i 2^i. \tag{1.1} \]

Upper case letters are used in this book to represent numerical values or sequences of digits while lower case letters, usually indexed, represent individual digits. The weight of the digit $x_i$ in (1.1) is the $i$th power of 2, which is called the radix of the number system.

The interpretation rule in Equation (1.1) is similar to the rule used for the ordinary decimal numbers. There are, however, two differences between these interpretation rules. First, the radix 10 is used instead of 2 in Equation (1.1) and consequently, the allowed digits in the decimal case are $x_i \in \{0, 1, 2, \ldots, 9\}$ instead of $x_i \in \{0, 1\}$. We call the decimal numbers radix-10 numbers and the binary numbers radix-2 numbers. We indicate the radix to be used when interpreting a given sequence of digits by writing it as a subscript. Thus, the sequence $(101)_{10}$ represents the decimal value 101, while the sequence $(101)_2$ represents the decimal value 5.

Since operands and results in an arithmetic unit are stored in registers of a fixed length, there is a finite number of distinct values that can be represented within an arithmetic unit. Let $X_{min}$ and $X_{max}$ denote the smallest and largest representable values, respectively. We say that $[X_{min}, X_{max}]$ is the range of the representable numbers. Any arithmetic operation that attempts to produce a result larger than $X_{max}$ (or smaller than $X_{min}$) will produce an incorrect result. In such cases the arithmetic unit should indicate that the generated result is in error. This is usually called an overflow indication.

Example 1.1
If the conventional binary number system is employed to represent unsigned integers using four binary digits (bits), then $X_{max} = (15)_{10}$ is represented by $(1111)_2$ and $X_{min} = (0)_{10}$ is represented by $(0000)_2$. Increasing $X_{max}$ by 1 results in $(16)_{10} = (10000)_2$ out of which, in a 4-bit representation, only the last four digits are retained, yielding $(0000)_2 = (0)_{10}$. In general, a number $X$ that is not in the range $[X_{min}, X_{max}] = [0, 15]$ is represented by $X$ modulus 16, or $X \bmod 16$, which is the remainder when dividing $X$ by 16. Such a situation can arise, for example, when two operands $X$ and $Y$ are added and their sum exceeds $X_{max}$. In this case, the final result $S$ satisfies $S = (X + Y) \bmod 16$. For example,

     X       1 0 1 1     11
    +Y       0 1 1 1      7
           1 0 0 1 0     18

Since the final result has to be stored in a 4-bit register, the most significant bit (whose weight is $2^4 = 16$) is discarded, yielding $(0010)_2 = 2 = 18 \bmod 16$. □

1.2 MACHINE REPRESENTATIONS OF NUMBERS

The conventional binary system is a specific example of a number system that can be used to represent numerical values in an arithmetic unit. A number system is in general defined by the set of values that each digit can assume and by an interpretation rule that defines the mapping between the sequences of digits and their numerical values. We distinguish between conventional number systems like the binary system described in the previous section (or the commonly used decimal system) and unconventional systems like the signed-digit number system (to be presented in Chapter 2).

The conventional number systems are nonredundant, weighted, and positional number systems. In a nonredundant number system every number has a unique representation; in other words, no two sequences have the same numerical value. The term weighted number system means that there is a sequence of weights $w_{n-1}, w_{n-2}, \ldots, w_1, w_0$ that determines the value of the given $n$-tuple $(x_{n-1}, x_{n-2}, \ldots, x_0)$ by the equation

\[ X = \sum_{i=0}^{n-1} x_i w_i. \tag{1.2} \]

Thus, $w_i$ is the weight assigned to the digit in the $i$th position, $x_i$. Finally, in a positional number system, the weight $w_i$ depends only on the position $i$ of the digit $x_i$. In the conventional number systems the weight $w_i$ is the $i$th power of a fixed integer $r$, which is the radix of the number system. In other words, $w_i = r^i$. Therefore, these number systems are also called fixed-radix systems. Since the weight assigned to the digit $x_i$ is $r^i$, this digit has to satisfy $0 \le x_i \le r - 1$. Otherwise, if $x_i \ge r$ is allowed, then

\[ x_i r^i = (x_i - r) r^i + 1 \cdot r^{i+1}, \]

resulting in two machine representations for the same value: $(\ldots, x_{i+1}, x_i, \ldots)$ and $(\ldots, x_{i+1} + 1, x_i - r, \ldots)$. In other words, allowing $x_i \ge r$ introduces redundancy into the fixed-radix number system.

A sequence of $n$ digits in a register does not necessarily have to represent an integer. We may use such a sequence to represent a mixed number that has a fractional part as well as an integral part. This is done by partitioning the $n$ digits into two sets: $k$ digits in the integral part and $m$ digits in the fractional part, satisfying $k + m = n$. The value of an $n$-tuple with a radix point between the $k$ most significant digits and the $m$ least significant digits

\[ (\underbrace{x_{k-1} x_{k-2} \cdots x_1 x_0}_{\text{integral part}} \,.\, \underbrace{x_{-1} x_{-2} \cdots x_{-m}}_{\text{fractional part}})_r \]

is

\[ X = x_{k-1} r^{k-1} + x_{k-2} r^{k-2} + \cdots + x_1 r + x_0 + x_{-1} r^{-1} + \cdots + x_{-m} r^{-m} = \sum_{i=-m}^{k-1} x_i r^i. \tag{1.3} \]
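The interpretation rule of Equation (1.3) and the register wrap-around of Example 1.1 can be illustrated with a minimal Python sketch. The function names `fixed_point_value` and `add_in_register` are illustrative assumptions and do not come from the book; exact rational arithmetic (`fractions.Fraction`) is assumed so that fractional weights introduce no rounding.

```python
from fractions import Fraction

def fixed_point_value(digits, k, r=2):
    """Value of a digit sequence with k integral digits in radix r (Equation 1.3).

    digits lists x_{k-1} ... x_0 x_{-1} ... x_{-m}, most significant digit first.
    """
    if any(d not in range(r) for d in digits):
        raise ValueError("each digit must lie between 0 and r - 1")
    # The first digit has weight r^(k-1); every later weight is r times smaller.
    return sum(Fraction(d) * Fraction(r) ** (k - 1 - pos)
               for pos, d in enumerate(digits))

def add_in_register(x, y, n=4, r=2):
    """Add two unsigned integers, keeping only the n least significant digits.

    Returns the stored result (x + y) mod r^n and an overflow indication.
    """
    total = x + y
    overflow = total > r ** n - 1      # the true sum exceeds X_max
    return total % r ** n, overflow

# (101)_2 and (101)_10 from Section 1.1, then the sum 11 + 7 of Example 1.1
print(fixed_point_value([1, 0, 1], k=3, r=2))
print(fixed_point_value([1, 0, 1], k=3, r=10))
print(add_in_register(11, 7))
```

The same digit sequence yields different values under different radices, and the 4-bit addition reports both the retained digits and the overflow condition.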
The radix point is not stored in the register but is understood to be in a fixed position between the $k$ most significant digits and the $m$ least significant digits. For this reason we call such representations fixed-point representations. A programmer of a digital computer is not necessarily restricted to the use of numbers having the predetermined position of the radix point but can properly scale the operands. As long as the same scaling factor is used for all operands, the add and subtract operations yield the correct results, since $aX \pm aY = a(X \pm Y)$, where $a$ is the scaling factor. However, corrections are required when performing multiplication and division, since $aX \cdot aY = a^2 XY$ and $aX / aY = X/Y$. Commonly used positions for the radix point are at the rightmost side of the number (i.e., pure integers, $m = 0$) and at the leftmost side of the number (i.e., pure fractions, $k = 0$).

Given the length $n$ of the operands, the weight $r^{-m}$ of the least significant digit indicates the position of the radix point. To simplify our discussion from this point on and to avoid the need to distinguish between the different partitions of numbers (into fractional and integral parts), we introduce the notion of a unit in the last position (ulp), which is the weight of the least significant digit,

\[ ulp = r^{-m}. \tag{1.4} \]

1.3 RADIX CONVERSIONS

Radix conversion is the translation of a number $X$ represented in one radix number system (to be called the source number system) to its representation in another number system (called the destination number system). The major reason for such conversions is the fact that most arithmetic units operate on binary numbers while their users are more accustomed to decimal numbers, which also require a smaller number of digits. We will therefore emphasize conversions between the decimal and the binary number systems but will present the algorithms in a more general form.

Given a number $X$, we wish to find its representation in the destination number system with radix $r_D$. For convenience, we distinguish between the conversion of the integral part $X_I$ and that of the fractional part $X_F$. Starting with the integral part, its representation $(x_{k-1} x_{k-2} \cdots x_1 x_0)_{r_D}$ is sought. We can rewrite the appropriate part of Equation (1.3) as follows:

\[ X_I = \{(\cdots(x_{k-1} r_D + x_{k-2}) r_D + \cdots + x_2) r_D + x_1\} r_D + x_0, \tag{1.5} \]

where $0 \le x_i \le r_D - 1$. Thus, if we divide $X_I$ by $r_D$ we obtain $x_0$ as the remainder and

\[ \{(\cdots(x_{k-1} r_D + x_{k-2}) r_D + \cdots + x_2) r_D + x_1\} \]

as the quotient. If we now divide the above quotient by $r_D$, we obtain $x_1$ as the remainder. We may therefore divide the quotients by $r_D$ repeatedly, retaining the remainders as the required digits until a zero quotient is reached.

To find the representation $(x_{-1} x_{-2} \cdots x_{-m})_{r_D}$ of the fractional part $X_F$, we rewrite the appropriate part of Equation (1.3) as follows:

\[ X_F = r_D^{-1}\{x_{-1} + r_D^{-1}[x_{-2} + r_D^{-1}(x_{-3} + \cdots)]\}. \tag{1.6} \]

If we multiply $X_F$ by $r_D$, we obtain a mixed number with $x_{-1}$ as its integral part and $r_D^{-1}[x_{-2} + r_D^{-1}(x_{-3} + \cdots)]$ as the fractional part. We may therefore multiply the fractional parts by $r_D$ repeatedly, retaining the generated integers as the required digits. However, unlike the algorithm for the integral part, this algorithm is not guaranteed to terminate, since a finite fraction (one that needs a finite number of nonzero digits) in one number system may correspond to an infinite fraction in another. This does not constitute a problem in practice, since the process can be terminated after $m$ steps (or a few additional ones if rounding is desired).

Example 1.2
The decimal mixed number $X = 46.375_{10}$ is to be converted to binary form. Starting with $X_I = 46$ we obtain the following quotients and remainders when repeatedly dividing by 2:

    Quotient    Remainder
       23       0 = x_0
       11       1 = x_1
        5       1 = x_2
        2       1 = x_3
        1       0 = x_4
        0       1 = x_5

We now convert the fractional part $X_F = 0.375$ and obtain the following integers and fractions when repeatedly multiplying by 2:

    Integer part    Fractional part
    0 = x_{-1}      .75
    1 = x_{-2}      .5
    1 = x_{-3}      .0

Thus, the final result is $46.375_{10} = 101110.011_2$. If the fractional part of the given decimal number was $X_F = 0.3$, the above algorithm would never terminate, since the decimal fraction $0.3_{10}$ is the infinite binary fraction $(0.0100110011\ldots)_2$. □
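The two conversion procedures of this section, repeated division for the integral part and repeated multiplication for the fractional part, can be sketched in Python as follows. This is an illustrative sketch only: the names `convert_integer` and `convert_fraction` and the `max_digits` cut-off parameter are assumptions introduced here, and the fractional part is assumed to be supplied as an exact `fractions.Fraction` so that the source-system arithmetic is carried out without rounding.

```python
from fractions import Fraction

def convert_integer(x_int, r_dest):
    """Digits of a nonnegative integer in radix r_dest, most significant first."""
    if x_int == 0:
        return [0]
    digits = []
    while x_int > 0:
        x_int, remainder = divmod(x_int, r_dest)
        digits.append(remainder)       # remainders appear in the order x_0, x_1, ...
    return digits[::-1]

def convert_fraction(x_frac, r_dest, max_digits):
    """Up to max_digits digits x_{-1}, x_{-2}, ... of an exact fraction below 1.

    The loop stops early if the fractional part becomes zero; otherwise it is
    cut off after max_digits steps, since the expansion may be infinite.
    """
    digits = []
    while x_frac != 0 and max_digits > len(digits):
        x_frac *= r_dest
        integer_part = int(x_frac)     # the generated integer is the next digit
        digits.append(integer_part)
        x_frac -= integer_part
    return digits

# Example 1.2: 46.375 in decimal, and the non-terminating fraction 0.3
print(convert_integer(46, 2))
print(convert_fraction(Fraction(3, 8), 2, max_digits=10))
print(convert_fraction(Fraction(3, 10), 2, max_digits=10))
```

Passing the fraction as a binary floating-point value instead would already embed a conversion error for inputs such as 0.3, which is why exact rationals are assumed here.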
6 I. Conventional Number Systems 1.4 Representations of Negative Numbers 7 For fixro-point IlUmher:. in a r "Ii" r s)'!>t('IJI, we haw to dctcrmim' till' way nChativp 11I11J11)('rs ar(' rt'prt>sc>ntcd. Two di!f('rent forlJl a((' cOImJlonl)' USt'(l: A major disadvaJlrae of t.he ..ignpc!-ll1anitUll' r"J)r p ' 'nlat.lon ill thnt !h(' opernt.ion to 1)1' p('rfnrn1l'd ml\Y d('pend on t,he siglls of t h,' ('p .wnds For ('x- aUlph>, whp!1 adding a posit.hp l1Iunhf'r X alld a 11I'ative uUl1Jbpr Y (i.{'., Y is th.' absolut(. volue of thp Sffoml op('raud), we 1Il.'Cd to pprform th(' ealrllinrion X + (- 1'). If Y > Y, wp should ohtain as a fill/il result -(V - Y). W.' t hprf'for.' n('('(1 to fir!ot ('nlt-Illat.>} - X, i.p., 'iwitrh the order of Ilu' uppr.ulfls and l)t'rform slIbtwrt.iou rath,'r than addition, alld then attach the minus sigll. This (1....'111-1 in a SN}U('IJ('C of decisions that. have to bl:' IIInde, costing ex, I'N> culll rollogic' .lIld ('xl.'C'ution time. This is avoided in the (omplelJll'Jlt r('prt"nlntioJl JlJI'tho(l..-. All th(' arit hnJl't i(' oJ)prat ions III the above COI1\'f'rSlon nlgorillmlb w('re pcrfornll'd in IIIC' sonrc(' I1Innht'r syst,pm, \Vhidl was till' decimal syst"JI1. fo ronV('rl a binary I1Innhpr to till' ch'dmal systl'lIl, we Ul8) I'ither (>XN'utp t.hp above algorithm in du' soltre(' binary sy:.tt'm or. mort' cOI1\'('lli(,llt,ly, p('rfonn tltr {'unvpr.;ion in thp d('stinal-ion d,"Cima1 systcm u:sin Equatiou (1.3). 1.4 REPRESENTATIONS OF NEGATIVE NUMBERS I. Slgll \IId magnitud.> reprf'Ii(>ntntion, which is also called the sign('I..I-lJJag. IUtunl' 1I11'I.hoo 2. COIDI'IplOPllt reprps('ntation \Vhit'h comprise.,> two alternatives 2. Complement representations: Th('re "lre two alh.rnntivcs: (i) Radix ("oml)IIIIPllt (also called t.wo's ("OlllplplOf"nt ill thp biliary syM('JI1) (ii) Diminishcd-radix complement (called unl"S ("OInplpml'nt ill t Iw biliary s)'st.l'm) III bot.h rompl('ment, mcthods, a po:;itiv<" ullmhl'r i.. 
r"pr('S('lItc'd in th(" same way as in tI}(' signed-III<\Kllitlldl' methoc!, wllf'rN\..'I a negative IIl1mhpr, - Y, is rI'prt>sl'uted by (R - Y) where R is a ('onstant whoS<' vahlP \\1" will dl'tl'rlllin(" next. Such a repTl.'$('utation satisfif'S the bl1.M(" iOf'nt ity 1. Signed-magnitude: IIerl' tllP sign ano nUignit ndc art' rcpr('S{'nte{1 SCpd- ralely. The first digit is thl' sign digit "lnd the rI'lIlaiuing (11-1) digits rI'J)rtI'nt till' maRnituol'. In thp hillary ("l, tlJ(' sigu bit is normally !'I>IN'tNI to be 0 kJr po...iti\' . numhf'rs and 1 for negntiv(' OIU'S. Iu the nonhinury Ca.5C, tI)(' valucs 0 alld (r - 1) arc assignc'd to th(' sign digit of posit.ive nnd nJ.':at,ivl' nllmbers, rl'- sp('(.ti\'ply. Notice that in this ("&i'> only 2. r,,-I ont of tlw r" possibl{' sc..'(I"l'n('cs an> Utili7..<'d. This will be discu...'i,."t-d further lah'r on. I..<'l the (n - 1) digits r'!prnting the magllitud,' 11(' partitiom'(l illto (k- 1) flUd m (IiRits in tllP int<.>gra1 alld fractional parts, re.sp<.tivdy. The largest rpprc....entable valu(> is then -(-V) = Y, (I. 7) ,. ( k-l I ) 'moz = r - II 1) , whcr' !lIp = r- m sim'p the complem('nt of (R- Y) is R-(R- Y) - Y. OIW ohlll' major advantage'S of a cOlllpl"l1Il'nt rl'prl.'sentatioll (regludl('N> of thp ('x....U't valup of R) i:. that 110 dt'Cisions havl' to b> lIIade before executillg "lll additioll or subtraction. III tJIP pn'vious ex.uupll', whcrc a positiv{' aUlI a Ill'gat.ivt' IIIlIllh('r arl' to b,' add('d, th(' second opcrand is reprcscntL'I..I by (R - Y). rh('rofnrp, till' addit.iull to hI' pl'rformed is X + (R - Y) = R - (Y - X). If Y > X then the negative Tl.'slIlt -(V - X) is dlrpndy repTl>sl'lIt,pd ill the sanlt' compl,>mellt form; i.l'., (\,'i R - (} - X), and tlwr(' i.. 110 1U't'd to UUlk,' any SJ)( dal dl.'Cisiolis likl' intercll31lbillg the orol'r of th(' two opl'rnllds. Howev('r, if X > y, tilt' ("orr{'("t rcsult shollid be (X - Y) whil(' X + (R - }-') = R + (Y - V). Th.' 
additional t('nn, R, IUIL.;t be disl'"lrllNl, and I hp vallie uf R shollid bl.' selt.'l.ted to simplify or eV('1I completely ('limillate this corrN'tion step. Another re{luirl'mcnt on the valu(' s('lN'll'd for R is that th.. ('"llt-Illation of the complement (R - } ) of a giVl'n nllmber l' he a simple IIp,'rntion t.hat can be donc at, a high sp(1('d. D"fort. ol'('iding on thl' mlue of R W' defille the complement of a single digit XII denoh'd by x.' as alld t.h,' corn'sponding rl'prcsentation is O(r - I)... (r - 1). Thus, tI}(' range of positive nllmhl'rs U. [0 , rA=-1 - ulp]. Thl' rangl' of Ilt'gatiw numbers is similarly (_(r k - I _ !lIp). 0], represent,>d by (r -1)(r - 1)... (r -1) to (r -1)0'" O. WI' ther....for(' have two r('pr<'SPl1tat.ions for zero, one posit,ivp and onl.' negatiw. This I'" wtOllwllil'nt whl.'n impll'l1Ientil1g an arithmetic unit, since an ('lIUal indication must be gpnE'rat('(j in a t(t fur L.ero op('ration for tilt' two different rl'prf'lientatioIi S of ./..<'ro. Example 1.3 In till' binary case all 2" scqllt'llcC'S are utili7RCI. Thp 2"-1 !'I'<JII,'nccs from 00. . .0 to 01 . . . 1 repnoscllt prn-itivc numb('I"S, while tl1l' rel1laining 2" I sc..>quences from 10...0 to 11... 1 rl'pre:;,'ut npgatiw nIlJl1I)('rs. If k = n (and thl'rl'forl', m - 0 and !lIp = 'fJ = 1) the rangf' of Iwsitive nllmhers i:; [0,2,,-1 -1] "lnd the rallc lIf l1t'gat.ivl' nmnlwrs ifi [_(2"-1 -1), 0]. 0 Xi = (r - I) - r,. (1.8) \Ve dellote I,y ..\ the fI-tuplp (Xk-I,.f.k-2' . . " .i -m) obtained after cumpleml.'lIting ('vcry digit in t.he St'<IIIl'I1CC corresponding to X. \VI" IIOW adel X to X nlld. hIL>;l.d 
on Equation (1.8), we obtain $x_i + \bar{x}_i = (r - 1)$, independent of the exact value of $x_i$. We then add ulp to the sum of X and $\bar{X}$, yielding

$$\begin{array}{rcccc}
X & x_{k-1} & x_{k-2} & \cdots & x_{-m} \\
+\ \bar{X} & \bar{x}_{k-1} & \bar{x}_{k-2} & \cdots & \bar{x}_{-m} \\ \hline
 & (r-1) & (r-1) & \cdots & (r-1) \\
+\ ulp & & & & 1 \\ \hline
1 & 0 & 0 & \cdots & 0
\end{array} \;=\; r^k$$

The above calculation can be rewritten as

$$X + \bar{X} + ulp = r^k. \tag{1.9}$$

Note that when the above result is stored into a register of length n (n = k + m), the most significant digit is discarded and the final result is zero. In general, storing the result of any arithmetic operation into a fixed-length register is equivalent to taking the remainder after dividing by $r^k$. Rearranging the terms in the previous equation results in

$$r^k - X = \bar{X} + ulp.$$

Consequently, if we select the value $r^k$ for R we obtain

$$R - X = r^k - X = \bar{X} + ulp. \tag{1.10}$$

The calculation of the complement (R - X) of a given number X as defined above is quite simple and is independent of the value of k. We call this radix-complement representation. No correction is needed for it when the result of the previous operation, X + (R - Y), is positive (i.e., when X > Y), since $R = r^k$ is discarded when calculating R + (X - Y).

Example 1.4 For r = 2 and k = n = 4 (and consequently, m = 0 and $ulp = 2^0 = 1$) the radix complement (also called the two's complement in the binary case) of a number X equals $2^4 - X$ but can instead be calculated, according to Equation (1.10), by $\bar{X} + 1$. In this case, the sequences 0000 to 0111 represent the positive numbers $0_{10}$ to $7_{10}$, respectively. The two's complement of the largest positive number is 1000 + 1 = 1001 and it represents the value $(-7)_{10}$. The two's complement of zero is 1111 + 1 = 10000 = 0 mod $2^4$; i.e., there is a single representation of zero. Thus, each positive number has a corresponding negative number that starts with a 1. There is an additional sequence that starts with a 1, namely, 1000, which has no corresponding positive number. It represents the negative number $(-8)_{10}$. Therefore, the range of binary numbers in the two's complement method with k = n = 4 is $-8 \le X \le 7$. The two's complement representations of all values within this range are shown in Table 1.1. To illustrate the simplicity of executing the operation X + (-Y) with Y > X, consider the addition of the numbers 2 and -7, represented by 0010 and 1001, respectively:

      0 0 1 0      2
    + 1 0 0 1     -7
      -------
      1 0 1 1     -5

This is the correct result represented in the two's complement method, and there is no need for any preliminary decisions or post corrections. Even when X > Y, the expected result is calculated without requiring any corrections. For example, when adding 7 and -2, represented by 0111 and 1110, respectively, we obtain

        0 1 1 1      7
    +   1 1 1 0     -2
      ---------
      1 0 1 0 1      5

Only the four least significant digits are retained, yielding 0101. □

TABLE 1.1 Three representation methods of binary numbers with k = n = 4.

    Sequence   Two's complement   One's complement   Signed-magnitude
    0111              7                  7                   7
    0110              6                  6                   6
    0101              5                  5                   5
    0100              4                  4                   4
    0011              3                  3                   3
    0010              2                  2                   2
    0001              1                  1                   1
    0000              0                  0                   0
    1111             -1                 -0                  -7
    1110             -2                 -1                  -6
    1101             -3                 -2                  -5
    1100             -4                 -3                  -4
    1011             -5                 -4                  -3
    1010             -6                 -5                  -2
    1001             -7                 -6                  -1
    1000             -8                 -7                  -0
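The complement rule of Equation (1.10) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: integers only (m = 0, so ulp = 1), digits stored most significant first, and the helper names `digit_complement`, `radix_complement` and `add_fixed_length` are hypothetical, not taken from the text.

```python
def digit_complement(digits, r):
    """Complement every digit: x_i -> (r - 1) - x_i, as in Equation (1.8)."""
    return [(r - 1) - d for d in digits]


def radix_complement(digits, r):
    """Return R - X for R = r**n, computed as the digit complement plus ulp.

    `digits` holds the n digits of X, most significant first (assumption:
    integers only, so ulp = 1). A carry out of the most significant position
    is discarded, which is the same as reducing the result modulo r**n.
    """
    result = digit_complement(digits, r)
    carry = 1  # the added ulp
    for i in range(len(result) - 1, -1, -1):
        total = result[i] + carry
        result[i] = total % r
        carry = total // r
    return result


def add_fixed_length(x, y, r):
    """Add two equal-length digit lists and keep only the n low-order digits."""
    result = [0] * len(x)
    carry = 0
    for i in range(len(x) - 1, -1, -1):
        total = x[i] + y[i] + carry
        result[i] = total % r
        carry = total // r
    return result  # the final carry-out is dropped on purpose


minus_seven = radix_complement([0, 1, 1, 1], 2)
print(minus_seven)                                      # [1, 0, 0, 1]
print(radix_complement([0, 0, 0, 0], 2))                # [0, 0, 0, 0]
print(add_fixed_length([0, 0, 1, 0], minus_seven, 2))   # [1, 0, 1, 1], i.e. -5
```

The same functions work for any radix r, since nothing in the digit-complement step depends on r being 2.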
A second possible choice for R is $R = r^k - ulp$. This is the diminished-radix complement for which, according to Equation (1.9),

$$R - X = (r^k - ulp) - X = \bar{X}. \tag{1.11}$$

Here, the derivation of the complement is even simpler than that of the radix complement. All the digit-complements $\bar{x}_i$ can be calculated in parallel, leading to a fast computation of $\bar{X}$. On the other hand, a correction step is needed when the result R + (X - Y) is obtained and (X - Y) is positive, as will be explained later.

Example 1.5 For r = 2 and k = n = 4, the diminished-radix complement (also called the one's complement in the binary case) of a number X equals $(2^4 - 1) - X$, which also equals $\bar{X}$, the sequence of digit complements, according to Equation (1.11). As in the previous example, the sequences 0000 to 0111 represent the positive numbers 0 to 7, respectively. The one's complement of the largest positive number is 1000, representing the value $(-7)_{10}$. The one's complement of zero is 1111; i.e., there are two representations of zero. In summary, the range of binary numbers in the one's complement method with k = n = 4 is $-7 \le X \le 7$. The different representations of positive and negative numbers with k = n = 4 in the three methods are compared in Table 1.1. □

In the binary case, the most significant digit can assume only two values and is thus a "true" sign digit. This holds for all three representation methods as shown in Table 1.1 and the distinction between positive and negative numbers is greatly simplified. In the nonbinary case, restricting the most significant digit to two values only (0 and (r - 1)) would considerably reduce the percentage of utilized sequences; only $2 \cdot r^{n-1}$ out of $r^n$ (or 2 out of r) would be used. To make up for this, we can let the most significant digit assume all its possible values and partition the total number, $r^n$, of sequences equally (or almost equally) between positive and negative values. In general, a given number X is represented in a complement system by X if it is positive or by R - |X| if it is negative. To have unambiguous representations, the regions for positive and negative numbers should not overlap. In other words, the inequality

$$|X| \le R/2$$

must be satisfied. If the value X = R/2 + 1 is allowed to be included in the region of representable numbers, then the negative number -X is represented by R - X = R/2 - 1, which is identical to the representation of the positive number R/2 - 1. Nonoverlapping regions for positive and negative numbers can be achieved easily if the radix r is an even number. In this case, in order to satisfy the inequality $|X| \le R/2 = \frac{r}{2}\, r^{n-1}$ (for an integer-only representation), the values $0, 1, \ldots, \frac{r}{2} - 1$ for the most significant digit would correspond to positive numbers, while the values $\frac{r}{2}, \ldots, r - 1$ would correspond to negative numbers. If, however, the radix is odd, then the representations 0 to $\left(\frac{r^n + 1}{2} - 1\right)$ must be made positive, and the remaining ones negative, making it more difficult to distinguish between positive and negative numbers.

Example 1.6 In the radix-complement decimal system the most significant digit can assume any of its 10 possible values. Thus, all sequences with a leading digit of 0, 1, 2, 3, or 4 represent positive numbers, while those having a leading digit of 5, 6, 7, 8, or 9 represent negative ones. For n = 4, the largest positive number is 4999 and the sequences 5000 through 9999 represent negative numbers with values of -5000 through -1, respectively. The range is therefore $-5000 \le X \le 4999$ and is shown in the diagram below.

[Diagram of the sequences 9999, 0000, 0001, ..., 4999, 5000, 5001, ...]

Let Y = 2345; to find the representation of -Y = -2345 (i.e., the radix complement R - Y with $R = 10^4$) we use the expression $R - Y = \bar{Y} + ulp$. The digit complement in this case is the nine's complement, yielding $\bar{Y} = 7654$ and thus, $\bar{Y} + 1 = 7655$. We can verify this result by adding Y to -Y, obtaining $2345 + 7655 = 10^4 = 0 \bmod 10^4$. □

The reader should realize by now that algorithms for arithmetic operations can be developed for various fixed-radix number systems and for different partitions of mixed numbers (into their integral and fractional parts). To simplify our
discussion from this point on, we restrict ourselves to binary integers; i.e., r = 2 and k = n. The extension of what follows to the general case of binary numbers with $m \ne 0$ or radix r numbers with $r \ne 2$ is, in most cases, straightforward.

1.4.1 The Two's Complement Representation

The range of numbers in the two's complement system for k = n is

$$-2^{n-1} \le X \le (2^{n-1} - ulp) \quad \text{with } ulp = 2^0 = 1.$$

This range is slightly asymmetric as there is one more negative number than there are positive numbers. The binary number $-2^{n-1}$ (represented by 10...0) does not have a positive equivalent. Consequently, if a complement operation for this number is attempted, an overflow indication must be generated. On the other hand, there is a unique representation for 0, as shown in Table 1.1. Given a representation $(x_{n-1}, x_{n-2}, \ldots, x_0)$ in two's complement, we can use the following procedure to find its numerical value X: If $x_{n-1} = 0$, then $X = \sum_{i=0}^{n-1} x_i 2^i$, while if $x_{n-1} = 1$, the given sequence represents a negative number whose absolute value can be obtained by complementing the given sequence and then employing the previous equation.

Example 1.7 Given the 4-tuple 1011, we can find its value by first complementing it to obtain 0100 + 1 = 0101, then calculating the value of the sequence 0101, which is 5. This indicates that the value of the original sequence is -5. □

Instead of the above procedure we can use the expression

$$X = -x_{n-1} 2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i. \tag{1.12}$$

Using Equation (1.12) for 1011 we obtain -8 + 2 + 1 = -5. To prove the validity of Equation (1.12) note first that, if $x_{n-1} = 0$, the exact same positive value is calculated. If $x_{n-1} = 1$, the value of the given representation is

$$\begin{aligned}
-\left[\bar{X} + ulp\right] &= -\left[\sum_{i=0}^{n-2} \bar{x}_i 2^i + 1\right] = -\left[\sum_{i=0}^{n-2} (1 - x_i) 2^i + 1\right] \\
&= -\left[\sum_{i=0}^{n-2} 2^i - \sum_{i=0}^{n-2} x_i 2^i + 1\right] = -\left[(2^{n-1} - 1) - \sum_{i=0}^{n-2} x_i 2^i + 1\right] \\
&= -2^{n-1} + \sum_{i=0}^{n-2} x_i 2^i
\end{aligned}$$

which is exactly the value of the right-hand side of Equation (1.12) for $x_{n-1} = 1$.

1.4.2 The One's Complement Representation

The range of representable numbers in the one's complement system for k = n is symmetric, and equals

$$-(2^{n-1} - ulp) \le X \le (2^{n-1} - ulp) \quad \text{with } ulp = 2^0 = 1.$$

As a result, there are two representations of zero, a positive zero represented by 000...0, and a negative zero represented by 11...1, as shown in Table 1.1. For the one's complement system we have an equation similar to that of the two's complement system:

$$X = -x_{n-1}(2^{n-1} - ulp) + \sum_{i=0}^{n-2} x_i 2^i. \tag{1.13}$$

For example, the 4-tuple 1010 represents the value $-(2^3 - 1) + 2 = -5$. The proof of Equation (1.13) is left to the reader as an exercise. The derivation of the one's complement is simpler than that of the two's complement. For each digit we have to calculate $\bar{x}_i = 1 - x_i$, which is the Boolean complement and can be done in parallel for all digits.

1.5 ADDITION AND SUBTRACTION

When adding or subtracting numbers represented in the signed-magnitude representation, only the magnitude bits participate in the arithmetic operation, while the sign bits are treated separately. Consequently, a carry-out (or borrow-out) indicates overflow. For example,

        0   1 0 1 1     11
    +   0   0 1 1 0      6
        -------------
        0 1 0 0 0 1      1     Carry-out

The final result is positive (the sum of two positive numbers), but its magnitude in four bits is erroneously obtained as 1 instead of 17, since 1 = 17 mod 16. In both complement systems, all digits, including the sign digit, participate in the add or subtract operation. A carry-out is therefore not necessarily an indication of an overflow in the system. For example, when adding the two numbers X = 13 and Y = -8 represented in the two's complement method, we obtain

        0 1 1 0 1     13
    +   1 1 0 0 0     -8
      -----------
      1 0 0 1 0 1      5     Carry-out, but no overflow

The carry-out is discarded and does not indicate overflow. In general, if X and Y have opposite signs, no overflow can occur regardless of whether there is a carry-out or not, as illustrated in the following two examples:
        0 0 1 0 1      5
    +   1 0 1 1 0    -10
      -----------
        1 1 0 1 1     -5     No carry-out

        0 1 0 1 0     10
    +   1 1 0 1 1     -5
      -----------
      1 0 0 1 0 1      5     Carry-out

If X and Y have the same sign and the sign of the result is different from that of the two operands, then an overflow occurs. For example,

        1 1 0 0 1     -7
    +   1 0 1 1 0    -10
      -----------
      1 0 1 1 1 1     15     Carry-out and overflow

        0 0 1 1 1      7
    +   0 1 0 1 0     10
      -----------
        1 0 0 0 1    -15     No carry-out but overflow

In the one's complement system a carry-out is an indication that a correction step is needed. For example, when adding a positive number X and a negative number -Y represented in one's complement, the result is

$$X + (2^n - ulp) - Y = (2^n - ulp) + (X - Y)$$

and if X > Y we should obtain (X - Y). The term $2^n$ represents the carry-out bit, which is discarded since the final result should be stored into a register of length n. The result is therefore (X - Y - ulp), and the necessary correction is made by adding ulp. For example,

        0 1 0 1 0     10
    +   1 1 0 1 0     -5
      -----------
      1 0 0 1 0 0            Correction
    +           1     ulp
      -----------
        0 0 1 0 1      5

The generated carry-out is called end-around carry, meaning that the carry-out is an indication that a 1 should be added to the least significant position. If there is no carry-out then no correction is needed. This is the case when X < Y and thus the result, -(Y - X), is negative and should be represented by $(2^n - ulp) - (Y - X)$. For example,

        0 0 1 0 1      5
    +   1 0 1 0 1    -10
      -----------
        1 1 0 1 0     -5     No carry-out and hence no correction

As we have seen before, no such correction is necessary in two's complement addition. In both complement systems, a subtract operation, X - Y, is performed by adding the complement of Y to X. In the one's complement system this means adding $\bar{Y}$ to X; in other words, $X - Y = X + \bar{Y}$. In the two's complement system we perform $X - Y = X + (\bar{Y} + ulp)$. This still requires only a single adder operation, since ulp is added through the forced carry input to the binary adder. This will be further explained in Chapter 5.

1.6 ARITHMETIC SHIFT OPERATIONS

Another way of distinguishing among the three methods for representing negative numbers is to consider the infinite extensions to the right and to the left of a given number. In the signed-magnitude method, the magnitude $x_{n-2}, \ldots, x_0$ can be viewed as the infinite sequence

$$\cdots\ 0,\ 0,\ \{x_{n-2}, \ldots, x_0\},\ 0,\ 0,\ \cdots$$

If any arithmetic operation results in a nonzero prefix, then this constitutes an overflow. In the radix-complement scheme the infinite extension is

$$\cdots\ x_{n-1},\ x_{n-1},\ \{x_{n-1}, \ldots, x_0\},\ 0,\ 0,\ \cdots$$

where $x_{n-1}$ is the sign digit. Finally, for the diminished-radix complement scheme the sequence is

$$\cdots\ x_{n-1},\ x_{n-1},\ \{x_{n-1}, \ldots, x_0\},\ x_{n-1},\ x_{n-1},\ \cdots$$

Example 1.8 The sequences 1011., 11011.0, and 111011.00 all represent the value -5 in the two's complement method. Similarly, the sequences 1010., 11010.1, and 111010.11 all represent the value -5 in the one's complement method. The general proof is left as an exercise for the reader. □

These extensions are useful when adding operands with different numbers of bits. The shorter operand must be extended to the length of the longer one before being added. Based on these extensions we can derive the rules for arithmetic shift operations where a left and right shift are equivalent to multiplication and division by 2, respectively.
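The shift rules that follow from these extensions can be sketched as below. This is a minimal illustration, not a definitive implementation: operands are assumed to be Python strings of '0'/'1' characters with the sign bit first, and the function names (`twos_value`, `ones_value`, `sign_extend`, `shift_left`, `shift_right`) are hypothetical. A left shift takes its new rightmost bit from the right-hand extension (0 in two's complement, the sign bit in one's complement); a right shift takes its new leftmost bit from the left-hand extension (the sign bit in both systems).

```python
def twos_value(bits):
    """Value of a two's complement bit string, following Equation (1.12)."""
    n = len(bits)
    magnitude = int(bits[1:], 2) if n > 1 else 0
    return -int(bits[0]) * 2 ** (n - 1) + magnitude


def ones_value(bits):
    """Value of a one's complement bit string, following Equation (1.13)."""
    n = len(bits)
    magnitude = int(bits[1:], 2) if n > 1 else 0
    return -int(bits[0]) * (2 ** (n - 1) - 1) + magnitude


def sign_extend(bits, width):
    """Lengthen an operand by repeating the sign bit on the left."""
    return bits[0] * (width - len(bits)) + bits


def shift_left(bits, method):
    """Drop the leftmost bit and pull in the next bit of the right extension.

    `method` is "twos" or "ones" (assumed labels for the two complement systems).
    """
    fill = "0" if method == "twos" else bits[0]
    return bits[1:] + fill


def shift_right(bits):
    """Drop the rightmost bit and pull in the sign bit from the left extension."""
    return bits[0] + bits[:-1]


x = "11011"                                   # -5 in two's complement
print(twos_value(shift_left(x, "twos")))      # -10
print(twos_value(shift_right(x)))             # -3
y = "11010"                                   # -5 in one's complement
print(ones_value(shift_left(y, "ones")))      # -10
print(ones_value(shift_right(y)))             # -2
print(twos_value(sign_extend("1011", 6)))     # -5
```

Note that the right shift of -5 gives -3 in two's complement and -2 in one's complement, because the discarded bit is dropped from different bit patterns. The sketch does not check for overflow on a left shift, which occurs when the bit shifted out differs from the new sign bit.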
Example 1.9 In two's complement,

    Sh.L {00101_2 = 5}  = 01010_2 = 10
    Sh.R {00101_2 = 5}  = 00010_2 = 2
    Sh.L {11011_2 = -5} = 10110_2 = -10
    Sh.R {11011_2 = -5} = 11101_2 = -3

In one's complement,

    Sh.L {11010_2 = -5} = 10101_2 = -10
    Sh.R {11010_2 = -5} = 11101_2 = -2
□

Shift operations are very useful in many algorithms for multiplication and division, as will be described in Chapter 3.

1.7 EXERCISES

1.1. (a) Find the fixed-point representations of the two values $(41.25)_{10}$ and $(-41.25)_{10}$ in the radix-complement and diminished-radix complement systems if the radix r is [illegible], k = 3 integer digits, and m = 1 fraction digit.
(b) Repeat (a) for r = 2, k = 8 and m = 3.

1.2. Verify the correctness of the following procedure to obtain the two's complement, $(y_{n-1}, y_{n-2}, \ldots, y_0)$, of a given sequence $(x_{n-1}, x_{n-2}, \ldots, x_0)$. Start from the rightmost bit $x_0$. For each 0 in X set the corresponding bit in Y to 0 until you reach the first 1. For this 1 set the corresponding bit to 1. From this point on set $y_i = \bar{x}_i$.

1.3. Show that approximately 3.3n bits are needed to represent an n-digit decimal number.

1.4. In this problem we attempt to find the most "efficient" fixed-radix number system. Assume that N different values need to be represented and search for the radix r that minimizes the product $E = n \cdot r$, where n is the number of radix-r digits that are required to represent N values; i.e., $n = \log_r N$. Thus, the function $E = r \log_r N = (r / \log_{10} r) \log_{10} N$ must be minimized.
(a) Justify the selection of the objective function E.
(b) Calculate the values of $r / \log_{10} r$ for several radices r between 2 and 10. What are the best choices for r?

1.5. Prove the extensions of two's complement and one's complement numbers to infinite sequences. In particular, based on Equation (1.12) show that the representation of a signed binary number in (n + 1) bits in the two's complement method may be derived from its representation in n bits by repeating the sign bit.

1.6. Given an n-tuple $X = (x_{n-1}, \ldots, x_0)$, find the value of Sh.R{X} + Sh.R{-X} and of Sh.L{X} + Sh.L{-X} using either the diminished-radix complement or the radix-complement representation.

1.7. What are the rules for overflow detection in add/subtract operations when using two's complement or one's complement representations, assuming that carry-in and carry-out signals for the sign digit are available?

1.8. Prove that the value of any number in the one's complement system can be calculated, for any k and m, using the formula

$$X = -x_{k-1}(2^{k-1} - ulp) + \sum_{i=-m}^{k-2} x_i 2^i.$$

1.9. Can the equation $X = -x_{n-1} 2^{n-1} + \sum x_i 2^i$ be extended to the radix-complement representation for any positive radix r?

1.10. The difference X - Y can be formed by adding the complement of X to Y and then complementing the result. Prove that this is true for any complement scheme and any fixed-radix number system. Is it useful?

1.11. Show that a carry-out in any add or subtract operation of one's complement numbers indicates that an end-around carry must be added to correct the final result. Pay particular attention to the case where two negative numbers need to be added.

1.12. Can the correction step in one's complement addition, performed by adding an end-around carry, generate another end-around carry?

1.13. In the signed-magnitude representation there is another way to perform the operation X + (-Y) when Y > X. Instead of changing the order of operands and calculating -(Y - X), we can simply subtract Y from X. However, this will result in a need for a correction. For example,

        0   1 0 1 1     +11
    +   1   1 1 0 1     -13
        -----------
        1   1 1 1 0     -14

while the correct result is 1 0010 = $(-2)_{10}$. Show that the required correction can be done by taking the two's complement of the result 1110. Explain why the complement operation provides the necessary correction.

1.8 REFERENCES

A number of textbooks focusing on computer arithmetic have been published since 1963. These include:

[1] J. J. F. CAVANAGH, Digital computer arithmetic: Design and implementation, McGraw-Hill, New York, 1984.
[2] I. FLORES, The logic of computer arithmetic, Prentice Hall, Englewood Cliffs, NJ, 1963.
[3] M. J. FLYNN and S. F. OBERMAN, Advanced computer arithmetic design, Wiley, New York, 2001.
[4] J. B. GOSLING, Design of arithmetic units for digital computers, Springer-Verlag, New York, 1980.
[5] K. HWANG, Computer arithmetic: Principles, architecture, and design, Wiley, New York, 1978.
[6] J.-M. MULLER, Elementary functions: Algorithms and implementation, Birkhauser, Boston, 1997.
[7] A. R. OMONDI, Computer arithmetic systems: Algorithms, architecture and implementations, Prentice Hall, Englewood Cliffs, NJ, 1994.
[8] B. PARHAMI, Computer arithmetic: Algorithms and hardware designs, Oxford University Press, New York, 2000.
[9] N. R. SCOTT, Computer number systems and arithmetic, Prentice Hall, Englewood Cliffs, NJ, 1985.
[10] O. SPANIOL, Computer arithmetic: Logic and design, Wiley, New York, 1981.
[11] S. WASER and M. J. FLYNN, Introduction to arithmetic for digital systems designers, Holt, Rinehart, Winston, New York, 1982.

Reprints of many classic papers appear in the next two volumes:

[12] E. E. SWARTZLANDER, JR. (Editor), Computer arithmetic, vol. 1, IEEE Computer Society Press, Los Alamitos, CA, 1990.
[13] E. E. SWARTZLANDER, JR. (Editor), Computer arithmetic, vol. 2, IEEE Computer Society Press, Los Alamitos, CA, 1990.

Several chapters in computer organization textbooks and some survey articles have been devoted to computer arithmetic including:

[14] Y. CHU, Computer organization and microprogramming, Prentice Hall, Englewood Cliffs, NJ, 1972, Chap. 5.
[15] H. L. GARNER, "Number systems and arithmetic," in Advances in computers, vol. 6, F. L. Alt and M. Rubinoff (Eds.), Academic, New York, 1965, pp. 131-194.
[16] H. L. GARNER, "Theory of computer addition and overflows," IEEE Trans. on Computers, C-27 (April 1978), 297-301.
[17] V. C. HAMACHER, Z. G. VRANESIC, and S. G. ZAKY, Computer organization, 2nd ed., McGraw-Hill, New York, 1984.
[18] D. GOLDBERG, "Computer arithmetic," in Computer architecture: A quantitative approach, D. A. Patterson and J. L. Hennessy, 2nd edition, Morgan Kaufmann, San Mateo, CA, 1996.
[19] U. KULISCH, "Mathematical foundations of computer arithmetic," IEEE Trans. on Computers, C-26 (July 1977), 610-620.
[20] O. L. MACSORLEY, "High-speed arithmetic in binary computers," Proc. of IRE, 49 (Jan. 1961), 67-91.
[21] G. W. REITWIESNER, "Binary arithmetic," in Advances in computers, vol. 1, F. L. Alt (Editor), Academic, New York, 1960, pp. 231-308.
[22] C. TUNG, "Arithmetic," in Computer science, A. F. Cardenas et al. (Eds.), Wiley-Interscience, New York, 1972.

2 UNCONVENTIONAL FIXED-RADIX NUMBER SYSTEMS

Although the conventional binary number system with two's complement representation of negative numbers is commonly used in arithmetic units, there are several other number systems which have proven to be useful for certain applications. These include the negative-radix and signed-digit number systems, both of which are described in this chapter. Other unconventional number systems, including the sign-logarithm number system and the residue number system, are discussed in Chapters 10 and 11, respectively.

2.1 NEGATIVE-RADIX NUMBER SYSTEMS

Conventional number systems are fixed-radix systems for which the weight $w_i$ of the ith digit is $r^i$, the range of each digit is {0, 1, ..., r - 1} and the interpretation rule for calculating the numerical value of the sequence $(x_{n-1}, x_{n-2}, \ldots, x_0)$ is

$$X = \sum_{i=0}^{n-1} x_i w_i = \sum_{i=0}^{n-1} x_i r^i. \tag{2.1}$$

The radix r is normally selected to be a positive integer. However, it is not necessary to restrict r to positive values, and we may select $r = -\beta$, where $\beta$ is a positive integer. The digit set remains the same; i.e., $x_i \in \{0, 1, \ldots, \beta - 1\}$. The value of the n-tuple $(x_{n-1}, x_{n-2}, \ldots, x_0)$ in this negative-radix system is

$$X = \sum_{i=0}^{n-1} x_i (-\beta)^i. \tag{2.2}$$
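Equation (2.2) can be evaluated directly, and the reverse conversion follows from repeated division by $-\beta$. The sketch below is only an illustration: the function names are hypothetical, digits are assumed to be listed most significant first, and the conversion routine is a standard repeated-division technique rather than a procedure given in the text.

```python
def negative_radix_value(digits, beta):
    """Evaluate Equation (2.2) for digits listed most significant first."""
    value = 0
    for d in digits:
        if not 0 <= d < beta:
            raise ValueError("digit outside {0, ..., beta - 1}")
        value = value * (-beta) + d   # Horner's rule with radix -beta
    return value


def to_negative_radix(value, beta):
    """Return the radix -beta digits of an integer, most significant first."""
    if value == 0:
        return [0]
    digits = []
    while value != 0:
        # Python's divmod gives a remainder with the sign of the divisor,
        # so it lies in (-beta, 0]; shift it into the legal digit range.
        value, remainder = divmod(value, -beta)
        if remainder < 0:
            remainder += beta
            value += 1
        digits.append(remainder)
    return digits[::-1]


print(negative_radix_value([1, 9, 2], 10))   # 12  (100 - 90 + 2)
print(negative_radix_value([0, 1, 2], 10))   # -8  (-10 + 2)
print(to_negative_radix(12, 10))             # [1, 9, 2]
print(to_negative_radix(-10, 2))             # [1, 0, 1, 0]
```

No sign digit appears anywhere in the conversion: positive and negative integers are handled by the same loop.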
In other words, the weight $w_i$ satisfies

$$w_i = \begin{cases} \beta^i & \text{if } i \text{ is even} \\ -\beta^i & \text{if } i \text{ is odd.} \end{cases}$$

Example 2.1 The negative-radix number system with $\beta = 10$ is called the nega-decimal system. Consider the following three-digit nega-decimal numbers: $(192)_{-10} = 100 - 90 + 2 = 12$, $(012)_{-10} = -10 + 2 = -8$. The largest positive value that can be represented as a 3-tuple $(x_2, x_1, x_0)_{-10}$ is $(909)_{-10} = 909_{10}$ while the smallest is $(090)_{-10} = -90_{10}$. Thus, the range of values represented as 3-tuples $(x_2, x_1, x_0)$ in the nega-decimal system is $-90 \le X \le 909$. This range is asymmetric, since there are approximately 10 times as many positive numbers as negative ones. This is always true for odd values of n. If n is even then the opposite is true. For example, the range for n = 4 is $-9090 \le X \le 909$. □

In the negative-radix number system there is no need for a separate sign digit, and consequently there is no need for a special method like the radix-complement method, to represent negative numbers. The sign of the number is determined by the first nonzero digit. The fact that there is no distinction between positive number and negative number representations makes the arithmetic operations indifferent to the sign of the number. However, the algorithms for the basic arithmetic operations in the negative-radix number system are slightly more complex than their counterparts for the conventional number systems, as illustrated in the following example.

Example 2.2 Consider negative-radix numbers of length n = 4 with $\beta = 2$. This number system is called the nega-binary system. The range for 4-bit nega-binary numbers is

$$(-10)_{10} = (1010)_{-2} \le X \le (0101)_{-2} = (+5)_{10}.$$

When adding nega-binary numbers the carry bits can be either positive or negative as illustrated in the following addition, where the weights of the different bit positions are shown in the top row:

        -8  +4  -2  +1
         0   1   0   1       5
    +    1   1   0   1      -3
        --------------
         0   1   1   0       2

Note that in the -2 column there is a carry-in whose weight is +2, and the only way to handle it is to convert it into +4 - 2; i.e., produce a sum bit whose weight is -2 and a positive carry-out bit to the +4 column. Also note that in the +4 column, a carry-out is generated with weight +8. This carry bit and the operand bit in the -8 column cancel each other out to produce a zero sum bit. □

The nega-binary number system has been proposed for several signal processing applications, and algorithms for all arithmetic operations in this number system have been devised. However, its use has been limited and there are currently no standard integrated circuits (ICs) that perform arithmetic operations in this system. One of the reasons for this is the fact that it is not superior, in principle, to the more well-established two's complement system. As shown in the next section, the two are members of a larger group of number systems with very similar properties.

2.2 A GENERAL CLASS OF FIXED-RADIX NUMBER SYSTEMS

The negative-radix, and many other fixed-radix number systems, are members of a broad class of nonredundant number systems. In this class, each n-digit number system is characterized by a positive radix $\beta$ and a vector $\Lambda$ of length n, $\Lambda = (\lambda_{n-1}, \lambda_{n-2}, \ldots, \lambda_0)$ where $\lambda_i \in \{-1, 1\}$. Such a system, with a standard digit set, $\{0, 1, \ldots, \beta - 1\}$, can be identified by the triplet $\langle n, \beta, \Lambda \rangle$.
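The signed carries that arise in nega-binary addition (Example 2.2) can be made concrete with a short sketch. It is an illustration under stated assumptions, not a hardware algorithm from the text: bits are listed most significant first, the names `negabinary_value` and `negabinary_add` are hypothetical, and each carry is expressed in units of the weight of the column that receives it, so it takes the values -1, 0 or +1.

```python
def negabinary_value(bits):
    """Value of a nega-binary bit list, most significant bit first."""
    value = 0
    for b in bits:
        value = value * -2 + b
    return value


def negabinary_add(x, y):
    """Add two equal-length nega-binary bit lists, most significant bit first.

    Returns (sum_bits, carry_out). If a column with weight w holds the total
    t = x_i + y_i + carry_in, then t*w = s*w + c*(-2w), where s is the sum bit
    (0 or 1) and c = -(t - s)/2 is the carry into the next column.
    """
    n = len(x)
    result = [0] * n
    carry = 0
    for i in range(n - 1, -1, -1):
        total = x[i] + y[i] + carry           # ranges over -1 .. 3
        result[i] = total % 2                 # Python's % is non-negative here
        carry = -((total - result[i]) // 2)   # -1, 0 or +1
    return result, carry


total, carry_out = negabinary_add([0, 1, 0, 1], [1, 1, 0, 1])   # 5 + (-3)
print(total, carry_out)             # [0, 1, 1, 0] 0
print(negabinary_value(total))      # 2
```

A nonzero `carry_out` means the true sum does not fit in n bits; for instance, adding 0101 to itself (5 + 5) leaves a carry because 10 lies outside the 4-bit range.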
The value X of an n-tuple $(x_{n-1}, x_{n-2}, \ldots, x_0)$ in the system $\langle n, \beta, \Lambda \rangle$ is given by

$$X = \sum_{i=0}^{n-1} \lambda_i x_i \beta^i. \tag{2.3}$$

The multiplying factor $\lambda_i$ allows us to select between the two possible weights $\beta^i$ and $-\beta^i$, individually, for every digit position i. For any given radix $\beta$ there are in this class $2^n$ distinct number systems corresponding to the different values of $\Lambda$. Among them is the positive-radix number system, for which $\lambda_i = +1$ for every i, and the negative-radix number system, for which $\lambda_i = (-1)^i$ for every i. Also included is the radix-complement number system with $x_{n-1}$ as a "true" sign digit (i.e., $x_{n-1} \in \{0, 1\}$) and the characterizing vector $\Lambda = (-1, 1, 1, \ldots, 1)$ [3].

We will now examine some of the properties of this general class of number systems. Let P and N denote the largest and smallest representable integers, respectively, in the general system $\langle n, \beta, \Lambda \rangle$. The digits of $P = (p_{n-1}, p_{n-2}, \ldots, p_0)$ satisfy

$$p_i = \begin{cases} \beta - 1 & \text{if } \lambda_i = +1 \\ 0 & \text{otherwise} \end{cases} \qquad i = 0, 1, \ldots, n - 1.$$

A more convenient expression for $p_i$ is $p_i = \frac{1}{2}(\lambda_i + 1)(\beta - 1)$, based on which the value of the n-tuple $(p_{n-1}, p_{n-2}, \ldots, p_0)$ is

$$P = \sum_{i=0}^{n-1} \frac{1}{2}(\lambda_i + 1)(\beta - 1)\beta^i = \frac{1}{2}\left[\sum_{i=0}^{n-1} \lambda_i(\beta - 1)\beta^i + \sum_{i=0}^{n-1} (\beta - 1)\beta^i\right] = \frac{1}{2}\left[Q + (\beta^n - 1)\right] \tag{2.4}$$

where Q is the value of the n-tuple $(\beta - 1, \beta - 1, \ldots, \beta - 1)$ in $\langle n, \beta, \Lambda \rangle$. As will become evident later on, Q is a very significant quantity. Similarly, the digits of $N = (\nu_{n-1}, \nu_{n-2}, \ldots, \nu_0)$, the smallest representable number, satisfy

$$\nu_i = \begin{cases} \beta - 1 & \text{if } \lambda_i = -1 \\ 0 & \text{otherwise} \end{cases} \qquad i = 0, 1, \ldots, n - 1$$

and its value is

$$N = \sum_{i=0}^{n-1} \frac{1}{2}(\lambda_i - 1)(\beta - 1)\beta^i = \frac{1}{2}\left[Q - (\beta^n - 1)\right]. \tag{2.5}$$

The number of integers in the range $N \le X \le P$ is $P - N + 1 = \beta^n$, and the range is, in general, asymmetric. A measure of the asymmetry can be the difference between the absolute values of the largest and smallest numbers:

$$P - |N| = P + N = Q. \tag{2.6}$$

Example 2.3 The negative-radix system for which $\Lambda = (\ldots, -1, +1, -1, +1)$ has an asymmetric range, and for an even value of n there are $\beta$ times as many negative numbers as positive ones. If we prefer to have more positive numbers, we can instead use the system $\langle n, \beta, \Lambda = (\ldots, +1, -1, +1, -1) \rangle$. Two of the binary systems are nearly symmetric: the two's complement system, $\langle n, \beta = 2, \Lambda = (-1, 1, 1, \ldots, 1) \rangle$, for which $P + N = Q = -ulp$, and the system $\langle n, \beta = 2, \Lambda = (+1, -1, -1, \ldots, -1) \rangle$ for which $P + N = Q = +ulp$. □

The complement $\bar{X}$ of a number X in a system $\langle n, \beta, \Lambda \rangle$ is defined based on the digit complement of $x_i$, which is $\bar{x}_i = (\beta - 1) - x_i$, as follows:

$$\bar{X} = \sum_{i=0}^{n-1} \bar{x}_i \lambda_i \beta^i = \sum_{i=0}^{n-1} \lambda_i(\beta - 1)\beta^i - \sum_{i=0}^{n-1} \lambda_i x_i \beta^i = Q - X. \tag{2.7}$$

Hence,

$$-X = \bar{X} - Q = \bar{X} + (-Q). \tag{2.8}$$

In other words, the additive inverse of X can be formed by adding the additive inverse of Q to the complement $\bar{X}$.

Example 2.4 In the two's complement system, Q = -ulp and therefore $-X = \bar{X} + ulp$. In the nega-binary system, $Q = (\ldots, 1, 1, 1, 1) = -(\ldots, 0, 1, 0, 1)$ and hence, $-Q = (\ldots, 0, 1, 0, 1)$. The additive inverse of $(01011)_{-2} = (-9)_{10}$, for example, is

$$\begin{array}{rccccc}
 & 16 & -8 & +4 & -2 & +1 \\
\bar{X} & 1 & 0 & 1 & 0 & 0 \\
-Q & 1 & 0 & 1 & 0 & 1 \\ \hline
 & 1 & 1 & 0 & 0 & 1
\end{array} \;=\; 9_{10}$$

where the addition is performed according to the rules of adding nega-binary numbers. This result can be verified by adding the original number to its additive inverse. Again following the rules of nega-binary addition, $(01011)_{-2} + (11001)_{-2} = (00000)_{-2}$. □

The additive inverse may be employed in subtraction by using the equation

$$X - Y = X + \bar{Y} + (-Q). \tag{2.9}$$

Applying this equation may require two add operations with carry propagation, which is time-consuming. An alternate expression is

$$X - Y = \overline{\bar{X} + Y}. \tag{2.10}$$

Two digit-complement operations, and only one addition with carry propagation, are required here.

2.3 SIGNED-DIGIT NUMBER SYSTEMS

In all fixed-radix systems that we have examined so far, the digit set has been restricted to {0, ..., r - 1}. However, we may allow the following digit set:

$$x_i \in \{\overline{r-1}, \overline{r-2}, \ldots, \bar{1}, 0, 1, \ldots, (r - 1)\}, \tag{2.11}$$
where $\bar{i}$ equals $-i$ and not $(r-1) - i$ as before. We use here the same notation as before since this is done commonly in the technical literature. Each digit is either positive or negative, so there is no need for a separate sign digit. The resulting number system is called the signed-digit (SD) system.

Example 2.5
For $r = 10$ the allowed digits are $\{\bar{9}, \bar{8}, \ldots, \bar{1}, 0, 1, \ldots, 9\}$ and, if $n = 2$, the range is $\bar{9}\bar{9} \le X \le 99$, which includes 199 numbers. However, with two digits $(x_1 x_0)$ each having 19 possibilities, there are $19^2 = 361$ representations, and hence some numbers have more than one representation. The number system is therefore redundant. For example, $(01) = (1\bar{9}) = 1$; $(0\bar{2}) = (\bar{1}8) = -2$. The representation of 0, however, is unique and so is the representation of 10. Out of the 361 representations, $361 - 199 = 162$ are redundant, and thus there is 81% redundancy. The reader can verify that each number in this range has at most two representations. □

As will become apparent later on, adding some redundancy in a number system can be very beneficial. On the other hand, a high level of redundancy might be too costly, since a larger digit set requires a larger number of bits to represent each digit. We may reduce the amount of redundancy by restricting the digit set to

\[
x_i \in \{\bar{a}, \overline{a-1}, \ldots, \bar{1}, 0, 1, \ldots, a\} \quad \text{with} \quad \left\lceil \frac{r-1}{2} \right\rceil \le a \le r-1 \tag{2.12}
\]

where the ceiling $\lceil x \rceil$ of a number $x$ is the smallest integer that is larger than or equal to $x$. At least $r$ different digits are needed to represent a number in a radix $r$ system and with $\bar{a} \le x_i \le a$ we have $2a + 1$ digits. Therefore, the inequality $2a + 1 \ge r$ must be satisfied and the lower bound in inequality (2.12) follows.

Example 2.6
For $r = 10$ the range of $a$ is $5 \le a \le 9$. If we select $a = 6$, then for $n = 2$ there are 133 numbers in the range $\bar{6}\bar{6} \le X \le 66$. Each digit has 13 possible values for a total of $13^2 = 169$ representations; i.e., there is a 27% redundancy. Notice that 1 now has only one representation, namely $(01)$. $(1\bar{9})$ is not a valid representation, since $\bar{9}$ is illegal. However, 4 still has two representations: $(04)$ and $(1\bar{6})$. □

SD representations are useful when developing algorithms for multiplication and division, as described in Chapters 6 and 7. However, the original motivation for introducing SD numbers was to eliminate carry propagation chains in addition and subtraction. Consider the following operation:

\[
(x_{n-1}, \ldots, x_0) \pm (y_{n-1}, \ldots, y_0) = (s_{n-1}, \ldots, s_0)
\]

We want to break the carry chains by having the sum digit $s_i$ depend only on the four operand digits $x_i$, $y_i$, $x_{i-1}$ and $y_{i-1}$. If this can be achieved, then the addition time becomes independent of the length of the operands. An addition algorithm that can achieve this independence consists of two steps:

Step 1: Compute an interim sum $u_i$ and a carry digit $c_i$:

\[
u_i = x_i + y_i - r c_i \quad \text{where} \quad
c_i = \begin{cases} 1 & \text{if } (x_i + y_i) \ge a \\ \bar{1} & \text{if } (x_i + y_i) \le \bar{a} \\ 0 & \text{if } |x_i + y_i| < a \end{cases} \tag{2.13}
\]

Step 2: Calculate the final sum $s_i = u_i + c_{i-1}$.

Example 2.7
If we select $a = 6$ for $r = 10$, then $x_i \in \{\bar{6}, \ldots, 0, 1, \ldots, 6\}$. Step 1 in the above addition algorithm then becomes

\[
u_i = (x_i + y_i) - 10 c_i \quad \text{and} \quad
c_i = \begin{cases} 1 & \text{if } (x_i + y_i) \ge 6 \\ \bar{1} & \text{if } (x_i + y_i) \le \bar{6} \\ 0 & \text{otherwise} \end{cases}
\]

Thus, instead of performing the addition of two decimal numbers, such as 4536 and 1466, in the conventional decimal number system

\[
\begin{array}{crrrr}
 & 4 & 5 & 3 & 6 \\
+ & 1 & 4 & 6 & 6 \\ \hline
 & 6 & 0 & 0 & 2
\end{array}
\]

in which the carry propagates from the least significant digit to the most significant one, we have no carry propagation chain in

\[
\begin{array}{crrrrrl}
 & & 4 & 5 & 3 & 6 & \\
+ & & 1 & 4 & 6 & 6 & \\ \hline
 & 0 & 1 & 1 & 1 & & c_i \\
+ & & 5 & \bar{1} & \bar{1} & 2 & u_i \\ \hline
 & & 6 & 0 & 0 & 2 & s_i
\end{array}
\]

Note that the carry bits were shifted to the left to simplify the execution of the second step of the algorithm. □
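The two-step addition of Equation (2.13) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `sd_add` and `sd_value`, the most-significant-first digit lists, and the default parameters $r = 10$, $a = 6$ (taken from Example 2.7) are assumptions made here, not part of the text.

```python
def sd_add(x, y, r=10, a=6):
    """Two-step SD addition; x and y are equal-length digit lists, MSB first."""
    n = len(x)
    xs, ys = x[::-1], y[::-1]          # work least significant digit first
    c = [0] * n                        # carry digits c_i
    u = [0] * n                        # interim sum digits u_i
    for i in range(n):                 # Step 1: every position is independent
        t = xs[i] + ys[i]
        if t >= a:
            c[i] = 1
        elif t <= -a:
            c[i] = -1
        u[i] = t - r * c[i]
    # Step 2: s_i = u_i + c_{i-1}; the last carry becomes the extra top digit
    s = [u[0]] + [u[i] + c[i - 1] for i in range(1, n)] + [c[n - 1]]
    return s[::-1]


def sd_value(digits, r=10):
    """Value of an SD digit list (MSB first) in radix r."""
    v = 0
    for d in digits:
        v = v * r + d
    return v


print(sd_add([4, 5, 3, 6], [1, 4, 6, 6]))   # operands of Example 2.7
```

Because each $c_i$ and $u_i$ depends only on the digits in position $i$, the loop in Step 1 could be evaluated for all positions in parallel, which is the point of the algorithm.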
In the last example the operands were selected so that they could be viewed either as conventional decimal numbers or as SD decimal numbers with the digit set $\{\bar{6}, \ldots, 0, 1, \ldots, 6\}$. In general, a conventional decimal number may use digits like 7, 8, and 9, which are out of the allowed digit set. The previous addition algorithm can be used for converting a conventional decimal number to SD form by artificially considering each digit as the sum $(x_i + y_i)$ in Equation (2.13).

Example 2.8
To find the SD representation of the decimal number 27956 we apply the previous algorithm, resulting in

\[
\begin{array}{lrrrrr}
x_i + y_i & 2 & 7 & 9 & 5 & 6 \\
c_i & 0 & 1 & 1 & 0 & 1 \\
u_i & 2 & \bar{3} & \bar{1} & 5 & \bar{4} \\ \hline
s_i & 3 & \bar{2} & \bar{1} & 6 & \bar{4}
\end{array}
\]

To convert a number in SD representation to conventional representation one can subtract the digits with negative weight from the digits with positive weight. For $3\bar{2}\bar{1}6\bar{4}$ we obtain

\[
\begin{array}{crrrrr}
 & 3 & 0 & 0 & 6 & 0 \\
- & 0 & 2 & 1 & 0 & 4 \\ \hline
 & 2 & 7 & 9 & 5 & 6
\end{array}
\]
□

To guarantee that no new carry will be generated, the sum digit $s_i$, calculated from $u_i + c_{i-1}$, must satisfy $|s_i| \le a$. Since $|c_{i-1}| \le 1$, the condition $|u_i| \le a - 1$ has to be satisfied for all possible values of $x_i$ and $y_i$. For example, the largest value that $x_i + y_i$ can assume is $2a$, for which $c_i = 1$ and $u_i = 2a - r$. The inequality $u_i = 2a - r \le a - 1$ is clearly satisfied, since $a \le r - 1$. However, if $x_i + y_i = a$, which is the smallest value for which $c_i$ is still 1, then $u_i = a - r < 0$. Substituting $|u_i| = r - a$ into $|u_i| \le (a - 1)$ yields the inequality $2a \ge r + 1$. Hence, the selected digit set must satisfy

\[
\left\lceil \frac{r+1}{2} \right\rceil \le a \le r - 1. \tag{2.14}
\]

We have considered so far only the two extreme values of $x_i + y_i$ for which $c_i = 1$. However, the reader can verify that for all other possible values of $x_i + y_i$, the condition $|u_i| \le a - 1$ is satisfied if $a \ge \lceil (r+1)/2 \rceil$.

Example 2.9
SD decimal numbers must satisfy $a \ge 6$ to guarantee that no new carries will be generated in the previous algorithm. □

2.4 BINARY SD NUMBERS

For $r = 2$, there is only one possible digit set; namely, $\{\bar{1}, 0, 1\}$. In other words, $a$ must equal 1. The interim sum and carry in the addition algorithm from Section 2.3 are

\[
u_i = (x_i + y_i) - 2c_i \quad \text{and} \quad
c_i = \begin{cases} 1 & \text{if } (x_i + y_i) \ge 1 \\ \bar{1} & \text{if } (x_i + y_i) \le \bar{1} \\ 0 & \text{if } (x_i + y_i) = 0. \end{cases}
\]

These rules are summarized in Table 2.1. This table does not include the combinations $x_i y_i = 10$, $x_i y_i = \bar{1}0$, and $x_i y_i = \bar{1}1$, since the addition $x_i + y_i$ is a commutative operation.

\[
\begin{array}{c|cccccc}
x_i y_i & 00 & 01 & 0\bar{1} & 11 & \bar{1}\bar{1} & 1\bar{1} \\ \hline
c_i & 0 & 1 & \bar{1} & 1 & \bar{1} & 0 \\
u_i & 0 & \bar{1} & 1 & 0 & 0 & 0
\end{array}
\]

TABLE 2.1 The rules for adding binary SD numbers.

Note that in the binary case the condition $a \ge \lceil (r+1)/2 \rceil = 2$ cannot be satisfied, and consequently there is no guarantee that a new carry will not be generated in the second step of the algorithm. Still, if the operands to be added do not contain the digit $\bar{1}$, new carries will not be generated. Consider, for example, the addition of the following two numbers, which, in the conventional representation, will generate a carry that will propagate from the least significant position to the most significant position:

\[
\begin{array}{crrrrrl}
 & & 1 & 1 & 1 & 1 & \\
+ & & 0 & 0 & 0 & 1 & \\ \hline
 & 1 & 1 & 1 & 1 & & c_i \\
+ & & \bar{1} & \bar{1} & \bar{1} & 0 & u_i \\ \hline
 & 1 & 0 & 0 & 0 & 0 & s_i
\end{array}
\]

Here, no carry propagation chain exists. However, if SD numbers with $\bar{1}$ digits are added, new carries may occur. For example, if $x_{i-1} y_{i-1} = 01$, then $c_{i-1} = 1$; and if $x_i y_i = 0\bar{1}$, then $u_i = 1$, yielding $s_i = u_i + c_{i-1} = 1 + 1$. Thus, a new carry is generated, as illustrated in the following addition:
\[
\begin{array}{crrrrrrrl}
 & & 0 & \bar{1} & 1 & \bar{1} & 1 & 1 & \\
+ & & 1 & 0 & 0 & \bar{1} & 0 & 1 & \\ \hline
 & 1 & \bar{1} & 1 & \bar{1} & 1 & 1 & & c_i \\
+ & & \bar{1} & 1 & \bar{1} & 0 & \bar{1} & 0 & u_i \\ \hline
 & & * & * & * & 1 & 0 & 0 & s_i
\end{array}
\]

The stars indicate positions where new carries are generated and must be allowed to propagate. Examining the rules in Table 2.1, one can verify that the combination $c_{i-1} = u_i = 1$ occurs when $x_i y_i = 0\bar{1}$ and $x_{i-1} y_{i-1}$ equals either 11 or 01. We can avoid setting $u_i = 1$ in these cases by selecting $c_i = 0$ and therefore making $u_i$ equal $\bar{1}$. We should not, however, change the entry for $x_i y_i = 0\bar{1}$ in Table 2.1 to read $c_i = 0$ and $u_i = \bar{1}$, since for $x_{i-1} y_{i-1} = \bar{1}\bar{1}$, which results in $c_{i-1} = \bar{1}$, we still have to set $c_i = \bar{1}$ and $u_i = 1$. Similarly, the combination $c_{i-1} = u_i = \bar{1}$ occurs when $x_i y_i = 01$ and $x_{i-1} y_{i-1}$ equals either $\bar{1}\bar{1}$ or $0\bar{1}$. We can avoid setting $u_i = \bar{1}$ by selecting in these cases (and in these cases only) $c_i = 0$ and therefore $u_i = 1$. In summary, we can ensure that no new carries will be generated by examining the two bits to the right, $x_{i-1} y_{i-1}$, when determining $u_i$ and $c_i$, arriving at the rules shown in Table 2.2. Observe that we can still calculate $c_i$ and $u_i$ for all bit positions in parallel.

\[
\begin{array}{c|ccccccc}
x_i y_i & 00 & 01 & 01 & 0\bar{1} & 0\bar{1} & 11 & \bar{1}\bar{1} \\
x_{i-1} y_{i-1} & - & \text{neither is } \bar{1} & \text{at least one is } \bar{1} & \text{neither is } \bar{1} & \text{at least one is } \bar{1} & - & - \\ \hline
c_i & 0 & 1 & 0 & 0 & \bar{1} & 1 & \bar{1} \\
u_i & 0 & \bar{1} & 1 & \bar{1} & 1 & 0 & 0
\end{array}
\]

TABLE 2.2 Modified rules for adding binary SD numbers [8].

Example 2.10
Repeating the previous example we obtain

\[
\begin{array}{crrrrrrrl}
 & & 0 & \bar{1} & 1 & \bar{1} & 1 & 1 & \\
+ & & 1 & 0 & 0 & \bar{1} & 0 & 1 & \\ \hline
 & 0 & 0 & 0 & \bar{1} & 1 & 1 & & c_i \\
+ & & 1 & \bar{1} & 1 & 0 & \bar{1} & 0 & u_i \\ \hline
 & & 1 & \bar{1} & 0 & 1 & 0 & 0 & s_i
\end{array}
\]

Note that direct summation of the two operands will result in $1\bar{1}1\bar{1}00$. This, and also 010100, are equivalent to the result above, and all represent the value $20_{10}$. □

Binary SD numbers are particularly useful in the development of fast algorithms for multiplication and division, which are discussed in Chapters 6 and 7. In these cases we will be interested in minimal SD representations; i.e., representations that include a minimal number of nonzero digits. Nonzero digits will correspond to add/subtract operations, and the number of these should be minimized, while zero digits will correspond to shift-only operations.

Example 2.11
For $X = 7$ we have the following representations:

\[
\begin{array}{rrrr}
8 & 4 & 2 & 1 \\ \hline
0 & 1 & 1 & 1 \\
1 & \bar{1} & 1 & 1 \\
1 & 0 & \bar{1} & 1 \\
1 & 0 & 0 & \bar{1}
\end{array}
\]

Out of these, $100\bar{1}$ is the minimal representation. The canonical recoding algorithm generates minimal SD representations of given binary numbers and is presented in Chapter 6. □

Eliminating carry propagation when adding binary numbers can speed up operations like multiplication and division, whose execution usually includes a large number of add/subtract operations. Two concerns arise when SD representations of binary numbers are used in an arithmetic unit. The first one is the exact encoding of three values, namely 0, 1 and $-1$, using binary signals. The second one is the need to convert the result in SD representation to its conventional two's complement representation.

Out of the $4! = 24$ possible ways to encode the three values of a binary signed digit $x$ using two bits, $x^h$ and $x^l$ ($h$ and $l$ for high and low, respectively), only nine are distinct encodings under permutation and logical negation. Out of these nine encodings two have been used in practice. These are shown in Table 2.3.

\[
\begin{array}{c|c|c}
 & \text{Encoding 1} & \text{Encoding 2} \\
x & x^h x^l & x^h x^l \\ \hline
0 & 00 & 00 \\
1 & 01 & 01 \\
\bar{1} & 10 & 11
\end{array}
\]

TABLE 2.3 Two encodings for binary SD numbers.
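The modified rules of Table 2.2 can be checked with a short Python sketch. The function names `bsd_add` and `bsd_value` and the most-significant-first digit lists are assumptions made for illustration; the sketch also assumes that there are no digits to the right of position 0, so that position is treated as having no $\bar{1}$ neighbor.

```python
def bsd_add(x, y):
    """Add binary SD numbers (digits -1, 0, 1; MSB first) using Table 2.2."""
    n = len(x)
    xs, ys = x[::-1], y[::-1]                  # least significant digit first
    c = [0] * n
    u = [0] * n
    for i in range(n):
        t = xs[i] + ys[i]
        # Is at least one of x_{i-1}, y_{i-1} equal to -1?
        prev_neg = i > 0 and (xs[i - 1] == -1 or ys[i - 1] == -1)
        if t == 2:
            c[i], u[i] = 1, 0
        elif t == -2:
            c[i], u[i] = -1, 0
        elif t == 1:
            c[i], u[i] = (0, 1) if prev_neg else (1, -1)
        elif t == -1:
            c[i], u[i] = (-1, 1) if prev_neg else (0, -1)
        # t == 0 leaves c_i = u_i = 0
    # Second step: s_i = u_i + c_{i-1}; no new carry can arise here
    s = [u[0]] + [u[i] + c[i - 1] for i in range(1, n)] + [c[n - 1]]
    return s[::-1]


def bsd_value(digits):
    """Value of a binary SD digit list (MSB first)."""
    v = 0
    for d in digits:
        v = 2 * v + d
    return v


# Operands of Example 2.10: -9 and 29
print(bsd_add([0, -1, 1, -1, 1, 1], [1, 0, 0, -1, 0, 1]))
```

Every digit of the result stays in $\{\bar{1}, 0, 1\}$, which is what the choice between the two alternatives for $x_i + y_i = \pm 1$ is designed to guarantee.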
Encoding 2 can be viewed as a two's complement representation of the signed digit $x$. Encoding 1 is sometimes preferable since it satisfies the simple relation

\[
x = x^l - x^h
\]

and consequently, the combination 11 has a valid numerical value of 0. This encoding also simplifies the conversion of a number from the SD to the two's complement representation. This conversion is done by subtracting the sequence $x^h_{n-1}, x^h_{n-2}, \ldots, x^h_0$ from the sequence $x^l_{n-1}, x^l_{n-2}, \ldots, x^l_0$ using two's complement arithmetic.

There exists another conversion algorithm whose implementation requires a circuit simpler than a complete binary adder [7], [9]. In this algorithm the binary signed digits are examined from right to left, one digit at a time. The algorithm removes all occurrences of $\bar{1}$ digits and "forwards" the negative sign to the most significant bit, the only bit with a negative weight in the two's complement representation. The rightmost $\bar{1}$ digit is replaced by a 1 and the negative sign is forwarded to the left, replacing 0's by 1's until a 1 is reached, which "consumes" the negative sign and is replaced by a 0. If a 1 is not reached then the 0 in the most significant position is turned into a 1, becoming the negative sign bit of the two's complement representation. If a second $\bar{1}$ is encountered before a 1 is, it is replaced by a 0 and the forwarding of the negative sign continues. The negative sign is forwarded with the aid of a "borrow" bit which equals 1 as long as a $\bar{1}$ is being forwarded, and equals 0 otherwise. The rules of this algorithm are summarized in Table 2.4, where $y_i$ is the $i$th digit of the SD number, $z_i$ is the $i$th bit of the corresponding two's complement representation, $c_i$ is the previous "borrow" and $c_{i+1}$ is the next "borrow." For the least significant digit we assume $c_0 = 0$. This algorithm performs the inverse operation to that performed by the canonical recoding algorithm presented in Section 6.1. It satisfies the arithmetic equation

\[
y_i - c_i = z_i - 2c_{i+1}.
\]

\[
\begin{array}{c|cccccc}
y_i & 0 & 1 & \bar{1} & 0 & 1 & \bar{1} \\
c_i & 0 & 0 & 0 & 1 & 1 & 1 \\ \hline
z_i & 0 & 1 & 1 & 1 & 0 & 0 \\
c_{i+1} & 0 & 0 & 1 & 1 & 0 & 1
\end{array}
\]

TABLE 2.4 An algorithm for converting SD to two's complement representation.

By rearranging terms in the above equation we can obtain the arithmetic equation satisfied by the canonical recoding algorithm,

\[
z_i + c_i = y_i + 2c_{i+1}.
\]

Example 2.12
To convert the SD representation of the number $-10$ to two's complement we apply the previous algorithm, resulting in

\[
\begin{array}{l|rrrrr}
y_i & 0 & \bar{1} & 0 & \bar{1} & 0 \\
c_i & 1 & 1 & 1 & 0 & 0 \\ \hline
z_i & 1 & 0 & 1 & 1 & 0
\end{array}
\]

Since the range of representable numbers in the SD method is almost double that of the two's complement method, an $n$-digit SD number must be converted to an $(n+1)$-bit two's complement representation, as illustrated below.

\[
\begin{array}{l|rrrrrrr}
y_i & & 0 & 1 & 0 & 1 & 0 & \bar{1} \\
c_i & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\ \hline
z_i & & 0 & 1 & 0 & 0 & 1 & 1
\end{array}
\]

Without the extra bit position the number 19 would be converted to $-13$. □

The use of Encoding 2 in Table 2.3 also has some advantages. Since the value of the operand digit $x_i$ is given by $-2x_i^h + x_i^l$, there are bits, namely $x_i^l$ and $x_{i-1}^h$, in two adjacent digit positions which have the same weight. We can then regroup the bits $x_i^l$ and $x_{i-1}^h$ to form a new digit $\tilde{x}_i$. Performing such equal-weight grouping [6] allows us to derive new addition rules for $\tilde{x}_i$ and $\tilde{y}_i$ which may lead to simpler implementations. Note that in Table 2.5 the bits $x_{i-1}^h$ and $y_{i-1}^h$ are checked only when $\tilde{x}_i + \tilde{y}_i = -1$ but not when $\tilde{x}_i + \tilde{y}_i = 1$. This may lead to a simpler implementation compared to Table 2.2, where the information about the previous digits is required also for the case $x_i + y_i = 1$.

TABLE 2.5 Rules for adding binary SD numbers with equal-weight grouping [6].
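The borrow-forwarding conversion of Table 2.4 follows directly from the relation $y_i - c_i = z_i - 2c_{i+1}$, and can be sketched as below. The function names `sd_to_twos_complement` and `twos_value` are hypothetical, and the digit lists are assumed to be most significant first; a leading 0 digit is prepended so that an $n$-digit SD number yields $n+1$ bits, as in Example 2.12.

```python
def sd_to_twos_complement(y):
    """Convert n binary signed digits (MSB first) to n+1 two's complement bits."""
    digits = [0] + list(y)       # extra most significant digit
    bits = []
    c = 0                        # borrow c_0 = 0
    for d in reversed(digits):   # examine digits from right to left
        t = d - c                # y_i - c_i, a value in {-2, -1, 0, 1}
        z = t % 2                # z_i (Python's % always returns 0 or 1 here)
        c = (z - t) // 2         # next borrow c_{i+1}
        bits.append(z)
    return bits[::-1]


def twos_value(bits):
    """Value of a two's complement bit list (MSB first)."""
    v = -bits[0]
    for b in bits[1:]:
        v = 2 * v + b
    return v


print(sd_to_twos_complement([1, 0, 1, 0, -1]))   # SD representation of 19
```

The modular arithmetic is only a compact way of writing the six columns of Table 2.4; a hardware implementation would use the table's Boolean rules instead.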
2.5 EXERCISES

2.1. Find the fixed-point representations of the two values ([illegible].25)$_{10}$ and ([illegible].25)$_{10}$ in the following radix-$r$ number systems. Each number consists of $k$ integer digits and $m$ fraction digits.
(a) Signed-digit representation with $r$ = [illegible], $k$ = [illegible], $m = 2$, and the digit set $\{\bar{2}, \bar{1}, 0, 1, 2\}$. At least one negative digit must appear in the representation.
(b) A nega-decimal system with $r = -10$, $k = 3$, and $m = 2$.

2.2. Given the value $(-14)_{10}$, the word length $n = 6$, and the digit set $\{\bar{1}, 0, 1\}$, find all the possible six-digit SD representations of the given value. Indicate the minimal representation. Is this representation unique?

2.3. Verify that the representation of 0 in any SD number system is unique.

2.4. Prove that any fixed-radix number system $\langle n, \beta, \Lambda \rangle$ is nonredundant.

2.5. Calculate the difference between the two numbers 0010 and 0101 (in the conventional binary system) in two ways: first, in the traditional way by adding the two's complement of 0101 to 0010, then by using Equation (2.10). Is there an advantage to using one of the two methods?

2.6. Show that if $a \ge \lceil (r+1)/2 \rceil$, no new carry will be generated when adding SD numbers.

2.7. Find all the values of the radix $r$ for which the two-step algorithm for adding SD numbers will not generate new carries if and only if the maximum redundancy is followed.

2.8. Show that in the negative-radix system with $\beta > 2$ the additive inverse of $Q$ is $-Q = (\ldots, 2, 2, 2, 1)$. Find the additive inverse of 0019 in the nega-decimal system. Verify your result by adding it to 0019.

2.9. Can modified rules such as those in Table 2.2 be derived for $r > 2$ so that less redundancy (e.g., $a = \lceil (r-1)/2 \rceil$) will be needed?

2.10. Are the modified rules for adding binary SD numbers in Table 2.2 unique? In other words, can you suggest another set of rules that will guarantee that no new carries will be generated in the second step of the addition?

2.11. An imaginary-radix number system can be defined as follows. Let the radix $r$ have the form $r = j\sqrt{\beta}$ where $j = \sqrt{-1}$ and $\beta$ is a positive integer. Let the digit set be $\{0, 1, \ldots, \beta - 1\}$. Show that all even-position digits represent a real number $Y$ in the negative-radix ($-\beta$) system and all odd-position digits represent an imaginary number $j\sqrt{\beta}Z$ in the same negative-radix number system. As a result, we can write $X = Y + j\sqrt{\beta}Z$, representing a complex number by a single sequence. Therefore, instead of performing four multiplications and two summations,

\[
(Y_1 + j\sqrt{\beta}Z_1) \times (Y_2 + j\sqrt{\beta}Z_2) = (Y_1 Y_2 - \beta Z_1 Z_2) + j\sqrt{\beta}(Y_1 Z_2 + Y_2 Z_1)
\]

we can directly multiply two complex numbers $X_1$ and $X_2$. Will the use of this imaginary-radix number system speed up the multiplication of complex numbers?

2.12. Write the Boolean equations for a circuit implementing the conversion algorithm in Table 2.4 using Encoding 1 from Table 2.3. Point out the similarities between the resulting circuit and a full-adder (FA, see Section 5.1). Discuss the possibility of employing any speedup techniques used for binary addition.

2.13. Show that the algorithm, summarized in Table 2.4, for converting an SD to a two's complement representation of a binary number, can be performed by forming the sequence $0, |y_{n-1}|, |y_{n-2}|, \ldots, |y_1|, |y_0|$ and then performing a bit-wise exclusive-OR operation with the sequence $c_n, c_{n-1}, c_{n-2}, \ldots, c_1, 0$ which is obtained according to the rules in Table 2.4. This version of the conversion algorithm was presented in [7].

2.14. A hybrid SD number system was presented in [5]. In this system only some digit positions are signed while the rest remain unsigned. Develop two sets of rules for adding the corresponding digits of two hybrid numbers, similar to those in Table 2.2. The first table will indicate the rules for selecting the carry $c_i$ and intermediate sum $u_i$ for two signed digits $x_i$ and $y_i$ with $a_{i-1}$ and $b_{i-1}$ being the lower order two unsigned bits. The second table will indicate the rules for selecting the carry $c_i$ and intermediate sum $e_i$ for two unsigned digits $a_i$ and $b_i$ with an incoming carry $c_{i-1}$ that can assume the values 0, 1, and $-1$. Show that if $d$ is the longest distance between neighboring signed digits then the maximum carry propagation chain is of length $(d + 1)$.

2.15. Show that the addition rules in Table 2.5 guarantee that no new carries will be generated in the second step of the addition.

2.6 REFERENCES

[1] A. AVIZIENIS, "Signed-digit number representations for fast parallel arithmetic," IRE Trans. on Elec. Comput., EC-10 (Sept. 1961), 389-400.
[2] H. L. GARNER, "Number systems and arithmetic," in Advances in Computers, Vol. 6, F. L. Alt and M. Rubinoff (Eds.), Academic, New York, 1965, pp. 131-194.
[3] I. KOREN and Y. MALINIAK, "On classes of positive, negative and imaginary radix number systems," IEEE Trans. on Computers, C-30 (May 1981), 312-317.
[4] B. PARHAMI, "Generalized signed-digit number systems: A unifying framework for redundant number representations," IEEE Trans. on Computers, 39 (Jan. 1990), 89-98.
[5] D. S. PHATAK and I. KOREN, "Hybrid signed-digit number systems: A unified framework for redundant number representations with bounded carry propagation chains," IEEE Trans. on Computers, 43 (August 1994), 880-891.
[6] D. S. PHATAK, T. GOFF, and I. KOREN, "Constant-time addition and simultaneous format conversion based on redundant binary representations," IEEE Trans. on Computers, 50 (2001).
[7] H. R. SRINIVAS and K. K. PARHI, "A fast VLSI adder architecture," IEEE J. of Solid-State Circuits, 27 (May 1992), 761-767.
[8] N. TAKAGI, H. YASUURA, and S. YAJIMA, "High-speed VLSI multiplication algorithm with a redundant binary addition tree," IEEE Trans. on Computers, C-34 (Sept. 1985), 789-796.
[9] S. M. YEN, C. S. LAIH, C. H. CHEN, and J. Y. LEE, "An efficient redundant-binary number to binary number converter," IEEE J. of Solid-State Circuits, 27 (Jan. 1992), 109-112.

3 SEQUENTIAL ALGORITHMS FOR MULTIPLICATION AND DIVISION

This chapter presents the basic sequential algorithms for multiplication, division, and square root extraction. Algorithms for high-speed multiplication are described in Chapter 6. Chapters 7 and 8 include algorithms for fast division and high-speed calculation of square roots.

3.1 SEQUENTIAL MULTIPLICATION

Let the multiplier and multiplicand be denoted by $X$ and $A$, respectively, with the following sequences of digits:

\[
X = x_{n-1} x_{n-2} \cdots x_1 x_0 \qquad A = a_{n-1} a_{n-2} \cdots a_1 a_0
\]

where $x_{n-1}$ and $a_{n-1}$ are the sign digits in either the signed-magnitude or the complement methods. The sequential algorithm for multiplication consists of $n - 1$ steps, where in step $j$ the multiplier bit $x_j$ is examined and the product $x_j A$ is added to the previously accumulated partial product, denoted by $P^{(j)}$. The appropriate expression for this recursive procedure is

\[
P^{(j+1)} = (P^{(j)} + x_j \cdot A) \cdot 2^{-1}; \qquad j = 0, 1, 2, \ldots, n-2 \tag{3.1}
\]

where in the first step $P^{(0)} = 0$. Multiplying the sum $(P^{(j)} + x_j A)$ by $2^{-1}$ shifts it by one position to the right before adding the next product $x_{j+1} A$. This alignment is necessary, since the weight of $x_{j+1}$ is double that of $x_j$.

To prove that the above procedure calculates the product of $A$ and $X$, we repeatedly substitute into the recursive Equation (3.1), yielding

\[
\begin{aligned}
P^{(n-1)} &= (P^{(n-2)} + x_{n-2} \cdot A) \cdot 2^{-1} \\
&= \left((P^{(n-3)} + x_{n-3} \cdot A) \cdot 2^{-1} + x_{n-2} \cdot A\right) \cdot 2^{-1} = \cdots \\
&= \left(x_{n-2} 2^{-1} + x_{n-3} 2^{-2} + \cdots + x_0 2^{-(n-1)}\right) \cdot A \\
&= \left(\sum_{j=0}^{n-2} x_j 2^{-(n-1-j)}\right) \cdot A = 2^{-(n-1)} \left(\sum_{j=0}^{n-2} x_j 2^j\right) \cdot A
\end{aligned}
\]

If both operands are positive (i.e., $x_{n-1} = a_{n-1} = 0$), the product $U$ is obtained from

\[
U = 2^{n-1} \cdot P^{(n-1)} = \left(\sum_{j=0}^{n-2} x_j 2^j\right) \cdot A = X \cdot A \tag{3.2}
\]

The result is a product consisting of $2(n-1)$ bits for its magnitude. To prove this, note that the maximum value of $U$ is obtained when $A$ and $X$ assume their maximum value. Therefore,

\[
U_{max} = (2^{n-1} - 1)(2^{n-1} - 1) = 2^{2n-2} - 2^n + 1 = 2^{2n-3} + (2^{2n-3} - 2^n + 1) \tag{3.3}
\]

Since the last term in Equation (3.3) is positive for $n \ge 3$, the following inequality holds:

\[
2^{2n-3} < U_{max} < 2^{2n-2}; \qquad n \ge 3 \tag{3.4}
\]

Thus, $(2n - 2)$ bits are required to represent the value, producing a total of $(2n - 1)$ bits when added to the sign bit.

For signed-magnitude numbers we multiply the two magnitudes using the above algorithm and generate the sign of the result separately (it is positive if both operands have the same sign and negative otherwise). For two's and one's complement representations we should distinguish between multiplication with a negative multiplicand $A$ and multiplication with a negative multiplier $X$. If only the multiplicand is negative, there is no need to change the previous algorithm. We only have to add some multiple of a negative number that is represented in either two's or one's complement. This is illustrated in the next example.

Example 3.1
In the following multiply operation, the multiplicand $A$ is a negative number represented in the two's complement method, while the multiplier $X$ is positive. Both are four bits long and the final product therefore has seven bits, including the sign bit. In an arithmetic unit for 4-bit operands, all registers are four bits long, and consequently a double-length register is required for storing the final product. The vertical line in the table below separates the most significant half of the product, which can be stored in a single-length register (four bits long), from the least significant half, which can be stored in a second single-length register. The three bits of the multiplier, $x_2$, $x_1$, and $x_0$, are examined one bit at a time, starting with the least significant bit $x_0$. An add-and-shift or shift-only operation is then performed accordingly.

\[
\begin{array}{lcc|lr}
A & & 1\;0\;1\;1 & & -5 \\
X & \times & 0\;0\;1\;1 & & 3 \\ \hline
P^{(0)} = 0 & & 0\;0\;0\;0 & & \\
x_0 = 1 \Rightarrow \text{Add } A & + & 1\;0\;1\;1 & & \\ \hline
 & & 1\;0\;1\;1 & & \\
\text{Shift} & & 1\;1\;0\;1 & 1 & \\
x_1 = 1 \Rightarrow \text{Add } A & + & 1\;0\;1\;1 & & \\ \hline
 & & 1\;0\;0\;0 & 1 & \\
\text{Shift} & & 1\;1\;0\;0 & 0\;1 & \\
x_2 = 0 \Rightarrow \text{Shift only} & & 1\;1\;1\;0 & 0\;0\;1 & -15
\end{array}
\]

The final result is negative and is properly represented in two's complement. Note that the partial product bits in the least significant half do not participate in the add operation, and that all four bit positions in the first register (holding the most significant half of the final product) are utilized. However, only three bit positions in the second register are utilized, leaving the least significant bit position unused. This need not necessarily be the final arrangement. The three bits in the second register can afterwards be stored in the three rightmost positions, and the sign bit of the second register can then be set according to one of the following two possibilities: (1) Always set the sign bit to 0, irrespective of the sign of the product, since it is the least significant part of the result; (2) Set the sign bit equal to the sign bit of the first register. Another possible arrangement is to use all four bit positions in the second register for the four least significant bits of the product, use the rightmost two bit positions in the first register, and insert two copies of the sign bit into the remaining bit positions. □

The situation, however, is different when the multiplier is negative. Here, we consider each bit separately, and the sign bit (which has a negative weight) cannot be treated in the same way as the other bits. First consider two's complement numbers, which satisfy

\[
X = -x_{n-1} 2^{n-1} + \tilde{X} \tag{3.5}
\]

where $\tilde{X} = \sum_{j=0}^{n-2} x_j 2^j$.
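The add-and-shift recursion of Equation (3.1), for a multiplier whose sign bit is 0, can be sketched as follows. This is an illustrative sketch rather than a register-level model: the function name `sequential_multiply` is hypothetical, the multiplicand is passed as an ordinary signed integer, and exact rational arithmetic stands in for the double-length register so that the right shifts lose no bits.

```python
from fractions import Fraction


def sequential_multiply(a, x_bits):
    """Multiply a by a nonnegative multiplier given as bits x_{n-2}..x_0 (MSB first)."""
    p = Fraction(0)                        # P^(0) = 0
    for xj in reversed(x_bits):            # examine x_0 first
        p = (p + xj * a) / 2               # add x_j * A, then shift right
    return int(p * 2 ** len(x_bits))       # U = 2^(n-1) * P^(n-1), Eq. (3.2)


# Operands of Example 3.1: A = -5, X = 0011 (sign bit omitted)
print(sequential_multiply(-5, [0, 1, 1]))
```

A negative multiplicand needs no special handling here, just as in Example 3.1; a negative multiplier is deliberately not covered by this sketch.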
If the sign bit of the multiplier in the previously presented procedure is ignored, then the final result $U$ satisfies

\[
U = \tilde{X} \cdot A = (X + x_{n-1} \cdot 2^{n-1}) \cdot A = X \cdot A + A \cdot x_{n-1} \cdot 2^{n-1}. \tag{3.6}
\]

The term $X \cdot A$ is the desired product and hence, if $x_{n-1} = 1$, the following correction is necessary:

\[
X \cdot A = U - A \cdot x_{n-1} \cdot 2^{n-1} \tag{3.7}
\]

In other words, if $x_{n-1} = 1$, we must subtract the multiplicand $A$ from the most significant half of $U$.

Example 3.2
The multiplier and multiplicand in this example are both negative numbers in the two's complement representation:

\[
\begin{array}{lcc|lr}
A & & 1\;0\;1\;1 & & -5 \\
X & \times & 1\;1\;0\;1 & & -3 \\ \hline
x_0 = 1 \Rightarrow \text{Add } A & & 1\;0\;1\;1 & & \\
\text{Shift} & & 1\;1\;0\;1 & 1 & \\
x_1 = 0 \Rightarrow \text{Shift only} & & 1\;1\;1\;0 & 1\;1 & \\
x_2 = 1 \Rightarrow \text{Add } A & + & 1\;0\;1\;1 & & \\ \hline
 & & 1\;0\;0\;1 & 1\;1 & \\
\text{Shift} & & 1\;1\;0\;0 & 1\;1\;1 & \\
x_3 = 1 \Rightarrow \text{Correct} & + & 0\;1\;0\;1 & & \\ \hline
 & & 0\;0\;0\;1 & 1\;1\;1 & +15
\end{array}
\]

In the correction step, the subtraction of the multiplicand is performed by adding its two's complement. □

Similarly, when multiplying one's complement numbers, which satisfy

\[
X = -x_{n-1}(2^{n-1} - ulp) + \tilde{X} \tag{3.8}
\]

then,

\[
X \cdot A = U - x_{n-1} \cdot 2^{n-1} \cdot A + x_{n-1} \cdot ulp \cdot A. \tag{3.9}
\]

Thus, if $x_{n-1} = 1$, we start with $P^{(0)} = A$, which takes care of the second correction term, namely, $x_{n-1} \cdot ulp \cdot A$, and at the end of the process, we subtract the first correction term, $A \cdot x_{n-1} \cdot 2^{n-1}$.

Example 3.3
The product of 5 and $-3$ in one's complement representation is

\[
\begin{array}{lcc|lr}
A & & 0\;1\;0\;1 & & 5 \\
X & \times & 1\;1\;0\;0 & & -3 \\ \hline
x_3 = 1 \Rightarrow P^{(0)} = A & & 0\;1\;0\;1 & & \\
x_0 = 0 \Rightarrow \text{Shift} & & 0\;0\;1\;0 & 1 & \\
x_1 = 0 \Rightarrow \text{Shift} & & 0\;0\;0\;1 & 0\;1 & \\
x_2 = 1 \Rightarrow \text{Add } A & + & 0\;1\;0\;1 & & \\ \hline
 & & 0\;1\;1\;0 & 0\;1 & \\
\text{Shift} & & 0\;0\;1\;1 & 0\;0\;1 & \\
x_3 = 1 \Rightarrow \text{Correct} & + & 1\;0\;1\;0 & 1\;1\;1 & \\ \hline
 & & 1\;1\;1\;0 & 0\;0\;0 & -15
\end{array}
\]

As in the previous example, the subtraction of the (first) correction term is accomplished by adding its one's complement. However, unlike the previous example, the one's complement has to be expanded to double size using the sign digit (see Section 1.6). This implies that a double-length binary adder is needed. □

3.2 SEQUENTIAL DIVISION

Division is the most complex of the four basic arithmetic operations and, consequently, the most time-consuming. Unlike the other three operations, division, in general, has a result consisting of two components. Given a dividend $X$ and a divisor $D$, a quotient $Q$ and a remainder $R$ have to be calculated so as to satisfy

\[
X = Q \cdot D + R \quad \text{with} \quad R < D. \tag{3.10}
\]

We will assume at first, for simplicity, that the operands $X$ and $D$ and the results $Q$ and $R$ are positive numbers. In many fixed-point arithmetic units, a double-length product is available after a multiply operation, and we wish to allow the use of this result in a subsequent divide operation. Thus, $X$ may occupy a double-length register, while all other operands are stored in single-length registers. Consequently, we have to make sure that the resulting quotient $Q$ is smaller than or equal to the largest number that we can store in a single-length register. If $n$ is the number of bits in a single-length register, then every single-length integer is smaller than $2^{n-1}$. Therefore, to ensure that the quotient is a single-length integer (i.e., the inequality $Q < 2^{n-1}$ is satisfied), we must require that

\[
X < 2^{n-1} D.
\]
If this condition is not satisfied, an overflow indication should be produced by the arithmetic unit. One should be aware that the above condition can always be satisfied by preshifting one of the operands $X$ or $D$ (or both). This preshifting is especially simple to apply when the operands are floating-point numbers. Another condition that has to be checked is that $D \ne 0$. If this is not the case, a divide by zero indication should be generated by the arithmetic unit. Unlike the previous condition, no corrective action can be taken when $D = 0$.

The presentation of algorithms for division is simpler when the dividend and divisor, as well as the quotient and remainder, are interpreted as fractions. In this case, the divide overflow condition becomes $X < D$ to ensure that the quotient is a fraction. The division procedure that is presented next assumes that all operands and results are fractions, but is clearly also valid for integers, as will become apparent later on.

To obtain the fractional (positive) quotient $Q = 0.q_1 \cdots q_m$ (where $m = n - 1$), we perform the division as a sequence of subtractions and shifts. In step $i$ of the process the remainder is compared to the divisor $D$. If the remainder is the larger of the two, then the quotient bit $q_i$ is set to 1. If not, it is set to 0. The equation for the $i$th step is

$$r_i = 2r_{i-1} - q_i \cdot D; \qquad i = 1, 2, \ldots, m \tag{3.11}$$

where $r_i$ is the new remainder and $r_{i-1}$ is the previous remainder. The first remainder is $r_0 = X$. Thus, $q_i$ is determined by comparing $2r_{i-1}$ to $D$. This comparison is the most complicated operation in the division process.

We will now prove that the above procedure indeed calculates the quotient and the final remainder. The remainder in the last step is $r_m$ and repeated substitution of Equation (3.11) yields

$$r_m = 2r_{m-1} - q_m \cdot D = 2(2r_{m-2} - q_{m-1} \cdot D) - q_m \cdot D = \cdots = 2^m r_0 - (q_m + 2q_{m-1} + \cdots + 2^{m-1} q_1) \cdot D.$$

Substituting $r_0 = X$ and dividing both sides by $2^m$ results in

$$r_m 2^{-m} = X - (q_1 2^{-1} + q_2 2^{-2} + \cdots + q_m 2^{-m}) \cdot D;$$

hence $r_m 2^{-m} = X - Q \cdot D$ as required. Note that the true final remainder is

$$R = r_m 2^{-m}. \tag{3.12}$$

Example 3.4
Let $X = (0.100000)_2 = 1/2$ and $D = (0.110)_2 = 3/4$. The dividend occupies a double-length register. The condition $X < D$ is clearly satisfied. The division steps are:

    r0 = X                 0.100000
    2r0                   01.00000      set q1 = 1
    Add -D              + 11.010
    r1 = 2r0 - D          00.01000
    2r1                   00.1000       set q2 = 0
    r2 = 2r1              00.1000
    2r2                   01.000        set q3 = 1
    Add -D              + 11.010
    r3 = 2r2 - D          00.010

Note that the generation of $2r_0$ should not result in an overflow indication (multiplying a positive number by 2 should result in a positive number), since the quotient and remainder are within the proper range for the given dividend and divisor. Hence, an extra bit position in the arithmetic unit is needed. The final results are $Q = (0.101)_2 = 5/8$ and $R = r_m 2^{-m} = r_3 2^{-3} = 1/4 \cdot 2^{-3} = 1/32$. (The precise quotient is the infinite binary fraction $2/3 = 0.1010101\ldots$.) The quotient and final remainder satisfy the equation $X = Q \cdot D + R = 5/8 \cdot 3/4 + 1/32 = 16/32 = 1/2$. □

Exactly the same procedure should be followed if the operands and result are integers. In this case we may rewrite Equation (3.10) as follows:

$$2^{2n-2} X_F = 2^{n-1} Q_F \cdot 2^{n-1} D_F + 2^{n-1} R_F \tag{3.13}$$

where $X_F$, $D_F$, $Q_F$, and $R_F$ are fractions. Dividing Equation (3.13) by $2^{2n-2}$ yields

$$X_F = Q_F \cdot D_F + 2^{-(n-1)} R_F. \tag{3.14}$$

The above mentioned condition $X < 2^{n-1} D$, when divided by $2^{2n-2}$, now takes the form $X_F < D_F$.

Example 3.5
We repeat the previous example with all operands and results being integers. In this case the double-length dividend is $X = 0100000_2 = 32$, and the divisor is $D = 0110_2 = 6$. The overflow condition $X < 2^{n-1} D$ is tested by comparing the most significant half of $X$, 0100, to $D$, 0110. The results of the division are $Q = 0101_2 = 5$ and $R = 0010_2 = 2$. Observe that in the final step of the process the true remainder $R$ is generated and, as can be verified from Equation (3.14), there is no need to further multiply it by $2^{-(n-1)}$. □
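The restoring procedure built on Equation (3.11) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the operands are positive fractions held as exact rationals (`fractions.Fraction`) rather than fixed-width registers, and the function name `restoring_divide` is hypothetical, not taken from the text.

```python
from fractions import Fraction

def restoring_divide(x, d, m):
    """Sketch of restoring division for fractions with 0 <= x < d.

    Returns (quotient bits q1..qm, quotient Q, true remainder R = r_m * 2**-m).
    """
    if d == 0:
        raise ZeroDivisionError("divide by zero: no corrective action possible")
    if not 0 <= x < d:
        raise OverflowError("divide overflow: the condition X < D is violated")
    r = Fraction(x)                       # r0 = X
    bits = []
    for _ in range(m):
        tentative = 2 * r - d             # compare 2r_{i-1} to D by subtracting
        if tentative >= 0:
            bits.append(1)                # q_i = 1, keep the new remainder
            r = tentative
        else:
            bits.append(0)                # q_i = 0, restore: r_i = 2r_{i-1}
            r = 2 * r
    q = sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))
    return bits, q, r * Fraction(1, 2 ** m)

if __name__ == "__main__":
    # Operands of Example 3.4: X = 1/2, D = 3/4, m = 3 quotient bits.
    print(restoring_divide(Fraction(1, 2), Fraction(3, 4), 3))
```

For the operands of Example 3.4 the sketch returns the bits 1, 0, 1, the quotient 5/8 and the remainder 1/32. Scaling as in Equation (3.13) with $n = 4$ recovers the integer results of Example 3.5: $Q = 8 \cdot 5/8 = 5$ and $R = 64 \cdot 1/32 = 2$.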
The most difficult step in the division procedure is the comparison between the divisor and the remainder to determine the quotient bit. If this is done by subtracting $D$ from $2r_{i-1}$, then in the case of a negative result we set $q_i = 0$, and we must restore the remainder to its previous value. This method is therefore called restoring division, and can be diagrammed as shown in Figure 3.1. Such a diagram is sometimes called a Robertson diagram [7]. This diagram illustrates the fact that if $r_{i-1} < D$, $q_i$ should be selected so as to ensure $r_i < D$. Since $r_0 = X < D$, we are guaranteed to obtain $R < D$.

FIGURE 3.1 Restoring division.

In summary, a division performed by the restoring method uses $m$ subtractions, $m$ shift operations, and an average of $m/2$ restore operations. The latter can be implemented either by adding $D$ or by retaining a copy of the previous remainder, thus avoiding the time penalty involved in the restore operations.

3.3 NONRESTORING DIVISION

An alternative scheme for sequential division is the nonrestoring division algorithm, in which the quotient bit is not corrected and the remainder is not restored immediately if it is negative. These corrections are instead postponed to later steps. In the restoring method, if $2r_{i-1} - D$ is negative, the remainder is restored to $2r_{i-1}$. It is then shifted and $D$ is once again subtracted, obtaining $4r_{i-1} - D$. This process is repeated as long as the remainder is negative. In the nonrestoring method we avoid the restore operation, stay with a negative remainder $2r_{i-1} - D < 0$, shift it, and then attempt to correct it by adding $D$, obtaining $2(2r_{i-1} - D) + D = 4r_{i-1} - D$. Thus, this algorithm produces a remainder equal to the one we would generate using restoring division.

Consider now the resulting quotient. To enable the correction of a "wrong" selection of the quotient bit in step $i$, we must allow the next quotient bit, $q_{i+1}$, to assume a negative value. In other words, the allowed values for $q_i$ are 1 and $\bar{1}$, where $\bar{1}$ represents $-1$. If $q_i$ was incorrectly set to 1, resulting in a negative remainder, we would then select $q_{i+1} = \bar{1}$ and add $D$ to the remainder. Hence, instead of $q_i q_{i+1} = 10$ (which is too large), we would get $q_i q_{i+1} = 1\bar{1} = 01$. Further correction, if needed, would be done in the next steps. Consequently, the quotient bit is determined in the nonrestoring scheme by the following rule:

$$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge 0 \\ \bar{1} & \text{if } 2r_{i-1} < 0 \end{cases} \tag{3.15}$$

This rule is simpler (and faster to execute) than the selection rule for restoring division since it requires the comparison of $2r_{i-1}$ to 0 rather than $D$. The remainder is computed using the same equation

$$r_i = 2r_{i-1} - q_i \cdot D; \tag{3.16}$$

in other words, subtract the divisor $D$ if $2r_{i-1}$ is positive and add it otherwise. The nonrestoring division is diagrammed in Figure 3.2. Here, $|r_{i-1}| < D$ and $q_i$ is selected to ensure $|r_i| < D$. Note that $q_i \ne 0$ and therefore, at each step, either an addition or subtraction is performed. This is not an SD representation, and there is no redundancy in the representation of the quotient in the nonrestoring division.

FIGURE 3.2 Nonrestoring division.

In summary, the nonrestoring method requires exactly $m$ add/subtract and shift operations. Its main advantage is its simpler selection rule.

Example 3.6
Let $X = (0.100)_2 = 1/2$, and $D = (0.110)_2 = 3/4$, as in Example 3.4.
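The selection rule (3.15) and the recurrence (3.16) map directly onto a short loop. The Python sketch below is illustrative only: it assumes positive fractional operands with $X < D$, uses exact rationals instead of registers, and omits any final correction of the remainder or quotient. The function name `nonrestoring_divide` is hypothetical.

```python
from fractions import Fraction

def nonrestoring_divide(x, d, m):
    """Sketch of nonrestoring division for positive fractions with x < d.

    Returns (signed digits q1..qm in {1, -1}, quotient Q, last remainder r_m).
    """
    r = Fraction(x)                       # r0 = X
    digits = []
    for _ in range(m):
        q = 1 if 2 * r >= 0 else -1       # rule (3.15): compare 2r_{i-1} to 0
        r = 2 * r - q * d                 # recurrence (3.16): subtract or add D
        digits.append(q)
    quotient = sum(Fraction(q, 2 ** (i + 1)) for i, q in enumerate(digits))
    return digits, quotient, r

if __name__ == "__main__":
    # Operands X = 1/2 and D = 3/4 with m = 3 digits.
    print(nonrestoring_divide(Fraction(1, 2), Fraction(3, 4), 3))
```

With $X = 1/2$, $D = 3/4$ and $m = 3$ the sketch produces the digits 1, 1, $-1$, the quotient 5/8 and $r_3 = 1/4$. The only test in each step is the sign of $2r_{i-1}$, which is the simplification that the nonrestoring rule offers over a comparison with $D$.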
    (1)   r0 = X             0.100
    (2)   2r0               01.000      set q1 = 1
    (3)   Add -D          + 11.010
    (4)   r1                00.010
    (5)   2r1               00.100      set q2 = 1
    (6)   Add -D          + 11.010
    (7)   r2                11.110
    (8)   2r2               11.100      set q3 = $\bar{1}$
    (9)   Add D           + 00.110
    (10)  r3                00.010

The final remainder is the same as before, and the quotient is $Q = 0.11\bar{1} = 0.101_2 = 5/8$. □

The nonrestoring division process in the previous example can be represented graphically using a diagram similar to the one depicted in Figure 3.2. The resulting diagram is shown in Figure 3.3. The horizontal lines correspond to the Add $\pm D$ operation in lines 3, 6 and 9 in Example 3.6, and the diagonal lines correspond to the Multiply by 2 operation in lines 2, 5 and 8.

FIGURE 3.3 The nonrestoring division in Example 3.6.

A very important feature of nonrestoring division is that it can easily be extended to two's complement negative numbers. The generalized selection rule for $q_i$ is

$$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \text{ and } D \text{ have the same sign} \\ \bar{1} & \text{if } 2r_{i-1} \text{ and } D \text{ have opposite signs} \end{cases} \tag{3.17}$$

Since the remainder changes signs during the process, there is nothing special about a negative dividend $X$. The following example illustrates the case of a negative divisor in two's complement.

Example 3.7
Let $X = (0.100)_2 = 1/2$ and $D = (1.010)_2 = -3/4$.

    r0 = X             0.100
    2r0               01.000      set q1 = $\bar{1}$
    Add D           + 11.010
    r1                00.010
    2r1               00.100      set q2 = $\bar{1}$
    Add D           + 11.010
    r2                11.110
    2r2               11.100      set q3 = 1
    Add -D          + 00.110
    r3                00.010

Finally, $Q = 0.\bar{1}\bar{1}1 = 0.\bar{1}0\bar{1} = -(0.101)_2 = -5/8$, or in two's complement, 1.011. Note that the final remainder is 1/32 and has the same sign as the dividend $X$. □

By definition, the sign of the final remainder must equal that of the dividend. For example, when dividing 5 by 3 we should obtain a quotient of 1 and a final remainder of 2, and not a quotient of 2 and a final remainder of $-1$, although this remainder still satisfies $|R| < D$. Consequently, if the sign of the final remainder is different from that of the dividend, a correction of both the final remainder and quotient is needed. This situation, requiring a correction step, arises since the quotient digits in the nonrestoring division algorithm are restricted to $\{1, \bar{1}\}$. The last digit cannot be set to 0 and therefore an even quotient cannot be generated.

Example 3.8
Let $X = (0.101)_2 = 5/8$, and $D = (0.110)_2 = 3/4$. Then

    r0 = X             0.101
    2r0               01.010      set q1 = 1
    Add -D          + 11.010
    r1                00.100
    2r1               01.000      set q2 = 1
    Add -D          + 11.010
    r2                00.010
    2r2               00.100      set q3 = 1
    Add -D          + 11.010
    r3                11.110
The final remainder is negative, while the dividend is positive. We must correct the final remainder by adding $D$ to $r_3$, yielding $1.110 + 0.110 = 0.100$, and then correct the quotient:

$$Q_{corrected} = Q - ulp$$

where $Q = 0.111$, and therefore $Q_{corrected} = 0.110_2 = 3/4$. □

In general, if the final remainder and the dividend have opposite signs, a correction step is needed. If the dividend and divisor have the same sign, then the remainder $r_m$ is corrected by adding $D$ and the quotient is corrected by subtracting $ulp$. If the dividend and divisor have opposite signs, we subtract $D$ from $r_m$ and correct the quotient by adding $ulp$.

Another consequence of the fact that 0 is not an allowed digit in nonrestoring division is the need for a correction if a zero remainder is generated in an intermediate step. This case is illustrated in the next example.

Example 3.9
Let $X = (1.101)_2 = -3/8$ and $D = (0.110)_2 = 3/4$. The correct result of this division is $Q = -1/2$ with a zero remainder.

    r0 = X             1.101
    2r0               11.010      set q1 = $\bar{1}$
    Add D           + 00.110
    r1                00.000      zero remainder
    2r1               00.000      set q2 = 1
    Add -D          + 11.010
    r2                11.010
    2r2               10.100      set q3 = $\bar{1}$
    Add D           + 00.110
    r3                11.010

Note that although the final remainder $r_3$ and the dividend $X$ have the same sign, a correction step is needed, since the quotient we get is $Q = 0.\bar{1}1\bar{1} = 0.\bar{1}01_2 = -3/8$ instead of $-1/2$. We must therefore detect the occurrence of a zero intermediate remainder and correct the final remainder (to obtain a zero remainder):

$$r_3(\text{corrected}) = r_3 + D = 1.010 + 0.110 = 0.000$$

We then have to correct the quotient $Q = 0.\bar{1}1\bar{1} = 0.\bar{1}01$ by subtracting $ulp$, yielding $Q_{corrected} = 0.\bar{1}00_2 = -1/2$. □

3.3.1 Generating a Two's Complement Quotient

The nonrestoring division, as previously presented, generates a quotient that uses the digits 1 and $\bar{1}$ and might therefore be incompatible with the representation used for the dividend and divisor. If $X$ and $D$ are represented in two's complement, then there is a need for a conversion from the above representation to two's complement. We may, in principle, use one of the algorithms presented in Section 2.4 for converting an SD number to its two's complement representation. These algorithms, however, require that all the digits of the quotient be known before the conversion can be performed, thus increasing the total execution time of the divide operation. We prefer therefore to employ an algorithm that performs the conversion on the fly, as the digits of the quotient become available, in a serial fashion from the most to the least significant digit. Such an on-the-fly conversion algorithm from SD to two's complement representation has been presented in [3]. We can, however, take advantage of the fact that the quotient digit in the nonrestoring division can assume only the values 1 and $\bar{1}$ (i.e., $q_i \ne 0$) and derive a simpler algorithm that requires a less complex circuit for its implementation.

Since the quotient digit can assume only two values, a single bit is sufficient for representing it, and we may assign the digits 0 and 1 to the values $\bar{1}$ and 1, respectively. Let the resulting binary number be denoted by $(0.p_1 \cdots p_m)$ where $p_i = \frac{1}{2}(q_i + 1)$. This number can be converted to two's complement using the following algorithm:

Step 1: Shift the given number one bit position to the left.
Step 2: Complement the most significant bit.
Step 3: Shift a 1 into the least significant position.

The result of this algorithm is the sequence $(1 - p_1).p_2 p_3 \cdots p_m 1$. We will now prove that the above sequence, when interpreted as a number in two's complement, has the same numerical value as the original quotient $Q$. The value of the above sequence in two's complement is

$$-(1 - p_1) 2^0 + \sum_{i=2}^{m} p_i 2^{-i+1} + 2^{-m}. \tag{3.18}$$

Substituting $p_i = \frac{1}{2}(q_i + 1)$ yields

$$q_1 2^{-1} - 2^{-1} + \sum_{i=2}^{m} (q_i + 1) 2^{-i} + 2^{-m} = q_1 2^{-1} - (2^{-1} - 2^{-m}) + \sum_{i=2}^{m} q_i 2^{-i} + \sum_{i=2}^{m} 2^{-i}.$$
The last term equals $(2^{-1} - 2^{-m})$ and therefore the value of the sequence is

$$q_1 2^{-1} + \sum_{i=2}^{m} q_i 2^{-i} = \sum_{i=1}^{m} q_i 2^{-i} = Q.$$

The above conversion algorithm can be executed in a bit-serial fashion; that is, we can generate the appropriate bit of the quotient, when represented in two's complement, at each step of the nonrestoring division. For example, in the last division with $X = 1.101$ and $D = 0.110$, instead of generating the quotient bits $.\bar{1}1\bar{1}$, we can generate the bits $(1 - 0).101 = 1.101$. After the correction step we obtain $(Q - ulp) = 1.100$, which is the correct representation of $-1/2$ in two's complement. The same on-the-fly conversion algorithm can be derived from the general SD to two's complement conversion algorithm presented in Section 2.4. This is left as an exercise for the reader.

3.4 SQUARE ROOT EXTRACTION

The conventional "completing the square" method for square root extraction is conceptually similar to the restoring division scheme. Let the given radicand $X$ be a positive fraction, and let $Q = (0.q_1 q_2 \cdots q_m)$ denote its square root. The bits of $Q$ are generated in $m$ steps, one bit per step. We use the notation

$$Q_i = \sum_{k=1}^{i} q_k 2^{-k}$$

for the partially developed root at step $i$. Thus, $Q_m = Q$. We also denote the remainder in step $i$ by $r_i$. The next remainder, in general, is calculated from

$$r_i = 2r_{i-1} - q_i \cdot (2Q_{i-1} + q_i 2^{-i}). \tag{3.19}$$

Comparing the above equation to Equation (3.11) suggests that the square root extraction can be viewed as division with a changing divisor, i.e., $D_i = (2Q_{i-1} + q_i 2^{-i})$. In the first step the remainder is the radicand $X$ and $Q_0 = 0$. The performed calculation is therefore

$$r_1 = 2r_0 - q_1(0 + q_1 2^{-1}) = 2X - q_1(0 + q_1 2^{-1}). \tag{3.20}$$

To determine the square root digit $q_i$ in the restoring scheme, a tentative remainder, $2r_{i-1} - (2Q_{i-1} + 2^{-i})$, is calculated. Note that the term $(2Q_{i-1} + 2^{-i})$ is equal to $(q_1.q_2 \cdots q_{i-1}01)$ and is very simple to calculate. If the above tentative remainder is positive, we store its value in $r_i$ and set $q_i$ equal to 1. Otherwise, we set $r_i = 2r_{i-1}$ and $q_i = 0$.

To prove that the above procedure yields the required square root, we repeatedly substitute Equation (3.19) in the expression for $r_m$, obtaining

$$\begin{aligned}
r_m &= 2r_{m-1} - q_m(2Q_{m-1} + q_m 2^{-m}) \\
&= 2^2 r_{m-2} - 2q_{m-1}(2Q_{m-2} + q_{m-1} 2^{-(m-1)}) - q_m(2Q_{m-1} + q_m 2^{-m}) \\
&= \cdots \\
&= 2^m r_0 - 2^m \left[ (q_1 2^{-1})^2 + (q_2 2^{-2})^2 + \cdots + (q_m 2^{-m})^2 \right] - 2^m \left[ 2 q_2 2^{-2} q_1 2^{-1} + \cdots + 2 q_m 2^{-m} \sum_{j=1}^{m-1} q_j 2^{-j} \right] \\
&= 2^m X - 2^m \left( \sum_{i=1}^{m} q_i 2^{-i} \right)^2 = 2^m (X - Q^2).
\end{aligned}$$

Dividing by $2^m$ results in the expected relation with $r_m 2^{-m}$ as the final remainder.

Example 3.10
Let $X = 0.1011_2 = 11/16 = 176/256$. Its square root is

    r0 = X                0.1011
    2r0                  01.0110
    -(0 + 2^-1)        - 00.1000
    r1                   00.1110      set q1 = 1, Q1 = 0.1
    2r1                  01.1100
    -(2Q1 + 2^-2)      - 01.0100
    r2                   00.1000      set q2 = 1, Q2 = 0.11
    2r2                  01.0000      is smaller than (2Q2 + 2^-3) = 1.101
    r3 = 2r2             01.0000      set q3 = 0, Q3 = 0.110
    2r3                  10.0000      still a positive number
    -(2Q3 + 2^-4)      - 01.1001
    r4                   00.0111      set q4 = 1, Q4 = 0.1101

Finally, $Q = 0.1101_2 = 13/16$ and the final remainder is $2^{-4} r_4 = 7/256 = X - Q^2 = (176 - 169)/256$. □

The above procedure is similar to the restoring division algorithm. A method similar to the nonrestoring division algorithm can be employed with the following selection rule for $q_i$:
$$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge 0 \\ \bar{1} & \text{if } 2r_{i-1} < 0 \end{cases} \tag{3.21}$$

This algorithm is illustrated in the next example.

Example 3.11
Let $X = 0.011001_2 = 25/64$.

    r0 = X                 0.011001
    2r0                    0.110010      set q1 = 1, Q1 = 0.1
    -(0 + 2^-1)          - 0.100000
    r1                     0.010010
    2r1                    0.100100      set q2 = 1, Q2 = 0.11
    -(2Q1 + 2^-2)        - 01.010000
    r2                    11.010100
    2r2                   10.101000      set q3 = $\bar{1}$, Q3 = $0.11\bar{1}$
    +(2Q2 - 2^-3)        + 01.011000
    r3                    00.000000

The square root is $Q = 0.11\bar{1} = 0.101_2 = 5/8$. □

The digits of the square root $Q$ can be converted to two's complement representation by the same method used for the quotient in the nonrestoring division algorithm.

Faster algorithms for square root extraction have been developed and implemented. Some of them are introduced in Chapter 7.

3.5 EXERCISES

3.1. Given the following three pairs of binary multiplicand and multiplier: (i) +.1001 and -.0101 (ii) -.1001 and +.0101 (iii) -.1001 and -.0101.
(a) Represent the numbers in the two's complement form and multiply them. Check your results.
(b) Repeat (a) for the one's complement form.

3.2. Can the sequential multiplication algorithm be modified so that the multiplier bits are examined starting with the most significant bit? What might be a major disadvantage to the modified algorithm?

3.3. Multiply the binary SD numbers $A = 10\bar{1}01$ (the multiplicand) and $X = 01101$ (the multiplier). Perform all intermediate steps in SD arithmetic.

3.4. Given the following three pairs of binary dividend and divisor: (i) +.1010 and -.1101 (ii) -.1010 and +.1101 (iii) -.1010 and -.1101. Represent the numbers in the two's complement form and perform the division by the nonrestoring method. The quotient should also be represented in two's complement.

3.5. Devise an algorithm for dividing numbers in one's complement representation. Illustrate your algorithm using the three pairs of numbers in problem 3.4.

3.6. Write the rules for nonrestoring division for decimal fractions. Illustrate the procedure using a positive dividend and positive and negative divisors.

3.7. Explain the need for a correction step in the nonrestoring division if a zero remainder is encountered.

3.8. Show that if the quotient bits $q_i$ ($i = 1, 2, \ldots, m$) in the nonrestoring division are set according to the rule

$$q_i = \begin{cases} 1 & \text{if the signs of the remainder and divisor agree} \\ 0 & \text{if the signs of the remainder and divisor differ} \end{cases}$$

and subtraction (addition) is performed when $q_i = 1$ ($q_i = 0$), then the correction term $(1 + 2^{-m})$ has to be added to $q_1.q_2 \cdots q_m$ to obtain a quotient represented in two's complement.

3.9. Can the algorithm for converting the quotient bits generated in nonrestoring division into two's complement representation be modified for converting binary SD numbers to two's complement? Explain.

3.10. To speed up the nonrestoring division, it has been suggested to allow 0 to be a quotient bit for which no add/subtract operation is needed in order to calculate a new remainder. The modified selection rule is

$$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge D \\ \bar{1} & \text{if } 2r_{i-1} < -D \\ 0 & \text{otherwise} \end{cases}$$

Apply this new algorithm to calculate the quotient of the dividend $X = 0.101$ and the divisor $D = 0.110$. Would you recommend the use of this new algorithm? Explain.

3.11. The on-the-fly conversion algorithm in subsection 3.3.1 is a special case of the SD-to-two's complement conversion algorithm presented in Section 2.4. Since just the values 1 and $\bar{1}$ are allowed we need only use the last four rows in Table 2.4.
The resulting table for converting the quotient $0.p_1 p_2 \cdots p_m$ to its two's complement equivalent $z_0.z_1 z_2 \cdots z_m$ is shown below, with the indices changed to match the different indexing used here.

    p_i   c_i   z_i   c_{i-1}
     1     0     1      0
     1     1     0      0
     0     0     1      1
     0     1     0      1
Show that $z_1 z_2 \cdots z_m = p_2 \cdots p_m 1$. Also show that based on the first two rows in Table 2.4 $z_0 = 1 - p_1$.

3.12. Find the square root of 0.011111 using the nonrestoring algorithm.

3.6 REFERENCES

[1] J. J. F. Cavanagh, Digital computer arithmetic: Design and implementation, McGraw-Hill, New York, 1984.
[2] Y. Chu, Computer organization and microprogramming, Prentice Hall, Englewood Cliffs, NJ, 1972, chap. 5.
[3] M. D. Ercegovac and T. Lang, "On-the-fly conversion of redundant into conventional representations," IEEE Trans. on Computers, C-36 (July 1987), 895-897.
[4] K. Hwang, Computer arithmetic: Principles, architecture, and design, Wiley, New York, 1978.
[5] O. L. MacSorley, "High-speed arithmetic in binary computers," Proc. of IRE, 49 (Jan. 1961), 67-91.
[6] G. W. Reitwiesner, "Binary arithmetic," in Advances in computers, vol. 1, F. L. Alt (Editor), Academic, New York, 1960, pp. 231-308.
[7] J. E. Robertson, "A new class of digital division methods," IRE Trans. on Electronic Computers, EC-7 (Sept. 1958), 218-222.
[8] N. R. Scott, Computer number systems and arithmetic, Prentice Hall, Englewood Cliffs, NJ, 1985.
[9] C. Tung, "Arithmetic," in Computer science, A. F. Cardenas et al. (Eds.), Wiley-Interscience, New York, 1972, chap. 3.
[10] S. Waser and M. J. Flynn, Introduction to arithmetic for digital systems designers, Holt, Rinehart, Winston, New York, 1982.

4 BINARY FLOATING-POINT NUMBERS

4.1 PRELIMINARIES

To obtain a dynamic range of representable real numbers without having to scale the operands, we use floating-point numbers instead of fixed-point ones.
The representation of floating-point numbers is similar to the commonly used scientific notation and consists of two parts, the significand (or mantissa) $M$ and the exponent (or characteristic) $E$. The floating-point number $F$ represented by the pair $(M, E)$ has the value

$$F = M \cdot \beta^E$$

where $\beta$ is the base of the exponent. This base is common to all floating-point numbers in a given system. It is therefore not included in the representation of a floating-point number, but is rather implied. Thus, the $n$ bits that represent a floating-point number are partitioned into two parts, one holding the significand $M$ and the other the exponent $E$.

The range of representable floating-point numbers is larger than that of fixed-point representation, but the precision is smaller. The total number of different values (representable in $n$ bits) is still $2^n$, and since the range between the smallest and the largest representable values increases, the distance between any two consecutive values must increase as well. Floating-point numbers are thus sparser than fixed-point numbers, resulting in a lower precision. Any real number whose value lies between two consecutive floating-point numbers is mapped onto one
of these two numbers. Therefore, a larger distance between the two consecutive numbers results in a lower precision of representation. A more detailed discussion on the precision of representation appears in Section 4.3.

The significand $M$ and the exponent $E$ of a floating-point number $F$ are both signed quantities. The exponent is usually a signed integer, while the significand is usually represented in one of two ways: one as a pure fraction, and the other as a number in the range $[1, 2)$ (for $\beta = 2$). In addition, different schemes for representing negative values can be employed for each of the two parts of the floating-point number. Until 1980 there was no standard for floating-point numbers and almost every computer system had its own representation method. This made the transportation of scientific programs and data between two different machines very difficult. At that time the IEEE standard 754 [9] was formulated; it is used in most floating-point arithmetic units designed in recent years. The details of this standard are presented in Section 4.4.

Although only very few computer systems still use their own floating-point format rather than the IEEE standard format, it is important to understand some of the prior formats. These prior formats greatly influenced the decisions made by the IEEE floating-point standard committee and studying them allows a better understanding of the IEEE standard format. These formats differ in the partitioning of the $n$ bits between the significand and exponent fields, in the representation method used for each of the two parts, and in the value of the base $\beta$. In what follows we will consider only a few prior formats; the others can be analyzed in a similar manner.

We start with the significand field and examine the common case where the significand is a signed-magnitude fraction. The floating-point format in such a case consists of a sign bit $S$, $e$ bits of an exponent $E$, and $m$ bits of an unsigned fraction $M$, satisfying $m + e + 1 = n$, as shown below:

    | S | Exponent E | Unsigned Significand M |

The value of such a floating-point number $(S, E, M)$ is given by

$$F = (-1)^S \cdot M \cdot \beta^E \tag{4.1}$$

since $(-1)^0 = 1$ and $(-1)^1 = -1$. The maximal value of the fractional significand, denoted by $M_{max}$, equals $M_{max} = 1 - ulp$, where $ulp$ is the weight of the least-significant bit of the fractional significand. Usually, but, as we will see later, not always, $ulp = 2^{-m}$.

Next we discuss the selection of a value for the implied base $\beta$ of the exponent. For practical purposes, the base $\beta$ is restricted to an integer power of the radix $r = 2$. In other words, $\beta = 2^k$ where $k = 1, 2, \ldots$. The reason for this is that it provides a simple method of decreasing the significand and increasing the exponent (and vice versa) at the same time, so that the value of the floating-point number remains unchanged. Whenever an arithmetic operation results in a significand larger than the maximum allowed value of $M_{max} = 1 - ulp$, we must reduce the significand to be in the allowable range. We have to simultaneously increase the exponent so that the value of the floating-point number stays the same. The smallest increase in $E$, being an integer, is by 1. Therefore, we should use the following relation when the need to reduce the significand arises:

$$M \cdot \beta^E = (M/\beta) \cdot \beta^{E+1}$$

The divide operation in $M/\beta$ turns into a simple arithmetic shift right operation if $\beta$ is an integral power of the radix. If $\beta = r = 2$, then shifting the significand to the right by a single position must be compensated by adding 1 to the exponent.

Example 4.1
Suppose that an arithmetic operation yields the result $01.10100 \cdot 2^{100}$, which has a significand larger than $M_{max}$. We should reduce the significand by shifting it one position to the right and increase the exponent by 1, yielding $0.11010 \cdot 2^{101}$.

If $\beta = 2^k$, then changing the exponent by 1 is equivalent to shifting the significand by $k$ positions. Consequently, only $k$-position shifts are allowed; e.g., for $\beta = 4 = 2^2$, $01.10100 \cdot 4^{010} = 0.01101 \cdot 4^{011}$. □

In general, the representation of numbers in a floating-point form is not unique. For example, $0.11010 \cdot 2^{101} = 0.01101 \cdot 2^{110}$. The same value can also be represented with an exponent of 111, but if the significand field is only 5 bits long, the resulting significand would be 0.00110, so we would lose a significant digit. Out of all possible representations we therefore prefer the one with no leading zeros, allowing us to retain the maximum number of significant digits. We call this the normalized form. The normalized form also simplifies the comparison between floating-point numbers. A larger exponent indicates a larger overall number and so the significands have to be compared only for equal exponents.

Notice that, since in the case of $\beta = 2^k$ the significand can be shifted only by $k$ (or a multiple of $k$) positions, the significand is considered normalized if there is a nonzero bit in the first $k$ positions. For example, the normalized form of the number $0.00000110 \cdot 16^{101}$ is $0.01100000 \cdot 16^{100}$, and we could not eliminate the single leading 0.

If the fraction is normalized, the range of the significand is smaller than $[0, 1 - ulp]$. The smallest and largest allowable values are instead

$$M_{min} = \frac{1}{\beta} \quad \text{and} \quad M_{max} = 1 - ulp$$
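The normalization rule described above, namely shifting the significand by $k$ positions for every unit change of the exponent when $\beta = 2^k$, can be sketched as follows. The sketch is an illustration under stated assumptions: the significand is a positive exact rational with unlimited length (so no digits are lost on a right shift), and the function name `normalize` and its parameters are hypothetical.

```python
from fractions import Fraction

def normalize(m, e, k=1):
    """Return (M, E) with the same value M * beta**E and 1/beta <= M < 1,
    where beta = 2**k. Assumes m > 0 (unsigned, nonzero significand)."""
    beta = 2 ** k
    m = Fraction(m)
    if m <= 0:
        raise ValueError("only a positive significand can be normalized")
    while m >= 1:                         # significand too large:
        m /= beta                         #   shift right by k positions
        e += 1                            #   and add 1 to the exponent
    while m < Fraction(1, beta):          # no nonzero bit in the first k positions:
        m *= beta                         #   shift left by k positions
        e -= 1                            #   and subtract 1 from the exponent
    return m, e

if __name__ == "__main__":
    # 01.10100 * 2^4 with beta = 2, as in Example 4.1.
    print(normalize(Fraction(13, 8), 4, k=1))
    # 0.00000110 * 16^5 with beta = 16.
    print(normalize(Fraction(6, 256), 5, k=4))
```

The first call returns the significand 13/16 (binary 0.11010) with exponent 5, and the second returns 3/8 (binary 0.0110) with exponent 4; in the second case one leading 0 remains because only 4-position shifts are possible for $\beta = 16$.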
The range of normalized fractions does not include the value zero. Hence, a special representation is needed for zero. A possible representation for zero consists of $M = 0$ and any exponent $E$. Usually, $E = 0$ is preferred, since the representation of zero in floating-point is then identical to its representation in fixed-point, simplifying the execution of a test-for-zero instruction.

Finally, we discuss the way exponents are represented. The most common representation is as a biased exponent, according to which

$$E = E^{true} + bias$$

where the bias is a constant and $E^{true}$ is the true value of the exponent represented in two's complement. The range for $E^{true}$ using the $e$ bits of the exponent field is

$$-2^{e-1} \le E^{true} \le 2^{e-1} - 1.$$

The bias is usually selected as the magnitude of the most negative exponent; i.e., $2^{e-1}$, yielding

$$0 \le E \le 2^{e} - 1.$$

In this case, we say that the exponent is represented in the excess $2^{e-1}$ method. The advantage of this scheme is that when comparing two exponents (as is needed in add/subtract operations) we may ignore the sign bits and compare them as if they were unsigned numbers. As a result, if a floating-point format has the $S$, $E$, and $M$ components in this order, we may compare the floating-point numbers as though they were binary integers in signed-magnitude representation. Another advantage is that the smallest representable number has the same exponent as zero, namely, 0.

Example 4.2
For $e = 7$ the range of exponents in two's complement representation is $-64 \le E^{true} \le 63$, with 1000000 and 0111111 representing the values $-64$ and $63$, respectively. When adding the bias of 64, the true value $-64$ is represented by 0000000 and the true value $63$ is represented by 1111111. This representation is called the excess 64 method. □

Note that the excess $2^{e-1}$ representation can be obtained by simply inverting the sign bit of the two's complement representation; i.e., the values 0 and 1 of the sign bit indicate negative and positive numbers, respectively.

The complete range of normalized floating-point numbers includes identical subranges for positive floating-point numbers, denoted by $F^{+}$, and negative floating-point numbers, denoted by $F^{-}$. The range of positive floating-point numbers is

$$M_{min} \cdot \beta^{E_{min}} \le F^{+} \le M_{max} \cdot \beta^{E_{max}}$$

where $E_{min}$ is the smallest exponent and $E_{max}$ is the largest. An identical range exists for negative numbers. Whenever the exponent of the result of some arithmetic operation is larger than $E_{max}$, an exponent overflow indication should be generated. Similarly, an exponent smaller than $E_{min}$ should generate an exponent underflow indication. These flags make the programmer aware of the situation and allow him/her to take the necessary steps. Since the significand is always kept normalized, any overflow will be reflected through the exponent. In some machines, when an exponent overflow occurs, a special representation of infinity is used for the result. Two other possibilities are stopping the computation and interrupting the processor, or, the least recommended, setting the result to the largest representable number.

If an exponent underflow occurs, the representation of zero is used for the result in some cases, but the proper exponent underflow flag is still raised. Setting the result to zero allows the computations to proceed, if appropriate, without any interruptions. A detailed discussion of the way such exceptions are handled in the IEEE standard for floating-point numbers appears in Section 4.8.

The complete range of floating-point numbers is shown in the following schematic diagram. Notice that zero is not included in the range of either $F^{+}$ or $F^{-}$.

[Diagram: number line from $-\infty$ to $+\infty$ with 0 at the center; the regions beyond the largest representable magnitudes on both sides are marked "Exponent Overflow".]

Example 4.3
The short floating-point format in the IBM 370 system consists of 32 bits partitioned as follows:

  S (1 bit)  |  E (7 bits, excess 64 exponent)  |  M (24 bits, unsigned fractional significand)

The base of these floating-point numbers is $\beta = 16$, and hence $F = (-1)^{S} \cdot M \cdot 16^{E-64}$. Here, $E_{min}$ is represented by 0000000 and has the value $-64$, while $E_{max}$, represented by 1111111, has the value $+63$. Since $\beta = 16$ it is convenient to consider the significand as consisting of six hexadecimal digits. The normalized significand therefore satisfies

$$16^{-1} = M_{min} \le M \le M_{max} = 1 - 16^{-6} = 1 - 2^{-24}.$$

Consequently, $F^{+}_{max} = (1 - 16^{-6}) \cdot 16^{63} \approx 7.23 \cdot 10^{75}$, and $F^{+}_{min} = 16^{-1} \cdot 16^{-64} \approx 5.4 \cdot 10^{-79}$.
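The excess 64 exponent and the six-digit hexadecimal significand of the short IBM format can be checked with a small decoder. This is only a minimal sketch under stated assumptions: the function name `decode_ibm_short` is hypothetical, the word is taken to be a 32-bit unsigned integer laid out as S, E, M from the most significant bit down, and the result is returned as an ordinary Python float.

```python
import math

def decode_ibm_short(word: int) -> float:
    """Decode a 32-bit word in the short IBM format: F = (-1)^S * M * 16^(E - 64)."""
    sign = (word >> 31) & 0x1           # S: 1 bit
    biased_exp = (word >> 24) & 0x7F    # E: 7 bits, excess 64
    fraction = word & 0xFFFFFF          # M: 24 bits = six hexadecimal digits
    significand = fraction / 16 ** 6    # read the digits as the fraction 0.hhhhhh (base 16)
    # 16^k equals 2^(4k), so ldexp applies the exponent exactly.
    magnitude = math.ldexp(significand, 4 * (biased_exp - 64))
    return -magnitude if sign else magnitude

print(decode_ibm_short(0xC1200000))  # sign 1, E = 0x41, M = 0.2 (hex)
```

Calling it on 0x7FFFFFFF and 0x00100000 gives the largest and the smallest normalized positive values of the format, which can be compared with the bounds derived in Example 4.3.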
For example, let $(S, E, M) = (C1200000)_{16}$ be a floating-point number in the short IBM format. Then the first byte consists of $(11000001)_{2}$. The sign bit is $S = 1$; i.e., the number is negative. The exponent is $41_{16}$ and, with a bias of $64_{10} = 40_{16}$, $E^{true}$ is $(41 - 40)_{16} = 1$. Finally, $M = (0.2)_{16}$; hence $F = (-0.0010)_{2} \cdot 16^{1} = (-2)_{10}$.

The resolution of this floating-point representation, defined as the distance between two consecutive significands, is equal to the weight of the least-significant bit of the significand. Thus, the resolution of this representation is $ulp = 16^{-6} = 2^{-24} \approx 0.6 \cdot 10^{-7}$. We say that the short format has approximately seven significant decimal digits. Should a higher precision be desired, the IBM system provides a long floating-point format, in which the significand is extended by adding to it a second 32-bit word. This format is

  S (1 bit)  |  E (7 bits, excess 64 exponent)  |  M (56 bits, unsigned fractional significand)

The range is roughly the same, but the resolution is now $ulp = 16^{-14} = 2^{-56} \approx 10^{-17}$; i.e., 17, instead of 7, significant decimal digits. □

Table 4.1 compares the floating-point format in IBM computers to those used in DEC/VAX and CDC/Cyber 70 computers. The hidden bit in the DEC/VAX is a scheme to increase the number of significant bits in the significand and thus increase the precision. For a base $\beta$ of 2 the normalized significand will always have a leading 1. This bit can be eliminated, allowing the inclusion of an extra bit. As a result, the resolution becomes $ulp = 2^{-24}$ instead of $2^{-23}$. The value of a floating-point number $(S, f, E)$ in the short DEC format is therefore

$$F = (-1)^{S} \cdot 0.1f \cdot 2^{E-128}$$

where $f$ is the pattern of 23 bits in the significand field. In this case, a zero significand field ($f = 0$) represents the fraction $0.10_{2} = 1/2$. Consider now the case where $f = 0$ and $E = 0$. With a hidden bit, this may represent the value $0.1_{2} \cdot 2^{0-128} = 2^{-129}$. However, the floating-point number $f = E = 0$ is still expected to represent 0, a representation which does not use a hidden bit. We clearly cannot allow $f = E = 0$ to represent two different values. To prevent this, we must restrict the use of the exponent $E = 0$ and reserve it for representing the value zero only. Consequently, the smallest exponent allowed for nonzero numbers is $E = 1$. Therefore, the smallest positive number in the DEC/VAX system is $F^{+}_{min} = 0.1_{2} \cdot 2^{1-128} = 2^{-128}$. The largest positive number is $F^{+}_{max} = (1 - 2^{-24}) \cdot 2^{255-128} = (1 - 2^{-24}) \cdot 2^{127}$.

TABLE 4.1 The floating-point formats of three machines.

                              IBM/370                   DEC/VAX                   Cyber 70
  Word length (double)        32 (64) bits              32 (64) bits              60 bits
  Significand + {hidden bit}  24 (56) bits              23 + 1 (55 + 1) bits      48 bits
  Exponent                    7 bits                    8 bits                    11 bits
  Bias                        64                        128                       1024
  Base                        16                        2                         2
  Range of M                  1/16 <= M < 1             1/2 <= M < 1              1 <= M < 2
  Representation of M         Signed-magnitude          Signed-magnitude          One's complement
  Approximate range           16^63 ≈ 7·10^75           2^127 ≈ 1.7·10^38         2^1023 ≈ 10^307
  Approximate resolution      2^-24 ≈ 10^-7 (10^-17)    2^-24 ≈ 10^-7 (10^-17)    2^-48 ≈ 10^-14

4.2 FLOATING-POINT OPERATIONS

The way floating-point operations are executed depends on the specific format used for representing the operands. In what follows we will assume that the significands are normalized fractions in signed-magnitude representation and that the exponents are biased. Given two numbers, $F_1 = (-1)^{S_1} \cdot M_1 \cdot \beta^{E_1 - bias}$ and $F_2 = (-1)^{S_2} \cdot M_2 \cdot \beta^{E_2 - bias}$, we need to calculate the result of a basic arithmetic operation yielding $F_3 = (-1)^{S_3} \cdot M_3 \cdot \beta^{E_3 - bias}$. We start with multiplication and division, since these are easier to follow than addition and subtraction, which will be described later on.

Multiplication. The significands of the two operands are to be multiplied as if they were fixed-point numbers. The exponents of the operands are to be added. These two operations can be done in parallel. The sign $S_3$ is positive if the signs $S_1$ and $S_2$ are equal and is negative otherwise. When adding the two exponents $E_1 = E_1^{true} + bias$ and $E_2 = E_2^{true} + bias$, the bias should be subtracted once to obtain the correct exponent. For $bias = 2^{e-1}$ (which in binary is represented as 100...0), subtracting the bias is equivalent to adding the bias and is accomplished by complementing the sign bit. If the resulting exponent $E_3$ is larger than $E_{max}$, an overflow indication must be generated. If the exponent $E_3$ is negative and is smaller than $E_{min}$, then an underflow indication must be generated.

When multiplying the significands we have to make sure that $M_3$ is a normalized significand. Since each operand's significand satisfies $1/\beta \le M_i < 1$ $(i = 1, 2)$, the product of the two significands satisfies

$$1/\beta^{2} \le M_1 \cdot M_2 < 1.$$

Consequently, we may need to shift the significand one position to the left in order to normalize it. This is achieved by performing one base-$\beta$ left shift operation; i.e., $k$ base-2 shifts for $\beta = 2^{k}$, at the same time reducing the exponent by 1. This is called the postnormalization step. After this step is executed, the exponent may become smaller than $E_{min}$, and an exponent underflow indication should be generated.
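The multiplication procedure described above can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not a hardware description: the function name `fp_multiply`, the tuple layout (sign, biased exponent, integer significand) and the default parameters (those of the short IBM format) are all hypothetical choices, operands are assumed to be normalized and nonzero, and the double-length product is simply truncated because rounding is not covered at this point in the text.

```python
def fp_multiply(a, b, beta=16, digits=6, bias=64, exp_bits=7):
    """Multiply two normalized operands given as (sign, biased exponent, significand).

    The significand is an integer holding `digits` base-`beta` digits, i.e. the
    fraction M = significand / beta**digits with 1/beta <= M < 1.
    Returns ((sign, biased exponent, significand), flag) where flag is
    None, "overflow" or "underflow".
    """
    (s1, e1, m1), (s2, e2, m2) = a, b
    scale = beta ** digits
    s3 = s1 ^ s2                   # positive only if both signs are equal
    e3 = e1 + e2 - bias            # both exponents carry the bias; subtract it once
    product = m1 * m2              # double-length product, value = product / scale**2
    if product < scale * scale // beta:
        product *= beta            # postnormalization: one base-beta left shift
        e3 -= 1
    m3 = product // scale          # truncate to `digits` digits (assumed; no rounding)
    if e3 > 2 ** exp_bits - 1:
        return (s3, e3, m3), "overflow"
    if e3 < 0:
        return (s3, e3, m3), "underflow"
    return (s3, e3, m3), None

# (-2) * 4 in the short IBM format: (1, 0x41, 0x200000) times (0, 0x41, 0x400000)
print(fp_multiply((1, 0x41, 0x200000), (0, 0x41, 0x400000)))
```

In this usage example the product of the two significands falls below $1/\beta$, so the sketch performs the single postnormalization shift and decrements the exponent, giving the encoding of $-8$.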
Division. The significands of the two operands are to be divided and the exponents subtracted. Here, we have to add the bias to the difference $E_1 - E_2$. If the resulting exponent is out of the range, an overflow or underflow indication should be generated. The resultant significand satisfies

$$1/\beta \le M_1 / M_2 < \beta.$$

Therefore, a single base-$\beta$ shift right of the significand, accompanied by an increase of one in the exponent, may be required in the postnormalization step. The exponent increase may, in turn, lead to an overflow. If the divisor is zero, an overflow occurs, and a special indication of division by zero should be generated and the quotient can be set to $\pm\infty$. If both divisor and dividend are zero, the result is undefined and in the IEEE 754 standard such a quantity has a special representation called not a number (NaN). NaN also represents uninitialized variables and the result of $0 \cdot \infty$. These will be discussed further in Section 4.4.

Remainder. Unlike fixed-point division, floating-point division does not generate a final remainder. The fixed-point remainder, denoted by $R$, is defined as $R = X - Q \cdot D$ where $X$, $Q$, and $D$ are the dividend, quotient and divisor, respectively (see Chapter 3). This remainder satisfies the inequality $|R| \le |D|$ and is a byproduct of the direct division algorithms, like the restoring and nonrestoring algorithm. The situation is different in floating-point division. The floating-point remainder, denoted by $F_1\ REM\ F_2$, is defined as $F_1 - F_2 \cdot Int(F_1/F_2)$, where $Int(F_1/F_2)$ is the quotient $F_1/F_2$ converted to an integer. The conversion of the quotient to an integer can be performed either through truncation (i.e., removing the fractional part) or through rounding-to-nearest. The IEEE standard uses the round-to-nearest-even mode which is defined in Section 4.5. In this case, the following inequality is satisfied:

$$|F_1\ REM\ F_2| \le |F_2| / 2.$$

Careful examination of the expression for $F_1\ REM\ F_2$ reveals the higher complexity involved in calculating the floating-point remainder compared to that of the fixed-point remainder. An algorithm for floating-point division will generate a quotient represented as a floating-point number and will not generate the integer $Int(F_1/F_2)$, which can be as large as $\beta^{E_{max} - E_{min}}$. Therefore, we must calculate the floating-point remainder separately so that we can perform this time-consuming calculation only when it is required. The floating-point remainder is needed, for example, when performing an argument reduction for periodic functions like the trigonometric functions sine and cosine.

A brute-force method would be to continue the execution of a direct division algorithm, employed for calculating $M_1/M_2$, for $E_1 - E_2$ steps, even if $E_1 - E_2$ is much greater than the number of steps needed to generate the $m$ bits of the quotient's significand. In practice, this is not an acceptable solution since the execution of the floating-point remainder operation may take an arbitrary number of clock cycles. As a result, the floating-point remainder is often calculated in software rather than in hardware. An alternative solution is to define a REM-step operation, $X\ REM\ F_2$, which performs a pre-specified maximum number of divide steps, such as the number of divide steps required in a regular divide operation. Initially, $X$ is equal to $F_1$; afterwards it is made equal to the remainder of the previous REM-step operation. Such a REM-step operation can be repeated until a remainder that is smaller than $F_2/2$ is obtained [7].

Addition/Subtraction. These operations require that the exponents of both operands be equal before adding or subtracting the significands. Only when $E_1 = E_2$ can the term $\beta^{E_1}$ be factored out and the two significands $M_1$ and $M_2$ be added. To achieve this we align the significands by shifting the significand of the smaller operand to the right, increasing its exponent at the same time, until it equals the other exponent. In other words, the significand of the smaller number (i.e., the number with the smaller exponent) is shifted $|E_1 - E_2|$ base-$\beta$ positions to the right. For example, if $E_1 \ge E_2$, then

$$F_1 \pm F_2 = \left( (-1)^{S_1} \cdot M_1 \pm (-1)^{S_2} \cdot M_2 \cdot \beta^{-(E_1 - E_2)} \right) \cdot \beta^{E_1 - bias}. \qquad (4.2)$$

Note that we do not decrease the exponent of the larger number to make it equal to the other exponent, since this will result in a significand larger than 1, and a larger significand adder will be required.

If, based on the two sign bits and the originally required operation, an addition is performed, then the resultant significand, denoted by $M$ (obtained by adding the two aligned significands), is in the range

$$1/\beta \le M < 2.$$

If the significand $M$ is greater than 1, a postnormalization step is required. This consists of shifting the significand to the right to yield $M_3$ and increasing the exponent by one. At this point an exponent overflow may occur.

In summary, the following steps are required when adding or subtracting two floating-point numbers:

Step 1: Calculate the difference $d$ of the two exponents, $d = |E_1 - E_2|$.
Step 2: Shift the significand of the smaller number by $d$ base-$\beta$ positions to the right.
Step 3: Add the aligned significands and set the exponent of the result equal to the exponent of the larger operand.
Step 4: Normalize the resultant significand and adjust the exponent if necessary.
Step 5: Round the resultant significand and adjust the exponent if necessary.

If the final operation called for is subtraction, then the resultant significand satisfies

$$0 \le |M| < 1$$

and a postnormalization step is required if the resultant significand is smaller than $1/\beta$. This step consists of shifting the significand to the left and decreasing the exponent simultaneously, which may lead to an exponent underflow. In extreme cases, the postnormalization step may require a shift left operation over all bits in the significand, yielding a zero result.

Example 4.4
Let $F_1 = (0.100000)_{16} \cdot 16^{3}$ and $F_2 = (0.FFFFFF)_{16} \cdot 16^{2}$ be two numbers in the short IBM format, to be subtracted. The significand of the smaller one (i.e., $F_2$) has to be shifted to the right, resulting in the loss of the least-significant digit.

                       Significand    Exponent
  F1                   0.100000       16^3
  F2 aligned           0.0FFFFF       16^3
  F1 - F2              0.000001       16^3
  Postnormalization    0.100000       16^-2

Not only is this a time-consuming postnormalization step (shifting over five hexadecimal digits), but the final result is in error. The correct result (with an "unlimited" number of significand digits) is

                       Significand    Exponent
  F1                   0.1000000      16^3
  F2 aligned           0.0FFFFFF      16^3
  F1 - F2              0.0000001      16^3
  Postnormalization    0.1000000      16^-3

The error (also called loss of significance) is $0.1 \cdot 16^{-2} - 0.1 \cdot 16^{-3} = 0.F \cdot 16^{-3}$. A solution to this problem would be to have guard digits; i.e., additional digits to the right of the significand to hold the shifted-out digits. In the above example, a single (hexadecimal) guard digit is sufficient. The guard digit will be discussed in Section 4.5. □

A simplified block diagram of the circuitry required to perform the addition or subtraction of floating-point numbers is depicted in Figure 4.1. The block diagram shows two separate data paths, with the left one for the exponents and the right one for significands. Exp. Adder #1 computes the difference between the exponents of the two operands. The resulting exponent difference (Exp. Diff. in the figure) determines the number of positions that the significand of the smaller operand must be shifted to the right in order to be aligned with the other significand. Mux #1 is a multiplexor (also known as data selector) which routes one of its two inputs to its single output. The selection of the input significand to be routed to the Right Shifter is determined by the sign of the exponent difference. This sign also controls Mux #2, which selects the significand of the larger operand and routes it to the Significand Adder. In the next step the addition or subtraction of the now aligned significands is performed and the result stored in a register. Then, a special circuit, Leading 0s Detector, examines the leading bits of the resulting significand and determines the type of shift operation needed for the postnormalization step and the corresponding adjustment of the exponent. This adjustment is performed using Exp. Adder #2 whose second input is the exponent of the larger operand. This, in turn, is selected by Mux #3 which is again controlled by the sign of the exponent difference. Finally, the Rounder executes the rounding, if necessary, according to the rules explained in the next section.

FIGURE 4.1 Floating-point adder/subtractor. [Block diagram; its stages are labeled exponent comparison and significand alignment, significand addition-subtraction, and postnormalization and rounding.]
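The five steps and the effect of a guard digit in Example 4.4 can be reproduced with a short sketch. It is written under stated assumptions rather than as a model of the adder in Figure 4.1: the name `fp_subtract_hex` and its parameters are hypothetical, both operands are positive base-16 numbers with $F_1 \ge F_2$ and $E_1 \ge E_2$, exponents are true (unbiased) values, and Step 5 is replaced by plain truncation of the guard digits.

```python
def fp_subtract_hex(m1, e1, m2, e2, digits=6, guard=0):
    """Compute F1 - F2 for positive base-16 operands with F1 >= F2 and e1 >= e2.

    m1, m2 are integers holding `digits` hexadecimal digits (the fraction 0.hhhhhh);
    `guard` is the number of extra digits kept to the right during the operation.
    Returns (significand as a hex string, true exponent).
    """
    width = digits + guard
    a = m1 << (4 * guard)             # append guard digits (initially zero)
    b = m2 << (4 * guard)
    d = e1 - e2                       # Step 1: exponent difference
    b >>= 4 * d                       # Step 2: align; digits shifted past `width` are lost
    diff = a - b                      # Step 3: subtract the aligned significands
    exp = e1
    if diff == 0:
        return "0" * digits, 0
    while diff < 16 ** (width - 1):   # Step 4: postnormalize with left shifts
        diff <<= 4
        exp -= 1
    diff >>= 4 * guard                # drop the guard digits (truncation, assumed)
    return format(diff, f"0{digits}X"), exp

print(fp_subtract_hex(0x100000, 3, 0xFFFFFF, 2))            # no guard digit
print(fp_subtract_hex(0x100000, 3, 0xFFFFFF, 2, guard=1))   # one guard digit
```

Without a guard digit the sketch ends at exponent $-2$, the erroneous result of the first table in Example 4.4; with one guard digit it ends at exponent $-3$, matching the correct result.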
FIGURE 4.2 A two-level radix-4 shifter for 16 bits. [Diagram: sixteen input bits $x_{15}, \ldots, x_{0}$ pass through two levels of selection, controlled by shift amounts $s_1$ and $s_2$.]

Out of the functional units included in Figure 4.1 the two shifters deserve a separate discussion. The first shifter should be capable of performing right (alignment) shifts only, while the second one should be able to perform either right or left (postnormalization) shifts. More importantly, the two shifters must be capable of performing large shift operations, as large as the number of digits in the significand field. The overall performance of the floating-point add/subtract unit is highly dependent on the speed of these two shifters. Consequently, they are usually implemented as combinatorial shifters rather than shift registers, which would require a large and variable number of clock cycles to complete the shift. A combinatorial shifter generates all possible shifted patterns but only one is provided at the output according to some control bits. Since, in general, such combinatorial shifters are capable of performing circular shifts (rotates) as well, they are commonly known as barrel shifters.

A barrel shifter can be implemented as a single-level array where each input bit is directly connected to $m$ (and even more) output lines. For $m = 53$ (the number of significand bits in the IEEE double-precision format, see Section 4.4) the large number of connections (and the resulting large electrical load) make this an undesirable solution, although the overall design is conceptually simple [14]. One alternative is a two-level array. We can implement a two-level combinatorial shifter for 53 bits by having the first level shift the bits by 0, 1, 2 or 3 bit positions, and letting the second level shift the bits by multiples of 4 (i.e., 0, 4, 8, ..., 52). In this way, shifts from length 0 to 53 can be performed. We call this two-level shifter a radix-4 shifter. An example of a two-level radix-4 shifter for 16 bits is shown in Figure 4.2. In the first level of the 53-bit radix-4 shifter each bit has four destinations, while in the second level each has 14 destinations. A more balanced two-level shifter for 53 bits would be the radix-8 shifter, where the first level shifts from 0 to 7 bit positions and the second level shifts by multiples of 8 (i.e., 0, 8, 16, 24, 32, 40 and 48). Thus, each bit in the first level has 8 destinations and 7 in the second level.

4.3 CHOICE OF FLOATING-POINT REPRESENTATION

Although the IEEE standard 754 described in the next section is now commonly used in newly designed arithmetic units, it is important to understand the implications of selecting a particular format and, as a result, understand the reasons behind the adopted standard.

When designing a format for floating-point numbers we are given $n$, the total number of bits, and we have to determine the length of the significand and exponent fields, denoted by $m$ and $e$, respectively (satisfying $m + e + 1 = n$), and the value of the exponent base $\beta$. One goal that we would like to achieve is having a small representation error, which is the error made when representing a high-precision real number in a finite-length floating-point format. Let $x$ be a real number and $Fl(x)$ be its machine representation. $Fl(x) - x$ is called the absolute representation error. For every real number $x$ that is within the range of the floating-point numbers, there are two consecutive representations $F_1$ and $F_2$ satisfying $F_1 \le x \le F_2$. Thus, we have the choice of setting $Fl(x)$ equal to either $F_1$ or $F_2$. If $F_1 = M \beta^{E}$, then $F_2 = (M + ulp)\,\beta^{E}$, and the maximum absolute error is half the distance between $F_1$ and $F_2$, a distance which in turn equals $ulp \cdot \beta^{E}$. Unlike the distance between two consecutive significands ($ulp$), which was discussed in the previous section, the distance between two consecutive floating-point numbers ($ulp \cdot \beta^{E}$) is clearly not a constant but varies, becoming larger as the exponent increases, as shown in the following diagram:

[Diagram: a number line on which the spacing between consecutive floating-point numbers widens as the exponent increases.]

Usually, the absolute size of the error is less important than its relative size (compared to the original value $x$). Thus, a more important and commonly used measure for the representation error is $\delta(x) = (Fl(x) - x)/x$, which is the relative representation error. To measure the "accuracy" of the representation we may use the maximum relative representation error (MRRE), which is an upper bound of $\delta(x)$. This upper bound can be obtained as follows:

$$\delta(x) \le \frac{\frac{1}{2}\, ulp \cdot \beta^{E}}{M \beta^{E}} = \frac{\frac{1}{2}\, ulp}{M} \le \frac{\frac{1}{2}\, ulp}{1/\beta} = \frac{1}{2}\, ulp \cdot \beta. \qquad (4.3)$$

Thus, the MRRE $= \frac{1}{2}\, ulp \cdot \beta$ increases with the exponent base $\beta$ but decreases as $ulp$ decreases; that is, as the number of significand bits, $m$, increases. The MRRE would provide an acceptable measure for the accuracy of the representation if the operands in floating-point computations were more or less uniformly distributed.
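The bound of Equation (4.3) can be checked numerically with exact rational arithmetic. The sketch below makes several assumptions that are not part of the text: the helper names `mrre` and `relative_error` are hypothetical, the significand is a fraction with $m$ bits so that $ulp = 2^{-m}$, round-to-nearest is used to choose between the two neighboring representations, and the three $(\beta, m)$ pairs are simply example splits of a 32-bit word.

```python
from fractions import Fraction

def mrre(beta, m):
    """Maximum relative representation error, (1/2) * ulp * beta, with ulp = 2**-m."""
    ulp = Fraction(1, 2 ** m)
    return ulp * beta / 2

def relative_error(x, beta, m):
    """Relative error of representing x > 0 with an m-bit significand in base beta."""
    x = Fraction(x)
    base = Fraction(beta)
    exp = 0
    while x >= base ** exp:           # find exp with base**(exp-1) <= x < base**exp
        exp += 1
    while x < base ** (exp - 1):
        exp -= 1
    significand = x / base ** exp     # normalized: 1/beta <= significand < 1
    steps = round(significand * 2 ** m)          # nearest multiple of ulp
    fl = Fraction(steps, 2 ** m) * base ** exp   # machine representation Fl(x)
    return abs(fl - x) / x

for beta, m in [(2, 22), (4, 23), (16, 24)]:
    print(beta, m, mrre(beta, m), relative_error(Fraction(1, 3), beta, m) <= mrre(beta, m))
```

A value just above $1/\beta$ that lies halfway between two representations, such as $1/16 + 2^{-25}$ for $\beta = 16$ and $m = 24$, comes within a fraction of a percent of the bound, which illustrates why the smallest normalized significand determines the MRRE.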
However, as has been observed, the distribution of floating-point operands is not uniform but approaches the reciprocal distribution with the following density function [20]:

$$p(M) = \frac{1}{M \ln \beta}, \qquad 1/\beta \le M \le 1.$$

In other words, larger significands are less likely to occur than smaller significands. For example, the first digit of a decimal floating-point operand will most likely be a 1; 2 is the second most likely and so on. This nonuniform distribution is taken into account in the second measure proposed for representation error. This measure is the average relative representation error (ARRE). The maximum value of the absolute error is, as has been shown above, $\frac{1}{2}\, ulp \cdot \beta^{E}$, and since the minimum error is zero, the average absolute error is $\frac{1}{4}\, ulp \cdot \beta^{E}$. The corresponding relative representation error is therefore $\frac{1}{4}\, ulp / M$ and, consequently,

$$ARRE = \int_{1/\beta}^{1} \frac{1}{M \ln \beta} \cdot \frac{ulp}{4M}\, dM = \frac{\beta - 1}{\ln \beta} \cdot \frac{1}{4}\, ulp. \qquad (4.4)$$

Another factor to be considered when selecting a floating-point format is the size of the range. The range of the positive floating-point numbers, for example, is approximately equal to the largest positive number representable, since the smallest positive number is very close to zero. We will therefore use the expression $\beta^{E_{max}}$ for the range. Thus, to obtain a large range we should increase $\beta$ and/or the number of exponent bits $e$. Increasing the latter implies fewer bits for the significand field within the floating-point format and a higher value of $ulp$, resulting in a higher representation error. A similar effect is experienced if, instead of increasing $e$, we increase the exponent base $\beta$. Consequently, there is a tradeoff between the range and the representation error. We may consider several floating-point representations that have the same range and select the one with the smallest MRRE or ARRE. Another possibility is to consider several representations with the same MRRE (or ARRE) and select the one with the largest range. For example, for a 32-bit word $m + e$ is 31 (one bit is reserved for the sign), and we may use one of the representations shown in Table 4.2 [4]. All three have nearly the same range. The base 16 representation is inferior to the other two representations with $\beta = 2$ and $\beta = 4$ if the MRRE is selected as a measure for the representation error. Base 4 turns out to produce the lowest ARRE for this particular example. If base 2 is selected, and a hidden bit is used, then the MRRE and ARRE are reduced by a factor of 2, making this format the one with the smallest representation error.

TABLE 4.2 Range, MRRE, and ARRE of three 32-bit floating-point formats [4].

  β     e    m     Range                   MRRE                        ARRE
  2     9    22    2^(2^8 - 1) = 2^255     0.5 · 2^-22 · 2 = 2^-22     0.180 · 2^-21
  4     8    23    4^(2^7 - 1) = 2^254     0.5 · 2^-23 · 4 = 2^-22     0.135 · 2^-21
  16    7    24    16^(2^6 - 1) = 2^252    0.5 · 2^-24 · 16 = 2^-21    0.169 · 2^-21

A different goal that one might consider when deciding upon a floating-point format is the execution time of floating-point operations. Two time-consuming steps are the aligning of the significands before add/subtract operations and postnormalization in any floating-point operation. It has been observed that a larger exponent base yields a higher probability of equal exponents in add/subtract operations (in which case no alignment step is necessary) and a lower probability that a postnormalization step will be needed. Statistical analysis of a number of programs has provided the results included in Table 4.3 [19]. This table shows the percentage of cases in which no alignment shifts were needed, those in which a single alignment shift was required, and those in which a larger (two or more positions) shift was needed. Also, postnormalization was not required in 59.4% of the cases for $\beta = 2$, while it was not needed in 82.4% of the cases for $\beta = 16$. It is interesting to note that even for $\beta = 2$, no postnormalization step is necessary in most cases. The above consideration is of limited practical significance when a barrel shifter, as described in Section 4.2, is used.

TABLE 4.3 The probability of alignment shifts of different sizes [19].

  Alignment shift    β = 16    β = 2
  0                  47.3%     32.6%
  1                  26.0%     12.1%
  ≥ 2                26.7%     55.3%

4.4 THE IEEE FLOATING-POINT STANDARD

The IEEE floating-point standard defines four formats for floating-point numbers. The first two are the basic single-precision 32-bit format and the double-precision 64-bit format. The other two are the extended formats, to be used for intermediate results. The single extended format should have at least 44 bits, and the double extended format should have at least 80 bits. The extended formats have a higher precision and a higher range than the corresponding 32- and 64-bit formats.

4.4.1 Single-Precision Format

The most important objective for the 32-bit format is precision of representation; hence base 2 was selected, allowing the use of a hidden bit to further increase the precision. As a result, the suggested format is similar to the DEC format
(see Table 4.1), but there are some differences, as indicated below. In order to have a reasonable range, an exponent field of length 8 bits was selected, yielding the following format:

  S (1 bit)  |  E (8 bits, biased exponent)  |  f (23 bits, unsigned fraction)

Out of the 256 combinations of the exponent field, two are reserved for special values. $E = 0$ is reserved for zero (with fraction $f = 0$) and denormalized numbers (with fraction $f \ne 0$). $E = 255$ is reserved for $\pm\infty$ (with fraction $f = 0$) and NaN (with fraction $f \ne 0$). These special representations are further discussed below. For the remaining exponents (i.e., $1 \le E \le 254$), the value of the floating-point number is given by

$$F = (-1)^{S} \cdot 1.f \cdot 2^{E-127}. \qquad (4.5)$$

There are two differences between the IEEE single-precision format and the DEC short format. The exponent bias is 127 instead of $2^{e-1} = 2^{7} = 128$. This provides a larger maximum value of the true exponent, $254 - 127 = 127$ instead of $254 - 128 = 126$, yielding a larger range. A similar effect is achieved by using a significand of $1.f$ instead of $0.1f$, since this adds 1 to the exponent. As a result, the largest and smallest positive numbers are

$$F^{+}_{max} = (2 - 2^{-23}) \cdot 2^{254-127} = (1 - 2^{-24}) \cdot 2^{128}$$

and

$$F^{+}_{min} = 1.0 \cdot 2^{1-127} = 2^{-126}$$

compared to $F^{+}_{max} = (1 - 2^{-24}) \cdot 2^{127}$ and $F^{+}_{min} = 2^{-128}$, respectively, in the DEC format. The exponent bias and significand range were selected so as to allow the reciprocal of all normalized numbers (in particular, $F^{+}_{min}$) to be represented without overflow. This requirement is not satisfied for $F^{+}_{min}$ in the DEC format.

Finally, a few comments about the special values that can be represented in the IEEE format and which are summarized in the following table.

               f = 0    f ≠ 0
  E = 0        0        Denormalized
  E = 255      ±∞       NaN

Operations dealing with the value $\pm\infty$, which is represented by $f = 0$, $E = 255$, and $S = 0, 1$, must obey the traditional mathematical conventions such as $F + \infty = \infty$, $F/\infty = 0$, etc.

The denormalized numbers provide representations for values smaller than the smallest normalized number, lowering the probability of an exponent underflow. Denormalized numbers are represented by $E = 0$, and their value is given by

$$F = (-1)^{S} \cdot 0.f \cdot 2^{-126}. \qquad (4.6)$$

This is sometimes expressed as $(-1)^{S} \cdot 0.f \cdot 2^{1-127}$ to have the same bias as normalized numbers. Note that denormalized numbers have no hidden bit since the significands should not be normalized. Also, although the true value of the exponent should have been $0 - 127 = -127$, the value $-126$ was selected, since the smallest normalized number is $F^{+}_{min} = 1.0 \cdot 2^{-126}$. With denormalized numbers the smallest representable number is $2^{-23} \cdot 2^{-126} = 2^{-149}$ instead of $2^{-126}$. The addition of denormalized numbers has been termed gradual underflow or graceful underflow. It does not eliminate underflow, but it substantially reduces the gap between the smallest representable number and zero. This gap, of size $2^{-149}$, is equal to the distance between any two consecutive denormalized numbers and is also the distance between any two consecutive normalized numbers with the smallest possible exponent $(1 - 127) = -126$, as illustrated in the following diagram:

[Diagram: number line with marks at 0, $2^{-126}$, $2^{-125}$ and $2^{-124}$; the interval between 0 and $2^{-126}$ is labeled "Denormalized numbers".]

Denormalized numbers have not been included in all the designs of arithmetic units that follow the IEEE standard. This is mainly due to the high cost associated with their implementation, since the representation of denormalized numbers is different from that of normalized numbers, requiring a more complex design and possibly a longer overall execution time. Even designs that implement denormalized numbers allow the programmer to avoid their use if faster execution is desired.

The IEEE standard also defines a single-extended format to be employed when calculating intermediate results within the evaluation of complex functions like the transcendental and power functions. The single-extended format extends the exponent field from 8 to 11 bits and the significand field from 23 + 1 to 32 or more bits (without a hidden bit). Thus, the total length of a single-extended floating-point number is at least $1 + 11 + 32 = 44$ bits.

There are two kinds of NaN (Not a Number), the signaling (or trapping) NaN, and the quiet (or nontrapping) NaN. NaNs are represented in the single-precision format by $E = 255$ and $f \ne 0$, allowing a large number of possible values. The most significant bits of the fraction can be used to distinguish between the two kinds of NaNs. The remaining bits may contain system-dependent information. An example of a signaling NaN is an uninitialized variable. A signaling NaN sets the Invalid operation exception flag (see Section 4.8) whenever any arithmetic operation with this NaN as an operand is attempted. In contrast, a quiet NaN does not set the Invalid operation exception flag when involved in an arithmetic operation.
an arithmetic operation. A signaling NaN turns into a quiet NaN when used as an operand for an arithmetic operation if the Invalid operation trap is disabled, to avoid setting the Invalid operation flag again later on. A quiet NaN is also produced when an invalid operation such as $0 \cdot \infty$ is attempted, since this operation had already set the Invalid operation flag once. The fraction field in a quiet NaN may contain a pointer to the offending line of code. A quiet NaN, when used as an operand of an arithmetic operation, will produce the same quiet NaN as a result and will not set any exception flag. For example, NaN + 5 = NaN. If both operands of an arithmetic operation are quiet NaNs, the result will equal the NaN with the smallest significand.

4.4.2 Double-Precision Format

The main consideration for the double-precision format is range. Consequently, the exponent field is increased to 11 bits, yielding the following format:

    | S | 11 bits - biased exponent E | 52 bits - unsigned fraction f |

The extreme values of E, i.e., 0 and 2047, are reserved for the same purposes as in the single-precision format. The value of a floating-point number with an exponent E in the range $1 \le E \le 2046$ is

$$F = (-1)^S \; 1.f \cdot 2^{E-1023}. \qquad (4.7)$$

This format, as well as the single-precision format, is summarized in Table 4.4. A double-extended format is also defined in the IEEE standard. It extends the exponent field from 11 to 15 bits and the significand field from 52+1 to 64 or more bits (i.e., without a hidden bit), and consequently the total number of bits in the double-extended floating-point format is at least 1 + 15 + 64 = 80. The interested reader is referred to several articles describing the details of the IEEE floating-point standard that appear in the IEEE Computer Magazine, Vol. 14, March 1981.

                                    Single                  Double
    Word length                     32 bits                 64 bits
    Fraction + hidden bit           23 + 1 bits             52 + 1 bits
    Exponent                        8 bits                  11 bits
    Bias                            127                     1023
    Approximate range               2^128 ≈ 3.4 · 10^38     2^1024 ≈ 1.8 · 10^308
    Smallest normalized number      2^-126 ≈ 10^-38         2^-1022 ≈ 10^-308
    Approximate resolution          2^-23 ≈ 10^-7           2^-52 ≈ 10^-16

TABLE 4.4 The single and double IEEE floating-point formats.

4.5 ROUND-OFF SCHEMES

The accuracy of results obtained in a floating-point arithmetic unit is limited even if the intermediate results calculated in the arithmetic unit are accurate. The number of computed digits may exceed the total number of digits allowed by the format, and we have to dispose of the extra digits before the final results are stored in a user-accessible register or in the memory. For example, when multiplying two significands each of length m, a product of length 2m is generated and we must round it off to m digits. When selecting a round-off scheme we need to consider the following:

1. Accuracy of results (numerical considerations).
2. Cost of implementation and speed (machine considerations).

Let x and y be real numbers and let Fl be the set of machine representations in a given floating-point format. Denote by Fl(x) the machine representation of x. When rounding real numbers to machine representations the following conditions should be satisfied:

1. $Fl(x) \le Fl(y)$ whenever $x \le y$.
2. If $x \in Fl$ then Fl(x) = x.
3. If $F_1$ and $F_2$ are two consecutive numbers in Fl such that $F_1 \le x \le F_2$, then either $Fl(x) = F_1$ or $Fl(x) = F_2$.

Let d denote the number of extra digits that are kept in the arithmetic unit (in addition to the m significand digits) before rounding is performed. For convenience, let us assume that there is a radix point between the m most significant digits (of the significand) and the d extra digits. Thus, we will investigate ways to round numbers like $2.99_{10}$ and obtain an integer.

The simplest scheme is called truncation or chopping and is illustrated in Figure 4.3. We remove the d extra digits with no change in the m remaining digits. For a given $F_1 \le x \le F_2$, Trunc(x) results in rounding toward zero, yielding the smaller of $F_1$ and $F_2$. For example, the decimal number x = 2.99, when rounded toward zero, yields 2. This is a fast method that does not require any extra hardware, but its numerical performance is very poor. The error introduced by truncation can be almost as large as ulp (the weight of the least-significant digit of the significand).

FIGURE 4.3 The truncation scheme.

The curve for Trunc(x) lies entirely below the ideal line (the dotted line in Figure 4.3), which provides infinite precision. We say that truncation has a negative bias, where the bias, in general, measures the tendency of a round-off scheme to favor errors of a particular sign. Clearly, we would like to use a round-off scheme that is unbiased, or has a very small bias. To compare the bias of truncation to that of other rounding schemes quantitatively, we define the bias for a given d as the average error for a set of $2^d$ consecutive numbers, where error = Trunc(x) - x and a uniform distribution for the significands is assumed. For example, the rounding errors when truncating with d = 2 are shown in Table 4.5. In this table, X is any significand of length m. The sum of errors for all $2^d = 4$ consecutive numbers is -3/2. Therefore, the bias for d = 2, which is the average error, equals -3/8.

A more accurate scheme is the round-to-nearest scheme (commonly known as just rounding), which for $F_1 \le x \le F_2$ yields the nearer to x out of $F_1$ and $F_2$. It is obtained by adding $0.1_2$ (or, in general, adding half a ulp) and retaining only the integer part of the sum (chopping the fraction). For example, to round off the decimal number x = 2.99, we add 0.5 and chop off the fractional part of 3.49, obtaining 3. The maximum error would occur when x = 2.50. Rounding it off yields 2.50 + 0.50 = 3.00, finally resulting in 3, with an error of 0.5. Round-to-nearest is used in many arithmetic units and its curve is shown in Figure 4.4.

FIGURE 4.4 The round-to-nearest scheme.

Note that for performing the round-to-nearest scheme, a single extra digit (i.e., d = 1) is sufficient.
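The bias definition above can be checked by direct enumeration. The following Python sketch is only an illustration under simple assumptions: a significand is modeled as an integer, the d extra digits as a second integer, and the function names (`trunc`, `round_nearest`, `bias`) are hypothetical, not taken from the text.

```python
from fractions import Fraction

def trunc(sig, extra, d):
    """Truncation: drop the d extra bits, leave the significand unchanged."""
    return sig

def round_nearest(sig, extra, d):
    """Round-to-nearest: add half a ulp (a 1 in the top extra bit) and chop."""
    half_ulp = 1 << (d - 1)
    return sig + ((extra + half_ulp) >> d)

def bias(scheme, d):
    """Average of scheme(x) - x over the 2**d consecutive numbers X.00...0 to X.11...1."""
    errors = [scheme(0, extra, d) - Fraction(extra, 1 << d)
              for extra in range(1 << d)]
    return sum(errors) / len(errors)

print(bias(trunc, 2))          # -3/8, the value derived in the text for d = 2
print(bias(round_nearest, 2))  # 1/8
```

Using exact fractions rather than floats keeps the sums free of rounding effects of their own, which matters when the quantity being measured is itself a rounding bias.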
The curve of Rnd(x) is nearly symmetric with respect to the ideal line, which is a substantial improvement over truncation. However, at the point X.10 we would always round up (this is indicated in Figure 4.4 by the heavy dots), and over a long sequence of operations we may end up with a positive bias. For d = 2, the bias can be calculated from Table 4.6. The sum of errors for all $2^d = 4$ numbers is 1/2, and thus the bias is 1/8, which is smaller than the bias of the truncation scheme. The same sum of errors (i.e., 1/2) is obtained for d > 2, and consequently the bias, in general, is $\frac{1}{2} \cdot 2^{-d}$. This indicates that the positive bias is due to the rounding up of X.10...0.

To obtain an unbiased rounding we round, in case of a tie (i.e., X.10), to the one out of $F_1$ and $F_2$ whose least-significant bit is 0 (i.e., the even one). This way we would alternately round up and down. The obtained scheme is called the round-to-nearest-even scheme and is illustrated in Figure 4.5.

    Number    Trunc(x)    Error
    X.00      X.          0
    X.01      X.          -1/4
    X.10      X.          -1/2
    X.11      X.          -3/4

TABLE 4.5 The rounding errors for the truncation scheme with d = 2.

    Number    Rnd(x)      Error
    X.00      X.          0
    X.01      X.          -1/4
    X.10      X.+1        +1/2
    X.11      X.+1        +1/4

TABLE 4.6 The rounding errors for the round-to-nearest scheme with d = 2.
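The tie rule described above can be sketched in a few lines. This is a minimal illustration, assuming an integer significand and d extra bits held in a separate integer; the names `round_nearest_even` and `bias_rne` are hypothetical. Because the outcome of a tie depends on the least-significant bit, the average is taken over one even and one odd significand.

```python
from fractions import Fraction

def round_nearest_even(sig, extra, d):
    """Round to nearest; on a tie pick the neighbor whose least-significant bit is 0."""
    half = 1 << (d - 1)               # pattern 10...0 in the extra bits (a tie)
    if extra > half:
        return sig + 1                # closer to the upper neighbor
    if extra == half and (sig & 1):
        return sig + 1                # tie, odd significand: go up to the even one
    return sig                        # below half, or tie with an even significand

def bias_rne(d):
    """Average error over two groups of 2**d numbers (even and odd significand)."""
    total = Fraction(0)
    for sig in (0, 1):
        for extra in range(1 << d):
            exact = sig + Fraction(extra, 1 << d)
            total += round_nearest_even(sig, extra, d) - exact
    return total / (2 << d)

print(bias_rne(2))  # 0
```

The errors of the two groups cancel exactly, which is the sense in which the scheme is unbiased.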
FIGURE 4.5 The round-to-nearest-even scheme.

To compute the bias of this scheme for d = 2 we have to consider two groups of size $2^d = 4$, as shown in Table 4.7. The sum of errors is -1/2 for the first group and +1/2 for the second. Thus, the average bias is 0. The round-to-nearest-even scheme is mandatory in the IEEE floating-point standard.

    Number    Round(x)    Error      Number    Round(x)    Error
    X0.00     X0.         0          X1.00     X1.         0
    X0.01     X0.         -1/4       X1.01     X1.         -1/4
    X0.10     X0.         -1/2       X1.10     X1.+1       +1/2
    X0.11     X1.         +1/4       X1.11     X1.+1       +1/4

TABLE 4.7 The rounding errors for the round-to-nearest-even scheme with d = 2.

Another possible modification to the round-to-nearest scheme, which also yields an unbiased rounding, is, in case of a tie, to choose out of $F_1$ and $F_2$ the one whose least-significant bit is 1. This is known as the round-to-nearest-odd scheme.

Although the round-to-nearest schemes have a good numerical "performance," their main disadvantage is that they require a complete add operation, since the carry from the least-significant bit may propagate across the entire significand. To avoid this time-consuming carry-propagation, it has been suggested [10] to use a ROM (read-only memory), which would hold a look-up table for rounded results. For example, a ROM with l address lines would have as input the (l - 1) least significant bits (out of the m bits) of the significand, and only the most significant bit out of the d extra bits, as depicted in Figure 4.6.

FIGURE 4.6 An implementation of the ROM rounding scheme.

The ROM in this figure has $2^l$ rows of (l - 1) bits each, containing the properly rounded (l - 1)-bit result except when all (l - 1) low-order bits of the significand are 1's. In this case, the ROM returns all 1's (thus effectively truncating the result instead of rounding it) and avoids the full addition. If, for example, l = 8, the table look-up would be very fast and yet 255 out of 256 cases would be properly rounded. This rounding, for l = 3, is illustrated in Figure 4.7.

The bias of ROM rounding for l = 3 and d = 1 can be calculated from Table 4.8. The sum of errors is +1/2 for the first three groups of size $2^d = 2$ and is -1/2 for the fourth group. The average bias is therefore 1/8. In the general case, for given values of d and l, the average bias is $\frac{1}{2}\left[\left(\frac{1}{2}\right)^d - \left(\frac{1}{2}\right)^{l-1}\right]$. This bias converges to $\frac{1}{2}\left(\frac{1}{2}\right)^d$ (e.g., 1/4 for d = 1) when l is large enough, since ROM rounding converges to the round-to-nearest scheme. If the round-to-nearest-even modification is adopted, the bias of the modified ROM rounding converges to zero.

FIGURE 4.7 The ROM rounding scheme with l = 3.
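The ROM rounding scheme described in this section can be modeled in software to check the bias figures quoted for it. The sketch below is an illustration only: the table layout (address formed from the l - 1 low-order significand bits followed by the round bit) and the names `build_rom`, `rom_round`, and `rom_bias` are assumptions made for this example.

```python
from fractions import Fraction

def build_rom(l):
    """Table with 2**l rows; address = (l-1 low significand bits, round bit)."""
    all_ones = (1 << (l - 1)) - 1
    rom = []
    for addr in range(1 << l):
        low, rnd = addr >> 1, addr & 1
        if rnd == 0 or low == all_ones:
            rom.append(low)          # no rounding, or truncate to avoid the carry-out
        else:
            rom.append(low + 1)      # rounded up inside the l-1 low bits
    return rom

def rom_round(sig, extra, d, rom, l):
    """Replace the l-1 low bits of sig by the ROM output; upper bits are untouched."""
    mask = (1 << (l - 1)) - 1
    rnd = extra >> (d - 1)           # most significant of the d extra bits
    return (sig & ~mask) | rom[((sig & mask) << 1) | rnd]

def rom_bias(l, d):
    """Average error over all patterns of the l-1 low bits and the d extra bits."""
    rom = build_rom(l)
    total, count = Fraction(0), 0
    for sig in range(1 << (l - 1)):
        for extra in range(1 << d):
            total += rom_round(sig, extra, d, rom, l) - (sig + Fraction(extra, 1 << d))
            count += 1
    return total / count

print(rom_bias(3, 1))  # 1/8
```

For l = 8 and d = 1 the same function returns 63/256, which agrees with the general expression for the average bias and is already close to the 1/4 limit of plain round-to-nearest.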
    Number    ROM(x)    Error      Number    ROM(x)    Error
    X00.0     X00.      0          X10.0     X10.      0
    X00.1     X01.      +1/2       X10.1     X11.      +1/2
    X01.0     X01.      0          X11.0     X11.      0
    X01.1     X10.      +1/2       X11.1     X11.      -1/2

TABLE 4.8 The rounding errors for the ROM rounding scheme with l = 3 and d = 1.

The IEEE floating-point standard [9] includes four rounding modes. The default is the round-to-nearest-even mode. The remaining three are directed roundings: round toward zero (truncate), round toward $\infty$, and round toward $-\infty$. The round toward $\pm\infty$ modes are useful for Interval Arithmetic, in which each real number a is represented by two floating-point numbers $a_1$ and $a_2$ providing lower and upper bounds, respectively, for the real value. Thus, an operand is represented by an interval and all arithmetic operations operate on intervals [1]. For example, the add, subtract and multiply operations in interval arithmetic are defined as follows:

$$[a_1, a_2] + [b_1, b_2] = [a_1 + b_1,\; a_2 + b_2]$$
$$[a_1, a_2] - [b_1, b_2] = [a_1 - b_2,\; a_2 - b_1]$$
$$[a_1, a_2] \times [b_1, b_2] = [\min\{a_1 b_1, a_1 b_2, a_2 b_1, a_2 b_2\},\; \max\{a_1 b_1, a_1 b_2, a_2 b_1, a_2 b_2\}]$$

In these computations, the lower bound (e.g., $a_1 + b_1$ in addition) is rounded toward $-\infty$ while the upper bound ($a_2 + b_2$ in addition) is rounded toward $\infty$. The intervals calculated for the final results will provide an estimate on the accuracy of the computation.

4.6 GUARD DIGITS

When multiplying two significands, we obtain a double-length result, and clearly not all the extra digits are needed for proper rounding. A similar situation arises when adding or subtracting two numbers that do not have the same exponent. The question now is, what is the smallest number of extra digits that we need within the arithmetic unit? These extra digits are used for rounding and for postnormalization in the case when leading zeros are obtained. In what follows, we will first consider multiply/divide operations, and then add/subtract operations, since the former are simpler to handle.

As we have seen in Section 4.2, if signed-magnitude fractions are used as significands, then no extra digits for postnormalization are needed for division. A shift right operation may, however, be required. When multiplying two normalized fractional significands, at most one shift left is needed if $\beta = 2$ (a shift of k positions when $\beta = 2^k$). Therefore, one guard digit (in the radix $\beta$ number system) is sufficient for postnormalization. A second guard digit is needed for rounding if the round-to-nearest scheme is adopted. Thus, a total of two guard digits suffice. These two digits are called the G (guard) and the R (round) digits. The same conclusion is reached when the significand is a signed-magnitude number in the range [1,2), as is the case in the IEEE standard. The proof of this statement is left as an exercise for the reader.

Implementing the round-to-nearest-even scheme requires an indicator to point out whether all the additional digits that were generated in the multiply operation are zero, in order to detect a tie. This indicator can be implemented as a single bit, which is the logical OR of all additional bits, and is known as the sticky bit. Thus, three bits, namely, G, R, and S (sticky), are sufficient, even for the round-to-nearest-even scheme.

Computing the sticky bit when multiplying two significands does not require the generation of all the least significant bits of the product. The number of trailing zeros in the product of two binary significands is equal to the number of trailing zeros in the multiplier plus the number of trailing zeros in the multiplicand. Thus, the sticky bit should be set to 1 only if the expected number of trailing zeros in the double-length product is smaller than the number of the least significant product bits that are discarded. Other techniques for computing the sticky bit for the product are presented in [15].

The case of addition/subtraction of floating-point numbers is more complicated, especially when the final operation called for (after examining the sign bits) is subtraction. As before, we assume that the significands of the operands are normalized signed-magnitude fractions. Revisiting the subtract example in Section 4.2, it seems that all shifted-out digits of the subtrahend must be kept, and must participate in the subtract operation, in order to have all the necessary digits for the postnormalization step. This will require as many guard digits as the number of digits in the significand field, doubling the size of the significand adder/subtractor. However, if the significand of the subtrahend is shifted by more than one position to the right in the pre-alignment step, the resulting difference will have at most one leading zero. This implies that at most one of the shifted-out digits may be required for the postnormalization step.

Example 4.5

In the following subtract operation the significands of the two operands A and B are 12 bits long and the base is 2. Assume that the difference between the exponents is $E_A - E_B = 2$, requiring a 2-bit position shift of the subtrahend B in the pre-alignment step.
    A                    0.100000101100  00
    B aligned            0.001100000001  10
    A - B                0.010100101010  10
    Postnormalization    0.101001010101

Exactly the same result is obtained even if only one guard bit participates in the subtraction and generates the necessary borrow. □

The situation is different if the most significant shifted-out bit is equal to zero, as illustrated in the following example.

Example 4.6

Consider the same two significands as in the previous example, but now assume that the difference between the exponents is $E_A - E_B = 6$. Thus, B's significand is shifted by six positions to the right to align it with A's significand.

    A                    0.100000101100  000000
    B aligned            0.000000110000  000110
    A - B                0.011111111011  111010
    Postnormalization    0.111111110111

If only one guard bit participated in the subtraction, then the four least significant bits of the result after the postnormalization step would be 1000 instead of 0111. □

In the above subtraction, a long sequence of borrows is obtained, and it seems that we may need all the additional digits in B to guarantee that a borrow is generated. We might conclude that, in the worst case, we must double the number of digits. However, a careful analysis leads to the conclusion that it suffices to distinguish between two cases: (1) all additional bits (not including the guard bit) are zero, and (2) at least one of the additional bits is 1. To prove this, notice that all the extra digits in A are zeros (A was not preshifted); hence, the resulting three least significant bits in A - B (011 in the above example) are independent of the exact position of the 1's in the extra digits of B. The only thing we need to know is whether a 1 was shifted out or not, and the sticky bit can be used for this purpose. If a 1 is shifted into it during the alignment, it will be set to one; otherwise, it remains zero. It is therefore again the logical OR of all the extra bits of B. The sticky bit participates in the subtraction and generates the necessary borrow. Thus, using the two digits, G and S, the previous subtraction looks like

                                         G S
    A                    0.100000101100  0 0
    B aligned            0.000000110000  0 1
    A - B                0.011111111011  1 1
    Postnormalization    0.111111110111  1

The two bits, G and S, are sufficient for postnormalization. If, however, the round-to-nearest scheme is followed, an additional accurate bit is required. The sticky bit, which is only an indicator, cannot serve this purpose. Thus, three bits, namely, G, R, and S, are required, as illustrated in the next example.

Example 4.7

Consider the following two significands, which are almost the same as those in the previous example, except for one bit in B (the first shifted-out bit after the guard position, printed in bold face in the original). As before, assume that the difference between the exponents is $E_A - E_B = 6$. The correct result should be

    A                    0.100000101100  000000
    B aligned            0.000000110000  010110
    A - B                0.011111111011  101010
    Postnormalization    0.111111110111

However, if we use only the G and S bits, we obtain

                                         G S
    A                    0.100000101100  0 0
    B aligned            0.000000110000  0 1
    A - B                0.011111111011  1 1
    Postnormalization    0.111111110111  1

The round bit after the postnormalization step should be zero, and the sticky bit cannot be used for rounding. We must use the three bits, G, R, and S, as shown below:

                                         G R S
    A                    0.100000101100  0 0 0
    B aligned            0.000000110000  0 1 1
    A - B                0.011111111011  1 0 1
    Postnormalization    0.111111110111  0 1

The correct R bit, with a value of 0, is now available, and can be used in the round-to-nearest scheme. Note that if the round-to-nearest-even method is followed, the sticky bit, which is needed to detect a tie, is already available. This bit, therefore, serves two purposes. □
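The role of the G, R, and S bits in the examples above can be reproduced with integer arithmetic. The following Python sketch is a simplified model, not a description of any particular adder: significands are 12-bit integers, three extra bit positions are appended, and the helper names `align_with_grs` and `subtract_grs` are invented for this illustration.

```python
M = 12  # significand length used in Examples 4.5 to 4.7

def align_with_grs(sig, shift):
    """Shift right by `shift`, keeping guard and round bits plus a sticky bit."""
    wide = sig << 3                                # room for G, R, S
    shifted = wide >> shift
    sticky = 1 if wide & ((1 << shift) - 1) else 0 # OR of every bit shifted out
    return shifted | sticky

def subtract_grs(sig_a, sig_b, shift):
    """Return (significand, round bit, sticky bit) of A - B after postnormalization."""
    diff = (sig_a << 3) - align_with_grs(sig_b, shift)
    while diff and ((diff >> (M + 2)) & 1) == 0:   # left shift until normalized
        diff <<= 1
    round_bit = (diff >> 2) & 1
    sticky_bit = 1 if diff & 0b11 else 0
    return diff >> 3, round_bit, sticky_bit

A = 0b100000101100
sig, r, s = subtract_grs(A, 0b110000010110, 6)     # operands of Example 4.7
print(format(sig, "012b"), r, s)                   # 111111110111 0 1
```

Running the same function on the operands of Example 4.5 (B = 0b110000000110, shift of 2) gives the significand 101001010101 with both extra bits zero, matching the postnormalized result shown there.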
If no postnormalization is required then, for rounding purposes, we do not need the three bits G, R, and S, but should use only two bits, a round bit and a sticky bit. The original G bit can serve as an R bit, and the original R and S bits must be ORed in order to generate a new sticky bit. We are then left with two bits, which we call R and S, for the round-to-nearest-even procedure. This situation is illustrated in the following example.

Example 4.8

In the following subtraction, no postnormalization is needed. The difference between the exponents is $E_A - E_B = 6$.

    A = 0.100001010100
    B = 0.110000011100

                                              G R S
    A                         0.100001010100  0 0 0
    B aligned                 0.000000110000  0 1 1
    A - B                     0.100000100011  1 0 1

                                              R S
    Before rounding           0.100000100011  1 1
    After round-to-nearest    0.100000100100

□

In the procedure above, if the new R bit is zero then no rounding is required and the sticky bit only indicates whether the final result is exact (if S = 0) or inexact (if S = 1). If R = 1 then the operation to be performed in the round-to-nearest-even procedure depends on the sticky bit and on the least-significant bit (denoted by L) of the resultant significand. If the sticky bit equals 1 then rounding must be performed by adding a ulp to the significand. If the sticky bit equals 0 then this is the tie case and only if L = 1 is rounding necessary. The above rules can be stated more succinctly by saying that the round-to-nearest-even scheme requires the addition of a ulp to the significand if the Boolean expression

$$R \cdot S + R \cdot \bar{S} \cdot L = R \cdot (S + L)$$

equals 1.

The addition of a ulp to the significand is sometimes needed even when a directed rounding mode is followed. For example, in the round toward $+\infty$ mode, a ulp must be added to the significand if the result is positive and either R or S equals 1. A similar situation occurs in the round toward $-\infty$ mode when the result is negative and the Boolean expression R + S equals 1. The rules for all four rounding modes mandated by the IEEE floating-point standard are shown in Table 4.9.

(a) Round-to-nearest-even scheme

    LSB   R   S   Operation    Error
    0     0   0   +0           0
    0     0   1   +0           -0.25 ulp
    0     1   0   +0           -0.50 ulp
    0     1   1   +0.5 ulp     +0.25 ulp
    1     0   0   +0           0
    1     0   1   +0           -0.25 ulp
    1     1   0   +0.5 ulp     +0.50 ulp
    1     1   1   +0.5 ulp     +0.25 ulp
                  Total        0

(b) Round-to-zero scheme

    R   S   Operation    Error
    0   0   +0           0
    0   1   +0           -0.25 ulp
    1   0   +0           -0.50 ulp
    1   1   +0           -0.75 ulp
            Total        -1.50 ulp

(c) Round-to-plus-infinity scheme

    Sign   R   S   Operation
    +      0   0   +0
    +      0   1   +1 ulp
    +      1   0   +1 ulp
    +      1   1   +1 ulp
    -      0   0   +0
    -      0   1   +0
    -      1   0   +0
    -      1   1   +0

(d) Round-to-minus-infinity scheme

    Sign   R   S   Operation
    -      0   0   +0
    -      0   1   +1 ulp
    -      1   0   +1 ulp
    -      1   1   +1 ulp
    +      0   0   +0
    +      0   1   +0
    +      1   0   +0
    +      1   1   +0

TABLE 4.9 The rules of all four rounding schemes.

The addition of a ulp that comes after the two significands have been added substantially increases the execution time of a floating-point add/subtract operation. The extra delay due to rounding can be avoided because all three guard bits are known before the significands are added. Thus, the addition of 1 to L can be done at the same time that the significands are added. The exact position of the L bit is not known yet, since a postnormalization may be required. However, the L bit has only two possible positions and two adders can therefore be used in parallel, providing the correctly rounded results for both cases [7]. This can also be achieved using one adder, as described in the next section.
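The rounding rules summarized in Table 4.9 reduce to a small Boolean decision on the sign, L, R, and S bits. The Python sketch below expresses that decision; the mode names and the function `needs_ulp` are labels chosen for this illustration and do not come from the standard or the text.

```python
def needs_ulp(mode, sign, L, R, S):
    """Return 1 if a ulp must be added to the significand magnitude, else 0.

    sign is 0 for a positive result and 1 for a negative one;
    L, R, S are the least-significant, round, and sticky bits (0 or 1).
    """
    if mode == "nearest_even":
        return R & (S | L)                    # R.S + R.S'.L = R.(S + L)
    if mode == "toward_zero":
        return 0                              # always truncate
    if mode == "toward_plus_inf":
        return (R | S) if sign == 0 else 0    # only positive results grow
    if mode == "toward_minus_inf":
        return (R | S) if sign == 1 else 0    # only negative results grow
    raise ValueError("unknown rounding mode: " + mode)

# Example 4.8: significand 0.100000100011 with R = 1 and S = 1
sig = 0b100000100011
sig += needs_ulp("nearest_even", 0, sig & 1, 1, 1)
print(format(sig, "012b"))                    # 100000100100
```

Keeping the decision separate from the addition mirrors the hardware observation that the bits needed for the decision are available before the significand addition completes.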
4.7 FLOATING-POINT ADDERS

The procedure followed when adding two floating-point numbers, depicted in Figure 4.1, includes a large number of steps which are executed sequentially.
However, a careful examination of the procedure reveals that not all of these steps must be executed for each operation, and that some of the steps can be executed in parallel.

To this end, we distinguish between effective addition and effective subtraction [2]. The effective operation depends on the sign bits of the two operands and the instruction to be executed. For effective addition, we first calculate the exponent difference to determine the alignment shift. We then shift the significand of the smaller operand, and add the aligned significands. The result of the addition can overflow by at most one bit position. Thus, a postnormalization shift, which would be time-consuming, is not needed. The single bit overflow can be easily detected and, if it is found, a 1-bit normalization of the result is performed using a multiplexor.

To eliminate the need for an increment operation in the rounding step, the above-mentioned significand adder is designed to produce two simultaneous results, sum and sum+1. An adder capable of producing the two results sum and sum+1 is sometimes called a compound adder [13], and it can be implemented in various ways, including carry-look-ahead and conditional sum (see Chapter 5). For the IEEE round-to-nearest-even rounding mode, we use the rounding bits to determine which of these two results should be selected. Note that these two results (i.e., sum and sum+1) are sufficient even if a single bit overflow occurs. In case of an overflow, the 1 will be added in the R bit position (instead of in the least significant bit (LSB) position), and since R = 1 if rounding is needed, a carry will be propagated into the LSB position to generate the correct value of sum+1. However, for the two directed rounding modes (round to $\pm\infty$), the R bit is not necessarily 1 and, as a result, sum+2 may be needed in the case of the 1-bit overflow.
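The selection between sum and sum+1 for round-to-nearest-even can be illustrated with integers. This is a behavioral sketch under stated assumptions, not a gate-level design: significands are m-bit integers, the smaller operand has already been aligned, its shifted-out bits are summarized as G, R, and S, and the function name `effective_add_rne` is hypothetical.

```python
def effective_add_rne(a, b_aligned, g, r, s, m):
    """Add two aligned m-bit significands with round-to-nearest-even.

    Returns (significand, exponent_increment).
    """
    total = a + b_aligned
    total_plus_1 = total + 1                 # a compound adder yields both at once
    overflow = total >> m                    # 1 if the sum needs m + 1 bits
    if overflow:
        # After the 1-bit right shift, bit 0 of the sum becomes the round bit.
        L, rnd, sticky = (total >> 1) & 1, total & 1, g | r | s
    else:
        L, rnd, sticky = total & 1, g, r | s
    chosen = total_plus_1 if rnd & (sticky | L) else total
    result = chosen >> 1 if overflow else chosen
    exp_inc = overflow
    if result >> m:                          # rounding carried out of the top bit
        result >>= 1
        exp_inc += 1
    return result, exp_inc

print(effective_add_rne(0b1101, 0b1010, 0, 0, 1, 4))  # (12, 1)
```

In the overflow case the increment lands one position below the new LSB, and because the round bit is 1 whenever rounding up is selected, the carry it produces gives the same result as incrementing the shifted sum.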
In effective subtraction, massive cancellation of the most significant bits may occur, resulting in the need for a lengthy postnormalization step. However, this can happen only when the exponents of the two operands are close (i.e., the difference is less than or equal to 1), in which case the pre-alignment step can be eliminated. It therefore makes sense to implement effective subtraction as two separate procedures: one for the case where the exponents are close, and one for the case where the exponent difference is greater than 1. In the CLOSE case only a postnormalization shift may be required, while in the FAR case only a pre-alignment shift may be needed. The required steps in the CLOSE and FAR cases are shown in Table 4.10.

    Step   CLOSE                                FAR
    1      Predict exponent                     Subtract exponents
    2      Subtract significands                Align significands
           Predict number of leading zeros
    3      Postnormalization                    Subtract significands
    4      Select properly rounded result       Select properly rounded result
           or negate result

TABLE 4.10 The steps in the CLOSE and FAR cases.

In the CLOSE case we first predict the exponent difference, based on the two least significant bits of the operands, to allow the subtraction of the significands to start as soon as possible. If the predicted difference is zero, a subtract will be executed with no alignment. If the predicted difference is $\pm 1$, the significand of the smaller operand is shifted once to the right (using a multiplexor) and then subtracted from the other significand. At the same time, the true exponent difference is calculated. If this difference is greater than 1, the above procedure is aborted and the FAR procedure is followed. If, however, the true exponent difference is less than or equal to 1, the CLOSE procedure is continued. In parallel with the subtraction, the number of leading zero bits is predicted, in order to determine the number of shift positions in the postnormalization step.
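The speculative start of the CLOSE procedure can be sketched as follows. The text only states that the prediction uses the two least significant bits; interpreting this as the difference of the exponents modulo 4 is an assumption of this sketch, and the names `predict_close_shift` and `select_path` are hypothetical.

```python
def predict_close_shift(exp_a, exp_b):
    """Guess the exponent difference from the two low-order exponent bits.

    Returns 0, +1, or -1, or None when the low bits rule out the CLOSE case.
    """
    diff_mod4 = ((exp_a & 0b11) - (exp_b & 0b11)) & 0b11
    return {0: 0, 1: 1, 3: -1}.get(diff_mod4)

def select_path(exp_a, exp_b):
    """Confirm or abort the speculation once the full difference is known."""
    guess = predict_close_shift(exp_a, exp_b)   # available early
    true_diff = exp_a - exp_b                   # full-width subtraction
    if guess is not None and abs(true_diff) <= 1:
        return "CLOSE", guess
    return "FAR", true_diff

print(select_path(130, 129))   # ('CLOSE', 1)
print(select_path(130, 126))   # ('FAR', 4)
```

The second call shows why the true difference must still be computed: exponents that differ by 4 look identical in their two low-order bits, so the speculative CLOSE subtraction has to be abandoned in favor of the FAR procedure.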
The nonnalizat.ion of the significand and tbe correspondinJ( exponent adjllit.lI1PDt are done next. Finally, rounding is performed in step 4. As mmtioned above. rounding CAn be a.ccompli..hed by precomputing sum and um+l and then  leeling the one which is proJ>{'rly roundro. In step" a negation of the result may aL<;() be needed. Since the subtract operation is almost always executed  that the maller (pre-shifted) significand is subtracted from the larger one. the result of the subtraction is usually positive and negation is not required. Xote that the gn of the final result is determined by the sign of the largest operand. Only In the ca...--e where the exponents of the t'O oJ>{'rands are equal, tbe r('Sult of the signific.and subtraction may be negative (represented in two's complement), re- quiring a negation step. Homer. in uch a case, DO pre-alignment is perfcnned, and consequently, no guard bit.:. are gt>nerated (the result is exact) and hence no rounding is nr)". Thus. the negation and rounding steps arf> mutuaDy exdush'e. In the F.-tR case the expouent difference is calculated first, and then tbe significand of the srnalJer operand is shifted to tbe riJ(ht to align it  itb d other signific..and. The shifwd-out bits are used to set the stic.ky bit. The  r signific.and is nov. subtracted from the larger signmcand. with the result king ei- ther nonnalized or requiring a cingle-bit-position I ft-shift. hicb isaccompli<;hed using a multiplexor. In step 4, rounding is performed. _ We conclude this section witb a bri('f description of the leadmg zeroe. pre- diction circuit. This circuit should prroict tbe position of t,he leading non-zero bit in the result of the subtract operation before thp subtraction is compk>ted. This would allow us to execute the ,-tnonnalization ft immediately f<1Jow- mg the subtraction. One way to arhie\'e this i3 to examine te bi of the 1v.o operands (of the subtract operation' in a  hion. starting. with. 
the most significant bits to determine the position of the first 1 [17]. This serial operation can be accelerated using a parallel scheme similar to the carry-look-ahead technique (see Chapter 5). Another way to predict the position of the leading 1 is to generate in parallel a set of intermediate bits $e_i$ such that $e_i = 1$ if the corresponding bits $a_i$ and $b_i$ of the two operands are identical and the previous bits, i.e., $a_{i-1}$ and $b_{i-1}$, allow the propagation of the expected carry (i.e., at least one of the bits $a_{i-1}$ and $b_{i-1}$ equals 1). A carry is expected since the subtract operation is executed by forming the one's complement of the subtrahend and forcing a carry into the least significant position. (In the notation used in Chapter 5, the Boolean expression for $e_i$ is $e_i = \overline{a_i \oplus b_i}\,(a_{i-1} + b_{i-1})$, where $b_i$ is the complement of the original subtrahend bit.) In other words, $e_i = 1$ if a carry is allowed to propagate to position $i$. The corresponding $i$th bit of the correct result will also be equal to 1 unless the forced carry from the least significant position did not propagate to position $i-1$; in such a case, the correct result will have a 1 in position $i-1$ instead. Thus, the position of the leading 1 in the result is either identical to the position of the leading 1 in the sequence of the $e_i$ bits, or it is one position to the right. We may therefore count the number of leading zeroes in the sequence of the $e_i$ bits and provide this count to the barrel shifter executing the postnormalization shift. After this, at most one bit position correction shift (to the left) will be required [18]. A comparison of several methods for predicting the position of the leading bit appears in [16].

4.8 EXCEPTIONS

We will concentrate in this section on the exceptions specified in the IEEE standard [9]. The 754 standard defines five types of exceptions: overflow, underflow, division-by-zero, invalid operation and inexact result. The first three exceptions are found in almost all floating-point systems. Only the last two are peculiar to the IEEE standard. When an exception occurs, a status flag is set and a specified result is generated (e.g., a correctly signed $\infty$ when a division-by-zero occurs). The status flag should remain set until explicitly cleared. The IEEE standard recommends the implementation of a separate trap-enable bit for each exception. If this bit is on when the corresponding exception occurs, then, in addition to setting the status flag, the user trap handler is called. Sufficient information must be provided by the floating-point unit to the trap handler to allow it to take the appropriate action, e.g., exact identification of the operation which caused the exception.

Overflow. The overflow exception flag is set whenever the exponent of the result exceeds the largest value allowed in the result's format. For example, in the single-precision format an overflow occurs if $E > 254$. The result, when the corresponding trap is disabled, is determined by the sign of the intermediate (overflowed) result and the rounding mode as follows:

1. In the round-to-nearest-even mode an $\infty$ with the sign of the intermediate result is generated.
2. In the round toward 0 mode the largest representable number with the sign of the intermediate result is generated.
3. In the round toward $-\infty$ mode the largest representable number with a plus sign is generated if the intermediate result is positive. Otherwise the final result is set to $-\infty$.
4. In the round toward $\infty$ mode the largest representable number with a minus sign is generated if the intermediate result is negative. Otherwise the final result is set to $+\infty$.

If the overflow trap is enabled then the trap handler receives the intermediate result divided by $2^a$ and then rounded, where $a$ is 192 or 1536 for the single- and double-precision format, respectively. This scaling adjustment was chosen in order to translate the overflowed result as nearly as possible to the middle of the exponent range so that it can be used in subsequent operations with less risk of causing further exceptions.

For example, when multiplying the number $2^{127}$ (for which $E = 254$ in the single-precision format) by $2^{127}$, the overflowed product has an exponent of $E = 254 + 254 - 127 = 381$ after being adjusted by 127, as is normally done for the multiply operation. This result clearly overflows since $E > 254$. If this product is then scaled (multiplied) by $2^{-192}$, the resulting exponent becomes $E = 381 - 192 = 189$, which represents the "true" value of $189 - 127 = 62$. This scaled intermediate value has a smaller risk of causing further exceptions. Consider now the case where relatively "small" operands result in an overflow. If we multiply the number $2^{64}$ (for which $E = 191$ in the single-precision format) by $2^{65}$ ($E = 192$), the overflowed product has an exponent of $E = 191 + 192 - 127 = 256$. If this exponent is then adjusted by 192, we obtain $E = 256 - 192 = 64$, which represents the "true" value of $64 - 127 = -63$. Conversions (e.g., from decimal to binary) are handled differently, see [9].

Underflow. The conditions under which the underflow flag is set depend on whether the underflow trap is enabled or disabled. If the underflow trap is enabled the underflow exception flag is set whenever the result is a nonzero number between $-2^{E_{min}}$ and $2^{E_{min}}$, where $E_{min}$ is $-126$ for the single-precision format and $-1022$ for the double-precision format. The intermediate result delivered to the underflow trap handler is the infinitely precise result multiplied by $2^a$ and then rounded. As for overflow, $a$ equals 192 or 1536 for the
single- or double-precision format, respectively. Conversions are also handled differently [9].

If, however, the underflow trap is disabled, denormalized numbers are allowed. The underflow exception flag is then set only when an extraordinary loss of accuracy occurs while representing the intermediate result (which has a nonzero value between $\pm 2^{E_{min}}$) as a denormalized number. Such a loss of accuracy occurs when either the guard bit or the sticky bit is nonzero. These indicate an inexact result. In an arithmetic unit where denormalized numbers are not implemented the delivered result is either zero or $\pm 2^{E_{min}}$.

Example 4.9 Suppose the underflow exception trap is disabled and denormalized numbers have been implemented. If we multiply $2^{-65}$ by $2^{-65}$ the resulting exponent is $E = (127 - 65) + (127 - 65) - 127 = -3$. Since $E < 1$ we cannot represent the product as a normalized number. Instead, we represent the result $2^{-130}$ as the denormalized number $0.0001 \cdot 2^{-126}$, i.e., $f = .0001$ and $E = 0$. No underflow exception flag is set. If the second operand is $(1 + ulp) \cdot 2^{-65}$ (rather than $2^{-65}$), the correct product is $(1 + ulp) \cdot 2^{-130}$ which, when converted to a denormalized number, yields $f = .0001$ and $E = 0$ as before, but now the sticky bit is equal to 1. Therefore, this is an inexact result and the underflow exception flag is set. □

Division by zero. The division-by-zero exception flag is set whenever the divisor is zero and the dividend is a finite nonzero number. When the corresponding trap is disabled the result must be a correctly signed $\infty$.

Invalid operation. The invalid operation flag is set if an operand is invalid for the operation to be performed. The result, when the invalid operation trap is disabled, is a quiet NaN. Examples of invalid operations are [9]:

1. Multiplying 0 by $\infty$.
2. Dividing 0 by 0 or $\infty$ by $\infty$.
3. Adding $+\infty$ and $-\infty$.
4. Finding the square root of a negative operand.
5. Calculating the remainder $x$ REM $y$ where $y$ is zero or $x$ is infinite.
6. Any operation on a signaling NaN.

Inexact result. The inexact result flag is set if the rounded result is not exact or if it overflows without an overflow trap. A rounded result is exact only when both the guard bit and the sticky bit are equal to zero. This implies that no precision was lost when performing the rounding. The purpose of the inexact result flag is to allow integer calculations to be performed in a floating-point unit.

4.9 ROUND-OFF ERRORS AND THEIR ACCUMULATION

The need to perform rounding in floating-point operations, even with the best rounding scheme, results in errors that tend to accumulate as the number of operations increases. The relative round-off error in a floating-point operation is denoted by $\epsilon$ and is defined by the equation

$$\epsilon = \frac{Fl(x * y) - (x * y)}{(x * y)} \tag{4.8}$$

where $*$ is any one of the floating-point arithmetic operations $+$, $-$, $\times$, or $/$, and $Fl(x * y)$ is the correctly rounded or truncated result of the operation $(x * y)$. A more convenient form of Equation (4.8) is

$$Fl(x * y) = (x * y) \cdot (1 + \epsilon). \tag{4.9}$$

Upper bounds for the relative error $\epsilon$ can be derived for the different round-off schemes. For truncation, the absolute error can be almost as large as the least-significant digit of the significand. The worst case for the relative error is when the normalized result assumes its smallest possible value. Therefore,

$$|\epsilon_{trunc}| < \frac{2^{-m}}{1/2} = 2^{-m+1}.$$

For the round-to-nearest scheme, the maximum absolute error is only half of $ulp$, and consequently

$$|\epsilon_{round}| \le \frac{1}{2} \cdot 2^{-m+1} = 2^{-m}.$$

The above formulas provide only upper bounds for the relative error. What might be more important is the exact distribution of the relative error within its bounds. This distribution has been studied (e.g., in [20]) and the following density function of $\epsilon_{trunc}$ (Figure 4.8(a)) has been derived:

$$f_{\epsilon_{trunc}}(\epsilon) = \begin{cases} 2^{m-1}/\ln 2 & \text{if } 0 \le \epsilon < 2^{-m} \\ \left(\dfrac{1}{\epsilon} - 2^{m-1}\right)/\ln 2 & \text{if } 2^{-m} \le \epsilon < 2^{-m+1} \end{cases} \tag{4.10}$$

In other words, the relative errors for truncation are uniformly distributed in the region $[0, 2^{-m}]$ and reciprocally distributed in the region $[2^{-m}, 2^{-m+1}]$.
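The two upper bounds derived above can be checked numerically. The sketch below is illustrative only: it assumes a normalized fractional significand in $[1/2, 1)$ with $m$ bits (so that $ulp = 2^{-m}$), uses exact rational arithmetic, and the helper names `truncate`, `round_nearest`, and `max_relative_errors` are hypothetical. It scans every operand that has a few more than $m$ bits and records the largest relative error magnitude for each scheme.

```python
import math
from fractions import Fraction


def truncate(x: Fraction, m: int) -> Fraction:
    """Chop a positive x to m fractional bits."""
    return Fraction(math.floor(x * 2**m), 2**m)


def round_nearest(x: Fraction, m: int) -> Fraction:
    """Round a positive x to m fractional bits (ties go up; the tie rule
    does not change the worst-case error)."""
    return Fraction(math.floor(x * 2**m + Fraction(1, 2)), 2**m)


def max_relative_errors(m: int, extra_bits: int = 6):
    """Largest |relative error| over all x in [1/2, 1) with m + extra_bits bits."""
    n = m + extra_bits
    worst_trunc = worst_round = Fraction(0)
    for j in range(2 ** (n - 1), 2**n):
        x = Fraction(j, 2**n)
        worst_trunc = max(worst_trunc, abs(truncate(x, m) - x) / x)
        worst_round = max(worst_round, abs(round_nearest(x, m) - x) / x)
    return worst_trunc, worst_round


m = 4
worst_trunc, worst_round = max_relative_errors(m)
print("truncation:", float(worst_trunc), "bound:", 2.0 ** (-m + 1))
print("rounding:  ", float(worst_round), "bound:", 2.0 ** (-m))
```

In this model the worst truncation error occurs for operands just above 1/2 whose discarded bits are all 1, and the worst rounding error occurs at a tie just above 1/2; both stay within the bounds $2^{-m+1}$ and $2^{-m}$, respectively.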
The average value of the relative error can now be calculated. This calculation results in

$$\bar{\epsilon}_{trunc} = \frac{2^{-m-1}}{\ln 2} \approx 0.72 \cdot 2^{-m}.$$

FIGURE 4.8 The density function of the relative error for (a) truncation, and (b) rounding.

The density function of $\epsilon_{round}$ (see Figure 4.8(b)) is

$$f_{\epsilon_{round}}(\epsilon) = \begin{cases} 2^{m-1}/\ln 2 & \text{if } -2^{-m-1} \le \epsilon \le 2^{-m-1} \\ \left(\dfrac{1}{2|\epsilon|} - 2^{m-1}\right)/\ln 2 & \text{if } 2^{-m-1} < |\epsilon| < 2^{-m} \end{cases} \tag{4.11}$$

The relative errors, when rounding, are uniformly distributed in the region $[-2^{-m-1}, 2^{-m-1}]$ and reciprocally distributed elsewhere. Unlike Equation (4.10), the density function in Equation (4.11) is symmetric with respect to $\epsilon = 0$ and, as a result, the average relative error is 0. The analytically derived expressions in Equations (4.10) and (4.11) were shown to be in very good agreement with empirical results.

The above analysis concentrated on the round-off errors occurring in a single floating-point operation. It did not indicate how these errors could accumulate in subsequent operations. Consider, for example, two intermediate results $A_1$ and $A_2$, which are to be added. Denote by $\epsilon_1$ and $\epsilon_2$ their corresponding relative errors, satisfying

$$A_1 = A_1^c (1 + \epsilon_1), \qquad A_2 = A_2^c (1 + \epsilon_2)$$

where $A_1^c$ and $A_2^c$ denote the "correct" values of $A_1$ and $A_2$, respectively. If we focus on the accumulated error and assume that no new error will be introduced in the addition, then the relative error of the sum $A_1 + A_2$ follows from

$$\frac{A_1 + A_2}{A_1^c + A_2^c} = 1 + \frac{A_1^c}{A_1^c + A_2^c}\,\epsilon_1 + \frac{A_2^c}{A_1^c + A_2^c}\,\epsilon_2.$$

Thus, the relative error of the sum is a weighted average of the relative errors of the operands. If the two operands are positive, then the above error will be dominated by the error in the larger operand.

A more severe case of error accumulation may occur in the subtract operation $A_1 - A_2$. The relative error in this case, under the same assumption as above, equals

$$\frac{A_1^c}{A_1^c - A_2^c}\,\epsilon_1 - \frac{A_2^c}{A_1^c - A_2^c}\,\epsilon_2.$$

If the operands $A_1$ and $A_2$ are positive numbers close to each other, the accumulated relative error could increase significantly, especially if the two relative errors $\epsilon_1$ and $\epsilon_2$ have opposite signs, resulting in a very substantial inaccuracy.

The accumulation of errors when a sequence of several floating-point operations is performed depends on the particular set of operations performed; in other words, it depends on the specific application. Therefore, the accumulation of errors in the general case cannot be analyzed and simulative studies must be used instead (e.g., [10], [12]). As might be expected, these studies have shown that in most cases, the accumulated relative error when truncation is used is higher than that when the round-to-nearest scheme is employed.

4.10 EXERCISES

4.1. Consider the following 40-bit floating-point format: A sign bit, a 29-bit normalized (fractional) significand in two's complement form, a 10-bit excess-512 exponent, and base 2 for the exponent. Determine the bit patterns of the smallest and largest positive and negative numbers, and find their values. Also, find the distance between any two consecutive numbers. How many different values can you represent using this format?

4.2. Find the normalized internal machine representations of the following floating-point numbers in the short IBM format, in the short DEC format and in the short IEEE format: (a) $+0.25 \cdot 2^{+31}$ (b) $-31.75 \cdot 2^{-7}$

4.3. (a) Show that there are 1.875 times as many normalized floating-point numbers with an exponent base of 16 as those with an exponent base of 2. Both use the same number of bits in the exponent and significand fields. (b) What is the ratio of representable normalized numbers for $\beta = 4$?

4.4. How many denormalized numbers are there in the short (32-bit) IEEE format, and what is their range? Compare it to the number of normalized values with the fixed exponent 1 (including bias).

4.5. Which of the following properties are satisfied in systems with denormalized numbers, and which are satisfied even in systems with no denormalized numbers: (a) $x \ne y$ implies $x - y \ne 0$. (b) $(x - y) + y \approx x$ (with a rounding error). (c) For a normalized number $x$, if $1/x \ne 0$ then $1/(1/x) \approx x$ (with a rounding error).

4.6. Prove that the average bias of the ROM rounding scheme is $\frac{1}{2}\left[\left(\frac{1}{2}\right)^{d} - \left(\frac{1}{2}\right)^{l-1}\right]$.

4.7. Four rounding schemes are supported by the IEEE standard: round-to-nearest-even, round toward zero (truncate), round toward infinity and round toward $-\infty$. Calculate the bias of all four rounding schemes.
4.8. For the four rounding schemes supported by the IEEE standard (see problem 7) show the final rounded results in the following three cases:

s  exponent  fraction                 guard
0  00011111  11111111111111111111111  1
0  11111110  11111111111111111111111  1
1  11111110  11111111111111111111111  1

4.9. Based on the results of problem 8, what is the advantage of the round-to-nearest-even scheme? What is the disadvantage of the round-to-nearest-even scheme (when implementation is considered), which can be avoided if a round-to-nearest-odd scheme is adopted?

4.10. Write down the postnormalization steps that might be needed when performing addition, subtraction, multiplication, and division with two floating-point operands in the IEEE short format. Indicate how many guard digits are needed in each case.

4.11. Show the result of the following operations on numbers in the IEEE short format in all four rounding schemes (see problem 7). The operands are given in the hexadecimal notation. (a) 3F80 0000 + 0080 0000 (b) 3F80 0000 - 3F7F FFFF (c) 3F80 0000 + 3380 0000 (d) 3F80 0000 0080 0000 (e) 4000 0001 × 4000 0001 (f) 4000 0000 : 1380 0000

4.12. Two normalized floating-point numbers $A$ and $B$ in the short IEEE format were added, and the result was equal to $A$. Does this imply that $B = 0$?

4.13. Given a floating-point number $A$ with an exponent $E_A$ (in any format), its successor has either the same exponent or the exponent $E_A + 1$. Is the distance between $A$ and its successor the same in both cases?

4.14. (a) Compare the error involved in the serial evaluation of the product of four numbers, performed as $(((A_1 \times A_2) \times A_3) \times A_4)$, to that of its parallel evaluation performed as $((A_1 \times A_2) \times (A_3 \times A_4))$. Decide whether one of the methods has a smaller upper bound for the error when forming the product of $n$ numbers. (b) Repeat (a) for the sum of four numbers, then $n$ numbers. Can we get lower error bounds if we know that the numbers are in some order; e.g., ascending order?

4.15. Prove that the optimal way to implement a two-level combinatorial shifter for $k$ bits, where $k = m^2$, is for the first level to shift by multiples of $m$, and the second level to shift from 0 to $m$. Assume that the delay is proportional to the number of destinations for each line in the two levels. Can you generalize this result for any value of $k$?

4.16. Another way to implement a radix-4 combinatorial shifter for 53 bits is by restricting the number of destinations for every bit in each level to 4 but allowing more than two levels. How many levels will such a combinatorial shifter have? How many levels will a radix-$r$ combinatorial shifter for $m$ bits have if the number of destinations for every bit in each level is restricted to $r$?

4.11 REFERENCES

[1] G. ALEFELD and J. HERZBERGER, Introduction to interval computations, Academic Press, NY, 1983.
[2] B. J. BENSCHNEIDER, et al., "A pipelined 50-MHz CMOS 64-bit floating-point arithmetic processor," IEEE Journal of Solid-State Circuits, 24 (October 1989), 1317-1323.
[3] R. P. BRENT, "On the precision attainable with various floating-point number systems," IEEE Trans. on Computers, C-22 (June 1973), 601-607.
[4] W. J. CODY, JR., "Static and dynamic numerical characteristics of floating-point arithmetic," IEEE Trans. on Computers, C-22 (June 1973), 598-601.
[5] W. J. CODY, JR., "Analysis of proposals for the floating-point standard," Computer, (March 1981), 63-69.
[6] G. EVEN and P.-M. SEIDEL, "A comparison of three rounding algorithms for IEEE floating-point multiplication," IEEE Trans. on Computers, 49 (July 2000), 638-650.
[7] D. GOLDBERG, "Computer arithmetic," in Computer architecture: A quantitative approach, D. A. Patterson and J. L. Hennessy, Morgan Kaufmann, CA, 1996.
[8] E. HOKENEK, R. MONTOYE and P. COOK, "Second-generation RISC floating-point with multiply-add fused," IEEE Journal of Solid-State Circuits, 25 (October 1990), 1207-1213.
[9] "IEEE standard for binary floating-point arithmetic," ANSI/IEEE 754-1985, also in Computer, 14 (March 1981), 51-62.
[10] D. J. KUCK, et al., "Analysis of rounding methods in floating-point arithmetic," IEEE Trans. on Computers, C-26 (7) (July 1977), pp. 643-650.
[11] D. J. KUCK, The structure of computers and computations, vol. 1, Wiley, New York, 1978, chap. 3.
[12] J. D. MARASA and D. W. MATULA, "A simulative study of correlated error propagation in various finite-precision arithmetics," IEEE Trans. on Computers, C-22 (June 1973), 587-597.
[13] S. F. OBERMAN, H. AL-TWAIJRY, and M. J. FLYNN, "The SNAP project: Design of floating-point arithmetic units," Proc. of 13th Symp. on Computer Arithmetic (July 1997), 156-165.
[14] V. PENG, S. SAMUDRALA and M. GAVRIELOV, "On the implementation of shifters, multipliers and dividers in VLSI floating-point units," Proc. of 8th Symp. on Computer Arithmetic (May 1987), 95-102.
[15] M. R. SANTORO, G. BEWICK and M. A. HOROWITZ, "Rounding algorithms for IEEE multipliers," Proc. 9th Symp. on Computer Arithmetic, 1989, 176-183.
[16] M. S. SCHMOOKLER and K. J. NOWKA, "Leading zero anticipation and detection - A comparison of methods," Proc. 15th Symp. on Comp. Arithmetic, 2001, 7-12.
[17] P. H. STERBENZ, Floating-point computation, Prentice Hall, Englewood Cliffs, NJ, 1974.
[18] H. SUZUKI, H. MORINAKA, et al., "Leading-zero anticipatory logic for high-speed floating point addition," IEEE Journal of Solid-State Circuits, 31 (August 1996), 1157-1164.
[19] D. W. SWEENEY, "An analysis of floating-point addition," IBM Systems Journal, 4 (1965), 31-42.
[20] N. TSAO, "On the distribution of significant digits and roundoff errors," Communications of the ACM, 17 (May 1974), 269-271.
[21] J. M. YOHE, "Rounding in floating-point arithmetic," IEEE Trans. on Computers, C-22 (June 1973), 577-586.

5 FAST ADDITION

5.1 RIPPLE-CARRY ADDERS

The addition of two operands is the most frequent operation in almost any arithmetic unit. A two-operand adder is used not only when performing additions and subtractions, but is also often employed when executing more complex operations like multiplication and division. Consequently, a fast two-operand adder is essential.

The most straightforward implementation of a parallel adder for two operands $x_{n-1}, x_{n-2}, \ldots, x_0$ and $y_{n-1}, y_{n-2}, \ldots, y_0$ is through the use of $n$ basic units called full adders. A full adder (FA) is a logical circuit that accepts two operand bits, say $x_i$ and $y_i$, and an incoming carry bit, denoted by $c_i$, and then produces the corresponding sum bit, denoted by $s_i$, and an outgoing carry bit, denoted by $c_{i+1}$. As this notation suggests, the outgoing carry $c_{i+1}$ is also the incoming carry for the subsequent FA, which has $x_{i+1}$ and $y_{i+1}$ as input bits. The FA is a combinational digital circuit implementing the binary addition of three bits through the following Boolean equations:

$$s_i = x_i \oplus y_i \oplus c_i \tag{5.1}$$

where $\oplus$ is the exclusive-or operation, and

$$c_{i+1} = x_i \cdot y_i + c_i \cdot (x_i + y_i) \tag{5.2}$$

where $x_i \cdot y_i$ is the AND operation, $x_i \wedge y_i$, and $x_i + y_i$ is the OR operation, $x_i \vee y_i$.
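Equations (5.1) and (5.2) translate directly into bitwise operations, and chaining $n$ full adders so that each outgoing carry feeds the next stage gives a software model of the parallel adder. The following is a minimal sketch; the function names and the least-significant-bit-first list representation are illustrative choices, not part of the text.

```python
def full_adder(x: int, y: int, c: int) -> tuple[int, int]:
    """One FA stage: returns (sum bit, outgoing carry)."""
    s = x ^ y ^ c                      # Equation (5.1)
    c_out = (x & y) | (c & (x | y))    # Equation (5.2)
    return s, c_out


def ripple_carry_add(x_bits, y_bits, c0=0):
    """Add two equal-length bit lists (least significant bit first).

    The outgoing carry of stage i becomes the incoming carry of stage i + 1.
    Returns (sum bits, final carry-out).
    """
    carry = c0
    sum_bits = []
    for x, y in zip(x_bits, y_bits):
        s, carry = full_adder(x, y, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value: int, n: int):
    """n-bit representation of value, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits) -> int:
    return sum(b << i for i, b in enumerate(bits))


s, c_out = ripple_carry_add(to_bits(0b1111, 4), to_bits(0b0001, 4))
print(from_bits(s), c_out)  # prints "0 1": sum bits 0000 with a carry-out of 1
```

The loop visits the stages in order, which mirrors the way a carry has to travel from position 0 upward before the higher sum bits are valid.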
FIGURE 5.1 A 4-bit ripple-carry adder.

A parallel adder consisting of FAs for $n = 4$ is depicted in Figure 5.1. In a parallel arithmetic unit, all $2n$ input bits ($x_i$ and $y_i$) are usually available to the adder at the same time. However, the carries have to propagate from the FA in position 0 (the position of the FA whose inputs are $x_0$ and $y_0$) to position $i$ in order for the FA in that position to produce the correct sum and carry-out bits. In other words, we need to wait until the carries ripple through all $n$ FAs before we can claim that the sum outputs are correct and may be used in further calculations. Because of this, the parallel adder shown in Figure 5.1 is called a ripple-carry adder. Note that the FA in position $i$, being a combinatorial circuit, will see an incoming carry $c_i = 0$ at the beginning of the operation, and will accordingly produce a sum bit $s_i$. The incoming carry $c_i$ may change later on, resulting in a corresponding change in $s_i$. Thus, a ripple effect can be observed at the sum outputs of the adder as well, continuing until the carry propagation is complete. Also, notice that in an add operation, the incoming carry in position 0, $c_0$, is always zero and as a result the FA in this position can be replaced by a simpler unit capable of adding only two bits. Such a circuit exists and is called a half adder (HA), and its Boolean equations can be obtained from Equations (5.1) and (5.2) by setting $c_i$ equal to 0. Still, an FA is frequently used to enable us to add a 1 in the least-significant position ($ulp$). This is needed to implement a subtract operation in the two's complement method. Here, the subtrahend is complemented and then added to the minuend. This is accomplished by taking the one's complement of the subtrahend and adding a forced carry to the FA in position 0 by setting $c_0 = 1$.

Example 5.1 Consider the following two operands for the adder in Figure 5.1: $x_3, x_2, x_1, x_0 = 1111$ and $y_3, y_2, y_1, y_0 = 0001$. $\Delta_{FA}$ denotes the operation time (delay) of an FA, assuming that the delays associated with generating the sum output and the carry-out are equal. This may be the case if, for example, both circuits use a two-level gate implementation. The following diagram shows the sum and carry signals as a function of the time, $T$, measured in $\Delta_{FA}$ units:

T = 0                  1111 + 0001
T = $\Delta_{FA}$      Carry 0001   Sum 1110
T = $2\Delta_{FA}$     Carry 0011   Sum 1100
T = $3\Delta_{FA}$     Carry 0111   Sum 1000
T = $4\Delta_{FA}$     Carry 1111   Sum 0000

This is the longest carry propagation chain that can occur when adding two 4-bit numbers. In synchronous arithmetic units, the time allowed for the adder's operation must be the worst-case delay, which is, in the general case, $n \cdot \Delta_{FA}$. This means that the adder is assumed to produce the correct sum after this fixed delay regardless of the actual carry propagation time, which might be very short, as in 0101 + 0010. Consider now the subtract operation 0101 - 0010, which is performed by adding the two's complement of the subtrahend to the minuend. The two's complement is formed by taking the one's complement of 0010, 1101, and setting the forced carry $c_0$ to 1, yielding 0011. □

It is clear that the long carry propagation chains must be dealt with in order to speed up the addition. Two main approaches can be envisioned: One is to reduce the carry propagation time; the other is to detect the completion of the carry propagation and avoid wasting time while waiting for the fixed delay (of $n \cdot \Delta_{FA}$ for ripple-carry adders) unless absolutely necessary. Clearly, the second approach leads to a variable addition time, which may be inconvenient in a synchronous design. We will therefore concentrate on the first approach and study several schemes for accelerating carry propagation. The technique for detection of carry completion is left as an exercise to the reader.

5.2 CARRY-LOOK-AHEAD ADDERS

The most commonly used scheme for accelerating carry propagation is the carry-look-ahead scheme. The main idea behind carry-look-ahead addition is an attempt to generate all incoming carries in parallel (for all the $n - 1$ high order FAs) and avoid the need to wait until the correct carry propagates from the
stage (FA) of the adder where it has been generated. This can be accomplished in principle, since the carries generated and the way they propagate depend only on the digits of the original numbers $x_{n-1}, x_{n-2}, \ldots, x_0$ and $y_{n-1}, y_{n-2}, \ldots, y_0$. These digits are available simultaneously to all stages of the adder and, consequently, each stage can have all the information it needs in order to calculate the correct value of the incoming carry and compute the sum bit accordingly. This, however, would require an inordinately large number of inputs to each stage of the adder, rendering this approach impractical.

One may reduce the number of inputs at each stage by extracting the information needed from the input digits to determine whether new carries will be generated and whether they will be propagated. To this end, we will study in detail the generation and propagation of carries. There are stages in the adder in which a carry-out is generated regardless of the incoming carry, and as a result, no additional information on previous input digits is required. These are the stages for which $x_i = y_i = 1$. There are other stages that are only capable of propagating the incoming carry; i.e., $x_i y_i = 10$, or $x_i y_i = 01$. Only a stage in which $x_i = y_i = 0$ cannot propagate a carry. To communicate the information regarding the generation and propagation of carries, we define the following logic functions using the AND and OR operations. Let $G_i = x_i \cdot y_i$ denote the generated carry and let $P_i = x_i + y_i$ denote the propagated carry. As a result, the Boolean expression (5.2) for the carry-out can be rewritten as

$$c_{i+1} = x_i y_i + c_i (x_i + y_i) = G_i + c_i P_i.$$

Substituting $c_i = G_{i-1} + c_{i-1} P_{i-1}$ in the above expression yields

$$c_{i+1} = G_i + G_{i-1} P_i + c_{i-1} P_{i-1} P_i.$$

Further substitutions result in

$$c_{i+1} = G_i + G_{i-1} P_i + G_{i-2} P_{i-1} P_i + c_{i-2} P_{i-2} P_{i-1} P_i = \cdots = G_i + G_{i-1} P_i + G_{i-2} P_{i-1} P_i + \cdots + c_0 P_0 P_1 \cdots P_i. \tag{5.3}$$

This type of expression allows us to calculate all the carries in parallel from the original bits $x_{n-1}, x_{n-2}, \ldots, x_0$ and $y_{n-1}, y_{n-2}, \ldots, y_0$ and the forced carry $c_0$. For example, for a 4-bit adder, the carries are

$$\begin{aligned} c_1 &= G_0 + c_0 P_0, \\ c_2 &= G_1 + G_0 P_1 + c_0 P_0 P_1, \\ c_3 &= G_2 + G_1 P_2 + G_0 P_1 P_2 + c_0 P_0 P_1 P_2, \\ c_4 &= G_3 + G_2 P_3 + G_1 P_2 P_3 + G_0 P_1 P_2 P_3 + c_0 P_0 P_1 P_2 P_3. \end{aligned} \tag{5.4}$$

If this is done for all stages of the adder, then for each stage a $\Delta_G$ delay is required to generate all $P_i$ and $G_i$, where $\Delta_G$ is the delay of a single gate. A delay of $2\Delta_G$ is then needed to generate all $c_i$ (assuming a two-level gate implementation) and another $2\Delta_G$ to generate the sum digits, $s_i$, in parallel (again, assuming a two-level gate implementation). Hence, a total of $5\Delta_G$ time units is needed, regardless of $n$, the number of bits in each operand. However, for a large value of $n$, say, $n = 32$, an extremely large number of gates is needed and, more importantly, gates with a very large fan-in are required (fan-in is the number of gate inputs, and is equal to $n + 1$ in this case). Therefore, we must reduce the span of the look-ahead at the expense of speed. We may divide the $n$ stages into groups and have a separate carry-look-ahead in each group. The groups can then be interconnected by the ripple-carry method. Dividing the adder into equal-sized groups has the additional benefit of modularity, requiring the detailed design of only a single integrated circuit. A group size of 4 is the most commonly used, and ICs capable of adding two sequences, each consisting of four digits with carry-look-ahead, are available. Size 4 was selected because it is a common factor of most word sizes, and also because of technology-dependent constraints (e.g., the available number of input/output pins).

For $n$ bits and groups of size 4, there are $n/4$ groups. To propagate a carry through a group once the $P_i$'s, $G_i$'s, and $c_0$ are available, we need $2\Delta_G$ time units. Thus, $1\Delta_G$ is needed to generate all $P_i$ and $G_i$, $(n/4) \cdot 2\Delta_G$ are needed to propagate the carry through all groups, and an additional delay of $2\Delta_G$ is needed to generate the sum outputs, for a total of $(2 \cdot \frac{n}{4} + 3)\Delta_G = (\frac{n}{2} + 3)\Delta_G$. This is almost a fourfold reduction in delay compared to the $2n\Delta_G$ delay of a ripple-carry adder.

We may further speed up the addition by providing a carry-look-ahead over groups in addition to the internal look-ahead within the group. We define a group-generated carry, $G^*$, and a group-propagated carry, $P^*$, for a group of size 4 as follows: $G^* = 1$ if a carry-out (of the group) is generated internally and $P^* = 1$ if a carry-in (to the group) is propagated internally to produce a carry-out (of the group). The Boolean equations for these carries are

$$G^* = G_3 + G_2 P_3 + G_1 P_2 P_3 + G_0 P_1 P_2 P_3, \qquad P^* = P_0 P_1 P_2 P_3. \tag{5.5}$$

The group-generated and group-propagated carries for several groups can now be used to generate group carry-ins in a manner similar to single-bit carry-ins in Equation (5.4). A combinatorial circuit implementing these equations is available as a separate and standard IC. This IC is called a carry-look-ahead generator, and its use is illustrated in the following example.

Example 5.2 For $n = 16$ there are four groups, with outputs $G_0^*, G_1^*, G_2^*, G_3^*$ and $P_0^*, P_1^*, P_2^*, P_3^*$. These serve as inputs to a carry-look-ahead generator, whose
outputs are denoted by $c_4$, $c_8$, and $c_{12}$, satisfying

$$\begin{aligned}
c_4 &= G_0^* + c_0 P_0^*, \\
c_8 &= G_1^* + G_0^* P_1^* + c_0 P_0^* P_1^*, \\
c_{12} &= G_2^* + G_1^* P_2^* + G_0^* P_1^* P_2^* + c_0 P_0^* P_1^* P_2^*.
\end{aligned} \tag{5.6}$$

A 16-bit adder with four groups, each with internal carry-look-ahead, and an additional carry-look-ahead generator is depicted in Figure 5.2. The operation of this adder consists of the following four steps:

1. All groups generate in parallel bit-carry-generate, $G_i$, and bit-carry-propagate, $P_i$.
2. All groups generate in parallel group-carry-generate, $G_i^*$, and group-carry-propagate, $P_i^*$.
3. The carry-look-ahead generator produces the carries $c_4$, $c_8$, and $c_{12}$ into the groups.
4. The groups calculate their individual sum bits (in parallel) with internal carry-look-ahead. In other words, they first generate the internal carries according to Equation (5.4) and then the sum bits.

The minimum time delay associated with steps (1)-(4) (assuming a minimum number of gate levels in all circuits) is $1\Delta_G$ for step 1, $2\Delta_G$ for step 2, $2\Delta_G$ for step 3, and $4\Delta_G$ for step 4. Thus, the total addition time is $9\Delta_G$ instead of $11\Delta_G$, which is the addition time if the external carry-look-ahead generator is not used and the carry ripples among the groups. These calculations yield only theoretical estimates for the addition time. In practice, one has to use the typical delays associated with the particular integrated circuit employed in order to calculate the addition time more accurately (see any integrated circuit databook). □

FIGURE 5.2 A 16-bit two-level carry-look-ahead adder. (The notation $x_{3-0}$ means $x_3, x_2, x_1, x_0$.)

As shown in Figure 5.2, the carry-look-ahead generator produces two additional outputs, $G^{**}$ and $P^{**}$, whose Boolean equations are similar to those in Equation (5.5). These new outputs are called section-carry generate and section-carry propagate, respectively, where a section, in this case, is a set of four groups and consists of 16 bits. As before, the number of groups in a section is commonly set at four because of implementation-related considerations, and not because of any limitation of the underlying algorithm.

If the number of bits to be added is larger than 16, say, 64, we may either use four circuits, each similar to the one shown in Figure 5.2, with a ripple-carry between adjacent sections, or use another level of carry-look-ahead and achieve a faster execution of addition. This is exactly the same circuit as above, accepting the four pairs of section-carry-generate and section-carry-propagate, and producing the carries $c_{16}$, $c_{32}$, and $c_{48}$.

As the number of bits, $n$, increases, more levels of carry-look-ahead generators can be added in order to speed up the addition. The required number of levels (for maximum speed-up) approaches $\log_b n$, where $b$ is the blocking factor, i.e., the number of bits in a group, the number of groups in a section, and so on. The blocking factor is 4 in the conventional implementation depicted in Figure 5.2. The overall addition time of a carry-look-ahead adder is therefore proportional to $\log_b n$.

5.3 CONDITIONAL SUM ADDERS

Another scheme for fast addition that provides a logarithmic speed-up is the conditional sum adder [29]. The principle behind this scheme is to generate two sets of outputs for a given group of operand bits, say, $k$ bits. Each set includes $k$ sum bits and an outgoing carry. One set assumes that the eventual incoming carry will be zero, while the other assumes that it will be one.
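As a rough illustration of this principle, the following Python sketch precomputes both sets of outputs for a single $k$-bit group. The function names (`ripple_add`, `conditional_outputs`) and the least-significant-bit-first list representation are assumptions made for this sketch and do not come from the text.

```python
def ripple_add(x_bits, y_bits, carry_in):
    """Add two equal-length bit lists (least-significant bit first).

    Returns (sum_bits, carry_out). Hypothetical helper for this sketch.
    """
    sum_bits = []
    carry = carry_in
    for x, y in zip(x_bits, y_bits):
        sum_bits.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x | y))
    return sum_bits, carry


def conditional_outputs(x_bits, y_bits):
    """Return both candidate output sets for one group.

    Index 0 holds the set that assumes an incoming carry of 0,
    index 1 the set that assumes an incoming carry of 1.
    """
    return ripple_add(x_bits, y_bits, 0), ripple_add(x_bits, y_bits, 1)
```

Indexing the returned pair with the actual incoming carry, for example `conditional_outputs(x_bits, y_bits)[carry_in]`, plays the role of the selection step.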
Once the incoming carry is known, we need only to select the correct set of outputs (out of the two sets) without waiting for the carry to further propagate through the $k$ positions (see Figure 5.3). Clearly, we should not apply this idea to all $n$ operand bits at the beginning of the add operation, since we will then have to wait until the carry propagates through all $n$ positions before making the selection. We need, therefore, to divide the given $n$ bits into smaller groups and apply the above idea to each of them separately. In this way, the serial carry-propagation inside the separate groups can be done in parallel, reducing the overall execution time. These groups can, in turn, be further divided into subgroups, for which the carry-propagation time is even smaller. The outputs of the subgroups are then combined to generate the output of the groups.

A natural division of the $n$ operand bits would be into two groups of size $n/2$ bits each. Each one of these can be further divided into two groups of size $n/4$ bits each. This process can, in principle, be continued until a group of size
1 is reached if $n$ is an integer power of 2. In this case, $\log_2 n$ steps are needed in the process, where in step 1 we deal with single bit positions, in step 2 pairs of bits are handled, and so on. Notice, however, that a given group does not necessarily have to be divided into equal-sized subgroups. Thus, the conditional sum scheme can be applied even if the number of bits is not a power of 2.

FIGURE 5.3 Selecting the correct set of sum bits and carry-out.

Example 5.3 In this example we illustrate the way groups containing single bits are combined into pairs of bits. We use here the following notation: $s_i^0$ denotes the sum bit at position $i$ under the assumption that the incoming carry into the currently considered group is 0; $s_i^1$ is defined similarly, and so are the outgoing carries (from the group), $c_{i+1}^0$ and $c_{i+1}^1$. We will first consider two adjacent bit positions, and, in step 1, each constitutes a separate group:

    i                                 7   6
    x_i                               1   0
    y_i                               0   0
    Assuming incoming carry = 0:
      s_i^0                           1   0
      c_{i+1}^0                       0   0
    Assuming incoming carry = 1:
      s_i^1                           0   1
      c_{i+1}^1                       1   0

In step 2 the two bit positions are combined (using data selectors) into one group of size 2:

    i, i-1                            7   6
    x_i, x_{i-1}                      1   0
    y_i, y_{i-1}                      0   0
    Assuming incoming carry = 0:
      s_i^0, s_{i-1}^0                1   0
      c_{i+1}^0                       0
    Assuming incoming carry = 1:
      s_i^1, s_{i-1}^1                1   1
      c_{i+1}^1                       0

Note that the carry-out from position 6 becomes an internal (to the group) carry and, consequently, we can select the appropriate set of outputs for position 7. □

Example 5.4 We now apply the conditional sum method to the addition of two 8-bit operands (Figure 5.4). The process has $\log_2 8 = 3$ steps. Notice that the forced carry (which equals 0 in this example) is available at the beginning of the operation. Therefore, only one set of outputs needs to be generated for the rightmost group at each step. □

    i                       7   6   5   4   3   2   1   0
    x_i                     1   0   1   1   0   1   1   0
    y_i                     0   0   1   0   1   1   0   1

    Step 1   s_i^0          1   0   0   1   1   0   1   1
             c_{i+1}^0      0   0   1   0   0   1   0   0
             s_i^1          0   1   1   0   0   1   0
             c_{i+1}^1      1   0   1   1   1   1   1

    Step 2   s_i^0          1   0   0   1   0   0   1   1
             c_{i+1}^0      0       1       1       0
             s_i^1          1   1   1   0   0   1
             c_{i+1}^1      0       1       1

    Step 3   s_i^0          1   1   0   1   0   0   1   1
             c_{i+1}^0      0               1
             s_i^1          1   1   1   0
             c_{i+1}^1      0

    Result (carry-out 0)    1   1   1   0   0   0   1   1

FIGURE 5.4 Conditional sum addition of two 8-bit numbers.
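The procedure of Examples 5.3 and 5.4 can be sketched in Python as follows. This is only an illustrative software model, not a hardware description: the function names, the least-significant-bit-first bit lists, and the choice to carry an unpaired leftover group unchanged to the next step (one possible way of handling a bit count that is not a power of 2) are all assumptions of this sketch.

```python
def to_bits(value, n):
    """Return the n low-order bits of value, least-significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(b << i for i, b in enumerate(bits))


def conditional_sum_add(x_bits, y_bits, forced_carry=0):
    """Conditional sum addition; returns (sum_bits, carry_out).

    Each group is a pair (outputs_if_carry_in_0, outputs_if_carry_in_1),
    and each outputs entry is (sum_bits, carry_out).
    """
    # Step 1: every bit position forms a group of size 1.
    groups = [(([x ^ y], x & y), ([1 - (x ^ y)], x | y))
              for x, y in zip(x_bits, y_bits)]
    # Later steps: combine adjacent groups pairwise.
    while len(groups) > 1:
        merged = []
        for k in range(0, len(groups) - 1, 2):
            low, high = groups[k], groups[k + 1]
            combined = []
            for carry_in in (0, 1):
                low_sum, low_carry = low[carry_in]
                # Data selector: the carry out of the low group picks
                # one of the two precomputed sets of the high group.
                high_sum, high_carry = high[low_carry]
                combined.append((low_sum + high_sum, high_carry))
            merged.append(tuple(combined))
        if len(groups) % 2:
            merged.append(groups[-1])  # unpaired group waits for the next step
        groups = merged
    # The forced carry selects the final set of outputs.
    return groups[0][forced_carry]
```

With the operands of Example 5.4, `conditional_sum_add(to_bits(0b10110110, 8), to_bits(0b00101101, 8))` yields sum bits that `from_bits` converts to `0b11100011`, with a carry-out of 0. Unlike the circuit, this model also builds the carry-in-1 set for the rightmost group, which the forced carry makes unnecessary in hardware.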
A variation of the conditional sum adder is the carry-select adder. As in the conditional sum adder, the $n$ bits are divided into groups (but not necessarily of the same size), and each group generates two sets of sum bits and an outgoing carry bit. The incoming carry selects one of these two sets. Unlike the conditional sum adder, each group is not further divided into smaller subgroups. The carry-select adder and some of its variations are described in Section 5.8.

Comparing the conditional sum and the carry-look-ahead schemes, we see that the two have about the same speed. The design of a conditional sum adder is, however, less modular than that of a carry-look-ahead adder, and this is the main reason for the much higher popularity of the latter. The next section includes a more general discussion on the optimality of algorithms for addition and their implementations.

5.4 OPTIMALITY OF ALGORITHMS AND THEIR IMPLEMENTATIONS

Numerous algorithms for fast addition, as well as other arithmetic operations, have been developed and implemented since the early days of digital computers, and newer ones are still being proposed. The main reason for the continuing research and development of new algorithms for the basic arithmetic operations is the rapid change in the technology that is used to implement them. An algorithm that can be optimally implemented in one technology is not necessarily the best in a different technology. Consequently, designers need to continuously reevaluate the available algorithms for a certain arithmetic operation and their suitability to the current technology.

In addition to the dependence on the technology employed for the implementation of the algorithm, the performance of a given algorithm is heavily affected by the unique features of the algorithm itself and/or the number system used to represent the operands and results. Thus, many studies have been performed that compare various algorithms in an effort to determine which will perform better, preferably independently of the technology used for the implementation. More importantly, the objective of some studies was to find the limit (bound) on the performance of any algorithm in executing a given arithmetic operation.

The execution time of addition, being highly dependent on the way carries are propagated, can reach its minimum in algorithms which avoid the propagation of carries altogether, or incur a very limited carry-propagation. Therefore, number systems that are characterized by almost carry-free addition can provide "optimal" algorithms for addition. One such number system is the residue number system (described in Chapter 11); another is the SD number system (described in Chapter 2). One should, however, be aware of the fact that these number systems are not frequently used in practice. Consequently, in order to take advantage of these fast algorithms for addition, we need to perform conversions between number systems. These conversions may turn out to be as complex as (or even more complex than) the addition itself and, unless a fast conversion algorithm is available, this approach is of limited practical value.

A theoretical model to determine a lower bound on the speed of addition has been proposed by Winograd [35], Spira [30], and others. This is an idealized model, whose purpose is the derivation of a bound independent of the implementation technology. It assumes that the circuit for addition is realized using only one type of gate, the $(f, r)$ gate, where $r$ is the radix of the number system used and $f$ is the fan-in of the gate, i.e., the maximum number of inputs to the gate. All $(f, r)$ gates are assumed to be capable of computing any $r$-valued function of $f$ (or fewer) arguments in exactly the same time period. This fixed time period is defined as the unit delay, and the computation time of the adder circuit is measured in these unit delays.

Since all $(f, r)$ gates can compute any function of $f$ arguments, we need to find out only how many such gates are required and how many circuit levels are needed in order to properly connect these gates. Thus, a circuit for adding two radix-$r$ operands with $n$ digits each has $2n$ inputs and produces $n + 1$ outputs. Consider the output that requires all $2n$ inputs for its calculation. These $2n$ inputs can be reduced to a smaller number of arguments by using $\lceil 2n/f \rceil$ such $(f, r)$ gates (where the ceiling $\lceil x \rceil$ of a number $x$ is the smallest integer that is larger than or equal to $x$). These gates belong to the same logic level and therefore operate in parallel. The resulting number of intermediate arguments is $\lceil 2n/f \rceil$, and this number can be further reduced through a second level of $(f, r)$ gates. A schematic diagram of the resulting circuit is depicted in Figure 5.5. The total number of levels in such a tree constructed of $(f, r)$ gates is at least $\lceil \log_f 2n \rceil$. Note that the indicated number of $(f, r)$ gates at each level is only a lower bound, since it assumes that no argument is needed as input to more than one $(f, r)$ gate. The resulting number of levels is consequently also a lower bound. Therefore, the lower bound on the time to perform addition is

$$T_{add} \geq \lceil \log_f 2n \rceil \tag{5.7}$$

measured in units of $(f, r)$ gate delay.

FIGURE 5.5 A partial diagram of a circuit implemented with $(f, r)$ gates.

There are several assumptions underlying this model that make it an idealized model. First, only the fan-in limitation is taken into account, while the fan-out constraint is ignored. The fan-out of a gate is the ability of its output to drive a number of inputs to similar gates in the next level. In practice, the fan-out of a gate is constrained. Second and more important, the model assumes that any $r$-valued function of $f$ arguments can be calculated by a single $(f, r)$ gate in one unit delay. In practice, only a small number of such functions can be implemented by a single gate requiring the smallest possible delay. Many functions may require either a more complex gate (with a longer delay) or need to be implemented using several simple gates organized in two or more levels.

The bound in Equation (5.7) assumes that there is at least one output digit that depends on all $2n$ input digits. If the addition technique (or the particular number system employed) is such that not all $2n$ digits are needed to determine any output digit, then a lower value for the bound can be established. Instead of having trees with $\lceil \log_f 2n \rceil$ levels, smaller trees (with fewer inputs) can be used. This occurs if a carry cannot propagate from the least-significant position to the most-significant position, since such a long carry propagation implies that the most-significant output digit is dependent on all $2n$ inputs. For example, if only $x_i$, $y_i$, $x_{i-1}$, and $y_{i-1}$ are needed to determine the sum digit $s_i$, then $T_{add} \geq \lceil \log_f 4 \rceil$. If the conventional binary number system is used, a carry can propagate through all $n$ positions and $\lceil \log_f 2n \rceil$ is still a lower bound on the addition time. The two addition algorithms described in the previous sections (namely, the carry-look-ahead and the conditional sum) have an execution time that is proportional to $\log n$ and can, in theory, approach the above bound.

However, when comparing two or more algorithms with the same theoretical bound for the execution time, some objective function related to the cost of implementation should also be taken into account. The type of the additional objective function used depends on the technology employed. For example, when discrete gates are used to implement the circuit, the number of such gates may serve as an objective function measuring the implementation cost. The number of gates along the critical (longest) path (in other words, the number of circuit levels) determines the execution time of the algorithm. If full custom VLSI technology is used, then the exact number of gates has a very limited effect on the implementation cost. Instead, regularity of the design and the size of the interconnections are considerably more important, since they affect both the silicon area used by the adder and the design time.

The two factors (i.e., implementation cost and speed) do not necessarily achieve their minimum value in the same design. Thus, a tradeoff between these two might have to be found. If performance is more important than implementation cost, then the carry-look-ahead adder is very attractive. Still, the implementation cost should be reduced, especially when full custom VLSI is employed, and, as a result, regularity of the design and size of the required area determine the implementation cost. This can be achieved by taking advantage of the available degree of freedom in the design, namely, the blocking factor. The blocking factor is always bounded by the fan-in constraint. In some cases it might also be bounded by additional constraints, such as the number of pins. However, the highest value for the blocking factor is not necessarily the best choice, and there is a need to evaluate its effect on execution time and implementation cost. For example, a blocking factor of 2 results in a very regular layout of a binary tree with up to $\log_2 n$ levels, requiring a total area of approximate size $n \cdot \log_2 n$. Further details on the structure of the carry-look-ahead tree with a blocking factor of 2 are provided in the next section.

FIGURE 5.6 A Manchester adder.

If lower implementation cost is required, then the carry-look-ahead scheme might be inappropriate. The ripple-carry method can be used together with some speed-up techniques, which depend on the technology. One such technique is the Manchester adder [14], whose schematic diagram is shown in Figure 5.6. This diagram includes switches that can be realized using pass transistors or similar devices in various technologies. The three switches
in Figure 5.6 are controlled by the signals $P_i$, $G_i$, and $K_i$, where $P_i = x_i \oplus y_i$ is the carry-propagate signal, $G_i = x_i y_i$ is the carry-generate signal, and $K_i = \bar{x}_i \bar{y}_i$ is the carry-kill signal. The three signals are defined so that one, and only one, of the corresponding switches is closed at any time. Thus, the equation $P_i = x_i \oplus y_i$ is used instead of $P_i = x_i + y_i$ as before. If $G_i = 1$, an outgoing carry is generated irrespective of the incoming carry. If $K_i = 1$, any incoming carry is "killed" and not allowed to continue its propagation. If $P_i = 1$, however, an incoming carry is allowed to propagate.

All switches in units 0 through $n - 1$ are set simultaneously, and, as a result, the propagating carry experiences only a single switch delay per stage. The number of carry-propagate switches that can be cascaded is limited in practice, and this limit depends on the specific technology employed. Thus, there is a need to partition the $n$ units into groups and insert separating devices (buffers) between them. In theory, the execution time of this adder is linearly proportional to the number of bits, $n$. However, the ratio between its execution time and that of another adder (e.g., the carry-look-ahead adder) depends on the particular technology. In any technology that is employed to realize the Manchester adder, the implementation cost, measured in size of area and/or regularity of design, is expected to be lower than that of a carry-look-ahead adder [25].

5.5 CARRY-LOOK-AHEAD ADDITION REVISITED

In this section we derive the equations for carry-look-ahead addition in a more general way. This will allow us to consider various implementations of the carry-look-ahead adder rather than being restricted to a predetermined blocking factor. It will also provide a general framework for deriving expressions for other techniques for fast addition, including carry-select and carry-skip adders.

We first introduce the following notation. Let $P_{i:j}$ and $G_{i:j}$ denote the group-propagated carry and the group-generated carry functions, respectively, for the group of bit positions $i, i-1, \ldots, j$ (with $i \geq j$), as shown in Figure 5.7. $P_{i:j}$ equals 1 when an incoming carry into the least significant position $j$, $c_j$, is allowed to propagate through all $i - j + 1$ bit positions. $G_{i:j}$ equals 1 when a carry is generated in at least one of the bit positions from $j$ to $i$ inclusive and propagates to bit position $i + 1$, i.e., the outgoing carry $c_{i+1}$ equals 1. These definitions generalize those in Equation (5.5) and include as a special case the single bit-position carry propagate and generate functions, $P_i$ and $G_i$.

FIGURE 5.7 A group consisting of $i - j + 1$ bit positions ($i \geq j$).

The two group-carry functions can be calculated recursively using the two Boolean equations

$$P_{i:j} = \begin{cases} P_i & \text{if } i = j \\ P_i \cdot P_{i-1:j} & \text{if } i > j \end{cases} \tag{5.8}$$

$$G_{i:j} = \begin{cases} G_i & \text{if } i = j \\ G_i + P_i \cdot G_{i-1:j} & \text{if } i > j. \end{cases} \tag{5.9}$$

Note that the notations $P_{i:i}$ and $P_i$ (and similarly $G_{i:i}$ and $G_i$) are equivalent. The recursive equations (5.8) and (5.9) can be further generalized to

$$P_{i:j} = P_{i:m} \cdot P_{m-1:j}, \qquad G_{i:j} = G_{i:m} + P_{i:m} \cdot G_{m-1:j}, \tag{5.10}$$

where $i \geq m \geq j + 1$. This is the same generalization that was employed in Section 5.2 to derive the section-carry propagate and generate functions, $P^{**}$ and $G^{**}$. It can be formally proved by induction on $m$.

Instead of dealing with each of the group-carry functions separately, we introduce the pair $(P_{i:j}, G_{i:j})$ and define a new Boolean operator, called the fundamental carry operator and denoted by $\circ$, as follows [2]:

$$(P, G) \circ (\hat{P}, \hat{G}) = (P \cdot \hat{P},\; G + P \cdot \hat{G}). \tag{5.11}$$

Using this fundamental carry operator we can rewrite the recursive Equation (5.10) as

$$(P_{i:j}, G_{i:j}) = (P_{i:m}, G_{i:m}) \circ (P_{m-1:j}, G_{m-1:j}), \tag{5.12}$$

where $i \geq m \geq j + 1$. The fundamental carry operation has been shown to be associative [2], i.e., for $i \geq m > v > j$ (or, more accurately, $i \geq m \geq v + 1 \geq j + 2$)

$$\left( (P_{i:m}, G_{i:m}) \circ (P_{m-1:v}, G_{m-1:v}) \right) \circ (P_{v-1:j}, G_{v-1:j}) = (P_{i:m}, G_{i:m}) \circ \left( (P_{m-1:v}, G_{m-1:v}) \circ (P_{v-1:j}, G_{v-1:j}) \right). \tag{5.13}$$

From its definition in Equation (5.11), the fundamental carry operation is an idempotent operation,

$$(P, G) \circ (P, G) = (P \cdot P,\; G + P \cdot G) = (P, G), \tag{5.14}$$

and consequently, the most general form of Equation (5.12) is

$$(P_{i:j}, G_{i:j}) = (P_{i:m}, G_{i:m}) \circ (P_{v:j}, G_{v:j}), \tag{5.15}$$

where $i \geq m$ and $v \geq j$, but $v$ is not necessarily equal to $m - 1$; it is only required that $v \geq m - 1$.
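Equations (5.8) through (5.15) can be checked numerically with a small Python model. The sketch below is illustrative only: the function names are hypothetical, bits are indexed with 0 as the least-significant position, and $P_i = x_i + y_i$ (logical OR) is assumed for the per-bit propagate signal.

```python
def carry_op(left, right):
    """Fundamental carry operator of Equation (5.11).

    `left` is the (P, G) pair of the more significant subgroup,
    `right` the pair of the less significant one.
    """
    p, g = left
    p_hat, g_hat = right
    return (p & p_hat, g | (p & g_hat))


def bit_signals(x, y, n):
    """Per-bit (P_i, G_i) pairs for i = 0..n-1, with P_i = x_i OR y_i."""
    return [(((x >> i) | (y >> i)) & 1, ((x >> i) & (y >> i)) & 1)
            for i in range(n)]


def group_pg(signals, i, j):
    """(P_{i:j}, G_{i:j}) computed recursively as in (5.8) and (5.9)."""
    if i == j:
        return signals[i]
    return carry_op(signals[i], group_pg(signals, i - 1, j))


def carry_into(signals, m, j, c_j):
    """Carry into position m, given the carry c_j into position j < m."""
    # Second component of (P_{m-1:j}, G_{m-1:j}) o (1, c_j),
    # i.e. G_{m-1:j} + P_{m-1:j} * c_j.
    return carry_op(group_pg(signals, m - 1, j), (1, c_j))[1]
```

For example, with `sig = bit_signals(x, y, 8)`, the pairs `carry_op(group_pg(sig, 7, 4), group_pg(sig, 3, 0))` and `carry_op(group_pg(sig, 7, 3), group_pg(sig, 5, 0))` should both equal `group_pg(sig, 7, 0)`; the second form uses overlapping subgroups, as Equation (5.15) permits.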
The above equations indicate how the (propagate and generate) group carries $P_{i:j}$ and $G_{i:j}$ can be calculated from subgroup carries, where the two or more subgroups are of arbitrary size and these subgroups may even overlap. Eventually, we would like to use these group and subgroup carries in order to calculate the individual bit carries $c_{i+1}, c_i, \ldots, c_{j+1}$, and sum outputs $s_i, s_{i-1}, \ldots, s_j$. To this end we must take into account the "external" carry $c_j$ (see Figure 5.7). For the $m$th bit position, $i \geq m \geq j$, we have

$$c_m = G_{m-1:j} + P_{m-1:j} \cdot c_j, \tag{5.16}$$

which can also be rewritten as $(P_{m-1:j}, G_{m-1:j}) \circ (1, c_j)$, and if $P_m = x_m \oplus y_m$ then

$$s_m = c_m \oplus P_m. \tag{5.17}$$

However, if $P_m = x_m + y_m$ then $s_m = c_m \oplus (x_m \oplus y_m)$.

An alternative way to deal with the incoming carry into the group, $c_j$, is to modify the equation for $G_j$ from $x_j y_j$ to $x_j y_j + P_j \cdot c_j$. Then, the equation for $c_m$ becomes

$$c_m = G_{m-1:j}. \tag{5.18}$$

The latter "forces" the carry $c_j$ to propagate through the group while the former allows it to "skip" the group. We will further elaborate on this in the discussion of carry-skip adders.

Equations (5.15)-(5.18) can be used to derive various implementations of adders including ripple-carry, carry-look-ahead, carry-select, carry-skip, and others. A 5-bit ripple-carry adder corresponds to the case where all subgroups consist of a single bit position and the computation starts at position 0, proceeds to position 1, and so on:

$$(P_4, G_4) \circ \left[ (P_3, G_3) \circ \left\{ (P_2, G_2) \circ \left[ (P_1, G_1) \circ \left\{ (P_0, G_0) \circ (1, c_0) \right\} \right] \right\} \right]. \tag{5.19}$$

A 16-bit carry-look-ahead adder with four groups of size 4 (i.e., a blocking factor of 4) and a ripple-carry among groups corresponds to the following expression:

$$(P_{15:12}, G_{15:12}) \circ \left[ (P_{11:8}, G_{11:8}) \circ \left\{ (P_{7:4}, G_{7:4}) \circ \left[ (P_{3:0}, G_{3:0}) \circ (1, c_0) \right] \right\} \right]. \tag{5.20}$$

We next introduce a variant of a carry-look-ahead adder that was proposed in [2]. This variant uses a blocking factor of 2, resulting in a very regular layout of a binary tree with $\log_2 n$ levels, requiring a total area of approximate size $n \cdot \log_2 n$. To illustrate the design of the adder, consider the calculation of $c_{16}$, the incoming carry at stage 16 in a 17-bit (or more) adder, and suppose that $G_0 = x_0 y_0 + P_0 \cdot c_0$.

FIGURE 5.8 A tree structure for calculating $c_{16}$ (each line except $c_0$ represents two signals that are either $x_m$ and $y_m$ or $P_{i:m}$ and $G_{i:m}$).

The tree structure for calculating $c_{16}$ is depicted in Figure 5.8. The part of the tree structure that generates $(P_{7:0}, G_{7:0})$ corresponds to the expression

$$\begin{aligned}
(P_{7:0}, G_{7:0}) &= (P_{7:4}, G_{7:4}) \circ (P_{3:0}, G_{3:0}) \\
&= \{ (P_{7:6}, G_{7:6}) \circ (P_{5:4}, G_{5:4}) \} \circ \{ (P_{3:2}, G_{3:2}) \circ (P_{1:0}, G_{1:0}) \} \\
&= \{ [(P_7, G_7) \circ (P_6, G_6)] \circ [(P_5, G_5) \circ (P_4, G_4)] \} \circ \{ [(P_3, G_3) \circ (P_2, G_2)] \circ [(P_1, G_1) \circ (P_0, G_0)] \}.
\end{aligned} \tag{5.21}$$

All the circuits in the second through the fifth levels in Figure 5.8 implement the fundamental carry operation and are therefore identical. $c_{16}$ is equal to $G_{15:0}$ (compare to Equation (5.18)) and, since $P_m = x_m \oplus y_m$, we can calculate the sum bit $s_{16}$ using $s_{16} = c_{16} \oplus P_{16}$. The tree structure in Figure 5.8 also generates the carries $c_2$, $c_4$, and $c_8$. The carry bits for the remaining bit positions can be calculated through extra subtree structures that can be added to the binary tree shown in Figure 5.8. Once all the carries are known, the corresponding sum bits can be computed using Equation (5.17).

In the above design the blocking factor always equals 2. However, the blocking factor does not have to be the same for all levels of the carry generation tree. Different values of the blocking factor may lead to a more efficient use of space and/or shorter interconnections [22].

5.6 PREFIX ADDERS

The adder shown in Figure 5.8 may be viewed as a parallel prefix circuit. A parallel prefix circuit is a combinational circuit with $n$ inputs $x_1, x_2, \ldots, x_n$ producing the outputs $x_1,\; x_2 \circ x_1,\; \ldots,\; x_n \circ x_{n-1} \circ \cdots \circ x_1$, where $\circ$ is an associative binary operation. The first stage of the adder in Figure 5.8 generates the individual $P_i$ and $G_i$ signals. The remaining stages constitute the parallel prefix circuit with the fundamental carry operation serving as the associative binary operation $\circ$. This part of the adder tree can be designed in many different ways. The particular way this part is implemented within the 16-bit Brent-Kung adder [2] in Figure 5.8 is shown in Figure 5.9. The bullets implement the fundamental carry operation while the empty circles at the top generate the individual $P_i$ and $G_i$ signals. Note that Figure 5.8, which generates $G_{15:0}$, shows only the top four stages of the complete parallel prefix graph in Figure 5.9, which uses seven stages to also generate all the intermediate carries $G_{i:0}$ ($i = 14, 13, \ldots, 1$).

FIGURE 5.9 The Brent-Kung [2] parallel prefix graph.

The number of stages and, consequently, the total delay of the adder can be reduced by modifying the structure of the parallel prefix graph. The minimum number of stages for a parallel prefix graph is $\log_2 n$, which for $n = 16$ is equal to 4, while the number of stages in a Brent-Kung parallel prefix graph is $2 \log_2 n - 1$. One way to implement a four-stage parallel prefix graph has been proposed in [17] and is shown in Figure 5.10. Note that unlike Figure 5.9, the Ladner-Fischer adder employs fundamental carry operators with a fan-in value higher than 2, i.e., the blocking factor varies from 2 to $n/2$. Such an implementation also implies a fan-out of up to $n/2$, requiring buffers which add to the overall delay.

FIGURE 5.10 The Ladner-Fischer [17] parallel prefix graph.

Another parallel prefix graph, which also uses only $\log_2 n$ stages but has lower fan-in and fan-out requirements, has been proposed in [16] and is shown in Figure 5.11. This adder still has a higher number of lateral wires with a longer span than the Brent-Kung adder, and such wires usually require some buffering, resulting in additional delay.

FIGURE 5.11 The Kogge-Stone [16] parallel prefix graph.

Several other variants of parallel prefix graphs have been proposed (e.g., [15]), illustrating that, in general, smaller adder delay can be achieved in exchange for higher overall area and/or power. Compromises between simplicity of wiring and overall delay have also been suggested. For example, a hybrid design combining stages from the Brent-Kung and Kogge-Stone adders was proposed in [12] and is shown in Figure 5.12. It has five rather than four stages, with the middle three resembling the Kogge-Stone structure, but its wires have a shorter span than those in the Kogge-Stone adder.

FIGURE 5.12 The Han-Carlson [12] parallel prefix graph.
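To make the prefix formulation concrete, the following Python sketch models a Kogge-Stone style prefix computation in software. It is a behavioural illustration under stated assumptions, not a description of the circuit in Figure 5.11: the function names are hypothetical, $P_i = x_i \oplus y_i$ is used so that Equation (5.17) applies, and the forced carry $c_0$ is folded into $G_0$ so that Equation (5.18) gives the carries.

```python
def carry_op(left, right):
    """Fundamental carry operator: (P, G) o (P^, G^) = (P*P^, G + P*G^)."""
    p, g = left
    p_hat, g_hat = right
    return (p & p_hat, g | (p & g_hat))


def kogge_stone_add(x, y, n, c0=0):
    """Add two n-bit integers; returns (sum mod 2**n, carry_out, stages)."""
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(n)]   # P_i = x_i XOR y_i
    g = [((x >> i) & (y >> i)) & 1 for i in range(n)]   # G_i = x_i AND y_i
    g[0] |= p[0] & c0            # fold the forced carry into G_0
    pg = list(zip(p, g))         # pg[i] starts as (P_i, G_i)
    stages = 0
    dist = 1
    while dist < n:
        # Every position combines with the position `dist` places to its
        # right; all positions of one stage work on the previous stage.
        pg = [carry_op(pg[i], pg[i - dist]) if i >= dist else pg[i]
              for i in range(n)]
        dist *= 2
        stages += 1
    # Now pg[i] = (P_{i:0}, G_{i:0}) and c_{i+1} = G_{i:0}.
    carries = [c0] + [gi for _, gi in pg]
    total = sum((p[i] ^ carries[i]) << i for i in range(n))  # s_i = c_i XOR P_i
    return total, carries[n], stages
```

For $n = 16$ the loop runs four times, matching the $\log_2 n$ stage count quoted above. The sketch does not model fan-out, wire span, or buffering, which are the properties that distinguish the prefix graphs in hardware.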
5.7 LING ADDERS

Ling adders [19] are a variation of the carry-look-ahead adders. They use a simpler version of the group-generated carry signal and thus provide an opportunity to reduce the associated delay. We will see the principle behind Ling adders through a simple example. Suppose that we start with a carry-look-ahead adder with groups of size 2, i.e., we produce the signals $G_{1:0}$, $P_{1:0}$, $G_{3:2}$, $P_{3:2}$ and so on. The outgoing carry for position 3 can be expressed as

$$c_4 = G_{3:0} = G_{3:2} + P_{3:2} G_{1:0}$$  (5.22)

where $G_{3:2} = G_3 + P_3 G_2$, $G_{1:0} = G_1 + P_1 G_0$, and $P_{3:2} = P_3 P_2$. We either assume that $c_0 = 0$ or set $G_0 = x_0 y_0 + P_0 c_0$. Expressing $G_{3:0}$ in terms of the individual carry generate and propagate signals we obtain

$$G_{3:0} = G_3 + P_3 G_2 + P_3 P_2 (G_1 + P_1 G_0).$$  (5.23)

Since $G_3 P_3 = G_3$, all terms in the above equation have $P_3$ as a common factor, and we can therefore rewrite the expression for $G_{3:0}$ as

$$G_{3:0} = P_3 H_{3:0},$$

where $H_{3:0}$ is defined as

$$H_{3:0} = H_{3:2} + P_{2:1} H_{1:0},$$  (5.24)

and $H_{3:2} = G_3 + G_2$, $H_{1:0} = G_1 + G_0$. Note that $P_{2:1}$ is used in (5.24), in contrast to $P_{3:2}$ which is used in (5.22). Equation (5.24) defines $H$ as an alternative to the carry generate $G$ and shows that $H$ can be calculated in a similar manner. However, $H$ does not have a simple interpretation like $G$. On the other hand, $H$ is simpler to calculate. For example, expressing $H_{3:0}$ in terms of the individual bit signals yields

$$H_{3:0} = G_3 + G_2 + P_2 P_1 (G_1 + G_0)$$

which can be simplified to

$$H_{3:0} = G_3 + G_2 + P_2 G_1 + P_2 P_1 G_0,$$  (5.25)

while Equation (5.23) can be rewritten as

$$G_{3:0} = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0.$$  (5.26)

The maximum fan-in in (5.25) is smaller than that in (5.26), leading to simpler and usually faster circuits. Other variations of the expression for the group-generated carry $G$ have corresponding variations for $H$. For example, for the equation $G_{3:0} = G_3 + P_3 G_{2:0}$, the corresponding equation for $H$ is $H_{3:0} = G_3 + T_2 H_{2:0}$ where $T_2 = x_2 + y_2$. A more general expression for $H$ is $H_{i:0} = G_i + T_{i-1} H_{i-1:0}$ where $T_{i-1} = x_{i-1} + y_{i-1}$.

The calculation of the sum bits in a Ling adder is slightly more involved than that for the carry-look-ahead adder. To illustrate this calculation consider $s_3$:

$$s_3 = c_3 \oplus (x_3 \oplus y_3) = (P_2 H_{2:0}) \oplus (x_3 \oplus y_3) = \overline{H_{2:0}}\,(x_3 \oplus y_3) + H_{2:0}\,(P_2 \oplus (x_3 \oplus y_3)).$$  (5.27)

The calculation of $H_{2:0}$ is faster than that of $c_3$, reducing the delay associated with generating $s_3$. Three other variations of the carry-look-ahead adder which have properties similar to those of Ling's adder have been presented in [7]. An implementation of a 32-bit Ling adder is described in [8]. A multiplexer-based implementation of Ling adders is presented in [26].

5.8 CARRY-SELECT ADDERS

In a carry-select adder the $n$ bits are divided into nonoverlapping groups of possibly different lengths. The underlying strategy is similar to that of the conditional-sum adder described in Section 5.3. Each group generates two sets of sum bits and an outgoing carry. One set assumes that the incoming carry into the group is 0, the other assumes that it is 1. When the incoming carry into the group is assigned its final value, it selects one of the two sets, as is shown in Figure 5.3. Figure 5.13 is a more detailed version of Figure 5.3, depicting the $l$th group which consists of $k$ bit positions starting with bit position $j$ and ending with bit position $i$, where $i = j + k - 1$. The outputs of the group are the sum bits $s_i, s_{i-1}, \ldots, s_j$ and the $l$th group outgoing carry $c_{i+1}$. The corresponding Boolean equations are

$$s_m = s_m^0 \cdot \overline{c_j} + s_m^1 \cdot c_j ; \quad m = j, j+1, \ldots, i,$$  (5.28)

and

$$c_{i+1} = c_{i+1}^0 \cdot \overline{c_j} + c_{i+1}^1 \cdot c_j,$$  (5.29)

where $s_m^0$ is the $m$th sum bit under the condition that the incoming carry into the $l$th group is 0. This is the same notation that we have used for the conditional-sum adder. The notations $s_m^1$, $c_{i+1}^0$ and $c_{i+1}^1$ are defined similarly.

The two separate sets of outputs can be calculated in a ripple-carry manner. Thus, for bit position $m$ we calculate $c_m^0$ and $c_m^1$ from $G_{m-1:j}^0$ and $G_{m-1:j}^1$, respectively, which in turn are calculated from (see Equation (5.19))

$$(P_{m-1:j}, G_{m-1:j}^0) = (P_{m-1}, G_{m-1}) \circ (P_{m-2}, G_{m-2}) \circ \cdots \circ (P_j, G_j)$$  (5.30)

and

$$(P_{m-1:j}, G_{m-1:j}^1) = (P_{m-1:j}, G_{m-1:j}^0) \circ (1, 1) = (P_{m-1:j},\; G_{m-1:j}^0 + P_{m-1:j}).$$  (5.31)

Note that $P_{m-1:j}$ has no superscript since it is independent of the incoming carry. Once the individual bit carries have been calculated, the corresponding sum bits are $s_m^0 = c_m^0 \oplus P_m$ and $s_m^1 = c_m^1 \oplus P_m$. Since $c_{i+1}^0$ implies $c_{i+1}^1$ (i.e., if $c_{i+1}^0 = 1$ then $c_{i+1}^1$ must equal 1) we can simplify Equation (5.29) to read

$$c_{i+1} = c_{i+1}^0 + c_{i+1}^1 \cdot c_j.$$  (5.32)

FIGURE 5.13 The $l$th group, consisting of the $k$ bit positions $j, j+1, \ldots, i$, in a carry-select adder.

The sizes of the separate groups can be either different (e.g., [33]), or they can all be equal to $k$ (e.g., [1]), with possibly one group of size smaller than $k$. In the first case the size of the $l$th group is chosen so as to equalize the delay of the ripple-carry within the group and the delay of the carry-select chain from group 1 to group $l$. The above two delays depend on the technology employed and the exact implementation. If, for example, we assume a simple two-level gate implementation for the multiplexing circuit corresponding to Equation (5.32), then the delay associated with the carry-select chain through the preceding $l - 1$ groups is $(l - 1) \cdot 2\Delta_G$, where $\Delta_G$ is the delay of a single gate. The delay of the ripple-carry chain through the $k_l$ bit positions in the $l$th group, when again we assume a simple two-level gate implementation, is $k_l \cdot 2\Delta_G$. Equalizing these two delays results in

$$k_l = l - 1 \ \text{with}\ k_l \ge 1 ; \quad l = 1, 2, \ldots, L$$  (5.33)

where $L$ is the number of groups, as shown below.

| $k_L$ | ... | $k_2$ | $k_1$ |

In other words, the group lengths should follow the simple arithmetic progression 1, 1, 2, 3, ..., and the total number of bits, $n$, must satisfy

$$1 + L(L - 1)/2 \ge n,$$  (5.34)

and consequently

$$L(L - 1) \ge 2(n - 1).$$  (5.35)

As a result, the size of the largest group and the execution time of the carry-select adder are of the order of $\sqrt{n}$. For example, for $n = 32$, based on Equation (5.35) nine groups are required. One possible choice for their sizes is 1, 1, 2, 3, 4, 5, 6, 7 and 3. The total carry propagation time, under the assumption of a two-level gate implementation, is $18\Delta_G$, instead of $62\Delta_G$ for the ripple-carry adder.

If the lengths of all $L$ groups are equal, the carry-select chain (i.e., generating the Group_Carry-out from the Group_Carry-in, see Figure 5.13) does not necessarily have to be of the ripple-carry type. Instead, a single or even multiple-level carry-look-ahead can be employed [1].

Compared to the ripple-carry adder, the carry-select adder requires a duplicate carry-chain logic and additional carry-select logic. However, this logic circuit overhead can be reduced by observing that the two separate carry-chains in Figure 5.13 and the multiplexing circuitry can be combined to yield a simpler implementation [33]. In such an implementation each bit position includes the following logic circuitry

$$G_m = x_m y_m ; \quad P_m = x_m \oplus y_m ; \quad P_{m:0} = P_m \cdot P_{m-1:0},$$

and

$$c_m^0 = G_m + P_m \cdot c_{m-1}^0 ; \quad s_m = P_m \oplus (c_{m-1}^0 + c_j \cdot P_{m-1:0}).$$  (5.36)
The proof of these equations is left to the reader as an exercise. Other variations of the carry-select adder have been proposed and implemented, with some of these described in Section 5.10.

5.9 CARRY-SKIP ADDERS

A carry-skip adder reduces the time needed to propagate the carry by skipping over groups of consecutive adder stages. As such, the carry-skip adder generalizes the idea behind the Manchester adder described in Section 5.4. The carry-skip adder illustrates the dependence of the "optimal" algorithm for addition on the available technology. Although the carry-skip algorithm has been known for many years, it has become popular only recently. In VLSI technology the carry-skip adder is comparable in speed to the carry-look-ahead technique (for commonly used word lengths, but not necessarily in the asymptotic sense), but it requires less chip area and consumes less power.

The carry-skip adder is based on the following observation. The carry propagation process can skip any adder stage for which $x_m \ne y_m$ (or in other words, $P_m = x_m \oplus y_m = 1$). Several consecutive stages can be skipped if all satisfy $x_m \ne y_m$. Thus, an adder consisting of $n$ stages is divided into groups of consecutive stages with a simple ripple-carry scheme used in each group. Every group also generates a group-carry-propagate signal that equals 1 if all stages internal to the group satisfy $P_m = 1$. This signal can be used to allow an incoming carry into the group to "skip" all the stages within the group and generate a group-carry-out. Let a particular group, say, group $l$, consist of the $k$ bit positions $j, j+1, \ldots, j+k-1$ as shown in Figure 5.14. Based on Equation (5.16) the Boolean expression for Group_l_Carry-out is

$$\text{Group\_l\_Carry-out} = G_{i:j} + P_{i:j} \cdot \text{Group\_l\_Carry-in}$$

where $G_{i:j}$ equals 1 when a carry is generated internal to the group and is allowed to propagate through all the remaining bit positions including $i$. $P_{i:j}$ equals 1 when the $i - j + 1$ bit positions allow the incoming carry $c_j$ to propagate to the next bit position, $i + 1$. The buffers shown in the figure realize the OR operation in the above Boolean expression.

FIGURE 5.14 The $l$th group consisting of bit positions $j, j+1, \ldots, i$ in a carry-skip adder.

Figure 5.15 depicts a 15-bit carry-skip adder consisting of three groups, each of size 5. Notice that the signals $P_{i:j}$ for all groups can be generated simultaneously, allowing a fast skip of groups which satisfy $P_{i:j} = 1$.

FIGURE 5.15 A 15-bit carry-skip adder.

We wish to determine the optimal size of the group, $k$. This optimal size depends on the ratio between the carry-ripple time through a single stage, denoted by $t_r$, and the time it takes to skip a group of size $k$, denoted by $t_s(k)$. The latter is, for most implementations, independent of $k$. Assume first that all groups are of the same size $k$, and, for simplicity, assume further that $n/k$ is an integer. The group size $k$ should be selected so that the time for the longest carry-propagation chain is minimized. The longest carry-propagation chain occurs when a carry is generated in stage 0 and then propagates all the way to stage $n - 1$. This means that the carry will ripple through stages $1, 2, \ldots, k - 1$ within group 1, skip groups $2, 3, \ldots, (n/k - 1)$, then ripple through group $n/k$. The overall carry-propagation time is in this case

$$T_{carry} = (k - 1) \cdot t_r + t_b + (n/k - 2) \cdot (t_s + t_b) + (k - 1) \cdot t_r,$$  (5.37)

where $t_b$ is the delay associated with the buffer (which implements the OR operation) between two groups, as shown in Figure 5.15. If, for example, a straightforward two-level gate implementation is employed for both the ripple-carry circuit and the carry-skip circuit, then $t_r$ would equal $t_s + t_b = 2\Delta_G$, yielding

$$T_{carry} = (4k + 2n/k - 7) \cdot \Delta_G.$$

Differentiating $T_{carry}$ with respect to $k$ and equating the derivative to 0 results in

$$k_{opt} = \sqrt{n/2}.$$

As for the carry-select adder, the group size and the carry propagation time are proportional to $\sqrt{n}$. For example, for $n = 32$, eight groups of size $k_{opt} = \sqrt{16} = 4$ will provide the best design, with $T_{carry} = 25\Delta_G$, instead of $62\Delta_G$ for the ripple-carry adder.
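The equal-group analysis above can be checked numerically. The sketch below evaluates Equation (5.37) directly and searches the divisors of $n$ for the best group size. All delays are expressed in units of $\Delta_G$; the split $t_s = t_b = \Delta_G$ is an assumption, inferred only from the fact that it makes (5.37) collapse to the closed form $(4k + 2n/k - 7)\Delta_G$. The function names are illustrative.

```python
def carry_skip_delay(n, k, t_r=2.0, t_s=1.0, t_b=1.0):
    """Worst-case carry delay of Eq. (5.37) for equal groups of size k.

    Delays are in units of one gate delay. The defaults t_r = 2, t_s = 1,
    t_b = 1 are an assumed split of the two-level gate model.
    Requires that k divides n and that there are at least two groups.
    """
    groups = n // k
    ripple_first = (k - 1) * t_r          # ripple through the first group
    skipped = (groups - 2) * (t_s + t_b)  # skip all the middle groups
    ripple_last = (k - 1) * t_r           # ripple through the last group
    return ripple_first + t_b + skipped + ripple_last


def best_equal_group_size(n, **delays):
    """Return the divisor k of n that minimizes the delay of Eq. (5.37)."""
    candidates = [k for k in range(1, n + 1) if n % k == 0 and n // k >= 2]
    return min(candidates, key=lambda k: carry_skip_delay(n, k, **delays))
```

Under these assumptions `carry_skip_delay(32, 4)` evaluates to 25 gate delays and `best_equal_group_size(32)` selects $k = 4$, in agreement with $k_{opt} = \sqrt{n/2}$. Passing other values for `t_r`, `t_s` and `t_b` shows how the best group size shifts with the technology.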
Revisiting the previous analysis, one should realize that further speed-up can be achieved if we make the size of the first and last groups even smaller than the fixed size $k$, and in this way reduce the ripple-carry delay through these groups. Also, we may increase the size of the center groups, since the skip time is usually independent of group size. Another way to reduce $T_{carry}$ is to design a second level of skip circuitry that would allow skipping two or more groups in one step. Additional levels can also be envisioned.

The idea of using unequal group sizes has already been suggested in the past [18]. Only recently have several algorithms been developed for deriving the optimal group sizes for different technologies and implementations (i.e., different values of the ratio $(t_s + t_b)/t_r$). We will now formulate the problem and illustrate its solution through several examples. Note first that, unlike the simple analysis for the equal-sized group case above, we cannot restrict ourselves to the analysis of the worst case for carry propagation. This may lead to the trivial conclusion that the first and last groups should consist of a single stage, while all remaining $n - 2$ stages should constitute a single center group. In this design, a carry generated at the beginning of the center group may ripple through all the other $n - 3$ stages, becoming the worst case. We therefore need to consider all possible carry-propagation chains that may start at an arbitrary bit position $a$ (for which $x_a = y_a$) and stop at the next position $b$ that also satisfies $x_b = y_b$, where a new carry-propagation chain (independent of the previous one) may start.

Let $k_1, k_2, \ldots, k_L$ denote the sizes of the $L$ different groups starting at position 0. Then

$$\sum_{i=1}^{L} k_i = n.$$

In the most general case, a carry-propagation chain starts at some position within group $u$, ends at some position within group $v$, and skips the groups $u + 1, u + 2, \ldots, v - 1$. In the worst case, the carry will be generated in the first position within group $u$, and will stop in the last position within group $v$. The overall carry-propagation time, denoted by $T_{carry}(u, v)$, is

$$T_{carry}(u, v) = (k_u - 1) \cdot t_r + t_b + \sum_{l=u+1}^{v-1} (t_s(k_l) + t_b) + (k_v - 1) \cdot t_r.$$  (5.38)

The set of group sizes $k_1, k_2, \ldots, k_L$ should be selected so that the longest carry-propagation chain is minimized:

$$\text{minimize} \left[ \max_{1 \le u \le v \le L} T_{carry}(u, v) \right]$$

To solve this optimization problem, the size of the groups, as well as the number of groups, $L$, must be determined. Algorithms for solving this problem have been developed, relying on either geometrical interpretations (e.g., [13]) or dynamic programming [4].

Example 5.5
The optimal organization for a 32-bit carry-skip adder with a single level of carry-skip has been derived in several different ways. This optimal organization includes $L = 10$ groups with sizes $k_1, k_2, \ldots, k_{10} = 1, 2, 3, 4, 5, 6, 5, 3, 2, 1$ for $t_s + t_b = t_r$, yielding $T_{carry} \le 9 \cdot t_r$ [13]. If $t_r = 2\Delta_G$, then $T_{carry} \le 18\Delta_G$, instead of $25\Delta_G$, as in the equal-size group case. The reader can verify that any two bit positions in any two groups $u$ and $v$ ($1 \le u \le v \le 10$) satisfy $T_{carry}(u, v) \le 9 \cdot t_r$. □

The similarities between the carry-skip and carry-select adders and their carry propagation times should not come as a surprise. Although the strategies behind the two schemes sound different, the equations relating the group-carry-out with the group-carry-in are, in both cases, variations of the same basic Equation (5.16). Only the details of the implementation vary, in particular the calculation of the sum bits. Even this difference is reduced when the multiplexing circuitry is merged into the summation logic according to Equation (5.36).

5.10 HYBRID ADDERS

Hybrid adders are adders which use a combination of two or more of the previously described methods for addition. A common approach to the design of hybrid adders is to choose one method for carry propagation and another method for sum calculation. The two hybrid adders presented in this section combine some variation of a carry-select adder for calculating the sum and a modified Manchester adder for carry propagation. Both divide the operands into groups of equal size, 8 bits each.

The first hybrid adder [20] employs the carry-select method for calculating the sum for each group of 8 bits separately, as shown in Figure 5.16. The group carry-in signal that selects one out of the two sets of sum bits is not generated in a ripple-carry manner as shown in Figure 5.13. Instead, the carries into the 8-bit groups are generated by a carry-look-ahead tree as proposed in [1]. In the case of a 64-bit adder these are $c_8$, $c_{16}$, $c_{24}$, $c_{32}$, $c_{40}$, $c_{48}$ and $c_{56}$ (see Figure 5.16).

The structure of a carry-look-ahead tree for generating these carries would be similar, but not necessarily identical, to that shown in Figure 5.8. The differences between such structures stem from variations in the blocking factor at each level of the tree and the exact implementation of a module for calculating the fundamental carry operator. If we restrict ourselves to a fixed blocking factor, the natural choices for groups of size 8 bits include 2 (as in Figure 5.8), 4 or 8.
The first choice results in the largest number of levels in the tree, while the last one results in complex modules for the fundamental carry operator with a high delay. A blocking factor of 4 represents a reasonable compromise and has been selected in [20]. A Manchester carry propagate/generate module (MCC in Figure 5.16) with a blocking factor of four is depicted in Figure 5.17. In the most general case the Manchester carry module in Figure 5.17 accepts four pairs of inputs: $(P_{i_1:i_0}, G_{i_1:i_0})$, $(P_{j_1:j_0}, G_{j_1:j_0})$, $(P_{k_1:k_0}, G_{k_1:k_0})$ and $(P_{l_1:l_0}, G_{l_1:l_0})$, where $i_1 \ge i_0$, $j_1 \ge j_0$, $k_1 \ge k_0$ and $l_1 \ge l_0$. It produces three pairs of outputs: $(P_{j_1:i_0}, G_{j_1:i_0})$, $(P_{k_1:i_0}, G_{k_1:i_0})$ and $(P_{l_1:i_0}, G_{l_1:i_0})$, under the conditions $i_1 \ge j_0 - 1$, $j_1 \ge k_0 - 1$ and $k_1 \ge l_0 - 1$. These conditions allow overlap among the input subgroups following Equation (5.15). A schematic diagram showing the operation of the carry module is depicted in Figure 5.18.

FIGURE 5.16 A schematic diagram of a 64-bit hybrid adder [20].

FIGURE 5.17 A Manchester carry module for calculating a group propagate (circuit a) and generate (circuit b) for a group of size 4.

FIGURE 5.18 A schematic diagram describing the operation of the Manchester carry module in Figure 5.17 in the general case.
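The input/output behaviour of such a module, including the permitted overlap among input subgroups, can be illustrated with a purely functional model. The sketch below is not a model of the Manchester circuit itself. It assumes the fundamental carry operator $(P, G) \circ (P', G') = (P P', G + P G')$ with $P_m = x_m \oplus y_m$, and the names `group_pg`, `carry_op` and `carry_module` are hypothetical.

```python
def group_pg(x, y, hi, lo):
    """Return (P_{hi:lo}, G_{hi:lo}) computed bit by bit from the operands."""
    P, G = 1, 0  # (1, 0) leaves the first pair unchanged under the operator
    for m in range(lo, hi + 1):
        xm, ym = (x >> m) & 1, (y >> m) & 1
        p, g = xm ^ ym, xm & ym
        P, G = p & P, g | (p & G)  # (p_m, g_m) o (P, G)
    return P, G


def carry_op(high, low):
    """Fundamental carry operator: (P, G) o (P', G') = (P P', G + P G')."""
    P_h, G_h = high
    P_l, G_l = low
    return P_h & P_l, G_h | (P_h & G_l)


def carry_module(pg_i, pg_j, pg_k, pg_l):
    """Functional model of a blocking-factor-4 carry module.

    Takes the pairs for subgroups i1:i0, j1:j0, k1:k0, l1:l0 (least
    significant first) and returns the pairs for j1:i0, k1:i0 and l1:i0.
    Adjacent subgroups may overlap, as allowed by the stated conditions.
    """
    out_j = carry_op(pg_j, pg_i)
    out_k = carry_op(pg_k, out_j)
    out_l = carry_op(pg_l, out_k)
    return out_j, out_k, out_l
```

For instance, feeding the module the pairs for the subgroups 1:0, 2:1, 3:3 and 5:3 (where 2:1 overlaps 1:0 and 5:3 overlaps 3:3) yields the same pairs as `group_pg` computes directly for 2:0, 3:0 and 5:0. The overlap is harmless because a carry counted twice in the overlapping positions does not change the logical OR and AND results.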
The first level of the carry-look-ahead tree for a 64-bit adder includes 14 Manchester carry modules and calculates $(P_{3:0}, G_{3:0})$, $(P_{7:4}, G_{7:4})$, ..., $(P_{55:52}, G_{55:52})$, i.e., only the outputs $P_{3:0}$ and $G_{3:0}$ in Figure 5.17 are utilized. In the second level of the carry-look-ahead tree each Manchester carry module generates two pairs of outputs, corresponding to $(P_{3:0}, G_{3:0})$ and $(P_{1:0}, G_{1:0})$ in Figure 5.17. Thus, the second level of modules provides the values $(P_{7:0}, G_{7:0})$, $(P_{15:0}, G_{15:0})$, $(P_{23:16}, G_{23:16})$, $(P_{31:16}, G_{31:16})$, $(P_{39:32}, G_{39:32})$, $(P_{47:32}, G_{47:32})$, and $(P_{55:48}, G_{55:48})$ (see Figure 5.16). This level generates the carries $c_8$ and $c_{16}$, which equal $G_{7:0}$ and $G_{15:0}$, respectively, since $c_0$ is already incorporated into the module that generates $(P_{3:0}, G_{3:0})$ in the first level [20] (see Equation (5.18)).

The available inputs, required outputs and the dependence among them in the third level of the carry-look-ahead tree are shown in Figure 5.19. Clearly, two Manchester carry modules are sufficient to produce the required outputs in Figure 5.19. One such module can implement the relationships included in the dashed box in this figure. This module will generate the carries $c_{24}$, $c_{32}$ and $c_{40}$. A second module can produce the two remaining pairs of required outputs with the inputs corresponding to 55:48, 47:32, 31:16 and 15:0. This module will generate the carries $c_{48}$ and $c_{56}$. Notice in particular that the first module, described by the dashed box in Figure 5.19, must implement the two dotted lines from 23:16 in the figure, since 23:16 is required for generating 23:0.

FIGURE 5.19 The available inputs and required outputs at the third level of the carry-look-ahead tree. (A dotted line represents an optional dependence.)

The above described implementation of a 64-bit adder, combining the carry-select scheme for generating the sum bits and a Manchester-based carry-look-ahead tree for generating the carries into the groups, is not unique and does not necessarily minimize the overall execution time. Alternate implementations, including variable sizes of the carry-select groups and of the Manchester carry modules at the different levels of the tree, may prove to achieve a lower execution time.

The 64-bit adder described in [6] divides the 64 bits into two sets of size 32 bits. Each set of 32 bits is, in turn, further divided into four groups of 8 bits. For every group of eight bits, two sets of conditional sum outputs are generated separately. The two most significant groups are then combined into a single larger group of size 16. This larger group is further combined with the next group of size 8 to form an even larger group of size 24 bits, and so on, following the principle of conditional-sum addition. However, the way the input carries for the basic 8-bit groups are generated is completely different from the method described in Section 5.3.

A schematic diagram depicting the low-order half of the 64-bit adder is shown in Figure 5.20. The Manchester carry chain unit in this figure generates the $P_m$, $G_m$ and $c_m$ signals for the individual bits and also produces the $c_{out}^0$ and $c_{out}^1$ outputs for the assumed incoming carry of 0 and 1, respectively. These conditional carry-out signals control the multiplexers and determine which conditional sum bits are forwarded to the next level of multiplexing. The two sets of dual multiplexers (of sizes 8 and 16 bits) and the single regular multiplexer of size 24 bits are implemented using the circuits shown in Figure 5.21.

FIGURE 5.20 A schematic diagram of a 32-bit hybrid adder.

The high-order half of the 64-bit adder has a structure similar to that in Figure 5.20. The
main difference is that the incoming carry, $c_{32}$, is calculated by a separate carry-look-ahead circuit whose inputs are the conditional carry-out signals generated by the four Manchester carry units in Figure 5.20. This allows the operation of the high-order half of the 64-bit adder to overlap the operation of the low-order half. In summary, this adder combines variants of three different techniques for fast addition: Manchester carry generation, carry-select and conditional-sum.

FIGURE 5.21 The basic circuit for a dual multiplexer (a) and a single multiplexer (b). The signals $c^0$ and $c^1$ are generated by the Manchester carry unit of the preceding group of 8 bits.

Other designs of hybrid adders can be envisioned, combining variations of the basic methods for fast addition, possibly implementing, for example, groups with unequal numbers of bits. One such adder, a Manchester adder with variable (group size) carry-skip, has been proposed and analyzed in [3]. The "optimality" of hybrid adders is highly dependent on the available technology and its particular delay parameters.

5.11 CARRY-SAVE ADDERS

When three or more operands are to be added simultaneously (e.g., in multiplication) using two-operand adders, the time-consuming carry-propagation must be repeated several times. If the number of operands is $k$, then carries have to propagate $(k - 1)$ times. Several techniques for multiple operand addition that attempt to lower the carry-propagation penalty have been proposed and implemented. The technique that is most commonly used is carry-save addition. In carry-save addition, we let the carry propagate only in the last step, while in all the other steps we generate a partial sum and a sequence of carries separately. Thus, the basic carry-save adder (CSA) accepts three $n$-bit operands and generates two $n$-bit results, an $n$-bit partial sum and an $n$-bit carry. A second CSA accepts these two bit-sequences and another input operand, and generates a new partial sum and carry. A CSA is therefore capable of reducing the number of operands to be added from 3 to 2, without any carry propagation.

A carry-save adder may be implemented in several different ways. In the simplest implementation, the basic element of the carry-save adder is a full adder with three inputs, $x$, $y$, and $z$, whose arithmetic operation can be described by

$$x + y + z = 2c + s,$$  (5.39)

where $s$ and $c$ are the sum and carry outputs, respectively. Their values are

$$s = (x + y + z) \bmod 2 \quad \text{and} \quad c = \frac{(x + y + z) - s}{2}.$$  (5.40)

The outputs are the weighted binary representation of the number of 1's in the inputs. We therefore call the FA a (3,2) counter, as shown in Figure 5.22. An $n$-bit CSA consists of $n$ (3,2) counters operating in parallel with no carry links interconnecting them.

FIGURE 5.22 A (3,2) counter.

A carry-save adder for four 4-bit operands $X$, $Y$, $Z$, and $W$ is shown in Figure 5.23. The upper two levels are 4-bit CSAs, while the third level is a 4-bit carry-propagating adder (CPA). The latter is a ripple-carry adder, but may be replaced by a carry-look-ahead adder or any other fast CPA. One should note that partial sum bits and carry bits are interconnected to guarantee that only bits having the same weight are added by any (3,2) counter.

In order to add the $k$ operands $X_1, X_2, \ldots, X_k$ we need $(k - 2)$ CSA units and one CPA. If the CSAs are arranged in a cascade, as in Figure 5.23, then the time to add the $k$ operands is

$$(k - 2) \cdot T_{CSA} + T_{CPA},$$
where $T_{CPA}$ is the operation time of a CPA and $T_{CSA}$ is the operation time of a CSA, which equals the delay of a full adder, $\Delta_{FA}$. The latter is at least $2 \cdot \Delta_G$, where $\Delta_G$ is the delay of a single gate. Note that the final result may reach a length of $n + \lceil \log_2 k \rceil$ bits, since the sum of $k$ operands, of size $n$ bits each, can be as large as $(2^n - 1)k$.

FIGURE 5.23 A carry-save adder for four operands.

A better way to organize the CSAs, and reduce the operation time, is in the form of a tree, commonly called a Wallace tree [34]. A six-operand Wallace tree is illustrated in Figure 5.24. The left arrows on the carry outputs of the CSAs indicate that these outputs have to be shifted to the left before being added to the sum bits, as shown in Figure 5.23. In this tree, the number of operands is reduced by a factor of 2/3 at each level. Thus,

$$k \cdot \left(\frac{2}{3}\right)^{l} \le 2$$

where $l$ is the number of levels required. Consequently,

$$\text{Number of levels} \approx \frac{\log (k/2)}{\log (3/2)}.$$  (5.41)

FIGURE 5.24 (a) A CSA tree for six operands. (b) An implementation of a 6-input bit-slice of the tree in (a).

Equation (5.41) provides only an estimate of the number of levels, since at each level the number of operands must be an integer. Thus, if $N_i$ is the number of operands at level $i$, then the number of operands at the level $(i + 1)$ above can be at most $\lfloor N_i \cdot 3/2 \rfloor$ (where the floor $\lfloor x \rfloor$ of a number $x$ is the largest integer that is smaller than or equal to $x$). The number of operands at the bottom level (i.e., level 0) is 2, so that the maximum number of operands at level 1 is 3 and the maximum number of operands at level 2 is $\lfloor 9/2 \rfloor = 4$. The resulting sequence of numbers is 2, 3, 4, 6, 9, 13, 19, 28, etc. Starting with five operands, we still need three levels, as we do for six operands. The entries in Table 5.1 were generated using similar arguments. This table shows the exact number of levels required for up to 63 operands.

Example 5.6
For $k = 12$, five levels are needed, resulting in a delay of $5 \cdot T_{CSA}$ instead of $10 \cdot T_{CSA}$, which is the delay for a linear cascade of 10 CSAs. □

Examining Table 5.1, we may note that the most economical implementation (in terms of number of levels) is achieved when the number of operands is an element of the series 3, 4, 6, 9, 13, 19, 28, .... Thus, for a given number of operands, say $k$, which is not an element of this series, we need to use only enough CSAs to reduce $k$ to the closest (and smaller than $k$) element in the above series. For example, for $k = 27$, we may use 8 CSAs (with 24 inputs) rather than 9 CSAs in the top level, so that the number of operands in the next level will be $8 \cdot 2 + 3 = 19$, which is an element of the series. The remaining part of the tree will have its operands follow the series.
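The level count and the carry-save reduction itself can be sketched in a few lines. The code below is an illustrative model only: operands are assumed to be non-negative Python integers, a CSA is modelled word-wide with bitwise operations, and the final CPA is modelled by ordinary integer addition. The function names are not from the text.

```python
def csa(x, y, z):
    """One carry-save adder: three operands in, (partial sum, carry) out.

    The carry word is shifted left by one so that both outputs can be
    added with bits of equal weight aligned, i.e. x + y + z == s + c.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def csa_levels(k):
    """Number of CSA levels for k operands, from N_{i+1} = floor(3 N_i / 2)."""
    levels, capacity = 0, 2
    while capacity < k:
        capacity = capacity * 3 // 2
        levels += 1
    return levels


def csa_tree_sum(operands):
    """Reduce the operands with a tree of CSAs, then add the last two.

    Returns (total, levels_used). At each level as many full triples as
    possible are fed to CSAs; leftover operands pass through unchanged.
    """
    ops = list(operands)
    levels = 0
    while len(ops) > 2:
        full = len(ops) // 3 * 3
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(csa(ops[i], ops[i + 1], ops[i + 2]))
        nxt.extend(ops[full:])
        ops = nxt
        levels += 1
    return sum(ops), levels  # sum() stands in for the final CPA
```

With twelve operands, `csa_tree_sum(range(1, 13))` returns the total 78 after five levels, as in Example 5.6, and `csa_levels` reproduces the sequence 2, 3, 4, 6, 9, 13, 19, 28 as the largest operand counts that fit in 0, 1, 2, ... levels.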
12 5. Fast AddItion 5 11 Corry-Save Adders 120 N limber ,)f nperllnds N uml)('r of l\'v\'ls 3 1 -t 2 5 $ I.- $ 6 3 7 :S I.- :S 9 .1 10 :S 1.' :S 13 5 1..\ :S 1.' :S H) 6 20 < k  2 i 29  k  42 8 ..\3 < I." < 63 9 ZI :T2 Z1 r. r r" 7:7 or m  rl092(1.' + 1)1. I I I I I 1 I TABLE 5.1 The number of levels In a CSA tree for k operands. The idea of using a (3,2) counter to form nl\llt.i-opf'r<\lld adders can be I'xh.'nd\>tl to a (7,3) counler, whO&' t.hr\'\' outputs rt'preSt'nt the nmuber of l's in its s,\'('n inputs. Anotlu'r ('xampl\' is thl' (15...\) counh'r or. in genf'ral, auv (1.', m) countf'r wher(' 1.' nnd m satisfy Sa 51 So 2 m - 1  1.' AGURE 5.25 A (7,3) counter using (3.2) counters. A (7,3) counter, for £'-xample. C'NI be implemeuted using (3.2) l"mmlers as shown in Figure 5.25, whl'r(' interlll\>diate re:;ults are nddt>d according tu t.heir w(ibt. Ho\\e\'er. (.his implellwntatioll re<luirl"S four (3,2) cuulltt-'rs arrnng\>tl in tlm'(' lev- els and therefor\' pn)\'idt-' nO spL'('(.I-up compl\f\'tl to Nl implem('ntation ba."o.l 011 (3,2) counters. A (i,3) counter ("<\n also b,' impl('nll'nt('(1 dir('("Uy as a multi- I('\'£'l circuit tint may hav(' a smaller overall delay depending on thp part.icular It'C"Jmology £'mployt'<1 (21). Sil1l'e the number of int.erconnC'Ctions that a cin.'uit n'quires grl'at.ly alfl'!C'ts its silicon area, a (7,3) oount('r is prl'ft'rrabl\' to a (J,2) coullt.er. A (7,3) has tcu COllllL'Ct.ions and rl'moV'$ four hits while a (3,2) count('r has fivc connf'Ction-; and rt'moves only olle bit.. Anot.her impl\'Ill('ntation of the (7,3) c.'ount£'r is throuh a ROM of siZ\' 2 7 x 3 = 128 x 3 bits. The access time of t.his ROM is unlikclv to b,> slUall('r than t.he delav a..;,sociatL'<I with t.he lIup)pm('ntation ill Figure 5.25. Howc\'er, a SP\'t'd-up may bt:' arhit'v('(1 if a ROM implt'IIIt'lItatioll is usl'd for a (k, m) COllllter with higher "alues of 1.' and m. 
When several (7,3) counters (in parallel) are used to add seven operands, we obtain three results, and a second level of (3,2) counters is needed to reduce these to two results (sum and carry) to be added by a CPA. A similar situation arises when (15,4) or more complex counters are used, generating more than two results and consequently requiring a second level of counters. In some cases, the additional level of counters can be combined with the first level of counters, resulting in a more convenient implementation. In what follows, we show how the (7,3) counter can be combined with a (3,2) counter. We call the combined counter a (7;2) compressor. In general, a $(k; m)$ compressor is a variant of a counter with $k$ primary inputs, all of the same weight, say $2^i$, and $m$ primary outputs of weights $2^i, 2^{i+1}, \ldots, 2^{i+m-1}$ [9]. In addition, the compressor has several incoming carries, all of weight $2^i$, from previous compressors, and several outgoing carries of weights $2^{i+1}$ and up. The 6-input bit-slice shown in Figure 5.24(b) is a trivial example of a (6;2) compressor where all outgoing carries have the same weight $2^{i+1}$ and the number of these carries equals the number of incoming carries and is also equal, in general, to $k - 3$. A straightforward implementation of a (7;2) compressor is shown in Figure 5.26, where the bottom right (3,2) counter is the additional (3,2) counter, while the remaining four (3,2) counters constitute the ordinary (7,3) counter that is depicted in Figure 5.25. A (7;2) compressor in column $i$ has seven primary inputs of weight $2^i$ and two carry inputs from columns $(i - 1)$ and $(i - 2)$. It generates two primary outputs, denoted by $S_{2^i}$ and $S_{2^{i+1}}$, reflecting their weights, and two outgoing carries $C_{2^{i+1}}$ and $C_{2^{i+2}}$, to columns $(i + 1)$ and $(i + 2)$, respectively.
Note that the input carries to the (7;2) compressor in Figure 5.26 do not participate in the generation of the two output carries, in order to avoid a slow carry-propagation chain. Also notice that this compressor is not a (9,4) counter, since it has two outputs ($S_{2^{i+1}}$ and $C_{2^{i+1}}$) with the same weight. The implementation depicted in Figure 5.26 does not offer any speedup. A different, possibly multilevel logic, implementation may yield a smaller overall delay as long as the generation of the outgoing carries remains independent of the incoming carries.
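As a reference point for the (3,2)-counter building block used in Figures 5.25 and 5.26, the following sketch models a (7,3) counter built from four (3,2) counters in three levels. The exact wiring of Figure 5.25 is not reproduced here; the arrangement below is an assumed one that merely respects the stated structure (four counters, three levels, intermediate results added according to their weight), and the function names are hypothetical.

```python
from itertools import product


def counter_3_2(a, b, c):
    """(3,2) counter (full adder): returns (sum bit, carry bit) of three input bits."""
    total = a + b + c
    return total & 1, total >> 1


def counter_7_3(bits):
    """(7,3) counter built from four (3,2) counters (assumed wiring).

    Returns (s2, s1, s0), the 3-bit count of 1's among the seven inputs.
    """
    x0, x1, x2, x3, x4, x5, x6 = bits
    # Level 1: two counters compress six of the seven weight-1 inputs.
    sum_a, carry_a = counter_3_2(x0, x1, x2)
    sum_b, carry_b = counter_3_2(x3, x4, x5)
    # Level 2: the remaining weight-1 bits give s0 and one more weight-2 carry.
    s0, carry_c = counter_3_2(sum_a, sum_b, x6)
    # Level 3: the three weight-2 carries give s1 (weight 2) and s2 (weight 4).
    s1, s2 = counter_3_2(carry_a, carry_b, carry_c)
    return s2, s1, s0


if __name__ == "__main__":
    # Exhaustive check over all 128 input combinations.
    for bits in product((0, 1), repeat=7):
        s2, s1, s0 = counter_7_3(bits)
        assert 4 * s2 + 2 * s1 + s0 == sum(bits)
    print("all 128 input combinations counted correctly")
```

Adding a fifth (3,2) counter that combines the outputs with incoming carries would turn this model into a compressor of the kind discussed above, at the cost of having two outputs of equal weight.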
FIGURE 5.26 A (7;2) compressor with two incoming carries and two outgoing carries, in bit position $i$.

FIGURE 5.27 A (5,5,4) counter. The dots represent input or output bits.

All the previously described counters are single-column counters. For multi-operand addition we can generalize these single-column counters into multiple-column counters. We define a generalized parallel counter as a counter that adds $l$ input columns and produces an $m$-bit output [31]. The notation we use for such a counter is

$$(k_{l-1}, k_{l-2}, \ldots, k_0, m)$$

where $k_i$ is the number of input bits in the $i$-th column with weight $2^i$. Clearly, a $(k, m)$ counter is a special case of this generalized counter.

A reasonable way of implementing generalized counters is by using ROMs. For example, the (5,5,4) counter shown in Figure 5.27 can be realized with a $2^{(5+5)} \times 4$ ROM (i.e., $1024 \times 4$). The (5,5,4) counters conveniently reduce the input operands to two intermediate results, requiring only one CPA to produce the final sum. In the general case, a string of $(k_{l-1} = k, \ldots, k_0 = k, m)$ counters may generate more than two intermediate results, requiring additional reduction before a CPA can be used. To find out the number of intermediate results generated by a set of $(k, k, \ldots, k, m)$ counters, consider the following. A set of $(k, k, \ldots, k, m)$ counters, with $l$ columns each, produces $m$-bit outputs at intervals of $l$ bits. Any column has at most $\lceil m/l \rceil$ output bits. Thus, $k$ operands can be reduced to $s = \lceil m/l \rceil$ operands. If $s = 2$, a single CPA can generate the final sum. Otherwise, further reduction, from $s$ to 2, is needed.
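The sizing of a generalized counter can be explored numerically. The sketch below is an illustration under stated assumptions: the helper names `min_output_bits` and `intermediate_results` are hypothetical, the list `column_heights[i]` holds the number of input bits of weight $2^i$, and the output width is taken to be the smallest $m$ for which $2^m - 1$ is at least the largest possible weighted sum of the inputs.

```python
def min_output_bits(column_heights):
    """Smallest m such that 2**m - 1 >= sum of k_i * 2**i (hypothetical helper).

    column_heights[i] is k_i, the number of input bits of weight 2**i.
    """
    largest_sum = sum(k << i for i, k in enumerate(column_heights))
    m = 0
    while (1 << m) - 1 < largest_sum:
        m += 1
    return m


def intermediate_results(column_heights):
    """Number of results s = ceil(m / l) left after one set of such counters."""
    m = min_output_bits(column_heights)
    l = len(column_heights)
    return -(-m // l)   # integer ceiling of m / l


if __name__ == "__main__":
    for heights in ([3], [7], [5, 5], [7, 7]):
        print(heights, min_output_bits(heights), intermediate_results(heights))
```

For two columns of height 5 the sketch gives $m = 4$ and two intermediate results, while two columns of height 7 need $m = 5$ and leave three results.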
The number of outputs $m$ must satisfy

$$2^m - 1 \ge \sum_{i=0}^{l-1} k_i 2^i. \qquad (5.42)$$

If all $l$ columns have the same height $k$ (i.e., $k_0 = k_1 = \cdots = k_{l-1} = k$), then the inequality that has to be satisfied is

$$2^m - 1 \ge k \cdot (2^l - 1). \qquad (5.43)$$

A simple example of these counters is the (5,5,4) counter shown in Figure 5.27. For this counter, $k = 5$, $l = 2$ and $m = 4$, and inequality (5.43) turns into an equality, implying that all 16 combinations of the output bits are useful. (5,5,4) counters can be used to reduce five operands (of any length) to two results that can then be added with a CPA. The length of the operands will determine the number of (5,5,4) counters in parallel.

Example 5.7
If the number of bits per column in a two-column counter $(k, k, m)$ is increased beyond 5, then $m \ge 5$ and, as a result, the number of intermediate results is $s = \lceil m/2 \rceil > 2$. For example, if $k = 7$, the inequality that must be satisfied is $2^m - 1 \ge 7 \cdot 3 = 21$, and therefore $m = 5$. A set of (7,7,5) counters will generate $s = 3$ operands, and consequently another set of (3,2) counters is needed in order to reduce the number of operands to 2. □

The hardware complexity of a carry-save adder for a large number of operands might be prohibitive, independent of the particular type of parallel counters employed. One way to reduce the hardware complexity is to design a smaller carry-save tree and use it iteratively. The $n$ operands are divided into $\lceil n/j \rceil$ groups of $j$ operands each, and a tree for $j + 2$ operands with two feedback paths and a CPA is designed, as shown in Figure 5.28. The two feedback paths make it necessary to complete the first pass through the CSA tree before
the second set of $j$ operands is applied. This slows down the execution of the multiple-operand addition, since pipelining is not possible. In the next section we discuss pipelining in general and describe ways to modify the tree structure in Figure 5.28 to support pipelining.

FIGURE 5.28 A CSA tree with two feedback paths and $j$ new operands.

5.12 PIPELINING OF ARITHMETIC OPERATIONS

FIGURE 5.29 A three-stage pipeline.

In the three-stage pipeline of Figure 5.29, while stage 2 executes step 2 of the algorithm, stage 1 can start the execution of step 1 on the next set of operands $X$ and $Y$. A common way to illustrate the way a pipeline operates is through a timing diagram like the one in Figure 5.30, which shows the exact timing of four successive additions with operands $X_1$ & $Y_1$, $X_2$ & $Y_2$, $X_3$ & $Y_3$, and $X_4$ & $Y_4$, producing the results $Z_1$, $Z_2$, $Z_3$, and $Z_4$, respectively. Let $\tau_i$ denote the execution time of stage $i$ and let $\tau_l$ denote the time needed to store new data into a latch. In general, the delays associated with the different pipeline stages are not identical, and faster stages must wait until the slowest stage completes its task before they can all switch to the next task. Therefore, the time interval between two successive final results being produced by the pipeline is

$$\tau = \max_{1 \le i \le k} \{\tau_i\} + \tau_l \qquad (5.44)$$

where $k$ is the number of stages in the general case. The time interval $\tau$ is also called the pipeline period, and $1/\tau$ is called the pipeline rate or bandwidth.

Pipelining is a very well known technique for accelerating the execution of successive identical operations.
Instead of designing a circuit capable of executing a single operation on one set of operands at a time, we design one that is partitioned into several subcircuits that can operate independently on consecutive sets of operands. This way the executions of several successive operations overlap, and the rate at which results are produced is considerably higher than that of a nonpipelined design. To allow pipelining, the algorithm is divided into several steps, and a suitable circuit is designed for each of these steps. The separate circuits, which are called pipeline stages, must be allowed to operate independently on different sets of operands. To achieve this goal, storage elements (latches) must be added between adjacent stages, so that when a stage works on one set of operands, the preceding stage can work on the next set of operands. An example of a pipeline for addition (or any other two-operand operation) consisting of three stages is depicted in Figure 5.29. Here, the addition of the two operands $X$ and $Y$ is performed in three steps. The latch between stage 1 and stage 2 stores the intermediate results of step 1, which are then used by stage 2.

FIGURE 5.30 A timing diagram of a three-stage pipeline.
The clock signal synchronizing the pipeline's operation must be set so that the clock period is equal to or larger than $\tau$. Figure 5.30 shows the case where the clock period equals $\tau$. Here, after a latency of $3\tau$, new results are produced at the rate of $1/\tau$. An important design decision is the partitioning of the given algorithm into steps that will be executed by the separate stages of the pipeline. These steps should preferably be defined so that they have similar execution times, since the pipeline rate is determined by the execution time of the slowest step. The number of steps must then be determined. As this number increases, the pipeline period decreases, but the number of latches goes up (increasing the cost of implementation), and so does the latency of the pipeline. The latency is the time elapsed until the first result is produced. This is especially important when only a single pass through the pipeline is required (e.g., addition of only one pair of operands). Thus, there is a tradeoff between the latency and implementation cost on one hand and the pipeline rate on the other hand. The extra delay due to the latches, $\tau_l$, can be lowered by using special circuits like the Earle latch [11].

5.12.1 Pipelining of Adders

The relative simplicity of two-operand adders usually does not justify their implementation as pipelines. However, in special-purpose designs, when many successive additions are needed, such implementations are justifiable. The conditional-sum adder can be easily implemented as a pipeline. One way of doing this is to have $\log_2 n$ stages corresponding to the $\log_2 n$ steps in the conditional-sum algorithm. This allows us to overlap the execution of up to $\log_2 n$ additions. However, the required number of latches may be excessive.
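The timing relations of a pipeline can be made concrete with a small calculation. The sketch below is illustrative only: the function names are hypothetical, the delay values in the usage lines are made-up numbers in arbitrary time units, and the clock period is assumed to equal the pipeline period (slowest stage delay plus latch delay), with one latch delay charged per stage.

```python
def pipeline_period(stage_delays, latch_delay):
    """Pipeline period: the slowest stage delay plus the latch delay."""
    return max(stage_delays) + latch_delay


def pipeline_latency(stage_delays, latch_delay):
    """Time until the first result appears: one period per stage."""
    return len(stage_delays) * pipeline_period(stage_delays, latch_delay)


def pipeline_total_time(stage_delays, latch_delay, num_operations):
    """Time to complete num_operations successive operations.

    The first result needs the full latency; every further result
    follows one period after the previous one.
    """
    period = pipeline_period(stage_delays, latch_delay)
    return (len(stage_delays) + num_operations - 1) * period


if __name__ == "__main__":
    # Hypothetical four-step algorithm with one stage per step.
    fine = [2, 2, 2, 2]
    # The same algorithm with pairs of steps merged into two stages.
    coarse = [4, 4]
    for stages in (fine, coarse):
        print(stages,
              pipeline_period(stages, 1),
              pipeline_latency(stages, 1),
              pipeline_total_time(stages, 1, 100))
```

With these made-up numbers, merging steps lowers the latency and halves the number of latch sets, but lengthens the period, which is the tradeoff between latency, cost, and pipeline rate.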
Two (or even more) steps can be combined to form a single stage in the pipeline, reducing the latches' overhead and the latency.

The carry-look-ahead adder, described in Section 5.2, cannot be pipelined, since some carry signals must propagate backward (see, e.g., Figure 5.2). However, different designs of the carry-look-ahead adder, following the approach described in Section 5.5, can be pipelined. Here, the final carries and the carry-propagate signals (implemented as $P_i = x_i \oplus y_i$ rather than $P_i = x_i + y_i$) can be used to calculate the sum bits, eliminating the need for feedback connections.

Clearly, pipelining is more beneficial in the case of multiple-operand adders, like the carry-save adders described in Section 5.9. Modifying the implementation of CSA trees (see, for example, Figure 5.24) to form a pipeline is straightforward and requires only the addition of latches. These can be added at each level of the tree if maximum bandwidth is desired, or two (or more) levels of the tree can be combined to form a single stage of the pipeline, reducing the overall number of latches and the pipeline latency. If the hardware complexity of the CSA tree for a large number of operands is prohibitive, a partial tree like the one shown in Figure 5.28 can be designed.

FIGURE 5.31 A CSA tree allowing overlap between iterations.

However, as pointed out earlier, the two feedback connections prevent pipelining. This can be rectified by modifying these feedback connections. Instead of connecting the two intermediate results of the CSA tree to its inputs, we can connect them to the bottom level of the CSA tree structure, as shown in Figure 5.31. The modified structure now consists of a smaller tree with $j$ inputs at the top, two separate CSAs, and a set of latches at the bottom.
The two separate CSAs and latches assimilate the two intermediate results and form a pipeline stage. This way, the top CSA tree for $j$ operands can be pipelined too, and the overall time needed to add all $n$ operands is reduced considerably.

5.13 EXERCISES

5.1. Compare the two alternatives for the Boolean expression of the propagated carry, namely, $P_i = x_i \oplus y_i$ and $P_i = x_i + y_i$. What might be the benefits and drawbacks of each expression?

5.2. (a) A carry-completion adder [10] detects the completion of the carry propagation and generates a signal CC (carry complete), indicating that the sum bits are steady and can be used. The basic unit in this adder is a modified full adder with the same inputs $x_i$ and $y_i$ and output $s_i$. The carries are different, though: instead of a single incoming carry line $c_i$, there are two incoming carry lines (and, similarly, two outgoing carry lines) denoted by $c_i^0$ and $c_i^1$. $c_i^0 = 1$ if it is already known that the incoming carry is zero, and $c_i^1 = 1$ if it is already known that
the incoming carry is one. The two outgoing carries, $c_{i+1}^0$ and $c_{i+1}^1$, are defined similarly. Note that an adder stage with inputs $x_i = y_i = 0$ can never generate a carry-out and therefore can produce $c_{i+1}^0 = 1$ immediately, without waiting for any carry propagation. Similarly, an adder stage with inputs $x_i = y_i = 1$ will always generate a carry-out and can therefore produce $c_{i+1}^1 = 1$ immediately, without waiting for any carry propagation. An adder stage with inputs $x_i y_i = 01$ or $x_i y_i = 10$ will initially set both carry-out signals to $c_{i+1}^0 = c_{i+1}^1 = 0$ and will wait until one (and exactly one) of its carry-in signals becomes 1. Only then either $c_{i+1}^0$ or $c_{i+1}^1$ is set to 1. Write Boolean equations for the three outputs $s_i$, $c_{i+1}^0$ and $c_{i+1}^1$ as functions of the four inputs $x_i$, $y_i$, $c_i^0$ and $c_i^1$. Define a signal $CC_i = c_{i+1}^0 + c_{i+1}^1$, explain its meaning, and show the Boolean equation for the carry-completion signal CC.
(b) It has been shown that the average length of the longest carry propagation chain when two $n$-bit operands are added is approximately $\log_2 (5n/4)$. Estimate the average addition time and compare it to the addition time of a ripple-carry adder. Take $n = 64$ as a numerical example. What is the drawback of this carry-completion adder?

5.3. Verify that the organization with 10 groups of size $k_1, k_2, \ldots, k_{10} = 1, 2, 3, 4, 5, 6, 5, 3, 2, 1$ for a 32-bit carry-skip adder with a single level of carry-skip and $t_s + t_b = t_r$ satisfies $T_{carry} \le 9 \cdot t_r$ [13]. You may either analyze all important cases or write a program that enumerates all cases.

5.4. Estimate the carry-propagation time in a two-level carry-skip adder for 32 bits that are divided into 14 groups of size 1,2,(1,2,3),(2,3,3),(3,3,2),(3,2),2, where every set of parentheses signifies a single group in the second level.
The first five groups, of size 1,2,(1,2,3), are shown in Figure 5.32 [32]. Assume that $t_s + t_b = t_r$ and compare your estimate to the carry-propagation delay of the single-level carry-skip adder in Exercise 5.3. Is the second level of carry-skip justified?

5.7. (a) Show an implementation of a 16-bit conditional-sum adder using full adders and data selectors. Indicate how many ICs are needed. Repeat for a 24-bit adder.
(b) Estimate the execution time of the conditional-sum adder in (a) and compare it to the execution time of carry-look-ahead adders. Base your estimation on typical delays from input signals to output signals, which can be found in any IC data book.

5.8. (a) Design a 32-bit conditional-sum adder that starts off with groups of size 4. The two possible sets of outputs for each group of size 4 are generated using 74181 ICs with internal carry-look-ahead. Use appropriate data selectors in addition to the 74181 ICs.
(b) Estimate the execution time of the conditional-sum adder in (a) and compare it to the execution time of a carry-look-ahead adder.

5.9. Add to the tree structure in Figure 5.5 the minimum necessary circuitry to generate the carries $c_{10}, c_{14}, \ldots$.

5.10. Prove the expression for estimating the number of levels in a Wallace tree for $k$ operands.

5.11. Design a 4-bit counter capable of reducing the number of operands from 4 to 2 with a carry whose propagation is limited to one position.

5.12. A set of $n/2$ (5,5,4) counters can be used to reduce 5 operands ($n$-bit each) to 2 operands.
(a) What kind of counter is needed to reduce 7 operands to 2?
(b) Repeat (a) for 9 operands.
(c) Repeat (a) for $k$ operands.

5.13. One level of (5,5,4) counters can be used to reduce 5 operands to 2 operands.
What is the maximum number of operands that can be reduced to 2 when two levels of (5,5,4) counters are used?

5.14. Show an implementation of a (5,5,4) counter using (3,2) counters.

5.15. Design a (7;2) compressor with 7 inputs and 2 outputs (as in Figure 5.26) that has 4 input carries from position $(i - 1)$ and generates 4 output carries to position $(i + 1)$. Use (3,2) counters and make sure that no long carry-propagation chains are generated. Compare this design to the one shown in Figure 5.26, considering speed of operation and other factors.

5.16. Estimate the time needed to add $n$ operands using the CSA tree for $j$ new operands shown in Figure 5.28, and compare it to the time needed using the CSA tree shown in Figure 5.31 with overlap between successive iterations. Assume that $n/j$ is an integer. Use $n$ = [illegible in source] and $j$ = [illegible in source] as a numerical example.

5.17. A CSA tree for 6 operands of length 32 bits each includes 4 CSA units. What should the length of each of these CSA units be?

5.18. Prove Equation (5.36) for bit position $m$ in the reduced-complexity carry-select adder.

FIGURE 5.32 A two-level carry-skip adder.

5.5. Estimate the addition time of an 80-bit carry-look-ahead adder constructed of the ICs 74181 and 74182 for various adder configurations, including a ripple-carry between 74181 ICs and the maximum number of levels of 74182 ICs. Draw a block diagram for each configuration. Base your estimation on typical delays from input signals to output signals, which can be found in any IC data book.

5.6. Using a table similar to the one in Figure 5.4, show the various steps of the conditional-sum addition of the following two numbers, each consisting of 24 bits:

  x = 000101101100101101001111
  y = 001001110000011110110111
5.19. Show how to evaluate the [illegible in source] in the carry module shown in Figure 5.1[illegible].

5.20. Draw a diagram like the one in Figure 5.1[illegible] for the 8-bit Manchester carry chain used in Figure 5.20. For each bit position $m$ ($m = 0, 1, \ldots, 7$) there are three signals: $P_m$, $G_m$ and $K_m$. Explain why $K_m$ is needed here while it is not required in Figure 5.1[illegible].

5.14 REFERENCES

[1] O. J. Bedrij, "Carry-select adder," IRE Trans. on Electron. Computers, EC-11 (June 1962), 340-346.
[2] R. P. Brent and H. T. Kung, "A regular layout for parallel adders," IEEE Trans. on Computers, C-31 (March 1982), 260-264.
[3] P. K. Chan and M. D. F. Schlag, "Analysis and design of CMOS Manchester adders with variable carry-skip," IEEE Trans. on Computers, 39 (August 1990), 983-992.
[4] P. K. Chan, M. D. F. Schlag, C. D. Thomborson and V. G. Oklobdzija, "Delay optimization of carry-skip adders and block carry-look-ahead adders using multidimensional dynamic programming," IEEE Trans. on Computers, 41 (August 1992), 920-930.
[5] L. Dadda, "Some schemes for parallel multipliers," Alta Frequenza, 34 (March 1965), 346-356.
[6] D. W. Dobberpuhl et al., "A 200-MHz 64-b dual-issue CMOS microprocessor," IEEE J. of Solid-State Circuits, 27 (Nov. 1992), 1555-1567.
[7] R. W. Doran, "Variants of an improved carry look-ahead adder," IEEE Trans. on Computers, 37 (Sept. 1988), 1110-1113.
[8] M. J. Flynn and S. F. Oberman, Advanced computer arithmetic design, Wiley, New York, 2001.
[9] D. D. Gajski, "Parallel compressors," IEEE Trans. on Computers, C-29 (May 1980), 393-398.
[10] B. Gilchrist, J. H. Pomerene and S. Y. Wong, "Fast carry logic for digital computers," IRE Trans. on Electron. Comput., EC-4 (Dec. 1955), 133-136.
[11] T. G. Hallin and M. J. Flynn, "Pipelining of arithmetic functions," IEEE Trans. on Computers, C-21 (August 1972), 880-886.
[12] T. Han and D. A. Carlson, "Fast area-efficient VLSI adders," Proc. 8th Symp. on Computer Arithmetic, 1987, 49-56.
[13] V. Kantabutra, "Designing optimum carry-skip adders," Proc. 10th Symp.
on Computer Arithmetic, 1991, 146-153.
[14] T. Kilburn, D. B. G. Edwards and D. Aspinall, "A parallel arithmetic unit using a saturated transistor fast-carry circuit," Proc. of IEE, Pt. B, 107 (Nov. 1960), 573-584.
[15] S. Knowles, "A family of adders," Proc. 14th Symp. on Computer Arithmetic, 1999, 30-34.
[16] P. M. Kogge and H. S. Stone, "A parallel algorithm for the efficient solution of a general class of recurrence equations," IEEE Trans. on Computers, C-22 (August 1973), 786-793.
[17] R. E. Ladner and M. J. Fischer, "Parallel prefix computation," J. ACM, 27 (October 1980), 831-838.
[18] M. Lehman and N. Burla, "Skip techniques for high-speed carry propagation in binary arithmetic units," IRE Trans. on Electron. Comput., EC-10 (Dec. 1961), 691-698.
[19] H. Ling, "High-speed binary adder," IBM J. Res. Develop., 25 (May 1981), 156-166.
[20] T. Lynch and E. E. Swartzlander, Jr., "A spanning tree carry look-ahead adder," IEEE Trans. on Computers, 41 (August 1992), 931-939.
[21] M. Mehta, V. Parmar and E. Swartzlander, "High-speed multiplier design using multi-input counter and compressor circuits," Proc. 10th Symp. on Computer Arithmetic, 1991, 43-50.
[22] T. F. Ngai, M. J. Irwin and S. Rawat, "Regular area-time efficient carry-lookahead adders," J. of Parallel and Distributed Computing, 3 (1986), 92-105.
[23] V. G. Oklobdzija, "Design and analysis of fast carry-propagate adder under non-equal input signal arrival profile," Proc. 28th Asilomar Conference, (1994), 1398-1401.
[24] V. G. Oklobdzija and E. R. Barnes, "On implementing addition in VLSI technology," J. of Parallel and Distributed Computing, 5 (1988), 716-728.
[25] S. Ong and D. E. Atkins, "A comparison of ALU structures for VLSI technology," Proc. 6th Symp. on Computer Arithmetic, (June 1983), 10-16.
[26] D. S. Phatak and I. Koren, "Intermediate variable encodings that enable multiplexor based implementations of two operand addition," Proc. 14th IEEE Symp. on Computer Arithmetic, 1999, 22-29.
[27] D. S. Phatak, T. Goff and I. Koren, "Constant-time addition and simultaneous format conversion based on redundant binary representations," IEEE Trans. on Computers, 50 (2001).
[28] S. Singh and R. Waxman, "Multiple operand addition and multiplication," IEEE Trans. on Computers,
C-22 (1973), 113-120.
[29] J. Sklansky, "Conditional-sum addition logic," IRE Trans., EC-9 (June 1960), 226-231.
[30] P. S. Spira, "Computation times of arithmetic and Boolean functions in (d,r) circuits," IEEE Trans. on Computers, C-22 (June 1973), 552-555.
[31] W. J. Stenzel, W. J. Kubitz and G. H. Garcia, "A compact high-speed parallel multiplication scheme," IEEE Trans. on Computers, C-26 (Oct. 1977), 948-957.
[32] S. Turrini, "Optimal group distribution in carry-skip adders," Proc. 9th Symp. on Computer Arithmetic, 1989, 96-103.
[33] A. Tyagi, "A reduced-area scheme for carry-select adders," IEEE Trans. on Computers, C-42 (October 1993), 1163-1170.
[34] C. S. Wallace, "A suggestion for a fast multiplier," IEEE Trans. on Electron. Computers, EC-13 (February 1964), 14-17.
[35] S. Winograd, "On the time required to perform addition," J. of the ACM, 12 (1965), 277-285.

6 HIGH-SPEED MULTIPLICATION

Multiplication involves two basic operations: the generation of partial products and their accumulation. Consequently, there are two ways to speed up multiplication: reduce the number of partial products or accelerate their accumulation. Clearly, a smaller number of partial products also reduces the complexity and, as a result, reduces the time needed to accumulate the partial products.

High-speed multipliers can be classified into three general types. The first generates all partial products in parallel, and then uses a fast multi-operand adder for their accumulation. This is known as a parallel multiplier. The second, known as a high-speed sequential multiplier, generates the partial products sequentially and adds each newly generated product to the previously accumulated partial product. The third is made up of an array of identical cells that generate new partial products and accumulate them simultaneously. Thus, there are no separate circuits for partial product generation and for their accumulation. This is known as an array multiplier, and it tends to have a reduced execution time, at the expense of increased hardware complexity.

6.1 REDUCING THE NUMBER OF PARTIAL PRODUCTS

To reduce the number of partial products (and hence reduce the amount of hardware involved and the execution time) we may examine two or more bits of the multiplier at a time. However, this scheme requires the generation of the multiples $A$, $2A$, and $3A$, where $A$ is the multiplicand, as in Chapter 3.
This reduces the number of partial products to $n/2$, but each step becomes
more complex. Various algorithms for reducing the number of partial products without increasing the complexity of generating each partial product have been proposed. One of the first such algorithms was Booth's algorithm [3].

Booth's algorithm, as well as many other algorithms, is based on the fact that fewer partial products have to be generated for groups of consecutive zeros and ones. For a group of consecutive zeros in the multiplier there is no need to generate any new partial product. We only need to shift the previously accumulated partial product one bit position to the right for every 0 in the multiplier. For a group of, say $m$, consecutive 1's in the multiplier,

$$\cdots 0\,\{11 \cdots 11\}\,0 \cdots$$

fewer than $m$ new partial products can be generated. The above sequence equals the difference between the following two bit sequences, each having a single nonzero bit:

$$\cdots 0\,\{11 \cdots 11\}\,0 \cdots \;=\; \cdots 1\,\{00 \cdots 00\}\,0 \cdots \;-\; \cdots 0\,\{00 \cdots 01\}\,0 \cdots$$

Using SD (signed-digit) notation, discussed in Chapter 2, the above can be written as $\cdots 1\,\{00 \cdots 0\bar{1}\}\,0 \cdots$. For example,

$$\cdots 0\,\{1111\}\,0 \cdots \;=\; \cdots 1\,\{0000\}\,0 \cdots \;-\; \cdots 0\,\{0001\}\,0 \cdots \;=\; \cdots 1\,\{000\bar{1}\}\,0 \cdots$$

or, in decimal notation, $15 = 16 - 1$.

TABLE 6.2 Recoding the sign bit in Booth's algorithm.

  Case   x_{n-1}   x_{n-2}   y_{n-1}
  (1)    1         0         $\bar{1}$
  (2)    1         1         0

Example 6.1
The multiplier 0011110011(0) is recoded as $01000\bar{1}010\bar{1}$, requiring 4, instead of 6, add/subtract operations. The zero in parentheses is the reference bit $x_{-1}$ for $x_0$. □

Thus, instead of generating all $m$ partial products, we may generate only two partial products, with the second being complemented. In other words, the first partial product is added, while the second is subtracted. Note that the required number of single-bit shift-right operations is still $m$.
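The replacement of each run of 1's by one positive and one negative digit can be checked with a few lines of code. The sketch below is an illustration with hypothetical function names; it assumes the multiplier is given as a bit string, most significant bit first, in two's complement, and it forms each signed digit as the bit to the right of the current position (zero for the rightmost position) minus the current bit.

```python
def sd_recode(bits):
    """Signed-digit form of a two's complement bit string (hypothetical helper).

    bits is a string such as "0011110011", most significant bit first.
    Returns a list of digits in {-1, 0, 1}, most significant digit first.
    """
    x = [int(b) for b in bits]
    digits = []
    for pos, current in enumerate(x):
        # Bit to the right of the current one; a reference 0 past the end.
        right = x[pos + 1] if pos + 1 < len(x) else 0
        digits.append(right - current)
    return digits


def sd_value(digits):
    """Numerical value of a signed-digit sequence, most significant first."""
    value = 0
    for d in digits:
        value = 2 * value + d
    return value


def twos_complement_value(bits):
    """Numerical value of a two's complement bit string."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


if __name__ == "__main__":
    for bits in ("01111", "0011110011"):
        digits = sd_recode(bits)
        operations = sum(1 for d in digits if d != 0)
        print(bits, digits, sd_value(digits), operations)
```

For the multiplier of Example 6.1 the sketch yields four nonzero digits, and in general every run of 1's is replaced by a single added digit and a single subtracted digit.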
We call this operation recoding the multiplier in SD code. The simplest recoding scheme is the original Booth's algorithm, summarized in Table 6.1. In this algorithm, the current bit $x_i$ and the previous bit $x_{i-1}$ of the multiplier $x_{n-1} x_{n-2} \cdots x_1 x_0$ are examined in order to generate the $i$th bit $y_i$ of the recoded multiplier $y_{n-1} y_{n-2} \cdots y_1 y_0$. The previous bit, $x_{i-1}$, serves here only as a reference bit. At its turn, $x_{i-1}$ will be recoded to yield $y_{i-1}$, with $x_{i-2}$ serving as a reference bit. For $i = 0$, we define the reference bit $x_{-1}$ to be zero. A simple way of computing the recoded bit is through $y_i = x_{i-1} - x_i$. The recoding of the multiplier bits need not be done in any predetermined order (from the most significant bit to the least significant bit or vice versa) and can even be done in parallel for all bit positions.

When the multiplier and multiplicand are represented in two's complement, Booth's algorithm yields the correct product if the sign bit $x_{n-1}$ participates in the process. For this sign bit, we need to decide whether to perform an add or subtract operation. However, no shift operation (of the accumulated partial product) is required, since this shift operation serves only as preparation for the next step. Clearly, the correctness of the last statement has to be verified only for negative values of $X$ (for which $x_{n-1} = 1$). Thus, there are two cases that need to be examined, and they are shown in Table 6.2. In both cases the required product is, as we have seen in Chapter 3,

$$A \cdot X = A \cdot \tilde{X} - A \cdot x_{n-1} \cdot 2^{n-1} \quad \text{where} \quad \tilde{X} = \sum_{j=0}^{n-2} x_j 2^j. \qquad (6.1)$$

In case (1), $y_{n-1}$ calls for the subtraction of $A$, which is done after the partial product has been shifted $(n-1)$ times. Hence, the necessary correction is made. In case (2),
without considering the sign bit we are scanning over a string of 1's and we need to perform an addition for position $(n - 1)$. When $x_{n-1}$, which equals 1, is also considered, the required addition is not done. This is equivalent to subtracting $A \cdot 2^{n-1}$, which is the necessary correction term.

TABLE 6.1 Booth's algorithm.

  x_i   x_{i-1}   Operation            Comments                         y_i
  0     0         shift only           string of zeros                  0
  1     1         shift only           string of ones                   0
  1     0         subtract and shift   beginning of a string of ones    $\bar{1}$
  0     1         add and shift        end of a string of ones          1

Example 6.2
The following sequential multiplication illustrates case (2) in Table 6.2:

  A                  1 0 1 1              -5
  X            x     1 1 0 1              -3
  Y                  0 $\bar{1}$ 1 $\bar{1}$              recoded multiplier
  Add -A             0 1 0 1
  Shift              0 0 1 0 1
  Add A        +     1 0 1 1
                     1 1 0 1 1
  Shift              1 1 1 0 1 1
  Add -A       +     0 1 0 1
                     0 0 1 1 1 1
  Shift              0 0 0 1 1 1 1        15
This multiplication starts from the least significant bit of the multiplier. If started from the most significant bit, a longer adder/subtractor would be needed to allow for carry propagation. Also, note that there is no need to generate the recoded SD multiplier, which would require two bits per digit if generated. Instead, the bits of the original multiplier can be scanned, and appropriate control signals for the adder/subtractor can be generated. □

Booth's algorithm can handle two's complement multipliers properly, and consequently if unsigned numbers are to be multiplied, we must add a zero to the left of the multiplier (i.e., $x_n = 0$) to ensure the correctness of the result.

There are two drawbacks to Booth's algorithm. The first is that the number of add/subtract operations is variable, and so is the number of shift operations between two consecutive add/subtract operations. These are very inconvenient when designing a synchronous multiplier. Second, this algorithm becomes inefficient when there are isolated 1's; for example, 001010101(0) is recoded as $01\bar{1}1\bar{1}1\bar{1}1\bar{1}$, requiring eight, instead of four, operations. The situation can be improved by examining three bits of $X$ at a time rather than two [11]. In the resulting radix-4 modified algorithm (Table 6.3), an isolated 1 or 0 is handled more efficiently than in the original Booth's algorithm of Table 6.1. If $x_{i-1}$ is an isolated 1, $y_{i-1} = 1$, so only a single operation is needed. A similar simplification occurs if $x_{i-1}$ is an isolated 0 in a string of 1's. In this case, $\cdots 10(1) \cdots$ is recoded into $\cdots \bar{1}1 \cdots$, or, more conveniently, into $\cdots 0\bar{1} \cdots$, and again only a single operation is performed. The bits $x_i$ and $x_{i-1}$ are recoded into $y_i$
and !I.-It while Xi-2 rves as a reference bit. In a separate step, x.-2 dnd Xi-3 dre re('oded into y.-2 and !I.-3. with Xi-4 serving as a r('ference bit. Thus, the groups of thr('e bit.s each oVt'rlap, wit.h t.he right.most bl'ing .l'1.l'O(X-r), t.he next one being X3X2(XI), and so on as shown below: X._I + X.-2 - 2x. for odd \'8lues of i and repr('sl'nt. thc result. as a 2-hlt hinary munhpr !lty.-I in SD notation. TIll' v('rificat.ion of t.his st at(,lIlent is If'ft as an exercise for t h(' wder. For 011011011011(0) Wt' now obtain 011011011011, and t.hl' numher of op,-rations remains four, which is t.he minimum. Howf'vl'r, for 001101101101(0) we gl't. 0110110i,101, requiring four, intead of thrC<', op('rations. Still, compared to till' radix-2 Booth's algorit.hm in Tahlp 6.1, tht> numbt'r of patt('rns for whid! the numher of partial product.s is inrreas('d, ratlwr t.han dPf'rl'a:5l'd, is smaJlf'1'. Also. the incrt'asp in th(' mnnb.'r of op('rations is smaller. In any case, we may d.ign an ll-bit synchronous nmltipli.>r tint. generates exactly n/2 partial products. H('re, t.oo, two's complem('nt multipliers are handled correct.ly, but we ha\ to mak(' sure t.hat. n is ('\,('n. Ot.herwise, an I'xtt'nsion of th(' sign hit is required. Also, we Ill't-d to add a lero to thl' I('ft. of t.he lUult.iplil'r if unsigned numbers art> mult.iplied and 1l is odd. Two zeros lIlust he addt'd if 11 is dn ev.m numher. . . . Example 6.3 A 01 00 01 Y x 11 01 11 Y oi 10 01 -A +2A -A Add -A + 10 11 11 2-bit Shift 1 11 10 11 11 Add 2A + 0 10 00 10 01 11 01 11 2-hit Shift 00 01 11 01 11 Add A + 10 11 11 11 01 10 01 11 17 -9 [1"Coded multiplier operation X7 X6 Xs X4 X3 X2 XI .l'o (x_r) --..-.- --..-.- --..-.- --..-.- U7 Y6 Us y., U3 ltl VI Yo The rull'S for t.hi!> radix-4 modifitod Booth's algorithm dre shown in Table 6.3 for all ndd valu('S ofi, naml'ly, i = 1. 3, 5, . '". Comparing this algorithm X. X._I .c.-2 Y. 
y.-I opt>ration conunl'nts 0 0 0 0 0 +0 string of .lcros 0 1 0 0 1 l-A a singl(' 1 1 0 0 1 0 -2A lH'ginning of l's 1 1 0 0 1 -A heginning of l's 0 0 1 0 1 +A l'nd of 1 '5 0 1 1 1 0 +2A l'nd of l's 1 0 1 0 1 -A a single 0 1 1 1 0 0 +0 string of l's -153 Thf're are n/2 = 3 st.eps in t.his mult.iplication mid in pal'll step t.wo mul- t.iplier bits aI'f' dealt wit.h. AI1 a rpsult., all 1thift opNatioll$ arc two bit position shifts, Also, not(' that an additional bit for storing the correct sign ic; R'quired to prop('rly hand It' the addit,ion of 2A. 0 TABLE 6.3 A radix-4 modified Booth's algorithm. It is poib'" t() extpnd t.hl' ubo\'(' recodiug to three bits at a t.ime, dud have overlappmg groups of four bits ,'uch, flip ((.sulting alj.!;oritlun is mlled 
the radix-8 modified Booth's algorithm. In this algorithm, only $n/3$ partial products are generated, but the multiple $3A$ is needed, adding complexity to the basic step. For example, recoding 010(1) yields $y_i y_{i-1} y_{i-2} = 011$. A technique for simplifying the generation and accumulation of the multiples $\pm 3A$ has been presented in [17].

An interesting question now arises: what is the minimal number of add/subtract operations required for a given multiplier? To answer this, we have to find the minimal SD representation of the multiplier; i.e., the one with the smallest number of nonzero digits, $\min \sum_{i=0}^{n-1} |y_i|$. It has been shown [20] that a sequence $y_{n-1} y_{n-2} \cdots y_0$ is a minimal representation of an SD number if $y_i \cdot y_{i-1} = 0$ for $1 \le i \le n-1$, given that the most significant bits can satisfy $y_{n-1} \cdot y_{n-2} \ne 1$. To see the reason behind this condition consider, for example, the representation of 7 with only three bits; here 111 is a minimal representation although $y_i \cdot y_{i-1} \ne 0$. In practice, for any multiplier $X$, we can always add a 0 to its left to make sure that the above condition is satisfied.

The algorithm for obtaining the minimal representation of $X$ is described next. The multiplier bits are examined from right to left, one bit at a time, with the next bit to the left (i.e., $x_{i+1}$) serving as a reference bit. To correctly handle a single 0 within a string of 1's (and similarly, a single 1 within a string of 0's) we need information on the kind of string that exists to the right of the current position. For this purpose we use a "carry" bit (0 for 0's and 1 for 1's). This algorithm is called canonical recoding and its rules are summarized in Table 6.4, where $c_i$ is the previous "carry" and $c_{i+1}$ is the next "carry."

    x_{i+1}  x_i  c_i    y_i  c_{i+1}    Comments
       0      0    0      0     0        string of 0's
       0      1    0      1     0        a single 1
       1      0    0      0     0        string of 0's
       1      1    0      1̄     1        beginning of 1's
       0      0    1      1     0        end of 1's
       0      1    1      0     1        string of 1's
       1      0    1      1̄     1        a single 0
       1      1    1      0     1        string of 1's

TABLE 6.4 Canonical recoding.

As before, the recoded multiplier (after canonical recoding) can be used without any correction steps if the original multiplier is represented in two's complement. Here, we have to extend the sign bit $x_{n-1}$, obtaining $x_{n-1} x_{n-1} x_{n-2} \cdots x_0$. Canonical recoding can also be expanded to generate two or more bits at a time. The multiples of $A$ needed in the case of two bits are $\pm A$ and $\pm 2A$.

The main disadvantage of canonical recoding is that the bits of the multiplier are generated sequentially, while in the original and modified Booth's algorithms we may generate the bits simultaneously (there is no "carry" propagation). This implies that in the latter case, we can generate all partial products in parallel, and then use a fast multi-operand adder. Another drawback to canonical recoding is that, like Booth's algorithm, in order to take full advantage of the minimum number of add/subtract operations the number of these operations must be variable, as must be the length of the shift operations. This is difficult to implement, and we would prefer to have uniform shifts. This implies that the number of partial products will always be $n/2$, although canonical recoding can lead to a much smaller number of operations.

The radix-4 modified Booth's algorithm in Table 6.3 is not the only way of reducing the number of partial products, while still having uniform shifts of two bits each.

    x_{i+1}  x_i  x_{i-1}    Operation    Comments
       0      0     0          +0         string of 0's
       0      0     1          +2A        end of 1's
       0      1     0          +2A        a single 1
       0      1     1          +4A        end of 1's
       1      0     0          -4A        beginning of 1's
       1      0     1          -2A        a single 0
       1      1     0          -2A        beginning of 1's
       1      1     1          +0         string of 1's

TABLE 6.5 An alternate 2-bit-at-a-time multiplication algorithm.
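The recoding rules discussed so far can be compared with a short Python sketch. It implements Booth's radix-2 rule of Table 6.1 (written here as $y_i = x_{i-1} - x_i$), the radix-4 modified Booth rule of Table 6.3 ($x_{i-1} + x_{i-2} - 2x_i$ for odd $i$) and the canonical recoding of Table 6.4. The function names and the LSB-first digit lists are illustrative choices, and the carry rule used for canonical recoding ($c_{i+1} = 1$ when at least two of $x_{i+1}$, $x_i$, $c_i$ are 1, and $y_i = x_i + c_i - 2c_{i+1}$) is an arithmetic restatement of Table 6.4 assumed for this sketch.

```python
def bits_lsb_first(x, n):
    """Bits x_0 .. x_{n-1} of x, read as an n-bit two's complement word."""
    return [(x >> i) & 1 for i in range(n)]


def booth_radix2(x, n):
    """Booth's rule: y_i = x_{i-1} - x_i with x_{-1} = 0 (digits LSB first)."""
    b = [0] + bits_lsb_first(x, n)           # b[j + 1] == x_j, b[0] == x_{-1}
    return [b[i] - b[i + 1] for i in range(n)]


def booth_radix4(x, n):
    """Radix-4 modified Booth: one digit in {-2, ..., 2} per bit pair.

    For odd i the digit is x_{i-1} + x_{i-2} - 2*x_i; digit k has weight 4**k.
    """
    if n % 2:
        raise ValueError("n must be even (extend the sign bit otherwise)")
    b = [0] + bits_lsb_first(x, n)           # b[j + 1] == x_j, b[0] == x_{-1}
    return [b[i] + b[i - 1] - 2 * b[i + 1] for i in range(1, n, 2)]


def canonical(x, n):
    """Canonical recoding of an n-bit two's complement x (digits LSB first)."""
    b = bits_lsb_first(x, n)
    b.append(b[-1])                          # sign extension: x_n = x_{n-1}
    digits, carry = [], 0
    for i in range(n):
        next_carry = 1 if b[i + 1] + b[i] + carry >= 2 else 0
        digits.append(b[i] + carry - 2 * next_carry)
        carry = next_carry
    return digits


def value(digits, radix=2):
    """Numerical value of a digit list given LSB first."""
    return sum(d * radix**k for k, d in enumerate(digits))


def nonzero(digits):
    """Number of add/subtract operations implied by a recoding."""
    return sum(1 for d in digits if d != 0)


# Isolated 1's: Booth's radix-2 rule needs eight operations for 001010101.
print(nonzero(booth_radix2(0b001010101, 9)))
# The multiplier -9 of Example 6.3 gives the operations -A, +2A, -A.
print(booth_radix4(-9, 6))
# The multiplier 01101110 has a canonical form with three nonzero digits.
print(canonical(0b01101110, 8))
```

All three functions return digits whose weighted sum equals the two's complement value of the multiplier, so they can be checked against each other with `value`.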
Instead of using the next bit to the right ($x_{i-2}$) as a reference bit when examining $x_i x_{i-1}$, we can use the next bit to the left ($x_{i+1}$). The rules for this multiplication algorithm are summarized in Table 6.5 where, as before, $i$ is an odd number. The multiples of $A$ that are needed are $\pm 2A$ and $\pm 4A$, and they can be easily generated using shifts. The multiple $4A$ must be generated when $(x_{i+1}) x_i x_{i-1} = (0)11$ to take care of the end of a group of 1's. This cannot be done at the time when the bits $(x_{i+3}) x_{i+2} x_{i+1}$ are examined, since they have a zero in the rightmost position. As a result, this algorithm is not a recoding of the multiplier, as we cannot express 4 in two bits. The number of partial products is always $n/2$. As for canonical recoding, two's complement multipliers can be handled by extending the sign bit. Also, if unsigned numbers are multiplied, one or two zeros must be added to the left of the multiplier.

Example 6.4
For the multiplier 01101110, the following partial products are generated:

    (0)  01    10    11    10
        +2A   -2A   +4A   -2A
This translates to the SD number $0\,1\,0\,\bar{1}\,1\,0\,0\,\bar{1}\,0$, which is not a minimal representation, since it includes two adjacent nonzero digits. Employing the canonical recoding summarized in Table 6.4 yields $0\,1\,0\,0\,\bar{1}\,0\,0\,\bar{1}\,0$, which is a minimal representation. □

For the rightmost pair $x_1 x_0$, if $x_0 = 1$ it is considered a continuation of a string of 1's that never really started, and therefore no subtraction took place. For example, the multiplier 01110111 results in the following partial products:

     01    11    01    11                        01    11    01    11
    +2A    +0   -2A    +0      instead of       +2A    +0   -2A    -A

This can be corrected by setting the initial partial product to be $-A$ instead of 0 whenever $x_0 = 1$. All four possible cases are listed in Table 6.6.

    x_2  x_1  x_0    Operation
     0    0    1     +2A - A = A
     0    1    1     +4A - A = 3A
     1    0    1     -2A - A = -3A
     1    1    1       0 - A = -A

TABLE 6.6 The handling of $x_1 x_0$ with $x_0 = 1$ in the algorithm of Table 6.5.

Example 6.5
We repeat the previous example (where the radix-4 modified Booth's algorithm was used) and obtain

    A                       01 00 01               17
    X               ×  (1)  11 01 11               -9
                             0 -2A  0              operation
    Initial -A            1 10 11 11
    Add 0              +  0 00 00 00
                          1 10 11 11
    2-bit Shift           1 11 10 11 11
    Add -2A            +  1 01 11 10
                          1 01 10 01 11
    2-bit Shift           1 11 01 10 01 11
    Add 0              +  0 00 00 00
                          1 11 01 10 01 11         -153

Note that the multiplier's sign bit had to be extended in order to decide that no operation is needed for the first pair of multiplier bits. Also, as in the previous example, an additional bit for holding the correct sign is needed, because of multiples like $-2A$. □

The method summarized in Table 6.5 can also be extended to three bits or more at each step. However, as in the radix-8 modified Booth's algorithm, multiples of $A$
like $3A$ or even $6A$ are needed, and unless those are prepared in advance and stored somewhere, we have to perform two additions in a single step. For example, for (0)101 we need $8 - 2 = 6$, and for (1)001, $-8 + 2 = -6$.

6.2 IMPLEMENTING LARGE MULTIPLIERS USING SMALLER ONES

If an $n \times n$ bit multiplier is implemented as a single integrated circuit, we can use several such circuits for implementing larger multipliers. A $2n \times 2n$ bit multiplier can be constructed out of four $n \times n$ bit multipliers. This is based on the following equation:

$$A \cdot X = (A_H \cdot 2^n + A_L) \cdot (X_H \cdot 2^n + X_L) = A_H \cdot X_H \cdot 2^{2n} + (A_H \cdot X_L + A_L \cdot X_H) \cdot 2^n + A_L \cdot X_L \qquad (6.2)$$

where $A_H$ and $A_L$ are the most and least significant halves of $A$, respectively, and $X_H$ and $X_L$ are likewise for $X$. The four partial products of length $2n$ bits each should be correctly aligned before being added, as shown in Figure 6.1(a). A more convenient arrangement is shown in Figure 6.1(b).

FIGURE 6.1 Aligning the four partial products in Equation (6.2).

This last arrangement gives the minimum height of
the matrix of numbers to be added, requiring one level of carry-save addition and a CPA (carry-propagating adder). Note that the $n$ least significant bits are already bits of the final product, and no further addition is needed. The $2n$ bits in the center have to be added by a $2n$-bit CSA, whose outputs are connected to a CPA. The $n$ most significant bits have to be connected to the same CPA, since the center bits may generate a carry into the most significant bits. Thus, a $3n$-bit CPA is needed.

The idea of decomposing a large multiplier into smaller ones can be further extended. First, the basic multiplier used as a building block can be an $n \times m$ bit multiplier, with $n \ne m$. Second, multipliers larger than $2n \times 2m$ can be implemented. For example, a $4n \times 4n$ bit multiplier can be implemented using available $n \times n$ bit multipliers. A $4n \times 4n$ bit multiplier requires four $2n \times 2n$ bit multipliers, which in turn require four $n \times n$ bit multipliers each, for a total of 16 $n \times n$ bit multipliers. The 16 partial products generated this way have to be aligned before being added, as shown in Figure 6.2. Similar arrangements of partial products can be drawn for any $kn \times kn$ bit multiplier with an integer $k$.

FIGURE 6.2 Aligning the 16 partial products.

After aligning the 16 products, as shown in Figure 6.2, we have up to seven bits in one column that need to be added. To add seven operands we may use a set of (7,3) counters, which generate three operands, to be added by another set of (3,2) counters. These will generate two operands, to be added by a CPA. Another possibility is to combine the two sets of counters into a set of (7;2) compressors, depicted in Figure 5.26. The task of selecting an economical multi-operand adder is discussed next.
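The decomposition of Equation (6.2), applied recursively, can be sketched in Python as follows. This is only an illustration for unsigned operands: `mul_n` is a hypothetical stand-in for one $n \times n$ bit multiplier circuit, and `mul_wide` is an assumed name for the recursive construction. The alignment and accumulation hardware (CSA and CPA) is modeled by ordinary integer shifts and additions.

```python
def mul_n(a, x, n):
    """Stand-in for one n x n bit hardware multiplier (unsigned operands)."""
    assert 0 <= a < (1 << n) and 0 <= x < (1 << n)
    return a * x


def mul_wide(a, x, width, n):
    """Multiply two unsigned width-bit numbers using only n x n multipliers.

    width must be n times a power of two. Returns the pair
    (product, number of n x n multipliers used).
    """
    if width == n:
        return mul_n(a, x, n), 1
    half = width // 2
    mask = (1 << half) - 1
    a_h, a_l = a >> half, a & mask           # A = A_H * 2**half + A_L
    x_h, x_l = x >> half, x & mask           # X = X_H * 2**half + X_L
    hh, k1 = mul_wide(a_h, x_h, half, n)
    hl, k2 = mul_wide(a_h, x_l, half, n)
    lh, k3 = mul_wide(a_l, x_h, half, n)
    ll, k4 = mul_wide(a_l, x_l, half, n)
    # Align the four partial products as in Equation (6.2) and add them.
    product = (hh << width) + ((hl + lh) << half) + ll
    return product, k1 + k2 + k3 + k4


# A 16 x 16 bit product built from 4 x 4 bit multipliers (the 4n x 4n case).
product, used = mul_wide(0xBEEF, 0xCAFE, 16, 4)
print(product == 0xBEEF * 0xCAFE, used)
```

For the $4n \times 4n$ case the count returned is 16, matching the number of $n \times n$ multipliers stated above.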
6.3 ACCUMULATING THE PARTIAL PRODUCTS

After generating the partial products, either through one of the algorithms discussed in Section 6.1 or by using smaller multipliers, as in Section 6.2, we must accumulate all the partial products to obtain the final product. If a high-speed accumulation of partial products is desired, a fast multi-operand adder should be employed. Such multi-operand adders, using different types of parallel counters, have been described in Chapter 5. We should, however, take advantage of the particular form of the partial products to be added and reduce the hardware complexity of the multi-operand adder. The partial products to be added have a smaller number of bits than the final product, and they have to be aligned before being added. Thus, we can expect to see many columns that include fewer bits than the total number of partial products, requiring simpler counters for their addition.

Consider, for example, the six partial products that are generated when multiplying two unsigned operands of length 6 bits each, using the simple one-bit-at-a-time algorithm. The matrix of partial product bits to be added is shown in Figure 6.3(a). These six operands can be added using the three-level carry-save tree shown in Figure 5.24. The number of (3,2) counters can, however, be substantially reduced by taking advantage of the fact that all columns but one in Figure 6.3(a) contain fewer than six bits. To simplify the task of deciding how many counters are needed we can redraw the matrix of bits to be added, as depicted in Figure 6.3(b).

FIGURE 6.3 Six partial products to be added. (a) Original matrix of 36 bits. (b) Reorganized matrix of bits.
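The shape of the matrix in Figure 6.3(a) can be reproduced with a few lines of Python. The sketch below is illustrative only (the function names are assumptions): it places bit $a_i \cdot x_j$ of an unsigned one-bit-at-a-time multiplication in column $i + j$ and reports how many bits land in each column.

```python
def partial_product_columns(a, x, n):
    """Bit matrix of an n x n unsigned multiply, organized by column.

    Row j contributes the bit (a_i AND x_j) to column i + j.
    Returns a list whose entry c holds the bits of column c.
    """
    columns = [[] for _ in range(2 * n - 1)]
    for j in range(n):
        for i in range(n):
            columns[i + j].append(((a >> i) & 1) & ((x >> j) & 1))
    return columns


def column_heights(n):
    """Number of bits in columns 0 .. 2n-2 (independent of the operands)."""
    return [len(col) for col in partial_product_columns(0, 0, n)]


heights = column_heights(6)
print(heights)                 # column 0 first
print(sum(heights))            # total number of bits in the matrix

# Adding the columns with their weights gives back the product.
cols = partial_product_columns(22, 11, 6)
print(sum(sum(col) << c for c, col in enumerate(cols)) == 22 * 11)
```

For six-bit operands only column 5 reaches the full height of six bits, which is the observation that allows the number of counters to be reduced.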
To further reduce the hardware complexity we also allow the use of half adders (HAs) in addition to full adders (FAs). An HA, which can be called a (2,2) counter, has a lower hardware complexity than an FA. Figure 6.4 depicts the (3,2) and (2,2) counters that can be used in order to reduce the number of operands from 6 to 2. These two operands are then added through a CPA. In this figure, a vertical block containing three bits represents a (3,2) counter, while a vertical block containing two bits represents a (2,2) counter. The horizontal blocks in Figure 6.4(b) show the outputs of the (3,2) and (2,2) counters in Figure 6.4(a). For example, the horizontal block in columns 2 and 3 contains the two outputs of the (3,2) counter in column 2 of Figure 6.4(a). The number of levels in the carry-save addition is still 3, but the number of counters is substantially smaller than that needed in the general case (see Figure 5.24).

The number of counters can be further reduced by employing the idea mentioned in Chapter 5 of reducing the number of bits in each column to the
closest element of the series 3, 4, 6, 9, 13, 19, ... (01). This is shown in Figure 6.5 where, for example, in Figure 6.5(a), column 5, the smallest number of counters which will reduce the number of bits to four is used. Overall, the scheme in Figure 6.5 requires fifteen (3,2) counters and five (2,2) counters, compared to the sixteen (3,2) counters and nine (2,2) counters needed in Figure 6.4. The savings are even more substantial when larger multipliers are designed.

FIGURE 6.4 Reduction of the six partial products. (a) Level 1 carry-save addition. (b) Results of level 1. (c) Level 2 carry-save addition. (d) Level 3 carry-save addition.

The above discussion is restricted to unsigned numbers. If some of the partial products are negative numbers represented in two's complement, we need to modify the matrix of bits shown in Figure 6.3(a). Specifically, all sign bits must be properly extended before the addition of the partial products takes place, yielding the matrix shown in Figure 6.6. The number of bits in row 1 is now 11 instead of 6, and so on. This extension significantly increases the hardware complexity of the multi-operand adder required. If two's complement numbers are obtained by generating the one's complement and then adding a carry to the least significant bit, the matrix will have to be increased even further. We may minimize the increase in complexity by realizing that the two's complement number
    s  s  s  s  s  s  z_4  z_3  z_2  z_1  z_0

whose value is

$$-s \cdot 2^{10} + s \cdot 2^9 + s \cdot 2^8 + s \cdot 2^7 + s \cdot 2^6 + s \cdot 2^5 + z_4 \cdot 2^4 + z_3 \cdot 2^3 + z_2 \cdot 2^2 + z_1 \cdot 2^1 + z_0$$

can be replaced by

    0  0  0  0  0  (-s)  z_4  z_3  z_2  z_1  z_0

since

$$-s \cdot 2^{10} + s \cdot (2^9 + 2^8 + 2^7 + 2^6 + 2^5) = -s \cdot 2^{10} + s \cdot (2^{10} - 2^5) = -s \cdot 2^5.$$

FIGURE 6.5 An alternate scheme for reduction of the six partial products. (a) Level 1 carry-save addition. (b) Results of level 1. (c) Level 2 carry-save addition. (d) Level 3 carry-save addition.

FIGURE 6.6 Six signed partial products to be added (full circles indicate extended sign bits).

To represent the value $-s$ in column 5 (in Figure 6.6), we complement the original sign digits to obtain $(1 - s)$, and add 1. We get the $-s$ as required
along with a carry of 1 into column 6. The latter will serve as the extra 1 needed in column 6 to deal with the sign bit of the second partial product. Another carry-out will be generated in column 6 and so on. The resulting matrix of bits is shown in Figure 6.7. This new matrix has fewer bits than that in Figure 6.6 but has a higher maximum height (7 instead of 6, in column 5).

FIGURE 6.7 The modified array of six signed partial products.

We can eliminate the extra 1 in column 5 if we place the two sign bits $s_1$ and $s_2$ in the same column, since

$$(1 - s_1) + (1 - s_2) = 2 - s_1 - s_2.$$

The 2 is carried out to the next column, leaving behind $-s_1$ and $-s_2$. An extra 1 in this column is no longer required. Placing the two sign bits in the same column can be achieved by first extending the sign bit $s_1$ by one position, as shown in Figure 6.8. The maximum column height is now back to 6.

FIGURE 6.8 Further modified array of six signed partial products.

If the negative partial products are obtained by first generating the one's complement and then adding a carry to the least significant bit, these extra carries can then be added to the matrix, as shown in Figure 6.9. The full circles indicate that the complements of the corresponding bits are taken whenever $s_i = 1$. The extra $s_6$ in column 5 increases the maximum column height to 7. However, if the last partial product is always positive (i.e., the multiplier is always positive), this $s_6$ can be eliminated.

FIGURE 6.9 The modified array with negative partial products represented in one's complement.

Example 6.6
In this example, the negative partial products are generated as a result of a recoded multiplier using canonical recoding.

    A         0 1 0 1 1 0        22
    X      ×  0 0 1 0 1 1        11
    Y         0 1 0 1̄ 0 1̄        recoded multiplier

If all sign bits are extended, the following matrix is obtained (in each row, the bits to the left of the six original bits are the replicated sign bits):

    10  9  8  7  6  5  4  3  2  1  0
     1  1  1  1  1  1  0  1  0  1  0
     0  0  0  0  0  0  0  0  0  0
     1  1  1  1  0  1  0  1  0
     0  0  0  0  0  0  0  0
     0  0  1  0  1  1  0
     0  0  0  0  0  0
    --------------------------------
     0  0  0  1  1  1  1  0  0  1  0

A smaller matrix of bits to be added is obtained if we follow the scheme illustrated in Figure 6.8:

    10  9  8  7  6  5  4  3  2  1  0
                 0  1  0  1  0  1  0
                 1  0  0  0  0  0
              0  0  1  0  1  0
           1  0  0  0  0  0
        1  1  0  1  1  0
     1  0  0  0  0  0
    --------------------------------
     0  0  0  1  1  1  1  0  0  1  0
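The sign-bit manipulation can be checked numerically. The following Python sketch is an illustration with assumed names; it models column addition by integer arithmetic modulo $2^{11}$. It adds six signed partial products once with fully extended sign bits and once with each sign bit complemented and a single 1 added in column 5, and confirms that both give the same 11-bit result for the operands of Example 6.6.

```python
N = 6                      # operand length in bits
W = 2 * N - 1              # columns 0 .. 10
MASK = (1 << W) - 1        # results are taken modulo 2**11


def sign_extended_sum(pps):
    """Add the partial products with every sign bit fully extended.

    pps[j] is the j-th partial product as a signed integer that fits in
    N bits; it is aligned with a shift of j positions.
    """
    return sum(pp << j for j, pp in enumerate(pps)) & MASK


def complemented_sign_sum(pps):
    """Complement each sign bit and add a single 1 in column N - 1."""
    total = 1 << (N - 1)                       # the one extra 1 (column 5)
    for j, pp in enumerate(pps):
        low = pp & ((1 << (N - 1)) - 1)        # the N - 1 bits below the sign
        s = (pp >> (N - 1)) & 1                # sign bit of this row
        total += (low | ((1 - s) << (N - 1))) << j
    return total & MASK


# The identity used above, for both values of the sign bit s.
for s in (0, 1):
    assert -s * 2**10 + s * (2**9 + 2**8 + 2**7 + 2**6 + 2**5) == -s * 2**5

# Example 6.6: A = 22 and recoded multiplier digits (LSB first) -1, 0, -1, 0, 1, 0.
A = 22
pps = [A * d for d in (-1, 0, -1, 0, 1, 0)]
print(sign_extended_sum(pps), complemented_sign_sum(pps))
```

Both sums equal $22 \times 11 = 242$, the value of the bottom row of the two matrices.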
If the negative partial products are generated using one's complement and a carry into the least significant position, then the resulting matrix becomes

    10  9  8  7  6  5  4  3  2  1  0
                 0  1  0  1  0  0  1
                 1  0  0  0  0  0  1
              0  0  1  0  0  1  0
           1  0  0  0  0  0  1
        1  1  0  1  1  0  0
     1  0  0  0  0  0  0
                    0
    --------------------------------
     0  0  0  1  1  1  1  0  0  1  0

If we use the modified radix-4 Booth algorithm for generating the partial products, the resulting matrices, corresponding to Figures 6.7, 6.8 and 6.9, are shown in Figure 6.10(a), (b) and (c), respectively. Note that in Figure 6.10(a), the carry generated by adding 1 to $\bar{s}_1 = (1 - s_1)$ for the first partial product is not positioned in the same column as that of $\bar{s}_2 = (1 - s_2)$, the complement of the sign bit of the second partial product. We need, therefore, to put an extra 1 in column 7 which, together with the carry generated by column 6, produces the necessary 1 in column 8.

FIGURE 6.10 Three schemes for an array of three (radix-4 modified Booth algorithm) partial products: Scheme (a), Scheme (b), Scheme (c).

Example 6.7
We repeat the multiplication from the previous example but now use the radix-4 modified Booth's algorithm, resulting in three partial products. The recoded multiplier turns out in this case to be exactly the same; i.e., $0\,1\,0\,\bar{1}\,0\,\bar{1}$. The second scheme (see Figure 6.10(b)) results in

     9  8  7  6  5  4  3  2  1  0
           0  1  1  0  1  0  1  0
        1  0  0  1  0  1  0
     1  1  0  1  1  0
    -----------------------------
     0  0  1  1  1  1  0  0  1  0

If the third scheme (see Figure 6.10(c)) is followed, the resulting matrix is

     9  8  7  6  5  4  3  2  1  0
           0  1  1  0  1  0  0  1
        1  0  0  1  0  0  1     1
     1  1  0  1  1  0     1
                    0
    -----------------------------
     0  0  1  1  1  1  0  0  1  0

6.4 ALTERNATIVE TECHNIQUES FOR PARTIAL PRODUCT ACCUMULATION

Several modifications to the basic tree structure for partial product accumulation have been suggested and implemented. The purpose of these techniques is to reduce the number of levels in the tree (and, as a result, speed up the accumulation) and/or achieve a more regular design. Tree structures usually have very irregular interconnects. This irregularity complicates the implementation and, more importantly, irregular structures result in area-inefficient layouts, especially when a rectangular-shaped layout is sought. Notice also that a smaller number of levels results in less irregularity.

The number of levels in the tree can be lowered by using a reduction rate higher than 3:2. A reduction rate of 2:1 can be achieved if the carry-save adders are replaced by adders for binary SD numbers described in Section 2.4. Like the carry-save adder, the SD adder generates the sum of its two operands in
constant time (independent of the number of bits), since the carry is allowed to propagate at most one position. The number of levels in the SD adder tree is smaller and, in addition, the tree produces a single result rather than the two results of the ordinary CSA tree (see, for example, Figure 5.24). However, the result of the SD adder tree is still in SD representation and, consequently, in most cases, a conversion to two's complement representation is needed. This conversion is done by forming two sequences. The first sequence, denoted by $Z^+$, is created by replacing each negative digit of the SD number by zero. The second sequence, denoted by $Z^-$, replaces each negative digit of the original SD number with its absolute value, and each positive digit by zero. Then, the difference $Z^+ - Z^-$ is found by adding the two's complement of $Z^-$ to $Z^+$ using a carry-propagating adder. Hence, a final stage of a CPA is needed here, as it is needed in the ordinary CSA tree.

Another advantage of an SD adder tree over a CSA tree is that there is no need for a sign bit extension when negative partial products are to be added. SD numbers simply do not require a separate sign bit. The major disadvantage of the SD adder is that its design is more complex, consuming more gates and consequently a larger chip area, since each signed digit requires two ordinary bits (or a multiple-valued logic implementation that can provide three values per digit, corresponding to $-1$, 0, and 1). As a result, a more careful comparison between the CSA tree and the SD adder tree for the particular given technology must be performed before deciding which to employ.

FIGURE 6.11 A (4;2) compressor.
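The conversion of an SD result to two's complement through the sequences $Z^+$ and $Z^-$ can be sketched as follows. The function name, the MSB-first digit list and the explicit word width are assumptions made for this illustration; the final addition stands for the carry-propagating adder.

```python
def sd_to_twos_complement(digits, width):
    """Convert an SD number to a width-bit two's complement word.

    digits holds values from {-1, 0, 1}, most significant digit first.
    The result is returned as an unsigned integer in 0 .. 2**width - 1.
    """
    z_plus = z_minus = 0
    for d in digits:
        z_plus = (z_plus << 1) | (1 if d == 1 else 0)     # negatives -> 0
        z_minus = (z_minus << 1) | (1 if d == -1 else 0)  # |negatives|, rest 0
    mask = (1 << width) - 1
    neg_z_minus = (~z_minus + 1) & mask    # two's complement of Z-
    return (z_plus + neg_z_minus) & mask   # the single carry-propagate add


# The canonical form 0 1 0 0 -1 0 0 -1 0 of the multiplier 01101110.
print(bin(sd_to_twos_complement([0, 1, 0, 0, -1, 0, 0, -1, 0], 9)))
```

The width must be large enough to hold the value in two's complement; for a negative value the returned word has its most significant bit set.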
Example 6.8
A $32 \times 32$ multiplier based on the radix-4 modified Booth's algorithm generates 16 partial products to be accumulated and consequently requires a CSA tree with six levels (see Table 5.1), but needs an SD adder tree with only four levels. Some sophisticated logic design techniques and layout schemes can be employed, resulting in less area-consuming implementations [10]. □

The same reduction rate of 2:1 can be achieved without resorting to SD representations by using (4;2) compressors, shown in Figure 6.11. Similarly to the (7;2) compressor in Figure 5.26, the (4;2) compressor must be designed so that $c_{out}$ is not a function of $c_{in}$ in order to avoid a ripple-carry effect. Also, the (4;2) compressor may be implemented as a multi-level circuit with a smaller overall delay compared to the implementation based on two (3,2) counters, as in Figure 6.11. One such implementation is shown in Figure 6.12 with a delay of three exclusive-or gates between the inputs ($x_1$, $x_2$, $x_3$ and $x_4$) and the output $s$.

Implementing the (4;2) compressor with two (3,2) counters, as shown in Figure 6.11, will result in a delay of four exclusive-or gates. Thus, the delay of the implementation in Figure 6.12 is expected to be 25% lower than that of the implementation in Figure 6.11. Other multi-level implementations of a (4;2) compressor are possible. All such implementations must satisfy the following arithmetic equation:

$$x_1 + x_2 + x_3 + x_4 + c_{in} = s + 2(c + c_{out}),$$

and $c_{out}$ should not depend on $c_{in}$ to avoid horizontal rippling of carries. The truth table for such implementations is summarized in Table 6.7, where $a$, $b$, $c$, $d$, $e$ and $f$ are Boolean variables. The implementation in Figure 6.12 corresponds to the setting $a = b = c = 1$ and $d = e = f = 0$.

FIGURE 6.12 An implementation of a (4;2) compressor.
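The two requirements on a (4;2) compressor can be verified exhaustively for a compressor assembled from two (3,2) counters. The sketch below uses one plausible wiring (the first counter takes $x_1$, $x_2$, $x_3$ and the second takes its sum output, $x_4$ and $c_{in}$); the exact wiring of Figure 6.11 is not reproduced here, so this choice and the function names are assumptions.

```python
from itertools import product


def counter_3_2(a, b, c):
    """A (3,2) counter (full adder): returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def compressor_4_2(x1, x2, x3, x4, c_in):
    """(4;2) compressor built from two (3,2) counters.

    Returns (s, c, c_out). c_out is produced by the first counter, which
    never sees c_in, so carries cannot ripple horizontally.
    """
    s1, c_out = counter_3_2(x1, x2, x3)
    s, c = counter_3_2(s1, x4, c_in)
    return s, c, c_out


def satisfies_requirements():
    """Check the arithmetic equation and the independence of c_out from c_in."""
    for x1, x2, x3, x4, c_in in product((0, 1), repeat=5):
        s, c, c_out = compressor_4_2(x1, x2, x3, x4, c_in)
        if x1 + x2 + x3 + x4 + c_in != s + 2 * (c + c_out):
            return False
        if c_out != compressor_4_2(x1, x2, x3, x4, 1 - c_in)[2]:
            return False
    return True


print(satisfies_requirements())
```

In this wiring the output $s$ is reached through two exclusive-or gates in each counter, which corresponds to the four-gate delay mentioned above for the two-counter implementation.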
    x_1  x_2  x_3  x_4    c_out    c       s
     0    0    0    0       0      0      c_in
     0    0    0    1       0     c_in    c̄_in
     0    0    1    0       0     c_in    c̄_in
     0    0    1    1       a      ā      c_in
     0    1    0    0       0     c_in    c̄_in
     0    1    0    1       b      b̄      c_in
     0    1    1    0       c      c̄      c_in
     0    1    1    1       1     c_in    c̄_in
     1    0    0    0       0     c_in    c̄_in
     1    0    0    1       d      d̄      c_in
     1    0    1    0       e      ē      c_in
     1    0    1    1       1     c_in    c̄_in
     1    1    0    0       f      f̄      c_in
     1    1    0    1       1     c_in    c̄_in
     1    1    1    0       1     c_in    c̄_in
     1    1    1    1       1      1      c_in

TABLE 6.7 The truth table for a (4;2) compressor.

Several other techniques have been suggested to modify the structure of CSA trees which use (3,2) counters, in order to achieve a more regular and less area-consuming layout. Such modified tree structures may require a somewhat larger number of CSA levels with a larger overall delay. Two such techniques are described next. The first one defines balanced delay trees [21] (also [19]) while the second one defines overturned-stairs trees [15]. Figure 6.13 illustrates the structure of the bit-slices for these two techniques and compares them to the corresponding Wallace tree bit-slice. All the bit-slices in Figure 6.13 are for 18 operands (partial products) which may be generated by a high-radix multiplication algorithm (e.g., a radix-4 modified Booth's algorithm). In this case, the 18 downward triangles in Figure 6.13 represent multiplexers that select the suitable multiple of the multiplicand. The rectangles represent (3,2) counters, and the numbers on these counters indicate the delay experienced by the input operands. Thus, after $6\Delta_{FA}$, two results are produced by the Wallace and the overturned-stairs trees; the balanced tree requires $7\Delta_{FA}$.

Note that all three tree structures have fifteen outgoing carries and fifteen incoming carries, and each outgoing carry is aligned with its corresponding incoming carry (from the previous bit-slice), so that adjacent bit-slices abut.
The incoming carries are routed to different (3,2) counters so that all the inputs to a counter are valid before or at the necessary time. Only for the balanced tree are all fifteen incoming carries generated exactly when they are required, since all paths are balanced. In the other two trees, there are counters for which not all incoming carries are generated simultaneously. For example, the bottom counter in the overturned-stairs tree has incoming carries whose associated delays are $4\Delta_{FA}$ and $5\Delta_{FA}$.

The three tree structures also differ in the number of required wiring tracks between adjacent bit-slices; these, in turn, affect the layout area. The Wallace tree requires six wiring tracks; the overturned-stairs and the balanced tree require three and two tracks, respectively. Note the inherent tradeoff between size and speed. A Wallace tree guarantees the lowest overall delay but requires the highest number of wiring tracks (on the order of $\log N$, where $N$ is the number of inputs). The balanced tree, on the other hand, requires the smallest number of wiring tracks but has the highest overall delay.

The balanced and overturned-stairs trees have a regular structure and can be designed in a systematic way. This is difficult to see from Figure 6.13, but it can be concluded from Figure 6.14, which shows the complete structure of these two trees as well as that of the corresponding Wallace tree. The building blocks of the balanced and overturned-stairs trees are indicated with dotted lines in Figure 6.14. The exact details of the recursive construction of the two types of regular trees, and some variations of them, can be found in [24] and [15].

When determining the final layout of a CSA tree, care must be taken to make sure that wires connecting the inputs to a carry-save adder have roughly the same length; otherwise, the delay-balanced paths will no longer be balanced.
uses (4;2) compressors will have a more regular structure and may have a lower delay than an ordinary CSA tree made of (3,2) counters. Table 6.8 compares the delays of carry-save trees using either (3,2) counters or (4;2) compressors. Since the delay of a (4;2) compressor is 1.5 times that of a (3,2) counter, the number of levels of (4;2) compressors in column 3 is multiplied by 1.5 to yield the equivalent delay in column 4. Note that the equivalent delay of a carry-save tree using (4;2) compressors (column 4) is not always smaller than that of a carry-save tree using (3,2) counters (column 2). For example, for nine partial products, (3,2) counters will yield a carry-save tree with an overall lower delay. Various other counters and compressors can be employed in the implementation of the addition tree for the partial product accumulation; for example, (7,3) counters [13].

Number of    Number of levels   Number of levels   Equivalent
operands     using (3,2)        using (4;2)        delay
3            1                  1                  1.5
4            2                  1                  1.5
5-6          3                  2                  3
7-8          4                  2                  3
9            4                  3                  4.5
10-13        5                  3                  4.5
14-16        6                  3                  4.5
17-19        6                  4                  6
20-28        7                  4                  6
29-32        8                  4                  6
33-42        8                  5                  7.5

TABLE 6.8 Comparing the delays of carry-save adders using either (3,2) counters or (4;2) compressors.
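The entries of Table 6.8 can be reproduced with a short calculation. The sketch below is only an illustration: the function names are hypothetical, and it assumes that each level of (3,2) counters turns every full group of three operands into two, and that each level of (4;2) compressors at most halves the operand count (unused inputs tied to 0).

```python
def levels_32(k: int) -> int:
    """Levels of (3,2) counters needed to reduce k operands to two."""
    levels = 0
    while k > 2:
        # Every full group of three operands becomes two; leftovers pass through.
        k = 2 * (k // 3) + k % 3
        levels += 1
    return levels


def levels_42(k: int) -> int:
    """Levels of (4;2) compressors needed to reduce k operands to two."""
    levels = 0
    while k > 2:
        # Each level at most halves the number of operands (rounded up).
        k = (k + 1) // 2
        levels += 1
    return levels


def equivalent_delay_42(k: int) -> float:
    """Delay of the (4;2) tree in units of one (3,2) counter delay."""
    return 1.5 * levels_42(k)


if __name__ == "__main__":
    for k in (3, 4, 6, 8, 9, 13, 16, 19, 28, 32, 42):
        print(k, levels_32(k), levels_42(k), equivalent_delay_42(k))
```

For nine operands this gives four levels of (3,2) counters against an equivalent delay of 4.5 for (4;2) compressors, which is the case singled out in the text.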
FIGURE 6.13 Three CSA tree bit-slices for 18 operands (downward triangles are multiplexers): Wallace tree bit-slice, overturned-stairs bit-slice, and balanced tree bit-slice.

FIGURE 6.14 Wallace, overturned-stairs and balanced trees for 18 operands (downward triangles are multiplexers): (a) Wallace tree; (b) overturned-stairs tree; (c) balanced tree.
the same length; otherwise the delay-balanced paths will no longer be balanced. Consider, for example, a CSA tree for 27 operands (27 partial products obtained from a 53-bit multiplier using the radix-4 modified Booth algorithm). A CSA tree constructed out of (4;2) compressors is shown in Figure 6.15(a), and the corresponding layout is shown in Figure 6.15(b) [25]. Note that the bottom compressor (#13) is located in the middle so that compressors #11 and #12 are roughly at the same distance from it. Compressor #11, in turn, has equal-length wires from #8 and #9, and so on.

6.5 FUSED MULTIPLY-ADD UNIT

A fused multiply-add unit performs the multiplication $A \times B$ followed immediately by an addition of the product and a third operand $C$, so that the calculation of $A \times B + C$ is done as a single and indivisible operation. Clearly, such a unit is capable of performing multiply only, by setting $C = 0$, and add (or subtract) only by setting, for example, $B = 1$.

A fused multiply-add unit can reduce the overall execution time of chained multiply and then add/subtract operations. An example of a case where such chained multiply and add operations are useful is the evaluation of a polynomial $a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ through $[(a_n x + a_{n-1})x + a_{n-2}]x + \cdots$. On the other hand, independent multiply and add operations cannot be performed in parallel.

Another advantage of a fused multiply-add unit, compared to a separate multiplier and adder, arises when executing floating-point operations, since rounding is performed only once for the result $A \times B + C$ rather than twice (for the multiply and then for the add). Since rounding may introduce computation errors, reducing the number of roundings may have a positive effect on the overall error.
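The effect of rounding once instead of twice can be observed with a small software model. The following is a sketch, not a description of any hardware unit: `fused_multiply_add` and `horner` are hypothetical names, and the fused operation is modelled by computing the exact rational value with Python's `fractions.Fraction` and rounding it once to a double.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Model of A*B + C with a single rounding at the very end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # exact rational result
    return float(exact)  # one rounding to the nearest double


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Multiply, round, then add, round (two roundings)."""
    return a * b + c


def horner(coeffs, x: float) -> float:
    """Evaluate a_n x^n + ... + a_0 as a chain of fused multiply-adds.

    coeffs is ordered from a_n down to a_0.
    """
    acc = 0.0
    for coef in coeffs:
        acc = fused_multiply_add(acc, x, coef)
    return acc


if __name__ == "__main__":
    a = 1.0 + 2.0 ** -30
    b = 1.0 - 2.0 ** -30
    c = -1.0
    # The exact product is 1 - 2**-60, which rounds to 1.0 as a double.
    print(separate_multiply_add(a, b, c))  # the small term is lost
    print(fused_multiply_add(a, b, c))     # the small term survives
    print(horner([2.0, 3.0, 1.0], 2.0))    # 2x^2 + 3x + 1 at x = 2
```

In this example the two-rounding version loses the term $-2^{-60}$ entirely, whereas the single-rounding model retains it.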
In the design reported in [14], this additional accuracy was helpful when producing a correctly rounded quotient in the divide-by-reciprocation algorithm (see Section 8.2).

Figure 6.16 shows an implementation of a fused multiply-add unit for floating-point computations. Here, $A$, $B$ and $C$ are the significands while $E_A$, $E_B$ and $E_C$ are the exponents of the operands, respectively. The CSA tree generates all the partial products and performs their carry-save accumulation to produce two results which are then added with the properly aligned operand $C$. The adder accepts three operands and, therefore, must first reduce them to two (using (3,2) counters) and then perform carry-propagate addition. The steps of post-normalization and rounding are executed next.

The design illustrated in Figure 6.16 employs two techniques in order to reduce the overall execution time. First, the leading zero anticipator circuit uses the propagate and generate signals produced by the adder (see Section 5.2) to predict the type of shift which will be needed in the post-normalization step.

FIGURE 6.15 A (4;2) compressor tree for 27 partial products and its layout: (a) (4;2) compressor tree; (b) simplified layout of the (4;2) compressor tree.
This circuit operates in parallel to the addition itself so that the delay of the normalization step is shorter. Second, and more importantly, the alignment of the significand $C$ by $E_A + E_B - E_C$ is done in parallel to the multiplication of $A$ and $B$. Normally, in a floating-point addition, we align the significand of the smaller operand (i.e., the operand with the smaller exponent). This will imply that if the product $A \times B$ is smaller than $C$, we will have to shift the product after it has been generated, introducing additional delay. We prefer instead to always align $C$, even if it is larger than $A \times B$, to allow the shift to be performed in parallel to the multiplication. To achieve this, we must allow $C$ to shift either to the right (as is traditionally done) or to the left, the direction dictated by whether the result of $E_A + E_B - E_C$ is positive or negative, respectively. If we allow $C$ to be shifted to the left, we must increase the total number of bits in the adder. For example, if all operands are floating-point numbers in the long IEEE format, the possible range of $C$ relative to the product $A \times B$ is shown as follows:

[Diagram: the product $A \times B$ occupies two 53-bit fields (106 bits); the range of the addend $C$ extends over these and an additional 53-bit field.]

This is the range for $53 \ge E_A + E_B - E_C \ge -53$. If $E_A + E_B - E_C \ge 54$, the bits of $C$ which are shifted further to the right will be replaced by a sticky bit, and if $E_A + E_B - E_C \le -54$, all the bits of $A \times B$ will be replaced by a sticky bit. The overall penalty is thus a 50% increase in the width of the adder which, in turn, will increase the execution time of the adder. Note, however, that the top 53 bits of the adder need only be capable of incrementing the original contents of the 53 bits if a carry propagates from the lower 106 bits.

The path from the output of the rounding circuit in Figure 6.16 to the multiplexer on the right is used when performing a calculation like $(X \times Y + Z) + A \times B$. The path from the output of the normalization circuit to the multiplexer on the left is used when performing a calculation like $(X \times Y + Z) \times B + C$. In this case, the rounding step for $(A \times B + C)$ is performed at the same time as the multiplication by $B$, by adding the partial product Incr. $\times B$ to the CSA tree.

FIGURE 6.16 Fused floating-point multiply-add unit.

6.6 ARRAY MULTIPLIERS

The two basic operations, the generation of partial products and their summation, may be merged. In this way, we avoid the overhead that is due to the separate controls of these two operations, and we thus speed up the multiplication. Such multipliers, which consist of identical cells, each capable of forming a new partial product and adding it to the previously accumulated partial product, are called iterative array multipliers or simply array multipliers. Clearly, any gain in speed is obtained at the expense of extra hardware. Another important characteristic of array multipliers is that they can be implemented so as to support a high rate of pipelining.

To illustrate the operation of an array multiplier, examine the 5 × 5 parallelogram shown in Figure 6.17, which contains all 25 partial product bits of the form $a_i \cdot x_j$ properly aligned. A straightforward implementation of an array multiplier adds the first two partial products (i.e., $a_4 x_0, a_3 x_0, \ldots, a_0 x_0$ and $a_4 x_1, a_3 x_1, \ldots, a_0 x_1$) in row one after proper alignment. The results of the first row are then added to $a_4 x_2, a_3 x_2, \ldots, a_0 x_2$ in the second row, and so on. The basic cell for such an array multiplier is an FA accepting one bit of the new partial product ($a_i \cdot x_j$), one bit of the previously accumulated partial product, and a carry-in bit.

                         a4   a3   a2   a1   a0
                    ×    x4   x3   x2   x1   x0
                         a4x0 a3x0 a2x0 a1x0 a0x0
                    a4x1 a3x1 a2x1 a1x1 a0x1
               a4x2 a3x2 a2x2 a1x2 a0x2
          a4x3 a3x3 a2x3 a1x3 a0x3
     a4x4 a3x4 a2x4 a1x4 a0x4
P9   P8   P7   P6   P5   P4   P3   P2   P1   P0

FIGURE 6.17 The partial products generated in a 5 × 5 multiplication.

A block diagram of a 5 × 5 array multiplier for unsigned numbers is depicted in Figure 6.18. In the first four rows there is no horizontal carry propagation. In other words, a carry-save type addition is performed in these rows, and the accumulated partial product consists of intermediate sum and carry bits. Only in the last row is a horizontal carry propagation allowed. The last row of cells in this figure is a ripple-carry adder that can be replaced by a fast two-operand adder (e.g., a carry-look-ahead adder) if a shorter overall execution time is desired.

The array multiplier in Figure 6.18 has to be modified in order to allow multiplication of signed numbers in two's complement notation, since product bits like $a_4 \cdot x_0$ and $a_0 \cdot x_4$ have a negative weight and should be subtracted rather than added. One way to handle the eight negatively weighted partial product bits properly, in a 5 × 5 bit multiplication, is depicted in Figure 6.19. Bits with negative weight are marked with a small circle instead of an arrow. Such bits have to be subtracted instead of being added. The cells with three positive inputs are ordinary FAs and are marked by I in the figure. The cells with a single negative input and two positive inputs are marked by II. The sum of the three inputs of a type II cell can vary from -1 to 2.
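The behaviour of a cell with two positive inputs and one negative input can be checked by enumeration. The sketch below is an assumption-laden illustration: the function name is hypothetical, and it assumes that the carry output c carries weight +2 and the sum output s carries weight -1, so that the range -1 to 2 is covered by two output bits.

```python
from itertools import product


def type_ii_cell(x: int, y: int, z: int) -> tuple[int, int]:
    """Cell with positive inputs x, y and a negatively weighted input z.

    Returns (c, s) such that x + y - z == 2*c - s, i.e. c has weight +2
    and s has weight -1 (an assumed encoding for this sketch).
    """
    total = x + y - z        # ranges over -1, 0, 1, 2
    s = total % 2            # Python's % yields 0 or 1 even for -1
    c = (total + s) // 2
    return c, s


if __name__ == "__main__":
    for x, y, z in product((0, 1), repeat=3):
        c, s = type_ii_cell(x, y, z)
        print(x, y, z, "->", c, s)
```

Every one of the eight input combinations yields outputs that are single bits, which is what makes this weighting workable for a one-bit cell.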
This requires the diagonal output $c$ to have a weight of +2, and the vertical output $s$ to have a weight of -1. The arithmetic operation of a type II cell is described by the equation

$$x + y - z = 2c - s. \qquad (6.3)$$

The values of the $s$ and $c$ outputs are given by

$$s = (x + y - z) \bmod 2 \quad \text{and} \quad c = \frac{(x + y - z) + s}{2}. \qquad (6.4)$$

Cells with two negative inputs and one positive input are marked by II'. The sum of their inputs can vary from -2 to 1. Hence, their $c$ output should have a weight of -2 and their $s$ output should have a weight of +1. Finally, a cell with all its inputs negative is marked by I' (see Figure 6.23) and has negatively weighted $c$ and $s$ outputs. This cell counts the number of (-1)'s at its inputs and represents this number through the $c$ and $s$ outputs. Logically, its operation is the same as that of the type I cell and, therefore, their gate implementations are identical. This explains the reason for marking them I and I'. Similarly, the gate implementations of type II and type II' cells are identical.

FIGURE 6.18 An array multiplier for unsigned numbers.

FIGURE 6.19 An array multiplier for two's complement numbers.

Another approach to the design of an array multiplier for two's complement operands is to employ Booth's algorithm. A multiplier based on this algorithm consists of $n$ rows of basic cells, where $n$ is the number of multiplier bits. Each row is capable of either adding or subtracting a properly aligned multiplicand to the previously accumulated partial product. The cells in row $i$ perform an add, subtract or transfer-only operation, depending on the value of $x_i$ and the appropriate reference bit. Such a multiplier is shown in Figure 6.20 for four-bit operands. The basic cell in this multiplier is a controlled add/subtract/shift (CASS) circuit, depicted in Figure 6.20(a) [12]. The $H$ and $D$ signals are control signals indicating the type of operation to be performed by the corresponding row of CASS cells. If $H$ is 0, no arithmetic operation is done, and therefore the new partial product bit, denoted by $P_{out}$, is equal to the previous one, $P_{in}$. If $H = 1$, an arithmetic operation is performed, generating a new $P_{out}$. The type of arithmetic operation is indicated by the $D$ signal. If $D = 0$ then the multiplicand bit, denoted by $a$, is added to $P_{in}$ with $c_{in}$ as an incoming carry bit from the adjacent cell to the right. The cell then generates $P_{out}$ and $c_{out}$ as the outgoing carry to the next cell on the left. If $D = 1$ then the multiplicand bit, $a$, is subtracted from $P_{in}$, with $c_{in}$ as an incoming borrow and $c_{out}$ as the outgoing borrow. Thus, the logic equations for $P_{out}$ and $c_{out}$ are [12]

$$P_{out} = P_{in} \oplus (a \cdot H) \oplus (c_{in} \cdot H),$$
$$c_{out} = (P_{in} \oplus D) \cdot (a + c_{in}) + a \cdot c_{in}.$$

An alternate approach to the design of a CASS cell is as a combination of a multiplexer (selecting among 0, $+a$ and $-a$) and an FA. The control signals $H$ and $D$ for row $i$ are generated by a CTRL circuit (shown in Figure 6.20(b)) based on the multiplier bit $x_i$ and the reference bit $x_{i-1}$, following the rules of Booth's algorithm from Table 6.1.

The first row corresponds to the most significant bit of the multiplier. Hence, the resulting partial product needs to be shifted to the left before we add to it (or subtract from it) the next multiple of the multiplicand. To achieve this, a new cell with input $P_{in} = 0$ is added (at the right end) to the second row, and to each row afterward. Since the number of bits in the partial product increases by one in each row, we need to expand the multiplicand before adding it to (or subtracting it from) the partial product. This is accomplished by replicating the sign bit of the multiplicand as shown in Figure 6.20(b).

FIGURE 6.20 A Booth's algorithm array multiplier: (a) controlled add/subtract/shift (CASS) cell; (b) the array.

Note that we cannot take advantage of strings of 0's or 1's in this implementation, since we cannot eliminate or skip rows. Thus, the only advantage of this implementation is the ability to multiply negative numbers in two's complement with no need for any correction step. Also note that the operation in row $i$ need not be delayed until all the upper $(i - 1)$ rows have completed their operation. Thus, the least significant bit of the product, $P_0$, will be generated after one CASS cell delay (in addition to the delay of a CTRL circuit), $P_1$ will be generated after two CASS cell delays, and the most significant bit, $P_{2n-2}$, will be generated after $(2n - 1)$ CASS cell delays.

In a similar way we can implement higher-radix multiplication schemes, which require fewer rows in the array, by employing, for example, the radix-4 algorithms shown in Tables 6.3 and 6.5 or similar radix-8 algorithms. These, too, can handle negative multipliers in two's complement representation. The building block of such multipliers is a multiplexer-adder circuit that selects the correct multiple of the multiplicand $A$ and adds it to the previously accumulated partial product to produce a new accumulated partial product.

FIGURE 6.21 The basic units of the pipelined array multiplier [16]: latched full adder with an AND gate; latched half adders.

FIGURE 6.22 A pipelined 5 × 5 array multiplier for unsigned numbers.
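The logic equations of the CASS cell given earlier in this section can be checked against ordinary addition and subtraction by exhaustive enumeration. The following is a minimal sketch with a hypothetical function name; it models a single cell only, not the full array or the CTRL circuit.

```python
from itertools import product


def cass_cell(p_in: int, a: int, c_in: int, h: int, d: int) -> tuple[int, int]:
    """One controlled add/subtract/shift cell; returns (p_out, c_out).

    h = 0: transfer only (p_out = p_in).
    h = 1, d = 0: p_in + a + c_in (c_out is a carry).
    h = 1, d = 1: p_in - a - c_in (c_in and c_out are borrows).
    """
    p_out = p_in ^ (a & h) ^ (c_in & h)
    c_out = ((p_in ^ d) & (a | c_in)) | (a & c_in)
    return p_out, c_out


def check_cass() -> bool:
    """Compare the cell against integer arithmetic for all input patterns."""
    for p_in, a, c_in in product((0, 1), repeat=3):
        # Transfer: the partial product bit passes through unchanged.
        if cass_cell(p_in, a, c_in, 0, 0)[0] != p_in:
            return False
        # Add: sum = 2 * carry + p_out.
        p, c = cass_cell(p_in, a, c_in, 1, 0)
        if p_in + a + c_in != 2 * c + p:
            return False
        # Subtract: difference = p_out - 2 * borrow.
        p, b = cass_cell(p_in, a, c_in, 1, 1)
        if p_in - a - c_in != p - 2 * b:
            return False
    return True


if __name__ == "__main__":
    print(check_cass())
```

The same cell therefore serves as a full adder when D = 0 and as a full subtractor when D = 1, with H gating both the multiplicand bit and the incoming carry or borrow.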
An important characteristic of array multipliers is that they allow a pipelining mode of operation, where the execution of separate multiplications overlaps. If this mode of operation is desired, the long delay associated with the carry-propagating addition performed in the last row of the array (e.g., see Figure 6.18) should be minimized, since it determines the throughput of the pipeline. This can be achieved by replacing the CPA with several additional rows that, like the first rows in the array, allow a carry propagation of only one position between any two consecutive rows. Five such rows are needed in the 5 × 5 array multiplier for unsigned numbers in Figure 6.18, with 4, 4, 3, 2, and 1 cells, respectively. These rows are shown in a pipelined version of the 5 × 5 array multiplier, depicted in Figure 6.22. The basic cells employed in this multiplier are shown in Figure 6.21 [16]. The FA in Figure 6.21 includes an AND gate that generates the product bit $a_i x_j$. This product bit is added to the incoming bits $b$ and $d$ to produce the output bits $S$ and $C_{out}$. The modified FA also propagates the bits $a_i$ and $x_j$ to neighboring cells. The two versions of the HA in the same figure are used in the bottom five rows, where the cells add only two input bits each.

In order to support pipelining, all cells in the array must include latches, so that each row can handle a separate multiplier-multiplicand pair. Also, registers are needed to propagate the multiplier bits to their destination, and to propagate the product bits that have been completed, which is done in parallel with the generation of new product bits. Up to 10 consecutive multiply operations can be executed simultaneously in the multiplier depicted in Figure 6.22. The maximum rate at which multiply operations can be completed is determined by the delay associated with the modified FA, including the latch. This rate might be, in practice, too high to be used as the clock rate of the circuit. However, other implementations of the 5 × 5 pipelined multiplier with a lower rate are possible. For example, two rows can be combined to form a single pipeline stage with a lower rate but with fewer latches (less circuitry overall).

6.7 OPTIMALITY OF MULTIPLIER IMPLEMENTATIONS

Bounds on the performance of algorithms for multiplication have been derived in a way similar to that of the bounds for addition that were described in Section 5.4. It is interesting to note that the theoretical bounds for multiplication are similar to those for addition, although, in practice, multiplication is more time-consuming than addition. Thus, if we adopt the idealized model, which assumes that all circuits are implemented using $(f, r)$ gates, the execution time of a multiply circuit for two operands with $n$ bits each must satisfy

$$T_{mult} \ge \lceil \log_f 2n \rceil. \qquad (6.5)$$

Also, if the residue number system is employed, smaller circuits with fewer inputs are required and, consequently, the lower bound is

$$T_{mult} \ge \lceil \log_f 2m \rceil, \qquad (6.6)$$

where $m$ is the number of digits that are needed to represent the largest modulus in the residue number system, as explained in Chapter 11, and usually $m \ll n$.

When searching for an optimal implementation of a multiplier in the conventional binary number system, we need to compare the performance (execution time) and implementation costs (e.g., regularity of the design, total area, etc.) of the previously described algorithms for multiplication. When both execution time and implementation cost, say, area, need to be taken into account, an objective function like $A \cdot T$ can be used, where $A$ denotes the area and $T$ denotes the execution time. A more general form of an objective function is $A \cdot T^{\alpha}$, where $\alpha$ can be either smaller or larger than 1.

In what follows we compare several multipliers, some of which were presented in previous sections. The simple array multiplier depicted in Figure 6.18 has a very regular structure. It can be implemented easily as a rectangular-shaped array, with no waste of chip area. The $n$ least significant bits of the final product are then produced on the right side of the rectangle, while the $n$ most significant product bits are the outputs of the bottom row of the rectangle, which constitutes a CPA. Although this implementation is highly regular and its design and layout are very simple, it possesses two major drawbacks. First, it requires a very large area, proportional to $n^2$, since it contains about $n^2$ FAs and AND gates. Second, it has a long execution time $T$ of about $2n\Delta_{FA}$ ($\Delta_{FA}$ is the delay of an FA). More precisely, $T$ consists of $(n - 1)\Delta_{FA}$ for the first $(n - 1)$ rows and an additional $(n - 1)\Delta_{FA}$ for the CPA if implemented as a ripple-carry adder, as shown in Figure 6.18. Thus, an objective function of the form $A \cdot T$ is directly proportional to $n^3$.

If a highly pipelined version of this array multiplier is desired, the required area increases even further (since the CPA must be replaced), as does the latency of a single multiply operation. However, the resulting pipeline period, which determines the pipelining rate, is shorter.

An implementation of the Booth's algorithm array multiplier, depicted in Figure 6.20, offers no advantage over the previous multiplier when performance and area are considered, since the area $A$ is of the order of $n^2$ and $T$ is linear in $n$. The modified radix-4 Booth algorithm (see Table 6.3) can potentially result in a better implementation, since it requires only $n/2$ rows of cells. This reduction in the number of rows could, in principle, reduce the delay ($T$) and the implementation cost ($A$) by a factor of two, decreasing the objective function $A \cdot T$ to a fourth of its previous value. However, a more detailed examination of the design reveals that the actual delay and area gains are less than expected. The recoding logic and, more importantly, the partial product selectors, add complexity to the circuit and result in a larger number of interconnections and a longer delay per row. Also, since the relative shift between any two adjacent rows is two bit positions, we must allow the carry to propagate horizontally in these bit positions. This can be achieved either locally or at the last row of the array multiplier. After that, a carry propagation through $(2n - 1)$ bits (instead of $n - 1$) is required [18]. The exact overall reduction in the objective function depends on the details of the design and the technology used.

Similar problems are encountered when implementing the radix-8 modified Booth's algorithm in the form of an array multiplier. In addition, the partial product $3A$ should be precalculated. Consequently, the reduction in delay and area may be far less than the expected factor of 1/3. Still, the implementation of the radix-8 algorithm might be cost-effective in certain technologies and design styles.

Irrespective of the way partial products are generated, they can be accumulated either through a cascade interconnection (as in Figures 6.18 and 6.20) or through a tree structure (e.g., a CSA tree, as in Section 5.11, or some variation of it, as in Section 6.4). The number of levels in a CSA tree for $k$ partial products is of the order of $\log k$ rather than being linear in $k$ (as in a cascade interconnection), resulting in a much shorter execution time (the number of partial products, $k$, can be $n$, $n/2$, $n/3$, etc., where $n$ is the number of bits). However, CSA tree structures have irregular interconnects, making it difficult to find an area-efficient layout with a rectangular shape. Moreover, an overall width of $2n$ is required in most cases. This may result in a multiplier area of the order of $2n \log k$. The objective function $A \cdot T$ may, consequently, increase as $2n \log^2 k$.

The balanced delay tree in Figure 6.13 has a more regular structure. The increments in the number of operands in the balanced delay tree are 3, 3, 5, 7, 9, .... The sum of the elements in this series is of the order of $j^2$, where $j$ is the number of elements in the series. The number of levels, which determines the overall delay, increases linearly with $j$. As a result, the overall delay of a balanced delay tree is of the order of $\sqrt{k}$, where $k = j^2$ is the number of operands. This needs to be compared to $\log k$, which is the number of levels in the complete binary tree. The detailed proof is left to the reader as an exercise.

One should be aware that general expressions for the complexity of either the execution time or the area, like the ones above, have theoretical importance, but only limited practical significance. For any given technology, a more detailed examination of the alternative designs is necessary before final conclusions can be drawn.

6.8 EXERCISES

6.1. Show that Booth's algorithm can be used to convert a number in two's complement representation to its SD representation.

6.2. Prove that no correction step is needed when using the multiplication algorithm in Table 6.3 with a negative multiplier represented in two's complement. Repeat this for the algorithm in Table 6.5 with a sign bit extension.

6.3. Verify that the new partial product in the radix-4 modified Booth algorithm is $(x_{i-1} + x_{i-2} - 2x_i) \cdot A$ for odd values of $i$. Use this expression to formally prove the correctness of the algorithm.

6.4. Write down the rules for a radix-8 modified Booth's algorithm or, in other words, a 3-bit version of the algorithm in Table 6.3.

6.5. (a) Verify that the 74261 chip, which is called a 2-bit by 4-bit parallel multiplier, implements the algorithm in Table 6.3.
(b) How many such chips are needed to construct a 12 × 12 bit two's complement multiplier? Show how these chips should be interconnected.
(c) Explain how the Q output signals of the 74261 chip are used to generate the sign bit of the partial product.
(d) What type of carry-save adder is needed?

6.6. In case 101 of Table 6.6 we need two forced carries into the adder. To avoid this, we may force $x_0$ to 0 if it equals 1 and set the initial partial product to be $+A$ instead of $-A$. Show that the correct partial product is always obtained.

6.7. Design a $3n \times 3n$ bit multiplier out of $n \times n$ bit multipliers. Find the number of $n \times n$ bit multipliers that are needed and show how the partial products should be aligned. What type of counters are needed to add the partial products? Can (5,5,4) counters be useful?

6.8. Write the truth table of a type II cell used in two's complement array multipliers and obtain the Boolean equations for the $c$ and $s$ outputs. Repeat this for type II' cells.

6.9. Can the four cells in the last row in Figure 6.19 be made into type II by defining the rightmost zero carry as having a positive weight?

6.10. The idea behind the array multiplier in Figure 6.19 was first proposed by Pezaris [19], who has shown a slightly different organization of the multiplier, as depicted in Figure 6.23. Explain why the $P_9$ output in Figure 6.23 is connected to the cell on its left side.

6.11. Design an array multiplier for two 5-bit nega-binary operands; for example, $X = \sum_{i=0}^{4} x_i (-2)^i$. What is the range of each operand and of the product? Draw the multiplier, indicate how many different types of 1-bit cells are needed, and write the truth table for each type.

6.12. In this question you are asked to estimate the execution time of an array multiplier like the one shown in Figure 6.18. Denote by $\Delta_s$ and $\Delta_c$ the delays associated with the sum and carry outputs of the basic cells, respectively, and assume that they satisfy $\Delta_s > \Delta_c$. Find the critical path in the array multiplier assuming all product bits $a_i x_j$ are available simultaneously. Estimate the execution time of an $n \times n$ bit multiplication. Can you suggest ways to speed up the operation?
FIGURE 6.23 The array multiplier for two's complement numbers suggested in [19].

6.13. Prove that the arrangement of partial product bits shown in Figure 6.24 produces the correct product of two 5-bit two's complement operands, where $\bar{x}_i = 1 - x_i$ and similarly $\bar{a}_i = 1 - a_i$. This arrangement was suggested by Baugh and Wooley [1]. Compare this multiplier to the two's complement array multipliers shown in Figures 6.19 and 6.20, considering the amount of hardware and execution time.

FIGURE 6.24 Partial products for a 5 × 5 two's complement multiplication [1].

6.14. The Booth's algorithm multiplier in Figure 6.20 starts with the most significant bit of the multiplier. Redesign the multiplier starting with the least significant bit of $X$. Compare the execution time and the required hardware of the two alternatives.

6.15. Show a block diagram of a 6 × 6 bit two's complement multiplier constructed out of multiplexer-adder circuits based on the radix-4 modified Booth's algorithm in Table 6.3.

6.16. Prove that the delay of the balanced delay tree shown in Figure 6.13 is proportional to $\sqrt{k}$, where $k$ is the number of operands.

6.17. Explain why the HAs in the leftmost column of the array multiplier in Figure 6.22 have no carry output.

6.18. Verify that the implementation in Figure 6.12 corresponds to the setting $a = b = c = 1$ and $d = e = f = 0$ of the variables in Table 6.7.

6.19. Find the values of $a$, $b$, $c$, $d$, $e$ and $f$ in Table 6.7 which will yield an expression with the smallest number of literals (a literal is any appearance of either $x_i$ or $\bar{x}_i$) for $C_{out}$.

6.20. Prove that the following modification (see Figure 6.25) of the arrangement of partial products (for two's complement operands) suggested in [1] produces the correct final product. Compare this arrangement to the original one shown in Figure 6.24.

FIGURE 6.25 Modified array of partial products for a 5 × 5 two's complement multiplication.

6.9 REFERENCES

[1] C.R. Baugh and B.A. Wooley, "A two's complement parallel array multiplication algorithm," IEEE Trans. on Computers, C-22 (Dec. 1973), 1045-1047.
[2] K.C. Bickerstaff, M.J. Schulte and E.E. Swartzlander, "Parallel reduced area multipliers," Journal of VLSI Signal Processing, 9 (1995), 181-191.
[3] A.D. Booth, "A signed binary multiplication technique," Quart. J. Mech. Appl. Math., 4, Part 2 (1951), 236-240.
[4] L. Dadda, "Some schemes for parallel multipliers," Alta Frequenza, 34 (March 1965), 349-356.
[5] L. Dadda, "On parallel digital multipliers," Alta Frequenza, 45 (1976), 574-580.
[6] J. Deverell, "Pipeline iterative arithmetic arrays," IEEE Trans. on Computers, C-24 (March 1975), 317-322.
[7] M.J. FLYNN and S.F. OBERMAN, Advanced Computer Arithmetic Design, Wiley, New York, 2001.
[8] J.A. GIBSON and R.W. GIBBARD, "Synthesis and comparison of two's complement parallel multipliers," IEEE Trans. on Computers, C-24 (Oct. 1975), 1020-1027.
[9] A. HABIBI and P.A. WINTZ, "Fast multipliers," IEEE Trans. on Computers, C-19 (Feb. 1970), 153-157.
[10] Y. HARATA et al., "A high-speed multiplier using a redundant binary adder tree," IEEE J. of Solid-State Circuits, SC-22 (Feb. 1987), 28-33.
[11] O.L. MACSORLEY, "High-speed arithmetic in binary computers," Proc. of IRE, 49 (Jan. 1961), 67-91.
[12] J.C. MAJITHIA and R. KITAI, "An iterative array for multiplication of signed binary numbers," IEEE Trans. on Computers, C-20 (Feb. 1971), 214-216.
[13] M. MEHTA, V. PARMAR and E. SWARTZLANDER, "High-speed multiplier design using multi-input counter and compressor circuits," Proc. 10th Symp. on Computer Arithmetic (1991), 43-50.
[14] R. MONTOYE, E. HOKENEK and S.L. RUNYON, "Design of the IBM RISC System/6000 floating-point unit," IBM Journal of Research and Development, 34 (January 1990), 59-67.
[15] Z.J. MOU and F. JUTAND, "Overturned-stairs adder trees and multiplier design," IEEE Trans. on Computers, 41 (August 1992), 940-948.
[16] T.G. NOLL et al., "A pipelined 330-MHz multiplier," IEEE Journal of Solid-State Circuits, SC-21 (June 1986), 411-416.
[17] V.G. OKLOBDZIJA and D. VILLEGER, "Improving multiplier design by using improved column compression tree and optimized final adder in CMOS technology," IEEE Trans. on VLSI Systems, 3 (June 1995), 292-301.
[18] V. PENG, S. SAMUDRALA and M. GAVRIELOV, "On the implementation of shifters, multipliers and dividers in VLSI floating-point units," Proc. of 8th Symp. on Computer Arithmetic (May 1987), 95-102.
[19] S.D. PEZARIS, "A 40-ns 17-bit by 17-bit array multiplier," IEEE Trans. on Computers, C-20 (April 1971), 442-447.
[20] G.W. REITWIESNER, "Binary arithmetic," in Advances in Computers, vol. 1, F.L. Alt (Editor), Academic, New York, 1960, pp. 231-308.
[21] L.P. RUBINFIELD, "A proof of the modified Booth's algorithm for multiplication," IEEE Trans. on Computers, C-24 (Oct. 1975), 1014-1015.
[22] M.R. SANTORO and M.A. HOROWITZ, "SPIM: A pipelined 64 x 64 iterative multiplier," IEEE Journal of Solid-State Circuits, 24 (April 1989), 487-493.
[23] P.F. STELLING, C.U. MARTEL, V.G. OKLOBDZIJA and R. RAVI, "Optimal circuits for parallel multipliers," IEEE Trans. on Computers, 47 (March 1998), 273-285.
[24] D. ZURAS and W.H. MCALLISTER, "Balanced delay trees and combinatorial division in VLSI," IEEE Journal of Solid-State Circuits, SC-21 (Oct. 1986), 814-819.
[25] R.K. YU and G.B. ZYNER, "167 MHz radix-4 floating point multiplier," Proc. of the 12th Symp. on Computer Arithmetic (July 1995), 149-154.

7 FAST DIVISION

There are two different approaches to the development of algorithms for high-speed division. The more conventional approach uses add/subtract and shift operations, while the second relies on multiplication. The operation count in the first approach is linearly proportional to the word size, n. The number of steps in the second approach is logarithmically proportional to n, but each individual step is more complex. The first approach is discussed in this chapter, while the second is presented in Chapter 8.

7.1 SRT DIVISION

The best-known division algorithm of the first type is SRT division, named after Sweeney, Robertson, and Tocher ([11], [15], [19]), each of whom developed it independently at around the same time.
The motivation behind the SRT algorithm was an attempt to speed up nonrestoring division (which consists of n add/subtract operations and is presented in Chapter 3) by allowing 0 to be a quotient digit for which no add/subtract operation is needed. In principle, we can change the rule for selecting the quotient digit in nonrestoring division to

$$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge D \\ 0 & \text{if } -D \le 2r_{i-1} < D \\ \bar{1} & \text{if } 2r_{i-1} < -D \end{cases} \tag{7.1}$$

and the corresponding new remainder is

$$r_i = 2r_{i-1} - q_i \cdot D. \tag{7.2}$$

This modified nonrestoring division is diagrammed in Figure 7.1.

FIGURE 7.1 Nonrestoring division with $q_i = 0$. (Plot of $r_i$ versus $2r_{i-1}$; figure not reproduced.)

The difficulty with this new selection rule is that a full comparison of $2r_{i-1}$ with either D or -D is required. If we restrict D to be a normalized fraction satisfying $1/2 \le |D| < 1$, we may reduce the region of $2r_{i-1}$ for which $q_i = 0$, as follows:

$$-D \le -\frac{1}{2} \le 2r_{i-1} < \frac{1}{2} \le D \tag{7.3}$$

The advantage of this is that now we can compare the partial remainder $2r_{i-1}$ to either 1/2 or -1/2, instead of D or -D. A binary fraction is larger than or equal to 1/2 if, and only if, it starts with 0.1. Similarly, a binary fraction is smaller than -1/2 if, and only if, it starts with 1.0 (in two's complement representation). Consequently, only two bits of $2r_{i-1}$ have to be examined, instead of a full-length comparison between $2r_{i-1}$ and D. In some cases (e.g., when the dividend X is larger than 1/2) the shifted partial remainder needs an integer bit in addition to the sign bit, and thus three bits of $2r_{i-1}$ must be examined. The rule for selecting the quotient digit becomes

$$q_i = \begin{cases} 1 & \text{if } 2r_{i-1} \ge 1/2 \\ 0 & \text{if } -1/2 \le 2r_{i-1} < 1/2 \\ \bar{1} & \text{if } 2r_{i-1} < -1/2. \end{cases} \tag{7.4}$$

The resulting algorithm is called SRT division, and it can be diagrammed as shown in Figure 7.2. This diagram shows the quotient digits that must be selected in order to satisfy the condition $|r_i| \le |D|$, guaranteeing the convergence of the division procedure with a final remainder smaller than $|D|$.

FIGURE 7.2 SRT division. (Plot of $r_i$ versus $2r_{i-1}$; figure not reproduced.)

The SRT division process starts off with a normalized divisor and has the effect of normalizing the partial remainder by shifting over leading 0's if it is positive, and leading 1's if it is negative. For example, if $2r_{i-1} = 0.001xxxx$ (where x is any binary digit), then $2r_{i-1} < 1/2$ and we set $q_i$ to 0, obtaining $2r_i = 0.01xxxx$, and so on. Similarly, if $2r_{i-1} = 1.110xxxx$, then $2r_{i-1} > -1/2$ and we set $q_i = 0$, obtaining $2r_i = 1.10xxxx$, and so on. We say, therefore, that SRT division is nonrestoring division with a normalized divisor and remainder.

SRT division, like nonrestoring division, can be extended to include negative divisors in two's complement. The selection rule for $q_i$ then becomes

$$q_i = \begin{cases} 0 & \text{if } |2r_{i-1}| < 1/2 \\ 1 & \text{if } |2r_{i-1}| \ge 1/2 \text{ and } r_{i-1} \text{ and } D \text{ have the same sign} \\ \bar{1} & \text{if } |2r_{i-1}| \ge 1/2 \text{ and } r_{i-1} \text{ and } D \text{ have opposite signs.} \end{cases} \tag{7.5}$$

Example 7.1
Let the dividend X be equal to $(0.0101)_2 = 5/16$ and the divisor D be $(0.1100)_2 = 3/4$. Applying the SRT algorithm yields

    r0 = X          0.0101
    2r0             0.1010     ≥ 1/2, set q1 = 1
    Add -D        + 1.0100
    r1              1.1110
    2r1 = r2        1.1100     ≥ -1/2, set q2 = 0
    2r2 = r3        1.1000     ≥ -1/2, set q3 = 0
    2r3             1.0000     < -1/2, set q4 = -1
    Add D         + 0.1100
    r4              1.1100     negative remainder and positive X
    Add D         + 0.1100     correction
    r4              0.1000     corrected final remainder
The quotient generated before the correction is $Q = 0.100\bar{1}$. This is a minimal representation of Q = 0.0111 in SD form. In other words, a minimal number of add/subtract operations is performed. After correction, Q becomes $0.0111 - ulp = 0.0110_2 = 3/8$, and the final remainder is $1/2 \cdot 2^{-4} = 1/32$. □

Example 7.2
Let $X = (0.00111111)_2 = 63/256$ and $D = (0.1001)_2 = 9/16$.

    r0 = X          0.00111111
    2r0             0.01111110     < 1/2, set q1 = 0
    2r1             0.11111100     ≥ 1/2, set q2 = 1
    Add -D        + 1.0111
    r2              0.01101100
    2r2             0.11011000     ≥ 1/2, set q3 = 1
    Add -D        + 1.0111
    r3              0.01001000
    2r3             0.10010000     ≥ 1/2, set q4 = 1
    Add -D        + 1.0111
    r4              0.00000000     zero final remainder

$Q = 0.0111_2 = 7/16$. This is not a minimal representation of the quotient in SD form. □

Based on the last example, we may conclude that it is possible to further reduce the number of add/subtract operations. Simulations and statistical analysis studying the efficiency of the SRT method have been performed [9], and the conclusions were:

1. The average "shift" in the SRT method is 2.67, meaning that for a dividend of length n we need, on the average, n/2.67 operations. For example, for n = 24, on the average, 24/2.67 = 8.9 ≈ 9 operations are required.

2. The actual number of operations needed depends upon the divisor D. The smallest number is achieved when $17/28 \le D \le 3/4$ (or, approximately, when $3/5 \le D \le 3/4$), with an average shift of 3.

Hence, in order to reduce the number of add/subtract operations, we should modify the SRT method when the divisor happens to be out of the optimal range ($3/5 \le D \le 3/4$). Two ways of achieving this are described below:

1. Examine the possibility of using a multiple of D, like 2D if D is too small or D/2 if D is too large, in some of the steps during the division. Notice that subtracting 2D (D/2) instead of D is equivalent to performing the subtraction one position earlier (later).

2. Change the comparison constant K = 1/2 if D is outside the optimal range. Such a change is allowed because the ratio D/K is what really matters, since we compare the partial remainder to K, not D.

The idea behind scheme (1) is that whenever D is small we may end up generating a sequence of 1's in the quotient one bit at a time, requiring a subtract operation for each bit, as in the last example. In such cases, subtracting 2D instead of D (which is equivalent to subtracting D in the previous step) might generate a negative partial remainder, allowing us to generate a sequence of 0's as quotient bits while normalizing the partial remainder.

Example 7.3
Repeating the previous example we obtain

    r0 = X           0.00111111
    2r0              0.01111110     < 1/2, set q1 = 0
    2r1             00.11111100     subtract 2D instead of D
    Add -2D       + 10.1110
    r2              11.11011100     set q1 = 1 and q2 = 0
    2r2              1.10111000     set q3 = 0
    2r3              1.01110000     < -1/2, set q4 = -1
    Add D          + 0.1001
    r4               0.00000000     zero final remainder

$Q = 0.100\bar{1}_2 = 7/16$, and this is a minimal representation of the quotient in SD form. □

If D is large, a single 0 within a sequence of 1's in the quotient may result in two consecutive add/subtract operations, instead of one. Performing an addition of D/2 instead of D for the last 1 before the single 0 (which is equivalent to performing the addition one position later) may generate a negative partial remainder that will allow us to properly handle the single 0, and then continue normalizing the partial remainder until the end of the sequence of 1's is reached.

Example 7.4
Let $X = (0.01100)_2 = 3/8$ and $D = (0.11101)_2 = 29/32$. The correct 5-bit quotient is $Q = 0.01101_2 = 13/32$.
Applying the basic SRT algorithm results in $Q = 0.10\bar{1}1\bar{1}$, and the single 0 within the group of 1's in Q is not handled in the most efficient way. If we use the multiple D/2 we obtain
    r0 = X          0.01100
    2r0             0.11000      ≥ 1/2, set q1 = 1
    Add -D        + 1.00011
    r1              1.11011
    2r1             1.10110      set q2 = 0
    2r2             1.01100      add D/2 instead of D
    Add D/2       + 0.011101
    r3              1.110101     set q3 = 0 and q4 = -1
    2r3             1.10101
    2r4             1.01010      < -1/2, set q5 = -1
    Add D         + 0.11101
    r5              0.00111      final remainder = 7/32 · 2^-5

$Q = 0.100\bar{1}\bar{1}_2 = 13/32$; i.e., the single 0 is handled properly. □

To implement this scheme, two adders are needed. One adder will always add or subtract D, while the other will add/subtract 2D if D is too small (i.e., D starts with 0.10 in its true form) or add/subtract D/2 if D is too large (i.e., D starts with 0.11 in its true form) [11]. The output of the primary adder is normally used, unless the output of the alternate adder results in a larger normalizing shift.

The idea of using multiples of D can be extended to the use of 3D/2 and 3D/4 in addition to D itself. These provide an even higher overall average shift (of about 3.7), but require a more complex implementation [11].

Scheme (2) is based on the fact that for K = 1/2, the ratio D/K in the optimal range $3/5 \le D \le 3/4$ is

$$\frac{6}{5} = \frac{3/5}{1/2} \le \frac{D}{K} \le \frac{3/4}{1/2} = \frac{3}{2} \quad \text{or} \quad \frac{6}{5} K \le D \le \frac{3}{2} K. \tag{7.6}$$

If the given D is not in the optimal range for K = 1/2, we can choose a different comparison constant K. Consequently, for different ranges of D there are different values of K. A numerical search for K [13] has shown that satisfactory results can be obtained if the region $1/2 \le |D| < 1$ is divided into five (not equally sized) subregions, each having a different comparison constant $K_i$, as depicted in Figure 7.3. Note that four bits of the divisor have to be examined in order to select the comparison constant, which in turn has only four bits to be compared to the four most significant bits of the remainder.
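The selection rule (7.4) and the effect of changing the comparison constant K can be sketched in a few lines of Python. This is only an illustrative model, not a description of any hardware: the function name `srt_divide` and its parameters are assumptions made for this sketch, it handles only a nonnegative dividend and a positive normalized divisor, and it compares the full shifted remainder with K using exact rational arithmetic rather than examining a few leading bits.

```python
from fractions import Fraction


def srt_divide(x, d, n, K=Fraction(1, 2)):
    """Radix-2 SRT division sketch for x >= 0 and normalized d > 0.

    Returns (digits, quotient, remainder, ops):
      digits    - signed quotient digits q_1..q_n before the correction step
      quotient  - corrected quotient as a Fraction
      remainder - final remainder scaled by 2**-n, so x == quotient*d + remainder
      ops       - number of add/subtract operations (correction not counted)
    """
    r = Fraction(x)
    digits = []
    ops = 0
    for _ in range(n):
        p = 2 * r                      # shifted partial remainder 2*r_{i-1}
        if p >= K:
            q = 1
        elif p < -K:
            q = -1
        else:
            q = 0                      # shift only, no add/subtract needed
        if q != 0:
            ops += 1
        r = p - q * d                  # r_i = 2*r_{i-1} - q_i*D
        digits.append(q)
    quotient = sum(Fraction(q, 2 ** (i + 1)) for i, q in enumerate(digits))
    if r < 0:                          # negative remainder with positive dividend
        r += d
        quotient -= Fraction(1, 2 ** n)
    return digits, quotient, r / 2 ** n, ops
```

With K = 1/2 the sketch reproduces the digit sequences of Examples 7.1 and 7.2. Calling it for X = 63/256 and D = 9/16 with `K=Fraction(7, 16)`, the constant that Figure 7.3 assigns to this divisor, needs two add/subtract operations instead of three.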
The determination of the subregions for the divisor and the corresponding comparison constants has to be done through a numerical search. This is because both should be binary fractions with a small number of bits, in order to simplify the resulting division algorithm.

FIGURE 7.3 The values of the comparison constant for the five divisor subregions.

    Divisor subregion                      Comparison constant
    1/2   (.1000) ≤ D < 9/16  (.1001)      K1 = 3/8   (.0110)
    9/16  (.1001) ≤ D < 5/8   (.1010)      K2 = 7/16  (.0111)
    5/8   (.1010) ≤ D < 3/4   (.1100)      K3 = 1/2   (.1000)
    3/4   (.1100) ≤ D < 15/16 (.1111)      K4 = 5/8   (.1010)
    15/16 (.1111) ≤ D < 1.0                K5 = 3/4   (.1100)

Example 7.5
We repeat the division in Example 7.2 with $X = (0.00111111)_2 = 63/256$ and $D = (0.1001)_2 = 9/16$. The appropriate comparison constant for this divisor is $K_2 = 7/16 = 0.0111_2$ (see Figure 7.3). If the remainder is negative, it should be compared to the two's complement of $K_2$, which is $1.1001_2$.

    r0 = X          0.00111111
    2r0             0.01111110     ≥ 0.0111, set q1 = 1
    Add -D        + 1.0111
    r1              1.11101110
    2r1 = r2        1.11011100     ≥ 1.1001, set q2 = 0
    2r2 = r3        1.10111000     ≥ 1.1001, set q3 = 0
    2r3             1.01110000     < 1.1001, set q4 = -1
    Add D         + 0.1001
    r4              0.00000000     zero final remainder

The quotient $Q = 0.100\bar{1} = 0.0111_2 = 7/16$ is represented in a minimal SD form. □

7.2 HIGH-RADIX DIVISION

The number of add/subtract operations required by the radix-2 SRT algorithm and its variations is data-dependent. Thus, an asynchronous circuit must be designed in order to take advantage of the reduced number of nonzero bits in the quotient. Consequently, attempts to increase the number of zeros in the quotient have, in the currently available technology, very limited practical significance. The number of add/subtract operations in the division process can be reduced and still be data-independent by increasing the radix $\beta$ for the process, where selecting $\beta = 2^m$ allows the generation of m quotient bits at each step. In this way, the number of steps is reduced to $\lceil n/m \rceil$. The recursive equation
for the remainder is now

$$r_i = \beta r_{i-1} - q_i \cdot D \tag{7.7}$$

where the multiplication by $\beta = 2^m$ is achieved by shifting the remainder m bit positions to the left. The digit set for the quotient is $\{0, 1, \ldots, (\beta - 1)\}$ for the restoring division and can be as large as $\{\overline{(\beta - 1)}, \ldots, \bar{1}, 0, 1, \ldots, (\beta - 1)\}$ for the high-radix SRT division.

A radix higher than 2 can, in principle, be used for any of the previously mentioned division algorithms. For restoring division, this means that we start with the initial guess $q_i = 1$ and, if the resulting remainder $\beta r_{i-1} - D$ is positive, we increase it to $q_i = 2$ and subtract D from the temporary remainder, obtaining $\beta r_{i-1} - 2D$. The process is repeated until we reach the value $q_i = j$ for which the temporary remainder is negative. We then restore the remainder by adding D, obtaining $\beta r_{i-1} - (j - 1)D$, and set $q_i = j - 1$. This sequential procedure can be very time-consuming, making its advantage over the binary algorithm questionable. It can be replaced by a parallel process if several comparison circuits, comparing $\beta r_{i-1}$ to multiples of the divisor, jD, are included in the division unit. The comparison circuit producing the smallest positive remainder points to the correct quotient digit. Clearly, this implementation requires a substantial hardware investment.

Similar changes can be introduced into the binary nonrestoring division algorithm. In what follows we describe the high-radix SRT algorithm. It is possible to implement a high-radix SRT division circuit that is faster than its binary version. The quotient digit $q_i$ in such an algorithm is a signed digit in the range $\{\bar{\alpha}, \overline{\alpha - 1}, \ldots, \bar{1}, 0, 1, \ldots, \alpha\}$, where $\lceil \frac{1}{2}(\beta - 1) \rceil \le \alpha \le (\beta - 1)$ (see Chapter 2). To find out the possible choices for $\alpha$ in the high-radix division algorithm consider the following. The quotient
digit $q_i$ is ordinarily selected so that $|r_i| < |D|$; otherwise, the next quotient digit might have to be $\beta$ or larger. This guarantees the convergence of the division procedure. The above condition implies that for the maximal remainder $\beta r_{i-1} = \beta(D - ulp)$ and a positive divisor D, selecting the largest value for the quotient digit, $q_i = \alpha$, should be sufficient to yield a remainder $r_i$ in the allowable region. Therefore, the following inequality should hold:

$$r_i = \beta(D - ulp) - \alpha D \le D - ulp \tag{7.8}$$

Dividing Equation (7.8) by D reveals that we may select for $\alpha$ only the maximum value $\alpha = (\beta - 1)$. It is reasonable, however, to consider division techniques for which $|r_i| \le k|D|$, where k is a fraction, since this reduces the size of the allowable region for the partial remainder, as shown in Figure 7.4. Equation (7.8) now takes the form

$$r_i = \beta k(D - ulp) - \alpha D \le k(D - ulp). \tag{7.9}$$

FIGURE 7.4 The allowable region for the partial remainder ($k \le 1$). (Plot of $r_i/D$ versus $\beta r_{i-1}/D$; figure not reproduced.)

Hence, $\alpha \ge k(\beta - 1)$, and if we want to allow the selection of any value of $\alpha$ in the range $\lceil \frac{1}{2}(\beta - 1) \rceil \le \alpha \le (\beta - 1)$, k should satisfy $1/2 \le k \le 1$, since $k \le \frac{\alpha}{\beta - 1}$. The smaller the value of k, the smaller the redundancy is in the number system for the quotient.

Example 7.6
Let $\beta = 4$ and $\alpha = 2$. We set $k = \alpha/(\beta - 1) = 2/3$. For this selection, $|r_i| \le kD = \frac{2}{3}D$, and $|\beta r_{i-1}| = |4r_{i-1}| \le \frac{8}{3}D$, or $\left|\frac{r_i}{D}\right| \le \frac{2}{3}$ and $\left|\frac{4r_{i-1}}{D}\right| \le \frac{8}{3}$. The digit set for $q_i$ is $\{\bar{2}, \bar{1}, 0, 1, 2\}$. The region in which a specific value q can be selected is given by

$$-\frac{2}{3} \le \frac{4r_{i-1}}{D} - q \le \frac{2}{3}$$

and consequently,

$$-\frac{2}{3} + q \le \frac{4r_{i-1}}{D} \le \frac{2}{3} + q.$$

For example, the value $q_i = 2$ may be selected in the region $\frac{4}{3} \le \frac{4r_{i-1}}{D} \le \frac{8}{3}$, since $-\frac{2}{3} \le \left(\frac{4r_{i-1}}{D} - 2\right) \le \frac{2}{3}$. Similarly, we may select $q_i = 1$ for $\frac{1}{3} \le \frac{4r_{i-1}}{D} \le \frac{5}{3}$. The different regions for selecting the quotient digits are shown in Figure 7.5. In the overlapping region, namely, $\frac{4}{3} \le \frac{4r_{i-1}}{D} \le \frac{5}{3}$, we can select either $q_i = 1$ or $q_i = 2$. Similar overlapping regions exist for $q_i = 0$ and $q_i = 1$, for $q_i = 0$ and $q_i = \bar{1}$, and also for $q_i = \bar{1}$ and $q_i = \bar{2}$. □

In general, the ratio $k = \alpha/(\beta - 1)$ is a measure of the redundancy in the representation of the quotient digits. The larger this ratio, the larger the overlap regions are in the plot of $r_i/D$ versus $\beta r_{i-1}/D$. For example, if we set $\alpha = \beta - 1 = 3$, then k equals 1, which corresponds to the maximum redundancy. In this case, the region for $q_i = 1$ is $0 \le \frac{4r_{i-1}}{D} \le 2$, and, for $q_i = 2$, it is $1 \le \frac{4r_{i-1}}{D} \le 3$. Thus, the overlapping region where either $q_i = 1$ or $q_i = 2$ can be
selected is $1 \le \frac{4r_{i-1}}{D} \le 2$, which is larger than the corresponding overlap region for $\alpha = 2$ (see Figure 7.5).

FIGURE 7.5 The quotient digits for $\beta = 4$ and $\alpha = 2$. (Plot of $r_i/D$ versus $4r_{i-1}/D$, extending from $(-8/3, -2/3)$ to $(8/3, 2/3)$; figure not reproduced.)

The implication of having an overlap region is that we have a choice of values, of both the partial remainder and the divisor, that will eventually separate the two adjacent regions corresponding to two consecutive values of the quotient digit q. The selected values of the partial remainder and divisor separating the adjacent regions of q will serve as comparison constants during the execution of the divide operation. We may therefore select these comparison constants so that they require as few digits as possible. Such a selection will reduce the execution time of the comparison step when determining the quotient digit. Clearly, a larger overlap region (corresponding to a higher value of $\alpha$) may allow us to select comparison constants with fewer digits. On the other hand, a higher value of $\alpha$ means having to produce more multiples of the form $\alpha \cdot D$, requiring extra hardware and/or time.

For a given $\alpha$ we need to determine the number of bits of the partial remainder and the divisor that must be examined in order to select the quotient digit. This is the most difficult step when developing a high-radix SRT algorithm. It can be accomplished numerically, analytically, or graphically. A combination of these techniques can also be employed.

To graphically determine the required number of bits of the partial remainder and the divisor we use the P-D, or partial remainder versus divisor, plot ([2, 9]), like the one shown in Figure 7.6. The purpose of the P-D plot is to indicate the regions in which given values of $q_i$ may be selected.
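The selection intervals of Example 7.6 follow directly from $k = \alpha/(\beta - 1)$, and a short Python sketch can tabulate them for any radix and digit set. The helper names `selection_regions` and `overlap` are hypothetical, introduced here only for illustration; the intervals are expressed in terms of the ratio $\beta r_{i-1}/D$ for a positive divisor.

```python
from fractions import Fraction


def selection_regions(beta, alpha):
    """Map each digit q to the interval [q - k, q + k] of beta*r_{i-1}/D
    in which q may be selected, with k = alpha / (beta - 1)."""
    k = Fraction(alpha, beta - 1)
    return {q: (q - k, q + k) for q in range(-alpha, alpha + 1)}


def overlap(regions, q):
    """Interval of beta*r_{i-1}/D in which either q or q + 1 may be selected."""
    low = regions[q + 1][0]   # lower edge of the region for q + 1
    high = regions[q][1]      # upper edge of the region for q
    return (low, high)


if __name__ == "__main__":
    for alpha in (2, 3):
        regions = selection_regions(4, alpha)
        print("alpha =", alpha, "overlap of q=1 and q=2:", overlap(regions, 1))
```

Every overlap has width $2k - 1$, so it vanishes for $k = 1/2$ and reaches its maximum of 1 for $k = 1$, which matches the comparison between $\alpha = 2$ and $\alpha = 3$ made above.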
To determine the region for a given value q of the quotient digit, consider again the basic equation for the partial remainder, rewritten as

$$\beta r_{i-1} = r_i + q \cdot D. \tag{7.10}$$

FIGURE 7.6 A P-D plot. (The plot shows the lines $(k + j + 1) \cdot D$, $(k + j) \cdot D$, $(-k + j + 1) \cdot D$ and $(-k + j) \cdot D$ over the divisor axis, with the points $D_{min}$, $D_1$, $D_2$ and $D_{max}$ marked; figure not reproduced.)

To simplify the notation from this point on, we denote the previous partial remainder $\beta r_{i-1}$ by P, as shown in Figure 7.6. The maximum value of P for which the value q can be selected depends on the maximum allowed value for $r_i$, and since

$$-kD \le r_i \le kD, \tag{7.11}$$

we obtain an upper limit for P, for which we may select the value q for the quotient digit:

$$P_{max} = (k + q) \cdot D \tag{7.12}$$

Similarly, the lower limit for P is

$$P_{min} = (-k + q) \cdot D. \tag{7.13}$$

Equations (7.12) and (7.13) for a specific value of q, say, q = j, are represented by two lines in Figure 7.6. At each point in the region between these two lines we may select the value j for the quotient digit q. Due to the redundancy in representing the quotient, the regions for q = j and q = j + 1 overlap. The overlapping region is between the upper line for q = j and the lower line for q = j + 1, as shown in Figure 7.6.

Note that Figure 7.6 includes only positive values of the divisor and partial remainder, and thus constitutes only one-quarter of the complete P-D plot. However, the complete P-D plot is symmetric about both axes, and, in
most cases, it is sufficient to analyze one-quarter of it. Note also that only values of $|D|$ in the range $[D_{min}, D_{max})$ are of interest. Examples for this range are [0.5, 1) and [1, 2). The latter is applicable when dividing floating-point numbers in the IEEE standard (see Chapter 4).

For the above-mentioned overlapping region, we have to determine the value of P which will eventually separate the selection regions of q = j and q = j + 1. This value will serve as a comparison constant, and the number of bits required to represent it will determine the necessary precision when examining the partial remainder in order to select q. The line separating the regions of the partial remainder may be a straight horizontal line (one that is independent of D) or a stairstep function, partitioning the range of the divisor $[D_{min}, D_{max})$ into several intervals. We may have a single horizontal line P = c, where c is a constant, if and only if there is a value c satisfying

$$(k + j) \cdot D_{min} \ge c \ge (-k + j + 1) \cdot D_{max}. \tag{7.14}$$

This implies that the selection of q will be independent of D and will depend only on P. If this inequality is not satisfied, we must divide $[D_{min}, D_{max})$ into several smaller intervals. The "stepping" points determine the precision (i.e., the number of digits) at which we have to examine D, while the height of the steps determines the precision at which the partial remainder has to be examined.

The maximum width of a step between $D_1$ and $D_2$, denoted by $\Delta X$, is the horizontal distance between the two lines defining the overlap region (see Figure 7.6). The expression for this horizontal distance is, in general,

$$\Delta X = D_2 - D_1 = \frac{P}{-k + j + 1} - \frac{P}{k + j} = P \cdot \frac{2k - 1}{j(j + 1) + k(1 - k)}. \tag{7.15}$$

The horizontal distance $\Delta X$ is minimal when j is maximal and P is minimal. The maximum value of (j + 1) is $\alpha$; hence $j = \alpha - 1$. P is minimal when $D_1 = D_{min}$. Thus,

$$\Delta X_{min} = D_{min} \cdot (k + \alpha - 1) \cdot \frac{2k - 1}{\alpha(\alpha - 1) + k(1 - k)}. \tag{7.16}$$

The maximum height $\Delta Y$ of the step, or the vertical distance between the two lines, is

$$\Delta Y = (k + j)D - (-k + j + 1)D = (2k - 1) \cdot D. \tag{7.17}$$

This vertical distance is a minimum when $D = D_{min}$. Consequently, to determine the precision at which the partial remainder and divisor have to be examined, it is sufficient to consider the overlapping region between $q = \alpha$ and $q = \alpha - 1$ near $D_{min}$.

Let $N_P$ denote the number of bits of the partial remainder P that have to be examined in order to determine the correct value q of the quotient digit. $N_D$ is defined similarly for the divisor. The selection of the value q can be done by a look-up table implemented, for example, in a PLA (programmable logic array) with $N_P + N_D$ inputs. Our objective is to minimize the size of the look-up table, which, in turn, will speed up the division process. Let $\epsilon_P$ and $\epsilon_D$ denote the number of fractional bits within $N_P$ and $N_D$, respectively. The precision at which the partial remainder is examined is thus $2^{-\epsilon_P}$, and similarly $2^{-\epsilon_D}$ is the precision of the "truncated" divisor. Clearly, these two must satisfy

$$2^{-\epsilon_D} \le \Delta X_{min} \quad \text{and} \quad 2^{-\epsilon_P} \le \Delta Y_{min}. \tag{7.18}$$

However, the inequalities provide only upper bounds for determining the precision at which the divisor and remainder should be examined, since the two extreme points of the interval $\Delta X$ ($\Delta Y$) may require a higher precision; i.e., more than $\epsilon_D$ ($\epsilon_P$) fractional bits. To check whether the computed values of $\epsilon_D$ and $\epsilon_P$ are sufficient, or whether a higher precision is required, we may use the P-D plot. This plot can also be used to decide on the value of q for each pair of values of P and D when truncated to the most significant $N_P$ and $N_D$ bits, respectively.

When making these decisions we must take into account the limited precision of the divisor and the partial remainder. As a result, each point with the coordinates (P, D) in the P-D plot represents all the partial remainder-divisor pairs with

$$P \le \text{partial remainder} \le P + 2^{-\epsilon_P}, \qquad D \le \text{divisor} \le D + 2^{-\epsilon_D}.$$

Therefore, we must select for the pair (P, D) a value of q which is legitimate for all the pairs in the above range.

Consider, in particular, point A in Figure 7.6. For a divisor that equals $D_2$ we may select q = j + 1, but for a divisor that equals $D_2 + 2^{-\epsilon_D}$ we must select q = j. Consequently, we should not select q = j + 1 for point A or for any other point in the overlap region whose horizontal distance from the line $(-k + j + 1)D$ is smaller than $2^{-\epsilon_D}$.

Example 7.7
The P-D plot for $\beta = 4$, $\alpha = 2$, and $D \in [0.5, 1)$ is shown in Figure 7.7. In this figure, the overlapping region for q = 1 and q = 2 lies between the lines
(1) $P = (k + \alpha - 1)D = 5/3 \cdot D$, and (2) $P = (-k + \alpha)D = 4/3 \cdot D$.

We first check the possibility of a single horizontal line, as in Equation (7.14). $(k + 1)D_{min} = 5/3 \cdot 0.5$ and $(-k + 2)D_{max} = 4/3 \cdot 1$, and since 5/6 < 4/3, a single line is impossible and we must have several divisor intervals. Next, we calculate the smallest horizontal and vertical distances:

$$\Delta X_{min} = D_{min} \cdot \frac{5}{3} \cdot \frac{3}{20} = \frac{5}{6} \cdot \frac{3}{20} = \frac{1}{8} = 2^{-3}, \text{ hence } \epsilon_D \ge 3;$$

$$\Delta Y_{min} = D_{min} \cdot \frac{1}{3} = \frac{1}{6}, \text{ hence } \epsilon_P \ge 3.$$

The P-D plot in Figure 7.7(a) includes a grid with $\epsilon_P = \epsilon_D = 3$. A precise examination of the overlapping region between q = 1 and q = 2 in Figure 7.7(a) reveals that for the partial remainder-divisor pair (0.110, 0.100) we cannot select a value of q which will be legitimate for all the points in the corresponding rectangle. Furthermore, it is apparent that by increasing $\epsilon_D$ to 4 the above problem is resolved. The corresponding grid is shown in Figure 7.7(b). We now have to decide on the value of q for each rectangle in this grid.

Figure 7.7(b) shows all the possible selections of a value for q for all the pairs (P, D) within the overlapping regions. The heavy lines in the figure show one possible set of lines separating the regions for different values of q. Clearly, this is one out of many possible solutions, allowing the designer to select a solution that will, for example, minimize the PLA which implements the look-up table for q. Such a PLA will have $N_D + N_P$ inputs, where $N_D = 4$ and $N_P = 6$, since three more bits are needed for the integer part of the remainder and its sign ($-8/3 \le P \le 8/3$). The number of inputs to this PLA can be reduced to $N_P + N_D - 1 = 9$ by taking advantage of the fact
that the most significant bit of D is always 1 and can therefore be omitted.

Note that theoretically, for the overlapping region between q = 1 and q = 0 we could use a single horizontal line, since $2/3 \cdot 1/2 \ge 1/3 \cdot 1$. This, however, will require a high-precision comparison of the partial remainder, since c = 1/3 requires that all the fractional bits in the partial remainder be compared. We therefore partition the divisor interval into two subintervals, as shown in Figure 7.7(b). □

FIGURE 7.7 The P-D plot for $\beta = 4$, $\alpha = 2$, and $D \in [0.5, 1)$. (a) $\epsilon_P = \epsilon_D = 3$. (b) $\epsilon_P = 3$, $\epsilon_D = 4$. (Both panels plot $P = 4r_{i-1}$ versus D with the lines $8/3 \cdot D$, $5/3 \cdot D$, $4/3 \cdot D$, $2/3 \cdot D$ and $1/3 \cdot D$; figure not reproduced.)

Example 7.8
Let $X = (0.00111111)_2 = 63/256$ and $D = (0.1001)_2 = 9/16$ as in Example 7.5. For this divisor, the comparison constants for the partial remainder are, according to Figure 7.7(b), 1/4 (0.010) and 7/8 (0.111).
    r_0 = X      00.00111111
    4r_0         00.111111      >= 7/8, set q_1 = 2
    Add -2D      10.111
    r_1          11.110111
    4r_1         11.0111        < -1/4, set q_2 = \bar{1}
    Add D        00.1001
    r_2          00.0000        zero final remainder

The resulting quotient is $Q = (0.2\bar{1})_4 = (0.100\bar{1})_2 = (0.0111)_2 = 7/16$. □

The entries of the look-up table for the above algorithm can also be calculated numerically (instead of graphically), with the number of inputs to the look-up table determined by a trial-and-error numerical search. Suppose, for example, that we start with an initial guess of $\varepsilon_D = 3$ and $\varepsilon_P = 3$ and we attempt to calculate the value of q for D = 0.100 and P = 0.110. Since we truncated the divisor, we need to consider divisors from 0.100 to 0.101. Similarly, the partial remainder could have a value from 0.110 to 0.111. Thus, P/D could be as small as 0.110/0.101 or as large as 0.111/0.100. The first equals 0.110/0.101 = 1.2 and, according to Figure 7.5, requires q = 1, while the second equals 0.111/0.100 = 1.75, requiring q = 2. We therefore conclude that the above precision of P and D is insufficient. We must increase the number of bits of either the divisor or the partial remainder and try again. A simple program could be prepared to perform this numerical search and determine the value of q for each (P, D) pair [6].

To illustrate the lower precision of comparison needed for a higher value of $\alpha$ (i.e., a higher level of redundancy) consider the following example.

Example 7.9
For $\beta = 4$ and $\alpha = 3$, k is $\alpha/(\beta - 1) = 1$. The region for q = 2 is between the lines $P = (k + q)D = 3D$ and $P = (-k + q)D = D$, while the region for q = 3 is between the lines P = 4D and P = 2D. Thus, the overlapping region for q = 2 and q = 3 is between the lines P = 3D and P = 2D, as shown in Figure 7.8. For $D \in [1, 2)$, as in the IEEE floating-point standard, we have the following inequalities:

$\Delta x_{\max} = \ldots = 2^{-1}$, hence $\varepsilon_D \ge 1$;
$\Delta y_{\min} = D_{\min} \cdot 1 = 1$, hence $\varepsilon_P \ge 0$.

To obtain the values of the comparison constants we have to examine the diagram in Figure 7.8. We conclude that $\varepsilon_D = 1$ and $\varepsilon_P = 0$ and therefore, $N_D = 2$ and $N_P = 4$, instead of $N_D = 4$ and $N_P = 6$ for $\alpha = 2$. A less complicated quotient selection logic is needed here, but the multiple 3D is required, which is costly to generate. □

[Figure 7.8: P-D plot with $P = 4r_{i-1}$ on the vertical axis, D from 1.0 to 10.0 on the horizontal axis, and the lines D, 2D and 3D.]
FIGURE 7.8 The P-D plot for $\beta = 4$, $\alpha = 3$, and $D \in [1, 2)$.

Example 7.10
Let $X = (01.0101)_2 = 21/16$ and $D = (01.1110)_2 = 15/8$. For this divisor, the partial remainder comparison constants are, according to Figure 7.8, 1, 2 and 4.

    r_0 = X      01.0101
    4r_0         0101.010       >= 4.0, set q_1 = 3
    Add -3D      1010.011
    r_1          1111.101
    4r_1         1110.100       >= -2.0, set q_2 = \bar{1}
    Add D        0001.111
    r_2          0000.011       final remainder = 3/8 · 2^{-4}

The quotient is $Q = (0.3\bar{1})_4 = (0.1101)_2 = 11/16$. We verify the result of the divide operation through $X = Q \cdot D + R = 11/16 \cdot 15/8 + 3/128 = 168/128 = 21/16$. □
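The trial-and-error search for a sufficient precision can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the program referred to in [6]: the function name `feasible_digits` and its parameters are hypothetical, only nonnegative partial remainders are handled, and the truncation intervals are treated as closed, as in the worked numbers of the text. For a truncated pair (P, D) it returns every digit q whose selection region, $(q - k)D \le P \le (q + k)D$, contains the whole uncertainty rectangle.

```python
from fractions import Fraction


def feasible_digits(p, d, eps_p, eps_d, alpha=2, beta=4):
    """Digits q in 0..alpha that are valid for every (P, D) whose
    truncations to eps_p and eps_d fractional bits are p and d.
    Illustrative helper; nonnegative partial remainders only."""
    k = Fraction(alpha, beta - 1)
    p_lo, p_hi = p, p + Fraction(1, 2 ** eps_p)
    d_lo, d_hi = d, d + Fraction(1, 2 ** eps_d)
    lo, hi = p_lo / d_hi, p_hi / d_lo      # extreme values of P/D
    # q is usable only if q - k <= P/D <= q + k over the whole rectangle
    return [q for q in range(alpha + 1) if q - k <= lo and hi <= q + k]


# D = 0.100, P = 0.110 with three fractional bits each: no single digit works
print(feasible_digits(Fraction(3, 4), Fraction(1, 2), 3, 3))
# One more bit of P (P = 0.1100) makes q = 1 safe for this entry
print(feasible_digits(Fraction(3, 4), Fraction(1, 2), 4, 3))
```

An empty list signals that the chosen precision is insufficient for that table entry, which is the condition that triggers another round of the search.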
7.3 SPEEDING UP THE DIVISION PROCESS

A major reason for the low speed of the division process is the fact that we have to complete the ith step before continuing to the (i + 1)th step. The multiply process can be accelerated easily by generating several partial products simultaneously, since they are independent. In contrast, the steps in the division process are dependent, and we cannot start a new step before the current remainder is known and a new quotient digit is selected. Each step in the division consists of two substeps. First, a quotient digit is selected, and then the new partial remainder is calculated.

We can speed up the high-radix division process described in Section 7.2 in one of two ways. One way is to overlap the full-precision calculation of the partial remainder in step i with the selection process of the quotient digit in step (i + 1). This overlapping is possible, since not all bits of the new partial remainder must be known in order to select the next quotient digit. Another way to speed up this process is to replace the carry-propagate add/subtract operation for calculating the new partial remainder by a carry-save operation.

In the first method, a truncated approximation of the new partial remainder is calculated in parallel to the full-precision calculation of the partial remainder. This approximation can be obtained at a high speed, enabling us to prepare for the new step (i.e., determine the quotient digit) even before the current step is completed. Therefore, instead of first completing the calculation of the partial remainder $r_{i-1}$ (with complete carry propagation) in step (i - 1), and then inputting the $N_P$ most significant bits to the PLA to determine $q_i$ in step i, we can use a small, fast adder that has as inputs the most
significant bits of the previous partial remainder, denoted by $\beta\hat{r}_{i-2}$, and the most significant bits of the corresponding multiple of the divisor, denoted by $q_{i-1}\hat{D}$, as depicted in Figure 7.9.

[Figure 7.9: block diagram containing an adder and the divisor input D.]
FIGURE 7.9 A quotient digit selection logic.

This approximate partial remainder (APR) adder produces an approximation of the required $N_P$ most significant bits of the new partial remainder, denoted by $\hat{r}_{i-1}$, before the full-precision add/subtract operation ($r_{i-1} = \beta r_{i-2} - q_{i-1}D$) is completed. This allows us to perform a look-ahead quotient digit selection, and we can select $q_i$ in parallel with the full-precision calculation of the partial remainder $r_{i-1}$.

Clearly, the size of this APR adder should be determined so that sufficiently accurate $N_P$ bits will be generated. Since the uncertainty in the result of this adder is larger than the uncertainty in the truncated previous partial remainder $\beta r_{i-2}$, we may need additional input bits to the quotient digit look-up table.

Example 7.11
For $\beta = 4$, $\alpha = 2$, and $D \in [1, 2)$, the P-D plot is shown in Figure 7.10. An APR adder of the convenient size of eight bits is sufficient to generate the necessary inputs to the quotient selection PLA [17]. The horizontal lines in Figure 7.10 were determined to reduce the complexity of the PLA. Only three divisor bits are needed as inputs to the PLA, since the most significant bit of the divisor is always 1. As for the partial remainder, out of the outputs of the APR adder, five bits (including the sign bit) are sufficient in most cases. For a positive partial remainder, only in three cases (marked by a, b, and c in Figure 7.10) is an additional bit required. In case a, D = 1.001 and P = 1.1 and the single fractional bit of P is insufficient. The divisor can assume any value from 1.001 to 1.010.
The partial remainder can have a value from 1.1 to 10.0. Thus, the range for P/D is from 1.1/1.010 = 1.2 to 10.0/1.001 = 1.77. The first requires q = 1, while the second requires q = 2 (see Figure 7.5). Adding a second fractional bit to P solves the problem, allowing us to select q = 1 for P = 1.10 and q = 2 for P = 1.11.

In cases b and c, the extra fractional bit of P is required, since the 8-bit APR adder may introduce an additional truncation error, further increasing the range of P/D. Consider, for example, case b, where D = 1.100 and P = 10.0. If no APR adder is used, the range for P/D is from 10.0/1.101 = 1.23 to 10.1/1.100 = 1.66 and q = 1 can be selected. The 8-bit APR adder introduces an error of up to $2^{-6}$ in $r_{i-1}$, which increases to $2^{-4}$ after the multiplication by 4. This additional error increases the maximum value of P/D from 1.66 to 1.7, requiring q to be 2. An extra fractional bit of P solves this problem.

For a negative partial remainder represented in two's complement, there are six cases where 1 or even 2 additional output bits of the APR adder are required to guarantee the correct selection of the quotient digit [17]. □
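The arithmetic behind cases a and b can be rechecked with exact fractions. The sketch below is illustrative only: `pd_range` and `q1_is_safe` are hypothetical helper names, and the APR adder is modeled simply as an extra error added to the upper end of P, which is an assumption about how the error enters rather than a description of the design in [17].

```python
from fractions import Fraction as F

K = F(2, 3)  # k = alpha / (beta - 1) for beta = 4, alpha = 2


def pd_range(p_lo, p_hi, d_lo, d_hi, apr_error=F(0)):
    """Smallest and largest P/D over the uncertainty rectangle;
    apr_error widens the upper end of P (modeling assumption)."""
    return p_lo / d_hi, (p_hi + apr_error) / d_lo


def q1_is_safe(lo, hi):
    """True if q = 1 is valid over the whole range, i.e. 1-k <= P/D <= 1+k."""
    return 1 - K <= lo and hi <= 1 + K


# Case a: D in [1.001, 1.010], P in [1.1, 10.0]
print(q1_is_safe(*pd_range(F(3, 2), F(2), F(9, 8), F(5, 4))))
# Case b without the APR error: D in [1.100, 1.101], P in [10.0, 10.1]
print(q1_is_safe(*pd_range(F(2), F(5, 2), F(3, 2), F(13, 8))))
# Case b with the additional error of 2**-4
print(q1_is_safe(*pd_range(F(2), F(5, 2), F(3, 2), F(13, 8), F(1, 16))))
```

In case b the upper end of the range sits exactly on the boundary 5/3 without the APR error, so any extra error pushes it out of the q = 1 region.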
[Figure 7.10: P-D plot with $P = 4r_{i-1}$ on the vertical axis (from 101.0 to 011.1 in two's complement), D on the horizontal axis, the regions q = 2, 1, 0, \bar{1}, \bar{2}, and the marked cases a, b and c.]
FIGURE 7.10 The P-D plot for $\beta = 4$, $\alpha = 2$, and $D \in [1, 2)$.

In the scheme described above, the time needed for each step of the division is primarily determined by the time required to perform the add/subtract operation for calculating the new partial remainder, since the quotient digit was selected in the previous step.

The second method for speeding up the division process avoids the time-consuming carry propagation when calculating the new partial remainder. Since a truncated partial remainder is sufficient for selecting the next quotient digit, there is no need to complete the calculation of the partial remainder at any intermediate step in the division. Thus, instead of using a carry-propagating adder to calculate the new partial remainder, we can use a carry-save adder and represent the partial remainder in a redundant form using two sequences of intermediate sum bits and carry bits. These should be stored in two separate registers as shown in Figure 7.11. Only the most significant sum and carry bits must be assimilated using the APR adder in order to generate an approximate partial remainder and allow the selection of the quotient digit. In this case, the calculation of the approximate partial remainder and the selection of the quotient digit are the most time-consuming operations. Thus, in each step of the division process a carry-save adder calculates the partial remainder, and the APR adder then accepts the most significant sum and carry bits of the partial remainder and generates the required inputs to the quotient selection PLA. As in the first method, the number of inputs to the PLA and its entries need to be calculated, taking into account the uncertainty in the sum and carry bits representing the truncated remainder.

[Figure 7.11: block diagram containing a carry-save adder.]
FIGURE 7.11 An SRT divider with redundant remainder.

Example 7.12
An algorithm for high-speed division with $\beta = 4$, $\alpha = 2$, and $D \in [1, 2)$ has been presented in [7]. The partial remainder is calculated in a carry-save manner and, consequently, two registers are needed to store the sum bits and the carry bits separately, resulting in a somewhat more complex design. An 8-bit APR adder is used to generate the most significant partial remainder bits that are needed as inputs to the quotient selection PLA.
The inputs to this APR adder are the eight most significant sum bits and carry bits in the redundant representation of the partial remainder. The outputs of the APR adder are then converted to a sign-magnitude representation and, as a result, only four bits of the approximate partial remainder are needed in most cases. Only in four cases is an additional bit required, yielding a very simple PLA. □

Further speed-up of the SRT division can be achieved by increasing the radix $\beta$ of the algorithm to 8 or even higher. This reduces the number of steps to $\lceil n/3 \rceil$ or lower. Several such radix-8 SRT dividers have been implemented [8, 10]. The main disadvantage of the radix-8 (or higher) SRT algorithm is the high complexity of the quotient selection PLA, which then becomes the most time-consuming unit of the divider in Figure 7.11.

One way to avoid the need for a very complex quotient selection PLA is to implement a radix-$2^m$ SRT unit as a set of m overlapping radix-2 SRT stages. The radix-2 SRT requires a very simple quotient selection logic since $q_i$ ($q_i \in \{\bar{1}, 0, 1\}$) is solely determined by the remainder and is independent of the divisor. We must, however, overlap the quotient selections for the m bits so that all m quotient bits will be generated in one step of the process.

Figure 7.12 depicts two overlapping radix-2 SRT stages which generate two quotient bits ($q_i$ and $q_{i+1}$) in one step, implementing a radix-4 division. Based on the most significant bits of the two remainder sequences (the sum and carry sequences), a value for $q_i$ is generated using a Qsel unit. In parallel, all three possible values of $q_{i+1}$ are generated using three Qsel units. These values correspond to the three possible intermediate remainders, namely, $2r_{i-1} - D$, $2r_{i-1}$ and $2r_{i-1} + D$.
Note, however, that only the most significant bits of these three remainders have to be generated. Once $q_i$ is known, the correct value of $q_{i+1}$ is selected. This value is then used to select the correct multiple of the divisor to form the new remainder, which will be stored in the two registers (for the sum and carry sequences). The overall delay of the radix-4 circuit in Figure 7.12 is determined by the delay of a Qsel unit, the delay of two multiplexor units and the delay of the final CSA unit. This delay may be shorter than the delay of a radix-4 stage due to the higher complexity of the radix-4 quotient selection PLA [14].

Extending the above technique to radix-8 SRT division necessitates a more complex quotient selection circuit since three quotient digits (namely, $q_i$, $q_{i+1}$ and $q_{i+2}$) must be generated in parallel. For generating $q_{i+1}$ the speculative remainders $2r_{i-1} - D$, $2r_{i-1}$ and $2r_{i-1} + D$ have to be calculated. For generating $q_{i+2}$ the speculative remainders $4r_{i-1} - 3D$, $4r_{i-1} - 2D$, $4r_{i-1} - D$, $4r_{i-1}$, $4r_{i-1} + D$, $4r_{i-1} + 2D$, and $4r_{i-1} + 3D$ must be calculated (again, only the most significant bits of these seven remainders). This implies that seven Qsel units are required, with multiplexors (controlled by $q_i$ and $q_{i+1}$) to select the correct value of $q_{i+2}$ [14]. Extending the above to four overlapping radix-2 stages to obtain a radix-16 divider will result in an increase in the total number of Qsel units from 11 to 26, making this approach more costly. Another alternative for a radix-16 divider is to use two overlapping radix-4 SRT stages [18].

[Figure 7.12: block diagram with divisor inputs D and the output $q_{i+1}$.]
FIGURE 7.12 Two overlapping radix-2 SRT stages.

7.4 ARRAY DIVIDERS

All algorithms for division can be implemented using an array of cells where each step of the algorithm is executed by a separate row of cells. Thus, n rows of cells with n cells per row are required to implement a radix-2 division algorithm.
If the restoring algorithm is selected for implementation, the difference in each row between the previous partial remainder and the divisor is formed, and the quotient bit is generated according to the sign of this difference. There is no need, however, to restore the partial remainder if the quotient bit is determined to be 0. Instead, according to the generated quotient bit, either the previous partial remainder or the difference (which constitutes the new partial remainder) is transferred to the next row. If a ripple-carry scheme is employed in every row,
then it takes n steps to propagate the carry in a single row and, since there are n rows, the total execution time is of the order of $n^2$.

Similarly, we can implement a nonrestoring division array. It has about the same speed as the restoring array, and its only advantage is its ability to handle negative operands in a simple way. On the other hand, the final remainder may be incorrect, having a sign opposite to that of the dividend. An example of such an array divider is shown in Figure 7.13, where $x_0.x_1 \cdots x_6$ is the dividend, $d_0.d_1 d_2 d_3$ is the divisor, $q_0.q_1 q_2 q_3$ is the quotient, and $r_0.r_1 r_2 r_3$ is the final remainder [3]. The operation (addition or subtraction) to be performed in a given row is controlled by the signal T (see Figure 7.13(a)). If T = 0 an addition is performed, while if T = 1, a subtraction is executed by adding the two's complement of the divisor, which is assumed (in the implementation depicted in Figure 7.13) to be positive (i.e., $d_0 = 0$). The latter is done by forming the one's complement of the divisor and forcing a carry of 1 into the rightmost cell by connecting it to T. The generated quotient and remainder are represented in two's complement, but the final remainder is not always correct.

To show that the quotient generated by the array divider in Figure 7.13 is correct, note that in each step of the nonrestoring division, the partial remainder and the multiple of the divisor ($\pm D$) always have opposite signs. Therefore, the carry-out from the leftmost cell equals 1 if the sign of the new partial remainder is 0, and vice versa. Hence, $c_{out} = 1$ implies that the operation in the next row should be subtraction (T = 1), since the divisor is assumed to be positive. Similarly, $c_{out} = 0$ generates T = 0, so an add operation should be performed in the next row.
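The link between the carry-out of a row and the operation of the next row can be checked with a small behavioral model. The sketch below is an assumption-laden simplification, not a gate-level model of Figure 7.13: it shifts the partial remainder left instead of shifting the divisor right, uses a word of `frac_bits + 2` bits so that twice the remainder never overflows, and the names `cas_row` and `nonrestoring_divide` are hypothetical. It records the digits +1 (subtract) and -1 (add) that the control signal T implies.

```python
def cas_row(a, d, t, width):
    """One row of controlled add/subtract cells on width-bit words.
    t == 0: add d.  t == 1: add the one's complement of d with a forced
    carry-in of 1 (i.e. subtract d).  Returns (sum, carry_out)."""
    mask = (1 << width) - 1
    operand = (d ^ mask) if t else d
    total = (a & mask) + operand + t
    return total & mask, total >> width


def nonrestoring_divide(x, d, steps, frac_bits):
    """x and d are nonnegative integers scaled by 2**frac_bits, with x < d.
    Returns the digit list (+1 / -1) and the signed final remainder."""
    width = frac_bits + 2              # sign bit + one integer bit (assumption)
    mask = (1 << width) - 1
    r, t, digits = x, 1, []            # dividend positive: start by subtracting
    for _ in range(steps):
        digits.append(1 if t else -1)
        r, cout = cas_row((r << 1) & mask, d, t, width)
        t = cout                       # c_out = 1 -> subtract in the next row
    if r >> (width - 1):               # reinterpret the pattern as signed
        r -= 1 << width
    return digits, r


# X = 0.011111 and D = 0.110, both scaled by 2**6
digits, r = nonrestoring_divide(31, 48, 4, 6)
print(digits, r)
```

For these operands the final remainder comes out negative although the dividend is positive, which is the situation the text describes as an incorrect final remainder.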
The values $c_{out} = 1$ and $c_{out} = 0$ for row i correspond to $q_{i+1} = 1$ and $q_{i+1} = \bar{1}$, respectively, where $q_{i+1}$ is the (i + 1)th quotient bit. This is identical to the relationship between p and q in the algorithm for converting the representation of the quotient that uses the digit set $\{\bar{1}, 1\}$ to the equivalent two's complement representation, as described in Section 3.3.

In the two previously presented array dividers a complete add/subtract operation with carry propagation is performed by each row in the array. In the nonrestoring division, only the sign bit of the partial remainder is needed to select the quotient bit. This sign bit can be generated by using fast carry-look-ahead circuitry, while the other bits of the partial remainder can be generated using carry-save adders. Each cell generates a $P_i$ and $G_i$ output (carry-propagate and carry-generate, respectively, as in a carry-look-ahead adder) in addition to the ordinary sum and carry outputs. The $P_i$ and $G_i$ outputs of all cells in the same row are connected to a carry-look-ahead circuit, which generates the quotient bit. The execution time of such an array divider is of the order of $n \log n$, compared to $n^2$ for the previous two array dividers [3].

[Figure 7.13: (a) Controlled add/subtract (CAS) cell, built around a full adder (FA) with inputs T, $c_{in}$ and output $c_{out}$. (b) Array of CAS cells with dividend bits $x_0 \ldots x_6$, divisor bits $d_0 \ldots d_3$, quotient bits $q_0 \ldots q_3$ and remainder bits $r_0 \ldots r_3$.]
FIGURE 7.13 A nonrestoring array divider.

In a similar way, we can implement a high-radix division array with carry-save addition. Here, a small carry-look-ahead adder is used to determine the
correct most significant bits of the partial remainder in order to select the quotient digit. Due to the similarity between the basic cell in array multipliers and array dividers, a controlled array multiplier/divider can be designed. Several such circuits have been described; e.g., [1].

7.5 FAST SQUARE ROOT EXTRACTION

As pointed out in Section 3.4, the similarities between square root extraction and the divide operation allow the adaptation of almost all the algorithms that have been developed for division to the calculation of the square root, with only some minor modifications. Consequently, small extensions to the hardware designed for a division unit enable the calculation of the square root (e.g., [7], [17], [21]).

The nonrestoring algorithm for square root extraction that has been presented in Section 3.4 allows the use of the digits 1 and $\bar{1}$ for $q_i$, where $Q = 0.q_1 q_2 \cdots q_m$ is the calculated square root. Allowing $q_i$ to assume the value 0 has two important advantages. First, a shift-only operation is required when $q_i = 0$, reducing the number of add/subtract operations that must be performed. Second, having an overlap between the region of the remainder $r_i$ where $q_i = 1$ is selected and the region where $q_i = 0$ is selected leads to a reduced-precision inspection of the remainder. In the nonrestoring scheme, we must identify the case of $r_i \ge 0$ in order to correctly set the bit $q_i$. This requires precise determination of the sign bit of $r_i$. If $q_i = 0$ is allowed, a lower-precision comparison is sufficient, enabling the use of carry-save adders in the calculation of the remainder $r_i$. In this case, the remainder is represented as two sequences, a partial sum sequence and a sequence of carries. Only a few high-order bits of these two sequences must be examined in order to select the bit $q_i$.

To decide on the region of $r_i$ where $q_i = 0$ can be selected, we basically follow the same idea as that behind the SRT division algorithm. We restrict the square root Q to be a normalized fraction, $1/2 \le Q < 1$, with $q_1$ always equal to 1. Consequently, the radicand should satisfy $1/4 \le X < 1$. In this case, the remainder $r_{i-1}$ (for $i \ge 2$) satisfies the condition

$-2(Q_{i-1} - 2^{-i}) \le r_{i-1} \le 2(Q_{i-1} + 2^{-i})$,    (7.18)

where $Q_{i-1}$ is the partially calculated root at step (i - 1), i.e., $Q_{i-1} = 0.q_1 q_2 \cdots q_{i-1}$. In step $i \ge 2$, we may therefore set $q_i$ equal to 0, which results in $r_i = 2r_{i-1}$, whenever $r_{i-1}$ is in the range $[-(Q_{i-1} - 2^{-i-1}), (Q_{i-1} + 2^{-i-1})]$. Hence, a possible selection rule for $q_i$ is

$q_i = 1$          if $r_{i-1} \ge (Q_{i-1} + 2^{-i-1})$
$q_i = 0$          if $-(Q_{i-1} - 2^{-i-1}) \le r_{i-1} \le (Q_{i-1} + 2^{-i-1})$    (7.19)
$q_i = \bar{1}$    if $r_{i-1} \le -(Q_{i-1} - 2^{-i-1})$.

Since $(Q_{i-1} + 2^{-i-1})$ and $(Q_{i-1} - 2^{-i-1})$ are in the range [1/2, 1], we may replace (7.19) by the following selection rule, which avoids a high-precision comparison:

$q_i = 1$          if $1/2 \le 2r_{i-1} \le 2$
$q_i = 0$          if $-1/2 \le 2r_{i-1} < 1/2$    (7.20)
$q_i = \bar{1}$    if $-2 \le 2r_{i-1} < -1/2$.

This selection rule is similar to the SRT rule in Equation (7.4).

Example 7.13
Let $X = 0.0111101_2 = 61/128$:

    r_0 = X                 00.0111101
    2r_0                    00.1111010    set q_1 = 1, Q_1 = 0.1
    -(0 + 2^{-1})         - 00.1000000
    r_1                     00.0111010
    2r_1                    00.1110100    set q_2 = 1, Q_2 = 0.11
    -(2Q_1 + 2^{-2})      - 01.0100000
    r_2                     11.1010100
    2r_2                    11.0101000    set q_3 = \bar{1}, Q_3 = 0.101
    +(2Q_2 - 2^{-3})      + 01.0110000
    r_3                     00.1011000
    2r_3                    01.0110000    set q_4 = 1, Q_4 = 0.1011
    -(2Q_3 + 2^{-4})      - 01.0101000
    r_4                     00.0001000
    2r_4                    00.0010000    set q_5 = 0, Q_5 = 0.10110
    r_5                     00.0010000
    2r_5                    00.0100000    set q_6 = 0, Q_6 = 0.101100
    r_6                     00.0100000
    2r_6                    00.1000000    set q_7 = 1, Q_7 = 0.1011001
    -(2Q_6 + 2^{-7})      - 01.0110001
    r_7                     11.0001111

The square root is Q = 0.1011001 = 89/128. The final remainder is $2^{-7} r_7 = -113/2^{14} = X - Q^2 = (7808 - 7921)/2^{14}$. □

The high-radix algorithms for division can also be modified to calculate the square root. The general equation for computing the new remainder is

$r_i = \beta r_{i-1} - q_i \cdot (2Q_{i-1} + q_i \beta^{-i})$,    (7.21)

where $\beta$ is the radix and the digit set for $q_i$ is $\{\bar{\alpha}, \ldots, \bar{1}, 0, 1, \ldots, \alpha\}$, as it is for division.
For example, when $\beta = 4$, the digit set $\{\bar{2}, \bar{1}, 0, 1, 2\}$ is preferable, since it eliminates the need to generate the multiple $3Q_{i-1}$. The generation of the root multiple $q_i \cdot (2Q_{i-1} + q_i 4^{-i})$ makes the square root extraction somewhat more complex than the equivalent division. However, a careful examination of the required multiples shows that these can be easily calculated. For the positive values of $q_i$, namely, 1 and 2, we have to subtract the sequences $Q001_2$ and $Q010_2$, respectively. When $q_i = \bar{1}$, we have to add the sequence $Q00\bar{1}_2$, which has the same value as $(Q - 1)111_2$ [7]. Similarly, when $q_i$ is $\bar{2}$, we must add the sequence $Q0\bar{1}0_2$, which is equivalent to $(Q - 1)110_2$. Thus, having two registers with the values Q and Q - 1, updated at every step, greatly simplifies the execution of the square root algorithm [7].

Since only a low-precision comparison of the remainder is needed in order to select the root digit, we may perform the add/subtract operation in Equation (7.21) in a carry-save manner, and use a small carry-propagate adder to calculate the most significant bits of $r_i$. Those will then provide some of the inputs to a PLA for selecting the square root digit $q_i$, similarly to division. The other inputs to the PLA are the most significant bits of the root multiple. Several rules for selecting $q_i$ have been proposed ([4], [7], and [21]). In these rules, the intervals of the remainder determine the size of the carry-propagating adder (between 7 and 9 bits for the base-4 algorithm with the digits $\{\bar{2}, \bar{1}, 0, 1, 2\}$) and the exact PLA entries. The selected digit $q_i$ depends on the truncated remainder and the truncated root multiple. In the first step, however, no estimated root is available. Consequently, a separate PLA for predicting the first few bits of the root may be necessary.
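Equation (7.21) is easiest to check in its radix-2 form. The following sketch applies it with $\beta = 2$ and the selection rule (7.20), using exact fractions in place of a carry-save remainder; the function name `sqrt_radix2` and the use of Python's `Fraction` type are illustrative assumptions, and forcing the first digit to 1 follows the normalization $1/2 \le Q < 1$ stated in the text.

```python
from fractions import Fraction


def sqrt_radix2(x, steps):
    """Radix-2 square root with digits {-1, 0, 1} for x in [1/4, 1).
    Returns the digit list, the root Q and the last remainder r."""
    r, root, digits = Fraction(x), Fraction(0), []
    for i in range(1, steps + 1):
        two_r = 2 * r
        if i == 1 or two_r >= Fraction(1, 2):
            q = 1                                  # q_1 is always 1
        elif two_r < Fraction(-1, 2):
            q = -1
        else:
            q = 0                                  # shift-only step
        weight = Fraction(1, 2 ** i)
        r = two_r - q * (2 * root + q * weight)    # Equation (7.21), beta = 2
        root += q * weight
        digits.append(q)
    return digits, root, r


digits, root, r = sqrt_radix2(Fraction(61, 128), 7)
print(digits, root, r)
```

With the radicand of Example 7.13 this produces the same digit sequence and remainder as the table there, and the identity $2^{-7} r_7 = X - Q^2$ holds exactly.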
Example 7.14
The radix-4 divider (for double-precision floating-point numbers in the IEEE standard) reported in [7] is capable of calculating the square root as well. The P-D plot in Figure 7.10 was also used for square root extraction. Therefore, the same PLA (with 19 product terms) is used for predicting the next quotient digit and the next root digit. A separate PLA (with 28 product terms) was added to the arithmetic unit in order to provide the five most significant bits of the root. The inputs to this PLA are the six most significant bits of the significand and the least significant bit of the exponent. The latter indicates whether the exponent is odd or even. This is necessary in order to find the square root according to Equation (7.22): the radicand is taken as 0.1f or 0.01f, depending on whether E is odd or even, so that the remaining power of 2 has an even exponent, which is then halved. Note that the resulting radicand (0.1f or 0.01f) is in the range [1/4, 1], yielding a square root in the required range, [1/2, 1]. □

7.6 EXERCISES

7.1. Show that, for a divisor D = 3/4, the SRT algorithm will always generate the minimal SD representation of the quotient.

7.2. Given a dividend X = 0.1001 and a divisor D = 0.1010, perform the divide operation using (a) the standard binary SRT algorithm, (b) the modified binary SRT algorithm allowing the use of the multiples D/2 and 2D, and (c) the modified binary SRT algorithm with five different comparison constants.

7.3. Check whether the divisor is within the favorable region (requiring few add/subtract operations) for all five comparison constants $K_1, \ldots, K_5$.

7.4. Analyze the possibility of using the first two bits of the divisor as a comparison constant K to speed up the SRT algorithm for a divisor in the range [0.5, 1).
For example, for D = 0.1101, the most significant bits of the partial remainder will be compared to K = 0.11.
(a) Find whether the number of add/subtract operations in the suggested method will be smaller than, equal to, or greater than the number of these operations in the original SRT method (with K = 0.1) in the following three cases:
  (i) D = 0.1001 and X = 0.10000111
  (ii) D = 0.1101 and X = 0.01011011
  (iii) D = 0.1111 and X = 0.01001011
(b) Would you recommend the use of the above algorithm?

7.5. Derive a variation of the binary SRT division algorithm that uses several comparison constants similar to the one described in Section 7.1. Allow the comparison constants to have at most three fractional bits. The divisor domain [0.5, 1) should be partitioned into subregions so that only the three most significant fractional bits of the divisor will be needed to determine the selected comparison constant.

7.6. Show that, for $\alpha = \beta - 1$, the high-radix SRT algorithm includes as a special case the high-radix nonrestoring division algorithm.

7.7. The following algorithm can be used to convert a quotient $q_1 q_2 \cdots q_n$, given in SD representation with the radix r and the allowed digit set $\{\overline{(r-1)}, \ldots, \bar{1}, 1, \ldots, (r-1)\}$, to its equivalent representation $y_0 y_1 y_2 \cdots y_n$ in radix complement:
  Step 1: If $q_1 < 0$ then set $y_0 = r - 1$ and $y_1 = r + q_1$, else set $y_0 = 0$ and $y_1 = q_1$.
  Step 2: Set j = 1.
  Step 3: If $q_{j+1} < 0$ then set $y_j = y_j - 1$ and $y_{j+1} = r + q_{j+1}$, else set $y_{j+1} = q_{j+1}$.
  Step 4: If j = n - 1 then stop, else set j = j + 1 and go to step 3.
(a) Use the above algorithm to convert the SD binary number 1111 to two's complement representation.
(b) Compare this algorithm to the one described in Section 3.3.
(c) Can the subtraction in step 3 generate a borrow that may propagate to higher-order bits?
7.8. Show the P-D plot for SRT division with $\beta = 4$, $\alpha = 3$ and $D \in [0.5, 1)$. Determine the regions for the different values of the quotient digit and the precision of the partial remainder and the divisor that is needed.

7.9. Show the second part of the P-D plot depicted in Figure 7.10 for a negative partial remainder in two's complement.

7.10. Redraw the P-D plot shown in Figure 7.10, allowing the required precision of the divisor and of the partial remainder to be 2- in most cases. Indicate the few cases where a higher precision is required.

7.11. Verify that the carry-out bits of the leftmost CASs in Figure 7.13 generate the correct quotient bits for X = 0.011111 and D = 0.110. Is the generated final remainder correct?

7.12. Can the array divider in Figure 7.13 be pipelined? What would the rate of such a pipeline be?

7.13. Show the block diagram of a binary nonrestoring array divider (with the same number of bits for the dividend and divisor, as in Figure 7.13) that generates the sign bit of the partial remainder using a carry-look-ahead circuit, while generating all other bits using carry-save addition. Design the basic cell(s) of the divider. Compare the execution time and implementation complexity of your array divider to those of the divider in Figure 7.13.

7.14. Show that exactly 11 radix-2 Qsel units are required when implementing a radix-8 divider using three overlapping radix-2 SRT stages. Estimate the delay of these three overlapping stages.

7.7 REFERENCES

[1] D. P. AGRAWAL, "High-speed arithmetic arrays," IEEE Trans. on Computers, C-28 (March 1979), 215-221.
[2] D. E. ATKINS, "Higher-radix division using estimates of the divisor and partial remainders," IEEE Trans. on Computers, C-17 (Oct. 1968), 925-931.
[3] M. CAPPA and V. C. HAMACHER, "An augmented iterative array for high-speed binary division," IEEE Trans. on Computers, C-22 (Feb. 1973), 172-175.
[4] L. CIMINIERA and P. MONTUSCHI, "Higher radix square rooting," IEEE Trans. on Computers, 39 (Oct. 1990), 1220-1231.
[5] M. D. ERCEGOVAC and T. LANG, Division and square root: Digit-recurrence algorithms and implementations, Kluwer Academic Publishers, Norwell, 1994.
[6] D. GOLDBERG, "Computer arithmetic," in Computer architecture: A quantitative approach, D. A. Patterson and J. L. Hennessy, Morgan Kaufmann, San Mateo, CA, 1990.
[7] J. FANDRIANTO, "Algorithm for high speed shared radix 4 division and radix 4 square root," Proc. of 8th Symp. on Computer Arithmetic (1987), 73-79.
[8] J. FANDRIANTO, "Algorithm for high speed shared radix 8 division and radix 8 square root," Proc. of 9th Symp. on Computer Arithmetic (1989), 68-75.
[9] C. V. FREIMAN, "Statistical analysis of certain binary division algorithms," Proc. of IRE, 49 (Jan. 1961), 91-103.
[10] R. F. HOBSON and M. W. FRASER, "An efficient maximum-redundancy radix-8 SRT division and square-root method," IEEE Journal of Solid-State Circuits, 30 (Jan. 1995), 29-38.
[11] O. L. MACSORLEY, "High-speed arithmetic in binary computers," Proc. of IRE, 49 (Jan. 1961), 67-91.
[12] S. MAJERSKI, "Square rooting algorithms for high-speed digital circuits," IEEE Trans. on Computers, C-34 (Aug. 1985), 724-733.
[13] G. METZE, "A class of binary divisions yielding minimally represented quotients," IRE Trans., EC-11 (Dec. 1961), 761-764.
[14] J. A. PRABHU and G. B. ZYNER, "167 MHz radix-8 divide and square root using overlapped radix-2 stages," Proc. of the 12th Symp. on Computer Arithmetic (July 1995), 155-162.
[15] J. E. ROBERTSON, "A new class of digital division methods," IRE Trans. on Electronic Computers, EC-7 (Sept. 1958), 218-222.
[16] J. E. ROBERTSON, "The correspondence between methods of digital division and multiplier recoding procedures," IEEE Trans. on Computers, C-19 (Aug. 1970), 692-701.
[17] G. S. TAYLOR, "Compatible hardware for division and square root," Proc. of 5th Symp. on Computer Arithmetic (1981), 127-134.
[18] G. S. TAYLOR, "Radix 16 SRT dividers with overlapped quotient selection stages," Proc. of 7th Symp. on Computer Arithmetic (June 1985), 64-71.
[19] K. D. TOCHER, "Techniques of multiplication and division for automatic binary computers," Quart. J. Mech. Appl. Math., 11, Pt. 3 (1958), 364-384.
[20] J. B. WILSON and R. S. LEDLEY, "An algorithm for rapid binary division," IRE Trans. on Electronic Computers, EC-10 (1961), 662-670.
[21] J. H. P. ZURAWSKI and J. B. GOSLING, "Design of a high-speed square root multiply and divide unit," IEEE Trans. on Computers, C-36 (Jan. 1987), 13-23.
8 DIVISION THROUGH MULTIPLICATION

The number of steps in the previously described division methods was linearly proportional to the number of bits, n, while in the division-by-convergence schemes, which will be described in this chapter, the number of steps is proportional to $\log_2 n$. However, the basic operation in the division-by-convergence schemes is not an add/subtract operation but the usually slower multiply operation. Hence, a fast parallel multiplier is necessary to successfully implement these schemes.

8.1 DIVISION BY CONVERGENCE

Let the divisor D and the dividend N be considered the denominator and numerator, respectively, of the quotient Q. Thus, Q = N/D. This holds if we multiply both numerator and denominator by the same factor $R_0$, or even by m factors $R_0, R_1, \ldots, R_{m-1}$. If the factors $R_i$ are selected so that the denominator converges to 1, the numerator will converge to Q:

$Q = \frac{N}{D} = \frac{N \cdot R_0 R_1 \cdots R_{m-1}}{D \cdot R_0 R_1 \cdots R_{m-1}} \rightarrow \frac{Q}{1}$.    (8.1)

In Equation (8.1) only the quotient is calculated, and a separate computation is necessary if the remainder is needed. Therefore, this division scheme is more suitable for floating-point computations.

The essential step in this method is the selection of the factors to ensure the convergence of the denominator to 1. This selection is based on the following
DI = D. Ro = (I - y) . (I + y) = 1 _ y2, (8.2) T I I I I I I I I t I I I I f I 1 I f 8 1 Division by Convergence 215 214 8. DIVIsion through Multiplication D2 = DI . RI = (l- y2). (1 + y2) = 1- y4. (8.3) Example 8.1 For the 15-hit mUllbpr!l N = 0.011010000000000 = 0..1062510 ami D = 0.110000000000000 = 0.75 10 the quotient is cakulatPd as follows: The first multiplying fa(tor is Ro = 2 - D = 1.010 000 000000 ooO,and as a result, the new numerator and denominator arp N I = N. Ru = 0.100000100000000 nnd DI = D.Ro = 0.111100000000000. ThpSN'Onrl multiplying factor is RI = 2 - DI = 1.000 100 000 000 000,3nd the next numt'rator and denominator arf'N2 = N, . RI = 0.100010 100010000 and D 2 = DI . RI = 0.111111110000000. Not.e that t.hp numht'r of IcsdinR l's in the denominator has doubled from 4 to 8. The t.hird multiplying factor is R2 = 2 - Ih = 1.000000010000000. and thenN 3 = N 2 . R2 = 0.100010 101 010 101 and D3 = D2' R 2 = 0.11111111111111l.Collverg- ence has bn achie\'ed (D3 = 1 - ulp) in three steps and the resnlting quotient is Q = N 3 = 0.54165 1 0. The t'xact rc>:mlt is the infinite fn('tion 0.54166 10 , 0 ob'rvRtiou: Let tile divisor be a normalizM binary fra(.tion O.lx.r.rx (whprp .'ach x IS ,'ithpr 0 or 1). Therefore. 1/2 $ D < 1 and D = 1 - Y. whpre y  !. If we 8<'1('('t Ro to be 1 + y. t.hen thp np\\, denominator is and since y2 :$' t, the nt'w denominator DI satisfies DI  , and is th('refore closer to 1 than D, In binary notation, the' III'\\' dpnominator has t he form DI = O.l1XJ"3'.l". In st,ep 2 we sel('('t RI = 1 + y2 obtaining ow, y.1 :$' I and. t.herefore, D 2 takes on the form O.l1l1xx.l'x, aud is closer to 1 than DI' In geupral, in the (i + l)th st.ep the divisor D, has the form D i = 1 - Yi, where y, = y2', and sim'p y :$' 2- 1 , D, has at. 1C<'lSt 2' leading l's. The next multiplying factor is R. = 1 + Yi and, as a fl--sult. D'+J = D,R; will have 2'+J leading 1'5. To show formally that t.he clt'nominator con\,{'fges to 1. note that lim D. 
= (1 + y) .  = 1. ,-x. 1 + Y (8.5) The total nnmb('r of steps (order of log2 n) is smallpr th.m that. rf'Quired by the algorithms based on add/subtract operations. for which the nllmber of steps is linear in n. Ho\\'evt'r.pach stpp involvf'S two multiplicntion, which arp more t.ime-consuming than add/subtra('t opcrat.ions. Thus, t.here is a 1lPf'd to further reduce the number of step$. Olle way to achieve this is to speed up thl:' first fpw steps, where the comrergence is very slow. After step 1, only two leadiug l's are guaranteed, and aher step 2 is completed only four l's arc guaranteed. A speed up of the fin;t steps can subst.antially reduce the rpquired number of multiplications. Instpad of sele<..ting the first multiplier to be i?() = 1 + y, we may use a look-up table t.hat provides a multiplier that ensurcs a denominator DI with 1,- lpadillg 1 '5, where k is at least 3. The next denominc\tor will thcn have 2k leading 1 '8, and so 011. The size of this table, which can be ston>d in a ROM, inrreases exponent.inlly wit.h k, the dl'Sired number of le\(lillg l's. The number k therefore has to b(> determinE'd so that the require<.1 size of the tablt' is still reasonable. (1- y). (1 + y)(1 + y2)(1 + l)...) = (1 + y). (1- y)(1 + y2)(1 + l).. .). (8.4) Thp term within t.hf' brackets on the right-hand side of Eqnation (8.4) is the series expallsion of 1/(1 + y) for 0 :5: y $ , hencp Thl process of multiplying t.he numerator and denominat.or by R; is rep('ated until D, con\'ergl'.s to 1, or, more prt'cisdy, to 0.11...1 (which equals 1 - 1I1p). The number of leading 1 's in D, is doubled at each step, c\nd ther('fore the number of itprations is m = pog21l1. and we say t.hat the convergence is quadrati,.. Tilt' multiplying factor in each step has to be generat.ed 'tile! t.hen multiplied by the l1umprat.or and denominator. TII(' multiplying factor R, has to be obtailled from D,. Fort.unatPly. thp relat ion betwl't'n these t.wo is simple. R; equals 2 - D i ; i.e., R. 
is t.he two's complement of the fraction D,. Thus, cu('h step consists of th... t.wo mult.iplicatiolls Example 8.2 The division scheme described above was first implt'mentoo ill t.l... 181\1 360/91 for flonting-point division (I). The long format of ftOc\ting-poiut numbers in t.he IBM system consists of 64 bits partitiont'(1 as follows: Do+J = D, . R, and N I + I = N i . R i (8.6) S 7 hits - bia.sed ('xpon('llt 56 biL'i - IUL"igned f'Tacti(lnnl significi\Ild and a t.wo's complement operation, R'+I = 2 - D'+J' (8.7) The operands for the division nn' th('r('fore 56-bit long fractiu(l.-;. If the first multiplying factor is Ro = 1 + y, t.hen flog.l5t.il = 6 sh'ps Ilre lICt,(I,>d, 
requiring 12 multiplications of 56 bits each. Actually, only 11 multiplications are needed, since there is no need to calculate $D_6$ in the last step; we know that it equals 1 and there is no need to calculate another multiplying factor. To reduce the number of multiplications we can use a look-up table for the first factor $R_0$ so that $D_1$ will have at least $k = 7$ leading 1's, since it leads to the sequence

    1 → 7 → 14 → 28 → 56

of leading 1's. Consequently, we need only four steps, requiring seven multiplications. For this $k$, it has been determined that 7 bits of $D$ are needed and 10 bits of the factor $R_0$ have to be stored at each location, requiring a ROM of size $128 \times 10$. The contents of the row that corresponds to a given denominator $D = 1 - y$ is the approximated value of $(1 + y)(1 + y^2)(1 + y^4)$ truncated to 10 bits. At its highest precision, such a multiplier, $R_0$, would guarantee eight leading 1's. It should be clear that no error is introduced by using a multiplier out of a table, since it is used to multiply both numerator and denominator, and the previous convergence scheme is initiated at this point. □

Another way to further reduce the execution time of the division algorithm is to speed up the multiplications by using shorter multipliers, since the multiplication time increases linearly with the number of multiplier bits. Instead of using a multiplier of length $n$ bits for all multiplications, we can use a truncated multiplier for some of the products. Using a truncated multiplier will not introduce errors into the division process, since the numerator and denominator are multiplied by the same factor. Clearly, we cannot use a truncated multiplier for the last product because a high accuracy is needed at this point.

At step $(i + 1)$ we want to generate $D_{i+1}$ with $a$ ($a \le 2^{i+1}$) leading 1's by multiplying the denominator $D_i$, which has $a/2$ leading 1's, by $R_i$. Originally, $R_i = 2 - D_i = 1 + y_i$ (where $D_i = 1 - y_i$). Instead of using $R_i$, we use a truncated multiplier $R_{iT}$ by forming the two's complement of only the first $a$ bits of $D_i$, which constitute the truncated denominator $D_{iT}$. In other words, $R_{iT} = 2 - D_{iT}$. If we use the notation $R_{iT} = 1 + y_T$, then the error in the truncated multiplier is $\delta = y_T - y_i$. This "error" is always positive and satisfies the inequality $0 \le \delta < 2^{-a}$. When multiplying the truncated multiplier by the untruncated denominator we obtain

$$D_{i+1} = D_i \cdot R_{iT} = (1 - y_i)(1 + y_T) = 1 + y_T - y_i - y_i y_T. \tag{8.8}$$

Substituting $y_T = y_i + \delta$ yields

$$D_{i+1} = 1 + \delta - y_i (y_i + \delta) = 1 - y_i^2 + \delta (1 - y_i). \tag{8.9}$$

The "error" in $D_{i+1}$ is $\delta(1 - y_i)$, which satisfies

$$\delta (1 - y_i) \le \delta < 2^{-a},$$

and we can therefore still expect to obtain $a$ leading 1's in $D_{i+1}$. The only difference is that now $D_{i+1}$ may converge toward 1 from either below (i.e., $D_{i+1} = 0.11 \cdots 1xxxx$) or above (i.e., $D_{i+1} = 1.00 \cdots 0xxxx$), since the "error" is always positive (see Exercise 8.1).

In each of the truncated multiplication factors, the first half of the bits are identical, either all 0's or all 1's. If the multiplier $R_i$ is recoded using SD representation, then these leading 0's or 1's will not generate nonzero partial products and the execution time of the multiplications will be further reduced.

Example 8.3
The fast multiplier in the floating-point arithmetic unit in the IBM 360/91 computer [1] receives operands of length 56 bits each and uses the algorithm outlined in Table 6.5 to generate partial products of the form $0$, $\pm 2A$, and $\pm 4A$, where $A$ is the multiplicand. The resulting 28 partial products require 26 carry-save adders. To reduce the amount of hardware needed, a smaller carry-save addition tree for eight operands was designed. This tree is capable of accepting six new partial products and adding them to the two previous intermediate results (intermediate sum and carry), which are connected through two feedback paths as shown in Figure 5.31. This tree must be used five times in order to accumulate all 28 partial products. The carry-save tree was designed as a pipeline to allow overlapping between consecutive sets of six partial products so that the accumulation of all 28 partial products would take only six clock cycles. This overlapping is possible among the sets of partial products that correspond to the same multiply operation. Complete overlapping between two different multiply operations is also achievable as long as the number of generated partial products is less than or equal to six. In such cases, the carry-save tree is passed only once per multiply operation, and there is no need to use the feedback connections. Consequently, limiting the number of partial products to six or less can significantly speed up the execution of the several consecutive multiplications needed in the division-by-convergence algorithm.

To achieve the sequence of denominators $D_i$ with 7, 14, 28, and 56 leading ones (or zeros), we need multipliers $R_i$ with 10, 14, 28, and 56 bits, respectively. The first multiplier is read out of the ROM and generates only five partial products. The other three multipliers contain 7, 14, and 28 leading zeros (or ones) which can be skipped, and there is no need to generate any partial products for them. We only need to identify the first and last bits of such a group of identical bits. Therefore, the second multiplier, of length 14 bits, generates only the five partial products

    0. 11 11 11 1x xx xx xx

and the feedback connections in the carry-save tree are not used. The third multiplier, of length 28 bits, generates nine partial products, requiring the use of the feedback connection in the carry-save tree. This can be avoided by additional truncation of the multiplier. To the 14 leading identical bits, we may add nine bits (for a total of 23 bits), and still generate only six partial products:

    0.1 11 11 11 11 11 11 1x xx xx xx xx

This, however, results in a new denominator guaranteed to have only 14 + 9 = 23 leading identical bits, instead of 28 (the proof of this is left as an exercise for the reader). The next multiplier will therefore have only 23 leading identical bits, and again we may add nine extra bits without requiring the use of the feedback connections. This multiplier will result in a denominator with 23 + 9 = 32 leading identical bits. Forming the two's complement of this denominator will give us the next multiplier with 32 leading identical bits, which can increase the number of leading identical bits in the denominator up to 64 and achieve convergence. Since this is the last multiply operation within the division, we can afford to use the feedback connections, so there is no need to limit the number of multiplier bits and all available 56 bits can be used. The sequence of multiplication factors now contains five multipliers of lengths 10, 14, 23, 32, and 56 bits, increasing the number of multiply operations from 7 to 9.
However, all these multiplications can be overlapped, leading to a total execution time of 18 clock cycles [1]. □

8.2 DIVISION BY RECIPROCATION

A somewhat different approach to performing division through multiplication is to first calculate the reciprocal of the divisor $D$ and then multiply it by the dividend to form the final quotient. The reciprocal of $D$ can be calculated using the Newton-Raphson iteration method. This is a method of finding the zero of a given function $f(x)$, where a zero of $f(x)$ is the solution of $f(x) = 0$. Let $x_0$ be the first approximation and let $x_i$ be the estimate for the zero at the $i$th step. The next estimate, $x_{i+1}$, is calculated from

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}, \tag{8.10}$$

where $f'(x)$ is the derivative of $f(x)$ with respect to $x$. For the function $f(x) = 1/x - D$, which has a zero at $x = 1/D$, $f'(x) = -1/x^2$, yielding

$$x_{i+1} = x_i (2 - D \cdot x_i). \tag{8.11}$$

$x_{i+1}$ converges to the reciprocal of $D$ and the convergence of this scheme is quadratic. To prove this, let $\delta_i$ denote the error in the $i$th step; i.e., $\delta_i = 1/D - x_i$. Simple algebraic manipulations show that $\delta_{i+1} = D \delta_i^2$. If $D$ is a normalized fraction (i.e., $1/2 \le D < 1$), then $\delta_i \le 1$ and the error decreases quadratically.

If the first approximation is $x_0 = 1$, then $x_1 = (2 - D)$, and

$$x_2 = (2 - D) \cdot [2 - D(2 - D)] = (2 - D) \cdot [1 + (D - 1)^2]. \tag{8.12}$$

Repeatedly substituting expression (8.11) results in

$$x_i = (2 - D)(1 + (D - 1)^2)(1 + (D - 1)^4) \cdots (1 + (D - 1)^{2^{i-1}}) = (1 - (D - 1))(1 + (D - 1)^2)(1 + (D - 1)^4) \cdots (1 + (D - 1)^{2^{i-1}}). \tag{8.13}$$

If $D$ is a normalized fraction then $(1 - D)$ is a fraction $y$ satisfying $0 < y \le 1/2$. Therefore, the binomial series in Equation (8.13) is identical to the one used in Equation (8.4) and it converges to

$$\lim_{i \to \infty} x_i = \frac{1}{1 + (D - 1)} = \frac{1}{D}. \tag{8.14}$$

As in Section 8.1, we may reduce the required number of steps by reading the first approximation out of a table, rather than setting $x_0$ equal to 1. This table, stored in a ROM, accepts the $j$ most significant digits of $D$ (except for the first digit, which is always 1), and produces an approximation $x_0$. The range $[0.5, 1)$ is divided into $2^j$ intervals (of size $\Delta = 0.5 \cdot 2^{-j}$ each) and it can be shown that the optimum value of $x_0$ for the $k$th interval ($k = 1, 2, \ldots, 2^j$) is the reciprocal of the number corresponding to the middle point of the interval (see Exercise 8.7). This middle point is $\frac{1}{2} + (k - \frac{1}{2}) \cdot \Delta$, and thus

$$x_0(k) = \frac{2^{j+1}}{2^j + k - \frac{1}{2}}. \tag{8.15}$$

This stepwise approximation for $j = 2$ is shown in Figure 8.1.

Instead of the stepwise approximation to $1/D$, a piecewise linear approximation can be employed. Its generation is more complicated but its accuracy is higher [3].
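The two iterations of this chapter can be compared numerically. The following Python sketch is an illustration only and is not taken from the text: the function names `divide_by_convergence`, `table_seed`, and `newton_reciprocal` are hypothetical, and exact rational arithmetic stands in for the fixed-width hardware datapath. It applies Equations (8.6)-(8.7) and (8.11), and builds the seed of Equation (8.15) by locating the interval that contains $D$, assuming $1/2 \le D < 1$.

```python
import math
from fractions import Fraction

def divide_by_convergence(N, D, steps):
    """Equations (8.6)-(8.7): scale N and D by R = 2 - D in every step."""
    for _ in range(steps):
        R = 2 - D                 # two's complement of the fraction D
        N, D = N * R, D * R
    return N, D                   # N approaches N/D while D approaches 1

def table_seed(D, j):
    """Equation (8.15): reciprocal of the midpoint of the interval holding D.

    Assumes 1/2 <= D < 1; the range is split into 2**j equal intervals.
    """
    k = math.floor((D - Fraction(1, 2)) * 2 ** (j + 1)) + 1   # index 1..2**j
    return Fraction(2 ** (j + 1)) / (2 ** j + k - Fraction(1, 2))

def newton_reciprocal(D, x0, steps):
    """Equation (8.11): x <- x * (2 - D * x), starting from x0."""
    x = Fraction(x0)
    for _ in range(steps):
        x = x * (2 - D * x)
    return x

# Operands of Example 8.1: N = 0.40625, D = 0.75.
N, D = Fraction(13, 32), Fraction(3, 4)
Nq, Dq = divide_by_convergence(N, D, 3)
print(float(Nq), float(Dq))

# Starting from x0 = 1, N * x_i equals the numerator N_i of the first scheme.
print(Nq == N * newton_reciprocal(D, 1, 3))

# With a table seed (j = 2), one step squares the error and scales it by D.
x0 = table_seed(D, 2)
err0 = 1 / D - x0
err1 = 1 / D - newton_reciprocal(D, x0, 1)
print(err1 == D * err0 ** 2)
```

With $x_0 = 1$ the two schemes form the same product of factors, which is why the comparison of `Nq` with `N * newton_reciprocal(D, 1, 3)` holds exactly in rational arithmetic. A hardware implementation would truncate or round every product, so its intermediate values would differ slightly from this model.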
FIGURE 8.1 A stepwise approximation for the reciprocal $1/D$ ($j = 2$). [Plot omitted: horizontal axis $D$ from 0.50 to 1.00, vertical axis from 1.0 to 2.0.]

Example 8.4
The implementation of a division-by-reciprocal-approximation algorithm was proposed for the ZS-1 64-bit computer [5]. This computer uses the IEEE floating-point standard and, as a result, the significand $d$ of the divisor is within the range $1 \le d < 2$. The 15 most significant bits of the divisor (excluding the hidden bit), $1.d_1 d_2 \cdots d_{15}$, are used to address a ROM look-up table for the initial approximation $x_0$. The table is of size 32K $\times$ 16 bits, producing an initial approximation in the range $0.5 \le x_0 < 1$. $x_0$ therefore has the binary form $0.1 y_2 y_3 \cdots y_{16}$. This approximation is calculated by taking the reciprocal of the mid-point between $1.d_1 d_2 \cdots d_{15}$ and its successor, where the mid-point is $1.d_1 d_2 \cdots d_{15} 1$. The reciprocal of the mid-point is rounded by adding $2^{-17}$, and the result is truncated to yield the required 16 bits, $0.1 y_2 y_3 \cdots y_{16}$. The precision of this initial approximation was shown to be $|x_0 - 1/d| < 1.5 \cdot 2^{-16}$ [5]. Therefore, based on the quadratic convergence of the Newton-Raphson method, only two iterations are needed to achieve the required precision of 53 significand bits.

Two iterations of the form of Equation (8.11) involve four multiplications and two complement operations. As was done in the implementation of the division-by-convergence scheme (in the IBM 360/91 system), additional simplifications were performed in order to reduce the overall execution time. In the first iteration that calculates $x_1 = x_0 \cdot (2 - d \cdot x_0)$, $d$ and $x_0$ are multiplied and the complement $(2 - d \cdot x_0)$ is formed. To speed up the multiplication, the 16 bits of $x_0$ are multiplied by the 32 most significant bits of $d$, and the result is rounded to 32, instead of 48, bits. In the complement operation, only the one's complement is performed to avoid the carry-propagation in the two's complement calculation. This introduces an error of size $2^{-31}$. $x_1$ is then computed by performing the multiplication $x_1 = x_0 \cdot (2 - d \cdot x_0)$ with 16 bits of $x_0$ multiplied by the 32 most significant bits of the multiplicand, forming a 32-bit product. The resulting $x_1$ is accurate to approximately 31 bits.

In the second iteration, a similar set of operations is repeated to calculate the final approximation $x_2$. In $x_1 \cdot d$, only 64 bits are generated and then only the one's complement is calculated. Next, the multiplication $x_1 (2 - d \cdot x_1)$ is performed, producing the approximated value of $1/d$. This value is then multiplied by the dividend $N$ and the resulting approximated quotient $Q'$ is rounded according to any one of the four rounding schemes supported by the IEEE floating-point standard (round-to-nearest-even, round to 0, round to $\pm\infty$). This final rounding does not guarantee an accurately rounded result for all values of the operand $d$. □

A major disadvantage of most implementations of the division-by-reciprocation algorithm is that the accuracy is smaller than that achieved by the add/subtract type of division algorithms described in Chapter 7. Corrective actions can be taken to guarantee that the least significant bit is correctly rounded. However, the additional computation usually slows down the division. Thus, when deciding between a division algorithm that is based on multiplications and an algorithm based on add/subtract operations, precision, speed, and cost trade-offs must be taken into account. The final decision depends on the technology available.

The precision required by the IEEE standard has been achieved at a reasonable cost and speed in the IBM RISC/6000 [7].
A double-width datapath is used there to implement the division by reciprocation algorithm. All operations are done in a fused multiply-add unit (see Section 6.5), resulting in a double-length estimate $Q'$ of the quotient. A remainder is then calculated (using the fused multiply-add),

$$R = N - D \times Q'.$$

This remainder is used to compute a properly rounded result (again using the fused multiply-add unit) in the desired rounding mode,

$$Q = Q' + R \times \frac{1}{D},$$

where $\frac{1}{D}$ is the result of the Newton-Raphson iterations.

A different solution is to estimate the error in $Q'$ by calculating $N' = D \cdot Q'$ [6]. If $Q'$ is sufficiently accurate (at least to $n + 1$ bits, where $n$ is the number of bits in the significand), the least significant bits of $N'$ provide the direction of the error in $Q'$. Based on this information and the desired rounding mode, $Q'$ can be corrected by either adding or subtracting 1 at the $(n + 1)$th bit position or by truncating it.
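The effect of the remainder step can be seen in a small model. This Python sketch is only an illustration under stated assumptions: exact rationals model the double-width fused multiply-add (no intermediate rounding), the function name `correct_with_remainder` and the operands are hypothetical, and the final rounding to the target format is not modeled.

```python
from fractions import Fraction

def correct_with_remainder(N, D, recip):
    """One correction step: Q = Q' + R * recip, with R = N - D * Q'."""
    q_est = N * recip             # estimate Q' from the approximate reciprocal
    R = N - D * q_est             # remainder, formed without intermediate rounding
    return q_est, q_est + R * recip

N, D = Fraction(13, 32), Fraction(3, 4)
recip = Fraction(341, 256)        # 1/D = 1.333... truncated to 8 fractional bits
q_est, q = correct_with_remainder(N, D, recip)
print(float(N / D - q_est), float(N / D - q))
```

If the reciprocal is off by $e$, the estimate $Q'$ is off by $N e$ while the corrected value is off by $N D e^2$, so in this idealized model the correction behaves like one further Newton-Raphson step applied to the quotient.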
8.3 EXERCISES

8.1. Show that if a truncated multiplier is used then the denominator $D_i$ will converge toward 1 from below or above.
8.2. Prove that a normalized denominator $D_i$ in the division-by-convergence algorithm with $a$ identical leading bits, when multiplied by its two's complement $R_i$ with $a$ identical leading bits and only $b$ extra bits ($b \le a$), will generate a new denominator $D_{i+1}$ with at least $a + b$ leading identical bits.
8.3. Prove that a ROM of size $2^7 \times 10$ can always provide a multiplying factor $R_0$ such that $D_1 = D \cdot R_0$ has at least seven leading 1's when $D$ is a normalized fraction ($0.5 \le D < 1$) in the division-by-convergence algorithm. Can the size of this table be further reduced?
8.4. Evaluate the overall time needed to accumulate 28 partial products using a pipelined partial CSA tree as shown in Figure 5.31, capable of accepting six new partial products at once. Compare it to the time needed if no overlapping is allowed.
8.5. Prove that $\delta_{i+1} = D \delta_i^2$ in the division-by-reciprocation algorithm.
8.6. A different representation of the algorithm for calculating the reciprocal is obtained by introducing a new variable, $z_i$, which equals $z_i = D x_i$. Show that the resulting equation for $z_i$ is
$$z_{i+1} = z_i (2 - z_i).$$
Show also that from $z_i$ we can calculate $x_i$ using the equation $x_{i+1} = x_i (2 - z_i)$. If the initial value for $x_i$ is $x_0 = 1$, then the initial value for $z_i$ is $z_0 = D$. When $x_i$ converges to $1/D$, $z_i$ converges to 1. Thus, the comparison of $z_i$ to 1 allows us to determine whether convergence has been achieved. Write the above scheme as a ratio between $x_0$ and $z_0$ similar to Equation (8.1). What will be the form of the multiplication factors $R_i$?
8.7. Show that the optimal initial value $x_0$ for the $k$th interval ($k = 1, 2, \ldots, 2^j$) in the division-by-reciprocation algorithm is the reciprocal of the number corresponding to its middle point.
8.8. To calculate the square root of a given operand $N$ it has been suggested to apply the Newton-Raphson method with $f(x) = 1 - 1/(N x^2)$. Show the corresponding iteration rule and prove that the convergence is quadratic. Apply this procedure to calculate $\sqrt{3}$ to five decimal places.
8.9. Given an operand $A$ that is a normalized fraction ($0.5 \le A < 1$), what function $z = f(A)$ will the following procedure calculate?
    (i) $x_{i+1} = x_i \cdot r_i^2$ with $x_0 = A$,
    (ii) $z_{i+1} = z_i \cdot r_i$ with $z_0 = A$,
with $r_i = 1 + (1 - x_i)/2$.
How many iterations are needed in this calculation and what operations are executed in each iteration? How is the multiplying coefficient calculated? Estimate the error in the final result. Can you suggest ways to speed up the calculation? Given $A = 0.11100001$, calculate the 8-bit result $z = f(A)$ using the above procedure and compare it to the correct result.

8.4 REFERENCES

[1] S. F. ANDERSON et al., "The IBM system/360 model 91: Floating-point execution unit," IBM Journal of Res. and Dev., 11 (Jan. 1967), 34-53.
[2] M. D. ERCEGOVAC, T. LANG, J.-M. MULLER, and A. TISSERAND, "Reciprocation, square root, inverse square root, and some elementary functions using small multipliers," IEEE Trans. on Computers, 49 (July 2000), 628-637.
[3] D. FERRARI, "A division method using a parallel multiplier," IEEE Trans. on Computers, EC-16 (April 1967), 224-226.
[4] M. J. FLYNN, "On division by functional iteration," IEEE Trans. on Computers, C-19 (Aug. 1970), 702-706.
[5] D. L. FOWLER and J. E. SMITH, "An accurate high speed implementation of division by reciprocal approximation," Proc. of 9th Symp. on Computer Arithmetic (1989), 60-67.
[6] H. KABUO et al., "Accurate rounding scheme for the Newton-Raphson method using redundant binary arithmetic," IEEE Trans. on Computers, 43 (January 1994), 43-51.
[7] P. W. MARKSTEIN, "Computation of elementary functions on the IBM RISC system/6000 processor," IBM Journal of Res. and Dev., 34 (Jan. 1990), 111-119.
[8] S. F. OBERMAN and M. J. FLYNN, "Division algorithms and implementations," IEEE Trans. on Computers, 46 (August 1997), 833-854.
[9] P. SODERQUIST and M. LEESER, "Area and performance tradeoffs in floating-point divide and square-root implementations," ACM Computing Surveys, 28 (September 1996), 518-564.
[10] C. S. WALLACE, "A suggestion for a fast multiplier," IEEE Trans. Elect. Comp., EC-13 (Feb. 1964), 14-17.
9 EVALUATION OF ELEMENTARY FUNCTIONS

Our objective in this chapter is to find efficient algorithms for evaluating elementary functions (like $e^x$, $\ln x$, $\sin x$, $\cos x$, etc.) in hardware. These algorithms should be accurate and sufficiently fast in comparison to alternative algorithms implemented in software. Such algorithms are especially useful when designing scientific hand-held calculators and numerical (usually floating-point) processors [12].

One straightforward method is the use of a look-up table. To evaluate $y = F(x)$ where $x$ and $y$ are numbers of length $n$ bits, a ROM of size $2^n \times n$ is required. For $n \ge 20$, this size is prohibitive ($> 16$M). Another method is to use a Taylor series expansion; e.g.,

$$e^x = \sum_{i=0}^{\infty} \frac{x^i}{i!}. \tag{9.1}$$

This series converges rapidly for a fractional argument $x$. However, for $x$ close to 1, a large number of steps is needed. In addition, its hardware implementation is complex, since separate logical networks are needed for different elementary functions. Even for software implementations there exist more efficient algorithms requiring a smaller number of computational steps for the same precision. The most commonly used are based on either polynomial or rational approximations.

Consider, for example, the exponential function $e^x$. We can express it in the form $e^x = 2^{x \log_2 e}$ and partition the exponent $x \log_2 e$ into its integer part $I$ and its fractional part $f$; i.e., $x \log_2 e = I + f$. Thus,

$$e^x = 2^I \cdot 2^f. \tag{9.2}$$
The incorporation of the factor $2^I$ in either fixed-point or floating-point calculations is straightforward. To evaluate $2^f$ one can use a rational approximation, which is the ratio between a numerator polynomial and a denominator polynomial. One such approximation is based on two degree-5 polynomials:

$$2^f = \frac{((((a_5 f + a_4) f + a_3) f + a_2) f + a_1) f + a_0}{((((b_5 f + b_4) f + b_3) f + b_2) f + b_1) f + 1}, \tag{9.3}$$

where $a_i$ and $b_i$ ($i = 0, 1, \ldots, 5$) are known constants [7]. To evaluate this approximation we need 10 multiplications, 10 additions, and 1 division. Other approximations requiring a smaller execution time are available.

9.1 THE EXPONENTIAL FUNCTION

There are alternative methods for evaluating these elementary functions that are more readily implemented in hardware. Most existing algorithms for evaluating elementary functions are similar to the division-by-convergence algorithm (see Chapter 8). They involve two (or more) recursive formulas related in such a way that when one formula is forced to a constant value the other formula yields the desired result. For example, to evaluate the exponential function $y = e^{x_0}$ for a fractional argument $x_0$, we may use the formulas

$$\text{(i)} \; x_{i+1} = x_i - \ln b_i, \qquad \text{(ii)} \; y_{i+1} = y_i \cdot b_i. \tag{9.4}$$

The $b_i$'s are selected in such a way that the elements of the sequence $x_0, x_1, \ldots, x_m$ approach 0, where $m$ is the number of iterations needed to ensure the convergence to zero; i.e., $x_m = 0$. As will become evident later, the convergence is linear. In other words, $m$ is a linear function of the number of bits, $n$. To find the value approached by the corresponding sequence $y_0, y_1, \ldots, y_m$, note that [6]

$$y_{i+1} \cdot e^{x_{i+1}} = y_i \cdot b_i \cdot e^{x_i - \ln b_i} = y_i \cdot e^{x_i}. \tag{9.5}$$

In particular,

$$y_m \cdot e^{x_m} = y_0 \cdot e^{x_0}, \tag{9.6}$$

and since $x_m = 0$, $y_m$ yields the desired result,

$$y_m = y_0 \cdot e^{x_0}. \tag{9.7}$$

The similarity between the above algorithm for calculating $e^{x_0}$ and the division-by-convergence algorithm is now apparent. In the division-by-convergence algorithm the ratio $N_i / D_i$ is kept constant while here the product $y_i \cdot e^{x_i}$ remains constant independent of the specific values of the $b_i$'s.

For the above method to be efficient, we need a simple way of selecting the $b_i$'s to ensure the convergence of $x_{i+1}$ to 0. Also, the multiplication by $b_i$ should be simple and not overly time-consuming. To simplify the multiplication, the $b_i$'s are given the form $b_i = (1 + s_i 2^{-i})$, where $s_i \in \{-1, 0, 1\}$. This way, the multiplication is reduced to shift and add operations. Also, the term $\ln b_i = \ln(1 + s_i 2^{-i})$ in formula (i) is either positive or negative. This is necessary in order to ensure the convergence of either a positive or negative $x_{i+1}$ to 0. Clearly, all possible values of $\ln(1 \pm 2^{-i})$ must be precalculated, since their online calculation will slow down the execution enough to render the algorithm useless. These precalculated quantities are stored in a look-up table; e.g., a ROM of size $2n \times n$. Substituting $b_i = (1 + s_i 2^{-i})$ in Equation (9.4), we obtain

$$\text{(i)} \; x_{i+1} = x_i - \ln(1 + s_i 2^{-i}), \qquad \text{(ii)} \; y_{i+1} = y_i \cdot (1 + s_i 2^{-i}). \tag{9.8}$$

To calculate the exponential function we have to find the vector $s = \{s_0, s_1, \ldots, s_{m-1}\}$ so that $x_m = 0$, or

$$\sum_{i=0}^{m-1} \ln(1 + s_i 2^{-i}) = x_0.$$

If $s_i$ is restricted to $\{-1, 0, 1\}$, then the smallest and largest values of $x_0$ that can be represented in such a way correspond to all $s_i = -1$ and all $s_i = 1$, respectively, and consequently,

$$\sum_{i=1}^{m-1} \ln(1 - 2^{-i}) \le x_0 \le \sum_{i=0}^{m-1} \ln(1 + 2^{-i}). \tag{9.9}$$

For every $x_0$ in this interval there exists a vector $s$ such that the convergence of $x_m$ to 0 is guaranteed. For large enough values of $m$ ($m \ge 20$) the bounds in inequality (9.9) converge, yielding the following domain for $x_0$:

$$-1.24 \le x_0 \le 1.56. \tag{9.10}$$

If we restrict the argument $x_0$ to positive fractions, we may use a simple scheme for selecting $s_i$. This is a one-sided selection rule; i.e., $s_i \in \{0, 1\}$, similar to the quotient-bit selection rule in the restoring division algorithm. In step $(i + 1)$ we form the difference $D = x_i - \ln(1 + 2^{-i})$. If $D$ is positive or zero, we set $s_i = 1$ and $x_{i+1} = D$; if $D$ is negative, we set $s_i = 0$ and $x_{i+1} = x_i$.

We may analyze the possible outcomes of the subtract operation in formula (i) by examining the Taylor series expansion of $\ln(1 + s_i 2^{-i})$:

$$\ln(1 + s_i 2^{-i}) = s_i 2^{-i} - \frac{s_i^2 \, 2^{-2i}}{2} + \frac{s_i^3 \, 2^{-3i}}{3} - \cdots \tag{9.11}$$
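The recurrences (9.8) with the one-sided selection rule map directly onto integer shift-and-add operations. The sketch below is a minimal Python model, not an implementation from the text: it assumes $n$ fractional bits, a table of $\ln(1 + 2^{-i})$ rounded to nearest, $y_0 = 1$, and truncation in the shift; the name `exp_one_sided` is illustrative.

```python
import math

def exp_one_sided(x0, n=10):
    """Approximate e**x0 for 0 <= x0 < 1 using formulas (9.8) with s_i in {0, 1}.

    x and y are held as integers scaled by 2**n (n fractional bits).
    Returns the scaled result y and the list of selected s_i.
    """
    scale = 1 << n
    # Precalculated table of ln(1 + 2**-i), rounded to n fractional bits.
    ln_table = [round(math.log(1 + 2.0 ** -i) * scale) for i in range(n + 1)]
    x = round(x0 * scale)
    y = scale                      # y0 = 1
    s = []
    for i in range(n + 1):
        d = x - ln_table[i]        # trial subtraction, as in restoring division
        if d >= 0:
            s.append(1)
            x = d
            y += y >> i            # y * (1 + 2**-i) as a shift and an add
        else:
            s.append(0)            # x and y are left unchanged
    return y, s

y, s = exp_one_sided(0.25)
print(y / 1024, s)
```

For $x_0 = 0.25$ and $n = 10$ this model returns $1315/1024 = 1.2841796875$, i.e., $1.0100100011_2$, with $s_2 = s_6 = s_7 = s_8 = 1$ and all other $s_i = 0$. Holding $x$ and $y$ as scaled integers keeps the sketch close to a fixed-point datapath; a floating-point version would hide the truncation that a shift introduces.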
At step $i$ we can cancel the bit whose weight is $2^{-i}$ by subtracting $\ln(1 + s_i 2^{-i})$ from $x_i$. This implies that we may need up to $n$ steps to ensure the convergence of an $n$-bit fraction to zero. The convergence is therefore linear and $m = n$. We can rely on this fact and slightly modify the selection rule to improve the performance of the algorithm. If the bit in the $i$th position is 1, we select $s_i = 1$ and calculate $x_{i+1} = x_i - \ln(1 + s_i 2^{-i})$. If the bit is 0, we select $s_i = 0$ and go on to the next step without performing any subtraction. As a result, we have the capability of skipping over zeros. More complex selection rules can be used as well [3].

     i    1 + 2^-i          ln(1 + 2^-i)     1 - 2^-i         ln(1 - 2^-i)
     0    10.00000 00000    0.10110 00110    0                -
     1    1.10000 00000     0.01100 11111    0.10000 00000    -.10110 00110
     2    1.01000 00000     0.00111 00100    0.11000 00000    -.01001 00111
     3    1.00100 00000     0.00011 11001    0.11100 00000    -.00100 01001
     4    1.00010 00000     0.00001 11110    0.11110 00000    -.00010 00010
     5    1.00001 00000     0.00001 00000    0.11111 00000    -.00001 00001
     6    1.00000 10000     0.00000 10000    0.11111 10000    -.00000 10000
     7    1.00000 01000     0.00000 01000    0.11111 11000    -.00000 01000
     8    1.00000 00100     0.00000 00100    0.11111 11100    -.00000 00100
     9    1.00000 00010     0.00000 00010    0.11111 11110    -.00000 00010
    10    1.00000 00001     0.00000 00001    0.11111 11111    -.00000 00001

TABLE 9.1 The value of $\ln(1 \pm 2^{-i})$ with 10-bit precision.

Example 9.1
To calculate $e^{0.25}$ in 10-bit precision we need a table of $\ln(1 \pm 2^{-i})$ for $i = 0, 1, 2, \ldots, 10$ with each entry having 10 fractional bits. The required entries have been calculated and are shown in Table 9.1. The calculated entries have been rounded-to-nearest rather than truncated. As a result, we obtain $\ln(1 \pm 2^{-i}) = \pm 2^{-i}$ for $i \ge 6$. Suppose that we use the one-sided selection rule. Starting with $x_0 = 0.25_{10} = 0.0100000000_2$, we must select $s_0 = s_1 = 0$, since $x_0 < \ln(1 + 2^{-1})$, and thus $x_2 = x_0$.
We then select $s_2 = 1$ and obtain $x_3 = x_2 - 0.0011100100 = 0.0000011100$. We then set $s_3 = s_4 = 0$, and at this point it becomes evident that we must select $s_5 = 0$, $s_6 = s_7 = s_8 = 1$, and $s_9 = s_{10} = 0$, yielding $x_{11} = 0$. In parallel, we calculate $y_{11} = y_0 \cdot (1 + 2^{-2})(1 + 2^{-6})(1 + 2^{-7})(1 + 2^{-8})$, yielding $y_{11} = 1.0100100011_2 = 1.28418_{10}$. The exact value with a precision of five decimal digits is $1.28403_{10}$. The approximation error equals $0.158 \cdot 2^{-10}$. The exact steps of the calculation are summarized in the table below.

 i    x_i             y_i             s_i
 0    0.0100000000    1.0000000000     0
 1    0.0100000000    1.0000000000     0
 2    0.0100000000    1.0000000000     1
 3    0.0000011100    1.0100000000     0
 4    0.0000011100    1.0100000000     0
 5    0.0000011100    1.0100000000     0
 6    0.0000011100    1.0100000000     1
 7    0.0000001100    1.0100010100     1
 8    0.0000000100    1.0100011110     1
 9    0.0000000000    1.0100100011     0
10    0.0000000000    1.0100100011     0
11    0.0000000000    1.0100100011

If we allow $s_i$ to assume the value $-1$, then the selected values are $(s_0, s_1, s_2, s_3, s_4, s_5, s_6, s_7, s_8, s_9, s_{10}) = (0, 1, -1, 1, 0, 0, 0, 1, 1, 1, 1)$, resulting in $x_{11} = 0$ and $y_{11} = 1.0100100010_2$. The approximation error equals $0.842 \cdot 2^{-10}$. □

The domain for the argument of the exponential function can be extended by writing the argument as $x = x \log_2 e \cdot \ln 2$, then partitioning the first term into its integer part and its fractional part; i.e., $x \log_2 e = I + f$, where $I$ is the integer part and $f$ is a fraction ($0 \le f < 1$). We need, therefore, to calculate

\[ y = e^x = e^{(I+f)\ln 2} = e^{I \ln 2 + f \ln 2} = 2^I \cdot e^{f \ln 2}. \]

We set $x_0 = f \ln 2$, and consequently $0 \le x_0 < \ln 2 = 0.693$. The factor $2^I$ is easily dealt with, either by incorporating it into the exponent part of the floating-point number, or through a shift operation if fixed-point arithmetic is used.

9.2 THE LOGARITHM FUNCTION

The procedure for calculating $e^x$ in the previous section is based on continued
summation of terms of the form $\ln(1 + s_i 2^{-i})$ to force the convergence of $x_i$ to 0. This type of procedure is called additive normalization. In a similar way, we may define a multiplicative normalization procedure as one in which $x_i$ is forced to 1 (or some other nonzero constant) by continued multiplication with precalculated factors.

The set of recursive formulas for multiplicative normalization has the following general form:

\[ \text{(i)}\;\; x_{i+1} = x_i \cdot b_i, \qquad \text{(ii)}\;\; y_{i+1} = y_i - g(b_i). \tag{9.12} \]

$b_i$ is selected so that $x_{i+1}$ approaches 1, i.e., $x_{i+1} = x_0 \prod_{l=0}^{i} b_l \to 1$, and thus, after $m$ steps,

\[ \prod_{l=0}^{m-1} b_l = \frac{1}{x_0} \quad \text{(the multiplicative inverse).} \tag{9.13} \]
The multiplication factor $b_i$ is again given the form $(1 + s_i 2^{-i})$, and for $s_i \in \{-1, 0, 1\}$ the following inequality determines the domain of the algorithm:

\[ \prod_{l=1}^{m-1} (1 - 2^{-l}) \le \prod_{l=0}^{m-1} (1 + s_l 2^{-l}) \le \prod_{l=0}^{m-1} (1 + 2^{-l}). \tag{9.14} \]

These bounds can be calculated from the corresponding bounds in inequality (9.9) and, for large enough values of $m$, they converge to

\[ 0.29 \le \prod_{l=0}^{m-1} (1 + s_l 2^{-l}) \le 4.77. \]

Consequently, the domain for $x_0$ is

\[ 0.21 \le x_0 \le 3.45. \tag{9.15} \]

Therefore, positive normalized fractions are within the domain. If the argument for the logarithm function is not already represented as a normalized floating-point number, we may rewrite it as such, obtaining $x = x_0 \cdot 2^{E_x}$, where $0.5 \le x_0 < 1$. Thus, $\ln x = \ln x_0 + E_x \ln 2$.

A simple one-sided selection rule with $s_i \in \{0, 1\}$ can be employed here as well. If $x_i$ already has $i$ leading 1's (i.e., $x_i = 0.11\ldots10zz\ldots z$), then the multiplication

\[ x_{i+1} = x_i b_i = x_i + x_i s_i 2^{-i} \]

with $s_i = 1$ will produce $(i + 1)$ leading 1's in $x_{i+1}$.

Formula (ii) in Equation (9.12) results in $y_{i+1} = y_0 - \sum_{l=0}^{i} g(b_l)$. If we select $g(b_l) = \ln b_l$, then

\[ y_{i+1} = y_0 - \sum_{l=0}^{i} \ln b_l = y_0 - \ln \prod_{l=0}^{i} b_l. \]

Therefore, when $x_{i+1}$ approaches 1, $y_{i+1}$ converges to

\[ y_m = y_0 - \ln\frac{1}{x_0} = y_0 + \ln x_0. \tag{9.16} \]

Thus, the multiplicative normalization algorithm in (9.12) with $g(b_i) = \ln(1 + s_i 2^{-i})$ can be employed to calculate the natural logarithm function $F(x_0) = \ln x_0$. Notice that the same table of $\ln(1 + s_i 2^{-i})$ that was used for evaluating the exponential function is needed here. Also note that we may replace the logarithm base $e$ by any other base. Of particular interest are the base 10 logarithm and exponential functions. These are useful in hand-held calculators.

Finally note that, in a manner analogous to Equation (9.5), we may write

\[ y_{i+1} + \ln x_{i+1} = (y_i - \ln b_i) + (\ln x_i + \ln b_i) = y_i + \ln x_i, \tag{9.17} \]

indicating that $y_{i+1}$ approaches $y_0 + \ln x_0$ when $x_{i+1}$ approaches 1.

Example 9.2
To calculate $\ln(0.50)$ in 10-bit precision we use the entries of Table 9.1. The steps of the calculation are shown in the table below:

 i    x_i             y_i              s_i
 0    0.1000000000     0.0000000000     0
 1    0.1000000000     0.0000000000     1
 2    0.1100000000    -0.0110011111     1
 3    0.1111000000    -0.1010000011     0
 4    0.1111000000    -0.1010000011     1
 5    0.1111111100    -0.1011000001     0
 6    0.1111111100    -0.1011000001     0
 7    0.1111111100    -0.1011000001     0
 8    0.1111111100    -0.1011000001     1
 9    1.0000000000    -0.1011000101     0
10    1.0000000000    -0.1011000101     0
11    1.0000000000    -0.1011000101

Thus, $y_{11} = -0.1011000101_2 = -0.69238_{10}$. The exact result (in five decimal digits) is $-0.69315$ and the approximation error is $0.783 \cdot 2^{-10}$. □

For multiplicative normalization we can also consider the case where the operation in formula (ii), Equation (9.12), is multiplication instead of subtraction. Selecting, for example, $g(b_i) = b_i^k$ (where $k$ is any real number), formula (ii) yields

\[ y_m = y_0 \cdot \prod_{l=0}^{m-1} g(b_l) = y_0 \cdot \Bigl( \prod_{l=0}^{m-1} b_l \Bigr)^{k} \to y_0 \cdot \Bigl( \frac{1}{x_0} \Bigr)^{k}. \tag{9.18} \]

If, for example, $k = 1$, then $y_{i+1} \to y_0 / x_0$ and a divide operation is executed. If $k = 1/2$, then $y_{i+1} \to y_0 / \sqrt{x_0}$ and the reciprocal of the square root is calculated, and if we set $y_0 = x_0$, we obtain $y_{i+1} \to \sqrt{x_0}$. Here, $g(b_i) = \sqrt{b_i}$, and to simplify the multiplication in formula (ii) we may select $g(b_i) = (1 + s_i 2^{-i})$. Consequently,

\[ b_i = (1 + s_i 2^{-i})^2 = 1 + 2 s_i 2^{-i} + s_i^2 2^{-2i} = 1 + s_i 2^{-i+1} + s_i^2 2^{-2i}. \tag{9.19} \]

The multiplication by $b_i$ in formula (i) is now more complex and, as a result, the efficiency of the method described above for square root extraction is questionable.

9.3 THE TRIGONOMETRIC FUNCTIONS

To evaluate the trigonometric functions we use the well-known relation between the exponential function and the trigonometric functions

\[ e^{jx} = \cos x + j \sin x, \tag{9.20} \]

where $j = \sqrt{-1}$. For the exponential function we used the recursive formulas for additive normalization, which have the following general form:

\[ \text{(i)}\;\; x_{i+1} = x_i - g(b_i), \qquad \text{(ii)}\;\; y_{i+1} = y_i \cdot b_i. \tag{9.21} \]

To calculate the trigonometric functions we select $b_i = (1 + j s_i 2^{-i})$. This complex number can be written in the form

\[ \sqrt{1^2 + (s_i 2^{-i})^2} \cdot e^{j\theta_i} = \sqrt{1 + (s_i 2^{-i})^2} \cdot (\cos\theta_i + j \sin\theta_i), \]

where $\theta_i = \tan^{-1}(s_i 2^{-i})$. This is the polar form of a complex number which, in general, is

\[ A + jB = \sqrt{A^2 + B^2} \cdot e^{j\theta} = \sqrt{A^2 + B^2} \cdot (\cos\theta + j \sin\theta), \tag{9.22} \]

where $\theta = \tan^{-1}(B/A)$.

The $s_i$'s are selected so that the elements of the sequence $x_0, x_1, \ldots, x_m$ approach 0. To find the value approached by the corresponding sequence $y_0, y_1, \ldots, y_m$ we form an expression analogous to that in Equation (9.5):

\[ y_{i+1} \cdot e^{j x_{i+1}} = y_i \cdot b_i \cdot e^{j x_i} e^{-j g(b_i)} = y_i \cdot e^{j x_i} \sqrt{1 + (s_i 2^{-i})^2} \cdot e^{j\theta_i} e^{-j g(b_i)}. \]

Setting

\[ g(b_i) = \theta_i = \tan^{-1}(s_i 2^{-i}) \tag{9.23} \]

yields

\[ y_{i+1} \cdot e^{j x_{i+1}} = y_i \cdot e^{j x_i} \sqrt{1 + (s_i 2^{-i})^2}. \tag{9.24} \]

Note that if it were not for the term $\sqrt{1 + (s_i 2^{-i})^2}$, the product $y_i \cdot e^{j x_i}$ would have been a constant. Equation (9.24) results in

\[ y_m \cdot e^{j x_m} = y_0 \cdot e^{j x_0} \prod_{l=0}^{m-1} \sqrt{1 + (s_l 2^{-l})^2}. \]

Denote $K = \prod_{l=0}^{m-1} \sqrt{1 + (s_l 2^{-l})^2}$; then for $x_m = 0$ we obtain

\[ y_m = y_0 \cdot K \cdot e^{j x_0} = y_0 \cdot K \cdot (\cos x_0 + j \sin x_0). \tag{9.25} \]

The convergence domain for this algorithm for large enough values of $m$ ($m \ge 20$) is

\[ -1.743 = \sum_{l=0}^{m-1} \tan^{-1}(-2^{-l}) \le x_0 \le \sum_{l=0}^{m-1} \tan^{-1}(2^{-l}) = 1.743, \tag{9.26} \]

which includes the useful domain $0 \le x_0 \le \pi/2 = 1.57$.

The $x_i$'s are real numbers, but the $y_i$'s are complex numbers. To separate the real and imaginary parts of $y_i$, let $y_i$ equal $Z_i + jW_i$, where $Z_i$ is the real part of $y_i$ and $W_i$ is its imaginary part. Formula (ii) now takes the form

\[ \text{(ii)}\;\; y_{i+1} = Z_{i+1} + jW_{i+1} = (Z_i + jW_i)(1 + j s_i 2^{-i}). \tag{9.27} \]

Therefore, the recursive formulas for $Z_{i+1}$ and $W_{i+1}$ are

\[ \text{(ii)}\;\; Z_{i+1} = Z_i - s_i 2^{-i} \cdot W_i, \qquad \text{(ii)}'\;\; W_{i+1} = W_i + s_i 2^{-i} \cdot Z_i. \tag{9.28} \]

The initial value $y_0$ can also, in general, be a complex number, $y_0 = Z_0 + jW_0$, but if we set $W_0$ equal to 0, making $Z_0$ equal to $y_0$, we obtain the desired values, which according to Equation (9.25) are

\[ Z_m = y_0 \cdot K \cdot \cos x_0, \qquad W_m = y_0 \cdot K \cdot \sin x_0. \tag{9.29} \]

The recursive formulas for the calculation of $\sin x$ and $\cos x$ were developed somewhat differently by Volder [17] through rotations in a polar system. The resulting implementation was named CORDIC (COordinate Rotation DIgital Computer). This formulation was later extended to include division and the evaluation of hyperbolic functions [18].

A possible selection rule for $s_i$ to ensure the convergence of $x_{i+1}$ to 0 when $x_0$ is in the domain presented in Equation (9.26) is

\[ s_i = \begin{cases} 1 & \text{if } x_i > \tan^{-1}(2^{-i}) \\ 0 & \text{if } |x_i| \le \tan^{-1}(2^{-i}) \\ -1 & \text{if } x_i < -\tan^{-1}(2^{-i}). \end{cases} \tag{9.30} \]

Since $\tan^{-1}(-\theta) = -\tan^{-1}(\theta)$, we have $\tan^{-1}(s_i 2^{-i}) = s_i \tan^{-1}(2^{-i})$, and thus only $n$ values for constants, rather than $2n$, must be stored in a ROM. With this set, $s_i \in \{-1, 0, 1\}$, we obtain
\[ K = \prod_{i=0}^{m-1} \sqrt{1 + (s_i 2^{-i})^2}. \tag{9.31} \]

Therefore, $K$ is not a constant but depends on the vector $s$, which in turn depends upon $x_0$. In addition, the above selection rule requires a full-length comparison between $x_i$ and $\tan^{-1}(2^{-i})$. If we restrict $s_i$ to assume only the values $\{-1, 1\}$, then

\[ K = \prod_{i=0}^{m-1} \sqrt{1 + 2^{-2i}} \]

and it is a constant that can be precalculated. For $m > 16$, $K = 1.6468$. We may then set $y_0 = 1/K$ and, as a result, obtain $Z_m = \cos x_0$ and $W_m = \sin x_0$. The selection rule for $s_i$ now becomes

\[ s_i = \begin{cases} 1 & \text{if } x_i \ge 0 \\ -1 & \text{if } x_i < 0. \end{cases} \tag{9.32} \]

This is similar to the selection rule for nonrestoring division, and a full-length comparison is not needed.

To examine the rate of convergence of $x_{i+1}$ to 0, consider the Taylor series expansion of $\tan^{-1}(s_i 2^{-i})$ (in radians):

\[ \tan^{-1}(s_i 2^{-i}) = s_i 2^{-i} - s_i \frac{2^{-3i}}{3} + \cdots \tag{9.33} \]

Again, linear convergence is achieved. Also, for $i > n/3$, all terms except the first in the above series are negligible and, as a result, a considerable reduction in the size of the ROM is possible.

 i    2^{-i}          arctan(2^{-i})
 0    1.0000000000    0.1100100100
 1    0.1000000000    0.0111011011
 2    0.0100000000    0.0011111011
 3    0.0010000000    0.0001111111
 4    0.0001000000    0.0001000000
 5    0.0000100000    0.0000100000
 6    0.0000010000    0.0000010000
 7    0.0000001000    0.0000001000
 8    0.0000000100    0.0000000100
 9    0.0000000010    0.0000000010
10    0.0000000001    0.0000000001

TABLE 9.2 The value of $\arctan(2^{-i})$ with 10-bit precision.

Example 9.3
To calculate $\sin(\pi/4)$ in 10-bit precision we need a table of $\arctan(2^{-i})$ for $i = 0, 1, 2, \ldots, 10$ with each entry having 10 fractional bits. The required entries have been calculated and are shown in Table 9.2. The calculated entries have been rounded-to-nearest rather than truncated. As a result, we obtain $\arctan(2^{-i}) = 2^{-i}$ for $i \ge 4$. The input operand is $\pi/4 = 0.785398_{10} = 0.1100100100_2$. Initially we set $Z_0 = 1/K$, which in 10-bit precision is $0.1001101110_2$ ($= 0.6074_{10}$). We then set $s_i = \mathrm{sign}(x_i)$ per Equation (9.32). The final results are $Z_{11} = 0.1011010011_2$ and $W_{11} = 0.1011010100_2$, while the "exact" results in 10-bit precision are $0.1011010100_2 = 0.7071_{10}$ and $0.1011010100_2 = 0.7071_{10}$, respectively. The approximation error is thus not larger than $2^{-10}$. The exact steps of the calculation are summarized in the table below.

 i    x_i              Z_i             W_i             s_i
 0     0.1100100100    0.1001101110    0.0000000000     1
 1     0.0000000000    0.1001101110    0.1001101110     1
 2    -0.0111011011    0.0100110111    0.1110100101    -1
 3    -0.0011100000    0.1000100000    0.1101010111    -1
 4    -0.0001100001    0.1010001011    0.1100010011    -1
 5    -0.0000100001    0.1010111100    0.1011101010    -1
 6    -0.0000000001    0.1011010011    0.1011010100    -1
 7     0.0000001111    0.1011011110    0.1011001001     1
 8     0.0000000111    0.1011011000    0.1011001111     1
 9     0.0000000011    0.1011010101    0.1011010010     1
10     0.0000000001    0.1011010100    0.1011010011     1
11     0.0000000000    0.1011010011    0.1011010100

□

9.4 THE INVERSE TRIGONOMETRIC FUNCTIONS

To evaluate the inverse trigonometric functions we use the multiplicative normalization algorithm for the inverse exponential function; i.e., the function $\ln x$ described in Section 9.2. The general form of multiplicative normalization is

\[ \text{(i)}\;\; x_{i+1} = x_i \cdot b_i, \qquad \text{(ii)}\;\; y_{i+1} = y_i + g(b_i). \tag{9.34} \]

If we set $b_i = (1 + j s_i 2^{-i})$ as in Section 9.3, then formula (i) yields

\[ x_m = x_0 \cdot \prod_{l=0}^{m-1} b_l = x_0 \cdot \prod_{l=0}^{m-1} (1 + j s_l 2^{-l}) = x_0 \cdot K \cdot e^{j\theta}, \tag{9.35} \]
where

\[ \theta = \sum_{l=0}^{m-1} \theta_l, \qquad \theta_l = \tan^{-1}(s_l 2^{-l}), \qquad \text{and} \qquad K = \prod_{l=0}^{m-1} \sqrt{1 + (s_l 2^{-l})^2}. \]

Similarly, if we set $g(b_i) = \theta_i = \tan^{-1}(s_i 2^{-i})$ and $y_0 = 0$, then formula (ii) results in

\[ y_m = \sum_{l=0}^{m-1} \theta_l = \theta. \tag{9.36} \]

Here, $x_i$ is a complex variable in the form $U_i + jV_i$. To obtain the recursive formulas for the real and imaginary parts, consider again formula (i):

\[ \text{(i)}\;\; U_{i+1} + jV_{i+1} = (U_i + jV_i) \cdot (1 + j s_i 2^{-i}). \]

Separating the real and imaginary parts, we obtain

\[ \text{(i)}\;\; U_{i+1} = U_i - s_i 2^{-i} \cdot V_i, \qquad \text{(i)}'\;\; V_{i+1} = V_i + s_i 2^{-i} \cdot U_i. \tag{9.37} \]

To find the relationship between the values of $U_m$, $V_m$, and $y_m$ we form the following expression:

\[ x_{i+1} \cdot e^{-j y_{i+1}} = x_i \cdot b_i \cdot e^{-j y_i} e^{-j g(b_i)}. \tag{9.38} \]

Substituting the selected values of $b_i$ and $g(b_i)$ yields

\[ x_{i+1} \cdot e^{-j y_{i+1}} = x_i \cdot e^{-j y_i} \cdot \sqrt{1 + (s_i 2^{-i})^2}, \]

resulting in

\[ x_m \cdot e^{-j y_m} = x_m \cdot e^{-j\theta} = x_0 \cdot K, \tag{9.39} \]

where

\[ K = \prod_{l=0}^{m-1} \sqrt{1 + (s_l 2^{-l})^2} \]

as in Section 9.3. Replacing $x_m$ and $x_0$ in Equation (9.39) by the corresponding real and imaginary parts yields

\[ (U_m + jV_m) \cdot (\cos\theta - j \sin\theta) = K \cdot (U_0 + jV_0). \tag{9.40} \]

Thus,

\[ U_m \cos\theta + V_m \sin\theta = K \cdot U_0, \qquad -U_m \sin\theta + V_m \cos\theta = K \cdot V_0. \tag{9.41} \]

Setting $V_0 = 0$ and $U_0 = 1/K$, we obtain $U_m = \cos\theta$ and $V_m = \sin\theta$. Recall that $K$ is a constant if we restrict $s_i$ to the values $\{-1, 1\}$. If, for a given argument $C$, we need to calculate $\psi = \cos^{-1} C$, we select the $s_i$'s in a way that $U_m \to C$ and, as a result, $\theta \to \psi$. The angle $\theta$ is obtained from $y_m = \theta$, where

\[ \theta = \sum_{l=0}^{m-1} \tan^{-1}(s_l 2^{-l}) = \sum_{l=0}^{m-1} s_l \tan^{-1}(2^{-l}). \]

Consequently, a table of $\tan^{-1}(2^{-i})$ is needed, as it is for sine and cosine. Similarly, to calculate $\psi = \sin^{-1} D$ for a given $D$ we select the $s_i$'s so that $V_m \to D$ and again $\theta \to \psi$.

To evaluate the inverse tangent, $\tan^{-1} C$, let $V_m \to 0$, and from Equation (9.41) we obtain $\tan\theta = -V_0/U_0$ and $\theta = -\tan^{-1}(V_0/U_0)$. We set $V_0 = C$, $U_0 = 1$ and obtain $\tan^{-1} C = -\theta$. In summary, the iterative formulas to be calculated are

\[ \begin{aligned} \text{(i)}\;\; & U_{i+1} = U_i - s_i 2^{-i} \cdot V_i; & U_0 &= 1, \\ \text{(i)}'\;\; & V_{i+1} = V_i + s_i 2^{-i} \cdot U_i; & V_0 &= C, \\ \text{(ii)}\;\; & y_{i+1} = y_i - s_i \tan^{-1}(2^{-i}); & y_0 &= 0, \end{aligned} \tag{9.42} \]

with $s_i \in \{-1, 1\}$ set according to the signs of $U_i$ and $V_i$ so that $V_{i+1}$ is closer to 0.

Example 9.4
To calculate $\tan^{-1}(1.0)$ in 10-bit precision we use the entries of Table 9.2. The steps of the calculation are shown in the table below. The final result is $y_{11} = 0.1100100100_2$, which is equal to the "exact" result in 10-bit precision.

 i    U_i             V_i              y_i             s_i
 1    1.0000000000     1.0000000000    0.0000000000    -1
 2    1.1000000000     0.1000000000    0.0111011011    -1
 3    1.1010000000     0.0010000000    0.1011010110    -1
 4    1.1010010000    -0.0001010000    0.1101010101     1
 5    1.1010010101     0.0000011001    0.1100010101    -1
 6    1.1010010110    -0.0000011100    0.1100110101     1
 7    1.1010010110    -0.0000000010    0.1100100101     1
 8    1.1010010110     0.0000001011    0.1100011101    -1
 9    1.1010010110     0.0000000100    0.1100100001    -1
10    1.1010010110     0.0000000001    0.1100100011    -1
11    1.1010010110    -0.0000000001    0.1100100100

□
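The two CORDIC-style recursions of Sections 9.3 and 9.4 can be sketched side by side. This is a floating-point illustration under stated assumptions, not the hardware formulation: the function names, the default of 32 iterations starting at $i = 0$, and the use of `math.atan` to build the constant table are all choices made for this sketch. The first routine drives $x_i$ to 0 with $s_i = \mathrm{sign}(x_i)$ as in Equation (9.32); the second drives $V_i$ to 0 and accumulates $y$ as in Equation (9.42).

```python
import math

def cordic_tables(m):
    """Constants shared by both modes: arctan(2**-i) and the scale factor K."""
    angles = [math.atan(2.0 ** -i) for i in range(m)]
    K = math.prod(math.sqrt(1 + 2.0 ** (-2 * i)) for i in range(m))
    return angles, K

def cordic_sin_cos(x0, m=32):
    """Return (sin x0, cos x0) for |x0| <= 1.743, with s_i in {-1, 1}."""
    angles, K = cordic_tables(m)
    x, Z, W = x0, 1.0 / K, 0.0           # Z_0 = 1/K, W_0 = 0
    for i in range(m):
        s = 1 if x >= 0 else -1          # selection rule (9.32)
        # Both updates use the old Z and W (tuple assignment).
        Z, W = Z - s * W * 2.0 ** -i, W + s * Z * 2.0 ** -i
        x -= s * angles[i]
    return W, Z

def cordic_arctan(C, m=32):
    """Return arctan(C) by forcing V to 0, starting from U_0 = 1, V_0 = C."""
    angles, _ = cordic_tables(m)
    U, V, y = 1.0, C, 0.0
    for i in range(m):
        # U stays positive here, so the sign of V alone decides s_i.
        s = -1 if V >= 0 else 1
        U, V = U - s * V * 2.0 ** -i, V + s * U * 2.0 ** -i
        y -= s * angles[i]
    return y

if __name__ == "__main__":
    print(cordic_sin_cos(math.pi / 4))
    print(cordic_arctan(1.0), math.pi / 4)
```

Note that the inverse-tangent routine never needs $K$, since only the accumulated angle is used.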
9.5 THE HYPERBOLIC FUNCTIONS

To calculate the hyperbolic functions

\[ \sinh x_0 = \frac{1}{2}\left(e^{x_0} - e^{-x_0}\right) \quad \text{and} \quad \cosh x_0 = \frac{1}{2}\left(e^{x_0} + e^{-x_0}\right), \tag{9.43} \]

we use an algorithm similar to that used for the exponential function; i.e.,

\[ \text{(i)}\;\; x_{i+1} = x_i - g(b_i), \qquad \text{(ii)}\;\; y_{i+1} = y_i \cdot b_i, \qquad b_i = 1 + s_i 2^{-i}. \tag{9.44} \]

We first rewrite $b_i$ as

\[ 1 + s_i 2^{-i} = \sqrt{1 - (s_i 2^{-i})^2} \cdot \exp\!\left(\tanh^{-1}(s_i 2^{-i})\right), \tag{9.45} \]

which is based on the identity

\[ 1 + x = \sqrt{1 - x^2} \cdot \exp\!\left(\tanh^{-1} x\right) \quad \text{if } |x| \ne 1. \tag{9.46} \]

The proof of this identity is left to the reader as an exercise. Formula (ii) in Equation (9.44) results in

\[ y_m = y_0 \prod_{l=0}^{m-1} b_l = y_0 \Bigl( \prod_{l=0}^{m-1} \sqrt{1 - (s_l 2^{-l})^2} \Bigr) \cdot \exp\Bigl( \sum_{l=0}^{m-1} \tanh^{-1}(s_l 2^{-l}) \Bigr) = y_0 \cdot \tilde{K} \cdot \exp\Bigl( \sum_{l=0}^{m-1} \tanh^{-1}(s_l 2^{-l}) \Bigr), \tag{9.47} \]

where

\[ \tilde{K} = \prod_{l=0}^{m-1} \sqrt{1 - (s_l 2^{-l})^2}. \]

This is a constant factor ($\tilde{K} = 1.205$) if we restrict $s_i$ to $\{-1, 1\}$, as we did for the trigonometric functions. Unlike the algorithm for $e^x$, we select here $g(b_i) = \tanh^{-1}(s_i 2^{-i})$, so $\sum_{l=0}^{i} \tanh^{-1}(s_l 2^{-l}) \to x_0$ when $x_{i+1} \to 0$. Thus, from Equation (9.47), $y_m = y_0 \cdot \tilde{K} \cdot e^{x_0}$.

We now define a new variable $t_i$, which is calculated recursively through the equation $t_{i+1} = t_i \cdot (1 - s_i 2^{-i})$. As in Equation (9.47), this new variable converges to $t_0 \cdot \tilde{K} \cdot e^{-x_0}$. Let $Z_i = \frac{1}{2}(y_i + t_i)$ and $W_i = \frac{1}{2}(y_i - t_i)$. Formula (ii) is replaced by the following two formulas:

\[ \text{(ii)}\;\; Z_{i+1} = Z_i + s_i 2^{-i} \cdot W_i, \qquad \text{(ii)}'\;\; W_{i+1} = W_i + s_i 2^{-i} \cdot Z_i. \tag{9.48} \]

These variables converge to

\[ \begin{aligned} Z_m &= \tfrac{1}{2} y_0 \tilde{K} e^{x_0} + \tfrac{1}{2} t_0 \tilde{K} e^{-x_0} \\ &= \tfrac{1}{2}(y_0 + t_0) \cdot \tilde{K} \cdot \frac{e^{x_0} + e^{-x_0}}{2} + \tfrac{1}{2}(y_0 - t_0) \cdot \tilde{K} \cdot \frac{e^{x_0} - e^{-x_0}}{2} \\ &= Z_0 \cdot \tilde{K} \cdot \cosh x_0 + W_0 \cdot \tilde{K} \cdot \sinh x_0 \end{aligned} \tag{9.49} \]

and

\[ W_m = \tfrac{1}{2} y_0 \tilde{K} e^{x_0} - \tfrac{1}{2} t_0 \tilde{K} e^{-x_0} = Z_0 \cdot \tilde{K} \cdot \sinh x_0 + W_0 \cdot \tilde{K} \cdot \cosh x_0. \tag{9.50} \]

Now, setting $W_0 = 0$ and $Z_0 = 1/\tilde{K}$ yields $Z_m = \cosh x_0$ and $W_m = \sinh x_0$. The resulting formulas (i), (ii), and (ii)' are similar to those for the trigonometric functions.
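A floating-point sketch of formulas (i), (ii), and (ii)' for the hyperbolic case is given here for illustration. Several details are assumptions of the sketch and are not taken from the text: the iterations start at $i = 1$ (since $\tanh^{-1}(2^{0})$ is unbounded), a single pass through the indices does not always converge for hyperbolic iterations, so the sketch executes iterations 4, 13, 40, ... twice (a commonly used repetition schedule, which may differ from the one a particular implementation adopts), and the scale factor is computed over exactly the iterations that are executed.

```python
import math

def hyperbolic_schedule(n):
    """Indices 1..n, with 4, 13, 40, ... executed twice (assumed schedule)."""
    seq, repeat = [], 4
    for i in range(1, n + 1):
        seq.append(i)
        if i == repeat:
            seq.append(i)            # repeat this iteration once
            repeat = 3 * repeat + 1
    return seq

def cordic_sinh_cosh(x0, n=24):
    """Return (sinh x0, cosh x0) for roughly |x0| <= 1.1, s_i in {-1, 1}."""
    seq = hyperbolic_schedule(n)
    # Scale factor over the executed iterations (repeats included).
    K = math.prod(math.sqrt(1 - 2.0 ** (-2 * i)) for i in seq)
    x, Z, W = x0, 1.0 / K, 0.0       # Z_0 = 1/K, W_0 = 0
    for i in seq:
        s = 1 if x >= 0 else -1
        # Both updates use the old Z and W (tuple assignment).
        Z, W = Z + s * W * 2.0 ** -i, W + s * Z * 2.0 ** -i
        x -= s * math.atanh(2.0 ** -i)
    return W, Z

if __name__ == "__main__":
    print(cordic_sinh_cosh(0.5), (math.sinh(0.5), math.cosh(0.5)))
```

The only structural differences from the trigonometric sketch are the sign in the $Z$ update, the $\tanh^{-1}$ table, and the repeated iterations.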
There is, however, a major difference between the convergence of $x_{i+1}$ to 0 in these two schemes. For the trigonometric functions the relationship $\tan^{-1}(2^{-(i+1)}) > \frac{1}{2} \cdot \tan^{-1}(2^{-i})$ holds. As a result, even if we obtain $x_i = 0$ and $|x_{i+1}|$ becomes $\tan^{-1}(2^{-i})$ (since $s_i$ is restricted to $\{-1, 1\}$), we can still expect $x_m$ to converge to 0. However, for hyperbolic functions we have $\tanh^{-1}(2^{-(i+1)}) < \frac{1}{2} \cdot \tanh^{-1}(2^{-i})$ and the convergence of $x_m$ to 0 is not guaranteed unless several steps are repeated twice. For example, if $n = 24$ then steps 3, 4, 7, 12, 13, 18, and 21 must be repeated twice [8].

9.6 BOUNDS ON THE APPROXIMATION ERROR

In this section we estimate the maximum error that is to be expected when evaluating elementary functions using either additive normalization or multiplicative normalization.
In the additive normalization procedure, if $x_0$ is an $n$-bit fraction, then $\sum_{l=0}^{m-1} g(b_l)$ approaches $x_0$ with an error of $\epsilon = x_0 - \sum_{l=0}^{m-1} g(b_l)$, which satisfies $|\epsilon| \le 2^{-n}$. At the same time, we attempt to evaluate $y = F(x_0)$ by calculating

\[ y = F\Bigl( \sum_{l=0}^{m-1} g(b_l) \Bigr) = F(x_0 - \epsilon). \]

The corresponding Taylor series expansion is

\[ y = F(x_0 - \epsilon) = F(x_0) - \epsilon \cdot \left.\frac{dF}{dx}\right|_{x_0} + \frac{\epsilon^2}{2} \cdot \left.\frac{d^2F}{dx^2}\right|_{x_0} + O(\epsilon^3). \tag{9.51} \]

Since $\epsilon$ is of the order of $2^{-n}$, the last two terms in the above expansion are negligible. As a result, the error $\delta$ in $y$ is of the order of $\epsilon \cdot \left.\frac{dF}{dx}\right|_{x_0}$. The magnitude of this error depends on the specific function $F(x_0)$. For example, for the exponential function $F(x_0) = e^{x_0}$, $\left.\frac{dF}{dx}\right|_{x_0} = e^{x_0}$, and thus $|\delta| \le 2^{-n} \cdot e^{\ln 2} = 2^{-(n-1)}$ for $x_0 \le \ln 2$. The maximum error in the function has double the size of the error in the argument. This error can be reduced by increasing the number of bits in $x_i$ by 1.

In the multiplicative normalization procedure,

\[ x_0 \prod_{l=0}^{m-1} b_l \to 1, \]

and the error is

\[ \epsilon = 1 - x_0 \prod_{l=0}^{m-1} b_l \quad \text{or} \quad x_0 \prod_{l=0}^{m-1} b_l = 1 - \epsilon. \]

Instead of $F(x_0)$, we calculate

\[ y = F\Bigl( \frac{1}{\prod_{l=0}^{m-1} b_l} \Bigr) = F\Bigl( \frac{x_0}{1 - \epsilon} \Bigr) = F\bigl( x_0 [1 + \epsilon + \epsilon^2 + \cdots] \bigr) \approx F(x_0 + x_0 \epsilon). \]

The Taylor series expansion of the last expression is

\[ F(x_0 + x_0 \epsilon) = F(x_0) + x_0 \epsilon \cdot \left.\frac{dF}{dx}\right|_{x_0} + \frac{1}{2} (x_0 \epsilon)^2 \cdot \left.\frac{d^2F}{dx^2}\right|_{x_0} + O(\epsilon^3). \tag{9.52} \]

For example, for the natural logarithm function $F(x_0) = \ln x_0$ the derivatives are $\left.\frac{dF}{dx}\right|_{x_0} = \frac{1}{x_0}$ and $\left.\frac{d^2F}{dx^2}\right|_{x_0} = -\frac{1}{x_0^2}$. Therefore, $y = F(x_0) + \epsilon + O(\epsilon^2)$. Hence, the error $\delta$ in $y$ satisfies $|\delta| \approx |\epsilon| \le 2^{-n}$, and the precision obtained is satisfactory.

9.7 SPEED-UP TECHNIQUES

Consider the exponential function $F(x_0) = e^{x_0}$, for which the precalculated constants $\ln(1 + s_i 2^{-i})$ have the following Taylor series expansion:

\[ \ln(1 + s_i 2^{-i}) = s_i 2^{-i} - \frac{1}{2} s_i^2 2^{-2i} + \frac{1}{3} s_i^3 2^{-3i} - \cdots \]

For $i > n/2$, the above expression is approximately $s_i 2^{-i}$.
As a result, not only can we reduce the size of the ROM required but, more importantly, the last $n/2$ steps can be replaced by a single operation, as shown below.

In the steps prior to step $i$, we have already canceled at least the first $(i - 1)$ bits in $x_i$. $x_i$ thus has the following form:

\[ x_i = 0.\underbrace{0 \ldots 00}_{i-1}\, z_i z_{i+1} \cdots \]

where $z_k$ ($k = i, i+1, \ldots$) is a single bit. To cancel the remaining nonzero bits in $x_i$, for $i \ge n/2$, we should select $s_k = z_k$ ($k \ge i$); i.e., all the remaining $s_k$'s can be predicted ahead of time. Based on this knowledge, we may speed up the execution. In the last steps we need to calculate (for $i \ge n/2$)

\[ y_m = y_i \prod_{k=i}^{m-1} (1 + s_k 2^{-k}) \approx y_i \left( 1 + s_i 2^{-i} + s_{i+1} 2^{-(i+1)} + s_{i+2} 2^{-(i+2)} + \cdots \right), \tag{9.53} \]

and, since we select $s_k = z_k$ for $k \ge i$, the last term in Equation (9.53) equals $(1 + x_i)$. Thus,

\[ y_m = y_i (1 + x_i). \tag{9.54} \]

If a fast multiplier is available, the overall execution time can be reduced. This is called a termination algorithm [2] (or terminal linear approximation). Even if a fast multiplier is not available, we can still take advantage of the ability to predict the $s_i$'s by performing all the additions of the products in Equation (9.53) in a carry-save manner, avoiding the time-consuming carry-propagation.

We may arrive at the same expression for the terminal approximation in a different way, based on the Taylor series expansion

\[ F(x_0) = F((x_0 - x_i) + x_i) = F(x_0 - x_i) + x_i \cdot \left.\frac{dF}{dx}\right|_{(x_0 - x_i)} + \frac{1}{2} x_i^2 \cdot \left.\frac{d^2F}{dx^2}\right|_{(x_0 - x_i)} + O(x_i^3) \tag{9.55} \]

for $|x_i| < 2^{-n/2}$. It can be shown that $F(x_0 - x_i) = \left.\frac{dF}{dx}\right|_{(x_0 - x_i)} = \left.\frac{d^2F}{dx^2}\right|_{(x_0 - x_i)} = y_i$ and therefore, $F(x_0) = y_i (1 + x_i) + \delta$, where $\delta \approx \frac{1}{2} x_i^2 y_i \le \frac{1}{2}\, 2^{-n} \cdot e^{\ln 2} = 2^{-n}$.
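The terminal approximation of Equation (9.54) can be illustrated with a short fixed-point sketch. The function name, the default word length, the one-sided selection rule, and the truncating shifts are assumptions made for this illustration: the regular steps are run only for $i = 0, \ldots, n/2$, after which one multiplication by $(1 + x_i)$ replaces the remaining steps.

```python
import math

def exp_with_termination(x0, n=20):
    """Approximate e**x0 (0 <= x0 < ln 2): about n/2 regular steps plus one multiply."""
    scale = 1 << n
    half = n // 2
    x = round(x0 * scale)            # x_0, scaled by 2**n
    y = scale                        # y_0 = 1
    # Regular additive-normalization steps with s_i in {0, 1}.
    for i in range(half + 1):
        t = round(math.log(1 + 2.0 ** -i) * scale)   # ln(1 + 2**-i)
        if x >= t:                   # s_i = 1
            x -= t
            y += y >> i              # y * (1 + 2**-i), truncated
    # Termination step: y_m = y_i * (1 + x_i), one fixed-point multiplication.
    y += (y * x) >> n
    return y / scale

if __name__ == "__main__":
    print(exp_with_termination(0.5), math.exp(0.5))
```

In hardware the final product would come from a fast multiplier or a carry-save accumulation; the integer multiplication here merely stands in for that operation.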
Thus, the bound on the error when the terminal approximation is used is half its value without the terminal approximation, providing higher precision.

A termination algorithm can also be applied in the calculation of the natural logarithm function, $F(x_0) = \ln x_0$. In this case, $x_i$ has the form

\[ 1 - x_i = 0.\underbrace{0 \ldots 00}_{i-1}\, z_i z_{i+1} \cdots \]

or

\[ x_i = 1 - z_i 2^{-i} - z_{i+1} 2^{-(i+1)} - \cdots = 1 - \sum_{k \ge i} z_k 2^{-k}. \]

The formula for $x_{i+1}$ yields

\[ x_{i+1} = x_i \cdot (1 + s_i 2^{-i}) = 1 + s_i 2^{-i} - z_i 2^{-i} - \cdots \]

For $x_{i+1}$ to approach 1, we should therefore select $s_k = z_k$ for $k \ge i$. In parallel, we expect to calculate in the remaining steps

\[ y_m = y_i - \sum_{k=i}^{m-1} \ln(1 + s_k 2^{-k}). \]

Consequently, based on the Taylor series expansion for $\ln(1 + s_k 2^{-k})$, we obtain the following terminal approximation for $i \ge n/2$:

\[ y_m \approx y_i - \sum_{k=i}^{m-1} s_k 2^{-k} = y_i - (1 - x_i). \tag{9.56} \]

The evaluation of the trigonometric functions can also be accelerated by predicting the $s_i$'s based on the series expansion in Equation (9.33). Previously, in order to obtain a constant value for $K$ independent of the selected $s_i$'s, we have restricted $s_i$ to be in $\{-1, 1\}$. However, if

\[ K = \prod_{i=0}^{m-1} \sqrt{1 + 2^{-2i}} \]

is replaced by

\[ K = \prod_{i=0}^{n/2} \sqrt{1 + 2^{-2i}}, \]

the deviation from the exact value is less than $2^{-n}$ [1]. Therefore, for $i > n/2$, we may select $s_k$ ($k \ge i$) from the set $\{0, 1\}$ and predict $s_k = z_k$, where the $z_k$ are the bits in the least significant portion of $x_i$:

\[ x_i = 0.\underbrace{0 \ldots 00}_{i-1}\, z_i z_{i+1} \cdots \]

Being able to predict the $s_k$'s allows us to perform the additions in the equations for $x_i$, $Z_i$, and $W_i$ in a carry-save manner [1].

Reexamining the series expansion of $\tan^{-1}(s_i 2^{-i})$ in Equation (9.33), we conclude that we may, in principle, start the prediction process even earlier and predict the $s_l$'s for $i \le l \le 3i$ all at once. The corresponding terms can then be added with a carry-save adder and a single pass through a carry-propagate adder. However, for $i < n/2$ we still need to use the set $\{-1, 1\}$. This can be done by properly recoding the bits in $x_i$ using the set $\{-1, 1\}$.

We would also like to speed up the computation for the first steps, $0 \le i \le n/2$. This can be done by using a radix higher than 2; e.g., $g(b_i) = 1 + s_i r^{-i}$, where $r$ is some power of 2, say $r = 2^q$, and $s_i \in \{-(r-1), -(r-2), \ldots, -1, 0, 1, \ldots, (r-1)\}$. This way, we may handle $q$ bits of $x$ in a single step, but with an increased complexity of each step, since $s_i$ has a larger range. Still, some improvement in speed is possible [4].

Another method is to allow $s_i$ to assume the values in $\{-1, 0, 1\}$ and modify the selection rule in such a way that the probability of selecting $s_i = 0$ is maximized. This is similar to the idea behind the variations of the SRT division algorithm. This approach has been analyzed in [3] but has not gained much popularity.

9.8 OTHER TECHNIQUES FOR EVALUATING ELEMENTARY FUNCTIONS

Many calculators and floating-point processors employ some variations of the previously presented algorithms for the evaluation of elementary functions. Usually, each one of these has a particular implementation that depends on the precision and speed requirements and the area constraints. Still, several other methods for calculating elementary functions have been proposed, and some have been implemented; e.g., [5].

In [9] an evaluation of elementary functions based on rational approximations is proposed. This method, which is commonly used when evaluating elementary functions in software, can become very cost-effective for hardware implementation. This is the case when a fast adder and multiplier are available and when high precision is required; e.g., the arguments are extended double-precision floating-point numbers in the IEEE standard. In such a situation, hardware implementation of rational approximations can successfully compete with the methods based on continued summations and multiplications (whose convergence is linear in the number of bits) when execution time and chip area are considered.

A somewhat different approach combines polynomial approximations with a look-up table [10, 16]. Here, the domain of the argument $x$ of the elementary function $f(x)$ is divided into smaller intervals (usually of equal size), and the values of $f(x_i)$ for the boundary points $x_i$ between the intervals are kept in the look-up table. Then, the value of $f(x)$ at the given point $x$ is calculated
based on the value $f(x_i)$, which is read from the table where $x_i$ is the closest boundary point to $x$, and a polynomial approximation $p(x - x_i)$ for $f(x - x_i)$. Since the distance $(x - x_i)$ is small, a very simple polynomial can be employed, requiring significantly less time than a rational approximation for $f(x)$ on the entire domain. The overall algorithm therefore has three steps:

1. Find the closest boundary point $x_i$ and calculate the "reduction transformation," which is usually the distance $d = x - x_i$.
2. Calculate $p(d)$.
3. Combine $f(x_i)$ with $p(d)$ to calculate $f(x)$.

Example 9.5
The following algorithm can be employed for calculating $2^x$ on $[-1, 1]$ [16]: 32 boundary points of the form $x_i = i/32$, $i = 0, 1, \ldots, 31$, are defined, and the values of $2^{x_i}$ are precalculated and stored in a look-up table. In step 1, we search for an $x_i$ such that $|x - m - x_i| \le 1/64$, where $m = -1$, 0, or 1. Then, we calculate $d = (x - m - x_i) \cdot \ln 2$. This "distance" satisfies $e^d = 2^{x - m - x_i}$, and $|d| \le (\ln 2)/64$. In step 2 we calculate an approximation for $e^d - 1$ using a polynomial $p(d) = d + p_2 d^2 + p_3 d^3 + \cdots + p_k d^k$, where $p_2, p_3, \ldots, p_k$ are precalculated coefficients of the polynomial approximation of the function $e^d - 1$ on the interval $[-(\ln 2)/64, (\ln 2)/64]$. In step 3 we reconstruct $2^x$ using

\[ 2^x = 2^{m + x_i} \cdot e^d = 2^m \left( 2^{x_i} + 2^{x_i} \cdot (e^d - 1) \right) \approx 2^m \left( 2^{x_i} + 2^{x_i} \cdot p(d) \right), \]

where $2^{x_i}$ is read out of the look-up table. A detailed error analysis for this algorithm for IEEE double-precision arguments shows that the error is bounded by 0.556 ulp, where ulp for this format equals $2^{-52}$. □

9.9 EXERCISES

9.1. Apply the procedure in Section 9.1 to calculate $e^{0.5}$. Assume that the argument $x_0 = 0.5$ and all intermediate results have 12 fractional bits. Prepare a table of all terms of the form $\ln(1 \pm 2^{-i})$ with 12-bit precision.
Compare your rfSult to the exact \-alue of Vi and compafl' thp I'rror to 2- 12 . 9.2. Hf'peal probll'm I, applying th(' procedurl' in Section 9.2 for cakulalin In 0.5. 9.3. Provl' id,'ntity (9.16). 9.4. Prove th6t Equalion (9.55) yields the 581Dl' expnossioll for the tennination algo- rithm (for e") as Equation (9.5-1). 9 9 References 245 9.5. Write a proc(--dIJ((' for calcll1ating Yolzo IIsing nJ\JltipliC'ativt' normali1.ahon. 0<1- vi,>£, a t4'rmination algorithm and diSCIL>;8 its ('fff'Ctiwness. The rffiprocal of thp square root of a ghen n-bit Opl'r-1u<l ran be caloJl"tNI in n teps using the following two ec:1'latioo.s: 9.6. (i) x.+! (ii) y.+1 :s Zo' (1 + s,2- o r l =:E. . (1 + 28.2-' + s:T: U ) = y.' (1 + 3.T') where the :I;'S al'(' St"lecd so th"t x.+J -+ 1 and COJ1.'IIII4'nUy y.+1 -+ yol,JF;,. WI' W'dl1tto examinp the possibility of employing a tl'rruinalion 61gorithm in Ih(l kth step, in order to speed up the computation by rt'plocing the rl'm3inillg (n-k) steps by a single opt'ration. In step k Wi' have "-I 1 - Xlt = 1 -.to n (1 + 8,2-')2 = 0.0.. .0:,,:,,+1'" ':1O and to obtain the final value of y, we calrulatp y" n{l + 3I T '). ,=" For what valu, of k C3n the termination ctIgo)fit.hm be used? Writt' in detail the computation nt'C'dro in this termination algorithm. .. 9.7. Verify that if the factor K = n J l + 2- 2 ' in the call-Illation of sine/cosine is 1..0 "/ 2 replaced b)' J\. = n J l + 2- 2 ', then the deiation from the exact vnlue is 1eN! ,..0 tban 2-". 9.8. Find the reduction proc('(}ul'(' that should be used when calculating 'Iin(x) for x  1f 12 using the algorithm in Section 9.3. 9.9. Show that selccting g(b.) = b, as in Jo:quation (9.18). yields Y..,.I y. -;;- :;;:; -;; X'+I Z, 9.10, After th4' first step in Example 9.J we already obtain .r :;;:; O. Wbv do we havt' to execute the remaining nint' steps'? 9.10 REFERENCES [IJ P. W. BAKER, "Suggestion for 3 fl\.Sl binary sint'/cosine gent'rator," IBEE 1roras. on Compulf'r.t, C-25 (Nov. 
1976), 1131-1137. (2) T. C. CHEN, "Automat.ic computation of eXIJonentials, 10g-dJ'ithms, r6tios, and :;quare roots," IDAf JnunJal Rt'..J. and Det... 16 (Jllly 1972), 380-3&. 
246 I 9. Evaluation of Elementarv Functions [3J B. O. DFLuGISH, "A class of algorithms for automatk evaluation of c'PrtRin el£'- mpntnr) fUIlI tions," Oept. Compo 8ci., Ulliv. of Uljnois, R£'I>. :J99, Junc 197U. 141 M. D. ERCEGOVAC. "Radix-It! £'valuation of certain £'1(,IIl£'ntary funcl,ions," IFJ-:E 1h11l8. on Co mlUi ter.'i , C-22 (June 1973), 561-566. [5J P. L FARMWALD, "Higb handwidth e'Blu6tion of eIPrnf'ntnr' functions," Proc. of 5th Symp. Otl ('omputer Anthmetir (May 1!UH), 139-1.12. [6J D. GOI.I>III':RG, private mmmunicatiou. (7) J. F. HAltT et al., Computer approximations, Wiley. N('w )ork, 1968. (8) G. L. IIA\ ILASD and A. A. Tl's?'r NSKI, "A CORDIC arithmPlj(" pron.'SSOr chip," IEEE 1hm.... on Computers, ('-29 (F('b. 1980),6R-79. 1 9 1 I. KORE:-I and O. I'INATY, "Evaluati elementary functions in a nUlJlerical co- processor bast.od on rational approximatiolls," IEb"E TIuns. on Compllters, 39 (Aug. 1990), 1030-10:17. 1 1 0J P. W. 1\IARKSTFIN, "Computation of l'I£,lIlentary functions on the IBM RISC system/6oo0 processor," IBM Journal Rc.s. and Vev., 34 (Jan. 1990), 111-119. 1111 J.-M. MULLER, Elementary ftm("tion..: Algorithms and implC1IJt'rltation, Birkhauser, 1997. (12J R. AVE, "[mpll'lUentntion of trallSC('ndental nlllctiOH" UII a nuwt'ric processor," Microproce..18ing and Microprogramming, 11 (1983), 221-225. 113J M. J. SCIIULTE Bod E. E. SWARTZI ANDER, "I1ardwar£' designs for pxactly rounded elE.'m('otary functions, IEBB 1hms. 011 Computers, 43 (August 199,1),961-973. 1 14 ) \V. H. SPE{'I\FR, "A class of algorithms for In x, I'xl' X, SiD x, COS .1:, tan-I x, cot-I x." IEEF l'rnns. 011 Electron. ComlJUt 1'"3, 1-'C-14 (Feb. 1965),85-86. 115] S. STOR.... and P. T. P. TANG, "New algorithm for improved transcclIl!£'otal func- tions on lA-64," Proc. of 14th Symp. on Computer Arithnu"tic (April 1999), 1-11. 116) P. T. P. TANG, "Tabl£'-looku}J algorithrru. for £'Iementary functions and their error analysis," Proc. of 10th Symp. 
on Computer Arithmetic (1991), 232-236.
[17] J. E. VOLDER, "The CORDIC trigonometric computing technique," IRE Trans. on Electron. Computers, EC-8 (Sept. 1959), 330-334.
[18] J. S. WALTHER, "A unified algorithm for elementary functions," Spring Joint Computer Conf., Proc., 38 (1971), 379-385.

10 LOGARITHMIC NUMBER SYSTEMS

A number system based on logarithms can simplify multiplication, division, roots, and powers. When logarithms are used, multiplication and division are reduced to addition and subtraction, respectively, and powers and roots are reduced to multiplication and division, respectively. On the other hand, add and subtract operations become more complex. Another major problem is deriving logarithms and antilogarithms quickly and accurately enough to allow conversions to and from the conventional number representations. These conversions always involve approximations, resulting in inaccuracies. Therefore, binary logarithms can be useful only in arithmetic units dedicated to special applications where very few conversions are required but many multiplications and divisions are executed; e.g., real-time digital filters.

10.1 SIGN-LOGARITHM NUMBER SYSTEMS

Let a number $A$ be represented by a sign digit $S_A$ and a logarithm $E_A$ that includes an integer part and a fractional part,

$$S_A E_A = S_A\, a_{k-1} a_{k-2} \cdots a_1 a_0 \,.\, a_{-1} a_{-2} \cdots a_{-l} \tag{10.1}$$

requiring a total of $n = 1 + k + l$ bits. The sign $S_A$ is set to 0 if $A$ is positive, and to 1 if $A$ is negative. $E_A$ is the logarithm of the absolute value of $A$; i.e., $E_A = \log_2 |A|$. The interpretation rule for $S_A E_A$ is thus

$$A = (-1)^{S_A} \cdot 2^{E_A}. \tag{10.2}$$

The base of the exponent may, in general, be different from 2.
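The interpretation rule in Equation (10.2) can be tried out with a short sketch. The function names `lns_decode` and `lns_encode` are illustrative only, and the sketch assumes that the logarithm field is held in two's complement and rounded to nearest, which is one possible choice rather than something fixed by the format itself.

```python
import math

def lns_decode(word: str, k: int, l: int) -> float:
    """Interpret 'S' + k integer bits + '.' + l fraction bits as (-1)**S * 2**E.

    Assumption: the k+l logarithm bits are a two's complement fixed-point number.
    """
    bits = word.replace(".", "")
    if len(bits) != 1 + k + l:
        raise ValueError("expected 1 + k + l bits")
    sign = int(bits[0])
    raw = int(bits[1:], 2)
    if raw >= 1 << (k + l - 1):        # top logarithm bit set: negative logarithm
        raw -= 1 << (k + l)
    exponent = raw / (1 << l)          # place the binary point l bits from the right
    return (-1) ** sign * 2.0 ** exponent

def lns_encode(a: float, k: int, l: int) -> str:
    """Round log2|a| to l fraction bits and pack it behind the sign bit."""
    if a == 0:
        raise ValueError("zero has no ordinary sign-logarithm representation")
    sign = 1 if a < 0 else 0
    raw = round(math.log2(abs(a)) * (1 << l))
    if not -(1 << (k + l - 1)) <= raw < (1 << (k + l - 1)):
        raise OverflowError("logarithm does not fit in k + l bits")
    field = format(raw & ((1 << (k + l)) - 1), f"0{k + l}b")
    return f"{sign}{field[:k]}.{field[k:]}"

print(lns_decode("00001.010", 4, 3))   # E = 1.25
print(lns_encode(-0.35355, 4, 3))      # E = -1.5, negative sign
```

With $k = 4$ and $l = 3$ the word occupies $n = 8$ bits; note that the conversion itself needs a logarithm, which is exactly the cost the chapter warns about.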
[Figure 10.1: plot of $(-1)^{S_A} E_A$ (vertical axis, from -8 to 8) against $A$ (horizontal axis, from -100 to 100).]

FIGURE 10.1 The relationship between $A$ and its representation $(-1)^{S_A} E_A$ in the sign-logarithm system (for $|A| \ge 1$).

To represent numbers smaller than 1, negative logarithms are needed. For this purpose, we may use the two's complement representation or a biased representation, which takes the general form

$$E_A = (a_{k-1} \cdots a_0 \,.\, a_{-1} \cdots a_{-l}) - \text{Bias}.$$

Commonly used values for the bias are $2^{k-1}$ or $2^{k-1} - 1$.

Examining Equation (10.2), one should realize that this is an extreme case of the familiar floating-point system with the significand always equal to 1. As a result, the exponent $E_A$ is allowed to be a mixed number rather than an integer, in order to enable the representation of numbers that are not integral powers of the base. In this number system, as in the floating-point system, zero is not included in the ordinary range of values. If biased logarithms are used, we may decide that $E_A = 0$ represents 0 instead of $2^{E_{\min}}$. Another way to represent zero is to have a special bit in the format indicating that the value is 0, regardless of the value of $E_A$ [8].

Figure 10.1 depicts the relationship, for $|A| \ge 1$, between the real number $A$ and its representation $(-1)^{S_A} E_A$ in the sign-logarithm system. It is evident from this figure that $(-1)^{S_A} E_A$ is monotonic in $A$, and comparison is therefore straightforward. Given two numbers $A = (-1)^{S_A} \cdot 2^{E_A}$ and $B = (-1)^{S_B} \cdot 2^{E_B}$, we first compare their signs $S_A$ and $S_B$; then, if the signs are equal, we compare their logarithms $E_A$ and $E_B$.

Example 10.1
Let $k$ be 4 and $l$ equal 3. Consequently, $n$ equals 8. Suppose that the logarithm $E_A$ is represented in the two's complement method. The following are two representations within the range:

00001.010 represents the numerical value $+2^{(1+\frac{1}{4})} = +2.37841_{10}$
01110.100 represents the numerical value $+2^{-(1+\frac{1}{2})} = +0.35355_{10}$

The largest positive number is 00111.111 $= 2^{(8-\frac{1}{8})} = 234.753_{10}$. The smallest positive number is 01000.000 $= 2^{-8} = 0.003906_{10}$. There is no representation for zero. □

10.2 ARITHMETIC OPERATIONS

In a logarithmic system multiplication and division are simpler operations than addition and subtraction. To calculate the product $P$ of two operands $A$ and $B$ we add their logarithms

$$E_P = E_A + E_B, \tag{10.3}$$

and if these logarithms are biased, we have to then subtract the bias. As with conventional signed-magnitude systems, the sign bit of the product is determined by the modulo 2 addition of the operands' sign bits, $S_P = S_A \oplus S_B$. Similarly, when calculating the quotient $Q = A/B$ we subtract the logarithms

$$E_Q = E_A - E_B, \tag{10.4}$$

and if these logarithms are biased, we must add the bias. The sign bit of the quotient, $S_Q$, is determined in exactly the same way as is done for the product; i.e., $S_Q = S_A \oplus S_B$.

Comparing the multiply and divide operations to their counterparts in a floating-point system, one should note that in the logarithmic system they are exact operations, and no rounding is required. Thus, these two operations do not contribute to the accumulation of computation errors. Overflow and underflow may occur when executing the add or subtract operations in Equations (10.3) and (10.4), but these are easily detected.

In contrast, addition and subtraction of operands that are given by their logarithms are complicated. One brute-force way to perform these operations is through the use of a complete look-up table. However, the size of such a table ($2^{2n} \times n$ bits) is prohibitive for any reasonable value of $n$, the number of bits in an operand. We may therefore use one of the following two alternatives, which still require tables, but of smaller size:

1. Use an antilogarithm table, add and then use a logarithm table. This requires a total of $3 \cdot 2^n \times n$ bits in tables (three tables if the antilogarithms of the two operands are to be read simultaneously).
2. Calculate directly the approximated sum (or difference). This method employs smaller look-up tables and is therefore the one most commonly used.
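The exact operations of this section (multiplication and division by Equations (10.3) and (10.4), and comparison by sign and then logarithm) can be sketched as follows. This is a minimal illustration, assuming each operand is a nonzero pair `(S, E)` with an unbiased logarithm; the function names are made up for the sketch, and overflow of the logarithm field is not checked.

```python
def lns_multiply(a, b):
    """Product: XOR the signs, add the logarithms (Equation 10.3)."""
    (sa, ea), (sb, eb) = a, b
    return sa ^ sb, ea + eb

def lns_divide(a, b):
    """Quotient: XOR the signs, subtract the logarithms (Equation 10.4)."""
    (sa, ea), (sb, eb) = a, b
    return sa ^ sb, ea - eb

def lns_less_than(a, b):
    """A < B: compare signs first, then the logarithms."""
    (sa, ea), (sb, eb) = a, b
    if sa != sb:
        return sa == 1              # a negative number is below any positive one
    if sa == 0:
        return ea < eb              # both positive: smaller logarithm, smaller value
    return ea > eb                  # both negative: larger logarithm, more negative

x = (0, 1.25)    # +2**1.25
y = (1, -1.5)    # -2**-1.5
print(lns_multiply(x, y), lns_divide(x, y), lns_less_than(y, x))
```

No rounding occurs in either arithmetic function, which mirrors the observation that multiplication and division do not add to the accumulated error.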
When calculating the sum (or difference) $C = A \pm B = (-1)^{S_C} \cdot 2^{E_C}$ according to the second method, we distinguish between two cases. In the first, $|A| > |B|$, and we rewrite the expression for $C$ as

$$C = A \pm B = A\left(1 \pm \frac{B}{A}\right).$$

We set $S_C$ equal to $S_A$ and

$$E_C = \log_2 \left|A\left(1 \pm \frac{B}{A}\right)\right| = \log_2 |A| + \log_2 \left|1 \pm \frac{B}{A}\right| = E_A + \Phi(E_A - E_B), \tag{10.5}$$

where the function $\Phi(E_A - E_B)$ is defined as

$$\Phi(E_A - E_B) = \log_2 \left|1 \pm \frac{B}{A}\right| = \log_2 \left|1 \pm 2^{-(E_A - E_B)}\right|. \tag{10.6}$$

The value of $\Phi(x)$, where $x = E_A - E_B > 0$, must be precalculated and stored in a table. For convenience, two separate tables are commonly used, one for $\Phi^{+}(x) = \log_2(1 + 2^{-x})$ and a second for $\Phi^{-}(x) = \log_2(1 - 2^{-x})$. Each table can be implemented in a ROM of a size not larger than $2^n \times n$. The size of the table for $\Phi^{+}(x)$ may be reduced to $2^n \times l$, since $\Phi^{+}(x) \le 1$ for $x \ge 0$. In other words, $\Phi^{+}(x)$ is always a fraction and will never require more than $l$ bits.

In the second case, $|A| < |B|$, so $S_C = S_B$ and $E_C = E_B + \Phi(E_B - E_A)$. Consequently, the steps in both cases are:

1. Compare $A$ and $B$ to determine the larger of the two.
2. Calculate $x = E_A - E_B$ or $E_B - E_A$.
3. Read $\Phi^{+}(x)$ or $\Phi^{-}(x)$ from the appropriate table and add it to either $E_A$ or $E_B$.

The first two steps can be executed in parallel if the data flow depicted in Figure 10.2 is adopted [7]. As a result, the total time needed is approximately

$$T_{ADD/SUB} \approx 2 \cdot T_{ADDER} + T_{ROM}. \tag{10.7}$$

[Figure 10.2: block diagram with inputs $E_A$ and $E_B$, an adder, a ROM, and output $E_C$.]

FIGURE 10.2 Adder/subtracter for sign-logarithm numbers.

The only source of error when performing an addition (or subtraction) is the rounding of $\Phi^{+}$ or $\Phi^{-}$. The values stored in the ROM should be rounded (e.g., to nearest, see Chapter 4) rather than truncated and the error introduced will therefore be no larger than $\frac{1}{2} \cdot 2^{-l}$.

The size of the above two tables for $\Phi^{+}$ and $\Phi^{-}$ is the major obstacle when attempting to implement an arithmetic unit that operates on sign-logarithm numbers. For $n \ge 20$, the required ROM becomes prohibitively large. Therefore, several strategies for reducing the size of the look-up tables have been suggested and implemented. One approach is to partition the table of size $2^n \times n$ into several smaller tables [7]. Since $\Phi(x)$ decreases rapidly with increasing $x$, the size of the corresponding partial tables can be substantially reduced. Another approach is to use a combination of linear approximation and a look-up table [8]. In either case, there is no need to generate $\Phi(x)$ for very large values of $x$, since the value of $\Phi(x)$ becomes smaller than the resolution of the system.

Example 10.2
A 20-bit logarithmic number system processor has been designed and implemented as a single VLSI chip [7]. The 20 bits include a sign bit and 19 bits for an exponent in two's complement representation. These 19 bits are partitioned into a sign bit, a 6-bit integer part, and a 12-bit fractional part. The two look-up tables for $\Phi^{+}$ and $\Phi^{-}$, if fully implemented in ROM, would require $2^{18} \cdot 12 + 2^{18} \cdot 18 = 7.8$ Mbits. The size of these tables can be substantially reduced as described next. First, there is no need to generate values that are smaller than $2^{-12}$, the resolution of the selected number system. The solution of $\Phi^{+}(x) = \log_2(1 + 2^{-x}) = 2^{-12}$ is $x = 12.52$, and the solution of $\Phi^{-}(x) = \log_2(1 - 2^{-x}) = -2^{-12}$ is also $x = 12.52$. Thus, no look-up table entries are required for $x \ge 12.52$. Consequently, there is no need to provide more than 4 bits in the integer part of $x$ as inputs to the look-up tables, allowing $x \le 15$. These four bits, together with the 12 fractional bits, constitute the required 16 inputs to the ROM instead of 18. The remaining range [0,15] is then divided into 11 smaller intervals: [0.0,0.5), [0.5,1.0), [1.0,2.0), [2.0,3.0), [3.0,4.0), [4.0,5.0), [5.0,6.0), [6.0,7.0), [7.0,8.0), [8.0,9.0), and [9.0,15.0]. For the first 10 intervals ROM1 through ROM10 are used, while for the last one a PLA implementation has proven to be more economical, providing values for the subinterval [9,12.52]. The reason behind the partition of the interval [0.0,9.0] into 10 smaller intervals is that the graph for $\Phi(x)$ becomes flat for large values of $x$. As a result, the number of input bits to the ROM and the number of output bits decrease rapidly. The total ROM space employed is 83.55 Kbits. This approach, if applied to a 30-bit format, would require a ROM space of 70 Mbits [8]. Therefore, a different approach, using a linear approximation, was proposed to further reduce the size of the look-up tables. With this approach, the size of the ROM decreases. However, the execution time increases. □

An important advantage of the logarithmic system is that several other arithmetic operations can be executed in a straightforward manner. For example, the reciprocal of a given number $A = (-1)^{S_A} 2^{E_A}$ is simply $A^{-1} = (-1)^{S_A} 2^{-E_A}$, and only the two's complement of $E_A$ needs to be calculated. Squaring a given number is accomplished by $A^2 = 2^{2E_A}$, requiring a shift left operation. If the logarithm is biased, the bias must be subtracted. The square root of $A$ is given by $\sqrt{A} = 2^{E_A/2}$, requiring a shift right operation. Here, if the logarithm is biased, the bias must be added first. Exponentiation is also simplified, since $A^B = 2^{B \cdot E_A}$, and only a fixed-point multiplication is required.

10.3 COMPARISON TO BINARY FLOATING-POINT NUMBERS

As previously noted, the logarithmic system is an extreme case of the conventional floating-point system. Therefore, a more detailed comparison between the two should be made. Two important characteristics that need to be analyzed are the range and the accuracy of representation. The range of logarithms $E_A$ using $k + l$ bits is $-2^{k-1} \le E_A \le 2^{k-1} - 2^{-l} \approx 2^{k-1}$. Therefore, the range for positive numbers in the sign-logarithm system is

$$2^{-2^{k-1}} \le A^{+} \le 2^{2^{k-1}}. \tag{10.8}$$

This range should be compared to the range of positive normalized floating-point numbers with $\beta = 2$ and $n = e + m + 1$, where $e$ and $m$ are the number of bits in the exponent and the normalized fractional significand, respectively. The latter range is, as shown in Chapter 4,

$$\frac{1}{2} \cdot 2^{-2^{(e-1)}} \le F^{+} \le (1 - 2^{-m}) \cdot 2^{2^{(e-1)} - 1}. \tag{10.9}$$

If we set $e = k$ and, consequently, $m = l$, the ranges remain about the same.

To measure the accuracy of representation in the logarithmic system, we calculate the relative step size, defined as

$$\frac{2^{(E_A + 2^{-l})} - 2^{E_A}}{2^{E_A}} = 2^{2^{-l}} - 1. \tag{10.10}$$

The maximum relative representation error, which equals half the relative step size, is thus a constant independent of $E_A$. For the floating-point system the relative step size is

$$\frac{(M + 2^{-m}) 2^E - M \cdot 2^E}{M \cdot 2^E} = \frac{2^{-m}}{M}. \tag{10.11}$$

To compare the two step sizes, assume that $l = m$. Since

$$\lim_{l \to \infty} \frac{2^{2^{-l}} - 1}{2^{-l}} = 0.693,$$

and for normalized fractions $1 \le 1/M \le 2$, the following inequality holds:

$$2^{2^{-l}} - 1 < \frac{2^{-l}}{M}.$$

As a result, the representation error in the logarithmic system is slightly lower than that in the corresponding floating-point system (with $m = l$), especially for small numbers. Numbers with the same exponent are equally spaced in the floating-point number system, while in the sign-logarithm system smaller numbers are denser.

Example 10.3
The 20-bit logarithmic processor reported in [7] has the range $2^{-64} \le A^{+} < 2^{64}$ with a precision of $2^{-12.52}$. The range of this system is twice that of a 20-bit floating-point format with a 12-bit significand and 7-bit exponent. The precision is slightly better, $2^{-12.52}$ versus $2^{-12}$. □

10.4 CONVERSIONS TO/FROM CONVENTIONAL REPRESENTATIONS

Conversions to the logarithmic number system from either a fixed-point system or a floating-point system require the calculation of logarithms. The opposite conversions require the calculation of antilogarithms. For example, to convert the floating-point number $(-1)^S \cdot M \cdot 2^E$ to the logarithmic system we must calculate $F = \log_2 M$ to obtain

$$(-1)^S \cdot M \cdot 2^E = (-1)^S \cdot 2^F \cdot 2^E = (-1)^S \cdot 2^{E+F}.$$
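The three-step addition procedure of Section 10.2 and the floating-point conversion of Section 10.4 can be sketched together. This is a rough model under stated assumptions: operands are nonzero pairs `(S, E)` with unbiased logarithms, the function `phi` computes $\Phi^{+}$ or $\Phi^{-}$ on the fly and rounds it to $l$ fraction bits as a stand-in for the ROM contents, and all names are illustrative rather than taken from any published design.

```python
import math

def phi(x: float, same_sign: bool, l: int) -> float:
    """Stand-in for the ROM: Phi+(x) or Phi-(x), rounded to l fraction bits."""
    if same_sign:
        exact = math.log2(1 + 2.0 ** -x)
    else:
        exact = math.log2(1 - 2.0 ** -x)
    return round(exact * 2 ** l) / 2 ** l

def lns_add(a, b, l: int = 12):
    """C = A + B for nonzero (S, E) pairs; subtract by flipping the sign bit of b."""
    (sa, ea), (sb, eb) = a, b
    if ea < eb:                                  # step 1: larger magnitude becomes A
        (sa, ea), (sb, eb) = (sb, eb), (sa, ea)
    x = ea - eb                                  # step 2: x >= 0
    if sa != sb and x == 0:
        raise ArithmeticError("result is zero, which has no ordinary representation")
    return sa, ea + phi(x, sa == sb, l)          # step 3: table value added to E_A

def float_to_lns(v: float, l: int = 12):
    """Convert (-1)**S * M * 2**E to (S, E + F) with F = log2(M), rounded to l bits."""
    if v == 0:
        raise ValueError("zero has no ordinary sign-logarithm representation")
    s = 1 if v < 0 else 0
    m, e = math.frexp(abs(v))                    # abs(v) == m * 2**e with 0.5 <= m < 1
    log = e + math.log2(m)
    return s, round(log * 2 ** l) / 2 ** l

six = float_to_lns(6.0)
two = float_to_lns(2.0)
print(lns_add(six, two))                 # logarithm close to log2(8) = 3
print(lns_add(six, (1, two[1])))         # 6 - 2: logarithm close to log2(4) = 2
```

The rounding inside `phi` is the only error introduced by `lns_add` itself; any error already present in the operands comes from the conversion.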
[Figure 10.3: plot of $\log_2(1+x)$ and its linear approximation for $0 \le x \le 1$.]

FIGURE 10.3 A linear approximation for $\log_2(1 + x)$.

The logarithm of a given operand can be found in a look-up table whose size grows exponentially with the number of operand bits. Another way to find it is to calculate an approximation to the logarithm. Let $N$ be a binary number $z_u z_{u-1} \cdots z_0 \,.\, z_{-1} \cdots z_{-w}$ and let $z_t$ be the most significant nonzero bit of $N$. The value of this number can be written as

$$N = 2^t + \sum_{i=-w}^{t-1} 2^i z_i = 2^t \left(1 + \sum_{i=-w}^{t-1} 2^{i-t} z_i\right) = 2^t (1 + x) \tag{10.12}$$

where $x$ is a fraction, $0 \le x < 1$. Clearly,

$$\log_2 N = t + \log_2(1 + x) \tag{10.13}$$

where $t$ is the characteristic of the logarithm and $\log_2(1 + x)$ constitutes the mantissa.

A linear approximation for $\log_2(1 + x)$, suggested in [5], uses only the linear term in the Taylor series; i.e., $\log_2(1 + x) \approx x$. This approximation is shown in Figure 10.3, and its error is $\epsilon(x) = \log_2(1 + x) - x$. The maximum approximation error is found by differentiating $\epsilon(x)$, obtaining $\max \epsilon(x) = 0.08639$ for $x = 0.44269$.

The hardware implementation of this linear approximation is very simple. The operand $N$ is stored in a shift register and a counter with an initial value of $u$ is used. $N$ is shifted to the left repeatedly until a 1 is shifted out, and at the same time, the counter is decremented once for every shift operation. The contents of the counter at the end of the operation are $u - (u - t) = t$, and the contents of the shift register are the approximated mantissa $x$.

Example 10.4
Let $N = 0111.01 = 7.25_{10}$ and $u = 3$. We shift out a single zero, and then a 1. Hence, $t = 2$ and at the end the register contains 1101, interpreted as $.1101 = .8125_{10}$. Therefore, $\log_2 7.25 \approx 10.1101_2 = 2.8125_{10}$. The accurate value is $2.85798_{10}$.

A similar approximation can be used for the antilogarithm. Given $t + y$, where $y$ is a fraction, we need to calculate $N = 2^{t+y} = 2^t 2^y \approx 2^t (1 + y)$. This is the same approximation used before, $y \approx \log_2(1 + y)$. This can be implemented in hardware by placing a 1 in position $t$ and placing the fraction $y$ next to it. For example, given $100.1010 = 4.625_{10}$, $t = 4$ and the approximated antilogarithm is $11010.00_2 = 26_{10}$. The correct value is $2^{4.625} = 24.67537_{10}$. □

A piecewise linear approximation has also been suggested [1]. The interval [0,1] is divided into four equal subintervals, and a linear approximation of the form $x + a \cdot f(x) + b$ is used for each subinterval, where $f(x)$ is either $x$ or $\bar{x}$, the one's complement of $x$. The constants $a$ and $b$ are selected so as to minimize the error and be fractions with powers of 2 as denominators. The resulting expression is

$$\log_2(1 + x) \approx \begin{cases} x + \frac{5}{16} x & 0 \le x < \frac{1}{4} \\ x + \frac{5}{64} & \frac{1}{4} \le x < \frac{1}{2} \\ x + \frac{1}{8} \bar{x} + \frac{3}{128} & \frac{1}{2} \le x < \frac{3}{4} \\ x + \frac{1}{4} \bar{x} & \frac{3}{4} \le x < 1 \end{cases} \tag{10.14}$$

The approximation error is $-0.006 \le \epsilon \le 0.008$. The total error range, 0.014, is lower than that of the linear approximation by a factor of 6.

Higher precision can be achieved by using a look-up table implemented in a ROM. However, the size of this ROM is prohibitively large for reasonable values of the number of operand bits. A more economical implementation based on a PLA has been suggested in [4].

10.5 EXERCISES

10.1. (a) For a sign-logarithm system with $n = 16$, $k = 6$, and $l = 9$, find the smallest and largest positive numbers, assuming base 2 and that the two's complement method is used to represent negative logarithms. Calculate the maximum relative representation error.
(b) Repeat (a) for base 10.

10.2. Given the sign-logarithm system defined in problem 1, show the representation of the two operands $X = 2.5$ and $Y = 3.7$ in this system and perform the operations $X + Y$, $X - Y$, $X \cdot Y$, $X / Y$, $1/X$, $X^2$, $\sqrt{X}$, and $X^Y$. Calculate the entries of $\Phi^{+}$ and $\Phi^{-}$ that are needed.

10.3. Write a Boolean expression for the signal selecting the table of $\Phi^{+}(x)$ given the sign bits of the two operands, $S_A$ and $S_B$, and the signals ADD and SUB(TRACT) indicating the type of operation being executed.

10.4. A 32-bit format for the sign-logarithm system has been suggested with $k = 8$, $l = 23$, a base 2, and a bias of 127 [3]. This results in a format that is very close to the single-precision format of the IEEE standard (see Chapter 4). Write down an expression for the value of a nonzero number $X$ given the sign bit $S_X$ and the logarithm $E_X$. Use the notation $E_X = I + F$, where $I$ is the $k$-bit integer and $F$ is the $l$-bit fraction. Compare the range of these two number representations and write the rule for converting an IEEE floating-point number to the sign-logarithm system. Estimate the conversion error and suggest a way to reduce it.

10.5. Determine the minimum number of inputs and outputs needed for ROM6, which corresponds to the range [4.0,5.0) of $\Phi^{+}(x)$ for the 20-bit logarithmic processor described in the text [7].

10.6. Show that the maximum error of the approximation $\log_2(1 + x) \approx x$ is 0.08639. Suppose that the approximation $\log_2(1 + x) \approx x + c_i$ is used instead, where the interval [0,1] is divided into four subintervals, as in Equation (10.14), and $c_i$ is a constant employed for the $i$th subinterval ($i = 1, 2, 3, 4$). Find the best values for the $c_i$'s so as to minimize the error, and calculate the resulting maximum error.

10.7. Write an expression for the distance between two adjacent numbers in the sign-logarithm system and compare it to that of the corresponding floating-point system. Show that smaller numbers are denser in the sign-logarithm system.

10.8. To represent values in the range $|A| \ge 1$ we may restrict $E_A = \log_b |A|$ to positive integers. What is the range of values that the base $b$ may assume?

10.6 REFERENCES

[1] M. COMBET, H. VAN ZONNEVELD, and L. VERBEEK, "Computation of the base two logarithm of binary numbers," IEEE Trans. on Elect. Computers, EC-14 (Dec. 1965), 863-867.
[2] A. D. EDGAR and S. C. LEE, "FOCUS microcomputer number system," Communications of the ACM, 22 (March 1979), 166-177.
[3] F. S. LAI and C. E. WU, "A hybrid number system processor with geometric and complex arithmetic capabilities," IEEE Trans. on Computers, 40 (Aug. 1991), 952-961.
[4] H. Y. LO and Y. AOKI, "Generation of a precise binary logarithm with difference grouping programmable logic array," IEEE Trans. on Computers, C-34 (Aug. 1985), 681-691.
[5] J. N. MITCHELL, JR., "Computer multiplication and division using binary logarithms," IRE Trans. on Elect. Computers, EC-11 (Aug. 1962), 512-517.
[6] E. E. SWARTZLANDER, JR., and A. G. ALEXOPOULOS, "The sign/logarithm number system," IEEE Trans. on Computers, C-24 (Dec. 1975), 1238-1242.
[7] F. J. TAYLOR et al., "A 20-bit logarithmic number system processor," IEEE Trans. on Computers, 37 (Feb. 1988), 190-199.
[8] L. K. YU and D. M. LEWIS, "A 30-bit integrated logarithmic number system processor," IEEE J. of Solid-State Circuits, 26 (Oct. 1991), 1433-1440.
11 THE RESIDUE NUMBER SYSTEM

The residue number system is an integer number system whose most important property is that additions, subtractions, and multiplications are inherently carry-free. As a result we may add, subtract, or multiply numbers in one step regardless of the length of the numbers involved. Unfortunately, other arithmetic operations, like division, comparison, and sign detection, are very complex and slow. Another problem with the residue number system is that it is an integer number system and, as a result, it is very inconvenient to represent fractions. Consequently, the residue system has not been seriously considered for use in general-purpose computers. However, for some special-purpose applications such as many types of digital filters [6], in which the number of additions and multiplications is substantially greater than the number of invocations of magnitude comparison, overflow detection, division, and alike, the residue system can be very attractive.

11.1 PRELIMINARIES

A residue number system is characterized by a base that is not a single radix but an $N$-tuple of integers $(m_N, m_{N-1}, \ldots, m_1)$. Each of these $m_i$ ($i = 1, 2, \ldots, N$) is called a modulus. An integer $X$ is represented in the residue number system by an $N$-tuple $(x_N, x_{N-1}, \ldots, x_1)$ where $x_i$ is a nonnegative integer satisfying

$$X = m_i \cdot q_i + x_i \tag{11.1}$$
where $q_i$ is the largest integer such that $0 \le x_i \le (m_i - 1)$. $x_i$ is known as the residue of $X$ modulo $m_i$, and the notations $X \bmod m_i$ and $|X|_{m_i}$ are commonly used.

TABLE 11.1 The representation of numbers in the $(m_2, m_1) = (3,2)$ residue system.

    X    x2   x1
   -4     2    0
   -3     0    1
   -2     1    0
   -1     2    1
    0     0    0
    1     1    1
    2     2    0
    3     0    1
    4     1    0
    5     2    1
    6     0    0
    7     1    1
    8     2    0

Example 11.1
Consider a two-modulus system with the moduli $m_2 = 3$ and $m_1 = 2$. The representation of $X = 5$ in this residue system is $(x_2, x_1)$ where

$x_2 = |5|_3 = 2$, since $5 = 3 \cdot 1 + 2$,
$x_1 = |5|_2 = 1$, since $5 = 2 \cdot 2 + 1$.

Therefore, the residue representation of 5 is (2,1). The number $X$ does not necessarily have to be a positive integer. For example, if $X = -2$, then $x_1 = |-2|_2 = 0$, since $-2 = 2 \cdot (-1) + 0$. Also, $-2 = 3 \cdot (-1) + 1$, yielding $x_2 = 1$. Note that $x_i$ is by definition nonnegative. Thus, $-2$ is represented by $(x_2, x_1) = (1,0)$.

Table 11.1 includes the representation of integers in the range [-4,8] in the $(m_2, m_1) = (3,2)$ residue number system. As is apparent from Table 11.1, the residue representation of a number is unique. However, the converse is not true, and two or more numbers may have the same representation. For example, 1 and 7 are represented by (1,1). Consequently, we must limit the range of the numbers to be represented. As we can see from Table 11.1, the residue representation is periodic and the period in this case is 6. There are only six different representations in the residue system with the moduli $(m_2, m_1) = (3,2)$, since $x_1$ can assume two possible values and $x_2$ can assume three. We must therefore limit the range to include only six numbers. Two such possible ranges are marked in the table. One is $-3 \le X \le 2$, and the other is $0 \le X \le 5$. □

It has been shown [7] that, in general, the number of different representations, and, as a result, the number of elements in the useful range of the residue system is the least common multiple of the moduli, denoted by

$$M = \mathrm{l.c.m}(m_1, m_2, \ldots, m_N). \tag{11.2}$$

The least common multiple of the moduli is the smallest integer that has all the values of $m_i$ as divisors. In the above example $M = \mathrm{l.c.m}(2,3) = 2 \cdot 3 = 6$, but for $m_1 = 2$ and $m_2 = 4$, $M = \mathrm{l.c.m}(2,4) = 4$. Hence, in order to get the largest possible range

$$M = \prod_{i=1}^{N} m_i \tag{11.3}$$

we must select moduli that are pairwise relatively prime. Two moduli $m_i$ and $m_j$ are said to be relatively prime if 1 is their greatest common divisor. This is usually written as $\mathrm{g.c.d}(m_i, m_j) = 1$. For example, 4 and 9 are relatively prime, although neither in itself is prime.

For a given $M$, if only nonnegative integers are needed, the range can be set to $[0, M - 1]$. If, on the other hand, negative numbers are also desired, then the range can be set to $[-(M-1)/2, (M-1)/2]$ if $M$ is odd, or $[-M/2, (M/2 - 1)]$ if $M$ is even.

Examining the entries of Table 11.1 in the range [0,5] we should realize that a magnitude comparison between two numbers is not simple. For example, (2,1) represents a number that is larger than the number represented by (2,0), but (1,1) represents a number smaller than that represented by (0,1). This stems from the fact that, unlike the conventional number systems, the residue system is not weighted. Also, if negative numbers are included in the range then the sign of a number is not apparent from its residue representation.

11.2 ARITHMETIC OPERATIONS

The basis for performing an addition in the residue system is the identity

$$|X + Y|_{m_i} = \big| |X|_{m_i} + |Y|_{m_i} \big|_{m_i} = |x_i + y_i|_{m_i} \tag{11.4}$$
or, in general, when adding the $k$ operands $X_1, X_2, \ldots, X_k$,

$$\left| \sum_{j=1}^{k} X_j \right|_{m_i} = \left| \sum_{j=1}^{k} |X_j|_{m_i} \right|_{m_i}. \tag{11.5}$$

Similarly, the identity for multiplication is

$$|X \cdot Y|_{m_i} = \big| |X|_{m_i} \cdot |Y|_{m_i} \big|_{m_i} = |x_i \cdot y_i|_{m_i} \tag{11.6}$$

or, in general,

$$\left| \prod_{j=1}^{k} X_j \right|_{m_i} = \left| \prod_{j=1}^{k} |X_j|_{m_i} \right|_{m_i}. \tag{11.7}$$

The proofs of the above equations are straightforward [7].

Example 11.2
We add the numbers $X = 1$ and $Y = 2$ in the $(m_2, m_1) = (3,2)$ residue system. The representations for $X$ and $Y$ are (1,1) and (2,0), respectively. Therefore,

$|x_2 + y_2|_{m_2} = |1 + 2|_3 = 0$
$|x_1 + y_1|_{m_1} = |1 + 0|_2 = 1$

The final result is thus (0,1), which represents the value 3. Multiplying the two numbers $X$ and $Y$ yields

$|x_2 \cdot y_2|_{m_2} = |1 \cdot 2|_3 = 2$
$|x_1 \cdot y_1|_{m_1} = |1 \cdot 0|_2 = 0$

so that $X \cdot Y = (2,0)$, representing the value 2. □

For subtraction we define the additive inverse of a number $c$ modulo $m_i$ as follows:

$$|-c|_{m_i} = |m_i - c|_{m_i} \quad (\text{since } |m_i|_{m_i} = 0). \tag{11.8}$$

For example, $|-2|_3 = |3 - 2|_3 = 1$. In other words, the inverse of a number may be formed by "complementing" each residue with respect to its modulus. As for addition, the equation for subtraction is

$$|X - Y|_{m_i} = \big| |X|_{m_i} - |Y|_{m_i} \big|_{m_i} = |x_i - y_i|_{m_i}. \tag{11.9}$$

Using the definition of the additive inverse, the term $|x_i - y_i|_{m_i}$ can be replaced by $\big| x_i + |-y_i|_{m_i} \big|_{m_i}$. For example, subtracting $Y = 3$ from $X = 5$ in the $(m_2, m_1) = (3,2)$ residue system yields

$|x_2 - y_2|_{m_2} = |2 - 0|_3 = |2 + 0|_3 = 2$
$|x_1 - y_1|_{m_1} = |1 - 1|_2 = |1 + 1|_2 = 0$

so that $X - Y = (2,0)$, representing the value 2.

Example 11.3
Consider the residue number system with the set of four moduli $(m_4, m_3, m_2, m_1) = (7,5,3,2)$. These moduli are pairwise relatively prime and therefore

$$M = \mathrm{l.c.m}(m_1, \ldots, m_4) = \prod_{i=1}^{4} m_i = 210.$$

Performing the addition and multiplication of the two operands $X = 3$ and $Y = 4$, represented by (3,3,0,1) and (4,4,1,0), respectively, yields

          (7 5 3 2)               (7 5 3 2)
      3   (3 3 0 1)           3   (3 3 0 1)
    + 4   (4 4 1 0)         x 4   (4 4 1 0)
      7   (0 2 1 1)          12   (5 2 0 0)

One can verify that the results (0,2,1,1) and (5,2,0,0) represent the expected values 7 and 12, respectively. But, when the following addition is performed,

                     (7 5 3 2)
               206   (3 1 2 0)
             +   7   (0 2 1 1)
    should be  213   (3 3 0 1)

the result (3,3,0,1) represents the value 3, which satisfies $3 = |213|_{210}$, and we clearly have an overflow situation, which is difficult to identify. □

11.2.1 Multiplicative Inverse

The multiplicative inverse of a number $c$ modulo $m$ is a number $b$, $0 \le b \le (m - 1)$, satisfying $|cb|_m = 1$. $b$ is denoted by $\left|\frac{1}{c}\right|_m$. Any number $c$ has an additive inverse $|-c|_m$, but the multiplicative inverse $\left|\frac{1}{c}\right|_m$ does not always exist. The inverse $\left|\frac{1}{c}\right|_m$ exists if and only if $\mathrm{g.c.d}(c, m) = 1$ and $|c|_m \ne 0$. If these conditions are satisfied then $\left|\frac{1}{c}\right|_m$ is unique. For example,

        m = 5                  m = 6
    c   |1/c|_m            c   |1/c|_m
    1     1                1     1
    2     3                2     None   g.c.d(2,6) = 2
    3     2                3     None   g.c.d(3,6) = 3
    4     4                4     None   g.c.d(4,6) = 2
                           5     5

If $m$ is a prime number, then for every possible value $c$ satisfying $1 \le c \le m - 1$, $\mathrm{g.c.d}(c, m) = 1$ and the multiplicative inverse exists.
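The digit-wise identities of this section, together with the additive and multiplicative inverses, can be sketched in a few lines. The helper names below are illustrative; residues are kept in the order $(x_N, \ldots, x_1)$ used in the text, and the modular inverse relies on Python's built-in three-argument `pow` with exponent -1 (available from Python 3.8).

```python
from math import gcd

MODULI = (7, 5, 3, 2)   # (m4, m3, m2, m1), the four-modulus system used in the examples

def to_residues(x, moduli=MODULI):
    """Residue digits of x; Python's % already yields values in 0..m-1 for negative x."""
    return tuple(x % m for m in moduli)

def rns_add(x, y, moduli=MODULI):
    """Digit-wise addition, no carries between digits (Equation 11.4)."""
    return tuple((a + b) % m for a, b, m in zip(x, y, moduli))

def rns_mul(x, y, moduli=MODULI):
    """Digit-wise multiplication (Equation 11.6)."""
    return tuple((a * b) % m for a, b, m in zip(x, y, moduli))

def rns_neg(x, moduli=MODULI):
    """Additive inverse: complement each residue with respect to its modulus."""
    return tuple((m - a) % m for a, m in zip(x, moduli))

def rns_sub(x, y, moduli=MODULI):
    """Subtraction as addition of the additive inverse (Equation 11.9)."""
    return rns_add(x, rns_neg(y, moduli), moduli)

def mult_inverse(c, m):
    """|1/c|_m, or None when g.c.d(c, m) != 1 or |c|_m == 0."""
    if c % m == 0 or gcd(c, m) != 1:
        return None
    return pow(c, -1, m)

x, y = to_residues(3), to_residues(4)
print(rns_add(x, y), rns_mul(x, y))
print([mult_inverse(c, 6) for c in range(1, 6)])
```

Adding the residues of 206 and 7 with `rns_add` returns the representation of 3, reproducing the undetected overflow of Example 11.3.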
11.3 THE ASSOCIATED MIXED-RADIX SYSTEM

Magnitude comparison, sign detection, and overflow detection for the residue number system can be facilitated by converting the given residue representations into the associated mixed-radix number system. This is a weighted number system, with the representation for a number $X$ given by

$$X = a_N \cdot (m_{N-1} \cdot m_{N-2} \cdots m_1) + \cdots + a_3 \cdot (m_2 \cdot m_1) + a_2 \cdot m_1 + a_1 \tag{11.10}$$

with the digits $a_i$ satisfying

$$0 \le a_i < m_i; \quad i = 1, 2, \ldots, N. \tag{11.11}$$

Being a weighted number system implies that magnitude comparison is straightforward. For example, the values 0, 1, 2, 3, 4 and 5 in the mixed-radix system associated with the (3,2) residue system (see Table 11.1) are represented by (0,0), (0,1), (1,0), (1,1), (2,0), and (2,1), respectively. The value of a pair $(a_2, a_1)$ in this mixed-radix system is $2 \cdot a_2 + a_1$.

Example 11.4
In the mixed-radix system associated with the $(m_4, m_3, m_2, m_1) = (7,5,3,2)$ residue system, a number $X$ is represented by $(a_4, a_3, a_2, a_1)$, where

$$X = 30 \cdot a_4 + 6 \cdot a_3 + 2 \cdot a_2 + a_1$$

and the digits $a_i$ satisfy $0 \le a_4 < 7$, $0 \le a_3 < 5$, $0 \le a_2 < 3$ and $0 \le a_1 < 2$. The numbers 43 and 37 are represented by (1,3,1,1) and (2,2,1,1) in the given residue system, respectively. The corresponding representations in the associated mixed-radix system are (1,2,0,1) and (1,1,0,1), respectively. These last two representations can be compared, indicating that 43 is greater than 37. □

Any two numbers in a given residue system can be compared by converting them into the associated mixed-radix system. Converting a number represented by $(x_N, x_{N-1}, \ldots, x_1)$ in the residue system to the associated mixed-radix representation $(a_N, a_{N-1}, \ldots, a_1)$ is performed using the following equations [7]:

$$\begin{aligned} a_1 &= X \bmod m_1 = x_1 \\ a_2 &= (X - a_1) \left|\frac{1}{m_1}\right| \bmod m_2 \\ a_3 &= \left( (X - a_1) \left|\frac{1}{m_1}\right| - a_2 \right) \left|\frac{1}{m_2}\right| \bmod m_3 \end{aligned} \tag{11.12}$$

This calculation can be done in residue arithmetic, as can be easily verified through the following representation of the procedure in Equation (11.12):

$$Y_{i+1} = (Y_i - a_i) \left|\frac{1}{m_i}\right| \quad \text{with } Y_1 = X, \qquad a_i = Y_i \bmod m_i. \tag{11.13}$$

Example 11.5
To convert a number $X$ represented by $(x_4, x_3, x_2, x_1)$ in the residue system with the moduli $(m_4, m_3, m_2, m_1) = (7,5,3,2)$ to the associated mixed-radix system, the following equations can be used:

$$\begin{aligned} a_1 &= X \bmod 2 = x_1, \\ a_2 &= (X - a_1) \left|\tfrac{1}{2}\right| \bmod 3, \\ a_3 &= \left( (X - a_1) \left|\tfrac{1}{2}\right| - a_2 \right) \left|\tfrac{1}{3}\right| \bmod 5, \\ a_4 &= \left( \left( (X - a_1) \left|\tfrac{1}{2}\right| - a_2 \right) \left|\tfrac{1}{3}\right| - a_3 \right) \left|\tfrac{1}{5}\right| \bmod 7. \end{aligned}$$

It is more convenient to follow the algorithm in Equation (11.13) and execute the conversion in the residue system. For example, we convert the number 43 represented by (1,3,1,1) as follows: $Y_1 = (1,3,1,1)$ and therefore, $a_1 = Y_1 \bmod 2 = x_1 = 1$. To obtain $Y_2$ we first subtract $a_1$ from $Y_1$, yielding (0,2,0,-). Note that only the first three digits in $Y_2$ are of interest, since $a_1$ is already known. We then multiply by $\left|\frac{1}{2}\right|$, which equals (4,3,2,-), obtaining $Y_2 = (0,1,0,-)$. Thus, $a_2 = Y_2 \bmod 3 = 0$. Subtracting $a_2 = 0$ yields (0,1,-,-). Next we multiply by $\left|\frac{1}{3}\right|$, which equals (5,2,-,-), yielding $Y_3 = (0,2,-,-)$. Therefore, $a_3 = Y_3 \bmod 5 = 2$. Subtracting $a_3 = 2$ we get (5,-,-,-). We then multiply by $\left|\frac{1}{5}\right|_7 = 3$, yielding $Y_4 = (1,-,-,-)$. Thus, $a_4 = 1$ and the representation of 43 in the mixed-radix system is $(a_4, a_3, a_2, a_1) = (1,2,0,1)$. □

The mixed-radix system is useful for overflow detection as well. For this purpose, we should add a redundant modulus $m_{N+1}$ to the basic set of $N$ moduli. Here, the term redundant modulus means that we use only the range determined by the original $N$ moduli. For overflow detection we convert the given representation $(x_{N+1}, x_N, \ldots, x_1)$ to the associated mixed-radix system. If $a_{N+1} \ne 0$ then an overflow has occurred.
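The recurrence in Equation (11.13) and the redundant-modulus overflow check can be sketched as follows. The function name `to_mixed_radix` and the choice of 11 as the redundant modulus are assumptions made for this illustration; any modulus relatively prime to the others would serve.

```python
def to_mixed_radix(residues, moduli):
    """Convert (x_N, ..., x_1) to mixed-radix digits (a_N, ..., a_1) via Equation (11.13).

    All work is done digit by digit in residue arithmetic; the moduli must be
    pairwise relatively prime so that every inverse |1/m_i| exists.
    """
    y = list(reversed(residues))      # y[0] is the digit for m_1
    m = list(reversed(moduli))
    digits = []
    for i in range(len(m)):
        a = y[i]                      # a_i = Y_i mod m_i
        digits.append(a)
        for j in range(i + 1, len(m)):
            # Y_{i+1} = (Y_i - a_i) * |1/m_i|, carried out in each remaining modulus
            y[j] = ((y[j] - a) * pow(m[i], -1, m[j])) % m[j]
    return tuple(reversed(digits))

moduli = (7, 5, 3, 2)
a43 = to_mixed_radix((1, 3, 1, 1), moduli)    # residues of 43
a37 = to_mixed_radix((2, 2, 1, 1), moduli)    # residues of 37
print(a43, a37, a43 > a37)                    # tuples compare most significant digit first

# Overflow check with an assumed redundant modulus 11 placed in front of (7, 5, 3, 2).
extended = (11,) + moduli
total = tuple((206 + 7) % mod for mod in extended)
print(to_mixed_radix(total, extended)[0] != 0)   # nonzero leading digit signals overflow
```

Because the mixed-radix digits are weighted, ordinary lexicographic comparison of the digit tuples orders the numbers correctly, which is what makes the conversion useful for magnitude comparison. The conversion is sequential, with one pass per modulus.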
11.4 CONVERSION OF NUMBERS FROM/TO THE RESIDUE SYSTEM

If the moduli $m_i$ are pairwise relatively prime we can use the Chinese Remainder Theorem in order to convert a number in the residue system to the conventional number system. This theorem states that

$$|X|_M = \left| \sum_{j=1}^{N} \hat{m}_j \left| \frac{x_j}{\hat{m}_j} \right|_{m_j} \right|_M \tag{11.14}$$

where $\hat{m}_j = M / m_j$, $M = \prod_{j=1}^{N} m_j$, and all the values of $m_j$ are pairwise relatively prime. The proof is found in [7].

Example 11.6 For the residue number system with the three pairwise relatively prime moduli $(m_3, m_2, m_1) = (7,3,2)$, the range includes $M = 42$ numbers. Given a representation $(x_3, x_2, x_1) = (0,2,1)$, we wish to find $X$.

$$\hat{m}_1 = \frac{M}{m_1} = \frac{42}{2} = 21; \quad \hat{m}_2 = \frac{42}{3} = 14; \quad \hat{m}_3 = \frac{42}{7} = 6$$

$$\left|\frac{1}{\hat{m}_1}\right|_{m_1} = \left|\frac{1}{21}\right|_2 = 1, \quad \left|\frac{1}{\hat{m}_2}\right|_{m_2} = \left|\frac{1}{14}\right|_3 = 2, \quad \left|\frac{1}{\hat{m}_3}\right|_{m_3} = \left|\frac{1}{6}\right|_7 = 6$$

Therefore,

$$|X|_{42} = |36 x_3 + 28 x_2 + 21 x_1|_{42} = |36 \cdot 0 + 28 \cdot 2 + 21 \cdot 1|_{42} = |77|_{42} = 35.$$

The coefficients 36, 28, and 21 are constants that can be computed once and stored. □

An alternate form for the equation for converting a number in the residue system to decimal is

$$|X|_M = |A_3 x_3 + A_2 x_2 + A_1 x_1|_M \tag{11.15}$$

where $A_i$ is the "weight" of the digit $x_i$. Therefore, $A_3$ is the value of (1,0,0), $A_2$ is the value of (0,1,0), and $A_1$ is the value of (0,0,1). For the residue system $(m_3, m_2, m_1) = (7,3,2)$ these values are 36, 28, and 21, respectively, yielding the exact same expression as in the previous example.

11.4.1 Conversion from Binary to the Residue System

If the operands are given in the conventional binary system, we can convert them directly into the residue system. Given $X = \sum_{j=0}^{n-1} x_j 2^j$ with $x_j \in \{0,1\}$, then

$$|X|_m = \left| \sum_{j=0}^{n-1} x_j \, |2^j|_m \right|_m \tag{11.16}$$

The terms $|2^j|_m$ can be precalculated and stored in a table.

Example 11.7 To find $|1101101|_3$, we first generate a table of $|2^j|_3$, yielding $|2^0|_3 = 1$, $|2^1|_3 = 2$, $|2^2|_3 = 1$, $|2^3|_3 = 2$, ... Therefore,

$$|1101101|_3 = |2^6 + 2^5 + 2^3 + 2^2 + 2^0|_3 = |1 + 2 + 2 + 1 + 1|_3 = 1.$$ □

11.5 SELECTING THE MODULI

We may have different objectives when selecting the moduli. If our objective is to reduce the execution time of addition and multiplication, then a large number of small moduli is desirable, since the execution time of these operations is determined by the largest modulus. However, a large number of small moduli will lengthen the time for converting residue numbers to the associated mixed-radix system, since this conversion is a sequential procedure in which the number of steps is determined by the number of moduli. Such conversions are necessary for magnitude comparison, sign detection, or overflow detection.

Another consideration when selecting moduli is the fact that the residues would normally be coded in some binary code, and the arithmetic operations on the residues would be executed on their corresponding binary representations. We therefore have the following objectives:

1. Efficient binary representation to minimize the total number of bits.
2. Convenient binary coding to simplify the execution of arithmetic operations.

The smallest number of bits needed to represent the residue digit for the modulus $m_i$ is $\lceil \log_2 m_i \rceil$. Hence, to maximize the representation (storage) efficiency, we prefer to select an $m_i$ that equals $2^k$ or is very close to it; e.g., $(2^k - 1)$. Clearly, we can select only one $m_i$ of the form $2^k$ and still have relatively prime moduli. We may then, in addition to $2^k$, select $(2^k - 1)$ and a few other moduli of the form $(2^l - 1)$. However, not all terms of the form $(2^l - 1)$ may be selected, since $2^k - 1 = (2^{k/2} - 1)(2^{k/2} + 1)$ for even values of $k$. Thus, $(2^k - 1)$ and $(2^{k/2} - 1)$ are not relatively prime. $(2^k - 1)$ is also factorable for some odd values of $k$.
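The requirement that the moduli be pairwise relatively prime is also what the Chinese Remainder Theorem of Equation (11.14) relies on, so the two can be checked together. The following Python sketch is offered under stated assumptions: all function names are hypothetical, moduli and residues are passed most-significant first as in $(m_3, m_2, m_1)$, and the inverses come from Python's built-in `pow` (Python 3.8 or later) instead of a stored table. It computes the weights $A_j$ of Equation (11.15), converts back from residues, and converts a binary string with a table of $|2^j|_m$ as in Equation (11.16).

```python
from itertools import combinations
from math import gcd, prod


def pairwise_coprime(moduli):
    """True if every pair of moduli is relatively prime."""
    return all(gcd(a, b) == 1 for a, b in combinations(moduli, 2))


def crt_weights(moduli):
    """Weights A_j = m_hat_j * |1/m_hat_j|_{m_j} of Equation (11.15)."""
    M = prod(moduli)
    weights = []
    for m in moduli:
        m_hat = M // m
        weights.append(m_hat * pow(m_hat, -1, m))
    return weights


def from_residue(residues, moduli):
    """Value in [0, M - 1] represented by the residue digits, per Equation (11.14)."""
    M = prod(moduli)
    return sum(a * x for a, x in zip(crt_weights(moduli), residues)) % M


def binary_to_residue(bits, m):
    """|X|_m of a binary string, using a table of |2^j|_m as in Equation (11.16)."""
    table = [pow(2, j, m) for j in range(len(bits))]
    return sum(table[j] for j, b in enumerate(reversed(bits)) if b == "1") % m


print(crt_weights([7, 3, 2]))                  # [36, 28, 21]
print(from_residue([0, 2, 1], [7, 3, 2]))      # 35
print(binary_to_residue("1101101", 3))         # 1
print(pairwise_coprime([2**4 - 1, 2**2 - 1]))  # False: 15 and 3 share the factor 3
```

The three conversion results reproduce Examples 11.6 and 11.7. The last line illustrates why $(2^k - 1)$ and $(2^{k/2} - 1)$ cannot both be selected.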
The selected moduli should be as close as possible to one another to avoid very large moduli, which would increase the execution time.

Example 11.8 Consider the four moduli $32 = 2^5$, $31 = (2^5 - 1)$, $15 = (2^4 - 1)$, and $7 = (2^3 - 1)$. The total number of bits required for their representation is 5 + 5 + 4 + 3 = 17 bits. These four moduli are relatively prime, and thus,

$$M = 2^5 (2^5 - 1)(2^4 - 1)(2^3 - 1) = 2^{17} - 2^{14} - 2^{13} - \cdots > 2^{16}$$

Any binary coding of $2^{16}$ numbers requires at least 16 bits. Therefore, these four moduli yield a very efficient coding. □

For moduli of the form $2^k$, an ordinary binary adder can be used, in which case the additive inverse is simply the two's complement. For $(2^k - 1)$, an adder with end-around carry can be used, and the additive inverse is the one's complement.

Example 11.9 If $m = 2^l - 1$, the additive inverse of the digit $c$ is $m - c = (2^l - 1) - c$, which equals the one's complement of $c$. Suppose $l = 3$ and, as a result, the modulus is 7. Also assume the conventional binary coding for the residue digits. If we wish to subtract 4 from 6, we instead add the one's complement of 4 to 6, yielding

      110     6
    + 011     one's complement of 100 = 4
    -----
    1 001
    +   1     end-around carry
    -----
      010
□

For moduli different from $2^k$ or $(2^k - 1)$, look-up tables must be used. For example, the addition and multiplication tables for $m = 5$ are depicted in Table 11.2. Each of these tables is of size $2^6 \times 3 = 64 \times 3$ bits.

    +   0  1  2  3  4          x   0  1  2  3  4
    0   0  1  2  3  4          0   0  0  0  0  0
    1   1  2  3  4  0          1   0  1  2  3  4
    2   2  3  4  0  1          2   0  2  4  1  3
    3   3  4  0  1  2          3   0  3  1  4  2
    4   4  0  1  2  3          4   0  4  3  2  1

TABLE 11.2 Modulo 5 addition and multiplication tables.

In most cases, the conventional binary coding for the digit is used. This is not really necessary and, for $m = 5$, for example, we may select the coding shown in Table 11.3. The pairs 1 and 4, and 2 and 3, are additive inverse pairs and also one's complements.

    Digit   Binary Code
    0       000 or 111
    1       001
    2       010
    3       101
    4       110

TABLE 11.3 Alternate binary coding for residues modulo 5.

11.6 ERROR DETECTION AND CORRECTION

Two subjects are discussed in this section. The first is the use of residue arithmetic to detect and possibly correct errors when performing arithmetic operations on numbers represented in the conventional number systems. The second is the use of redundant moduli in a residue system to allow detection and possibly correction of errors while performing arithmetic operations in the residue system.

11.6.1 Error Codes for Conventional Number Systems

Arithmetic error codes are those codes that are preserved under arithmetic operations. This property enables the detection of errors immediately after the completion of the arithmetic operation. Such concurrent error detection can always be attained by duplicating the arithmetic processor. This method, however, is too costly.

We say that an error code is preserved under an arithmetic operation $\star$ if for any two operands $X$ and $Y$, and the corresponding encoded entities $X'$ and $Y'$, there is an operation $\circledast$ for the coded operands satisfying

$$X' \circledast Y' = (X \star Y)' \tag{11.17}$$

Error codes to be used in an arithmetic unit should be examined for implementation costs and effectiveness. By costs we mean both hardware cost and execution time cost (the additional delay due to the need to encode the operands and check the result). By effectiveness we mean fault coverage, which is defined as the percentage of possible faults that will be detected (a weighted percentage considering the probability of the different faults). Single-bit faults clearly have a higher probability than multiple-bit faults, and we would like to make sure that all of them are detected by the checking scheme. Note, however, that a single error in an operand or an intermediate result may cause a multiple-digit error in the final result. For example, when adding two binary numbers, if stage $i$ of the adder is faulty, all the remaining $(n - i)$ higher order digits may be erroneous.
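The remark about a faulty adder stage can be made concrete with a small simulation. The sketch below is an assumption-laden illustration rather than a model taken from the text: the function name is hypothetical, and the fault model (the carry-out of one stage stuck at 1) is just one simple choice among many possible faults.

```python
def ripple_add(a, b, n, stuck_carry_stage=None):
    """n-bit ripple-carry addition, bit by bit.

    If `stuck_carry_stage` is given, the carry-out of that stage is forced
    to 1 (a hypothetical stuck-at-1 fault). The final carry-out is discarded.
    """
    carry, result = 0, 0
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        s = ai ^ bi ^ carry                       # sum bit of stage i
        carry = (ai & bi) | (carry & (ai ^ bi))   # carry-out of stage i
        if i == stuck_carry_stage:
            carry = 1                             # inject the fault
        result |= s << i
    return result


n = 8
a, b = 0b11111000, 0b00000000
good = ripple_add(a, b, n)                        # 0b11111000
bad = ripple_add(a, b, n, stuck_carry_stage=2)    # 0b00000000
wrong_bits = bin(good ^ bad).count("1")           # 5 result digits differ
error_magnitude = (bad - good) % (1 << n)         # 8
```

With these operands, the single faulty stage 2 corrupts all five higher-order result digits, because every later stage propagates the wrong carry. Viewed as a number, however, the error in this run is just $2^3$ modulo $2^8$, a single power of two.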
There are two classes of arithmetic codes: the separate codes and the nonseparate codes. In the separate codes the data and check bits are completely separated, allowing us to use the data bits immediately with no encoding. We start with the simplest nonseparate codes, which are the AN-codes [1]. These codes are formed by multiplying the operands by a constant $A$. In other words, $X'$ in Equation (11.17) is $A \cdot X$, and the operations $\star$ and $\circledast$ are identical. For example, if $A = 3$ we multiply each operand by 3 (obtained as $2X + X$) and check the result of any arithmetic operation to see whether it is an integer multiple of 3. All error magnitudes that are multiples of $A$ are undetectable. Therefore, we should not select a value of $A$ that is a power of the radix 2. An odd value of $A$ will detect every single digit fault, since such an error has a magnitude of $2^i$. $A = 3$ provides the least expensive AN code that still enables the detection of all single errors.

Example 11.10 The number $0110_2 = 6_{10}$ is represented in the AN code with $A = 3$ by $010010_2 = 18_{10}$. A fault in bit position $2^3$ may result in the erroneous number $011010_2 = 26_{10}$. This error is easily detectable, since 26 is not a multiple of 3. □

The simplest separate codes are the residue code and the inverse residue code. In each of these we attach a separate check symbol $C(X)$ to every operand $X$. For the residue code, $C(X) = X \bmod A$, where $A$ is called the check modulus. For the inverse residue code, $C(X) = A - (X \bmod A)$. For both separate codes, Equation (11.17) is replaced by

$$C(X) \circledast C(Y) = C(X \star Y). \tag{11.18}$$

This equation clearly holds for addition, multiplication, and subtraction (see Equations (11.4), (11.6), and (11.9), respectively). For division, the equation $X - S = Q \cdot D$ is satisfied, where $X$ is the dividend, $D$ is the divisor, $Q$ is the quotient, and $S$ is the remainder. The corresponding residue check is therefore

$$\bigl| |X|_A - |S|_A \bigr|_A = \bigl| |Q|_A \cdot |D|_A \bigr|_A.$$

For example, if $A = 3$, $X = 7$, and $D = 5$, the results are $Q = 1$ and $S = 2$. The corresponding residue check is: $\bigl| |7|_3 - |2|_3 \bigr|_3 = \bigl| |5|_3 \cdot |1|_3 \bigr|_3 = 2$.

A residue code with $A$ as a check modulus has the same exact undetectable error magnitudes as the corresponding AN code. For example, if $A = 3$, only errors that modify the result by some multiple of 3 will go undetected, and consequently, single-bit errors are always detectable. In addition, the checking algorithms for the AN code and the residue code are the same; in both we have to compute the residue of the result modulo $A$. Even the increase in word length, $\lceil \log_2 A \rceil$, is the same for both codes. The most important difference is due to the property of separateness. The arithmetic unit for the check symbol $C(X)$ in the residue code is completely separate from the main unit operating on $X$, while only a single unit (of a higher complexity) exists in the case of the AN code. An adder with a residue code is depicted in Figure 11.1. In the error detection block shown in this figure the residue modulo $A$ of the $X + Y$ input is calculated and compared to the other input. A mismatch results in an error indication.

FIGURE 11.1 An adder with a separate residue check. (Figure labels: $X$, $Y$, $X + Y$, $|X|_A$, $|Y|_A$, $\bigl| |X|_A + |Y|_A \bigr|_A$, Error detection, Error indication.)

The AN and residue codes with $A = 3$ are the simplest examples of a class of arithmetic (separate and nonseparate) codes which use a value of $A$ of the form $A = 2^a - 1$, with $a$ being an integer [1]. This choice simplifies the calculation of the remainder when dividing by $A$ (which is needed for the checking algorithm), and it is the reason that these codes are called low-cost arithmetic codes. The calculation of the remainder when dividing by $2^a - 1$ is simple, because the equation

$$|z_i r^i|_{r-1} = |z_i|_{r-1}, \quad r = 2^a \tag{11.19}$$

allows the use of modulo $(2^a - 1)$ summation of the groups of size $a$ bits that compose the number (each group has a value $0 \le z_i \le 2^a - 1$).

Example 11.11 To calculate the remainder when dividing the number $X = 11110101011$ by $A = 7 = 2^3 - 1$, we partition $X$ into groups of size 3, starting with the least significant bit. This yields $X = (z_3, z_2, z_1, z_0) = (11, 110, 101, 011)$. We then add these groups modulo 7; i.e., we "cast out" 7's and add the end-around carry whenever necessary. A carry-out has a weight of 8 and since $|8|_7 = 1$ we must add an end-around carry whenever there is a carry-out, as illustrated below.
       11     z3
    + 110     z2
    -----
    1 001
    +   1     end-around carry
    -----
      010
    + 101     z1
    -----
      111
    + 011     z0
    -----
    1 010
    +   1     end-around carry
    -----
      011

The residue modulo 7 of $X$ is 3, which is the correct remainder of $X = 1963_{10}$ when divided by 7. □

Both separate and nonseparate codes are preserved when we perform arithmetic operations on unsigned operands. If we wish to include signed operands as well, we must require that the code be complementable with respect to $R$, where $R$ is either $2^n$ or $2^n - 1$ (where $n$ is the number of bits in the encoded operand). The selected $R$ will determine whether two's complement or one's complement arithmetic will be employed. The original operand is complementable with respect to $M$ (as before, $M$ is either $2^m$ or $2^m - 1$, but with $m < n$). Hence, for the AN code, the equation $R - AX = A(M - X)$ must be satisfied, yielding $R = AM$; i.e., $A$ must be a factor of $R$. If we insist on $A$ being odd, it excludes the choice $R = 2^n$. Thus, only one's complement can be used, with $A$ being a factor of $2^n - 1$.

Example 11.12 For $n = 4$, $R$ is equal to $2^n - 1 = 15$ for one's complement, and is divisible by $A$ for the AN code with $A = 3$. The number $X = 0110$ is represented by $3X = 010010$, and its one's complement $101101$ ($= 45_{10}$) is divisible by 3. However, the two's complement of $3X$ is $101110$ ($= 46_{10}$) and is not divisible by 3. If $n = 5$, then for one's complement $R$ is equal to 31, which is not divisible by $A$. The number $X = 00110$ is represented by $3X = 0010010$, and its one's complement is $1101101$ ($= 109_{10}$), which is not divisible by 3. □

For the residue code with the check modulus $A$, the equation $A - C(X) = (R - X) \bmod A$ must be satisfied. This implies that $R$ must be an integer multiple of $A$, again allowing only one's complement arithmetic to be used. However, we may modify the procedure so that two's complement (with $R = 2^n$) can also be employed:

$$(2^n - X) \bmod A = (2^n - 1 - X + 1) \bmod A = (2^n - 1 - X) \bmod A + |1|_A \tag{11.20}$$

We therefore need to add a correction term $|1|_A$ to the residue code when forming the two's complement. Note that $A$ must still be a factor of $2^n - 1$. A similar correction is needed when we add operands represented in two's complement and a carry-out (of weight $2^n$) is generated in the main adder. Such a carry-out is discarded according to the rules of two's complement arithmetic. To compensate for this, we need to subtract $|2^n|_A$ from the residue check. Since $A$ is a factor of $(2^n - 1)$, the term $|2^n|_A$ is equal to $|1|_A$.

Example 11.13 For the residue code with $A = 7$ and $n = 6$, $R = 2^6 = 64$ for two's complement and $R - 1 = 63$ is divisible by 7. The number $001010_2 = 10_{10}$ has the residue 3 modulo 7. The two's complement of 001010 is 110110. The complement of $|3|_7$ is $|4|_7$ and adding the correction term $|1|_7$ yields 5, which is the correct residue modulo 7 of 110110 ($= 54_{10}$). If we now add to $X = 110110$ (in two's complement) the number $Y = 001101$, a carry-out is generated and discarded. We must therefore subtract the correction term $|2^6|_7 = |1|_7$ from the residue check with the modulus $A = 7$, obtaining

Main adder:

      110110  = X
    + 001101  = Y
    --------
    1 000011

Residue check:

      101     = |X|_7
    + 110     = |Y|_7
    -----
    1 011
    +   1     end-around carry
    -----
      100
    -   1     correction term
    -----
      011

3 is the correct residue of the result 000011 modulo 7. □

These modifications result in an interdependence between the main arithmetic unit and the check unit that operates on the residues. Such an interdependence may cause a situation where an error from the main unit propagates to the check unit and the effect of the fault is masked. However, it has been proven in [2] that the occurrence of a single-bit error is always detectable.

Error correction can be achieved by using two or more residue checks. The simplest case is the biresidue code, which consists of two residue checks $A_1$ and $A_2$. If $A_1 = 2^a - 1$ and $A_2 = 2^b - 1$ are two low-cost residue checks with $n = \mathrm{l.c.m.}(a, b)$, where $n$ is the number of bits in the operands, then any single-bit error can be corrected [5].
11.6.2 Error Codes for the Residue Number System

The residue system is inherently more fault-tolerant than the conventional number system. The lack of interaction among the residue digits (no carry-propagation) implies that a fault in a single digit will not result in errors in other digits. This desirable fault isolation property is preserved while performing addition, subtraction, and multiplication, but is not preserved while other operations are performed. Another consequence of the fault isolation property is that a consistently erroneous residue digit, when identified, can be disconnected and the rest of the residue arithmetic unit can still be used.

The fault-tolerance feature will be manifested only if redundant moduli are added to the original set of moduli, allowing the detection of errors and even the identification of faulty residue digit circuits. The resulting system is called a redundant residue number system and is defined by a set of $N + L$ moduli $(m_{N+L}, \ldots, m_{N+1}, m_N, \ldots, m_1)$. The $L$ moduli $m_{N+L}, \ldots, m_{N+1}$ are the redundant moduli, implying that out of the total range $[0, M_T - 1]$ (where $M_T = \prod_{i=1}^{N+L} m_i$), $[0, M - 1]$ (with $M = \prod_{i=1}^{N} m_i$) is the legitimate range. The range $[M, M_T - 1]$ is called the illegitimate range. It has been shown [6] that a single error always moves the operand from the legitimate range to the illegitimate range and can therefore be easily identified.

11.7 EXERCISES

11.1. Verify that there are only four different representations in the residue number system with $(m_2, m_1) = (4,2)$.

11.2. Given the set of moduli (7,5,3), find: (a) the range $M$, (b) the coefficients for the Chinese Remainder Theorem and the value represented by (2,3,2), (c) the corresponding mixed-radix representation, (d) the representation of 20 in the residue system and in the mixed-radix system.

11.3. Prove the Chinese Remainder Theorem by calculating the remainder modulo $m_i$ of Equation (11.14), knowing that every number $X$ has a unique representation in the residue system.

11.4. Write the subtraction table for $m = 5$, first in decimal representation and then in binary representation using (a) the ordinary binary coding with 000 through 100 (for the digits 0 through 4, respectively), (b) the binary code in Table 11.3. What is the advantage of the second scheme?

11.5. Write the rule for converting a decimal number to the residue system using a table of $|10^j|_m$. How is this rule simplified for the case $m = 9$?

11.6. Divide 35 by 5 in the residue system with $(m_3, m_2, m_1) = (7,3,2)$. Can you divide 35 by 7? 31 by 5? What are the conditions under which division can easily be carried out?

11.7. Given a number $X$ and its residue modulo 3, $C(X) = |X|_3$. How will the residue change when $X$ is shifted by one bit position to the left if the shifted-out bit is 0? Repeat this for the case where the shifted-out bit is 1. Verify your rule for $X = 01101$ shifted five times to the left.

11.8. The calculation of the remainder when dividing by $A = 2^a - 1$ can be done in parallel rather than in a serial manner. Show a block diagram of such a parallel circuit for 32-bit long numbers and $A = 15$.

11.9. Show that a residue check with the modulus $A = 2^a - 1$ can detect all errors in a group of $a - 1$ (or fewer) adjacent bits. Such errors are called burst errors of length $a - 1$ (or less) and they may occur when shifting an operand by several bit positions.

11.10. Prove that $|z_i r^i|_{r-1} = |z_i|_{r-1}$ for $r = 2^a$ and $0 \le z_i \le 2^a - 1$.

11.11. When performing a divide operation with AN coded operands the quotient $Q$ must be multiplied by $A$. Find a simple algorithm for executing this multiplication when $A = 2^a - 1$. Illustrate your algorithm for $A = 15$.

11.8 REFERENCES

[1] A. AVIZIENIS, "Arithmetic error codes: Cost and effectiveness studies for application in digital system design," IEEE Trans. on Computers, C-20 (Nov. 1971), 1322-1331.
[2] A. AVIZIENIS, "Arithmetic algorithms for error-coded operands," IEEE Trans. on Computers, C-22 (June 1973), 567-572.
[3] Y. MA, IEEE Trans. on Computers, 47 (March 1998), 333-337.
[4] F. POURBIGHARAZ and H. M. YASSINE, "A signed-digit architecture for residue to binary transformation," IEEE Trans. on Computers, 46 (Oct. 1997), 1146-1150.
[5] T. R. N. RAO, "Biresidue error-correcting codes for computer arithmetic," IEEE Trans. on Computers, C-19 (May 1970), 398-402.
[6] M. A. SODERSTRAND, W. K. JENKINS, G. A. JULLIEN, and F. J. TAYLOR, Residue number system arithmetic: Modern applications in digital signal processing, IEEE Press, New York, 1986.
[7] N. S. SZABO and R. I. TANAKA, Residue arithmetic and its application to computer technology, McGraw-Hill, New York, 1967.
[8] F. J. TAYLOR, "Residue arithmetic: A tutorial with examples," IEEE Computer, 17 (May 1984), 50-62.
[9] R. ZIMMERMANN, "Efficient VLSI implementations of modulo $(2^n \pm 1)$ addition and multiplication," Proc. of 14th Symp. on Computer Arithmetic (April 1999), 158-167.