Data Reconciliation & Gross Error Detection: An Intelligent Use of


Data Reconciliation & Gross Error Detection An Intelligent Use of Process Data

Shankar Narasimhan and Cornelius Jordache

Publishing Company Houston, Texas

Data Reconciliation & Gross Error Detection Copyright 02000 by Gulf Publishing Cornparty, Houston. Texas. All I-ightsresel-ved. This book, or parts thereof. may not be reproduced in any form without express written permission of the publisher.

Tr, our grtru Professor Richard S. H. Mak, who played

tlze roles of an initiator and a catalyst.

Gulf P ~ ~ b l ~ s hCompany i~ig Book Di\,ision PO. 13ox 260s 1Hou~lnn.l'esas 77252-260s "Since all measurements and observationb ar-e nothing more than approximations to the truth. the same must be t!-ue of 311 c:ilcuiations resting upon them, and thc: highest a m of all cn:npu!2iior!s matie COiICZi71iEs c0i:Cr::ie ~!:cllOnlsila i:lcsr t ? !o ~ approxirnatc, al; ncari)- YLL pr;?c!icab!c. t ~ i!i2 l tr:!ii~. R u t this can be accomplisi~cdin ;lo OCIICS \


Acknowledgments, xi;; Preface. xv

Chapter I: The Importance of Data Reconciliation and Cross Error Detection, 1 1 Piocitss I h t a Conditioning Metl?ods . . . . . . . . . . . . . . . . . . . . . . . . . . . . Iilduhtriai Esamplcs of Stcidy-State Dats Rccor-iciliaiioil . . . . . . . . . . . . . 5 3 Cr-udi. Split 0p:il-r~izationin a Preheat 'Train of a Kei~ner-y . . . . . . . 7 ?vIin~!l?izing Water Consumption i r hlinera: ~ Beneficiation Circuir\ . . 1 . ilara Recdnciliatio;~Problem Formulztion . . . . . . . . . . . . . . . . . . . . . . . Exa!nples of Slnlpir iiecoilc:liation I'roblems . . . . . . . . . . . . . . . . . . . . . i 1 S!:stt"ns ?Yith .4li Measured Variables . . . . . . . . . . . . . . . . . . . . . . . . . 1 i S).srems Wit\, Unmc.usurcd Variabies . . . . . . . . . . . . . . . . . . . . . . . I 1 Systan Cor?tain;ng C i r o s i El-:orb . . . . . . . . . . . . . . . . . . . . . . . . . ---- 17 Zitrlzfir.; iroir? Data Recurlciliation and Gross Err-cr Detection . . . . . . . . -7;j .I\ B1-it:' H'istory ofi1a:a Recoriciliation anci Cross I:rr:,r 5etection . . . . . 2 1 ? i -7 S c o ! ~and Osgi~nirationoftlie Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . S~~liiiriar:. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 , l < e f e ~ - i ~ ~ .i c. r. s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . :D


Chapter 2: hleasurement Errors and Error Reduction Techniqtles, 32 Cl:t\\if~i-,ition of h.leasurements Error\ . . . . . . . . . . . . . . . . . . . . . . . . . . . Random Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Ciross Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .




Err-or Reduction hletliods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Exponential Fillers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Moving Average 1-'ilters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . I'olynomial Filtel-s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Iiybrid Filters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

38 40 JS

5 54 j6 57

Chapter 3: Linear Steady-State Data Reconciliation. 59 I..inear Systems With All Variahleh Measured . . . . . . . . . . . . . . . . . Cieneral Formulation and S o l u t ~ o l.~. . . . . . . . . . . . . . . . . . . . . . . . . Statistical Basis of Data Reconciliation . . . . . . . . . . . . . . . . . . . . . . . Linear Systerni With Both Meawred and Ul~r~ieaiured Variables . . The Constructiori of a PI-ojectionh4atrix . . . . . . . . . . . . . . . . . . Observahility 2nd Redi~ndancy. . . . . . . . . . . . . . . . . . . . . . . . . . h4atri.i Decornpo\itio~~ Mtihod\ . . . . . . . . . . . . . . . . . . . . . . . . . (;rap11 Theoretic \letl~od . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Otlitr Ciassificutroii klcthotls . . . . . . . . . . . . . . . . . . . . . . . . . . . . kj\tiri?atii12 . Mxi\ Error Co? .I 1.iank.c hl~itrli . . . . . . . . .

59 59 61

63 66

69 70 72 77 77 I

Si~i?i;iatiorlPCciln~iiliC!'or ~\~!lli:iliiig 1jat;i i i ~ ~ ~ ; i ~ i i i : i .i i.~ .i i . Slillllll~:l-\ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . J: XI.^.. .,rice\ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


ChBpter 4: Steacf-State Ilata Keconcilistion fcr Bilinear S\ stern\. 83

. . . .

. . . .

. . .


hi >? S?

Solutioi~Techniques for Equality Constrained Problems . . . . . . . . . . . . . Methods Using I. agrange Multipliers . . . . . . . . . . . . . . . . . . . . . . . . . . Method of Successive Linear Data Reconciliation . . . . . . . . . . . . . . . . Nonlinear PI-ogramming(NLP) Methods for Inequality Constrained Reconciliation Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sequential Quadratic Programming (SQP) . . . . . . . . . . . . . . . . . . . . . . Generalized Reduced Gradient (GKG) . . . . . . . . . . . . . . . . . . . . . . . . . Vari;~bleClassification for Nonlinear Data Reconciliation . . . . . . . . . . . . Comparison of Nonlinear Optimization Strategies for Data Reconciliation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Surnrnary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

122 122 124

128 129 132 134

136 138 138

Chapter 6: Data Recczc:'iation in Dynamic Systems. 142 The Xeed for Dynamic Data Reconciliation . . . . . . . . . . . . . . . . . . . . . . . Linear Discrete Dvnamic System Model . . . . . . . . . . . . . . . . . . . . . . . . . Optinial State Estimation Using Kalman Filter . . . . . . . . . . . . . . . . . . . . . Analogy Between Kal~nanFiltering and Steady-Stare Data Reconiilistiori . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Optiiil31 Control and Kalrn-n Filterin? . . . . . . . . . . . . . . . . . . . . . . . . K;~l:ii;l~i Fil:cr l~llplen?eiit:ition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1)ynaiiiic D;:u Reco~iciliationof Non!incar Syxrenis . . . . . . . . . . . . . . . . h'oii!irtea~-State Extini:itior!s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Nor;linzar Data Reco!-iciliation Mctliods . . . . . . . . . . . . . . . . . . . . . . . Su~:;rnary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

142 143 148 I53 I55 i57 1 i)O 160 IOl 171 17!

C!~apter7 : Introduction to Grass Errnr Dt-tectio~i.173 Problc.~:~ Statem~n!.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Bas!c Stat;s!ical Tests for Gross En-or Liercction . . . . . . . . . . . . . . . . . . . T hz Glohril Test (GT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Thc Constraint or Noddl Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Thc hleasurcmc~ltTest (MT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . The Gclicralired Likc!iliood Ratio (GLR) Test . . . . . . . . . . . . . . . . . . Cornp:ii-ison of thc Powcr of Basic Gross Error Dciection Tests . . . . . (;SOS\ Error Detectioli IJ\iii$ PI-incipal Cornporicnt (PC) Te\ti . . . . . . PI-lncipalComponcilt Tests for Residu:ils of Process C:onstlairirs . . . . Principal Con~poncnt'l'csts or] Measurement Adjcstille~its . . . . . . . . Relationship Bct\vccn Principal Component Tests and Otlier St:~tisticalTebts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


175 178 I80 1X i

;., 5 190 1% 196

147 198

Statistical 'Test5 for General Steady-State Models . . . . . . . . . . . . . . . . . . Techniques for Sinzle Gross Error Identification . . . . . . . . . . . . . . . . . . . Serial Eliniinat~orlStratesy for- Itlentifying 3 Single Gross Errc11-.. . . Identifying a Single Gross Error h5; Principal C O I I I I I O ITests ~ ~ I I ~. . . . . 1jetectability and Idzritifiability of Cross El-rors . . . . . . . . . . . . . . . . . . . . Detectability of Gross Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Identifiability oiGross Errors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . PI-oposedProhlerns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Sumntary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

290 203 2Cl4

107 100 210 114 2 17 223 224

Chapter 10: Design of Sensor Networks. 300 cstirnalion Accuracy of Data Recorlciliatioii . . . . . . . . . . . . . . . . . . . . . . Sensol- Network Dcsigr? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Methods Based on Matrix A1gel)ra . . . . . . . . . . . . . . . . . . . . . . . . . . . . Methods Rased on Graph Theol-y . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Methods Hased on Optimization Techniques . . . . . . . . . . . . . . . . . . . . Developments in Scrisor Network Ecsiyn . . . . . . . . . . . . . . . . . . . . . Surnrnary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kef$c.cnces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . F

30 I

?(I? 303 3li 322 323 323 325

Chapter 11: Industrial Applicatiorls of Data Reconciliation and Gross Error Detection Technologies, 327

Chapter 8: Multiple Gross Error Identification Strategies for Steady-State Processes, 226

Process Unit Baia:lce Rccuuciiialion and Grus; Error Deteciion . . . . . . . Parameter Est1rna:ion and Data !?eco:lci!iciti~~. . . . . . . . . . . . . . . . . . . . . Plant-W~deMateria! and 1Jtilit;es Kecoilci!iaiioii . . . . . . . . . . . . . . . . . . ....... CascStcdies . . . . . . . . . . . . . . . . Rccnnciliation of' Sef,ne:.> Crude F'iehcar T:-air; E131:: . . . . . . . . . . . . Keco~iciliatioiiof A-numoniu P i x : Dac;: . . . . . . . . . . . . . . . . . . . . . Sumiiiar-). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . f2~:fcrenczs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Stmtegies rnr X,luitipis i;!-oss Emor Identification in Linear Processes . . Sin~ulr;,i?eouiStrategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . sc,.:.1.11 >ri-;~te~:~:; ....................................... Coir!i>ii;a!ic)i-iA Si;lti'ci~\ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Pcr-fori!ianci. .'il~-a,u;-:.\ for Ij\.al~iatinsGross fir!-01-!ilcn!i!Ycativil '

si!-,llcgii,s . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Coi?ip::i-i,o!~:I: J.i;ri:i;iiz GI-:.,I\\Ei-1-01 !deiitrf'~c;i~ior~ Sira~ezies. . . . . . . . . ; I ? < l I : ! !I ! 1 ' . . . . . . . . . . . . . . . . . .... c:-<>,\ E,-:(,c [<:c.;:,fi(:j;;<;!:tr !<(>1;!!!j,>:ll p!-:l~,:,$~s t.,Tiiil? .\r:r;!i!::,ii GL2. l:cii~:,ci . . . . . . . . . . . . . . . . . . . . . . . . . . s i r tit i r ;i i : i t i n ........... !?-oix!\cS Pi-i>h!e;n\ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . S?!l!ii!1~1." . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kc!'~PLI?c~> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .






\ -i -// Refs~-cnces. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Appendix R: Gralth Theory Fundarr:enfal\. 3-3 Giaphh. PI-cicessGI-,i;;i~s.a:ici Su!;sra;lh\ . . . . . . . . . . . . . . . . . . . 37t. t'at!i.;. Cycles. aiitl C::nnccri~iry . . . . . . . . . . . . . . . . 1 Sp~niiinp'l'recs. I(~-ai?lhes. zrid Cti,?;.~!... . . . . . . . . . . . . . . . . . . . . . :;SO .Ll; Graph Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Cutset\. I-undamcniais C~.lrsets.a i d F~ir-ida~ni.rltai C lcie.; . . . . . . . . . . . . ?" I [{efcrence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3K3 ,.>:>

rj-oh!:r:i I


Forn1uiati:)n f i l l - :>e!:\leasiiri.lnrnt f3i;ica

. . . . . . . . .

Stnrihiical Pi-o;?i.:~iz.oi' T I . I T ~ ~ i\ t i i i ~ i ;rrld ~i the Global Tc\t . . . . . . . . . . . . G c . i l i . i - a l ~Lihi.iih<><)ii ~i~~ i-::i!r,) hI~'[liod . . . . . . . . . . . . . . . . . . . . . . . . . . !-,iult Lliagiio\l\ 'I~~ih111~.~iii.> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 'I'i1c St;ll< o f the ‘ A i ~ t . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Siilriiiiai-y . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Kefel-enL.c\ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


781 289 3 9. c,


707 -

20s ?CIS




Appendix C: Fundamentals of Probability and Statistics. 384 Randorn Variables 2nd PI-ohabilityDensity Functions . . . . . . . . . . . . . . . 383 Statistical Properties of IZandom Val-iables . . . . . . . . . . . . . . . . . . . . . . . . 589 Ffypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393


Thc authors are indebted to several people who have contributed to the preparation of this book. T h e m a i n contributions c a m e from Prof. N;lrasinlhan's students at the Indian Institute of Technology (1lT)hladras. 'f. Rcnganatllan and J . Prakash. currently doing their doctc?ra! prozrams. prepared the solutions for all examples lvith assislance fr-om Sreerari~Mriguluri. Mnrukuria Rajar~~ouli spent 110ui.s in :he riigtit psepal-ins tlic ~ a b i z sand iigurzs in dil.l'erent ch:i~::eu. Thc s~~ccessfiilccniltleiioii of ii-~i, book was clue to their eftoris. ii:--. S;icl-iln Pat\i.a!-dhan ;1.!1(! S. i'cslir;~\ an;im. faculty at f Jl' M d r a s . pro\~icli.c!critical in~)t!isii? ii:ipro\ c tile 'li! and c!ari:y of- tni: text. Thanks are also due to Liii: RAGE softv.-are cie\~zl:)pment t a r n at E ~ g i ncers India Lirni!cd, R&D Center. c o n s i s i i ~ gi\f Dr. i\ladhukar Gasg. Dr. V. !?a\ ikuliiar. and XIIS. Sheoi-aJ Sin?i! fi-ti~;: ~,~hclir? PI-ir to indiistrial ixocesscs. Prof. Ysras~rrrhrtnaiss rhafiks l'i-of. Jcse I<:?go: a:id DL-.Didier Maquiri at CRAN-INPL, in Nancy. I-i.ancc, for- a1-rai:ging a :;urulne!- visit for hint. during which a significan: part of the iexi rx:;i\ com!Aetcti. Dr. Jordache thanks all colleagues and ;I-!an:iyr-i; f t - o m Clic~nsh;l~-c. Coi-p.. Raytheon Process Antcirnation. and cspeciall! Sinlulation Sciences liic.. for ci~alleriginghim with practical issues that helped hin-i ciarii'!. niarly implenientatior~(jztails for data I-cconciliarion technology. Dr. Tor~i Clir~ksc;liespi-ovidcd valuable input on ?olynonlial filter-s.Drs. Miel!-Jii~g Lin and Ricardo Duiiia helped with clarifying sonie liteoreticai dcri\,;ilions mentioned in :he text.

Special thanks to thc R&L) management of Sirrtulation Sciences Inc. i S I M S C I m ) for the encou:agernent and support given to Dr. Jordachc during the writing of this manuscript and for allowing him to use DataconTM software and the training book for preparing a dctailed ind~lstrialexample included in this book. Very special thanks are also due to our respective wives. Jaishree Narasimhan, and Doina Jordache. who shielded us from all the problems on the tlorne front during the past two years. Moreover, the authors express a heart-felt gratitude to Debbie Markley of Gulf Publishing Company, who patiently reviewed every detail of the manuscript and worked long hours to help with finishing the index and to make sure this book is as accurate as possible. Fiilaiiy, the authors want to acknowledge and thank Prof. Miguel Bagajewicz for his excellent review and very useful suggzstions on additionai 123aterialto be included.


The quality of process ciata in chemical/petrochemical industries slyrlificantly affects the perforniancc a11d the profit gained frorn using v:i!-iOLIS software for process monitoring, oiilinc optilr~i/atioii,arid contro!. Unfortui1atcly plant ~ n c a s t ~ r e ~ noften e ~ ~ tcontain s errors that invalidate [lit. p r o w s mode! used for optimitation anti c o i ~ ~ r oDl .i m rectmciliation and gross error- detection are technique:, deveioped iil the p;~st3 0 years i'oi~ i,n~"uvin?111saccuiacy of' data so that the!- s:iti~t?):he p!ant ninclzl. Dur-i ! ; ~ t h e l:!.i decade. thsy ha:.;. br!:n r \ i d t l ~a j ~ p i i ~ir!~ i irefine!-ie\. pe:l-.:,~ l i e r l l i ~pldnrs, ll n~inera!prc)cessir!c i:idusrries. a n d so L~rtll.i!l i l i - c i ~ i achieve i:iorc ai.cur:kte plant-wide accour?iins : r i d superic?r-profit~ibiiiiy(1: plailt opera:lc~ns. A!tnough cor~iinercialsoftware h r dut;t rcco~lci!i:3rion and grosi cn-or dercction are ;.,\,ailal.le,t!lc accnir~paiiying~nsilualsusual!). gi\-cf ii:tle theor-et~caiback:_.soar!ti.In c;r-df:r ro I)c ah!< tc: sc:ecr ihe !IS>;~nerhnd.;;r!!,i ?air: tilt: iiloi; bci:cfii:; from ilieir iii?p!z1nzn1aiiurl. :)nz n e d c a sr;oi! uridersta:!dil~z of the Fui:damcr:ta! coiiceprs. 'This l ~ ( ? ~cxplai~~s i. :!I??a-ic: concept, i n datd reco~~cili~ition 2nd rross erTor detection usir~?marly i!lv
proposed. These meihods have not been includeti because they have riot iittaincd matur-ity,although they ntay become important in the future. This book is organized in such a way that it is useful for both industrial personnel and acade~nia.'Phe book will be a valuable tool to enzineers and managers dealing with the selection and implementation of data reconciliatio~isoftware, or those involved in the development of such softLvare. The book will also be useful as a supplementary reference for an irndei-g~aduatelgraduatelevel course in chernical process instrt~mentation and control in which basic concepts can be taught or as a text for a full graduate level course in these topics. Unlike this book, the other books that are currer~tlyavailable on these topics do not present an in-depth analysis of the different techniques available, their limitations or their interrelationships. The book is organized as follows: The firs, -;lapter motivates the need foi- data reconciliation and gross error detection and introduces the major conccpts in\'olvecl. Chapter ? introduces the statistical cIia1-acterizatio~iof liiea>llr.cnlellterrors mil v:irious filtering techniques used for error rzdtictiori t!l;it can for111 jl;111 of ail eve!-all data processing stratezy. The nest f n u r ch:lpterc dcsl n,ir11 ihc st~bjeciof data reconciliation in incre:i.;ing i?: si o!' con:ple~ity.Strad! -sta!e li11ea1-diita reco~~cilia:ioin is i l ~ s c ~ i h ei ~ di C'i;~pti-r-?. ~?::iolrlpo\itii~n~ecilllii!!lcs fill- Iirlear m o d e l with borh n:ea. . .;i~i-i-iisrii! ~~nmi.~i;ii~-i-c? i~iii~iblit, :ire tiesct-ibc4 here. The techniques rei;~~.s.i rcj !!IL.cia\sii'i~stii)~: of :;ti-iab!cs as obser\.ablz- a n j rc.~iundantars aI-,o p! <\e:~ic:i. >fe!hocis (01 c~tiiii:i~iil~ thc ~ie; t i ~ ~~.(-~ri:y~l~snt y b:il;i!~ce~~tild.~ I CI C I T ~ ~ ~I I: i \ e . , hi ~ l \ i d i - ~silcki - i ~ i gproccssei i s iI1cst1-a:ec! w i n y ? > L z ~ i ~I'I~OIV ~ p l e111i1i~~:ll ~ indu\tl-ie~:is ell :I:, utili~ydi~~i-i13u[io~i ili.r\\ i)?-ksIII C ~ C ~ I I iri~lt:\tr-ie\. ~C;I! Cl~:ijit~~. 5 t t - ~ i l t nor1lil:ear j data rcco~icili::tii)r:. N:)niiilear- ~nocizl\ar-c o t r ~ i iused to ;;ccur-atcly describe chcnlic;~I pr~ces.;cs.7'112 iii<:~~i eft icie~i:3i1d widel) used so!urion procedur~cfor :he non!rnear I-econciliariorn31-epi-escnied in this chap:er. Ilandling inequalir i'oil\ti.aints. qircii as hotrridh oil also analyzed in this chapter. I);ira reconciliati~~~i leciiiliqut. for dyniirnic systcllls are discussed iri C";-i.lprer(7. Both Kalnlar~iiltel-ins nlettiods and gerleral optimization tzchiliquei riesigrrcd for- d\ na~iiii.riordinear problc!ns are pre.;en!ed. Ct~:ipters 7 throit$~ 9 tical tiit11 the pr-oblem of gross en-or detection. ~ s issuz\ irlvolvetf in gross error detectior~and Chapter 7 i n t r c t d ~ ~ cthe ~k\~.i-ihi.s tlrc il:l\ic \t;lli\~izaltc.;l\ th;~! call bc wed to dc.tcc1 grciss er-~-or..

The underlying assumptions, char-acteristics. and relative advarttagcs :and disadvantages of various statistica1,tests arc also discussed. FOI-identitying nnultiple gross errors, corl~plexstrategies ar-e requil-ed. A plcttior;~of strategies have been proposed and evaluated in the I-csearch literatur-c.\, special eff'or? has becrr made in Chapter 8 to gi1.e :i ~inificdperspccti\c h classifying the different strategies on the basis of their core pr-inciplzs. \t'e also descr-ibe in detail a typical strategy fronl each of these classes. Chapter 9 treats the problem of gross error identification in dynamic systcr~~s. The efficacy of data reconciliation and gross error detection dependi significantly upon the location of measured variables. Recent attempts to optimally design the sensor network for rnaxinijzing accuracy of data reconciliation solution are described in Chapter 10. Several industrial applications and existent software sysienis for data reconciliation and gross error detection are also discussed in Uhaptc.1- 1 1 . Various aspects related to the benefits of offline and on-line data reconciiiation. the methods n-tostly used and their perfo~manccsare analy~cdhrre. In order to make this book self-sufficier~twith r-espect to the n:xrhi-ma tic:^? background required for- a good undzrstandirl~.:tppzridice> iiii. irlc1udc.d tl~atdescrihe the necessary b a i c co:iccpts fro111linex a!gsi?i:a. %rap11t!:co:v. atid prohabi1i;y ar;d :,ta:isiicai h\;po~hcci.;tc\tiil:.

The Importance of Data Reconciliafion and Gross Error Detection

PROCESS DATA CONDiflONLING METHODS I n aily modern chelnicni p!ant, pctrc~cheiiliiaiprocr3si or refiilel).. h~ndred5oi- eve:; tiiousai~dsof variabiei---silcfi 2s Eo\v r:;:c.s. terr?pei'atu!-cs, pres.;i!i.c;>, ie\.els. ~ : coiiij~o~i:io:~s---31-e d ~ . o L ~ : ~ Inieasured Iz!~ ail,! auio!na:icaliy I-ccoi-dedi'c o f proci. process econo:lii; e.:,tiua;ioi:. h,Toc!crr~compurers 2nd dam l;cc~~iisition systein\ faci!itate the col!ection acd processi~gof a 2re:;t :.o?i!rne of dai;~,ofteii ;actpled \\ itil a frequency of the order of rnifiutes or ei7cnsecc)~:ds. The iisz of corilpctcrs no: on!\. ai!i;\~;sd3t3 tc: be (;b[?,irliC:iit ii si2a:e~ freqaency. hgt bas a l i o :-ei~!itcti:n thc cli;l~inalic:i~ :)f e i ~ o r spresent in ~nal-iualrecoi-dilly. This ir? ii!;c!i has srcat:y iii~p!-o\.ec! ri?e sczuracy ~ i c d \:a!iclity of process data. The incre;irii aiEi)Llpt of i!:!iirn;aiio~:. !loweve:.. ciin be cxp!oitc. represented as the sum of the contributions fro111 o types of errors--rzzrzdorll cl-rnnr a:ld gl-oss errors. The tel-in r-rrndor~z~ r r o riinplies that lieither tile ~n;ignitudenor the sigr: of ~ h ei-ro:c can be prctlicted \x,~ittl ccr!ainty. In othci- u.nrt!s. i f the Inca-

surcment is I-epeatetl uith the banle in:,irurnent under- identical process conditions, a different value may be obtained depending on the outconle of the rantiom CITOI-. Ttit' (311lypossible way these errors can be chal-acterizcd is by the use of probabiiity distr-ibntions. These errors can be caused by a number of different sources such as power supply fluctuations, nc~worktransmission and signal conlersion noise, analog input fi1te1-ing.changes in ambient conditions, and so on. Since these el-rors can arise front different sources (some of which may be beyond the control of the design engineer), they cannot be corllpletely eliminated and are always present in ally ~lieasurement.They usually correspond to the high frequency components of a measured signal. and are tlsuail~.srnall in magr~itudeexcept for some occasional spikes. On the other hand. ~ I I J S SP I . ~ J I - . Sal-c cauied by nonrandom events such 35 instrument malfullitionin_r rdue to inlproper i~istallatio~l of rncasuriny de\ ices), miscrilibratio~i.u Far- (11-cnr-I-oio11of sensors, and solid deposit<. The ~~oili-ailcio~ii na1111-c (1: ij.~i',i: r'i~oi-< iiilplies that at any gi\:eii ri!ne the); h a \ < a certain niagnirude aid 5ign \xhicii nia) be unknown. 'l'hus. if the . . n:cast!rtnteiit i \ rzpel;:i'!.! \::ti? rtii' < ; i i i i i . iri err:.)[- to thc rneahiirzd \aiile u-iii bc the S:~III<. . . 31,<<>!j~\~.iiig gocd i,?\!Liii:a:i<>~t ;ti:(l II~~;:I:ICII:II!CC !>I.OCCC!I.J~C>. xi I!, p eIl$~lI~C !h::! LT!',i\, C l . t t l i \ Ji.2 1101 ?rc\elit 11; I!-&<1 1 1 ~ ~ 5 L I l ~ ~ ~ l ~Litl ~ ! l ! S I.>, !'or- s:‘ri11e :r!~it:. (;:-o,< rii-oi\ c;:i!.\cd 17:). hellso:- i~~i\~iilib~-aiic>:i n?a> ol~curs~!dtIcnl; at ;I p;ir-tii~li:ii-tir;ii' atid 1!7el-~:ai'tcrrei~laiil:I: 2 c.-on>tai!t lei i'l o:- r~t:yr-,iri~~Ie. 0t!1~21~ sin)\< ~ I - I - O I - criu ,eb such as the \.T e21tc~~~li~~g o+ \enL;<)!.> ci11: ClcCtir ;~-;iiic!:il!>,)vc:- :i 11crio erfi!:. :!ic<~:t\,:\\ ; < I \ \ I,,' ~G\'I?sr~!;i!i\.el>i~>i!gT ~ I ~ I ;JCI-;K, ~iti.'1 11~13.$i')><<::XII\c-~;;i.ii- Ic.\\ !.I-cc~LI?I>T~! !>QI1hei1.I I I ~ ~ I I ~ ~21< L I i>'[>i~ ~ C \ <2l 11 l:i~;e~ r l i a ~!.l~ohc ~ L)C : - c i ~ : ~L:I-I-OI-~. io~~~ Eii-k11-3in nlea\i~!cd::ata i'iil lc;ici to :,igriitic:~n: dctc!-io:-atio~ii n plarit 13~1-fol-mai~ce. Sni~iilr;~;idoili,iil,J ~ I - G \ . ci-i-:)~-\ can lead to deterioratiori in [!1i' pe~-forriia~icc cf cn!!!i(-l . -!r.!:!r. \?!?i'!-e;fi!:!!-ger gross cr-ror-scan i ~ u l l i 6. gains :icliic\al>le rhi-oil,~i!p:{>cfisoprlii~i/arion.1i1 \omc crseh. err!-,I ~ < O L I \ d:iti~ c;ixl a150 d:i\e iilc p~-occ t\~c.rcSo~-c i~~ipoi-t:~r~t t o r ~ ~ l u cifc . ni)t ccjrii~~letcly cliriiiitdti.. : i c eftcct o f hot11 I-antlo111ciricj yrct:, elxor,. Sc\-er-alciat; pi~occ.;\i~tgteihitic]i~ehi,ail be used together to a c h i e ~ cthis objective. l i t t l i i re\[. ~ { di.hcr-i!>i. c ri~cthodx~ \ - t ~ i c;~rt c h play an iil111oi-t:irlt rc:le a j p:wt 01;ii1 iriiity~.:i:e~i ii31;t pr-occs\iii~strategy to reduce errcrri in l i l ? ~ i ~ ! l l ~ ~ l 1 l C111:ttie l l ~ 5 111 ~ O ! l i i ! l ~ l ~ >{)I-ll;i:<\ ~l\ iil~~U~trif?h.

Research and development i11 the area of sigrlal conditioning have led to the design of analog and digital filters which can be ~ s e dto attenuate the effect of high frequency noise in the rneasurerncr~ts.1,al-ge gross en-oreach ineasured variable separately. Thus, although these methods improve the accuracy of the measurements, they do not make use of a process rilodel and hence do not ensure consistericy of the data with respect to the interrelationships between different process \~ariablzs.Never~tieless,these teciiniqnrs rrlust be used as a first step to reduce the effect of randoin errors in the d;iw and to eliminate obvious gros:, er-rors. It is possible to further reduce the effect of randolii erroi- and ;:]so eliiliinate systematic gl-nss er-I-or-i l l the dar:i hy exploirins tiie i-ei:i:inns!iir~.; tiiat arc k;~:)wn to exi~Citzr\vce11diffel-ent \,ariahies \e ~ h caccilracy of 1iieasurclnei:~si)y r e d ~ ~ c i ~the i ? effc>ci cf r;tlictoin err-or. iri ilic dats. The priilcipal dilli.rence he~n,eitndata i - c ~ < > i ; ~ i i iariii ~ti~;; otlrer fillelin: tcch~?iq:iesi, ihnt da:a recancili:itioii cipiiciti\~rnahe. 1i.e oi' process niodi.1 constraints and obtains es:inlat.-5 of process :,ariahies hy adju:;ting ~ ~ r ( x enicasuu'-ernznts ss sc; th:it tlic. eslirnate.; x;it15f?~ :he c(>i~si~~:~iiiix. The r-?i~onciledesiinlates ar-c. ;spetted to be inore accurate tji:in tht, i ; L i ; l , ~ , - ~ , t i i tand, i i ~ more irupclrtantly. ai-e also consiaient \\'it11 the kll~~u'11 ~tlatioiisliij~s betwee~tprocess \lar-iables a s defined by the constraints. 111 order for data reconciliation to be effecti\~c,therc. sllonld hc no gros., er-ror-seithcr iri the measurerncnts or i l l the pi-ace.; ;iiodei constr;~ints. Gro';> erro~-detection is 21 companion technique to dats reco11ciliati:in .. ha\ beer1 tleveloped to identify and clinlir~iiti:21-osserrors. 1 hur, data reconciliation and gross error dctcction are applied together to irnpro\.c accuracy of rtieas~rreddata.

771.. I ~ n ~ ~ ~ ~q[Du:a r ~ u ~ tI
Data I-econciliation and gross error detection both achieve enor reduction only by exploiting the redundancy property of measurements. Typically, in any process the variables are related to each other throuyh physical constraints such as material or energy conservation laws. Given a set of such systern constraints. a minimum nurnher of error-free measurements is required in order to calculate all of the system parameters and than this ~ n i n i ~ n u mthen , variables. If there are more measure~ner~ts redundancy exists in the measurements that can be exploited. This type of redundancy is usually called sp~.,ariul reda)~danc~y and the system of equations is said to be overdetermOzed. Data reconciliation cannot be performed without spatial redundancy. With no extra measured information, the system is just detertnined and no correction to erroi;e:>us measurements is possible. Further, if fewer variables i1l;tn necessary to detennine the system are measured, tihe system is ~t/~rl~ldei~t-/tzitIC(: aiid the values of some variahies can bz estimated o i l l ~throus!? otiier nleans or if additional measurenlents are provided. A second t;.pc of redundanq rhat exists in measurements is rertzpoml -.1r11s . irii/c:ld(~)l~y. arises due ti; the fact that rneasurelnents of process ~aiiahlesr!;-c iilad,: ci~ntinuallj~ in rirne a! a s:tmpliiiz rate. producing i.,~orcd~tt:: ~i!:;i: nc.czb\ary to detei.mine a steady-slate p r o c c > ~If. rile o i I t ? t c I :! :eady stare. !!ICE tenlporii? redundanci, can Sc zxploiti.c! h), \i!l?pIy a \ e r a s i n ~tfic n:casrir?n;cnts. and a?p!>irig steady-,:ate da:a I-cconcilia:ic-rnto the a\.cragerl values. Lf rhe PI-i;cc;s stare i.; ciynalnic. hourrver. tllc evo!ution of thc process scace i s describec! by dif;--I-cn:ial c q u a t i o ~ ~co~espc,rl:fing s to mass and cilc:rsy hsisr!ces. U . ! I ~ Y ~ Il;lilc~-en[iycapture 120th the : ~ - ~ r ~ p oand r a l spatial :-c(.f~!ri~i:jrx:; C J ~ 'r~:c:i<~:rccI ;.:~iii??!es. !'<>i- ~ U C ? ?2 ~ ~ O C ~ dy~?;$n~ic S S , &:ia XCc?~icilintion;ilid grai' i'i.i.(>,.Jztectiori teci.,r:igilcs have Sr-eil dc\~elopedto ~:i)lain:ic.ctir.i!te c\tiiii:ite.< co::sistent with !he differential model eqcations of the 1>i-(>cc.s. Signal pr.occssing anti d:~!a reconciliation techniques for error reduction can be applied to industsia! processes as part oi' an integrated sirateg) referred to .is i/gttt c.otzdirii;/!irr: or c1(1/czn~rri/jcutio/i7.Figure 1-1 i1lu:;tra:-: tile varioii5 oper;rtions and the position occupied by data rzconciliation ill ilara conclitionir~gfor oil-fin- rndustlial applications.


i -




Process Data


Data Ac~i.iisition/ Historian


1 .


Applications i .___ --__ 1

1! l&ailo2i





I X!,?b!,:,on 1



v TP------~ j Opiir:li~~:~~T(j



; :f!;.;::ied1







,.%,:,:c:-::~z: . i



!, ::I<:~J,!IC:,. ' ! ~ ! ~ ! : , I C : I A .., Z.C'

Figure 1-1. Onlinz data cciiec!ion s n d conrliiionino system.

:NRrilSiR!AL EXAWIPLES OF STEADY-STATE DATA RECONC6LIATfOhi llere we wil! briefly describe two exarnpies of indclstriai ;!p~ilicatian:, c)f stead)-stsie data i-econciiistion draw11 froni DLI; exp~riencei l l o~dc'rt~ illustrate the need thr sgch 3 !ec!iniqcie and the bcneiiti ;ha: 1;3;1 be deriveti f:-i;ni it. Crude Split Optimization in a Preheat Train of



In any reiineq, the crude oil is initially heated by passing it through aj: intel-conrlccted set of heat evcllangess called the crude preheat train before beins fractionated. In a crude preheat train, typically the crudc is split into one or more parallel streanis, each of ~1l;ichis heated by passing it through a train of heat exchangers before being merged and senr to

a furnace for further heating. 'rhe process str-earns that are used for lieating the crude are the various product and pump-around stseains from a downstream atniosplleric 01- vaclluril CI-udedistillation coluinn. In order to m a x i m i ~ eenergy recovery fr-on1 these procesx sti-earilx. the optimal flows of the crude splits through the diffel-ent parallel heat exchanger trains should be deterillined online. say every few iiours. For determining the oplimal flows, the total inlet flow of ci-ude and all hot process streams along with their inlet tcmperaturcs have to be specified. Moreover, details of all heat exchangers. such as heat exchanger areas. and overall heat transfer coefficients, also have to be specified. Generally, in a crude preheat train, all the stream flows, as kvell as all internlediate tenlperatures, are measured. Thus tl~er-eare more measurements than those required for perfor~ningthe optimization. It is possible to ignore some of the measurernents and use only the measurements of inlet flows and iillet tempei-aturcs of al! st[-earns for determining the optimal crude split f l o u . Houe\.er. since a11 measurements coiitain en-ors, any optimization casriecl out using such rneiisuren~e~lt> \\;ill no! :iecessa~-ilyresult in the predicted gairih. In of-der to over-coilte [hi\. >te;dy-state reconciliation :ind gi-ois error :ietcction is applied io measlircd d a i ; ~tu elir11i;iate Incasuri:lrieiiii containi l l ? ~ I - E S CI-1-or-s ;i~;d L ) b ~ ; ~i !-~ci~ o ~ ~ ezs1i111:ite5 i l ~ ( ~ . of :ill sti-c'irTi ii<>\lx \ ;111d ~\\. ~ ~ :1!?c1 i.~?~I~:tip) l12i:111~:c;. r-eii;nciieti e;tii~i;iii.s oi' inlet i'!oibs a:l:i lc:np?T:ir~Ii-L' 01. d l siL-ea!~ls.2iiid tlie e:;ti111;i:ed O ~ : ~;i!l I lie21t\ i ~ cxclirl~lgei-:;,are u ~ c dto Ji.1~1-ri;inethc: optiril:~! x:alues of tine c!ndc s p l i l ~ ulii;h ai-e thcn impleriientcd in the [,Ian!. U:;c c;f reconcilei! csti:r~;~ie\in illc optin~izi~lion is iikely i i ~result in s c t ~ i a lenergy I-eco\ery frun-i tlie ,,,,,,ss bcii;, ,:; L. ;h; pi-cdicted optinial values. I t
tion of the cr-udc split flows is repeated. T h e measurements i~iadcduring the preceding two hours of steady-state opcratiorl arc averaged and used as data for the reconciliation problem. This example is described in greater detail in the concluding chapter on industrial case studies. Minimizing Water Consumption in Mineral Beneficiation Circuits

In a mineral beneficlation circuit, crushed ore is washed with water along with other additives i n an interconnected network of claxsifiers 01flotation cells in order to liberate the particles containing the minerals from the gangue material. In order to minimize the water consumption for a desil-ed conccntratiolt of the bcneficiated ore, the performance of the flotation cells has to be simulated for different flow conditions. Tlie simulation r~iodclin turn requires data on the feed charactzri\rics as \vcll as on parameter-s such as pulp densities. Generally, the fiours of the feed stream and pare n.:itei- 5trean.s 21-e measured. Using sarilples drawn from differ-ent streai~il.thf corrce:it:-ations of different illinerals in each stream and t!:cir puip dciihitie 2:-e alht; rrleasurecl in the 1:lboratcry. These ~ile-Llni i cf \iulei- :ti hc acidcd. I n OliZ ~ L ! i ! l exerci.\e. : I \i;:>;x,i\ib!: reduce ii;?!ri\llr;ip:i~i; b) f ; v ~J);31-C21:i.

DATA RECONC!hlATEOM PROBLEM FORMUU'IION As stated in the przcediiig s e c t i o ! ~ ~i1;lta . r.:,ro~icili,irlcx~i~lipr-i>\e\ ti~c accur-ac\ of process data by a(l.jusiirig the rne;isui-ed \ ;iii:zs \o thsr t!1c~ satisfy the procc\s conxtr:iiiits. The ;iliioui~tof acljilst~iieniiilaiie it3 ihi. ilteasureiilelits is rnininiixci siirce t l ~ cr;indoii! errc?i-sin ti;: iiiea\urc.ntcr~t~ are expected to bit sinall. I n tile gcnei-a! a s c . not ;ill \-ai-iabic\ ot' rtle process arc measured due to eco!io:nic oi- teih11ic31ii~ilit~itin~:\. Itic c>tiriiates of unn~casuredvar-iable.; :ij \veil as moilci paramcrzr-5 ar-c also obtained as part of tllc reconciliation p r o b l e n ~ .T h e c\tiinatiu:i of' ~innieasur-edvalues b ~ e t oil i tlie r-ecoi~ciled~~ieasurcci valuzs i \ also hilo\\ n 7


as da?a coctptation. In general, data reconciliation can be foniiulated by the following cofistrained weiglited leccst-squczrus optirnizafiot~problem.

subject to

The objective fuiictio~ii - 1 defines the total weighted sun] square of adjustments made t o measurements, ysllere w, are the weights, y, is the measurement and x, is t!>e reconciled estimate for variable i, and u, are :he estimates of unn~easuredvariables. Equation 1-2 defines the set of mode! consiraints. Tt:? v.,.;riri~:\w, are ciioseii depending on the accuracy of differei~tineasurcri?e!;l:.. Tiis ~riodclcan;tr;~in!s are ~cncrallyniatcrial and energy balance?. but could inclt~dcir!equnii:>, i-c!atic;ns iniposed by ieasihiliry of process oprra:ions. The deterniiiiistic riatilral lav~:,of conw-vation of niasi or eiierg) are r\~piiallyu.;ec!l I? i.o!;$:iaii~t.; for d21d recuriciliation becausi the:,' are usuzily 'Kli~\i'!i.tilipiiical 9:- :)!iier t!;pi.s of eqi~ifiio:liin:'~!:.i:l~ 111;1114. tliimca\urt'd parar-i~e:::rs a]-enot ~-ccoini!~i~i?detI, ~ I J L t!?ey : ~ ;:i-cl a! best ki~ownoiii!. appi-olj111alelj-. f'orcin. tile ~li:,;\l;y--d \ ar?ab!e:, l o obey incssct rciatlon., c,i13caasc inacc:urate data rzco;i!:i!i:itio;i aoiutinr; anti incoiri'ct gross error-tii;lg:in>ii. ,Any i ~ ~ a o: s s energy cai:.;e1-~3tia11 1 3 2x1 ~ be cxpi-essed 111 [he !;)llov;iiig geslerai ihrin j l j:

where rhe variable x, represents either the temperature or- co~nposition oT stream i. The above equation is also useful when two nr more sensors are used to measure the sarne variable. say tlow rate or ten~perature of a stream. The type of constraints that are imposed in reco~lci!iation depend un the scope of the reconciliation problein and the type of process units. Furthermore, the complexity of the solution techniques used depends strongly on the constraints imposed. For example. if we are interested in reconciling only the flow rates of all strearns, then the rnateiial baiarlces constraints are linear in the flow variables and a iitzeur- ciuta recorzciiiiltiotz prohkt?~results. 011 ;be other harid, if we wish to reconcile cornposition, {etuperaiure or pressure measurements along with i'lows, :!?en a ? ! : i l l liimir data reconciiin!io??pi-oblem occurs. An issue to be addressed is the kind of constraints t h ~ we t can legitlinately impose in a data reconciliation appl!cafion. Since data reconciliation f~ircz?:the estin~atesof all variables io satisfy t l ~ ciinposed constraints. this issue assL:ines great importzice. Usually. materir?l and energ). balance constrain:s are inclncied because :hey are \,slid ph) sicd 1av:s. !t silo~idt c noted, howcvel-. that thcic cquatiiir:s 2:-e pmeraiiy .,i~ri:ten ass!!:ning i1:ai ere is fio IOSS of male:-ia! oi. cneszy frwm r!le procc'.$ ~!ni:ti) the erivii*>nment.i4'hiie t!lis may be va!id for n~atci-iaiflow. sipnjfjcant losses i ~ cr!ergy i il?:lT occur for trxam~!efrom ii~~propcrly inscia:cct heat cschangcrs. f n such cases. it is h m e r no: to impose ?he eil.ersy balances or sltcr!tatis.eiy iiiciude an anknswn ioss tel-in if; the batr?ncc eqcation thai car1 Se ectiri;ated YS part (_ifthe i~~o:iciliz:ii.::. Oriizr than material 2nd el:erekr co~lxr\~allor? c;,ns!ain:s, ;: mi>iki of 3 i;roces.i unit ca!i contain equations involving the uriil p:rra:nsters. Fix example. a heat excilancrer model can icc!ude a raiir~gequation relaiiriz the heat duty to the overul! heat transfer ccefiiciznr, t!ic exchanger a;c;i available for heat trarlsitr, and the strez-n flows and te11iperatu;-;s. Eqcation 1-5 describes this relationship. L

The ci:lar::ii> to:- uitic!; thc abai,. ecju;rtioi: i b ~vritrei?could be ths o\,rra!I ni:i!eriai ~ I G M . . iile t7i,i&. \)I' individuaf cor;lpo!ien;s, or the flow of encl-ry. -. If there is no aicuniiilatiou of any of these quantities. then these co~lstrairitsar-e algebraic in character and define a i!c~dy-:!?": < \I,-'r \ p l-- ~ t i / > n For a dyiiamic psoces.;. tiowever. !lie accumulation trt-tns carinot be i ~ e ~ l e c t cand d the coristraiii!~are ciiil'crcntial equations. For rlrost process units. there is no giinc~-ationor depletion O I material. Jn the case of rcactors. though. the gerlera~io~~ 01- deplctii,n of individual com~,oiicntidue !o reaction should he taken into account. For s a n e siiriple units siich as splitters, there is nci change either i n the co~npositionor ~enlpe:-atilreof strearns. For such units. the component aaci ellei-gy hala~~ces i-ediici. r o ;: sirnpli. t;)rin sacl: cib Ax'L.-


where Q is tile ties: duty, II is the ov~rafiheat transfer corfficicrit. A is the exchanger area, sad AT,, is the logarithnlic mean tempemtare difference. Should this equation be i!icluded as a constraint whcn appiying data recor;ci!iation t:t"""e ininvolving heat exchangers'! Generall~:since

the o\lerall heat transfer coefficient is unknown and has to be estimated from the measured data, this equation may be included and U estirxiated as part of the reconciliation problem. If there is no prior information abotrt U,however, and no feasibility restrictions on it, then inclusion of this constraint does not provide any additional information and estimates of all other variables will be the same regardless of whether this constraint is included or not. Thus, the data reconciliation problem can as well be solved without this constraint and U can subsequently be estimated by the above equation using the reconciled values of flows and temperatures. On the other hand, if U has to be within specified bounds or if there is a good estimate for U from a previous reconciliation exercise (as in the c ~ u d epreheat train example discussed in the previous section, where the esti~ratesof U from the reconciliation solution of the most recent time per-iod can be used as good a pt-iol-i estimates), then the constraint should he included along with the additional information about U as part of the reconciliation problem. The overall heat transfer coefficient can also be related to the physical properties of the streams. their flows, temperatures and the heat exhangel- characte~isticsusing c o ~ ~ e l a t i o r!t~ sis. not aiivisable to use srich a:equatioi~in ths. reconcili:l:ion model since the cc~n-elatio~~ thelnseives s can be quite en-or!eot~sand forcin: the f OV,S arld te~lipera:ui-cs to fii this equation mi): i;:cr-rase the inaccuracy oi'the cs::n?a!es. Ai:othe;. inlportaii! qucstior~is \iit<:hzr to perfol-11; rcconciiia~ioii~ s i i ~ g :I .S~~C/ti\'-~fc~te as a n'yilnilzic. ~ l l o i ! c !of the process. Praciical!y, a p:-ect-ss is :le\.ei truly a: :I steady staie. 14oi~.ever,a plant is norilia!iy opera:etl for several f:ours or days i n a rzgic\n around a nt~mina!steady-state operating 1ioir:t. For :ipplicatioil~such a ofi!ii:e :)ptimizaiion (as in ;he case of :i cnld~ :;[)lit optiri1iz:ltiofi exarnplr) where ;cconcili:i!ioi! is performed once ?\cry few hours. i t is appropri:i!e to e~nploysteady-s!atc reconciliatio!~ on nieasr!remalts averitged over the time period of interest. During translent ccjndition (such as during a changeover t:) a nei4. crude :ype in a refinery) wher? the departtire frorn steady state is significant. sready-stale reconciliation should not be applic~lSecr:x:;e i: -.::!! I-esult i n large adjustments to ~iicasnrcdvalues. M c a s t , - ~ ~ - ~ etaken r t . durin: such transient periods can be rccorlciled. if necessary. using a dynam1:, inodel of the ~, $or process coritrol applications where reconcili zition needs to be pzrfornled every few niir;utes. dynarrlic data ireconciliation ir; appropriate. Data reconciliation is based (.w tlie assumption that only random error-s LII-e prcscnt in the measurements which follow a normal (Gaussian) discri-

butio11. If a gross error due to a measurelllent bias is present in some measurement or if a significant process leak is present wliicl~hris not been accounted for in the model constraints, then the reconciled data may be very inaccurate. It is therefore necessary to identify and remove such gross errors. This is known as the gross en-or detectiolz j n - o D I ~ n i . Gross errors can be detected based on the extent to which the measurements violate the constraints or on the magnitude of the adjustments made to measurements in a preliminary data reconciliation. Although gross error detection techniques were developed primarily to improve the accuracy of reconciled estimates, they are also useful in identifying measuring instruments that need to be replaced or recalibrated.

EXAMPLES O F SIMPLE RECONClLlATlON PROBLEMS In order to obtain a good understanding of t!le issues and underlying assunlptions in data reconciliation. some of the simplest possihie cases are introduced here. We assume a process operating a: a stead!! state. consti-ained by a sei of iiiiear equations. Systems With All Measured Va~icrbies

Let us first consider t!lc sirnplest tir~tareconciliation pi-oblc~n:tire r-eci,!iation of the streain flcv, are expi:cted to be 111ol-eaccurare than the ~neasurements.Although the problerr, considered here is siiiiple, ir does have important indiistrial appiicatic~nsin accurate accoiliiting far- tlie material flows as. for example, in a lube blending piact. in t!le strtaln ; ~ n i i water distribution subsysleni of a p!ant, or ill a complete r e f i ~ i e ~ ~ .

Example 1-1 Let us consider a si~np!eprocess of a heat exchanger- witii a bypass as shown in Figurc 1-2. Let us also igrrorc the energy flows of t!~is process and focus only on the mass flows. It is assumed that the flo\vs of' all six strea~nsof this process are n~easur-etiand that thcse measurements con-

The Imnportance 1.f Dora Kcconciliu(ion und Gross Error I)etecriori Data Ke~~~nriliutinn and Gross Error Defection

Figure 1-2. Heat exchanger system with bypass.

tain random errors. If we denote the true value of the flow of stream i by the variable xi and the corresponding measured value by yi, then we can relate them by the following equations yi = xi + E ;

i = 1 ... 6



(I - 6)


xj = 0



(1 - 9)

where :he weight:, wi are chosen to reflcct the accuracy of the respective measurements. More accurate ~neasurernectsare given larger weights in order to force their aajustn~entsto be as small as possible. Gznerally, it is assumed that the error vxiances for all the measurements are known and that the weighs are choscn to be the invcrse of these variances. --. 1 nc re con cilia ti or^ problem is thus a constrained optimization problem

(1 - Saj


The measured values (given in Table 1-1) d o r,ot satisfy the above equatiocz, sic:- t'.~;. rcrt-iy randerr? zrr2rs !r is desired to derive estim2tes of the flows that satisfy the above flow balances. Intuitively, we can impose the condition that the differences between the measured and estimated flows, also referred to as adjustments, should be as small as possible. As a first choice, we can represent this objective as

The above function is the familiar least-squares criterion used in regression. Since it is immaterial whether the adjustments are positive or negative, the square of the adjustment is minimized. Although other types of criteria may be used such as minimizing the sum of absolute adjustment, they do not have a statistical basis and also make the solution of the problem more difficult. The least-squares criterion is acceptable, if all measurements are equally accurate. The adjustment made to one measurement is given the same ~mportanceas any other. In practice, however, it is likely that some measurements are more accurate than others depending on the instrument being used and the process environment under which it operates. In order to acccunt for this, we can use a weighted least-squares objective as a more general criterion, given by ~ 4 i n z w , ( -~x i, ) '

where E, is the random error in measurement y,. The flow balances around the splitter, exchanger, valve. and miser can be written as

x i - x2


with the objective functioil given by Equation 1-9 and the constrain~s given by Equations 1-?a through 1-7d. The solution of this optirnizatior; prcblein can be obtained analytically for flow reconciliation. Table 1-1 shows the true, measured, and reconciled flows for the prGcess of Figure 1-2. The reconciled flows shown in column fm1r nf this t&!e are okclned by assu~ningthat all measurements are equally accurate (weights are all equal). It can be easily verified that while the measured values do not satisfy the flow balances, Equations 1-7a through 1-7d, the reconciled flows satisfy then?.

The Imnporlance of Duru Reconciliation ur:d Gloss Error Detecriorl

Table 1-1 Flow Reconciliation for a Completely Measured Process Stream Number

True Flow Values

Measured Flow Values

Min xl~x2>XS>x6

wl(yl -x112 +w2(y2 -x212 +WS(YS-xs12

Reconciled Flow Values

Since the unmeasured variables are present only in the constraint set. the simplest strategy for solving the problem is to eliminate them from the constraints. This will not affect the objective function since it does not involve unmeasured variables. Variable x3 can be eliminated by combining Equations I-7a and 1-7c, while variable x4 can be eliminated by combining Equations 1-7b and 1-7d. Thus, we obtain a reduced set of constraints which involves only measured variables. Systems With Unmeasured Variables

In the previous example, we have assumed that all variables are measured However, usually only a sub\et of the variables are measured The presence of unmeasured var~ablesnot only co~npl~cates the problem solutlon, but also ~ n t r o d u c enew ~ questions such as whether an unmeasured ariable can be ect~mated,or whether d meaxured varldble can be reconclled as Illustrated b> the following example

The reduced data reconciliation problem is now to minimize 1-11 subject to the constraints of Equations 1-12a and 1-12b. It can be observed that this reduced problem involving the variables x i , x2, x5, and x6 is simi!ar to the completely measured case, and an analytical solution can be used co obtain the reconciled values of the measured variables. Using {hi. same measured values for x,, x-,, x5, and x6 as given in Table 1-1, and assurni~gall measuremects to be equaily accurate, the reconc~ledi.a!uec whicn are obiained are shown in Tabie 1-2 In the col~imnmder Case !. Once the reconciied values f ~ the r measured variables ere okained, the estirxates of the uilmeasured variab!es can be calculated usil:g the original constraints. Thus the estimate of x4 is equal tc~that of x2, and the estirnate of xi is equal to that of xs. These values are also indicated in Table 1-2. By comparing with the resu!ts of Table 1-1, it can be observed that since There are fewer measured variables in this case, the estimates of some variables are less accurate than those derived for the completely measured system. The central idea that is gained from this case is that the reconciliation problem can be split or decomposed inro subproble~ns-the first being a reduced reconciliation problem involving only measured variables, followed by an estimation or coaptation problem for calculating the estiniates of anmeasured variables.

Let us consider the flow reconciliation problem of the simple piocess shown in Figure :-2. However, we will cot assume that all the flcws are rneasured as b e f ~ r r Instead, . we will assume that only selective fiu\vs are measured and in each case discltss the issues and prob!ems invclved in partia!ly measured systems. Case I . Flows of streams i , 2, 5, a n d 6 are measured, while the other two stream flows are unmeasureci.

The object~vein this case is to not only reconcile the measured flows. but also to estimate all ?he ~~nrr~r:icl~red flows as part of the reconciliation problem. As in Equation 1-6. we relate the measured and true stream flows.

'The constraints are still given by Equation 1-7. It should be noted that the constraints involve both measured and unnleasured flow variables. The objective function is the weighted sum of squares of adjustments made t c ~1ne:rsured variables. and is given by

. l

Tlre Irnpor-imzcr ofDu(a Rrro~rciliuriono~zdGI-osrError Detecr;on

Table 1-2 Flow Reconciliation of Partially Measured Process Reconciled Flow Values


Case 1-Streams 3 and 4 unmeasured

Case 2-Streams 3 , 4 , 5 , 6 unmeasured

Case 3-Streams 2,3,4,5 unmeasured

1 2 3 4 5 6

100.49 64.25 36.24 64.25 36.24 100.49

101.91 64.45 37.46 64.45 37.46 101.91

100.39 -


These unmeasured variables are also known as observable. A formal definition of the concepts of observability and redundancy is given in Chapter 3. It is sufficient at present to note that while the partially measured process in Case 1 gives a redundant and observable system, Case 2 gives rise to a nonredundant, observable system. Case 3. Only flows of streams I a n d 6 are measured.

The reduced reconciliation problem we obtain for this case is




Case 2. Only flows of streams 7 a n d i 0, measured

In :his case, only Equations I-7a and 1-7b contain measured variables and are useful in the reconciliation problem. The objective f~inctionis set ~ 1 as 7 before to minimize the adjustment made to measured variables and is ~ i v e nby

As in Case I . we ti-y tc elin~inatethe un~rieasaiedvariables from the cc)nstr;,ints 1-7a arid 1-7b. Our attempt tc produce an equation involvirlg (mly ~neasuredvariables by suitably combining :he originai constraints ends in failfire. Thus, the reccncijiaiion problem we obtain is to minimize 1-1 -3 \vithout any constraints. It is immediately obvious that the best estirnarzs of x i anti x2 3ri given by their respective nleasured values which resclts in t.he least adjgstmcrtt of zero for 1-13. The estinlates of the unmsasurcc! variables can now be calcuiated using the constraints. The estimate of x6 is equal to x,. the eslimatc of x4 is equal to x2, and the estimates of x3 and xS are both equal to the difference between x, and x2. 'These values are a11 given in Table 1-2 under Case 2. Two irnpo17ant observations can be made in this case. First, no adjust111crttis made to the two measured variables x,, and x2. This is due to the fact that there is no additional information in the fonn of conssaints that relttte only the measured variables which can be exploited for adjusting their measurements. Such measured variables are also known as noni-eduildnnt val-inbles. Second, a unique estimate for every unmeasured variable is obtained using the constraints and estimates of meaxured variables.

Min wl(yl -x112 + ~ g ( y -6 ~ g ) ~ ,x6

(1- 14)


such that:

Equation 1-15 is obtained by adding all the constraints 1-7a through 1-7d. Assuming that the measurements of xi and Q are equally accurate, their reconciled values obtained are given in Table 1-2 under Case 3. We now attempt to calculate the estimates of the remaining four variables. We wit1 not be successful, however, in obtaining unique estimates for these variables. In other words, :here are many soiutions-in fact, an infinite nurnher-which can satisfy the constraints. For example, one possible solution is to take the estimates of x; and x5 to he both equal to that of x,, and the esti~rkatesof x2 and x4 to be equal to zero. Alternatively, we can choose thc estimates of x2 and x4 :o 5e equal to that of x,,while the estimates of xa and x5 are chosen tc be zero. Without additional info-mation, there is no way of det-erminicg which of these myriad possible solutions is rnorc: accurate. The variables .x2. x3, x4. and x2 are denoted as unabservablc. in rhis case. An interesting feature of this case is thzt thoilgh there are some cnmeaslired variables which cannot be u~iquelyestimated, reconciliation of the variabies x, arid x6 can still be performed utilizing the available measurements. Therefore, Case 3 is a redundant, unobservable system. System Containing Gross Errors

In all the cases considered in Example 1, the measurements did not contain any systematic error or bias. In such cases, data reconciliation does reduce the error in measurements. We wi!l now examine the case when one of the measurements contains a systematic bias or gross error

and demonstrate the need to perform gross error detection along with data reconciliation.

Example 1-3 W e reconsider the flow process shown in Figure 1-2 for which the true stream flows are as given in Table 1-1. W e will assume that all flows are measured with measurements as given in Table 1- I, except that the measurement o f stream 2 contains a positive bias o f 4 units, so that its measured value is 68.45 instead o f 64.45. As before, we reconcile these measurements and obtain estimates which are shown in Table 1-3, in column 2, when all the measurements are used. A comparison o f these estimates with those listed in Table I - I , clearly shows that the accuracy o f the estimates has decreased due to the presence o f the gross error. Furthermore, although only the flow measurement o f stream 2 contains a gross error, the accuracy o f all the flow estimates has decreased. This is known as a smeariilg effect and it occurs due to reconciliation which exploits the spatial constraint rela'.'ions between differentvariables. In order for data reconciliation to be is therefore necessary to identify those measurements c~:itaininggross errors and either eliminate them or make appropriate con:l;enxation. The iast colurnn o f Table 1-3 shows the reconciled esti~natesobtained when the flow measurement o f strearn 2 is discarded acd not ussd ir: the reconci!iati~nprocess. Clearly. ?he accuracy o f the reconciled estimates has improved cocsiderably. even though Be recfundaricy has ciecrzased by discardins thz mzzsurement. Table 1-3

Flow Reconciliation When Stream 2 Flew Measurement Contains a Grass Error Recanciled Flow Values Stream

All measurements used

Stream 2 measurement elimii~ated


1 2

3 4 5 6 --

100.89 65.83 35.05 65.83 35.05 100.89


100.23 64.53 35.71 64.53 35.71 100.23 --

Thus far, we have not considered the important question o f how to identify the measurement contaiuing a gross error based only on the knowledge o f the measured values and constraint relations between variables. There are several ways o f tackling this problenl and in this example we illustrate one approach. Given a set o f measurements, we can initially reconcile them assuming that there are no gross errors in the data. In the flow process example considered here, the reconciled estimates obtained under this assumption have already been shown in the second colu~mlo f Table 1-3. From these reconciled estimates we can compute the differences between the measured and reconciled values (measurement adjustments) for all measured variables, and these are shown in Table 1-4. Table 1-4 Measurement Adjustments for Flow Process Stream

Measurement adjustments

I f the c~fistraintsare linear as in t!lis example, the expected variar;ce of the adjustments can be analytically derived which will be a function of the constraint matrix and !he measurement error variances. For the flow process example considered here, it cai; be shown :hat the standard drviation o f measurernent adjustments for every variable is 0.8165. A simple statistical test can be applied to determice i f the computed measurement aliustments fail within a confidence interval, say within a 220 interval. In this example, the 220 interval (95% confidence in~erval)is [-1.6 I .6]. From Table 1-4, we can observe that the measurement adjustments f ~ r the flows o f streams 2, 4, and 6 fall outside this interval and as a t~rstcut the measurements o f these streams can be suspected o f contai~~ing a gross error. Among these the measureme~itadjustment o f stream 2 has the largest magnitude and can be identified to contain a gross error. After discarding the measurement o f stream 2, we can again reconcile the data and compute the measurement adjustments to examine i f any more gross errors are present.

The procedure used above is a sequential procedure for gross error detection and makes use of the statistical test known as the rrreasurenzeizt test. A variety of statistical tests and methods for identifying one or more gross errors have been developed and are described in Chapters 7 and 8. Although in this example we have only considered a gross error in measurements, it is possible for a gross error to be present in the constraints due to an unaccounted leak or loss of material. Some of the methods described in Chapters 7 and 8 can also be used to identify such gross errors. The example also clearly demonstrates that data reconciliation and gross error detection have to be applied together for obtaining accurate estimates.

BENEFITS F R O M DATA RECONCILIATION A N D GROSS ERROR DETECTION Development of a data reconciliation and gross error detection package for a system and its practical implementation is a difficult and costly task and cannot be justified without its benefits for a particular industrial application. The justiiication for data reconciliation and gross error detection nlay come from the many iiiy~ortantapplications for improving Frocess performance shewn in Figure 1- 1 which requires accurate data for achieving expected benefits a:; outlir?ed below: i . A direct application of data reconciliation is in evaluating process yieids or i n assessing consumption of utilities in different process cnits. Keconciied values provide more accsrate estimates as compared to :he use of r2w nieastlremen!s. f;cr example, refinery-wide material balance reccnciliatior? aids in :t Getter estimate of overail refinery yields. Similarly. a plant-wide energy audit using reconciled flows and temperatures hetps in a better identification of energy inefficient processes and equipment. 2. Applications such as simulation ana opti~nizationof existicg process equipment rely on a model of the equipsilent. These models usually contain parameters which hatie to be estimated from plant data. This is also known as model tunillg, for which accurate data is essential. 'The use o f erroneous measurements in model tuning can give rise to incorrect model paranleters which can nullify the benefits achievable through optimization. There are two possible ways in which data reconciliation can be used for such applications which we illustrate using a simple example.

Let us consider the problem of optimizing the performance of an existing distillatior column. From the operating data, measurements of flows, temperaturzs and compositions of all inlet and outlet streams of the column can be obtained. One possible way is to reconcile these measurements using only overall material and energy balances around the column. The reconciled data can now be used along with a detailed tray to tray model of the column in order to estimate parameters such as tray efficiencies. The tuned model can then be used to optimize the performance of the column. Alternatively, a simultaneous data reconciliation and parameter estimation can be performed using the detailed tray-to-tray model of the column. In this case, if measurements of tray temperatures and/or compcsitions are available, they can also be used and reconciled as part of the proolem. Obviously, the second approach leads to a significant increase in effort and computation time. This approach is also referred to as rigorous on-line modeling and has been incorporated in many commercial steady-state simulators. 3. Data reconciliation can be very useful in scheduling maintenance of process equipment. Reconciled data can be used to accurately estimate key performance parameters of process equipment. For ewam~de.heat transfer coefficient of heat exchangers or the level of catalyst activity in reac!ors can be estimated and used to determine whether- ihe heat exchanper - should be cleaned or whcther the c a t a l y t sbculd be replacedJregenerated, respectively. 4. Many advanced control strategies such as rr,odel-based control or inferential contrci require accurate estimates of controlled vzriables. Dynamic data reconci!iatior, techciques can be used to derive accurate esGmates for better process coilttol. 5. Gross error detection not only improves the estimation accuracy of data reconciliation procedures but is also u s e f ~ lin identifj'ing instrw mentation problems which require special maintenance and correction. Incipient detection of gross errcrs can reduce maintenar,:, cost; a;;< provide a smoother plant operation. These methods can -.1~0 b? pxtended to detect faulty equipment.

A BRIEF HISTORY OF DATA RECONClLlATlON A N D GROSS ERROR DETECTION The problem of data reconciliati011was first introduced in 1961 and during the past four decades more than 200 research publications in the


Ilirra K<,r-r~~iciiiu!ior~ ~ I I Gross I ~ Err-or-Llrtt'i.rioli

two areas of data reconciliation and gross error detection have appeared. Our purpose in this section is to trace some of the significant contributions that spurred developments in these two areas. Interestingly, the problem of data reconciliation was first posed by Kuehn and Davidson [ 2 ]who were then working in the systems engineering division of IBM Corporation. They derived the analytical solution for a linear material balance problem for the case when all variables are measured. In a series of papers between 1968 and 1976 [3, 41 several important ideas in data reconciliation and the optimal selection of measurements, particularly in linear processes, were introduced. These included the treatment of unmeasured variables, and the decomposition of the reconciliation and coaptation problems using a graph-theoretic approach. The key concepts of obser-vability and redundancy were also introduced in these papcrs. The classic paper by Mah et al. in 1976 [ 5 ] also treated the general linear data reconciliation problem including estimation of untneasured variables. The interrelationship between linear algebraic and graph theoretic approaches were brought oct in this paper. More importantly, the paper clearly demonstrated through sinlulation of a refinery process that data reconciliation does substantially iriiprove accuracy especially \\ahen sufficient redundancy exists in the measurements. The problem of detecting gross errors callsed by meastirement biases and process leaks tackled in this work. The nexi major contribuiion was the concept of a pi-ojectior: matrix introduced by Crowe ec al. [Ci]. These authors decomposed the reconciliation aild coaptatior, problenls by using a projection matrix to eliininate the unmeasuied variables. This approach is more general and can be used even if sotlie of the unrneesared variab!es are unobservable. The ti:e of the Q R factorization ir. obtaining tne projectio~imaiI1x and in the solution of unn:easured variables was proposed by Swartz [7] and more recently by Sanchez and Rornagnoli IS]. Data reconciiiation for nonlinear processes was first addressed by Knepper and Gorman [9j who used !he iterative technique rmpcsr.! h-x,r Britt and Luecke ! 101 for parameter estimation in non1ine:ir regression. Their approach has some limitations as compare3 to the approach of successive linearization and use of projection matrix to solve the linearized subproblem proposed by Pai and Fisher 1 1 I]. In general, to solve the nonlinear data recoi~ciliatio~l problem which involves bounds and other inequality constraints, a constrained nonlinear optimization method has to be used. Tjoa and Biegler [12] made use of succes.rive quadralic PI-0-

grar7znzirlg (SQP) for solving a combined data reconciliation and gross error detection problem. as did Ravikumar et al. [13]. In parallel, methods for steady-state data reconciliation were being developed in the mineral processing area. One of the earliest applications of data reconciliation to a mineral processing circuit was published by Wiegel [14]. A representative sample of publications in this field are by Hodouin and Everell 1151, Simpson et al. 1161, and Heraud et al. [17], among others. A survey of computer packages for material balancing in mineral processing industries was published by Reid et al. [18]. The problem of data reconciliation in dynarnic processes has received attention only recently, although it was first tackled using an extended Kahnan filter by Stanley and Mah 1191 who used a simple random walk model to describe the process dynamics. Alrnasy [20] used steady-state reconciliation techniques for dynamic balancing of a linear time invariant dynamic model of the process by considering the equivalent discrete input-output formulation. For a linear dynamic system, the optimal estimates are obtained using a Kalman filter wnich, however. cannot handle inequality constraints. Dynamic data reconciliation has only recently been extended to nonlinear, constrained problems. Liebnlan et al. [21] have transfoi-ined the system of differential-algebraic equaciofis describing a dynamic mode! into a scandard r~onlir7c~czrpl-ogrcin; (NLP) and recoi~ciledthe data using constrained noalinear optin~izationmethods. As compared to steady-state reconciliation which is increasingly being applied to industrial processes, it may take a few more years of developnient before dynamic data reccinciliatior. is also ready and commercially available for industrial applications. Within a f e v ~yeai-s after Kuehn and D2vidson's paper on data reconciliation sppeared. the probleni of identifying gi-oss errors in data and its importance in data reconciliation was pointed cut by Ripps [22j. Kipps also proposed the procedure of measurement eli~ninatiorzas a technique for identifying the measurement containing a bias. This has now become one of the s:?~?c!zrcf cfrztezies in nlult_ip!e gross error identification. Although statistical tects for gross en-or detection were proposed by Reilly and Carpani [23j as early as 1963, they did not attract much attention since they were presented in a conference paper. The global rest and n:easrtl-enle~zltest were proposed by Alrnasy and Sztano 1241 in 1975 and the r7odcl or corzstl-aifzttest by Mah et al. [5] a year later. More than a decade later, the generalized likelilzood r~ltio(GLR) resl was proposed by Narasiinhan and Mah [25], the Bclyr.sicln test by Tamllanc ct a]. G

Tlie Irrrj~urtunceof Dofu Reconciliutiun and Gross Error Det~cfiori

[261, and more recently the principul comnponent test by Tong and Crowe [27]. Although, strategies for identifying the location of one or more gross errors were developed by Mah et al. 151 and Romagnoli and Stephanopou10s [28], a variety of serial elimination strategies using one or more statistical tests for multiple gross error identification were developed by Serth and Heenan 1291 and Rosenberg et al. 1301. More importantly, they also compared the performance of these strategies through simulation for determining the best among them. The method of simulation and its use in evaluating performance of gross error detection tests and strategies was first clearly explained by Jordache et al. [31]. Different measures for evaluating the performance of gross error detection strategies were also introduced in the above three papers. A different strategy called the serial compensation strategy for multiple gross error identification was proposed by Narasinlhan and Mah [25]. Simultaneous strategies for multiple gross error detection have also been proposed by Rosenberg et al. 1301 and more recently by Rollins and Davis [32]. The investigation for detennining the best gross error detection method and for improving its performance is still being pursued. Applications of data reconciliation to single process units either in the laboratory cr in an opeiating plan: were reported by Murthy 1331, Madron et al. 1341, Wdng and Stephanopoulos [35], Crowe 1361. Sheel and Crclwe /07]. among others who applied it to reactors, and by MacDonald and Ifowat i3Sj to 3 nnnequilibrium isotkermzl flash unit. Keconci!iation of flows in indilstria! processes weye reported by Mah et al. [5j and Serth and Heenan [39j, thoagh it is not clear whether these were implemzntcd in acfxal przctice. Applications of data reconciliation to actual icdustrial processes were reported by Ravikomar et al.1131 and many other papers mentioned Ir: Chapter 1 1. Development cfcornmercial software for industrial app:ication:; of data reconciliation and gross error detection began ir? the late 1980s. C ~xce!lent . reviews of data reconciliation and gross error detection have been written at regular intervals by Hlavacek [39], Mah 1401, Tamhane and Yah !4!j, ?n.h jA2], 22d rt-cent!;. 5;. Crcye 1431. The book by Mah 1441 contains a chapter nq this topic a~ does the book by Bodington [451. Currently the only book wholly devoted to this area is by hladron 1461, which has been revised and expanded recently by Ververka and Madron 1471.

SCOPE AND ORGANIZATION OF THE BOOK This book provides a sumniarized analysis of the various approaches to data reconciliation arid gross error detection. Certain criteria for selecr-


ing various techniques and guidelines for their practical implementation are also indicated. In Chapter 1, we have presented the need for data conditioring in process monitoring. Various signal processing and error reduction techniques were briefly mentioned. Data reconciliation which provides a model based error analysis and correction was introduced and illustrated by a simple example. Major concepts in data reconciliation, such as redundancy and observability, were also defined. Chapter 2 introduces the statistical characterization of measurement errors and various univariate error reduction techniques. Data filtering, which is widely used for data conditioning, is described in more detail. Various filtering techniques are presented and compared. Proceeding from Chapter 3 onward, the material is presented in increasing level of complexity. Chapter 3 describes the problem of steady-state linear reconciliation. Both theoretical and computational issues related to the linear data reconciliation are elucidated. Decomposition techniques for linear models with both measured and unmeasured variables are described here. Observability and redundancy are important issues for this case. Variabie classification techniques related to the observability and redundancy coacepts are therefore presented. Both graph-theoretical and matrix-based approaches are described. Chapter 4 deals with steady-state data reconciliaticn fcr bilinear systems. Bi!inear constraints, such as component materiai balarices and csrtain heat balance equations occur frequently in many industrial reconciliator, applications. Bilinear equations contain terms that are products of mi; random variables. Specidized reconciliation s~lutionmethcds have been proposed fcr bilinear constraints. This chapter preients sorne of them along with tthzir zssociated benefits and shortcomiiigs. Chapter 5 rreats nonlinear data reconciliation. Ncnlinear mode!s often ased to accurately describe most chemical processes. Various techniques used for solvirlg the nonlinear reconciliation problem are disC U S S C ~ .Some are bused clil ~ u c L ~ a a i v cl i;~ ~ e a ~ - i ~ i ~wihi ti t, :l lothers , are derived from gencrz! aoclinear prograrr,~ingtechniques. The most effcient and widely used solution methods are presented in this chapter. Decomposition techniques for nonlinear problems are also analyzed. Inequality constraints such as bounds on variables are often imposed with nonlinear models in order to obtain a feasible solution. The treatment of inequality constraints is finally analyzed in this chapter. In the previou:; chapters only steady-state processes are considered. Data reconciliation techniques for dynamic systems are discussed in

Tlrc It~r/~orfat~re of Uara Reco~~cili~zriotr atld Gross b-rrur-Detection

Chapter 6. The reconciliation problem for a linear dynanlic process becomes a state estimation problern which can be solved via Kalman filtering methods. General optimization techniques have to be used for dynamic nonlinear problems which are described as part of nonlinear dynamic data reconciliation techniques. While data reconciliation attempts to eliminate inaccuracies caused by randorn errors in measurements, gross error detection deals with the identification and I-ernoval of systematic biases in measurements and leaks. Chapter 7 introduces the issues involved in gross error detection and describes the basic statistical tests that can be used to detect gross errors. The underlying assumptions, characteristics and relative advanta,aes and disadvantages of various statistical tests are also discussed. Interaction between gross error detection and data reconciliation is also highlighted. In any industrial application using process data, it is very important to identify all gross errors, so they can be removed or appropriateiy accounted for. None of the statistical tests described in the previous chapter provides satisfactol-y gross e11-oridentificatio:~for all practical scenarios and more complex stratezies are required. Chapter 8 describes some of ti15 most successful such strzitegies. The applicability of these methods to nonlifiear processes is fur~l?erdiscussed. Finally. the eifect of Scunds or inequality constraints on gi-o:;serror detzctio:~is analyzed. The previous two chapters describe gross error detection methods far stzady-state processes. Chapter Y treats the gross error identification for dynamic systems. The dynamic feature of a prscess introduces new i s s ~ e ssuch as combining information from measurements co!lected over a pa-iod of :jme and on-line in~plzmenta:ic?n. The cfficacy of data reccnciliation and gross error detection depends ci_cnificar;tly upor, location of measured variable:;. Recent attempts to optima!ly design the sensor network for maximizing accuracy of data reconci!iation solution and a: minimum cost are described in Chapter 10. Several large scale inr1lrc!ria! app!icatior?~2nd existent software sys-tenls for data reconciliation and gross errol- detection are discussed in Chapter 11. Various aspects such as the context of the industrial application, the problems associated with each typc of application and the methods llsed to solve them are analyzed i n this last chapter.

SUMMARY Measurement errors occur frequently in process instrumentation. Some errors are s~nalland random (random errors), others are large and systerilatic (gross errors). Data validation and data filtering are used to reduce the errors in process data. Filtered data, however, usually do not satisfy the plant model. Data reconciliation exploits redundancy in process data in order to determine the necessary measurement adjustments used to create a set of data consistent with the plant model. No data reconciliation is possible without data redundancy (more measurements are available than the min~rr~uln needed to solve the simulation problem). Data reconciliation solution obtained from data with gross errors is not reliable because a large error spreads over other variables, causing unreasonable ciata adjustments. Data reconciliation and gross error detection are closely interrelated. They need to be implrrnented together in crder to obtain a reliable data recocriliation. Scatistical tesis arc useful tools Cor gross error dctection. 1,ocaticn of instrumentation is important for both data reconciliation and gross emor detection. An optimal senior placement car? be predetermined. U~~neasul-ed variables and rngdel parameters can tte estimated by data reconcilistion, providing that enough ~neasurcddata is avaiiable in order to make them observable. Oniy natural material and energy conservation laws are acceptable for plant models used in data reconciliation. Con-elations c;r approxin~aterelaticns among process variables are no! recommended, since they introduce additionai sources of enor. Data recoilciliation can be applied to both steady-state and dynamic processes.


flula Kccurlciiinlio~~ ntirf Gross Error flef(~cfion



14. Wiegel, R. I. "Advances in Mineral Processing Material Balances." Catzad. Metall. Q. 11 (1972): 41 3-424.

1. Reklaitis, G. V. Introduction to Material and Energy Balances. New York: John Wiley & Sons, 1983

15. Hodouin, D., and M. D. Everell. "A Hierarchical PI-ocedure for Adjustment and Material Balancing of Mineral Processes Data." h7t. J. Miner. Proc. 7 (1980): 91-1 16.

2. Kuehn, D. R., and Davidson, H. "Computer Control. 11. Mathematics of Control." Chem. Eng. Progress 57 ( 1 96 1) :44-47. 3. Vaclavek, V. "Studies on System Engineering. I. On the Application of the Calculus of Observations in Calculations of Chemical Engineering Balances." Coll. Czech. Chem. Commun. 34 (1968): 3653. 4. Vaclavek, V., and Loucka, M. "Selection of Measurements Necessary to Achieve Multicomponent Mass Balances in Chemical Plant." Chem. Eng. Sci. 31 (1976): 1199-1205.

5. Mah, R.S.H., G. M. S:wley. and D.W. Downing. "Reconciliation and Rectification of Process Flow and Inventory Data." Ind. & Eng. Chein. Proc. Des. Dev. 15 (1 976): 175-183.

6 Crowe. C. M., Y.k G. Campos. and A. Hrymak. "Reconciliation of Process Flow Rates by Matrix Projection I: Linear Case." P.lChE Journal29 (1983): 881-888. 7. Swartz. C.L.E. "Data Reconciliation for Generalized Flowsheet Applications."Arnerican Chelnical Society ?4ztional Meeting, Dallas, Tex. (1989).

e. Sanchez, M., and J. Romagnoli. "Use of Orthogolial

Transformations in Data C1assifica:ion-Recc.ncilia~ion."Co171purerrClienz. Etzgtig. 20 (1996): 1.83493.

9. Geppe;, J. C., and J. W. Gorii:~n. "Siatisiical Analysis of Coristrained Dsta

S . ~ t s .AlChE '~ Joui-i~a/26 (1980): 260-254. 10. Briit, F1. J., and R. 1-1. Lueche. "The Estimatiorl of Parzmeters ir. Nonlinex 1.1n1il;clt : . ~Models."-li.chtza~trctr-ics 15 (1973): 233-247. i

:. Pai. C.C.D., and G. D. Fisher. "Application of Bsoyden's Method to Reconciliation of Nonlinearly Constrained Data." AIChE Journal 34 (1988): 873-876.

12. I'joa, 1. B., and L. T. Riegler. "Simultaneou:; Srracegies for Data Reconci!iation and Gross Error Detection of Nonlinear Systems." Coinputers Cheni. Engt7g. I5 < 1091): 679490. 1.3. Ravikumar. V., S. R. Singh. 141.0.Garg, and S . Narasimhrm. "RAGE-A Soft\\.are Tool for Data Reconciliation and Gross Error Detection," in Foirndariotls qf Cony7utc.r-Aided PI-ocessOj>erations(edited by D.W.T. Kippin, J. C. Hale,

and J. F. Davis). Amsterdam: CACHE/Elscvier, 1994,429-436.

16. Simpson, D. E., V. R. Voller, and M. G. Everett. "An Efficient Algorithm for Mineral Processing Data Adjustment." Int. J. Miner. Proc. 31 (199 1): 73-96. 17. Heraud, N., D. Maquin, and J. Rago:. "Multilinear Balance Equilibration: Application to a Complex Metallurgical Process." Min. Metall. Proc. 1 1 (1991): 197-204. 18. Reid K. J, K. A. Smith, V. R. Voller, and M. Cross. "A Sunfey of Material Balance Computer Packages in the Mineral Industry." in 17rh Appiicnriot~sq/. L'onlputers and Operations Re.seurclz it1 the Miner01 Industry (edited by T . B. Johnson and R. J. Barnes). New York: AIME, 1982. 19. Stanley, G. M., and R.S.H. Mah. "Ertimation of Flows and Tempel-atures iil Process Networks." AIChE Journal 23 (1977): 6 4 2 4 5 0 . 20. Alrnasy. G. A. "Principles of Dynanlic Baiancin~."AICIIL' Joui-nul 36 (1991): 1321-1330. 21. Liehman, M. J., T. F. Edgar. and L. S. Lasdon. "Efficien: Data Reccnci!ration and Estirllation for Dynarnic Processes Using Nonlinear Prc>~ran~:nins Techniques." Cotnplrters Clzen~.Eilg~~g. 16 1992): S63-986.


22. Ripps. D. L. "Adjustment of Expesinierita! Data." Chenr. Etz,~.PI-og;...i'w1[1 Scr. iZo. 53 61 (1965): 8-13.

and R. E. Carpani. "Application of Statis!ical Theory lo 23. Kei!ly, P . IM., Adjustment of klaierial Balances." presented at the 13th Can. Chzm. 5 2 Cod.. Mo~treal,Quebec, 1963. 54. Almasy, G. A, and T. Sztzno. "Checking and Conectior. of Measur-cmen[~ Basis of Linear Sys:enl Model." Proh. Co?7tro! l i f 0 ~ 1 17-!ieor-\. . 1 on (1975): 57-69. 25. Narajiillhsn, S., arid R.S.H. Mah. "Gerieraiized Likelihood Ratio kletttod fo: Grot.; F ~ n Identification." r AICizE Juut-rial 33 (1987)- 1514-1521. 26. Tarll!lane. ,\. C., C. Jordache, arid R.S.H. Mah. "A Bayesian Approach io Gross Error Detection in Chemical Process Data. Part I: Model Developand hztel. Lab. Sys. 4 (1 988): 33. ment." Clzc,mom~~rics 27. 'Tons. H., and C. M. Crowe. "Detection of Gross Errors in Data Reconciliation by Principal Component i2nalysis." AIClrL- Jourrzai 31 ( 1095!: 1712-1722.




Oufa K~~conciliariori arid Gri1r.7Err<,,-Di~cccrio~i

77ie /inpor-lo~ii.<, of Drrfo h'eco~rciliufio~~ and Grus.7 Er-ror,l


28. Romagnoli, J. A., and G. Stephanopoulos. "Rect~frcat~oli of Process Measurement Data in the Presence of Gross Errors." C'!:enz. Eng. Sci. 36 (1 98 l). 1849-1863.

42. Mah, R.S.H. "Data Screening," in Foundations of Co~nputer-AidedProce.7~ Operations (edited b y G. V. Reklaitis and H. D. Spriggs). Amsterdam: CACHEIElsevier, 1987, 67-94.

29. Serth, R.W., and W. A. Heenan. "Gross Error Detecting and Data Recolicili ation in Steam-Metering Systeins." AIChE Jour-11al30(1986): 743-747.

43. Crowe, C. M., "Data Reconciliation-Progress Cont. 6 (1996): 89-98.

30. Rosenberg, J., R.S.H. Mah, and C. Jordache. "Evaluation of Sehernes for Detecting and Identification of Gross Errors in Process Data." Ind. & Eng. Chenz. Proc. Des. Dev. 26 (1987): 555-564.

31. Jordache, C., R.S.H. Mah, and A. C. Tamhane. "Performance Studies of the Measurement Test for Detecting of Gross Error in Process Data." AIChE Journal 31 (1985): 1187-1 201. 32. Rollins, D. K., and J. F. Davis. "Unbiased Estimatio~lTechnique for Identification of Gross Errors." AIChE Jo~tnznl38 (1992): 563-571. 33. Murthy, A.K.S. "Material Balance around a Chemical Reactor, 11." Ind. Eug. Cl~em.Process. Des. Dev. 13 (1974): 347. 34. Madron, F., V. Vevel-ka. and V. Vanacck. "Statistical i211alysis of Material Balance of a Chemical Reactor." AICI7E Jour-170123(1977): 4 8 2 4 8 6 . 35. Wang. N. S. and G. Stephanopoulos. "Application of Macroscopic Balances to the Identification of Gross Measurement El-ror-s." Biarech~lul.Bioeirg. 25 (1983): 21 77-2208. 36. Crowe. C. M. "Reconciliation of Proce\.z. Flow Rates hy Matl-ix Projectio:l. 11. Tlic Nonlinear Case." A1CI:E Jolri-rial 32 (1986): 616-623. 37. Shcel, J. P.. md C. M. C~owe."Sinii~lntionand Optirnizatio~iof an Existing Eth yibenzc~leDehydrogenatior! Reactor." C C I I.I.~ .C I I Y ~Eng. ~ . 47 (1969): 183-1 57. 38. MacDonald, R. J., aad C. S. Hoivat. "Data Reconciliation and Parametzi Estimation in Pisnt Perl'or!nnnce Anal~~sis."AIC!~B.lo~r~-no! 34 (i938): 1-5. -39. Hlavacek, V. "P.nalysi.i of a Complex Plant-Steady Staic and 'kansielit Eehavior I-Plant Data Estircation a~;d Adjustment." Co~npitrer.~ Cl7er:r. Eng~zg.i ( 1 977): 7.5-8 1.

40. Man, R.S.H. "Design and Analysis of Perfornlance Monitoring Systems." in iZ;ie,,r;,u: ,.',IA r>>Cmrrr-0111 (edited by D. E. Seborg and T. F. Edgar). Ne\i York: Engineering Foundation. 1982. 41. Tamhane, A. C., and R.S.H. Mah "Dat;~Reconciliation and GI-oss El-ror Detection in Chemical Process Nctworks." Tc~cl117or1rerr-ics 27 (1985): 409-422.





and Challenges." J. PI-oc.

44. Mah, R.S.H. Clzemical Process Structures and Irtfonimtiorz Flows. Boston: Butterworths, 1990. 45. Bodington, C. E. Planning, Scheduling and Corztrol bztegratiorz in Process Industries. New York: McGraw-Hill, 1995. 46. Madron, F. Process Plant Performance: Measurenzer~rar~dData Processing for Oprirnizatioiz and Rerroj5t.s. Chichester, West Sussex, England: Ellis Horwood Limited Co., 1992. 47. Veverka, V. V. and Madron, F. Material and Energ?; Balancing in Process Indusiries: From Microscopic Balances t o Large Plants. Amsterdam, The Netherlands: Elsevier, 1997.

M~~~rsurctrre~rr E ~ I - o rarid s El-ror Kcducrio~rTee-hrriqur.~


Thus, the relation between the measured value, true value and random erici in the measurement of a variable i is expressed by Equation 1-6. In this chapter, unless otherwise required, we drop the subscript i and rewrite Equation 1-6 as

Measurement Errors and Error Reduction Techniques

where y is the measured value, x is the true value and E is the random error. The random error usually oscillates around zero. Its characteristics can be described using statistical properties of random variables which are described in Appendix C. Its mean o r e.xpected value is therefore given by,

and its var-iaizce

CLASSIFICATION O F MEASUREMENT ERRORS As mentioned in Chapter 1. there are many sources for instrument errors which deterrninc a measurement error in virtually all measured process data. Some of the nleasurernent errors are random snd srnall i ~.lii;doruer-I-CI!-sj. ~vhilc:others are s)stcn~aticand large (gross errot-s). Some authors, joch as Madron / 1 ] and Jiebman et ai. [2], prefer to dsf'ine a separate c l a s ~called .y
I t is generally observed that if the measurement of a process variable is repeated under identical conditions, the same value is not obtained. This is due to the presence of random errors in measurements. Random errors cannot be either predicted nor accurately explained. We choose to model the effect of nlndonl errors on measurements as additive contributions.

where o is the standard deviation of the measurement error. Sta?zda?-d deiliatio~lis a measlrre of the measurement precision. The sn~allerthe standard deviation, the more precise is the measurement and the higher the probability that the random error will be close to zero. If the r a ~ d o merrors in the measurements of two different v31-iabiesi and j art: also considel-ed statisticr?l!y independent, then tiley Ilave zero correla~ion,that is,

Althcugh statistics! independence does riot a]\+-ays rzpresent rca!it\. this sssun:ption is widely used in data reconcllistion 1itera:ure because it offers a sirrlpler matliematica: description of the measurement errors. Measuremenrs obtained from two different instruments can be correlated if they share a common source of error (for exa~aple,a change in the ambient conditions affecting accuracy in a group of measuring devices). This type of correlation is known as spotiol coi-relatiorz. The degree of association between errors E, and E, is expressed by means of a con-elulion coe8icie~lt,riJ:

Mra.sirr-e~~irrt: Err-01-surtd Error Reducrior; Tc.chttiql
Equation 2-5 can be used to estimate c o v ( ~ ~ , &because ~), rij can be obtained from statistical analysis of a set of repeated measurements 131. Another type of correlation occurs when the same source of the error persists for a number of measurement periods. In that case, the measurements errors of the same variable at different time instants are serially correlated. Serial correlntiorz is also produced by delay time in control operations, due to unit capacity or inertia. For instance, if there is a delay of k time periods, an output-measured value y,,,(t) can be correlated with an input value at time t-k, say yi,(t-k). Kao et al. [4] showed that neglecting serial correlations can be significant for gross error detection and suggested remedies to account for serially correlated data. A summarized statistical description of the serially correlated data, based on tlme series analysis, can be found in Mah 151. Estimating the standard deviation. The standard deviation of a measurement error plays an important roie in data reconciliation and various other error reduction techniques. Since the true standard deviation is never known, an estimate of the standard deviation can be obtained by usiilg a sarnple standard deviation, according to the following f
where s is tlie estimated value of standard deviation, yi is the ifh observsti011 and is the arithmetic average of N obsei-vations cf the same variabie. This t'onnuia prcvides an unbiased eztin?a:e of the standard deviah is important far the reliability of the estimate. Tlie tion. Sample size T more observations, ttie more reliable tlie estimate. Madron [ I ] lndicaces that a minimurn of 15 observations ( f ~ ar steady process variable) shotild be used. A n i~nporr2ntr ~ ? ~ ~ ; r r r n r ' n for t estimating tho standard deviation of a measurement error fro111a s;tmple of measurements using tile ribove equation is that all the measurements of the variable should be drawn froin the same statistical population. Practically, this implies that if we use a sample of N measurements of a variable made at successive time instants for estimating the standard deviation of the measurement error, then it is implicitly assumed that during this time interval the true value of the variable has not changed. Moreo\rer, it is also assumed that the nleasurement


errors at different time instants all have the same standard deviation. Alternative ways to estimate the standard deviations (in fact, the entire covariance matrix of measurement errors) when the tr71e values are not constant and when gross errors also occur are described in Chapter 3. A complete mathematical description and statistical treatment of the random errors requires a probability density function (see Appendix C ) . This implies knowledge or assumption concerning the distribution type. The usual assumption in data reconciliation literature is that process data follow a normal distribution. Madron [ I ] suinmarizes the main reasons for selecting the nonnal distribution as follows:

1. It was found that the n o ~ m a distribution l approximates well the behavior of measurements in natural sciences, particularly within the range mean k30. 2. An error is often the sun1 of a larze number of single, elernentary errors. According to the central limit theorem, under certain generally acceptable conditions, the distribution of such a sum approaches the normal distribution (for a large number of elementary errors). 3. The theory of nonnal model error is well developed and is easy to ireat mathematicaily. The values of probability density and distribution hnctions lor standard normal distribution are available i n tabulaled form in any statistical textbook which facilitates solving of pr-actical problems. One irnlnediate practica! use of the probability density functior; far the ~ o r r n a disrribution l is in estimating the standard deviation of e fucc~ionof r a n i ~ n variables. l This problem is i~nporta~it foi estimating s:a!ldard deviation for a secondcry tandorn vcr-iahle which is ca!culated based on ronlc other directly measured variables, denoted as pt-inzu~ywt-iables. I[ is assulned that prababiiity properties (mean value, standard deviatio~~j of ths directly measured varizbles arc known. For examp!e, a flow rate F can be estimate$ hy sing -2assre:r.ertz i3 27 cSf;.cegauge according to the forinula:

where k is rhe orifice gauge constant, Ap is the pressure difference on thc orifice, po is the inlet orifice pressure and T is the fluid tcnlperature [ I ] .

The variance of a function f(x) of random variables is defined as

by using the Equation 2-12. Zalkind and Shinskey [6] used similar analytical derivations as Madron [I] and provide examples of estimating instrument error and standard deviation by combining component information. An extension of these type of calculations is given in Nair and Jordache 171. They applied the linear combination rule to estimate the effective standard deviatiun of a measurement system based on the information obtained from the accuracy and precision of the measurement system. The accuracy of the system is a measure of the agreement between the instrument reading and the true value. This information is provided by the instrument vendor and it is usually estimated by the linear combination rule applied to measuring and processing components as presented above. The pi-ecisiorz of a measurement is a measure of the agreement o f several repeated readings of the same measurement. Tt-. sample standard deviation estimated by Equation 2-6 is a measure for the repeatability of a measurement in steady conditions. An overall ieffective) standard deviation can be estimated as

If fix) is linear and i!te primary errors are uncorrelated. 2- 11 I-educestt,

where s, is the instrument accuracy (usually given as a percentage of the iilztrument range) and s, is the precision of the instrument (the repearability stai~darddeviation).

The mean value of a function z = f(x) of random variables x is defined as:

where p(x) is the m-variate probability density function of the vector of random variables x. If f(x) is a linear function, i.e.,

[hen the mean value of f(x) is also linear:

Gross Errors

if ftx'i is t: nonlinear fi~.rnction--such as Equation 2-7 aboire-the problem becomes Inore cornplcs and a solution can he obtained by either intzgrating :ht. Equation 2-1 1. or by linearization cf f(x) (Taylor expansion) which enables l~sirigF.quation 2-12 for an approximate solution [ I ] . The latter approach gives rise to the following general approxircation formula:

A detailed definition of' the gross elrors was g i ~ e nin Chapter 1 and at the beginnlfig of this chapter. Usually gross errors are associaied \vi;h sensor faults. Figure 2-1: reproduced from Dunir; et al. [a]. il1us:rates graphically the mcst common types of instrument faults: bias, ccmplete failure, drifting, 2nd precision degradation. If a grcss error exists in a ineasured value, the measurement equation Eqi~ation2-1 changes to:

A practical aspect of the estimation problem of the standard deviation for iinear functions deals with coniputing an overall at.cur-ucy for m 171easur~~n1er11sJ:sfetn. For example. let us assume that t h e e devices contribute to produce a measured value: a sensor, a transmitter. and a recorder. Each coinponent has its own error. and standard deviation. The overall error and standard deviation can be obtained by a linear combination of each componen! e1-r-01-and standard deb-iarion..4n overall standard deviation is obtzined

where 6 is the magnitude of the gross error. Nore that process leaks, which are also categorized as gross errors, cannot be modeled by h u a iion 2- 15. They represent mode1 errors and therefore affect the constraint equations as shown in Chapter 7. Gross errors significantly afkct the accuracy of any industrial application usin: process data. They have to be detected and rern~ved.Sorne of

Figure 2-1. Instrurneni types of fault;. Reproduced with permission cf the America0 Institdie o f Chemical Engineers Copyright @ IF96 AlChE A// righrs reserved.

them, such as occasional outliers (spikes), czn be detected by usifig special tilteii~ig~echniquz.;or statistical quality contro! jaiso kr!own ;is .stulisticill pr:".ess w ~ z ~ I -Other o ~ ) . types might be Inore diffic~ll:to detect withoat zi physical model. Data reconciiiatioi: is the appr~priatetool ic most cases.

ERROR REDUCTION METHODS Analog and digital filters have been widely used to reducz rc::?-x errors (high-frequency noise) in process values. Inadequate sampling frequency convei-ts a high frequency signal into an artificial low-frequency signal. This phenomenon is known as signal uliasil~g.Analog fil!ers are used to prefilter process data bcfore c;i~nplingand prevent aliasing. Digital filters are used afterward to further attenuate high-frequency noise. Seborg et al. [9] provide a sun~marizedpresentation of both analog and digital filters for process data used in control applications. This text

includes a review of various digital filters, which are very helpful tools for data conditioning before data reconciliation. Val-ious classical digital filters have been designed. Each filter type has its own advantages, as well as related shortcomings. Some are able to significantly reduce noise but they introduce a sizable delay in the filtered response. There are other filtering procedures that do not add a long delay but do not produce satisfactory noise removal. Other types of filters give both satisfactory noise removal and time delay in some cases, but perform poorly for measurement having variable frequencies or noise associated with a fast dynamic in the process variables. Overshooting1 undershooting is a comluon problem with the last case. In general, a trade-off between the amount of noise attenuation and the time delay in the filtered results is required in order to achieve the best performance for any type of filter. This can be accomplished by tuning the filter parameters which, unfortunately, is not an easy task. The random noise is often combined with instrument bias, slow drifts, or fast process changes, or other disturbances such as a cycling in the feedback control loops. To distinguish between a trae process change and random noise, the dynamics of the process must be well known or a diagnostic system, such as an expert system. should be used. ir, the absence of such information, it is advisable to avoid exce:isive filiering. Too rnuch filtering tends to nlask significant changes in !he process variables. A b ~ i e fdescription and analysis of ;he niost v,~idelyused classical digitai filters is give]: below. The discussion is restricted to durn ,filter-ing. which means coise rernovzl in the most recent measurernents. Data fiitering is to Se distinguished from da!a snzoothirzg which deals with pasi data. The foriner estimates the c u r r e ~ ~value t bascd on the cul-rent and past measurernents and it is of prirnav concern in process controi. The latter estimates the value of the central poin? from pasi a11d recent measureme1;ts (valucs from both sides of the central point) and ir is mainly used for fault diagnosis a r ~ dsteady-state process optimization. Mzny ~~J!!LC)TS, however, do not distinguish between the two tezr,:: x c ! 2 z :he data "smoothing" term for data filtering as we!]. An ir~regl-(11 of aOsol~ifeer-tars (IAE) similar to that of Kim and Lee [ l o ] will be used to conlpare various filtering techniques in this text. Here the IAE is the sun~nlationof the abso1u:e difference of the filtered values and the conesponding truc values over a specified number of time steps. Note that Kim and Lee used smoothed values instead of tIve values. but using simulated true values gives a cleaner measure for the

Dutu Kc~cor~i.i/iutiofr utid Grors Error D~vecrion

Mcorrrre~~zen: Errors and Error Reduction Techniques

amount of filtering and delay. The lower the IAE, the more filtering (with reduced delay) is obtained.

The exponential filter has several advantages. The effect of an impulse (spike) icput xk is immediately reduced to Oxk.It is computationally efficient and is easy to use and tune for steady state or slow dynamic signals (single parameter tuning). It does not overshoot and ultimately approaches the proper steady-state value. For these reasons, the exponential filter is used in many control systems. It has a problem, however, in that significant measurement noise attenuation is accompanied by relatively large delay in the filtered signal.


Exponential Filters

This filter is by far the most commonly used in industrial applications. It is a discrete-time filter, equivalent to the first order lag in a continuous system, and is a standard filter incorporated in many DCS systems. It is also known in control area. It can be analytically described by the following equation:

where xk = raw (unfiltered) measurement at time tk yk = filtered value at time tk 0 = filter parameter. The exponential filter- requires filter initialization: at tk= 0, yo = xo. The filter parameter 0 is a tuning parameter with a range O < 0 I 1. If 0 is close to zero, significant filtering is obtaineci, while for 0 close to 1, very little filtering is done. Note that the expoilential filter is of the ir61ai:e iini7~1serespotxe (ilfi) type, which means that the effect of acy input 5igrial is felt forever but with di~ninishingeffects.




Example 2-1 Figure 2-2 illustrates the filtering provided by the exponential filter for two different parameters (0 = 0.2 and 8 = 0.4) to a data set presented in Table L-1. This particular data set contains two step signals. steady-state noise, and a spike (outlier). The true values are plotted with the dark solid iine. Raw values were simulated by adding random errors (generated with a selected standard deviation) to the true values. A spike was also simulated in one data point. The integral absolute error (IAE) defined above was included in the chart for performance comparison. As shown by the individual filter p:o:s. the exponential filter with lower filter pxameter (0 = 0.2) indeed does inore filtering in steady-state sitilatiot~sthan tfie filter with C) = 0.4.

Exercise 2-1. Derive For~nela2- I6 from rlicfrr-st-order di,@,~ri!i-rl eyuution urcd bi coti~rolli~erature for the jifi.f-oi&~ kg.





illhere 4 =$l!er tirile c o ~ i s f u ~in~units t . of time. Dircus.~the relaficnsi7ip between the.filter rinzc catlstant zf arzd fhc filter pal-clr;zczet-o. Hint: E.xpreA.; cl, rivutivc dy(t)/d: iis iiii iippraiimatiorz by a l~~zckward step:










Figure 2-2. Exponential filters.






' Measrrreraerir Er-rors and Error- Reducriotl Techrziq~;~.~

However, the overall IAE for 8 = 0.2 is higher than the IAE for 0 = 0.4 because of the increased delay after step changes and the spike. Tuning the exponential filters for noisy data accompanied by frequent step changes and occasional spikes is a challenging task. Table 2-1. True Values and Raw Data for Example 2-1 Time

True Value

Raw Value


Table 2- 1. True Values and Raw Data for Example 2- 1 (continued) Time

True Value

Raw Value

Table 2-1. True Values and R a w Data for Example 2-1 (contint~ed) --



True Value

95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 l l0

80.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 7G.000(! 70.0000 70.0000 70.0900 70.000ti

111 1 I2

113 :14 115 116 117 i 18 119 12C !21 122 123 124 175 126 127 I28 !29 !30 13: !32 :31 1 34 I35 136

137 138 139 140 141 142 143 1 44 145 146 147 :48 141)




70.0900 70.0009 70.9000 70.OC);HI 70.0000 7G.0903 70.00GC 70.000CI 70.0000 70.0000 73.6000 70.WiK) i0.0000 70.0000 70.0000 70.0000 70.0000 [email protected](K) 70.0000 70.0000 70.0000 70.0000 70.0000 70.0000 70 0000 q,,


Raw Value 80.0000 70.0000 70.6065 67.9913 68.9857 67.1895 72.4049 7 1.0063 68.7471 68.7806 72.069 1 67.9624 64.4622 66.7040 70.2901 69.5325 70.9931 76.0272 69.8086 74.4 137 69.8983 72.3616 72.6424 65.6905 73.2806 67.8898 7 1.5254 69.5290 68.2254 68.8 :87 55.8263 7 1.034b 65.5383 74.9540 72.4687 57.6565 69.2903 69.5289 72.3914 65.6845 71.2673 70.7821 71.9 197 68A?19 72.0655 69.9292 7 1.9856 67.6774 68.8003 66.7914 68.7419 7 1.8966 70.5520 71.8154 71.9483 <,




Various modifications have been proposed to enhance the performance of the exponential filter:

1. Rhinehart [ l l ] developed a method for uutor7zatic turziizg of the firstorderfilter. The method assumes that the sampling period is small in comparison to the time for real process changes and that the noise is a random error which follows a normal distribution with zero mean. Instead of specifying the filter parameter- 8 or the filter time constant z , the user needs to specify the desired confidence interval for the filter value (usually 95%). The method adjusts the filter time constant to minimize the time lag while maintaining the desired accuracy. 2. The double exponential filtel; or second-order filter, is equivalent to two first-order filters in series where the second one filters the output from the first exponential filter. This type of filter was used by Tham and Parr [I21 for signal reconstruction when an outlier is detected by a validation test. The classical derivation of this filter, colliing from tinle series analysis is given in Seborg et al. 191.

3. The rloizlit:ear-e.r,~oi:ei~rialfiltelis anot!lcr variation of the exponentiai filter (Webex-, (131). This fi!rer heavily filters the noise, while reducii~g the delay. The nonlinear filter uses a design noise band, de:emtin~d as a multiple R of staudard deviations. The fox-~nof the fiiter is as Equation 2-16. except t h a t the tilter parameter e is defincd as

where: R = tuning parameter o =standard deviation of the measurement error


liati~Kci-r~r~ci!iariori orid Gross Errol- Dercr-riot1

Meosurrtrrrrir Errors a ~ Errord Kedr~crionTechniques

The relationship between the nonlinear exponential filter and the exponential filter is easy to obtain. Equation 2- 16 can be written as

in Figure 2-3. As noticed from the overall IAE for the filtered values, the performance of this filter is higher than that of the simple exponential filters presented in Example 2-1. The only problem with this filter is that it does not filter out spikes at all. Therefore, this filter type is not appropriate for signals with significant outliers.



IAx1 0 = --Ro

if /Ax\< R o and

Since the true standard deviation is never known, a sa~nplestandard deviation can be used as an estimate. The filter pararneter 8 is then used in Equation 2-15 (as before. 0 < 0 5 I). 'Typically, R lies between 3 and 5. If R is less than 3, little noise redtiction is achieved. 011the other hand, if li is greater than 5, signi5car;t delay i e s ~ ~ lwith t s a rnargina! improvement iri the noise filtering. The nonlinear fi!ter acts like an exponential filter with a filter paraineter 0 tha: varies depending or1 the rnzgnitude of the difference between filtered and raw measoremen:s. ?'he filter paianleter e is low far signals close to the previolls filtered 7,7alut: and high for signals far fionl tine previo:~s filter value. Measurements fai from the previous filtcred valut: have 8 = ! (no fi!tcring at all oatside the design noise band). Thel-cfore, delay is elilr~inatedfor situations when there is a rapid significant measur-emcnt change. Since the nonlinear exponential filter is tuned for the frequency of a partictllar noise level. it performs optimally for signals whose noise jevci 1 h sccauy. it is not recommended for filtering signals with spikes :;ii;ce such signals are not sufficiently filtered or not tiltered at all.

Exercise 2-3. Reverse nonlirtear exponential filter. Modifi the definition of the filter parameter 8 for the nonlinear exponential filter so it filters more data outside a chosen noise band and less (or nofilforing at all) inside the noise band. Using data in Table 2-1, calcularefifilter values ard IAE and plot the results as irz E-xample 2-2. Describe a situation where the reverse nor~litzearexponential filter can be useful. For what krnd of signal is this filter- type the most irzappropriate?

Example 2-2 The filtering performance of the nonlinear exponential filter (with = 4) for the same data set as in previous examples, is shown

o = I and R

Figure 2-3.Nonlinear exponential filter.



Durn Kecnrr/ iliiirlorl arid GI-ossO-ror-Dercc~ii~rr

Moving Average Filters

This is another conlmon class of filters. The general analytical expression is:

where N = number of data points in filter w, = weight for the measurement x,

el-, the overall IAE for N = 20 is much higher than the IAE for N = 10, because of the increased delay after step changes and the spike. n e prevjous history persists longer in tlie filtered values for the case with a larger number of data points. For this reason, the moving average filter with equal weights is not recommended for signals with step changes or spikes. For dynamic data, a better performance can be obtained by using a nloving average with unequal weights. As in the case with equal weights, the summation of all wi weights over i = I, . . ..N must be equal to 1. One possible set of such weights can be provided by the following formula ( 1 41:

In the absence of past data for the previous N time points, a history initialization is required:

The moving average is a fir~iteitn[?ul.rert..rpunse (FIR) filter which means that the effect of any input lasts only for N steps. For the common form, all input data is given equal weight, i.e., wi = 1/N. The equal ~ v e i ~~noving ht averaze c a ~ c e l sout p ~ r i o d i cnoise. Like in the case of the cxponenrial filter, the moving average is easy to tun% for steady-state or quasi-steady-statc sig!~als.requiring only the adjustment of the umber of input values used fo calcrrlste the average. Furtlier~noi-c,as with the rxpoire~tialfiltzr, the nlo\.ir?g averagc dces not overshoot and reaches correct steady stare after a step cl-iange. The moving average is also easy to implement and fast t s compute, although it requires more storage &[id calcul3tion t!1a17 :he expocential filter. Moreover, being an F!R filter. the n?c?vir,gavenge requi:-es special init1ali7ation as shcwn above. The 111ovingaverage is tr?cst effective when esiiinating tlie center poini rather than ths cllrre~itvaluc. I: is par-titularly useful ior esti~naringa fixed value or a linear trend.

Example 2-3 Figur-e 2-4 illustrates the filtering provided by the ~novingaverage filter with equal weights for two different parameters (N= 10 and N = 10) to the filter plots. data set presented in Table 2- I . As noticed from the individ~~al the moving average filter with fiigher nc;rnber of data points (N = 20) does more filtering in steady-state situations than the filter with N = 10. Howew

where the weights w; are exponentiaily increasing with i and satisfy the summation condition. T ~ v otuning parameters are involved: the number of data points N and an exponelit r: Usually O < r I 10. For a fixed nurnber of points N,a higher exponent r results in a higher weight for the more recent rneasurernent. i.e.. less filtering. A lower value for the exponent r provides more filtering b3t aiso acids in(~r-edelay.



+ -~


L-J- -J!


R a w \'sliicc (iAE=265 1)

Figure 2-4. Moving average filters.

t--4 %


Q Measurcnrerlt Er7-U~S arzd ,511-orR P ~ ~ I CTechniques I~O~I


Polynomial Filters

Exercise 2-4. Prove that the surnrnation over N of w,k given by Equation 2-23 is equal to I .

Exercise 2-5. Repeat the filter calculation for the data presented in TuOle 2-1 with the moving average filler with u~zequalweights as given b y Equation 2-23. Choose N=20 and r=4. Compare the results with the results in Figure 2-4. Is there any incentive to use the more complex filter weights given by Equation 2-23 rather than a simple moving average filter with equal weights?

Polynomial filters can be derived from the least-squares filters, which have been designed for data smoothing. While least-squares polynomials are widely used in data smoothing, they are also suitable for data filtering [lo, 141. The general form of the polynomial filter is shown by the following equation:


Y k = b m rnt k+bm-,tkm-l + . . - + b 2 t ; + b l t k + b o where:

Note that the filter with unequal exponential weights described above by Equation 2-22 and 2-23 is to be distinguished from the exponentiull~ weighted moving average (EWMA)fiIter which is often used in statistical process control area. The EWMA filter is analytically described as 131:

( 2 - 26)

tk = current time m = filter order (a nonzero, positive integer) bo, . . ., b, = filter parameters chosen such that: N

Min ~ ( X i - y i ) ~ bo. . . b , n i=]

wl~ere:N = number of time steps (data points) included in the tilter. whel-e Yk = sample mean (mavins average with eqzal weights) at time tk yk = filterid value a! tirr~etk i= filter parameter ; O < ?L < 1 Initially, yo is taken as the control target po ( yo = h). This filter can also h= expressed as a weighted average of pas: sample means, i t . , z

Equation 2-25 indicates that the weights assigned to ?he sample means decrease geometrically with age. For that reason, riiis IilLer 1s sornetinies referred to as the geotnetrir nzoving avc,oh.e Jj;l~ec?- [3].Some autinors jS, IS], prefer to use the curren: measuremect, xk. instead of the sarnple mean 7,. A recommended range for 3L is 0.05 < < 0.5 [ 3 ] .MacGregor 115 ) indicates that a common choice is h = 0.2.


Polynotnial filters are FIR type filter:; since they use a !imited history of inputs. They provide good noise reduction, while following thi overa!l trend of the measurement data. The alnouni of filtering and the delay depend on the nirm'oer of d2ta p ~ i n t sused and the order of po!ynomial. The disadvantzge with Equation 2-26 for the polynomial filter is th2t the filter panmeters bo, . . ., h , vary with each time step and musi be recalcillateci by soiving the least squares problem at each time step. Because ol' excessive computations, initial comaerciai applications viere gecerally limited to first order filters. An alternative form of the polynomial fi!ter using time invariant filter parameters can be derived for measurement signals sampled at uniform intervals. Unifonn sanlpling intervals are typical with digital sample data control systems. In this form, the polynomial filter Secomes:


xl, . . ., xN = unfiltered signal values at titne t i , . . ., tN c,, . . .. cN =time invariant filter factors t, - t,,., = . . . = t2 - t,, uniform sampling interval

The important characteristic of this polynomial filter form is that the filter factors cl, . . ., cNare not functions of the time step as are the bo, . . ., b,, parameters in the conventional formulation shown in Equation 2-26. Once the polynomial order nz, and the number of points N are selected, the cl, . . ., CN filter factors are constants. For a particular set nt and N, the filter factors c!, . . ., cx can be calculated by a procedure described in Exercise 2-6. This fhrm of the polynorllial filter was developed and applied to industrial pl-ocesses in tlie early 1970s and was recently published in firstorder foml [:0, 131.

Exercise 2-6. Ptuve Equalion 2-28.p~U I I ( ~ ~sntrr/di~ig ) I - I ~ I irite??lnl~ n~idfiticl017 e.~pres.~Iori for calcztlcrti~lgfl~efil~ei-,facfo/-s i'i, . . ., cw fIint: Dei-itxe rile leu,t-squares ol~jectivc,fr~r;c:io~? 2-27 >r>ith ,'ei/?cct ro ho, . . ., h,,,crrrrisfifirlcljir.~t ci ~ql4iiti01:,f01the vecfor h. tt:lii~hIS a gcrzr,-(lIfi;i;l-i~ b = (PTP)-'PTx ;t~/zci-e I i.i rlie vector of ~c,~$Ifered si,~rial~:rrlu:~.s(iild P is sorne regi.r-~~s,iorl ri!atrix jscc [ 1 6 ] / . A;ext, ILSC Eqlu:zioii 2-26 :t~i-iffen as y b = [ t t tkl f k 2 . . . t y ] b , arzd b y r-elvlac;rzg .~cciorb r! i:h fhe previa~isY E ~ Y C S S IC:IECI ' O Itaking ~, k=N (fi);.tt'!~~ ,%,st0;-c~t,-;-cr;t 57rit0~ ( I ~ I I arzd T ) a.s.~[~~trirrg t/mt t,,J=N.4? ,t,!?ewAt i.5 a cor~stc~nr srrr~l,vlir7giiitel-val, get c;rl expi-
Substituting c~ factors from Equation 2-29 into Equation 2-28 gives a filter form in ai's which greatly reduces data storage requirements. For example, a polynonlial filter using 100 points in the form of Equation 2-28 requires storage of 100 c factors. The same filter i11 the form of Equation 2-29 for the c factors requires storage of only two a coefficients. The filter factor coefficients %, . . .. a,, can be determined by a least-squares solution of the non-square system of equations fonned by writing Equation 2-29 repeatedly, for i=1,2, . . ., N. For a given number of filter data points, the higher the polynomial order the more closely the filtered response follows the measurement data. The high-frequency noise is not removed, however. A low-order polynomial is usually preferred for filtering, although the lower the order, the larger the delay. Typically, a first or second order is used for most polynomial filters in process control systems. For a selected po~yi,,.,~nial order, the only tuning parameter is the number of data points N. A high number of points gises a smoother output but also more delay. Overshooting is another negative behavior of polynomial filrers. It occurs when there is a fast rate of change in a signal value and affects the result even after the sigl~albecomes stabilized. The iarger the nunlber of points used by the filter, the more overshootirlg occurs.


When using a large number of points in a polynomial filter, it rnay be Inore convenient tci use ail analytical representation of the fiiter factors [ IS]. These forms of the tjlters can he derived by recognizing that the filter factors can be represen:ed analytically as

tvher-c:q,.. . .. q,,= filter factor coefficients

Figure 2-5. First-order poiynomiol filters




Mmsirretnenl Errors n ~ xError i Redrrr~ionTecllt7irjues

Example 2-4 Figure 2-5 illustrates the filtering performance provided by a firstorder polynomial fi!ter with two different number of data points (N = 10 and N = 30) to the data set presented in Table 2-1. As shown by the two individual filter plots, the polynomial filter with higher number of data points (N = 30) does more filtering in steady-state situations than the filter with N = 10, but it takes longer to reach steady state after a step change. It also does more overshooting after a step change or a spike than the filter with the lower number of data points. Tuning the polynomial filter is relatively easier for steady-state signals but, as with the other filter types, it is more cumbersome for signals with step changes or spikes.

Hybrid Filters

when most ation Shev This proc ena? reas in s 1 sre sic

first= 10 two data i.ilter e. It lter :lat is


As seen in the filter examples presented above, none of the individual filters behave in a satisfactory way for unsteady-state signals. The performance of classical digital filters can be enhanced by creating hybrids that combine the features of different filters. The silllplest hybrid one can build is an ari:hmetic average of the filtered values from two types of filters. To create a better filter, the two participating individual filters vzlurs should have opposite feztures. For exarnplc, if one filter is able to significantl~.r-duce noise but introduces a long d e l q , the other filter needs to ireare a illoch shortrr deiay a l t h o u ~ hit might follow the noisy data too closeiy. For that reason. the most appropriate combinations for such hybrids are: polynornia!imo\ring average, ~olynorniai/exponentialand cxpnr~ential/nonli~~ear exponentia!. A more c o i n p l e ~hybiid that eliminates overshooting biit fcllows the process dynamics is presented in Clinkscales and Jordache [!4]. To achieve this behavior. their fiiter first detects the type of the change in the process value by analyzing the trend. A modified Shewhart test used in statistical process control 131 has been used to detect a change in the state of a variabie as follows:

ca in sir rt






a c

< 1




where Xi is the current (ith) raw value, X, is a long-term average for the most recent steady-state situation and o is the steady-state standard deviation of the measured variable X. If lZil > SCT,, where SCL is a selected Shewhart control limit, a significant change in process data is detected. This test is much simpler than the usual CUSUM tests used in statistical process control. It is easier to implement, and, with proper tuning, enables a faster state-change detection than the CUSUM tests. For this reason, the Shewhart test is often used in association with CUSUh4 tests in statistical process control [17]. Five major signal types can be determined with their algorithm: steady-state, step ckarlge, rairzp, inzpltlse (spike), and urzdeterrnined trurzsient state. The filter is forced to follow more closely the process dynamics in the case of a true process change and to do more filtering in the case of the random noise (steady state) or spikes. To eliminate overshootinglundershooting, the filter value is limited to the maxirnun-t/minimum short term unequally weighted moving average (more weight on the most recent measurement). A similar approach has been used by Tham and Parr [ ! 2 ] .They also apply statistical tests to determine if there is a trend in ti?e data. They classify the trends in three categories: (a) a distinct :rend: ib) a trend that. due to noise, is not inunzdiately disceri~ible:and (c) no trend. hdditioilal tests are applied to detect ou:liers for each type of trend. If an outiiel- i s found, a s i g ~ ~ recon~truction al fo:niu!a is used to estin:a?e a value a.hich is used to replace the outlier.


Esercise 2-7. Ci-enre lzybi-ids (sint1:le arith:?zetic average_\)j%jr.


~ol~~no;~~icrl/ex~(it~c~itial arzd polyrlor~iial/r,zo1~i!zg cit.erngP rlsrrzg ilr~tcr frcnl TaOle 2-1. Use 0 =0.2,for the expoi7e?itialfi!tet;A'= l/)for-rile nzovir~gaverage and N=20 for ! polynot?lic2ifiltet: For t h f ~i ~ - r ~ - n t ~ ~ ~ ~ r L n o l y t 7 nr7d o r ~ 7hT i ~=l 20, l f i ltlzefilre?~ fctc~otcoef~icierirsare as follow: a. = 4 . 1 critd al=0.01429; r h ~ f i l t e r J'uctors ci carz he calculcrted with Equatiorl 2-29. Cor71pat-ctlze t-e-csultswith f11oseobtai~iedwit17 lize iizdiridual filters.

There are many other error reduction techniques, but analyzing all of them is beyond the scope of this book. Statistical process control represent an area of special interest for data validation and data conditioning in connection with process control applications 13, 15, 17, 181. Stanley 1191 provided an excellent review of almost all known error reduction methods, including data reconciliation. The focus of this textbook, however, is data reconciliation which exploits redundancy in process data to more accurately adjust the process values and detect gross errors.

' I

SUMMARY Random errors can be naturally described by a normal probability distribution which is suitable to most measurements associated with physical sciences. Other distribution types can also be used. Standard deviations of secondary random variables can be estimated from the standard deviation of the primar). random variables. Individual filtering techniques can be used for error reduction in process measurements, but they are not easy to tune. Some reduce significantly the errors, but with large delay. Others have less delay, but ove~shoot/undershootafter a trse step process change. Hybrid filters perfor~nbetter than individual digitai fi!ters, but for best performance they need to be able to reccgcize the type of process signal.


REFERENCES I . Madron, F. Process Plant Perjornzance: Mc,n.rurenzeni and Data Procexsincp for Optin1imtior2and Retr-o5t.s. Chichester, West Sussex, England: Ellis Horwood Lirnitcd Co., 1992.

2. Liebman, M. J., T. F. Edgar, and L. S. Lasdon. "Efficient Data Reconciliation and Estimation for Dynamic Processes Using Nonlinear Programrninz Techniques." Conputers Clzem. Engng. 16 (no. 10/11, 1992): 963-986.

3. Wadsworth, 1-1. M. Handbook of Statistical method.^ for Engineers arid Scieiirisfs. New York: McGraw-Hill, 1990. 4. Kao, C. S., A. C. Tanlhane, and R.S.H. Mah. "Gross Error Detection in Senall) Correlated Process Data." lrid. & Eng. Chelri. Resetrrrh 29 (no. 6 , 1990): 1004-1012. 5. Mah, R.S.H. Clze~nicalProcess Structufes and Inforr?ial~onFlows. Boston Butterworthc. 1990. 6. Zalkind. C. S., and F. G. Shinskey. "Sta~isticalMethods for Computing Over-All System Accuracy." lS;1 Joz~rnal(Oct. 1963): 63-66.

7. Nair, P., and C. Jordnc!le "Rigorous Data Reconciliation is Key to Optima1 Operatioils." Control for rlze Process I~zdusfries,Vol. IV, no. 10. pp. 1 18-123. Chicago: Putman, 1941.

S. Dunia, R . S. I. Qin, T. F. Edgar. and T. 5. McAvoy. "Identificati~nof Fauir? Sznsors Using Principal Componer~tAnalysis." AiCllE Jounzai 42 (no. 10. 19961: 2797-28 12. 9. SeSorg. D. E., T. F. Edgar, and 3.A. Mellicharnp. PI-ocess Dy17cci::icrQ Cl~~7r~-ui. New York: J ~ l i nWiley & Sons, 1989.

I I ~

i O . Kim Y.H., and j. M. Lec. "Impi-ove Frocess hleasuremcnts with a teas! Squares Fiiter." fidr-ocorborr Proce.ssir:g (Aup. i592): 143-146. 1i.

Khinehart, R. K . "Method for Auton~aiicPiciapiation c:f the Time Con5tanr for ;I Fii-st-Order Filter." lt7d. & Etcg. &/;ern. Resrorci~30 (no. 1. 1931 i: 275-277.

12. Tham. M. T., and A. Pan-. "Succeed at Online Validation and Reconstruction of Data." Chetn. Er7g. PI-ogress (May 1991): 4 t ~ 5 6 . j j.

\l'eber, K. "Measurement Smoothirlg with a Nonlinear Exponential Fiiter ." AiCl11:.lou1-1iul26(no. I, 1980): 132-1 33.

14. Clinkscales, T. A., and C. Jordache. "Rybiid. Digital Filtering Technique.; for- PI-occss Data Noise Attenuation with Reduced Delay," presented at the AIChE Spring National Meeting. Atlanta. Ga., April 1994.

15. MacGregor, J . F. "On-line Statistical Process Control." Chein. Engtlg. Progress (Oct. 1988): 21-31. 16. Montgomery, D. C., and E. A. Peck. IIITI-oductio~~ to Linear Regressiotz Analysis. New York: John Wiley & Sons, 1982.

17. Lucas, J. M. "Combined Shewhart-CUSUM Quality Control Schemes," Journal of Quality Technology 14 (no. 2, 1982): 51-59. 18. Rinehart, R. R. "A CUSUM Type On-line Filter." Process Control and Qualify (Amsterdanl: Elsevier) (no. 2, 1992): 169-176.

Linear Data ~econciliation

19. Stanley, G. M. "Avoiding the Chattering Rule: Filtering and Other Techniques for Ignoring Noisy Data but Noticing Instrument Problems," presented at Gensym Diagnostics Working Group, Woodlands, Tex., Oct. 1, 1992.

Linear data reconciliation for steady-state systems has already been introduced in Chapter 1 . The examples analyzed in Chapter I are instances of a linear data reconciliation problem. The gensral formulation and solution of linear data recoilciliation problenls is discussed ill this chapter. Vector notation is used in this and subsequent chapters becacise ir provides a coinpact reprzsentation and allows powerful concepts from licear algebra and rnatrix theory to be exploited. A-ppendix A provides an introductior: to some basic ccnceps of vectors and matrices.

LINEAR SYSTEMS WlTH ALL VARCABLES MEASURED As shown in Chapter- 1, the simplest dsta reconciliation prcblenl ir~volvesa linear model with all variables d i r e c t l ~measured. We also assume that the measurements dc not contain any systematic biases. General Form*drr,i-n end Solution

The model for the measurements described by Equation 1-6 can be written as

where y is a vector of tz n~easul-ernents,x is the corresponding vector of true values of the measured variables and E is the vector of unknown ran-

dom errors. Although in Equation 3- 1 we have assumed that the measurements and variables have one to one correspondence, it does not impose any limitation on the applicability of the method. Other forms of the nleasurenlent model in which the variables are assumed to be indirectly measured can be converted to the above model using appropriate transformations. These issues are discussed in Chapter 7 along with gross error detection strategies. The constraints described by Equations 1-7a through 1 -7d can be represented i11general by

where A is a matrix of dimension ;r! x n, and 0 is a m x I vector whose eIements are ali zero. Each row of Equation 3-2 corresponds to a constraint. It can be easily verified that for a fiow reconciliatior1 problem, the elements of each I-ow of matrix A are either + I , -1 or 0, depending on whether the coi~espondingstream flow is input, output or, respectively, not associated with the process unit for wiijch the flow balance is written. In general, if some of the variables arc: kno.aln exactly, the RHS of Equation 3-2 is a constant nonzero vector. c. The objec~lvefunction. Equaticr~1-9. can bt: repre~entedin general by Min (y - XI'' I

w ( -~X ) Statisiical Basis of Data Recsnciliation

Thc n .r i7 nlatrix W is usua!ly a d i a z o n ~ matrix. l tbe Ciqoilal elcments icpseseniing the wzights z s in Eqtiation 1-5. Roll-ever, ir, gcneral. it car! also contzir: nonzero off-diag~nalelements. The ~nterpretatior,of the element:; of M' in terrns of the statistical properties of the errors r is discussed in the next section. The analytical solution to the abo\:e problem can be ol-~tainec!using the method of Lagrange multipliers [ 1 , 2)

where we have denoted the solution for the estirilates using the notation

2 In der-iving the above solution it is assumed that the matrix A is of full row rank, which i~nplizsthat there are no linearly dependent constraints in Equation 3-2. If the RHS of Equation 3-2 is not identically zero, but a known constant vector c, then the estimates are obtained by replacing the kector Ay in Equation 3-4 by Ay-c.


So far we hstre dzscribcd the fonnula:iori of t!~edsta ieconci!iatic~nprobiein from a pirely inrliitive viewpoint, especially with regard to the seleciion of the cbjective function weights to be uszd for different measurements. The data reconciliation probiern car, also be explained tlsing a statistical theoretical basis, which not only helps in understanding this subject better, but also provides useful quantitative informaticn a h m t t h e improvement in the accuracy of the data obtained through reconciliation and the statistical properties of the resulting estimates. These can be used to identify grossly incorrect data or to design sensor networks as described in Chapter 10. The statistical basis for data reconciliation ariscs fro111 :he properties that are assumed for the random errors in the measurements. Generally. as mentioned in Chapter 2, it is assumed that the random errors follow a multiva-iate normal distribution with zero mean, and a known variance-


Llar~iRecniiciliiiiion rirrd Gross Erroi- Drrccfrori

covariance matrix C. However, it should be kept in mind that sometimes the primary measured signal is transfonned into the final indicated variable of interest. If the transformation is nonlinear such as Equation 2-7, then the error in the indicated variable need not be normally distributed. As indicated in Chapter 2, only the linearized fonn can be approximated by a normal distribution. Thus, if possible, the variables x in the measurement model of Equation 3-1 should represent the primary measured variables, and relationships between the primary measured variable and the variables of interest should be included as constraints. In case the constraints are nonlinear, then a nonlinear data reconciliation technique as described in Chapter 5 can be used to solve the problem. The matrix C contains information about the accuracy of the measurements and the correlations between them. The diagonal element of C, 0; is is the the variance in measured variable i, and the off-diagonal element covariance of the errors in variables i and j. If the measured values are given by the vector y. then the niost likely e\timates for x ale obtained by maximizing the likelihood function of the multivariate norma! distribution:


Max x

I ( 2 ~ ) " I' C~ !"I2

expf- O.i(y



It is also now possible to derive the statistical propel-ties of the estimates obtained through data reconciliation. Consider the case when all the variables are measured. The estimates are given by


- C A ~ ( A C A ~ ) - ~= AY [ I - C A ~ ' ( A C A ~ ) A=I BY ~


Equation 3-8 shows that the estimates are obtained using a lineartransfonnation of the measurements. The estimates, therefore, are also nomially distributed, with expected value and covariance matrix given by

'r .I x) X (y - x)}

where ICI is the determinant of 1.The above maxinium likelihood esti n~ationproblem is equivaleni io rninilnizi~gthe function

The estimates are also required to satisfy the co:;siraints, Equation -3-1. Comparing Equ:idons 3-6 and 3-3, we note that the fonnulstjcrr of the data reconciliation problern from a statistical viewpoint. simply reqcires that the weight matrix W be chosen to be the ir~verseof the covariance matdx C. This choice is also reasonable, if we consider the matrix C to be diagonal. Ir. [!?is ~ ? P ,PEn.--+:,-2-6 hpci\rn,=c --,~ . . -~ ~ ~

above choice gives larger weights to more accurate measurements. Another advanrage of using Equation 3-7 is that it is dimensionless since the standard deviation of a measurenient error has the same units as the measurement. The estimates can now be obtained using Equation 3-4 by replacing W with C-I.


where oi is the staridard deviation of the error in measurement i. Equation 3-7 shows that the weight factor for each measurement is inversely proportional to the standard deviation of its en-or. Since a higher value of standard deviation implies that the measurement is less accurate. the

Equation 3-9 implies that the estimates are unbialed, which is a property of ~naxirnun~ likelihood estimates for the linear s>zstelns.Equation 3-10 gives a measure of the accuracy cf the estimates. In the case where some of the vanables are unmeasured, it is possible to der-i\:e similar properties. These statistical properties are exploited to identify measurements with gross errors as we!l as to design sensor ne!works.

LINEAR SYSTEMS WITH BOTH MEASURED AND UNMEASURED VARIABLES For part~allyrneasuied systems. the reconc~!~at~on pioblem 15 usually solved by decompos~ng~t Into two subproblelns i?. 41. !n the first cubproblem, the reducdant measured variables art: recnnc~led.followed by a coaptat~onproblem In which the ob\crvdble unmeasured var~ablesare e\t~mated T h ~ rstratqy 1s more eiiic~enlthan an attempt tc estlmate all the vanable\ s~multaneou\lyThe general formulaticn and solut~onot the reconcll~atlonploblem lor part~allymeajurcd system\ 1 5 no& descr~bcd Let the number oi unnieawred var~ablesbe p The vanables are class1 fied Into two sets, the vector x of measured vanables and the vector u of

unmeasured variables. The measurement model is still given by Equation 3-1 and the objective function by 3-6. However, the constraints have to variables. Equabe recast in terms of both the measured and un~~zeasured tion 3-2 is written as

where the columns of A, correspond to the measured variables and those of A, correspond to the unrneasured variables. Matrices A, and A, are of dimensions 172 x n and nz x p, respectively. The unrneasured variables u have to be eliminated from Equation 3- 11 using suitable linear combinations of the constraints. This is equivalent to pre~nultiplyingthe constraints by a matrix P, also known as a projection matrix [4]. The matrix P should satisfy the proper-ty

I'A, = 0

(3- 12)

Premultiplying Equation 3-11 by matrix P, we ge: the reduced set of constraints invoiving only measured variables a:,

The number of co!uin!?s 5.f I' should clearly bp, equal to the number of constraints, in. As many in3epencie1:t rows s s possible are co~structedfor P which satisfy property 3-i2. The riumber of such ruws, t. is lifiked to the observabilit:~of the unnteasured variab!er,. If all the unmzasured variables are obser\.ab!e as ir, Cases 1 and 2 of Example ! -2, then t is eqaal :o n1 - p. This ca:l be easi!y infei~edby notiilg that i is equal :o ihe n u n Scr of constrai~tsin the reduced coristrainrs of Equation 1-13. If all p unnieasured variables can he uniquely estimated, then this requires p of the constraint equations. Thus, only the remaininp 177 - P constraints are available for recorlciling the measured variables. It can also be proved that tor sll the unmeasured variables to be observable. the p columns of A, shoulu be independent.

Exercise 3-3. Proile that ifthe col~rnlnsof tnatr-rx A,, are linearly independent, tl~er? lrnique estimafrsfor he exrst.


Exercise 3-4. Solve the reconciliation problem.for- the of linear constraiizts with constant terms: A,x + A,u = c, when the colunzns of A, nzay or- tilay not be linearly independent.


If not all unmeasured variables are observable, then t is equal io ~n-.s, where s is the number of independent columns of A,. The interpretation of this result can be done as follows. Since the unmeasured variables -cannot be uniquely estimated, the estimates of a few of the unmeasured variables have to be additionally specified in order to uniquely solve for the remaining unmeasured variables. If p - s is the ntinlber of unmeasured variables whose estimates have to be additionally specified, then to solve for the other s unmeasured variables requires s of the constraint equations, resulting in nz - s remainkg constraints for reconciiiation. A comparison with Case 3 of the Example 1-2 sholvs that 171 = 4, ti = 6. and p = 4. However, only three of the columns of A, are independent, and an estimate of one of the unmeasured among xs to xi have to be additiona l ! ~specified in order to estimat~,all thc other variables. Thus. ?: - s = i or s = 3. and the number of constraints in the rzduced set. m-s is equal to ! as observed froin Fxuatiurl 1-15. Note that the ilurnbsr of ionstrainis in the redl~cedset is a!so known as degrees of ~edtcnd~t:q.. The reduced data reconciIiation prcblern is to nliililnize 3-6 sub.ject to the ccmtraifits, Equation 3-13. Since the cons?rain:s are similar to Equdtion 3-2, the reconciled valuzs f ~ xrctin be obtained asing Equation 3-8. with the matrix A being replaced by the reduced matrix PA,.

Usicg Equation 3-14, we can ow substitute for x In Equation 3-12 ar~ii obtain the estimates u for the variables u, provided all the variables are observable (01 Gle ~ o i u m n sof A, arc independent). Since A, is a 171 x p matrix with p < f i z , a least-squares approximate solution car, be used. Fmrrl the theory of generslized inverse 151, the least-squares solution is given by


Lfnru Recor~~linriu~r arid Error Deirctior~

The general solution for Q for when all the variables are not observable is deveioped in the next section. The decomposition strategy described above is also useful for data reconciliation of processes with nonlinear constraints as described in Chapter 5. The only additional issue to be discussed is the construction of the projection matrix P, which follows.

where n,u is a reordered vector u. Premultiplj~ingEquation 3-19 by QT we get

or, rearranging

The Construction of a Projection Matrix

There are several different matrix methods for the construction of the projection matrix. One such method is given by Crowe 141. However, probably the most efficient method is to use the QR factorization 151 of the matrix A,. Such a method was first applied to data reconciliation by Swartz [6] and recently utilized by Sanchez and Romagnoli [7] to decompose and solve linear and bilinear data reconciliation problems. Consider the case when the columns of the m x p matrix A, are linear;y independent. Then it is possible to factorize A,, as

where n, is a pe:mutation nxtrix (that is. the columns of 1-1, are the permuted columns of the identity matrix). R , is a nonsingular p x p upper triarigular matrix, and Q is a m x 111 orthogofia! matrix, that is,

In essence, thc i o l ~ m i l ssf Q form a basis for the nz-dimensional space, while [he inatrix K,represents :he p columns of A, in terms of the first 11 basis vectors, Q , . Since Q is or!hogo~ial, the matrix Q2 has tile property


From Equation 3- 18. it is clear that the matrix is the desired projeclion matrix P. The QR factorization is also useful in estimating the unmeasured variables easily. Using the QR factorization. Equation 3-1 1 can be written as

RR,U = - Q ~ A , X


Using Equation 3-1 6 for R in Equation 3-21 we gel


Since R l is a p x p uppel- triangular matrix, Equation 3-23 can be easily solved by backward substitution to givc the estimates of 11. The soluticn car, be formally expressed a?;:

By substituting for the estimates of x (obtained ilsing Q: for P in Equaticn 3-14) in the above eqcation, we obtain the estimates f ~ irr (sicce i7,c is a reordered for-ITIof the original verror uj. In the case when only s of the columns of A, x e independcat, then the QR factorization takes !ne form

where R1 now is a s x s nonsingulal. upper triangular matrix, and R2 is a s x ( p - s) matrix. The projection sllatrix is still given by Qz. In the same

way, the unmeasured variables can be partitioned into two subsets of s and p - s variables.

In order to use the QR factorization for estimating the unmeasured variables we substitute for R in Equation 3-21 using Equation 3-25 and for n,u using Equation 3-26 and obtain

The QR factorization of matrix A, gives

The upper part of the matrix Equation 3-27 involves only the unmeasured variables:

which, since R1 ic nonsingular, gives the solution

From the matrix R. it can be inferred that s = 3 , and that fhe subniatrix corresponding to the first three colun~nsand first three rows is R,. The projection matrix is the transpose of the last column of Q. The reduced constraint matrix is given by

Equatiort 3-29 indicates that the solution for the first s (reordered) unrrleasured variables can be obtai~ledonly if estimates of the remaining p - s unmeasured variables are specified. This is also c~nsistentwith the fact Char not all ucmeasured val'ables are observable. Thz Qii factorization described here is aiso useful i11 identifying which of :he cnrneasured variables are unobsen~i?bIcas described it1 the next section.

The reduced constraint matrix can be seen to be equivalent to Equation 1- 15 which was obtai~iedusing si~npiealgebraic manipulatiori.

U'e illustrate the colistruction of the project~cnmatrix by QR Fdctoilzatic11 2nd its utility in dr,ter:nini~;g observable and uncbser\lab!e variables by usir.5 the fiow reconciliarion problem used in Case 3 of Example 1-2 where fiows of streams i and 6 are the vnly va~iablesmeasured. Frcjrn the constraints Equations 1-7a thrcugh 1-7d for this process. we can obtain the matr-ices corresponding to measured and unn~easured\~ariables.and these are given by

;>bservability and Redundancy In Chapter 1, we introduced the concepts of obscrvat?ility and redundancy without f~rniallydefining them. I n this section. we define these terrrls ciearly and discuss diffe~enttechniqaes for variable classitication. :'he concepts of observabi!ity and redundancy are in~imatelylinked with the solvability and estimability of variables. In medium and !ar,oe scale procsss plants, there are hui~dredsof variables, and, for technic,?! and economic reasofis it is not possible to measure all of them. It is thus important to know f ~ ar given process and s sei of rileasured variables, which of the unmeasured variables can be estimated. The concept of observability deals with this issue. It is also useful to kilow whether a measured variable can be estimated even if its sensor fails for some reason. Redundancy deals with this question. Observability and redundancy analysis can be exploited for adding new measuring instruments or for altering the choice of the set of variables to be measured. It can also play a useful role in efficier~tdeco~nposition and solution of the data reconciliation prob!ern.

Definition of Observability: A variable is said to be observable if it can be estimated by using the measurements and steady-state process constrain ts. Definition of Redundancy: A measured variable is said to be redundant if it is observable even when its measurement is removed. From the above definition of observability, it is obvious that a measured variable is observable, since its measurement provides an estimate of the variable. However, an unmeasured variable is observable only if it can be indirectly estimated by exploiting process constraint relationships and measurements in other variables. Measured variables are redundant if they can also be estimated illdirectly through other measurements and constraints. The observability and redundancy of variables depend both on the rneasul-enlent structure (also called the seizsor networkj as well as on the nature of the constraints. We have already seen how the measrlrernent selection affects the observability of flow variables in the three cases for the example shown in Chapter I . A systematic approach is necessary for determining which of thc: unmeasured variables are observabie. There a:.e broadly two apprcaches that have bcen f ~ l l o w e dfor solving !his problem. One class of methods is based on the use of linear algebra and matrix theory, ujbile the cthcr uses principles of graph theory. Both approaches are discussed here since they provide valuable insights.

Observability and redundarxy classificaticr~of variables can be cai~ied out as pal? cf thc solution of the data reconciliation problem [6, 71. We first describe bow unobservable variables can bc identified during the construction of the projection matrix. Unobservab!e variables are present only when the colu~nnsof A, are not !inearly independent. In such cases, the QR factorization of A, has the form show11 in Equation 3-25, which can be used to rearrange the constraints in the form of Equation 3-27. The solution for unmeasured variables given by Equation 3-29 can be written as



I ;




@ '




T h e matrix K,,contains all the necessary information to classify unmeasured variables. If a row of R, has no nonzero element, then the corresponding unmeasured variable in the LHS of Equation 3-30 can be estimated purely from the estimates of x and is therefore observable. If on the other hand, a row of R, contains a nonzero element, then the corresponding unmeasured variable on the LHS of Equation 3-30 is unobservable since it depends on the estimates chosen for the p-s unmeasured variables on the RHS of this equation. All the p-s unmeasured variables in the RHS of Equation 3-30 are :*!so unobservable, since their estimates have to be specified. Redundant measured variables can be identified either by looking at their reconciled estimates or by considering the reduced constraint matrix A nonredundant measured variable will not be adjusted since it is not possible to estimate this variable indirectly through other variables. Hence, its reconciled value \~iillbe identical to its measured value. Corresponding to this variable, the elements in the collumn of natrix QTA, u~illall be zero.


Example 3-2 In order to classify the measilred and unmeasured variables of Example 3-1, we can make use of the Q R f;ictosization already con:puted in the example. The nldtrix R,, which is useid for classifying i~nrneas:lred variables, can be computed as

Since all the rows of R, contain a norzero eleinent, this implies that all unnleasured variables are unobservable. In order to classify the measured , in Exanlple 3-1. Since variables. we make use of matrix Q ~ A computed both the colun~nsof this matrix contain a nonzero elen~ent,we infer that both measurements (flows of streams 1 and 6) a - e redundant. These results can be cou~itei--checkedwith the results of Example 1-2, Case 3.

Observability and redundancy classification using the projection matrix was also used by Crowe [8]. Crowe applied theoretical rules and derived algorithms to claqsify flows and concentrations in material balance data reconciliation. The procedure allows the inclusion of chemical reactions, flow splitters and pure energy flows. Fewer classification rules, however, are required by the Q R projection matrix approach described above. Matrix methods for obser\~abilityand redundancy classification in bilinear processes were developed by Ragot et al. 191.

Graph Theoretic bfethod The use of graph theoretic concepts for observability and redundancy classification when only overall flows are considered have been developed by Vaclavek [lo] and Mah et al. [l 11. Later, Vaclavek and Loucka [12] extended their classif cation ideas to multicomponent systems, while Stanley and Mah [ l3J developed classification algorithms for energy systems. Kretsovalis and Mah [14, 15, 161 developed graph theoretic algorithms for classifying flows, temperatures and cornpositicn variables in general processes, while Meyer et al. (171 developed a simpler algorithm applicable to bilinear processes. In this section, wc focus only on overall flows of the process. In order to use g r a ~ htheory. the pri?cess under coilsidcraiicn should be represented as a process zr-aph. The process zraph can be simply derived from t ! ~ zflov:sheet of the process by adding an extra node denoted as the environrnent node and cocnecti!lg all process feeds and products to it. Tius. for the prQcrss of Figure 1-2 in Chapter 1, the process graph is sliown in Figlure 3-1, u~hcre~~edirection of the streems are not indicated since they are irrclevai?t for the present analysis. The following s i r ~ ~ pyp,l i e powerful result is ohtai~edf r a n graph the:~ry for identifying unobservable flows: ,An unmeasured flow is unobservc15le if and only if it iorms part o ; some cycle consisting solely of unmeasured flew streams of the prccess graph.

As an example, consider Case 3 of Example 1-2 for which the unmeasured flows of strcanis 2 to 5 were shown to be unobservable. The process graph f a this case is shown again for convenience in Figure 3-2 with the measured streams marked by a cross. It can be easily observed from Figure 3-2 that these streams forin a cycle. On the contrary, for Case 3, the unmeasured flows of streams 3 through 6 d o not form any cyc!e arnong them and are therefore observable. This can be verified fi-oin the pl-ocess graph for this case shown in Figure 3-3 in which the s marked. measured e d ~ e are

Figure 3- 1. Process graph of heat exchanger with bypass

Figure 3-2. kiwi exchanger process graph with

unobservable values.

Redundant measurzd variables can also be identified by using the following simple pr~cedure.We merge every pair of nodes which are linked by an unmeasured stream, obtaining in the process a reduced graph which contains only measured streams. A11 the measured streams of this reduced graph are redundant and will be reconciled. Any measured stream that gets eliminated as a result of the merging process is nonredundant. For example, we apply the merging process to Figure 3-2. The reduced graphs obtained after merging, in sequence, nodes linked by streams 2, 3, 4, and 5 are shown in Figures 3-4a to 3-4c, respectively.


I1u;ii R e i o r ~ c i l i ~ l r arid o ~ r Gri~ccError lleredioll

Figure 3-3. Heat exchanger process graph with observable variables.

The final reduced graph of Figure 3-4c contains the measured edges 1 and 6, which implies tha; they are redundant and will be reconciled. This can be compared with the results of Case 3 of Example 1-2 which shows that flows of sirearns 1 and 6 are present in the reduced data reconciliation proble~n.It can also be observed tha: the reduced data reconciliation problem can be obtained by writing the constraiilts based on the reduced pi-ncctss graph of Figlire 3-4c. Thus, for flcw reconciliation of processes containing unmeasured vanahles, [he reduced data reconciliatior? prob!sin can be fornlulated using a reduced graph instead of using a projection matrix technique.

This exarcple is ased to illusrrate the presence of observzbieiunobservabje unmeasured variables and redundantlnonredundant measured variables coexisting in the same procesr. The process graph for this example is drawn from Mal; [ I l l and is shown in Fie~irz3-5. The rnras~lredflows of this process are indicated in the figure. For classifying the unmeasured variables easily, the measured edges from Figure 3-5 can he deleted resulting in the yraph shown in Figure 3-6a. From this figure it is observed that streams 8, 11. and 14 f o m ~a cycle and are therefore unobservable. The remaining unmeasured flows are observable. In order to identify the redundant measurements, the nodes linked by unmeasured edges are merged, resulting in the reduced graph shown in Figure 3-6b. All the measured flows present in Figure 3-6b are redundant. but the measured flow of edge 1 is nonredundant since it is eliminated during the merging process.

Figure 3-4. Heat exchanger graph after inergin] unobsenablr strerlrns (oI stream

2,(b) stream 3, 0 1 976 Ame:;ccn

(c) strecr~s4 and 5 Reprinted ~ v Ii permissioq from Chemicoi Society



Figure 3-5. Process graph of a refinery subsection. Reprinted with permission from Copyright 0 1 976 American Chemical Society.

[I I].

Dara Rpco~rcilioriot~ urld Gloss Error- Derecriorf

Other Classification Metfzods

Figure 3-6a. Subgraph of unmeasured variables of refinery process. Reprinted with permission from [I


Copyright 01976 American Chemical Society.

Besides the two major variable classificatiori methods previously described, a different approach was developed by R o n ~ a g n o l iand Stephanopoulos (18, 191. They used an output set assignment of the mass and energy balance equations. This approach has been Inore recently revised and used in a data reconciliation computer package (PLADAT 1201). The :nost importact merit of this approach is a classification of process constraints which enables building a reduced set of constraints for data reconciliation (redundant subset). If such a redundant set exists, the reconciliation prob!em can be decomposed into a redundant subpr-obiern (solving the reconciliation problem with the redundant set of constraints) and a couptation suhprw0lern (solving for the observable unmeasured variables). But a QR de-~ o r n position approach for both variable classification and solving of the data reconciliation problem is more straightforward. Recently. Sanchez and Romagnoli 17) also used it for reconciliation problems involving linealand bilinear constraints.


Figure 3-6b. Reconciliation subgraph of refinery rocess. Reprinted with permis[i I J Copyright O I 976 American C h e n l i c o P 5 ~ i e ~ .

sion 60 -,

Hitherto, we have as:;unied that the measurement error covariance matrix Z is complete!^ known. One possibie method of obtaini:ig rhc error variances and covariances is frorn the errar characteiistics of different compancr?is (such as sepsor, transmitter. recorder. ctc.) as explaiced in the preceding chapter. In order LC use this approach, q e need information about the stzndsrd deviations of thc errors comtnitted by the different components, as well the transfal-mations used in p r ~ c e s s i n gand transmitting the data. It is generally difficult to obtain this information. a!thoagh the hzstrrrrnent Engiizeeis Handf7ook by Liptak 12 i j is a goou source for such da:;i. It should be noted that the data given in this handtook is under idcai laboratory conditions and may not be valid under actual process conditions. If nenlinear transformations are involved in processing thc raw measured data, then the standard deviation of the measurement error can b- c o n putcd only by using linear approximations of the transforming filnctions as in Equation 2-13. Bagajewicz [22] has shown that the measurement error obtained in this manner can be considered to be normally diztrib-

uted, if the range of the measuring instrument is large co~nparedto the standard deviation of the measurement error. An alternative way of estimating the covariance matrix is from a sample of measurements made in a time window. If yi, i = 1...N is the vector of measurements made (typically at successive sampling times), then an estimate of the covariance matrix can be obtained analogous to Equation 2-6 as

where 3 is the sample mean given by

The above rnetliod of estimating the covariance matrix is known as the nil-ecf~lzetlzorl.An inipor-tant requirement for estimating C using Equation 3-32 is that the true values of all variables should be consta:lt during the iii:le interval in ~ i ~ h i cthe l i above nieasurenlents are made or, in other \:,or&. the proces,; shculd truly be in 2 steady state. In actuai practice, the true values c11ar.g~continuously arid the estimates obtained wil! be poor if these changes are comparable in magnitude to the rceasurement errors. On the cther hand, if the measurements contair! a gross error, the estimate of :he covariance matrix is cot a f i c t e d provided the magnitude of tile Fross error is constant i;i this tin:e interval. Almasy ugd Mah [23] first proposed an inc!it-ccr ~~~e:llod of rrtirnating rhe co\laria:ice n;atrix when the true values of the pn)czss are u r i d e ~ g o i ~ ~ g constant changes. 'Their method e x p l ~ i t sthz constraint model given by Equat~cn3-2. For- this p u r ~ ~ o swe e . define t!ie constraint residuals i- as

Using Equations 3- 1 and 3-2 and the assumption that the measurement errors follow a nor-ma1 distr-ibution wit11 0 mean and covariance matrix C, \rre can prove that r follo\vs a normal distribution with 0 and covariance matrix V given by

It can be observed from Equation 3-34 that the constraint residuals do not depend on the true values of the variables and d o not need any ~nformation concerning their behavior. From a sanlple of measurements, we can obtain an estimate of the covariance matrix of constraints residuals as

Note that in obtaining an estimate of V using Equation 3-36 we do not make use of an estimate for the mean of the constraint residuals frorn the sample of measurements, since we know its true mean to be 0. The estimate of V can be used in Equation 3-35 to back-calculate and estirnate for C. We first note that the matrices V and C are square symmetric matrices of dimensions rn and a, respectively. The use of Equation 3-35 to estimate C from V, therefore, implies that we have to solve for n(n + 1)/2 parametel-s from in(17z + 1)nequations. Since i z is greater than nz, several possible solutions for C can be obtained. In order to obtain a unique solution, Alnlas). and Mah [23] suggested that the sun1 square of off-diagonal elernents ol-'1 be minimized subject to satisfying Equation 3-35. This is based on tile argument that. in practice, C is usually diagonal or diagonally dorniriant. Art analyticai solution can be cjbiaincci for this problem as follows: 1x1the vector d consist of the diagondi elemsnts and the vector t consis? of the of?-diagonal e!cments cf C iwhich can be formed by piacing ihe colunms one below the other, considering only !he el~mentsbelow thz diagonal). \ire car, similarly arange the diagonal and off-diagonal elements of as a vector p and I-ewriteFq~ation3-35 a:;

where the rnairices M and W are mal;pirr,o matrices that can be constructed froni the elements of the constraint matrix A 1241. The solution for the diagonal and off-diagonal elements of C are given by

Dora Kcconr-il;trtiotl arxd Gross Error I)~=recriotz


Keller et d.1241 extended the above indirect method for obtaining an estimate of C which has only a few nonzero elements in specified locations. In order to obtain a good estimate of C using the indirect method, the following two conditions have to be met: ( I ) The true values of process variables corresponding to each measurement should satisfy the constraint Equation 3-2. ( 2 )The measurements should not contain any gross errors. If a gross error exists in any measurement, then the constraint residuals will not have zero mean. Typically, when the true values of process variables undergo changes, we cannot ignore accumulation terms in the material and energy conservation constreints and the first cozldition may nor be met. In such cases, it has to be questioned whe:her the indirec! method offers an advantage cver the direct method. In order to tackle this problem, Almasy and Mah [23] recommend illat each yi used in Eqlation 3-35 to obtair, an estirnate of V should be the average c f the measurements made within a time interval in which the process is operating around a nominal steady state. A set of N such time perlads should be chosen to obtain a sample of averazed rneastire~nentvalues io be used in Equaiicn 3-36. A justification for this recommendation can be given using the fact that, in ~ r ~ c t i c steady-state e, data reconci!iation is applied to measurements averaged over a time interval in which the process operatss zround a nominal steady state (see industrial examples discussed in Chapter I). Even if the true \ralues in this time period are randomly fiuctl~atinoabout the nominal stead\!state values, we can expect the average of the true valdes to satisfy the steady-state conservation constraints as also assumed in data reconciliation. In order to tackle the problem of gross errors in the indirect method, Chen et al. 1251 proposed a robust method of estimation in which the different constraint residual vectors are given appropriate weights when computing the estimate of V using Equation 3-36. A sniall weight is assigned to a constraint residual vector if it is not consistent with the other vectors in the sariiple set. The estimation algorithm is iterative and

described by Chen et al. 1251. This procedure is useful if only some of the measurement vectors have gross errors. If a gross error of constant magnitude is present in all measurements, then the above procedure will not eliminate the problem. One practical solution is to choose the data from time periods that are widely separated in time so that they do not share common features such as the same gross error with the same magnitude present in all the samples. This can be done if a large historical database of operating data is available. One can also choose to combine the estimates obtained from different methods in a judicious manner. The indirect estimation method, however, has not yet been extended to treat nonlinear constraints.

SIMULATION TECHNIQUE FOR Eb-ALl!_ATING DATA RECONCILIATION We conclude this chapter by describing a simulation technique that can be used for evaluating the effectiveness of data reconciliation and to estimate the error reduction that can bc achieved. For the purposes of simut iation, the following inpur data has to be obtained: (i) Tne proczs flowsheet, which ir~dicatesthe number cf process units, the process streams and ilieir connectivity. The type of process unit need not be specified if sirnukition of cnly overall flov: reconciliatien is to be performed. (ii) The "true" o r "nominal" steddy-state flolr~values of ail strrams. These true values must be consisteri: with the f l o \ ~baiances of the process. These tzue values are usefir1 fcr judging [he improvement in accuracy achieved through data reconcilia~ion. (iii) The set of nieasured flows of :he process and the standard devia;ion of the enor in each n~easurelnent.The standard deviation may be expressed as a fraction of the true value or specified as an absolute \>due.

At first. random errors arc generated which follow a standard normal distribution with mean zero and giver, standard deviations. These are added to the true values to obtain the sirnulated "n~easurements." The constraint matrix, A, is obtained based on the process connectivity information, and the submatrices, A, and A,: are also obtained corresponding to the measured and unmeasured variables. The projection matrix IS now computed using a Q R factorization of A,. The estimates can now be computed using Er~uations3-14 and 3-24 or 3-29. The error in the estimatt.; can now be computed by comparison with the true values. '

In order to obtain a statistically accurate estimate of the en-or reductio~l achie\rable thsough data reconciliation, it is necessary to perform several s i ~ n u l ~trials t i ~ with ~ differ-ent random measurements generated in each trial. T h e reduction in error d u e to data reconciliation is coinputed and averaged over all the trials. Typically about 1,000-10,000 simulation trials are used to obtain this esti~nate. Many software packages like MATLAB and mathematical libraries such as I M S L o r HARWELL have pseudo-random number generators that can b e used for simulation purposes. It should b e noted, however, that it is implicitly assumed in the sirnulation that there are n o rnodel errors (that is, the true values of variables satisfy constraints) and that the measurement errors are norriially distributed with known variances. In practice, since these assumptions may b e violated, the error reduction that can b e achieved will be less than that estimated through simulation. L!ltimately. the benefits of data reconciliation should b e evaluated in pr-actice through actual inlpro\;ernent in process perforrl1ance.

REFERENCES I . Kuehrl, D. R., and H. Davidson. "Computer Control. !I. hlathematics of Control." Clzern. Eng. Progress 57 (1961 ): 4 4 4 7 .


2. Seber, G.A.F. Linenr Ragr-e.ssiorz AnnI>~.sis.New York: John \i:iley k Sons, 1977.

3. Mah, R.S.H. Cl1en1icnlProces.~Stt-rrcturc~sand h~for1?1ntiat7 Flo~r..~. Boston: Buttenvorths, 1990. 4. Crowe, C. M., Y.A.G. Campos, and A. Ihymak. '.Reconciliation of Process Flour Rates by Matrix Projection. I: Linear Case." AlChEJolo7wl29 ( 1983): 881-888 . 5. Noble, B., arid J. W. Daniel. Aj~pliedLitlci~t-Algebra. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1977. 6. Swartz, C.L.E. "Data Reconciliation for Generalized Flomsheet Applications," American Chemical Society National Meeting, Dallas. Tss., ! 989. 7. Sanchez, M., an3 J. Romagnoli. "Use of Orthogonal Transfonllations i n Data Classification-Reconciliation." Corl7pute1-s Chcrn. Et~gn,?.20 ( 1996): 483493

SUMMARY An analytic21 soluiioii is a\,ailahle for linear data reconciliclticfi with all variables measur-sd. Unmeasured variables can b e eliminated from the reconciliation mcdel by a projecticn matrix. A [educed model is obtained, which can be used to reconcile the measured variaS!es. lJa~-iablrclassificatio~r(a> !-r.drrr1c;ltrt~ih1~1~~-ci~~11il~1i7: 1ne;i:;ured valiables and o h . i e t ? ~ ~ h i c ~ / i r ; l o ~ . sun~l-reasured e ~ - ~ ~ ( ~ ~ ~ ~variables; d can lie perforr;~edusing n~atli:; nlethods, whilz sclvi~igthe! data reconciliation problem, or by a separate g!-aph-:heoretic algorithm.

8. Cromre, 2.M."Ohszi-\.ability and Redundancy of Procesh Ds:a for Steady Sta!t. Rzconciliatio~i."Cl~etr~. 1711::. Sc.i. 43 (! 989): 2903-19 17. 9. R . q ~ t .J., D. Maq~~in. G. Bloch, an2 W. Ciomo!ka. "Obsei-\),?Sil,ryand Val iahles Classification in R i h e a r Processes." Betie:ir.r Qrrrrt-t~r-11 J. Alr:o:r~trric Cor:trol-Joltrr~n(n 3 1 (1990): 17-13. 10. Vzclarrek. \'. "Studies on Sy:itcm Engi:leering 111. Optimal Cfioice o f t l l i Balance M ~ 2 ~ ~ r e l n e in n ; sComplicated (llhe:ni.;al Engineerin: Syste111s .-. Cliett~.G I ~Sc/. . 24 (1469): '347-955.

:1 . Mah. R.S.8.. c tication


G. M. Sra~:ley,and I). 'A'. Ilowning. "Reconc~liai~on and Rectiof Process Flo\v aad In\iento~-yData.'' Iilci. & Glg. C : I P ~ iT' tI- .c ) ~ .!]CS.

15 (; 976): 175-1Si.

12 Vacidveh. V , and M Loucha "Selcct~onof hleacu~enienr~ Yece\\dr\ to Achieve Mtilticomponent 121:lss Balances in Chemical Plan:." Ch(~tt7.Etlg Sci. 31 (1976): 1199-1205. / 13. Stanley. G. M., and R.S.11. Mah. "Observahility and Redundancy Clas\ilication in Process Networks. Tlreorems and Algorit!rms." Cilc~ttl.Etlg. Sci. 36 (1981): 19311954.

14. Krctsovalis, A., and R.S.H. Mah. "Observability and Redundancy Classification in h4ulticomponent Process bretworks." AIChE Journal 33 (1988): / 70-82.

15. Kretsovalis, A., and R.S.H. Mah. "Observability and Redundancy Classification in Generalized Process Networks. I: Theorems." Cornpzrrers Ckem. Etlgng. 12 ( 1 988): 67 1-688. 16. Kretsovalis, A., and R.S.H. Mah. "Observability and Redundancy Classificatioil in Generalized Process Networks. 11: Algorithms." Cornputera Chern. Enyng. 12 (1988): 689-703. 17. Meyer, M., B. Koehret, and M. Enjalbert. "Data Reconciliation on Multicomponent Network Process." Cornjxrters. Chern. Eng. 17 (1993): 807-817.

SteadymState Data Reconciliation for Bilinear Systems

18. Romagnoli, J., and G. Stephanopoulos. "On Rectification of Measurement Errors for Complex Chzmi(,-1 J'lants." Clten~.Etzg. Scie~zce35 (1980): 1067-1081.

19. Komsgnoli. J.. and G. Stephanopoulos. "A General Approach to Classify Operational Parameters and Rectify Measurement Errors far Compkx Chemical Processes." Coi77p. ;Zppl ro Ci:e!it. En,?. (1980): 53-174. 20. Sa~lchez.M.. A. Bandoni. and J . Ro~nagnoli."PLADAT: A Package for Process Variable Classificatign and Plant Data Keccr.cilia!ion." Corrlpz~t~r-~ CII(,III. E r i , ~ nh~i .6 (Sll{lfil.j 11992): S199-S506.

22. Pa~ajcwicr..?...On I the Probability Distribution and Reconciliation of I'roczs; Flznt Data." Cornplrr. CIlem. 1511s. 20 (1 956): 8 13-8 19. 23. .h!masv. G. A., and R.S.H. klah. "Es~irnatior: of Measurement Errur Variances from froces\ Data. lild. EIZX.C'l7er:i. 5'1-ocescDc7.s. L)ci.. 23 (i981'1: 779-784. '

21. Keller, J. Y.. M. Zasadrinski. and h4. Darortacl;. "Analytical Estimator of Measurement Error Vsriance\ in Dais Reconciliatioil." Cov~pirter-sChein. E I I ~ I I16 , ~(1392): . 185-1 38. 25. Chen, J., A. Randoni. and J. A. Ronlagnoli '-Robust Estimatioil of Measorernent Enor Variance/Covari:tncz from Process Sarnpling Data." Cotni~l!tel:~ Clzern. Ei~gilg.2 1 ( 1097): 593400.

BILINEAR SYSTEMS In a chemical plant, the process streams contain several species o r cornponerxs. Besicies stream flow rates, the compositions of some of the screams are also rncasurzd. Since composition analyzers are comparativei y more expensive, on-line analyzers rnzy no: b e used in many c:ises, and these :neasurements are obtaiced from a laboratory, which may also irlciease the errors in r!ie reporied data. Neither the o\-era11 flow balancr, nor &hecomponent balances are generaliy satisfied by the measurements. It is therefore necessary to reconcile both flow and coinpusi!ic?n ineasuien-ici:ts siniultaneously. l'he consirainls of the data reconci!iation problerri arc linear if ive consider o l ~ l yoverall Row balances. Iloivever. if we wish to simu!tancously reconcile flow and composition measurements. then component balarlces also have to be included as cocstraints of the data reconciliation problem. These constraints contain component flow rate terms which are products of the flow rate and composition variables. Since these constraints are nonlinear. it is possible to obtain the solution using a nonlinear data reconciliation technique. It is also possiblc to solve the muIticomponent data reconciliation problem more efficiently by exploiting the fact that the nonlinear : e m s in the constraints are at most products of two variables. The ierm bilitaear data reconciliation is used to refer to problelns containing this specific form of constraints. The reasons for developing special techniques for solving bilinear data rcconciiiation problems are two-

fold. First, these techniques will bc more efficient as cornpared to techniques used for solving nonlinear data reconciliation problems. This becomes especially important w h ~ nplant-\vide data reconciliation is performed. Second, a significant number of industrial applications of data recoriciliation is for multicomponent systems. An important exarnple is the mineral beneficiation circuit where mineral concentration measurements and flows are reconciled. Other typical examples are reconciliation of flows and compositions around a single distillation column or a sequence of columns such as a chill-down train of a petrochenlical complex. In several cases, reconciliation of flows and temperatures of energy flow subsystems are also bilinear problems if the specific enthalpy is only a function of temperature. A crude-preheat train of a refinery and a steam distribution network of a chemical process are illlportant examples. It should be kept in mind, however, that these special techniques cnly solve the problem efficiently. but do not give any additional benefits. Ir? this chapter, we describe two methods that have been specifically developed for reconciling data of bilinear systems. While these methods are more efficient than nonlinear programming techniques, they have the disacl\.antage that. ai present. neither of the nietl~odscan rigorously handle iriequelity constraints, s ~ ~ cash simple bounds on variables. Thus. !I! certain cases. it is possible that these n~eth(idsrnay gi\.c rise to negative estirnatcs of flocfs and co!npositlo!~s.


In order to illustrate a typical bilinear data rccor?ciliatia:l problern, we co~lsidera simple exar:lple of reconciling the ilows and con~positionsof a binary distillation column as shown in Figure 4-1. We will assume that i i ~ crivws and co!npont.rlr rnoie fractions of feed, distillate, and botton~ s t r e a r r ~are ~ iiieasurecl. A typical set of measurzd values is shown in the ~s last column of Table 4-1. The discrepancies in the mater-ial f l o ~ and 1101-rnalizationequations are shown in Table -1-2. It is observed from this table that the measured flo\rrsand compositions do not satisfy the mater-ia1 flows or norntalization equations.



Figure 4-1. Binary distillation column.

Table 4-1 Operating Data of Binary Distillation Column





Measured Values


Cornpor~tnt1 'A Cornpoilent 2 % 1-1 nw

Co111~'o:lcrltI % Co~nponcrit2 'Z Flow Cornl~oiicnt1 C/r;it2 $6

Table 4 - 2 Ccnstraint Balance Residuals before R~conciliation Balancc Type


The data reconciliation ot>jective is forrnulatcd s s in Equation 2-3 and is given by

where W's are the weights, and xJk's are the mole fractions of cornponents. The first three terms in rhe above objective function are the weighted sum of squared adjustments made to stream flows and the other terms involve the adjustments made to the mole fraction measurements. The reconciled estimates have to satisfy the material balances around the coiuntn. The different types of constraints that can be imposed are (i) Overall flow balance around the column (ii) Component flow balarices fcr all the corliponents (iii) Normalization equations for luole fractions of each str-earn All of the abol~econstr-aints need not he imposed since they are not 31; independznl. For a separator. such as the distillarion colurm co1:sidered ic this cxiunple: a coinpiece set of independent co:isrraints is the ccmponent flow balances and ?he noro?alizaticn equaticms. The overall flo\v balance can be derived win2 :hex two $pes of eyuations and, thus. it need not he i:i~pox"dOne comiilon n1ist;tke is to assume chat bjViinposin: :he overaii ilosv ba1:;nce ar?d compone1:r tlou balances, t l ~ zrcco:lciled mole fracti~n estirn;~:es wil; automatically satisfy the nor!nalizatio:i constraint for al; streams. This is not the case. howcver, :is shown later ti~rotighihis example. Thus. the constraints for this example are

The component balance constraints, Equations 4-2 and 4-3, contain products of the flow rate and compositions which make the data reconciliation problem more difficult to solve as compared to the linear case considered in the preceding chapter. The objective function 4-1 along with the constraints Equations 4-2 through 4-6 can be treated as a nonlinear equality constrained optimization problem and can he solved using a constrained nonlinear opti~nizationprogram. However, efficient methods to solve these types of problems have been developed. Using any of these methods, the reconciled data for this example may be obtained and given in Table 4-3. In Table 4-3, the second column shows the reconciled estimates when component flow balances and normalization equations are imposed while the third coiunin gives the estimates when the overall flow and component flow balances are used. Table G-3 Reconciled Data of Binary Distillation Column Recon:iled Values

Wth NormoiizuSon Varicibles


F:ow Cornponei~tI % Component 2 C/o Fiow Co~liponerrt1 % Component 2 C/o FIOUJ

Component 1 % Conmonent 2 % The constraint imbalances after reconciliation for these two cases are given in Table 4-4. The results in this table clearly demonstrate the necessity of including normalization constraints in rnulticomponent data reconciliation problems.

Table 4-4 Constraint Balance Residuals After Reconciliation Residual Values Normalization


Balance Type

Constraints Imposed

Constraints Not imposed

Overall Flow Component balances



1 2 Normalization Equations

8.5453E-13 < 1.OE-13


0.0000E+00 0.0000E+00 0.000OE+O0


Mixers A mixer has two or inore input strcarns and has one output stream as shown schematically in Figure 4-2a. If the streams are single phase then the constraints imposed for these units are (i) Component flow balances:

(ii) Nonnali~ationequations:

General Problem Formulation

The preceding exalcple shows that multicu~nponentdata rccouci!iation for a ciistil!ation colun~nis a bilinear problem. In a si~nilzrm a n ~ e r .the data arcu:id a sequence of separation columns czn be recunciled which also gi\les rise to n Siiinear problem. In thc mineral processing industry, a cornlnon appiicatio:~of hilinezr data reconciliation is the reccnciliatioi: of flow and mi~lera!conipcsitions of a beneficiation C~I-cuit. Wc first present the ger:eral farmulation for n~uiticon:poncr?tdata reconciliation of sucli typical procssses. Dependii~gon the process and the silbsystc~nthat is ccnside:-ed, severai different types of process units may he encoitntered. In chernicai process iis the flow or compositiciis of streams industries, tne different ~ ; ~ ? xvhere undergo changes may be classified as mixers, splitters, separators, and reactors. The type of ccnsrraints that can be imposed dc;-~r;.':; c: Tc!: :tr of the process unit. It is therefore important ~o have -1 ~ I p a l~nderstandi~lg r of the complete set of independent constraints that can be written for each unit and. hence, for the entire sub-process. Although, for each process unit difTerent cornbinations of independent constraints can be written, usually the independent equaticns are chosen as described below.

Sp lifters A splitier splits dn input stream into two or rnore oiltpui streams as shown schenlatically in Figrirr: 4-25. The consrrair?ts that can be written f ~ this r unit are ( i j C'ornponcnt flow balances (equality of composi:ions):

( i i ) Overail flow baiance:


(iii) NOI-rnalizationequation for input


(i) Coinponent tlow bala~ices:

j = 1 ...S; k = 1 ... C

I;JxJh- c I , F , x , ~= 0 (ii) Overall flow balance:


Figure 4-2b.Splitter unit.

Input stre:

o u t p u t sirearn (iv) Split fraction definitions:

Figure 4-20. Mixer unit

All other constraints such as coinponent balances and nomiz4iza:ion constrain!~for output su-eams can be d e r i v ~ dby appropr,at:: combinations of above equations. It should also be kept ir? milid that if a sp!ittel- is a part of a ~ be wilten o:11~;fcr the subsystem, then the ncma!ization c q u a t i o i ~sllouid inpu: stream of the sp!itter and liot for the o~itputstreams of the splitter.

i /

(iii) Normali~ationequation for input stream:


Exercise 4-1. Show thatfit- a splitrer s l i o ~ ~in ~ rFigure i 3-20, the rzumber oJ'iizdepende,lt equations is C.S'+2. Aiso S!IOWthlzt the con~ponentbalarlces arid rlor-rnalizatior~eyi~crtiotlsof all out/~ul .strc.nm.~. cot7 be derived ~ ( r i n c the above CS+2 equations iinposed for- a splitter:

An alternative formulation which makes use of the definition of split fractions is sometimes more convenient. Let a, be the ratio of the flow rate of outlet stream j to that of the inlet stream of the splitter. Then the followilig equations also constitute a cornplete nunredundant set of equations for the splitter.

The use of split fraction variable:; introduces as Inany additional variables as the number of output streams and thus the nurilber of independent equations that tiiust be written for a splitter usin? s!,iii fraction l~ariablcs is q u a ! tc) CS-tS+2. The use of spiit iractic!:: variab!es al-c, complicates the problem further, since the cornponzixr balar;ce> ai.2 170 ionger bilin~arbut are trilinear (pr~duclsof three varil;bles).


Exercise 4-2. D~!~iorzstrute ih(lt i : elir?~inntiug ~ the .sp!itfraciiorl 11at-iai2lesji-ot~ztiic aboi~cmltel;.~ciri~,c set ofCS+S+2 rq!rntior~.r,i! i r p~rii/,lcio obteiri t b CS+S ~ inciei,eiiderir cyiai!iisii qfiiic,fiisr Sot-~nulatinr~ >)r [ I sp:itra:


1npul s

Figure 4-2c. Separator unit





(i) Component balances: A separator, which is the inverse of a mixer, takes an input stream and separates it into two o r more streams of different compositions as shown in Figure 4-2c. If all streams are single phase, the equations for this unit are similar to those of a mixer.

(ii) Norrl~alirationequations:

( i ) Component flow balances:

(ii) Nornialization equations: The alternative set of model equations is obtained by u s ~ n gthe fact that each elementai species is conserved. If we denote the number of atoms of element j in component k by aJk then the following equations can be written for a reactor (i) Elemental balances:

We consider a reacioi with a single feed s:ream and a single product stream as shcwn in Figure 4-2d. Reac~ci-swit!l rnultiple feed or product streams can be modeled by using a mixer befoie the reactor 211d a separator after the reactor. Due to the re;ictioils that occur, neithe1- the ovcrail mt)lar flo\v nor the molsr flow:; of components are conser~~eci. 'These are :G.O alternitfive chcices of niodel equations fcr a reactor. In thc first approach, we assume that iridepe~identr z a c t i o ~ swhich occur in the reactor are specified. Let nLi be the stoichinlnetric coefficient of component k i n I-eaction j. and let the unknown extents of reaction be j = i...R. \iherc R is the nurnbt.1- of independent reactions specified. Using the eltznts of I-eactions we cart \\'rite the following eq~lations




Figure 4-Zd. Reactor u n ~ t

( ~ iNorii~ali~ation ) equations:

As r;hoxi,n hy Reklaitis [I 1, these sets oi' equations are equivalent and give identical results only if a complete set of iildependent reactions that can occur among the components present is specified. In the absence of any ii~formationregarding the reactions that occur. the elemental balance nod el can be used. However, if energy balances also have to be included as part of the rcconciliation, then thc extent of reaction niodcl is convenient as shown subsequently. The model equations of the various units can be classified as either process rtnit type or as stream-type relations. Overall flow and colnpo-

The objective is to determine estimates of ail flows and compositions such that the total weighted sutn square of adjustments made to flow and cornpositio~lmeasurements is minimized. The objective function is given by

The above for~nulationof the DR problem is in terms of flow rate and mole fraction variables. Alternatively, we can also formulate the problem in terms of overall flow and component flow variables, where the component flow NJhof component k in stream j is defined as

Using these variables, the component balances can be written as

The ncl-rnalization equations can a!so be written as

It can be observed from Equations 4-li and 4-i2 that :he constraints ars l i ~ e a rin the flow variai2les and this feature can be exploited in the so!ution pr-occdure. Although the constraints are now in ternis of flew izariahlcr;. the objective iunctiotl sti!l contains mole fraction variables since these are the measured q~antities.In order ic cvercorne this prcbleni, Crowe 131 proposed a rriodiiied objective of the D R prob!em w h ~ c h is to minirnizc the sum square of adjustmcilts made to flows and component flow variables. In :!:is c-5.- :!t :r.odified ohjecti:.~ fvnction is

Since thc component flo\vs arc not the measured quantities, in the above objective fuiictio~iit is necessary to clarify the notion of the measured value of the component flow variables and the weight factors to be used fol- these variables. \I component flow Ni, is taken to be a measured cju:iritiry if. botll the f:ou F! and the composition xjk are nicacur-ed.

In the previous chapter, it was shown that the weight factor of a measured variable can be chose11to be the inverse of the variance of its measurement error. An estimate of the variance of the error in the product N~~ is obtained by linearizing it in terms of the flow rate and composition measurements as

The variance oil, of the error in N,,can be obtained by applying the rule for linear sum of Independent norrnally distributed variables

The weight WNjkcan be taken to be equal to the inverse of the valiance o i . ~k , The choice of a modified objective function for DK and the weight factors for the "measured" component flows can lead to larzer adjustments being made to the measurements. However, the modified cbjzctivc still indirect!y attempts to miniinizc tile totai ad-justment made rn the rneasured val-iables. Tne modified objective function 4-13subject to the constraints Ecjuaiions 4-11 acd 4-12gives rise to a linear DK problem in the flow val-iables. For the special case considered here, all variables are measared and we can irnnediately obtain the es:im:rtes of all flows using the analy:ical solution of Equation 3-4. From these estiriiates, the rccoaci:ed values of the mole fractions can 5e obtained as

Example 4-3 Crowe's method is applied to reconcile the data of thc binay distillation column discussed in Example 4-1.The measured tlows and cornpositions are as given in Table 4-1. The true flows and compositions and h e reconciled T~aluesobtained using Crowe's method are given in Table 4-5.In order to obtain the reconciled values, the measurement error variances in flows are taken as 5% of the true values and for the compositions they are taken as 1% of the clue values. As compared to the reconciled values shown in Table 4-3 which are obtained using a nonlinear optimization technique, Crowe's method gi\.es more acculste flow estimates at the expcnse of greater inaccu-

racy in the composition estimates. This is due to the fact that Crowe's method adjusts the component flows rather than the compositions.

stream. In order to avoid this, Crowe's method classifies the strea~nflows and componeiit flows into the following three categories:

Table 4-5 Reconciled Data of Binary Distillation Column Using Crowe's Mathod Variables

True Values

Reconciled Values


Component 1 % Component 2 % Flow Colnponent 1 O/o Component 2 5% Flow Comporient I 9% Corr~ponent2 % ---

TI-eatrnerzto f LTrzmeaszcred Variables


(i) Category I consists of all ~neasuredstream flow variabies and "measured" component flow variables. Thus, this category consists of ~neasuredvariables only. (ii) Category I1 consists of all component flows corre.iponding io measul-ed compositions, but unmeasured strea~nrlow. It also contains all the unmeasured stream flow variables. Thus, this category consists of a mixture of nieasured compositions and unmeasured stream flow variables. (iii) Category 111 consists of all coniponent fiows corresponding to unlneasured compositions for wllich the streani flow may or may not be measured. Thus, this category consists of unmeasured variables only. 'Tlie flows and cortlpone~itflow variables in the different categoi-ies arc denoted by supel-scripts I, 11, artd 111. The objective f~inctionfor the DR proble~rlcan now be formulated as

Thz presence of unmeasured flows or- con~positionvariables introtluces subtle conil)licaticj:is in Crowe's ~nzthod.L k p e n d i n ~or: the measur-erlle11tsmade. t i e s t r . ~ ~ can n h be c!assifieci into two categorizs: (ij Strca~nsu;ith messu~.ecii1cn.s and seine 0:- a!l cor?~posi:~ocs unn:easured (ii) Strean-is with unnieasured flows w d some or all ;oillpos!tions unmeas~rreci A i~ieasured;/slue for the coniponefit fl(3x.v of 2 stream cannot be obtained if the corresponding composition variable is unmeasured or if tile stream !iotv is unnieasured or both. Since there is one co one correspondence between co~npositionvariables and their component flows, it is appropriate to corlsider the component flow as unmeasured if the corresponding composition var-iable is unmeasured I-egardlessof whether the stream flow is measur-cd or not. However, if a streani flow is unrneasur-ed, then treating all component flows o f this str-ea~nas unmeasur-ed will I-csult iri a loss of infor~uationof the measurcti cornpositions of this

The above objec:ive furiction has k e n expi-cssed coinpact1;r usins vectors F. KC, and xk, corresponding io overail floivs, culnpocent k flows. and compositions of all streams in each category, respectively. The weight m~tr-icesW are diagoilal matrices with Lhe diagonal entries bein: the weights for the appropriate variables of all :;trearils in cach category. The constraints of the 13R problem are the material balances for each unit ~vrittenas described earlier. These equations can be cast in terrus of the variables in the three catcgorii-s. For so!ving this proble!x, Crowe [3] proposed a two-stage decomposition strategy for e1iminati:ig unmeasured variables from the constraint equations. In the first stage, unmeasured component flows is1 Category II! are eliminated by using a projection

S f i ~ a ~ / ~ - S Dar~l i n f e h'ec.on~rlinrrori for Bilinear- S?~rern~

matrix. For this, the procedure used in linear DR can be followed because the co~lstraintsare linear in the component flows. In the second stage, the unmeasured flow variables in Category I1 are eliminated by using a second projection matrix. This requires some algebraic manipulation of the constraint equations which are described in [ 3 ] . The reduced DK problem still requires an iterative procedure to solve for reconciled compositions of Category I1 and component flows of Category I, starting with guesses of Category I1 flow variables. It can be verified that if estimates of Category I1 flows are given, then the reduced reconciliation problem becomes a linear DR problem which can be solved analytically. These reconciled estiinates are used to back-calculate the unitleasured flows of Category I1 using a similar procedure as described in Chapter 3, which are used as starting guesses for the next iteration until convergence. After the estimates of variables in Categories I and 11 are obtained. they can be used to back-c:ilculatc the estiinates of unmeasured component flows of Category 111 as described in Chapter 3. Since Crowe's method directly gives the estimates of component flows in Categories 1 and 111, the mole frac~io11 est1m:ites are obtained using Equation 4-1 6.

Figure 4-4. Mineral flotation process [5].Reproducedwifh permission o f the Canadian Sociev for Chemical Engineering.

Example 4-3 We ccjnsidei- tile c ~ i n e r a ltlot;~tion process analyzed by S~nit!i 2nd Icltiyen [Sj showi~in Figu1.7 3-4. The process ccnsists of three flotation cells (separators> and a mixer. and eight streatxs each consisting of t;./o miilcl.als, copp-r 2nd zinc. i ~ ?additicn io ganzue material. The floiv cf strean 1 is taken to bc unit nias.; (basis). v;hile the otlrer stream flo\vi arc. unmezsurt:d. The ~niceraica1lcel:tr-tliions of all slreams except 8 ar.- rne;isured. These vall;es are s h o ~ .in i ~the firs! row of Table 4-6. Based o n this informaticn the flow and component variables can be classified as Category I-F, Category I-F,,


,,. N ,

N,,.NZ2....F7 N71, j\.T7~



Although, the flow rate of stream 8 being an unmeasured variable should be classified as a Category I1 variable, it can also be classified as a Category I11 variable becuzrse all its con2po.sitinns ore ~nlviecr.rlrred The co~straintsimposed for this process are the flow balances and componen: flour balarlces arour~deach uriit. Kon11aliz2tionequations a;-c. not in~posedbecause we have eliminated thz composirion variables zcrresponding to the unmeasured gangut: In each strearn. This !eads to a reduced nuntber c>f variables and constraints in the data reconciliatio~i problem. We start with an initial guess for rhe flows in streams 2 to 7 as given in Table 4-6, and apply the iterative procedure tc, ob'iin the reconciizc! \ialu,-s. Row 2 of Table 4-6 shcws the reconciiea estimates of the flows and ]nine:a1 ccr,certtrations obtained using Crowe's metihcd. It is observed that the esiimate for zinc concentration in strearn 8 is negative. For comparison. rhe reconciled estimates obtained usiilg a nonlinear programming technique (NLP) (described in the next chapter) are aiso listed in the last row of Table 4-6. Again, the estimate for zinc concentration in stream 8 is infeasible. This points to the need for imposing bound constraints in the data reconciliation problem which we be discussed in the next chapter. The n~aximurn difference brtween the (feasible) mineral concentrations between the two

and concentrate (U) contain both solids and liquid, but the stream denored by W is a pure water stream. Although, these process units sre similar to separators, all streams (except pure water streams) are two-phase streams and conservation equations have to be written for the overall slurry flow as wzll as for the solid or liquid flows through sazh of the units. We refer to these process units as two-phase separators. Similarly, there are two-phase mixers where mixing of these streams occurs.

solutions is about 2-796. Since Crowe's method uses an objective function different fmrn the standard DR problem, its estimates will be less accurate than those obtained using the NLP approach. Table 4-6 Measured and Reconciled Data of Mineral Flotation Process Method





yc,% y~,% ' I; kc, 96 R,,,% F ,?,




LZz 7b









1 1.928 3.81

0.5 0.450 4.72

0.25 0.128 5.36

0.125 0.090 0.41

0.5 19.88 7.09

0.75 21.43 4.95

0.125 0.513 52.10

0.25 35.36 -

0.9229 1 1 Q451 0.4498 5.0356 4.8617

0.9147 0.8324 0.0771 0.0853 0.0823 0.0081 0.1285 0.0906 19.834 21.431 0.512 0.2976 j.0461 0.4099 7.1167 4.9235 51.930 -15.91

0.9253 1 1.9122 0.4509 4.2759 -1.058A

0.9164 0.8287 0.0747 0.0836 0.0877 0.0089 0.1301 0.0899 20.00 21.44 0.5098 35.554 52.1 16 -130.1 0.41 6.9694 4.95 5.3583

"lilitic~li~nltresoffloas for .srreni,i I rhmirgh 8 ar-e iistrd


this row

Sirnpscn's Technique

Figure 4-5a. Two-phcse separator

'P'ne ~ppiicationof data reconciliatior; in the :nirieral processing industries, especia!!y to n?ineral beneficiation circuits has beer; investizated

abcut 30 years ago. Several methods fcr ipccifca!ly solving DR probIc!ns aiising in thesc industries have been dcvelopcd. Arnong these, method deveioped by Sin;pson et al. [Gj is very efficient. Bzfi:re desori5in.g S i ~ n p s o ~ itechniquz, 's it is instructive :o examine mrne of ilte process units encount.ered in the mineral processing ir?dustrizs and see in what respec:s !hey difier frartl the co~espoildingur?its in chemical process industries. In rtlineral beneficiation processes 171.the 01-e is first crushed to obtain particle sizes in the range 16-20 cm. The crushed particles are further reduced in size to between 10-300 microils in grinders. Generally. grinding is clrrried out with addition of water andlor recycled slurry. The particles containing the minerais are separated fro111 the garyue particles in separation units that are either classifiers 01-flotation cells. The slurry containing the rnineral particles is refer-red to as the concentra:e and that contai~~ing the gangue as tailings. Wakr may also be added to the separation units in order to maintain a desired pulp density. Figure 4-53 shows a schzrnatic of such units, where the feed (F), tailings (Oj,

The type of equations written for these unirs differ from the sirnple separator in three respects. First, separate conservaticn relations hatre to be written for the ovcra!l s!urry fiow ar:d the solid tlow. Secondly, iaborarory meassrelneitts of tile pulp df-nsitv of the s l u ~ ~streams y may uisi, be availsble, which needs to Sc recoilciled. Thirdly, rhe solids also ccnrain g a q u e maierial which is not measured znc! only the mineral coilce!itrations of the solids arc measured. 'Thtts normalization e q c a t i o ~ sare not imposed since they arc irrelevant. !I! ordcr fc? take into account the above aspects in the rrlodei equations. the following variables are associated with each streamj.




( 1 ) Thc slurry flow rate (2) The flow race of solid (S,) (3) The mineral concentrations ( x , ~ expressed ) as a weight fraction c ~ f the slurry flow and (4) The pulp density (p,) which is the ratio of the solid to the total s l u q t?ow in a strcarn.

Sirud~-S~rz:c i 1 ~ 1 1 uKrr-o/r
Using the above variables and the notation in Figure 4-5a, the model equations for these units can be written as follows.

(i) Overall flow balance:

(ii) Solids flow balance:



Ball Mills

In crushers and grinders, the size distribution of the input and output streams are measured. Balance equations are written for the ore quantity within each size range. If a slurry stream is also recycled to the mill or water is added. then as in the case of two-phase mixers a sluriy balance and pulp density relations have to be written. Figure 4-5c shows a schelnatic of a general mill.

S,--s~-S[,=O (iii) Pulp density relations: pFF-SF=O poo-S,=O puU - S" = 0 (iv) Mineral component balances: Figure 4-5c. Eall mill c ~ n i t

Le? u.,;be the weigt:t fiactioil of soiids of size rarigt- i in streant j, and the ncmber of size ranges. Thc following constraints are imposed for a mill in general. a, be

These tinits are siniilar to mixers expect that two-phase streams zse i~!volved as shown ir, Figurz 4%.As il? the case of the two-phase ssparator, the stream \N is a pilrc ivaiel- str-earn. 'The equations arz siniilar t~ those for a two-phase separator.

( i ) C)vernll tlow balance:

(ii) Overa!l solids flow bslance:

(iii) Solici flow balance for each size range:

Figure 4-5b. Two-phase mixer unit

where ki is the increase (which could be negative or positive) in the weight of solids in size range i due to grinding. These are typic;~lly

Slradv-State [)art<
unknown quantities arid have to be estimated as part of the reconciliation problem. (iv) pulp density relation:

(v) Normalization equations (for weight fractions):


Since there are no measured vaiues associated with Unmeasured variables, an initial estimate of these variables can be used in the objective function. Moreover, the weights for all unmeasured variables are chosen to be zero so that the objective function is identical to the standard DR objective function which minimizes the weighted sum square of adjustments made to measured variables. In Si~npson'smethod, the nonlinear data reconciliation problem is approximated by a linear data reconciliation problem by suitable choice of working variables and linearisation. The pulp density relations are first used to substitute for the variables p, In ternls of the flow rates.

With this substitution, the pulp density relations need not be considered in the DR problem. As seen before, the colnponent balances are linear if we express them in terms of component flow rates. The variables xjkin the objective function can also be expressed in terms of component flow variables as

Using Equations 4- 19 ar?d 4-2S, the objective function Cali be written as

It car] be o5serbed :hat the n!odel equation!: for different units are bilinear. Tile method dzveluped by Siinpson et a!. i G ] exploiis this feature for effi-fiiien~ s:)lu:io~; of the DR psoblern. \Ve describe tnis method ti?r the simple case uhen th? ntineral beneficiatio~lcircuit co11sis:s o:llp ~f tw;opl~asejllixe!-s 2nd jepzrritors. We assume thai the n~~easuremen:~ inade are sl~lrrytlows (or liquid flows for pure iiquid streams). pulp densities. nlineral concentr-ations given in terms of mass fractivn of mineral in wet solids. The ol?jc.ctivefunrricln of the DR probleni in this case is given by

where i;; is the ovcrali flow I-ate of strearn j, .sI is the total nuniber of streams, and s2 is the number of slurry streams. It should be noted that all \xanahles \\,hetiler measured or no! is ii?cluded in the objectiire function.

The second and third terms in the objective function are 110longer quadratic in the flow vaiables. The objective function can he ~ynly;mutect by a quadratic f ~ n c t i o nby using a first order apnroximation of the flow ratios around sorne estimates Fy, Srand N$

where a, and bj, are respectively equal to and i/Fyand -N~~/(FT)'and

where cJ is equal to -S;/(F;)2. The quadratic approximation of the objective functiorl is therefore given by

Since the constraints are linear in the flqw and component flow variaSles. we now have an approximate linear DR problem corresponding to the objective 4-24 and tlow balance constraints for overall slun-y flow, solids flow and component t1ou.s. This linear DR prob!em can be solved using techniques described in il?e preceding chapter. This DR problern. however, can be coived :nor? efficiently by reducing it to s n unconstrained optimization problem bl- elirnin:~tingall thz constrain!^ tosether ~ i t ah suitab!e c!ioice ot depc~~dcn: variables. The dependent variables are so chosen that their relation to the independent variables is obtained ezsily. Graph theoretic concelits are eaplcited to achieve this. 1,eT us first consicier The oxrerzll f!ows of streams. Proin the preceding chapter, it may tte recs!led that the concep: of a spanning tree of a prccess ?:-a;~his useful in dc:ern~inin_rrhc observabiiiiy or es:in~~lbi!ityof ~inrneas u r d flow variatoles. A funda!nzntal cutsei cf a stream which is a branch of the spanning tree provides a f l a y b:tlance equation which relates the flow of that st]-earnwith the flo\;ls of sti-earns which are chords of the fundarnen~al cutxet. 'These ideas tail be useci to convznientiji choose the dependent variables. We construct a spannir:: tiee of the prccess g:.:ph ?::I ~ h r - thr. ~t flows of the branch streams of the sparining tree as (+r_.ymlrrlt variables and the chord stream flows as independent variables. It can be immediately cutsets of the spanning tree can be used to deduced that the f~~ndaniental relate the dependent and independent flow variables. These relationships can be expressed as





where Fbiis the flow of branch strearn i, F,, is the flow of chord stream j, p , is a coefficient which is 0 if chord j is not part of the fundamental cutset of branch stream i, + I or -1 if chord J is in the fundamental cutset of branch i, and the flow directions of chord J and branch i, are opposite or same with respect to each other, respectively. Thus, tlie dependent branch flow variables can be eliminated from the objective function. Equation 4-24 by using Equation 4-25. If we consider the solid flows. the above ideas can again be applied since the solids flows are related in exactly the same manner as the overall flows. This is true also of component flows of streams for each c o n ponent. Thus, the solids flows and component flows of branch streams can be chosen as dependent variables and related to the corresponding chord flows similar to Equation 4-25. The solution of the r-sl~lting unconstrained optimization problem can be obtained by setting the derivatives of thc objective function with respect to the chord flows to zero and solving the resulting lineal- equations. Complete details of the linear equations to be solved at each iteration are given in Simpson et al. 161. It should be noted that in this technique initial estimates of chord flo~vs (total. slurry and component flows) only have to be guessed since the branch flow estimates can always be calculated ~lsingEqtiation 4-25.

-We ill~l.strateSiri;pson's n l e t h ~ d[or the ~ n i ~ e rflacation al process cons i d s e d 111 Example 4-4. The process _graph of Figure 4-4 is consrs-acted by including tlie environmerit nodc and is shown in Figure 4-6. A spanning tree of this process g~-apllis constructed a;id i\ st:ov:n in Fi~u1-c4-7.

Figure 4-6. Graph of mineral flotation process.

S;m~i~-Srare 11arr1Re~~orfcilirrtiorl for iir!rrlcnr Svsrcrri\

Table 4-7 Reconciled Data of Mineral Flotation Process Using Simpson's Method Method


Initial F* Estimates Nc,

N,, Simpson


1.25 1.15 1.: 0.9 3.399 1.413 0.183 0.0810 11.532 10.823 10.789 0.369

0.1 1.986 0.709

0.15 3.216 0.7425

0.2 0.102 10.42

0.05 1.23 0.0335


1 0.9246 0.9160 0.8424 0.0754 0.0840 0.0736 0.0087 1.9189 0.4505 0.1258 0.0918 19.932 21.462 0.5:s 34.773 i,,% 4.5575 4.3564 4.5278 0.4097 7.0231 4.8803 51.657 -13.77 &"%

Figure 4-7. Spanning tree of mineral flotation process.

Generalization of Bilinear Data Reconciliation Techniques

The choice of this spanning tree implies that the flows of branch streams 1, 2. 3, and 8 are thc dependent variables. The fundamental cutsets with respect to this spanning can be easily obtained and are given by the sets [!. 4, 6, 71, [ 2 , 3 . 5, 6, 71, [3, 3, 71. a:ld 18. 5, 61. Based on the strear11f l : directions, ~ the matrix of coefficients, pij. car; now be const~vcreiland is given by

where the romJs of P con-espond to the branch streams, 1. 2 , 3. and 8 while tlle coluinns of P ccjrresponci to he chord streams 4, 5, 6, and 7 arranged in order. We start with initial e:;tin~atesof chord flow variables and coinpute the branch flows using Equation 4-25, which are shown in 7Bble 4-7. For these estimates, the coefficients aj and bjk are computed. The solutio~iof the reduced unconstrained DR problem gives new estimates for the illor-d flows which arc used for the next iteration. The iterations are carried out until convergence. The final reconciled estirnates obtained are shown in Table 4-7.

In describing the above methods, we have reconci!ed all mcasured values. It is more common in industrial applications to hold some of tile measured variables constant during the process of reconciliatioi~.A sinip!e way to accomplish this is to assign a very high weight (01 very small standard deviation) in the objective function to rzieastlrements that havc to be kept constant. This will force the adjitstrnenc made TO these nle'isurernents to be negligibiy sn?a!l. Both Ciuwe's method and Simpson's me~hocihave been describccl forprocesses involving primarily mixers and separat~rs.if splitters, reactors. or grinding nlills are present in the process, then these metk~odshave to be suitably modified hecause ihe type of equations iinposed fc,r these units do not conforrii to those ~f other units sluch as miser\ and :;tparators. Crowe [3] has atitlined the modifications necessaq. to take spliiterc into aceourit. FCJ:this purpose, the splitter eciuaticns ::re for-r;;ulateti using split-fraction variables. As pointed out earlier, the use of split-fraction variables leads to a tril i n e a r ctrnrtllre fnr th: rc.!rynrr.? balances. In o~-derto use Crowe's method for the bilinear problem, the split-fraction variable\ are esti~~iated in ar? outer iterative loop. For each guess of the split-fraction variables. a bilinear problem results which can be solved using Crowe's method. In general. a constrained optimization technique is required to obtain updated estinlates of the splil-fraction variables at each iteration which robs Crowe's method of much of its efficiency. If only one splitter is pi-esent

Heat Exchanger

in the process, however, then a univariate optimization merhod such as the golden section search [8] can be used for this purpose.

By definition, we assume for a heat exchanger the data for both the cold and hot side fluids need to be reconciled o r estimated. We also assume that the streams are single-phase fluids.

Treatment of Enthalpy Flows

Although both Crowe's method and Simpson's method were developed to solve niulticomponent data reconciliation problems, it is possible to extend these techniques to take into account enthalpy balances and to reconcile temperature variables. In general, the enthalpy of a stream is a nonlinear function of the stream temperature and composition. However, if the enthalpy of a strearn can be assumed to be a linear function of t a n perature and independent of composition, then si~nultaneousmaterial and energy balance reconciliation also give rise to a bilinear problem. Even if the enthalpy of a stream is a nonlinear function of tell,,-isture but is independent of con~position,the methods discussed in this chapter can be used with minor modifications. An important subsystem that satisfies this assun~ptionis that of a crude preheat train of a refinery, where the enthalpy of a petroleum stream is related to the ternper.ature and physical properties such as API gravity and normal Soiling point of the strearn. For the purposes ofthis chapter. we wiil make this assun~ptionand describe 1112 n~odificationsnecessai->t o apply Ci-owe's method or Si~npson'smethod for- simultaileous material a ~ energy ~ d balance reconciliation. As befure, u e first describe the enzrgy bslal~ccsfor different types of process unjts.



Hot Outlet

Hot Inlet

Figure 4-8. Heat exchanger.

-I .he equations for this unit

S ~ O W I :i ! ~Figurc

(i) F l o balafices ~ ~ ~ for hot a11d cold side fluids:

(ii) Enthaipy balance: uiiere H(T) is the spec~ficenthalpy of :he stream whlch is a\sumed to be only a functio~iof temperature

Splitter E~ztlzalpyRalaizce

or in terms of specific enthalplcs of streams

4-8 are

(iii) Component flow balances:


~ > ~f
(iv) Normalization equations for outlet streams:

SUMMARY A complete set of independent constraints has to be imposed for each process unit in formulating data reconciliation problems. Different independent sets o f equations can be imposed for a process unit, but some are more convenient than others. It is important to include normalization constraints on compositions to ensure that the reconciled estimates satisfy them. The constraints of bilinear data reconciliation problems contain products of two variables (flow and compositio11 or flow and ternperature). Special methods have been developed for solving bilinear data reconciliation problems. These are efficient but cannot handle all types of process units and they also cannot take into account feasibility constraints such as bounds on variables. Nonlinear data reconciliation techniques can be used to solve bilinear proble~ns.These are less erficient, but do not have any other limitations.

Heaters or Coolers

A heater or a cooler is a heat exchanger for which data of only the process stream are reconciled, while data of t h e utility stream are assumed to be unavailable or unimportant. The constrai~ltsfor this unit are a subset of the constraints of a heat exchanger. The material, energy, and normalization equations for the process stream only are written. Crowe's method can be easily extended to include enthalpy balances and temperature variables in the reconciliation problem as suggested by Sanchez and Romagnoli 191. The specific enthalpy variables can be treated in a similar m m e r as composition variables. The enthaipy flow of different strcalns can be classified into tine three categories in a similar nanner 2s component flow variables. The objec:ive functior? will cow contain terms for the adjustme!lts made to enthalpy flows of Category 1 streams m.d specific enthalpies of Category 11 streams. The two-step Crowe's projection technique described earlier c w be applied to d s o ~ b t a i nthe reconciled values of specific enthalpies cf ail streams. If the specific enthalpy is a i~onliilear function of temperature, then the tempsrature estimate of each streani can be rect?v~,redfrom the specific enthalpy. I n genera:, this may require ihe solution of a one-dimecsiona! iioniiriear equatior~for each str-earn. Simpson's method has aiso been extended to include splitters, as well 3s to treat enthalpy balances along wifh flow sfid compor?ent balances [lo]. It should be cautioned, however. thar generalizing these methods to include other types of process units such as a flash drum (which can also be described by bilinear equations for ideal therrnodynarnics) may not be a trivial exercise. A significant disad\rantage of these methods is that at present they cannot take into account simple bounds on process variables. This can seriously limit the use of these methods in industrial applications where it is required to obtain feasible estimates of process variables.







2. h?eycr, Snjalhcrt. and B. Koehrr.!. "Data Kc~or:cil~aiionb3n bislticollipoaen! Nctworks [isin: 0h:;enrabiliry and Ktldurtdancy Classification," i n C<~tlil:xte~Applicgtiorls it1 (..l~(,r?~i(,tll!g (i,dited by H . Th. BUSSCn1;ikzr and F I>.Icdernaj. A~nstcrdam:E!sei,ies: 1990.

3. Crcwe, C . M. '"'I Matrix ?roject;on 11: Thr Nonlinear Case." AIChE Jour-t~al311( I 989 ): h 1 O--h?? 3. Rao. R. R.. and S. Nasasin~han."Colnpasison of Tcchrliques F r Data Keconciliation of Multicomponent Processes." hid. & Fig.Clitrri. Rex 35 (1 996): i 36'1368.

5 . Smith. H. W.. and hJ.tchiycn. "Corl~puter,-\
6. Simpson, D. E., V. R. Voller, and M. G. Everett. "An Efiicient Algorithm for Mineral Processing Data Adjustment." Itit. J. Miner. Proc. 31 (1991): 73-96. 7. Wills, B. A. Mineral Processitzg Technology, 4th ed. Oxford: Pergamon,

1989. 8. Edgar, T. F., and D. M. Himmelblau. Oljtimizatiotz New York: McGraw-Hill, 1988.


Cherniccrl PI-ocesses.

9. Sanchez, M., and J. Romagnoli. "Use of Orthogonal Transformations in Data Classification-Reconciliation." Cntrz~~uters Chem. Etzgtig. 20 ( I 996): 483-493.

Nonlinear Steady-State Data Reconciliation

10. Siraj, S. M. Atz EfJicierzt L)e~otnpo~itiut~ Srraregy for- Genet-rzlData Reconciliatiori. M.Tech thesis, IIT Kanpur, India, 1995.

The steady-state co~lservationconstraints that are used to describe most chelnical processes are nonlinear in nature. If we are interested in only overall flow balance reconciliation of such processes, then linear data reconci!iation technique5 described in Chspter 3 are sufficient. hlorec?ver, iintler some restrictions some of these processes can be solved usir~gbilinear data reconciiiatiori techniques as describcc! in Chapter 4. If we \visil, hcwever. to take jr.~oconsideration thermodynamic equilibrium relaticrlships and con!p:ex coire!ation: for thermodynamic and physical properties. then nonlinezr data reco:lcilia:ion techniques 111ust be used. Moreovei-. in Chapters 3 and 4, we have considered only e q l d a l i ~COt7.';irc7ircrr corre~poi~dirig to material and enc!gy conscrirai.ion and have no: imposed even si~np!e bollnds on the variables. The reconciied estimates c;f \~ariablescan therefore becon-ie infeasible. Fnr example, negative reconciled estimates for flows or cornpositions can be obtained. If we impose bounds on the estimates of vaiiables or other feasibility constraints, then these give ~ i s cto ineqiiality cr?.strsi?.rs ir t.': .I::: rec~nci!i~:icn;::;S!=;;; which can be solved only l ~ s i r qa nonlinear d a t ~reconciliation soluticn technique.


Ker o~;criror!o!larrd (;I-or Error- Oerecrioll


We will use a simplified version of the example of a single stage flash unit drawn from MacDonald and Howat [I] to illustrate the formulation and solution of nonlinear data reconciliation problems. Figure 5-1 shows an isothermal flash unit with a feed stream containing propane, n-butane and n-pentane. The steady-state constraint equations for this unit are as given below: Component Balances :

Fzi - Lxi - Vyi = 0 i = 1, 2, 3

Figure 5-1. Equilibrium flash unit

(5 - 1 ) General Problem Formulation


xi - ! = 0


As in the linear case, it is assumed tl-tat the random measurenient errors follow a normal distribution with zem mean and a variance-covariance matrix C. The general nonlinear data reconciliation problem can be fon~iulatedas a least-squares minimization problem as follows :


M X, 1 i1i l ( ~


subject to

Equilibrium Relations:

yi -- P,""'('17)xi/ P

i = I , 2. 3

(5- 5 )

f (x, u) = 0

(5 -S)

-. POI- si~npliciry,Racuk's law has bzen used to describe !he equilibrium relations. Tliz sa:uration prcssurc is obtained using the Antoine equaticn which is given by: Antoine Equation:

in P?' = Ai + Bi / (T + c,)

i = 1,2.3 ( 5 - 6 )

The nonlinear data reconciliatian problem is to reconcile the measurements of the flou~rate, temperature, pressure, and composi:ions of the feed. liquid and vapor product streams so as t o satisfy the constraints 5- i through 5-6.

wjirre f : m x I -gector of eqgality constraints; g : q x 1 vector of inequality constraints: C : n x n variance-covariance matrix; u :p x 1 vector of unmeasured variables; x : n x 1 vector of measured ;.ariables: y : n x 1 vector of mcasured values of' meaxureiiiencs of variables x.

The equality constraints defined by Equation 5-8 typicaliy include all material and energy conservation relations, thermodynamic equilibrium constraints, constitutive equations for material behavior similar to Equations 5-1 through 5-6 of the equilibriun~flash example. The inequality constraints given by Equation 5-9 may be as elementary as upper and lower bounds on variables or complex feasibility constraints related to equipment operation. In the above formulation, we have tacitly assumed that the variables x are directly measured. However, this does not impose any limitation. If measurements are functions (linear or nonlinear) of the variables (for example pH is a function of H+ concentration), then we can always define a new state variable for pH which is directly measured and the relationship between pl-1 and H+ concentration can be included as part of the equality constraint set. We will first consider the solution techniques for nonlinear data reconciliation problems in which only equality constraints are present in the process model. Two solution techniques and their variants are discussed in the following section.

The solution of the data reconciliation problem can be obtained by setting the partial derivatives of Equation 5-10 with respect to the variables x, u, and 1 to zero-thc necessary conditions for an optimal solution point of Problem defined by Equations 5-7 and 5-8-and solving the resulting equations. The following equations are obtained:


SOLUTION TECHNIQUES FOR EQUALITY CONSTRAINED PROBLEMS ?'lie minimization oi' Equation 5-7 svbject to the equality constr:;ints CIS Equat~cn5-8 can be achieved by usin? a general purpose fionlinear optimization tcchniquc. Ho\vever, since thz objec!ive function is quadratic in nature, efficient ~echniqueshave been developed to so!ve the problem. i he estimates :)btai:~edby colving this optimization pi.oblein can h:: shown to be ~ncl.ri;~~tri?r likrlihooci' estir7zatr.i. (MLE). I t should he noteti. however, that these estimates ;nay he biased, whereas, i n 'he linear case !he estirnares are unbiased.


Methods Using Lagrange Multipliers

The equality constrained nonlinear data recoficiliation problern can be solved by using the classical method of Lagrange ~nultipiicrs[2]. The Lagrangian for ?lie problem is ziven by

are ihe Jacc~bianmatrices scctaining ihe p ~ r t i a lderivatives a f t h e no:ilinear functions f with respect io Y and u, respectively. Since the constrr?ints are nonlinear, solving for x, u and ?L involve; an iterative nunreiical procedure. Thc systern of normal eqtiaiions 5-i 1 through 5-13 can bz sclved by any sirnultaceous equation sol\>.-r [3j. Stephenson and Shewchuck [4] used a Newton-Raphson iterative method which is bawd on a quasi-Newton liriearzation of the nonlinear model. Their :~lgo. rithm takes advantage of the sparsity of the Jacobian matrix and the invru-iance of the partial derivatives of the linear tcnns in the model equations which m,akes thc cornputations more efficient for large syslem:;. Ser-th et al. [Sj reporred a sirni!ar approach but with a different nonlinear equation scl\~er. Madron 161 suggested an iterative approach for solving the nor-ma1 equations 5-11 through 5-13 based on successive linearization. Let 2~ and iik represent the estimates of the var-iables obtai~iedat the start of iteration k. k linear approximation can be obtained for the nonlinear ~ O J I straint from the Taylor's expansion of the function f(x,u) in Equation 5-8 and retaining only the constant tenn and the first-order derivative tern).

where the Jacobian matrices J $ and J,k are as defined by Equations 5-14 and 5-15, with the superscript k indicating that they are evaluated at the estimates s k , U k . The Jacobian matrices that appear in Equations 5-11 and 5-12 are also replaced by their estimated values at iteration k. The resulting set of equations are now linear. In Madron's procedure, these linear equations are de-coupled by eliminating vector x from Equations 5-12 and 5-13 using Equation 5- 1 1. From Equation 5- 11,

Using Equations 5-16 and 5-1 7 in Equations 5-12 and 5-13, and rearranging we obtain the fol!owing linear equations involving u and h

Eqca~ion5- IS can be solved for c:hiaining die new estimates for u and

h The estimates for iL are used in Equation 5-17 to obtziil the mu; estimates for x. This procedure is repeated using these new estimates as initial guesses f a the next iteration. 4 disadvar~tagewith all these n~ethcds is that the inclusion of Lagrange ~nlu!tipiiers h in the solutior. increases the size of the problem which require more computationai time. To reduce the size of the problem, Madron 161 proposed a Gauss-Iordan elimication process of the original linearilinearized constraint matrices (J, 1 J , for the nonlinear case). The structure of the I-esulting matrix provides useful information for variable classification.

Method of Successive L i n e a r D a t a Reconciliation

A simpler way to handie the nonlinear data reconciliation is to successively solve a series of linear data reconciliatio~lproblems by linearization of the nonlinear constraints. A linear approximation :o the nonlinear constraints are obtained as in Equation 5-16. We thus obtain a linear data

reconciliation problem for minimizing 5-7 subject to linear equality constraints of Equation 5-16 which can be solved using the technique described in Chapter 3. Britt and Luecke [S] proposed an alternative solution procedure for the linearized problem. Their soIution for the estimates which have to be used at the next iteration are given by


Equations 5-19 t h r o ~ g h5-26 were deriveci by Britt arid Luecke 121 for parameter esti~naiionin n:~nlir?earregr-ession. It was adapred for noniicear data reconciiiation by Knepper and Gorman [7j and also used later by MacDonald and Howat [ i j. The a1gnri:hm requires initial estimates for P values y can all variables coatained in the vectors x and u. T ~ measured be used to ini!ializs the variables x. Bri:: and S u e c k s i s c designed a simplified al~orithn-tthat can be ~ s e c ito initialize the u ~ ~ n e a s u parar~d meters n. At each iieration, the functicn f(x,uj and the Jacobian matrices J, and J, are re-evaluated with the new estimates. The iterations are continued until II u ~ -+ul,~I/ and II x k +-~ xk li satisfy :I srnail tolerancr, crirerion. If convergence is achieved, the solution might not be a global minimum solution. This difficulty is common to most nonlinear least-squares estimation problems. A variant of the above algorithm suggested by b e p p e r and Gorman [ b ] in order to reduce the conlputational time, is to hold the Jacobian matrices constant at the initial estimates and I-e-computethein only after I I ] ? P I - O L I C ~?'his ). approach, the constraints are satisfied (can.stant dit-~cfio17 however, is characterized by slow couvcrgence. Another variant suggested by MacDonald and Howat [ l ] is a de-coupled procedure, in which the estimates for u are held constant and Equa-

tion 5-20 is repeatedly used until the estimates for x converge. Equation 5-19 is now used to obtain new estimates for u, and tile procedure is repeated until all the estimates converge. MacDonald and Howat [ I ] demonstrated through application to a non-equilibrium flash unit that the coupled algorithm provides marginally more accurate estimates at the expense of a greater computational time. The de-coupled procedure can be a useful computational scheme when the nonlinear equations are implicit in the parameters. The method of Britt and Leucke [2J and their variants described above have some limitations. Equations 5-19 and 5-20 involve the inverse of two matrix products, R and (J",)TK-'(J!). In order for the inverse of the matrix products to exist, the following conditions should be satisfied: (i) The matrix (ii) The matrix

Step 1. Start with the measured values as initial estimates for the variables x and initial estimates for u which are provided by the user. Step 2. Evaluate the Jacobian matrices of the nonlinear constraints with respect to variables x and u at the cur~entestirnates.

* .

The projection matrix can be obtained using QR factori~atiorlof tlle matrix J: as described in Chapter 3.

Ji should have full row rank

Step 4. Compute new estimates for x using

J k should be of fill1 column rank.

The second of the above conditions imply that all the u ~ i ~ n e a s ~ ~ r c d variables should be obser\lable. This is identical to thc condition seen in the case of linear systems in Chapter 3. where it i ~ a sl~own s that in order variables to be observa!)le, the colurrins of the confor all ~~nineasured straint nia?rix corr-esponding to thesc variables must be iiide~:-ndent. Even if this condition is met. the first condition rriay n o t be s:itisfieil by some processes deperrding on which of the variables are ~neasur-d (see Exercise 5-1). Thus. the above methods canr?ot he applied i!i general h r all p-~ocesses.



Step 5. Compute the ne\xr estimates fol- 11 through Equation 3-34~ttiliziilg the QX factorization of mat1-ix Jb. Step 6. Stop if the no{. estimates are not significantly d i f k r e ~ ? from t those obtained in the preceding iteration. 0therwi:;e using these ne\v estimates repeat the procetiure s!arting with Step 2. Pai and Fisher [9j were the first to use a procedurs siiliiiar to the algor i ~ h mdescribed above. The additional mcdificatioiis th::t thei!- nlgorithrr? incorporates are:

Exzrcise 5-1. C'onsidei- rl:e jlora/ior: p r x e s s dr.rc.rihed iri E.~ci!;lj,lc 4-4. Grrzerare the yrulir~z~tr-i~v ((/fflzeJacoJ~iarzcorre.s~~ot~diri~y fo rlzensursd variables (this cat1 be dotie arzctl~yricuilyj.Shox. rlzur tile t-ow correspo17dir1gro the flow balat7ce,fo1-r7cd~4 ill rhis . ) I I ( ; I I I L ~ ~ ~ ~ con.rl.s~.> .x o~iixc?j;rr-os arid hence 111vi.etlicir illis tl2crtri.x does r z c x /iiiijef~t!:?-OIL rar7k.

An approach that can be used in general is based on the use of Crowe'h projection matrix technique [8] to solve the linear data i.econciliation problem in each iteration. The basic steps involved in this approach are as follo\vs:

Step 3. Conipute the projection matrix Ph such that it satisfies

(a) A Broyden's update PI-ocedure[I01 for updating the Jacobian :~:atrices rather than recalzulating the111ai czch i;c,.,;laii. iii oi-dei-t o reduce the computational effor? involved. (b)A line search procedure after Step 5 based on a penalty function n~ettiodto compute the estimates to be used for starting the next itera) used. tion. The penalty function Ilf(x,u)ll + ~ ( y - x ) " ~ - l ( ~ - xwas where a is an arbitrary number 95 a 21.


The modification described in (a) can improve computational efficiency of the algorithm and has been demonstrated for smail problems. Hen-

Nonl'tzear Stead3,-SraieData Reconriliarion

ever, the modification in (b) is of questionable utility. This is due to the fact that the objective function of data reconciliation is quadratic and the solution for the estimates obtained in Steps 4 and 5 are optimal, although they d o not satisfy the nonlinear constraints. A line search procedure for modifying these estimates can improve feasibility with respect to the nonlinear constraints but at the cost of sacrificing optimality. This may not lead to an overall reduction in the computational effort. The right choice for the parameter a which plays the role of a subjective weight for the least-squares objective function is also diffjcult to come up with. Pai and Fisher used = 0.1. Swartz [11] first recommended the use of QR factorization for separating the estimation of the measured and unrneasured variables at each iteration. If the problem is highly nonlinear- and the size of :he prob!~mis large, this can become computationally inefficient. Ramamurthi and Bequette [I21 reported an increase in the computational time with the j t-i~easurernents.Becacse of the noise !eve1 (magnitude of gross e ~ s o r s in iuccessi\re linearization process. more iterations are usually required to converge a problem with large errors in data. This p t ~ c e d u r eis very popular. however, for data reconciliation problerns involving overall mass and eaergy balance equations. In the precedins ctaptei. the I-econcilea values far the binaiy distilla:ion colutnn I-epor~ed in Table 4-0 and the reco!lciled values reported it: the !as[ row of Tab!€ 4-6 were obtained usirrg :he sl~ccessivelinear- data ~econiiliationalgorit'hm described above. !t can be cbseriied from the results of Table 4-6. that the reconciled ertimatc of zinc concentra:ion !n stl-zam 8 has a high riegative value, which is absurd. Thas, :his meihod cannot he guaranteed to give feasilile extin1a:es in all cases.


or bounds on the variables. These inequality constraints on measured and unmeasured variables take the form

xminI x < ,x

(5 - 24)

In extreme cases, other types of feasibility constraints have to be imposed. For example, when data reconciliation is applied to heat exchanger networks for reconciling flows and temperatures, it is possible that the estimates violate thermodynamic feasibility, such as the temperature estimate of the hot stream being lower than the corresponding cold xtream temperature estimate. In order to combat this, an inequality constraint should be imposed that forces the temperature of the hot stream to be greater than the corresponding cold stream temperature at both ends of every exchanger. These type of feasibility constraints can be cast in the form of Equation 5-9. The solution to the nonlinear data reconciliation problem when constraints such as Equation 5-2 or 5-24 though 5-25 are imposed, can be obtained by u:;ing genera! purpose nonlitlex programming techniques. A detailed descriptien of such techniques is beyond the scopc. of this test and the reader is referred to excellent books on this sub-iect such as Gill et al. [13] afid Edgar and Himtr?elblau 1141. We discuss below the broad fear techniques especiaIiy with tures of two popular i ~ o n l i ~ e apscg~amnling reference to tlieir application for solvifig ncnlinear datz reconciliation problems. Sequential Quadratic Ercgramming (SQP)


NONLINEAR P R O G R A M M I N G (NLP METHODS FOR INEQUALITY C O N S T M I ED RECONClLlATlQN PROBLEMS Ti?. .~?ajorlimitation of the methods described in the preceding sectioil is their inability to handle inequality constraints. In many situations,

especially when significant gross errors exist, standard data reconciliation can cause the effect of the gross errors to spread over all estimates. If sufficient redundancy docs not exist, the estimates for variables which have srnall values contain significant errors. In some cases, infeasible estimates s t ~ c has negative values for flow rates or compositions are obtained. In order to tackle this problem, it is necessary to impose limits

The seqlienlial quadratic progrmn~aiizgtc:chnique (15, 16. 171 solves a nonlinear optimization problem by successively soiving a series of quadratic programming problems. At each iteration. a quadratic nrn<.r.?m approximation of the general optimization problem is obtained by a quadratic function approximation of the objective function, and a linear approximation of the constraints both using Taylor's series expansion around the current estimates. In the case of the data reconciliation problem defined by Equations 5-7 through 5-9, the objective function is already quadratic and only the constraints have to be linearized. The resulting quadratic program at iteration k + l is formulated as


I ) ~ ~~;e/ c~. o i ~ ~ - i I ~arrd ~ ~ GI-oss r t o n EI ror- I ~ i r i ~ i l r / , r i

where aLis a step-length parameter between 0 and 1. The step length is obtained by minimizing a penalty function (sirn~larto the Lagrangian). The procedure is repeated using the new estimates until convergence. subject to

where z is the vector of original variables (x, u); s = z - is the direction of search for iteration k + l ; VO, Vfi. Vgj are, respectively, the gradients (derivatives with respect to variables z) of the objective function, equality constraint i, and inequality constraint j, ali evaluated at the current estimate L k . B is the Hessian (matrix of second order derivatives of the objective function with respect to the variab!es z j evaluated at the current estimate ik. In thz quadratic program for~nulation,all variables ar-e included in the cibjeciive function. C o ~ n p a r i n swith the data reconciliation objective flifiction Equation 5-7, it can be dcduced that

Note that the Hessian matrix is constarit anu is also singular if there are unmeasured variables in the process. The solution of the quadratic program gives the search direction for obtaining the estimates. A one-dimensional search is performed in direction skat each iteration k , so that the new value for z at the next iteration is

There are several issues of particular interest in solving data reconciliation problems using SQP. Generally. in SQP the exact Hessian matrix (or its inverse) is not computed at each iteration because of the computational burden it entails. Instead, an approximate inverse of the Hessian matrix (or its square root) is obtained by a symmetric Broyden's update technique. In the case of data reconciliation, Equation 5-29 shows that the Hessian matrix is constant and, therefore, there is no need of updating it. Secondly, Equation 5-29 also shows that the Hessian matrix is positi\re semi-definite, if unmeasured variables are present. Therefore, the QP solver that is used should be capable of handling positive semi-definite Hessian matrices. Thirdly, the solution obtained using the QP is used as a search direction and the optimum step length in this direction is obtained by minimizing a penalty function. If the objective function contains nonlinear terms of higher order than quadratic, then this line minimization gives estimates which are more optinla1 and less infeasible. In the case of data reccnciliation, however. since the objective function is q:lr?dl-atic. a step l e ~ g t hof unity sives the optimal estimates that satisfy the linearircd approximation of the cnnstraintc. In this casz. line minimization will i~lzproivt l i ~ . f e ~ . ~ i b1t:iiil i l i f ~rcsperr to ~ 1 ~:ollii~~t'(i:1 ~ c o n ~ t r a i ~ zbj. t . ~s~;cl-ifictl~g opfiluciit\.. Ti is dehntable whether this can lead tc sn overall reductictn in the nurnber of iterations required foi- cosvergence. Thus, sven though a general purpose SQt' technique c a n be uscd, hy exploitin? the special features as discussl-ct i?bove i! is possible to develop a more efi'icicnt tailor-made SQP rechnique for solving nonlinear data reconciliation problems. Methods for solving qusdratic programs are available in HARWELI, library, SQPHP [ I 7j or QPSOL [ I 81. An efficient SQP solvei. denoted as RND-SQP. hac been recently d-\'eloped by Vasantharajan and Biegler 1191. In this technique. a reduced Q P is solved ar each iteration by partitioning the variables into a dependent and independent set, the number of dependent variables being equal to the number of equality constraints. Using the linearized equality constraints. the dependent variables are expressed in terms of independent variables and are thus eliminated SI-orn the Q P subproblem. Furthermore. the Q P subproblem contains the inequality constraints only. The solution for the independent variables obtained by solving the reduced QP is used to obtain the solution for the

dependent variables. This technique can also be adapied for data reconciliation to obtain a very efficient method. Generalized Reduced Gradient iGRG)

The GRG optimization technique solves a nonlinear optin~izationproblem essentially by solving a series of successive linear progra~unlingproblems. At each iteration, a linear program (LP) approximation is obtained by linearizing the objective function and constraints. The LP subproblem is formulated as in Equations 5-26 through 5-28 with the difference that the second (quadratic) term in Equation 5-26 is not present. The GRG technique differs from SQP in one fundamental aspect. At each iteration, the GRG method requires estimates which satisfy the nonlinear constraints, whereas in SQP the estimates at each iteration need not be feasible with respect to the nonlinear constraints. Instead of a line minilnization as in SQP, the solution of the L P subpoblem is adjusted using an iterative procedure such as Newton-Raphson in order to obtain estimates that satisfy the nonlinear constraints. The LP subproblem is itself solved using a standard algorithm by partitioning the variables into dej>eizdenr (basic) variables and i~?dc~)etzdeizr (i:oizDasicJ variables; the dependent variables are implicitly determined by ihe independent 1-ariatiles; making the ~Ljeciivefunction a function c?f oniy the noi:basic variables. The nonbasic variables are split f ~ r t h e rinto superbasic varia'nies, which lie bztiveen their bourlds and norlbasic variables which &re at their bcundc. A one-dimensional search is p e r f c n x d in the direcrior. of the grsdient ef the superbasic variables (hence the term "reduccd ~radient").Vanoils comniercia! CRC- algorithms d%er in the methods they use to ca!q out the search and to rcgain a feasible point with respect to the conlinear ccnstraints [20, 21 J An icteresting issue concern!: the manner in which unmeasured variables are handled in both SQP and GRG. In both these approaches, no distinction is made between measured and unmeasured variables. A quest ~ o nwoi-rtl co~lsidenilgIS whether Crowe's projection matrix method for decoupling the esrimation of measured and unmeasured variables can be gainfully exploited in each iteration of the nonlinear programming techniques. The answer is that Crowe's projection technique cannot be utilized for elinlinating unmeasured variables if there are bounds on these variables, because the estimates fol- unmeasured variables obtained using this technique may violate the bounds. Furthennore, both successive qua-

dratic programming (RND-SQP) and GRG employ a projection technique to eliminate not just unmeasured variables but a set of depellieilt variables (equal to number of equality constraints which are uruaily more than the number of unmeasured variables). An analogy between this and Si~npson'stechnique discussed in Chapter 4 may also be drawn.

Example 5-1 We illustrate the necessity for including bounds in data reconciliation for obtaining feasible estimates using the mineral flotation process considered in Example 4-4 for which the measured data are listed in the first row of Table 4-6. The last row in this table also gives the reconciled estimates obtained using a successi\~elinear data reconciliation solution procedure along with Crowe's projection (method of Pai and Fisher [9] discussed in the preceding section). Since this method cannot handle bounds, they were not inlposed. These reconciled estimates show that an absurdly large negative value for the estimate of the zinc concentration in stream 8 is obtained. The same problem was solved using SQP by irnposing lower bound of 0.1% and upper bound of 100% on all concentrations and lower bou1:d of 0 and upper bound of 1 on all flows. The reconciled estimates obtained arc shown in Table 5- 1. Table 5-1 Reconciled Doto of Minera! Flotation Process using SQP Mehod





F Bc,% k,%








: 0.926? 0.9 157 0.84-18 0.G733 O.C8*? O.CJS03 9.01 lii 1.9012 0.4526 0.1316 0.1 2rJ.267 21.163 G.5080 17.126 4.5589 4.3728 1.4212 0.4!0i 6.9002 4.95 51.255 0.i

In this prob!em, -we are able to obtain feasible estimates by including bouncls in the nonlinear Dl< problem. Thc concentrations of Cu in stream 4 and ZII in stream 8 are at the lower bolinds in the reconciled solution. By comparing these results with Table 4-6, we note thac the reconcileci estimates for all other variables are not significantly different.

nirfnRrcoii~.iliuno,~ onrl Gross EI mr-/)~~f~.rpi.?lori

is of full colu~nn-rank,U2 would not exist and no unobservable variables would exist. A projection matrix P for matrix J, car be created as follows:

VARIABLE CLASSIFICATION FOR NONLINEAR DATA RECONCILIATION In Chapter 3, we have reviewed the methods of observability and redundancy variable classification for linear data reconciliation. Many of those methods [ I 1, 22-29] are also applicable to nonlinear data reconciliation (particularly for problems with bilinear constraints). For problems with higher levels of nonlinearity, a usual procedure is to perform a model linearization first, and apply the variable classification methods for the linear models. Albuquerque and Biegler [30] recently described such a procedure. Although designed for variable classification in connection with dynamic data reconciliation problems, their method can be used for steady state nonlinear data reconciliation problems as well. In this approach. an LU decomposition was used to build a p ~ ~ ~ e c tion matrix, in order to separate the unmeasured variables from the measured ones. The variable classification rules are very similar with those described by Swar-tz [ l l ] with a QR decomposition algorithm. Since the LU decomposition is part of some NLP solving methods for data reconciliation, we briefly describe [he Albuquerque and Biegler algorithm for uriable classification here. Eqtistion 5-!6 describing a linearized niotiel. can also St: written in an abbrcvia~edform suclt as:

\~:i.,erc we grouped all constant terms frorr? the modei iinearizaticr? i ~ t ao global constant vccfor c. We aisn dropped the subscript L:, hy assuming rhat the linearization is pertc~rnredabout ihe final solutior; point. Ic order variables u, a p;ojection matrix P sach that to elirt-rinatc the ~lnn~easui-ed P.1, = O should be const~ucted.Let us assurne that ar? ti' decorcposition of matrix is performed 3s fo1lo:vs:

where E and n are sonie per~nutationmatrices, 1, is a lower triangular matrix, U l is an upper triangular matrix of rank r (thc column rank of ~' matrix. If J,, is of full row-rank, the matrix J,,)and U2 is ~ 0 1 1 1 rectangular 7ero rows in the ~rppcr!I-ianzular rn:ttrix would not exist. Furthennore, if J,

Similarly to the rules derived by Swartz (111 or Crowe (221, observabilitj- requires a zero row of Uc1U2. Furthermore, the nonredu~idant measured variables will have zero columns in the matrix PJ,. All other measured variables are declared redundant. It should be noted that these methods depend on the actual values of the measurements and may give rise to incorrect classification due to numerical problems. Alternatively, graph-theoretic theorems and algorithms have been developed by Kretsovalis and Mah (23, 241 J U L observability and redundancy classification in general processes, which are based on rdgehrtric solvtlbilin, qf tlze c.otrj.ttaint ec~uutiorzsrutlzer than an actuul solution of the DR yrohletn. Hoanever, this method has the limitation that it does not take into account all types of process units (such as flash units) in its analys~s.Speciaiized graph-theoretic observabiiity and redundancy classification al~orirhlnsfor bilinear (niulticoi7iponent) processes have also been developed by otlicr researcher:;, as indicated in Chapter 3.


Erercise 5-2. P!-o:,erllr ohservubi!ih crrd redrlndanc!; i-ules,!iir tile L,U dec.ov;p~siticncr(~,v/-orrclr rf(x-ri!,ed gbove. I

Several issucs regarding the variable c;assificatictn issues in connection with SOP solvinz a1go1-ithlnare irn!~ortant and wil; be briefly mentioned here. Fzirst, SQP requires initial estilnates of all variables (nicasured and unmeasured). If there are unobservable variabies, SQP will still be able to provide estimates for all variables. It uses the initial estimates of unmeasured variables (just sufficient number to make all other unmcasured variables observable) as "specificatiorls" and pel-forms the data reconciliation. Which of the i~nmeasuredvariables ~villbe chosen as the specified ones is irrlplicit in the numerical method of solution (basically when the choice of independent and dependertt val-iables are made based on the


Dnru Rci.orr< itiuiiol~ntid GI-ass

r or- Drtectir~!~

colurnns of the linearized constraint matrix at each iteration). The only way we will get to know which of the unobservable unmeasured variables have been chosen as specified is by examining the final results. If the reconciled estimate of an unmeasured variable is equal to the initial estimate provided by the user, then the unmeasured variable is unobservable (Note that there may be other unobservable variables that may have been back-calculated based on the choice of specified variables which we cannot figure out from the results). Similarly, a redundant measurement can be identified by examining the reconciled values. If the reconciled value of a measured variable is equal to the measured value, then the measurement is nonredundant. In some rare cases, the initial estimate and final reconciled estimate may be the same due to numerical values (small variance, etc.). Again, we may not ocl: dble to pinpoint the cause for zero adjustment precisely. Thus, the only way to perfornl observability/redundancy variables is through comprehensive algorithms cited above. The KND-SQP algorithrrl [19] automatically generates a reduced quadratic program by eliminating all equality constraints (mass, energy, conlponent balance:) and an equal n~lrnberof variables frorn the criginal prcblen. Tilerefore, :his ,nl.r7es[he smallest reconciliation problem. There is no need to identify redundant variab!es because the RND-SQP uses an LP type technique to separate tne variables into dependenUindependent variables and piirninates ail dependeni variables (a m!xtt!re of msasured and un~nezsuredvariables) from the problem to construct reduced QP at each iteration. But if a redundancy anaiysis is required for sensor placei11cil! e r other rcasons, a separate redundaccy analysis by one of the rnectiocis mentioned above can b~ perforrnzi.

COMPARISQN OF NONLiNEAR OPTIMIZATION STRATEGIES FOR DATA RECQNCfLIATlON Nonlinear programming codes are already co~;;,ncicir;!!j: 2; ,;:,:;!: and they have proved to be nume~lcallyrobust and reli-'-le '7.r large-scale inrlxlstrial problems. They perform best when rigorous model5 are used 131, 321. Norilinear programming allows a complete formulation for data reconciliation. as described by Equations 5-7through 5-9. T-joa and Bieglcr 1331 have developed an efficient hybrid SQP method specifically tailored to solve nonlinear data reconciliation problerns. The data reconciliation software package RAGE developed by Ravikumar et a!. 1341 also uses an SQP solver which has been specially adapted to

solve data reconciliation problerns. Liehnan and Edgar [35] cornpx& the generalized reduced gradient (the GRG2 version of Lasdon and Waren [21]) with the successive lineal- (SI,) daia reconciliation solutjon method, and found that the NLP method was more robust at the expense of the co~nputationaltime. While GRG2, a feasible path method, requires the convergence of the constraints at each iteration, SQP--the infeasible path method, satisfies the constraints only at the end when convergence is achieved. SQP and other infeasible path methods (such as MINOS, another generalized reduced gradient method developed by Murtagh and Saunders 1361) usually require less computational time than feasible path methods. Ramamurthi and Bequette [12] compared SQP, GRG and SL methods for data reconciliation purposes. Their findings are sunxnarized as follows: 1. Successive linearization yields significant biases, particularly in the unmeasured variables, while the NLP approaches yield little bias in both measured and unmeasured estimates. 2. Computational time incrcases with the nlagnjtude of the measurement error for SL, but not for SQP c r GKG. 3. Computation:ll time is a strong function of desired accuracy for SL. but not for SQP c r GKG. 3. NLP algorithms are more efficient and more r o b ~ ~for s t highly noilli11 ear probierns. SQP is inore ef'ticient, while (;KG is r-nore reiiable.

SUMMARY . T h e constraints of a nonlinear data I-econciliation problem can contain equality constraints (material balances. energy balances. equilibrium constraints, and property correlations) and inequality constraints (bounds, thermodynamic feasibility constraints). Nonlinear data reconciliation problems which contain only equality constraints can be solved using iterative techniques based o n successive linearization and analytical solution of the linear data reconciliation problem. Nonlinear data reconciliation problems containing inequality co11strain& can be solved only using nonlinear constrained optimization techniques. T h e Generalized Reduced Gradient (GRG) and Successive Quadratic Programmifig (SQP) rnethods are two competitive nonlinear optimization techniques used for solving no~llineardata reconciliation problenis. If bounds on unmeasured \lariables are imposed, then unmeasured variables should not be cli!ni~iaredusing Crowe's projection technique to obtain ;I reduced pi-ohlem. It is Recessar to imposc ho~!ndson variables in certain problenis :o obtain feasible es!;niafes.

REFERENCES 1 . MacDonlild. K. J. alld C. S. Hoiist. -'Dat:r Kcc~~iciiiatior! and l'aranlere~ Estilua-

tior, in Plailt Pe~forn~ance Anal)sis." AIClil? Jour~lul;-l(no. I, Jan. 1980): 1-8. 2. Eritr, H. I., and !i. H. I,:~eckc.. "The Estimation (;i' Pal-amcter\ in Nonl~near Implicit Modcls." Tech~~nrt~c~rr-ic.~ 15 (nu. 2, 1973,):231--217.

3. Dennis. ?. E. Jr., a ~ R. d B. Sch:iahel. iV!rtt~r~-ic.crl rMerlrod.sfor IJ~icot~.rircrirzc~d Oprinlizrrtio~int~ili2iorlliriecit-E(/iwrio/,s.Eng!cwood Clit'h, S.J.:PI-entic(:-Hal!, 1983. 3. Stephenson, G. R.. and C . f:. Shc\vchuck. "Reconciliatioil of Process Data ,l\iChE Joiirntrl 32 (no. 2. Feb. 1986): 247-3-54. with Process Simula~~on."

5. Scrth. R. W., C. M. Valer-o. and W. ,I Hcenan. . "Detection of Gross Errors in Nonlinearly Constrained Darn: A Case Study." Chern. Eng. Conlrtl. 51 (1987): 89--104.

6. Madron, F. Process Plant Perjbnnance: Measurement aizd Darn Processins for Optimization arzd Reirojits. Chichester, West Sussex. England: Ellis Horwood Limited Co., 1992

7.Knepper, J. C., and J. W. Gorrnan. "Statistical Analysis of Constrained Data Sets." AIChE Journal 26 (no. 2, Mar. 1980): 260-264. 8. Crowe, C. M., Y.A.G. Campos, and A. Hrymak. "Reconciliation of Process Flow Rates by Matrix Projection. I: Linear Case." AIClzE Joun1crl29 (no. 6. 1983): 881-888. 9. Pai, C.C.D., and G. Fisher. "Application of Broyden's Method to Reconciliation of Nonlinearly Constrained Data." AIChE Journul 34 (no. 5, 1988): 873-876. 10. Broyden, C. G. "A Class of Methods for Solving Nonlinear Simultaneous Equations." Muth. Conzp. 19 (1965): 577. I 1. Swartz, C.L.E. "Data Reconciliation for Generalized Flowsheet Applications," presented at the Amer. Chem. Society National Meeting, Dallas. Tex., 1989. 12. Ramamurthi, Y., acd B. W. Bequette. "Data Reconciliation of Systems with Unmeasured Variables Using Nonlinear Programining Techniques," presented at the AIChE Spring National Meeiing, Orlando, Fla , 1990. 13. Gill, P. E.. W. Murray, and M.H. Wrisht. ?I-c~cricalOprirnizurio~!.Lonr!on and New York: Acade~cicPress, 19s 1. i4. Edgar, T. F., and D. M. Himmelblau. Optinziznrio~?cf Cj7e1tzicai Pi-oresse:.. Ncw York: McGraw-Hii!. I SX8.

15. EIan, S. P. "A Globsliy Con~ierg-ntMet!iod for Xonlinear Programmi~;g."J. Opti:?lizntiotz Tlieor~Confro122 (!977): 297. i6. Powel!, M.J.D. "A Fast Algorithm P J ~Nonlineariy Cons:raincd Optitni~arion Cuic~llarions."Dlrtldce Cot$ Namer, 1977.

17. Chen, H-S., and M. A. Stadthem. "Enhancen~entsof Han-Powe!i Metho4 for Succesive Qcadl.atic Programming," Cor7zputcrs Che117.Etzgtig. 8 (110. 314, 1984): 229-234. 18. Gill, P. E., W. Mun-ay, M. A. Saunders, and M. H. Wright. User-'s Gxidefbr. SOUQPSOL; A Forrt-a11Package for 2~cadruticProgrartz~t~irzg. Technical Report SOL 83-7. 1983. 19. Vasantharajan, S., and L. T. Bicgler. "Large-Scale Decomposition for Successive Quadratic Programiuing." C o n ~ ~ ~ u t eClzern. t - s Etzgi~g. 12 (no. 1 1. 1988): 1087-1 101.

20. Abadie, J. "The GRG Method for Nonlinear Programming," in Design and Imnplenzentation of Optimization Sojhvare, Sijthoff an3 Noordhoff. Holland: H. Greenberg, Ed., 1978.

33. Tjoa, I. B., and L. T. Biegler. "Simuitaneous Solution and Optinlization Strategies for Parameter Estimation of Differential-Algebraic Fnuation Systems." Ind. & Eng. Chen7. Keseurch 30 (no. 2, 1991): 376-385.

21. Lasdon, L. S., and A. D. Waren, "Generalized Reduced Gradient Software for Linearly and Nonlinearly Constrained Probleilis," in Design and Implemnentatiotz of Optimization Sofhvare, Sijthoff and Noordhoff, Holland: H. Greenberg, Ed., 1978.

34. Ravikumar, V., S . R. Singh, M. 0. Garg, and S. Narasimhan. "RAGE-A Software Tool for Data Reconciliation and Gross Error Detection," in Fo~hrtrrdations of Comnputer-Aided Pr0ces.r Operations (edited b y D.W.T. Ripping. J. C. Hale, and J. F. Davis). Amsterdam: CACHEElsevier, 1994,429435.

22. Crowe, C. M. "Observability and Redundancy of Process Data for Steady State Reconciliation." Clzem. Eng. Science 44 (no. 12, 1989): 2909-2917.

35. Liebman, M. J., and T. F. Edgar. "Data Reconciliation for Nonlinear Processes," presented at the AlChE Annual Meeting, Washington. D.C., 1988.

23. Kretsovalis, A., and R.S.H. Mah. "Observability and Redundancy Classification in Generalized Process Networks. I: Theorems." Cotnpufers Chern. Engng. 12 (1988): 671-688.

36. Murtagh. B. A., and M. A. Saunders. MINOS 5.0 User's Guide. Report SOL 83-20, Dept. of Operation Research, Slanford University, Calif.. 1983.

24. Kretsovalis, A,, and R.S.H. Mah. "Observaoilit;~ :--I Redundancy Classification in Generalized Process Networks. 11: Algorithms." Conzputers Clzern. G ~ g n g 12 . (1988): 689-703. 2.5. Mcyer, M., B. Koehret, and M. Enjalbert. "Data Reconciliation on Multicomponent Network Process." Computers Chenz. Erzgng. 17 (no. 8, 1993): 807-8 17.

26. Ronlagnoli. J.. and G. Stephanopoulos. "On Rectification of Measurement Errol-s for Complex Chemical Plagts." Chci7~.Czg. Scierice 35 (i98O): 1067- 1051.

25.Rornagnoli, J.. and G. Stephancpouios. "A General Approach !o Classify Operational Faranictzrs and Rectify Measurement Errors for Cornpiex Chzn:ical Processcs.'' Con/?. AjqL to Cliem Erzgmzg. (1983): 153-174 28. S z n c h e ~ .id., A. Bandeni, and J . Romagnoii. "PI-AD.4T: A Package for PI-ccess Vanable Clarsirication and Planr Data Reconciliation." Cnnlpui~r.~ Clioll. Engrzg. 6 16 (Suppi.. :992): S439--S506.

19.Sanchez, M.. ar,d J. Kot~lagnoli."Use of Orthogonal Transf~rniationsin Data Classificatiori-Reconciliarion" t,'ornpu!em-s Chen?. Engng. 20 (no. 5, 1996): 483493. 30. Albuquerque, J. S.. and I,. 7'. Biegler. "Data Reconcifiation a~idGross Ermi Detection for Dynamic Systems." AlChE Jortrmzal 42 (no. 10, 1996): 284 1 --2856. 31. Nair. P., and C. Jordachc. "On-line Reconciliation of Steady-State Psocrss

Plants Applying Rigorous Model-Based Reconciliation," presented at the AIChE Spring National Meeting, Orlando, Fla., 1990. 32. Nair, P.. and C. Jordache. "Rigorous Data Reconciliation is Key to Optimal .for the P r o c ~ s sIrrdustries, Vol. IV, no. 10, pp. Operations." Cor~fr-ol t 1 8--123. Chicay: Pulnam Publ., 199 1.

nascent and will continue to evolve, the intent of this chapter is to only introduce the reader to this topic. Before we proceed fulther, it is useful to explicitl!; describe what we mean by a dynamic state of a process. Two features characterize a dynamic state of a process:

Data Reconciliation in Dynamic Systems

THE NEED FOR DYNAMIC DATA RECONCILIATION In the preceding chapters, data reconciliation has been applied to a single vector of mcas~lrcneritsof process variab!es. This vector could be the mcasurenlents made at ally tinie instarit correspocding to a single cnapshot of :he is !nore likely. howevel-. tll:lr stead),-state data rzco:lci!iation is ripplied to a vector containing t l ~ caverage \;a!l;es of thz illcasureinenrs made over a period of :i!ne of, say. a few hours. This approach is satisfaztol-y if the reconciled data is required for applications s l ~ c has steady-state simulaticn. or on-line opiimjzation where the optimal set points are calculated once every few hours. If we coilsider applicatiolls such as regulatory co:ltrol which rcquire accurate estimates of precess variables frequently, then data reconcilianiade at every sampling tion may have to be applied to ~lleasuren~ents inslant. in this cass, it Carl no longer be assumed that the variables obey Storage capacities steady-state mal~riaia ~ i derlclyj ud~dncereia~ion.\i~ips. and transportzti,,~l,&s should be t a k i i i i i t ~accouili, and dynamic rnaterial and energy balances that relate the variables must be used. Estimation of process variables which uses measurements and dynamic relationships between the variables have been developed long before the sut~jectof data reconciliation was born. We discuss some of these important estimation techniques along with recent advances in dynamic data reconciliation under the broad umbrella of dynan~icdata reconciliation in this chapter. Since the area of dynamic data reconciliation is fairly

1. The true values of process variables change with time and thus the measurements of these variables are also functions of time even if we entertain the extreme possibility that measurement errors arc absent. 2. Due to continuously changing inputs, the accu~~iulation within a process unit also changes continuously and has to be taken into account. The above features characterize both operations around a noniinal steady state as well as process transients that take the process froin one nominal steady state to another. Different techniques are available for developing a dynarnic model of a process. These techniques are described under r~zodeliclei~t(ficcitiotlirt several textbooks 1 1 , 21. We will consider only discrzte time lnodels as opposed to continuous niodels because we will be dealing with rueasusements made at discrete time instants which are convenirjntly treated using digital computers. Ful-therniore. \ve will cunsid51-stat? spice modcls as opposed to input-output rnodels due to their ivl~crentad\.antayes. ?Ve \4.i!i begin our descriptio~with linear discrete syster:i madeis beforc ~novins on to nonlinear systems.

LINEAR DiSCRETE DYNAMIC SYSTEM MODE!. A linear, discrete. state-:;pace dyllainic rllc,dei of a process is usual!y described by the fo1lowir.g equations:

where xh : n x 1 vector of state variables uh : p x 1 vector of manipulated inputs w, :s x 1 vector of random disturbances y, : m x 1 vector of rneasuremenrs 4 :m x 1 vector of random errors in mcasurements

The subscript k represents time instant t = kT when the variables are sarnpled or measured, T being the sampling period. The matrices Ak, Bk, and Hk are matrices of appropriate dimensions whose coefficients are known at all times. If the coefficients of these matrices d o not change with time, then the resulting model is known as a linear time-invariant (LTI) system model. It is also customary to use deviation variables rather than actual variables in the model equations. Thus, the state variables, xk, represent the differences between the true values of the variables and their nominal steady-state values. Similarly, the variables uk and y, also represent deviation variables. In this chapter, we implicitly assume that all the variables are deviation variables. Equation 6-1 describes the dynamic evolution of the state variables while Equation 6-2 is the measurement model which describes the relationship between the measurements and the state variables. The standard assumptions made about the random disturbances \vk and the random errors vk are that they are normaliy distributed variables with statistical properties giiien by

Equatioils 6-3 and 6-4 imply that the landom variables wk and vLha:.e zero mean and covai-iance matrices givzn by R , and QL,respectively. Fxluadon 6-6 implies that the disturbances a; different times are not correlated, and similarly the measurement errors at different time instants are not correlated. Furthenore, Equation 6-7 stipulates that the disturbances and measurement errors are not cross-correlated. The random errors in nieasurements, va arises due to several reasons 3s explained in Chapters 1 and 2. On the other hand, the causes of random disturbances, wk, in the state evolution equation can be best explained only if we consider a first principles model derived from the differential mass and energy balances of the process. In this case, randorn fluctna-

tions in the process feed characteristics such as its flow, temperature, pressure and composition =an be modeled as disturbances. Any randorn errors in the control inputs arising due to electrical noise in the transmission lines of the controller or due to imprecise actuator positioning can also be modeled as random disturbances. On the other hand, if an input-output model of the process is identified from the process data, then it may not be possible to separate the effects of random measurement errors and random disturbances. In this case, the differences between the model predictions and actual measurements can be attributed to the combined effect of measurement errors, process feed disturbances, and errors between the actual and computed manipulated inputs. A linear system model of the form given by Equations 6-1 and 6-2 can be derived for m y process from the differential equations that describe the mass and energy conservation relations of a process (also known as a first principles model). Alternatively, model identification techniques may be used for obtaining a dynamic model from the outputs or response of a process to given inputs. N7eillustrate the development of a first principles dynamic model for a simple level control process taken from Hellingham and Lees 131.

A simple level control process is shown in Figui-e 5-i, which has a feed jF!) and two outpets (F1 and F3). The vaive V , is kept open at a fixed position, while vaive V2 is mzniptilsted to co~ltrolthe levcl of the tznk (instead of directly computing the new vrlve position, it is assumed chat the adjustment, ci, :o the valve positior, x is computed at each ti~nej. The tank level and position o i the valve V2 are ~ ~ i e a s u r edenotcd d, by measurements Z , and Z2, ~espectively. The differential equation describing the mass balance for this process is given by

The outlet flow rates are related to the tank level and valve positions by

Duiri Re< oriril~ar~vri irrid G Io n Er-r or- Dcirciiori




4 In deriving the above discrete representation, it is implicitly assumed that the valve position x is constant at the value xk during the time interval kT to (k+1)7: It is also assumed that the random disturbance in the feed F1 is piecewise constant (of constant magnitude within each sampling interval, but the magnitude being random from interval to interval). If the adjustment to valve position ak (computed by the controller after measurements are made at sampling instant k) is implemel~tedat the beginning of the next sampling interval, then the valve position at each sampling instant is given by


where eL+,is the random error in positioning the valve Table 6-1 Parameter Values for Level Control Process Parameter




Fiow of infornxtion Figure 6-1, ~ e ~corttrcl e l process

Table 6- 1 givcs the values for different coilsta11:s w e d fo-this process. ilsiilg these \.alazs, the stale-space model of the l ~ \ ~ control el PI-ocess is obtained as

Suklst~tutingtile above relat~onsIn the rilass baiance equation we set

If we assun:e a uniform sampling interval T, between measurements and use thz subscr-ipt k to rzpi-esent the variables at sampling instant then the discrete equivalent of the above diffei-cntial equation can be obtained usirlg the method described in (41.


wherc v,,,,, and v , , ~ + ,are the random errors in the measurements of the level and valve position (in temis of volts), respectively.


Uilra ficco~ic-i/jntior~ and G'rosc Error Defcc-?ro:i

The manipulated inputs uk are obtained using a control iaw which is qenerally a function of the measurenlents when the variables that need to be controlled are also measured. For a simple proportional linear control law we can write the control law as

Initial estimates of state variables are assumed to be available which possess the following statistical properties:

where ySp,, represents the deviation or change from the current operating set points, and is equal to 0 if there is no change from current set points. In some cases when it is difficult or expe~lsiveto measure the controlled variables an inferential control strategy is used. The manipulated inputs in this case are a function of the state estimates. Even in the case when controlled variables are measured, it may be better to base the control law on estimates ~f these variables since these are likely to be more accurate, if the estimator is designed properly. As mentioned in the preceding section, a primary reason for dynarnic data reconciliatio~lis to derive estin~ateswhich can be uscd for better control. We therefore assume a control law of the form

Given a set of measurements, Yk = (y,, y2, . . ., yk), it is desired to obtain estimates of the state variables x,, which are best in some sense. which are intcrWe will denote these estimates using the notation Cklk, preted as the state estimates at time k obtained using all measurements from time t = I to time 1 = k. It should be noted that by using all the measurements from initial time to derive the estimates, we are autonlatically exploiting ternporal redundarzc~in the measured data. The estimation problem that we arc considering here is a special case of a more general problem in which it is desired to obtain estimates of state variables xj for time j, using all measurements made from initial The estimatime to time k. The estimates so derived are denoted as kjlk. tion problem is referred to as a prediction problem if j > k, a filtering problem if j = k, and a sn~oothingprob!e~n if j < k. Here we a;-e concerned with the filtering problem. Due to thc presence of random disturnances in Equation 6-1, the [rue values of the state variables a: every time instant are themseives rafidom variables. Therefme, a probabilistic measure has to be used for detcrmining the best estimates of the state variabies. The best estimates of the state variables at time I( are obtained by minimizing the following f3nc:ion:


where ik are estimaies ~f [he true vaiues ot state variabfcs 2nd x , ~ are .~ changes in the set points of state variab!es from current set points. In order to achieve good coilirol. it is therefme required to es!in;ate the staie valiables as accurately as possible.

OPTIMAL STATE ESTIMATION USING KALMAN FlWER We first deal with th-, problem of o p ~ i n ~ estimation al of state vanabies TO:. a process that can be described by a linear model of thc: fonn giver! by Equations 6 -1 alld 6 -2, an3 which satisfies ihe assumptions of Equations 6 -3 thrcugh 6 -7. We will also assume that i f e rnanipulated inputs at each time are known constailt values. and ignor.2 Cc>r th: !i1~11' " f l ythe f T ~ tt h ~ t these are functions of the state cstirnates. (We will address this issue in a subsequent scction.) The optimal linear state estimator called the Kalman filter which we describe in this section can be derived using differe~~t theoretical formulatiocs and an excellent treatment may be found in Sage and Melsa 151. We use a least squares formulation approach because it helps us to readily compare this with data reconciliation.

Equation 6- 12 is tf~eexpected sum of squares of the differences between the estimates and true values of state variables, and is thus an extensiori of the well-'mow deternunistic least squares objective function. The solution ~ 7 1 in a convenient recurto the problt-ILIwas Grs~uuutlnai by K a i n ~ a !16, sive form r;,.l is generally no\; referred to as the Kalnian filter. The Kalnlan filter equations are given by


Iluff~h'ecilr~crlirrrionrrz I)?.rramicSysfelns


and also have the rninimurn variance among all unbiased estimators. Furthermore, it can also be shown that the Kalman filter estimates are the maximurn-likelihood estimates (Sage and Melsa 151). For a linear time invariant process, the Kalrnan gain matrix becomes constant after some time, which is also known as the s1ead.y-state Kab~tangain. a @ .

The matrices Pklkand Pklk-,are the covariance matrices of the estirespectively. Starting with initial estimates g o and mates j i k i k and gklk-,. Po. the above equations can be applied in the reverse order of Equations 6-16 to 6-13 to obtain the state estimates at each time k. The derivation of the Kalman filter equations is described clearly in Gelb [8] and Sage and Melsa 15). The book edited by Sorenson [9) contains several papers on Kal~nanfiltering and its applications including the original papers by Kalman.

Exercise 6-2. Prove that the Kalma~zfilter estimates are unbiased.

Exercise 6-1. I1eri1.ethe Kaln~ali$lter eyrctzfioi~.~ [IT millil?li:illg 6-12 for t17r pi-oces.s li~ocielEqirutioiw 6-1 c r ~ d6-2 aizd uti1i:irlg tile ~ t u t i . ~ ~ ip~-<~pe~.rie.s cal Eql~(iti(~/~.< 6-3 tl1i-014gIl6-7, 6 - 10, ( I I I 6~ 11.

I -

The recursi\ie form cf the estin1ator equations ccinsiderably reduceh the colnputationai effort in;.tilved in obtzining the can be observed from these equatiolls that :he e f f r ~ spent ~t i i l obtaining :he s a t e estiniates 31 a i i ~ n eis effeciivcly utilized to obtain the estima~esat the next t i n ~ einstant. Equatioi~6 1 3 can also be iiitcri;reted as 3 prt,dic'to~.correc:or method for obrair;iilg thc cstirn~ites.The estimates jikii, il?~ predicted estimate:; o f tiic state variables at time ic based o ! ail ~ tile measl;renients until tirne k - ! . The seccnd tern1 in this ecluation is the co~rectiont o this estimate ba.,ed on the measurement at rime k. The matrix K!: is known as the K ( l l i ~ ~ a ~ ~ ~ f i l r e ~ .q1i11,and the difference cyh-Hk2k,k.,)is known as the h117ovaiio17.s.Thc innovations are equix-sisiit t o rneasurcrllent residuals of steady-state pr~xessesxvhict~\\ere dctiilcd i : ~Chapter 3. and play a cr-ucial role in 21-0s. error detection, as will be x c n in Chapter 9. The Kalman filter exti~niitex ~ ~ o s x s s t ldesirable le htatistical properties of being unbiased, that is.



Applications of Kalman filter in chemical engineering have been discussed by several authors. Fisher and Seborg [lo] have applied Kalman filtering to a pilot scale nlultiple effect evaporator in L I I Z context of investigating various types of control strategies. Stanley and Mah [I 11 applied it to a subsection of a refinery for estimating flows and temperatures. The dynamic model used in this application is a heuristic random walk model for the state variables which is appropriate for describing processes that operate for long periods around 3 nominal steady state with occasional slow transitions to a new n~rnirialsteady state. The state variables were also forced to satisfy the steady-state material and energy ba!ances. Through this approach. they attempted io exploit both spatial and ter~lporal redundancy in the data for reconci1ia:ion purposes. Makfii et 21. [12! recent!y used a sinliiar technique for estimating flows and concentratiocs of a mineral beneficiation circuit. A first-order, identified-transfer-Pdnction n:odel was used to desc1-ibe ?he dynamics and steady-state material balances were also in~poscd,a:th.sugh ths esrirnaies were expected to satisfy them in a rnininium Ieast-square sense.

Example 6-2 We illustrate the application of the Kalman fiiter far obtaining optiinal estimates for the !eve1 control process described in Example 6-1. In this process, the fluctuations in the feed flow rate and random error in positioning of the control valve a!-e taken as state disturbances. The standard deviations of these randorn disturbances are assumed to be 250 cm3/miri and 0.05, respectively. The standard deviations of the errors in measurements of level and valve position are taken as 0.01 volts each. Based on the state space rnodel derived in Exarnple 6-1, the measurements corresponding to

the closed loop behavior of the process are sin~ulated.Tne control law used for this si~nulationis based on ahserva~iorzsand is given by

mance. For this case it was computed to be 0.0068 cm2. The variance of the valve position to achieve this controi is 0.1057 cm'. Analogy Between Kalman Filtering and Steady-State Data Reconciliation

where Z,, and Z l k are the measured valucs of level and valve position (in cm) obtained by dividing the actual measurements in volts by 0.631 and 1.57, respectively (see Example 6-1). A Kalman filter is used to estimate the states using the steady-state Kalman gain, since we have used a time invariant model. The steady-state Kalman gain obtained by solving a matrix Ricatti equation 191 is obtained as

Figure 6-2 shows the tl-ue, measured and estimated values of the level. It can be observed fi-orn Figure 6-2? that the estimares arc cluser to the true values as compared to the measurements. The measurement error variance, calculated from the sample data over the time period of 200 secorids. is 0.0234 i:ni2. whereas the variance of the error in the estimate i s 0.0023 crn'. The viniancc of i!le differ-ence Sctwecn the t n e values arid the set pcint (u~hichis 0 in this case) is an iildicator of the control perSol--

Data reconciliation techniques were developed primarily for steadystate processes whereas the Kalman filter was developed independently for a linear dynamic process. Both these techniques can be derived using a weighted least-squares estimation procedure. In order to bring out the link between these two approaches, we prove that steady-state data reconciliation can be regarded as a special case of a Kalnjan filter. It can be recalled from Chapter 3 that for steady-statc processes, the material and energy conservation relations are written as algebraic constraints. The differential dynamic form of these cofiservation relations can also be used to derive a discrete linear-stat,: space model form of Equation 6- 1 as shown in Example 6-1. As a special case, we can consider a disturbance free fonn of this equation by setting wkro be identically equal to zero for all time to get:

L,et us dcfine a new state .vector x cornpnsed of both xk and xk_,(where we have deliberately ctiose~ito omit the ti~nc.index X for ready comparihon with steady-state reconcilia:io~;). Equatior, 6- 19 can be rewritten as


If all the state variables are assumed to be tneasurcd. then Equat~un6-2 can be written as Figure 6-2. Measured ond estimated stotes ior level con!rol process using control law based on mec~surements.

Let us also assume that we have unbiased estimates 2k-11h-lof the state variables xk-] obtained from the preceding time instant and that its covariance matrix is Pk.llk.l.These estimates can be treated as additional t~leasur-eme~zts and can be written as f.

where E ~ . , are the random errors in the estimate of x k . ~with zero mean and covariance matrix Pk-llk-l.From the assunled properties for vk, we can easily prove that they are not correlated with Ek.,. Combining Equations 6-24 and 6-25 we can write the modified measurement model as

* : I

The estimates given by Equation 6-3 1 can be shown to be identical to the Kalman filter estimates given by Equation 6-13 with Kk= 0 and H, = I as in the case of the simplified model considered in this section. Since the Kalman filter also accounts for random disturbances in the process model, it may be regarded as an extension of linear steady-state reconciliation technique to dynamic processes. An interesting by-product of the analysis carried out in this section is that it can also be proved that the estimates fik-, given in Equation 6-30 are identical to the optima1 smoothed estimates 2k-llkof the simplified model considered in this section.


where v is the -,lector of random e m x s \vi:l~Lero rne:ln and col/ariaa(:e matrix C defined by

Ec!uatic?ns 6-26 and 6-20 are xi~riilarto the srer:dy-state ~neasul-enieni and cor~s~rzint models, Equjtiort:, 3-1 a i d 3-2. ,2j)plyi11&[he stead)-scare reconciliation solu:ior~ to this n?ocic!. we cat! oS:ain the estimates of x itsing Equation 3-5. Substituting fgr !he different i~andblesas defined by Equations 6-21 through 6-23. 6-27 and 6-27 in this solution we get



Linear dynamic data reconcilia~iontechniques have been applied for estimating the flows of a process by Darouach and Zasadzinski 1131 and Rol:ins and Devanathan [id]. These axthors have converied the linear differentiai equations to algebraic equations by replacing the derivative l>y a forward difference. The problem can r,ow be solved using linear dara reconciliation s o l ~ ~ t i otcci~niq~iex n similar to the procedere disc~lssed in this sec~ion. Sifice the problem dinlensioil increases with time, efficirnt tech:iiques fol. obtaining the estimates have been proposed by thesc authors. Bag:jewicz and Jiang [I51 also cocsidered the pic;blen~ of dynamic data rec. th.7 +lnwc and onciliation of process flcws and tank h c l d ~ l y .Accllm;nll tank holdups to be polynomial functions o f time. these authors convert the differential equations into algebraic equations. Using a window of nieasur-ements, the coefficients of the polynomials are estimated. Optimal Control and Kalman Filtering

Considering only the estirrlates of x, in Equation 6-30 and using the predictzd estin~atesdefined by Equation 6-14 we obtain


. I ,

Let u j first cons~derthe opt~malcontrol problem for a determin~st~c linear process which evolves accord~ngto Equation 6-1, but without any

state disturbances. Let us also assume that the state variables are directly measured without any errors (that is, the true values of state variables are available). We wish to determine the optimal values of the manipulated inputs which minimize the performance index

x(x:~~~~ n

+ u:Fiui)

Min J; = u, i-I

where Ei and Fi are specified weight matrices (Ei are assumed to be nonnegative definite syrnmetric matrices and Fi are assumed to be positive definite symmetric matrices). The first tern1 in Equation 6-32 attempts to keep the state val-iables (deviations of the state variables from their current set-points) on their target value of zero while the second term attempts to minimize large values of the manipulated inputs. The weight matrices can be chosen based on the relative importance of the state \fariables and manipu!atetl inputs. The solution to the above problem [I61 leads to a linear control law of the fonn

v.!here E,is dependeni on thc xbeighi matrices used in Equation 6-32 ai.,d the system matrices.

Lec us now c::csider (he optiinal contro! prob1t.m for a linear stochastic sy~t'rn which e ~ o l v e zcccrding s to Equation 6-1. !n !his caLe. the performance index for the optiinal cont~-01prob:e!i~ car? bc written as

The optirnal values of the manipulated inputs, which rninilnize 6-24, can h r obtained as fc>llows:

Despite the fact that the rnanipulated inputs are themselves functions of the state estimates, it is assumed that they are known deterministic inputs when deriving the state estimates using the Kalrnai: filter. On the other hand, the optimal control law has been obtaincc! for a deterministic system for which the true values of the state variables are assumed to be available, but is used for the stochastic system also. Essentially, this implies that the optimal estimation problem and optimal control problem has been separated. The proof that this procedure gives the optimal manipulated inputs for minimizing 6-34 follows from the Separation Theorem or Certainty Equivalence Principle 1161.

Example 6-3 In order to investigate the effect of using a control law based on estimated states, the level control process described in the preceding two exanip!es was simulated using a control law similar to that used in Exa~nple 6-2. The control law, however; was based on the estimates of the levei and valve p o d i o n obtained using a Kalman filter. The contro! law for this case is given by

The trbe. measured ar;d simil1r;ted values of the levei f ~ this r case are shown in Figure 5-3. We can compare the true values okained in this case with those obtained in ex amp!^ 6-2 anci find that ther? is a marpina! irnprovernent in conti01 perrill-fiiance. Thc -,.ariaace of th? error b e t ~ , e e n the true vaiues arid set point is 0.0065 cm2 w.tiich is ii~arginaiiyiov,,er than that chtai!ied wi?en the control law is bssed on mzasured values. However, t!le variance in the valve position i11 this case is only 0.0523 cm' which is about 20 of that obtained in the pl-eceding example. This implies that we are able to achieve as good control as before with less change in the manipulated variahie. This is due to the fact that the estimated states are more accurate than the measured values. Kalman Filter Implementation

i 1 ! Compute 2,,L

,. the predicted estimates of the state variables

at time X. using the Kalrnan filter equations by treating the manipulated inputs prior to time k as known deterministic inpnts. ( 7 )Co111ptlte the manipulated inputs at tir-ne k by using the estimates f i , k - l initend of x, in Eqii::tion 6-33.

The matrices Pklk.I and Yklkoccur~.ingin the Kalman filter equations. by virtue of being covariance matrices. sltould normally be nonnegative definite, or in cjther words their eigenvalues should be nonnegative. If this is

The square roots of he matrices can be obtained using Cholesky fdctorization 1191. . Step 3. Co~nputethe transformed measurements and transformed measurement matrix defined by

Figure 6-3. Measured and estimated states for level control process for control law based on estimates.

ensured, then the Kalman filter will he stable. However, if the Kalma~lfilter is in~piementedas given by Equations 6- I3 through 6- 17, these mat1-icex tend to 1:)se their nonnezaiive definiteness cha~-acter;i::d the estimates tend tc diverge, due to ~lumericalinaccuracies in the computation. .i\ form known as :he square-root covrxiance filter can Se used to implement the Kalman filter. Equation 6-16, which is used to obiain Pklh-l,preserves the symmetry and positive definiteness char-acter of this matrix, but Equat~on6-17, used for obtainins the updated ccvariance ~ ~ it invol\~eshe in-version of a matrix, can czuse nun~ericaip r o b i e ~ :sigce ntatrix in the cornpatarion of i h fijter ~ gain nlatrix. !t is this equa:ion which is recast i11 telms of square rocics of the co\.a~-iancen:ati-ice:. Fo~ther!nore;the computational efficiency is also inzr-easedby processing the measurements in a sequential manner rather than simaitaneoilsly. thus dvoiding the x e d to compute t!le inverse of a matrix in Equation 6- 15. We describe the steps inv~ivedin the implelnentation and rctcr me reader to Yagchi [I 71 and Borrie 1181 for a detailed derivdtiutl c,i the algorirhn~. Step 1. Sta~tingwith estimates 8,. and Pk.llh.laj~plyEquations 6-14 and Phlk.,. and 6-1 6 to compute the one-step ahead predictions, iklh-l Step 2. Obtain the square roots Sk,k.l and C, of the covariance matrices PkI,_,and 0,. respectively defined by

Since CL is upper triangular, y i and H; can be cornputed without thc cxpl~citneed to invert Ch.

S t e p 4. Since ihe transforned measurements are not correlated (see Exercise 6-4), t h ~ ycan be p:-ocessed seque~ttiallyusing the fo!lowinz eqilztions:

where Sklk-,is the square root of the updated co\iariance matrix Yklkaftelprocessing the first i measurements, and ti;,, is row i of the tr-ansfonned nieasurement ~llatrixH i . We initialize the computations of this step using

Thus, after all n measurements are processed, the updated covariance matrix is obtained as P ~ l k =~ k l k , n ~ ~ I k , n

(6 - 43)

Although, the Kalman gain matrix is not explicitly computed in the above sequential procedure, it can be computed if necessary using

where w(t) is a white noise process with mean function zero and covariance matrix function R(t)G(t-z), where 6(t-z) is the Dirac delta function. The variables are assumed to be sampled at discrete times t = kT and the relation between the measurements and state variables are represented as

where Kk,iis column i of the gain matrix.


The treat~nentof nonlinear processes presents several d i f f i c ~ ~ l t i e s which are not encountered in linear systems. First. it is generally not possible to analyiicaliy obtain a discrete form represent:ition of :he process analogous to Equation 6-1 starting with a jzt c~fnon!inear difFercn:iul equaiions describing !he process. Secondly, it is nlathematicaliy difficult tc treat random coke, if the state transition equations or medsl~rcment equaticr?~are nonlinear functions of the noise (see Borrie [is] for a detailed explanatio~~). Thus, the effect cf noise in a rionlinear process is modeled as a linear additive tern. Thirdly, even if the random noises are assumed to be normally distribuied, nelther the state variab!es nor rhe measuren~entsfollow a Gaussian distribution ciuc to the nonlinearity of the eqnations. Thus, a probabilistic framework can be used only under some approximations (see Jazwinski [20j for a more complete treatment). A Izast-squares formulation, hourever, can alv,'ays be used to derive the estimates. Under the above iirtutations, the evolution of the state variables fcr a generd nonlinear process is rnodeled by the following differential equation:

where vk are the random measurement errors which are assumed to be Gaussian with mean zero and covariance matrix Qk.As in the linear case, we assume that w(t) and vk are not correlated with each other. Equatio 6-45 and 6-47 describe a nonlinear continuous stochastic process with discrete measurements. Based on a linear approximation of Equation 6-45 and Equation 6-47. at each time around the current state esiirnates, an extended Knli~ian,/i!rer can bc: used to obtain the state estiinates recursively using the following equations which are znalogous io Equations 6-45 and 6-47:



Table 6-2 Parameter Values for CSTR Parameter

Equations 6-49 and 6-51 are nonlinear differential equations that have to be numerically integrated to obtain the predicted estimates of the state variables and the predicted covariance matrix of estimates. Equation 6-5 1, which involves the solution of n2 coupled differential equations, can be computatioilally denlanding. These can be avoided by computing a state transition matrix A, at each time based on a linear approximation of the nonlinear functions and assuming it to be constant duling each sampling period (Wishner et al. [21]). With this additional appl-oximation, Equation 6-16 can be used to obtain the predicted covar-iance ~natrix.The method described here represents one of many different approaches for developing recursive estimation techniques and these are described in Muske and Edgar 1321.

Example 6-4 A conri~i~(oir.s .cri,-,-erlrank t-ocrc.tot-(CSTK) with extel-nal heat exchange 1331. and ir, u.hic11 a fij-st order exothernlic reaction (decornpositicm of a reacraht A ) occui-s is used to illt~stratetho ;1pplicatio11of state estimation for:i nonlinear process. The dif::~-i.nti;~!equations describiiiy th? cliangc i l l concerltrutinn (of reactant i\) ai~dtempel-atur.: i l l the reactor are gi\7cn by

whei-e A. and To are the feed concentration and te~rtperature,respeciively and A , T are the react:)r concentration and temperature, respectively. The concel~trationa ~ temperature d vnriabies ar-e scaled using factors A, and T,. respectively. The reaction rate constant is given by






e The values of all parameters are listed in Table 6-2. Corresponding to these values. it can he verified that the steady-statc reactor concentration is 0.153 ! and steady-state reactor temperature 4.609 1 . It is assu:ced that for this process, the reactor coi1centratio:l and ternpcrature are n1e;licred using a satripling period of 2.5 s. uld that [hi. sranliai-d deviat~onsof random e m r s in these measurements are 0.0077 and 0.3305 (5% of the steady-state values), respectively. The open-ioop rcspocse of this process is si~nil!atedfor a step change i~ the feed co~icentrationfrom 5.5 to 7.5, m d an extended Kalnlan filter is used t

I)uru Rt~cori~.ilriztir,tr u~zdG'rosr Error Dr<~ec~iotz

0.20 1 ---- ?'nc


rupted by noises or random disturbances. A similar progression from nonlinear filtering to data reconciliation can be made by neglecting the random noise term in Equation 6-45. There are other key differences. however, in the formulation and solution of ilonliizear dytzantic datu reconciliutiolz problems as compared to nonlinear filtering problems. Liebrnan et al. 1231 and later Ramamurthi el al. 1241 formulated the noniinear dynamic data reconciliation (NDDR) problem and also proposed solution strategies. We first start with a general statement of the problem as posed by Liebrnan et al. [23]before discussing these solution techniques. The NDDK problem may be forinulated as

- - - Measured - - - - - - - Esturnlcd










---- . -




Tim (sec)




Figure 6-4. Estimated concentration of CSTR using extended Kclrnan filter.






subject to 'hi- - - - - -

Measured . . - - - . - kiinmtsd

Figure 6-5. Estimated temperature of CSTR using extended Kalman filter.

Nonlinear Data Reconciliation Methods

It was dertlonstrated earlier that Kzlrnan filter is equivalent to data reconciliation, if we assume that the state transition equations are not cor-

s = f(s);

x(to) = go


There are several features to the above foi-~ntilatior!that need elaboration. Firstiy, :he manipclated input variables u are inciticied as part of :he objective f u ~ c t i o naild ar-c estin:ated at each tilnz step, although they arc assuiilsd to be constant uithir, ea-t: samp!ing p-riod. The ccn3puied values of the nia11ipilla:ed inputs. uq at ezch time J , using Equation 6-9 or any other control law, are different from the actual manipulated iiiputs to the process due to inherent er-rorc in the actuators. Thus, the conlputed values of the manipulated inputs sel-ve as ineas~~rernents, and ihe true \la!ues of these variables have to be esti~tlated. This formulation is Inore general as cornpar-ed to tlie model used in filtering. where the manipulated inputs are assumed to be known exactly. Secondly, the state variables are assumed to be directly measured (or equivalently, the matrix Hk is assumed to be an identity matrix). This

does not impose any limitation because by using a simple transfor~natjon, the problem can still be formulated as above. If a measurement is a nonlinear function of the state variables, then we can introduce a new artificial state variable corresponding to this measurement and the nonlinear relation between the a~tificialstate variable and the actual state variables can be included as part of the equality constraints of Equation 6-57. This transformation is similar to the treatment of indirectly measured variables in steady-state data reconciliation (see Chapter 7). Lastly, the inequality constraints, Equation 6-58 allows bounds on state variables and other feasibility constraints to be also included. It should be noted that filtering methods cannot handle inequality constraints and can therefore give rise to infeasible estimates much like linear steady-state reconciliation methods. Thus, the formulation given by Equations 6-55 through 6-58 is extremely general and practically useful. The general fonnulation of the NDDK problem comes with a price. It is no longer possible to develop a recursive solution technique as in filtering. Furthermore, a close look at the objective function reveals that all the state variables from initial time up to cul'rent time are being simultaneously estimated at each sampling instant. This leads t o an ever gro~ving increase in the number of variables wi:h tiine that I-tave to he estima?ed. which is not practically acceptable. In order to reduce the computational burden. a tllol'illg 14.iridow approach was adopted 12.1. 241. In this apprdach, at each timz r on!;, a u3it:dowcf measurements froin time r-N to rime I are used to estimate ali the srate variables within this time v~indowof size AT. The objective functicr, to be minimized is the weighted swn squared ditferencer between the measur-nients and state estimates withir. this time v~indow.The estirilate obtained for the state v~riablesa! time r frorii this optirnizatio~are ased to compute the Inanipulated inputs. The procedure is repeated at ;he next sampling insiani, giving rise to the term "moving window." The solution strategy used to solve the estiniation problem at each qf the r n n l ~ i n nu~;nrlc>wapproach requires some explanation due to the presence of nonlinear differential equations, Equations 6-56 along with algebraic equations Equations 6-57 and 6-58. Liebman et al. 1231 converted the differential equations into algebraic eqi~alitycoristraints by discretizing them using orthogotzal col1oc:utiorz. I n this technique, the state variable functions (of time) within each sampling period are expressed as a weighted sun1 of the state variable values at different time instants, within this sampling period, representing the collocation nodal points. The weights used in this representation are the orthogona; polyno-

mials. Although, a sarnpling period can be subdivided into several elements, for convenience one element is used per sampling interval. With this choice, the state variable functions within each sampling interval j can be written as

where li(t) are orthogonal basis polynomials, n, is the order and are xi the state variable values at the ith collocation point in sampling interval j. The end points of this interval 1 and n, correspond to the samplin,5 Instants. Using Equation 6-59, the derivati\res can also be expressed in terms of the state variable values at different tinle instants. Equations 6-56 can now be forced to be satisfied at a11 the collocation points resulting in the following algebraic equarions for each sampling interval j.

where x' is the vector of all state variables at all collocaticjn points iii sampling inter-val j. Eqc~ation6-60 can be ivrittzn ro:- each of ti:e i\i sampling interbals in tlle winciou- chosen with the additional stipu1::tion that the variable values at the end of a sampling interval are q u a [ t o those at the bzginning of the next interva!. A nari!inei.;r optirnira:ion techniqlje such as GRG or SQP discussed in Chapter 5 can be used to misinlizz 6-55 sub-ject to 6-57, 6-53. a i d 6-60, 11 .cila~ldb s noted t11;it the number of variah!es in this optimization problem is more th:r~t tile number of statc and inpet vsriables at :he N sampling insta~ilswlt!li~lillz window. since we are also simu!taneously estimating the state variahlss at the collocation polnts within each interval. More details oo the type of orthogonal polynomial used, the size of the problem. and the structure of the derix~ativematrix D are availablc i n Liebman et al. [23j. In order to reduce the cornprttatior~aleffort requii-ed by the non1ine:lrprograrnrning strategy described above. Kamamurthi et al. 1241 proposed a succes.sil~e!\ litzear-ized hol-i;orl cs!itllrrtiotz ( S L H E ) method in \rlhicll Equations 6-56 and 6-57 are both linearized around a given reference trajectory for the state variables. The reference values at each sampling instant j are used to obtain the linearized form of these equations fol- the sa~nplingperiod j. If inequality constl-aints are not included, then an a m -

Durn Recorrciiinrior~in D?.rintnrrS~.srenrs

lytical solution for the estimates of the state variables at the beginning of the time window can be obtained which is then used to numerically integrate the differential equations to obtain the state estimates at other sampling instants within the window. Although this method is efficient, it can give rise to infeasible estimates because it cannot handle inequality constraints. In the above discussion, we have not explicitly included unmeasured variables or parameters as part of the model equations. The nonlinear programming methods can also be used to simultaneously estimate both the measured states and unmeasured parameters. Simultaneous state and parameter estimation in dynamic processes have been considered by Kim et al. 125, 261, who refer to it as error-irz-variables method ( E V M ) estimation. In summary, nonlinear dynamic data reconciliation strategies have several advantages over classical filtering techniques as i s c u s s e d in this section, but they do not address the problem of random noise in the state equations which can be caused by unmeasiired disturbances to the process. They are also computationally more demanding because a recursive f o m ~of the estimator has not been developed. Currently, these techniques tiai~enot been applied to industrial processes and further developnients are required bei'ore they can be applied i n practice.


The differential equations describing the CSTK are integrated from time t-10 to time t by using a 4th-order Kunge-Kutta method to obtain the estimates of the state variables at all sampling instants within this time period. The objective function value is computed corresponding to these estirnates and the state esti~natesat time t-/O are iterated upon until a minimum value of the objective function is obtained subject to the constraints of upper and lower bounds on the initial estimares. This approach differs from the method of Ramamurthi et al. 1241 in that the nonlinear differential equations are not linearized, but are explicitly integrated. It should be noted that in this approach bounds are imposed only on the state estimates at the start of the time window and it is possible that the state estimates at other sa~nplinginstants obtained by explicit integration may violate the bounds. It has the advantage, however, that it is more efficient than the ~liethodof Liebrnan et al. /23]. The estiniated concentration and temperatures obtained using this approach are shourn i n Figur-es 6-6 and 6-7, respectively. It can be observed that the estimated states are very c l ~ s cto the true values. In order to ensure convergence of the optimizatio:~problem at

...nic I

Tne ~?oiiisot:lernlalCSTP described in Example 6-4 is itsed to illustrate the application :~f no~liincardynamic data reconciliation techniqclz. Measurements c o l ~ ~ ~ p " I l d i ltol f the open-loop response of this process for a step c ' n a n ~ ein tile fed co!lcentratiori from 6.5 to 7.5at initial tir:ie were simula~edas in E x a m p l ~6-4. Using a s..indow !etigth of i O samp l i ~ ~periods. g a ncnlinear- dynamic data reconcilia~icntecilnique is applied to ?stinlate the conc-ntrntion and leruperature in the reactor.. Lower and upper houndc on concentraiion were i:nposed as 0.01 anti 0.2. respectively, and o n temperature as 4.0 and 5.0, respectiveiy. Sincr an open-loop sin~ulationis performed in this example, the ob.jective f~itictian of data reconciliation is the weighted sum square of differences between measul-ed and estimated values over the past 10 sampling periods-that is, the second term in Equation 6-55 is absent. The optirnization at ever-y sampling instant I is carried out by maki!lg initial guesses of the temperature and concentratio~tat time I--10.







-- - -



- - .- - -

-7 I





Time (sec)

Figura 6-6. Estimated concentration of window length of 10.

CSTR using dynamic daia reccnciliation For


4 00

1 25

. -


-,- --35



- . - Measured - - - - - - - - Estkmted





65 T~me(see)


- - -7 --





Dynamic data reconciliation is important for process control applications. In order to exploit temporal redundancy in data, dynamic models for the evolution of the state variables have to be used in conjunction with measurements. The Kalman filter can be used to estimate state variables in linear dynamic systems. If disturbances in state variables are ignored, then the Kalman filter is equivalent to data reconciliation. Use of estimated states instead of measxcments can lead to better control. State estimation in nonlinear dynamic systems can be performed using extended Kalrnan filters or its variants. Feasibility restrictions on variables cannot be handled by these methods. Nonlinear- optimization methods can be used for dynamic data reconciliatior? in nonlinear processes. They can account for inequality constraints but at-e !ess efficient than extcilded Kalman filters.

Figure 5-7. Estimated temperature of CSTR using dynamic data reconciliation for window length of 10.

each time, it w;rs found that bounds oil variables had to be imposed. Coir:pal-ing t l t e c estimates wi:h t h o s ~obtained using the extended Kalman fi]cer in Example 6-4. it is obscrved that the extended Kdman filter gives better reslilfs and is also cornpu?ationally inore efficient. Thus, it is Setter to use NDDR technique ~ G Iestimating . the states only when extended Kainan filtering techniqaes do nct give estimatzs that satisfy bounds or; vari:


I . Ljung, l,.Syvrenr Iilet~rilir-oriot~-T!~eot;~~ for rlze i1sr:-. Englzwood Cliffs, N.J.: Prenrice-1-Iall, 1387. 2. Sodersrrom, T., and P. Stoica Sys:enz idenrificariotz. Engiewood Cliffs, N.J.: Prentice-Hall. 1989. 3. Be1lingha:n. B., and F. P. Lees. "The Detection of Maiftinction Using a F-'rticess Control Computer: A Kalman Filtering Technique for General Control 1,oops." Trnils. Inst. Cl;c:rn.Eizg. 55 (1977): 253-365.

4. Franklin. G. F. J. D. Poweli, and M. L. Workman. Digital Cu11rr01oj' Il~nat7licSJ-srerrls.Reading, Mass.: Addison-Wesley, 1980. 5 . Sase, A. P.. and J. L. Melsa. Estimatiorl I'l~eoiywith A~)plict~tior~s to Cor11~,lililicatio~~s nrzd Co~ztrul.New York: McGraw-Hill, 197 1. 6. Kalman, R. E. "A New Approach to Linear Filtering and Prcdict~onProb lems " Trclrrr A S M B J. Basic‘ Eng. 82D (1960):3 5 4 5 .

7. Kalman, R. E. "New Results i n Linear I-'iltering and Prediction Proble~iis," T ~ N I IASME S. J. Bnsir Eilg. 83D ( I 96 1 ): 95-108. 8. Gelh, A . Apl7/i~dOptir~rnlE.~tii17clriorl.Cambridge. Mass.: MIT Press, 1974. 9. Sorenson, H. W. Kali~rui~ Filreriilg: 7hror-y and A/>/,liccrtions. New York: IEEE Press, 1985. 10. Fisher, D. G., and Sehorg, D. E. Muliivariuhle Cornpurer Control-A Study Amsterdam: North Holland, 1976.


1 1 . Stanley. G. M., and K.S.H. Mah. "Estimation of Flows and Temperatures in Process Networks." AfCIzE Jouf-17a123( 1 977): 6 4 2 4 5 0 .

12. Makni, S., D. Hodouin. and C. Bazin. "A Recursive Node Imbalance Method Incorporating a Model of Flowrate Dynamics for On-line Material Balance of Complex Flnwsheets." Miner-als Dzg. 8 (1995): 753-766.

i3. Darouack; hl.. and M. %asadzinski. "Data Reconciliation in Generalized Linear Dynamic Systems."AlQ~EJi~ui-rial 37 (1991): 193-201. 14. Ko!lins, D. K., and S. Devanathzi~l"Unbiased Estimation in Dynaiiiic Data Xeco!icili;ltion." AIChfl .loll,-17ici39 ( i993): 1330-1 331.

15. Bag;!je\vicz. M.. and Q.Jiang. "An Integral Approach to Dynainic Data Reconciliation." AlC'i~f< . / ~ : : I ~1 I3 N ( 19971: ! 2546-2558.

If? Andel-son. 3.D.O.. and 1 . B. Moor-e. Ol~riii7cllC ~ l l t i ? ~I>i11eor !: (21!adt-nric ;M~~tiv~ti\-. Engleirood Cii t f s . N.J.: !'ren~ice-Hall. 1989. 17. R;ischi, A. #/?iilil(ll c ~ , ~ ~ i0f.('toc17ii,sti~ ~-oi Si..ster!l.~.Ilertf~rdshire,C'K: Prz11tice-Tiall. 1993. 18. Borrie, :. A. Sroc.11asric. 5>.stri;7.s,fi):- E n g i t : e r r s - ~ o i Esti11lcctioiz cind Coiltrol. Heitf~)rdshire.LiK: Frcntice-Hail. 1992. 19. P I X S ~IF.\'.H., l3. F'. F'lai;11cry, S. .A. Tc:~~k:olsky. a ~ W7. d 3.. \iette~-li~is. !\/~!IIICIF ic.ci1 Rt,ci!jc~s.Kc\\' kork: Csnibi-idge Li~ivcrsityPress, 1986.

20. Jaz\vinski, A . H. Sro~.!~i:,sti(. PI.OCC.TS~.C (1;zc1F i ! r e r i ~ l7%eo?. ~ New York: Acadzrnic Pi-=.;.;. 1050. 21. Wishncr, R. P.. J. A. Tabaczynski, and M. Atlians. "Comparat~veStudy c f Three Nonlinear Fiitclr." i ~ ~ , ~ , ~ 5~ (i909): ~ ~ i i c4t7rZ 4 9 5 . 22. Muske. K. R., and T. F. Eugar. "Nonlineai- Stare Estimation." In Xclir/i~lrat. Pi-occ.\s Coiztro! (edited by M. A. Henson arid D. E. Seborg), Kcw Jersey: Prenrice-Hall. 1907. -1 1 1-370.

23. Liebinan, M. J., T. I;. Edgar, and L. S. Lastlon. '.Efficierit Data Reconciliation and Estimation for Dynamic Processes Using Nonlinear Prograrnniing Technqiues." C o t ~ ~ l ~ u fClzf~f??. e r - s Engtzg. 16 (no. 10/1 1. 1992): 961-986. 24. Ramamurthi, Y., P. B. Sistu, and B. W. Bequettc. "Control Relevant Ilynanlic Data Reconciliation and Parametei- Estiri~ation."Cot7zpurer.s Client. G I ~ I I , ~ . 17 (no. 1, 1993): 41-59. 25. Kim, I. W., M. J. Liebman, and T. F. Edgar. "Robust Error in Variables Eatimation using Nonlinear- Programming Tcchniqnes." AlChE .lour-ncil 36 (1 990): 985-993. 26. Kim, I. W.. M. J. L-iebman. and T. F. Edgar. "A Scqriential Error in Variables Estimation Method for Nonlinear Dynamic Systems." Coi?~pur~.r.~ Chem. Ef?gng. I5 ( 1991): 663-670.

describing these techniques, at the outset it is better to clearly state the requirements of a gross error detection strategy. This also leads to a better understanding of the variety of techniques that have been proposed, their interrelationships and the achievable results from their usage. Any comprehensive gross error detection strategy should preferably possess the following capabilities:

Introduction to Gross Error Detection

Ability to detect the presence of one or more gross errors in the data (the detection problem) Ability to identify the type and location of the gross error (the identification problem) Ability to locate and identify multiple gross errors which may be present simultaneously in the data (the multiple gross error identification problem) Ability to estimate the magnitude of the gross errors (the estimation problem)

PROBLEM STATEMENTS The technique of data recor?ciiiation ct-ucially depends on the zssun~piion that ~ n f yrandem errors arc presect in the data arid systematic en-ors 2i:ller in the messurements cr the model equations are not present. If this ~issi~!lipt~on is invalid. reco:lciliation can lead to large adiustments being macie to tf:e ruevsured vaiues. and the resulting e s t i ~ a t e scan be very inaccura~t: and even infeasible. Thus it is important to identify such systematic or gross errors befare the final recor,ci!ed estimates are obtained. In the first chapter, ii was pointed our that reconciliation can be performed o ~ i l yif co!istr~i~:tsale prcseni. The sams statelneat can be made with regard to the detectiorl of gross errol-s. Without the a,ailzbiiity of constraints as a cou~ter-checkof the measurements. gross error detectioc cancot be carried out. Therefore. both data reconciiiation and gross error detection techniques exploit the same information available from rilessureriients and constraints. These techniques, therefore, go hdilci-iu-;ra~ld in the processir?g of data. There are two major types of gross errors, as indicated in Chapter 2. One is relatcd to the instrument performance and includes measurement bias. drifting, nliscalibration, and total instrument failure. The other is constraint model-related and includes u~accountedloss of n~aterialand energy resulting frorn leaks frorn process equipment or model inaccuracies due to inaccurate parameters. Various techniques have been designed
Not all gross error detection strategies may fulfill all of the above requireme~ts.The last of the above requirements. although useful, is nor absoluteiy necessary. A gross e r r ~ detection r strategy car1 be analyzed in terms of the component methcds i t uses to tackle :he three main problems of detection, identification, acd multip!e gross error identification, arid the perfonilance of the strstegy is a stron2 function of these conlponent me:iiods In this chapter, sve focus on the first two componer:ts of a gross enor de:ection strategy, that of detection and identification of a sin~ 1 gross 2 error. Methods for nlu!tii;le grcss error detection are discussed in the fo:lo\iiiilg chaptcr. L





This component of a gross er:-rr detcz~i:::: ,:;-:.::,; simplj- , ~ ~ ; i . ; ~ i;v a answer the question of whether yrqcr errors are prese1.t i.n. [he data or not. It does not provide any clues on either the number of gross errors, their types. or their locations. We reiterate the fact that all detection methods either directly or indirectly utilize the fact that gross errors in measurements cause them to violate the model constraints. If measurements do not contain any random errors, then a violation of any of the model constraints by the measured values can be immediately interpreted as due to the presence of gross errors. This is a purely deterministic neth hod.

l r ~ i r , r r f ~ l c r / i oto r ~ GI-(I.\.\ Error J ) P J P ( I ; O I :

We have assumed, however, and rightly so, that all measurements do contain random errors due to which we cannot expect the measurements to strictly satisfy any of the model constraints even if gross errors are absent. Thus, an allowance has to be made for the violation of the constraints due to random errors. Under an assumed probability distribution for the random errors, a probabilistic approach is used for resolving this problem. Some basics of probability distributions and statistical hypothesis testing are explained in Appendix C. The basic principle in gross error detection is derived from the detection of outliers in statistical applications. The random error inherently present in any measurement is assumed to follow a normal distribution with zero mean and known variance. The nor-~nalizederrar- (the difference between the measured value and the expected mean value divided by its standard deviation) follows a standard normal distribution. Most normalized errors fall inside a (1 - a) confidence interval at a chosen a ,r level of significance a. Any \!slue (nor-malized error) which falls outside that confidence region is declared an outlier or a gross error. A number of statistical tests are derived from this basic statistical principle and are able to detect grass crrors. But not all statistical tests are able to identify different types and localion of gross srrors. Some basic statistica! tests are able to detect only r~leasurementerrors (biases). Otherstatistical tests car, on!y detect PI-ocesc model errors or- leaks. On the other hand, the generalized likelihood ratio :es!. which is derived from maxiinurn likelihood estimation principic in statistics, tail be used to detect both instrumnt prob!ems arid PI-ocessleaks. Thc next two sections describe thz :wo basic classes of statistica! tests used for gross error detectio11. Next, a cieriked class of statistical te\ts knowri as the pri~icipalconll;cvlcrzt ipsf.\ is a!so przsznred and co~npared with the basic statistical tests. These tests are based c;n a special ~ y p sof !inear transformation of the residual \iectc?i-su s d in the basic tests. For (he sake of clarity in this chapter. the principal component tests are pre.,:.,::2 ifi a sepzr;:: XC;~GE. The .Gost cnmmonly used statistical tcchniques for detecring gross errors are based on hypothesis testing. In a gross error detection case, the rlrcll hypotlzesis, Ho, is that no gross error is present, and the crlienlniive hypotizesis, H I , is that one or more gross errors are present in the system. All statistical techniques for choosing between these taro hypotheses make use of a rest statistic which is a function of the measurements and constraint model. The test statistic is compared with a prespecified threshold value and the null hypothesis is rejected or accepted, respec-


tively, depending on whether the statistic exceeds t h e threshold or not. The threshold value is also known as the test criterioa or the critical value of the test. The outcome of hypothesis testing is not perkct. A statistical test may declare the presence of gross errors, when in fact there is no gross error (Ho is true). In this case, the test commits a 7:vpe I error or gives rise to a ,false alarm. On the other hand, the test may declare the measurements to be free of error, when in fact one or more gross errors exists (Type I/ error). The power of a statistical test, which is the probability of correct detection, is equal to I - 5 p e I/ error probability. The power and Type I error probability of any statistical test are intimately related. By allowing a larger Type I error probability, the power of a statisticai test can be increased. Therefore, in designin; 9 statistical test, the power of the test must be balanced against the probability of false detection. If the probability distribution of the test statistic can be obtained under the assumption of the null hyputhesis, then the rest criterion can be selected so that the probability of Type 1 error is less than or equal to a specified value a. The parameter a is aiso referred to as the level qf signi$ca~zce for the statistical test. The different statistic21 tests for gross error detection and the choicz of the test cr-iterion are described ir! thc following section. For the sake of' .simplicity, \ve will acalyze the hr?sic statistical tests assuming steadystate cor:di:ions 2nd linear models. The app!icabili!y of such gross errclr test!: to !lon!inear mode!s will be further discussed in Chapter 8. We wil! assume that the linear constraint model is gil;e~by

where A is tile linear constraint matrix and the vector c contains known coefficients. Typically, for linear flow processcs. c is a zero vector unless some of the variables are known exactly. We have deliberately incltlded thic vector in Equation 7-1 fcr ease of comparison with the linearized form of nonlinear constraints which will be treated later. As in the previous chapters. the nieasnr-enlent ersors are assumed to be distributed no~-rriallywith known covariance matrix C. Four basic statistical tests have been developed and widely applied for gross error detection. To simplify the description of these tests. a linear model with all variables measured will be first assumed. This does not exclude the application of such statistical tests to linear models with unmeasured variables, since, as showr: in Chapter 3, linear models with

unmeasured variables can be reduced to linear models with all measured variables by using a projection matrix. The first two tests are based on the vector- of bulailce residuals, r, which is given by

objective function given by Equation 3-6. This result is used later in analyzing the techniques used for gross error identification.

Exercise 7-1. Prove that the global test statistic arzd the optit?lal dafa reconciliation objectivefunction values are equal. In the absence of gross errors, the vector r follows a multivariate normal distribution with zero mean value and variance-covaria~lcematrix V given by

Therefore. under Ho, r - N (0, V), etc. In the presence of gross errors, the elements of residual vector r reflects the degree of violation of process constraints (material and energy conser\:ation laws). On the other hand. matrix V contains i~lforlnationof the process structure (inatfix A) and the measurement variance-covariance matrix, C. The two quantities. i. and V, can be used to construct siatistical tests which can detect the existence of gross en-ors.

Example 7-1 Consider the flow reconciliation of the heat exchanger with bypass process shown in Figure 1-2. Let us assume that all flows are measured and the true, measured and reconciled values (assuming no gross errors) are as given in Table 7-1, where the flow measurement of stream 2 contains a positive bias of 4 units. The standard deviations of all measurement errors are assumed as unity. Table 7-1 Reconciliation of Data Containing a Gross Error for Process of Figure 1-2 -

The Global Test (GT)

The gloSal test, which was the first test proposed [ I , 2, 31, uses the test statistic given hp

Under Ho. the above statistic fo!l~wsa i('-distribution with v degrees of freedom, where v is the rank of matrix A. If the tesf criterion is chosen 2 where is the critical vaiue of distribution at the chosen as a level of significance, then Ho is rejected and a gross error is detected, if y2x~-,,,,. This choice of the trct c-ritrrion ensures that the probability of Type I error for this test is less than or equal to a. The global test combines all the coristraint residuals in obtaining the test statistic, and therefore gives rise to a multivariate 01. collective tes:. .4 point wosth mentioning here is that the global test statistic given by Equation 7-4 is also equal to the minimum ob-jective function value of tile data reconciliation problem. This can be verified easily by substiiuting the solution for the reconciled estimates given by Ecluarion 3-8 in the


True Flow Values



Meosu:ed Flow Values

- -lOi.91


y = rT~ - ' r


Strecrn Number


The constraint matrix for this pl-ocess i: given by



Reconciled Flow Values


where the rows correspond to flow balances for the splitter, heat exchanger, bypass valve, and mixer in order and the colurnns correspond to the six streams in order. T h e constraint residuals for the given measurements can be computed as 1-1.19, 4.25, -1.79, 1.761. The covariance rnatrix of constraint residuals is give11 by

Using Equation 7-4 the global test statistic is computed to be 16.674. This can be verified to be equal to the sum square of the differences between tilL ~conciledand measured values (optimum DR objective function value). The test criterion at 5% level of significance drawn from the chi-square distribution with 4 degrees of freedom is equal to 9.488. Thus the global tcst rejects the nul; hypothesis and a gross error is detected.

bility that one of the tests may be rejected even if no gross errors are present. In other words, the probability of Type I error will be more than the specified value of a. If we wish to control the Type I error probability. the following modified level of significance P, proposed by Mah and Tamhane (61 (derived from Sidak inequality [7]) can be used.

For any specified value of a, the modified value P can be computed using Equation 7-7 and the test criterion for all the constraint tests can be chosen as Zjpb,2.This will ensure that the probability that any one of the constraints tests will be rejected under Ho is less than o r equal to a. It should be noted that a is only an upper bound on the Type 1 error probability and in order to ensure that the Type I error probability is exactly equal to a, the test criterion has to be chosen by trial and error using simulation. Alternati\~eiy,Rollins 2nd Davis [S] proposed the use of a critical value based on the Bonfemoni confidence interval which is given by

The Constraint or t.iodai Test (NT)

For l a r ~ e\a!ues of 171,Equation 7-7 educes to Equation 7-8 Tne vector r can also b ~used to derive test statistics, ot?e ;'or each constraint i. given by Exercise 7-2. Pmve rkat 0j ~lsirzgc rest crit~riotlbrlscd O ~ tile I iriod$~d hvel oj.rigri$cu/ice giveti bj. Equiirioa 7-7, file TI^? I ~ i ? i ) l ~mDnSilityo f rhe ronstrciiizt rcjst wi//be less tiznlz or. ccquu/ to a.

- .

\?here diag(Vj is 2 diagonal n~atriswhose diagonal elelnents are yi,.Tht. ~ z o d ~orl ronst~-ciinttest [a. 51 uscs the test statistics z, for g!-oss error detection. It cdn be proveci that z,; follows a standard norntal distributio~!, i\! ( 0 , l ) under H,,. If any of rne test statistics z,., (or cquivaien~i),L I I C maximurn test statistic) exceeds the test criterion u l , ~ r eZ1..u/2is the critical \ ~ a l u eof the standard nol-lnal distribution for a level of significance (for the two-sided test), a gross error is detected. I!nlike the global tes:. the constraint test processes each co~istraint residua! separately and gives rise to in univariate tests. Since ntultiple tests are performed using the same critical value, it increases the proba-





I t I:; p ~ s s i b l eto obtain o i k r fc:rrns o i the consirdi~lttest by using a linear uansfcxmation of :he constraint residuais. Iiowevzr, 1101 a!l of these fo!-nls possess the same power to detect zross errors. Crowe 191 obtained tcst whiclt h:~.: thr r~r~.xii?zur~~ r>ni$,Pia particular forai of the co~~straiiit The test slatistics of the i~lnnir~zuii? c-oiz.stmirzr test are given by

The Measurement Test (MT)

or, written in vector form,

The third test is based on the vector of measurernerzt adjustments, m e test criterion is chosen to be the same as in the case of the standard constraint test. If there is a gross error in the process, then it can be shown that the expected value of the ~naximumamong the test statistics given by Equation 7-9 is greater than the expected value of the maximum among the test statistics given by Equation 7-5. This implies that if there is a gross error, then the constraint test based on the test statistics of Equation 7-9 has a greater probability of detecting it than the test based on the statistics of Equation 7-5. If the constraint test statistics are derived using any other linear transformation of the residuals, we can show that they do not possess this property. Thus, the corlstraint test based on the statistics of Equation 7-9 has rnasimal power property ( M P ) .

where 2 are the reconciled estimates obtained using Equation 3-8. Using this solution, the measurement adjustments can also be written as

which, under Ho, follows a multivariate normal distribution: N (0, W), where: w = cov(a) = C A ~ V - 'JC (7-13) The following test statistics,

Exercise 7-3. PI-ove rhat tlze expected ilalltc oj"z:,; is gr-ecrtrr than o?-cqzrill to tlze expected value of zIii, i f a 31-usse~-r-c,rofany lnngr~itrtdcP is yreserr! irl,-uirzti. A/.so /?r~~~>ejhfi)l. lliis cnse tho! t;7e expected \>clueo f ; , i is gr-eater tllun or. eq!cai to z;, ji11-all J. Erznzd this 1-.zult ;o sho~r1hoi ~ ! U.srr~tisiics J givc.11 iy Eqrlcctiorz 7-9 hove n:or-ep~u.erj3f3:. dcrt7cting u g/as,s err-or-in flie consrr-aitirr r/?nfz cozzst~-airztre:; statistics dcrived u.rirzg any ~ t h e litze~ir r transfot-vzafiorzoft/zc co~fstruintresidlrcrls. Hiizt: Tile e,pected ~.ulueo f r in ihis ccisc is be,, where e, is (I ~ ~ c i tvector ;. :c;iilr l)a!zre I in poiition i oi,d ;fro c!.seabrrr lJ,se !/ii.s m a l c.;~ i/ Cll~i~h:i-.ii~~r~: irzeqrralitj~:I J w ;5 ( V ~ ' V ) I :(~W ~ W ) ~ ; ~ . I



known as the rneasrrwmerzf tesr sfrtti.sfic.s,follow a standard ncnnal distribution. N ( 0 , l ) under No.Tamhaile [ l O j has sho.rvn t h a ~for a nonciiagonal covariance mairix C, a vector of iest statistics \triih maximal power for- detecting a single gross en-or is obtained y prernu1:iplyir;: a by C-' which gives


e IJrider No, d ance niatrir;


a's,) nol-mally distributed w ~ i hzero n:ean and a covari-

Mah and Tanihane [61 propoq~~! the following test statistics, For the flow process considered in Example 7-1, the constraint residuals and its covariance matrix were computed. From these, the constraint test statistics can be obtained as [0.687, 3.0052, 1.2657, 1.0161J. The standard normal test criterion at 5% level of significance is 1.96. Thus, only the test for co~lstraintresidual 2 is rejected.

e 184

/~(~r;r!\'~,~-oocriruirorl rrrrd f;/osc

Ermr- C)P?C~.~IOI?

known as the nzaxiil7urll power (MP) t7zea.surenzeizt test, which follows a standard normal distribution, N ( 0 , l ) under Ho . Si~nilarto the constraint test. the measurement test also involves multiple univariate tests. Using similar arguments as before, we can show that the probability of C p e I error will be less than or equal to w., if the test criterion is chosen as Z1-w2, where P is given by Equation 7-7 or 7-8 with ill being replaced by 12, the number of univariate measurement tests.


Exercise 7-4. P r o i ~that zd,j have the maxiinur?~po~vetfotdetectirig a gross error iiz one o f the measurements. Hint: Follo~oa similar pi-oof as zr.sed,fol-solvirlg Exercise 7-3.

Exercise 7-5. .Slloir rllnrfor diccpo~~ul C, zu,j = ~




The Generalized Likelihood Ratio (GLR) Test A fourth test for detecting gross errors in steady-star? processes is the generalized likelihood ratio (GLR) test b a e d on the ~nauirnurnlikelihood ratio principle used in statistics. In contrast to other tests. the formulatioll of this test requires a model of the process in the presence of a zross error. also known as the gross er-inr-~liodel.As shown in the risxt section, thih test can identify different types of z l - 0 5 ~error ior u:hich a S T - O S ~ei-sor medel is provided. The procedure has been illusrratsd for gro.\ srrors caused bl. measurement biases and process !ezks by .'>.arasinttan m c i Mah [13]. The gross error model for 2 bias of unknown magnitadz 5 i n nicasurernent j is giver: by

,. j

E-xercise 7-6. Let ai ( I I I C J a . ?7e tit.0 c o l ~ ~ iofrl!atr-i.r w ~ . ~ A. Ilf'th(.m is J ( 1 c.ol~srai7tc .sl!cll li~trr a i= c a . . sl~owillat z,l,i = cci,j.

surement error covariance matrix is diagonal, thc M P measurement test statistics are also the same. For a 5% level of significance, the standard normal test criterion is 1.96. From these n-e observe ths: the measurement tests for measurements 2, 4, and 6 are rejected. The modified level of significance given by Sidak's inequality (Equation 7-7)is equal to 0.0085. while that based on Bonferroni confidence interval (Equation 7-8) is equal to 0.0083. Corresponding to these modified significance levels. the test criteria are 2.63 15 and 2.6396. respectively. Thus, if we usc the lnoditied levels of significance, only the test for measurement 2 is rejected. Additional examples for GT. NT and X.IT tests are found in Crowe et al. [ l 11 and Tamhane and Mah [ 121.




Example 7-3 F7ro:n the iileas;i~-~d and reco~~ciizd values libteci Ln Tai~le7- !, the messuren1e:it zdjustinci~~s can be cornp~itedas [1.0233, 2.6167, -0.4035. --I .6333, 1.3863. --2.00hjj. The ccw;ir-iance matrix of rl;easure!ilctlt adjusiments is ~ i v e nby


The elements of vector rn,are relativel>-easy to dz!irie when only rota1 flow balances are involved. If the !eak i h from a procev unit i, then only the flow constraint for this unit vector is zffected and ~ h u nl, s is identical to ei. However, if the constraints also irlclude colrlponent balances and energy balances (with precisely knowr, composition and temperature \:alues), then the vector m, can only be defined approxirn:ttely using cr:si-

.1.he nleasuretnent tcst statistics are therefore obtained as [ 1.2533, ?.2017,0.404.2.0001. 1.6983. 2.45773. Hccause, in this example, the rnca-

where e, is a u~lityvzc[or '.I ith valiie 1 ir? po>i!ionj and xi(.) eiszwiicr-c. On the ottler hand, leakage of srtatcrial should be ir,ode!ed r s p x t of the cons:raints. A rnass flow leak in a pmcsss node i oi ilnkno:t~n il?as~?itude t) can be modeled by




Recr,~lc;l~~~lion and Gross Error Defection

The vectors fh are also referred to as gross error sigrzature veclors. 1f we define p as the unknown expected value of r, we can formulate the hypotheses for gross error detection as

neering judgment. A recommendation made by Narasimhan and Mah [13] is to choose the elements of mi as follows: (a) Corresponding to the total mass flow constraint of unit i, m, has a value of unity in the ith position. (b) Corresponding to the energy flow constraint associated with node i, the value of of the ith element in the vector m, can be chosen as the average specific enthalpy of the streams incident to node i. The same can b e applied to a component flow constraint for the node i, by replacing specific enthalpy with concentration. (c) The elements in m, not associated with constraints of node i are chosen to be zero. Pure energy or component flow losses in node i can also be modeled by Equation 7-19 by choosing the corresponding element in mi to be unity and all other elements to be zero. Using the gross errcr models, it is possible to derive the statistical distribution of the constraint residuals under HI, when a gross error either in the measurements or constraints is present. It has already been proved that under Fh,the constraint residuals follow a normal distributicn with zero mean and ccvariacce matrix given by Equatian 7-3. Under HI: the constraint residuals stil! follow a normal distribution with covariance rnatrix given by Equatiot; 7-3, but the expected value depends on the type of gross errcr present. If a gross error due to a bias of magnitude b is present in measurenen:j. !her, we can show that

On the other hand, if a gross erlm due to a pr:?cess ieak i \ present in node i , then we can show that

j i

where tio is the nu11 hypothesis that no gross errors exist and HI is the alternative hypothesis that either a process leak or a measurement bias is present. The alternative hypothesis has two unknowns, b and fk. The parameter b can be any real number and fk can be any vector from the set F, which is given by

where m is the number of nodes or process units, and n is the number of llleasured variables. In order to test the two hypotheses given by Equation 7-24, one can use the !ikelihood ratio test. The likelihood ratio test statistic in our case is given by ?r{r/ti, ] h = sup 7 ~r!r~f&) wherz Fr(rlHo}. PrjrlH,) are the probabi!ities of obtaining residual vector r under Ho and HI hypothesis respectively; the supremum <"sup" in Equation 7-26) is co~npctedover all possible irali~esof t l ~ epanmeters piesent in the hypoth~ses.Using the normal probabi!ity density fulicti~n for r; we can write Equatior~7-26 as exp{-o.s(r

Therefore, when a gross error due to a bias we can write

01 a

process leak is przsent,

h = sup b.fh


b f k ) T ~ - l ( rbfL)] -


Since the expression o n the right hand side of Equation 7-27 is always pl~sitive,we can simplify the calculation by choosing as the test statistic where f k = Aej mi

for a bias in measurement j for a process leak in node i

It can be easily observed by comparing Equations 7-7 and 7-34 that J3, and a, are related by siinilar expressions, with the exponent being equal to the reciprocal of the number of multiple univariate tests being performed as part of the test to detect a gross error.

The computation of T proceeds as follows. For any vector fk we compute the estimate b* of b, which gives the supremuin in Equation 7-28. Thus, we obtain the maximu~nlikelihood estimate

Exercise 7-7. Prove fl~atT,,fi~llnwsa central chi-syunr-e distributiorz wzth orlc degree of ft-eedom.

Substituting b* in Equation 7-28 and denoting the corresponding value of T by Tk we get

Exercise 7-8. Prove tllat the squar-e root ( f t l i e GLR test statistics Ix can be obtairzed using llze linrur tr~aiz.~fo~-r~zatior~ of t!?e cor~sfr-oint residuals F ~ v - 'where ~ , tlze colurizns of rnatri.~F at-e the gross error vectors fL.


Example 7-4 This calculaiion is performi-d for every vector statisric T is therefore ob:ai!led a i

f, in

set F and t!ie test

Let P: bz the vector that leads t(i ill:: suprzninln i l l Bqtiat~on7-33. The test statis:ic T is compared v;itli a pres~eciitedthreshold T,, arid a gr-oss crrol- is detected if T exceeds T,;. \Ve call icterpi-ei Tk as a test statistic for {lie presence o i gross error k. Since T is t!ie rnaxiinum dlnong Th,the G!,R test detects a gross prsor if any of tlie tcst slatistics 'Ih exceeds the criticnl value. Thus the GLK tcst. like tlle n~easureriienttest a~:tf [he constraint test, performs multip!e ui~ivariatetests to derecc a gn)ss error. The distribution of Tk,under H,,. car; be shown to he a central chi-scluare distribution with one degree of freedom. Therefore. in order to maintain the Type I error probability of the GLR test less than or- equal to a given value a. we 8-1.tlie upper 1 - P quantile of the chican choose the test criterion as square distribution with one degree of fl-eedorn, where p is given by






If u.e consider gross eirors caused by n-ieasuremer.: b~a:;esfor the s i n pie flow process used in the preceding examples. the gross error signature vector for a bias in nteasur-ernent i 1s :he ith column of rhe constr~iint matrix which is given ir. Examp!e 7-1. The G!,R tesi statistics coiiiputed frcm the constraint resijilals and its covariance matrix ~ i v e nin Exanlple 7-2 are 11.5705. i0.2:/:)4. 0.244, 4.0017, 2.3313, 6.G1011. It can he :leritied tiia: the G i R test statistics arc the square of the M P measureinen! te;t statistics corrtputed in Exa~nplc7-3. The test criteria at 4% Iziei of signi!'icance and at the two ~iodifiedlevels of sigl~ificafice-Sids and Ronferrani-arc siinply the squai-e of :he standard nonn;~ltest ciiieria, 13.8415, 6.925, 6.96763. respectively. Hence, the GLR tests for measurements 2. 4. and 6 are rejected at the 5% leve! of si~nificance,\v!?i!t. nnly t!le tcst fcr nieasuremer~t2 is rejected at the modifled levels of signitic;alice. If we also wish to tcst for leaks in all the f6ur nodes. then ths siznature vectors fol- these four gross errors are simply the unit vectors. The G i K test statistics for these four gross er-rcrs are given by [1.57(iX, 13.2496. 0.3844, 6.0-1011. The GLR tests for leaks in i d e s 2 and 4 are rejected at the 5% level of significance while the test for a leak in node 2 alone is rejected at the modified levels of significance. I t can be observed thal the GLR test statistic for a leak in ilode 1 (splitter node) is the same as the

test statistic for a bias in measurement 1. This is due to the fact that the gross error signature vectors for these two gross errors are identical. The same observation can be made concerning a leak in node 4 and a bias in nieasurement 6. Comparison of the Power of Basic Gross Error Detection Tests

As described in the preceding sections, several statistical tests have been developed for detecting gross errors in measurements caused by biases in the measuring instruments or gross errors in steady-state conservation constraints due to unknown leaks. In order to obtain the best performance, it is important to apply the test which has the maximum power (the probability of detecting the presence of a gross error when one is actually present) without increasing the probability of T;;: I error (probability of wrongly detecting a gross error when none is present). Thus, an important question that can be asked is which among the above , power for detecting a single gross error in four tests gives t h ~maximum the data. This question has not been adequately addressed so far. Most of the works which conipare the performance of different gross error detection strategies only consider the overall perfoi-~nancewhich inclgdes all tne co:npor,ents of detection. identiiication and multiple error dete~tion,bur does not compare the detection component pa!l of the strategy in isolation. \'.ie provide some results rhat partially ariswer this questior?. In making this conlpariscn. we have to c ~ n s i d e ronly the MP test for :he constraint and measurement tests, besides the global test and the GLK test. We can further sirrplify our task by making use of tl1ecretir:al results that liii\~ebeen derived b!; CI-owe[9! znd Na;.xiirnt~arl [14] to show thzt among the cnnstraint, measurement and GLR iests, the GLR test has the maximum power to detect z sing!e gross error. The proof of this result follows. I !

constraint residuals. If we consider the positive square root of the GLR test statistics (without losp of generality), then we can show that the GLR test statistics are also obtained using a linear transformation of the constraint residuals (see Exercise 7-8). Therefore, the MP constraint test, M P measurement test and the GLR test all derive test statistics based on a linear transformation of the constraint residuals. The question can then be posed as to which linear transformation of the constraint residuals gives the most powerful tests. In order to answer this question, we can consider an arbitrary linear transfonnation of the constraint residuals given by

Let a gross eiror nf vagnitude b either due to a measurement bias or a leak be present with corresponding gross error vector fk. Then using Equation 7-22 the expected value of the transformed constraint residuals is obtained as

The covariance matrix of the transformed constrair~edresiduals is ziven by

A tesi can then be devised based on the ti-ansfcrm-d ccnsrl-ained residuals with test statistics given by

Equation 7-38 can also be written as

!Je:r,mz 7-1: The GI,R test is more powerful than an MP measurement test or an MP constraint :est, based on any singular or ncnsingular linear transfonnarion of the constraint residuals for detecting a sirlgie gross error-.

It can be observed from Equations 7-12, 7-15, and 7-17 that the MP ~nlcasure~nent test statistics art' obtained using a linear transformation of

It can be pasily verified that by choosing Y to be V-', A1'V-I, or FTV-' (where F is a matrix of vectors fk defined by Equation 7-23), recpectively, the MP constraint test, the MP measurement test or the CLR test sta-


Llrriu Kccr~iiciliariorrorid GI,,

Error D,-rai.:ioir

tistics are obtained. In order to prove that the GLR test has the maximurn power, we have to prove that the maximum among the expected values of the GLR test statistics is greater than or equal to the expected value of any of the test statistics given by Equation 7-39. For this, we first prove that the maximum among the expected values of the GI,R test statistics is attained by Tk, that is,

where, from Equations 7-30 and 7-35 through 7-37 with Y = FTV-I, the are respectively expected values of fland


The above resu!ts can be easi!y established using the Cauchy-Schwal-tz. inequaliiy.

and identifying the vectors v and w to be

Since the MP constraint and MP measurements tests are obtained using a particular linear transformation of the constraint residuals, based on the above results we can claim that on an average we can expect the GLR test to give higher power for detecting the presence of a single gross error than either the MP constraint test or MP measurement test. If we assume that only gross errors due to measuren~entbiases can be present in the system, then the GLR test becomes identical to the MP measurement test. On the other hand, if we assume that only gross errors which affe,: only one constraint (for example. leaks in overall flow balance constraints) can be present, then the GI,R test becomes identical to the MF constraint test. However, if we allow for both types of gross errors to be present in the system, then the GI,R test is more powerful than either of the other two tests. It should be cautio!~ed,though, that an itzlplicit assumption has been made that we preciseiy know the gross error vectors f , for the different types of gross errors which can occur i r ~the process. This assumption may not be valici if there are ur;czrtainrics in :he distribe ution model or gross error model. Morecver, these results a ~ - valid if we assulne that, at most, one gross error is presznt.

acd by defining vectors v and w given by v = Rf,. and \V = RfL where

(7-44) Exercise 7-_0.PI-ove:I;rrr Mi'

/~ZEO~LOP~H ;ex; < ~ static:ic ~IT vt~t'retzoizlj gt-oss erwt-s irz t?l~~n.~mt-~t71el7:.s trr-e n!lalt.eif.

K is a niatrix such that K7R= \Ti.

rd,,= (7))'''

r - 2 E[z;] over all 1, first \te need to define Ir! order to prov2 E[vTk] Elz;!. which according to Equattons 7-39, 7-36. 2nd 7-37 1s

;;= (7;)"' hen olllv gross errors in c-oi1.str-(1i17r.s lvl7ich offerr a sirzgle colzsrrcrit7: are allow,ed. Hkrt: The XI-oss~ r m ~r ' e r f o,for ~ - these ~qpes of gr-oss errors ntr e i . Exercise 7-10. Prove tl~utMP ctv7.rrt-ai17itext .staristic

Then, we can again make use of the Cauchy-Sch\\artz inequality by defining matrices

R?'R = V - I

and P = YR-'

(7- 46)

1 I 1

'f $4

~~r~~ Kaizllici!ii~rro,i a,,d Gnu, Error I1i~rerlii)tr

It is now only necessary to examine whether the global test or the GLR test gives higher power to detect the presence of a single gross enor. We note from Equation 7-4 that the global test is also based on the constraint residuals and thus it uses the same infomiation as the GLR test for detecting gross errors. But there is a fundamental differenc-,in the manner in which this information is processed. The global test performs a single multivariate test for detecting a gross error, whereas the GLR test performs multiple univariate tests, one for each possible gross error hypothesized for detecting if any one of them is present. The question is which one of these processing schemes gives a higher power. This problem has also been studied in the statistical literature. Unfortunately. it is difficult to obtain a unique answer to this question theoretically. We can perform sirnulation studies of selected processes to evaluate the power of the two tests. Before attempting such a comparison, however, it must be ensured that both these tests give the same Type I error probability. This implies that the criterion for each test has to be chosen to give a specified value of the Type I error probability. This is possible only in the case of the global Test. As explained before, since the GLR test pcrfol-ms multiple univariate Tests, the test criterion ha; to be chosen by trial and en-01. using simulation. The result> obtai~icdthrough such simula~ioncan at best he uszd t ~ )make sor;:2 broad conclusions and str-ictly cannot be generalized t c ~all processes. I t is seen frorn the above discl~ssion,that one lest does r.ot hzve uniforrniy higher power than the other. However. u.s recornmend that for the nrle 01-lnore g~-0.7s21-r-ors01-e!)r-e.~r~lf. the purpose of detecting ~,ht>t17er g!obal test ( S T ) shouId be uszd. This reiom~nendatioi~ is based on the fcliowing collsideratit>n~: (i) The cornputatio~rof the GT ctatistic is more efficient since a single

:est statistic is computed. ( 2 ) In practice to instill confidence among process operators, it is necessary to keep the 5!cc ?!sm, ?r~hahilityhe!ow :1 s7rcitied li~nit.The test criterion for GT can be chosen to precisely obtain this allowable limit. Note that any higher value of the test criterion can satisfy this linut but will result in a lower power. In the ca5e of the GLR test. the lowest test criterion value that satisfies this limit can be chosen only through simulation. 13) The GLR test requires knowledge about the gross error vectors for the different gross errors that can occur in the process for obtaining the test statistics. The global test does not require any information

Irr:rod~rcriorrro C;ro.r.~Error Derecrio~z


regarding the gross errors for the purpose of detection. This may be an important practical consideration since complete knowledge regarding all possible gross errors that can occur in a process is not generally available. It may be argued that G T is inferior since it can only detect the presence or absence of gross errors but for identifying the nature and location of gross error an identification strategy is required. (In the case of the GLR test, the test statistics can be directly used for identification as described in the next section.) This argument is shown to be without merit from two considerations: (1) The use of GT for detection, does not preclude the use of the GLR test statistics for identifying the type and location of the gross error. In this case, it is necessary to construct the GLR test statistics only if a gross error is detected by the global test. (2) It is demonstrated in the next section, that the identification strategy inherent in the GLR test is the standard serial eliruination techliique that was proposed and first used by Ripps [ I ] in combination wi:h the giobal test for i-dentifyicg measurements biases. Thus. the GL.R test for detecting and identifying a single gross error can be viewed as a gross error detection strategy which has as one of its components the GT for-detec~ionand serial eliniination foi identification.

GROSS ERROR DETECTION USCNG PRlNClPAP COMPONENT (PC) TESTS The variance-covariance rnatriccs of constraint rzsiclua!~ V and oT measurements adjustments or W) arc always dense. This implics that even if measurements are independent or weakly correiated, the rsconciled data are always strongly correlated. The reconciled values and, hence, thf, rnrrcur-mpnr ~ ~ + j u c f m ennr pf ~rnrrrlated because they are related to each other via the process model. The same is true for t!le constraint residuals. However, not all basic tests exploit the entire information contained in or W. The non-MP constraint test and the univariate meamatrices V, surement test (MP or non-MP) described earlier in this chapter use only the diagonal terms of matrices V, W,or W, respectively. Alten~atively, the principaj component tests use the entire matrices. It is expected that



Darn Eer-o,zcilinriori nt7d Glrr.~.~ Error Deter-,tor!

such tests will be able to detect more subtle gross errors slnce they are multivariate tests. It was found that multivariate tests such as the global test often detect gross errors that are not detected by the univariate tests. This aspect is very importanr, because failure to detect ail gross errors could result in an unsuccessful data reconciliation (the reconciled solution is infeasible or questionable). Principal component tests are related to the univariate constraint and measurement tests, because they use a linear transformation of the constraint or measurement residual vectors. The following is a brief description of the principal component tests as given in Tong and Crowe [15]. As with the previous tests, we restrict this analysis to linear models with no unmeasured variables. The case with unmeasured variables can be handled with a projection matrix. Two basic types of principal component tests can be derived as follows:

Exercise 7-11. u r

- N (0, V ) ,show :Izat p, - N (0,Ij.

On the other hand, Equations 7-48 and 7-49 can be combined and rewritten as

which means that the residual vector r can be uniquely recons:ructed fro111 its principal componcnts if all of the principal components are retained, that is, p, E !fIm, where r?i is the number of eqc::':,-.ts (balance residuals). However, if fewer than nl principal co~nponentsare retained, we get

Principal Component Tests for Residuals 0 4 Process Constraints

Let us consider a set of linear combinations of vector r

1% here

the colunin5 of I T * , are the eigcnvectc>rs c f V, satibfying

Matrix A, is diago~al,concisting of rhe eigenvalues of V, A,,,1 = :...q. on its diagonal and satis5es

Macrix U, ccnsists of !he oflhonoiir,alized eigenvectors of V. so that

The vector p, consists of pr-ir~cij~ul con7~)onerir.sof cc?rzstruint yes-iduals, and its elements are pr-incipai cci~t~po~ie~lr scores. If gross errors are not present, then r N(0, V) and it can be shown that p, - N(0, I). Therefore, a set of correlated variables, r, is transformed into a new set of uncosselated variables, p,. The principal components are nuinbered in descending order of the rllagnitudes of the corresponding eigenvalues.


~ i t hpr E !)I k . and k < rn. Equati(>i~ 7-54 is rek:rcd to as the !,,-ir;u;~al cor?z!x~r~erzr rlloile/ of vector r. Equation 7-53 indica1.e~tiut thc residuals in the V ~ C ~ O rI -can be decomposed into the cof~:ributions frarn the principal components tenn, and the rcsidl;als of the prir~cipalcoinponen: niodei. r-i. Tltis r:icans that for ~ r o s serro:- detection. irrskrld cf usins sta~isticalrests for r, we c a i ~perf01711hyp::tl~esis testing oil p, and r-2. Since each eletnzilt of vector p, 1s distributed as a standard nol-mcl variat;le. a detection rule sin~ilarto the univariate constri~in:rest can be used and the test for ccilstraint residual i is rejected if pLiexcecds Zi p2. Similar to the univariatc tests, to jiinit the Typc I err-or to level a, J3 cer? be chosen as in Equatioi~7-7 whe:-e thz exponent in this eqnation is replaced by the number of retained principal components, k. Principal Component Tests on Measurement Adjustments

Similar to the principal component test statistics based on constraint residuals. pr-irzcipal mrupor7rrlt ti7ensrr:ernerzi test sttrtisiics can be defined as

where the columns of W, are the eigenvectors of W and k is the number of retained principal components. In general, k < n, where n is the number of measurements.

Exercise 7-12. Ifa

The collective principal component test in Exercise 7-1 3 is called ~ I - U I Z cured clzi-squat-e test if all the principal component\ are not retained. Another important collective test statistic is defined by

known as the Q statistic or the squared prediction error, and sornet~mes, the Rao-statistic. It can be shown that Q, is a weighted sum of squares of the last m-k principal components.

- N (0, W),show tlzatp, - N (0, I).


If gross errors are not present, then it can be shown that pa N (0, I); therefore, the principal components on measurement adjustments are a l w not correlated. Similar to the measurement test, we can conduct a test on every p, by comparing it againbt a threshold Z,


Relationship between Principal Component Tests and Other Statistical Tests

Thtt principal cornpone~ttrests art: also based on a lilreai transfor!nation of the constraint residuals 3s in Equation 7-35. Ir can be vei-ifid that the transfor~lationmatrix uhcd [or deriving principal component constraint test is Y = W: and for the principal component meassremen1 tzst. the transformation matrix is Y = u':A~v-'.Tong and Crowe 1151 iiriplied that, since the n31nber of retained principd componenis is usuaily iess than the number of ;2rincipal cornpor,ents, she modified level of r component tests are smaller. significance f ~ principal Therefore, we expect in general to reduce the o.~erallType 1 rn-or in detecting gross errors by the principal component test. But this algurnent is without merit, because the Type I en-or probability for any test can always be reduced by simply choosing a smallel- value of a. Furthermore, because principal con1poi:ent tests d o not directly identify the gross error it is possible that the strategy used for identifying the gross crror commits additional Type I errors. Some later examples in this chapter illustrate this problem. Analogous to the global test, a collective global test based on principal components can also be proposed for which the test statistic is defined by

The two quantities, yk and Q,, are complementary. The former examines the retained and the latter examines the unretained principd 1 COmPOnent collectively; yk accounts for the amount of variance explained by the principal compo~lentmodel, while Q, accounts for the arnount of the variance unexplained. Tests bascd on these quantities can be conducted to examine whether a gross error is presenr in the retained or u ~ r g a i n e d principal components. For more inforn?ation on coilecti\ie pri~cipalcomponen: tests, see Tong and Ciowe [15]. As previously stated, a major difference betwem thc i~nivariatetests and the rnu!tivariate chi-square tests is that the former does not take the correlation arilocg the residua!^ into account and henc:: teilds to be lcss reliable when correlation increases. Howevc~;the GLR (or MP messurement test) a i d MP constraint :es!, does inccrporate the correlntlon by transforming the residuals using tlte inverse of the covariance matrix. This leads to a maxi~iurnpower for correct!? detecting s gl-oss error. over all other tests, but only when there is a single gross error. When rnulti?le yross errors are present, these tests no longer possess the maximum power. Tong and Crowe 1151 indicated that the maltivariate principal component tests not only provide better detection to subtle gross errors, but also have more power to correctly identify the variables in error over other tests. Again, this statement was not generally confirmed by an extensive conlparison study [Ih].

Using these transformatiorls all the statistical tests can be derived as described below. If unmeasured variables exist, the constraint model is described by

Exercise 7-13. Collective Prirzcipal Cornponeill x2 Tests. Similczr to the global test (Equutior7 7-4)deliise a test usir~gtl~c.statistic defirled by Equatiorz 7-56, wllich it7clrrdes the contrihritiotz of the k retailzed principal compot~entsof fhc balance residual vecfor r.

Axx + AUu= c

where x: n x 1 is the vector of measured variables and u: p x 1 is the vector of unmeasured variables and A, is assumed to be of full column rank, p. As shown in Chapter 3 , the unnieasured variables can be eliminated by pre-multiplying the constrai~itsby a projection matrix P: (m-p) x 171 of rank m-p, where m is the number of constraints, to give the reduced constraints:

Example 7-5 Let us apply the principal component tests based on measuremelit residuals to the process considered in the preceding examples. The nonzero eigenvalues of matrix W computed in Example 7-3 are all unity. The matrix U, whose colulnns are the corresponding normalized eigenvectors is given by


-0.7475 0.1 1517 -0.0067 0.4077 -0.688 1 0.0500 0.3843 -0.4449 -0.5698 U, = -0.0157 0.4950 -0.601 1 0.0067 0.2523 0.31 88 9.3564 0.0773 0.5578

PA,x = PC

(7 - 60)

The c o ~ ~ s t r a i nresiduals t for the reduced constraints call be defined exactly analogous to Equation 7-2: e

0.0287-0.4232 0.2554 -6.0597 -0.7382 0.4542J

%t;r principal coniponents were retained in ihis example. The p~incipal compoiients are coinputeii as 1-0.53 17, -2. t 18 1 , 0.2422. -3.0 1871. At 5% level cf significance, the tests for p~iiicipalccmponen:~2 and 4 are ~ejecied,while ai the niodified level> of sizr,ificance only the test for the last principal coti?ponerir is rejecfed.

(7 - 59)


P = P(A,Y-C)


One can show that the variance-covariance mail-ix of vector p is:






cov(p j = P A , Z ( P ~ ~ , ) '

(7 - 6 2 )


Exercise 7-11. irsirzg !he rule c?f litzcai-trar~sfnnizaiioiz.~ i~z ~nultil.ariatestatistics, pmve t!wr the r~dziced~017r.t1uil~t 1-e.sid~iu1s f o l l o ~G~nunnrzl ~ dirt ribuiior? uiih cu~uriii~ice n1urri.r dr
STATISTICAL TESTS F O R GENERAL STEADY-STATE M O D E L S In the precedir;g sections. the different statistical tests for detecting gross errors were described for the simplest case when all the vrlriables are measured directly. I11 general. unmeasured variables may be prcsent, and the nleasurellients may be indirec:ly related to the variables. Narasimhan and Mall [I71 described sirnple tra~isforrnationsby which the general steadystate models can be converted to the above simple steady-state model.

The statistics of the global, constraint, and measurement tests can be obtained by using PA,. p, and V,, respectively, for A, r. and V iil the appropriate equations. For deriving the GLK test statistics, we note that the gross error signatare vectors for biases and leaks are also transformed due to the use of the projection matrix. These trarisfbrmed si,atiature vectors are given by

I~irrodrrctionf~ Gross Err<>,Detrcfjor2

f,, = Pf, where fk are given by Equation 7-23. The GLR test statistics are now obtained using Equations 7-30 to 7-33 by substituting fpk,Vp, and p, for fL.V, and r, respectively. It can be proved that the GLR test gives MP test statistics for detecting single gross errors even when unmeasured variables are present. 111 some cases, the measurements may not be directly related to the variables as in Equation 3-1. An example of this was given in Chapter 2, where the pressure drop measurement is related to the square of the flow rate variable. Another example is the relationsh~pbetween a pH measurement and the concentration of hydrogen ions and perhaps temperature of the process. These relationships are typically nonlinear, but for simplicity we represent them by the following linear equations.

Let us assume that the constraints are given by AX = c

(7- 65)

We define artificial variables x,, as X, = Ds

Then Equation 7-64 beco!ncs

Eq~lztions7-65 and 7-66 can be jointly wrntten as

Equaiians 7-67 and 7-68 represent an equivalent alternative mudel of the process in which the variables x are like "unmeasured variables" and variables x, are like directly measured variables. Therefore, the method described for treating unmeasured variabIes can now be used to derive the statistics for all tests. The technique described below can be applied even when the measurernents are related to variables by nonlinear equations. However, the


resulting modified constraint equations will be nonlinear and nonlinear data reconciliation and gross error detection techniqdes have to be used to solve this process.

TECHNIQUES FOR SINGLE GROSS ERROR IDENTIFICATIOK The second component of a gross error detection strategy deals with the problem of correctly identifying the type and location of a gross erroIwhich is detected by a test. It should be noted that the identification prob!ern arises only if the detection test rejects the cull hypothesis. Not all detection tests described in the preceding secrioc are designed to disiinguish betweer: different gross error types. Only the GLR test is suitabic far- distinguishing between different types of gross errors, because i t also uses info~nationregardicg the effect of each type of gross ermr on the process mode!. In order to conpare the different techniques developed ic conjunction with the. different tests for gross error idecti5ca:ion and obtarn a good an5erstanding of the interrela~ionships,we initially restnc! our consideration to gross errers caused by biases in measure~nents.In this section, \xre alsa consider only tl:e problem of identifyi~ga single gross error in the measurement. In this case, the idectif~cationproblem reduces :o sinlply identifying correctly the measurement which contains the ,oross error. The techniques for identifying the measarernent contairling :he gross error can be 2 simple rule or a coillplex strategy, depending on the test that is used. The measurement test and the GLR test, by virtue of the manner in which they derive the test statistics, use a simple rule to identify the gross error. It has already been pointed out in the preceding section that the GLR test and the MP measurement tests are identical if we restrict our consideraticn to gross errors caused by measurement biases only. In this case, there is a test statistic corresponding to each measurement. This identification rule used in these tests can be stated as follows:

Zderttify the gross error in tlte rneasurerne~ztthat corresporzds to the rnaxi~~zum test statistic exceeding the test criteriotz. Because of the simplicity of the above rule, it is commonly stated that the measurement test or GLR test does not require a separate strategy for identifying gross errors. We have deliberately chosen to refer to the above rule as the identification component of these tests because \ve demonstrate later in this section that this rule is equivalent to the serial elimination strategy used in conjunction with the global test for identifying a gross error. It should also be noted that if we expect other types of gross errors to occur, such as leaks, then the GLR test which constnlcts a Dross error. test statistic corresponding to each type and location of a , simply extends the above rule to identify the gross el-ror which corresponds to the iliaximum test statistic [1 I].

It can readily be verified that the reduction in objective function value due to elimiriation of measurement i is equal to the the GLR test statistic (or square of the measurernent test statistic) for variable i. This implies that if the tvle used in conjunction wiih the global test is to identify the gross error in that measurement which gives niaximu~llAJ,, then this is precisely the same mle used in the GLR test (or MP measurement test) for identifying the gross error in the rneasurernent corresponding to the maximum test statistic. In other words, the global test in combination with serial elimination strategy is equivalent to the GLR test.

Serial Elimination Strategy for identifying a Single Gross Error

If we use the zlobal test for detecting the presence of gross errors. a comparritiizely inole complex strategy has to he applicd to idetitity the tueasur.2ment con!aining :he gross error. Kipps [I j tird octlined a procedure which was later studied 311d refified by Serrh and Heer~an{I81and Rosenherg et 31. (191. This procedure is known as the .reriai ~!'iitzinufiiftl ~IY)c.c~LI!-~. In the serial elir~iinationprocedure, each tneasurerneni is deleted in iur:i and the ziobal test staristic is recomputed By elirni1ia:ing a me:isureme:it. we make the corrp-spondirig ~rarihhleu~lmeasr-licd.Hence. !he ~ l o b a ilest statistic has to be recomputed usins the reduced consrraint residuals as explained in the precedkg section. Due to the incresse in the number of unmeasured variables. the ol?jective function value. an3 thus the global test statistic, will decrease. Ripps 111 su,n,nested that the pross error can hc identified in that nieasurernent whose deletion leads to the greatest reduction in the objective function value. Although, this strategy is called the serial elimination, it should strictly be called the Iileasureinent elimination. Only in the context of multiple gross error detection described in the next chapter does the imp!ication of serial eliniinalion become clear. Instead [of rgpeatedly solving the data reconciliation problem or cornputi~igthe projection matrices for deletin: each measurernent in turn. Crowe [20] derived simplified expressions for the reduction in the data I-ecuncilia!ion objective fi~ticrionn l u e due io the deletion o f a ~ne;l.;ilrennenti. it is $\,en 1 3

l is obtaiiied by interpc-eting [he Another interesting afid u s e f ~ ~result principie i~ivolve~i in the GLR tes: from the viewpoint oi'data reconci!i:ttioi? objective function value. If we consider Eqiratior, 7-28, which defices thz GLii test statistic, the two terms withir? parentheses on thr RHS of this equation can bz interpreted as the optiinal objective vaiue of data rr,ccnciliaticn proble~ns.Wr, have already ;lot-d that the f11-stterm in this expression is the optimal objective futiction value ~ O I -~ h cstandard tiara reconciliation problem (see Exercise 7-1). The sccor~cite-ni is :he optimal objective functior: value cf the followirig rcccnci!iation proble~ii in which the estimate of the gross error in measurement r is also ob:airred as part of the solution.

Problem PI:

subject to

Ax = c

Thus, the GLR test statistic can alsc? be interpreted as the maximum diffef-encebetween the optiilial objective function values of' data reconcili::


Drirn Rcc o,rci/!nrrori o11d Gro.\> ern,^- L):,I
tion under the assu~llptionthat there is no gross error, and (he opti~nal ob-jective function values obtained by solving Problem PI. It should be noted that in Problem P1, all the measurements are retained even if measurernenr i is assumed to contain a gross error; instead of eliminating measurement i, all measurements are used to obtain an estimate of the gross error. Based on this interpretation and the result given by Equation 7-69, it can be concluded that the optimal objective function values are equal, whether we choose to eliminate the measurement i (hypothesized to contain a gross error), or we choose to retain it and obtain an estimate of the gross error. This result is used in the following chapter to establish equivalence between different multiple gross error identification strategies. Similar to the global test, the co~istrainttest does not completely idelltify the type or location of the gross error. Although more informative, since it identifies the node (or equation) in gross error, the nodal test also requires additional identification. Mah et al. [21] developed an algorithm (for a mass flow network case) for identifying the measurernznts contributing to nodal imbalances. If no measurements arc found in error. any si~nificantnode imbalance is attributed to a leak or a nod el en-or. A m:ijor proh!em with the nodal tes: is the possibility of error cancellarions hich makes i t difficult 10 find the right location of the gross error. Such :erhriiques arc: described in the next chapter.

/( j

Table 7-2 Reduction in GT Statistic for Deletion of Different Measurement Mwsurement Eliminated

Reduction in GT Statistic

Since the maximum reduction of the global test statistic is obtained when measurement 2 is eliminated, the measurement elimination procedure along with the GT identified the gross error correctly. This is not surprising bzcausz this procedure is similar to the use of the GLR test. It can be verified that :he reduction in the G T statistic due to measurement e!imination is identical to the GLR test statistics computed in Example 7-4.

Identifying a Single Gross Error by Principal Component Tests



tistic corresponds to a bias in measurement 2. The global test also rejects the null hypothesis and we can use the measurement elimination strategy to identify the location of the gross error. Table 7-2 shows the reduction in the global test statistic when different measurements are eliminated.

Exercise 7-17. P t ~ 1 . eriziit tirr, oprir,:ul ~;l?+ctivc$rrzcriorlixaice q f PI-~DIP PI~ Iis t-qua1t(j the opiiiuill ol?jecti~~e~fi~rzc;iorz vulue c h i a i ~ z ~b~, d.rci~~i!ig ilie diitii r:co~~ciliotior~ pml,lorr ill :i-li!cll i;!ecis~~reilzerit i is elirllitln~rcialld rile c-or-~-esj,onr!iq vut-iuhle is tr?(!trd c!.y L ~ ! ~ ; , ~ ~ ~ ~ , ~ ~ ~ ( , .


We can identify the constraints in gross error by Ifispecting the cantshutron from the ith ~esidunlin r, r:, to a suspect principal component. say, p,,, which can be calculated by

where \vLi is the ith column vector ot rnatrlx \v, Fro111 the results of Examples 7-3 and 7-4. we &serve that if we choose to identify a gross error in the measurement corresponding to the maximum test statistic. then the gross error in measurelnent 2 is correctly identified by both the MP ~neasurementtest and the GIJK test. In fact, even if we consider gross en-ors due to leaks, the maximum GLR test sta-

Let us define g =(gl, . . . , g J T , and let g' be the same as g except that its elements are sorted in descending order of their absolute values. In general, the contributions of different residual\ to the suspect principal con~ponentare different and are dominated by the first few elements. These are the major contributors to the suspect principal component. The




~ ~ ~ ~ ~ ~ iN / Ii GWTS o~ r i Error o ~ l ZICICCIIOII

major contributors are directly related to the constraints that should also be suspected. The number of major contributors, k, can be set so that

where E , is a prescribed tolerance such as 0.1 Note that since the signs of these contributions can be either plus or ~ r, the cancellation minus, as can the signs of the elements of w , and effect among the elements of g' should be taken into account in identifying the suspect constraints. This is done in Equation 7-7 1. Similar to the nodal test, the principal component test 011balance residuals only indicate:; which of the constraint residuals are major contributors to the suspect principal component. An additional strategy is required for idcntifying the source of the error (leak or measurement bias) and which of the nteasurernents concains gross errors. We can. however, always use a p:incipa1 componeDt tzs: en nielrsur-ement adjustmenls in order to identify a measurement in gross error. Thai can be done by inspecting the contributim fr~!:~ the jth adjustment in a, saj-,a;. to a suspect principal co~nponen!i. The jth adjus[!ner?t contr.ibntion can be calculated by

where \rr,,i is the iih eigenvcctor of W, dnd n is :he tot21 r i u ~ ~ i b eofr nieasrilemenis. We can srudy th? contributions by checking th2 signs 31id niagnitudes of the elemellts in g. In general, as with the principal ccxnponent test for balance rzsitiuais, the contributions vary and are dominated by a few elements. The identification rule for the principal component measurement test is the f;::owi;,gZdentifL the gross error in the measurement that correspo~zds to the major contributor to the maxi~numprincipal coazponent exceeding the test criterion.

I ~ ~ / r ( ~ d ~ t10( . Grg)c\ r i ( v ~ Et I O I -

Example 7-7


/ \

In order to identify the gross error using PC tests, we have to examine the contributions to the rejected principal component. Let us consider only the last principal component which is rejected at the modified level of significance (see Example 7-6). The contributors (measurement adjustments) to this principal component can be analyzed by computing the vector g (Equation 7-72). This vector is given by 10.0293, -1.1073, -0.1030. 0.0975, -1.0237, -0.91 141. The major contributor to the suspect principal component is measurement adjustment 2, and therefore a gross error is identified in this measurement. Tong and Crowe 1151 canied out extensive analysis of the pr-incipal component tests and outlined some practical guidelines for implementing a gross error detection and identification strategy for these tests. Most of their recommendations. such as making use of collective X' tests first. using accurate variance-covariance matrix of measuremcnt errors and the proper error distribution are valid for all stra~egiesin\~olvingunivariate tests. In the end, they recommend that the PC tests should be used in cornbination with other statistical tests, since there is no guarantee that such tests will deteci all grcss errors. They also \xlarn the user about the increased computational time irl ca!culatins the elgenvaIues arid eigenvectors. an3 also cor:tribution analysis of thc PC Cesi statistics f3r gross error ideritificatior,. The bocton! line is that the PC tests are effective in certain situations, but they are not generaily superior ro lhc basic szatistical tesrs described in this chapter.

DETECTkBfLlTY AND IDENTiFIABILITY OF GROSS ERRORS Mi's close this chapter with the discussion of two important questions in gruss error- detection. The first is whether it is possib!~ro detect gross errors irt aii measurements and whether gross errors in two or- more measurements can he distinguished from each other. T h e concepts of detectability proposed by Madron [21],and idcntifiability discussed by different researchers [23,24, 251, are used to answer these questions.

Detectability o f Gross Errors

Similar to data reconciliation. an essential prerequisite for gross error detection is redundancy in measurements. Theoretically, it is possible to detect gross errors only in redundant measurements. A gross error irz a ,Ionredundant measurenzent cannot be detected. This is due to the fact that a nonredundant measurement is eliminated along with unmeasured variables and does not participate in the reduced reconciliation problem. Hence, no test statistic can be derived for a nonredundant measurement and a gross error in such a measurement cannot be detected. In Chapter 3, methods were presented for observability and redundancy classification of process variables. Only redundant measurements are adjusted by reconciliation and observable unmeasured variables can be estimated. By adding sensors to measure new variables or by including additional cocstraints (if available). it is possible to eliminate unobservability and nonredundancy. Both matrix and graphical approaches can be used for observability and redundancy classification. In practice, however, inany redundant variables behave as nonredundant ones. We refer to such measurements ~.S. e: al. j221. Crou'e as pl-ncticnllj I I O I Z I . C ~ I ~ I ZI ?~I C~C~I .ITZU~I ~ ~ ~ I I C ' I IIordzche [ X ] ,Madroll [22]. and Charpsntier et al. 1251 ha;,e reported difficulties in the reconciliation and gross error detectioil in such measuremznts. Similarly, even if some unmeasured varisbjes are observable, their estimates may have such high standard deviations that they may be considered as pmcttca!ly unobsennDle \la:-inSler. If a nieasure~nentof a redundact vzriable co!l!air~s a gross en-cr. then data recor:ciliation should theoretically inake a large adjustment to this n~easurementin order to obtain an xiirnate as close as possible to the true va!ae of the vaiiable. In some cases, however, dye to the nature of the constraints and the standard deviations of variables, I-econciliation rnay make an insignificant adjustment to the emoneous redundant mea~.lre!?el~'2nd instead make sdjustlnents to other fault-free measurz!nentu to satisfy the constraints. Such a ~neasurerncntis not trtlly redundant even if it is classified as redundant theoretically, and it is difficult to identify the gross error in such measurements. Madron 1221 defines a practically redundant rnezsut-ement as one whose adjusrabilir\: is greater than a selected thl-eshold value. This condition is expressed analytically as

where a, is the adjustability, u?]is the standard deviation of the reconciled value i , and oVlis the standard deviation of the ~neasurementerror. The critical limit qris a value from interval (0,l). For example, if qris chosen as 0.1, all measurements i having a, <0.1 are considered practically >O.9, and. therefore, the nonredundant. For such measurements uR,/oy, adjustment made to the meacured value is insignificant. The adjustability a , is also a measure of the improvement in the accuracy of a measured value that can be achieved through data reconciliation. Charpentier et al. 1251 suggested using the ratio

for identifying the measurernents with weak redundancy. This factor is a measure of detectability of an error. Since constraint imbalances indicate the existence of gross errors. the detectability of a gross error depends on its contribution to the imbalances of constraints. The contribution of a measuremefit in a constraint residua! depends 011 the prac-ess cofistraint and relative accuracy of the measurernents (relative standard deviations). The contiibucion of an error to tile cotlstraint imbalances is proportic?nal to the detectability factcr. The !argei- the deteciability facto~;rhe more likely is the gross error to be detected. This ~ 1 s oimplies ihat if t!le delectability factor dl is !argc, !hen grcss e m r s of small magnitudes in ihe coxesponding measuremenis can br; detected relatively easiiy. A complete practical redundancy analysis is usefu! !il identifying a!i measured variables with weak redundancy. For linear processes, ?he staw dard deviarion of reccnciled estimates can be coniputed analytically as described in Chaptsr 3 and adjustability or detectability measures can be computed. For nonlinear proble~ns,however, these measures can be computed only after solving the reconciliation probleril for a given set of measurements and by linearizing the constraints around the reconciled estimztes. Equation 2-13 can be used to calculate the standard deviation of a reconciled value by a summation rule, as explained in Chapel- 2. Simulation studies have also been conducted and variables with the t j l lowing characteristics have been identified as prac~icallynonredundant:

Variables with relatively small standard deviations, in comparison with the standard deviation of the other measurements belonging to the same balance. This is usually the case with the measurements whose order of lnagnitude is also small relative to the other variables in the same balance equation (for instance, flows of small streams that appear in balances with flows of large streams). The required ratio errodstandard deviation for gross error detection is much larger for the variables with small standard deviation than for those with large standard deviation [23]. Parallel streams (for instance, outlet flows from a splitter that are not constrained by any other balance 1231). Flows that appear in the enthalpy balance, but not in mass balance [25]. These are typically pumparound flows that are used in the main column enthalpy balance and associated heat exchanger balances, but they are not included in the tower mash Lalance. An overall mass balance using rneasured feed and product rates for the entire fractionation is usually chosen in order to avoid the large number of unmeasured flows around the column itself. Temperatures of small streams in the same balance with teimperatures of large streams (even if the order of magnitude and standard deviation of such temperatures is similar). Inlet ternperattire of the first heat exchanger in the preheat irain [25:.It usually appears in ~jnlyone enthalpy balance, while the fo1lo.hing tzmperatures enjoy extra redil~dancyby being part of at least two heat balances. = hleasut-ed variables that appear in only one equation with an unmeasured variabie which is not corlstrained by any other balance ecluatior! or bounded. A gross error ir, t l i ~rn2a:;ured vaiiable i s t~suallytransfer-red ro the iinmeasured variable which has lnorc freedom to adjust. There is no simple solu~ionfor the data reconciliation and gross error detection in such variables. Extra constraints and extra instrumentation would certainly help. but that is not always possible. Sometimes artificial "measured" variables can be created from cslculated values in order to enhance redundancy 125, 261. The information about weak redundancy points can bc provided to the users of a particular data reconciliation package in order to enable then] to recognize the limitations in accuracy of the gross error detection methods and trigger decisions for improvin: their instrumentation.

Knowledge of practical variable classification is important information that can be included in the gross error detection algorith~ns.For instance, the detectability factor of a gross error can be used as a tie breaker when more than one measurement share the same value of the statistical test [16].

Example 7-8 For the process considered in the preceding examples, the covariance matrix of llleasurement errors is taken as the identity matrix. The covariance matrix for the estimates of all these variables can be computed using Equation 3-10. The diagonal elements of this matrix are the variances for the estimates. For this process, the variances for all the estimates turn out to be equal to 0.3333. Using these values in Equations 7-73 and 7-74, we obtain adjustability to be equal to 0.4226 and detectability to be 0.8165 for all variables. This implies that gross errors in all measurements have equal chance of being detected. On the other hand, if we take the true values of flow variables to be 1100, 99, 1, 99, 1 , 1001 and assume the measurement error standard deviations to be 1% of the true values, then the adjustability and detectability values for different measurements are given in Table 7-3. Table 7-3 Adjustability and Detec:abiii?y Values for Process in Figure 1-2 Measurement

Error Variailce



From the results given in the above table, we can conclude that it IS relatively more difficult to identify grvas elrors in the measurements of streams 3 and 5 compared to the others. Note that this can also be inferred frorn the first observation made in the discussion preceding this example about measurements with sinall standard deviations. In order :o

verify this observation, about 20 simulation trials were made in each of which a gross error of magnitude 5 to 15 times the standard deviation was simulated in I-low measurement 1 and the GLR test was applied for identifying the gross en-or. Similarly, 20 trials were ~ n a d ewith gross error in measurement 2, and so on for each position of the gross error. The results showed that while gross errors in streams 1, 2, and 6 were identified correctly in all trials, only 60% of the gross errors in stream 3 and 30% of the gross errors in stream 5 were identified correctly. Although the number of trials made are small, the trend of the results corroborates the observations. identifiability of Gross Errors

Even if a measurement has a high detectability, it is important to determine if a gross en-or in this measurement can be identified or distinguished froin a gross error in any other measurement. For linear processes, this quesdon can be answered in different ways, which we describe below. Iordache et al. 1231 pointed out that the test statistics of two diHerent measurements are identical if the columns of matrix A correspondi~:g to :he two measured variables are proportional !o each other. One special case ~f t h i ~occurs when two parallel strezrrls link the san:e twc nodzs oi' a process. This irnplies :hat it is not possible to distl~guishbztween gross errors that occur in these ~neastirements.In the comext of the GLR test. Xarasimhan 2nd Mah 1131 indicated that if the signature vectors of two grass errors are proportional. then these cannot be distiiiguisll~dfi-om each other. If we restrict corlsideration to measurement biases ~il!y. then this observation is :he same as ihc one made by lordriche et a!. [33].By using s i ~ a a t u r eT.~cctorsl identifiability problems bet~veendifferent types of gross errors can be discovered. Recectly, Eagajewicz and Jiang [24! proposed the cancept of equivclrrzt sets o f g~us.7errors. A set of gross errors is equivalen: to another set of gross errors if the two sers cannot be distinguished from each ochcr. For the case of measurement biases, Bagajewicz and Jiang 1241 proved that if a set of measurements or k var~ablesfornms a cycle of the process graph, then gross errors in any combination of k-l nieasorements from this set cannot be distinguished from gross en-ors in any other cornbination. This can be easily verified. if we note that in serial elimination a nleasurement suspected to contain a gross en-or is eliminated, making the corresponding variable unmeasured.

Choosing to eliminate any set of k-I measurements from a cycle of k measurements will automatically make the remaining measurement nonredundant and will eliminate it from the reconciliation problem. This implies that the solution of the reduced reconciliation problem will be the same regardless of which combination of k-l measurements are eliminated. Thus, it is not possible to identify which set of k-I measurements from this cycle contains gross errors and all such sets are equivalent. For the same reason, it is not possible to distinguish gross errors in the measurements of all k variables of a cycle froin any set of gross errors in the measurements of k-I variables of this cycle. As a special case, if we consider a cycle formed by two streams (parallel streams), then is not possible to distinguish between a gross error in one stream from the other. It is also not possible to distinguish whether both the parallel stream measurements contain gross errors or if only one of them contains a gross error. Furthermore, if the number of independent constraints is equal to in, then all sets of nz linearly independent gross errors are also equivalent. This is due to the fact that the reduced reconciliation problem will have no redundancy left and there is no i nforrnation a\iailable to make ar,y distinction between them. We refer to equivalent sets of gross errors as belonging to an equivnirrzc-y class. Eqgivalency classes can also be obtained in te1111s of the signature vector-s of gross ennrs. which allow other types of g r o s errors such as !eaks to he also considered. The followi~igprinciple can be derived: lithe signature vectcrs for a set of k grcss errcrs form cr linearly dependent set of rank k- I , h e n it :s no: possible io theoretica!ly d!'stinguish behueen one combination of k-1 g r a s errors from any oiher cornb;notisn o f k - I grcss errors chosen from this set. It is also not possible tc &siin3uish whether k gross errors or k- l gross errors from fhis sei are presznt in the process. (It is possible, however, to distinguish a cornbirlation o f less than k- T gross errors from ofher combinations.) As c special case, if the rnaximcrrn number o f independent signature vectors is rn, then any set of rn gross errors with !inearly independent siqnature vectors is equivalent to any other such set.

Example 7-9 The process graph of the flow proces5 considered in the preceding examples is shown in Figure 3-1. The following three cycles can be identified in this graph:


Dirrn h'c~.o,fc~iiafio:l olid <;nt EII or- Ucrecllnn

Cycle 1 consisting of streams 2, 3,4, and 5 Cycle 2 consisting of streams 1, 2,4, and 6 Cycle 3 consisting of streams 1, 3, 5, and 6

straints are linearized around the reconciled estimates, it is highly unlikely that the columns of the linearized constraint matrix will become dependent. Even if this occurs, it has to be interpreted as a numerical problem rather than as an identifiability problem.

Thus, the following equivalency classes of sets of biases are obtained: Class 1. [2,3,4]; [2,3,5]; [ 2 , 4 , 5 ] ; [3,4,5];and 12, 3 , 4 , 5 ] Class2. [1,2,4];[1,2,6]; 11,4,6]; [ 2 , 4 , 6 ] ; a n d [ 1 , 2 , 4 , 6 ] Class3. [1,3,5]; [1,3,6]; [1,5,6]; [3,5,6]; and [1,3,5,6]

If we additionally consider, say, a process leak in node 1 (splitter), then we use the signature vectors to identify equivalent sets. 'I'he signature vectors for measurement biases are the columns of matrix A. The signature vector for a leak in node 1 is the first column of matrix .\, which is identical to that for a bias in measurement 1. Thus, a leak in node 1 cannot be distinguished from a bias in measurement 1. In addition, we also obtain the same equivalent sets as obtained using cycles of a graph because the signature vectors for measurements biases in streams 2, 3. 4 and 5 ar-e linearly dependent with rank 3 and so on. it should also be kept in mind that if a set G of gross errors contains a rchse: of zross errors, say gc, belonging to art equi-~alencyclass C, then o:!:er equi\~alcntsets of C can be obtained by replacing gci with other sets which belong to C . Thus, for example. we can derive the equivalent sets for :he combination 11, 2, 3. 41 by replacing the subset [2>3. 41 by other sets of Class 1. Similarly, we can replace [i , 2. $1 by other sets of Class 2. Thus, .,ve obtain another equivale~~cy class siven by

PROPOSED PROBLEMS NOTE: The proposed problems that are irtclr~dedin this c/apter require more extertsive calculatiorts. A computer program or a mathematical fool such as MATLAB is required irz order to get solutw~lsto these problems.

Problem 7-1. A mass flow network from Rosenberg et al. [I91 is represented in Figure 7-1. The true mass flow rates (in Iblsec) in the stream order are given by the vector: [15 15 25 10 5 10 10 5 5 5 10 5 5 10 10 101. All mass k delpiation' f& each meaflow rates are considekdim6asured. ~ h standard surement is 2% of the measured value. Following the procedure explained at the end of Chapter 3; sirnulate random measured values and fiild :he I-econciliation solution. Next; sirn~llatea single gross error (bias or leak) and

Ciass 1.[ I , 2. 3.4;; [ i ; 2. 3; 51: [ I . 2 , 4 , 5 ] ; [ i , 3 , 3 , 51; i!, 2, 3, 61: [1, 3 , 3 . 6;: [?,: 3.4. 6;; 11; 2.5 61: [2,3. 5 , 61; 11,4,5, 61; [2,4,5.4]; [3.4,5, 61 'The last 5 sets are added to Class 4, because they are equivaleni to sets [I, 2. 3, 51: [ I , 2. 4. 31; arid 11, 3, 4, 51 which belong to Class 4. Equivd1encg Class 4 can also be generated using the fact that this process has 4 i n ~ c ~ e n d e constraints nt and all sets of 4 gross errors with linearly ifidependent signature vectors are equivalent. Although identifiability problems can occur in linear processes. ill gellrrol, 11zi.v i.s /lor u ~ I . o / ~ / E Iin ~ I tzoi11ii1e~1r pi-OCESSCS. If nonlinear con-

Figure 7-1. Mass flow network for Problem 7.:. Reprinted with permission from [19].Copyright O 1 987 American Chemical Sociep.

apply appropriate statistical tests (at a-0- .05 level of significance) for gross error detection and identification. For a more extensive study, simulate gross errors of various magnitudes and in different locations. Also, calculate the detectability factors by Equation 7-74 and explain why some gross errors can be detected and correctly identified, while others cannot.

Problem 7-2. The steam-metering system for a methanol synthesis unit [ I 81 is represented in Figure 7-2. The correct values of the steam flow rates are listed in Table 7-4. The values in Table 7-4 are 8-hour averages of the plant data except they have been adjusted to balance the system. A11 flow rates are considered measured. The standard deviation for each measurement is 2.5% of the measured value. Repeat the operations indicated in Problem 7-1, and explain the behavior of various statistical tests ~ cstreams i with different detectability factors. for gross errors s i m u ~ a ~in Choose the level of significance a=0.05 for all statistical tests.

Table 7-4 Correct Values of Flow Rates for the Steam System of Figure 7-2 Sfreurn No.


0 4


Flow Rate 1,000 K g / h

Stream No.

Flow Rok 1,000 Kg/h

1 2 3 4 5

0 86 1 111 82 109 95 53 27

15 I6 17 18 19

60 21 64 32 73 16 21 7 95

6 7 8 9 10 11 12 13 14

1 12 27 2 32 164 05 0 86 52 41 14 86 67 27 111 27 91 86

20 21 22 21 23 2> 26 27 28

10 5 87 27 5 45 2 59 46 64 85 45 81 32 70 77 72 23

Problem 7-3. A sinlpliiied diagram for an ammonia synthesis process from C1oa.e et al. [ I I] is represented in Figure 7-3. By using Crowe's projectioil mati-ix niethod. all xnrneasured variables have been eliminated and a reduced inode1 involvirig o11iy ~rleasuredvariables is obtained. The constraint 111atnx f o ~the - reduced niodel is:

and the lneasured values for the measured conipo!len: flow rates are indicated in Table 7-5. A nondiagonal variance-covariance matrix for the measuremenr errors was uszd for- ct~isp~nblernas follows:

Figure 7-2. Steam metering system of a methanol synthesis plant [I81. Re roduced

with permission of h e Arnericm institute of Chem;col Engineers. AIChE. A// rights reserved.

cqoyrgLDi 986

Figure 7-3. Flow diagram for a simplified ammonia plant [-I 11. Reproduced with permission o f the American /nstitv!e of Chemical Engineers. Copyright a1983 A!ChE. A// rights reserved.

At least one gross error exisr:; in the given measured data. Apply various staristical tests at le\,el c:=O.C5 to find the most likr^ly ioca~ionof the cross error.

dia ram fo: o chemical extraction plant 1271. Reproduced with pernisii~n o f h e ConsJan Sociei f.or Chernica' Engineering.

Figure 7-4. Flow


I'roblem 7-1.A tlov shec! for a chemical exrractioil p i m i From Hull>-e l [27j is repl~s~tlteci in Figure 7-4. Tile measured ilow rates i i : ~lbii~r-!. aver2:g::d over z 13-haul- period. are given ir, Tabl:: 7--6.After e1i:lliilaiin~ the annleasureG tlows by a pro-;ection mat!-ix. :tie foilou ins ~ . c d i i i ~ i i

Table 7-5 Measilred Values for the Ammonia System in Figure 7-3


Spuies _









__ _


Measured Flows (molls) -








C' 4 101


~ ~ ' 2 1


h iiqi*l -




20 7 69 62 205 -

11tiroduoiorz lo Gross Err-or Derct-rion

The variances for the measurement errors are also given in Table 7-6. Apply the global test and the measurement test (both at a=0.05),to identify the measurements suspected in gross error. Eliminate the suspected gross errors (one at a time) and recalculate the statistical tests. Which measurement is more likely to contain a gross error? Repeat the probleln with the nondiagonal covariance matrix given by Holly et al. [27].Explain any difference from the results with the diagonal variance-covariance matrix.

Table 7-6 Measured Values and Variances for the Chemical Extraction Process in Figure 7-4 Strwm


8.35E+05 1.07E+07 1.66E+07 4.95E+O7 1.09E+07 6.3855 6.77E+C7 1.05E+9S 4.70E+06 0.4301 4 1096 8.27E+06 9.49E+OS 23680 771.5 6.547 X.50E+07 7438 837 1 400 1 8.27E+05

Measured value (Ib/hr)





* A n y gross error strategy needs to detect and also identify the location of gross errors. There are two types of errors associated with any statistical test: Type I error (when the test detects a nonexistent error) and Type I1 error (when the test fails to detect an existent error). Only the measurement test and the GLR test can directly identify the location of gross errors (by a simple identification rule). The GLR test is the only test which can identify both measurement biases and leaks by the same type of test. The gross error detection strategy by GLR test involves estimation of magnitudes of gross errors. Maximum power tests can be derived for the measurement test and for rhe nodal (constraint) tests. But the GLR test is more powerful than both of them for the single gross error case. The power of the GLR test is same as for the measurement test for a single measurement bias. Alternatively, the GLR test statistic is equivalent to the measurercent test statistic for a single measurement bias. Principal cornponent test is a iinear cornbinatior, of the eigenvectars of the variance-covariance matrix of cortstraining residuals or measurement adjiistments. Principal component tesi cannot directiy identify the l ~ c a i i o nof cross error. It r c q ~ i r e sadditional analysis ia order to find the major construints or nieasurenie1l:s contributing to the principal co~nponentsthat failed the test. Serial e!irnination can be used to identify gross errors detected by the global :est. The reductio~: in global test statistic after elimination of a measurement is equa! to the GI-? test stz!i:t;p 'The detectability of a gros.: error depends mainly on its magnitude and its location. Some sross errors can be detected, but not always properly identified.


REFERENCES 1. Ripps. D. L. "Adjustment of Experimental Data." Chetlz. Etzg. Progress Synp. Series 6 1 ( 1965): 8-1 3 . 2. Almasy, G. A.. and T. Sztano. "Checking and Correction of Measurements on the Basis of Linear System Model." Prol?lei?lsof Control and Injiot-tnrrtion Tl7eor;v 4 ( 1975): 57-69. 3. Madron. F. "A New Approach to the identification of Gross Errors in Chemical Engineering Measurements." Clzenz. Eng. Sci. 40 (1985): 1855-1860. 5. Reilly, P. M., and R. E. Carpani. "Application of Staristical Theory of Pidjustrncnt to Material Balances," presented at the 13th Canadian Chem. Ens. ConScrence. Ottawa. Canada, 1963.

5. Mah. R.S.H., G. M. Stanley. and D. W. Downing. "Reconciliation and Rectification of Process Flow and Inventory Data." 11111. & Eng. Cl?etn. PI-oc.Dc7.s. Dcix. 15 (1976): 175-183. 6. Mah. R.S.H., and A. C. Tarnhane. "Detection of Gross Errors in Process Dara." AICIIE J O I ~ I - I28 I N(I1082): 828-830.

7.Sidak. Z."Rectang.11ar Confidence Regions for the Means of Multivariate Normc~lDist~-ibtitio!~." Jolc~,~cii ofAt71cr-Stutis. Assoc.. 62 (1967): 62h433. S. Ro!il~ih.1). K. anti J . F. L),i\;is. "Uii'niascd Estimation of Gross Error\ in PI-ocessMe3ji1remen1s.".-!/Ci1i'.lor0~11~11 38 (i992): 563-572.

4. Crow?, C. ii4. "Test of Masirnux Power for Detection of Gio.;s Erro:-s in I-'r~>cesb Constrainrs." rlIC!~f;Jo~/rnal35 (1989): 869-872. 10. Tsmhane. A. C. ''PINore on the Use of Residuais for Detecting at: Otitlier in Linear-Regression." Bion~cil-if369 ( 1 382): 4XX199. i 1 . Cro~r~e. C. hl.. Y.A.Ci. cam pi:^. alld A. HI-yniak. "Reconciliation of Fr3czss F!o:41 Rates b>-Matrix Projec:ion. 1. 1,inear Case.'. AIClrE . I ~ I A ~ 29 I I ((Ii /983 j:


16. Jordache, C., and B. Tilton. "Gross Error Detection by Serial Elimination: Principal Component Measurement Test versus Unjvariate Measurement Test," presented at the AIChE Spring National Meeting, Houston, Tex., March 1999. 17. Narasimhan, S., and R.S.H. Mah. "Treat~nentof General Steady State hlodels in Gross Error Detection." Compurels & Clzenl. Engng. 13 (1989): 851-853.

18. Serth, R. W., and W. A. Heenan. "Gross Error Detecting and Data Reconciliation in Stearn-Metering Systems."AlChE Journal 30 (1986): 243-247. 19. Rosenberg, J., R.S.H. Mah, and C. Iordache. "Evaluation of Schemes for Detecting and Identification of Gross Errors in Process Data." h d . & Ell.;. Chetn. Proc. Des. Dev.26 (1987): 555-564. 20. Crowe. C. M. "Recursive Identification of Gross Errors in Linear Data Reconciliation." AlChE Joul-1,0134(3988): 541-550. 21. Mah. R.S.H., G. M. Stanley, and D. W. Doarning."Reconciliatioil and Rectification of Process Flow and Inventory Data." 111d3 Eng. Chrrt7. Prnc.. Dts. Del,. 15 (1976): 175-1 83.

22. Madron, F. Process Plntlt Perfortnur~ce:Meusztr-emerzt utld Uuru PI-ocessilzg ful- C)i>tililizationand Rett-ofits. Chichester, West Sussex. Eng!and: Ellis Horucoc! Limited Co., 1992. 23. Iordache, C., R.S.H. Mall, and A. C. Tamhar~e."Performance Studies of the X!easurzment Test for Dc!ecting Gros Errors irt Pn:cess Da~a."AICI7EJozrino/ 3 1 (1085): 1 !87- 1201. 24. Ragajev:icz, M. :.. and Q. Jia12g. "Gross Error Modeling and Detection iit Plant Linear Dynamic Keconciliation." C ~ ~ ~ I I ~&I IClzer~i. ~ , D I Er~gizg. -.S 22 (ilu. 11. 1998): 1789-1809.

-i- .Chamentter.

\I., I.J. Chang, G. M. Schv.,entel-. and K.C. Bardin. "An 011iinz Cat2 Reconciiiario~System for Crude and Vacuum rjnits.'. presentzd ar the NPKiZ Computer Ccnf., FIouston, Tex., 19";.

12. Ta:nbar~e,A. C.. and R.S.H. M211. "Dala Reconciliutioj~and G I Y J S Error ~ L3etection in Cheniical Process Networks." Tc.c.lri1on7erric.s27 (1985): 409-422.

26. Kneile, R. "Wrins More Info!-11ia:ion out of Pisnt Data." C!lern. Eng!lg (Mar. 1995): i 10-1 16.

13. Narasinthari. S.. tind R.S.H. Mah. "Genersli~eil1-ikelihooti Ratio Method for Gross EITOIidentii?cl~tion.'-AICIIE J J ~ I - t l n33 l (1 987): I5 14-1 52 1 .

27. Hclly, W., R. Cook, and C. M. Crowe. "Reconciliation of Mass Flow P,-?e Measurernents in a Chemical Ex~I-action Plant." Ca17. .I;. O f Chem. E:7g. 67 (Aug. 1989): 595--601.

i i . Nal-'isimhan. S. "Maximum Pouer tests for Gross EI-ror Detection Usii~g

L.ikelihood Rations." AICIIE Joir1.t7irl36 ( 1990'1: 1589-1591.

!iTong. H., and C. 31. Cro\\~e."Detection of Cross Errors in Data Reconciliation by P~.irlcipalComponciit A~lalysis.'';ZIChE Jour-110141 (1 995): 1712-1723.

Mir/ri,dr Cross Errol I d e ~ i f i J i c ~Sf tj ro~ ~l~t e ~for i c ~Stef~dv-Sia?e Proces~es

Multiple [dentifiiation Strategies for Steady-State Processes In the precedinz chapter, the different statistical tests for detecting the presence of gross error-s and the methods for identifying a single gross error in the data were described. For a well-rnairitained plant, we sholtld genc!rally not expect rncre than one gross error io be present in the da;:~. Therefore; a fundamentai prsrequisite of any gross error detection strategy is that it should hzve good ability t c detect and identify cor~ectlya single grow errgr. If dats recccciliarion is app:ied to a largc s ~ b s y s t e mconsisting of rcariy measurements, however. or if tke sensors are operating in a hortile environment. and/or plant m2in:enance prscedures are inadequate, then it is possible for severa! gross errors to be .;imultaneous!y present in the data. Thus, there is a need to add a third component to :he gross elror detection strztegy which provides the capabi!ity to detect 2nd i d e ~ t i f y multiple gross errors. Generdly i r ~the research iiterature, a gross error detection strategy is presented as a single entity withou: clearly distinguishing t:,, diff~ieii; components of detection. identification of a single gross errcr. 2nd ilentification of multiple gross errors. As mentioned in the preceding chapter, we have chosen to separately ar~alyzethe three different components used in a gross detection strategy to gain a better insight in!o the sirnilarities and differences between the various methods proposed. Our main focus in this chapter is the component of the gross error detection strategy that deals with n~ultipleerror identification.




Multiple gross error identification strategies have been proposed by different researchers over the past four decades and it is not our aim to describe in detail every strategy in this chapter. In order to gain some perspective, we have attempted to classify these strategies into different classes depending on the core principle on which they are based. Within each of these categories we have chosen to describe one of the strategies in detail, depending on ease of description, and then indicate the different variants of this strategy proposed by other researchers. All the techniques developed for multiple gross error identification can strategies or serial strategies. be broadly classified as either si~~zultaneoz~s They may also differ in the type of information exploited for identification. For example, some of the strategies also make use of information about lower and upper bounds on the variables to enhance the identification process. Lastly, as we have noted in the methods used for single gross error identification, not all strategies are designed to distinguish between different types of faults. For ease of comparison and description, we will restrict our considerations to the identification of gross errors caused by sensor biases, and indicate wherever pertinent the extension that car? be made to icclude other types of gross errors. Linear system5 are initially considered followed by the treatment of ~onlirlcarprocesses.


idcrrfi,ficafi'onL'sing Sirzgle Gr3.s.s Error Wsi Statistics Simultaneobs strategies for niultiple gross ec'or idet1:ification attempt to ideztify all gross errors present in the da:a siinultaneously cr in a sillgle iteration. :n the case of the measurement test or GLR test, this stratem v i s a simple extension of the identification ru:e used for identifving a -, , single gross error and can be stated as follows:

Iderztify gross errors 111 all nzeasrtrenze~ztswhose corresporzdilzg test statistics exceed the test criter-iorl. In the case of the GLR test, the ahove rule can be easily extended to identify other types of gross errors by making use of the corresponding

bll,iiipic Gross Err-or i ~ f e f i r r / i < ~ Sirurrgrrs ~ ~ t i i ~ n fur Sirad\-Slo:r, Prorr.rsfs

test statistics. The effectiveness of this rule was investigated by Serth and Heenan [I] and was generally found to result in too many mispredictions. The main reason why the above rule does not work well is due to what has been referred to as the sr7zruriizg eJfect. Since the variables are all related throt~ghthe constraints and the constraint residuals are used in deriving the test statistics, a gross error in one measurement may cause the test statistics of good measurements to exceed the test criterion but not the one which contains the gross error. In other cases, the test statistic of the measurement containing the gross error and those of other measurements can siinultaneously exceed the test criterion. The degree of smearing depends on many factors, such as the level of redundancy, differences in standard deviations of ~neasurenlent errors, and magnitude of the gross error 121.

Identification Using Comhiriatoricrl Hypotheses Another sirnultaneo~lsgross error idcntificatio!l sti-ategb- is based on explicitly postulating all the alternati1.c hypotheses not only for a single gross enror but also for the differen: colnbinations of two 01- more gross errors in the daiz. A test statistic call be derived for each of these alternatives and the most probable one cho\en. This srl-ategy IS especialiy s ~ ~ i t e i l 6, IJ, ,application aionp with the GLR test. It can be recalled finm the prrceciing chzpter rhac the GLR rest considers all possible sinple Sross err(>!hypotheses as part of the alcernativc hypothesis. We can extend this to i!lciuCe hypothesis of ail possib!e combinations of t\izo or more gross errors. In o~f-rerwords. the hypoti2eses can he f~ji-mulatcdits fc~lio.~vs.

Ho ino gross errcr in the d m ) : Err] = 0 F I , (composed of a!ternaii\es I f : . H;'.",


. . ff,!'.".


H i (singie gross errcjr alternatives): Elr] = bf,



where the indices i l , i2 and so on are chosen to exhaustively consider all possible combinations of gross errors. Thus, if t,,,, is the maximum nunlber of gross errors considered to be simultaneously present in the data. then 2'rnax alternatives are present in the composite alternative hypotheses. Corresponding lo each of these alternatives, the GLR test statistic can be derived. In general, let us consider the alternative hypothesis of k gross errors in the data corresponding to the gross error vectors f,,, fi2, . . . fik.The expected values of the constraint residuals under this alternative hypothesis can be written as

where b is a column vector of unknown magnitudes of the gross errors and FL is a matrix whose columns are the gross error vectors fi,,fi2, . . . f,Lcorresponding to the k grcss errors hypothesized (see Chapter 7). We can obtain the like!ihood ratio for this hypothesis and following the same procedure as in deriving the test statistic Tk in Equation 7-30 \hie can obtain the rest statistic for-this hypothesis as

The maximu~nlikelihood esti:nates of the corresponding gross errol magnitudes are given by

i n arder to tietermine the number and type of gmss errors, i~owever, nle csfinot apply the simpk rule of choosing the nla.;iril~rli test ststistic ainong zli alternative !iyp:)thcses. ]'his is due to tile fact t h ~ all i the test statistics do nut follow the same dis!ribution because of the differences in the number of degrees of freedom. The test statistics for k gross errors given by Equation 8-3 can be shown to follow a central chi-sqt~aredistribution with k degrees cf freedon. under the null hypothesis. In order to choose the most probable l~ypotnesis,we can con~putethe Type ! e n a r probabilities lvi cacil of the test sta~;siicsgiven by

N;',"(two gross emos combination altcrriative:,): E[r] = b,f,, + b,f,, H;1,i2..


(k gross error combination alter-native\!: E[rj = 1:!f1,+b,f,, + . . . +t>,f;,

where X: is the random variable which follows a central chi-square distribution with k degrees of freedom. We can now choose the minimum

among the Type I error probabilities given by Equation 8-5. If this exceeds the modified level of significance fi for a chosen allow~bleprobability of Type I error a, then we can conclude that gross errors are present and the hypothesis corresponding to the minimum Type 1 error probability gives the number and type of gross errors present. Furthermore, the estimates of the g r o s errors are given by E q ~ ~ a t i o8-4. n It can be easily verified that this method is equivalent to the choice of the maximum likelihood ratio test statistic for the case of identifying a single gross error described in the preceding chapter.

It must be noted that in any case it is not possible to detect and identify more :I-oss en-ors than the number of constraint equations, that is, t,,,, can at inost be equal to 111. This is because for e\,e13~gross error hypcthe\ i ~ e deither the correspondin: measurement is eliminated or an extra unknown p:irameter cor~esponaingto the ina,nnitgde of the gross error has to be estr~ua!cd.v:hich reduces the d e ~ r e eof redundancy by one. Furihermo~-e,due to gross error identifiahllity pri3blems as discussed in the preceding chapter, only Gne cornbination of gross en-ois belonging to each equivalency ciass needs to be considered in the alternative hypothesis. For example, in the case of tlvo para!]<-1 streanls. o ~ l yone of the h).pctheses fc,r a gross er-ror in cnc of these strearn flow measuremer?:s has to he consicicred. Similarly, i t is not p~s:jibleto sin~~rltaneclusly identify gross errors in fiow r~leasuremeritsof streams forming a cycle c11^ rhc process. In other- wor-ds. only those corubiriations of gross errors can be considered whose gross error cignature vectors are linearly independent. so that the matrix Fk will be of full column rank and the inverse ~f the matrix F':V in Equation 8-4 is guaranteed to exist. It was proved in the preceding chapter that for a single gross error the GLR tzst is equivalent to the use of global test cornbined with measureItlent elimination sti-ategy. This result is valid even when multiple rneasurements are simultaneously eliminated. In other words, the GLR test statistic given by Equation 5-3 for the hypothesis of gross errors being si~nultaneouslypresent irt X measurements is identical to the reduction in

the global test statistic due to elimination of the k measurements suspected of containing gross errors. We describe below a strategy proposed by Rosenberg et al. [ 3 ] based on the global test which is essentially based on this principle. The strategy proposed by Rosenberg et al. [3] makes use of the global test along with elimination of measurements. Corresponding to each alternative hypothesis of 8-1, the measurements which are suspected to contain gross errors are eliminated and the global test statistic is computed. Using these statistics, gross el-rors in measurements corresponding to the most probable hypothesis (the lowest Type I error probability) are identified. Instead of comparing all the alternative hypotheses simultaneously, however, Rosenberg et al. 131 considered in sequence single gross s so on in error alternatives, followed by two gross error a l t e ~ ~ a t i v eand increasing order of the number of gross errors hypotliesized. If at any stage in this sequence the global test statistic computed after eliminating the suspect rneasurements is found to be less than the critical values at the a level of significance drawn from the appropriate chisquare distribution, then the procedure is terminated. This approach therefore attertlpts to ider.tify as few gross en-ors as necessary to accept the null hypothesis that no lncre gross errors are present in the remaining measurements. Rosenberg et al. [3] perf^;-med simulation studies ta co111pare the performance of this scratcgy with other strategies. In particutar-. they found Chat this simulianeous strategy performs better than the sirnp!e simultaneous strategy based on measurement tests described in the p r e ceding section. However, theii- cornparisoil was limited to cases n;heri only cne gross error was present in tlie measurements. 'The sirn~~ltaneous strategy described ah:)ve has not been used rrruch due to the conibinatorial increase in the numb-r of alternative hypotheses to be :ested far eacl: of which a statistic has to be computed, leading tcl excessi\;e computational burden. However, tlie speed and power of computers has b ~ e nincreasing rapidly, and it is wo!-thwhile to s!udji whether the si~illtiiii~cjiis strategy described above gives better performance than serial strategies and has acceptable coinputational requiren:ents.

Iderttification Using Sin~ultaneousEsti~natiorzof Gross Error Mug~zifudes The method due to Kosenberg ct al. [ 3 ] inlplicitly assumes that a t ~ z i t ~ i rnul r ~ u j t z b e rof gross errors are present in the system, because it considers hypotheses of 2 or more gross errors only if the hypothesis of one gross er-ror is not statistically acceptable. This is also true of all serial

Jiang et al. 151 were the first to point out the need for taking into account equivalency of gross errors when choosing the candidate set and they appropriately modified the UBET strategy. They also made use of principal component tests instead of nodal tests to choose the candidate set of gross errors. Their simulation results showed that the overall performance of the gross error detection strategy did not significantly depend on whether the nodal or principal component tests were used to choose the initial candidate set of gross errors. Since the UBET assumes that a maximum identifiable number of gross errors may be present, it can be expected to perform well when many gross errors are present and poorly when only a few gross errors are actually present. This was also confil-med through simulation studies [4J. Other si~nultaneousstrategies were proposed by Jiang et al. 151, which have similarities with the method of Rosenberg et al. 131. and also b ~ , Sanchez et al. 161, who designed various strategies for simultaneous identification and estimation of nleasurement biases and leaks. The above three simultaneous strategies are illustrated usiiig the sir-12ple flow process example considered in Chapter 1 .

strategies that we will discuss later in this chapter. Such methods generally do not perform well when many gross errors are present. An akernafive strategy is to initially presume as many gross errors to be present in the data as can be identified and then to discard some of these possibilities based on additional criteria. Rollins and Davis (41developed a simultaneous strategy called the unbiased estimated tech17ique (UBET) which is based on this principle. It has been pointed out earlier that the maximu~nnumber of gross errors that can be identified is equal to the number of process constraints, m. Moreover, the signature vectors for these gross errors must be linearly independent in order that their magnitudes can be uniquely estimated. In the UBET method, it is initially assumed that rrz gross errors (whose signature vectors are linearly independent) are present. Furthermore, it is assumed that the types and locations of these candidate gross errors are specified. (We will describe later methods that can be used to choose this initial candidate list of gross en-ors using the basic statistical tests). Let F be an 11 x 17: n:atrix whose columns are !he signature vectors of the assumed gross errors. The magnitudes of these gross errors can be simtl1:aneous:y estimated using F instead of Fk in Equation 8-4. The esti. T Pall ~ these gross mated magrlitudes of gross errors will be ~ ( I Z ! > ~ C I if er;-ors are actcally present; hence, :he narne UBET for [his method. In order to decide ivhzther an assumed gross en-or is actiial:y present, we can lest the hypotheses that the magnitude of the assumed gross error is ec(ua1 to zero or not. The estimated n?agnitudes of gross errors can be ssed for this purpose. If no gross emors arc present, then i: can be proved that b , is normr;lly distiibuteci wit!l nizar? zero and variance a,,. whicl? are ~ . can ioncl~lde the diligunat e1::;nzn:s cf rcatrix D, whel-e 1)= F i ' v - I We th3t the rnagnit~ldeof y o s s error i is non-zero if I b,/diil exceeds Z I .,-,, where a is :L chosen level of signiiicance; o~herwise,we can cor~cludc that gross error i is ilo? present. In order to select an initial candidate set of nz gross errors, we can iililke use of basic . 1 :*.i l l P l . , . ~ . ' - -7.::;) LIL'II ' ' LC>,> .,LJrirc,r(l L; X2lll1

Example 8-1 We consider the sintple h a t eschanzer with bypass process sho\vrl i;; Figure 1-2 for the case when all the Iloxl variables are n?eascred. M t assume that thrce gross ei-rors of +5. i 4 and -3 unics i~ measurelnerlts c;f streams 1, 2, and 5. I-espectively, are p:-esent. The true. measured, 2nd reconciled valces of ail fl:)ws for this caye are shov,;~~ in Table 8- I C

Table 8-1 Data for H e a r Exchanger witll Bypass Process Containing Three Gross Errors Stream Number

True Flow Values

Measured Flow Values

Rwoncrled Flow Values


L L ' U : J L b L

1 2 3 4 5 6

ple, one simple metho:! i: tc -ompute the Cr2Ktext statistics fix each gross en-or using Equation 7-70 and to pick the first 171 gross errors with the largest statistics. after taking into account equivalency considerations to ensure that the sigtiature vectors of selected gross errors are linearly independent. Rollins and Davis [4] made use of the nodal test in order to select an initial candidare set of gross errors, but this requires nodal tests to be perfor~nedon colnbinations of nodcs as well.

100 64 35 64 36 100

10691 68 45 74 65 64 70 37 J1 OX --





102 0573 67 1667 718867 67 I667 34 3867 102 0533

Let us cornpute the GJ,R test statistics for all allowable gross error combination hypotheses using Equ:ition 8-3. The maximum riurnber (7:'

gross errors that can be identified for this process is equal to 4, since there are only 4 Table 8-2 shows the GLR test statistics and the corresponding a level? for different combinations up to a maximum of 4 gross errors. Since streams 2, 3 , 4 , and 5 form a cycle, it is not possible to distinguish gross errors in all four of these measurements from a combination of gross errors in any three of these four stream measurements (see Example 7-9). For the same reason, it is also not possible to distinguish gross errors in any 3 of these streams from any other combination of 3 streams chosen &t of these four streams. Thus, in Table 8-2 the test statistics for all these equivalent combinations are listed in the same row. Similarly, other equivalent combinations are listed together in the same row of Table 8-2. Consider the GLR test statistics for a single gross error given in Table 8-2. The GLR test criterion at 5% level of significance is 3.84. If we apply the simple identification strategy based on the GLR test, then gross errors in rneasuremcnts of streams 1, 4, and 6 are identified. Thus. Type I errors for measurements 4 and 6 are committed. Furthennore, the gross error in stream 2 is not identified. These identification errors are caused by the smearing effect of the gross errors. We will now apply the sirrluitafieous strategy of testing all possible hypotheses of one or more gros? errors. Fror-n the result5 of Tahle 8-3,it is observed that a gross ~ r r o rin strzani 1 has the highest test statistic among ali single gross error hypoiheses. Sii~~ila~-Jy, ihe combination [ I , 2) has the iargest test statistic among ail 2 gross en-ors hypotheses. combination [ I , 2, 51 among 3 gross error hypotheses and all four gross e 1 ~ 3 r hypotheses have the same test statistic (since the reduced problern dces not have any redurliailcy for any of these hypotheses). Among these different combinations. the hypothesi5 of gross errcr:; lil measui-ements I and 2 gives the !east Type I error probability cf 1.4126E- 10. Hcnce, the simultaneous strztzgy based on testing all possible hypotheses identifies gross errclrs in measurements 1 and 2.Tnus, this stl.aii-gj ;2al;ifiL:,i;ie gross errors in rneasurenlc~lts 1 and 2 correctly, .-.!Chc?ugh it dces not identify the gross error in stream 5 (in this example, it is tacitly assumed that the Type 1 error probabilities ror different hypotheses are computed accurately, even though they are very small). The gross error in measurement 5 is not identified; therefore, a Type I1 error with respect to that measurement is committed.

T a b l e 8-2 GLR Test Statistics for a11 H y p o t h e s e s of One or M o r e Gross Errors Measurement Combination



[I1 121 [31 l41

PI 161



[1,21 [I, 31 11,41 11,51 [1, 61 [2,31 1241 [2,51 1261 L3,41 13,51 [3,61 14,SI [4,6] [ j , 6; [ I , 2,3l [ l . 2,4J; 1 1 . 2,6]; !i. 4, 61; 12, 4. 51; [1,2,4,61 [I, 2, 51 11, 3,4] [I, 3, 53; {i. 3. 61; [I, 5-61; i?. 5,6]: 11, 3, 5. 61 [1,4,5] [2,3, 41: 12, 3, 31; [2,4,5]; p , 4 . 51; p,3.4.5j C2,3,6! [2, 5, 61 13,4,61 14.5,61 All combinations of 4 mzasurcments out of 6 except [I, 2,4,6]; [1,3,5, 63 and 12. 3. 4. 51

GUI Test Statistic

Type I Error Probability

35.381 2.470 0.084 13.202 3.139 15.105 45.361 36.910 40.295 35.467 36.491 2.968 13.282 7.469 15.489 13.61 4.982 16.803 13.997 37.725 23.i33 45.742

2.7 1 14E-09 0.1160 0.77 19 2.7970E-04 0.0764 1.0169E-04 1.4326E-10 9.6644E-09 1.77868-09 1.9878E-08 1.1915B-08 0.0227 0.00 13 0.0239 4.3307E-04 0.001 I 0.0828 2.2459E-04 9.1329E-01 6.4279E-09 9.4733E-06 6.4362E- 10

45.522 46.254 43.234

7.1653E-i0 5.OOSSE- I !I 2.1918E-09

37.223 40.3: 8

4.1275E-08 9.1230E-09

14.014 17.610 24.5::: 37.854 41.415

0.00289 5.29338-04 3.0349E-08 5.3382B-09




1 0-17nr 1 .o -LL.-,,.,

If we apply the technique of Rosenberg et al. [3], we would first compute the G T and check whether a gross eTor exists. For these process data, the GT statistic is 46.254 while the test criterion at 5% level of significance is 9.488. Since the GT is rejected, we compare all single gross error hypotheses and find that a gross error in measurement 1 has the least Type I error probability. If we eliminate this measurement and recompute the G T we find it is equal to 10.873 while the test criterion at 5% level of significance is 7.815. We consider all two gross error hypotheses and find that the combination [ I , 21 has the least Type 1 error probability. We therefore eliminate these two measurements and recompute the GT statistic to be 0.893. Since the test criterion is now 5.991, the GT is not rejected and we t e m i nate the procedure. In this case also. two of the gross errors are identified correctly without committing a Type 1 error. As with the GLR test, however, a Type 11 error with respect to measurement 5 was committed. In order to apply the UBET method, we have to first choose a candidate set of 4 gross errors. Using the GLR test statistics for single gross errors listed in Table 8-2. we choose biases in measurements 1, 6 , 4. and 5 as candidates. Note t!iat xiynature vectors for these gross errors are linearly independent. The estimates of these gross errors are obtained as [3.8 1. -4.25, -1 2 5 , -1.221 and the variances of these estimates are c o n puted as 13, 2. 2, 31. Hence. the test statistics for the magnitudes of these gross errors are computed as i2.2, 3.0, 0.86, 2.441. For a = 0.05. the crirical value is 1.96. Based on rhese tes: statistics we. therefore. conclude thar the magnitgdes of gross errors in nieasurements 1, 6 and 5 are nonzerc. Thus, UEET commits a Type 1 errGr in 6 and a'rype 11elTor in 2. It shoilid also be nclted that. although we have restricted cur considerations to identifica~ioilof measurement biases, we can use the above simultaneous strategies for identifying leaks if we combine these component strategies with statistical tests such as the GLR or nodal tesis that have the capability to detect and identify leaks. Serial Strategies

As opposed to siri~ultarteousstrategies, serial techniques identify gross en-ors serially. one by one. Many serial strategies in combination with different statistical tests have been proposed and their performances studied (1, 3, 71. They differ by the type of statistical tests that are used, the manner in which the gross errors are identified, the tie-breaking rules used when the criteria fol- gross error identification are identical for dif-

ferent gross errors, and the criteria used in fie algorithm for terminating the serial strategy. Furthermore, some serial strategies also make ase of information regarding upper and lowel- bounds on variables. A better understanding of these techniques can be obtained by focusing on the core principle used in the strategies ignoring the processing details and the statistical test used for gross error detection. We first describe tlie two main principles exploited in serial strategies before describing the different algorithms developed using these principles.

Principle of Serial Elimi~zation The principle of serial elimination first suggested by Kipps [8] has been described in Chapter 7 for identifying a s111glegross error. This principle is useful in identifying gross errors caused by measurement biases ottly. because it relies on eliminating measurements suspected of containing a bias. This basic principle can as well be utilized for inultiple gross error identification by identifying gross errors serially. At each stage of the serial procedure, a gross error is identified in one measurement (based on some criteria) snd the corrcspondiil~measurement is elirninated before proceeding to the next stage. The major advantage of serial e!imi!iation is that it does ~:otrequire ally pi-ior knowledge about t h exisrence ~ or loca~iortof gross eliors.

The principie of seriai coi1:pensation was f rst suggested by Nal-asirnhan acd Mali i3j in co!ijur?ciion with the use of thc GL,R test. rjnlike tlie serial climjnatior~trcl~! principle call t ~ eused to idc-ctify other iypes of gross error>. besides measuren:eni biases. At each stage of [hi: serial procedure, a grohi erroi- is identified ibascd on some criteria) and the measurements or mudel are compensated using the identiijcd type and location of the gr(>sserror and the estimated maynitude of the Sross error, before proc e e d i n ~to the next stage.

The criteria used for identifying a single gross error in each stage of the serial procedure can be based on one of the statistical tests. Historically, the serial elimination principle has been used in conjunction with



~ ~ ~ , ~ l < . ar~d i / r aGross r i o ~Oror ~ L)errcriotz

the global test or measurement test, while the serial compensation principle has been used with the GLR test. However, it was established in the preceding chapter that the measurement test and GLR test are identical tests for identifying measurement biases. Thus the use of the serial eljmination principle in conjunction with the GLR test is the same as using it in conjunction with the measurement test. Furthermore, it was also proved in Chapter 7 that if the global test is used in conjunction with serial elimination strategy, then this is precisely the same as using the GLR test. The former identifies a gross error in the measurement whose elimination gives the maximum reduction in the global test statistic, while the latter identifies a gross error in the measurement corresponding to the maximum test statistic. Thus, identical results are obtained by using the serial elimination strategy in conjunction with the global test, measurement test or GLR test provided they use the same principle for single gsoss error identification at each stage of the serial procedure. The serial elimination and serial compensation principles were historically derived from different viewpoints. However, it is proved subsequently that if a modified sisrial compensation strategy as developed by Keller et al, [lo] is tlsed, then this is exactly identical to the serial e!iminatjon strategy. This result unifies all the different approaches. The reader is urged to keep these resu1:s in mind when going through the description of ihe scn?e of the most efficicn: strategies described below.

Serial Eli:ni~zutionStrategies That Do Itrot Use Bounds Amorlg the serial eliiilication strategies which do no: use bound i~lforination. iile Y J ~ I S ~proposed G~. by Serth and Eeenar; [ I ] kr!own as the itel-<-rivr~tle(~si:re~izeilt iest (!M'T) provides the basic structure. This strategy makes use of this measurernent test for gross error detection, and the ruJe for identifying a single gross errcr ir? :he measurement corresponding to the lnaxirnum test statistic at each stage of tlie iterative procedure. For multiple gross error detection. it uses the srnal elimination strateg. Th? algorithm terminates when the maximum of the test statistics for all remaining nieasurelllents does not exceed the test criterion. The details of the algorithm are as foilows:

Algorithm 1. Iterative hleasurement Test (IMT) Method [I] For ease of understanding the algorithm, the following sets are defined: Let S be the set of all original ~lieasuredva-iables. Let C be the set of mca-

surements which are identified as containing gross error5 (initially set C is empty). At each stage of the iterative procedure, the measurements in set C are eliminated and the variables corresponding to these measurements are treated as unmeasured and projected out of the reconciliation problem as described in Chapter 3. Let T be the set of measured variables in the reduced reconciliation probleril after projection of all unmeasured variables. Initially, T is the set of measurements which occur in the reduced reconciliation problem after projecting out all unmeasured variables which are present.

Step 1. Solve the l n ~ t ~ reconcil~at~on al problem Compute the vectors x,. (Equation 3-8). a (Equation 7-1 1 ). and d (Equation 7- 15, Step 2. Compute the measurement test statistics zdJ (Equation 7-17) for all measurements in set T. Step 3. Find z ,, the maximum absolute \ d u e ainong all zd, fro111Step 2 and compare it with the test criterion Zc = Z1-Bll.If z,,~,; I Z c proceed to Step 5. Otherwise. sel-ct the measurernznt corresponding to z,,,, and add j: to set C. If two or more measurement test statistics arc equal lo z,:,,,, sc!ect tlie rrlc'ssurerneni with the lowest ir,dex/ !n add to set C. Step 4. Remove the mzasure~neniscontained in C from sei S. Sclve the data reconciliation probictm tresting the \.ariabics c o r r e s p o n d ~ n;o~ set C also as unrnzasused. Obtain T, the set of rneasurelt1en:s ir? tht. reduced data recoficiiiation probleit:. arid the \-ei:oI-s a and d en:-iespocding :o !hex rrleasurernents. (Nore that Serth and Hccnarl j l J dehipneti chis climination sch?~uefor a mzss tlow balance pi-ohiern: their alzorithm. renlc! ir:g measurernents is equivalent to reinot !:I: sireanis fro111 the network by fiodai aggregation. In genel-al. this car: br achieved by a projection matrix. as described in Chapter 3.) Retunl to Siep 3. Step 5. The measurements yj, j t C sre suspected of containing gross errors. The reconciled esti~natesafter removal of these rneasurenlents a_re those cbtairi2d in Srep 4 of the last iteration. should strictly be recrllculated at each Note that the test criterion Z1-11,2 Step 3, since depends on the number of measured \-ar-iables in the model (Equation 7-7. where nz is replaced by the numbel- of measurements in thi. model). The more measured variables are eliminated. the

fewer the nulnber of simultaneous multiple tests, therefore the lower the probability of Type I error overall. It is possible, however, to sinlply use a global test at each stage to first detect whether any additional gross errors are present and use the measurement test statistics for identifying the gross error location. The level of significance at each stage k can be maintained at a and the critical value can be chosen from the chi-square distribution with degrees of freedom nz-k+l where k-1 is the number of qross errors identified so far. 'This ensures the Type I elTor probability to be maintained at a. The measurement test used in Steps 2 and 3 can be replaced by a different statistical test. Principal component measurement test [ I I ] can be used instead. but that requires Inore computational time. The principal component analysis should be used only if that provides superior power and lower Type I error. But that is not easily achieved without additional identification 15, 12. 131.

Table 8-3 Gross Error Identification Using Serial Elimination Strategy Stoge k

Degrees of Freedom Test Criterion



Gross Error Measurement Identified in

Serial Compensation Strategies That Do A70t Use Bounds The serial compensation strategy was developed for use with the GLR multiple gross error identification. It exploits the capability of the GLR test to detect different types of gross errors and also makes use of the estimates of the gross errors obtained as part of this method. Again, in this method gross errors are detected serially one at each stage. The component method used in this strategy to identify a gross error at each stage is the simple rule of identifying the gross error correspondin: ;o the maximum GLR test statistic. Since the gross error could either be associated with a rneasurertient or the model (in case af leaks), elimination of measurcmellts is nor appropriate. instead, the estimaled magnitude is used to cornper~saiethe corresponding nleasurenler?t or model constraints. The GLR test is applied to thz co!npensated constraint residurtls to detect and identify^ any other gmss error presefit. T i e procedure stops when the maximurn of the test statistics among ali remaining gros:: eimr possibilities does nct exceed the tes: critelion. We refer to this algorithm as the sinzple set-idcoi77ycJ~isafion .y:rg:egy (SSCS) since a modified version is described later. The details of the algorithm are as follows: L L S ~[9] for


Example 8-2 The ser-ial elimination strategy is illustrated using the same process as in Example 8-1. M'e will use the global test at each stage for detecting whethel- gr-o\s errors are present. and i!' so ilse the GI-R test statistics (n-hich is equival~ntto the ~neasuremecttest; for idciltifying ihc' gross error-. The 1c\.el of signific;~ncefor thc global test is chosen as 0.95 for all staFes of the serial straiegy. For the same mca:;ured data as in Example 8-1, the glcbal test stztistics for each stage of the serial eliinintition procedure arid the c c ~ ~ e s p o n d ing tes: cri:ena are listed in Table 8-3. It is ollseri.ed that the yloba! test ~ - ~ . ~ cthe c I snul! hypo~hcsi:;in stages 1 and 2, but 1101 in stage 3. Thus. t\vo gross errors are ~dentifiedby this algorithm. In thc ill-st stage. since t!le maximum ELK test .,ratistic is attained for measurement 1 (refer to Table 8-2). 2 gross error in this mcasurrme:lt is identified. In the second stage after elii-r~inatingmeasurement !. the C;I K test statistic is mnxirnum for strearn 2 among all re~liainirtg~ncasurerneilts.This can be vcriiied by comparinz [he test statistics for all combinations of two gross errors which contain strearn 1 . Thus. t\vo of the three gros:; errors are con-ecLly identified lvithout a Typs I el-1-0sbeing comn;ittcd.

Global Test stutistic




Algorithm 2. Simple Serial Compens2tio.1Strat-., ( C Y S )


Step 1. Co~nputethe constraint residuals r and covariance matrix of constraint residuals V using Equations 7-2 and 7-3 if no unmeasured va~iables are present. (Otherwise, the method for treating unmeasured variables described in Chapter 7 can be used to obtain the projected constraint resid:ials and its covariance matrix.)

h4rrlriplr Gmss Errol- Ide~tr!iicilriorrSrr-ui~yies.forSreodv-Str~rr.Procrssp.~

Step 2. Compute the GLK test statistics Tk (Equation 7-30) for all gross errors. Step 3. Find T, the maximum yalue among all Tk from Step 2, and compare it with the test criterion x ; - ~ , ~If. T 5 X:-p, proceed to Step 5. Otherwise, identify the gross error corresponding to T. If two or more gross error test statistics are equal to T, arbitrarily select one. Let f* be the gross error vector corresponding to T and b* be the estimated magnitude (Equation 7-29). Step 4. If the gross error identified in Step 3 is a bias in measurement j, then compensate the measurements using the estimated magnitude of the bias. The cornpensated nieasure~nentsare given by

On the other hand, if the gross error identified is associated with the model constraints, for example leak i corresponding to leak vector f, then the constrain? model is co~npensatedusing the estimated magnitude. 'rk2 co~npensatedmodel is given by

in eitilcr case. the compensated constraint residuals a ~ - egive^; by

Return lo Step 2 replacing the constraint residctals with the comper~sated conqtraint residuals.

Step 5. Conlpute the reconci!eci estimate:; for x using the compensated measurements and compensated model. Equivalently, the s t .:mates can be obtained using the compensated c o ~ s t r a i n tresiduals instead of the ciginal constraint residuais in Equation 3-8. 'The pr-inciple of compensating the rr~easurementsor model at each stage of the above strategy iinplicitly assulnes that the gross errors identified in the preceding stages and their estimated magnitudes are correct. In order to understand this clearly, it is instructive to consider the hypotheses that is tested by this algorithm at each stage of the serial procedure. Let us assume that at the beginning of stage k + l , we have already identified k gross errors corresponding to gross error vectors f;, f';, . . . , f;


and their estimated magnitudes are given by by, b:, . . . , b> The hypothesis for stage k+1 undc,r the assumption that all Goss errors identified in the previous stages as well as their estimated magnitudes are correct can be stated as

'H :

(only gross errors identified in previous stages are present):

H k+l (one additional gross error is present):

where f, in the alternative hypothesis can be any one of the gross error vectors corresponding to remaining gross error possibilities not identified in the preceding k stages. It ca:l be noted that in teims of compensated constraints r.esiduals, the hypotheses for~i~uia!ionis si~rlilarto hypotheses f ~ detectir?g r and identifyin2 a single Fross exor and thus the GLII test statistics for !he remaining gmss error possibilities can be co~nputedby using the compensated residl~alsin Equations 7-30 to 7-32. If multiple gross errors are pl-ese~tir? the data, the serial procedul-e oE frying co identify them results in rnisp~edictingthe type c r location of the gross error. Eve11 if the gross 2 1 ~ 0 iis COI-rectlyidentified. the estimate of its m a g ~ i t u d ernay be grossly incorrect. Thus, the cr~rnpensatedresiduals can contain spurious large er-rors noi presen! in the original residua!^. which can i~llpliirthe accuracL- of identification of remaining gross errors. The serial con:pensation strategy may thus give rise to a largc number of ~nispredictions,especially n1h-n many grnss errors are present and when the magnitudes of gross errors are large in comparison with the standard deviations of the randotn errors in measurements. This was derno~stratedthrough simulation studies by Rollins and Davis 141. On the other h a ~ d if , only a few gross errors are present in the data. then the SSCS strategy performs as well as the IMT as shown through simulation studies [9] and typically requires about one-fifth to one-half the computing time required by the serial elimination methods. This is due to the fact that that serial elimination requires recomputation of the

test statistics at each elimination step, which in turn involves recalculation of all matrices involved in order to get a new solution. Co~tstruction of a projection matrix is reqilired after each deletion of a measurement from the network. Only for diagonal C and mass flow balance equations certain efficient elimination procedures can be developed [14].

Algorithm 3. Modified Serial Compensation Strategy (MSCS) [lo] In order to obviate the problem of too many mispredictions by the SSCS strategy due to incorrect compensation, a modified procedure was proposed by Keller et a]. [lo]. In their modified strategy, only the types and locations of gross errors identified in the previous iterations are assumed to be correct and the estimates of the gross errors are not used in compensation. The modified strategy still uses the GLR test for detecting gross errors and the rule for identifying a gross error correspor~dingto the maximum test statistic. Elowever, the hypotheses that are tested at each stage of thp_serial procedure are different from 8-9a and 8-9b and can be stated as follows using the same notation as in 8-9a and 8-9b:

H ; ~ '(only gross ersms ideiltified in previoils stages arc present):


pi (one additionrtl g r o ~ serrcr is presect):

It shoulu be noled cila~111 ~115nuii anu ;!it.-malive hypotheses, the magnitudes "f ,tie gioss errors arL dssuiiied to be unknown and their maximum likelihood estimates are computed as part of the, computation of the GLR test statistics. However. it can be observed that in the hypotheses fonnulation the locations of the gross errors identified in the preceding k stages are assumed to be correct. The GI,R test statistic at stage k + l for a gross error corresponding to vector fi (not identified i n the preceding stages) is given by

Keller et al. [ l o ] demonstratcd through simulation that the MSCS strategy co~nmitsless rnispredictions as cornpared to SSCS, especially when a large number of gross el-sors are present. Although the MSCS strategy was devised as a modification of the SSCS strategy, a look at hypotheses 8-10a and 8-10b shows that in reality no compensation is being applied, and the original constraint residuals are utilized for gross error detection in all stages. If we restrict our consideration to gross errors caused by sensor biases, then this strateFy is in fact exactly equivalent to the serial elimination strategy. The proof of this follows froin the interpretation of the GLR test statistic as the difference between the optimal objective function values of two data reconciliation problems (see Chapter 7). The GLR test statistics for stage k + l of the hlSCS strategy. giver, b!Equatiori 8-! 1, can also be in!eiyre:ed in a sirnilar manner. TfLefirst tc.:-~i: in the RHS of 8-1 1 is the optima! objective function of the data reconciliation prab!em in which the magnitudes of the gross errors itier1:ified i n :he preceding k siages are simultaneously estimated as p:zrt of the datz reconciliation problem. The foiinu!ation of this problem is an extz:lsioii of Problem P i of Chapter 7. Similzly, tile seco!ici tenn in the RHS of 8-1 I is equai to thc optiinai ohiective funcrion of the data rtxo~ici!iatior?probIem i n which the ~nagnitudes of gross e m r s identified in preceding k stages a23 the mag!lirudz of the gross error hypetilesized ir? stage k + i are simultaneously estic~aied.!ii addition, i: wax also pointed our in Chapter 7 chat the optimal objectix-e function value is the same whether- we chocjse to retain the measuiemenr containin? a grow prrnr ,>r,d estir~rrtcthp *?:~,-~i!zCi_e of the Fmss en-or. or we choose to eliminate the rneasurernent contairling a gross el-ror and treat it as an unmeasured variable in the data reconciliation problem. This result is also true when these are several measurements containing gross errors. In other words, we can also interpret the RHS of 8- 1 l as the difference in the optimal objective function values of two data reconciliation problems, one in which the measurements identitied as contail?ing gross errors in the preceding k stages arc eliminated, and the other i n which an adctitiona: measurement in wliich a gross error is hypothesized

nnrn Rccotr

at stage k + ] is also eliminated. But this is precisely the strategy used in IMT. Thus, the MSCS and IMT are identical strategies for identifying gross errors in measurements. I-lowever, the MSCS has the additional capability of identifying other types of gross errors. We can therefore regard MSCS as a serial elimination strategy that can handle different types of gross errors.



Example 8-3 The serial compensation strategy is applied for gross error detection on the same measured data for the flow process considered in the preceding examples. Again, the global test is used for detecting whether gross errors are present and the GLR test statistics are used for identifying the gross error. The main difference between this and the preceding example is that the residuals at each stage are compensated using the magnitude of the estimated bias and Fquation 8-8. The results are presented in Table 8-4. which show that two gross errors are detected in measurements 1 and 2.

Global Test Stahtic

Pegrees oi ireedow v =m-k+ i

Test Criterion

o,.-2 os

Gross Error Identified in Measurerncnt

Algorithn~4. Modified Iterative Measurement Test (MIMT) Method [I]



Step 1. Solve the initial reconciliation problem. Compute the vectors x, a, and d. Step 2. Compute the measur-ement test statistics zd., for all nteasurenlents in set T.

Table 8-4 Gross Error ldentifisation Using Modified Serial Compensation Strategy Stage k

The information about upper and lower bounds on variables can be exploited in the gross error detection strategy. Serth and Heenan [ I ] proposed a heuristic method of utilizing bound information in gross error detection by nlodifying the IMT method. The t~zodifiedIMT method (MIMT) still uses the lneasurement test for detection and the serial elirnination strategy for multiple gross error detection. However, the colnponent strategy used for identifying a gross error at each stage is not a simple rule of choosing the measurement with the maximum test statistic. Instead, the MIM'T method identifies gross errors in measurements only if their deletion from the original set S gives a data reconciliation solution that satisfies the bounds for all measured variables.


Mag~itude of Gross Erro:

Step 3. Fi~idz, the maximum absolute value zri~ongd l zdj f r ~ i l iStep 2 nnd cdmpare it wit!l tbe test criterion %, = %1-P/2.J f z,,, < Z,, pr<)~ei(! ic) Step 6. Citherwise, seiecr the measurement coixesponding to zma., and ternpot-ui-ily add it to set C. if two or nlcre rneasursrcent test statistics are q u a 1 to I,:,,, select the measurement wl:h the lowest indexj to add to set C.

(3from set S. Solve t.he d213 reconciiiatioil problern creating the variables correspondin2 to set C also as unmeasur-ed. Obtain T, the set of measurernefits in the reduced data reconciliation problern, and the vectors a and d corresponding tc these measurcrnznts. Step 4. Rernove tiic ~neasuren!ents con~airledin

Serial Strategies Tlzat 157,s..B01~zd.s The algorithrns described in the preceding section do not ensurc the feasibility of the reconciled solution. Sorne of tile ~ i o wrates in the r'inai solution may be negative or may have large unrL~sonablevalues. Furthermore, if reliable upper and lower bounds of both measured and unmeasured variables are known but are not imposed. the recorlciliatiori can provide a solution that violates certain bounds (infeasible solution). This very fact may be indicative of other undetected gross errors in the data or that the identified sross errors may be incorrect.

Step 3. Determine if the reconciled values for all variables in set T and set C are withrrt their prescribed lower and upper bounds. If all reconciled values are within the bounds. stor-ethe cun-ent solution and return to Step 2. Otherwise, delete the last entry in C, replace it hy the measured variable corresponding to the next largest value of I z,, I > Z,, and return to Step 4. If I z,,,~I 2 Z, for a11 remaining variables, delete the last entry in set C and go to Step 6.

Step 6. The measurements y,, j E C are suspected of containing gross esrors. The reconciled estimates after removal of these measurements are those obtained in Step 4 of the last iteration.


Exercise 8-2. Re~lriteAlgorithm 4 for a p m b l e n ~with both ~neasurcdand urzriieastr~edvariables solved by QR deco1nj3o~itiorz as described in Chapter 3.

There are several limitations in the manner in which bounds are treated in the MIMT algorithm as listed below: (1) The algorithm only checks for bound violations in the nieasured val-iables (more specifically only in the measurements of sets T and C ) . Bounds on unilleasured vaiiables are ignored in the algorithm. ( 3 )The algonth111may tenuinate even if the test statistics for sorne of the measurements exceed the test criterion, due to the modifications in Step 5 of the al~orithnl.In essence, the tnethod relies on bound infomlation at [he expense of the information provided by the statistical test. 13) As an extreme case, the algorithm can also terminate with all the test statistics below the test criterion, but the reconciled solution violating the bounds for sonie of the measured val-iables. This can happen if the test statistics of tlie iriitla! reconciied solu!ion cio not exceed the test critericm ir, Step 2 cf tlie algorithm. 133Since ;he n~ethodis based on sei-ial e1in:ination of measilrernenrs. it has :he :3an1e liinitatiei? as TMT of being appiicable only for- ideritifying biases in measurements and nclt other types uf gross errors. Rosenberz e! al. 131 prcposed thc errei~dedniecislri-cme~z!resr (EMT) and dy~zanzicrizeaszrrc-r,lerzf rest (DMT) strategies that also exn!oit bound infarmation in gross error detection. Strictly, EhlT cannot be classified as a serial strategy because it involves the sin~ultaneouselimination of two or more measurements similar to the simultaneous strategy using the zlobal test described earlier. The EMT strategy initially creates a cnndidate set of measurements suspected of containing gross errors by using the measurement test and selecting those for which the test is rejected.

From this candidate set, combinations of one o: more measurements are eliminated in order. Gross errors in the eliminated set of measurements are identified provided the following two conditions are met: (a) The measurement test for all I-emaining measurements are not rejected. (b) Reconciled estimates of all va~iables(including those that are eliminated) satisfy the bounds.

In extreme cases, if it is not possible io identify a set of measureme~lts which when deleted satisfies the above two conditions, rhen all measurements are suspected of containing gross errors. The DMT algorithm is very similar to MIMT except that it checks for bound violations in :he estimates of both measured and unmeasured variables (including variables whose measurements are eliminated). Moreover, DMT initializes the set C with the measurement having the largest MT statistic and enlarges the set C at each iteration by the measurement with the largest rejected MT statistic. For details of EMT arid DRTI' the reader is referred to the paper by Rosenberg et al. [3].These algorithms also have the limitation that they can be used for identifyin,0 meastlremen: biases only.

Algorithm 5 . Bounded GLk (BGI,R) Method [15] The above serial strategies suffer froin some limitstions described in the preceding section due tc the heuristic manner in which boui?d information is utilized in gross error detcction. If reiiabls bounds on varizb!cs are available and the reconciled estimates of variables are expected to satisfy these bounds, then it seeins more appropriate to inclode thzn: as inequality constraints in the date recc~nciliationproblem. Due to tl~ese ineqgality constraints, the solution to :he data reconciliation probleili cannot be o b t a i ~ e danalytically, cven if the model constraints are linear. A quadratic programming optimization technique has to be used as described in Chapter 5 for solving the resulting data reconciliation problem. (In general, if the model constraints are nonlinear, then a successive quadratic progra~nmingtechnique as described in Chapter 5 has to be employed [i5].) Despite the complexiiy, this method offers an elegant and theoretically more rigorous approach for including bounds in data reconciliation and gross error detection. Such an approach is used in the BGLR method as described in the following steps:


L ) ~ M~<:'co~~i.i/ir~~iv,, n11d C;roz Error I)creciiori

Step 1. Solve the bounded reconciliation problem including the bounds on both measured and unmeasured variables in the reconciiiation problem. (Use a nonlinear optimization technique described in Chapter 5.) Step 2. Identify the active constraints at the solution of the reconciliation problem. (The active constraints include all the conservation constraints as well as any bound constraints that are satisfied as equalities at the optimal solution.) Denote the matrix formed by the active constraints by A , and the RHS of the constraints by c. Step 3. Conlpute the constraint residuals, F, and covariance matrix of these residuals, V,using all the active constraints identified in Step 2 (use Equations 7-2 and 7-3 with A replaced by Aand c replaced by c). Step 4. Compute the GLR test statistics using Equation 7-30 to 7-32 with r and V replaced by F and respectively.


Step 5. If rhe maximum anlong the GLR test statistics T 5 X:-8 go to Step 6. Otherwise apply the simple senal conipensatio~istrategy using -4. r , and V, instead of A. r. and V. Step (1. Solve the bounded data reconciliation prob!ern using :he compensated r~leasurement:;and he co~npensatcdrnodel to obtain tllz rezoncilcci estimates of dl variables. The s:rategj described above is a simpler and generalized versicn of the original ~nethcdde.~elopcdby Nar-asimhan and Hdrikumar [1-5] ir? xhich sepsl-ate tests are applied to the var-ables which are at rhcir bounds (aim referred :o as the re.:-tric~eri ~~nriaOles) a11d :he other unrestricted variah1.-s. .4 few points are worih rneniioni~gwith respect to the above strategy. In Step 5 of the above strategy, it is implicitly assumed that the same set of constraints will be active in all stages of the scriai compensation strateg jl. I rle rneoretlcaliy correct procedure is to solve the boucded data reconciliatiol~probiem using compensated measurements and compensatzd model constraints at each stage after a gross crror is identified, in or-der to deternline the new set of active constraints. Since this will increase the computational burden significantly. it has not been used in the above strategy. If there is no limitation of computing power, however, then it is advisable to solve the bounded data reconciliation problem at each stage of the sel-ial procedure.

Mirlfipl<, G ~ n s rErt-or-Ide17rijic-urionSif-ufe~i<,s for Jlr
25 1

Secondly, in the BGLR method described above the simple serial cornpensation strategy is used for multiple gross error detection. Instead, the serial elimination strategy or modified serial conpensation strategy can be used as well. Lastly, the above strategy does not have the limitations of the MIMT and cther methods. The estimates that are obtained at the end of the procedure will satisfy the bounds, and the test statistics of all measurements will be below the threshold value (provided the bounded data reconc~liationproblem is solved at each stage of the serial procedure). In the extreme case, if an infeasible solution is obtained in Step 1, this indicates that the bounds imposed on variables are too restrictive and have to be relaxed. One important advantage that this method offers is that it can be used even when more complex inequality constraints (such as thermodynamic feasibility restrictions imposed on the temperatures of heat exchangers) are used. If the inequality consrrarnt is active at the optimal data reconciliation solution, then this is simply included as an equality constraint in the constraint set when gross error identification is applied.

Exercise 8-3. D e v e l ~ ptlze BGLR algoritl~r7zfor- the case ~ ~ h c (raz ) rlze boiinded dutc! reco:zciliatior?prz~blei?zis solved ufter each stage qf the serial coniper?.sariorzpr-~cedzrrz,(0) niodijied seriul cor?z~~eilsatiorl is used as the .srruteg~~,fi?r 111~1ri,ole g r o . error ~~ de!ectio~rirz~remclo f ~ e r i ~~0z1 ? 1 ( X ? l z ~a;?d ~ i i ~( cl ~) whcrz , ho;h chunges ( c )rind (b)are made.

Siinulation sc~ldiesconducted by different researchers j 1, 3, 151 show s detection that bound infomiation enhances the perfomianie of g r ~ s error strategies only if thc measured values are close to the bounds. Both simultaneous and serial algorithms described above are not limited to linear flow processes. If nonlinear constraints are involved, they can be !inearized befare applying any of these detection schemes. Gross error detection in nonlinear processes are discussed in greater detail later in this chapter.

Example 8-4 We apply the BGLR method to the process considered in the preceding examples. For this purpose we will assume that lower bounds and upper bounds on variables are specified as in Table 8-5.

T g h t bounds on variables I and 2 are specified to illustrate the effect of bounds on gross error detection. The data reconciliation solution obtained without using these bounds are shown in Table 8-1 Table 8-5 Lower and Upper Bounds on Flows of Heat Exchanger with Bypass Process

Table 8-6 Gross Error identification Using Bounded GLR Method

--. -


Stage k


2 3


Lower Bound

Upper Bound

1 2 3 4 5 6



60 30 55 30 90

67 40 75 40


The reconciled estimates violate the upper bognds o : ~streams I and 2. Therefhre, the data reconciliation problem is solved using the bounds and the reconciled estimates are obtained as [101.97, 57.00, 34.97, 67.00, 34.97, 101.971. It is observed :hat the upper bound or! strsam 2 flow is active at the optinial soiution. By includins this conztraint aloi~gw ~ t hthe flosv balsnces we zrt the expri!ided consti-aiiltnitltris arid RHS of coasti-aiints

The last of the above constrai~itsis the active upper boiliid constrain1 or, the flow of stream 2. C'sing the enlarged constraint set, we ccrrnpute the global test statistic which is equal to 46.338. This is greater than t!le test criterion of 1 I .07 and the (iT is rejected. The maxirnunl GLR test statistic is obtained for lneasurernent 1 and so a gross error is identified in this measurement in stage 1. MTecontintie this procedure by solving the bounded data reconciliation proble~nat e x h stage to determine the active constraints and the results are given in Table 8-6. Again only two of the 3 gross errors are correctly identified. The final reconciled values are (99.0, 64.7033, 34.2967, 64.7033. 34.2967, 99.01, which satisfy the booncis.


Active Constraints s

Global Test Statistic

1 (UB on 2) None l(LBon1)


10.873 2.61

DOF \, = 1 +s)



3 3

Test Criterion

Gross Emor

% t.0.05


1 ! .075 7.815 7.8 15

1 2


Combinatorial Strategies

There are several gross error identification strategies ~ h i c hrnake use of the nodal test. As pointed out in the previous chapter, the use of the nodal test requires a strategy even for identifying a single gross en-or. Since these strategies cannot be easily classified as either serial or simultaneous, we have chosen to categori~ethein separately. Most of these strategies are specifically tailored to Siow processes and cannot be applied to nonlinear processes. The basic principle used in the gross en-or identification strategy based on the nodal test was first proposed by blah et a!. [10]. If a g o s s error is present in any f l o ~ rrneasllrement. then !bib afficts rhc constrsint residaals in which the measurerneat occurs. 'Thus. wc can expect the nodal [?st for :he tivo nodes on which the correspo!lding stream is incident to be rejected. !I these two nodes are merged, however, then :hc correspondiii~ intsrcannecting stream is eliminated and the nodai Lest for this coinbination node (also called a pseudo-node) wiil ntost pi-(>hahlynot be reject~d. 11; order to exploit this principle, nodal tesL.i are co~iductcdor1 rht rcsiciua!s around singie nodes as well as coriihinr?tions of t ~ v oor inore r,oi!es which are corinected by streams. If a noda! test for any cunibinatiori is not re-jected. rhcn the fiov: measurenlents of ail streams which are incldent on the pseudo-node may he assumed tc) be free of gross errors. ! i i !!Inw !~ It should be noted that no de,cisions can be !i?sde c ~ ~ ~ ~ , - r r the measurements of strearns interconnectiiig any two nc3ries for-nliiig the pseudo-node. On the other hand. if the nodal test is rejected, then one or I I I U ~ C ~iit'astirelIieritsincident on the p.;eudo-node can contain grosh errors. No direct statement. howeve;; can bc made regardir~gany of these measurements. By selecting suitable coi:ibinations of nodes on \t.hic!i nodal tests are performed. it is possible to identify a set of measurements which are likely to free of gross errors. The remaining ineasurem'nts at-c suspected o t containing yross 21-roi-s.

hflrl:il~lcGI-OSS Err-or-Ide~~riJicariorz Stmrch.iesfur Sleud>--S:arcProcesses

Further screening of this candidate set can be performed using other serial or simultaneous strategies to identify the measurements containing gross errors. These strategies can also be used to identify leaks in nodcs along with measurement biases [ I , 71. We describe below the linear cornbii~ationtechnique (LCT) algorithm by Rollins et al. [I 81 which uses the above strategy.

Algorithm 6. The Linear Combination Technique [18] In the LCT method proposed by Rollins et al. [18], nodal tests are conducted on constraint residuals around single nodes as well as certain combi~iationsof nodes. In general, for any of these nodal tests, the hypotheses can be expressed as follows:

i-i,;: lTpr = 0 versus H,~: 1TPr# o


where p, is the expected value of the constraint residuals and I, is a vector of zeros and ones representing the linear colribination in the ith test. At a level of significance, H,; is rejected if:

If Kc,, i h not rejected, ali measuremer?ts of streams incident or, the psecdo-node are considered to be free of gl-oss ei-rors. After all 1i:iear combination tests are performed, two sets of n?easared variables are ct;tained. One set. SET]. contains variables whosc measuren?eiits are not suspected to contain gross el-rors. The complenientary set, SET2, consists of variabies whose 1neasureinent.i are suspected cf containing gross errors. Of course the algorithm may result in incorrectly classifying the rneasurements, giving rise to Type I errors (when good measurements are 2~ nllred placed in SET2), or Type I1 errors (when faulty rn~ssr~rementu in SETI). The chosen level of significance a for the tests plays an important role in balancing the two types of errors. In order to reduce the number of linear col~lbiiiationsfor hypothesis testing, Rollins et al. I dl adopted the following rules: a. Conduct nz nodal (constraint) tests on individual nodes. If HOnfor node (balance) k (k=l, . . ., m) is not rejected, no nodal test on combination nodes contatning node k is conducted.


b. A gross error in a small flow is generally difficult to detect and if it is implicitly assumed that such stream measurements d o not contain gross errors, then no nodal test on combinations of nodes connected by a low flow I-ate stream is conducted. c. No nodal test is perfomled on node combinations which are not connected. d. No nodal test is performed on a nodal combination containing two nodes which are connected by a stream whose measurement has been classified to be free of a gross error. The above rules are used to essentially avoid conducting nodal tests on pseudo-node combinations which do not provide any additional infonnation for identifying the good measurements. Mah et al. 1161 used a similar strateSy and made use of the above rules except rule (b). In addition, their procedure L L L . , ~ attempted to identify leaks it1 nodes as a last resort, when the nodal test for a node is rejected, but all the streams incident on this node are classified as good (placed in SETI). Serth and Heenan [ l ] proposed three different variants of the above strategy in one of which information about bounds on variables was also exploited. Yang et al. 271 used a combination of measurement and nodal tes:s in which the measurement test was used to identify an initial candidafe list of suspect measurements ahilc the nodal tests were used to cor!nter-check whether the suspcct ~neasurernentdoes contain a gross errDr. Although strategies based or, nodal tests reduce the nunber uf miupredictions (Typz 1 errors) as compared to serizi strategies [I], they suffer from the follcwing drawbacks: (1) :f nnitiple gross errors are prescn; in the dzta, dge to partial or comp!ete cancellation of these errors, nodai tesrs for node coinbinations on which these streams are incident ~ n s ynot he rejected resulting in i~lcorrectclassif~cation(Type !I errors). ( 2 )They are designed for linear flow processesand it is difiicult to extend t h a n to nonlinear processes.

Example 8-5 The LCT algorithm is applied on the data for the heat exchanger with bypass process considered in the preceding examples. Nodzl tests (using u = 0.05) are performed on single and co~nbinationof appropriate nodes and the streams are classified as shown in Table 5-7.

Table 8-7 Gross Error Detection Using LCT Algorithm --

Node Combination

NT Statistic


Rejected Rejected Not Rejected Not Rejected Rejected Not perfornled Stream 3 is good Not perfonned - disconnected Not perforrncd - disconnected Not performed - Stream 4 is good Not performed - Stream 5 is good

Measurements Clossified Free of Gross Errors -

fixed percentage of the true values. .4 specified number of gross emrs are added to obtain the measurement vector. The location of the gross errors are unifonnly and randomly selected over the set of measurc(l variables, while the magnitudes of the gross errors are uniformly ancl randomly chosen between specified upper and lower bounds. The sign of' the gross error is also chosen randomly. The magnitudes of the gross errors are constrained by

3 and 5 4 and 6 -


Other than measuremenrs I and 2, the rest are classifiec! as good and thus two of the three gross errors are correctly identified by IXT. Other collective methods used to simultaneously identif;, leaks and measurement biases and estimate their eri-or magnitudes have been recently proposed by Jians arid Qiigajewicz [l9].

PERFORMANCE MEASURES FQR EVALUATING GROSS ERROR IDENTCFICATION STRATEGIES The p e i f o r ~ a n c eof rhe serial e l i n ~ l n a t i ~and n compensation 1?1ethods described above can be compared by nteans of computzr siinu1:ltior: expi-rircen:~in which kno:xn errors are introduced into the dara and !he ability of the schetnes to icizntify and correct the erroys is evaluated. SLI~II colnparisons have becn trequen:ly reported ! 3 , 5. 6, 9. !O, i5, 18, 191. Since the error detection procedure is stochastic in nature. the per-fosmance is averaged ovel- a suitably large number of ;rials. A mini~nuntof 1,000 sirnulation trials is recommended. - PI-ocessvariables. for each simulaGiven the set of true xaliles f o ~all !ion trial, a measurement \.ector is generated as

where 6 is the vector of gross errol-s. Using random numbers from the standard normal disfribution. a vector E of random errors is first added to the true val~ies.For this. the standard deviations are ust~allytaken as home

where 1 is a lower fraction and u is an upper fraction of the random value (not including the gross error). For instance, 1 = 0.05 and u = 0.59. For the purpose of gross error detection, the level of significance a tor the statistical test is also required. This value is frequently chosen so that the average Type I error when no gross errors are presen: is 0.1. Different performance measures have been used to evaluate gross error detection performance 13, 91 as follows:

1. The overatl power (OP) of the method to identify zross errors cornealy is $yen by O-,rerall Number of gross erTors correctly identitied pca'er NumSsr of gross errors siinulated


The overail pcuyer is co~npu:eri only for si;:~~rla~ion trials ill wlrich gross erri;rs 21-esin~ulated. Roi!ins and Cavis 141 defi!led an o\~ri-a! as folioa s. which is a more conservative ines: L\UI-c. =

Wu!nt)i'r of ~iln11la.tio11 trials with perfect idetttification --. N~ilnberof sirnulation trials

(S - 16a)

By perfect identificiition, it is implied that all gross errors and their locations arc. correctly identified anti no mispredictions are ritade. Obviously, an ideal strategy is one which results in an OPF of unity. Note chat it is always possible to get an O P value of i by predictir~gall ~neasurernentsto contain gross errors. This is not satisfactory hcc.ause too n~artyinispr-edictions a!-e also made in the b a r ~ a i n .111 tlii. case,

Multiple Gross Error Idenlifi~.ufronSt~-alcgies for Steady-State ProCesSeS

however, the OPF value will be zero if some of the ~neasurerne~lts do not contain gross errors. Sanchez et al. [ 6 ]have modified the definition of O P E taking into account equivale~zcyof gross errors, and have denoted their performance measure as OPFE. This measure is also conzputed like O P F except that "perfect identification" is interpreted to account for equivalency of gross errors. Perfect identification of gross errors is achieved, if the set of gross errors identified in a simulation trial belongs to the same equivalency class as the set of gross errors actually present in the measured data. This also implies that no mispredictions are made in the simulation trial. 2. The average n u m b e r of Type I e r r o r s (AVTI), which defines the number of incorrect identifications made by a method, is given by


Number of gross errors wrongly identified Nurnber of simulation trials made


The measure AVTI is cornp~ltedsepal-ately for each simulation run. whether or not gross en-ors are simulated. The interpretation of " w r o t t ~ iclentificatio~iz"car1 bz suitabiy made after taking eqvivaienc): of Fross errors into consideraric:;. 3. A~icthermctasi1r.e of pe~fo!-marrc~ is the selectivity, defined as Nui:lbtr of pros.; errors corr-ectlyidentified Sc!cciivi!y = Tom1 rlun~bzr-ol' gross errors beicc;ed


IL may he noted that the denominator includes only ! h ~ s csirnukition trials whel-e a gross error is simulated. 1.Average e r r o r of estimation (AEE) is the f o u ~ t htype of performance measure. It gives the accuracy of estimation of the bias magnitude on methods, where the average. It is used to compare serial co~npe~lsation estimates of the gross emor magnitudes are also PI-ovided. AER =



estimated value actual valr~e actual value -

(8 - 19)


COMPARISON O F MULTIPLE GROSS ERROR IDENTIFICATION STRATEGIES Different simulation studies have been conducted comparing the performance of gross error detection strategies. Among the different strategies proposed for multiple gross error identification we would like to determine the best, in terms of the measures described above. Unfortunately, the performance studies conducted so far d o fiot provide a definite answer. Nevertheless, some conclusions can be drawn from these studies. Before making a comparison it is important to ensure that different strategies are compared on the same basis. Since it is always possible to improve the power of a strategy at the expense of a greater Type I error probability, it is important to ensure that the different strategies have the same Type I error probability, so that a judgment can be made based on their power and se.lectivity measures, etc. Secondly, it does not matter whether the GT, MT or GLR test is used for detection because the same performance can be obtained with all these tests by using the same component strategies for single and multiple gross error identification. This follows from the equivalency results between , that these three tests proved in the previous chapter H o w ~ v e r strategies make use cf the nodal tests or principal component tests are distinct and cannot be combined with other tests. Our comparison is focosed on the r gross error detection. component strategy used f ~ multiple Since audified serial compensation is identical to serial elirnirratior? and aiso has the ability to handle gross errors other than biases, amar:: tl~csetwo only MSCS needs to be considered. I
egy proposed by Rosenberg et al. [3] was compared with szrial elimination and was found to perform equally well. However, in this comparison the equivalency between different gross errors has not been taken into consideration properly, since the effect of equivalency of gross errors was not completely known at the time of their study. The simultaneous strategy based on testing all possible gross error combination hypotheses has not been evaluated in any study so far, since it was considered to be computationally intensive. A cornparison between LCT and SSCS was made by Rollins and Davis 141 and they showed that LCT performs much better especially when standard deviations of measurement errors are small and when a large number of gross errors are present in the data. However, since MSCS is better than SSCS a comparison between LCT and MSCS is more appropriate. A major problem with LCT is that at present it is applicable only to linear flow processes. The strategy based on principal component tests was recently cornpared with MSCS and other strategies by Jiang et al. 15).Their study is the only one where equivalency between different gross error sets is properly taken into account before computing the performance measures. Tneir study did not demonstrate s u p ~ r i o performance r of PC test strategy. This is alsc confirmed by recent studies by Jordache and Tiltcn [I21 and Bagajewicz ct al. j13j.

GROSS ERROR DETECT96W IN NONLINEAR PROCESSES All statisrical tests and serial eliminakion c r scrial compensation tech11iquespresenred in this chapter call be used for detection afid identification 3f gross errors in processes described by n o ~ l i n e a rmodels. T i e iisua! procedure is first to perfom1 a linearization of the process model followed by an identification method designed for linear equations. Typicaily, thz measured data are reccrziIec! m ? e r rflc ::scmp:icr, :!:z: r.; gross errors are present and the cwctraint equations are linearized aronnd thc reconciled estimates. This strategy. although very popular, may not be suitable for highly nonlinear pmcesses with data corrupted by significant gross errors. The reconciled estimates which are obtained based on the assumption that no gross errors exist in the data may be far from the true values and the resulting linearized model may not be a good approximation. Consequently, chis approach may fail to identify the true gross errors. There is

no guarantee of a successful gross error detection and identification even for pure linear models. Nonlinear processes are much more complex, and they require special methods of gross error detection. One step forward was provided by Kim et al. 1201. They tailored the MIMT serial elinlination algorithm to fit the nonlinear data reconciliation problem. Their enhanced algorithm differs from the MIMT algorithm in two ways. First, in Step 1, the data reconci!iation problem is solved using nonlinear programming (NLP) techniques. Second, in Step 5 , the reconciled values and the measurement adjustments are also calculated based on the nonlinear solution. The variance-covariance matrix of the adjustment vector, however, is calculated from a linearized model as proposed by Serth and Heenan 1711. Their method, tested on a CSTR reactor model, showed superior perfomatice in comparison with the MIMT algoritllrn used with successive linearization, especially when the number of gross errors increases. The NLP solver is more robust and provides more reliable estimates Ihr reconciled values and gross e ~ ~ o lwhich -s enhances the perforinance cf oross error detection and identification. If large gross errors exist i!i the data. however, they need to be screened out prior to application of this technique. Moreover, the compu:aticxlal tin;= can be 7ery high for iargescalc industrial problems. A new swategy for detection of gross eri-0:-s in noniit~earprocesszs \ V a q recently proposed by Renganathan and Narasiinha~?122). In thei~apprcach. a test strategy analogous to the GLR method was proposed that does not require li~earizationof the constraints. .4 brief ciescl-iptior, of this nzetl~ud f0:lows.


Gross Error ldentificatisn in Nonlinear Processes

Using Nonlinear GLR Method It was proved in the preceding chapter that the GLR test statistic for a measul.eme11t i is identical to the difference in the 011jeciivefi:17r!im(OF) values of two data reconciliation problems. one of whlch assumes mar n ~ ) gross errors are present, whereas the other assunlctb ~iiacii gross en-or is present in ~neasurementi. The formulation of the data reconciiiatiorl problem when a gross crror is assumed in measurement i, was also described in the preceding chapter (Problem PI iri Chapter 7j. For identifying a single gross error in the GLR neth hod, the maximum difference between the OF values over all the gross errors hypothesized is obtained. If this difference exceeds a critical value ther, a gross crror is

detected and is identified in the variable which gives the maximum O F difference. In other words, the gross error model that gives the minimum least squares O F value is selected, which means that the gross en-or model that best fits the observed data is selected as the most likely possibility. For nonlinear processes, a gross error detection test can be obtained by applying the above principle of GLR test, that is, the test statistic is obtained as the maximum difference in O F values between the no gross error snodel and the gross error model for variable i. The test statistic T is given by

where T, = O F for no gross error model -

O F for ith gross error model

The OF for no gross error model is obtained by solving the standard nonlinear- DR problem as formulated in Chapter 5. The OF for ith gross error model is obtained by [lie solving the nonli~?earDR problesn analogous to Pioblern PI descr-ibcd in the preceding chapter, obtained by s i n l ~ ply rep1acir.g :he linear constr-aints (Equation 7-1) with the nonlinear constraints of the process (Equations 5-8 and 5-9). These noclinear DK problems a r e soiveci u s i r ~ gnonlinear programming techniques a s described i i ~Chapter 5 . Thc test statistic is compared with a prespecificd threshold (critical vaiuej T,, a i d a gross ever is detected if 7 exceeds T;.. ?'his means that the corresponding gross srrcr mode; Sesi fits :he data 2nd so the variable corresponding to that gross elror model is identified to bc biased. 'The magnitude of bias b is also obtained as part of the soluticn of Problem PI. Although no statistical reasoning can be given for the choice ~ c of linear of rhe critics1 valkc, iiic 52111~L G A ( C l iterion cia ;I! ~ i case processes can be uxc3. This may be adjc,:ed by :rial and error using simulaticn if it is desired to obtain a specified Type I error probability. It sli0~11dbe noted that, iri this approach, the nonlinear constraints are treated as such and not approximated by a linear form. Furthennore, bounds and othel- inequality constraints can be included as part of the constraints. and the gross error detection test can still be applied as described above



since it uses only the optimal objective function values. For ease of refel-ence, we denote tinis test as a 17onlinearGLR test (NCLR). The NCLR test described above can be applied to detect at most one gross error. However, this test can also be combined with any simultaneous or serial strategy for multiple gross error detection described in this text. As a particular case, we describe the use of NGLR test along with MSCS strategy for multiple gross error identification caused by measurement biases.

NGLK Test wit17 Modified Sei-ial CoinperzsatiorzStrrrregy (MSCS) In MSCS, at each stage of application of the test, only the locations of previously detected gross errors are assumed to be col-rect. but the estimates of all the gross error magnitudes are assumed to be unknown and are therefore estimated sin~ultaneously.This process is repeated until no further grcss errors are detected. Applying this sirategy along with the NGLR test, we obtain the test statistic at stage k + l as in Equations 8-21 and 8-22, but the O F values at stage k+ I fcr the no gross error snodel and gross error model for variable i are, respectively, obtained by minimizing the followirlg objective functions, subject to the nonlinear constraints given by Equations 5-8 and 5-9.

w h e ~ eb, is a X vector of unknown biases and EL is 3 matrix whose columns are the unit vectors. The jth colurnn trector of E, has a unit value in the position coiresponding to the measureliient in which a gross error was identified in stage j for j = 1. . . . k. Tile minimization with respeci to the unknown masnitudes o f gross error magnitudes, bh, for computing the objective function values implies that only the locations of gross errors identified in the previous stages are assumed to be correct. Their magnitudes, however, have to be estimated simultaneously along

average vector y of the N observations satisfies the following measurement model:

with the gross error hypothesized in the present stage, which are actually the premises for the hypotheses in MSCS. For highly nonlinear processes such as reactors, Renganathan and Narasimhan (221 demonstrated that the NGLR method gives better performance than methods which rely on linearization of the constraints. 4,


BAYESIAN APPROACH TO MULTIPLE GROSS ERROR DETECTION The major PI-oblem in gross error detection is how to enhance the power and selectivity of the test without increasing the frequency of false detections (Type I errors). One way to enhance the power and selecti\iity of gross error rests is to use the information from the past data end, particularly, the frequency of past failures on measuring instruments. To incorporate historical information 011 measul-ing instrumentation, we can make use of the Bayes theorem in statistics [23].A gmss enor detection and identification procedure for measurement biases based on this approach has been developed by Tamhane et al. [24. 251 for t e a d y slate processes. The usual detection techniques for steady state processes are applied to the snapsiiots of da:a or- averages ol data collected within a time period. But if a significant instrumenr failurr occurs within the dzta collecticn period, the s:aiisiical test:; using averages of data may not be able to capture that failure. Eve11 if eventually they will capture that failure, it might take a long time ur:til an i11strun:ent problenl is detected by a st~tisticzltest. The new Bayesian apvrcach is not a one-time application, as with t!ie p r e u i o ~ s:;teady-state tests or cornhinatlon (if tests, but rather a seq~~e~liicd a17yliccrfic~rzthnt iricor-pora:es histolical data collected and updated over time. The sequential approach raise: more questions than the hlatis!icai tests using data averages. For instance. how often are the instiun~ents inspected to confinn the occurrence of gross errors? If a gross crror is c u ~ l ~i~cci, L ~ how S O O I L is ihe instrument repaired? 1s it reasonable to assrme that the repair is pel-fect, that is. the instr-amznt will be as good as new'? Should the model include a frtctor for the aging of instruments? Incorporating all these issues in the sequential application framework makes the Bayesian al2o1-ithni a 1;1uch Inore challenging task than the previous gross error detection and identification algorithms. To simplify the description of the Bayes test for gross error detection we will first assume a one-tirnc application of this test. We consider a relatively shol-t msssusement pcriod o f R' consecutive observations. T t ~ c



where: x is the i z x 1 vector of tnie values, E is the n x 1 vector of random errors and 6 O e represent a vector formed by products 6,ei(i = 1 . . . n); 6, is the magnitude of a gross error in measurenlent i, and ei is unity if a gnxs error is present in measurernent i or zero otherwise. We assume that a gross error can occur in any measurement i only at the beginning o f the measurement period. The vector x is assunled to satisfy the following linear constraints:

Note that if a nonlinear model is used, a linearization of the-original model should be first performed. The following assumptions \+ill also be made for the one-time application of the Bayes test: a. Vector & follows a multivar-iate normal distribution N(0,Q) with known covariance matrix Q = (l/N)C, where C is the covariance matrix cf the individutil data vectors. b. A gross error in any measurement i is a kfiown constant rrta~nitude. say c. The values of prior probabilities of instrument fsilures zre kno-,vn. UJe assume that each element 6, of 5 is modeled as a Bernoulli rat:dorn variable taking on values 8: and O with probabilities p, and ! - p!, respectively (i = 1 . . . n). If &,=by,i e I and 6,=0. ip I. then the comesponding gross error vector is denoted by 5,. There are 2" possible states of natcre where 6, rangcs from (0, 0, . . ., 0 ) for the staie with no gross error, to (Si. Ej2. . . ., for the state with zross errcrs occusring in every measurement. ci. Gross errors (cr instrument failures) in different instru~nenrsocc~lrindependently of each other. Then the prior probability that 6 = 6,is given by



We will refer to the xi's as the group pr-ior probabilities. The Bayes theorem is zpplied to compute the gt-oup posterio~-pr-obahilit~%, of 6,,


Dora R~concjljnrior~ arrd Gt-ossEr-/-orL)erccriorl

[24, 261 that the choice which leads to a maximal d i ~ n c n s i ~ ~ ~ l mation gives rise to the following Bayesian formula:

given the group prior probability .nl of 61 and the measured data. A general Bayes formula for posteriors is given by

X,exp - 0.Xy - 6 , ) ' ~ ( y - 6,)}

Tc, =




{o.s(~- 6, ) T ~ ( y 6,))


where f(datal6J denotes the conditional probability density function (p.d.f.) of the given data given that the true state of nature is and the summation in the denominator is over all 2" subsets J. The Baycs decision rule for identification of the most likely state of nature is the following:

where U' is the covariance matrix of the vector d of modified meaurement adjustments (see Equation 7-15 in Chapter 7).

The most likely state of nature 61 correspoizds to the inaximuin posterior iiz Equation 8-27. Therefore, if ?r; = max E, then the rneasurernents it- I* are declared in gross error. If Ix=0, then all measure1nen:s are declared free of gross errors. From the identification rule above, we can see that the Bayesian ;~pproachf'alls into the category of sinzultc!neous strategies f ~ gr r o i er-r?)rdt.tection U I I icietztificntioi?. ~ "TOSS eIAIOIS. Since it also assumes knowledge about the magnitude of = the Bayesiari strategy is close!y related to the serisi compensaticn stratcgies based on GLR test. In fact, for equal prior probabilities of gross error occcrrence p, for all measurements i = ! . . . n, Fornlula 8-17 hecornes simiiar to Equation 7-27 used to derive the GI,R zest. There art two differences though: (i) instead 01' direct nleasured da:;i )-,a 1ir:s:litra1:sfr1rm2tio11 gf vector JJ is used far the GLR tes: (namely the r e s i d u ~ ~ l vecfor r = Ay) and (ii) rhs deno~ninatorin Equation 8-27 is a surnnatian over all possib!e stares of nature 6,rather than the state cf nature So coi-respotidi~gto the case of no gross error as in Equation 7-27. Note that in Equation 8-27 we cannot use v directly for dnta because its p.d.f. involv-,~the true vector x which is unknown. What is required is a transformed vector Cy such that


where t is the index for time periods: y(t) is the average of h'Z data vectors observed during time period t ; &it)is the vector of rantlon; erro1.s assutn?ci to foilow a nlultivariate normal distribution Ar(0.Q); 6 @ e (t-I) repi-esent tke vector of gross cn-ors present at time ( : - - I ) , i.e., at the beginning of measurement time period i.


(i) the p.d.f. of Cy is free of x and depends only on 6: ( i ~the ) covariance matrix C Q C is nonsingular. Equation 8-25 indicates that matrix C must satisify the condition. C = MA, for s o ~ n e1i7 x ti7 matrix M. But C is not unique. It can be shou n


Next, we will present the sequential application of the Bayesian test, which is the desired implementation of the Bayesian strategy for gross e n o r detection. The sequential application of the Bayesian test enaoles continuous updating of the prior probabilities of instrument failures for better gross error detection and identification. The measurement model is now time-dependent even though the underlying process is steady state:

Initially we assume tha: the occun-ence of gross errors are independent Hctnoulli ranciorn veriables with a constant (with respzc! to time) failure rare Oi for the ith insiiunen:. In other words. the probability that the iih instruxent fails (in the sense of a gross er-ror occurring in the measured value) a: the stalt of any given period t is the same, namely. Oi, and t i e faiiure oi' a given instrument ir, different tirlte periods are independent. For a given 6i,the conditional probability that instrument i is in a failed scale at time 1-1 is given by

where z,(t-i) is the time since the last check on instrument i. Note that 0, is quite different frorn pi(i-I), the probability that the ith instrument is in a failed state at time f-I. The instrument may be in a failed state at time t-I .p

current data than the prior infonmation and vice versa. By choosing 8': and si, 1';' and m can be estimated from Equation 8-33 above. The Bayesian formula can be used to compute the posterior p.d.f. of 8,. given its prior p.d.f. and the conditional p.d.f. of the failure data.

if it failed in any of the previous time periods and has not been repaired. To co~nputep,(t-1) required in Equation 8-26 which is used to calculate the Bayesian test (Equation 8-28) we assume a prior distriburion on each 0, and compute p,(t-110,) with respect to this prior distribution. This approach has the capability of using the past instninient failure to update the prior distributions on the 0,'s. Independent beta distributions 1271,

'(1, fI(0,) = -



f, (0, I failure data) =


+mr) o;j-~(l I





g, (failure data I Oi ) = $;(I - 0,)"

1 = ,,;KJ;



A!thougli not expliciily stated, the following assuniptions h2ve bzen also ma& so far in the Bayesian cladel:

i = l ...n t


n-11e1-cni is tliz number of past fxi!ures for iristrnmen~i and t'j' is tile lifetime of insirunizr;t i for the j r l i failure, that is, the ~iurnbcrcf time periods between its jtti failare and ij-!)ti1 f-:?iiures (or t ' y for j = l ) . A method of chocsinz initial values 1'JJ)and rnf;'was proposed by C<)lornbo and Colistantirli [?XI:

Parameter I < s, <: 3 enables the user to select which factor is rnore itnportant in the estimation of the prior 0';); a large value (close to 2) yields small \salues of 1'':' and i l r '? ~vhichmeans that mot-e weight is given to the

(8 - 3 5 )

In Equation 8-35, gi is the probability that instrulnent i lasts (was not in failed state) for exactly .ri time periods. The p,(t-I) (required in Equation 8-26) is computed by



(8 - 31)

i l g i(failure data I Oi )f (0, )dB,

can be used conveniently for this purpose, because they are conjugate priors with respect to the geometric distributions which are followed by the instr-ument lifetimes (see Equation 8-35). In Equation 8-3 1 above, I-(.) denote the gamma function, and 1, and rn, are two parameters with the following interpretation: 1, is the number of previous failures for instrument i ancl nil is the sun1 of previous lifetimes for instrunlent i; the ratio l i / (l,+tni)is the mean of the beta prior distribution. denoted by 0;. The ptlra~neters1, and m ,are updated using data on past failures of insti-annents 135)2s follows:


gi (failure data 1 Oi )f, (0, )



(i) The mag~~itudes of gross er-I-orsat-e known z?nd conita:lr values 6(ii) The insirument fa~!ur-eprobabilities are indrpendent :if i!lstrun~erltazec (iii) Checking and corrective actions arc il~:~iisdiate arid per-l'cct Thew assun~ptions,however, usually do not lloid i n real life. In rhz nexc section we will show how these assunlptioris can ?>erelaxcd in Chis ~.xacticalirnpleinentation of the Bayesian strategy. First. the magnitudes of rlzz glnxs el-10:s ( - ( I ~ I be .seqriei7ticzll~~ r / ~ t i t r ~ ( ~ d based on past data rather than considering them known constants. There are various ways of estimating the magnitudes of Zmss en-ors. 0r;c ~~izrhod was presented in Chapters 7 and 8 in connection with the GLR test. Arxther procedure was suggested by Romagnoli [29]. Iordache [26] pr-oposed a simplified c net hod whicli makes use of the modificd acijusrmc!lt \2i.ctor-d

defined by Equation 7-15. This method, though, is suitable only for linear constraints as given by Equation 8-25. The expected value of vector d can be obtained by combining Equation 7-15 with Equation 3-8:

If we approximate E(d) in Equation 8-37 by the observed vector d = Wy, then the vector 6 of gross errors can be estimated from the equation


(8 - 38)

Note that in Equations 8-37 and 8-38 the vector 6 is actually 6 Q e (t-1) described in Equation 8-29. Furthermore. matrix W is generally singular, therefore a least-squares solution should be nhfqined. One way is to use a Moore-Penrose pseudo inverse of W. This solution, which also involves a singular value deconiposition of matrix W, provides a minimum Euclidean nomi of vector 6. that is. it minimizes 6'6. The solution is unique [30. 311 and can be written as:

Kote that, i n clrder- ro obtain a meaningful so!ution. only the estimates for thc 6,'s associated with the measurements declared in gross error by the Uayes test are upda~ed.T;ic.refore, ali e:'s except those corresponding to Inzasurcrn2n:s suspected ir! grcss error ill the previous step (t-1) are zero. Secondly. the hilir?-e;ure.s 9,'s rvill l i o T be coizstn~lt,but will increase ~viththe azes of the instru~nents.Let Oi(Ti)be the failure proSability for ins!rl;~?ienti whcn its sct:~ai age is T,. The followirlg model can be uscc; S9r 8,(Tj) [2?];

w1le1-e0 < 8,(1) < 1 and PI 2 0 are given constants. If Pi = 0, a constant '1 me . rate model is obtained. For PI > 0, the failure rate increases wit5 age O,('I-,)i1). Note that the model described by Equation 8-40 has (as 'l-,+-. not heen inlplemented in the Bayes test yet. because it is rather complicated.


It was only used to simulate gross errors based on the aging function for 0, [23,24]. In the Bayes test, Equation 8-28 constant Oi is still assumed. Thirdly, delays in checking and impeqecf col-rcctive uciions can also be taken into account. Immediate instrument checking after gross error detection, followed by a corrective action is not usually feasible in practice. First, because of inherent Type I errors. we may want to verify over a longer period of time the consistency in the gross error detection. A simple rule such as 213 (2 out of 3) can be adopted; that means that a gross error should be detected by the Bayes test in some particular measurement i at least twice out of three consecutive time periods. Second, even with sustained evidence of gross errors, the operators may want to postpone the instrument checking and correction at a more convenient time (for instance, at the end of the shift or at the scheduled maintenance time). Until then, a gross error is assumed detected, but not corrected in the instrument i. Therefore, the parameters I;, m,, and Qi are not updated until the instrument was checked and found to cause a gross error. But the magnitude of the gross error 6i will be updated continuously until the instrument has been repaired. Based on the above assumptions, a Bayesian gross error detectien and identification algorithm can be imp!emcnted as follows: Step CI. /!zirializrrfiot~. At the beginning input the following information: (i) Constraint matrix A (or the model tc be linearized), covariance matrix C, and the number of data Tjectors per sampling period, N. If average data of N data vectors is a s e d , the ccvariance niatrix Q=( 1/N)X will be uscd instead of C. (ii) For each n1easuren;ent i =i . . . i i , enter the following initiai estimates: 8';. 0':. l'y', and ,nini. Note that 1': and I,,' j') can be initialized from 0 ' : and a pxarneter i < s, < 2.Equztion 8-33; 6'!' can be initia:izrd as constant number of standard deviation, i.e., cio.;o, . (iii) Set the ages ~ ~ ( to 0 )a11 instruments i equal to I (fresh instruiiients). Set the time period t equal to 1 also. Step 1. Rc~ndi17 rl~c.hrdmtrr vcJctot-tfor period r and corl~putethew average vectol ? (t) Step 2. Ccrlc~tlrrtethe prior pt-obubilities p,(t-1). Equation 8-30 and the group priors 7 t I , Equation 8-26.

Step 3. Calculate tlze posterior probabilities x I , Equation 8-28. Note that, if all possible states of nature for the vector 6 O e are considered, the computational time for Steps 2 and 3 is exceedingly large. There are two ways to reduce the computational time: (i) Since the denominator for all posteriors is the same, only the nurnerators should be computed; we can even calculate the natural logarithm of the numerators, thus avoiding the computation of the exponential functions. (ii) The number of states of nature for the vector 6 63 e can be reduced to a much smaller subset, by the following strategy: Calculate first the posteriors associated with the states of nature involving only one e, equal to I at a time (single gross error case) and the 6$ case (no gross error). Select the measurements corresponding, say, to the top 25% of posterior values and calculate the posteriors of combinations of at most three of those measurements (we assume that at most three gross errors can sin~ultarleouslyexist). Other strategies for reducing the number of hypotheses to be tested 1181 can also be adopted.

Step 4. Cecisiorz. Ficd the rriaxi~nningroup posterior, %;= max ?,, among all selected combinations i. !f the c:)rresponding set Ix is empty. n o groxq error is detected. Othe-mlise, the measurements in set I* are suspected in gross error. However. the iinai decision is delayed. For iilstance, if rule 213 is used, a ineastuement is declared in gmsz error when detccted at least twice out of three consecuti-ie time p e r i ~ d s The . same is trite for the n o gross error case. Step 5 . Actiotz. If the current time t corresponds io the scheduled inspectior, time for the instru~ne~ltation. the foliowisg actions are assuriled: (i) Check the instruments in set I* detected a: Step 4. Note that insrri!meat checkins 1nav hc delayed. Iil that case. I7 is the set of all instruments declared in gross error between two consecutive inspections. (ii) Decide which ones are actually faulty, say subset 1'. (iii) Repair or replace the instrumsnts in set 1'. (iv) Update the age of instrutnents l', i.e., t,(t)=i for a true failure followed by a corrective action.

Step 6. Reestinzale the magrzitudes of the gtuss ei-t-or-s,Equation 8-39. Only the magnitudes of the gross errors for the measurements in the detected set I* are reestimated. Note that. due to inherent 'Type I en-ors, the magnitudes of s a n e falsely detected el~ol-sare also reestimated. But, if averaged data is used, their estimates should be much smaller than those for the true gross errors.


Step 7. Reestimate the parameters 1 and mjt)for beta prior distribution of 0; (i=l . . . n). The associated parameter 0 is also reestirnated. by the ratio 1, I (li+m,).


Step 8. Set titne period r = t + I and return to Step I A sequence of failures (gross error occurences) for a certain instrun~ei?t i, followed by detection and correction actions according to the Bayesian algorithm described above is shown in Figure 8-1. More details about the sequential Bayesian a1go1-ithmcan be found in Tamhane et al. 124. 251. Tordache [26] perfonned a comparative evaluation of the Baycsian algorithm with a similar strategy based on the measurement test. using siniulation runs with 10.OC)O time periods. The perforrnacce criteiis med were the probability of Type I errors. the power of correct deiection and the average deldy time before a iorrczt detection f(jr the sarne averege number of Type I errors. In gerterai, the Bayesian method outperforms the rneasurerccnt test i n the follo~vingsituations: high Frequencies of gruss error occurrence (1ns1riple gro5.r errors), large spread in the magnitudes and frequencies oi' srcss error-s. arid 10i1: delays in cor~i'irrntrtior?and re;~ai!-s.O n the cither hand, the 3ayesian nlethod conve~-gesvery slowly. S!arting with ini~i:tl g u e cf ~ e, equal to 33% or 300% of the tlue va!ue. a large nuinher of observed faiiures (on the order of 100) is neecied befor-e 6 , converges. Therefore, accurate initia! ehtimates of 8,'s a - e needed before the Baye.iian method may be put to plec!ic:l! I!:?. If there Is uncertainity about the prior estirriates of H,,'s, one strategy is to placc :nore weight on thc current data until more historical data i.: obtained. The perfoi-inance of the Bayesian method is nluch less dependent on the esri~natesof6,'s. More work is required to rilake the Bayesian appi-oach rsally competitive with other gross error identification strategies. If all implementation details are clarified, it will become an appropriate strategy for online applications.

hydrocarbon streams (A and B). Five heat balance equations can be ernployed in order to describe the system. The first four are obtained by equating the shell and the tube side duties. The fifth is an HTF energy balance. The process operating values and the measured variables are given in Table 8-8. Standard deviations of the measurement errors are also listed in Table 8-8. The units for flows are bbllhr and for the temperatures degrees F. Densities, in Iblbbl, are: HTF: 290, A: 300, B: 320. The enthalpies, in Btullb, are related to temperature by the following equations:

12E(?ENL> F Fail~~re CD - conrct detection El) - falhe detect~on CK checking and repair C - chcck~ng(no rcpair) D, - detectioii delay


HTF: H = 0.3233 + 1.14E-4T2 Am: H = 0.424T + 2.80E-4T2 Solve first the nonlinear data reconciliation to obtained the reconciled values fcr the case with no gross errors (the data in Table 8-8 is free of

Figure 8-1. A sequential Bayesian failure detection process

PROPOSED PROBLEMS NOTE: The proposed p r ~ b l e n l sthat c r e irzclzcded irz this chapter requite more cxterzsitv calculrrfiorzs. A cotnprrrer program or a matlzernatical tool such cs hlAT1J.B iz required irr order to get sol:itin~zsto ?It~se prob!etrzs. Problem S-2. Ilse the the stearii metering system for- a rnethanol synthesis unit represented in Figure 7-2 and the dat; indicated ili Prob!em 7-2 to siiilulate multiple gross errors. For faster solvir?g and a Iliore clear analysis, at most three simultaneous gross errors are recor~~rr~endeci. if possible, apply both serial elimination and serial compensa;i,,ii strategies (with appropriate statistical wsts at w.=0.05 level of significance) and perform a compar-ison of results. Gross errors of various magnitudes and in locations with different detectability factors are recommended. Problem 8-2. A section of a heat exchanger train froni reference 1321 is showt~in Figure 8-2. A 11ccrt t t - ~ i r 7 s f c ' r -flrrirl (HTF) is used to heat two

Figure 8-2. Heot exchanger network. Reproducedfrom reference 1301 with permis sion o f Gulf Publishing Co 6

gross errors). To speed up the calculations, successive linearization is recommended, but any NLP solver can also be used, if available. Next, change the measured value for flow F5 from 312.9 to 362.5 and for the temperature T8 from 662.0 to 669.1. Apply appropriate gross error detectior and identification strategies to find the location of gross errors and their corrected values. To compare the results, see reference [32]. Table 8-8 Data for the Heat Exchanger Network in Figure 8-2

F1 F2 F3 F4 F5 TI T2 T3

T1 T5 Th 1'7 T8 T9 1i ( j 1'1 1

TI1 Node Nc.

SUMMARY For detection and identification of multiple gross errors, sirnultaneous or serial strategies have been designed. Two major types of serial strategies exist: serial elimination and serial compensation. The sin~ultaneousstrategy for mutiple gross errors based on measurement tests is simple, but it usually detects too many nonexistent errors. The simultaneous strategies based on a GLR test require testing of multiple hypotheses and take significant computational time. A method of selecting the most likely hypothesis is required for these strategies. The simultaneous strategy based on a GLR test is equivalent to serial eliminatior? strategy based on a global test (elimination of one, two, three, and so on measurements). *The ~ t ~ a j oadvantage r of serial elimination is that it does not require prior knowledge about the location and magnitude of gross errors. It mizht take significant computatioaal time, howevel-, and co~cldsuffer a decrease in the solution accuracy due to the reductic11in redtlndaacy if too many measurements are eliminated. A certain rcodificd serial cornpensarion st!-ategy is equiiralent to szrial elimination. The ~;iaxi!nzmnurnber of nieasurecnents that can be eliminated is equal to the nur~ibercf constraints (or reduced constraints, when ur:measured variables cxist). For all gmss e m r detection strategies. rhe global test shoz:d be app!ied f:rst (to avoid ur?ni-ccssarycalculation of statistical tesis, when no gross errcr exists). *The SSCS algorithm may detect many nonexis:eitt gross errors, ~ C C ~ Lof I Spossible ~ wrong conlpensation. The MSCS algorithm is equivalent to the serial elininatio:: stratezv usicg the iterative measuremen: test (iMT) for detection and identification of measurement biases. The MSCS, however, can detect leaks as well. L .










! 1


Multiple Gross Error l&ntiPc(irir,,~S ~ r a t e ~ i e s f oSleodj-Starc r f'rocesses

Dufo Kccorrciiiurin~~ urrii Gross F - n ~ rDcrectror~ .

7. Yang, Y., R.Ten, and L. Jao. "A Study of Gross Error Detection and data Reconciliation in Process Industries." Computers Ch;lcn?z.Engng. 19 (Suppl., 1995): S217-S222.

SUMMARY (continued)

8. Ripps, D. L. "Adjustment of Experimental Data." Chenz. Etzg. Progr. S y ~ n l ~ . Series 61 (no. 55, 1965): 8-1 3.

Bounds on variables enhance the performance of gross error detection strategies only if the measured variables are close to either bound. Combinations of tests (e.g., nodal and measurement tests) can be successfully implemented but require a strategy for reducing the number of test hypotheses. Strategies based on combinations of tests cannot be easily extended to nonlinear models. The MIMT and MSCS strategies can be applied to nonlinear models (with some modifications). One way to enhance the power of detection and identification of a gross error is to include prior probability of instrument failure and a prior estimate of the magnitude of the gross error and apply a Bayesian t y p ~test. The prior iafomation can be continuously updated by a sequential appl~cadonof the Bayesian algorithm.

9. Narasimhan, S., and R.S.H. Mah. "Generalized Likelihood Ratio Method for Gross Error Identification." AIChE Journal 33 (1987): 15 14-1521. 10. Keeler, J. Y., M. Darouach, and G. Krzakala "Fault detection of Multiplc Biases or Process Leaks in Linear Steady State Systems." Cotnpurers & Chem. Engng. 18 (1994): 1001-1004. 11. Tong, H., and C. M. Crowe. "Detection of Gross Errors in DATA Reconcillation by Principal Component Analysis." AIChE Jor~rtzal41 (no. 7 , 1995): 1712-1722. 12. Jordache. C., and B. Tilton. "Gross Error Detection by Serial Elimination: Principal Component Measurement Test versus Univariate Measurement Test," presented at the AIChE Spring National Meeting, Houston. Tex . March 1999.

13. Bagajewicz, M.. Q. Jiang, and M. Sanchez. "Performance Evaluation of PCA Tests for Mnltiple Gross Error Idcntification." Con7plrter.s & Chem. Etzgng. 3 (Suppl., 1999): S5S5-S59.l. 14. Romagnoli, J. A., and G. Stepha~iopolous."Rectit~catio!~ of P r ~ c e s sMrasurement Data in the Prcsence of G r x s Errors." Chenl. E11,q. Sciet!ce 36 (1981): 1849-1863.

REFERENCES 1. Ser:h K. W., and W. A tieenzn. "Gross Error Dc:cction and Reccncilia!io~~ ir, Stcam-met-ring Systems." AIChE Jourtzal32 (1986): 733-'717.

2. lordache, C., R.S.13. hilah, and A. C.Tamhane. "Ferfom~anceStudies of thz hJie,~surementTest for Deicc:ins Gross EITOITin Process Data.'' AICIzE Iorrr1701 31 (no. 7, 1985): 1187-1201.


15. Harikumzr, P., and S. Narasirnhar~."A Method to Incozporate Bounds in Data Reconciliation and Gross Error Detection-TI. Gross Error Detection Strategies." Con!puters & Chern. Engng. 17 (no. :1, 1993): 1 !2 !-1 !3s.



3. Rosenberg, J., R.S.H. Mah, and C. Iordachc. "Evalua~ionof Sche~:les fol Detecting and identifying Gross Errors in Process Data." Ijlcl. & Evg. Clier?:. PI-oc. Des. Dev. 26 (1987): 555-564.

16. Mah, R.S.H., G. M. Siarlley, and D. W. Downiny. "Reconciliation 2nd Rectification of Process Flow and Inventory Data.'' !lid. & 013. Clzern Pruc. 17e.y. Cev. 15 (1976): 175163. 17. Rosenbel-g, J. "Evalustion of Schemes for Uctecting and Identifying Gross Errors in Process Data." M.S. Thesis, Northwestern University, Evanston, I11.,1985.

4. Roilins, D. K., a;;,'. J. F. 3z,i.,. "TT1;:;:-scd Es:i;;;;:i~.; ;if C;GSSErrors in Process Measul-e~nwts" AICIzE Journal 38 (1 992): 563-572.

18. Rollins, D. K., Y. Cheng, and S. Devanathan. "Intelligent Selection of Hypothesis Tests to Enhance Gross Error Identification." Computers & Clzeril. E I Z R I 20 I ~ .(1 996): 5 17-530.

5. Jiang, Q., M. Sanchez. and M. J. Bagajewicz. "On the Performance of Principal Component Ai~alysisit1 Multiple Gross Error Identification." lrld. C; Eng. Chern. K P . ~ P U I -38 C ~(no. ~ 5. 1999): 2005-201 2.

19. Jiang, Q., and M. Bagajewicz. "On a Strategy of Serial Identification urith Collective Compensation for Multiple Gross Error Estimation in Linear Data Reconciliation." bid. & Etzd. Chern. Research 38 (no. 5, 1999): 21 19-2128.

6. Sanchcz, M.. J . Ronlagnoli, Q. Jiang, and M. Bagajewicz. "Si~nultaneous Estimation of Biases and Leaks in Process Plants." Carrtputers & Chern. Engrzg. 23 (no. 7, 1999): 841-858. F


20. Kim, I. W., h$. S. Kang, S. Park, and 1'.F. Edgar. "Kobust Data Rcconciliation and Gross Error Detection: The Modified MIM'I' using NLP." C O I ~ Z P I I I er.s & Cl7c1n.E I I ~ I2I1 ~(no. . 7, 1997): 775-782.

21. Serth R. W., C. M. Valero, and W. A. Heenan "Detection of Gross Errors in Nonlinearly Constrained Data: A Case Study." Ctzenl. Eng. Cornrn. 5 1 (1987): 89-104. 22. Renganathan, T., and S. Narasinihan. "A Strategy for Detection of Gross Errors in Nonlinear Processes." Ind. & Eng. Chem. KPS. 38 (1999): 239 1-2399.

Gross Error Detection in Linear Dynamic Systems

23. Box. G.E.P, and G. C. Tiao. Bayesian I~zferetlcei~rSruiisticui Arta1jsi.s. Reading, Mass.: Addison-Wesley, 1973. 24. Tamhane, A. C.. C. Iordache. and R.S.H. Mah, "A Bayesian Approach to Gross Error Detection in Chemical Process Data. Part I: Model Devcloprnent." Clre~,:or~iefrics arid Irl/cl. Lah. Sys. 4 (1988): 33-45.

25. Tamhane. A. C.. C. lordache, and K.S.H. Mali. "A Bayesian Approach to Gross Error Detection in Chemical Proccss Data. Part 11: Sirnulation Results." C'ha~~!n~~ic~ric-s urld Ititel. 1 ~ 1 1 SVS. . 4 (1988): 13 1-1 16. 26. Iorciache. C. A Bn~c.rirr~~ A ~ ~ I I - o i rto c hGI-0.5-sEl-I-nrIlereclio~zill Proc-e5.s D(zt(i. Ph.D. Dissertation. N o ~ ~ l i w e s t University, er~~ Evanston. Ill., 1987.

17.l,fa!~!!. R. E. Sc113fc!-, iind N. D Sirt_~p~l~-wall~i. ltietlrodc ji)r Siatisricui ,-117rrI\-.sir o/'iic,/i:rl/ilin.(r12d/.;/i~ Dcrtn. New York: Wiley. !!,?A.

1s.Ct)Ion~ho..A. G.. and li.Co~ls!nntlni. " for Bets Dihtribuiiorl as Bayesian PI-io:-." IEEE trail.^. o ! ~~?t~lir:hiiit~ K-29 (co. 1, 19Si)j: 17-20. 29. Rci?~agnoli.J. A. "On I h t a Reconci!iatioii: Con>ti-aintProcessi:lg and TrestnlcnL of Bia\.'- Clif,!~~. Gig. .Ccier?r-c,58 !192.3): 1 1 07--! 1 ! 7. 3 9 Golub. ti. H.. aixi C Keinsch. "Sir1gu1a1-Vrllue 1)ecoinpoition and Lcasiq u i r e s Solutions." :?'~i:~lo-. Mrrrh. I4 (1970): i l G 3 4 2 0 .

3 1 . Seber. G.A.F. Lirirc:~.Rc~g,z~.\..s!or~ Atzol~ris New YorL: U'ilcy. 1977. 32. Alhers. J. E.." Data Keconciiiation \uith Un~nea>uredVariables." H ~ d r ~ c . PI-oc.(March 1994). 65--60.

A s it currently stands, industrial applications of dynamic data reconciliation have not been attempted. It is therefore not surprising that only a f e u attempts htive been rmde to address problcrns in the subject of gross error detection in dynamic sysiems even in the resexch literatwe. Several deve!opnents, however, have occurred in tnc. closely related iopi: of mcdelbased fault dizgnosis which can be gainfully exp!oited for gross erro:- identification in chernicai processes also. T h e purpose of this chapter is tc expose ?he reader to the issues ~nvolveciin this p;cblern area and to also providc an introdiuzf on to the pmblem of mcdei-based f a d t diagrrosis. It should 'Fe pointed out that, typicaily, gross error d t t ~ x t i c xis~ more concerned with the PI-ob!ern of d e i e c t i ~ r gbiases i n l-nedsured data oi. process leaks, whereas fault d i a g ~ o s i streats a wider class of probl~,rns associated with scnsors. actuators, and t!le prcczss ~nc~cicl. The i n t ~ - o d ~ c tion to fault diagnosis that w e provide is only brief in a s 11iuch as it pcrtains to the problem of detecting biases in measurements. T h e interested reader c2n refer to the book by ::P :: c! I!.[I:!::r the :::crc rzcc::: k?ek by Gertler [2] for a more co~.\rrei~rr.sivetreatment of this subject. For the purposes of this chapter, w e use the t e r n s fault and gross err-or interchangeably. In Chapter 7, w e listed the basic requirenlents that any gross error detection strategy for steady-stat€ processes should fulfill. 'These are the abilities to (i) detect the presence of one o r more gross errors, (ii) identify t h e type a n d location of rhe gross error, ( i i i j identify multiple gross errors, and (iv) provide estimates of the gross errors. T'nesc requirelnenCs


I)r~caK~~coi~ciirtiri~~,i nrrd Gr-oss Error- Ijcr~~criorr

carny over to gross error detection techniques for dynamic systems also. In addition, these techniques should consider the following issues: (i) In steady-state processes, gross error detection strategies exploit only spatial redundancy in the data for the purpose of detecting and identifying gross errors. Similar to dynamic data reconciliation, however, which exploits temporal redundancy in the data for improving the accuracy of estimation, gross error detection and identification can also exploit temporal redundancy for improving diagnostic performance. This is typically achieved by applying gross error detection techniques to a window of measurements made within a chosen time period. (ii) Since a gross en-or has an effect only on those measurements that are made after its occurrence, it is also important to estinlate as part of the overall gross error dGLL,.ion strategy the time instant at which a gross error has occurred. In the following section, we describe a procedure to meet these requirements. It can be noted from the preceding chapters that for steady-state processes the techniques of data r~conciliationand gross emor detection go hand in hand. This is also valid for dynamic systems and the type of state es:ii;lator used also has an impact on the gross el-ror detection straterv. For the sake of simplicity we consider only gross errors caused by biases in sensors, aithough, in principle, the method we describe can be applied for identifying other types of Faults. Furtherm~re,we restrict our consideratior; to linear dynamic systems for which a Kalrnan filter estiInator is used far state estimation. L,


occurs instantaneously at some time instant and once it occurs its ~nagnitude remains constant for all subsequent times (until the sensor is recalibrated). This is also referred to as a step jump in the measurements 131. In this case, the measurement model may be written as

where e, is the ~ t unit h vector, is the unit step function which is 1 for all times k L to and is 0 otherwise (as in Chapter 6, the subscripts to and k are used to represent time instants roTand kT). Equations 6-1 and 9-1 together represent the gross error (or fault) model caused by a measurement bias in sensor i. Although a bias is modeled as a step change, it can also be modeled as a drift with con>+antrate of change as follows.

where s is the rate at which the bias changes.

The objectives of a gross e:ror deteciion strategy are to ( i ) detect whcther a gross errol- has occurred, (ii) determine the time ro at which a grcss errar has occurred and the n:easui-ement i that contains the bias, and (iiij estimatz the magnitude b or the rate of cha~lges of the bias as the case may be. For the sake of definiteness, in the subsecluent sections we model sensor biases as step changes. u



We consider a linear dynamic system for w h ~ c hEquation 6-1 describes the dynamic evolution of the state variables. If biases in measilrements are not present. then Equation 6-2 can be used to describe the relation between measurements and state variables. We assume that the statistical properties of the state and measurement noises given by F4uations 6-3 through 6-7 are obeyed for the process. We further assunie that a Kalrnan filter given by Equations 6-13 through 6-17 is used to estimate the state variables at each sampling time. Consider the case when a bias of magnitude b in measurement i occurs at a time f = tor One theoretical model fbr the bias is to assume that it

Tlre principal qilantities which are used in gross error d~tectionstrategies are the i~izovationsdefined in Chapter 6, which are computed as part of tfi-. K-l!m3n f;.!ter estin~ztorat each time.

I he innovarions are anai~gousto nleasurenlent residuals which have been used by the n~easurementtest for detecting gross errors in steady state processes. Under the null hypothesis that no biases are preseni in the measurements made from i~litialtime to current time k, it can be 7


proved 141 that the innovations are normally distributed with expected values and covariance matrix given by

Instead of constructing the global test using only the innovations at every sampling instant, the innovations obtained over a time window of. say, N sampling instants may be jointly used because they have a joint Gaussian distribution by virtue of property, Equation 9-6. 'The global test statistic in such a case is given by

Moreover the innovations at time k and time j are not correlated, that is,

Exercise 9-1. Prove that the in1lovotio17.r fo/l012>a Ga~lssian distribution wit11 stati.stitn1 propet-tics given Dv Equurior~s9-4 tl2mrrglr 9-6, when izo gross error.,-ure present ;IZ the nreu.surei71er7t,s.

This test waq proposed by Mehra and Peschon [8] and it can be proved that under the null hypothesis, has a chi-square distribution with hTi1 degrees of freedom. For a given level of significance. the test criterion can be chosen from this distribution to detect whether a gross error is present anlong the N measurements within the time window. Using this %st, however, it is dificult to estimate the time of occurrence of a sross error.

Example 9-1 Utilizing the properties of the innovations it is possible to construct a st.atistical test analogous to the global test defined in Chapter 7 for detecting the presence of a gross error. The test statistic is gi\.cn by

Under :he ncll hypothzsis t5at nc gross errors ale piesent ir! the meas111-emecis. it czn be proved that y follows a chi-square disiribution with I: degrees of frecdo~nwhere 11 is the nun:h~r of nteasurements. For a chsserl cac choose as :he test cri!eriorl ,, alid rejec-t :sue1 of signif cance 0.. tliz cul! hypothesis if y exceeds the criterion. This tes: can be applied 3t every sanipling !inx instant to detect when a gross eITor ha: occurred. If tht. test rejects the nuil hypoihesis for the first til:ic at some ti~ncinstant, say 1,). then it clay be concluded that a grass c ~ ~has o r occurred ar time to. ui ccurse, this coriclusion is subject to the Type I and Type I1 error p~obabilitiesof the test. In order to protea against these errors, one possibility is to conclude that a gross error has occurred at tirne r,, if the test re.jects the null hypothesis not only at time to, but also for M out o f t h e next N tin:e instants. Such a sirnplc voting system has been proposed by Rollins and Devanathan 151. A more elegant approach is to use the seqz~entialpi-obubili~ratio (SPRT) first proposed by Wald 161 and used in fault diagnosis by Montgotner-y and Williams 171. anlong others. This approach, howc~er.has so far not been used in gross error detection.

We consider- a level control process for which a linear discrete model alas derived in Esaniple 6-1. Rased o:i the data $\-en in that exa:npIe. :neasuremeiits correspo~dinytc, the closed loop beha\-io:- of the proczis were siniuiated wi:liutlt addin? any biases to the rni.z.;irrenien(s and tilc Kalman fiitz: is used to obtain estiniates of level and \31ve position. T!K steady state va!ue of th:: covari:~rtcematrix of estirn:itior: en-ors ~ G I chi.; process can be computed as

and hence the steady state c ~ v a r i i ~ n c1:iatrix e of inno\ ations citn !)e con;puted as

The zlobal test statistic 7is co~nputed:I: each tirlie as \cell as the curnillative global test statistic 8. These are shown in Figures 9-1 and 9-7. respectively, along with the chi-square test criterion at 5% levei of'significance. It is obser-ved froin these plots that thc GT reject, the nu!l hypothe-

Gross Lrror I>etecrron In Ldnear n\romr< S\srer7zs

sis 2 out of 100 sa~nplinginstants. The cumiilative global test, however, does not reject the null hypothesis at any time within this set of 100 Samples It should be noted that the test criterion for the simple GT is constant at 5.99 while the test criterion for the cunlulative GT increase with time, since the number of degrees of freedom increases. Measurements were also simulated for a bias in valve position of 0.5 volts occurring at initial time and the corresponding GT and cumulative GT gatistics for 100 samples are shown in Figures 9-3 and 9-4, respectively. While the CiT rejects the null hypothesis for 26 out of 100 samples, the cumulative GT rejects the null hypothesis for all samples. The results indicate that the cumulative G T does not commit Type I errors and is able to detect the presence of the bias for all sanlpliog times. This is expected because the cumulative GT also exploits temporal redundancy. It should be cautioned, however, that, in this simulation, the time at which a gross error occurs is known exactly and, hence, the cumulative GT comrnits no Type I or Type I1 errors. In a sequeritial applica~ioilof the cumulative G T the time of OCCUI-Fence of the gl-oss error is not kooivn precisely. Therefore, all the measurenlents used in computing the cumu1;itiae G T statistic will not contaiil t!le effect of ths gross elrcr and the tesl may not h a i s perfect detectio~l capabil;ty. i\evcnh.-les<. tile s e \ ~ I t sindicate tha! sequential tests such a:; Spm 161. \v]iich i s nlsn 3 c~lmoin:i\~eks:, sho~ildbe used for the purposes cf gross en-or de~ectlor:in dynamic processes.



Test Statistic


Test Criterion


40 Time




Figure 9-2. Cumulative global test statistic for mecsurements without gross errors.




Tzst Statistic Test C2enon

14.0 12.0

Figure 9-1. Global test statistic for measurements without gross errors




Figure 9-3. Global test statistic for measurements with bias in valve position

Gro.7.5 !?I-rorf)~rr(.fi~ll ill f2i~zear I)~llnnIicSvsierlis





Test Statistic


Test Crj:erion

Figure 9-4. Cumulative global test statistic f ~ measurements r with bias in valve position.

The globa! test can detec! *bether a gross e ~ ~ has c r ~ c c u r r e dand call also bc appropriately lised to estimate the time of occcrrence of a grcss error. It requires an idenrification strategy. h ~ w e v e r ,to deterinine the type 2fid Jocaiion of i11e gross en-or (as in the case of steady-state processes descr~bedin Chapter 7 ) . Serid elimination strategies; such as those developed for steady-state sysiems, have not been adapted for dyealnic systems as yet-though, in is possible to devise such strategies in c~rnbinationwitli the global test. The generalized likelihood ratio (GLR) test, however, which was used in steady state processes was in fact hascd on the Gl .R tect proposed by Willsky and Jones [ 3 ] for fault d i a ~ n o s i sin dynamic processes. 'Thus, this test can be used for identifying the type and location of the gross enor-. In fact, this technique was applied by Narasimhan and Mah /9] for identifying different types o f faults including sensor biases for dynamic che~nicalprocesses. We discuss the features of the GLR technique for detection, identification. and estimation of sensor biases.

The GLR test for steady state processes, which is described in Chapter 7, was shown to be capable of identifying different types of gross errors, provided a model for the effect of the gross error on the process (also known as the gross error model) is given. In the case of a dynamic process. the effect of a sensor bias of magnitude b in measurement i that occurs at time lo is given by Equation 9- 1. The evolution of the state variables is still described by Equation 6-1. Without the knowledge that this gross error has occllrred, the Kalman filter estimates will continue to be obtained using Equations 6-13 through 6-17. Therefore, until time to -1 when there is no bias in the measurements, the expected values of the innovations at each t i ~ n ewill still be zero. At su5seauent times, however. the expected values of the innovations at any time k L to are given by

The matrix Gk.lois referred to as the sigrznturv rnatri.~and depends on the time k at which the innovations are conlputed and the time to at which a gross error hac occurred. It depends on the system matrices and the type of control law used. For a control 12w based on the estimates as o,iven biy Equation 6-3 we can recursi\~elycompute the signature matrix using ttic following equations.



Ik,ro= -4kTk-l,~o + BkCk-l.ik-l,~g


with all the above matrices initialized tt) the 0 n1at1-ices when k < I,+ I? can also be proved that ever? if a sensor bias is Frcsent in the measure!ne!ltsl the innovatiorts follow a Gaussian distribution with covariance matrix given by Equation 9-5. Moreover. the inno\raiions at different tirnes are not con-elated. This result is valid in geceral for other types of nci'dirii.c fauits, that is. faults whose effect on the process can be modeled as an additive term to the nornial process nod el (compare Equations 6-2 and 9-1 in the case of a sensot- bias).

Gross Error DPICC!IOII irr IJII<,U~ D ~ I I O I .5y.s1ern.y ~I~C

Exercise 9-2. Prove that ~/lzerzm gross crror raused by n biar in measurement i occur5 ut tin7e to, the expected values ofthe iiznovations are giverz by Equations 9-9, ~ v l ~ e the r e sigrzature matrix is computed recursively usitzg Equntio~is9-10 through 9-12. Also show that the iiznovatioils follow u Gciussiurl distrif2ution with statistical properties giverz by Equutior~s9-4 tlzr~ugh9-6.

Since the innovations at different times are uncorrelated, they jointly follow a Gaussian distribution. both under the null hypothesis and under the alternative hypothesis that a sensor bias is present. Based on the statistical properties of the innovations established in Exercises 9-1 and 9-2, the GLR test can be applied to a window of N innovations computed from time to to to+N. The GLR test statistic (which is equal to twice the natural logarithm of the lnasinlurn likelihood ratio as in Equation 7-28) can be obtained in this case using T = Max TI



where Ti is the maximum likelihood ccst statistic for ifient i given by


bias in measure-

A sensor bias is identified in the measu1-c~nenti- which has the maximum test statistic among all rneasureinents. Thc maximurn likeliiiood estinlate of the nlagniiude of the bias is givcn by

29 1

Although it is possible to apply the GLR test using only the innovations at time to to identify the gross error that has occurred at this time, we have exploited temporal redundancy in the data by exploiting all the innovations from time to to to+N in the GLR test.

Exercise 9-3. Using the joint distribution of irznovatioizs obtained in Exercises 9-1 and 9-2, derive the GLR tesi statistic for identibing the location of a sensor bias. Also derive the nulxinzurn likelihood estimate of the sensor bias magnitude given by Eq~latioiz 9-15. Follow a similar procedure as outlined in Chapter 7.

Example 9-2 The GLR test is applied to the level control process studied in Example 9-1. Using the system model developed in Example 6-I , the expected values of the innovations for biases of specified magnitudes in eithet- the level or valve position measurements can be computed using Equations 9-9 thrcugh 9-12. The expected values of innovations (~n.-asurentcnt residuais) in ievel at diffeicnt sampling instants, after a s-nsnr bias of 0.2 v o l ~ s(0.317 cm) in level measurercent or z sensor bias of 0.5 vclt.; (0.3185 cmj in valve position measurement has occun-ed at start time, are piotted in Figure 9-5. Simiiarly, the expected values of valve position innovatio~sfor the same senscr biases are plotted i n Figilre 9-6. These plot\ esserttlally dzscribe the expected evoluticn of the innovations aftel- the partica!ar sensor bias of specified magnitude has occurred, assuning that the !imr at which the bias occurs is known precisely. For a bias of a diffe;-cnt magnitude, ihese curves will be shifted up or down. From these f i ~ u r e s , it is c!ear that the expected trend ir! the two innovations are diffcrcnt for the C W SCIISOI~ v i i l ~ e band ic principle, be possible to distill. , g~;,,i tz:ween these b i s ~ z s . For a given set of measurements, the GLR method essentially determines the best fit of the pattern of measurements to these expected trends (shifted appropriately to determine the best estimate of thc magnitude) in order to identify the bias that has occurred. The method also accounts fhr correlation among these trends and the relative magnitude of errors in thc iilnovations. As a particular case, measurements corresponding to a bias

4Vn!\.c. bijs -

-+ I,e\sl bias

Figure 9-5. Expected evolution of level innovation for sensor biases.

of magnitude 0.5 volts in valve position were simulated at initial time and the GLR test statistics were c o n ~ p ~ i t efor d window lengths of 10 and 20, respectively. The GLK test statistics for bias in levei and bias in valve position were found to be 29.37 and 4.89, respectively, for- a window length of 10 and 40.58 and 7.2 1, respectively, for a window length of 20. Since the maximum test statistic occurs for bias in level hypothesis, and the test statistic also exceeds the test criterion (3.84 at 5% level of significance), a bias in level is identified. The bias ruagnitude was estimated as 0.589 for a window length of 10 and 0.516 when a window iength of 20 is used. The ability of the GLR method to identify the bias as well as to obtain a more accurate estimate of its magnitude increases wiih the window length, as expected. In the above derivation, it is implicitly assumed that the time to at which a sensor bias is presumed to have occun-ed is known precisely. In practice, only an estimate of this time can be obtained. The procedure described in the preceding section which makes use of :he global test can be used for this purpose. Alternatively, Willsky and Jones [ 3 ] used the GLR test itself to estiniate the time of occurrence of the gross error by treaiing it as 2 parameier (similar to :lie unknown bias nlagnitude) acd obtaining the maximum likelihood rztio oLer all possib!e values of to within the time wir?dow being considered. This car? resuit in a ~ignificant computation burden, especial!y for larze systrms unlcss the system matrices are independent of time. An on-iiiie aigorithm which uses the Kalrnan filtzr for estimating the stzce variables at each t i ~ n e ,:he giobal test for detecting the time s f occurrence of a gross e!-i~~r and tlis G I 2 tnethoc! f ~ idenrifying r the locatior! and es~i~nating the nlagnitude oi' the gi-uss error is as folio\vs. l* assurne that we are cul~entlj!at time k=O and have initial estimates ~f the state variables and the co\!ariance matrix of the estimates.

Step 1. Ir~creme~ll time ccunter k, and use the Kalnian filter equstinrls for cunent time instant k, to compute the state variables and the covariance matrix of the state estimates using Equations 6-13 through 6-17. Aiso corxipute the innovations vk. 0





l i m x (sec)

Figure 9-6. Expec:ed evolution of valve innovation for sensor biases.

Step 2. Apply the global test using thc test statistic (Equation 9-71. if the G T rejects the null hypothesis, initialize 311 elements of the vector d, matrix C a ~ i dmatrices T, G, and J (which are required for computing the GLR test staristic), and the quantity 7 (,required for computing the global

test statistic of Equation 9-8) to zero. Set the time index to = k and go to Step 3; or else return to Step 1.

Step 3. Update tlie matrices T, G and J using Equations 9-10 through 9- 12. Update d, C, and 7 using the following equations:

.'I !


biasep in actuators, process leaks, or even co~npletefailure of ser~sorsand actuators [9]. Once we expand the definition of gross errors, however. to include these types of faults it is only proper to also critically examine a whole host of methods developed in the general area of fault diagnosis to evaluate their suitability. In the following section, we give a brief introduction to some of the fault diagnosis techniques and recommend to the interested reader the books by Patton et al. [ I ] , Basseville and Nikiforov [ l o ] and Gertler [2] for a more detailed treatment.



Step 4. Increment time counter k and cornpute the state estimates and covariance rnatrix of state estimates using Kallnan filter eqoations. If the time index k = to+N go to Step 5 ; or else return to Step 3.

Step 5 . App!y the global test using . If the G T rejects the null hypothesis, :hen it confirms a gross error did occur at time t,. Compute the GLR test statistics T, = (di)'/Cii, where di is the ith element of d and Ci, is the ith diagolial eiernent of C. Identify a bias of 1nagni:ude d,=iC,*i-; in the measul-enieilt i' which gives the ~riaxirnurnvaluc of T, arnong all the ~neasurenlents.Recalibrate the sensor if required. Return to Step 'I. Since the global test which is i!sed to determine the tinle of occurrence of a gross error i r ~Step 2 rnay commit a Type I em:, it is again sppiied ai Step 5 using all the inriovations during the eiapsed time window of N mcasuremefiis to confirrn whether a gross error did occur N time stcps before. This causes a delaj of N time sarnpling instarts before a zross error i s detected. It should be noted [ha: the G T (confirmatory t.-st) applied in Step 5 can also cornniit a Type I error or a Type I1 ei-ror. If a gross error did occur at time i,, and the G T at Step 5 does fiot de~cctthis. tl-.-., .- ...- -.WL I : ~ ~ ~ , I I G L<,11~iuCit: d ~ L ; j iilai I I U g 1 ~ ) h herrors are present in the meact -.r:nts made so fzr 2nd resumc the on-line monitoring procedure from Step 2. This causes a further delay of at least N time steps before the gross error is detected. Other variations of this on-line scheme are also possible. 111 the GLR method described above, we have only considered the detection and identification of gross errors due to measurement biases. This npproach can also be used to detect gross errors or faults due to '


The term faults cover a wide range of malfunctions associated with sensors, actuators, and process equipment. They include both soft faults, such as biases in sensors or actuators; degradation in equipment performance, such as fouling of heat exchangers or catalyst deactivation, or partial blockages of pipes; and also hard faults, such as failure of sensors and actuators or unacceptable leaks from pipes and other process units. Several different techniques have been developed to detect and identify these types of faults. These methods car1 be grouped into different classes which are briefly described here. Faults which can be associated with different parameters can be detected and idcntified by directly estimating these parameters as part of a general state 2nd p:uameter estirnation method. These paramr-ters can be compared with their nominal design values ti, determine .ahether a fault exists. For example, fotiling of heat exchangers can be detected and identified by estircating the overall heat [ransfer coefficient of tieat exchangers. Similarly, deactivation of a catalyst can be detectzd and identified by estimating the rate constants of reactions. A survey of such methods has been presented by !sem~arin[I I]. Iiimnlelblau and coworkers [12, 13J have described the application of this technique for fouling of heat exchangers and fouling of catalyst usi!ig a simulated example of a reactor with heat exchange. Tnis leci~iriquecan also be applied for detecting and identifying sensor or actuato, biases by assuming these biases to be present and estinlating their magnitudes. Based on the estimated ~nagnitude.a decision can be made whether the bias is large enough to warrant corrective measures. Bellingham and Lees 1141 used this approach for detecting and estimating sensor biases for a simulated example of a level control process. In general, these techniques are more useful when the model is derived

froln first principles and it is easy to associate the parameters with different faults. The technique, however. can also be applied for det5cting and identifying faults or changes in parameters of empirical models from their nominal values 121,but it is difficult to associate these with actual equipment-related faults. The book by Himmelblau 1151 describes several techniques for fault diagnosis and applications to chemical processes. A second class of rnethods is based on design of observers for fault diagnosis, where the observer can be regarded as a state estimator which has a similar fonli as a Kalrnan filter for linear systems, but with the Kalrnan gain mat~ixchosen based on other requil-ements rather than on minimurn variance estimation considerations. The innovations obtained from such an observer can be defined by a si~nilarequation as Equation 9-3. For fault diagnosis using these innovations, the cbserver or elements of the gain matrix is designed so that the innovations become nlore sensitive to those faults which we wish to detect. A linear transformation of the innovations may also be used with the transformation matrix being designed to meet the requirements. This technique is equivalent to fault diagnosis techniques based on what are known as parity equations that make use of inpui-ourpat models rather than a state space representatio:: [?I. Gcrile: and Luo [I61 l?;tve described rhc design of parity equations for a disiillation col:!mn t;, make then; sel~sitiveto sensor faulrs and insensidisrui-barxxs in the feed flow and fced con~positions. tive to ~~nrneasured Another clzss of faalt diagnosi: nrethods are based on designing or structuring rrsidirals such 2s ini1ovatic;ns as par-ity e q u a t i o ~ sso that each elenlent of the re1;iduals resj:or?d (differ from zer!i signitican~ly)to a pa-ticular fault or set of fauits but no: th-. others. Thus. the effec! of each fault ori :he resid~~;ils ca:1 b: described by a binary vector knoa.1: as the f i i l r : ~ si,yricitrtl-c~. iL'i1t.112 f31~1toccurs, a test is applied to each element nf the residuals :o decide u.hether- it is significantly different from zero or a not. Based on these decisions and comparison with the fault sig~iat~ires. fault may be detected and identified. Although these methods have no: h .,a,., t r i d i::~t i n chemical processes, it should be pointed out that the concept of a fault sisnature has been utilized even in the GLR method. Finally, a I-elati\ely neu) ~nechodfor- sensor fault detection and identiiication by Dutiia ct al. [ 171 uses a principal conlponent model. The use of nlgltivariate pi.irlc.ij)i~!<.otilpotlerz!iitltrl?sis (PCA) for sensor- fault identification via reconstruction provides a reliable technique for fault d i a g ~ o sis \\)hen there is sufiicient correlation among the measurements of precess variable.;. The principal component model captures measurement

correlations and reconstructs each variable by a successive substitution and optimization. Sensor reconstruction is used to validate the sensor measurement via the PCA model. The procedure proposed by Dunia et al. [I71 assumes that one sensor has failed, and that the remaining ones are used for reconstruction. A sequential procedure is used to analyze and validate all sensors.

THE STATE OF THE ART Extensive research in the area of gross error detection and identification in dynamic chemical processes has not been carried out. Even the few techniques that have been attempted are applicable to linear systems or linearized approximations of nonlinear processes. Bagajewicz and Jiang 1181 have proposed an integral dynamic measurement test for gross error detection in linear dynamic processes. This test is essentially the use of the measurement test developed for a steady-state process, which is applied to a dynamic flow process. In their method, since the integral form of the dynamic flow balanceb are converted to algebraic equations by modeling the f l ~ w and s levels a i polynomial in time, linear data reconciliation and gross error detectiozl tests deve!oped for stcaciy state processes can be appiied. Xlbuqrierque aild Bieglzr 1191 have considered the problem of gross error tierectiorl i!; n o n i i ~ e a rdy1121ilicprocesses. Their r u ~ t h o dis significantly different from the traditional methods of data reconciiicltion and sross enor deiection, and is based G n robust estimation of the state variables in the presence of gross errors. In order to ob:ain good stace est~matzielsen in the presence nt g r n s errors, suitable forms of the objective fuaction for reconciiiatio~iare chosen. Aithoush the authars denons:rate that their methcds \vo~-kwell for se!ec?ed examples, extensive testing stiil need: to be don?. Moreover. these rnethods can only be applied for lreating measurement biases and not other Lypes of gross erluis oi-c~ii:;>. In summary, it wil! tal:? :n3rc years of resea:-:!: an?, developinent ei'fort before industrial applications of gross error detection for dynarilic processes can be taken up. Since significant developments have occurred in the field of fault diagnosis, some of which have been applied in the nuclear and aerospace engirreering, it is also wo~thwhileexamining ho\s these techniques can be adapted and used for chemical processes also.

5. Roll~nsD. F., and S. Devanathan. "Unbiased Estimation in Dynamic Data Reconciliation." AIChE Jour17al39 (1993): 1330-1334.


6. Wald, A. Sequential A11alj~si.s.New York: Wiley, 1947. Reprint. Dover, 1973.

Gross error detection in dynarnic systems involves not only the detection and identification of gross errors but also the time at which a gross error has occurred. The innovations used for Kalman filtering for discrete linear dynamic models are usually normally distributed with zero mean and a certain covariance matrix and they can be used to construct statistical tests. Global test statistic for gross error detection can be developed for linear dynamic processes. This test can either be for each time instant or for a time window of measurements. The time of occurrence of a gross error can be detected either by a global test, a sequential probability ratio test, or the generalized likelihood ratio test. The GI,R test can be used to detect, identify, and estimate gross errors. An on-line gross error detection a!goiithm which uses Kalrnan filt e r i ~ gfor estimating the state variables, the dy~iamicglobal test for detecting the time when a gross error occurs and the dynamic GLR test for gross error identif cation and estimation cf rnagt;itudes of the gross errors for :inear dynamic processes is available. Techn~quesdeveloped f ~ fault r diag~ivsisin dynarnic systems car, be adapted for grass error detection and identification.

7. Montgomery, R. C., and J. P. Williams. "Analytic Redundancy Management for Systems with Appreciable Structural Dynamics," in Fault Diugrzosis in I)?tunnic Systemn.7-Tlzeoty and Applicarion.~.(edited by R. Patton, P. Frank, and R. Clark), pp. 361-386. Englewood Cliffs, N. J.: Prentice-I-fall, 1989. 8. Mehl-a, R. K., and J. Peschon. "An Innovations Approach to Fault Detection and Diag~iosisin Dynamic Systems." Autornaticu 7 (1971): 637-640. 9. Narasimhan, S., and R.S.H. Mah. "Generalized Likelihood Ratios for Gross Error Identif-ication in Dynamic Processes." AlChE Jonrnctl 34 (1 988): 1321-1331. 10. Rasseville, M.. and I. V. Nikiforov. Detection of AOrupt Changes--Tlzeoty and Applicario~~s. Englewood Cliffs, N. J.: Prencice-Hall. 1993. 11. Isermatm. R. "Process Fault Diagnosis Based on Modeling and Estinlation Methods: A Survey." Aiitornurica 20 (1 984): 3 8 7 4 0 4 . 12. Watanabe, K., and D. M. Himmelblau. "Fault Diagnosis in Nonlinear Chenlical Processes---Parts 1 and 11." AlCI7E ./ourt7ril29 ( i 983): 243-260. 13. Park, S.. and D. I??. H~rnmclblau."Fault Detection arld Giaynosis via E'aranieter Estin~acionin Lumped Cynarnic Sysrerns." Ind. 07g. C'lle111.Des. Dew. 22 (no. 3. 1983): 4 8 2 4 8 7 . 14. Belli:~ghan~. 5., and F. P. Lees. "The. !letection ol' blalfunction Using 3 Process Control Ccmputer: A Kalrna~lFilterifig 'Technqiue for Genera! Contra1 Laops." T~UIIS. IChertzE 55 (1977): 2.53-265.

15. I-iirnrnzIblau, D. M.F~u!t Detecricn civ/J I:ing17osis ciict~zicnlP:-cjcrssex. .4nistcrdarn: El:icvier. 1978.

Clzcrliicn! nr;d Pi.tro-

16. Gertlcr. J. J.. aad Q. Luo. "Robus: 1sc;lvSle Models tor Failure Diaposis." AlC'IE Josrnnl35 (1989:: 1856-! 868.

REFERENCES 1. Patton, R. et al. Fuiilr Diagnosis in D~nc~llzic S)1.srerns-Thecry tions. Englewood CIiffs. N. J.: Prentice-Hali. 1989.


and i-lp,nlica-

2. Gertler, J. J. Fault D ~ i ~ c t i oand n Diagnosic in Et~.rineering SJJS~CIIIS. New York: Marce! Dekker, 1998. 3. Willsky, A. S., and H. L. Jones. "A Generalized Likelihood Ratio Approach to the Detection acd Estimation of Jumps in Linear Dynamic Systerns." IEEE Trans. Allron7atic Corztrol AC-21 (1976): 108-1 12. 4. Maybeck, P. S. Sfochastic Models, Estimation and Control-Vol. York: Academic Press. 1982.

2. New

17. Dunia. R.. S. !. Qiu. T. 1:. Edgar. and T. J. Mc/\voy. "Idenrificatior? of Fault), Sensors lising Principal Component Analysis." AICIIE Joit1-tzi1132 (no. 10, 1996): 2797-28 12. 18. Bagajewicz, M. J., and Q.Jiang. "Gross E I T OModeling ~ and Detection in Plant Idinear Dyna~nicReconci!iation." Corlil~uict-.r8 Clletn. D I ~ I71 I ~(110. . 12, 1998): 1789-1 809. 19. Albuquerque, J. S., and L. T. Uicglcr. "Data Reconciliation and Cross Error I / (1996): 2841-2856. Detection for Dynamic Systems." AIChI: J o L ( ~ I (32

Design of Sensor Neiworks

The principal objective of data reconciiiation and gross error detection is tu improve the accuracy and consistency of estimates of process variables. These techniques certainly reduce rhe errol- content in measuremsnt.; if redundzncy exists in the measurements. The extent of improverne~~t that zrtn be achieved depends c;uci:tIly on ( 1 ) the r!ccuracy of the sensors whict: is s~~cxitied by the variance in the nteasilre~nenterrors and ( 2 ) rh: numhsr of' variables and thsir type which are measured. Different sensors mr:y be ava~lablefor neastrricg a variabie wit11 wideiy varyifig capshilities such as the range over which i: can ineasiire, reliabiiitj~,2nd accuracy. The cos: of tne sensor .vil! be a function of its capabilities. This i11fc1-mationmusf typior suppliers [ I:. cally tie obtained from instru~nt.n;ationman~ifzcture~-s If v;e consider all tlte different variables sucli as flow rates. temperatules. pressures and compositions of the s:realns in 2 process, these coulc! be of the order of sevzral thousands in number. Clearly, fi-om the vie\

design o f a sc3nsol-network. Although this problem i s an irnpor-tant one i n design of new plants, it can also be used to retrofit the nieasur-ement structure of existing plants by identifying new variables that need to by measured for improved monitoring and control of the process. The design of a sensor network is influenced by different considerations, such as controllability of the plant, safety, reliability, environmental regulations, and accurate estimation of all important variables. If the estimates of variables are used in control, then the accuracy of estimation also has an effect on control performance. Keeping the scope of this text in mind, in this chapter we only consider the design of sensor netwol-ks for maximizing accuracy of estirnatior: through data reconciliation, while giving due considerations to the cost of the design. Moreover, the treat~nentin this chapter is limited to linear (,flow) pmcesses only. It should be noted that the objective of maximizing estimation accuracy is only one of the impol-ttnt considerations and a compreher~sivedesi~1-i should also take into account other requirements mentioned ahove. T11ix problern is receiving increasing attenticn from different reseali-her-s in recrrlt years and new solution strategies are being developed. It may req~~ire se.r era1 vears of additional effort before these solutioris are irnplernzntrd in practice.

ESTIMATION ACCURACY OF DATA RECONCILlATfON Eefore .risediscuss rhe rna:bematical iormu1a;ion of t!lt; sensol- nctc,o:h design problern. we first e x ~ ? m i n the e estiri:ztioi~ accuracq obtained through data reconciiiation and ihe effect that the cilcice of 111casure(! variables have cn it. The flow reco:~ci!iaticn exampie discussed in Exain111e 1-1 i l l Chapter 1 doe:; hi~hiightsome of these issuzs. \17e ~-eexai~lii~c~ this problem in greater depth.

Example 10-1

..- .-;concilcd c.,:;i,n:;., .

:;f :hc stream flows for the process shown iii Figure 1-? were presented in Tables 1 - 1 and 1-2 for different c h v i c e ~0;' measured variables. Let us consider the results of Table 1-1 fcr which ail flows ar-e measured. and Case 2 of Table !-2 for which the only the i l o . \~ of streams 1 and 2 are measured. Thc difference bettileen the cstimatcd and true values of all streams (estimation errors) can be computed t r - o i ! ~ these results and are shown in Table [email protected]!, along with the surn of squar-rs of the cstirnation errors. T I . A

From these results, we can observe that the error in the flow estimates of streams 1, 3, 5 , and 6 are much less for the case when all flows are measured as compared to the case when only the flows of streams 1 and 2 are measured. The estimation errors, however, for streams 2 and 4 are marginally more when all flows are measured. Although, from a purely intuitive viewpoint, we expect the estimation errors to be reduced if more measurements are available, it is clear from this example that the estirnalion errors for all variables are not reduced &hen more nzeusurenzents are nzade. This is more forcefully brought out by the example presented by Mah [4] where it was shown through simulation that, as more measurements are made, a larger fraction of the reconciled estimates have smaller errors. It is thus clear that it is not appropriate to focus on any perticular variable for the purpose of designing sensor networks to increase accuracy of estimation. Table 10-1 Estimation Errors in Reconciled Flows for Process in Figure 1-2

The overall measure for estimation accuracy was first proposed by Kretsovalis and Mah [5] and is defined by

It should be noted that as J decreases the estimation accuracy irzcreases and therefore we refer to it as the measure of the overall estimation accuracy. It is implicitly assumed in the above definition that all variables are observable. If there are unobservable variables, then the measure of overall estimation accuracy can be written as the expected sum of squares of estimation errors for observable variables only. We will ignore this modification and restrict our considerations throughout this chapter to the design of sensor networks which ensure L I I ~observability of all variables. It can be proved [ 5 ]that the overall estimation accuracy given by Equation 10-1 for a data reconciliation solution is given by

Estimation Errors

All Flows Measured

Flows 1, 2 Measured


2 3 4

5 6 Sun) Squares of E5tinlarion Esrors

As an overall measure of estimation accuracy, we can use thz s u n cf squares of the estimation errors of all varizbles (which represents the overall inaccuracy). Table 10-1 shows that the sum of square of estimation errors is less when more meacurements are available. We can therefore use this .measure in order to design sensor networks. This measure, however, also depends on the measured values which can be different each time due to their random characteristics. The appropriate measure that we can use for design purposes is the expected value of the sum of squares of the estimation errors which will be independent of the actual outconx of the measurements and will depend only on the sensor network design as well as the inherent process structure.

where S is the covariance matrix of esiinlaiion errors and the operator Tiis the trucr of the matrix. It should be noted t!lat the diagonal elements of S are rhe varia~lcesof the estimation eIrors and J is ther-efore the sum of the variances of estimation errors, which is eqtiivalenr to the expecced sum of squarcs of the estimation errors. Alti30;lgh it is possibk to cierive the estimatior? error covariance matrix from the data reconciliation soluticns for the measured and unmeasured variables ziveri in Chapter 3, we descl-ibe later an alternative sequential update procedure which is more usefa! in the context of the sensor ~ e t w o r kdesigri problem. Different approaches have been developed to soive the sensor network design problem. In the fclioviing section, we discuss these methods which consider objectives of estimation accuracy, observability and cost.

SENSOR NETWORK DESIGN Methods Based on Matrix Algebra

Let us consider a linear flow process for which the material balances are given by Equation 3-2:

Let us separate the flow variables into a set of n-m irzdeperzdenr variables XIand a set of rn dependent variables x, and rzcast Equation 10-3 as where the variables x represent the strean? flows. In general, only some of thew flow variables are measured and the relationships between the measurements and stream flows can in general be represented uslng

where each row of matrix H is a unit vector with unit in the column corresponding to the flow variable which is measured, the number of rows being equal to the number of rneasurernents. We have chosen Equations 10-3 and 10-4 to represent a partially measured linear process rather thari the equivalent alternative model Equations 3-1 and 3-11, because it is Inore convenient for the purposes of designing sensor networks. Mininzum Observable Serzsor Networks We are interested in sensor networks which ensure the observability of all var-iables. We, therefore. first address the question of the minimum nilrnhe; of measurer?lents to he made in order to ensure that every flow variable is observable. LVe will for con\.e~iiencerefer tci silch a design .ts a !nitllmrlin c!O.setvni)!e seizsot. rzerlcoi-k. If there are 11 stream flows to be ebtinlated and :ye have nz flow ccnstrair~ts,then it i:, evident rh:tr at leaxt I:-m flows must be specified. In other words, :tic n~inirnumr,umSer cf ~neasuremerltsis 11-171. For the flow process cor~sidei-edin Example 1-1. the ~niniaur-nof meas~~rernenis io be made in order to ecsure that all stream fiou,s are observable ic 2, since there are 6 streams and 4 flow balanczs. Czse 3 of Ex:imple 1-2 is a specific instance of a minimum cbserl~ablesensor uet\vork for this process. AII additional poinl to be noted is that in a niinirnv111 obsergable sensor network none of the measured variables is reduldant and the rtconci!ed values of these variables are exactly eqttal to their respective lneasured values. No: every combination of n - ~ n measureinents will give rise to an observable systern. however. For cxaruple. in Case 3 of Example 1-2. although two rneasurernents are made, the flowis of streams 2 to 5 are unobservable. The condition that a sensor network must satisfy in orde1- to ensure observability of all variables in a linear process is discussed as follows:

The dependent variables are chosen in such a way that the matrix A, is non-singular. If we ~neasureonly the independent variables, then we can use Equation 10-5 to compute unique estimates of the dependent variables as


It is thus clear that this sensor network is a minimum observable sensor network. Therefore, the condition to be satisfied by a minimum sensor network in order to give rise to an observable system is that the sub-matrix corresponding to the unmeast-lred variables should be nonsingular. Note that this irnplies that the columns of the constraint matrix corresponding to unmeasured variables are lineariy indspendent (which is the oilservability condition in Exercise 3-3). If :nore measurenients zre made than the niiinirnurn 1-equired to ensu-e cbservabiiity of all variables, then we obtain a rt.du:zcIatarl: seizscr r~erwork desigiz. Ever: if a rzdundant sensor netwoik is designed. it does not automatically imp!y !hat all flows are observable. There coltld be stlosets of variables v,.hich are unobservabIc while the rest are redundant. A redund it' we dant se:?sor network gives rise to an observable process if a ~ only can choose rz-in independent t~ariablesfrom among the set of rneaa:wci vzriables such that the coi~straintsubrrlatrix correspcnding to the remaining variables is nonsingular. In this case, the dependent set contains oce or more measured variabies. We refer to such a design as a 1-eclr~nd~znr obsen>ablrscrzsor network. We can always obtain a redundant observable sensor network starting from a miniinurn observable sensor network by choosing to additionally ~xeasureone or niore of the unmeasured variables in the minimum scnsoi network.

Estir~mtionAccuracy of Minimurn Observable Sensor Networks now consider a minimum observable sensor network design we and obtain the overall estimation accuracy for reconciled estimates. Based on the discussion above, we can choose th2 measured variables as the independent variables. For any observable sensor network (nonredundant or otherwise), the estimates obtained using data reconciliation must satisfy constraint Equation 10-3. Thus, Equation 10-6 can also be used to relate the reconciled estimates for an appropriate choice of the independent and dependent variables. Since there is no redundancy in a minimum observable sensor network, the estimates of the independent (measured) variables are equal to their respective measurernents. This implies that the covariacce matriy of estimation errors in the independent variables is equal to QI,which is the covariance matrix of meastlrenlent errors of the independent variables. Let us denote the covariance matrix of estimation en-ors corresponding to the independent and dependent variables by SIand S,,. respectively. Then \ve obtain from the preceding arguments that

matrix corresponding to the dependent variables is nonsingular). For each combination, the independent variables can be chosen as the measured variables and the measure J for each sensor network design can be computed using Equation 10-10. The combination that gives the least J is the optimal sensor network design that we seek.

Example 10-2 We will illustrate the minimum observable sensor network design that maximizes estimation accuracy for the ammonia process shown in Figure 10-1. We will limit our consideration only to the overall mass flows of this process. For simplicity, let us consider the case when the flow sensors used for measuring any stream have an error with variance equal to I. Since there are 8 streams and 5 process units, we require a minimull1 of 3 sensors to observe all variables. The different feasible combinations of sensor locations along with the corresponding measures of estimation accuracy are shown in Table 10-9. We can observe that there are 6 optimal sensor network designs corresponding to sensor locations (1, 2. 6). (2, 5, 7), (1, 3, 6). (3, 5. 7). (1, 4, 6 ) . and (4, 5, 7) with a minimum expected sum square of e s t i ~ ~ a t i o n errors equal to ! 1 units. It can also be observed tllar, although there ale

Lising Equatians 18-8 and 10-6 :rJecan show that

Combining 5quatic)ns 10-2. 10-3. a i ~ d10-9. ihc measure tiir overall esti:natio:: accuracy for a aiinimurn obszrvahle ser?s(>rnetwork car1 be expressed as

A niininit~rnobservable sensor rietwork design that minimizes J defined by Equation 10-10 is desired. 111order to solve this problen~.a nlixed integer optilnization pr-oblem can be used, which is desci-ibed later in this chapter. Hcre we \vill use a nai've approach and examine every ft.trsil2le combination to d e ~ c r ~ n i nthe e optimal solution. We can select every co111bin;ttion of n-111 independent variables (such that the sub-

Figure 10-3. Simplified ammonia process


56 combinations of choosing 3 sensors locations out of 8 sensors locations, only 32 of these combinations give rise to observable sensor network designs. Table 10-2 Minimum Observable Sensor Network Designs for Ammonia Process No.

Measured Variables

Overall Expected Estimation Error

Redundant observable sensor networks. The measure of estimation accuracy for redundant observable sensor networks, can be computed using simple update formulae developed by Kretsovalis and Mah [ 5 ] .Let us begin with a minimum observable sensor network corresponding to a set of n-m measured independent variables, xl and the remaining unmeasured variables xD. The measure of estimation accuracy for this sensor network is given by Equation 19-10, Let us consider the addition of a new . q be the variance sensor to measure one of the variables in the set x ~Let in the error of this new measurement. As in Eqazuation 10-4, the new measurement y can be related to the variables x by


and hT is a unit row vector with unity in the colun~nposition con-esponding to the new @variablebeing measured. The expected es~imateerror covariance mr:trices of the independe~lf and -dependent -v~ariaSles after the -., addition of this new measurenient, S,, and S1, respectively are $\.en by



The change in the nteasul-e of i-ctirnntinn :Icc:vracy due t~ of this new measurement can be show11 to he

!hi_. additioit

Equations 10-12, 10-15, and 10-16 can be directly used to compute the change in the measure of estimation accuracy due to the addition of a new measurement using the covariance matrix of estimate errors in the preceding sensor network design solution. Thus, starting from a minimum sensor network design solution the measure of estirnation accuracy for a redundant sensor network design can be obtained by successively adding the required measurements, and using the update equations after each addition. Similar equations can also be derived for deletion of a measurement from a redundant sensor network design solution. In this case, the change in the measure of estimation accuracy is given by

Choosing these measured variables as the independent variables, the covariance matrix of estimation errors in the independent variables is the identity matrix (of dimension 3). The matrix F is given by

If we choose, in addition, to measure the flow of stream 1 , then the vector which relates this measurement to the independent variables is given by


and the updated covariance matrix of estimate errors i n the independent va1-iables is given by

S, = S, + ~ , s , K G ~ s ,

The value of kl from Equation 10-15 is equal to 1!(1+2) = 113 and hence the decrease in estirnation error can be calculated from Equation 10-16 a5 -3.3333. The updated covariance matrix of estimate errors of independent variables is given by


It should be noted that rhe sct of independent variables and d~.perldzni lsariablcs do not change as new ~nea:;iirzinei;ts arc added or dclztzd. so that after a series of additions 01 ciele:ic?ns each of these sets can contain a mixture of measured and unl~ieascredvariables. Care must be taken. hou~ever.when a measuremelit is deleted to mrure that an ~~nobszr\,aS!e dzsign is not ob;ai!lcd. In f x t , if a mea\urernerlt is deleted which can icad to :in l.:nobservable process. then the der,c;min'ltor :II Equatior? 10 19 will become zero and this can be used as an indicator to iivoiti 5uch choices.

Example 10-3 We will consider the arurnonia process example and cotiipute the decrease in the expected en-oi- in estirnatcs for the addition of a single measurement to a minimurn obsel-vable serlsor- network design. The 1:;11-iances of all sensor errors are taken as unity as before. For this purpose. we will start with the optimal minimum observable sensor design obtained in Example 10-1 in which the variables 2, 5. 7 are measured.

ir, order to design an optimal redunda~tobservable sensor netm.orh ffir a specified number of sensors, say i - ( I - > I Z - ~ ? : ; ive can start with any ~:lini m u ~ ngbsen~ablesensor network design and add i--iz+rt: additiocal sensors, Gne a: a tirnz and update the 1neasu1-eof estination r;cclil-acy usi:iz FAuations 10-12. iii- 14, ail3 i0- 15. \Ve can then i-riocc2ir the sc11sa:-sby. in tGm. add:r?g 3 new nieasurenlent and deleting an exis:ing rnzasal-=meill m get a new redu~ldantobservable sensor ~ e t w o r kdesigil conristing of isensors. Equations 10-17 through 10-19 can be used ior upaatir~gthe measure of estimation accuracy when a measurement is deleted. In [his mar - >- a- 1 'l l I'-" ' L ~ S ~ ~ Y L L ~ l l b i ~ l a ic~lr i ori isensor ~ iocations can be examined i r c\-r'-~,rtr, find the design which gives the maximum expected estimation accuracy. 'This will result, however, in an exponential number of solutions to be examined for a general problem. Kretsovalis and Mah 151 outlined two sub-optimal design procedures for a redundant censor network desigfi for a specified number of sensors, as described o n the following page. L L I Y ~ ,

I_fe\ry~r4 .SL'I~ior NL'III.OI-k \

Algorithm 1 Step 1. Determine the optimal minimum observable sensor network. Step 2. Add a new sensor in turn to each of the remaining unmeasured variables and co~nputethe reduction in estimation error using Equation 10-16. Step 3. Based on the results of Step 2, select the best r-17+177 sensor locations (the locations that give maximum reduction in estimation error) to obtain the redundant sensor network design.

Algorithm 2


addition of a single sensor as observed from colu~nn1 of Table 10-3. On the other hand, if we apply Algorithm 2, then we would first select variable 1 (or 6) to be measured since this gives the maximum estimate error reduction (column 1). In the next iteration, we select variable 6 (or 1) to be measured since this gives maximum estimate error reduction (column 2) among all remaining variables, and, finally, variable 8 is chosen to be measured due to the same reason. In this example, both algorithms give the opti~nunlsensor network design. although in general this may not be the case. Table 10-3 Expected Estimation Error of Redundant Sensor Network Design for Ammonia Process --

Step 1. Same as in Algorithm 1

Change in Expected Estimation Error (MeasurementsAdded)

Step 2. Same as in Algorithm 1. Stcp 3. Determine the sensor placement that gives the rnaximur~treduction in expected estimation error from the results of Stcp 2 and add it to measured set of I anables. Stop if the number of measurements inade so f a - is. I - : or e!se retcrn to Step 2. Both the abave algorithn:~3c not necessariij~give the s e x o r nztwork design that Inaxirnizes cstirnation accuracy. 5l;t reduces :he computationa1 burdsn significantly. hfirzilnrin~Cost Sensor iVefu~orkL)esig:zs

T.\le apply the above two algorithms for designing redundan: observabie senso!- networks using six measurements for the ammonia process. Frorn Example 10-2. the optimal minifzlum observable sensor network design corresponding to measured variables 2, 5, 7 is chcsen. We have tc. select three addiiio~~ai villidbies TG be measureci with the objective of reducing the expected es:i,.;ation error as milch as possible. Table 10-3 shows the expected decrease in estimation error achieved by addin,0 one. t\vo, and thtee additional sensors for different coinbinations of the variables selected to be measured. The ~naximumesti~ilateerror reduction is achieved by choosing to measure additionally the variables 1 , 6, 8. If we apply Algorithm 1 above we would select the variables 1, 6, 8 to be measured since these give the maxi~numestimate error reduction for

Insea2 of maxifilizing esti~nationaccut-acy. a n:inimun~ cost sefisor network may be designed that ensures observability of all variables. This ob.iectivc fur~ctionwas considered by Madron and Vevcrka [6j for sensor network design. Althougl~several other issues were considered in their work, we limit our consideration to the design of minimum observable sensor networks at minirnun? total cost. The design algorithm proposed by Madron and Ververka [6]essetitially attempts to obtain a set of dependent variables such that the measured independent variables wi!l have the least total cost. The colun~nsof the constraint matrix are first arranged in decreasing order of the cost of the sensor for rneasuri~lgthe corresponding variables. A Gaussian elimination procedure is applied, with the pivot element

being chosen from the next available column if possible, and reordering of the rows and columns is done if required. This procedure stops once the first m columns form an identity matrix. The least cost minirnum observable sensor network design is obtained by measuring the variables corresponding to the remaining rz-m columns of the constraint matrix. We will illustrate this procedure by means of the following example.

Example 10-5 We consider the ammonia process with the sensor cost data for nieasuring different variables given by Table 10-4. Table 10-4 Flow Sensor Costs for Ammonia Process -


Sensor Cost

1 2 3 4

2.5 4.0 3.5 3.0


1.0 2.0

7 8 .. -






2.0 1.5


The order in which the pivots were selected for Gaussian eliinination are (1, I), (2,2), (3,3), (4,4), and (5,6), where the elements within brackets indicate the row and column index of the pivot element. It should be noted that for selecting the pivot elemelits the columns had to be rearranged since a nonzero pivot element was not available in the next column. The least cost mininiuni observable sensor network design is obtained by measuring the variables 6, 8, and 5 corresponding to the last three colun~nsof modified matrix A. Madron and Veverka [ 6 ] also considered constraints on the sensor location problem such as specifications of which variables are unmeasureable and which variables were required to be estinlated. They also considered the problem of locating additional sensors in a given partially n~easuredprocess in crder to obtain an observable sensor network dcsign at mini~nlurnadditional cost. In order :o solve these problems. tire columns of the constraint matrix A have to be ordered appropr~atcly befcre applying Gaussian elimination. 'The details of the pracedure tnay be obtsined from the pubiication by Madron and Veverka i61.

The constraint matrix for this process is giver, by Methods Based on Graph Theory


where the colurnns are arranged in order o! decreasing sensor costs for measuring the correspo~ldingstrearn flows and the rows are the flow balances for nodes 1 to 5. After- applying Gaussian elimination to obtain an identity matrix in the first t n columns. we get the following modified matrix

Sensor networks far linear tlow processes can be desi%nzd elegantly dsifig graph-theorecic techniques. Unlike other methods. powerful insights are obtained concernkg the :structure of the seasor neiwork which make it possible to develop efficient algorithms for solving :he design problem. We wili again consider the design of sensor networks for maximizing esti~nationaccuracy or for minimizing the total cost. The additional graph-theoretic concepts required for understanding r h r method discussed in this section can be found in Appendix B.

Maxit~zuazEstinzatiotz Accuracy Serzsor Network Design



Minimum observable sensor networks. In the preceding section we stated that a miniinunl observable sensor network can he designed by

choosing n-m independent variables to be measured such that the constraint submatrix corresponding to dependent variables is nonsingular. In other words, our choice of independent variables should make it possible to express each of the dependent variables as a linear combination of independent variables only. In Chapter 3, we showed that all unmeasured variables are observable, if no cycle containing only unmeasured variables exists in the process graph. We also showed that in order to ensure observability of all unmeasured variables using a minimum number of measurements, the unmeasured variables should fonn a spanning tree of the process graph. In other words, a minimum observable sensor network can be designed by simply constructing any spanning tree of the process graph and choosing the flows of chords of the spanning tree as the measured variables. In this case. the chord stream flows are the independent variables and the branch stream flows are the dependent variables. Note that this is similar to the choice of independent and dependent val-iable choice made in Simpson's method for solving bilinear data reconciliation problems efficiently that was discussed in Chapter 4. The relationship between dependent and independent variables can also be obtained easily using the fundamental cuisets of the spanning tree. As descri!>ed in Appendix R , a fundarnenta! cutset, with respect to a branch of the spanning 1:-ze, contains ope or more chord:; acd the stream !low coi-responding to the branch can be written in ternis (jf these chord streams floivs as

where K: is the fundamental cutset with respect io branch i. The elr,rnent.; pij are O if chord j is not an element of K,': otherwise they are +1 or -1 +:2rzc!i7g cr. whether chord .j has the same or 3pposite orientatlnr. 15 branch i. If the variance in the measurement error of chord flow j is o'. J then fro111Equation 10-2 1, the expected variance of the estimate crror of branch flow i can be obtained as

For a minimum observable sensor network, the estimate of the measured stream is given by the measured values themselves. Thus, the expected variance in the estimate of a measured variable is equal to its measurement error variance. The overall expecled estimation error (rneasure of estimation accuracy) is the sum of all the expected variances in the estimate of all variables. Using Equation 10-22, we get the overall measure of estimation accuracy as

where k, is the number of fundamental cutsets of the spannin,o tree in which chord j occurs. Equation 10-23 is exactly equivalent to Equation al concepts 10-10 except that it uses spanning tree and f u n ~ ~ m e n tcueset instead of their matrix equivalents.

Example 10-6 The process graph of the ammonia process considered in Example 10-2 is depicted in Figure 10-2. A spcnning iree of ths process graph is s h o w in Figure 10-3 which consists of branches 2, 3, -2. 5, and 8 and chords 1. 5. arid 7. ~Zorrsspcrndirigio this spannirig tree, in the minimurn obse:-vabls sensor cetwork design ihc flclws c!f strear::s 1, 6. and '7 a-c mcasu~-26.Thifilndamental cutsets of this spanning tree are [2. 1, 71, 13.1, 71, [&, 1 . 71. 1, 6, 7j, and [8, 1, GI whsrc the branch in each fundamenral cutsi-t =irt: denoted by 2n underscore. Chord 1 occurs ir. five fcndamental cutssts. chord 6 in two. and chord 7 in four. If we :issunie all measurerrterlt enor variances as unity, then from F4uation 10-13 we get the overall expected estimaiicn error for this sensor design as 14. This va!ue may be compared with the solution given in Table 10-1. A process graph can contain several spanning trees. The numbel- of spanning trees which is equal to the number of fcasible minimum ohsen-able sensor network designs can be as large as nu-', where n is the number of nodes in the process graph [ 7 ] . There are several algorithms for constructing a spanning tree of a process graph and for finding the fundamenla1 cutsets of the spanning tree. Some of these algorithms alons with computer programs i11 FORTRAN language have been described in Deo (71. It should be kept in mind that, for the purposes of constructing a spanning tree the direction of ihe streams are ignored, that is, the process


We can start with an arbitrary spanning tree and use the elementary tree transformation technique to successively obtain new spanning trees which gives improved overall estimation accuracy. The o~llyissue to be resolved is to select the chord and branch to be interchanged to improve estilnatiorl accuracy. An algorithm for this purpose is outlined below.

Algorithm 3 Step 1. Construct an arbitrary spanning tree of the process graph. Figure 10-2. Process graph of simplified ammonia plant

Step 2. Determine all fundamental cutsets of the current spannin,0 tree solution and compute its overall estimation error using Equation 10-23. Step 3. Find the number of occurrences of each chord in the fundamental cutsets and compute the contribution ( k i + l ) o t of each chord i to the overall estirnation inaccuracy, and I-ank the chords in decreasing order of their contribution.

Figure 10-3. Miqiniurn observabie sensor neiwork.

graph is treated as an undirected graph. Tile directions of thc sirzans arc csed only to obtain the c~cfficien!~,piJin Equation 10-21 if reqnircd. Tile problem cf designing a minimum observable sensor actwork thai rnaxirnizes .-,stirnation accuracy (cr e.qilivalently miilimizes J ) ca:] be restated as ille prob!cm of constructi~lga span!~ingtree of the process graph which gives the least value of J as defined by Equatior? 19-23. Starting with a spanning tree. we call generate a new spanning tree by means of a chard-branch interchange described in Appendix B as an elementary tree transformation. In this technique, if we add 3 chord to the spanning tree, then we aflould delete a branch fro111 the fundaruental cycle forrned by tlie chord. in order- to obtain a new spanning tree. This elementary tree transforma[ion implies that the sensor ~neasuringthe chord stream flow is removed and instead a sensor is used to measure the branch flow deleted from the initial spanning tsee solution.

Step 4. Select tlie next ranked chord. say c,, from the crdered set of chords arid find the fundamental cycle formed h \ chord c,. Stop if !liere are no more chords to be examir?ed: else ma!; :he branches in the fundaniental cyclc in increasing order of their ineasure~nenterror \ arinnces. Step 3. Select the next ranked branch, say bJ, from the fu!u!?darne~:talcycle and interchange cliord ci and t;ranch b, to <)btain a ilew spanning t~-ec.i f there are no rr,ore brafiches to be examir~ed,return ro Step 4. Step 6. Obtain thc fundamental zutszts v.4t.b respect to T ~ new E spannjng tres and cornputz the overali estimation error '~si;tgEquation 10-23 col-resp~ndingio this new soluticn. If the overail estimatic;n error of the nex. solution is less than that of the old spanning tree. replace old solutio:~ v:ith new spanning tree and return to Step 2: or eise i-cstore old spanning tree solution and return to Step 5. In the above algorithm, at each stage an atle~i~pt is rnadc to obtain a new spanning tree solution having better estimation accuracy, by a chord-brancli i~lterchangein the c u ~ ~ e spannin~ nt tree. If this artetnpt is successful, then the current solution is replaced by the new one arid the procedul-e repeated. If after systematically exarilining all possible chord-branch intercha11;--L> an improved solution is not obtained. then the algorithm stops.

The procedure adopted in the above algorithm is known as a n e i g h b ~ r l z ~search ~ d technique since only the neighboring spanning tree solutions which differ from the current one in respect of only one branch are examined for obtaining a better solution at every iteration. 'The algorithm, therefore, gives only a local optimum solution and not the global optimum (sensor network design with least estimation error).

the maximum (or minimum) weight spanning tree is one of the classical problems of graph theory that has been well studied. Several algorithms are available for determining the maximum weight spanning tree in a straight-forward manner [7], and we choose to describe Kruskal's algorithm below 18):

Example 10-7

Step 1. Sort the streams (edges of the process graph) in decreasing order of their weights. Initialize the set of edges in the tree, T to be a null set.

The above algorithm is applied to the ammonia process using the initial spanning tree with branch set [2, 3, 4, 5, 81 considered in Example 10-6. From the fundamental cutsets of this spanning tree obtained in Example 10-6, the overall estimation error is computed as 14. Moreover; the chord set ranked in order of their individual contributions is [I, 7, 61. We select chord I and find the branch set which forms a furldamental cycle with this chord as [2. 3, 4, 5, 81. Since all nleasurernent error variances are equal, we can choose any branch for the interchange. If we arbitrarily select branch 2 and interchange with chord 1, we get a new spanning trse with branch set [ I , 3, 4. 5 , 81. The overall estimation en-or of this solutio11 is 12 which is less than tlle current solution. Therefore, we accept this new solution. If we repeat this procedure with i h ~ new solution we find that none of ihe chord-branch interchanges results in a bettzr solution. Thus, rhe satsor n e t w ~ r kdesign ob:ained by this algorithm corresponds i G the measurelnents of streams 2, 6, and 7 with estimatioi? error of I?. This soIution is worse than the global optimilm desigr, which has an estimation error of 1 1 a!: observed from Table 10- 1. P.lgorithms for redunc!ant sensor network desiyts for n~sxintizingesti:na:iolr Liccilrilcy using graph theoretic techniques are >let to he de\,eloped Minimum Cost Sensor Aretwork Design

The desipn of a n~inimumobservable sensor netviork which has the least cost amang all minimum observable sensor networks can be easily accomplished using graph-theoretic techniques. From the discrtssion in the preceding section. we note that every spanning tree corresponds to a minimum observable sensor network. If we assign a weight to each stream which is equal to the cost of the sensor required to measure the flow of the stream- then the problem considered here is to determice tiis maximum weight spanning tree, where the weight of a spanning tree is the sun1 of the weights of its branch streams. The problem of deterrninirip

Step 2. Pick the next edge from the sorted list. Step 3. Check if the edge picked forms a cycle with edges of the partial tree constructed so far. If so, discard this edge or else add edge to set T. If the number of edges in T is less than n (number of process units), return to step 2, or else Stop. The method of checking if an edge forms a cycle with other edges in a set, and the other operations to be carried out when an edge is added to T, are explained in Deo 171. We illustrate this algorithm in the following exalnple.

Example 10-8 VJe repcat Exan~pie10-5 tc find the least cost minimum observable sensor network of the ammonia prccess, bur this time using Kruskal's algorjthm. The c~rderof the streams in decreasing order of weights (sensor costs) is 12, 3, 4, 1. 6 , 7 , 8. 51. We pick edge 2 and add it to the ;see being constr~ited.h'ext, we pick edge 3 from the sorted !is[ arrd add it co the tree (since it does not form zi cycle with edge 2). \Ve continue io pick and add edges 4, and 1 because they do nct fom cycles with ~ t h e edges r added so far to the tree. When we next pick edge 6, we find that it cannot be added to tree since it forms a cycle with edges 2, 3, 4, and 1 added to tree so far (can be visually verified from the ammonia process graph of Figure 10-2). So we discard it and pick the next edge in the sorted list and add it to tree since it does not form a cycle with other edges. \Ve stop because we now have added 5 edges to the tree which is equal to the number of process units. The resulting spanning tree is shown in Figure 10-4. Corresponding to the spanning tree the sensor network design measures chord streams 5, 6, 2nd 8. Comparing with the solution obtained in Example

10-5, we can observe that this is the minimum cost sensor network design among all rnini~nunlobservable networks. The order of choosing the edges of the spanning tree is identical to the order of picking the pivots from the columns of the constraint matrix in the linear algebraic method used in Example 10-5.



if x,is not measured if xi is measured

Let c, be the cost of sensor for measuring flow of stream i, and let ci; be the maximum allowable standard deviation of the error in the estimate of variable i. A minimum cost sensor network design which satisfies the constraints on estimation error can be formulated as



qi subject to

Gi I o,

i = I...n

Figure 10-4. Minimum cost observable sensor network


Methods Based on Optimization Techniques

The p~-oblemof ser:sor !letwork design can also be forrnulateti ss s 11iathema:ical cpti~ni/.aiionprob!eni ant1 solved usi:lg appropriate o ~ j t i rnization tecll~iiques.In the preceding sections, the design cf sensor networks with the objective of either it3inimizing cots or filaxiriz~ngestimation accuracy was considered. Optimization tcchciques offer the 1:ossihility of siinllltaneo:~s!y considering differecr objectives arid a l s i inlposin~otlier constraints. Fur!hermore, i: can be estended to consider more gel~eralprocesses illvolving flows. temperatuixs, pressures and concentration measurements. H3giijeivicz [9j the first to formulate sensor nerwork design opti~nization131-oblern.We describe here only the Sonnulation of the problem and refer the reader to standar-d texts on op~imizationfor the details of the optimization rechnique used to solve the probiern. In sensor- network design. the important decision to be rnade with regard to each stream flow variable is whether to measure it or not. In or-der t o ~nathematicallyformulate these decisions, we can convenicntly make LISZ nf binary (0-1) integer decision variables qi, one for each str-eam i which have the following interpretation.

where is the standard deviation of the esror in the estimate of flow of stream i and is thc square root of the dlagonal e!ement of cokariancz matrix 3f estimation errors, S, which can be computed for- ar?). ciloice of sensor locations (defined by the valuzc chosan Sol- the binary var-iables o,,) using Equation 3-10, The above problsm is a mixed integer optimizatior, proble~nand can be sclved using techiliques such a5 branch and bound. Alternatively, commercial optimization packages such as GAMS, GINO or MINCS ~ h i c hhavc a suite of opti:nization iechniq~ebcan be efl'cctivelq used for solviny the atxve problern. In the aboie optimization problern for:~:ula:icn, miui;nizution of the sensor network cos: was used as the objective (Equarion 10-25>,subject to a minirnum accuracy specification for the estirnaies. Alternativelyl we can choose to maximize the overall estilnation accuracy subject to a maximum limit on the cost of the sensor network. Ocher constraiilts, such as minimurn deslred reliability, can also be included in the problem S(.>i.mulation.

DEVELOPMENTS IN SENSOR NETWORK DESIGN The earliest statement of the sensor network design problem for maximizing estimation accuracy through data reconciliation was given by Vaclavek [I01 for linear (flow) pi-ocesses. Later, Vaclavek and Loucka


Dura Recol~cili~~rio~l and G'rus~Errol- Derecriorl

[I 11 proposed algorithms for sensor network design for ensuring observability of important variables in linear as well as multicomponent (bilinear) processes. Almost two decades later, Kretsovalis and Mah [ S ] proposed a systematic strategy for solving this problem, where a measure of estimation accuracy was formally defined and algorithms proposed for design of redundant sensor networks for maximizing estimation accuracy. Ali and Narasimhan [12, 13, 141 addressed the problem of sensor network design for maximizing reliability and developed graph-theoretic algorithms for this purpose. Matrix methods for sensor network design for maximizing reliability were independently developed by Turbatte et al. [15]. Observable sensor network designs for linear and bilinear processes using matrix methods were also addressed by their group [16]. More recently, the issue of sensor network design for improving diagnosability and isolability of faults have beck1 tackled by Maquin et al. [17] and Rao et al. [IS]. Independently, Madron and Veverka [6] tackled the problem of ~ninimum cost observable sensor network design using matrix methods. Bagajewicz [9] formulated the sensor network design as an optimization problem. The use of generic optimization algorithms fr3r sensor network design considering different objectives such as cost, estimation accuracy, and reliability was reported bj. Sen et al. [19]. Recent!y, Bagajewicz and Sanchez [20] presented a n~ethodolozyfor designing or upgrading a sen53r netork in a process plant with the goal of achieving a certair, degree of observability and redilndancy for a specific set of variables. Although significafit progress has Seen made in t l i ~design of sensor networks, a compreliensive strategy sirnul:aneously considering different objectives still has to be;oped.

Design of Ser~sorNerworks


SUMMARY The location and accuracy of sensors determine the estimation accuracy of data reconciliation and performance of gross error detection methods. The unmeasured flows in a minimum observable sensor network design for a flow process forms a spanning tree structure. A minimum cost minimum observable sensor network design is equivalent to the minimum weight spanning tree. The general sensor network design can be formulated and solved as a mixed integer nonlinear optimization problem.

REFERENCES 1. Liptak, B. G. hzstrument Engineers' Handbock-Process Measur-eilzer!rciizd Aizalysis, 3rd ed. Oxford: Euttenvorth-Heinemann. 1995. 2. Peters, M. S.. and K. D. Tinmerhaus. Plui~tDesigiz u ~ z dEcoiloiliic~,;OIC1ze~i:icaiEi7gitzet.r~.New York: McGraw-Hill. :980.

3. Coulson, J. M.: J. F. Richardson, and R. K. Sinnott. CI:er~zicalGrgiizeer-ill?P'ol. 6. Desigiz. Oxfard: Pergamcn. 1983. 4. Mah, R.S.H. C h ~ i ~ i c Process al S?i-ticturesa;ld fi:Jonmtioil Flotcs. Boston: Butte~xorths.1390. 5. Kretsovalis. A . , and X.S.H. Atah. "Eifect s f Redu~dattcyoil Esiimation Accuracy in Process Data Reconciliation." C11en1.E;iig. Sri. 42 ( 1 9 S 7 f

2! 15-2321. 6. Madfor?,F., an* V. Veverka. "Optimzl Selection of Measuri~lgPoinr in Complex Plants by Linear Models." AIChE Joumal38 (1992): 227-236. 7. Ceo. N. Grap11 Thecln r+
8. Kruskzl, J. B.. Jr. "On the Shortest Spanning Subtree of a Graph and rhc Travelling Salesman Problem." Pi-oc. .4m. Marlz. Soc. 7 (1956): 48-50.


Data Rrco~lciliatio~~ olld GI-ossEi-I-or Detecr~orr

9. Bagajewicz, M. "Design and Retrofit of Sensor Networks for Linear Processes." AIChE Journal 41 (1 997): 2368-2306. lo. Vaclavek, V. "St~~dies on System Engineering-111. Optimal Choice of the Balance Measurements in Complicated Chemical Engineering Systems." Chem. Eng. Sci. 24 (1969): 947-955. 11. Vadavek, V., and M. Loucka. "Selection of Measurements Necessary to Achieve Multicomponent Mass Balances in Chemical Plants." Chein. Eng. Sci. 31 (1976): 1199-1205. 12. Ali, Y., and S. Narasimhan. "Sensor Network Design for Maximizing Reliability of Linear Processes." AlChE Journal 39 (1993): 820-828. 13. Ali, Y., and S. Narasimhan. "Redundant Sensor Network Design for Linear Processes." AIChE Joccrnal4 1 ( 1 995): 2237-2306.



Industrial Applications of Data Reconciliation and Gross Error Detection Technologies

14. Ali, Y., and S. Narasimhan. "Sensor Network Design for Maximizing Reliability of Bilinear Processes." AICIiE Journal 42 (1996): 2563-2575. 15. Turbatte, H. C., D. Maquin, B. Cordier. and C. T. Huynh. "Analytical Redundancy and Reliability of Measurement Systern," presented at IFAC/IMACS Symposium Safeprocess '9 1, Baden-Baden. Germany, 199 1. 49-54.

16. Ragot, J., D. Itlaquin, and G. Bloch. "Sensor Positioning for Processes Cescribcd by Bi!inear Equations." Diczyrlo.c.tic rr srtr?te c/e fntzc:ior~inrrlr 2 (1992): 115-132. 17. Maquin. 3., M. Luong. and J. Ragot. "Fault Detection and Isolation and Sensor Network Design." R4IRG-APII-JESA 3 1 (1 997): 393-406.

13. Rao, R., M. Ghushan, 2nd K. Rengaswa~ny."Locating Sensors in Complex Chemical P1a.n:~ Eased on Fault Diagnostic Ohstr\;a'oility Criteria." AI'C!IE J~!r7-tzul15(1999): 3 !O-322. 19. Sen, S., S . Narasi~n!ian. and K. Deb. "Sensor Network Design of Linear Processes Using Genetic Algorithms." Conzpuret-r Clzen~.Etzg~lg.2'2 (1993): 385-390. 20. Bagajcwicz, M. J., and M. C. Sanchez. "Design and Upgrade of NonredunAavt r!-.l,Rcdn~dantLinear Sensor Networks." AICIzE J o ~ ~ r t ~ 45a l(!?93!: 1927-1938.

Data reconciliation technology is widely applied nowadays in various chemical, petrochemical. and other material processing industries. It is applied offline o r in connection with on-line applications, such a s process optimization and advanced process control. This chapter presents a review of major industria! applications of data reconciliatiori and gross elsc;r detecticr, reported in the 1itera:ure. Based or. this published infamation, w e only describe the brozd features of the applications without going inta the details about the particular solution technique o r the software ~ l s e d .However, we describe in greater detail two applications (with which the authors were personally associated) in order to highlight some practical prob:en~sand their resclution. Although there are many other industrial impleinentations and software for daia reconciliation applications, they are either proprietary (and n o detailed information is publicly disclosed) o r the p~lblishedsource of information is not easily accessible. . . Th,. ,,,, znaiysis In :his chapter is organized according ro the rnajor industrial types of app!ications for data reconciliation technology. From the multitude of industrial data recorlciliation app!ications, w e can distinguish three major types of applications:

1. Process unit balance reconciliatioil and gross error detection 2. Parameter estimation and data reconciliation 3. Plant-wide material and utilities reconciliation


D:,to Hecr~~iriliurion arid Gross Error-Detecrior~

Instrument standard deviations are very important in data reconciliation, and a syste~naticestimation procedure for the standard deviations should be employed. Redundancy is another important factor in data reconciliation solution and capability of detecting gross errors. Additional instrumentation may often be necessary to achieve a satisfactory level of redundancy. An optimal sensor location design software is ideal for data reconciliation applications. Gross error detection should be always followed by instrument checking and correction. Uncorrected instrument problems deteriorate the quality of the data reconciliation solution. Data reconciliation/validation is a complex problem that might require more than one solutlon technique. Since important flows and temperatures in tne pant may be nonredundant, data filtering and validation can be used to provide a quality solution overall. A satisfactory data reconciliation system should have enough flexibility to handle process configuration changes, variable standard deviations, missing nleasure~nentrand to accept various kinds of equations, including inequality process co~straints. More guidelints and challenges stiil facing data reconciliation technc?logy have bee^ poinied out by Ragajewicz and Mullick 131: For rerinery applicatioris wid, a large vzrie~yof stream cnrnposiriocs, proper assay cha;actenzation is the key to a successful data reconciliation. With inaccura~ecompositions, results may not saiisfy material balances and good measurements may 'Je ide~ltifiedas gross errors. For more acccrate darz reconcilia:ion, marerial and energy balance recoilciliatior: is necessary. Heat balances enhance the redundancy in the flow measuremenis and aIi in~proveciaccuracy in the reco~lciledflow rates is obtained. *The steady-state assumptior, and the data averaging m i ~ h tnot be satisfactory for some processes and material balances cannot be accurately closed. Dynanuc data reconciliation software needs to be developed for such processes and especially for advanced process control applications. Rigorous models do not necessarily increase the accuracy in the data reconciliation solution, but they enable merging data reconciliation and the associated parameter estimation problem in one run. Increased accuracy in gross error detection is still a current need, since none of the existent methods and strategies provide effective gross error detection for all types of errors and error locations simultaneously.


Typical software for process units material and energy data reconciliation and gross error detection are: DATACONT" (Simulation Sciences Inc.) [3, 5 , 61, DATREC (Elf Central Research) [4], RECON (part of RECONSET of ChemPlant Technology s.r.o., Czech Republic) 1211, VAL1 (Belsim s.a.) 1141, and RAGE (Engineers India Limited) [I, 161. Many NLP-based optimization packages designed for on-line applications have data reconciliation capabilities. They mostly use rigorous models, making the gross error detection more challenging. For that reason. only few have some sort of gross error detection. R o ~ e o ' " , a new product of Simulation Sciences Inc. designed for closed loop on-line optimization has data reconciliation and gross error detection capability [22].

PARAMETER ESTIMATION A N D DATA RECONCILIATION A problem associated with data reconciliation is the estimation of various model parameters. Data reconciliation and gross error detection algorithms make use of plant models, which might have totally unknown parameters, or parameters that are changing during the plant operation. Most of these paranezers-such as heat transfcr coefficients, fouling factors, dis:illation column tray efficicncies, compressor efficiencies, etc.are fixed va!ues for the process optiinizziion; therefore, a high accurzcy in thcir es~imatedvalues is required. Cine zpproach to parameter es~imationproblen~is to solve it sirnullatzeously with the reconciliation problem. Thc model parameters can be treated as regular ucmeasured variables, or as tuning psrarneters that are adjusted in NLP-type algorithms to match the pla:?; measuremects. The major proble~nwith this irpproac'n is that in the presence of gross err015, the pa-aneters may be adjusted to wrong values or some measurements can wrongiy be declared in gross error because of errors i r ~model parameters. To obtain an accurate solution for both measured variables and which is time model parameters, ail iterative process is usually I-equ~red, consuming. especially with rigorous 111ddeib. An alternative approach is to separute and secjuc~r~tinll)~ srrlvc. the two problems. First, data reconciliation and gross error detection is performed using only overall material and energy balances. The model parameters are then estimated using the reconciled values. This is similar to projecting out the unmeasured model parameters from the reconciliation problem along with their associated model equations. The parameter estimates obtained using the sequential approach are identical to those


Dora Kfcotrriliizlior~ and G r o s Error 1)errction

obtained using the simultaneous approach if there are no a priori estimates of the paranzeters available. Moreover, the parameter estimates obtained using the sequential approach may not a1:vays satisfy bounds on parameter values. An iterative procedure may be used to eliminate such problems. This approach was applied to parameter estimation problems in connection with advanced process control applications. The computational time is a serious constraint for such applications, and usually only one iteration is applied 121.

PLANT-WIDE MATERIAL A N D UTILITIES RECONCILIATION Plant-wide reconciliation is a very important tool for material and utilities accounting, yield accounting, or monitoring of energy consumed by the process. Many refineries are already saving a significant amoxnt of money by using a production accounting and reconciliation system. The usual term for a plant-wide reconciliation system is production nccountins;therefore, we will adopt this term for the description in this section. Keld arcol~nringis another frequently used tenn 123, 241. A production accounting system interacts with various groups iil the plant. The operations department provides the input infarmation and collects the reconciled results. The ins?nmenta:ion group obtains ins:^-umenc status ar,d perfmlls instrumenr recalibration and correction if mcessary. Process engineers, accou~iinganc! financial personnel, and planning and scheduling management retrieve periodic reports for their v a r i o ~ sneeds. Daily, week!y, or msnthly reports are standard requirements for a production accounting system. ?arious types of data are required for piant reconci!ia:i~n as indicated in Table l i -1. For beacr data quality and timely infomation, a producriori accounting system is usually integrated with the plant historian and the entire process information/management system. Some data is retrieved automaticaily from a historian, but other data is entered manually. Human el-rol-1s a factor affecting the data accuracy (and the reconciled results). For that season, some sort of data and model validation is very important. Veverka and Madron 1251 describe an empirical procedure for detecting topology errors, such as a missing stream or a wrong stream onenlation. They used the balance residual (or deficit), defined as: r = inputs - outputs - accumulation

(1 1 - 1 )

Table 11-1 Data Types for a Production Accounting System Data Type

Plant Topology Process Data Tanks Inventories Movements (from unit to unitltank) and Transfers (from tank to tank) Blends

Receipts Shipments Meters/Sensors Accuracy Factors Idat:oratorq. Additioi~aiDxta Entry


Process units, tanks, and their conn~ctingstreams PI-ocessdata (e.g.. comperisated flows) from the process and utility units Inventory from each tank Movement data-stdstop tirne, source node, destination node, quantitjr transferred

Blend data-stardstop time, source node, destination node, tank volumes and S!ending quantities Receipt data-startlstop time. source node, destination node, quantity receivei! Receipt data-startlstop time. source node, destination node, quantities measured Instrument acct~racyfacton (tolerance, reliability, etc.) for each of ihe measurement deilices Lab t a t results (density. %H2U. cctnpositioi!~, ctc.) Unspecified or adjusted data to hz nsc:i by the modzl px-ior to its balarxs calculation. Includes missing valuzs. specified valueb, adjustments, etc.

Thz balance deficit for each node is compared with a critical vtilue, say r,,,. If for a paiticula- node r > rCTi,, then the balance around that node is declared inconsistent. The major proble~nis how to set the rCritvalue f9r each node. A good portion of the node imba!ance can be attributed t o errors in data, and is therefore zdmicsible. The remaining of {hedifference is conside~c'x:d=!izg e n s . The vz!ze r,,, can be obtained f r o x the statistical analysis of the balance residual r, which is a racdoxn variable (sirnilar to the nodal test for gross error detection described in Chapter 7). Serious imbalance problems occur due to frequent (daiiy) changes in some movements [26].The p!ant topology for many refinery processes is rather dynamic. There are many "te~nporaryflows" that one day have a nonzero value, and in another day becomes zero (closed valve) or the flow is redirected to a different tank or unit. Mistakes in the reconciliation topology input can very easily be produced. Kelly 1261 proposed a

inore complex strategy for finding the wrong material constraints followed by detection of gross errors in measured data. His strategy is based on a previous algorithm for deleting different combinations of measurements in order to assess the ~eductionin the objective function, developed by Crowe [27]. Since the deletion process gives rise to a large combinatorial problem, Kelly designed an algorithm to narrow down the number of possible combinations to delete. A good method for gross error detection is crucial for a productiorl accounting system, which is exposed to many sources of errors. In the presence of significant topology and measurement errors, the reconciled result may become meaningless. The gross error detection task, however, is very challenging for this type of problem because it is very difficult to distinguish between the true measurement errors and topology errors. Leaks and losses and existence of unmeasured flows create an even higher level of complexity. A lot of research effort in data reconciliation is done to resolve these issues and provide more accurate production accou~itingalgorithms. Some unmeasured flows are observable and can be estimated based on their relationship with measured flows. But the precision of such estimaricn is often unsatisfactory. due to propagation of errors 1251. An alternative way to get an estimate for unmeasured flovis and other variables i n v o l v ~ din !he p ! a ~ ~reconciiiation t is by using appropriate chemical enzineering cr,lculations. which is best accomplished with a process simulator 14, 251. Process unit data reconciliation performed before plzntwide reconci!iation is another possible approach [28]. A more compliczted problem for plant reconcilia~ionproblems is the estimatior? of material and energy iosscs [26' 291.There are many sources f ~ r mzterial and energy losses in a refinery or chemical plant such a,; flares, fugitive emissions (from volatile organic compounds), leaking valves, fittings, pumps, or heat exchangers, and tank losses (by evapcration or liquid leaks). In addition to the real losses nieiltioned above, there are apparent losses caused by rneasc~mer.:errors, lab density errors, line fills, or timing errors due to unsynchronizetl readings in tank gauges and meters. Many loss models and loss estimation fonnulas arc available. Governmental agencies in the United States such as the American Petroieum Institute (API) and the U.S. Environmental Protection Agency (EPA), provide publications with procedures for predicting fugitive emissions, tank losses, and various leaks. For plant-wide data reconciliation, i t is important to clarify how to use these estimates. There are three major ways of including the loss estimates in the data reconciliation model:

1. Treating the losses as un~neasuredflows, and reestimating them based on the measured data. This approach does not require a "good" estimate of the loss flows, but requires observzbility of all loss flows, which is very unlikely to be obtained in a real plant. 2. Modeling leaks separately as explained in Chapter 7. A GLR test procedure can be used to detect leaks and estimate the order of magnitudes of the leaks (or losses). The method might have practical limitations if too many leaks are included in the model (it becomes a large combinatorial problem; also there might not be enough redundancy to accurately estimate the magnitudes of all leaks and losses). 3. Treating the losses and leaks as "pseudo-measured" flows. The estimated loss or leak value is used as a "measured" value, and a relatively higher standard deviation than that for the real measured flows is given to the loss flow. The values of the flow rates for the loss flows are reconciled together with the other measured flows. Typical software for plant-wide reconciliation and yield (productiort) accounting is [21]: OpenYieid (Simulation Scie~lcesInc.) and SIGMAfine (KBC Advanced Technologies).

CASE STUDIES Reconciliaiion of Refinery Crude Preheat Train Data 11 63

A crude preheat train is an impcfiant subsystem of a refirieiy used to rni~iimizethe external energy ronsitmption reqsired for heaticg crude oil. It consists o i a network of heat exchar?gers which 31.5 used to preheat crude oil before the crude is sent to a furnace for further heating prior to distiliation. The hot streams used for preheating the crude are tile distilS. 1 !-I shows the crude late streams from the downstream C O ~ U I I I ~ ~Figur-e preheat train of a refinery consisting of 2 1 exchangers in which the crude ;.: h w t d by 1I distillate streams. The flow of the crude before the splitter. as well as the two split f l o w qf the crude, are measured. The inlet flows of all distillate streams are measured as are also the inlet and outlet temperatures of both hot and cold stream to every exchanger. Table 11-2 shows a typical set of measured flows and Table 11-3 shows the measured temperatures. Tile motivation for reconciling these measurements arises from the need to optimize the crude split flows every few hours. In Chapter 1. we have already described this application and here we will focus only on the ~econciliationproblem.

lndrrstrial Ap;,dicutioii.sof Dara Rrcorzr:iliarior7 and Grwsv Error DelectiOll Tcchr,olr,sies 337

Dur(~K~~coi~ci!iuriori and Gross Error Drraclion

Table 1 1-3 Measured and Reconciled Temperatures of Crude Preheat Train Crude


Tcble 1 1-2 Measured and Reconciled Flows of Crude Preheat Train



Reconciled Flows (tons/hr) B e i ~ r eGED Afier GED

399.2352 153.1427 246.0925 8.3865 58.4888 267.0283 86.5460 454.8024 54.5657 209.2158 106.7616 106.6069 170.3060

Reconciled Temperature (C] After GED Before GED


Figure 1 1-1. Crude preheat train of a refinery.

Measured Fiow (tonslhr]

Measured Temperature (C]

409.5705 151 8511 257.7!94 8.3885 58.6845 267.7579 8 1SO42 455.0960 54.5277 209.4634 109.9967 106.7458 171.2554 0


(table contit~uedon tieir page)

Table 1 1-4 Constants for Specific Heat Capacity Correlation

Table 1 1-3 (continued] Measuredand Reconciled Temperatures of Crude preheat Train Stream

Measured Temperature (C)


302.300 299.800 263.800 229.492 309.458 253.525 230.667 283.885 258.167 152.550 212.142 190.267 302.300 273.617 233.075 345.925 345.925 345.925

Recor,ciled Temperature (C) After GED Before GED

301.0654 301.0387 266.6501 227.4435 304.7259 253.0694 233.4092 277.3424 242.7938 158.8106 206.1244 195.0234 295.0904 268.48 12 240.6166 345.4640 355 4640 345.4640

301.4735 300.7566 265.2976 228.3426 306.4544 253.1749 232.4505 283.3504 260.7749 198.7786 2 1 1.7749 190.5526 30U.22 11 275 4542 25 1.9620 347.0395 347.0395 317.0395

The problem iri this case is t c reconci!e all t!le f! o u ~ sand ccmperatures so as to sa~isfymaterial arid energy baiances of each process unit ef this subsystem 1x1 addition, ir is required to estimate the overall heat transfer coefficient of ezch exchanger given the area and number of tube and shell passes. It is assuned that ail the streams zre single phase Auit-ls and ihe specific heat capacity of ~ a i strsam h is giver. by

where ai and bi are constants and Ti is the temperature of the strean? in Jclgrees C. T i l ~~ulibcanrsai anci bi ful- rne different streams are given in T~!J:E11-4. The are2 and the iiumber tube passes for each exchanger are given in Table 11-5, while all exchangers have a single shell pass.



b x 100


0.4442 0.458 1 0.4455 0.48 19 0.4263 0.4455 0.4143 0.4263 0.4092 0.4285 0.4143 0.4062

0.101 1 0.1036 0.1011 0.1081 0.0975 0.1011 0.0959 0.0975 0.0962 0.0986 0.0959 0.0957

Table 1 1-5 Heat Exchanger Areas and Number of Tube Passes for a Crude Pr9heat Train Exchanger



Areo (m2)

Tube Passes

lrzdu~trialApplicarionr of Dura Keconciliariorr arrd G r m Err<,r Detertiort TPchJloiogies341



Using a standard deviation of 1% of the measured values for all stream flows and temperatures, the reconciled estimates are obtained assuming that no gross errors are present in the data. In the third colunln of Tables 11-2 and 1 1-3, the reconciled flows and temperaiures, respectively, are shown. (All the results of this case study were obtained using the software package RAGE.) The constraints that are used for each exchanger are the flow balances for the hot and cold streams, as well as the enthalpy balance. For the mixer, flow and enthalpy balances are imposed while, for the splitter, a flow balance and equality of temperatures across the splitter are imposed. No other feasibility constraints or bounds on the variables are imposed. In order to remove gross errors from the data, the GLR test along with serial compensation strategy is applied for multiple gross error detection after linearization of the constraints around the reconciled estimates. The final reconciled estimates after all gross errors are identified and compensated are also shown in the last column of Tables 11-2 and 11-3. Ure focus on some interesting problemsffeatures of the measured and reconciled dara. I i we consider the measured temperatures of streams incident on exchanger E38A (the streams CR2C, CR2D, D S l A , and DSI B), we note :hat the crude stream is getting cooled fro111 217.608 to 216.992 degrees C. whi!e the intended "hot" distillate strearn is getting heated froin 224.333 io 254.908 dcgrees C . A1:hough it may bc possib!e for the roles of ho: and cold streams ti: bz reversed dcpznding on the prevailing flows and temperatures. what is urlacceptable here is :bat heat is being transferred fi-om the !owel temperature ci-udc to the higher ternperature diitilla~cstream which is thermodynamically infeasible. 11 can be verified thai reconciliation before cr after gr-oss errpol-detecfion (GED) does riot correct this probleni 2nd thz estiniates for exchangcr E3YA still \&)late ther-mody~ialnicfea;ibi!ity. If wc use these estimates to obtain an estimate of the overall heat iransf'er coefficient for this exchanger. then we obtain a ne9a:ive value for it which ic absilrd. In order to obtain thermodynamically feasible estimates, several possibilities were examined. One general z p ~ r c n c his $5 inc!i,de feasihi!i?y constraints at the ho; and cold end of each exchpqger of the form

where the subscript i is the inlet and subscript o is the outlet end of the exchanger. This would, however, increase the number of constraints significantly. Moreover, this presupposes knowledge of the cold and the hot streams for each exchanger and does not allow any role reversal. A simpler technique is to include the relation between overall heat transfer coefficient (U) and heat load for every exchanger and impose bounds on the overall heat transfer coefficient. If we impose a nonnegativity restriction on U, then we can ensure that thermodynamic feasibility is maintained regardless of which of the streams plays the role of the hot stream and which plays the role of the cold stream. Using this approach, the reconciliation problem was solved again. (Note that, as explained in Chapter 5, in order to solve this problem a constrained nonlinear optimization program has to be used and the unmeasured heat transfer coefficient parameters cannot be eliminated using a projection matrix.) The reconciled temperature estimates of the four streams incident on exchanger E38A alone before GED and after GED are shown in the second column of Tables 11-6 and 11-7. respectively. For comparison, we also reconcile the problem by deleting each of the four suspect temperature measurements in turn and also after deleting all the four temperature measilrenients which violate thermodynamic feasibility. The reconciled temperatcre esiimrrtes for these four streams before and zfter GEU are also shown in Tables 11--6and I 1-7. Table 1 1-6 Reconciled Ternperstvres 6efore GED Around Exchanger E38A for Different Cases




Rec~ncild Tempcrob~res(C) Bounds on U

T of DSl B T of CR?C T of CR2D At! Four T's unmecsured unmeasured unmeasurd unmeasbrd u n m ~ s u r e d

T of GSlA


DS iA US1 9


235.6707 235.6655 217.2356 217.2356 --

227.4071 248.6476 223.1970 212.1257 --

222.3988 222.7002 232.2426 250.8886 2 19.9628 229.2839 214.8731 214.7073

223.2842 250.1673 ?21 1 179 207,0257

:95.7'.178 23 1.7550 227.4480 210.7080

It~da.\-rricil Applicarions r,fData R~cc~ncriliarion nr~dGross Err-or Defection Techndogies 343

D~iroR,~ronciliarinr~ ortd Gross Ermr Drrc+criorz


Table 1 1-7 ~econciledTemperatures After GED Around Exchanger E38A for Different Cases


Bounds on U


Reconciled Temperatures (C) T of DSlA T of DSl B T of CR2C T of CR2D NI Focr T's unmeasured unmeasured unmeasured unmeasured unmeasured

225.7176 221.1880 225.7 175 255.417 1 216.8586 235.1001 216.8586 217.0614

224.43 15 217.9601 217.0624 220.4571

224.4547 255.2667 236.3756 220.0339

223.9519 255.5525 233.8225 217.1436


215.7303 223.5384 23 1.2648 227.2575

From the reconciled estimates, it can be observed that by imposing nonnegativity bounds on U, it is possible to obtain feasible estimates before and after GED. In fact, the results show that the heat transfer coefficient for exchanger E38A is at its lower bound of zero, which implies that this exchanger is being bypassed completely by one of the streams, resulting in the temperatures of both hot and cold streams being unchanged across this exchanger. T h e only other case when feasible temperature estimates were obtained was after deleting the temperature of stream DSlB and application of GED (refer to column 4 of Table 11-7). Even in this case, the stream ten1peratitl.e~change marginally across exchanger E38B, indicating tha: this exchanger is largely being bypassed. ( T h ~ swas also later confirmed after i n s p e c t i ~ gthe manual valve positions on the crude r exchanger.) 'fhe results clearly demonstrate that bypass line f ~ this imposition of 'oo~indcoristl-aints on the parameters can be used as a gener;lc nlethod to obtain feasible estin~atzs. This case stildy also brought out other issues that needed to be addressed in practice:


optimization, it may be necessary to include a heat loss term in the enthalpy balances. However, enough redundancy does not exist for treating the loss terms as unknowns. One possibility is to assume a specified fraction of the heat load of exchangers to be lost based on past experience or based on recommended loss estimation methods. A s pointed out in Chapter 1, the reconciliation of the crude preheat train data was performed every four hours using averaged measured data for the preceding two hours. Since the heat transfer coefficients of exchangers cannot be expected to change dramatically from one time period to the next, it is possible to use their estimates derived in one time period as "measurements" for the next time period with a larger standard deviation. Due to this extra redundancy, better estimates can be obtained. Moreover, the heat transfer coefficient estimates change smoothly from one time period to the next and will not fluctuate wildly. A trend of the heat transfer coefficients can be used to decide when cleaningl~naintenance procedures have to be initiated for the exchangers. Reconciliation of Ammonia Plant Data [30]

Ammonia is a chemical product with many industrial ap~lications such as refrigerants and fertilizers. Figurz 11 -2 shows a simpiified process flowsheet diazram for the sy11:hesis section of an ammonia prGcess [30]. Ammonia is produced by ar, exoti~ermicreaction of nitrogen and hydrogen:


It was assumed that ali the streams ar-ein a single phase. A more ri,oorous method would r.x-..:.-.LyU,,+ *I.,. ..+-+- "I -C +LU.L .,,,eaiz -+!G be determined and an appropriate correlation tn he used for determining the stream enthalpy. Some fraction of the crude flows was being bypassed in a few other exchangers also, but sufficient measurements (redundancy) were not available for treating the bypass fractions as unknown parameters and estimating them as pa11 of the reconciliation problem. Heat losses fro111 exchangers were not accounted for in the enthalpy balances of e s c h a n ~ e r s In . order to use the reconciled data for better UIC


The feed stream S l t3 the synthesis section already contain.: ammonia from ilpstrezrn processes. To separate it, stream S! is cooled 2nd sent to flash drum F1, where the ammonia-rich liquid SS is separated from the remaining vapor S2. Before entering the reactor section, the vapor stream is preheated by a product stream. The reactor section consis:> of t:vo reaction stages and two internal heat exchangers. Stream S4 is split into three streams (SS, S6, and S7) and the split fractions are used to control the reactor feed temperatures. Stream S7 is used to quench the hot product stream from the first-stage reactor and stream S5 is used to recover some of the heat from the product of the second stage reactor (S13). The three streams are then recombined and fed to the first-stage reactor. Most of the cooled reactor prod-

irrd~si,-i~~l,4,7,7iir:ar;r)t1r ofno?u Kecorrci1;at;on und Gro.5.s Error Derecrior: Tec/utologies 345

uct (S1.5) is rccycled to another section of the plant (stream S 16), while the remainder (S17) is further cooled with refrigerant (stream S22) to condense most of the ammonia (stream S20). The two condensed streams (S3 and S20) are combined and further purified downstream.

Table 1 1-8 Stream and Unit Reconciliation Solution for Ammonia Example Case A. No Gross Errors Present in Measurements STRM VBL







U M F U ti


M M I-



U U ti


TO 1



F10 TI0

3075 000 500


F! 1 TI 1

2880.000 5.00



256 367 -22 46 150.001) 32.7478 11.8345 7.1737 2.3594 45.8846


u u LT 2




Figure 1 1-2. An crmmonia synthasis industrial process.


u I! I:


The atnrnonid synthesis plant contains instrumentation for measuring flow rates, temperatures and \lariol~sstrezrn compositions (mole fractions). The measured values, their associated standard deviations and the reconciled values are reported in Tables 11-8 and 1 1-9. Tables 1 1-8, I 1-9, and 11-10 show the reconciliatiort results for a case where no gross errors were found (Case A). Table 11-8 reports all stream calculation results, for both measured and unmeasured data. Other calculated values such as rcaction extents. heat exchanger duties, U A values, and flash data, art: also reported at the bottom of Table 11-8.

R.47-E TEMP PRES X1 X2 X -3 X4 X5


M F U ii


il ti (tobie corzritzued on next pnge)


Table 1 1-8 (continued) Stream and Unit Reconciliation Solution for Ammonia Example Case A. No Gross Errors Present in Measurements STRM VBL









hl3/HR C ATM MOL4 .MGL% XtO1*5c MOL8 MOL%

RATE l'Eh4P PRES xI x2 X3 X4



M O i Yc









Table 1 1-8 {cont~nued) Stream and Unit Reconciliation Solution for Ammonia Example Case A. No Gross Errors Present in Measurements




of llara K~.cor~crliutiun arld Gross EI-ror-Dctr~ctlor~ T~c-i:r-i:noio,~i~~ 347






X4 X5




48 000

I 84

1600 000


1 00

1 50

-15 00

1665 585 127 69 150 000 49 5273 18 3357 14 2649 46916 13 I805


S14 C









1665 585


15 24 150 O(K) 49 5277 18 3757 14 2649 46910 l i 1805

S 19








hf 1

h4 21 M

A I-M MOL% MOL% MOL% MOL% 0 '-6


45 000



H I 19

1 0000 1 COO8

1 14 32 1 4C 78 19

57 0000 21 0OOr) 15 0000 5 0000 :!OOO(1

NI 19

C1 13 1 9GQG 4R-I9 I 0090 Xt33 19 1 0000

1472177 -8 15 i 50 OtXI 56 0256 20 74:'; 16 1 557 5 1069 1 7915


S 16




rJ U hl U U U U 1

M3ff3R C ATM MOL% MOL% MO1Q Mold% MOL%



1 50

i50 000

1 02E+Oi -2 1 82 149 998 58 9875 21 3170 129215 4 2499 2 5238

able cor1rrr:lred or1 rzc,xt p u ~ e i 6

Irrdt~str-iuLA~~~~licrrriur~.s of Ilata ReconciliaZion and Error D~recriorlTechrrolo~ies349

Data Recoriciliatiorz and Grocs Error- Derrcrion


Table 1 1-8 {continued) Stream and Unit Reconciliation Solution for Ammonia Example Case A. No Gross Errors Present in Measurements

Table 1 1-8 (continued) Stream and Unit Reconciliation Solution for Ammonia Example Case A. No Gross Errors Present in Measurements STRM VBL



























F04 TO4

3075.000 2.00



100.000 -21.82 150.000 .0000 ,0000 .0000 ,0000 200.0000


S2 1


H2-4 N2-4 C1-4 AK-4 NFI3-4

1.0(100 1.0000 1.0000 1.0000 1.0000

.30 .31 1.31 .36 .09 .28 .76

1.03E+05 1.02E+05 135.00 134.86 150.000 58.0000 58.9875 21.0000 21.3170 13.0000 12.9218 3.0000 4.2499 2.0000 2.5238


S22 L P










5.50E+04 5.6OEi4~! 134.86 150.000 55.9575 21.3!70 i2.921S 4.2499 2.5238





1.1 1

l.50E+04 1.51E+i)l 134.86 150.000 58.9875 21.3170 12.921s 4.2499 2.5238






Irulusrriul Ap~~licufi~~rrs qfDnra Keconcilinrior~and Cross Error Detccrrorr T~~cl~,lologres 35 1

Durn Recor

T a b l e 1 1-8 (continued) S t r e a m and U n i t Reconciliation S o l u t i o n for A m m o n i a E x a m p l e C a s e A. No G r o s s E r r o r s P r e s e n t in M e a s u r e m e n t s

T a b l e 1 1-8 (continued) S t r e a m and U n i t Reconciliation S o l u t i o n for A m m o n i a E x a m p l e Case A. No G r o s s E r r o r s P r e s e n t in M e a s u r e m e n t s STRM VBL










S16 S 17


.98191 .O 1809

S5 S6 S7


.55063 .I4869 -30068











M3/I3R C ATP.4 >fOL':c MOLQ KOL %





5 U








1 22E+03-




91 67165 03000




ii ii


I; U








u u





-21 g2 150 oon Go003





I lit19







-8 15 sry)

!5000n !50000


11ld1rsr1-id A[q~licafions of Durn Kecoi~ciliuiro~~ rnflrlGross Error 13rierrio11T e c l ~ n o l o ~ i355 e.~

Darn Kecorrciha~ionand Gross Error Derec~ion

a1 test (GT), the measurement test (MT), and the principal component measurement test (PCMT), properly indicated that there are no gross errors in measurements. In Case B, three gross errors were simulated in the following measurements:

Table 11-10 Summary of Calculation Results for the Ammonia Example Case A. No Gross Errors Present in Measurements


= = = =

5 40 (1 NON-REDUNDANT) 59 (0 UNOBSERVABLE) 4 1 (29 FIXED BY USER) = 82 = 20

F07 (magnitude = 10000 m3/hr, ratio 610 = 10); C1-4 (magnitude = 5 rnol%, ratio 610 = 5); T22 (magnitude = 3.6 Deg. C, ratio 610 = 3).






Tr?ble I i-9 coniains the reconciliation results far all measurcd vzriables. Table 11-16 indicates general run data jncmber of iterations w t i i convergence, number of equations, and :lumber of variables for each caiegcry-mezsured, unmeasured, fixed, the number of nonredunciant and unobservable variables, the degrees of redundancy and a suinlnary of w e r e
They have different detectability factors, as indicated in Table 11-9. To increase the chance of detection and correct identification, a higher ratio of the gross error magnitude 6 to the corresponding standard deviation o was used for the measurements with lower detectability factor. The i o b i r the detectability, the higher the ratio 610. Table 11-11 shows the reconciliation results for Case B for all measured variables. No error elimination was used in this run. Table l l - 12 indicates various run datr? and the summary results from the GT and MT. The G T indicates the existence of Fross errors, while the M T declares 7 measurements in gross error. In addition to the three true gross errors, foils other gros: errors were found by the MT. Serial eliminatior? was used for better en-or iderttificarion. Table 11-13 shows the summarized results from a first 1z11 with serial elimination calculations. 111this rdn. the MT was used in zddition to GT. We notice that in the first elimination step the]-e were two measured variables sharing the s a n e MT statistic (the larges! MT statistic value). This pziticulzr algorithm used the detectability F&cior as a tie brezker in the elimination pracess. Since FO7 has a higher deteciabiliry factor (0.5 135) than FC6 (0.2297), F07 wzs chosen tc be eliminated first. This turned out to be the right choice. S~lbsequel?tly.F06 was not foanld in gross en-or anymore. Next. CI-4 and T22 were also eliminated and no more gross r v n r s were found by both GT and iviT. The estimated values for the three eliminated measurements are very close to the values reported in Table 11-9 for the no gross error case. Table 11-14 shows a similar run, this Time using the PCMT for gross error identification. Initially. both GT and PCMT indicate existence of gross errors. The elimination path and final results. however, are some-

' & a 356

Itzduir~-~uI Appliculion.~of l>a/u Rc,conci!iutron awd Gross Error L)efrcri(~..: T~clzlzulogies 357

Duru Rcc(iri
what different from the run with the MT. In the first elimination pass, F05 was found to be the major contributor to the largest inflated principal component. It is true that F05, F06, and F07 are model-related to each other (they are all outlet streams of the splitter SP2) and the smearing effect of the gross error in F07 is easier. In this case, the calculation of the contribution shares to the first lxgest principal component that failed the PCMT, indicates that F05 should be first eliminated. Subsequently, an associated temperature, T13. also ha? to be eliminated and properly adjusted in order to satisfy the heat balance for exchanger E3. The last two eliminated measurements, T22 and C1-4, are true gross errors. The MT and PCMT tests usually detect correctly gross errors in measured variables with relatively higher detectability factor. The outcome of the two types of tests for gross errors in measured variables with relatively lower detectability factors could be the same for both tests (no gross error detection or wrong gross error identification) or one test can perfom better than another. it is not clear which test performs conslstently better. More analj~sisand comparisoil of the two type of tests for the ammonia example can be found in Jordache and Tilton [31].




Darn Reconciliutio~~ orld GI-ossEt-ror L ) C I E ( . I ~ U I I

Table 11-12 Summary of Calculation Results for the Ammonic Example Case B. Gross Errors Present in Three Measurements Measurement Test Used for GED


















23 125 7732















T 1-3











303 '30 C

255 m-m






Industrial Applicarions ofI~u10Reconcilialion and Grosr Error I ) E ~ ~ ~rEch,zolog~Es I;(,~ 361


Table 1 1 1 3 (continued) Summary of Calculation Results for the Ammonia Example Case 6. Gross Errors Present in Three Measurements Serial Elimination of Gross Errors Applied Measurement Test Used for GED

Table 11-13 Summary of Calculation Results for the Ammonia Example Case 6. Gross Errors Present in Three Measurements Serial Elimination of Gross Errors Applied Measurement Test Used for GED

















S4 X3 S22 TEMP




= 41 (29 FIXED BY [JSER) = 82 = 19









= 4i (29 FIXEC BY USER) = 82 = 18























32202.97 16


S4 X3









fndustr;ul A,vpliccitinrr$ of Dafa Recorlciliatiorr and Gross El-rorDerecrior~T e c h / o g j u s

Table 1 1- 13 (continued) Summary of Calculation Results for the Ammonia Example Case B. Gross Errors Present in Three Measurements Serial Elimination of Gross Errors Applied Measurement Test Used for GED

Table 11-14 Sr~mmary of Calculation Results for the Ammonia Example Case B. Gross Errors Present in Three Measurements Serial Elimination of Gross Errors Applied Principal Component Measurement Test Used for GED



























= = = = =
















S5 RATE 28

S8 TEMP -20

S 16 RATE 17

S13 RATE i4

Slr) KATE 12

S4 RATE 12


Sf2 TEMP -8


s11 TEMP 7

S10 TEMP 7

Sll RATE 6

3l.r XS

S 14 X1 7


I 2





1030 @600

!9a99 9992

22203 3128


I 0009

18 0000

!2 8741


1 2000

-11 6000

-29 1212

S14 TEMP 5 4

3 731

S22 TEMP !00


3 049

S4 X3






Sf4 X4 74

S 19 X2


S13 TEMP 4



S19 X! 5




--""" MEASUREMENT F05 WILL BE DELETED IN THE NEXT PASS ' "' (tciblr ~o~i~irrrrcd o n tletr j~aqc!

Dnro Rer on( ~lratronand Gloss Error Deii,ctron



Table 11 14 (continued) Summary of Calculation Results for the Ammonia Example Case B. Gross Errors Present in Three Measurements Serial Elimination of Gross Errors Applied Principal Component Measurement Test Used for GED

Table 11-14 {continuedj Summary of Calculation Results for the Ammonia Example Case B. Gross Errors Present in Three Measurements Serial Elimination of Gross Errors Applied principal Component Measurement Test Used for GED



= 39 (1 N3N-REDUNDANT) = 60 (0 UNOBSERVABLE) = 41 (29 FIXED BY USER) = 82 = 19










* --






6368 1.9602




S13 TEMP 38

58 TEMP 30




S12 TEMP 5








S22 TEMP 100











38 (1 SON-REDUNDANT) 61 (0 klhOBSERVABLE) 41 (?9 FIXLD BY USER) 82 18

litdustrial A/~~~licutiuns of Dara Rrconci1iu:;orz nizd GI-us5El-ror L)e:ecrio~~ T r c h n ~ l o ~ i e357 s



Table 11 14 (continued) Summary of Calculation Results for the Ammonia Example. Case 6. Gross Errors Present in Three Measurements. Serial Elimination of Gross Errors Applied. Principal Component Measurement Test Used for GED.

Table 11 14 [continued) Summary of Calculation Results for the Ammonia Example Case 6. Gross Errors Present in Three Measurements Serial Elimination of Gross Errors Applied Principal Component Measurement Test Used for GED --








65494.10 18




48 1.3397

























S22 TEMP 100








1650 0000

54999 9996

65494 1276




-liO GO00

4s 1 3396



1 2000

33 6000

-3-9 3 232









= 37 (1 NON-REDUNDANT) 62 (0 UNOBSERVABLE) 41 (29 FIXED BY USER) 82 17

= = = =









Table 1 1 14 (cont~nvedj Summary of Calculation Results for the Ammonia Example Case B. Gross Errors Present in Three Measurements Seriai Eiimination ofGross Errors Applied Principal Component Measurement Test Used for GED

lndirsrriai Applicurions of L)ata Reronciliation and Gross Error Def. riior~Techno!nRie.y


Steady-state process unit data reconciliation and gross error detection technology is widely used in chemical, petrochemical and other related industrial processes. On-line data reconciliation is important for enhancing the accuracy in process optimization and advanced process control. Steady-state detection is necessary in order to increase the accuracy in the reconciled values and to provide meaningful gross error detection. If the process is not operated at steady-state for a longer period of time, dynamic data reconciliation should be applied. Proper component and thermodynamic characterization and accurate compositions are very important for a successful data reconciliation and gross error detection. Rigorous rnodel enables merging the data reconciliation and parzmeter estimation into one problem which can be solved simultaneously. Plant-wide material and utilities reconciliation is an important tool for production (or yie!d) accounting. 'The most challenging prcblem in produciion acccunting is the estimatior? of various leaks and losses. With enough redundancy, data reconciliation can provide reasonsble estimates :or the magni:udes of materials that are not accounted for. Irnposing bounds on vaiiables is often necessary to eilsure a feasible solutior, for the daia reconciliation problem. 7'0 so!ve a botinded problem, an NLP-based software is n~eded. The existen: gross error detection meThods do not accurately detect gross errors ail the time. Some methods are better than 0thers, but their overal! performancz depends upon the model accuracy and the level of data redundancy.




3 048






S4 x3

S19 X3

S4 X1


S19 X1 11

S14 X3


S 14 X4 -3

58 S4 X5

S14 X5 5

8 *




= 41 (29 FIXED BY USER) = li2 = 16


1 2

3 4












TI 3

5 OC00

450 0000

181 4463



1 2000

-37 6000

-29 1232



1 0000

18 0000



12 8642 --





12. Christiansen, L. J., N. Bruniche-Olsen, J. M. Carstensen, and M. Schroeder. "Performance Evaluation of Catalytic Processes." Co17zputersClzern. Engng. 21 (Suppl., 1997): S 1179-S1184.

REFERENCES 1. Ravikumar, V. R., S. Narasimhan, S. R. Singh, and M. 0. Garg. "RAGEThe State of the Art Package for Plant Data Reconciliation and Gross Error Detection," presented at the International Symposium on Automation and Control Systems, New Dehli, India, 1992.



2. Chi, Y. S., T. A. Clinkscales, K. A. Fenech, A. V. Gokhale, C. Jordache, and V. L. Rice. "On-line, Closed Loop Control and Optimization of an FCCU Using a Self Adapting Dynamic Model," presented at the AIChE Spring National Meeting, Houston, Tex., 1993.

5. Leung, G.. and K. H. Pang. "A Data Reconciliation Strategy: From On-Line Implementation to Off-Line Applications," prese~ltedat the AIChE Spring National Meeting. Orlando, Fla., 1993.

16. Ravikumar, V., S. Narasimhan, M. 0. Garg, and S. R. Singh. "RAGE-A Software Tool for Data Reconciliation and Gross Error Detection," in Foul*darions of Computer-Aided Process Operations (edited by D.W.T. Rippin, J. C. Hale, and J. F. Davis). Amsterdam: CACHE/Elsevier, 1994, 329L.436.


17. Stephenson, G. R., and C. F. Shewchuck. "Reconciliation of Process Data with Process Simulation." AICIzE Jo~cl-1lal32 (1 986): 247-254.


18. Meyer, M.. B. Koehrct, and h4. Enjalbert. "Data Reconciliation on Multicomponent Network Frocess." Conipzito..~Clzenz. E!zgng. 17 (no. 8. 1993): 807-8 17.

6. Scott, M. D., J. M. Tiessen. and S. L. Muilick. "Reactor Integ~atedRigorous on-line rnodel (ROMTM) for a Muiti-Unit Hydrctrearer-Catalytic Refonner Complex Optimization," prescnted at the NPRA Computer Conference, Anaheim, Calif., 1994. 7. Chiari. M., G. Bussari, M. G. Grotto!;, and S. Pierucci. "On-line Data Reconciiiatioil and Optitnization: Refinzry P,pplizat~ons." Co:nputer.s C!l?n~. Elzgng. 21 ([email protected],1997): S 1!85-S 1 190.

8. Nair, P., and C. Jordache. "Rigorctus Data Reconciiir?tioi~is Key to Opti~nal Operations." Contro! (Oct. 1991): 118-123. 9. Tan~ura,K. I, T. Sumioshi. G. D. Fisher, and C. E. Fontenot. "Octimization of Ethylene Plant Operatioils Using Rigorous Models," presented at the AIChE Spring National Meeting, Houston. Tex.. 1991.

10. Sanchez, M. A., A. Bandoni. and J. Romagnoli. "PLAPAT-A Package fur Process Variable Classification and Plant Data Reconciliation." Computers Chern. Erzgng. (Suppl. 1992): S499-S506. 11. Natori, Y., M. Ogawa. and V. S. Vemeuil. "Application of Data Rcconciliation and Sirnulatior1 to a Large Chemical Plant." Proceedings of Large Chemical Plants 8th International Symposium, Antwerp, Belgium, 1992, pp. 101-1 13.

14. Dempf, D., and T. List. "On-line Data Reconciliation in Chemical Plant." Cornj~utersChenz. Engng. 22 (Suppl., 1998): S 1023-S 1025. 15. Placido, I., and L. V. Loureiro. "Industrial Application of Data Reconciliation." Conllputers Chem. Engilg. 22 (Suppl., 1998): S 1035-S 1038.

3. Bagajewicz, M., and S. L. Mullick. "Reconciliation of Plant Data. Applications and Future Trends," presented at the AIChE Spring National Meeting, Hcuston. Tex.. 1995. 4. Charpentier, V., L. J. Chang, G. M. Schwenzer, and M. C. Bardin. "An OnLine Data Reconciliation System for Crude and Vacuum Units," presented at thc NPRA Computer Conference, Houston, Tex., 1991.

13. Holly, W.. R. Cook, and C. M. Crowe. "Reconciliation of Mass Flow Rate Measurements in a Chemical Extraction Plant." The Canadian J1. of Chenz. Erlgr~g.67 (1989): 595401.

13. Nar:isirnt;an. S., R.S.H. Iviah, atid A. C.Tamhaiie. "A Composite Statistical l'cst foi Detecting Changes in Steady State." AICIIE Joii:-~zal32 (1386): i4i19-:-118.


20. Narasimhan. S., C. S. Kao, aitd R.S.H. Mah. "Detecting Changes in Stzady Stzte Using the Mathematical Theory of Evidence." AIChE J o ~ c r ~ l a l 33(1990): 1930-1 931.

ii j l!



21. CEP Software Cirectory, a Sup;~!ement to C;le~ti.C:,qng. P~.o,oress.p ~ i b lished by thz American I ~ s t i t u ~ofe Chcrnical Er;gineers. 1998.

22. Tong. Fi., aiid D. Bluck. "An Indurtrir?I kpplicaticn of Principa! Ccinponer:t lest ro Fault Cetec~ionand Ideniifisatiott." presented at tile IFAC Confcrence, 1998.


23. Reagan E.. B. Tilton, and S. Sa!nmcx!.. “Yield A ~ ~ ~ i i i i tai ndg Data intezraiion." presented at the NPKA Compi1ti.r Conference, Atlanta. Ga., 1996. 24. Grosdidier. P. "Understand Operation Information Systems." H~drorar1,on Pmcessii~y(Sept.1998): 67-78, 25. Veverka. V. V., and F. hladron. Marc.)-in1 cr~iclG~arg?Ralancirig ill PI-ocess 1rzdu.srries: Fronz Micr-oscopic. Ra1~11ce.s to L A I I X ~ Plants. Amsterdam: Elsevier. 1997.

By suitably adding vectors which have the same number of elements, O ln(71-evectors other vectors can be formed. A linear coinhii~afionof ~ L L ' oris a vector which is formed by niultiplying each vector by a real number (scalar) and adding them. For example, consider a linear combination of the above vectors a and b represented by the vector d = a l a + a 2 b . If we choose the scalars a, = 1.5, and a2= 2.0, then the vector d is given by

ber of elements in any vector in a basis set (components of a vector) need not be equal to the dimension of the vector space spanned by the basis. This is illustrated clearly by the vector space spanned by the basis set a and b, where each of these vectors has 3 components but the dimension of the vector space spanned by them is only 2. Note that we often speak of a vector having n elements as an n-dimensional vector. This only implies that the vector having n elements is a member of the n-dirnensional space of vectors. We use the notation 157," for the n-dimensional real vector space.

Given a set of 11 vectors, S = { a , , a?; . ., a n } ,we can generate all possible linear combinations of vectors in this set, a l a l + a2a, + . . . a n a n ,by choosing all possible values for the scalars a i . We refer to the col!rrt;qn of vectors thus generated as a vector- space sl~arz~zed by the vectors in set S and denote it as V(S). It should be noted that the zel-o vector is a laernbe1- of this space. A set of vectors S is said to be linear-ly independent, if a linear combinstion of the \iectors in this set equals 0 only f ~ the r case when all the scalars a, are equal to 0 and not for any other choice of the scalar values. if a set of vectors is lineal-ly dependc~it,then t!?ei-c is a vector in thi:; set hose scalar multiplyin2 factor (a,is i l o n ~ ~ r \vhich o) can Sc expressed a.; a li!lear con:birlation of the other vec!ors in this st-!. We can delere this vector and again check if the I-enlaining vccto:-s are lineariy independent If' nct, we can repeat this procedure until we are left with a sct of veciors that arc lineal-!y ir~dependent. The sr,t of vectors \vhich remain f ~ r ma ~:zir~irnnl set ~flineal-11-ii7d~pe17ilclzt ; which span the vectar space V ( S ) .This mijtimal set of x c t o r s is said to for111a hcrsis sct for V(C). For example, if we consider the set S consisting of vectors a. b. anti d defined above, then this forms a linearly depertcicnt set because an! vector in this set can be expressed as a lineas combination of the other two vectors in this set. We can choose to delete \,ector d tiom this set. in which case we are left with the two vectors a afid r j which can be vei-tfied to be linearly independen!. Thils. the vectors a and b for-m a basis set for the vector space spanned the three vectors a. b, and d. Another basis set for the same vector space is a and d. There can be many different choice of a basis set for a vector space. but the number of vectors in every basis set is the same, and is denoted as the di171~17~iolr of the \.ector space. It must be borne in mind that the num-

MATRICES A N D THEIR PROPERTIES A real matrix of order nz x n is an ordered set of eltmentc consisting of rows of i z elements each. Each row of a matrix can be regarded as an n-dimensional row vector and each column of the matrix can be regarded as an trz-dimensional column vector. Thus, an nz x- rr matrix can either be ccnsidered as an ordered set of ~nrow vectors each of dimension n. or as a set of 11 column vectors each of dimension 111. Two specla1 r ~ a t r i c e sare the zero matrix dent~tedby 0. whose elements are 0, and the identity matrix of orcier I? s 17,denoted by the sylnhol 1, whose column i is the unit vector ei. it shou!d be noted tha: we do not explicitl!. denote the dimensions of the matrix in the notation because it is usua!ly c!ear from the context. r vectcr spaces associated with evei-y matrix as There 2re f ~ u iinpostant definzd below: 171

(i) Row Space: the space spanned by the roklJs. (2) Column Space: the space spanned by the co!unins. This space is 2 1 kno~v11as the I-unge sj~oceof a mat:-ix. (3) Null Space: the space spanned by all veclors x which satisfy A x = G. where x is a vector belonging to 9i". (4) Left Yiuli Space: the null space of the transpose of a matrix.

There are some important properties that link these vector spaces. The l-a~zkof a matrix is equal to the dimension of its row space, u~hichis also eqiial to the dimension of its column space. This immediately implies that the rank of a matrix r I ri~in(nz,11). A matrix of order 11 x iz is know11 as a square matrix of order n. If the rank of such a matrix is n, then the matrix is known as a nonsingular matrix and its inverse exists.




Appendix A-Basic

liurir Kci ori<-ilinrionar~dGross Et-ror Deteciion

The nonzero vectors xiof size n that satisfy the equation

T h e following equality can also be proved:

where N(A) is the null space of matrix A. From the above equation, it follows that if the rank of a matrix is equal to n (which implies that the columns of a matrix are linearly independent), then the dimension of the null space of the matrix is 0. The only vector which satisfies Ax = 0 in thls case is the 0 vector. In general, we are interested in obtaining a vector x which is the solution of the linear set of equations Ax = b. In other words, we wish to express the vector b as a linear combination of the colurnns of A.This is possible only if b is a member of the column space of A. Furthermore, the solution is unique if the null space of A has d i w - n ~ i o n 0. This property is used in obtaining the solution of the unmeasured variables in data reconciliation discussed in Chapter 3. In general, we can express the solution vector x as

x = X r + X,,

(A - 2)

where x, belongs to the column space of .4 and x, belongs to the null space of A. This is known as the ralrgr and rzull s p c e dc~co)npo.~itio17 (RND). which is used in the RND-SQP nonlinear constrained opti~nizatic11algorithm discussed i i l Chi!pter 5. 'The el;.erzlv!urr of a square matrix A of clrder n are the 12 roots of its characteristic equation:

={hl,jL2, ) . . .. A,,}. !f we The set of these roct\ is denoted by h < ~ define the t1at.r of matrix A by

then tr(A) = h, + h2+ . . . + &,

Coricrpfsin Linear Algebra

are referred to as eigenvectors. The eigenvalues and eigenvectors of certain matrices are used to build the principal component tests in Chapter 7.

REFERENCES 1. Noble, B., and J. Daniel. Applied Linear Algebra. Englewood Cliffs, N.J.:

Prentice-Hall, 1977. 2. Strang, G. Linear Algebra and its Applications, 3rd ed. Orlando, Fla.: Harcourt Brace Jovanovich, 1988. 3. Golub, G. H., and C. F. Van Loan. Matrix (3omnputations. Baltimor-c-Johns Hopkins University Press, 1996.

Appendix B

Graph Theory




Graph theory deals with problems related to topological properties of figures. It is also useful for analyzing problems concerning discrete objects and their interrelationships. In this appendix, we define some of the important concepis of graph theory used ir? the book and illustrate them usins era~npies.Son?e facts are simp!;' stated withou: proofs and we direct the interested render to the book by Deo [ I j for thest: proofs 2nd add~tionalconcepts and theorems.




Figure B-1 A graph.

GRAPHS, PROCESS GRAPHS, AND SUBGRAPHS A grcrplz consists of a set of /?odes,V, and a set ~f edges, E. Each edge is associatei with a pair of nodes, which it joins. An exampie of a graph is shown in Figure £3-1, which has six nodes draw2 as circles and eight edges shown as lines. Each edge is said to be incident on the nodes with which it is associated. The degree of a node is the number of edges incident on it. Aprocess graph is a graph which is simply obtalned trom a process flowsheet by adding an additional node called the envi~c)nrnent node to which all process feeds and products are connected. For example, the graph in Pisure H- I is the process graph of a simplified ammonia process whose flowsheet is shown in Figure 10-1. The nodes of a process graph cotrespond to process units and the edges of the process graph correspond to streams that interconnect the units.

Thus, if the process contains n units and e streams, then the corresponding process giaph contains n+l nodes and e edges. For ease of reference, we nurnbei or label the edges and nodes of the graph using the same numbers or labels as used in the process flowsheet, except for the environment node which is labelled as E. If the directions of the edges are ignored, as in Figure R-I, then an undirected graph is obtained; otherwise, the graph is directed. In this text, we are only concerned with undirected graphs.



A suligt-aph of a graph consists of a subset of nodzs ar.d edges of the graph. Each edge of the subgraph jsins the s a n e two nodes as it does in the graph. In other words, if an edge is part of a subgraph. then the end nodes with which it is associated in the graph should also be pait of the r z is a subgraph of thz graph shcwn in subgraph. The graph in F l g ~ ~ B-2 Figure B- 1.

Figure 6-2. A svbgraph of the graph in Figure 6-1.

L)UIURerr,ncili:ztion and Gross El-ror 1)efecfion

PATHS, CYCLES, A N D CONNECTIVITY A pat11 between two nodes (denoted as the initial and terminal nodes of the path) is a finite alternating sequence of edges and nodes such that each edge in the sequence is incident on the two nodes preceding and succeeding it, and no node appears more than once in this sequence. A path is called a cycle if the initial and terminal nodes are the same. For example, in Figure B- 1, the alternating sequence of nodes and edges, E- 1-M-2-H-3R, is a path between initial node E and terminal node R, while the sequence E-6-S-5-SP-8-E is a cycle. A graph is connected if there exists a path between every pair of nodes of the graph. The graph in Figure B-1 is a connected graph as is generally the case for all process graphs.

Figure B-3. Subgraph formed by deleting node E from graph in Figure B-1

S P A N N I N G TREES, BRANCHES, A N D CHORDS A connected subgraph of the graph which does not contain any cycles and which includes all nodes of the graph is called a spanning tree of the graph. An edge of the graph that is part of the spanning tree is called a hlmlclz, while edges of the graph not part of the spanning tree are called c1zords. Figure B-2 is a spanning tree of the graph of Figure b-i. Corres ~ o n d i n gto this spailning tree, edges 2, 4, 5 , 7, ar,u 8 are branches while the remaining edges 1, 3, and 6 arc chords I? should be noted that brafiches and chords are defined with respect to the specified s p a ~ n i n gtree of a graph. If a different spafining tree of the graph is chosen then, accordingly. different edges of the graph are c1sssifie6 as brancl~esc r chards. It can be proved tha: n spanning tree contains n branches and e-1: chords where n is the number of upits in the process flowsheet (or one less than the number of nodes o l the process gr-zph). and e is the number of streams or edges of the graph.

Figure 6-4.Graph fcrmed by m~rgingnodes E and M of graph in F i ~ u r e5-1.



GRAPH OPERATIONS A graph can be modified by operations such as deletion of edges or nodes and by merging of nodes. The deletion of an edge from a graph results in a subgraph which contains all nodes and all edges except the deleted edge. For example, the spanning tree shown in F i g ~ r eB-2 can be obtained from the graph in Figure B-l by deleting edges 1, 3, and 6. The deletiorz of a node from a graph results in a subgraph which contains all the nodes of the graph except the deleted node, and contains all edges of the graph except the edges which are incident or1 the deleted node.

The subgraph shown in Figure B-3 can be cbrained from :he graph of F i g ~ r zB-1 by deleting the node E. The t t 7 e t ~ i n gcf nvo izodes cT a graph resul:s in a modified graph obtained by replacing the two merged nodes by a new node and dcieting the edge,; ir~cidcnt(111 both these :lodes. E d p which are incident 011 only one of the two merged nodes in the orig~nal graph arc now incident on the new node of the modified graph. The graph in Figur-e B-4 is obtai~iedCrom erzph in Figur-e B-l by n~ergingnode5 E and M. The new merged node in Figure B-4 is denoted as EM.





A crttret of a graph is a set of edges of a graph whose deletion disconnects the graph, but the deletion of a proper subset of the edges of a cutset does not disconnect the graph. The set of edges 12. 5. 61 is a cutset of the zraph in Figure B-l since the de!etion of this set of edges disconnects the graph into two node sets one cont;:irmin~ ikf. E, and SP, and :he other


Doru Keconciliulion and Gmss Erro~-Dprecriotz

containing H, R, and S. On the other hand, the set of edges [l, 2, 5, 61 is not a cutset although the removal of this set of edges disconnects the graph si~iceits proper set of edges [2,5,6] is a cutset. There is a correspondence between cutsets and flow balances that can be written for a process. A flow balance can be written around every unit of a process, which will involve the flows of streams that enter or exit this unit. It can be verified that the edges corresponding to these streams form a cutset of the process graph. Thus, corresponding to every cutset consisting of all edges incident on a node, a flow balance can be written. Flow balances can also be written corresponding to other cutsets which are essentially linear combinations of flow balances around individual process units. Thus, corresponding to the cutset [2, 5, 61 of the graph in Figure B-1, the flow balance equation involving the flows of streams 2, 5 , and 6 is a linear combination of the flow balances around process units H, R, and S of the process. It should be noted that the direction of the streams should be taken into account when writing the flow balances corresponding to cutsets of the process graph. A cutset of the graph which contains only one branch of a spanning tree of the graph and zero or more chords is called a fu1dui~let7talcutset corresponding to the spanning tree. For exa~nple,edge set [I, 3, 71 is a filndarnenta! cutset of the graph ill Figure EZ-1, corresponding to the spanning tree shown in Figure B-2. However. although the set of edges [2. 5, 61 is aiso a cutse! of the in Figure B-1, it is no: a fundamental cutset with respect to the spanning uee of Figure 3 - 2 because it c o n ~ ~ i two n s branches, 2 and 5 of the spalning tree. With respect to every branch of a spanning tree of a graph, 2 fundamental cutset can be identified. The fiindarnental cu:sets corresponding to t!e spznning tree, Figure B-2, sf rhe graph in Figure E-1, are @, 31, @, 31, [5,3 , 6 ] , [Z, I , 31, and [& 1, 61, where the branch in each fundamenral cutset is indicated by an undersccre. A concept, which is complementary to a fundamental cutset, is that cf a fundamental cycle with rzspect to a spanning tree of a graph. A funriclr?ze:7tul cycle with rcspect tc a s p n x i n g :Tee cf ;&;'-$I is r; c j x k ;f ;:.-: graph farmed by exactly one c h ~ r "2nd 9ne or more brq.r?ches. The cycle E-1-M-7-Sf-8-E, which consists of edges [I, 7, 81, is a fundamental cycle of the graph of Figure B-1 with respect to the spanning tree B-2, which consists of chord 1 and branches 7 and 8. For each chord of a spanning tree of a graph a fu~idameatalcycle can be identified. The fundamental cycles with respect to the spanning tree, Figure B-2, of the 4, 5, 7, 21, where the graph in Figure B-I are [I,7, 81, [G, 5, 81, and [3, chords are indicated by an underscore.

Fundamental cutsets are complementary to fundamental cycles in the sense that if a chord cj occurs in the fundamental cutset of a branch b;, then branch bi occurs in the fundamental cycle of chord cj. This may be verified from the fundamental cycles and funda~nentalcutsets with respect to the spanning tree of Figure B-2 listed in the preceding paragraph. This property can be used to identify the fundamental cycles with respect to a spanning tree given the fundamental cutsets with respect to the same spanning tree. Fundamental cycles (or fundamental cutsets) can be used to generate new spanning trees of a graph starting from a given spanning tree. The technique known as an elementary tree tt-unsfot-tnatioir(ETT) involves the interchange of a chord with a branch. In this technique, we add to the spanning tree a chord and delete a branch belonging to the fundamental cycle with respect to the original spanning tree formed by the chord which has been added. For example, the spanning tree shown i n Figure B-5 is a new spanning tree of the graph in Figure B-1 obtained from the spanning tree in Figure B-2 by adding chord 1, and deleting branch 7, which belongs to the fundamental cycle forrned by chord I. The new spanning tree differs from the initial spanning tree in respect of one cizoid and one branch, and is also referred 1 0 as the neighbor of !he initial spanning tl-ee.

Fioure 6-5. Spannina tree formed by ETT of spannir.9 tree in Figure 8-2

REFERENCE 1. Deo, N.Gruplz 777eor~~ , i t kA/~j)Iicafiorr.~ to E I I ~ ~ ~ I C( 'zC ~ d~(:Ott~j~i~ietY I I ~ Science. E~~glewood Cliffs, N.J.: Prentice-Hall. 1974.

Appendix C

Fundamentals of Probability and Statistics

R A N D O M VARIABLES A N D PROBABILITY DENSITY FUNCTIONS Probability is a mathematical theory dealin: with the laws of rundonr example, the rzsu!t of a phy~icdlor cheinical experiment is a random event. The measti!-ed ol inferred va!m obtained at the end o i experiment is a ru?zdoin variirh!e which lies within a specified interval with a certain prcbability. it is easier to understand the behavior of random variables if we analyze a discrzte event. The rolling of a pair of dice provides a good exam~ l aei a random variable. It is impossible to PI-cdict the outcome of an individ~lalroll; however, i t is nmre likely that the summation ngmher lor the pair of dice i~ a 7 rather [hail a 12. This is because, of the 35 possible rolls, there is oaiy one way :o roll a 12-naniely (6.6). while thcre are six ways to roll a 7-namely (6,l). (5,2).(4.3), (3,4); (2.5). (1,6). Let us assume that we roll the dice thousands ui' i i ~ ~ dlrd ~ c arecord how many times each ro!l occurred. Then. the probability fur,c:ion for each roll R. i.e., P(K), or so-called prohahilip (PDF, or p.d.f.1 is: e1.e77t~.For

P(K) =

Number of tiriles roll K occun-ed Total number of rolls

Figure C-1 shows the graph obtained by plotting P(R) for all possible rolls R. The probability of rolling a given valne R in a single throw of the dice is the area under its rectangle. For example, you have a 4 in 36 chance of rolling a 9. The probability of rolling 3 or 12 in a single throw of the dice is the total area under their rectangles, 2/36+1/36=3/36. The PDF graph in Figure C-1 is discontinuous, because rolling a pair of dice produces only discrete values, i.e., the resulting value must be an integer between 2 and 12 inclusively. Integrals of the PDF are quite useful because they determine the probability of occurrence of a group of events. For example, we can obtain P (59) = P(R= lC)+P(R=l l)+P(R= 12) = 3:36+2/26+ 1/36 = 6/36. Most e~-rorsin plant tnerrsurements are random variables. But unlike the rolls of a pair of dice in the previous example, they are continuous varirtbies, not results of discrete events. This means that there is an infinity of possiblz "discrete" values for the events sssociated with contingous variables. For that reason. for continuous random variables, the integral

-i I


Probability o f rolling 9 is 4/36

5/36 PI-obability of vaiue rolled

2 3




7 8



Value rolled with a pair of dice, R Figure C-1. Probabiiiiy density function for r o l l i a ~ poir ~ of dice.





Daru R~rcoi~cil~o;ii~ri and GI-ossError- ijc.rec1ir.1

probabilities are of special practical interest. For example, we can make statements such as "there is a 95% chsnce that the true flow rate for stream S10 lies between 5,000 and 6,000 BPD." Or, "there is a 2.5% probability that the true flow rate in stream S10 exceeds 6,100 BPD." In order to calculate such probabilities, a continuous probability density function is required. For reasons specified in Chapter 2, the most widely used density function for continuous random variables in physical and chemical sciences is the normal dis~ributiorzfir7ctiorz. The normal distribution is also known as the Gaussiarz distribution, and its PDF IS described by the formula:

The most important properties of the normal distribution PDF are as follows:



1. The maximum value of F(X) occurs at the mean. p. 2. The standard deviation o determines the width (or the skewizess) of the curve. For a very accurate instrument (small o),the density function will look like a sharp peak centered at zero. On the contrary, for an inaccurate instrument (large o),the PDF will look rather flat.

3. The


factor normalizes the density so that

4. It is symmetric about the mean. 5. The probability of a measurement error lying between XI and X2 is:

where p is the mean -value of the random variable X,and o is its standard deviation. Since in practice we expect the errors to be zero on the average, p = 0 for the measurement en-or density filnction. Figure C-2 shows a Gaussian PDF with a mean of zero and a standard deviation of 1. This funct~onis a conti!luous analog of the dice rolling density function. The ~ o r n t a ldisti-ibution with Lero mean and standard deviarion of I is also know11 as .srcrrzilcir-dIZOI-IIK~Idi.~trib~4tto1z.



This probability is equivalent to :he area under the curve between XI and X7. 6. Similarly, the probability that a nieasurement error (it\ ab~olutevalue) is greater than a panicuiar value X*is

Tnis probability is equivalent to the area u:ldcr the curve ou:side the interva! (-X* and X*). 4 s iilcstrated in Figcre C-2, 95% of !iie randorn errors should lie within 1.96 standard deviztions. Anaiytlcaily, this means:

Figure C-2. Normal distribution density function

This is often ca!led the 95% coi~fiderlceirztet-::nl.The 99410 c o n f dence interval occurs within 2.58 standard deviations of the mean. Note that these figures are true when there is only one measured variable. For ~nultiplemeasured variables, the threshold is recalculated based on the rules given in Chapter 7 (see Sidak's rule). Another distribution of interest for the statistical applications in this book is the chi-squaw (x" dislt-ihution. If' Kl, K?, . . . R,, are indepen-

dent random variables described by a standard normal distribution. then a chi-square random variable is defined by:

2.As v approaches infinity, the chi-square density approaches the normal distribution. The x2(8) curve in Figure C-3 is starting to illustrate this behavior.


The integer v is usually known as the tzumher- of degrees of fi-eedot~z. The probability density function for a for different degrees of freedom v is illustrated in Figure C-3. The probability distribution function for the chi-square distribution l i described analytically by the following fomlula:


where r is the gamma function. The most important propmies of the chisquare distribution are:

1. The inean m!ue of %'(\I) is v.



0.18 0.16 0.14 0.12 F(X) 0.1 0.08 0.06 0.04 0.02

P ( X <~ 9.49) = 0.95

W2>9.49) = 0.05 95% o f total area

I iI




Figure C-4. Ccnfidence intervals for chi-sauure distribution.

Confidence intervals can also be constructed far the X' distributioi;. For e?tar;lple. Fignre C-4 silows a 95% i-crifidence region for a 2 ' distn1;ution f ~ n c t i o nwith 4 degrees nf freedom. in this particular cxce, 95%-of the random variab!es siiould lic between 0 and 9.49.


STATlSTICAL PROPERTIES OF RANDOM VARIABLES The statistical properties for random variables were indirectiy mentioned as part of the ar?alytical description of :he probability density tunctions above. Now we are going to define thein i n ;1 more genci-a1 fratncwork. There are two basic properties for the random variables (also known in statistics as ~uot?zer~!s): the mean and the variurlce of the random variable. The mean vaiue of a random variable X.px. is defined as the e.rpected v~ilueof X . For a continuous variable. it can he expressed analytically as:

Figure C-3. Chi-square distribution dens+ function

The expected value defined above can also be defined as the first moment about zero. In general, the first moment about a constant value 6 is given by E[X-6). If 6 is equal to the expected value of X, the central first moment is obtained, and the corresponding distribution is called the centi-ul distribution. Otherwise, a noncenti-a1 distrib~~rior~ is obtained. The mean (expected value) of a random variable Z whose distribution is a joint distribution of other random variables, X I , X2 . . . X, (i.e., a tnultivariate distribution) is defined as





since the variance of a constant is zero. A practical application of the above definitions is the derivation of thc mean vector and the covariance matrix of a vector of randorn variables which is a linear function of other vector of random variables. For exaniple, let us assume a iinear equation in vector form such as:

where Z = f(X1. X2 . . . XI,) and @(XI,X2 . . . X,) is the joiilf pi-obabilitl), density$i17ctioil of the random variables X I , X2. . . X,,. If f(X1,X2. . . XI,,) is a linear function, i.e.,

then the mean

:,slue of %


If f(X,, X2 . . . X,) is a linear function and the errors of the primary random variables X I , . . . X, are mutually independent random variables, (C- 13) reduces to:

Let E(x)=O arid Cov(x) = Q, ~ v h e r zCov(x) means tha covarisnce matrix of vector of laildo111 val-iables x. Then, tile expccted \~alueof y is:

a 1 ~ Ilnear. 1

-l!ic variance of a random variable X. var(Xj, i s defir~eda< the reccnd



~uoinerztabout Lero, 1.e.. These results for linear transformations of vector :)f randon, variables arc used in Inany derivations throu,?hoot this book. 'I'he relationship between the variance and the standard deviation of a random variable is given by: Var(>o = o i

(C- 13)

The variance of the multivariate random variable Z = f(X,. X ? . . . XI,) is defined as

HYPOTHESIS TESTING Hypothesis testins is a very inipol-tant statistical tool for making decisions about random variables. The procedure uses information from a random sa~npleof data to test the truth or falsity of a statement. 7'he basic staternent about a random variable is usually called the rzuil,

denoted by Ho. The opposite hypothesis about same random variable is ca!!ed :he alternative hypothesis, here denoted b y H I . The decision to accept or reject the null hypothesis is based on a statistical test. The test statistic (the value of the statistical test for given data) is first calculated with the data in the random sample. A decision criterion (a threshold of the statistical test) is used to make the decision about the hypothesis Ho. Two kinds of errors may be made at this point. If the null hypothesis is rejected when it is actually true. a Type I error is made. Alternatively, when the null hypothesis is accepted when it is actually false. then a Type II error is made. The probabilities of occurrence of Type I and Type I1 errors are as follows: N

= P(Type I error) =

y = p(Type I1 error) =

eject Ho I I&

= true)

(C - 19)

eject H , I H , = true)

(c 20)

Power = ~ ( ~ c c e N, p t 1 H I = true) = 1 - y

(c-21) tc: tetit



The powel- of the test is often bsed to evaluate a particular statistical test. and ir is defined as:

In ihis book, the hypothesis testing is used

For multiple tests, as in the case of multiple measurements in tile piant, the probability of Type 1error is higher than a. An upper bound P can be designed, as explained in Chapter 7. Let zj be the test statistic for n?easurement j. If lzjl > Z I d 2 , then the null hypothesis Ho is rejected and hypothesis H I , is accepted. This means that z, is outside the +z1-B/2confidence interval for a standard normal distribution. This is similar to value X being outside the interval (-1.96. +1.96) for a = 0.05 in Figure C- 1 . On the other hand, if a global test described in Chapter 7 is used to test the null hypothesis ifo against the global alternative hypothesis H,, thc threshold for the test is x;,, at a chosen level of significance. As in the null hypothesis Figure C-4, if the test statistic is greater than x,f is rejected and a gross error is declared in the measurement set.

the nui! hypothesis:

HG : :here is uo gross error in proceys dera, \rcrsut; thz al:en~ativei\y?oti1r,sia:

HI : there is ar least a gi-oss error in process data. or. mori- specifically: f3,, : there is a gross error in measurement j.

The choice of the test threshold depends on the statistical test that is used for hypothesis testicg. If the statistical test follows a standard nornlal distribution, such as some of the sta~isticaltests in Chapter 7, a thresli~ldZ , , , is used, at a chosen a level of significance. The value f( is used to control the probability of Type I error at value a.

1. Wadsworth, H. M. Hntzdbook of Statisrical Merizodstbr E12ginrer.r and Sc.ierzrisrs. New York: McGraw-Hill, 1990. 2. Hines. W. W., and D. C. Montgomery. PI-uiiiihili~. nrd Srotirricr it; Ell,girierr-ing atlci !Mnrlaget?~enr Scie;zre. New York: J o h n \Vile). & Sons. 1980.

3. CATACON Workbook. Brca, Calif.: Simulation Sciences. Inc.. 1996.

Accwacy of estimation, 301 of measurement, 36, 37 Adjustability, 210-21 1 Ad.jastments. 12-14, 16, 19 Amrnonia plant case study, 343-368 synfnesis process, 219,307-308, 3 14 Aritoine equation, 120 ,%verage error of estimatioi~(AEE), 258 nuicber cf Type I errors (AVTI), 258 Balance compone.nt flow, 88. 91,93-95, 98 deficit, 332 elemental, 95 entlialpy, 115 ovei-a11low, 88,93, IUb, 107 residuals, 17: Ball mills, 107-1 11 Bayes decision rule for identification, 266 formula, 267, 269 Bayesian algorithm, 27 1-273,278 Bayesian test, 23, 264-273, 278 sequential application of. 267 Bernoulli randorn variables, 265, 267

Bias in measurement, 32, 37, 176, 186.256,282,290,29 1,294 Bounded GLR method (BGLR). 249-253 Bosnds cn variables. 22. 25, 61. 138. 23Y.241,246-253,262.369 Branches of spanning trees. 1i0-112. 3 16-320,38Ct_:X? Broyden's r:l:it~-:x ilpdste pmcedure. 127. 131 Cauchy-Schwartz ineqnality, 192 Certainty Equivaleace Principle. 157 Chi-square distribution, 189, 225. 390-351, 387,388 random variable, 358. 389 Cholesky Cactoi-ization, 159 Chords in a graph, I 10-1 12,316-320. 380-383 Circuits mineral beneficiation, 7 Coaptatioii subproblern, 77 Collective methods for bias and leak detection, 256 principal component tests. 200. 209 Combinatorial strategies, 253-254

Confidence interval, 387 for nonlinear processes, 22,26, 138 Bonfmoni, 181 history of, 2 1-24 Connectivity, 382 in dynamic systems, 142-173,282 Constant direction approach, 125 industrial applications of, 327-37 j Constraint test, 23, 180-182,201, 203, linear steady-state, 59-84, 155 232,253-255,259,278,334 (see material balance, 72 also Nodal test) nonlinear problems, 262 Constraints nonlinear dynamic (NDDR), 165, bilinear, 134 166, 168, 169, 170 equality, 122 nonlinear steady-state, 25, 119-141 ' parameter estimation and, 33 1-332 inequality, 128-129: 166 nonlinear, 124, 131, 138 plant-wide material and utilities, Continuous stirred tank reactor 332-372 (CSTR), 162-1 64, 168-1 70,261 problem formulation, 7.9-10 Control law, 148 process unit, 328, 334 Coulers, 116 simple problems, 11 Correlation coefficients, 33 simulation techniques for Covariance matrix, 63, 121 evaluating, 8 1-82 of balance residua!^, 178 statistical basis of, 61-63 of measurement adjustments, 183 steady-state, 4,5,6, 7, 10, 23, 25,27, Critical value of a statistical test, 177, 80, S1,85, 153, 154, 166, 329 successive linear (SL), 124-128. 178, 180, Crowe's project matrix methcd, 135. 137 9'7--104, 113-1 14. 1 16, 126, DATACON'" software. 35 1, 354 132-1 33, 138,219,333 DATREC s~ftware,33 1 Degrees Crude preheat tiair?,5-4, 86, 329, of frecdorn, 388 335-339,340 of redundancy, 65 Delay split optirnrzatio!l, 5 , 10 CUSUM tests, 55 ir, instmment checking, 27; C11:sets in a giapt-r, 1 i0-! !2, 317, 319, in daia filtering, 39, 41 320,381-383 Detectability factor, 21 1 Cycles in a greph. 380 Dirac delta function, 161 Distributed control system (GCS). 4G Distl-ibutions Data beta. 268 coaptation, 8, 15. 22 ceetral. 192 conditioning, 4, 56 multivariate, 392 filttlring. 27, 39, 5 1 noncen~ral,392 rectificaiion, 3 normal. 35,386-387 smoothing. 39, 5 1 validation, 27, 56 standard normal, 386 Dyrlarnic measurement test (DMT). Data reconciliation (DR) benefits from, [email protected] 248-149 bilinear, 25, 85-1 17, 119, 316 Edges, 321,378-382 dynamic, 10,23,27, 142-173,330 Eigenvalues, 196, 200, 209, 376 estimation accuracy of, 301-303 Eigenvectors, 196, 377 flow reconciliation example, 11-13



Uutrr Reco~~ciliufion mid Gross 6 - r o l -Derectior7

Elementary tree transformation, 383 Energy balances, 9 conservation constraints, 9 conservation laws, 8, 27 flows, 1I Enthalpy balance, 212,340,342 flows, treatment of, 114, 116 Equivalency classes, 215,2 16,258 Error-in-variables (EVM) estimation, 168 Error reduction methods, 38 Errors gross, 1 4 , 6 , 7 , 1 1, 17-20, 21, 22, 23,24,26, 27,32,34, 35, 37, 60,XO-81, 128, 174-225, 226280,327-372 normalized, !76 random, !A,7, 12,27,32-37,41. 56.61, 81, 1 6 1 4 5 . 151, 154, 163, 168,175. 176,358 reduction methods. 38-56 sqtiared prediction, 199 syr;temaric. 32 Type I. 177, 181, 184, !e8, l9c. 191, 198,223,229-23 1,233. 234, 236.240,254,255, 257, 258,259,262,264, 27!, 273, 284- 286,294,332,393 Type Ii, 177,223,234,236,254 255.284,286.294,392 E s t ~ m a t ~xcuracy o:~ of da:a reconc~l:at~on. 30 1 of mlnlmum obsen able scn\or net\\forka,306 Expected value of random errors. 32-34.389-39 1 of a furict~onof mndom varr~Sles, 35-37 Extended measurement test (EMT), 248.249 Faults, 282, 295 additive. 289 diagnosis, 28 1,284,288, 295-297 hard. 295

isolability, 295-297 signature, 296 soft, 295 Filters analog, 38 digital, 38, 39, 54, 56 double exponential, 42,45 exponential, 40-47,48, 54 exponentially weighted moving average (EWMA), 50 finite impulse response (FIR), 49,s 1 first-order, 40,42,45, 53.54, 55 geometric moving average, 50 hybrid, 54-56 infinite impulse response (IIR), 40 Kalman (see Kalrnan filterj least-squares, 5 1 moving average, 48,49,54 nonlinear exponential. 42-43, 45. 46.47, 50, 54 polynomial, 5 1-54 reverse nonlinear exponential, 44 seccnd-order, 4 5 5 3 square-root covariance, 158 Flow balznces, 12 energy, l I enthalpy, 114 estimated. 12 mass, 11 measured, 12 reconciliation. 13, 14. 16. I8 Fourth-order Runge-Kut:a method. 163, 169 Fundamental cutsets, 382-383 cycles, 382-383 CAMS, 323 Gauss-Jordan elimination process, 124 Gaussian distribution, 10-1 1, 161, 284, 289, 290, 386 (see also Normal distribution) elimination, 31 3, 3 15

Generalized likelihood ratio (GLR) test, 23, 185-194, 199, 201, 203-205,214,223,227-230, 234-238,240,241-244,252, 259, 261, 262, 266,269, 277, 288-294,296,298,340 Generalized reduced gradient (GRG), 132-133, 137, 138, 167 GINO, 323 Global test (GT). 23, 178-1 80, 193, 194, 198, 199,201.203-207,, 222. 230, 23 I, 236, 238, 240, 248,252,259,277,283-288, 293,294,298,355,359-362, 364,366,367,368 Graph, 378-380 operations, 380 process, 378-380 subgraphs, 379-381.383-384 theoretic methods, 22. 25, 72. 82, 1 10, I35,3 15-3 16.320,324 theory fundamentais. 378-393 Grinding mills, I I3 Gross error detzction (GED). 1 4 . 2-3. 24.26, 174-225,226-280,330. 340-541. 359 basic statistical tests for. 174-195 benefits from, 20-2! for-steady-state processes. 226-280 history of, 21-24 i n lineal-dynaniic sysirms. 28 1-299 in nonlinear processes. 260-154 rrtedel. 185 $?rial strategies f ~ r236-238 . signature models, 187 simultaneous strategies for, 227-236,248 using principal componcni trsts, 195-200 Gross errors. 1,6, 7, 1 I. 17-20. 2 1 . 22. 23, 24, 26. 27. 32, 34. 35. 37-38,60, 80-81, 128, 174-225. 226-280,327-375 equi\rale~~cy classes, 215. 216. 258 equivalent sets of, 2 14, 2 16

ident~fiabilityof, 214-21 7 identification strategies, 256-260 signature vectors, 20 1, 2 15, 216,230 HARWELL mathematic library, 82, 131 Heat balance equations, 25 exchangers, j4, 9, 10, 1 1, 2 1. 73, 74, 75, 1 15-! 16, 162, 179, 212,233,252.274,276,29S. 336,339,340,342.343 transfer coefficients, 9-1 0. 2 1, 295,33 1 transfer fluid (HTF). 27G-275 Heatcrs, 1 16 Hessian matrix. 130-1 3 1 Hotelling T' test, 329 Hypotheses alternative, 176. 187. 228, 743. 241, 392 cornhinatoria!. 278 ~ l o b a alrema!ivc. l 395 null. 176. 2G3. 529. 231. 2%-2%. 293. 391-393 resting. 39 1-393 irnpierner~tationof da::~recocziliarion yidz!ines. 339 on-iine. stead) -state. 329 IMSL niathen2a:ic library. S2 Independent equatic~~s. 88 r-sr~cioinerrors. 3 ; I~~novations. 150. 25-7-2813 inteeral o!'ahsolutc. eri.ors (IAEI, 3934.47,19 lritcgral dynamic rncasurerncnt test. 297 Itcr.:lti\.c rueasurcnient test (IMI'), 238-240, 213.236. 248.277 Jacobian matrix, 123-1 24. 125. 126. 127 Jo~ntprobability dzns~tyfunct~on.390


Doin Ilccor~crllarronand Gross Error- Derrcrion

Kalnian filter, 148-160, 163, 164, 165, 170, 171,283,285, 289, 293,294,296,298 extended, 23, 161, 163, 169 filtering methods, 26, 148-160 gain matrix, 150, 158, 160 implementation, 158 steady-state Kalman gain, 151, 152 Kruskal's algorithm, 321-322

left null space, 375 null space, 375,376 projection, 6 4 , 6 6 4 9 range space, 375 rank, 375 row space, 375 signature, 289 trace of, 303, 376 Maximum likelihood estimates (MLE), 122,230 Maximum power (MP) Lagrange multipliers, 60, 61, constraint tests, 181-182, 199 122-124, 131 measurement test, 184-185, Leak detection, 185-189,254-256,335 190-1 93,199,203-206 Leaks, 37, 174, 185, 189. 190 Mean values, 392 Least-squares Measurement formulation, 160 accuracy of, 37 minimization, 121 direct method, 78 op~imization,8, 13. 161 elimination, 23, 24 weighted objective fur~ction.8 error covariance matrix, 77,78, Le\ el of significance, 176. 392 79, 178 modified, 181 errors, 27,32-38 Likelihood functi~n,52 indirect method, 78, 80 Line search. 127, 130 practically nonredundanr, 21 0 Idinear practically unobservable cc~nbinationtechniqw (LCT). variables, 210 254-256.260 precision, 37 da:a rezonci!istior. pioblems. 9, test (MT). 20, 23, 183-1 &5,201, 59-82,155 222,227,255,355-356,359, program (LP), 132 360,361 systcrns, 63-77 test statistics, 183--I85 L.ccal neigilhorhood search techniq~te. Mineral 323 beneficiatiotl circuits, 7, 104 120ss e5timatlon. 334 flotation process, 102 LU decomposition of matrix, process circuits, 23 134, 135 MINOS, 137,323 Mixers, 9 1, 94, 102, 113 Maznitude enthalpy balance, 114 of bias, 185, 262 two-phase, 106-107, I08 of gross error, 37,265.270. 273 Model MATLAB, 82.2 17.274 identification, 143 Matrices and their properties, 373, linear discrete dynamic system, 375-377 143-145 Matrix tuning, 20-21 column space, 375 Modified iterative measurement test covariance, 63. 121 method (MIMT), 247-249,25 1, decomposition methods, 70-72 261,278

Modified serial compensation strategy (MSCS), 2 6 2 4 6 , 2 5 9 , 2 6 0 , 263-264,277-278 Moving window approach, 166-167 Newton-Raphson iterative method, 123,132 Nodal test, 23, 180-182, 208,232, 253-255,259,278,334 (See also Constraint test) Nodes, 378-38 1 Nonlinear data reconciliation, 9,26,85, 164-1 70 GLR test, 263-264 optimization strategies for data reconciliation, 136, 167, 171 programs (NLP), 23,25,26, 103, 104, 128-129, 134, 137,261, 276,331,369 state estimations, 160-1 64 Normal distribution, 10-1 1, 161, 284, 289, 290,386 (See also Gaussian distribution)





Objective function (OF), 261-263 for data :econciliation. 8, 60 difference. 26 i-262 reduction in. 204-205 Observzbiiiiy, 22, 69-70, 71, 72. 74, 82, 135, 135,2lG definition of. '70 Oil-iine data coilection and cocditioning, 5 implementation a:' data reconciliatiun. 329-330 optimization, 10 Open Yield software, 335 Optimal !55 control and Kalmax fi1:::-in;, state estimation, 148 Orthogonal collocation, 166-167 Overall power (OP), 257 function (OPF), 257 function equivalency, (OPFE), 258 Parity equations, 296 Paths, 380

Performance measures for GE identification strategies average error of estimation (AEE), 258 sverage number of Type I errors (AVTI), 258 overall power (OP), 257 overall power function, 257 overall power function as equivalent sets, 258 Plant-wide material and utilities reconciliation, 332-372 Posterior probability, 266267 Power of statistical test, 177, 181, 190,392 Preheat train, 5 4 , 2 1 2 Principal component analysis (PCA), 296 scores, 196 measurement test (PCMT). 197, 355-356,360 model, 197,297 of constraint residuals, 196 tests, 21, 176. 195-200.207-209, 223.232.259,364-366 Prior distribution. 268 pr~babilitji,255 Probability density functions (PDF or p.d.f.). 35, 384-339 Process control applications, 10 data conditioning methods. 1 4 unit balance reconciliatiotl, 328-33 1 Produt-ti011accounting, 332, 335 Projection matrix, 22, 25,64,66,67, 70. 74, 8 1, 82, 101-102, 132, 134,146,2U1, L ! Y , ZLI

Q statistic, 199 (See also Rao-statistic error or squared prediction error) Quadratic objective function, 129 problem (QP), 131 QPSOL, 131 QR factorization, 22,66-7 1, 8 I, 127, 128

d Gross El-ror I>etecrion

redundant observable, 305, RAGE software, 136,33 1,340 309-313,320,324 Random errors, 1 4 , 7 , 10, 12, 27,32-37, 38, Separation Theorem, 157 41,56,61, 82, 143-145, 151, Separators, 94, 113 154, 163, 168, 175, 176,358 two-phase, 105, 108 events, 384 Sequential variables, 384-393 probability ratio test (SPRT), 284 Range and null space decomposition quadratic programming (SQP), (RND), 131, 136,376 129-132, 135,136 Rao-statistic error, 199 Serial Raoult's law, 120 compensation, 237, 241, 243, 260, Reactors, 94-96, 113 277 Real correlation, 34 numbers, 376 elimination procedure, 24,204, 214, vectors, 376 RECON software, 33 1 Reconciliation of ammonia plant data. strategies, 236,246247, 259 343-372 Shewhart test, 54-55 RECONSET software. 33 1 SIGMAfine software, 335 Redundancy, 22,27,69-73,82, 134, Signal 135,209,210, 21 1, 228, 300. aliasing, 38 330,342,343,354,369 processing, 25 classification: 71. 135 reconstruction. 45.55, 297 definition, 70 types, 55 degrees of, 65 Signature marrix, 289-290 spatial. 4 Simpie seriai compensation strategy temporzl, 4, 149, 171, 282, 291 (SSCS:, 241-244,245.259, Redundant sabproblem, 77 260,277 Rigorous on-line modeling. 2 1 Simpson's technique, 104-105, 108. W4D-SQP, 131. 133, 136,376 109,111,113, 114, 117, 133,3!6 ROh4e01"softwa-e. 331 Simultaneous strategies Runge-Kutta method, Courth-order. for multiple gross error 163, 169 identification, 227 using 2 Bayesian approaclr, Seiectivity, 258 264-273 Sensor network using combinatorial hypothesis design, 70, 300-326 testing, 228-23 1 developments in des~gn,323 using simultaneous estimation of mar;im.;m es:imati~naccuracy GE's magnitudes, 232-233 design, 306. 3 15 using single gross error test minimum cost designs, 3 13-3 15, statistics, 227 320,322,324 Smearing effects, IS, 228 minimum observable. 304306,307, Soave-Redlich-Kwong (SRK), 354 312,315,316,318,322,324 Spanning tree, 110-1 12.316-322, optimization techniques for, 322 325,380-383

Spatial correlation, 33 redundancy, 4 Splitters, 91-93, 96, 113-1 14 enthalpy balance, 114 Squared prediction error, 199 SQPHP, 131 Standard deviation, 33, 34-37,45,56, 2 18 nom~aldistribution, 386-387 Statistical moments, 389, 390 process control, 38, 55 properties of innovations, 283-284 properties of random variables, 389-391 quality control tests (SQC). 3, 38 tests for general steady-state models, 200-202 Steady-state linear reconciliarion, 25 processes. 4. 25, 282, 369 Subgraph, 379-381 Subproblems coapta~ion.77 redundant. 77 Successive linear data reconciliation, 126128 Successive quadratic progran?minz (SQP), 22-23, 13I. 135, 138. 167 Successively linearized hoiizon estiriiation (S1,IIE). 167 Systematic errors, 32 Systems bilinear, 25, 85- 117 containing gross errors, 17 dynamic, 25, 27 linear. 63 linear dynamic, 28 1 noilli~~ear, 160 non1-edundant, 16 observable, 17, 25 redundant, 25 unobservable, 17


with all measured variables, 11-14, 15, 16,22,26 with unmeasured variables, 14-17,22 Taylor's series expansion, 36, 123, 129 Test statistics, 175-203, Theory of evidence, 329 Truncated chi-square test, 199 Unbiased estimated techniques (UBET), 232-233,236 Univariate tests, 180, 184. 199 VALI, 33 1 Variables basic, 132 classification methods, 77 dependent. 132 independent, 132 measured, 11-14, 15, 16, 22,26. 32, 33,62, 6356.69, 70,72, 81, 101, 109. 121, 132, 135. 149, 177,212,238,249,257.26C, 305,314,3 16,358 nonbasic. 132 nonredundant, 16,210 observable, i7, 25 primary, 35 random, 3 5 5 7 , 119,256,257. 184.389 redundsqt, 17%136. 200, 2 I?. restricted, 250 secondary random, 35 split-fractior., 1 13-1 14 superbasic, 132 unmeasuied, 14-17. 21, 27.63-58. 81, 100, 103, 110, 121. 126, 132-133, 135, 136, 177,200. 202, 204,2 12, 24 I, 248, 249. 305, 312, 315, 316,331. 358 unobservable. 17, 135, 136. 210 Variance. 392-393 of the estimated error, 302-303 of random variables, 33,389-39 1


Dora Rccoitciliatiun aud Gross Error Urrecriol~

Vectors and their properties, 373-375 column, 373 dimension of, 374 gross error signature, 187, 193, 215 of balance residuals, 178 of measurement adjustments, 183 real, 373

row, 373 space scanned, 377 Weighted least-squares objective function. 8 Windows, 166-1 68,285,293


Yield accounting, 332, 335, 369

Abadie, J. i 32 Albcquerque, J. S., 134,297 Ali, Y., 324 Almasy, G. A.. 23,79.80


Bagajewicz, M., 77, 155,214,260, 297,322,324,330 Bagchi, A., 158 Basseville, M., 295 Bellingharn. B., 145, 295 Bequette. B. W., 128, 137 Biegier, L. T., 22, 134, 136,297 Bodirrgtcn, C. E., 24 Borrie, J. A,. 158 Eritt, H. L., 1-2, 25,!26 Carpani, R. E., 23 Charpentier, V., 2 10, 329 Chen, J., 80 Clinkscale>,1. A., 5 1, 52, 54 Crowe, C. M., 22,24,66, 72.97,98, 101, 113, 135, 181, 190, 196, 198, 199,204,239,210,314 Daniel, J., 373 Darouach, M.. 155

Davidson, H., 22 Davis, J. F., 24, 181,232, 243,254, 257,259,260 Dee, N., 317, 321, 378 Devanathan, S., 155,2S4 Dunia, R., 37. 296, 297 Edgar, T. F., !39, 137, !62 Everell, M. D.. 23 Fisher, G., 22, 127. 133 Fisher, D. G., 151 Gelb, .4., 153 Gertler, J . J., 281,295. 296 Gill, P. E., i29 Gonnan, J. 'vSi ,22, 12-5

Har~humar,P., 256 neenan, W. A., 24,203,228,238, 239,247,255,261 Heraud, N., 23 H~mmelblau,D. M.. 129.295, 295 Hlavacek, V., 24 Hodouin, D., 23 Howat, C. S., 24, 120, 125, 126


n' i;rois EI ror- Ilerectian

Ichiyen, N., 102 lordache, C.. 210,214,269,273 Isermann, R, 295 Jazwinski, A. H., 160 Jiang, Q., 155,214,233,260,297 Jones, H. L., 285,293 Jordache, C., 24, 37,53,55, 136,260, 360 Kalman, R. E., 149 Kao. C. S., 34 Keller, J. Y., 80,238. 244. 245, 259 Kelly, J., 333 Kim, Y. $1..39, 168. 261 Knepper, J. C., 22. 125 fieisovalis, A., 72. 135. 303, 309, 3 12,324 Kuehn, D R., 22 Lasdon. L. S.. 132. 13'7 Lee, J. M.. 39 Lees, F. P.. 145, 295 Luccl-te,R . H.. 22. 125. 126 L-iebman, M. J.. 23. 137. 165, 160, 169 Liprak, B. (3.. 57 Loucka, M., 72.323 !,uo, Q.. 296 MacDonald, R. J.. 24. 120. 125, 126 MacGrcgor, J. F.. 50 Mac!:on. F., 24. 32. 31. -35. 37. 123. 121.209,3 13.3; 5. 324, 332 h4ah. R.S.H., 22. 23.2-4, 34, 72, 74, 79,SC,,200, 206, 2 14. 237. 253, 255, 288, 303,309,3 1 1.321 Makn~.S.. 15 1 Maquin. D., 32 : Mehra. R. K., 285 Melsa. J. I.., 148. 150. 15 1 Meyer, M., 72.96 Montgomery, R. C.. 284 Mullick, S. L., 330 RQu~tagh, B. .A,, 137 Muthy, A.K.S.. 23 Muske, K. R., 162

Nair, P., 37, 136 Narasimhan, S., 23, 24, 185, 190, 200, 2 14,237,256,26 1,264,288,324 Nikiforov, I. V., 295 Noble, B., 373 Pai, C.C.D., 22, 127, 133 Parr, A,, 45, 55 Patton, R., 281, 295 Peschon, J., 285 Press, W. H., 159 Ragot, J., 72 Ramamurthi. Y., 128, 137, 165, 167, 169 Rao, R., 324 Ravikumar, V., 23, 24. 136 Reid, K. J., 23 Reilly, P. M., 23 Reklaitis. G. V., 8, 95 Kenganathan, T., 261,264 Rhinehart. R. R., 45 Rlpps, D. L., 23, !95, 204, 237 Rolllns, D. K 24, 155, 181, 232, 243. 257,259,260,284 Romagnol~,J. A., 22. 24.66. 77, 116,258 Rosenberg. J ,24,204,231.233,236. 248.249,260


Sage, A. P., 148, 150, 151 Sanchez, M., 22.66,77, 116.233,258 Saunders, M. A., 137 Seborg, D. E., 38,45 Sen, S., 324 Serth, R. W., 24, 123, 204, 228, 238, 239,247,255,261 Sheel. J. P., 24 Shewchuck, C. F.. 123 Shinskey, F. C. , 37 Sirripson, D. E., 23, 104, 108 Smith, H. W., 102 Sorenson, H. W., 150 Stanley, G. M., 23, 56,72, 151 Stephanopoulos, G., 24,77 Stephenson, G. K.123 Strang, G., 373

Swartz, C.L.E., 22. 66, 128, 1.34, 135 Sztano, 'r., 23

Wald, A., 284 Wang, N.S., 24 Warcn. A. D., 132. 137 Weher. R.. 45 Wiegel. R. I., 23 Williams, J. P., 284 Wi!lsky, A. S., 288, 293 Wisltner, R. P., 162

Tamhane. A. C.. 23.24, 1 8 I , 183, 264,273 Tham, M. T., 55 Tilton, B., 260, 356 Tjoa, I. B., 22, 136 Tong, H., 24, 196. 198, 199.209 Turbatte, H. C., 324

Yang. Y., 255

Vaclavek, V., 72, 323, 324 Veverka, V. V., 24, 3 13,315,324,332

Zalkind. C. S.. 37 Zasadzlnski, M., 155


Data Reconciliation & Gross Error Detection: An Intelligent Use of

, Data Reconciliation & Gross Error Detection An Intelligent Use of Process Data Shankar Narasimhan and Cornelius Jordache Publishing Company Houst...

16MB Sizes 0 Downloads 0 Views

Recommend Documents

An Intelligent Methodology for Malware Detection in Android
[13] Yan Ma and Mehrdad Sepehri Sharbaf. “Investigation of static and dynamic android anti-virus strategies”. In Inf

Reconciliation in Guatemala: the role of intelligent justice - laits
Guatemala, Memory of Silence. The CHC goes some way to breaking that silence, but recognises that the report itself is n

Econolite Canada - Intelligent Traffic Systems - Detection, Radar
Econolite is committed to employing advanced technologies that reduce travel time, ease congestion, enhance transit oper

Motomaster eliminator intelligent battery charger error codes
Document - motomaster eliminator intelligent battery charger error codes found kisme, 09.04.2017 at 14:36.

XAPP383: Single Error Correction and Double Error Detection - Xilinx
Aug 1, 2003 - Introduction. To improve system reliability, a designer may wish to provide an automatic error detection a

motomaster-eliminator-intelligent-battery-charger-error-codes.exe - Is This File Safe? Status: Not Trusted. First seen:

Data error checking test
Shinto Fletcher pengertian manajemen keuangan internasional menurut para ahli multicopista their churrs and disturbingly

Web Data Mining Using An Intelligent Information System Design
Strategies by using Web Mining Techniques',. IEEE (2003). [6] Sung Ho Ha, Sung Min Bae, Sang Chan. Park, 'WEB MINING FOR

An error occurred
Psicoblogs - Sitio web donde encontraras resumenes de las meterias brindadas en la facultad.

An error occurred. - | MMRC
Oct 7, 2017 - [url= like the best bits of sou