
University of New Mexico

UNM Digital Repository Anthropology ETDs

Electronic Theses and Dissertations

7-1-2010

Evidence for a change in the rate of aging of osteological indicators in American documented skeletal samples

Wendy Elizabeth Potter

Follow this and additional works at: https://digitalrepository.unm.edu/anth_etds

Part of the Anthropology Commons

Recommended Citation
Potter, Wendy Elizabeth. "Evidence for a change in the rate of aging of osteological indicators in American documented skeletal samples." (2010). https://digitalrepository.unm.edu/anth_etds/54

This Dissertation is brought to you for free and open access by the Electronic Theses and Dissertations at UNM Digital Repository. It has been accepted for inclusion in Anthropology ETDs by an authorized administrator of UNM Digital Repository. For more information, please contact [email protected].

EVIDENCE FOR A CHANGE IN THE RATE OF AGING OF OSTEOLOGICAL INDICATORS IN AMERICAN DOCUMENTED SKELETAL SAMPLES

BY

WENDY ELIZABETH POTTER

B.A. ANTHROPOLOGY, ARIZONA STATE UNIVERSITY, 1998
M.S. BIOLOGICAL ANTHROPOLOGY, UNIVERSITY OF NEW MEXICO, 2001

DISSERTATION

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy
Anthropology

The University of New Mexico
Albuquerque, New Mexico

August 2010


DEDICATION

To my family, with love.


ACKNOWLEDGEMENTS

I would like to thank all of the members of my committee for their encouragement, support, and feedback throughout the course of my graduate studies and dissertation research. I am indebted to my committee chairs, Dr. Osbjorn Pearson and Dr. Jane Buikstra, for their invaluable comments and suggestions to better this project. Many thanks also to Dr. Keith Hunley and Dr. Heather Edgar, who provided constructive criticism of the project design and the manuscript. Finally, thanks to Dr. Edward Bedrick, who was instrumental in helping me properly interpret the outcome of the statistics used in the project.

Thanks also to Dr. George Milner and Dr. Lyle Konigsberg. Dr. Milner kindly provided instructional materials for the Transition Analysis aging standard and a version of the ADBOU Age Estimator program. In addition, he provided helpful feedback on a poster I presented at the 78th annual meeting of the American Association of Physical Anthropologists, which reported preliminary results from this research. Dr. Konigsberg graciously provided assistance with the statistical package R (R Core Development Team, 2008), including basic code that was adapted for this project.

This project could not have been completed without the assistance of collection curators at the following institutions: Dr. David Hunt, National Museum of Natural History at the Smithsonian Institution; Dr. Laura Fulginiti, Maricopa County Forensic Science Center; Dr. Heather Edgar, Maxwell Museum; Dr. Lee Meadows Jantz, University of Tennessee; and Lyman Jellema, Cleveland Museum of Natural History. My thanks to these individuals for providing both documentation of and access to the remains. Many thanks go to the staff of these institutions for their support, including Lara Noldner, Carmen Mosley, and Rebecca Wilson. Thanks also to Dr. Kristin Hartnett for her assistance with arranging access to the collection of pubes and ribs she procured at the Maricopa County Forensic Science Center as part of her dissertation project.

I would like to thank the following agency for grant money supporting this research: the Student Research Allocations Committee of the Graduate and Professional Student Association at the University of New Mexico.

I am deeply appreciative of the contributions my friends and colleagues have made toward bettering my dissertation, including scholarly discourse, statistical help, formatting, and editing/feedback. Thanks particularly to Andrew McQuade, Megan Rhoads, Shamsi Daneshvari, and Angie Evans.

I am especially grateful for and indebted to my family, whose continued love and support throughout my graduate education has been an immense source of strength for me. I could not have completed this without my parents, Jean and Glen Potter, my siblings, Dr. Karen Cadman and Steve Potter, and my “Albuquerque parents,” Colonel and Dru Rhoads. Thanks to my computer support staff, Andrew McQuade, Steve Potter, and my brother-in-law, Caleb Cadman; no doubt this process would have been delayed without your help. Thanks also to Jessica Sullivan, who has supported me throughout this endeavor and is like a sister to me. Thanks to Skylar, who kept me company during the writing process and helped me keep my perspective. Finally, I want to express my gratitude to Andrew McQuade, who has provided unwavering emotional support and endured the hardships of living with a doctoral candidate.



EVIDENCE FOR A CHANGE IN THE RATE OF AGING OF OSTEOLOGICAL INDICATORS IN AMERICAN DOCUMENTED SKELETAL SAMPLES

by

Wendy Elizabeth Potter

B.A. IN ANTHROPOLOGY, ARIZONA STATE UNIVERSITY, 1998
M.S. IN BIOLOGICAL ANTHROPOLOGY, UNIVERSITY OF NEW MEXICO, 2001
DOCTORATE OF PHILOSOPHY IN ANTHROPOLOGY, UNIVERSITY OF NEW MEXICO, 2010

ABSTRACT

The question of uniformity in skeletal age changes across populations is fundamental to all comparative work in skeletal biology. Whether an aging standard will work on target groups that differ in time, space, and background from the reference sample is essential for reliable, accurate age estimation. This research addressed whether a difference in skeletal senescence exists between older American documented collections and more recent ones. The pubic symphysis, auricular surface, sternal rib end, and suture obliteration were scored for a sample of American Blacks and Whites drawn from the Terry, Hamann-Todd, Bass Documented, Maxwell Museum, and Maricopa County Forensic Science Center collections. The samples were divided into two groups: the Reference group included the Terry and Hamann-Todd samples, and the Recent group included the remaining three series. Differences between Reference and Recent groups were tested using proportional odds probit regression analysis and an analysis of deviance.

Results indicated a significant difference in pubic symphyseal senescence between older Reference and more Recent American skeletal samples. No difference was found for cranial suture closure or the sternal rib end. Statistical problems and noteworthy critiques of auricular surface aging methods precluded an assessment of whether a difference between groups was present for this indicator. For the pubic symphysis, a slight deceleration of the rate of metamorphosis was reported for the Recent group, particularly for males and Whites. However, the broad age ranges associated with phases defined by most pubic symphyseal aging methods appear to mitigate this problem for forensic assessments of age at death. In contrast, paleodemographic and bioarchaeological analyses may be more greatly affected, as broad age ranges are often not desirable for such investigations.

These results advance anthropologists’ current knowledge and understanding of the applicability and reliability of aging standards when used on skeletal samples differing in time, space, and composition from the reference sample, and impact how skeletal age estimations are interpreted in forensic anthropological, paleodemographic, and bioarchaeological investigations. Uncertainty as to the possible causes for the differences observed, whether due to secular change, sampling issues, observer bias, or environmental factors, provides an abundance of future research opportunities.


Table of Contents

DEDICATION ..... iii
ACKNOWLEDGEMENTS ..... iv
ABSTRACT ..... vii
LIST OF FIGURES ..... xiii
LIST OF TABLES ..... xv
CHAPTER 1 INTRODUCTION ..... 1
  INTELLECTUAL MERIT ..... 6
CHAPTER 2 HISTORICAL SETTING ..... 8
  HISTORICAL CONTEXT OF THE DEVELOPMENT OF HUMAN SKELETAL COLLECTIONS IN THE UNITED STATES ..... 8
    Summary ..... 37
CHAPTER 3 THEORETICAL BACKGROUND AND LITERATURE REVIEW ..... 39
  ESTIMATION OF AGE AT DEATH ..... 40
    Development of osteological aging standards ..... 41
      Osteological aging standards ..... 42
        Cranial suture closure ..... 45
        Pubic symphysis ..... 49
        Sternal extremity of the fourth rib ..... 54
        Iliac auricular surface ..... 55
        Multiple-trait approaches ..... 57
          Transition analysis ..... 58
  APPLICATION OF AMERICAN AGING STANDARDS TO TARGET GROUPS ..... 61
    Cranial sutures ..... 61
      Strengths and weaknesses ..... 62
    Pubic symphysis ..... 64
      Strengths and weaknesses ..... 68
    Auricular surface ..... 76
      Strengths and weaknesses ..... 77
    Fourth rib ..... 79
      Strengths and weaknesses ..... 81
    Transition Analysis ..... 82
      Strengths and weaknesses ..... 83
  CRITIQUES OF ESTIMATING AGE FROM THE ADULT SKELETON ..... 85
    Inherent variation in the aging process ..... 85
    Methodological problems ..... 87
    Statistical problems ..... 89
      Regression toward the mean ..... 89
      Age structure mimicry ..... 90
  SUMMARY ..... 96
CHAPTER 4 RESEARCH DESIGN ..... 99
  MATERIALS ..... 100
    Reference Collections ..... 101
      Anatomical Collections ..... 101
        Hamann-Todd Collection (HTH) ..... 102
        Terry Collection (TC) ..... 103
    Recent Collections ..... 105
      Documented Collections ..... 105
        Bass Collection (UTK) ..... 105
        Maxwell Museum Collection (MMA) ..... 106
      Autopsy Collections ..... 107
        Maricopa County Forensic Science Center autopsy sample (MCFSC) ..... 107
  DATA COLLECTION METHODS ..... 108
    Sample Selection Protocol ..... 108
    Dataset ..... 109
    Data collection ..... 116
      Pubic Symphysis ..... 117
        Phase/Stage Based Standards ..... 117
        Transition Analysis ..... 118
      Auricular Surface ..... 118
        Phase/Stage Based Standard ..... 118
        Transition Analysis ..... 119
      Sternal End of the Fourth Rib ..... 120
        Phase/Stage Based Standard ..... 120
      Ectocranial Suture Closure ..... 120
        Phase/Stage Based Standard ..... 120
        Transition Analysis ..... 121
      Other data ..... 121
  DATA ANALYSIS METHODS ..... 121
    Data preparation ..... 121
    Analytical methods ..... 124
      Right versus left side morphology ..... 124
      Intraobserver agreement ..... 124
      Descriptive statistics ..... 125
  RESEARCH QUESTIONS AND HYPOTHESIS TESTING ..... 127
  ASSUMPTIONS ..... 134
    Documented Skeletal Collections ..... 134
      Age at Death ..... 135
      Racial Designation ..... 135
    Osteological Aging Standards ..... 136
      Reliability of Aging Standards ..... 136
      Validity of Aging Standards ..... 137
  LIMITATIONS ..... 138
    Availability of Information ..... 139
    Issues of Sample Bias and Sample Representativeness ..... 139
  SUMMARY ..... 141
CHAPTER 5 RESULTS ..... 143
  PRELIMINARY DATA ANALYSIS ..... 143
    Right versus left side morphology ..... 143
    Intraobserver agreement ..... 144
  DATA ANALYSIS ..... 146
    Descriptive statistics ..... 146
      Spearman’s Correlations ..... 151
        Spearman’s correlations between variables scored and age at death ..... 151
        Spearman’s correlations between variables scored ..... 152
      Plots of stage versus age at death ..... 153
      Comparison of observed and expected values ..... 155
      Agreement between observed and expected values ..... 156
      Plots of differences between observed and expected values ..... 158
        Plots Highlighting Differences for the Entire Dataset ..... 158
        Plots Highlighting Differences Among Skeletal Series ..... 160
        Plots Highlighting Differences Among Age Cohorts ..... 164
      Identification of the best predictors of the difference between observed and expected phases ..... 167
    Calculation of bias and inaccuracy ..... 170
    Hypothesis Testing ..... 172
      Question 1 ..... 172
        Reference versus Recent, sexes and races pooled ..... 179
      Question 2 ..... 187
        Method type ..... 188
        Indicator used ..... 188
        Anatomical region of the indicator ..... 190
        Sex ..... 191
        Race ..... 194
        Sex-race category ..... 198
        Time, by ten-year birth cohorts ..... 200
        Adult Stature ..... 202
        Synopsis ..... 204
      Question 3 ..... 205
        Standards supported by the anthropological literature ..... 206
          Pubic symphysis ..... 206
          Auricular surface ..... 208
          Fourth Rib ..... 209
          Cranial sutures ..... 209
          Multiple indicators: pubic symphysis, auricular surface, and cranial sutures ..... 210
        Standards selected for inclusion in the stepwise regression model ..... 211
          Pubic symphysis ..... 213
          Auricular surface ..... 214
          Fourth Rib ..... 214
          Cranial sutures ..... 214
          Multiple indicators: pubic symphysis, auricular surface, and cranial sutures ..... 214
        Standards supported by other data ..... 215
          Pubic symphysis ..... 215
          Auricular surface ..... 217
          Fourth Rib ..... 219
          Cranial sutures ..... 219
          Multiple indicators: pubic symphysis, auricular surface, and cranial sutures ..... 221
        Evaluation of Hypothesis 3: combining multiple lines of evidence ..... 221
  SUMMARY ..... 223
CHAPTER 6 DISCUSSION ..... 225
  HYPOTHESES REVISITED ..... 225
  DIFFERENCES IN AGING BETWEEN AMERICAN SKELETAL SAMPLES ..... 226
  INTERPRETATION ..... 227
    Changes in the American political, social, cultural, and technological landscape ..... 229
  IMPLICATIONS ..... 233
  SUMMARY ..... 235
CHAPTER 7 CONCLUSIONS ..... 237
  RECOMMENDATIONS ..... 240
    Forensic investigations ..... 241
    Bioarchaeological and paleodemographic analyses ..... 243
  FUTURE RESEARCH ..... 244
  SUMMARY ..... 245
REFERENCES CITED ..... 250
APPENDICES ..... 299
  APPENDIX A: SPEARMAN’S CORRELATIONS ..... 300
  APPENDIX B: PLOTS OF DOCUMENTED AGE BY PHASE ..... 308
  APPENDIX C: PLOTS OF THE DIFFERENCE BETWEEN OBSERVED AND EXPECTED PHASES BY YEAR OF BIRTH ..... 326
  APPENDIX D: PLOTS OF THE DIFFERENCE BETWEEN OBSERVED AND EXPECTED PHASES BY SKELETAL SERIES ..... 332
  APPENDIX E: PLOTS OF THE DIFFERENCE BETWEEN THE OBSERVED AND EXPECTED PHASE BY 10-YEAR BIRTH COHORT ..... 338
  APPENDIX F: REGRESSION OUTPUT FOR IDENTIFICATION OF THE BEST DESCRIPTIVE VARIABLE PREDICTORS OF THE DIFFERENCE BETWEEN OBSERVED AND EXPECTED PHASES ..... 344
  APPENDIX G: DESCRIPTIVE DATA FOR PHASES BY GROUP, SEX, AND RACE ..... 356
  APPENDIX H: AGE AT TRANSITION DATA BY GROUP ..... 382
  APPENDIX I: GRAPHS OF THE NUMBER OF INDIVIDUALS PER OBSERVED PHASE, BY REFERENCE AND RECENT GROUPS ..... 408
  APPENDIX J: AGE AT TRANSITION DATA BY SEX ..... 419
  APPENDIX K: AGE AT TRANSITION DATA BY RACE ..... 435
  APPENDIX L: AGE AT TRANSITION DATA BY SEX-RACE CATEGORY ..... 451

List of Figures

Figure 1: Number of skeletal remains in the dataset by series
Figure 2: Number of Males and Females by Series
Figure 3: Number of Blacks and Whites by Series
Figure 4: Average Age by Collection
Figure 5: Average Age at Death by Sex-Race Category
Figure 6: Average Year of Birth by Collection
Figure 7: Box plot comparing age at death by skeletal series
Figure 8: Box plot comparing year of birth by skeletal series
Figure 9: Box plot comparing stature by skeletal series
Figure 10: Plot of observed Suchey-Brooks phase by age for Recent and Reference populations
Figure 11: Plot of observed Boldsen and colleagues’ symphyseal texture component score by age for Recent and Reference populations
Figure 12: Plot of observed Boldsen and colleagues coronal-pterica suture closure component score by age for Recent and Reference populations
Figure 13: Plot exemplifying a tendency of the method to underestimate chronological age
Figure 14: Plot exemplifying a tendency of the method to overestimate chronological age
Figure 15: Plot illustrating overestimation of age in the Terry and Hamann-Todd series (Reference)
Figure 16: Plot illustrating overestimation of age in the Maricopa County sample (Recent)
Figure 17: Example of an aging method that tended to underestimate age in the Maxwell sample
Figure 18: Plot of İşcan’s method, which tended to underage the Bass and Maricopa County series
Figure 19: The Transition Analysis (COR) method tended to underestimate the age for all series
Figure 20: Example of steeper trend line slopes were observed for all skeletal series
Figure 21: Example of an aging standard that overestimates age in younger cohorts
Figure 22: Example of an aging standard that underestimates age in older cohorts
Figure 23: Example of an aging method that illustrates a shift from overaging to a value closer to zero (see specifically the 20-29 year age cohort)
Figure 24: Example of an aging method that illustrates a shift from underaging to a value closer to zero (see specifically the 70-79 and 80+ year age cohorts)
Figure 25: Example of a difference in the rate of change, illustrating the Recent group’s slower rate of progression through stages
Figure 26: Example of a difference in the rate of change, illustrating the Recent group’s faster rate of progression through stages
Figure 27: Example of no difference in the rate of change, but different ages at transition between the Reference and Recent groups
Figure 28: Example of no difference in the rate of change or ages at transition between Reference and Recent groups
Figure 29: Example of methods with the majority of individuals classified into earlier phase scores
Figure 30: Example of methods with the majority of individuals classified into later phase scores
Figure 31: Closest approximation to a normal distribution of individuals classified into phases


List of Tables

Table 1: Number of skeletal remains in the dataset, by series and sex-race category
Table 2: Sample Sizes by Age Cohort
Table 3: Age at Death Statistics for Series in the Dataset
Table 4: Average Birth Year for Sex-Race Groups by Series
Table 5: Average Stature by Sex-Race Groups
Table 6: Correlation coefficients between age indicator and age at death
Table 7: The estimated concordance correlation coefficient, with 95% confidence limits
Table 8: Agreement between first and second observation scores using the weighted Kappa statistic
Table 9: Intraobserver agreement values across all aging methods, by skeletal series
Table 10: Summary table of descriptive statistics for all variables in the dataset
Table 11: Descriptive statistics for continuous variables
Table 12: Agreement between observed and expected values using the weighted kappa statistic
Table 13: Summary of significant (α=0.05) predictors of calculated differences between observed and expected values (PE) by aging standard
Table 14: Bias and inaccuracy values for traditional phase-based aging methods
Table 15: Bias and inaccuracy values for Boldsen and colleagues’ Transition Analysis methods
Table 16: Analysis of deviance and improvement chi-square output: Total
Table 17: Analysis of deviance output: Total
Table 18: Recent sample’s rate of progression through morphological stages, compared to the Reference sample
Table 19: Summary of the significant and non-significant differences between groups for the tested osteological aging standards
Table 20: Analysis of deviance and improvement chi-square output: Females
Table 21: Analysis of deviance output: Females
Table 22: Analysis of deviance and improvement chi-square output: Males
Table 23: Analysis of deviance output: Males
Table 24: Recent males rate of progression through morphological stages, compared to Reference males
Table 25: Analysis of deviance and improvement chi-square output: Blacks
Table 26: Analysis of deviance output: Blacks
Table 27: Analysis of deviance and improvement chi-square output: Whites
Table 28: Analysis of deviance output: Whites
Table 29: Recent Whites rate of progression through morphological stages, compared to Reference Whites
Table 30: Analysis of deviance and improvement chi-square output: Sex-race categories
Table 31: Analysis of deviance output: Sex-race categories
Table 32: Recent White males rate of progression through morphological stages, compared to Reference White males
Table 33: Recent Black males rate of progression through morphological stages, compared to Reference Black males
Table 34: Analysis of deviance and improvement chi-square output: 10-year birth cohorts
Table 35: Analysis of deviance output: 10-year birth cohorts
Table 36: Analysis of deviance and improvement chi-square output: Adult stature
Table 37: Analysis of deviance output: Adult stature
Table 38: Summary of strengths and weaknesses for each aging indicator from the literature
Table 39: Summary of aging standards and indicator components selected for inclusion in the regression model
Table 40: Summary of aging standards selected for inclusion in the regression model
Table 41: Summary of other data used to assess the reliability of age estimation methods
Table 42: Summary of aging methods supported by the lines of evidence presented in this research


Chapter 1 Introduction

Biological anthropologists study human variation. Through the study of variation, norms are identified, sources of variability are assessed, and adaptations to change are revealed (Brant & Pearson 1994). In order to understand the evolutionary processes that have molded variation in our species, it is important that we be able to assess the age at death in past populations. Our ability to assess age at death from the skeleton is hampered by our lack of understanding of the aging process and how it varies and changes in space and time (Brant & Pearson 1994). So little is understood about human aging that Schmitt and colleagues (2002) consider the age at death assessment of adult skeletons one of the most difficult problems in forensic and physical anthropology.

Since the turn of the 20th Century, life expectancy at birth of the total United States population has increased significantly, from 49 to 77.5 years (Shrestha 2006). In 1900, the median age in the United States was 24 years and the average life expectancy was 47 years; nearly a century later, the median age was 31.5 years and the average life expectancy had passed 75 (Spirduso 1995). Causes for this trend likely include lower activity levels, improved diet and living conditions, improved health care, and improved control of infectious diseases (Flegal et al. 1998; Armstrong et al. 1999; Jantz 2001; Shrestha 2006). However, these improvements in environmental conditions do not have a linear relationship with time; in their study of cranial asymmetry in American skeletal samples over the last 200 years, Kimmerle and Jantz (2005) suggest that fluctuations in environmental and material conditions as a consequence of slavery, the American Civil


War and Reconstruction Period, and the Great Depression likely account for this nonlinear relationship.

The human lifespan is generally thought to have been shorter in the past than it is now (Aykroyd et al. 1999). Paleodemographic analyses of past populations suggest that young adult mortality was high and that few individuals lived past 50 years (Goldstein 1953; Brooks 1955; Weiss 1973; Ruff 1981; Mensforth & Lovejoy 1985). For example, Goldstein’s (1953) estimate for the average age at death at Pecos Pueblo, New Mexico is 43 years, Brooks (1955) reports mean ages at death for native California Indians at less than 30 years, and Lovejoy and colleagues (1977) estimate the mean life expectancy at the Libben site to be 20 years, with only a few individuals living beyond 40. Howell (1982) argues that the young average age at death estimated by Lovejoy and colleagues (1977) is not realistic, because many children would be orphaned and few adults would attain a lifespan long enough to become grandparents. In addition, according to Masset (1989), it is a myth that cemeteries in antiquity lack older individuals. In instances where historical documents accompany archaeological cemetery samples, parish records always include older individuals (Masset 1989). Aykroyd and colleagues (1999) agree, stating that historical evidence does not support the conclusion that individuals did not live past 50 years of age.

One explanation for these discrepancies might be that a genuine difference exists between chronological age and skeletal age,¹ and these studies are using biological indicators to assess chronological age. Chronological age is the number of years lived from birth, and biological age is measured by the senescence of functions of the

1 See Chapter 3.


organism, which can be affected by lifestyle, environment, and genes (du Noüy 1937²; Laugier 1955; Acsádi & Nemeskéri 1970). The morphology of skeletal indicators is a manifestation of biological age, and it is important to acknowledge that biological and chronological ages do not necessarily correspond (Laugier 1955).

Another reason why historical records do not support the short lifespan suggested by skeletal analyses might be that older individuals are systematically under-aged (Mensforth & Lovejoy 1985) as a result of methodological biases and regression-based age estimation techniques employed by many anthropologists (Konigsberg & Frankenberg 1992; Skytthe & Boldsen 1993; Paine & Harpending 1998; Aykroyd et al. 1999). Alternatively, the age structure of the aging-standard reference population might be mirrored in the target population (Bocquet-Appel & Masset 1982). Such age mimicry³ might create the illusion that adult mortality rates are high and that the age at death estimates are low (Boldsen et al. 2002). Yet another possibility might be that the features used to estimate age vary in space and time as a result of genetic, socio-cultural, or ecological variation.

With respect to this last issue, American reference sample-based skeletal age-estimation standards assume that the underlying biological basis of the age-indicator relationship is constant across populations, but this assumption has little empirical basis. Although researchers routinely apply American osteological aging standards to archaeological samples in the United States and abroad, recent studies have shown that

2 Pierre Lecomte du Noüy, a French biologist and head of the division of biophysics at the Institut Pasteur in Paris from 1922 until 1947, made contributions of mathematics to modern problems of biology and introduced the concept of a “biological time interval” specific to living.
3 See Chapter 3 for a more thorough discussion of this problem, as well as a brief summary of the ensuing debate in paleodemography.


this approach leads to systematic error in age-at-death estimation (Todd 1921; Singer 1953; Brooks 1955; Gilbert & McKern 1973; Gilbert 1973; Pal & Tamankar 1983; Mensforth & Lovejoy 1985; İşcan et al. 1987; İşcan 1988; Katz & Suchey 1989; Brooks & Suchey 1990; Klepinger et al. 1992; Russell et al. 1993; Sinha & Gupta 1995; Galera et al. 1998; Nawrocki 1998; Baccino et al. 1999; Hoppa 2000; Oettle & Steyn 2000; Schmitt et al. 2002; Schmitt 2004; Kimmerle et al. 2008a; Sharma et al. 2008), often in the direction of an overestimation of age for younger individuals and an underestimation of age for older individuals (Murray & Murray 1991; Russell et al. 1993; Osborne et al. 2004; Djuric et al. 2007; Hartnett 2007; Martrille et al. 2007; Berg 2008).

The age-indicator relationship also no doubt varies as a function of biological factors like sex and ancestry. In addition, as a result of the tremendous socio-cultural change in our species over the past 10,000 years, it is likely that this age-indicator relationship has changed over time. Several studies have reported temporal changes in the osteological indicators used to assess age (Masset 1989; Bocquet-Appel & Masset 1995; Hoppa 2000), though support for such temporal change is not universal (Osborne et al. 2004). Spatially patterned socio-cultural and ecological variation may also contribute to substantial variation in this age-indicator relationship (Bocquet-Appel & Masset 1982; Angel et al. 1986; Murray & Murray 1991; Kemkes-Grottenthaler 1996; Wittwer-Backofen et al. 2008; Legoux 1966; Eveleth & Tanner 1976). These studies emphasize the limitations of using a spatially and temporally isolated American reference sample to estimate the ages of skeletal remains from different places and times.

The impact of such temporal and spatial variation on age estimation is the focus of this thesis. The specific goal is to determine whether older American skeletal series


progress through morphological age-related changes at a different rate than more recent ones, specifically for four commonly used osteological indicators of age: the pubic symphysis, auricular surface, sternal end of the fourth rib, and cranial sutures. This goal will be accomplished by testing multiple skeletal age estimation standards to address three specific questions: 1) is the observed morphology of the aging indicator associated with the same chronological ages for both older reference and more recent American skeletal populations; 2) is there a pattern that explains why some aging standards produce significant differences in the aging process of skeletal indicators between groups while others do not; and 3) which standard is the true gauge of whether a change in the indicator’s rate of aging has occurred if results from different age estimation standards for a single osteological indicator contrast?

Skeletal remains used in this study are drawn from four documented American skeletal collections and a modern autopsy sample of pubic symphyses and rib ends; these collections are an ideal data source for this research because, with some caveats, the remains have known sex, age,⁴ race,⁵ and date of birth and/or death information. American Blacks and Whites of both sexes were drawn from the Terry Anatomical, Hamann-Todd Osteological, Bass Donated, Maxwell Museum Documented, and the

4 A discussion of the issues regarding the reliability of the documented ages for the Terry and Hamann-Todd collections is presented in Chapter 3.
5 At this point, I must clarify the term “race” as it is used in this manuscript. The current view of many physical anthropologists, including myself, is that human variation is best described as global geographic variation in physical features that follow gradations or clines for a given trait. Race, as defined here by such terms as Black and White, represents a social construct, not a biological reality. These categories are often strongly correlated with social and environmental influences like class, diet, living conditions, health, activity level, and other factors. The terms Black and White were chosen to describe these social categories because this is the vocabulary used in each skeletal series’ database to describe individuals within the collection. It is understood that these terms may carry undertones that are traditionally associated with the popular conception of races as distinct biological groups, but this is not the context in which I am using them. It is also understood that the terms Black and White are not applied in the same manner through time or by different individuals. See additional discussion in Chapter 4.


Maricopa County Forensic Science Center autopsy collections. These collections comprise skeletal remains collected from autopsy, cadaver, unclaimed, and donated bodies. Analyses are limited to Blacks and Whites because these documented collections have relatively few individuals from other groups.

Data collection for this study includes demographic information consisting of age, sex, race, adult stature, year of birth, and year of death, as well as phase/stage or component data for four skeletal morphological indicators of age: the pubic symphysis, auricular surface, sternal end of the fourth rib, and cranial sutures. Established American osteological aging methods, specifically Todd (1920), Brooks and Suchey (1990), Hartnett and Fulginiti (2007), Lovejoy and colleagues (1985b), İşcan and colleagues (1984, 1985, 1986), Meindl and Lovejoy (1985), and Boldsen and colleagues (2002), standardize the scoring of the morphological data.

Intellectual merit

Estimating age is a critical part of the study of skeletons from archaeological and forensic contexts, but despite an abundance of research on the estimation of age from the adult skeleton, improvement is still needed, specifically regarding the disclosure of the accuracy and precision of estimates. Many current aging standards are based on older reference skeletal samples, and some authors have argued that these standards and reference collections are outdated due to secular changes in overall body size, health, activity, and nutritional status. Knowing whether an aging standard will work on target groups that differ in time, space, and background from the reference sample is essential for reliable, accurate age estimation.

This dissertation will address whether a difference in the skeletal aging process exists between older American documented collections and more recent ones. This issue

is particularly important for disciplines requiring accurate and reliable estimations of age based on skeletal remains, whether medicolegally significant or archaeological. It is important to note, however, that the implications of this research differ for bioarchaeological and forensic anthropological applications. Although these two fields both rely upon estimates of age, bioarchaeological studies operate at the population level, exploring the impact of foreign pathogens on aboriginal/native populations, the demography of past populations, and comparative research on human life span and life expectancy, among other endeavors. If an age estimation method is unbiased⁶ for that population, the impact of inaccuracy⁷ in age estimates is less of a concern than it would be for any one individual. In contrast, forensic examinations are concerned with the estimation of age on an individual level; these age estimates need to be both precise⁸ and accurate.⁹ These conditions must be satisfied to meet the Daubert criteria for admission as scientific evidence in court. In addition, recent critiques of the forensic sciences outlined by the National Academy of Sciences are driving changes in forensic anthropology, including the need to identify and report the accuracy and error rates in age estimation, as well as sources of potential bias and human error (Committee on Identifying the Needs of the Forensic Sciences Community, National Research Council 2009).

6 Bias = Σ(estimated age - chronological age) / number of individuals. Bias is directional, so the sign is important; bias is informative for the identification of systematic over- or under-estimation of age at death.
7 Inaccuracy = Σ|estimated age - chronological age| / number of individuals.
8 For age estimation, precision is defined as obtaining similar age estimates for a set of remains over many trials.
9 Here, accuracy is defined as closeness of the age estimate to the actual chronological age.
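
The bias and inaccuracy definitions in footnotes 6 and 7 can be illustrated with a short sketch. The ages below are invented purely for illustration and are not drawn from the dissertation's dataset; the function names `bias` and `inaccuracy` are likewise hypothetical.

```python
from typing import Sequence


def _check(estimated: Sequence[float], chronological: Sequence[float]) -> None:
    """Both sequences must be non-empty and of equal length."""
    if len(estimated) == 0 or len(estimated) != len(chronological):
        raise ValueError("need two non-empty sequences of equal length")


def bias(estimated: Sequence[float], chronological: Sequence[float]) -> float:
    """Mean signed error: sum(estimated - chronological) / n.

    Positive values indicate overestimation of age on average,
    negative values indicate underestimation.
    """
    _check(estimated, chronological)
    return sum(e - c for e, c in zip(estimated, chronological)) / len(estimated)


def inaccuracy(estimated: Sequence[float], chronological: Sequence[float]) -> float:
    """Mean absolute error: sum(|estimated - chronological|) / n."""
    _check(estimated, chronological)
    return sum(abs(e - c) for e, c in zip(estimated, chronological)) / len(estimated)


# Hypothetical example: two younger individuals over-aged by 5 years,
# two older individuals under-aged by 10 and 18 years.
estimated = [30.0, 45.0, 60.0, 62.0]
chronological = [25.0, 40.0, 70.0, 80.0]

print(bias(estimated, chronological))        # -4.5
print(inaccuracy(estimated, chronological))  # 9.5
```

In this invented sample the signed errors partly cancel, so the bias (-4.5 years) is smaller in magnitude than the inaccuracy (9.5 years). This is why a method can look nearly unbiased at the population level while still producing large errors for individuals.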


Chapter 2 Historical Setting

Ubelaker and Grant (1989) state that skeletal collections are essential for teaching anatomy and human variation, as well as learning about the medical and biological aspects of human history. The development of human skeletal collections in the United States is deeply rooted within the framework of the birth and progress of American physical anthropology, bioarchaeology, and forensic anthropology. With respect to these disciplines, prevailing cultural belief systems and physical anthropology’s “long peripheral relationship to medicine” (Stewart 1979, p. xi) strongly influenced their courses. A synopsis of the development of formal, well-documented human osteological collections in the United States follows.

Historical context of the development of human skeletal collections in the United States

According to Walker (2000), the practice of collecting human skeletal remains as war trophies and for religious purposes has deep roots. In the past, Native American and Melanesian groups took heads, scalps, and other body parts during warfare (Driver 1969; Olsen & Shipman 1994; Owsley et al. 1994; White & Toth 1991; Willey & Emerson 1993), and this practice has taken the form of trophy skull collection by soldiers in more recent societies (Sledzik & Ousley 1991; McCarthy 1994; Walker 2000). Prior to the collection of human skeletal remains for scientific research, skulls and bones were often placed on display in cabinets of curiosities and at local historical societies (Quigley 2001). Oddities, pathological bones, and other skeletal remains continue to draw the


attention of the public, as evidenced by the success in the 20th Century of Robert Ripley’s Believe It or Not!, which displays trophy and trepanned skulls. In the United States, historical associations such as the Library Company of Philadelphia (1731), established by Benjamin Franklin (1706-1790) and others, began to maintain collections that included anatomical specimens; a contemporary collection of anatomical models and human skeletons was also established in Philadelphia at the Pennsylvania Hospital (Orosz 1990; Walker 2000).

The German anatomist Johann Friedrich Blumenbach (1752-1840) is considered to be the founder of modern physical anthropology (Jarcho 1966; Cook 2006). Blumenbach was interested in the typological characterization of American Indians, and he tested models of human variation using craniology, or observations of crania (Cook 2006). Based on the crania he had collected and studied, Blumenbach wrote De generis humani varietate nativa (On the Natural Varieties of Mankind), which was first published in 1775. This volume presented a five-race schema: Caucasian, Mongolian, Ethiopian, American, and Malayan. He believed that these groups were different due to degeneration, a concept positing that a single original type accommodated to varying local conditions (Cook 2006). Blumenbach's work also included his description of over sixty human crania, published in Decas prima collectionis sua craniorum diversarum gentium illustrata (1790) and Nova penta collectionis sua craniorum diversarum gentium (1828). Several of the cranial descriptions presented were those of American Indian crania; as a result of his studies, Blumenbach recognized the Asian affinities of Eskimo and Aleut groups (Harper & Laughlin 1982).


In the United States, Thomas Jefferson (1743-1826) conducted the first systematic study and collection of human remains. Jefferson examined ancient American Indian skeletal remains excavated from burial mounds on his property in Virginia (Jefferson 1853); prior to this, skeletal remains were collected without documentation as to provenience, and they were kept as curiosities (Quigley 2001). In contrast, Jefferson’s work has been commended for its scientific methodology, as well as its consideration of the archaeological context of the remains in Jefferson’s interpretations (Lehmann-Hartleben 1943; Willey & Sabloff 1980; Buikstra 2006). Population numbers and age distributions were also important in Jefferson’s investigations (Buikstra 2006).

Scientists in the 19th Century also showed an interest in ancient American Indian crania (Buikstra 2006). During this time, the study of human remains was being integrated into the investigations of living populations. Much of the early work on skeletal remains in the United States focused on the diversity and origins of indigenous groups of the Americas. As in Europe, there was a fixation on cranial variation, and this affected what skeletal elements researchers collected from the field; as a result, postcranial remains were ignored for some time.

John Collins Warren (1778-1856), a surgeon and anatomist in Boston, was inspired by Blumenbach’s works (Hrdlička 1918; Jarcho 1966). Warren had an interest in comparative anatomy of the brain and its relation to race, which led to the collection of crania and his focus on craniology (Warren 1822; Hrdlička 1918; Jarcho 1966). In 1822, Warren published his book Natural History of the Nervous System, which included an appendix on the crania of American Indians (Warren 1822). In this text, he implied that Old World immigrants were the builders of sophisticated structures in the New


World, not American Indians (Warren 1822; Cook 2006). Later, he published his conclusion that occipital flattening was the result of cultural modification (Warren 1837). In 1847, he established the Warren Anatomical Museum at the Harvard Medical School (Jarcho 1966); the museum included crania Warren had personally collected, as well as crania collected by the Boston Phrenological Society between 1832 and 1842 (Bowles 1976; Cook 2006).

Samuel Morton (1799-1851), a physician and anatomy professor in Philadelphia, also sought to examine and classify the variation present among populations through craniometry (Quigley 2001). Morton was strongly influenced by his education at the University of Edinburgh, where he was immersed in the theories in vogue at the time: polygenism and the hereditarian views of phrenologists (Spencer 1983; Walker 2000; Quigley 2001). As a result, Morton began collecting human skeletal remains, which consisted solely of crania (Buikstra & Gordon 1981). Morton’s work followed Blumenbach’s craniological tradition but took on an impartial view of phrenology (Morton 1839; Hrdlička 1918; Cook 2006; Buikstra 2009). As Buikstra (2009) underscored, phrenologists collected skulls with the goal of investigating character, while Morton focused on context and culture: a decidedly ethnographic goal.

To test the theory of a hierarchical ranking of humans, Morton began collecting skulls from all over the world in the 1820s. During the course of his work, Morton amassed a large collection of approximately 900 human skulls from archaeological contexts (Morton 1849; Hrdlička 1914; Thomas 2000; Buikstra 2006), which is now curated at the University of Pennsylvania. However, unlike Thomas Jefferson, Morton did not have provenience data for the remains; the crania he studied lacked spatial and temporal


contextual information, which undermined his ability to support his conclusions (Buikstra 2006). Other limitations of Morton’s collection included incomplete sampling at archaeological sites and acquisition bias, as only adult crania were collected (Wilson 1876; Buikstra 2006). Morton’s research on crania of aboriginal groups in the Americas culminated in the publication of Crania Americana in 1839 (Morton 1839). Crania Americana included detailed illustrations of crania, defined cranial measurements, presented Morton’s descriptions of Indian skulls, and discussed ethnohistoric, archaeological, and cultural contexts (Morton 1839; Buikstra 2009). The work also outlined his conclusions that all American Indians were descended from a common stock and that the Moundbuilders were indeed Indian, not European; this last point was a response directed at Warren’s claims. Although Morton’s work has been criticized as racist (Gould 1978a, 1978b, 1996), it actually presented data supporting the capability of Native Americans to build complex structures that were at the time assumed to be the work of Old World groups (Morton 1839; Stanton 1960; Silverberg 1968; Buikstra 1979; Buikstra 2006). Additionally, Buikstra (2009) has stressed that Morton’s classification of humans was based on both ethnographic data and the measurement of crania, a combination that was novel at the time. In fact, his research was so influential that Hrdlička (1918) stated physical anthropology in the United States began with Morton, and Wissler (1942-1943) named him the father of American physical anthropology (Jarcho 1966); however, most physical anthropologists today reserve that distinction for Aleš Hrdlička. The collection of human crania for the pursuit of exploring the origins of extant populations was not limited to North America. Paul Broca (1824-1880), the French


founder of physical anthropology, built upon Morton’s concept of anthropology and professionalized the discipline (Cook 2006). Broca, like other scientists of the time, had an interest in exploring the origins of races and collected skulls from world travelers to pursue it. Broca’s work advanced craniometry, the science of cranial anthropometry, by developing new measurement tools and indices.

During the mid-19th Century, the social, cultural, and political climate changed drastically in the United States. By the start of the third decade, the Industrial Revolution had begun, bringing with it an expedited exploration and settling of the American West, the development of railroads, the expansion of mining, the rise of philanthropy, the increased governmental support of science and exploration, and the establishment of new museums (Jarcho 1966). These changes had an enormous impact on the collection and curation of human skeletal remains, which fostered the development of physical anthropology in the United States.

Walker (2000) notes that, at this time, large public natural history museums were established with the dual goals of popular education and scholarly research (Orosz 1990). Recognizing that skeletal collections were a valuable resource for providing information about the past, these newly founded museums provided an institutional framework within which large skeletal collections could be consolidated from smaller private collections. These museums had enough resources to support a staff of research scientists, purchase skeletal remains from private collectors, and sponsor archaeological expeditions (Buikstra 2006). From the perspective of collections of human skeletal remains, many important natural history museums were established in the United States during this time, including


the Smithsonian Institution (1846), the Army Medical Museum (1862: now the National Museum of Health and Medicine), the Harvard Peabody Museum of Archaeology and Ethnology (1866), the American Museum of Natural History (1869), the Columbian Museum of Chicago (1893: now the Chicago Field Museum), the Lowie Museum of Anthropology (1901: now the Phoebe Hearst Museum), and the San Diego Museum of Man (1915) (Walker 2000). The Smithsonian Institution’s National Museum of Natural History, established in 1846, became the storehouse of skeletal remains from most federally funded excavations and eventually acquired large collections, like the Huntington and Terry collections, from other institutions when those institutions ceased to curate the remains (Quigley 2001). The Smithsonian Institution’s history is intertwined with other museums and numerous researchers in the 19th and 20th Centuries; as a result, its development is traced throughout the remainder of this section of the dissertation. The 1860s alone saw the establishment of three museums associated with the collection, study, and curation of human skeletal remains in the United States: the Army Medical Museum, the Harvard Peabody Museum of Archaeology and Ethnology, and the American Museum of Natural History. Surgeon General William Hammond founded the Army Medical Museum in 1862, with J.S. Billings serving as its first curator (Quigley 2001; Buikstra 2006). The museum began as a repository for thousands of medical records and skeletal and soft tissue remains obtained during the treatment and autopsy of military casualties during the American Civil War (Barnes et al. 1870; Otis & Woodward 1865; Walker 2000; Buikstra 2006). At the close of the Civil War, army doctors shifted their collecting activities toward American Indian crania, as


requested by George Otis (Bill 1862; Parker 1883; Wilson 1901; Lamb 1915) to further anthropological research (Henry 1964). As a result, over 2200 American Indian crania were collected. In 1869, as per an agreement between Otis and Joseph Henry of the Smithsonian, the Army Medical Museum transferred its ethnological and archaeological holdings to the Smithsonian, and in return received human skeletal remains possessed by the Smithsonian’s Division of Mammals (Hrdlička 1914; Henry 1964; Walker 2000; Buikstra 2006). Later, nearly all of the human skeletal remains curated by the Army Medical Museum would be transferred to the Smithsonian Institution’s new Division of Physical Anthropology. Between 1898 and 1904, about 3500 skeletal remains were transferred; the Army Medical Museum retained only those remains of pathological significance (Lamb 1915; Sledzick & Barbian 2001). The Harvard Peabody Museum of Archaeology and Ethnology was established in 1866 (Quigley 2001); Jeffries Wyman was the museum’s first curator. Wyman was a physician and anatomist (Jarcho 1966) who reported on disease observed in bones exhumed from mounds and caves in the southeastern United States (Wyman 1871). As curator, Wyman implemented a protocol for the systematic curation of human skeletal remains, which included important provenience data and the temporal origin of the bones (Wyman 1868; Buikstra 2006). Wyman studied postcranial remains (El-Najjar & McWilliams 1978), initiating a paradigm shift in physical anthropology, such that the emphasis of study extended beyond the cranium (Wyman 1869). In addition, he questioned race-based craniology, another prevailing thought of the time (Wyman 1871). Frederick Ward Putnam (1839-1915) was Wyman’s successor (Jarcho 1966), and the


founder of the department of physical anthropology at Harvard University. Collections at the Peabody Museum grew quickly due to acquisition of donations from many smaller museums, local societies, and medical schools, as well as the Smithsonian’s holdings of American Indian remains from burial mounds in Florida, Kansas, and Ohio (Jarcho 1966). In the 1920s, the Peabody acquired over 2000 remains from Pecos Pueblo excavated by Alfred Kidder. The museum also curates over 3100 remains from northern Mexico and another 525 from Egypt. Dr. Albert Bickmore founded the American Museum of Natural History in New York in 1869 (Orosz 1990; Quigley 2001). The museum includes collections of human skeletons from around the globe, including the southwestern United States. Large collections curated there include a morphology collection from dissected cadavers, ancient western and eastern Inuits, and the von Luschan Collection, which is comprised of 277 black African skulls (Krogman & Iscan 1986; Schwartz 1998). Both the Army Medical Museum and the Harvard Peabody Museum of Archaeology and Ethnology held human remains from the famed Hemenway Southwestern Archaeological Expedition (1887-1888). Frank Hamilton Cushing (1857-1900), an ethnologist at the Smithsonian Institution, spent years living among the Zuni of New Mexico (Cushing 1890; Hinsley & Wilcox 1996, 2002). His experience there sparked his interest in Zuni ancestors, leading him to propose an archaeological investigation. The Hemenway Expedition was named after Mary Hemenway, the sponsor of Cushing’s excavations at Los Muertos in Arizona (Cushing 1890; Matthews et al. 1893; Hinsley & Wilcox 1996, 2002; Merbs 2002). During his investigation, Cushing applied his ethnographic knowledge to interpret the past (Cushing 1890; Hinsley &


Wilcox 1996, 2002). Washington Matthews (1843-1905), the surgeon heading the United States Army Medical Museum in Washington, DC, was sent to visit the excavations in August of 1887 due to Cushing’s poor health (Matthews 1900; Haury 1945; Buikstra 2006). In November of the same year, Dr. Herman ten Kate, hired by Cushing earlier that spring, arrived to oversee the excavations, preserve the skeletal material, and analyze the human remains (ten Kate 1892; ten Kate & Hovens 1995). As a result of his month-long visit, Matthews sent his colleague, anatomist Dr. Jacob Wortman, to assist ten Kate with the conservation of the friable skeletal remains (ten Kate & Hovens 1995; Buikstra 2006). Approximately 5000 skeletal remains were recovered from Los Muertos throughout the course of the expedition (Matthews et al. 1893; Merbs 2002; Buikstra 2006). Wortman returned to the Army Medical Museum with the Hemenway Expedition skeletons in 1888 (Matthews et al. 1893; Lamb 1915; Merbs 2002), and the final report, published in 1893, included a discussion of age using a scheme following Broca’s six-stage periods of life (Broca 1875; Matthews et al. 1893; NAS 1893). The cremains recovered during the Hemenway Expedition excavations, however, were originally sent to the Peabody Museum of Salem, before being transferred to the Peabody Museum of Harvard (Haury 1945). While at Harvard University, Emil Haury (1904-1992) studied the cremains for his dissertation research (Haury 1945); Earnest Hooton encouraged Haury’s research on this topic, because he saw the potential value in analyzing burned and fragmentary skeletal remains (Haury 1945). Another medical doctor, Joseph Jones (1833-1896), collected and studied ancient skeletal remains in the United States during the latter half of the 19th Century. Jones excavated remains from stone box graves, earthworks, and mounds from Tennessee and


Kentucky, and he played a pivotal role in how syphilis was diagnosed from skeletal remains (Jones 1869, 1876); according to Jarcho (1966), Jones was the first to study pre-Columbian skeletal remains from the viewpoint of disease. Jones’ continued interest in archaeology led him to investigate shell mounds of the Deep South (Jones 1878). His work emphasized the importance of the archaeological context and temporal place of skeletal remains, allowing for conclusions to be made about the presence of non-venereal syphilis in the pre-Columbian New World (Jones 1878). A decade after his death, Jones’ collection of skulls and artifacts was bought by the Heye Foundation/Museum of the American Indian, but has since been deaccessioned and dispersed (Williams 1932; Buikstra 2006). In addition to the continued excavation and collection of human skeletal remains from archaeological contexts, the second half of the 19th Century saw the first systematic collection of skeletons from medical school cadavers. From 1886 to 1924, George Huntington (1861-1927) was an anatomy professor at the College of Physicians and Surgeons in New York (Quigley 2001). In 1893, Huntington began preserving skeletons from cadavers dissected at the medical school rather than disposing of them (Hrdlička 1937). The cadavers were primarily European immigrants and residents of New York, whose bodies were unclaimed and became the property of the state; cadavers were also acquired from sanitariums, poorhouses, and morgues (Hrdlička 1937; Quigley 2001). This collection, which contains over 3800 skeletons and includes examples of trauma and pathology, is now housed at the Smithsonian Institution. Huntington’s work strongly influenced Robert Terry, T. Wingate Todd, and Aleš Hrdlička.


The late 19th Century also gave rise to pioneers in forensic anthropology (Stewart 1979; Klepinger 2006). Thomas Dwight (1843-1911) is considered to be the father of American forensic anthropology (Stewart 1979). Dwight, who was primarily concerned with human variability, was the first American to make major contributions to the field, participating in an unknown number of forensic cases (Warren 1911; Stewart 1979). The history of forensic anthropology was marked by Dwight’s 1878 medicolegal essay; while other contemporary anatomists were studying human skeletons, only Dwight did so with the intention of applying that knowledge to forensic matters (Stewart 1979). Dwight held the Parkman Professorship of Anatomy at Harvard, succeeding Oliver Wendell Holmes (1809-1894) (Warren 1911; Stewart 1979; Klepinger 2006) who had, along with Jeffries Wyman, presented skeletal evidence during their testimony at John Webster’s 1850 trial for murdering Harvard benefactor George Parkman. Another pioneer of American forensic anthropology was George Dorsey (1869-1931); Dorsey studied anthropology at Harvard and was influenced by the teachings of Dwight (Stewart 1979). Dorsey succeeded William Holmes as the curator of the Field Museum in Chicago (Cole 1931; Stewart 1979). At the trial of Luetgert, the German immigrant who was accused of killing his wife and disposing of her remains in a vat, Dorsey testified as the prosecution’s star witness, identifying the bones recovered as those of a human female (Giles & Klepinger 1999; Loerzel 2003; Klepinger 2006). Because he was the first anthropologist to testify in an American criminal trial, Klepinger (2006) argues that Dorsey was the first “full-fledged” forensic anthropologist. After the Luetgert murder trial, Dorsey gave a lecture on “The skeleton in medico-legal anatomy” to the


Medico-Legal Society of Chicago (Wigmore 1898; Dorsey 1899), further establishing the discipline. At the turn of the 20th Century, American scientists continued to be fueled by an interest in the origins of aboriginal Americans and human skeletal variation. Franz Boas (1858-1942) was a founder of American anthropology, and his research in physical anthropology focused on anthropometrics, osteometrics, race, racial origins, race equality, environmental influences and plasticity of the body, human growth, and the development of children (Rankin-Hill & Blakey 1994; Walker 2000; Little & Sussman 2010). Boas collected anthropometric data and skeletal remains from American Indians between 1888 and 1903 (Jantz et al. 1992; Quigley 2001), and amassed approximately 200 crania and 100 skeletons that were curated at the American Museum of Natural History in New York until he moved to Columbia University in 1899. The collection was transferred to the university where these and other skeletal remains were used for teaching and research. Boas used the collection to stress the biological equality of races by showing the cranial index varied widely within groups (Gould 1996; Thomas 2000; Quigley 2001). A strong challenger of hereditarian explanations of human variation, Boas illustrated the plasticity of the human cranium in response to environmental change (Boas 1912). Aleš Hrdlička (1869-1943) immigrated from Humpolec, Bohemia, to the United States with his father shortly after finishing high school at the age of 12 (Schultz 1945). Hrdlička eventually earned two medical degrees, the first from the Eclectic Medical College of the City of New York and the second from the New York Homeopathic Medical College. He then accepted an internship at the State Homeopathic Hospital for


the Insane in Middletown, New York, where he began his research in anthropometry (Hrdlička 1895; Schultz 1945). He then became an Associate in Anthropology at the Pathological Institute of the New York State Hospitals, a position that permitted him to travel to Europe where he studied anthropology, physiology, and medico-legal topics under Manouvrier, Bouchard, and Brouardel, respectively (Schultz 1945). Hrdlička’s early work focused on “abnormal” individuals from inmates of state institutions and hospitals for the mentally ill; through this experience, he recognized the lack of adequate comparable data on “normal” persons and the need for that information (Schultz 1945). He intensively studied Huntington’s collection before leaving the Pathological Institute to join expeditions studying medical and physical anthropology sponsored by the American Museum of Natural History. In 1897, the United States National Museum at the Smithsonian Institution underwent a reorganization process, and archaeologist William Henry Holmes became head curator of the Department of Anthropology; Holmes created a Division of Physical Anthropology in 1903, which was headed by Hrdlička (Schultz 1945; Jarcho 1966; Quigley 2001; Buikstra 2006). According to Armelagos and van Gerven (2003), a lack of funding prevented Hrdlička from attaining his primary goal, which was to establish an institute of biological anthropology similar to Broca’s in France. As a result, Hrdlička was motivated to establish the Smithsonian's National Museum of Natural History as a major research institution (Armelagos & van Gerven 2003). During his appointment at the Smithsonian Institution, Hrdlička’s research and goals focused on the determination of the range of normal variation for humans, the collection and


preservation of skeletal remains, and the investigation of the origins of ancient American Indians in the New World (Stewart 1940; Schultz 1945; Buikstra 2006). Hrdlička brought Huntington’s collection to the Smithsonian Institution; this collection was the source of Hrdlička’s inspiration for his earliest anthropological investigations (Jones-Kern 1997). Approximately 1200 of these skeletons arrived at the Division of Physical Anthropology, sorted by skeletal element, not by individual; Hrdlička had to first organize the remains, then catalogue them. This experience influenced Hrdlička, who advocated better accession procedures for record keeping (Jones-Kern 1997). Hrdlička was also instrumental in the recovery of archaeological remains and the creation of osteological research collections. In 1910, Hrdlička collected over 3000 skulls of Peruvian Indians, and in 1926, he began his expeditions to Alaska, which resulted in the collection of a large number of Aleutian, Indian, and Eskimo skeletal remains. Hrdlička also collected anthropometric data and skeletal remains from the North American desert west (Hrdlička 1908b, 1909d, 1935b; Rakita 2006). In his studies of early New World sites inhabited by American Indians, he carefully documented skeletal remains in situ, using stratigraphical evidence to support temporal antiquity (Hrdlička 1907b; Buikstra 2006). Hrdlička traveled the world in search of skulls (Quigley 2001), which he measured and cataloged; in addition, he procured skeletal remains from other travelers (Hrdlička 1904). Unfortunately, his goal was to amass a very large collection, rather than one composed of fewer remains with contextual information. During his forty-year tenure as curator, Hrdlička built up the collection from 3,000 skeletal remains acquired from the Army Medical Museum to over 15,000 skulls (Lamb 1915; Schultz 1945); in


addition he recovered skeletal remains that now form part of the collections at the San Diego Museum of Man (Quigley 2001). The founding of the San Diego Museum of Man was the result of the Panama-California Exposition in 1915, which featured the story of man through the ages. Hrdlička coordinated expeditions to Africa, Alaska, Philippines, Siberia, and personally went to Mongolia and Peru to gather remains for the Exposition (Merbs 1980). Today the museum has grown to include southern California archaeological material (La Jolla), the Stanford-Meyer osteopathology collection, and the Hrdlička paleopathology collection (Quigley 2001). Hrdlička is considered to be the father of American physical anthropology (Krogman 1976; Quigley 2001). In 1918, he founded and became the editor of the American Journal of Physical Anthropology; twelve years later, he founded the American Association of Physical Anthropologists. His research marked the transition from the old dogma of single-specimen description toward a study of entire societies and samples (Jarcho 1966). Other significant contributions included the recognition of the importance of all bones, ages, races, and sexes to improving the science of physical anthropology (Hrdlička 1904); his focus was on the description of normal variation, rather than explaining how or why (Schultz 1945). However, the field of physical anthropology benefited significantly from his prolific research and publications. In addition to the origins of American Indians (Hrdlička 1907b, 1912a, 1912b, 1917, 1926, 1931b, 1941), Hrdlička also published on collections management (Hrdlička 1900), excavated skeletal remains (Hrdlička 1908c, 1909a, 1909b, 1909c, 1910, 1912c, 1913), standards for data collection (Hrdlička 1904, 1907a, 1920, 1939), variation in the skeleton and dentition


(Hrdlička 1908d, 1934, 1935a), physical anthropology (Hrdlička 1908a, 1914, 1918), and catalogues of NMNH collections (Hrdlička 1924, 1927, 1928, 1931a, 1940, 1942, 1944). Like Hrdlička, Earnest Hooton (1887-1954) was influenced by the mainstream evolutionary approach, which emphasized a typological, craniometrically oriented framework that focused on biological determinism and taxonomic description rather than functional interpretation (Walker 2000; Quigley 2001). However, Hooton was a proponent of integrating physical anthropology, archaeology, and ethnology (Hooton 1935). He was classically trained, not a doctor of medicine, and marked the first generation of anthropologically trained biological anthropologists in the United States (Quigley 2001; Cook 2006). In 1913, Hooton founded the Harvard program in anthropology, which was the first major physical anthropology training program in the United States. Hooton also participated in the recovery of archaeological remains, including those at Alfred Kidder’s Pecos Pueblo site (Kidder 1924; Hooton 1925), and later analyzed the Pecos skeletal remains, classifying the crania into one of eight morphological types (Hooton 1930; Thomas 2000). Hooton was asked by law enforcement agencies to identify skeletal remains, but he was not a significant contributor to the development of forensic anthropology (Stewart 1979); in fact, Hooton (1943) did not believe that physical anthropology had contributed much to the improvement of methods of individual identification. Hooton was instrumental in establishing academic anthropology at Harvard University; as a result, he mentored dozens of prominent mid-20th Century physical anthropologists, including Harry Shapiro, J. N. Spuhler, J. Lawrence Angel, Alice Brues, Sherwood Washburn, C. Wesley Dupertuis, Stanley Garn, W.W. Howells, and Frederick


Hulse, among others (Giles 1977; Shapiro 1981; Garn & Giles 1995; Ubelaker 2006a). Hooton’s influence in this realm was so extensive that, for decades, his academic progeny staffed most of the programs in physical anthropology at other American universities (Thomas 2000); he also trained seven of the eight presidents of the American Association of Physical Anthropologists serving from 1961-77 (Armelagos & van Gerven 2003). In

contrast, Hrdlička only formally trained one student, T. Dale Stewart; this was primarily because he was not affiliated with a university (Stewart 1979); instead, Hrdlička’s focus was on building the Division of Physical Anthropology in the National Museum of Natural History. As discussed previously, Hrdlička had a profound influence on physical anthropology in the United States, both promoting and professionalizing the discipline. Boas, Hrdlička, and Hooton were all instrumental in the founding of American physical anthropology. At the same time other scientists, medical doctors, and anatomists were also contributing to the discipline, specifically through the collection of large numbers of entire human skeletons. Following in the tradition of Huntington’s collection of skeletons, Carl August Hamann, Thomas Wingate Todd, and Robert James Terry amassed large collections during the early 20th Century from anatomy school dissecting rooms and institutional morgues (Thompson 1982). Carl August Hamann (1868-1930) was a surgeon in Philadelphia prior to becoming a professor of anatomy in 1893 at Western Reserve University in Cleveland, Ohio (Quigley 2001). Upon his arrival, Hamann began collecting remains in order to assemble an anatomical teaching museum. He collected both human and non-human skeletal remains. By 1912, when he was appointed the dean of the medical school, Hamann had amassed over 100 human skeletons originating from dissected medical


school cadavers (Thompson 1982; Moore-Jansen 1989; Quigley 2001; Kern 2006). Thomas Wingate Todd (1885-1938) left his position at the anatomical department at the University of Manchester, succeeded Hamann, and assumed responsibility for the collection (Thompson 1982; Moore-Jansen 1989). Todd continued to collect human skeletons, cataloguing the unclaimed bodies obtained from the Cuyahoga County Morgue or city hospitals in Cleveland, Ohio (Lovejoy et al. 1985a; Quigley 2001; Kern 2006). The cadavers were measured, weighed, and photographed; after being dissected, the cadavers were macerated for inclusion into the skeletal collection (Quigley 2001). The collection grew significantly during Todd’s tenure at Western Reserve Medical School (Jones-Kern 1997); after his death in 1938, the collection eventually fell into disarray. In 1951, the medical school transferred the collection to the Cleveland Museum of Natural History, where it remains on permanent loan. At the start of the 20th Century, Robert James Terry (1871-1966), a student of the leading British anatomist Sir William Turner and American anatomist George Huntington, also began a skeletal collection derived from cadavers. Turner and Huntington had both amassed skeletal collections: Turner’s from churchyard exhumations and Huntington’s from medical school cadavers (Quigley 2001). It is apparent that both men had influenced Terry academically. Terry, a professor of anatomy and head of the Anatomy Department at Washington University Medical School in St. Louis, Missouri (Thompson 1982; Moore-Jansen 1989; Quigley 2001; Hunt 2009), had research interests that centered on normal and pathological variants of the human skeleton (Quigley 2001). To facilitate his research, he had begun collecting skeletons from cadavers used in the medical school’s gross anatomy classes in 1910 (Tobias 1991);


he instituted a uniform protocol for the collection, cataloging, maceration, and curation of remains. After Terry retired in 1941, Mildred Trotter (1899-1991) assumed the responsibilities for maintaining and adding to the collection (Cobb 1952); five years later, she became the first woman to hold a full professorship at the Washington University School of Medicine (Missouri Women in the Health Sciences 2004-2009). While managing

the Terry Collection, Trotter sought to counteract the existing biases inherent in the collection by making a concerted effort to collect young, White females; this action added hundreds of remains to the collection (Quigley 2001; Hunt 2009). Prior to her retirement in 1967, Trotter transferred the skeletons to T. Dale Stewart at the Smithsonian Institution’s Department of Anthropology, preserving the remains and their documentation for future researchers (Quigley 2001; Hunt 2009). The tradition of amassing large skeletal collections from archaeological excavations and medical school cadavers continued in the 1930s and beyond with archaeological projects sponsored by the Works Progress Administration and Civilian Conservation Corps, and W. Montague Cobb (Jacobi 2002; Milner & Jacobi 2006; Ubelaker 2006a). Roosevelt’s New Deal projects resulted in the excavation of significant archaeological sites, such as Indian Knoll. W. Montague Cobb (1904-1990), a medical doctor, was in graduate school when the prevailing view of the inequality of races was supported by craniology (Rankin-Hill & Blakey 1994; Quigley 2001; Lear 2002). He assisted Todd in acquiring skeletal remains in Cleveland and completed his dissertation on a survey of available anthropological materials, as well as methods of documentation, processing, and preservation of skeletons (Rankin-Hill & Blakey 1994). Cobb (1936)


noticed that very little of the skeletal material in American museums was from Black individuals; he was inspired to change this by collecting Black American skeletal remains. In 1932, he moved to Howard University in Washington, DC, where he established its laboratory of anatomy and physical anthropology, collecting skeletons from dissection room cadavers. Like others of his time, Cobb retained anatomical, demographic, and medical records for each individual (Cobb 1936); the collection contains over 700 individuals from Washington, who died between 1932 and 1969 (Rankin-Hill & Blakey 1994). Over the years, Cobb continued to study the Hamann-Todd remains and also examined Terry’s collection of skulls in St. Louis. As mentioned previously in this chapter, the anatomical sciences served as the foundation of forensic anthropology (Grisbaum & Ubelaker 2001). Before 1939, anatomy departments contributed data on human skeletal variation using collections of cadavers of known age, ancestry, and sex. Law enforcement agencies consulted both anthropologists and anatomists in academic and museum settings regarding human skeletal remains; at varying times, those consulted included Thomas Dwight and Earnest Hooton of Harvard University, H.H. Wilder (1864-1928) of Smith College in Massachusetts, George Dorsey (1869-1931) of the Field Museum in Chicago, and Aleš Hrdlička of the Smithsonian Institution (Stewart 1979; Grisbaum & Ubelaker 2001). The modern period in American forensic anthropology began in 1939 with Wilton Marion Krogman’s (1903-1987) Guide to the Identification of Human Skeletal Material for the Federal Bureau of Investigation (Stewart 1979; Haviland 1994; Klepinger 2006). This text brought the skills of anthropologists to the attention of law enforcement and was the earliest work of its kind: written by an anthropologist who applied anthropological


methods to the identification of individuals and published in a forensic-themed periodical (Stewart 1979; Klepinger 2006). Krogman conducted his graduate studies at the University of Chicago, during which Todd invited him to take a fellowship at Western Reserve University (Haviland 1994). Two years after completing his dissertation, Krogman was hired as an associate professor of anatomy and physical anthropology at Western Reserve University, where his interactions with Todd and access to the Hamann-Todd skeletal collection helped him to learn cranial and skeletal variation with respect to age, sex, and ancestry. Krogman’s understanding of variation as it related to individual identification culminated in his 1939 Guide and his 1962 text The Human Skeleton in Forensic Medicine (Krogman 1939, 1962). In 1938, he returned to the University of Chicago as an associate professor of both anatomy and physical anthropology, and nearly a decade later, Krogman took a professorship of physical anthropology at the University of Pennsylvania; he had a profound and lasting influence on his students and physical anthropology. T. Dale Stewart (1901-1997) began working as an assistant to Hrdlička at the Smithsonian's Division of Physical Anthropology in 1924 (Ubelaker 2000, 2006a, 2006b). After earning his medical degree, he returned to the Smithsonian as an assistant curator to Hrdlička; after Hrdlička’s death in 1943, Stewart became curator and remained at the Smithsonian for the rest of his career, eventually becoming the museum director in 1962 (Ubelaker 2006b). Stewart’s work ethic, problem-oriented research approach, and

extensive publication history were likely influenced by his mentor (Ubelaker 2000b). Shortly after becoming the Curator of Physical Anthropology in the National Museum of Natural History in 1942, Stewart began conducting forensic work at the Federal Bureau of Investigation’s request. Although he was a medical doctor, Stewart had a strong

interest in skeletal variation and anthropology (Stewart 1930, 1931b); he followed in his mentor’s footsteps, pursuing interests in anthropometry, early man, and forensic anthropology (Ubelaker 2006a). Stewart played an essential role in the identification of American soldiers from the Korean War, which is described in more detail below. After Stewart retired, J. Lawrence Angel (1915-1986) took over as curator, freeing Stewart from his administrative responsibilities. Undisturbed, Stewart continued to conduct research and publish for two more decades, issuing his fundamental text Essentials of Forensic Anthropology in 1979 (Stewart 1979; Ubelaker 2006a, 2006b). The Second World War significantly influenced the discipline of forensic anthropology, and Krogman’s Guide (1939) was used extensively to assist in the United States Army’s task of identifying the war dead. For those military personnel who died in the European theater, European anthropologists actually conducted the identification work in consultation with Harry Shapiro, the Curator of the American Museum of Natural History in New York (Simonin 1948 cited in Stewart 1979; Snow 1948; Vandervael 1952 and 1953 cited in Stewart 1979; Klepinger 2006). In contrast, American anthropologists selected by Francis Randall of the Anthropology Unit, Research and Development Branch, Office of the Quartermaster General were called upon to identify those involved in the Pacific theater (Stewart 1979); this task resulted in the temporary establishment of the Central Identification Laboratory (CIL) in Hawaii in 1947 (Klepinger 2006). Charles Snow, professor of anthropology at the University of Kentucky, was the first physical anthropologist to serve there and the first director of the lab (Bass 1968; Stewart 1979; Klepinger 2006). After Snow returned to the mainland in 1948, Mildred Trotter, an anatomist at Washington University, replaced him at the CIL.


There she identified American war dead and eventually collected the long bone measurements on which her and Gleser’s stature formulae are based (Trotter & Gleser 1952; Stewart 1979). Her work at the CIL facilitated future research on military skeletal remains from Korea. In 1976, the United States Army Central Identification Laboratory was permanently established in Honolulu; the mission continues today as the Joint POW/MIA Accounting Command (JPAC), after merging with the Joint Task Force-Full Accounting (Klepinger 2006). Like World War II, the Korean War necessitated large-scale individual identification. It was at this point that Stewart (1953) indicated the need to improve adult age estimation; Stewart recognized that existing standards were based on predominantly lower income individuals from city morgues; in addition, he recognized the problems with the accuracy of the documented chronological ages in the Hamann-Todd collection (see section on the strengths and weaknesses of pubic symphyseal aging methods in Chapter 3) and the unhealthy lifestyles of many of the individuals (Stewart 1953). Stewart (1953) pushed for new age-estimation methods based on healthy Americans, and he found that deceased American military personnel from the Korean War, who were killed traumatically, would be a sufficient sample. Stewart temporarily transferred from the Smithsonian Institution to an identification laboratory in Japan to conduct research on age estimation with the help of Ellis Kerley and others (Stewart 1979); upon Stewart’s return, Thomas McKern (1920-1974) went to Washington, DC, in 1955 to work with him on his collected data. Results of their collaboration included a better understanding of age changes in young males and a new method for aging the pubic symphysis (McKern & Stewart 1957; Stewart 1979; Klepinger 2006).


By the 1960s several anthropologists from the Smithsonian Institution and Federal Aviation Administration had worked on forensic investigations, and academic researchers were also contributing to anthropology and forensic obligations (Klepinger 2006). As a result, Ellis Kerley (1924-1998) found critical mass for the establishment of a separate Physical Anthropology section of the American Academy of Forensic Sciences in 1972 (Stewart 1979; Snow 1982; Klepinger 2006). This action also prepared for the establishment of the American Board of Forensic Anthropology, which was incorporated in 1977 as an organization to provide a program of examination and certification in forensic anthropology (Stewart 1979; Klepinger 2006). Kerley began his anthropological career by studying teeth from Indian Knoll, and he had spent several years identifying war dead in Japan after World War II (Ubelaker 2001). His dissertation was on the microscopic study of cortical bone. During his tenure at the Armed Forces Institute of Pathology, he developed an osteon counting method for estimating age from cortical bone and completed his doctorate from the University of Michigan in 1962 (Kerley 1962; Sledzick 2001; Ubelaker 2001). Three years later, Kerley took a teaching position at the University of Kentucky, before returning to his work with the identification of military remains in 1987. He identified war dead at the United States Central Identification Laboratory in Hawaii, first as a consultant then as the director (Ubelaker 2001). Recognizing the need for contemporary osteological collections that were not as biased as military samples, two autopsy samples were collected and two documented skeletal series were established in the late 1970s and early 1980s. Sheilagh Brooks (b 1923) was among the founders of the American Academy of Forensic Sciences Physical


Anthropology section and the American Board of Forensic Anthropology. In addition, Brooks, along with her husband, Richard, and colleagues, Stanley Rhine and Walter Birkby, founded and sustained the Mountain Desert and Coastal regional group of forensic anthropologists. Brooks’ research has explored topics such as the differentiation between human and nonhuman bone, recovery protocol, and stature estimation from incomplete long bones (Powell et al. 2006), but skeletal aging criteria have been of particular interest to her (Brooks 1955; Brooks & Suchey 1990). She, along with colleague Judy Suchey, has left her mark on the study of skeletal aging, as manifested by the development of the Suchey-Brooks method for estimating age from the changes of the pubic symphyseal face. The Suchey-Brooks method is based on a very large and well-documented contemporary reference sample of autopsied individuals from Los Angeles, California. The pubic symphyses of over 1,000 individuals were removed during autopsies performed between 1977 and 1979 (Brooks & Suchey 1990), resulting in the formation of the largest collection established specifically to address age-related changes of the pubis as they vary by sex and ancestry. A smaller autopsy sample of the sternal ends of the fourth rib was subsequently collected in the early 1980s. Mehmet Yasar İşcan, Susan Loth, and colleagues (1984, 1985) conducted research on improving the skeletal criteria for age estimation (Brooks 1951, 1955; Brooks & Suchey 1990; Suchey et al. 1988; Powell et al. 2006) that resulted in sex-specific age estimation standards based on the sternal rib end. İşcan has made other significant contributions to the disciplines of human osteology and forensic anthropology, including several prominent texts (Krogman & İşcan 1986; İşcan 1989a; İşcan & Kennedy 1989).


The Suchey-Brooks and İşcan samples consist only of osteological elements retained after autopsy, based on a specific research goal. In contrast, the two documented skeletal series established in the 1980s are composed of complete, or nearly complete, skeletons collected to facilitate the investigation of many potential research questions. These documented collections are, to some degree, associated with teaching and research universities. William Bass III (b 1928) founded the Forensic Anthropology Center at the University of Tennessee in Knoxville. He has been active in forensic anthropological casework throughout his career and was a founding member of the Physical Anthropology section of the American Academy of Forensic Sciences and the American Board of Forensic Anthropology. In 1971, Bass created the University of Tennessee’s Anthropology Research Facility (ARF), an area where human cadavers are studied for taphonomic processes and postmortem changes. In 1981, he established the William M. Bass Donated Skeletal Collection (Ubelaker & Hunt 1995); many of the skeletons were once part of the cadaver research at the ARF, and as a result, most have decomposed naturally as opposed to being rendered in a lab setting (Bassett et al. 2003). To this day, the Bass Donated collection remains an active donation program. In 1984, Stanley Rhine established the Maxwell Museum of Anthropology’s Documented Skeletal Collection, which is located in the Department of Anthropology at the University of New Mexico. Like the Bass Donated series, the Maxwell Museum collection is one of the few active donation programs for skeletal remains. The collection contains remains that are donated, unclaimed, or medico-legal cases. The museum works in conjunction with the State of New Mexico Office of the Medical Investigator to render


the fleshed remains down to bone. The Maxwell Museum was originally founded in 1932, under the name Museum of Anthropology of the University of New Mexico; forty years later, it was renamed the Maxwell Museum of Anthropology after a donation made by Dorothy and Gilbert Maxwell supported a significant expansion (UNM Maxwell Museum of Anthropology 2001-2010). Its collections also include over 3000 archaeological remains from the American Southwest and forensic skeletal remains from the State of New Mexico. The 1980s ushered in a new era, one that initiated investigations into human rights abuses (Klepinger 2006). The first such investigation was in Argentina, after the end of the military dictatorship that was suspected of being responsible for the disappearance of thousands of people (Klepinger 2006). Clyde Snow (b 1928) developed and honed his skills in forensic anthropology while working at the Civil Aeromedical Institute (CAMI). Like Kerley, Snow was one of the architects of the Physical Anthropology section of the American Academy of Forensic Sciences (Schick 1997). Beginning in the 1980s, he traveled to Argentina and consulted in the recovery and identification of individuals exhumed from unmarked graves (Joyce & Stover 1991; Klepinger 2006). Snow trained students in the proper excavation of the graves to ensure the recovery of the maximum amount of forensic evidence, thus forming the Argentine Forensic Anthropology Team in 1984. Later he served as the training director for the Guatemalan Forensic Anthropology Foundation (Schick 1997) and assisted with the extensive forensic excavations in the former Yugoslavia (Stover 1997). With a continuing need for large contemporary collections, the next generation of documented skeletal collections in the United States is virtual. The Forensic


Anthropology Data Bank was created in 1986 with a grant from the National Institute of Justice (Jantz & Moore-Jansen 1988). The Data Bank contains demographic and osteological data from the Bass Donated collection and forensic cases submitted by anthropologists from around the United States (Jantz & Jantz 1999; Dirkmaat et al. 2008). The collection contains data for over 2,600 individuals, including approximately 1,100 forensic cases with known sex and ancestry information. While the Data Bank is the most current collection (Dirkmaat et al. 2008), many different anthropologists have contributed the data; these observers have varying, and unknown, experience levels, a potential source of error that should not be overlooked. While virtual collections like the Forensic Anthropology Data Bank exist, physical skeletal collections continue to proliferate. Most recently, between 2005 and 2006, Kristen Hartnett collected the pubic symphyses and sternal ends of the fourth ribs from a subset of individuals autopsied at the Maricopa County Forensic Science Center in Phoenix, Arizona. As with the Suchey-Brooks and İşcan samples, Hartnett’s collection consists of certain skeletal elements that were gathered for a specific research goal, which in this case is the modification of pubic symphyseal and rib age estimation methods (Hartnett 2007). The importance of the maintenance and development of contemporary skeletal collections that encompass a wide range of human variation cannot be overstated. Human skeletal collections are essential for teaching and training, method development and testing, and research on anatomy and human variation.


Summary

At one time, isolated human skeletal remains were displayed in cabinets of curiosities. However, during the 19th Century, medical doctors, biologists, and anatomists with an interest in human cranial variation began collecting and studying larger numbers of human skeletal remains (Cook 2006). Many were particularly interested in the origins of American Indians, and this focus resulted in the collection of archaeological remains. During this time, some scientists thought intelligence was related to the anatomical features of the brain; cranial measurements were defined and calipers were developed to quantify them (Cook 2006). In the late 1800s, the seeds of forensic anthropology were planted and Huntington initiated the collection of large numbers of skeletons derived from medical school dissecting rooms. During the early 20th Century, American physical anthropology emerged as a profession due to the efforts of Hrdlička; the collection of archaeological remains continued and two additional large skeletal collections originating from anatomy schools and institutional morgues were established. The latter half of the 20th Century was marked by the acquisition of human skeletal remains that were donated, medico-legal, and/or military in origin. Anthropological goals included the description of human skeletal variability, the identification of war dead and medico-legal remains, and the modernization of collections. In addition, an innovative approach to the collection of data resulted in the creation of an electronic database in the mid-1980s; this data bank allows for the continual addition of demographic and anthropological information for current forensic


casework and represents a new type of resource available for research in physical anthropology.


Chapter 3 Theoretical Background and Literature Review

Exploring and interpreting differences among human populations is a cornerstone of biological anthropology. According to Moore-Jansen and colleagues (1994), the skeletal biology of Americans has changed, and continues to change, due to secular modifications, migration, and gene flow. The literature also documents changes in the timing of physiological maturation, which are associated with nutritional differences related to socioeconomic status and/or cultural diversity (Ito 1942; Wingerd et al. 1974; Moore-Jansen 1989). If changes to skeletal biology and rates of physiological aging are present, regardless of cause, have modifications occurred in the rate of senescent change of skeletal indicators? At present, little is known about this issue, although this question is of great importance to the applicability and reliability of current aging methods (Wittwer-Backofen et al. 2008). Skeletal biologists and anthropologists make a fundamental assumption that both the pattern and rate of age-related morphological changes do not vary significantly through time. However, Schranz (1959) questioned the use of current standards on ancient remains, and Hoppa’s (2000) study suggests that significant differences in the timing of age-progressive change of the pubic symphysis may exist between reference and target samples. Paleodemographers have recognized this problem with respect to applying age assessment standards derived from modern reference populations to archaeological samples (Bocquet-Appel & Masset 1982), as these researchers must assume that a reference sample of 19th or 20th Century skeletons will


provide valid information for estimating the age at death of historic groups10 (Usher 2002). This assumption must also be made for estimating the age at death of late 20th and early 21st Century forensic cases when using standards developed based on late 19th and early 20th Century American skeletal reference series. Despite this concern, no one has attempted a large-scale evaluation of how well current aging standards perform on documented American skeletal samples of varying genetic backgrounds, living conditions, health statuses, and time periods until now. This is an important contribution to the literature seeking to identify the validity and reliability of applying reference standards to target samples, since age estimation standards are tested against other known-age samples. Usher (2002) emphasizes that by using various documented skeletal collections, it is possible to test assumptions about the uniformity of patterns of biological aging of the human skeleton. This research addresses the importance of determining whether any aging methods can be uniformly applied to all individuals and which skeletal traits are the most reliable indicators of age.

Estimation of age at death

As mentioned in the Introduction, a genuine difference exists between chronological age and skeletal, or biological, age (du Noüy 1937; Laugier 1955; Acsádi & Nemeskéri 1970; Angel et al. 1986; Borkan 1986; Cox 2000; Kemkes-Grottenthaler 2002). Skeletal age markers and biological variables are an estimate of physiological status, not a representation of chronological age (Arking 1998; Kemkes-Grottenthaler 2002; Rösing et al. 2007). Unlike chronological time, the rate of biological aging can be

10

The reader is referred to this chapter’s section on critiques of estimating age from the adult skeleton for a more detailed discussion of this issue.


affected by lifeways, health and disease states, living and working conditions, climate, nutrition, endocrine function, and other environmental and genetic factors (Acsádi & Nemeskéri 1970; Angel 1984; Borkan 1986; İşcan 1989a; Loth & İşcan 1994). As a result, it is essential to recognize that physiological and chronological ages do not necessarily correspond (Laugier 1955; Acsádi & Nemeskéri 1970) and that chronological age estimates drawn from bone involve an inherent risk of error (Acsádi & Nemeskéri 1970; Maples 1989; Arking 1998; Rösing et al. 2007).

Development of osteological aging standards

In gerontological research, Spirduso (1995) outlined criteria for reliable age-related biomarkers. Primary criteria include the following: strong correlation between the indicator and age, an indicator that is not altered by pathological conditions, age-progressive changes that are not affected by metabolic or nutritional changes, a sequential and unambiguously identifiable aging pattern, and continuous remodeling throughout the organism’s lifespan (Spirduso 1995). Spirduso’s (1995) lesser criteria include wide applicability/generalization and reliable changes within a short time interval. In skeletal research, however, Kemkes-Grottenthaler (2002) notes that Spirduso’s standards cannot be applied to skeletal indicators of age because osteological changes are too complex and confounding factors abound. While not all of Spirduso’s primary criteria for age-related biomarkers are applicable to skeletal indicators of age, I do not believe that these criteria should be abandoned in the pursuit of age estimation in anthropological settings. Skeletal indicators of age do have a sequential aging pattern, and many features have a moderately strong correlation with chronological age; the task for anthropologists is to clearly

identify which indicators are not altered significantly by pathological, metabolic or nutritional changes, quantify the error in age estimates and consider other variables that may contribute to a more refined estimate particularly in older individuals, and define an unambiguously identifiable aging pattern that does not have a distinct end-point prior to the death of the individual. The paramount prerequisite for developing osteological aging standards in anthropology is an extensive knowledge of the skeletal system and its variation over time (İşcan and Loth 1989; Meindl & Russell 1998). In forensic anthropological contexts, specific requirements exist for age estimation methods: 1) the method must be transparent and replicable, with underlying data presented to the scientific community via publication in a peer-reviewed journal; 2) the accuracy of the method must be tested statistically; and 3) the method must be accurate enough to estimate age (Ritz-Timme et al. 2000). However, in general the construction of age estimation standards relies on identifying the divisible, unidirectional developmental/degenerative course of an osteological indicator (Boldsen et al. 2002) and subsequently corresponding these stages of skeletal morphology to chronological age based on known-age reference populations (Todd & Lyon 1924; Ferenbach et al. 1980; Usher 2002).

Osteological aging standards

Bone is a living tissue that remodels throughout the body’s lifespan, responding to hormones, trauma, and pathological conditions. Bone remodeling was reported subsequent to tooth loss in the latter half of the 18th Century (Hunter 1771), and research in skeletal remodeling has expanded since to include studies on growth and development, pathological conditions and endocrine disorders, repair mechanisms subsequent to

trauma, isotope analyses, and degenerative changes/aging, to list a few. It is because of the dynamic nature of bone that age-related changes can be investigated. Changes in bone and cartilage occur during all stages of life (Plato et al. 1994). During childhood and adolescence, these changes are related to growth. Accordingly, age estimation of skeletally immature individuals is based on skeletal and dental growth and development (McKern & Stewart 1957; Moorees, Fanning, & Hunt 1963a; Moorees, Fanning, & Hunt 1963b; Redfield 1970; Suchey et al. 1984; Krogman & İşcan 1986; Ubelaker 1987; Ubelaker 1989a; Ubelaker 1989b; Meindl & Russell 1998; Scheuer & Black 2000), which follow a predictable order across all human populations (Brooks 1955; İşcan 1989a; Buikstra & Ubelaker 1994). As a result, sub-adult age estimation is fairly accurate (Ritz-Timme et al. 2000), though environmental stress can slow down epiphyseal union by several years (Johnston & Zimmer 1989). In contrast, changes in the adult skeleton are related to remodeling and degeneration; these changes are less uniform and are associated with decreased adaptability and performance (Plato et al. 1994). Because adult age estimation methods are essentially based on “wear and tear” indicators (Schmitt et al. 2002), the accuracy of most adult age estimation standards is poor in comparison to methods based on developmental changes. In addition, the rate of senescence is affected by genetics, life experiences, culture, and environment (Acsádi & Nemeskéri 1970; İşcan 1989a; Meindl & Russell 1998). As a result, there is a decrease in the accuracy of estimates with increasing chronological age. Simply stated, it is more difficult to estimate age in the adult due to the unpredictable irregularity of the aging process (Acsádi & Nemeskéri


1970; Ferenbach et al. 1980; Bocquet-Appel & Masset 1982; İşcan 1989a; Maples 1989; Stini 1994; Lovejoy et al. 1997; Hoppa 2000; Schmitt et al. 2002). Though other age estimation methods exist (Gustafson 1950; Gustafson & Simpson 1953; Leopold & von Jagow 1960; Miles 1963; Gustafson 1966; Lengeyel 1968; Acsádi & Nemeskéri 1970; Kerley 1970; Burns & Maples 1976; Lovejoy & Burstein 1977; Thompson 1979; Lovejoy & Barton 1980; Lovejoy 1985; Condon et al. 1986; Stout & Paine 1992; Ritz et al. 1994; Kim et al. 2000; Ritz-Timme et al. 2000; Rösing et al. 2007; Griffin et al. 2009), most American anthropologists choose to utilize gross morphological aging standards for several reasons: these methods are easy to apply, do not require specialized equipment, and are non-destructive. These standards score the morphological changes of bony surfaces including the pubic symphysis, auricular surface of the ilium, sternal end of the fourth rib, and defined locales along the cranial sutures. Of course, to properly apply these aging methods, the observer must have a solid understanding of normal variation in age-related features and an ability to identify pathological conditions and postmortem damage (Meindl & Russell 1998). Another disadvantage of gross morphological aging standards is that they are considered less accurate than some other methods, such as aspartic acid racemization (Ritz-Timme et al. 2000); however, aspartic acid racemization results can be influenced by fluctuations in temperature, humidity, and pH, as well as differences in laboratory methods (Waite et al. 1999; Alkass et al. 2010). Interestingly, the preferred adult aging methods utilized by European and American physical anthropologists differ (Brooks & Suchey 1990; Wittwer-Backofen et al. 2008). Europeans follow recommendations compiled during a symposium in Prague


in 1972, which were subsequently revised at a paleodemographic conference in 1978 (Ferenbach et al. 1980); the recommendations for the skeletal aging of adults are based on the four criteria of the complex method (Acsádi & Nemeskéri 1970), specifically the pubic symphyseal face, the spongiosa structure of the humeral and femoral heads, and the obliteration of endocranial sutures. In contrast, physical anthropologists in the United States and the United Kingdom tend to follow guidelines printed in Buikstra and Ubelaker (1994) (Wittwer-Backofen et al. 2008). Because of their popularity and ubiquity in the anthropological literature, American gross-morphological age estimation standards are the focus of this dissertation. All adult skeletal age estimation standards are developed from the analysis of human remains drawn from archaeological and/or willed body/cadaver reference samples (see Chapter 2). It is important to consider that many reference samples have skewed age distributions and disproportionate representations of the sexes and races. These limitations, as they pertain to the reliability of specific aging methods, will be discussed later in the chapter.

Cranial suture closure

Cranial suture closure predates all other age indicators in the literature, originating with the first recognition of a relationship between age and cranial suture synostosis by Vesale in 1542 (Masset 1989). Beginning in the 1860s, the first studies on estimating age-at-death examined cranial sutures, coinciding with the collection and study of human crania (Masset 1989; Kemkes-Grottenthaler 2002). The estimation of age based on the obliteration of cranial sutures gained favor, as it was assumed that suture closure was a manifestation of a normal, progressive age-related physiological process (Masset 1971;

Hershkovitz et al. 1997; Dorandeu et al. 2008). However, the value of suture closure as an estimator of age at death has been questioned (Singer 1953; Hershkovitz et al. 1997; Galera et al. 1998), particularly because synostosis is affected by factors other than age, such as mechanical stress and genetic contributions (Cohen Jr. 1993), and is often asymmetrical within individuals11 (Zivanović 1983). Louis Gratiolet (1856) first proposed a sequence of ectocranial suture closure, which progressed from the obliteration of the sagittal to the lambdoid suture, with the coronal suture closing last. Five years later, Broca (1861) developed a five-point scoring system for sutures and noted that males aged 50 years and older still presented with many open sutures. In 1869, Pommerol determined from a single skull that suture obliteration began around 40 years, though even by early 20th Century standards (Todd & Lyon 1924), his sample was lacking. Nearly two decades later, Ribbe (1885) examined 50 skulls and determined that the initiation of sutural union ranged from 21 to 55 years; as a result, Ribbe argued that it was not possible to estimate age any closer than fifteen to twenty years. Similarly, Paul Topinard (1885) noted that the age of union of sutures varied greatly; however, he determined that the sequence of closure was the same. The sagittal suture began closing first at around the age of forty, followed by the coronal suture at fifty years; the temporal suture was closing by the mid-60s. In 1890, T. Dwight examined the skulls of the poor from a sample of individuals that Todd and Lyon (1924) argue were of unverified age. Though it was unclear if Dwight examined ectocranial or

11

Zivanović examined a sample of East African Bantu from the Galloway Skeletal collection at Makerere University Medical School, Department of Anatomy, Kampala, Uganda, and a sample of European skulls from the following departments: Department of Anatomy and Department of Pathology in Belgrade; Department of Anatomy, Department of Pathology, and Department of Forensic Medicine in Novi Sad; Department of Anatomy in Vienna; Department of Anatomy, Medical College of St. Bartholomew’s Hospital in London. Both Bantu and European skulls showed evidence of asymmetrical closure at several landmarks.


endocranial sutures, his conclusions were that the sequence of and age at closure were too irregular to be of use in age estimation. At the turn of the century, Frédéric (1906) examined 255 European and 119 non-European skulls using a modified version of Broca’s stages12, focusing primarily on ectocranial suture closure; like others before him, he determined that suture closure did not result in precise age estimation. Todd and Lyon (1924) sought to address the question of estimating age from cranial suture closure by improving upon these past studies, particularly by examining a large sample of documented age that included both Black and White individuals. Todd and Lyon (1924) endorsed endocranial suture obliteration over that of ectocranial sutures for aging and followed Frédéric’s version of Broca’s scoring scheme. Although they determined that a definite pattern in suture closure was present, Todd and Lyon (1924) noted the large degree of individual variability and did not recommend using the method as the sole indicator of age. In 1925, Todd and Lyon examined the progression of ectocranial suture obliteration and developed a system for the determination of age at death based on the degree of closure. Their standards were developed using crania from the Hamann-Todd collection (Todd & Lyon 1925a; Todd & Lyon 1925b). They identified a modal pattern of closure (Meindl & Lovejoy 1985), but made a critical error when they eliminated many crania because they did not fit their idea of normal. The inaccuracy of the method was significant. As a result, this method was deemed inappropriate for age estimation for forensic purposes (Singer 1953; Brooks 1955; Thompson 1982; Suchey et al. 1986).

12

Frédéric inverted Broca’s scoring scheme, such that 0 indicated patent sutures, 1-3 indicated the degree of partial closure, and 4 indicated complete obliteration.


Masset (1989) deemed the correlation of lateral and temporal sutures with age poor; accordingly, he scored three coronal, four sagittal, and three lambdoidal sites on a scale of 0 to 4, with zero denoting a segment that is completely open and four denoting a segment that is completely obliterated. Though the scoring procedure replicates that of Acsádi and Nemeskéri13 (1970), the age estimate is not based on the sum of the ten segment scores, as it is with Acsádi and Nemeskéri (1970). Instead, Masset calculated an obliteration coefficient (S), which is the average score for all suture sites. In addition, Masset (1989) attempted to correct for systematic statistical errors including those resulting from sex differences, the reference sample’s age structure, and regression analysis. In 1985, Meindl and Lovejoy published their newly developed method for estimating age at death based on the degree of ectocranial suture closure. Their method is the one most commonly employed by physical anthropologists in the United States, thanks in part to its inclusion in Buikstra and Ubelaker’s (1994) Standards for Data Collection from Human Skeletal Remains. The Meindl and Lovejoy method was developed using a sample of 236 crania from the Hamann-Todd anatomical collection. Ten sutural landmarks, divided into vault and lateral-anterior systems, are scored on a scale from 0 (open) to 3 (completely obliterated). Vault sites include midlambdoid, lambda, obelion, anterior sagittal, bregma, midcoronal, and pterion; lateral-anterior sites include midcoronal, pterion, sphenofrontal, inferior sphenotemporal and superior

13

Acsádi and Nemeskéri (1970) devised standard criteria for ectocranial aging, based on four stages created by the sum of segments of the sutures: three coronal, four sagittal, and three lambdoidal segments. The maximum score was 40. Stages progressed as follows: stage 1 correlated to a sum of 0-9; stage 2 correlated to a sum of 10-19; stage 3 correlated to a sum of 20-29; and stage 4 correlated to a sum of 30-40.
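The difference between the two ways of condensing the same ten segment scores can be illustrated with a minimal sketch. The function names are hypothetical and chosen only for illustration; the stage boundaries are those listed in the footnote above for Acsádi and Nemeskéri (1970), and the coefficient is simply the average score described for Masset (1989), without any of his corrections for sex, reference-sample age structure, or regression.

```python
N_SEGMENTS = 10  # three coronal, four sagittal, three lambdoidal segments
MAX_SCORE = 4    # 0 = completely open, 4 = completely obliterated


def _check_scores(scores):
    """Raise ValueError unless there are ten integer scores between 0 and 4."""
    if len(scores) != N_SEGMENTS:
        raise ValueError(f"expected {N_SEGMENTS} segment scores, got {len(scores)}")
    if any(s not in range(MAX_SCORE + 1) for s in scores):
        raise ValueError("each segment score must be an integer from 0 to 4")


def summed_score_stage(scores):
    """Stage (1-4) from the SUM of the ten scores, per the footnote's boundaries."""
    _check_scores(scores)
    total = sum(scores)
    if total <= 9:
        return 1
    if total <= 19:
        return 2
    if total <= 29:
        return 3
    return 4  # sums of 30-40


def obliteration_coefficient(scores):
    """Coefficient S as the MEAN of the ten scores (no statistical corrections)."""
    _check_scores(scores)
    return sum(scores) / len(scores)
```

Both functions take the same input, so a single scored cranium yields either a coarse stage or a value of S between 0 and 4.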


sphenotemporal. Composite scores are calculated by summing the scores for sites within the vault and lateral-anterior systems; these composite scores correlate to mean ages and age ranges that are not sex-specific.

Pubic symphysis

In 1858, Aeby observed age variation in the pubic symphysis, but did not speculate as to which specific ages were associated with these changes. Aeby also made note of changes that were associated with the formation of the ventral margin, which were previously described by Bonn (1777). Over a decade later, Henle (1872) reported that the pubic symphyseal face underwent variation in texture with age. In the early 1920s, Todd was the first to systematically address the age-progressive morphological changes in the pubic symphysis (Todd 1920; Todd 1921). His method was the first to be developed based on a large collection of skeletal material with documented age, sex, and ancestry (Kemkes-Grottenthaler 2002). The method focused on five major aspects of the pubic symphysis: the surface of the symphyseal face, the ventral border, the dorsal border, and the superior and inferior extremities (Todd 1920). Todd (1920) originally developed a ten-phase system to score morphological patterns of the pubic symphysis for White males based on 306 individuals from the Western Reserve University anatomical collection, now the Hamann-Todd Osteological Collection (Todd 1920; Brooks & Suchey 1990). While the collection is generally considered “documented,” the source of age information varies. Some ages were recorded from hospital death certificates or provided by relatives, but the coroner or anatomist estimated many after death (Brooks & Suchey 1990). Both Todd (1920) and Cobb (1952) noticed that the graphed mortality chart showed high peaks at five-year intervals and explained

that this display may be the result of age estimation by coroners and/or the tendency of people to round off their ages. Other problems emerged as the result of sample bias, specifically that the collection consists of a significant number of transients and lacks adequate representation of younger ages (Angel et al. 1986; Katz & Suchey 1986; Gillett 1991). As with Todd and Lyon’s cranial suture closure standard, certain skeletons were eliminated if they did not fit the expected standards for skeletal development existing at the time (Gillett 1991); this selectivity may have affected the overall variability in the sample and influenced downstream age estimates produced by the Todd method (Brooks 1955). Todd’s 10-phase method has been critiqued, revised, and refined many times since its inception, primarily for the development of component methods, adjustment of age ranges, and condensation of morphological descriptions into fewer phases. Brooks (1955), McKern and Stewart (1957), Gilbert and McKern (1973), Meindl and colleagues (1985), Brooks and Suchey (1990), and Hartnett (2007) have all made notable contributions. Brooks first modified Todd’s method in 1955, shifting the age ranges in an attempt to correct for the overaging that resulted from the original phase age limits. Brooks (1955) also critiqued Todd for excluding pubic symphyses that did not fit the expected morphological patterns; instead she argued that all variants should be recorded. Shortly thereafter, McKern and Stewart (1957) published a new component aging system for pubic symphyses based on a reference sample of 349 Korean War deaths; as a result, their sample over-represented young males and lacked females altogether. This system is based on the idea of maximizing the information gleaned from the pubic


symphysis, and it contrasts with the phase-based method of Todd, which does not account for variability in the pattern of morphological features present. The authors divided the pubic symphysis into three separate components: the dorsal plateau, the ventral rampart, and the symphyseal rim. Each component was scored separately according to its own set of progressive changes; the summed scores were then used to determine the mean age, standard deviation, and observed age range from a table (Kemkes-Grottenthaler 2002). According to Hanihara & Suzuki (1978), the combination of these scores was highly correlated with chronological age in the Korean War sample. The component method was meant to improve on Todd’s method by eliminating subjective bias in the interpretation of skeletal changes and by better describing individual variability. The major drawbacks of the McKern-Stewart method include its inapplicability to females, its bias toward young males, and its difficulty of application by inexperienced users, particularly as a result of the complexity of the scoring system (Katz & Suchey 1986). Gilbert and McKern (1973) later addressed the first of these problems by developing a female-specific component method that followed McKern and Stewart. The authors based their system on a reference sample of 103 autopsied American females [14] with known age at death.

14 Gilbert and McKern (1973) do not state explicitly the source of their sample of female pubic symphyses. However, a list of individuals and institutions assisting with the “building of the necessary research population” (p.38) in the acknowledgements provides the information: University of Kansas, Department of Pathology and Department of Anatomy; University of Tennessee Institute of Pathology; University of Missouri-Columbia Department of Anatomy; Washington University Department of Anatomy; and Dr. T. Dale Stewart’s “own collection” (p.38).


Within a few years, Hanihara and Suzuki (1978) published their research on the estimation of age from the pubic symphysis using a regression analysis of single traits on a sample of 70 Japanese males and females. The major critique of this method was that it was only reliable between the ages of 18-38 years, as the skeletal changes were highly variable for the pubic symphysis after age 40 (Hanihara & Suzuki 1978). Meindl and colleagues (1985) also modified Todd’s system using a set of 96 males and females drawn from the Hamann-Todd collection. Their research found that the original Todd method performed better than the Gilbert-McKern, McKern-Stewart, and Hanihara-Suzuki component methods and that no significant differences existed among Black females, Black males, White females, and White males. Meindl and colleagues’ (1985) revised method had defined the pubic symphyseal morphological transition using the following five stages: pre-epiphyseal, active epiphyseal, immediate post-epiphyseal, maturing/predegenerative, and degenerative, which corresponded to Todd stages I-V, VI, VII, VIII, and IX-X, respectively. In 1990, Brooks and Suchey generated a skeletal aging system with standards for males and females using linear regression analysis, which stemmed from an earlier investigation of age determination in the pubic symphysis of males (Katz & Suchey 1986). The Suchey-Brooks standard is a modification of the Todd method using six phases instead of ten. Katz and Suchey (1986) tested interobserver error for Todd’s method and found that certain phases needed to be condensed because the observers could not consistently discriminate between them. As such, Brooks and Suchey (1990) combined Todd phases I-III, IV-V, and VII-VIII; in addition, the authors also included all individuals so as not to lose the variability observed in the sample. The Suchey-Brooks


method was based on a very large and well-documented reference sample of 20th Century individuals from Los Angeles, California, autopsied between 1977-1979; the sample included 739 males and 273 females. This is by far the largest sample to be studied for age-related changes in the pubis, and all individuals had legal documentation of their age at death. The Los Angeles County coroner sample also has better age representation than other reference samples, ranging 14-92 years, with most under 60. The sample is considered fairly representative of a 20th Century population in terms of race, socioeconomic class, and geographic origin (de Arenosa & Suchey 1987, cited in Brooks & Suchey 1990). Along with phase descriptions, Brooks and Suchey (1990) contracted France Casting to produce a set of reference casts, illustrating the early and late patterns for each stage. Currently, the Suchey-Brooks standard is one of the most trusted and frequently used pubic symphyseal aging techniques, which may be due in part to its endorsement by Klepinger and colleagues (1992) and its inclusion in Buikstra and Ubelaker’s (1994) Standards for Data Collection from Human Skeletal Remains. Recognizing the limitations of existing methods, particularly with aging older individuals, several researchers have added a seventh phase to the Suchey-Brooks pubic symphysis standard (Hartnett 2007; Berg 2008). Berg redefined age ranges for phases 5 and 6 using transition analysis, and added a phase 7 based on a reference sample of Balkans and Americans from Tennessee, focusing specifically on changes in the older female pubis. Berg (2008) argues to incorporate physiological changes associated with osteoporosis/osteopenia into phase descriptions; Hartnett includes physiological measures in her revision of the Suchey-Brooks phase/stage based pubic symphyseal aging technique for her dissertation. Hartnett’s revisions, hereafter denoted as the Hartnett-


Fulginiti standard, are based on an autopsy sample drawn from the Maricopa County Forensic Science Center in Phoenix, Arizona, and account for better age estimates for those individuals over the age of 50-60 years. Sternal extremity of the fourth rib Kerley (1970) was the first to note age-related changes of the sternal end of the ribs, but methods for estimating age from this indicator did not emerge until the 1980’s. İşcan and Loth (1986) and İşcan and colleagues (1984, 1985) set standards for the age progressive morphological changes of the sternal end of the right fourth rib. Collectively, they developed a phase analysis system composed of nine phases (0-8) that describe changes in shape, form, texture, pit depth, and bone density of the sternal end. Their method was based on a reference sample of 277 rib ends collected at autopsy from medical examiner’s cases in Broward County, Florida (Loth & İşcan 1989). Originally, İşcan and his colleagues only tested the right side fourth rib, assuming that the left fourth rib would not be statistically significantly different from its antimere. Yoder and colleagues (2001) tested this hypothesis and found no significant difference between sides. Other variations of age estimation from ribs include methods developed by Kunos and colleagues (1999) and DiGangi and colleagues (2009). For each of these modifications, three aspects of the first rib were evaluated for their morphology: the costal face, rib head, and tubercle facet. The first rib was chosen because it was easily identified, was not influenced by mechanical stress, and exhibited remodeling into the eighth decade. These were then seriated by the degree of morphological changes for the first rib by age, and then a target age was assigned based on similar morphology. 54

DiGangi and colleagues (2009) take Kunos and colleagues’ (1999) method one step further by performing a transition analysis. Despite the development of these more recent aging methods (Kunos et al. 1999; DiGangi et al. 2009), the standard developed by İşcan and his colleagues remains the primary method of choice.

Iliac auricular surface

It is commonly reported that Sashin (1930) was the first to describe age-related changes of the auricular surface (Buckberry & Chamberlain 2002), followed by Kobayashi (1967); however, Sashin states that Meckel was the first to describe the sacroiliac joint in 1816, noting that the joint surfaces were smooth in youth and rougher in older individuals. Lovejoy and colleagues conducted the first systematic study of age-related morphological changes of the auricular surface in 1985; Lovejoy and colleagues (1985b) devised a phase-based standard that described age-related morphology for 5- or 10-year age intervals starting at age 20 years and ending at 60+. Their standard was based on a large reference sample of 500 individuals from the Hamann-Todd Osteological Collection, 250 from the Libben archaeological population, and fourteen forensic cases. Lovejoy and colleagues’ standard was originally designed for use with seriation to estimate the age distribution of a skeletal sample, but it has been routinely applied to individual cases for forensic anthropological analysis. The method was tested by Lovejoy and colleagues (1985a) on two samples selected from the Hamann-Todd collection (Mulhern & Jones 2005) to ascertain the accuracy and bias. Results indicate that the method is as accurate as pubic symphyseal aging, though admittedly


more difficult to apply (Lovejoy et al. 1985a; Lovejoy et al. 1985b); as a result the reliability of the method remains a concern [15]. Several studies have attempted to improve upon the original method. Buckberry and Chamberlain (2002) revised Lovejoy and colleagues’ (1985b) method and developed a component system that estimated age based on the composite score. The authors claim that the revised technique was easier to apply and had lower levels of inter- and intraobserver error than Lovejoy and colleagues’ (1985b) method. Igarashi and colleagues (2005) also developed a newer method for age estimation from the iliac auricular surface, based on 700 modern Japanese skeletons with recorded age. This technique took a different approach to description of the morphology, noting the presence or absence of certain relief and texture features [16] and then selecting the parameter estimates of each feature. The age estimate was then obtained by summing these parameter estimates. Igarashi and colleagues (2005) claimed their method was more accurate than other methods, particularly at the older age ranges, and that reliability was better for both males and females. Osborne and colleagues (2004) used Lovejoy’s standards for aging the auricular surface to estimate age for a sample of 266 individuals from the Terry (194) and Bass Donated (72) collections. Using the original age-phase correlation scheme, only 33% of the sample was correctly aged. The authors suggested that the 5- and 10-year age ranges published by Lovejoy and colleagues (1985b) were too narrow for use in forensic contexts. As a result, Osborne and colleagues (2004) modified the auricular surface aging system to include only six phases, combining phases with means that were not

15 The reader is referred to the discussion of weaknesses of the auricular surface later in this chapter.

16 Nine features for males and seven for females.


statistically significantly different. For the new method, inaccuracy was lowest for middle decades (40-59) and highest for the oldest decade (80-89); the method overestimated age for lower decades (20-49) and underestimated age for higher decades (50-89). Mulhern and Jones (2005) tested both Lovejoy and colleagues’ (1985b) method and the revised auricular aging standard by Buckberry and Chamberlain (2002) on 309 individuals of known age, sex, and race from the Terry and Huntington Collections. The authors were interested in determining whether the two methods were comparable in terms of accuracy. Their results indicated that the revised method was less accurate than the original for individuals under 50, but more accurate between 50-70 years. Although I have had no training or experience with Buckberry and Chamberlain’s (2002) aging method for the auricular surface, their age estimates for older age groups are not constrained by the 60+ value assigned to the terminal phase of Lovejoy and colleagues’ (1985b) standard. However, the revised method uses the same terminology as the original, and this does not assist those already struggling with the interpretation of Lovejoy and colleagues’ (1985b) feature descriptions.

Multiple-trait approaches

Both intrinsic and extrinsic factors modify aging patterns, implying that a single indicator will only reflect one part of a complex process (Kemkes-Grottenthaler 2002). Despite continuous modification and recalibration of existing methods, the paradigm has shifted from single-indicator to multiple-indicator approaches. The aims were to minimize the error of single indicators by using several individual methods to estimate


age (Cox 2000) and to combine multiple indicators to produce an internally consistent age distribution (Wright & Yoder 2003). In 1970, Acsádi and Nemeskéri created the Complex Method, which evaluated four features: structural changes of the spongiosa of the humeral and femoral epiphyses, symphyseal face of the pubis, and endocranial suture closure. In 1985, Lovejoy and colleagues developed the multifactorial aging method using the Hamann-Todd Osteological Collection as the reference sample (Lovejoy et al. 1985a), despite known problems with the sample. This standard incorporated age information from multiple indicators, including the auricular surface, pubic symphysis, and clavicular and proximal femoral radiographs, following Lovejoy and colleagues (1985b), Meindl and colleagues (1985), and Walker and Lovejoy (1985), respectively. Both the complex and multifactorial methods produce better accuracy for age estimates in older individuals (Jackes 1985). Transition analysis A novel approach to skeletal age at death estimation took hold in the 1990s. Boldsen and colleagues (2002), like others preceding them, had the goal of revising the means of age estimation of adult skeletons, this time in light of the Rostock Manifesto; the Rostock Manifesto called for a number of improvements to be made in the field of paleodemography, including the development of more reliable and more vigorously validated age estimation methods (Hoppa & Vaupel 2002). Due to known problems with phase based techniques, the authors sought to develop a new kind of age estimation method. The culmination of their osteological experience and statistical changes was a new multi-factorial age estimation method that returned to a component system for the 58

pubic symphysis, introduced a component scoring system for the auricular surface, and evaluated five different points along cranial sutures with a new descriptive scale. The authors’ approach follows the logic of McKern and Stewart (1957), who divided the pubic symphysis into individually scored components, each with their own set of morphological changes to score; like other age indicators, these morphological changes are a series of unidirectional stages. Boldsen and colleagues (2002) believe that this approach better reflects the complex changes observed for age indicators than do static phase descriptions, because senescent changes do not occur simultaneously. As the authors state, it is difficult to force a complex anatomical structure into one particular stage (Boldsen et al. 2002). In addition, a component scoring system allows osteologists to take full advantage of the meager information that might be available in poorly preserved age indicators, especially if they are fragmentary. Boldsen and colleagues’ (2002) newly developed scoring system, as well as the features scored, were derived from previous descriptions as well as extensive experience working with thousands of prehistoric and historic skeletons. Pubic symphyseal and iliac auricular surface characteristics were defined using American skeletal samples and Danish archaeological remains. Only one of the auricular surface features, specifically the posterior iliac exostoses, was not present for the archaeological remains. Boldsen and colleagues (2002) only observed this trait in elderly individuals in the Terry Collection. This feature, along with the breakdown of the dorsal margin of the pubic symphysis, is considered to be a trait characteristic of old age. Instead of assigning a static age range to the stages observed for each feature scored, the authors chose to use a different statistical technique: transition analysis.


Boldsen and colleagues (2002) describe transition analysis as an estimation procedure that allows for inferences to be made about the timing of transition from one stage to the next. The ADBOU age estimator program calculates a maximum likelihood point estimate for age at death and associated confidence intervals for each individual. In the ADBOU age estimator program, the observer can choose the sex, race, and hazard model, which is either a uniform prior distribution or an informed prior. If the uniform prior distribution is chosen, the program assumes that all target individuals have an equal chance of being all ages, ignoring the target sample’s age distribution (Konigsberg and Frankenberg 1994). This approach is criticized by Bocquet-Appel and Masset (1982), and later by Di Bacco and colleagues (1999), because it weights extremely old and highly unlikely ages-at-death heavily. Boldsen and colleagues (2002) acknowledge the critique but counter that the use of a uniform prior has precedent (Konigsberg et al. 1998). If the informed prior is chosen, the program uses documentary information from either United States national homicide data from 1996 (Peters et al. 1998) or 17th Century rural Danish parish records (Johansen 1998). Bayesian prediction allows for a direct visualization of the variability because age at death is assessed by the probability that it belongs to a set chronological interval (Schmitt et al. 2002). The prior probability is the probability of an individual belonging to a specific age category, given no information other than the assumption that the individual is similar to the reference sample to be used. The likelihood is the probability that an individual with a particular score belongs to a particular age, based on the age distribution of the reference set for that point’s score. The posterior probability is


indicative of the chance that an individual belongs to a particular age group taking into account the prior probability and the likelihood (Aykroyd et al. 1999).
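The relationship among prior, likelihood, and posterior described above can be illustrated with a minimal sketch. It is not the ADBOU program and does not reproduce Boldsen and colleagues' parameters: the logistic transition curves, the transition ages and scales, the treatment of features as independent given age, and the exponentially declining "informed" prior are all hypothetical placeholders chosen only to show how the two choices of prior behave.

```python
import math

def stage_probabilities(age, transition_ages, scale):
    """Return P(stage = k | age) for one feature with ordered, one-way stages.

    Assumption: the chance of having passed each transition follows a
    logistic curve centred on a hypothetical transition age.
    """
    passed = [1.0 / (1.0 + math.exp(-(age - t) / scale)) for t in transition_ages]
    bounds = [1.0] + passed + [0.0]
    return [bounds[k] - bounds[k + 1] for k in range(len(bounds) - 1)]

def posterior_over_ages(observed, features, ages, prior):
    """Combine prior and likelihood into a posterior over candidate ages.

    observed: one stage index per feature, or None if the feature is missing.
    features: one (transition_ages, scale) pair per feature (placeholders).
    Assumption: features are treated as independent given age.
    """
    weights = []
    for age, p in zip(ages, prior):
        likelihood = 1.0
        for stage, (t_ages, scale) in zip(observed, features):
            if stage is not None:  # unobservable features are skipped
                likelihood *= stage_probabilities(age, t_ages, scale)[stage]
        weights.append(p * likelihood)
    total = sum(weights)
    return [w / total for w in weights]

def mean_age(ages, posterior):
    """Posterior mean age."""
    return sum(a * p for a, p in zip(ages, posterior))

ages = list(range(15, 101))
# Two hypothetical features: one with four stages, one with three.
features = [([25.0, 40.0, 60.0], 6.0), ([30.0, 55.0], 8.0)]

# Uniform prior: every candidate age equally likely before scoring.
uniform = [1.0 / len(ages)] * len(ages)

# Stand-in for an informed prior: probability declines with age.
declining = [math.exp(-(a - 15) / 30.0) for a in ages]
norm = sum(declining)
informed = [w / norm for w in declining]

observed = [3, 2]  # both features scored in their terminal stage
post_uniform = posterior_over_ages(observed, features, ages, uniform)
post_informed = posterior_over_ages(observed, features, ages, informed)
print(round(mean_age(ages, post_uniform), 1), round(mean_age(ages, post_informed), 1))
```

When terminal stages are observed, the likelihood in this sketch keeps rising with age, so the uniform prior places much of the posterior on very old ages, which is the concern behind the critique of uniform priors noted above; the declining prior pulls the estimate toward younger ages.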

Application of American aging standards to target groups

An enormous amount of research has been conducted on skeletal aging over the last century. Since an exhaustive survey of every osteological-based aging study would be voluminous, only highlights related to the dissertation question are presented. The application of aging standards developed from one sample to any given target sample assumes that both samples possess the same biological aging characteristics (Kemkes-Grottenthaler 2002); however, an abundance of research has identified factors influencing individual variability in the aging process. This literature review focuses on those American aging standards tested in this research, highlighting the performance of each by sex, race, nationality, and time, as well as emphasizing the standards’ strengths and weaknesses.

Cranial sutures

Differences in the synostosis of cranial sutures between the sexes have been reported. In 1955, Brooks noted a difference between female pubic mean age and cranial suture closure age; this slower rate of suture obliteration in females ranged anywhere from five to twenty-five years. Singer (1953) found a similar lag in suture obliteration for females. In contrast, Brooks (1955) reported that males had a higher correlation between pubic and cranial mean ages. Galera and colleagues (1998) found sex-specific and population differences in cranial suture closure for a sample of skeletons drawn from the Terry Collection using four independent scoring standards, including that of Meindl

and Lovejoy; Nawrocki (1998) also found a correlation between suture closure and sex for the Terry Collection. Masset (1989) provided some presumptive evidence for a secular trend in cranial suture closure, finding a slight difference between the means of closure in two Portuguese samples with death years approximately fifty years apart: Lisbon and Coimbra. Bocquet-Appel and Masset (1995) published additional evidence supporting a secular trend in suture obliteration. Using the same Lisbon sample, they compared cranial suture closure to a sample from Prague with years of death nearly a century later. Bocquet-Appel and Masset (1995) plotted the percentage of lambdoidal suture closure against age and reported that at 60-70 years of age, the Lisbon group appears 20-30 years younger than their actual age when the modern Prague sample’s standards were applied.

Strengths and weaknesses

Meindl and Russell (1998) report that age estimation based on cranial suture closure fell into disfavor in the 1950s, due to critiques by Singer (1953), Brooks (1955), and McKern and Stewart (1957); decades later, the value of suture closure as an estimator of age at death is still questioned (İşcan & Loth 1989; Buikstra & Ubelaker 1994; Hershkovitz et al. 1997; Galera et al. 1998; Boldsen et al. 2002). Though some have argued that the lateral-anterior sutures are more reliable than the vault sites, ectocranial suture closure is generally considered inaccurate, providing only a quick and rough impression of age (Singer 1953; Brooks 1955; McKern & Stewart 1957; Meindl et al. 1983; Ritz-Timme et al. 2000). Cranial sutures appear to obliterate with increasing age, but synostosis is affected by other factors like mechanical stress and genetic contributions (Cohen Jr. 1993). Regardless, the large variability in rates of closure makes age

estimation problematic (Singer 1953; Brooks 1955; McKern & Stewart 1957; Krogman & İşcan 1986; Masset 1989; Saunders et al. 1992; Buikstra & Ubelaker 1994; Bocquet-Appel & Masset 1995; Galera et al. 1998; Rösing et al. 2007). This variability means that the determination of an individual’s age at death is only possible between very wide limits, which do not allow for any meaningful estimation of age (Acsádi & Nemeskéri 1970; Hershkovitz et al. 1997). Even Boldsen and colleagues (2002) admit that they only include suture obliteration in their aging standard because the cranium is often the only element recovered in forensic cases. Thus, cranial suture closure is now returning as part of standards utilizing multiple indicators of age (Meindl & Russell 1998), with the caveat that in ambiguous cases, postcranial age indicators should be weighted more heavily than cranial suture scores (Buikstra & Ubelaker 1994). The anthropological literature reports that the correlation between cranial suture closure and chronological age is low to non-existent (Singer 1953; Brooks 1955; Masset 1971; Perizonius 1984; Hershkovitz et al. 1997) and includes documentation of very old individuals with open sutures (Perizonius 1984; Aykroyd et al. 1999). Variables possibly affecting suture obliteration include epidemic diseases, and vascular, hormonal, genetic, biomechanical, and local factors (Persson et al. 1978; Masset & de Castro e Almeida 1990, as cited in Jackes 2000; Cohen Jr. 1993; Kanisius & Luke 1994). Asymmetry is also a concern, as marked differences between right and left sides have been noted; the lateral-anterior sites of the Meindl and Lovejoy system, as well as the coronal suture, are vulnerable to asymmetry (Zivanović 1983; Kemkes-Grottenthaler 1996, as cited in Jackes 2000). If only one side is present, there is a risk of erroneous age assessment. Other problems with cranial suture closure aging standards include unclear


descriptions of obliteration stages (Hershkovitz et al. 1997) and limitations due to a clear end point (complete obliteration) to the standard that can be reached long before death (Jackes 2000). Despite the presence of significant individual variation, Todd and Lyon (1924) emphasize that cranial suture synostosis has a clear, ordered sequence of changes. Another benefit of cranial suture closure aging standards, specifically that of Meindl and Lovejoy, is low interobserver error (Galera et al. 1995). In contrast to other reports, Meindl and colleagues (1983) report a moderate (0.65) correlation between cranial suture closure and chronological age.

Pubic symphysis

In terms of skeletal aging, differences between the sexes, both in morphology and in rate of maturation, have been observed for the pubic symphysis (Todd 1921; Gilbert & McKern 1973; Gilbert 1973; Brooks & Suchey 1990; Sharma et al. 2008). Originally, Todd reported no significant differences in pubic aging between the sexes. But soon after, Todd (1921) discovered that females developed the ventral rampart formation two to three years later than males, and that females exhibited dorsal flattening two to three years earlier than males. Differences in the rate of skeletal morphological change between the sexes were also described by Gilbert and McKern (1973), who stated that female pubic symphyses appeared ten years older than their male counterparts. Because sex-specific differences have been consistently observed for the pubic symphysis, more recent revisions to the Todd system, including Brooks and Suchey (1990) and Hartnett (2007), have devised aging standards with separate age ranges and/or different phase descriptions for males and females.

The greater variability observed for female pubic morphology (Jackes 1985; Katz & Suchey 1986; Kemkes-Grottenthaler 2002; Djurić et al. 2007) translates to less reliability in age estimates and has been attributed to hormonal changes and trauma related to childbearing by numerous authors (Todd 1921; Putschar 1976; Suchey et al. 1979; Bergfelder & Hermann 1980). For this reason, Brooks and Suchey (1990) suggest that dorsal lipping should not be relied upon as an indication of age in females. However, a crude analysis comparing small samples of low-parity and high-parity females from the historic Spitalfields cemetery did not detect significant differences in the variation of mean stage by age (Hoppa 2000). Despite this, Hoppa (2000) did find that female pubic symphyses appeared younger than males after the age of 40 years in some samples. In his earliest evaluation of pubic symphyseal aging, Todd (1920) did not discuss differences in the rate of metamorphosis between American Blacks and Whites drawn from the Hamann-Todd Osteological Collection. Similarly, Brooks (1955) did not note any differences between races for the pubic symphysis. But when Katz and Suchey (1989) tested for racial differences in pubic symphyseal metamorphosis using a well-documented multiracial American sample of 704 male pubic bones collected after autopsy at the Los Angeles County coroner’s office in 1977, significant differences in age were found across racial groups. The relationship between estimated age using a six-stage modified Todd system and chronological age was examined as a function of race, which was classified as White, Black, or Mexican. The authors analyzed the data twice, once using linear regression models and once incorporating an analysis of variance. The authors observed that Blacks and Mexicans with advanced pubic symphyseal patterns tended to have lower chronological ages than Whites exhibiting the same morphology


(Katz & Suchey 1989). As a result, Klepinger and colleagues (1992) encourage the use of a racially specific variant of the Suchey-Brooks pubic symphyseal aging standard for males. Conflicting results were reported when American aging standards were applied to skeletal samples from Asia. Schmitt (2004) conducted a blind study of the Suchey-Brooks pubic symphysis method on a Thai sample of known sex and age at death. The Thai sample was composed of unclaimed bodies/indigents, as well as willed remains from the Department of Anatomy at the University of Chiang Mai, Thailand. Results found that both bias and inaccuracy increased with age, and that the chronological age tended to be underestimated. A blanket conclusion that age assessment based on American standards should not be used for samples from Asia was made, because Thai individuals retain earlier phase morphology even in advanced age, resulting in lower age estimates than actual chronological age (Schmitt 2004). The degree of inaccuracy was striking, up to 32 years for individuals over the age of 60. In contrast, when Hanihara (1952) scored Japanese pubic symphyses using Caucasian standards, the Japanese individuals appeared two to three years older than their actual ages. Similarly, Sakaue (2006) tested the Suchey-Brooks system on a recent Japanese sample (n=416) and found that the differences between the mean ages of Japanese and American samples for all six stages were not statistically significant. This contrast hinted that these trends might be affected by potential underlying differences in socioeconomic status, health, and/or nutrition. Pal and Tamankar (1983) and Sinha and Gupta (1995) reported differences in pubic symphyseal aging between a sample of males from India and American White


males using the Todd ten-phase standard and the McKern-Stewart component method. When compared to the reference sample (see Todd 1927), the Indian males had statistically significant lower mean ages of development for phases II, III, and VI-X, as well as differences in developmental timing of several pubic symphyseal components. Specifically, the Indian males had an advanced development of the dorsal margin and formation of the symphyseal rim. Indian males also exhibited a delay in the completion of the dorsal plateau, ventral beveling and rampart, and the symphyseal rim (Sinha & Gupta 1995). The Suchey-Brooks standard has been tested on European samples as a result of recent international forensic anthropological investigations of mass graves. Djurić and colleagues (2007) evaluated the Suchey-Brooks method using a Serbian sample consisting of 52 males and 33 females from the Institute for Forensic Medicine, University of Belgrade (1999-2002). The authors found the Suchey-Brooks method to be more accurate in males (89.7%) than in females (72%); the greater inaccuracy for females was expected due to the increased variability in their age indicators (Djurić et al. 2007). The oldest individuals were underaged, regardless of sex. Kimmerle and colleagues (2008a) also tested whether American aging standards for the pubic symphysis would reliably estimate the age at death for Balkan skeletal remains. The pubic symphyses of 212 male and 84 female Balkans were scored using the Suchey-Brooks method. The comparative sample of 2,078 American males and females (Blacks and Whites) was drawn from the Forensic Data Bank; their Todd scores were converted to the 6-phase Suchey-Brooks stages. To test for population differences in aging, Kimmerle and colleagues (2008a) used proportional odds probit regression, an analysis of


deviance, and an improvement chi square statistic. Results showed statistically significant differences between American and Balkan females (Kimmerle et al. 2008a), but no difference between the aging processes of males and the total populations (sexes pooled). Berg (2008) also reported differences in the median ages for pubic symphyseal phases between American and Balkan females. Hoppa (2000) used the Suchey-Brooks standard to identify differences in age-related changes of the pubic symphysis between the original Brooks and Suchey (1990) reference sample and two different target samples: a 20th century forensic sample of similar composition to that of the Suchey-Brooks reference sample (Klepinger et al. 1992) and an archaeological sample derived from the 18-19th Century cemetery at Spitalfields (Molleson & Cox 1993). Hoppa (2000) found that the mean phase of each 10-year group within the reference and both target samples was not the same, indicating that the rates of skeletal change occurring in the three samples are significantly different. The differences observed were particularly significant for females over the ages of 30 in the archaeological target and 40 in the forensic target; both the forensic sample and archaeological sample females have younger looking morphology than the reference sample at comparable actual ages (Hoppa 2000).
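The validation studies summarized above rely on a few simple summaries: mean phase within ten-year age groups, and the bias and inaccuracy of age estimates. The sketch below shows one way these could be computed. The text does not define bias and inaccuracy, so the formulas used here (mean signed error and mean absolute error of estimated minus known age) are an assumption based on how the terms are commonly used in this literature, and the records in the usage lines are invented for illustration, not data from any of the cited studies.

```python
from collections import defaultdict

def mean_phase_by_decade(records):
    """Mean phase score within each ten-year age group.

    records: iterable of (known_age, phase) pairs.
    Returns {decade_start: mean_phase}, e.g. the key 40 covers ages 40-49.
    """
    groups = defaultdict(list)
    for age, phase in records:
        groups[(age // 10) * 10].append(phase)
    return {decade: sum(p) / len(p) for decade, p in sorted(groups.items())}

def bias_and_inaccuracy(known_ages, estimated_ages):
    """Assumed definitions: bias is the mean of (estimated - known);
    inaccuracy is the mean of |estimated - known|.
    """
    errors = [est - known for known, est in zip(known_ages, estimated_ages)]
    bias = sum(errors) / len(errors)
    inaccuracy = sum(abs(e) for e in errors) / len(errors)
    return bias, inaccuracy

# Invented example records (known age, phase) for two samples.
reference = [(23, 1), (27, 2), (45, 4), (48, 5), (66, 6)]
target = [(24, 1), (29, 1), (44, 3), (47, 4), (68, 5)]
print(mean_phase_by_decade(reference))
print(mean_phase_by_decade(target))

# A negative bias means ages were underestimated on average.
print(bias_and_inaccuracy([30, 60], [35, 50]))
```

A lower mean phase in the target sample for the same decade corresponds to the "younger looking morphology" reported for the target females, and a negative bias in older groups corresponds to the underaging of the oldest individuals.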

Strengths and weaknesses

Todd’s pubic symphysis aging standard has significant methodological problems, including reference sample age documentation, sample size, age range, and sex distribution (Brooks 1955; Djurić et al. 2007). It is well documented that the Terry and Hamann-Todd collections contain individuals whose ages are not truly known (Boldsen et al. 2002; Hunt & Albanese 2005; Konigsberg et al. 2008). This problem may be the result of several circumstances: 1) individuals did not provide accurate information regarding their age because either their true age is unknown or is misrepresented for cultural reasons; 2) an inconsistency between stated age and what the morgue physician thought the body condition was consistent with; or 3) morgue physicians assigned an age to individuals of undocumented age at death based on the physical appearance of the body at autopsy examination (Howell 1976; Meindl et al. 1983; Lovejoy et al. 1985a; Usher 2002; Hunt & Albanese 2005). Questions about the accuracy of ages for the sample imply that all aging methods based on the Hamann-Todd collection should not be trusted unconditionally, because less than one in six cadavers from Cleveland hospitals had sufficiently documented ages and the differences between stated and observed ages were often 15-20 years (Meindl et al. 1983; Lovejoy et al. 1985a). In a separate study, Lovejoy and colleagues (1985a) found only three “cadaver records” had legal documentation of birth date in their sample drawn from the Hamann-Todd collection. However, several of the same authors claim in a later publication (Meindl et al. 1990) that their sample of 512 individuals selected from the Hamann-Todd collection each had a legal age at death recorded on their United States Revised Death Certificate, which was filed at the Vital Records Division of Cleveland City Hall; they stressed that the next of kin providing the information are listed and that it is, as it was then, illegal to falsify such information. It is unclear exactly how the number of individuals with legal documentation of age at death increased so dramatically in five years, but perhaps it was related to the location of the information: cadaver records versus the Vital Records Division.


Of course, it is impossible to know for certain which ages are accurate and which are not. In the Hamann-Todd collection, the uncertainty of some “documented” ages is marked by “?” or “ca.” However, these designations are not the only indicators of potential problems with the recorded age at death. The anatomists working on the Hamann-Todd collection described/rated each cadaver’s “stated age” according to their certainty of the documented age, using terms like “certainly correct” and “certainly incorrect” (Meindl et al. 1983; Lovejoy et al. 1985a). Stated ages that were deemed “certainly correct” were those falling within a +/-5 year range of the observed age. When the stated age of an individual was 30 years but unfused epiphyses were observed, the stated age was classified as “certainly incorrect” (Lovejoy et al. 1985a). Finally, it is well documented that ages cluster at five-year intervals in the Terry and Hamann-Todd collections, probably as the result of estimated ages or individuals who rounded off their reported age (Todd 1920; Cobb 1952; Katz & Suchey 1986; Boldsen et al. 2002; Hunt & Albanese 2005; Konigsberg et al. 2008). The Todd pubic symphysis aging standard also suffers from significant inaccuracy, due in part to the variability observed in the symphyseal face, which results in morphology that is difficult to classify into one stage or another (McKern & Stewart 1957). So much variability was present within the Hamann-Todd sample that Todd could not get over a quarter of his sample to fit into his aging scheme (Jackes 1985). As a result, Todd purposefully reduced the sample variation, deleting problematic skeletons that did not fit the standards for skeletal development existing at the time (Angel et al. 1986; Katz & Suchey 1986; Brooks & Suchey 1990; Gillett 1991). Todd’s method does not account for individual variation according to Katz and Suchey (1986). The Suchey-Brooks method addressed this issue by modifying Todd’s standard, using a much larger reference sample with legal documentation of age at death. Brooks and Suchey’s (1990) sample had better age at death, geographic, and racial representation, so it is preferable because it allowed for normal variation to be evaluated (Gillett 1991); this is reflected in the large age intervals associated with the Suchey-Brooks phases and the significant overlap in 95% confidence intervals for age estimates (Brooks & Suchey 1990; Schmitt 2004; Kimmerle et al. 2008b). However, substance abuse in 20th Century autopsy samples may be an issue (Klepinger et al. 1992). Taylor (2000) evaluated age-related changes in the sternal end of 173 ribs from individuals with known chronic substance abuse problems that were collected at autopsy at the King County Medical Examiner’s Office in Seattle, Washington. Taylor (2000) found that chronic substance abuse affected the reliability and accuracy of the İşcan and colleagues’ rib aging method; in contrast, Hartnett (2007) did not find a statistically significant difference between groups17 when the observed and actual pubic symphyseal phases were compared. Other critiques of pubic symphyseal aging standards include low repeatability and low reliability (Saunders et al. 1992; Molleson & Cox 1993; Hoppa 2000; Schmitt et al. 2002; Rösing et al. 2007). Though Galera and colleagues (1995) did not find significant interobserver error for either the Todd or Suchey-Brooks standard when testing the methods on 963 skeletons from the Terry Collection, Kimmerle and colleagues (2008b) found that correlations between observers’ scores for Todd’s method varied from low to high for a sample of identified individuals from Kosovo. Significant interobserver error

17 The groups compared were chronic substance abusers and those with no history of substance abuse. Both groups were drawn from an autopsy sample at the Maricopa County Forensic Science Center.


may be due in part to confusion between the development and degeneration of a single feature: the ventral rampart (Suchey & Katz 1998; Kimmerle et al. 2008b). It may also be the result of the difficulty of applying upper phase designations based on photos and insufficient phase descriptions (Gillett 1991); Brooks and Suchey (1990) attempted to solve this problem as well, commissioning the casting of reference plaques and formulating clear phase descriptors for both males and females (Gillett 1991). Interestingly, Kimmerle and colleagues (2008b) reported more observer variation for the Suchey-Brooks method than for Todd’s; despite their improvements, some suggest that characteristics scored for the Suchey-Brooks standard are still difficult to assess, resulting in large interobserver error (Saunders et al. 1992; Baccino et al. 1999), as well as accuracy and precision that were less than desired (Klepinger et al. 1992). Like other aging standards, the phase-based methods relying solely on the pubic symphysis tend to overestimate the age of the young and underestimate the age of the old (Martrille et al. 2007). Meindl and colleagues (1983) determined that pubic symphysis aging standards were biased, such that the error increases with age. Numerous authors have emphasized that age assessment from the pubic symphysis is not reliable past the fourth decade (Hanihara & Suzuki 1978; Suchey 1979; Meindl et al. 1983; Lovejoy et al. 1985a; Lovejoy et al. 1985b; Meindl et al. 1985; Katz & Suchey 1986; Klepinger et al. 1992; Lovejoy et al. 1997; Meindl & Russell 1998; Sakaue 2006; Djurić et al. 2007; Martrille et al. 2007; Sharma et al. 2008). Age estimation methods based on the pubic symphysis require sex-specific standards; sex-specific changes are particularly noticeable for those older than 40 years (Gilbert & McKern 1973) and are likely related to the degenerative process (Schmitt et


al. 2002). While revisions to the Todd method do attempt to better describe age ranges and/or differing morphology associated with males and females, the original Todd method uses the same standard for both sexes. Population-specific standards may also be necessary, as the timing of age-related morphological changes in Asian and African samples appears to be different from that of European samples (Schmitt et al. 2002). At this point, only Boldsen and colleagues’ Transition Analysis method considers race when estimating the age and confidence intervals, though the choice is limited to Blacks and Whites. Another concern with the reliability of the pubic symphyseal aging standards is that Hoppa (2000) observed differences among samples18 in the rate of early development and later degeneration of this indicator, as scored by the Suchey-Brooks method; however, Konigsberg and Frankenberg (2002) argue this difference was the result of interobserver error. Regardless, the numerous modifications to the Todd system of aging the pubic symphysis have not solved the controversy of whether variations exist in age estimates due to sex, race, population, inter-observer variability, or method reliability (Kimmerle et al. 2008a). Age estimates produced by pubic symphyseal standards may be affected by numerous factors, including childbirth, physical inactivity due to trauma or debilitation, and asymmetry. Stewart (1957) reported differences in the dorsal margin of multiparous women that resulted in overaging. In contrast, Klepinger and colleagues (1992) noted that physical inactivity was associated with severe underestimation of chronological age; the authors cite an example of an individual in his/her 50s, who was estimated to be in

18 See previous section for a discussion of this paper. Samples compared were Brooks and Suchey’s original reference sample, a 20th century forensic sample of similar composition to that of the Suchey-Brooks reference sample (Klepinger et al. 1992), and an archaeological sample derived from the 18th-19th century cemetery at Spitalfields (Molleson et al. 1993).


his/her 20s based on skeletal indicators. Accordingly, Klepinger and colleagues (1992) stressed the need to note variables like extremes in body weight, alcoholism, trauma, and physical disability, as they affect age estimates. Additionally, the developmental and degenerative processes forming the age indicator are not necessarily symmetrically stable within an individual; this is well documented in the literature and several models have been proposed to explain skeletal asymmetry, including genetic determinants, biomechanical factors, and environmental stress (Jones et al. 1977; Schell et al. 1985; Albert & Greene 1999; Halgrimsson 1999). Overbury and colleagues (2009) reported asymmetry in the pubic symphysis Suchey-Brooks age phases for over 60% of a sample of 20th Century White males (n=130) drawn from the Hamann-Todd anatomical collection; however, the authors note that the presence of asymmetry does not compromise the accuracy of the method if the morphologically advanced symphyseal face is used to age the asymmetrical individual. Despite these criticisms, pelvic indicators are considered superior to cranial ones (Meindl & Lovejoy 1989; Nagar & Hershkovitz 2004), and the age-related changes of the pubic symphysis are regarded by many anthropologists to be the best indicator of age at death in adults, providing the greatest accuracy and reliability (McKern & Stewart 1957; Stewart 1979; Meindl et al. 1985; Suchey et al. 1986; Steele & Bramblett 1988; Ubelaker 1989a; Buikstra & Ubelaker 1994; Bass 1995; Suchey & Katz 1998; Boldsen et al. 2002). The pubic symphysis is clearly the most studied indicator of age in adult humans, and it is both the most commonly used and most widely trusted indicator (Todd 1920, 1921; Brooks 1955; McKern & Stewart 1957; Stewart 1957; Gilbert & McKern 1973; Hanihara


& Suzuki 1978; Suchey 1979; Meindl et al. 1985; Katz & Suchey 1986; Meindl & Lovejoy 1989; Gillett 1991; Aykroyd et al. 1999). Since developmental changes are fairly constrained, and senescent changes are more variable, Boldsen and colleagues (2002) suggested that the pubic symphysis is the most informative indicator of adult age at death because it includes both developmental and degenerative changes; the auricular surface, sternal rib end, and cranial sutures exhibit only degenerative changes. Because of the development of the ventral rampart of the pubic symphysis, the pubic symphysis is considered the most accurate indicator for young adults (Mant 1984; Meindl et al. 1985; Bedford et al. 1993; Sirohiwal et al. 1998; Martrille et al. 2007). Klepinger and colleagues (1992) assessed the performance of three pubic symphyseal aging techniques using the mean absolute deviation of true age from the scored interval mean falling within +/- 1 and +/- 2 standard deviations. Their study was based on a sample of 202 female and 116 male pubic symphyses collected at autopsy by Suchey from the Office of the Chief Medical Examiner-Coroner, County of Los Angeles, and Micozzi and Carroll from the Dade County Medical Examiner, Miami, Florida, respectively. Based on this research, Klepinger and colleagues (1992) concluded that among the Suchey-Brooks, Gilbert-McKern, and McKern-Stewart methods, the Suchey-Brooks standard is best for forensic casework. The Suchey-Brooks standard for estimating age from the pubic symphysis is considered one of the most reliable macroscopic age estimation methods (Telmon et al. 2005). Even compared with other standards based on the pubic symphysis, it is the least open to methodological criticism, for the reasons mentioned previously.


The Suchey-Brooks system is currently used as the worldwide standard for estimating age from the pubic symphysis (Kimmerle et al. 2008a). Although Hartnett (2007) and Berg (2008) have modified existing pubic symphyseal aging methods by adding an additional phase, Boldsen and colleagues’ (2002) Transition Analysis standard for scoring the pubic symphysis is the most recent original aging method developed. The most significant benefits of this new standard include finer detail in scoring morphological change by using a component approach and a more accurate estimated age distribution, particularly for those aged 50+. The latter is particularly important because the ability to estimate the age at death of older individuals has been elusive in the past.

Auricular surface

For Lovejoy and colleagues’ (1985b) auricular surface aging standard, neither Murray and Murray (1991) nor Osborne and colleagues (2004) detected significant differences between males and females or Blacks and Whites. This contrasts with Mensforth and Lovejoy’s (1985) findings, which indicate that the female auricular surface changes less rapidly than that of males. As mentioned in the previous section on the pubic symphysis, Schmitt (2004) studied a Thai sample of known age and sex. In addition to using the Suchey-Brooks standard, she also tested the Lovejoy and colleagues (1985b) iliac auricular surface method. Results were similar: both bias and inaccuracy increased with age, and the standard produced a systematic underestimation of chronological age. Osborne and colleagues (2004) tested Lovejoy and colleagues’ (1985b) auricular surface aging standard and did not find a significant difference between American samples drawn from the late 19th and early 20th Century Terry Collection and the later 20th Century Bass Documented Collection. Specifically, Osborne and colleagues (2004) tested the effects of age, sex, ancestry, and collection/time on auricular surface morphological change and found that only age influences the observed changes.

Strengths and weaknesses

Based on Murray and Murray’s (1991) conclusion that there are neither sex nor geographic differences in auricular aging, Konigsberg and Frankenberg (1992) argue that the auricular surface is useful in paleodemography because it satisfies the Uniformitarian assumption; this assumption is met because Murray and Murray (1991) find that both living and past human populations age in a similar manner at identical rates (Meindl & Russell 1998). In addition, the auricular surface itself is often the best-preserved indicator of age in archaeological remains (Aykroyd et al. 1999; Cox 2000) and as a result, Meindl and colleagues (1983) recommend that the auricular surface should be a primary method of age determination for these samples. When compared to other traditional phase-based standards, the auricular surface has been touted to be at least as accurate in predicting older ages as the pubic symphysis, if not better (Meindl & Lovejoy 1989; Saunders et al. 1992; Bedford et al. 1993). In fact, Meindl and colleagues (1983) claim that the overall correlation of age with the auricular surface is 0.72. One major drawback to Lovejoy and colleagues’ auricular surface aging standard is that many authors suggest that too much variability occurs within any given individual’s auricular surface morphology, making it difficult to classify those that are ambiguous (Buckberry & Chamberlain 2002). Even authors contributing to the development of the auricular surface aging standard state that it is difficult to interpret due to the complexity of the solely degenerative changes (Meindl & Lovejoy 1989). Predictably, Rogers (1990) could not replicate Lovejoy and colleagues’ results in older samples, and Jackes (1992) found significant interobserver disagreement, though this may be partially explained by the fragmentary nature of the indicator (Meindl & Russell 1998). Many investigators suggest that the method suffers from low repeatability and reliability (Murray & Murray 1991; Jackes 1992; Saunders et al. 1992; Molleson & Cox 1993; Hoppa 2000; Schmitt et al. 2002). But Galera and colleagues (1995) did not find significant interobserver error when they tested the method on 963 skeletons from the Terry Collection. Murray and Murray (1991) tested Lovejoy and colleagues’ (1985b) auricular surface aging method on 189 individuals from the Terry Collection to determine if the method was applicable to individuals in forensic contexts. The authors suggested that the age ranges provided by the standard, particularly for older individuals, were too large for forensic cases (Murray & Murray 1991). For example, those between the ages of 50-60 years were underestimated by approximately 10.5 years, and individuals older than 60 years were underaged by an average of 24 years. When tested on samples taken from the Bass Donated and Terry Anatomical collections, inaccuracy was lowest for the middle decades (40-69 year olds) and highest for 80-89 year olds (Osborne et al. 2004). Bias was positive (overestimating age) for the young, and negative for the old (Murray & Murray 1991; Saunders et al. 1992; Osborne et al. 2004; Mulhern & Jones 2005; Martrille et al. 2007). This is to be expected due to the statistical methodology used, because regression toward the mean19 returns results of systematic over- and under-aging (Konigsberg &

19 See discussion later in this chapter.


Frankenberg 1994; Konigsberg et al. 1997; Aykroyd et al. 1997; Nawrocki 1998; Aykroyd et al. 1999). Other tests of the auricular surface aging standard conducted by Murray and Murray (1991) and Santos (1996) discovered that the method governs the age estimates of target populations; instead of being characteristic of the target Terry and Coimbra samples, the estimates reflect the built-in biases of the method (Bocquet-Appel & Masset 1982; Jackes 2000). Other critiques of the Lovejoy and colleagues auricular surface aging standard emphasize its inability to allow for individual variation in skeletal aging (Bedford et al. 1993), narrow age ranges (Aykroyd et al. 1999; Buckberry & Chamberlain 2002; Schmitt 2004), and susceptibility to the effects of motion at the sacroiliac joint during pregnancy (Sashin 1930).
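The bias and inaccuracy figures reported in these validation studies can be illustrated with a short sketch. It assumes the definitions conventional in this literature, which are not spelled out in the text above: bias as the mean signed difference between estimated and actual age, and inaccuracy as the mean absolute difference. The function name and all ages below are hypothetical values chosen only to show the sign convention; they are not data from any of the cited studies.

```python
from statistics import mean

def bias_and_inaccuracy(estimated_ages, actual_ages):
    """Return (bias, inaccuracy) in years for paired age estimates.

    Assumed definitions (conventional, not taken from the text):
      bias       = mean(estimated - actual), signed
      inaccuracy = mean(|estimated - actual|), unsigned
    """
    if len(estimated_ages) != len(actual_ages) or not actual_ages:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [est - act for est, act in zip(estimated_ages, actual_ages)]
    bias = mean(errors)                           # > 0 overaging, < 0 underaging
    inaccuracy = mean([abs(e) for e in errors])   # average size of the error
    return bias, inaccuracy

# Hypothetical young individuals: estimates run older than the true ages.
young_bias, young_inaccuracy = bias_and_inaccuracy([27, 30, 33], [22, 25, 31])

# Hypothetical old individuals: estimates run younger than the true ages.
old_bias, old_inaccuracy = bias_and_inaccuracy([58, 61, 55], [72, 80, 66])

print(young_bias, young_inaccuracy)
print(old_bias, old_inaccuracy)
```

In this toy example the young group has a positive bias and the old group a negative one, which is the pattern of overaging the young and underaging the old described above. When every error in a group has the same sign, inaccuracy equals the absolute value of bias; the two measures diverge only when over- and under-estimates are mixed.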

Fourth rib

Sex-specific differences have been reported for the sternal rib ends, resulting in separate age ranges and/or different phase descriptions for male and female standards (İşcan et al. 1984; İşcan et al. 1985; Loth & İşcan 1989). In addition, İşcan and colleagues (1987) specifically designed their study to test for racial differences in the age estimation from the sternal end of the rib. They tested their fourth rib aging standard, which was developed from a White American population, on a sample of Black Americans from an autopsy sample from Broward County, Florida. While the authors acknowledge that the Black sample was limited in size and age range, they did find differences between races in both the rate and pattern of metamorphosis. After the age of 30 years, differences between Blacks and Whites were apparent; specifically, İşcan and colleagues (1987) reported that American Blacks were overaged in phases 5-7. In subsequent publications, Loth (1988) and Loth and İşcan (1989) noted differences in the rate and pattern of sternal rib aging specifically between Black and White females, with Black females appearing younger than their White counterparts. In contrast, Russell and colleagues (1993) observed slight, non-significant delays in sternal rib end changes for American Blacks compared to American Whites in a sample of males drawn from the Hamann-Todd collection. Similarly, Oettlé and Steyn (2000) found a tendency toward slower age-related morphological change in the sternal end of the ribs for South African Blacks compared to Americans. Oettlé and Steyn (2000) applied İşcan’s American rib aging standards to a large sample of Black males and females collected between 1994 and 1996 from Gauteng Province, South Africa. The morphology of the South African Blacks was such that the observed phases of rib age were younger than the expected phase according to chronological age. This effect may be the result of disease exposure, physical activity level, genetic and cultural differences, and socioeconomic and nutritional disparities (Sanders 1966; Oettlé & Steyn 2000). Similarly, Yavuz and colleagues (1988) noted that a modern White autopsy sample of Turks, which is geographically, genetically, and culturally different from the American White sample from which İşcan’s rib aging standard was devised, generally tended to attain phases later than Americans, though these differences were not statistically significant. As mentioned previously, Yoder and colleagues (2001) tested the applicability of İşcan and colleagues’ right fourth rib aging technique on left and right ribs 2-3 and 5-9. This study was undertaken as a response to critiques of İşcan’s rib aging method, which included difficulty applying the standard as a result of poor preservation of the sternal end and/or the inability to isolate the fourth rib. Yoder and colleagues (2001) found that most ribs provided age estimates similar to that of the fourth rib and that a composite score20 yields the same age as the fourth rib, suggesting that other ribs can be used with İşcan and colleagues’ age estimation method.

Strengths and weaknesses

As with all skeletal aging standards, reports documenting interobserver error rates are contradictory for İşcan’s fourth rib aging method; Dudar and colleagues (1993) found high interobserver error, but tests conducted by Galera and colleagues (1995) on 963 skeletons from the Terry Collection did not find significant interobserver error. Unsurprisingly, tests conducted by Loth and İşcan (1989) after developing the standard revealed minimal interobserver error and no significant difference in scoring by experience level. Similarly, Oettlé and Steyn (2000) determined that the standard’s repeatability was acceptable. In general, İşcan and colleagues’ fourth rib standard produces large standard error in age estimates (Rösing et al. 2007), and many osteologists have found that the method is not useful in isolation (Russell et al. 1993; Loth 1995; Aykroyd et al. 1999). In contrast, Ritz-Timme and colleagues (2000) recommend using İşcan and colleagues’ fourth rib standard for cadavers, skeletal remains, historic and archaeological cases, particularly for individuals under 40 years of age, as published standard errors are as low as +/- 2-4 years. While all traditional American aging standards have greater inaccuracy as chronological age increases, Martrille and colleagues (2007) found that İşcan’s fourth rib aging standard had the least inaccuracy of all the methods for older individuals

20 The average of an individual's phase scores for multiple ribs, omitting the fourth rib.


examined from the Terry Anatomical Collection; similar conclusions were reported by Saunders and colleagues (1992), Russell and colleagues (1993), and Dudar and colleagues (1993). Methodological bias results in the tendency to overestimate the age of the young and to underestimate the age of the old (Russell et al. 1993; Martrille et al. 2007); however, systematic biases in age estimates have been reported for İşcan and colleagues’ fourth rib aging technique: both a Canadian autopsy sample and a Canadian archaeological sample were consistently underaged (Dudar et al. 1993), and autopsied French individuals were consistently overaged (Baccino et al. 1999). Another concern in forensic and archaeological contexts is that it is difficult to isolate the fourth rib if the remains are incomplete or fragmentary (Kunos et al. 1999); however, this obstacle may be overcome by using a suitable, alternative rib (Loth et al. 1994; Yoder et al. 2001). Finally, although Kunos and colleagues (1999) believe that the rib cage is not subjected to physiological stress in the same way that pelvic indicators of age are, the authors do suggest that the lower ribs may be affected by mechanical stresses. Rösing and colleagues (2007) echo this concern, proposing that age changes in the sternal end of the ribs may be dependent on activity patterns.

Transition Analysis

Boldsen and colleagues (2002) noted differences in the point estimates and confidence intervals produced by Transition Analysis between the sexes, as well as between American Blacks and Whites, despite inputting the same observations. However, these differences have less of an impact if all components for all indicators are used to compute the age estimates. The authors attribute these sex- and race-based age estimate disparities to genetic differences and/or diverse lifetime experiences.

Strengths and weaknesses

For traditional phase-based aging methods, osteologists sometimes find it difficult to distinguish between two sequential stages in a particular bony feature, thus leaving the determination open to interpretation (Kimmerle et al. 2008b). This problem might arise because the particular skeletal feature is anomalous, modified through a pathological process, or damaged after burial (Dr. George Milner, personal communication, 2009). Phase-based methods have discrete age intervals, often with a constant width, to describe imprecision in estimations (Boldsen et al. 2002); but this approach assumes that all individual age estimates have the same degree of error (Boldsen et al. 2002). The benefit of using Transition Analysis is that it scores components of aging indicators separately, better reflecting the complex changes observed, and it allows every skeleton to have its own degree of error, which is dependent on its particular suite of traits (Boldsen et al. 2002). Another benefit is that the ADBOU Age Estimator program allows the observer to record a score between phases (e.g., 2-3) if he or she is unsure for a certain component. Kimmerle and colleagues (2008b) support the continuous nature of categories and the need for Transition Analysis in calculating age at death estimates. The point estimates and confidence intervals for age vary among individuals because the suite of traits is different for each; unlike traditional phase-based methods, Boldsen and colleagues’ Transition Analysis is designed to analyze trait data and optimize the output. Aykroyd and colleagues (1999) argue that by using Bayesian analyses, Transition Analysis reduces the trend of underestimating the age of the old and performs better than traditional methods by providing smaller average differences between predicted and actual age, as well as a smaller 95% confidence interval for the point estimate. Bayesian prediction and maximum likelihood estimates of age are better because they are calculated from a combination of stages for different variables and priors can be generated from the collection at hand. The maximum likelihood age estimates produced by Transition Analysis are independent of the distribution in the original reference sample (Boldsen 1997; Aykroyd et al. 1999; Schmitt et al. 2002). As a result, it appears that age estimates produced by Transition Analysis using an appropriate informed prior are not subject to mimicry as are traditional, regression analysis aging standards (Boldsen et al. 2002). However, Bayesian estimation also has limitations. In a study conducted by Aykroyd and colleagues (1999), it produced systematic bias; it was not determined if the bias was inherent in the method or was the result of the dataset tested. According to its authors, Transition Analysis using pubic, auricular, and cranial indicators should perform the best because it allows for the different rate of change in components of the indicator and it combines all data to get the most accurate estimate (Boldsen et al. 2002); formal testing by Boldsen and colleagues (2002) confirmed this result, finding a 0.88 correlation of the estimate with chronological age. Boldsen and colleagues’ Transition Analysis using pubic symphysis scores alone performed nearly as well, with a correlation of 0.86, followed by the auricular surface (0.82) and cranial sutures (0.66) (Boldsen et al. 2002). The primary critique of this aging standard is that the reference collection used by Boldsen and colleagues (2002) includes three Black females over the age of 90; Hoppa and Vaupel (2002) argue that these individuals, who are part of the Terry Anatomical


Collection, probably died at younger ages, and should be excluded from future analyses. Other critiques are not specific to this method; as with other skeletal aging standards, the results from Transition Analysis include broader age ranges for older individuals than for younger ones. The likelihood curves produced by Boldsen and colleagues’ Transition Analysis become broader as age increases, indicating the error associated with estimating age of older individuals. This may result in part because many components, like the dorsal margin of the pubic symphysis, progress quickly through stages in early adulthood then plateau or only change slightly later on (Boldsen et al. 2002).
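The general logic described above, in which components are scored separately and combined into a single likelihood over age, can be sketched in a deliberately simplified form. This is a toy illustration, not the ADBOU Age Estimator and not Boldsen and colleagues' actual model: it assumes each component has only two stages, that the probability of having passed the transition follows a logistic curve in age, that components are conditionally independent given age, and that the prior over age is uniform unless one is supplied. The component names come from the text, but every midpoint and slope value is hypothetical.

```python
import math

def p_transitioned(age, midpoint, slope):
    """Probability that a component has passed its transition by `age`.

    A logistic curve is assumed here purely for illustration.
    """
    return 1.0 / (1.0 + math.exp(-slope * (age - midpoint)))

def age_posterior(observations, components, ages, prior=None):
    """Posterior probability of each candidate age given component scores.

    observations: dict of component name -> 0 (early stage) or 1 (late stage)
    components:   dict of component name -> (midpoint, slope), hypothetical values
    ages:         list of candidate ages in years
    prior:        optional list of prior weights, one per age (uniform if None)
    """
    if prior is None:
        prior = [1.0] * len(ages)
    weights = []
    for age, prior_weight in zip(ages, prior):
        likelihood = 1.0
        for name, score in observations.items():
            midpoint, slope = components[name]
            p = p_transitioned(age, midpoint, slope)
            # Multiply likelihoods: assumes components are independent given age.
            likelihood *= p if score == 1 else (1.0 - p)
        weights.append(likelihood * prior_weight)
    total = sum(weights)
    return [w / total for w in weights]

def most_probable_age(posterior, ages):
    """Candidate age with the highest posterior probability."""
    return ages[max(range(len(ages)), key=posterior.__getitem__)]

# Hypothetical transition parameters: (midpoint age, slope).
COMPONENTS = {
    "ventral_rampart": (25.0, 0.40),
    "dorsal_margin": (30.0, 0.30),
    "symphyseal_rim": (55.0, 0.15),
}
AGES = list(range(15, 101))

# One skeleton: two early transitions completed, the late one not yet.
scores = {"ventral_rampart": 1, "dorsal_margin": 1, "symphyseal_rim": 0}
posterior = age_posterior(scores, COMPONENTS, AGES)
print(most_probable_age(posterior, AGES))
```

Because the posterior is computed from each skeleton's own combination of scores, its width differs from one individual to the next, which is the sense in which every skeleton has its own degree of error. Passing a non-uniform `prior` list shows how an informed prior shifts the estimate.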

Critiques of estimating age from the adult skeleton

The estimation of age at death from the skeleton is subject to a number of concerns, including the inherent variation in the aging process, methodological issues, and statistical problems.

Inherent variation in the aging process

Skeletal biologists need to understand the fundamental biological process of aging in the human skeleton because the inherent variation in that process is a primary source of error for all current age estimation standards (Bocquet-Appel & Masset 1982; Lovejoy et al. 1997; Hoppa 2000; Schmitt et al. 2002). This complex variability in the process of skeletal aging (Ferenbach et al. 1980; Maples 1989; Stini 1994; Schmitt et al. 2002) is the nature of human senescence, complicating the estimation of chronological age from biological/skeletal indicators (Spirduso 1995). Senescent changes in bone are degenerative, not developmental, and as a result they are far more variable, producing age estimates with what Boldsen and colleagues (2002) argue is a considerable degree of error; the degeneration of the body and skeleton is complex and does not occur at the same rate for all individuals (Meindl et al. 1983; Wittwer-Backofen et al. 2008; Samworth & Gowland 2007) or even for multiple indicators within an individual (Kemkes-Grottenthaler 2002). The skeleton is more plastic than is generally assumed (Stewart 1957; Kemkes-Grottenthaler 2002), and it participates in the overall metabolism of the organism (Acsádi & Nemeskéri 1970). As a result, bone responds to internal and external influences by changing its morphology. Individual aging is influenced by a number of factors (Krogman 1970; Ferenbach et al. 1980; Angel & Caldwell 1984; İşcan & Loth 1985; İşcan et al. 1987; Vaupel et al. 1998; Jackes 2000), and is determined by continual interactions among genes, culture, and environment (Arking 1998; Schmitt et al. 2002). The skeletal aging process is influenced by many variables, including genetic factors (Angel 1984; Kemkes-Grottenthaler 2002), growth (Tanner 1962; Bogin 1999), diet/nutrition (Acsádi & Nemeskéri 1970; Angel 1984; Plato et al. 1994; Samworth & Gowland 2007), living conditions (Acsádi & Nemeskéri 1970), health status/disease (Steinbock 1976; Acsádi & Nemeskéri 1970; Angel 1984; Ortner 2003), occupation (Trotter 1937; Ubelaker 1979; Kennedy 1989; Plato et al. 1994; Samworth & Gowland 2007), lifestyle and physical activity levels (Stewart 1980; Angel 1984; Plato et al. 1994; Samworth & Gowland 2007), environmental changes (Acsádi & Nemeskéri 1970; Kemkes-Grottenthaler 2002), biomechanics, endocrine function (Krogman 1970; Maples 1981), bone mineral density (Arking 1998), substance abuse (Saville 1965; Wolf 1981; Angel 1984), and other factors (Buckberry & Chamberlain 2002; Prince 2004).

There is strong evidence for low individual variability of morphological aging criteria during young ages, but variability among individuals increases as they age due to their unique interactions with the environment, as well as the accumulation of genetic and behavioral influences (Harper & Crews 2000; Wittwer-Backofen et al. 2008). In fact, numerous authors have noted that traditional osteological age indicators typically do not accurately estimate age for older individuals, finding that the older the individual, the greater the error in the age estimate (Angel 1984; Rösing et al. 2007; Wittwer-Backofen et al. 2008). This inaccuracy is reflected in the broad age ranges (i.e. 50+ years) produced by several standards for older individuals. This problem is so severe that several prominent anthropologists have argued that it may be impossible to age older individuals with any precision using current methods (Hanihara & Suzuki 1978; Suchey et al. 1986; Meindl & Lovejoy 1989; Milner et al. 2000; Boldsen et al. 2002; Berg 2008).

Methodological problems

Multiple skeletal age indicators can be correlated with each other, meaning that the information provided by each indicator is not independent of the others (Boldsen et al. 2002). If age indicators are related or closely linked, the standards will reinforce a central tendency rather than providing a better (or more precise) age estimate (Jackes 2000). In contrast, Boldsen and colleagues (2002) suggest that if the correlation of skeletal traits is attributable purely to age, then the indicators are independent once age is controlled for (Boldsen 1997). This assumption of “conditional independence” (Boldsen 1997) may work well for senescent changes in skeletal morphology, as the mutation accumulation mechanism, one theory of aging, suggests (Rose 1991).

Boldsen and colleagues (2002) found this assumption to hold true for the Terry Collection.

Osteological age indicators have been developed and tested in a variety of contexts, and the summation of these results indicates a high degree of variation in the accuracy of age prediction (Meindl & Russell 1998). Bocquet-Appel and Masset (1982) argue that inaccuracy and unreliability are inherent in all osteological aging standards as a result of the low correlation between skeletal features and chronological age.

Bias and inaccuracy should also be a source of concern for investigators. Bias is defined as the sum of the differences between estimated age and chronological age, divided by the number of individuals. Bias is directional, so the sign is important. Inaccuracy is defined as the sum of the absolute values of those differences, divided by the number of individuals; here, the sign is not important (Meindl & Russell 1998; Kemkes-Grottenthaler 2002). All of the standards tested in this research have well-documented inaccuracy that increases with age (Buikstra & Konigsberg 1985; Lovejoy et al. 1985a; Katz & Suchey 1986; Murray & Murray 1991; Dudar et al. 1993; Bedford et al. 1993; Santos 1996; Nagar & Hershkovitz 2004), though Meindl and Russell (1998) argue that using multiple age estimation standards minimizes this problem. Unfortunately, bias is harder to deal with because certain methods work well for particular ages but not others; this is due to the influence of the reference population's age structure, which will be discussed in more detail below.

Finally, advanced training, and more importantly experience, is required for the accurate application of osteological age estimation methods (Ritz-Timme et al. 2000;

Rösing et al. 2007). Without the proper application of the aging standard to any given target sample, researchers should be skeptical of the results presented.
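
The bias and inaccuracy measures defined above (Meindl & Russell 1998) amount to a mean signed error and a mean absolute error. A minimal sketch in Python follows; the function names and the four estimated/actual age pairs are hypothetical, chosen only for illustration, and are not drawn from any sample in this research.

```python
from typing import Sequence


def _check(estimated: Sequence[float], actual: Sequence[float]) -> None:
    """Both inputs must be non-empty and describe the same individuals."""
    if len(estimated) != len(actual) or len(actual) == 0:
        raise ValueError("need two non-empty sequences of equal length")


def bias(estimated: Sequence[float], actual: Sequence[float]) -> float:
    """Sum of (estimated - actual) divided by n; the sign shows direction."""
    _check(estimated, actual)
    return sum(e - a for e, a in zip(estimated, actual)) / len(actual)


def inaccuracy(estimated: Sequence[float], actual: Sequence[float]) -> float:
    """Sum of |estimated - actual| divided by n; the sign is discarded."""
    _check(estimated, actual)
    return sum(abs(e - a) for e, a in zip(estimated, actual)) / len(actual)


# Hypothetical ages in years (illustration only).
estimated_ages = [30.0, 45.0, 52.0, 60.0]
actual_ages = [25.0, 40.0, 58.0, 75.0]

print(bias(estimated_ages, actual_ages))        # -2.75
print(inaccuracy(estimated_ages, actual_ages))  # 7.75
```

In this made-up example the overestimates of the two younger individuals partly cancel the underestimates of the two older ones, so the bias (-2.75 years) is much smaller in magnitude than the inaccuracy (7.75 years). This is why the two measures are reported separately.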

Statistical problems

Statistical problems also thwart the process of age estimation from skeletal indicators (Samworth & Gowland 2007). Two prominent, related problems include regression toward the mean and age structure mimicry.
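
As a hedged illustration of the first of these problems, the simulation below regresses known age on a noisy indicator score and then checks the direction of the errors at each end of the age range. Everything in it is an assumption made for the sketch: the uniform 20 to 90 year age range, the continuous score (real standards use ordinal phases), the noise level, and the helper names.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical reference sample: known ages and an indicator score that
# tracks age only loosely (score = age + random noise).
ages = [rng.uniform(20, 90) for _ in range(2000)]
scores = [age + rng.gauss(0, 15) for age in ages]

# Regress age on the indicator, then convert each score to a predicted age.
slope, intercept = fit_line(scores, ages)
predicted = [intercept + slope * s for s in scores]


def mean_error(low, high):
    """Mean of (predicted - actual) for individuals with low <= age < high."""
    errors = [p - a for p, a in zip(predicted, ages) if low <= a < high]
    return sum(errors) / len(errors)


young_bias = mean_error(20, 35)  # positive: the young are aged too old
old_bias = mean_error(75, 91)    # negative: the old are aged too young
print(round(slope, 2), round(young_bias, 1), round(old_bias, 1))
```

Under these assumptions the fitted slope is below 1, so every prediction is pulled toward the mean age of the reference sample. The weaker the relationship between score and age (a larger noise term here), the stronger the pull.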

Regression toward the mean

Regardless of the indicator used, most current American age estimation standards tend to overestimate the ages of the young and underestimate the ages of the old in target samples (Murray & Murray 1991; Russell et al. 1993; Osborne et al. 2004; Djuric et al. 2007; Hartnett 2007; Martrille et al. 2007; Berg 2008). Numerous authors suggest that these systematic errors are the result of regression toward the mean (Katz & Suchey 1986; Konigsberg & Frankenberg 1994; Konigsberg et al. 1997; Aykroyd et al. 1997; Aykroyd et al. 1999; Schmitt et al. 2002; Rösing et al. 2007). Most traditional phase-based standards use linear regression to correlate the morphology of the indicator with chronological age. The calculated regression line equation is then applied to convert scores to a predicted age estimate (Schmitt et al. 2002). With this technique, though, lower correlations between the skeletal indicator and chronological age result in greater bias (Aykroyd et al. 1997; Aykroyd et al. 1999). If age at death is regressed on a set of skeletal age indicators, the output is an estimate of age for each value of the indicator; while this is what researchers want, these estimates are influenced by the age composition of the reference sample. For example, if the young
in the reference sample outnumber the old, then archaeological samples aged using techniques based on this reference sample look younger than their actual age, feeding into the belief that past populations had much shorter life spans.

Age structure mimicry

One of the most significant sources of error in skeletal age estimation arises from problems with the reference collection on which the methods are based (Masset 1971; Bocquet-Appel & Masset 1982; Masset 1989). Documented skeletal reference series have distorted age compositions and/or selection criteria (Wittwer-Backofen et al. 2008), often over-representing older, non-Hispanic White individuals, particularly in collections established in the latter half of the 20th Century. The published claims of systematic errors in age estimation are likely due in part to differences between reference samples and target populations. Although Masset (1971) noted differences in the underlying age structures of a method’s reference sample and target groups, it was Bocquet-Appel and Masset’s (1982) seminal work that raised concerns about methodological problems in paleodemography and sparked heated debate in the anthropological community. The authors’ critiques, which have been reiterated in later works (Bocquet-Appel & Masset 1985; Masset 1989, 1993; Bocquet-Appel 1994; Bocquet-Appel & Masset 1996), focused on two points: first, that methods of age estimation were too imprecise and biased to produce usable results for demographic analyses; and second, that aging methods reflect the age structure of the reference sample. The latter problem, age structure mimicry (Mensforth 1990), results in the replication of the reference sample mortality profile in target samples and unreliable age estimators, with error increasing when indicators correlate poorly with age (Bocquet-Appel & Masset 1982; Konigsberg et al. 1994; Meindl & Russell 1998; Kimmerle et al. 2008a). Schmitt and colleagues (2002) summarized the problem of age-structure influences effectively. Each stage of a particular standard has a corresponding mean age that was calculated from the reference population. The mean age of each stage depends greatly on the overall age structure of the reference population, and as a result, when the standard is applied to a target archaeological sample, its distribution will be similar to that of the reference sample. A study by Gillett (1991), comparing age estimates for an archaeological site on the eastern shore of the San Francisco Bay produced by the Todd and Suchey-Brooks pubic symphysis standards, clearly illustrates the problem of age structure mimicry. When Gillett (1991) looked at predicted survivorship and probability of death, the Todd standards resulted in a lack of representation of older individuals (40+ years). The Todd standard predicted that the chance of survival for these individuals sharply declined at age 40, but the Suchey-Brooks standard predicted that the chance of survival would steadily decline in the older ages (Gillett 1991). The difference observed between the two predicted population distributions reflects the difference in phase age limits between the two methods, especially for upper decades (Gillett 1991). Gillett (1991) concluded that the usual assumption that prehistoric Californians died by the age of 50-55 is likely a reflection of the standard from which their ages were estimated. This problem also exists for the application of osteological aging standards on an individual level in forensic contexts (Schmitt et al. 2002), but to a lesser extent (Komar & Buikstra 2008). Bocquet-Appel and Masset argued that imprecise age estimates and mimicry preclude the ability to answer fundamental questions of interest in target groups, and

thus, declared the death of paleodemography. Another critique of paleodemography showed that the inverse of the mean age at death is approximately equal to the crude birth rate in a non-stationary population (Sattenspiel & Harpending 1983); this argument suggested that traditional paleodemographic data was more informative about fertility than mortality in populations with an unknown growth rate, a conclusion that was counterintuitive to most researchers at the time. Several years later, Buikstra and colleagues (1986) used regression analysis to illustrate that birth rate was indeed more highly correlated with the death proportion than the crude death rate in non-stationary populations. In contrast to the grim outlook posited by Bocquet-Appel and Masset (1982), other researchers countered that the “death” of paleodemography was exaggerated and extreme (Van Gerven & Armelagos 1983; Buikstra & Konigsberg 1985; Greene et al. 1986; Konigsberg & Frankenberg 1992, 1994). Van Gerven and Armelagos (1983) argued that skeletal samples do not invariably reflect the structure of their reference populations and that age estimates do not produce the random fluctuations predicted by Bocquet-Appel and Masset’s (1982) a priori criteria for age estimation. Buikstra and Konigsberg (1985) argued that Bocquet-Appel and Masset’s (1982) critiques were extreme, but they acknowledged that the imprecision in age indicators, particularly for older adults, and interobserver error remain significant problems. Beginning in the mid-1980s, researchers attempted to address some of the critiques of paleodemography by testing alternative methods. Jackes (1985) suggested that the mean and standard deviation for age at death within pubic symphyseal phases could be used to probabilistically assign ages at death in the target sample. She used the

normal distributions of age within phases to get smooth distributions, but Frankenberg and Konigsberg (2002, 2006) have criticized her methodology. The authors argue that it is unlikely that age is normally distributed within stages and that the age distribution is still dependent, in part, on the reference sample. Another development was the application of a hazards model to paleodemography as an alternative to using the traditional life table approach (Gage & Dyke 1986; Gage 1988; Wood et al. 1992). The benefits to hazards analysis include the replacement of the life table values l(x) with a survivorship function21, d(x) with a smooth function22, and q(x) with a hazard rate. Hazards models allow for variation in the age ranges that are assigned to individual skeletons and can be adjusted for growth rate, assuming some estimate of the growth rate is available (Asch 1976; Milner et al. 2000). However, Frankenberg and Konigsberg (2006) stress that hazards models alone do not circumvent the non-stationarity problem. Konigsberg and Frankenberg (1992) stated that the perceived paucity of older individuals in archaeological samples was the result of using inappropriate methods of age estimation; the authors suggested the solution to this age structure mimicry problem was to use maximum likelihood estimates of life table or hazard functions to incorporate the uncertainty of age estimates. They argued that when age is estimated rather than known, the traditional method of assigning individuals to age classes will produce biased estimates of age structure; this bias was known in other areas of study and a potential solution was drawn from the fisheries literature. The "iterated age-length key" uses a contingency table of an age indicator against known age classes in a reference sample to

21 The probability of survival to age “a.”
22 The probability density function of age at death.

infer the age at death structure in a target sample. In addition, Konigsberg and Frankenberg (1992) suggested a future course for research that included the revision and development of new aging methods and the incorporation of uncertainty of age estimates in parameters for life tables using maximum likelihood or hazards methods. However, Konigsberg and Frankenberg did not anticipate the use of ordinal parametric models like logistic or probit regression to describe the development of age indicators that are phase-based (Skytthe & Boldsen 1993). Boldsen brought this approach, termed transition analysis because it models the age at transition between indicator phases, to paleodemography. Subsequently, the Transition Analysis aging method was introduced (Milner et al. 2000; Boldsen et al. 2002). In 1996, Bocquet-Appel and Masset presented their results using the iterative proportional fitting procedure (IPFP) in simulations; the authors applied sets of conditional probabilities in order to estimate the age at death structure for the target sample. Bocquet-Appel and Masset (1996) argued that their method differed from that outlined in Konigsberg and Frankenberg (1992); however, Frankenberg and Konigsberg (2002, 2006) disagreed. In 2002, Konigsberg and Frankenberg compared the two and found that the methods were essentially the same statistic; they argued that both methods used on the same data should produce the same results. Bocquet-Appel and Masset’s (1996) test produced differing results, a conclusion that Konigsberg and Frankenberg (2002) do not accept. Konigsberg and Frankenberg (2002) also criticized Jackes’ (2000) use of the IPFP; in this case, they argued that Jackes violated the statistical assumptions of the model by using more age groups than there were morphological stages in the aging method.
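
The idea behind the iterated age-length key described above can be sketched in a few lines. The version below is an assumption-laden illustration rather than the procedure published by Konigsberg and Frankenberg (1992) or by Bocquet-Appel and Masset (1996): it uses a generic expectation-maximization style update, and the function name, the 3 x 3 table of P(stage | age class), and the target counts are all hypothetical.

```python
def iterated_age_key(p_stage_given_age, target_stage_counts, iterations=500):
    """Estimate target age-class proportions from stage counts.

    p_stage_given_age[a][s] is the reference-sample probability of
    observing stage s in age class a (each row sums to 1).
    target_stage_counts[s] is the number of target skeletons in stage s.
    """
    n_ages = len(p_stage_given_age)
    total = sum(target_stage_counts)
    # Start from a flat age structure rather than the reference sample's.
    proportions = [1.0 / n_ages] * n_ages
    for _ in range(iterations):
        updated = [0.0] * n_ages
        for s, count in enumerate(target_stage_counts):
            # Probability of stage s under the current age structure.
            denom = sum(proportions[a] * p_stage_given_age[a][s]
                        for a in range(n_ages))
            if denom == 0.0:
                continue
            # Share the skeletons in stage s among the age classes.
            for a in range(n_ages):
                updated[a] += (count * proportions[a]
                               * p_stage_given_age[a][s] / denom)
        proportions = [u / total for u in updated]
    return proportions


# Hypothetical reference table: rows are young, middle, old age classes.
reference_table = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]
# Stage counts for 100 target skeletons, constructed so that they match
# a true age structure of 20% young, 30% middle, 50% old exactly.
target_counts = [25, 37, 38]

estimate = iterated_age_key(reference_table, target_counts)
print([round(p, 3) for p in estimate])  # converges toward [0.2, 0.3, 0.5]
```

The point of the sketch is that only the conditional table P(stage | age) is borrowed from the reference sample; the age structure itself is re-estimated from the target counts, which is what allows the result to depart from the reference sample's mortality profile.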

Amidst the decades of conflict and controversy, a collaborative effort to advance the field emerged. A workshop of invited researchers came together at the Max Planck Institute for Demographic Research in Rostock, Germany, to focus on biostatistical methods and adult aging techniques within the scope of paleodemography (Hoppa & Vaupel 2002). The workshop provided the attendees with an identical dataset on which to test their techniques. One of the most significant outcomes was the realization that the theoretical framework in which varying statistical methods were placed was critical. The Rostock Manifesto is the theoretical approach adopted by the researchers. The Manifesto calls for (1) the development of more reliable and more vigorously validated age indicator stages relating skeletal morphology to known chronological age; (2) the use of a multidisciplinary approach to develop models and methods to estimate the probability of observing a suite of skeletal characteristics “c,” given known age “a”; (3) the recognition by osteologists that what is of interest in paleodemographic research is the probability that the skeletal remains are from a person who died at age “a,” given the evidence concerning “c”23; and (4) the calculation of the probability distribution of lifespans in the target population first, before individual estimates of age (Hoppa & Vaupel 2002). The outcomes of this collaborative effort are published in a text edited by Hoppa and Vaupel (2002); within, numerous North American and European researchers present their approaches to solving problems in paleodemography.

Despite significant contributions in the past quarter century, researchers are still struggling with accurate adult age estimation from skeletons (Storey 2007). This is particularly problematic for estimating the ages of older individuals, as evidenced in the

23 Note that this is different from the probability referred to in the previous clause.

recent work by Wittwer-Backofen and colleagues (2008). In their study, thirteen independent observers24 using a variety of aging techniques analyzed the skeletal remains of 121 adults from Lauchheim, an early medieval cemetery. The age ranges and mean age estimations were compared, and the results indicated smaller age ranges for younger individuals and broader age ranges for older age groups, regardless of the method used. In summary, to some degree, all age estimates derived from conventional, phasebased methods are affected by mimicry and imprecision. However, the maximum likelihood age estimates produced by Transition Analysis are independent of the age distribution in the original reference sample (Boldsen 1997) and should not be subject to mimicry (Boldsen et al. 2002).
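
The contrast drawn in this summary can be made concrete with Bayes' theorem. The sketch below is an illustration under stated assumptions, not an implementation of Transition Analysis: the six age classes, the table of P(stage | age class), the two reference age structures, and the function names are all hypothetical. It compares the mean age assigned to one stage under two differently structured reference samples with the age class that maximizes the likelihood of that stage.

```python
# Hypothetical age classes (midpoints, in years) and a made-up table of
# P(stage | age class) for a three-stage indicator; each row sums to 1.
AGE_MIDPOINTS = [25, 35, 45, 55, 65, 75]
P_STAGE_GIVEN_AGE = [
    [0.8, 0.2, 0.0],
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
    [0.0, 0.3, 0.7],
    [0.0, 0.2, 0.8],
]

# Two assumed reference-sample age structures (proportion per age class).
YOUNG_REFERENCE = [0.30, 0.30, 0.20, 0.10, 0.05, 0.05]
OLD_REFERENCE = [0.05, 0.05, 0.10, 0.20, 0.30, 0.30]


def mean_age_in_stage(stage, age_structure):
    """Mean age of reference individuals in a stage, via Bayes' theorem."""
    weights = [p * row[stage]
               for p, row in zip(age_structure, P_STAGE_GIVEN_AGE)]
    return sum(a * w for a, w in zip(AGE_MIDPOINTS, weights)) / sum(weights)


def max_likelihood_age(stage):
    """Age class maximizing P(stage | age); no age structure is used."""
    likelihoods = [row[stage] for row in P_STAGE_GIVEN_AGE]
    return AGE_MIDPOINTS[likelihoods.index(max(likelihoods))]


middle_stage = 1
mean_young_ref = mean_age_in_stage(middle_stage, YOUNG_REFERENCE)
mean_old_ref = mean_age_in_stage(middle_stage, OLD_REFERENCE)
ml_age = max_likelihood_age(middle_stage)
print(round(mean_young_ref, 1), round(mean_old_ref, 1), ml_age)
```

With these invented numbers the same stage is assigned a mean age of roughly 41 years under the young reference sample and roughly 58 years under the old one, which is the mimicry effect. The maximum likelihood age class (45) is the same in both cases because it depends only on P(stage | age).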

Summary

Hoppa and Saunders (1998) suggest that anthropologists only have a basic knowledge of population differences for skeletal age change. Conflicting results regarding the influence of sex and race on age estimation have emerged from the published osteological research. The anthropological literature demonstrates that the relationship between the skeletal indicator and chronological age varies among samples drawn from different geographical regions (Schmitt 2004), implying that a single standard of senescence for populations of different origins is not appropriate (Hoppa 2000). These population differences may actually be the result of diverse genetic backgrounds, behaviors, environments, or other factors. Secular trends could have a significant impact on age estimation if, for instance, contemporary individuals skeletally

24 The observers were often the developers of the method they scored.

mature earlier or senesce at different rates than their historical counterparts (Klepinger 2001). Klepinger (2001) found a secular trend for increasing childhood obesity, which was associated with a trend for accelerated skeletal maturation; though according to Klepinger (2001) this secular trend is real, the author determined that it is negligible for the estimation of age, simply contributing to the noise of population variation. Only a few recent studies have directly addressed the issue of a secular trend in the rate of senescence of joint surfaces, and the results are contradictory. American aging standards do not appear to be uniformly applicable to all target populations worldwide. But does this problem exist for American target samples differing in genetic background, environmental factors, and time, from those used to develop American aging standards? Currently, American aging standards are applied to prehistoric, historic, and forensic samples alike, despite the fact that the standards are developed from samples primarily composed of individuals living during the 19th and early 20th Centuries, which are not even representative of the populations from which they were derived. If secular change, genetic and environmental differences, or other factors strongly influence the aging process of the human skeleton, these standards may not be appropriate for age estimation of target samples differing from the reference sample, even if they belong to the same population. As a result, age estimates for target samples may not be reliable or accurate, impacting downstream analyses in bioarchaeological and paleodemographic endeavors as well as having legal implications for the admissibility of forensic anthropological evidence in court. Rules for the admissibility of scientific evidence require publication of the method in peer-reviewed journals as well as an assessment of the validity and reliability

of the standard. Interestingly, forensic tests of aging standards have focused on the application of American standards to foreign target samples, likely due to the large scale investigations of mass graves and necessity to age Eastern European samples where no population-specific standards exist. It is unclear if American aging standards perform as well on recent American target samples as they do for the reference populations from which they were developed. To date, no large-scale investigation of this question has been undertaken using documented American skeletal samples. To address this problem, this dissertation research will explicitly compare large documented skeletal samples drawn from older American reference series and more recent documented American osteological collections to determine whether these samples age at a different rate. American Blacks and Whites of both sexes were drawn from the Terry Anatomical, Hamann-Todd Osteological, Bass Donated, Maxwell Museum Documented, and the Maricopa County Forensic Science Center autopsy collections so that differences between older and more recent groups can be explored. Morphological indicators of age were scored according to the following standards to determine if changes are absent, ubiquitous, patterned, or random in nature: Todd (1920) pubic symphysis; Suchey and Brooks (1990) pubic symphysis; Hartnett and Fulginiti (2007) pubic symphysis; Lovejoy and colleagues (1985) auricular surface; İşcan and Loth (1986) sternal end of the fourth rib; Meindl and Lovejoy (1985) cranial sutures; and Boldsen and colleagues (2002) pubic symphysis, auricular surface, and cranial sutures.

Chapter 4
Research Design

The goal of this research is to determine whether older American skeletal series progress through the senescent changes of skeletal indicators at a different rate than more recent ones. The answer to this question determines the validity of the current practice of universally applying American aging standards to all American skeletal series, regardless of differences in sex, race, genetic background, living conditions, health status, time, and geographic region. If differences in aging exist between older and more recent American skeletal samples, then existing standards, many of which are based on these older samples, will not produce reliable age estimates for forensic or archaeological remains.

The data used to address this question include demographic data and scores of the morphology for four skeletal indicators according to phases/stages defined by seven established American aging standards. Four osteological age indicators were examined: the pubic symphysis, auricular surface, sternal end of the fourth rib, and cranial sutures. American aging standards tested include the Todd, Suchey-Brooks, Hartnett-Fulginiti, and Boldsen and colleagues pubic symphysis methods, the Lovejoy and colleagues and Boldsen and colleagues auricular surface methods, the İşcan and colleagues sternal rib end method, the Meindl and Lovejoy and Boldsen and colleagues methods for scoring cranial suture closure, and the Boldsen and colleagues methods combining the pubic symphysis, auricular surface, and cranial suture indicators. These data were recorded for nearly one thousand sets of remains drawn from five documented American skeletal collections.

Materials

Nine hundred and seventy-one sets of adult human skeletal remains, aged 20 years or older, were examined to test whether American osteological aging standards are universally applicable to diverse American target samples. The dataset included the remains of Blacks and Whites drawn from the Terry, Hamann-Todd, Bass Donated, and Maxwell Museum collections, as well as samples curated at the Maricopa County Forensic Science Center. These collections comprise skeletal remains collected for over a century from autopsy, cadaver, unclaimed, and donated bodies. These collections are an ideal data source for this dissertation research because, with the caveats previously described, the remains have known sex, age, race, and date of birth and/or death information. Morphology of the pubic symphysis, auricular surface, sternal rib ends, and cranial sutures was examined for a sample of individuals born between the early 19th and late 20th Centuries. Analyses were limited to Blacks and Whites because the selected collections have comparatively few individuals from other groups. Nonetheless, these two groups are important in historic research and contemporary forensic analyses in the United States, and the results will clarify the potential impact of often-overlooked variables on age estimation in other American groups.

The skeletal series utilized in this project can be roughly divided into older “Reference” samples and more “Recent” collections. This is done for two reasons: first, to distinguish samples that have traditionally been used as the references for traditional American aging standards from those that have not; and second, to maximize sample sizes such that a wider range of variation can be examined, particularly for Blacks. For this research, the older Reference American skeletal samples have an average birth year of
1878, with a range of 1828-1943. Many of these individuals lived through tumultuous times, including the American Civil War and Reconstruction Period; living standards declined after the war, particularly for those living in the South (Carson 2006). The more Recent American skeletal samples have an average birth year of 1939, with a range of 1889-1985; many of these individuals benefited from the first commercially available antibacterial antibiotic, which was available in the early 1930s. While vaccines for smallpox and plague were available in the 19th Century, the 20th Century saw the development of and widespread immunization for many more infectious diseases, including cholera, typhoid, tuberculosis, influenza, polio, measles, mumps, and rubella (Centers for Disease Control and Prevention 2006). Some members of each group experienced the Great Depression; this event may have significantly affected the health of these individuals, as well as the quality and quantity of food consumed. These samples provide a relatively diverse group and long temporal continuity of American documented skeletal remains, which is important to effectively test the research hypotheses. The Reference series serve as the source population for the development of many of the osteological aging standards tested during the data analysis phase of this research. Specific information for each collection regarding collection strategy, source populations, and biases are presented below.

Reference Collections

Anatomical Collections

Anatomists, not physical anthropologists, pioneered the first collections of entire human skeletons. As a result, older collections were primarily drawn from anatomy
school dissecting rooms and unclaimed morgue remains. The majority of the individuals included in these samples were born in the early 19th through the early 20th Centuries.

Hamann-Todd Collection (HTH)

As described in Chapter 2, the Hamann-Todd Osteological Collection is the product of two of the great anatomists of the early 20th Century: Carl August Hamann and Thomas Wingate Todd (Quigley 2001). The collection contains approximately 3,700 skeletons (Cobb 1981; Moore-Jansen 1989) and is the single largest comparative anatomical collection in the United States (Moore-Jansen 1989). It is currently housed at the Laboratory of Physical Anthropology within the Cleveland Museum of Natural History. The series contains the remains of American Whites and Blacks primarily from the Cleveland area, with the remainder from elsewhere in Ohio. These individuals were born between 1823 and 1934. The collection contains the remains of individuals from the anatomy department’s dissection labs, unclaimed or indigent burials, as well as those willing their bodies to science and research (Usher 2002). Documentation accompanying the remains includes age at death, race, sex, date of birth and/or date of death, as well as place of birth, occupation, and cause of death when available (Cobb 1959; Thompson 1982; Moore-Jansen 1989). For individuals with a specified place of birth, Cobb (1952) found that 60% of the Whites in the sample were born abroad, including origins in Scandinavia, Britain, Germany, and eastern and southern Europe. Native-born Whites have Ohio, New York and Pennsylvania listed as birthplaces (Cobb 1952). The majority of Black individuals migrated to the Cleveland area, with southern U.S. birthplaces including Georgia, Alabama, the Carolinas, Tennessee, Virginia, Kentucky, Mississippi, and Arkansas (Cobb 1952). The series
contains a large number of males (82%) and adults (Moore-Jansen 1989); only 25-30 skeletons are 16 years or under (Gottlieb 1982). As a whole, the individuals within the collection are considered to be generally of lower socioeconomic strata based on the causes of death listed (Cobb 1952; White 1991). The Hamann-Todd collection has served as the documented reference material for a number of commonly used aging methods, most notably the Todd method for scoring the metamorphosis of the pubic symphysis, the standards Lovejoy and colleagues developed for aging the iliac auricular surface, and the Meindl and Lovejoy technique for scoring cranial suture closure. Although significant concerns have been leveled against the Hamann-Todd collection regarding the reliability of age at death information25, the collection has not been rendered useless. As Meindl and colleagues (1990) suggest, the careful selection of remains can result in a viable sample; I was cognizant of these issues and adjusted the sampling strategy26 for data collection accordingly, justifying the use of these remains for this research.

Terry Collection (TC)

Two scientists were also integral to the formation of the Robert J. Terry Anatomical Skeletal Collection: Robert Terry, a medical doctor, and the noted anthropologist Mildred Trotter; the formation of the collection is detailed in Chapter 2. The Terry Collection contains 1,728 skeletons (Hunt 2009) and is the second largest anatomical skeletal collection in the United States (Moore-Jansen 1989). It is currently

25

The reader is referred to Chapter 3 for a more detailed discussion of known issues with recorded age at death. 26 See the Sample Selection Protocol in the Data Collection Methods section, this chapter.

103

housed at the National Museum of Natural History at the Smithsonian Institution in Washington, DC. The series contains the remains of American Whites and Blacks from St. Louis and, to a lesser extent, other locations in Missouri (Hunt & Albanese 2005). These individuals were born between 1828 and 1943. Like the Hamann-Todd collection, the Terry collection contains the remains of individuals obtained from local hospital and institutional morgues, assembled from 1914 to 1965 (Murray & Murray 1991). The demography of the source population, specifically indigents and city morgue cadavers, initially presented an overabundance of older Black and White males; young, White women were originally underrepresented compared to males in the collection (Hunt 2009). Trotter attempted to correct the sex and race biases inherent in the collection by both replacing and adding new skeletal remains (Trotter 1981), though this did not eliminate all sampling problems. The majority of these supplemental females were willed donations because of the Willed Body Law of Missouri (passed in the mid-1950s), which required a signed release by the next-of-kin; this law resulted in a shift away from the lower socioeconomic status typical of earlier cadavers to a middle and upper income bracket for newer remains (Quigley 2001). Like the Hamann-Todd collection, the socioeconomic status of the majority of the early Terry individuals is low, an assumption based on the documented cause of death, which were often diseases of poverty and exposure (Moore-Jansen 1989). In contrast to the Hamann-Todd collection, most individuals in the Terry collection were native born (Moore-Jansen 1989). As with the Hamann-Todd Collection, the remains have documentation as to sex, age, race, cause of death, and date of birth and/or death (Thompson 1982). Morgue records also contain the name of the individual, morgue or institution of origin, permit

104

number, and the dates of embalming (Hunt 2009). Other documentation held by the museum includes skeletal inventories, dental charts, autopsy reports, photographs and anthropometric measurements for two-thirds of the cadavers, over 800 plaster death masks, and 1050 hair samples (Hunt 2009). Like the Hamann-Todd series, the Terry collection has played an integral role in physical anthropology by serving as a reference sample for bone changes associated with age, sex, and race; in addition, the collection was also the basis for Trotter’s stature estimation equations (Quigley 2001). Another similarity to the Hamann-Todd collection includes uncertainty for some of the reported ages at death; again, the sample selection protocol for this thesis attempts to alleviate this problem.

Recent Collections

More recent documented skeletal remains were drawn from both ongoing skeletal donation programs and an autopsy sample.

Documented Collections

Bass Collection (UTK)

At present, the William M. Bass Donated Skeletal Collection, named for its founder, contains over 750 skeletons (Smithsonian Institution 2009) and is continually growing as a result of the University of Tennessee Forensic Anthropology Center’s active skeletal donation program. The collection is currently housed at the Forensic Anthropology Center, which is located within the Department of Anthropology at the University of Tennessee in Knoxville. The series predominantly contains the remains of American Whites and Blacks, with a smaller portion of Hispanic ancestry. Individuals are generally from Tennessee, with some derived from other states in the nation (University of Tennessee Forensic Anthropology Center 2005). Birth years range from the 1890s to the 2000s, with ages at death ranging from fetal to 101 years old. The source population includes forensic cases and donated remains (Usher 2002), and the series is biased towards males and Whites. The remains in the Bass Donated Collection have documentation as to sex, age, race, cause of death, date of birth and/or death, adult stature, weight, handedness, and education level. The current donor information form also requests information regarding occupation, childhood socioeconomic status, and medical history (University of Tennessee Forensic Anthropology Center 2005).

Maxwell Museum Collection (MMA)

The Maxwell Museum collection currently consists of over 260 individuals and, like the Bass Donated collection, is continually growing as a result of the Maxwell Museum of Anthropology’s active skeletal donation program. The collection is currently housed at the Maxwell Museum of Anthropology’s Laboratory of Human Osteology, which is in the Anthropology building at the University of New Mexico in Albuquerque. The series contains the remains of predominantly White Americans from the state of New Mexico. Birth years range from 1887 to 1971. The source population includes forensic cases and donated remains (Quigley 2001; Usher 2002; Laboratory of Human Osteology Maxwell Museum of Anthropology 2009), and the series is biased towards older Whites. Most individuals in the collection have documented sex, age, population affinity, and cause of death. Since the mid-1990s, a new donor information sheet has been distributed to prospective donors, which requests demographic data as well as additional information about health history and occupation (Laboratory of Human Osteology Maxwell Museum of Anthropology 2009).

Autopsy Collections

Maricopa County Forensic Science Center autopsy sample (MCFSC)

The Maricopa County Forensic Science Center autopsy collection consists of only pubic symphyses and bilateral sternal fourth rib ends. These samples were collected at the time of autopsy with the consent of the next of kin, between 2005 and 2006, by Kristen Hartnett as part of her dissertation research. The Maricopa County Forensic Science Center autopsy collection contains bone samples from 602 individuals; 582 individuals had both pubic symphyses and ribs (Hartnett 2007). The collection is housed at the Maricopa County Forensic Science Center in Phoenix, Arizona. While the sample includes both Black and White individuals, the preserved collection has an abundance of White males and females, as well as older individuals (Hartnett 2007). Individuals identified as Hispanic were included in the White[27] category. Hartnett (2007) provided little justification for this action, although she did mention that this was in accordance with the Maricopa County Forensic Science Center’s classification system and that law enforcement often combines the two groups. All individuals are from Maricopa County, Arizona, in accordance with the Forensic Science Center’s jurisdiction. Birth years range from 1906 to 1988, with ages at death ranging from 18 to 99 years old with a mean of 54.1 (Hartnett 2007). All remains fall under the county’s medicolegal jurisdiction, including unattended and nonnatural deaths.

[27] Hartnett (2007) uses the term Caucasian.

The pubes and ribs were drawn from decedents of known age, sex, and race, and were collected at the time of autopsy/examination by the pathologist. Drug and alcohol history was also obtained, but Hartnett (2007) admits that it was unclear if this information was provided by a qualified medical source. The Institutional Review Board for human subjects research at Arizona State University did not allow for the collection of other antemortem information. During the course of gaining consent from the next of kin, it was noted that consent was more likely given for elderly individuals than younger ones and more likely for Caucasians than Blacks, Native Americans, Hispanics, or Asians. This is in part due to lower population numbers for Blacks and Asians in the Phoenix area and religious beliefs for Hispanics and Native Americans (Hartnett 2007).

Data Collection Methods

Sample Selection Protocol

Sample selection sought to balance four factors: documented chronological age, sex, race, and year of birth. Individual ages were recorded and each individual was placed in one of seven arbitrary age sets (20-29 years, 30-39 years, 40-49 years, 50-59 years, 60-69 years, 70-79 years, and 80+ years). The sampling strategy devised for this research project called for groups of ten same sex and race individuals (Black female, Black male, White female, and White male) per ten-year age set, for a maximum of 280 individuals drawn from each skeletal series.
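The quota scheme described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure actually used to draw the samples: the function names, the tuple layout of `candidates`, and the first-come order in which cells are filled are all assumptions made for the example.

```python
from collections import defaultdict

# Seven ten-year age sets; None marks the open-ended 80+ set.
AGE_SETS = [(20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, None)]
GROUPS = ["Black female", "Black male", "White female", "White male"]
QUOTA = 10  # individuals per sex-race group per age set

# 10 individuals x 4 groups x 7 age sets
MAX_PER_SERIES = QUOTA * len(GROUPS) * len(AGE_SETS)


def age_set(age):
    """Return the age-set label for a documented age, or None if under 20."""
    for low, high in AGE_SETS:
        if age >= low and (high is None or age <= high):
            return f"{low}+" if high is None else f"{low}-{high}"
    return None


def select_sample(candidates):
    """Fill each (group, age set) cell up to QUOTA.

    `candidates` is an iterable of (identifier, group, age) tuples
    (a hypothetical layout); they are taken in the order supplied.
    """
    counts = defaultdict(int)
    selected = []
    for identifier, group, age in candidates:
        label = age_set(age)
        if label is None or group not in GROUPS:
            continue
        if counts[(group, label)] < QUOTA:
            counts[(group, label)] += 1
            selected.append(identifier)
    return selected
```

With these constants, `MAX_PER_SERIES` reproduces the maximum of 280 individuals per series stated in the text.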


As discussed in Chapter 3, the Terry and Hamann-Todd collections are known to contain individuals whose ages are not truly known; the sampling strategy employed for this research attempted to avoid known problems with these collections by excluding individuals with ages designated as “?” or “ca,” as well as those with ages ending in “0” or “5.” This targeted elimination sought to reduce the potential bias in the dataset introduced by including individuals of suspect age. Some remains of unknown age could be estimated to belong to age sets other than that for their true chronological age; there is no definitive way to calculate the imprecision of the estimates. Researchers must rely on Todd’s and others’ assessments of whether the stated age was congruent with the condition of the soft and hard tissues of the body. Chronological age is an essential component of the statistical analyses used; imprecise ages could affect the results obtained. Importantly, the elimination of individuals with ages ending in “0” and “5” may also introduce bias, removing the morphology and variation present for those whose chronological ages actually are multiples of five. However, I think any bias introduced by the elimination of remains with potentially estimated ages is negligible when compared to that introduced by knowingly including imprecise ages.
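The exclusion rule can be expressed as a simple filter. The sketch below assumes that recorded ages are available as text exactly as they appear in the collection records; the function name and the decision to treat any other non-numeric entry as suspect are assumptions for illustration, not part of the documented protocol.

```python
def is_suspect_age(recorded_age):
    """Return True if a recorded age looks estimated rather than documented.

    Rules taken from the text: ages marked "?" or "ca", and ages ending
    in 0 or 5, are excluded.
    """
    text = str(recorded_age).strip().lower()
    if "?" in text or "ca" in text:
        return True
    if not text.isdigit():
        # Assumption: anything else that is not a plain number is also suspect.
        return True
    return text[-1] in "05"


# Example: keep only records whose ages pass the filter.
records = ["32", "ca 40", "45", "37?", "61"]
retained = [age for age in records if not is_suspect_age(age)]
```

Here `retained` holds only the ages 32 and 61, since the other three entries are flagged by one of the rules.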

Dataset

Samples selected from the Terry and Hamann-Todd collections closely approximated the maximum set of 280 individuals. Due to their relative lack of Black decedents and smaller collection sizes, the samples drawn from the Bass Donated, Maxwell Museum Documented, and Maricopa County autopsy series were reduced to approximately half the size of those for the Terry and Hamann-Todd collections (Table 1). The total number of skeletal remains included in the dataset was 971, with 56% (n=544) from reference anatomical series and 44% (n=427) from more recent donated and autopsy collections (Figure 1).

Table 1: Number of skeletal remains in the dataset, by series and sex-race category

Series            Black females   White females   Black males   White males   Total
Hamann-Todd             66              70             69            70         275
Terry                   67              64             70            68         269
Bass Donated             3              55             29            69         156
Maricopa County          3              70             16            60         149
Maxwell Museum           1              49              4            68         122
Total                  140             308            188           335         971

The dataset was also nearly evenly split between the sexes, with a composition of 54% (n=523) males and 46% (n=448) females. The number of males and females within the Maricopa County, Terry, and Hamann-Todd series was equivalent; a disparity between the sexes was observed for the Maxwell Museum and Bass Donated collections, both with an abundance of males that account for roughly two-thirds of the remains composing each sample (Figure 2).

Figure 1: Number of skeletal remains in the dataset by series
HTH=Hamann-Todd TC=Terry UTK=Bass Donated MCFSC=Maricopa County MMA=Maxwell Museum

Figure 2: Number of Males and Females by Series
HTH=Hamann-Todd TC=Terry UTK=Bass Donated MCFSC=Maricopa County MMA=Maxwell Museum

The distribution of races within skeletal samples is not balanced, however. Blacks compose only one third (n=328) of the total dataset. The three Recent American skeletal series have significantly more individuals classified as White than Black (Figure 3). This disparity is the result of the demographic composition of these collections as a whole. As mentioned previously, few African Americans are residents of Maricopa County and few tend to donate their remains to skeletal collections, either in Tennessee or New Mexico.

The mean chronological age at death was fairly uniform across skeletal samples; only the Maxwell Museum sample was considerably older on average than the other samples (Figure 4). The similarity among collections for mean age at death was to be expected, based on the sample selection scheme. The Maxwell Museum collection includes an abundance of older individuals, and as a result, the mean age of the selected sample reflects the paucity of younger individuals.

Figure 3: Number of Blacks and Whites by Series
HTH=Hamann-Todd TC=Terry UTK=Bass Donated MCFSC=Maricopa County MMA=Maxwell Museum

Figure 4: Average Age by Collection
Axes: Skeletal Sample, Age-at-Death (Years)
HTH=Hamann-Todd TC=Terry UTK=Bass Donated MCFSC=Maricopa County MMA=Maxwell Museum

Table 2: Sample Sizes by Age Cohort

Series            20-29   30-39   40-49   50-59   60-69   70-79   80+   Total
Hamann-Todd         40      40      40      40      40      40     35     275
Terry               30      40      40      40      39      40     40     269
Bass Donated        11      13      31      27      27      23     24     156
Maricopa County     23      21      25      17      20      22     21     149
Maxwell Museum       3      14       8      23      23      28     23     122
Total              107     128     144     147     149     153    143     971

Table 2 shows the division of each sample by ten-year age cohorts. The data clearly demonstrate the paucity of younger individuals in the Maxwell Museum Documented and Bass Donated samples. Descriptive statistics for age at death by skeletal series are summarized in Table 3. The average age at death for each sex-race category, by series, is presented in Figure 5.

Table 3: Age at Death Statistics for Series in the Dataset

Series            Min   Max   Mean   Median   Mode
Hamann-Todd        21    96     54       54     22
Terry              20   102     56       56     30
Bass Donated       20   101     59       58     49
Maricopa County    20    97     54       52     43
Maxwell Museum     22   101     63       66     68

The average birth year by series is presented in Figure 6, which clearly illustrates the difference in time between Reference and Recent samples. Average birth years for sex-race groups by collection are presented in Table 4. The mean year of birth was fairly uniform across sex-race groups when the skeletal samples are pooled: the average year of birth was 1914 for Black females, 1913 for Black males, 1913 for White females, and 1913 for White males. The similarity among sex-race groups for mean birth year was expected because of the sample selection scheme.

Adult stature information was available for 58% of the dataset; no stature data were collected for the Maricopa County autopsy sample. Of the remaining 822 individuals in the dataset, 565 had documentation of stature: 99% of the Hamann-Todd sample, 37% of the Terry sample, 74% of the Maxwell Museum Documented sample, and 67% of the Bass Donated sample. Average adult stature for sex-race groups is presented in Table 5. As is expected, the average adult height is sexually dimorphic for this dataset, with American males approximately 5 inches taller on average than their female counterparts, regardless of race.

Figure 5: Average Age at Death by Sex-Race Category
Axes: Sex-Race Category, Sample, Age-at-Death (Years)
HTH=Hamann-Todd TC=Terry UTK=Bass Donated MCFSC=Maricopa County MMA=Maxwell Museum BF=Black females WF=White females BM=Black males WM=White males

Figure 6: Average Year of Birth by Collection
Axes: Skeletal Sample, Year of Birth
HTH=Hamann-Todd TC=Terry UTK=Bass Donated MCFSC=Maricopa County MMA=Maxwell Museum

Table 4: Average Birth Year for Sex-Race Groups by Series

Series            Black females   White females   Black males   White males
Hamann-Todd           1877            1871           1876          1872
Maxwell Museum        1932            1921           1910          1928
Terry                 1885            1887           1879          1879
Bass Donated          1927            1937           1941          1940
Maricopa County       1947            1950           1957          1949

Table 5: Average Stature (inches) by Sex-Race Groups

Series            Black females   White females   Black males   White males   Total
Hamann-Todd           64.4            61.8           68.0          67.0        65.3
Maxwell Museum        N/A             63.2           69.8          68.8        66.9
Terry                 62.9            62.5           67.3          66.8        65.6
Bass Donated          68.3            66.7           67.2          67.3        67.1
Average               63.6            62.5           68.3          67.5

Data collection

The pubic symphysis, auricular surface, sternal end of the fourth rib, and cranial sutures were examined for each individual. To avoid investigator bias, phase values/component scores were assigned based on morphological characteristics without prior knowledge of the individual’s age at death. The Maricopa County autopsy sample lacks auricular surfaces and cranial material as a result of Hartnett’s (2007) acquisition protocol; accordingly, only the pubic symphyses and sternal ends of the fourth rib were evaluated. When present, both right and left sides were examined for the pubic symphysis, auricular surface, and rib ends for all remains in the dataset. To avoid observer bias, the demographic information was collated with the morphological observations only after completion of work at each institution. A single observer recorded all data to maintain uniformity in phase assessments and eliminate the potential error introduced by multiple scorers.

As noted previously, current aging standards are divided into either stage/phase based or transition analysis methods. Transition analysis allows for flexibility in the varying rates of separate bony modifications, as the age-related changes of the indicators are not necessarily simultaneous. This method avoids the problematic nature of phases/stages defined by a suite of characteristics that may not be an accurate description of the actual set of morphological traits observed.


Pubic Symphysis

Phase/Stage Based Standards

The pubic symphysis undergoes a general progression of morphological change defined by deep ridges and furrows that with age fill in to produce a smooth surface with a ridge of bone on the ventral surface; then a bony rim forms around the face before the face and rim deteriorate. Specific features scored for the pubic symphysis include the surface relief, degree of delimitation of upper and lower extremities, extent of ventral rampart and dorsal plateau formation, appearance of dorsal lipping and ventral bony ligamentous outgrowths, rim erosion, and face shape. Based on the combination of features observed, the age progressive changes of the pubic symphysis were categorized into one of six phases according to the Suchey-Brooks phase descriptions (Suchey & Katz 1986; Brooks & Suchey 1990). This method was chosen because Klepinger and colleagues (1992) have shown that it is most appropriate for forensic applications because it has the best accuracy and precision of the methods tested.

The features of the pubic symphyseal face were also classified according to phase descriptions defined by Todd (1920, 1921) and Hartnett and Fulginiti (2007). These methodological variants exemplify the oldest and the most recent phase-based standards developed for assessing age from the pubic symphysis. Unlike the Suchey-Brooks method, which was based on an autopsy sample from the 1950s, the Todd phase descriptions were established using the Hamann-Todd collection as the reference source. The Hartnett-Fulginiti phase descriptions are based on a recent autopsy sample and include an additional phase to that of the Suchey-Brooks method. This phase was created to more accurately describe face changes after the age of 50 years, though the method still forces varying morphological features into stages/phases.

Transition Analysis

The morphological stages of five pubic symphyseal features, specifically symphyseal relief, symphyseal texture, superior apex, and dorsal and ventral symphyseal margins, were observed for use with transition analysis following standards published by Boldsen and colleagues (2002). The symphyseal relief changes from deep to shallow to residual billowing, and then flattens before changing to an irregular surface. Symphyseal texture, which is scored on the dorsal demiface, changes from fine-grained to coarse-grained; subsequently microporosity and macroporosity are paramount. The superior apex metamorphoses from no protuberance, to the presence of a protuberance, to its integration into the symphyseal face. The ventral symphyseal margin changes from serrated with pronounced ridges and furrows, to beveled/flattened billows, to the formation and completion of the rampart, to rim formation, and finally to rim breakdown. The dorsal symphyseal margin follows a similar pattern of metamorphosis as the ventral margin, minus the rampart. The dorsal margin changes from serrated with pronounced ridges and furrows, to flat, to rim formation, and finally to rim breakdown.

Auricular Surface

Phase/Stage Based Standard

The auricular surface of the ilium is thought to undergo a general metamorphosis, beginning with a fine granular surface and transversely organized billows on the superior and inferior demifaces. Next the transverse organization and billows fade and the granulation becomes coarser; then the surface becomes more irregular, with areas of macroporosity and inferior lipping developing. Concurrently, rim formation at the apex occurs and the retroauricular area transforms from smooth to rugged. The auricular surface, including the apex, superior demiface, inferior demiface, and retroauricular area, was examined for the following morphological features: presence and degree of billowing, granulation, porosity, and transverse organization on the face of the auricular surface. Each auricular surface was then assigned to one of eight stages, as defined by Lovejoy and colleagues (1985b).

Transition Analysis

The morphological stages of nine features of the iliac auricular surface were scored for use with transition analysis following Boldsen and colleagues (2002). Features observed included superior demiface topography, inferior demiface topography, superior surface morphology, apical surface morphology, inferior surface morphology, inferior surface texture, superior posterior iliac exostoses, inferior posterior iliac exostoses, and posterior iliac exostoses. The surface topography is scored for both the superior and inferior demifaces and follows predictable changes from an undulating surface, to an elevated central region, and finally to a flat or irregular surface. The surface morphology is scored separately for the superior, apical, and inferior aspects and metamorphoses from billowed to flat to bumpy; the inferior surface texture changes from smooth to porous, with independent scores for microporosity and macroporosity. Marginal bony proliferation is scored for the superior posterior and inferior posterior iliac exostoses; the surface starts as smooth, next changes to rounded bony elevations, then pointed, jagged, and touching exostoses, and terminates as fusion to the sacrum. Finally, marginal bony proliferation scored for the posterior exostoses changes from no exostoses, to round and pointed exostoses.

Sternal End of the Fourth Rib

Phase/Stage Based Standard

The metamorphosis of the sternal end of the fourth rib begins as a slight indentation with rounded borders; the pit deepens with increasing age, creating a V shape with a regularly scalloped border. As the pit continues to deepen, the shape widens to a U, and the scalloped border becomes more irregular; finally, the bone quality deteriorates, becoming more brittle and porous, and bony projections form. Specific features scored for the sternal end of the fourth rib include pit depth, pit shape, and rim and wall configuration. Based on the descriptions and combination of these individual features, the sternal rib end morphology was classified into one of the eight phases defined by İşcan and Loth (1986) and İşcan et al. (1984, 1985).

Ectocranial Suture Closure

Phase/Stage Based Standard

Meindl and Lovejoy’s (1985) method for cranial suture closure scores ectocranial sutures at ten locales: midlambdoid, lambda, obelion, anterior sagittal, bregma, midcoronal, pterion, sphenofrontal, and superior and inferior sphenotemporal. Scores follow a four-point scale ranging from 0 (open) to 3 (obliterated).

Transition Analysis

Boldsen et al. (2002) scored only five craniofacial sites for the transition analysis method: lambdoidal-asterion, sagittal-obelica, coronal-pterion, zygomaticomaxillary, and interpalatine. This method scored suture closure on a five-point scale: 1) open, defined as a noticeable gap between the cranial bones; 2) juxtaposed; 3) partially obscured, with bony bridges present; 4) punctuated, characterized by an appearance of scattered small points or grooves; and 5) obliterated, when no evidence of a suture remains.

Other data

In addition to recording phase/stage categories for each of the skeletal remains included in the dataset in a laptop computer database, digital photographs of the pubic symphyseal face, auricular surface, sternal end of the fourth rib, and cranial suture sites were taken to serve as documentation of the morphology scored. Collection curators provided demographic data, including documented age, sex, race, birth year, death year, and stature data.

Data Analysis Methods

Data preparation

All demographic data, the observed phase/stages for each indicator, and the observed transition analysis component scores for the pubic symphysis, auricular surface, and ectocranial suture closure were entered into an Excel spreadsheet. The expected phases/stages were then calculated using the documented age and the age ranges associated with the standard’s phases. For example, using the Todd standard for estimating age from the pubic symphysis, a 32-year-old individual would have an expected phase of 6, as phase 6 is associated with an age range of 30-35 years. For standards with very large age ranges that overlap multiple phases, such as the Suchey-Brooks and Hartnett-Fulginiti methods, the expected phase was chosen using the mean age for phases. For example, using the Suchey-Brooks standard for estimating age from the pubic symphysis, a 32-year-old female could have an expected phase of 2, 3, 4, or 5. However, the documented age of 32 is closest to the mean of phase 3 (30.7 years), so the expected phase assigned is 3.

Additional steps were required to transform the raw data for ectocranial suture closure to a usable age estimate. First the suture sites were divided into vault and lateral-anterior sites. Cranial vault sites include midlambdoid, lambda, obelion, anterior sagittal, bregma, midcoronal, and pterion. Lateral-anterior sites include locales at midcoronal, pterion, sphenofrontal, inferior sphenotemporal, and superior sphenotemporal. Next the site scores were summed to reach a composite vault score and a composite lateral-anterior score. The value of these composite scores is then associated with one “S” designation for the vault and another “S” designation for the lateral-anterior sites. These “S” designations were essentially viewed as phases, each with mean ages and confidence intervals. Like the Suchey-Brooks and Hartnett-Fulginiti methods, Meindl and Lovejoy’s standard for scoring ectocranial sutures for age estimation also has very large age ranges that overlap multiple “S” designations. As above, the expected “S” designation was chosen using the mean age closest to the documented age. Thus, for a 32-year-old individual, the expected vault “S” designation is S1 and the expected lateral-anterior “S” designation is S1.
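The two preparation steps described above, nearest-mean assignment of an expected phase and summation of suture site scores, can be sketched as follows. This is an illustration under stated assumptions rather than the spreadsheet logic actually used: the function names are hypothetical, and apart from the phase 3 mean of 30.7 years quoted in the text, the Suchey-Brooks female phase means are given as commonly cited from Brooks and Suchey (1990) and should be checked against that source before use. The mapping from composite scores to “S” designations requires the published Meindl and Lovejoy tables and is not reproduced here.

```python
# Mean ages by Suchey-Brooks phase for females. Only the phase 3 value
# (30.7) appears in the text; the others are assumed from the published
# standard and should be verified before use.
SB_FEMALE_MEANS = {1: 19.4, 2: 25.0, 3: 30.7, 4: 38.2, 5: 48.1, 6: 60.0}

# Site groupings as listed in the text.
VAULT_SITES = ["midlambdoid", "lambda", "obelion", "anterior sagittal",
               "bregma", "midcoronal", "pterion"]
LATERAL_ANTERIOR_SITES = ["midcoronal", "pterion", "sphenofrontal",
                          "inferior sphenotemporal", "superior sphenotemporal"]


def expected_phase(documented_age, phase_means):
    """Return the phase whose mean age is closest to the documented age."""
    return min(phase_means, key=lambda p: abs(phase_means[p] - documented_age))


def composite_score(site_scores, sites):
    """Sum the 0-3 closure scores over the named suture sites."""
    return sum(site_scores[site] for site in sites)


# Worked example from the text: a 32-year-old female.
phase = expected_phase(32, SB_FEMALE_MEANS)
```

For the 32-year-old female in the worked example, the nearest mean is that of phase 3, matching the expected phase given in the text.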


The transition analysis raw component scores also required additional transformation before comparison to actual age. Raw data consisted of stage assessments for each individual component scored; without combining these component scores and inputting demographic information, Boldsen and colleagues’ method cannot equate these data directly to an age estimate or range. Accordingly, the sex, race, hazard, and observed component scores for transition analysis indicators were entered into the ADBOU Transition Analysis Age Estimator program. This program, provided by Dr. George Milner, combined the raw scores and demographic information to calculate an age at death using maximum likelihood estimation. The age at death calculation included both point estimates and 95% confidence intervals for each of the following indicators: pubic symphysis, auricular surface, and cranial suture closure (Boldsen et al. 2002). Age calculations were also produced based on a combination of these three indicators following two schemes: one assuming a uniform prior age distribution (UNI) and the other assuming the age distribution of the chosen forensic hazard (COR) (Boldsen et al. 2002). The pre-industrial Danish prior was not used because the individuals included in the dataset were born after the start of the American Industrial Revolution.

Once all of the expected values were determined, the difference between the observed and the expected phases/designations was calculated for each traditional stage-based standard. Because no phases exist for transition analysis, the predictive ability of the standard must be computed using a comparison of chronological age in years to the calculated point estimate from the ADBOU age estimator program. Accordingly, the difference between the observed/calculated and the expected/documented age in years was calculated for all transition analysis indicators: pubic symphysis, auricular surface, ectocranial suture closure, and the combined indicators assuming either a uniform or an informed prior distribution.
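The difference described above is a simple signed subtraction, applied to phases for the stage-based standards and to ages in years for transition analysis. The sketch below assumes the sign convention observed minus expected, which the text does not state explicitly; the function name and the example values are hypothetical.

```python
def prediction_error(observed, expected):
    """Signed difference between observed and expected values.

    Assumed convention: positive when the observed phase (or calculated
    age) exceeds the expected phase (or documented age).
    """
    return observed - expected


# Stage-based standard: observed phase 4 against expected phase 3.
phase_difference = prediction_error(4, 3)

# Transition analysis: hypothetical point estimate of 28.5 years
# against a documented age of 32 years.
age_difference = prediction_error(28.5, 32)
```

Under this convention a positive value means the indicator appears older than expected for the documented age, and a negative value means it appears younger.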

Analytical methods

Right versus left side morphology

A concordance correlation coefficient was calculated to determine whether the observed phase scores for right side morphology differed significantly from those of the left side. The statistic is similar to intra-class correlation (Nickerson 1997) and is used to evaluate the agreement between two values from the same sample (Lin 1989). The concordance correlation coefficient evaluated the agreement between the right and left side phase scores by measuring the variation from the concordance line, a 45-degree line through the origin (Lin 1989). If right and left phase scores are not significantly different, then observations from one side will be used to streamline data analysis and interpretation.
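As an illustration of the statistic, Lin’s concordance correlation coefficient can be computed from paired right and left scores as twice the covariance divided by the sum of the two variances and the squared difference in means. The sketch below is a minimal pure-Python version; the function name is hypothetical, and the statistical software actually used for this analysis is not specified in the text.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient for paired observations.

    Uses population (1/n) variances and covariance, as in Lin (1989):
    ccc = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2)
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("coefficient is undefined when all values are identical")
    return 2 * cov_xy / denominator
```

A value of 1 indicates that every pair falls on the 45-degree concordance line; a constant offset between sides lowers the coefficient even when the two sides are perfectly correlated.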

Intraobserver agreement

A random subset of ten individuals from each skeletal series was drawn from the dataset to test for intraobserver agreement between first and second observations for each aging standard. The weighted Kappa statistic was used to test for intraobserver reliability (Landis & Koch 1977) and is appropriate because it accounts for the magnitude of disagreement between the first and second observations. These calculations are important for detecting any observer inconsistencies with a specific standard; if the observer was not reliable with a certain method, the statistical analyses using that data should be interpreted with caution.
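A weighted Kappa calculation for two sets of ordinal phase scores can be sketched as below. The weighting scheme is not specified in the text, so linear weights are assumed here, with quadratic weights offered as an option; the function name and argument layout are likewise hypothetical.

```python
def weighted_kappa(first, second, categories, weight="linear"):
    """Weighted kappa for two rounds of ordinal scores on the same cases.

    `categories` lists every possible score in rank order. Disagreement
    weights grow with the distance between the two scores (assumption:
    linear by default, squared distance if weight="quadratic").
    """
    k = len(categories)
    n = len(first)
    if k < 2 or n == 0 or n != len(second):
        raise ValueError("need paired scores and at least two categories")
    index = {category: i for i, category in enumerate(categories)}

    # Observed proportions for each (first, second) score combination.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        distance = abs(i - j) / (k - 1)
        return distance if weight == "linear" else distance ** 2

    observed_d = sum(disagreement(i, j) * observed[i][j]
                     for i in range(k) for j in range(k))
    expected_d = sum(disagreement(i, j) * row[i] * col[j]
                     for i in range(k) for j in range(k))
    if expected_d == 0:
        raise ValueError("kappa is undefined when only one category is used")
    return 1 - observed_d / expected_d
```

Because the weights scale with distance, a second observation one phase away from the first is penalized less than one several phases away, which is the property the text cites as the reason for choosing this statistic.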

Descriptive statistics Simple descriptive statistics for all variables, including means, standard deviations, and distributions, were calculated for each of the methods; this step was necessary to establish whether the assumption of normality for statistical testing is valid. In addition, Spearman’s correlations between age and phase, as well as between variables, were calculated. These statistics measured the degree of association between the two variables, providing an indication of whether that particular variable was correlated with chronological age and/or each other. The documented chronological ages were plotted according to their observed phase/stage for each standard, for both the Reference and Recent groups. This provided a visual representation of the variation in age at death encompassed by each phase of each standard for each skeletal sample and will provide visual summaries of the data, generating insights into whether any obvious differences existed in rates of senescence. The weighted Kappa statistic was used to test for agreement between observed phase/calculated age and expected phase/documented age. This data provided a measure of the degree of difference between the actual morphology of the indicator and what was expected for that individual based on their age at death. The calculated differences between the observed and actual values were plotted against year of birth by the total dataset, each of the five skeletal series, and ten-year age cohorts to explore whether patterns emerge in the amount of predictive error for each aging standard. Linear trend lines were fitted to each plot by series and age cohort. These plots and regression lines 125

served as a heuristic tool to identify possible changes in the rate of aging through time, independent of Reference or Recent designation. Once the differences between the observed and expected values (PE) were calculated for each individual for all aging standards scored, standard multiple regression was used to identify which of the following descriptive variables best predicted the PE value: sex, race, sex-race group, adult stature (see footnote 28), year of birth, age at death, reference/recent group, series, and region of the United States. All categorical variables were coded numerically according to the following scheme: 1=female and 2=male; 1=Black and 2=White; 1=Black females, 2=Black males, 3=White females, and 4=White males; 1=reference and 2=recent; 1=Hamann-Todd, 2=Terry, 3=Maxwell Museum, 4=Bass Donated, and 5=Maricopa County; 1=Midwest US, 2=Southeast US, and 3=Southwest US. This was done so that all variables could be evaluated in the same regression model. The results identify whether any of the variables tested influence how well the aging standards perform. Multiple regression assumes that the variables are normally distributed, that there is a linear relationship between the independent and dependent variables, that the variables are measured reliably/without error, and that the variance of the errors does not differ at different values of the independent variable (Osborne & Waters 2002). These assumptions were

28. Environmental and psychosocial insults during childhood are known to affect bone growth and may influence adult stature (Peck & Lundberg 1995; Bogin 1999); however, some researchers have found that growth in length was maintained at the expense of cortical thickness, despite nutritional and disease stress (Himes 1978; Huss-Ashmore 1981). In addition, when negative effects like nutritional or disease stress are ameliorated, long bone lengths increase during these periods of growth recovery, but cortical thickness/bone mass does not (Huss-Ashmore et al. 1982). The effect of this catch-up growth on stature can be problematic for assessing stresses experienced during the growth years of individuals using adult stature. Although cortical bone thickness, as well as Harris lines, may be more sensitive indicators of childhood stress, it was beyond the scope of this project to measure these variables. Thus, adult stature data, which were readily available for much of the dataset, were used as a rough proxy for childhood living conditions.


tested and results indicated that regression analysis was appropriate: the residuals of nearly all of the descriptive variables approximated a normal distribution, plots of the standardized residuals and standardized predicted values were linear for nearly all variables, variables were measured reliably, and most plots comparing the standardized residuals and standardized predicted values indicated homoscedasticity.
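The numeric coding scheme, the predictive error (PE), and the linear trend of PE on year of birth described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the sign convention (observed minus expected), the function names, and the three example individuals are hypothetical and are not taken from the dataset.

```python
from statistics import mean

# Numeric codes following the coding scheme given in the text.
CODES = {
    "sex": {"female": 1, "male": 2},
    "race": {"Black": 1, "White": 2},
    "group": {"reference": 1, "recent": 2},
    "series": {"Hamann-Todd": 1, "Terry": 2, "Maxwell Museum": 3,
               "Bass Donated": 4, "Maricopa County": 5},
    "region": {"Midwest US": 1, "Southeast US": 2, "Southwest US": 3},
}

def encode(record):
    """Replace the categorical fields of one record with their numeric codes."""
    return {key: CODES[key][value] if key in CODES else value
            for key, value in record.items()}

def predictive_error(observed, expected):
    """PE, assumed here to be the observed value minus the expected value."""
    return observed - expected

def linear_trend(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical individuals: (year of birth, observed phase, expected phase).
sample = [(1850, 5, 4), (1900, 4, 4), (1950, 3, 4)]
years = [row[0] for row in sample]
errors = [predictive_error(row[1], row[2]) for row in sample]
slope, intercept = linear_trend(years, errors)
```

A negative slope in such a plot would indicate that individuals born later tend to be scored in a lower phase than expected for their documented age, which is the kind of pattern the trend lines were meant to flag.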

Research Questions and Hypothesis Testing
To determine whether older American skeletal series age at a different rate than more recent ones, three questions were addressed using a combination of deductive and inductive statistical approaches. The sample size, mean age in years, standard error of the mean, 95% confidence interval for the mean, standard deviation, and observed age range were computed for each phase of each standard. These data were computed for each sex, race, sex-race group, and total sample within the Reference and Recent samples. This tabulation gives the proportion of cases that exhibit a particular phase of an indicator by documented age, provides a means for comparing these data to existing publications, and serves as a launching point for further hypothesis testing. Finally, the ages at transition were calculated using a log-age cumulative probit model in R (R Development Core Team 2008) for each of the aging standards, for both Reference and Recent groups. Ages at transition were first calculated for the total sample of each group, pooling sex and race. Subsequently, each group was divided by sex and race, and the ages at transition were calculated for each aging standard by these divisions. These data present a numerical comparison of ages at transition for the groups examined.
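As a rough illustration of how ages at transition follow from a cumulative probit fit on log-age, the sketch below assumes the parameterization P(phase ≤ k | age) = Φ(θ_k − β ln age), a form used by common ordinal-regression routines. Under that assumption the mean log-age of the transition out of phase k is θ_k/β and its standard deviation is 1/β. The function names and the threshold and slope values are hypothetical and are not results from this study.

```python
from math import exp, log
from statistics import NormalDist

def transition_summary(thresholds, slope):
    """Convert cumulative-probit thresholds and a log-age slope to transition ages.

    Assumes P(phase <= k | age) = Phi(threshold_k - slope * ln(age)).
    """
    if slope <= 0:
        raise ValueError("slope on log-age must be positive")
    sd_log_age = 1.0 / slope
    return [{"mean_log_age": t / slope,
             "sd_log_age": sd_log_age,
             "median_age": exp(t / slope)}  # back-transformed from the log scale
            for t in thresholds]

def prob_beyond_phase(age, threshold, slope):
    """Probability that an individual of this age has passed the transition."""
    return NormalDist().cdf(slope * log(age) - threshold)

# Hypothetical fitted values for a three-phase indicator (two transitions).
summary = transition_summary([6.0, 7.0], slope=2.0)
```

By construction, an individual whose age equals the back-transformed transition age has a probability of one half of having moved past that phase, so comparing these ages between groups is a direct way to compare the timing of morphological change.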


Question 1: Is the observed morphology of the aging indicator associated with the same chronological ages for both older Reference and more Recent American skeletal populations?
To test whether variation in the morphological aging processes of the pubic symphysis, auricular surface, sternal rib end, and cranial sutures exists between Reference and Recent American skeletal series, the null hypothesis of no difference in rate of senescent change was tested for the following established American aging standards: Todd, Suchey-Brooks, Hartnett-Fulginiti, Lovejoy and colleagues, İşcan and colleagues, Meindl and Lovejoy, and Boldsen and colleagues. This question was tested using proportional odds probit regression and generalized linear regression, coupled with an analysis of deviance; these analyses provide a measure of the significance of the association between the proportions of cases from each sample that exhibit a particular phase of an indicator, conditional on age. Using proportional odds probit regression in R (R Development Core Team 2008), the observed phase was regressed onto the log-age and population. The model was then run with an additional term for the interaction between log-age and population. An analysis of deviance was used to compare the two models, and an improvement chi-square statistic with associated p-value formally tested the impact of the added interaction term (Fox 2002), which allowed the slopes of the regression lines to differ. The null hypothesis for this test is that the addition of the interaction term is not important. If the reduction in deviance is not significantly greater than chance, then the added interaction term does not belong in the model; this outcome is indicative of regression line slopes that do not differ significantly between groups. If the chi-square likelihood ratio


statistic has an associated p-value of 0.05 or less, then the addition of the interaction term improves the model. This improvement indicates a significant association between the aging process of the indicator and the population. This analysis was run multiple times: 1) reference versus recent samples with pooled sex and race; 2) reference versus recent samples by sex; and 3) reference versus recent samples by race. This statistical approach is appropriate because probit regression models the dependence of the indicator on age. This is accomplished by calculating the means, standard deviations, log-likelihood, and standard error of the ages of transition for each phase of the indicator for each group. Subsequently, a measurement is produced that is indicative of the association between the proportions of cases from each sample that exhibited a particular phase of an indicator, conditional on age (see Kimmerle et al. 2008). A similar approach was taken to analyze whether a significant association exists between population and the point estimates of age produced by Boldsen and colleagues’ Transition Analysis. Proportional odds probit regression requires the dependent variable to be ordinal; however, the point estimates for age are continuous data, so a generalized linear model was fitted in R (R Development Core Team 2008). The point estimate of age was regressed onto the documented age and population, and then the model was run again with the additional interaction term for age*population. As above, an analysis of deviance was calculated and the associated p-value was used to formally test the impact of the added interaction term. Three potential outcomes exist. The first possibility was that no difference in the rate of aging between Reference and Recent samples was detected for any of the


standards tested. The second outcome was that all standards tested produced a statistically significant difference in the rate of aging between populations. The third outcome was that some standards would produce statistically significant differences between samples and others would not. The expectation was that the third outcome was most likely. Two lines of evidence justified this expectation. First, the literature on temporal change for age indicators is mixed. Osborne et al. (2004) observed no change for the auricular surface when comparing samples drawn from the Terry and Bass Donated collections, while others have reported secular trends for cranial suture closure (Masset 1989; Bocquet-Appel & Masset 1995) and the pubic symphysis (Hoppa 2000). Second, a pilot study was conducted (Potter 2009) that produced both significant and non-significant trend lines for the difference between observed score and expected phase regressed on year of birth.

Question 2: Assuming that the third outcome from Question 1 is supported, is there a pattern that explains why some aging standards produce significant differences in the aging process of skeletal indicators between groups while others do not?
The null hypothesis of no pattern was tested for the following groupings: 1) method type, specifically phase/stage based or component based/transition analysis; 2) indicator used, particularly the pubic symphysis, auricular surface, fourth rib, cranial sutures, or combined; 3) anatomical region, either postcranial or cranial; 4) time; 5) adult stature, categorically defined as short, average, and tall; 6) sex, either female or male; and 7) race, either Black or White.
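The analysis of deviance used for Question 1 compares the residual deviance of the model without the log-age by population interaction to that of the model with it. A minimal sketch for a single added term (one degree of freedom) is given below in Python rather than the R used for this study; the function name and the deviance values are hypothetical.

```python
from math import erfc, sqrt

def deviance_test_1df(deviance_without, deviance_with):
    """Improvement chi-square and p-value for one added model term (1 df)."""
    improvement = deviance_without - deviance_with
    if improvement < 0:
        raise ValueError("the larger nested model cannot have higher deviance")
    # For 1 df, the chi-square upper tail is P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(improvement / 2.0))
    return improvement, p_value

# Hypothetical residual deviances for the models without and with the interaction.
improvement, p_value = deviance_test_1df(2120.0, 2112.5)
interaction_belongs = p_value <= 0.05
```

A p-value of 0.05 or less would be read, as in the text, as evidence that the slopes of the regression lines differ between the Reference and Recent groups.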


To test whether a pattern was present within the outcome, the results from Question 1 hypothesis testing were sorted into two groups: 1) standards producing statistically significant differences among series and 2) standards that do not produce differences among series. Subsequently, an inductive approach was used to identify patterns according to method type, indicator, anatomical region, time, adult stature, sex, or race. Two possible outcomes existed. The first outcome was that the results appear to be random, with no apparent pattern explaining why some standards produce statistically significant differences between older and more recent American skeletal populations and others do not. The second possible outcome was that a pattern was present. The expectation was that there would be non-random groupings of standards that have different chronological ages associated with specific indicator morphology for Reference and Recent American skeletal populations. Justification for the patterns that may explain the results was based on critiques of particular aging standards and published anthropological research on aging. Method type may influence the results because phase/stage based methods force the morphology of the indicator into a preset description of traits that may or may not accurately describe the features present. In contrast, transition analysis is flexible, allowing for differing rates of development or degeneration of individual components within a single indicator. Certain indicators of age, regardless of the method used, may produce similar results. This was expected for the pubic symphysis, because the Suchey-Brooks and Hartnett-Fulginiti methods were both derived from the original Todd scoring system. Similar morphological attributes are also examined as components of the pubic symphysis indicator for the Boldsen and


colleagues’ Transition Analysis scheme, and thus may also group with these phase-based methods. Similarly, results may group according to the anatomical region of the indicator; postcranial (particularly pelvic) indicators clustered in Jackes’ (2000) reanalysis of Kemkes-Grottenthaler’s dissertation data. This may be the result of the suture obliteration method itself, because it had a definite “cutoff point” marking the cessation of morphological change that postcranial measures lack (Jackes 2000). The temporal difference between the Reference and Recent samples may also influence the results, particularly if most indicators and standards return a statistically significant difference between these populations. The average year of birth for the Reference sample is 1878 (range 1828-1943), while it is 1939 (range 1889-1985) for the Recent sample. The difference in birth years may allow for the recognition of subtle changes in the rate of progression of skeletal age changes, possibly resulting from secular trends, environmental factors, cultural practices, socioeconomic status, improvements in living conditions, diet, disease prevalence, and advances in health care and disease prevention. As mentioned previously, the adult height of individuals was used as a proxy for childhood health and socioeconomic status due to availability of the data. Environmental and psychosocial insults during childhood are known to affect bone growth and can influence adult attained stature, though catch-up growth may reduce or eliminate these effects. These childhood stresses may also affect skeletal aging. Patterns may also arise with regard to sex and race. Differences in the estimation of age exist between males and females (Kemkes-Grottenthaler 2002), and these differences are reflected in the use of the same scoring system for both sexes with different means and age ranges ascribed to the phase based on sex (Meindl & Lovejoy 1985;


Boldsen et al. 2002), as well as the use of aging standards that score males and females separately (İşcan & Loth 1986; Brooks & Suchey 1990; Hartnett 2007). Additionally, research supports the theory that females may be buffered from environmental insults, which may render them less susceptible to changes in nutrition, socioeconomic status, and other extrinsic variables (Bogin 1999). Some research on dental and skeletal development indicates that race may also affect age estimates. For example, Blacks are advanced compared to Whites in terms of dental development and eruption, and similar results have been noted for the development of ossification centers and the epiphyseal fusion of elements in the hand and wrist (Masse & Hunt 1963; Garn & Bailey 1978). Question 3: In the case of contrasting results from multiple aging standards for a single skeletal indicator, which standard is the best gauge of whether a difference in the rate of senescent change has occurred between older Reference and more Recent American samples? A three-fold approach was taken to identify which osteological aging standards were used to determine whether a change in the skeletal aging indicator had occurred for American samples. First, results from published studies, as reviewed in Chapter 3, were used to assess the strengths and weaknesses of the skeletal age indicators and standards tested in this research project. This endeavor identified which aging standards may have methodological biases or other problems that affect the reliability of the results obtained from hypothesis testing. Second, stepwise regression analysis was used on a pooled sample to determine which standards were the best predictors of actual age as determined by inclusion in the prediction formula. A pooled sample combining temporally and geographically distinct collections was chosen to create a more diverse sample that


encompasses a wider range of variation observed for American populations. Finally, other criteria contributing to the assessment of each method’s reliability were considered, including intraobserver agreement values, correlation coefficients with chronological age, my experience and comfort level with the methods, and the robustness of the statistical results within this dissertation. Aging standards that were free from methodological bias, had low intraobserver error, predicted chronological age well as defined by their inclusion in the stepwise model, and had easily recognized features that were highly correlated with chronological age were considered more reliable than others. These aging methods were used as the measure of whether older American skeletal series aged at a different rate than more recent ones.

Assumptions
The assumptions made in this research are broadly divided into two categories: those regarding the skeletal collections and those regarding the aging standards tested.

Documented Skeletal Collections
The first major assumption made in this research is that the documented information for the individuals within these five American skeletal collections is accurate. Although these collections are considered “documented,” a closer look at the source of the recorded information warrants concern regarding the reliability of some data.


Age at Death
Some of the skeletal remains within the Terry and Hamann-Todd collections do not have a known chronological age at death (Boldsen et al. 2002; Hunt & Albanese 2005; Konigsberg et al. 2008); this issue was discussed thoroughly in Chapter 3. To summarize, this uncertainty was often denoted by the addition of a “?” or “circa” to the documented age. In other instances, the age was estimated, rounding to the nearest multiple of five; for those cases that have ages ending in 0 or 5, it was less clear which were true ages and which were estimated. As stated earlier in this chapter, the sampling strategy employed for this research attempted to avoid the inclusion of those remains with questionable ages; however, it must be assumed that the ages recorded were correct for the individuals who were selected for the dataset.

Racial Designation
The term race, as it is used in this manuscript, was clarified in Chapter 1. Different individuals do not apply the social construct of race, and its classification terms, in the same manner; perceptions have also changed over time. Race is a complicated and problematic concept that is likely inextricably intertwined with other variables, such as socioeconomic status, education, nutrition, and access to health care (Williams 1996; Cooper et al. 2001). In addition, race is a fluid concept that is interpreted in many ways (Herman 1996); how individuals are classified, whether by themselves or by others, has changed significantly over time in the United States. In earlier skeletal collections, race was recorded from death certificates and was likely ascribed by morgue physicians based on physical features, which may not be an accurate reflection of how the


decedents would have identified themselves during life. In contrast, potential donors self-report their race on questionnaires for the Bass and Maxwell documented collections, particularly for the more recently acquired remains. This self-identification may not match opinions of observers in the community. This discrepancy in how race information is acquired is another potential source of bias inherent in comparing these skeletal series.

Osteological Aging Standards The second major set of assumptions made in this research lies in the reliability and validity of the osteological age at death estimation methods currently employed. The anthropological literature has called into question these very attributes of current aging standards.

Reliability of Aging Standards
Reliability issues include observer error, age and sex dimorphism, asymmetry, intertrait correlation, causative factors, and heritability (Saunders 1989). Of principal concern is inter- and intra-observer error, because morphology-based age assessment indicators are subjectively interpreted despite a multitude of photographic and/or cast reference materials (Kemkes-Grottenthaler 2002). Experience is important, as the researcher’s ability to properly identify age-related morphological traits is essential to the standard’s predictive potential (Baccino et al. 1999; Kemkes-Grottenthaler 2002). For example, the Gilbert-McKern pubic symphysis standard’s scoring of the development and breakdown of the ventral rampart is particularly prone to observer error (Suchey 1979). Even with the introduction of their method for scoring the iliac auricular surface, Lovejoy and colleagues (1985b) admit to the increased difficulty in the application of their standard over those for the pubic symphysis. As outlined in the analytical methods section of this chapter, intraobserver agreement was calculated for all standards used as part of this research and identifies which standards are more reliable when employed by the author.

Validity of Aging Standards Validity issues focus on the predictive value of the indicators used, based on how strongly correlated the indicator is to chronological age (Kemkes-Grottenthaler 2002). A compilation of correlation coefficients for several age indicators and age at death is presented in Table 6. Poorer correlations translate to greater bias (Aykroyd et al. 1999). What is the minimum acceptable correlation coefficient to indicate a reliable relationship between morphology and age? The answer varies by author and ranges anywhere from 0.7 to 0.9 (Lovejoy et al. 1985a; Bocquet-Appel & Masset 1982, respectively). However, the utility of correlation coefficients as a proxy for the predictive value of a trait is also questionable, because age indicators are typically discrete and the relationship between a phase/stage and its corresponding age range is not linear (Kemkes-Grottenthaler 2002). Gerontologists know that the aging phenomenon is highly variable among individuals (Bryant and Pearson 1994), and this is particularly evident for older individuals, because skeletal age-at-death markers become progressively more inaccurate with increased age (Angel 1984). But even intra-individual variability is problematic, as evidenced by asymmetrical cranial suture closure and auricular surface scores (Moore-Jansen & Jantz 1986; Kemkes-Grottenthaler 1996).


Table 6: Correlation coefficients between age indicator and age at death (after Kemkes-Grottenthaler 2002)

Indicator (rows, in order): Pubic symphysis (Todd) [1]; Pubic symphysis (Todd) [2]; Auricular surface (Meindl et al.); Ectocranial sutures (Meindl & Lovejoy) [3]; Ectocranial sutures (Meindl & Lovejoy) [4]

Females: -0.64 ---
Males: 0.85 0.57 ---
0.34    0.59
Sexes combined: -0.57 0.72 0.57 (lateral-anterior sutures) 0.50 (vault sutures) 0.56

[1] Katz & Suchey (1986)
[2] Meindl et al. (1985)
[3] Meindl & Lovejoy (1985)
[4] Kemkes-Grottenthaler (1996) [cited in Kemkes-Grottenthaler (2002)]

It was assumed for this thesis that the accuracy of the age estimation methods tested here was acceptable, such that the hypothesis testing, regardless of outcome, produced meaningful results. In an attempt to approximate each method’s accuracy, three values were calculated for each aging standard: the correlation coefficient between the phase scores and age, bias, and inaccuracy.
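Bias and inaccuracy are not defined in this section. The sketch below assumes the definitions conventionally used in the age-estimation literature: bias is the mean signed difference between estimated and documented age, and inaccuracy is the mean absolute difference. The function name and the three example ages are hypothetical.

```python
def bias_and_inaccuracy(estimated_ages, documented_ages):
    """Mean signed error (bias) and mean absolute error (inaccuracy), in years."""
    if len(estimated_ages) != len(documented_ages) or not estimated_ages:
        raise ValueError("age lists must be non-empty and of equal length")
    errors = [est - doc for est, doc in zip(estimated_ages, documented_ages)]
    n = len(errors)
    bias = sum(errors) / n                            # direction of the error
    inaccuracy = sum(abs(err) for err in errors) / n  # size of the error
    return bias, inaccuracy

# Hypothetical estimated and documented ages at death for three individuals.
bias, inaccuracy = bias_and_inaccuracy([30, 45, 60], [35, 45, 50])
```

Under these definitions a positive bias indicates overaging on average, while inaccuracy reports the average size of the error regardless of its direction.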

Limitations The limitations of this research are the direct result of both information availability for American skeletal collections and issues related to sample bias and sample representativeness. Because of these obstacles, it is impossible to make definitive statements about causative agents of change and the population at-large. Only inferences can be made regarding variables that may contribute to or influence change in the rate of aging in the studied American skeletal samples.


Availability of Information
The hypotheses that are tested during the course of this dissertation focus on temporal and spatial changes in the relationship between the morphology of the skeletal indicator and chronological age. It is not possible to quantify all of the variables that may affect this relationship. While newer documented skeletal collections like the Bass Donated and Maxwell Museum series are now attempting to gather additional information about their donors, information on childhood health, nutrition, and living conditions, as well as socioeconomic status, occupation, and health history, is not generally available for individuals in older anatomical collections. While this lack of individual information makes it difficult to draw conclusions about specific variables influencing the rate of senescent change in the human skeleton, it does not necessarily strongly impact age estimation, as this information is also unknown for target samples.

Issues of Sample Bias and Sample Representativeness
Usher (2002) states that the reliability and representativeness of reference collections are usually taken for granted. Existing collections contain a subset of individuals from a larger population, which may be influenced by acquisition methods and collection strategies. No skeletal collection is truly representative of the general population from which it is drawn (Usher 2002). Acquisition artifact is a well-known problem for documented collections; bias occurs by the inherent nature of skeletal collections as a result of the demographic profile, methods of collection, recording of the history of the materials, and curation/storage (Huxley 2005). Acquisition bias is present for all of the samples used in


this project because not every person in the population at large has an equal chance of becoming part of the collections. Certain people are more likely to donate or to become medicolegal cases. For example, femoral length is correlated with donation type in the Terry Collection (Ericksen 1982) and with differences in achieved educational levels and socioeconomic status in the Bass Documented Collection (Wilson et al. 2007). Although all five of the skeletal samples examined for this research are American, there are likely significant differences between these samples with regard to genetic background, overall socioeconomic status, income, living conditions, nutrition, and health care during life. These specific differences cannot be fully understood or quantified because these data are not necessarily available for any given individual or sample, but they must be acknowledged as sources of bias that are inherent in this study. The source of the skeletons within the collections cannot be ignored (Morris 2007). While the Hamann-Todd and Terry anatomical collections are composed of predominantly lower socioeconomic status individuals drawn from hospital and institutional morgues, the Bass Donated and Maxwell Museum collections tend to contain willed bodies from more middle income backgrounds (Corruccini 1974; Hunt & Albanese 2005; Wilson et al. 2007). Therefore, generalized statements about income and socioeconomic status do not apply to all individuals included in the sample. The Terry collection, for example, contains both indigent and willed remains. Morris (2007) reports that self-donated/willed bodies are often those of well-educated individuals who have attained a higher social class than those of indigent acquisitions. As a result, Morris (2007) hypothesizes that differences in observations may reflect different lifestyles regardless of the biological origin of the group.


Clearly, sampling bias is a significant concern; skeletal series do not represent all of the variability within the source population that is anticipated for any given geographic area or period of time (Usher 2002). This problem is exemplified in a study conducted by Komar and Grivas (2008), which compared the Maxwell Museum of Anthropology documented skeletal collection to three New Mexico samples: autopsy, deceased, and living. The Maxwell sample differed significantly from each of these populations in age, sex, race, cause of death, and manner of death. A significant overrepresentation of males, Whites, and elderly individuals in the Maxwell Documented Collection is present. This problem undoubtedly extends to the other samples studied for this research and must be acknowledged as a caveat when drawing conclusions from this data.

Summary
To determine whether older American skeletal series age at a different rate than more recent ones, three research questions will be addressed. First, a determination as to which American aging standards produce significant and non-significant differences in age estimates between older reference and more recent documented skeletal series will be made through formal hypothesis testing. If some or all of the null hypotheses of no difference are rejected, then the next action will be to determine if patterns exist in the results according to method type, indicator used, anatomical region, time, sex, race, or adult stature. Next, an inductive approach to data analysis and a summary of the strengths and weaknesses of current aging standards will be used to determine which standards should be weighted more heavily when considering the primary dissertation question. Finally, all results will be combined to determine whether differences exist in the rate of osteological aging between older and more recent American skeletal collections, using heavily weighted traits as a proxy for whether a generalized pattern of change has occurred.


Chapter 5
Results
Twelve different aging standards were applied to older and more recent American skeletal samples to determine whether both samples age at the same rate. Three hypotheses were tested; their results follow.

Preliminary Data Analysis
Prior to hypothesis testing, the difference between right and left side scores was assessed, intraobserver agreement and descriptive statistics were computed, and assumptions of normality were tested.

Right versus left side morphology
A concordance correlation coefficient was calculated using SAS v9.1 (SAS Institute, Cary, NC) to determine whether the observed phase scores for right side morphology differed significantly from those of the left side. Results are presented in Table 7. The concordance correlation coefficient increases as the true correlation increases and decreases as the within-subject variability increases. Thus, 0.9338, the value calculated for the concordance correlation here, indicates a high true correlation and low variability between right and left side phase scores by individual. Phase scores for right and left sides were not significantly different, so observations from the right side were chosen arbitrarily to streamline data analysis and interpretation.

Table 7: The estimated concordance correlation coefficient, with 95% confidence limits

Statistic                                          Value
Sample Size                                        14919
Mean 1                                             3.9598
Mean 2                                             3.9584
Variance 1                                         4.5849
Variance 2                                         4.5908
Covariance                                         4.284
Concordance Correlation Lower Confidence Limit     0.9317
Concordance Correlation                            0.9338
Concordance Correlation Upper Confidence Limit     0.9358
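The concordance correlation reported in Table 7 can be checked from the tabulated summary statistics. The sketch below assumes Lin's formulation, in which the coefficient equals twice the covariance divided by the sum of the two variances and the squared difference between the means; it is written in Python rather than SAS, and the function name is hypothetical.

```python
def concordance_correlation(mean_1, mean_2, var_1, var_2, covariance):
    """Lin's concordance correlation coefficient from summary statistics."""
    denominator = var_1 + var_2 + (mean_1 - mean_2) ** 2
    if denominator == 0:
        raise ValueError("both variables are constant and identical")
    return 2.0 * covariance / denominator

# Summary statistics for right and left side phase scores, taken from Table 7.
ccc = concordance_correlation(3.9598, 3.9584, 4.5849, 4.5908, 4.284)
```

Rounded to four decimal places, the result agrees with the tabulated value of 0.9338. Because the two means and the two variances are nearly identical, almost all of the departure from perfect concordance comes from the covariance falling short of the variances.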

Intraobserver agreement
A random subset of 50 individuals was drawn from the dataset to test for observer inconsistencies using the weighted Kappa statistic; ten individuals were randomly selected from each skeletal series to form the subset. Table 8 summarizes the weighted Kappa results for each aging standard scored. The bold text highlights the highest and lowest values. The observer had the highest agreement values for the Todd and Suchey-Brooks pubic symphyseal scoring systems. This result was expected because the Todd [...]

Table 8: Agreement between first and second observation scores using the weighted Kappa statistic

Method (obs1 vs. obs2)                     Value*    ASE       95% Confidence Limits
Todd pubic symphysis                       0.9146    0.0337    0.8487    0.9806
Suchey-Brooks pubic symphysis              0.8628    0.0512    0.7624    0.9631
Hartnett-Fulginiti pubic symphysis         0.8164    0.0558    0.7070    0.9258
Lovejoy et al. auricular surface           0.6748    0.0743    0.5292    0.8204
İşcan et al. 4th rib end                   0.7152    0.0801    0.5583    0.8722
Meindl & Lovejoy cranial sutures           0.8342    0.0154    0.8039    0.8644
Boldsen et al. transition analysis (all)   0.8113    0.0137    0.7844    0.8382
Boldsen et al. transition analysis (PS)    0.8296    0.0226    0.7854    0.8738
Boldsen et al. transition analysis (AS)    0.7793    0.0234    0.7334    0.8252
Boldsen et al. transition analysis (CS)    0.7645    0.0290    0.7076    0.8214
Total (all methods combined)               0.8765    0.0069    0.8629    0.8900

* All weighted Kappa values are significant at the p=0.05 level (Gwet 2002). A one-sided test of the Pr > Z and a two-sided test of the Pr > |Z| for the H0: weighted kappa = 0 both returned p values of [...]

| Df | Deviance | Pr(>\|Chi\|) |
| --- | --- | --- |
| 1 | 2112.5 | 0.121 |
| 1 | 12799 | 4.45E-08 |
| 1 | 4071.7 | 0.009002 |
| 1 | 2950.2 | 0.0116 |
| 1 | 2290.6 | 0.002416 |

did not have a significantly different slope from Reference Whites for Lovejoy and colleagues’ auricular surface and the Transition Analysis pubic symphysis standard. In addition, Whites showed nearly identical results to males; the only differences observed were for Lovejoy and colleagues’ auricular surface standard and Boldsen and colleagues’ auricular surface superior demiface topography component. Differences between Reference and Recent American Whites in the rate of progression through age-related morphological changes were observed for all standards except Lovejoy and colleagues’ auricular surface method; Table 29 shows that the Recent Whites age at a slower rate than their Reference counterparts for all standards producing

Table 29: Recent Whites rate of progression through morphological stages, compared to Reference Whites

Columns: Faster, Slower, No difference

Pubic symphysis: Todd; Suchey-Brooks; Hartnett-Fulginiti; Boldsen superior apex; Boldsen ventral symphyseal margin; Boldsen dorsal symphyseal margin
Auricular surface: Lovejoy et al.; Boldsen superior demiface topography

X X X X X X X X

a statistically significant result, except the superior demiface topography component of the auricular surface. The residual deviance and residual degrees of freedom data for both Blacks and Whites indicate poorly fitting models for all aging methods, regardless of indicator.

Sex-race category

The ages at transition were calculated and an analysis of deviance was performed based on four sex-race categories to explore the possible interaction between sex and race effects (Tables 30-31). Tables with the ages at transition and plots of the data are presented in Appendix L. The sample size for Recent Black females was too small to calculate these values, so results for this category were not obtained. Again, all models were poorly fitted to the data, regardless of aging method or sex-race category. Differences between Reference and Recent American sex-race categories in the rate of progression through age-related morphological changes were observed for all

Table 30: Analysis of deviance and improvement chi-square output: Sex-race categories

CHI-SQUARE LIKELIHOOD RATIO TESTS: SEX-RACE CATEGORIES

| Aging Standard | Model | Resid. Df | Resid. Dev | Test | Df | LR stat. | Pr(Chi) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Todd | phase=log(age) + pop | 908 | 2052.673 | | | | |
| | phase=log(age) + pop + log(age):pop | 907 | 2029.499 | 1 vs. 2 | 1 | 23.17455 | 1.48E-06 |
| Suchey-Brooks | phase=log(age) + pop | 912 | 1849.362 | | | | |
| | phase=log(age) + pop + log(age):pop | 911 | 1838.391 | 1 vs. 2 | 1 | 10.97097 | 0.000926 |
| Hartnett-Fulginiti | phase=log(age) + pop | 911 | 2301.126 | | | | |
| | phase=log(age) + pop + log(age):pop | 910 | 2289.208 | 1 vs. 2 | 1 | 11.91767 | 0.000556 |
| Boldsen superior apex | phase=log(age) + pop | 862 | 1253.195 | | | | |
| | phase=log(age) + pop + log(age):pop | 861 | 1232.994 | 1 vs. 2 | 1 | 20.20041 | 6.97E-06 |
| Boldsen ventral symphyseal margin | phase=log(age) + pop | 906 | 2081.484 | | | | |
| | phase=log(age) + pop + log(age):pop | 905 | 2051.077 | 1 vs. 2 | 1 | 30.4067 | 3.50E-08 |
| Boldsen dorsal symphyseal margin | phase=log(age) + pop | 911 | 1784.432 | | | | |
| | phase=log(age) + pop + log(age):pop | 910 | 1762.248 | 1 vs. 2 | 1 | 22.18320 | 2.48E-06 |
| Lovejoy et al. | phase=log(age) + pop | 764 | 2135.400 | | | | |
| | phase=log(age) + pop + log(age):pop | 763 | 2122.252 | 1 vs. 2 | 1 | 13.1479 | 0.000288 |
| Boldsen superior demiface topography | phase=log(age) + pop | 773 | 1260.990 | | | | |
| | phase=log(age) + pop + log(age):pop | 772 | 1259.896 | 1 vs. 2 | 1 | 1.094386 | 0.2955014 |

Table 31: Analysis of deviance output: Sex-race categories

ANALYSIS OF DEVIANCE TABLE (glm): Sex-race categories

| Aging Standard | Model | Resid. Df | Resid. Dev | Df | Deviance | Pr(>\|Chi\|) |
| --- | --- | --- | --- | --- | --- | --- |
| Transition Analysis pubic symphysis | TA point estimate=age + pop | 957 | 558534 | | | |
| | TA point estimate=age + pop + age:pop | 956 | 550698 | 1 | 7835.7 | 0.0002259 |
| Transition Analysis auricular surface | TA point estimate=age + pop | 818 | 263426 | | | |
| | TA point estimate=age + pop + age:pop | 817 | 261571 | 1 | 1854.8 | 0.01609 |
| Transition Analysis cranial sutures | TA point estimate=age + pop | 811 | 499043 | | | |
| | TA point estimate=age + pop + age:pop | 810 | 498804 | 1 | 238.04 | 0.5341 |
| Transition Analysis combo, uniform | TA point estimate=age + pop | 968 | 308967 | | | |
| | TA point estimate=age + pop + age:pop | 967 | 308960 | 1 | 7.0946 | 0.8783 |
| Transition Analysis combo, forensic | TA point estimate=age + pop | 968 | 149051 | | | |
| | TA point estimate=age + pop + age:pop | 967 | 148510 | 1 | 541.35 | 0.06045 |

standards except Boldsen and colleagues’ auricular surface superior demiface topography component and the Transition Analysis cranial suture and combined indicator standards; Tables 32-33 show that the Recent White males and Recent Black males aged at a slower rate than their Reference counterparts.
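The likelihood ratio statistics in Table 30 can be checked by hand: each is the difference between the residual deviances of the model without and with the log(age):pop interaction, referred to a chi-square distribution with 1 degree of freedom. The sketch below is a minimal illustration using only the standard library; the function name is hypothetical, and it relies on the identity that the 1-df chi-square upper tail equals erfc(sqrt(x/2)).

```python
import math

def lr_test(dev_reduced, dev_full, df=1):
    """Likelihood ratio statistic and chi-square p-value for nested models."""
    if df != 1:
        raise ValueError("this sketch only covers a 1-df comparison")
    stat = dev_reduced - dev_full
    # For 1 df, P(chi-square > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Residual deviances from Table 30 (without and with log(age):pop).
todd_stat, todd_p = lr_test(2052.673, 2029.499)
demiface_stat, demiface_p = lr_test(1260.990, 1259.896)
```

A small p-value indicates that allowing the slope on log(age) to differ by group improves the fit, which is how a difference in the rate of progression is detected in these tables.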

Table 32: Recent White males rate of progression through morphological stages, compared to Reference White males

Columns: Faster, Slower, No difference

Pubic symphysis: Todd; Suchey-Brooks; Hartnett-Fulginiti; Boldsen superior apex; Boldsen ventral symphyseal margin; Boldsen dorsal symphyseal margin
Auricular surface: Lovejoy et al.; Boldsen superior demiface topography

X X X X X X X X

Table 33: Recent Black males rate of progression through morphological stages, compared to Reference Black males

Columns: Faster, Slower, No difference

Pubic symphysis: Todd; Suchey-Brooks; Hartnett-Fulginiti; Boldsen superior apex; Boldsen ventral symphyseal margin; Boldsen dorsal symphyseal margin
Auricular surface: Lovejoy et al.; Boldsen superior demiface topography

X X X X X X X X

Recent White males aged at a slightly decelerated rate for all of the pubic symphyseal standards and components that produced a statistically significant difference between the total Reference and Recent skeletal samples. Pubic symphyseal aging standards did not appear to be equally applicable to Reference and Recent White males. Recent Black males also aged at a decelerated rate for Lovejoy and colleagues’ auricular surface standard and all three of Boldsen and colleagues’ pubic symphyseal components that produced a statistically significant difference between the total Reference and Recent skeletal samples; however, no significant difference between Reference and Recent Black males was present for the traditional phase-based pubic symphyseal standards. This result implied that only certain morphological features of the pubic symphyseal age indicator had shifted in terms of their rate of progression through age-related stages. Accordingly, Lovejoy and colleagues’ auricular surface aging standard did not appear to be equally applicable to Reference and Recent Black males.

Time, by ten-year birth cohorts

It was hypothesized that the temporal difference between the Reference and Recent samples may also explain the pattern of mixed results. The average year of birth for the Reference sample was 1878 (range 1828-1943), while it was 1939 (range 1889-1985) for the Recent sample. The difference in birth years may allow for the recognition of subtle changes in the rate of progression of skeletal age changes, possibly resulting from secular trends, environmental factors, cultural practices, socioeconomic status, improvements in living conditions, diet, disease prevalence, and advances in health care and disease prevention (Eveleth 1975; Eveleth 1979; Frisancho 1978; Fogel 1986; Malina et al. 1987; Bogin 1999).

In the last two decades, numerous anthropological investigations have reported the presence of secular change in human skeletal dimensions and morphology; in addition, trends toward earlier maturation were noted for the onset of menarche and the development of secondary sexual characteristics. A possible diachronic trend may also exist for developmental-based skeletal aging systems in American males and females (Webb & Suchey 1985). The shift in the timing of physiological and skeletal maturation hinted at potential changes in the rates of osteological degeneration and senescence as well.

Tables 34-35 summarize the statistical output for differences among birth year cohorts. Bold type denotes statistically significant differences among cohorts, which included all pubic symphyseal standards and components that produced significant results for the comparison between Reference and Recent total samples except for the Hartnett-Fulginiti method. However, the residual deviance and residual degrees of freedom data reported in the tables indicate poorly fitting models for all aging methods.
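The grouping of individuals into ten-year birth cohorts can be sketched as follows. The binning rule (cohorts aligned to calendar decades) and the function name are assumptions made for illustration; only the birth-year ranges come from the text. Under that assumption, the combined range of 1828 to 1985 spans seventeen decade cohorts, which agrees with the number of ten-year cohorts referred to in this chapter.

```python
def birth_cohort(year, width=10):
    """Return the first year of the cohort that contains `year`."""
    return year - year % width

# Birth-year ranges reported for the Reference and Recent samples.
reference_years = range(1828, 1943 + 1)
recent_years = range(1889, 1985 + 1)

# Distinct decade cohorts covered by the two ranges combined.
cohorts = sorted({birth_cohort(y) for y in (*reference_years, *recent_years)})

# Cohorts containing the average birth years of the two samples.
reference_mean_cohort = birth_cohort(1878)
recent_mean_cohort = birth_cohort(1939)
```

Passing `width=25` gives the wider cohorts that were also tried; wider bins reduce the number of cohorts but cannot supply ages that are missing from the earliest birth years.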

Table 34: Analysis of deviance and improvement chi-square output: 10-year birth cohorts

CHI-SQUARE LIKELIHOOD RATIO TESTS: 10-YEAR BIRTH COHORTS

| Aging Standard | Model | Resid. Df | Resid. Dev | Test | Df | LR stat. | Pr(Chi) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Todd | phase=log(age) + pop | 908 | 2077.080 | | | | |
| | phase=log(age) + pop + log(age):pop | 907 | 2068.393 | 1 vs. 2 | 1 | 8.686526 | 0.00320571 |
| Suchey-Brooks | phase=log(age) + pop | 912 | 1857.601 | | | | |
| | phase=log(age) + pop + log(age):pop | 911 | 1850.504 | 1 vs. 2 | 1 | 7.097512 | 0.007719102 |
| Hartnett-Fulginiti | phase=log(age) + pop | 911 | 2309.186 | | | | |
| | phase=log(age) + pop + log(age):pop | 910 | 2307.281 | 1 vs. 2 | 1 | 1.905282 | 0.1674883 |
| Boldsen superior apex | phase=log(age) + pop | 862 | 1261.522 | | | | |
| | phase=log(age) + pop + log(age):pop | 861 | 1256.405 | 1 vs. 2 | 1 | 5.116556 | 0.0236986 |
| Boldsen ventral symphyseal margin | phase=log(age) + pop | 906 | 2096.774 | | | | |
| | phase=log(age) + pop + log(age):pop | 905 | 2092.876 | 1 vs. 2 | 1 | 3.898507 | 0.04832903 |
| Boldsen dorsal symphyseal margin | phase=log(age) + pop | 911 | 1804.104 | | | | |
| | phase=log(age) + pop + log(age):pop | 910 | 1796.088 | 1 vs. 2 | 1 | 8.016078 | 0.004636387 |
| Lovejoy et al. | phase=log(age) + pop | 764 | 2136.237 | | | | |
| | phase=log(age) + pop + log(age):pop | 763 | 2134.638 | 1 vs. 2 | 1 | 1.598872 | 0.2060632 |
| Boldsen superior demiface topography | phase=log(age) + pop | 773 | 1259.167 | | | | |
| | phase=log(age) + pop + log(age):pop | 772 | 1259.118 | 1 vs. 2 | 1 | 0.04852136 | 0.8256565 |

Table 35: Analysis of deviance output: 10-year birth cohorts

ANALYSIS OF DEVIANCE TABLE (glm): 10-year birth cohorts

| Aging Standard | Model | Resid. Df | Resid. Dev | Df | Deviance | Pr(>\|Chi\|) |
| --- | --- | --- | --- | --- | --- | --- |
| Transition Analysis pubic symphysis | TA point estimate=age + pop | 957 | 564711 | | | |
| | TA point estimate=age + pop + age:pop | 956 | 564636 | 1 | 74.21 | 0.723 |
| Transition Analysis auricular surface | TA point estimate=age + pop | 818 | 274819 | | | |
| | TA point estimate=age + pop + age:pop | 817 | 273801 | 1 | 1017.6 | 0.08141 |
| Transition Analysis cranial sutures | TA point estimate=age + pop | 811 | 507106 | | | |
| | TA point estimate=age + pop + age:pop | 810 | 505795 | 1 | 1311.0 | 0.1474 |
| Transition Analysis combo, uniform | TA point estimate=age + pop | 968 | 307796 | | | |
| | TA point estimate=age + pop + age:pop | 967 | 306803 | 1 | 992.93 | 0.07688 |
| Transition Analysis combo, forensic | TA point estimate=age + pop | 968 | 149219 | | | |
| | TA point estimate=age + pop + age:pop | 967 | 149194 | 1 | 25.777 | 0.6827 |
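The probabilities in Table 35 appear consistent with a chi-square test in which the drop in residual deviance is divided by a dispersion estimate taken from the larger model (its residual deviance over its residual degrees of freedom). That reading is an assumption about how the output was produced, not something stated in the text, and the function name below is hypothetical. A minimal sketch for one row:

```python
import math

def deviance_chisq_test(dev_reduced, dev_full, resid_df_full):
    """Chi-square test for one added term in a Gaussian linear model.

    The drop in deviance is scaled by the dispersion estimated from
    the larger model (residual deviance / residual df).
    """
    drop = dev_reduced - dev_full
    dispersion = dev_full / resid_df_full
    scaled = drop / dispersion
    # 1-df chi-square upper tail: P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(scaled / 2))
    return drop, p_value

# Transition Analysis auricular surface rows of Table 35.
drop, p_value = deviance_chisq_test(274819, 273801, 817)
```

Because the residual deviances are printed as whole numbers, the recomputed drop (1018) differs slightly from the tabled deviance (1017.6), and the recomputed probability matches the tabled 0.08141 only to about three decimal places.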

The ages at transition were not calculated for the seventeen ten-year birth cohorts because most cohorts did not represent the range of ages and observed stages needed to produce robust estimates. The dataset was biased toward older individuals for the earliest birth years, which resulted in a lack of young ages and low phases for these cohorts. Likewise, the later cohorts were restricted to younger ages, simply because an individual born recently could not attain elderly ages and be included in skeletal series that were established in the 1980s. Even when cohort widths were increased to twenty-five years, the earliest cohorts still lacked a sufficient number of younger individuals, preventing a statistically sound calculation of ages at transition. Little can be said about whether the differences in pubic symphyseal indicators were the result of birth year, genetic diversity, or environmental variables.

Adult Stature

Finally, adult stature was hypothesized to explain the pattern of results. The adult height of individuals was used as a proxy for childhood health and socioeconomic status because these data were available. The justification for this variable was the literature on growth and development of children with varying socioeconomic backgrounds (Ericksen 1982; Peck & Lundberg 1995; Bogin 1999), though catch-up growth may mitigate any negative effects on attained adult stature.

Tables 36-37 summarize the statistical output for differences among short, average, and tall individuals. Bold type denotes statistically significant differences among the groups, which were only present for the Transition Analysis auricular surface and cranial suture closure standards. These findings suggest that adult stature had little impact on the explanation of the patterns of age-related rate changes between Reference and Recent groups; however, the models tested are a poor fit to the actual data. While markers of lower socioeconomic status in childhood were correlated with delayed skeletal development and maturation in the literature, they do not appear to contribute much to explaining rate changes in skeletal degenerative age markers. Perhaps adult socioeconomic, nutritional, and health data, which were conspicuously absent in the documentation of existing American osteological samples, would be better for explaining the patterns observed.

Table 36: Analysis of deviance and improvement chi-square output: Adult stature

CHI-SQUARE LIKELIHOOD RATIO TESTS: STATURE

| Aging Standard | Model | Resid. Df | Resid. Dev | Test | Df | LR stat. | Pr(Chi) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Todd | phase=log(age) + pop | 520 | 1261.451 | | | | |
| | phase=log(age) + pop + log(age):pop | 519 | 1261.204 | 1 vs. 2 | 1 | 0.2472045 | 0.6190504 |
| Suchey-Brooks | phase=log(age) + pop | 524 | 1139.065 | | | | |
| | phase=log(age) + pop + log(age):pop | 523 | 1139.047 | 1 vs. 2 | 1 | 0.01743218 | 0.8949598 |
| Hartnett-Fulginiti | phase=log(age) + pop | 523 | 1358.167 | | | | |
| | phase=log(age) + pop + log(age):pop | 522 | 1358.162 | 1 vs. 2 | 1 | 0.004172073 | 0.9484992 |
| Boldsen superior apex | phase=log(age) + pop | 488 | 751.0584 | | | | |
| | phase=log(age) + pop + log(age):pop | 487 | 749.3637 | 1 vs. 2 | 1 | 1.694737 | 0.1929776 |
| Boldsen ventral symphyseal margin | phase=log(age) + pop | 520 | 1237.644 | | | | |
| | phase=log(age) + pop + log(age):pop | 519 | 1235.686 | 1 vs. 2 | 1 | 1.958401 | 0.1616845 |
| Boldsen dorsal symphyseal margin | phase=log(age) + pop | 524 | 1074.060 | | | | |
| | phase=log(age) + pop + log(age):pop | 523 | 1073.991 | 1 vs. 2 | 1 | 0.06918113 | 0.7925328 |
| Lovejoy et al. | phase=log(age) + pop | 515 | 1461.386 | | | | |
| | phase=log(age) + pop + log(age):pop | 514 | 1461.154 | 1 vs. 2 | 1 | 0.2312174 | 0.6306226 |
| Boldsen superior demiface topography | phase=log(age) + pop | 522 | 867.5873 | | | | |
| | phase=log(age) + pop + log(age):pop | 521 | 867.4645 | 1 vs. 2 | 1 | 0.1227854 | 0.7260329 |

Table 37: Analysis of deviance output: Adult stature

ANALYSIS OF DEVIANCE TABLE (glm): Stature

| Aging Standard | Model | Resid. Df | Resid. Dev | Df | Deviance | Pr(>\|Chi\|) |
| --- | --- | --- | --- | --- | --- | --- |
| Transition Analysis pubic symphysis | TA point estimate=age + pop | 555 | 347549 | | | |
| | TA point estimate=age + pop + age:pop | 554 | 346766 | 1 | 782.63 | 0.2635 |
| Transition Analysis auricular surface | TA point estimate=age + pop | 562 | 203109 | | | |
| | TA point estimate=age + pop + age:pop | 561 | 201427 | 1 | 1682.1 | 0.03043 |
| Transition Analysis cranial sutures | TA point estimate=age + pop | 558 | 363372 | | | |
| | TA point estimate=age + pop + age:pop | 557 | 359928 | 1 | 3443.7 | 0.02097 |
| Transition Analysis combo, uniform | TA point estimate=age + pop | 563 | 153497 | | | |
| | TA point estimate=age + pop + age:pop | 562 | 152608 | 1 | 888.16 | 0.07052 |
| Transition Analysis combo, forensic | TA point estimate=age + pop | 563 | 89097 | | | |
| | TA point estimate=age + pop + age:pop | 562 | 88662 | 1 | 434.36 | 0.09706 |

Synopsis

The general pattern observed was suggestive of strong influences by age indicator, anatomical region of the indicator, sex, and race. Adult stature did not appear to be a major influence on the pattern produced, and sampling issues plagued the analysis of birth year cohort data, rendering it statistically unsound. Both method types produced similar results, indicating that the pattern of significant results observed was not explained by this variable. In contrast, pelvic indicators, specifically the pubic symphysis and auricular surface, both resulted in significant outcomes that accounted for most of the pattern. When the Reference and Recent groups were divided by sex, race, and sex-race categories, White males emerged as the subset contributing most to the pattern of mixed results obtained, followed by Black males. In general, if a significant difference was observed between groups, the results indicate that, on average, the Recent group aged at a slightly slower rate than the Reference sample.

The significant differences observed between the Reference and Recent populations for pelvic indicators of age appeared to be primarily driven by males and Whites. This may be the result of small sample sizes for females and Blacks. Though the entire dataset was nearly evenly split between the sexes, with a composition of 54% (n=523) males and 46% (n=448) females, two of the skeletal series classified as Recent had an abundance of males that accounted for approximately two-thirds of the remains within each sample. In addition, African Americans only account for about one third (n=328) of the total dataset, because the three Recent skeletal series were significantly biased toward Whites. This sample bias may have influenced the trends observed.
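The composition figures quoted in the synopsis follow directly from the reported counts. The short check below assumes that the total is the sum of the male and female counts, which is not stated explicitly; the variable names are illustrative.

```python
males, females, african_american = 523, 448, 328

# Assumed total: every individual is counted once as male or female.
total = males + females

pct_male = round(100 * males / total)
pct_female = round(100 * females / total)
share_african_american = african_american / total
```

With a total of 971 individuals, the rounded shares are 54% and 46%, and the African American share is close to one third.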

Question 3

In the case of contrasting results from multiple aging standards for a single skeletal indicator, which standard is the true indicator of whether a change in the rate of aging has occurred?

The comparison between Reference and Recent American skeletal samples produced results in which some standards for an indicator signified a statistically significant difference in the aging rate and others did not; this was the case for the pubic symphysis, auricular surface, and cranial sutures. Which components or standards should be used to determine whether a change in the rate of progression through the age-related morphological metamorphoses of these indicators has actually occurred between Reference and Recent groups? To select which standards would be used to address the primary dissertation question of whether changes in the aging processes of the pubic symphysis, auricular surface, and cranial sutures have occurred for American skeletal samples, a combination of information was used, including published data on the strengths and weaknesses of aging standards, stepwise regression to determine which standards/components were the best predictors of chronological age, and other factors contributing to the reliability of the applied standards.

Standards supported by the anthropological literature

Vigorous debate has ensued over the most accurate and reliable standard for age at death estimation from the skeleton. A summary of the strengths and weaknesses of each age indicator is presented in Table 38. All of the osteological age estimation standards tested for this research were known to have inaccuracy that increased with chronological age (Buikstra & Konigsberg 1985; Lovejoy et al. 1985a; Katz & Suchey 1986; Murray & Murray 1991; Dudar et al. 1993; Bedford et al. 1993; Santos 1996; Nagar & Hershkovitz 2004). Most anthropologists agreed that using a combination of indicators was the best approach to estimating age from the skeleton (Brooks 1955; Lovejoy et al. 1985a; Krogman & İşcan 1986; Acsádi & Nemeskéri 1970; Brooks & Suchey 1990; Aykroyd et al. 1999; Baccino et al. 1999; Ritz-Timme et al. 2000; Rösing et al. 2007); each indicator provided some information, and the error of each was assumed to be largely independent, thus improving the accuracy of the age estimate. If only a single indicator of age at death could be used, the general consensus was that the pubic symphysis was the best indicator and that cranial suture closure was the worst. However, comparatively little discussion has taken place about the strengths and weaknesses of Boldsen and colleagues’ Transition Analysis scoring standards.

Pubic symphysis

Standards and components indicating a statistically significant difference in the rate of aging of the pubic symphysis between Reference and Recent American samples

Table 38: Summary of strengths and weaknesses for each aging indicator from the literature

Cranial suture closure (Meindl & Lovejoy)

PROs
· clear, sequential age changes
· low interobserver error

CONs
· unreliable
· asymmetric
· defined endpoint can be reached long before death
· large variability in rates of closure
· weak correlation between closure and age
· some claim obliteration is independent of age
· large error, esp. 50+

Pubic symphysis (T=Todd) (SB=Suchey-Brooks) (Hartnett-Fulginiti)

PROs
· irreversible, sequential age changes
· most studied
· generally considered the best indicator
· relatively high correlation with age
· (SB) highly regarded, attempted to fix known problems with Todd’s standard
· (SB) larger, more representative sample
· (SB) clearer phase descriptions and reference casts

CONs
· (T) known problems with the reference sample, including questionable documentation of age and purposeful elimination of variation
· overestimates the age of young
· inaccurate for older individuals
· inconclusive results regarding interobserver error rates
· variability in morphology, hard to classify into a single phase
· potential for confusion with the developmental vs. the degenerative changes to the ventral rampart
· possibly affected by childbirth

Auricular surface (Lovejoy et al.)

PROs
· good preservation of indicator
· satisfies assumption of uniformitarianism
· some claim performs better than the pubic symphysis for older individuals
· allows for age-related changes after age 50

CONs
· low repeatability
· low reliability
· high interobserver error
· needs larger age ranges
· strong methodological bias
· ambiguous morphology, hard to classify into a single phase
· too variable
· subject to the influences of pregnancy

Sternal end of the fourth rib (İşcan et al.)

PROs
· low bias for young adults
· little difference between 4th rib and alternative numbers
· not affected by mechanical stress like the pelvis
· low interobserver error

CONs
· difficult to identify 4th rib in fragmentary remains
· possibly affected by mechanical stress (lower ribs only)
· poor preservation

Transition Analysis (Boldsen et al.)

PROs
· not subject to age mimicry
· can handle variation in suite of morphological traits observed by scoring components of the age indicator
· age assessment can be specialized to an individual
· small average difference between predicted and actual ages
· smaller 95% confidence interval
· multiple indicators generally considered best
· combination of pubic, auricular, and cranial indicators best, eliminating outliers and methodological biases

CONs
· unproven in the literature
· increase in variation seen as age increases (like other methods)
· possible problem with older black females in the reference sample

included the following: Todd, Suchey-Brooks, Hartnett-Fulginiti, Boldsen and colleagues’ superior apex, ventral and dorsal margins, and Transition Analysis using a combination of all five Boldsen and colleagues’ pubic components. The anthropological literature clearly favors the pubic symphysis as the indicator of choice for estimating age from the human skeleton. While methodological concerns were reported for the Todd method and comparative data were lacking for Hartnett’s modifications and Transition Analysis, these standards were concordant with the Suchey-Brooks method, which was the most highly regarded osteological aging standard in the literature because of its relatively high correlations with chronological age and its diverse reference sample. Boldsen and colleagues’ pubic symphyseal components were subsumed within the phase descriptions defined for the traditional methods; this suggested that these components may be the portions of the pubic symphysis that were driving the rate difference between Reference and Recent groups. Based on the anthropological literature, the Suchey-Brooks standard was weighted most heavily of all four pubic symphyseal aging methods.

Auricular surface

Standards and components indicating a statistically significant difference in the rate of aging of the auricular surface between Reference and Recent American samples included Lovejoy and colleagues’ phase-based auricular surface standard, Boldsen and colleagues’ superior demiface topography, and Transition Analysis using a combination of all nine Boldsen and colleagues’ auricular components. The anthropological literature presented mixed reviews of age estimation using the auricular surface indicator. Despite the claim that it provided better estimates for older individuals, Lovejoy and colleagues’ phase-based auricular surface standard has been criticized for low repeatability, low reliability, high interobserver error, strong methodological bias, and ambiguous morphology that was hard to classify into a single phase. As with the pubic symphysis, there was a lack of critique in the literature specific to the Transition Analysis auricular surface method, though the method was generally highly regarded. Based on published anthropological reports, Lovejoy and colleagues’ standard was not weighted as heavily as Boldsen and colleagues’ Transition Analysis auricular surface method, though it was unclear exactly how heavily to weight the Transition Analysis standard itself.

Fourth Rib

İşcan and colleagues’ aging method did not indicate a difference in the rate of senescent change of the sternal end of the fourth rib. The anthropological literature reported that this aging standard had low bias for young adults and low interobserver error; in addition, the sternal end of the fourth rib was not affected by mechanical stress like pelvic indicators. The most significant drawbacks of this indicator included poor preservation in archaeological samples and difficulty isolating the fourth rib, though scoring adjacent ribs can rectify the latter problem.

Cranial sutures

For this research, only Transition Analysis using a combination of all five of Boldsen and colleagues’ sutural components produced a statistically significant difference in the rate of aging between Reference and Recent American samples. The anthropological literature presented predominantly negative reviews of age estimation based on cranial suture obliteration. Major critiques of age estimation using cranial suture obliteration included low reliability, large variability in rates of closure, weak correlation between closure and age, error introduced by asymmetric obliteration, and large error for older individuals, specifically because the defined end-point of complete obliteration can be reached long before death. However, these critiques were specifically targeted at traditional cranial suture closure standards, like that of Meindl and Lovejoy; little critique has been directly aimed at age estimation from cranial sutures using Boldsen and colleagues’ components and Transition Analysis. Based on published anthropological reports, it was questionable how heavily Boldsen and colleagues’ Transition Analysis cranial suture closure method should be weighted.

Multiple indicators: pubic symphysis, auricular surface, and cranial sutures

Both the uniform- and forensic-prior Transition Analysis standards for multiple indicators produced a statistically significant difference in the rate of aging of Reference and Recent American skeletal populations. As with the Transition Analysis methods based on each of the indicators individually, no critiques specific to the Transition Analysis UNI and COR methods have been raised in the anthropological literature. Again, Transition Analysis was generally regarded as better than traditional phase-based aging standards because it was not subject to age mimicry and it allowed for more variation in the scoring technique used to describe the observed morphology. The multiple indicator Transition Analysis standards were considered to be the best by their developers, because outliers and methodological biases present for each individual indicator were eliminated.

Standards selected for inclusion in the stepwise regression model

A summary of the aging standards and indicator components that were selected for inclusion in the regression model to predict documented chronological age is presented in Table 39. The stepwise regression was performed using all aging standards and indicator components scored for the entire dataset. While the contributions of the first eight variables listed were statistically significant at the 0.05 level, only the addition of the first three added more than a negligible amount to the variation explained by the model.

Table 39: Summary of aging standards and indicator components selected for inclusion in the regression model

Summary of Stepwise Selection: all standards and components

| Variable Entered | Partial R-Square | Model R-Square | C(p) | F Value | Pr > F |
| --- | --- | --- | --- | --- | --- |
| Transition Analysis combo, forensic | 0.5349 | 0.5349 | 148.896 | 390.96 | |
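The relationship between the Partial R-Square and Model R-Square columns can be illustrated with a small forward-selection sketch: at each step the predictor that raises the model R-square the most is entered, and its partial R-square is that increase. This is a simplified stand-in, not the procedure used for Table 39. The stopping rule (a minimum gain in R-square rather than a significance test), the function names, the predictor names, and the data are all hypothetical.

```python
def r_squared(columns, y):
    """R-square of an ordinary least squares fit of y on the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]  # intercept
    p = len(X[0])
    # Augmented normal equations (X'X | X'y), solved by Gauss-Jordan.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
                 for i in range(n))
    return 1 - ss_res / ss_tot

def forward_selection(predictors, y, min_gain=0.001):
    """Greedy forward selection; returns (name, partial R2, model R2) steps."""
    chosen, steps, current = [], [], 0.0
    remaining = dict(predictors)
    while remaining:
        def fit(name):
            return r_squared([predictors[c] for c in chosen]
                             + [remaining[name]], y)
        best = max(remaining, key=fit)
        new = fit(best)
        if new - current < min_gain:  # simplified stopping rule (assumed)
            break
        steps.append((best, new - current, new))
        chosen.append(best)
        current = new
        del remaining[best]
    return steps

# Hypothetical scores; age is constructed from the first two indicators.
predictors = {
    "indicator_a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "indicator_b": [1, -1, 1, -1, 1, -1, 1, -1, 1, -1],
    "indicator_c": [3, 1, 4, 1, 5, 9, 2, 6, 5, 3],
}
age = [20 + 5 * a + b for a, b in
       zip(predictors["indicator_a"], predictors["indicator_b"])]
steps = forward_selection(predictors, age)
```

In this toy example the first predictor explains almost all of the variation, the second adds a small remainder, and the third is never entered, which mirrors the pattern described above in which only the first few variables add more than a negligible amount.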
