UGWU, SYLVIA NJIDEKA

DEVELOPMENT AND VALIDATION OF CRITERION REFERENCED ACHIEVEMENT TEST IN BIOLOGY

SCIENCE EDUCATION

Education


TITLE PAGE

DEVELOPMENT AND VALIDATION OF CRITERION REFERENCED ACHIEVEMENT TEST IN BIOLOGY

BY

UGWU, SYLVIA NJIDEKA PG/M.ED/07/43168

A PROJECT WORK SUBMITTED TO THE DEPARTMENT OF SCIENCE EDUCATION, UNIVERSITY OF NIGERIA, NSUKKA, IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE AWARD OF THE MASTER'S DEGREE IN MEASUREMENT AND EVALUATION

SUPERVISOR: DR. E. K. N. NWAGU

MAY, 2012


CERTIFICATION

I hereby certify that the candidate, Ugwu Sylvia Njideka with Reg. No PG/M.ED/07/43168 has duly effected corrections suggested by the External Examiner. I therefore forward this project work to you for onward transmission to the School of Postgraduate Studies. Thank you.

___________________________ Dr. E. K. N. Nwagu Project Supervisor

____________ Date


APPROVAL PAGE

This project work is an original work by the author and has not been published elsewhere. It has been read and approved by the Faculty of Education, University of Nigeria.

___________________________ Name of Student

_________________ Signature

____________ Date

___________________________ Name of Supervisor

_________________ Signature

____________ Date

___________________________ Name of Head of Department

__________________ Signature

____________ Date

___________________________ Name of External Examiner

__________________ Signature

____________ Date

___________________________ Name of Dean of Faculty

__________________ Signature

____________ Date


DEDICATION

To

My Darling Husband – Dr. A. O. Ogueche
My Babies – Miss Clara Chiamaka Ogueche and Miss Anita Amarachi Ogueche
And My Precious Parents – Mr. and Mrs. Ugwuozor


ACKNOWLEDGEMENT

It is with immense humility that I express my gratitude to God for all His mercies and for the enabling conditions to successfully complete this research work. All glory and honour belong to God; the success of this work is an embodiment of glory to our Father Almighty God and Our Mother Mary. I wish to express my profound gratitude and appreciation to my supervisor, Dr. E. K. N. Nwagu of Science Education, University of Nigeria, Nsukka, for his immense supervisory guidance while this research work was in progress. His magnanimity in lending me useful and rare reference materials, and the intellectual leadership he provided, are most gratefully acknowledged. My unquantifiable gratitude goes to my parents, Mr. Paulinus Ugwuozor and Mrs. Florence Ugwuozor, for their parental care and support throughout the programme. Again, I owe my gratitude to my husband and my babies for their understanding. I also wish to thank my relations and well-wishers for their moral and financial support. I pray for God's abundant blessings on you all.

UGWU, SYLVIA NJIDEKA DEPARTMENT OF SCIENCE EDUCATION (MEASUREMENT AND EVALUATION) UNIVERSITY OF NIGERIA, NSUKKA MAY, 2012.


ABSTRACT

This study was designed to develop and validate a criterion referenced achievement test for senior secondary two students in Biology. This became necessary because of the need for a well developed and validated instrument, with proven psychometric properties, for teachers' use in their continuous assessment exercises. The study was guided by three research questions and two hypotheses, using an instrumentation design. The population for the study was made up of 44,259 Biology students in senior secondary school class two (2010/2011 session) in Enugu State. The sample comprised 113 males and 184 females, a total of 297 students. The instrument developed and validated was the Criterion Referenced Achievement Test (CRAT) for SS2. The validity and reliability of the instrument were established; its internal consistency was found to be 0.51 using the Kuder-Richardson formula 20 (KR-20). Mean and standard deviation were used to answer the research questions, while the t-test and the chi-square test of goodness of fit were used to test the hypotheses at the 0.05 level of significance. The results showed that the items of the CRAT did not deviate significantly from the specifications of the core curriculum, and that there is no statistically significant difference between the mean achievements of males and females. Based on these findings, the researcher recommended that senior secondary 2 Biology teachers make use of the CRAT, or tests similar to it, in assessing their students' achievement in Biology.
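The KR-20 internal-consistency coefficient reported above can be computed directly from a matrix of dichotomous (0/1) item scores. The sketch below is illustrative only; the data are invented for demonstration and are not the study's:

```python
# Kuder-Richardson formula 20 (KR-20) for dichotomously scored items:
#   KR-20 = (k / (k - 1)) * (1 - sum(p*q) / var_total)
# where k is the number of items, p the proportion of examinees answering
# each item correctly, q = 1 - p, and var_total the (population) variance
# of the examinees' total scores.

def kr20(scores):
    """scores: list of per-student lists of 0/1 item scores."""
    k = len(scores[0])                    # number of items
    n = len(scores)                       # number of students
    totals = [sum(row) for row in scores]
    mean_t = sum(totals) / n
    var_t = sum((t - mean_t) ** 2 for t in totals) / n  # population variance
    pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in scores) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_t)

# Illustrative data: 4 students x 3 items (NOT the study's data).
scores = [[1, 1, 1],
          [1, 1, 0],
          [0, 0, 1],
          [0, 0, 0]]
print(round(kr20(scores), 3))  # prints 0.6
```

A value near 1.0 indicates highly consistent items; the study's 0.51 is a moderate figure of the kind this formula produces on short tests.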

TABLE OF CONTENTS

Title Page - i
Certification - ii
Approval Page - iii
Dedication - iv
Acknowledgement - v
Abstract - vi
Table of Contents - viii
List of Tables - x
List of Appendices - xi

CHAPTER ONE: INTRODUCTION
Background of the Study - 1
Statement of the Problem - 7
Purpose of the Study - 8
Significance of the Study - 8
Scope of the Study - 10
Research Questions - 10
Hypotheses - 10

CHAPTER TWO: REVIEW OF LITERATURE
Conceptual Framework - 12
Historical Perspective on Achievement Testing - 14
Importance of Achievement Testing - 15
Factors that Affect Achievement Testing - 18
Strengths and Weaknesses of Criterion-Referenced Test - 22
The Nature of School Biology - 24
Procedures for Development and Validation of Achievement Tests - 26
Theoretical Framework - 44
Review of Related Empirical Studies - 47
Summary of Literature Review - 50

CHAPTER THREE: RESEARCH METHOD
Research Design - 52
Area of Study - 52
Population of the Study - 53
Sample and Sampling Techniques - 53
Instrument for Data Collection - 54
Administration of the Instrument - 54
Method of Data Analysis - 54

CHAPTER FOUR: PRESENTATION OF RESULTS
Research Question One - 55
Research Question Two - 55
Research Question Three - 56
Hypothesis One - 56
Hypothesis Two - 57
Summary of Results - 58

CHAPTER FIVE: DISCUSSION OF RESULTS, CONCLUSION, IMPLICATIONS, RECOMMENDATIONS AND SUMMARY
Discussion of Results - 59
Conclusion - 61
Educational Implication of the Study - 62
Recommendations - 62
Limitations of the Study - 63
Suggestions for Further Research - 63
Summary of the Study - 64

References - 67
Appendices - 73

LIST OF TABLES

Table 1: Mean and Standard Deviation of Students' Achievement in Criterion Referenced Achievement Test in Biology for SS2 (CRAT 2) by Sex - 60
Table 2: Means and Standard Deviations of Males and Females in the Hypothetical CRAT in Biology - 61
Table 3: The Observed and Expected Frequencies of Males' and Females' Mean Achievements - 62

LIST OF APPENDICES

Appendix A: Criterion Referenced Achievement Test (CRAT) for SS2 - 73
Appendix B: Table of Specification for SS2 Criterion Referenced Achievement Test (CRAT) in Biology - 79
Appendix C: Seventeen Local Government Areas in Enugu State - 80
Appendix D: Seventeen Local Government Areas and their Various Education Zones in Enugu State - 81
Appendix E: Names of Schools Used in Trial Testing - 82
Appendix F: Names of Schools Used in Pilot Testing - 83
Appendix G: Item Analysis of SS2 Criterion Referenced Achievement Test (CRAT 2) - 84
Appendix H: Distractor Indices for SS2 Criterion Referenced Achievement Test (CRAT) in Biology - 87
Appendix I: Means and Standard Deviations of Males and Females in the Hypothetical CRAT in Biology - 90

CHAPTER ONE

INTRODUCTION

Background of the Study

Biology, the science of life, is traditionally divided into two branches: Botany, the study of plants, and Zoology, the study of animals (Roberts, 1981). Biology is considered essential to many human endeavours. In the training of a medical doctor, for instance, the more knowledge of life he acquires, the better he understands the human body and how it works in health and in disease. The knowledge of biology is also important in agriculture, because producing more food requires a sound knowledge of plants and of all the essential elements (macro- and micro-nutrients); this knowledge improves food production for the people. The application of biology to agriculture and medicine has resulted in an increase in the population of the world, since there are fewer epidemics nowadays (Stone and Cozen, 1981). Biology is likewise important in industry, for the production of alcoholic drinks, beverages, bread, butter, cheese, tobacco, vinegar and the like, and in food processing and storage.

Biology has played a vital role in education in Nigeria. The researcher has observed that, of all the science subjects taught in secondary schools, biology plays a prominent role in medicine, agriculture, brewing, the petro-chemical industries, and even in geology and mining. Because biology is so indispensable, much emphasis has been placed on biology instruction, especially at the secondary school level, to ensure full realization of the objectives of biology education as stipulated in the National Policy on Education (FRN, 1998). According to the policy, biology education should ensure adequate laboratory and field skills in biology, and meaningful, relevant knowledge of biology for everyday life. To realize these objectives fully, the contents and contexts of the curriculum place great emphasis on field studies, guided discovery, laboratory techniques, skills and conceptual thinking.

Unfortunately, available evidence has revealed that students' performance in biology has been discouraging. According to the WAEC Chief Examiner's report (WAEC, 2010), candidates' weaknesses in biology are prominent in the following areas:

i. Spelling of biological terms and technical words.
ii. Proper interpretation of questions.
iii. Statement of the appropriate units of measurement.
iv. Graph interpretation.
v. Biostatistical knowledge.
vi. Technicalities in drawing and labelling.
vii. Poor adherence to fundamental practical rules.

Senior secondary biology candidates therefore have a number of problems associated with both cognitive and motor skills, as well as poor adherence to sensitive practical rules (Chief Examiner's report, WAEC, 2010). This gave rise to several speculations and arguments at the Science Teachers Association of Nigeria workshops, meetings and seminars held in Port Harcourt in 2002 (STAN, 2002), which were aimed at suggesting appropriate methods for improving the current state of achievement in biology and the other sciences.

In any educational endeavour, there must be a criterion or criteria for measuring success or accomplishment, and the criterion must be uniform and standardized. In Nigerian education, it is advocated that instruments used in measuring achievement be such that minimum error is encountered in the course of the measurement. Tests used for achievement measurement must possess the needed psychometric properties (item facility, discrimination and distractor indices) if they are to serve the purpose for which they are constructed. Locally constructed teacher-made tests are usually the most common tools for both assessment and promotion of students into new classes, yet most teacher-made tests have been found to be carelessly constructed, or constructed without a test blueprint or table of specification (Nwana, 1988). One therefore needs to ask whether students' poor performance is due to the nature of the instruments used in evaluating instruction, especially now that modules are replacing units of work.

Modules are gradually replacing units of work in our education system because a course made up of several units of work, which one lecturer may not be able to cover in one semester, can be separated into independent units that form courses of their own. The separated units can be taken up by other lecturers, thereby lightening each lecturer's workload. This can be done using CRAT.

In recent years, accountability, performance contracting, formative evaluation, computer-assisted instruction, individually prescribed instruction, mastery learning and related developments have spawned an interest in, and need for, a new kind of test: the Criterion Referenced Achievement Test (CRAT), or, as some prefer to call it, the objective-based test (Mehrens and Lehmann, 1978). Accountability can be seen as a matter of redesigning the structures by which education is governed, while performance contracting is a specific approach to education sometimes called the education voucher. What students learn, the instructional methodology of teachers, the professional or academic qualifications of teachers, and the availability of instructional materials all determine, to a greater or lesser extent, students' performance. The influence of these factors can best be traced by CRATs, because in addition to determining who is learning and who is not, a CRAT also helps to determine why a student is not learning. CRAT should therefore be the preferred test for teachers who seek to achieve all the stated objectives of their instruction, especially in Biology.

The important consideration in criterion referenced testing is to ascertain whether a student has attained the stated objectives, rather than his position relative to other students. Traditionally, the principal use of criterion referenced measurement has been in "mastery tests". Mastery, as the word is typically used, connotes an either/or situation: the person has either achieved the objective(s) satisfactorily or has not. Mastery tests are used in programmes of individualized instruction, such as the Individually Prescribed Instruction (IPI) programme, and in the mastery learning model. Such instructional programmes are composed of units or modules, usually considered hierarchical, each based on one or more instructional objectives. Each individual is required to work on a unit until he has achieved a specified minimum level of achievement; the student is then considered to have "mastered" the unit. In such programmes, the instructional decision of what to do with a student does not depend on how his performance compares with that of others. If the student has performed adequately on the objectives, the decision is to move on to the next unit of study; if not, the material covered in the test has to be repeated until adequate performance is achieved.

Criterion referenced data can be used to evaluate (make decisions about) instructional programmes. To determine whether specific instructional treatments or procedures have been successful, it is necessary to have data about outcomes on the specific objectives the programme was designed to teach. A measure comparing students with each other (norm referencing) may not give data as useful as a measure comparing each student's performance with the objectives. Criterion referenced measures also offer benefits for instructional decision-making within the classroom: the diagnosis of specific difficulties, accompanied by a prescription of certain instructional treatments, is necessary in instruction whether or not one uses a mastery approach to learning. Criterion referenced tests can further be useful in broad surveys of educational accomplishment, such as the National Assessment of Educational Progress or state assessment programmes, and they are currently employed in education, business and industry, the civil service and the armed forces in Nigeria, frequently to check the attainment of minimum performance standards (Mehrens and Lehmann, 1978).
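The psychometric properties mentioned earlier, item facility and discrimination, can be estimated from trial-test data. The sketch below uses the simple upper-half/lower-half convention (other conventions, such as comparing the top and bottom 27% of examinees, are also common); the data are invented for illustration:

```python
# Item facility (difficulty) = proportion of examinees answering correctly.
# Discrimination (upper-lower method) = facility in the top-scoring group
# minus facility in the bottom-scoring group; high values mean the item
# separates strong from weak examinees.

def item_indices(scores):
    """scores: per-student lists of 0/1 item scores, one row per student."""
    n, k = len(scores), len(scores[0])
    facility = [sum(row[i] for row in scores) / n for i in range(k)]
    # Rank students by total score, then split into upper and lower halves.
    ranked = sorted(scores, key=sum, reverse=True)
    half = n // 2
    upper, lower = ranked[:half], ranked[-half:]
    discrimination = [
        sum(row[i] for row in upper) / half - sum(row[i] for row in lower) / half
        for i in range(k)
    ]
    return facility, discrimination

# Illustrative trial-test data: 6 students x 2 items.
scores = [[1, 1], [1, 1], [1, 0], [1, 0], [0, 0], [0, 0]]
fac, disc = item_indices(scores)
print(fac)   # item 1 is answered correctly more often than item 2
print(disc)  # both items discriminate between upper and lower groups
```

Items with very low facility (too hard), very high facility (too easy) or near-zero discrimination would be revised or dropped at the item-selection stage.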

It is also of interest to know whether there is any difference between males and females in their performance in biology, using sex as a criterion. Previous studies are inconclusive as to whether any such difference exists. The researcher is of the view that one would not expect differences in students' performance by sex when a well developed and validated CRAT in biology is used.

There are general and peculiar procedures for developing and validating any given class or type of test or scale, as outlined by Onunkwo (2002). The steps are: defining the objectives, specifying the content to be covered, preparing the test blueprint, writing the test items, face validation of the test items, trial testing the test items, carrying out the item analysis, selecting good items, final testing of the items, and printing and production of the test.

The major objective of schools is to bring about desirable changes in the behaviour of learners through the process of teaching and learning. In pursuit of this objective, the teacher needs to know when the objectives have been achieved. There is also a need to appraise with a standardized test in which the parts fit together to cover the range of objectives deemed important and feasible. Without such appraisal, any educational decision-making may be spurious and questionable. There is therefore a need to develop and validate a CRAT in biology based on the cognitive domain, in order to assess the learners' different levels of mental processing and understanding in the content areas used for the study.
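One of the development steps listed above, preparing the test blueprint (table of specification), can be sketched as a simple grid allocating items across content areas and cognitive levels. The content areas below are the study's; the cognitive levels and item counts are illustrative assumptions, not the study's actual blueprint:

```python
# A table of specification maps (content area x cognitive level) to the
# number of items to write, so that test coverage follows the curriculum's
# relative emphasis rather than the item writer's whim.

blueprint = {
    # content area:                   (knowledge, comprehension, application)
    "Tissues and supporting systems": (4, 3, 2),
    "Digestive system in animals":    (3, 3, 2),
    "Basic ecological factors":       (3, 2, 2),
}

total_items = sum(sum(cells) for cells in blueprint.values())
print(total_items)  # 24 items in this illustrative blueprint

for area, (kn, comp, app) in blueprint.items():
    print(f"{area}: {kn + comp + app} items")
```

Item writing then proceeds cell by cell, and the later chi-square check of whether items "deviate from the specifications of the core curriculum" compares the written test against exactly such a grid.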

Statement of the Problem

There has been a steady decline in the achievement of senior secondary students in the science subjects (Chemistry, Physics and Biology). Students' poor achievement in these subjects, especially in biology, has been partly traced to badly constructed tests and to the poor teaching methods adopted by secondary school biology teachers. The poor teaching methods include reliance on the lecture method and the low standard of biology laboratories: most school laboratories are hardly different from ordinary classrooms, with generally poor facilities and equipment and a lack of laboratory attendants and competent teachers (Anene, 1997). The few available teachers now resort to teaching biology with textbooks only.

Despite the importance of CRAT, teachers do not use it in continuous assessment. This may be because they do not know its value, since they do not teach for mastery, or because they are not competent in test construction. There is therefore a great need for CRAT, especially in this era when modules are replacing units of work. Since CRAT is not readily available in biology, there is a great need for a model for teachers to adopt. The problem of this study, then, is:

a) Can a valid and reliable CRAT in biology be developed for senior secondary class two (SS2) students?
b) Can the CRAT so developed be validated for use in Nigerian secondary schools?
c) How effective would the CRAT be in separating students into different achievement groups according to sex?

Purpose of the Study

The purpose of the study is to develop and validate a Criterion Referenced Achievement Test in Biology (CRATB) for use in senior secondary schools in Enugu State. The study is therefore designed to:

1. Develop a Criterion Referenced Achievement Test (CRAT) covering some concepts in biology, to be used in senior secondary schools in Enugu State.
2. Determine the facility, discrimination and distractor indices of the items of the CRAT.
3. Determine the reliability of the CRAT.
4. Determine the influence of sex on the items of the CRAT.

These purposes follow from the nature of the topic of study, which requires detailed assessment of the students on each topic used in the study.

Significance of the Study

This study will be of immense significance to teachers, students and parents in solving their teaching and learning problems in biology. CRAT is important in selection, for instance in hiring applicants for a wide variety of specified industrial jobs and in admitting students to colleges, graduate schools and professional schools. It is an important tool in counselling: an appraisal of the individual's current skills and knowledge is an obvious first step in the educational and vocational planning that constitutes an important objective of the counselling situation. Tests of this kind also constitute an important feature of remedial teaching programmes in schools; they are useful both in identifying students with special educational disabilities and in measuring progress in the course of remedial work.

For all types of learners, the periodic administration of well constructed and properly chosen CRATs serves to facilitate learning. Such tests reveal weaknesses in past learning, give direction to subsequent learning, and motivate the learner. The study will also be useful to teachers in the evaluation of teaching, the improvement of instructional techniques, and the revision of curriculum content, and it can provide information on the adequacy with which essential content is being covered. In situations demanding uniformity of training, as in the military services, such uniformity can be assured by administering a common test. CRAT can likewise indicate how much of the course content is actually retained, and for how long. The CRAT serves as a model with which teachers can test students using properly constructed instruments to determine what was actually taught. The gains to teachers and other educational instructors and planners from reliable and valid assessment of students' needs can hardly be over-emphasized. The result of the study will provide a base-line standard for all senior secondary two (SS2) students in biology.

Scope of the Study

The study covers all government-owned secondary schools in Enugu State. All senior secondary class two (SS2) students in Enugu State will be involved, because the final class of the senior secondary school may not be available when needed for the study. The study will focus on the following content areas:

1. Tissues and supporting systems in animals.
2. Digestive system in animals.
3. Basic ecological factors.

Each content area needs proper assessment of the students to ascertain whether the stated objectives of instruction have been achieved.

Research Questions

The following research questions were formulated to guide the investigation:

1. What is the validity of the CRAT?
2. What is the reliability of the CRAT?
3. What is the influence of sex (gender) on the items of the CRAT?

Hypotheses

As part of the investigation of the problem of this study, the following null hypotheses will be tested at the 0.05 level of significance:

HO1: The items of the CRAT do not deviate significantly from the specifications of the core curriculum.
HO2: There is no significant difference between the mean achievements of male and female SS2 students in Biology as determined by their mean CRAT scores.

CHAPTER TWO

REVIEW OF LITERATURE

The literature review, which generates the conceptual and theoretical frameworks of this study, is presented under the following subheadings:

2.1 Conceptual framework
- Historical perspective on achievement testing.
- Importance of achievement testing.
- Factors that affect achievement testing: gender; home educational background; school environment.
- Strengths and weaknesses of Criterion Referenced Test.
- The nature of school Biology.
- Procedures for development and validation of achievement tests.

2.2 Theoretical framework
- Classical test theory
- Latent trait test theory

2.3 Review of related empirical studies
- Locally developed and validated achievement tests.

2.4 Summary of literature review

Conceptual Framework

In achievement testing, the emphasis is on systematic and purposeful quantification of learning outcomes. Nwana (1982) defined an achievement test as a test given to determine how much the pupils have learned, and stated that this type of test covers daily class tests, weekly tests, terminal tests, end-of-year tests, the First School Leaving Certificate Examination, the School Certificate Examination, the Teachers Grade Two Certificate Examinations, and so on. An achievement test, according to Harbor-Peters (1999), is a type of test used to ascertain the standard of performance of students in a particular area. She also stated that some decisions taken in education necessarily need to be based on achievement results, e.g. promotion into the next class, decisions on certification, curriculum modification, and guidance and counselling.

Teachers often ask questions before, during and after their lessons to ascertain how much of the information, issues and skills concerning the instructional theme at hand the pupils have mastered (Nwagu, 1992). Nwagu further states that teachers also organize tests or examinations at the end of each school term, school year or school programme to assess the pupils' achievement in the various content areas of instruction. Iwuji (1990) defined an achievement test as an instrument given at the end of a teaching-learning programme, used to assess how much a student has been able to achieve in a course he has gone through. According to her, whereas an aptitude test is given before the candidate commences a course, in order to predict the probability of his succeeding in it, an achievement test is given at the end of a teaching/learning programme to assess how much of the teaching objectives the learner has been able to master. Iwuji's explanation is in line with the view of Nwana (1982), who explained an aptitude test as one used to predict or determine the maximum possible achievement, whereas an achievement test is used to determine what has actually been attained after a teaching/learning process: aptitude testing aims at determining the possible maximum achievement attainable, while achievement testing aims at determining the level of achievement so far attained.

Achievement measurement is thus a systematic and purposeful quantification of learning outcomes. It involves determining the degree of attainment of individuals in a test, course or programme to which they were sufficiently exposed. According to Nwagu (1992), an achievement test is an instrument designed to measure relative accomplishment in a specified area of work, used to assess how much a student is able to achieve in a course. Achievement testing is therefore concerned with carrying out a systematic, directed, organized and purposeful assessment or quantification of the learner's attainment of instructional objectives. It is so important that teachers, schools and institutions cannot do without it. An achievement test can be either norm referenced or criterion referenced. A criterion referenced test has to do with determining the extent to which an individual has been able to reach set standards; that is, determining whether a person has achieved a specific set of objectives or not.
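The criterion referenced decision just described reduces to a fixed cut-off check on each objective, independent of how classmates perform. A minimal sketch (the cut-off and scores are illustrative):

```python
# Criterion referenced scoring: a student "masters" an objective when the
# number of correct answers reaches a preset cut-off, regardless of how
# other students perform on the same test.

def mastery(correct, total, cutoff):
    """Return True when the student meets the cut-off for the objective."""
    assert 0 <= cutoff <= total
    return correct >= cutoff

# Illustrative: 10 questions on a topic, cut-off of 5 correct answers.
print(mastery(7, 10, 5))  # True  - objective attained, move to next unit
print(mastery(3, 10, 5))  # False - material must be repeated
```

Because the standard is fixed, it is possible for every examinee to pass, or for every examinee to fail, which is exactly the property the surrounding text attributes to criterion referenced tests.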

A criterion referenced test, as explained by Onunkwo (2002), is a test administered to a student in order to determine his level of proficiency in a particular body of knowledge, objective or skill. This function is in opposition to that of a norm referenced test, which compares a student's achievement in a test with the achievement of his classmates in the same test. The emphasis in a criterion referenced test is on whether a student has mastered the task expected of him and is able to perform it, not on how his mastery compares with that of the other students who took the same test. Usually a cut-off point is set, and any student reaching the cut-off point is assumed to have mastered the task. For instance, if there are ten questions on a topic, five correct answers out of the ten may be set as the cut-off point for that topic. Okoye (1997) explained that a student passes a criterion referenced test only when the score he obtains is up to the standard that has been set; thus it is possible for all the students who took a criterion referenced test to fail, or for all to pass.

Historical Perspective on Achievement Testing

Assessment has been an integral part of human life from the beginning. It has been argued that the formal assessment practices which emerged from formal education are one of the clearest indices of the relationship between school and society, since they provide for communication between the two (Ugwu, 1995).

International perspective


In Europe, it was economic and social forces that led to the initial institution of various types of quality control mechanisms in education and to the massive development of educational achievement testing over the years. Adams (1981) reported that the examination at the end of schooling has typically given place to a regular series of standardized tests and norms that allow teachers to gauge their students' progress in relation to that of the country as a whole. Marjoram (1997) supported the argument, especially since, according to him, teachers have for a long time been very reliable persons in ranking students accurately on both specific abilities and non-cognitive qualities. Broadfoot (1979) stated that examinations are very important for selection and must be invested with as much apparent objectivity as possible so that the results, and the failure which they imply for many candidates, are accepted. As can be seen, educational assessment has moved from informal to formal assessment. It has developed and changed according to the societal role expected of it, and it has been greatly influenced worldwide by the social, economic and ideological goals of society, up to the present stage of continuous assessment and greater teacher involvement in certification.

Importance of Achievement Testing.

Achievement testing serves several purposes. Iwuji (1990) explained that a teacher teaching a course might give a series of tests, which may be in the nature of oral questions, homework and class assignments based on various units of the course. The results of such evaluation provide the teacher and
the students with some feedback to decide whether to carry on with the lessons for each group of students as planned or to readjust certain areas for a specific group or groups. Mehrens and Lehmann (1978) stated that achievement tests serve two major purposes: to discriminate among individuals according to their degree of achievement, or to measure differences between individuals or between the reactions of the same individual on different occasions. The first is norm-referenced testing, which has constituted the dominant goal of most achievement testing in schools. An individual's high raw score, say 90%, may not be highly valued if it is obvious that many other members of the group scored above him. Williams (2005) explained that norm-referenced tests are designed to rank students in order of achievement, from high to low, so that decisions based on relative achievement can be made with greater confidence. The major focus of a norm-referenced test is to emphasize the variability of the scores the test is capable of yielding. Williams went on to state that such tests tend to maximize differences in performance, and this provides the most reliable basis upon which the students could be ranked. The second purpose of achievement testing has to do with the determination of the extent to which an individual has been able to reach set standards. This means the determination of whether a person has achieved a specific set of objectives or not. This form of evaluation is known as criterion-referenced testing, which some educators feel should be more emphasized by test experts and theorists.


Criterion-referenced test, as explained by Onunkwo (2002), is a test administered to a student in order to determine his level of proficiency in a particular knowledge, objective or skill. This function, according to him, is in opposition to that of a norm-referenced test, which compares a student's achievement in a test with the achievement of his classmates in the same test. In this case, the emphasis in a criterion-referenced test is on whether a student has mastered the task expected of him and is able to perform it, and not on how his mastery of the task compares with that of the other students who undertook the same task. Usually, a cut-off point is set and any student reaching the cut-off point is assumed to have mastered the task. A student is deemed to have passed a criterion-referenced test only when the score obtained is up to the standard that has been set. Thus, it is possible for all the students who took a criterion-referenced test to fail or to pass (Okoye, 1997). According to Onunkwo (2002), criterion-referenced tests are used to:

i. monitor students' progress in objective-based instructional programmes;
ii. diagnose students' learning differences;
iii. place students into programmes of learning;
iv. revise learning materials or rearrange learning units;
v. evaluate the effectiveness of various educational and social programmes;
vi. assess competences on various certification and licensing examinations;
vii. assess competences of individuals in the basic social skills of numeracy and literacy.
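The mastery decision described above can be sketched in a few lines; the topic names and cut-off values below are hypothetical illustrations, not values from this study:

```python
# Minimal sketch of a criterion-referenced mastery decision: a student
# "masters" a topic when the number of correct answers reaches a preset
# cut-off (e.g. 5 correct out of 10 questions).
# The topic names and cut-offs are hypothetical illustrations.

def mastery_report(scores, cutoffs):
    """Return {topic: True/False}; True when the score meets the cut-off."""
    return {topic: scores.get(topic, 0) >= cutoff
            for topic, cutoff in cutoffs.items()}

cutoffs = {"nutrition": 5, "ecology": 5}   # cut-off out of 10 items each
student = {"nutrition": 7, "ecology": 4}   # one student's topic scores

report = mastery_report(student, cutoffs)
# nutrition: 7 >= 5, mastered; ecology: 4 < 5, not yet mastered
```

Note that, unlike a norm-referenced ranking, the decision for each student is independent of how the other students performed.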

Factors that affect Achievement Testing.


Some studies have been carried out on factors that could affect achievement testing. Such factors include:

 gender
 home educational background
 school environment

Gender as a factor affecting achievement testing.

Previous studies have been inconclusive on the effect of gender on achievement testing. Some showed that males and females have differentiated abilities: the males were superior in numerical aptitude, science reasoning and spatial relationships, while the females were superior in verbal fluency, perceptual speed, memory and manual dexterity. Ohuche (1988) reported that spatial ability has been found through research studies to be required in mathematics, engineering, science and technological subjects, architecture and geography. In that research, it was found that females perform poorly on spatial tasks when compared to males; that is, in cultures that enforce strict sex roles, females are poorer than males, but in permissive cultures they perform similarly on spatial tasks. According to Obe (1984), in his study on urban-rural and sex differences in the scholastic aptitude of primary school finalists, males were superior to females in the numerical aptitude test, but in the verbal aptitude test there was no significant difference between the males and females. In another study, carried out by Obioma and Ohuche (1988), both sexes performed alike. In terms of intelligence quotient, Ohuche (1988) observed a slight sex difference, the girls being slightly superior to the boys up to the age of fourteen years,
while the boys' superiority showed from this age onwards. In Nigeria, Okeke and Idika (1988) in their study observed that girls have a poor attitude to science and technology. Considering the above review, the question of gender difference remains inconclusive.

Home educational background as a factor affecting Achievement Testing.

Home educational background as it affects achievement testing will be discussed under the following headings:

 parental level of education, income, material circumstances and the student's academic achievement;
 parental attitudes and values towards their children's academic achievement.

Parental level of education, material circumstances, income and the student's academic achievement.

There is considerable evidence to show that poverty (low income) can adversely affect the educational achievement of students. To Blackmore et al (1986), the availability of funds from parents and other family members will affect such things as the quality of the school attended, the number of children in the family who get sent to school, the likelihood of attending private secondary schools, the number of books which a student possesses and the employment of a private tutor. Dubey et al (1981) in Ndukwe (1990) believe that, in terms of material circumstances or environment, parents of the relatively higher classes are able to
supply their children with more opportunities to learn those things which will aid their learning in school. In addition, they have more appropriate knowledge, having themselves gone to school, of what kinds of experiences to provide for their children. Thus, they asserted that children coming from better-off home environments will have an advantage in learning due to prior and continuous experiences resulting from the opportunities provided in their more privileged circumstances. Uche (1980) is of the same view because, according to him, children from well-to-do and elite families are either driven to school in cars or provided with maids or servants who take them to and from school. Their parents give them encouragement, advising them to work hard in order to pass their examinations with good grades. At home, the children are provided with toys and enjoy such facilities as television, radio and stereo equipment. But children of working-class or poor parents are overloaded with errands and domestic work and are insufficiently motivated to learn. These children share the same room, bed and clothes, may even sleep on the floor, and often go to school hungry. Okeem (1998) hypothesized that children from middle-class or much better-off homes tend generally to do better in school than those from working-class or poor homes. This is because, in middle-class homes, there may be such facilities as television, radio, pictures, and reading and writing materials, all of which help to prepare a child for learning in school. When the children start schooling, the parents provide them with age-appropriate books, magazines and journals, dictionaries, educational toys and games. The children are also provided
with well-equipped and ventilated rooms and study facilities like adequate tables and chairs. All these, combined with positive attitudes from the parents, stimulate the children to learn.

Parental attitudes, values and children's academic achievement.

Although the level of income, housing standards and family size have been found to determine the achievement of children in schools, parents' level of education, attitudes and values towards the education of their children have been found to be more important variables than income and housing. The single most important factor that influences the educational attainment of children appears to be the degree of parents' interest in their children's education. Middle-class parents express a greater interest, as indicated by more frequent visits to the school to discuss their children's progress (Agbaegbu, 1997). Previous research has shown that the values and orientations of low-income parents are less conducive to high motivation, to individual striving in school and generally to educational success. The desire to get ahead, rather than contentment with getting by, may help to explain the over-achievement of middle-class children and to distinguish successful from unsuccessful children in the low-income family. Thus, it has been suggested that by socialization into general sets of values which are conducive to ambition, and by continued encouragement and support, middle-class parents may provide their children with the types of attitudes that pave the way for educational success. The educational background of parents is a great factor that affects children's achievement in school, but the CRAT in biology will help fill the gap.


School environment as a factor affecting achievement testing.

Previous studies on the school environment have made it clear that it is one of the major factors affecting the educational achievement of students. Its effect on achievement testing is adverse if the things necessary for effective teaching and learning are lacking: desks and seats, a library, laboratories for practical classes, enough playing grounds (level land), trees and so on. The environment for teaching and learning may be conducive or unconducive. The researcher observed that an environment is said to be conducive for teaching and learning when all those things that make for effective teaching and learning are present, while the environment is unconducive when effective teaching and learning cannot be carried out there. According to Okeem (1998), students in a conducive environment tend to perform better in aptitude tests than those from an environment unconducive to teaching and learning. This can be taken care of using a well developed and validated CRAT in biology.

Strengths and weaknesses of Criterion-Referenced Test.

Criterion-referenced test has a number of advantages as well as some disadvantages.

Advantages of Criterion-referenced test.


The recent support for criterion-referenced measurement seems to have originated in large part from the emphases on behavioural objectives, the sequencing and individualization of instruction, the development of programmed materials and the increased interest in certification. Criterion-referenced testing has been principally used in mastery testing. It can also be used to determine degrees of performance: if an individual performed adequately on the objective, then the decision is to move on to the next unit of study; if he has not, then he is required to restudy the material covered in the test until he performs adequately, that is, masters the material. It is also used for diagnostic testing. Criterion-referenced measurement further allows us to estimate the proportion of performances in a domain at which a student can succeed.

Disadvantages of Criterion-referenced test.

Criterion-referenced measurement can serve some important functions, but it has some limitations. It is interpreted only in terms of how the testee meets a set standard or criterion. Its content is based on a limited area of objectives. Criterion-referenced test results do not inform decision makers whether the students achieve what they should when they should. Many criterion-referenced tests are shorter and, therefore, are not as reliable as norm-referenced tests. If criterion-referenced test results are not interpreted cautiously, a student's failure to master an objective may be wrongly attributed to the instruction, the test items, the standard of mastery and/or the objective itself.

The nature of school biology.


Biology, as mentioned earlier, is the branch of science that deals with the science of life. It is the study of living things, and living things are divided into two major groups: plants and animals. These are then treated under different headings but, in accordance with the aims of the biology curriculum, a conceptual approach is used in dealing with the following main themes:

 concept of living
 basic ecological concepts
 plant and animal nutrition
 conservation of matter/energy
 variations and variability
 evolution and
 genetics (Sarojini, 2001).

The concepts of these themes are generally developed spirally in a three-year course. As recommended in the curriculum, the content is organized into three main sections. Section one covers the first year's work at the senior secondary level; section two, the second year's work; and section three, the third year's work, as follows:

Section one

 Science of living things.
 Classification and organization of life.
 Nutrition; Photosynthesis, Food substances.
 Agriculture, Food supply. Population growth.
 Basic ecological concepts.
 Functioning ecosystem.
 Ecological management.
 Introduction to micro-organisms.
 Micro-organisms and health.

Section two

 Cell 1: Living unit. Structure.
 Cell 2: Properties and functions.
 Supporting tissues and systems.
 Feeding mechanisms. Digestive systems.
 Transport system and mechanisms.
 Gaseous exchange. Respiratory systems.
 Excretory systems and mechanisms.
 Aquatic and terrestrial habitats.
 Ecology of populations.

Section three

 Regulation of the internal environment.
 Nervous co-ordination.
 Sensory receptors and organs.
 Sexual reproduction: systems. Behaviour.
 Development of new organisms. Fruits.
 Variation. Application for survival. Evolution.
 Genetics. The science of heredity.

Development of achievement tests in school biology.

The development of the achievement tests in school biology will proceed through a number of systematic stages. These stages include planning, preparing, pilot testing and evaluation, and are in line with the guidelines suggested by Herman (1990) and by Anene and Ndubisi (1992). In the first stage, a review of the senior secondary biology curriculum, as contained in the National curricula for senior secondary schools, will be made. In so doing, the instructional contents, objectives and learning activities contained in the curriculum will be reviewed, and the specific tasks implied in the objectives will be identified. Experts will be involved in this identification and will be requested to classify the instructional objectives into the levels of Bloom's taxonomy of the cognitive domain: knowledge, comprehension, application, analysis, synthesis and evaluation. A table of specification will be used to ensure content validity. The number of questions to be set will be relative to the size of the content (considering the duration involved in teaching it), as well as to the number of tasks implied in the objectives. The next stage will involve the actual preparation and writing of the Biology Achievement Test items, as well as ascertaining the correct options.

Procedures for Development and Validation of Achievement Tests.

Ogomaka (1992) outlined some general and peculiar procedures for developing any given class or type of test or scale. The steps, according to him, are:


i. Deciding on what to test and for whom. Here, the person wanting to develop the test (the developer) decides on whether he/she needs a general intelligence test/scale, a specialized ability test/scale (i.e. an aptitude test/scale), a group of tests, or otherwise. The developer decides on the population the test/scale is meant for. Population here indicates the age range, the environment or cultural background, and other peculiar characteristics of the individuals that make up the population.

ii. Identifying and specifying the behavioural patterns, mental functions/traits and dispositions that characterise a member of the population possessing the aptitude(s). The developer at this stage reads related literature extensively and intensively and examines similar instruments. She makes a list (and a description) of the collected behavioural patterns, mental functions/traits and dispositions, and submits the material to (other) experts to critique.

iii. Writing or constructing the items into sub-scales/tests. The developer writes down or makes up questions, pictures/diagrams, statements, tasks and activities that will elicit the expected behavioural pattern(s) or lead to the use of the mental functions, if possessed by the testee(s). She (the developer) arranges the made-up items into sub-tests/scales. If the instrument is to be an individually administered test, the developer may also write down some implicative reactions of the testee(s). The material is then submitted to (other) experts for critiquing and weighting.


iv. Responding and/or reacting to the returns from the other experts. The developer articulates the criticisms, suggestions and inputs of the (other) experts, writes more items if need be, and arrives at specific weights for the items.

v. Trial-testing the instrument. The developer draws a representative sample from the population and administers the instrument, taking keen note of the reactions of the testees to the tests and to the entire instrument. The developer scores the responses of the testees. Using the resulting testees' scores on the individual test items, the sub-tests and the entire instrument, the developer undertakes to:
a. compute the reliability coefficient of the test (through split-half or Kuder-Richardson approaches);
b. compute the pairwise and multiple correlation coefficients among the sub-tests;
c. carry out factor analysis of the items and sub-tests (the scores of testees on each item should correlate highly with the same testees' scores on the sub-test to which the item belongs; in other words, the items should be factorially pure);
d. reconstruct, rearrange and replace items. The developer rewrites items and constructs new items to replace items found inappropriate or faulty with respect to (i) the responses or reactions of the testees, (ii) the factor loadings of the items and
(iii) the reliability coefficients of the sub-tests and the entire test;
e. carry out final trial-testing and norming/standardization. The developer draws another representative (but much larger than in the first trial) sample of the population of testees, administers the test, and scores the responses of the testees. Making use of these scores, the developer computes the reliability coefficients of the test, its correlations with some other standardized tests of the same standing and purpose, and the norms of the test (mean, standard deviations, type of distribution and timing of the test).

vi. Production of a manual and publication. The description of the test and sub-tests, the testing conditions/regulations, the expected reactions of testees, the scoring procedures and weighting, and the characteristics of the test/sub-tests should be documented and published. These will help users and also facilitate improvement.

Onunkwo (2002) presented a flow chart of the steps involved in test development and standardization as follows:


Defining the objectives

Specifying content to be covered

Preparing the test blue print

Writing test items

Face validation of the test items

Trial testing the items

Carrying out item analysis

Item modification

Selection of good items

Final testing of the items

Printing and production of the test

Defining the objectives: This is the first and most important step in test development and standardization. Thorndike and Hagen (1977) stated that in the development and standardization of a test, the objectives upon which the test is being constructed need to be defined in very clear and unambiguous statements. These objectives need to be defined in specific behavioural terms, as well as classified according to
levels of instructional domains. For the cognitive domain, the taxonomy according to Bloom (1956) includes knowledge, comprehension, application, analysis, synthesis and evaluation. The levels of the cognitive domain to be reflected in the objectives depend on the mental stage or ability of the students.

Specifying the content to be covered.

This is the second step in the development and standardization of tests. According to Thorndike and Hagen (1977), specifying the content to be covered in a test is important because the content is the vehicle through which the process objectives are to be achieved. The content of the test should be selected from the sections of the relevant curriculum. The number of questions to be set per content area depends on the volume of work, which in turn depends on the number of weeks spent on teaching it.

Preparing the test blueprint.

This is a two-way table of specification which aligns the content areas of a course with the levels of instructional objectives. A percentage representing the number of items out of the total items for the test is assigned in advance to each level of instructional objectives and each content area. Ohuche and Akeju (1988) stated that the test blueprint is a two-dimensional diagram with the subject matter to be examined (content) listed along the rows and the different educational objectives to be tested listed along the columns. Thorndike and Hagen (1977) pointed out the conditions that should be fulfilled in order to construct a test blueprint which will adequately guide in
developing a test that truly represents its contents and objectives (i.e. a test with adequate content validity). These conditions include:

i. The proportion of test items on each content area should correspond to the proportionate emphasis or importance given to the topic in class during instruction. This emphasis is in terms of the amount of time spent in teaching the topic, which depends on how voluminous or vast the topic is. In this case, a topic taught in two weeks will contribute more questions than a topic taught in one week. Also, the proportion of test items set at each cognitive level should correspond to the importance the teacher considers that cognitive level to have for the mental level of his students. Indeed, the decisions made by the teacher, according to Thorndike and Hagen (1977), in allocating the questions on a test are necessarily subjective ones, the basic principle being that the test should maintain the same balance in relative emphasis on both content and objectives which the teacher has been trying to achieve through instruction.

ii. The test maker must choose the type(s) of test items which will be most appropriate for the test. He will decide whether to use objective questions or essays. The decision as to which type of item to use depends to a large extent upon the process objectives to be measured, the content area concerned, the skill of the teacher in constructing the different types of test items, the time available for the test development, and the time available for answering the questions by the testees.


iii. The test maker must decide upon the total number of items for the test. For objective questions, many items are required, but essay tests require few items. The larger the content areas and process objectives to be measured by a test, the larger the number of test items. The time available for testing is a practical factor that limits the number of items on a test. However, the number of items to be constructed depends on the following factors:

 the type of items used on the test
 the age and educational level of the testees
 the ability level of the testees
 the length and complexity of the items
 the type of process objectives being measured or tested
 the amount of computation or quantitative thinking required by the items.

iv. The test maker must determine the difficulty levels of the items.
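The proportional allocation described in conditions (i) and (iii) above can be sketched as a small routine; the topics, teaching weeks and level weights below are hypothetical, and plain rounding is used:

```python
# Sketch of a test-blueprint allocation: the share of items for each
# (topic, cognitive level) cell is proportional to the weeks spent
# teaching the topic and to the weight given to the level.
# Topics, weeks and weights here are hypothetical illustrations.

def blueprint(topics, levels, total_items):
    """topics: {name: weeks taught}; levels: {level: weight}."""
    week_sum = sum(topics.values())
    weight_sum = sum(levels.values())
    return {
        (topic, level): round(total_items * weeks / week_sum * w / weight_sum)
        for topic, weeks in topics.items()
        for level, w in levels.items()
    }

topics = {"nutrition": 2, "ecology": 1}    # weeks of instruction per topic
levels = {"knowledge": 2, "comprehension": 1, "application": 1}
cells = blueprint(topics, levels, total_items=30)
# "nutrition" (2 of 3 weeks) receives 20 of the 30 items,
# split 2:1:1 across the three cognitive levels (10, 5, 5).
```

Because of rounding, the cells may not sum exactly to the intended total; in practice the test maker places any remaining items by judgment.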

Writing test items.

The test items are written based on the specifications of the test blueprint. According to Nwana (1982), there should be a construction of excess or surplus items for each section of the test blueprint. This is a precautionary measure to ensure that when the items are eventually reviewed (and, consequently, faulty items are discarded), there will still be enough good items left to make up a complete test. Onunkwo (2002) stated that, apart from setting many more questions than demanded by each section of the test blueprint, item writers should:

a. avoid the use of long and involved statements;
b. specify the degree of accuracy required for full credit;
c. avoid extraneous clues;
d. avoid giving clues to the answer of one item in another item statement;
e. avoid using negative statements and double negatives.

Nwana (1982) stated that after the items have been written, instructions should be provided to guide the testees. These instructions should include the number of questions to be answered from each section (if sectionalized), the mode of response presentation, the scoring weight of each item, the type of writing materials to be used, and the maximum amount of time allowed for the test.

Face validation of the test items.

This has to do with distributing copies of the test, its table of specification and the syllabus upon which it is based to test experts and subject specialists (Nwagu, 1990). The resource persons are requested to vet the test items in terms of relevance to the course content and objectives, appropriateness to the class level, clarity of wording and plausibility of the distracters. The test writer then reviews the items in the light of the flaws indicated by the resource persons. It should be noted that at this stage the test items should be compiled and the test produced in reasonable numbers for trial testing.

Trial testing the test items.

Here, copies of a test assumed ready for trial testing are administered to candidates who are representative of the population that will eventually take the test. Specifically, trial testing, according to Nwana (1982), involves administering the test to a smaller number of pupils similar to those to whom the final test will be
administered. The responses are then scored and the scripts arranged in order of the magnitude of the scores in preparation for item analysis. Russel (1982) and Inomiesa (1986) identified numerous purposes for trial testing test items. These purposes include to:

i. estimate the reliability of the final version of the test;
ii. determine the suitability of the test to the intended culture and group;
iii. identify and select good items and eliminate or modify faulty or poor items by item analysis, in terms of difficulty and distractor indices;
iv. determine the most appropriate duration of the test;
v. identify any problem that may hamper effective testing conditions;
vi. determine the power of each item to discriminate between good and poor students;
vii. ascertain from the candidates any unsuspected errors or ambiguities in the test;
viii. determine the adequacy of the directions, the time limits and the test format.
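Purpose (i), estimating reliability, is often carried out with the Kuder-Richardson formula 20 for dichotomously scored items, one of the approaches mentioned earlier; a minimal sketch on made-up 0/1 scores:

```python
# Kuder-Richardson formula 20 (KR-20) for dichotomously scored items:
# r = k/(k-1) * (1 - sum(p*q) / variance of total scores),
# where p is the proportion answering an item correctly and q = 1 - p.
# The 0/1 score matrix (rows = testees, columns = items) is made up.

def kr20(data):
    k = len(data[0])                        # number of items
    n = len(data)                           # number of testees
    totals = [sum(row) for row in data]     # total score per testee
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / n   # population variance
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in data) / n          # proportion correct
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var)

data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
r = kr20(data)   # about 0.67 for this tiny illustrative sample
```

A real trial test would, of course, use far more testees and items than this toy matrix; the coefficient rises as the items order the testees more consistently.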

Carrying out the item analysis.

The aim of item analysis is to find out the effectiveness of each single item within a test (Inomiesa, 1986). Mehrens and Lehmann (1978) stated that item analysis seeks to establish the difficulty and discrimination abilities of the items, as well as the effectiveness of each alternative. An analysis of students' responses on tests, according to Thorndike and Hagen (1977), serves two purposes:

a. it provides diagnostic information for verifying the learning of the class as well as guiding further teaching and learning;
b. it provides important information for preparing better tests for future use.

According to Mehrens and Lehmann (1978), item analysis involves the following steps:

i. arranging the test papers in order of magnitude based on the scores;
ii. cutting out two groups, the upper and lower 25%, 27% or 33% of the papers;
iii. for each item, counting the number of students in each group who chose each alternative or omitted the item completely;
iv. computing the item difficulty, discrimination and distractor indices for each of the items and their alternative answers.

It should be inferred that item analysis has to do with assessing each of the items making up a test for its possession of relevant psychometric properties. In other words, it is concerned with the adequacy of each item of a test with respect to relevant statistical qualities such as the item difficulty index, the item discrimination index and the item distractor index.

Item difficulty index.

According to Nwagu (1991), the item difficulty index is computed as the proportion (sometimes percentage) of the criterion group which responds correctly to the item. He goes on to explain that many criteria can be used to define the criterion group. Such criteria, according to Nwagu (1985), include course grades, intelligence quotient, cumulative grade points, job rating, teacher's rating, continuous assessment results, etc. The item difficulty index is also referred to as the item easiness index or item facility. Iwuji (1990) defined it as the proportion of persons answering each item correctly.
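The ranking and grouping at the start of item analysis (arranging scripts by total score and cutting out upper and lower groups) can be sketched as follows; the score matrix and the 27% cut are made-up illustrations:

```python
# The first two steps of item analysis: rank the answer sheets by total
# score, then cut out upper and lower groups (27% is a common fraction).
# The 0/1 score matrix is a made-up illustration.

def split_groups(sheets, fraction=0.27):
    ranked = sorted(sheets, key=sum, reverse=True)
    cut = max(1, round(len(ranked) * fraction))
    return ranked[:cut], ranked[-cut:]       # upper group, lower group

sheets = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]]
upper, lower = split_groups(sheets)
# With 5 sheets, round(5 * 0.27) = 1 sheet falls in each group.
```

The counts of correct answers within these two groups then feed the difficulty and discrimination formulas discussed next.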


It should be emphasized that there exists varying percentage cut off points for determining the upper and lower ability groups. These variations come from various authorities. Thorndike and Hagen (1977) and Ohuche and Akeji (1988) proposed the use of the upper and lower 25 percent of the students tested. Item difficulty index, “D” is computed from the formula: D=

U+L N

Where U

is the number of testees in the upper ability group who answered the item correctly.

L

is the number of testees in the lower group who answered the item correctly.

N

is the total number of students in both the upper and lower groups. According to Nwagu (1991), the item variance is a maximum when the item

facility (difficulty/easiness) index is 0.50, and departs from this maximum as the item facility departs from 0.50. It approaches zero as the facility approaches 1.00. He went on to state that the ideal thing, therefore, would be to include in a test only the items with a facility index of 0.50. According to him, this is not feasible, hence an item facility range whose margins are not distantly removed from the optimum is usually adopted by test makers for test item selection and test validation. Onunkwo (2002) stated that the value of D ranges from 1 to 0. A D of 1.00 indicates an item which every student scored correctly, that is, a perfectly easy item. Conversely, a D of 0.00 indicates an item which no student scored correctly, that is, a perfectly difficult item. Onunkwo (2002) goes on to explain that the higher


the value of D, the easier the item, while the lower the value of D, the more difficult the item. An item difficulty index (D) of 0.50 is considered moderate. Ferguson (1981) recommended a range of 0.30 to 0.70 as an ideal difficulty index for a test item, while Ohuche and Akeji (1988) recommended a D of 0.40 to 0.60. For this study, a range of 0.30 to 0.70 is adopted. However, a few items that were not distantly removed from this range and which had good discrimination and distractor indices were included in the tests.
Item discrimination index. According to Iwuji (1990), the discriminating ability of an item is judged by the size of the difference between the number in the upper group and the number in the lower group who got the item right. A good achievement test should be able to discriminate between the brilliant and the dull students; Ohuche and Akeju (1988) referred to this discriminating ability as sifting the sheep from the goats. An item discriminates ideally and positively if a higher proportion of the high-scoring testees in the group got the item right. This power of a test to detect or measure small differences in achievement is essential if the test is to be used reliably for ranking students on the basis of achievement. Onunkwo (2002) gave the formula for computing the item discrimination index as:

R = (U − L) / (½N)

where R is the discrimination index, and U, L and N retain the same meanings as in the item difficulty index.
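As an illustration of the two formulas above, the short sketch below computes D and R for a single hypothetical item; the counts U, L and N are invented for the example.

```python
def difficulty_index(U, L, N):
    """D = (U + L) / N: proportion of the combined groups answering correctly."""
    return (U + L) / N

def discrimination_index(U, L, N):
    """R = (U - L) / (N/2): upper-minus-lower difference scaled by half of N."""
    return (U - L) / (N / 2)

# Hypothetical item: 40 testees in each ability group (N = 80);
# 30 of the upper group and 14 of the lower group answered it correctly.
D = difficulty_index(30, 14, 80)      # 0.55, inside the 0.30-0.70 range adopted here
R = discrimination_index(30, 14, 80)  # 0.40, a moderate positive discrimination
```

An item like this one would be retained, since both indices fall in the acceptable ranges the text goes on to describe.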


An item has a negative discriminating ability if more testees from the lower group than from the upper group got the item right; a negatively discriminating item is not a good item. The value of the discrimination index ranges from -1.00 through 0 to +1.00. A discrimination index of +1.00 indicates that all the candidates in the high ability group got the item right while all the candidates in the low ability group failed it; such an item discriminates perfectly and in the right direction, and is a good item. Conversely, an index of -1.00 indicates that all the candidates in the low ability group got the item right while all the candidates in the high ability group failed it; such an item discriminates perfectly but in the wrong direction. A discrimination index of 0.00 indicates that equal numbers of testees in the upper and lower groups passed the item; such an item does not discriminate and should be discarded. An item that discriminates negatively or lacks discriminating ability should not be selected for the test. Ohuche and Akeju (1988) provided the following guidelines for interpreting different values of the discrimination index:
R of 0.50 and above indicates high discrimination;
0.30 to 0.49 indicates moderate positive discrimination;
0.20 to 0.29 indicates borderline positive discrimination;
0.00 to 0.19 indicates low to zero positive discrimination;
-0.01 and below indicates negative discrimination.
They recommended that researchers use items with borderline to high positive discrimination; this helps avoid rejecting too many items, which could reduce the content validity of tests. They went on to state that, on the basis of sample size, a discrimination value of between 0.15 and 0.19 should be used with caution when the sample size is between 251

and 300, whereas an index of over 0.20 can be used with confidence with the same sample size.
Distractor index. This is concerned with the degree to which the wrong options attract more testees from the lower-scoring group than from the higher-scoring group. Onunkwo (2002) stated that the distractors are checked to see whether they are plausible enough to be retained. For a distractor to be plausible, it must appeal to some students as the correct answer, and it must appeal more to the lower ability group than to the high ability group. This view is in line with that of Okafor (1997), who stated that an effective distractor is usually picked as the correct answer by a larger number of subjects in the lower one-third. A distractor which attracts more testees in the upper group is not a good distractor. According to Okafor (1997), distractors are the incorrect alternative answers in a multiple-choice objective test, meant to serve as a ruse devised to test the correctness of the subject's choice of the correct answer. Iwuji (1990), Okafor (1997) and Onunkwo (2002) all gave the formula for computing the distractor index as:

D.I = (L − U) / (½N)

where D.I is the distractor index, L is the number of testees in the lower group selecting the particular distractor, U is the number of testees in the upper group selecting the particular distractor, and N is the total number of testees in both the upper and lower groups.

The value of the distractor index ranges from -1.00 through 0 to +1.00. When an item has a positive distractor index, the distractor is effective, since it is chosen by more of the students in the low ability group than in the high ability group. A negative index indicates that the distractor is not effective, because it is chosen by more of the students in the high ability group than in the low ability group. When the value of the distractor index is zero, the distractor does not distract or confuse any student. Such a distractor is defective and, like a negatively discriminating one, should not form part of the test.
Selecting Good Items: At this stage the test developer decides which items are to be included in the final test (Ohuche and Akeju, 1988). Okafor (1997) stated that, based on the experience gained from the field test and the results of the item analysis, the prototype test is revised. The magnitudes of the difficulty, discrimination and distractor indices are all examined before an item is included in the test. Where an item is found defective with respect to these indices of quality, the item is either modified, replaced or discarded. However, caution is needed while selecting the good items so that content coverage is not reduced. Writing on this point, Mehrens and Lehmann (1978) stated that item analysis data are tentative and, as such, selecting test


items purely on the basis of their psychometric properties should be avoided. They gave the following reasons:
i. item analysis data are influenced by the nature of the group being tested, the number of pupils tested, the instructional procedures employed by the teacher, and chance errors;
ii. item difficulty can be affected by guessing, the location of the correct answer among the alternatives, and the serial position of the item in the test;
iii. statistical selection of test items may result in a test that is unrepresentative and biased.
Mehrens and Lehmann therefore suggested the use of rational

procedures as a basis for the initial selection of test items, after which statistical techniques are used to check the judgment. They went on to state that an item should be retained as long as it discriminates positively, is clear and unambiguous, and is free from technical defects. Reasoning in this direction, Ohuche and Akeju (1988) stated that, although much has been said about them, item discrimination, difficulty and even reliability are only means to the end, which is validity. They emphasized that a test would not be worthwhile if it possessed all these attributes but had no validity.
Assembling of Good Items: Thorndike and Hagen (1977) pointed out that in test assembly it is important to group items dealing with the same content or skill together. Doing so ensures that the sub-tests contain only related materials. Moreover, such an arrangement enables the testees to concentrate on a single area of


content at a time rather than having to shift back and forth among areas of content. The arrangement also eases the test developer's job of analyzing the test results for the group or for the individual, since he can see at a glance whether errors are more frequent in one content area than in another. Mehrens and Lehmann (1978) are of the view that items should be arranged according to their difficulty levels; on this arrangement, easy items precede the more difficult ones in all sections of a test. The researcher believes it is better to mix up the items.
Final testing of the items: According to Ogomaka (1992), in the final testing of items the developer draws another representative sample of the population of testees (much larger than the first trial sample), administers the test and scores the responses of the testees. Ogomaka's view lends credence to that of Anastasi (1961), who explained that final testing involves administering the validated test to a large representative sample of the students for whom the test was designed and who did not participate in the trial testing.
Establishing the Norms: According to Onunkwo (2002), the scores of students in the final testing are usually used in establishing the norms for a test. Such norms, according to him, may be age, sex, location and grade norms. Ogomaka (1992) included the test mean, standard deviation, type of distribution and timing of the test among the norms established.


Norms are used to compare testees' performance and therefore serve as the standard for judgement. Obimba (1989) stated that norms are the average scores made by a representative group of pupils (the norming group) at various age and grade levels. Norms make it possible for test scores to be compared across schools and groups.
Printing and production of the test: When the norms for a test are established, the next step is to print and produce the final copy of the test. According to Ogomaka (1992), this involves the description of the test and sub-tests, the testing conditions and regulations, the expected reactions of testees, the scoring procedures and weighting, and the characteristics of the test/sub-tests; all of these should be documented and published. According to Obimba (1989), the final edition of a typical standardized test is accompanied by answer sheets, scoring stencils and a test manual. He went on to explain that the test manual contains detailed information on the nature of the test, procedures for administering and scoring the test, norm tables for interpreting the scores, and the reliability and validity of the test.
Theoretical Framework
Two major models guide the construction of tests: 1. the classical test model; 2. the latent trait test model.
Classical Test Theory
According to Keats (1980), the classical test model is based on the assumption that a student's observed score (X) is the simple sum of his true score


(T) and an error score (E). The true score (T) reflects the true amount of the attribute which the student possesses at the time of measurement, while the error score (E) indicates the effects of extraneous influences on the measurement process at the time of measurement. The equation for the classical test model as given by Keats is X = T + E. It is a deterministic model for minimizing the error of measurement of a test, and it provides a strong basis for constructing a norm-referenced test. According to Nkpone (2001), the classical approach to item difficulty uses the proportion of persons attempting the item who are successful. The classical test model's reliability estimate is dependent on the particular examinee sample. Wood (1990) stated that in the classical test model the contribution of each item to the test's reliability and validity depends upon what other items are in the test.
Latent Trait Test Theory
Thorndike (1980) stated that in the latent trait test theory a test score is interpreted as a scale value on a vertical scale of the latent trait, rather than being expressed in normative terms in relation to some reference group or persons. This model expresses a sample-free, one-dimensional trait scale along which every student's position can be estimated. Tests constructed under the guiding principle of the scale value model aim at estimating a student's location on a vertical scale in relation to anchor points previously set. Wood (1990) and Korashy (1995) stated that in the latent trait model, reliability is replaced with the concept of standard error or precision of measurement: "Unlike the classical reliability estimates, the standard error of


measurement is independent of the particular examinee sample and it is an indication of the amount of error in the ability estimate at different points of the ability continuum". According to Uebersax (1993), latent trait models allow one to:
i. precisely measure the difficulty or easiness of each item;
ii. determine the association of each item with the construct being measured;
iii. determine which items are biased, in the sense of having different meanings or measurement characteristics in different sub-populations;
iv. design a test with the fewest items necessary to measure the construct with the requisite accuracy;
v. measure test accuracy at different levels of respondent ability;
vi. design an adaptive test, one where answers to preceding items determine which items are subsequently administered, with the aim of producing the shortest overall test.
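The one-parameter (Rasch) latent trait model behind capabilities such as (i) and (v) can be sketched in a few lines; the ability and difficulty values below are hypothetical, chosen only to show how the probability of success depends on a testee's position on the latent scale.

```python
import math

def rasch_probability(theta, b):
    """One-parameter (Rasch) model: probability that a testee of ability
    theta answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A testee of average ability (theta = 0) meeting items of varying difficulty:
p_easy = rasch_probability(0.0, -1.0)   # easy item: probability above 0.5
p_match = rasch_probability(0.0, 0.0)   # difficulty equals ability: exactly 0.5
p_hard = rasch_probability(0.0, 1.0)    # hard item: probability below 0.5
```

Because ability and item difficulty sit on the same scale, item parameters estimated this way are sample-free in the sense discussed above.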

Thorndike (1980) however stated that tests which are constructed, validated and interpreted on the basis of the latent trait test theory are called criterion-referenced tests. The statistic normally applied in assessing the internal validity of a test in the classical test model is the item biserial (Wood, 1990). Because the magnitude of this item statistic depends on the ability distribution of the sample, it has the disadvantage of being sample dependent (Douglas, 1990). In the latent trait model, the internal validity of a test is assessed in terms of the statistical fit of each item to the model. The analysis of fit is a check on


internal validity. If the fit statistic of an item is acceptable, then the item is valid (Korashy, 1995; Sallini, 1983; Inainer, Morgan and Gustfson, 1980). The present study adopts the latent trait test model.
Review of related empirical studies.
Locally developed and validated achievement tests. Previous studies have been carried out on the development and validation of tests in Nigeria using the instrumentation research design. Nwankwo (1985) developed and validated an instrument for measuring test anxiety among secondary school students. The study, which was carried out in the education zones of the old Imo state, used a sample of 2,351 students. A 30-item Test Anxiety Scale (TAS) was used. The reliability estimates of stability and internal consistency were found to be 0.92 and 0.94 respectively, and concurrent and construct validity coefficients of 0.96 and 0.90 respectively were established. The author also found that anxiety increases with class level. Adams (1981) constructed and validated an achievement test in Integrated Science for Nigerian secondary class one students. The test items were based on the contents and objectives of the STAN (Science Teachers' Association of Nigeria) Integrated Science curricula. The number of items per topic depended upon the number of activities recommended for the topic in the approved Integrated Science syllabus. A 40-item test categorized into the knowledge, comprehension and application levels of cognition was used. The study was carried out with a sample of 358 class one students from 6 schools in Ibadan Municipality. The following psychometric measures were established for the test:


an average discrimination index of 46.33 percent, a facility index range of 18.5 – 82.5 percent and a KR-20 internal consistency coefficient of 0.63. A comparative analysis of students' performance at the knowledge, comprehension and application levels using the test statistic showed high and low performance on items at the comprehension and application levels. The study's narrow scope, Ibadan Municipality, restricts the generalizability of its findings. An 82-item Students' Evaluation of Teacher Effectiveness Scale (SETES) was constructed and validated with a sample of 800 class four secondary school students from the old Imo state (Ogomaka, 1984). Both the face and content validity of SETES were established, and the reliability of the scale was found to be 0.89. In an investigation, Inomiesa (1988) constructed, validated, standardized and used an achievement test on upper primary science (UPSAT-6). The 200 items selected for the study came down to 102 items after vetting and trial-testing. The 102 items had facility indices (f) ranging from 0.30 to 0.70 and discrimination indices of 0.20 to 1.00. He finally selected 50 items, compiled them into a final version of UPSAT-6, and normed it on a stratified random sample of 3,600 students from Anambra, Bendel and Benue states. The grouping of the items was based on the major topics in the Upper Primary Science Curriculum. A K-R 20 coefficient of internal consistency of 0.87 was established for the test. In another investigation, Obioma (1985) developed, validated and normed a Diagnostic Mathematics Achievement Test (DAMAT) for Nigerian secondary school students. The 60-item test covered the contents and cognitive objectives of the JSS mathematics curriculum. The test was validated in two phases.


The test was initially pretested in the Nsukka Education Zone of the then Anambra state on 200 students. It was later reviewed and pilot-tested on 1,000 students in Benue state. After its final review, the test was normed on 5,000 students in the then four eastern states of Anambra, Imo, Rivers and Cross River. The reliability coefficient estimated using Cronbach's alpha was 0.77, and the test maintained an average difficulty of 0.40. The researcher found that sex and school location were significant predictors of students' achievement in JSS mathematics. In the same line, Nkpone (2001) developed and standardized a Physics Achievement Test (PAT) for senior secondary students using the one-parameter and two-parameter latent trait models and the classical test model. The researcher made use of 2,215 male and female senior secondary school physics students in Rivers state. The instrument consisted of 60 multiple-choice items. The results showed that the overall reliability coefficient (KR-20) of the PAT was 0.89, and that there was a significant relationship among the item parameters obtained from the one-parameter, two-parameter and classical test models. One limitation of the study is that subjects with low scores were included in the estimation of the item parameters of the physics achievement test. Oragwam (2004) developed and standardized a National Consciousness Scale for Federal Unity Secondary Schools in Nigeria. The researcher made use of a sample of 640 male and female students. The final instrument contained 47 items built into seven subscales. Trial testing and factor analysis were used to streamline the items included in the final instrument. The reliability of the


instrument, using Cronbach's alpha, was found to be 0.80, and those of the subscales ranged from 0.50 to 0.83. The results showed that female students performed better than male students on measures of national consciousness, while the senior secondary school students performed better than the junior secondary school students. To the researcher's knowledge, no such instrument (a CRAT) exists in biology, which is one of the reasons for undertaking this study.
Summary of the literature review.
The reviewed literature shows the need for valid instruments for determining the extent of achievement of instructional objectives using criterion-referenced tests. The literature shows that two major theories guide the construction of tests: the classical test theory and the latent trait theory. Tests which are constructed, validated and interpreted on the basis of the latent trait test theory are called criterion-referenced tests; the present study therefore adopts the latent trait test theory. Much work has been done in various subject areas such as Mathematics, Physics, Integrated Science and Chemistry, while certain subject areas appear to have been neglected, knowingly or unknowingly. Biology is one area in which little work has been done with regard to instrumentation research (CRAT). The present study is therefore considered important, to ensure that the subject area does not lag behind in terms of validated instruments for measuring the attainment of instructional objectives.


Achievement testing can be seen as the systematic and purposeful quantification of learning outcomes. Achievement testing is important in evaluation because the results provide the teacher and the students with feedback for deciding whether to carry on with the lessons for each group of students as planned, or to readjust certain areas for a specific group or groups. The development of achievement tests in school biology proceeds through a number of systematic stages, which include planning, preparation, pilot testing and evaluation. The criterion-referenced test has advantages and disadvantages. Among its advantages, it is used for mastery learning and can also be used to establish students' degrees of performance in a particular content area. Among its disadvantages, it is interpreted only in terms of how performance meets a set standard or criterion, and its content is based on a limited range of objectives. More so, the influence of such factors as sex, school type and school location on students' academic achievement remains inconclusive: while some of the reviewed works found these factors significant, others did not. The present study, using a CRAT, is necessary in that direction.


CHAPTER THREE
RESEARCH METHOD
This chapter presents the procedures to be adopted in carrying out the study. Specifically, the chapter describes the design of the study, the area of study, the population, and the sample and sampling technique. It also presents the instrument for data collection, the validity and reliability of the instrument, the administration of the instrument and the method of data analysis.
Research design
This is an instrumentation study, because it involved the development and validation of a Criterion-Referenced Achievement Test (CRAT) for evaluating the cognitive learning outcomes of senior secondary biology students. The International Centre for Educational Evaluation (ICEE, 1982) defined the instrumentation research design as a study which aims at investigating and introducing new and/or modified contents, procedures, technologies or instruments for educational practice. Similarly, Ali (2006) defined instrumentation research as a study geared towards the development and validation of measurement instruments, or the investigation and introduction of new techniques for use in education.

Area of study
The study will be carried out in Enugu state of Nigeria. The state has six education zones and 17 Local Government Areas. The education zones are Agbani, Awgu, Enugu, Nsukka, Obollo and Udi.


The area of study will be divided into urban and rural locations. The urban locations are cosmopolitan in nature, as people of diverse origins and cultures are usually found there; they also have more social amenities and are usually inhabited by businessmen, technocrats and civil servants. The rural locations are not cosmopolitan, usually have fewer social amenities, and are inhabited mostly by farmers and artisans. Enugu state lies at about latitude 06.00N – 06.40N and longitude 07.00E – 07.35E. It is bounded in the east by Ebonyi state, in the west by Anambra state, in the south by Imo state, and in the north by Benue and Kogi states.

Population of the study
The population of this study is made up of all the senior secondary school students in SS2 offering Biology in the 2011/2012 academic session in Enugu state.

Sample and Sampling Technique
The sample for the study will consist of 297 SS2 Biology students for the 2009/2010 academic session, made up of 113 males and 184 females.

The sample for the study will be drawn through a multi-stage proportionate stratified random sampling technique. The stratification is on the basis of education zone, LGA, school location and gender. The 248 senior secondary schools in the 17 local government areas of Enugu state will be divided into clusters (strata) along sex, school location and education zone. From each cluster, ten percent of the schools will be sampled, to give a total


of 25 secondary schools. The population of Biology students in the 25 secondary schools is 2,970, from which 10% will be drawn to give 297.
Instrument for Data Collection
The instrument to be used for data collection is the draft CRAT (Criterion-Referenced Achievement Test). It has only one section, consisting of multiple-choice test items for senior secondary school students in SS2. A pool of 120 items was used for trial testing, while 64 items will be used for the pilot study. The distribution of the test items according to content and the levels of the cognitive domain is shown in the table of specifications in Appendix II.
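The 10% proportionate draw described above can be sketched as follows. The stratum labels and sizes are hypothetical, chosen only so that the totals match the 2,970 students and the sample of 297.

```python
import random

# Hypothetical strata (location-by-gender cells); sizes sum to 2,970.
strata = {
    "urban_male": 500,
    "urban_female": 700,
    "rural_male": 630,
    "rural_female": 1140,
}

random.seed(1)  # reproducible illustration

# Draw 10% of each stratum without replacement, preserving the
# strata's proportions in the overall sample.
sample = {name: random.sample(range(size), k=round(size * 0.10))
          for name, size in strata.items()}

total_sampled = sum(len(drawn) for drawn in sample.values())  # 297 in total
```

Because each stratum contributes the same 10% fraction, the composition of the sample mirrors the composition of the population across the strata.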

Administration of the Instrument
The instrument (CRAT) was administered to the subjects by the researcher with the help of the SS2 Biology teachers in each of the sampled schools. Guidelines were given to the Biology teachers involved in the administration, to ensure uniformity.
Method of Data Analysis
Research question 1 was answered using frequencies and measures of item difficulty, discrimination and distraction. Research question 2 was answered using the K-R 20 procedure, while research question 3 was answered using the mean and standard deviation. Hypothesis 1 was tested using the t-test, while hypothesis 2 was tested using the chi-square test of goodness of fit.


CHAPTER FOUR
PRESENTATION OF RESULTS
In this chapter, the results arising from the analysis of the data collected for this study are presented, with a view to providing answers to the research questions and testing the formulated hypotheses.
Research Question One: What is the validity of the CRAT?
Appendix B shows the content validity of the CRAT. The drafted CRAT items were submitted to experts in Measurement and Evaluation and a specialist in Biology at the University of Nigeria, Nsukka for detailed editing and a careful, critical review of the wording of the test items. This step was taken in order to avoid the inclusion of irrelevant, misleading and defective items. The experts were used for the establishment of both the face validity and the content validity of the items. Appendices G and H give details of the measures of item difficulty, discrimination and distraction.

Research Question Two: What is the reliability of the CRAT?
Appendix G shows that the reliability index of the CRAT is 0.51, a moderate level of internal consistency. The CRAT was subjected to trial testing to establish the reliability (internal consistency) of the instrument. The trial testing comprised two phases (trial testing and a pilot study). The trial testing was done with a sample of 150 SS2


(80 females and 70 males) students randomly chosen from four schools (two urban and two rural) in four local government areas. The pilot study was done with the 64 SS2 items that survived the trial testing.
Research Question Three: What is the influence of sex (gender) on the items of the CRAT?

Table 1: Mean and Standard Deviation of Students' Achievement in the Criterion-Referenced Achievement Test in Biology for SS2 (CRAT 2), by Sex

Variable   Label     Mean      SD
Sex        Male      15.9889   6.5400
Sex        Female    14.5385   5.1937

The table shows that the mean achievement score of the male students on the CRAT was 15.9889 with a standard deviation of 6.5400, while that of the female students was 14.5385 with a standard deviation of 5.1937. In effect, the mean performance appeared higher in favour of the males.
Hypothesis One: The items of the CRAT did not deviate significantly from the specifications of the core curriculum.

Table 2: Means and Standard Deviations of Males and Females in the Hypothetical CRAT in Biology

Group    Mean (x)   Standard Deviation (S)   Number (n)
Male     27.00      9.22                     24
Female   25.27      9.51                     30

Degrees of freedom = 52; standard error = 2.55; calculated t-value = 0.68; critical t-value = 1.960.
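The calculated t-value in Table 2 can be checked from the summary statistics alone using a pooled-variance independent-samples t statistic; this is only a verification sketch, with the group sizes as read from the table.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance,
    returned together with the standard error of the difference."""
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se, se

t, se = pooled_t(27.00, 9.22, 24, 25.27, 9.51, 30)
# t and se come out close to the reported 0.68 and 2.55 (the small
# differences are rounding); with df = 24 + 30 - 2 = 52, the statistic
# falls below the critical value of 1.960.
```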


The table shows that the calculated t-value is 0.68 while the critical (table) t-value is 1.960. Since the calculated t-value is less than the critical t-value, we do not reject the null hypothesis.
Hypothesis Two: There is no significant difference between the mean achievements of male and female SS2 students in Biology as determined by their mean CRAT scores.

Table 3: The Observed and Expected Frequencies of Males' and Females' Mean Achievements

                      Male   Female
Observed frequency    27     25
Expected frequency    26     26

χ² = Σ (O − E)² / E
   = (27 − 26)²/26 + (25 − 26)²/26
   = 0.038 + 0.038
   = 0.076 (calculated value)

Degrees of freedom = 2 − 1 = 1; critical value = 3.841.

The null hypothesis was upheld, since the calculated χ² value of 0.076 is less than the critical (table) χ² value of 3.841.
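The chi-square computation above can be verified in a couple of lines; exact arithmetic gives 2/26 ≈ 0.077, which the text rounds term-by-term to 0.076.

```python
def chi_square(observed, expected):
    """Chi-square goodness-of-fit statistic summed over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

x2 = chi_square([27, 25], [26, 26])
# x2 is about 0.077, far below the critical value of 3.841 at df = 1,
# so the null hypothesis of no male/female difference is upheld.
```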


Summary of Results
1) The facility, discrimination and distractor indices of the CRAT items were worked out. Items that showed good facility, discrimination and distractor indices were retained in the final version of the test, while items with very poor indices were excluded.
2) The CRAT reliability index was worked out using the KR-20 approach; the result showed that the CRAT possesses a reliability (internal consistency) of 0.51.
3) The means and standard deviations of the male and female students were worked out; the mean performance appeared higher in favour of the males.
4) The items of the CRAT showed no statistically significant deviation from the specifications of the core curriculum, as determined using the t-test.
5) There was no significant difference in the mean achievements of male and female students, as determined using the chi-square test of goodness of fit.


CHAPTER FIVE
DISCUSSION OF RESULTS, CONCLUSION, IMPLICATIONS, RECOMMENDATION AND SUMMARY
In this chapter, the findings of the study are discussed based on the three research questions and two hypotheses that guided the study. Also included in this chapter are the conclusions, implications, recommendations, limitations of the study, suggestions for further research and a summary of the entire work.
Discussion of Results
Research question 1: What is the validity of the CRAT?
The item analysis saw the survival of 55 CRAT items, all of which had good facility, discrimination and distractor indices; Appendices G and H give detailed information on the item analysis. Items with very poor facility, discrimination and distractor indices were excluded from the final version of the tests. All the dropped items had discrimination indices that were either negative or zero. The rejection of all items with zero or negative discrimination indices is in line with the views of Iwuji (1990), Okafor (1997) and Onunkwo (2002), who all agreed that negatively discriminating and zero-discriminating items should be regarded as defective and should not form part of the test.

Research question two: What is the reliability of the CRAT?


The internal consistency of the instrument was estimated with the Kuder-Richardson (KR-20) approach. The reliability estimate of the CRAT was 0.51. Although this coefficient is only moderate, it can still justify the use of the instrument for the assessment of students' achievement in senior secondary school class 2.
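As a rough illustration of how a KR-20 coefficient is obtained, the sketch below applies the standard formula to an invented 0/1 score matrix; it is not the author's computation or data.

```python
# Minimal sketch of the Kuder-Richardson formula 20 (KR-20) for internal
# consistency. The 0/1 score matrix below is invented for illustration.

def kr20(scores):
    """scores: one list of 0/1 item scores per student."""
    k = len(scores[0])                       # number of items
    n = len(scores)                          # number of students
    p = [sum(s[i] for s in scores) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)  # sum of item variances
    totals = [sum(s) for s in scores]
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / n  # total-score variance
    return (k / (k - 1)) * (1 - sum_pq / var)

scores = [[1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
print(round(kr20(scores), 2))  # 0.8 for this perfectly ordered pattern
```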

Research question three: What is the influence of sex (gender) on the items of the CRAT?
Sex as a factor in students' achievement on the CRAT is shown in Table 1 of Chapter Four; the table indicated a higher mean achievement for males. The findings of this study are in line with those of Nwagu (1991), who, in an instrument he developed for J.S.S. students' achievement in social studies, found sex to be a significant factor in achievement in J.S.S. 1 and J.S.S. 2 in favour of the males, but not in J.S.S. 3. Sullivan (1995) was interested in finding an instructional strategy capable of enhancing the development of conceptual understanding in handling quantitative data by boys and girls, so that they might develop their knowledge structures and thinking processes. The result of that study showed that student-centred, activity-oriented presentations help a broad range of subjects to develop their conceptual understanding and thinking skills; no significant gender differences were found for either conceptual understanding or thinking skills. The interpretation in the present study is that sex is not a significant factor in students' achievement.


Hypothesis One: The items of the CRAT did not deviate significantly from the specifications of the core curriculum.
Table 2 shows that the calculated t-value is 0.68 while the critical (table) t-value is 1.960. Since the calculated t-value is less than the critical t-value, the null hypothesis is not rejected. The interpretation in this study is that, statistically, there is no significant deviation from the specifications of the core curriculum.
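The reported t-value of 0.68 can be reproduced from the summary statistics shown in Appendix I (male: mean 27.00, SD 9.22, n = 24; female: mean 25.27, SD 9.51, n = 30). The sketch below assumes the separate-variance form of the standard error, which matches the workings in that appendix.

```python
# Independent-samples t-test from summary statistics, using the
# separate-variance standard error shown in Appendix I.
import math

def t_independent(m1, s1, n1, m2, s2, n2):
    """t = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (m1 - m2) / se

t = t_independent(27.00, 9.22, 24, 25.27, 9.51, 30)
print(round(t, 2))  # 0.68, below the critical value of 1.960
```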

Hypothesis Two: There is no significant difference between the mean achievements of male and female SS2 students in Biology as determined by their CRAT scores.
At the 0.05 probability level, the null hypothesis was upheld: the difference between the mean achievements of males and females was not statistically significant. Previous results of studies on this question have been divergent and inconclusive. Ugwu (1995) and Ohuche (1988) reported that boys and girls have differentiated abilities: the males were superior in numerical aptitude, science reasoning and spatial relationships, while the females were superior in verbal fluency, perceptual speed, memory and manual dexterity. In this study, using the CRAT, no significant difference was found in Biology achievement between males and females.
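A chi-square goodness-of-fit computation of the kind referred to here can be sketched as follows. The observed frequencies are invented, since the raw counts behind this test are not reported in this chapter.

```python
# Hedged sketch of a chi-square goodness-of-fit test. The observed pass
# counts below are hypothetical, not the thesis's data.

def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [52, 48]   # hypothetical: males vs females reaching mastery
expected = [50, 50]   # equal achievement under the null hypothesis
x2 = chi_square(observed, expected)
print(round(x2, 2))   # 0.16, far below the 0.05 critical value of 3.84 (df = 1)
```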

Conclusions: From the results obtained, the following conclusions were drawn:


1) The CRAT exhibited a good measure of facility, discrimination and distractor indices.
2) The internal consistency reliability of the CRAT (0.51) was moderate but sufficient to justify its use.
3) Sex was not found to be a significant factor in students' achievement.
4) The items of the CRAT did not deviate statistically from the specifications of the core curriculum.
5) There is no statistically significant difference between the mean achievements of males and females.
Educational Implications of the Study
The findings of this study have educational implications for teachers, educational administrators and others in the education sector. The developed and validated instrument will be of immense benefit to teachers, especially in the area of continuous assessment. The instrument will serve as a model test for Biology teachers; the model would be invaluable to all, more so to auxiliary Biology teachers. The knowledge of the influence of some factors on students' achievement would provide a useful guide for educational planners, inspectors and administrators.

Recommendations: Based on the findings of the study, the following recommendations are made:


i) Senior secondary 2 Biology teachers should make use of the CRAT in the assessment of the achievements of their students in Biology.
ii) The CRAT should be looked upon as a model achievement test in Biology for senior secondary schools.
iii) Regular sensitization workshops, seminars and conferences should be organized for teachers so that they become acquainted with the requisite techniques needed for the construction of valid assessment instruments.
iv) Educational inspectors should improve upon their duties by embarking on consistent, planned and objective inspection.

Limitations of the Study: The limitations of this study include:
1) The researcher's inability to control other variables, such as teacher variables, student variables (e.g. IQ) and educational opportunities, that might influence students' achievement.
2) The use of 25 secondary schools out of the 248 secondary schools in Enugu State, which may affect the generalization of the findings.
3) The possibility of some schools not adequately covering the scheme of work in Biology.

Suggestions for Further Research: Based on the findings and limitations of this study, the following suggestions for further research are made:


1) The study could be replicated in other senior secondary school subjects.
2) Further studies should be carried out in other states, using larger sample sizes and covering all the levels of senior secondary school.
3) The influence on students' achievement of other variables not covered in this work should be researched.
4) There is a need to develop and validate instruments for assessing psychomotor and affective behavioural objectives in Biology.
Summary of the Study
This study was designed to develop and validate a criterion referenced achievement test (CRAT) in Biology for senior secondary two students in Enugu State. In this study, the use of the CRAT as a model test by teachers was investigated. The study was guided by three research questions and two hypotheses:

1) What is the validity of the CRAT?
2) What is the reliability of the CRAT?
3) What is the influence of sex (gender) on the items of the CRAT?
The following hypotheses were formulated and tested at the 0.05 level of significance:
1) The items of the CRAT did not deviate significantly from the specifications of the core curriculum.
2) There is no significant difference between the mean achievements of male and female SS2 students in Biology as determined by their mean CRAT scores.


The literature reviewed indicated the procedures to follow in developing and validating achievement tests; it also covered the different theories of achievement testing and the one appropriate for this study. The researcher developed a pool of 120 four-option items for SS II students to cover the instructional objectives in Biology. A table of specification was constructed and given, alongside the items, to experts in measurement and evaluation and a specialist in Biology for face and content validation. Trial testing was done with a sample of 150 SS II students randomly chosen from four schools in the state. After the trial testing, some items were dropped and others modified, leaving 64 CRAT items for SS II Biology students. These items were validated by the experts and used for a pilot study on 54 SS II students randomly drawn from six secondary schools in Enugu State. The items were subjected to item analysis, after which 55 items survived, with a reliability index (KR-20) of 0.51. The sample for the study was drawn through a multi-stage proportionate stratified random sampling technique; the total sample consisted of 2970 SS II students. The data collected were analyzed using means, standard deviations, the t-test and the chi-square test of goodness of fit in order to answer the research questions and test the hypotheses. The results of the study showed:
i. the conformity of the items of the CRAT to the specifications of the core curriculum; and
ii. non-rejection of the null hypothesis on the influence of sex on students' achievement on the CRAT.
The implications of the study were presented and discussed, and recommendations were made based on the findings.
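The proportionate allocation step of the sampling procedure summarized above can be sketched as follows. The zone names follow Appendix D, but the enrolment figures and the resulting allocation are invented for illustration.

```python
# Illustrative sketch of proportionate stratified allocation, the idea behind
# multi-stage proportionate stratified random sampling. Zone names are from
# Appendix D; the SS2 enrolment figures are hypothetical.

def proportionate_allocation(strata_sizes, total_sample):
    """Allocate total_sample across strata in proportion to their sizes."""
    population = sum(strata_sizes.values())
    return {name: round(size / population * total_sample)
            for name, size in strata_sizes.items()}

zones = {"Agbani": 4000, "Awgu": 3000, "Enugu": 5000,
         "Nsukka": 6000, "Obollo": 3500, "Udi": 2500}  # invented counts
allocation = proportionate_allocation(zones, 2970)
print(allocation)
```

Within each selected stratum, schools and intact classes would then be sampled at the later stages of the multi-stage design.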


REFERENCES
Adams, I.K. (1981). Construction and validation of achievement test in Integrated Science for the Nigerian secondary school class one. Unpublished M.Ed. Thesis. Ibadan: University of Ibadan.
Adigwe, J.C. (1992). Gender differences in chemical problem solving among Nigerian students. Research in Education, 10(2), 187-201.
Agbaegbu, C.N. (1997). Psycho-metric qualities of a test. In S.A. Ezeudu, U.N.V. Agwagah & C.N. Agbaegbu (Eds.), Educational measurement and evaluation for colleges and universities (pp. 69-79). Onitsha: Cape Publishers International Limited.
Ali, A. (2006). Conducting research in education and social sciences. Enugu: Tashiwa Network Limited.
Anene, A.O. (1997). The influence of laboratory experiment on the performance of the Nigerian secondary school students in the University of Nigeria, Nsukka. Unpublished M.Ed. Thesis. Department of Education, University of Nigeria, Nsukka.
Blackmore and Cooky (1986). The nature of mathematics. London: Routledge and Kegan Paul.
Broadfoot, P. (1979). Assessment, schools and society. London: Methuen.
Douglas, G.A. (1990). Latent trait measurement models. In J.P. Keeves (Ed.), Educational research, methodology and measurement: An international handbook. New York: Pergamon Press.
Federal Republic of Nigeria (FRN) (1998). National Policy on Education. Lagos: Federal Government Press.
Ferguson, G.A. (1981). Statistical analysis in psychology and education (5th ed.). Auckland, Tokyo: McGraw-Hill International Book Company.
Haigh, C.F. (1999). Gender differences in SAT scores, analysis by race, 56(5), 11474A.
Ibeagi, O.O. (1991). Construction, validation and use of criterion-referenced test in Junior Secondary Integrated Science in secondary schools in Ankpa L.G.A. Unpublished M.Ed. Thesis, U.N.N.
Ininigba, J. (1982). The rural and urban children. Nigerian Chronicle, January 11, p. 8.
Inomiesa, E.A. (1988). The development, validation and use of standardized instrument for the continuous assessment of pupil achievement in upper primary science. Unpublished Ph.D. Thesis, Faculty of Education, University of Nigeria, Nsukka.
International Centre for Educational Evaluation (ICEE). (1982)
Iwuji, V.B.C. (1990). Measurement and evaluation for effective teaching and learning. Onitsha: Summer Educational Publishers (Nig) Limited.
Johnson, R. (1996). Notes on the schooling of the English working class 1780-1850. In Dale et al. (Eds.), Schooling and capitalism. London: Routledge and Kegan Paul.
Korashy, A.F. (1995). Applying the Rasch model to the selection of items for a mental ability test. Educational and Psychological Measurement, 55(5), 753-763.
Marjoram, T. (1997). Patience reward - a report on the progress of assessment of performance unit. Times Educational Supplement, 14(10), p. 77.
Mehrens, W.A. and Lehmann, I.J. (1978). Measurement and evaluation in education and psychology. New York: Holt, Rinehart and Winston.
Morgan, G. & Gustfson (1980). The use of the Rasch latent measurement model in the equating of scholastic aptitude tests. In D. Spearitt (Ed.), The improvement in education and psychology: Contributions of latent trait theories. Australia: Australian Council for Educational Research.
Ndukwe, J.O. (1990). Development and validation of Integrated Science achievement test for junior secondary school students. Unpublished M.Ed. Thesis, U.N.N.
Nkpone, H.L. (2001). Validation of Physics achievement test. Unpublished Ph.D. Thesis, Faculty of Education, University of Nigeria, Nsukka.
Nwagu, E.K.N. (1985). Validity of teacher-made Economics questions used in Anambra State secondary schools. Unpublished M.Ed. Thesis, Faculty of Education, University of Nigeria, Nsukka.
Nwagu, E.K.N. (1991). Development and standardization of social studies achievement tests (SSATS). Unpublished M.Ed. Thesis, Faculty of Education, University of Nigeria, Nsukka.
Nwana, O.C. (1982). Educational measurement for teachers. Lagos: Thomas Nelson and Sons (Nig.) Limited.
Nworgu, B.G. (1992). Educational measurement and evaluation: Theory and practice. Nsukka: Hallman Publishers.
Obe, E.S. (1984). Urban-rural and sex differences on scholastic aptitude of primary school finalists. Journal of the Nigerian Educational Research Council, 4(1 & 2), pp. 10-25.
Obimba, F.U. (1989). Fundamentals of measurement and evaluation in education and psychology. Owerri: Totan Publishers.
Obioma, G.O. (1985). The development and validation of a diagnostic mathematics achievement test for Nigerian secondary school students. Unpublished Doctoral Dissertation, University of Nigeria, Nsukka.
Ogochukwu, F.N. (1990). The development of an instrument (shyness scale) for the identification of shyness. Unpublished M.Ed. Thesis, University of Nigeria, Nsukka.
Ogomaka, P.M.C. (1984). Development and preliminary validation of students' evaluation of teachers' effectiveness scale. Unpublished M.Ed. Thesis, University of Nigeria, Nsukka.
Ogomaka, P.M.C. (1990). Types of research. In A.J. Isangedighi & P.M.C. Ogomaka (Eds.), Educational research methods (pp. 49-71). Owerri: Totan Publishers Limited.
Ogomaka, P.M.C. (2002). Towards uniformity in research proposals and reports. Owerri: Cape Publishers Int'l Limited.
Ohuche, R.O. & Akeju, S.A. (1977). Testing and evaluation in education. Lagos: African Educational Resources.
Ohuche, R.O. & Akeju, S.A. (1988). Measurement and evaluation in education. Onitsha: Africana-Fep Publishers Limited.
Okafor, S.O. (1997). Procedural steps in test development. In S.A. Ezeudu, U.N.V. Agwagah & C.N. Agbaegbu (Eds.), Educational measurement and evaluation for colleges and universities (pp. 80-96). Onitsha: Cape Publishers International Limited.
Okoye, R.O.C. (1997). Classification of tests. In S.A. Ezeudu, U.N.V. Agwagah & C.N. Agbaegbu (Eds.), Educational measurement and evaluation for colleges and universities. Onitsha: Cape Publishers.
Onah, F.E. (2006). Development and standardization of Agricultural Science achievement test for senior secondary schools in Enugu State. Unpublished Ph.D. Thesis, U.N.N.
Onunkwo, G.I.N. (2002). Fundamentals of educational measurement and evaluation. Owerri: Cape Publishers International Ltd.
Oragwam, E.O. (2004). Development and standardization of a national consciousness scale for Federal Unity secondary schools in Nigeria. Unpublished Ph.D. Thesis, Department of Science Education, University of Nigeria, Nsukka.
Osuji, U.S. (1999). The development, validation and use of a formative objective test in technical drawing for senior secondary schools (FOOTTED). Unpublished Ph.D. Thesis. Uturu: Abia State University.
Roberts, M.B.U. (1981). Biology: A functional approach. Great Britain: Cox and Wyman Limited.
Russel, S.S.B. (1982). An investigation into the problems of teaching social studies in secondary grammar schools in Ibadan, Oyo State. Unpublished M.Ed. Dissertation, University of Ibadan.
Sarojini, T.R. (2002). Modern Biology for senior secondary schools. Onitsha: Africana-Fep Publishers Limited.
Science Teachers Association of Nigeria. (2002). Science teachers handbook. Nigeria: Longman Limited.
Stone, R.H. and Cozen, A.N. (1981). New Biology for West African schools. London: Longman Group Limited.
Stromquist, N.P. (1988). Gender disparities in educational access and attainment: Mainstream and feminist theories. Paper presented at the annual meeting of the Comparative and International Education Society, Los Angeles. ERIC Document 28.
Sullivan, M.M. (1995). Analysis of conceptual understanding in undergraduate statistics across learning styles and gender after student-centered activity orientation instruction. Dissertation Abstracts International, 39, p. 1697A.
Thorndike, R.L. & Hagen, E.P. (1977). Measurement and evaluation in psychology and education. New York: John Wiley and Sons, Inc.
Uche, S.C. (1980). Integrated Science teaching: Perspectives and approaches. Aba: AAU Vitalis Books.
Uebersax, J.S.C. (1993). Statistical modeling of expert ratings on medical treatment appropriateness. Journal of the American Statistical Association.
Ugwu, O.I. (1995). Development and standardization of an achievement test in Practical Agriculture for junior secondary schools. Unpublished M.Ed. Thesis, Faculty of Education, University of Nigeria, Nsukka.
Williams, J.H. (2005). Cross national variations in rural mathematics achievement: A descriptive overview. Journal of Research in Rural Education, 20, 118. www.umaine.edu/jrre/20-5 htm
Wood, R. (1990). Item analysis. In J.P. Keeves (Ed.), Educational research methodology and measurement. New York: Pergamon Press.


Appendix A

UNIVERSITY OF NIGERIA, NSUKKA.

FACULTY OF EDUCATION
DEPARTMENT OF SCIENCE EDUCATION
Criterion Referenced Achievement Test (CRAT) for SS 2

Class: ______________________
Sex: ________________________
Name of school: ___________________________________________
Local Government Area: ___________________________________
Note: Information supplied will be used for research purposes only and not against you or your school.

INSTRUCTION
Time allowed: 1 hour
Answer each question by shading the letter containing the option that best suits your answer.
Example: The branch of science that studies all living things is
A. Biology  B. Botany  C. Zoology  D. Micro-biology
The answer is Biology, and the letter A is therefore shaded. Now do the following.

ISSUES ON TISSUES AND SUPPORTING SYSTEMS IN ANIMALS
1. The framework of the body of an organism is termed
A. Tissue  B. System  C. Skeleton  D. Bone
2. The functions of the skeleton include all except
A. Protection  B. Production of bile  C. Shape  D. Support
3. Which of these assists in breathing during respiration?
A. Clavicle and pectoral girdle  B. Sternum and clavicle  C. Ribs and diaphragm  D. Ribs and sternum
4. The mammalian skeleton is divided into
A. Axial and appendicular skeleton  B. Axis and atlas  C. Endo- and exoskeleton  D. Internal axial skeleton
5. The bones of the vertebral column are held together by
A. Ribs  B. Flesh  C. Ligaments  D. Clavicle
6. The number of bones that make up the human skeleton is
A. 200  B. 206  C. 106  D. 226
7. The bones of the vertebral column include all except
A. Cervical  B. Thoracic  C. Lumbar  D. Tibia
8. The meeting point of bones is called
A. Joint  B. Muscles  C. Connector  D. Cartilage
9. The strong whitish cord that attaches muscles to bones is called
A. Ligaments  B. Fluid  C. Tendon  D. Membrane
10. One of the following is a function of muscle:
A. Contains the ends of the bones  B. Provides the flesh that covers the bones  C. Prevents friction in the joint  D. Acts as pad at the ends of the bones
11. The vertebra that has no centrum is the
A. Atlas  B. Cervical  C. Axis  D. Lumbar
12. The fluid that acts as shock absorber can be found in one of the following:
A. Muscles  B. Synovial cavity  C. Synovial membrane  D. Cartilages
13. Which of the following are the types of joint?
A. Fixed and immovable  B. Unfixed and movable  C. Fixed and movable  D. Unfixed and immovable
14. One function of the axis is that it helps in
A. Holding the head  B. Holding the neck  C. Twisting the head  D. Twisting the neck
15. Thoracic vertebrae are found in the
A. Back region  B. Chest region  C. Neck region  D. Waist region
16. We have ……… lumbar bones in man.
A. 4  B. 6  C. 5  D. 7
17. Examples of external skeleton in mammals include all except
A. Finger nails  B. Hoofs  C. Scales  D. Bones
18. Carapace is a skeletal material found in
A. Insects  B. Mammals  C. Crustaceans  D. Earthworms
19. An example of animals that possess hydrostatic skeleton is
A. Cockroach  B. Earthworm  C. Crayfish  D. Fish
20. The skeleton provides a reserve for such mineral as
A. Potassium  B. Calcium  C. Magnesium  D. Sodium
21. The examples of movable joints include all except
A. Halve joint  B. Pivot joint  C. Hinged joint  D. Gliding joint
22. The type of joint that allows movement in almost all directions is
A. Ball and socket joint  B. Hinged joint  C. Pivot joint  D. Gliding joint
23. The type of joint that allows movement in only one direction is
A. Ball and socket joint  B. Hinged joint  C. Pivot joint  D. Gliding joint
24. One function of the lumbar vertebra is that it
A. Prevents stomach ache  B. Carries the weight of the abdomen  C. Carries the weight of the body  D. Protects the stomach
25. The parts of the skull include all except
A. Cranium  B. Jaws  C. Facial skeleton  D. Clavicle

ISSUES ON DIGESTIVE SYSTEM IN ANIMALS
26. Digested food from the alimentary canal is transported to the liver through the
A. Renal artery  B. Hepatic artery  C. Hepatic portal vein  D. Mesenteric artery
27. The breakdown of food substances into simple, soluble and absorbable forms best describes the term
A. Absorption  B. Ingestion  C. Egestion  D. Digestion
28. The digestion of carbohydrates starts in the
A. Mouth  B. Small intestine  C. Oesophagus  D. Large intestine
29. The movement of food through contraction of the muscles of the gullet into the stomach is called
A. Churning  B. Digestion  C. Peristalsis  D. Coagulation
30. The digestion of protein starts in the
A. Mouth  B. Small intestine  C. Stomach  D. Large intestine
31. The food we eat enters the stomach through the
A. Larynx  B. Cardiac sphincter  C. Pharynx  D. Small intestine
32. The end product of carbohydrate is
A. Glucose  B. Lactase  C. Galactose  D. Fructose
33. The pancreatic juice secreted by the pancreas contains the following enzymes except
A. Amylase  B. Lipase  C. Trypsin  D. Oestrogen
34. The digestion of fats and oil starts in the
A. Mouth  B. Duodenum  C. Stomach  D. Ileum
35. The end point of all the digestive processes is in the
A. Duodenum  B. Large intestine  C. Small intestine  D. Caecum
36. The transfer of digested food materials into the blood stream is
A. Assimilation  B. Transportation  C. Absorption  D. Digestion
37. The enzyme in the saliva that acts on carbohydrates in the mouth is called
A. Mucin  B. Ptyalin  C. Maltase  D. Lipase
38. Fats and oil are converted to fatty acids and glycerol by the action of
A. Amylase  B. Trypsin  C. Lactase  D. Galactase
39. One of the functions of blood platelets is to
A. Defend the body against foreign bodies  B. Help in blood clotting  C. Contain haemoglobin which transports oxygen  D. Produce plasma
40. In between the stomach and the duodenum is the
A. Pharynx  B. Pyloric sphincter  C. Cardiac sphincter  D. Ileum
41. During absorption, digested fats and oil are absorbed through the
A. Walls of the small intestine  B. Walls of the villi  C. Large intestine  D. Lacteal of the villi
42. The gastric glands in the stomach secrete
A. Diastases  B. Protease  C. Gastric juice  D. Bile
43. The glands in the walls of the small intestine produce an intestinal juice called
A. Erepsin  B. Succus entericus  C. Peptidase  D. Peptones
44. The liver secretes
A. Bile  B. Insulin  C. Thyroxin  D. Adrenalin
45. The enzyme that changes lactose to glucose and galactose is
A. Maltase  B. Sucrase  C. Lactase  D. Invertase
46. Intestinal juice can be found in the
A. Small intestine  B. Pancreas  C. Large intestine  D. Stomach
47. The muscles in the gullet force the bolus of food along by means of
A. Peristaltic contraction  B. Churning  C. Digestion  D. Coagulation
48. Trypsin converts fats and oil to
A. Amino acid  B. Fatty acid and glycerol  C. Glucose  D. Fatty acid and glucose
49. The ptyalin in the saliva changes starch to
A. Maltase  B. Sucrase  C. Maltose sugar  D. Lipase

ISSUES ON BASIC ECOLOGICAL FACTORS
50. The study of the interrelationships between living organisms and their external environment is the definition of
A. Ecosystem  B. Autecology  C. Ecology  D. Synecology
51. The functional role of an organism in an ecosystem or habitat is called
A. Niche  B. Autecology  C. Synecology  D. Biome
52. The part of the earth occupied by living things is called the
A. Biosphere  B. Lithosphere  C. Hydrosphere  D. Atmosphere
53. Autecology is the ecology of
A. Three species  B. Two species  C. One species  D. Four species
54. The collection of organisms of different species is referred to as a
A. Population  B. Community  C. Ecosystem  D. Synecology
55. An ecosystem is made up of two components; they are
A. Biotic and living factors  B. Abiotic and non-living factors  C. Biotic and abiotic factors  D. Biotic and green plants
56. The sum total of the living and non-living things that affect living things is termed the
A. Environment  B. Niche  C. Species  D. Habitat
57. A group of organisms that resemble each other and can interbreed is referred to as a
A. Species  B. Population  C. Community  D. Ecosystem
58. The hydrosphere is the part of the earth occupied by
A. Land  B. Water  C. Air  D. Wind
59. A community of plants and animals produced and maintained by the climate is called a
A. Ecosystem  B. Community  C. Population  D. Biome
60. The ecology of many species is said to be
A. Synecology  B. Autecology  C. Ecology  D. Community
61. Animals use the oxygen released by plants for
A. Respiration  B. Circulation  C. Digestion  D. Assimilation
62. Micro-organisms break down dead plants and other organisms to
A. Absorb nutrients  B. Protect nutrients  C. Release nutrients  D. Dissolve nutrients
63. Green plants use all for photosynthesis except
A. Light  B. Water  C. Carbon (IV) oxide  D. Sand
64. The upward movement of the sap in the xylem vessel is brought about by
A. Transpiration pull  B. Guttation  C. Adhesion  D. Capillarity

Appendix B
Table of Specification for SS2 Criterion Referenced Achievement Test (CRAT) in Biology

Content      Knowl   Compre   Applic   Analys   Synthe   Evalua   Total
             (35%)   (30%)    (20%)    (5%)     (5%)     (5%)     (100%)
i   (38%)      10       6        5       2        2        0        25
ii  (36%)       7       7        6       1        0        3        24
iii (26%)       8       5        1       1        0        0        15
Total          25      18       12       4        2        3        64

Note: Knowl - Knowledge; Compre - Comprehension; Applic - Application; Analys - Analysis; Synthe - Synthesis; Evalua - Evaluation
i   issues on tissues and supporting systems in animals
ii  issues on digestive system in animals
iii issues on basic ecological factors

Appendix C
Seventeen Local Government Areas in Enugu State
1) Aninri
2) Awgu
3) Enugu East
4) Enugu North
5) Enugu South
6) Eziagu
7) Igbo-Etiti
8) Igbo-Eze North
9) Igbo-Eze South
10) Isi-Uzo
11) Nkanu East
12) Nkanu West
13) Nsukka
14) Oji River
15) Udenu
16) Udi
17) Uzo-Uwani

Appendix D
Seventeen Local Government Areas and their Education Zones in Enugu State
i) Agbani Zone: Enugu South, Nkanu East and Nkanu West
ii) Awgu Zone: Aninri, Awgu and Oji-River
iii) Enugu Zone: Enugu East, Enugu North and Isi-Uzo
iv) Nsukka Zone: Igbo-Etiti, Nsukka and Uzo-Uwani
v) Obollo Zone: Igbo-Eze North, Igbo-Eze South and Udenu
vi) Udi Zone: Eziagu and Udi

Appendix E
Names of schools used in Trial Testing
i) Community Secondary School Iva-Valley (Enugu North - Urban)
ii) Urban Girls' Secondary School Nsukka (Nsukka - Urban)
iii) Community Secondary School Amaozalla Affa (Udi - Rural)
iv) Community Secondary School Igogoro (Igbo-Eze North - Rural)

Appendix F
Names of schools used in the Pilot Study
i) Attah Memorial High School, Adaba (Nsukka - Rural)
ii) Community Secondary School Amokwu Affa (Udi - Rural)
iii) Boys' Secondary School Nara (Nkanu East - Rural)
iv) Girls' Secondary School Awkunanaw Enugu (Enugu East - Urban)
v) Urban Girls' Secondary School Nsukka (Nsukka - Urban)
vi) Saint Theresa's College Abor (Udi - Urban)

Appendix G
Item Analysis of SS2 Criterion Referenced Achievement Test (CRAT 2)
(U = upper group; L = lower group; entries are counts of testees choosing each option)

Item   A(U) A(L)   B(U) B(L)   C(U) C(L)   D(U) D(L)
  1     14    4      2    4      8    4      8   10
  2      7   11      4    3      7    4      0   18
  3      7   11      3    4      8    3      0   18
  4     10    8      4    3      7    4      0   18
  5      3   15      6    3      3    6      6   12
  6      5   13      3    5      7    3      7   11
  7     10    8      2    4      8    4     12    6
  8     10    8      4    7      7    -      5   13
  9      4   14      3    -      7    8      6   12
 10      2   16      5    4      6    3     18    0
 11     10    8      5    0      7    6      0   18
 12      6   11      7    6      6    -      3   15
 13     16    2      0    4      9    5      3   15
 14      3   15      3    2      7    6      0   18
 15     14    4      4    3      8    3      8   10
 16     11    7      4    7      7    0      6   12
 17     12    6      0    4      8    7      6   12
 18     14    4      5    3      7    3      7   11
 19     14    4      8    0      5    5      6   12
 20     13    5      6    4      6    2      6   12
 21     10    8      0    7      7    4      6   12
 22     10    8      7    3      5    3      3   15
 23      8   10      4    2      8    4      0   18
 24      3   15      5    4      5    4      4   14
 25     10    8      7    3      5    3     11    7
 26     10    8      5    2      6    5      0   18
 27      4   14      5    2      6    5     18    0
 28     13    5      8    3      5    2      0   18
 29     13    5      8    3      5    2      2   16
 30     10    8      6    2      6    4      7   11
 31     13    5      5    3      6    4      6   12
 32     10    8      6    2      5    5      0   18
 33     10    8      8    5      5    -     13    5
 34      6   12      4    3      7    4      4   14
 35     13    5      5    4      5    4      2   16
 36      0   18      7    2      5    4      0   18
 37     12    6      7    4      7    -      6   12
 38      0   18      5    7      3    3      8   10
 39      8   10      4    0      9    5      6   12
 40     13    5      8    0      5    5      6   12
 41      0   18      2    2      8    6      9    9
 42     16    2      0    2      9    7      5   13
 43     12    6      4    3      7    4      5   13
 44      7   11      3    5      5    5      7   11
 45     11    7      2    7      7    2      8   10
 46     11    7      2    2      8    6      3   15
 47     12    6      1    4      9    4      5   13
 48      3   15      0    2      9    7      7   11
 49      7   11      4    1      7    6      0   18
 50     11    7      2    2      8    6      0   18
 51      9    9      5    3      6    4      0   18
 52     14    4      6    5      6    1      6   12
 53     11    7      8    2      5    3      4   14
 54     10    8      4    2      7    5     18    0
 55      8   10      4    3      7    4      5   13
 56     13    5      5    2      6    5      2   16
 57     15    3      8    5      5    -      6   12
 58     15    3      6    4      6    2      8   10
 59     10    8      4    2      7    5     12    6
 60     11    7      4    2      7    5      6   12
 61     13    5      6    2      6    4      2   16
 62      7   11      3    4      7    4      5   13
 63      6   12      5    1      6    6     15    3
 64     11    7      7    4      5    2      8   10

Appendix H Distractor Indices for S.S. 2 Criterion Referenced Achievement Test (CRAT) in Biology Distractors Items

P

D

A

B

C

D

1

0.33

0.22

0.11

0.11

*

0.11

2

0.19

0.06

0.22

*

0.06

0.11

3

0.31

0.28

0.17

0.00

*

0.22

Item   Index   Option indices
 4     0.50     0.11*   0.22    0.17    0.06
 5     0.25    -0.17   -0.17   -0.11*   0.11
 6     0.22     0.11    0.00*   0.06   -0.17
 7     0.50     0.33   -0.17    0.11    0.17*
 8     0.50     0.11*   0.11    0.06    0.00
 9     0.42    -0.06   -0.17    0.00*   0.06
10     0.50     1.00    0.06    0.11   -0.06*
11     0.14     0.28    0.22*   0.11    0.22
12     0.36     0.06    0.11*   0.00    0.06
13     0.39     0.22    0.28    0.22*   0.22
14     0.36     0.06    0.00    0.11*   0.06
15     0.19     0.06    0.06*   0.17    0.11
16     0.19     0.39   -0.06    0.11*   0.22
17     0.42     0.06    0.00    0.22*   0.11
18     0.42     0.22    0.22    0.17*   0.00
19     0.22     0.44    0.17*   0.17    0.11
20     0.28     0.11    0.28*   0.06    0.06
21     0.50     0.11*   0.22   -0.06    0.06
22     0.50     0.11*   0.17    0.17    0.06
23     0.17     0.11    0.11*   0.06    0.28
24     0.25     0.06   -0.06    0.17*  -0.17
25     0.50     0.22    0.00    0.17    0.00*
26     0.31     0.06    0.17    0.28*   0.11
27     0.50     1.00    0.06    0.06    0.11*
28     0.50     0.44*   0.17    0.11    0.00
29     0.19     0.17    0.17    0.17*   0.28
30     0.01     0.11   -0.06    0.17*   0.06
31     0.50     0.44*   0.22    0.06    0.11
32     0.50     0.11*   0.06    0.28    0.11
33     0.50     0.44    0.06    0.11    0.11*
34     0.19     0.06    0.06*   0.00    0.06
35     0.25     0.06    0.17    0.17*   0.28
36     0.25    -0.28   -0.11*   0.06    0.06
37     0.31     0.17    0.17*   0.11    0.06
38     0.33    -0.11   -0.28*   0.06   -0.22
39     0.11     0.22    0.11*   0.00    0.00
40     0.22     0.44    0.11*   0.17    0.11
41     0.50     0.00    0.00   -0.11    0.11*
42     0.44     0.11    0.28    0.17*   0.17
43     0.19     0.06    0.06*   0.11    0.22
44     0.50    -0.22*   0.11   -0.06   -0.06
45     0.25     0.28   -0.06    0.11*   0.11
46     0.50     0.22*   0.28    0.22   -0.06
47     0.50     0.33*   0.11    0.11   -0.11
48     0.06    -0.11   -0.06*   0.00   -0.17
49     0.36     0.06    0.06    0.28*   0.06
50     0.39     0.11    0.22    0.22*   0.17
51     0.50     0.00*  -0.17    0.11    0.06
52     0.50     0.56*   0.17    0.17    0.11
53     0.22     0.11    0.28    0.06*   0.06
54     0.50     1.00    0.28    0.11    0.17*
55     0.31     0.17    0.17   -0.06*   0.06
56     0.50     0.44*   0.28    0.28    0.06
57     0.50     0.67*   0.17    0.22    0.11
58     0.28     0.11    0.22*   0.17    0.00
59     0.50     0.33    0.11    0.11    0.00*
60     0.50     0.22*   0.00    0.17    0.11
61     0.50     0.44*   0.11   -0.17    0.11
62     0.31     0.17    0.11    0.06*  -0.06
63     0.50     0.67    0.06    0.00    0.00*
64     0.50     0.22*  -0.06    0.17    0.06

* correct option
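An option index of this kind can be obtained by comparing how often the upper and lower thirds of the testees choose each option. A minimal sketch under that assumption, using hypothetical option counts (not taken from the table above) and groups of 18 testees, the group size used in the item-analysis table later in this appendix:

```python
# Discrimination index per option:
# (upper-third choosers - lower-third choosers) / group size.
# The counts below are hypothetical, for illustration only.
upper = {"A": 2, "B": 12, "C": 1, "D": 3}   # choices in the upper third
lower = {"A": 0, "B": 4,  "C": 3, "D": 11}  # choices in the lower third
group_size = 18

disc = {opt: round((upper[opt] - lower[opt]) / group_size, 2) for opt in upper}
print(disc)  # the correct option should carry the largest positive index
```

Here option B would come out at 0.44 and the distractors at or below zero, the pattern expected of a well-functioning item.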


Appendix I Means and Standard Deviations of Males and Females in the Hypothetical CRAT in Biology.

Male:    Mean (X̄m) = 27.00;  Standard Deviation (Sm) = 9.22;  Number (nm) = 24
Female:  Mean (X̄f) = 25.27;  Standard Deviation (Sf) = 9.51;  Number (nf) = 30
Degrees of freedom = nm + nf − 2 = 24 + 30 − 2 = 52

t = (X̄m − X̄f) / √(Sm²/nm + Sf²/nf)
  = (27.00 − 25.27) / √(9.22²/24 + 9.51²/30)
  = 1.73 / √(3.54 + 3.01)
  = 1.73 / √6.55
  = 1.73 / 2.55        (standard error = 2.55)

t = 0.68

Calculated t-value = 0.68;  Critical t-value = 1.960
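The computation above can be reproduced directly from the summary statistics. A minimal sketch using the means, standard deviations, and group sizes reported in this appendix (intermediate values may differ from the worked figures above by a rounding step):

```python
import math

# Summary statistics from the appendix
mean_m, sd_m, n_m = 27.00, 9.22, 24   # males
mean_f, sd_f, n_f = 25.27, 9.51, 30   # females

# Independent-samples t-statistic with separate group variances
std_error = math.sqrt(sd_m**2 / n_m + sd_f**2 / n_f)
t = (mean_m - mean_f) / std_error

print(round(t, 2))
```

The calculated t of 0.68 falls below the critical value of 1.960, so the male and female means do not differ significantly.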


S/No of   Upper 1/3 of        Middle 1/3 of       Lower 1/3 of        p      q      pq
items     the testees         the testees         the testees
          Passed   Failed     Passed   Failed     Passed   Failed
 1          14       4          2        16         8        10      .44    .56    .25
 2           7      11          4        14         0        18      .20    .80    .16
 3           7      11          3        15         0        18      .19    .81    .15
 4           8      10          4        14         0        18      .22    .78    .17
 5           3      15          6        12         6        12       -      -      -
 6           5      13          5        13         7        11       -      -      -
 7          10       8          2        16         6        12      .33    .67    .22
 8           8      10          4        14         5        13      .31    .69    .21
 9           4      14          3        15         6        12       -      -      -
10           2      16          5        13         0        18      .13    .87    .11
11          10       8          0        18         0        18      .19    .81    .15
12           6      12          6        12         3        15      .28    .72    .20
13          16       2          0        18         3        15      .35    .65    .23
14           3      15          3        15         0        18      .17    .83    .14
15          14       4          3        15         8        10      .46    .54    .25
16          11       7          4        14         6        12      .39    .61    .24
17          12       6          0        18         6        12      .33    .67    .22
18          14       4          5        13         7        11      .48    .52    .25
19          14       4          8        10         6        12      .52    .48    .25
20          13       5          4        14         6        12      .43    .57    .25
21          10       8          0        18         6        12      .30    .70    .21
22          10       8          7        11         3        15      .37    .63    .23
23           8      10          2        16         0        18      .19    .81    .15
24           3      15          5        13         4        14       -      -      -
25          10       8          7        11         7        11      .44    .56    .25
26          10       8          5        13         0        18      .28    .72    .20
27           4      14          5        13         0        18      .17    .83    .14
28           5      13          8        10         0        18      .24    .76    .18
29          13       5          8        10         2        16      .43    .57    .25
30          10       8          6        12         7        11      .43    .57    .25
31          13       5          5        13         6        12      .44    .56    .25
32           8      10          6        12         0        18      .44    .56    .25
33          10       8          8        10         5        13      .43    .57    .25
34           6      12          3        15         4        14      .24    .76    .18
35          13       5          5        13         2        16      .37    .63    .23
36           0      18          7        11         0        18       -      -      -
37          12       6          4        14         6        12      .41    .59    .24
38           0      18          7        11         8        10      .28    .72    .20
39           8      10          0        18         6        12      .26    .74    .19
40          13       5          8        10         6        12      .50    .50    .25
41           0      18          2        16         0        18       -      -      -
42          16       2          0        18         5        13      .39    .61    .24
43          12       6          4        14         5        13      .39    .61    .24
44           7      11          3        15         7        11       -      -      -
45          11       7          2        16         8        10      .39    .61    .24
46          11       7          2        16         3        15      .30    .70    .21
47           7      11          1        17         5        13      .24    .76    .18
48           3      15          0        18         7        11       -      -      -
49           7      11          4        14         0        18      .20    .80    .16
50          11       7          2        16         0        18      .24    .76    .18
51           0      18          5        13         0        18       -      -      -
52          14       4          6        12         6        12      .48    .52    .25
53          11       7          8        10         4        14      .43    .57    .25
54          10       8          4        14         0        18      .26    .74    .18
55           8      10          4        14         5        13      .31    .69    .21
56          13       5          5        13         2        16      .37    .63    .23
57          15       3          8        10         6        12      .54    .46    .25
58          15       3          4        14         8        10      .50    .50    .25
59          10       8          4        14         6        12      .37    .63    .23
60          11       7          4        14         6        12      .39    .61    .24
61           3      15          6        12         2        16      .20    .80    .16
62           7      11          3        15         5        13      .28    .72    .20
63           6      12          5        13         5        13      .30    .70    .21
64          11       7          7        11         8        10      .48    .52    .25

∑pq = 11.89
p = proportion of the testees who passed each item
q = proportion of the testees who failed each item
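Each row's p, q, and pq follow from the pass counts across the three ability groups (18 testees per group, 54 in all). A minimal sketch using the counts reported for item 1 (the middle-third pass count, garbled in extraction, is taken as 2 so that p = .44):

```python
# Item 1: number passing in each third of the testees (18 per group)
passed = {"upper": 14, "middle": 2, "lower": 8}
n_testees = 54

p = sum(passed.values()) / n_testees  # proportion who passed the item
q = 1 - p                             # proportion who failed the item
pq = p * q                            # this item's contribution to sum(pq)

print(round(p, 2), round(q, 2), round(pq, 2))
```

Summing pq over the 55 retained items gives the ∑pq = 11.89 used in the K-R (20) formula below.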

K-R (20):  rtt = [n / (n − 1)] × [1 − ∑pq / Sx²]

Where:
n   = number of test items = 55
Sx² = variance of the total test scores = 23.90
∑pq = 11.89


rtt = (55 / 54) × [1 − 11.89 / 23.90]
rtt = 1.02 × (1 − 0.50)
rtt = 1.02 × 0.50
rtt = 0.51
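The same computation as a short Python sketch, using the values stated above (n = 55 items, Sx² = 23.90, ∑pq = 11.89):

```python
# Kuder-Richardson formula 20 from the appendix values
n_items = 55
var_total = 23.90   # variance of the total test scores (Sx^2)
sum_pq = 11.89      # sum of item variances (sum of p*q over items)

r_tt = (n_items / (n_items - 1)) * (1 - sum_pq / var_total)
print(round(r_tt, 2))
```

Carrying full precision through the computation also yields 0.51, agreeing with the rounded working above.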
