In education, certification, counseling, the military, and many other fields, a test or an exam (short for examination) is a tool or technique intended to measure students' expression of knowledge, skills and/or abilities. A test has more questions of greater difficulty and requires more time for completion than a quiz. It is usually divided into two or more sections, each covering a different area of the domain or taking a different approach to assessing the same aspects.

A standardized test is one that compares the performance of every individual subject with a norm or criterion. The norm may be established independently, or by statistical analysis of a large number of subjects.
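The difference between comparing a score with a norm and comparing it with a criterion can be sketched in a few lines of Python. This is only an illustration: the function names, the norm-group scores and the cut score of 65 are all hypothetical, and real testing programs use more elaborate scaling.

```python
def percentile_rank(score: float, norm_scores: list[float]) -> float:
    """Norm-referenced view: percentage of the norm group scoring below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)


def meets_criterion(score: float, cut_score: float) -> bool:
    """Criterion-referenced view: does the score reach a fixed standard?"""
    return score >= cut_score


# Hypothetical norm group of ten earlier test-takers (made-up scores).
norm_group = [42, 55, 61, 61, 68, 70, 74, 79, 83, 90]

print(percentile_rank(70, norm_group))    # position relative to the norm group
print(meets_criterion(70, cut_score=65))  # comparison with a fixed cut score
```

The same raw score of 70 is interpreted in two ways here: as a position within a population, and as a pass or fail against a standard that does not depend on how anyone else performed.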


Types of questions

Multiple-choice questions

For a multiple-choice question, the author of the test provides several possible answers (usually four or five) from which the test subjects must choose. There is one right answer, usually represented by a single answer option, though it is sometimes divided between two or more options, all of which subjects must identify correctly. Such a question may look like this:

 The number of right angles in a square is: a) 2 b) 3 c) 4 d) 5 

Test authors generally create incorrect response options, often referred to as distracters, which correspond with likely errors. For example, distracters may represent common misconceptions that occur during the developmental process. Constructing effective distracters is a key challenge in writing multiple-choice items that possess strong psychometric properties. Well-designed distracters, considered in combination, can attract considerably more than 25% of the weakest students, thereby reducing the effects of guessing on total scores. Constructing such items may require some skill and experience on the part of the item developer.
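A simple way to see whether distracters are doing their job is to tabulate which options a group of test subjects actually chose. The sketch below uses a hypothetical helper, `option_proportions`, and made-up responses from ten weak students to the four-option right-angles item above (correct answer "c"); it is an illustration, not a full item analysis.

```python
from collections import Counter


def option_proportions(choices: list[str]) -> dict[str, float]:
    """Return the proportion of respondents who selected each option."""
    counts = Counter(choices)
    total = len(choices)
    return {option: counts[option] / total for option in sorted(counts)}


# Hypothetical choices made by the ten weakest students (illustrative data only).
weakest_group = ["a", "b", "b", "d", "c", "a", "b", "d", "a", "b"]

proportions = option_proportions(weakest_group)
chance_rate = 1 / 4  # expected proportion correct from blind guessing among four options

print(proportions)
print(proportions["c"] < chance_rate)
```

In this made-up group the distracters draw most of the responses, so the proportion choosing the correct option falls below the rate that blind guessing would produce.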

Figure 1: Multiple choice distracter analysis with Item Characteristic Curve

A graph showing the functioning of a multiple-choice question is shown in Figure 1. The x-axis represents an ability continuum and the y-axis the probability of any given choice. The grey line maps ability to the probability of a correct response according to the Rasch model, which is a psychometric model used to analyse test data. The correct response in the example shown in Figure 1 is E. The proportion of students along the ability continuum who chose the correct response is highlighted in pink. The graph also shows the proportion of students opting for each of the other choices along the range of the ability continuum, as shown in the legend. The proportion of students at about −1.5 on the scale who responded correctly to this item is approximately 0.1, which is below the proportion expected if students were purely guessing.
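For readers who want to see what the model curve computes, here is a minimal sketch of the dichotomous Rasch model, in which the probability of a correct response depends only on the difference between a person's ability and the item's difficulty. The difficulty value of 0.5 is an assumption chosen for illustration and is not taken from Figure 1.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical item difficulty on the same scale as ability (not read from Figure 1).
ITEM_DIFFICULTY = 0.5

for ability in (-3.0, -1.5, 0.0, 0.5, 1.5, 3.0):
    p = rasch_probability(ability, ITEM_DIFFICULTY)
    print(f"ability {ability:+.1f}: P(correct) = {p:.3f}")
```

When ability equals difficulty the probability is exactly 0.5, and it rises steadily with ability. The basic model has no guessing parameter, so its curve approaches zero for very low abilities.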

An attractive feature of multiple-choice questions is that they are particularly easy to score. Scoring by machines such as the Scantron, or by software for computer-based tests, can be performed automatically and instantly, which is particularly valuable where there are not enough graders available to grade a large class or a large-scale standardized test.
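The ease of automatic scoring comes from the fact that each response only has to be compared with an answer key. The following sketch assumes a hypothetical three-item test and a hypothetical `score_answers` helper; it is not how any particular scoring machine or product works.

```python
def score_answers(answer_key: dict[int, str], responses: dict[int, str]) -> int:
    """Count responses that match the key; unanswered items score zero."""
    return sum(
        1
        for item, correct in answer_key.items()
        if responses.get(item, "").strip().lower() == correct.lower()
    )


# Hypothetical answer key and one student's responses (item 3 left blank).
answer_key = {1: "c", 2: "a", 3: "d"}
responses = {1: "C", 2: "b"}

print(score_answers(answer_key, responses))
```

Because the comparison is purely mechanical, the same function can score any number of answer sheets without a human grader.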

This format is not, however, appropriate for assessing all types of skills and abilities. Poorly written multiple-choice questions often create an overemphasis on simple memorization and deemphasize processes and comprehension, and they leave no room for disagreement or alternate interpretation, making them particularly unsuitable for humanities such as literature and philosophy.

Free-response questions

Students taking a test at the University of Vienna, June 2005

Free-response questions (also known as extended constructed responses) generally require subjects to produce written responses. The length of the written response may be as short as a single word or mathematical expression, in which case the question acquires some of the characteristics of the multiple-choice type. However, at higher levels of education, this type of question usually requires deeper, more analytical thinking. The most difficult free-response questions may involve an essay or original composition of a page or more in length, or a scientific proof or solution requiring over an hour.

Free-response questions do not pose as much of a challenge to the test author, but evaluating the responses is a different matter. Effective scoring involves reading the answer carefully and looking for specific features, such as clarity and logic, which the item is designed to assess. Often, the best results are achieved by awarding scores according to explicit ordered categories which reflect an increasing quality of response. Doing so may involve the construction of marking criteria and support materials, such as training materials for markers and samples of work which exemplify categories of responses. Typically, these questions are scored according to a uniform grading rubric for greater consistency and reliability.

At the other end of the spectrum, scores may be awarded according to superficial qualities of the response, such as the presence of certain important terms. In this case, it is easy for test subjects to fool scorers by writing a stream of generalizations or non sequiturs that incorporates the terms the scorers are looking for.
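The weakness of scoring by the presence of terms can be made concrete with a small sketch. The scoring function, the keyword list and both sample answers are hypothetical and chosen only to illustrate the point.

```python
def keyword_score(response: str, keywords: list[str]) -> int:
    """Superficial scoring: one point for each expected term that appears."""
    text = response.lower()
    return sum(1 for keyword in keywords if keyword.lower() in text)


# Hypothetical marking terms and two made-up answers.
keywords = ["photosynthesis", "chlorophyll", "sunlight"]
reasoned = "Plants use chlorophyll to capture sunlight, which drives photosynthesis."
padded = "Photosynthesis is important. Chlorophyll is green. Sunlight is bright."

print(keyword_score(reasoned, keywords))
print(keyword_score(padded, keywords))
```

Both answers receive the same score, although only the first connects the terms into an explanation. This is the gap that explicit marking criteria are meant to close.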

Practical examination

Knowledge of how to do something does not lend itself well to either free-response or multiple-choice questions. Such knowledge can be demonstrated only by performing the task. Art, music, and language fall into this category, as do non-academic disciplines such as sports and driving. Students of engineering are often required to present an original design or computer program developed over the course of days or even months.

A practical examination may be administered by an examiner in person (in which case it may be called an audition or a tryout) or by means of an audio or video recording. It may be administered on its own or in combination with other types of questions; for instance, many driving tests in the United States include a practical examination as well as a multiple-choice section regarding traffic laws.

Tests of the sciences may include laboratory experiments (practicals/laboratory sessions) to make sure that the student has learned not only the body of knowledge comprising the science but also the experimental methods through which it has been developed. Again, the use of explicit criteria is generally beneficial in the marking of practical examinations or performances.

Limitations of testing and associated issues

General aptitude tests are used in certain countries as a basis for entrance into colleges and universities. An issue associated with the use of these tests is that they are known to be subject to practice effects, and do not assess the accumulated learning of students during their schooling years. As a consequence, the SAT was renamed from the Scholastic Aptitude Test to the Scholastic Assessment Test. Some evidence indicates that the SAT scores of 11th and 12th graders do not correlate highly with freshman-year grades and correlate poorly with overall undergraduate ranking. This has put pressure on ETS to re-evaluate its exams before universities start requiring applicants to provide scores for the ACT, an exam which also does not correlate very well with freshman GPA but does correlate better than the SAT. Reasons for poor correlation are as follows:

  • The exam may improperly weight the types of problems encountered in the environment whose performance it is intended to predict. For example, the ratio of questions in geometry, calculus, and number theory on an exam may differ from the ratio in which such problems arise in that environment. More egregiously, a mathematics exam may ask solely about the names, birthdates, and countries of origin of various mathematicians when such knowledge is of little importance in a mathematics curriculum.
  • People are variously susceptible to stress. Some are virtually unaffected and excel on tests, while in extreme cases individuals can become very nervous and forget large parts of the exam material. To counterbalance this, teachers and professors often do not grade their students on tests alone, placing considerable weight on homework, attendance, in-class discussion activity, and laboratory investigations (where applicable).
  • Through specialized training on material and techniques specifically created to suit the test, students can be "coached" to "game" the test, significantly raising their scores without a comparable increase in their general intelligence or knowledge.
  • Although test organizers attempt to prevent it and impose strict penalties for it, academic dishonesty (cheating) can be used to obtain an advantage over other test-takers. On a multiple-choice test, lists of answers may be obtained beforehand. On a free-response test, the questions may be obtained beforehand, or the subject may write an answer that creates the illusion of knowledge.

Despite such issues, tests are less susceptible to cheating than other tools of learning evaluation. Laboratory results can be fabricated, and homework can be done by one student and copied by rote by others. The presence of a responsible test administrator, in a controlled environment, helps to guard against cheating.

Additionally, in some cases, high-stakes testing induces examinees to rise to meet the exam's high expectations. Generally, the term high-stakes is reserved for tests that are used as a basis for competitive entry into future courses, including tests which are highly weighted within selection criteria that are used for entrance into university courses.

The SAT and other high-stakes exams

In the United States and other countries, tests based primarily on multiple-choice questions have come to be used for assessments of great importance, with consequences including the funding levels of public schools and the admission of students to institutions of higher education. The most important such test in the U.S. is the SAT, which is written and administered by the College Board and consists almost entirely of multiple-choice questions (though some of these are specifically designed to offset inherent inaccuracies of that question type). Although the SAT was originally developed as a test of a student's intrinsic intelligence, its methodology has proven vulnerable to specialized test-preparation programs that improve the subject's score. For this reason, certain commentators have suggested that high-stakes testing should be based more on content learned during the schooling years. Difficulties then arise with respect to comparability across different schools, sectors, states and so on. A key challenge is to balance the need for comparability with the need to assess the skills, knowledge and abilities students have developed during the schooling years.

The SAT has also been criticized for an alleged racial bias; ethnic minorities supposedly fare worse on the exam than they should. As a result, it began to fall out of favor in the late 1990s, with increasing emphasis on standardized tests that measure actual knowledge. Some of these replacements have likewise come from the College Board, but many states have taken the initiative to design tests of their own. The ACT examination, introduced in 1959 as a competitor to the SAT, also features more knowledge-based questions, and is accepted as an alternative to the SAT for admission to many United States colleges. Many colleges are also placing more emphasis on measures of long-term performance such as the high-school grade point average, the difficulty of classes taken in high school, and teacher letters of recommendation.

There are also other high-stakes exams at higher educational levels, such as the Fundamentals of Engineering exam administered by the National Council of Examiners for Engineering and Surveying (NCEES).

