## What Are Standard Scores In Education?

Common terms and concepts

Norm-referenced: A test that’s norm-referenced compares your child’s scores to the scores of a large, random group of kids who’ve taken the same test. This “norm group” may be kids of the same age or in the same grade, and their scores are used to determine what’s typical.

For example, on the Wechsler Intelligence Scale for Children (WISC), the average score is 100. So, your child’s score will be based on that.

Test reliability: A “reliable” test provides the same results every time. That means if your child took the test a few times over a period of time, the scores would be roughly the same.

For example, say your child took an IQ test in second grade and again in third grade. If it’s a reliable test, the scores should be similar. Longer tests tend to be more reliable than short ones. That’s because each question is worth more on a shorter test.

1. If a test has five questions and your child gets one wrong, the score will be 80 percent.
2. But if a test has 20 questions and your child gets one wrong, the score will be 95 percent.

Standard score: Most educational tests have standard scores based on a scale that makes the average score 100 points.

But average is actually a range. There’s wiggle room in test scores to account for possible mistakes. This is sometimes called “standard error.” Standard error allows us to say how confident we are that a score falls within a certain range. Strong tests typically have a lower standard error.

Standard deviation (SD): This is the average distance between all test scores and the average score. Take the WISC-V, with an average score of 100. Most kids fall in the range of 85–115 points. That’s a standard deviation (SD) of 15 points. Being one SD away (15 points) is still considered average. That gap isn’t big enough to be statistically significant.

But a score that’s two standard deviations (30 points) away in either direction is significant, because it’s much less likely that a gap that large is due to chance or error.

Percentile: The percentile shows the proportion of scores that were lower than your child’s score.

• Let’s say your child is one of 100 kids being tested, and scores in the 75th percentile.
• That means your child scored higher than 75 of the 100 kids tested.

Many tests are made up of a number of short tests that look at different skills. Those short tests are called subtests. An achievement test may have subtests for vocabulary, working memory, and visual reasoning.

Each subtest has its own score. Sometimes the scores of subtests that look at different pieces of bigger skills are combined. For example, a vocabulary subtest and a language comprehension subtest might be combined to give a “verbal ability” score. Subtest scores are important.

### What do standard scores mean?

In statistics, the standard score is the number of standard deviations by which the value of a raw score (i.e., an observed value or data point) is above or below the mean value of what is being observed or measured.

### What is an example of a standard score?

Example 4.1: One SUV gets 25 mpg. With a mean of 22 mpg and a standard deviation of 3 mpg (the values used in the calculation), its standardized score is z = (25 – 22)/3 = 1. It is one standard deviation above the mean.
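The calculation in this example can be sketched in a few lines of Python. This is a minimal illustration: the function name `z_score` is made up for this sketch, and the mean (22 mpg) and standard deviation (3 mpg) are read off the formula in the example.

```python
def z_score(value: float, mean: float, sd: float) -> float:
    """Return how many standard deviations `value` lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / sd


# The SUV from the example: 25 mpg, mean 22 mpg, standard deviation 3 mpg.
print(z_score(25, mean=22, sd=3))  # 1.0
```

A negative result would indicate a value below the mean; for instance, 19 mpg on the same scale gives -1.0.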

### What are standard scores in educational research?

The standard score (more commonly referred to as a z-score) is a very useful statistic because it (a) allows us to calculate the probability of a score occurring within our normal distribution and (b) enables us to compare two scores that are from different normal distributions.

## What are the types of standard scores?

When we standardize scores, we can compare scores for different groups of people and we can compare scores on different tests. This chapter will reveal the secrets of four different standard scores: Percentiles, Z scores, T scores, and IQ scores.

#### What is a good standard score?

Frequently Used Terms – The results of most psychological tests are reported using either standard scores or percentiles. Standard scores and percentiles describe how a student performed on a test compared to a representative sample of students of the same age from the general population.

This comparison sample or group is called a norm group. Because educational and psychological tests do not measure abilities and traits perfectly, standard scores are usually reported with a corresponding confidence interval to account for error in measurement.

Standard score: Most educational and psychological tests provide standard scores that are based on a scale that has a statistical mean (or average score) of 100.

If a student earns a standard score that is less than 100, then that student is said to have performed below the mean, and if a student earns a standard score that is greater than 100, then that student is said to have performed above the mean. However, there is a wide range of average scores, from low average to high average, with most students earning standard scores on educational and psychological tests that fall in the range of 85–115.

This is the range in which 68% of the general population performs and, therefore, is considered the normal limits of functioning.

Classifying standard scores: The normal limits of functioning encompass three classification categories: low average (standard scores of 80–89), average (standard scores of 90–109), and high average (standard scores of 110–119).
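As a rough sketch, the three categories just listed can be written as a lookup. The function name `classify_standard_score` is hypothetical, and because the text names only these three bands, any score outside 80–119 is deliberately left unclassified rather than guessed at.

```python
def classify_standard_score(score: int) -> str:
    """Map a standard score (mean 100) onto the three bands named in the text."""
    if 80 <= score <= 89:
        return "low average"
    if 90 <= score <= 109:
        return "average"
    if 110 <= score <= 119:
        return "high average"
    # The text does not label scores below 80 or above 119.
    return "outside the three categories listed"


for s in (85, 100, 115, 125):
    print(s, classify_standard_score(s))
```

Different tests and publishers may use other labels and cut-offs, so the bands here should be read as the ones quoted in this section only.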

These classifications are used typically by school psychologists and other assessment specialists to describe a student’s ability compared to same-age peers from the general population.

Subtest scores: Many psychological tests are composed of multiple subtests that have a mean of 10, 50, or 100.

• Subtests are relatively short tests that measure specific abilities, such as vocabulary, general knowledge, or short-term auditory memory.
• Two or more subtest scores that reflect different aspects of the same broad ability (such as broad Verbal Ability) are usually combined into a composite or index score that has a mean of 100.

For example, a Vocabulary subtest score, a Comprehension subtest score, and a General Information subtest score (the three subtest scores that reflect different aspects of Verbal Ability) may be combined to form a broad Verbal Comprehension Index score.

Composite scores, such as IQ scores, Index scores, and Cluster scores, are more reliable and valid than individual subtest scores. Therefore, when a student’s performance demonstrates relatively uniform ability across subtests that measure different aspects of the same broad ability (the Vocabulary, Comprehension, and General Information subtest scores are all average), then the most reliable and valid score is the composite score (Verbal Comprehension Index in this example).

However, when a student’s performance demonstrates uneven ability across subtests that measure different aspects of the same broad ability (the Vocabulary score is below average, the Comprehension score is below average, and the General Information score is high average), then the Verbal Comprehension Index may not provide an accurate estimate of verbal ability.

• In this situation, the student’s verbal ability may be best understood by looking at what each subtest measures.
• In sum, it is important to remember that unless performance is relatively uniform on the subtests that make up a particular broad ability domain (such as Verbal Ability), then the overall score (in this case the Verbal Comprehension Index) may be a misleading estimate.

Percentile: Standard scores may also be reported with a percentile to aid in understanding performance. A percentile indicates the percentage of individuals in the norm group that scored below a particular score. For example, a student who earned a standard score of 100 performed at the 50th percentile.

• This means that the student performed as well as or better than 50% of same-age peers from the general population.
• A standard score of 90 has a percentile rank of 25.
• A student who is reported to be at the 25th percentile performed as well or better than 25% of same-age peers, just as a student who is reported to be at the 75th percentile performed as well or better than 75% of students of the same age.
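The link between standard scores and percentiles can be approximated with a normal curve. The sketch below assumes scores are normally distributed with a mean of 100 and a standard deviation of 15 (consistent with the 85–115 range described earlier); the function name `percentile_rank` is invented for illustration, and real tests publish their own conversion tables.

```python
from statistics import NormalDist


def percentile_rank(standard_score: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Approximate percentage of the norm group scoring at or below `standard_score`,
    assuming a normal distribution (an assumption, not a property of every test)."""
    return 100 * NormalDist(mu=mean, sigma=sd).cdf(standard_score)


for score in (90, 100, 110):
    print(score, round(percentile_rank(score)))
# 90 25
# 100 50
# 110 75
```

Under these assumptions the output agrees with the figures quoted above: a standard score of 90 sits near the 25th percentile and a score of 100 at the 50th.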

While the standard score of 90 is below the statistical mean of 100 and is at the 25th percentile, this performance is still within the average range and generally does not indicate any need for concern.

Confidence interval: Psychological tests do not measure ability perfectly.

1. No matter how carefully a test is developed, it will always contain some form of error or unreliability.
2. This error may exist for various reasons that are not always readily identifiable.
3. In order to account for this error, standard scores are often reported with confidence intervals.
4. Confidence intervals represent a range of standard scores in which the student’s true score is likely to fall a certain percentage of the time.

Most confidence intervals are set at 95%, meaning that a student’s true score is likely to fall between the upper and lower limits of the confidence interval 95 out of 100 times (or 95% of the time). For example, if a student earned a standard score of 90 with a confidence interval of ±5, this means that the lower limit of the confidence interval is 85 (that is, 90 – 5 = 85) and the upper limit of the confidence interval is 95 (90 + 5 = 95).

The standard score of 90 may be reported in a psychological report as 90 ± 5 or 90 (85–95). Although the student’s score on the day of the evaluation was 90 in this example, the true score may be lower or higher than 90 owing to an error associated with the method in which the ability was measured.

Therefore, it is more accurate to say that there is a 95% chance that the student’s true performance on this test falls somewhere between 85 and 95. Tests that are highly reliable have relatively small confidence bands associated with their scores, indicating that these tests provide the most consistent scores across time.

Example: Reporting Scores

The following statement is one that can be commonly found in a psychological report and can be used to illustrate these definitions: “Jacob obtained a standard score of 93 ± 7 on a test of reading comprehension, which is ranked at the 33rd percentile and is classified as average.” This is what that statement means: First, Jacob’s observed score fell below the mean of 100.

Second, Jacob did as well as or better than 33% of students his age from the general population. Third, there is a 95% chance that Jacob’s true score falls somewhere between 86 and 100. Fourth, Jacob’s performance is considered average relative to same-age peers from the general population.
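The arithmetic behind Jacob's confidence interval is simple enough to sketch. The helper name `confidence_band` is hypothetical; it only adds and subtracts the reported margin and does not derive that margin from the test's reliability.

```python
def confidence_band(observed: float, margin: float) -> tuple[float, float]:
    """Return the (lower, upper) limits for a score reported as observed ± margin."""
    if margin < 0:
        raise ValueError("margin must not be negative")
    return observed - margin, observed + margin


low, high = confidence_band(93, 7)
print(low, high)        # 86 100
print(93 < 100)         # True: the observed score is below the mean of 100
```

The same helper reproduces the earlier example of a score of 90 with a margin of 5, giving limits of 85 and 95.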

### How do you explain standard scores to parents?

Standard Score – The standard score is a way of showing how close a score is to the average score that was obtained in the sample. There’s a lot of math involved in converting the scores; however, all of that glorious math is done by the test creators.

They create a nice, organized little table that lays out all of the information for the test administrator (your speech therapist). So the speech therapist takes the child’s raw score and looks it up in the table in the test manual. The table converts the raw score to the standard score like magic. The most important thing for parents to understand with standard scores is what is considered “average”.

Common practice on standardized tests used for speech and language assessments is that 100 is the mean score and the standard deviation is 15 points. This means that scores between 85 and 115 are considered to be within the average range. Anything above 115 is considered “above average” and anything below 85 is considered “below average”.

## What is the most commonly used standard score?

II. Standard Scores

Standard scores are used when you want to compare two distributions of data that have significantly different means and standard deviations. Standard scores allow us to compare apples and oranges, if you will.

| Employee | Customer Rating | Quarterly Sales Volume |
|---|---|---|
| Aparna | 15 | 76,000 |
| Dave | 2 | 55,000 |
| Latika | 21 | 39,000 |
| Gloria | 17 | 100,000 |
| Rasheed | 10 | 20,000 |
| X bar | 13 | 58,000 |
| s | 7.314 | 31,233 |

A. z scores

The most commonly used standard score is the z score. Z scores have a mean of 0 and a standard deviation of 1. Once a distribution has been converted to z scores it can be compared to any other distribution that has been converted to z scores. The formula for calculating z scores is shown below:

$$z = \frac{X - \bar{X}}{s}$$

The formula reads: z equals the raw score of a sample minus the mean of that sample, divided by the standard deviation of that sample. So now we can convert each of the two different sets of raw scores into z scores:

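| Employee | X1 (Customer Rating) | z1 | X2 (Quarterly Sales Volume) | z2 |
|---|---|---|---|---|
| Aparna | 15 | 0.27 | 76,000 | 0.58 |
| Dave | 2 | -1.50 | 55,000 | -0.10 |
| Latika | 21 | 1.09 | 39,000 | -0.61 |
| Gloria | 17 | 0.55 | 100,000 | 1.34 |
| Rasheed | 10 | -0.41 | 20,000 | -1.22 |
| X bar | 13 | | 58,000 | |
| s | 7.314 | | 31,233 | |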
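The conversion from raw scores to z scores can be reproduced with Python's standard library. This is a sketch under one assumption worth stating: the reported values of s (7.314 and 31,233) match the sample standard deviation (dividing by n − 1), which is what `statistics.stdev` computes. The function name `z_scores` is made up for this example.

```python
from statistics import mean, stdev

ratings = {"Aparna": 15, "Dave": 2, "Latika": 21, "Gloria": 17, "Rasheed": 10}
sales = {"Aparna": 76_000, "Dave": 55_000, "Latika": 39_000,
         "Gloria": 100_000, "Rasheed": 20_000}


def z_scores(values: dict[str, float]) -> dict[str, float]:
    """Convert each raw score to a z score using the sample mean and sample SD."""
    data = list(values.values())
    m = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    return {name: round((x - m) / s, 2) for name, x in values.items()}


print(z_scores(ratings))
print(z_scores(sales))
```

Rounded to two decimals, the results agree with the z1 and z2 columns in the table.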

With the new z scores we can do a better comparison of the two different data sets. Remember that a z score close to zero is close to the mean of the distribution. Negative z scores are below the mean and positive z scores are above the mean. For both of these categories we want values that are above the mean.
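Picking out the employees who are above the mean on both measures amounts to checking that both z scores are positive. The short sketch below hard-codes the z scores from the table; the variable names are illustrative only.

```python
# z scores copied from the table (customer rating, quarterly sales volume).
z_rating = {"Aparna": 0.27, "Dave": -1.50, "Latika": 1.09, "Gloria": 0.55, "Rasheed": -0.41}
z_sales = {"Aparna": 0.58, "Dave": -0.10, "Latika": -0.61, "Gloria": 1.34, "Rasheed": -1.22}

# A positive z score means the raw score is above the mean of its distribution.
above_mean_on_both = [
    name for name in z_rating
    if z_rating[name] > 0 and z_sales[name] > 0
]
print(above_mean_on_both)  # ['Aparna', 'Gloria']
```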

| Employee | Customer Rating (z) | Quarterly Sales Volume (z) |
|---|---|---|
| Aparna | 0.27 | 0.58 |
| Dave | -1.50 | -0.10 |
| Latika | 1.09 | -0.61 |
| Gloria | 0.55 | 1.34 |
| Rasheed | -0.41 | -1.22 |

Since Aparna and Gloria are the only two employees that had both scores above the mean, they will be the only ones to get a bonus.

B. T scores

Z scores are an excellent standard score for most people. The only real downside of z scores is that you have to deal with negative numbers.

For those people that are uncomfortable with negative numbers, the T score can be used. The T score has a mean of 50 and a standard deviation of 10. To calculate a T score you must first calculate the z score as above. Once the z score has been identified for each raw score you can use the formula below to convert these z scores into T scores:

T = (z)(10) + 50

The formula reads: T equals the z score times 10 and then plus 50.

The table below shows the same data as above but as T scores:

| Employee | Customer Rating (T) | Quarterly Sales Volume (T) |
|---|---|---|
| Aparna | 52.7 | 55.8 |
| Dave | 35.0 | 49.0 |
| Latika | 60.9 | 43.9 |
| Gloria | 55.5 | 63.4 |
| Rasheed | 45.9 | 37.8 |
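The z-to-T conversion, T = (z)(10) + 50, is a one-line calculation. The sketch below applies it to the customer-rating z scores from the earlier table; the function name `t_score` is hypothetical, and rounding to one decimal is chosen only to match the table.

```python
def t_score(z: float) -> float:
    """Convert a z score to a T score (mean 50, standard deviation 10)."""
    return round(10 * z + 50, 1)


z_rating = {"Aparna": 0.27, "Dave": -1.50, "Latika": 1.09, "Gloria": 0.55, "Rasheed": -0.41}
for name, z in z_rating.items():
    print(name, t_score(z))
# Aparna 52.7
# Dave 35.0
# Latika 60.9
# Gloria 55.5
# Rasheed 45.9
```

Because a T score of 50 corresponds to z = 0, "above 50" carries the same meaning as "above the mean" without requiring negative numbers.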

#### What is a standard score and how do you find it?

Z-scores are expressed in terms of standard deviations from their means. As a result, these z-scores have a distribution with a mean of 0 and a standard deviation of 1. The formula for calculating the standard score is given below:

$$z = \frac{X - \bar{X}}{s}$$

As the formula shows, the standard score is simply the score, minus the mean score, divided by the standard deviation.

#### What does a standard score of 75 mean?

A percentile rank of 50 means that your child performed as well as or better than 50 percent of children who are his age or in his grade. If your child earns a percentile rank of 75 on a standardized test, your child scored as well as or better than 75 percent of his peers.

## Why are standard scores important in assessment?

Interpreting test results – Test results are usually presented in terms of numerical scores, such as raw scores, standard scores, and percentile scores. In order to interpret test scores properly, you need to understand the scoring system used.

• Types of scores
• Raw scores. These refer to the unadjusted scores on the test. Usually the raw score represents the number of items answered correctly, as in mental ability or achievement tests. Some types of assessment tools, such as work value inventories and personality inventories, have no “right” or “wrong” answers. In such cases, the raw score may represent the number of positive responses for a particular trait. Raw scores do not provide much useful information. Consider a test taker who gets 25 out of 50 questions correct on a math test. It’s hard to know whether “25” is a good score or a poor score. When you compare the results to all the other individuals who took the same test, you may discover that this was the highest score on the test. In general, for norm-referenced tests, it is important to see where a particular score lies within the context of the scores of other people. Adjusting or converting raw scores into standard scores or percentiles will provide you with this kind of information. For criterion-referenced tests, it is important to see what a particular score indicates about proficiency or competence.
• Standard scores. Standard scores are converted raw scores. They indicate where a person’s score lies in comparison to a reference group. For example, if the test manual indicates that the average or mean score for the group on a test is 50, then an individual who gets a higher score is above average, and an individual who gets a lower score is below average. Standard scores are discussed in more detail below in the section on standard score distributions.
• Percentile scores. A percentile score is another type of converted score. An individual’s raw score is converted to a number indicating the percent of people in the reference group who scored below the test taker. For example, a score at the 70th percentile means that the individual’s score is the same as or higher than the scores of 70% of those who took the test. The 50th percentile is known as the median and represents the middle score of the distribution.
• Score distribution
• Normal curve. A great many human characteristics, such as height, weight, math ability, and typing skill, are distributed in the population at large in a typical pattern. This pattern of distribution is known as the normal curve and has a symmetrical bell-shaped appearance. The curve is illustrated in Figure 2. As you can see, a large number of individual cases cluster in the middle of the curve. The farther from the middle or average you go, the fewer the cases. In general, distributions of test scores follow the same normal curve pattern. Most individuals get scores in the middle range. As the extremes are approached, fewer and fewer cases exist, indicating that progressively fewer individuals get low scores (left of center) and high scores (right of center).
• Standard score distribution. There are two characteristics of a standard score distribution that are reported in test manuals. One is the mean, a measure of central tendency; the other is the standard deviation, a measure of the variability of the distribution.
• Mean. The most commonly used measure of central tendency is the mean or arithmetic average score. Test developers generally assign an arbitrary number to represent the mean standard score when they convert from raw scores to standard scores. Look at Figure 2. Test A and Test B are two tests with different standard score means. Notice that Test A has a mean of 100 and Test B has a mean of 50. If an individual got a score of 50 on Test A, that person did very poorly. However, a score of 50 on Test B would be an average score.
• Standard deviation. The standard deviation is the most commonly used measure of variability. It is used to describe the distribution of scores around the mean. Figure 2 shows the percent of cases 1, 2, and 3 standard deviations (sd) above the mean and 1, 2, and 3 standard deviations below the mean. As you can see, 34% of the cases lie between the mean and +1 sd, and 34% of the cases lie between the mean and -1 sd. Thus, approximately 68% of the cases lie between -1 and +1 standard deviations. Notice that for Test A, the standard deviation is 20, and 68% of the test takers score between 80 and 120. For Test B the standard deviation is 10, and 68% of the test takers score between 40 and 60.
• Percentile distribution. The bottom horizontal line below the curve in Figure 2 is labeled “Percentiles.” It represents the distribution of scores in percentile units. Notice that the median is in the same position as the mean on the normal curve. By knowing the percentile score of an individual, you already know how that individual compares with others in the group. An individual at the 98th percentile scored the same as or better than 98% of the individuals in the group. This is equivalent to getting a standard score of 140 on Test A or 70 on Test B.
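The equivalence between 140 on Test A and 70 on Test B can be checked by passing through z scores. This sketch assumes both tests follow the normal curve described above, with Test A at mean 100, SD 20 and Test B at mean 50, SD 10; the function name `rescale` is invented for illustration.

```python
from statistics import NormalDist


def rescale(score: float, from_mean: float, from_sd: float,
            to_mean: float, to_sd: float) -> float:
    """Express a score from one standard-score scale on another scale via its z score."""
    z = (score - from_mean) / from_sd
    return to_mean + z * to_sd


# 140 on Test A (mean 100, SD 20) expressed on Test B's scale (mean 50, SD 10).
print(rescale(140, 100, 20, 50, 10))       # 70.0

# Both scores sit 2 SDs above the mean; on a normal curve that is about the 98th percentile.
print(round(100 * NormalDist().cdf(2)))    # 98
```

The same approach shows why a score of 50 means very different things on the two tests: it is 2.5 SDs below the mean on Test A but exactly average on Test B.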

#### How do you read standard scores?

A standard score of 90, the beginning of the average range, corresponds to a percentile rank of 25. A standard score of 110, the uppermost end of average, has a percentile rank of 75. So a child at the 30th percentile on a test of reading or math is performing within the region of what would be considered ‘average.’

#### What are standardised scores and why are they useful?

Many people will remember test scores from their school days such as ‘7 out of 10’ for a primary school spelling test, or ‘63%’ for one of their secondary school exams. Such scores, known as raw scores, are readily understandable and useful in indicating what proportion of the total marks a person has gained in a test.

However, these scores are less useful in enabling teachers to compare pupils’ performance meaningfully between one test and another, and to monitor progress over a period of time. This is because raw scores do not account for factors such as the difficulty of a test or performance relative to other test takers.

Many professionally produced tests, including NFER Tests for years 1-5, give additional outcomes beyond simple proportions or percentages. One of these measures is standardised scores. Why use standardised scores? Standardised scores are more useful measures than raw scores (the number or percentage of questions answered correctly) as they enable test-takers to be compared with a large, nationally representative sample that has taken the test prior to publication.

Usually, tests are standardised so that the average, nationally standardised score automatically comes out as 100, irrespective of the difficulty of the test. This means teachers can readily identify whether a test-taker is above or below the national average. As standardised scores are converted onto a common scale they enable meaningful comparisons between scores from other standardised tests.

Standardised scores from most educational tests cover the same range, from 70 to 140. Hence a pupil’s standing in, say, mathematics and English can be compared directly using standardised scores. A pupil’s standardised score can also be tracked from test to test to monitor whether progress is being made.

Standardised scores may also make an allowance for the different ages of test takers. This is known as an age-standardised score. In a typical class in England and Wales, the oldest pupils can be up to 12 months older than the youngest. Almost invariably, in ability tests taken in the primary and early secondary years, older pupils achieve slightly higher raw scores than younger pupils on average.

However, age-standardised scores are derived in such a way that the ages of the pupils are taken into account by comparing a pupil with others of the same age (in years and months) in the nationally representative sample. Thus a younger pupil may gain a lower raw score than an older pupil, but have a higher standardised score.

• This is because the younger pupil is being compared with other younger pupils in the reference group and has a higher performance relative to his or her own age group.
• Age-standardised scores are therefore beneficial in allowing meaningful comparisons between pupils within a school.
• However, please be aware that not all standardised tests are age-standardised and ask your test provider if you are unsure.
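To make the age-standardisation idea concrete, here is a small sketch in which a younger pupil has the lower raw score but the higher standardised score. Everything numeric in it is hypothetical: the age bands, the raw-score means and standard deviations, and the linear conversion to a mean-100, SD-15 scale are assumptions for illustration. Published tests use their own conversion tables derived from the standardisation sample.

```python
# Hypothetical raw-score norms per age band: (mean raw score, standard deviation).
HYPOTHETICAL_NORMS = {
    "7y0m-7y5m": (20.0, 5.0),
    "7y6m-7y11m": (26.0, 5.0),
}


def age_standardised_score(raw: float, age_band: str) -> int:
    """Illustrative conversion: compare the raw score with same-age norms,
    then place it on a scale with mean 100 and SD 15 (both assumed here)."""
    mean_raw, sd_raw = HYPOTHETICAL_NORMS[age_band]
    z = (raw - mean_raw) / sd_raw
    return round(100 + 15 * z)


younger = age_standardised_score(24, "7y0m-7y5m")   # lower raw score
older = age_standardised_score(26, "7y6m-7y11m")    # higher raw score
print(younger, older)  # 112 100
```

With these made-up norms, the younger pupil's raw score of 24 is well above what is typical for that age band, while the older pupil's 26 is exactly typical, so the ordering of the two pupils reverses after standardisation.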


### What does a standard score of 70 mean?

7. When do standard scores suggest below normal results? – Confusingly, different tests use different terms to describe levels or degrees of language or speech problems. As a rule of thumb, on a scale where 100 is the average (like the CELF-5):

• a standard score of 70 or less suggests a severe impairment;
• a standard score of 71–77 suggests a moderate impairment;
• a standard score of 78–85 suggests a mild impairment; and
• a standard score of 86–114 (inclusive) is within the normal range for the test.
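The rule of thumb above can be sketched as a simple lookup. The function name `describe_standard_score` is hypothetical, the cut-offs are the ones quoted in this section for a scale where 100 is the average, and scores above 114 are left unlabelled because the rule of thumb does not describe them.

```python
def describe_standard_score(score: int) -> str:
    """Apply the rule-of-thumb bands quoted above (scale with a mean of 100)."""
    if score <= 70:
        return "suggests a severe impairment"
    if score <= 77:
        return "suggests a moderate impairment"
    if score <= 85:
        return "suggests a mild impairment"
    if score <= 114:
        return "within the normal range"
    # The rule of thumb in the text stops at 114.
    return "above the ranges described"


for s in (68, 75, 82, 100, 120):
    print(s, describe_standard_score(s))
```

As the surrounding text stresses, a label like this is only a starting point and should not be read in isolation from other assessment results.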

On the normal curve diagram (above), you can see the percentile range equivalent for each of these standard score ranges. Remember, for the reasons set out in Part C and D, standard scores must be interpreted with caution and never in isolation from other assessment results. However, we can say that Child 1’s language skills warrant urgent further investigation.

## What does a standard score of 2 mean?

4. Z scores and Standard Deviations – Technically, a z-score is the number of standard deviations from the mean value of the reference population (a population whose known values have been recorded, like in these charts the CDC compiles about people’s weights). For example:

• A z-score of 1 is 1 standard deviation above the mean.
• A score of 2 is 2 standard deviations above the mean.
• A score of -1.8 is 1.8 standard deviations below the mean.

A z-score tells you where the score lies on a normal distribution curve. A z-score of zero tells you the value is exactly average, while a score of +3 tells you that the value is much higher than average.

## What is a low standardised score?

Standardised scores – Standardised scores compare a pupil’s performance to that of a nationally representative sample of pupils from the relevant year group, who will have all taken the same test at the same time of year. The average score on most standardised tests is 100.

Technically a score above 100 is above average and a score below 100 is below average. About two-thirds of pupils will have standardised scores between 85 and 115. Almost all pupils fall within the range 70 to 140, so scores outside this range can be regarded as exceptional. If you wish to group pupils according to standardised (or age-standardised) scores, the following descriptions may be useful.

These may vary between test providers, but this example from NFER tests gives you an idea of what the range of scores may mean: 