A review of the normative studies of the Trail Making Test reveals that relatively few large-scale normative data sets are available (Mitrushina et al., 2005). In addition, the age groups, education levels, IQ and basis for subject selection in these normative studies are highly variable, with very little consistency between them in subject selection criteria. Most of these studies appear to be “norms of convenience.” This variability in the subject populations, combined with the recognized problems of test specificity for older adults (Ernst, 1987) when cutoff scores are used, makes meaningful test interpretation challenging. No single normative study is accepted by neuropsychologists as the standard reference for evaluating the results of this test.
The best solution, in this author’s opinion, is to derive the norms from a meta-analysis of multiple research studies using regression analysis. The normative database used for this test was calculated by the author using a polynomial best fit regression equation. This methodological approach has previously been used to create normative data in the field of neuropsychology by Russell (1987) for the Wechsler Memory Scale, by Zachary and Gorsuch (1985) for the WAIS-R and by Heaton et al. (1991) for the Halstead-Reitan Neuropsychological Evaluation System. The goal of this approach is to create a predictive model based on a summary analysis of a large number of studies in order to derive an accurate normative reference group composed of non-impaired individuals. The value and benefits of using regression analysis to derive age-based test norms are discussed in detail by Mitrushina et al. (2005) in their comprehensive review of this statistical methodology.
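The general idea of a polynomial best fit over pooled study means can be illustrated with a minimal sketch. It is not the author's actual model: the data points, the group sizes, the choice of a second-degree polynomial and the weighting by sample size are all assumptions made for illustration, and the names `fit_age_norm` and `predict_mean` are hypothetical.

```python
import numpy as np

# Hypothetical (age midpoint, mean completion time in seconds) pairs standing
# in for group means pooled across studies. NOT values from the source.
ages = np.array([20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0])
mean_times = np.array([25.0, 27.0, 30.0, 34.0, 39.0, 45.0, 52.0])

# Hypothetical group sizes, used so that larger samples count for more.
group_sizes = np.array([120, 150, 140, 160, 130, 110, 90])


def fit_age_norm(ages, values, sizes, degree=2):
    """Fit a polynomial that predicts a normative value from age.

    np.polyfit applies its weights to the unsquared residuals, so the square
    root of the group size gives a fit weighted by sample size (an assumption,
    not something the source specifies).
    """
    coeffs = np.polyfit(ages, values, deg=degree, w=np.sqrt(sizes))
    return np.poly1d(coeffs)


# The fitted curve can be evaluated at an exact age, not just at the age
# bands reported by the individual studies.
predict_mean = fit_age_norm(ages, mean_times, group_sizes)
print(round(float(predict_mean(45)), 1))
```

The same kind of fit could be run separately on the pooled standard deviations, so that both a predicted mean and a predicted standard deviation are available for any age in the covered range.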
The test norms used are based on a best fit polynomial formula derived by the author from a meta-analysis of 28 studies that were completed from 1980 to 2004. These studies were reviewed and summarized by Mitrushina et al. (2005). The total number of non-impaired individuals included in these studies was 6,317 for Test A and 6,360 for Test B. Using the best fit formula calculated for the MeSA-AE test, it was possible to accurately determine the mean and standard deviation of both Tests A and B for non-impaired individuals between the ages of 15 and 89. The mean and standard deviation for each test were then used to calculate an individual’s standard score for the Attention Control (ACQ) and Cognitive Flexibility (CFQ) quotient scales based on his or her exact age. The Executive Control (ECQ) quotient scale score, which is a combined measure of ACQ and CFQ, was calculated from the total completion time for both Test A and Test B and their respective standard deviations. Since research studies have not found males and females to differ in their test performance, no breakdown by gender was included. It was also possible to adjust the normative score for individuals based on their IQ, which was inferred from their education level. This correction can only be made if the person’s education level is known. The education correction made for the MeSA-AE norms was based on the values reported by Mitrushina et al. (2005). When a person’s education level is known, the test completion time can be corrected before the MeSA-AE quotient scale scores are calculated.
It was necessary to adjust the test completion time based on education level (high school to graduate level) in order to correct for the differing intellectual abilities of the individuals tested. Since a person’s IQ score is often not available, this adjustment is based on the person’s education level, which is generally reflective of overall intellectual ability. This correction is necessary in order to accurately calculate a person’s Executive Control, Attention Control and Cognitive Flexibility quotient scores. Individuals who have a higher level of intellectual functioning typically have lower test completion times. Likewise, the research reviewed by Mitrushina et al. (2005) indicates that test completion times are higher for individuals with lower intellectual and education levels. Applying this adjustment makes it possible to compare individuals to the appropriate normative data set and to accurately identify strengths and weaknesses while taking their education level into account. It is also an important adjustment to make whenever the possibility of malingering needs to be assessed. If the education level is unknown or the individual is still in high school, the normative comparison is made on the assumption that the person is of average intelligence. Hence, he or she is compared to the normative test completion times for individuals of the same age who have graduated from high school. The examiner always has the option to set an individual’s education level deliberately in order to adjust for the person’s IQ when it is known.
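The logic of this adjustment, including the fallback to the high school graduate reference group, can be sketched as follows. The additive form of the correction, the education labels and every offset value are assumptions chosen for illustration. They are not the correction values reported by Mitrushina et al. (2005), and `education_corrected_time` is a hypothetical name.

```python
from typing import Optional

# Hypothetical offsets (seconds) of each education group's expected completion
# time relative to high school graduates. Negative means faster than the
# reference group. Placeholders only, NOT the published correction values.
EDUCATION_OFFSET_SECONDS = {
    "less_than_high_school": 6.0,
    "high_school": 0.0,
    "college": -3.0,
    "graduate": -5.0,
}


def education_corrected_time(raw_seconds: float,
                             education: Optional[str] = None) -> float:
    """Express a completion time on the high school graduate reference scale.

    When the education level is unknown (None), the person is treated as a
    high school graduate, so the raw time is returned unchanged.
    """
    level = education if education is not None else "high_school"
    # Removing the group's expected offset makes the time comparable to the
    # norms for high school graduates of the same age.
    return raw_seconds - EDUCATION_OFFSET_SECONDS[level]


print(education_corrected_time(30.0))              # unknown education
print(education_corrected_time(30.0, "graduate"))  # examiner-set level
```

Under these assumed offsets, a raw time from a graduate-level examinee is moved upward before scoring, so the same raw time yields a less favorable comparison for that examinee than for a high school graduate.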
Four studies (Heaton et al., 1986; Alekoumbides, 1987; Richardson & Mark Marottoli, 1996; Tremontel et al., 1998) supported the use of an education correction factor for individuals with less than a high school education. These studies justified extrapolating the correction factor formula reported in Mitrushina et al. (2005) to individuals who do not have at least a high school education. However, it is likely that the correction value used for individuals with less than a high school education is a conservative one and would generally underestimate the true correction needed. It was deemed appropriate to use this “best” estimate because the research cited above clearly showed that non-impaired individuals with limited education and lower intellectual functioning take significantly longer to complete both Test A and Test B. Unfortunately, normative research on individuals with less than a high school education is very limited, which prevents the use of a more accurate correction factor.
The mean normative test score regression formula was calculated for the ECQ, ACQ and CFQ scales. The ACQ and CFQ scales are based on the mean test time scores for the age groups in the studies included in the regression analysis. The Executive Control Quotient (ECQ) was mathematically derived from these two primary quotient scale scores using equal weighting, by combining the normative data sets used for Test A and Test B. The ECQ was included in order to provide a comprehensive measurement of an individual’s overall performance on both Tests A and B. All three of these standard quotient scale scores, by definition, have a mean of 100 and a standard deviation of 15.
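A standard score with a mean of 100 and a standard deviation of 15 can be obtained from a completion time with a conventional linear transformation, sketched below for a single test. The direction of scoring (faster times give higher quotients) is an assumption, as are the example mean and standard deviation. The function name `quotient_score` is hypothetical, and the sketch does not attempt to reproduce how the ECQ combines Tests A and B.

```python
def quotient_score(time_seconds: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a completion time to a standard score (mean 100, SD 15).

    norm_mean and norm_sd are the age-specific normative mean and standard
    deviation of the completion time for the test in question.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    # Lower completion times are better, so the sign is flipped: a time below
    # the normative mean gives a positive z-score (assumed direction).
    z = (norm_mean - time_seconds) / norm_sd
    return 100.0 + 15.0 * z


# Hypothetical norms: mean 30 s, SD 10 s for the examinee's exact age.
print(quotient_score(30.0, 30.0, 10.0))  # exactly at the normative mean
print(quotient_score(20.0, 30.0, 10.0))  # one SD faster than the mean
```

In practice the time passed in would be the education-corrected completion time, and the mean and standard deviation would come from the regression formula evaluated at the person's exact age.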