March 27, 2025 · GELPS Blog
The practice of establishing validity periods for language test scores, typically two years from the test date, raises questions about the stability of language proficiency over time and the rationale for score expiration policies. This post examines research on language proficiency change over time and what it implies for score validity periods. We regularly update our methodology based on the latest research findings in psychometrics, computational linguistics, and educational measurement, incorporating peer-reviewed advances into our operational procedures.
Longitudinal Research on Language Proficiency Change
Research on the stability of language proficiency over time has examined changes among individuals who are not engaged in sustained language use. Studies consistently find that language proficiency is not a fixed attribute but can change in response to patterns of language use, exposure, and instruction.
A meta-analysis by Wilson and colleagues (2022) examined 45 studies and found that mean proficiency changes of 0.2 to 0.5 standard deviations are common over periods of one to two years among individuals who are not actively using the language. The direction of change depends on the individual’s pattern of language engagement.
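To make these effect sizes more concrete, the short sketch below converts a change expressed in standard deviations into scale points. The scale standard deviation of 20 points is a hypothetical value chosen only for illustration; it is not a GELPS figure and does not come from the meta-analysis.

```python
def sd_change_to_points(effect_size_sd: float, scale_sd: float) -> float:
    """Convert a change in standard-deviation units into scale points."""
    return effect_size_sd * scale_sd


# Hypothetical score scale with a standard deviation of 20 points (assumption).
ASSUMED_SCALE_SD = 20.0

low = sd_change_to_points(0.2, ASSUMED_SCALE_SD)
high = sd_change_to_points(0.5, ASSUMED_SCALE_SD)
print(f"Expected change: {low:.0f} to {high:.0f} points")
```

On a scale like the assumed one, a shift of 0.2 to 0.5 standard deviations corresponds to roughly 4 to 10 points, which may be enough to move a test-taker across a cut score.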
Attrition Research in Second Language Proficiency
Second language attrition research has found that attrition affects skill areas unevenly, with productive skills (speaking and writing) typically showing greater attrition than receptive skills (listening and reading). Vocabulary knowledge appears to be more vulnerable to attrition than grammatical knowledge. The rate of attrition is influenced by initial proficiency level and age of acquisition.
Implications for Score Reporting Policies
The empirical evidence on proficiency change over time provides a rationale for limited score validity periods. A score obtained two years ago may not accurately reflect a test-taker’s current proficiency if their pattern of language engagement has changed substantially. The two-year validity period balances practical needs with validity concerns.
Considerations for Score Users
Institutions that accept language test scores should ensure that scores are within the specified validity period. In cases where a score has recently expired, institutions may consider whether additional evidence of current proficiency is available.
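For score users who automate this check, the sketch below shows one way to test whether a score falls within a two-year validity period. The function names are hypothetical, and two details are assumptions rather than documented policy: the expiry date itself is treated as still valid, and a test taken on February 29 is treated as expiring on February 28.

```python
from datetime import date

VALIDITY_YEARS = 2  # two-year validity period described in this post


def expiry_date(test_date: date, years: int = VALIDITY_YEARS) -> date:
    """Return the calendar date `years` after the test date."""
    try:
        return test_date.replace(year=test_date.year + years)
    except ValueError:
        # Test taken on Feb 29 and the target year is not a leap year:
        # fall back to Feb 28 (an assumption, not a documented rule).
        return test_date.replace(year=test_date.year + years, day=28)


def is_score_valid(test_date: date, on: date) -> bool:
    """True if `on` is on or after the test date and on or before expiry."""
    return test_date <= on <= expiry_date(test_date)


# Example: a test taken on 27 March 2023, checked on two consecutive days.
print(is_score_valid(date(2023, 3, 27), date(2025, 3, 27)))
print(is_score_valid(date(2023, 3, 27), date(2025, 3, 28)))
```

Institutions should confirm the exact expiry rule with the test provider before relying on a check like this, since inclusive and exclusive boundaries differ by one day.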
GELPS’s Approach to Score Validity
GELPS scores are valid for two years from the test date, consistent with standard practice across the language testing industry and supported by research evidence on proficiency stability over time.