Statistical Modelling for Students in the Margins of Educational Testing
Embargoed until: 2023-03-08
Author(s)
Primary Supervisor
Singh, Parlo
Other Supervisors
Low-Choy, Samantha J
Year published
2022-03-08
Abstract
As an investigation into statistical modelling and its application, this research was motivated by questions about fairness in educational testing, and is illustrated using case studies comprising real data. Education is often considered ‘a way out of poverty’, as it enhances knowledge and job opportunities. Many policies in education aim to ‘close the gap’ so that ‘no children are left behind’. These policies are often informed by standardised tests that aim to measure the proficiency of students in domains such as literacy and numeracy. This thesis considers educational testing, paying particular attention to students ‘in the margins’ of social and educational disadvantage. These margins comprise students who may not participate in, or who achieve low scores on, educational tests that inform policy or teaching. The thesis develops statistical methodology to investigate the research question, ‘How does statistical modelling behind educational test results account for students in the margins?’, in two parts: I. For students who do not participate in the test, what is their socio-educational profile? II. For those who do participate, how can scoring (and underlying statistical modelling) better account for students in the margins? To support statistical modelling, the second chapter reviews and probes concepts and terminology discussed within the sociology of education, finding that the meaning of statistical models and results subtly shifts with perspectives on fairness in educational testing. To illustrate the statistical methodology, Australian case studies from 2014–2017 are considered, for a large-scale summative test (NAPLAN) and a formative test (PAT). The focus is on Year 7, where low literacy is problematic yet malleable. Data were sourced for the Northern Territory (NT), which, compared with other Australian jurisdictions, has more low-achieving students on NAPLAN, more students in socio-educational margins, many remote areas, and lower socio-economic status.
Rather than aiming to be representative of all students nationwide, concentrating on the NT supports the research focus on students in the margins, redresses a paucity of analyses about this jurisdiction, and is consistent with the project’s concern with students experiencing disadvantage. Part I investigates how student participation relates to socio-educational disadvantage. In the NAPLAN case study, the data were imbalanced, with high participation and substantial missingness (for parental education and occupation). Hence this thesis introduced penalised classification trees (CTs) into educational testing. To facilitate evaluation and comparison of tree models, Chapter 4 first developed “raindrop plots” as a novel and compact visualisation of penalised CTs and model diagnostics. Analysing NAPLAN data using these penalised CTs and raindrop plots, Chapter 5 found that many factors relating to socio-educational disadvantage were associated with non-participation, especially Indigeneity in the NT, and previous low scores Australia-wide. Socio-educational profiles were most complex in the NT, with several distinct profiles relating to participation that was near-perfect, average, or below average. Part II reformulates Item Response Theory (IRT) models to better account for students in the margins. This work begins with a new, comprehensive and cohesive review of binary IRT models with pseudo-guessing (Chapter 3). A subsequent simulation study (Chapter 6) compares properties and model diagnostics for two promising models: the 3-parameter logistic (3-PL) model and Ramsay’s quotient (Q) model. Graphical diagnostics revealed that the summary statistics typically relied on for model evaluation may hide poor estimation. For the 3-PL, although estimation of item parameters was unstable (corroborating other studies), the estimated distribution of student proficiency was quite robust here to prior distributions and sample sizes.
Chapter 7 proposed extending the 3-PL with a mixture distribution for proficiency, to relax the somewhat controversial assumption that student proficiency follows a ‘Bell Curve’. Employing a more extensive suite of eight diagnostics (including modern information criteria such as WAIC, and tail fit assessed via posterior analysis and posterior predictive checks), both the NAPLAN and PAT case studies effectively identified a much larger concentration of students in the lower margins. Thus, by reorienting statistical modelling to focus on students in the margins, this thesis provides new statistical methodology to assist test developers to better evaluate test fairness, providing policymakers with better information to redress socio-educational disadvantage.
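As a rough illustration of the model class named in the abstract (not the thesis's own implementation), the sketch below evaluates the standard 3-PL item response function, P(correct) = c + (1 - c) / (1 + exp(-a(θ - b))), and draws proficiencies θ from a finite mixture of normals rather than a single normal. The function names, the choice of two mixture components, and all parameter values are assumptions made for illustration only.

```python
import math
import random


def three_pl(theta, a, b, c):
    """Probability of a correct response under the standard 3-PL model.

    theta: student proficiency; a: item discrimination;
    b: item difficulty; c: pseudo-guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def sample_mixture_proficiency(rng, weights, means, sds):
    """Draw one proficiency from a finite mixture of normals (illustrative)."""
    # Pick a component according to the mixture weights, then sample from it.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return rng.gauss(means[k], sds[k])


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical mixture: a main group plus a smaller lower-proficiency group.
    weights, means, sds = [0.8, 0.2], [0.0, -2.0], [1.0, 0.5]
    for _ in range(5):
        theta = sample_mixture_proficiency(rng, weights, means, sds)
        # Hypothetical item: a=1.2, b=0.0, c=0.2.
        print(round(theta, 2), round(three_pl(theta, a=1.2, b=0.0, c=0.2), 3))
```

With a nonzero c, even students with very low proficiency keep a success probability near c, which is one reason estimates in the lower tail depend on how the proficiency distribution is specified.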
Thesis Type
Thesis (PhD Doctorate)
Degree Program
Doctor of Philosophy (PhD)
School
School Educ & Professional St
Copyright Statement
The author owns the copyright in this thesis, unless stated otherwise.
Subject
educational testing
social and educational disadvantage
statistical modelling
NAPLAN
Northern Territory
Item Response Theory (IRT)
test fairness