The Power of Categorical Goodness-Of-Fit Statistics
The relative power of goodness-of-fit test statistics has long been debated in the literature. Chi-Square type test statistics still dominate the assessment of 'fit' for categorical data. Empirical Distribution Function type goodness-of-fit test statistics are known to be relatively more powerful than Chi-Square type test statistics for restricted types of null and alternative distributions. In many practical applications, researchers who use a standard Chi-Square type goodness-of-fit test statistic ignore the ordering of ordinal classes. This thesis reviews literature in the goodness-of-fit field, with major emphasis on categorical goodness-of-fit tests. The continued use of an asymptotic distribution to approximate the exact distribution of categorical goodness-of-fit test statistics is discouraged. It is unlikely that an asymptotic distribution will produce a more accurate estimate of the exact distribution of a goodness-of-fit test statistic than a Monte Carlo approximation with a large number of simulations. Due to their relatively higher powers for restricted types of null and alternative distributions, several authors recommend the use of Empirical Distribution Function test statistics over nominal goodness-of-fit test statistics such as Pearson's Chi-Square. In-depth power studies confirm the views of other authors that categorical Empirical Distribution Function type test statistics do not have higher power for some common null and alternative distributions. Because of this, it is not sensible to make a conclusive recommendation to always use an Empirical Distribution Function type test statistic instead of a nominal goodness-of-fit test statistic. Traditionally, the recommendation for determining 'fit' for multivariate categorical data is to treat categories as nominal, an approach which precludes any gain in power that may accrue from a ranking, should one or more variables be ordinal.
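As an illustration of the Monte Carlo approximation mentioned above, the following is a minimal sketch and not the thesis's own procedure. It assumes a fully specified multinomial null distribution and estimates a p-value for Pearson's Chi-Square by simulation instead of using the asymptotic Chi-Square distribution. The function names, the default number of simulations, and the add-one p-value estimator are choices made here for illustration.

```python
import random


def pearson_chi_square(observed, probs):
    """Pearson's X^2 for observed counts against null cell probabilities."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))


def monte_carlo_p_value(observed, probs, n_sims=10000, seed=0):
    """Estimate P(X^2 >= observed X^2) under the null by simulation.

    Assumption: the null is a fully specified multinomial with cell
    probabilities `probs`; no parameters are estimated from the data.
    """
    rng = random.Random(seed)
    n = sum(observed)
    k = len(probs)
    stat_obs = pearson_chi_square(observed, probs)
    exceed = 0
    for _ in range(n_sims):
        # Draw one multinomial sample of size n under the null.
        draws = rng.choices(range(k), weights=probs, k=n)
        counts = [0] * k
        for d in draws:
            counts[d] += 1
        # Small tolerance so ties are counted despite float rounding.
        if pearson_chi_square(counts, probs) >= stat_obs - 1e-12:
            exceed += 1
    # Add-one estimator keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_sims + 1)
```

For example, `monte_carlo_p_value([30, 5, 3, 2], [0.25] * 4)` compares heavily unbalanced counts against a uniform null. The same simulation loop could be reused with any other categorical test statistic by swapping out `pearson_chi_square`.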
The presence of multiple criteria through multivariate data may result in partially ordered categories, some of which have equal ranking. This thesis proposes a modification to the currently available Kolmogorov-Smirnov test statistics for ordinal and nominal categorical data to account for situations of partially ordered categories. The new test statistic, called the Combined Kolmogorov-Smirnov test statistic, is relatively more powerful than Pearson's Chi-Square and the nominal Kolmogorov-Smirnov test statistic for some null and alternative distributions. The new test statistic is recommended in situations where some benefit can be gained from an Empirical Distribution Function approach but the data lack a complete natural ordering of categories. The new and established categorical goodness-of-fit test statistics are demonstrated in brief applications to categorical data as diverse as familiarity with defence programs, the number of recruits produced by the Merlin bird, a demographic problem, and DNA profiling of genotypes. The results from these applications confirm the recommendations made throughout this thesis for specific goodness-of-fit test statistics.
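To make the role of category ordering concrete, the sketch below computes one common form of an ordinal categorical Kolmogorov-Smirnov statistic: the largest absolute difference between the cumulative observed and cumulative expected proportions, taken in category order. This is an assumption about the general form only. The thesis's exact definitions and scaling may differ, and the Combined Kolmogorov-Smirnov statistic itself is not reproduced here because the abstract does not define it. The function name is hypothetical.

```python
from itertools import accumulate


def ordinal_ks_statistic(observed, probs):
    """Max absolute gap between cumulative observed and expected proportions.

    Categories must be listed in their natural order; unlike Pearson's
    Chi-Square, the result changes if the categories are permuted.
    """
    n = sum(observed)
    cum_obs = accumulate(o / n for o in observed)
    cum_exp = accumulate(probs)
    return max(abs(a - b) for a, b in zip(cum_obs, cum_exp))
```

With a uniform null over four categories, `ordinal_ks_statistic([20, 20, 0, 0], [0.25] * 4)` gives 0.5, while the permuted counts `[20, 0, 0, 20]` give 0.25. Pearson's Chi-Square is identical for both arrangements, which is the sense in which a nominal statistic ignores ordering.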
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
Australian School of Environmental Studies
empirical distribution function test