Generalized Inflated Discrete Models: A Strategy to Work with Multimodal Discrete Distributions

Abstract

Analysts of discrete data often must deal with inflation at certain values. When treated improperly, this phenomenon may lead to biased estimates and incorrect inferences. This study extends the existing literature on single-value inflated models and develops a general framework to handle variables with more than one inflated value. To assess the performance of the proposed maximum likelihood estimator, we conducted Monte Carlo experiments under several scenarios with different levels of inflated probabilities for multinomial, ordinal, Poisson, and zero-truncated Poisson outcomes with covariates. We found that ignoring the inflation leads to substantial bias and poor inference, not only for the intercept(s) of the inflated categories but also for other coefficients. Specifically, higher inflated probabilities are associated with larger biases. By contrast, the generalized inflated discrete models (GIDMs) perform well, with unbiased estimates and satisfactory coverage even when the number of parameters to be estimated is quite large. We showed that model fit criteria, such as the Akaike information criterion, can be used to select the appropriate specification of inflated models. Lastly, we applied the GIDM to large-scale health survey data and compared it with conventional modeling approaches, such as various Poisson and ordered logit models. We showed that the GIDM generally fits the data better. The current work provides a practical approach to analyzing multimodal data that exist in many fields, such as heaping in self-reported behavioral outcomes, inflated categories of indifference and neutral in attitude surveys, large numbers of zeros, and low occurrences of delinquent behaviors.
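
To make the idea of multiple inflated values concrete, the following is a minimal sketch of a Poisson outcome with extra point mass at several values, written as a standard finite mixture. It is an illustration under stated assumptions, not the parameterization used in the article: the function names, the constant rate `lam` (a regression version would let it depend on covariates), and the dictionary `inflation` are all hypothetical.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability mass at non-negative integer y with mean lam."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def inflated_poisson_pmf(y, lam, inflation):
    """Mixture pmf with extra point mass at selected values.

    `inflation` (hypothetical name) maps each inflated value to its
    extra probability; the remaining mass follows a Poisson(lam).
    """
    total_inflation = sum(inflation.values())
    if not 0.0 <= total_inflation < 1.0:
        raise ValueError("inflation probabilities must sum to less than 1")
    return inflation.get(y, 0.0) + (1.0 - total_inflation) * poisson_pmf(y, lam)

def log_likelihood(ys, lam, inflation):
    """Log-likelihood of observed counts ys under the mixture above."""
    return sum(math.log(inflated_poisson_pmf(y, lam, inflation)) for y in ys)
```

With an empty `inflation` dictionary the mixture reduces to an ordinary Poisson model, so the two log-likelihoods can be compared directly, for example through a fit criterion such as AIC.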

Publication
Sociological Methods and Research, 50(1)