Bayesian Inference for the Negative Binomial-Quasi Lindley Model for Time Series Count Data on the COVID-19 Pandemic

Authors

  • Sirinapa Aryuyuen Department of Mathematics and Computer Science, Faculty of Science and Technology, Rajamangala University of Technology Thanyaburi, Pathum Thani 12110, Thailand
  • Unchalee Tonggumnead Department of Mathematics and Computer Science, Faculty of Science and Technology, Rajamangala University of Technology Thanyaburi, Pathum Thani 12110, Thailand https://orcid.org/0000-0003-2290-6818

DOI:

https://doi.org/10.48048/tis.2022.3171

Keywords:

Bayesian inference, Negative binomial-quasi Lindley distribution, Time series count data, Poisson regression model, Negative binomial regression model

Abstract

In statistical modeling, generalized linear models (GLMs) are used to describe a response variable as a function of 1 or more predictor variables. Computational methods and mixed distributions are frequently used to build predictive models for time-to-event data analysis. Developing a statistical model that predicts appropriately and accurately starts with choosing a distribution that suits the nature of the actual data. This paper proposes a new mixed negative binomial distribution for count data with over-dispersion, the so-called negative binomial-quasi Lindley (NB-QL) distribution. A new GLM framework for building a time series count data model with the NB-QL distribution is introduced and applied to actual data sets of the COVID-19 epidemic in Thailand. The models are related to GLMs in that they specify linear relationships between outcome variables and covariates, where the response variable is time series count data following an exponential family distribution, with random components and link functions. In this study, we examine the factors that affect the number of COVID-19 death cases in Thailand and provide predictive modeling of the number of COVID-19 death cases from 1 January 2020 to 31 December 2020, a data set of 366 daily observations. The probability integral transform (PIT) histograms of the models with the NB-QL and NB distributions approach uniformity. Based on the deviance, the deviance information criterion (DIC), and the PIT histogram, the proposed model is suitable for forecasting the daily number of COVID-19 death cases in Thailand, indicating that the NB-QL time series model is another efficient alternative for modeling count data with over-dispersion.
According to the NB-QL time series model for the daily number of COVID-19 death cases in Thailand, the average number of daily COVID-19 deaths is influenced by the number of COVID-19 death cases in the previous 3 days. The average number of COVID-19 death cases in Thailand is also influenced by the previous 2 days. At the same time, the daily number of infected cases in Thailand is influenced by the daily number of COVID-19 death cases. In addition, the model includes intervention components, treated as internal covariate effects, because the data show a surge in the daily number of COVID-19 death cases in Thailand at the time.
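
The lag structure described above can be sketched as a log-linear recursion for the conditional mean. The abstract does not state the link function or the exact equation, so the form below is an assumption in the style of log-linear count time series GLMs: log(mu_t) = beta0 + sum_k beta_k * log(y_{t-k} + 1) + sum_l alpha_l * log(mu_{t-l}) + eta * x_t, with 3 lagged observations and 2 lagged means. Reading "the previous 2 days" as lagged conditional means is an interpretation, not something the abstract confirms. The function name, the coefficient values, the single covariate x_t, and the treatment of pre-sample values are all hypothetical.

```python
import math


def conditional_means(y, x, beta0, betas, alphas, eta, nu_init=0.0):
    """Return mu_t for every t under the ASSUMED log-linear recursion.

    y      : observed counts
    x      : one covariate value per time point (hypothetical)
    betas  : coefficients on log(y_{t-k} + 1), k = 1..len(betas)
    alphas : coefficients on log(mu_{t-l}), l = 1..len(alphas)
    """
    nu = []  # nu_t = log(mu_t)
    for t in range(len(y)):
        value = beta0 + eta * x[t]
        for k, beta in enumerate(betas, start=1):
            # Pre-sample observations are treated as 0 (an assumption).
            y_lag = y[t - k] if t - k >= 0 else 0
            value += beta * math.log(y_lag + 1)
        for l, alpha in enumerate(alphas, start=1):
            # Pre-sample log-means are set to nu_init (an assumption).
            nu_lag = nu[t - l] if t - l >= 0 else nu_init
            value += alpha * nu_lag
        nu.append(value)
    return [math.exp(v) for v in nu]


if __name__ == "__main__":
    # Made-up counts and coefficients, for illustration only.
    deaths = [0, 1, 0, 2, 3, 1, 0]
    covariate = [0.0] * len(deaths)
    mu = conditional_means(deaths, covariate, beta0=0.1,
                           betas=(0.3, 0.2, 0.1), alphas=(0.2, 0.1), eta=0.0)
    print([round(m, 3) for m in mu])
```

In a Bayesian fit, a recursion of this kind would supply the mean of the count distribution at each time point, and the coefficients would receive prior distributions.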

HIGHLIGHTS

  • A new mixed NB distribution is proposed as a flexible alternative for analyzing count data with over-dispersion. The new distribution is a mixture of the NB and QL distributions and is named the negative binomial-quasi Lindley (NB-QL) distribution
  • The new mixed negative binomial distribution is applied to time series count data with over-dispersion, and the parameters of the proposed model are estimated with a Bayesian approach. The GLM framework is used to build the time series count data model
  • The new mixed NB distribution in this study is an extremely effective alternative for modeling count data in the context of over-dispersion
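
As a rough illustration of the mixture named in the highlights, the sketch below draws counts from an NB-QL-type distribution. The abstract does not give the parameterization, so the construction is an assumption that follows the usual form of mixed NB distributions: X | lambda ~ NB(r, p = exp(-lambda)), with lambda following a quasi Lindley distribution with density f(lambda) = theta * (alpha + theta * lambda) * exp(-theta * lambda) / (alpha + 1). For alpha > 0 this density is a two-component mixture of an exponential and a gamma distribution, which the sampler uses. The function names, the restriction to integer r and alpha > 0, and the parameter values are hypothetical choices made for brevity.

```python
import math
import random


def sample_quasi_lindley(alpha, theta, rng):
    """Draw lambda from QL(alpha, theta) via its two-component mixture form.

    Assumes alpha > 0 so that the mixture weights are valid probabilities.
    """
    if rng.random() < alpha / (alpha + 1.0):
        return rng.expovariate(theta)          # Exp with rate theta
    return rng.gammavariate(2.0, 1.0 / theta)  # Gamma with shape 2, scale 1/theta


def sample_nb_ql(r, alpha, theta, rng):
    """Draw one count from the ASSUMED NB-QL mixture (positive integer r only)."""
    lam = sample_quasi_lindley(alpha, theta, rng)
    p = math.exp(-lam)  # success probability of the NB component
    if p >= 1.0:
        return 0        # p = 1 means no failures can occur
    if p <= 0.0:
        raise OverflowError("lambda too large for this simple sampler")
    log_q = math.log1p(-p)  # log(1 - p), always negative here
    total = 0
    for _ in range(r):
        # NB(r, p) as a sum of r geometric failure counts, each by inversion.
        u = 1.0 - rng.random()  # uniform on (0, 1]
        total += int(math.log(u) / log_q)
    return total


if __name__ == "__main__":
    rng = random.Random(2022)
    draws = [sample_nb_ql(2, 1.0, 5.0, rng) for _ in range(10000)]
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
    # Over-dispersion shows up as a sample variance above the sample mean.
    print(f"mean={mean:.3f} variance={var:.3f}")
```

Drawing lambda afresh for each count is what adds variance beyond a plain NB model, which is the motivation for using a mixed distribution on over-dispersed counts.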



Published

2022-11-01

How to Cite

Aryuyuen, S., & Tonggumnead, U. (2022). Bayesian Inference for the Negative Binomial-Quasi Lindley Model for Time Series Count Data on the COVID-19 Pandemic. Trends in Sciences, 19(21), 3171. https://doi.org/10.48048/tis.2022.3171