International Journal of General Medicine, Volume 14

Trend Analysis and Forecasting the Spread of COVID-19 Pandemic in Ethiopia Using Box–Jenkins Modeling Procedure

Authors Gebretensae YA, Asmelash D

Received 11 February 2021

Accepted for publication 31 March 2021

Published 21 April 2021 Volume 2021:14 Pages 1485–1498


Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 3

Editor who approved publication: Dr Scott Fraser

Yemane Asmelash Gebretensae,1 Daniel Asmelash2

1Department of Statistics, College of Natural and Computational Science, Aksum University, Aksum, Ethiopia; 2Department of Clinical Chemistry, School of Biomedical and Laboratory Sciences, College of Medicine and Health Sciences, University of Gondar, Gondar, Ethiopia

Correspondence: Daniel Asmelash
Department of Clinical Chemistry, School of Biomedical and Laboratory Sciences, College of Medicine and Health Sciences, University of Gondar, P.O. Box 196, Gondar, Ethiopia
Email [email protected]
Yemane Asmelash Gebretensae
Department of Statistics, College of Natural and Computational Science, Aksum University, P.O. Box 1010, Aksum, Ethiopia
Email [email protected]

Introduction: COVID-19, which causes severe acute respiratory syndrome, is spreading rapidly across the world, and the severity of this pandemic is rising in Ethiopia. The main objective of the study was to analyze the trend and forecast the spread of COVID-19 and to develop an appropriate statistical forecast model.
Methodology: Data on the daily spread between 13 March 2020 and 31 August 2020 were collected for the development of the autoregressive integrated moving average (ARIMA) model. Stationarity testing, parameter testing and model diagnosis were performed. Candidate models were identified using the autocorrelation function (ACF) and partial autocorrelation function (PACF). Finally, the fit and prediction accuracy of the ARIMA models were evaluated using the RMSE and MAPE model selection criteria.
Results: A total of 51,910 confirmed COVID-19 cases were reported from 13 March to 31 August 2020. The total recovered and death rates as of 31 August 2020 were 37.2% and 1.57%, respectively, with a high rate of increase after mid-August 2020. ARIMA (0, 1, 5) and ARIMA (2, 1, 3) were confirmed as the optimal models for confirmed and recovered COVID-19 cases, respectively, based on the lowest RMSE, MAPE and BIC values. The ARIMA models were also used to identify the COVID-19 trend and showed a daily increase in the number of confirmed and recovered cases. In addition, the 60-day forecast showed a steep upward trend in confirmed and recovered cases of COVID-19 in Ethiopia.
Conclusion: Forecasts show that confirmed and recovered COVID-19 cases in Ethiopia will increase on a daily basis for the next 60 days. The findings can be used as a decision-making tool to implement health interventions and reduce the spread of COVID-19 infection.

Keywords: ARIMA models, COVID-19, forecast, trend, Ethiopia


Introduction

Coronavirus disease 2019 (COVID-19) was first reported in Hubei, China on 31 December 2019; one month later the WHO declared a public health emergency of international concern, and it characterized the outbreak as a pandemic on 11 March 2020. The infection spread at an alarming rate both domestically and internationally.1 According to the WHO, more than 25 million confirmed cases of COVID-19 and 800,000 deaths had been reported globally as of 31 August 2020.2 On 13 March 2020, the Ethiopian Federal Ministry of Health confirmed a case of COVID-19 in Addis Ababa, Ethiopia. Consequently, the government of Ethiopia suspended schools and public gatherings. By 31 August 2020, the total number of confirmed cases had increased to 51,910 and the number of reported deaths to 815.3

People infected with COVID-19 may have few or no symptoms; when present, symptoms range from mild to severe illness, and the incubation period may last 2 weeks or longer. The disease may be infectious during the latent period of infection, and the virus can spread from person to person through respiratory droplets and close contact.4

In the fight against the pandemic, it is crucial to be able to identify the rate at which the epidemic spreads. Awareness of the level of spread at any given time can help governments plan and develop public health policies to deal with the consequences of the pandemic. The way to gauge the magnitude of the spread, and thus the timing of its peak, is to accurately predict the number of active cases at any given time.5

Mathematical epidemic models are among the most useful techniques for analyzing the spread and control of infectious diseases. Time-series analysis is a tool for extrapolating forecasts, in which a mathematical model is established based on the regularity and trend of the historical values observed over time, and it has been commonly used in predicting the spread of COVID-19. Modeling the disease and forecasting the possible number of cases per day may help the health care system to prepare for new patients. Statistical prediction models are therefore useful in both predicting and monitoring the global threat of the pandemic, and it is important to create models that are both computationally efficient and practical in order to help policy makers and medical staff.6,7

Autoregressive integrated moving average (ARIMA) models are among the most commonly used time series forecasting methods.8,9 The ARIMA model has been successfully applied in the field of medical research due to its simple structure, fast implementation and ability to explain the data set.10 ARIMA is useful for forecasting time series under uncertainty because, unlike some other methods, it assumes no knowledge of any underlying model or relationship. ARIMA relies on past values of the series as well as earlier forecast error terms. For short-run forecasting, ARIMA models are comparatively more robust and efficient than more complex structural models.11,12

The ARIMA methodology is a statistical approach used to evaluate and create a forecasting model that best represents a time series by modeling the correlations in the data. Many of the advantages of the ARIMA model have been found in empirical research and support ARIMA as an effective approach, particularly for short-term time series prediction. A major advantage of the ARIMA approach is that it makes no assumptions about the number of terms or the relative weights to be applied to the terms.13,14

A further advantage of the ARIMA model is its versatility in representing many varieties of time series with simplicity, together with the associated Box–Jenkins methodology for building an optimal model.8,15 In addition, the ARIMA model gives weight to past values and past errors to correct model predictions, which makes it more reliable than basic regression and exponential methods. ARIMA models frequently outperform more complex structural models in terms of short-term predictive capability.16

A number of studies have evaluated forecasts of COVID-19 around the world. A study in Iran applied ARIMA models to the daily total confirmed cases and total deaths as well as the daily new confirmed cases and new deaths; it forecast an increase in the daily totals and predicted that Iran would be able to control COVID-19 in the near future.17

A study conducted in Nigeria developed a predictive model that could be used as a decision-making tool for health interventions and to minimize the spread of COVID-19 infection. Data on the daily spread were collected for the development of the ARIMA model, and the results showed a sharply increasing trend of COVID-19 spread in Nigeria within the specified time frame.18

A study conducted in Italy used the ARIMA model to forecast confirmed and recovered cases of the COVID-19 outbreak. It projected that confirmed cases might exceed 182,757 and recovered cases might reach 81,635 by the end of May. The final findings suggested a decrease of about 35% in confirmed cases and an increase of 66% in recovered cases.19

To our knowledge, no study has been conducted on the trend analysis and forecasting of COVID-19 in Ethiopia. The main objective of this study was therefore to analyze trends in the spread of COVID-19 using ARIMA models, to find the best predictive model and to apply it to forecasting COVID-19 cases in Ethiopia. The study will help policy makers and the public to adopt new strategies and strengthen existing preventive measures against the COVID-19 pandemic, and can help predict health infrastructure needs in the near future.

The contributions of this paper can be summarized as follows. First, we identify the best empirical model for predicting newly reported and recovered cases of COVID-19; its precision helps decision-makers handle the pandemic and plan health system strategies. Second, we highlight the trend of reported and recovered cases of COVID-19 in Ethiopia. In addition, the paper explores a forecasting approach 60 days ahead. The forecast results enable us to check the efficacy of the forecasting models in various situations, supporting future strategy in the battle against COVID-19 in Ethiopia.

The rest of the article is organized as follows. Dataset Description describes the dataset used for this study. The forecasting models and the procedures used in the research methodology are described in the sections from Auto-Regressive Integrated Moving Average (ARIMA) Models to Parameter Estimation and Model Validation. The results obtained, the related discussion and the conclusions on the performance of the forecasting models are given in Results, Discussion and Conclusion.

Materials and Methods

Dataset Description

Regular updates of officially confirmed cases of COVID-19 were collected from the official website of the Ethiopian Public Health Institute (EPHI). A total of 172 daily observations of laboratory-confirmed, recovered and fatal cases of COVID-19, from 13 March to 31 August 2020, were included in the study.3
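As a quick consistency check, the stated 172 observations match one observation per calendar day over the study window, counting both endpoints. The short sketch below only verifies that arithmetic; it assumes daily reporting with no gaps, which the text implies but does not state explicitly.

```python
from datetime import date

# Study window as stated in the text (both endpoints included).
start = date(2020, 3, 13)
end = date(2020, 8, 31)

# Inclusive day count: difference in days plus one for the first day.
n_days = (end - start).days + 1
print(n_days)  # 172
```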

Model Description

Auto-Regressive Integrated Moving Average (ARIMA) Models

The ARIMA forecasting approach differs from other approaches because it does not assume any particular pattern in the historical data of the series to be predicted. It uses an iterative approach to identify a possible model from a general class of models. The chosen model is then checked against the historical data to see whether it describes the series accurately.

Moving Average (MA) Process

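This model uses past errors as explanatory variables.20 Let $\varepsilon_t$ be a white noise process, that is, a sequence of independently and identically distributed (iid) random variables with mean zero and constant variance $\sigma^2$. The MA model of order $q$ is then given as:

$$Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$$

The model is described in terms of past errors, and thus we estimate the coefficients $\theta_1, \dots, \theta_q$. Only the $q$ most recent errors affect the current level $Y_t$; errors further in the past do not. This indicates that it is a short-memory model.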
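The short-memory property can be illustrated with the theoretical autocorrelations of an MA(q) process, which are exactly zero beyond lag q. The following is a minimal sketch, assuming the additive sign convention for the MA coefficients; the coefficient values 0.5 and 0.3 are hypothetical and are not estimates from this study.

```python
def ma_autocorrelation(thetas, max_lag):
    """Theoretical autocorrelations of Y_t = mu + e_t + sum_i theta_i * e_{t-i}."""
    psi = [1.0] + list(thetas)        # weights on e_t, e_{t-1}, ..., e_{t-q}
    q = len(thetas)
    gamma0 = sum(w * w for w in psi)  # variance in units of sigma^2
    rho = []
    for k in range(max_lag + 1):
        if k > q:
            rho.append(0.0)           # no shared error terms beyond lag q
        else:
            gamma_k = sum(psi[j] * psi[j + k] for j in range(q - k + 1))
            rho.append(gamma_k / gamma0)
    return rho


# Hypothetical MA(2) coefficients, chosen only for illustration.
rho = ma_autocorrelation([0.5, 0.3], max_lag=5)
print([round(r, 3) for r in rho])
```

The autocorrelations at lags 3, 4 and 5 are zero, which is why a sharp cut-off in the ACF plot is read as a sign of a moving average process.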

Auto-Regression (AR)

An autoregressive model of order $p$, AR ($p$), can be expressed as:

$$Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + \varepsilon_t$$

where $\varepsilon_t$ is a white noise error term. The model is described in terms of past values, and we therefore estimate the coefficients $\phi_1, \dots, \phi_p$ and use the model for forecasting. All previous values have cumulative effects on the current level, which makes this a long-memory model.21
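The cumulative effect of past values can be seen in the impulse response of an AR process: a single shock keeps influencing all later values, with a weight that decays but never reaches exactly zero. The sketch below assumes a stationary AR model; the coefficient 0.6 is hypothetical.

```python
def ar_impulse_response(phis, horizon):
    """Weights psi_j describing the effect of a unit shock j periods later."""
    psi = [1.0]
    for j in range(1, horizon + 1):
        # psi_j = phi_1 * psi_{j-1} + ... + phi_p * psi_{j-p} (terms with j - i < 0 dropped)
        psi.append(sum(phi * psi[j - i]
                       for i, phi in enumerate(phis, start=1) if j - i >= 0))
    return psi


# Hypothetical AR(1) coefficient, for illustration only.
psi = ar_impulse_response([0.6], horizon=10)
print([round(w, 4) for w in psi])
```

For an AR(1) model the weights are simply 0.6 raised to the power j, in contrast to the MA case, where the weights vanish after q lags.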

Autoregressive Integrated Moving Average (ARIMA) Process

ARIMA modeling was used in this study because it is a common method for modeling and forecasting time series data. ARIMA is the most common class of models for time series that can be made stationary by differencing (if necessary), possibly in combination with non-linear transformations such as logging or deflating (if necessary).

ARIMA (p, d, q) is the general non-seasonal ARIMA model, where p is the number of autoregressive terms, d is the number of differences and q is the number of moving average terms. A white noise model is classified as ARIMA (0, 0, 0): there is no AR part because $Y_t$ does not depend on $Y_{t-1}$, there is no differencing involved, and there is no MA part since $Y_t$ does not depend on $\varepsilon_{t-1}$. If $Y_t$ is non-stationary, we take the first difference $\Delta Y_t = Y_t - Y_{t-1}$ so that $\Delta Y_t$ becomes stationary (d = 1 implies one-time differencing). Then

$$\Delta Y_t = c + \phi_1 \Delta Y_{t-1} + \dots + \phi_p \Delta Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$$

is an ARIMA (p, 1, q) model. A random walk model is classified as ARIMA (0, 1, 0) because there is no AR or MA part involved and only one difference exists.22
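First differencing and its inverse can be sketched in a few lines. The example below uses a short hypothetical series (not study data) to show that differencing turns levels into period-to-period changes and that cumulating those changes from the first value restores the original levels, which is how forecasts of the differenced series are converted back to the original scale.

```python
def difference(series):
    """First difference: Y_t - Y_{t-1} for t = 2, ..., n."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(first_value, diffs):
    """Rebuild the level series by cumulating differences from the first value."""
    levels = [first_value]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels


# Hypothetical series with an upward trend, for illustration only.
levels = [1, 3, 6, 10, 15]
changes = difference(levels)
restored = undifference(levels[0], changes)
print(changes, restored)
```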

Model Identification

Time series models require stationary data. If non-stationary data are used in a model, the results can show a misleading relationship. Therefore, time series data must be checked for stationarity before the model is specified.

Generally, a time series is stationary if it has a constant mean and variance and an autocovariance that does not depend on time. If any of these requirements is not fulfilled, the data are considered non-stationary. The autocorrelation function (ACF) is used to detect this problem: if the ACF plot is positive and shows a very slow linear decay, the data are non-stationary. Non-stationarity in the mean can be resolved by appropriate differencing, and non-stationarity in the variance by a suitable transformation. The partial autocorrelation function (PACF) is the linear correlation between $Y_t$ and $Y_{t-k}$ after controlling for the effects of the intermediate lags. The next step is to determine the initial values for the seasonal and non-seasonal orders (p and q).23
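The two diagnostics described above can be sketched without any statistical package. The functions below compute the sample ACF and obtain the PACF from a set of autocorrelations with the Durbin-Levinson recursion; this is an illustrative implementation under standard textbook definitions, not the routine used by STATA or SPSS. The inputs are hypothetical: a pure linear trend (to show the slow ACF decay of a non-stationary series) and the theoretical autocorrelations of an AR(1) process with coefficient 0.5.

```python
def sample_acf(x, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a series."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]


def pacf_from_acf(rho):
    """Partial autocorrelations for lags 1..len(rho)-1 (Durbin-Levinson)."""
    pacf, prev = [], []
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # Update the lag-k AR coefficients implied by the autocorrelations.
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Hypothetical trending series: the ACF stays high and decays slowly.
trend = [float(t) for t in range(100)]
acf_trend = sample_acf(trend, 10)

# Theoretical AR(1) autocorrelations: the PACF cuts off after lag 1.
pacf_ar1 = pacf_from_acf([0.5 ** k for k in range(6)])
print([round(r, 3) for r in acf_trend], pacf_ar1)
```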

Parameter Estimation and Model Validation

After identifying the appropriate ARIMA order (p, d, q), we sought precise estimates of the model parameters using least squares, as described by Box and Jenkins. The parameters can also be obtained by maximum likelihood, which is asymptotically correct for time series. Maximum likelihood estimators are generally sufficient, efficient and consistent for Gaussian distributions, and are asymptotically normal and efficient for non-Gaussian distributions. In this study, STATA version 15 and SPSS version 25 software were used to develop the ARIMA models. The statistical significance level was set at 0.05. Models chosen in the last stage were validated using the root mean squared error (RMSE), mean absolute percentage error (MAPE) and normalized Bayesian information criterion (BIC).23,24
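The three validation measures can be written down compactly. The sketch below uses common textbook definitions; the normalized BIC is taken here as ln(MSE) + k·ln(n)/n, which is an assumption about the convention, and statistical packages may differ in details such as degrees-of-freedom corrections. The actual and fitted values are hypothetical.

```python
import math


def rmse(actual, fitted):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, fitted)) / len(actual))


def mape(actual, fitted):
    """Mean absolute percentage error (in percent); zero actual values are skipped."""
    terms = [abs((a - f) / a) for a, f in zip(actual, fitted) if a != 0]
    return 100.0 * sum(terms) / len(terms)


def normalized_bic(actual, fitted, n_params):
    """Assumed form: ln(MSE) + k * ln(n) / n, with k estimated parameters."""
    n = len(actual)
    mse = sum((a - f) ** 2 for a, f in zip(actual, fitted)) / n
    return math.log(mse) + n_params * math.log(n) / n


# Hypothetical values, for illustration only.
actual = [100, 200, 400]
fitted = [110, 190, 400]
print(rmse(actual, fitted), mape(actual, fitted), normalized_bic(actual, fitted, 2))
```

Lower values of all three measures indicate a better fit; the BIC term additionally penalizes models with more parameters.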


Results

Study Data Characteristics

Data on the distribution of COVID-19 from 13 March 2020 to 31 August 2020 were collected and analyzed. A total of 51,910 COVID-19 cases were observed in this period, and the incidence showed a rising trend day by day, with a high rate of increase after mid-August 2020. The total recovered and death rates as of 31 August 2020 were 37.2% and 1.57% of the total cases, respectively, the highest incidence and recovery ratio since the index case in Ethiopia. The average numbers of confirmed, recovered and death cases per day from 13 March 2020 to 31 August 2020 were 301.8, 112.2 and 4.74, respectively (Table 1).
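The summary figures can be cross-checked against each other using only numbers stated in the article: 51,910 confirmed cases, 815 deaths (from the Introduction) and 172 daily observations (from Dataset Description). The sketch below is just this arithmetic.

```python
# Values stated in the article.
total_confirmed = 51910
total_deaths = 815
n_days = 172

death_rate_pct = 100 * total_deaths / total_confirmed   # reported as 1.57%
mean_confirmed_per_day = total_confirmed / n_days       # reported as 301.8
mean_deaths_per_day = total_deaths / n_days             # reported as 4.74

print(round(death_rate_pct, 2),
      round(mean_confirmed_per_day, 1),
      round(mean_deaths_per_day, 2))
```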

Table 1 Descriptive Statistics of Confirmed, Recovered and Death Cases in Ethiopia

The descriptive analysis of the overall data showed that the new daily COVID-19 confirmed and recovered cases increased significantly after the 154th and 143rd days, respectively, since the outbreak of the epidemic. The series displayed a progressively upward trend, suggesting a possible un-stabilized epidemic. From 21 June to 21 July, the numbers of confirmed and recovered cases were almost constant. However, the numbers of confirmed and recovered cases almost doubled in August 2020 compared with the July 2020 reports. The number of deaths remained stable between 13 March and 30 August 2020, with minor changes. In Ethiopia, the trend of COVID-19 increased progressively for six months starting from the first reported case on 13 March 2020 (Figure 1).

Figure 1 COVID-19 outbreak trend over time.

Model Identifications

In the identification of the model, the ACF and PACF of COVID-19 confirmed cases were examined to check whether the data were stationary. The very slow linear decay pattern in the ACF can be corrected by first-order differencing.

For the first-differenced confirmed cases, the autocorrelation plot shows a moderately large negative spike at the second lag, followed by correlations that alternate between positive and negative values, all of which are either not statistically significant or only barely cross the significance threshold. The partial autocorrelations decline steadily towards zero. The first difference of COVID-19 confirmed cases was therefore best characterized as following a second- or third-order moving average process. Similarly, the first difference of COVID-19 recovered cases is better described as following a first-order moving average process (Figures 2–7).

Figure 2 Autocorrelation plot of COVID-19 confirmed cases.

Figure 3 ACF plot after 1st differencing of the COVID-19 confirmed cases data.

Figure 4 PACF plot after 1st differencing of the COVID-19 confirmed cases data.

Figure 5 Autocorrelation plot of COVID-19 recovered cases.

Figure 6 ACF plot after 1st differencing of the COVID-19 recovered cases.

Figure 7 PACF plot after 1st differencing of the COVID-19 recovered cases.

Stationarity Test

The stationarity test was conducted using the Augmented Dickey–Fuller (ADF) test. To apply the ARIMA modeling technique effectively, the series must be stationary and free from any sort of trend. In their original form, the daily confirmed and recovered series of COVID-19 in Ethiopia were not found to be stationary, so they were transformed by taking the first difference. The ADF test was then used to validate the stationarity of the transformed series; for both confirmed and recovered cases the test indicated that there is no unit root, meaning that the series are stationary after first differencing (Table 2).
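The logic of a unit-root test can be sketched with the plain (non-augmented) Dickey-Fuller regression, in which the change in the series is regressed on a constant and the lagged level. This is a simplified illustration, not the ADF procedure of the statistical packages used in the study: it omits the lagged-difference terms, and the simulated random walk is hypothetical data. The t-statistic is compared with Dickey-Fuller critical values (about −2.9 at the 5% level for a few hundred observations with a constant), not with the usual normal or t tables.

```python
import random


def dickey_fuller_t(y):
    """t-statistic on rho in: diff(y)_t = alpha + rho * y_{t-1} + error."""
    dy = [b - a for a, b in zip(y, y[1:])]
    x = y[:-1]
    n = len(dy)
    mx, my = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (di - my) for xi, di in zip(x, dy))
    rho = sxy / sxx
    alpha = my - rho * mx
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, dy))
    se = (rss / (n - 2) / sxx) ** 0.5
    return rho / se


# Hypothetical random walk (non-stationary by construction).
random.seed(0)
walk = [0.0]
for _ in range(300):
    walk.append(walk[-1] + random.gauss(0, 1))

t_level = dickey_fuller_t(walk)                                   # level series
t_diff = dickey_fuller_t([b - a for a, b in zip(walk, walk[1:])])  # first difference
print(t_level, t_diff)
```

The statistic for the differenced series is strongly negative, so the unit-root hypothesis is rejected after differencing, which mirrors the sequence of steps described for the confirmed and recovered series.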

Table 2 Stationarity Test of the Series with Augmented Dickey–Fuller Test for Confirmed and Recovered Cases

Candidate Model Identification

The order of the model was determined on the basis of the ACF and PACF after first differencing. Candidate models were developed based on the spikes seen in the ACF and PACF graphs. The p and q parameters of the candidate ARIMA models were estimated, and the fitted models were then compared on their RMSE, MAPE and normalized BIC values; the candidate with the lowest values was identified as the best model for the daily spread of COVID-19 in Ethiopia. This led to the selection of the ARIMA (0, 1, 5) and ARIMA (2, 1, 3) models for forecasting the daily confirmed and recovered cases of COVID-19 in Ethiopia, respectively.

The tentative models were compared with other ARIMA models of different autoregressive and moving average orders using model selection criteria such as RMSE, MAPE and BIC in SPSS V25, and the suggested models proved to be relatively robust compared with the competing models. The ARIMA (0, 1, 5) model has the lowest RMSE, MAPE and BIC values, making it the most effective for modeling and forecasting the spread of COVID-19 in Ethiopia. Applying the same candidate models and selection criteria to the recovered cases, ARIMA (2, 1, 3) was identified as the best model, with the lowest RMSE, MAPE and BIC values. The results show that the proposed models performed well, both in-sample and out-of-sample (Table 3).

Table 3 Model Fit for Confirmed and Recovered COVID-19 Cases in Ethiopia

Model Coefficients Test

The best candidate models for confirmed and recovered cases were ARIMA (0, 1, 5) and ARIMA (2, 1, 3), respectively, based on the RMSE, MAPE and BIC criteria. The models were then estimated, with their parameters, for the daily confirmed and recovered series of COVID-19 in Ethiopia (Tables 4 and 5).

Table 4 Parameter Estimation Using ARIMA (0, 1, 5) Model for Confirmed Cases of COVID-19 in Ethiopia

Table 5 Parameter Estimation Using ARIMA (2, 1, 3) Model for Recovered Cases of COVID-19 in Ethiopia

Examining the estimation results for confirmed cases, we see that the MA (1) coefficient is 0.88, the MA (2) coefficient is −0.343 and the MA (5) coefficient is −0.249, all of which are highly significant. The estimated standard errors are 0.081, 0.106 and 0.084, respectively.
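The significance of these coefficients can be checked directly from the reported estimates and standard errors by forming the ratio of each estimate to its standard error. The sketch below uses a normal approximation for the two-sided p-value, which is an assumption; the software may report t-based values that differ slightly.

```python
import math

# Estimates and standard errors as reported in the text for confirmed cases.
estimates = {
    "MA(1)": (0.88, 0.081),
    "MA(2)": (-0.343, 0.106),
    "MA(5)": (-0.249, 0.084),
}


def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


results = {name: (coef / se, two_sided_p(coef / se))
           for name, (coef, se) in estimates.items()}

for name, (z, p) in results.items():
    print(name, round(z, 2), p)
```

All three ratios exceed 1.96 in absolute value, consistent with significance at the 0.05 level set in the methods.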

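Based on the evaluation of the different ARIMA models, the best-fitting models can be written in the following general forms, with the coefficient values reported in Tables 4 and 5, respectively. For confirmed cases, ARIMA (0, 1, 5):

$$\Delta Y_t = c + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \theta_3 \varepsilon_{t-3} + \theta_4 \varepsilon_{t-4} + \theta_5 \varepsilon_{t-5}$$

where $Y_t$ represents the value of daily confirmed cases, $\Delta Y_t = Y_t - Y_{t-1}$, and $\varepsilon_t$ represents the error terms. For recovered cases, ARIMA (2, 1, 3):

$$\Delta Y_t = c + \phi_1 \Delta Y_{t-1} + \phi_2 \Delta Y_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \theta_3 \varepsilon_{t-3}$$

where $Y_t$ represents the value of daily recovered cases, $\Delta Y_t = Y_t - Y_{t-1}$, and $\varepsilon_t$ represents the error terms.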

Forecasting Using ARIMA Model

The daily confirmed cases from 13 March to 31 August 2020 were fitted using the ARIMA (0, 1, 5) model and the daily recovered cases using the ARIMA (2, 1, 3) model. The results indicated that the fitted values matched the actual values well. Table 6 lists the forecast date, the point forecast and the upper and lower 95% confidence limits of the forecast for the next 2 months. The model's forecasting power is high, as demonstrated by the small gap between the actual and fitted values (Table 6).

Table 6 Forecasting of Daily Total COVID-19 Confirmed Cases and Total Recovered Patients in Ethiopia for the Next 60 Days According to ARIMA Models with 95% CI

We conclude that the selected models can be used for modeling and forecasting the spread of COVID-19 in Ethiopia. The forecasts showed that confirmed and recovered COVID-19 cases in Ethiopia would increase daily for the next sixty days (Figures 8 and 9).
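How point forecasts arise from a model of the ARIMA (0, 1, q) type can be sketched as follows. The function assumes the additive sign convention for the MA terms and a constant (drift) term; all numerical inputs are hypothetical and are not the fitted values from Tables 4 to 6. For horizons beyond q steps no past residuals remain in the forecast, so the level forecast grows by the drift alone and the long-horizon path is a straight line.

```python
def forecast_arima_01q(last_level, last_residuals, thetas, drift, horizon):
    """Point forecasts of the level for an ARIMA(0, 1, q) model with drift.

    last_residuals holds the most recent residuals, newest first:
    [e_T, e_{T-1}, ..., e_{T-q+1}].
    """
    q = len(thetas)
    level = last_level
    path = []
    for h in range(1, horizon + 1):
        # e_{T+h-i} is known only when i >= h; future errors are forecast as 0.
        ma_part = sum(thetas[i - 1] * last_residuals[i - h] for i in range(h, q + 1))
        level += drift + ma_part
        path.append(level)
    return path


# Hypothetical MA(2) example, for illustration only.
path = forecast_arima_01q(last_level=1000.0, last_residuals=[4.0, -2.0],
                          thetas=[0.5, 0.2], drift=10.0, horizon=4)
print(path)
```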

Figure 8 A 60-day forecast of total confirmed cases of COVID-19 according to ARIMA models with 95% confidence interval in Ethiopia.

Figure 9 A 60-day forecast of total recovered cases of COVID-19 according to ARIMA models with 95% confidence interval in Ethiopia.


Discussion

The study presented the trend of the COVID-19 outbreak from 13 March 2020 to 31 August 2020, as reported on the EPHI official website. Over this period, COVID-19 cases showed an upward trend. The total recovery and death rates as of 31 August 2020 were 37.2% and 1.57%, respectively, which reflected the peak incidence and recovery ratio since the outbreak of COVID-19 in Ethiopia. The numbers of confirmed and recovered cases and the death rate increased significantly.

Based on the findings of the study, the spread of COVID-19 in Ethiopia was expected to move in an upward trend. Having developed an appropriate model, Ethiopia can apply this model to forecast the trend of COVID-19.

In Ethiopia, starting with the first reported case, the COVID-19 trend moved progressively upward for six months, which was consistent with the Nigerian study.25 However, confirmed COVID-19 cases in Ethiopia have been lower than in the US and European countries, although those countries had comparatively higher testing capacities. Given the significant level of inadequate preventive practice in Ethiopia,26,27 it is important to comprehend the trend of COVID-19 and to generalize the implications of the strategies used by the government to mitigate the spread of the disease.

The candidate models were obtained using the autocorrelation function (ACF) and the partial autocorrelation function (PACF), based on the peaks found in the ACF and PACF charts. ARIMA (0, 1, 5) and ARIMA (2, 1, 3) were found to be the optimal models for confirmed and recovered COVID-19 cases, respectively, based on the lowest RMSE, MAPE and BIC values. These models were then used to study the trend of COVID-19 and the estimated increase in the number of confirmed and recovered cases. The findings were consistent with the study conducted in Nigeria, which showed an upward trend in the spread of COVID-19 within the selected timeframe.18

The ARIMA model has been widely used in modelling infectious disease outbreaks. ARIMA time series models with gradual corrective changes successfully predict a linear trend but fail to forecast a series with turning points.28 The current study used the complete periodic data to establish the ARIMA models and to forecast the epidemic over the next 60 days. The ARIMA model fitted well and is more suitable for short-term prediction. The ARIMA model was recently used to predict the dynamics of COVID-19 with acceptable accuracy in studies conducted in Iran, in Saudi Arabia and in the 15 most affected countries.17,29,30 The optimal predictive ARIMA models for confirmed and recovered COVID-19 cases were validated based on the lowest RMSE, MAPE and BIC values. Models with lower out-of-sample forecast error and lower criterion values are preferable and may contribute to future forecasts in Ethiopia.

In the current study, wide confidence intervals help to address any unforeseen changes in the forecast of dynamic COVID-19 cases. The prediction interval allows users to determine future uncertainty and to prepare different strategies for the range of possible outcomes. In addition, the wider prediction interval resulting from the non-stationary process was more practical in allowing for higher uncertainty and helps to illustrate the special significance of model identification, especially in evaluating whether or not the data is stationary.31
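Why the prediction interval of a differenced model widens with the horizon can be sketched from the forecast-error variance. For an ARIMA (0, 1, q) model with additive MA terms, the h-step variance is σ² times the sum of squared cumulative weights ψ_0, ..., ψ_{h−1}, where ψ_0 = 1 and ψ_j = 1 + θ_1 + ... + θ_min(j, q). The coefficients and σ below are hypothetical, and the sketch ignores parameter-estimation uncertainty.

```python
def interval_half_widths(thetas, sigma, horizon, z=1.96):
    """Half-widths of approximate 95% prediction intervals, ARIMA(0, 1, q)."""
    q = len(thetas)
    widths, cum = [], 0.0
    for j in range(horizon):
        psi_j = 1.0 + sum(thetas[:min(j, q)])   # cumulative effect of a shock j steps back
        cum += psi_j ** 2
        widths.append(z * sigma * cum ** 0.5)
    return widths


# Hypothetical MA(2) coefficients and unit residual standard deviation.
widths = interval_half_widths([0.5, 0.2], sigma=1.0, horizon=60)
print(round(widths[0], 2), round(widths[-1], 2))
```

Because every term added to the sum is positive, the interval keeps widening over the 60-day horizon, so forecasts near the end of the window carry much more uncertainty than the first few days.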

Furthermore, it is important to consider the body of studies that have applied statistical, mathematical/analytical and machine learning/data science models to COVID-19 prediction, in order to control the spread of COVID-19 globally and in specific countries, to evaluate its impact and to create a COVID-19 vulnerability index [1–16].

According to the model prediction, we need to be aware that COVID-19 may spread more than currently observed. In addition, based on the study findings, the spread of COVID-19 in Ethiopia is expected to move upward. As a result, rapid control of infections in healthcare settings and in the community is mandatory for successful COVID-19 prevention. The model can also be used as a decision-making tool to allocate health interventions and mitigate the spread of COVID-19.

This tool can also be used to forecast short-term disease transmission indicators more reliably, to support response control at all levels of the departments and to provide short-term emergency prevention programs for policy makers. Having established an appropriate model, Ethiopia can apply this model to predict the trend of COVID-19 in the country. ARIMA model forecasts are stable in all variables in the near future, which may be useful in prevention of the COVID-19 pandemic. As reported for Iran, the ARIMA model can provide rapid assistance in forecasting cases and developing a better preparedness plan.17

The ARIMA model is one of the most commonly used time series forecasting methods due to its simplicity and systematic structure and appropriate forecasting performance.32 Based on the findings of the study, it was predicted that the spread of COVID-19 in Ethiopia would move upward and the model could be used to predict the COVID-19 trend in the country.

ARIMA models have been used to predict the progression of infectious diseases in order to identify the possible outcomes of an outbreak. In addition, artificial intelligence (AI) has the potential to help in all stages of healthcare, from surveillance through to rapid diagnostic tests and faster drug development. AI may also help to decide which patients should be prioritized for treatment and to learn quickly which factors predict a higher risk of mortality, as well as which interventions and population-level controls have led to reduced harm.33,34 As the number of COVID-19 cases has increased nationally in Ethiopia and different studies have shown that the majority of the community had poor practice of preventive measures,26,27 further measures are needed to minimize the spread of COVID-19.


Conclusion

The current study showed that the spread of COVID-19 in Ethiopia is expected to move upward. ARIMA (0, 1, 5) and ARIMA (2, 1, 3) were found to be the best models for confirmed and recovered COVID-19 cases, respectively, on the basis of the lowest RMSE, MAPE and normalized BIC values. Forecasts showed that confirmed and recovered COVID-19 cases in Ethiopia will progressively increase on a daily basis for the next 60 days. The study developed an appropriate statistical model which can be used as a decision-support tool to implement health interventions and mitigate the spread of COVID-19 infection. The accuracy of the proposed ARIMA models can be considered good, valid and satisfactory, and the projected values can be classified as reliable forecasts. The study indicated that the ARIMA model is an easy-to-use method for rapidly forecasting the spread of COVID-19 in Ethiopia. In addition, we recommend using other forecasting methods, such as exponential smoothing, and comparing the results with our best selected ARIMA models as a baseline for new and recovered cases in Ethiopia. The limitation of the study was that no risk factors were evaluated, including demographic details of patients, their social networks and travel, owing to the lack of individual-level data.


Abbreviations

ACF, autocorrelation function; ANFIS, adaptive neuro-fuzzy inference system; ADF, augmented Dickey–Fuller test; ARIMA, autoregressive integrated moving average; BIC, Bayesian information criterion; PACF, partial autocorrelation function; CDC, communicable disease control; CI, confidence interval; CMC, composite Monte-Carlo; CUBIST, cubist regression; COVID-19, corona virus disease 2019; EPHI, Ethiopia Public Health Institute; MAPE, mean absolute percentage error; RF, random forest; RMSE, root mean squared error; SPSS, Statistical Package for Social Science; VMD, variational mode decomposition; WHO, World Health Organization.

Data Sharing Statement

All daily series of open-source data that support the findings of this study are available from regular updates by the Ethiopian Public Health Institute [accessed on 10/01/2020].

Consent for Publication

All authors provided written informed consent to publish this study.


Acknowledgments

The authors gratefully acknowledge the Ethiopian Public Health Institute for publicly releasing updated datasets on the number of confirmed, recovered and death COVID-19 cases in Ethiopia. We also acknowledge the feedback from participants of the 32nd Ethiopian Public Health Association annual conference.

Author Contributions

Both authors made substantial contributions to conception and design, acquisition of data, or analysis and interpretation of data; took part in drafting the article or revising it critically for important intellectual content; agreed to submit to the current journal; gave final approval of the version to be published; and agreed to be accountable for all aspects of the work.


Funding

The authors received no specific funding for this work.


The authors reported no conflicts of interest for this work.


1. McIntosh K, Hirsch MS, Bloom A. Coronavirus disease 2019 (COVID-19). In: UpToDate Hirsch MS Bloom. Vol. 5. 2020.

2. World Health Organization. COVID-2019 situation report; 2020. Available from: Accessed April 8, 2021.

3. EPHI. Ethiopian public health institute COVID-19 situational update; 2020 [cited September 1, 2020]. Available from: Accessed April 8, 2021.

4. CDC. Coronavirus disease 2019. Information for healthcare professionals about coronavirus (COVID-19); 2020 [cited May 20, 2020]. Available from: Accessed April 8, 2021.

5. Guo YR, Cao QD, Hong ZS, et al. The origin, transmission and clinical therapies on coronavirus disease 2019 (COVID-19) outbreak - an update on the status. Mil Med Res. 2020;7(1):11. doi:10.1186/s40779-020-00240-0

6. Fanelli D, Piazza F. Analysis and forecast of COVID-19 spreading in China, Italy and France. Chaos Solitons Fractals. 2020;134:109761. doi:10.1016/j.chaos.2020.109761

7. Thompson RN, Hollingsworth TD, Isham V, et al. Key questions for modelling COVID-19 exit strategies. arXiv preprint arXiv:200613012. 2020.

8. Zhang GP. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing. 2003;50:159–175. doi:10.1016/S0925-2312(01)00702-0

9. Pai P-F, Lin C-S. A hybrid ARIMA and support vector machines model in stock price forecasting. Omega. 2005;33(6):497–505. doi:10.1016/

10. Cao L-T, Liu -H-H, Li J, Yin X-D, Duan Y, Wang J. Relationship of meteorological factors and human brucellosis in Hebei province, China. Sci Total Environ. 2020;703:135491. doi:10.1016/j.scitotenv.2019.135491

11. Tabachnick BG, Fidell LS. SAS for Windows Workbook for Tabachnick and Fidell Using Multivariate Statistics. Allyn and Bacon; 2001.

12. Meyler A, Kenny G, Quinn T. Forecasting Irish Inflation Using ARIMA Models. 1998.

13. Price BA. Business forecasting methods: Jeffrey Jarrett, (Basil Blackwell Ltd., Oxford, UK, 1991) pp. 463, $19.95. Int J Forecast. 1992;7(4):535–536. doi:10.1016/0169-2070(92)90039-C

14. Hanke JE, Reitsch AG, Wichern DW. Business Forecasting. New Jersey: Prentice Hall; 2001.

15. Hamzaçebi C. Improving artificial neural networks’ performance in seasonal time series forecasting. Inf Sci (Ny). 2008;178(23):4550–4559. doi:10.1016/j.ins.2008.07.024

16. Stockton DJ, Glassman JE. An evaluation of the forecast performance of alternative models of inflation. Rev Econ Stat. 1987;69(1):108–117. doi:10.2307/1937907

17. Tran T, Pham L, Ngo Q. Forecasting epidemic spread of SARS-CoV-2 using ARIMA model (Case Study: Iran). Glob J Environ Sci Manag. 2020;6(SpecialIssue (Covid–19)):1–10.

18. Ibrahim RR, Oladipo OH. Forecasting the spread of COVID-19 in Nigeria using Box-Jenkins modeling procedure. medRxiv. 2020.

19. Chintalapudi N, Battineni G, Amenta F. COVID-19 disease outbreak forecasting of registered and recovered cases after sixty day lockdown in Italy: a data driven model approach. J Microbiol Immunol Infect. 2020;53(3):396–403. doi:10.1016/j.jmii.2020.04.004

20. Slutzky E. The summation of random causes as the source of cyclic processes. Econometrica. 1937;5(2):105–146. doi:10.2307/1907241

21. Yoo J, Maddala G. Risk premia and price volatility in futures markets. J Futures Mark. 1991;11(2):165. doi:10.1002/fut.3990110204

22. Box GE, Jenkins GM, Reinsel G. Time series analysis: forecasting and control Holden-day San Francisco. BoxTime Ser Anal. 1970;Day1970.

23. Mgaya JF, Yildiz F. Application of ARIMA models in forecasting livestock products consumption in Tanzania. Cogent Food Agric. 2019;5(1):1607430. doi:10.1080/23311932.2019.1607430

24. Mandal B. Forecasting Sugarcane Production in India with ARIMA Model. Inter Stat; 2005.

25. Odukoya OO, Adejimi AA, Isikekpei B, Jim CS, Osibogun A, Ogunsola FT. Epidemiological trends of coronavirus disease 2019 in Nigeria: from 1 to 10,000. Niger Postgrad Med J. 2020;27(4):271–279. doi:10.4103/npmj.npmj_233_20

26. Ayele AD, Mihretie GN, Belay HG, Teffera AG, Kassa BG, Amsalu BT. Knowledge and Practice to Prevent Against Corona Virus Disease (COVID-19) and Its Associated Factors Among Pregnant Women in Debre Tabor Town Northwest Ethiopia: A Community Based Cross-Sectional Study. 2020.

27. Asmelash D, Fasil A, Tegegne Y, Akalu TY, Ferede HA, Aynalem GL. Knowledge, attitudes and practices toward prevention and early detection of COVID-19 and associated factors among religious clerics and traditional healers in Gondar Town, Northwest Ethiopia: a Community-Based Study. Risk Manag Healthc Policy. 2020;13:2239. doi:10.2147/RMHP.S277846

28. Sahai AK, Rath N, Sood V, Singh MP. ARIMA modelling & forecasting of COVID-19 in top five affected countries. Diabetes Metab Syndr. 2020;14(5):1419–1427. doi:10.1016/j.dsx.2020.07.042

29. Singh RK, Rani M, Bhagavathula AS. Prediction of the COVID-19 pandemic for the top 15 affected countries: advanced autoregressive integrated moving average (ARIMA) model. JMIR Public Health Surveill. 2020;6(2):e19115. doi:10.2196/19115

30. Alzahrani SI, Aljamaan IA, Al-Fakih EA. Forecasting the spread of the COVID-19 pandemic in Saudi Arabia using ARIMA prediction model under current public health interventions. J Infect Public Health. 2020;13(7):914–919. doi:10.1016/j.jiph.2020.06.001

31. Kufel T. ARIMA-based forecasting of the dynamics of confirmed Covid-19 cases for selected European countries. Equilib Q J Econ Econ Policy. 2020;15(2):181–204.

32. Wang Y, Xu C, Wang Z, Zhang S, Zhu Y, Yuan J. Time series modeling of pertussis incidence in China from 2004 to 2018 with a novel wavelet based SARIMA-NAR hybrid model. PLoS One. 2018;13(12):e0208404. doi:10.1371/journal.pone.0208404

33. Yassine HM, Shah Z. How could artificial intelligence aid in the fight against coronavirus? An interview with Dr Hadi M Yassine and Dr Zubair Shah by Felicity Poole, Commissioning Editor. Expert Rev Anti Infect Ther. 2020;18(6):493–497. doi:10.1080/14787210.2020.1744275

34. Fong SJ, Dey N, Chaki J. Artificial Intelligence for Coronavirus Outbreak. Springer; 2020.

Creative Commons License © 2021 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.