Temporal SARS-CoV-2 Severity Estimates

Estimating and predicting the true severity of SARS-CoV-2

Created Mar 30, 2023 - Last updated: Mar 30, 2023

The severity of SARS-CoV-2 has always been a matter of relative evaluation. At the onset of the pandemic, the virus was dubbed the 2019 novel coronavirus (2019-nCoV), as it was novel to our immune systems. Its impact was profound, with staggering mortality rates, hospitalizations, intensive care admissions, and intubations, all of which prompted widespread pandemic restrictions across the globe. Healthcare systems were caught off guard, grappling with shortages of personnel, equipment, and therapeutics.

To this day, gauging the severity of COVID-19 remains a relative issue. However, it has become increasingly challenging to do so, given that: (1) the virus has undergone multiple mutations and continues to evolve, (2) our immune systems have changed and continue to change as a result of vaccinations and prior infections, and (3) healthcare systems are better equipped and more experienced in handling the disease. The global populace has also changed, with an estimated 7 million deaths from COVID-19 worldwide.

Reports indicate that the more recent variants may have decreased pathogenicity owing to altered cell tropism, despite heightened immune evasion. The severity of a virus cannot be determined solely on the basis of its inherent biological traits, and characterizing those traits is itself a slow process that requires virus isolation and animal modeling. Precise evaluations of severity must consider the biological characteristics of the host environment, the healthcare system's ability to combat the disease, and the measure used to assess severity.

Our approach to estimating the true severity of SARS-CoV-2

We have developed a system that uses incidence-level data from over 200 thousand patients who receive care at Mass General Brigham (MGB) to measure, track, and predict severity of SARS-CoV-2 over time.

The observed and predicted temporal severity profiles of SARS-CoV-2 in Massachusetts

We use time-series forecasting models to predict outcome-based true severity over the next 3 months.

Observed and predicted adjusted absolute risk (AAR) of mortality and hospitalization

Perspective

- hospitalization

Our predictions for February were almost perfect for hospitalization. We predict a spike in hospitalization risk over the next 3 months.

- mortality

We overestimated mortality in February. This is potentially due to a spike in January, which confused our models. Our estimates for the next 3 months show trends similar to those of the past few months.

1. Prophet
Developed by Facebook, Prophet is an open-source time series forecasting tool that uses a decomposable time series model to make predictions. It is an additive regression model with a piecewise linear or logistic growth curve. A yearly seasonal component can be included using Fourier series, and a weekly seasonal component using dummy variables.
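The Fourier-series seasonal component mentioned above can be sketched in a few lines. This is a minimal illustration and not Prophet's own code: the function names `fourier_terms` and `seasonal_component` and the example coefficients are assumptions made for this sketch.

```python
import math


def fourier_terms(t, period, order):
    """Return [sin(2*pi*k*t/period), cos(2*pi*k*t/period)] for k = 1..order."""
    terms = []
    for k in range(1, order + 1):
        angle = 2.0 * math.pi * k * t / period
        terms.append(math.sin(angle))
        terms.append(math.cos(angle))
    return terms


def seasonal_component(t, period, coefficients):
    """Weighted sum of Fourier terms; coefficients alternate sin, cos weights."""
    order = len(coefficients) // 2
    features = fourier_terms(t, period, order)
    return sum(c * x for c, x in zip(coefficients, features))


# Hypothetical coefficients for a 12-month cycle (illustrative only).
monthly_effect = [seasonal_component(month, 12, [0.5, -0.2]) for month in range(24)]
```

In a fitted model the coefficients would be estimated from the data; here they are fixed by hand only to show that the resulting component repeats every `period` steps.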

2. Naive
A naive forecast simply assumes that every future value of a time series will equal its most recent observed value. It serves only as a baseline against which forecasts from more sophisticated techniques are compared.
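As a minimal sketch of the naive forecast, the function below repeats the last observation over the forecast horizon. The name `naive_forecast` and the example risk values are hypothetical and not taken from our data.

```python
def naive_forecast(history, horizon):
    """Repeat the most recent observation for every future step."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Hypothetical monthly risk values, used only for illustration.
example = naive_forecast([0.020, 0.030, 0.025], horizon=3)
```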

3. Simple exponential smoothing
A basic forecasting technique that gives more weight to recent data points and less weight to older data points. It is often used for short-term forecasts.
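The weighting scheme can be sketched as a single recursive update of a "level". This is an illustrative implementation under simple assumptions (the level is initialized to the first observation and the smoothing weight `alpha` is supplied by hand, not estimated); the function name is hypothetical.

```python
def simple_exponential_smoothing(history, alpha, horizon):
    """Flat forecast from an exponentially weighted average of past values.

    alpha close to 1 favors recent observations; alpha close to 0 favors older ones.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = history[0]  # assumption: initialize the level with the first value
    for y in history[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return [level] * horizon
```

With `alpha = 1` the method reduces to the naive forecast, since only the last observation is retained.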

4. Holt-Winters forecasting
Also known as triple exponential smoothing, the Holt-Winters method builds on Holt's linear method, which adds a trend component to simple exponential smoothing to capture upward or downward trends, and further adds a seasonal component. It therefore takes into account three components of a time series: level, trend, and seasonality. Level refers to the average value of the series over time, trend refers to the direction and magnitude of any long-term changes, and seasonality refers to any periodic fluctuations that occur at fixed intervals (such as daily, weekly, or monthly). Because it models trend and seasonality explicitly, it does not require the data to be stationary.
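The three components can be made concrete with a small additive Holt-Winters sketch. It assumes hand-picked smoothing weights and a simple initialization from the first two seasons; production implementations estimate the weights and initial states from the data, and the function name here is hypothetical.

```python
def holt_winters_additive(history, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters: forecast = level + h * trend + seasonal index."""
    m = season_length
    if len(history) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    # Assumed initialization: level = mean of the first season,
    # trend = average per-step change between the first two seasons,
    # seasonal indices = deviations of the first season from its mean.
    level = sum(history[:m]) / m
    trend = (sum(history[m:2 * m]) - sum(history[:m])) / (m * m)
    seasonal = [history[i] - level for i in range(m)]

    for t, y in enumerate(history):
        s = seasonal[t % m]
        previous_level = level
        level = alpha * (y - s) + (1 - alpha) * (previous_level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y - level) + (1 - gamma) * s

    n = len(history)
    return [level + h * trend + seasonal[(n - 1 + h) % m]
            for h in range(1, horizon + 1)]
```

For a series that simply repeats the same seasonal pattern with no trend, the sketch reproduces that pattern in its forecasts.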

5. TBATS
The TBATS (Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components) model is a robust and flexible forecasting technique that can handle time series with multiple seasonal patterns, trend, and non-linear growth. Through the Box-Cox transformation it can accommodate data with different distributions, and it can automatically select among its candidate model configurations.
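The Box-Cox step in the TBATS acronym is a simple power transformation. The sketch below shows the transformation and its inverse for a single positive value; the function names are hypothetical, and in practice the parameter `lam` is chosen by the fitting procedure, not by hand.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value: log(y) if lam == 0, else (y**lam - 1) / lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)
```

Forecasts are produced on the transformed scale and then mapped back with the inverse, which keeps forecasts of a positive quantity such as a risk estimate positive.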

6. ARIMA
ARIMA is a time series forecasting technique that models the autocorrelation in the data using a combination of autoregressive (AR), integrated (I), and moving average (MA) terms. Autoregression refers to using the past values of a time series to predict future values. Differencing (the "integrated" part) involves taking the difference between consecutive observations to make the time series stationary. Moving average involves using past forecast errors to predict future values. The model is denoted ARIMA(p, d, q), where p is the order of the autoregressive component, d is the degree of differencing required to make the time series stationary, and q is the order of the moving average component. For given orders p, d, and q, the model coefficients are estimated from the data using techniques such as maximum likelihood estimation.
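To illustrate how differencing and autoregression combine, here is a minimal sketch of an ARIMA(1, 1, 0) forecast. It is deliberately simplified: it omits the moving average term and fits the AR(1) coefficient by least squares instead of maximum likelihood, and all function names are hypothetical.

```python
def difference(series):
    """First-order differencing: consecutive changes of the series (d = 1)."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(series):
    """Least-squares estimate of (c, phi) in x_t = c + phi * x_{t-1} + error."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx if sxx > 0 else 0.0  # constant input: no autoregressive signal
    return mean_y - phi * mean_x, phi


def arima_110_forecast(history, horizon):
    """Forecast by modeling the differences with AR(1), then integrating back."""
    if len(history) < 3:
        raise ValueError("need at least three observations")
    diffs = difference(history)
    c, phi = fit_ar1(diffs)
    last_value, last_diff = history[-1], diffs[-1]
    forecasts = []
    for _ in range(horizon):
        last_diff = c + phi * last_diff      # predict the next change
        last_value = last_value + last_diff  # undo the differencing
        forecasts.append(last_value)
    return forecasts
```

On a perfectly linear series the differences are constant, so the sketch simply continues the line.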

7. Auto.Arima
Auto.Arima is an automated version of the ARIMA model. The algorithm starts by fitting an initial ARIMA model to the data and then systematically tests alternative models with different combinations of p, d, and q values. It selects the best model based on the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC), which are statistical measures that balance goodness of fit against model complexity. The selected model is then used to generate forecasts for the time series.

8. SARIMA
Seasonal ARIMA is an extension of the ARIMA model that can handle seasonal patterns in the data. Many real-world time series exhibit seasonal patterns, such as weekly or monthly cycles, which cannot be captured by non-seasonal ARIMA models. SARIMA models overcome this limitation by adding seasonal components to the ARIMA model. Specifically, SARIMA models add four additional parameters, denoted as (P, D, Q, s), where P, D, and Q represent the autoregressive, differencing, and moving average orders for the seasonal component, and s represents the length of the seasonal cycle. It assumes that the data is stationary once the seasonal and non-seasonal differencing has been applied.
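The seasonal differencing controlled by D and s can be sketched as subtracting the value observed one full cycle earlier. The function name `seasonal_difference` is hypothetical, and the example series are made up for illustration.

```python
def seasonal_difference(series, season_length):
    """Subtract the observation from one seasonal cycle earlier (D = 1)."""
    if season_length <= 0:
        raise ValueError("season_length must be positive")
    return [series[i] - series[i - season_length]
            for i in range(season_length, len(series))]


# A repeating pattern with cycle length 2 is removed entirely.
flat = seasonal_difference([1, 3, 1, 3, 1, 3], season_length=2)

# A steady upward trend leaves a constant after seasonal differencing.
steady = seasonal_difference([1, 2, 3, 4, 5, 6], season_length=3)
```

The remaining series is shorter by one cycle, and the non-seasonal ARIMA terms are then fitted to what is left.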

Evaluation criteria
We combined all 8 models into a single function and chose the model with the lowest root mean square error (RMSE). SARIMA was the best model for predicting both mortality and hospitalization.
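The selection step can be sketched as follows. This is an illustration under stated assumptions, not our production function: it assumes RMSE is computed on a held-out tail of the series, uses two stand-in forecasters in place of the 8 models, and all names (`rmse`, `select_best_model`, `naive`, `mean_forecast`) are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def select_best_model(history, holdout, models):
    """Score each forecaster on the holdout and return (best name, all scores).

    models maps a name to a function (history, horizon) -> list of forecasts.
    """
    scores = {name: rmse(holdout, forecast(history, len(holdout)))
              for name, forecast in models.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Stand-in candidates for illustration; the real comparison covers all 8 models.
def naive(history, horizon):
    return [history[-1]] * horizon


def mean_forecast(history, horizon):
    average = sum(history) / len(history)
    return [average] * horizon


best_name, all_scores = select_best_model(
    history=[1, 2, 3, 4],
    holdout=[4, 4],
    models={"naive": naive, "mean": mean_forecast},
)
```

Keeping every forecaster behind the same `(history, horizon)` signature is what makes it possible to score all candidates in one loop.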