While it is nice to see the overall picture of the Covid-19 outbreak in the United States as well as being able to see how the individual states are doing, that leaves the question of where the real outbreaks are located.
The Center for Systems Science and Engineering (CSSE) at Johns Hopkins University publishes data daily on their GitHub page. The data is broken down to the county level. All that remained was for me to go through the data and see which individual counties were having the largest outbreaks.
So how did I determine who was having the largest outbreak? I used two criteria.
The first criterion was a growth rate of more than 19% over the last week. That served to eliminate random spikes in reported infection rates due to returning test results. Also, while 19% is a fairly high growth rate, it is more indicative of unrestrained growth than of constant testing.
The second criterion was a minimum of 50 new infections reported over the previous week. This helped remove false positives from the list, meaning large growth rates that resulted from very small numbers of infections. For example, a county that moved from 2 reported cases to 4 reported cases in one week would show a staggering 100% growth rate, but that doesn’t indicate an outbreak as far as I’m concerned.
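The two criteria can be sketched as a small filter. This is only an illustration under stated assumptions: the growth rate is taken to be new cases over the week divided by the cumulative total a week earlier, counties with a zero baseline are skipped, and the function name, county names, and sample numbers are all made up.

```python
def weekly_outbreaks(counts, min_growth=0.19, min_new=50):
    """Return (county, new_cases, growth) for counties meeting both criteria.

    counts maps a county name to (cases_week_ago, cases_now), both cumulative.
    Assumption: growth = new cases this week / cumulative cases a week ago.
    """
    flagged = []
    for county, (before, now) in counts.items():
        new_cases = now - before
        if before <= 0:
            # Growth rate is undefined with a zero baseline (assumption: skip).
            continue
        growth = new_cases / before
        if growth > min_growth and new_cases >= min_new:
            flagged.append((county, new_cases, growth))
    # Fastest-growing counties first.
    return sorted(flagged, key=lambda row: row[2], reverse=True)


# Hypothetical sample data, not taken from the Johns Hopkins files.
sample = {
    "County A": (2, 4),          # cases doubled, but only 2 new cases
    "County B": (1000, 1250),    # 25% growth and 250 new cases
    "County C": (50000, 52000),  # 2000 new cases, but only 4% growth
}

print(weekly_outbreaks(sample))  # [('County B', 250, 0.25)]
```

County A fails the minimum of 50 new infections and County C fails the growth threshold, so only County B is flagged.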
What do those two criteria leave out? The most obvious places that would be left out were places like New York where they still have large numbers of infections but they have their growth rate under control. Other places that were left out were counties that have few infections to date but are rapidly growing.
Remember: there are very few places in the United States that remain untouched by Covid-19. And for those places that remain untouched, they are within a leisurely driving distance from a place where they can become infected.
With those criteria in mind, here is where the outbreaks are currently happening in the United States.
There are several reasons to only update my data on Covid-19 once per week. The first — and probably largest — reason is that it takes quite a bit of time to create a graph for each state. Secondly, it takes several days before any trends can be detected in the raw data. And finally, I simply don’t have time to update the graphs any more frequently unless something significant appears in the raw data.
Each normalized data set is created by taking the raw data reported from the daily infection rate, using a skewed normal function to estimate the actual time the person was infected, and recording the results as a separate data set. For clarity purposes, when I create a graph, I include the raw daily infection rate in addition to the normalized daily infection rate.
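A minimal sketch of this kind of normalization, not the actual spreadsheet calculation. The 14-day window, the location, scale, and skew values, and all function names are assumptions chosen for illustration. Each day's reported count is spread back over the preceding days using weights taken from a skew normal density.

```python
import math


def skew_normal_pdf(x, alpha):
    """Density of the standard skew normal: 2 * phi(x) * Phi(alpha * x)."""
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(alpha * x / math.sqrt(2.0)))
    return 2.0 * phi * cdf


def lag_weights(window=14, location=7.0, scale=3.0, alpha=-2.0):
    """Weights for a lag of 0..window-1 days between infection and report.

    All four parameters are illustrative assumptions; a negative alpha
    skews the curve to the left. The weights are rescaled to sum to 1.
    """
    raw = [skew_normal_pdf((lag - location) / scale, alpha) for lag in range(window)]
    total = sum(raw)
    return [w / total for w in raw]


def estimate_infection_dates(reported, weights):
    """Spread each day's reported count back over the preceding days."""
    estimated = [0.0] * len(reported)
    for day, count in enumerate(reported):
        for lag, weight in enumerate(weights):
            if day - lag >= 0:  # shares falling before the series start are dropped
                estimated[day - lag] += count * weight
    return estimated


# Made-up series: a single spike of 100 reported cases on day 20 of 30.
reported = [0] * 20 + [100] + [0] * 9
estimated = estimate_infection_dates(reported, lag_weights())
```

In this sketch the spike is redistributed over days 7 through 20, and the most recent days of any series stay incomplete until later reports arrive, which is why the normalized curve trails the reported one.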
The initial data set that I use to create the graphs is provided by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. The data sets are available for download on GitHub at this location.
Finally, a few points about the data sets and the resulting graphs:
Each graph covers the dates from March 1, 2020 through May 8, 2020.
The x-axis contains the date.
Since the normalized daily infection rate requires two weeks of data to be spread through the normal curve, it trails the reported daily infection rate by two weeks. It is essentially an estimate of the number of infections for each date, but it can’t be any closer to the present than two weeks.
Each graph has a best fit y-axis. That makes it easy to see how the infection rate has progressed in the area covered by the graph, but can be confusing when comparing different places unless the scale of the y-axis is taken into account.
There should never be a negative daily infection rate reported. Some states, like Nevada, have reported negative infection rates several times. That brings into question the quality of the data that these states are reporting. That said, I have used the data as reported to help prevent misunderstandings.
Finally, I have tried to make the graphs a little bit easier to view while maintaining the same essential format as all the other graphs in this series.
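On the negative daily infection rates mentioned in the list above: one way they can arise, assuming the daily figure is obtained by differencing a cumulative total, is a downward revision of that total. The sketch below uses made-up numbers and hypothetical function names to show how such days could be flagged.

```python
def daily_new_cases(cumulative):
    """Turn a cumulative case series into day-over-day changes."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]


def negative_days(cumulative):
    """Return (day_index, change) pairs where the cumulative total fell."""
    changes = daily_new_cases(cumulative)
    return [(i + 1, change) for i, change in enumerate(changes) if change < 0]


# Made-up cumulative totals; the total is revised downward on day 2.
totals = [100, 130, 125, 160]

print(daily_new_cases(totals))  # [30, -5, 35]
print(negative_days(totals))    # [(2, -5)]
```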
If anyone is interested in a LibreOffice spreadsheet containing the raw data and calculations for creating each data set, let me know and I’ll make it available.
With many parts of the United States either ending their stay-at-home orders or about to let them expire, I wanted to do an update to use as a baseline to see how the daily infection rate developed as people began to interact more. To accomplish this, I have created a graph of the United States daily infection rate as well as infection rates for each individual state.
The graphs are made using the same technique I used to make the previous graphs. I took the reported daily infection rates, used a skewed normal graph to calculate the possible date that the infections happened, and plotted the two together on the same graph.
The underlying mathematics are pretty simple and rely mostly on brute force calculations done in a spreadsheet. If anyone is interested in the mathematics or a copy of the spreadsheet, let me know and I will provide them.
As a side note, I won’t be doing much analysis on the individual graphs. I created them mostly to get a visual representation of how the known daily infections were affected by changes in people’s behavior.
A few points to watch out for on each graph:
Each graph covers the dates from March 1, 2020 through May 1, 2020.
Each graph has a best-fit y-axis. That makes it easy to see how the infection rate progresses, but can be confusing if not taken into account when comparing different states.
There should never be a negative daily infection rate reported, but some states like Nevada have done this several times. It is possible that those states aren’t reporting their numbers accurately and actually have higher rates than they are releasing.
As the United States begins to open from its Covid-19 shutdown, it is important to get a baseline to see how the disease progresses as the stay-at-home orders begin to wane. To create a yardstick, I will be relying on data from Johns Hopkins University available at their GitHub page [https://github.com/CSSEGISandData/COVID-19]. After I process the data, that should give a snapshot of the condition of each individual state as well as an overall view of the United States. With the snapshot in place, it will be easier to see how the infection rate changes over time.
To process the data into easily readable form, I first take the total of infections per day and spread them out through a skewed normal curve. I use a normal graph that is skewed to the left to help distribute the raw data into the most likely time that the person was actually infected instead of the time when they received a positive test result. The downside to processing the data like this is that it is a trailing indicator. In other words, it shows what the infection rate would have been about two weeks prior to the last data available. In order to help compensate for the lag in processed data, I also put the raw data into the same chart. That makes it easier to see where the data might be going over time.
Here is a graph of the United States using the data available as of the end of the day on April 28, 2020.
The “reported” data is taken directly from the data provided by Johns Hopkins University, while the “normalized” data is the same data processed through the skewed normal distribution.
With the data processed this way, it is easy to see the initial growth rate as well as when it peaked and started down.
The overall downward trend for the United States is what I would expect considering that the states that had the worst outbreaks are rapidly bringing those outbreaks under control. But this doesn’t tell the entire story. As we have all seen, the picture of the United States doesn’t necessarily indicate how each individual part is doing. There are several states that are bringing their daily infections under control while there are others that are still spreading on an exponential curve. Furthermore, as each state applies its own rules to reopening, it’s quite likely that each state will diverge from the others with respect to their infection rate.
Since showing a graph of each state every time I update would be prohibitive, I will show a few that are opening early.
As you can see from the graph of Alaska, they seem to have their outbreak well in hand. Alaska never had a very big outbreak; notice the y-axis and that they are only getting about 5 infections per day. It shouldn’t take much of an effort to keep their outbreak under control.
Georgia is the next state that I want to look at. While Georgia has flattened their curve, they haven’t done much to actually reduce the number of daily infections. With between 600 and 800 daily infections, it will be easy for them to slip back into an exponential growth rate. Even if they manage to escape an exponential growth rate when they begin to return to work, 600 to 800 new cases of Covid-19 per day will maintain stress on their healthcare system.
Minnesota is trying to open their economy a little bit while still maintaining their stay-at-home order. While their daily infection rate isn’t outrageous, it doesn’t appear to have reached its peak yet.
Mississippi is another state that is trying to reopen their economy. They, too, might not have reached their peak.
Oklahoma appears to have had initial success in reaching their peak and even starting to reduce their daily infection rate. On the other hand, Oklahoma is being fairly aggressive in reopening their businesses. This state could be the canary in the coalmine as far as whether the daily infection rates begin to climb again after the economy starts to reopen.
South Carolina is in much the same position as Oklahoma with respect to their daily infection rate and reopening strategy. Perhaps they are another canary in the coalmine.
Finally, that brings us to Tennessee. While they have flattened their curve, they don’t seem to have reduced the daily infection rate. They are also being very aggressive with their reopening plans. Over the next few days, they are going to remove all stay-at-home restrictions and rely completely on social distancing to try to hold their daily infection rate steady.
And that concludes the states that are leading the charge to reopen. It will be educational to see which, if any, of these states manage to keep their daily infection rate in check. The fate of these states is in the hands of their citizens. Whether they take the threat of Covid-19 seriously when they return to work or whether they end up being the first states to require a second round of stay-at-home orders remains to be seen.
Well hello there! I’m not going to bother tapping the microphone to see if it is on. I know that there is no one here listening. How do I know this? Well, this isn’t my first blog even if it is the first time I have blogged at marisaburnes.com.