It’s true, folks. Data science in the public health field is in the midst of a creative revolution.
In addition to searching the traditional isolated data silos (e.g., EHRs, public health records, surveys, the Human Genome Project), freethinking data scientists are venturing far beyond the doctor’s office.
They’re integrating data from mobile devices and sensors, weather and GIS, nature and history – whatever they think may have a bearing on the problem at hand.
Twitter and Depression
Using social media to monitor the development of outbreaks is, in Internet terms, old hat. Google Flu Trends was beating out CDC reports as early as 2009. In his nEmesis project, Adam Sadilek used comparable machine learning methods to identify outbreaks of food poisoning.
Tracking is only getting faster. In Rachelle Chong’s 2013 article on Big Predictions for Big Data Impact on Public Health, Dr. Jennifer Olsen notes that big-data sources have reduced the time needed to detect a pandemic from 167 days (1996) to 23 days (2009). She thinks one week is not unreasonable to aim for.
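The same kind of social data is now being applied to mental health. Depression is typically underreported. Texts about states of mind may only imply depression – some sufferers may never use the word. In the Microsoft Research work on social data and depression, de Choudhury’s goal is to cut through the noise and find the signals lurking there.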
Textual analysis is a part of it – changes in a user’s linguistic style over time can indicate fluctuating mental states – as is social analysis:
- How often is a person posting?
- What kinds of posts are they sharing?
- How are they connected to friends?
The hope is that this kind of big-data analysis will lead to early intervention and prevention strategies, of interest to both individuals and policymakers.
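To make the "how often is a person posting?" question concrete, here is a minimal Python sketch. It is not de Choudhury’s method; the function names (`weekly_post_counts`, `activity_ratio`) and the idea of comparing recent weeks against a baseline are assumptions made purely for illustration.

```python
from collections import Counter
from datetime import date

def weekly_post_counts(post_dates):
    """Count posts per ISO (year, week) pair."""
    counts = Counter()
    for d in post_dates:
        iso_year, iso_week, _ = d.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return dict(counts)

def activity_ratio(counts, baseline_weeks, recent_weeks):
    """Mean weekly posts in recent weeks divided by the baseline mean.

    Values below 1.0 mean the person is posting less than before.
    Returns None when there is no baseline activity to compare against.
    """
    baseline_mean = sum(counts.get(w, 0) for w in baseline_weeks) / len(baseline_weeks)
    recent_mean = sum(counts.get(w, 0) for w in recent_weeks) / len(recent_weeks)
    if baseline_mean == 0:
        return None
    return recent_mean / baseline_mean

# Hypothetical timestamps for one user: busy early in the month, quiet later.
posts = [date(2013, 1, d) for d in (7, 8, 9, 10, 15, 16, 17, 18, 22, 29)]
counts = weekly_post_counts(posts)
ratio = activity_ratio(counts, [(2013, 2), (2013, 3)], [(2013, 4), (2013, 5)])
```

A falling ratio on its own says nothing about depression. It is just one of many signals that would have to be combined with the textual and network features described above.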
Read more about the privacy debate and de Choudhury’s predictions for the future in: Data Story: How Microsoft Research is Using Social Data to Understand Depression.
Geomedicine and Esri
We’ve talked about mapping, including Mount Sinai’s enlightening pictures of diabetes, in our profile of the health-care industry. Thanks to new capabilities, geomedicine is revealing surprising secrets to public health officials.
Take Andy Oram’s 2013 report from an Esri Health GIS Conference. There, Oram saw Esri mapping solutions being used to, among other things:
- Pinpoint incidence clusters: Data scientists found that babies in Louisiana are more likely to be born with low birth weights in locations with particular demographics (e.g., housing projects).
- Improve quality of care: A VA Hospital is using RFIDs and GIS data to monitor equipment failures and track accident occurrences (e.g., where folks are slipping and falling).
- Flag unusual patterns: “In Louisiana, for instance, plotting the instances of certain diseases produced a pattern over a particular waterway that they deduced to be contaminated.”
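For a feel of what "pinpointing clusters" can look like in code, here is a rough Python sketch that bins case locations into a grid and flags crowded cells. This is not Esri’s software or API; `grid_cell`, `flag_clusters`, the cell size, and the threshold are all hypothetical choices for illustration.

```python
import math
from collections import Counter

def grid_cell(lat, lon, cells_per_degree=10):
    """Map a coordinate to an integer grid cell (10 cells per degree = 0.1 degree cells)."""
    return (math.floor(lat * cells_per_degree), math.floor(lon * cells_per_degree))

def flag_clusters(cases, cells_per_degree=10, min_cases=3):
    """Return the grid cells that contain at least `min_cases` reported cases."""
    counts = Counter(grid_cell(lat, lon, cells_per_degree) for lat, lon in cases)
    return {cell: n for cell, n in counts.items() if n >= min_cases}

# Made-up case locations (latitude, longitude); three fall in the same cell.
cases = [
    (30.41, -91.12),
    (30.45, -91.15),
    (30.48, -91.18),
    (29.95, -90.07),
]
clusters = flag_clusters(cases)
```

Raw counts like these are only a starting point. A real analysis would normalize by population and apply proper spatial statistics before calling anything a cluster.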
Factoring a person’s location into their health has important implications. In genetic databases, 35 to 50 percent of disease causes are listed as “unknown”, which really means “environmental.” That’s a lot of information being overlooked that could sharpen diagnoses.
In the future, doctors may be able to supplement address information and EPA inventories with all kinds of relevant data sources – e.g., weather reports, distance to high-quality food sources, pollution patterns – to get a clearer picture of the problem.
Scale this geo-based tool up to the government level, and you have the means to address and preempt public health issues at the regional and national level.
Crossover Modeling and Pollen Storms
Last, but not least, comes the crossover story of the Juniper Pollen Project. Funded by NASA, the project is a collaborative effort between the USA National Phenology Network and several universities in Arizona, New Mexico and Texas.
The aim is to improve predictions of juniper pollen release and issue allergy and asthma warnings to people prone to severe allergic reactions. As of 2013, the project draws on three main sources of data to issue predictions:
- NASA satellite data: Each year, scientists use NASA’s high-quality imagery to monitor the juniper canopy and watch for the moment of pollen release.
- Dust storm models: To predict the pollen’s dispersal, researchers then employ models adapted from the University of Arizona’s work on tracking dust storms. These models utilize real-time weather data to help provide a clearer picture of the direction and speed of “pollen storms.”
- Field verification: Local on-the-ground volunteers observe the development of pollen cones to pin down the timing of pollen release. Their data, combined with pollen samples from six strategic locations, help fine-tune the models.
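The actual dust storm models are far more sophisticated, but the basic idea of wind carrying pollen downwind can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function names `advect` and `track`, the flat-ground geometry, and the hourly wind readings are invented, and the sketch ignores turbulence, settling, and terrain.

```python
import math

def advect(east_km, north_km, wind_speed_ms, wind_from_deg, hours):
    """Move a pollen parcel in a straight line under a constant wind.

    wind_from_deg follows the weather-report convention: the direction
    the wind blows FROM, in degrees clockwise from north.
    """
    theta = math.radians(wind_from_deg)
    distance_km = wind_speed_ms * 3.6 * hours  # m/s to km/h, then times hours
    return (east_km - distance_km * math.sin(theta),
            north_km - distance_km * math.cos(theta))

def track(start, hourly_winds):
    """Chain one-hour steps to get the parcel position after each hour."""
    positions = [start]
    for speed_ms, from_deg in hourly_winds:
        positions.append(advect(*positions[-1], speed_ms, from_deg, 1.0))
    return positions

# Hypothetical winds: one hour from the west, then one hour from the south.
path = track((0.0, 0.0), [(5.0, 270.0), (5.0, 180.0)])
```

With these made-up winds the parcel drifts about 18 km east in the first hour and about 18 km north in the second. Swapping in real-time weather data is what turns a toy like this into something closer to a forecast.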
This project shows how partnerships among disciplines can yield outstanding results. In this case, NASA scientists, phenologists, and weather forecasters are all working together. It’s a typical example of how data science brings together viewpoints from traditionally separate fields to create new benefits for everyone.
In provoking this type of big-picture thinking, data science shows us new ways of looking at the world and finds connections not seen before. Combining NASA imagery of pollen release with dust storm models to track the tiny pollen grains is the kind of inspired solution that we can expect more of in the future.
Check out Norene Griffin’s summary of the project and interviews with researchers at: NASA Meets Public Health on the Juniper Pollen Project.
But Wait, There’s More…
If you’re searching for further examples of data science in public health, have a look at our summary of Data Science in the Health-Care Industry.
In addition to the history of data in the health-care sector, we cover many current trends and challenges, including:
- Personalized Care
- Disease Mapping and Modeling
- Patient Monitoring and Home Devices
- The Ultimate EHR