Master's in Data Science


Data Science in the Health Care Industry

Opportunities in Health Care Data Science

The Promise of Big Data

It’s no secret that the U.S. health care industry is an overpriced, inefficient mess. In 2012, the authors of How Data Science Is Transforming Health Care: Solving the Wanamaker Dilemma reported that the U.S. was spending over $2.6 trillion on health care each year, and that $600 billion of that spending goes to treatments that either do not help or actually cause harm.

Even more depressingly, health care expenses represented 17.6% of GDP in 2013, with an estimated $600 billion consumed by waste and fraud. By 2020, health care’s share of GDP is estimated to rise to nearly 20%. The country ranks 37th out of developed economies in life expectancy and other measures of health.

Some say big data is the answer. Although the health care industry has been notoriously slow to harness its power (for more details, see our profiles of Biotechnology and Pharmaceuticals), there is hope:

  • Accelerating costs are forcing payors and health-care providers to shift from a fee-for-service approach (the more treatments, the better) to one that favors patient outcomes (rewarding providers for targeted treatments that actually work).
  • Physicians are continuing to move towards evidence-based medicine, reviewing data from a huge range of sources before making treatment decisions.
  • In 2012, the NIH devoted approximately $15 million in award funding for eight projects to research uses of big data.

That puts data scientists in a prime position. Used wisely, big data has the potential to help physicians make better decisions across the board – from personalized treatments to preventive care.

Oh, and it has one other major benefit of interest to the health care industry… it slashes costs.

Personalized Medicine

Imagine you’re a doctor treating a patient with cancer. In the past, you might have based your treatment plan on the results of double-blind studies. These studies may have been rigorous, but they may have failed to take patient differences into account.

Big data changes the game. Now you can merge and analyze data sets from:

  • Clinical trials
  • Direct observations of other physicians
  • Electronic medical records
  • Online patient networks
  • Genomics research (see below)
  • and more…

One of the top goals is to create a personalized treatment plan based on individual biology. Instead of treating your patient with a drug that works 80% of the time (e.g., the breast cancer drug tamoxifen), you can employ data science to custom-tailor a regimen just for her.

Genomics

Inexpensive DNA sequencing and next-generation genomic technologies are changing the way health care providers do business. As Michael Walker points out, we now have the ability to map entire DNA sequences and measure tens of thousands of blood components to assess health:

“Next-generation genomic technologies allow data scientists to drastically increase the amount of genomic data collected on large study populations. When combined with new informatics approaches that integrate many kinds of data with genomic data in disease research, we will better understand the genetic bases of drug response and disease.”

Researchers aim to achieve ultra-personalized care. As a first step, the FDA has begun to issue medicine labels that specify different dosages for patients with particular genetic variants.

Predictive Analytics and Preventive Measures

Prevention is always better than cure. For the health-care industry, it also happens to save a lot of money. (The Centers for Medicare & Medicaid Services, for instance, can penalize hospitals that exceed average rates of readmission – indicating that they could be doing more to prevent medical problems.)

Take, for example, the partnership between Mount Sinai Medical Center and former Facebook guru Jeff Hammerbacher. Mount Sinai’s problem was how to reduce readmission rates. Hammerbacher’s solution was predictive analytics:

  • In a pilot study, Hammerbacher and his team combined data on disease, past hospital visits and other factors to determine a patient’s risk of readmission.
  • These high-risk patients would then receive regular communication from hospital staff to help them avoid getting sick again.
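
To make the idea concrete, here is a minimal Python sketch of a readmission risk score. It is not the model used in the Mount Sinai pilot: the feature names, weights, intercept and outreach threshold are all hypothetical, chosen purely for illustration, and a real model would learn its weights from historical patient data.

```python
import math
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: str
    chronic_conditions: int  # number of diagnosed chronic diseases
    visits_last_year: int    # hospital visits in the past 12 months
    age: int


# Hypothetical weights; a real model would be fit to historical data.
WEIGHTS = {"chronic_conditions": 0.6, "visits_last_year": 0.4, "age": 0.02}
INTERCEPT = -4.0


def readmission_risk(p: Patient) -> float:
    """Return a logistic score between 0 and 1 (higher means riskier)."""
    z = (
        INTERCEPT
        + WEIGHTS["chronic_conditions"] * p.chronic_conditions
        + WEIGHTS["visits_last_year"] * p.visits_last_year
        + WEIGHTS["age"] * p.age
    )
    return 1.0 / (1.0 + math.exp(-z))


def high_risk(patients, threshold=0.5):
    """List the IDs of patients whose score meets the assumed threshold."""
    return [p.patient_id for p in patients if readmission_risk(p) >= threshold]


patients = [
    Patient("A", chronic_conditions=3, visits_last_year=4, age=72),
    Patient("B", chronic_conditions=0, visits_last_year=1, age=35),
]
flagged = high_risk(patients)  # IDs that would receive follow-up outreach
```

In this toy data set only patient A crosses the threshold, so only patient A would be added to the follow-up list.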

Sinai isn’t alone. In 2008, Texas Health partnered with Healthways to merge and analyze clinical and insurance claims information. Their goal was the same – identify high-risk patients and offer them customized interventions.

Meanwhile, in 2013, data scientists at Methodist Health System were looking at accountable-care organization claims from 14,000 Medicare beneficiaries and 6,000 employees. Their aim? You guessed it. Predict which patients will need high-cost care in the future.

Patient Monitoring and Home Devices

Doctors can do a lot, but they can’t follow a patient around every minute of the day. Wearable body sensors – sensors tracking everything from heart rate to testosterone to body water – can.

Sensors are just one way in which medical technology is moving beyond the hospital bed. Home-use medical monitoring devices and mobile applications are cropping up daily. A scanner to diagnose melanomas? A personal ECG heart monitor? No problem.

These gadgets are designed to help the patient, naturally, but they’re also busy harvesting data.

For example:

  • Asthmapolis’s GPS-enabled tracker, already available by 2011, records inhaler usage by asthmatics. This information is collated, analyzed and merged with data on asthma catalysts from the CDC (e.g., high pollen counts in New England) to help doctors learn how best to prevent attacks.
  • With Ginger.io’s mobile application, out in 2012, patients consent to have data about their calls, texts, location and movements monitored. These are combined with data on behavioral health from the NIH and other sources to pinpoint potential problems. Too many late-night phone calls, for instance, might signal a higher risk of anxiety attack.
  • To improve patient drug compliance, Eliza, a Boston-based company, monitors which types of reminders work on which types of people. Smarter targeting means more compliance.
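
The first example above boils down to joining device readings with environmental data. The Python sketch below shows one simple way that kind of merge could look. The records, field layout and pollen labels are invented for illustration and do not reflect any real Asthmapolis or CDC data format.

```python
from collections import defaultdict

# Hypothetical inhaler events: (patient_id, region, date)
inhaler_events = [
    ("p1", "New England", "2011-05-01"),
    ("p2", "New England", "2011-05-01"),
    ("p3", "New England", "2011-05-01"),
    ("p1", "New England", "2011-05-02"),
    ("p4", "Southwest", "2011-05-01"),
]

# Hypothetical environmental readings keyed by (region, date)
pollen_level = {
    ("New England", "2011-05-01"): "high",
    ("New England", "2011-05-02"): "low",
    ("Southwest", "2011-05-01"): "low",
}


def uses_by_pollen_level(events, pollen):
    """Count inhaler uses per pollen level by joining on (region, date)."""
    counts = defaultdict(int)
    for _patient, region, date in events:
        # Events with no matching reading are counted as "unknown".
        level = pollen.get((region, date), "unknown")
        counts[level] += 1
    return dict(counts)


summary = uses_by_pollen_level(inhaler_events, pollen_level)
```

A real analysis would also need to account for how many patients live in each region before drawing conclusions from raw counts.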

Self-Motivated Care

It’s a “patient heal thyself” world, now. Developments like personal genetic testing (e.g., 23andMe.com), online patient networks and behavioral apps like Be Well are allowing individuals to take control of their own health.

This is getting data scientists very excited. In Big Data and the Consumerization of Healthcare, the author envisions expanding the Be Well app to include long-term analysis of behavioral patterns. Big data could help individuals create a “life report” that connects ongoing changes to current conditions and gives them new perspectives on their well-being.

There’s another benefit to empowering patients. Their insights can be mined for data. In the same Data Science Series article, the author notes:

“A community such as PatientsLikeMe groups over 150,000 patients who share their symptoms, concerns, experiences with treatment and healing stories about over 1,000 conditions.”

That’s a lot of information about symptoms, treatments and side effects that hospitals, pharmaceutical companies and researchers are interested in hearing about.

Disease Modeling and Mapping

One of the flashiest uses of data science in the past few years has been in tracking (and finding ways to halt or prevent) diseases.

For example:

  • At a 24-Hour Data Science Code-a-Thon hosted by Kaiser Permanente in 2013, teams used Hadoop technologies to map incidences of respiratory conditions (e.g., asthma flare-ups occurred in areas with higher ozone levels for extended periods during the summer).
  • While developing the open-source modeling application Spatio-Temporal Epidemiological Modeler (STEM), as reported in 2013, researchers discovered links between changes in local climate and temperature and the spread of outbreaks of dengue and malaria.
  • By mapping hot spots for Type 2 diabetes, Mount Sinai Medical Center is hoping to improve treatments. After identifying the hot spots, researchers focus on which genetic factors are involved in those environments. They can then create better guidelines for physicians and more tailored treatments.

In these cases, a picture is worth a thousand words. Check out the Mount Sinai diabetes maps in this article from Fast Company’s Co.Exist.

The Ultimate EHR

One of the biggest dreams of all is a fully digital and unprecedentedly comprehensive electronic health record (EHR). You may also see it referred to as an electronic medical record (EMR). This one precious file would contain every piece of information about a patient’s health, would always be up to date, and could be shared across any network.

Are we there yet? Not by a long shot. But that doesn’t stop data scientists from fantasizing about a file that contains:

  • Structured data from every one of the patient’s health care providers (e.g., lab results, demographic information, prescription histories, etc.)
  • Unstructured professional data (e.g., notes from clinicians, physicians, PCPs and nurses)
  • Unstructured personal data (e.g., notes from in-home caregivers, family members, patients and social workers)
  • Saved images (e.g., X-rays and MRI scans)
  • Genomic data
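
As a rough sketch of how the categories above might sit together in one record, here is a small Python data structure. The class name, field names and sample entries are hypothetical. Real EHR systems follow far more detailed standards, and this is only meant to show the shape of the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HealthRecord:
    patient_id: str
    # Structured data grouped by category, e.g. lab results or prescriptions
    structured: Dict[str, List[dict]] = field(default_factory=dict)
    # Unstructured notes from clinicians, physicians and nurses
    professional_notes: List[str] = field(default_factory=list)
    # Unstructured notes from caregivers, family members and patients
    personal_notes: List[str] = field(default_factory=list)
    # References to stored images such as X-rays and MRI scans
    image_refs: List[str] = field(default_factory=list)
    # References to genomic data files
    genomic_refs: List[str] = field(default_factory=list)

    def add_structured(self, category: str, entry: dict) -> None:
        """Append one structured entry under the given category."""
        self.structured.setdefault(category, []).append(entry)

    def section_counts(self) -> Dict[str, int]:
        """Return how many items each part of the record holds."""
        return {
            "structured": sum(len(v) for v in self.structured.values()),
            "professional_notes": len(self.professional_notes),
            "personal_notes": len(self.personal_notes),
            "images": len(self.image_refs),
            "genomic": len(self.genomic_refs),
        }


record = HealthRecord(patient_id="patient-001")
record.add_structured("lab_results", {"test": "HbA1c", "value": 6.1})
record.add_structured("prescriptions", {"drug": "metformin", "dose_mg": 500})
record.professional_notes.append("Follow-up in three months.")
record.image_refs.append("xray-2013-04-02")
counts = record.section_counts()
```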

The U.S. Department of Health and Human Services is already overseeing a plan to ensure the widespread adoption of EMRs, but success would mean complete centralization with the government near the center.

One record to rule them all? Stranger things have happened.

Data Risks and Regulations

The Challenges Ahead

There are plenty of hurdles to creating a data-driven health care industry. Some are technical, some emotional. Health care providers have had decades to accumulate paper records, inefficiencies and entrenched routines. A remedy will not be quick.

And some say it shouldn’t. At least, not without a hard look at patient privacy, data ownership and the overall direction of U.S. health care.

Patient Privacy

Let’s say data scientists achieve their dream of a digital EHR. Who will have access to it? Who will own the data? How will it be protected? HIPAA explicitly states that covered entities must:

“Protect individuals’ health records and other identifiable health information by requiring appropriate safeguards to protect privacy, and setting limits and conditions on the uses and disclosures that may be made of such information without patient authorization.”

But the record of U.S. medical providers isn’t exactly encouraging. Privacy Rights Clearinghouse indicates that from 2005 to December 2013, the health care industry experienced 1,066 security breaches (e.g., unintended disclosure or hacking) that resulted in compromised data.

How Much Data is Too Much?

EHRs also raise questions about relevance. How much information is too much? After all, your financial records, your social media photos, your sexual history, your location and your weekly liquor store bill are all relevant to your overall health. Should these be included in your digital file?

Now add the constant stream of data from body sensors and in-home devices. A competent hacker will know where you are, where you’ve been and even whether or not you’re likely to experience a heart attack in the next 24 hours.

Muddling it Out

The big-data dream assumes that health care providers are organized, efficient institutions with dedicated technology teams on hand. Reality check: they’re not.

Instead, they are oversized, secretive, often competitive institutions saddled with a variety of data-related problems:

  • Information locked in inaccessible organizational “silos”
  • Staff unwilling or unable to change their practices
  • Poor communication from data scientists
  • Missing, flawed or misinterpreted data sets
  • Conflicting results

The list goes on.

In the end, too, we have to remember that data science is not a panacea for the health care industry’s ills. We’d need a much bigger pill than that.

History of Data Analysis and Health Care

“I’ve been asked a lot for my view on American health care. Well, ‘it would be a good idea,’ to quote Gandhi.” – Paul Farmer

In the hazy days of 1950, soon after the outbreak of the Korean War, a fresh-faced physicist/dentist named Robert Ledley was given an offer he couldn’t refuse:

“The army called me down to New York… and the colonel said to me, ‘Well, if you volunteer to be in the army, then you’ll become a lieutenant, an officer. But if you don’t volunteer, you’ll be drafted anyway, and sent to boot camp.’ So I volunteered.”

After a stint at Walter Reed General Hospital, Ledley was offered a job at the National Bureau of Standards in 1952. There he encountered the Standards Eastern Automatic Computer (SEAC). It was love at first sight.

Ledley realized that SEAC could perform complex equations that no human could hope to tackle. He saw that physics, mathematics and computers might be combined to solve biomedical problems.

In 1959, Ledley and a radiologist named Lee B. Lusted teamed up to publish “Reasoning Foundations of Medical Diagnosis.” As a primer in operations research techniques, the article covered symbolic logic, probability and value theory, and educated physicians on the potential of databases and electronic diagnosis. The drive to computerize medicine had begun in earnest.

Medical Informatics

Ledley wasn’t the only researcher interested in the potential of computer science. The concept of health informatics – the study of resources and methods for managing health information – had been kicking around in other countries for years.

But for a post-war, cash-rich U.S. government, this field was especially intriguing. Ledley’s work helped pave the way for change. Between 1960 and 1964, the NIH spent over $40 million establishing dozens of technology-led biomedical research centers.

The 1960s also saw the introduction of MEDLARS/MEDLINE, a computerized bibliographic database compiled by the National Library of Medicine, along with research and experimentation with programming languages.

MUMPS

One of these was MUMPS (Massachusetts General Hospital Utility Multi-Programming System). Developed by Neil Pappalardo, Curtis Marble and Robert Greenes from 1966 to 1967, MUMPS powered the creation and integration of medical databases. By the early 1970s, it was the most commonly used programming language for clinical applications.

It was, you might say, the decade of peace, love and data:

  • Early 1960s: Morris Collen, a physician with Kaiser Permanente’s Division of Research, develops a system to automate the 10-year-old multiphasic health screening exam and a prototype electronic health record.
  • 1965: Work commences on Systematized Nomenclature of Pathology (SNOP), an effort to systematize the language of pathology for use in computer systems. In 1974, this was extended to include all medical terms – the famous Systematized Nomenclature of Medicine (SNOMED).
  • 1965: Congress amends the Social Security Act to create Medicare and Medicaid. This puts pressure on medical providers to provide documentation of care. Interest in health informatics receives a significant boost.

The health care industry began using computers to provide statistical reports to the government, create patient care applications, centralize medical records and, most importantly, organize billing.

The Promise of PROMIS

Despite all this good will, computerized medicine in the 1970s remained a hodgepodge. Computer manufacturers did not always understand the hospital market, and hospitals did not always understand computers.

Healthcare providers that did take the plunge ran into problems with machine speed and processing capabilities. Even worse, there was little integration between systems.

Nevertheless, physicians forged onward:

  • By 1968, Dr. Lawrence Weed was working on the PRoblem Oriented Medical Information System (PROMIS). Though it did not gain wide acceptance, PROMIS was a strong attempt to establish an integrated system covering all aspects of health care, including patient treatment, as well as the Problem-Oriented Medical Record (POMR).
  • In a similar project, Dr. Homer Warner and his colleagues at Intermountain Healthcare designed the HELP system (Health Evaluation Through Logical Processing) during the late 1960s and early 1970s. HELP provided one of the nation’s first versions of an electronic medical record.

And, of course, there was the infant Internet. By the end of the decade, the idea of online data communications technology had spread beyond large teaching medical centers. Physicians were beginning to receive instant access to computerized databases.

Everybody Wins

The arrival of affordable, increasingly powerful technology in the late 1970s and early 1980s accelerated developments. Large, multi-application vendors stepped up to meet industry demand. Organizations started developing protocols for health care information and data collection.

But it was not until the latter half of the 1980s and the early 1990s that the focus of health care technology began to shift towards clinical integration and improving the quality of patient care.

That’s because everyone – finally – caught up to each other. Thanks to the Internet, networked technologies, large-scale databases and the development of relational database software, data was suddenly everywhere.

Even politicians sat up and took notice. In 1996, Edward Kennedy and Nancy Kassebaum pushed the Health Insurance Portability and Accountability Act (HIPAA), aka the Kennedy–Kassebaum Act, through Congress. Designed to encourage the use of electronic data interchange in the U.S. health care system, HIPAA mandated the establishment of national standards for electronic health care transactions. It also included provisions for patient privacy.

The U.S. goes HITECH

Then came the new millennium and the “oughts.” And here’s where things got complicated. As the authors of the McKinsey report, The Big Data Revolution in U.S. Health Care, note:

  • Payors and providers began to digitize patient records.
  • Pharmaceutical companies continued to transfer years’ worth of research and development data into medical databases.
  • The federal government and similar stakeholders started allowing public access to a treasure trove of health-care knowledge, including clinical trial data and information on patients covered under public insurance.

Devices, gadgets and PDAs became ubiquitous in clinical settings. Storage capabilities continued to increase. The flow of available information surged to flood force, pushed by:

  • Pharmaceutical research data (e.g., clinical trial results)
  • Clinical data (e.g., patient records)
  • Activity and cost data (e.g., estimated procedure costs)
  • Patient behavior data (e.g., health purchases history)
  • Biological data (e.g., genomics)

In 2004, President George W. Bush responded by establishing the Office of the National Coordinator for Health Information Technology in order to encourage technological development and electronic information flow in the field.

This role took on new meaning when Congress passed the Health Information Technology for Economic and Clinical Health Act (HITECH) in 2009. In doing so, the government indicated it was willing to spend billions to promote and expand the adoption of health-information technology and create a nationwide network of electronic health records (EHRs).


MastersInDataScience.org is owned and operated by 2U, Inc.
© 2U, Inc. 2018
