Data Science in the Pharmaceutical Industry

Opportunities in Pharmaceutical Data Science

The Promise of Big Data
For every 5,000 compounds starting in the laboratory, five are tested in humans and one makes it to market.

Moreover, developing each new drug takes approximately 10 years and costs $2-3 billion on average. That adds up to a vast amount of molecular and clinical data stored in proprietary networks, just ripe for analytics.
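To put the attrition figures above in perspective, the short sketch below turns them into stage-by-stage success rates. It uses only the numbers quoted in the text (5,000 compounds, five tested in humans, one reaching market); the variable and function names are illustrative.

```python
# Figures quoted in the text above.
compounds_in_lab = 5_000
tested_in_humans = 5
reach_market = 1


def rate(successes: int, total: int) -> float:
    """Return successes / total as a fraction."""
    return successes / total


lab_to_human = rate(tested_in_humans, compounds_in_lab)
human_to_market = rate(reach_market, tested_in_humans)
lab_to_market = rate(reach_market, compounds_in_lab)

print(f"Lab -> human testing:    {lab_to_human:.2%}")    # 0.10%
print(f"Human testing -> market: {human_to_market:.0%}")  # 20%
print(f"Lab -> market:           {lab_to_market:.2%}")    # 0.02%
```

In other words, roughly one compound in a thousand reaches human testing, and one in five of those reaches the market.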

Increased Collaboration

There has been a push in recent years to increase collaboration – both internally and with the outside world. To gain a competitive edge, increase their expertise and enlarge their ever-growing databanks, pharmas are now working with:

  • External Partners: These might include Contract Research Organizations (CROs) or data management companies. For example, in 2013, GlaxoSmithKline announced a partnership with SAS to provide a globally accessible private cloud where the pharmaceutical industry can securely collaborate around anonymized clinical trial information.
  • Academic Collaborators: To get a first look at compounds being developed outside of the company, Eli Lilly created the Phenotypic Drug Discovery Initiative. External researchers submit their compounds for screening and Lilly uses its proprietary tools and data to identify whether any of them have the potential to become drugs.
  • Customers and Health Professionals: Thanks to the explosive growth of social media, pharmas can personally reach out to their customers and physicians. They’re also conducting sentiment analysis of online physician communities, electronic medical records, and consumer-generated media to flag potential safety issues. These data can then be used to shape strategy throughout the pipeline progression.
  • Insurance Companies: By creating proprietary data networks where payors and providers can share, analyze and respond to outcomes and claims data, pharmas are able to enlarge their databanks far beyond clinical trials.

Predictive Analytics

The power to forecast the future has applications for drug discovery and avoiding negative outcomes.

In terms of drug discovery, pharmas spend a vast amount of money screening compounds to test in preclinical trials. To speed up the process, drug companies are using predictive models to search gargantuan virtual databases of molecular and clinical data. Analysts zoom in on likely drug candidates with the help of criteria based on chemical structure, diseases/targets and other characteristics.

For example, Numerate, which works with companies like Boehringer Ingelheim and Merck, designs its predictive models with specific drug targets and treatment goals in mind.
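As a rough illustration of criteria-based screening, the sketch below filters a tiny, made-up compound library using Lipinski's rule of five, a widely used rule of thumb for oral drug-likeness. This is a stand-in for the structure-based criteria mentioned above, not a predictive model and not the method used by Numerate or any company named here. The `Compound` class, the compound names and their property values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """Hypothetical record for one entry in a virtual compound library."""
    name: str
    mol_weight: float   # molecular weight in daltons
    log_p: float        # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five limits the compound exceeds."""
    return sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def shortlist(compounds, max_violations=1):
    """Keep compounds with at most `max_violations` rule-of-five violations."""
    return [c for c in compounds if lipinski_violations(c) <= max_violations]


# Made-up example data, for illustration only.
library = [
    Compound("cmpd-A", 320.4, 2.1, 2, 5),
    Compound("cmpd-B", 712.9, 6.3, 7, 12),
    Compound("cmpd-C", 498.0, 5.4, 1, 8),
]

candidates = shortlist(library)
print([c.name for c in candidates])  # ['cmpd-A', 'cmpd-C']
```

A real screening pipeline would compute these properties from molecular structures and combine many more criteria, but the pattern is the same: score each candidate, then keep only those that pass.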

In relation to avoiding negative outcomes, predictive modeling can also be used to short-circuit potential disasters such as deaths from risk factors.

Predictive analytics can also be used to optimize clinical trials by selecting suitable patients through genetic clustering, and to improve marketing efforts.

Crowdsourced Competitions

In recent years, pharmaceutical companies and institutions have sponsored crowdsourced contests to predict patient and clinical outcomes, sales patterns, molecule activity, and anything else involving big data.

Examples of these include:

  • Eli Lilly’s innovation challenge for inflammatory bowel disease (2019)
  • Merck’s annual Innovation Cup, where students from all over compete by developing business plans
  • The AstraZeneca Health and Science Innovation Challenge, which opens in 2021, allows entrants to pitch ideas and solutions to solve any health challenge.

More Effective Drug Trials

Data scientists can help to reduce the costs of clinical trials by enabling drug companies to implement:

  • Data-Based Patient Selection: Pharmas use multiple data sources – including social media and public health databases – and more targeted criteria (e.g., genetic information) to identify which populations would work best in trials.
  • Real-Time Monitoring: Companies now monitor real-time data from trials to identify safety or operational risks and nip problems in the bud.
  • Drug Safety Assurance: Data scientists can even tap into side-effect data to predict whether a compound will provoke an adverse reaction before it even reaches trial. Working with the University of California, San Francisco, researchers at Novartis have built computer models to do just that.
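The real-time monitoring idea above can be sketched as a running check on incoming trial reports. The example below is a minimal illustration, assuming each report is a `(site_id, had_adverse_event)` pair; the `flag_sites` function, the site names and the thresholds (`min_patients`, `max_rate`) are hypothetical and are not regulatory or statistical standards.

```python
from collections import defaultdict


def flag_sites(events, min_patients=20, max_rate=0.10):
    """Return the set of trial sites whose running adverse-event rate
    exceeded `max_rate` at any point after `min_patients` reports.

    `events` is an iterable of (site_id, had_adverse_event) pairs,
    processed in arrival order. A site stays flagged once flagged.
    """
    totals = defaultdict(int)    # reports seen per site
    adverse = defaultdict(int)   # adverse events seen per site
    flagged = set()
    for site, had_event in events:
        totals[site] += 1
        adverse[site] += int(had_event)
        enough_data = totals[site] >= min_patients
        if enough_data and adverse[site] / totals[site] > max_rate:
            flagged.add(site)
    return flagged


# Made-up stream: site-1 has 1 adverse event in 20 reports (5%),
# site-2 has 4 adverse events in 20 reports (20%).
events = [("site-1", i == 0) for i in range(20)]
events += [("site-2", i < 4) for i in range(20)]

print(flag_sites(events))  # {'site-2'}
```

In practice, trial monitoring relies on formal statistical methods rather than a fixed cutoff, but the sketch shows the basic loop: update counts as data arrives and raise a flag as soon as a site crosses a limit.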

Targeted Marketing and Sales

Once upon a time, pharmaceutical companies would send their reps on lengthy doctors’ visits and invest in expensive, broad-scale product promotion.

In a March 2013 survey from Accenture, Life in the New Normal: The Customer Engagement Revolution, respondents noted that around 25% of their pharmaceutical marketing was delivered over a digital platform and that 87% intended to increase their use of analytics to target spending and drive improved ROI. Six years later, in 2019, many pharmaceutical companies were planning to spend more than half of their budgets on digital marketing.

Some of that money is likely to go into monitoring doctors’ therapeutic tastes, geographic trends, peak prescription rates – anything that has a direct relevance to the sales cycle. This data then feeds into:

  • Predictive Analytics: Drug companies are employing predictive methods to determine which consumers and physicians are most likely to utilize a drug and create more targeted on-the-ground marketing efforts.
  • Sophisticated Sales: Pharmas are providing drug reps with mobile devices and real-time analytics on their prospects. Reps can then tailor their agenda to suit the physician. Afterward, the sales team can analyze the results to determine whether the approach was effective.

Better Patient Follow-Ups

With the development of miniature biosensors, sophisticated at-home devices, smart pills and bottles, smartphones and health apps, monitoring a patient’s health has never been easier. Pharmaceutical companies are increasingly interested in how the real-time data from these tools can be used to support R&D, analyze efficacy and increase drug sales.

In addition to knowing how their drugs are being used, companies also typically want to hear how customers view their products. Opinions about new drugs are often generated through patient/physician and patient/patient experiences in a way that creates messy, unstructured data sets.

However, if properly organized and analyzed, this data can be a rich trove of information on:

  • Patterns in drug-drug interactions
  • What drives patients to stop taking medications
  • Which patients will not stick to their prescriptions

Data Risks and Regulations

The Challenges Ahead

By the time it reached the 21st century, the pharmaceutical industry had amassed quantities of structured and semi-structured data in separate, mutually inaccessible “silos”.

Unfortunately, it’s having trouble bringing those silos together. However analysts slice the numbers, the ultimate goal is to turn the data into information that can play a strategic business role. The prerequisite for that is making the data sources communicate with one another in a meaningful way.

What’s more, drug companies are coping with a flood of unstructured information – including social sentiment – that’s coming at them from outside sources. Integrating, manipulating, organizing, and interpreting this data to support some coherent course of action is causing more than a few headaches.

Working with the Fed

Data integration raises the issue of patient privacy. The more databases are shared among institutions, CROs, partners, software companies, etc., the more pharmas run the risk of exposing sensitive patient information to the eyes of those who shouldn’t be seeing it.

That means drug companies need to reduce their risk of running afoul of federal laws and regulations. For example, HIPAA and its younger brother, the 2009 HITECH Act, clearly state that covered entities must:

“Protect individuals’ health records and other identifiable health information by requiring appropriate safeguards to protect privacy, and setting limits and conditions on the uses and disclosures that may be made of such information without patient authorization.”

Data scientists should also be aware of the FDA’s Sentinel Initiative. This is a legally mandated electronic-surveillance system linking and analyzing health care data on millions of patients from multiple sources. Its purpose is to collect data on safety issues and enable regulators to take quick action.

History of Data Analysis & Pharma

In the late 19th century, Colonel Eli Lilly, founder of Eli Lilly and Company, had:

  • Become an independent drug manufacturer
  • Automated the creation of pills and capsules
  • Hired a permanent R&D staff
  • Instituted a raft of quality assurance measures

By the end of his life, he was a millionaire.

Lilly, Heinrich Emanuel Merck, Charles Pfizer, Friedrich Bayer, Edward Robinson Squibb – before they were brand names, they were pioneers of the pharmaceutical industry. They were also some of the first men to use data-driven methods to cut costs and improve products.

The Fog Of War

Fast-forward to the 20th century. Witness the rise of air travel, fortune cookies, Einstein’s Theory of Relativity, jazz – and the megalithic manufacturers we know today.

The shift from small independent companies to conglomerates got started in the 1940s. With so many lives on the line, World War II spurred intense collaboration between governments and pharmaceutical companies. During the war, a mind-boggling collaborative effort between the government, Merck, Pfizer and Squibb (among others) resulted in the mass production of penicillin.

The Golden Age of Development

By the time the troops came back from the war, pharma was becoming big business. The second half of the 20th century saw many pharmaceutical breakthroughs, including ibuprofen, the contraceptive pill and Valium, as well as the launch of the war on cancer. Advances in genetics – including automated protein and DNA sequencing – and psychiatric treatments opened new markets.

This was also the age when data began to make its mark.

Take Electronic Data Capture (EDC). In the years leading up to the 1970s, pharmaceutical companies had been receiving clinical research data on paper forms. This often resulted in data entry errors and delays.

  • To circumvent the problem, the Institute for Biological Research and Development (IBRD) formed an alliance with Abbott Pharmaceuticals.
  • Each clinical investigator would have access to a computer and be able to enter clinical data directly into the IBRD mainframe.
  • After cleaning up the data, IBRD supplied reports directly to Abbott.

The 1970s also saw the introduction of the Cambridge Structural Database (CSD) and the Protein Data Bank (PDB), as well as genetics-focused resources like the Staden Package for DNA Sequences.

The Golden Age Goes Platinum

With the arrival of Professor Norman Allinger’s Journal of Computational Chemistry in 1980, the first decade of computer-assisted drug development had begun. As Sean Ekins discusses in his book, Computer Applications in Pharmaceutical Research and Development, scientists were now empowered to use computational chemistry programs on personal computers.

Software companies blossomed, anxious to provide drug companies with useful tools. Examples of their functions included:

  • Predictive Analytics: Based on statistical models, Dr. Kurt Enslein’s TOPKAT software could predict the toxicity of a molecule from its structural components.
  • 3D Molecular Modeling: Graphics software gave chemists the ability to view molecular structures in 3D and create virtual models.
  • Data Analysis: Statisticians used data management programs like SAS to analyze clinical data; computational chemists could now get, in minutes or hours, statistical analyses that previously took weeks or months.

Faced with a massive influx of data, Lilly became the first pharmaceutical company to purchase a supercomputer (the Cray-2), and many of its competitors soon followed.

1990s-2000s

The volume of information only increased with the advent of the World Wide Web and the new millennium:

  • Collaboration between internal departments and external research institutions became the norm.
  • Pharmaceutical companies started to market directly to consumers.
  • Demand for alternative medicines and nutritional supplements skyrocketed.
  • Software programs grew increasingly complex and sophisticated.
  • Discoveries in genetics and the sequencing of the genome generated new drugs and sources of revenue.

Effective data mining became critical to drug development.

Last updated: June 2020

MastersInDataScience.org is owned and operated by 2U, Inc.
© 2U, Inc. 2021