On any given day, insurance data scientists may gather data from:
What’s more, the advent of cloud computing helps companies to aggregate and store it all.
These sources tell insurers far more than did historical data from policy administration systems, claims management applications and billing systems, and the mortality reports of yesteryear. Through a judicious analysis of big data, insurers improve their pricing accuracy, create customized products and services, forge stronger customer relationships and facilitate more effective loss prevention.
Personalized Risk Pricing
To match that level of knowledge in the age of decentralization and the Internet, the insurance industry has turned to big data. Insurance data scientists combine analytical applications – e.g., behavioral models based on customer profile data – with a continuous stream of real-time data – e.g., satellite data, weather reports, vehicle sensors – to create detailed and personalized assessments of risk.
Picture a world in which wireless “telematics” devices transmit real-time driving data back to an insurance company.
Pay-as-you-drive (PAYD) is straightforward. It charges customers based on the number of miles or kilometers driven. Hollard Insurance, a South African insurer, has six mileage options.
But PAYD does not take driving habits into account. Pay-how-you-drive (PHYD) plans use telematics to monitor a wide variety of factors – speed, acceleration, cornering, braking, lane changing, fuel consumption – as well as geo-location, date and time. If an accident occurs, the insurance company can reconstruct the situation.
Auto insurers can then provide customers with driving scores, ideas for improvement and individual pricing.
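To make the difference between the two plan types concrete, here is a minimal sketch in Python. It is not any insurer's actual model: the `Trip` fields, the penalty weights, the per-mile rate and the 20% discount band are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Trip:
    """One telematics trip summary (hypothetical fields)."""
    miles: float
    hard_brakes: int
    rapid_accels: int
    speeding_seconds: int


def payd_premium(miles: float, rate_per_mile: float, base_fee: float) -> float:
    """PAYD: a flat fee plus a charge for distance driven, nothing else."""
    return round(base_fee + miles * rate_per_mile, 2)


def driving_score(trips: Iterable[Trip]) -> float:
    """PHYD: score from 0 (worst) to 100 (best) based on risky events per 100 miles.

    The weights 4.0, 3.0 and 0.05 are illustrative assumptions.
    """
    trips = list(trips)
    total_miles = sum(t.miles for t in trips)
    if total_miles <= 0:
        raise ValueError("need at least some recorded mileage")
    per_100_miles = 100.0 / total_miles
    penalty = (
        4.0 * sum(t.hard_brakes for t in trips)
        + 3.0 * sum(t.rapid_accels for t in trips)
        + 0.05 * sum(t.speeding_seconds for t in trips)
    ) * per_100_miles
    return max(0.0, 100.0 - penalty)


def phyd_premium(base_premium: float, score: float) -> float:
    """Scale a base premium linearly: score 100 gives 20% off, score 0 adds 20%."""
    factor = 1.0 - 0.2 * (score - 50.0) / 50.0
    return round(base_premium * factor, 2)


if __name__ == "__main__":
    trips = [Trip(miles=100, hard_brakes=2, rapid_accels=1, speeding_seconds=60)]
    score = driving_score(trips)
    print("PAYD premium:", payd_premium(500, rate_per_mile=0.06, base_fee=20.0))
    print("Driving score:", score)
    print("PHYD premium:", phyd_premium(1000.0, score))
```

Normalizing events per 100 miles keeps the score comparable between a customer who drives a little and one who drives a lot, which is the point of separating "how much" (PAYD) from "how" (PHYD).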
Following the lead of auto insurers, property insurance companies are assessing how they can use telematics to create usage-based home insurance. Possible data sources include:
Moisture sensors that detect flooding or leaks
Utility and appliance usage records
Sensors that track occupancy
Combine this with information from outside sources (e.g., local crime reports and traffic) and you can arrive at a multi-faceted, comprehensive assessment of one person’s property claim risk.
Going a step further, these sources can be used to protect a customer. For example, with predictive analytics, insurers can calculate the likelihood of an event such as theft or a hurricane and take steps to avoid pain and suffering – as well as, of course, big claims.
Life and Health Insurance
We live in a monitored world. Life and health insurance companies know this more than anybody. To create profiles of customer health and develop individual “well-being” scores, insurers are now casting the information net very wide indeed. They can collect:
Transactional data – e.g., where and what (junk food?) customers buy
Body sensors – i.e., devices that monitor consumption or alert the wearer to early signs of illness
Exterior monitors – e.g., data from workout machines
Social media – e.g., tweets about one’s personal health or state of mind
The insurance industry wants to improve customer satisfaction, and it is using big data to do so. The more an insurer knows about its customers’ quirks, the theory goes, the easier it is to keep them happy – and paying premiums.
Companies are combining all their direct customer connections – e.g., email, call center, adjuster reports, etc. – with indirect sources – e.g., social media, blog comments, website and clickstream data – to create a 360-degree profile of each individual.
With a 360-degree profile in hand, insurers have the means to refine their approach to sales, marketing and existing customer service.
Call Center Optimization
A call center is a seething cauldron of data. For insurance data scientists, it’s also a golden opportunity. These folks are investigating ways to:
Combine claims data with telecom data from CDRs to analyze call center activities and refine training guidelines.
Analyze raw telecom data, model temporal call patterns, and create a plan for staffing optimization.
Use sentiment analysis – e.g., speech analytics on call center conversations or Natural Language Processing (NLP) and text analytics on social media – to improve customer service.
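As a rough illustration of the staffing item above, the sketch below turns hourly call counts into an agent headcount using the Erlang C queueing formula. Erlang C is a standard call-center planning tool, but it is an assumption here, not a method the article attributes to any insurer. The function names, the 80%-answered-in-20-seconds target and the sample call volumes are all hypothetical.

```python
import math
from typing import Dict


def erlang_c(agents: int, traffic: float) -> float:
    """Probability that an arriving call has to wait (Erlang C).

    `traffic` is the offered load in erlangs (arrival rate x handle time).
    """
    if agents <= traffic:
        return 1.0  # the queue is unstable, so every call waits
    # Build Erlang B iteratively to avoid huge factorials, then convert to C.
    b = 1.0
    for k in range(1, agents + 1):
        b = traffic * b / (k + traffic * b)
    return b / (1.0 - (traffic / agents) * (1.0 - b))


def service_level(agents: int, traffic: float, aht_seconds: float,
                  answer_within: float = 20.0) -> float:
    """Fraction of calls answered within `answer_within` seconds."""
    if agents <= traffic:
        return 0.0
    wait_prob = erlang_c(agents, traffic)
    return 1.0 - wait_prob * math.exp(-(agents - traffic) * answer_within / aht_seconds)


def agents_needed(calls_per_hour: float, aht_seconds: float,
                  target: float = 0.8, answer_within: float = 20.0) -> int:
    """Smallest headcount that meets the service-level target."""
    traffic = calls_per_hour * aht_seconds / 3600.0
    n = max(1, math.floor(traffic) + 1)
    while service_level(n, traffic, aht_seconds, answer_within) < target:
        n += 1
    return n


def staffing_plan(calls_by_hour: Dict[int, float], aht_seconds: float) -> Dict[int, int]:
    """Map each hour of the day to the number of agents required."""
    return {hour: agents_needed(calls, aht_seconds)
            for hour, calls in sorted(calls_by_hour.items())}


if __name__ == "__main__":
    # Made-up hourly volumes, e.g. averaged from historical call records.
    print(staffing_plan({9: 120, 10: 240, 11: 180}, aht_seconds=300))
```

In practice the hourly volumes would come from a forecast of the temporal call patterns rather than a hand-typed dictionary, and the model ignores abandoned calls and breaks.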
Call-center employees are also in an ideal situation to sell customers additional products. One use of a 360-degree profile is to give that friendly voice on the phone the means to offer you the most relevant product for your particular needs.
Fraud costs insurance companies tens of millions each year. In response, insurers are marshaling their data resources and creating a multi-channel approach to fraud detection. They’re taking a very close look at both traditional structured data (such as claims and policy data) and textual data (such as adjuster notes, police reports and social media).
Pattern, graph and link analysis techniques
… not to mention a host of other handy tools, data scientists are cracking down on suspicious claims.
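One simple form of link analysis is to treat claims as nodes and connect any two that share a contact detail, then look at the clusters that emerge. The sketch below does this with a union-find structure. The field names (`phone`, `address`, `bank_account`) and the sample records are hypothetical, and a shared detail is only a prompt for review, not proof of fraud.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def linked_claim_groups(claims: Sequence[Dict[str, str]],
                        keys: Sequence[str] = ("phone", "address", "bank_account")
                        ) -> List[List[str]]:
    """Group claim ids that are connected through any shared attribute value."""
    parent = {c["id"]: c["id"] for c in claims}

    def find(x: str) -> str:
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen: Dict[tuple, str] = {}
    for claim in claims:
        for key in keys:
            value = claim.get(key)
            if not value:
                continue
            token = (key, value)
            if token in first_seen:
                union(first_seen[token], claim["id"])
            else:
                first_seen[token] = claim["id"]

    groups = defaultdict(list)
    for claim_id in parent:
        groups[find(claim_id)].append(claim_id)
    # Only clusters of two or more claims are interesting for review.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


if __name__ == "__main__":
    sample = [
        {"id": "A", "phone": "555-0101", "address": "1 Elm St"},
        {"id": "B", "phone": "555-0101", "address": "9 Oak Ave"},
        {"id": "C", "phone": "555-0199", "address": "9 Oak Ave"},
        {"id": "D", "phone": "555-0142", "address": "3 Pine Rd"},
    ]
    print(linked_claim_groups(sample))
```

Claims A and C share nothing directly, yet they land in the same cluster through B. That indirect connection is what link analysis is meant to surface.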
But perhaps the most complicated issue centers on a customer’s right to privacy. The finance industry in general is subject to a host of federal and state regulations that were enacted to protect consumer privacy and avoid discriminatory practices. These have been joined by a series of stringent rules on data collection – all of which an insurance legal department must be aware of.
Just as importantly, insurance companies may need to think about how they treat customer information. It’s all very well to imagine a world run by telematics, but many consumers are rightly afraid of ceding their personal data to a private company. Even the lure of more affordable premiums may not be enough to change their mind.
Insurance data scientists also have to be very careful they’re not mistakenly assuming the role of Big Brother – whether benevolent or not. Despite the hype, not even big data can tell you everything about a person.
In 2009, Blanchard went on disability leave due to a case of severe depression. One day, she went to the bank and discovered her health insurance benefits had been terminated.
The reason? Her insurance company, while trawling for data, had captured smiling photos on her Facebook page and decided she wasn’t depressed enough to be disabled.
History of Data Analysis and Insurance
“Most problems have either many answers or no answer. Only a few problems have a single answer.”
– Edmund C. Berkeley, Right Answers – A Short Guide for Obtaining Them (September 1969)
Insurance has always been a numbers game. What are the odds of a ship sinking? Of the head of the household dying prematurely? Of a wooden house burning down? Since the third millennium B.C., humans have been trying to protect themselves from the risks of living.
Keeping track of risks means knowing the numbers – the data. Increasingly sophisticated techniques were added over time to better calculate the odds. Three and a half centuries ago, “knowing the numbers” was maturing into the mathematics of risk – actuarial science – one of the foundations of modern data analysis.
Insurance companies were happy to offer citizens these products, but they were faced with a variety of statistical conundrums in understanding their data:
What was the likelihood of an insurance-holder dying within a certain time frame?
How should insurers price their products?
What percentage of premiums should they set aside to pay for future benefits (e.g., annuities)?
How much could they afford to invest elsewhere? What would the rate of interest be?
Graunt’s Table and Halley’s Annuities
Fortunately, mathematics had reached a point where it was ready to provide the answers. In 1662, John Graunt, a London haberdasher, conducted a study of mortality rolls in the city.
In his analysis, he found predictable patterns of longevity and death rates in groups of people of the same age. This gave him the means to calculate the probabilities of survival. His work formed the nucleus of the first “life table.”
Thirty years later, in 1693, Edmond Halley took a break from calculating the orbits of comets and descending to the bottom of the Thames in a diving bell to publish an article on life annuities.
Using accurate demographic data from Breslau, a city in Silesia, Halley produced a life table of the population, organized by age and survival. From this, he was able to calculate the premium amount that any man or woman, at each year of age, should pay in order to purchase a life annuity. From this time on, actuarial data multiplied.
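The calculation described above can be sketched in a few lines of Python. The life table below is a made-up toy, not the Breslau data, and the 6% interest rate is an assumption. The logic is that each future payment is weighted by the probability of surviving to receive it and then discounted back to today.

```python
from typing import Sequence


def survival_probability(lives: Sequence[float], age: int, years: int) -> float:
    """Chance that someone alive at `age` is still alive `years` later.

    `lives[x]` is the number of people in the table still alive at age x.
    """
    return lives[age + years] / lives[age]


def annuity_value(lives: Sequence[float], age: int, interest: float,
                  payment: float = 1.0) -> float:
    """Present value of `payment` paid at the end of each year survived."""
    discount = 1.0 / (1.0 + interest)
    remaining_years = len(lives) - 1 - age
    return payment * sum(
        discount ** t * survival_probability(lives, age, t)
        for t in range(1, remaining_years + 1)
    )


if __name__ == "__main__":
    # Toy table: survivors at ages 0 to 5 out of 1,000 births (invented numbers).
    toy_table = [1000, 900, 800, 600, 300, 0]
    for age in range(4):
        print(age, round(annuity_value(toy_table, age, interest=0.06), 4))
```

The fair purchase price at each age is this present value, which answers two of the insurers' questions at once: how to price the product and how much to set aside for future benefits.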
The Father of the Computer and His Descendants
Over the next few centuries, as the data accumulated, actuarial science grew both in popularity and in the complexity of its calculations. It’s no surprise that Charles Babbage, father of the computer, found time to dabble in it.
During the 1820s, he created actuarial tables from Equitable Society mortality data and published a handy guide to the life insurance industry titled A Comparative View of the Various Institutions for the Assurance of Lives.
But it was with the adoption of punch-card tabulating machines and, subsequently, early computer technology that the insurance industry began its march towards data dominance.
The next big shift came in the late 1960s and 1970s. More powerful machines and better software were coming into play. Online systems allowed workers to share information freely and conduct inquiries in real time. Investment in technology increased steadily.
The arrival of the Internet in the 1990s gave insurance data science a further boost:
Individuals were able to bypass intermediaries and shop for coverage on their own terms.
Company and consumer websites sprang up to satisfy demand.
Banks seized the opportunity to expand into the industry.
As a consequence, the amount of customer data being gathered and exchanged exploded.
At the same time, the costs of data processing and storage were dropping rapidly. In lieu of the mass modeling of the past, insurers were gaining the capabilities (and the technical tools) to calculate risk on an individual level. The era of big data was just around the corner.