Statisticians apply statistical theories and methods to collect, analyze and interpret quantitative data. They work for companies involved in market research and public opinion, for industries concerned with areas such as quality control and product development, and – frequently – for local, state and federal governments. Hard-core theoretical statisticians usually find themselves in research and academia.
Depending on their level of experience, statisticians may be asked to:
- Tackle data-related challenges assigned by management
- Decide upon an appropriate strategy to collect data
- Extract data from existing sources or instigate new procedures (e.g. customer surveys, science experiments, opinion polls, etc.)
- Analyze and interpret data using statistical tools, algorithms, models and software (e.g. R, SAS, SPSS, etc.)
- Design new statistical models and data collection tools if needed
- Identify patterns, trends, and relationships within data
- Present statistical reports and data visualizations for diverse audiences
- Provide strategic recommendations/predictions and highlight any data limitations
- Develop and maintain statistical tools, databases and programs
- Regularly monitor data quality
- Work closely with key team members and subject experts (e.g. computer engineers, scientists, IT support, etc.)
Responsibilities vary with job title. Entry-level statistical analysts are usually tasked with standard data analyses and supervised by senior staff. Experienced applied statisticians may be able to propose projects to management, develop new products and processes, oversee statistical teams and work on their own research.
An Interview with a Real Statistician
Mikhail Popov is a data analyst at the Wikimedia Foundation (WMF), home of Wikipedia. Prior to that, he worked as a statistician/DBA at a neuropsychology research program (NRP) with the University of Pittsburgh & UPMC. He graduated from California State University, Fullerton, and then went on to graduate studies at Carnegie Mellon University in their Master’s in Statistical Practice program.
We spoke with Mikhail to learn more about the role of statisticians at the Wikimedia Foundation. Read on for the pros and cons of being a statistician, the programming languages Mikhail uses in his work, and his advice to students studying statistics.
Cons: Nobody knows what exactly you do (not their fault). The most popular response you’ll hear is “ugh, I hated the stats course I had to take in college.” Your peers in computer science are constantly inventing methods that your fellow statisticians already invented and published 50 years ago.
The skills are the same ones I listed in response to a later question, but I want to add one and emphasize another. First, I want to add that data visualization is a tremendously useful skill to learn and maintain. The best plot tells a story and guides the viewer/reader to a conclusion that otherwise takes at least one paragraph to reach with text. Second, I want to emphasize how important it is to be able to interpret results to non-statisticians. Telling a biologist what the hazard ratio is from your survival analysis is useless if you don’t interpret it. Hazard ratios are so ridiculously inaccessible, that in 2014(!), a statistician published an article on reformulations of the hazard ratio to make it more interpretable. Whether people have found a use for those reformulations in practice is another story, but the point is that raw parameter estimates are useless without an accessible narrative (and an accompanying data visualization).
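To make the point about interpretation concrete, here is a minimal sketch of one well-known way to restate a hazard ratio in plainer terms. It is not necessarily the reformulation from the 2014 article mentioned above. It assumes proportional hazards and independent subjects, in which case the probability that a subject in the higher-risk group has the event before a comparable subject in the reference group is HR / (1 + HR). The function names and the group labels are hypothetical, chosen only for illustration.

```python
def prob_event_first(hazard_ratio: float) -> float:
    """Probability that a subject in the comparison group has the event
    before a comparable subject in the reference group.

    Assumes proportional hazards and independent event times.
    """
    if hazard_ratio <= 0:
        raise ValueError("hazard ratio must be positive")
    return hazard_ratio / (1.0 + hazard_ratio)


def describe(hazard_ratio: float, group: str = "treated",
             reference: str = "control") -> str:
    """Turn a raw hazard ratio into a sentence a non-statistician can read."""
    p = prob_event_first(hazard_ratio)
    return (f"A {group} subject has about a {p:.0%} chance of experiencing "
            f"the event before a comparable {reference} subject.")


if __name__ == "__main__":
    # Illustrative value only, not taken from any real study.
    print(describe(2.0))
```

A hazard ratio of 1 maps to a 50% chance (no difference between groups), which gives collaborators a familiar reference point that the raw estimate lacks.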
– a nontrivial understanding of the real-world problem and the population for whom the research question is relevant
– judgments such as those about the relevance and representativeness of the data
– judgements about whether the underlying model assumptions are valid for the data at hand
– judgements about causality and the role of confounding variables as possible alternative explanations for observed results
– the ability to interpret and communicate the results of a statistical analysis so non-statisticians can understand the findings”
The best statisticians don’t just throw their data into a black box (e.g., neural network) and rely on machine learning algorithms to do their work for them. Instead, they develop an understanding and intuition for the data they’re analyzing. The best statisticians are also great communicators. They can talk to their clients and collaborators through all the stages of an experiment or study. Those people may not be as well-versed in statistics but are usually experts in the subject matter, so it is crucial to work with them in unison.
The best statisticians poke and prod at assumptions and aren’t afraid of being wrong. To see what I mean, play around with this interactive puzzle from The New York Times.
– Statisticians have gained access to new types and amounts of data. Big Data, if you will, on transactions, customer purchasing/browsing histories, and web sessions. Every major web-based member of commerce out there is tracking and logging its users’ behavior and engagement. Gleaning insights from that is a nontrivial task.
– Statisticians have become more involved in the data collection process and privacy discussions. You don’t need to know EVERYTHING. You can design your system to collect only the data that will be necessary to aid your decision-making process without sacrificing user/customer privacy and putting them at risk in case of a data breach. It’s large-scale experimental design.
– Statisticians should be guides and make recommendations to the decision makers. Data should not be the be-all and end-all decision-maker. We talk a lot about “data-driven decision-making” when we really mean “data-informed decision-making,” because ultimately it should be a person making the decisions that impact other people.
This last point is where I think data scientists could benefit tremendously from a formal education in statistical science, because that education (hopefully) involves learning and thinking about ethics. In school, we spent time reading about and discussing case studies with ethical concerns. I suspect there’s an empathy element that is missing when you come into data science from a purely computer science/engineering background, but I hope I’m wrong.
But the biggest advice I can offer is: Don’t talk only with other statisticians. Engage with students and practitioners in other disciplines. Expose yourself to people, topics and issues in computer science, biology, chemistry, environmental science, psychology, history, performance art, visual art (especially graphic design), interactive art, user experience design, public policy, social and political activism, etc. The best statisticians have a broad perspective and understanding of the world around them. If your role will be to offer statistical expertise to different teams in the organization, you will have to become a mini-expert in whatever the teams are working on. You will never not work with people from other backgrounds in this profession, and communicating is a crucial component of the craft.
The best-paid statisticians in 2015 lived in Seattle. According to PayScale, the median pay was $86,703 (20% above the national average). In second place was San Francisco, where pay was $84,000 (16% above). Both of these places are known technology hubs.
Geography grows more interesting when it comes to statistical analysts. According to PayScale, the top-paid statistical analysts live in Chicago, where the median pay is $70,280 (17% above the national average). Of course, this may be because PayScale did not receive enough survey responses from coastal cities.
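The "percent above the national average" figures can be checked with simple arithmetic. The sketch below assumes the baseline is the 2015 national median for statisticians listed in this profile ($72,149); PayScale's exact baseline is not stated here, so treat that as an assumption. The helper name `percent_above` is hypothetical.

```python
def percent_above(local_pay: float, national_pay: float) -> float:
    """Return how far local pay sits above national pay, in percent."""
    return (local_pay / national_pay - 1.0) * 100.0


# Assumption: the national baseline is the 2015 median listed in this profile.
NATIONAL_MEDIAN_STATISTICIAN = 72_149

city_medians = {"Seattle": 86_703, "San Francisco": 84_000}

if __name__ == "__main__":
    for city, pay in city_medians.items():
        pct = percent_above(pay, NATIONAL_MEDIAN_STATISTICIAN)
        print(f"{city}: {pct:.0f}% above the national median")
```

Under that assumption the results round to 20% for Seattle and 16% for San Francisco, matching the figures quoted above.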
Statistician
Average Salary (2015): $75,069 per year
Median Salary (2015): $72,149 per year
Total Pay Range: $43,597 – $113,611
Median Pay (2012): $75,560 per year
Statistical Analyst
Average Salary (2015): $67,382 per year
Median Salary (2015): $60,507 per year
Total Pay Range: $40,420 – $93,277
Senior Statistical Analyst
Average Salary (2015): $88,665 per year
What Kind of Degree Will I Need?
At minimum, junior data/statistical analysts will need a bachelor’s degree in statistics, applied math, computer science or a related field. Since you will be working with complex statistical software programs, a healthy balance between hard-core math and IT courses is recommended.
Plan for a graduate degree. Companies looking for true statisticians prefer applicants to hold a master’s or a PhD in applied statistics or math and have a strong background in their chosen industry (e.g. finance, biochemistry, computer engineering, etc.). Conveniently, we profile Master’s in Applied Statistics Programs.
What Kind of Skills Will I Need?
- Statistics (e.g. hypothesis testing and summary statistics)
- Math (e.g. linear algebra, calculus and probability)
- Machine learning tools and techniques
- Software engineering skills (e.g. distributed computing, algorithms and data structures)
- Data mining
- Data cleaning and munging
- Data visualization and reporting techniques
- Unstructured data techniques
- R and/or SAS languages
- SQL databases and database querying languages
- Python (most common), C/C++, Java, Perl
- Big data platforms like Hadoop, Hive & Pig
- Cloud tools like Amazon S3
You’ll notice that we’ve echoed the list from our Data Scientist profile. Advanced programming skills will help prepare you for “hybrid” stats/data science jobs now appearing on employment sites.
- Analytical Problem-Solving: Identifying complex challenges; employing the right mathematical approach/methods to make the maximum use of time and human resources.
- Logic & Reasoning: Assessing the strengths and weaknesses of data and statistical methods; understanding the implications of new developments in technology and data mining.
- Effective Communication: Explaining your mathematical techniques and discoveries to technical and non-technical audiences.
- Industry Knowledge: Understanding the way your chosen industry functions and how data are collected, analyzed and utilized.
What About Certifications?
Professional certifications can be powerful additions to your résumé. Ask your mentors for advice, check job listing requirements and consult articles like Tom’s IT Pro’s “Best Of” certification lists to determine which acronyms employers will recognize and respect.
The American Statistical Association (ASA) has two levels of accreditation. Candidates should attain the entry-level Graduate Statistician (GStat) certification before applying for the full PStat® accreditation.
PStat® certification is based on a professional portfolio, not an exam. Applicants must provide proof of educational credentials (typically a graduate degree in statistics or a related quantitative field), work experience and their commitment to professional development. Work samples and supporting letters from referees are also required.
Run by SAS, this accreditation is explicitly aimed at professionals who use SAS/STAT software to conduct and interpret complex statistical data analyses (i.e. statisticians). In the certification exam, candidates must prove their knowledge of areas like ANOVA, regression, predictive modeling and logistic regression.
Note: For more useful big data qualifications, check out the certifications section in our Data Scientist profile.
Jobs Similar to Statistician
Data Scientists vs. Statisticians
You’ll find a lot of crossover between job descriptions for statistical analysts and Data Analysts, and statisticians and Data Scientists. This has led to a great debate about whether data science is just statistics, sexed up.
Those who argue against the “sexing up” theory note that:
- Statisticians and Data Analysts are primarily concerned with set tasks – from analyzing migration patterns to calculating average conversation times for call center agents. They are given parameters and do their best to collect and analyze information from conventional sources.
- Data Scientists think outside the structured box. They create their own questions/projects and use a much wider range of tools – only some of which are statistical – to find unique connections within big data.
Of course, experienced statisticians have been thinking outside the box since the dawn of the field. However, thanks to the surge of technology, those who wish to call themselves data scientists must now have formidable software engineering, machine learning and predictive analytics skills.
Statistician Job Outlook
The future looks good for statisticians. The BLS is forecasting employment to grow 27% from 2012 to 2022, much faster than the average for all occupations. Businesses, financial firms, government agencies, pharmaceutical companies and research groups need qualified statistical experts to make sense of the big data tsunami.
This projection comes with a general warning. If statisticians wish to remain relevant in the marketplace, they must be comfortable with data visualization, data munging, machine learning, AI and more. They have to be prepared to formulate open-ended questions, create software programs and teach their non-mathematical colleagues how to avoid analytical mistakes.
It shouldn’t be too hard to accomplish. Statistics departments have recognized the paradigm shift towards technology, and are increasingly focusing curricula on practical applications. If this trend continues, it’s quite possible that the terms “data scientist” and “applied statistician” will merge into one.