Statisticians apply statistical theories and methods to collect, analyze and interpret quantitative data. They may work for companies involved in market research and public opinion, for industries concerned with areas such as quality control and product development, and – frequently – for local, state and federal governments. Hard-core theoretical statisticians usually find themselves in research and academia.
Depending on their level of experience, statisticians may be asked to:
- Tackle data-related challenges assigned by management
- Decide upon an appropriate strategy to collect data
- Extract data from existing sources or devise new collection procedures (e.g. customer surveys, scientific experiments, opinion polls)
- Analyze and interpret data using statistical tools, algorithms, models and software (e.g. R, SAS, SPSS, etc.)
- Design new statistical models and data collection tools if needed
- Identify patterns, trends, and relationships within data
- Present statistical reports and data visualizations for diverse audiences
- Provide strategic recommendations/predictions and highlight any data limitations
- Develop and maintain statistical tools, databases and programs
- Regularly monitor data quality
- Work closely with key team members and subject experts (e.g. computer engineers, scientists, IT support, etc.)
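Many of these duties reduce to a common core: summarize the data and quantify the uncertainty around that summary. A minimal sketch using only the Python standard library (the numbers are invented for illustration):

```python
import statistics

# Hypothetical daily metric values pulled from a survey or log extract.
sample = [14.2, 15.1, 13.8, 16.0, 14.9, 15.5, 14.4, 15.8, 13.9, 15.2]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
se = sd / n ** 0.5              # standard error of the mean

# Approximate 95% confidence interval using the normal quantile;
# with n this small, a t quantile would widen the interval slightly.
z = statistics.NormalDist().inv_cdf(0.975)
ci = (mean - z * se, mean + z * se)

print(f"mean = {mean:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Reporting the interval, not just the point estimate, is what distinguishes a statistical summary from a spreadsheet average.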
Responsibilities are dictated by job titles. Low-level statistical analysts are usually tasked with standard data analyses and supervised by higher-ups. Experienced applied statisticians may be able to propose projects to management, develop new products and processes, oversee statistical teams and work on their own research.
An Interview with a Real Statistician
Mikhail Popov is a data analyst at the Wikimedia Foundation (WMF), home of Wikipedia. Prior to that, he worked as a statistician/DBA at a neuropsychology research program (NRP) with the University of Pittsburgh & UPMC. He graduated from California State University, Fullerton, and then went on to graduate studies at Carnegie Mellon University in their Master’s in Statistical Practice program.
We spoke with Mikhail to learn more about the role of statisticians at the Wikimedia Foundation. Read on for the pros and cons of being a statistician, the programming languages Mikhail uses in his work, and his advice to students studying statistics.
A: Pros: As a statistician, you will likely be asked to provide your expertise to teams working in a wide variety of fields and sub-fields, so you will get to become a mini-expert in each problem you’re involved in. You engage the creative, analytical, and social parts of your brain simultaneously on a daily basis. You translate boring numbers into interesting stories. You quantify uncertainty, and when you find patterns and relationships, you are able to say, “This is real, this isn’t just random noise.” Human beings are super good at seeing patterns where there are none, and your role is to guard leaders and decision makers against that.

Cons: Nobody knows what exactly you do (not their fault). The most popular response you’ll hear is “ugh, I hated the stats course I had to take in college.” Your peers in computer science are constantly inventing methods that were already invented and published by your fellow statisticians 50 years ago.
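Mikhail’s point about separating real patterns from random noise is exactly what a significance test formalizes. One simple, assumption-light version is a permutation test: shuffle the group labels many times and ask how often chance alone produces a gap as large as the one observed. A minimal sketch in Python (the data are invented for illustration):

```python
import random
import statistics

# Hypothetical A/B metric: did a new page layout really change session
# length, or is the observed gap just noise?
control   = [5.1, 4.8, 5.6, 4.9, 5.3, 5.0, 4.7, 5.2]
treatment = [5.9, 5.4, 6.1, 5.7, 5.2, 6.0, 5.8, 5.5]

observed = statistics.mean(treatment) - statistics.mean(control)

# Permutation test: under the null hypothesis the labels are arbitrary,
# so reshuffling them simulates "pure noise" differences.
rng = random.Random(42)
pooled = control + treatment
n_control = len(control)
n_perm = 10_000
extreme = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = statistics.mean(pooled[n_control:]) - statistics.mean(pooled[:n_control])
    if abs(diff) >= abs(observed):
        extreme += 1

p_value = extreme / n_perm
print(f"observed difference = {observed:.2f}, p = {p_value:.4f}")
```

A small p-value is what licenses the statement “this is real, not just random noise.”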
A: The primary language/environment I work in is R. When I was working at NRP, where we did all of the neuroimaging analysis in MATLAB, I used R for everything else. We’re also huge fans of RStudio’s Shiny (a web application framework for R) and use it for our dashboards to give the teams easy access to daily metrics and KPIs. My enthusiasm for R is no secret, as I am a co-host and producer of the R Talk podcast. A lot of my work at NRP and a good chunk of my work at WMF also requires me to write file/text processing pipelines, so Unix/Linux/Bash scripting is key to my work.

The skills are the same ones I listed in response to a later question, but I want to add one and emphasize another. First, I want to add that data visualization is a tremendously useful skill to learn and maintain. The best plot tells a story and guides the viewer to a conclusion that would otherwise take at least a paragraph of text to reach. Second, I want to emphasize how important it is to be able to interpret results for non-statisticians. Telling a biologist what the hazard ratio is from your survival analysis is useless if you don’t interpret it. Hazard ratios are so ridiculously inaccessible that in 2014(!) a statistician published an article on reformulations of the hazard ratio to make it more interpretable. Whether people have found a use for those reformulations in practice is another story, but the point is that raw parameter estimates are useless without an accessible narrative (and an accompanying data visualization).
- judgments such as those about the relevance and representativeness of the data
- judgments about whether the underlying model assumptions are valid for the data at hand
- judgments about causality and the role of confounding variables as possible alternative explanations for observed results
- the ability to interpret and communicate the results of a statistical analysis so non-statisticians can understand the findings

The best statisticians don’t just throw their data into a black box (e.g., a neural network) and rely on machine learning algorithms to do their work for them. Instead, they develop an understanding of, and intuition for, the data they’re analyzing.

The best statisticians are also great communicators. They can talk their clients and collaborators through all the stages of an experiment or study. Those collaborators may not be as well-versed in statistics but are usually experts in the subject matter, so it is crucial to work with them in unison.

The best statisticians poke and prod at assumptions and aren’t afraid of being wrong. To see what I mean, play around with this interactive puzzle from The New York Times.
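Hazard ratios can indeed be made more accessible. One standard reformulation (not necessarily the one from the 2014 article Mikhail mentions): under proportional hazards, a hazard ratio HR converts into the probability HR/(HR + 1) that a treated subject experiences the event before a matched control subject. A small sketch:

```python
def p_event_first(hazard_ratio: float) -> float:
    """Probability that a treated subject experiences the event before a
    matched control subject, assuming proportional hazards.

    Derivation: if the treated hazard is HR times the control hazard,
    then P(T_treated < T_control) = HR / (HR + 1).
    """
    return hazard_ratio / (hazard_ratio + 1.0)

# "HR = 2.0" sounds abstract; "a 2-in-3 chance of failing first" does not.
for hr in (0.5, 1.0, 2.0):
    print(f"HR = {hr}: treated subject fails first with probability {p_event_first(hr):.3f}")
```

This is the kind of translation Mikhail is describing: the same parameter, restated as a probability a biologist can act on.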
– Statisticians have become more involved in the data collection process and privacy discussions. You don’t need to know EVERYTHING. You can design your system to collect only the data that will be necessary to aid your decision-making process without sacrificing user/customer privacy and putting them at risk in case of a data breach. It’s large-scale experimental design.
– Statisticians should be guides and make recommendations to the decision makers. Data should not be the be-all and end-all decision-maker. We talk a lot about “data-driven decision-making” when we really mean “data-informed decision-making,” because ultimately it should be a person making the decisions that impact other people.

This last point is where I think data scientists could benefit tremendously from a formal education in statistical science, because that education (hopefully) involves learning and thinking about ethics. In school, we spent time reading about and discussing case studies with ethical concerns. I suspect there’s an empathy element that is missing when you come into data science from a purely computer science/engineering background, but I hope I’m wrong.
A: Get involved in data analysis competitions. As students, you have likely only been exposed to neat data meant to cement what you have learned about specific models or statistical methods. Data analysis competitions, such as those on DrivenData and Kaggle, offer opportunities to engage with (relatively) real-world data in an open-ended (rather than structured) way. For instance, if you’ve never done network analysis before, competitions with social network data are a great opportunity to become acquainted with those models on your own, and to ask your professors for help if you have questions. To find more competitions, I also recommend checking out KDnuggets.

But the biggest advice I can offer is: don’t talk only with other statisticians. Engage with students and practitioners in other disciplines. Expose yourself to people, topics and issues in computer science, biology, chemistry, environmental science, psychology, history, performance art, visual art (especially graphic design), interactive art, user experience design, public policy, social and political activism, etc. The best statisticians have a broad perspective and understanding of the world around them. If your role will be to offer statistical expertise to different teams in the organization, you will have to become a mini-expert in whatever those teams are working on. You will never not work with people from other backgrounds in this profession, and communicating is a crucial component of the craft.
2019 Statistician Average Salaries
The best-paid statisticians, as of July 2019, worked in Dallas, TX. According to PayScale, the median pay there was $83,923 (16% above the national average). In second place was Washington, D.C., where the median pay was $80,693 (12% above).
Statistician
Average Statistician Salary – Glassdoor: $82,477 per year
Average Statistician Salary – PayScale: $72,070 per year
Total Pay Range: $50,000 – $108,000
Average Statistician Salary – Bureau of Labor Statistics: $81,950 per year
Statistical Analyst
Average Statistical Analyst Salary – Glassdoor: $72,915 per year
Average Statistical Analyst Salary – PayScale: $64,226 per year
Total Pay Range: $49,000 – $92,000
Senior Statistical Analyst
Average Senior Statistical Analyst Salary – Glassdoor: $89,471 per year
Note: Salary information from Glassdoor, PayScale, and the Bureau of Labor Statistics was retrieved as of July 2019.
What Kind of Degree Will I Need?
At minimum, junior data/statistical analysts will need a bachelor’s degree in statistics, applied math, computer science or a related field. Since you will be working with complex statistical software programs, a healthy balance between hard-core math and IT courses is recommended.
Plan for a graduate degree. Companies looking for true statisticians may prefer applicants to hold a master’s or a PhD in applied statistics or math and have a strong background in their chosen industry (e.g. finance, biochemistry, computer engineering, etc.). Conveniently, we profile Master’s in Applied Statistics Programs.
What Kind of Skills Will I Need?
- Statistics (e.g. hypothesis testing and summary statistics)
- Math (e.g. linear algebra, calculus and probability)
- Machine learning tools and techniques
- Software engineering skills (e.g. distributed computing, algorithms and data structures)
- Data mining
- Data cleaning and munging
- Data visualization and reporting techniques
- Unstructured data techniques
- R and/or SAS languages
- SQL databases and database querying languages
- Python (most common), C/C++, Java, Perl
- Big data platforms like Hadoop, Hive & Pig
- Cloud tools like Amazon S3
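Several of the skills above (SQL querying, summary statistics, reporting) can be practiced with nothing but a standard library. A minimal sketch using Python’s built-in sqlite3 module; the table and values are invented for illustration:

```python
import sqlite3

# Toy in-memory database standing in for a production warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (region TEXT, score REAL)")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?)",
    [("north", 7.2), ("north", 6.8), ("south", 8.1), ("south", 7.9), ("south", 8.4)],
)

# A typical analyst query: group-level summary statistics computed
# inside the database rather than in application code.
rows = conn.execute(
    "SELECT region, COUNT(*), AVG(score) FROM surveys GROUP BY region ORDER BY region"
).fetchall()
for region, n, avg in rows:
    print(f"{region}: n={n}, mean score={avg:.2f}")
conn.close()
```

The same GROUP BY pattern scales from a laptop SQLite file to the warehouse engines behind Hive and Pig.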
You’ll notice that we’ve echoed the list from our Data Scientist profile. Advanced programming skills will help prepare you for “hybrid” stats/data science jobs now appearing on employment sites.
- Analytical Problem-Solving: Identifying complex challenges; employing the right mathematical approach/methods to make the maximum use of time and human resources.
- Logic & Reasoning: Assessing the strengths and weaknesses of data and statistical methods; understanding the implications of new developments in technology and data mining.
- Effective Communication: Explaining your mathematical techniques and discoveries to technical and non-technical audiences.
- Industry Knowledge: Understanding the way your chosen industry functions and how data are collected, analyzed and utilized.
What About Certifications?
Professional certifications can be powerful additions to your résumé. Ask your mentors for advice and check job listing requirements to determine which acronyms employers will recognize and respect.
The American Statistical Association (ASA) has two levels of accreditation. Candidates should attain the entry-level Graduate Statistician (GStat) certification before applying for the full PStat® accreditation.
PStat® certification is based on a professional portfolio, not an exam. Applicants must provide proof of educational credentials (typically a graduate degree in statistics or a related quantitative field), work experience and their commitment to professional development. Work samples and supporting letters from referees are also required.
Run by SAS, this accreditation is explicitly aimed at professionals who use SAS/STAT software to conduct and interpret complex statistical data analyses (i.e. statisticians). In the certification exam, candidates must prove their knowledge of areas such as ANOVA, regression, predictive modeling and logistic regression.
Note: For more useful big data qualifications, check out the certifications section in our Data Scientist profile.
Jobs Similar to Statistician
Data Scientists vs. Statisticians
You’ll find a lot of crossover between job descriptions for statistical analysts and data analysts, and for statisticians and data scientists. This has led to a great debate about whether data science is just statistics on another level.
- Statisticians and Data Analysts are primarily concerned with set tasks – from analyzing migration patterns to calculating average conversation times for call center agents. They are given parameters and do their best to collect and analyze information from conventional sources.
- Data Scientists think outside the structured box. They create their own questions/projects and use a much wider range of tools – only some of which are statistical – to uncover connections within big data.
With the surge in technology, those who wish to call themselves data scientists may now need formidable software engineering, machine learning and predictive analytics skills.
Statistician Job Outlook
The future looks good for statisticians. The BLS is forecasting employment to grow 33% from 2016-2026, much faster than the average for all occupations. Businesses, financial firms, government agencies, pharmaceutical companies and research groups need qualified statistical experts to make sense of the big data tsunami.
This projection comes with a general warning. If statisticians wish to remain relevant in the marketplace, they should be comfortable with data visualization, data munging, machine learning, AI and more. They may have to be prepared to formulate open-ended questions, create software programs and teach their non-mathematical colleagues how to avoid analytical mistakes.