According to Michael Friendly’s “A Brief History of Data Visualization,” data visualization is a graphic representation of quantitative information. Data visualization can transform seemingly arbitrary data into an easy-to-understand format. For example, one visualization of the complete history of the NFL uses a graph to lay out a timeline of the entire sport.
Candidates seeking a master’s degree in data science, whether by taking courses online or in person, may also want to consider how the art of visualization pairs with the harder skills needed to analyze and organize data.
Graphs are just one of the many ways to visualize data. The following guide explains the foundational concepts and processes that make such visualizations possible, as well as the skills needed to create them.
The Growing Importance of Data Visualization
Friendly also writes that data visualization dates to before the 17th century, when “visualization arose in geometric diagrams, in tables of the positions of stars and other celestial bodies, and in the making of maps to aid in navigation and exploration.” Since then, data visualization has developed considerably.
With the growth of technology, data scientists are able to display data in ways that may have seemed unrealistic decades ago. For instance, they can now use interactive statistical computing systems, large-scale statistical and graphics software, and linear statistical modeling.
Unlocking the Power of Big Data
Big data, simply put, is “larger, more complex data sets, especially from new data sources … [that] are so voluminous that traditional processing software just can’t manage them.” Data visualization can help data scientists interpret and convey their findings from immense amounts of data by creating large-scale and 3D visualizations.
Data Visualization Examples
Notable big data visualization examples include:
- An interactive graph created by Periscopic depicting the number of gun deaths in America in 2013, using statistics from the FBI and WHO.
- A visual of earthquakes since 1898 that uses a time-lapse mapping style to show where each earthquake took place over time, ultimately taking a tree-like shape.
A few other examples visualize big data using techniques such as:
- Detailed graphs showing metro area density moving outward from a pinpoint location like city hall.
- An interactive scale slider showing the population change by decade.
Types of Data Visualization
There are many types of data visualization techniques, and some portray certain kinds of data better than others. It is important to research which technique will best showcase your findings before making a final decision.
Planar (2D) Graphs
Planar graphs are graphs that can be drawn on a plane without any edges crossing. These graphs are particularly useful for visualizing geospatial data. The graphs that fall under the planar 2D category are:
- Contour/isopleth/isarithmic maps
- Dasymetric maps
- Dot distribution maps
- Proportional symbol maps
- Self-organizing maps
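The definition of a planar graph given above (drawable on a plane with no edges crossing) can be partly checked with arithmetic. A simple graph with V vertices (V of at least 3) can only be planar if its edge count E satisfies E <= 3V - 6. The sketch below applies that bound. The function names `may_be_planar` and `complete_graph` are made up for this illustration, and the check is only a necessary condition: a graph that fails it is definitely not planar, but a graph that passes may still be non-planar.

```python
def may_be_planar(num_vertices: int, edges: list) -> bool:
    """Return False when a simple graph has too many edges to be planar.

    Uses the edge bound E <= 3V - 6, valid for simple graphs with V >= 3.
    Passing the check does not prove planarity; failing it rules it out.
    """
    if num_vertices < 3:
        return True  # graphs this small can always be drawn without crossings
    return len(edges) <= 3 * num_vertices - 6


def complete_graph(n: int) -> list:
    """Edges of the complete graph on n vertices (every pair connected)."""
    return [(a, b) for a in range(n) for b in range(a + 1, n)]


k4_ok = may_be_planar(4, complete_graph(4))  # 6 edges, bound is 6
k5_ok = may_be_planar(5, complete_graph(5))  # 10 edges, bound is 9

print(k4_ok, k5_ok)
```

The complete graph on five vertices fails the bound, so no drawing of it can avoid crossings, while the complete graph on four vertices passes and can in fact be drawn without crossings.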
3D Volumetric Visualization
3D volumetric visualization is a method that allows one to observe and manipulate 3D volumetric data. This includes:
These data visualization techniques work well for conveying data on a large scale, showing possible realistic outcomes, and getting a closer look at a subject without having to travel far.
Temporal Visualization
Temporal structures help convey data about time, change and motion. Structures that fall under the temporal category include:
- Alluvial diagrams
- Arc diagrams
- Connected scatter plots
- Gantt charts
- Polar area/rose/circumplex charts
- Sankey diagrams
- Stream graphs
- Time series
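Before a time series is charted, it is often smoothed so that the overall trend stands out from short-term noise. The sketch below shows one common approach, a simple moving average, using only the Python standard library. The helper name `moving_average` and the sample readings are invented for illustration and do not come from any dataset mentioned in this guide.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical daily readings, invented for illustration only.
readings = [10, 12, 11, 15, 14, 18, 17]
smoothed = moving_average(readings, window=3)

for value in smoothed:
    print(round(value, 2))
```

A wider window gives a smoother line but hides more short-term change, so the window size is a design choice that depends on the story the chart needs to tell.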
Multidimensional Visualization
Multidimensional graphs work best for conveying numbers and can be used as a visual representation for proportions and comparisons. These include:
- Area charts
- Bar graphs and charts
- Box and whisker plots
- Line charts
- Pie charts
- Scatter plots
- Step charts
- Tag clouds
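Several of the charts above are drawn from summary statistics rather than raw data. A box and whisker plot, for example, is typically built from a five-number summary: minimum, first quartile, median, third quartile and maximum. The sketch below computes that summary with Python's standard `statistics` module. The function name `five_number_summary` and the sample scores are hypothetical, and quartile conventions vary between tools, so other software may report slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return the five values a box and whisker plot is drawn from."""
    data = sorted(values)
    # method="inclusive" treats the data as the full population,
    # so the quartiles never fall outside the observed range.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


# Hypothetical test scores, invented for illustration only.
scores = [7, 1, 9, 3, 5, 2, 8, 4, 6]
summary = five_number_summary(scores)
print(summary)
```

The box spans the first to third quartile, the line inside it marks the median, and the whiskers in this simple version extend to the minimum and maximum.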
Tree and Hierarchical Visualization
Tree graphs may be used to display a hierarchy or other information that contains no cycles, most commonly family history. Graphs that fall under this category include:
Network Visualization
In data science, a network is a set of objects that are connected to one another. This can include:
Essential Skills for Data Visualization
Anyone can create a data visualization, but the result is more accurate and effective when it is created by someone with the following skills:
Programming Languages
Data engineers often rely on different data visualization methods to help convey their findings. To become a data engineer, however, one must be knowledgeable about different programming languages. Programming languages allow data engineers to mine and query data and, in some cases, to use big data SQL engines. Some of the popular languages are:
Data Visualization Software
There are multiple data visualization software solutions available, including:
- Datawrapper: Datawrapper enables users to make charts, maps and tables that are readable on all devices. It uses a four-step process that is easy to follow and doesn’t require design skills.
- FusionCharts: FusionCharts was created to help developers communicate and understand data more efficiently. Features include plot charts, data-driven maps, time-series visualizations and the ability to export full reports into PDFs.
- Highcharts: Highcharts is a charting tool created in Norway by someone who wanted a tool to update his website’s homepage with the snow depth near his family’s cabin. Products include charts, maps and Gantt charts.
Data Science Skills
Although no single skill is strictly required, a broad understanding of data science helps in creating effective and accurate data visualizations. Some useful skills for data scientists who want to create data visualizations include:
- Cloud computing
- Database management
- Data wrangling
- Deep learning
- Machine learning
- Microsoft Excel
- Multivariate calculus and linear algebra
- Probability and statistics
- Programming, packages and software
- Understanding of the Structured Query Language (SQL)
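Several of these skills, such as database management, data wrangling and SQL, come together in the step that precedes most charts: aggregating raw records into the values to be plotted. The sketch below shows that step using Python's built-in `sqlite3` module and an in-memory database. The `sales` table, its columns and its rows are all hypothetical, and the text bar chart at the end is only a stand-in for a real charting tool.

```python
import sqlite3

# In-memory database with a hypothetical table, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [
        ("North", 120.0),
        ("South", 80.0),
        ("North", 60.0),
        ("South", 40.0),
        ("West", 50.0),
    ],
)

# Aggregate raw rows into one total per region, ready to plot.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

# Render a rough text bar chart: one '#' per 10 units.
for region, total in totals:
    print(f"{region:<6} {'#' * int(total // 10)} {total:.0f}")
```

Doing the aggregation in SQL keeps the charting code simple, since it receives one row per bar instead of every raw record.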
Public Speaking and Presentation
Even modest public speaking and presentation skills can help researchers feel confident when presenting their data. Seven powerful public speaking tips include:
- Take a deep breath and wait a few moments before beginning your presentation.
- Show up to give, rather than to take.
- Make eye contact with audience members one by one.
- Speak slowly.
- Ignore the naysayers.
- Turn your nervousness into excitement.
- Say thank you when you’re done.
Machine Learning
Advances in technology have also brought advances in machine learning. Knowing how to apply it correctly can help data scientists make more impactful data visualizations. For example, machine learning can improve data visualization by:
- Creating visualizations with dynamic real-time analytics.
- Running through millions of data points in seconds, often finding more profound insights from larger datasets.
- Giving search engines the ability to predict what the user will ask, making better, more informed queries.
- Allowing users to shape data sets into a more defined narrative, giving them a better context for the information they are viewing.
- Creating more precise and predictive data models.
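As a small illustration of the last point, one of the simplest predictive models is a straight line fitted by ordinary least squares, which can be drawn over a scatter plot as a trend line. The sketch below is an assumption about what such a model could look like, not a method named in this guide. The function name `fit_line` and the monthly figures are invented for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical monthly site visits (in thousands), invented for illustration.
months = [1, 2, 3, 4, 5]
visits = [12, 15, 19, 22, 27]

slope, intercept = fit_line(months, visits)
forecast = slope * 6 + intercept  # projected value for month 6
print(round(slope, 2), round(intercept, 2), round(forecast, 2))
```

Plotting the fitted line alongside the original points shows both the data and the model's prediction in a single view, which is where predictive modeling and visualization meet.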
Data visualizations help convey large amounts of data in ways that would otherwise be difficult. Data scientists should consider acquiring the skills that help them use data visualization successfully and accurately.