Tableau Visualisation: Famous Biographies
Introduction
Pantheon (https://www.nature.com/articles/sdata201575) is a large, manually-verified dataset of individuals who have had globally famous biographies written about them. This is an attempt to create an effective visualisation of this dataset that can answer three open-ended questions. The visualisation was designed using the 5-sheet design method and implemented in Tableau.
Tableau workbook file: https://www.jbm.fyi/static/biographies.twbx
PDF report: https://www.jbm.fyi/static/biographies.pdf
PDF design sheets: https://www.jbm.fyi/static/biographies_design.pdf
Demo Video
Design Sheets
Questions
- Which regions make the largest contributions to globally famous biographies?
- Which domains produce the most famous figures and how does this vary by region?
- How do the contributions of female authors change over time?
Visualisation
Table 1: Visual Mapping
Attribute | View | Attribute Type | Visual Variable | Expressive
Birth Country | A | Categorical | Position (map) | Yes
Total views (country) | A | Ordinal | Size (radius) | Yes
Modal Domain (country) | A | Categorical | Colour | Yes
Birth Country | A* | Categorical | Text | No
Total Books (country) | A* | Ordinal | Text | No
Total views (country) | A* | Ordinal | Text | No
Total English views | A* | Ordinal | Text | No
Total Non-English views | A* | Ordinal | Text | No
Average number of languages | A* | Ordinal | Text | No
Modal Domain (country) | A* | Categorical | Text | No
Birth City | B | Categorical | Position (map) | Yes
Total page views (city) | B | Ordinal | Size (radius) | Yes
Modal Domain (city) | B | Categorical | Colour | Yes
Birth City | B* | Categorical | Text | No
Total Books (city) | B* | Ordinal | Text | No
Total views (city) | B* | Ordinal | Text | No
Total English views | B* | Ordinal | Text | No
Total Non-English views | B* | Ordinal | Text | No
Average number of languages | B* | Ordinal | Text | No
Modal Domain (country) | B* | Categorical | Text | No
Total Views (author) | C | Ordinal | Length (height) | Maybe
English Views (author) | C | Ordinal | Length (height) / colour | Maybe
Non-English Views (author) | C | Ordinal | Length (height) / colour | Maybe
Author Name | C/C* | Ordinal | Text | Maybe
Total Views (author) | C* | Ordinal | Text | No
English Views (author) | C* | Ordinal | Text | No
Non-English Views (author) | C* | Ordinal | Text | No
Gender Ratio | D | Ordinal | Angle / colour | Yes
Number of authors per gender | D* | Ordinal | Text | No
Domain Ratios | E | Ordinal | Angle / colour | Yes
Number of authors per domain | E* | Ordinal | Text | No
Books per birth year per domain | F | Ordinal | Size (height) / colour | Maybe
Books per birth year per domain | F* | Ordinal | Size (height) / colour | No
Author name | G | Categorical | Text | No
Author field/domain | G | Categorical | Colour | No
Author gender | G | Categorical | Symbol | No
Author name | G* | Categorical | Text | No
Author field/domain | G* | Categorical | Text | No
Author gender | G* | Categorical | Text | No
Author birth city | G* | Categorical | Text | No
Author birth year | G* | Ordinal | Text | No
Historical Popularity Index | G* | Ordinal | Text | No
Total page views | G* | Ordinal | Text | No
English page views | G* | Ordinal | Text | No
Non-English page views | G* | Ordinal | Text | No
Number of languages | G* | Ordinal | Text | No
*Tooltip
Table 1 shows the mapping between each attribute and the visual variable used to encode it. Figure 1 is an annotated screenshot showing the layout of the visualisation. Table 2 provides a description of each of the layout labels.
Table 2: Figure 1 label annotations
Label | Description
A | World map
B | Country map (not shown)
C | View bar chart
D | Gender pie chart
E | Domain pie chart
F | Birth year line graph
G | Author details
1 | Birth year filter
2 | Reset filter button
3 | Country dropdown
4 | City dropdown
5 | Domain legend/filter
The main view of the visualisation (A) is a map overlaid with a circle for each country containing a book from the dataset. The circle is sized according to the total number of views of books from that country. This is an effective encoding as it clearly shows on the map which countries make the greatest contributions and allows comparison between countries. The circle is coloured according to the modal domain within that country. This provides the viewer with an understanding of which domains are most popular in each country. Further information, as shown in table 1, is displayed in a tooltip when a user hovers over a circle.
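The two country-level encodings described above (circle size from total views, circle colour from modal domain) can be sketched outside Tableau as a simple aggregation. This is a minimal illustration only: the field names `country`, `domain` and `views`, and the sample records, are assumptions made for the example rather than the workbook's actual calculations or rows from the dataset. It also assumes the modal domain is determined by counting entries, not by summing views.

```python
from collections import Counter, defaultdict

# Illustrative records only: the field names and values are assumptions,
# not rows taken from the Pantheon dataset.
records = [
    {"country": "Country A", "domain": "Arts", "views": 100},
    {"country": "Country A", "domain": "Arts", "views": 50},
    {"country": "Country A", "domain": "Institutions", "views": 300},
    {"country": "Country B", "domain": "Institutions", "views": 200},
]


def summarise_by_country(rows):
    """Return total views and modal domain for each country."""
    total_views = defaultdict(int)
    domain_counts = defaultdict(Counter)
    for row in rows:
        total_views[row["country"]] += row["views"]
        domain_counts[row["country"]][row["domain"]] += 1
    return {
        country: {
            # Drives the circle size in view A
            "total_views": total_views[country],
            # Drives the circle colour: the domain with the most entries
            "modal_domain": domain_counts[country].most_common(1)[0][0],
        }
        for country in total_views
    }


summary = summarise_by_country(records)
```

In this sample, Country A has the larger circle because of its higher total views, but is coloured by Arts because two of its three entries fall in that domain, even though its single Institutions entry has more views.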
If the user clicks on a country, the main view changes (B) to a map centred on the country, showing similar information to before, but with individual cities plotted instead of countries, as shown in figure 2. Users can also select a country using the dropdown menu (3). Users can select a city by clicking on it, or selecting it from the city dropdown (4).
The domain legend (5) shows which colours are used to represent each of the domains in the charts and can also be used to filter to one or more domains. The birth year slider (1) can be used to filter by birth year. The remaining charts are updated to reflect all active filters (including country and city).
The gender (D) and domain pie charts (E) show the distributions of gender and domain across the selected data respectively. These are effective encodings as the user won’t need to compare between gender and domain and will be interested in the proportion rather than absolute values.
It’s easy to see which category has the most members. The colours used are consistent to avoid confusion. For finer-grained detail, absolute values can be viewed in the tooltips.
The birth year chart (F) is a line graph showing the number of books for each birth year. There is one colour-coded line per domain. This chart can be overwhelming if the number of authors is too high, but becomes effective once filtered down to a smaller number of books. This chart enables the user to view frequency and domain trends over time.
The views bar chart (C) shows the total number of views per selected book, breaking this down into English and non-English views. Again, this is less effective when a large number of authors are selected, because a bar chart with too many items becomes hard to read. Once a more limited author selection is made, however, the chart is useful for comparing views across authors and the language distribution of views.
The final view (G) is a list of all the selected books. The list is colour-coded by domain, and also contains gender symbols for quick reference. These encodings are not particularly effective in visual terms; however, they do enable the other visuals to be further filtered, and detailed information is displayed when the entries are hovered over. If an author is clicked, the tooltip contains a hyperlink that can be used to navigate to the relevant Wikipedia page, as shown in figure 3.
Insights
Question 1
We can see visually from figure 4 that the largest contribution is made by the US, followed by Britain, France, Italy, and Germany. The only other nations with more than 200 contributions are Russia and Turkey.
Figures 5 – 11 show the city-level distribution. In nearly every case, the city with the highest contribution is the capital. The only exception to this is the USA. Usually, the larger contributions are limited to one or two cities; however, in the case of Italy and Germany, a much wider variety of cities make large contributions. This is likely because these countries were, until recently, a collection of smaller countries.
Question 2
The pie charts in figures 12 to 19 compare the global distribution of domains with the top 7 countries. The largest domain globally is institutions (green), closely followed by arts (blue). Arts is the most popular domain in the UK and USA, with institutions most popular in all the other countries. As shown in figure 4, institutions are the most prevalent domain in Europe, Africa, Oceania and Asia; arts is most prevalent in North America and sport is most prevalent in South America.
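The proportions shown in the domain pie charts can be compared numerically with a short sketch like the one below. It is an illustration under stated assumptions: the `country` and `domain` field names and the sample records are invented for the example and are not drawn from the dataset or the Tableau workbook.

```python
from collections import Counter

# Illustrative records only: field names and values are assumptions.
domain_records = [
    {"country": "Country A", "domain": "Arts"},
    {"country": "Country A", "domain": "Arts"},
    {"country": "Country A", "domain": "Institutions"},
    {"country": "Country A", "domain": "Sports"},
    {"country": "Country B", "domain": "Institutions"},
    {"country": "Country B", "domain": "Institutions"},
    {"country": "Country B", "domain": "Institutions"},
    {"country": "Country B", "domain": "Arts"},
]


def domain_shares(rows):
    """Return each domain's share of the given rows (one pie chart)."""
    counts = Counter(row["domain"] for row in rows)
    total = sum(counts.values())
    return {domain: count / total for domain, count in counts.items()}


# Global distribution, equivalent to the unfiltered pie chart
global_shares = domain_shares(domain_records)

# Distribution for one country, equivalent to applying the country filter
country_a_shares = domain_shares(
    [row for row in domain_records if row["country"] == "Country A"]
)
```

Placing the two dictionaries side by side gives the same comparison as reading two pie charts, but without relying on judging angles by eye.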
Question 3
Figures 20 to 28 show the distribution of gender in the dataset. Figure 20 shows the complete time period, whilst the others show a narrower slice. We can see that from 3500 BC through to 1000 AD the share of female authors reduces. After 1000 AD the female share gradually increases, eventually reaching around 25% from 1975 to 2005.
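The period-by-period comparison above amounts to computing the female share within each birth-year range. A minimal sketch follows, assuming hypothetical `birth_year` and `gender` fields, BC years stored as negative integers, and invented sample records. The period boundaries are taken loosely from the ranges mentioned in the text and are not the exact slices used in the figures.

```python
# Illustrative records only: field names and values are assumptions.
# BC birth years are represented as negative integers.
people = [
    {"birth_year": -500, "gender": "Female"},
    {"birth_year": -400, "gender": "Male"},
    {"birth_year": 1200, "gender": "Male"},
    {"birth_year": 1300, "gender": "Male"},
    {"birth_year": 1970, "gender": "Male"},
    {"birth_year": 1980, "gender": "Female"},
    {"birth_year": 1985, "gender": "Male"},
    {"birth_year": 1990, "gender": "Male"},
    {"birth_year": 2000, "gender": "Male"},
]


def female_share_by_period(rows, periods):
    """Return the female share for each (start, end) period.

    Each period is half-open: start <= birth_year < end.
    Periods with no entries map to None.
    """
    shares = {}
    for start, end in periods:
        selected = [r for r in rows if start <= r["birth_year"] < end]
        if not selected:
            shares[(start, end)] = None
            continue
        females = sum(1 for r in selected if r["gender"] == "Female")
        shares[(start, end)] = females / len(selected)
    return shares


periods = [(-3500, 1000), (1000, 1975), (1975, 2006)]
shares = female_share_by_period(people, periods)
```

Producing one number per period in this way would allow the trend to be plotted as a single line or bar chart, which avoids comparing several pie charts visually.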
Evaluation
The visualisation provides the user with an ability to visually explore most of the attributes present in the dataset in an open-ended way. It has powerful filtering abilities and many interactive features. The visualisation is most effective in answering question 1 as it is able to present location data visually on a map. The visualisation is less effective in answering questions 2 and 3 as it requires the user to visually compare pie charts. It could be more effective at answering these questions if it enabled the user to plot all the data points simultaneously on a different type of chart. It would also be interesting to compare the English vs non-English views distribution across a broader dataset than the bar chart enables.