Throughout the lab, you'll occasionally see questions marked with an asterisk ('*').
To get credit for this lab, post your answers to these questions on Piazza.

Pt. 1: Building a Geographic Map

  1. Today we'll work with a dataset containing the World Health Organization's most recent estimates of the prevalence and mortality of tuberculosis (TB) by country. Start by loading the CSV file containing the TB dataset into Tableau. Its dimensions look something like this:

    Let's click on the tab labeled 'Sheet 1' to go to the worksheet.

  2. Notice that the name of each column gives a brief description of the corresponding dimension or measure:

  3. Remember that Tableau tries to infer the type of each column whenever we import a dataset. When Tableau recognizes country names, US state names, country codes, ZIP codes, etc., it assigns the field a geographic role. Double-clicking on one of the geographic dimensions (like Country or Territory Name) produces a simple map with a point at each location represented in the data:

  4. If we'd like to show a filled map instead of points, we can select the filled map option from the Show Me panel:

  5. This isn't a particularly useful map yet, because it doesn't draw attention to any interesting patterns. Let's try dragging the Estimated number of incident cases (all forms) measure onto Color on the Marks card:

    *Piazza Question 1* According to this map, which countries appear to have the biggest problem with TB? How can you tell?

  6. One problem with this map is that it only highlights the absolute number of TB cases observed; it doesn't take into account population, which we would expect to affect how many cases are observed. Luckily, the WHO has given us a measure normalized for population (this is called disease prevalence). Let's drag the Estimated prevalence of TB (all forms) measure onto Color on the Marks card:

    *Piazza Question 2* According to this map, which countries appear to have the biggest problem with TB? Do you have any new information?
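
    To see why normalizing by population can change the picture, here is a minimal Python sketch of the arithmetic. It assumes the rate is expressed per 100,000 people, the unit that WHO measure names in this dataset use. The function name, the two placeholder countries, and all of the numbers are made up for illustration; they are not taken from the WHO data.

```python
def rate_per_100k(cases, population):
    """Return the number of cases per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population


# Hypothetical figures for illustration only (NOT WHO data).
examples = {
    "Country A": {"cases": 500_000, "population": 1_000_000_000},
    "Country B": {"cases": 20_000, "population": 5_000_000},
}

# Normalize each country's case count by its population.
rates = {
    name: rate_per_100k(d["cases"], d["population"])
    for name, d in examples.items()
}

# The country with the most cases is not necessarily the one with the highest rate.
highest_count = max(examples, key=lambda name: examples[name]["cases"])
highest_rate = max(rates, key=rates.get)

print("Most cases:", highest_count)
print("Highest rate per 100,000:", highest_rate)
```

    In this made-up example, the larger country has far more cases, but the smaller country has the higher rate. A map colored by a normalized measure can highlight different countries than a map colored by raw counts for the same reason.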

  7. Instead of talking about disease prevalence (which includes both new and existing cases), we might want to highlight just the risk of new infections (this is called disease incidence). This could give us additional information about places where the pattern of the disease is changing. Let's drag the Estimated incidence (all forms) per 100 000 population measure onto Color on the Marks card:

    *Piazza Question 3* According to this map, which countries appear to have the biggest problem with TB? Do you have any new information?

Pt. 2: Exploring and Adding Context

  1. Now it's your turn! See what else you can discover in this dataset. Just a few possible questions:

    Remember that geographic data is notoriously tricky, so feel free to use the Coordinated Multiple Views approach we talked about in the last lab to add context or highlight other interesting features.

  2. Post your map, along with a brief description of what you found, to Piazza.