The impact of deprivation on the distribution of museums
By Andrew Robson
November 23, 2022
This week’s data comes from the Mapping Museums Project, which has gathered, cleansed and codified data relating to over 4000 UK museums. The dataset also includes some demographic information about each museum’s location - for example the Index of Multiple Deprivation (IMD). IMD measures the relative deprivation of geographic areas in the UK, aggregating different dimensions (income, employment, education, health, crime, housing, and living environment). The index ranges from 1 (most deprived) to 10 (least deprived).
Let’s try and see if deprivation has an impact on the distribution of museums - are there different types of museums in areas that are more deprived?
We can start by loading the data into R using the tidytuesdayR package.
library(tidytuesdayR)
tidy_tuesday_data <- tt_load('2022-11-22')
##
## Downloading file 1 of 1: `museums.csv`
Before we start looking for patterns, let’s check the quality of the data. There is a column called Subject Matter, which records the overarching topic addressed by the museum. This is the variable we want to test against IMD. However, if there are very few museums per Subject Matter we won’t be able to find any patterns.
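library(tidyverse)
# Count museums per Subject Matter and plot the categories with more than 20 museums
tidy_tuesday_data$museums %>%
  count(Subject_Matter) %>%
  mutate(Subject_Matter = reorder(Subject_Matter, n)) %>%
  filter(n > 20) %>%
  ggplot() +
  geom_col(aes(x = n, y = Subject_Matter)) +
  # theme_minimal_blog is a custom theme object that is not defined in this post;
  # swap in theme_minimal() if it is unavailable
  theme_minimal_blog +
  labs(y = 'Museum Type', x = 'Number of Museums')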
Wow. Who knew there were that many different types of museum? And this only includes Subject Matters that have more than 20 museums. This is a long-tail distribution: there are many Subject Matters with only a few museums, which is not good for finding patterns. However, it looks cleanable. There seems to be a hyphen-separated hierarchy, so let’s separate those detailed Subject Matters into Museum Type and Museum Subtype (and throw away the sub-sub types).
library(knitr)
museum_types <- tidy_tuesday_data$museums %>%
  select(Name_of_museum,
         Subject_Matter,
         Area_Deprivation_index,
         Area_Deprivation_index_crime,
         Area_Deprivation_index_housing,
         Area_Deprivation_index_health,
         Area_Deprivation_index_income,
         Area_Deprivation_index_education,
         Area_Deprivation_index_employment,
         Area_Deprivation_index_services) %>%
  # Replace underscores with spaces so the labels are readable
  mutate(Subject_Matter = gsub(Subject_Matter, pattern = '_', replacement = ' ')) %>%
  # Split on the hyphen: keep type and subtype, drop the third level (NA),
  # and pad missing pieces on the right with NA
  separate(Subject_Matter, c("Museum_Type", "Museum_Subtype", NA), "-", fill = 'right') %>%
  mutate(Museum_Subtype = ifelse(is.na(Museum_Subtype), 'No Subtype', Museum_Subtype))
kable(museum_types[1:5, 1:4])
Name_of_museum | Museum_Type | Museum_Subtype | Area_Deprivation_index |
---|---|---|---|
Titanic Belfast | Sea and seafaring | Boats and ships | 2 |
The Woodland Heritage Museum | Natural world | Other | 9 |
Warwickshire Museum Of Rural Life | Rural Industry | Farming | 8 |
Battle Of Flowers Museum | Arts | Crafts | NA |
Jet Age Museum | War and conflict | Airforce | 8 |
That’s a bit better (and a tiny bit of cleaning has made it readable). I’ve also included the IMD columns as we will be needing them later too! Let’s have another look at our distribution now that the subtypes have been accounted for.
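The code for this chart is not shown in the post. As a rough sketch only, one way to draw it is a stacked bar per Museum Type, with each segment representing a subtype. The stacking is an assumption about how the chart was built, and `toy_museums` is a made-up stand-in for `museum_types` so that the block runs on its own.

```r
library(dplyr)
library(ggplot2)

# Made-up rows for illustration only; in the post you would use `museum_types`
toy_museums <- tibble(
  Museum_Type    = c("Arts", "Arts", "Arts", "War and conflict",
                     "War and conflict", "Natural world"),
  Museum_Subtype = c("Crafts", "Crafts", "No Subtype", "Airforce",
                     "No Subtype", "Other")
)

# Count museums per type/subtype and order the types by their total size
type_counts <- toy_museums %>%
  count(Museum_Type, Museum_Subtype) %>%
  group_by(Museum_Type) %>%
  mutate(type_total = sum(n)) %>%
  ungroup() %>%
  mutate(Museum_Type = reorder(Museum_Type, type_total))

# One bar per type; white borders separate the subtype segments
subtype_plot <- ggplot(type_counts) +
  geom_col(aes(x = n, y = Museum_Type, fill = Museum_Subtype),
           colour = "white", show.legend = FALSE) +
  theme_minimal() +
  labs(y = "Museum Type", x = "Number of Museums")

subtype_plot
```

Hiding the legend keeps the chart readable when there are many subtypes; the number of segments in each bar still hints at how many subtypes a type has.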
Much better - we also get an indication of which museum type has the most subtypes, which might be interesting to dig into in future. There are also surprisingly few Science and technology museums. Now that we have a better distribution of Museum Types, let’s see what patterns we can find once we compare these types against the Index of Deprivation - remember 1 is the most deprived and 10 is the least deprived. A ridgeline plot might be a fun way to display this.
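The plotting code is not included in the post, so here is a minimal sketch of a ridgeline plot using the ggridges package (an assumption; the post does not say which package was used). The `toy_museums` tibble holds randomly generated deciles purely so the example runs; with the real data you would pass `museum_types` instead.

```r
library(dplyr)
library(ggplot2)
library(ggridges)

# Randomly generated deciles for illustration only, not real museum data
set.seed(1)
toy_museums <- tibble(
  Museum_Type = rep(c("Arts", "War and conflict", "Rural Industry"), each = 30),
  Area_Deprivation_index = sample(1:10, 90, replace = TRUE)
)

ridge_plot <- toy_museums %>%
  filter(!is.na(Area_Deprivation_index)) %>%   # some museums have no IMD value
  ggplot(aes(x = Area_Deprivation_index, y = Museum_Type)) +
  # quantiles = 2 splits each density in half, i.e. draws a line at the median
  geom_density_ridges(quantile_lines = TRUE, quantiles = 2, alpha = 0.7) +
  scale_x_continuous(breaks = 1:10) +
  theme_minimal() +
  labs(x = "Index of Multiple Deprivation (1 = most deprived)", y = "Museum Type")

ridge_plot
```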
Disappointing really - there isn’t anything which stands out. I’ve added a median line to try and distinguish any differences. War and conflict museums tend to be in areas which are the least deprived whilst Medicine and health museums are more common in more deprived areas. Quite a depressing insight, but there isn’t much difference between the distributions.
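Since the differences are hard to read off a ridgeline plot, a small summary table of medians can help. This is a sketch, not the code behind the post: the rows in `toy_museums` are invented to show the mechanics and say nothing about the real museums.

```r
library(dplyr)

# Invented values for illustration only; in the post you would use `museum_types`
toy_museums <- tibble(
  Museum_Type = c("War and conflict", "War and conflict", "War and conflict",
                  "Medicine and health", "Medicine and health", "Medicine and health"),
  Area_Deprivation_index = c(8, 9, NA, 3, 5, 4)
)

# Median IMD per museum type, ignoring museums with no IMD value
median_imd <- toy_museums %>%
  group_by(Museum_Type) %>%
  summarise(
    museums    = sum(!is.na(Area_Deprivation_index)),
    median_imd = median(Area_Deprivation_index, na.rm = TRUE),
    .groups    = "drop"
  ) %>%
  arrange(median_imd)   # most deprived types first

median_imd
```

Keeping the museum count next to each median makes it easier to spot types whose median rests on only a handful of museums.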
The index of deprivation also contains sub-measures which rank areas on more detailed metrics like health, crime and housing. Let’s break down this chart a little bit more.
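The code for the breakdown is not shown, so the following is only a sketch of one possible approach: reshape the `Area_Deprivation_index_*` columns into long form and facet by sub-measure. It uses box plots rather than ridgelines to keep the dependencies small, and `toy_museums` contains made-up values with just two of the sub-measure columns.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up values for illustration only; in the post you would use `museum_types`
toy_museums <- tibble(
  Museum_Type = c("Arts", "Arts", "Rural Industry", "Rural Industry"),
  Area_Deprivation_index_crime  = c(3, 5, 9, 8),
  Area_Deprivation_index_health = c(4, 6, 7, NA)
)

# One row per museum and sub-measure; the trailing underscore in the prefix
# means the overall Area_Deprivation_index column would not be picked up
sub_measures <- toy_museums %>%
  pivot_longer(
    cols = starts_with("Area_Deprivation_index_"),
    names_to = "Measure",
    names_prefix = "Area_Deprivation_index_",
    values_to = "Decile",
    values_drop_na = TRUE
  )

# One panel per sub-measure, one box per museum type
sub_measure_plot <- ggplot(sub_measures, aes(x = Decile, y = Museum_Type)) +
  geom_boxplot() +
  facet_wrap(~ Measure) +
  theme_minimal() +
  labs(x = "Decile (1 = most deprived)", y = "Museum Type")

sub_measure_plot
```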
Again - nothing amazing stands out, but there are a few outliers. Rural industry museums tend to be in areas with less crime, most likely because they are rural, while high crime deprivation could be seen as an urban feature. You can use this IMD map to explore further.