Mini TidyTuesday - Programming Languages.
By Andrew Robson
March 21, 2023
The data this week comes from the Programming Language Database. Let’s quickly load it in and make a nice scatter chart.
# Load necessary packages
library(ggplot2)
library(ggrepel)
library(scales) # For formatting numbers
tuesdata <- tidytuesdayR::tt_load(2023, week = 12)
##
## Downloading file 1 of 1: `languages.csv`
data <- tuesdata$languages
# Subset the data to include only the points of interest
# (which() skips rows where number_of_users is NA instead of returning all-NA rows)
top_languages <- data[which(data$number_of_users > 500000), ]
# Create the plot
ggplot(top_languages, aes(x = number_of_users, y = number_of_jobs)) +
  # Line widths use linewidth rather than size as of ggplot2 3.4.0
  geom_smooth(method = "lm", se = FALSE, linetype = "solid", color = "steelblue", linewidth = 1) +
  geom_point(color = "darkorange", size = 3, alpha = 0.8) +
  # Labels inherit top_languages from the ggplot() call
  geom_label_repel(aes(label = title),
                   force = 20,
                   fontface = "bold",
                   color = "black",
                   size = 4,
                   box.padding = 0.5,
                   point.padding = 0.5) +
  labs(title = "The Relationship Between Users and Job Openings",
       subtitle = "An analysis of top programming languages with over 500,000 users",
       x = "Number of Users",
       y = "Number of Job Openings",
       caption = "Data source: PLDB.com") +
  theme_minimal() +
  theme(text = element_text(family = "Helvetica", size = 12),
        plot.title = element_text(size = 18, hjust = 0.5, face = "bold"),
        plot.subtitle = element_text(size = 14, hjust = 0.5),
        plot.caption = element_text(size = 10, hjust = 1, margin = margin(t = 5)),
        axis.title = element_text(size = 14, face = "bold"),
        panel.grid.minor = element_blank(),
        panel.background = element_rect(fill = "#FAF9F0", color = NA),
        plot.background = element_rect(fill = "#FAF9F0", color = NA)) +
  scale_x_continuous(labels = comma) + # Format x-axis numbers
  scale_y_continuous(labels = comma) # Format y-axis numbers
Most of the languages plotted here have fewer job openings than users, according to the Programming Language Database. SQL, however, breaks the trend. (Which is great news for me because I love writing queries.)
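One way to put a number on that comparison is to divide job openings by users for each language. The sketch below is only an illustration: `jobs_per_user()` is a hypothetical helper that is not part of the original analysis, and the `toy` data frame holds placeholder values, not PLDB figures. It assumes the same column names used in the plot (`title`, `number_of_users`, `number_of_jobs`).

```r
# Hypothetical helper (not from the original post): job openings per user.
# Assumes columns named title, number_of_users and number_of_jobs.
jobs_per_user <- function(df) {
  df$jobs_per_user <- df$number_of_jobs / df$number_of_users
  # Put the languages with the most openings per user first
  keep <- c("title", "number_of_users", "number_of_jobs", "jobs_per_user")
  df[order(df$jobs_per_user, decreasing = TRUE), keep]
}

# Placeholder values for illustration only; these are NOT PLDB figures.
toy <- data.frame(
  title = c("Lang A", "Lang B", "Lang C"),
  number_of_users = c(1000000, 2000000, 4000000),
  number_of_jobs = c(50000, 3000000, 40000)
)

ranked <- jobs_per_user(toy)
# A ratio above 1 means more job openings than users
ranked[ranked$jobs_per_user > 1, ]
```

Calling `jobs_per_user(top_languages)` on the data from the post should rank the plotted languages the same way, with any language above a ratio of 1 standing out from the rest.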