Lecture 11
Duke University
SOCIOL 333 - Summer Term 1 2023
2023-06-06
Project component 2: descriptive statistics
#1: Your classmates and I!
Cheatsheets!
Google!
A book!
Campus resources!
- ...eh. It can WRITE code, but it can't test it...
- So what it gives you probably won't work, or won't work correctly, at least not right out of the box
- Might be useful for inspiration/finding a starting place, but not for doing your work for you (just like with writing!)
- **Cite your sources**
The basic structure of a plot:
ggplot(
  data = DATASET,
  # aes stands for aesthetics--it is where you tell R what plot features should represent what variables.
  aes(
    # sometimes--like in the single variable plots we're making today--you
    # only need to specify one variable (x or y).
    # If you plot more than one variable you'll specify more than one thing here.
    # eg, maybe x and y variables and a third variable for color
    x = X.VARIABLE,
    y = Y.VARIABLE,
    fill = FILL.COLOR.VARIABLE,
    OTHER.STUFF)) + # add more elements to the plot with +
  # geoms tell R what kind of plot you want to make. More options in the ggplot cheatsheet.
  geom_XXXX()
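As a concrete illustration of the template above, the sketch below fills in the placeholders using `mpg`, a small dataset that ships with ggplot2. The choice of `mpg` and of the `class` and `drv` variables is an assumption made for illustration; it is not the dataset used in this lecture.

```r
library(ggplot2)

# mpg is a built-in ggplot2 dataset (an assumption for illustration,
# not the lecture's data).
p <- ggplot(
  data = mpg,
  aes(
    x = class,   # one bar per vehicle class
    fill = drv   # bars are colored by drive type
  )
) +
  geom_bar()     # geom_bar() counts the rows in each category

p
```

Only `x` is mapped here because a bar plot of counts computes the y values itself; `fill` shows how a second variable can be attached to another plot feature.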
Use `filter()` to get rid of the NAs before plotting the data:
# first we filter out the NAs
acs12_filtered <- filter(acs12, !is.na(employment))
# then we plot that new dataset
ggplot(data = acs12_filtered, aes(x = employment)) +
  geom_bar() +
  labs(y = "",
       x = "Labor force status",
       # and let's add a title too
       title = "Number of people in each employment category"
  )
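To see what the filter step does on its own, here is a small sketch on a made-up data frame (the `toy` data and its values are hypothetical, not taken from acs12). `is.na()` flags the missing values, `!` flips those flags, and `filter()` keeps only the rows where the result is TRUE.

```r
library(dplyr)

# Hypothetical toy data standing in for acs12
toy <- data.frame(
  employment = c("employed", NA, "unemployed", "employed", NA)
)

# Keep only the rows where employment is not missing
toy_filtered <- filter(toy, !is.na(employment))

nrow(toy)           # 5 rows before filtering
nrow(toy_filtered)  # 3 rows after filtering
```

Saving the result under a new name, as in the lecture code, leaves the original data untouched in case you need the full dataset later.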
Two messages come up:
- `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
- ... (`stat_bin()`).

Let’s start with the second one.
Now what about that other message? `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The function documentation tells us that `geom_histogram` gives you two ways to set the width of the bars: `bins` and `binwidth`.
- `binwidth` sets the width of each bar: e.g., each bar should represent 5 hours worked, or 10.
- Use `bins` instead to tell R how many bars there should be: 10 bars, 20 bars, etc.

The message is telling us that R took its best dumb guess, but we should pick something better.
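A minimal sketch of the two options, using made-up data (the `toy_hours` data frame and its `hours_worked` column are hypothetical stand-ins, not the lecture's dataset). Setting either argument yourself also makes the `bins = 30` message go away.

```r
library(ggplot2)

# Hypothetical data: weekly hours worked for 12 people
toy_hours <- data.frame(
  hours_worked = c(8, 12, 20, 20, 25, 32, 38, 40, 40, 40, 45, 60)
)

# Option 1: binwidth, so each bar covers 5 hours
p_binwidth <- ggplot(data = toy_hours, aes(x = hours_worked)) +
  geom_histogram(binwidth = 5)

# Option 2: bins, so the range of the data is split into 10 bars
p_bins <- ggplot(data = toy_hours, aes(x = hours_worked)) +
  geom_histogram(bins = 10)

p_binwidth
p_bins
```

Which one to use is a judgment call: `binwidth` is usually easier to explain to a reader ("each bar is 5 hours"), while `bins` is handy when you just want a rough number of bars.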