Introduction to R Programming
Exercise Solutions
Welcome
Welcome to the Intro to R Programming Workshop!
This is the solutions notebook. Only check this out if you want to see the solutions to the exercises!
For Part I (Google Colab) click here
For Part II (Google Colab) click here.
Link to Slides: https://favstats.github.io/ds3_r_intro/
Link to all Materials: https://github.com/favstats/ds3_r_intro
Exercises I
The following includes a list of exercises that you can complete on your own.
Task 1
Take a look at the table below.
Pick three animals from the Animal Lifespan data we haven’t talked about yet.
Assign the lifespan values to respective objects with appropriate names.
Animal | Maximum Longevity (in years) |
---|---|
Human | 122.5 |
Domestic dog | 24.0 |
Domestic cat | 30.0 |
American alligator | 77.0 |
Golden hamster | 3.9 |
King penguin | 26.0 |
Lion | 27.0 |
Greenland shark | 392.0 |
Galapagos tortoise | 177.0 |
African bush elephant | 65.0 |
California sea lion | 35.7 |
Fruit fly | 0.3 |
House mouse | 4.0 |
Giraffe | 39.5 |
Wild boar | 27.0 |
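
A minimal sketch of one possible solution, assuming the three chosen animals are the giraffe, the king penguin and the African bush elephant (the animals that appear in the outputs further down). The object names are illustrative choices; the values are read from the table above.

```r
# Maximum longevity in years, taken from the table above
giraffe_lifespan  <- 39.5
penguin_lifespan  <- 26.0
elephant_lifespan <- 65.0

# Print one of the objects to check the assignment
giraffe_lifespan
```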
Task 2
Create three (different) logical tests that compare the maximum longevity of your chosen animals.
Does the output you get make sense?
## [1] FALSE
## [1] TRUE
## [1] TRUE
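
The tests behind the output above are not shown, so the following is only a sketch under assumptions: it uses the giraffe, penguin and elephant values from the table and three tests chosen for illustration. They evaluate to FALSE, TRUE and TRUE, in that order.

```r
# Lifespans from the table (object names are illustrative)
giraffe_lifespan  <- 39.5
penguin_lifespan  <- 26.0
elephant_lifespan <- 65.0

# Does the giraffe outlive the elephant?
giraffe_lifespan > elephant_lifespan

# Is the penguin shorter-lived than the giraffe?
penguin_lifespan < giraffe_lifespan

# Do elephant and penguin have different lifespans?
elephant_lifespan != penguin_lifespan
```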
Task 3
Create two vectors with the help of c():
- strings (i.e. texts) of all the animals you chose
- the respective lifespan values (in the same order)
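
One way this could look, assuming the same three animals as in the outputs below. The vector names animal_names and animal_lifespans are hypothetical.

```r
# Names of the chosen animals (strings)
animal_names <- c("giraffe", "penguin", "elephant")

# Their maximum longevity in years, in the same order as the names
animal_lifespans <- c(39.5, 26.0, 65.0)

animal_names
animal_lifespans
```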
Task 5
5.1 Retrieve the second value of the vector that contains your animal names.
Tip: Square brackets are your friend.
## [1] "penguin"
5.2 Using code, find out which animals in your lifespans vector have a maximum longevity above 25 years.
Tip: For an elegant solution you need to use both vectors, square brackets and a logical test. If you need help, revisit the section Indexing with logical tests.
## [1] "giraffe" "penguin" "elephant"
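
Both parts of this task can be sketched as follows. The vector names are hypothetical and the values come from the table in Task 1; with these inputs the two results agree with the outputs shown above.

```r
animal_names <- c("giraffe", "penguin", "elephant")
animal_lifespans <- c(39.5, 26.0, 65.0)

# 5.1: square brackets with a position pick out a single element
animal_names[2]

# 5.2: the logical test returns TRUE/FALSE per animal,
# and indexing the names with it keeps only the TRUE positions
animal_names[animal_lifespans > 25]
```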
Task 6
Calculate the animal to human conversion ratios for the animals you’ve picked and assign the results to an object.
Task 7
Calculate the human years for the animals you picked, assuming they are all 5 years old.
## [1] 20.762712 23.557692 9.423077
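
A sketch of Tasks 6 and 7 together, using the table values and hypothetical object names. Note that with the table's giraffe value of 39.5 the first result is about 15.51, whereas the 20.76 in the output above would correspond to a lifespan of 29.5, so the original solution may have used a different number. The penguin and elephant results match the output.

```r
human_lifespan <- 122.5
animal_lifespans <- c(giraffe = 39.5, penguin = 26.0, elephant = 65.0)

# Task 6: how many human years correspond to one animal year
conversion_ratios <- human_lifespan / animal_lifespans

# Task 7: human years for animals that are all 5 years old
human_years <- 5 * conversion_ratios
human_years
```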
Task 8
Pick one of the animals you chose and create a function which takes as input animal years and outputs human years. Test the function and validate with results from the seventh exercise.
You can name the function in this style:
[your_animal_name]_to_human_years
Tip: If you need help, revisit the section Dog to Human years function
Create the function here:
penguin_to_human_years <- function(animal_years, human_lifespan = 122.5, penguin_lifespan = 26){
  ratio <- human_lifespan/penguin_lifespan
  human_years <- animal_years*ratio
  return(human_years)
}
Try it out here:
## [1] 23.55769
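
The call behind this output is not shown; presumably it is the function applied to 5 penguin years. The sketch below restates the function in compact form so that it runs on its own, then compares the result with the direct calculation from Task 7.

```r
# Same logic as the function above, restated so this snippet is self-contained
penguin_to_human_years <- function(animal_years, human_lifespan = 122.5, penguin_lifespan = 26) {
  animal_years * human_lifespan / penguin_lifespan
}

# A 5-year-old penguin in human years
penguin_to_human_years(5)

# Validate against the direct calculation used in Task 7
all.equal(penguin_to_human_years(5), 5 * (122.5 / 26))
```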
Exercises II
The following includes a list of exercises that you can complete on your own.
We are going to use the palmerpenguins dataset for the tasks ahead!
Functions reference list
For reference, here is a list of some useful functions.
If you have trouble with any of these functions, try reading the documentation with ?function_name
Remember: all these functions take the data first.
- filter(): Subset rows using column values
- mutate(): Create, modify, and delete columns
- rename(): Rename columns
- select(): Subset columns using their names and types
- summarise() / summarize(): Summarise each group to fewer rows
- group_by() / ungroup(): Group by one or more variables
- arrange(): Arrange rows by column values
- count() / tally(): Count observations by group
- distinct(): Subset distinct/unique rows
- pull(): Extract a single column
- ifelse(): Useful for coding binary variables
- case_when(): Useful for recoding (when ifelse is not enough)
- separate(): Split one column into several columns at a separator
- pivot_wider(): Turn data into wide format
- pivot_longer(): Turn data into long format
Task 1
Load the tidyverse and janitor packages.
If janitor is not installed yet (R will report that there is no package called 'janitor'), install it first.
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0 ✔ purrr 0.3.5
## ✔ tibble 3.2.1 ✔ dplyr 1.1.2
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.3 ✔ forcats 1.0.0
## Warning: package 'tibble' was built under R version 4.2.3
## Warning: package 'dplyr' was built under R version 4.2.3
## Warning: package 'forcats' was built under R version 4.2.3
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
##
## Attaching package: 'janitor'
##
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
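
The startup messages above could come from code along these lines. This is a sketch: the install step and the choice of CRAN mirror are assumptions, and installing requires an internet connection.

```r
# Install janitor only if it is missing
# (the mirror URL is an assumption; any CRAN mirror works)
if (!requireNamespace("janitor", quietly = TRUE)) {
  install.packages("janitor", repos = "https://cloud.r-project.org")
}

# Attach both packages for the tasks ahead
library(tidyverse)
library(janitor)
```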
Task 2
Read in the already cleaned palmerpenguins dataset using read_csv and the following url: https://raw.githubusercontent.com/allisonhorst/palmerpenguins/main/inst/extdata/penguins.csv
Assign the resulting data to penguins.
Then take a look at it using glimpse.
What kind of variables can you recognize?
penguins <- read_csv("https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv")
## Rows: 344 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): species, island, sex
## dbl (5): bill_length_mm, bill_depth_mm, flipper_length_mm, body_mass_g, year
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## Rows: 344
## Columns: 8
## $ species <chr> "Adelie", "Adelie", "Adelie", "Adelie", "Adelie", "A…
## $ island <chr> "Torgersen", "Torgersen", "Torgersen", "Torgersen", …
## $ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, …
## $ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, …
## $ flipper_length_mm <dbl> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186…
## $ body_mass_g <dbl> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, …
## $ sex <chr> "male", "female", "female", NA, "female", "male", "f…
## $ year <dbl> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007…
Task 3
Only keep the variables: species, island and sex.
## # A tibble: 344 × 3
## species island sex
## <chr> <chr> <chr>
## 1 Adelie Torgersen male
## 2 Adelie Torgersen female
## 3 Adelie Torgersen female
## 4 Adelie Torgersen <NA>
## 5 Adelie Torgersen female
## 6 Adelie Torgersen male
## 7 Adelie Torgersen female
## 8 Adelie Torgersen male
## 9 Adelie Torgersen <NA>
## 10 Adelie Torgersen <NA>
## # ℹ 334 more rows
Only keep variables 2 to 4.
## # A tibble: 344 × 3
## island bill_length_mm bill_depth_mm
## <chr> <dbl> <dbl>
## 1 Torgersen 39.1 18.7
## 2 Torgersen 39.5 17.4
## 3 Torgersen 40.3 18
## 4 Torgersen NA NA
## 5 Torgersen 36.7 19.3
## 6 Torgersen 39.3 20.6
## 7 Torgersen 38.9 17.8
## 8 Torgersen 39.2 19.6
## 9 Torgersen 34.1 18.1
## 10 Torgersen 42 20.2
## # ℹ 334 more rows
Remove the column year.
## # A tibble: 344 × 7
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 1 more variable: sex <chr>
Only include columns that contain “mm” in the variable name.
## # A tibble: 344 × 3
## bill_length_mm bill_depth_mm flipper_length_mm
## <dbl> <dbl> <dbl>
## 1 39.1 18.7 181
## 2 39.5 17.4 186
## 3 40.3 18 195
## 4 NA NA NA
## 5 36.7 19.3 193
## 6 39.3 20.6 190
## 7 38.9 17.8 181
## 8 39.2 19.6 195
## 9 34.1 18.1 193
## 10 42 20.2 190
## # ℹ 334 more rows
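
The four selection tasks above can be sketched with select(). To keep the snippet runnable without a download it uses penguins_mini, a hypothetical stand-in holding the first three rows of the printed data; with the full data from Task 2, use penguins instead. The result names are illustrative.

```r
library(dplyr)

# Hypothetical stand-in: first three rows of the penguins data shown above
penguins_mini <- tibble(
  species = c("Adelie", "Adelie", "Adelie"),
  island = c("Torgersen", "Torgersen", "Torgersen"),
  bill_length_mm = c(39.1, 39.5, 40.3),
  bill_depth_mm = c(18.7, 17.4, 18.0),
  flipper_length_mm = c(181, 186, 195),
  body_mass_g = c(3750, 3800, 3250),
  sex = c("male", "female", "female"),
  year = c(2007, 2007, 2007)
)

# Keep columns by name
by_name <- select(penguins_mini, species, island, sex)

# Keep columns by position (variables 2 to 4)
by_position <- select(penguins_mini, 2:4)

# Drop a column with a minus sign
without_year <- select(penguins_mini, -year)

# Keep columns whose name contains "mm"
only_mm <- select(penguins_mini, contains("mm"))
```

Each call can also be written with the pipe, for example penguins_mini %>% select(species, island, sex).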
Task 4
Rename island to location.
## # A tibble: 344 × 1
## location
## <chr>
## 1 Torgersen
## 2 Torgersen
## 3 Torgersen
## 4 Torgersen
## 5 Torgersen
## 6 Torgersen
## 7 Torgersen
## 8 Torgersen
## 9 Torgersen
## 10 Torgersen
## # ℹ 334 more rows
## # A tibble: 344 × 8
## species location bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 2 more variables: sex <chr>, year <dbl>
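
The two outputs above differ: the first keeps only the renamed column, the second keeps all eight. A plausible explanation, sketched below on a small hypothetical stand-in (penguins_mini), is that the first comes from renaming inside select() and the second from rename().

```r
library(dplyr)

# Hypothetical stand-in with three of the eight columns
penguins_mini <- tibble(
  species = c("Adelie", "Adelie"),
  island = c("Torgersen", "Torgersen"),
  year = c(2007, 2007)
)

# select() can rename while selecting, but drops every other column
only_location <- select(penguins_mini, location = island)

# rename() changes the name and keeps all columns
renamed <- rename(penguins_mini, location = island)
```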
Task 5
Filter the data so that species only includes Chinstrap.
## # A tibble: 68 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Chinstrap Dream 46.5 17.9 192 3500
## 2 Chinstrap Dream 50 19.5 196 3900
## 3 Chinstrap Dream 51.3 19.2 193 3650
## 4 Chinstrap Dream 45.4 18.7 188 3525
## 5 Chinstrap Dream 52.7 19.8 197 3725
## 6 Chinstrap Dream 45.2 17.8 198 3950
## 7 Chinstrap Dream 46.1 18.2 178 3250
## 8 Chinstrap Dream 51.3 18.2 197 3750
## 9 Chinstrap Dream 46 18.9 195 4150
## 10 Chinstrap Dream 51.3 19.9 198 3700
## # ℹ 58 more rows
## # ℹ 2 more variables: sex <chr>, year <dbl>
Filter the data so that species only includes Chinstrap or Gentoo.
## # A tibble: 192 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Gentoo Biscoe 46.1 13.2 211 4500
## 2 Gentoo Biscoe 50 16.3 230 5700
## 3 Gentoo Biscoe 48.7 14.1 210 4450
## 4 Gentoo Biscoe 50 15.2 218 5700
## 5 Gentoo Biscoe 47.6 14.5 215 5400
## 6 Gentoo Biscoe 46.5 13.5 210 4550
## 7 Gentoo Biscoe 45.4 14.6 211 4800
## 8 Gentoo Biscoe 46.7 15.3 219 5200
## 9 Gentoo Biscoe 43.3 13.4 209 4400
## 10 Gentoo Biscoe 46.8 15.4 215 5150
## # ℹ 182 more rows
## # ℹ 2 more variables: sex <chr>, year <dbl>
Filter the data so it includes only penguins that are male and of the species Adelie.
## # A tibble: 73 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.3 20.6 190 3650
## 3 Adelie Torgersen 39.2 19.6 195 4675
## 4 Adelie Torgersen 38.6 21.2 191 3800
## 5 Adelie Torgersen 34.6 21.1 198 4400
## 6 Adelie Torgersen 42.5 20.7 197 4500
## 7 Adelie Torgersen 46 21.5 194 4200
## 8 Adelie Biscoe 37.7 18.7 180 3600
## 9 Adelie Biscoe 38.2 18.1 185 3950
## 10 Adelie Biscoe 38.8 17.2 180 3800
## # ℹ 63 more rows
## # ℹ 2 more variables: sex <chr>, year <dbl>
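
The three filter tasks above could be solved roughly like this. The sketch runs on penguins_mini, a small made-up stand-in with only the two columns involved; with the full data, replace it with penguins. The result names are illustrative.

```r
library(dplyr)

# Hypothetical stand-in: two penguins per species, values made up for illustration
penguins_mini <- tibble(
  species = c("Adelie", "Adelie", "Chinstrap", "Chinstrap", "Gentoo", "Gentoo"),
  sex = c("male", "female", "female", "male", "female", "male")
)

# One species: test for equality
chinstrap_only <- filter(penguins_mini, species == "Chinstrap")

# Either of two species: %in% checks membership in a vector
chinstrap_or_gentoo <- filter(penguins_mini, species %in% c("Chinstrap", "Gentoo"))

# Two conditions at once: conditions separated by a comma must all hold
male_adelie <- filter(penguins_mini, sex == "male", species == "Adelie")
```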
Task 6
Create three new variables that convert bill_length_mm, bill_depth_mm and flipper_length_mm from millimeters to centimeters.
Tip: divide the length value by 10.
mutate(penguins,
  bill_length_cm = bill_length_mm/10,
  bill_depth_cm = bill_depth_mm/10,
  flipper_length_cm = flipper_length_mm/10
)
## # A tibble: 344 × 11
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 5 more variables: sex <chr>, year <dbl>, bill_length_cm <dbl>,
## # bill_depth_cm <dbl>, flipper_length_cm <dbl>
penguins %>%
  mutate(bill_length_cm = bill_length_mm/10,
         bill_depth_cm = bill_depth_mm/10,
         flipper_length_cm = flipper_length_mm/10)
## # A tibble: 344 × 11
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 5 more variables: sex <chr>, year <dbl>, bill_length_cm <dbl>,
## # bill_depth_cm <dbl>, flipper_length_cm <dbl>
Create a new variable called bill_depth_cat which has two values:
- A bill depth of 18 mm or more is “high”
- A bill depth below 18 mm is “low”
## # A tibble: 344 × 9
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 3 more variables: sex <chr>, year <dbl>, bill_depth_cat <chr>
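
The code for this step is not shown; one way to get a two-valued variable is ifelse() inside mutate(). The sketch below is an assumption about the approach and runs on a hypothetical stand-in (penguins_mini) holding the first four bill depths printed above.

```r
library(dplyr)

# Hypothetical stand-in: first four bill depths from the output above
penguins_mini <- tibble(bill_depth_mm = c(18.7, 17.4, 18.0, NA))

# 18 mm and above is "high", everything below is "low";
# missing depths stay missing
with_category <- penguins_mini %>%
  mutate(bill_depth_cat = ifelse(bill_depth_mm >= 18, "high", "low"))

with_category
```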
Create a new variable called species_short.
- Adelie should become A
- Chinstrap should become C
- Gentoo should become G
mutate(penguins,
  species_short = case_when(
    species == "Adelie" ~ "A",
    species == "Chinstrap" ~ "C",
    species == "Gentoo" ~ "G"
  ))
## # A tibble: 344 × 9
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 3 more variables: sex <chr>, year <dbl>, species_short <chr>
penguins %>%
  mutate(species_short = case_when(
    species == "Adelie" ~ "A",
    species == "Chinstrap" ~ "C",
    species == "Gentoo" ~ "G"
  ))
## # A tibble: 344 × 9
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 3 more variables: sex <chr>, year <dbl>, species_short <chr>
Task 7
Calculate the average body_mass_g per island.
grouped_by_island <- group_by(penguins, island)
summarise(grouped_by_island, avg_body_mass_g = mean(body_mass_g, na.rm = TRUE))
## # A tibble: 3 × 2
## island avg_body_mass_g
## <chr> <dbl>
## 1 Biscoe 4716.
## 2 Dream 3713.
## 3 Torgersen 3706.
If you haven’t done so already, try using the %>% operator to do this.
## # A tibble: 3 × 2
## island avg_body_mass_g
## <chr> <dbl>
## 1 Biscoe 4716.
## 2 Dream 3713.
## 3 Torgersen 3706.
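
The piped version is not shown; it might look like the sketch below, which chains group_by() and summarise() instead of storing the grouped data first. It runs on a hypothetical stand-in (penguins_mini) with a handful of body masses taken from earlier outputs, so the averages differ from the full-data result above.

```r
library(dplyr)

# Hypothetical stand-in: a few island/body mass pairs, including one missing value
penguins_mini <- tibble(
  island = c("Biscoe", "Biscoe", "Dream", "Torgersen", "Torgersen"),
  body_mass_g = c(4500, 5700, 3500, 3750, NA)
)

# Group, then summarise each group to one row; na.rm drops missing masses
avg_by_island <- penguins_mini %>%
  group_by(island) %>%
  summarise(avg_body_mass_g = mean(body_mass_g, na.rm = TRUE))

avg_by_island
```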
Task 8
Use the pipe operator (%>%) to do all the operations below.
- Filter the penguins data so that it only includes Chinstrap or Adelie.
- Rename sex to observed_sex
- Only keep the variables species, observed_sex, bill_length_mm and bill_depth_mm
- Calculate the ratio between bill_length_mm and bill_depth_mm
- Sort the data by the highest ratio
Try to create the pipe step by step and execute code as you go to see if it works.
Once you are done, assign the data to new_penguins.
new_penguins <- penguins %>%
  filter(species %in% c("Chinstrap", "Adelie")) %>%
  rename(observed_sex = sex) %>%
  select(species, observed_sex, bill_length_mm, bill_depth_mm) %>%
  mutate(ratio = bill_length_mm/bill_depth_mm) %>%
  arrange(desc(ratio))
new_penguins
## # A tibble: 220 × 5
## species observed_sex bill_length_mm bill_depth_mm ratio
## <chr> <chr> <dbl> <dbl> <dbl>
## 1 Chinstrap female 58 17.8 3.26
## 2 Chinstrap female 48.1 16.4 2.93
## 3 Chinstrap female 49.8 17.3 2.88
## 4 Chinstrap male 52 18.1 2.87
## 5 Chinstrap female 50.9 17.9 2.84
## 6 Chinstrap female 46.8 16.5 2.84
## 7 Chinstrap female 47.5 16.8 2.83
## 8 Chinstrap female 46.9 16.6 2.83
## 9 Chinstrap male 51.3 18.2 2.82
## 10 Chinstrap male 55.8 19.8 2.82
## # ℹ 210 more rows
Calculate the average ratio by species and sex, again using pipes.
## `summarise()` has grouped output by 'island'. You can override using the
## `.groups` argument.
## # A tibble: 9 × 3
## # Groups: island [3]
## island sex avg_body_mass_g
## <chr> <chr> <dbl>
## 1 Biscoe female 4319.
## 2 Biscoe male 5105.
## 3 Biscoe <NA> 4588.
## 4 Dream female 3446.
## 5 Dream male 3987.
## 6 Dream <NA> 2975
## 7 Torgersen female 3396.
## 8 Torgersen male 4035.
## 9 Torgersen <NA> 3681.
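
The output printed above shows average body mass by island and sex rather than the average ratio by species and sex that the task asks for. A sketch closer to the task wording is given below. It assumes the renamed column observed_sex from the previous step and runs on new_penguins_mini, a hypothetical stand-in built from four rows that appear in earlier outputs.

```r
library(dplyr)

# Hypothetical stand-in for new_penguins: four rows taken from earlier outputs
new_penguins_mini <- tibble(
  species = c("Chinstrap", "Chinstrap", "Adelie", "Adelie"),
  observed_sex = c("female", "male", "male", "female"),
  bill_length_mm = c(58, 52, 39.1, 39.5),
  bill_depth_mm = c(17.8, 18.1, 18.7, 17.4)
) %>%
  mutate(ratio = bill_length_mm / bill_depth_mm)

# Average ratio per species and sex; .groups = "drop" returns an ungrouped tibble
avg_ratio <- new_penguins_mini %>%
  group_by(species, observed_sex) %>%
  summarise(avg_ratio = mean(ratio, na.rm = TRUE), .groups = "drop")

avg_ratio
```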
Task 9
Count the number of penguins by island and species.
## # A tibble: 5 × 3
## island species n
## <chr> <chr> <int>
## 1 Biscoe Adelie 44
## 2 Biscoe Gentoo 124
## 3 Dream Adelie 56
## 4 Dream Chinstrap 68
## 5 Torgersen Adelie 52
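
A count like the one above can be produced with count(), which groups by the listed columns and returns the group sizes in a column n. The sketch uses a hypothetical stand-in (penguins_mini) with made-up rows, so the counts differ from the full-data output.

```r
library(dplyr)

# Hypothetical stand-in: five penguins on three islands
penguins_mini <- tibble(
  island = c("Biscoe", "Biscoe", "Biscoe", "Dream", "Torgersen"),
  species = c("Adelie", "Gentoo", "Gentoo", "Chinstrap", "Adelie")
)

# One row per island/species combination that occurs in the data
counts <- count(penguins_mini, island, species)

counts
```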
Task 10
Below is a dataset that needs some cleaning.
Use the skills that you have learned so far to turn the data into a tidy dataset.
animal_friends <- tibble(
  Names = c("Francis", "Catniss", "Theodor", "Eugenia"),
  TheAnimals = c("Dog", "Cat", "Hamster", "Rabbit"),
  Sex = c("m", "f", "m", "f"),
  a_opterr = c("me", "me", "me", "me"),
  `Age/Adopted/Condition` = c("8/2020/Very Good", "13/2019/Wild", "1/2021/Fair", "2/2020/Good")
)
Start here:
tidy_animal_friends <- animal_friends %>%
  ## first clean the names
  clean_names() %>%
  ## rename some variables
  rename(adopter = a_opterr,
         animals = the_animals) %>%
  ## drop columns that hold the same value in every row (here: adopter)
  remove_constant() %>%
  ## split the combined column into three
  separate(age_adopted_condition, into = c("age", "year_adopted", "condition"), sep = "/")
## # A tibble: 4 × 6
## names animals sex age year_adopted condition
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Francis Dog m 8 2020 Very Good
## 2 Catniss Cat f 13 2019 Wild
## 3 Theodor Hamster m 1 2021 Fair
## 4 Eugenia Rabbit f 2 2020 Good
If you are done, turn the final data into long format.
## # A tibble: 16 × 4
## names animals name value
## <chr> <chr> <chr> <chr>
## 1 Francis Dog sex m
## 2 Francis Dog age 8
## 3 Francis Dog year_adopted 2020
## 4 Francis Dog condition Very Good
## 5 Catniss Cat sex f
## 6 Catniss Cat age 13
## 7 Catniss Cat year_adopted 2019
## 8 Catniss Cat condition Wild
## 9 Theodor Hamster sex m
## 10 Theodor Hamster age 1
## 11 Theodor Hamster year_adopted 2021
## 12 Theodor Hamster condition Fair
## 13 Eugenia Rabbit sex f
## 14 Eugenia Rabbit age 2
## 15 Eugenia Rabbit year_adopted 2020
## 16 Eugenia Rabbit condition Good
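
The long format above could be obtained with pivot_longer(), sketched here under the assumption that names and animals stay as identifier columns. The snippet rebuilds tidy_animal_friends from the values printed earlier so that it runs on its own; long_animal_friends is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Same values as the tidy_animal_friends output shown earlier
tidy_animal_friends <- tibble(
  names = c("Francis", "Catniss", "Theodor", "Eugenia"),
  animals = c("Dog", "Cat", "Hamster", "Rabbit"),
  sex = c("m", "f", "m", "f"),
  age = c("8", "13", "1", "2"),
  year_adopted = c("2020", "2019", "2021", "2020"),
  condition = c("Very Good", "Wild", "Fair", "Good")
)

# Stack the four listed columns into name/value pairs (the default column names)
long_animal_friends <- tidy_animal_friends %>%
  pivot_longer(cols = c(sex, age, year_adopted, condition))

long_animal_friends
```

All four pivoted columns are character here, which is why they can share a single value column without any conversion.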