knitr::opts_chunk$set(echo = TRUE, message = FALSE)
The following describes the knitr chunk options, including the ones set above. They control how the R Markdown document renders when you knit it to HTML.
include = FALSE prevents code and results from appearing in the finished file. R Markdown still runs the code in the chunk, and the results can be used by other chunks. echo = FALSE prevents code, but not the results, from appearing in the finished file. This is a useful way to embed figures. message = FALSE prevents messages that are generated by code from appearing in the finished file. warning = FALSE prevents warnings that are generated by code from appearing in the finished file.
#First we install some packages. This works best from the Packages tab in the lower right of RStudio. It can also be done from the console with install.packages(), for example install.packages("tidyverse"). After we install, we need to load them with the library() lines below. In this example we are using tidyverse and lubridate, plus here for building file paths.
library(tidyverse)
library(lubridate)
library(here) #We didn't load this package in class, but it is very helpful. It builds file paths starting from the project's top-level folder, which gives you some flexibility when writing the "read_csv" command.
#Here we read in the data.
df <- read_csv(here("hydrology/R_intro/data/temp_df.csv"))
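#If you type in str(df) you will see that the date column is not recognized as a date. To fix that we use as.Date(), which is part of base R, and tell it that the text is written as month/day/year. The mdy() function from lubridate would also work here.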
df$date <- as.Date(df$date, format = "%m/%d/%Y")
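#A minimal sketch of checking the conversion, using a made-up vector (example_dates is hypothetical and is not part of temp_df.csv). class() shows whether R treats the values as text or as dates. mdy() from lubridate is shown only as an alternative to as.Date() for month/day/year text.
library(lubridate)
example_dates <- c("01/15/2020", "12/31/2020")
class(example_dates) #"character", so R still sees plain text
parsed_base <- as.Date(example_dates, format = "%m/%d/%Y")
parsed_lub <- mdy(example_dates)
class(parsed_base) #"Date"
all(parsed_base == parsed_lub) #TRUE, both approaches give the same dates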
#Here we can do some data wrangling. The following uses the dplyr component of tidyverse. Here is a link to information about tidyverse https://www.tidyverse.org/
#In this line we use what is referred to as a pipe (%>%). The pipe takes the result on its left and passes it as the first argument to the function on the next line. Using mutate we can create new columns, do calculations, change an existing column, etc.
df <- df %>%
  mutate(
    t_range = tmax - tmin,
    tmax_f = (tmax * (9 / 5)) + 32,
    tmin_f = (tmin * (9 / 5)) + 32
  )
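#As a quick check of the Celsius to Fahrenheit formula, here is a small sketch on made-up values (toy_temps is hypothetical, not the class data). 0 degrees C should come out as 32 degrees F and 100 degrees C as 212 degrees F.
library(dplyr)
toy_temps <- data.frame(tmin = c(-10, 0), tmax = c(0, 100))
toy_temps <- toy_temps %>%
  mutate(t_range = tmax - tmin, tmax_f = (tmax * (9 / 5)) + 32)
toy_temps$tmax_f #32 212
toy_temps$t_range #10 100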
#Select can be used to either keep or get rid of columns. Since we don't want the Fahrenheit columns we just made with mutate, we can get rid of them by saying "select(-tmax_f, -tmin_f)". The minus in front of the header name means get rid of that column. Conversely, if we said "select(tmax_f, tmin_f)", it would keep those two columns and get rid of everything else.
df <- df %>%
  select(-tmax_f, -tmin_f)
#The line below is very helpful for doing averages or min or max by day or year or month. We will use this in the climate data analysis. The year(), day() and month() functions are part of the lubridate package, and the pipe and mutate come from dplyr. As a side note, errors often occur because a package didn't load properly. Computers are imperfect. So a first step in troubleshooting when you get an error is to load the package again. An error that says could not find function "X" usually means the package that provides X isn't actually loaded.
df <- df %>%
  mutate(year = year(date), day = day(date), month = month(date))
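#A minimal sketch of how a month column can be used for summaries, on made-up data (toy_daily and its values are hypothetical). group_by() and summarise() are dplyr functions that are not used elsewhere in this script. They are shown here as one possible approach, not the only one.
library(dplyr)
library(lubridate)
toy_daily <- data.frame(
  date = as.Date(c("2020-01-01", "2020-01-02", "2020-02-01")),
  tmax = c(-2, 4, 10)
)
toy_monthly <- toy_daily %>%
  mutate(month = month(date)) %>%
  group_by(month) %>%
  summarise(mean_tmax = mean(tmax), max_tmax = max(tmax))
toy_monthly #one row per month: month 1 has mean 1 and max 4, month 2 has mean 10 and max 10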
#Because we made a column of months above we can do things like filter by month. You should play around with filtering by day or year. The "==" tests whether two values are equal, so this filter keeps only the rows where the month value equals 1.
df_jan <- df %>%
  filter(month == 1)
#We can also filter by other values. For example, we can filter by temperature. Here we are saying to only keep the rows where the max temp is less than 0. You can play around with changing this and see what happens. Filter functions are really useful. For example, in your assignment you will need to find the max temp for each day from the Campbell data logger (15-minute data). Consider how to do this given the code for filtering shown above and below.
df_neg_temp <- df %>%
  filter(tmax < 0)
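#A small sketch of combining conditions in filter(), on made-up data (toy_filter is hypothetical). Separating conditions with a comma keeps only the rows where all of them are true.
library(dplyr)
toy_filter <- data.frame(month = c(1, 1, 2), tmax = c(-5, 3, -1))
toy_cold_jan <- toy_filter %>%
  filter(month == 1, tmax < 0)
nrow(toy_cold_jan) #1, only the first row is in January and below 0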
#Last, we can plot some data. ggplot is part of the tidyverse. In this example we plot the max temp against date and color each point by date. In the final line we change the color ramp from the default (shades of blue) to something with more contrast. You can take that line away or put it back in to see what happens. You can also change the colors to whatever you like. Here is a link to the colors in ggplot: http://sape.inf.usi.ch/quick-reference/ggplot2/colour
ggplot(df, aes(x = date, y = tmax, color = date)) +
  geom_point() +
  ylab(expression(Maximum~temperature~(""^o*C))) +
  xlab("Date") +
  theme_linedraw(base_size = 16) +
  scale_color_gradient(low = 'cyan', high = 'deeppink', trans = "date")
#Last, you may have noticed that all of the comments in here have a "#" in front of them. Putting the hash sign in front of text tells R that it is not code. This is called commenting your code, and it lets other people know what each chunk is doing.