Chapter 3 Part 3: Bucket and Process-Based Models

3.1 Classifying model structure

3.1.1 Learning Module

Throughout the rest of the course, we will gather data and create models to explore how environmental factors, such as snowmelt, land cover, and topography, impact runoff. To discuss these methods, we should review some modeling terminology describing model complexity and type.

Environmental models, including hydrological models, are built around simplifying assumptions of natural systems. The complexity of the model may depend on its application. Effective hydrological models share key traits: they are simple, parsimonious, and robust across various watersheds. In other words, they are easy to understand and streamlined and consistently perform well across different basins or even geographical areas. Therefore, more complex is only sometimes better.

3.1.1.1 Spatial Complexity

There are general terms that classify the spatial complexity of hydrological models:

A lumped system is one in which the dependent variables of interest are a function of time alone, and the study basin is spatially ‘lumped’ or assumed to be spatially homogeneous across the basin. So far in this course, we have focused mainly on lumped models. You may remember the figure below from the transfer functions module. It represents the lumped watershed as a bucket with a single input, outlet output, and storage volume for each timestep.

A distributed system is one in which all dependent variables are functions of time and one or more spatial variables. Modeling a distributed system means partitioning our basins into raster cells (grids) and assigning inputs, outputs, and the spatial variables that affect inputs and outputs across these cells. We then calculate the processes at the cell level and route them downstream. These models allow us to represent the spatial complexity of physically based processes. They can simulate or forecast parameters other than streamflow, such as soil moisture, evapotranspiration, and groundwater recharge.

A semi-distributed system is an intermediate approach that combines elements of both lumped and distributed systems. Certain variables may be spatially distributed, while others are treated as lumped. Alternatively, we can divide the watershed into sub-basins and treat each sub-basin as a lumped basin. Outputs from each sub-basin are then linked together and routed downstream. Semi-distribution allows for a more nuanced representation of the basin’s characteristics, acknowledging spatial variability where needed while maintaining some simplifications for computation efficiency.

In small-scale studies, we can design a model structure that fits the specific situation well. However, when we are dealing with larger areas, model design may be challenging. Our data might differ across regions with variable climate and landscape features. Sometimes, it is best to use a complex model to capture all the different processes happening over a big area. However, it could be better to stick with a simpler model because we might have limited data or the number of calculations is very computationally expensive. It is up to the modeler to determine the simplest model that meets the desired application.

3.1.1.2 Modeling Approaches

Empirical Models are based on empirical analysis of observed inputs (e.g., rainfall) or outputs (ET, discharge). These simple models may not be transferable to other watersheds. Also, they may not reveal much about the physical processes influencing runoff. Therefore, these types of models may not be valid if the study area experiences land use or climate change.

Conceptual Models describe processes with simple mathematical equations. For example, we might use a simple linear equation to interpolate precipitation inputs over a watershed with a high elevation gradient using precipitation measurements from two points (high and low). This represents the basic relationship between precipitation and elevation, but does not capture all features that affect precipitation patterns (e.g. aspect, prevailing winds). The combined impact of these factors is probably negligible compared to the substantial amount of data required to accurately model them.

Physically Based Models These models offer deep insights into the processes governing runoff generation by relying on fundamental physical equations like mass conservation. However, they come with drawbacks. Their implementation often demands complex numerical solving methods and a significant volume of input data. Without empirical data to validate these techniques, there is a risk of introducing substantial uncertainty into our models, reducing their reliability and effectiveness.

An example of a spatial distributed and physically based watershed model(Huning and Marguilis, 2015):

When modeling watersheds, we often use a mix of empirical, conceptual, and physically based models. The choice of model type depends on factors like the data we have, the time or computing resources we can allocate, and how we plan to use the model.

These categorizations provide a philosophical foundation of how we understand and simulate systems. However we can also consider classifications that focus on the quantitative tools and techniques we use to implement these approaches. Consider that we have already applied each of these tools:

Probability Models Many environmental processes can be thought of or modeled as stochastic, meaning a variable may take on any value within a specified range or set of values with a certain probability. Probability can be thought of in terms of the relative frequency of an event. We utilized probability models in the return intervals module where we observed precipitation data, and used that data to develop probability distributions to estimate likely outcomes for runoff. Probability models allow us to quantify risk and variability in systems.

Regression Models Often we are interested in modeling processes with limited data, or processes that aren’t well understood. Regression assumes that there is a relationship between dependent and independent variables (you may also see modelers call these explanatory and response variables). We utilized regression methods in the hydrograph separation module to consider process-based mechanisms that differed among watersheds.

Simulation Models Simulation models can simulate time series of hydrologic variables (as in the following snow melt module), or they can simulate characteristics of the modeled system, as we saw in the Monte Carlo module. These types of models are based on an assumption of what the significant variables are, an understanding of the important processes are, and/or a derivation of these physical processes from first principles (mass, energy balance).

3.1.1.3 A priori model selection:

By understanding the different frameworks of environmental modeling, we can choose the right tools for the right context, depending on our data, goals and resources. In reality, the final model selection is a fluid process requiring multiple iterations at each step. In Keith Beven’s Rainfall-Runoff Modelling Primer, they illustrate the process as:

While we aim to give some hands on experience across multiple model types, there is a wide range of possible models! Why would the most complex model, or one that represents the most elements in a system be best? Why even consider a simple bucket model?

Many modelers have observed that the number of parameters required to describe a key behavior in a watershed is often quite low, meaning increasing the number of parameter does not result in a significantly improved model. This idea that simple models are often sufficient fo representing a system have led to the investigation of parsimonious model structures (less complex).Consider though, that the model must sufficiently represent processes or it will be too unreliable outside of the range of conditions on which it was calibrated.

Now that we have reviewed some concepts, let’s move forward into a time series simulation with snowmelt models.

3.2 Snowmelt Models

3.2.1 Learning Module

3.2.1.1 Background:

Understanding snowmelt runoff is crucial for managing water resources and assessing flood risks, as it plays a significant role in the hydrologic cycle. Annual runoff and peak flow are influenced by snowmelt, rainfall, or a combination of both. In regions with a snowmelt-driven hydrologic cycle, such as the Rocky Mountains, snowpack acts as a natural reservoir, storing water during the winter months and releasing it gradually during the spring and early summer, thereby playing a vital role in maintaining water availability for various uses downstream. By examining how snowmelt interacts with other factors like precipitation, land cover, and temperature, we can better anticipate water supply fluctuations and design effective flood management strategies.

Learning objectives:

In this module, our primary focus will be modeling snowmelt as runoff, enabling us to predict when it will impact streamflow timing. We will consider some factors that may influence runoff timing. However, the term ‘snowmelt modeling’ is a field in itself and can represent a lifetime worth of work. There are many uses for snowmelt modeling (e.g., climate science and avalanche forecasting). If you are interested in exploring more on this subject, there is an excellent Snow Hydrology: Focus on Modeling series offered by CUAHSI’s Virtual University on YouTube.

Helpful terms:

The most common way to measure the water content of the snowpack is by the Snow Water Equivalent or SWE. The SWE is the water depth resulting from melting a unit column of the snowpack.

3.2.1.2 Model Approaches

Similar to the model development structure we discussed in the last module, snowmelt models are generally classified into three different types of abalation algorithms

Empirical and Data-Driven Models: These models use historical data and statistical techniques to predict runoff based on the relationship between snow characteristics (like snow area) and runoff. They use past patterns to make predictions about the future. The emergence of data-driven models has benefited from the growth of massive data and the rapid increase in computational power. These models simulate the changes in snowmelt runoff using machine learning algorithms to select appropriate parameters (e.g., daily rainfall, temperature, solar radiation, snow area, and snow water equivalent) from different data sources.

Conceptual Models: These models simplify the snowmelt process by establishing a simple, rule-based relationship between snowmelt and temperature. These models use a basic formula based on temperature to estimate how much snow will melt.

Physical Models: The physical snowmelt models calculate snowmelt based on the energy balance of snow cover. If all the heat fluxes toward the snowpack are considered positive and those away are considered negative, the sum of these fluxes is equal to the change in heat content of the snowpack for a given time period. Fluxes considered may be

net solar radiation (solar incoming minus albedo),
thermal radiation,
sensible heat transfer of air (e.g., when air is a different temperature than snowpack),
latent heat of vaporization from condensation or evaporation/sublimation, heat conducted from the ground,
advected heat from precipitation

examples: layered snow thermal model (SNTHERM) and physical snowpack model (SNOWPACK),

Many effective models may incorporate elements from some or all of these modeling approaches.

3.2.1.3 Spatial complexity

We may also identify models based on the model architecture or spatial complexity. The architecture can be designed based on assumptions about the physical processes that may affect the snowmelt to runoff volume and timing.

Homogenous basin modeling: You may also hear these types of models referred to as ‘black box’ models. Black-box models do not provide a detailed description of the underlying hydrological processes. Instead, they are typically expressed as empirical models that rely on statistical relationships between input and output variables. While these models can predict specific outcomes effectively, they may not be ideal for understanding the physical mechanisms that drive hydrological processes. In terms of snow cover, this is a simplistic case model where we assume:

the snow is consistent from top to bottom of the snow column and across the watershed
melt appears at the top of the snowpack
water immediately flows out the bottom

This type of modeling may work well if the snowpack is isothermal, if we are interested in runoff over large timesteps, or if we are modeling annual water budgets in lumped models.

Vertical layered modeling: Depending on the desired application of the model, snowmelt may be modeled in multiple layers in the snow column (air-snow surface to ground). Climate models, for example, may estimate phase changes or heat flux and consider the snowpack in 5 or more layers. Avalanche forecasters may need to consider grain evolution, density, water content, and more over hundreds of layers! Hydrologists may also choose variable layers, but many will choose single- or two-layer models for basin-wide studies, as simple models can be effective when estimating basin runoff. Here is a study by Dutra et al. (2012) that looked at the effect of the number of snow layers, liquid water representation, snow density, and snow albedo parameterizations within their tested models. Table 1 and figures 1-3 will be sufficient to understand the effects of changes to these parameters on modeled runoff and SWE. In this case, the three-layer model performed best when predicting the timing of the SWE and runoff, but density improved more by changing other parameters rather than layers (Figure 1).

Lateral spatial heterogeneity: The spatial extent of the snow cover determines the area contributing to runoff at any given time during the melt period. The more snow there is, the more water there will be when it melts. Therefore, snow cover tells us which areas will contribute water to rivers and streams as the snow melts. In areas with a lot of accumulated snow, the amount of snow covering the ground gradually decreases as the weather warms up. This melting process can take several months. How quickly the snow disappears depends on the landscape. For example, in mountainous ecosystems, factors like elevation, slope aspect, slope gradient, and forest structure affect how the snow can accumulate, evaporate or sublimate and how quickly the snow melts.

For mountain areas, similar patterns of depletion occur from year to year and can be related to the snow water equivalent (SWE) at a site, accumulated ablation, accumulated degree-days, or to runoff from the watershed using depletion curves from historical data. Here is an example of snow depletion curves developed using statistical modeling and remotely sensed data. The use of remotely sensed data can be incredibly helpful to providing estimates in remote areas with infrequent measurements. Observing depletion patterns may not be effective in ecosystems where patterns are more variable (e.g., prairies). However, stratified sampling with snow surveys, snow telemetry networks (SNOTEL) or continuous precipitation measurements can be used with techniques like cluster analyses or interpolation, to determine variables that influence SWE and estimate SWE depth or runoff over heterogeneous systems.

You can further explore all readings linked in the above section. These readings may assist in developing the workflow for your term project, though they are optional for completing this assignment. However, it is recommended that you review the figures to grasp the concepts and retain them for future reference if necessary.

3.2.1.4 Model choices: My snow is different from your snow

When determining the architecture of your snow model, your model choices will reflect the application of your model and the processes you are trying to represent. Recall that parsimony and simplicity often make for the most effective models. So, how do we know if we have a good model? Here are a few things we can check to validate our model choices:

Model Variability: A good model should produce consistent results when given the same inputs and conditions. Variability between model runs should be minimal if the watershed or environment is not changing.

Versatility: Check the model under a range of scenarios different from the conditions under which it was developed. The model should apply to similar systems or scenarios beyond the initial scope of development

Sensitivity Analysis: We reviewed this a bit in the Monte Carlo module. How do changes in model inputs impact outputs? A good model will show reasonable sensitivity changes in input parameters, with outputs responding as expected.

Validation with empirical data: Comparison with real-world data checks whether the model accurately represents the actual system

Applicability and simplicity: A good model should provide valuable insights or aid in decision-making processes relevant to the problem it addresses. It strikes a balance between complexity and simplicity, avoiding unnecessary intricacies that can lead to overfitting or computational inefficiency while sufficiently capturing the system’s complexities.

3.2.2 Labwork (20 pnts)

3.2.2.1 Download the repo for this lab HERE

In this module, we will simulate snowmelt in a montane watershed in central Colorado with a simple temperature model. The Fool Creek watershed is located in the Fraser Experimental Forest (FEF), 137 km west of Denver, Colorado. The FEF contains several headwater streams (Strahler order 1-3) within the Colorado Headwaters watershed which supplies much of the water to the populated Colorado Front Range. This Forest has been the site of many studies to evaluate the effects of land cover change on watershed hydrology. The Fool Creek is a small, forested headwater stream, with an approximate watershed area of 2.75 km^2. The watershed elevation ranges from approximately 2,930m (9,600ft) at the USFS monitoring station to 3,475m (11,400ft) at the highest point. There is a SNOTEL station located at 3,400m.

Fool Creek delineated watershed

Question 1: Using this map, what differences can you see between these two sites or what variation do you see in the watershed that may affect snow accumulation and melt? (2 points)?

ANSWER:

Let’s look at some data for Fool Creek:

This script collects SNOTEL input data using snotelr. The SNOTEL automated collection site in Fool Creek supplies SWE data that can simplify our runoff modeling. Lets explore the sites available through snotelr and select the site of interest.

3.2.2.2 Import data:

We will generate a conceptual runoff model using the date, daily mean temperature, SWE depth in mm, and daily precipitation (mm). We will use select() to keep the desired columns only. Then we will use mutate() to add a water year, and group_by() to add a cumulative precipitation column for each water year.

We will also download flow data collected by USFS at the Fool Creek outlet at 2,930m.

Let’s look at the data for the SNOTEL station at 3400m in elevation.

3.2.2.3 Calculate liquid input

to the ground by analyzing the daily changes in Snow Water Equivalent (SWE) and precipitation. This script incorporates temperature conditions to determine when to add changes to liquid input.

Question 2: In plain language, explain each step of the 5 if-else statements above (~5 brief sentences) (3pts).

ANSWER:

Let’s visualize the timing disparities between precipitation and melt input in relation to discharge.

Question 3. What processes could account for the differences between cumulative input, calculated from SWE data, and cumulative precipitation, both measured at the same SNOTEL station? (1 or 2 processes) What could explain the discrepancy if the cumulative input, is found to be less than the cumulative precipitation measured at the same SNOTEL station? (at least 2 possible processes) (4 points) (2-3 sentences)

ANSWER:

3.2.2.4 Modeling SWE

Let’s assume that we have a precipitation gage at 3400m in Fool Creek, but without SWE data, how could we model SWE using temperature?

The model below is a variation of the degree-day method. Here we are running a similar script to the input calculations above, using temperature as the primary determinant. When the temperature falls below a certain threshold and precipitation occurs, an equivalent amount is added to our SWE accumulation. We’ll conduct a simulation with 100 Monte Carlo runs and analyze the optimal outcome. The parameters we are testing are pack threshold, a degree-day factor, and the threshold temperature for determining rain from snow.

This simple model seems to simulate SWE well if assessed with the NSE. Let’s model SWE at the Fool Creek outlet, where we do not have daily SWE data. First we’ll bring in precipitation data, measured at a USFS maintained meteorological station located near the Fool outlet:

We also need temperature data. This is also collected at the USFS Lower Fool meteorological station.

Even though this meteorological station is very close to the SNOTEL station, lets compare the values between the upper watershed and the lower

Now let’s estimate SWE for the Lower Fool Creek. Again, we’ll just simulate a single year.

Run another set of simulations and compare the values of the best performing parameters across sites.

Since we don’t have SWE measurement for Lower Fool Creek, let’s see how the simulated values compare to acutal Upper Fool SNOTEL SWE measurements.

Question 3: Based on the differences in precipitation and temperature totals between the two sites, are SWE simulation results as you expected? (1 sentence)(2 pts)

ANSWER:

Question 4: How does the estimated threshold temperature distinguishing rain from snow change with elevation? Is the relationship between your parameter estimations as expected? How does this influence your confidence in the model? (2 pts)

ANSWER

Question 4: Compare the precipitation and snow accumulation patterns between two sites. Discuss at least 3 potential natural and/or anthropogenic mechanisms driving the observed variations in precipitation and snow accumulation. Consider the factors you observed in question 1 (3 points)(2-3 sentences).

ANSWER:

Question 5: Let’s say you want to model runoff for the Fool Creek catchment identified above. To do that, you want to estimate the timing and volume of snowmelt for this watershed. Refer to at least 2 model types reviewed in the 06_model_classifications lecture or terms in the lecture above and qualitatively describe a model that you could use to capture the spatial heterogeneity of snow accumulation in this watershed (4 points) (2-4 sentences).

ANSWER:

3.3 Evapotranspiration

3.3.1 Repo here

3.3.2 Learning Module

15pnts

3.3.2.1 Background

Evapotranspiration (ET) encompasses all processes through which water moves from the Earth’s surface to the atmosphere, comprising both evaporation and transpiration. This includes water vaporizing into the air from soil surfaces, the capillary fringe of groundwater, and water bodies on land. Much like snowmelt modeling, ET modeling and measurements are critical to many fields and could be a full course on its own. We will be focused on the basics of ET, modeling and data retrieval methods for water balance in hydrological modeling. Evapotranspiration is an important variable in hydrological models, as it accounts for much of the water loss in a system, outside of discharge. Transpiration, a significant component of ET, involves the movement of water from soil to atmosphere through plants. This occurs as plants absorb liquid water from the soil and release water vapor through their leaves. To gain a deeper understanding of ET, let’s review transpiration.

3.3.2.1.1 Transpiration

Plant root systems to absorb water and nutrients from the soil, which they then distribute to their stems and leaves. As part of this process, plants regulate the loss of water vapor into the atmosphere through stomatal apertures, or transpiration. However, the volume of water transpired can vary widely due to factors like weather conditions and plant traits.

Vegetation type: Plants transpire water at different rates. Some plants in arid regions have evolved mechanisms to conserve water by reducing transpiration. One mechanism involves regulating stomatal opening and closure. These plants can minimize water loss, especially during periods of high heat and low humidity. This closure of stomata can lead to diel and seasonal patterns in transpiration rates. Throughout the day, when environmental conditions are favorable for photosynthesis, stomata open to allow gas exchange, leading to increased transpiration. Conversely, during the night or under stressful conditions, stomata may close to conserve water, resulting in reduced transpiration rates.

Humidity: As the relative humidity of the air surrounding the plant rises the transpiration rate falls. It is easier for water to evaporate into dryer air than into more saturated air.

Soil type and saturation: Clay particles, being small, have a high capacity to retain water, while sand particles, being larger, readily release water. When there is a shortage of moisture, plants may enter a state of senescence and reduce their rate of transpiration.

Precipitation: During dry periods, transpiration can contribute to the loss of moisture in the upper soil zone.

Temperature: Transpiration rates go up as the temperature goes up, especially during the growing season, when the air is warmer due to stronger sunlight and warmer air masses. Higher temperatures cause the plant cells to open stomata, allowing for the exchange of CO2 and water with the atmosphere, whereas colder temperatures cause the openings to close.

The availability and intensity of sunlight have a direct impact on transpiration rates. Likewise, the aspect of a location can influence transpiration since sunlight availability often depends on it.

Wind & air movement: Increased movement of the air around a plant will result in a higher transpiration rate. Wind will move the air around, with the result that the more saturated air close to the leaf is replaced by drier air.

3.3.2.2 Measurements

In the realm of evapotranspiration (ET) modeling and data analysis, you’ll frequently encounter the terms potential ET and actual ET. These terms are important to consider when selecting data, as they offer very different insights into water loss processes from the land surface to the atmosphere.

Potential Evapotranspiration (PET): Potential ET refers to the maximum possible rate at which water could evaporate and transpire under ideal conditions. These conditions typically assume an ample supply of water, unrestricted soil moisture availability, and sufficient energy to drive the evaporative processes. PET is often estimated based on meteorological variables such as temperature, humidity, wind speed, and solar radiation using empirical equations like the Penman-Monteith equation.

Actual Evapotranspiration (AET): Actual ET, on the other hand, represents the observed or estimated rate at which water is actually evaporating and transpiring from the land surface under existing environmental conditions. Unlike PET, AET accounts for factors such as soil moisture availability, vegetation cover, stomatal conductance, and atmospheric demand. It reflects the true water loss from the ecosystem and is of greater interest in hydrological modeling, as it provides a more realistic depiction of water balance dynamics.

The formula for converting PET to AET is:

AET = PET * Kc

Where:

AET is the actual evapotranspiration, PET is the potential evapotranspiration, and Kc is the crop coefficient. The crop coefficient accounts for factors such as crop type, soil moisture levels, climate conditions, and management practices. It can vary throughout the growing season as well.

3.3.2.2.1 Direct measurements:

There are several methods to measure ET directly like lysimeters and gravimetric analysis, but this data rarely available to the public. There has been a concerted effort to enhance the accessibility of Eddy Covariance data, so this dataset may expand in the years to come.

knitr::include_url("https://www.youtube.com/embed/CR4Anc8Mkas")

This video focuses on CO2 as an output of eddy covariance data, but among the ‘other gases’ mentioned, water vapor is included, offering measurements of actual ET. The video also provides a resource where you might find eddy covariance data for your region of interest.

3.3.2.2.2 Remote sensing:

Remote sensing of evapotranspiration (ET) involves the use of satellite or airborne sensors to observe and quantify the exchange of water vapor between the Earth’s surface and the atmosphere over large spatial scales. This approach offers several advantages, including the ability to monitor ET across diverse landscapes, regardless of accessibility, and to capture variations in ET over time with high temporal resolution.

Remote sensing data, coupled with energy balance models, can be used to estimate ET by quantifying the energy fluxes at the land surface. These models balance incoming solar radiation with outgoing energy fluxes, including sensible heat flux and latent heat flux (representing ET). Remote sensing-based ET estimates are often validated and calibrated using ground-based measurements, such as eddy covariance towers or lysimeters, to ensure accuracy and reliability. It can be helpful to validate these models yourself if you have a data source available in your ecoregion as a ‘sanity check’. Keep in mind that there are numerous models available, some of which may be tailored for specific ecoregions, resulting in significant variations in estimated evapotranspiration (ET) for your area among these models. If directly measured ET data is not available, you can check model output in a simple water balance. For example, inputs - outputs for your watershed (Ppt - Q - ET) should be approximately 0 (recall from our transfer function module that it is likely not exact). If the ET estimate matches precipitation, it’s likely that the selected model is overestimating ET for your region.

Some resources for finding ET modeled from remote sensing data:

ClimateEngine.org - This is a fun resource for all kinds of data. Actual evapotranspiration can be found in the TerraClimate dataset.

OpenET - you need a Google account for this one. This site is great if you need timeseries data. You can select ‘gridded data’ and draw a polygon in your area of interest. You can select the year of interest at the top of the map, and once the timeseries generates, you can view and compare the output of seven different models.

3.3.2.3 Modeling:

3.3.3 Labwork (20 pts):

For this assignment, we will work again in the Fraser Experimental Forest (same site as snowmelt module). Not only are there meteorological stations in our watershed of interest, but there is a Eddy Covariance tower nearby, in a forest with the same tree community as our watershed of interest. We can use this data to verify our model output for modeled data.

3.3.3.1 The Evapotranspiration package

We will test a couple of simple methods that require few data inputs. However, it may be helpful to know that this package will allow calculations of PET, AET and Reference Crop Evapotranspiration from many different equations. Your selection of models may depend on the study region and the data available.

3.3.3.2 Import libraries

3.3.3.3 Import data

3.3.3.3.1 Meteorological data

We’ll import three datasets in this workflow, one containing actual data from the eddy covariance tower for the years 2017-2018.
Also imported is the discharge (Q) data from our watershed. Note that data is not collected during periods of deep snow. We will use this data to estimate the total discharge for 2022.
Additionally, we’ll import the meteorological data from our watershed for the years of interest. We are going to estimate ET for our watershed during the 2022 water year, though met data is provided in calendar years, so we will need to extract the desired data from the met dataframe.

3.3.3.4 Data formatting

We will model ET at daily timesteps. Note that all imported data is for different timesteps, units and measurements than what we need for analysis. It is common practice to manipulate the imported data and adjust it to align with our model functions. Ensuring the correct structure of the data can prevent potential issues later in the code

Now we have the RHmax and RHmin values that we will need for our ET models

3.3.3.4.1 Precipitation

To add a layer of checks we will also check precipitation totals for the watershed from our met data. This dataframe also provides cumulative precipitation, though the accumulation begins on January 1. Therefore we will need to calculate daily precipitation for each day of the water year and generate a new cumulative value.

This was a lot of work to save one value.

3.3.3.5 Pseudocode

Often when writing code, it is necessary to first write ‘pseudocode’. This allows us to think about workflow before writing scripts involves planning and structuring the logic of a script. Psuedocode is usually mixture of natural language and code-like constructs (e.g., ‘I’ll use ’merge’ or ‘ggplot’ to…’) that serves as a blueprint for our script before we dive into specific programming language syntax. Then we can develop our code by filling in the programming language as needed.

Q1. (2pnts) In you own words, what did we do at each step in the above chunk and what is cumulative_ppt_2022. Why do we want to know this?

ANSWER:

3.3.3.5.1 Discharge data

Our next step will be to import and format discharge data collected from the weir at Fool Creek. Again, data collection starts on April 20th this year. To estimate discharge between Sept 31 of the previous year and April 20th of 2022, we will assume daily discharge between these dates is the mean of the discharge values from each of these dates.

Q2 (3pnts) Write pseudocode in a stepwise fashion to describe the workflow that will need to happen to estimate the total volume of water in mm/watershed area for 2022 lost to stream discharge. Consider that the units provided by USFS are in cubic feet/second (cfs).

ANSWER: 1. Import dataframe 2. …

3.3.3.5.2 Evapotranspiration data

Import and transform ET data.
Note that ET data is in mmol/m2/s. We want to convert this to mm/day.
Eddy covariance data collected from towers represents the exchange of gases, including water vapor, between the atmosphere and the land surface within a certain area known as the “footprint.” This footprint area is not fixed; it changes in size and shape depending on factors like wind direction, thermal stability, and measurement height. Typically, it has a gradual border, meaning that the influence of the tower measurements extends beyond its immediate vicinity.

For our specific tower data, we’ve estimated the mean flux area, or footprint area, to be approximately 0.4 square kilometers. However, when estimating the total evapotranspiration (ET) for our entire watershed, we need to extrapolate the ET measured by the tower to cover the entire watershed area. This extrapolation involves scaling up the measurements from the tower footprint to the larger area of the watershed to get a more comprehensive understanding of water vapor exchange over the entire region.

This appears to be a fairly well balanced water budget, especially consdering that we have made some estimates along the way. Let’s see how this ET data compares to modeled data.

3.3.3.6 The Priestly Taylor method (actual evapotranspiration)

The Priestley-Taylor method is a semi-empirical model that estimates potential evapotranspiration as a function of net radiation. This method requires a list which contains:
(climate variables) required by Priestley-Taylor formulation: Tmax, Tmin (degree Celcius), RHmax, RHmin (percent), Rs (Megajoules per sqm) or n (hour).
We have measurements of relative humidity (RH) data from within our watershed meteorological station. However, RH data can be found through many sources providing local weather data.
n refers to the number of sunshine hours.

Evapotranspiration comes with a list physical ‘constants’, but we want to replace important constants with data specific to our cite.

Recall that each of these are estimates of potential ET. Crop coefficients (Kc values) for specific tree species like those found in the Fraser Experimental Forest (lodgepole pine or Englemann spruce) may not be as readily available as they are for agricultural crops. However, you may be able to find some guidance in scientific literature or forestry publications. From what we have found, Kc estimates for lodgepole pine forests can be between 0.4 and 0.8. These values may vary depending on factors such as climate, soil type, elevation, and other site-specific factors. Our own estimates using water balance data from dates that correspond with the eddy flux tower data suggest seasonal fluctuations, with a mean of 0.55.

AET = PET * Kc

Let’s plot the two modeled ET timeseries with the eddy covaraince tower data.

Q3. (4pnts) Consider the two evapotranspiration (ET) models we have generated, one consistently underestimates ET, while the other consistently overestimates summer values and underestimates winter values. Consider cosystem specificity in modeling. Why do you think these two methods generate such different estimates? What ecosystem-specific factors might contribute to the discrepancies between models?

ANSWER:

Q4 (3pnts) Let’s assume we do not have eddy covariance tower data for this time period, but you have the provided discharge and precipitation measurements. Which model would you choose to estimate an annual ET sum and why?

ANSWER:

Q5 (3pnts) In our modeling script, while our main objective was to estimate evapotranspiration (ET), which specific aspect required the most labor and attention to detail? Reflect on the tasks involved in data preparation, cleaning, and formatting. How did these preliminary steps impact the overall modeling process?