This document introduces the dbhydroR
package and its associated functions. These functions are aimed at improving programmatic workflows that query the DBHYDRO Environmental Database which holds over 35 million hydrologic and water quality records from the Florida Everglades and surrounding areas.
Computers running the Windows operating system can only install binary package archive files unless they have additional compiler software installed. Without this software, dbhydroR
can be installed from CRAN by running the following command in the R
console:
install.packages("dbhydroR")
Otherwise, the dbhydroR
can be installed by running the following command in the R
console:
::install_github("ropensci/dbhydroR") devtools
Once installed, the package can be loaded using the following command:
library(dbhydroR)
Water quality data can be retrieved using the get_wq
function which takes four required arguments. The user must specify a station ID, a test name, and a date range. Station IDs can be located on the SFWMD Station Maps. An abbreviated list of available test names can be found in the appendix to this document while a full listing can be found on the DBHYDRO metadata page. Dates must be specified in YYYY-MM-DD format (e.g. 2015-02-26). The following set of examples retrieve measurements between March 2011 and May 2012. They can be run from the R console by issuing the command:
example(get_wq)
get_wq(station_id = "FLAB08", date_min = "2011-03-01",
date_max = "2012-05-01", test_name = "CHLOROPHYLL-A, SALINE")
get_wq(station_id = c("FLAB08","FLAB09"), date_min = "2011-03-01",
date_max = "2012-05-01", test_name = "CHLOROPHYLL-A, SALINE")
get_wq(station_id = c("FLAB0%"), date_min = "2011-03-01",
date_max = "2012-05-01", test_name = "CHLOROPHYLL-A, SALINE")
get_wq(station_id = c("FLAB08","FLAB09"), date_min = "2011-03-01",
date_max = "2012-05-01", test_name = c("CHLOROPHYLL-A, SALINE",
"SALINITY"))
By default, get_wq
returns a cleaned output. First, the cleaning function converts the raw output from native DBHYDRO long format (each piece of data on its own row) to wide format (each site x variable combination in its own column) using the reshape2 package (Wickham 2007). Next, the extra columns associated with QA flags, LIMS, and District receiving are removed. Finally, row entries associated with QA field blanks, which are used to check on potential sources of contamination, are removed. Setting the raw
flag to TRUE
will force to retain information on QA field blanks as well as the other QA fields. An example query that retains this information and the original long formatting is shown below.
<- get_wq(station_id = "FLAB08", date_min = "2011-03-01",
raw_wq date_max = "2011-05-01", test_name = "CHLOROPHYLL-A, SALINE",
raw = TRUE)
This raw data can then be cleaned using the function:
clean_wq(raw_wq)
Hydrologic time series data can be retrieved using the get_hydro
function. The first task to accomplish prior to running get_hydro
is to identify one or more dbkeys which correspond to unique site x variable time-series. This can be done before-hand using the get_dbkey
function, the SFWMD Station Maps or the DBHYDRO Browser. One useful strategy for finding desired dbkeys is to run the get_dbkey
function interactively using progressively narrower search terms. For example, suppose we are interested in daily average wind data at Joe Bay but we have no alphanumeric dbkey
. Initially we could run get_dbkey
with the detail.level
set to “summary.”
get_dbkey(stationid = "JBTS", category = "WEATHER", param = "WNDS",
detail.level = "summary")
Our search returns two results but only one of them has a daily average (DA) measurement frequency. We can verify the remaining attributes of our likely dbkey by setting the freq
parameter to “DA” and the detail.level
parameter to “full.”
get_dbkey(stationid = "JBTS", category = "WEATHER", param = "WNDS",
freq = "DA", detail.level = "full")
This exact dbkey can only be returned reliably by specifying all of the get_dbkey
parameters applicable to the “WEATHER” category.
get_dbkey(stationid = "JBTS", category = "WEATHER", param = "WNDS",
freq = "DA", stat = "MEAN", recorder = "CR10", agency = "WMD",
detail.level = "dbkey")
Now that we have our dbkey in hand, we can use is as input to get_hydro
. In addition to a dbkey, we must specify a date range. Dates must be entered in YYYY-MM-DD format (e.g. 2015-02-26).
get_hydro(dbkey = "15081",
date_min = "2013-01-01", date_max = "2013-02-02")
Alternatively, we can specify a set of arguments in our call to get_hydro
that will be passed to get_dbkey
on-the-fly. Use caution when using this strategy as complex stationid/category/parameter combinations can easily cause errors or return unexpected results. It is good practice to pre-screen your parameter values using get_dbkey
.
get_hydro(date_min = "2013-01-01", date_max = "2013-02-02",
stationid = "JBTS", category = "WEATHER", param = "WNDS",
freq = "DA", stat = "MEAN", recorder = "CR10", agency = "WMD")
The contents of multiple data streams can be returned by specifying multiple dbkeys or entering on-the-fly get_dbkey
queries that return multiple dbkeys.
get_hydro(dbkey = c("15081", "15069"), date_min = "2013-01-01",
date_max = "2013-02-02")
get_hydro(date_min = "2013-01-01", date_max = "2013-02-02",
category = "WEATHER", stationid = c("JBTS", "MBTS"),
param = "WNDS", freq = "DA", stat = "MEAN")
More get_hydro
examples including queries of other category
values (“SW,” “GW,” and “WQ”) can be viewed by issuing the following commands from the R
console:
example(get_dbkey)
example(get_hydro)
By default, get_hydro
returns a cleaned output. First, the cleaning function clean_hydro
converts the raw output from native DBHYDRO long format (each piece of data on its own row) to wide format (each site x variable combination in its own column) using the reshape2 package (Wickham 2007). Next, some extra columns are removed that are associated with measurement location (longitude/latitude), frequency, and QA flags are removed. Setting the raw
flag to TRUE
will force get_hydro
to retain the original formatting and metadata fields. An example query that retains this information and the original long formatting is shown below.
<- get_hydro(date_min = "2013-01-01", date_max = "2013-02-02",
raw_data stationid = "JBTS", category = "WEATHER", param = "WNDS",
freq = "DA", stat = "MEAN", recorder = "CR10", agency = "WMD", raw = TRUE)
clean_hydro(raw_data)
There are many test names available in DBHYDRO. A subset of these are detailed in the following table.
Code |
---|
AMMONIA-N |
CARBON, TOTAL ORGANIC |
CHLOROPHYLL-A(LC) |
CHLOROPHYLL-B(LC) |
CHLOROPHYLL-A, SALINE |
DISSOLVED OXYGEN |
KJELDAHL NITROGEN,TOTAL |
NITRATE+NITRITE-N |
NITRITE-N |
PHEOPHYTIN-A(LC) |
PHOSPHATE,ORTHO AS P |
PHOSPHATE,TOTAL AS P |
SALINITY |
SILICA |
SP CONDUCTIVITY, FIELD |
TEMP |
TOTAL NITROGEN |
TURBIDITY |
See section on URL-based data access in the DBHYDRO Browser User’s Guide