As required by the NASS Terms of Use: This product uses the NASS API but is not endorsed or certified by NASS.
rnassqs allows users to access the USDA’s National
Agricultural Statistics Service (NASS) Quick Stats data through their
API. It is simple and easy to use, and provides some functions to help
navigate the bewildering complexity of some Quick Stats data.
For docs and code examples, visit the package web page here: https://docs.ropensci.org/rnassqs/.
Install the package via devtools or CRAN:
# Via devtools
library(devtools)
install_github('ropensci/rnassqs')
# Via CRAN
install.packages("rnassqs")
To use the NASS Quick Stats API you need an API key. In general, the API
key should not be included in scripts. One way of making the key
available without defining it in a script is to set it in your .Renviron
file, which is usually located in your home directory. If you are an
RStudio user, you can use usethis::edit_r_environ() to open your
.Renviron file and add a line that looks like:
NASSQS_TOKEN="<your api key here>"
Alternatively, you can set it explicitly in the console with
nassqs_auth(key = <your api key>). This will set the environment
variable NASSQS_TOKEN, which is used to access the API. You can also set
this directly with Sys.setenv("NASSQS_TOKEN" = <your api key>).
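Because the key lives in an environment variable, a script can check for it before making any request. The following is a minimal sketch in base R; get_nassqs_token() is a hypothetical helper written for illustration and is not part of rnassqs.

```r
# Hypothetical helper (not part of rnassqs): read the key from the
# NASSQS_TOKEN environment variable and fail early if it is missing.
get_nassqs_token <- function() {
  key <- Sys.getenv("NASSQS_TOKEN", unset = "")
  if (!nzchar(key)) {
    stop("NASSQS_TOKEN is not set; add it to .Renviron or call nassqs_auth().",
         call. = FALSE)
  }
  key
}
```

Calling get_nassqs_token() at the top of a script surfaces a missing key immediately rather than at the first API request.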
See the examples in inst/examples for quick recipes to download data.
The primary function is nassqs(), with which you can query any
combination of variables. For example, to mirror the request shown in
the NASS API documentation, you can use:
library(rnassqs)
# You must set your api key before requesting data
nassqs_auth(key = <your api key>)
# Parameters to query on and data call
params <- list(commodity_desc = "CORN", year__GE = 2012, state_alpha = "VA")
d <- nassqs(params)
Parameters do not need to be capitalized, and also do not need to be in a list format. The following works just as well:
d <- nassqs(commodity_desc = "corn", year__GE = 2012, state_alpha = "va")
You can request data for multiple values of the same parameter by passing a vector of values for that parameter, as follows:
params <- list(commodity_desc = "CORN", year__GE = 2012, state_alpha = c("VA", "WA"))
d <- nassqs(params)
NASS does not allow GET requests that pull more than 50,000 records in one request. The function will inform you if you try to do that. It will also inform you if you’ve requested a set of parameters for which there are no records.
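One way to stay under the 50,000-record limit is to split a large query into one request per year and combine the results. The sketch below assumes that params contains no year-related entries and that every request returns a data frame with the same columns; fetch_by_year() and its fetch argument are hypothetical names used for illustration, not part of rnassqs.

```r
# Hypothetical helper (not part of rnassqs): run one request per year
# and row-bind the results. `fetch` defaults to rnassqs::nassqs but can
# be replaced by any function that takes a parameter list and returns a
# data frame.
fetch_by_year <- function(params, years, fetch = rnassqs::nassqs) {
  chunks <- lapply(years, function(y) {
    p <- params
    p[["year"]] <- y  # restrict this request to a single year
    fetch(p)
  })
  do.call(rbind, chunks)
}

# Usage (requires an API key and network access):
# d <- fetch_by_year(list(commodity_desc = "CORN", state_alpha = "VA"),
#                    years = 2012:2020)
```

If a single year still exceeds the limit, the same idea can be applied to another parameter such as state_alpha.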
Other useful functions include:
# returns a set of unique values for the parameter "STATISTICCAT_DESC"
nassqs_param_values("statisticcat_desc")
# returns a count of the number of records for a given query
nassqs_record_count(params=params)
# Get yields specifically
# Equivalent to including "'statisticcat_desc' = 'YIELD'" in your parameter list.
nassqs_yields(params)
# Get acres specifically
# Equivalent to including all "AREA" values in statisticcat_desc
nassqs_acres(params)
# Specifies just "AREA HARVESTED" values of statisticcat_desc
nassqs_acres(params, area = "AREA HARVESTED")
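The NASS API handles operators other than equality by appending a suffix to the variable name. The API accepts the following suffixes: __LE (<=), __LT (<), __GT (>), __GE (>=), __LIKE (like), __NOT_LIKE (not like), and __NE (not equal).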
For example, to request corn yields in Virginia and Pennsylvania for all years since 2000, you would use something like:
params <- list(commodity_desc = "CORN",
               year__GE = 2000,
               state_alpha = c("VA", "PA"),
               statisticcat_desc = "YIELD")
df <- nassqs(params) # returns data as a data frame
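Since operators are encoded as suffixes on the parameter name, parameter names can also be built programmatically. The following is a small sketch, assuming the suffixes listed in the NASS API documentation (__LE, __LT, __GT, __GE, __LIKE, __NOT_LIKE, __NE); nass_ops and with_op() are hypothetical names for illustration and are not part of rnassqs.

```r
# Hypothetical lookup table mapping operators to NASS API name suffixes.
nass_ops <- c("<=" = "__LE", "<" = "__LT", ">" = "__GT", ">=" = "__GE",
              "like" = "__LIKE", "not like" = "__NOT_LIKE", "!=" = "__NE")

# Hypothetical helper: append the suffix for `op` to a parameter name.
with_op <- function(name, op) {
  stopifnot(op %in% names(nass_ops))
  paste0(name, nass_ops[[op]])
}

# Build a query for corn in Virginia between 2000 and 2010 (inclusive).
params <- list(commodity_desc = "CORN", state_alpha = "VA")
params[[with_op("year", ">=")]] <- 2000
params[[with_op("year", "<=")]] <- 2010
```

The resulting list can be passed to nassqs(params) in the same way as a hand-written one.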
See the vignette for more examples and details on usage.
Contributions are more than welcome, and there are several ways to contribute:
If you use rnassqs to query data from ‘Quick Stats’ and would like to
contribute your query, consider submitting a pull request adding your
query as a file in inst/examples/.
rnassqs uses roxygen2, which means the documentation is at the top of
each function definition. Please submit any improvements as a pull
request.
rnassqs follows the style outlined in Hadley Wickham’s R Packages.
Following this style makes the pull request and review go more smoothly.
In June 2019 the usdarnass package was released on CRAN and is also
available to install via GitHub. usdarnass has similar functionality to
this package.
NASS also provides a daily tarred and gzipped file of their entire dataset. At the time of writing it is approaching 1 GB. You can download that file via their data site.
The FTP link also contains separate files for the NASS census (conducted every 5 years, in years ending in 2 and 7) and for individual sectors (CROPS, ECONOMICS, ANIMALS & PRODUCTS). At the time of this writing, specific files for the ENVIRONMENTAL and DEMOGRAPHICS sectors are not available.
Thank you to rOpenSci reviewers Adam Sparks and Neal Richardson and
editor Lincoln Mullen for their fantastic feedback and assistance. User
feedback and use case contributions have been a huge help in making
rnassqs more accessible and user-friendly. More use cases or feature
requests are always welcome!