The goal of rfars is to facilitate transportation safety analysis by simplifying the process of extracting data from official crash databases. The National Highway Traffic Safety Administration (NHTSA) collects and publishes a census of fatal crashes in the Fatality Analysis Reporting System (FARS) and a sample of fatal and non-fatal crashes in the Crash Report Sampling System (CRSS), an evolution of the General Estimates System (GES). The Fatality and Injury Reporting System Tool (FIRST) allows users to query these databases and can produce simple tables and graphs. This suffices for simple analysis but often leaves researchers wanting more. Digging any deeper, however, involves a time-consuming process of downloading annual ZIP files and attempting to stitch them together, after first combing through immense data dictionaries to determine the required variables and table names.
rfars allows users to download the last 10 years of FARS and GES/CRSS data with just one line of code. The result is a full, rich dataset ready for mapping, modeling, and other downstream analysis. Codebooks with variable definitions and value labels support an informed analysis of the data (see vignette("Searchable Codebooks", package = "rfars") for more information). Helper functions are also provided to produce common counts and comparisons.
Installation
You can install the latest version of rfars from GitHub with:
# install.packages("devtools")
devtools::install_github("s87jackson/rfars")
or the CRAN stable release with:
install.packages("rfars")
Then load rfars and some helpful packages:
Getting and Using Data
get_fars() and get_gescrss() are the primary functions of the rfars package. They can obtain data in three ways: by downloading and processing data files directly from NHTSA’s FTP Site, by loading prepared data stored on your local machine, or (as of Version 2.0) by pulling prepared data from Zenodo. The data files hosted on Zenodo are stable, have DOIs, and replicate the data that get_fars() and get_gescrss() would produce, but load in a fraction of the time.
Both functions take the parameters years and states (FARS) or regions (GES/CRSS). As the source data files follow an annual structure, years determines how many file sets are downloaded or loaded, and states/regions filters the resulting dataset. Downloading and processing these files can take several minutes. Before downloading, rfars will inform you that it’s about to download files and ask your permission to do so. To skip this dialog, set proceed = TRUE. You can use the dir and cache parameters to save an RDS file to your local machine. The dir parameter specifies the directory, and cache names the file (be sure to include the .rds file extension).
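As a rough sketch of how these parameters fit together, the block below assembles one set of arguments and only calls get_fars() when a flag is switched on. The parameter names come from the text above; the specific values (the year range, the "Virginia" filter, the file name) and the idea that the saved file sits at dir joined with cache are assumptions for illustration.

```r
# Arguments named in the text above; the values are illustrative assumptions.
fars_args <- list(
  years   = 2021:2022,     # two annual file sets
  states  = "Virginia",    # assumed format for the states filter
  proceed = TRUE,          # skip the permission prompt
  dir     = tempdir(),     # directory for the saved RDS file
  cache   = "fars_va.rds"  # file name, including the .rds extension
)

# Where the saved file would be expected (assumption: dir and cache are joined).
cache_path <- file.path(fars_args$dir, fars_args$cache)

# Flip to TRUE to actually download; left FALSE so the sketch runs offline.
run_download <- FALSE
if (run_download && requireNamespace("rfars", quietly = TRUE)) {
  myFARS <- do.call(rfars::get_fars, fars_args)
}
```

Keeping the arguments in a list makes it easy to reuse the same years and cache settings across sessions.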
Executing the code below will download the prepared FARS and GES/CRSS databases for 2014-2023.
myFARS <- get_fars(proceed = TRUE)
myCRSS <- get_gescrss(proceed = TRUE)
get_fars() and get_gescrss() return a list with six dataframes: flat, multi_acc, multi_veh, multi_per, events, and codebook.
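To show how such a list is navigated, here is a small sketch that uses an empty stand-in object rather than real data. The object mock_fars is hypothetical; only the six element names are taken from the text.

```r
# Stand-in for the returned object: a named list of six data frames.
# The empty data frames are placeholders, not real FARS content.
element_names <- c("flat", "multi_acc", "multi_veh", "multi_per", "events", "codebook")
mock_fars <- setNames(replicate(6, data.frame(), simplify = FALSE), element_names)

names(mock_fars)                 # the six element names
flat <- mock_fars$flat           # `$` extracts one data frame
events <- mock_fars[["events"]]  # `[[` does the same by name

# Row counts per element (all zero for these placeholders).
row_counts <- vapply(mock_fars, nrow, integer(1))
row_counts
```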
The tables below show records for randomly selected crashes to illustrate the content and structure of the data. The tables are transposed for readability.
Each row in the flat dataframe corresponds to a person involved in a crash. As there may be multiple people and/or vehicles involved in one crash, some variable values are repeated within a crash or vehicle. Each crash is uniquely identified with id, which is a combination of year and st_case. Note that st_case is not unique across years; for example, st_case 510001 will appear in each year. The id variable is designed to avoid this issue. The GES/CRSS data includes a weight variable that indicates how many crashes each row represents.
year | 2014 | 2014 | 2014 | 2014 |
state | Illinois | Illinois | Maryland | Maryland |
st_case | 170423 | 170423 | 240191 | 240191 |
id | 2014170423 | 2014170423 | 2014240191 | 2014240191 |
veh_no | 0 | 1 | 0 | 1 |
per_no | 1 | 1 | 1 | 1 |
county | 179 | 179 | 43 | 43 |
city | 2600 | 2600 | 0 | 0 |
lon | -89.56736 | -89.56736 | -77.75684 | -77.75684 |
lat | 40.66485 | 40.66485 | 39.65347 | 39.65347 |
acc_type | NA | Pedestrian/ Animal | NA | Pedestrian/ Animal |
age | 37 Years | 64 Years | 32 Years | 23 Years |
air_bag | Not a Motor Vehicle Occupant | Not Deployed | Not a Motor Vehicle Occupant | Not Deployed |
alc_det | Evidential Test (breath, blood, urine) | Not Reported | Not Reported | Not Reported |
alc_res | 0.16 % BAC | AC Test Performed,Results Unknown | Not Reported | Test Not Given |
alc_status | Test Given | Test Given | Not Reported | Test Not Given |
arr_hour | Unknown EMS Scene Arrival Hour | Unknown EMS Scene Arrival Hour | Unknown EMS Scene Arrival Hour | Unknown EMS Scene Arrival Hour |
arr_min | Unknown EMS Scene Arrival Minutes | Unknown EMS Scene Arrival Minutes | Unknown EMS Scene Arrival Minutes | Unknown EMS Scene Arrival Minutes |
atst_typ | Blood | Blood | Not Reported | Test Not Given |
bikecgp | Bicyclist Failed to Yield - Sign-Controlled Intersection | NA | Not a Cyclist | NA |
bikectype | Bicyclist Ride Out - Sign-Controlled Intersection | NA | Not a Cyclist | NA |
bikedir | With Traffic | NA | Not a Cyclist | NA |
bikeloc | Intersection-Related | NA | Not a Cyclist | NA |
bikepos | On a roadway, in a shared travel lane | NA | Not a Cyclist | NA |
body_typ | NA | Other Bus Type | NA | 4-door sedan, hardtop |
bus_use | NA | Charter/Tour | NA | Not a Bus |
cargo_bt | NA | Bus | NA | Not Applicable (N/A) |
cdl_stat | NA | Valid | NA | No (CDL) |
cert_no | ************ | ************ | ************ | ************ |
day | 22 | 22 | 31 | 31 |
day_week | Sunday | Sunday | Thursday | Thursday |
death_da | 23 | Not Applicable (Non-Fatal) | 31 | Not Applicable (Non-Fatal) |
death_hr | 10:00-10:59 | Not Applicable (Non-fatal) | 2:00-2:59 | Not Applicable (Non-fatal) |
death_mn | 30 | Not Applicable (Non-fatal) | 27 | Not Applicable (Non-fatal) |
death_mo | June | Not Applicable (Non-Fatal) | July | Not Applicable (Non-Fatal) |
death_tm | 1030 | 8888 | 227 | 8888 |
death_yr | 2014 | Not Applicable (Non-fatal) | 2014 | Not Applicable (Non-fatal) |
deaths | NA | 0 | NA | 0 |
deformed | NA | Minor Damage | NA | Disabling Damage |
doa | Not Applicable | Not Applicable | Not Applicable | Not Applicable |
dr_drink | NA | No | NA | No |
dr_hgt | NA | 74 | NA | 72 |
dr_pres | NA | Yes | NA | Yes |
dr_wgt | NA | 225 lbs. | NA | Unknown |
dr_zip | NA | NA | NA | NA |
drinking | Yes (Alcohol Involved) | Not Reported | No (Alcohol Not Involved) | No (Alcohol Not Involved) |
drug_det | Evidential Test (Blood, Urine) | Not Reported | Not Reported | Other |
drugs | Yes (drugs involved) | Not Reported | No (drugs not involved) | No (drugs not involved) |
drunk_dr | 0 | 0 | 0 | 0 |
dstatus | Test Given | Test Given | Not Reported | Test Not Given |
ej_path | Not Ejected/Not Applicable | Not Ejected/Not Applicable | Not Ejected/Not Applicable | Not Ejected/Not Applicable |
ejection | Not Applicable | Not Ejected | Not Applicable | Not Ejected |
emer_use | NA | Not Applicable | NA | Not Applicable |
extricat | Not Extricated or Not Applicable | Not Extricated or Not Applicable | Not Extricated or Not Applicable | Not Extricated or Not Applicable |
fatals | 1 | 1 | 1 | 1 |
fire_exp | NA | No or Not Reported | NA | No or Not Reported |
first_mo | NA | No Record | NA | No Record |
first_yr | NA | No Record | NA | No Record |
gvwr | NA | 26,001 lbs. or more | NA | Not Applicable |
harm_ev | Pedalcyclist | Pedalcyclist | Pedestrian | Pedestrian |
haz_cno | NA | Not Applicable | NA | Not Applicable |
haz_id | NA | Not Applicable | NA | Not Applicable |
haz_inv | NA | No | NA | No |
haz_plac | NA | Not Applicable | NA | Not Applicable |
haz_rel | NA | Not Applicable | NA | Not Applicable |
hispanic | Non-Hispanic | Not A Fatality (not Applicable) | Non-Hispanic | Not A Fatality (not Applicable) |
hit_run | NA | No | NA | No |
hosp_hr | Unknown | Unknown | Unknown | Unknown |
hosp_mn | Unknown EMS Hospital Arrival Time | Unknown EMS Hospital Arrival Time | Unknown EMS Hospital Arrival Time | Unknown EMS Hospital Arrival Time |
hospital | EMS Ground | Not Transported | EMS Air | Not Transported |
hour | 0:00am-0:59am | 0:00am-0:59am | 0:00am-0:59am | 0:00am-0:59am |
impact1 | NA | 12 Clock Point | NA | 12 Clock Point |
inj_sev | Fatal Injury (K) | No Apparent Injury (O) | Fatal Injury (K) | No Apparent Injury (O) |
j_knife | NA | Not an Articulated Vehicle | NA | Not an Articulated Vehicle |
l_compl | NA | Valid license for this class vehicle | NA | Valid license for this class vehicle |
l_endors | NA | Endorsement(s) Required, Compliance Unknown | NA | No Endorsements required for this vehicle |
l_restri | NA | Restrictions, Compliance Unknown | NA | No Restrictions or Not Applicable |
l_state | NA | Illinois | NA | Pennsylvania |
l_status | NA | Valid | NA | Valid |
l_type | NA | Full Driver License | NA | Full Driver License |
lag_hrs | 34 | 999 | 2 | 999 |
lag_mins | 25 | 99 | 3 | 99 |
last_mo | NA | No Record | NA | No Record |
last_yr | NA | No Record | NA | No Record |
lgt_cond | Dark - Lighted | Dark - Lighted | Dark - Not Lighted | Dark - Not Lighted |
location | Not at Intersection - On Roadway, Not in Marked Crosswalk | Occupant of a Motor Vehicle | Not at Intersection - On Roadway, Not in Marked Crosswalk | Occupant of a Motor Vehicle |
m_harm | NA | Pedalcyclist | NA | Pedestrian |
mak_mod | NA | Other Make Bus***: Conventional (Engine out front) | NA | Pontiac G6 |
make | NA | Other Make | NA | Pontiac |
man_coll | Not a Collision with Motor Vehicle In-Transport | Not a Collision with Motor Vehicle In-Transport | Not a Collision with Motor Vehicle In-Transport | Not a Collision with Motor Vehicle In-Transport |
mcarr_i1 | NA | US DOT | NA | Not Applicable |
mcarr_i2 | NA | NA | NA | Not Applicable |
mcarr_id | NA | NA | NA | Not Applicable |
milept | None | None | NA | NA |
minute | 5 | 5 | 24 | 24 |
mod_year | NA | NA | NA | NA |
model | NA | 981 | NA | 22 |
month | June | June | July | July |
motdir | Not a Pedestrian | NA | Not Applicable | NA |
motman | Not a Pedestrian | NA | Not Applicable | NA |
msafeqmt | None Used | NA | None Used | NA |
nhs | This section IS NOT on the NHS | This section IS NOT on the NHS | This section IS NOT on the NHS | This section IS NOT on the NHS |
not_hour | 0:00am-0:59am | 0:00am-0:59am | Unknown | Unknown |
not_min | 5 | 5 | Unknown | Unknown |
numoccs | NA | 01 | NA | 01 |
owner | NA | Vehicle Registered as Business/Company/Government Vehicle | NA | Driver (in this crash) was Registered Owner |
p_crash1 | NA | Going Straight | NA | Going Straight |
p_crash2 | NA | Pedalcyclist or other non-motorist in road | NA | Pedestrian in road |
p_crash3 | NA | No Avoidance Maneuver | NA | Steering left |
pbcwalk | Yes | NA | None Noted | NA |
pbswalk | Yes | NA | None Noted | NA |
pbszone | None Noted | NA | None Noted | NA |
pcrash4 | NA | Tracking | NA | Tracking |
pcrash5 | NA | Stayed in original travel lane | NA | Stayed in original travel lane |
pedcgp | Not a Pedestrian | NA | Crossing Roadway - Vehicle Not Turning | NA |
pedctype | Not a Pedestrian | NA | Pedestrian Failed to Yield | NA |
peddir | Not a Pedestrian | NA | Not Applicable | NA |
pedleg | Not a Pedestrian | NA | Not Applicable | NA |
pedloc | Not a Pedestrian | NA | Not At Intersection | NA |
pedpos | Not a Pedestrian | NA | On a roadway, in a travel lane | NA |
peds | 1 | 1 | 1 | 1 |
pedsnr | Not a Pedestrian | NA | Not Applicable | NA |
per_typ | Bicyclist | Driver of a Motor Vehicle In-Transport | Pedestrian | Driver of a Motor Vehicle In-Transport |
permvit | 1 | 1 | 1 | 1 |
pernotmvit | 1 | 1 | 1 | 1 |
persons | 1 | 1 | 1 | 1 |
prev_acc | NA | None | NA | None |
prev_dwi | NA | None | NA | None |
prev_oth | NA | None | NA | None |
prev_spd | NA | None | NA | None |
prev_sus | NA | None | NA | None |
pvh_invl | 0 | 0 | 0 | 0 |
race | White | Not a Fatality (not Applicable) | White | Not a Fatality (not Applicable) |
rail | Not Applicable | Not Applicable | Not Applicable | Not Applicable |
reg_stat | NA | Illinois | NA | Pennsylvania |
rel_road | On Roadway | On Roadway | On Roadway | On Roadway |
reljct1 | No | No | No | No |
reljct2 | Intersection-Related | Intersection-Related | Non-Junction | Non-Junction |
rest_mis | Not a Motor Vehicle Occupant | No | Not a Motor Vehicle Occupant | No |
rest_use | Not a Motor Vehicle Occupant | Restraint Used - Type Unknown | Not a Motor Vehicle Occupant | Shoulder and Lap Belt Used |
road_fnc | Urban-Minor Arterial | Urban-Minor Arterial | Rural-Minor Arterial | Rural-Minor Arterial |
rolinloc | NA | No Rollover | NA | No Rollover |
rollover | NA | No Rollover | NA | No Rollover |
route | Local Street - Municipality | Local Street - Municipality | U.S. Highway | U.S. Highway |
rur_urb | Urban | Urban | Rural | Rural |
sch_bus | No | No | No | No |
seat_pos | Not a Motor Vehicle Occupant | Front Seat, Left Side | Not a Motor Vehicle Occupant | Front Seat, Left Side |
sex | Male | Male | Male | Male |
sp_jur | No Special Jurisdiction | No Special Jurisdiction | No Special Jurisdiction | No Special Jurisdiction |
spec_use | NA | Vehicle Used as Other Bus | NA | No Special Use |
speedrel | NA | No | NA | No |
str_veh | 1 | 0 | 1 | 0 |
tow_veh | NA | No Trailing Units | NA | No Trailing Units |
towed | NA | Not Towed | NA | Towed Due to Disabling Damage |
trav_sp | NA | Not Reported | NA | Not Reported |
tway_id | WASHINGTON ST | WASHINGTON ST | US-40 | US-40 |
tway_id2 | VETERANS DR | VETERANS DR | NA | NA |
typ_int | Four-Way Intersection | Four-Way Intersection | Not an Intersection | Not an Intersection |
underide | NA | No Underride or Override Noted | NA | No Underride or Override Noted |
unittype | NA | Motor Vehicle In-Transport (Inside or Outside the Trafficway) | NA | Motor Vehicle In-Transport (Inside or Outside the Trafficway) |
v_config | NA | Bus (seats for more than 15 occupants, including driver) | NA | Not Applicable |
valign | NA | Straight | NA | Straight |
ve_forms | 1 | 1 | 1 | 1 |
ve_total | 1 | 1 | 1 | 1 |
vin | NA | NA | NA | NA |
vnum_lan | NA | Two lanes | NA | Three lanes |
vpavetyp | NA | Blacktop, Bituminous, or Asphalt | NA | Blacktop, Bituminous, or Asphalt |
vprofile | NA | Level | NA | Level |
vspd_lim | NA | 30 MPH | NA | 40 MPH |
vsurcond | NA | Dry | NA | Wet |
vtcont_f | NA | Device Functioning Properly | NA | No Controls |
vtrafcon | NA | Traffic control signal (on colors) with Pedestrian Signal | NA | No Controls |
vtrafway | NA | Two-Way, Not Divided | NA | Two-Way, Divided, Positive Median Barrier |
work_inj | No | Not Applicable (not a fatality) | No | Not Applicable (not a fatality) |
wrk_zone | None | None | None | None |
func_sys | NA | NA | NA | NA |
rd_owner | NA | NA | NA | NA |
cityname | NA | NA | NA | NA |
countyname | NA | NA | NA | NA |
statename | NA | NA | NA | NA |
trlr1vin | NA | NA | NA | NA |
trlr2vin | NA | NA | NA | NA |
trlr3vin | NA | NA | NA | NA |
nmhelmet | NA | NA | NA | NA |
nmlight | NA | NA | NA | NA |
nmothpre | NA | NA | NA | NA |
nmothpro | NA | NA | NA | NA |
nmpropad | NA | NA | NA | NA |
nmrefclo | NA | NA | NA | NA |
prev_sus1 | NA | NA | NA | NA |
prev_sus2 | NA | NA | NA | NA |
prev_sus3 | NA | NA | NA | NA |
helm_mis | NA | NA | NA | NA |
helm_use | NA | NA | NA | NA |
gvwr_from | NA | NA | NA | NA |
gvwr_to | NA | NA | NA | NA |
icfinalbody | NA | NA | NA | NA |
trlr1gvwr | NA | NA | NA | NA |
trlr2gvwr | NA | NA | NA | NA |
trlr3gvwr | NA | NA | NA | NA |
vpicbodyclass | NA | NA | NA | NA |
vpicmake | NA | NA | NA | NA |
vpicmodel | NA | NA | NA | NA |
underoverride | NA | NA | NA | NA |
devmotor | NA | NA | NA | NA |
devtype | NA | NA | NA | NA |
acc_config | NA | NA | NA | NA |
a1 | 0 | 0 | 3 | 3 |
a2 | 0 | 0 | 0 | 0 |
a3 | 0 | 0 | 0 | 0 |
a4 | 0 | 0 | 0 | 0 |
a5 | 0 | 0 | 0 | 0 |
a6 | 0 | 0 | 0 | 0 |
a7 | 0 | 0 | 0 | 0 |
a8 | 0 | 0 | 0 | 0 |
a9 | 0 | 0 | 0 | 0 |
a10 | 0 | 0 | 0 | 0 |
p1 | 16 | 0 | 0 | 3 |
p2 | 16 | 0 | 0 | 0 |
p3 | 16 | 0 | 13 | 0 |
p4 | 16 | 0 | 0 | 0 |
p5 | 16 | 0 | 0 | 0 |
p6 | 16 | 0 | 0 | 0 |
p7 | 16 | 0 | 0 | 0 |
p8 | 16 | 0 | 0 | 0 |
p9 | 16 | 0 | 0 | 0 |
p10 | 16 | 0 | 0 | 0 |
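The id and weight columns described before the table can be illustrated with a few toy rows. This is a sketch under assumptions: the concatenation mirrors the pattern visible in the table (year followed by st_case), the weights are made up, and the choice to keep one row per crash before summing is one reasonable reading of how weight applies, not a documented procedure.

```r
# Toy person-level rows: the same st_case appears in two different years.
flat <- data.frame(
  year    = c(2014, 2014, 2015),
  st_case = c(510001, 510001, 510001),
  per_no  = c(1, 2, 1)
)

# st_case alone conflates the 2014 and 2015 crashes.
n_by_st_case <- length(unique(flat$st_case))

# Concatenating year and st_case separates them.
flat$id <- paste0(flat$year, flat$st_case)
n_by_id <- length(unique(flat$id))

# GES/CRSS-style weights (made-up values), repeated on every row of a crash.
flat$weight <- c(120.5, 120.5, 80.25)

# Keep one row per crash before summing, so multi-person crashes
# are not counted more than once.
crashes <- flat[!duplicated(flat$id), ]
estimated_crashes <- sum(crashes$weight)
```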
The multi_ dataframes contain those variables for which there may be a varying number of values for any entity (e.g., driver impairments, vehicle events, weather conditions at time of crash). They are stored in long format, with one row per name-value pair. Each dataframe carries the identifiers needed for its entity: multi_acc includes st_case and year, multi_veh adds veh_no (vehicle number), and multi_per adds per_no (person number).
state | st_case | name | value | year |
---|---|---|---|---|
Illinois | 170423 | weather1 | Clear | 2014 |
Maryland | 240191 | weather1 | Rain | 2014 |
state | st_case | veh_no | per_no | name | value | year |
---|---|---|---|---|---|---|
Illinois | 170423 | 0 | 1 | drugtst1 | Both: Blood and Urine Tests | 2014 |
Illinois | 170423 | 0 | 1 | drugtst2 | Both: Blood and Urine Tests | 2014 |
Illinois | 170423 | 0 | 1 | drugtst3 | Test Not Given | 2014 |
Illinois | 170423 | 0 | 1 | drugres1 | DELTA 9 | 2014 |
Illinois | 170423 | 0 | 1 | drugres2 | Tetrahydrocannabinols (THC) | 2014 |
Illinois | 170423 | 0 | 1 | drugres3 | Test Not Given | 2014 |
Illinois | 170423 | 1 | 1 | drugtst1 | Unknown Test Type | 2014 |
Illinois | 170423 | 1 | 1 | drugtst2 | Test Not Given | 2014 |
Illinois | 170423 | 1 | 1 | drugtst3 | Test Not Given | 2014 |
Illinois | 170423 | 1 | 1 | drugres1 | Test For Drug, Results Unknown | 2014 |
Illinois | 170423 | 1 | 1 | drugres2 | Test Not Given | 2014 |
Illinois | 170423 | 1 | 1 | drugres3 | Test Not Given | 2014 |
Illinois | 170423 | 0 | 1 | mtm_crsh | Failure to Yield Right-Of-Way | 2014 |
Illinois | 170423 | 0 | 1 | mtm_crsh | Failure to Obey Traffic Signs, Signals or Officer | 2014 |
Illinois | 170423 | 0 | 1 | nmimpair | Under the Influence of Alcohol, Drugs or Medication | 2014 |
Illinois | 170423 | 0 | 1 | mpr_act | Crossing Roadway | 2014 |
Maryland | 240191 | 0 | 1 | drugtst2 | Test Not Given | 2014 |
Maryland | 240191 | 0 | 1 | drugtst3 | Test Not Given | 2014 |
Maryland | 240191 | 0 | 1 | drugres2 | Test Not Given | 2014 |
Maryland | 240191 | 0 | 1 | drugres3 | Test Not Given | 2014 |
Maryland | 240191 | 1 | 1 | drugtst1 | Test Not Given | 2014 |
Maryland | 240191 | 1 | 1 | drugtst2 | Test Not Given | 2014 |
Maryland | 240191 | 1 | 1 | drugtst3 | Test Not Given | 2014 |
Maryland | 240191 | 1 | 1 | drugres1 | Test Not Given | 2014 |
Maryland | 240191 | 1 | 1 | drugres2 | Test Not Given | 2014 |
Maryland | 240191 | 1 | 1 | drugres3 | Test Not Given | 2014 |
Maryland | 240191 | 0 | 1 | mtm_crsh | Failure to Yield Right-Of-Way | 2014 |
Maryland | 240191 | 0 | 1 | mtm_crsh | Not Visible (Dark clothing, No Lighting, etc.) | 2014 |
Maryland | 240191 | 0 | 1 | nmimpair | None/Apparently Normal | 2014 |
Maryland | 240191 | 0 | 1 | mpr_act | Crossing Roadway | 2014 |
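Because the multi_ tables are long, a common step is to keep one variable and join it back to person-level rows. The sketch below does this in base R on toy data copied from the tables above; the choice of merge() and the four-column person key are assumptions about a convenient workflow, not something the package prescribes.

```r
# Toy long-format rows shaped like the multi_per table above.
multi_per <- data.frame(
  year    = 2014,
  st_case = c(170423, 170423, 240191, 240191),
  veh_no  = 0,
  per_no  = 1,
  name    = c("nmimpair", "mpr_act", "nmimpair", "mpr_act"),
  value   = c("Under the Influence of Alcohol, Drugs or Medication",
              "Crossing Roadway",
              "None/Apparently Normal",
              "Crossing Roadway"),
  stringsAsFactors = FALSE
)

# Toy person-level rows standing in for the flat dataframe.
flat <- data.frame(
  year    = 2014,
  st_case = c(170423, 240191),
  veh_no  = 0,
  per_no  = 1,
  per_typ = c("Bicyclist", "Pedestrian"),
  stringsAsFactors = FALSE
)

# Keep one variable, then join to the person rows on the full person key.
impair <- multi_per[multi_per$name == "nmimpair", ]
person_keys <- c("year", "st_case", "veh_no", "per_no")
joined <- merge(flat, impair, by = person_keys)
joined[, c("st_case", "per_typ", "value")]
```

Joining on all four keys matters: st_case repeats across years, and veh_no and per_no repeat across crashes.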
The events dataframe provides a sequence of events for each vehicle in each crash. See vignette("Crash Sequence of Events", package = "rfars") for more information.
state | st_case | veh_no | aoi | soe | veventnum | year |
---|---|---|---|---|---|---|
Illinois | 170423 | 1 | 12 Clock Point | Pedalcyclist | 1 | 2014 |
Maryland | 240191 | 1 | 12 Clock Point | Pedestrian | 1 | 2014 |
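One way to read the events table is to order each vehicle's rows by veventnum and collapse them into a single string. The following is a minimal base R sketch on made-up rows; the case numbers and the two-event sequence are hypothetical, and only the column names follow the table above.

```r
# Toy events rows with hypothetical case numbers, deliberately out of order.
events <- data.frame(
  year      = 2014,
  st_case   = c(100001, 100002, 100002),
  veh_no    = 1,
  soe       = c("Pedalcyclist", "Pedestrian", "Curb"),
  veventnum = c(1, 2, 1),
  stringsAsFactors = FALSE
)

# Sort so that each vehicle's events appear in sequence.
events <- events[order(events$year, events$st_case,
                       events$veh_no, events$veventnum), ]

# Collapse each vehicle's events into one arrow-separated string.
sequences <- aggregate(soe ~ year + st_case + veh_no, data = events,
                       FUN = paste, collapse = " -> ")
sequences
```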
The codebook dataframe provides a searchable codebook for the data, useful if you know what concept you’re looking for but not the variable that describes it. rfars also includes pre-loaded codebooks for FARS and GES/CRSS (rfars::fars_codebook and rfars::gescrss_codebook). See vignette("Searchable Codebooks", package = "rfars") for more information.
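Searching by concept can be as simple as a case-insensitive pattern match on a label column. The sketch below uses a toy codebook; the column names name and label, the label text, and the helper find_variable() are all hypothetical and may not match the layout of the real codebook dataframes.

```r
# Toy codebook with assumed column names and illustrative labels.
codebook <- data.frame(
  name  = c("lgt_cond", "rest_use", "alc_res"),
  label = c("Light Condition", "Restraint System Use", "Alcohol Test Result"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return rows whose label matches a pattern, ignoring case.
find_variable <- function(codebook, pattern) {
  codebook[grepl(pattern, codebook$label, ignore.case = TRUE), ]
}

hits <- find_variable(codebook, "alcohol")
hits$name
```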
Counts
See vignette("Counts", package = "rfars") for information on the pre-loaded annual_counts dataframe and the counts() and compare_counts() functions. Also see vignette("Alcohol Counts", package = "rfars") for details on how BAC values are imputed and reported in Traffic Safety Facts.
Helpful Links
- National Highway Traffic Safety Administration (NHTSA)
- Fatality Analysis Reporting System (FARS)
- Fatality and Injury Reporting System Tool (FIRST)
- FARS Analytical User’s Manual
- General Estimates System (GES)
- Crash Report Sampling System (CRSS)
- CRSS Analytical User’s Manual
- NCSA and Other Data Sources
- NHTSA FTP Site