
The goal of rfars is to facilitate transportation safety analysis by simplifying the process of extracting data from official crash databases. The National Highway Traffic Safety Administration (NHTSA) collects and publishes a census of fatal crashes in the Fatality Analysis Reporting System (FARS) and a sample of fatal and non-fatal crashes in the Crash Report Sampling System (CRSS), an evolution of the General Estimates System (GES). The Fatality and Injury Reporting System Tool allows users to query these databases and can produce simple tables and graphs. This suffices for simple analysis but often leaves researchers wanting more. Digging any deeper, however, involves a time-consuming process of downloading annual ZIP files and attempting to stitch them together, after first combing through immense data dictionaries to determine the required variables and table names.

rfars allows users to download the last 10 years of FARS and GES/CRSS data with just one line of code. The result is a full, rich dataset ready for mapping, modeling, and other downstream analysis. Codebooks with variable definitions and value labels support an informed analysis of the data (see vignette("Searchable Codebooks", package = "rfars") for more information). Helper functions are also provided to produce common counts and comparisons.

Installation

You can install the latest version of rfars from GitHub with:

# install.packages("devtools")
devtools::install_github("s87jackson/rfars")

or the CRAN stable release with:

install.packages("rfars")

Then load rfars and some helpful packages:

library(rfars)

Getting and Using Data

get_fars() and get_gescrss() are the primary functions of the rfars package. These functions download and process data files directly from NHTSA’s FTP site, pull the prepared data stored on your local machine, or (as of Version 2.0) pull the prepared data from Zenodo. The data files hosted on Zenodo are stable, have DOIs, and replicate the data that would be produced by get_fars() and get_gescrss(), but in a fraction of the time.

Both functions take the parameters years and states (FARS) or regions (GES/CRSS). As the source data files follow an annual structure, years determines how many file sets are downloaded or loaded, and states/regions filters the resulting dataset. Downloading and processing these files can take several minutes. Before downloading, rfars will inform you that it’s about to download files and ask your permission to do so. To skip this dialog, set proceed = TRUE. You can use the dir and cache parameters to save an RDS file to your local machine. The dir parameter specifies the directory, and cache names the file (be sure to include the .rds file extension).
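As a minimal sketch of how these parameters fit together, the call below requests a subset of years for one state and caches the result. The parameter names come from the text above. The specific values (a year range written as 2021:2023, a state given by its full name, and the file name fars_va.rds) are assumptions for illustration, so check the function documentation for the accepted formats.

```r
# Arguments named in the text; the values are illustrative assumptions
fars_args <- list(
  years   = 2021:2023,      # assumed format: a vector of years
  states  = "Virginia",     # assumed format: full state name
  proceed = TRUE,           # skip the permission dialog
  dir     = tempdir(),      # directory for the cached file
  cache   = "fars_va.rds"   # file name, including the .rds extension
)

# Only attempt the download in an interactive session with rfars installed
if (interactive() && requireNamespace("rfars", quietly = TRUE)) {
  myFARS_va <- do.call(rfars::get_fars, fars_args)
}
```

Keeping the arguments in a list makes it easy to reuse the same years and cache settings across sessions.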

Executing the code below will download the prepared FARS and GES/CRSS databases for 2014-2023.

myFARS <- get_fars(proceed = TRUE)
myCRSS <- get_gescrss(proceed = TRUE)

get_fars() and get_gescrss() each return a list with six dataframes: flat, multi_acc, multi_veh, multi_per, events, and codebook.
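The returned object is an ordinary R list, so its elements can be inspected and extracted with standard tools. The sketch below uses a small stand-in list (toy_fars, a hypothetical object with made-up contents) rather than real rfars output, so that it runs without downloading anything.

```r
# A stand-in for the list returned by get_fars(); real tables are far larger
toy_fars <- list(
  flat      = data.frame(id = c("2014170423", "2014170423"), per_no = c(1, 1)),
  multi_acc = data.frame(st_case = 170423, name = "weather1", value = "Clear"),
  multi_veh = data.frame(),
  multi_per = data.frame(),
  events    = data.frame(),
  codebook  = data.frame()
)

# List the elements and count the rows in each
element_names    <- names(toy_fars)
rows_per_element <- vapply(toy_fars, nrow, integer(1))

# Pull out a single dataframe with $
flat <- toy_fars$flat
```

The same pattern (for example myFARS$flat) should apply to the objects created above.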

The tables below show records for randomly selected crashes to illustrate the content and structure of the data. The tables are transposed for readability.

Each row in the flat dataframe corresponds to a person involved in a crash. As there may be multiple people and/or vehicles involved in one crash, some variable values are repeated within a crash or vehicle. Each crash is uniquely identified by id, which is a combination of year and st_case. Note that st_case is not unique across years: for example, st_case 510001 will appear in each year. The id variable avoids this ambiguity by including the year. The GES/CRSS data includes a weight variable that indicates how many crashes each row represents.
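The sketch below illustrates why the combined key matters and how the weight might be used. It works on toy dataframes with made-up values, not real rfars output. The paste0() construction is an assumption that reproduces the pattern seen in ids such as 2014170423, and is not necessarily how rfars builds id internally. The weighted example also assumes the weight is constant within a crash.

```r
# Toy person-level rows: the same st_case appears in two different years
toy_flat <- data.frame(
  year    = c(2014, 2014, 2015),
  st_case = c(510001, 510001, 510001),
  veh_no  = c(0, 1, 1),
  per_no  = c(1, 1, 1)
)

# Combine year and st_case into a crash-level key (assumed construction)
toy_flat$id <- paste0(toy_flat$year, toy_flat$st_case)

# st_case alone suggests one crash; id separates the two crashes
n_by_st_case <- length(unique(toy_flat$st_case))
n_by_id      <- length(unique(toy_flat$id))

# Rows are people, so counting rows within each id gives people per crash
people_per_crash <- table(toy_flat$id)

# Toy GES/CRSS-style rows: keep one row per crash before summing weights
toy_crss    <- data.frame(id = c("A", "A", "B"), weight = c(120.5, 120.5, 80))
crash_level <- toy_crss[!duplicated(toy_crss$id), ]
est_crashes <- sum(crash_level$weight)
```

Deduplicating on id before summing avoids counting a crash once for every person involved in it.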

The ‘flat’ dataframe (transposed for readability)
year 2014 2014 2014 2014
state Illinois Illinois Maryland Maryland
st_case 170423 170423 240191 240191
id 2014170423 2014170423 2014240191 2014240191
veh_no 0 1 0 1
per_no 1 1 1 1
county 179 179 43 43
city 2600 2600 0 0
lon -89.56736 -89.56736 -77.75684 -77.75684
lat 40.66485 40.66485 39.65347 39.65347
acc_type NA Pedestrian/ Animal NA Pedestrian/ Animal
age 37 Years 64 Years 32 Years 23 Years
air_bag Not a Motor Vehicle Occupant Not Deployed Not a Motor Vehicle Occupant Not Deployed
alc_det Evidential Test (breath, blood, urine) Not Reported Not Reported Not Reported
alc_res 0.16 % BAC AC Test Performed,Results Unknown Not Reported Test Not Given
alc_status Test Given Test Given Not Reported Test Not Given
arr_hour Unknown EMS Scene Arrival Hour Unknown EMS Scene Arrival Hour Unknown EMS Scene Arrival Hour Unknown EMS Scene Arrival Hour
arr_min Unknown EMS Scene Arrival Minutes Unknown EMS Scene Arrival Minutes Unknown EMS Scene Arrival Minutes Unknown EMS Scene Arrival Minutes
atst_typ Blood Blood Not Reported Test Not Given
bikecgp Bicyclist Failed to Yield - Sign-Controlled Intersection NA Not a Cyclist NA
bikectype Bicyclist Ride Out - Sign-Controlled Intersection NA Not a Cyclist NA
bikedir With Traffic NA Not a Cyclist NA
bikeloc Intersection-Related NA Not a Cyclist NA
bikepos On a roadway, in a shared travel lane NA Not a Cyclist NA
body_typ NA Other Bus Type NA 4-door sedan, hardtop
bus_use NA Charter/Tour NA Not a Bus
cargo_bt NA Bus NA Not Applicable (N/A)
cdl_stat NA Valid NA No (CDL)
cert_no ************ ************ ************ ************
day 22 22 31 31
day_week Sunday Sunday Thursday Thursday
death_da 23 Not Applicable (Non-Fatal) 31 Not Applicable (Non-Fatal)
death_hr 10:00-10:59 Not Applicable (Non-fatal) 2:00-2:59 Not Applicable (Non-fatal)
death_mn 30 Not Applicable (Non-fatal) 27 Not Applicable (Non-fatal)
death_mo June Not Applicable (Non-Fatal) July Not Applicable (Non-Fatal)
death_tm 1030 8888 227 8888
death_yr 2014 Not Applicable (Non-fatal) 2014 Not Applicable (Non-fatal)
deaths NA 0 NA 0
deformed NA Minor Damage NA Disabling Damage
doa Not Applicable Not Applicable Not Applicable Not Applicable
dr_drink NA No NA No
dr_hgt NA 74 NA 72
dr_pres NA Yes NA Yes
dr_wgt NA 225 lbs. NA Unknown
dr_zip NA NA NA NA
drinking Yes (Alcohol Involved) Not Reported No (Alcohol Not Involved) No (Alcohol Not Involved)
drug_det Evidential Test (Blood, Urine) Not Reported Not Reported Other
drugs Yes (drugs involved) Not Reported No (drugs not involved) No (drugs not involved)
drunk_dr 0 0 0 0
dstatus Test Given Test Given Not Reported Test Not Given
ej_path Not Ejected/Not Applicable Not Ejected/Not Applicable Not Ejected/Not Applicable Not Ejected/Not Applicable
ejection Not Applicable Not Ejected Not Applicable Not Ejected
emer_use NA Not Applicable NA Not Applicable
extricat Not Extricated or Not Applicable Not Extricated or Not Applicable Not Extricated or Not Applicable Not Extricated or Not Applicable
fatals 1 1 1 1
fire_exp NA No or Not Reported NA No or Not Reported
first_mo NA No Record NA No Record
first_yr NA No Record NA No Record
gvwr NA 26,001 lbs. or more NA Not Applicable
harm_ev Pedalcyclist Pedalcyclist Pedestrian Pedestrian
haz_cno NA Not Applicable NA Not Applicable
haz_id NA Not Applicable NA Not Applicable
haz_inv NA No NA No
haz_plac NA Not Applicable NA Not Applicable
haz_rel NA Not Applicable NA Not Applicable
hispanic Non-Hispanic Not A Fatality (not Applicable) Non-Hispanic Not A Fatality (not Applicable)
hit_run NA No NA No
hosp_hr Unknown Unknown Unknown Unknown
hosp_mn Unknown EMS Hospital Arrival Time Unknown EMS Hospital Arrival Time Unknown EMS Hospital Arrival Time Unknown EMS Hospital Arrival Time
hospital EMS Ground Not Transported EMS Air Not Transported
hour 0:00am-0:59am 0:00am-0:59am 0:00am-0:59am 0:00am-0:59am
impact1 NA 12 Clock Point NA 12 Clock Point
inj_sev Fatal Injury (K) No Apparent Injury (O) Fatal Injury (K) No Apparent Injury (O)
j_knife NA Not an Articulated Vehicle NA Not an Articulated Vehicle
l_compl NA Valid license for this class vehicle NA Valid license for this class vehicle
l_endors NA Endorsement(s) Required, Compliance Unknown NA No Endorsements required for this vehicle
l_restri NA Restrictions, Compliance Unknown NA No Restrictions or Not Applicable
l_state NA Illinois NA Pennsylvania
l_status NA Valid NA Valid
l_type NA Full Driver License NA Full Driver License
lag_hrs 34 999 2 999
lag_mins 25 99 3 99
last_mo NA No Record NA No Record
last_yr NA No Record NA No Record
lgt_cond Dark - Lighted Dark - Lighted Dark - Not Lighted Dark - Not Lighted
location Not at Intersection - On Roadway, Not in Marked Crosswalk Occupant of a Motor Vehicle Not at Intersection - On Roadway, Not in Marked Crosswalk Occupant of a Motor Vehicle
m_harm NA Pedalcyclist NA Pedestrian
mak_mod NA Other Make Bus***: Conventional (Engine out front) NA Pontiac G6
make NA Other Make NA Pontiac
man_coll Not a Collision with Motor Vehicle In-Transport Not a Collision with Motor Vehicle In-Transport Not a Collision with Motor Vehicle In-Transport Not a Collision with Motor Vehicle In-Transport
mcarr_i1 NA US DOT NA Not Applicable
mcarr_i2 NA NA NA Not Applicable
mcarr_id NA NA NA Not Applicable
milept None None NA NA
minute 5 5 24 24
mod_year NA NA NA NA
model NA 981 NA 22
month June June July July
motdir Not a Pedestrian NA Not Applicable NA
motman Not a Pedestrian NA Not Applicable NA
msafeqmt None Used NA None Used NA
nhs This section IS NOT on the NHS This section IS NOT on the NHS This section IS NOT on the NHS This section IS NOT on the NHS
not_hour 0:00am-0:59am 0:00am-0:59am Unknown Unknown
not_min 5 5 Unknown Unknown
numoccs NA 01 NA 01
owner NA Vehicle Registered as Business/Company/Government Vehicle NA Driver (in this crash) was Registered Owner
p_crash1 NA Going Straight NA Going Straight
p_crash2 NA Pedalcyclist or other non-motorist in road NA Pedestrian in road
p_crash3 NA No Avoidance Maneuver NA Steering left
pbcwalk Yes NA None Noted NA
pbswalk Yes NA None Noted NA
pbszone None Noted NA None Noted NA
pcrash4 NA Tracking NA Tracking
pcrash5 NA Stayed in original travel lane NA Stayed in original travel lane
pedcgp Not a Pedestrian NA Crossing Roadway - Vehicle Not Turning NA
pedctype Not a Pedestrian NA Pedestrian Failed to Yield NA
peddir Not a Pedestrian NA Not Applicable NA
pedleg Not a Pedestrian NA Not Applicable NA
pedloc Not a Pedestrian NA Not At Intersection NA
pedpos Not a Pedestrian NA On a roadway, in a travel lane NA
peds 1 1 1 1
pedsnr Not a Pedestrian NA Not Applicable NA
per_typ Bicyclist Driver of a Motor Vehicle In-Transport Pedestrian Driver of a Motor Vehicle In-Transport
permvit 1 1 1 1
pernotmvit 1 1 1 1
persons 1 1 1 1
prev_acc NA None NA None
prev_dwi NA None NA None
prev_oth NA None NA None
prev_spd NA None NA None
prev_sus NA None NA None
pvh_invl 0 0 0 0
race White Not a Fatality (not Applicable) White Not a Fatality (not Applicable)
rail Not Applicable Not Applicable Not Applicable Not Applicable
reg_stat NA Illinois NA Pennsylvania
rel_road On Roadway On Roadway On Roadway On Roadway
reljct1 No No No No
reljct2 Intersection-Related Intersection-Related Non-Junction Non-Junction
rest_mis Not a Motor Vehicle Occupant No Not a Motor Vehicle Occupant No
rest_use Not a Motor Vehicle Occupant Restraint Used - Type Unknown Not a Motor Vehicle Occupant Shoulder and Lap Belt Used
road_fnc Urban-Minor Arterial Urban-Minor Arterial Rural-Minor Arterial Rural-Minor Arterial
rolinloc NA No Rollover NA No Rollover
rollover NA No Rollover NA No Rollover
route Local Street - Municipality Local Street - Municipality U.S. Highway U.S. Highway
rur_urb Urban Urban Rural Rural
sch_bus No No No No
seat_pos Not a Motor Vehicle Occupant Front Seat, Left Side Not a Motor Vehicle Occupant Front Seat, Left Side
sex Male Male Male Male
sp_jur No Special Jurisdiction No Special Jurisdiction No Special Jurisdiction No Special Jurisdiction
spec_use NA Vehicle Used as Other Bus NA No Special Use
speedrel NA No NA No
str_veh 1 0 1 0
tow_veh NA No Trailing Units NA No Trailing Units
towed NA Not Towed NA Towed Due to Disabling Damage
trav_sp NA Not Reported NA Not Reported
tway_id WASHINGTON ST WASHINGTON ST US-40 US-40
tway_id2 VETERANS DR VETERANS DR NA NA
typ_int Four-Way Intersection Four-Way Intersection Not an Intersection Not an Intersection
underide NA No Underride or Override Noted NA No Underride or Override Noted
unittype NA Motor Vehicle In-Transport (Inside or Outside the Trafficway) NA Motor Vehicle In-Transport (Inside or Outside the Trafficway)
v_config NA Bus (seats for more than 15 occupants, including driver) NA Not Applicable
valign NA Straight NA Straight
ve_forms 1 1 1 1
ve_total 1 1 1 1
vin NA NA NA NA
vnum_lan NA Two lanes NA Three lanes
vpavetyp NA Blacktop, Bituminous, or Asphalt NA Blacktop, Bituminous, or Asphalt
vprofile NA Level NA Level
vspd_lim NA 30 MPH NA 40 MPH
vsurcond NA Dry NA Wet
vtcont_f NA Device Functioning Properly NA No Controls
vtrafcon NA Traffic control signal (on colors) with Pedestrian Signal NA No Controls
vtrafway NA Two-Way, Not Divided NA Two-Way, Divided, Positive Median Barrier
work_inj No Not Applicable (not a fatality) No Not Applicable (not a fatality)
wrk_zone None None None None
func_sys NA NA NA NA
rd_owner NA NA NA NA
cityname NA NA NA NA
countyname NA NA NA NA
statename NA NA NA NA
trlr1vin NA NA NA NA
trlr2vin NA NA NA NA
trlr3vin NA NA NA NA
nmhelmet NA NA NA NA
nmlight NA NA NA NA
nmothpre NA NA NA NA
nmothpro NA NA NA NA
nmpropad NA NA NA NA
nmrefclo NA NA NA NA
prev_sus1 NA NA NA NA
prev_sus2 NA NA NA NA
prev_sus3 NA NA NA NA
helm_mis NA NA NA NA
helm_use NA NA NA NA
gvwr_from NA NA NA NA
gvwr_to NA NA NA NA
icfinalbody NA NA NA NA
trlr1gvwr NA NA NA NA
trlr2gvwr NA NA NA NA
trlr3gvwr NA NA NA NA
vpicbodyclass NA NA NA NA
vpicmake NA NA NA NA
vpicmodel NA NA NA NA
underoverride NA NA NA NA
devmotor NA NA NA NA
devtype NA NA NA NA
acc_config NA NA NA NA
a1 0 0 3 3
a2 0 0 0 0
a3 0 0 0 0
a4 0 0 0 0
a5 0 0 0 0
a6 0 0 0 0
a7 0 0 0 0
a8 0 0 0 0
a9 0 0 0 0
a10 0 0 0 0
p1 16 0 0 3
p2 16 0 0 0
p3 16 0 13 0
p4 16 0 0 0
p5 16 0 0 0
p6 16 0 0 0
p7 16 0 0 0
p8 16 0 0 0
p9 16 0 0 0
p10 16 0 0 0

The multi_ dataframes contain those variables for which there may be a varying number of values for any entity (e.g., driver impairments, vehicle events, weather conditions at time of crash). Each dataframe has the requisite data elements corresponding to the entity: multi_acc includes st_case and year, multi_veh adds veh_no (vehicle number), and multi_per adds per_no (person number).

The ‘multi_acc’ dataframe
state st_case name value year
Illinois 170423 weather1 Clear 2014
Maryland 240191 weather1 Rain 2014

The ‘multi_veh’ dataframe
state st_case name value year
Illinois 170423 weather1 Clear 2014
Maryland 240191 weather1 Rain 2014

The ‘multi_per’ dataframe
state st_case veh_no per_no name value year
Illinois 170423 0 1 drugtst1 Both: Blood and Urine Tests 2014
Illinois 170423 0 1 drugtst2 Both: Blood and Urine Tests 2014
Illinois 170423 0 1 drugtst3 Test Not Given 2014
Illinois 170423 0 1 drugres1 DELTA 9 2014
Illinois 170423 0 1 drugres2 Tetrahydrocannabinols (THC) 2014
Illinois 170423 0 1 drugres3 Test Not Given 2014
Illinois 170423 1 1 drugtst1 Unknown Test Type 2014
Illinois 170423 1 1 drugtst2 Test Not Given 2014
Illinois 170423 1 1 drugtst3 Test Not Given 2014
Illinois 170423 1 1 drugres1 Test For Drug, Results Unknown 2014
Illinois 170423 1 1 drugres2 Test Not Given 2014
Illinois 170423 1 1 drugres3 Test Not Given 2014
Illinois 170423 0 1 mtm_crsh Failure to Yield Right-Of-Way 2014
Illinois 170423 0 1 mtm_crsh Failure to Obey Traffic Signs, Signals or Officer 2014
Illinois 170423 0 1 nmimpair Under the Influence of Alcohol, Drugs or Medication 2014
Illinois 170423 0 1 mpr_act Crossing Roadway 2014
Maryland 240191 0 1 drugtst2 Test Not Given 2014
Maryland 240191 0 1 drugtst3 Test Not Given 2014
Maryland 240191 0 1 drugres2 Test Not Given 2014
Maryland 240191 0 1 drugres3 Test Not Given 2014
Maryland 240191 1 1 drugtst1 Test Not Given 2014
Maryland 240191 1 1 drugtst2 Test Not Given 2014
Maryland 240191 1 1 drugtst3 Test Not Given 2014
Maryland 240191 1 1 drugres1 Test Not Given 2014
Maryland 240191 1 1 drugres2 Test Not Given 2014
Maryland 240191 1 1 drugres3 Test Not Given 2014
Maryland 240191 0 1 mtm_crsh Failure to Yield Right-Of-Way 2014
Maryland 240191 0 1 mtm_crsh Not Visible (Dark clothing, No Lighting, etc.) 2014
Maryland 240191 0 1 nmimpair None/Apparently Normal 2014
Maryland 240191 0 1 mpr_act Crossing Roadway 2014
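Because the multi_ dataframes are stored in long format (one row per name/value pair), a common pattern is to filter to one variable and join it back to person-level rows on the entity keys. The sketch below does this in base R with toy dataframes that copy a few of the rows shown above. The join keys follow the description of multi_per, and the toy_flat columns are a reduced, illustrative subset.

```r
# A few multi_per rows copied from the sample above
toy_multi_per <- data.frame(
  year    = 2014,
  st_case = c(170423, 170423, 170423, 240191),
  veh_no  = 0,
  per_no  = 1,
  name    = c("mtm_crsh", "mtm_crsh", "nmimpair", "nmimpair"),
  value   = c("Failure to Yield Right-Of-Way",
              "Failure to Obey Traffic Signs, Signals or Officer",
              "Under the Influence of Alcohol, Drugs or Medication",
              "None/Apparently Normal")
)

# A reduced stand-in for the flat dataframe
toy_flat <- data.frame(
  year = 2014, st_case = c(170423, 240191), veh_no = 0, per_no = 1,
  per_typ = c("Bicyclist", "Pedestrian")
)

# Keep one variable of interest; a person may have several rows for it
contrib <- toy_multi_per[toy_multi_per$name == "mtm_crsh", ]

# Join back to the person-level rows on the entity keys
keys   <- c("year", "st_case", "veh_no", "per_no")
joined <- merge(toy_flat, contrib, by = keys)
```

Note that the join can return more than one row per person, since a single person can have several values for the same variable.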

The events dataframe provides a sequence of events for each vehicle in each crash. See vignette("Crash Sequence of Events", package = "rfars") for more information.

The ‘events’ dataframe
state st_case veh_no aoi soe veventnum year
Illinois 170423 1 12 Clock Point Pedalcyclist 1 2014
Maryland 240191 1 12 Clock Point Pedestrian 1 2014
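One way to read the events dataframe is to order each vehicle's rows by veventnum and collapse the soe values into a single string. The sketch below does this on a toy dataframe that uses the column names shown above. The second event for vehicle 1 and the second vehicle are hypothetical additions, included only so that there is a multi-step sequence to display.

```r
# Toy events: vehicle 1 has two events (listed out of order), vehicle 2 has one
toy_events <- data.frame(
  year      = 2014,
  st_case   = 170423,
  veh_no    = c(1, 1, 2),
  veventnum = c(2, 1, 1),
  soe       = c("Rollover/Overturn", "Pedalcyclist", "Pedalcyclist")
)

# Sort so that events appear in sequence within each vehicle
ord     <- order(toy_events$year, toy_events$st_case,
                 toy_events$veh_no, toy_events$veventnum)
ordered <- toy_events[ord, ]

# Collapse each vehicle's events into one readable string
sequences <- aggregate(soe ~ year + st_case + veh_no, data = ordered,
                       FUN = paste, collapse = " -> ")
```

The vignette referenced above describes the package's own tools for working with these sequences.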

The codebook dataframe provides a searchable codebook for the data, useful if you know what concept you’re looking for but not the variable that describes it. rfars also includes pre-loaded codebooks for FARS and GES/CRSS (rfars::fars_codebook and rfars::gescrss_codebook). See vignette("Searchable Codebooks", package = "rfars") for more information.
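A concept search of this kind can be sketched with a case-insensitive pattern match. The toy codebook below is hypothetical: its column names (name, label) and label text are assumptions for illustration, and the real codebook may be organized differently.

```r
# Hypothetical codebook layout; real column names and labels may differ
toy_codebook <- data.frame(
  name  = c("alc_res", "drinking", "lgt_cond"),
  label = c("Alcohol Test Result",
            "Police Reported Alcohol Involvement",
            "Light Condition"),
  stringsAsFactors = FALSE
)

# Find variables whose label mentions the concept of interest
hits <- toy_codebook[grepl("alcohol", toy_codebook$label, ignore.case = TRUE), ]
```

The variable names in hits$name can then be used to select columns or filter the name column of the multi_ dataframes.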

Counts

See vignette("Counts", package = "rfars") for information on the pre-loaded annual_counts dataframe and the counts() and compare_counts() functions. Also see vignette("Alcohol Counts", package = "rfars") for details on how BAC values are imputed and reported in Traffic Safety Facts.