A table describing each GESCRSS variable name, value, and corresponding value label.

Usage

gescrss_codebook

Format

A data frame with 34,662 rows and 19 variables:

source

The source of the data (either FARS or GESCRSS).

file

The data file that contains the given variable.

name_ncsa

The original name of the data element.

name_rfars

The modified data element name used in rfars.

label

The label of the data element itself (not its constituent values).

Definition

The data element's definition, pulled from the Analytical User Manual.

Additional Information

Additional information on the data element, pulled from the Analytical User Manual.

value

The original value of the data element.

value_label

The de-coded value label.

2014

Indicator: 1 if valid for 2014, NA otherwise.

2015

Indicator: 1 if valid for 2015, NA otherwise.

2016

Indicator: 1 if valid for 2016, NA otherwise.

2017

Indicator: 1 if valid for 2017, NA otherwise.

2018

Indicator: 1 if valid for 2018, NA otherwise.

2019

Indicator: 1 if valid for 2019, NA otherwise.

2020

Indicator: 1 if valid for 2020, NA otherwise.

2021

Indicator: 1 if valid for 2021, NA otherwise.

2022

Indicator: 1 if valid for 2022, NA otherwise.

2023

Indicator: 1 if valid for 2023, NA otherwise.

Details

This codebook is a reference for researchers using GES/CRSS data. The 'source' variable makes it possible to combine this table with the fars_codebook. Data elements are relatively stable but are occasionally discontinued, introduced, or modified. The year columns ('2014' through '2023') indicate when each data element and value is available, and they distinguish between definitions that changed over time. Users should always check for discontinuities when tabulating cases.
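As a minimal sketch of the discontinuity check described above, the following uses a small toy data frame in place of rfars::gescrss_codebook. The rows are copied from the example output on this page, with only a subset of columns and years; the helper valid_in() is a hypothetical name introduced for illustration and is not part of rfars.

```r
# Toy stand-in for rfars::gescrss_codebook (subset of columns and years only)
codebook <- data.frame(
  name_rfars  = c("land_use", "land_use", "region"),
  value       = c("1", "2", "1"),
  value_label = c("Within area of population 25,000 - 49,999",
                  "Within area of population 50,000 - 100,000",
                  "Northeast (PA, NJ, NY, NH, VT, RI, MA, ME, CT)"),
  `2015`      = c("1", "1", "1"),
  `2016`      = c(NA, NA, "1"),
  check.names = FALSE,          # keep the year column names as "2015", "2016"
  stringsAsFactors = FALSE
)

# Hypothetical helper: data elements with at least one value valid in a year
valid_in <- function(cb, year) {
  unique(cb$name_rfars[!is.na(cb[[as.character(year)]])])
}

valid_2015 <- valid_in(codebook, 2015)
valid_2016 <- valid_in(codebook, 2016)

# Elements available in 2015 but not in 2016: a discontinuity to investigate
dropped <- setdiff(valid_2015, valid_2016)
print(dropped)
```

The same helper should work when passed rfars::gescrss_codebook directly, assuming the year columns are named as shown in the Format section.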

The 'file' variable indicates the file in which the given data element originally appeared. Here, 'file' refers to the SAS files downloaded from NHTSA. Most data elements stayed in their original file. Those that did not were moved to the multi_ files. For example, 'weather' originates from the 'accident' file, but appears in the multi_acc data object created by rfars.

The 'name_ncsa' variable describes the data element's name as assigned by NCSA (the organization within NHTSA that manages the database). To maximize compatibility between years and ease of use for programming, 'name_rfars' provides a cleaned naming convention (via janitor::clean_names()).
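To illustrate the kind of transformation involved, here is a simplified base R stand-in for the name cleaning. It is not janitor::clean_names() itself, which handles many more cases (duplicates, special characters, transliteration); simple_clean() is a hypothetical helper that only lower-cases names and normalizes separators. The two input names are taken from the example output on this page.

```r
# Original NCSA names, as they appear in the 'name_ncsa' column
ncsa_names <- c("LAND_USE", "REGION")

# Hypothetical, simplified approximation of the cleaning step:
# lower-case, collapse non-alphanumeric runs to "_", trim stray underscores
simple_clean <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", "_", x)
  gsub("^_+|_+$", "", x)
}

cleaned <- simple_clean(ncsa_names)
print(cleaned)
```

For these simple names the result matches the 'name_rfars' values shown in the Examples section ('land_use' and 'region').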

Each data element has a 'label', a more human-readable version of the element name. For example, the label for 'harm_ev' is 'First Harmful Event'. Labels are not definitions, but they may provide enough information to help users conduct their analysis. Consult the CRSS User Manual for definitions and further details.
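Because labels are plain text, one way to find a data element is to search them. The sketch below does this on a toy data frame whose rows are copied from the example output on this page; with the package installed, the same expression could presumably be run against rfars::gescrss_codebook.

```r
# Toy stand-in for the codebook with only the columns needed for a label search
codebook <- data.frame(
  file       = c("accident", "accident", "accident"),
  name_rfars = c("land_use", "land_use", "region"),
  label      = c("Land Use", "Land Use", "Region of the Country"),
  stringsAsFactors = FALSE
)

# Case-insensitive search of the labels; unique() collapses the
# one-row-per-value structure down to one row per data element
is_hit <- grepl("region", codebook$label, ignore.case = TRUE)
hits <- unique(codebook[is_hit, c("file", "name_rfars", "label")])
print(hits)
```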

'Definition' and 'Additional Information' were extracted from the Analytical User’s Manual.

Each data element has multiple 'value'-'value_label' pairs: 'value' represents the original, non-human-readable value (usually a number), and 'value_label' represents the corresponding text value. For example, for 'harm_ev', 1 (the 'value') corresponds to 'Rollover/Overturn' (the 'value_label'), 2 corresponds to 'Fire/Explosion', etc.
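The value-to-label mapping can be applied to raw codes with a named lookup vector. This is a minimal sketch under stated assumptions: the codebook rows below are a toy stand-in built from the 'harm_ev' example in the text, and the 'raw' data frame with its 'case_id' column is hypothetical.

```r
# Toy stand-in for the 'harm_ev' rows of the codebook
codebook <- data.frame(
  name_rfars  = c("harm_ev", "harm_ev"),
  value       = c("1", "2"),
  value_label = c("Rollover/Overturn", "Fire/Explosion"),
  stringsAsFactors = FALSE
)

# Hypothetical raw data holding undecoded harm_ev codes
raw <- data.frame(case_id = 1:3, harm_ev = c(2, 1, 2))

# Named lookup vector for one data element: names are values, entries are labels
cb <- codebook[codebook$name_rfars == "harm_ev", ]
lookup <- setNames(cb$value_label, cb$value)

# 'value' is stored as character in the codebook, so convert before matching
raw$harm_ev_label <- unname(lookup[as.character(raw$harm_ev)])
print(raw)
```

If a value's label changed between years, the lookup would contain duplicate names, so it is safer to restrict the codebook rows to the relevant year column before building it.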

Codebooks are generated automatically during data processing by extracting SAS format catalogs (.sas7bcat files) and VALUE statements from .sas files, then consolidating variable names, labels, and value-label mappings across all years into searchable reference tables. Source files are published by NHTSA.

See also

"fars_codebook"

Examples

head(rfars::gescrss_codebook)
#>     source     file name_ncsa name_rfars                 label
#>     <char>   <char>    <char>     <char>                <char>
#> 1: GESCRSS accident  LAND_USE   land_use              Land Use
#> 2: GESCRSS accident  LAND_USE   land_use              Land Use
#> 3: GESCRSS accident  LAND_USE   land_use              Land Use
#> 4: GESCRSS accident  LAND_USE   land_use              Land Use
#> 5: GESCRSS accident  LAND_USE   land_use              Land Use
#> 6: GESCRSS accident    REGION     region Region of the Country
#>                                                                          Definition
#>                                                                              <char>
#> 1:                                                                             <NA>
#> 2:                                                                             <NA>
#> 3:                                                                             <NA>
#> 4:                                                                             <NA>
#> 5:                                                                             <NA>
#> 6: This data element identifies the region of the country where the crash occurred.
#>                                                                                                                                                                                                                    Additional Information
#>                                                                                                                                                                                                                                    <char>
#> 1:                                                                                                                                                                                                                                   <NA>
#> 2:                                                                                                                                                                                                                                   <NA>
#> 3:                                                                                                                                                                                                                                   <NA>
#> 4:                                                                                                                                                                                                                                   <NA>
#> 5:                                                                                                                                                                                                                                   <NA>
#> 6: This data element is derived based on the State in which the Primary Sampling Unit is located where the crash occurred. See Appendix B: Rules for Derived Data Elements for an explanation of this data element and how it is derived.
#>     value                                    value_label   2014   2015   2016
#>    <char>                                         <char> <char> <char> <char>
#> 1:      1      Within area of population 25,000 - 49,999      1      1   <NA>
#> 2:      2     Within area of population 50,000 - 100,000      1      1   <NA>
#> 3:      3             Within area of population 100,000+      1      1   <NA>
#> 4:      8                                     Other area      1      1   <NA>
#> 5:      9                                        Unknown      1      1   <NA>
#> 6:      1 Northeast (PA, NJ, NY, NH, VT, RI, MA, ME, CT)      1      1      1
#>      2017   2018   2019   2020   2021   2022   2023
#>    <char> <char> <char> <char> <char> <char> <char>
#> 1:   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>
#> 2:   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>
#> 3:   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>
#> 4:   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>
#> 5:   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>   <NA>
#> 6:      1      1      1      1      1      1      1