Geocoding Crash Data: Proposal Outline

in progress

Each year, local and governmental agencies collect and analyze California traffic crash data from SWITRS (the Statewide Integrated Traffic Records System) to monitor injury rates, identify high-crash locations, develop traffic safety programs, and evaluate the effectiveness of safety measures. Many SWITRS data users need to link motor vehicle collision data with exact geographical information to identify dangerous roads and intersections and to study crash patterns on specific road and intersection types. There are currently many barriers to accurate, inexpensive, and efficient access to geocoded collision data. They are:

Expense. Commercially available platforms to geocode SWITRS data are very expensive.

Ease-of-use. Commercially available geocoding engines generally use a single data field to match addresses, and occasionally use a secondary zone field (e.g., zip code, city, county) to prevent out-of-area matches. Current location information in SWITRS, however, is represented by a collection of data fields including primary and secondary roads, qualified by direction and offset fields. Therefore, special programming is needed to geocode SWITRS data precisely with commercially available software.

Inaccuracy. Accurate geocoding requires consistent and correctly spelled street names, accurate values in the "offset" and "direction" data fields, which are estimated by the reporting officer, and a current and extremely accurate area map. Because these requirements are often not met, most geocoding is inadequate for use in analysis of intersection safety.

Inefficiency. Even with the best software, programming, and base map, some crashes will require manual geocoding, a very labor-intensive process.

Redundancy. Many individual jurisdictions, county jurisdictions and some state jurisdictions are currently geocoding crash data. The geocoding being done may be duplicated by other researchers unaware of the overlap.
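To make the "special programming" mentioned under Ease-of-use more concrete, the following is a minimal sketch of how a location described by a primary road, a secondary road, a direction, and an offset might be turned into a coordinate. It assumes the intersection of the two roads has already been looked up in a base map. The field names, direction codes, and the `offset_from_intersection` helper are hypothetical and are not taken from the SWITRS specification. The sketch also moves the point in a straight compass direction, whereas a production geocoder would more likely measure the offset along the road geometry.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres
FEET_TO_M = 0.3048

# Unit (east, north) steps for each compass code (codes are an assumption).
DIRECTION_VECTORS = {
    "N": (0.0, 1.0),
    "S": (0.0, -1.0),
    "E": (1.0, 0.0),
    "W": (-1.0, 0.0),
}


@dataclass
class CrashLocation:
    """Hypothetical record layout; not the actual SWITRS field names."""
    primary_road: str
    secondary_road: str
    offset_feet: float
    direction: str  # "N", "S", "E", "W", or "" when at the intersection


def offset_from_intersection(lat, lon, record):
    """Shift an intersection coordinate by the reported offset and direction.

    lat/lon are the intersection's coordinates in decimal degrees, assumed
    to come from a base-map lookup of the primary and secondary roads.
    """
    if record.offset_feet == 0 or not record.direction:
        return lat, lon  # crash reported at the intersection itself
    east, north = DIRECTION_VECTORS[record.direction.upper()]
    dist_m = record.offset_feet * FEET_TO_M
    # Small-distance approximation: convert metres to degrees.
    dlat = math.degrees(north * dist_m / EARTH_RADIUS_M)
    dlon = math.degrees(
        east * dist_m / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    )
    return lat + dlat, lon + dlon


# Illustrative values only: made-up street names and coordinates.
record = CrashLocation("MAIN ST", "1ST AVE", 100.0, "N")
print(offset_from_intersection(37.8716, -122.2727, record))
```

Even in this simplified form, the result is only as good as the officer's estimated offset and direction and the base map's intersection coordinate, which is why records that cannot be matched still fall back to manual geocoding.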

Until first responders use Global Positioning System (GPS) devices at the scene to record the location of a crash, geocoding crash locations is critical for researchers and local communities to map collision occurrences. A centralized effort to provide accurate coordinates for geocoded crashes would resolve current impediments to traffic safety research and put the State of California at the forefront of technological solutions for public health.

The California Highway Patrol (CHP) is investigating possibilities for the automatic inclusion of GPS collision locations in the SWITRS data. The data can be geocoded in two ways. The first approach is to equip all first responders with GPS units and require them to report the GPS location on each collision report form. The second approach is for the CHP to use available location information to estimate the GPS coordinates of each collision. The former approach requires a one-time but tremendous overhaul of the collision reporting process; the latter approach requires the CHP to commit to a yearly effort to geocode all of the data using special software programs. In the current state of technology, no software program can produce 100% accurate estimates, and software is most often unable to produce any estimate for a significant fraction of the collisions (15-20%). The resulting "mismatches" then require significant, often manual, attention. The CHP and other states' highway agencies are actively researching solutions to this dilemma.

Other local agencies have begun, on a piecemeal basis, to geocode data. However, since this is done in an agency-by-agency fashion, the problem of "duplicating the wheel" arises.

In addressing this problem, the TSC will work with OTS, the CHP, and local agencies to accomplish our project goal of developing recommendations for how to accomplish state-level geocoding of SWITRS data most efficiently.

Funding for this program was provided by a grant from the California Office of Traffic Safety, through the Business, Transportation and Housing Agency.