Friday, November 21, 2014

Geocoding crucial

The Director's Desk readership includes a lot of analysts and GIS aficionados. I'm going to geek out in this post, so unless you are among them, consider yourself forewarned.

Public safety analysts and technicians mine data from dispatch records, incident reports, and other databases in order to work with these data in a GIS framework. The dots don't appear on the maps magically, though. The geocoding process uses software algorithms to convert the text description of an address into a point on a map.
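For the coders in the audience, here is a minimal sketch of one common approach, address-range interpolation against a street centerline file. This is an illustration under my own assumptions, not a description of how any particular product does it: the `Segment` layout, the `geocode` function, and the sample street are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One block of a street in a hypothetical reference file."""
    street: str    # normalized street name, e.g. "MAIN ST"
    low: int       # lowest house number on the block
    high: int      # highest house number on the block
    start: tuple   # (x, y) at the low-number end
    end: tuple     # (x, y) at the high-number end


def geocode(number, street, segments):
    """Return an (x, y) point for the address, or None if nothing matches."""
    # Normalize case and whitespace so "main  st" matches "MAIN ST".
    name = " ".join(street.upper().split())
    for seg in segments:
        if seg.street == name and seg.low <= number <= seg.high:
            span = seg.high - seg.low
            # Fraction of the way along the block; midpoint if the range is a single number.
            t = (number - seg.low) / span if span else 0.5
            x = seg.start[0] + t * (seg.end[0] - seg.start[0])
            y = seg.start[1] + t * (seg.end[1] - seg.start[1])
            return (x, y)
    return None  # unmatched: this is the record you want to look at later


# Made-up reference data: one block running 100 units east.
segments = [Segment("MAIN ST", 100, 200, (0.0, 0.0), (100.0, 0.0))]
point = geocode(150, "main  st", segments)
```

A real geocoder also deals with odd/even sides of the street, directional prefixes, aliases, and address points, which is exactly where the reference data earns its keep.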

Geocoding is both art and science, and accuracy is important. If large numbers of events will not geocode, or geocode improperly, the validity of any analysis is compromised. It doesn't take much to throw things off, either, because geocoding errors are often not random. Rather, they tend to be systematic: the same address gets missed over and over, or a tiny error in a street reference file results in the same address getting placed on the wrong side of a census tract boundary, every single time.

Because of this, the accuracy of geocoding should be a top concern for those of us who manage GIS applications. The key is to understand what isn't geocoding properly, and to systematically correct as much of that as possible. You may not be able to prevent the occasional fat-fingered entry where someone inserted an extra zero in an address field, but if you can never properly geocode the street address of a local high school, you've got to figure out why and correct that.

Here in Lincoln, we're geocoding a few hundred thousand police and fire incidents and dispatches annually. I watch the unmatched records closely, in order to monitor any consistent geocoding problems. So I was pretty pleased to see this geocoding history report for recent fire dispatches yesterday morning:


Hard to top that, in almost a thousand records that are updated twice daily. That's the Omega Group's Import Wizard software pictured in this screenshot, which manages the data import and geocoding from both police and fire records systems in Lincoln, in order to populate the CrimeView and FireView applications.

My advice to analysts is not to be complacent even if you have a high hit rate. Keep an eye on your unmatched records, find the repeats, figure out why, and fix the problem whenever possible. 
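If you can export your geocoding results, finding the repeats takes only a few lines. The sketch below assumes a simple list of `(address, matched)` pairs, which is my own hypothetical layout rather than any product's export format, and the addresses are invented.

```python
from collections import Counter


def match_rate(records):
    """Fraction of records that geocoded; records is a list of (address, matched) pairs."""
    if not records:
        return 0.0
    return sum(1 for _, matched in records if matched) / len(records)


def repeat_misses(records, min_count=2):
    """Unmatched addresses that show up at least min_count times, most frequent first."""
    misses = Counter(
        " ".join(address.upper().split())  # fold case and extra spaces together
        for address, matched in records
        if not matched
    )
    return [(addr, n) for addr, n in misses.most_common() if n >= min_count]


# Invented sample data for illustration only.
records = [
    ("100 Main St", True),
    ("500 School Rd", False),
    ("500 school  rd", False),
    ("10000 Main St", False),  # the one-off fat-fingered entry
]
rate = match_rate(records)
repeats = repeat_misses(records)
```

The one-off typo drops out of the list, while the address that fails every time floats to the top. That is the one worth chasing down in the reference data.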

3 comments:

Anonymous said...

So who do the kudos go to? Records employees, Officers entering the reports, or is there someone else?

Tom Casady said...

9:32,

It is the result of a well-designed and closely monitored process. It begins with consistent address entry by employees, followed by routines created by IT staff to clean address data prior to export, and good tools in Import Wizard to further refine data.

In addition, the quality of Lincoln's underlying enterprise GIS is crucial. The reference data used in geocoding (streets, parcels, and address points) are top-notch. These data files are managed by the Planning Department, Building & Safety Department, and County Assessor's Office.

There are a lot of moving parts, many people and organizations, and many years involved in getting these great results.

9:32 said...

That's awesome. Usually with that many different sources working together, you get way less than 100% results. Says a lot about the employees who work for the City of Lincoln.