Monday, July 28, 2008

Not quite what it seems

You would think a website with the URL "criminalsearches.com" would do what the name implies. Not exactly. Lincoln Journal Star reporter Micah Mertes contacted me last week to answer some questions about the proliferation of various crime mapping web sites, including our own CrimeView Web, for an article published on Saturday. One of the sites he asked about was criminalsearches.com.

There are a growing number of web sites that gather public-record police, court, or prosecutorial data, then geocode it and display the results on maps. Microsoft Virtual Earth or the Google Maps application programming interface are the usual tools for these sites. While I was on the phone with Mr. Mertes, I was looking at criminalsearches.com, and noted that it's using the Google Maps API.

While Google Maps/Google Earth and Microsoft's Virtual Earth are pretty remarkable applications, mapping addresses is never perfect. Geocoding tabular address data is inherently inaccurate: it's an estimate of the location of the actual phenomenon, usually interpolating a real-world address along a street segment. Some geocoding is better; some is worse, depending on the quality of both the reference data and the data to be mapped, and the specific techniques employed. Here's an example:


The red marker is the address of our Northeast Team police substation at 4843 Huntington Avenue, geocoded by Google Maps. The blue marker is the actual location, right at the front door. I pointed out this relatively minor problem with mass-geocoded data to the reporter, but there are other problems with these sites that are potentially more significant.
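To make the interpolation step concrete, here is a minimal sketch in Python of how a geocoder might estimate a point from a house number and a street segment's address range. It is an illustration under stated assumptions, not how Google Maps or any particular product works: the StreetSegment class, the interpolate_address function, the 4800-4900 address range, and the flat x/y coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float]  # (x, y) in an arbitrary flat coordinate system


@dataclass
class StreetSegment:
    """One block of a street in the reference data (hypothetical structure)."""
    low_number: int   # lowest house number assigned to the block
    high_number: int  # highest house number assigned to the block
    start: Point      # end of the block where low_number would sit
    end: Point        # end of the block where high_number would sit


def interpolate_address(house_number: int, seg: StreetSegment) -> Point:
    """Estimate a location by sliding proportionally along the segment."""
    if not seg.low_number <= house_number <= seg.high_number:
        raise ValueError("house number is outside this segment's range")
    span = seg.high_number - seg.low_number
    # A block with a single number has no range; fall back to its midpoint.
    fraction = 0.5 if span == 0 else (house_number - seg.low_number) / span
    x = seg.start[0] + fraction * (seg.end[0] - seg.start[0])
    y = seg.start[1] + fraction * (seg.end[1] - seg.start[1])
    return (x, y)


# Made-up block: numbers 4800-4900 spread over a straight 100-unit segment.
block = StreetSegment(4800, 4900, (0.0, 0.0), (100.0, 0.0))
print(interpolate_address(4843, block))
```

The estimate lands on the street centerline, a proportional distance down the block, which is why an interpolated marker can sit some distance from a building's actual front door.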

One of those problems is missing data. One of the other sites the newspaper article mentions is the DEA's national clandestine lab register. One would think that this would map the location of all, or at least most, of the known meth labs. If you drill down to Lancaster County, Nebraska, you will see that Lincoln has only five meth labs listed, with dates from 2004 through 2007. In reality, we investigated 34 meth labs during those years. It's not that the data on this website is wrong, just that it is incomplete.

Another problem is that some sites may give a false impression. Back to criminalsearches.com: after clicking the "neighborhood watch" link, I ran a few addresses, one of which was my home during high school, at about 56th and Vine. Here is the resulting map:


A couple of those icons are Nebraska registered sex offenders. Criminalsearches.com is essentially mapping the public record data already available on the Nebraska State Patrol's public sex offender registry web site. The others, though, are various records from States like Missouri, Oklahoma, Oregon, and North Carolina. Apparently these States have public record court or prosecutorial data that is readily available.

The "criminals" searched near 56th and Vine include a 2004 ticket for an illegal U-turn in Oklahoma, a 1996 ticket for speeding in North Carolina, and a 2001 ticket for no seat belt in Oregon. Moreover (except for the Nebraska sex offenders), there is little chance that any of these people still live at the address that was on a ten-year-old traffic ticket issued in another State. You may notice that two of these people appear to live in Bethany Park. The addresses on their records are actually quite a ways down the street to the west, in apartments across Cotner Boulevard. Looking at another area of Lincoln, I found an icon on S. 20th Street. It was denoting the address of a person who received a ticket for no operator's license in North Carolina back in 1996. The ticket was dismissed by the prosecutor. The person who received the ticket is in my database, too: he died in 2005.

I believe that when it comes to public record information about sex offenders, felons under correctional control, and the locations of crime, public agencies should not attempt to shield citizens from the uncomfortable knowledge that stuff happens, and that there are people in our community with some pretty problematic behaviors. We may have been oddly happier when we weren't as informed about these facts, but the truth is the truth, and when we hold such public record information, I think we are obligated to let it rip. But I also think we are obligated to take reasonable steps to ensure that it is as accurate as we can make it, and to provide the context that helps citizens understand these data--such as this dialog.

To their credit, if you read the fine print, these sites have extensive disclaimers that describe some of these shortcomings. Just don't take it all at face value.

8 comments:

Anonymous said...

Thanks to the "criminalsearches" folks, I have now discovered a proliferation of hardened criminals in my neighborhood. Let's see: two speeders (one of them 94 in a 75 zone in Pittsburgh, PA!) and a 27 y/o who apparently didn't use her seatbelts back when she was 17. OMG, what is this world coming to?

Well, it's a free site...proving the old saw about "you get what you pay for"

Anonymous said...

Remember, most young "journalists" don't know ...spit. I remember being his age and an undergrad; I knew so much more than I ever had before, and I erroneously believed that I was learned, worldly, and ever so clever. This isn't the first time you've had to use your blog to effectively complete that same scribbler's articles, is it?

Tom Casady said...

8:46

I have no complaint about the article--I just thought there was even more to the story once I dug into the site a little deeper.

One of the nice things about running your own blog is that you can delve into subjects in more depth or from a different perspective than the news media--and sometimes into subjects that just don't work well in eight column inches or a 90 second spot.

Anonymous said...

As you point out, Chief, we may have been "oddly happier" when we didn't know about such things. The 24-hour news beat, proliferation of competing news media, and the easy availability of public records that used to be somewhat difficult to get has changed things--and not always for the better.

Anonymous said...

How does crimereports.com compare?

Crazy Cat Lady said...

After browsing around the site, I came to the same conclusions. If someone wants to dig up old traffic information or get the "dirt" on someone's teenage misdeeds, this is the site for them. It shouldn't be relied upon for knowing whether or not you have "safe" neighbors.

Tom Casady said...

just me-

Crimereports.com is a good example of using the Google Maps API to produce a neat application that would have been incredibly difficult just a few years ago. Its owner, Public Engines, Inc. basically obtains tabular crime data from police departments, then geocodes and displays that data with Google Maps.

Aside from the limitations of the geocoding, discussed in the main post here, it's as good and as current as the data provided by the agencies that have signed up. In some cases, crimereports.com is obtaining data by screen-scraping tabular crime data from police websites, rather than from a direct feed.

Crimemapping.com, although it appears similar in many respects, is using a different methodology. Its owner, the Omega Group, is actually obtaining more carefully geocoded crime data from agencies--not just the tabular data--and then displaying that against a background of Google Maps. The crime points themselves, though, were geocoded by the participating agencies, and should be more accurate.

Anonymous said...

Your link to LPD's crime view web didn't work for mapping. It won't let you search because it skips the Terms and Conditions agreement.

http://ims.lincoln.ne.gov/CrimeViewCommunity/

Above is a better link to it.