Posted in March 2012

on Geocoding – (Part 2 Building an Evaluation DataSet)

In essence, an ideal address geocoder will parse, normalize, and geocode an address that may arrive in a very flexible format. The normalized and geocoded result should follow some standard (the USPS standard for address normalization is pretty well developed and, I think, is the de facto standard for address geocoding). The expected output of geocoding should include not only latitude and longitude but also precision. There could be some additional spatial information, such as the county or MSA the address belongs to, that might be useful for certain applications. The geocoder might also fill in a missing ZIP code or fix misspelled street, city, or state names, depending on how lenient/tolerant you want it to be.
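To make the expected output concrete, here is a minimal sketch of what a single geocoding result record might look like. Everything in it is an assumption for illustration: the names GeocodeResult, Precision, and has_valid_coordinates, the four precision levels, and the sample values are hypothetical and not taken from any particular geocoder.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Precision(Enum):
    """Hypothetical precision levels; real geocoders define their own."""
    ROOFTOP = "rooftop"
    STREET = "street"
    ZIP = "zip"
    CITY = "city"


@dataclass
class GeocodeResult:
    """One geocoded address: normalized form, coordinates, and precision."""
    normalized_address: str          # e.g. USPS-style normalized string
    lat: float
    lon: float
    precision: Precision
    county: Optional[str] = None     # optional extra spatial information
    msa: Optional[str] = None
    corrections: List[str] = field(default_factory=list)  # fixes applied to the input


def has_valid_coordinates(result: GeocodeResult) -> bool:
    """Basic sanity check that lat/lon fall inside their legal ranges."""
    return -90.0 <= result.lat <= 90.0 and -180.0 <= result.lon <= 180.0


# Illustrative values only, not the output of a real geocoder.
example = GeocodeResult(
    normalized_address="1600 PENNSYLVANIA AVE NW WASHINGTON DC 20500",
    lat=38.8977,
    lon=-77.0365,
    precision=Precision.ROOFTOP,
    corrections=["filled missing ZIP code"],
)
```

Keeping precision and the list of corrections on the same record as the coordinates means an evaluation can later check not just where the geocoder put the point, but how confident it claimed to be and what it changed in the input to get there.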

 

Of all these complicated software development demands and goals, where do you start?

 

You start with building an address list
