When dealing with Strings…

Strings are general, and strings are tricky. Regular expressions are a useful tool for string manipulation, but they also create problems of their own. Writing your own rules requires extreme caution. Below are some notes I gathered from dealing with strings (specifically people data: names, addresses, etc.) at large scale.

1. Have you considered punctuation?

Apostrophes exist in people’s last names, street names, and many other proper names. So do dashes and forward and backward slashes. Does your RegEx match these? Should your output normalize them?
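As a rough illustration of both questions, here is a minimal Python sketch. The pattern `NAME_TOKEN`, the mapping `PUNCT_MAP`, and both helper functions are hypothetical names made up for this example, and the policy of folding typographic apostrophes and dashes into their ASCII forms is an assumption, not a rule that fits every dataset.

```python
import re

# Hypothetical pattern: runs of ASCII letters that may be joined internally
# by an apostrophe, hyphen, forward slash, or backslash.
NAME_TOKEN = re.compile(r"[A-Za-z]+(?:['\-/\\][A-Za-z]+)*")

# Assumed normalization policy: fold typographic variants into ASCII forms.
PUNCT_MAP = str.maketrans({
    "\u2019": "'",  # right single quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
})


def normalize_punct(text):
    """Replace typographic apostrophes and dashes with ASCII equivalents."""
    return text.translate(PUNCT_MAP)


def name_tokens(text):
    """Return letter-based tokens, keeping internal punctuation intact."""
    return NAME_TOKEN.findall(normalize_punct(text))


print(name_tokens("O\u2019Brien-Smith lives at 12 1/2 King's Rd"))
```

Normalizing before matching keeps the pattern small. Note that this sketch only handles ASCII letters, so accented names would need a wider character class.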

The iPhone is broken

First of all, I am not an Apple hater. I’ve used an iPhone for a short period of time before, and I’ve done work on a MacBook Pro. I understand Apple’s different design philosophy, and I thought I could live with it when I recently renewed my 2-year contract with AT&T. However, in the short time I have had the device, I have discovered that some of the iPhone’s functional design flaws outweigh its great aesthetic value. Today is my 3rd day with the iPhone 4S. (Before that I was using a Samsung Captivate with a customized ROM and kernel. Yes, I am aware that I can jailbreak my iPhone, but a jailbreak for iOS 5.1.1 has not been released yet.)

1. It automatically drops Wi-Fi when locked (to save power).


How I maxed out my ThinkPad W510

I’ve always been a loyal ThinkPad user, since the good old IBM days. Over the last seven years I’ve used classic models like the T61, T23, and R51 as well as modern ones like the X220, T420, and W510. It always amazes me how durable they are and how easy it is to disassemble and reassemble the whole machine. The current configuration of my main machine is:

W510:

CPU: Intel Core i7 Q820

RAM: 16GB

Boot HD: Patriot Pyro SE 120 GB SSD

Second HD: 1TB HD (more on how to do this)

Video: NVIDIA Quadro FX 880M

Display: FHD (1920x1080)

 

By today’s standards, it is still quite a beefy machine (I did most of the reconfiguration last year, in 2011). The CPU, video card, and display remain unchanged from Lenovo’s stock configuration. I’ll illustrate adding the RAM and the second HD below. With a little hacking and research before purchase, a machine with the above configuration can be had for under $1500.


On Geocoding (Part 2 – Building an Evaluation Dataset)

In essence, an ideal address geocoder will parse, normalize, and geocode an address, which can come in a very flexible format. The normalized and geocoded result should follow some standard (the USPS standard for address normalization is well developed and is, I think, the de facto standard for address geocoding). The expected output of geocoding should include not only lat and long, but also precision. There could be some additional spatial information, such as the county or MSA the address belongs to, that might be useful for certain applications. The geocoder might also fill in a missing zip code or fix a misspelled street, city, or state name, depending on how lenient/tolerant you want it to be.
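To make the expected output concrete, here is one possible shape for a geocoding result, sketched in Python. The class name `GeocodeResult`, its field names, the precision labels, and the `is_usable` helper are all assumptions made for illustration; they do not come from any particular geocoder or standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GeocodeResult:
    """Hypothetical container for the output of one geocoded address."""
    normalized_address: str        # e.g. a USPS-style normalized string
    lat: float
    lon: float
    precision: str                 # assumed labels: "rooftop", "street", "zip", "city"
    county: Optional[str] = None   # optional extra spatial information
    msa: Optional[str] = None


def is_usable(result, accepted=("rooftop", "street")):
    """Return True when the precision label is among the accepted levels."""
    return result.precision in accepted


# Made-up sample values, for illustration only.
example = GeocodeResult(
    normalized_address="123 MAIN ST, SPRINGFIELD, IL 62701",
    lat=39.80,
    lon=-89.65,
    precision="street",
)
print(is_usable(example))
```

Keeping precision as an explicit field lets downstream code decide how much to trust each point instead of treating every lat/long pair as equally reliable.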

 

Of all these complicated software development demands and goals, where do you start?

 

You start by building an address list.


On Geocoding (Part 1 – Definition)

=What is Geocoding=

More than once I have received the same response in a discussion related to geocoding: “Hold on for a sec, what is geocoding again?” To which I respond with the same definition you can find on Wikipedia: “geocoding is the procedure to translate addresses into lat/lons (wiki), along with an accuracy/precision/confidence, so that you can pin it to a map, as well as know how reliable that pinpoint is.” There are a number of web services that can make this job very easy for you, including Google, Bing, and Yahoo, and if you want to develop a geocoder yourself, a list of resources might be helpful. However, a more thorough and modern definition of geocoding requires further discussion. In my opinion, geocoding has been too narrowly defined as pinning addresses. People’s reaction to “geocoding”, once it is related to “putting addresses on a map”, tends to be, “Oh, that’s useful; shouldn’t that be a simple task?” There are at least the following 3 points worth noting when talking about geocoding:
