Building a Voter File: Matching Lists

In my efforts to duplicate the straightforward plots of David Foster Wallace, I'm going to circle back yet again to one of my earlier posts.  There's more below the fold...

When I wrote about the process of appending outside consumer data to state or local voter files, I cautioned strongly about attempting to match multiple data sources:

Aside from gathering all this information into one place, these companies also have the expertise to match it to any data you might already possess (you do not have this expertise).  Matching records is tough--people move, get married, die, have children with the same name as them, and generally make life hard.  Gathering, collating and coordinating all this information is a massive enterprise

I'll mostly stand by this, but nevertheless there are going to be times when you will want to match two disparate sources of information.  I'm going to cover just one today: matching older and newer versions of the same file.  Incidentally, I've been stressing the importance of consistent IDs throughout this series--this should explain why.

Let's start with a problem that's all too common--two files for the same area, from the same source, that lack a consistent and unique identifier.  Maybe the state switched the way they generate the IDs.  Maybe they have different IDs for county and statewide, and only included one.  Maybe they don't like disclosing their voter IDs at all.  Point is, you have two versions of the same file, from the same source, without a consistent ID across the files.

It is still possible, however, to establish a unique and consistent mapping for both files.  Let's go back to our example of Adams county.  The most recent version of the file contains 17,795 records.  Let's say that this file is replacing an older file, with 16,950 records, which we had attached various bits of information to--contact information from a field program, better phone numbers, or what-have-you.  We don't want to lose the work we've put into the old file, but we still need to replace it. 

If we have an ID that's consistent across both files, it's a snap.  We can use a simple SQL operation called a join, which matches two tables based on any column(s) we specify that they have in common.  So, for instance, let's say our old table was called adams_031808, and that in addition to the columns from the most recent file, it contains a column called voteintent, based on our canvassing efforts.  The first thing to do is add this column to our most recent table.  This can be accomplished fairly simply--alter table adams_111708 add voteintent (varchar(5));.  Now that we have a target for the data we'll be bringing over, let's see how many of the people in our old file are in the new one.  To do this, simply run the following: select count(*) from adams_081808 inner join adams_111708 using (voterid);.

This should spit out the number of values in the voterid column that appear in both tables.  If everyone from the old file is still on the new one, the result of that query should be exactly 16,950; however, we can expect it to be a little lower, since people will have moved, dropped off the rolls, or otherwise become unavailable.  Assuming that number doesn't set off any alarm bells, it's easy to bring over the voterintent column: update adams_111708 a set a.voterintent=b.voterintent from adams_031808 b where b.voterid=a.voterid;.  The "a" and "b" after the tablenames are aliases, there for convenience, but the gist should be clear--set the new voterintent equal to the old voterintent where the IDs are the same.

What happens if you don't have a consistent ID? This gets a little trickier, but it does use roughly the same process.  First of all, make sure there's one column that has the same name across each table, which will end up containing the consistent ID--this makes things more convenient and easier to remember.  We'll use voterid, and we'll assume that adams_111708 has one and adams_031808 has no such column.  Create it for the older table.  Now, we have to match the files--how do we do that?

This, admittedly, is not an exact process.  Basically, we have to ask ourselves: what information has to be the same for me to assume that someone is the same person? Name is not good enough, especially if you're dealing with large datasets.  But on the other hand, if somebody has the same first name, last name, date of birth, street address, town and zipcode as someone on the old file, we can probably surmise that they're in fact the same individual (note that the date of birth is doing a lot of work here--without it, we might have a problem with Joe Schmo Jr. vs. Sr.).  Let's do a check on these criteria for matching: select count(*) from adams_031808 a inner join adams_111708 b on (a.fname=b.fname and a.lname=b.lname and a.birthyear=b.birthyear and a.regaddr=b.regaddr and a.regcity=b.regcity and a.rzip=b.rzip).

Right up front, we can see there will be problems with these criteria.  For starters, we lose anyone who's moved automatically.  Secondly, a typo, alternate spelling or bad standardization in any one of those fields would throw off the whole match.  In short, this is not going to catch everyone from the old file who appears on the new.

On the other hand, we don't want to set too loose a criteria either.  After all, a false match (like the Jr./Sr. confusion, or just someone who happens to have the same name) will also mess up your file.  It's not as big an issue for a smaller jurisdiction like this, but it can and will happen in larger files (e.g. states). 

So how do you strike a balance? Generally, for a statewide file, you'll want to match at least 90% of the records in the old file.  But sometimes this won't be possible, depending on file quality.  Get the best match you can, while at a minimum matching on first and last name and zipcode.  This will miss movers, yes, but it will catch most everyone else while avoiding a lot of the problems with address standardization that can plague this sort of match.  I'll be discussing those problems and others in an upcoming post, so stay tuned.

ILrFsijKlJvnZiP

m77nUl sizwoxjopkgp, [url=http://hwsulqpouzwp.com/]hwsulqpouzwp[/url], [link=http://tjbukymaqpma.com/]tjbukymaqpma[/link], http://rhpnakicpikh.com/

` Women in summer would like

`

Women in summer would like to become beautiful. Everything can grab other's eyes is their best friends.Products make them beauty and confident is their favourite. Look in the street,you can see many different types of make up to show women's personality.

Welcome to the shop, the following is our products, free shipping.

Soccer Shoes Cheap Soccer Shoes Nike Soccer Shoes Adidas Soccer Shoes Nike Soccer Shoes sale Adidas Soccer Shoes sale UGG UGGs UGG Boot UGG Boots UGG Boots Sale Cheap UGG Boots UGG Boots Cheap Women UGG boots ugg boots cardy ugg cardy boots Timberland Timberland sale Timberland boots Timberland boots online Timberland on sale New timberland boots UGG UGG boots UGG boots sale UGG boots short Short ugg Short ugg boots Ugg boots tall Nike Air Nike Air Max Nike Air Max Shoes Nike SB Nike Dunk Nike Dunk SB Nike Dunk SB Shoes Nike Shox Nike Shox Shoes Women Bags Women Bags Sale Women Handbags Women Handbags Sale Women New Bags Cheap Bags Cheap Bags On Sale New women bags New women bags sale New women bags sale online Louis Vuitton Handbags Gucci bags Nike Nike Shoes Nike Shoes Sale Nike running Nike running shoes Nike trainers Nike trainers shoes Timberland Timberland boots Timberland boots sale Timberland boot Timberland boot sale Timberland boots cheap Men timberlands MBT MBT Shoes MBT Chapa GTX MBT Men Shoes MBT Women Shoes Discount MBT Shoes LV Handbags Gucci Handbags Chanel Handbags Chloe Handbags D&G Handbags Dior Handbags Fendi Handbags Hermes Handbags Jimmy Choo Bags Marc Jacobs Bags Miu Miu Handbags Mulberry Bags Prada Handbags Versace Handbags Yves Saint Laurent Balenciaga Bags Burberry Handbags LV Handbags Gucci Handbags Chanel Handbags Chloe Handbags D&G Handbags Dior Handbags Fendi Handbags Hermes Handbags Jimmy Choo Bags Marc Jacobs Bags Miu Miu Handbags Mulberry Bags Prada Handbags Versace Handbags Yves Saint Laurent Balenciaga Bags Burberry Handbags

Those who want to become most beautiful in the world should try them. Just ones can make you different. Girls who want to grab your boyfriends's heart is necessary to use them.

i have to use some

i have to use some essayshelp to create smth like this!

teeth whitening

HzOIPZjkJYo

yf4NLD iluoctnzshxr, [url=http://dxtamaluuolc.com/]dxtamaluuolc[/url], [link=http://bvttixvmdwsa.com/]bvttixvmdwsa[/link], http://xpqmgpouiare.com/