DWITE Online Computer Programming Contest

Anonymous Shopping

November 2011
Problem 3

A serious battle in the Pepsi vs. Coke debate is breaking out in your school. Being a fan of a healthy fruit juice beverage instead, you would prefer not to get involved. Careful to keep your beverage preferences hidden, you’ve always paid in cash at the store, but eventually the loyalty card program tempted you with a delicious store discount in exchange for just your date of birth, gender, and area code (no big deal, right?). It then dawned on you that the same information can easily be found in school records (and on your Facebook profile), linking it back to your name. How likely is it that you’ll be identified?

The input file DATA3.txt will contain 5 test cases. Each test case starts with two integers, 1 ≤ N, M ≤ 100, separated by a space. This is followed by N entries of store records, and then M entries of school records.

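Store records are in the form of: store_id YYYYMMDDGZIP drink, where YYYYMMDDGZIP is the member’s date of birth, gender, and area code written together as one word, and drink is a single word describing the drink preference determined for the associated loyalty card member.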

School records are in the form of: name YYYYMMDDGZIP. The name can be treated as a single word.
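The record layout above can be read with plain whitespace splitting. The following is a minimal Python sketch, not a reference solution; the helper names read_test_case and split_key are hypothetical, and split_key assumes an 8-digit date followed by a single gender character, with the rest of the word being the area code (as in the sample data).

```python
def read_test_case(lines):
    """Consume one test case from an iterator of input lines.

    Returns (store_records, school_records), where each store record is
    (store_id, key, drink) and each school record is (name, key).
    """
    n, m = map(int, next(lines).split())
    store_records = [tuple(next(lines).split()) for _ in range(n)]
    school_records = [tuple(next(lines).split()) for _ in range(m)]
    return store_records, school_records


def split_key(key):
    """Split a YYYYMMDDGZIP word into (date, gender, area_code).

    Assumption: 8 date digits, then 1 gender character, then the area code.
    """
    return key[:8], key[8], key[9:]
```

For this problem the key never needs to be split; comparing the whole YYYYMMDDGZIP word as a string is enough to match store records against school records.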

The output file OUT3.txt will contain 5 lines of output, each being the number of people, out of the set of M, whose drink preference can be deduced from the information available. It can be assumed that every match belongs to some entry from the list M (i.e. there is only a single school in the area). It can also be assumed that names are unique.

Sample input notes: In the first sample, both Dan and Tony are uniquely identified, while AJ and Cyril don’t have matching records, so the answer is 2. In the second case, Dan matches two different records (there is someone else with the same DOB, gender, and area code), but since both records are “Coke”, the drink preference can be deduced. Tony also matches two records, but his preference can’t be figured out based on the available data. So the answer is just 1.
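The reasoning in these notes can be sketched in Python as follows. This is one possible reading of the rule, not an official solution: it assumes a person's preference is deducible exactly when at least one store record shares their YYYYMMDDGZIP word and all such records name the same drink. The function name count_deducible is hypothetical, and the sketch does not treat school entries that share a key with each other in any special way.

```python
from collections import defaultdict


def count_deducible(store_records, school_records):
    """Count school entries whose drink can be deduced.

    store_records: iterable of (store_id, key, drink)
    school_records: iterable of (name, key)
    """
    # Collect the set of distinct drinks seen for each YYYYMMDDGZIP key.
    drinks_by_key = defaultdict(set)
    for _store_id, key, drink in store_records:
        drinks_by_key[key].add(drink)

    # Deducible means: the key matched, and every matching record agrees.
    return sum(
        1
        for _name, key in school_records
        if len(drinks_by_key.get(key, ())) == 1
    )


# The two sample cases from the problem statement.
sample_1 = count_deducible(
    [("1", "19860101M123", "Coke"), ("2", "19860202M123", "Pepsi")],
    [("Dan", "19860101M123"), ("Tony", "19860202M123"),
     ("AJ", "19920303M123"), ("Cyril", "19930404M123")],
)
sample_2 = count_deducible(
    [("1", "19860101M123", "Coke"), ("2", "19860101M123", "Coke"),
     ("3", "19860202M123", "Coke"), ("4", "19860202M123", "Pepsi")],
    [("Dan", "19860101M123"), ("Tony", "19860202M123")],
)
print(sample_1)
print(sample_2)
```

Storing drinks in a set per key means duplicate records with the same drink collapse to one element, which is what lets Dan count in the second sample while Tony does not.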

For those in the USA, the date of birth, gender, and zip code uniquely identify 87% of the population. https://www.eff.org/deeplinks/2009/09/what-information-personally-identifiable

Sample Input (first 2 shown):
2 4
1 19860101M123 Coke
2 19860202M123 Pepsi
Dan 19860101M123
Tony 19860202M123
AJ 19920303M123
Cyril 19930404M123
4 2
1 19860101M123 Coke
2 19860101M123 Coke
3 19860202M123 Coke
4 19860202M123 Pepsi
Dan 19860101M123
Tony 19860202M123
Sample Output:
2
1