A crossover operator for the k-anonymity problem

The recent dissemination of personal data has created an important optimization problem: what is the minimal transformation of a dataset needed to guarantee the anonymity of the underlying individuals? One natural representation for this problem is a bit-string, which makes a genetic algorithm a logical choice for optimization. Unfortunately, under certain realistic conditions, not all bit combinations represent valid solutions, so in many instances useful solutions are sparse in the search space. We implement a new crossover operator that preserves valid solutions under this representation. Our results show that this reproductive strategy is more efficient, effective, and robust than those used in previous work. We also investigate how population size and uniqueness affect the performance of genetic search on this application.