Abstract:
Classifiers are machine learning algorithms that learn from existing data to make discrete predictions about new data, such as whether to hire a job applicant or whether to grant credit. The training data may contain patterns such as a higher rate of positive outcomes for members of some groups (e.g., racial groups) and a lower rate for others. Such bias is quantified by the "80% rule" of disparate impact, a legal measure under which the rate of positive outcomes for a protected group must be at least 80% of the rate for the most favored group. It is ethically and legally undesirable for a classifier to learn these biases from the data. We propose two methods of modifying data, called Combinatorial repair and Geometric repair, and evaluate them on three data sets. Experiments show that our repairs perform favorably in training classifiers that are both accurate and unbiased.
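For concreteness, the 80% rule reduces to a simple ratio test on positive-outcome rates. The sketch below is illustrative only; the function name, signature, and example data are assumptions, not the paper's implementation.

```python
import numpy as np

def disparate_impact(outcomes, groups, protected):
    """Ratio of positive-outcome rates: protected group vs. everyone else.

    The "80% rule" flags disparate impact when this ratio falls below 0.8.
    This is a hypothetical helper for illustration, not the paper's code.
    """
    outcomes = np.asarray(outcomes)
    groups = np.asarray(groups)
    rate_protected = outcomes[groups == protected].mean()
    rate_other = outcomes[groups != protected].mean()
    return rate_protected / rate_other

# Example: 3 of 10 protected applicants hired vs. 6 of 10 others.
outcomes = np.array([1]*3 + [0]*7 + [1]*6 + [0]*4)
groups = np.array(["A"]*10 + ["B"]*10)
ratio = disparate_impact(outcomes, groups, protected="A")
print(ratio)  # 0.5 < 0.8, so this outcome fails the 80% rule
```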