Dark Reactions: Recommender Guided Materials Discovery
Haverford College. Department of Computer Science
We present an exploration of data mining and machine learning techniques applied to a materials science dataset, with the goal of improving a lab's efficiency when running experiments. The primary product of our work is two tools to help chemists better explore the space of possible reactions: a recommender system, which we hope will increase the serendipitous discovery of interesting reactions that the chemists would not have thought to explore; and a seed-based ranking system, which helps chemists prioritize which reactions to run, and with what parameters. We present a number of techniques for tuning our recommender system, as well as an automated approach to evaluating recommender systems in contexts where labels are expensive to obtain (in time, resources, and equipment). Reactions are given a label in {1, 2, 3, 4}, where 4 corresponds to successful formation of a crystalline product, 3 corresponds to mostly successful formation of a crystalline product, and 1 and 2 correspond to different failure cases. Using an SVM, we achieve 65% accuracy on the 4-category classification task on a held-out test set comprising 30% of our data set. Preliminary empirical results suggest a significant improvement in efficiency: the observed rate of reactions labeled 3 or 4 increased from 65% (n=5486) without our system to 86% (n=190) using recommendations from our system. Our system is available at http://darkreactions.haverford.edu/.
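The efficiency comparison reported in the abstract can be sanity-checked with a simple two-proportion z-test. The sketch below is illustrative only and is not the evaluation method used in the work. The success counts are assumptions reconstructed from the rounded percentages (65% of 5486 and 86% of 190), the function name `two_proportion_z` is hypothetical, and the test treats the two groups as independent random samples, which the abstract does not establish.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """One-sided two-proportion z-test for H1: rate_b > rate_a."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled success rate under the null hypothesis of equal rates.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Sample sizes come from the abstract; the counts are ASSUMED, since
# only rounded percentages are reported.
n_baseline, n_recommended = 5486, 190
hits_baseline = round(0.65 * n_baseline)        # reactions labeled 3 or 4
hits_recommended = round(0.86 * n_recommended)  # reactions labeled 3 or 4

z, p_value = two_proportion_z(
    hits_baseline, n_baseline, hits_recommended, n_recommended
)
print(f"z = {z:.2f}, one-sided p = {p_value:.2e}")
```

Because the recommended reactions were selected by the system rather than sampled at random, a result from this test is at most a rough indication, consistent with the abstract's description of the results as preliminary.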