Overview



Tony Mu, Sahil Naikwadi, Harsh Patel, and Anmol Das

Positive-Unlabeled Learned Binary Classifier

Our goal for this project is to use the methods described in "Learning classifiers from only positive and unlabeled data" to learn a binary classifier from positive and unlabeled (PU) data. We will implement this in Python with the scikit-learn library, using a two-label dataset from the UCI Machine Learning Repository. A two-label dataset serves two purposes. First, we can manually create a PU dataset from it by mixing the negative samples with a portion of the positive samples to form the unlabeled set. Second, the original, fully labeled data lets us train traditional classifiers and compare their performance against the PU-learned classifier. We aim to build a PU-learned classifier whose performance is reasonably close to that of a traditional classifier, measured by precision and recall. We will consider the approach validated if our PU-learned binary classifier performs close to a traditionally trained binary classifier.
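The plan above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not our final implementation: the synthetic data from `make_classification` stands in for the UCI dataset we have not yet chosen, logistic regression is an arbitrary choice of base classifier, the 30% labeling fraction is arbitrary, and the helper names (`make_pu_labels`, `fit_pu_classifier`, `predict_pu`, `run_comparison`) are hypothetical. It follows our reading of the paper's approach: train a classifier g(x) to separate labeled from unlabeled samples, estimate the constant c = P(labeled | positive) as the average of g over held-out labeled positives, and then use g(x)/c as the estimate of P(positive | x).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split


def make_pu_labels(y, label_frac, rng):
    """Return s with s=1 for a random fraction of the positives, s=0 elsewhere.

    Samples with s=0 form the unlabeled set: all negatives plus the
    positives whose labels were hidden.
    """
    s = np.zeros_like(y)
    pos_idx = np.flatnonzero(y == 1)
    n_labeled = int(label_frac * len(pos_idx))
    s[rng.choice(pos_idx, size=n_labeled, replace=False)] = 1
    return s


def fit_pu_classifier(X, s, random_state=0):
    """Fit g(x) ~ P(s=1 | x) and estimate c = P(s=1 | y=1) on a held-out split."""
    X_fit, X_hold, s_fit, s_hold = train_test_split(
        X, s, test_size=0.2, stratify=s, random_state=random_state
    )
    g = LogisticRegression(max_iter=1000).fit(X_fit, s_fit)
    # Average predicted probability over held-out labeled positives.
    c = g.predict_proba(X_hold[s_hold == 1])[:, 1].mean()
    return g, c


def predict_pu(g, c, X):
    """Predict y=1 wherever the adjusted probability g(x)/c exceeds 0.5."""
    return (g.predict_proba(X)[:, 1] / c > 0.5).astype(int)


def run_comparison(seed=0):
    """Train a PU classifier and a traditional one; score both on true labels."""
    rng = np.random.default_rng(seed)
    # Assumption: synthetic two-class data in place of the UCI dataset.
    X, y = make_classification(
        n_samples=4000, n_features=10, n_informative=5,
        class_sep=1.5, random_state=seed,
    )
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # PU setting: only 30% of training positives keep their label.
    s_tr = make_pu_labels(y_tr, label_frac=0.3, rng=rng)
    g, c = fit_pu_classifier(X_tr, s_tr, random_state=seed)
    pu_pred = predict_pu(g, c, X_te)

    # Traditional setting: the same model family trained on the true labels.
    trad_pred = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict(X_te)

    return {
        "c": c,
        "pu": (precision_score(y_te, pu_pred), recall_score(y_te, pu_pred)),
        "traditional": (
            precision_score(y_te, trad_pred),
            recall_score(y_te, trad_pred),
        ),
    }


if __name__ == "__main__":
    results = run_comparison()
    print("estimated c:", results["c"])
    print("PU (precision, recall):", results["pu"])
    print("traditional (precision, recall):", results["traditional"])
```

Both classifiers are scored against the true test labels, which is only possible because the PU dataset is constructed from a fully labeled one. That is the reason for starting from a two-label dataset.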

Tony and Sahil will focus on researching and developing our Python program for PU learning, Harsh will focus on developing and maintaining our project website, and Anmol will focus on organizing and documenting our process, findings, and outcomes. We plan to meet twice a week to discuss progress and help each other as needed. Communication is key for this assignment because our group is large and we need to find meeting times that work for everyone. We divided the work based on each member's strengths, and we believe this gives a balanced split of the workload.
