Computers and humans process information in different ways. Computers are better equipped to identify patterns in large data sets, while humans are good at inferring patterns from only a few examples.
To bridge these two approaches, MIT researchers have developed a system that promotes collaboration between humans and computers to support better decision-making. The system works by crunching the data and distilling it into a simpler representation.
"In this work, we were looking at whether we could augment a machine-learning technique so that it supported people in performing recognition-primed decision-making," said Julie Shah, co-author of the paper.
Shah and her colleagues Been Kim and Cynthia Rudin have been working on an augmented version of a type of machine learning known as 'unsupervised' learning. Kim's PhD thesis formed the basis of the new research paper. Rudin is an associate professor of statistics at the MIT Sloan School of Management.
In supervised machine learning, the computer receives training data that humans have labeled. The machine then looks for correlations between the labels and the features of the data. For example, it might identify the visual features that occur most frequently in images labeled 'car.'
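The 'car' example can be made concrete with a minimal Python sketch. It is an illustration only, not the researchers' code: the training data, the feature names, and the helper functions are all hypothetical, and real image classifiers learn far richer correlations than raw counts.

```python
from collections import Counter, defaultdict

# Hypothetical labeled training data: (visual features, human-supplied label).
TRAINING = [
    ({"wheels", "windshield", "headlights"}, "car"),
    ({"wheels", "windshield", "doors"}, "car"),
    ({"wheels", "headlights", "doors"}, "car"),
    ({"wings", "beak", "feathers"}, "bird"),
    ({"wings", "feathers", "tail"}, "bird"),
]


def feature_counts_by_label(examples):
    """Count how often each feature appears under each label."""
    counts = defaultdict(Counter)
    for features, label in examples:
        counts[label].update(features)
    return counts


def most_frequent_features(examples, label, top_n=1):
    """Return the top_n features seen most often with the given label."""
    counts = feature_counts_by_label(examples)[label]
    # Sort by descending count, then by name so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


if __name__ == "__main__":
    print(most_frequent_features(TRAINING, "car"))
    print(most_frequent_features(TRAINING, "bird", top_n=2))
```

The labels do the heavy lifting here: without them, the program would have no way of knowing which images to count together.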
In unsupervised machine learning, the computer looks for commonalities in otherwise unstructured data. The result is a set of clusters whose members are related in some way, though the reasons for the relationship are not immediately apparent.
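To show what clustering without labels looks like, here is a small sketch that uses k-means, a common general-purpose clustering method. The article does not say which algorithm the researchers started from, so k-means is a stand-in chosen for brevity, and the sample points and the farthest-point initialization are assumptions made for this illustration.

```python
import math


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from the centroids chosen so far (an assumed, deterministic
    initialization used only to keep this sketch reproducible)."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids


def kmeans(points, k, max_rounds=50):
    """Group unlabeled points into k clusters; returns (labels, centroids)."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_rounds):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical unlabeled data: two loose groups of 2-D points.
    data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
    labels, centroids = kmeans(data, k=2)
    print(labels)
    print(centroids)
```

The output is only a list of cluster numbers and centroid coordinates. Nothing in it explains why the members of a cluster belong together, which is the gap the article describes.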
The researchers made two modifications to the type of algorithm commonly used in unsupervised machine learning.
The first is that data is clustered not only according to the features that data items share but also according to their similarity to a representative example, which the researchers call a 'prototype.'
The second modification creates what the researchers call a 'subspace.' Rather than simply ranking shared features by importance, the new algorithm winnows the list of features down to a representative set, and that set is the subspace.
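One way to picture clustering around a prototype is a k-medoids-style loop, in which each cluster is represented by one of its own members. The sketch below is not the researchers' algorithm, whose details the article does not give. The item names, the features, the use of Jaccard similarity, and the choice of starting prototypes are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Similarity of two feature sets: shared features / all features."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def assign_to_prototypes(items, prototypes):
    """Map each item name to the name of its most similar prototype."""
    return {
        name: max(prototypes, key=lambda p: jaccard(feats, items[p]))
        for name, feats in items.items()
    }


def pick_prototype(items, members):
    """Choose the member most similar, in total, to the rest of its cluster."""
    return max(
        sorted(members),
        key=lambda m: sum(jaccard(items[m], items[o]) for o in members),
    )


def cluster_with_prototypes(items, initial_prototypes, max_rounds=10):
    """Alternate between assigning items and re-picking prototypes."""
    prototypes = list(initial_prototypes)
    for _ in range(max_rounds):
        assignment = assign_to_prototypes(items, prototypes)
        updated = []
        for p in prototypes:
            members = [n for n, q in assignment.items() if q == p]
            updated.append(pick_prototype(items, members) if members else p)
        if updated == prototypes:
            break  # prototypes are stable
        prototypes = updated
    return assign_to_prototypes(items, prototypes), prototypes


if __name__ == "__main__":
    # Hypothetical items described by sets of features.
    ITEMS = {
        "sedan": {"wheels", "engine", "doors", "trunk"},
        "hatchback": {"wheels", "engine", "doors"},
        "pickup": {"wheels", "engine", "doors", "cargo_bed"},
        "sparrow": {"wings", "beak", "feathers"},
        "eagle": {"wings", "beak", "feathers", "talons"},
        "penguin": {"wings", "beak", "feathers", "flippers"},
    }
    assignment, prototypes = cluster_with_prototypes(ITEMS, ["sedan", "eagle"])
    print(prototypes)
    print(assignment)
```

Because each prototype is an actual data item rather than an abstract average, a person can inspect it directly, which fits the article's point about humans reasoning well from a few examples.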
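The difference between ranking features and winnowing them can be sketched in a few lines of Python. This is a simple threshold heuristic written for illustration, not the method from the paper: the data, the function names, and the cutoffs `min_inside` and `max_outside` are all hypothetical.

```python
from collections import Counter


def rank_features(cluster):
    """Baseline: list every feature in the cluster, most frequent first."""
    counts = Counter(f for item in cluster for f in item)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def select_subspace(cluster, others, min_inside=0.75, max_outside=0.25):
    """Keep only features that are common inside the cluster and rare
    outside it. Both thresholds are assumed values for this sketch."""

    def share(feature, group):
        if not group:
            return 0.0
        return sum(feature in item for item in group) / len(group)

    features = set().union(*cluster)
    return {
        f
        for f in features
        if share(f, cluster) >= min_inside and share(f, others) <= max_outside
    }


if __name__ == "__main__":
    # Hypothetical cluster of cars and a group of items outside the cluster.
    CARS = [
        {"wheels", "engine", "red", "trunk"},
        {"wheels", "engine", "red"},
        {"wheels", "engine", "blue", "roof_rack"},
        {"wheels", "engine", "red", "sunroof"},
    ]
    OTHERS = [
        {"wings", "feathers", "red"},
        {"wings", "feathers", "blue"},
        {"wheels", "pedals", "red"},
        {"wings", "feathers", "red"},
    ]
    print(rank_features(CARS))
    print(select_subspace(CARS, OTHERS))
```

The ranking returns all seven features and leaves the reader to decide where to stop. The winnowed set is short enough to take in at a glance, and it drops 'red', which is frequent among the cars but just as frequent elsewhere.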