Today’s visual recognition systems must be trained using millions of images that have been painstakingly labeled by human annotators. In specialized domains such as microscopy, the annotated training data is particularly difficult and costly to generate because labeling the images correctly requires expertise. Furthermore, collecting large data sets can raise privacy concerns and sometimes requires curation to remove racial, gender, or other kinds of bias. The high cost of creating large data sets of annotated images limits the adoption of computer vision technology and puts it beyond the reach of many potential users.
With a National Science Foundation (NSF) Faculty Early Career Development (CAREER) award, Bharath Hariharan, assistant professor of computer science, is developing computer recognition systems that can identify difficult visual concepts with minimal training, using as few as one or two labeled images and approximately 1,000 unlabeled images. To accomplish this goal, researchers will explore two strategies inspired by human vision. First, just as humans learn to perform new visual tasks by drawing on prior visual experience, the systems developed by this research will learn new visual tasks by leveraging a memory of past tasks across multiple domains of knowledge. Second, just as humans learn through rich interactions with expert teachers, the systems will be able to learn by asking detailed questions of experts.