LEVAN: Learning Everything about Anything

Visual recognition is graduating from labs to real-world applications. While it is encouraging to see its potential being tapped, this brings forth a fundamental challenge for the vision researcher: scalability. How can we learn a model for any concept that exhaustively covers all its appearance variations, while requiring minimal or no human supervision for compiling the vocabulary of visual variance, gathering the training images and annotations, and learning the models?


In this work, the LEVAN developers introduce a fully automated approach for learning extensive models for a wide range of variations (e.g., actions, interactions, attributes and beyond) within any concept. Their approach leverages vast resources of online books to discover the vocabulary of variance, and intertwines the data collection and modeling steps to alleviate the need for explicit human supervision in training the models. It organizes the visual knowledge about a concept in a convenient and useful way, enabling a variety of applications across vision and NLP. The online system has been queried by users to learn models for several interesting concepts, including breakfast, Gandhi, and beautiful. To date, the LEVAN system has models available for over 50,000 variations within 150 concepts, and has annotated more than 10 million images with bounding boxes.


Via Dr. Stefan Gruenwald