Meta unveils image-analyzing AI

Meta, the parent company of Facebook, has introduced a new AI model named “Segment Anything,” which can identify objects within images. The main objective of the initiative is to advance research in segmentation and in more general image and video understanding. Segmentation is the task of identifying which pixels in an image belong to a specific object.
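To make the idea of segmentation concrete, here is a minimal sketch in plain Python. It is not Meta's code: the tiny image, the mask, and the helper names `object_pixels` and `cut_out` are all made up for illustration. It only shows that a segmentation mask is a per-pixel yes/no label for one object, and how such a mask can be used to cut that object out.

```python
# Toy 4x5 grayscale image: 0 is background, other values are object pixels.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 7],
    [0, 0, 0, 0, 7],
]

# Hypothetical mask for one object (the block of 9s): True marks its pixels.
mask = [[value == 9 for value in row] for row in image]


def object_pixels(mask):
    """Return the (row, col) coordinates of every pixel the mask assigns to the object."""
    return [
        (r, c)
        for r, row in enumerate(mask)
        for c, flag in enumerate(row)
        if flag
    ]


def cut_out(image, mask, fill=0):
    """Keep the object's pixels and replace every other pixel with `fill`."""
    return [
        [value if keep else fill for value, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


print(object_pixels(mask))
print(cut_out(image, mask))
```

A real model has to predict the mask from the image content; here it is simply read off the pixel values so that the data structure stays visible.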

Meta believes the initiative will democratize segmentation and make it accessible to a wider range of applications, such as analyzing scientific imagery and editing photos. With the new model, Meta aims to expand what is possible in image and video processing and to accelerate research in computer vision.

Creating an accurate segmentation model for a specific task usually requires technical expertise and access to AI training infrastructure. Under the new project, Meta has released its general Segment Anything Model (SAM) and its Segment Anything 1-Billion mask dataset (SA-1B), which aim to enable a broad set of applications and foster further research into foundation models for computer vision.

SAM covers a broad set of uses and can even be applied to underwater photos. It can improve creative applications such as extracting parts of an image for collages or video editing. It could also support scientific studies of natural occurrences on Earth or even in space. The SA-1B dataset is available for research purposes, and SAM is available under a permissive open license (Apache 2.0).

Meta’s goal was to build a foundation model for image segmentation that is promptable, trained on diverse data, and can adapt to specific tasks, much like how prompting is used in natural language processing models. However, the segmentation data needed to train such a model is not readily available online or elsewhere, unlike images, videos, and text.
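The notion of a "promptable" segmenter can be illustrated with a deliberately simple stand-in. The sketch below is not SAM and uses no learned model: it is a classic flood fill, and the function name `segment_from_point` and the toy image are assumptions made for this example. What it shares with the idea described above is the interface: the caller supplies a prompt (here, one clicked pixel) and gets back a mask for the object at that location.

```python
from collections import deque


def segment_from_point(image, point):
    """Toy point prompt: mask the 4-connected region that shares the clicked pixel's value."""
    rows, cols = len(image), len(image[0])
    start_row, start_col = point
    target = image[start_row][start_col]

    mask = [[False] * cols for _ in range(rows)]
    mask[start_row][start_col] = True
    queue = deque([(start_row, start_col)])

    # Breadth-first flood fill outward from the prompt point.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr][nc] and image[nr][nc] == target:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask


image = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 9, 0, 7],
    [0, 0, 0, 0, 7],
]

# Two different prompts on the same image select two different objects.
mask_a = segment_from_point(image, (1, 1))
mask_b = segment_from_point(image, (2, 4))

for row in mask_a:
    print("".join("#" if flag else "." for flag in row))
```

The same image yields different masks depending on the prompt, which is the behavior a promptable model generalizes: one model, many tasks, selected at query time rather than by retraining.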

To solve this problem, Meta developed a general, promptable segmentation model and used it to create a segmentation dataset of unprecedented scale. SA-1B is the largest segmentation dataset to date, containing more than 1.1 billion segmentation masks collected on about 11 million licensed and privacy-preserving images, or roughly 100 masks per image on average.

Previously, there were two classes of approaches to a segmentation problem: interactive segmentation, which required a person to guide the method, and automatic segmentation, which needed substantial amounts of manually annotated objects for training. “Neither approach provided a general, fully automatic approach to segmentation,” Meta said.

Meta collected the SA-1B data using SAM. “Annotators used SAM to interactively annotate images, and then the newly annotated data was used to update SAM in turn. We repeated this cycle many times to iteratively improve both the model and dataset.”

Meta expects the project to have a significant impact on computer vision by lowering the barrier to building segmentation into new applications. The company says the possibilities are broad and that it is excited by the many potential use cases it has not yet imagined.