Facebook today announced upgrades to automatic alternative text (AAT), its AI-based object recognition system that generates photo descriptions for people who are visually impaired. Facebook says it has increased the number of ‘concepts’ the system can reliably identify in a photo tenfold. The social media giant also claims that the descriptions are now more detailed and can cover the activities depicted in a photo, types of animals, landmarks, and more.

The company says it has also substantially improved how it conveys the position and relative size of elements in a photo, so that users have a better idea of what is happening in a picture. “So instead of describing the contents of a photo as ‘May be an image of 5 people,’ we can specify that there are two people in the center of the photo and three others scattered toward the fringes, implying that the two in the center are the focus,” Facebook adds.
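
To make the idea concrete, here is a minimal Python sketch of how a position-aware description could be assembled from detected bounding boxes. Facebook has not published AAT's code, so everything here is an assumption for illustration: the `Box` type, the `region` and `summarize_people` helpers, the normalized coordinates, and the 25% margin that defines the "center" of the frame are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A detected object with coordinates normalized to the range 0..1 (hypothetical type)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def region(box: Box, margin: float = 0.25) -> str:
    """Classify a box as 'center' or 'edges' by where its midpoint falls.

    The margin is an assumed cutoff, not a value taken from AAT.
    """
    cx = (box.x_min + box.x_max) / 2
    cy = (box.y_min + box.y_max) / 2
    in_center = margin <= cx <= 1 - margin and margin <= cy <= 1 - margin
    return "center" if in_center else "edges"


def summarize_people(boxes: list[Box]) -> str:
    """Build a short description that counts people by region of the frame."""
    people = [b for b in boxes if b.label == "person"]
    in_center = sum(1 for b in people if region(b) == "center")
    at_edges = len(people) - in_center
    return (
        f"May be an image of {len(people)} people: "
        f"{in_center} in the center, {at_edges} toward the edges"
    )
```

Given five detected people, two near the middle of the frame and three near its borders, `summarize_people` would return a sentence along the lines of the example Facebook quotes.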

Generic image description offered by AAT. (Image: Facebook)

Additionally, if there is a mountain shown in a picture, the AAT system will now highlight it as the primary object depending on how much of the frame it occupies compared with other objects. To make the upgrades, Facebook says it relied on ‘a model trained on weakly supervised data in the form of billions of public Instagram images and their hashtags.’ The company notes, however, that AAT uses simple phrasing rather than lengthy sentences in its descriptions. AAT-powered image descriptions are currently available in 45 languages.

To improve the wording of the descriptions, Facebook consulted users with visual impairments to understand which types of images they want described in detail and which need less attention. “We designed the new AAT to provide a succinct description for all photos by default but offer an easy way to get more detailed descriptions about photos of specific interest,” Facebook notes.

Detailed description of a photo generated by AAT. (Image: Facebook)