Google today launched an updated version of Voice Access, its service that lets users control Android devices using voice commands. It leverages a machine learning model to automatically detect icons on the screen based on UI screenshots, enabling it to determine whether elements like images and icons have accessibility labels, or labels provided to Android’s accessibility services.
Accessibility labels allow Android’s accessibility services to refer to exactly one on-screen element at a time, letting users know when they’ve cycled through the UI. Unfortunately, some elements lack labels, a problem the new version of Voice Access aims to address.
A vision-based object detection model called IconNet in the new Voice Access (version 5.0) can detect 31 different icon types, soon to be extended to more than 70 types. As Google explains in a blog post, IconNet is based on the novel CenterNet architecture, which extracts app icons from input images and then predicts their locations and sizes. Using Voice Access, users can refer to icons detected by IconNet by their names, e.g., “Tap ‘menu’.”
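To make the idea of predicting locations and sizes concrete, here is a minimal sketch of how a CenterNet-style detector's output is commonly decoded: each icon type gets a heatmap of center scores, local peaks above a threshold become detections, and a separate size map supplies the width and height at each peak. This is an illustration under stated assumptions, not IconNet's actual code; the function name `decode_centers`, the grid values, and the two labels are all hypothetical.

```python
from typing import List, Tuple

# (label, score, center_x, center_y, width, height), in heatmap grid units
Detection = Tuple[str, float, int, int, float, float]


def decode_centers(
    heatmaps: List[List[List[float]]],
    sizes: List[List[Tuple[float, float]]],
    labels: List[str],
    threshold: float = 0.5,
) -> List[Detection]:
    """Turn per-class center heatmaps and a size map into detections.

    heatmaps[c][y][x] is the score that an icon of class c is centered
    at cell (x, y); sizes[y][x] is the predicted (width, height) there.
    """
    detections: List[Detection] = []
    for class_index, heatmap in enumerate(heatmaps):
        rows, cols = len(heatmap), len(heatmap[0])
        for y in range(rows):
            for x in range(cols):
                score = heatmap[y][x]
                if score < threshold:
                    continue
                # Keep the cell only if it is the maximum of its 3x3
                # neighborhood, which stands in for non-max suppression.
                neighborhood = [
                    heatmap[ny][nx]
                    for ny in range(max(0, y - 1), min(rows, y + 2))
                    for nx in range(max(0, x - 1), min(cols, x + 2))
                ]
                if score < max(neighborhood):
                    continue
                width, height = sizes[y][x]
                detections.append(
                    (labels[class_index], score, x, y, width, height)
                )
    # Highest-scoring detections first.
    return sorted(detections, key=lambda d: d[1], reverse=True)


# Made-up 4x4 example with two hypothetical icon types.
menu_heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.0],
    [0.1, 0.3, 0.1, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
back_heatmap = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.2, 0.3],
    [0.0, 0.0, 0.3, 0.7],
]
size_map = [[(2.0, 2.0) for _ in range(4)] for _ in range(4)]
size_map[3][3] = (1.0, 1.5)

detections = decode_centers(
    [menu_heatmap, back_heatmap], size_map, ["menu", "back"]
)
for label, score, x, y, w, h in detections:
    print(f"{label}: score={score} center=({x}, {y}) size=({w}, {h})")
```

In a setup like this, the label attached to each detection is what a voice command such as “Tap ‘menu’” would be matched against, and the center coordinates would tell the system where to tap.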
To train IconNet, Google engineers collected and labeled more than 700,000 app screenshots, streamlining the process by using heuristics, auxiliary models, and data augmentation techniques to identify rarer icons and enrich existing screenshots with infrequent icons. “IconNet is optimized to run on-device for mobile environments, with a compact size and fast inference time to enable a seamless user experience,” Google Research software engineers Gilles Baechler and Srinivas Sunkara wrote in their blog post.
Google says that in the future, it plans to expand the range of elements supported by IconNet to generic images, text, and buttons. It also plans to extend IconNet to differentiate between similar-looking icons by identifying their functionality. Meanwhile, on the developer side, Google hopes to increase the number of apps with valid content descriptions by improving tools to suggest content descriptions for different elements when building applications.
“A significant challenge in the development of an on-device UI element detector for Voice Access is that it must be able to run on a wide variety of phones with a range of performance capabilities, while preserving the user’s privacy,” the authors wrote. “We are constantly working on improving IconNet.”
Voice Access, which launched in beta in 2016, dovetails with Google’s other mobile accessibility efforts. The company is continuing to develop Lookout, an accessibility-focused app that can identify packaged foods using computer vision, scan documents to make it easier to review letters and mail, and more. There’s also Project Euphonia, which aims to help people with speech impairments communicate more easily; Live Relay, which uses on-device speech recognition and text-to-speech to let phones listen and speak on a person’s behalf; and Project Diva, which helps people give the Google Assistant commands without using their voice.