Efficient Audio-Visual Understanding on AR Devices

Abstract

Augmented reality (AR) is a set of technologies that will fundamentally change the way we interact with our environment. It represents a merging of the physical and digital worlds into a rich, context-aware user interface delivered through a socially acceptable form factor such as eyeglasses. The majority of these novel AR experiences will be powered by AI because of its superior ability to handle in-the-wild scenarios. A key AR use case is a personalized, proactive, and context-aware Assistant that can understand the user's activity and environment using audio-visual understanding models. In this presentation, we will discuss the challenges and opportunities in both training and deploying efficient audio-visual understanding models on AR glasses. We will discuss enabling always-on experiences within a constrained power budget using cascaded multimodal models and co-designing them with the target hardware platforms. We will present our early work demonstrating the benefits and potential of such a co-design approach and discuss open research directions that are promising for the research community to explore.
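To illustrate the cascaded-model idea mentioned above, the following is a minimal Python/PyTorch sketch, not the implementation discussed in the talk: a tiny always-on audio gate scores each input window, and the larger audio-visual model is invoked only when that score crosses a threshold, so most of the time the system stays within a low-power budget. All class names, feature shapes, and the threshold are illustrative assumptions.

# Minimal sketch of a cascaded multimodal pipeline (assumed design, not the
# authors' system): a cheap always-on audio gate decides whether to wake the
# expensive audio-visual model.
import torch
import torch.nn as nn

class TinyAudioGate(nn.Module):
    """Low-power first stage: scores a short audio window for relevant activity."""
    def __init__(self, n_mels=40, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(hidden, 1),
        )

    def forward(self, mel):                  # mel: (batch, n_mels, frames)
        return torch.sigmoid(self.net(mel))  # probability that the heavy stage is worth running

class HeavyAVModel(nn.Module):
    """Second stage: fuses audio and video features; only invoked when the gate fires."""
    def __init__(self, audio_dim=40, video_dim=512, n_classes=10):
        super().__init__()
        self.audio_enc = nn.GRU(audio_dim, 128, batch_first=True)
        self.fuse = nn.Linear(128 + video_dim, n_classes)

    def forward(self, mel, video_feat):       # mel: (B, frames, audio_dim), video_feat: (B, video_dim)
        _, h = self.audio_enc(mel)
        return self.fuse(torch.cat([h[-1], video_feat], dim=-1))

def cascaded_inference(mel, video_feat, gate, heavy, threshold=0.5):
    """Run the cheap gate first; skip the expensive audio-visual model when the score is low."""
    with torch.no_grad():
        score = gate(mel.transpose(1, 2))     # gate expects (B, n_mels, frames)
        if score.item() < threshold:
            return None                       # stay in the low-power state
        return heavy(mel, video_feat)         # wake the heavy model only when needed

if __name__ == "__main__":
    gate, heavy = TinyAudioGate(), HeavyAVModel()
    mel = torch.randn(1, 100, 40)             # 100 frames of 40-dim log-mel features
    video = torch.randn(1, 512)               # pooled video embedding
    print(cascaded_inference(mel, video, gate, heavy))

In a hardware co-designed version, the gate and the heavy model would additionally be sized and quantized to match the target accelerator, but that is beyond this sketch.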

Date
Nov 5, 2021, 11:00 AM – 12:00 PM
Event
ACM/IEEE International Conference on Computer-Aided Design (ICCAD), HALO Workshop
Meng Li (李萌)
Assistant Professor, Researcher, Boya Young Fellow

Meng Li is an Assistant Professor and Researcher jointly appointed at the Institute for Artificial Intelligence and the School of Integrated Circuits at Peking University, and a Boya Young Fellow. His research focuses on efficient and secure multimodal AI acceleration algorithms and chips, aiming to build an energy-efficient, reliable, and secure computing foundation for AI through cross-layer algorithm-to-chip co-design and optimization.
