In edge computing scenarios, response speed, compactness, and power efficiency have become critical challenges for visual systems. Traditional vision architectures that separate sensing from computing suffer from high latency, excessive power consumption, and potential privacy leakage, all caused by the data transmission between the two stages. To address these issues, vision chips inspired by the human visual system have emerged as a promising solution. By integrating image acquisition and information processing on a single hardware platform, such chips enable a sensing–computation co-processing paradigm, supporting efficient visual perception and computation directly at the edge. Developing high-speed vision chips is an inherently interdisciplinary task that bridges physics, electronics, and information science, addressing key problems in device fabrication, circuit design, and intelligent algorithm integration. This paper systematically reviews recent advances in the core components of high-speed vision chips.
For high-speed sensor devices, this paper analyzes the physical mechanisms, structural innovations, and performance limitations of complementary metal oxide semiconductor (CMOS) image sensors (CISs), dynamic vision sensors (DVSs), and single-photon image sensors. High-speed CIS devices enhance temporal response by optimizing two fundamental aspects: charge transfer velocity and transfer path length. Gradient doping is employed to induce high-speed drift motion during charge transfer, while structural optimization based on physical device modeling shortens the transfer path, thereby enabling fast response. In contrast, the DVS performs event-triggered readout when light intensity changes exceed a predefined threshold. This event-driven mechanism effectively eliminates static redundant information and generates only spike-based data reflecting brightness changes, achieving low latency and high temporal resolution. For single-photon detection, research on CIS-based quantum image sensors investigates the sources and physical mechanisms of noise, achieving ultra-low noise and extremely high conversion gain. Image sensors based on single-photon avalanche diodes (SPADs) leverage the avalanche effect to directly convert incident photons into pulse outputs, realizing high-speed and high-sensitivity single-photon detection. Furthermore, electric-field modulation enhances photogenerated charge collection and reduces temporal jitter, thereby improving timing precision in SPADs.
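The event-triggered readout described above can be illustrated with a minimal behavioral model. The sketch below is a simplified simulation, not a circuit description: each pixel keeps a log-intensity reference and emits an ON/OFF event when the change since the last event exceeds a contrast threshold (the `threshold` value of 0.2 is an illustrative assumption).

```python
import numpy as np

def dvs_events(frames, threshold=0.2):
    """Simplified DVS model: emit (t, x, y, polarity) events when the
    log-intensity change at a pixel exceeds the contrast threshold."""
    log_ref = np.log(frames[0].astype(np.float64) + 1e-6)  # per-pixel reference
    events = []
    for t, frame in enumerate(frames[1:], start=1):
        log_i = np.log(frame.astype(np.float64) + 1e-6)
        diff = log_i - log_ref
        on = diff >= threshold       # brightness increase -> ON event
        off = diff <= -threshold     # brightness decrease -> OFF event
        for y, x in zip(*np.nonzero(on | off)):
            events.append((t, x, y, 1 if on[y, x] else -1))
        # reset the reference only at pixels that fired (event-driven update)
        fired = on | off
        log_ref[fired] = log_i[fired]
    return events
```

Static pixels produce no output at all, which is exactly how the DVS suppresses redundant data: a pixel whose brightness doubles crosses the log threshold and fires once, while unchanged pixels stay silent.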
In terms of readout circuits, this paper reviews the architectures and optimization strategies for high-speed analog-to-digital converters (ADCs), address-event encoding, and time-correlated single-photon counting. To enhance conversion efficiency while minimizing chip area and power consumption, various ADC architectures have been developed. The successive approximation register (SAR) ADC has become a foundational solution due to its high integration and low power consumption. Hybrid architectures such as SAR/single-slope (SS) and pipeline–SAR combine the strengths of different schemes, thereby effectively overcoming the area–resolution trade-offs inherent in traditional SAR ADCs. For DVSs, the address-event representation (AER) readout mechanism detects brightness variations in real time and outputs them as asynchronous events, which greatly enhances image-processing throughput while reducing storage and transmission demands. In SPAD-based sensors, on-chip integration of counting and histogram computation effectively alleviates the data throughput bottleneck associated with large-scale single-photon detection. These readout strategies, each tailored to the characteristics of its corresponding sensing mechanism, collectively improve data conversion and transmission efficiency in high-speed imaging scenarios.
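The successive-approximation principle behind the SAR ADC is a binary search over DAC codes. The following behavioral sketch (parameter names are illustrative, and it ignores comparator noise, DAC mismatch, and settling time) shows how each comparator decision resolves one bit, from MSB to LSB:

```python
def sar_adc(vin, vref=1.0, bits=8):
    """Behavioral SAR conversion: binary-search the DAC code whose
    output voltage best approximates the sampled input vin."""
    code = 0
    for bit in range(bits - 1, -1, -1):
        trial = code | (1 << bit)           # tentatively set the current bit
        vdac = vref * trial / (1 << bits)   # DAC output for the trial code
        if vin >= vdac:                     # comparator decision
            code = trial                    # keep the bit, else clear it
    return code
```

An N-bit conversion therefore takes exactly N comparator cycles, which is why SAR ADCs achieve high conversion rates with a single comparator and low power, at the cost of resolution being limited by DAC matching and area.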
For intelligent processing, the primary objective is to efficiently extract information from sensor data and enable algorithmic intelligence. This process generally involves two stages: a reconstruction stage, which recovers high-quality image sequences from sparse spike streams, and an intelligent processing stage, which achieves high-speed semantic understanding through real-valued or spike-based computational architectures. By deeply integrating reconstruction and cognition at both the algorithmic and hardware levels, end-to-end intelligent vision systems can simultaneously achieve high speed, low power consumption, and high accuracy. With ongoing technological convergence, multimodal vision chips integrating CIS, DVS, and SPAD architectures combine the advantages of different sensor modalities, providing more comprehensive perceptual capabilities for next-generation machine vision systems. Looking ahead, the continuous advancement of semiconductor manufacturing technologies and novel materials, combined with the deep integration of multimodal sensing and heterogeneous computing paradigms, is expected to drive the development of high-performance, low-power, and intelligent vision chips.
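One simple instance of the reconstruction stage is interval-based intensity estimation for spike streams: since a brighter pixel accumulates charge and fires faster, its instantaneous intensity can be approximated by the reciprocal of the inter-spike interval. The sketch below is a minimal, assumed implementation of this idea (a binary spike tensor of shape (T, H, W) and uniform time steps are illustrative assumptions, not a specific chip's format):

```python
import numpy as np

def reconstruct_from_spikes(spike_stream, t):
    """Estimate a grayscale image at time t from a binary spike stream
    of shape (T, H, W): intensity ~ 1 / local inter-spike interval."""
    T, H, W = spike_stream.shape
    img = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            times = np.nonzero(spike_stream[:, y, x])[0]
            if len(times) < 2:
                continue  # too few spikes to estimate an interval
            # nearest spikes around t bound the local firing interval
            idx = np.searchsorted(times, t)
            lo = times[max(idx - 1, 0)]
            hi = times[min(idx, len(times) - 1)]
            interval = max(hi - lo, 1)
            img[y, x] = 1.0 / interval  # faster firing -> brighter pixel
    return img
```

A pixel that spikes every 2 time steps reconstructs to twice the intensity of one spiking every 4 steps; learned reconstruction networks refine this basic cue but start from the same spike-timing information.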