Publications

(2025). Stochastic Multivariate Universal-Radix Finite-State Machine: a Theoretically and Practically Elegant Nonlinear Function Approximator. In Asia and South Pacific Design Automation Conference (ASP-DAC) 2025.

(2024). PrivCirNet: Efficient Private Inference via Block Circulant Transformation. In Conference on Neural Information Processing Systems (NeurIPS) 2024.

(2024). ArkVale: Efficient Generative LLM Inference with Recallable Key-Value Eviction. In Conference on Neural Information Processing Systems (NeurIPS) 2024.

(2024). ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding. In International Conference on Computer-Aided Design (ICCAD) 2024.

(2024). PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization. In International Conference on Computer-Aided Design (ICCAD) 2024.

(2024). OSCA: End-to-end Serial Stochastic Computing Neural Acceleration with Fine-grained Scaling and Piecewise Activation. In International Conference on Computer-Aided Design (ICCAD) 2024.

(2024). MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers. In International Conference on Computer-Aided Design (ICCAD) 2024.

(2024). HG-PIPE: Vision Transformer Acceleration with Hybrid-Grained Pipeline. In International Conference on Computer-Aided Design (ICCAD) 2024.

(2024). FlexHE: A flexible Kernel Generation Framework for Homomorphic Encryption-Based Private Inference. In International Conference on Computer-Aided Design (ICCAD) 2024.

(2024). AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference. In International Conference on Computer-Aided Design (ICCAD) 2024.

(2024). CASCADE: A Framework for CNN Accelerator Synthesis with Concatenation and Refreshing Dataflow. In IEEE Transactions on Circuits and Systems I: Regular Papers (TCAS-I) (2024).

(2024). MoteNN: Memory Optimization via Fine-grained Scheduling for Deep Neural Networks on Tiny Devices. In Design Automation Conference (DAC) 2024.

(2024). FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference. In Design Automation Conference (DAC) 2024.

(2024). Alchemist: A Unified Accelerator Architecture for Cross-Scheme Fully Homomorphic Encryption. In Design Automation Conference (DAC) 2024.

(2024). ASCEND: Accurate yet Efficient End-to-End Stochastic Computing Acceleration of Vision Transformer. In Design, Automation and Test in Europe Conference and Exhibition (DATE) 2024.

(2024). Enhancing 3D Detection Through Feature Aligned Deep Fusion. In International Conference on 3D Vision (3DV) 2024.

(2024). A 16.38TOPS and 4.55POPS/W SRAM Computing-in-Memory Macro for Signed Operands Computation and Batch Normalization Implementation. In IEEE Transactions on Circuits and Systems I: Regular Papers (TCAS-I) (2024).

(2024). MixCIM: A Hybrid-Cell-Based Computing-in-Memory Macro with Less-Data-Movement and Activation-Memory-Reuse for Depthwise Separable Neural Networks. In IEEE Custom Integrated Circuits Conference (CICC) 2024.

(2023). CoPriv: Network/Protocol Co-Optimization for Communication-Efficient Private Inference. In Conference on Neural Information Processing Systems (NeurIPS) 2023.

(2023). READ: Reliability-Enhanced Accelerator Dataflow Optimization using Critical Input Pattern Reduction. In ACM/IEEE International Conference on Computer Aided Design (ICCAD) 2023.

(2023). Not your father’s stochastic computing (SC)! Efficient yet Accurate End-to-End SC Accelerator Design. In International Conference on ASIC (ASICON) 2023.

(2023). MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention. In International Conference on Computer Vision (ICCV) 2023.

(2023). Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization. In ACM/IEEE International Conference on Computer Aided Design (ICCAD) 2023.

(2023). Falcon: Accelerating Homomorphically Encrypted Convolutions for Efficient Private Mobile Network Inference. In ACM/IEEE International Conference on Computer Aided Design (ICCAD) 2023.

(2023). Efficient Non-Linear Adder for Stochastic Computing with Approximate Spatial-Temporal Sorting Network. In Design Automation Conference (DAC) 2023.

(2023). AVATAR: An Aging- and Variation-Aware Dynamic Timing Analyzer for Error-Efficient Computing. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) (2023).

(2023). READ: Reliability-Enhanced Accelerator Dataflow Optimization using Critical Input Pattern Reduction. In Design, Automation and Test in Europe Conference and Exhibition (DATE) 2023 (extended abstract).

(2023). Accurate yet Efficient Stochastic Computing Neural Acceleration with High Precision Residual Fusion. In Design, Automation and Test in Europe Conference and Exhibition (DATE) 2023.

(2022). BiT: Robustly Binarized Multi-distilled Transformer. In Conference on Neural Information Processing Systems (NeurIPS) 2022.

(2022). Depth Shrink: Empowering Hardware-Friendly Shallow Neural Networks. In International Conference on Machine Learning (ICML) 2022.

(2022). Omni-sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR via Supernet. In International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022.

(2022). SplitNets: Designing Neural Architectures for Efficient Distributed Computing on Head-Mounted Systems. In Conference on Computer Vision and Pattern Recognition (CVPR) 2022.

(2022). Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation. In Conference on Computer Vision and Pattern Recognition (CVPR) 2022.

(2022). NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training. In International Conference on Learning Representations (ICLR) 2022.

(2021). DNA: Differentiable Network-Accelerator Co-Search. In International Symposium on Low Power Electronics and Design (ISLPED) 2021.

(2021). AlphaNet: Improved Training of Supernets with Alpha-Divergence. In International Conference on Machine Learning (ICML) 2021 (Long Oral).

(2021). AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling. In Conference on Computer Vision and Pattern Recognition (CVPR) 2021.

(2021). Improving efficiency in neural network accelerator using operands Hamming distance optimization. In Asia and South Pacific Design Automation Conference (ASP-DAC) 2021.

(2020). KeepAugment: A Simple Information-Preserving Data Augmentation Approach. In Conference on Computer Vision and Pattern Recognition (CVPR) 2021.

(2020). Co-exploration of neural architectures and heterogeneous ASIC accelerator designs targeting multiple tasks. In ACM/IEEE Design Automation Conference (DAC) 2020.

(2018). TimingSAT: Decamouflaging timing-based logic obfuscation. In IEEE International Test Conference (ITC) 2018.

(2018). A Synergistic Framework for Hardware IP Privacy and Integrity Protection. In Springer (2018).

(2018). A Practical Split Manufacturing Framework for Trojan Prevention via Simultaneous Wire Lifting and Cell Insertion. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) (2018).

(2018). Federated Learning with Non-IID Data. In arXiv:1806.00582 (2018).

(2018). A Practical Split Manufacturing Framework for Trojan Prevention via Simultaneous Wire Lifting and Cell Insertion. In Asia and South Pacific Design Automation Conference (ASP-DAC) 2018.

(2018). PrivyNet: A Flexible Framework for Privacy-Preserving Deep Neural Network Training. In arXiv:1709.06161 (2018).

(2017). Provably secure camouflaging strategy for IC protection. In IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) (2018).

(2017). Provably secure camouflaging strategy for IC protection. In ACM/IEEE International Conference on Computer Aided Design (ICCAD) 2017.

(2017). Cross-level Monte Carlo framework for system vulnerability evaluation against fault attack. In ACM/IEEE Design Automation Conference (DAC) 2017.

(2017). AppSAT: Approximately Deobfuscating Integrated Circuits. In IEEE International Symposium on Hardware Oriented Security and Trust (HOST) 2017 (Best Paper Award).

(2016). Practical public PUF enabled by solving max-flow problem on chip. In ACM/IEEE Design Automation Conference (DAC) 2016.

(2016). A Monte Carlo simulation flow for SEU analysis of sequential circuits. In ACM/IEEE Design Automation Conference (DAC) 2016.

(2013). Characterization of Random Telegraph Noise in Scaled High-κ/Metal-Gate MOSFETs with SiO2/HfO2 Gate Dielectrics. In ECS Transactions, 52 (1) 941-946 (2013).
