Ironman: Accelerating Oblivious Transfer Extension for Privacy-Preserving AI with Near-Memory Processing
A 28nm 534.6TOPS/W Mixed-Precision Edge Accelerator for Embodied AI Using Stochastic Computing
Breaking the Layer Barrier: Remodeling Private Transformer Inference with Hybrid CKKS and MPC
HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference
ReaLM: Reliable and Efficient Large Language Model Inference with Statistical Algorithm-Based Fault Tolerance
SpecASR: Accelerating LLM-based Automatic Speech Recognition via Speculative Decoding
FLASH: An Efficient Hardware Accelerator Leveraging Approximate and Sparse FFT for Homomorphic Encryption
LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design