EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
FENIX: Flexible and Efficient Hybrid HE/MPC Acceleration with Near-Memory Processing
H2EAL: Hybrid-Bonding Architecture with Hybrid Sparse Attention for Efficient Long-Context LLM Inference
HD-MoE: Hybrid and Dynamic Parallelism for Mixture-of-Expert LLMs with 3D Near-Memory Processing
No Redundancy, No Stall: Lightweight Streaming 3D Gaussian Splatting for Real-time Rendering
SpecMamba: Accelerating Mamba Inference on FPGA with Speculative Decoding
Ironman: Accelerating Oblivious Transfer Extension for Privacy-Preserving AI with Near-Memory Processing
A 28nm 534.6TOPS/W Mixed-Precision Edge Accelerator for Embodied AI Using Stochastic Computing
Breaking the Layer Barrier: Remodeling Private Transformer Inference with Hybrid CKKS and MPC