Private AI

MPCache: MPC-Friendly KV Cache Eviction for Efficient Private LLM Inference
FENIX: Flexible and Efficient Hybrid HE/MPC Acceleration with Near-Memory Processing
Ironman: Accelerating Oblivious Transfer Extension for Privacy-Preserving AI with Near-Memory Processing
Breaking the Layer Barrier: Remodeling Private Transformer Inference with Hybrid CKKS and MPC
FLASH: An Efficient Hardware Accelerator Leveraging Approximate and Sparse FFT for Homomorphic Encryption
FlexHE: A Flexible Kernel Generation Framework for Homomorphic Encryption-Based Private Inference
FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference