ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding
FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference
ASCEND: Accurate yet Efficient End-to-End Stochastic Computing Acceleration of Vision Transformer
CoPriv: Network/Protocol Co-Optimization for Communication-Efficient Private Inference
Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization
MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention
Not your father’s stochastic computing (SC)! Efficient yet Accurate End-to-End SC Accelerator Design