Four Papers Accepted by DAC'2025
Four papers on efficient LLMs have been accepted by DAC'2025 as regular papers:
- HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference
- SpecASR: Accelerating LLM-based Automatic Speech Recognition via Speculative Decoding
- UniCAIM: A Unified CAM/CIM Architecture with Static-Dynamic KV Cache Pruning for Efficient Long-Context LLM Inference
- ReaLM: Reliable and Efficient Large Language Model Inference with Statistical Algorithm-Based Fault Tolerance