Efficient AI

HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference
ReaLM: Reliable and Efficient Large Language Model Inference with Statistical Algorithm-Based Fault Tolerance
SpecASR: Accelerating LLM-based Automatic Speech Recognition via Speculative Decoding
FLASH: An Efficient Hardware Accelerator Leveraging Approximate and Sparse FFT for Homomorphic Encryption
LightMamba: Efficient Mamba Acceleration on FPGA with Quantization and Hardware Co-design
SCALES: Boost Binary Neural Network for Image Super-Resolution with Efficient Scalings
AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference