AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference
FlexHE: A Flexible Kernel Generation Framework for Homomorphic Encryption-Based Private Inference
HG-PIPE: Vision Transformer Acceleration with Hybrid-Grained Pipeline
MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers
PrivQuant: Communication-Efficient Private Inference with Quantized Network/Protocol Co-Optimization
ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding
Alchemist: A Unified Accelerator Architecture for Cross-Scheme Fully Homomorphic Encryption
FastQuery: Communication-Efficient Embedding Table Query for Private LLM Inference
MoteNN: Memory Optimization via Fine-grained Scheduling for Deep Neural Networks on Tiny Devices