Li Meng's Personal Homepage
Deep Learning
HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference
Based on the KTransformers project, develop an adaptive scheduling framework that leverages the heterogeneous compute capabilities of the CPU and GPU for efficient mixture-of-experts (MoE) LLM inference.
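The core scheduling idea can be illustrated with a small cost model: experts whose weights are already cached on the GPU run there, and the remaining experts go to whichever device minimizes estimated latency. The sketch below assumes fixed per-expert compute and transfer costs and an LRU GPU cache; `ExpertCache`, `schedule_experts`, and the millisecond constants are hypothetical illustrations, not HybriMoE's actual interface.

```python
# Hypothetical sketch of hybrid CPU-GPU expert scheduling; the cost
# constants and LRU policy are illustrative assumptions.
from collections import OrderedDict

GPU_COMPUTE_MS = 0.2   # assumed per-expert GPU compute time
CPU_COMPUTE_MS = 1.5   # assumed per-expert CPU compute time
TRANSFER_MS = 1.0      # assumed CPU->GPU weight-transfer time

class ExpertCache:
    """LRU cache of expert weights resident in GPU memory."""
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = OrderedDict()  # expert_id -> None (weights elided)

    def hit(self, expert_id: int) -> bool:
        if expert_id in self.cache:
            self.cache.move_to_end(expert_id)
            return True
        return False

    def admit(self, expert_id: int) -> None:
        self.cache[expert_id] = None
        self.cache.move_to_end(expert_id)
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used

def schedule_experts(activated: list[int], cache: ExpertCache) -> dict[int, str]:
    """Assign each activated expert to 'gpu' or 'cpu' by estimated latency."""
    plan = {}
    for eid in activated:
        if cache.hit(eid):
            plan[eid] = "gpu"                        # cached: GPU is cheapest
        elif TRANSFER_MS + GPU_COMPUTE_MS < CPU_COMPUTE_MS:
            plan[eid] = "gpu"                        # worth paying the transfer
            cache.admit(eid)
        else:
            plan[eid] = "cpu"                        # compute in place on CPU
    return plan

cache = ExpertCache(capacity=2)
print(schedule_experts([3, 7, 3, 1], cache))
```

In a real system the cost constants would be profiled per hardware platform rather than hard-coded.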
AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference
Develop an adaptive scheduling framework for efficient mixture-of-experts (MoE) LLM inference on edge devices.
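A hedged sketch of what sensitivity-based expert gating might look like: after the usual top-k routing, experts whose gate weight falls below a per-token threshold scaled by a sensitivity parameter are skipped, and the remaining weights are renormalized. `adaptive_gate` and the threshold rule are illustrative assumptions, not AdapMoE's published algorithm.

```python
# Illustrative sensitivity-based gating; the threshold rule is an assumption.
import torch

def adaptive_gate(logits: torch.Tensor, top_k: int, sensitivity: float):
    """Select up to top_k experts, dropping those whose gate weight is below
    sensitivity * (top-1 weight), then renormalize the kept weights."""
    weights = torch.softmax(logits, dim=-1)
    topw, topi = torch.topk(weights, top_k, dim=-1)   # sorted descending
    keep = topw >= sensitivity * topw[..., :1]        # adaptive per-token cutoff
    topw = torch.where(keep, topw, torch.zeros_like(topw))
    topw = topw / topw.sum(dim=-1, keepdim=True)      # renormalize kept experts
    return topw, topi, keep

logits = torch.randn(4, 8)                 # 4 tokens, 8 experts
w, idx, keep = adaptive_gate(logits, top_k=2, sensitivity=0.5)
print(keep.sum(dim=-1))                    # experts actually computed per token
```

Skipping low-weight experts trades a small accuracy loss for fewer expert loads, which matters most on memory-constrained edge devices.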
NASViT: Neural Architecture Search for Efficient Vision Transformer with Gradient Conflict-Aware Supernet Training
Propose gradient conflict-aware supernet training to improve supernet-based NAS, and develop a family of optimized hybrid CNN/ViT networks that achieve a state-of-the-art performance Pareto front.
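Gradient conflicts between subnets sharing supernet weights can be handled by projection, in the spirit of PCGrad: when two gradients have a negative inner product, the conflicting component is removed. The sketch below shows that idea on flattened gradients; `resolve_conflict` is a hypothetical helper, and NASViT's actual conflict-aware update may differ.

```python
# PCGrad-style conflict resolution between a subnet gradient and the
# supernet gradient; a sketch of the idea, not the exact NASViT procedure.
import torch

def resolve_conflict(g_subnet: torch.Tensor, g_super: torch.Tensor) -> torch.Tensor:
    """Project g_subnet onto the normal plane of g_super if they conflict."""
    dot = torch.dot(g_subnet, g_super)
    if dot < 0:  # negative inner product = conflicting directions
        g_subnet = g_subnet - dot / g_super.norm().pow(2) * g_super
    return g_subnet

g_super = torch.tensor([1.0, 0.0])
g_sub = torch.tensor([-1.0, 1.0])        # conflicts with g_super
print(resolve_conflict(g_sub, g_super))  # conflict removed -> tensor([0., 1.])
```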
AlphaNet: Improved Training of Supernet with Alpha-Divergence
Develop AlphaNet, which improves supernet-based NAS with a more general alpha-divergence-based knowledge distillation and achieves a state-of-the-art performance Pareto front.
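The alpha-divergence generalizes the KL divergence used in standard knowledge distillation: D_alpha(p || q) = (sum_i p_i^alpha q_i^(1-alpha) - 1) / (alpha (alpha - 1)), which recovers KL(p || q) as alpha -> 1. A minimal sketch of a distillation loss built on it follows; the function name and the sample alpha values are illustrative, not AlphaNet's exact training recipe.

```python
# Alpha-divergence between teacher and student distributions; a sketch,
# with alpha values chosen for illustration only.
import torch

def alpha_divergence(p: torch.Tensor, q: torch.Tensor, alpha: float) -> torch.Tensor:
    """D_alpha(p || q) per row; recovers KL(p || q) in the limit alpha -> 1."""
    assert alpha not in (0.0, 1.0), "use the KL limits for alpha in {0, 1}"
    inner = (p.pow(alpha) * q.pow(1.0 - alpha)).sum(dim=-1)
    return (inner - 1.0) / (alpha * (alpha - 1.0))

teacher = torch.softmax(torch.randn(2, 10), dim=-1)
student = torch.softmax(torch.randn(2, 10), dim=-1)
print(alpha_divergence(teacher, student, alpha=-1.0))  # penalizes underestimation
print(alpha_divergence(teacher, student, alpha=0.5))   # milder, mass-covering
```

Varying alpha changes how hard the student is penalized for under- versus over-estimating teacher probabilities, which is the lever the more general divergence provides over plain KL.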
AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling
Develop AttentiveNAS, which improves the sampling strategy for supernet-based NAS to achieve a state-of-the-art performance Pareto front.
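Attentive sampling replaces uniform subnet sampling with a Pareto-aware choice: at each training step, draw k candidate architectures and train only the best (or worst) according to a cheap proxy score. The sketch below illustrates the idea; `sample_subnet` and `proxy_score` are hypothetical stand-ins for the real search space and performance predictor.

```python
# Illustrative attentive sampling; the search space and proxy are toy stand-ins.
import random

def sample_subnet() -> dict:
    """Draw a random architecture from a toy search space."""
    return {"depth": random.choice([2, 3, 4]),
            "width": random.choice([32, 64, 96])}

def proxy_score(arch: dict) -> float:
    """Cheap stand-in for a predicted-accuracy model (bigger = better here)."""
    return arch["depth"] * 0.1 + arch["width"] * 0.001

def attentive_sample(k: int = 8, mode: str = "best") -> dict:
    """Pareto-attentive choice among k candidates instead of uniform sampling."""
    candidates = [sample_subnet() for _ in range(k)]
    pick = max if mode == "best" else min
    return pick(candidates, key=proxy_score)

print(attentive_sample(mode="best"))   # train the strongest candidate this step
print(attentive_sample(mode="worst"))  # or the weakest, to lift the Pareto floor
```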