DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence (1)
Quick Note
Core Methodology?
- Hybrid Attention (Compressed Sparse Attention + Heavily Compressed Attention)
- Manifold-Constrained Hyper-Connections (mHC)
- Muon Optimizer
Personal opinion, for reference only.
Quick Note
What Problem Is It Trying to Solve? (Motivation)
Low-memory-footprint RAG via ANNS (Approximate Nearest Neighbor Search).