Let’s chat about DeepSeek
LLMs
DeepSeek
Open-weight models
Mixture of experts
Multi-head latent attention
Model distillation
Presentation and reading of DeepSeek whitepapers

This event was attended by just Mike and Peter, so we went to the pub and read the whitepapers for two of DeepSeek's recently released models. In particular, Pete helped Mike understand the math on pages 7-9 of the DeepSeek-V3 technical report.
Mike also prepared a presentation on the differences between the DeepSeek models and other flagship LLMs. The presentation is available here