Video
March 2026
▶
Build and Train an LLM with JAX
Learn more: https://bit.ly/4rce49q Introducing Build and Train an LLM with JAX, a short course built in partnership with…
February 2026
▶
Generative Adversarial Networks (GANs) Specialization
Learn more: https://www.deeplearning.ai/courses/generative-adversarial-networks-gans-specialization/ The DeepLearning.AI…
▶
TensorFlow: Advanced Techniques Specialization
Learn more: https://www.deeplearning.ai/courses/tensorflow-advanced-techniques-specialization/ The DeepLearning.AI Tenso…
▶
TensorFlow: Data and Deployment Specialization
Learn more: https://www.deeplearning.ai/courses/tensorflow-data-and-deployment-specialization/ Continue developing your …
December 2025
▶
Traditional Holiday Live Stream
https://ykilcher.com/discord Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://w…
▶
TiDAR: Think in Diffusion, Talk in Autoregression (Paper Analysis)
Paper: https://arxiv.org/abs/2511.08923 Abstract: Diffusion language models hold the promise of fast parallel generation…
▶
Titans: Learning to Memorize at Test Time (Paper Analysis)
Paper: https://arxiv.org/abs/2501.00663 Abstract: Over more than a decade there has been an extensive research effort on…
November 2025
▶
[Paper Analysis] The Free Transformer (and some Variational Autoencoder stuff)
https://arxiv.org/abs/2510.17558 Abstract: We propose an extension of the decoder Transformer that conditions its genera…
October 2025
▶
[Video Response] What Cloudflare's code mode misses about MCP and tool calling
Theo's Video: https://www.youtube.com/watch?v=bAYZjVAodoo Cloudflare article: https://blog.cloudflare.com/code-mode/ Lin…
▶
[Paper Analysis] On the Theoretical Limitations of Embedding-Based Retrieval (Warning: Rant)
Paper: https://arxiv.org/abs/2508.21038 Abstract: Vector embeddings have been tasked with an ever-increasing set of retr…
August 2025
▶
AGI is not coming!
Jack Morris's investigation into GPT-OSS training data: https://x.com/jxmnop/status/1953899426075816164?t=3YRhVQDwQLk2gou…
July 2025
▶
Context Rot: How Increasing Input Tokens Impacts LLM Performance (Paper Analysis)
Paper: https://research.trychroma.com/context-rot Abstract: Large Language Models (LLMs) are typically presumed to proce…
▶
Energy-Based Transformers are Scalable Learners and Thinkers (Paper Review)
Paper: https://arxiv.org/abs/2507.02092 Code: https://github.com/alexiglad/EBT Website: https://energy-based-transformer…
May 2025
▶
On the Biology of a Large Language Model (Part 2)
An in-depth look at Anthropic's Transformer Circuit Blog Post. Part 1 here: https://youtu.be/mU3g2YPKlsA Discord here: ht…
April 2025
▶
On the Biology of a Large Language Model (Part 1)
An in-depth look at Anthropic's Transformer Circuit Blog Post https://transformer-circuits.pub/2025/attribution-graphs/b…
January 2025
▶
[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
#deepseek #llm #grpo GRPO is one of the core advancements used in DeepSeek-R1, but was already introduced last year in t…
December 2024
▶
Traditional Holiday Live Stream
https://ykilcher.com/discord Links: TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick YouTube: https://w…