AI integration

Providers

Replicate
Amazon Bedrock
together.ai
- demos
- cookbook
- YouTube 📡
Abacus AI
Deep Infra
Fireworks AI
Hyperbolic
Cohere
- doc

Libraries

Model Context Protocol
LangChain: doc, Python API, source
GPT4All: doc, code
LangChain4J: API doc, release notes, source
Spring AI
Griptape: doc

Mess

AssemblyAI: YouTube 📡
Abdul Majed Raja: YouTube 📡↓
"Prompt Engineering": YouTube 📡↓
Greg Kamradt: YouTube 📡
Matt Wolfe: YouTube 📡↓

Articles and videos

LLaMA & Alpaca: “ChatGPT” On Your Local Computer 🤯 | Tutorial by Martin Thissen (March 18^th, 2023) ► A short explanation on how to use Dalai and LLaMA.
Hugging Face + Langchain in 5 mins | Access 200k+ FREE AI models for your AI apps by Jason Zhou (June 11^th, 2023) ► A small effective demo of using Hugging Face and LangChain.
$0 Embeddings (OpenAI vs. free & open source)↑ by Greg Richardson (June 25^th, 2023) ► A demo of two ways to compute embeddings: online with Hugging Face and locally in the browser.
"Next Level Prompts?" - 10 mins into advanced prompting by Jason Zhou (August 29^th, 2023) ► Some tools/sites helping to write prompts: Guidance, FlowGPT, gpt-prompt-engineer, PromptsRoyale.
A developer’s guide to open source LLMs and generative AI — Open source generative AI projects are a great way to build new AI-powered features and apps. by Gwen Davis (October 5^th, 2023) ► Some information on open-source LLMs and a short list of four of them.
🤬 How the #@%$! Do You Use an LLM in a SaaS Platform?↓ by Arjan Egges (October 6^th, 2023) ► Arjan Egges describes his first steps in building learntail.com, using OpenAI and LangChain to generate quizzes.
Pydantic is all you need: Jason Liu by Jason Liu (October 9^th, 2023) ► Jason Liu presents his Instructor library to structure prompting and extraction (for OpenAI).
How I Fine-Tuned An AI Clone - Can You Tell The Difference?↓ by Greg Kamradt (November 2^nd, 2023) ► A lengthy yet rushed video that just ends up using HeyGen to create a deepfake video.
LLM: Trust, but Verify — Understand the challenges of developing, testing, and monitoring non-deterministic software; this is a new and significant challenge for observability. by Pratik Daga (November 3^rd, 2023) ► The author describes the problem of model drift and proposes a mechanism to detect it.
Wanna RAG? These are your best LLMs!!! by Abdul Majed Raja (November 16^th, 2023) ► A presentation of Galileo’s Hallucination Index.
No, You DON'T NEED OpenAI Function Calling!!!! by Abdul Majed Raja (November 17^th, 2023) ► A quick n’ dirty presentation of Gorilla OpenFunctions.
Training Your Own AI Model Is Not As Hard As You (Probably) Think by Steve Sewell (November 22^nd, 2023) ► Using several steps to generate code from a Figma design: I wonder whether what is presented here really works in cases other than this demo.
llamafile is the new best way to run an LLM on your own computer by Simon Willison (November 29^th, 2023) ► A presentation of llamafile: a single file containing the model and its executable which can run on several OSes.
Detect Texts from Documents (even SCANNED)!!! by Abdul Majed Raja (January 14^th, 2024) ► A presentation of Surya: a tool to identify text lines and compute their bounding boxes.
Exploring ColBERT with RAGatouille by Simon Willison (January 27^th, 2024) ► Some experimentation with ColBERT, a fast retrieval model.
Everything WRONG with LLM Benchmarks (ft. MMLU)!!! by Abdul Majed Raja (February 10^th, 2024) ► Presenting a paper ("When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards") which analyses how sensitive models’ scores are to the way benchmarks are structured.
Engineering Practices for LLM Application Development by David Tan and Jessie Wang (February 13^th, 2024) ► Lessons learned from building a PoC of a concierge using an LLM.
La recherche sous stéroïdes - une histoire de sémantique↓ by Mathilde Rigabert and Martin Labenne (May 3^rd, 2024) ► Some feedback about implementing semantic search on an e-commerce site. This could have been much shorter.
Poorman's ChatGPT-4o Works!! 🤣 by Abdul Majed Raja (May 15^th, 2024) ► A short presentation of KingNish/OpenGPT-4o, a Hugging Face space supporting several modalities by using open models.
The 4 Big Changes in LLMs by Sam Witteveen (July 1^st, 2024) ► Sam Witteveen advises considering four things: models are getting smarter, they are getting faster, they are getting cheaper, and context windows are getting larger.
RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing by Isaac Ong, Amjad Almahairi, Vincent Wu, Wei-Lin Chiang, Tianhao Wu, Joseph E. Gonzalez, M Waleed Kadous, and Ion Stoica (July 1^st, 2024) ► The authors propose a router that sends each query either to an expensive, powerful model or to a cheaper, smaller one, in order to reduce cost while sacrificing little quality.
↪What is an LLM Router? by Sam Witteveen (July 3^rd, 2024) ► Nothing more than the previous announcement.
InternLM - A Strong Agentic Model? by Sam Witteveen (July 5^th, 2024) ► A basic presentation of internlm/internlm2_5-7b-chat, a model specialised for JSON and function calling.
Prompt Poet - Character AI's Prompting Framework by Sam Witteveen (August 2^nd, 2024) ► A presentation of Prompt Poet, a Python framework to manage prompts.
Bridging the Efficiency Gap: Mastering LLM Caching for Next-Generation AI (Part 1) — LLM caching refers to the process of storing and managing the intermediate computations and outputs generated by language models, allowing for rapid retrieval and reuse in subsequent queries or tasks. In this first part of a blog series, we'll explore the fundamental principles of LLM caching, delve into the various caching architectures and implementations that can be employed by Uri Rosenberg (August 7^th, 2024) ► Some cache architectures for LLMs, classical and RAG. The cache key can be exact or semantic.
↪Bridging the Efficiency Gap: Mastering LLM Caching for Next-Generation AI (Part 2) — LLM caching refers to the process of storing and managing the intermediate computations and outputs generated by language models, allowing for rapid retrieval and reuse in subsequent queries or tasks. In this second part of a blog series, we'll explore LLM caching implementations. by Uri Rosenberg (August 7^th, 2024) ► How to implement the previous architectures on AWS, with or without LangChain.
How streaming LLM APIs work by Simon Willison (September 21^st, 2024) ► Some experiments using SSE with GPT-4o Mini, Sonnet 3, and Gemini Pro, using curl, Python’s HTTPX, and JavaScript’s fetch().
Is Spring AI Strong Enough for AI? — Explore Spring's capabilities within the AI domain, its potential integration with AI libraries, and its ability to effectively manage AI workflows. by Reza Ganji (September 27^th, 2024) ► This article compares very different things: Spring, TensorFlow Serving, Kubernetes, MLflow, and Python. Additionally, it only states some obvious facts.
Explore a New C# Library for AI by Matt Williams (October 11^th, 2024) ► Very little information about Microsoft.Extensions.AI, a set of new .NET packages to integrate AI.
Run a prompt to generate and execute jq programs using llm-jq by Simon Willison (October 27^th, 2024) ► A new llm plugin to generate and execute jq commands.
How Google is helping developers get better answers from AI — Today’s guest is Logan Kilpatrick, a senior product manager at Google, who tells Ben about his journey from software engineering to machine learning to product management, all with an emphasis on reducing developer friction. They talk through the challenges of non-determinism in AI models and how Google is addressing these issues with a new feature: Grounding with Google Search. Plus, what working at the Apple Store taught Logan about product management. by Logan Kilpatrick and Ben Popper (November 5^th, 2024) ► There is no real information in this interview of Logan Kilpatrick, a product manager for Google AI Studio.
Model Compression: Improving Efficiency of Deep Learning Models — Model compression is a key component of real-time deployment of deep learning models. This article explores different approaches to make models more efficient. by Inderjot Singh Saggu (November 6^th, 2024) ► A high-level and clear description of model pruning, quantisation, and knowledge distillation.
ChainForge by Simon Willison (November 8^th, 2024) ► A little information about ChainForge, a tool to evaluate prompts.
Introducing the Model Context Protocol by Simon Willison (November 25^th, 2024) ► Anthropic proposes a protocol to connect an LLM to tools and data sources.
Anthropic's New Agent Protocol! by Sam Witteveen (November 27^th, 2024) ► A presentation of Model Context Protocol and some experimentation with it.
Structured Generation w/ SmolLM2 running in browser & WebGPU by Simon Willison (November 29^th, 2024) ► Running a 1.7B model (HuggingFaceTB/SmolLM2-1.7B-Instruct) in Chrome.
17 Python Libraries Every AI Engineer Should Know by Dave Ebbelaar (December 12^th, 2024) ► The title says it all.
Integrating AI With Spring Boot: A Beginner’s Guide — In this guide, you will learn how to integrate AI into your Spring Boot app using Spring AI and simplify your AI setup with familiar Spring abstractions. by Gunter Rotsaert (January 27^th, 2025) ► An introduction to Spring AI.
Using pip to install a Large Language Model that’s under 100MB by Simon Willison (February 7^th, 2025) ► Simon Willison created a (useless) PyPI package embedding a tiny LLM (HuggingFaceTB/SmolLM2-135M-Instruct).
files-to-prompt 0.5 by Simon Willison (February 14^th, 2025) ► Simon Willison describes his files-to-prompt tool, used to send some files and a prompt to an LLM.
Emerging Patterns in Building GenAI Products by Bharani Subramaniam and Martin Fowler (February 25^th, 2025) ► A good overview of the common methods for integrating generative AI (mostly LLMs).
Open Deep Research (April 16^th, 2025) ► Together AI explains how they build their open-source deep research.
Exploring Promptfoo via Dave Guarino’s SNAP evals by Simon Willison (April 24^th, 2025) ► Some information about using promptfoo as an eval tool.
Learn the Hugging Face Kernel Hub in 5 Minutes by David Holtz, Daniël De Kok, Nicolas Patry, Pedro Cuenca, Simon Pagezy, Merve Noyan, and Vaibhav Srivastav (June 12^th, 2025) ► Hugging Face now hosts optimised kernels that can be easily downloaded and used in our own models.
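The difference between an exact and a semantic cache key, mentioned in the LLM caching entries above, can be illustrated with a minimal sketch. Everything in it is hypothetical: the `PromptCache` class, its `threshold` parameter, and the `embed` function, which is only a toy bag-of-words stand-in for a real embedding model. It is not the architecture described in the articles, just an illustration of the two kinds of lookup.

```python
import hashlib
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class PromptCache:
    """Hypothetical cache trying an exact key first, then a semantic one."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold  # minimum similarity accepted as a semantic hit
        self.exact = {}             # hash of the normalised prompt -> answer
        self.entries = []           # (vector, answer) pairs for the semantic lookup

    @staticmethod
    def _key(prompt):
        # Exact key: hash of the prompt after lowercasing and collapsing whitespace.
        normalised = " ".join(prompt.lower().split())
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def put(self, prompt, answer):
        self.exact[self._key(prompt)] = answer
        self.entries.append((embed(prompt), answer))

    def get(self, prompt):
        """Return (answer, kind) where kind is 'exact', 'semantic', or 'miss'."""
        answer = self.exact.get(self._key(prompt))
        if answer is not None:
            return answer, "exact"
        query = embed(prompt)
        best_score, best_answer = 0.0, None
        for vector, cached in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_answer = score, cached
        if best_score >= self.threshold:
            return best_answer, "semantic"
        return None, "miss"
```

With a real embedding model the threshold would need tuning: too low and unrelated prompts share an answer, too high and the semantic lookup never hits.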
Fine-tuning
- Fine Tune a model with MLX for Ollama by Matt Williams (August 30^th, 2024) ► How to fine-tune a model with MLX and use it in Ollama.
- ↪Is MLX the best Fine Tuning Framework? by Matt Williams (January 18^th, 2025) ► A detailed introduction to fine-tuning with MLX. This is an expanded version of the previous video.
- Fine-tuning Large Language Models by Zain Hasan, Artem Chumachenko, George, and Max Ryabinin (January 16^th, 2025) ► The basics of LLM and fine-tuning, a demo of Together’s LoRA fine-tuning API, some experiments done by Together, and some pieces of advice.
- Fast Fine Tuning with Unsloth by Matt Williams (January 24^th, 2025) ► A presentation of Unsloth which optimises fine-tuning on Nvidia GPUs.
- Axolotl is a AI FineTuning Magician↓ by Matt Williams (January 31^st, 2025) ► This presentation of Axolotl is too verbose and it is not very good because Matt Williams does not master the subject.
RAG
- ClippyGPT - How I Built Supabase’s OpenAI Doc Search (Embeddings) by Greg Richardson (February 7^th, 2023) ► Greg Richardson describes in detail how he implemented a chat to answer questions on Supabase: tokenising the doc, finding the paragraph closest to the question, and generating the answer.
- Build RAG Application Using a LLM Running on Local Computer with GPT4All and Langchain — Privacy-preserving LLM without GPU↑ by "(λx.x)eranga" (March 10^th, 2024) ► A clear explanation, with working code, of how to scrape an online document, chunk it, store it in Chroma, and use GPT4All to generate the answer.
- Ne mettez pas les projets RAG en production trop vite ! by Philippe Prados (June 3^rd, 2024) ► Philippe Prados lists some examples of problems that will occur with an overly simplistic RAG implementation. But this simply means that you do not design a demo and a scalable application the same way: the latter is much more complex.
- ↪Rendre résilient un projet RAG by Philippe Prados (June 17^th, 2024) ► Philippe Prados suggested many changes to LangChain in order to make it more resilient, e.g. to properly support transactions.
- Breaking up is hard to do: Chunking in RAG applications — A look at some of the current thinking around chunking data for retrieval-augmented generation (RAG) systems. by Ryan Donovan (June 6^th, 2024) ► A high-level presentation of some chunking methods and how to evaluate them.
- Supercharging RAG with Generative Feedback Loops from Weaviate by Letitia Parcalabescu (June 17^th, 2024) ► A presentation of Generative Feedback Loops, which is just about storing LLM-generated text in a vector database, so it can be retrieved quickly rather than regenerated by the LLM.
- Building search-based RAG using Claude, Datasette and Val Town by Simon Willison (June 21^st, 2024) ► The debrief of a live session of implementing a small RAG in Val Town.
- Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained) by Yannic Kilcher (June 26^th, 2024) ► A critique of "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools".
- Gemma 2 - Local RAG with Ollama and LangChain by Sam Witteveen (June 28^th, 2024) ► A simple RAG implementation.
- Practical tips for retrieval-augmented generation (RAG) — Retrieval-augmented generation (RAG) is one of the best (and easiest) ways to specialize an LLM over your own data, but successfully applying RAG in practice involves more than just stitching together pretrained models. by Cameron R. Wolfe PhD (August 15^th, 2024) ► Some high-level advice on how to implement RAG.
- Knowledge Graphs: The Secret Weapon for Superior RAG Applications — Integrating knowledge graphs in RAG applications enhances recommendation accuracy and context-awareness, providing structured, interconnected data.🚫 by Pavan Vemuri, Prince Bose, and Tharakarama Reddy Yernapalli Sreenivasulu (August 19^th, 2024) ► This article is only about the data retrieval. The data needs to be structured, so it can be stored as a semantic graph.
- RAG vs. Fine Tuning by Cedric Clyburn (September 9^th, 2024) ► The basics of RAG vs. fine-tuning, and a description of combining both.
- Introducing Contextual Retrieval↑ (September 19^th, 2024) ► Anthropic experimented with improving RAG by adding context to chunks, combining embeddings with BM25, and reranking.
- ↪Contextual RAG is stupidly brilliant!↓ by Abdul Majed Raja (September 23^rd, 2024) ► This presentation of Anthropic’s analysis on how to improve RAG is poorly done.
- Multimodal Document RAG with Llama 3.2 Vision and ColQwen2 by Zain Hasan (October 8^th, 2024) ► A presentation of ColPali design: using a vision language model (PaliGemma or Qwen-2) to transform image patches into vectors, finding the patch vectors nearest to the user query, and providing the corresponding full images and user query to a vision LLM (Llama 3.2 vision).
- Why Your RAG System Is Broken, and How to Fix It with Jason Liu (⧉) by Jason Liu and Sam Charrington (November 11^th, 2024) ► Some advice on RAG implementation: writing fast and simple evals (e.g. looking at the length, using regexp…), running them very frequently, reranking…
- Build a document-based question answering system by using Docling with Granite 3.1 by Ash Minhas, Anna Gutowska, and Erika Russi (December 18^th, 2024) ► A small demo of querying a document using Granite, Docling, LangChain, and FAISS.
- 2 Methods For Improving Retrieval in RAG by Johannes Jolkkonen (December 19^th, 2024) ► This video seems to be a real usage of RAG, not the usual YouTuber doing the usual demo. The guy improved his RAG system by preprocessing the documents to extract structured data from them using an LLM.
- GraphRAG vs. Traditional RAG: Higher Accuracy & Insight with LLM↓ by Sara Bacha (February 17^th, 2025) ► This presentation of GraphRAG is too high-level: it gives no clue on how to implement it.
- Build an AI-powered multimodal RAG system with Docling and Granite by BJ Hargrave and Erika Russi (February 26^th, 2025) ► Yet another RAG example, this one extracts text, tables, and images from a PDF file.
- RAG vs. CAG: Solving Knowledge Gaps in AI Models↑ by Martin Keen (March 17^th, 2025) ► A basic and good comparison of Retrieval-Augmented Generation and Cache-Augmented Generation.
- What is Retrieval-Augmented Fine-Tuning (RAFT)? by Isaac Ke (June 9^th, 2025) ► Fine-tuning a model so it gets better at using only the relevant documents provided by the retrieval part and at answering that it does not know if no document is relevant.
- NotebookLM
  - Google's RAG Experiment - NotebookLM by Sam Witteveen (May 28^th, 2024) ► The title says it all. Google’s demo is impressive, using voice for querying and answering.
  - How to create AI Podcasts with NotebookLM Tutorial by Abdul Majed Raja (September 17^th, 2024) ► A presentation of an impressive, usable Google demo (available in Illuminate and NotebookLM): you give a paper as input, and it generates a two-person podcast.
  - NotebookLM’s automatically generated podcasts are surprisingly effective by Simon Willison (September 29^th, 2024) ► People are playing with NotebookLM-generated podcasts, sometimes at a meta-level.
  - New in NotebookLM: Customizing your Audio Overviews by Simon Willison (October 17^th, 2024) ► Simon Willison is playing with the fact that NotebookLM users can now provide guidelines for the podcast to generate: as usual he picks up the pelican example and asks the AI-hosts to behave as if they were pelicans.
  - Google's UNREAL AI Gets an UPGRADE... by Wes Roth (October 19^th, 2024) ► The "poop fart" podcast and how Wes Roth added video to it using HeyGen. He also quickly describes the new NotebookLM features.
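Several RAG entries above revolve around the same steps: chunking a document, retrieving the chunks closest to the question, and building the prompt. The following is a minimal sketch of these steps, assuming fixed-size word chunks with overlap and a plain Okapi BM25 ranking (the lexical scoring mentioned in the Contextual Retrieval entry). The function names and default parameters are hypothetical, and embeddings, reranking, and the LLM call itself are left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_words(text, size=40, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def bm25_rank(question, chunks, k1=1.5, b=0.75):
    """Return the chunk indices sorted from most to least relevant (Okapi BM25)."""
    docs = [Counter(tokenize(c)) for c in chunks]
    lengths = [sum(d.values()) for d in docs]
    avg_len = sum(lengths) / len(docs) if docs else 0.0
    terms = set(tokenize(question))
    scores = []
    for doc, length in zip(docs, lengths):
        score = 0.0
        for term in terms:
            tf = doc[term]
            if tf == 0:
                continue
            n = sum(1 for d in docs if term in d)  # number of chunks containing the term
            idf = math.log((len(docs) - n + 0.5) / (n + 0.5) + 1)
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * length / avg_len))
        scores.append(score)
    return sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)


def build_prompt(question, chunks, top_k=2):
    """Assemble a prompt from the `top_k` best-ranked chunks (the wording is an assumption)."""
    best = bm25_rank(question, chunks)[:top_k]
    context = "\n\n".join(chunks[i] for i in best)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.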
Web scraping
- Web Scraping AI AGENT, that absolutely works 😍 by Abdul Majed Raja (May 9^th, 2024) ► A presentation of ScrapeGraphAI, a Python library to scrape a Web site and query an LLM about the scraped data.
- “Wait, this Agent can Scrape ANYTHING?!” - Build universal web scraping agent by Jason Zhou (May 16^th, 2024) ► Scraping the Web with FireCrawl or AgentQL, and an LLM.
- How to scrape the web for LLM in 2024: Jina AI (Reader API), Mendable (firecrawl) and Scrapegraph-ai by Hai Nghiem (May 17^th, 2024) ► Some Web scraping tools: Beautiful Soup, Jina AI, Firecrawl, and Scrapegraph-ai.
- How Stack Overflow fends off scraping bots — Josh Zhang, a staff site reliability engineer at Stack Overflow, tells Ryan and Eira how the Stack Exchange network defends against scraping bots. They also cover the emergence of human botnets, why DDoS attacks have spiked in the last couple of years, and the constant balancing act of protecting sites from attack without inhibiting legitimate users. by Josh Zhang, Ryan Donovan, and Eira May (July 30^th, 2024) ► The subtitle says it all.
- Agentically scrape the web with Firecrawl & LangGraph (LangChain) by Hai Nghiem (October 25^th, 2024) ► The title says it all.
- NuExtract 1.5 by Simon Willison (November 16^th, 2024) ► NuExtract models extract structured data from unstructured text.
Tool calling
- AI Agents' Secret Sauce by Sam Witteveen (October 7^th, 2024) ► Some basic but good advice on how to implement tools.
- What is Tool Calling? Connecting LLMs to Your Data by Roy Derks (January 13^th, 2025) ► The basic presentation of tool calling is classical. But the description of "embedded tool calling" is not detailed enough to understand how that can work.
Docling
- Docling by Simon Willison (November 3^rd, 2024) ► Some short feedback on experimenting with Docling.
- Building a Basic RAG System with Docling: A Comprehensive Guide↓ by Shashanka B R (December 24^th, 2024) ► This presentation of doing RAG on a PDF file is rather bad, but the code (in the GitHub repo) is fine.
- How to Get Your Data Ready for AI Agents (Docs, PDFs, Websites) by Dave Ebbelaar (February 13^th, 2025) ► A presentation of Docling, a library for parsing documents.
Frameworks
- LLM Toolkit: Validation is all you need by Jeff Schomay (May 20^th, 2024) ► Building a tool that, from an English question, performs a database request and generates an answer, using Instructor and Fructose.
- LangChain
  - LangChain101: Question A 300 Page Book (w/ OpenAI + Pinecone) by Greg Kamradt (February 27^th, 2023) ► A small demo using LangChain, OpenAI, and Pinecone.
  - Workaround OpenAI's Token Limit With Chain Types by Greg Kamradt (March 1^st, 2023) ► Some solutions to summarise or extract answers from documents that are too long.
  - The LangChain Cookbook - Beginner Guide To 7 Essential Concepts by Greg Kamradt (March 29^th, 2023) ► Some short examples of the LangChain features.
  - ↪The LangChain Cookbook Part 2 - Beginner Guide To 9 Use Cases by Greg Kamradt (May 2^nd, 2023) ► The continuation of the previous video.
  - LangChain: Run Language Models Locally - Hugging Face Models by "Prompt Engineering" (April 25^th, 2023) ► A demo of executing a model on Hugging Face and locally.
  - 5 Levels Of LLM Summarizing: Novice to Expert by Greg Kamradt (May 4^th, 2023) ► More LangChain examples.
  - Scrape any website with OpenAI Functions & LangChain by "LLM School" (August 2^nd, 2023) ► The title says it all.
  - Construire son RAG (Retrieval Augmented Generation) grâce à langchain: L’exemple de l’Helpdesk d’OCTO by Florian Bastin and Nicolas Cavallo (October 17^th, 2023) ► A detailed example demonstrating how to extract data from Confluence, embed the chunks, create a chain to find and format the answer, and evaluate the result.
  - Gradio 5 - Building a Quick Chabot UI for LangChain by Sam Witteveen (October 10^th, 2024) ► A small example of a streaming chat program with Gradio 5 and LangChain.
  - Content Extraction using Large Language Models & JavaScript↓ by Amanda Winkles (January 9^th, 2025) ► An example of using LangChain with Granite to extract data from a PDF and Mistral Large to format it into a Markdown table. But the JSON format is unspecified, she uses a flaky heuristic to clean up Granite’s answer, and an LLM is overkill for converting JSON into a Markdown table…
  - LangChain RAG: Optimizing AI Models for Accurate Responses by Erika Russi (February 13^th, 2025) ► A simple RAG system using LangChain and Granite 3.0 8B Instruct.
- LangChain4J
  - Java Meets AI: A Hands On Guide to Building LLM Powered Applications with LangChain4j by Lize Raes (October 5^th, 2023) ► An overview of LangChain4j.
  - Experiments with Langchain4j or Java way to LLM-powered applications by Iryna Hvozdyk (February 6^th, 2024) ► A good overview of LangChain4j features, mostly for people who do not know the typical AI use cases.
  - The Definitive Guide to Tool Support in LangChain4J by Ken Kousen (February 24^th, 2024) ► A rather slow presentation of using tools in LangChain4j.
  - Java rencontre l'IA : Comment intégrer les LLMs dans vos applications avec LangChain4j by Lize Raes (May 3^rd, 2024) ► The same, in French and updated.
  - Evolution of Java Ecosystem for Integrating AI by Poonam Parhar (January 29^th, 2025) ► Building a RAG chat using LangChain4J and Oracle Generative AI.
Tools
- Ollama
  - Ollama on CPU and Private AI models! by Abdul Majed Raja (November 8^th, 2023) ► A presentation of Ollama.
  - Ollama Web UI (ChatGPT-ish) - Local AI FTW!!! by Abdul Majed Raja (December 1^st, 2023) ► Running Ollama Web UI in Docker.
  - Ollama's Newest Release and Model Breakdown by Matt Williams (September 21^st, 2024) ► Ollama 0.3.11, Solar Pro Preview, Qwen 2.5, Bespoke Minicheck, Mistral Small, and Reader-LM.
  - Quick Look at Hollama↓ by Matt Williams (October 8^th, 2024) ► The "unboxing" of Hollama, a good basic UI for Ollama. But there is little value in such a video, you can easily do the same yourself.
  - Ollama + HuggingFace - 45,000 New Models by Sam Witteveen (October 25^th, 2024) ► Ollama can now use any GGUF hosted on Hugging Face.
  - Ollama: Llama 3.2 Vision by Simon Willison (November 13^th, 2024) ► Very little information about Ollama supporting the vision features of Llama 3.2.
  - Open WebUI by Simon Willison (December 27^th, 2024) ► Simon Willison discovers Open WebUI: he is pleased with how easy it is to install, and he tries it with Llama 3.2 3B.
  - Building a Vision App with Ollama Structured Outputs by Sam Witteveen (December 31^st, 2024) ► A presentation of Ollama Structured Outputs and some examples using them with Llama 3.2’s vision.
  - Solved with Windsurf by Matt Williams (February 14^th, 2025) ► Matt Williams wrote a utility in Rust, a language he barely knows, using Windsurf, to get a report on the models installed in Ollama.
  - Function calling using LLMs — Building AI Agents that interact with the external world. by Kiran Prakash (May 6^th, 2025) ► A simple example of a script using tools, which is then converted to use MCP.
  - The Ollama Course of Matt Williams
    - 1. The Ollama Course: Intro to Ollama by Matt Williams (July 23^rd, 2024) ► An overview of Ollama: installation, basic usage, and downloading a model.
    - 2. Installing Ollama by Matt Williams (July 30^th, 2024) ► How to install Ollama on Windows, Linux, and macOS.
    - 3. How to use the Ollama.com site to Find Models by Matt Williams (August 6^th, 2024) ► An explanation of the description of Ollama models.
    - 4. The Ollama Course - Using the CLI by Matt Williams (August 14^th, 2024) ► A presentation of all the CLI commands.
    - 5. Comparing Quantizations of the Same Model - Ollama Course by Matt Williams (August 21^st, 2024) ► Compare the results of the same model with different quantisations and select the one that has the quality / speed that is the best for your needs.
    - 6. An Introduction to RAG - Part of the Free Ollama Course by Matt Williams (August 29^th, 2024) ► A basic introduction to RAG.
    - 7. Embeddings in Depth - Part of the Ollama Course by Matt Williams (September 4^th, 2024) ► An overview of how to compute embeddings using Ollama.
    - Let's build a RAG system - The Ollama Course🚫 by Matt Williams (September 11^th, 2024) ► An example of a small RAG program, both in Python and JavaScript.
    - What are the different types of models - The Ollama Course by Matt Williams (September 19^th, 2024) ► A basic presentation of the model types: text/base, chat/instruct, code, and vision.
    - Crack Ollama Environment Variables with Ease - Part of the Ollama Course by Matt Williams (September 26^th, 2024) ► The most important environment variables and how to set them on macOS, Linux, and Windows.
    - Upgrade Your AI Using Web Search - The Ollama Course by Matt Williams (October 2^nd, 2024) ► A simple program using SearXNG and Cheerio to perform a Web search, retrieve the found pages, scrape the text in them, and generate an answer with Llama 3.2 1B.
    - Taming AI Hallucinations?🚫 by Matt Williams (October 9^th, 2024) ► Matt Williams describes some basic facts about hallucination.
    - Unlock AI Mastery with Pro Tips on Prompting! by Matt Williams (October 16^th, 2024) ► Some basics on prompt writing.
    - Master Ollama's File Layout in Minutes! by Matt Williams (October 23^rd, 2024) ► A description of how Ollama stores the models using several files, similarly to what Docker does.
    - Don’t Embed Wrong! by Matt Williams (October 31^st, 2024) ► Matt Williams speaks about using prefixes for RAG with Ollama, but there is no explanation of how they work: he just says that they improve results.
    - AI Model Context Decoded by Matt Williams (November 6^th, 2024) ► How to change the context size and some warnings about using a large context size.
    - AI Vision Models Take a Peek Again! by Matt Williams (November 8^th, 2024) ► Using Llama 3.2’s vision in Ollama 0.4.0.
    - Let's Update Ollama Everywhere by Matt Williams (November 13^th, 2024) ► Explaining something very basic: upgrading Ollama on Mac, Windows, Linux, and Docker.
    - Cracking the Enigma of Ollama Templates by Matt Williams (November 20^th, 2024) ► An introduction to model templates.
    - Find Your Perfect Ollama Build by Matt Williams (November 22^nd, 2024) ► How to build Ollama, the main branch or a PR.
    - Simplify Ollama Cleanup Like a Pro by Matt Williams (November 27^th, 2024) ► A presentation of Gollama to clean up Ollama data and how to uninstall Ollama.
    - The Path To Better Custom Models by Matt Williams (December 6^th, 2024) ► An introduction to Ollama model files.
    - The Truth About Ollama's Structured Outputs by Matt Williams (December 11^th, 2024) ► A presentation of structured outputs and a comparison with JSON mode.
    - Optimize Your AI - Quantization Explained↓ by Matt Williams (December 28^th, 2024) ► This description of model and context quantisation is unclear, mostly because there is no technical explanation.
    - MSTY Makes Ollama Better by Matt Williams (February 28^th, 2025) ► A presentation of MSTY, a UI for Ollama.
- llm
  - Language models on the command-line w/ Simon Willison by Simon Willison and Hugo Bowne-Anderson (June 13^th, 2024) ► Simon Willison presents his llm CLI tools.
  - ↪Language models on the command-line by Simon Willison (June 17^th, 2024) ► An overview of the video.
  - Using LLMs on the command line by Mark Needham (October 26^th, 2024) ► A short presentation of llm.
  - Ask questions of SQLite databases and CSV/JSON files in your terminal by Simon Willison (November 25^th, 2024) ► Simon Willison adds to sqlite-utils the possibility to ask questions in natural language and have an LLM generate the SQL query.
  - How I use LLMs – neat tricks with Simon’s `llm` tool — Earlier this year I co-authored a report about the direct environmental impact of AI, which might give the impression I’m massively anti-AI, because it talks about the signficant social and environmental of using it. I’m not. I’m (still, slowly) working through the content of the Climate Change AI Summer School, and I use it a fair amount in my job. This post shows some examples I use. by Chris Adams (December 30^th, 2024) ► Some positive feedback and some examples of usage of llm.
  - LLM 0.22, the annotated release notes by Simon Willison (February 17^th, 2025) ► The title says it all.
  - Structured data extraction from unstructured content using LLM schemas by Simon Willison (February 28^th, 2025) ► Simon Willison added support of JSON schemas to llm.
  - llm-openrouter 0.4 by Simon Willison (March 10^th, 2025) ► Simon Willison improved the support of OpenRouter.
  - Feed a video to a vision LLM as a sequence of JPEG frames on the CLI (also LLM 0.25) by Simon Willison (May 5^th, 2025) ► The release notes of llm 0.25 with some details about a new llm-video-frames plugin to extract frames from a video and send them to the model.
  - LLM 0.26a0 adds support for tools! by Simon Willison (May 14^th, 2025) ► Simon Willison prototypes tool integration in llm.
  - How I used o3 to find CVE-2025-37899, a remote zeroday vulnerability in the Linux kernel’s SMB implementation by Simon Willison (May 24^th, 2025) ► Sean Heelan found a use-after-free bug using OpenAI’s o3 via llm.
Agents
- 5 Problems Getting LLM Agents into Production by Sam Witteveen (June 4^th, 2024) ► Some advice on using agents.
- Evals for AI Agents, the right way!!! by Abdul Majed Raja (August 12^th, 2024) ► The usual bad presentation of a paper ("TOOLSANDBOX: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities") evaluating the efficiency of using LLMs as agents.
- Agent-S : Unleash The Power Of GUI Computer Use Agents ! by Sam Witteveen (October 21^st, 2024) ► A high-level presentation of "Agent S: An Open Agentic Framework that Uses Computers Like a Human": a framework to use applications as a human would.
- Microsoft Launches 10 NEW AI Agents by Sam Witteveen (November 24^th, 2024) ► Microsoft is moving aggressively on AI and integrates 10 agents in Dynamics 365.
- Building effective agents (December 19^th, 2024) ► A good and simple overview of some workflow and agent architectures.
- ↪Building effective agents by Simon Willison (December 20^th, 2024) ► Some extracts of the previous article.
- ↪How to Build Effective AI Agents (without the hype) by Dave Ebbelaar (January 20^th, 2025) ► This video is just Anthropic’s article.
- Trace & Evaluate your Agent with Arize Phoenix by Sri Chavali, John Gilhuly, and Aymeric Roucher (February 28^th, 2025) ► A presentation of Arize Phoenix, a platform to trace and evaluate smolagents; the evaluation uses LLM-as-a-judge.
- 5 Types of AI Agents: Autonomous Functions & Real-World Applications by Martin Keen (April 28^th, 2025) ► Martin Keen proposes these categories: simple reflex, model-based reflex, goal-based, utility-based, and learning.
- LangGraph
  - AgentWrite with LangGraph by Sam Witteveen (September 6^th, 2024) ► Sam Witteveen describes how he set up a short LangGraph example to write long articles, similarly to LongWriter.
  - Building a LangGraph ReAct Mini Agent by Sam Witteveen (September 17^th, 2024) ► A description of a simple pattern in LangGraph: ReAct function calling.
- ChatDev
  - Build AI agent workforce - Multi agent framework with MetaGPT & chatDev by Jason Zhou (September 8^th, 2023) ► A presentation of ChatDev.
- CrewAI
  - CrewAI August Update: Planning Steps, Training, and Advanced Features Explained by Sam Witteveen (August 20^th, 2024) ► Sam Witteveen presents some new CrewAI features, but there is no explanation of how training is taken into account or of how test scores are computed…
- Autogen
  - Autogen - Microsoft's best AI Agent framework that is controllable? by Jason Zhou (October 3^rd, 2023) ► A presentation of AutoGen.
  - Microsoft's Magentic One: This FREE AI AGENT can CONTROL BROWSER, DO CODING & MORE! by "AICodeKing" (November 10^th, 2024) ► A presentation and a small test of Magentic-One, a multi-agent system from Microsoft able to browse the Web, read local files, write code, and pilot a terminal to execute that code.
  - Multi-Agent AI EXPLAINED: How Magentic-One Works by Sam Witteveen (November 13^th, 2024) ► A better presentation of Magentic-One.
- Swarm
  - Introducing Swarm with Code Examples: OpenAI's Groundbreaking Agent Framework by Sam Witteveen (October 14^th, 2024) ► Some simple examples using Swarm framework and some feedback about it.
- PydanticAI
  - PydanticAI - The NEW Agent Builder on the Block by Sam Witteveen (December 4^th, 2024) ► Yet another framework. PydanticAI is simple and pythonic.
  - PydanticAI - Building a Research Agent by Sam Witteveen (December 6^th, 2024) ► Using PydanticAI to create an agent for Web search.
- smolagents
  - smolagents - HuggingFace's NEW Agent Framework by Sam Witteveen (January 6^th, 2025) ► Hugging Face has created yet another agent framework. Sam Witteveen does his usual presentation and experimentation with it.
  - ↪How to make Muilt-Agent Apps with smolagents by Sam Witteveen (January 8^th, 2025) ► More experimentation with smolagents, in particular with multi-agent configurations.
Desktop agents
- UI-TARS AI Agent: This IS THE BEST AI Agent EVER & BEATS Claude's Computer Use! by "AICodeKing" (January 23^rd, 2025) ► A simplistic demo of UI-TARS, an agent that can pilot application UIs.
Browser agents
- Browser Use Agent: This FULLY FREE AI Agent CAN CONTROL BROWSERS & DO ANYTHING! (Beats Anthropic!) by "AICodeKing" (November 18^th, 2024) ► A presentation of Browser Use, a Python framework to create agents able to drive a browser.
- Deepseek Operator (+Free APIs) : This 100% FREE AI Agent Beats OpenAI's Operator FOR FREE! by "AICodeKing" (January 24^th, 2025) ► A demo of Browser Use WebUI, a UI for a Browser agent.
- Qwen-2.5 Operator: This is The BEST LOCAL AI Operator Agent THAT YOU CAN USE NOW! by "AICodeKing" (January 30^th, 2025) ► Using Browser Use with Qwen2.5-VL.
- Gemini Browser Use by Sam Witteveen (February 14^th, 2025) ► Some simple experimentation with Browser Use and Gemini 2.0.
OpenAI Agent SDK
- How to Build an Agent with the OpenAI Agents SDK by Sam Witteveen (March 17^th, 2025) ► A classic Sam Witteveen presentation.
MCP
- What is MCP? Integrate AI Agents with Databases & APIs by Roy Derks (February 19^th, 2025) ► A high-level description of MCP.
- microsoft/playwright-mcp by Simon Willison (March 25^th, 2025) ► Microsoft released an MCP server wrapping Playwright.
- Building an MCP server in 2 minutes.... by "2MinutesPy" (April 13^th, 2025) ► A simplistic example of implementing an MCP server in Python.
- Comprendre le Model Context Protocol (MCP) : connecter les LLMs à vos données et outils by Teilo Millet, Gireg Roussel, and Ismael Debbagh (April 18^th, 2025) ► A long high-level presentation of MCP.
- MCP Crash Course for Python Developers by Dave Ebbelaar (April 19^th, 2025) ► This presentation of MCP is rather slow and not so clear.
- Tiny Agents: an MCP-powered agent in 50 lines of code by Julien Chaumond (April 25^th, 2025) ► A simple JavaScript example of using MCP.
- ↪Tiny Agents in Python: a MCP-powered agent in ~70 lines of code by Célina Hanouti, Julien Chaumond, Lucain Pouget, and Shaun Smith (May 23^rd, 2025) ► The same in Python.
- Make AI Agents Fetch Real-time Data using THIS Powerful MCP Server⇊ by "2MinutesPy" (May 26^th, 2025) ► This is not a presentation of MCP, but an advertisement for Bright Data.