Libraries
- Model Context Protocol
- LangChain: doc, Python API, source
- GPT4All: doc, code
- LangChain4J: API doc, release notes, source
- Spring AI
- Griptape: doc
Mess
AssemblyAI: YouTube📡
: YouTube📡↓
: YouTube📡↓
: YouTube📡
: YouTube📡↓
Articles and videos
- LLaMA & Alpaca: “ChatGPT” On Your Local Computer 🤯 | Tutorial by (March 18th, 2023) ► A short explanation on how to use Dalai and LLaMA.
- Hugging Face + Langchain in 5 mins | Access 200k+ FREE AI models for your AI apps by (June 11th, 2023) ► A small effective demo of using Hugging Face and LangChain.
- $0 Embeddings (OpenAI vs. free & open source)↑ by (June 25th, 2023) ► A demo of two ways to compute embeddings: online with Hugging Face and locally in the browser.
- "Next Level Prompts?" - 10 mins into advanced prompting by (August 29th, 2023) ► Some tools/sites helping to write prompts: Guidance, FlowGPT, gpt-prompt-engineer, PromptsRoyale.
- A developer’s guide to open source LLMs and generative AI — Open source generative AI projects are a great way to build new AI-powered features and apps. by (October 5th, 2023) ► Some information on open-source LLMs and a short list of four of them.
- 🤬 How the #@%$! Do You Use an LLM in a SaaS Platform?↓ by (October 6th, 2023) ► The author describes his first steps in building learntail.com, using OpenAI and LangChain to generate quizzes.
- Pydantic is all you need: Jason Liu by (October 9th, 2023) ► Jason Liu presents his Instructor library to structure prompting and extraction (for OpenAI).
- How I Fine-Tuned An AI Clone - Can You Tell The Difference?↓ by (November 2nd, 2023) ► A lengthy yet rushed video that simply ends up using HeyGen to create a deepfake video.
- LLM: Trust, but Verify — Understand the challenges of developing, testing, and monitoring non-deterministic software; this is a new and significant challenge for observability. by (November 3rd, 2023) ► The author describes the problem of model drift and proposes a mechanism to detect it.
- Wanna RAG? These are your best LLMs!!! by (November 16th, 2023) ► A presentation of Galileo’s Hallucination Index.
- No, You DON'T NEED OpenAI Function Calling!!!! by (November 17th, 2023) ► A quick n’ dirty presentation of Gorilla OpenFunctions.
- Training Your Own AI Model Is Not As Hard As You (Probably) Think by (November 22nd, 2023) ► Using several steps to generate code from a Figma design. I wonder whether what is presented here really works in cases other than this demo.
- llamafile is the new best way to run an LLM on your own computer by (November 29th, 2023) ► A presentation of llamafile: a single file containing the model and its executable which can run on several OSes.
- Detect Texts from Documents (even SCANNED)!!! by (January 14th, 2024) ► A presentation of Surya: a tool to identify text lines and compute their bounding boxes.
- Exploring ColBERT with RAGatouille by (January 27th, 2024) ► Some experimentation with ColBERT, a fast retrieval model.
- Everything WRONG with LLM Benchmarks (ft. MMLU)!!! by (February 10th, 2024) ► Presenting a paper ("When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards") which analyses how sensitive models’ scores are to the way benchmarks are structured.
- Engineering Practices for LLM Application Development by and (February 13th, 2024) ► Lessons learned from the building of a PoC of a concierge using an LLM.
- La recherche sous stéroïdes - une histoire de sémantique↓ by and (May 3rd, 2024) ► Some feedback about implementing semantic search on an e-commerce site. This could have been much shorter.
- Poorman's ChatGPT-4o Works!! 🤣 by (May 15th, 2024) ► A short presentation of KingNish/OpenGPT-4o, a Hugging Face space supporting several modalities by using open models.
- The 4 Big Changes in LLMs by (July 1st, 2024) ► The author advises considering four things: models are getting smarter, they are getting faster, they are getting cheaper, and context windows are getting larger.
- RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing by , , , , , , , and (July 1st, 2024) ► The authors propose a router that sends each query either to an expensive, powerful model or to a cheaper, smaller one, in order to reduce cost while sacrificing little quality.
- ↪What is an LLM Router? by (July 3rd, 2024) ► Nothing more than the previous announcement.
- InternLM - A Strong Agentic Model? by (July 5th, 2024) ► A basic presentation of internlm/internlm2_5-7b-chat, a model specialised for JSON and function calling.
- Prompt Poet - Character AI's Prompting Framework by (August 2nd, 2024) ► A presentation of Prompt Poet, a Python framework to manage prompts.
- Bridging the Efficiency Gap: Mastering LLM Caching for Next-Generation AI (Part 1) — LLM caching refers to the process of storing and managing the intermediate computations and outputs generated by language models, allowing for rapid retrieval and reuse in subsequent queries or tasks. In this first part of a blog series, we'll explore the fundamental principles of LLM caching, delve into the various caching architectures and implementations that can be employed by (August 7th, 2024) ► Some cache architectures for LLM, classical and RAG. The cache key can be exact or semantic.
- ↪Bridging the Efficiency Gap: Mastering LLM Caching for Next-Generation AI (Part 2) — LLM caching refers to the process of storing and managing the intermediate computations and outputs generated by language models, allowing for rapid retrieval and reuse in subsequent queries or tasks. In this second part of a blog series, we'll explore LLM caching implementations. by (August 7th, 2024) ► How to implement the previous architecture on AWS, with or without LangChain.
- How streaming LLM APIs work by (September 21st, 2024) ► Some experiments with SSE on GPT-4o Mini, Sonnet 3, and Gemini Pro, using curl, Python’s HTTPX, and JavaScript’s fetch().
- Is Spring AI Strong Enough for AI? — Explore Spring's capabilities within the AI domain, its potential integration with AI libraries, and its ability to effectively manage AI workflows. by (September 27th, 2024) ► This article compares very different things: Spring, TensorFlow Serving, Kubernetes, MLflow, and Python. Additionally, it only states some obvious facts.
- Explore a New C# Library for AI by (October 11th, 2024) ► Very little information about Microsoft.Extensions.AI, some new .NET packages to integrate AI.
- Run a prompt to generate and execute jq programs using llm-jq by (October 27th, 2024) ► A new llm plugin to generate and execute jq commands.
- How Google is helping developers get better answers from AI — Today’s guest is Logan Kilpatrick, a senior product manager at Google, who tells Ben about his journey from software engineering to machine learning to product management, all with an emphasis on reducing developer friction. They talk through the challenges of non-determinism in AI models and how Google is addressing these issues with a new feature: Grounding with Google Search. Plus, what working at the Apple Store taught Logan about product management. by and (November 5th, 2024) ► There is no real information in this interview of Logan Kilpatrick, a product manager for Google AI Studio.
- Model Compression: Improving Efficiency of Deep Learning Models — Model compression is a key component of real-time deployment of deep learning models. This article explores different approaches to make models more efficient. by (November 6th, 2024) ► A high-level and clear description of model pruning, quantisation, and knowledge distillation.
- ChainForge by (November 8th, 2024) ► A little information about ChainForge, a tool to evaluate prompts.
- Introducing the Model Context Protocol by (November 25th, 2024) ► Anthropic proposes a protocol to connect an LLM to tools and data sources.
- Anthropic's New Agent Protocol! by (November 27th, 2024) ► A presentation of Model Context Protocol and some experimentation with it.
- Structured Generation w/ SmolLM2 running in browser & WebGPU by (November 29th, 2024) ► Running a 1.7B model (HuggingFaceTB/SmolLM2-1.7B-Instruct) in Chrome.
- 17 Python Libraries Every AI Engineer Should Know by (December 12th, 2024) ► The title says it all.
- Integrating AI With Spring Boot: A Beginner’s Guide — In this guide, you will learn how to integrate AI into your Spring Boot app using Spring AI and simplify your AI setup with familiar Spring abstractions. by (January 27th, 2025) ► An introduction to Spring AI.
- Using pip to install a Large Language Model that’s under 100MB by (February 7th, 2025) ► The author created a (useless) PyPI package embedding a tiny LLM (HuggingFaceTB/SmolLM2-135M-Instruct).
- files-to-prompt 0.5 by (February 14th, 2025) ► The author describes his files-to-prompt tool, used to send some files and a prompt to an LLM.
- Emerging Patterns in Building GenAI Products by and (February 19th, 2025) ► A good overview of the common methods for integrating generative AI (mostly LLMs).
-
Fine-tuning
- Fine Tune a model with MLX for Ollama by (August 30th, 2024) ► How to fine-tune a model with MLX and use it in Ollama.
- ↪Is MLX the best Fine Tuning Framework? by (January 18th, 2025) ► A detailed introduction to fine-tuning with MLX. This is an expanded version of the previous video.
- Fine-tuning Large Language Models by , , , and (January 16th, 2025) ► The basics of LLM and fine-tuning, a demo of Together’s LoRA fine-tuning API, some experiments done by Together, and some pieces of advice.
- Fast Fine Tuning with Unsloth by (January 24th, 2025) ► A presentation of Unsloth which optimises fine-tuning on Nvidia GPUs.
- Axolotl is a AI FineTuning Magician↓ by (January 31st, 2025) ► This presentation of Axolotl is too verbose and not very good, because the presenter does not master the subject.
-
RAG
- ClippyGPT - How I Built Supabase’s OpenAI Doc Search (Embeddings) by (February 7th, 2023) ► The author describes in detail how he implemented a chat to answer questions about Supabase: tokenising the doc, finding the paragraph closest to the question, and generating the answer.
- Build RAG Application Using a LLM Running on Local Computer with GPT4All and Langchain — Privacy-preserving LLM without GPU↑ by (March 10th, 2024) ► A clear explanation, with working code, of how to scrape an online document, chunk it, store it in Chroma, and use GPT4All to generate the answer.
- Ne mettez pas les projets RAG en production trop vite ! by (June 3rd, 2024) ► The author lists some examples of problems that will occur with an overly simplistic RAG implementation. But this simply means that you do not design a demo and a scalable application the same way: the latter is much more complex.
- ↪Rendre résilient un projet RAG by (June 17th, 2024) ► The author suggested many changes to LangChain in order to make it more resilient, e.g. to properly support transactions.
- Breaking up is hard to do: Chunking in RAG applications — A look at some of the current thinking around chunking data for retrieval-augmented generation (RAG) systems. by (June 6th, 2024) ► A high level presentation of some chunking methods and how to evaluate them.
- Supercharging RAG with Generative Feedback Loops from Weaviate by (June 17th, 2024) ► A presentation of Generative Feedback Loops, which is just about storing LLM-generated text in a vector database, so it can be retrieved quickly rather than regenerated by the LLM.
- Building search-based RAG using Claude, Datasette and Val Town by (June 21st, 2024) ► The debrief of a live session of implementing a small RAG in Val Town.
- Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools (Paper Explained) by (June 26th, 2024) ► A critique of "Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools".
- Gemma 2 - Local RAG with Ollama and LangChain by (June 28th, 2024) ► A simple RAG implementation.
- Practical tips for retrieval-augmented generation (RAG) — Retrieval-augmented generation (RAG) is one of the best (and easiest) ways to specialize an LLM over your own data, but successfully applying RAG in practice involves more than just stitching together pretrained models. by (August 15th, 2024) ► Some high level advice on how to implement RAG.
- Knowledge Graphs: The Secret Weapon for Superior RAG Applications — Integrating knowledge graphs in RAG applications enhances recommendation accuracy and context-awareness, providing structured, interconnected data.🚫 by , , and (August 19th, 2024) ► This article is only about the data retrieval. The data needs to be structured, so it can be stored as a semantic graph.
- RAG vs. Fine Tuning by (September 9th, 2024) ► The basics of RAG vs. fine-tuning, and a description of combining both.
- Introducing Contextual Retrieval↑ (September 19th, 2024) ► Anthropic experimented with improving RAG by adding context to chunks, combining embeddings with BM25, and reranking.
- ↪Contextual RAG is stupidly brilliant!↓ by (September 23rd, 2024) ► This presentation of Anthropic’s analysis on how to improve RAG is poorly done.
- Multimodal Document RAG with Llama 3.2 Vision and ColQwen2 by (October 8th, 2024) ► A presentation of ColPali design: using a vision language model (PaliGemma or Qwen-2) to transform image patches into vectors, finding the patch vectors nearest to the user query, and providing the corresponding full images and user query to a vision LLM (Llama 3.2 vision).
- Why Your RAG System Is Broken, and How to Fix It with Jason Liu (⧉) by and (November 11th, 2024) ► Some advice on RAG implementation: writing fast and simple evals (e.g. checking the length, using regexps…), running them very frequently, reranking…
- Build a document-based question answering system by using Docling with Granite 3.1 by , , and (December 18th, 2024) ► A small demo of interrogating a document using Granite, Docling, LangChain, and FAISS.
- 2 Methods For Improving Retrieval in RAG by (December 19th, 2024) ► This video seems to be a real usage of RAG, not the usual YouTuber doing the usual demo. The guy improved his RAG system by preprocessing the documents to extract structured data from them using an LLM.
- GraphRAG vs. Traditional RAG: Higher Accuracy & Insight with LLM↓ by (February 17th, 2025) ► This presentation of GraphRAG is too high-level: it gives no clue about how to implement it.
- Build an AI-powered multimodal RAG system with Docling and Granite by and (February 26th, 2025) ► Yet another RAG example, this one extracts text, tables, and images from a PDF file.
- RAG vs. CAG: Solving Knowledge Gaps in AI Models↑ by (March 17th, 2025) ► A basic and good comparison of Retrieval-Augmented Generation and Cache-Augmented Generation.
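Several entries above describe the same basic RAG steps: chunk a document, find the chunk closest to the question, and hand it to an LLM to generate the answer. The retrieval part can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of any linked article: `chunk`, `embed`, `retrieve`, and `build_prompt` are hypothetical names, the bag-of-words `embed` is a toy replacement for a real embedding model, and the final LLM call is left out.

```python
import math
import re
from collections import Counter


def chunk(document: str) -> list:
    """Split a document into paragraph chunks on blank lines."""
    return [p.strip() for p in document.split("\n\n") if p.strip()]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list, k: int = 1) -> list:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list) -> str:
    """Assemble the prompt that would be sent to the LLM (call not shown)."""
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


if __name__ == "__main__":
    doc = (
        "Billing is handled monthly by credit card.\n\n"
        "To reset your password, open the settings page and click reset.\n\n"
        "Support is available by email on weekdays."
    )
    question = "How do I reset my password?"
    print(build_prompt(question, retrieve(question, chunk(doc))))
```

In a real system the chunk vectors would be computed once and stored in a vector database such as Chroma or FAISS, rather than recomputed for every question as this sketch does.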
-
NotebookLM
- Google's RAG Experiment - NotebookLM by (May 28th, 2024) ► The title says it all. Google’s demo is impressive, using voice for querying and answering.
- How to create AI Podcasts with NotebookLM Tutorial by (September 17th, 2024) ► A presentation of an impressive, usable Google demo (available from Illuminate and NotebookLM): you give a paper as input and it generates a two-person podcast.
- NotebookLM’s automatically generated podcasts are surprisingly effective by (September 29th, 2024) ► People are playing with NotebookLM-generated podcasts, sometimes at a meta-level.
- New in NotebookLM: Customizing your Audio Overviews by (October 17th, 2024) ► The author plays with the fact that NotebookLM users can now provide guidelines for the podcast to generate: as usual he picks up the pelican example and asks the AI hosts to behave as if they were pelicans.
- Google's UNREAL AI Gets an UPGRADE... by (October 19th, 2024) ► The "poop fart" podcast and how a video was added to it using HeyGen. He also quickly describes the new NotebookLM features.
-
Web scraping
- Web Scraping AI AGENT, that absolutely works 😍 by (May 9th, 2024) ► A presentation of ScrapeGraphAI, a Python library to scrape a Web site and to interrogate an LLM on the scraped data.
- “Wait, this Agent can Scrape ANYTHING?!” - Build universal web scraping agent by (May 16th, 2024) ► Scraping the Web with FireCrawl or AgentQL, and an LLM.
- How to scrape the web for LLM in 2024: Jina AI (Reader API), Mendable (firecrawl) and Scrapegraph-ai by (May 17th, 2024) ► Some Web scraping tools: Beautiful Soup, Jina AI, Firecrawl, and Scrapegraph-ai.
- How Stack Overflow fends off scraping bots — Josh Zhang, a staff site reliability engineer at Stack Overflow, tells Ryan and Eira how the Stack Exchange network defends against scraping bots. They also cover the emergence of human botnets, why DDoS attacks have spiked in the last couple of years, and the constant balancing act of protecting sites from attack without inhibiting legitimate users. by , , and (July 30th, 2024) ► The subtitle says it all.
- Agentically scrape the web with Firecrawl & LangGraph (LangChain) by (October 25th, 2024) ► The title says it all.
- NuExtract 1.5 by (November 16th, 2024) ► NuExtract models extract structured data from unstructured text.
-
Tool calling
- AI Agents' Secret Sauce by (October 7th, 2024) ► Some basic but good advice on how to implement tools.
- What is Tool Calling? Connecting LLMs to Your Data by (January 13th, 2025) ► The basic presentation of tool calling is classical. But the description of "embedded tool calling" is not detailed enough to understand how that can work.
-
Docling
- Docling by (November 3rd, 2024) ► Brief feedback on experimenting with docling.
- Building a Basic RAG System with Docling: A Comprehensive Guide↓ by (December 24th, 2024) ► This presentation of doing RAG on a PDF file is rather bad, but the code (in the GitHub repo) is fine.
- How to Get Your Data Ready for AI Agents (Docs, PDFs, Websites) by (February 13th, 2025) ► A presentation of Docling, a library for parsing documents.
-
Frameworks
- LLM Toolkit: Validation is all you need by (May 20th, 2024) ► Building a tool that, from an English question, performs a database request and generates an answer, using Instructor and Fructose.
-
LangChain
- LangChain101: Question A 300 Page Book (w/ OpenAI + Pinecone) by (February 27th, 2023) ► A small demo using LangChain, OpenAI, and Pinecone.
- Workaround OpenAI's Token Limit With Chain Types by (March 1st, 2023) ► Some solutions to summarise or extract answers from too long documents.
- The LangChain Cookbook - Beginner Guide To 7 Essential Concepts by (March 29th, 2023) ► Some short examples of the LangChain features.
- ↪The LangChain Cookbook Part 2 - Beginner Guide To 9 Use Cases by (May 2nd, 2023) ► The continuation of the previous video.
- LangChain: Run Language Models Locally - Hugging Face Models by (April 25th, 2023) ► A demo of executing a model on Hugging Face and locally.
- 5 Levels Of LLM Summarizing: Novice to Expert by (May 4th, 2023) ► More LangChain examples.
- Scrape any website with OpenAI Functions & LangChain by (August 2nd, 2023) ► The title says it all.
- Construire son RAG (Retrieval Augmented Generation) grâce à langchain: L’exemple de l’Helpdesk d’OCTO by and (October 17th, 2023) ► A detailed example demonstrating how to extract data from Confluence, embed the chunks, create a chain to find and format the answer, and evaluate the result.
- Gradio 5 - Building a Quick Chatbot UI for LangChain by (October 10th, 2024) ► A small example of a streaming chat program with Gradio 5 and LangChain.
- Content Extraction using Large Language Models & JavaScript↓ by (January 9th, 2025) ► An example of using LangChain with Granite to extract data from a PDF and Mistral Large to format it into a Markdown table. But the JSON format is unspecified, she uses a flaky heuristic to clean up Granite’s answer, and an LLM is overkill for converting JSON into a Markdown table…
- LangChain RAG: Optimizing AI Models for Accurate Responses by (February 13th, 2025) ► A simple RAG system using LangChain and Granite 3.0 8B Instruct.
-
LangChain4J
- Java Meets AI: A Hands On Guide to Building LLM Powered Applications with LangChain4j By Lize Raes (October 5th, 2023) ► An overview of LangChain4j.
- Experiments with Langchain4j or Java way to LLM-powered applications by (February 6th, 2024) ► A good overview of LangChain4j features, mostly for people who do not know the typical AI use cases.
- The Definitive Guide to Tool Support in LangChain4J by (February 24th, 2024) ► A rather slow presentation of using tools in LangChain4j.
- Java rencontre l'IA : Comment intégrer les LLMs dans vos applications avec LangChain4j by (May 3rd, 2024) ► The same, in French and updated.
- Evolution of Java Ecosystem for Integrating AI by (January 29th, 2025) ► Building a RAG chat using LangChain4J and Oracle Generative AI.
-
Tools
-
Ollama
- Ollama on CPU and Private AI models! by (November 8th, 2023) ► A presentation of Ollama.
- Ollama Web UI (ChatGPT-ish) - Local AI FTW!!! by (December 1st, 2023) ► Running Ollama Web UI in Docker.
- Ollama's Newest Release and Model Breakdown by (September 21st, 2024) ► Ollama 0.3.11, Solar Pro Preview, Qwen 2.5, Bespoke Minicheck, Mistral Small, and Reader-LM.
- Quick Look at Hollama↓ by (October 8th, 2024) ► The "unboxing" of Hollama, a good basic UI for Ollama. But there is little value in such a video: you can easily do the same yourself.
- Ollama + HuggingFace - 45,000 New Models by (October 25th, 2024) ► Ollama can now use any GGUF model hosted on Hugging Face.
- Ollama: Llama 3.2 Vision by (November 13th, 2024) ► Very little information about Ollama supporting the vision features of Llama 3.2.
- Open WebUI by (December 27th, 2024) ► The author discovers Open WebUI, is pleased with how easy it is to install, and experiments with it using Llama 3.2 3B.
- Building a Vision App with Ollama Structured Outputs by (December 31st, 2024) ► A presentation of Ollama Structured Outputs and some examples using them with Llama 3.2’s vision.
- Solved with Windsurf by (February 14th, 2025) ► The author wrote a utility in Rust, a language he barely knows, using Windsurf, to get a report on the models installed in Ollama.
-
The Ollama Course of
- 1. The Ollama Course: Intro to Ollama by (July 23rd, 2024) ► An overview of Ollama: installation, basic usage, and downloading a model.
- 2. Installing Ollama by (July 30th, 2024) ► How to install Ollama on Windows, Linux, and MacOS.
- 3. How to use the Ollama.com site to Find Models by (August 6th, 2024) ► An explanation of the description of Ollama models.
- 4. The Ollama Course - Using the CLI by (August 14th, 2024) ► A presentation of all the CLI commands.
- 5. Comparing Quantizations of the Same Model - Ollama Course by (August 21st, 2024) ► Compare the results of the same model with different quantisations and select the one that has the quality / speed that is the best for your needs.
- 6. An Introduction to RAG - Part of the Free Ollama Course by (August 29th, 2024) ► A basic introduction to RAG.
- 7. Embeddings in Depth - Part of the Ollama Course by (September 4th, 2024) ► An overview of how to compute embeddings using Ollama.
- Let's build a RAG system - The Ollama Course by (September 11th, 2024) ► An example of a small RAG program, both in Python and JavaScript.
- What are the different types of models - The Ollama Course by (September 19th, 2024) ► A basic presentation of the model types: text/base, chat/instruct, code, and vision.
- Crack Ollama Environment Variables with Ease - Part of the Ollama Course by (September 26th, 2024) ► The most important environment variables and how to set them on MacOS, Linux, and Windows.
- Upgrade Your AI Using Web Search - The Ollama Course by (October 2nd, 2024) ► A simple program using SearXNG and Cheerio to perform a Web search, retrieve the found pages, scrape their text, and generate an answer with Llama 3.2 1B.
- Taming AI Hallucinations?🚫 by (October 9th, 2024) ► The author describes some basic facts about hallucination.
- Unlock AI Mastery with Pro Tips on Prompting! by (October 16th, 2024) ► Some basics on prompt writing.
- Master Ollama's File Layout in Minutes! by (October 23rd, 2024) ► A description of how Ollama records the models using several files, similarly to what Docker does.
- Don’t Embed Wrong! by (October 31st, 2024) ► The author speaks about using prefixes for RAG with Ollama, but does not explain how they work; he just says that they improve results.
- AI Model Context Decoded by (November 6th, 2024) ► How to change the context size and some warnings about using a large context size.
- AI Vision Models Take a Peek Again! by (November 8th, 2024) ► Using Llama 3.2’s vision in Ollama 0.4.0.
- Let's Update Ollama Everywhere by (November 13th, 2024) ► Explaining something very basic: upgrading Ollama on Mac, Windows, Linux, and Docker.
- Cracking the Enigma of Ollama Templates by (November 20th, 2024) ► An introduction to model templates.
- Find Your Perfect Ollama Build by (November 22nd, 2024) ► How to build Ollama, from the main branch or from a PR.
- Simplify Ollama Cleanup Like a Pro by (November 27th, 2024) ► A presentation of Gollama to clean up Ollama data and how to uninstall Ollama.
- The Path To Better Custom Models by (December 6th, 2024) ► An introduction to Ollama model files.
- The Truth About Ollama's Structured Outputs by (December 11th, 2024) ► A presentation of structured outputs and a comparison with JSON mode.
- Optimize Your AI - Quantization Explained↓ by (December 28th, 2024) ► This description of model and context quantisation is unclear, mostly because there is no technical explanation.
- MSTY Makes Ollama Better by (February 28th, 2025) ► A presentation of MSTY, a UI for Ollama.
-
llm
- Language models on the command-line w/ Simon Willison by and (June 13th, 2024) ► Simon Willison presents his llm CLI tool.
- ↪Language models on the command-line by (June 17th, 2024) ► An overview of the video.
- Using LLMs on the command line by (October 26th, 2024) ► A short presentation of llm.
- Ask questions of SQLite databases and CSV/JSON files in your terminal by (November 25th, 2024) ► The author adds to sqlite-utils the possibility to ask questions in natural language and have an LLM generate the SQL query.
- How I use LLMs – neat tricks with Simon’s `llm` tool — Earlier this year I co-authored a report about the direct environmental impact of AI, which might give the impression I’m massively anti-AI, because it talks about the signficant social and environmental of using it. I’m not. I’m (still, slowly) working through the content of the Climate Change AI Summer School, and I use it a fair amount in my job. This post shows some examples I use. by (December 30th, 2024) ► Some positive feedback and some usage examples of llm.
- LLM 0.22, the annotated release notes by (February 17th, 2025) ► The title says it all.
- Structured data extraction from unstructured content using LLM schemas by (February 28th, 2025) ► The author added support for JSON schemas to llm.
- llm-openrouter 0.4 by (March 10th, 2025) ► The author improved the support of OpenRouter.
-
Agents
- 5 Problems Getting LLM Agents into Production by (June 4th, 2024) ► Some advice on using agents.
- Evals for AI Agents, the right way!!! by (August 12th, 2024) ► The usual bad presentation of a paper ("TOOLSANDBOX: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities") evaluating the efficiency of using LLMs as agents.
- Agent-S : Unleash The Power Of GUI Computer Use Agents ! by (October 21st, 2024) ► A high-level presentation of "Agent S: An Open Agentic Framework that Uses Computers Like a Human": a framework to use applications as a human would.
- Microsoft Launches 10 NEW AI Agents by (November 24th, 2024) ► Microsoft is moving aggressively on AI and is integrating 10 agents into Dynamics 365.
- Building effective agents (December 19th, 2024) ► A good and simple overview of some workflow and agent architectures.
- ↪Building effective agents by (December 20th, 2024) ► Some extracts of the previous article.
- ↪How to Build Effective AI Agents (without the hype) by (January 20th, 2025) ► This video is just Anthropic’s article.
- Trace & Evaluate your Agent with Arize Phoenix by , , and (February 28th, 2025) ► A presentation of Arize Phoenix, a platform to trace and evaluate smolagents; the evaluation uses LLM-as-a-judge.
-
LangGraph
- AgentWrite with LangGraph by (September 6th, 2024) ► The author describes how he set up a short LangGraph example to write long articles, similarly to LongWriter.
- Building a LangGraph ReAct Mini Agent by (September 17th, 2024) ► A description of a simple pattern in LangGraph: ReAct Function Calling.
-
ChatDev
- Build AI agent workforce - Multi agent framework with MetaGPT & chatDev by (September 8th, 2023) ► A presentation of ChatDev.
-
CrewAI
- CrewAI August Update: Planning Steps, Training, and Advanced Features Explained by (August 20th, 2024) ► The author presents some new CrewAI features, but does not explain how training is taken into account, how test scores are computed…
-
Autogen
- Autogen - Microsoft's best AI Agent framework that is controllable? by (October 3rd, 2023) ► A presentation of AutoGen.
- Microsoft's Magentic One: This FREE AI AGENT can CONTROL BROWSER, DO CODING & MORE! by (November 10th, 2024) ► A presentation and a brief test of Magentic-One, a multi-agent system from Microsoft able to browse the Web, read local files, write code, and pilot a terminal to execute that code.
- Multi-Agent AI EXPLAINED: How Magentic-One Works by (November 13th, 2024) ► A better presentation of Magentic-One.
-
Swarm
- Introducing Swarm with Code Examples: OpenAI's Groundbreaking Agent Framework by (October 14th, 2024) ► Some simple examples using the Swarm framework and some feedback about it.
-
PydanticAI
- PydanticAI - The NEW Agent Builder on the Block by (December 4th, 2024) ► Yet another framework. PydanticAI is simple and pythonic.
- PydanticAI - Building a Research Agent by (December 6th, 2024) ► Using PydanticAI to create an agent for Web search.
-
smolagents
- smolagents - HuggingFace's NEW Agent Framework by (January 6th, 2025) ► Hugging Face has created yet another agent framework. The author does his usual presentation and experimentation with it.
- ↪How to make Muilt-Agent Apps with smolagents by (January 8th, 2025) ► More experimentation with smolagents, in particular with multi-agent configurations.
-
Desktop agents
- UI-TARS AI Agent: This IS THE BEST AI Agent EVER & BEATS Claude's Computer Use! by (January 23rd, 2025) ► A simplistic demo of UI-TARS, an agent that can pilot application UIs.
-
Browser agents
- Browser Use Agent: This FULLY FREE AI Agent CAN CONTROL BROWSERS & DO ANYTHING! (Beats Anthropic!) by (November 18th, 2024) ► A presentation of Browser Use, a Python framework to create agents able to drive a browser.
- Deepseek Operator (+Free APIs) : This 100% FREE AI Agent Beats OpenAI's Operator FOR FREE! by (January 24th, 2025) ► A demo of Browser Use WebUI, a UI for a Browser agent.
- Qwen-2.5 Operator: This is The BEST LOCAL AI Operator Agent THAT YOU CAN USE NOW! by (January 30th, 2025) ► Using Browser Use with Qwen2.5-VL.
- Gemini Browser Use by (February 14th, 2025) ► Some simple experimentation with Browser Use and Gemini 2.0.
-
OpenAI Agent SDK
- How to Build an Agent with the OpenAI Agents SDK by (March 17th, 2025) ► A typical presentation by this author.
-
MCP
- What is MCP? Integrate AI Agents with Databases & APIs by (February 19th, 2025) ► A high level description of MCP.
- microsoft/playwright-mcp by (March 25th, 2025) ► Microsoft released an MCP server wrapping Playwright.
- Building an MCP server in 2 minutes.... by (April 13th, 2025) ► A simplistic example of implementing an MCP server in Python.
- Comprendre le Model Context Protocol (MCP) : connecter les LLMs à vos données et outils by , , and (April 18th, 2025) ► A long high-level presentation of MCP.