Codersarts Blog.
What’s new and exciting at Codersarts


LLM Observability with OpenTelemetry: Build a Content Moderation API in Python and FastAPI
Content moderation at scale is one of the most operationally demanding problems in AI applications. Rule-based filters miss context and produce too many false positives. Fully manual review does not scale. A large language model can read text the way a human moderator would, understanding tone, context, and intent, and produce structured output that downstream systems can act on automatically. In this tutorial we build a FastAPI content moderation API that passes
ganesh90
1 day ago · 23 min read


Build Your First LLM-as-a-Judge for RAG Pipelines with Python and OpenAI
Retrieval-Augmented Generation (RAG) pipelines are widely used to build question-answering systems grounded in private or domain-specific documents. But evaluating whether a RAG pipeline is actually working well is harder than building it. Traditional metrics like BLEU and ROUGE measure surface-level word overlap and miss the semantic quality of answers. Human review is accurate but expensive and slow at any meaningful scale. LLM-as-a-Judge sits between these two
ganesh90
2 days ago · 26 min read


Fine-Tune NVIDIA Nemotron-3 Nano on a Customer Support Dataset
NVIDIA Nemotron-3 is a family of open models built for reasoning, coding, chat, and agentic workflows. The Nano variant packs strong language understanding into a 4-billion-parameter model that can be fine-tuned on a single 24GB GPU, making it practical for teams who want to adapt a capable base model to their own domain without renting a large training cluster. In this tutorial, we fine-tune Nemotron-3-Nano-4B on a customer support dataset. After training, the m
ganesh90
3 days ago · 16 min read


Build Your First AI Voice Agent: Speech, Conversation, and Audio Playback with Python and OpenAI
Most AI tutorials show you a text box. You type, the model replies, and the whole exchange stays on screen. That covers the mechanics of calling an LLM, but it leaves out what makes voice AI feel genuinely different: the question comes from a microphone, the answer comes back as speech, and the whole thing happens without touching a keyboard. This tutorial builds a working voice AI agent from scratch in Python. Press Enter to start recording, speak your question,
ganesh90
4 days ago · 13 min read


Turn Your Existing Blog Archive Into a Podcast — For Less Than the Cost of Coffee
Most readers skip your articles — not because the content is bad, but because reading takes time they don't have. This post breaks down how AI-powered blog-to-audio platforms work, from architecture to cost to rollout, and how a single article can become audio, a podcast episode, and multilingual content automatically. Includes a free downloadable PRD.

Pratibha
5 days ago · 22 min read


Build Your First LLM App: Text Summarizer and Explainer with Python and OpenAI
Before you build agents that use tools, remember conversations, or talk to other agents, it helps to start with the simplest possible thing an LLM app can do: take some text in, send it to a model with clear instructions, and return a useful result. In this tutorial, we build a Text Summarizer and Explainer, a terminal application that takes any block of text and processes it in one of three ways: a short summary, a plain language explanation, or a bulleted list
ganesh90
5 days ago · 12 min read


Build Your First AI Chatbot with Memory Using Python and OpenAI
Most AI chatbot demos are stateless: every message you send is treated as the first. The model has no idea what you said three turns ago, cannot refer back to details you shared earlier, and cannot build a coherent conversation over time. This is the biggest gap between a demo and a real chatbot. In this tutorial, we fix that. We build an AI Chatbot with Memory that maintains the full conversation history across every turn, passes it to the model on each request,
ganesh90
6 days ago · 11 min read


Build Your First RAG System: A Python Walkthrough
In this guide, you’ll create a fully functional local RAG pipeline in Python that can:
Read custom documents
Convert them into embeddings
Store them in a vector database
Retrieve relevant context
Generate grounded answers using an LLM
By the end, you’ll have a complete command-line RAG application running locally on your machine.

Pratibha
6 days ago · 7 min read


Build Your First AI Agent: Sentiment Analysis Agent with Python and OpenAI
Understanding how people feel about a product, a service, or an idea is one of the most valuable things a business can do, and it is also one of the tasks where AI consistently outperforms rule-based approaches. A single review can carry joy, frustration, and sarcasm all at once. A rule-based keyword matcher misses this nuance. An LLM does not. In this tutorial, we build a Sentiment Analysis Agent. It is a terminal application that takes any text input, sends it
ganesh90
Jun 12 · 10 min read


The 24/7 AI Receptionist: How Clinics Are Automating Scheduling, Billing & Patient Calls Without Adding Staff
A Voice AI receptionist is an AI-powered system that answers phone calls — and increasingly, in-app and website voice interactions — on behalf of a clinic, and carries out real conversations with patients in natural, spoken language. It's not an IVR menu ("Press 1 for billing, press 2 for appointments"). It's a system that listens, understands intent, responds conversationally, and — most importantly — takes action on the patient's behalf.

Pratibha
Jun 12 · 21 min read


Learn MCP by Building a To-Do List Manager with Python and Claude Desktop
Most AI assistants are good at answering questions but poor at remembering what you asked them to do yesterday. They have no persistent state across conversations; every session starts fresh. The Model Context Protocol (MCP) solves this by letting you build external tools that Claude (or any MCP-compatible host) can call during a conversation, with results persisted wherever you choose. In this tutorial, we build an MCP To-Do List Manager, a local server that g
ganesh90
Jun 12 · 15 min read


Semantic Chunking in RAG Systems Explained
Semantic chunking is a chunking strategy that groups text based on meaning rather than fixed size.
Instead of splitting text after a certain number of tokens, semantic chunking tries to identify:
topic boundaries,
semantic transitions,
and coherent conceptual units.
The goal is simple:
Keep semantically related information together.
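As a rough illustration of the idea, the sketch below starts a new chunk whenever two neighboring sentences stop sharing vocabulary. Word-overlap (Jaccard) similarity is used here only as a stand-in for embedding similarity, and the function names and the 0.15 threshold are assumptions made for this example, not values taken from the article.

```python
import re


def sentence_similarity(a, b):
    """Jaccard word overlap; a toy stand-in for embedding similarity."""
    words_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    words_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def semantic_chunks(sentences, threshold=0.15):
    """Group consecutive sentences; split where similarity drops below threshold."""
    chunks = []
    current = []
    for sentence in sentences:
        # A low score against the previous sentence is treated as a topic boundary.
        if current and sentence_similarity(current[-1], sentence) < threshold:
            chunks.append(current)
            current = []
        current.append(sentence)
    if current:
        chunks.append(current)
    return chunks


sentences = [
    "Vector databases store embeddings.",
    "Embeddings in vector databases are searched by similarity.",
    "Bread dough needs time to rise.",
    "Dough that rises slowly makes better bread.",
]
chunks = semantic_chunks(sentences)
for chunk in chunks:
    print(chunk)
```

In a real pipeline the similarity function would likely compare sentence embeddings, and the threshold would need tuning against the documents being indexed.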

Pratibha
Jun 12 · 7 min read


Sliding Window Chunking Explained for Modern RAG Systems
Sliding window chunking has become one of the most widely used retrieval strategies because it helps preserve continuity between chunks without requiring complex semantic analysis.
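A minimal sketch of the technique, assuming whitespace-separated words as tokens: each window shares a fixed number of tokens with the one before it, which is how continuity between chunks is preserved. The function name and the parameters `window_size` and `overlap` are illustrative choices, not taken from the article.

```python
def sliding_window_chunks(tokens, window_size, overlap):
    """Split tokens into windows that share `overlap` tokens with their neighbor."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must satisfy 0 <= overlap < window_size")
    step = window_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + window_size])
        # Stop once a window has reached the end, so no redundant tail is emitted.
        if start + window_size >= len(tokens):
            break
    return chunks


words = "the quick brown fox jumps over the lazy dog again".split()
chunks = sliding_window_chunks(words, window_size=4, overlap=2)
for chunk in chunks:
    print(" ".join(chunk))
```

A larger overlap preserves more context across chunk boundaries but produces more chunks to embed and store.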

Pratibha
Jun 12 · 7 min read


Build Your First A2A Agent: An Email Drafting Pipeline Using Python and OpenAI
Most AI email tools work as a single prompt: paste your draft, get a rewrite. The problem is that rewriting well requires two very different cognitive tasks: understanding what is wrong with the email, and then knowing how to fix it. Combining both into one prompt produces mediocre results for the same reason that asking a single person to be both a critic and a writer at the same time produces weak output. In this tutorial, we build an A2A Multi-Agent Email Dra
ganesh90
Jun 11 · 21 min read


Fixed-Size Chunking in RAG: Still Relevant in 2026?
Chunking is the process of splitting documents into smaller retrievable units before embedding and indexing them.
In a RAG pipeline:
Documents are split into chunks.
Each chunk is converted into embeddings.
The embeddings are stored in a vector database.
User queries retrieve the most relevant chunks.
The retrieved chunks are passed to the LLM as context.
This means retrieval quality depends heavily on chunk quality.
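The five steps above can be sketched end to end with toy components. This is an illustration under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, a plain Python list stands in for the vector database, and the final step only assembles the prompt rather than calling an LLM. All names here are hypothetical.

```python
import math
import re
from collections import Counter


def fixed_size_chunks(text, chunk_size):
    """Step 1: split the document into chunks of `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Step 2 (toy): a bag-of-words count vector instead of a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, index, top_k=1):
    """Step 4: return the chunks whose vectors are closest to the query vector."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


document = (
    "Paris is the capital of France. "
    "Tokyo is the capital of Japan. "
    "Cairo is the capital of Egypt."
)
chunks = fixed_size_chunks(document, chunk_size=6)
index = [(chunk, embed(chunk)) for chunk in chunks]  # Step 3: in-memory "vector store"
context = retrieve("capital of Japan", index, top_k=1)
# Step 5: the retrieved chunks become the context portion of the LLM prompt.
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: What is the capital of Japan?"
print(prompt)
```

Here the chunk size happens to align with sentence boundaries, so retrieval returns a complete fact. With a chunk size of 4, the same sentences would be cut mid-statement, which is the dependence on chunk quality described above.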

Pratibha
Jun 11 · 6 min read


Build a Cost-Efficient Writing Quality Checker with Tiered Model Routing and OpenAI
Not every piece of text needs the most powerful language model to check it. A short sentence with a grammar error can be caught by a fast, cheap model in under a second. Only long, complex writing with structural and coherence problems genuinely benefits from the most capable model available. Tiered model routing applies this logic systematically. Short to medium text (up to 100 words) goes to GPT-4o-mini for grammar and clarity. If it detects structural or coher
ganesh90
Jun 11 · 11 min read


Building an AI Book Recommender with Kimi K2 and Streamlit
Finding the next great book is harder than it sounds. Generic bestseller lists ignore your taste, and search engines return the same ten titles for every query. What most readers need is a recommendation that actually understands them: their preferred themes, emotional tone, narrative pace, and the books they already love. In this tutorial, we build an AI-powered Book Recommender using Kimi K2, Moonshot AI’s flagship agentic model. The user describes their readi
ganesh90
Jun 10 · 8 min read


Building an AI Interview Prep Agent with Qwen 3.7 Max and Streamlit
Job interviews are stressful, not because candidates lack skills, but because they lack structured preparation. Most people either over-prepare generic answers or walk in completely unprepared for role-specific questions. In this tutorial, we build an AI-powered Interview Prep Agent using Qwen 3.7 Max, Alibaba’s flagship reasoning model. The agent takes a single job title as input and returns a full preparation package: categorized question types, 8 tailored prac
ganesh90
Jun 10 · 8 min read


How to Build a Full-Stack Inventory Management System with React, FastAPI, and SQLite
A production-ready full-stack inventory management system that eliminates spreadsheet chaos, provides real-time stock visibility, and automatically alerts you when inventory falls below reorder thresholds.

Pratibha
May 5 · 13 min read


Research Assistant with AI Sampling
Scenario: You are a research engineer at an academic institution building tools to help researchers manage and analyze scientific literature. Your task is to create an advanced MCP server that not only provides access to research papers but also uses AI sampling (server-initiated LLM calls) to generate intelligent summaries, extract key findings, and compare papers. This assignment builds on Assignment 1 by adding Module 4 concepts: sampling, production pa
ganesh90
Apr 3 · 8 min read