How to Build a Real-Time AI News Aggregator with Django, OpenAI, and Tavily

The Information Overload Problem
Every morning, millions of people open a dozen browser tabs trying to piece together what happened overnight. Tech news, financial markets, policy updates, sports scores — each domain lives on a different site, written in a different style, pushing a different agenda. By the time you have read three articles, you have lost fifteen minutes and still do not have a clear picture.
The problem is not a shortage of news. The problem is synthesis. No single tool aggregates the right headlines, condenses them into plain language, cites the original sources, and lets you ask follow-up questions — until now.
An AI News Aggregator built with Django, OpenAI GPT-4o-mini, and the Tavily Search API changes the experience completely. Instead of clicking across tabs, a user types "What happened in AI startups this week?" and receives a structured, sourced summary in seconds.
Here are six real-world use cases for this kind of tool:
Tech enthusiasts tracking AI and startup launches without reading fifteen blogs
Investors monitoring economic indicators, earnings announcements, and market-moving events
Policy researchers synthesizing legislation updates and regulatory news from multiple jurisdictions
Students gathering current-events evidence for assignments and presentations
Content creators sourcing verified trending topics before writing or recording
Journalists running rapid background research before an interview or deadline
In this blog post you will learn how the system works conceptually, how the architecture is layered, which technology stack to choose at different experience levels, how to phase the build across a real project timeline, and what pitfalls to avoid. No code is required to read this article — just curiosity and a desire to build something useful.
📄 Before you dive in — grab the free PRD template that maps out this entire system: architecture, API spec, sprint plan, and system prompt. [Download the free PRD]
How It Works: Core Concept
At its heart, an AI News Aggregator is a retrieval-augmented generation (RAG) pipeline compressed into a conversational interface. Let us break that down.
The naive approach and why it fails
The most obvious approach to building a "news AI" is to feed GPT-4o-mini a question and trust its training data. The immediate problem: GPT-4o-mini's knowledge has a cutoff date. Ask it "What happened in AI today?" and it will invent plausible-sounding but completely fabricated news. This phenomenon is called hallucination, and in a news context it is not just annoying — it is dangerous. Users trust AI-generated news summaries as factual. Fabricated citations destroy credibility.
The retrieval-augmented fix
Instead of asking the model to recall news, you retrieve live news first and then ask the model to summarise what it retrieved. Tavily is a search API built specifically for AI pipelines. Unlike scraping Google, Tavily returns clean article excerpts and URLs in a structured JSON response, formatted exactly for LLM consumption.
ASCII Data-Flow Diagram
User Types Question
|
v
Django View (POST /api/chat/)
|
v
Session Lookup → Load Conversation History
|
v
Tavily API Call ←—— [Current Date Injected Here]
|
v
Article Array (titles + excerpts + URLs)
|
v
Prompt Assembly (history + articles + question)
|
v
OpenAI GPT-4o-mini
|
v
AI Response with Source Citations
|
v
Message Saved to SQLite (sources as JSON field)
|
v
Rendered in Chat UI with Clickable Links
The key analogy
Think of Tavily as a research intern who runs to the library, grabs the five most relevant newspaper clippings, and hands them to a senior editor (GPT-4o-mini). The editor reads all five, writes a one-paragraph briefing, and hands it back to you with footnotes pointing to each original clipping. You never had to visit the library yourself.
This separation of concerns — retrieval versus generation — is what makes the system both accurate and conversational.
System Architecture Deep Dive
The system has five distinct layers, each with a clear responsibility. Understanding the boundary between layers is the difference between a prototype that works once and a production system that works reliably.
Layer-by-Layer Overview
Layer 1 — Frontend (Vanilla JavaScript)
The browser-side code handles three jobs: capturing user input, sending POST requests to the Django API, and rendering markdown-formatted responses. Because the project deliberately avoids a JavaScript framework, the codebase stays readable and deployable without a Node.js build pipeline. DOMPurify sanitises any HTML before insertion into the DOM, preventing XSS attacks from user-generated or AI-generated content.
Layer 2 — Django Application Server
Django 5.2 runs the application logic. It serves the single-page HTML template, manages session cookies for anonymous users, exposes REST endpoints via Django REST Framework, and coordinates all downstream calls. The session system is critical: it provides per-user conversation isolation without requiring registration, lowering the barrier to entry for end users.
Layer 3 — Data Layer (SQLite + Django ORM)
Four models carry the data weight: NewsCategory stores seeded topic pills (Technology, Finance, Science, Politics, Sports), Conversation groups messages per session, Message stores each user question and AI response alongside a JSON array of source URLs, and SavedArticle stores bookmarked articles with their title, summary, and source link.
Layer 4 — Tavily Search API
Tavily handles live web retrieval. On each chat request, the backend constructs a search query using the user's question plus the current date (injected programmatically to prevent the model from treating training-era news as "today"). Tavily returns up to five article results with clean, LLM-ready excerpts.
Layer 5 — OpenAI GPT-4o-mini
GPT-4o-mini receives the assembled prompt: the conversation history, the Tavily article excerpts, and the user's question. It returns a structured summary that cites specific sources by number. The system prompt instructs the model to always anchor claims to the provided sources and never to invent citations.
Component Table
| Component | Role | Alternatives |
| --- | --- | --- |
| Django 5.2 | Application framework, routing, session management | Flask, FastAPI |
| Django REST Framework | Serialisers, viewsets, JSON response handling | Plain Django views |
| SQLite | Development database, conversation + article persistence | PostgreSQL, MySQL |
| OpenAI GPT-4o-mini | News summarisation and conversational response | GPT-4o, Claude 3 Haiku |
| Tavily API | Live web search returning LLM-ready article excerpts | SerpAPI, Bing Search API |
| Vanilla JavaScript | Frontend interactivity, POST requests, DOM rendering | React, Vue, HTMX |
| Gunicorn | WSGI production server | uWSGI, Daphne |
| DOMPurify | XSS prevention for rendered AI output | Manual sanitisation |
| Session Middleware | Anonymous user isolation without registration | JWT-based auth |
| JSON Field (Message) | Per-message source storage for save functionality | Separate SourceLink model |
Data Flow
User opens the app. Django middleware assigns a session cookie if one does not exist.
User clicks a category pill (e.g., "Technology") or types a custom question into the chat input.
The JavaScript frontend sends a POST request to /api/chat/ with the question and the current session ID.
The Django view retrieves or creates a Conversation record for this session.
The view saves the user's question as a Message record with role="user".
The view calls the Tavily API with the query string and the current ISO date appended to ensure recency.
Tavily returns a JSON array of articles: each has a title, url, content (excerpt), and score.
The view loads the last ten messages from the conversation to build a context window.
The assembled prompt (system instructions + article excerpts + conversation history + new question) is sent to GPT-4o-mini via the OpenAI Python SDK.
GPT-4o-mini returns a response that references sources by index number.
The response and the Tavily URL array are saved together as a new Message record with role="assistant".
The API returns the response text and source URLs as JSON.
JavaScript renders the response in the chat panel and appends clickable source link badges below each message.
If the user clicks "Save Article", a POST to /api/save-article/ creates a SavedArticle record for that session.
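The flow above can be compressed into one testable function. This is a sketch, not the course's actual view code: `search_fn` and `llm_fn` are hypothetical stand-ins for the Tavily and OpenAI calls, the system prompt is abbreviated, and persistence of `Message` records is omitted.

```python
from datetime import date

SYSTEM_PROMPT = "You are a news analyst. Cite every claim by source index, e.g. [1]."

def handle_chat(question, history, search_fn, llm_fn, today=None):
    """One pass through the pipeline: retrieve, assemble, generate."""
    today = today or date.today()
    # Dated retrieval, capped at five articles (steps 6-7).
    articles = search_fn(f"{question} {today.isoformat()}")[:5]
    article_block = "\n\n".join(
        f"[{i + 1}] {a['title']}\n{a['content']}\nURL: {a['url']}"
        for i, a in enumerate(articles)
    )
    # System instructions + excerpts + recent history + new question (steps 8-9).
    messages = [{"role": "system", "content": f"{SYSTEM_PROMPT}\n\n{article_block}"}]
    messages += history[-10:]
    messages.append({"role": "user", "content": question})
    # Generate, then return the answer paired with its source URLs (steps 10-12).
    return {"answer": llm_fn(messages), "sources": [a["url"] for a in articles]}
```

Injecting `search_fn` and `llm_fn` as parameters keeps the pipeline unit-testable without network access, which pays off again in Challenge 4 below.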
Two Non-Obvious Design Decisions
Decision 1 — Date injection at the query level, not the system prompt level
Many developers assume that telling the model "today is May 6, 2026" in the system prompt is enough. It is not. The model may still blend its training-era knowledge with the provided articles. Injecting the current date directly into the Tavily search query (e.g., "AI startup launches May 2026") forces the retrieval layer to return genuinely fresh content before the generation layer sees anything at all. Fresh inputs produce accurate outputs.
Decision 2 — Storing sources as a JSON array in the Message model rather than a separate SourceLink table
A normalised database design would use a foreign-keyed SourceLink table. However, for a read-heavy UI where sources are always displayed alongside their parent message, JSON storage in a single field eliminates an extra JOIN on every chat render. The trade-off is that sources are not independently queryable — but the product roadmap has no feature that requires that capability, making the simpler design the right choice for now.
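A sketch of what that denormalised model could look like. The field names are illustrative rather than the project's exact schema:

```python
from django.db import models

class Message(models.Model):
    """One chat turn, user question or assistant answer."""
    conversation = models.ForeignKey(
        "Conversation", on_delete=models.CASCADE, related_name="messages"
    )
    role = models.CharField(
        max_length=10, choices=[("user", "User"), ("assistant", "Assistant")]
    )
    content = models.TextField()
    # Denormalised source URLs: always rendered together with the message and
    # never queried independently, so a JOIN-free JSONField is the cheaper design.
    sources = models.JSONField(default=list, blank=True)
    created_at = models.DateTimeField(auto_now_add=True)
```

Note `default=list` rather than `default=[]`: Django evaluates the callable per row, avoiding a shared mutable default.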
Tech Stack Recommendation
Choosing the right stack depends on where you are in your learning journey and what you intend to do with the finished product.
Stack A — Beginner / Learning
| Layer | Technology | Why |
| --- | --- | --- |
| Backend | Django 5.2 | Batteries-included ORM, admin, sessions, routing in one package |
| API Layer | Django REST Framework | Serialiser-based JSON endpoints with minimal boilerplate |
| Database | SQLite | Zero-configuration, file-based, perfect for local development |
| AI Model | OpenAI GPT-4o-mini | Low cost (~$0.15/1M input tokens), high quality, simple API |
| Search | Tavily API Free Tier | 1,000 free searches/month, LLM-optimised output |
| Frontend | Vanilla JavaScript | No build step, easier to debug, no framework overhead |
| Server | Django dev server | Built in, auto-reloads on file changes |
Estimated monthly cost (Stack A): $0–$5 for personal use (GPT-4o-mini at low volume + Tavily free tier)
Stack B — Production / Client Deployment
| Layer | Technology | Why |
| --- | --- | --- |
| Backend | Django 5.2 + Gunicorn | WSGI production server with worker process management |
| API Layer | Django REST Framework + throttling | Rate limiting prevents API cost spikes |
| Database | PostgreSQL 16 | ACID compliance, concurrent writes, JSONField performance |
| AI Model | OpenAI GPT-4o-mini with caching | Prompt prefix caching reduces repeated token costs |
| Search | Tavily API Pro ($99/month) | Higher rate limits, priority support, more results per query |
| Frontend | Vanilla JS + CDN-hosted DOMPurify + Marked.js | XSS-safe markdown rendering without a build pipeline |
| Server | Gunicorn behind Nginx | Reverse proxy, SSL termination, static file serving |
| Hosting | DigitalOcean Droplet or Railway | Predictable pricing, easy Django deployment |
Estimated monthly cost (Stack B): $20–$130/month depending on Tavily tier, hosting size, and OpenAI usage volume
Implementation Phases
Building any non-trivial application requires a phased approach. Trying to build everything at once leads to integration bugs that are impossible to isolate. The following four phases take you from zero to a fully functional, deployable news aggregator.
Phase 1 — Project Setup and Data Modelling
What gets built: Django project scaffolding, virtual environment, installed packages (django, djangorestframework, openai, tavily-python, python-dotenv), settings configuration, SQLite database, and the four core models: NewsCategory, Conversation, Message, and SavedArticle.
Key decisions in this phase:
Structuring the app inside a single Django app called news keeps the project self-contained and easier to package as course material.
Seeding the NewsCategory table via a Django management command rather than hard-coding categories in the view separates data from logic and makes the category list editable without a code deploy.
Enabling Django's built-in session framework with a cookie-based backend gives every anonymous visitor a unique identity with zero authentication overhead.
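A seeding command along those lines might look like this. The module path (`news/management/commands/seed_categories.py`) and model fields are assumptions, not the course's exact code:

```python
from django.core.management.base import BaseCommand

from news.models import NewsCategory

DEFAULT_CATEGORIES = ["Technology", "Finance", "Science", "Politics", "Sports"]

class Command(BaseCommand):
    help = "Seed the NewsCategory table with the default topic pills."

    def handle(self, *args, **options):
        for name in DEFAULT_CATEGORIES:
            # get_or_create makes the command idempotent: safe to re-run on deploy.
            _, created = NewsCategory.objects.get_or_create(name=name)
            if created:
                self.stdout.write(self.style.SUCCESS(f"Created category: {name}"))
```

Run it with `python manage.py seed_categories`; because it is idempotent, it can sit in the deployment script without risk of duplicate rows.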
This phase maps directly to the Django project setup module in the Codersarts AI News Aggregator course, where you watch every command run in real time with full explanation of why each dependency is needed.
Phase 2 — Tavily Integration and Search Pipeline
What gets built: A reusable Python function that accepts a query string, appends the current date, calls the Tavily API, and returns a cleaned list of article dictionaries. This function is unit-testable in isolation before any AI code is written.
Key decisions in this phase:
Appending the ISO-format current date (datetime.date.today().isoformat()) to every Tavily query guarantees that results are anchored to the present, not the model's training period.
Capping results at five articles per query balances context window usage against response quality. Fewer than three articles produce thin summaries; more than seven inflate the prompt cost with diminishing returns.
Storing raw Tavily results temporarily in a request-scoped variable (not persisted) keeps the database lean. Only the final URL array is saved with the AI response.
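A minimal sketch of that Phase 2 function, assuming the `tavily-python` client and a `TAVILY_API_KEY` environment variable. The query-building step is split out so it can be unit-tested without the SDK or network access:

```python
import os
from datetime import date

def build_dated_query(query, today=None):
    """Anchor the search to the present by appending the ISO date."""
    today = today or date.today()
    return f"{query} {today.isoformat()}"

def search_news(query, max_results=5):
    """Call Tavily and return a cleaned list of article dictionaries."""
    # Lazy import keeps the pure date logic above testable without the SDK.
    from tavily import TavilyClient  # pip install tavily-python

    client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
    response = client.search(build_dated_query(query), max_results=max_results)
    return [
        {"title": r["title"], "url": r["url"], "content": r["content"]}
        for r in response.get("results", [])
    ]
```

The cap lives in `max_results=5`, matching the three-to-seven sweet spot described above.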
The Tavily integration module in the course includes a live demo of how different query phrasings affect result quality — an insight that is hard to learn from documentation alone.
Phase 3 — OpenAI Integration and Prompt Engineering
What gets built: The system prompt template, the conversation context assembly logic, and the OpenAI API call. This is the intellectual core of the product. The system prompt instructs GPT-4o-mini to act as a news analyst, cite every claim with a source index number, avoid speculation, and refuse to answer questions unrelated to the provided articles.
Key decisions in this phase:
Loading the last ten messages as conversation history keeps the context window manageable while supporting multi-turn follow-up questions ("Tell me more about the second story").
Using the OpenAI messages array format (system + alternating user/assistant turns) correctly maintains conversational coherence rather than just sending the latest question in isolation.
Instructing the model to format sources as [1], [2] etc. inline makes client-side source-badge rendering trivially simple: split on numbered patterns and match to the URL array.
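Because the model is told to cite as [1], [2], mapping citations back to the URL array needs only a small pattern match. The helper below is an illustrative Python version of logic the project performs client-side; the function names are assumptions:

```python
import re

def cited_indices(answer):
    """Return the distinct source numbers cited, in order of first use."""
    seen = []
    for match in re.finditer(r"\[(\d+)\]", answer):
        n = int(match.group(1))
        if n not in seen:
            seen.append(n)
    return seen

def badges_for(answer, source_urls):
    """Map cited indices to URLs, silently ignoring out-of-range citations."""
    return [source_urls[i - 1] for i in cited_indices(answer) if 0 < i <= len(source_urls)]
```

The out-of-range guard matters: a model that hallucinates a [5] when only three sources exist should degrade to fewer badges, not a crashed render.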
The Codersarts course devotes two full lessons to prompt engineering for news summarisation — including what happens when you skip the "only cite provided sources" instruction and how to detect and fix hallucination in a RAG pipeline.
Phase 4 — Frontend, Save Functionality, and Deployment
What gets built: The complete single-page HTML/CSS/JavaScript interface, the save-article endpoint and SavedArticle model integration, the saved-articles panel, Gunicorn configuration, and a production-ready settings.py with environment variable management via python-dotenv.
Key decisions in this phase:
Rendering AI responses as markdown using Marked.js and sanitising the output with DOMPurify before inserting into the DOM prevents XSS attacks that could arise if an AI response included injected HTML — a real attack surface when the model references external content.
Storing saved articles per session (not per user account) keeps the system registration-free. Sessions expire after browser closure by default, which is appropriate for a consumer news tool but can be extended via SESSION_COOKIE_AGE if the product roadmap requires persistence.
Configuring Gunicorn with --workers 2 --threads 2 on a two-core VPS handles concurrent users efficiently without the complexity of async frameworks, which are overkill for this request pattern.
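The environment-variable pattern in `settings.py` might look like this, assuming python-dotenv and illustrative variable names; adapt the keys to your own `.env`:

```python
# settings.py (excerpt) -- variable names here are illustrative
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # reads .env from the project root; no-op if the file is absent

SECRET_KEY = os.environ["DJANGO_SECRET_KEY"]
DEBUG = os.environ.get("DJANGO_DEBUG", "false").lower() == "true"
ALLOWED_HOSTS = os.environ.get("DJANGO_ALLOWED_HOSTS", "localhost").split(",")

# API credentials are read by the view layer via django.conf.settings.
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
TAVILY_API_KEY = os.environ["TAVILY_API_KEY"]

# Extend session lifetime (seconds) if saved articles should outlive the browser.
SESSION_COOKIE_AGE = int(os.environ.get("SESSION_COOKIE_AGE", 60 * 60 * 24 * 14))
```

Secrets stay out of version control, and the same codebase runs in development and production with nothing but a different `.env`.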
The deployment module in the Codersarts course walks through DigitalOcean setup, Nginx configuration, SSL certificate installation with Let's Encrypt, and environment variable management on a live server — skills that apply to every Django project you ever deploy.
Common Challenges and How to Solve Them
Even experienced developers run into specific problems when building AI-powered news tools. Here are the six most common, with their root causes and proven fixes.
Challenge 1 — AI cites training data as "today's news"
Root cause: The model's training-era knowledge bleeds into responses when no retrieval anchor exists. Even with a system-prompt date, the model may weight its parametric knowledge over provided articles.
Fix: Inject the current date into the Tavily query string (not just the system prompt). This forces live retrieval to return genuinely recent articles, giving the model no foothold for training-era citations.
Challenge 2 — JSON source storage breaks on edge cases
Root cause: When Tavily returns zero results (e.g., a very niche query at an off-peak time), the source array is empty. If the view blindly serialises an empty list and the frontend expects at least one source, link rendering breaks.
Fix: Validate the Tavily response before prompt assembly. If results are empty, return a graceful message ("No live news found for this query. Try rephrasing.") and save an empty JSON array [] in the Message record. The frontend should check array length before rendering source badges.
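A minimal guard showing the short-circuit; `ask_fn` is a hypothetical stand-in for the prompt-assembly-plus-OpenAI step:

```python
EMPTY_RESULT_MESSAGE = "No live news found for this query. Try rephrasing."

def build_chat_response(articles, question, ask_fn):
    """Short-circuit before prompt assembly when retrieval came back empty."""
    if not articles:
        # sources=[] keeps the Message schema uniform; the frontend checks
        # array length before rendering any badges.
        return {"answer": EMPTY_RESULT_MESSAGE, "sources": []}
    return {"answer": ask_fn(articles, question), "sources": [a["url"] for a in articles]}
```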
Challenge 3 — Category pills produce repetitive responses
Root cause: Clicking "Technology" five times in a row sends the same generic query to Tavily each time, returning nearly identical articles and producing near-identical AI summaries.
Fix: Append a timestamp or session message count to category queries to force result freshness. Alternatively, track which article URLs have already been shown in this session and pass them as an exclusion list to the Tavily call.
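One simple way to implement the exclusion idea is to filter after retrieval, keeping a per-session set of URLs already shown (rebuilt from the session on each request). The names here are illustrative:

```python
def filter_seen(articles, seen_urls):
    """Drop articles already shown this session, then record the new ones."""
    fresh = [a for a in articles if a["url"] not in seen_urls]
    seen_urls.update(a["url"] for a in fresh)  # mutate the session-backed set
    return fresh
```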
Challenge 4 — Tavily rate limits during development
Root cause: The free tier allows 1,000 searches per month. Active development with hot-reloading can burn through this quickly if every page refresh triggers a search.
Fix: Add a local mock fixture (a saved Tavily JSON response) for development mode. Toggle between live and mock search with an environment variable (TAVILY_MOCK=true). This also makes unit testing the prompt assembly logic fast and deterministic.
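A sketch of the toggle, assuming a fixture file saved from one genuine Tavily response; `live_search_fn` stands in for the real search call:

```python
import json
import os

def search_news(query, live_search_fn, fixture_path="tavily_fixture.json"):
    """Serve a canned Tavily response in dev mode; hit the live API otherwise."""
    if os.environ.get("TAVILY_MOCK", "").lower() == "true":
        # Fixture mirrors Tavily's response shape: {"results": [...]}.
        with open(fixture_path) as fh:
            return json.load(fh)["results"]
    return live_search_fn(query)
```

With `TAVILY_MOCK=true` exported in the development shell, every hot reload costs zero searches and prompt-assembly tests become fully deterministic.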
Challenge 5 — XSS from AI-generated links
Root cause: GPT-4o-mini sometimes emits raw HTML anchor tags in its responses, an artefact of training on web content full of inline links. Inserting these directly into the DOM with innerHTML is an XSS vulnerability.
Fix: Always pipe rendered markdown through DOMPurify before DOM insertion. DOMPurify strips dangerous attributes (onerror, javascript: hrefs) while preserving safe formatting.
Challenge 6 — Multi-topic conversation context confusion
Root cause: A user who asks about "AI news" then "stock market news" then "climate policy" in a single session builds a conversation history that spans three unrelated domains. The model may produce confused responses that blend topics.
Fix: Include only the last 6–10 messages in the context window (not the full session history). Recency bias naturally prevents cross-topic contamination while still supporting natural follow-up questions within a topic thread.
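A minimal truncation helper along those lines, implementing the 6-10 message window with a small guard so the window always opens on a user turn:

```python
def recent_context(history, limit=8):
    """Trim conversation history to the most recent turns.

    Recency-based truncation keeps follow-ups coherent while letting stale,
    unrelated topics fall out of the window.
    """
    window = history[-limit:]
    # Preserve sane role alternation: drop a leading assistant turn, if any,
    # so the window never starts mid-exchange.
    if window and window[0]["role"] == "assistant":
        window = window[1:]
    return window
```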
All six of these challenges — including live debugging walkthroughs — are covered in detail inside the Codersarts AI News Aggregator course. Seeing the bugs happen in real time, with explanations of why they occur, is far more instructive than reading about them abstractly.
Ready to Build This Yourself?
You have seen the architecture. You understand the data flow. You know the pitfalls. Now it is time to actually build it.
The Codersarts AI News Aggregator course gives you everything you need to go from blank directory to deployed, working application — with video walkthroughs for every single step.
Here is what is included:
✅ Complete Django 5.2 project source code — every file, every line
✅ Full Django REST Framework API implementation with all endpoints
✅ Tavily search integration with date injection and mock fixtures
✅ OpenAI GPT-4o-mini prompt engineering for news summarisation
✅ Conversation history management with session-based isolation
✅ Save article functionality with per-session saved article panel
✅ XSS-safe frontend with DOMPurify and clickable source badges
✅ Gunicorn production configuration with environment variable management
✅ Step-by-step deployment walkthrough on a live server
✅ Debugging walkthroughs for all six common challenges
Tier 1 — $30 — Full source code + complete video tutorial series. Build the entire project from scratch, understand every design decision, and deploy to production.
Tier 2 — $20/hour — Everything in Tier 1 plus a 1:1 session with a Codersarts instructor to customise the project for your specific use case, portfolio, or client requirement.
Conclusion
The AI News Aggregator is a practical, deployable demonstration of retrieval-augmented generation applied to a problem millions of people face every day. By combining Tavily's live search capability with GPT-4o-mini's summarisation power inside a clean Django REST API, you produce a product that is genuinely more useful than any existing free news tool. The architecture is straightforward enough to build in a weekend yet substantial enough to deploy as a portfolio piece, a client project, or the foundation of a commercial SaaS.
Start with Phase 1 — the data models and project scaffold — and you will find that each subsequent phase clicks into place naturally. The Codersarts course is there when you want guided walkthroughs instead of solo exploration.


