Building an AI Book Recommender with Kimi K2 and Streamlit
Introduction
Finding the next great book is harder than it sounds. Generic bestseller lists ignore your taste, and search engines return the same ten titles for every query. What most readers need is a recommendation that actually understands them — their preferred themes, emotional tone, narrative pace, and the books they already love.
In this tutorial, we build an AI-powered Book Recommender using Kimi K2, Moonshot AI’s flagship agentic model. The user describes their reading taste in plain language and the agent returns a curated list of 6 books, each with a personalised reason, plus a suggested reading order.
What makes this more than a single API call is the 3-phase multi-turn workflow: the model first builds a deep reader profile, then selects books against that profile, and finally arranges them into a thoughtful reading sequence. Each phase builds directly on the previous one, producing output that feels genuinely reasoned rather than randomly generated.

What We’re Building
The agent follows a structured 3-phase workflow:
Phase | Stage | What Happens |
1 | Profile | Analyses the reader’s taste and identifies key reading traits |
2 | Recommendations | Selects 6 books matched precisely to the profile |
3 | Reading Order | Arranges the books into an optimal reading sequence |
Each phase feeds into the next. The book selection in Phase 2 is directly grounded in the traits identified in Phase 1, and the reading order in Phase 3 considers the emotional arc and difficulty curve of the specific 6 books chosen — not a generic ordering.
The result is a Streamlit web app where users describe their taste, click a button, and receive a downloadable reading list.
Tech Stack
Component | Tool |
AI Model | Kimi K2 (kimi-k2) |
API Provider | Moonshot (official Kimi API) |
API Client | openai Python SDK (OpenAI-compatible) |
UI Framework | Streamlit |
Env Management | python-dotenv |
Why Kimi K2? It is purpose-built for agentic workflows and offers a long context window (up to 256K tokens, depending on the model version you select). Long reader descriptions, detailed book analysis, and multi-phase reasoning all fit comfortably within a single context.
Project Structure
kimi/
├── book_agent.py # Agent logic — API calls, 3-phase workflow
├── app.py # Streamlit UI
├── requirements.txt # Dependencies
└── .env # API key and config (not committed to git)
Setting Up
1. Install Dependencies
pip install openai streamlit python-dotenv
2. Configure Environment
Create a .env file in the project folder:
KIMI_PROVIDER=moonshot
MOONSHOT_API_KEY=your_moonshot_api_key_here
Get your Moonshot API key at platform.moonshot.cn — sign up, navigate to API Keys, and generate a key.
Building the Agent — book_agent.py
Client Setup
We load credentials from .env and create a single API client pointed at Moonshot’s OpenAI-compatible endpoint. The openai SDK works here unchanged because Moonshot’s API mirrors the OpenAI interface.
import os
from dotenv import load_dotenv
from openai import OpenAI
load_dotenv() # must run before os.getenv — injects .env values into the process environment
API_ENDPOINT = os.getenv("KIMI_BASE_URL", "https://api.moonshot.cn/v1") # Moonshot's OpenAI-compatible base URL
SECRET_KEY = os.getenv("MOONSHOT_API_KEY", "") # API key from platform.moonshot.cn
LLM_MODEL = os.getenv("KIMI_MODEL", "kimi-k2") # default to kimi-k2; override via KIMI_MODEL in .env
llm = OpenAI(api_key=SECRET_KEY, base_url=API_ENDPOINT) # single client instance reused across all three phases
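Because SECRET_KEY falls back to an empty string, a missing key would only surface when the first request is rejected. The following is a minimal sketch of failing fast at startup instead. The names load_settings and Settings are hypothetical and do not appear in the original project; the defaults simply mirror the values used in the client setup above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Connection settings for the Kimi client (hypothetical helper type)."""
    api_key: str
    base_url: str
    model: str


def load_settings(env=None) -> Settings:
    """Read the same variables as the client setup and reject a missing key."""
    env = os.environ if env is None else env
    api_key = env.get("MOONSHOT_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("MOONSHOT_API_KEY is not set; add it to your .env file")
    return Settings(
        api_key=api_key,
        base_url=env.get("KIMI_BASE_URL", "https://api.moonshot.cn/v1"),
        model=env.get("KIMI_MODEL", "kimi-k2"),
    )
```

One way to use it would be to call load_settings() right after load_dotenv() and pass its fields to OpenAI(...), leaving the rest of the module unchanged.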
We also define the model's persona as a system prompt that persists for the entire conversation:
CURATOR_PERSONA = (
"You are an expert literary curator with deep knowledge across all genres, "
"cultures, and eras of literature. You give thoughtful, personalised book " # 'personalised' signals British-English style — intentional for literary tone
"recommendations based on the reader's specific tastes. Be specific — always "
"name real books with real authors." # prevents the model from inventing fictional titles
)
The Core Chat Function
ask_model handles every API call in the agent. It appends the user message, sends the full conversation history, and returns both the reply and any internal reasoning the model produced.
def ask_model(conversation: list, user_text: str) -> tuple[str, str]:
    conversation.append({"role": "user", "content": user_text})  # add the new message before sending; the full history goes with every call
    response = llm.chat.completions.create(
        model=LLM_MODEL,
        messages=conversation,  # full conversation so far, which gives the model memory of all prior phases
    )
    message = response.choices[0].message
    reply = message.content or ""  # the model's visible response; content can be None, hence the fallback to ""
    reasoning = getattr(message, "reasoning_content", "") or ""  # Kimi-specific internal chain-of-thought; getattr avoids AttributeError when the field is not returned
    assistant_entry = {"role": "assistant", "content": reply}
    if reasoning:
        assistant_entry["reasoning_content"] = reasoning  # attach reasoning to history so later phases can reference prior thinking
    conversation.append(assistant_entry)
    return reply, reasoning  # reply goes to the UI; reasoning is stored and shown in the "Model Reasoning" expander
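To see how the shared history grows from one phase to the next without spending API credits, the same flow can be run against a stub. This is only a sketch: StubCompletions, stub_llm, and ask_model_with are hypothetical names introduced for illustration, and the stub imitates just the small part of the response shape that ask_model reads.

```python
from types import SimpleNamespace


class StubCompletions:
    """Stands in for llm.chat.completions; reports how many messages it received."""

    def create(self, model, messages):
        text = f"reply after seeing {len(messages)} messages"
        message = SimpleNamespace(content=text)  # deliberately has no reasoning_content
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


stub_llm = SimpleNamespace(chat=SimpleNamespace(completions=StubCompletions()))


def ask_model_with(client, conversation, user_text):
    """Same flow as ask_model, with the client passed in so it can be stubbed."""
    conversation.append({"role": "user", "content": user_text})
    response = client.chat.completions.create(model="stub-model", messages=conversation)
    message = response.choices[0].message
    reply = message.content or ""
    reasoning = getattr(message, "reasoning_content", "") or ""
    entry = {"role": "assistant", "content": reply}
    if reasoning:
        entry["reasoning_content"] = reasoning
    conversation.append(entry)
    return reply, reasoning


history = [{"role": "system", "content": "You are a curator."}]
first, _ = ask_model_with(stub_llm, history, "Phase 1 prompt")
second, _ = ask_model_with(stub_llm, history, "Phase 2 prompt")
roles = [m["role"] for m in history]
print(roles)
```

The second call sees four messages (system, user, assistant, user), which is the mechanism that lets Phase 2 build on Phase 1.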
The 3-Phase Workflow
def curate_books(reader_taste: str, on_stage=None) -> dict:
    conversation = [{"role": "system", "content": CURATOR_PERSONA}]  # system message is always first; it sets the literary curator persona for the entire session
    output = {"reader_taste": reader_taste, "stages": []}  # accumulates all three phases and is returned to the UI when complete

    # --- Phase 1: Reader Profile ---
    # The model identifies 3-4 reading traits from the user's description.
    # These traits are stored in conversation so Phase 2 can ground its selections in them.
    if on_stage:
        on_stage("Building your reader profile...")  # updates the UI stage card; optional so the function works without a UI
    profile, reasoning1 = ask_model(
        conversation,
        f"A reader has described their taste as follows:\n\n\"{reader_taste}\"\n\n"  # embeds the raw user input verbatim so the model works from exact words, not a summary
        "Analyse their preferences. Identify 3-4 key reading traits "  # fixed number forces the model to be selective rather than listing everything
        "(e.g. preferred themes, pacing, narrative style, emotional tone) "  # examples guide the model toward the right abstraction level
        "and explain what each tells you about what they will enjoy."  # asking it to explain why ensures each trait is reasoned, not just labelled
    )
    output["stages"].append({"phase": "Profile", "content": profile, "reasoning": reasoning1})  # reasoning1 stored separately; shown in the UI expander, not mixed into main content

    # --- Phase 2: Recommendations ---
    # The model selects 6 books. Because conversation contains Phase 1's trait analysis,
    # each selection is justified against a specific identified trait, not chosen generically.
    if on_stage:
        on_stage("Curating your reading list...")  # updates the UI stage card to show Phase 2 is active
    book_list, reasoning2 = ask_model(
        conversation,
        "Based on the reader profile you just built, recommend exactly 6 books. "  # "you just built" refers the model back to Phase 1 output already in conversation
        "For each book include: title, author, genre, and 2-3 sentences explaining "
        "precisely why it matches this reader's taste. "  # "precisely why" prevents vague matches like "this is a good book"
        "Format each as:\n**[Title]** by [Author] *(Genre)*\n[Reason]"  # explicit format string so the UI can render markdown without post-processing
    )
    output["stages"].append({"phase": "Recommendations", "content": book_list, "reasoning": reasoning2})  # reasoning2 captured but not shown in the main UI; available for debugging

    # --- Phase 3: Reading Order ---
    # The model orders the specific 6 books it just chose, not a generic rule.
    # It considers emotional journey, difficulty curve, and thematic flow across those titles.
    if on_stage:
        on_stage("Suggesting a reading order...")  # updates the UI stage card to show Phase 3 is active
    reading_order, reasoning3 = ask_model(
        conversation,
        "Now suggest a reading order for the 6 books you recommended. "  # "you recommended" anchors the order to the exact 6 books in conversation, not a generic list
        "Number them 1-6 and give a one-sentence reason for each position; "
        "consider emotional journey, difficulty curve, and thematic flow."  # three explicit criteria prevent the model from defaulting to alphabetical or publication order
    )
    output["stages"].append({"phase": "Reading Order", "content": reading_order, "reasoning": reasoning3})  # final stage; after this the output dict is complete

    return output  # all three phases are in output["stages"]; the UI iterates over this list to render each section
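The Phase 2 prompt asks for exactly 6 books in a fixed format, but a model may not always comply. A small check along these lines could confirm the count before Phase 3 runs. It is a sketch under assumptions: parse_books and BOOK_LINE are hypothetical names, the pattern only recognises lines that follow the requested format closely, and the sample text is illustrative rather than real model output.

```python
import re

# Matches lines such as: **Title** by Author *(Genre)*
BOOK_LINE = re.compile(
    r"\*\*(?P<title>.+?)\*\*\s+by\s+(?P<author>.+?)\s+\*\((?P<genre>.+?)\)\*"
)


def parse_books(markdown_text: str) -> list:
    """Return one dict per line that follows the requested format."""
    return [match.groupdict() for match in BOOK_LINE.finditer(markdown_text)]


# Illustrative sample in the format requested by the Phase 2 prompt.
sample = (
    "**Steppenwolf** by Hermann Hesse *(Philosophical fiction)*\n"
    "An illustrative reason.\n\n"
    "**Nausea** by Jean-Paul Sartre *(Existentialist novel)*\n"
    "Another illustrative reason."
)
books = parse_books(sample)
print(len(books), [book["title"] for book in books])
```

A caller could compare len(parse_books(book_list)) against 6 and, if the count is off, ask the model to correct its list before moving on.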
Building the UI — app.py
The UI uses a 3-card pipeline display that transitions from neutral to active to complete as each phase finishes, giving the user clear visual feedback on progress.
import os # reads KIMI_PROVIDER from environment after load_dotenv() runs
import streamlit as st
from dotenv import load_dotenv # loads .env so os.getenv can read the provider setting in the UI
from book_agent import curate_books # imports the 3-phase agent function from the backend module
load_dotenv() # must run before any os.getenv call — injects .env values into the process environment
st.set_page_config(
    page_title="Book Recommender",
    page_icon="📚",
    layout="wide",  # full browser width; better for a multi-column results layout
    initial_sidebar_state="collapsed",  # no sidebar content; collapsed to avoid the slide animation
)
PHASES = [
    ("🧑‍🎨", "Profile", "Analyse reading taste"),  # Phase 1: reader trait analysis
    ("📖", "Recommendations", "Curate 6 books"),  # Phase 2: book selection
    ("🗂️", "Reading Order", "Suggest reading sequence"),  # Phase 3: ordering
]  # PHASES is defined at module level so render_pipeline can reference it without parameters
The pipeline renderer updates the same placeholder on every progress callback, so the cards animate in place without re-rendering the whole page:
def render_pipeline(active_index: int, done_indices: list):
    cols = st.columns(3)  # one column per phase, equal width
    for i, (icon, label, desc) in enumerate(PHASES):
        with cols[i]:
            if i in done_indices:
                st.success(f"{icon} **{label}**\n\n{desc}")  # green: phase complete
            elif i == active_index:
                st.info(f"{icon} **{label}**\n\n_{desc}_")  # blue: phase currently running
            else:
                st.container(border=True).markdown(  # neutral border: phase not yet started
                    f"{icon} **{label}**\n\n{desc}")
The input is a multi-line text area rather than a single-line field, giving readers space to describe their taste in natural, detailed language:
reader_taste = st.text_area(
    "Describe your reading taste",
    placeholder=(
        "e.g. I love slow-burn literary fiction with morally complex characters. "
        "I enjoyed The Secret History, Normal People, and anything by Kazuo Ishiguro."
    ),
    height=120,  # tall enough for a detailed description; more input gives better recommendations
    label_visibility="collapsed",  # label hidden; the placeholder is descriptive enough
)
run = st.button(
    "Get Recommendations",
    type="primary",  # renders as a filled primary button, visually prominent
    disabled=not reader_taste.strip(),  # greyed out until the user types something; .strip() blocks whitespace-only input
)
Results are stored in st.session_state so they survive Streamlit re-renders without re-calling the agent:
with st.spinner(""):  # shows a loading indicator while the three API calls complete
    result = curate_books(
        reader_taste.strip(),  # .strip() removes leading/trailing whitespace before sending to the model
        on_stage=on_stage,  # passes the progress callback so each phase updates the pipeline card
    )
st.session_state["result"] = result  # store in session_state so results survive Streamlit re-renders (e.g. when the download button is clicked)
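The excerpt passes an on_stage callback whose definition is not shown. One possible shape, sketched here under the assumption that curate_books calls the callback once per phase and in order (as the agent code does), is a closure that counts calls to work out which card is active. The names make_stage_tracker and render are hypothetical; in the app, render could be a thin wrapper that redraws the pipeline cards.

```python
def make_stage_tracker(render):
    """Build an on_stage callback that counts calls to find the active phase.

    render is any callable taking (active_index, done_indices, message).
    """
    calls = {"count": 0}

    def on_stage(message: str) -> None:
        active = calls["count"]      # 0 on the first call, then 1, then 2
        done = list(range(active))   # every earlier phase is complete
        calls["count"] += 1
        render(active, done, message)

    return on_stage


# Stand-in renderer that records what it was asked to draw.
seen = []
tracker = make_stage_tracker(lambda active, done, msg: seen.append((active, done, msg)))
for text in (
    "Building your reader profile...",
    "Curating your reading list...",
    "Suggesting a reading order...",
):
    tracker(text)
print(seen)
```

Keeping the counter inside the closure means the agent never needs to know about card indices; it only reports that a new stage has started.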
Finally, the full output is available for download:
st.download_button(
    label="Download as Markdown",
    data=md,  # md is a string built by concatenating all three phase outputs with markdown headers
    file_name="book_recommendations.md",  # fixed filename; the content already includes the reader's taste description at the top
    mime="text/markdown",  # tells the browser this is a .md file, which triggers the correct file association on download
)
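The md string passed to the download button is built elsewhere in the app and is not shown in the excerpt. A minimal sketch of how it might be assembled from the dict returned by curate_books follows. The helper name build_markdown, the top-level title, and the blockquote used for the taste description are all assumptions rather than the original implementation.

```python
def build_markdown(result: dict) -> str:
    """Join the reader's taste and each stage's content under markdown headers."""
    parts = ["# Book Recommendations", "", f"> {result['reader_taste']}", ""]
    for stage in result["stages"]:
        parts.append(f"## {stage['phase']}")
        parts.append("")
        parts.append(stage["content"].strip())
        parts.append("")
    return "\n".join(parts).rstrip() + "\n"


# Illustrative result with the same shape curate_books returns.
example = {
    "reader_taste": "slow-burn literary fiction",
    "stages": [
        {"phase": "Profile", "content": "Trait A", "reasoning": ""},
        {"phase": "Recommendations", "content": "Book list", "reasoning": ""},
        {"phase": "Reading Order", "content": "1. First book", "reasoning": ""},
    ],
}
md = build_markdown(example)
print(md)
```

The reasoning field is left out of the download on purpose in this sketch, so the file contains only the reader-facing content.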
Running the App
streamlit run app.py
Open your browser at http://localhost:8501. Describe your reading taste in the text area (the more specific the better) and click Get Recommendations.
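The agent works through its three phases, updating the pipeline cards as it goes, and displays the full reading list once the third call returns. The total wait depends on model latency, because the three API calls run one after another.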
Output and What to Expect
For a reader who described enjoying slow-paced literary fiction with philosophical themes (similar to Dostoevsky and Hesse), the agent produces:
Profile: Traits such as “drawn to existential introspection”, “prefers dense, layered prose”, and “values moral ambiguity over resolution”
Recommendations: 6 books like Steppenwolf by Hermann Hesse, The Master and Margarita by Bulgakov, and Nausea by Sartre — each with a 2-3 sentence explanation tied directly to the identified traits
Reading Order: An ordered sequence that starts with the most accessible title and builds toward the most philosophically demanding, with a one-sentence rationale for each position
The model’s internal reasoning (visible in the “Model Reasoning” expander inside the Profile card) shows how it connects the reader’s stated preferences to the trait labels it generates, making the recommendation logic transparent.
Who Can Benefit
Avid readers who have exhausted recommendations from friends and want something genuinely personalised
Book clubs looking for a curated selection matched to the group’s collective taste
Librarians and educators who need to match readers to titles quickly and accurately
Developers learning to build multi-phase AI agents with persistent conversation context
How Codersarts Can Help
Building AI agents like this one requires solid understanding of multi-turn reasoning, API integration, and production-ready UI design. If you need help implementing a custom AI agent for your project (book recommendations, business automation, or something entirely different), Codersarts offers end-to-end development and mentorship support.
Custom AI agent development tailored to your use case
One-on-one mentorship and code reviews
Project-based learning with real-world applications
Get in touch: codersarts.com | contact@codersarts.com



