
Research Assistant with AI Sampling





Assignment Overview

Scenario: You are a research engineer at an academic institution building tools to help researchers manage and analyze scientific literature. Your task is to create an advanced MCP server that not only provides access to research papers but also uses AI sampling (server-initiated LLM calls) to generate intelligent summaries, extract key findings, and compare papers. This assignment builds on Assignment 1 by adding Module 4 concepts: sampling, production patterns, and advanced deployment options.




Learning Objectives:


  • Implement server-initiated AI sampling with user approval flow

  • Build intelligent tools that leverage LLM capabilities

  • Add structured logging for observability and debugging

  • Create health monitoring resources with metrics

  • Support multiple transport options (stdio and SSE)

  • Write comprehensive tests including sampling scenarios

  • Deploy a production-ready MCP server





Functional Requirements




MCP Tools with Sampling (Module 4 Concepts)

Implement the following tools. Tools marked with ⚡ must use AI sampling:


| Tool Name | Sampling? | Parameters | Description |
|---|---|---|---|
| list_papers | — | directory: str (optional, default: "./papers"), format: str (optional) | Lists all papers in the directory. Returns an array with {filename, title, format, size, date_added} |
| read_paper | — | filename: str, section: str (optional) | Reads full paper content or a specific section (abstract, introduction, methods, results, conclusion) |
| summarize_paper | ⚡ | filename: str, audience: str (default: "general"), max_length: int (default: 300) | Uses AI sampling to generate an audience-appropriate summary. Audiences: general, expert, student, executive |
| extract_findings | ⚡ | filename: str, focus: str (optional) | Uses AI sampling to extract key findings, methodology, and results. Focus can be: methodology, results, implications |
| compare_papers | ⚡ | filename1: str, filename2: str, aspect: str (default: "all") | Uses AI sampling to compare two papers. Aspects: methodology, findings, conclusions, all |
| search_papers | — | query: str, search_in: list[str] (optional, default: ["title", "abstract", "content"]) | Full-text search across papers. Returns matching papers with context snippets |
| add_annotation | — | filename: str, page: int, annotation_text: str, annotation_type: str | Adds an annotation to a paper. Types: highlight, note, question, critique |
| generate_bibliography | ⚡ | papers: list[str], style: str (default: "APA") | Uses AI sampling to generate a formatted bibliography. Styles: APA, MLA, Chicago, IEEE |




Sampling Implementation Requirements:


  • Implement _sample() wrapper function for all sampling calls

  • Use ModelPreferences with intelligence=0.85 for academic accuracy

  • Include approval dialog messages in Claude Desktop

  • Implement graceful fallback when sampling is unavailable (ctx is None)

  • Log all sampling events: start, completion, token count, duration

  • Handle sampling errors gracefully and return user-friendly messages




MCP Resources with Health Monitoring

Implement the following resources:


| Resource URI | Type | Description |
|---|---|---|
| paper:///{filename} | Template | Direct access to paper content via custom URI scheme |
| library://index | Static | JSON index of all papers with metadata (title, authors, year, format) |
| library://by-author/{author} | Template | Papers filtered by author name |
| library://by-year/{year} | Template | Papers filtered by publication year |
| annotations:///{filename} | Template | All annotations for a specific paper |
| health://server/status | Static | Server health: uptime, tool call counts, sampling stats, error rate |
| health://server/metrics | Static | Detailed metrics: papers indexed, total sampling calls, avg response time |
| config://server/info | Static | Server metadata: version, supported formats, directories |



Health Resource Requirements:


  • Track server uptime since startup

  • Count tool calls by tool name

  • Count sampling calls with success/failure rates

  • Calculate average response times for tools and sampling

  • Track error rates (5xx errors, validation failures, sampling errors)

  • Include cache hit/miss statistics

  • Update metrics in real-time as operations occur
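
The health requirements above can be backed by a small in-memory tracker; the class name, method names, and snapshot keys below are illustrative choices, not mandated by the assignment:

```python
import time
from collections import Counter

class HealthMetrics:
    """In-memory counters backing the health:// resources."""

    def __init__(self):
        self._started = time.monotonic()
        self.tool_calls = Counter()   # call count per tool name
        self.sampling_ok = 0
        self.sampling_failed = 0
        self.cache_hits = 0
        self.cache_misses = 0
        self._durations = []          # per-operation durations in seconds

    def record_tool(self, name: str, duration: float) -> None:
        self.tool_calls[name] += 1
        self._durations.append(duration)

    def record_sampling(self, ok: bool) -> None:
        if ok:
            self.sampling_ok += 1
        else:
            self.sampling_failed += 1

    def snapshot(self) -> dict:
        """Serializable view for health://server/status and /metrics."""
        total = self.sampling_ok + self.sampling_failed
        return {
            "uptime_seconds": round(time.monotonic() - self._started, 1),
            "tool_calls": dict(self.tool_calls),
            "sampling": {
                "total": total,
                "success_rate": self.sampling_ok / total if total else None,
            },
            "avg_response_time": (
                sum(self._durations) / len(self._durations)
                if self._durations else None
            ),
            "cache": {"hits": self.cache_hits, "misses": self.cache_misses},
        }
```

Calling `record_tool()` and `record_sampling()` from inside your tool handlers keeps the metrics updated in real time, so the health resources just return `snapshot()`.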




MCP Prompts with EmbeddedResource

Create the following prompt templates:


| Prompt Name | Arguments | Purpose |
|---|---|---|
| summarize-paper | filename: str, audience: str (default: "general") | Generates a paper summary with EmbeddedResource; no copy-paste needed |
| literature-review | topic: str, papers: list[str], style: str (default: "comparative") | Creates a structured literature review from multiple papers. Styles: comparative, chronological, thematic |
| research-questions | filename: str, focus: str (optional) | Generates research questions based on gaps in the paper, using EmbeddedResource |
| explain-methodology | filename: str, detail_level: str (default: "medium") | Explains the research methodology for different audiences, using EmbeddedResource |
| critique-paper | filename: str, criteria: list[str] (optional) | Academic critique based on criteria: methodology, evidence, logic, impact |





Technical Requirements




Data Management


  • Support multiple paper formats: .txt, .md, .pdf (use pdfplumber library)

  • Create papers/ directory structure: papers/raw/ for originals, papers/processed/ for extracted text

  • Build paper index (library_index.json) with metadata extraction

  • Store annotations in annotations.json with timestamps

  • Implement file watching to auto-update the index when new papers are added (optional bonus)

  • Handle large papers efficiently (stream processing for papers > 10 MB)
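
An index builder for `library_index.json` might look like the sketch below. The metadata fields match the `list_papers` return shape; richer fields such as title and authors would require parsing each paper, which is omitted here:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

SUPPORTED = {".txt", ".md", ".pdf"}

def build_index(papers_dir: str = "./papers",
                out: str = "library_index.json") -> dict:
    """Scans the paper directory and writes a JSON metadata index."""
    entries = []
    for path in sorted(Path(papers_dir).glob("*")):
        if path.suffix.lower() not in SUPPORTED:
            continue
        stat = path.stat()
        entries.append({
            "filename": path.name,
            "format": path.suffix.lstrip("."),
            "size": stat.st_size,
            "date_added": datetime.fromtimestamp(
                stat.st_mtime, tz=timezone.utc).isoformat(),
        })
    index = {"generated": datetime.now(timezone.utc).isoformat(),
             "papers": entries}
    Path(out).write_text(json.dumps(index, indent=2))
    return index
```

Regenerating this file on startup (and, for the bonus, on file-watch events) keeps `library://index` consistent with the directory contents.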




Structured Logging

Implement StructuredLogger class that emits JSON logs to stderr:


  • Log events: tool_called, tool_completed, sampling_start, sampling_complete, resource_read, error

  • Include timestamps (ISO 8601 format), event type, duration, parameters

  • For tool calls: log tool name, parameters (sanitized), success/failure, duration

  • For sampling: log model, token count, intelligence level, approval status

  • For errors: log error type, message, stack trace (truncated), context

  • Use JSON Lines format (one JSON object per line)

  • Provide log levels: DEBUG, INFO, WARNING, ERROR
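
A minimal `StructuredLogger` satisfying the requirements above could be sketched as follows; the exact field names in each record are illustrative:

```python
import json
import sys
import traceback
from datetime import datetime, timezone

class StructuredLogger:
    """Emits one JSON object per line (JSON Lines) to stderr."""

    LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")

    def __init__(self, level: str = "INFO", stream=None):
        self.level = level
        self.stream = stream or sys.stderr

    def log(self, level: str, event: str, **fields) -> None:
        # Suppress records below the configured level.
        if self.LEVELS.index(level) < self.LEVELS.index(self.level):
            return
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": level,
            "event": event,  # e.g. tool_called, sampling_start, error
            **fields,
        }
        print(json.dumps(record, default=str), file=self.stream)

    def error(self, event: str, exc: Exception, **fields) -> None:
        tb = traceback.format_exc(limit=3)  # truncated stack trace
        self.log("ERROR", event, error_type=type(exc).__name__,
                 message=str(exc), trace=tb, **fields)
```

Writing to stderr matters for stdio transport: stdout carries the MCP protocol stream, so any log line printed there would corrupt it.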




Transport Options

Support both stdio and SSE transports via CLI arguments:


  • Default: stdio transport (for direct Claude Desktop integration)

  • SSE option: HTTP server using Starlette + uvicorn

  • CLI flags: --transport [stdio|sse] --port [port] --host [host]

  • Example: python research_server.py --transport sse --port 3001 --host 0.0.0.0

  • For SSE: implement /sse endpoint, proper CORS headers, health check endpoint

  • Graceful shutdown: handle SIGINT and SIGTERM signals

  • Include startup banner showing transport mode, port (if SSE), paper directory
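
The CLI flags above map directly onto an `argparse` entry point; the `--papers-dir` flag is an extra convenience not required by the assignment:

```python
import argparse

def parse_args(argv=None) -> argparse.Namespace:
    """Parses transport-selection flags for the server entry point."""
    parser = argparse.ArgumentParser(
        description="Research Assistant MCP server")
    parser.add_argument("--transport", choices=["stdio", "sse"],
                        default="stdio",
                        help="stdio for Claude Desktop, sse for HTTP clients")
    parser.add_argument("--port", type=int, default=3001,
                        help="HTTP port (SSE mode only)")
    parser.add_argument("--host", default="127.0.0.1",
                        help="bind address (SSE mode only)")
    parser.add_argument("--papers-dir", default="./papers",
                        help="directory containing the paper library")
    return parser.parse_args(argv)
```

Your `main()` can then branch on `args.transport`, running the FastMCP stdio loop by default or mounting the Starlette/uvicorn SSE app when `--transport sse` is given.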




Error Handling & Production Readiness


  • Implement handle_exception() for consistent error handling

  • Create ERROR_MESSAGES dict for user-friendly error templates

  • Validate file access with safe_path() to prevent directory traversal

  • Handle missing papers gracefully (404-like behavior)

  • Timeout protection for sampling calls (max 60 seconds)

  • Retry logic for transient failures (file locks, network issues)

  • Rate limiting for sampling calls (max 10/minute to prevent abuse)

  • Input sanitization for all user-provided strings
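
One way to implement the directory-traversal guard is to resolve the requested path and verify it stays inside the papers directory; the function signature is illustrative:

```python
from pathlib import Path

PAPERS_DIR = Path("./papers").resolve()

def safe_path(filename: str, base: Path = PAPERS_DIR) -> Path:
    """Resolves filename inside base, rejecting directory traversal.

    Raises ValueError for inputs like "../../etc/passwd" whose
    resolved path escapes the papers directory.
    """
    candidate = (base / filename).resolve()
    if base not in candidate.parents and candidate != base:
        raise ValueError(
            f"Access outside paper directory refused: {filename}")
    return candidate
```

Comparing resolved paths (rather than string prefixes) also catches tricks like `papers_evil/` sharing a prefix with `papers/`, and symlinked escapes on platforms where `resolve()` follows links.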





Testing Requirements




Comprehensive Test Suite:


  • Minimum 40 test cases covering all tools, resources, and prompts

  • Test sampling scenarios: mock ctx object, test approval flow, test fallback behavior

  • Test all three paper formats: .txt, .md, .pdf

  • Test error conditions: missing files, corrupted PDFs, invalid parameters

  • Test health metrics: verify counters update, verify uptime calculation

  • Test resource subscriptions: verify notifications sent on index changes

  • Test structured logging: verify log events emitted with correct structure

  • Test transport options: separate test suites for stdio and SSE

  • Performance tests: measure response time for large papers (> 1 MB)

  • Integration tests: end-to-end workflows (search → read → summarize)
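
Sampling tools can be tested without Claude Desktop by faking the context object. The sketch below assumes the same `ctx.session.create_message(...)` call shape used elsewhere in this assignment; `make_mock_ctx` and `summarize_with` are hypothetical names for your conftest fixture and a sampling-backed helper:

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock

def make_mock_ctx(reply: str = "Mocked summary."):
    """Builds a fake MCP context whose sampling call returns a canned reply."""
    message = SimpleNamespace(content=SimpleNamespace(text=reply))
    session = SimpleNamespace(create_message=AsyncMock(return_value=message))
    return SimpleNamespace(session=session)

# Example of a sampling-backed helper exercised with the mock.
async def summarize_with(ctx, text: str) -> str:
    result = await ctx.session.create_message(
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
        max_tokens=300,
    )
    return result.content.text
```

Because `AsyncMock` records its calls, tests can also assert on `await_count` and the prompt that was sent, which covers the approval-flow and fallback scenarios without a live LLM.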




Sample Test Structure:



tests/
├── conftest.py             # Fixtures: temp dirs, sample papers, mock ctx
├── test_tools_basic.py     # Non-sampling tools
├── test_tools_sampling.py  # Sampling tools with mocked LLM
├── test_resources.py       # Resource resolution and caching
├── test_prompts.py         # Prompt rendering with EmbeddedResource
├── test_health_metrics.py  # Health monitoring accuracy
├── test_logging.py         # Structured logging validation
├── test_transport_stdio.py # stdio transport
├── test_transport_sse.py   # SSE transport
├── test_paper_formats.py   # .txt, .md, .pdf handling
└── test_integration.py     # End-to-end workflows





Deliverables

Submit a ZIP file named StudentID_Assignment2_ResearchAssistant.zip containing:


  • research_server.py - Complete MCP server with sampling and SSE support

  • structured_logger.py - Logging module (or integrated in main file)

  • tests/ directory - Complete test suite (40+ tests)


  • papers/ directory - Sample papers:

    • At least 5 papers in different formats (.txt, .md, .pdf)

    • Varied topics and lengths for testing


  • library_index.json - Generated paper index

  • annotations.json - Sample annotations on papers

  • claude_desktop_config.json - Configuration for stdio mode

  • requirements.txt - All dependencies with pinned versions


  • README.md - Comprehensive documentation:


    • Setup instructions for both stdio and SSE modes

    • How to add new papers to the library

    • How to run tests with pytest

    • Example interactions (5-7 scenarios)

    • Sampling approval flow explanation

    • Health metrics interpretation guide

    • Architecture diagram (optional but recommended)

    • Known limitations and future work


  • DEMO.md or video link - Demonstration of:


    • Basic tools (list, read, search)

    • Sampling tools with approval dialog

    • Health monitoring resources

    • Prompt templates with EmbeddedResource

    • SSE mode running on HTTP server





Grading Rubric


| Criteria | Points | Description |
|---|---|---|
| Tool Implementation (Basic) | 15 | Non-sampling tools (list, read, search, annotate): correct functionality |
| Tool Implementation (Sampling) | 20 | Sampling tools with proper _sample() wrapper, approval flow, fallback |
| Resource Implementation | 15 | 8 resources including custom URI schemes and health metrics |
| Prompt Templates | 10 | 5 prompts with EmbeddedResource and proper argument handling |
| Structured Logging | 10 | JSON logging with all required events, proper structure |
| Transport Options | 10 | Both stdio and SSE working, CLI argument parsing, graceful shutdown |
| Testing | 20 | Comprehensive test suite (40+ tests), sampling mocks, coverage > 75% |
| Health Monitoring | 10 | Accurate metrics, uptime tracking, performance stats |
| Code Quality | 10 | Type hints, docstrings, error handling, modular design |
| Documentation | 5 | Clear README, setup guide, examples, architecture overview |
| Demo | 5 | Video/documentation showing key features and the sampling flow |
| Total | 130 | Bonus: +20 points for file watching, advanced metrics, or Docker deployment |





Submission Guidelines




Due Date & Platform


  • Submission Deadline: [Date] at 11:59 PM (Late penalty: 10% per day, max 3 days)

  • Submission Platform: Upload to [Moodle/Canvas/Blackboard] under "Assignment 2" section

  • File Size Limit: 100 MB (compress sample papers if needed, use smaller PDFs)

  • Format: ZIP file only

  • Resubmission: Allowed until deadline; latest submission graded




Testing Before Submission

Run these commands to verify your submission:


  • pytest -v                                    # All tests pass

  • pytest --cov=research_server --cov-report=term-missing   # Coverage > 75%

  • python research_server.py --transport stdio  # Starts without errors

  • python research_server.py --transport sse --port 3001    # HTTP server starts

  • python research_server.py --help             # Shows usage information




Academic Integrity


  • This is an individual assignment. Collaboration on concepts is allowed, but all code must be original.

  • You may reuse patterns from course modules, but Task Manager code from Assignment 1 should not be directly copied.

  • Cite any external libraries beyond standard MCP SDK (e.g., pdfplumber, starlette).

  • AI tools may be used for learning, but you must understand and explain your implementation.

  • Plagiarism detection will be used. Violations result in zero credit and disciplinary action.





Tips for Success


  • Build incrementally - Start with Assignment 1 patterns, add sampling, then logging, then SSE

  • Test sampling early - Mock the ctx object in tests to avoid dependency on Claude Desktop

  • Use Capstone as reference - Review research_assistant_server.py for sampling patterns

  • Start with .txt papers - Add .md and .pdf support after basic tools work

  • Log everything - Structured logs will help debug sampling and transport issues

  • Test both transports - Run separate test sessions for stdio and SSE

  • Monitor health metrics - Use health resources to verify tool calls are tracked

  • Document assumptions - If requirements are unclear, state your interpretation in README

  • Create good sample data - Include diverse papers to showcase search and comparison

  • Record demo early - Capture working features before final changes to avoid time pressure





Call to Action

Ready to transform your business with AI-powered intelligence that accelerates insights, enhances decision-making, and unlocks the full value of your data?


Codersarts is here to help you turn complex data workflows into efficient, scalable, and evidence-driven AI systems that empower teams to make smarter, faster, and more confident decisions.


Whether you’re a startup looking to build AI-driven products, an enterprise aiming to optimize operations through data science, or a research organization advancing innovation with intelligent data solutions, we bring the expertise and experience needed to design, develop, and deploy impactful AI systems that drive measurable business outcomes.




Get Started Today



Schedule an AI & Data Science Consultation:

Book a 30-minute discovery call with our AI strategists and data science experts to discuss your challenges, identify high-impact opportunities, and explore how intelligent AI solutions can transform your workflows and performance.




Request a Custom AI Demo:

Experience AI in action with a personalized demonstration built around your business use cases, datasets, operational environment, and decision workflows — showcasing practical value and real-world impact.









Transform your organization from data accumulation to intelligent decision enablement — accelerating insight generation, improving operational efficiency, and strengthening competitive advantage.


Partner with Codersarts to build scalable AI solutions including RAG systems, predictive analytics platforms, intelligent automation tools, recommendation engines, and custom machine learning models that empower your teams to deliver exceptional results.


Contact us today and take the first step toward next-generation AI and data science capabilities that grow with your business ambitions.





