Research Assistant with AI Sampling

Assignment Overview
Scenario: You are a research engineer at an academic institution building tools to help researchers manage and analyze scientific literature. Your task is to create an advanced MCP server that not only provides access to research papers but also uses AI sampling (server-initiated LLM calls) to generate intelligent summaries, extract key findings, and compare papers. This assignment builds on Assignment 1 by adding Module 4 concepts: sampling, production patterns, and advanced deployment options.
Learning Objectives:
Implement server-initiated AI sampling with user approval flow
Build intelligent tools that leverage LLM capabilities
Add structured logging for observability and debugging
Create health monitoring resources with metrics
Support multiple transport options (stdio and SSE)
Write comprehensive tests including sampling scenarios
Deploy a production-ready MCP server
Functional Requirements
MCP Tools with Sampling (Module 4 Concepts)
Implement the following tools. Tools marked with ⚡ must use AI sampling:
| Tool Name | Sampling? | Parameters | Description |
|---|---|---|---|
| list_papers | — | directory: str (optional, default: "./papers"), format: str (optional) | Lists all papers in the directory. Returns an array of {filename, title, format, size, date_added} |
| read_paper | — | filename: str, section: str (optional) | Reads the full paper content or a specific section (abstract, introduction, methods, results, conclusion) |
| summarize_paper | ⚡ | filename: str, audience: str (default: "general"), max_length: int (default: 300) | Uses AI sampling to generate an audience-appropriate summary. Audiences: general, expert, student, executive |
| extract_findings | ⚡ | filename: str, focus: str (optional) | Uses AI sampling to extract key findings, methodology, and results. Focus can be: methodology, results, implications |
| compare_papers | ⚡ | filename1: str, filename2: str, aspect: str (default: "all") | Uses AI sampling to compare two papers. Aspects: methodology, findings, conclusions, all |
| search_papers | — | query: str, search_in: list[str] (optional, default: ["title", "abstract", "content"]) | Full-text search across papers. Returns matching papers with context snippets |
| add_annotation | — | filename: str, page: int, annotation_text: str, annotation_type: str | Adds an annotation to a paper. Types: highlight, note, question, critique |
| generate_bibliography | ⚡ | papers: list[str], style: str (default: "APA") | Uses AI sampling to generate a formatted bibliography. Styles: APA, MLA, Chicago, IEEE |
Sampling Implementation Requirements:
Implement _sample() wrapper function for all sampling calls
Use ModelPreferences with intelligence=0.85 for academic accuracy
Include approval dialog messages in Claude Desktop
Implement graceful fallback when sampling is unavailable (ctx is None)
Log all sampling events: start, completion, token count, duration
Handle sampling errors gracefully and return user-friendly messages
MCP Resources with Health Monitoring
Implement the following resources:
| Resource URI | Type | Description |
|---|---|---|
| paper://{filename} | Template | Direct access to paper content via a custom URI scheme |
| library://index | Static | JSON index of all papers with metadata (title, authors, year, format) |
| library://by-author/{author} | Template | Papers filtered by author name |
| library://by-year/{year} | Template | Papers filtered by publication year |
| annotations://{filename} | Template | All annotations for a specific paper |
| health://server/status | Static | Server health: uptime, tool call counts, sampling stats, error rate |
| health://server/metrics | Static | Detailed metrics: papers indexed, total sampling calls, average response time |
| config://server/info | Static | Server metadata: version, supported formats, directories |
Health Resource Requirements:
Track server uptime since startup
Count tool calls by tool name
Count sampling calls with success/failure rates
Calculate average response times for tools and sampling
Track error rates (5xx errors, validation failures, sampling errors)
Include cache hit/miss statistics
Update metrics in real-time as operations occur
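One way to back these requirements is a single in-memory metrics object that every tool and sampling call updates; the health resources then just serialize a snapshot. Field and key names below are illustrative, not prescribed:

```python
import time
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class ServerMetrics:
    """In-memory counters backing the health:// resources."""
    started_at: float = field(default_factory=time.time)
    tool_calls: Counter = field(default_factory=Counter)
    sampling_ok: int = 0
    sampling_failed: int = 0
    cache_hits: int = 0
    cache_misses: int = 0
    errors: int = 0
    durations: list = field(default_factory=list)

    def record_tool(self, name: str, duration: float, ok: bool = True) -> None:
        self.tool_calls[name] += 1
        self.durations.append(duration)
        if not ok:
            self.errors += 1

    def snapshot(self) -> dict:
        """Build the JSON payload served by health://server/status."""
        total = self.sampling_ok + self.sampling_failed
        calls = sum(self.tool_calls.values())
        return {
            "uptime_seconds": round(time.time() - self.started_at, 1),
            "tool_calls": dict(self.tool_calls),
            "sampling": {"total": total,
                         "success_rate": self.sampling_ok / total if total else None},
            "avg_response_time": (sum(self.durations) / len(self.durations)
                                  if self.durations else None),
            "error_rate": self.errors / calls if calls else 0.0,
            "cache": {"hits": self.cache_hits, "misses": self.cache_misses},
        }
```

Because the counters update inside the same process as the tools, the metrics are real-time by construction; no background aggregation job is needed.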
MCP Prompts with EmbeddedResource
Create the following prompt templates:
| Prompt Name | Arguments | Purpose |
|---|---|---|
| summarize-paper | filename: str, audience: str (default: "general") | Generates a paper summary via EmbeddedResource, so no copy-paste is needed |
| literature-review | topic: str, papers: list[str], style: str (default: "comparative") | Creates a structured literature review from multiple papers. Styles: comparative, chronological, thematic |
| research-questions | filename: str, focus: str (optional) | Generates research questions based on gaps in the paper, using EmbeddedResource |
| explain-methodology | filename: str, detail_level: str (default: "medium") | Explains the research methodology for different audiences, using EmbeddedResource |
| critique-paper | filename: str, criteria: list[str] (optional) | Academic critique based on criteria: methodology, evidence, logic, impact |
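The point of EmbeddedResource is that the paper text travels inside the prompt messages instead of being pasted by the user. A sketch of the summarize-paper prompt: the dict shapes below mirror the mcp.types message and resource objects but are assumptions, and a real server would return the SDK's typed objects rather than plain dicts:

```python
def summarize_paper_prompt(filename: str, paper_text: str,
                           audience: str = "general") -> list[dict]:
    """Render the summarize-paper prompt as MCP-style messages.

    The paper content is attached as an embedded resource under its
    paper:// URI, so the client receives it without any copy-paste.
    """
    return [
        {"role": "user",
         "content": {"type": "text",
                     "text": f"Summarize this paper for a {audience} audience."}},
        {"role": "user",
         "content": {"type": "resource",
                     "resource": {"uri": f"paper://{filename}",
                                  "mimeType": "text/plain",
                                  "text": paper_text}}},
    ]
```

The other four prompts follow the same pattern: one text message carrying the instruction, plus one embedded resource per paper involved.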
Technical Requirements
Data Management
Support multiple paper formats: .txt, .md, .pdf (use pdfplumber library)
Create papers/ directory structure: papers/raw/ for originals, papers/processed/ for extracted text
Build paper index (library_index.json) with metadata extraction
Store annotations in annotations.json with timestamps
Implement file watching to auto-update the index when new papers are added (optional bonus)
Handle large papers efficiently (stream processing for papers > 10 MB)
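A minimal sketch of the index builder for library_index.json. The title heuristic and metadata fields here are placeholders; real metadata extraction would parse the paper text (and use pdfplumber for PDFs):

```python
import json
from pathlib import Path

SUPPORTED = {".txt", ".md", ".pdf"}

def build_index(papers_dir: str = "./papers",
                index_path: str = "library_index.json") -> dict:
    """Scan papers_dir and write the JSON paper index to index_path."""
    index = {}
    for path in sorted(Path(papers_dir).glob("*")):
        if path.suffix.lower() not in SUPPORTED:
            continue  # ignore annotations.json, hidden files, etc.
        index[path.name] = {
            # Naive title guess from the filename; a real implementation
            # would extract the title from the document itself.
            "title": path.stem.replace("_", " ").title(),
            "format": path.suffix.lstrip("."),
            "size_bytes": path.stat().st_size,
        }
    Path(index_path).write_text(json.dumps(index, indent=2))
    return index
```

Rebuilding the index on startup keeps it consistent; the optional file-watching bonus would call build_index() again whenever the directory changes.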
Structured Logging
Implement StructuredLogger class that emits JSON logs to stderr:
Log events: tool_called, tool_completed, sampling_start, sampling_complete, resource_read, error
Include timestamps (ISO 8601 format), event type, duration, parameters
For tool calls: log tool name, parameters (sanitized), success/failure, duration
For sampling: log model, token count, intelligence level, approval status
For errors: log error type, message, stack trace (truncated), context
Use JSON Lines format (one JSON object per line)
Provide log levels: DEBUG, INFO, WARNING, ERROR
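One possible shape for the StructuredLogger class; any field names beyond the required ones (timestamp, level, event) are assumptions:

```python
import json
import sys
import traceback
from datetime import datetime, timezone

class StructuredLogger:
    """Emit one JSON object per line (JSON Lines) to stderr."""
    LEVELS = {"DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40}

    def __init__(self, level: str = "INFO", stream=sys.stderr):
        self.level = self.LEVELS[level]
        self.stream = stream

    def log(self, level: str, event: str, **fields) -> None:
        if self.LEVELS[level] < self.level:
            return  # below the configured threshold
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),  # ISO 8601
            "level": level,
            "event": event,  # tool_called, sampling_start, resource_read, ...
            **fields,
        }
        print(json.dumps(record), file=self.stream)

    def error(self, event: str, exc: Exception, **fields) -> None:
        """Log an exception with a truncated stack trace; call from `except`."""
        self.log("ERROR", event, error_type=type(exc).__name__,
                 message=str(exc), stack=traceback.format_exc(limit=3), **fields)
```

Writing to stderr matters for stdio transport: stdout carries the MCP protocol stream, so any log printed there would corrupt the connection.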
Transport Options
Support both stdio and SSE transports via CLI arguments:
Default: stdio transport (for direct Claude Desktop integration)
SSE option: HTTP server using Starlette + uvicorn
CLI flags: --transport [stdio|sse] --port [port] --host [host]
Example: python research_server.py --transport sse --port 3001 --host 0.0.0.0
For SSE: implement /sse endpoint, proper CORS headers, health check endpoint
Graceful shutdown: handle SIGINT and SIGTERM signals
Include startup banner showing transport mode, port (if SSE), paper directory
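The CLI surface above can be sketched with stdlib argparse. The flag names come from the spec; --papers-dir is an assumed extra, and the banner format is illustrative:

```python
import argparse

def parse_args(argv=None) -> argparse.Namespace:
    """Parse the transport-selection flags described above."""
    parser = argparse.ArgumentParser(prog="research_server.py")
    parser.add_argument("--transport", choices=["stdio", "sse"], default="stdio")
    parser.add_argument("--port", type=int, default=3001)
    parser.add_argument("--host", default="127.0.0.1")
    parser.add_argument("--papers-dir", default="./papers")
    return parser.parse_args(argv)

def banner(args: argparse.Namespace) -> str:
    """Startup banner: transport mode, port (SSE only), paper directory."""
    lines = [f"transport: {args.transport}", f"papers: {args.papers_dir}"]
    if args.transport == "sse":
        lines.append(f"listening on http://{args.host}:{args.port}/sse")
    return "\n".join(lines)
```

The main entry point would then branch on args.transport: run the FastMCP stdio loop by default, or mount the server in a Starlette app under uvicorn for SSE. The banner (like the logs) should go to stderr in stdio mode.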
Error Handling & Production Readiness
Implement handle_exception() for consistent error handling
Create ERROR_MESSAGES dict for user-friendly error templates
Validate file access with safe_path() to prevent directory traversal
Handle missing papers gracefully (404-like behavior)
Timeout protection for sampling calls (max 60 seconds)
Retry logic for transient failures (file locks, network issues)
Rate limiting for sampling calls (max 10/minute to prevent abuse)
Input sanitization for all user-provided strings
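Sketches of the path-validation and rate-limiting pieces. The helper names follow the spec; the sliding-window design for the limiter is one reasonable choice, not the required one:

```python
import time
from pathlib import Path

def safe_path(filename: str, base_dir: str = "./papers") -> Path:
    """Resolve filename inside base_dir, rejecting directory traversal."""
    base = Path(base_dir).resolve()
    candidate = (base / filename).resolve()
    # After resolving "..", the candidate must still live under base.
    if base not in candidate.parents and candidate != base:
        raise ValueError(f"Access outside the paper directory refused: {filename}")
    return candidate

class RateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds."""
    def __init__(self, limit: int = 10, window: float = 60.0):
        self.limit, self.window = limit, window
        self.calls: list[float] = []

    def allow(self) -> bool:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        self.calls = [t for t in self.calls if now - t < self.window]
        if len(self.calls) >= self.limit:
            return False
        self.calls.append(now)
        return True
```

Each sampling tool would check the limiter before calling the sampling wrapper and return a friendly "rate limit reached, try again shortly" message when allow() is False.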
Testing Requirements
Comprehensive Test Suite:
Minimum 40 test cases covering all tools, resources, and prompts
Test sampling scenarios: mock ctx object, test approval flow, test fallback behavior
Test all three paper formats: .txt, .md, .pdf
Test error conditions: missing files, corrupted PDFs, invalid parameters
Test health metrics: verify counters update, verify uptime calculation
Test resource subscriptions: verify notifications sent on index changes
Test structured logging: verify log events emitted with correct structure
Test transport options: separate test suites for stdio and SSE
Performance tests: measure response time for large papers (> 1 MB)
Integration tests: end-to-end workflows (search → read → summarize)
Sample Test Structure:
tests/
├── conftest.py # Fixtures: temp dirs, sample papers, mock ctx
├── test_tools_basic.py # Non-sampling tools
├── test_tools_sampling.py # Sampling tools with mocked LLM
├── test_resources.py # Resource resolution and caching
├── test_prompts.py # Prompt rendering with EmbeddedResource
├── test_health_metrics.py # Health monitoring accuracy
├── test_logging.py # Structured logging validation
├── test_transport_stdio.py # stdio transport
├── test_transport_sse.py # SSE transport
├── test_paper_formats.py # .txt, .md, .pdf handling
└── test_integration.py # End-to-end workflows
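The mock ctx fixture mentioned for conftest.py can be built with stdlib unittest.mock. The attribute layout assumed here (ctx.session.create_message returning an object with .content.text) mirrors the MCP SDK but should be checked against the version you target; wrap the helper in @pytest.fixture for use in the suite:

```python
from types import SimpleNamespace
from unittest.mock import AsyncMock

def make_mock_ctx(reply: str = "Mocked summary."):
    """Build a fake sampling context whose create_message returns `reply`.

    Lets sampling tools be tested without Claude Desktop: the AsyncMock
    also records call counts and arguments for assertions.
    """
    response = SimpleNamespace(content=SimpleNamespace(text=reply),
                               model="mock-model")
    session = SimpleNamespace(create_message=AsyncMock(return_value=response))
    return SimpleNamespace(session=session)
```

Fallback behavior is then tested by passing ctx=None, and the approval/success path by passing make_mock_ctx() and asserting on create_message.await_count and its call arguments.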
Deliverables
Submit a ZIP file named StudentID_Assignment2_ResearchAssistant.zip containing:
research_server.py - Complete MCP server with sampling and SSE support
structured_logger.py - Logging module (or integrated in main file)
tests/ directory - Complete test suite (40+ tests)
papers/ directory - Sample papers:
At least 5 papers in different formats (.txt, .md, .pdf)
Varied topics and lengths for testing
library_index.json - Generated paper index
annotations.json - Sample annotations on papers
claude_desktop_config.json - Configuration for stdio mode
requirements.txt - All dependencies with pinned versions
README.md - Comprehensive documentation:
Setup instructions for both stdio and SSE modes
How to add new papers to the library
How to run tests with pytest
Example interactions (5-7 scenarios)
Sampling approval flow explanation
Health metrics interpretation guide
Architecture diagram (optional but recommended)
Known limitations and future work
DEMO.md or video link - Demonstration of:
Basic tools (list, read, search)
Sampling tools with approval dialog
Health monitoring resources
Prompt templates with EmbeddedResource
SSE mode running on HTTP server
Grading Rubric
| Criteria | Points | Description |
|---|---|---|
| Tool Implementation (Basic) | 15 | Non-sampling tools (list, read, search, annotate): correct functionality |
| Tool Implementation (Sampling) | 20 | Sampling tools with a proper _sample() wrapper, approval flow, fallback |
| Resource Implementation | 15 | 8 resources, including custom URI schemes and health metrics |
| Prompt Templates | 10 | 5 prompts with EmbeddedResource and proper argument handling |
| Structured Logging | 10 | JSON logging with all required events and proper structure |
| Transport Options | 10 | Both stdio and SSE working, CLI argument parsing, graceful shutdown |
| Testing | 20 | Comprehensive test suite (40+ tests), sampling mocks, coverage > 75% |
| Health Monitoring | 10 | Accurate metrics, uptime tracking, performance stats |
| Code Quality | 10 | Type hints, docstrings, error handling, modular design |
| Documentation | 5 | Clear README, setup guide, examples, architecture overview |
| Demo | 5 | Video or documentation showing key features and the sampling flow |
| Total | 130 | Bonus: +20 points for file watching, advanced metrics, or Docker deployment |
Submission Guidelines
Due Date & Platform
Submission Deadline: [Date] at 11:59 PM (Late penalty: 10% per day, max 3 days)
Submission Platform: Upload to [Moodle/Canvas/Blackboard] under "Assignment 2" section
File Size Limit: 100 MB (compress sample papers if needed, use smaller PDFs)
Format: ZIP file only
Resubmission: Allowed until deadline; latest submission graded
Testing Before Submission
Run these commands to verify your submission:
pytest -v # All tests pass
pytest --cov=research_server --cov-report=term-missing # Coverage > 75%
python research_server.py --transport stdio # Starts without errors
python research_server.py --transport sse --port 3001 # HTTP server starts
python research_server.py --help # Shows usage information
Academic Integrity
This is an individual assignment. Collaboration on concepts is allowed, but all code must be original.
You may reuse patterns from course modules, but Task Manager code from Assignment 1 should not be directly copied.
Cite any external libraries beyond standard MCP SDK (e.g., pdfplumber, starlette).
AI tools may be used for learning, but you must understand and explain your implementation.
Plagiarism detection will be used. Violations result in zero credit and disciplinary action.
Tips for Success
Build incrementally - Start with Assignment 1 patterns, add sampling, then logging, then SSE
Test sampling early - Mock the ctx object in tests to avoid dependency on Claude Desktop
Use Capstone as reference - Review research_assistant_server.py for sampling patterns
Start with .txt papers - Add .md and .pdf support after basic tools work
Log everything - Structured logs will help debug sampling and transport issues
Test both transports - Run separate test sessions for stdio and SSE
Monitor health metrics - Use health resources to verify tool calls are tracked
Document assumptions - If requirements are unclear, state your interpretation in README
Create good sample data - Include diverse papers to showcase search and comparison
Record demo early - Capture working features before final changes to avoid time pressure
Call to Action
Ready to transform your business with AI-powered intelligence that accelerates insights, enhances decision-making, and unlocks the full value of your data?
Codersarts is here to help you turn complex data workflows into efficient, scalable, and evidence-driven AI systems that empower teams to make smarter, faster, and more confident decisions.
Whether you’re a startup looking to build AI-driven products, an enterprise aiming to optimize operations through data science, or a research organization advancing innovation with intelligent data solutions, we bring the expertise and experience needed to design, develop, and deploy impactful AI systems that drive measurable business outcomes.
Get Started Today
Schedule an AI & Data Science Consultation:
Book a 30-minute discovery call with our AI strategists and data science experts to discuss your challenges, identify high-impact opportunities, and explore how intelligent AI solutions can transform your workflows and performance.
Request a Custom AI Demo:
Experience AI in action with a personalized demonstration built around your business use cases, datasets, operational environment, and decision workflows — showcasing practical value and real-world impact.
Email: contact@codersarts.com
Transform your organization from data accumulation to intelligent decision enablement — accelerating insight generation, improving operational efficiency, and strengthening competitive advantage.
Partner with Codersarts to build scalable AI solutions including RAG systems, predictive analytics platforms, intelligent automation tools, recommendation engines, and custom machine learning models that empower your teams to deliver exceptional results.
Contact us today and take the first step toward next-generation AI and data science capabilities that grow with your business ambitions.