
Research Assistant with AI Sampling





Assignment Overview

Scenario: You are a research engineer at an academic institution building tools to help researchers manage and analyze scientific literature. Your task is to create an advanced MCP server that not only provides access to research papers but also uses AI sampling (server-initiated LLM calls) to generate intelligent summaries, extract key findings, and compare papers. This assignment builds on Assignment 1 by adding Module 4 concepts: sampling, production patterns, and advanced deployment options.




Learning Objectives:


  • Implement server-initiated AI sampling with user approval flow

  • Build intelligent tools that leverage LLM capabilities

  • Add structured logging for observability and debugging

  • Create health monitoring resources with metrics

  • Support multiple transport options (stdio and SSE)

  • Write comprehensive tests including sampling scenarios

  • Deploy a production-ready MCP server





Functional Requirements




MCP Tools with Sampling (Module 4 Concepts)

Implement the following tools. Tools marked with ⚡ must use AI sampling:


| Tool Name | Sampling? | Parameters | Description |
|---|---|---|---|
| list_papers | — | directory: str (optional, default: "./papers"), format: str (optional) | Lists all papers in the directory. Returns an array with {filename, title, format, size, date_added} |
| read_paper | — | filename: str, section: str (optional) | Reads full paper content or a specific section (abstract, introduction, methods, results, conclusion) |
| summarize_paper | ⚡ | filename: str, audience: str (default: "general"), max_length: int (default: 300) | Uses AI sampling to generate an audience-appropriate summary. Audiences: general, expert, student, executive |
| extract_findings | ⚡ | filename: str, focus: str (optional) | Uses AI sampling to extract key findings, methodology, and results. Focus can be: methodology, results, implications |
| compare_papers | ⚡ | filename1: str, filename2: str, aspect: str (default: "all") | Uses AI sampling to compare two papers. Aspects: methodology, findings, conclusions, all |
| search_papers | — | query: str, search_in: list[str] (optional, default: ["title", "abstract", "content"]) | Full-text search across papers. Returns matching papers with context snippets |
| add_annotation | — | filename: str, page: int, annotation_text: str, annotation_type: str | Adds an annotation to a paper. Types: highlight, note, question, critique |
| generate_bibliography | ⚡ | papers: list[str], style: str (default: "APA") | Uses AI sampling to generate a formatted bibliography. Styles: APA, MLA, Chicago, IEEE |




Sampling Implementation Requirements:


  • Implement _sample() wrapper function for all sampling calls

  • Use ModelPreferences with intelligence=0.85 for academic accuracy

  • Include approval dialog messages in Claude Desktop

  • Implement graceful fallback when sampling is unavailable (ctx is None)

  • Log all sampling events: start, completion, token count, duration

  • Handle sampling errors gracefully and return user-friendly messages




MCP Resources with Health Monitoring

Implement the following resources:


| Resource URI | Type | Description |
|---|---|---|
| paper:///{filename} | Template | Direct access to paper content via custom URI scheme |
| library://index | Static | JSON index of all papers with metadata (title, authors, year, format) |
| library://by-author/{author} | Template | Papers filtered by author name |
| library://by-year/{year} | Template | Papers filtered by publication year |
| annotations:///{filename} | Template | All annotations for a specific paper |
| health://server/status | Static | Server health: uptime, tool call counts, sampling stats, error rate |
| health://server/metrics | Static | Detailed metrics: papers indexed, total sampling calls, avg response time |
| config://server/info | Static | Server metadata: version, supported formats, directories |



Health Resource Requirements:


  • Track server uptime since startup

  • Count tool calls by tool name

  • Count sampling calls with success/failure rates

  • Calculate average response times for tools and sampling

  • Track error rates (5xx errors, validation failures, sampling errors)

  • Include cache hit/miss statistics

  • Update metrics in real-time as operations occur
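
The health requirements above can be backed by a small in-memory tracker; the class name, method names, and snapshot keys below are illustrative choices, not mandated by the assignment:

```python
import time
from collections import Counter

class HealthMetrics:
    """In-memory counters backing the health:// resources."""

    def __init__(self):
        self._started = time.monotonic()
        self.tool_calls = Counter()   # call count per tool name
        self.sampling_ok = 0
        self.sampling_failed = 0
        self.cache_hits = 0
        self.cache_misses = 0
        self._durations = []          # per-operation durations in seconds

    def record_tool(self, name: str, duration: float) -> None:
        self.tool_calls[name] += 1
        self._durations.append(duration)

    def record_sampling(self, ok: bool) -> None:
        if ok:
            self.sampling_ok += 1
        else:
            self.sampling_failed += 1

    def snapshot(self) -> dict:
        """Serializable view for health://server/status and /metrics."""
        total = self.sampling_ok + self.sampling_failed
        return {
            "uptime_seconds": round(time.monotonic() - self._started, 1),
            "tool_calls": dict(self.tool_calls),
            "sampling": {
                "total": total,
                "success_rate": self.sampling_ok / total if total else None,
            },
            "avg_response_time": (
                sum(self._durations) / len(self._durations)
                if self._durations else None
            ),
            "cache": {"hits": self.cache_hits, "misses": self.cache_misses},
        }
```

Calling `record_tool()` and `record_sampling()` from inside your tool handlers keeps the metrics updated in real time, so the health resources just return `snapshot()`.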




MCP Prompts with EmbeddedResource

Create the following prompt templates:


| Prompt Name | Arguments | Purpose |
|---|---|---|
| summarize-paper | filename: str, audience: str (default: "general") | Generates a paper summary with EmbeddedResource; no copy-paste needed |
| literature-review | topic: str, papers: list[str], style: str (default: "comparative") | Creates a structured literature review from multiple papers. Styles: comparative, chronological, thematic |
| research-questions | filename: str, focus: str (optional) | Generates research questions based on gaps in the paper, using EmbeddedResource |
| explain-methodology | filename: str, detail_level: str (default: "medium") | Explains the research methodology for different audiences, using EmbeddedResource |
| critique-paper | filename: str, criteria: list[str] (optional) | Academic critique based on criteria: methodology, evidence, logic, impact |





Technical Requirements




Data Management


  • Support multiple paper formats: .txt, .md, .pdf (use pdfplumber library)

  • Create papers/ directory structure: papers/raw/ for originals, papers/processed/ for extracted text

  • Build paper index (library_index.json) with metadata extraction

  • Store annotations in annotations.json with timestamps

  • Implement file watching to auto-update the index when new papers are added (optional bonus)

  • Handle large papers efficiently (stream processing for papers > 10 MB)
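
An index builder for `library_index.json` might look like the sketch below. The metadata fields match the `list_papers` return shape; richer fields such as title and authors would require parsing each paper, which is omitted here:

```python
import json
from datetime import datetime, timezone
from pathlib import Path

SUPPORTED = {".txt", ".md", ".pdf"}

def build_index(papers_dir: str = "./papers",
                out: str = "library_index.json") -> dict:
    """Scans the paper directory and writes a JSON metadata index."""
    entries = []
    for path in sorted(Path(papers_dir).glob("*")):
        if path.suffix.lower() not in SUPPORTED:
            continue
        stat = path.stat()
        entries.append({
            "filename": path.name,
            "format": path.suffix.lstrip("."),
            "size": stat.st_size,
            "date_added": datetime.fromtimestamp(
                stat.st_mtime, tz=timezone.utc).isoformat(),
        })
    index = {"generated": datetime.now(timezone.utc).isoformat(),
             "papers": entries}
    Path(out).write_text(json.dumps(index, indent=2))
    return index
```

Regenerating this file on startup (and, for the bonus, on file-watch events) keeps `library://index` consistent with the directory contents.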




Structured Logging

Implement StructuredLogger class that emits JSON logs to stderr:


  • Log events: tool_called, tool_completed, sampling_start, sampling_complete, resource_read, error

  • Include timestamps (ISO 8601 format), event type, duration, parameters

  • For tool calls: log tool name, parameters (sanitized), success/failure, duration

  • For sampling: log model, token count, intelligence level, approval status

  • For errors: log error type, message, stack trace (truncated), context

  • Use JSON Lines format (one JSON object per line)

  • Provide log levels: DEBUG, INFO, WARNING, ERROR
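
A minimal `StructuredLogger` satisfying the requirements above could be sketched as follows; the exact field names in each record are illustrative:

```python
import json
import sys
import traceback
from datetime import datetime, timezone

class StructuredLogger:
    """Emits one JSON object per line (JSON Lines) to stderr."""

    LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR")

    def __init__(self, level: str = "INFO", stream=None):
        self.level = level
        self.stream = stream or sys.stderr

    def log(self, level: str, event: str, **fields) -> None:
        # Suppress records below the configured level.
        if self.LEVELS.index(level) < self.LEVELS.index(self.level):
            return
        record = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": level,
            "event": event,  # e.g. tool_called, sampling_start, error
            **fields,
        }
        print(json.dumps(record, default=str), file=self.stream)

    def error(self, event: str, exc: Exception, **fields) -> None:
        tb = traceback.format_exc(limit=3)  # truncated stack trace
        self.log("ERROR", event, error_type=type(exc).__name__,
                 message=str(exc), trace=tb, **fields)
```

Writing to stderr matters for stdio transport: stdout carries the MCP protocol stream, so any log line printed there would corrupt it.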




Transport Options

Support both stdio and SSE transports via CLI arguments:


  • Default: stdio transport (for direct Claude Desktop integration)

  • SSE option: HTTP server using Starlette + uvicorn

  • CLI flags: --transport [stdio|sse] --port [port] --host [host]

  • Example: python research_server.py --transport sse --port 3001 --host 0.0.0.0

  • For SSE: implement /sse endpoint, proper CORS headers, health check endpoint

  • Graceful shutdown: handle SIGINT and SIGTERM signals

  • Include startup banner showing transport mode, port (if SSE), paper directory
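
The CLI flags above map directly onto an `argparse` entry point; the `--papers-dir` flag is an extra convenience not required by the assignment:

```python
import argparse

def parse_args(argv=None) -> argparse.Namespace:
    """Parses transport-selection flags for the server entry point."""
    parser = argparse.ArgumentParser(
        description="Research Assistant MCP server")
    parser.add_argument("--transport", choices=["stdio", "sse"],
                        default="stdio",
                        help="stdio for Claude Desktop, sse for HTTP clients")
    parser.add_argument("--port", type=int, default=3001,
                        help="HTTP port (SSE mode only)")
    parser.add_argument("--host", default="127.0.0.1",
                        help="bind address (SSE mode only)")
    parser.add_argument("--papers-dir", default="./papers",
                        help="directory containing the paper library")
    return parser.parse_args(argv)
```

Your `main()` can then branch on `args.transport`, running the FastMCP stdio loop by default or mounting the Starlette/uvicorn SSE app when `--transport sse` is given.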




Error Handling & Production Readiness


  • Implement handle_exception() for consistent error handling

  • Create ERROR_MESSAGES dict for user-friendly error templates

  • Validate file access with safe_path() to prevent directory traversal

  • Handle missing papers gracefully (404-like behavior)

  • Timeout protection for sampling calls (max 60 seconds)

  • Retry logic for transient failures (file locks, network issues)

  • Rate limiting for sampling calls (max 10/minute to prevent abuse)

  • Input sanitization for all user-provided strings
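
One way to implement the directory-traversal guard is to resolve the requested path and verify it stays inside the papers directory; the function signature is illustrative:

```python
from pathlib import Path

PAPERS_DIR = Path("./papers").resolve()

def safe_path(filename: str, base: Path = PAPERS_DIR) -> Path:
    """Resolves filename inside base, rejecting directory traversal.

    Raises ValueError for inputs like "../../etc/passwd" whose
    resolved path escapes the papers directory.
    """
    candidate = (base / filename).resolve()
    if base not in candidate.parents and candidate != base:
        raise ValueError(
            f"Access outside paper directory refused: {filename}")
    return candidate
```

Comparing resolved paths (rather than string prefixes) also catches tricks like `papers_evil/` sharing a prefix with `papers/`, and symlinked escapes on platforms where `resolve()` follows links.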





Testing Requirements




Comprehensive Test Suite:


  • Minimum 40 test cases covering all tools, resources, and prompts

  • Test sampling scenarios: mock ctx object, test approval flow, test fallback behavior

  • Test all three paper formats: .txt, .md, .pdf

  • Test error conditions: missing files, corrupted PDFs, invalid parameters

  • Test health metrics: verify counters update, verify uptime calculation

  • Test resource subscriptions: verify notifications sent on index changes

  • Test structured logging: verify log events emitted with correct structure

  • Test transport options: separate test suites for stdio and SSE

  • Performance tests: measure response time for large papers (> 1 MB)

  • Integration tests: end-to-end workflows (search → read → summarize)
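
Sampling tools can be tested without Claude Desktop by faking the context object. The sketch below assumes the same `ctx.session.create_message(...)` call shape used elsewhere in this assignment; `make_mock_ctx` and `summarize_with` are hypothetical names for your conftest fixture and a sampling-backed helper:

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock

def make_mock_ctx(reply: str = "Mocked summary."):
    """Builds a fake MCP context whose sampling call returns a canned reply."""
    message = SimpleNamespace(content=SimpleNamespace(text=reply))
    session = SimpleNamespace(create_message=AsyncMock(return_value=message))
    return SimpleNamespace(session=session)

# Example of a sampling-backed helper exercised with the mock.
async def summarize_with(ctx, text: str) -> str:
    result = await ctx.session.create_message(
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
        max_tokens=300,
    )
    return result.content.text
```

Because `AsyncMock` records its calls, tests can also assert on `await_count` and the prompt that was sent, which covers the approval-flow and fallback scenarios without a live LLM.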




Sample Test Structure:



tests/
├── conftest.py             # Fixtures: temp dirs, sample papers, mock ctx
├── test_tools_basic.py     # Non-sampling tools
├── test_tools_sampling.py  # Sampling tools with mocked LLM
├── test_resources.py       # Resource resolution and caching
├── test_prompts.py         # Prompt rendering with EmbeddedResource
├── test_health_metrics.py  # Health monitoring accuracy
├── test_logging.py         # Structured logging validation
├── test_transport_stdio.py # stdio transport
├── test_transport_sse.py   # SSE transport
├── test_paper_formats.py   # .txt, .md, .pdf handling
└── test_integration.py     # End-to-end workflows





Deliverables

Submit a ZIP file named StudentID_Assignment2_ResearchAssistant.zip containing:


  • research_server.py - Complete MCP server with sampling and SSE support

  • structured_logger.py - Logging module (or integrated in main file)

  • tests/ directory - Complete test suite (40+ tests)


  • papers/ directory - Sample papers:

    • At least 5 papers in different formats (.txt, .md, .pdf)

    • Varied topics and lengths for testing


  • library_index.json - Generated paper index

  • annotations.json - Sample annotations on papers

  • claude_desktop_config.json - Configuration for stdio mode

  • requirements.txt - All dependencies with pinned versions


  • README.md - Comprehensive documentation:


    • Setup instructions for both stdio and SSE modes

    • How to add new papers to the library

    • How to run tests with pytest

    • Example interactions (5-7 scenarios)

    • Sampling approval flow explanation

    • Health metrics interpretation guide

    • Architecture diagram (optional but recommended)

    • Known limitations and future work


  • DEMO.md or video link - Demonstration of:


    • Basic tools (list, read, search)

    • Sampling tools with approval dialog

    • Health monitoring resources

    • Prompt templates with EmbeddedResource

    • SSE mode running on HTTP server





Grading Rubric


| Criteria | Points | Description |
|---|---|---|
| Tool Implementation (Basic) | 15 | Non-sampling tools (list, read, search, annotate): correct functionality |
| Tool Implementation (Sampling) | 20 | Sampling tools with proper _sample() wrapper, approval flow, fallback |
| Resource Implementation | 15 | 8 resources including custom URI schemes and health metrics |
| Prompt Templates | 10 | 5 prompts with EmbeddedResource and proper argument handling |
| Structured Logging | 10 | JSON logging with all required events, proper structure |
| Transport Options | 10 | Both stdio and SSE working, CLI argument parsing, graceful shutdown |
| Testing | 20 | Comprehensive test suite (40+ tests), sampling mocks, coverage > 75% |
| Health Monitoring | 10 | Accurate metrics, uptime tracking, performance stats |
| Code Quality | 10 | Type hints, docstrings, error handling, modular design |
| Documentation | 5 | Clear README, setup guide, examples, architecture overview |
| Demo | 5 | Video/documentation showing key features and the sampling flow |
| Total | 130 | Bonus: +20 points for file watching, advanced metrics, or Docker deployment |





Submission Guidelines




Due Date & Platform


  • Submission Deadline: [Date] at 11:59 PM (Late penalty: 10% per day, max 3 days)

  • Submission Platform: Upload to [Moodle/Canvas/Blackboard] under "Assignment 2" section

  • File Size Limit: 100 MB (compress sample papers if needed, use smaller PDFs)

  • Format: ZIP file only

  • Resubmission: Allowed until deadline; latest submission graded




Testing Before Submission

Run these commands to verify your submission:


  • pytest -v                                    # All tests pass

  • pytest --cov=research_server --cov-report=term-missing   # Coverage > 75%

  • python research_server.py --transport stdio  # Starts without errors

  • python research_server.py --transport sse --port 3001    # HTTP server starts

  • python research_server.py --help             # Shows usage information




Academic Integrity


  • This is an individual assignment. Collaboration on concepts is allowed, but all code must be original.

  • You may reuse patterns from course modules, but Task Manager code from Assignment 1 should not be directly copied.

  • Cite any external libraries beyond standard MCP SDK (e.g., pdfplumber, starlette).

  • AI tools may be used for learning, but you must understand and explain your implementation.

  • Plagiarism detection will be used. Violations result in zero credit and disciplinary action.





Tips for Success


  • Build incrementally - Start with Assignment 1 patterns, add sampling, then logging, then SSE

  • Test sampling early - Mock the ctx object in tests to avoid dependency on Claude Desktop

  • Use Capstone as reference - Review research_assistant_server.py for sampling patterns

  • Start with .txt papers - Add .md and .pdf support after basic tools work

  • Log everything - Structured logs will help debug sampling and transport issues

  • Test both transports - Run separate test sessions for stdio and SSE

  • Monitor health metrics - Use health resources to verify tool calls are tracked

  • Document assumptions - If requirements are unclear, state your interpretation in README

  • Create good sample data - Include diverse papers to showcase search and comparison

  • Record demo early - Capture working features before final changes to avoid time pressure





Call to Action

Ready to transform your business with AI-powered intelligence that accelerates insights, enhances decision-making, and unlocks the full value of your data?


Codersarts is here to help you turn complex data workflows into efficient, scalable, and evidence-driven AI systems that empower teams to make smarter, faster, and more confident decisions.


Whether you’re a startup looking to build AI-driven products, an enterprise aiming to optimize operations through data science, or a research organization advancing innovation with intelligent data solutions, we bring the expertise and experience needed to design, develop, and deploy impactful AI systems that drive measurable business outcomes.




Get Started Today



Schedule an AI & Data Science Consultation:

Book a 30-minute discovery call with our AI strategists and data science experts to discuss your challenges, identify high-impact opportunities, and explore how intelligent AI solutions can transform your workflows and performance.




Request a Custom AI Demo:

Experience AI in action with a personalized demonstration built around your business use cases, datasets, operational environment, and decision workflows — showcasing practical value and real-world impact.









Transform your organization from data accumulation to intelligent decision enablement — accelerating insight generation, improving operational efficiency, and strengthening competitive advantage.


Partner with Codersarts to build scalable AI solutions including RAG systems, predictive analytics platforms, intelligent automation tools, recommendation engines, and custom machine learning models that empower your teams to deliver exceptional results.


Contact us today and take the first step toward next-generation AI and data science capabilities that grow with your business ambitions.





