Building a Database MCP Server: The Ultimate Code Walkthrough - Part 2
- ganesh90
- Jul 4
- 17 min read
Updated: Jul 7
Prerequisite: This is a continuation of the blog Building a Database MCP Server: The Ultimate Code Walkthrough - Part 1

The Foundation - Setting Up Our Digital Workshop
import asyncio
import json
import sqlite3
import os
import re
from pathlib import Path
from typing import List, Dict, Any, Optional, Union
from contextlib import asynccontextmanager
from collections.abc import AsyncIterator
from mcp.server.fastmcp import FastMCP
import logging
What's happening here? 🤔
Think of this section as setting up your workshop before starting a big project. We're importing all our tools and setting up our "digital workbench."
Core System Libraries:
import asyncio: This is like hiring a master multitasker! AsyncIO allows our server to handle multiple database requests simultaneously without blocking. Imagine a waiter who can take multiple orders, serve different tables, and process payments all at the same time without making anyone wait.
import json: Our universal translator! Every response from our database server will be formatted as JSON, making it easy for other programs (including AI systems) to understand and process our data. It's like having a diplomat who can speak every programming language.
import sqlite3: Python's built-in SQLite interface – this is your master key for all SQLite database files. No additional installation required! It's like having a Swiss Army knife that came free with your toolkit.
File and Pattern Wizardry:
import os: The operating system interface that helps us work with files and directories across different platforms (Windows, Mac, Linux). Think of it as your GPS for navigating the file system.
import re: Regular expressions – our pattern-matching superhero! This becomes our security guard, scanning SQL queries for dangerous patterns that could indicate injection attacks. It's like having a bouncer who can spot troublemakers from a mile away.
from pathlib import Path: This is like upgrading from a horse-drawn carriage to a Tesla! Path() gives us modern, elegant file path handling that works seamlessly across all operating systems. No more messy string concatenation!
Type System Excellence:
from typing import List, Dict, Any, Optional, Union: These make our code self-documenting with type hints. It's like putting clear, descriptive labels on all your storage boxes so everyone knows exactly what's inside. Your IDE will thank you with better autocomplete and error detection!
Advanced Python Features:
from contextlib import asynccontextmanager: This is for advanced resource management – think of it as an automatic door that always closes behind you, even if you forget or something goes wrong.
from collections.abc import AsyncIterator: Type hints for asynchronous operations. It's like having proper documentation for your conveyor belt system.
The MCP Foundation:
from mcp.server.fastmcp import FastMCP: This is the heart and soul of our system! FastMCP provides the foundation for creating our database tools and API endpoints. Think of it as the blueprint and construction materials for building our digital database assistant.
System Monitoring:
import logging: Our system's detailed diary! Every important event, error, and operation gets recorded. When things go sideways (and they sometimes will), these logs become our detective clues to solve the mystery.
This is like laying out all your knives, cutting boards, and ingredients before you start cooking a gourmet meal. Organization is the secret to success!
PostgreSQL Support - The Smart Import Strategy
# PostgreSQL support
try:
    import psycopg2
    import psycopg2.extras
    POSTGRES_AVAILABLE = True
    print("PostgreSQL support loaded")
except ImportError:
    POSTGRES_AVAILABLE = False
    print("PostgreSQL not available - SQLite only mode")
The Brilliant Defensive Programming Pattern! 🛡️
This is like having a car that can run on premium gas but automatically switches to regular if premium isn't available:
The Try Block - Optimistic Attempt:
import psycopg2: The most popular PostgreSQL adapter for Python. This is like trying to install enterprise-grade database equipment.
import psycopg2.extras: Advanced PostgreSQL features, especially RealDictCursor which makes query results behave like dictionaries instead of confusing tuples.
POSTGRES_AVAILABLE = True: Sets a global flag indicating "Yes! We have enterprise database capabilities!"
print("PostgreSQL support loaded"): Immediate feedback to the user about system capabilities.
The Except Block - Graceful Fallback:
except ImportError:: Catches the specific error when Python can't find the PostgreSQL libraries.
POSTGRES_AVAILABLE = False: Sets the flag to indicate "We'll work with SQLite only, and that's perfectly fine!"
print("PostgreSQL not available - SQLite only mode"): Clear communication about what the system can do.
Why This is a Good Practice:
No crashes: System continues working even without PostgreSQL
Clear feedback: User knows exactly what capabilities are available
Future-proof: Easy to add PostgreSQL support later by just installing the library
Smart adaptation: Code checks this flag before attempting PostgreSQL operations
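To make that last point concrete, here's a minimal sketch (supported_backends is a hypothetical helper, not part of the server code) of how later code can branch on the flag:

def supported_backends():
    backends = ["sqlite"]          # always available via Python's standard library
    if POSTGRES_AVAILABLE:         # flag set by the try/except import above
        backends.append("postgresql")
    return backends

print(supported_backends())        # ['sqlite'] when psycopg2 isn't installed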
System Configuration - Our Digital DNA
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
Logging Configuration Deep Dive:
logging.basicConfig(level=logging.INFO): This is like setting up a security camera system for your application:
INFO level: Shows important events without overwhelming detail
Alternative levels: DEBUG (everything), WARNING (problems), ERROR (failures), CRITICAL (disasters)
Why INFO: Perfect balance of useful information without noise
logger = logging.getLogger(__name__): Creates a personalized logger for this specific module:
__name__: Special Python variable containing the module name
Hierarchical structure: Like having department-specific filing systems
Best practice: Each module gets its own logger for organized debugging
It's like having a professional secretary who keeps detailed notes of every important meeting, phone call, and decision, but doesn't bore you with trivial details like "someone walked past the office."
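To see the levels in action, here's a tiny illustration (the messages are made up for demonstration):

logger.debug("Opening cursor")             # hidden at INFO level
logger.info("Database initialized")        # shown
logger.warning("Connection test failed")   # shown
logger.error("Query failed")               # shown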
The Document Class - Your Information Envelope
class Document:
    """Simple container for text documents with metadata."""

    def __init__(self, text: str, metadata: Dict[str, Any] = None):
        self.page_content = text
        self.metadata = metadata or {}
The Smart Data Container Explained 📄
Imagine a magical envelope that holds both a letter and a detailed information card about that letter (like where it came from, when it was written, who sent it). That's exactly what our Document class does!
Constructor Breakdown:
def __init__(self, text: str, metadata: Dict[str, Any] = None):: The blueprint for creating document objects
text: str: The actual content (required parameter)
metadata: Dict[str, Any] = None: Optional extra information about the document
Instance Variables:
self.page_content = text: Stores the actual text content following LangChain conventions (industry standard)
self.metadata = metadata or {}: The brilliant "or {}" trick prevents the dangerous mutable-default-argument problem
The Clever or {} Trick Explained:
metadata = None # User didn't provide metadata
self.metadata = None or {} # Results in {}
metadata = {"source": "file.pdf"} # User provided metadata
self.metadata = {"source": "file.pdf"} or {} # Results in {"source": "file.pdf"}
Why This Matters:
Prevents bugs: Avoids shared mutable defaults that can cause mysterious data corruption
Always safe: You can always do document.metadata["key"] = "value" without crashes
Industry standard: Follows best practices used by professional developers
It's like a library book that not only contains the story but also has the catalog card permanently attached, so you always know both what it says and where it came from.
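Here's a short sketch of the mutable-default bug this pattern sidesteps (BadDocument is a deliberately broken example, not part of the server):

class BadDocument:
    def __init__(self, text: str, metadata: dict = {}):  # ONE dict shared by every instance!
        self.page_content = text
        self.metadata = metadata

a = BadDocument("first")
b = BadDocument("second")
a.metadata["source"] = "a.pdf"
print(b.metadata)   # {'source': 'a.pdf'} - b was silently corrupted!

doc = Document("hello")            # the None-default + `or {}` pattern
doc.metadata["source"] = "x.pdf"   # safe: each Document gets its own fresh dict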
Database Manager Class - Your Database Butler
class DatabaseManager:
    """Manages database connections and operations"""

    def __init__(self, db_type: str = "sqlite", db_path: str = "database.db",
                 connection_string: str = None):
        self.db_type = db_type.lower()
        self.db_path = db_path
        self.connection_string = connection_string
        self.connection = None
        self.max_results = 1000  # Default limit for safety
        self.is_initialized = False
        self._table_schemas = {}  # Schema cache used by get_table_schema() below
Meet Your Personal Database Assistant! 🏢
This class is the heart of our system – think of it as designing a professional database administrator role with all the skills and safety measures you'd want.
Class Declaration:
class DatabaseManager:: Creates a blueprint for managing database connections
Docstring: Clear explanation of the class purpose (essential for maintainable code)
Constructor Deep Dive:
def __init__(self, db_type: str = "sqlite", db_path: str = "database.db", connection_string: str = None):
Multiple parameters: Flexible enough to handle different database types and configurations
Default values: Smart defaults make the class easy to use
Type hints: Self-documenting code that helps IDEs and other developers
Instance Variable Analysis:
self.db_type = db_type.lower():
Why .lower()? Handles user input inconsistency ("SQLite", "SQLITE", "sqlite" all work)
Defensive programming: Prevents bugs from capitalization differences
Example: User types "SQLite" → stored as "sqlite"
self.db_path = db_path:
Flexible storage: Could be a file path ("/data/my.db") or database name ("production_db")
Context dependent: Meaning changes based on database type
self.connection_string = connection_string:
PostgreSQL requirement: Stores complex connection details
Example: "postgresql://user:password@localhost:5432/mydb"
Optional for SQLite: SQLite doesn't need connection strings
self.connection = None:
Lazy initialization: Connection will be created when needed
Safety first: Starts in a safe, disconnected state
Memory efficient: No resources wasted until actually needed
self.max_results = 1000:
Critical safety feature: Prevents accidentally loading millions of records
Memory protection: Stops queries that could crash your system
Reasonable default: 1000 records is usually manageable but substantial
self.is_initialized = False:
State tracking: Prevents operations on uninitialized connections
Safety flag: Other methods check this before proceeding
Debugging aid: Easy to see if initialization succeeded
It's like hiring a personal assistant who remembers your preferences (database type), knows where all your files are stored (database path), has all the necessary access codes (connection string), and comes with built-in safety protocols to prevent disasters!
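A couple of quick usage sketches (the paths and credentials below are placeholders):

sqlite_mgr = DatabaseManager(db_type="SQLite", db_path="/data/my.db")
print(sqlite_mgr.db_type)         # "sqlite" - .lower() normalized the input
print(sqlite_mgr.is_initialized)  # False - nothing is connected yet

pg_mgr = DatabaseManager(
    db_type="postgresql",
    connection_string="postgresql://user:password@localhost:5432/mydb",
)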
Database Initialization - The Connection Wizard
async def initialize(self):
    """Initialize database connection"""
    try:
        if self.db_type == "sqlite":
            await self._init_sqlite()
        elif self.db_type == "postgresql" and POSTGRES_AVAILABLE:
            await self._init_postgresql()
        else:
            raise ValueError(f"Unsupported database type: {self.db_type}")
        self.is_initialized = True
        logger.info(f"Database initialized: {self.db_type} at {self.db_path}")
    except Exception as e:
        logger.error(f"Database initialization failed: {e}")
        self.is_initialized = False
        raise
The Smart Lock System! 🔐
This method is like a smart lock on your house that automatically recognizes what type of key you need and uses the appropriate mechanism.
Method Signature Analysis:
async def initialize(self)::
Async advantage: Allows other operations while connection is being established
Real-world benefit: Like being able to answer phone calls while waiting for your computer to boot up
Modern pattern: Async/await is the standard for I/O operations in Python
The Branching Logic:
if self.db_type == "sqlite"::
Route 1: File-based database handling
Delegates responsibility: Calls specialized SQLite setup method
Single responsibility: Each method has one clear job
elif self.db_type == "postgresql" and POSTGRES_AVAILABLE::
Route 2: PostgreSQL database handling
Double check: both the type must be correct AND the libraries must be available
Fail-safe: Prevents crashes if PostgreSQL libraries aren't installed
else::
Error handling: Catches unsupported database types
Clear error message: Tells user exactly what went wrong
Fail fast principle: Better to crash immediately with clear message than mysteriously later
Success Path:
self.is_initialized = True: Only set after successful initialization
logger.info(...): Records success with details for debugging
State consistency: Object state accurately reflects reality
Error Handling Excellence:
except Exception as e:: Catches any error during initialization
logger.error(...): Records failure with specific error details
self.is_initialized = False: Ensures object state reflects failure
raise: Re-throws exception so caller knows initialization failed
Why Re-raise? Like a chain of command – the error gets logged at this level for debugging, but higher-level code needs to know that initialization failed so it can handle the situation appropriately.
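Here's what that chain of command looks like from the caller's side (the startup policy shown is illustrative, not part of the server):

import asyncio

async def main():
    manager = DatabaseManager(db_type="sqlite", db_path="/data/my.db")
    try:
        await manager.initialize()
    except Exception:
        # Details were already logged inside initialize();
        # the caller only decides what to do about the failure.
        raise SystemExit("Startup aborted - see logs for details")

asyncio.run(main())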
SQLite Connection Specialist
async def _init_sqlite(self):
    """Initialize SQLite connection with existence check"""
    db_path = Path(self.db_path)

    # Check if database file exists
    if not db_path.exists():
        raise FileNotFoundError(f"Database file does not exist: {self.db_path}")

    # Check if it's actually a file (not a directory)
    if not db_path.is_file():
        raise ValueError(f"Path exists but is not a file: {self.db_path}")

    # Create parent directory if needed (for log files, etc.)
    db_path.parent.mkdir(parents=True, exist_ok=True)

    self.connection = sqlite3.connect(self.db_path, check_same_thread=False)
    self.connection.row_factory = sqlite3.Row
The Thorough Security Guard! 🗃️
This method is like a security guard who checks multiple forms of ID before letting you into a secure building.
Path Object Creation:
db_path = Path(self.db_path):
Modern approach: Converts string path to powerful Path object
Cross-platform: Works identically on Windows, Mac, and Linux
Rich functionality: Provides methods like .exists(), .is_file(), .parent
Validation Layer 1 - Existence Check:
if not db_path.exists()::
Primary check: "Does this database file actually exist on disk?"
Prevents confusion: Stops cryptic errors later in the process
Clear error: FileNotFoundError with exact path that was checked
Validation Layer 2 - File Type Check:
if not db_path.is_file()::
Subtle but important: Path exists but is it a file or folder?
Common mistake: Users sometimes point to directories instead of database files
Specific error: ValueError explaining the exact problem
Proactive Directory Creation:
db_path.parent.mkdir(parents=True, exist_ok=True):
Forward thinking: Creates parent directories if they don't exist
parents=True: Like "mkdir -p" in Unix - creates intermediate directories
exist_ok=True: Doesn't error if directories already exist
Why needed: Sometimes you need directories for log files or backups
Actual Connection Establishment:
self.connection = sqlite3.connect(self.db_path, check_same_thread=False):
sqlite3.connect(): Standard Python function for SQLite connections
check_same_thread=False: Essential for async operations - allows connection use across threads
Thread safety: Required for our async database operations
User Experience Enhancement:
self.connection.row_factory = sqlite3.Row:
Game changer: Makes query results behave like dictionaries
Before: result[0] (what's column 0 again? 🤔)
After: result['name'] (crystal clear! ✨)
Professional touch: Much more usable than plain tuples
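A quick sketch of the difference (the table and column names are illustrative):

cursor = manager.connection.cursor()
cursor.execute("SELECT id, name FROM users LIMIT 1")
row = cursor.fetchone()

print(row[0])        # still works: access by position
print(row["name"])   # the sqlite3.Row bonus: access by column name
print(dict(row))     # e.g. {'id': 1, 'name': 'Alice'}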
PostgreSQL Enterprise Setup
async def _init_postgresql(self):
    """Initialize PostgreSQL connection"""
    if not self.connection_string:
        raise ValueError("PostgreSQL connection string required")

    self.connection = psycopg2.connect(
        self.connection_string,
        cursor_factory=psycopg2.extras.RealDictCursor
    )
    self.connection.autocommit = True
The Enterprise Database Specialist! 🐘
This method handles the more complex enterprise database setup with appropriate validation and configuration.
Validation Check:
if not self.connection_string::
Required parameter: PostgreSQL needs detailed connection information
Clear error: Tells user exactly what they need to provide
Example needed: "postgresql://user:password@localhost:5432/dbname"
Connection Establishment:
self.connection = psycopg2.connect(...):
Enterprise connection: Much more complex than SQLite file connections
Multi-line formatting: Makes complex function calls readable
Professional setup: Handles authentication, networking, database selection
Advanced Configuration:
cursor_factory=psycopg2.extras.RealDictCursor:
Consistency: PostgreSQL equivalent of SQLite's Row factory
Dictionary results: Same user experience as SQLite
Professional touch: Query results are easy to work with
Transaction Management:
self.connection.autocommit = True:
Simplified management: Automatically commits each operation
Trade-off: Easier to use but less control over transactions
Appropriate choice: Good for our use case of simple database operations
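To see what RealDictCursor changes in practice, a hedged probe (assuming a live PostgreSQL connection):

cursor = manager.connection.cursor()
cursor.execute("SELECT 1 AS probe")
print(cursor.fetchone())   # dict-like row, e.g. {'probe': 1}, instead of the tuple (1,)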
Connection Health Monitoring
def is_connected(self) -> bool:
    """Check if database is connected and functional"""
    if not self.is_initialized or not self.connection:
        return False
    try:
        cursor = self.connection.cursor()
        if self.db_type == "sqlite":
            cursor.execute("SELECT 1")
        else:  # PostgreSQL
            cursor.execute("SELECT 1")
        result = cursor.fetchone()
        cursor.close()
        return result is not None
    except Exception as e:
        logger.warning(f"Connection test failed for {self.db_path}: {e}")
        return False
The Database Doctor! 🔍
This method is like having a doctor who can quickly check if a patient is responsive and healthy.
Quick Preliminary Checks:
if not self.is_initialized or not self.connection::
Fast path: Avoid expensive tests if we already know the answer
Logic: If setup never happened OR connection object doesn't exist
Efficiency: No point testing a connection that doesn't exist
Active Health Test:
cursor = self.connection.cursor():
Standard pattern: Create cursor for database operations
Database interface: Cursors are how you execute SQL commands
cursor.execute("SELECT 1"):
Universal test: Simplest possible query that works on all SQL databases
Minimal overhead: Just returns the number 1
Real communication: Actually talks to the database to verify connectivity
Result Verification:
result = cursor.fetchone():
Get response: Retrieves the result of our test query
Expected result: Should return (1,) or similar
return result is not None:
Boolean conversion: True if we got any result, False if None
Health indicator: Any result means the connection is working
Resource Cleanup:
cursor.close():
Good practice: Always clean up database resources
Prevents leaks: Stops accumulation of unused cursors
Error Handling:
except Exception as e::
Broad catch: Database connections can fail in many ways
Warning level: Connection tests failing is common, not catastrophic
Informative logging: Shows which database and what error occurred
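A sketch of how the rest of the server might lean on this health check (ensure_ready is a hypothetical helper, not part of the walkthrough code):

async def ensure_ready(manager: DatabaseManager) -> None:
    """Hypothetical guard: re-initialize before doing real work."""
    if not manager.is_connected():
        logger.warning("Database unreachable - re-initializing")
        await manager.initialize()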
The Schema Detective - Table Structure Discovery
async def get_table_schema(self, table_name: str) -> Dict[str, Any]:
    """Get and cache table schema information"""
    if table_name in self._table_schemas:
        return self._table_schemas[table_name]
    try:
        cursor = self.connection.cursor()
        if self.db_type == "sqlite":
            cursor.execute(f"PRAGMA table_info({table_name})")
            columns = cursor.fetchall()
            schema = {}
            for col in columns:
                col_dict = dict(col)
                schema[col_dict['name']] = {
                    'type': col_dict['type'],
                    'not_null': bool(col_dict['notnull']),
                    'default_value': col_dict['dflt_value'],
                    'primary_key': bool(col_dict['pk'])
                }
        else:  # PostgreSQL
            cursor.execute("""
                SELECT column_name, data_type, is_nullable, column_default
                FROM information_schema.columns
                WHERE table_name = %s
                ORDER BY ordinal_position
            """, (table_name,))
            columns = cursor.fetchall()
            schema = {}
            for col in columns:
                col_dict = dict(col)
                schema[col_dict['column_name']] = {
                    'type': col_dict['data_type'],
                    'not_null': col_dict['is_nullable'] == 'NO',
                    'default_value': col_dict['column_default'],
                    'primary_key': False  # Would need additional query
                }
        self._table_schemas[table_name] = schema
        return schema
    except Exception as e:
        logger.error(f"Error getting schema for table {table_name}: {e}")
        return {}
The Structure
async def get_table_schema(self, table_name: str) -> Dict[str, Any]:
async def: This declares an asynchronous function. Async processing is a modern way to run tasks without blocking the system. It allows your program to handle many tasks at the same time, instead of waiting for each one to finish before starting the next.
Think of it like this:
Synchronous processing is like a single line of people at a counter. Each person waits for the one before them to finish.
Asynchronous processing is like having many counters open. People can be served at the same time.
What does async mean?
The async keyword tells the computer:
This task might take time, so keep going with other work and come back to it when it is ready.
In MCP, this means slow operations like database queries can run in the background while the system continues working on other tasks.
Why is async important in MCP?
It helps your system:
Run faster
Use resources more efficiently
Handle more users or tasks at the same time
Avoid delays caused by slow tasks
Stay stable even if some tasks fail
Key parts of async processing:
Task queues to manage which tasks to run
Context managers to track each task's progress
Event loops to switch between tasks efficiently
Callbacks to handle what happens when a task finishes or fails
table_name: str: Takes a table name as a string parameter
-> Dict[str, Any]: Returns a dictionary with string keys and any type of values
if table_name in self._table_schemas:
    return self._table_schemas[table_name]
The Memory Check:
self._table_schemas: A dictionary that acts like a filing cabinet
Why a dictionary/cache? Database schema rarely changes, so asking every time is wasteful
Performance boost: Instead of asking database "What's in the users table?" 100 times, ask once and remember
Example: If you already know what's on McDonald's menu, you don't need to read it again
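In code, the payoff looks like this (a hedged sketch assuming an initialized manager and a real users table):

async def demo(manager: DatabaseManager) -> None:
    schema = await manager.get_table_schema("users")  # first call: queries the database
    schema = await manager.get_table_schema("users")  # second call: served instantly from the cache
    print(schema["username"]["type"])                 # e.g. 'TEXT'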
Database-Specific Schema Discovery
if self.db_type == "sqlite":
cursor.execute(f"PRAGMA table_info({table_name})")
columns = cursor.fetchall()
This code performs the following actions if the database type is SQLite:
self.db_type == "sqlite": Checks whether the current database is an SQLite database.
cursor.execute(f"PRAGMA table_info({table_name})"): Executes an SQLite-specific command (PRAGMA table_info) that retrieves metadata about the columns of the table named table_name.It returns details like column name, type, whether it can be null, default value, etc.
columns = cursor.fetchall(): Fetches all rows returned by the query. Each row represents one column in the table.
What's PRAGMA?
PRAGMA: SQLite's special command for system information
table_info: Specific PRAGMA that reveals table structure
Like asking: "Hey SQLite, tell me everything about this table"
What comes back from SQLite:
cid | name     | type    | notnull | dflt_value | pk
0   | id       | INTEGER | 1       | NULL       | 1
1   | username | TEXT    | 1       | NULL       | 0
2   | email    | TEXT    | 0       | NULL       | 0
PostgreSQL Path:
cursor.execute("""
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = %s
ORDER BY ordinal_position
""", (table_name,))
cursor.execute("""
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = %s
ORDER BY ordinal_position
""", (table_name,))
This SQL query retrieves column metadata for a specific table from the information_schema.columns view, which is available in PostgreSQL.
Breakdown:
SELECT column_name, data_type, is_nullable, column_default retrieves:
column_name: the name of the column
data_type: the data type (e.g., integer, text, etc.)
is_nullable: whether the column allows NULL values
column_default: the default value (if any)
FROM information_schema.columns: queries the system catalog that holds metadata about all table columns.
WHERE table_name = %s: filters results to only include the specified table. %s is a placeholder for parameterized query input to prevent SQL injection.
ORDER BY ordinal_position: Ensures the columns are returned in the order they appear in the table.
information_schema: PostgreSQL's built-in catalog of database structure
Schema Transformation
SQLite Processing:
schema = {}
for col in columns:
    col_dict = dict(col)
    schema[col_dict['name']] = {
        'type': col_dict['type'],
        'not_null': bool(col_dict['notnull']),
        'default_value': col_dict['dflt_value'],
        'primary_key': bool(col_dict['pk'])
    }
Step by step:
col_dict = dict(col): Convert database row to dictionary
col_dict['name']: Get column name (like 'username')
bool(col_dict['notnull']): Convert 1/0 to True/False
Create standardized format for each column
Example transformation:
# Raw SQLite data:
{'name': 'username', 'type': 'TEXT', 'notnull': 1, 'pk': 0}

# Becomes:
'username': {
    'type': 'TEXT',
    'not_null': True,
    'default_value': None,
    'primary_key': False
}
PostgreSQL Processing:
schema[col_dict['column_name']] = {
    'type': col_dict['data_type'],
    'not_null': col_dict['is_nullable'] == 'NO',
    'default_value': col_dict['column_default'],
    'primary_key': False  # Would need additional query
}
Key differences:
column_name vs name: Different field names between databases
is_nullable == 'NO': PostgreSQL returns 'YES'/'NO', convert to True/False
'primary_key': False: The comment flags a limitation - identifying primary keys would need an additional query
Caching and Error Handling
self._table_schemas[table_name] = schema
return schema
The Filing System:
Store the discovered schema in memory
Next time someone asks about this table, return instantly
except Exception as e:
    logger.error(f"Error getting schema for table {table_name}: {e}")
    return {}
Graceful Failure:
If anything goes wrong, log the error
Return empty dictionary instead of crashing
Philosophy: Better to have no schema info than to break the entire application
The Quality Inspector
def _validate_insert_data(self, table_name: str, data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Validate data before insertion"""
    errors = []
    for column_name, column_info in schema.items():
        if column_info['not_null'] and column_info['default_value'] is None:
            # This column requires a value
            if column_name not in data:
                errors.append(f"Required column '{column_name}' is missing")
            elif data[column_name] is None:
                errors.append(f"Column '{column_name}' cannot be NULL")
            elif isinstance(data[column_name], str) and data[column_name].strip() == "":
                errors.append(f"Column '{column_name}' cannot be empty (requires a non-empty value)")

    # Check for unknown columns
    for column_name in data.keys():
        if column_name not in schema:
            errors.append(f"Unknown column '{column_name}' for table '{table_name}'")

    return errors
Purpose
This method checks if data meets the table's requirements before trying to insert it. Like a bouncer checking IDs before letting people into a club.
def _validate_insert_data(self, table_name: str, data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
Takes table name, data to insert, and table schema
Returns list of error messages (empty list = no problems)
Required Field Validation
for column_name, column_info in schema.items():
    if column_info['not_null'] and column_info['default_value'] is None:
The Triple Check Logic:
column_info['not_null']: Database says this field is required
column_info['default_value'] is None: No fallback value available
Combined meaning: "This field MUST have a value provided"
Why both conditions?
A column can be NOT NULL but have a default value
Example: created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
In this case, you don't need to provide a value because the database will use the current time
Three Validation Scenarios
Scenario 1: Missing Column
if column_name not in data:
    errors.append(f"Required column '{column_name}' is missing")
Example:
# Schema requires: username, email, password
# Data provided: {'email': 'user@example.com', 'password': '123'}
# Missing: username
# Error: "Required column 'username' is missing"
Scenario 2: Explicit NULL
elif data[column_name] is None:
    errors.append(f"Column '{column_name}' cannot be NULL")
Example:
# Data: {'username': None, 'email': 'user@example.com'}
# Error: "Column 'username' cannot be NULL"
Scenario 3: Empty String
elif isinstance(data[column_name], str) and data[column_name].strip() == "":
    errors.append(f"Column '{column_name}' cannot be empty (requires a non-empty value)")
Example:
# Data: {'username': '   ', 'email': 'user@example.com'}
# After strip(): username becomes ''
# Error: "Column 'username' cannot be empty (requires a non-empty value)"
Unknown Column Detection
for column_name in data.keys():
    if column_name not in schema:
        errors.append(f"Unknown column '{column_name}' for table '{table_name}'")
Typo Prevention:
# Schema has: username, email, password
# Data: {'usrname': 'john', 'email': 'john@example.com'}  # Typo!
# Error: "Unknown column 'usrname' for table 'users'"
Return Value:
return errors
Empty list []: All validation passed
List with errors: Each string describes a specific problem
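Putting the three scenarios together, here's a hedged usage sketch (the schema dict simply mirrors the shape produced by get_table_schema above):

schema = {
    "username": {"type": "TEXT", "not_null": True,
                 "default_value": None, "primary_key": False},
    "email":    {"type": "TEXT", "not_null": False,
                 "default_value": None, "primary_key": False},
}

errors = manager._validate_insert_data("users", {"usrname": "john"}, schema)
print(errors)
# ["Required column 'username' is missing",
#  "Unknown column 'usrname' for table 'users'"]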
The Data Sanitizer 🧹
def _clean_insert_data(self, data: Dict[str, Any]) -> Dict[str, Any]:
    """Clean and prepare data for insertion"""
    cleaned_data = {}
    for key, value in data.items():
        if value is None:
            cleaned_data[key] = None
        elif isinstance(value, str):
            # Convert empty strings to None for database insertion
            # This respects NOT NULL constraints properly
            cleaned_value = value.strip()
            cleaned_data[key] = cleaned_value if cleaned_value else None
        else:
            cleaned_data[key] = value
    return cleaned_data
Purpose
This method prepares data for database insertion by cleaning and standardizing it. Like washing vegetables before cooking.
def _clean_insert_data(self, data: Dict[str, Any]) -> Dict[str, Any]:
The Cleaning Process
cleaned_data = {}
for key, value in data.items():
    if value is None:
        cleaned_data[key] = None
Preserve NULL values: If someone explicitly wants NULL, keep it NULL.
elif isinstance(value, str):
    cleaned_value = value.strip()
    cleaned_data[key] = cleaned_value if cleaned_value else None
String Processing Logic:
isinstance(value, str): Only process strings
value.strip(): Remove leading/trailing whitespace
cleaned_value if cleaned_value else None: If empty after strip, convert to NULL
Examples:
# Input: {'name': '  John  ', 'description': '   '}
# Processing:
#   '  John  '.strip() = 'John'  (not empty, keep it)
#   '   '.strip() = ''           (empty, convert to None)
# Output: {'name': 'John', 'description': None}
else:
    cleaned_data[key] = value
Preserve Other Types: Numbers, dates, booleans pass through unchanged.
Why Clean Data?
Database Consistency:
Empty strings '' and NULL are different in databases
NULL means "no value"
Empty string means "a value that happens to be empty"
Most validation logic works better with NULL
Whitespace Issues:
# Before cleaning:
{'username': ' john '} # Accidental spaces
# After cleaning:
{'username': 'john'} # Clean data
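One last demonstration of the cleaning rules end to end (assuming a DatabaseManager instance named manager):

raw = {"username": "  john  ", "bio": "   ", "age": 30, "nickname": None}
print(manager._clean_insert_data(raw))
# {'username': 'john', 'bio': None, 'age': 30, 'nickname': None}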
Part 3 is available at the following link: https://www.codersarts.com/post/building-a-database-mcp-server-the-ultimate-code-walkthrough-part-3
Transform Your Projects with Codersarts
Enterprise-grade database servers with advanced security and performance optimization
Multi-database integration systems supporting SQLite, PostgreSQL, MySQL, and cloud databases
AI-powered query optimization and intelligent database management tools
Production-ready deployment with monitoring, logging, and automated backup systems
Don't let complex implementations slow down your innovation. Connect with Codersarts today and turn your ideas into reality!
Ready to get started? Visit Codersarts.com or reach out to our team to discuss your next project. The future of intelligent automation is here – let's build it together!
