Building a Database MCP Server: The Ultimate Code Walkthrough - Part 2
- ganesh90
- Jul 4
- 17 min read
Updated: Jul 7
Prerequisite: This is a continuation of the blog Building a Database MCP Server: The Ultimate Code Walkthrough - Part 1

The Foundation - Setting Up Our Digital Workshop
import asyncio
import json
import sqlite3
import os
import re
from pathlib import Path
from typing import List, Dict, Any, Optional, Union
from contextlib import asynccontextmanager
from collections.abc import AsyncIterator
from mcp.server.fastmcp import FastMCP
import logging
What's happening here? 🤔
Think of this section as setting up your workshop before starting a big project. We're importing all our tools and setting up our "digital workbench."
Core System Libraries:
import asyncio: This is like hiring a master multitasker! AsyncIO allows our server to handle multiple database requests simultaneously without blocking. Imagine a waiter who can take multiple orders, serve different tables, and process payments all at the same time without making anyone wait.
import json: Our universal translator! Every response from our database server will be formatted as JSON, making it easy for other programs (including AI systems) to understand and process our data. It's like having a diplomat who can speak every programming language.
import sqlite3: Python's built-in SQLite interface – this is your master key for all SQLite database files. No additional installation required! It's like having a Swiss Army knife that came free with your toolkit.
File and Pattern Wizardry:
import os: The operating system interface that helps us work with files and directories across different platforms (Windows, Mac, Linux). Think of it as your GPS for navigating the file system.
import re: Regular expressions – our pattern-matching superhero! This becomes our security guard, scanning SQL queries for dangerous patterns that could indicate injection attacks. It's like having a bouncer who can spot troublemakers from a mile away.
from pathlib import Path: This is like upgrading from a horse-drawn carriage to a Tesla! Path() gives us modern, elegant file path handling that works seamlessly across all operating systems. No more messy string concatenation!
Type System Excellence:
from typing import List, Dict, Any, Optional, Union: These make our code self-documenting with type hints. It's like putting clear, descriptive labels on all your storage boxes so everyone knows exactly what's inside. Your IDE will thank you with better autocomplete and error detection!
Advanced Python Features:
from contextlib import asynccontextmanager: This is for advanced resource management – think of it as an automatic door that always closes behind you, even if you forget or something goes wrong.
from collections.abc import AsyncIterator: Type hints for asynchronous operations. It's like having proper documentation for your conveyor belt system.
The MCP Foundation:
from mcp.server.fastmcp import FastMCP: This is the heart and soul of our system! FastMCP provides the foundation for creating our database tools and API endpoints. Think of it as the blueprint and construction materials for building our digital database assistant.
System Monitoring:
import logging: Our system's detailed diary! Every important event, error, and operation gets recorded. When things go sideways (and they sometimes will), these logs become our detective clues to solve the mystery.
This is like laying out all your knives, cutting boards, and ingredients before you start cooking a gourmet meal. Organization is the secret to success!
PostgreSQL Support - The Smart Import Strategy
# PostgreSQL support
try:
    import psycopg2
    import psycopg2.extras
    POSTGRES_AVAILABLE = True
    print("PostgreSQL support loaded")
except ImportError:
    POSTGRES_AVAILABLE = False
    print("PostgreSQL not available - SQLite only mode")
The Brilliant Defensive Programming Pattern! 🛡️
This is like having a car that can run on premium gas but automatically switches to regular if premium isn't available:
The Try Block - Optimistic Attempt:
import psycopg2: The most popular PostgreSQL adapter for Python. This is like trying to install enterprise-grade database equipment.
import psycopg2.extras: Advanced PostgreSQL features, especially RealDictCursor which makes query results behave like dictionaries instead of confusing tuples.
POSTGRES_AVAILABLE = True: Sets a global flag indicating "Yes! We have enterprise database capabilities!"
print("PostgreSQL support loaded"): Immediate feedback to the user about system capabilities.
The Except Block - Graceful Fallback:
except ImportError:: Catches the specific error when Python can't find the PostgreSQL libraries.
POSTGRES_AVAILABLE = False: Sets the flag to indicate "We'll work with SQLite only, and that's perfectly fine!"
print("PostgreSQL not available - SQLite only mode"): Clear communication about what the system can do.
Why This is a Good Practice:
No crashes: System continues working even without PostgreSQL
Clear feedback: User knows exactly what capabilities are available
Future-proof: Easy to add PostgreSQL support later by just installing the library
Smart adaptation: Code checks this flag before attempting PostgreSQL operations
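To make that last point concrete, here's a minimal sketch (supported_backends is a hypothetical helper, not part of the server code) of how later code can branch on the flag:

def supported_backends():
    backends = ["sqlite"]          # always available via Python's standard library
    if POSTGRES_AVAILABLE:         # flag set by the try/except import above
        backends.append("postgresql")
    return backends

print(supported_backends())        # ['sqlite'] when psycopg2 isn't installed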
System Configuration - Our Digital DNA
# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
Logging Configuration Deep Dive:
logging.basicConfig(level=logging.INFO): This is like setting up a security camera system for your application:
INFO level: Shows important events without overwhelming detail
Alternative levels: DEBUG (everything), WARNING (problems), ERROR (failures), CRITICAL (disasters)
Why INFO: Perfect balance of useful information without noise
logger = logging.getLogger(__name__): Creates a personalized logger for this specific module:
__name__: Special Python variable containing the module name
Hierarchical structure: Like having department-specific filing systems
Best practice: Each module gets its own logger for organized debugging
It's like having a professional secretary who keeps detailed notes of every important meeting, phone call, and decision, but doesn't bore you with trivial details like "someone walked past the office."
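To see the levels in action, here's a tiny illustration (the messages are made up for demonstration):

logger.debug("Opening cursor")             # hidden at INFO level
logger.info("Database initialized")        # shown
logger.warning("Connection test failed")   # shown
logger.error("Query failed")               # shown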
The Document Class - Your Information Envelope
class Document:
    """Simple container for text documents with metadata."""

    def __init__(self, text: str, metadata: Dict[str, Any] = None):
        self.page_content = text
        self.metadata = metadata or {}
The Smart Data Container Explained 📄
Imagine a magical envelope that holds both a letter and a detailed information card about that letter (like where it came from, when it was written, who sent it). That's exactly what our Document class does!
Constructor Breakdown:
def __init__(self, text: str, metadata: Dict[str, Any] = None):: The blueprint for creating document objects
text: str: The actual content (required parameter)
metadata: Dict[str, Any] = None: Optional extra information about the document
Instance Variables:
self.page_content = text: Stores the actual text content following LangChain conventions (industry standard)
self.metadata = metadata or {}: The brilliant "or {}" trick prevents the dangerous mutable-default-argument problem
The Clever or {} Trick Explained:
metadata = None # User didn't provide metadata
self.metadata = None or {} # Results in {}
metadata = {"source": "file.pdf"} # User provided metadata
self.metadata = {"source": "file.pdf"} or {} # Results in {"source": "file.pdf"}
Why This Matters:
Prevents bugs: Avoids shared mutable defaults that can cause mysterious data corruption
Always safe: You can always do document.metadata["key"] = "value" without crashes
Industry standard: Follows best practices used by professional developers
It's like a library book that not only contains the story but also has the catalog card permanently attached, so you always know both what it says and where it came from.
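Here's a short sketch of the mutable-default bug this pattern sidesteps (BadDocument is a deliberately broken example, not part of the server):

class BadDocument:
    def __init__(self, text: str, metadata: dict = {}):  # ONE dict shared by every instance!
        self.page_content = text
        self.metadata = metadata

a = BadDocument("first")
b = BadDocument("second")
a.metadata["source"] = "a.pdf"
print(b.metadata)   # {'source': 'a.pdf'} - b was silently corrupted!

doc = Document("hello")            # the None-default + `or {}` pattern
doc.metadata["source"] = "x.pdf"   # safe: each Document gets its own fresh dict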
Database Manager Class - Your Database Butler
class DatabaseManager:
    """Manages database connections and operations"""

    def __init__(self, db_type: str = "sqlite", db_path: str = "database.db",
                 connection_string: str = None):
        self.db_type = db_type.lower()
        self.db_path = db_path
        self.connection_string = connection_string
        self.connection = None
        self.max_results = 1000  # Default limit for safety
        self.is_initialized = False
        self._table_schemas = {}  # Schema cache used by get_table_schema() below
Meet Your Personal Database Assistant! 🏢
This class is the heart of our system – think of it as designing a professional database administrator role with all the skills and safety measures you'd want.
Class Declaration:
class DatabaseManager:: Creates a blueprint for managing database connections
Docstring: Clear explanation of the class purpose (essential for maintainable code)
Constructor Deep Dive:
def __init__(self, db_type: str = "sqlite", db_path: str = "database.db", connection_string: str = None):
Multiple parameters: Flexible enough to handle different database types and configurations
Default values: Smart defaults make the class easy to use
Type hints: Self-documenting code that helps IDEs and other developers
Instance Variable Analysis:
self.db_type = db_type.lower():
Why .lower()? Handles user input inconsistency ("SQLite", "SQLITE", "sqlite" all work)
Defensive programming: Prevents bugs from capitalization differences
Example: User types "SQLite" → stored as "sqlite"
self.db_path = db_path:
Flexible storage: Could be a file path ("/data/my.db") or database name ("production_db")
Context dependent: Meaning changes based on database type
self.connection_string = connection_string:
PostgreSQL requirement: Stores complex connection details
Example: "postgresql://user:password@localhost:5432/mydb"
Optional for SQLite: SQLite doesn't need connection strings
self.connection = None:
Lazy initialization: Connection will be created when needed
Safety first: Starts in a safe, disconnected state
Memory efficient: No resources wasted until actually needed
self.max_results = 1000:
Critical safety feature: Prevents accidentally loading millions of records
Memory protection: Stops queries that could crash your system
Reasonable default: 1000 records is usually manageable but substantial
self.is_initialized = False:
State tracking: Prevents operations on uninitialized connections
Safety flag: Other methods check this before proceeding
Debugging aid: Easy to see if initialization succeeded
It's like hiring a personal assistant who remembers your preferences (database type), knows where all your files are stored (database path), has all the necessary access codes (connection string), and comes with built-in safety protocols to prevent disasters!
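A couple of quick usage sketches (the paths and credentials below are placeholders):

sqlite_mgr = DatabaseManager(db_type="SQLite", db_path="/data/my.db")
print(sqlite_mgr.db_type)         # "sqlite" - .lower() normalized the input
print(sqlite_mgr.is_initialized)  # False - nothing is connected yet

pg_mgr = DatabaseManager(
    db_type="postgresql",
    connection_string="postgresql://user:password@localhost:5432/mydb",
)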
Database Initialization - The Connection Wizard
async def initialize(self):
    """Initialize database connection"""
    try:
        if self.db_type == "sqlite":
            await self._init_sqlite()
        elif self.db_type == "postgresql" and POSTGRES_AVAILABLE:
            await self._init_postgresql()
        else:
            raise ValueError(f"Unsupported database type: {self.db_type}")
        self.is_initialized = True
        logger.info(f"Database initialized: {self.db_type} at {self.db_path}")
    except Exception as e:
        logger.error(f"Database initialization failed: {e}")
        self.is_initialized = False
        raise
The Smart Lock System! 🔐
This method is like a smart lock on your house that automatically recognizes what type of key you need and uses the appropriate mechanism.
Method Signature Analysis:
async def initialize(self)::
Async advantage: Allows other operations while connection is being established
Real-world benefit: Like being able to answer phone calls while waiting for your computer to boot up
Modern pattern: Async/await is the standard for I/O operations in Python
The Branching Logic:
if self.db_type == "sqlite"::
Route 1: File-based database handling
Delegates responsibility: Calls specialized SQLite setup method
Single responsibility: Each method has one clear job
elif self.db_type == "postgresql" and POSTGRES_AVAILABLE::
Route 2: PostgreSQL database handling
Double check: both the type must be correct AND the libraries must be available
Fail-safe: Prevents crashes if PostgreSQL libraries aren't installed
else::
Error handling: Catches unsupported database types
Clear error message: Tells user exactly what went wrong
Fail fast principle: Better to crash immediately with clear message than mysteriously later
Success Path:
self.is_initialized = True: Only set after successful initialization
logger.info(...): Records success with details for debugging
State consistency: Object state accurately reflects reality
Error Handling Excellence:
except Exception as e:: Catches any error during initialization
logger.error(...): Records failure with specific error details
self.is_initialized = False: Ensures object state reflects failure
raise: Re-throws exception so caller knows initialization failed
Why Re-raise? Like a chain of command – the error gets logged at this level for debugging, but higher-level code needs to know that initialization failed so it can handle the situation appropriately.
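Here's what that chain of command looks like from the caller's side (the startup policy shown is illustrative, not part of the server):

import asyncio

async def main():
    manager = DatabaseManager(db_type="sqlite", db_path="/data/my.db")
    try:
        await manager.initialize()
    except Exception:
        # Details were already logged inside initialize();
        # the caller only decides what to do about the failure.
        raise SystemExit("Startup aborted - see logs for details")

asyncio.run(main())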
SQLite Connection Specialist
async def _init_sqlite(self):
    """Initialize SQLite connection with existence check"""
    db_path = Path(self.db_path)

    # Check if database file exists
    if not db_path.exists():
        raise FileNotFoundError(f"Database file does not exist: {self.db_path}")

    # Check if it's actually a file (not a directory)
    if not db_path.is_file():
        raise ValueError(f"Path exists but is not a file: {self.db_path}")

    # Create parent directory if needed (for log files, etc.)
    db_path.parent.mkdir(parents=True, exist_ok=True)

    self.connection = sqlite3.connect(self.db_path, check_same_thread=False)
    self.connection.row_factory = sqlite3.Row
The Thorough Security Guard! 🗃️
This method is like a security guard who checks multiple forms of ID before letting you into a secure building.
Path Object Creation:
db_path = Path(self.db_path):
Modern approach: Converts string path to powerful Path object
Cross-platform: Works identically on Windows, Mac, and Linux
Rich functionality: Provides methods like .exists(), .is_file(), .parent
Validation Layer 1 - Existence Check:
if not db_path.exists()::
Primary check: "Does this database file actually exist on disk?"
Prevents confusion: Stops cryptic errors later in the process
Clear error: FileNotFoundError with exact path that was checked
Validation Layer 2 - File Type Check:
if not db_path.is_file()::
Subtle but important: Path exists but is it a file or folder?
Common mistake: Users sometimes point to directories instead of database files
Specific error: ValueError explaining the exact problem
Proactive Directory Creation:
db_path.parent.mkdir(parents=True, exist_ok=True):
Forward thinking: Creates parent directories if they don't exist
parents=True: Like "mkdir -p" in Unix - creates intermediate directories
exist_ok=True: Doesn't error if directories already exist
Why needed: Sometimes you need directories for log files or backups
Actual Connection Establishment:
self.connection = sqlite3.connect(self.db_path, check_same_thread=False):
sqlite3.connect(): Standard Python function for SQLite connections
check_same_thread=False: Essential for async operations - allows connection use across threads
Thread safety: Required for our async database operations
User Experience Enhancement:
self.connection.row_factory = sqlite3.Row:
Game changer: Makes query results behave like dictionaries
Before: result[0] (what's column 0 again? 🤔)
After: result['name'] (crystal clear! ✨)
Professional touch: Much more usable than plain tuples
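A quick sketch of the difference (the table and column names are illustrative):

cursor = manager.connection.cursor()
cursor.execute("SELECT id, name FROM users LIMIT 1")
row = cursor.fetchone()

print(row[0])        # still works: access by position
print(row["name"])   # the sqlite3.Row bonus: access by column name
print(dict(row))     # e.g. {'id': 1, 'name': 'Alice'}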
PostgreSQL Enterprise Setup
async def _init_postgresql(self):
    """Initialize PostgreSQL connection"""
    if not self.connection_string:
        raise ValueError("PostgreSQL connection string required")

    self.connection = psycopg2.connect(
        self.connection_string,
        cursor_factory=psycopg2.extras.RealDictCursor
    )
    self.connection.autocommit = True
The Enterprise Database Specialist! 🐘
This method handles the more complex enterprise database setup with appropriate validation and configuration.
Validation Check:
if not self.connection_string::
Required parameter: PostgreSQL needs detailed connection information
Clear error: Tells user exactly what they need to provide
Example needed: "postgresql://user:password@localhost:5432/dbname"
Connection Establishment:
self.connection = psycopg2.connect(...):
Enterprise connection: Much more complex than SQLite file connections
Multi-line formatting: Makes complex function calls readable
Professional setup: Handles authentication, networking, database selection
Advanced Configuration:
cursor_factory=psycopg2.extras.RealDictCursor:
Consistency: PostgreSQL equivalent of SQLite's Row factory
Dictionary results: Same user experience as SQLite
Professional touch: Query results are easy to work with
Transaction Management:
self.connection.autocommit = True:
Simplified management: Automatically commits each operation
Trade-off: Easier to use but less control over transactions
Appropriate choice: Good for our use case of simple database operations
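To see what RealDictCursor changes in practice, a hedged probe (assuming a live PostgreSQL connection):

cursor = manager.connection.cursor()
cursor.execute("SELECT 1 AS probe")
print(cursor.fetchone())   # dict-like row, e.g. {'probe': 1}, instead of the tuple (1,)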
Connection Health Monitoring
def is_connected(self) -> bool:
    """Check if database is connected and functional"""
    if not self.is_initialized or not self.connection:
        return False
    try:
        cursor = self.connection.cursor()
        if self.db_type == "sqlite":
            cursor.execute("SELECT 1")
        else:  # PostgreSQL
            cursor.execute("SELECT 1")
        result = cursor.fetchone()
        cursor.close()
        return result is not None
    except Exception as e:
        logger.warning(f"Connection test failed for {self.db_path}: {e}")
        return False
The Database Doctor! 🔍
This method is like having a doctor who can quickly check if a patient is responsive and healthy.
Quick Preliminary Checks:
if not self.is_initialized or not self.connection::
Fast path: Avoid expensive tests if we already know the answer
Logic: If setup never happened OR connection object doesn't exist
Efficiency: No point testing a connection that doesn't exist
Active Health Test:
cursor = self.connection.cursor():
Standard pattern: Create cursor for database operations
Database interface: Cursors are how you execute SQL commands
cursor.execute("SELECT 1"):
Universal test: Simplest possible query that works on all SQL databases
Minimal overhead: Just returns the number 1
Real communication: Actually talks to the database to verify connectivity
Result Verification:
result = cursor.fetchone():
Get response: Retrieves the result of our test query
Expected result: Should return (1,) or similar
return result is not None:
Boolean conversion: True if we got any result, False if None
Health indicator: Any result means the connection is working
Resource Cleanup:
cursor.close():
Good practice: Always clean up database resources
Prevents leaks: Stops accumulation of unused cursors
Error Handling:
except Exception as e::
Broad catch: Database connections can fail in many ways
Warning level: Connection tests failing is common, not catastrophic
Informative logging: Shows which database and what error occurred
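A sketch of how the rest of the server might lean on this health check (ensure_ready is a hypothetical helper, not part of the walkthrough code):

async def ensure_ready(manager: DatabaseManager) -> None:
    """Hypothetical guard: re-initialize before doing real work."""
    if not manager.is_connected():
        logger.warning("Database unreachable - re-initializing")
        await manager.initialize()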
The Schema Detective - Table Structure Discovery
async def get_table_schema(self, table_name: str) -> Dict[str, Any]:
    """Get and cache table schema information"""
    if table_name in self._table_schemas:
        return self._table_schemas[table_name]
    try:
        cursor = self.connection.cursor()
        if self.db_type == "sqlite":
            cursor.execute(f"PRAGMA table_info({table_name})")
            columns = cursor.fetchall()
            schema = {}
            for col in columns:
                col_dict = dict(col)
                schema[col_dict['name']] = {
                    'type': col_dict['type'],
                    'not_null': bool(col_dict['notnull']),
                    'default_value': col_dict['dflt_value'],
                    'primary_key': bool(col_dict['pk'])
                }
        else:  # PostgreSQL
            cursor.execute("""
                SELECT column_name, data_type, is_nullable, column_default
                FROM information_schema.columns
                WHERE table_name = %s
                ORDER BY ordinal_position
            """, (table_name,))
            columns = cursor.fetchall()
            schema = {}
            for col in columns:
                col_dict = dict(col)
                schema[col_dict['column_name']] = {
                    'type': col_dict['data_type'],
                    'not_null': col_dict['is_nullable'] == 'NO',
                    'default_value': col_dict['column_default'],
                    'primary_key': False  # Would need additional query
                }
        self._table_schemas[table_name] = schema
        return schema
    except Exception as e:
        logger.error(f"Error getting schema for table {table_name}: {e}")
        return {}
The Structure
async def get_table_schema(self, table_name: str) -> Dict[str, Any]:
async def: This declares an asynchronous function. Async processing is a modern way to run tasks without blocking the system. It allows your program to handle many tasks at the same time, instead of waiting for each one to finish before starting the next.
Think of it like this:
Synchronous processing is like a single line of people at a counter. Each person waits for the one before them to finish.
Asynchronous processing is like having many counters open. People can be served at the same time.
What does async mean?
The async keyword tells the computer:
This task might take time, so keep going with other work and come back to it when it is ready.
In MCP, this means slow operations like database queries can run in the background while the system continues working on other tasks.
Why is async important in MCP?
It helps your system:
Run faster
Use resources more efficiently
Handle more users or tasks at the same time
Avoid delays caused by slow tasks
Stay stable even if some tasks fail
Key parts of async processing:
Task queues to manage which tasks to run
Context managers to track each task's progress
Event loops to switch between tasks efficiently
Callbacks to handle what happens when a task finishes or fails
table_name: str: Takes a table name as a string parameter
-> Dict[str, Any]: Returns a dictionary with string keys and any type of values
if table_name in self._table_schemas:
    return self._table_schemas[table_name]
The Memory Check:
self._table_schemas: A dictionary that acts like a filing cabinet
Why a dictionary/cache? Database schema rarely changes, so asking every time is wasteful
Performance boost: Instead of asking database "What's in the users table?" 100 times, ask once and remember
Example: If you already know what's on McDonald's menu, you don't need to read it again
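In code, the payoff looks like this (a hedged sketch assuming an initialized manager and a real users table):

async def demo(manager: DatabaseManager) -> None:
    schema = await manager.get_table_schema("users")  # first call: queries the database
    schema = await manager.get_table_schema("users")  # second call: served instantly from the cache
    print(schema["username"]["type"])                 # e.g. 'TEXT'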
Database-Specific Schema Discovery
if self.db_type == "sqlite":
cursor.execute(f"PRAGMA table_info({table_name})")
columns = cursor.fetchall()
This code performs the following actions if the database type is SQLite:
self.db_type == "sqlite": Checks whether the current database is an SQLite database.
cursor.execute(f"PRAGMA table_info({table_name})"): Executes an SQLite-specific command (PRAGMA table_info) that retrieves metadata about the columns of the table named table_name.It returns details like column name, type, whether it can be null, default value, etc.
columns = cursor.fetchall(): Fetches all rows returned by the query. Each row represents one column in the table.
What's PRAGMA?
PRAGMA: SQLite's special command for system information
table_info: Specific PRAGMA that reveals table structure
Like asking: "Hey SQLite, tell me everything about this table"
What comes back from SQLite:
cid | name     | type    | notnull | dflt_value | pk
0   | id       | INTEGER | 1       | NULL       | 1
1   | username | TEXT    | 1       | NULL       | 0
2   | email    | TEXT    | 0       | NULL       | 0
PostgreSQL Path:
cursor.execute("""
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = %s
ORDER BY ordinal_position
""", (table_name,))
cursor.execute("""
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = %s
ORDER BY ordinal_position
""", (table_name,))
This SQL query retrieves column metadata for a specific table from the information_schema.columns view, which is available in PostgreSQL.
Breakdown:
SELECT column_name, data_type, is_nullable, column_default retrieves:
column_name: the name of the column
data_type: the data type (e.g., integer, text, etc.)
is_nullable: whether the column allows NULL values
column_default: the default value (if any)
FROM information_schema.columns: queries the system catalog that holds metadata about all table columns.
WHERE table_name = %s: filters results to only include the specified table. %s is a placeholder for parameterized query input to prevent SQL injection.
ORDER BY ordinal_position: Ensures the columns are returned in the order they appear in the table.
information_schema: PostgreSQL's built-in catalog of database structure
Schema Transformation
SQLite Processing:
schema = {}
for col in columns:
    col_dict = dict(col)
    schema[col_dict['name']] = {
        'type': col_dict['type'],
        'not_null': bool(col_dict['notnull']),
        'default_value': col_dict['dflt_value'],
        'primary_key': bool(col_dict['pk'])
    }
Step by step:
col_dict = dict(col): Convert database row to dictionary
col_dict['name']: Get column name (like 'username')
bool(col_dict['notnull']): Convert 1/0 to True/False
Create standardized format for each column
Example transformation:
# Raw SQLite data:
{'name': 'username', 'type': 'TEXT', 'notnull': 1, 'pk': 0}

# Becomes:
'username': {
    'type': 'TEXT',
    'not_null': True,
    'default_value': None,
    'primary_key': False
}
PostgreSQL Processing:
schema[col_dict['column_name']] = {
    'type': col_dict['data_type'],
    'not_null': col_dict['is_nullable'] == 'NO',
    'default_value': col_dict['column_default'],
    'primary_key': False  # Would need additional query
}
Key differences:
column_name vs name: Different field names between databases
is_nullable == 'NO': PostgreSQL returns 'YES'/'NO', convert to True/False
'primary_key': False: The comment flags a limitation - identifying primary keys would need an additional query
Caching and Error Handling
self._table_schemas[table_name] = schema
return schema
The Filing System:
Store the discovered schema in memory
Next time someone asks about this table, return instantly
except Exception as e:
    logger.error(f"Error getting schema for table {table_name}: {e}")
    return {}
Graceful Failure:
If anything goes wrong, log the error
Return empty dictionary instead of crashing
Philosophy: Better to have no schema info than to break the entire application
The Quality Inspector
def _validate_insert_data(self, table_name: str, data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Validate data before insertion"""
    errors = []
    for column_name, column_info in schema.items():
        if column_info['not_null'] and column_info['default_value'] is None:
            # This column requires a value
            if column_name not in data:
                errors.append(f"Required column '{column_name}' is missing")
            elif data[column_name] is None:
                errors.append(f"Column '{column_name}' cannot be NULL")
            elif isinstance(data[column_name], str) and data[column_name].strip() == "":
                errors.append(f"Column '{column_name}' cannot be empty (requires a non-empty value)")

    # Check for unknown columns
    for column_name in data.keys():
        if column_name not in schema:
            errors.append(f"Unknown column '{column_name}' for table '{table_name}'")

    return errors
Purpose
This method checks if data meets the table's requirements before trying to insert it. Like a bouncer checking IDs before letting people into a club.
def _validate_insert_data(self, table_name: str, data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
Takes table name, data to insert, and table schema
Returns list of error messages (empty list = no problems)
Required Field Validation
for column_name, column_info in schema.items():
    if column_info['not_null'] and column_info['default_value'] is None:
The Triple Check Logic:
column_info['not_null']: Database says this field is required
column_info['default_value'] is None: No fallback value available
Combined meaning: "This field MUST have a value provided"
Why both conditions?
A column can be NOT NULL but have a default value
Example: created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
In this case, you don't need to provide a value because the database will use the current time
Three Validation Scenarios
Scenario 1: Missing Column
if column_name not in data:
    errors.append(f"Required column '{column_name}' is missing")
Example:
# Schema requires: username, email, password
# Data provided: {'email': 'user@example.com', 'password': '123'}
# Missing: username
# Error: "Required column 'username' is missing"
Scenario 2: Explicit NULL
elif data[column_name] is None:
    errors.append(f"Column '{column_name}' cannot be NULL")
Example:
# Data: {'username': None, 'email': 'user@example.com'}
# Error: "Column 'username' cannot be NULL"
Scenario 3: Empty String
elif isinstance(data[column_name], str) and data[column_name].strip() == "":
    errors.append(f"Column '{column_name}' cannot be empty (requires a non-empty value)")
Example:
# Data: {'username': '   ', 'email': 'user@example.com'}
# After strip(): username becomes ''
# Error: "Column 'username' cannot be empty (requires a non-empty value)"
Unknown Column Detection
for column_name in data.keys():
    if column_name not in schema:
        errors.append(f"Unknown column '{column_name}' for table '{table_name}'")
Typo Prevention:
# Schema has: username, email, password
# Data: {'usrname': 'john', 'email': 'john@example.com'}  # Typo!
# Error: "Unknown column 'usrname' for table 'users'"
Return Value:
return errors
Empty list []: All validation passed
List with errors: Each string describes a specific problem
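Putting the three scenarios together, here's a hedged usage sketch (the schema dict simply mirrors the shape produced by get_table_schema above):

schema = {
    "username": {"type": "TEXT", "not_null": True,
                 "default_value": None, "primary_key": False},
    "email":    {"type": "TEXT", "not_null": False,
                 "default_value": None, "primary_key": False},
}

errors = manager._validate_insert_data("users", {"usrname": "john"}, schema)
print(errors)
# ["Required column 'username' is missing",
#  "Unknown column 'usrname' for table 'users'"]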
The Data Sanitizer 🧹
def _clean_insert_data(self, data: Dict[str, Any]) -> Dict[str, Any]:
    """Clean and prepare data for insertion"""
    cleaned_data = {}
    for key, value in data.items():
        if value is None:
            cleaned_data[key] = None
        elif isinstance(value, str):
            # Convert empty strings to None for database insertion
            # This respects NOT NULL constraints properly
            cleaned_value = value.strip()
            cleaned_data[key] = cleaned_value if cleaned_value else None
        else:
            cleaned_data[key] = value
    return cleaned_data
Purpose
This method prepares data for database insertion by cleaning and standardizing it. Like washing vegetables before cooking.
def _clean_insert_data(self, data: Dict[str, Any]) -> Dict[str, Any]:
The Cleaning Process
cleaned_data = {}
for key, value in data.items():
    if value is None:
        cleaned_data[key] = None
Preserve NULL values: If someone explicitly wants NULL, keep it NULL.
elif isinstance(value, str):
    cleaned_value = value.strip()
    cleaned_data[key] = cleaned_value if cleaned_value else None
String Processing Logic:
isinstance(value, str): Only process strings
value.strip(): Remove leading/trailing whitespace
cleaned_value if cleaned_value else None: If empty after strip, convert to NULL
Examples:
# Input: {'name': '  John  ', 'description': '   '}
# Processing:
#   '  John  '.strip() = 'John'  (not empty, keep it)
#   '   '.strip() = ''           (empty, convert to None)
# Output: {'name': 'John', 'description': None}
else:
    cleaned_data[key] = value
Preserve Other Types: Numbers, dates, booleans pass through unchanged.
Why Clean Data?
Database Consistency:
Empty strings '' and NULL are different in databases
NULL means "no value"
Empty string means "a value that happens to be empty"
Most validation logic works better with NULL
Whitespace Issues:
# Before cleaning:
{'username': ' john '} # Accidental spaces
# After cleaning:
{'username': 'john'} # Clean data
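One last demonstration of the cleaning rules end to end (assuming a DatabaseManager instance named manager):

raw = {"username": "  john  ", "bio": "   ", "age": 30, "nickname": None}
print(manager._clean_insert_data(raw))
# {'username': 'john', 'bio': None, 'age': 30, 'nickname': None}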
Part 3 is available at the following link: https://www.codersarts.com/post/building-a-database-mcp-server-the-ultimate-code-walkthrough-part-3
Transform Your Projects with Codersarts
Enterprise-grade database servers with advanced security and performance optimization
Multi-database integration systems supporting SQLite, PostgreSQL, MySQL, and cloud databases
AI-powered query optimization and intelligent database management tools
Production-ready deployment with monitoring, logging, and automated backup systems
Don't let complex implementations slow down your innovation. Connect with Codersarts today and turn your ideas into reality!
Ready to get started? Visit Codersarts.com or reach out to our team to discuss your next project. The future of intelligent automation is here – let's build it together!
