
AI Backend System Design & Implementation




Course: AI Backend Engineering with FastAPI 

Assignment Type: Applied Project Assignment 

Difficulty Level: Medium → Advanced 

Estimated Time: 12–18 hours 

Submission Mode: Online LMS (Moodle / Canvas / Blackboard)




Assignment Context

Modern AI systems rarely consist of a single model endpoint. Instead, they involve multiple components working together, such as:


  • API orchestration

  • model inference

  • retrieval systems

  • asynchronous processing

  • background workflows

  • logging and monitoring


In this assignment, you will design and implement a mini AI backend platform that integrates several of these components.


The purpose is to help you apply architectural thinking and implement production-style patterns that were discussed throughout the course.





Assignment Objective

The objective of this assignment is to build a multi-endpoint AI service that demonstrates the ability to:


  • structure backend systems using modular architecture

  • handle synchronous and asynchronous AI workloads

  • integrate retrieval pipelines

  • manage long-running processes

  • provide observable and maintainable APIs


Your system should resemble a simplified AI backend microservice.





Expected System Overview


Students must build a FastAPI-based backend that supports the following capabilities:


  1. Conversational AI endpoint

  2. Document retrieval endpoint

  3. Streaming response endpoint

  4. Background processing task

  5. Monitoring/logging functionality


All components should work together as part of a single coherent backend system.





Functional Requirements

Your backend must implement at least the four endpoints described below.




Chat Endpoint



Endpoint

POST /chat



Purpose

Handle conversational interactions with an AI model.



Required Behavior

The endpoint must:


  • accept a structured message format

  • validate request input using Pydantic

  • call an LLM service

  • return a generated response



Example Request



{
  "messages": [
    {"role": "user", "content": "What is FastAPI?"}
  ],
  "temperature": 0.7
}



Example Response



{
  "response": "FastAPI is a modern Python framework..."
}




Streaming Endpoint



Endpoint

POST /chat/stream



Purpose

Deliver responses progressively rather than all at once.



Required Behavior


The endpoint should:


  • generate tokens or words gradually

  • return a StreamingResponse

  • demonstrate asynchronous streaming behavior



The response should visibly stream output when tested in the browser or via API clients.





Retrieval Endpoint



Endpoint

GET /retrieve



Purpose

Simulate a retrieval system used in Retrieval-Augmented Generation (RAG).



Expected Behavior


The endpoint should:


  • accept a query parameter

  • return a list of retrieved documents

  • simulate vector search behavior



Example response:




{
  "documents": [
    "FastAPI is a high-performance web framework",
    "RAG combines retrieval and generation"
  ]
}


The retrieval logic may be mocked or implemented using static documents.




Background Processing Endpoint



Endpoint

POST /process



Purpose

Trigger a long-running task.



Requirements


The endpoint must:


  • start a background task

  • immediately return a response to the client

  • execute the job asynchronously



Example use cases include:


  • document indexing

  • embedding generation

  • data analysis



Use FastAPI's BackgroundTasks to implement the processing.





System Design Requirements


Students must organize their project using a layered architecture.


Required components include:


Routers

Handle HTTP interactions.

Example:

app/routers/




Services

Contain business logic and AI processing.

Example:

app/services/




Schemas

Define request and response models.

Example:

app/schemas/




Core utilities

Contain reusable functionality such as:


  • logging

  • configuration

  • middleware


Example:

app/core/





Logging and Monitoring


Your system must include logging functionality.


Logs should capture:

  • request start

  • request completion

  • errors

  • background job activity


A middleware-based logging system is recommended.





Async Programming Requirements


At least one endpoint must use asynchronous processing.


Examples include:


  • LLM service call

  • streaming endpoint

  • external API simulation



Use async def and await appropriately where required.
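For example, a simulated non-blocking external call: `await` yields control to the event loop while the call waits, and `asyncio.gather` runs several calls concurrently. The function names here are illustrative.

```python
import asyncio

async def call_llm(prompt: str) -> str:
    # Simulates a non-blocking external API call.
    await asyncio.sleep(0.01)
    return f"answer to: {prompt}"

async def handle_batch(prompts: list[str]) -> list[str]:
    # gather schedules all calls at once, so total latency is roughly
    # one call's latency rather than the sum of all of them.
    return await asyncio.gather(*(call_llm(p) for p in prompts))
```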





Testing the System


Students should verify that the system works correctly using:


  • FastAPI Swagger UI (/docs)

  • Postman or similar API clients

  • browser testing for streaming responses


Evidence of testing should be included in the submission.





Implementation Guidelines


Students should follow these best practices:


Code Organization


  • avoid placing all logic in main.py

  • separate routers and services

  • keep functions small and focused




Naming Conventions


Use descriptive names for:


  • endpoints

  • services

  • functions

  • variables




Documentation

Add comments where necessary to explain important logic.





Project Deliverables

Students must submit the following materials.





Source Code

Upload a complete project folder containing all files.

Required structure:



project/
  app/
    routers/
    services/
    schemas/
    core/
    main.py

Compress the folder before submission.





README Documentation

The project must include a README.md file containing:


  • project overview

  • system architecture explanation

  • instructions to run the API

  • example API calls





Execution Evidence


Include screenshots showing:


  • API server running

  • Swagger documentation page

  • successful endpoint responses

  • streaming output

  • background task logs





Short Reflection Report

Students must submit a short document (1 page) explaining:


  • challenges faced

  • design decisions made

  • possible improvements





Submission Format

Submit a ZIP archive named StudentID_AI_Backend_Project.zip containing:

  • project_code/

  • README.md

  • screenshots/

  • reflection.pdf


Upload the file through the LMS submission portal.





Assessment Criteria

Student submissions will be evaluated using the following criteria.


Category                         Marks
System Architecture              20
API Endpoint Functionality       25
Streaming Implementation         15
Background Task Implementation   15
Code Quality                     10
Documentation                    10
Testing Evidence                 5

Total Marks: 100





Important Guidelines


Students should:


  • start early

  • test each endpoint individually

  • focus on architecture rather than complexity


Do not implement features beyond this scope unless seeking bonus credit.





Bonus Opportunities (Optional)


Students may earn additional marks for implementing:


  • authentication middleware

  • request rate limiting

  • real LLM integration

  • vector database integration

  • Celery task queue





Academic Integrity


All submitted work must be your own implementation.


Students may consult:


  • official documentation

  • course materials

  • reference resources


However, direct copying from external repositories is not allowed.





Instructor Notes

Students should treat this assignment as a mini production system design exercise, not just a coding task.


Focus on:


  • modularity

  • clarity

  • maintainability





Call to Action

Ready to transform your business with AI-powered intelligence that accelerates insights, enhances decision-making, and unlocks the full value of your data?


Codersarts is here to help you turn complex data workflows into efficient, scalable, and evidence-driven AI systems that empower teams to make smarter, faster, and more confident decisions.


Whether you’re a startup looking to build AI-driven products, an enterprise aiming to optimize operations through data science, or a research organization advancing innovation with intelligent data solutions, we bring the expertise and experience needed to design, develop, and deploy impactful AI systems that drive measurable business outcomes.




Get Started Today



Schedule an AI & Data Science Consultation:

Book a 30-minute discovery call with our AI strategists and data science experts to discuss your challenges, identify high-impact opportunities, and explore how intelligent AI solutions can transform your workflows and performance.




Request a Custom AI Demo:

Experience AI in action with a personalized demonstration built around your business use cases, datasets, operational environment, and decision workflows — showcasing practical value and real-world impact.









Transform your organization from data accumulation to intelligent decision enablement — accelerating insight generation, improving operational efficiency, and strengthening competitive advantage.


Partner with Codersarts to build scalable AI solutions including RAG systems, predictive analytics platforms, intelligent automation tools, recommendation engines, and custom machine learning models that empower your teams to deliver exceptional results.


Contact us today and take the first step toward next-generation AI and data science capabilities that grow with your business ambitions.



