AI-based Optical Mark Recognition (OMR) solution tailored for ACT bubble sheets
- Codersarts
- Jun 8

Hi Readers,
Welcome to another blog on AI project ideas.
Below is a comprehensive set of requirements and specifications for an AI-based Optical Mark Recognition (OMR) solution tailored for ACT bubble sheets. This document can serve as a foundation for planning, designing, and implementing the system, as well as a showcase in a project portfolio.
1. Introduction
1.1 Purpose
The purpose of this document is to define the requirements for the development of a high-accuracy AI-based Optical Mark Recognition (OMR) solution specifically designed for ACT bubble sheets. The goal is to accurately detect, interpret, and record marked answers, ensuring reliability, scalability, and ease of use for various educational and testing institutions.
1.2 Scope
In-Scope
Designing and implementing an image processing pipeline to detect and interpret marked bubbles on ACT answer sheets.
Use of AI/ML (artificial intelligence / machine learning) algorithms to improve accuracy and reduce errors due to variations in marking, scanning, or paper quality.
Integration of a user-friendly interface to display results and allow manual reviews or corrections.
Scalability to handle multiple concurrent scans and large volumes of test sheets.
Out-of-Scope
Printing and distribution of physical ACT sheets.
Grading logic beyond reading and capturing selected answers (storing correct answers for scoring is possible but not the primary focus).
Automated scanning hardware or optical devices beyond the software solution.
1.3 Definitions, Acronyms, and Abbreviations
OMR: Optical Mark Recognition, the process of detecting and interpreting marks on physical documents.
AI: Artificial Intelligence, which includes machine learning and deep learning methods.
ACT: A standardized test used for college admissions in the United States.
UI: User Interface.
API: Application Programming Interface.
DPI: Dots Per Inch, a measure of image resolution.
1.4 References
Official ACT Bubble Sheet Samples.
Standard OMR guidelines and best practices.
Relevant AI/ML literature for image classification and object detection.
1.5 Overview
This document details the functional and non-functional requirements of the OMR system, including system features, external interface requirements, performance, security, and other critical design considerations. The end goal is to deliver a robust, scalable, and accurate OMR solution for ACT bubble sheets.
2. Overall Description
2.1 Product Perspective
The AI-based OMR system will be a standalone software application capable of:
Accepting scanned images of ACT bubble sheets as input.
Automatically detecting alignment, orientation, and bubble regions.
Identifying which bubbles are filled and providing an output of selected answers.
It can be integrated into existing educational or testing solutions via APIs or provided as a desktop/online application for administrators to upload scanned sheets and view results.
2.2 Product Features Summary
Image Preprocessing
Deskewing and correcting for rotation.
Noise reduction and normalization.
Contrast and brightness adjustments.
Bubble Detection & Analysis
Identification of bubble regions using layout information or AI-based detection.
Classification of marks (filled vs. unfilled).
Handling partially filled or faint marks using confidence scores.
Answer Extraction & Validation
Extraction of bubble indices corresponding to test responses.
Validation rules to detect multiple marks in a single question.
Confidence scoring for ambiguous responses.
Error Handling & Review
Flagging of anomalies (e.g., tears, stains, multiple marks).
Manual correction interface for flagged questions.
Reporting & Data Export
Summarized results of recognized marks in standardized formats (CSV, JSON, XML).
Graphical UI to review scanned sheets and recognized answers.
System Administration & Configuration
Administrator panel for configuration (batch processing, scanning parameters).
Logging and audit trail for scanned sheets and detection results.
2.3 User Classes and Characteristics
Administrators: Responsible for configuring the system, managing user access, and overseeing large-scale scanning processes.
Operators/Proctors: Perform daily scanning tasks, monitor scanning equipment, and manage question paper templates.
Reviewers: Check flagged scans for anomalies and manually correct any misread marks.
Developers/Integrators: Maintain and integrate the software with other systems or workflows.
2.4 Operating Environment
The system will run on standard Windows/Linux/Mac servers or cloud-based virtual environments.
It must work with images typically scanned at 300–600 DPI, in JPEG or PNG format (other formats optional).
Must function on modern web browsers if a web-based interface is provided (Chrome, Firefox, Safari, Edge).
2.5 Design and Implementation Constraints
Comply with ACT standard bubble sheet layout constraints.
Respect image resolution and orientation guidelines.
Adhere to data privacy regulations for student test data (e.g., FERPA in the U.S.).
System must handle real-time or near real-time processing of individual answer sheets, as well as batch processing.
2.6 User Documentation
A user manual for operators describing how to scan and upload sheets, interpret results, and handle flagged anomalies.
An administrator manual detailing configuration options, system requirements, and integration endpoints.
2.7 Assumptions and Dependencies
Access to high-quality scanned images is available.
ACT bubble sheet layout remains consistent with official templates.
Sufficient training data is available or can be generated for AI-based detection models.
Accurate alignment references (e.g., corner squares or registration marks) exist on the official bubble sheets.
3. System Features
3.1 Feature: Image Import and Preprocessing
3.1.1 Description
The system shall accept image uploads (JPEG, PNG, TIFF) of ACT bubble sheets.
The system shall normalize image brightness and contrast and reduce noise.
The system shall correct image skew or rotation within ±15°.
3.1.2 Stimulus/Response Sequences
User uploads or imports an image of the bubble sheet.
System automatically processes the image for orientation and noise.
Processed image is passed to the bubble detection module.
3.1.3 Functional Requirements
FR1: The system shall deskew images up to ±15° with an accuracy of at least 98%.
FR2: The system shall detect and correct brightness/contrast for consistent grayscale/threshold levels.
FR3: The system shall accept multiple image formats (JPEG, PNG, TIFF).
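To make FR1 concrete, here is a minimal Python sketch of one way the skew check could work, assuming the sheet's corner registration marks have already been located; the function names, the two-mark input, and the manual-review fallback are illustrative assumptions, not part of the specification (a production pipeline would more likely use OpenCV's rotation utilities):

```python
import math

MAX_SKEW_DEG = 15.0  # FR1: skew is correctable within +/-15 degrees


def estimate_skew_deg(left_mark, right_mark):
    """Estimate page skew from two registration marks that should be level.

    left_mark / right_mark are (x, y) pixel centers of the corner marks.
    Returns the rotation in degrees needed to level the sheet.
    """
    dx = right_mark[0] - left_mark[0]
    dy = right_mark[1] - left_mark[1]
    return math.degrees(math.atan2(dy, dx))


def needs_manual_review(skew_deg):
    # Sheets skewed beyond the correctable range are flagged rather than rotated.
    return abs(skew_deg) > MAX_SKEW_DEG
```

A sheet scanned perfectly level yields a skew of 0°; anything beyond ±15° would be routed to the review queue instead of being silently corrected.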
3.2 Feature: Bubble Detection and Classification
3.2.1 Description
The core AI algorithm detects bubble locations and classifies filled vs. unfilled marks.
The system uses either pre-defined sheet templates or a trained model to locate bubble regions.
3.2.2 Stimulus/Response Sequences
Pre-processed image is fed into the bubble detection model.
Model outputs bounding boxes or coordinates for each bubble.
The classifier determines whether each bubble is marked or unmarked.
3.2.3 Functional Requirements
FR4: The system shall detect all bubble locations with a minimum 99% accuracy for standard ACT sheets.
FR5: The system shall classify filled bubbles with at least 99.9% accuracy under typical scanning conditions.
FR6: The system shall handle partially filled or faint marks by providing a confidence level.
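The confidence mechanism in FR6 can be illustrated with a simple fill-ratio classifier over a binarized bubble crop; the 0.55 threshold and the confidence formula are illustrative assumptions (a trained CNN, as described above, would replace this in practice):

```python
def classify_bubble(pixels, fill_threshold=0.55):
    """Classify one bubble from its binarized pixel region (FR5/FR6 sketch).

    pixels: 2-D list of 0/1 values (1 = dark/ink) cropped to a single bubble.
    Returns (is_filled, confidence), where confidence reflects how far the
    fill ratio sits from the decision threshold.
    """
    total = sum(len(row) for row in pixels)
    dark = sum(sum(row) for row in pixels)
    fill_ratio = dark / total if total else 0.0
    is_filled = fill_ratio >= fill_threshold
    # Confidence: distance from the threshold, scaled into [0, 1].
    confidence = min(1.0, abs(fill_ratio - fill_threshold) / fill_threshold)
    return is_filled, round(confidence, 3)
```

A faint or partial mark lands near the threshold and gets a low confidence score, which is exactly what the review interface (Section 3.4) consumes.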
3.3 Feature: Answer Extraction
3.3.1 Description
Maps detected filled bubbles to specific question IDs (e.g., Q1 through Q75 for the ACT English section).
Flags anomalies like multiple bubbles selected for one question or no bubble selected.
3.3.2 Stimulus/Response Sequences
Bubble detection data is mapped onto the question index.
System checks each question for multiple or missing marks.
Results are stored in a structured format (e.g., JSON) for further use.
3.3.3 Functional Requirements
FR7: The system shall correctly map detected marks to question indices for at least 99.5% of the questions.
FR8: The system shall flag questions with multiple detected marks.
FR9: The system shall allow configuration of the threshold for detection of a “filled” bubble.
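The mapping and flagging rules in FR7 and FR8 can be sketched as a small pure function; the `"MULTI"` sentinel and the per-question boolean input format are illustrative assumptions:

```python
def extract_answers(detections, choices="ABCDE"):
    """Map per-question bubble detections to answers (FR7/FR8 sketch).

    detections: {question_id: [bool, ...]} -- one flag per choice bubble.
    Returns {question_id: answer}, where answer is a choice letter, None
    for a blank question, or "MULTI" to flag multiple marks for review.
    """
    answers = {}
    for qid, marks in detections.items():
        filled = [choices[i] for i, marked in enumerate(marks) if marked]
        if len(filled) == 1:
            answers[qid] = filled[0]
        elif not filled:
            answers[qid] = None        # no mark detected -- flag as blank
        else:
            answers[qid] = "MULTI"     # multiple marks -- FR8 flag
    return answers
```

Flagged entries (`None` or `"MULTI"`) feed directly into the review queue described in Section 3.4.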
3.4 Feature: Review and Correction Interface
3.4.1 Description
Provides a GUI to review scanned sheets and correct flagged or low-confidence marks.
Allows manual override of AI-detected marks.
3.4.2 Stimulus/Response Sequences
System generates a result with flagged items (ambiguous or multiple marks).
Reviewer opens the flagged results in a UI.
Reviewer can confirm or override the system’s classification.
3.4.3 Functional Requirements
FR10: The system shall display flagged bubbles visually for quick inspection.
FR11: The system shall allow manual editing of bubble status (filled/unfilled).
FR12: The system shall save changes and recalculate final answers.
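FR11 and FR12 amount to merging reviewer decisions back into the detected answer set and recomputing what is still unresolved; a minimal sketch, with the flag sentinels assumed from the extraction step rather than mandated by the specification:

```python
def apply_overrides(detected, overrides):
    """Merge reviewer corrections into detected answers (FR11/FR12 sketch).

    detected:  {question_id: answer, or None/"MULTI" for flagged items}
    overrides: {question_id: reviewer-confirmed answer}
    Returns the final answer set plus any questions still unresolved.
    """
    final = dict(detected)
    final.update(overrides)            # reviewer decisions win
    unresolved = [q for q, a in final.items() if a in (None, "MULTI")]
    return final, unresolved
```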
3.5 Feature: Reporting and Export
3.5.1 Description
Summarizes the recognized answers in a user-friendly format.
Exports data to CSV, JSON, or XML for external processing.
3.5.2 Stimulus/Response Sequences
System completes the OMR process and compiles results.
User chooses export format (CSV, JSON, XML).
System generates the report and downloads or sends via API.
3.5.3 Functional Requirements
FR13: The system shall produce a final report listing each question and the detected answer choice.
FR14: The system shall support exporting results in at least CSV and JSON.
FR15: The system shall provide an option to print a summary report.
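The CSV/JSON export paths required by FR14 can be covered with the standard library alone; the flat `question`/`answer` record layout is an assumption for illustration:

```python
import csv
import io
import json


def export_results(answers, fmt="csv"):
    """Serialize recognized answers to CSV or JSON (FR13/FR14 sketch).

    answers: {question_id: detected choice, or None for a blank question}
    """
    rows = [{"question": q, "answer": a or ""}
            for q, a in sorted(answers.items())]
    if fmt == "json":
        return json.dumps(rows, indent=2)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["question", "answer"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```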
3.6 Feature: Administrative Configuration
3.6.1 Description
Administrators can configure user roles, scanning profiles, and system parameters (e.g., batch size, concurrency).
3.6.2 Stimulus/Response Sequences
Administrator logs in with elevated privileges.
Administrator configures scanning parameters or modifies detection thresholds.
Changes are saved globally in the system.
3.6.3 Functional Requirements
FR16: The system shall allow role-based access control (Administrator, Operator, Reviewer).
FR17: The system shall allow customization of detection thresholds by question or bubble region.
FR18: The system shall maintain an audit log of all administrative changes.
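A minimal sketch of the role-based access control in FR16, with the role names taken from Section 2.3 and the action names assumed for illustration:

```python
# Permission table: roles from Section 2.3, actions are illustrative.
ROLE_PERMISSIONS = {
    "administrator": {"configure", "review", "scan", "export"},
    "operator": {"scan", "export"},
    "reviewer": {"review", "export"},
}


def can(role, action):
    """Return True if the given role is permitted to perform the action."""
    return action in ROLE_PERMISSIONS.get(role.lower(), set())
```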
4. External Interface Requirements
4.1 User Interfaces (UI)
Desktop/Web Interface:
Main dashboard displaying recent scans and their statuses.
Option to upload scanned sheets (single or batch) with drag-and-drop or file selector.
Review panel for flagged results, with magnified preview of questionable bubbles.
Administrative panel for user management and system settings.
Mobile/Tablet Interface (Optional):
Limited functionality primarily for viewing results and minor corrections.
4.2 Hardware Interfaces
Standard scanners (flatbed or multi-page feed) that produce digital images at 300 DPI or higher.
No specialized hardware is strictly required for the AI-based OMR solution itself, beyond standard computing environments (CPU/GPU if needed for machine learning acceleration).
4.3 Software Interfaces
Operating Systems: Windows, Linux, or macOS.
Databases: MySQL/PostgreSQL or cloud-based solutions for storing scan results, user data, and logging.
APIs: RESTful or GraphQL endpoints for integration with external systems, e.g., for automated scanning workflows or advanced analytics.
4.4 Communication Interfaces
HTTP/HTTPS for web-based interactions.
TCP/IP for server-side API communications.
Secure channels (SSL/TLS) for sensitive test data and user credentials.
5. Non-functional Requirements
5.1 Performance Requirements
Accuracy:
Must consistently achieve a minimum of 99.9% accuracy in bubble detection/classification under standardized scanning conditions.
Throughput:
The system should handle at least 20–50 sheets per minute in batch mode, depending on hardware capabilities.
Latency:
Single sheet processing, including detection and classification, should not exceed 5 seconds under typical system loads.
5.2 Security Requirements
Authentication:
System must support secure login for all users.
Authorization:
Role-based access control to ensure only authorized users can view or edit scanned results.
Data Protection:
Encryption of sensitive test data in transit (HTTPS) and at rest (database encryption).
5.3 Reliability & Availability
Reliability:
System should maintain consistent performance and handle error cases gracefully.
Availability:
Target availability of 99.5% or higher for critical production environments.
Fault Tolerance:
Implement robust error handling and automatic retries for transient failures.
5.4 Maintainability
Modular Architecture:
Separating image preprocessing, AI detection, and result management to simplify updates or replacements.
Documentation:
Comprehensive documentation of code, APIs, and configuration.
Logging and Monitoring:
Real-time logging for debugging and performance monitoring.
5.5 Scalability
Horizontal Scalability:
Ability to add more servers or cloud instances to handle increased workload.
Distributed Processing (optional):
Supports queue-based batch processing to distribute the load across multiple nodes.
5.6 Usability
User-Centric Design:
A clear and intuitive UI for both operators and reviewers.
Accessibility:
Compliance with accessibility standards where feasible (WCAG 2.1).
6. Use Cases
Use Case 1: Single Sheet Scan
Actors: Operator
Description: Operator uploads a single scanned sheet. The system processes and outputs results.
Preconditions: Operator is logged in, scanning hardware is functioning.
Postconditions: The system displays recognized answers or flags anomalies.
Use Case 2: Batch Processing
Actors: Operator/Administrator
Description: Operator uploads a batch of scanned sheets. The system processes them in sequence, outputs results, and flags anomalies for each sheet.
Preconditions: Large volume of scanned sheets is available.
Postconditions: All results are stored and can be reviewed; flagged items appear in the review queue.
Use Case 3: Review and Correction
Actors: Reviewer
Description: Reviewer opens the review interface, sees flagged or low-confidence questions, and corrects them if needed.
Preconditions: OMR detection is complete, flagged items exist.
Postconditions: Corrected results are saved, final answer set is updated.
Use Case 4: Administration & Configuration
Actors: Administrator
Description: Admin updates detection threshold, user roles, or batch processing configurations.
Preconditions: Admin is logged in with necessary privileges.
Postconditions: Changes are saved and system parameters are updated globally.
Use Case 5: Reporting & Data Export
Actors: Operator/Reviewer/Administrator
Description: User exports final results in desired format (CSV, JSON) and downloads or transmits them.
Preconditions: Completed scanning process, final results available.
Postconditions: Exported file is generated, no changes to system data unless manually updated.
7. Acceptance Criteria
High Recognition Accuracy
At least 99.9% detection accuracy for bubbles under normal scanning conditions with standard ACT sheets.
Robust Error Handling
The system gracefully handles scanning anomalies (e.g., partial scans, misaligned sheets) and flags them for manual review.
Performance Benchmarks
Single sheet processing within 5 seconds.
Batch processing of 20–50 sheets per minute on a standard server environment.
User-Friendly Interface
Operators, reviewers, and administrators can easily perform their tasks without extensive training.
Security and Compliance
All user authentication, data encryption, and role-based access controls are in place.
Complies with relevant data protection regulations (e.g., FERPA for U.S. student data).
Scalability and Maintenance
The system can scale to handle large volumes of ACT sheets with minimal performance degradation.
The system’s modular design allows easy updates and maintenance.
Conclusion
These requirement details outline the essential features, constraints, and objectives for creating a high-accuracy, AI-driven OMR solution for ACT bubble sheets. By combining robust image preprocessing, advanced machine learning algorithms, and a user-friendly interface for review and corrections, the proposed system aims to minimize errors while efficiently processing large volumes of standardized test answer sheets.
This document can be used as a baseline for stakeholders, developers, and project managers to align on scope, design, and implementation strategies. It also serves as a strong portfolio showcase, illustrating the depth of planning and rigor needed for a mission-critical educational technology solution.
Here’s how you might frame the AI-based OMR solution tailored for ACT bubble sheets from three different perspectives:
1. As a Student Project
If you’re building this as an academic capstone or club project, focus on learning objectives and clear, incremental milestones:
Objectives & Scope
Goal: Automate grading of ACT-style bubble sheets using AI.
Key Skills: Image preprocessing, template matching, CNN-based classification, and API design.
Milestones
Research & Planning
Survey existing OMR approaches.
Obtain sample ACT bubble-sheet scans.
Preprocessing Module
Deskew & normalize each scanned page.
Crop out the answer grid using fixed coordinates.
Bubble Classification
Label a small dataset of filled vs. unfilled bubbles.
Train a simple CNN (e.g., with TensorFlow/Keras) to recognize marks.
Prototype Integration
Write a script that runs preprocessing → classification → scoring for one sheet.
Display results in a basic GUI or Jupyter notebook.
Validation & Reporting
Measure accuracy on held-out scans.
Generate a grade report CSV for a small class.
Presentation
Prepare a demo video or live demo.
Document your code, write up lessons learned, and discuss potential improvements (e.g., handling smudges).
Deliverables
GitHub repo with well-organized code.
A Jupyter notebook walkthrough.
A short report highlighting accuracy metrics and error cases.
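The prototype-integration milestone above (preprocessing → classification → scoring) can be wired together as one small function; the callable parameters stand in for the modules built in earlier milestones and are placeholders, not a prescribed interface:

```python
def score_sheet(image, preprocess, classify, answer_key):
    """Minimal end-to-end pipeline for the prototype milestone (sketch).

    preprocess(image) -> list of per-question bubble crops
    classify(crop)    -> detected choice letter, or None if blank
    answer_key        -> list of correct choice letters, one per question
    """
    crops = preprocess(image)
    marked = [classify(crop) for crop in crops]
    correct = sum(1 for got, key in zip(marked, answer_key) if got == key)
    return {"answers": marked, "score": correct, "total": len(answer_key)}
```

Swapping in the real OpenCV preprocessing and the trained CNN classifier leaves this driver unchanged, which keeps the milestones independently testable.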
2. As a Professional Developer Implementation
For a production-ready developer build, emphasize robustness, modularity, and integration:
Architecture Overview
Ingestion Layer
Accepts scanned images via REST endpoint or file upload.
Validates image format and resolution.
Preprocessing Service
Uses OpenCV for grayscale conversion, denoising, and perspective correction.
Template-matching to detect and extract the answer grid.
Bubble Detection Engine
CNN model (PyTorch/TensorFlow) served via a microservice (e.g., FastAPI).
Combines thresholding and morphology checks to boost confidence.
Scoring & Validation
Applies business rules (one mark per question).
Flags ambiguous cases for manual review UI (React or simple Flask dashboard).
Reporting & API
Returns JSON with per-student answers, scores, and confidence levels.
Generates PDF scorecards and item-analysis charts on demand.
Logging & Monitoring
Structured logging (ELK stack) for traceability.
Prometheus/Grafana dashboards for throughput and error rates.
Key Implementation Tasks
Model Training Pipeline: Automate data augmentation (rotations, brightness shifts) to improve generalization.
CI/CD: Containerize each service, write unit/integration tests, and deploy via Kubernetes.
Security & Compliance: Ensure encryption in transit (HTTPS) and at rest, plus audit logging for FERPA compliance.
Scalability: Implement autoscaling for peak exam periods, use AWS S3 for batch uploads, and Lambda or background workers for processing.
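As a taste of the ingestion layer's format validation, here is a stdlib-only sketch that checks a PNG upload's magic bytes and reads its pixel dimensions from the IHDR chunk; real DPI metadata lives in the optional pHYs chunk, so production code would lean on Pillow instead of parsing bytes by hand:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def validate_png_upload(data: bytes):
    """Ingestion-layer sketch: verify PNG magic bytes, return (width, height).

    Raises ValueError for non-PNG payloads so the endpoint can reject them
    before any preprocessing work is scheduled.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # IHDR is always the first chunk: length(4) type(4) width(4) height(4).
    width, height = struct.unpack(">II", data[16:24])
    return width, height
```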
3. As a Startup Product Offering
When positioning this as a business solution, highlight pain points solved, go-to-market strategy, and monetization:
Value Proposition
Accuracy & Reliability: 99.5%+ mark-reading accuracy reduces manual rework.
Speed & Scalability: Process thousands of sheets per hour in the cloud.
Actionable Insights: Built-in analytics help educators identify tricky questions and cohort trends.
Core Features
Turnkey Deployment: Cloud API + optional on-prem Docker package.
Custom Branding: White-label reports and dashboards that match institutional branding.
Integrations: Prebuilt connectors for popular LMS (Canvas, Blackboard) and SIS platforms.
Operator Dashboard: Human-in-the-loop review interface for low-confidence sheets.
Analytics Suite: Item-analysis heatmaps, score distributions, and longitudinal tracking.
Go-to-Market Plan
Pilot Program: Offer free trials to a handful of community colleges or test-prep centers.
Pricing Model:
Per-sheet pricing (e.g., $0.10–$0.25 per sheet) for pay-as-you-go.
Subscription tiers with monthly volumes and support levels.
Channels:
Partnerships with scanning hardware vendors.
Direct sales to K–12 districts, universities, and test-prep companies.
Online self-service sign-up for smaller learning centers.
Roadmap & Differentiators
Expand to Other Sheet Types: SAT, GRE, or custom surveys.
Adaptive Feedback: Provide students with question-level feedback based on common mistakes.
Mobile Scanning App: Let teachers use smartphones to capture and process sheets instantly.
AI Tutoring Insights: Add recommendation engines that suggest remediation content for low-scoring topics.
No matter your role—student, developer, or startup leader—our AI-based OMR solution offers a clear path to building, evaluating, or commercializing a high-accuracy bubble-sheet reader.
Ready to streamline your bubble-sheet scoring?
👉 Contact us at contact@codersarts.com to schedule your demo today!