MathAIG

2014 · Assessment Systems · Psychometrics · Domain-Specific Languages · Automatic Item Generation

MathAIG 3.0: An Automatic Item Generator for Mathematics Assessment

Collaborators

Jeff Wilson — Lead Application Developer
Susan Embretson — Principal Investigator, School of Psychology, Georgia Institute of Technology

Overview

MathAIG (Math Automatic Item Generator) 3.0 is a domain-specific system for authoring and generating standardized mathematics assessment items. I developed MathAIG 3.0 to address long-standing inefficiencies in item development workflows and to ensure strict numerical integrity in generated content. The system was built in collaboration with Dr. Susan Embretson, whose work in item modeling and psychometrics guided the theoretical foundation of the project.

MathAIG integrates a customized scripting language, structured data repositories, constraint-driven generation, and precision-safe mathematical computation to produce high-quality multiple-choice items suitable for large-scale testing contexts.

Motivation

Earlier versions of the system (1.0, 2.0, 2.1) revealed significant workflow limitations. Authoring items required editing multiple database tables across a distributed schema, often through direct SQL manipulation. This made item creation time-consuming and error-prone.

Additionally, previous versions lacked strong guarantees about numerical integrity. In particular:

Java's fixed-width integer arithmetic wraps silently, so integer overflow and underflow went undetected.
Floating-point representations introduced rounding artifacts.
Scientific notation items exposed the precision limits of fixed-size numeric types.

Because many item authors are not trained in computer science, MathAIG 3.0 was designed to protect authors from representation-level numerical errors by enforcing arbitrary precision arithmetic across the system.
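
The failure modes listed above are easy to reproduce in plain Java. The following is a minimal sketch, not code from MathAIG: the class and method names are hypothetical, and it uses only the standard java.math classes to contrast fixed-size types with arbitrary-precision ones.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical demo class; not part of MathAIG.
class NumericPitfalls {
    // Fixed-width int addition wraps around silently on overflow.
    static int wrappedSum() {
        return Integer.MAX_VALUE + 1; // evaluates to Integer.MIN_VALUE
    }

    // The same sum computed with arbitrary-precision integers stays exact.
    static BigInteger exactSum() {
        return BigInteger.valueOf(Integer.MAX_VALUE).add(BigInteger.ONE);
    }

    // Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3; // false
    }

    // Decimal literals parsed from strings add exactly as BigDecimal values.
    static BigDecimal exactDecimalSum() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    public static void main(String[] args) {
        System.out.println(wrappedSum());       // -2147483648
        System.out.println(exactSum());         // 2147483648
        System.out.println(doubleSumIsExact()); // false
        System.out.println(exactDecimalSum());  // 0.3
    }
}
```

An item author who writes 0.1 + 0.2 expects 0.3, which is the behavior the arbitrary-precision path provides.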

Core Architecture

1. Modular XML-Based Item Specification

Each item is defined as a self-contained XML document that contains:

Metadata (grade, standard, benchmark, etc.)
A script section for generation logic
Question and answer templates
Embedded formatting directives

This structure supports separation of logic from presentation and allows items to be versioned and vetted independently.
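
As a rough illustration of this layout, the sketch below parses a small item document with the JDK's built-in DOM parser. The element and attribute names (item, metadata, script, question, answer), the ${...} placeholder syntax, and the class name are assumptions made for this example; the actual MathAIG schema is not reproduced here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch; the element names below are illustrative, not MathAIG's schema.
class ItemSpecSketch {
    // One self-contained item: metadata, generation script, and templates.
    static final String ITEM_XML =
          "<item>"
        + "<metadata grade=\"6\" standard=\"hypothetical-standard\"/>"
        + "<script>a = 3; b = 4;</script>"
        + "<question>What is ${a} + ${b}?</question>"
        + "<answer correct=\"true\">${a + b}</answer>"
        + "</item>";

    // Parses the XML text into a DOM tree; parse failures become unchecked exceptions.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("malformed item document", e);
        }
    }

    // Text content of the first element with the given tag name.
    static String text(Document doc, String tag) {
        return doc.getElementsByTagName(tag).item(0).getTextContent();
    }

    // Attribute value on the first element with the given tag name.
    static String attribute(Document doc, String tag, String name) {
        return ((Element) doc.getElementsByTagName(tag).item(0)).getAttribute(name);
    }
}
```

Because the script and the templates sit in separate elements, a tool can read the metadata or re-render the question without touching the generation logic.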

2. Domain-Specific Language (DSL)

MathAIG uses a customized scripting language built on Apache JEXL. JEXL was extensively modified so that:

All mathematical operations use arbitrary precision integers and decimals.
Rational numbers are directly representable.
Constraint programming constructs are integrated.
Custom formatting and math utilities are available.

The DSL blends procedural, functional, and constraint programming characteristics.

3. Arbitrary Precision and Rational Arithmetic

A central design goal is numerical integrity.

MathAIG replaces standard fixed-size numeric types with:

Arbitrary precision integers
Arbitrary precision decimals
Direct rational representations

This ensures that:

Scientific notation items remain exact.
Fraction arithmetic is precise.
No overflow or silent rounding artifacts occur.

Although arbitrary precision incurs computational overhead, correctness in assessment contexts outweighs raw speed concerns.
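
To make the idea of a direct rational representation concrete, here is a minimal sketch of an exact fraction type built on java.math.BigInteger. It is not MathAIG's implementation; the Rational class and its method names are assumptions for illustration.

```java
import java.math.BigInteger;

// Hypothetical exact rational number; illustrative only, not MathAIG's class.
final class Rational {
    final BigInteger num;
    final BigInteger den;

    Rational(BigInteger num, BigInteger den) {
        if (den.signum() == 0) {
            throw new ArithmeticException("zero denominator");
        }
        // Normalize: keep the denominator positive and reduce to lowest terms.
        if (den.signum() < 0) {
            num = num.negate();
            den = den.negate();
        }
        BigInteger g = num.gcd(den);
        this.num = num.divide(g);
        this.den = den.divide(g);
    }

    static Rational of(long n, long d) {
        return new Rational(BigInteger.valueOf(n), BigInteger.valueOf(d));
    }

    // a/b + c/d = (a*d + c*b) / (b*d), reduced by the constructor.
    Rational add(Rational o) {
        return new Rational(num.multiply(o.den).add(o.num.multiply(den)), den.multiply(o.den));
    }

    // a/b * c/d = (a*c) / (b*d), reduced by the constructor.
    Rational multiply(Rational o) {
        return new Rational(num.multiply(o.num), den.multiply(o.den));
    }

    @Override
    public String toString() {
        return den.equals(BigInteger.ONE) ? num.toString() : num + "/" + den;
    }

    public static void main(String[] args) {
        System.out.println(Rational.of(1, 3).add(Rational.of(1, 6)));      // 1/2
        System.out.println(Rational.of(1, 3).multiply(Rational.of(3, 1))); // 1
    }
}
```

Since both fields are unbounded integers, no intermediate product can overflow, and 1/3 is never approximated by a decimal.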

4. Constraint-Driven Generation

Item models generate values under explicit constraints. For example:

Bounded integer generation
Rational stepping
Multiplicity constraints
Boolean assertions
Uniqueness constraints

If a constraint fails, the script re-executes until conditions are satisfied or a maximum retry limit is reached. This allows authors to reason declaratively about correctness while retaining procedural control.
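
The re-execution behavior described above can be sketched as a simple generate-and-test loop. This is an assumption about the general shape of such a loop, not MathAIG's engine: the class, the method names, and the retry limit of 1000 are hypothetical, and small int values stand in for the arbitrary-precision types to keep the example short.

```java
import java.util.Optional;
import java.util.Random;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical generate-and-test loop; illustrative only.
class ConstraintRetry {
    // Runs the script until its output satisfies the constraints,
    // or gives up after maxRetries attempts and returns an empty result.
    static <T> Optional<T> generate(Function<Random, T> script,
                                    Predicate<T> constraints,
                                    Random rng,
                                    int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            T candidate = script.apply(rng);
            if (constraints.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    // Example item model: a dividend and divisor in [2, 12] where the two
    // values differ (uniqueness) and the dividend is a multiple of the
    // divisor (multiplicity).
    static Optional<int[]> divisionPair(long seed) {
        return ConstraintRetry.<int[]>generate(
            rng -> new int[] { rng.nextInt(11) + 2, rng.nextInt(11) + 2 },
            p -> p[0] != p[1] && p[0] % p[1] == 0,
            new Random(seed),
            1000);
    }
}
```

The script stays procedural, while the predicate states what a valid item looks like. An empty result signals that the constraints may be too tight for the value ranges.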

5. Shared Database with Versioning

MathAIG includes a structured data repository for substitution variables such as names, contexts, or domain constants.

Key features include:

JSON-defined schemas
SQLite-backed storage
Versioned tables
Dependency tracing

Versioning isolates changes so that extending a dataset does not invalidate previously vetted item scripts.
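
The isolation property can be illustrated with immutable snapshots. The sketch below swaps the SQLite-backed storage for in-memory lists purely for brevity; the VersionedTable class and its methods are hypothetical and do not reflect MathAIG's actual schema or API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory stand-in for a versioned substitution table.
class VersionedTable {
    private final List<List<String>> versions = new ArrayList<>();

    // Publishes a new immutable snapshot and returns its version number (1-based).
    int publish(List<String> rows) {
        versions.add(Collections.unmodifiableList(new ArrayList<>(rows)));
        return versions.size();
    }

    // Rows exactly as they were when the given version was published.
    List<String> rows(int version) {
        return versions.get(version - 1);
    }

    public static void main(String[] args) {
        VersionedTable names = new VersionedTable();
        int v1 = names.publish(List.of("Ana", "Ben"));
        int v2 = names.publish(List.of("Ana", "Ben", "Chen"));
        // An item vetted against v1 still sees the original two names.
        System.out.println(names.rows(v1)); // [Ana, Ben]
        System.out.println(names.rows(v2)); // [Ana, Ben, Chen]
    }
}
```

An item script that records the version it was vetted against keeps drawing from the same rows, even after the dataset grows.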

6. Formatting System

Mathematical presentation is handled via formatter objects:

Improper fractions versus mixed numerals
Scientific notation
Decimal precision control
Comma formatting for integers
Mathematical symbols such as π, multiplication, and powers
XHTML wrappers for structural formatting

This abstraction separates computation from rendering and ensures consistent visual output across generated variants.
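
A few of these formatters can be sketched with standard Java classes. The Formatters class, its method names, and the exact output strings (including the sup tag used for exponents) are assumptions for illustration, not MathAIG's formatter objects.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Locale;

// Hypothetical formatter helpers; illustrative only.
class Formatters {
    // Renders num/den as a mixed numeral in lowest terms, e.g. 7/4 -> "1 3/4".
    static String mixedNumeral(BigInteger num, BigInteger den) {
        if (den.signum() <= 0) {
            throw new IllegalArgumentException("denominator must be positive");
        }
        BigInteger[] qr = num.abs().divideAndRemainder(den);
        BigInteger whole = qr[0];
        BigInteger rem = qr[1];
        String sign = num.signum() < 0 ? "-" : "";
        if (rem.signum() == 0) {
            return sign + whole;
        }
        BigInteger g = rem.gcd(den);
        String frac = rem.divide(g) + "/" + den.divide(g);
        return whole.signum() == 0 ? sign + frac : sign + whole + " " + frac;
    }

    // Groups integer digits with commas, e.g. 1234567 -> "1,234,567".
    static String withCommas(BigInteger n) {
        return String.format(Locale.US, "%,d", n);
    }

    // Exact scientific notation with an XHTML superscript exponent.
    static String scientific(BigDecimal value) {
        BigDecimal v = value.stripTrailingZeros();
        int exponent = v.precision() - v.scale() - 1;
        return v.movePointLeft(exponent).toPlainString()
            + " \u00d7 10<sup>" + exponent + "</sup>";
    }

    public static void main(String[] args) {
        System.out.println(mixedNumeral(BigInteger.valueOf(7), BigInteger.valueOf(4))); // 1 3/4
        System.out.println(withCommas(new BigInteger("1234567")));                      // 1,234,567
        System.out.println(scientific(new BigDecimal("0.00045")));                      // 4.5 × 10<sup>-4</sup>
    }
}
```

Each helper takes an exact value and returns a string, so the same computed result can be rendered in several ways without being recomputed.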

7. Natural Language Adaptation

MathAIG includes grammar adaptation tools for variable substitution contexts, such as:

Gender-based pronoun switching
Context-sensitive textual adjustments

These tools support natural-sounding item stems even under heavy variabilization.
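
Pronoun switching of this kind can be sketched as placeholder substitution. The placeholder names ({name}, {subj}, {Subj}, {obj}, {poss}) and the PronounAdapter class are hypothetical; MathAIG's actual grammar tools are not shown here and may work differently.

```java
// Hypothetical pronoun substitution; placeholder syntax is an assumption.
class PronounAdapter {
    enum Gender { FEMALE, MALE }

    // Fills name and pronoun placeholders so the stem agrees with the chosen name.
    static String adapt(String template, String name, Gender gender) {
        boolean female = gender == Gender.FEMALE;
        return template
            .replace("{name}", name)
            .replace("{Subj}", female ? "She" : "He")   // sentence-initial subject
            .replace("{subj}", female ? "she" : "he")
            .replace("{obj}", female ? "her" : "him")
            .replace("{poss}", female ? "her" : "his");
    }

    public static void main(String[] args) {
        String stem = "{name} has 3 apples. {Subj} gives 1 to {poss} friend.";
        System.out.println(adapt(stem, "Maria", Gender.FEMALE));
        // Maria has 3 apples. She gives 1 to her friend.
        System.out.println(adapt(stem, "David", Gender.MALE));
        // David has 3 apples. He gives 1 to his friend.
    }
}
```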

8. Workflow and Quality Assurance

Because assessment development requires rigorous vetting, MathAIG was engineered with workflow safeguards:

Database version isolation
Dependency tracing
Deterministic serial number seeds for reproducibility
Support for automated regression testing
Preservation of vetted item states

A vetted item can be regenerated exactly from a known seed, enabling reproducible validation and long-term maintenance.
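
Seed-based reproducibility can be illustrated with java.util.Random, whose algorithm is specified so that a given seed always yields the same sequence. The sketch below is an assumption about how a serial number might drive generation; the SeededVariant class and the toy addition item are hypothetical.

```java
import java.util.Random;

// Hypothetical seeded generator; illustrative only.
class SeededVariant {
    // Every random choice comes from a generator seeded with the serial number,
    // so the same serial number reproduces the same variant.
    static String generate(long serial) {
        Random rng = new Random(serial);
        int a = rng.nextInt(90) + 10; // two-digit operand in [10, 99]
        int b = rng.nextInt(90) + 10;
        return "What is " + a + " + " + b + "? Answer: " + (a + b);
    }

    public static void main(String[] args) {
        // Regenerating with the same serial number yields an identical item.
        System.out.println(generate(2014L).equals(generate(2014L))); // true
    }
}
```

A regression test can store the serial number alongside the vetted output and fail whenever regeneration produces a different string.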

Psychometric Foundation

MathAIG implements the item model approach described in Dr. Embretson’s work on automatic item generation. In this framework:

Core item structures are variabilized.
Substitutions are drawn from controlled databases.
Complexity and psychometric properties are preserved across variants.

The collaboration ensured that the engineering design aligned with contemporary psychometric theory, so that the system does more than generate superficial variants.

Limitations and Future Directions

MathAIG currently requires authors to write procedural scripts. While the DSL is domain-specific and simplified, it still requires programming skill.

The ideal future system would allow entirely declarative specification:

Authors would specify constraints and formatting.
A solver would infer generation procedures.

However, two obstacles remain:

Most constraint solvers rely on floating-point arithmetic and therefore cannot guarantee exact results.
Few solvers support arbitrary precision integers, decimals, and rationals simultaneously.

Given these obstacles, a procedural engine augmented with constraint programming features was implemented rather than a fully declarative solver-based architecture.

Impact

MathAIG represents a synthesis of:

Psychometric item modeling
Precision-aware numerical computing
Domain-specific language design
Workflow engineering for assessment production

It addresses both the theoretical demands of modern assessment science and the practical engineering constraints of large-scale item development.

The collaboration with Dr. Susan Embretson grounded the system in established item generation research while enabling the construction of a technically robust implementation.