MathAIG
2014 · Assessment Systems · Psychometrics · Domain-Specific Languages · Automatic Item Generation
MathAIG 3.0: An Automatic Item Generator for Mathematics Assessment
Collaborators
- Jeff Wilson — Lead Application Developer
- Susan Embretson — Principal Investigator, School of Psychology, Georgia Institute of Technology
Overview
MathAIG (Math Automatic Item Generator) 3.0 is a domain-specific system for authoring and generating standardized mathematics assessment items. I developed MathAIG 3.0 to address long-standing inefficiencies in item development workflows and to ensure strict numerical integrity in generated content. The system was built in collaboration with Dr. Susan Embretson, whose work in item modeling and psychometrics guided the theoretical foundation of the project.
MathAIG integrates a customized scripting language, structured data repositories, constraint-driven generation, and precision-safe mathematical computation to produce high-quality multiple-choice items suitable for large-scale testing contexts.
Motivation
Earlier versions of the system (1.0, 2.0, 2.1) revealed significant workflow limitations. Authoring items required editing multiple database tables across a distributed schema, often through direct SQL manipulation. This made item creation time-consuming and error-prone.
Additionally, previous versions lacked strong guarantees about numerical integrity. In particular:
- Java's fixed-width integer arithmetic wraps silently, so overflow and underflow went undetected.
- Floating point representations introduced rounding artifacts.
- Scientific notation items exposed precision limitations.
Because many item authors are not trained in computer science, MathAIG 3.0 was designed to protect authors from representation-level numerical errors by enforcing arbitrary precision arithmetic across the system.
Core Architecture
1. Modular XML-Based Item Specification
Each item is defined as a self-contained XML document that contains:
- Metadata (grade, standard, benchmark, etc.)
- A script section for generation logic
- Question and answer templates
- Embedded formatting directives
This structure supports separation of logic from presentation and allows items to be versioned and vetted independently.
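As a rough illustration of the structure listed above, the sketch below parses a hypothetical item document. The element and attribute names (`item`, `metadata`, `script`, `question`, `answer`) and the script syntax are assumptions made for this example, not MathAIG's actual schema, and Python is used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical item document; element names and script syntax are illustrative only.
ITEM_XML = """
<item id="example-001">
  <metadata>
    <grade>6</grade>
    <standard>Ratios</standard>
  </metadata>
  <script>a = randint(2, 9); b = randint(2, 9)</script>
  <question>What is ${a} times ${b}?</question>
  <answer correct="true">${a * b}</answer>
</item>
"""

def load_item(xml_text):
    """Split an item document into metadata, generation logic, and templates."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),
        "metadata": {child.tag: child.text for child in root.find("metadata")},
        "script": root.findtext("script"),
        "question": root.findtext("question"),
        "answers": [answer.text for answer in root.findall("answer")],
    }

item = load_item(ITEM_XML)
```

Keeping the script and the templates in separate elements means the logic can be reviewed without touching the presentation, and each document can be versioned as a single file.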
2. Domain-Specific Language (DSL)
MathAIG uses a customized scripting language built on Apache JEXL. JEXL was extensively modified so that:
- All mathematical operations use arbitrary precision integers and decimals.
- Rational numbers are directly representable.
- Constraint programming constructs are integrated.
- Custom formatting and math utilities are available.
The DSL blends procedural, functional, and constraint programming characteristics.
3. Arbitrary Precision and Rational Arithmetic
A central design goal is numerical integrity.
MathAIG replaces standard fixed-size numeric types with:
- Arbitrary precision integers
- Arbitrary precision decimals
- Direct rational representations
This ensures that:
- Scientific notation items remain exact.
- Fraction arithmetic is precise.
- No overflow or silent rounding artifacts occur.
Although arbitrary precision incurs computational overhead, correctness in assessment contexts outweighs raw speed concerns.
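The difference between fixed-size and exact numeric types can be seen in a few lines. This sketch uses Python's standard `decimal` and `fractions` modules as stand-ins for the arbitrary precision types described above; it is not MathAIG's own implementation. Note that Python's `Decimal` works to a configurable precision (28 significant digits by default), which is sufficient for the values shown here.

```python
from decimal import Decimal
from fractions import Fraction

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_sum_is_exact = (0.1 + 0.2 == 0.3)           # False

# Decimals built from strings avoid the rounding artifact.
decimal_sum = Decimal("0.1") + Decimal("0.2")      # Decimal('0.3')

# Rationals keep fraction arithmetic exact and reduce automatically.
rational_sum = Fraction(1, 3) + Fraction(1, 6)     # Fraction(1, 2)

# A 64-bit signed integer tops out at 2**63 - 1; Python integers do not wrap.
big_product = (2**63 - 1) * 10

# Scientific notation: 6.02 x 10^23 held as an exact value.
exact_sci = Decimal("6.02").scaleb(23)
exact_sci_as_int = int(exact_sci)                  # 602 followed by 21 zeros
float_sci_as_int = int(6.02e23)                    # differs in the low-order digits
```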
4. Constraint-Driven Generation
Item models generate values under explicit constraints. For example:
- Bounded integer generation
- Rational stepping
- Multiplicity constraints
- Boolean assertions
- Uniqueness constraints
If a constraint fails, the script re-executes until all constraints are satisfied or a maximum retry limit is reached. This allows authors to reason declaratively about correctness while retaining procedural control.
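The retry behavior described above can be sketched as a generate-and-test loop. The names `require`, `item_script`, `generate`, and `max_retries` are hypothetical and chosen for this example; the actual DSL constructs are not shown in this document. The divisibility check is an arbitrary sample constraint.

```python
import random
from fractions import Fraction

class ConstraintFailed(Exception):
    """Raised when a generated set of values violates an assertion."""

def require(condition):
    # Boolean assertion: abandon this attempt if the condition is false.
    if not condition:
        raise ConstraintFailed()

def item_script(rng):
    # Bounded integer generation.
    a = rng.randint(2, 12)
    b = rng.randint(2, 12)
    # Rational stepping: one of 1/4, 2/4, ..., 8/4 (reduced automatically).
    r = Fraction(rng.randint(1, 8), 4)
    require(a != b)                # uniqueness constraint
    require(a % 3 == 0)            # sample divisibility constraint
    require(r.denominator != 1)    # r must not be a whole number
    return {"a": a, "b": b, "r": r}

def generate(script, seed, max_retries=1000):
    """Re-run the script until every constraint holds or retries run out."""
    rng = random.Random(seed)
    for _ in range(max_retries):
        try:
            return script(rng)
        except ConstraintFailed:
            continue
    raise RuntimeError("no valid values found within the retry limit")

values = generate(item_script, seed=42)
```

The author states what must be true of the values and leaves the search to the engine, while the script body itself remains ordinary procedural code.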
5. Shared Database with Versioning
MathAIG includes a structured data repository for substitution variables such as names, contexts, or domain constants.
Key features include:
- JSON-defined schemas
- SQLite-backed storage
- Versioned tables
- Dependency tracing
Versioning isolates changes so that extending a dataset does not invalidate previously vetted item scripts.
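One way to picture version isolation is a table whose rows carry a version number, so that an item pinned to an older version never sees later additions. The table layout, column names, and sample rows below are assumptions for illustration only; the real repository's JSON schemas and table structure are not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (version INTEGER, name TEXT, gender TEXT)")

# Version 1 is the dataset an item was vetted against.
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [(1, "Maria", "f"), (1, "James", "m")],
)
# Version 2 extends the dataset without touching the version 1 rows.
conn.executemany(
    "INSERT INTO names VALUES (?, ?, ?)",
    [(2, "Maria", "f"), (2, "James", "m"), (2, "Priya", "f")],
)

def names_for(version):
    """Return the names visible to an item pinned to the given version."""
    rows = conn.execute(
        "SELECT name FROM names WHERE version = ? ORDER BY name", (version,)
    )
    return [row[0] for row in rows]
```

An item vetted against version 1 keeps drawing from the same two names even after version 2 is published.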
6. Formatting System
Mathematical presentation is handled via formatter objects:
- Improper versus mixed numeral fractions
- Scientific notation
- Decimal precision control
- Comma formatting for integers
- Mathematical symbols such as π, multiplication, and powers
- XHTML wrappers for structural formatting
This abstraction separates computation from rendering and ensures consistent visual output across generated variants.
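A few of the formatting behaviors listed above can be sketched as plain functions over exact values. The function names are hypothetical and MathAIG's formatter objects are not shown in this document; the point is only that rendering operates on an already-computed exact value and never feeds back into the arithmetic.

```python
from decimal import Decimal
from fractions import Fraction

def format_mixed(value):
    """Render a non-negative Fraction as a mixed numeral, e.g. 7/4 as '1 3/4'."""
    whole, remainder = divmod(value.numerator, value.denominator)
    if remainder == 0:
        return str(whole)
    if whole == 0:
        return f"{remainder}/{value.denominator}"
    return f"{whole} {remainder}/{value.denominator}"

def format_improper(value):
    """Render a Fraction as numerator/denominator without extracting a whole part."""
    return f"{value.numerator}/{value.denominator}"

def format_commas(n):
    """Render an integer with thousands separators."""
    return f"{n:,}"

def format_scientific(value):
    """Render a positive Decimal as 'm × 10^e' using its exact digits."""
    normalized = value.normalize()  # drops trailing zeros
    digits = "".join(str(d) for d in normalized.as_tuple().digits)
    mantissa = digits[0] if len(digits) == 1 else f"{digits[0]}.{digits[1:]}"
    return f"{mantissa} × 10^{normalized.adjusted()}"
```

For example, the same value `Fraction(7, 4)` renders as `1 3/4` or `7/4` depending on which formatter the item template selects.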
7. Natural Language Adaptation
MathAIG includes grammar adaptation tools for variable substitution contexts, such as:
- Gender-based pronoun switching
- Context-sensitive textual adjustments
These tools support natural-sounding item stems even under heavy variabilization.
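A minimal sketch of this kind of adaptation, assuming a simple template with named slots. The pronoun table, the `pluralize` helper, and the template syntax are all hypothetical; MathAIG's actual grammar tools are not described in this document.

```python
# Pronoun forms keyed by the gender recorded alongside each substituted name.
PRONOUNS = {
    "f": {"subject": "she", "object": "her", "possessive": "her"},
    "m": {"subject": "he", "object": "him", "possessive": "his"},
}

def pluralize(count, singular, plural):
    """Context-sensitive adjustment: choose the noun form that matches the count."""
    return f"{count} {singular if count == 1 else plural}"

def render_stem(template, name, gender, **extra):
    """Fill a stem template with a name, matching pronouns, and other slots."""
    return template.format(name=name, **PRONOUNS[gender], **extra)

TEMPLATE = "{name} sharpened {pencils}. How long did it take {object}?"

stem_one = render_stem(TEMPLATE, "James", "m", pencils=pluralize(1, "pencil", "pencils"))
stem_many = render_stem(TEMPLATE, "Maria", "f", pencils=pluralize(4, "pencil", "pencils"))
```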
8. Workflow and Quality Assurance
Because assessment development requires rigorous vetting, MathAIG was engineered with workflow safeguards:
- Database version isolation
- Dependency tracing
- Deterministic serial number seeds for reproducibility
- Support for automated regression testing
- Preservation of vetted item states
A vetted item can be regenerated exactly from a known seed, enabling reproducible validation and long-term maintenance.
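Seed-based reproducibility can be sketched as follows. The function names and the use of a serial number directly as the seed are assumptions for this example. Python's `random` module is used for illustration only; a production system would need to pin its generator algorithm so that results stay stable across software versions.

```python
import random

def generate_variant(serial_number):
    """Generate one item variant; the serial number fully determines the output."""
    rng = random.Random(serial_number)
    a = rng.randint(10, 99)
    b = rng.randint(10, 99)
    return {"stem": f"What is {a} + {b}?", "key": a + b}

def regression_check(serial_number, vetted):
    """Regenerate from the recorded serial number and compare with the vetted copy."""
    return generate_variant(serial_number) == vetted

# Store the vetted variant once, then re-verify it after any engine change.
vetted = generate_variant(20140001)
still_matches = regression_check(20140001, vetted)
```

A regression suite built this way only needs to store serial numbers and their vetted outputs; any engine change that alters a vetted item shows up as a failed comparison.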
Psychometric Foundation
MathAIG implements the item model approach described in Dr. Embretson’s work on automatic item generation. In this framework:
- Core item structures are variabilized.
- Substitutions are drawn from controlled databases.
- Complexity and psychometric properties are preserved across variants.
The collaboration ensured that the engineering design aligned with contemporary psychometric theory rather than merely generating superficial variants.
Limitations and Future Directions
MathAIG currently requires authors to write procedural scripts. While the DSL is domain-specific and simplified, it still requires programming skill.
The ideal future system would allow entirely declarative specification:
- Authors would specify constraints and formatting.
- A solver would infer generation procedures.
However, two obstacles remain:
- Many constraint solvers rely on fixed-width integer or floating point arithmetic and cannot guarantee exact results.
- Few solvers support arbitrary precision integers, decimals, and rationals simultaneously.
Given these constraints, a procedural engine augmented with constraint programming features was implemented rather than a fully declarative solver-based architecture.
Impact
MathAIG represents a synthesis of:
- Psychometric item modeling
- Precision-aware numerical computing
- Domain-specific language design
- Workflow engineering for assessment production
It addresses both the theoretical demands of modern assessment science and the practical engineering constraints of large-scale item development.
The collaboration with Dr. Susan Embretson grounded the system in established item generation research while enabling the construction of a technically robust implementation.