Project Overview
CAIT (Conversational AI Tutor) is a Major Qualifying Project completed at Worcester Polytechnic Institute in partnership with ASSISTments. The project extended a conversational tutoring experience that combines the strengths of intelligent tutoring systems (ITS) with chatbot-style interaction.
Instead of limiting evaluation to internal demos, the main goal was to study CAIT in realistic classroom usage. We captured authentic student-tutor conversations during middle school math problem solving and treated those interactions as primary evidence for how well the tutor supports understanding, persistence, and step-by-step reasoning.
A key deliverable was a conversational analysis workflow that transformed raw dialogue into interpretable patterns, allowing us to compare behavior across instructional conditions and identify where tutor guidance did or did not help students move forward.
While outcome signals were promising in places, findings were ultimately inconclusive due to limited real-world sample size. The project still produced a strong deployment and evaluation baseline for future iterations, including clearer experiment design, larger classroom trials, and more robust measurement of learning impact.
Project Facts
- Team: Ryan Nguyen, Cameron Robbins, Cody Rueda
- Advisor: Dr. Neil Heffernan
- Co-supervisor: Eamon Worden
- Sponsor: ASSISTments
- Institution: Worcester Polytechnic Institute
- Year: 2025
Key Features
- Integrated CAIT directly into the ASSISTments learning flow so conversational support was available during normal problem-solving.
- Used a randomized controlled trial (RCT) design to compare tutoring strategies and reduce bias in instructional comparisons.
- Captured real classroom dialogue data rather than relying only on synthetic or internal testing interactions.
- Built a conversational analysis process to classify and interpret how student questions and tutor responses evolved over time.
- Connected conversation quality to practical classroom outcomes, emphasizing usability and measurable educational value.
- Documented follow-up recommendations for scaling data collection and improving next-phase experimental rigor.
Technologies Used
Tutoring Platform Integration: ASSISTments ITS environment and classroom math workflows.
Conversational AI Layer: LLM-assisted tutoring interactions designed for guided, educational dialogue.
Evaluation Framework: Randomized controlled trial (RCT) structure for comparing instructional conditions.
Conversation Analytics: Data capture, cleanup, and post-hoc interpretation of student-tutor transcripts.
Learning Context: Middle school mathematics with emphasis on practical in-class deployment.
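To make the conversation-analytics step above concrete, here is a minimal sketch of how student-tutor transcripts might be labeled and summarized. The `Turn` record, keyword lists, and label names are illustrative assumptions, not the project's actual classification scheme, which is documented in the full report.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical turn record; field names are illustrative, not from the report.
@dataclass
class Turn:
    speaker: str  # "student" or "tutor"
    text: str

# Toy keyword heuristics for labeling tutor moves (assumed for this sketch).
HINT_WORDS = {"try", "hint", "consider", "what if"}
FEEDBACK_WORDS = {"correct", "right", "exactly", "well done"}

def label_turn(turn: Turn) -> str:
    """Assign a coarse dialogue-act label to a tutor turn."""
    lowered = turn.text.lower()
    if any(w in lowered for w in HINT_WORDS):
        return "hint"
    if any(w in lowered for w in FEEDBACK_WORDS):
        return "feedback"
    return "other"

def summarize(transcript: list[Turn]) -> Counter:
    """Count dialogue-act labels across all tutor turns in one conversation."""
    return Counter(label_turn(t) for t in transcript if t.speaker == "tutor")

transcript = [
    Turn("student", "I don't know how to start."),
    Turn("tutor", "Try isolating the variable first."),
    Turn("student", "x = 4?"),
    Turn("tutor", "Exactly, well done!"),
]
print(summarize(transcript))  # Counter({'hint': 1, 'feedback': 1})
```

In practice, a real pipeline would replace the keyword heuristics with a trained or LLM-assisted classifier, but the same shape (per-turn labels rolled up into per-conversation summaries) supports comparing conditions across many classroom transcripts.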
Challenges & Learnings
The biggest challenge was balancing research quality with classroom reality. It is relatively easy to prototype conversational tutoring behavior, but much harder to gather enough high-quality in-class interaction data to support strong statistical conclusions.
We also learned that educational AI performance depends as much on evaluation design and instrumentation as it does on model behavior. Logging quality, condition assignment, and consistency of classroom usage all directly shape whether measured outcomes are interpretable.
Even with inconclusive final metrics, the project delivered concrete value: a deployable tutoring workflow, a repeatable analysis path, and a clear roadmap for future iterations focused on larger datasets and tighter linkage between conversational behavior and learning gains.
Project Deliverables
The file below is the official MQP report submitted to WPI for this project and includes the full methodology, analysis, and results.
CAIT MQP Final Report
Complete project report documenting motivation, study design, conversational analysis, findings, and future recommendations.