Graphletter is a Next.js 15 application that combines structured compliance data (SCF) with LLM-based document analysis to produce framework-aware compliance assessments.
Documents (PDF, DOCX, images, CSV) are uploaded, and their content is extracted with pdf-parse (PDF), mammoth (DOCX), and tesseract.js OCR (images). Extracted text is then normalized and chunked for analysis.
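The normalize-and-chunk step can be sketched as below. The chunk size and overlap values are illustrative assumptions, not Graphletter's actual settings:

```typescript
// Illustrative sketch of the normalize-and-chunk step.
// CHUNK_SIZE and OVERLAP are assumed values, not the app's real configuration.
const CHUNK_SIZE = 2000; // characters per chunk
const OVERLAP = 200;     // characters shared between adjacent chunks

function normalize(raw: string): string {
  // Strip control characters left over from extraction, then collapse whitespace runs.
  return raw
    .replace(/[\u0000-\u0008\u000b-\u001f]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

function chunk(text: string, size = CHUNK_SIZE, overlap = OVERLAP): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Overlapping chunks keep sentences that straddle a boundary visible to at least one chunk, which matters when evidence for a control spans a page break.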
The SCF control catalog (1,200+ controls across 33 domains) is loaded from versioned CSV data. Cross-framework mappings connect SCF controls to 79+ regulatory standards.
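The in-memory shapes for the catalog and its cross-framework mappings might look like the following sketch. Field names here are assumptions for illustration, not the actual SCF CSV column names:

```typescript
// Sketch of the catalog and mapping shapes; field names are assumptions,
// not the real SCF CSV columns.
interface ScfControl {
  id: string;          // e.g. "GOV-01" (hypothetical identifier format)
  domain: string;      // one of the SCF domains
  description: string;
}

// Cross-framework index: SCF control id -> framework name -> citation ids.
type FrameworkMappings = Map<string, Map<string, string[]>>;

function buildMappingIndex(
  rows: { scfId: string; framework: string; citation: string }[],
): FrameworkMappings {
  const index: FrameworkMappings = new Map();
  for (const { scfId, framework, citation } of rows) {
    const byFramework = index.get(scfId) ?? new Map<string, string[]>();
    const citations = byFramework.get(framework) ?? [];
    citations.push(citation);
    byFramework.set(framework, citations);
    index.set(scfId, byFramework);
  }
  return index;
}
```

Building the index once at load time makes "which ISO 27001 clauses does this control satisfy?" an O(1) lookup during assessment.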
Extracted content is matched against SCF controls and assessment objectives. Each control has testable criteria that evidence is evaluated against.
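Before the LLM mapping pass, a cheap lexical prefilter can narrow the candidate controls an evidence chunk is compared against. This is a hypothetical illustration of that idea, not Graphletter's actual matcher:

```typescript
// Hypothetical lexical prefilter narrowing candidate controls ahead of the
// LLM mapping pass; NOT the app's actual matching logic.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z]{3,}/g) ?? []);
}

// Overlap between an evidence chunk and a control's criteria text.
function overlapScore(chunk: string, criteria: string): number {
  const a = tokenize(chunk);
  const b = tokenize(criteria);
  if (a.size === 0 || b.size === 0) return 0;
  let shared = 0;
  for (const t of a) if (b.has(t)) shared++;
  return shared / Math.min(a.size, b.size);
}

function topCandidates<T extends { criteria: string }>(
  chunk: string,
  controls: T[],
  k = 5,
): T[] {
  return [...controls]
    .sort((x, y) => overlapScore(chunk, y.criteria) - overlapScore(chunk, x.criteria))
    .slice(0, k);
}
```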
Dual-provider AI assessment (OpenAI and Anthropic): GPT-5-mini handles ingestion and extraction, GPT-5 performs control mapping and final schema normalization, and Claude 3.7 Sonnet drives gap analysis and remediation recommendations.
Per-control confidence scores, evidence strength ratings (Strong/Moderate/Weak/Insufficient), gap identification, and remediation guidance are compiled into structured reports.
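A per-control result compiled into the report might be shaped like the sketch below. The field names and the summary helper are assumptions for illustration; only the evidence strength ratings come from the description above:

```typescript
// Sketch of the per-control result shape compiled into reports.
// Field names are assumptions; the strength ratings match the four levels above.
type EvidenceStrength = "Strong" | "Moderate" | "Weak" | "Insufficient";

interface ControlResult {
  controlId: string;
  confidence: number;        // per-control confidence score, 0..1 (assumed scale)
  strength: EvidenceStrength;
  gaps: string[];            // identified gaps
  remediation: string[];     // remediation guidance
}

// Roll individual results up into report-level summary figures.
function summarize(results: ControlResult[]) {
  const withGaps = results.filter(r => r.gaps.length > 0).length;
  const avgConfidence =
    results.reduce((s, r) => s + r.confidence, 0) / Math.max(results.length, 1);
  return { assessed: results.length, withGaps, avgConfidence };
}
```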
Each pipeline task is routed to the model best suited to its requirements, and providers are configured with automatic fallback.
| Task | Model | Temp | Rationale |
|---|---|---|---|
| Document ingestion / parsing | GPT-5-mini | 0.0 | Fast, cost-efficient extraction with strong accuracy for chunking and metadata capture |
| Control mapping / classification | GPT-5 | 0.1 | Improved reasoning and structured JSON reliability for evidence-to-control mapping |
| Gap analysis + recommendations | Claude 3.7 Sonnet | 0.2 | Stronger long-document synthesis and analytical writing for remediation narratives |
| Final structured compliance output | GPT-5 | 0.1 | Normalizes multi-model outputs into a strict, deterministic compliance schema |
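The routing in the table above could be expressed as a config map. The task keys, model identifiers, fallback pairings, and the `pickModel` helper are illustrative assumptions; only the primary model and temperature per task come from the table:

```typescript
// Task-to-model routing mirroring the table above. Fallback pairings and the
// pickModel helper are illustrative assumptions, not the app's actual code.
type Task = "ingestion" | "mapping" | "gap-analysis" | "final-output";

interface ModelRoute {
  model: string;
  temperature: number;
  fallback?: string; // used when the primary provider call fails (assumed behavior)
}

const ROUTES: Record<Task, ModelRoute> = {
  "ingestion":    { model: "gpt-5-mini",        temperature: 0.0, fallback: "gpt-5" },
  "mapping":      { model: "gpt-5",             temperature: 0.1, fallback: "claude-3-7-sonnet" },
  "gap-analysis": { model: "claude-3-7-sonnet", temperature: 0.2, fallback: "gpt-5" },
  "final-output": { model: "gpt-5",             temperature: 0.1, fallback: "claude-3-7-sonnet" },
};

function pickModel(task: Task, primaryHealthy: boolean): string {
  const route = ROUTES[task];
  return primaryHealthy ? route.model : route.fallback ?? route.model;
}
```

Pairing each OpenAI task with an Anthropic fallback (and vice versa) keeps the pipeline alive through a single-provider outage.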
Long-running operations (evidence upload, multi-control assessment) use Vercel Workflow Dev Kit for durable execution. Each pipeline stage is a retryable step with state persistence — operations survive function timeouts, deployments, and transient AI provider failures.
The evidence pipeline is split into three durable stages: content extraction, upload & persistence, and AI assessment. Assessment objectives are evaluated in parallel with automatic retry on transient failures.
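The "parallel with automatic retry" pattern inside the assessment stage can be sketched generically as follows. This is not the Vercel Workflow Dev Kit API; it only illustrates the retry-and-fan-out behavior each durable step relies on:

```typescript
// Generic sketch of "parallel evaluation with automatic retry on transient
// failures". NOT the Vercel Workflow Dev Kit API; attempt counts and backoff
// values are assumptions.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Exponential backoff between attempts.
      await new Promise(r => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}

// Evaluate all assessment objectives concurrently, retrying each independently
// so one flaky provider call does not fail the whole batch.
async function assessObjectives<O, R>(
  objectives: O[],
  evaluate: (o: O) => Promise<R>,
): Promise<R[]> {
  return Promise.all(objectives.map(o => withRetry(() => evaluate(o))));
}
```

In the real pipeline the workflow runtime additionally persists each step's result, so a redeploy or timeout resumes from the last completed step instead of re-running the whole assessment.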