Kontia/ai_integration.md
Marcelo Dares ea23136288 changes
2026-04-29 01:15:50 +02:00


# AI Integration Plan for Modules 1, 2, 3, 7, and 10
## Summary
- Goal: add AI assistance to M1, M2, M3, M7, and M10 while keeping deterministic scoring/compliance logic as the source of truth.
- Product mode selected: **Assist-only** (AI suggests; user explicitly accepts/applies).
- Model strategy selected: **Higher quality** (primary `gpt-4.1`, fallback `gpt-4.1-mini`).
- Persistence selected: **Structured traceability** (store request/response metadata, model, usage, warnings, and accept/dismiss events).
## Core Architecture Changes
- Add a shared AI service layer used by all 5 modules:
- `callOpenAiJsonSchema()` and `callOpenAiJsonObjectFallback()`.
- shared timeout/retry handling, prompt versioning, truncation guard, and safe JSON parsing.
- standardized output envelope: `engine`, `model`, `usage`, `warnings`, `confidence`.
- Add a shared persistence model for AI traces:
- New Prisma model `AiSuggestion` (or equivalent name), fields:
- `id`, `userId`, `moduleKey`, `featureKey`, `subjectType`, `subjectId`.
- `inputHash`, `requestJson`, `responseJson`, `confidence`.
- `engine`, `model`, `usageJson`, `warningsJson`, `promptVersion`.
- `status` (`GENERATED | ACCEPTED | DISMISSED | EXPIRED`), `actedAt`, `createdAt`, `updatedAt`.
- Purpose:
- deduplicate repeated requests by `inputHash`.
- preserve auditability and user decisions.
- Add shared decision endpoint:
- `POST /api/ai/suggestions/{id}/decision` with `{ decision: "accept" | "dismiss" }`.
- Keep deterministic engines untouched and authoritative:
- M1 scoring, M2 scoring, M3 deterministic match score, M7 rule alerts, M10 scoring remain unchanged.
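The shared-service behaviors above (truncation guard, safe JSON parsing, dedupe by `inputHash`) can be sketched as pure helpers. This is a minimal illustration, not the final implementation; the function names and the `sha256` hashing choice are assumptions.

```typescript
// Sketch of shared AI service helpers; names and details are illustrative.
import { createHash } from "node:crypto";

const MAX_CHARS = 55_000; // mirrors OPENAI_SMART_MAX_CHARS

// Truncation guard: cap the prompt payload and record a warning instead of failing.
function truncateInput(text: string, maxChars = MAX_CHARS): { text: string; warnings: string[] } {
  if (text.length <= maxChars) return { text, warnings: [] };
  return { text: text.slice(0, maxChars), warnings: ["input_truncated"] };
}

// Safe JSON parsing: never throw on a malformed model response; surface a warning.
function safeJsonParse<T>(raw: string): { data: T | null; warnings: string[] } {
  try {
    return { data: JSON.parse(raw) as T, warnings: [] };
  } catch {
    return { data: null, warnings: ["invalid_json_response"] };
  }
}

// Dedupe key: a stable hash of module, feature, prompt version, and input,
// suitable for the AiSuggestion.inputHash column.
function inputHash(moduleKey: string, featureKey: string, promptVersion: string, input: unknown): string {
  return createHash("sha256")
    .update(JSON.stringify({ moduleKey, featureKey, promptVersion, input }))
    .digest("hex");
}
```

Because these helpers are pure, they can be unit-tested without any OpenAI calls, which matches the test plan below.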
## Public API and Type Additions
- `POST /api/diagnostic/ai/suggestions` (M1):
- Input: `moduleKey`, optional `questionId`.
- Output: list of suggestions with `questionId`, `suggestedAnswerOptionId`, `rationale`, `missingEvidence`, `confidence`, `suggestionId`.
- `POST /api/strategic-diagnostic/ai/insights` (M2):
- Input: current strategic form snapshot + evidence metadata.
- Output: `sectionGaps`, `priorityActions`, `suggestedEvidence`, `suggestedFieldValues`, `confidence`, `suggestionId`.
- `GET /api/licitations/ai/recommendations` (M3):
- Input: optional query (`top`, `profileId`), based on deterministic pre-ranked items.
- Output: deterministic score + AI score + blended score + `aiReasons`, `aiRisks`, `nextStep`, `suggestionId`.
- `POST /api/compliance/m7/ai/playbook` (M7):
- Input: current M7 dataset.
- Output: `predictedIncidents`, `priorityOrder`, `preventiveActions`, `escalationAdvice`, `confidence`, `suggestionId`.
- `POST /api/audits/ai/findings` (M10):
- Input: selected simulation + current institutional dossier.
- Output: `auditorLikelyFindings`, `missingEvidence`, `topRisks`, `remediationPlan`, `confidence`, `suggestionId`.
- Type updates:
- Extend view types to include optional AI sections per module (`aiSuggestion`, `aiInsights`, `aiFit`, `aiPlaybook`, `aiFindings`) with strict nullable typing.
- Add common `AiUsage`, `AiWarning`, `AiSuggestionStatus`, `AiDecision`.
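The common types and the decision endpoint's payload validation could look like the following sketch. Field names inside `AiUsage` and `AiWarning` are assumptions; only the type names and the `accept`/`dismiss` values come from this plan.

```typescript
// Illustrative shapes for the shared AI types (field names are assumptions).
interface AiUsage { promptTokens: number; completionTokens: number; totalTokens: number; }
interface AiWarning { code: string; message?: string; }
type AiSuggestionStatus = "GENERATED" | "ACCEPTED" | "DISMISSED" | "EXPIRED";
type AiDecision = "accept" | "dismiss";

// Validates the body of POST /api/ai/suggestions/{id}/decision.
// Returns the decision if valid, or null for any malformed payload.
function parseDecision(body: unknown): AiDecision | null {
  if (typeof body !== "object" || body === null) return null;
  const d = (body as { decision?: unknown }).decision;
  return d === "accept" || d === "dismiss" ? d : null;
}
```

Returning `null` rather than throwing keeps malformed-JSON handling in one place, in line with the route tests below.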
## Module-by-Module Behavior
- M1:
- AI infers likely answers for unanswered questions based on answered responses and evidence notes.
  - UI per question: a “Sugerencia IA” (“AI Suggestion”) card with `Apply` and `Dismiss` actions.
  - Applying a suggestion reuses the existing response-save route, so the scoring pipeline is unchanged.
- M2:
- AI analyzes strategic data + evidence counts to propose prioritized actions and missing evidence by section.
  - UI in each section and in the results tab: “Plan sugerido por IA” (“AI-Suggested Plan”).
- Suggested field values are shown as explicit apply actions, never auto-written.
- M3:
- Keep deterministic scoring as base.
  - AI re-ranks only the top deterministic candidates to control latency and cost.
  - UI shows three values: deterministic score, AI fit, and blended score; match reasons become more specific and semantically grounded.
- M7:
- AI creates a prioritized compliance playbook from existing alerts/deadlines/checklist.
  - UI adds a “Plan IA” (“AI Plan”) tab with predicted incidents and actions, each with a suggested owner and target date.
- No severity/status changes are auto-applied.
- M10:
- AI simulates auditor perspective using latest simulation and dossier.
  - UI adds a “Dictamen IA” (“AI Opinion”) view with likely findings, missing evidence, and remediation order.
- Deterministic section scores remain unchanged and visible alongside AI narrative.
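The M3 blending of deterministic score and AI fit can be sketched as a small pure function. The 0.7/0.3 weighting and the 0-100 scale are assumptions for illustration, not decisions recorded in this plan.

```typescript
// Sketch of M3 score blending; weights and scale are assumptions.
function blendScore(deterministic: number, aiFit: number, aiWeight = 0.3): number {
  // Clamp both inputs to [0, 100] so a malformed AI score cannot dominate
  // the deterministic base.
  const clamp = (v: number) => Math.min(100, Math.max(0, v));
  return Math.round(clamp(deterministic) * (1 - aiWeight) + clamp(aiFit) * aiWeight);
}
```

Keeping the deterministic score as the heavier term preserves its role as the authoritative base while letting the AI fit nudge the ranking.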
## Test Plan
- Unit tests:
- AI parser/normalizer for each module schema.
- prompt truncation + fallback behavior.
- suggestion persistence lifecycle (`GENERATED -> ACCEPTED/DISMISSED`).
- Route tests:
- auth, validation, malformed JSON handling.
- OpenAI failure paths return structured fallback/empty-assist safely.
- Integration tests:
- M1 apply suggestion updates response but does not break scoring calculations.
  - M2/M3/M7/M10 AI outputs render without changing deterministic KPIs/scores unless the user explicitly applies a suggestion.
- Regression tests:
- existing deterministic test suites for M1/M2/M3/M7/M10 continue passing unchanged.
- Acceptance scenarios:
- each module can produce at least one AI suggestion with trace metadata.
- user can accept/dismiss and decision is persisted.
- pages remain functional when AI is unavailable.
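The persistence-lifecycle unit test above implies a transition rule along these lines. This is a sketch under the assumption that `ACCEPTED`, `DISMISSED`, and `EXPIRED` are terminal states; the plan only specifies `GENERATED -> ACCEPTED/DISMISSED` explicitly.

```typescript
// Sketch of the suggestion status lifecycle the unit tests would cover.
type AiSuggestionStatus = "GENERATED" | "ACCEPTED" | "DISMISSED" | "EXPIRED";

function applyDecision(current: AiSuggestionStatus, decision: "accept" | "dismiss"): AiSuggestionStatus {
  if (current !== "GENERATED") {
    // Assumed rule: decided or expired suggestions are terminal.
    throw new Error(`cannot apply decision to status ${current}`);
  }
  return decision === "accept" ? "ACCEPTED" : "DISMISSED";
}
```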
## Environment and Rollout Defaults
- New env vars:
- `OPENAI_SMART_MODEL=gpt-4.1`
- `OPENAI_SMART_FALLBACK_MODEL=gpt-4.1-mini`
- `OPENAI_SMART_TIMEOUT_MS=75000`
- `OPENAI_SMART_MAX_CHARS=55000`
- optional per-module overrides (`OPENAI_M1_MODEL`, `OPENAI_M2_MODEL`, etc.).
- Rollout:
- Phase 1: backend + trace model + M3 and M10 (highest visible value).
- Phase 2: M1 and M2 assist flows.
- Phase 3: M7 predictive playbook tuning.
- Documentation updates:
- add a new section in README for “AI Assist Modules (M1/M2/M3/M7/M10)” and env variables.
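Per-module model resolution from the env vars above could work as follows. The precedence (module override, then `OPENAI_SMART_MODEL`, then a hard-coded default) is an assumption consistent with the variables listed.

```typescript
// Sketch of model resolution; precedence order is an assumption.
type Env = Record<string, string | undefined>;
type ModuleKey = "M1" | "M2" | "M3" | "M7" | "M10";

function resolveModel(moduleKey: ModuleKey, env: Env): string {
  return (
    env[`OPENAI_${moduleKey}_MODEL`] ?? // optional per-module override
    env.OPENAI_SMART_MODEL ??           // shared default
    "gpt-4.1"                           // last-resort fallback
  );
}
```

Taking `env` as a parameter (rather than reading `process.env` directly) keeps the resolver trivially unit-testable.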
## Assumptions
- Current auth gating behavior remains unchanged in this scope.
- AI output language is Spanish for end-user text.
- No auto-application beyond explicit user action.
- Existing AI-enabled modules (M4/M5/M6/M8/M9) are not refactored in this effort.