Kontia/ai_integration.md
Marcelo Dares ea23136288 changes
2026-04-29 01:15:50 +02:00

AI Integration Plan for Modules 1, 2, 3, 7, and 10

Summary

  • Goal: add AI assistance to M1, M2, M3, M7, and M10 while keeping deterministic scoring/compliance logic as the source of truth.
  • Product mode selected: Assist-only (AI suggests; user explicitly accepts/applies).
  • Model strategy selected: Higher quality (primary gpt-4.1, fallback gpt-4.1-mini).
  • Persistence selected: Structured traceability (store request/response metadata, model, usage, warnings, and accept/dismiss events).

Core Architecture Changes

  • Add a shared AI service layer used by all 5 modules:
    • callOpenAiJsonSchema() and callOpenAiJsonObjectFallback().
    • shared timeout/retry handling, prompt versioning, truncation guard, and safe JSON parsing.
    • standardized output envelope: engine, model, usage, warnings, confidence.
  • Add a shared persistence model for AI traces:
    • New Prisma model AiSuggestion (or equivalent name), fields:
      • id, userId, moduleKey, featureKey, subjectType, subjectId.
      • inputHash, requestJson, responseJson, confidence.
      • engine, model, usageJson, warningsJson, promptVersion.
      • status (GENERATED | ACCEPTED | DISMISSED | EXPIRED), actedAt, createdAt, updatedAt.
    • Purpose:
      • deduplicate repeated requests by inputHash.
      • preserve auditability and user decisions.
  • Add shared decision endpoint:
    • POST /api/ai/suggestions/{id}/decision with { decision: "accept" | "dismiss" }.
  • Keep deterministic engines untouched and authoritative:
    • M1 scoring, M2 scoring, M3 deterministic match score, M7 rule alerts, M10 scoring remain unchanged.
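
A minimal sketch of how the shared service layer could combine the truncation guard, safe JSON parsing, model fallback, and the standardized envelope. The plan does not specify the signatures of callOpenAiJsonSchema() or callOpenAiJsonObjectFallback(), so the transport is injected here as a hypothetical ModelCall function. The names runAssist, truncatePrompt, safeParseJson, the data field on the envelope, and the AiUsage field names are all assumptions made for illustration.

```typescript
// Sketch only: ModelCall, runAssist, and AiEnvelope.data are hypothetical names.
interface AiUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

interface AiEnvelope<T> {
  engine: "openai" | "none";
  model: string | null;
  usage: AiUsage | null;
  warnings: string[];
  confidence: number | null;
  data: T | null; // assumed payload slot, not named in the plan
}

// Injected transport so the sketch stays independent of any SDK.
type ModelCall = (model: string, prompt: string) => Promise<{ text: string; usage: AiUsage }>;

type ParseResult = { ok: true; value: unknown } | { ok: false };

// Truncation guard: cap the prompt and record a warning instead of failing.
function truncatePrompt(prompt: string, maxChars: number, warnings: string[]): string {
  if (prompt.length <= maxChars) return prompt;
  warnings.push(`prompt truncated from ${prompt.length} to ${maxChars} chars`);
  return prompt.slice(0, maxChars);
}

// Safe JSON parsing: never throws on malformed model output.
function safeParseJson(text: string): ParseResult {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch {
    return { ok: false };
  }
}

// Try each model in order; return an empty envelope if none yields valid data.
async function runAssist<T extends { confidence: number }>(
  prompt: string,
  models: string[],
  call: ModelCall,
  validate: (value: unknown) => T | null,
  maxChars: number,
): Promise<AiEnvelope<T>> {
  const warnings: string[] = [];
  const input = truncatePrompt(prompt, maxChars, warnings);
  for (const model of models) {
    try {
      const { text, usage } = await call(model, input);
      const parsed = safeParseJson(text);
      const data = parsed.ok ? validate(parsed.value) : null;
      if (data !== null) {
        return { engine: "openai", model, usage, warnings, confidence: data.confidence, data };
      }
      warnings.push(`${model}: response did not match the expected JSON shape`);
    } catch (err) {
      const reason = err instanceof Error ? err.message : "unknown error";
      warnings.push(`${model}: call failed (${reason})`);
    }
  }
  return { engine: "none", model: null, usage: null, warnings, confidence: null, data: null };
}
```

Returning an empty envelope with warnings, rather than throwing, is one way to keep pages functional when the AI call fails. Timeout and retry handling are omitted from this sketch.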

Public API and Type Additions

  • POST /api/diagnostic/ai/suggestions (M1):
    • Input: moduleKey, optional questionId.
    • Output: list of suggestions with questionId, suggestedAnswerOptionId, rationale, missingEvidence, confidence, suggestionId.
  • POST /api/strategic-diagnostic/ai/insights (M2):
    • Input: current strategic form snapshot + evidence metadata.
    • Output: sectionGaps, priorityActions, suggestedEvidence, suggestedFieldValues, confidence, suggestionId.
  • GET /api/licitations/ai/recommendations (M3):
    • Input: optional query parameters (top, profileId); operates on the deterministically pre-ranked items.
    • Output: deterministic score, AI score, blended score, aiReasons, aiRisks, nextStep, suggestionId.
  • POST /api/compliance/m7/ai/playbook (M7):
    • Input: current M7 dataset.
    • Output: predictedIncidents, priorityOrder, preventiveActions, escalationAdvice, confidence, suggestionId.
  • POST /api/audits/ai/findings (M10):
    • Input: selected simulation + current institutional dossier.
    • Output: auditorLikelyFindings, missingEvidence, topRisks, remediationPlan, confidence, suggestionId.
  • Type updates:
    • Extend view types to include optional AI sections per module (aiSuggestion, aiInsights, aiFit, aiPlaybook, aiFindings) with strict nullable typing.
    • Add common AiUsage, AiWarning, AiSuggestionStatus, AiDecision.
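
The shared types could look roughly like the sketch below. The type names AiUsage, AiWarning, AiSuggestionStatus, and AiDecision come from the plan, and the M1 fields mirror the output listed for POST /api/diagnostic/ai/suggestions. The fields inside AiUsage and AiWarning, the DiagnosticQuestionView type, and the parseDecisionBody helper are assumptions for illustration.

```typescript
type AiSuggestionStatus = "GENERATED" | "ACCEPTED" | "DISMISSED" | "EXPIRED";
type AiDecision = "accept" | "dismiss";

// Field names below are assumed; the plan only names the types.
interface AiUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

interface AiWarning {
  code: string;
  message: string;
}

// One M1 suggestion, using the output fields listed in the plan.
interface M1Suggestion {
  questionId: string;
  suggestedAnswerOptionId: string;
  rationale: string;
  missingEvidence: string[];
  confidence: number;
  suggestionId: string;
}

// Hypothetical view type: the AI section is always present but nullable,
// so consumers must handle the "no suggestion" case explicitly.
interface DiagnosticQuestionView {
  questionId: string;
  aiSuggestion: M1Suggestion | null;
  aiWarnings: AiWarning[];
  aiUsage: AiUsage | null;
}

// Validates the body of POST /api/ai/suggestions/{id}/decision.
// Returns null for anything other than { decision: "accept" | "dismiss" }.
function parseDecisionBody(body: unknown): AiDecision | null {
  if (typeof body !== "object" || body === null) return null;
  const decision = (body as { decision?: unknown }).decision;
  if (decision === "accept" || decision === "dismiss") return decision;
  return null;
}

// A view with no AI data still satisfies the type.
const emptyView: DiagnosticQuestionView = {
  questionId: "q-1",
  aiSuggestion: null,
  aiWarnings: [],
  aiUsage: null,
};
```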

Module-by-Module Behavior

  • M1:
    • AI infers likely answers for unanswered questions from the existing responses and evidence notes.
    • UI per question: “Sugerencia IA” card with Apply and Dismiss.
    • Applying a suggestion uses the existing response save route, so the scoring pipeline is unchanged.
  • M2:
    • AI analyzes strategic data + evidence counts to propose prioritized actions and missing evidence by section.
    • UI in each section and in the results tab: “Plan sugerido por IA”.
    • Suggested field values are shown as explicit apply actions, never auto-written.
  • M3:
    • Keep deterministic scoring as base.
    • AI re-ranks only the top deterministic candidates to control latency and cost.
    • UI shows three values: deterministic score, AI fit, and blended score; match reasons become semantic and more specific.
  • M7:
    • AI creates a prioritized compliance playbook from existing alerts/deadlines/checklist.
    • UI adds a “Plan IA” tab with incident predictions and actions, each with a suggested owner and target date.
    • No severity/status changes are auto-applied.
  • M10:
    • AI simulates an auditor's perspective using the latest simulation and dossier.
    • UI adds a “Dictamen IA” section with likely findings, missing evidence, and remediation order.
    • Deterministic section scores remain unchanged and visible alongside AI narrative.
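
For M3, the plan does not say how the blended score is computed. The sketch below assumes a simple weighted average of the deterministic score and the AI fit, applied only to the top N deterministic candidates, with both scores on the same scale. The function name, the weighting formula, and the aiWeight parameter are all assumptions; the deterministic score is carried through unchanged.

```typescript
interface RankedItem {
  id: string;
  deterministicScore: number; // assumed to share a scale with aiFit
}

interface BlendedItem extends RankedItem {
  aiFit: number | null; // null when the item was not sent to the AI
  blendedScore: number;
}

// Re-ranks only the first topN deterministic candidates; the rest keep
// their deterministic order and score.
function blendTopCandidates(
  items: RankedItem[],
  aiFitById: Record<string, number | undefined>,
  topN: number,
  aiWeight: number, // assumed blend weight in [0, 1]
): BlendedItem[] {
  const sorted = items.slice().sort((a, b) => b.deterministicScore - a.deterministicScore);
  const blended = sorted.map((item, index): BlendedItem => {
    const aiFit = index < topN ? (aiFitById[item.id] ?? null) : null;
    const blendedScore =
      aiFit === null
        ? item.deterministicScore
        : (1 - aiWeight) * item.deterministicScore + aiWeight * aiFit;
    return { ...item, aiFit, blendedScore };
  });
  const head = blended.slice(0, topN).sort((a, b) => b.blendedScore - a.blendedScore);
  return head.concat(blended.slice(topN));
}

const ranked = blendTopCandidates(
  [
    { id: "a", deterministicScore: 80 },
    { id: "b", deterministicScore: 70 },
    { id: "c", deterministicScore: 60 },
  ],
  { a: 40, b: 90, c: 100 },
  2,
  0.5,
);
// "c" is outside the top 2, so its AI fit is ignored and it stays last.
console.log(ranked.map((r) => `${r.id}:${r.blendedScore}`).join(" "));
```

Limiting the AI call to the top N keeps latency and cost bounded, at the price of never promoting an item the deterministic ranking placed below the cutoff.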

Test Plan

  • Unit tests:
    • AI parser/normalizer for each module schema.
    • prompt truncation + fallback behavior.
    • suggestion persistence lifecycle (GENERATED -> ACCEPTED/DISMISSED).
  • Route tests:
    • auth, validation, malformed JSON handling.
    • OpenAI failure paths return a structured fallback or empty-assist response rather than failing the request.
  • Integration tests:
    • Applying an M1 suggestion updates the response without breaking scoring calculations.
    • M2/M3/M7/M10 AI outputs render without changing deterministic KPIs/scores unless the user explicitly applies a suggestion, in the modules where applying is supported.
  • Regression tests:
    • existing deterministic test suites for M1/M2/M3/M7/M10 continue passing unchanged.
  • Acceptance scenarios:
    • each module can produce at least one AI suggestion with trace metadata.
    • the user can accept or dismiss a suggestion, and the decision is persisted.
    • pages remain functional when AI is unavailable.
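
The suggestion lifecycle that the unit tests target (GENERATED -> ACCEPTED/DISMISSED) can be expressed as a pure transition function, which is easy to test without a database. This is a sketch under one assumption the plan does not state: only a suggestion still in GENERATED may receive a decision. SuggestionRecord, DecisionResult, and applyDecision are hypothetical names.

```typescript
type AiSuggestionStatus = "GENERATED" | "ACCEPTED" | "DISMISSED" | "EXPIRED";
type AiDecision = "accept" | "dismiss";

// Subset of the AiSuggestion fields relevant to the transition.
interface SuggestionRecord {
  id: string;
  status: AiSuggestionStatus;
  actedAt: Date | null;
}

type DecisionResult =
  | { ok: true; suggestion: SuggestionRecord }
  | { ok: false; reason: string };

// Pure transition: returns a new record and never mutates the input.
// Assumption: decisions are rejected once a suggestion has left GENERATED.
function applyDecision(
  suggestion: SuggestionRecord,
  decision: AiDecision,
  now: Date,
): DecisionResult {
  if (suggestion.status !== "GENERATED") {
    return { ok: false, reason: `suggestion is already ${suggestion.status}` };
  }
  const status: AiSuggestionStatus = decision === "accept" ? "ACCEPTED" : "DISMISSED";
  return { ok: true, suggestion: { ...suggestion, status, actedAt: now } };
}

const fresh: SuggestionRecord = { id: "s-1", status: "GENERATED", actedAt: null };
const accepted = applyDecision(fresh, "accept", new Date(0));
const repeated = accepted.ok
  ? applyDecision(accepted.suggestion, "dismiss", new Date(1))
  : accepted;
```

A route handler for the decision endpoint could call this function and persist the returned record only when ok is true.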

Environment and Rollout Defaults

  • New env vars:
    • OPENAI_SMART_MODEL=gpt-4.1
    • OPENAI_SMART_FALLBACK_MODEL=gpt-4.1-mini
    • OPENAI_SMART_TIMEOUT_MS=75000
    • OPENAI_SMART_MAX_CHARS=55000
    • optional per-module overrides (OPENAI_M1_MODEL, OPENAI_M2_MODEL, etc.).
  • Rollout:
    • Phase 1: backend + trace model + M3 and M10 (highest visible value).
    • Phase 2: M1 and M2 assist flows.
    • Phase 3: M7 predictive playbook tuning.
  • Documentation updates:
    • add a new section in README for “AI Assist Modules (M1/M2/M3/M7/M10)” and env variables.
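
One possible way to read these variables is sketched below, with the listed values as in-code defaults and the per-module override taking precedence over OPENAI_SMART_MODEL. The loadAiConfig and readPositiveInt helpers and the AiConfig shape are hypothetical; the environment is passed in as a plain object so the function can be tested without touching the real process environment.

```typescript
type Env = Record<string, string | undefined>;

interface AiConfig {
  model: string;
  fallbackModel: string;
  timeoutMs: number;
  maxChars: number;
}

// Parses a positive integer, falling back when the value is missing or invalid.
function readPositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// moduleKey such as "m1" or "m3" selects an optional OPENAI_M1_MODEL-style override.
function loadAiConfig(env: Env, moduleKey?: string): AiConfig {
  const override = moduleKey ? env[`OPENAI_${moduleKey.toUpperCase()}_MODEL`] : undefined;
  return {
    model: override || env["OPENAI_SMART_MODEL"] || "gpt-4.1",
    fallbackModel: env["OPENAI_SMART_FALLBACK_MODEL"] || "gpt-4.1-mini",
    timeoutMs: readPositiveInt(env["OPENAI_SMART_TIMEOUT_MS"], 75000),
    maxChars: readPositiveInt(env["OPENAI_SMART_MAX_CHARS"], 55000),
  };
}

const defaults = loadAiConfig({});
const m3Config = loadAiConfig({ OPENAI_M3_MODEL: "gpt-4.1-mini", OPENAI_SMART_TIMEOUT_MS: "abc" }, "m3");
```

In a Node.js server this would typically be called as loadAiConfig(process.env, "m3").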

Assumptions

  • Current auth gating behavior remains unchanged in this scope.
  • AI output language is Spanish for end-user text.
  • No auto-application beyond explicit user action.
  • Existing AI-enabled modules (M4/M5/M6/M8/M9) are not refactored in this effort.