This is a [Next.js](https://nextjs.org) project bootstrapped with [`create-next-app`](https://nextjs.org/docs/app/api-reference/cli/create-next-app).

## Getting Started

First, run the development server:

```bash
npm run dev
```

Open [http://localhost:3000](http://localhost:3000) with your browser to see the result.

You can start editing the page by modifying `app/page.tsx`. The page auto-updates as you edit the file.

This project uses [`next/font`](https://nextjs.org/docs/app/building-your-application/optimizing/fonts) to automatically optimize and load [Geist](https://vercel.com/font), a new font family for Vercel.

## Learn More

To learn more about Next.js, take a look at the following resources:

- [Next.js Documentation](https://nextjs.org/docs) - learn about Next.js features and API.
- [Learn Next.js](https://nextjs.org/learn) - an interactive Next.js tutorial.

You can check out [the Next.js GitHub repository](https://github.com/vercel/next.js) - your feedback and contributions are welcome!

## Deploy on Vercel

The easiest way to deploy your Next.js app is to use the [Vercel Platform](https://vercel.com/new?utm_medium=default-template&filter=next.js&utm_source=create-next-app&utm_campaign=create-next-app-readme) from the creators of Next.js.

Check out our [Next.js deployment documentation](https://nextjs.org/docs/app/building-your-application/deploying) for more details.

## OCR Setup (Recommended)

PDF analysis tries direct text extraction first. If the extracted text is insufficient (common in scanned PDFs), the API falls back to OCR with `ocrmypdf`.

Install host dependencies (Ubuntu/Debian):

```bash
sudo apt-get update
sudo apt-get install -y ocrmypdf poppler-utils tesseract-ocr tesseract-ocr-spa tesseract-ocr-eng
```

Verify:

```bash
ocrmypdf --version
```

If OCR is not available, the API returns a specific error (`OCR_UNAVAILABLE`) with install guidance.

## AI Extraction for Acta Constitutiva

Onboarding now uses AI as the default extraction engine after PDF text analysis:

1. Extract direct text from the PDF.
2. If the text is insufficient, run OCR.
3. Send the extracted text to OpenAI to map fields and lookup dictionary.
4. If AI fails, fallback extraction is used so onboarding is not blocked.

Environment variables:

```bash
OPENAI_API_KEY=sk-...
OPENAI_ACTA_MODEL=gpt-4.1-mini
OPENAI_ACTA_TIMEOUT_MS=60000
OPENAI_ACTA_MAX_CHARS=45000
```

## Local CLI Script (PDF -> OCR/text -> AI)

Run:

```bash
npm run acta:analyze:ai -- ./path/to/acta.pdf
```

Optional output file:

```bash
npm run acta:analyze:ai -- ./path/to/acta.pdf --out ./result.json
```

## Licita Ya API Key Test

Add these vars to `.env`:

```bash
LICITAYA_API_KEY=your-licitaya-api-key
LICITAYA_BASE_URL=https://
LICITAYA_TEST_ENDPOINT=/tender/search?items=10&page=1
LICITAYA_ACCEPT=application/json
LICITAYA_TIMEOUT_MS=20000
```

Run the connection test:

```bash
npm run licitaya:test
```

Override values on demand (quote the endpoint so the shell does not interpret `&`):

```bash
npm run licitaya:test -- --base-url https://www.licitaya.com.mx/api/v1 --endpoint '/tender/search?items=10&page=1' --accept application/json
```

You can also pass a full URL in `--endpoint`:

```bash
npm run licitaya:test -- --endpoint https:///
```

Common Licita Ya lookups:

```bash
# Search tenders (keyword + filters)
npm run licitaya:test -- --endpoint '/tender/search?keyword=computadora,monitor&state=NLE,XX&items=10&page=1&order=1'

# Search by date (YYYYmmdd)
npm run licitaya:test -- --endpoint '/tender/search?date=20260313&items=10&page=1'

# Get one tender by ID
npm run licitaya:test -- --endpoint '/tender/SCRZJ'
```

Country base URL (pick one only):

- Mexico: `https://www.licitaya.com.mx/api/v1`
- Argentina: `https://www.licitaya.com.ar/api/v1`

Notes:

- The script sends your key in header `X-API-KEY`.
- It prints status code + response preview.
- A non-2xx response exits with code `1` (useful for CI checks).
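
The behavior described in the notes (key sent in `X-API-KEY`, status code plus response preview, exit code `1` on a non-2xx response) could be sketched roughly as follows. This is an illustration only, not the project's actual `licitaya:test` script: the names `resolveUrl`, `exitCodeFor`, `runConnectionTest`, the `HttpGet` type, the URL-joining rule, and the 300-character preview length are all assumptions, and timeout handling is left out for brevity.

```typescript
// Hypothetical sketch: names, join rules, and preview length are assumptions,
// not the real implementation behind `npm run licitaya:test`.

type HttpResponse = { status: number; text(): Promise<string> };
type HttpGet = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<HttpResponse>;

// Use a full URL as-is; otherwise join the endpoint onto the base URL
// with exactly one slash between them.
function resolveUrl(baseUrl: string, endpoint: string): string {
  if (/^https?:\/\//i.test(endpoint)) return endpoint;
  const base = baseUrl.replace(/\/+$/, "");
  const path = endpoint.charAt(0) === "/" ? endpoint : `/${endpoint}`;
  return `${base}${path}`;
}

// Map an HTTP status to a process exit code: 0 for 2xx, 1 otherwise.
function exitCodeFor(status: number): number {
  return status >= 200 && status < 300 ? 0 : 1;
}

// Send one GET request with the API key header, print the status and a
// short preview of the body, and return the exit code to use.
async function runConnectionTest(
  get: HttpGet,
  opts: { apiKey: string; baseUrl: string; endpoint: string; accept: string },
): Promise<number> {
  const url = resolveUrl(opts.baseUrl, opts.endpoint);
  const res = await get(url, {
    headers: { "X-API-KEY": opts.apiKey, Accept: opts.accept },
  });
  const body = await res.text();
  console.log(`status: ${res.status}`);
  console.log(`preview: ${body.slice(0, 300)}`);
  return exitCodeFor(res.status);
}
```

In a Node 18+ environment the global `fetch` should be usable as the `get` argument, and the returned number can be passed to `process.exit`. Taking the HTTP function as a parameter keeps the URL and exit-code logic easy to check without network access.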