# OEE Dashboard Fix Plan

## Comprehensive Strategy for Resolving All Issues

---

## Executive Summary

We have identified 5 distinct issues affecting your OEE dashboard. This plan addresses each systematically, ordered by priority based on impact, risk, and dependencies.

**Estimated Total Implementation Time:** 2-3 hours

**Recommended Approach:** Sequential implementation with testing between each phase

### Key Improvements in This Updated Plan

This plan has been enhanced based on critical friction point analysis for Node-RED environments:

1. **Global Context Persistence** - Added robust initialization logic for all global variables to handle Node-RED restarts and deploys without data loss or spikes
2. **State Synchronization (Push + Pull Model)** - Enhanced START/STOP button state tracking with both push notifications AND pull requests to handle mid-production dashboard loads
3. **Angular Timing Issues** - Replaced brittle fixed timeouts with data-driven initialization and polling fallback for reliable chart loading across all system speeds
4. **Dual-Path KPI Architecture** - Implemented separate paths for live display (real-time, unthrottled) and historical graphs (averaged, smooth) to eliminate the stale-data vs jerky-graphs trade-off
5. **Time-Based Availability Logic** - Enhanced availability calculation with configurable time thresholds to distinguish brief pauses from legitimate shutdowns
6. **LLM Implementation Guide** - Added comprehensive best practices section for working with LLMs to implement this plan with precise, defensive code

### Critical Refinements (Final Review)

Based on final review, these critical refinements have been integrated:

1. **Clear Buffer on Production START** - Prevents stale data from skewing averages if Node-RED restarts mid-production and context is restored from disk
2.
   **Consolidated lastMachineCycleTime Updates** - Now updated ONLY in Machine Cycles function (not Calculate KPIs) to maintain clean "machine pulse" signal, initialized to `Date.now()` on startup to prevent immediate 0% availability
3. **Combined Initialization Strategy** - Graphs now use BOTH data-driven initialization (fast when production is running) AND 5-second safety timeout (for idle machine scenarios)
4. **Multi-Source KPI Calculation** - Calculate KPIs now explicitly handles triggers from both Machine Cycles (continuous) and Scrap Submission (event-based) with proper guards
5. **Complete Init Node** - Added production-ready initialization function with all global variables (`kpiBuffer`, `lastKPIRecordTime`, `lastMachineCycleTime`, `lastKPIValues`) properly initialized with correct default values and logging

---

## Issue Breakdown & Root Causes

### **Issue 1: KPI Updates Only on Scrap Submission**

**Symptom:** KPIs stay static during production, only update when scrap is submitted or START/STOP clicked

**Root Cause:**
- Machine Cycles function has multiple return paths with `[null, ...]` outputs
- Output to Calculate KPIs (output port 2) only happens in specific conditions
- When `trackingEnabled` is false or no active order, KPI calculation is skipped
- **Critical line:** `if (!trackingEnabled) return [null, stateMsg];` prevents KPI updates

**Sub-issue 1b: START/STOP Button State**
- Button state not persisting because UI doesn't track `trackingEnabled` global variable
- Home template needs to watch for tracking state changes

---

### **Issue 2: Graphs Empty on First Load, Sidebar Broken**

**Symptom:** Graphs tab shows blank, navigation doesn't work until refresh

**Root Causes:**

1. **Timing Issue:** Charts created before Angular/scope is fully ready
2. **Scope Isolation:** `scope.gotoTab` might not be accessible immediately
3.
   **Data Race:** Charts created before first KPI data arrives

**Why refresh works:** Second load benefits from cached scope and existing data

---

### **Issue 3: Availability & OEE Drop to 0%**

**Symptom:** Metrics incorrectly show 0% during active production

**Root Cause:**
- Calculate KPIs function has logic that sets availability to 0 when certain conditions aren't met
- **Need to verify:** When does `trackingEnabled` check fail?
- **Hypothesis:** When production is running but tracking flag isn't properly set, availability defaults to 0

---

### **Issue 4: Graph Updates Too Frequent/Jerky**

**Symptom:** Data points recorded too often, causing choppy visualization

**Root Cause:**
- Record KPI History is called on EVERY Calculate KPIs output
- With machine cycles happening every ~1 second, KPIs recorded every second
- Need time-based throttling (1-minute intervals) instead of event-based recording

---

### **Issue 5: Time Range Filters Not Working**

**Symptom:** Shift/Day/Week/Month/Year buttons don't change graph display

**Root Cause:**
- `build(metric, range)` function receives range parameter but **ignores it**
- Function always returns ALL data from `realtimeData[metric]`
- Need to filter data based on selected time range

---

## Fix Plan - Phased Approach

### **PHASE 1: Low-Risk Quick Wins** ⚡

*Estimated Time: 30 minutes*
*Risk Level: LOW*

#### 1.1 Fix Graph Filters (Issue 5)

**Files:** `projects/Plastico/flows.json` → Graphs Template

**Changes:**

```javascript
// BEFORE
function build(metric, range){
    const arr = realtimeData[metric];
    if (!arr || arr.length === 0) return [];
    return arr.map(d=>({x:d.timestamp, y:d.value}));
}

// AFTER
function build(metric, range){
    const arr = realtimeData[metric];
    if (!arr || arr.length === 0) return [];

    // Calculate time cutoff based on range
    const now = Date.now();
    const cutoffs = {
        shift: 8 * 60 * 60 * 1000,        // 8 hours
        day: 24 * 60 * 60 * 1000,         // 24 hours
        week: 7 * 24 * 60 * 60 * 1000,    // 7 days
        month: 30 * 24 * 60 * 60 * 1000,  // 30 days
        year: 365 * 24 * 60 * 60 * 1000   // 365 days
    };
    const cutoffTime = now - (cutoffs[range] || cutoffs.shift);

    // Filter data to selected time range
    return arr
        .filter(d => d.timestamp >= cutoffTime)
        .map(d => ({x: d.timestamp, y: d.value}));
}
```

**Testing:**
- Click each filter button
- Verify data range changes in charts
- Check that no errors occur

**Potential Issues:**
- If no data exists in selected range, chart might be empty (expected behavior)

**Rollback:** Easy - revert to original build() function

---

#### 1.2 Fix Empty Graphs on First Load (Issue 2)

**Files:** `projects/Plastico/flows.json` → Graphs Template

**Strategy:** Use data-driven initialization instead of fixed timeout for reliability

**Changes:**

**A) Combined Data-Driven + Safety Timeout (RECOMMENDED)**

```javascript
// BEFORE
setTimeout(()=>{
    initFilters();
    createCharts(currentRange);
},300);

// AFTER - Wait for first data message OR timeout
let chartsInitialized = false;

scope.$watch('msg', function(msg) {
    // Initialize on first KPI data arrival
    if (msg && msg.payload && msg.payload.kpis && !chartsInitialized) {
        // Scope and data are both ready
        initFilters();
        createCharts(currentRange);
        chartsInitialized = true;
        console.log('[Graphs] Charts initialized via data-driven approach');
    }

    // Update charts if already initialized
    if (chartsInitialized && msg && msg.payload && msg.payload.kpis) {
        updateCharts(msg);
    }
});

// ADDED: Safety timer for when machine is idle (no KPI messages flowing)
setTimeout(() => {
    if (!chartsInitialized) {
        console.warn('[Graphs] Charts initialized via safety timer (machine idle)');
        initFilters();
        createCharts(currentRange);
        chartsInitialized = true;
    }
}, 5000); // 5 seconds grace period for KPI messages
```

**Why Both?**
- **Data-driven**: Ensures charts initialize as soon as data is available (fast, reliable)
- **Safety timeout**: Handles "dashboard loaded but machine is idle" scenario (no KPI messages)
- Together they cover both active production and
  idle machine scenarios

**B) Fallback: Polling with timeout (if data-driven doesn't work)**

```javascript
function initWhenReady(attempts = 0) {
    const oeeEl = document.getElementById("chart-oee");
    const availEl = document.getElementById("chart-availability");

    if (oeeEl && availEl && scope.gotoTab) {
        // Both DOM and scope ready
        initFilters();
        createCharts(currentRange);
    } else if (attempts < 20) {
        // Retry every 100ms, max 2 seconds
        setTimeout(() => initWhenReady(attempts + 1), 100);
    } else {
        console.error("[Graphs] Failed to initialize charts after 2 seconds");
    }
}

// Start polling on load
initWhenReady();
```

**C) Ensure scope.gotoTab is properly bound**

```javascript
// BEFORE
(function(scope){
    scope.gotoTab = t => scope.send({ui_control:{tab:t}});
})(scope);

// AFTER
(function(s){
    if (!s.gotoTab) {
        s.gotoTab = function(t) {
            s.send({ui_control: {tab: t}});
        };
    }
})(scope);
```

**D) Add defensive chart creation with retry**

```javascript
function createCharts(range, attempts = 0){
    // Ensure DOM elements exist
    const oeeEl = document.getElementById("chart-oee");
    const availEl = document.getElementById("chart-availability");

    if (!oeeEl || !availEl) {
        // Bounded retry (10 x 200ms) so a missing element cannot loop forever
        if (attempts < 10) {
            console.warn("[Graphs] Chart elements not ready, retrying...");
            setTimeout(() => createCharts(range, attempts + 1), 200);
        } else {
            console.error("[Graphs] Chart elements still missing, giving up");
        }
        return;
    }

    // ... rest of existing chart creation logic
}
```

**Testing:**
- Clear browser cache
- Navigate to Graphs tab from fresh load
- Test sidebar navigation
- Verify charts appear without refresh
- Test on slow network/system

**Potential Issues:**
- The data-driven path on its own requires KPI messages to be flowing
- If no production is running, charts initialize only through the 5-second safety timer in Option A, so keep that timer in place

**Recommended Implementation:**

1. Start with data-driven approach (Option A)
2. Add polling fallback (Option B) as safety net
3.
   Implement defensive checks (Options C & D)

**Rollback:** Easy - revert to original setTimeout logic

---

### **PHASE 2: Medium-Risk Data Flow Improvements** 🔧

*Estimated Time: 45 minutes*
*Risk Level: MEDIUM*

#### 2.1 Implement KPI Update Throttling with Dual-Path Architecture (Issue 4)

**Files:**
- `projects/Plastico/flows.json` → Calculate KPIs function (add second output)
- `projects/Plastico/flows.json` → Record KPI History function (add averaging)

**Strategy:** Dual-path updates solve the stale display vs jerky graphs trade-off
- **Path 1:** Unthrottled live KPIs to Home Template for real-time display
- **Path 2:** Throttled/averaged KPIs to Record History for smooth graphs

**Part A: Modify Calculate KPIs to Output on Two Paths**

```javascript
// At the end of Calculate KPIs function

// Prepare the KPI message
const kpiMsg = {
    topic: "kpis",
    payload: {
        timestamp: Date.now(),
        kpis: {
            oee: msg.kpis.oee,
            availability: msg.kpis.availability,
            performance: msg.kpis.performance,
            quality: msg.kpis.quality
        }
    }
};

// Return to TWO outputs:
// Output 1: Live KPI to Home Template (real-time, unthrottled)
// Output 2: KPI to Record History (will be averaged/throttled)
return [
    kpiMsg,                        // Path 1: Live display
    RED.util.cloneMessage(kpiMsg)  // Path 2: History recording (deep clone; a spread copy would still share payload)
];
```

**Wiring Changes:**
- Calculate KPIs node needs **2 outputs** (add one more)
- Output 1 → Home Template (existing connection)
- Output 2 → Record KPI History (new connection)

**Part B: Add Averaging Logic to Record KPI History**

```javascript
// Complete Record KPI History function with robust initialization

// ========== INITIALIZATION ==========
// Initialize buffer
let buffer = global.get("kpiBuffer");
if (!buffer || !Array.isArray(buffer)) {
    buffer = [];
    global.set("kpiBuffer", buffer);
    node.warn('[KPI History] Initialized kpiBuffer');
}

// Initialize last record time
let lastRecordTime = global.get("lastKPIRecordTime");
if (!lastRecordTime || typeof lastRecordTime !== 'number') {
    // Set to 1 minute ago to ensure immediate recording on startup
    lastRecordTime = Date.now() - 60000;
    global.set("lastKPIRecordTime", lastRecordTime);
    node.warn('[KPI History] Initialized lastKPIRecordTime');
}

// ========== ACCUMULATE ==========
// Guard payload as well, so a trigger without payload does not throw
const kpis = msg.payload && msg.payload.kpis;
if (!kpis) {
    node.warn('[KPI History] No KPIs in message, skipping');
    return null;
}

buffer.push({
    timestamp: Date.now(),
    oee: kpis.oee || 0,
    availability: kpis.availability || 0,
    performance: kpis.performance || 0,
    quality: kpis.quality || 0
});

// Prevent buffer from growing too large (safety limit)
if (buffer.length > 100) {
    buffer = buffer.slice(-60); // Keep last 60 entries
    node.warn('[KPI History] Buffer exceeded 100 entries, trimmed to 60');
}

global.set("kpiBuffer", buffer);

// ========== CHECK IF TIME TO RECORD ==========
const now = Date.now();
const timeSinceLastRecord = now - lastRecordTime;
const ONE_MINUTE = 60 * 1000;

if (timeSinceLastRecord < ONE_MINUTE) {
    // Not time to record yet
    const secondsRemaining = Math.ceil((ONE_MINUTE - timeSinceLastRecord) / 1000);
    // Debug log (can remove in production)
    // node.warn(`[KPI History] Buffer: ${buffer.length} entries, recording in ${secondsRemaining}s`);
    return null; // Don't send to charts yet
}

// ========== CALCULATE AVERAGES ==========
if (buffer.length === 0) {
    node.warn('[KPI History] Buffer empty at recording time, skipping');
    return null;
}

const avg = {
    oee: buffer.reduce((sum, d) => sum + d.oee, 0) / buffer.length,
    availability: buffer.reduce((sum, d) => sum + d.availability, 0) / buffer.length,
    performance: buffer.reduce((sum, d) => sum + d.performance, 0) / buffer.length,
    quality: buffer.reduce((sum, d) => sum + d.quality, 0) / buffer.length
};

node.warn(`[KPI History] Recording averaged KPIs from ${buffer.length} samples: OEE=${avg.oee.toFixed(1)}%`);

// ========== RECORD TO HISTORY ==========
// Update global state
global.set("lastKPIRecordTime", now);
global.set("kpiBuffer", []); // Clear buffer (the local `buffer` still holds the samples)

// Send averaged values to graphs and database
return {
    topic: "kpi-history",
    payload: {
        timestamp: now,
        kpis: {
            oee: Math.round(avg.oee * 10) / 10, // Round to 1 decimal
            availability: Math.round(avg.availability * 10) / 10,
            performance: Math.round(avg.performance * 10) / 10,
            quality: Math.round(avg.quality * 10) / 10
        },
        sampleCount: buffer.length // Metadata for debugging
    }
};
```

**Recommendation:** This dual-path approach provides the best of both worlds

**Testing:**
- Start production
- Observe KPI update frequency in graphs
- Verify updates occur approximately every 60 seconds
- Check that no spikes/gaps appear in data

**Potential Issues:**
- First data point might take up to 1 minute to appear
- Rapid production changes might not be immediately visible
- Buffer could grow large if production runs without recording

**Mitigation:**
- Set buffer max size (e.g., 100 entries)
- Force record on production stop/start

**Rollback:** Medium difficulty - remove throttling logic, clear global variables

---

### **PHASE 3: High-Risk Core Logic Fixes** ⚠️

*Estimated Time: 60 minutes*
*Risk Level: HIGH*

**⚠️ CRITICAL: Backup flows.json before proceeding**

#### 3.1 Fix KPI Continuous Updates (Issue 1)

**Files:** `projects/Plastico/flows.json` → Machine Cycles function

**Problem:** Machine Cycles has multiple early returns that skip KPI calculation

**Current Logic:**

```javascript
// Line ~36: No active order
if (!activeOrder || !activeOrder.id || cavities <= 0) {
    return [null, stateMsg]; // ❌ Skips KPI calculation
}

// Line ~43: Tracking not enabled
if (!trackingEnabled) {
    return [null, stateMsg]; // ❌ Skips KPI calculation
}
```

**Solution Options:**

**Option A: Always Calculate KPIs (Recommended)**

```javascript
// Always prepare a message for Calculate KPIs on output 3
const kpiTrigger = { _triggerKPI: true };

// Change all returns to include kpiTrigger
if (!activeOrder || !activeOrder.id || cavities <= 0) {
    return [null, stateMsg, kpiTrigger]; // ✓ Triggers KPI calculation
}

if (!trackingEnabled) {
    return [null, stateMsg, kpiTrigger]; // ✓ Triggers KPI calculation
}

// Update last machine cycle time when a successful cycle occurs
// This is used for time-based availability logic
if (trackingEnabled && dbMsg) {
    // dbMsg being non-null implies a cycle was recorded
    global.set("lastMachineCycleTime", Date.now());
}

// ... final return
return [dbMsg, stateMsg, kpiTrigger];
```

**Critical:** The `lastMachineCycleTime` update must happen ONLY in Machine Cycles function to maintain a clean "machine pulse" signal separate from KPI calculation triggers.

**Wire Configuration Change:**
- Add third output wire to Machine Cycles node
- Connect output 3 → Calculate KPIs

**Option B: Calculate KPIs in Parallel (Alternative)**
- Add an inject node that triggers Calculate KPIs every 5 seconds
- Less coupled, but might calculate with stale data

**Recommendation:** Option A - ensures KPIs calculated with real-time data

**Testing:**

1. Start production with START button
2. Observe KPI values on Home page
3. Verify continuous updates (every ~1 second before throttling)
4. Check that scrap submission still works
5. Test production stop/start

**Potential Issues:**
- Calculate KPIs might need to handle cases with no active order
- Could calculate KPIs unnecessarily when machine is idle
- Performance impact if calculating too frequently

**Mitigation:**
- Add guards in Calculate KPIs to handle null/undefined inputs
- Implement Phase 2 throttling first to reduce calculation frequency
- Monitor system performance

**CRITICAL: Calculate KPIs Multi-Source Handling**

The Calculate KPIs function will now receive triggers from TWO sources:

1. **Machine Cycles** (continuous, real-time) - via new output 3
2.
   **Scrap Submission** (event-based) - existing connection

**Required Change in Calculate KPIs:**

```javascript
// At the start of Calculate KPIs function
// Must handle both trigger types

// The function should execute regardless of message content
// as long as it receives ANY trigger
const trackingEnabled = global.get("trackingEnabled");
const activeOrder = global.get("activeOrder") || {};
const productionStartTime = global.get("productionStartTime");

// Guard against missing critical data
if (!trackingEnabled || !activeOrder.id) {
    // Can't calculate meaningful KPIs without tracking or active order
    // But don't error - just skip calculation
    return null;
}

// ... rest of existing KPI calculation logic
// This logic will now run for BOTH continuous and event-based triggers
```

This ensures availability and OEE calculations work correctly whether triggered by machine cycles or scrap submission.

**Side Effects:**
- Will trigger Issue 4 more severely → MUST implement Phase 2 throttling first
- Database might receive more frequent updates
- Global variables will change more often

**Rollback:** Medium difficulty - requires restoring original return statements and wire configuration

---

#### 3.2 Fix Availability/OEE Drops to 0 (Issue 3)

**Files:** `projects/Plastico/flows.json` → Calculate KPIs function

**Investigation Steps:**

1. Read full Calculate KPIs function
2. Identify all paths that set `msg.kpis.availability = 0`
3. Add logging to track when this occurs
4.
   Understand state flow: trackingEnabled, productionStartTime, operatingTime

**Hypothesis Testing:**

```javascript
// Add debug logging at the start
node.warn(`[KPI] trackingEnabled=${trackingEnabled}, startTime=${productionStartTime}, opTime=${operatingTime}`);

// Before setting availability to 0
if (/* condition that causes 0 */) {
    node.warn(`[KPI] Setting availability to 0 because: [reason]`);
    msg.kpis.availability = 0;
}
```

**Likely Fix:**

```javascript
// BEFORE
} else {
    msg.kpis.availability = 0; // Not running
}

// AFTER
} else {
    // Check if production was recently active
    const prev = global.get("lastKPIValues") || {};
    if (prev.availability > 0 && operatingTime > 0) {
        // Maintain last availability if we have operating time
        msg.kpis.availability = prev.availability;
    } else {
        msg.kpis.availability = 0;
    }
}

// Store KPIs for next iteration
global.set("lastKPIValues", msg.kpis);
```

**Testing:**

1. Start production
2. Monitor availability values
3. Trigger scrap prompt
4. Verify availability doesn't drop to 0
5.
   Check OEE calculation

**Potential Issues:**
- Might mask legitimate 0% availability (machine actually stopped)
- Could create artificially high availability readings
- State persistence might cause issues after restart

**Mitigation:**
- Add clear conditions for when availability should legitimately be 0
- Reset lastKPIValues on work order completion
- Add production state tracking

**Rollback:** Easy if logging added first - can revert based on log analysis

---

#### 3.3 Fix START/STOP Button State (Issue 1b)

**Files:** `projects/Plastico/flows.json` → Home Template

**Problem:** Button doesn't show correct state (STOP when production running)

**Investigation:**
- Find button rendering logic in Home template
- Check how `trackingEnabled` or `productionStarted` is tracked
- Verify message handler receives state updates

**Changes:**

```javascript
// In Home Template scope.$watch
if (msg.topic === 'machineStatus') {
    window.machineOnline = msg.payload.machineOnline;
    window.productionStarted = msg.payload.productionStarted;

    // NEW: Track tracking state for button display
    window.trackingEnabled = msg.payload.trackingEnabled || window.productionStarted;

    scope.renderDashboard();
    return;
}
```

**Button HTML Update:**

```html
```

**Backend Update (Work Order buttons):**

```javascript
// When START clicked, also set trackingEnabled flag
if (action === "start-tracking") {
    global.set("trackingEnabled", true);

    // CRITICAL: Clear KPI buffer on production start
    // Prevents stale data from skewing averages if Node-RED was restarted mid-production
    global.set("kpiBuffer", []);
    node.warn('[START] Cleared kpiBuffer for fresh production run');

    // Optional: Reset last record time to ensure immediate data point
    global.set("lastKPIRecordTime", Date.now() - 60000);

    // Send state update to UI
    const stateMsg = {
        topic: "machineStatus",
        payload: {
            machineOnline: true,
            productionStarted: true,
            trackingEnabled: true
        }
    };

    // ... send stateMsg to Home template
}
```

**Why Clear Buffer on START:**
If Node-RED restarts during a production run and context is restored from disk, the `kpiBuffer` might contain stale data from before the restart. When production resumes, new data would be mixed with old data, skewing the averages. Clearing on START ensures a clean slate for each production session.

**Testing:**

1. Load dashboard
2. Start work order
3. Verify START button changes to STOP
4. Click STOP (if implemented)
5. Verify button changes back to START

**Potential Issues:**
- Need to implement STOP button handler if it doesn't exist
- State sync between backend and frontend
- Button might flicker during state transitions

**Rollback:** Easy - remove button visibility conditions

---

## Implementation Order & Dependencies

### Recommended Sequence:

1. **Phase 1.1** - Fix Filters (Independent, low risk)
2. **Phase 1.2** - Fix Empty Graphs (Independent, low risk)
3. **Phase 2.1** - Add Throttling (Required before Phase 3.1)
4. **Phase 3.2** - Fix Availability Calculation (Add logging first)
5. **Phase 3.1** - Fix Continuous KPI Updates (Depends on throttling)
6. **Phase 3.3** - Fix Button State (Can be done anytime)

### Why This Order?

1. **Quick wins first** - Build confidence, improve UX immediately
2. **Throttling before continuous updates** - Prevent performance issues
3. **Logging before logic changes** - Understand problem before fixing
4. **Independent fixes can run parallel** - Save time

---

## Testing Strategy

### Per-Phase Testing:
- Test each phase independently
- Don't proceed to next phase if current fails
- Keep backup of working state

### Integration Testing (After All Phases):

1. **Fresh Start Test**
   - Clear browser cache
   - Restart Node-RED
   - Load dashboard
   - Navigate all tabs
2. **Production Cycle Test**
   - Start new work order
   - Click START
   - Let run for 2-3 minutes
   - Submit scrap
   - Verify KPIs update
   - Check graphs show data
   - Test time filters
3.
   **State Persistence Test**
   - Refresh page during production
   - Verify state restores correctly
   - Check button shows STOP if running
4. **Edge Cases**
   - No active work order
   - Machine offline
   - Zero production time
   - Rapid start/stop

---

## Rollback Plan

### Per-Phase Rollback:

Each phase documents its rollback procedure. In general:

1. **Stop Node-RED**
2. **Restore flows.json from backup**
   ```bash
   cp projects/Plastico/flows.json.backup projects/Plastico/flows.json
   ```
3. **Clear global context** (if needed)
   ```javascript
   // In a function node (debug nodes cannot run code); trigger it once with an inject node
   global.set("lastKPIRecordTime", null);
   global.set("kpiBuffer", null);
   global.set("lastKPIValues", null);
   ```
4. **Restart Node-RED**
5. **Clear browser cache**

### Emergency Full Rollback:

```bash
# Restore from most recent backup
cp projects/Plastico/Respaldo_MVP_Complete_11_23_25.json projects/Plastico/flows.json

# Restart Node-RED
node-red-restart
```

---

## Potential Roadblocks & Mitigations

### Roadblock 1: Global Context Persistence on Deploy/Restart ⚠️ CRITICAL

**Symptom:** After Node-RED restart or deploy, throttling/averaging/availability logic breaks or shows incorrect data

**Root Cause:** Global variables (`lastKPIRecordTime`, `kpiBuffer`, `lastKPIValues`, `trackingEnabled`) may be reset or restored from file/memory store depending on settings.js configuration

**Mitigation:**

1. **Add Robust Initialization Logic:**
   ```javascript
   // In Record KPI History function - ALWAYS check and initialize
   let buffer = global.get("kpiBuffer");
   if (!buffer || !Array.isArray(buffer)) {
       buffer = [];
       global.set("kpiBuffer", buffer);
   }

   let lastRecordTime = global.get("lastKPIRecordTime");
   if (!lastRecordTime || typeof lastRecordTime !== 'number') {
       // Set to 1 minute ago to ensure immediate recording on startup
       lastRecordTime = Date.now() - 60000;
       global.set("lastKPIRecordTime", lastRecordTime);
   }
   ```
2.
   **Create an Init Node:**
   - Add a dedicated "Initialize Global Variables" function node
   - Trigger on deploy using an inject node (inject once, delay 0)
   - Wire to all critical nodes to ensure state is set before first execution

   **Complete Init Node Code:**
   ```javascript
   // Initialize Global Variables - Run on Deploy
   node.warn('[INIT] Initializing global variables');

   // KPI Buffer for averaging
   if (!global.get("kpiBuffer")) {
       global.set("kpiBuffer", []);
       node.warn('[INIT] Set kpiBuffer to []');
   }

   // Last KPI record time - set to 1 min ago for immediate first record
   if (!global.get("lastKPIRecordTime")) {
       global.set("lastKPIRecordTime", Date.now() - 60000);
       node.warn('[INIT] Set lastKPIRecordTime');
   }

   // Last machine cycle time - set to now to prevent immediate 0% availability
   if (!global.get("lastMachineCycleTime")) {
       global.set("lastMachineCycleTime", Date.now());
       node.warn('[INIT] Set lastMachineCycleTime to prevent 0% availability on startup');
   }

   // Last KPI values
   if (!global.get("lastKPIValues")) {
       global.set("lastKPIValues", {});
       node.warn('[INIT] Set lastKPIValues to {}');
   }

   node.warn('[INIT] Global variable initialization complete');
   return msg;
   ```
3. **Check settings.js:**
   - Verify the `contextStorage` configuration
   - Consider a file-backed store (the `localfilesystem` module) for persistence if you are on the default in-memory store

**Testing:**
- Deploy changes multiple times
- Restart Node-RED
- Verify variables persist/initialize correctly
- Check debug logs for initialization messages

---

### Roadblock 2: State Sync Between Flow and Dashboard (Push vs Pull Model)

**Symptom:** START/STOP button shows wrong state when user loads dashboard mid-production

**Root Cause:** Relying on push model (messages sent during state changes) - if user loads page after tracking started, initial message is missed

**Mitigation:**

1.
   **Add Pull Mechanism in Home Template:**
   ```javascript
   // In Home Template initialization
   (function(scope) {
       // Request current state on load
       scope.send({ topic: "requestState", payload: {} });

       // Handle state response
       scope.$watch('msg', function(msg) {
           if (msg && msg.topic === 'currentState') {
               window.trackingEnabled = msg.payload.trackingEnabled;
               window.productionStarted = msg.payload.productionStarted;
               window.machineOnline = msg.payload.machineOnline;
               scope.renderDashboard();
           }
           // ... rest of watch logic
       });
   })(scope);
   ```
2. **Add State Response Handler:**
   - Create function node that listens for `requestState` topic
   - Responds with current global state values
   - Wire to Home template

**Testing:**
- Start production
- Open dashboard in new browser tab
- Verify button shows STOP immediately
- Test with multiple browser sessions

---

### Roadblock 3: UI/Angular Timing Races in ui-template ⚠️ HIGH IMPACT

**Symptom:** Charts sometimes load, sometimes don't - a fixed timeout (300 ms in the current template) is unreliable on slow systems or complex templates

**Root Cause:** Node-RED Dashboard uses AngularJS - digest cycle and DOM rendering timing is unpredictable

**Mitigation Option A - Data-Driven Initialization (RECOMMENDED):**

```javascript
// Instead of fixed timeout, wait for first data
// (KPIs arrive under msg.payload.kpis, matching the Phase 1.2 and Phase 2.1 message shape)
let chartsInitialized = false;

scope.$watch('msg', function(msg) {
    const hasKpis = msg && msg.payload && msg.payload.kpis;

    if (hasKpis && !chartsInitialized) {
        // First data arrived, scope is ready
        initFilters();
        createCharts(currentRange);
        chartsInitialized = true;
    }

    if (chartsInitialized && hasKpis) {
        updateCharts(msg);
    }
});
```

**Mitigation Option B - Angular Lifecycle Hook:**

```javascript
// Defer initialization to Angular's next digest
scope.$applyAsync(function() {
    // Scope is ready here; the DOM is usually rendered by now, but this is not guaranteed
    initFilters();
    createCharts(currentRange);
});
```

**Mitigation Option C - Polling with Timeout:**

```javascript
function initWhenReady(attempts = 0) {
    const oeeEl = document.getElementById("chart-oee");

    if (oeeEl && scope.gotoTab) {
        // Both DOM and scope ready
        initFilters();
        createCharts(currentRange);
    } else if (attempts < 20) {
        // Retry every 100ms, max 2 seconds
        setTimeout(() => initWhenReady(attempts + 1), 100);
    } else {
        console.error("Failed to initialize charts after 2 seconds");
    }
}

// Start polling
initWhenReady();
```

**Recommendation:** Use Option A for most reliable results

---

### Roadblock 4: Throttling vs Live Display Trade-off

**Symptom:** With averaging, displayed KPIs are stale (up to 59 seconds old), but without averaging, graphs are jerky

**Root Cause:** OEE is a real-time snapshot - averaging smooths graphs but delays live feedback

**Solution: Dual-Path KPI Updates**

**Architecture:**
- **Path 1 (Live):** Machine Cycles → Calculate KPIs → Home Template (no throttling)
- **Path 2 (History):** Machine Cycles → Calculate KPIs → Averaging Buffer → Record History (throttled to 1 min)

**Implementation:**

```javascript
// In Calculate KPIs function - send to TWO outputs
// (kpiMsg is the { topic, payload: { timestamp, kpis } } message built in Phase 2.1 Part A)
return [
    kpiMsg,                        // Output 1: Live KPI to Home Template (unthrottled)
    RED.util.cloneMessage(kpiMsg)  // Output 2: KPI to History (will be throttled)
];
```

**In Record KPI History - add averaging logic:**

```javascript
// Only this node has averaging/throttling
const kpis = msg.payload.kpis;
let buffer = global.get("kpiBuffer") || [];

buffer.push({
    timestamp: Date.now(),
    oee: kpis.oee,
    availability: kpis.availability,
    performance: kpis.performance,
    quality: kpis.quality
});

const lastRecord = global.get("lastKPIRecordTime") || 0;
const now = Date.now();

if (now - lastRecord >= 60000) {
    // Average the buffer
    const avg = {
        oee: buffer.reduce((sum, d) => sum + d.oee, 0) / buffer.length,
        // ... other metrics
    };

    // Record averaged values to history
    // Send to Graphs template
    global.set("lastKPIRecordTime", now);
    global.set("kpiBuffer", []);

    // Same shape the Graphs template watches for (msg.payload.kpis)
    return { payload: { timestamp: now, kpis: avg } };
} else {
    global.set("kpiBuffer", buffer);
    return null; // Don't record yet
}
```

**Benefits:**
- Live display always shows current OEE
- Graphs are smooth with averaged data
- No UX compromise

---

### Roadblock 5: Availability 0% Logic Too Simplistic

**Symptom:** Availability drops to 0% during brief pauses (scrap submission) but also might NOT drop to 0% during legitimate stops (breaks, maintenance)

**Root Cause:** Using previous value without time-based threshold can't distinguish brief interruption from actual shutdown

**Improved Logic:**

```javascript
// In Calculate KPIs function
const now = Date.now();
const lastCycleTime = global.get("lastMachineCycleTime") || now;
const timeSinceLastCycle = now - lastCycleTime;

const BRIEF_PAUSE_THRESHOLD = 5 * 60 * 1000; // 5 minutes

if (!trackingEnabled || timeSinceLastCycle > BRIEF_PAUSE_THRESHOLD) {
    // Legitimately stopped or long pause
    msg.kpis.availability = 0;
    global.set("lastKPIValues", null); // Clear history
} else if (operatingTime > 0) {
    // Calculate normally
    msg.kpis.availability = calculateAvailability(operatingTime, plannedTime);
    global.set("lastKPIValues", msg.kpis);
} else {
    // Brief pause - maintain last known value
    const prev = global.get("lastKPIValues") || {};
    msg.kpis.availability = prev.availability || 0;
}

// NOTE: lastMachineCycleTime is updated in Machine Cycles function ONLY
// This keeps the "machine pulse" signal clean and separate from KPI calculation
```

**Configuration:**
- Adjust `BRIEF_PAUSE_THRESHOLD` based on your production environment
- Consider making it configurable via dashboard setting

---

### Roadblock 6: KPI Calculation Performance

**Symptom:** System slow after implementing continuous KPI updates

**Mitigation:**
- Implement Phase 2 throttling FIRST (now with dual-path approach)
- Ensure Calculate KPIs has guards
for null/undefined inputs
- Profile Calculate KPIs function for optimization
- Monitor Node-RED CPU usage during production

---

### Roadblock 7: Browser Cache Issues

**Symptom:** Changes don't appear after deployment

**Mitigation:**
- Clear browser cache during testing (Ctrl+Shift+R / Cmd+Shift+R)
- Add cache-busting version to template (optional):
  ```javascript
  // In template header
  ```
- Use incognito/private browsing for testing
- Test on different browsers/devices

---

## Success Criteria

### Phase 1:
- ✅ Time filters change graph display correctly
- ✅ Graphs load on first visit without refresh
- ✅ Sidebar navigation works immediately

### Phase 2:
- ✅ Graph updates occur at ~1 minute intervals
- ✅ Graphs are smooth, not jerky
- ✅ No performance degradation

### Phase 3:
- ✅ KPIs update continuously during production
- ✅ Availability never incorrectly shows 0%
- ✅ START button shows STOP when production running
- ✅ OEE calculation is accurate

### Integration:
- ✅ All features work together without conflicts
- ✅ No console errors
- ✅ Production tracking works end-to-end
- ✅ Data persists correctly

---

## Estimated Timeline

| Phase | Task | Time | Cumulative |
|-------|------|------|------------|
| 1.1 | Fix Filters | 15 min | 15 min |
| 1.2 | Fix Empty Graphs | 15 min | 30 min |
| 2.1 | Add Throttling | 45 min | 1h 15m |
| 3.2 | Fix Availability (with logging) | 30 min | 1h 45m |
| 3.1 | Fix Continuous Updates | 30 min | 2h 15m |
| 3.3 | Fix Button State | 20 min | 2h 35m |
| Testing | Integration Testing | 30 min | 3h 5m |

**Total: ~3 hours** (assuming no major roadblocks)

---

## Best Practices for LLM-Assisted Implementation

When working with an LLM to implement this plan, use these strategies for best results:

### 1. Isolate Logic Focus (Function Node Precision)

**DO:**
- Ask for specific function node code: "Write the Record KPI History function with averaging logic including global.get initialization"
- Provide exact input/output requirements: "This function receives msg.kpis object and must return msg or null"
- Request one change at a time

**DON'T:**
- Ask vague questions like "fix my dashboard"
- Request multiple phase changes in one prompt
- Assume LLM knows your flow structure

### 2. Explicitly Define Global Variables

**Template for LLM prompts:**
```
Global variable: kpiBuffer
Type: Array of objects
Structure: [{timestamp: number, oee: number, availability: number, performance: number, quality: number}]
Lifecycle: Initialized to [] if null, cleared after recording to history
Purpose: Accumulates KPI values for 1-minute averaging
```

**Always specify:**
- Variable name
- Data type
- Default/initial value
- When it's read/written
- When it should be cleared

### 3. Specify Node-RED Input/Output Requirements

**Example prompt:**
```
The Machine Cycles function node must have 3 outputs:
- Output 1: DB write message (only when tracking enabled)
- Output 2: State update message (always sent)
- Output 3: KPI trigger message (always sent for continuous updates)

The return statement should be: return [dbMsg, stateMsg, kpiTrigger];
```

### 4. Request Defensive Code

**Always ask for:**
- Null/undefined checks before accessing properties
- Type validation for global variables
- Initialization logic at the start of functions
- Error handling for edge cases

**Example:**
```javascript
// BAD (LLM might generate)
const buffer = global.get("kpiBuffer");
buffer.push(newValue);

// GOOD (what you should request)
let buffer = global.get("kpiBuffer");
if (!buffer || !Array.isArray(buffer)) {
    buffer = [];
}
buffer.push(newValue);
global.set("kpiBuffer", buffer);
```

### 5. Break Down Complex Changes

**For Phase 3.1 (Continuous KPI Updates), ask in sequence:**
1. "Show me the current return statements in Machine Cycles function"
2. "Modify the function to add a third output for KPI trigger"
3. "Update all return statements to include kpiTrigger message"
4. "Show me how to wire the third output to Calculate KPIs node"

### 6. Request Testing/Debugging Code

**Ask LLM to include:**
- Debug logging: `node.warn('[KPI] Buffer size: ' + buffer.length);`
- State validation: Check that variables have expected values
- Error messages: Descriptive messages for troubleshooting

### 7. Validate Against Node-RED Constraints

**Remind LLM of Node-RED specifics:**
- "This is a Node-RED function node, not regular JavaScript"
- "Global context uses global.get/set, not regular variables"
- "The msg object must be returned to send to next node"
- "Use node.warn() for logging, not console.log()"

### 8. Phase-by-Phase Verification

**After each LLM response:**
1. Verify the code matches the plan
2. Check for initialization logic
3. Confirm output structure matches wiring
4. Ask: "What edge cases does this handle?"

### 9. Example: Perfect LLM Prompt for Phase 2.1

```
I need to implement KPI throttling with averaging in Node-RED.

Context:
- Function node: "Record KPI History"
- Input: msg.kpis object with {oee, availability, performance, quality}
- Output: Averaged KPI values sent to Graphs template (or null if not ready to record)

Global variables needed:
1. kpiBuffer (Array): Accumulates KPI snapshots. Initialize to [] if null.
2. lastKPIRecordTime (Number): Last timestamp when history was recorded. Initialize to (Date.now() - 60000) if null for immediate first recording.
Requirements:
- Accumulate incoming KPIs in kpiBuffer
- Every 60 seconds (60000ms), calculate average of all buffer values
- Send averaged KPIs to output
- Clear buffer after sending
- If less than 60 seconds since last record, return null (don't send)

Please write the complete function with:
- Robust initialization (check and set defaults)
- Debug logging (buffer size, time until next record)
- Comments explaining each section
- Edge case handling (empty buffer, first run)
```

### 10. Common Pitfalls to Avoid

1. **Assuming LLM knows your flow structure** - Always describe node connections
2. **Not specifying Node-RED context** - LLM might give generic JavaScript instead
3. **Requesting too many changes at once** - Break into single-phase requests
4. **Forgetting to mention global variable persistence** - Specify initialization needs
5. **Not asking for defensive code** - Request null checks and type validation
6. **Vague success criteria** - Define exactly what "working" means

---

## Quick Reference: Key Code Snippets

### 1. Init Node (Run on Deploy)
```javascript
// Initialize Global Variables - Inject Once on Deploy
node.warn('[INIT] Initializing global variables');

if (!global.get("kpiBuffer")) global.set("kpiBuffer", []);
if (!global.get("lastKPIRecordTime")) global.set("lastKPIRecordTime", Date.now() - 60000);
if (!global.get("lastMachineCycleTime")) global.set("lastMachineCycleTime", Date.now());
if (!global.get("lastKPIValues")) global.set("lastKPIValues", {});

node.warn('[INIT] Complete');
return msg;
```

### 2. Machine Cycles - Add to Final Return
```javascript
// Update last machine cycle time when a successful cycle occurs
if (trackingEnabled && dbMsg) {
    global.set("lastMachineCycleTime", Date.now());
}

return [dbMsg, stateMsg, kpiTrigger];
```

### 3. Calculate KPIs - Multi-Source Guard
```javascript
const trackingEnabled = global.get("trackingEnabled");
const activeOrder = global.get("activeOrder") || {};

if (!trackingEnabled || !activeOrder.id) return null;

// ... rest of calculation
```

### 4. Work Order START Button - Clear Buffer
```javascript
if (action === "start-tracking") {
    global.set("trackingEnabled", true);
    global.set("kpiBuffer", []); // Clear stale data
    global.set("lastKPIRecordTime", Date.now() - 60000);
    // ... send state update
}
```

### 5. Graphs Template - Combined Init
```javascript
let chartsInitialized = false;

scope.$watch('msg', function(msg) {
    if (msg && msg.payload && msg.payload.kpis && !chartsInitialized) {
        initFilters();
        createCharts(currentRange);
        chartsInitialized = true;
    }
    if (chartsInitialized && msg && msg.payload && msg.payload.kpis) {
        updateCharts(msg);
    }
});

setTimeout(() => {
    if (!chartsInitialized) {
        initFilters();
        createCharts(currentRange);
        chartsInitialized = true;
    }
}, 5000);
```

---

## Final Notes

1. **Backup First:** Always backup `flows.json` before starting each phase
2. **Test Incrementally:** Don't skip testing between phases
3. **Document Changes:** Note any deviations from plan
4. **Monitor Logs:** Watch Node-RED debug output during testing
5. **Clear Cache:** Browser cache can mask issues
6. **Use LLM Strategically:** Follow the best practices above for precise, working code

**If you encounter issues not covered in this plan, STOP and ask for help before proceeding.**
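
The throttle-and-average behavior requested in the Phase 2.1 prompt, together with the defaults from the Init Node snippet, can be tried outside Node-RED as plain JavaScript. The following is a minimal sketch under stated assumptions, not the plan's actual function node: `averageKpis`, `recordKpi`, `RECORD_INTERVAL_MS`, `KPI_FIELDS`, and the `state` object (standing in for global context) are hypothetical names introduced here for illustration.

```javascript
// Sketch only: plain-JavaScript stand-in for the "Record KPI History" logic.
// `state` is a hypothetical object replacing Node-RED's global context.
const RECORD_INTERVAL_MS = 60000; // 60-second window from the Phase 2.1 prompt
const KPI_FIELDS = ["oee", "availability", "performance", "quality"];

// Average each KPI field across the buffered snapshots.
// Missing or non-numeric values are counted as 0.
function averageKpis(buffer) {
    const avg = {};
    for (const field of KPI_FIELDS) {
        const sum = buffer.reduce(
            (total, kpis) => total + (Number(kpis[field]) || 0),
            0
        );
        avg[field] = buffer.length > 0 ? sum / buffer.length : 0;
    }
    return avg;
}

// Buffer one KPI snapshot. Returns { kpis: avg } once the interval has
// elapsed since the last record, otherwise null (nothing to record yet).
function recordKpi(state, kpis, now) {
    // Defensive initialization, mirroring the Init Node defaults.
    if (!Array.isArray(state.kpiBuffer)) state.kpiBuffer = [];
    if (typeof state.lastKPIRecordTime !== "number") {
        state.lastKPIRecordTime = now - RECORD_INTERVAL_MS;
    }

    // Guard against null/undefined input before buffering.
    if (kpis && typeof kpis === "object") state.kpiBuffer.push(kpis);

    const due = now - state.lastKPIRecordTime >= RECORD_INTERVAL_MS;
    if (!due || state.kpiBuffer.length === 0) return null; // Don't record yet

    const avg = averageKpis(state.kpiBuffer);
    state.kpiBuffer = []; // Clear buffer after recording
    state.lastKPIRecordTime = now;
    return { kpis: avg };
}
```

In a function node, `state.kpiBuffer` and `state.lastKPIRecordTime` would instead be read with `global.get` and written back with `global.set`, and `now` would come from `Date.now()`. Passing `now` as a parameter keeps the logic testable without waiting 60 seconds.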