# OEE Dashboard Fix Plan
## Comprehensive Strategy for Resolving All Issues

---

## Executive Summary

We have identified 5 distinct issues affecting your OEE dashboard. This plan addresses each systematically, ordered by priority based on impact, risk, and dependencies.

**Estimated Total Implementation Time:** 2-3 hours
**Recommended Approach:** Sequential implementation with testing between each phase

### Key Improvements in This Updated Plan

This plan has been enhanced based on critical friction point analysis for Node-RED environments:

1. **Global Context Persistence** - Added robust initialization logic for all global variables to handle Node-RED restarts and deploys without data loss or spikes

2. **State Synchronization (Push + Pull Model)** - Enhanced START/STOP button state tracking with both push notifications AND pull requests to handle mid-production dashboard loads

3. **Angular Timing Issues** - Replaced brittle fixed timeouts with data-driven initialization and polling fallback for reliable chart loading across all system speeds

4. **Dual-Path KPI Architecture** - Implemented separate paths for live display (real-time, unthrottled) and historical graphs (averaged, smooth) to eliminate the stale-data vs jerky-graphs trade-off

5. **Time-Based Availability Logic** - Enhanced availability calculation with configurable time thresholds to distinguish brief pauses from legitimate shutdowns

6. **LLM Implementation Guide** - Added comprehensive best practices section for working with LLMs to implement this plan with precise, defensive code

### Critical Refinements (Final Review)

Based on final review, these critical refinements have been integrated:

1. **Clear Buffer on Production START** - Prevents stale data from skewing averages if Node-RED restarts mid-production and context is restored from disk

2. **Consolidated lastMachineCycleTime Updates** - Now updated ONLY in Machine Cycles function (not Calculate KPIs) to maintain clean "machine pulse" signal, initialized to `Date.now()` on startup to prevent immediate 0% availability

3. **Combined Initialization Strategy** - Graphs now use BOTH data-driven initialization (fast when production is running) AND 5-second safety timeout (for idle machine scenarios)

4. **Multi-Source KPI Calculation** - Calculate KPIs now explicitly handles triggers from both Machine Cycles (continuous) and Scrap Submission (event-based) with proper guards

5. **Complete Init Node** - Added production-ready initialization function with all global variables (`kpiBuffer`, `lastKPIRecordTime`, `lastMachineCycleTime`, `lastKPIValues`) properly initialized with correct default values and logging

---

## Issue Breakdown & Root Causes

### **Issue 1: KPI Updates Only on Scrap Submission**
**Symptom:** KPIs stay static during production and update only when scrap is submitted or START/STOP is clicked
**Root Cause:**
- Machine Cycles function has multiple return paths with `[null, ...]` outputs
- Output to Calculate KPIs (output port 2) only happens in specific conditions
- When `trackingEnabled` is false or no active order, KPI calculation is skipped
- **Critical line:** `if (!trackingEnabled) return [null, stateMsg];` prevents KPI updates

**Sub-issue 1b: START/STOP Button State**
- Button state not persisting because UI doesn't track `trackingEnabled` global variable
- Home template needs to watch for tracking state changes

---

### **Issue 2: Graphs Empty on First Load, Sidebar Broken**
**Symptom:** Graphs tab shows blank, navigation doesn't work until refresh
**Root Causes:**
1. **Timing Issue:** Charts created before Angular/scope is fully ready
2. **Scope Isolation:** `scope.gotoTab` might not be accessible immediately
3. **Data Race:** Charts created before first KPI data arrives

**Why refresh works:** Second load benefits from cached scope and existing data

---

### **Issue 3: Availability & OEE Drop to 0%**
**Symptom:** Metrics incorrectly show 0% during active production
**Root Cause:**
- Calculate KPIs function has logic that sets availability to 0 when certain conditions aren't met
- **Need to verify:** When does `trackingEnabled` check fail?
- **Hypothesis:** When production is running but tracking flag isn't properly set, availability defaults to 0

---

### **Issue 4: Graph Updates Too Frequent/Jerky**
**Symptom:** Data points recorded too often, causing choppy visualization
**Root Cause:**
- Record KPI History is called on EVERY Calculate KPIs output
- With machine cycles happening every ~1 second, KPIs recorded every second
- Need time-based throttling (1-minute intervals) instead of event-based recording

---

### **Issue 5: Time Range Filters Not Working**
**Symptom:** Shift/Day/Week/Month/Year buttons don't change graph display
**Root Cause:**
- `build(metric, range)` function receives range parameter but **ignores it**
- Function always returns ALL data from `realtimeData[metric]`
- Need to filter data based on selected time range

---

## Fix Plan - Phased Approach

### **PHASE 1: Low-Risk Quick Wins** ⚡
*Estimated Time: 30 minutes*
*Risk Level: LOW*

#### 1.1 Fix Graph Filters (Issue 5)
**Files:** `projects/Plastico/flows.json` → Graphs Template

**Changes:**
```javascript
// BEFORE
function build(metric, range){
  const arr = realtimeData[metric];
  if (!arr || arr.length === 0) return [];
  return arr.map(d=>({x:d.timestamp, y:d.value}));
}

// AFTER
function build(metric, range){
  const arr = realtimeData[metric];
  if (!arr || arr.length === 0) return [];

  // Calculate time cutoff based on range
  const now = Date.now();
  const cutoffs = {
    shift: 8 * 60 * 60 * 1000,      // 8 hours
    day: 24 * 60 * 60 * 1000,       // 24 hours
    week: 7 * 24 * 60 * 60 * 1000,  // 7 days
    month: 30 * 24 * 60 * 60 * 1000, // 30 days
    year: 365 * 24 * 60 * 60 * 1000  // 365 days
  };

  const cutoffTime = now - (cutoffs[range] || cutoffs.shift);

  // Filter data to selected time range
  return arr
    .filter(d => d.timestamp >= cutoffTime)
    .map(d => ({x: d.timestamp, y: d.value}));
}
```

**Testing:**
- Click each filter button
- Verify data range changes in charts
- Check that no errors occur
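
The manual checks above can be complemented by an offline check of the filtering logic. The sketch below is a standalone variant of `build()`; the name `buildSeries`, its `now` parameter, and the sample data are assumptions introduced only for testing and are not part of the existing template.

```javascript
// Range windows in milliseconds, mirroring the cutoffs used in build().
const RANGE_MS = {
  shift: 8 * 60 * 60 * 1000,
  day: 24 * 60 * 60 * 1000,
  week: 7 * 24 * 60 * 60 * 1000,
  month: 30 * 24 * 60 * 60 * 1000,
  year: 365 * 24 * 60 * 60 * 1000
};

// Hypothetical pure variant of build(): data and clock are passed in,
// so the result does not depend on realtimeData or the wall clock.
function buildSeries(arr, range, now) {
  if (!Array.isArray(arr) || arr.length === 0) return [];
  const windowMs = RANGE_MS[range] || RANGE_MS.shift; // unknown range falls back to shift
  const cutoffTime = now - windowMs;
  return arr
    .filter(d => d.timestamp >= cutoffTime)
    .map(d => ({ x: d.timestamp, y: d.value }));
}

// Sample data: points that are 3 days, 12 hours, and 1 hour old.
const HOUR = 60 * 60 * 1000;
const now = 1700000000000;
const sample = [
  { timestamp: now - 72 * HOUR, value: 55 },
  { timestamp: now - 12 * HOUR, value: 70 },
  { timestamp: now - 1 * HOUR, value: 82 }
];

console.log(buildSeries(sample, "shift", now).length); // 1
console.log(buildSeries(sample, "day", now).length);   // 2
console.log(buildSeries(sample, "week", now).length);  // 3
```

Passing `now` explicitly keeps the check deterministic; the template version can keep calling `Date.now()` internally.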

**Potential Issues:**
- If no data exists in selected range, chart might be empty (expected behavior)

**Rollback:** Easy - revert to original build() function

---

#### 1.2 Fix Empty Graphs on First Load (Issue 2)
**Files:** `projects/Plastico/flows.json` → Graphs Template

**Strategy:** Use data-driven initialization instead of fixed timeout for reliability

**Changes:**

**A) Combined Data-Driven + Safety Timeout (RECOMMENDED)**
```javascript
// BEFORE
setTimeout(()=>{
  initFilters();
  createCharts(currentRange);
},300);

// AFTER - Wait for first data message OR timeout
let chartsInitialized = false;

scope.$watch('msg', function(msg) {
  // Initialize on first KPI data arrival
  if (msg && msg.payload && msg.payload.kpis && !chartsInitialized) {
    // Scope and data are both ready
    initFilters();
    createCharts(currentRange);
    chartsInitialized = true;
    console.log('[Graphs] Charts initialized via data-driven approach');
  }

  // Update charts if already initialized
  if (chartsInitialized && msg && msg.payload && msg.payload.kpis) {
    updateCharts(msg);
  }
});

// ADDED: Safety timer for when machine is idle (no KPI messages flowing)
setTimeout(() => {
  if (!chartsInitialized) {
    console.warn('[Graphs] Charts initialized via safety timer (machine idle)');
    initFilters();
    createCharts(currentRange);
    chartsInitialized = true;
  }
}, 5000); // 5 seconds grace period for KPI messages
```

**Why Both?**
- **Data-driven**: Ensures charts initialize as soon as data is available (fast, reliable)
- **Safety timeout**: Handles "dashboard loaded but machine is idle" scenario (no KPI messages)
- Together they cover both active production and idle machine scenarios

**B) Fallback: Polling with timeout (if data-driven doesn't work)**
```javascript
function initWhenReady(attempts = 0) {
  const oeeEl = document.getElementById("chart-oee");
  const availEl = document.getElementById("chart-availability");

  if (oeeEl && availEl && scope.gotoTab) {
    // Both DOM and scope ready
    initFilters();
    createCharts(currentRange);
  } else if (attempts < 20) {
    // Retry every 100ms, max 2 seconds
    setTimeout(() => initWhenReady(attempts + 1), 100);
  } else {
    console.error("[Graphs] Failed to initialize charts after 2 seconds");
  }
}

// Start polling on load
initWhenReady();
```

**C) Ensure scope.gotoTab is properly bound**
```javascript
// BEFORE
(function(scope){
  scope.gotoTab = t => scope.send({ui_control:{tab:t}});
})(scope);

// AFTER
(function(s){
  if (!s.gotoTab) {
    s.gotoTab = function(t) {
      s.send({ui_control: {tab: t}});
    };
  }
})(scope);
```

**D) Add defensive chart creation with retry**
```javascript
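function createCharts(range, attempts = 0){
  // Ensure DOM elements exist
  const oeeEl = document.getElementById("chart-oee");
  const availEl = document.getElementById("chart-availability");

  if (!oeeEl || !availEl) {
    if (attempts >= 10) {
      // Give up after ~2 seconds instead of retrying forever
      console.error("[Graphs] Chart elements never appeared, giving up");
      return;
    }
    console.warn("[Graphs] Chart elements not ready, retrying...");
    setTimeout(() => createCharts(range, attempts + 1), 200);
    return;
  }

  // ... rest of existing chart creation logic
}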
```

**Testing:**
- Clear browser cache
- Navigate to Graphs tab from fresh load
- Test sidebar navigation
- Verify charts appear without refresh
- Test on slow network/system

**Potential Issues:**
- The data-driven path only fires while KPI messages are flowing
- If no production is running, charts initialize only when the 5-second safety timer in Option A fires, so do not remove it

**Recommended Implementation:**
1. Start with data-driven approach (Option A)
2. Add polling fallback (Option B) as safety net
3. Implement defensive checks (Options C & D)
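
Because Options A, B, and D can each end up calling `initFilters()` and `createCharts()`, it may help to route them through a single idempotent guard. The sketch below shows one way to do that; `makeInitOnce` is a hypothetical helper, and the two stub functions only stand in for the template's real `initFilters` and `createCharts`.

```javascript
// Hypothetical helper: wraps an init routine so that it runs at most once,
// no matter how many code paths (data-driven, safety timer, polling) call it.
function makeInitOnce(initFn) {
  let done = false;
  return function initOnce(source) {
    if (done) return false; // already initialized by another path
    done = true;
    initFn(source);
    return true;
  };
}

// Stubs standing in for the template's real functions (assumptions for this demo).
let createCount = 0;
function initFilters() {}
function createCharts(range) { createCount += 1; }

const initCharts = makeInitOnce(function (source) {
  initFilters();
  createCharts("shift");
  console.log("[Graphs] Charts initialized via " + source);
});

initCharts("data");         // runs the init routine
initCharts("safety-timer"); // no-op: charts already exist
console.log(createCount);   // 1
```

With a guard like this, the `chartsInitialized` flag lives in one place instead of being checked separately in each option.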

**Rollback:** Easy - revert to original setTimeout logic

---

### **PHASE 2: Medium-Risk Data Flow Improvements** 🔧
*Estimated Time: 45 minutes*
*Risk Level: MEDIUM*

#### 2.1 Implement KPI Update Throttling with Dual-Path Architecture (Issue 4)
**Files:**
- `projects/Plastico/flows.json` → Calculate KPIs function (add second output)
- `projects/Plastico/flows.json` → Record KPI History function (add averaging)

**Strategy:** Dual-path updates solve the stale display vs jerky graphs trade-off
- **Path 1:** Unthrottled live KPIs to Home Template for real-time display
- **Path 2:** Throttled/averaged KPIs to Record History for smooth graphs

**Part A: Modify Calculate KPIs to Output on Two Paths**

```javascript
// At the end of Calculate KPIs function

// Prepare the KPI message
const kpiMsg = {
  topic: "kpis",
  payload: {
    timestamp: Date.now(),
    kpis: {
      oee: msg.kpis.oee,
      availability: msg.kpis.availability,
      performance: msg.kpis.performance,
      quality: msg.kpis.quality
    }
  }
};

// Return to TWO outputs:
// Output 1: Live KPI to Home Template (real-time, unthrottled)
// Output 2: KPI to Record History (will be averaged/throttled)
return [
  kpiMsg,                         // Path 1: Live display
  RED.util.cloneMessage(kpiMsg)   // Path 2: History recording (deep clone so the two paths cannot mutate each other's payload)
];
```

**Wiring Changes:**
- Calculate KPIs node needs **2 outputs** (add one more)
- Output 1 → Home Template (existing connection)
- Output 2 → Record KPI History (new connection)

**Part B: Add Averaging Logic to Record KPI History**

```javascript
// Complete Record KPI History function with robust initialization

// ========== INITIALIZATION ==========
// Initialize buffer
let buffer = global.get("kpiBuffer");
if (!buffer || !Array.isArray(buffer)) {
  buffer = [];
  global.set("kpiBuffer", buffer);
  node.warn('[KPI History] Initialized kpiBuffer');
}

// Initialize last record time
let lastRecordTime = global.get("lastKPIRecordTime");
if (!lastRecordTime || typeof lastRecordTime !== 'number') {
  // Set to 1 minute ago to ensure immediate recording on startup
  lastRecordTime = Date.now() - 60000;
  global.set("lastKPIRecordTime", lastRecordTime);
  node.warn('[KPI History] Initialized lastKPIRecordTime');
}

// ========== ACCUMULATE ==========
const kpis = msg.payload && msg.payload.kpis; // tolerate messages without a payload
if (!kpis) {
  node.warn('[KPI History] No KPIs in message, skipping');
  return null;
}

buffer.push({
  timestamp: Date.now(),
  oee: kpis.oee || 0,
  availability: kpis.availability || 0,
  performance: kpis.performance || 0,
  quality: kpis.quality || 0
});

// Prevent buffer from growing too large (safety limit)
if (buffer.length > 100) {
  buffer = buffer.slice(-60); // Keep last 60 entries
  node.warn('[KPI History] Buffer exceeded 100 entries, trimmed to 60');
}

global.set("kpiBuffer", buffer);

// ========== CHECK IF TIME TO RECORD ==========
const now = Date.now();
const timeSinceLastRecord = now - lastRecordTime;
const ONE_MINUTE = 60 * 1000;

if (timeSinceLastRecord < ONE_MINUTE) {
  // Not time to record yet
  const secondsRemaining = Math.ceil((ONE_MINUTE - timeSinceLastRecord) / 1000);
  // Debug log (can remove in production)
  // node.warn(`[KPI History] Buffer: ${buffer.length} entries, recording in ${secondsRemaining}s`);
  return null; // Don't send to charts yet
}

// ========== CALCULATE AVERAGES ==========
if (buffer.length === 0) {
  node.warn('[KPI History] Buffer empty at recording time, skipping');
  return null;
}

const avg = {
  oee: buffer.reduce((sum, d) => sum + d.oee, 0) / buffer.length,
  availability: buffer.reduce((sum, d) => sum + d.availability, 0) / buffer.length,
  performance: buffer.reduce((sum, d) => sum + d.performance, 0) / buffer.length,
  quality: buffer.reduce((sum, d) => sum + d.quality, 0) / buffer.length
};

node.warn(`[KPI History] Recording averaged KPIs from ${buffer.length} samples: OEE=${avg.oee.toFixed(1)}%`);

// ========== RECORD TO HISTORY ==========
// Update global state
global.set("lastKPIRecordTime", now);
global.set("kpiBuffer", []); // Clear buffer

// Send averaged values to graphs and database
return {
  topic: "kpi-history",
  payload: {
    timestamp: now,
    kpis: {
      oee: Math.round(avg.oee * 10) / 10,           // Round to 1 decimal
      availability: Math.round(avg.availability * 10) / 10,
      performance: Math.round(avg.performance * 10) / 10,
      quality: Math.round(avg.quality * 10) / 10
    },
    sampleCount: buffer.length  // Metadata for debugging
  }
};
```

**Recommendation:** Use this dual-path approach: the Home page receives every KPI update, while the graphs receive one averaged point per minute

**Testing:**
- Start production
- Observe KPI update frequency in graphs
- Verify updates occur approximately every 60 seconds
- Check that no spikes/gaps appear in data

**Potential Issues:**
- First data point might take up to 1 minute to appear
- Rapid production changes might not be immediately visible
- Buffer could grow large if production runs without recording

**Mitigation:**
- Set buffer max size (e.g., 100 entries)
- Force record on production stop/start

**Rollback:** Medium difficulty - remove throttling logic, clear global variables

---

### **PHASE 3: High-Risk Core Logic Fixes** ⚠️
*Estimated Time: 60 minutes*
*Risk Level: HIGH*

**⚠️ CRITICAL: Backup flows.json before proceeding**

#### 3.1 Fix KPI Continuous Updates (Issue 1)
**Files:** `projects/Plastico/flows.json` → Machine Cycles function

**Problem:** Machine Cycles has multiple early returns that skip KPI calculation

**Current Logic:**
```javascript
// Line ~36: No active order
if (!activeOrder || !activeOrder.id || cavities <= 0) {
    return [null, stateMsg];  // ❌ Skips KPI calculation
}

// Line ~43: Tracking not enabled
if (!trackingEnabled) {
    return [null, stateMsg];  // ❌ Skips KPI calculation
}
```

**Solution Options:**

**Option A: Always Calculate KPIs (Recommended)**
```javascript
// Always prepare a message for Calculate KPIs on output 3 (the new third output)
const kpiTrigger = { _triggerKPI: true };

// Change all returns to include kpiTrigger
if (!activeOrder || !activeOrder.id || cavities <= 0) {
    return [null, stateMsg, kpiTrigger];  // ✓ Triggers KPI calculation
}

if (!trackingEnabled) {
    return [null, stateMsg, kpiTrigger];  // ✓ Triggers KPI calculation
}

// Update last machine cycle time when a successful cycle occurs
// This is used for time-based availability logic
if (trackingEnabled && dbMsg) {
    // dbMsg being non-null implies a cycle was recorded
    global.set("lastMachineCycleTime", Date.now());
}

// ... final return
return [dbMsg, stateMsg, kpiTrigger];
```

**Critical:** The `lastMachineCycleTime` update must happen ONLY in Machine Cycles function to maintain a clean "machine pulse" signal separate from KPI calculation triggers.

**Wire Configuration Change:**
- Add third output wire to Machine Cycles node
- Connect output 3 → Calculate KPIs

**Option B: Calculate KPIs in Parallel (Alternative)**
- Add an inject node that triggers Calculate KPIs every 5 seconds
- Less coupled, but might calculate with stale data

**Recommendation:** Option A - ensures KPIs calculated with real-time data

**Testing:**
1. Start production with START button
2. Observe KPI values on Home page
3. Verify continuous updates (about once per second on the Home page; Phase 2 throttling affects only the graphs)
4. Check that scrap submission still works
5. Test production stop/start

**Potential Issues:**
- Calculate KPIs might need to handle cases with no active order
- Could calculate KPIs unnecessarily when machine is idle
- Performance impact if calculating too frequently

**Mitigation:**
- Add guards in Calculate KPIs to handle null/undefined inputs
- Implement Phase 2 throttling first so the higher trigger rate does not flood the history graphs
- Monitor system performance

**CRITICAL: Calculate KPIs Multi-Source Handling**

The Calculate KPIs function will now receive triggers from TWO sources:
1. **Machine Cycles** (continuous, real-time) - via new output 3
2. **Scrap Submission** (event-based) - existing connection

**Required Change in Calculate KPIs:**
```javascript
// At the start of Calculate KPIs function
// Must handle both trigger types

// The function should execute regardless of message content
// as long as it receives ANY trigger

const trackingEnabled = global.get("trackingEnabled");
const activeOrder = global.get("activeOrder") || {};
const productionStartTime = global.get("productionStartTime");

// Guard against missing critical data
if (!trackingEnabled || !activeOrder.id) {
  // Can't calculate meaningful KPIs without tracking or active order
  // But don't error - just skip calculation
  return null;
}

// ... rest of existing KPI calculation logic
// This logic will now run for BOTH continuous and event-based triggers
```

This ensures availability and OEE calculations work correctly whether triggered by machine cycles or scrap submission.

**Side Effects:**
- Will trigger Issue 4 more severely → MUST implement Phase 2 throttling first
- Database might receive more frequent updates
- Global variables will change more often

**Rollback:** Medium difficulty - requires restoring original return statements and wire configuration

---

#### 3.2 Fix Availability/OEE Drops to 0 (Issue 3)
**Files:** `projects/Plastico/flows.json` → Calculate KPIs function

**Investigation Steps:**
1. Read full Calculate KPIs function
2. Identify all paths that set `msg.kpis.availability = 0`
3. Add logging to track when this occurs
4. Understand state flow: trackingEnabled, productionStartTime, operatingTime

**Hypothesis Testing:**
```javascript
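// Add debug logging once trackingEnabled, productionStartTime, and operatingTime have been read
node.warn(`[KPI] trackingEnabled=${trackingEnabled}, startTime=${productionStartTime}, opTime=${operatingTime}`);

// Before each existing branch that sets availability to 0, log the reason.
// zeroCondition is a placeholder: substitute the real condition found in step 2.
if (zeroCondition) {
    node.warn(`[KPI] Setting availability to 0 because: [reason]`);
    msg.kpis.availability = 0;
}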

**Likely Fix:**
```javascript
// BEFORE
} else {
    msg.kpis.availability = 0; // Not running
}

// AFTER
} else {
    // Check if production was recently active
    const prev = global.get("lastKPIValues") || {};
    if (prev.availability > 0 && operatingTime > 0) {
        // Maintain last availability if we have operating time
        msg.kpis.availability = prev.availability;
    } else {
        msg.kpis.availability = 0;
    }
}

// Store KPIs for next iteration
global.set("lastKPIValues", msg.kpis);
```

**Testing:**
1. Start production
2. Monitor availability values
3. Trigger scrap prompt
4. Verify availability doesn't drop to 0
5. Check OEE calculation

**Potential Issues:**
- Might mask legitimate 0% availability (machine actually stopped)
- Could create artificially high availability readings
- State persistence might cause issues after restart

**Mitigation:**
- Add clear conditions for when availability should legitimately be 0
- Reset lastKPIValues on work order completion
- Add production state tracking

**Rollback:** Easy if logging added first - can revert based on log analysis

---

#### 3.3 Fix START/STOP Button State (Issue 1b)
**Files:** `projects/Plastico/flows.json` → Home Template

**Problem:** Button doesn't show correct state (STOP when production running)

**Investigation:**
- Find button rendering logic in Home template
- Check how `trackingEnabled` or `productionStarted` is tracked
- Verify message handler receives state updates

**Changes:**
```javascript
// In Home Template scope.$watch
if (msg.topic === 'machineStatus') {
  window.machineOnline = msg.payload.machineOnline;
  window.productionStarted = msg.payload.productionStarted;

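  // NEW: Track tracking state for button display.
  // ng-show reads from scope, so mirror the flag there as well.
  window.trackingEnabled = msg.payload.trackingEnabled || window.productionStarted;
  scope.trackingEnabled = window.trackingEnabled;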

  scope.renderDashboard();
  return;
}
```

**Button HTML Update:**
```html
<!-- BEFORE -->
<button ng-click="handleStart()">START</button>

<!-- AFTER -->
<button ng-click="handleStart()" ng-show="!trackingEnabled">START</button>
<button ng-click="handleStop()" ng-show="trackingEnabled" class="stop-btn">STOP</button>
```

**Backend Update (Work Order buttons):**
```javascript
// When START clicked, also set trackingEnabled flag
if (action === "start-tracking") {
    global.set("trackingEnabled", true);

    // CRITICAL: Clear KPI buffer on production start
    // Prevents stale data from skewing averages if Node-RED was restarted mid-production
    global.set("kpiBuffer", []);
    node.warn('[START] Cleared kpiBuffer for fresh production run');

    // Optional: Reset last record time to ensure immediate data point
    global.set("lastKPIRecordTime", Date.now() - 60000);

    // Send state update to UI
    const stateMsg = {
        topic: "machineStatus",
        payload: {
            machineOnline: true,
            productionStarted: true,
            trackingEnabled: true
        }
    };
    // ... send stateMsg to Home template
}
```

**Why Clear Buffer on START:**
If Node-RED restarts during a production run and context is restored from disk, the `kpiBuffer` might contain stale data from before the restart. When production resumes, new data would be mixed with old data, skewing the averages. Clearing on START ensures a clean slate for each production session.

**Testing:**
1. Load dashboard
2. Start work order
3. Verify START button changes to STOP
4. Click STOP (if implemented)
5. Verify button changes back to START

**Potential Issues:**
- Need to implement STOP button handler if it doesn't exist
- State sync between backend and frontend
- Button might flicker during state transitions

**Rollback:** Easy - remove button visibility conditions

---

## Implementation Order & Dependencies

### Recommended Sequence:

1. **Phase 1.1** - Fix Filters (Independent, low risk)
2. **Phase 1.2** - Fix Empty Graphs (Independent, low risk)
3. **Phase 2.1** - Add Throttling (Required before Phase 3.1)
4. **Phase 3.2** - Fix Availability Calculation (Add logging first)
5. **Phase 3.1** - Fix Continuous KPI Updates (Depends on throttling)
6. **Phase 3.3** - Fix Button State (Can be done anytime)

### Why This Order?

1. **Quick wins first** - Build confidence, improve UX immediately
2. **Throttling before continuous updates** - Prevent performance issues
3. **Logging before logic changes** - Understand problem before fixing
4. **Independent fixes can run in parallel** - Save time

---

## Testing Strategy

### Per-Phase Testing:
- Test each phase independently
- Don't proceed to next phase if current fails
- Keep backup of working state

### Integration Testing (After All Phases):
1. **Fresh Start Test**
   - Clear browser cache
   - Restart Node-RED
   - Load dashboard
   - Navigate all tabs

2. **Production Cycle Test**
   - Start new work order
   - Click START
   - Let run for 2-3 minutes
   - Submit scrap
   - Verify KPIs update
   - Check graphs show data
   - Test time filters

3. **State Persistence Test**
   - Refresh page during production
   - Verify state restores correctly
   - Check button shows STOP if running

4. **Edge Cases**
   - No active work order
   - Machine offline
   - Zero production time
   - Rapid start/stop

---

## Rollback Plan

### Per-Phase Rollback:
Each phase documents its rollback procedure. In general:

1. **Stop Node-RED**
2. **Restore flows.json from backup**
   ```bash
   cp projects/Plastico/flows.json.backup projects/Plastico/flows.json
   ```
3. **Clear global context** (if needed)
   ```javascript
   // In a temporary function node triggered by an inject node (debug nodes cannot run code)
   global.set("lastKPIRecordTime", null);
   global.set("kpiBuffer", null);
   global.set("lastKPIValues", null);
   ```
4. **Restart Node-RED**
5. **Clear browser cache**

### Emergency Full Rollback:
```bash
# Restore from most recent backup
cp projects/Plastico/Respaldo_MVP_Complete_11_23_25.json projects/Plastico/flows.json
# Restart Node-RED
node-red-restart
```

---

## Potential Roadblocks & Mitigations

### Roadblock 1: Global Context Persistence on Deploy/Restart ⚠️ CRITICAL
**Symptom:** After Node-RED restart or deploy, throttling/averaging/availability logic breaks or shows incorrect data
**Root Cause:** Global variables (`lastKPIRecordTime`, `kpiBuffer`, `lastKPIValues`, `trackingEnabled`) may be reset or restored from file/memory store depending on settings.js configuration

**Mitigation:**
1. **Add Robust Initialization Logic:**
```javascript
// In Record KPI History function - ALWAYS check and initialize
let buffer = global.get("kpiBuffer");
if (!buffer || !Array.isArray(buffer)) {
  buffer = [];
  global.set("kpiBuffer", buffer);
}

let lastRecordTime = global.get("lastKPIRecordTime");
if (!lastRecordTime || typeof lastRecordTime !== 'number') {
  // Set to 1 minute ago to ensure immediate recording on startup
  lastRecordTime = Date.now() - 60000;
  global.set("lastKPIRecordTime", lastRecordTime);
}
```

2. **Create an Init Node:**
   - Add a dedicated "Initialize Global Variables" function node
   - Trigger on deploy using an inject node (inject once, delay 0)
   - Wire to all critical nodes to ensure state is set before first execution

**Complete Init Node Code:**
```javascript
// Initialize Global Variables - Run on Deploy
node.warn('[INIT] Initializing global variables');

// KPI Buffer for averaging
if (!global.get("kpiBuffer")) {
  global.set("kpiBuffer", []);
  node.warn('[INIT] Set kpiBuffer to []');
}

// Last KPI record time - set to 1 min ago for immediate first record
if (!global.get("lastKPIRecordTime")) {
  global.set("lastKPIRecordTime", Date.now() - 60000);
  node.warn('[INIT] Set lastKPIRecordTime');
}

// Last machine cycle time - set to now to prevent immediate 0% availability
if (!global.get("lastMachineCycleTime")) {
  global.set("lastMachineCycleTime", Date.now());
  node.warn('[INIT] Set lastMachineCycleTime to prevent 0% availability on startup');
}

// Last KPI values
if (!global.get("lastKPIValues")) {
  global.set("lastKPIValues", {});
  node.warn('[INIT] Set lastKPIValues to {}');
}

node.warn('[INIT] Global variable initialization complete');
return msg;
```

3. **Check settings.js:**
   - Verify the `contextStorage` configuration
   - The default `memory` store is lost on restart; consider the `localfilesystem` module if values must persist
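
The settings.js check above could result in a configuration along these lines. This is a sketch, not the project's actual settings: the store name `file` is a label chosen here for illustration, while `memory` and `localfilesystem` are the context store modules that ship with Node-RED.

```javascript
// settings.js (fragment) - assumes a standard Node-RED install
module.exports = {
  contextStorage: {
    default: { module: "memory" },       // fast, but lost on restart
    file: { module: "localfilesystem" }  // written to disk, survives restart
  }
};

// In a function node, the store name is passed as the last argument:
//   global.set("lastKPIValues", msg.kpis, "file");
//   const prev = global.get("lastKPIValues", "file");
```

Values in the file-backed store are cached in memory and flushed to disk periodically rather than on every write, so the initialization guards in this plan remain necessary after an unclean shutdown.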

**Testing:**
- Deploy changes multiple times
- Restart Node-RED
- Verify variables persist/initialize correctly
- Check debug logs for initialization messages

---

### Roadblock 2: State Sync Between Flow and Dashboard (Push vs Pull Model)
**Symptom:** START/STOP button shows wrong state when user loads dashboard mid-production
**Root Cause:** Relying on push model (messages sent during state changes) - if user loads page after tracking started, initial message is missed

**Mitigation:**
1. **Add Pull Mechanism in Home Template:**
```javascript
// In Home Template initialization
(function(scope) {
  // Request current state on load
  scope.send({
    topic: "requestState",
    payload: {}
  });

  // Handle state response
  scope.$watch('msg', function(msg) {
    if (msg && msg.topic === 'currentState') {
      window.trackingEnabled = msg.payload.trackingEnabled;
      scope.trackingEnabled = msg.payload.trackingEnabled; // ng-show binds to scope, not window
      window.productionStarted = msg.payload.productionStarted;
      window.machineOnline = msg.payload.machineOnline;
      scope.renderDashboard();
    }
    // ... rest of watch logic
  });
})(scope);
```

2. **Add State Response Handler:**
   - Create function node that listens for `requestState` topic
   - Responds with current global state values
   - Wire to Home template
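
The state response handler could look roughly like the sketch below. `buildStateResponse` is a hypothetical helper, and the `Map`-backed `globalStub` only imitates the `global.get` call that a Node-RED function node provides; the `machineOnline` and `productionStarted` context keys are assumed to be stored under those names.

```javascript
// Hypothetical body for the state response function node, written as a pure
// function so it can be exercised outside Node-RED.
function buildStateResponse(msg, globalCtx) {
  if (!msg || msg.topic !== "requestState") return null; // ignore other traffic
  return {
    topic: "currentState",
    payload: {
      trackingEnabled: Boolean(globalCtx.get("trackingEnabled")),
      productionStarted: Boolean(globalCtx.get("productionStarted")),
      machineOnline: Boolean(globalCtx.get("machineOnline"))
    }
  };
}

// Minimal stand-in for Node-RED's global context (assumption, for testing only).
const store = new Map([["trackingEnabled", true], ["machineOnline", true]]);
const globalStub = { get: key => store.get(key) };

const reply = buildStateResponse({ topic: "requestState", payload: {} }, globalStub);
console.log(reply.payload.trackingEnabled);   // true
console.log(reply.payload.productionStarted); // false (key not set yet)
console.log(buildStateResponse({ topic: "kpis" }, globalStub)); // null
```

Inside the actual function node, the last line would be `return buildStateResponse(msg, global);`. Coercing to booleans means the template never has to handle `undefined` before the flags are first set.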

**Testing:**
- Start production
- Open dashboard in new browser tab
- Verify button shows STOP immediately
- Test with multiple browser sessions

---

### Roadblock 3: UI/Angular Timing Races in ui-template ⚠️ HIGH IMPACT
**Symptom:** Charts sometimes load, sometimes don't - a fixed timeout (300ms in the current Graphs template) is unreliable on slow systems or complex templates
**Root Cause:** Node-RED Dashboard uses AngularJS - digest cycle and DOM rendering timing is unpredictable

**Mitigation Option A - Data-Driven Initialization (RECOMMENDED):**
```javascript
// Instead of fixed timeout, wait for first data
let chartsInitialized = false;

scope.$watch('msg', function(msg) {
  // KPI messages carry their values in msg.payload.kpis (same shape as Phase 1.2)
  const hasKpis = msg && msg.payload && msg.payload.kpis;

  if (hasKpis && !chartsInitialized) {
    // First data arrived, scope is ready
    initFilters();
    createCharts(currentRange);
    chartsInitialized = true;
  }

  if (chartsInitialized && hasKpis) {
    updateCharts(msg);
  }
});
```

**Mitigation Option B - Angular Lifecycle Hook:**
```javascript
// Defer initialization to the next digest cycle
scope.$applyAsync(function() {
  // Scope is ready here; the chart DOM elements usually are too, but that is not guaranteed
  initFilters();
  createCharts(currentRange);
});
```

**Mitigation Option C - Polling with Timeout:**
```javascript
function initWhenReady(attempts = 0) {
  const oeeEl = document.getElementById("chart-oee");

  if (oeeEl && scope.gotoTab) {
    // Both DOM and scope ready
    initFilters();
    createCharts(currentRange);
  } else if (attempts < 20) {
    // Retry every 100ms, max 2 seconds
    setTimeout(() => initWhenReady(attempts + 1), 100);
  } else {
    console.error("Failed to initialize charts after 2 seconds");
  }
}

// Start polling
initWhenReady();
```

**Recommendation:** Use Option A, paired with the 5-second safety timer from Phase 1.2 so charts still initialize when the machine is idle

---

### Roadblock 4: Throttling vs Live Display Trade-off
**Symptom:** With averaging, displayed KPIs are stale (up to 59 seconds old), but without averaging, graphs are jerky
**Root Cause:** OEE is a real-time snapshot - averaging smooths graphs but delays live feedback

**Solution: Dual-Path KPI Updates**

**Architecture:**
- **Path 1 (Live):** Machine Cycles → Calculate KPIs → Home Template (no throttling)
- **Path 2 (History):** Machine Cycles → Calculate KPIs → Averaging Buffer → Record History (throttled to 1 min)

**Implementation:**
```javascript
// In Calculate KPIs function - send to TWO outputs
return [
  msg,              // Output 1: Live KPI to Home Template (unthrottled)
  { ...msg }        // Output 2: KPI to History (will be throttled)
];
```

**In Record KPI History - add averaging logic:**
```javascript
// Only this node has averaging/throttling
let buffer = global.get("kpiBuffer") || [];
buffer.push({
  timestamp: Date.now(),
  oee: msg.kpis.oee,
  availability: msg.kpis.availability,
  performance: msg.kpis.performance,
  quality: msg.kpis.quality
});

const lastRecord = global.get("lastKPIRecordTime") || 0;
const now = Date.now();

if (now - lastRecord >= 60000) {
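  // Average each metric over the buffered snapshots
  const mean = (key) => buffer.reduce((sum, d) => sum + d[key], 0) / buffer.length;
  const avg = {
    oee: mean("oee"),
    availability: mean("availability"),
    performance: mean("performance"),
    quality: mean("quality")
  };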

  // Record averaged values to history
  // Send to Graphs template
  global.set("lastKPIRecordTime", now);
  global.set("kpiBuffer", []);
  return { kpis: avg };
} else {
  global.set("kpiBuffer", buffer);
  return null; // Don't record yet
}
```

**Benefits:**
- Live display always shows current OEE
- Graphs are smooth with averaged data
- No UX compromise
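
One edge case the averaging step has to tolerate is a snapshot with a missing or non-numeric metric, because a single `undefined` turns the whole average into `NaN`. The following is a minimal sketch of a tolerant average, written as a plain function so it can be exercised outside Node-RED. The name `averageKpis` and the choice to skip bad samples (rather than count them as 0) are assumptions, not part of the existing flow.

```javascript
// Hypothetical helper: averages each KPI metric over buffered snapshots,
// skipping samples whose value is not a finite number.
const METRICS = ["oee", "availability", "performance", "quality"];

function averageKpis(buffer) {
  const avg = {};
  for (const key of METRICS) {
    const values = buffer
      .map((sample) => sample[key])
      .filter((v) => typeof v === "number" && Number.isFinite(v));
    // null signals "no usable samples" so the caller can decide what to record
    avg[key] = values.length
      ? values.reduce((sum, v) => sum + v, 0) / values.length
      : null;
  }
  return avg;
}
```

Inside Record KPI History, `averageKpis(buffer)` could stand in for the inline `reduce` calls if dropped or malformed snapshots turn out to be a problem in practice.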

---

### Roadblock 5: Availability 0% Logic Too Simplistic
**Symptom:** Availability drops to 0% during brief pauses (e.g. scrap submission), yet may fail to drop to 0% during legitimate stops (breaks, maintenance)
**Root Cause:** Reusing the previous value without a time-based threshold cannot distinguish a brief interruption from an actual shutdown

**Improved Logic:**
```javascript
// In Calculate KPIs function
const now = Date.now();
const lastCycleTime = global.get("lastMachineCycleTime") || now;
const timeSinceLastCycle = now - lastCycleTime;

const BRIEF_PAUSE_THRESHOLD = 5 * 60 * 1000; // 5 minutes

if (!trackingEnabled || timeSinceLastCycle > BRIEF_PAUSE_THRESHOLD) {
  // Legitimately stopped or long pause
  msg.kpis.availability = 0;
  global.set("lastKPIValues", null); // Clear history
} else if (operatingTime > 0) {
  // Calculate normally
  msg.kpis.availability = calculateAvailability(operatingTime, plannedTime);
  global.set("lastKPIValues", msg.kpis);
} else {
  // Brief pause - maintain last known value
  const prev = global.get("lastKPIValues") || {};
  msg.kpis.availability = prev.availability || 0;
}

// NOTE: lastMachineCycleTime is updated in Machine Cycles function ONLY
// This keeps the "machine pulse" signal clean and separate from KPI calculation
```

**Configuration:**
- Adjust `BRIEF_PAUSE_THRESHOLD` based on your production environment
- Consider making it configurable via dashboard setting
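
To make the threshold configurable, one option is to read it from global context and fall back to the 5-minute default when the stored value is missing or invalid. This is a sketch under assumptions: the context key `briefPauseThresholdMin`, the helper name `resolvePauseThresholdMs`, and the 1 to 60 minute clamp are all hypothetical choices.

```javascript
const DEFAULT_PAUSE_MIN = 5;  // same default as BRIEF_PAUSE_THRESHOLD
const MIN_PAUSE_MIN = 1;      // assumed lower bound
const MAX_PAUSE_MIN = 60;     // assumed upper bound

// Converts a raw setting (number or numeric string, in minutes) to milliseconds.
function resolvePauseThresholdMs(rawMinutes) {
  const minutes = Number(rawMinutes);
  const missing = rawMinutes === null || rawMinutes === undefined || rawMinutes === "";
  if (missing || !Number.isFinite(minutes)) {
    return DEFAULT_PAUSE_MIN * 60 * 1000;
  }
  const clamped = Math.min(MAX_PAUSE_MIN, Math.max(MIN_PAUSE_MIN, minutes));
  return clamped * 60 * 1000;
}

// Possible use inside the Calculate KPIs function node (key name is an assumption):
// const BRIEF_PAUSE_THRESHOLD = resolvePauseThresholdMs(global.get("briefPauseThresholdMin"));
```

Clamping keeps a mistyped dashboard value (for example 0 or 5000) from silently disabling the pause logic.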

---

### Roadblock 6: KPI Calculation Performance
**Symptom:** System slow after implementing continuous KPI updates
**Mitigation:**
- Implement Phase 2 throttling FIRST (now with dual-path approach)
- Ensure Calculate KPIs has guards for null/undefined inputs
- Profile Calculate KPIs function for optimization
- Monitor Node-RED CPU usage during production
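
The input guards mentioned above could look like the following sketch. It assumes the three factors arrive as percentages (0 to 100) and uses the standard OEE product, Availability × Performance × Quality. The function name `safeOee` and the choice to return `null` on bad input are hypothetical, so adapt them to what Calculate KPIs actually receives.

```javascript
// Hypothetical guard: returns OEE as a percentage, or null if any input is unusable.
function safeOee(kpis) {
  if (!kpis || typeof kpis !== "object") return null;
  const factors = [kpis.availability, kpis.performance, kpis.quality];
  const valid = factors.every(
    (v) => typeof v === "number" && Number.isFinite(v) && v >= 0 && v <= 100
  );
  if (!valid) return null;
  // Product of three percentages, rescaled back to the 0-100 range
  return (kpis.availability * kpis.performance * kpis.quality) / 10000;
}
```

Returning `null` early also skips the rest of the calculation for bad inputs, so the guard costs very little on the hot path.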

---

### Roadblock 7: Browser Cache Issues
**Symptom:** Changes don't appear after deployment
**Mitigation:**
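- Hard-refresh the browser during testing (Ctrl+Shift+R / Cmd+Shift+R)
- Add a version marker to the template so you can confirm which build the browser loaded (optional). Keep it static: Angular expressions cannot reach globals such as `Date`, and interpolation is not applied inside HTML comments:
```html
<!-- In template header -->
<!-- Version: 1.1 -->
```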
- Use incognito/private browsing for testing
- Test on different browsers/devices

---

## Success Criteria

### Phase 1:
- ✅ Time filters change graph display correctly
- ✅ Graphs load on first visit without refresh
- ✅ Sidebar navigation works immediately

### Phase 2:
- ✅ Graph updates occur at ~1 minute intervals
- ✅ Graphs are smooth, not jerky
- ✅ No performance degradation

### Phase 3:
- ✅ KPIs update continuously during production
- ✅ Availability never incorrectly shows 0%
- ✅ START button shows STOP when production running
- ✅ OEE calculation is accurate

### Integration:
- ✅ All features work together without conflicts
- ✅ No console errors
- ✅ Production tracking works end-to-end
- ✅ Data persists correctly

---

## Estimated Timeline

| Phase | Task | Time | Cumulative |
|-------|------|------|------------|
| 1.1 | Fix Filters | 15 min | 15 min |
| 1.2 | Fix Empty Graphs | 15 min | 30 min |
| 2.1 | Add Throttling | 45 min | 1h 15m |
| 3.2 | Fix Availability (with logging) | 30 min | 1h 45m |
| 3.1 | Fix Continuous Updates | 30 min | 2h 15m |
| 3.3 | Fix Button State | 20 min | 2h 35m |
| Testing | Integration Testing | 30 min | 3h 5m |

**Total: ~3 hours** (assuming no major roadblocks)

---

## Best Practices for LLM-Assisted Implementation

When working with an LLM to implement this plan, use these strategies for best results:

### 1. Isolate Logic Focus (Function Node Precision)
**DO:**
- Ask for specific function node code: "Write the Record KPI History function with averaging logic including global.get initialization"
- Provide exact input/output requirements: "This function receives msg.kpis object and must return msg or null"
- Request one change at a time

**DON'T:**
- Ask vague questions like "fix my dashboard"
- Request multiple phase changes in one prompt
- Assume the LLM knows your flow structure

### 2. Explicitly Define Global Variables
**Template for LLM prompts:**
```
Global variable: kpiBuffer
Type: Array of objects
Structure: [{timestamp: number, oee: number, availability: number, performance: number, quality: number}]
Lifecycle: Initialized to [] if null, cleared after recording to history
Purpose: Accumulates KPI values for 1-minute averaging
```

**Always specify:**
- Variable name
- Data type
- Default/initial value
- When it's read/written
- When it should be cleared

### 3. Specify Node-RED Input/Output Requirements
**Example prompt:**
```
The Machine Cycles function node must have 3 outputs:
- Output 1: DB write message (only when tracking enabled)
- Output 2: State update message (always sent)
- Output 3: KPI trigger message (always sent for continuous updates)

The return statement should be:
return [dbMsg, stateMsg, kpiTrigger];
```
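
For reference, here is a sketch of what such a return might look like. In a function node, a `null` entry in the returned array sends nothing on that output, which is how the DB write is suppressed while tracking is off. The function name `buildCycleOutputs`, the topics, and the payload field names are illustrative assumptions, not the flow's actual message shapes.

```javascript
// Hypothetical builder for the three Machine Cycles outputs.
function buildCycleOutputs(trackingEnabled, cycleCount, now) {
  // Output 1: DB write, only while tracking is enabled
  const dbMsg = trackingEnabled
    ? { topic: "db-write", payload: { cycleCount, timestamp: now } }
    : null;
  // Output 2: state update, always sent
  const stateMsg = { topic: "state", payload: { trackingEnabled, cycleCount } };
  // Output 3: KPI trigger, always sent for continuous updates
  const kpiTrigger = { topic: "kpi-trigger", payload: { timestamp: now } };
  return [dbMsg, stateMsg, kpiTrigger];
}
```

Including a concrete shape like this in the prompt gives the LLM less room to invent its own output order.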

### 4. Request Defensive Code
**Always ask for:**
- Null/undefined checks before accessing properties
- Type validation for global variables
- Initialization logic at the start of functions
- Error handling for edge cases

**Example:**
```javascript
// BAD (LLM might generate)
const buffer = global.get("kpiBuffer");
buffer.push(newValue);

// GOOD (what you should request)
let buffer = global.get("kpiBuffer");
if (!Array.isArray(buffer)) {
  buffer = [];
}
buffer.push(newValue);
global.set("kpiBuffer", buffer);
```

### 5. Break Down Complex Changes
**For Phase 3.1 (Continuous KPI Updates), ask in sequence:**
1. "Show me the current return statements in Machine Cycles function"
2. "Modify the function to add a third output for KPI trigger"
3. "Update all return statements to include kpiTrigger message"
4. "Show me how to wire the third output to Calculate KPIs node"

### 6. Request Testing/Debugging Code
**Ask LLM to include:**
- Debug logging: `node.warn('[KPI] Buffer size: ' + buffer.length);`
- State validation: Check that variables have expected values
- Error messages: Descriptive messages for troubleshooting

### 7. Validate Against Node-RED Constraints
**Remind the LLM of Node-RED specifics:**
- "This is a Node-RED function node, not a standalone Node.js script"
- "Global context uses global.get/set, not module-level variables"
- "Return the msg object (or call node.send) to pass it to the next node"
- "Use node.warn() for logging, not console.log()"

### 8. Phase-by-Phase Verification
**After each LLM response:**
1. Verify the code matches the plan
2. Check for initialization logic
3. Confirm output structure matches wiring
4. Ask: "What edge cases does this handle?"

### 9. Example: Perfect LLM Prompt for Phase 2.1

```
I need to implement KPI throttling with averaging in Node-RED.

Context:
- Function node: "Record KPI History"
- Input: msg.kpis object with {oee, availability, performance, quality}
- Output: Averaged KPI values sent to Graphs template (or null if not ready to record)

Global variables needed:
1. kpiBuffer (Array): Accumulates KPI snapshots. Initialize to [] if null.
2. lastKPIRecordTime (Number): Last timestamp when history was recorded. Initialize to (Date.now() - 60000) if null for immediate first recording.

Requirements:
- Accumulate incoming KPIs in kpiBuffer
- Every 60 seconds (60000ms), calculate average of all buffer values
- Send averaged KPIs to output
- Clear buffer after sending
- If less than 60 seconds since last record, return null (don't send)

Please write the complete function with:
- Robust initialization (check and set defaults)
- Debug logging (buffer size, time until next record)
- Comments explaining each section
- Edge case handling (empty buffer, first run)
```

### 10. Common Pitfalls to Avoid
1. **Assuming LLM knows your flow structure** - Always describe node connections
2. **Not specifying Node-RED context** - LLM might give generic JavaScript instead
3. **Requesting too many changes at once** - Break into single-phase requests
4. **Forgetting to mention global variable persistence** - Specify initialization needs
5. **Not asking for defensive code** - Request null checks and type validation
6. **Vague success criteria** - Define exactly what "working" means

---

## Quick Reference: Key Code Snippets

### 1. Init Node (Run on Deploy)
```javascript
// Initialize Global Variables - Inject Once on Deploy
node.warn('[INIT] Initializing global variables');

if (!global.get("kpiBuffer")) global.set("kpiBuffer", []);
if (!global.get("lastKPIRecordTime")) global.set("lastKPIRecordTime", Date.now() - 60000);
if (!global.get("lastMachineCycleTime")) global.set("lastMachineCycleTime", Date.now());
if (!global.get("lastKPIValues")) global.set("lastKPIValues", {});

node.warn('[INIT] Complete');
return msg;
```

### 2. Machine Cycles - Add to Final Return
```javascript
// Update last machine cycle time when a successful cycle occurs
if (trackingEnabled && dbMsg) {
    global.set("lastMachineCycleTime", Date.now());
}
return [dbMsg, stateMsg, kpiTrigger];
```

### 3. Calculate KPIs - Multi-Source Guard
```javascript
const trackingEnabled = global.get("trackingEnabled");
const activeOrder = global.get("activeOrder") || {};
if (!trackingEnabled || !activeOrder.id) return null;
// ... rest of calculation
```

### 4. Work Order START Button - Clear Buffer
```javascript
if (action === "start-tracking") {
    global.set("trackingEnabled", true);
    global.set("kpiBuffer", []); // Clear stale data
    global.set("lastKPIRecordTime", Date.now() - 60000);
    // ... send state update
}
```

### 5. Graphs Template - Combined Init
```javascript
let chartsInitialized = false;

scope.$watch('msg', function(msg) {
  if (msg && msg.payload && msg.payload.kpis && !chartsInitialized) {
    initFilters();
    createCharts(currentRange);
    chartsInitialized = true;
  }
  if (chartsInitialized && msg && msg.payload && msg.payload.kpis) {
    updateCharts(msg);
  }
});

setTimeout(() => {
  if (!chartsInitialized) {
    initFilters();
    createCharts(currentRange);
    chartsInitialized = true;
  }
}, 5000);
```

---

## Final Notes

1. **Backup First:** Always backup `flows.json` before starting each phase
2. **Test Incrementally:** Don't skip testing between phases
3. **Document Changes:** Note any deviations from plan
4. **Monitor Logs:** Watch Node-RED debug output during testing
5. **Clear Cache:** Browser cache can mask issues
6. **Use LLM Strategically:** Follow the best practices above for precise, working code

**If you encounter issues not covered in this plan, STOP and ask for help before proceeding.**