This commit is contained in:
Marcelo
2025-11-28 09:11:59 -06:00
commit b66cb97f16
34 changed files with 17756 additions and 0 deletions

AVAILABILITY_ZERO_FIX.md Normal file

@@ -0,0 +1,300 @@
# Fix: Availability Shows 0% After Production Start
## Date: November 27, 2025
---
## 🔍 PROBLEM DESCRIPTION
**Symptom:** When starting production via the START button, Availability and OEE immediately show **0%** and only become non-zero after the first scrap prompt or after several machine cycles.
**User Impact:** Dashboard shows misleading KPIs at production start, making it appear the machine is offline when it's actually preparing to run.
---
## 📊 ROOT CAUSE ANALYSIS
### Timeline of Events
1. **User clicks START button**
   - `global.set("trackingEnabled", true)`
   - `global.set("productionStartTime", Date.now())`
   - `global.set("operatingTime", 0)`
   - `global.set("lastMachineCycleTime", Date.now())` ✓ (via Init node)
2. **Calculate KPIs runs immediately** (triggered by START action)
   - `trackingEnabled = true`
   - `productionStartTime = <timestamp>`
   - `operatingTime = 0` ❌ (no cycles yet)
   - `timeSinceLastCycle = 0`
3. **Availability calculation logic check:**
```javascript
if (!trackingEnabled || timeSinceLastCycle > BRIEF_PAUSE_THRESHOLD) {
    // NOT this branch (trackingEnabled=true, timeSince=0)
} else if (trackingEnabled && productionStartTime && operatingTime > 0) {
    // ❌ FAILS HERE - operatingTime is 0!
    // Normal calculation: (operatingTime / elapsedSec) * 100
} else {
    // ✓ FALLS TO HERE
    // Uses: prevKPIs.availability || 0
    // Result: 0% (since lastKPIValues was cleared or is null)
}
```
4. **Result:** Availability = 0%, OEE = 0%
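The timeline above can be reproduced outside Node-RED with a minimal sketch. The function name `availabilityBeforeFix`, the shape of the `state` object, and the 5-minute threshold expressed in milliseconds are assumptions made for illustration; the real function node presumably reads these values from global context rather than from an argument.

```javascript
// Assumption: the brief-pause threshold is 5 minutes, expressed in milliseconds.
const BRIEF_PAUSE_THRESHOLD = 5 * 60 * 1000;

// Hypothetical pure-function version of the original three-branch logic.
// operatingTime is assumed to be in seconds, timestamps in milliseconds.
function availabilityBeforeFix(state, now) {
    const { trackingEnabled, productionStartTime, operatingTime,
            lastMachineCycleTime, prevAvailability } = state;
    const timeSinceLastCycle = now - lastMachineCycleTime;

    if (!trackingEnabled || timeSinceLastCycle > BRIEF_PAUSE_THRESHOLD) {
        return 0; // Branch 1: legitimately stopped
    } else if (trackingEnabled && productionStartTime && operatingTime > 0) {
        const elapsedSec = (now - productionStartTime) / 1000;
        return (operatingTime / elapsedSec) * 100; // Branch 2: normal calculation
    } else {
        return prevAvailability || 0; // Branch 3: fallback to previous value
    }
}

// State immediately after START: no cycles yet and no previous KPI values.
const t0 = 1700000000000; // arbitrary fixed timestamp (ms) so the run is repeatable
const justStarted = {
    trackingEnabled: true,
    productionStartTime: t0,
    operatingTime: 0,
    lastMachineCycleTime: t0,
    prevAvailability: null,
};

const firstRun = availabilityBeforeFix(justStarted, t0);
console.log(firstRun); // 0
```

With `operatingTime` at 0 the second branch is skipped, and the fallback has no previous value to return, so the first KPI run reports 0.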
### Why Theories Were Correct
✅ **Theory 1: First KPI Run Before Valid Cycle**
- Calculate KPIs executes immediately after START
- No machine cycles have occurred yet
- `operatingTime = 0` fails the `operatingTime > 0` check (the check on line 22)
✅ **Theory 2: timeSinceLastCycle Logic**
- Not the cause in this case (`timeSinceLastCycle = 0` at start)
- But it could become one if `lastMachineCycleTime` were stale from a previous run
- Our Init node prevents this by setting it to `Date.now()`
✅ **Theory 3: Manual Seeding Overwritten**
- Correct - any manual `operatingTime` value would be replaced by first cycle
- But the real issue is the check `operatingTime > 0` preventing calculation
---
## 🎯 THE FIX
### Strategy: Optimistic Availability on Production Start
**Principle:** When production has just started and no cycles have occurred yet, assume **100% availability** until real data proves otherwise.
**Reasoning:**
- Machine was just told to START - assume it's ready
- First cycle will provide real data within seconds
- Better UX: Show 100% → real value, rather than 0% → real value
- Avoids false alarm of "machine offline"
### Code Changes
**Location:** Calculate KPIs function, Availability calculation section
**Before (3 branches):**
```javascript
if (!trackingEnabled || timeSinceLastCycle > BRIEF_PAUSE_THRESHOLD) {
    // Branch 1: Legitimately stopped
    msg.kpis.availability = 0;
} else if (trackingEnabled && productionStartTime && operatingTime > 0) {
    // Branch 2: Normal calculation
    availability = (operatingTime / elapsedSec) * 100;
} else {
    // Branch 3: Brief pause fallback
    availability = prevKPIs.availability || 0; // ❌ Returns 0 on first run!
}
```
**After (4 branches):**
```javascript
if (!trackingEnabled || timeSinceLastCycle > BRIEF_PAUSE_THRESHOLD) {
    // Branch 1: Legitimately stopped
    msg.kpis.availability = 0;
} else if (trackingEnabled && productionStartTime && operatingTime > 0) {
    // Branch 2: Normal calculation (has real cycle data)
    availability = (operatingTime / elapsedSec) * 100;
} else if (trackingEnabled && productionStartTime) {
    // Branch 3: NEW - Production just started, no cycles yet
    msg.kpis.availability = 100; // ✅ Optimistic!
    node.warn('[Availability] Production starting - showing 100% until first cycle');
} else {
    // Branch 4: Brief pause fallback
    availability = prevKPIs.availability || 0;
}
```
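The branch order can be exercised in isolation with a small sketch. This is not the function node's actual code: `calcAvailability`, the `state` object, the millisecond threshold, and the sample values are hypothetical stand-ins for the global-context values, and the sketch returns the result instead of writing to `msg.kpis`.

```javascript
// Assumption: the brief-pause threshold is 5 minutes, expressed in milliseconds.
const BRIEF_PAUSE_THRESHOLD = 5 * 60 * 1000;

// Hypothetical pure-function version of the four-branch logic.
// operatingTime is assumed to be in seconds, timestamps in milliseconds.
function calcAvailability(state, now) {
    const { trackingEnabled, productionStartTime, operatingTime,
            lastMachineCycleTime, prevAvailability } = state;
    const timeSinceLastCycle = now - lastMachineCycleTime;

    if (!trackingEnabled || timeSinceLastCycle > BRIEF_PAUSE_THRESHOLD) {
        return 0; // Branch 1: legitimately stopped
    }
    if (productionStartTime && operatingTime > 0) {
        const elapsedSec = (now - productionStartTime) / 1000;
        return (operatingTime / elapsedSec) * 100; // Branch 2: real cycle data
    }
    if (productionStartTime) {
        return 100; // Branch 3: just started, no cycles yet (optimistic)
    }
    return prevAvailability || 0; // Branch 4: fallback to previous value
}

const t0 = 1700000000000; // arbitrary fixed timestamp (ms)
const base = {
    trackingEnabled: true,
    productionStartTime: t0,
    operatingTime: 0,
    lastMachineCycleTime: t0,
    prevAvailability: null,
};

// Immediately after START: Branch 3 applies.
const atStart = calcAvailability(base, t0);

// One minute in, 45 s of operating time, last cycle just now: Branch 2 applies.
const running = calcAvailability(
    { ...base, operatingTime: 45, lastMachineCycleTime: t0 + 60000 },
    t0 + 60000
);

// Tracking switched off: Branch 1 applies regardless of the other values.
const stopped = calcAvailability({ ...base, trackingEnabled: false }, t0);

console.log(atStart, running, stopped); // 100 75 0
```

Because Branch 2 is tested before Branch 3, the optimistic value can only appear while `operatingTime` is still 0; the first recorded cycle hands control back to the real calculation.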
### Logic Flow Chart
```
START clicked
    ↓
trackingEnabled = true
productionStartTime = now
operatingTime = 0
    ↓
Calculate KPIs runs
    ↓
Check: trackingEnabled? YES
Check: timeSinceLastCycle > 5min? NO
Check: operatingTime > 0? NO ←── KEY CHECK
    ↓
NEW BRANCH: trackingEnabled && productionStartTime?
    ↓ YES
Availability = 100% (optimistic) ✅
    ↓
Display on dashboard: Availability shows 100%; OEE follows quality/performance
    ↓
First machine cycle occurs (within 1-3 seconds)
    ↓
operatingTime becomes > 0
    ↓
Next KPI calculation uses REAL data
    ↓
Availability = (operatingTime / elapsedSec) * 100 ✅
```
---
## ✅ EXPECTED BEHAVIOR AFTER FIX
### Before Fix
```
User clicks START
Dashboard immediately shows:
Availability: 0%
OEE: 0%
Wait 3-5 seconds for first cycle...
Dashboard updates:
Availability: 95%
OEE: 85%
```
**Problem:** False alarm - looks like machine is offline
### After Fix
```
User clicks START
Dashboard immediately shows:
Availability: 100% ← Optimistic assumption
OEE: 90-100% ← Based on quality/performance
First cycle occurs (1-3 seconds)
Dashboard updates with REAL data:
Availability: 95% ← Actual calculated value
OEE: 85% ← Based on real performance
```
**Improvement:** Smooth transition, no false "offline" alarm
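The reason OEE follows Availability to 0% can be seen from the standard OEE definition (Availability × Performance × Quality). The sketch below assumes the flow combines the three KPIs with that standard product; the helper `oee` and the sample percentages are illustrative and not taken from the flow.

```javascript
// Standard OEE definition, with all three inputs and the result in percent.
// Assumption: the flow computes OEE as this plain product.
function oee(availability, performance, quality) {
    return (availability * performance * quality) / 10000;
}

// Before the fix: a 0% availability zeroes OEE no matter how good the rest is.
const oeeBefore = oee(0, 90, 100);

// After the fix: the optimistic 100% lets performance and quality set OEE.
const oeeAfter = oee(100, 90, 100);

console.log(oeeBefore, oeeAfter); // 0 90
```

This is why the "After Fix" dashboard shows OEE in a range rather than a fixed 100%: the starting value is whatever performance and quality allow.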
---
## 🧪 TESTING INSTRUCTIONS
### Test 1: Fresh Production Start
1. Ensure no work order is active
2. Start a new work order
3. Click START button
4. **Expected:** Availability immediately shows 100%
5. Wait for first machine cycle (1-3 seconds)
6. **Expected:** Availability updates to real calculated value
### Test 2: Monitor Debug Logs
1. Open Node-RED debug panel
2. Click START
3. **Expected to see:**
```
[START] Cleared kpiBuffer for fresh production run
[Availability] Production starting - showing 100% until first cycle
```
4. After first cycle:
```
AVAILABILITY CHECK ➤
trackingEnabled: true
operatingTime: <some value > 0>
```
### Test 3: Verify Actual Calculation Takes Over
1. Start production
2. Let machine run for 10-20 cycles
3. **Expected:** Availability should reflect real performance (likely 85-98%)
4. Submit scrap
5. **Expected:** Availability should NOT drop to 0% (brief pause logic)
### Test 4: Stop Detection Still Works
1. Start production
2. Let run for 1 minute
3. Click STOP (or let trackingEnabled become false)
4. Wait 5+ minutes
5. **Expected:** Availability drops to 0% (legitimate stop)
---
## 📝 ALTERNATIVE APPROACHES CONSIDERED
### Option 1: Seed operatingTime = 0.001
**Rejected:** Gets overwritten by first cycle calculation
### Option 2: Delay Calculate KPIs until first cycle
**Rejected:** Requires complex flow rewiring, delays all KPI visibility
### Option 3: Show "N/A" or "--" instead of 0%
**Rejected:** Requires UI changes, doesn't solve the core logic issue
### Option 4: Use 50% as starting value
**Rejected:** The value is arbitrary; 100% is a clearer optimistic default
### Option 5 (CHOSEN): Add dedicated branch for "just started" state
**✅ Accepted:**
- Minimal code change (one extra `else if`)
- Clear logic separation
- No impact on existing behavior
- Easy to understand and maintain
---
## 🔒 SAFETY CHECKS
### What Could Go Wrong?
**Q:** What if machine actually can't start (offline, error)?
**A:** First cycle will never occur, but `timeSinceLastCycle` will eventually exceed 5 minutes, triggering the "long pause" logic that sets availability to 0%.
**Q:** What if operatingTime never increases?
**A:** Same as above - after 5 minutes, availability will correctly drop to 0%.
**Q:** Does this affect quality or performance KPIs?
**A:** No - they have separate calculation logic. Quality = good/total, Performance = cycles/target.
**Q:** What if user clicks START/STOP repeatedly?
**A:** Each START resets `productionStartTime` and `operatingTime`, so the optimistic 100% will show each time until cycles prove otherwise. This is correct behavior.
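The first two answers can be checked with a short simulation of a machine that never produces a cycle. As before, `calcAvailability`, the `state` shape, and the millisecond threshold are hypothetical; the sketch also assumes `lastMachineCycleTime` stays at the START timestamp until a cycle arrives.

```javascript
// Assumption: the brief-pause threshold is 5 minutes, expressed in milliseconds.
const BRIEF_PAUSE_THRESHOLD = 5 * 60 * 1000;

// Hypothetical pure-function version of the four-branch logic.
function calcAvailability(state, now) {
    const timeSinceLastCycle = now - state.lastMachineCycleTime;
    if (!state.trackingEnabled || timeSinceLastCycle > BRIEF_PAUSE_THRESHOLD) {
        return 0; // stopped, or idle longer than the threshold
    }
    if (state.productionStartTime && state.operatingTime > 0) {
        const elapsedSec = (now - state.productionStartTime) / 1000;
        return (state.operatingTime / elapsedSec) * 100;
    }
    if (state.productionStartTime) {
        return 100; // just started, no cycles yet
    }
    return state.prevAvailability || 0;
}

// START was clicked at t0, but the machine never cycles.
const t0 = 1700000000000; // arbitrary fixed timestamp (ms)
const stalled = {
    trackingEnabled: true,
    productionStartTime: t0,
    operatingTime: 0,
    lastMachineCycleTime: t0, // never advances because no cycle occurs
    prevAvailability: null,
};

// Sample the KPI at START, after 1 minute, at exactly 5 minutes, and just past it.
const offsetsMs = [0, 60000, 300000, 300001];
const timeline = offsetsMs.map((offset) => calcAvailability(stalled, t0 + offset));

console.log(timeline); // [ 100, 100, 100, 0 ]
```

Under these assumptions the optimistic value is bounded: it lasts at most one threshold period before the stop check in the first branch takes over.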
---
## 🔄 ROLLBACK INSTRUCTIONS
If issues occur:
```bash
cd /home/mdares/.node-red/projects/Plastico/
cp flows.json.backup_20251127_124628 flows.json
# Restart Node-RED
```
Or manually revert the Calculate KPIs function:
- Remove the new `else if (trackingEnabled && productionStartTime)` branch
- Restore the original 3-branch logic
---
## 📊 METRICS TO MONITOR
After deployment, monitor:
- **Time to first real availability value** (should be 1-3 seconds)
- **False 0% occurrences** (should be eliminated)
- **Long pause detection** (should still work after 5+ min idle)
- **User feedback** on perceived responsiveness
---
## FILES MODIFIED
- `/home/mdares/.node-red/projects/Plastico/flows.json`
- Calculate KPIs function node (ID: `00b6132848964bd9`)
- Added a fourth logic branch (the new Branch 3) for the production-start state
---
**Status: FIX COMPLETE ✅**
**Risk Level: LOW** (Isolated change, all existing branches preserved)
**Deployment: READY**