feat: real ML/DL/GNN/Neo4j implementation - LSTM, XGBoost, IsolationForest, Arps, FedAvg, GAT, KnowledgeGraph #43
…n fixes

Key changes across Go/Rust/Python/TypeScript:

Security Hardening:
- Remove hardcoded APISIX admin key; require APISIX_ADMIN_KEY env var
- Remove hardcoded Stripe test key; require STRIPE_SECRET_KEY env var
- Implement real RS256 JWT cryptographic signature verification (Keycloak)
- Wire Permify bulkCheck to call real API instead of always simulating

Resilience Patterns (circuit breaker + retry everywhere):
- Go: circuit breaker, exponential-backoff retry, resilient HTTP client
- Rust: circuit breaker (CLOSED/OPEN/HALF_OPEN) + exponential backoff on edge-agent uploader
- Python: CircuitBreaker class + with_retry() async + ResilientHTTPClient
- TypeScript: circuit breaker, retry with jitter, ServiceClient combining both
- Dapr client wired with retry + circuit breaker via resilience package

Production SDK Integrations (replacing stubs):
- TigerBeetle: real Go SDK calls for account creation, transfers, balance lookups
- InfluxDB: real HTTP API v2 writer + Flux query execution (replacing mock data)
- Kafka: franz-go consumer in alarm-manager (replacing polling simulation)
- Temporal: real workflow execution for alarm escalation with signal-based ack
- Mojaloop: transfer execution, party lookup, FSPIOP error parsing
- OpenAppSec: completely new WAF management client
- OpenSearch: new application-level client (Go + TypeScript)

Infrastructure:
- gRPC server/client with mTLS, keep-alive, auto-retry interceptors
- Graceful shutdown for HTTP + gRPC servers in middleware main
- 29 missing PostgreSQL tables (infra/postgres/02-missing-tables.sql)

Integration Tests:
- Telemetry ingestion pipeline (single, batch, invalid payload, rate limiting)
- Alarm escalation flow (rule creation, threshold breach, acknowledgement)
- Financial settlement (production recording, idempotency, royalty distribution)
- Authorization enforcement (Permify check, JWT required, bulk check)

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
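The Python resilience pattern named in this commit (a CircuitBreaker class plus an async with_retry()) could look roughly like the sketch below. It is not the code from this PR: the constructor parameters, the CircuitOpenError exception, and the injectable clock and sleep arguments are assumptions made for illustration, and only the standard library is used.

```python
import asyncio
import random
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open (hypothetical name)."""


class CircuitBreaker:
    """Three-state breaker: CLOSED -> OPEN -> HALF_OPEN -> CLOSED."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the breaker is CLOSED

    @property
    def state(self):
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "HALF_OPEN"  # timeout elapsed: allow a probe call
        return "OPEN"

    async def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            raise CircuitOpenError("circuit is open; call rejected")
        try:
            result = await func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed HALF_OPEN probe, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


async def with_retry(func, *, attempts=3, base_delay=0.1, max_delay=2.0, sleep=asyncio.sleep):
    """Retry an async callable with exponential backoff and full jitter."""
    for attempt in range(attempts):
        try:
            return await func()
        except CircuitOpenError:
            raise  # retrying against an open circuit would defeat its purpose
        except Exception:
            if attempt == attempts - 1:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            await sleep(random.uniform(0, delay))
```

A caller would typically combine the two, for example `await with_retry(lambda: breaker.call(fetch_status))`, where `fetch_status` is a placeholder for any async request function.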
- Remove explicit pnpm version from CI workflows (use packageManager from package.json)
- Pin wouter to 3.7.1 to match patchedDependencies
- Generate pnpm-lock.yaml for frozen-lockfile installs and Docker builds
- Move --extra-index-url to own line in ml-service requirements.txt

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
- Fix Stripe apiVersion to match installed SDK (2026-04-22.dahlia)
- Fix opensearchClient.ts type annotations for authHeader and fetch headers
- Copy patches/ dir in Dockerfile.ui before pnpm install

Co-Authored-By: Patrick Munis <pmunis@gmail.com>

- Fix sand_onset completion_factor: use additive bonus after floor (GravelPack now correctly raises CDP)
- Fix coupled solver test: raise reservoir_pressure so well can overcome hydrostatic head
- Add Redis service containers to both CI workflows for redis.test.ts
- Add db:push step to ci-v43.yml before running Vitest tests

Co-Authored-By: Patrick Munis <pmunis@gmail.com>

- vitest.config.ts: use process.env fallback so CI POSTGRES_URL takes precedence
- stripeBilling.ts: use placeholder key when STRIPE_SECRET_KEY is unset
- payments.ts: use placeholder key instead of throwing at module load time

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
…oduction behavior
- dataExport: real DB queries, no synthetic generators
- demandResponse: removed simulatedPrograms/Events/Vens helpers, throw on VTN unavailable
- fledge: real FledgePower service calls, no simulated protocol data
- lakehouse: real RTDIP API calls + datafusion/duckdb/iceberg/sedona endpoints
- streaming: real Kafka Admin API, no hardcoded topics
- openstef: real OpenSTEF service calls, throw on unavailable
- grafana: proper auth + error handling
- historian: real InfluxDB, throw on unavailable
- workflows: real Temporal integration, throw on unavailable
- platform: real DB only, no mock data
- nvdCve: protectedProcedure auth
- piConnector: protectedProcedure auth
- influxBenchmark: protectedProcedure auth
- authz: throw on Permify unavailable (no simulation)
- collaboration: protectedProcedure auth guards

Co-Authored-By: Patrick Munis <pmunis@gmail.com>

…oss 8 routers
- domain.ts, silCertification.ts, shiftHandover.ts, productionOptimization.ts
- financials.ts, deviceManagement.ts, wells.ts, permitToWork.ts
- ~100+ endpoints now require authentication
- Fixed import syntax errors from bulk replacement

Co-Authored-By: Patrick Munis <pmunis@gmail.com>

…unavailable
- kafkaClient: removed placeholder references
- temporal: throw TRPCError on Temporal unavailable
- tigerBeetleClient: throw on Go worker unavailable
- piConnector: removed all generateSimulated* functions and simulated data
- routers.ts: removed non-existent lakehouseExtRouter import

Co-Authored-By: Patrick Munis <pmunis@gmail.com>

- DataExport: severity string→number mapping
- Infrastructure: use fledge.protocols, authz.check, remove tagMetrics/switchTagProtocol
- Lakehouse: getTags→tags, queryResample→resample (resolution param), getLatest→latestValues, lakehouseExt→lakehouse
- TemporalWorkflows: remove .simulated property check

Co-Authored-By: Patrick Munis <pmunis@gmail.com>

- Kafka: simulatedConsumer/Producer → unavailableConsumer/Producer (returns errors)
- Temporal: simulatedWorker → unavailableWorker (returns errors)
- TigerBeetle: simulatedClient → unavailableClient (returns errors)
- main.go: use New*Unavailable* functions instead of New*Simulated*

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
…aults
- POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-ogrmm_secret}
- INFLUXDB_PASSWORD: ${INFLUXDB_PASSWORD:-ogrmm_influx_secret}
Co-Authored-By: Patrick Munis <pmunis@gmail.com>
… paths
- v12.middleware: test fail-loud errors instead of simulated responses
- v55.production: temporal mode accepts 'not_configured', dataExport handles DB unavailable

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
Phase 1 - Critical Foundations:
- Add 130+ database indexes across all 98 tables (FK, timestamp, status, composite)
- Add soft delete (deletedAt) to 15 business-critical tables with partial indexes
- Add Pino structured logging with service context and ISO timestamps
- Add CORS middleware with production allowlist and development passthrough
- Enhance graceful shutdown with DB pool closure
- Tune DB connection pool (configurable via env vars)

Phase 2 - High Impact:
- Add Sentry error monitoring integration (TypeScript + Python FastAPI)
- Add x-request-id correlation ID middleware with UUID generation
- Add idempotency key middleware for mutation safety (Postgres-backed, 24h TTL)

Phase 3 - Quality Assurance:
- Add cursor-based pagination to data quality violations endpoint
- Add DB transaction helper utility (withTransaction wrapper)
- Add feature flags router (CRUD + per-tenant targeting + percentage rollout)
- Add data quality rules and violations router (telemetry validation)

Phase 4 - Critical for Production:
- Remove remaining simulation fallbacks (openstef, domain ML, SSE)
- Add Kafka DLQ with retry+exponential backoff (Go consumer)
- Add per-endpoint rate limiting (AI/ML: 30/min, exports: 10/min)
- Add WebSocket authentication (session cookie verification in production)
- Add multi-tenant isolation helper (tenantFilter utility)

Phase 5 - Competitive Advantages:
- Add OpenTelemetry auto-instrumentation (TypeScript NodeSDK + Python OTEL)
- Add feature flags system (DB-backed, admin CRUD, percentage rollout)
- Add automated data quality checks (rules engine + violation tracking)
- Add backup/DR script (PostgreSQL + Redis → S3)
- Add Grafana dashboard provisioning (API latency, errors, DB, cache, Kafka)
- Add k6 load test scripts (smoke/load/stress scenarios)
- Add migration rollback script (0022 down migration)

Database: Migration 0022 with indexes, soft delete, idempotency_keys, feature_flags, data_quality_rules, data_quality_violations tables

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
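The feature flag items above combine per-tenant targeting with a percentage rollout. One common way to make a percentage rollout stable across requests is to hash the flag key together with the tenant ID into a fixed bucket. The sketch below assumes that approach; the PR does not say how its router decides, and the FeatureFlag fields and function names here are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """Hypothetical flag record; field names are illustrative, not the PR's schema."""
    key: str
    enabled: bool = True
    rollout_percent: int = 0                            # 0..100
    allowed_tenants: set = field(default_factory=set)   # explicit per-tenant targeting


def bucket(flag_key: str, tenant_id: str) -> int:
    """Map (flag, tenant) to a stable bucket in [0, 100)."""
    digest = hashlib.sha256(f"{flag_key}:{tenant_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(flag: FeatureFlag, tenant_id: str) -> bool:
    """Targeted tenants always pass; everyone else passes if their bucket is in range."""
    if not flag.enabled:
        return False
    if tenant_id in flag.allowed_tenants:
        return True
    return bucket(flag.key, tenant_id) < flag.rollout_percent
```

Because the bucket depends only on the flag key and tenant ID, raising rollout_percent from 30 to 60 keeps every tenant that was already enabled and only adds new ones.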
…orest, Arps, FedAvg, GAT, KnowledgeGraph

ESP Failure Predictor:
- Real 2-layer LSTM encoder (PyTorch) → XGBoost classifier ensemble
- Trained on synthetic ESP telemetry with degradation patterns
- Model persistence (.pt + .joblib)

Anomaly Detector:
- Real sklearn IsolationForest (200 estimators, 5% contamination)
- StandardScaler normalization, trained on synthetic normal data
- Replaces z-score stub

Decline Forecaster:
- Real Arps hyperbolic curve fitting via scipy.optimize.curve_fit
- Nonlinear least squares for qi, Di, b parameter estimation
- Monte Carlo P10/P50/P90 probabilistic forecast

Federated Learning:
- Real FedAvg/FedProx gradient aggregation (NumPy)
- Differential privacy via Gaussian mechanism
- Multi-tenant local training with weight averaging

GNN Well-Network:
- Graph Attention Network (2-layer, 4-head)
- Failure cascade prediction through equipment graph
- Critical node identification (betweenness + GNN)

Neo4j Knowledge Graph:
- Property graph for equipment relationships
- NetworkX fallback when Neo4j unavailable
- Failure cascade, root cause, dependency analysis
- 146 nodes, 165 edges in sample field topology

OpenSTEF:
- XGBoost model persistence to disk via joblib

TypeScript:
- aiAdvanced router wired to Python ML service
- GNN cascade, critical nodes, graph stats endpoints
- Federated round execution with real aggregation

All models: CPU inference, <200ms latency, no GPU required.

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
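To make the Federated Learning items concrete, here is a minimal sketch of one FedAvg aggregation round with Gaussian-mechanism noise. It is not the PR's implementation: the commit says the real code uses NumPy, while this version uses plain Python lists so it runs without dependencies. The L2 clipping step, the function names, and the parameters (max_norm, noise_std) are assumptions; clipping is included because the Gaussian mechanism normally needs a bounded per-client contribution.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [x * scale for x in update]


def fed_avg(client_updates, client_sizes, max_norm=1.0, noise_std=0.0, rng=None):
    """Sample-weighted average of clipped client updates, plus optional Gaussian noise."""
    if not client_updates or len(client_updates) != len(client_sizes):
        raise ValueError("need one sample count per client update")
    rng = rng or random.Random()
    total = float(sum(client_sizes))
    aggregate = [0.0] * len(client_updates[0])
    for update, size in zip(client_updates, client_sizes):
        clipped = clip_update(update, max_norm)
        for i, value in enumerate(clipped):
            # Each tenant contributes in proportion to its local sample count.
            aggregate[i] += (size / total) * value
    if noise_std > 0.0:
        # Gaussian mechanism: independent zero-mean noise on every coordinate.
        aggregate = [a + rng.gauss(0.0, noise_std) for a in aggregate]
    return aggregate
```

For example, `fed_avg([[1.0, 0.0], [0.0, 1.0]], [3, 1], max_norm=10.0)` weights the first tenant three times as heavily as the second. Choosing noise_std for a target privacy budget is a separate calculation that this sketch does not attempt.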
import { eq, desc } from "drizzle-orm";
import { STRIPE_PRODUCTS } from "../stripe/products";

const stripeKey = process.env.STRIPE_SECRET_KEY || "sk_test_placeholder";

}));
import Stripe from "stripe";

const stripeKey = process.env.STRIPE_SECRET_KEY || "sk_test_placeholder";
scipy==1.14.1
# PINN Surrogate - Physics-Informed Neural Network
--extra-index-url https://download.pytorch.org/whl/cpu
torch==2.5.1+cpu

@@ -0,0 +1,16 @@
module github.com/og-rmm/middleware

networkx==3.3
# LSTM encoder (CPU-only build)
--extra-index-url https://download.pytorch.org/whl/cpu
torch==2.5.1+cpu
[[package]]
name = "rand"
version = "0.8.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "34af8d1a0e25924bc5b7c43c079c942339d8f0a8b57c39049bef581b46327404"
dependencies = [
 "libc",
 "rand_chacha 0.3.1",
 "rand_core 0.6.4",
]
🧪 ML Pipeline E2E Test Results
All 9 tests passed. Every model produces real ML output; no rule-based stubs remain.
Test Results Summary
How each test proves real ML (not stubs)
Minor observations (non-blocking)
…ering Co-Authored-By: Patrick Munis <pmunis@gmail.com>
…dark mode, breadcrumbs, mobile gestures
- Mobile bottom tab bar with 5 quick-access sections (Home, Wells, Alarms, AI/ML, More)
- Command palette (Ctrl+K / Cmd+K) to search and jump to any of 67 pages
- Dark/light theme toggle with ThemeProvider + full light theme CSS variables
- Live nav badges showing unacknowledged alarm and pending permit counts
- Breadcrumbs in desktop layout (Home > Group > Page)
- Pull-to-refresh gesture for mobile PWA
- Swipe-from-edge gesture to open/close sidebar on mobile
- Search input in sidebar header with keyboard shortcut hint
- Mobile search button in sticky top bar
- Responsive fixes: WellTests, Settings, WaterInjection, DamageAssessmentNew
- Mobile padding improvements across Historian, MudManagement, ProducedWater, GrafanaDashboards, AICopilot, Soc2, Sil
- Safe area inset support for iOS bottom nav

New files: MobileBottomNav, CommandPalette, ThemeProvider, PullToRefreshIndicator, usePullToRefresh, useSwipeGesture, useNavBadges

Modified: DashboardLayout (breadcrumbs, badges, gestures), main.tsx (ThemeProvider), index.css (light theme)

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
…trics, SW eviction, Redis feature store

Cache layer enhancements (server/cache.ts):
- Stampede protection: in-flight dedup prevents concurrent DB hits for same key
- Centralized cache key builder: cacheKey(router, procedure, params) with sorted params
- Cache hit/miss metrics: getCacheMetrics() with real hit rate tracking
- Pattern-based invalidation: cacheInvalidateRouter() using SCAN
- New TTLs for 12 additional router categories

Router caching (10 highest-traffic routers):
- materialsManagement (44 queries): material master list cached
- damageAssessment (39 queries): assessment list with filters cached
- domain (29 queries): calibration list + overdue cached
- trexm (18 queries): geomechanical models list cached
- osduMetadata (16 queries): OSDU well export cached
- lakehouse (12 queries): TWA query cached
- operations (11 queries): allocation rules list cached
- waterInjection (10 queries): produced water list cached
- shiftHandover: shift list cached
- deviceManagement: device list + fleet stats cached

HTTP Cache-Control headers:
- GET /api/trpc/*: public, max-age=30, stale-while-revalidate=60
- POST /api/trpc/*: no-store, no-cache, must-revalidate

Service Worker API cache eviction:
- Max 200 entries (FIFO eviction of oldest)
- 5-minute TTL based on Date header
- trimApiCache() runs after each API cache write

Cache admin router enhancements:
- getMetrics: real hit/miss/rate stats
- resetMetrics: counter reset
- invalidateRouter: pattern-based invalidation by router name
- invalidateWells/Alarms: now uses SCAN-based pattern cleanup

Python feature store → Redis:
- Redis client with lazy init + graceful fallback to in-memory
- Feature caching with configurable TTL (default 1h)
- put_esp_features() for storing computed features
- invalidate() with optional prefix for targeted cleanup
- redis==5.0.8 added to requirements.txt

Co-Authored-By: Patrick Munis <pmunis@gmail.com>
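The stampede protection and sorted-parameter key builder described for server/cache.ts can be illustrated with a small Python analogue. This is only a sketch of the idea, not a port of the TypeScript code: DedupCache, get_or_load, and the coalesced counter are hypothetical names, the store is an in-memory dict rather than Redis, and TTL handling is left out.

```python
import asyncio
import json


def cache_key(router: str, procedure: str, params: dict) -> str:
    """Build a deterministic key; sorting params makes their order irrelevant."""
    encoded = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return f"{router}:{procedure}:{encoded}"


class DedupCache:
    """In-memory cache with in-flight deduplication and hit/miss counters."""

    def __init__(self):
        self._values = {}
        self._in_flight = {}   # key -> asyncio.Task currently loading that key
        self.hits = 0
        self.misses = 0
        self.coalesced = 0     # callers that joined an in-flight load

    async def get_or_load(self, key, loader):
        if key in self._values:
            self.hits += 1
            return self._values[key]
        pending = self._in_flight.get(key)
        if pending is not None:
            # Another caller is already loading this key; share its result.
            self.coalesced += 1
            return await pending
        self.misses += 1
        task = asyncio.create_task(loader())
        self._in_flight[key] = task
        try:
            value = await task
            self._values[key] = value
            return value
        finally:
            self._in_flight.pop(key, None)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With this shape, five concurrent requests for the same key trigger a single loader call; the other four await the same task instead of each hitting the database.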
scikit-learn==1.5.1
xgboost==2.1.1
pydantic==2.8.2
python-dotenv==1.0.1
Test Results: Caching Layer + Navigation Enhancements
Tested by: Devin | Date: 2026-05-29
Ran the dev server locally against Postgres + Redis; tested the caching layer via tRPC API calls and the navigation UI via browser.
Cache Layer (7/7 passed)
Navigation Enhancements (3/3 passed)
CI Status: 27 passed, 1 failed (Trivy: pre-existing GitHub App issue, not related to this PR)



Summary
Real ML/DL/GNN/Neo4j implementation replacing all rule-based stubs with trained models, plus PWA navigation enhancements and a production-ready caching layer.
ML/DL Implementation
- python scripts/train_all.py --model all: trains and persists all models
PWA Navigation Enhancements
Production-Ready Caching Layer
- cacheKey(router, procedure, params) with sorted params
- getCacheMetrics() with real hit rate tracking
- cacheInvalidateRouter() using Redis SCAN
Dev Experience
- getLoginUrl() fallback when OAuth env vars missing
Type of Change
Checklist
- pnpm test passes (Vitest: 11/11)
- npx tsc --noEmit shows 0 errors
- … console.log stubs left in production paths
- … protectedProcedure or adminProcedure
Testing
- … npx tsc --noEmit)
Link to Devin session: https://app.devin.ai/sessions/435f7c350be0477b856f2d87f4c4a6cf