Architecture for a Probabilistic Risk Modeling Platform

This post outlines a platform architecture designed to model the impact of a hybrid risk registry (qualitative and quantitative risks) on an oil company’s key financial KPIs like EBITDA and Cash Flow on a monthly basis. The design emphasizes modularity, auditability, and the integration of expert judgment with stochastic simulation.

1. Core Principles & Objectives

- Single Source of Truth: Establish a centralized, versioned Risk Registry for all identified risks.
- Hybrid Modeling: Natively support both quantitative risks (modeled with probability distributions) and qualitative risks (modeled with structured expert judgment).
- Financial Integration: Directly link risk events to a baseline financial plan (P&L, Cash Flow statement) to quantify impact.
- Probabilistic Output: Move beyond single-point estimates to deliver a distribution of potential outcomes (e.g., P10/P50/P90 EBITDA).
- Auditability & Reproducibility: Ensure every simulation run is traceable to a specific version of the risk registry, assumptions, and financial baseline.
- User-Centric Workflow: Provide intuitive interfaces for risk owners to provide input without needing to be simulation experts.

2. High-Level Architecture

The platform is designed as a set of modular services that interact through well-defined APIs and a shared data layer. ...
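To make the "Probabilistic Output" objective concrete, here is a minimal Monte Carlo sketch. The risk names, occurrence probabilities, and lognormal impact parameters below are invented for illustration; they are not the platform's actual risk registry or simulation engine.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed so the run is reproducible/auditable
N = 10_000                            # number of simulation trials

baseline_ebitda = 120.0  # hypothetical monthly baseline EBITDA, $MM

# Hypothetical quantitative risks: monthly occurrence probability plus a
# lognormal impact distribution ($MM) if the event occurs.
risks = [
    {"name": "compressor_outage", "p": 0.10, "mu": np.log(5.0),  "sigma": 0.5},
    {"name": "price_shock",       "p": 0.05, "mu": np.log(15.0), "sigma": 0.4},
]

total_impact = np.zeros(N)
for r in risks:
    occurs = rng.random(N) < r["p"]                   # Bernoulli occurrence per trial
    severity = rng.lognormal(r["mu"], r["sigma"], N)  # impact size when it occurs
    total_impact += np.where(occurs, severity, 0.0)

ebitda = baseline_ebitda - total_impact
# Note: P10/P50/P90 are taken here as plain percentiles; some oil & gas
# conventions define them as exceedance probabilities instead.
p10, p50, p90 = np.percentile(ebitda, [10, 50, 90])
print(f"EBITDA: P10={p10:.1f}  P50={p50:.1f}  P90={p90:.1f} ($MM)")
```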

2025-09-08 · 6 min · rokorolev

Petroleum Analytics Platform Architecture (2015–2018)

This post captures the practical architecture (2015–2018) that supported the upstream evaluation & stochastic modeling framework outlined in the related post: Upstream Asset Evaluation Framework. It maps legacy design choices to modern terminology and highlights the constraints that shaped modeling workflows.

1. Core Principles

- On‑prem / hybrid HPC + Hadoop (YARN) cluster for heavy simulation; limited early cloud (select AWS EC2/EMR/S3; occasional Azure VM/Blob).
- No unified “lakehouse” yet: layered zones → Raw (HDFS/S3) → Curated (Hive/Parquet/ORC) → Marts (Hive/Impala/Presto).
- Limited containers/Kubernetes; batch schedulers dominated (Oozie, early Airflow pilot, Control‑M, Cron).
- Governance largely manual: Hive Metastore + ad hoc catalog (Excel / SharePoint / SQL).

2. Data Ingestion

| Source Type | Examples | Mechanism | Notes |
| --- | --- | --- | --- |
| Geoscience | LAS, SEG-Y | Batch file drop + ETL parse | Large binary + metadata extraction |
| Well / Ops | WITSML feeds | Batch pull / scheduled parse | Standardization step into Hive |
| ERP / Finance | CSV / RDBMS exports | Sqoop (RDBMS→HDFS), SSIS, Python/.NET ETL | Controlled nightly cadence |
| SCADA / Events | Downtime logs | Kafka 0.8/0.9 (where deployed) or Flume/Logstash | Early streaming footprint |
| Market / Pricing | Excel price decks | Staged in SQL then approved to config tables | Manual approval workflow |

Workflow orchestration: Oozie XML workflows early; selective Airflow DAGs (late 2017–2018) for transparency and dependency visualization (a sketch follows below). ...
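To give a flavor of the Airflow pilot mentioned above, here is a minimal sketch of a nightly ingestion DAG in the Airflow 1.x style of that era. The DAG id, connection string, table name, and paths are hypothetical, not the actual production jobs.

```python
# Minimal sketch of a nightly ERP ingestion DAG (Airflow 1.x-era imports).
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator  # Airflow 1.x path

default_args = {
    "owner": "data-eng",
    "retries": 2,
    "retry_delay": timedelta(minutes=15),
}

with DAG(
    dag_id="erp_finance_nightly",          # hypothetical DAG id
    default_args=default_args,
    start_date=datetime(2018, 1, 1),
    schedule_interval="0 2 * * *",         # controlled nightly cadence
    catchup=False,
) as dag:

    # Sqoop pull from the ERP RDBMS into the HDFS raw zone (hypothetical
    # connection string, table, and target path).
    sqoop_import = BashOperator(
        task_id="sqoop_erp_to_hdfs",
        bash_command=(
            "sqoop import --connect jdbc:oracle:thin:@erp-db:1521/ERP "
            "--table GL_POSTINGS --target-dir /raw/erp/gl_postings"
        ),
    )

    # Promote parsed data from the raw zone to the curated Hive/Parquet zone.
    curate = BashOperator(
        task_id="curate_to_hive",
        bash_command="hive -f /etl/hql/curate_gl_postings.hql",
    )

    sqoop_import >> curate  # explicit dependency, visible in the DAG graph
```

Compared with the Oozie XML workflows, a DAG like this makes the nightly dependency chain visible at a glance, which is the transparency benefit the pilot was after.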

2025-09-04 · 5 min · rokorolev