Petroleum Analytics Platform Architecture (2015–2018)

This post captures the practical architecture (2015–2018) that supported the upstream evaluation & stochastic modeling framework outlined in the related post: Upstream Asset Evaluation Framework. It maps legacy design choices to modern terminology and highlights constraints that shaped modeling workflows.

1. Core Principles

- On‑prem / hybrid HPC + Hadoop (YARN) cluster for heavy simulation; limited early cloud (select AWS EC2/EMR/S3; occasional Azure VM/Blob).
- No unified “lakehouse” yet: layered zones → Raw (HDFS/S3) → Curated (Hive/Parquet/ORC) → Marts (Hive/Impala/Presto).
- Limited containers/Kubernetes; batch schedulers dominated (Oozie, early Airflow pilot, Control‑M, Cron).
- Governance largely manual: Hive Metastore + ad hoc catalog (Excel / SharePoint / SQL).

2. Data Ingestion

| Source Type | Examples | Mechanism | Notes |
| --- | --- | --- | --- |
| Geoscience | LAS, SEG-Y | Batch file drop + ETL parse | Large binary + metadata extraction |
| Well / Ops | WITSML feeds | Batch pull / scheduled parse | Standardization step into Hive |
| ERP / Finance | CSV / RDBMS exports | Sqoop (RDBMS→HDFS), SSIS, Python/.NET ETL | Controlled nightly cadence |
| SCADA / Events | Downtime logs | Kafka 0.8/0.9 (where deployed) or Flume/Logstash | Early streaming footprint |
| Market / Pricing | Excel price decks | Staged in SQL then approved to config tables | Manual approval workflow |

Workflow orchestration: Oozie XML workflows early; selective Airflow DAGs (late 2017–2018) for transparency and dependency visualization. ...
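The dependency ordering that a batch scheduler resolves for the Raw → Curated → Marts flow can be sketched in a few lines of standard-library Python. This is only an illustration, not the Oozie or Airflow configuration used at the time: the task names and the `PIPELINE` mapping are hypothetical, and `graphlib` (Python 3.9+) stands in for the scheduler.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks it depends on,
# mirroring the Raw -> Curated -> Marts zones described above.
PIPELINE = {
    "land_raw_files": set(),
    "parse_to_curated": {"land_raw_files"},
    "export_finance_tables": set(),
    "build_marts": {"parse_to_curated", "export_finance_tables"},
}


def execution_order(dependencies):
    """Return one valid run order in which every task follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step, task in enumerate(execution_order(PIPELINE), start=1):
        print(step, task)
```

A circular dependency makes `static_order()` raise `graphlib.CycleError`, which is roughly the check a scheduler performs before accepting a workflow definition.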

2025-09-04 · 5 min · rokorolev

Upstream Asset Evaluation & Stochastic Economic Modeling Framework

This post reconstructs the evaluation, financial modeling, and decision analytics framework used when leading an upstream (oil & gas) analytics team (circa 2015–2018). It blends technical reservoir & production modeling with fiscal, stochastic, real‑options, and portfolio layers plus emerging carbon governance.

1. Checklist (Top-Level Components)

- Scope definition
- Technical (subsurface & production) models
- Commercial & fiscal models
- Market & price modeling
- Cost & economic models
- Real options layer
- Stochastic engine & correlations
- Portfolio aggregation
- Risk & sensitivity
- Carbon / ESG integration
- Data architecture & governance
- Validation & model risk management
- Implementation blueprint

2. Scope & Objectives

- Asset lifecycle: exploration → appraisal → development planning → execution → ramp-up → plateau → decline → abandonment.
- Decisions supported: license bidding, sanction (FID), phasing, drilling sequence, facility sizing, hedging, M&A, divestment, suspension, expansion, abandonment timing.
- Outputs: NPV (pre/post tax), IRR, payback, PI, EMV / ENPV, free cash flow profiles, value at risk (P10/P50/P90), option-adjusted value, carbon-adjusted value, capital efficiency, portfolio efficient frontier.

...
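To make two of the listed outputs concrete, here is a minimal sketch of NPV and a P10/P50/P90 spread from a toy Monte Carlo run. It is not the framework's stochastic engine: the single normally distributed price, the exponential-style decline, and every parameter name and value are assumptions chosen for illustration. Percentiles here are plain cumulative percentiles (P10 is the low case); many upstream teams use the exceedance convention, in which the labels are reversed.

```python
import random
import statistics


def npv(rate, cash_flows):
    """Discount annual cash flows; cash_flows[0] occurs at t = 0."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def simulate_npvs(n_trials, rate, capex, base_volume, decline,
                  opex_per_unit, price_mean, price_sd, years, seed=42):
    """Toy Monte Carlo: one uncertain flat price per trial (illustrative only)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        price = max(rng.gauss(price_mean, price_sd), 0.0)  # no negative prices
        flows = [-capex]  # upfront investment at t = 0
        for year in range(years):
            volume = base_volume * (1.0 - decline) ** year  # declining production
            flows.append(volume * (price - opex_per_unit))
        results.append(npv(rate, flows))
    return results


def p10_p50_p90(values):
    """Return the 10th, 50th and 90th cumulative percentiles."""
    deciles = statistics.quantiles(values, n=10)  # nine cut points
    return deciles[0], deciles[4], deciles[8]


if __name__ == "__main__":
    npvs = simulate_npvs(n_trials=5000, rate=0.10, capex=500.0,
                         base_volume=10.0, decline=0.12, opex_per_unit=15.0,
                         price_mean=55.0, price_sd=12.0, years=15)
    print("P10 / P50 / P90 NPV:", p10_p50_p90(npvs))
```

A fuller model would draw correlated inputs (price, volumes, costs) and apply fiscal terms before discounting; this sketch keeps a single uncertain variable so the mechanics stay visible.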

2025-09-04 · 6 min · rokorolev