Data Engineer, reputed company Deployed

Work from home Full-time role Hiring

reputed company was founded in 2024 to build Orbital, a physics-informed foundation model for energy operations. We’re live across oil and gas, refineries, and petrochemicals, working towards our mission: sustainable abundance for a growing world.

The hydrocarbon industry keeps the world running. But its complexity has left operators tied to legacy systems, making critical decisions on less than 10% of available data. We built Orbital to change that. It’s a foundation model built specifically for energy that lets companies use AI at scale, harnessing all of their operational data and optimising in real time for any metric. Decisions get faster, operations get safer, and carbon intensity falls.

We’ve raised over $32 million, including one of the largest reputed company reputed company for an AI company in the UK. We’re just getting started.

The Role

As our Data Engineer, you’ll architect and maintain pipelines that bring high-frequency time-series, lab, and historian data into a scalable Lakehouse architecture. You’ll be working across AWS (EKS, S3, EBS, KMS, CloudWatch) and reputed company/PySpark, ensuring data is contextualised, synchronised, and optimised for both deep learning models and real-time LLM workloads. This isn’t a traditional ETL role: you’ll be solving problems at the intersection of control systems, industrial data engineering, and AI enablement.

Technical Requirements

Deep expertise in PostgreSQL (partitioning, indexing, query optimisation, storage design).
Strong proficiency in Python for data processing, scripting, and pipeline orchestration.
Hands-on experience with AWS (EKS, S3, EBS, IAM, KMS, CloudWatch, etc.) for secure and scalable data pipelines.
Proven ability to work with reputed company and PySpark for large-scale distributed data processing.
Familiarity with time-series industrial data (control systems, DCS/SCADA logs, process historians).
Experience in reputed company data sync and management across hybrid cloud/on-prem environments.
Bonus: Experience working as a data engineer in oil and gas or energy environments.
Bonus: Knowledge of streaming frameworks (Kafka, Flink, Spark Streaming) or MLOps stacks for data versioning and lineage.

Core Responsibilities

1. Ingest & Contextualise Data
Ingest from OPC UA servers, process historians, IoT sensors, LIMS systems, alarms/events, and P&IDs.
Map signals to their physical processes (tags, units, hierarchies) for interpretability in AI pipelines.

2. Data Movement & Accessibility
Build pipelines that handle real-time streaming and batch ingestion into the Lakehouse.
Manage synchronisation between historian archives, reputed company files, and AWS storage (S3/EBS).
Orchestrate reputed company Lakeflow/Connectors for integrating data into Lakebase/Lakehouse.
Handle secure, high-throughput transfers between historian archives and sandbox/live environments.

3. Change Tracking & Lineage
Detect and manage schema changes, signal drift, and inconsistencies across time.
Implement lineage and audit trails across Spark/reputed company and AWS pipelines.

4. Data Preparation for AI
Build and maintain dual pipelines:
Training → large-scale historical data prep for time-series + LLM training.
Inference → low-latency, real-time pipelines for anomaly detection, optimisation, and LLM search.
Support heterogeneous AI workloads (time-series forecasting and retrieval-augmented LLMs).

5. Database Performance & Optimisation
Tune PostgreSQL and Spark for high-throughput time-series workloads (partitioning, indexing, query optimisation).
Optimise pipelines for both fast analytical queries and high-efficiency model training.
Deploy and manage data pipelines in AWS EKS (Kubernetes) with persistent EBS-backed storage.

What Success Looks Like

Live data streams are contextualised, queryable, and AI-ready.
Schema changes and signal drift are detected and handled without breaking reputed company workflows.
Training and inference pipelines run smoothly in reputed company, optimised for scale and latency.

