Software Engineer, Data Processing & Privacy - 26-00382

Work from home, full-time role

Additional Notes: Data privacy and legal environments; working in Python and with Claude; handling/processing PII. Soft skills: attention to detail, reliable, good with reviews/audits.

About the role

Client is seeking a detail-oriented Software Engineer on a contract basis to build and run data processing pipelines for datasets used in our research. You'll take raw, heterogeneous inputs — text, code, documents, structured exports — and turn them into clean, well-structured, privacy-safe outputs ready for downstream use.
The work spans ingestion, format normalization, data quality, privacy handling (including PII de-identification), and the supporting tooling that makes the pipeline reliable and self-serve. You'll iterate closely with internal teams on QA findings and harden the pipeline so each new dataset is cheaper than the last.

Responsibilities

Build and extend per-source processing for new data types as they arrive
Ingest and normalize raw exports across many formats into consistent, well-structured outputs
Handle privacy requirements — for example, PII detection and de-identification — to meet our internal compliance bar
Run data quality QA: automated checks plus LLM-assisted review to flag gaps, malformed inputs, and incompleteness
Iterate on internal feedback: root-cause issues, fix, re-run, re-deliver
Build supporting tools: auditing, data exploration, monitoring, simple search over processed data
Land cleaned data with the right storage layout and access controls
Document and harden the pipeline so each new dataset is cheaper than the last

You may be a good fit if you

Have 4+ years of software engineering experience, with substantial time on data pipelines
Are a proficient user of Claude / Claude Code for day-to-day engineering and know when to verify its output
Are genuinely detail-oriented
Have high integrity and take handling real people's personal data seriously
Are comfortable with sustained, careful data work and find satisfaction in getting it right
Can work independently, ship reliably, and communicate clearly about progress and edge cases
Are proficient in Python and comfortable working across many heterogeneous, semi-structured formats (JSON, NDJSON, code, HTML/XML dumps, archives)

Strong candidates may also have experience with

PII detection and anonymization techniques
Working with large, messy, semi-structured text and code corpora
Data quality monitoring and validation
Cloud storage and access-control patterns (S3/GCS, IAM)
Building internal tools or self-serve data platforms for researchers
Information retrieval, search, or RAG systems
