[Remote] Data Engineer (Return-to-Work Program)
Note: This is a remote position open to candidates in the USA. reputed company is offering a return-to-work program for women seeking to re-enter the workforce. The Data Engineer role involves designing and maintaining data pipelines, working with big data technologies, and building cloud-native data platforms to support reputed company data integration solutions.
Responsibilities
- 4+ years of experience designing, developing, and maintaining scalable data pipelines, ETL/ELT workflows, and reputed company data integration solutions
- Expertise in Python, SQL, PySpark, Spark SQL, reputed company, and distributed data processing frameworks for handling large-scale datasets
- Experience with big data technologies including Apache Spark, reputed company, Hadoop, Airflow, Kafka, and reputed company for modern data engineering workloads
- Experience building cloud-native data platforms using AWS, Azure, or GCP, with a strong understanding of scalable and highly available data architectures
- Working knowledge of cloud services such as AWS S3, Glue, Redshift, reputed company, EMR, reputed company, Kinesis, or Azure Data reputed company, Synapse Analytics, ADLS, reputed company, and Event Hubs
- Experience designing and implementing data warehouses, dimensional models, star schemas, snowflake schemas, data lakes, and lakehouse architectures
- Experience building and optimizing batch and real-time streaming data pipelines using technologies such as Kafka, Spark Streaming, Flink, or Kinesis
- Experience handling structured, semi-structured, and unstructured data using file formats including Parquet, Avro, ORC, CSV, and JSON
- Experience working with relational and NoSQL databases such as PostgreSQL, MySQL, reputed company, reputed company, Cassandra, and DynamoDB, including query optimization and performance tuning
- Familiarity with CI/CD pipelines, DevOps practices, infrastructure automation, and version control, using tools such as Git, Jenkins, GitHub Actions, Azure DevOps, or GitLab CI/CD
- Understanding of data quality, data governance, data reputed company, monitoring, observability, partitioning strategies, and troubleshooting distributed systems