– Design, implement, and optimize data pipelines and ETL processes that ingest, transform, and load data from a variety of sources
– Build and maintain scalable, fault-tolerant data architectures using technologies such as Apache Hadoop, Apache Spark, and cloud-based data platforms
– Develop and maintain data models and schemas to support data warehousing and business intelligence solutions
– Implement data quality checks, monitoring, and error handling mechanisms (see the sketch after this list)
– Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and provide clean, reliable data
– Optimize data processing performance and implement best practices for data security and governance
– Automate data processes and workflows using scripting languages (e.g., Python, Scala)
– Stay up to date with the latest big data technologies, tools, and industry trends
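
To make the pipeline responsibilities above concrete, here is a minimal sketch of one ETL step with a basic data quality check and transactional error handling, using only the Python standard library. The source file, table, and column names (orders.csv, orders, order_id, amount) are hypothetical placeholders, not part of the role description.

```python
# Minimal ETL sketch: extract from CSV, validate, load into SQLite.
# All file, table, and column names here are hypothetical examples.
import csv
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")

def extract(path):
    """Read raw rows from a CSV source."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Quality check: keep only rows with an order_id and a numeric amount."""
    clean, rejected = [], 0
    for row in rows:
        try:
            if not row["order_id"]:
                raise ValueError("missing order_id")
            clean.append((row["order_id"], float(row["amount"])))
        except (KeyError, TypeError, ValueError):
            rejected += 1
    if rejected:
        log.warning("rejected %d rows that failed quality checks", rejected)
    return clean

def load(rows, db_path="warehouse.db"):
    """Load validated rows inside a transaction; roll back on any error."""
    con = sqlite3.connect(db_path)
    try:
        with con:  # commits on success, rolls back on exception
            con.execute(
                "CREATE TABLE IF NOT EXISTS orders (order_id TEXT, amount REAL)"
            )
            con.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    finally:
        con.close()

if __name__ == "__main__":
    load(transform(extract("orders.csv")))
```

In practice the same pattern (validate, log rejects, load transactionally) scales up to Spark jobs or orchestrated workflows; the point of the sketch is the separation of extract, transform, and load with explicit quality and error handling.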
– Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience)
– Proven experience as a Data Engineer or in a similar role in a data-intensive environment
– Strong proficiency in SQL and experience with relational databases (e.g., PostgreSQL, MySQL)
– Hands-on experience with big data technologies like Apache Hadoop, Apache Spark, and cloud-based data platforms (e.g., AWS, Azure, GCP); see the PySpark sketch after this list
– Familiarity with data modeling, data warehousing, and data integration concepts
– Experience with scripting languages (e.g., Python, Scala) and data processing frameworks (e.g., Apache Beam, Apache Flink)
– Knowledge of data governance, data quality, and data security best practices
– Strong problem-solving and analytical skills
– Excellent communication and collaboration abilities
– Passion for working with large-scale data and driving business value through data-driven solutions
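
As an illustration of the Spark and SQL skills listed above, here is a minimal PySpark sketch of a batch aggregation exposed as a SQL view. It assumes pyspark is installed (pip install pyspark), and the input path and column names (events.parquet, user_id, event_ts, revenue) are hypothetical.

```python
# Minimal PySpark sketch: filter, aggregate, and expose results via SQL.
# The input path and column names are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-revenue").getOrCreate()

events = spark.read.parquet("events.parquet")

daily_revenue = (
    events
    .filter(F.col("revenue").isNotNull())             # basic quality filter
    .groupBy("user_id", F.to_date("event_ts").alias("day"))
    .agg(F.sum("revenue").alias("total_revenue"))
)

# Register as a temporary SQL view so analysts can query it directly.
daily_revenue.createOrReplaceTempView("daily_revenue")
spark.sql(
    "SELECT day, SUM(total_revenue) AS revenue FROM daily_revenue GROUP BY day"
).show()

spark.stop()
```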