Data Engineer with Python

Job Description:
  • Design and develop robust data pipelines, ETL processes, and data integration solutions to collect, transform, and load data from various sources into our data warehouse.
  • Collaborate with cross-functional teams to identify data requirements and translate them into technical specifications and data models.
  • Optimize and tune database systems, queries, and ETL processes for performance and scalability, ensuring efficient data retrieval and storage.
  • Implement data quality and validation mechanisms to maintain data integrity and accuracy.
  • Develop and maintain documentation for data pipelines, data models, and data flow diagrams to facilitate understanding and collaboration among team members.
  • Monitor and troubleshoot data pipeline issues, database performance bottlenecks, and data-related problems to ensure smooth data operations.
  • Stay up to date with emerging technologies and trends in the data engineering space, evaluating and recommending new tools and frameworks to improve data processing efficiency and overall system performance.
  • Collaborate with data scientists and analysts to support their data needs, providing them with clean, reliable, and well-organized datasets.
  • Create and maintain reports and dashboards using Power BI or other visualization tools to enable data-driven decision-making across the organization.
  • Utilize Python programming to develop and maintain data engineering solutions, including data manipulation, data cleansing, and automation of data processes (a minimal sketch of such a step follows this list).
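
For illustration only, here is a minimal sketch of the kind of extract-transform-load step described above, using pandas and SQLAlchemy. The source file, column names, table name, and connection string are hypothetical placeholders, not specifics of this role.

```python
# Minimal ETL sketch: extract from a flat file, cleanse with pandas,
# load into a warehouse staging table. All names are placeholders.
import pandas as pd
from sqlalchemy import create_engine

def run_pipeline() -> None:
    # Extract: read raw records from a hypothetical CSV source.
    raw = pd.read_csv("orders.csv")

    # Transform: drop duplicate orders, coerce the timestamp column,
    # and fill missing amounts with zero.
    clean = (
        raw.drop_duplicates(subset="order_id")
           .assign(order_ts=lambda df: pd.to_datetime(df["order_ts"], errors="coerce"))
           .fillna({"amount": 0.0})
    )

    # Load: append the cleansed batch into a placeholder Postgres warehouse
    # (requires a database driver such as psycopg2 to be installed).
    engine = create_engine("postgresql://user:password@warehouse:5432/analytics")
    clean.to_sql("stg_orders", engine, if_exists="append", index=False)

if __name__ == "__main__":
    run_pipeline()
```
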
Requirements:

  • Proven experience as a Data Engineer or in a similar role, with a strong understanding of data management principles and best practices.
  • Proficiency in SQL and experience working with both SQL and NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB, Cassandra).
  • Experience with big data technologies such as Hadoop and Spark, including related frameworks like Spark Streaming and PySpark (an illustrative PySpark sketch follows this list).
  • Familiarity with cloud-based data platforms and services, preferably AWS or Azure (e.g., Amazon Redshift, Azure SQL Database, S3).
  • Strong programming skills in Python, with the ability to write efficient and optimized code for data manipulation and automation.
  • Proficiency with stream-processing engines such as Spark or Flink and with messaging platforms such as Kafka or Pulsar.
  • Hands-on experience building ETL processes with AWS Glue or Azure Data Factory.
  • Experience with data visualization tools like Power BI or Tableau, including creating interactive dashboards and reports.
  • Familiarity with data warehousing concepts and experience with workflow orchestration tools such as Apache Airflow (a minimal DAG sketch follows this list).
  • Solid understanding of data security, privacy, and compliance standards.
  • Strong analytical and problem-solving skills, with the ability to quickly troubleshoot and resolve data-related issues.
  • Excellent communication and collaboration skills, with the ability to work effectively in cross-functional teams.
  • Self-motivated and eager to learn, with a proactive approach to problem-solving and staying updated on industry trends and best practices.
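
To make the Spark expectation concrete, here is a minimal PySpark aggregation sketch. The S3 paths and column names (event_ts, page) are hypothetical placeholders.

```python
# Minimal PySpark sketch: read click events from Parquet, aggregate
# clicks per page per day, and write the result back partitioned by day.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("daily-clicks").getOrCreate()

# Placeholder input path; in practice this would point at the data lake.
events = spark.read.parquet("s3://example-bucket/events/")

daily = (
    events.withColumn("day", F.to_date("event_ts"))
          .groupBy("day", "page")
          .agg(F.count("*").alias("clicks"))
)

daily.write.mode("overwrite").partitionBy("day").parquet(
    "s3://example-bucket/daily_clicks/"
)
```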
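
Similarly, for the workflow orchestration requirement, a minimal Apache Airflow DAG sketch, assuming Airflow 2.4+; the DAG id and the extract/load callables are hypothetical.

```python
# Minimal Airflow DAG sketch: a daily two-step ETL with an explicit
# extract -> load dependency. Task bodies are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract() -> None:
    # Placeholder: pull a batch from a source system.
    pass

def load() -> None:
    # Placeholder: write the batch into the warehouse.
    pass

with DAG(
    dag_id="example_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task
```
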
Education:

BE, BTech, MCA, or MTech only

Interview Rounds:

2 or 3 technical rounds

Experience Level:

4 to 10 years