To apply, send your CV to cv.data@andersenlab.com
Responsibilities:
- Developing, testing, and maintaining daily ETL pipelines that collect data in the cloud;
- Developing and supporting ETL processes for data processing;
- Creating tools for operating and monitoring the data platform;
- Creating data models and building data marts for Data Analysts and ML Engineers;
- Creating services to handle real-time data via Kafka for subsequent analysis using Azure tools;
- Collaborating with all stakeholders to achieve a deep and clear understanding of requirements.
Must-haves:
- 3+ years of experience as a Data Engineer;
- Understanding of data processing algorithms and principles;
- Experience with Azure cloud storage solutions;
- Experience with Python, Apache Airflow, and PySpark;
- Experience with Kafka;
- Knowledge of SQL or PL/SQL;
- Experience with Azure Databricks and Azure Event Hubs;
- Experience in building ETL processes;
- Experience with Agile software development and testing methodologies and modern ALM tools, such as GitHub/Azure DevOps;
- English level: Intermediate or higher.
Nice-to-haves:
- Experience with Apache Hive, Apache Impala, Amazon Redshift, Snowflake, PostgreSQL, Greenplum, and Vertica;
- Experience in building cloud applications on container or serverless platforms, such as Azure Kubernetes Service (AKS), Azure Functions, or AWS Lambda;
- Experience in building MLOps and data-driven applications.