- Build complex ETL code
- Build complex SQL and NoSQL queries across databases such as MongoDB, Oracle, SQL Server, MariaDB, and MySQL
- Work on data and analytics tools in the cloud
- Develop code in Python, Scala, and R
- Work with technologies such as Spark, Hadoop, Kafka, etc.
- Build complex Data Engineering workflows
- Create complex data solutions and build data pipelines
- Create and manage data sources
- Integrate with diverse APIs
- Contribute to the ongoing development of the big data ecosystem
- Work closely with stakeholders on the data demand side (finance, analysts, and data scientists)
- Work closely with stakeholders on the data supply side (domain experts on source systems of the data)
- Design and build optimized OLAP and Star Schema data structures
- Build self-monitoring, robust, scalable batch and streaming data pipelines for 24/7 global operations.
- Create highly reusable code modules and packages that can be leveraged across the data pipeline
- Develop and maintain data dictionaries for governance of published data sources
- Develop and improve continuous release and testing processes
- Bachelor’s degree in computer science, computer engineering, or an engineering discipline
- 3+ years of design and implementation experience with distributed applications
- 2+ years of experience in database architectures and data pipeline development
- Demonstrated knowledge of software development tools and methodologies
- Presentation skills with a high degree of comfort speaking with executives, IT management, and developers
- Excellent communication skills, with the ability to tailor conversations to the appropriate level for the audience
- Technical degree required; Computer Science or Math background desired
- Demonstrated ability to adapt to new technologies and learn quickly
- Highly analytical, motivated, and decisive thought leader with strong critical-thinking skills, able to quickly connect technical and business dots
- Strong communication and organizational skills, with the ability to handle ambiguity while juggling multiple priorities and projects
- Able to understand statistical solutions and execute similar analyses
- Experience with big data tools such as Hadoop, Hive, and Spark, as well as knowledge of more traditional data warehouses.
- Experience delivering data pipelines and managing the resulting data stores using managed cloud services (such as AWS or Google Cloud)
- Ability to identify and resolve performance and data quality issues
- Experience with modern data pipelines, data streaming, and real-time analytics using tools such as Apache Kafka, AWS Kinesis, Spark Streaming, Elasticsearch, or similar tools.
- Knowledge of machine learning tools and concepts is a plus.
To apply, please email your cover letter and resume to firstname.lastname@example.org.