WHAT IS EXCITING ABOUT OUR DEVELOPERS
- Can easily switch between scripting, functional, and object-oriented programming in Python or Scala.
- Work with cutting-edge technologies like Docker, MLlib, Kafka, Flume, Kudu, Neo4j, and Apache Arrow.
- Mentor and guide team on PySpark best practices.
- Are involved in the setup and deployment of data-intensive systems.
- Are responsible for maintaining production systems.
- Possess good knowledge of Linux systems and cloud platforms (AWS, Azure).
- Understand how to develop data pipelines involving transformations and complex pre-processing.
- Are part of an open office culture that fosters knowledge-sharing sessions (Xebia Knowledge Exchange).
WHAT WE LOOK FOR IN YOU
- Minimum 3 years of experience in designing and implementing scalable big data infrastructure in an agile environment.
- Strong hands-on experience in Spark, Spark Streaming, Hive, Spark SQL and DataFrames with Python/Scala.
- Thorough understanding of Python with Test-Driven Development.
- Good understanding of Object-Oriented Analysis and Design.
- Experience with wide-column NoSQL databases like HBase and Cassandra.
- Working knowledge of build tools (PyBuilder) and version control systems (Git).
- Experienced in engineering systems from the ground up: familiar with OS-level concerns, distributed databases, and big data clusters.
DESIRABLE SKILLS
- Hands-on experience with functional programming
- NoSQL databases
- Unit testing and coverage in Python
- Java/Scala and Shell scripting
- Knowledge of CI/CD tools such as Travis and Jenkins, and of cloud platforms (AWS/Azure/Google Compute)
- Knowledge of workflow schedulers such as Airflow and Oozie
LOCATION: Bangalore or Gurgaon