Nineleaps Hiring Big Data Developer

Job Location: Chennai

Experience: 2+ Years

Primary Skill: Hadoop

Responsibility:

  • Design, build, test, and maintain scalable, stable, off-the-shelf applications that
    support distributed processing using the Hadoop ecosystem
  • Implement ETL and data-processing workflows for structured and unstructured data
  • Build pipelines for optimal extraction of data from a wide variety of sources,
    covering ingestion, transformation, conversion, and validation
  • Conduct root cause analysis and advanced performance tuning for complex business
    processes and functionality
  • Review frameworks and design principles for suitability in the project context
  • Client orientation:
  • Propose the right solutions to the client by identifying and understanding critical
    pain points
  • Contribute to the entire implementation process, including driving the definition of
    improvements based on business needs and architectural considerations
  • Propose, pitch, sell, implement, and prove success in continuous-improvement
    initiatives
  • Work and collaborate with multiple teams and stakeholders
  • Agile orientation:
  • Be a part of the Agile ceremonies to groom stories and develop defect-free code for the
    stories
  • Review code for quality and implementation best practices
  • Promote coding, testing and deployment best practices through hands-on research and
    demonstration
  • Write testable code that enables extremely high levels of code coverage (a minimal
    sketch follows this list)
  • Mentor junior engineers and guide them to become great engineers
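
As referenced above, here is a minimal sketch of the "testable code" expectation: a pure PySpark transformation with a pytest-style unit test. The function, column names, and values are hypothetical, chosen only to illustrate the pattern.

    # Minimal sketch: a pure, easily testable PySpark transformation.
    # All names here are assumed for illustration.
    from pyspark.sql import DataFrame, SparkSession
    from pyspark.sql import functions as F

    def add_revenue(df: DataFrame) -> DataFrame:
        """Derive a revenue column from price and quantity."""
        return df.withColumn("revenue", F.col("price") * F.col("quantity"))

    def test_add_revenue():
        # A tiny local session is enough to exercise the function end to end.
        spark = SparkSession.builder.master("local[1]").appName("tdd").getOrCreate()
        source = spark.createDataFrame([(10.0, 3), (2.5, 4)], ["price", "quantity"])
        assert [r.revenue for r in add_revenue(source).collect()] == [30.0, 10.0]
        spark.stop()

Keeping transformations as pure functions of DataFrames is what makes very high coverage practical: each one can be tested against a local SparkSession, no cluster required.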

Desired Skills/Experience:

  • Preferably 4 to 7 years of experience
  • Highly skilled in:
  • PySpark and Spark
  • PySpark SQL and DataFrame APIs (see the first sketch after this list)
  • Interpreting the Spark execution DAG as displayed in the ApplicationMaster UI
  • Writing optimal PySpark code, with deep knowledge of tuning Spark parameters for
    execution optimization (see the second sketch after this list)
  • Python (2 and 3), including knowledge of libraries like NumPy, Pandas, etc.
  • Writing Sqoop scripts for ETL from Teradata
  • SQL and analytical thinking
  • Strong understanding of:
  • Hadoop and Spark architectures and the MapReduce framework
  • Big data stores like HDFS, HBase, and Cassandra
  • Data formats like Avro, Parquet, ORC, etc.
  • Exposure to at least one big data platform, such as Hortonworks HDP, Cloudera,
    AWS EMR, MapR, etc.
  • Prior experience with:
  • Using monitoring and administration tools like Ambari, Ganglia, etc.
  • Scheduling big data applications using Oozie (including workflow and coordinator
    properties)
  • Strong object-oriented skills, including solid knowledge of design patterns
  • Good understanding of technologies like Hive, Pig, Presto, Impala, etc.
  • Prior experience in building Spark infrastructure (cluster setup, administration,
    performance tuning), on-premise (bare metal) and/or cloud-based
  • Knowledge of software best practices, like Test-Driven Development (TDD) and
    Continuous Integration (CI)
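
The first sketch referenced in the list above shows the PySpark SQL and DataFrame APIs side by side, with Parquet output. All paths, table names, and columns are hypothetical; writing ORC via .orc() is analogous.

    # Minimal sketch: the same aggregation via Spark SQL and the DataFrame API,
    # persisted in a columnar format. All paths and names are assumed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-etl").getOrCreate()

    raw = (
        spark.read.option("header", "true").option("inferSchema", "true")
        .csv("hdfs:///data/raw/orders")  # assumed input location
    )
    raw.createOrReplaceTempView("orders")

    via_sql = spark.sql(
        "SELECT customer_id, SUM(amount) AS total FROM orders "
        "WHERE amount > 0 GROUP BY customer_id"
    )
    via_df = (
        raw.filter(F.col("amount") > 0)
        .groupBy("customer_id")
        .agg(F.sum("amount").alias("total"))
    )

    via_df.write.mode("overwrite").parquet("hdfs:///data/curated/order_totals")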
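
The second sketch covers Spark parameter tuning at the session level. The values shown are placeholders, not recommendations; the right settings depend on data volume and cluster size.

    # Minimal sketch of common Spark tuning parameters, set before the
    # session is created. Values are illustrative only.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.appName("tuned-job")
        # The default of 200 shuffle partitions is rarely optimal.
        .config("spark.sql.shuffle.partitions", "400")
        # Executor sizing drives parallelism and memory pressure.
        .config("spark.executor.instances", "10")
        .config("spark.executor.cores", "4")
        .config("spark.executor.memory", "8g")
        # Broadcast small dimension tables to avoid shuffle joins.
        .config("spark.sql.autoBroadcastJoinThreshold", str(64 * 1024 * 1024))
        .getOrCreate()
    )

The effect of any such change is best confirmed in the execution DAG and stage metrics shown in the Spark UI reached through the YARN ApplicationMaster, which the list above calls out.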