Job Location: Chennai
Experience: 2+ Years
Primary Skill: Knowledge of Hadoop
Job Description
The Big Data Engineer at Draup is responsible for building scalable techniques and processes for data storage, transformation and analysis. The role includes making and implementing decisions on optimal, generic, and reusable data platforms. You will work directly with a proficient, experienced team of developers, researchers and co-founders across all application use cases.
What You Will Do
- Develop, maintain, test and evaluate big data solutions within the organisation.
- Build scalable architectures for data storage, transformation and analysis.
- Design and develop solutions which are scalable, generic and reusable.
- Build and execute data warehousing, mining and modelling activities using agile development techniques.
- Lead big data projects from scratch to production.
- Create a platform on top of stored data sources, using a distributed processing environment such as Spark, so that users can run any kind of ad-hoc query with complete abstraction from the underlying data points.
- Solve problems in robust and creative ways.
- Collaborate with the machine learning and data-harvesting teams.
What You Will Need
- Proficient understanding of distributed computing principles.
- Must have good programming experience in Python.
- Proficiency in Apache Spark (PySpark) is a must.
- Experience with integration of data from multiple data sources.
- Experience with SQL databases and NoSQL data stores such as MongoDB.
- Good working knowledge of MapReduce, HDFS and Amazon S3.
- Knowledge of Scala would be preferable.
- Should be able to think in a functional-programming style.
- Should have hands-on experience in tuning software for maximum performance.
- Ability to communicate complex technical concepts to both technical and non-technical audiences.
- Takes ownership of all technical aspects of software development for assigned projects.
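To give a flavour of the MapReduce pattern listed in the requirements, here is a minimal, framework-free Python sketch of a word count. It illustrates the principle only (map emits key-value pairs, shuffle groups by key, reduce aggregates); the function names and sample data are illustrative and are not part of any Hadoop or Spark API.

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in every input line.
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all values by their key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate the grouped values per key (here, a sum).
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data big plans", "data wins"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 2, 'data': 2, 'plans': 1, 'wins': 1}
```

In a real Hadoop or Spark deployment the shuffle step is distributed across the cluster by the framework; this sketch only shows the logical data flow a candidate would be expected to reason about.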
What Will Give You An Advantage
- Expertise in big data infrastructure, distributed systems, data modelling, query processing and relational databases.
- Experience designing big data solutions with Spark, HDFS, MapReduce, Storm and Hive.
- Experience with different file-storage formats such as Parquet, ORC, Avro and SequenceFile.
- Strong knowledge of data structures and algorithms.
- Understands how to apply technologies to solve big data problems and to develop innovative big data solutions.
- An entrepreneurial mindset, delivering quick and efficient solutions with sound design and architectural patterns, is preferred.
Who You Are
- B.E / B.Tech / M.E / M.Tech / M.S in Computer Science or Software Engineering.
- 2-6 years of experience working with Big Data technologies.
- Open to embracing the challenge of dealing with terabytes and petabytes of data on a daily basis. If you can think outside the box and have good code discipline, then you fit right in.