Job Location: Chennai
Experience: 2-5 Years
Primary Skill: Hadoop
Key Responsibilities:
- Create and maintain optimal data and model dataOps pipeline architecture
- Assemble large, complex data sets that meet functional / non-functional business requirements.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and cloud-based ‘big data’ technologies from AWS, Azure and others.
- Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
- Keep data separated and secure across national boundaries through multiple data centers and strategic customers/partners.
- Create tool-chains for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
- Work with data and machine learning experts to strive for greater functionality in our data and model life cycle management systems.
- Support dataOps competence build-up in Ericsson Businesses and Customer Serving Units
Key Qualifications:
- Bachelor's/Master's/Ph.D. in Computer Science, Information Systems, Data Science, Artificial Intelligence, Machine Learning, Electrical Engineering or a related discipline from a reputed institute. First Class, preferably with Distinction.
- Overall industry experience of 5+ years, including at least 3 years as a Data Engineer.
- 3+ years of experience in the following:
  - Software/tools: Hadoop, Spark, Kafka, etc.
  - Relational SQL and NoSQL databases, including Postgres and Cassandra.
  - Data and model pipeline and workflow management tools: Azkaban, Luigi, Airflow, Dataiku, etc.
  - Stream-processing systems: Storm, Spark Streaming, etc.
  - Object-oriented/object function scripting languages: Python, Java, C++, Scala, etc. (advanced level in at least one language).
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Experience performing root cause analysis on internal and external data and processes to answer specific business questions and seek opportunities for improvement.
- Experience in Data warehouse design and dimensional modeling
- Strong analytic skills related to working with unstructured datasets.
- Experience building processes supporting data transformation, data structures, metadata, dependency and workload management.
- Advanced SQL knowledge and experience working with relational databases and query authoring (SQL), as well as working familiarity with a variety of other databases/data sources.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Experience with Docker containers, orchestration systems (e.g. Kubernetes), continuous integration and job schedulers.
- Familiar with functional programming and scripting languages such as JavaScript or Go.
- Knowledge of serverless architectures (e.g. Lambda, Kinesis, Glue).
- Experience with microservices and REST APIs.
- Familiar with agile development and lean principles.
- Contributor to or owner of a GitHub repository.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Good communication skills in written and spoken English
- Creativity and ability to formulate problems and solve them independently
Additional Requirements:
- Application/domain knowledge in telecommunications and/or IoT is a plus.
- Experience with data visualization and dashboard creation is a plus
- Ability to work independently with high energy, enthusiasm and persistence
- Experience in partnering and collaborative co-creation, i.e., working with complex, multi-stakeholder business units, global customers, technology and other ecosystem partners in a multicultural, global matrix organization, with sensitivity and persistence.