Job Description:
- Partner with the product owner to translate product requirements into business solutions.
- Must be able to work with a top-notch development team on data modeling, design, coding, and implementation of ETL and Big Data solutions in a data warehousing environment.
- Formulate hypotheses about the data based on small sample queries and extrapolate findings to payment theories, validating them against ever-larger data sets.
- Write optimized code on large clusters to process and analyze terabytes of data.
- Proactively tune processes to minimize execution times.
- Design processes to minimize data quality risks and proactively identify potential data issues.
- Prepare technical design documentation for review with team members.
- Prepare logical and physical data models for the new processes.
Responsibilities:
- Timely end-to-end execution of assigned data projects, which includes interacting with the business as needed, identifying and applying innovative ideas, designing database objects and ETL processes, developing code, testing, and deployment.
- Understand the business and contribute to technology direction that drives measurable business improvements.
- Adhere 100% to the set delivery standards and agile process.
- Triage and resolve site issues without supervision.
- Meet set targets for SLA batches by proactively identifying and addressing potential issues.
- Identify opportunities to improve existing data platforms and innovate.
- Identify opportunities and directions for engineering productivity improvements, and evangelize them successfully.
Job Requirements:
- 7+ years of experience building scalable, distributed, fault-tolerant data warehouse and Big Data applications.
- Must have excellent knowledge of, and 5+ years of experience in, database design, development, and performance tuning.
- Must have very good knowledge of, and 2+ years of experience with, Big Data ecosystems such as Hive, Pig, and Spark.
- Solid experience building data-driven applications/systems.
- Must have excellent knowledge of, and development experience with, ETL applications built on Informatica PowerCenter 9 and above.
- Strong skills working with relational databases (Teradata, Oracle, etc.), HDFS, Hive, Pig, Scala, and Spark.
- Experience or familiarity with NoSQL technologies (MongoDB, Hadoop, Cassandra, etc.) is a plus.
- Exposure to a Unix/Linux environment with the ability to write shell scripts is preferred.
- Experience translating unstructured data into valuable analytical information.
- Hands-on experience working with source control systems (e.g., Git).
- Must have strong analysis and design skills, translating requirements into technical specifications and design documents.
- Ability to create engineering requirements and technical designs from business requirements.
- Agile and Waterfall SDLC experience. Able to work independently and as part of a team.
- Background in data warehousing within the finance/credit industry is a huge plus.
- Experience with a data modeling tool such as PowerDesigner is a plus.
- Experience with reporting tools such as MicroStrategy or Tableau is a plus.