Job Description
- Working experience implementing various AI-powered models with NLP and deep learning algorithms
- Experience in unstructured text data analysis and in language, speech, image, and video data analysis across multiple industries (e.g. manufacturing, retail)
- Develop connections from Spark Streaming to Kafka or Flume using Python
- Examine streaming performance and deliver optimal, precise development using Python and PySpark, including connecting to data (structured or unstructured), extraction, and cleaning
- Develop models (classification or clustering) using MLlib or Anaconda
- Gather, evaluate, and document business requirements related to analytics; translate them into an analytics solution definition, with the ability to implement it using Python or Scala
- Extract data from raw files using Python, Anaconda, or built-in tools for POCs
- Pull or create data from different sources such as HBase, Hive, Impala, or MongoDB
- Responsible for analyzing data from multiple data sources (DBs, flat files, etc.) and building predictive models using Python
- Linux shell scripting with Python and cron jobs to schedule runs (batch or real time)
- Scala knowledge is preferable in some cases
- Develop different models using Python, Pandas, MLlib, and PySpark, compare their performance in real time and batch, and opt for the better solution depending on the case
- Validate the models statistically as well as from a business perspective, in discussion with business stakeholders
- Ability to support and guide model deployment and model lifecycle management
- Create model documentation as per client regulatory standards
- Degree in a quantitative field (Math, Statistics, Economics, Computer Science, and/or Engineering) or MBA
- Experienced and skilled in Python, incl. PySpark and Spark MLlib
- Hands-on experience in analytical techniques including sampling, clustering, decision trees, forecasting, SVM, Random Forest, and linear/logistic regression
- Hands-on experience in Python, PySpark, MLlib, Spark, and Mesos
- Hands-on experience using Hive, HBase, and Impala
- Knowledge of Kafka and Flume is a plus
- Data exploration using OpenCV, NumPy, Matplotlib, SciPy, and Pandas for image analysis
- Good knowledge of Python-based libraries, e.g. Keras and TensorFlow; knowledge of Scala will be beneficial
- Experience working with AWS, Cloudera, and Hortonworks
- Knowledge of where analytics fits into an end-to-end business solution
- Ability to work with business and technology teams to build and deploy an analytical solution as per client needs
- Ability to multitask, solve problems, and think strategically
- Strong communication and collaboration skills
- Working experience in other data science technologies (e.g. R, SAS, SPSS, MATLAB) is also preferred