Raghav .

Sr Big Data Engineer at Bank of America

United States

About

• Senior Java/J2EE Developer with 8+ years of experience and proven expertise across the complete SDLC (system analysis, design, and development), with emphasis on object-oriented, J2EE, J2SE, and client-server technologies.
• Experienced with the entire Software Development Lifecycle (SDLC) process, including requirement analysis, conceptual and detailed design, development, verification, and testing.
• Expertise in application development using various frameworks: Jakarta Struts Framework 1.1/1.2/1.3, Spring Framework 1.2/1.3/2.0/2.5, JavaServer Faces (JSF), Spring Batch, Hibernate 2.0/3.0/3.2, and Java Data Objects with GUI plug-ins.
• Proficient in XML technologies such as XML, DTD, XSL, XSLT, SOAP, WSDL, and UDDI.
• Proficient in web technologies such as HTML, DHTML, JavaScript, and AJAX.
• Developed AJAX scripting to process server-side JSP scripting.
• Strong experience in the design, development, and implementation of large-scale web-based applications using object-oriented design with Java, J2EE, and various database technologies.
• Expertise in web development using HTML, DHTML, CSS, JBoss, Drools, JavaScript, XSL, XSLT, and XML (SAX, DOM, JAXP, JAXB).
• Primary UI/UX frameworks include Kendo UI, jQuery UI, and Sencha (Ext JS).
• Exposure to the workflow management tools Gulp and Grunt.
• Experience implementing web-based projects using WebSphere Application Server 6.1/7/8.5, Oracle WebLogic Server 9/10/11, JBoss 3.2.x/4.2, and Apache Tomcat 5.0/5. IDEs and tools such as IBM WebSphere Studio Application Developer (WSAD) 5.0, Maven 2.x/3.x, Eclipse 3.0/3.1, and RAD 6.0/7.0/8.
• Extensive knowledge of databases such as Oracle 9i/10g/11g, DB2, and MySQL. Experience writing complex SQL queries, stored procedures, triggers, cursors, and functions.
• Good working knowledge of database tools such as TOAD, PL/SQL, DbVisualizer, and SQL.
• Good understanding and implementation knowledge of Java and J2EE design patterns.

Experience

  • Sr Big Data Engineer at Bank of America
    Feb 2017 - Present · 9 yrs 6 mos

    • Used Apache Spark with Python to develop and run big data analytics and machine learning applications, and executed machine learning use cases with Spark ML and MLlib.
    • Implemented ML models in Spark using Scala for feature selection.
    • Used the NLTK library for NLP data processing and pattern finding.
    • Categorized text data from different sites into positive and negative clusters using sentiment analysis and text analytics.
    • Wrote MapReduce programs in Java to ingest data from CSV files into HBase.
    • Provided a SQL-like interface on top of HBase using Apache Phoenix.
    • Migrated existing on-premises data to AWS S3, and used AWS services such as EC2 and S3 for data set processing and storage.
    • Developed real-time streaming applications using Apache Beam to move data from a source transactional database to another transactional database.
    • Used the ELK stack to ingest data and build dashboards around key issues.
    • Designed and developed applications using Apache Spark, Scala, Python, S3, and AWS EMR on the AWS cloud to format, cleanse, and validate data, create schemas, and build data stores on S3.

  • Sr. Java/Big Data Developer at KeyBank
    May 2015 - Feb 2017 · 1 yr 10 mos

    • Worked on a 40-node Hadoop cluster running the Hortonworks Data Platform (HDP 2.1) distribution.
    • Worked with highly structured and semi-structured data sets of 45 TB in size (135 TB with a replication factor of 3).
    • Responsible for building scalable distributed data solutions using Hadoop.
    • Worked with Teradata Studio, MS SQL, and DB2 to identify the tables and views required for export into HDFS.
    • Performed extract, transform, and load (ETL) and data cleansing on data from sources such as flat files, XML files, and databases; involved in UAT, batch testing, and test plans.
    • Ran ETL jobs to integrate data into HDFS using Informatica. Wrote Pig scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
    • Responsible for moving data from Teradata, MS SQL Server, and DB2 to HDFS and the development cluster for validation and cleansing.
    • Used Spark Streaming with Scala to build a learner data model from sensor data using MLlib.
    • Monitored and troubleshot the Kafka-Storm-HDFS data pipeline for real-time data ingestion into the data lake in HDFS.
    • Loaded data into Spark and performed in-memory computation to generate the output response.
    • Solved performance issues in Hive and Pig scripts by understanding how joins, grouping, and aggregation translate to MapReduce jobs.
    • Developed Python text analytics using re (regular expressions) to find patterns and generate the schema file.

  • Senior Big Data Developer at Capital One
    Mar 2014 - May 2015 · 1 yr 3 mos

    • Performed cluster capacity planning with the operations and management teams, along with cluster maintenance, creation and removal of nodes, and HDFS support and maintenance.
    • Strong knowledge of rack awareness topology in the Hadoop cluster.
    • Loaded data from the Linux file system into the Hadoop Distributed File System (HDFS).
    • Responsible for building scalable distributed data solutions using Hadoop.
    • Managed and reviewed Hadoop log files.
    • Migrated data from RDBMS to Hadoop using Sqoop for analysis, and implemented Oozie jobs for automatic data imports from the source.
    • Created HBase tables to store various formats of PII data coming from different portfolios.
    • Exported analyzed and processed data to relational databases using Sqoop for visualization and report generation for the team.
    • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
    • Analyzed large data sets to determine the optimal way to aggregate and report on them.
    • Implemented Cassandra connections with Resilient Distributed Datasets (RDDs), both local and cloud.
    • Used Sqoop for various file transfers through HBase tables to process data into several NoSQL databases (Cassandra, MongoDB).
    • Developed Hadoop data processes using Hive and Impala.
    • Imported and exported data to and from HDFS using Sqoop.