Data Science and Big Data Analytics
DESCRIPTION
In this course, you will gain practical, foundation-level training that enables immediate and effective participation in big data and other analytics projects. You will cover basic and advanced analytic methods as well as big data analytics technology and tools, including MapReduce and Hadoop. Extensive labs throughout the course give you the opportunity to apply these methods and tools to real-world business challenges. The course takes a technology-neutral approach. In a final lab, you will address a big data analytics challenge by applying the concepts taught in the course in the context of the Data Analytics Lifecycle. The course prepares you for the Proven Professional Data Scientist Associate (EMCDSA) certification exam and establishes a baseline of data science skills.
TARGET AUDIENCE
- This course is appropriate for developers and administrators who intend to use HBase. Prior experience with databases and data modeling is helpful, but not required. Prior knowledge of Java is helpful. Prior knowledge of Hadoop is not required, but Cloudera Developer Training for Apache Hadoop provides an excellent foundation for this course.
OBJECTIVES
At the end of the course, students will be able to:
- Deploy the Data Analytics Lifecycle to address big data analytics projects
- Reframe a business challenge as an analytics challenge
- Apply appropriate analytic techniques and tools to analyze big data, create statistical models, and identify insights that can lead to actionable results
- Select appropriate data visualizations to clearly communicate analytic insights to business sponsors and analytic audiences
- Use R and RStudio, MapReduce/Hadoop, in-database analytics, window functions, and MADlib functions
- Use advanced analytics to create competitive advantage
- Explain how the data scientist role and skills differ from those of a traditional business intelligence analyst
PREREQUISITES REQUIRED
- Querying Microsoft SQL Server 2012
- Querying Microsoft SQL Server 2014
- Java OCA & OCP
COURSE OUTLINE
- Big Data
- State of the Practice in Analytics
- Data Scientist
- Big Data Analytics in Industry Verticals
- Discovery
- Data Preparation
- Model Planning
- Model Building
- Communicating Results
- Operationalizing
- Using R to Look at Data
- Analyzing and Exploring the Data
- Statistics for Model Building and Evaluation
- K-Means Clustering
- Association Rules
- Linear Regression
- Logistic Regression
- Naïve Bayesian Classifier
- Decision Trees
- Time Series Analysis
- Text Analysis
- Analytics for Unstructured Data
- MapReduce and Hadoop
- Hadoop Ecosystem
- In-Database Analytics: SQL Essentials
- Advanced SQL and MADlib for In-Database Analytics
- Operationalizing an Analytics Project
- Creating the Final Deliverables
- Data Visualization Techniques
- Final Lab Exercise on Big Data Analytics