Course Description
The course examines different approaches to a data analysis project and provides a framework for organizing an analytical effort. Popular tools for data analysis, such as R and Python, are introduced to carry out the analysis. The course covers how to obtain and manipulate raw data for use, as well as basic exploratory analysis and common analytical techniques such as regression, simulation, estimation and forecasting. It includes several graphing and visualization tools for understanding the data and presenting findings and results.
The overall goal of this class is to introduce participants to the discipline of data analysis and data mining, the science of understanding and analyzing data. The class is designed to give participants the tools they need to solve real-world problems using statistics, along with a better understanding of data analysis techniques.
By the end of the course, you will have learned a working framework for approaching any data analysis project. You will be able to use R (or Python) to complete a large data analysis project, including a write-up with findings, insights and visuals. All tools used are open source.
In this program, you will gain practical expertise in data analysis, the process of transforming data into useful information to support decision making. Data analysis is the foundation for data mining, business intelligence, and predictive analytics. This course presents the tools, techniques and common practices used in the industry, including how to obtain, manipulate, explore, model, simulate and present data. It will help you build the essential technical skills to work as a data analyst or data scientist, and to continue with other courses in the certificate program.
This is module #7.
Modules
- Module 1: Introduction to Data Science
- Module 2: RStudio and Getting Started with R
- Module 3: Data Manipulation Techniques Part 1
- Module 4: Data Manipulation Techniques Part 2
- Module 5: Advanced Data Manipulation
- Module 6: Visualization
- Module 7: Probability and Estimation
- Module 8: Modeling
- Module 9: Regression
- Module 10: Time Series Analysis
Learning Outcomes
- Perform independent analysis of data
- Understand, use and navigate RStudio and R
- Implement various algorithms to suit your needs, and improve or modify existing algorithms and techniques for data analysis
- Apply data manipulation techniques for greatest impact
- Use advanced data manipulation tools in analysis
- Implement techniques for data visualization
- Understand how to use probability and estimation in data analysis
- Use modeling and regression tools
- Implement time series analysis
- Present analysis and results in a clear and convincing manner
Prerequisites
- Basic Python knowledge is assumed
- Some software development experience (including languages, databases…)
Who Should Attend
- Anyone who wants to learn about using Python to build, evaluate or deploy machine learning and artificial intelligence models.
- Scientists, engineers, business analysts and researchers who explore and analyze data and wish to present their findings in well-formatted textual and graphical forms.
- Anyone wishing to get hands-on experience building machine learning models.
- Professionals, students and job-seekers who are interested in learning the fundamentals of machine learning and data mining and who want to build, evaluate or showcase machine learning applications in Python.
- The course will appeal mostly to people who need an introduction to numerical computing and visualization in the Python environment, and to technical staff who want to enhance their Python programming skills in these specific topics. Anyone who is interested in using Python’s NumPy, SciPy and Matplotlib packages as prototyping tools would also benefit from the course.