This course focuses on the extensive features of Pandas, the workhorse library for data analysis in Python, and its visualisation counterpart, Matplotlib. It covers reading, preparing and manipulating tabular data from various sources and in various common formats.
Course overview:
The course covers most common data wrangling and manipulation tasks, as well as time series processing and practical linear regression. The programming environment is JupyterLab on the Anaconda platform, one of the most widely used data science platforms.
Approach:
We believe in learning by doing and take a hands-on approach to training. Delegates are provided with all required resources, including data, and are expected to code along with the instructor. The objective is for delegates to reproduce the analysis in our manuals and to gain a conceptual understanding of the methods. Exercises and examples are used throughout the course to give practical experience with the techniques covered.
Course objectives:
This course aims to provide delegates who already have Python programming experience with in-depth knowledge of Python's main data analysis and visualisation libraries (Pandas and Matplotlib). The knowledge gained will enable delegates to design and develop enterprise-level data analytics solutions.
Course content day 1:
Course introduction:
• Administration and course materials
• Course structure and agenda
• Delegate and trainer introductions
Session 1 - Introduction to dataframes:
• What is a DataFrame
• Loading DataFrames
• Accessing contents
• Useful functions
• Adding and dropping columns and rows
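As a rough illustration of the Session 1 topics, the sketch below builds a small DataFrame, reads values from it, and adds and drops a column. The data and column names are hypothetical and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical sample data; the course supplies its own datasets.
df = pd.DataFrame(
    {"city": ["Leeds", "York", "Hull"], "sales": [120, 95, 143]}
)

# Accessing contents: a column by name, a row by position, a cell by label.
sales = df["sales"]
first_row = df.iloc[0]
york_sales = df.loc[1, "sales"]

# Useful functions: shape and summary statistics.
n_rows, n_cols = df.shape
summary = df["sales"].describe()

# Adding a derived column, then dropping it (drop returns a new DataFrame).
df["sales_k"] = df["sales"] / 1000
trimmed = df.drop(columns=["sales_k"])
```

In JupyterLab, evaluating `df.head()` or `summary` in a cell displays the result directly.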
Session 2 - Introduction to dataframes (continued):
• Filtering and assigning data
• Missing values and duplicates
• Arithmetic basics
• Applymap and apply
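The Session 2 topics might look something like the following minimal sketch, which removes duplicate rows, fills a missing value, filters with a boolean mask and uses `apply` on a column. The data is invented for illustration. `DataFrame.applymap` is left out because recent pandas releases deprecate it in favour of `DataFrame.map`.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one duplicated row and one missing score.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Bob", "Cy"],
        "score": [71.0, 64.0, 64.0, np.nan],
    }
)

# Duplicates: keep the first occurrence of each repeated row.
deduped = df.drop_duplicates().reset_index(drop=True)

# Missing values: fill the gap with the mean of the known scores.
filled = deduped.fillna({"score": deduped["score"].mean()})

# Filtering: a boolean mask selects the rows where the condition holds.
high = filled[filled["score"] > 65]

# apply: run a Python function on every value in a column.
filled["grade"] = filled["score"].apply(lambda s: "pass" if s > 65 else "fail")
```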
Course content day 2:
Session 3 - Combining dataframes:
• Concatenate
• Merge
• Keys to merge on and suffixes for duplicate columns
• Merge methods
• Append
• Join
• Combine_first: For missing values
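A minimal sketch of some of the Session 3 operations, using two invented tables that share an `id` key. It uses `pd.concat` for row-wise stacking because `DataFrame.append` was removed in pandas 2.0; whether the course manuals use the older method is not stated here.

```python
import numpy as np
import pandas as pd

# Two hypothetical tables with an overlapping key and a clashing column name.
left = pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]})
right = pd.DataFrame({"id": [2, 3, 4], "value": [200, 300, 400]})

# Concatenate: stack the rows of both tables and renumber the index.
stacked = pd.concat([left, right], ignore_index=True)

# Merge: match rows on "id"; suffixes disambiguate the duplicate "value" column.
inner = left.merge(right, on="id", how="inner", suffixes=("_left", "_right"))
outer = left.merge(right, on="id", how="outer", suffixes=("_left", "_right"))

# combine_first: fill the gaps in one object with values from another.
a = pd.Series([1.0, np.nan, 3.0])
b = pd.Series([9.0, 2.0, 9.0])
patched = a.combine_first(b)
```

The `how` argument selects the merge method: `"inner"` keeps only keys present in both tables, while `"outer"` keeps every key and leaves unmatched cells as missing values.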
Session 4 - Reshaping dataframes:
• Unstacking and stacking
• Pivoting
• Melting
• Concatenating files from disk
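The reshaping operations in Session 4 can be sketched as below, moving an invented sales table between long and wide layouts. Column names and values are assumptions made for the example.

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, product) pair.
long_df = pd.DataFrame(
    {
        "date": ["2024-01", "2024-01", "2024-02", "2024-02"],
        "product": ["A", "B", "A", "B"],
        "units": [5, 7, 6, 9],
    }
)

# Pivoting: one row per date, one column per product.
wide = long_df.pivot(index="date", columns="product", values="units")

# Melting: the reverse step, back to one row per (date, product) pair.
melted = wide.reset_index().melt(
    id_vars="date", var_name="product", value_name="units"
)

# Stacking moves the column labels into an inner index level;
# unstacking moves that level back out into columns.
stacked = wide.stack()
unstacked = stacked.unstack()
```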
Session 5 - Groupby and aggregation: split-apply-combine:
• Basic GroupBy
• Hierarchical GroupBy
• Group by function of Index
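The three grouping styles listed for Session 5 might be written as follows. This is a sketch on made-up data; the index labels are chosen only so that a function of the index has something to group on.

```python
import pandas as pd

# Hypothetical sales data with string index labels.
df = pd.DataFrame(
    {
        "region": ["North", "North", "South", "South"],
        "year": [2023, 2024, 2023, 2024],
        "sales": [10, 12, 7, 9],
    },
    index=["n_a", "n_b", "s_a", "s_b"],
)

# Basic GroupBy: split by one column, then sum each group.
by_region = df.groupby("region")["sales"].sum()

# Hierarchical GroupBy: two keys give a result with a two-level index.
by_region_year = df.groupby(["region", "year"])["sales"].sum()

# Group by a function of the index: here, the first character of each label.
by_prefix = df.groupby(lambda label: label[0])["sales"].mean()
```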
Course content day 3:
Session 6 - Groupby and aggregation: split-apply-combine (continued):
• Aggregate by mapping on index and columns
• Aggregate by user-defined functions
• Aggregate using multiple functions
• Aggregate using separate function for each column
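The aggregation variants in Session 6 could look like this minimal sketch. The data, the `value_range` helper and the `mapping` dictionary are all hypothetical names introduced for illustration.

```python
import pandas as pd

# Hypothetical match statistics.
df = pd.DataFrame(
    {
        "team": ["x", "x", "y", "y"],
        "points": [3, 5, 2, 8],
        "fouls": [1, 4, 2, 2],
    }
)

# Aggregate by mapping: a dict assigns each index label to a group.
mapping = {0: "first", 1: "first", 2: "second", 3: "second"}
by_mapping = df.groupby(mapping)["points"].sum()

grouped = df.groupby("team")

# User-defined function: receives each group's values as a Series.
def value_range(s):
    return s.max() - s.min()

ranges = grouped["points"].agg(value_range)

# Multiple functions: one result column per function.
multi = grouped["points"].agg(["min", "max", "mean"])

# Separate function for each column: a dict of column name to function.
per_column = grouped.agg({"points": "sum", "fouls": "max"})
```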
Session 7 - Groupby and aggregation: split-apply-combine (continued):
• Transform
• Apply function
• Pivoting with aggregation
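As a sketch of the Session 7 topics, the example below contrasts `transform` (which returns one value per original row) with `apply` (which here returns one value per group) and finishes with an aggregating pivot. The revenue figures are invented.

```python
import pandas as pd

# Hypothetical quarterly revenue per store.
df = pd.DataFrame(
    {
        "store": ["a", "a", "b", "b"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "revenue": [100.0, 300.0, 50.0, 150.0],
    }
)

# Transform: the group mean is broadcast back to every row of its group.
df["store_mean"] = df.groupby("store")["revenue"].transform("mean")

# Apply: an arbitrary function per group, here the best quarter's share.
top_share = df.groupby("store")["revenue"].apply(lambda s: s.max() / s.sum())

# Pivoting with aggregation: pivot_table combines any duplicate cells.
table = df.pivot_table(
    index="store", columns="quarter", values="revenue", aggfunc="sum"
)
```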
Session 8 - Plotting with matplotlib:
• Pie chart
• Bar chart
• Histogram
• Scatter plot
• Line plot
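Two of the Session 8 chart types might be drawn as in the sketch below, which places a bar chart and a line plot side by side. The categories and counts are made up, and the layout choices are only one reasonable option.

```python
import matplotlib.pyplot as plt

# Hypothetical category counts.
categories = ["A", "B", "C"]
counts = [4, 7, 2]

# One figure with two side-by-side axes.
fig, (ax_bar, ax_line) = plt.subplots(1, 2, figsize=(8, 3))

# Bar chart: one bar per category.
bars = ax_bar.bar(categories, counts)
ax_bar.set_title("Bar chart")

# Line plot: the same values joined by a line, with point markers.
(line,) = ax_line.plot([0, 1, 2], counts, marker="o")
ax_line.set_title("Line plot")

fig.tight_layout()
```

In JupyterLab the figure is normally rendered when the cell finishes; in a plain script, `plt.show()` displays it.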
Course content day 4:
Session 9 - Time series data:
• Basic concepts: datetime, timestamp, timedelta, timezones
• Pandas to_datetime() function
• Date Range
• What is time series data
• Reading time series data
• Missing Dates
• Partial indexing, slicing and selecting
• Resampling
• Moving Window functions
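Several of the Session 9 steps can be sketched on a tiny invented series: parsing dates, exposing a missing date, selecting by partial date string, resampling and applying a moving window. The dates and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical readings; note that 2024-01-03 is missing.
raw = pd.DataFrame(
    {
        "date": ["2024-01-01", "2024-01-02", "2024-01-04", "2024-02-01"],
        "value": [1.0, 2.0, 4.0, 10.0],
    }
)

# to_datetime converts the strings to timestamps; the dates become the index.
raw["date"] = pd.to_datetime(raw["date"])
ts = raw.set_index("date")["value"]

# Missing dates: reindex against a complete daily range to expose the gap.
full_index = pd.date_range("2024-01-01", "2024-01-04", freq="D")
daily = ts.reindex(full_index)

# Partial indexing: a year-month string selects every row in that month.
january = ts.loc["2024-01"]

# Resampling: total per calendar month, labelled by month start.
monthly = ts.resample("MS").sum()

# Moving window: two-day rolling mean after forward-filling the gap.
rolling = daily.ffill().rolling(window=2).mean()
```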
Session 10 - Linear regression:
• What is linear regression
• Simple linear regression
• Multiple regression
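The outline does not say which library the course uses for regression, so the sketch below assumes plain NumPy: `np.polyfit` for a simple straight-line fit and `np.linalg.lstsq` for a fit with two predictors. The data is generated from known coefficients so the fits can be checked by eye.

```python
import numpy as np

# Simple linear regression: points that lie exactly on y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
slope, intercept = np.polyfit(x, y, deg=1)

# Multiple regression: target = 1 + 2*x1 + 3*x2, solved by least squares.
x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = np.array([1.0, 0.0, 2.0, 1.0])
target = 1.0 + 2.0 * x1 + 3.0 * x2

# Design matrix: a column of ones for the intercept, then one per predictor.
design = np.column_stack([np.ones_like(x1), x1, x2])
coeffs, *_ = np.linalg.lstsq(design, target, rcond=None)
```

With noise-free data the fitted values recover the generating coefficients up to floating-point error; real data would leave residuals to inspect.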
Target audience:
This course is designed for anyone with Python programming experience wanting to gain a solid foundation in Python's data analysis libraries. It is a must for aspiring Data Analysts and Scientists. Existing Data Analysts wanting a systematic introduction to Python's Data Analysis tools would also find the course very useful.
Prerequisites:
Programming:
• Delegates are expected to have Python programming experience. They should be able to use Python containers (lists, tuples, dictionaries and sets) effectively, construct loops and conditional statements, write functions, and create and use classes and objects. These skills can be acquired by taking our Python programming 1 course.
Numeracy:
• Ability to calculate and interpret averages, standard deviations and similar basic statistics
• Ability to read and understand charts and graphs
• For linear regression: an understanding of the meaning of a linear graph (or an ability to understand it quickly when explained)
• Mathematics: GCSE or equivalent