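Beginner
Intermediate
Advanced
No prior experience
Small-class teaching + live online sessions + in-person classes
Supplementary guidance through Pluralsight learning videos
Course videos, student stories, 匠人 open classes
Essential skills for data analysts
References provided for outstanding students
Gain data analyst experience
Tech stack of the Business Data Analysis Project Class
Data Analysis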
Python
Pandas
Excel
VBA
Power BI
Tableau
SQL
Anaconda
Predictive Modelling
Data Visualisation
Sklearn
LightGBM
Machine Learning
Google Analytics
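Data analysis career skills
Resume revision
Interview coaching
Mock interviews
Industry analysis
Presentation training
Designed for students with no programming background and for working professionals from business fields. Training is aligned with what companies require of data analysts. A project runs through the whole course to build understanding of data analysis, resume revision is included, and an internship as a data analyst at a local company is optional. 4 industry mentors + 1 full-time Tutor + class representatives and group leaders help keep students on track.
Start date: 2021-3-3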
1. Introduction
- Data structures (Structured data, semi-structured data, unstructured data)
- RDB and RDBMS concepts
(Relational databases and relational database management systems)
- Data analysis flow and example
- Data analyst job requirements
2. SQL components (1)
- SQL DDL (data definition language) - CREATE, DROP, TRUNCATE, ALTER, COMMENT, RENAME
- DB tools - DBeaver introduction and installation
Assignment:
1. Install DBeaver
2. sql_components_tutorial Q1 and Q2
3. Register an AWS account (if you don't have one)
1. Amazon Web Services - RDS (Relational database service) Introduction
2. SQL components (2)
- DML (data manipulation language)
- DCL (data control language)
- TCL (transaction control language)
SQL common data types, operators and functions:
Common data types:
- Text / Varchar
- Integer
- Date / Timestamp
- Serial
Operators:
- Arithmetic operators
- Comparison operators
- Logical operators
SQL aggregation functions:
- Sum
- Count
- Avg
SQL common functions:
- Min / Max
- Distinct
- Substring
- Length
- Upper / Lower
- Coalesce
- Extract
- Concat
- Case statement
- Cast / to_date
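A minimal sketch of several of the aggregation and common functions listed above, run through Python's built-in `sqlite3` module so it needs no database server. The `customers` table and its values are made up for illustration. SQLite is an assumption here, not the course database: it spells some functions differently (`SUBSTR` and `||` instead of `SUBSTRING` and `CONCAT`), so the exact syntax may differ on the database used in class.

```python
import sqlite3

# In-memory database with a small, hypothetical customers table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT, spend REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, "alice", "Brisbane", 120.0), (2, "bob", None, 40.0), (3, "carol", "Sydney", None)],
)

# Row-level functions: Upper, Substring, Length, Coalesce, Case, Cast.
row_query = """
SELECT
    UPPER(SUBSTR(name, 1, 1)) || SUBSTR(name, 2) AS display_name,
    LENGTH(name)                                  AS name_length,
    COALESCE(city, 'Unknown')                     AS city,
    CASE WHEN COALESCE(spend, 0) >= 100 THEN 'high' ELSE 'low' END AS segment,
    CAST(COALESCE(spend, 0) AS INTEGER)           AS spend_int
FROM customers
ORDER BY id
"""
rows = conn.execute(row_query).fetchall()

# Aggregations: COUNT(spend) skips NULLs, COUNT(*) does not.
agg_query = """
SELECT COUNT(*), COUNT(spend), SUM(spend), AVG(spend), MIN(spend), MAX(spend)
FROM customers
"""
aggregates = conn.execute(agg_query).fetchone()

for row in rows:
    print(row)
print(aggregates)
```

Note how the NULL `spend` is excluded from `AVG`, which is why the average is taken over two rows rather than three.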
SQL Tutorial exercise and case study 1
1. SQL Components - Tutorial (Lesson 2)
2. SQL and database fundamentals - DCL - Lab (Lesson 2)
3. SQL logical operators - Lab (Lesson 3)
4. SQL aggregation - Lab (2) (Lesson 3)
5. SQL common functions - Lab (Lesson 3)
6. Case study (Lesson 3)
SQL joins and window functions
Joins:
- No join
- Inner join
- Left join
- Right join
- Full join
- *Cross join
- *union
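The difference between the join types above is easiest to see on two tiny tables. The following is a sketch only, using Python's built-in `sqlite3` with hypothetical `customers` and `orders` tables. Right and full joins are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO orders VALUES (10, 1, 50.0), (11, 1, 25.0), (12, 2, 80.0);
""")

# Inner join: only customers that have at least one order appear.
inner_rows = conn.execute("""
SELECT c.name, o.amount
FROM customers c
INNER JOIN orders o ON o.customer_id = c.customer_id
ORDER BY o.order_id
""").fetchall()

# Left join: every customer appears; Carol has no orders, so her count is 0.
left_rows = conn.execute("""
SELECT c.name, COUNT(o.order_id) AS n_orders
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY c.customer_id
""").fetchall()

# Cross join: every customer paired with every order (3 x 3 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM customers CROSS JOIN orders"
).fetchone()[0]

# Union: stacks two result sets and removes duplicate rows.
union_rows = conn.execute("""
SELECT customer_id FROM customers
UNION
SELECT customer_id FROM orders
ORDER BY customer_id
""").fetchall()

print(inner_rows, left_rows, cross_count, union_rows)
```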
Window Functions:
- Row_number / Rank / Dense_rank
- Lead / Lag
- First_value / Last_value
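A short sketch of the ranking and offset window functions listed above, again through `sqlite3` (this assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added). The `sales` table is hypothetical. The tie on `amount` is deliberate, to show where `ROW_NUMBER`, `RANK`, and `DENSE_RANK` diverge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (rep TEXT, month INTEGER, amount INTEGER);
INSERT INTO sales VALUES
    ('ann', 1, 100), ('ann', 2, 150), ('ann', 3, 150),
    ('ben', 1, 90),  ('ben', 2, 70);
""")

rows = conn.execute("""
SELECT
    rep, month, amount,
    -- ROW_NUMBER always gives unique numbers; ties are broken by month here.
    ROW_NUMBER() OVER (PARTITION BY rep ORDER BY amount DESC, month) AS row_num,
    -- RANK leaves a gap after ties; DENSE_RANK does not.
    RANK()       OVER (PARTITION BY rep ORDER BY amount DESC) AS rnk,
    DENSE_RANK() OVER (PARTITION BY rep ORDER BY amount DESC) AS dense_rnk,
    -- LAG looks one row back within the same rep, ordered by month.
    LAG(amount)  OVER (PARTITION BY rep ORDER BY month) AS prev_amount
FROM sales
ORDER BY rep, month
""").fetchall()

for row in rows:
    print(row)
```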
Query optimization (Structure & Layout)
- with clause, views
Query optimization (Performance)
- materialized view
- indexes
- Table partitioning
1. Case study - product spec & performance tuning
2. Leetcode - medium level challenge
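· Introduction to Python and the characteristics of common IDEs (assuming students have already completed installation with the Tutor's help)
· Basic programming concepts - Object, Class, Method, Attribute
· The three most important Python data structures (List, Tuple, Dictionary): their basic characteristics and use cases
· Conversion between data types (Int, String, Character, Float) and common methods for data processing
· The datetime package for working with date and time data, simulating batch processing of dates and times in inconsistent formats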
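As an illustration of the last point, here is one possible way to batch-process date strings that arrive in inconsistent formats. The list of formats and the helper name `parse_mixed` are assumptions made for this sketch, not part of the course material.

```python
from datetime import datetime

# Hypothetical set of formats we expect to encounter in the raw data.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %b %Y", "%Y%m%d %H:%M")


def parse_mixed(text):
    """Try each known format in turn; return None if none of them match."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    return None


raw = ["2021-03-03", "03/03/2021", " 3 Mar 2021", "20210303 18:30", "not a date"]
parsed = [parse_mixed(value) for value in raw]

# Normalise everything that parsed into a single ISO 8601 format.
iso = [d.isoformat() if d else None for d in parsed]
print(iso)
```

Returning `None` for unparseable values, rather than raising, keeps the batch running and leaves the bad rows easy to find afterwards.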
Python Control Flow:
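Conditional statements (if / elif / else; case)
Basic loops (for, while, range vs. xrange)
Error handling (try / except / pass)
List comprehension
The relationship between functions and classes (function, class & object)
Function argument passing, inheritance, and decorators
Lambda anonymous functions
Nested loops (using break/continue to optimise deeply nested loops)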
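Several of the control-flow topics above fit into one small sketch: a decorator that wraps error handling around a function, a list comprehension, a lambda used as a sort key, and `break` to leave a nested loop early. The names `safe` and `to_ratio` are hypothetical and chosen only for this example.

```python
import functools


def safe(default=None):
    """Decorator factory: return `default` instead of raising on bad input."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except (ValueError, ZeroDivisionError):
                return default
        return wrapper
    return decorator


@safe(default=0.0)
def to_ratio(numerator, denominator):
    return float(numerator) / float(denominator)


pairs = [("10", "4"), ("3", "0"), ("abc", "2"), ("9", "3")]

# List comprehension: the bad rows fall back to 0.0 instead of crashing.
ratios = [to_ratio(n, d) for n, d in pairs]

# Lambda as a key function: the pair with the largest ratio.
largest = max(pairs, key=lambda pair: to_ratio(*pair))

# Nested loop with break: stop both loops at the first i * j == 6.
first_match = None
for i in range(1, 5):
    for j in range(1, 5):
        if i * j == 6:
            first_match = (i, j)
            break
    if first_match:
        break

print(ratios, largest, first_match)
```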
Exercise:
6 different in-class exercises to master Python control flow
Use Pandas for data wrangling:
1. From NumPy to Pandas
2. Series, TimeSeries & DataFrame
3. Structural understanding of the DataFrame object and its methods & attributes
4. Data exploration using DataFrame
5. Read/write using DataFrame
6. Advanced DataFrame filtering & sorting
7. 3 different methods to loop through a DataFrame
8. Map & Apply with self-defined functions & anonymous functions
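A small sketch of items 6 to 8, assuming pandas is installed. The syllabus does not say which three looping methods are taught; `iterrows`, `itertuples`, and a vectorised expression are used here as one plausible trio. The data is made up.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "product": ["A", "B", "C", "D"],
        "price": [10.0, 25.0, 40.0, 5.0],
        "qty": [3, 1, 2, 10],
    }
)

# Filtering with a combined boolean mask, then sorting.
expensive = df[(df["price"] > 8) & (df["qty"] >= 2)].sort_values("price", ascending=False)

# Three ways to "loop" over rows; the vectorised form is usually the fastest.
total_iterrows = sum(row["price"] * row["qty"] for _, row in df.iterrows())
total_itertuples = sum(row.price * row.qty for row in df.itertuples(index=False))
total_vectorised = (df["price"] * df["qty"]).sum()


# map with a self-defined function (element-wise on one column).
def price_band(price):
    return "premium" if price >= 20 else "standard"


df["band"] = df["price"].map(price_band)

# apply with an anonymous function (row-wise across columns).
df["revenue"] = df.apply(lambda row: row["price"] * row["qty"], axis=1)

print(expensive)
print(df)
```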
Data wrangling using Pandas, part 2
1. Convert numerical data to categorical data using the bin function
2. Merge & Concat, the equivalents of join & union in SQL (performance comparison and bottlenecks when processing data in SQL or Pandas)
3. Groupby and Pivot in Pandas
4. Advanced usage of Groupby with agg, apply & map
Exercise:
1 in-class exercise and 5 take-home exercises
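The four topics above can be sketched on two tiny, made-up tables. This assumes pandas is installed and uses `pd.cut` as the binning function, which may or may not be the one meant by "bin function" in the syllabus.

```python
import pandas as pd

orders = pd.DataFrame({"customer_id": [1, 1, 2, 3], "amount": [20, 250, 90, 600]})
customers = pd.DataFrame({"customer_id": [1, 2, 4], "state": ["QLD", "NSW", "VIC"]})

# 1. Binning: turn a numeric column into labelled categories.
#    Bins are right-inclusive by default: (0, 100], (100, 500], (500, 1000].
orders["size"] = pd.cut(
    orders["amount"], bins=[0, 100, 500, 1000], labels=["small", "medium", "large"]
).astype(str)

# 2. merge is the counterpart of a SQL join (here: LEFT JOIN on customer_id).
merged = orders.merge(customers, on="customer_id", how="left")

#    concat stacks rows like UNION ALL (duplicates are kept).
stacked = pd.concat([orders, orders], ignore_index=True)

# 3./4. groupby with several aggregations at once; rows with a missing state are dropped.
summary = merged.groupby("state")["amount"].agg(["sum", "mean", "count"])

#    pivot_table reshapes the same data into a state x size grid.
pivot = merged.pivot_table(
    index="state", columns="size", values="amount", aggfunc="sum", fill_value=0
)

print(merged)
print(summary)
print(pivot)
```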
1. Basic concepts of Pandas Series
2. Time series explained, and common uses of time series in real-life data analysis
3. Advanced datetime toolkit for populating time series
4. Time series wrangling (Resample, Offset, Freq, ...)
5. Time series plotting and moving window functions
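A minimal sketch of populating, resampling, and smoothing a time series, assuming pandas is installed. The daily sales figures are synthetic (simply 1 to 14) so the arithmetic is easy to follow.

```python
import pandas as pd

# Populate a daily index: 14 days starting Monday 2021-03-01.
idx = pd.date_range("2021-03-01", periods=14, freq="D")
daily = pd.Series(range(1, 15), index=idx, name="sales")

# Resample: daily -> weekly totals (weeks end on Sunday by default).
weekly = daily.resample("W").sum()

# Moving window: 3-day rolling mean (the first two values are NaN).
rolling_mean = daily.rolling(window=3).mean()

# Shift: day-over-day change.
change = daily - daily.shift(1)

# Offset: jump from a date to the end of its month.
month_end = idx[0] + pd.offsets.MonthEnd(1)

print(weekly)
print(rolling_mean.tail(3))
print(month_end)
```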
All Python tutorial exercises & answers (recording after it's available)
Exercise Topics:
2. Loop, String, Datetime
5. Pandas exercise
Common classical time series modelling approaches
1. Matplotlib for exploratory plots
2. Matplotlib basic graphs (line chart, bar chart, scatter chart)
3. Matplotlib advanced plots (3D scatter, correlation heatmap, MSNO chart)
4. What is machine learning (knowledge tree and machine learning life cycle)
5. Quick in-class example of a KNN model
Regression Model:
1. From a simple single neuron to understanding the fundamental regression classifier model
2. Write your own single-neuron classifier model from scratch
3. Key concepts for regression models: loss function, gradient descent, activation function
4. How the logistic (sigmoid) function serves as the activation function in logistic regression
5. How logistic regression fights overfitting (L1, L2 regularizers explained)
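One way to write the single-neuron classifier from scratch is sketched below with NumPy. It is not the course's reference solution: the function names, learning rate, epoch count, and toy data are all assumptions. It shows the three key concepts together: the sigmoid as activation function, log loss as loss function, and plain gradient descent, with an optional L2 penalty on the weights.

```python
import numpy as np


def sigmoid(z):
    """Activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(X, y, w, b):
    """Loss function: average negative log-likelihood of the labels."""
    p = np.clip(sigmoid(X @ w + b), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))


def train_neuron(X, y, lr=0.3, epochs=5000, l2=0.0):
    """Gradient descent on log loss, plus an optional L2 penalty (l2 / 2) * ||w||^2."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)                      # forward pass
        grad_w = X.T @ (p - y) / n_samples + l2 * w  # gradient w.r.t. weights
        grad_b = np.mean(p - y)                      # gradient w.r.t. bias
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data: one feature, class 1 when x is 3 or more.
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1], dtype=float)

initial_loss = log_loss(X, y, np.zeros(1), 0.0)
w, b = train_neuron(X, y)
final_loss = log_loss(X, y, w, b)
pred = (sigmoid(X @ w + b) >= 0.5).astype(int)

# The same model with L2 regularisation ends up with a smaller weight.
w_l2, b_l2 = train_neuron(X, y, l2=0.1)

print(pred, round(final_loss, 4), w, w_l2)
```

On perfectly separable data like this, the unregularised weight keeps growing the longer training runs; the L2 penalty is what keeps it bounded.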
Tree-Based Models & Ensemble Models
1. Basic decision tree explained
2. Key concepts in decision tree models - information gain (Shannon entropy, Gini impurity)
3. Use Sklearn to build a basic decision tree
4. Classification model performance measures
5. Feature importance & tree visualization techniques
6. How tree models fight overfitting (pruning or going ensemble)
7. Ensemble tree models - parallel trees (bagging, random forest)
8. Ensemble tree models - sequential trees - boosting tree family (GBDT, AdaBoost, XGBoost, LightGBM, CatBoost)
9. In-class exercise: income prediction using different tree models
10. Hybrid ensemble models: majority vote classifier & stacking classifier
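The progression from a single tree to bagging, boosting, and a majority-vote ensemble can be sketched with scikit-learn as follows. The data is synthetic (`make_classification`), not the income data set used in class, and the hyperparameters are arbitrary choices for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem standing in for real data.
X, y = make_classification(n_samples=500, n_features=10, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    # A single, depth-limited tree (limiting depth is a simple form of pruning).
    "tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    # Parallel trees: bagging with random feature subsets.
    "forest": RandomForestClassifier(n_estimators=50, random_state=0),
    # Sequential trees: each tree corrects the errors of the previous ones.
    "boosting": GradientBoostingClassifier(random_state=0),
}
# Hybrid ensemble: majority vote over the three models above.
models["vote"] = VotingClassifier(estimators=list(models.items()), voting="hard")

# Fit each model and record its accuracy on the held-out test set.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}

# Feature importances from the random forest sum to 1.
importances = models["forest"].feature_importances_

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```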
All Python tutorial exercises & answers (recording after it's available)
Exercise Topics:
1. List & Dictionary
2. Loop, String, Datetime
3. Numpy & Matplotlib
5. Pandas (no separate exercise; the tutorial will focus on the in-class exercises inside the corresponding lesson)
6. Machine Learning (same as above)
Everything you need to know to build an end-to-end machine learning model:
1. Categorical data encoding
2. Scaling techniques (when and how)
3. Save model & scaler using Pickle or Joblib
4. The curse of dimensionality (dimension reduction vs. dimension selection)
5. Dimension reduction (PCA explained and practised)
6. Machine learning pipelines using Sklearn
7. Validation explained
8. Parameter tuning - grid search technique
9. Model performance measures (confusion matrix, ROC & AUC)
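A compact sketch of how most of these steps can fit together in scikit-learn: encoding and scaling inside a `ColumnTransformer`, a `Pipeline`, cross-validated grid search, evaluation with a confusion matrix and ROC AUC, and saving the fitted pipeline with Joblib. The data frame is randomly generated for illustration, and the column names and parameter grid are assumptions.

```python
import io

import joblib
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic data: two numeric columns, one categorical column, binary target.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame(
    {
        "age": rng.integers(18, 70, n),
        "hours": rng.integers(10, 60, n),
        "sector": rng.choice(["private", "public", "self"], n),
    }
)
signal = 0.05 * df["age"] + 0.08 * df["hours"] + (df["sector"] == "self") * 1.0
noisy = signal + rng.normal(0, 1, n)
y = (noisy > noisy.median()).astype(int)

# Scaling for numeric columns, one-hot encoding for the categorical one.
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), ["age", "hours"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["sector"]),
    ]
)
pipe = Pipeline([("prep", preprocess), ("clf", LogisticRegression(max_iter=1000))])

X_train, X_test, y_train, y_test = train_test_split(
    df, y, test_size=0.25, random_state=0, stratify=y
)

# Grid search with 5-fold cross-validation over the regularisation strength C.
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)

auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
cm = confusion_matrix(y_test, search.predict(X_test))

# Save the whole fitted pipeline (scaler + encoder + model) and load it back.
buffer = io.BytesIO()
joblib.dump(search.best_estimator_, buffer)
buffer.seek(0)
restored = joblib.load(buffer)

print(search.best_params_, round(auc, 3))
print(cm)
```

Saving the pipeline as one object, rather than the model and scaler separately, means the same preprocessing is guaranteed to be applied at prediction time. An in-memory buffer is used here only to keep the sketch free of file paths.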
Exercise:
1. QLD Accident severity prediction
2. Income prediction
All Python tutorial exercises & answers (recording after it's available)
Exercise Topics:
1. List & Dictionary
2. Loop, String, Datetime
· Variables, constants and data types
· Excel object hierarchy
· Loops & if statements
· Nested loops (using Exit and GoTo statements)
· String handling & datetime handling
· Arrays
· Error handling
· Forms & controls
1. Using the Macro Recorder (pros & cons)
2. Advanced formulas (SUMIFS, COUNTIFS, GETPIVOTDATA, array formulas; more will be covered in tutorial time)
3. Pivot table basics, and how to use pivot tables as part of your Excel automation
4. Formula Manager, a unique VBA tool for integrating VBA with Excel's built-in formulas
5. Events - different ways to fire your VBA code
6. How to further practise your skills to advance to expert Excel user
Machine Learning Project - Queensland Traffic
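Part 1: Industry analysis and career choice
The data job market in Australia and job-hunting methods
Choosing an industry for data roles
Differences in the core attributes of data roles
Core skill requirements and challenges of data analysis roles
Part 2: Resumes for data job applications
Starting from a clear career position, and drawing on past academic and work experience, tailor a resume for the target data analyst role.
How to optimise the distribution of content
How to optimise the wording
How to strengthen the link between academic background and experience and the target job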
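- Introduction to data visualization and different data visualization tools
- Learn how to connect to different data sources and build simple Tableau charts, such as bar chart, line chart, text table, highlight table, map, tree map, and so on
- Learn to add different lines in Tableau (reference line, trend line, forecast line)
- Basic Dashboard & Story operations
- Learn how to connect different data sources in Tableau (covering what Union, Join, and Blend each mean and how to perform them)
- Tableau's three commonly used basic and advanced calculations (calculated field, table calculation, and LOD expressions)
- Commonly used important Tableau features (Group & Set, Bin, Hierarchy, Parameter)
- Learn how to use Pages to build dynamic Tableau dashboards
- Learn how to create a novel Hex Map
This part applies the knowledge learned earlier and also introduces some new charts commonly used at work (donut chart, waterfall chart) and some actions, ending with a complete Tableau dashboard.
This part starts from marketing strategy and introduces some concepts of marketing analytics, the applications of data analysis, and what a data analysis role involves. Comparing the Marketing Analyst and Data Analyst roles helps students make better career choices and plans. The course also covers some Google Analytics concepts and operations, giving students a clearer picture of a Marketing Analyst's work.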
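Commercial projects in groups
A 3-month course plus 1 additional commercial project; each group completes the project's data analysis together, finishing with an inter-group Battle and Presentation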
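Covering data careers in both China and Australia
Aligned with companies' DA hiring requirements: master data visualization, business intelligence analysis, SQL, machine learning and more, and learn the essential skills of a data analyst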
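The most comprehensive data analysis course in Australia
One project runs through the course content: learn Amazon AWS cloud database technology, Python deep learning, the Excel/VBA skills most commonly used in companies, and Tableau/Power BI data visualization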
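Mentor + Tutor Q&A
4 teachers + 1 full-time Tutor; ask any question about job hunting, data analysis, the workplace, company employment and more, and learn to think about interviews from the interviewer's perspective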
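Project overview:
Use Python packages (Pandas, scikit-learn, LightGBM, Seaborn) to classify income prediction data, helping the sales team identify target consumer groups for ad delivery.
Expected outcomes
Tech stack: Python, Pandas, scikit-learn, LightGBM, Seaborn
Project overview:
Practice in classification and numerical prediction on classic data sets (frequently used in written tests for data analyst interviews)
Expected outcomes
Tech stack: classification, numerical prediction, data exploration, regression algorithms, pipeline, n-fold cross-validation
Use the Queensland Government traffic accident data set to build an effective Traffic Accident Severity Prediction model
This data set is not a toy data set tailored for machine learning: users must apply domain expertise and implicit knowledge to carry out substantial data preprocessing, and it contains a considerable amount of missing data (close to the challenges of modelling in a real business environment)
Expected outcomes
Tech stack: team project, team Battle, data visualization, machine learning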
Tsinghua University high achiever with fifteen years of research work experience
Data Engineer
Performance and Insight Specialist
Senior Analyst
Data scientist chapter lead
Senior Customer Insights Analyst