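Beginner
Intermediate
Advanced
No prior experience
Small-class teaching + live online sessions + in-person classes
Supplementary guidance with Pluralsight learning videos
Course videos, student stories, 匠人 open classes
Essential skills for data analysts
References provided for outstanding students
Gain data analyst experience
Tech stack of the hands-on Business Data Analysis course
Data analysis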
Python
Pandas
Excel
VBA
Power BI
Tableau
SQL
Anaconda
Predictive Modelling
Data Visualisation
Sklearn
LightGBM
Machine Learning
Google Analytics
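Data analysis career skills
Résumé revision
Interview coaching
Mock interviews
Industry analysis
Presentation training
For students with no programming background and working professionals from business fields. Training is aligned with companies' data analyst requirements and uses projects throughout to build an understanding of data analysis, with résumé revision and an optional data analyst internship at a local company. 4 industry mentors + 1 Tutor throughout the course + class representatives and group leaders help ensure learning success.
Start date: 2020-11-5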
1. Introduction
- Data structures (Structured data, semi-structured data, unstructured data)
- RDB and RDBMS concepts (relational databases and relational database management systems)
- Data analysis flow and example
- Data analyst job requirements
2. SQL components (1)
- SQL DDL (data definition language) - CREATE, DROP, TRUNCATE, ALTER, COMMENT, RENAME
- DB tools - DBeaver introduction and installation
Assignment:
1. Install DBeaver
2. sql_components_tutorial Q1 and Q2
3. Register an AWS account (if you don't have one)
1. Case study - product spec & performance tuning
2. Leetcode - medium level challenge
1. Amazon Web Services - RDS (Relational database service) Introduction
2. SQL components (2)
- DML (data manipulation language)
- DCL (data control language)
- TCL (transaction control language)
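The SQL component categories above can be tried out without installing a database server. The following is a minimal sketch using Python's built-in sqlite3 module, not course material; the customers table and its columns are hypothetical. SQLite has no user accounts, so DCL statements such as GRANT and REVOKE are not shown.

```python
import sqlite3

# In-memory SQLite database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define the schema.
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# DML: insert and update rows.
cur.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Alice", "Brisbane"), ("Bob", "Sydney")],
)
cur.execute("UPDATE customers SET city = ? WHERE name = ?", ("Melbourne", "Bob"))
conn.commit()  # TCL: make the changes above permanent

# TCL: a change that is rolled back never becomes visible.
cur.execute("DELETE FROM customers")
conn.rollback()

rows = cur.execute("SELECT name, city FROM customers ORDER BY id").fetchall()
print(rows)
```

Both rows survive the rolled-back DELETE, and Bob's city reflects the committed UPDATE.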
SQL common data types, operators and functions:
Common data types:
- Text / Varchar
- Integer
- Date / Timestamp
- Serial
Operators:
- Arithmetic operators
- Comparison operators
- Logical operators
SQL aggregation functions:
- Sum
- Count
- Avg
SQL common functions:
- Min / Max
- Distinct
- Substring
- Length
- Upper / Lower
- Coalesce
- Extract
- Concat
- Case statement
- Cast / to_date
SQL joins and window functions
Joins:
- No join
- Inner join
- Left join
- Right join
- Full join
- *Cross join
- *union
Window Functions:
- Row_number / Rank / Dense_rank
- Lead / Lag
- First_value / Last_value
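The join types and window functions listed above can be explored with a small in-memory database. This is an illustrative sketch only, using Python's built-in sqlite3 module and assuming SQLite 3.25 or later (needed for window functions); the customers and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO orders VALUES (1, 1, 50), (2, 1, 80), (3, 2, 30);
""")

# LEFT JOIN keeps every customer, including Carol, who has no orders.
left_join = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

# ROW_NUMBER numbers each customer's orders from largest to smallest;
# LAG looks back at the previous row within the same partition.
ranked = conn.execute("""
    SELECT customer_id,
           amount,
           ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rn,
           LAG(amount)  OVER (PARTITION BY customer_id ORDER BY amount DESC) AS prev_amount
    FROM orders
    ORDER BY customer_id, rn
""").fetchall()

print(left_join)
print(ranked)
```

Swapping LEFT JOIN for INNER JOIN drops the Carol row, which is a quick way to see the difference between the two.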
Query optimization (Structure & Layout)
- with clause, views
Query optimization (Performance)
- materialized view
- indexes
- Table partitioning
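· Introduction to Python and the characteristics of commonly used IDEs (assuming students have already completed installation with the Tutor's help)
· Basic programming concepts - Object, Class, Method, Attribute
· The three most important Python data structures (List, Tuple, Dictionary): their basic characteristics and use cases
· Conversion between data types (Int, String, Character, Float) and common methods for data processing
· The datetime package for working with date and time data, simulating batch processing of dates and times in inconsistent formats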
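As an illustration of the last point, batch-cleaning dates that arrive in inconsistent formats, here is a minimal sketch using only the standard datetime module. The list of formats and the sample values are assumptions made up for this example, not taken from the course.

```python
from datetime import datetime

# Hypothetical list of formats expected in the raw data.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d %H:%M")

def parse_datetime(text):
    """Try each known format in turn; return None if none of them match."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
    return None

raw = ["2020-11-05", "05/11/2020", "20201105 09:30", "not a date"]
parsed = [parse_datetime(value) for value in raw]

# Normalise everything that parsed to ISO 8601 text; keep None for failures.
iso = [d.isoformat() if d else None for d in parsed]
print(iso)
```

Returning None rather than raising keeps the batch running, so unparseable values can be reviewed afterwards.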
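Python Control Flow:
Conditional statements (if, elif, else; case)
Basic loops (for, while, range vs. xrange)
Error handling (try, except, pass)
List comprehension
The relationship between functions and classes (function, class & object)
Basic function parameter passing, inheritance, and decorators
Lambda anonymous functions
Nested loops (using break/continue to optimise multi-level nested loops)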
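Several of the control flow topics above fit into one short example. The sketch below is illustrative only and combines error handling, a decorator, a list comprehension, and a lambda; the function names and sample data are hypothetical.

```python
import functools

def count_calls(func):
    """Decorator that counts how many times the wrapped function is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls
def to_float(text):
    """Convert text to float; return None for values that cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        return None

raw = ["3.5", "oops", "10", ""]

# List comprehension with a condition: keep only the values that parsed.
values = [v for v in (to_float(t) for t in raw) if v is not None]

# Lambda as a sort key: sort in descending order.
descending = sorted(values, key=lambda v: -v)

print(values, descending, to_float.calls)
```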
Exercise:
6 different in-class exercises to master Python control flow
Use Pandas for data wrangling:
1. From NumPy to Pandas
2. Series, TimeSeries & DataFrame
3. Structural understanding of the DataFrame object and its methods & attributes
4. Data exploration using DataFrame
5. Read/write using DataFrame
6. Advanced DataFrame filtering & sorting
7. 3 different methods to loop through a DataFrame
8. Map & Apply with self-defined functions & anonymous functions
Data wrangling using Pandas, part 2
1. Convert numerical data to categorical data using the bin function
2. Merge & Concat, the equivalents of join & union in SQL (performance comparison and bottlenecks when processing data in SQL or Pandas)
3. Groupby and Pivot in Pandas
4. Advanced usage of Groupby with agg, apply & map
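The wrangling steps listed above could look roughly like the following. This is a minimal sketch assuming pandas is installed; the data is made up, and pd.cut is used here as one way to do the binning (the syllabus does not say which function the course uses).

```python
import pandas as pd

# Hypothetical data for illustration.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "age": [23, 41, 67]})
orders = pd.DataFrame({"customer_id": [1, 1, 2], "amount": [50, 80, 30]})

# Binning: convert a numerical column into labelled categories.
customers["age_group"] = pd.cut(
    customers["age"], bins=[0, 30, 60, 120], labels=["young", "middle", "senior"]
)

# merge is the counterpart of a SQL join (here a LEFT JOIN on customer_id).
joined = customers.merge(orders, on="customer_id", how="left")

# concat stacks frames vertically, similar to UNION ALL.
stacked = pd.concat([orders, orders], ignore_index=True)

# groupby + agg is the counterpart of GROUP BY with aggregate functions.
summary = joined.groupby("customer_id")["amount"].agg(["sum", "count"])

print(joined)
print(summary)
```

Customer 3 has no orders, so the left join keeps the row with a missing amount and the summary reports a count of zero for it.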
Exercise:
1 in-class exercise and 5 take-home exercises
1. Basic concept of Pandas Series
2. Time series explained and common usage of time series in real-life data analysis
3. Advanced datetime toolkit for populating time series
4. Time series wrangling (Resample, Offset, Freq, ...)
5. Time series plotting and moving window functions
Common classic time series modelling approaches
1. Matplotlib for exploration plots
2. Matplotlib basic graphs (line chart, bar chart, scatter chart)
3. Matplotlib advanced plots (3D scatter, correlation heatmap, msno (missingno) chart)
4. What is machine learning (knowledge tree and machine learning life cycle)
5. Quick in-class example of a KNN model
Regression Model:
1. Understanding the fundamental regression classifier model, starting from a simple single neuron
2. Write your own single-neuron classifier model from scratch
3. Key concepts for regression models: loss function, gradient descent, activation function
4. How the logistic (sigmoid) function serves as the activation function in logistic regression
5. How logistic regression fights overfitting (L1, L2 regularisers explained)
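To make the single-neuron idea above concrete, here is a minimal from-scratch sketch in plain Python, not the course's own implementation. It uses one input feature, a sigmoid activation, log loss, and batch gradient descent; the toy data, learning rate, epoch count, and the optional l2 penalty parameter are all assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_neuron(xs, ys, lr=0.1, epochs=5000, l2=0.0):
    """Fit weight w and bias b by batch gradient descent on the mean log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in zip(xs, ys)) / n
        # The L2 penalty (if any) shrinks the weight toward zero.
        w -= lr * (grad_w + l2 * w)
        b -= lr * grad_b
    return w, b

# Toy one-feature data set (hypothetical): the label is 1 when x is large.
xs = [0.0, 1.0, 2.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = train_neuron(xs, ys)
predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]
print(w, b, predictions)
```

Passing a small positive l2 value keeps the weight from growing without bound on separable data like this, which is the intuition behind L2 regularisation.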
Tree Based Models & Ensemble Models
1. Basic decision tree explained
2. Key concept in the decision tree model - information gain (Shannon entropy, Gini impurity)
3. Use Sklearn to build a basic decision tree
4. Classification model performance measures
5. Feature importance & tree visualization techniques
6. How tree models fight overfitting (pruning or going ensemble)
7. Ensemble tree models - parallel trees (bagging, random forest)
8. Ensemble tree models - sequential trees - boosting tree family (GBDT, AdaBoost, XGBoost, LightGBM, CatBoost)
9. In-class exercise: income prediction using different tree models
10. Hybrid ensemble models: majority vote classifier & stacking classifier
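The impurity measures behind information gain in the list above are short enough to compute by hand. The following is an illustrative sketch in plain Python; the helper names and the toy labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, left, right, impurity=entropy):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted

parent = ["yes", "yes", "yes", "no", "no", "no"]
left, right = ["yes", "yes", "yes"], ["no", "no", "no"]  # a perfect split

print(entropy(parent), gini(parent))
print(information_gain(parent, left, right))
print(information_gain(parent, left, right, impurity=gini))
```

A decision tree evaluates candidate splits this way and keeps the one with the largest gain.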
Everything you need to know to build an end-to-end machine learning model:
1. Categorical data encoding
2. Scaling techniques (when and how)
3. Save the model & scaler using Pickle or Joblib
4. The curse of dimensionality (dimension reduction vs. dimension selection)
5. Dimension reduction (PCA explained and practised)
6. Machine learning pipeline using Sklearn
7. Validation explained
8. Parameter tuning - grid search technique
9. Model performance measures (confusion matrix, ROC & AUC)
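The performance measures in the last item above can be computed by hand to see what they mean. This is a plain-Python sketch with made-up labels and scores; in practice sklearn.metrics provides confusion_matrix and roc_auc_score for the same purpose.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels encoded as 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    return tn, fp, fn, tp

def roc_auc(y_true, scores):
    """AUC as the chance that a random positive scores above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # A tie between a positive and a negative counts as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical true labels and model scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold at 0.5

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(tn, fp, fn, tp, auc)
```

The confusion matrix depends on the chosen threshold, while AUC summarises ranking quality across all thresholds.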
Exercise:
1. QLD Accident severity prediction
2. Income prediction
1. Using the Macro recorder (pros & cons)
2. Advanced formulas (SUMIFS, COUNTIFS, GETPIVOTDATA, ARRAYFORMULA - more will be covered in tutorial time)
3. Pivot table basics, and how to use pivot tables as part of your Excel automation
4. Formula Manager, a unique VBA tool to integrate VBA with Excel built-in formulas
5. Events - different ways to fire your VBA code
6. How to further practise your skills to progress from advanced to expert Excel user
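- Introduction to data visualization and the different data visualization tools
- Learn how to connect to different data sources and build simple Tableau charts, such as bar chart, line chart, text table, highlight table, map, and tree map
- Learn how to add different lines in Tableau (reference line, trend line, forecast line)
- Basic Dashboard & Story operations
- Learn how to connect different data sources in Tableau (covering the meaning of Union, Join, and Blend and how to perform each)
- The three commonly used basic and advanced calculations in Tableau (calculated field, table calculation, and LOD expressions)
- Important, commonly used Tableau features (Group & Set, Bin, Hierarchy, Parameter)
- Learn how to use Pages to build dynamic Tableau dashboards
- Learn how to create a novel Hex Map
This part applies what has been learned so far and introduces some new charts commonly used at work (donut chart, waterfall chart) along with some actions, finishing with a complete Tableau dashboard.
This part starts from marketing strategy and introduces some Marketing Analytics concepts, the applications of data analysis, and what data analysis roles involve. By comparing the Marketing Analyst and Data Analyst roles, it helps students make better career choices and plans. The course also covers some Google Analytics concepts and operations, giving students a clearer picture of a Marketing Analyst's work.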
All Python tutorial exercises & answers (recordings provided once available)
Exercise Topics:
1. List & Dictionary
2. Loop, String, Datetime
3. Numpy & Matplotlib
5. Pandas (no separate exercise; the tutorial will focus on the in-class exercises inside the corresponding lesson)
6. Machine Learning (same as above)
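For students with no programming background and working professionals from business fields. Training is aligned with companies' data analyst requirements and uses projects throughout to build an understanding of data analysis, with résumé revision and an optional data analyst internship at a local company. 4 industry mentors + 1 Tutor throughout the course + class representatives and group leaders help ensure learning success.
Start date: 2021-2-12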
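The data talent gap is an opportunity. Can you seize it first?
As more and more well-known companies rely on data to make key decisions, demand for data analysis talent grows by the day.
According to authoritative third-party statistics, hiring demand for data analysts rose 150% in 2017, with average salaries increasing by as much as 10%; demand in 2020 has kept up that momentum.
Choose 匠人学院's learning path, a skills map aligned with industry needs, and professional, timely learning support to get a step ahead, become sought-after talent, and seize the initiative!
A clear knowledge framework and learning path
Take data analysis as an example. If you want to become a data analyst, you can look at job sites to see what the corresponding positions require, which will generally give you an initial picture of the knowledge you should master. Looking at data analyst positions, employers' skill requirements can be summarised as follows:
• Basic SQL database operations and basic data management
• Using Excel/SQL for basic data extraction, analysis, and presentation
• Using a scripting language for data analysis, Python or R
• The ability to acquire external data is a plus, e.g. web scraping or familiarity with public datasets
• Basic data visualization skills and the ability to write data reports
• Familiarity with common data mining algorithms: regression analysis, decision trees, classification, clustering
• The ability to present analysis results
This course tackles these job requirements one by one to help students land the offer they want.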
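Data analysis job requirements
Work with business-line colleagues to build a business monitoring metrics system and turn it into a product; be proficient in Python, Excel, SQL, and so on; provide business teams with ad hoc analysis, data analysis and mining, models, and algorithms; proactively seek opportunities, obtain resources, and deliver; distil data product requirements, propose data product solutions, and drive them to delivery; have good communication and presentation skills. Understand data acquisition, data storage and extraction, data preprocessing, data modelling and analysis, and data visualization.
Data analysts who need to acquire external data
Python fundamentals, Python web scraping, SQL, Python scientific computing packages (pandas, numpy, etc.), statistics fundamentals, regression analysis methods, basic data mining algorithms (classification, clustering), model optimisation (feature extraction)
Analysts who do not need to acquire external data
SQL, Python fundamentals, Python scientific computing packages (pandas, numpy, etc.), statistics fundamentals, regression analysis methods, basic data mining algorithms (classification, clustering), model optimisation (feature extraction)
Top Tsinghua graduate with fifteen years of research and work experience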
Performance and Insight Specialist
Senior Analyst
Data scientist chapter lead
Senior Customer Insights Analyst