A Look Back at My DS Job Searches, with a Summary of Resources

ryanli
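From job hunting as a new grad (NG) in late 2018, to my first jump to a tech company at the end of 2019, to switching again in early 2022 for career growth: in three and a half years I went through three job searches. I got a lot of help from this forum along the way, so I want to summarize what I learned and give something back. I hope it helps some of you. (These are only my personal takeaways; if anything is off, corrections and discussion are welcome ^-^)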

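First, the sub-fields within DS. When I was job hunting in 2018 there was no clear division between sub-fields, and interviews were all over the place. Over the past two years the division has become much clearer, and different sub-fields are interviewed on different content. This post explains it clearly, so I won't repeat it here: instant.1point3acres.cn
From an interview perspective:
  • Analytics / Product DS: SQL + Business Case (A/B testing, metrics) + Basic Stats (probability, hypothesis testing) + Basic ML (rarely tested, and only basic questions when it is) + BQ (behavioral questions)
  • ML (Algorithm, Core) DS: Coding (the emphasis varies by company and role, but falls into roughly two types: 1. data manipulation + modeling; 2. algorithms, LeetCode easy + medium) + SQL (some companies test it) + ML (tested in detail, e.g. deriving formulas, comparing models) + Stats (probability, data distributions) + Case Study + BQ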

SQL
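I spent relatively little time on SQL and didn't follow a systematic course: I memorized the basic templates, looked up functions on w3schools.com, and practiced on HackerRank and LeetCode. SQL in interviews is usually not hard, about 2-4 questions (though I did once get 6 questions in half an hour, and that half hour included the self-introduction and Q&A >﹏<), and the questions build on each other. If you hit a really hard one, don't panic: confirm the question with the interviewer to make sure you understand it correctly, ask for a minute or two to organize your thoughts, and keep talking with the interviewer while you write.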
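As an illustration of the kind of "basic template" worth memorizing, here is a minimal sketch of a JOIN + GROUP BY + HAVING query, run through Python's built-in sqlite3 module so it is self-contained. The tables (`users`, `orders`), their columns, the helper names, and the data are all made up for illustration and do not come from any actual interview.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, country TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
        INSERT INTO users VALUES (1, 'US'), (2, 'US'), (3, 'CA');
        INSERT INTO orders VALUES (1, 1, 30.0), (2, 1, 25.0), (3, 2, 10.0), (4, 3, 60.0);
    """)
    return conn


def top_spenders(conn, min_total):
    """Return (country, user_id, total) for users whose order total is >= min_total."""
    query = """
        SELECT u.country, u.user_id, SUM(o.amount) AS total
        FROM users AS u
        JOIN orders AS o ON o.user_id = u.user_id   -- join the fact table to the dimension
        GROUP BY u.country, u.user_id               -- one row per user
        HAVING SUM(o.amount) >= ?                   -- filter on the aggregate
        ORDER BY total DESC, u.user_id
    """
    return conn.execute(query, (min_total,)).fetchall()


if __name__ == "__main__":
    for row in top_spenders(build_demo_db(), 50):
        print(row)
```

Follow-up questions in an interview often keep the same skeleton and change only the aggregate, the grouping columns, or the HAVING condition, which is why having the template in your head saves time.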

Business Case
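Get to know the company's products (goals & metrics). The interview reports on this forum help a lot, especially those from the last six months: they give you a rough idea of which types of questions come up and which metrics are commonly used. You can mock with a friend or practice on your own. I recommend the product videos on youtube.com, and if you have time, go through A Collection of Data Science Take-Home Challenges (I don't know whether there is a newer edition; mine is a 2018 PDF. Leave your email if you need it).
Business case interviews put a lot of weight on the logic of your answer. The general flow is as follows: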

Diagnose a problem
1. Clarify the scenario/metric
2. Time: sudden or gradual change?
- Internal: data source? Data collection? Bug?
- External: seasonality? Industry trend? Competitors? Special event?
3. Other products/features by the same company?
4. Segment by user demographic and behavior features
5. Decompose the metric
6. Summarize overall approach (key points)
For example, if a metric (KPI) has changed from its previous level, find the reason:
1) Data issue: incident (bugs)/missing values/outliers (anomaly detection)
2) Is the change statistically significant? Long-term/short-term, compare to historical data, seasonality (holidays, events)
3) Segment analysis: country, user type, age, gender, device, features, day & time…
4) Funnel analysis: AARRR (acquisition, activation, retention, referral, revenue)
5) Impact on other related metrics
6) Facebook side: policy changes, feature changes (UI, new feature launched)
7) Competition: similar products in the market
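The segment-analysis step (item 3 above) can be sketched in a few lines: compute how much each segment contributed to the overall change in a metric between two periods. This is only a sketch under assumptions: the function name `segment_contributions` and the sample numbers are hypothetical, and it assumes an additive metric (such as DAU counts) so that segment changes sum to the total change.

```python
def segment_contributions(before, after):
    """Rank segments by how much they contributed to the change of an additive metric.

    before/after map segment name -> metric value (e.g. DAU) in each period.
    Returns (segment, delta, share_of_total_change) tuples, largest |delta| first.
    """
    segments = sorted(set(before) | set(after))
    # A segment missing from one period is treated as having value 0 there.
    deltas = {s: after.get(s, 0) - before.get(s, 0) for s in segments}
    total = sum(deltas.values())
    rows = []
    for s in segments:
        share = deltas[s] / total if total != 0 else 0.0
        rows.append((s, deltas[s], share))
    return sorted(rows, key=lambda r: abs(r[1]), reverse=True)


if __name__ == "__main__":
    # Made-up numbers: overall DAU fell, and almost all of the drop is on Android.
    dau_before = {"iOS": 1000, "Android": 1200, "Web": 300}
    dau_after = {"iOS": 990, "Android": 900, "Web": 310}
    for seg, delta, share in segment_contributions(dau_before, dau_after):
        print(f"{seg:8s} delta={delta:+5d} share={share:.0%}")
```

If one segment explains nearly all of the change, as Android does in the made-up data, that points toward a platform-specific cause (a release, a bug) rather than seasonality or competition, which would usually show up across segments.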

Measure success
1. Clarify the function and goal of the product/feature
2. Define metrics (no more than 3: 2 success metrics + 1 guardrail metric)
For example, how to improve a product:
1. Clarifying the questions:
a. be clear about the goal (engagement, retention, or revenue)
b. Narrow down to a specific product or feature (if there are many features, which one to focus on)
c. Clarify how the feature works (enabled only for certain users? Do users have to log in?)
2. Identify product improvement opportunities
a. Analyze user journey map (if you are the user, what you want)
b. Segment users into groups (active users, personality, modeling user behaviors)
3. Prioritize ideas
a. Quantitative analysis: proportion of users impacted by each idea
b. Select the most cost-effective idea (less effort to drive more impact)
4. Measure success
a. Define 1-2 success metrics
b. Design the experiments (A/B testing)
5. Summary
a. Goal
b. Ideas
c. Prioritize ideas
d. Metric to use
e. Design experiment
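For the "design the experiments" step, interviewers often ask how many users the A/B test needs and how long it should run. Below is a minimal sketch using the common normal-approximation formula for comparing two proportions. The function names, the default `alpha` and `power`, and the traffic figure are assumptions for illustration, not something prescribed by any particular company.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_base, mde, alpha=0.05, power=0.8):
    """Approximate users needed per arm for a two-sided two-proportion z-test.

    p_base: baseline conversion rate.
    mde: minimum detectable effect, as an absolute lift (0.02 means +2 points).
    """
    p_new = p_base + mde
    p_bar = (p_base + p_new) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_base * (1 - p_base) + p_new * (1 - p_new))
    ) ** 2
    return math.ceil(numerator / mde ** 2)


def days_needed(n_per_group, daily_eligible_users):
    """Days to fill both arms if all eligible traffic is split 50/50."""
    return math.ceil(2 * n_per_group / daily_eligible_users)


if __name__ == "__main__":
    n = sample_size_per_group(p_base=0.10, mde=0.02)
    print("users per group:", n)
    print("days at 1,000 eligible users/day:", days_needed(n, 1000))
```

Halving the detectable effect roughly quadruples the required sample, which is a useful trade-off to mention. In practice the duration is also often rounded up to whole weeks so that weekday and weekend behavior are both covered.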

Launch or not
1. Clarify goal and define metrics
2. Experimentation (sample size, duration, …)
3. Recommendation based on experiment results
Conflicting results: Short-term vs. Long-term
For example, launch a new feature:
1) Finding opportunities and/or problems from the case given, clarification
2) Define an objective/goal (with clarification from interviewers): short-term vs. long-term
3) Define metrics (2 success metrics & 1 guardrail metric): DAU, time spent, interactions (# likes and # comments), activities (# posts)
4) Sometimes, to solve the problem, we need to label the data by building a machine learning model. Talk about what model you would build and what features you would use.
5) Once you build the model, talk about the potential solutions for the opportunity/problem defined.
6) Using the A/B test to evaluate the solution
7) Make recommendations based on the a/b test.
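To make the last two steps concrete, here is a minimal sketch of evaluating an A/B test on a conversion-style success metric with a pooled two-proportion z-test, followed by a deliberately simple recommendation rule. The function names, the 0.05 threshold, and the counts are hypothetical. A real launch decision would also weigh the guardrail metric and the short-term vs. long-term effects mentioned above.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance).

    Returns (lift, z, p_value), where lift = rate_b - rate_a.
    Assumes the pooled rate is strictly between 0 and 1.
    """
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_b - rate_a, z, p_value


def recommend(lift, p_value, alpha=0.05):
    """Toy decision rule on a single success metric (an assumption, not a standard)."""
    if p_value >= alpha:
        return "inconclusive"
    return "launch" if lift > 0 else "do not launch"


if __name__ == "__main__":
    # Made-up counts: control converts 500/5000, treatment converts 600/5000.
    lift, z, p = two_proportion_ztest(500, 5000, 600, 5000)
    print(f"lift={lift:.3f} z={z:.2f} p={p:.4f} -> {recommend(lift, p)}")
```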

Stats
Intro course: Harvard Statistics 110 (if you're short on time or already have the basics, you can just review with the notes at mxawng.com)
Advanced book: Statistical Inference (mybiostats.files.wordpress.com)

ML
Classic course: Andrew Ng's course on Coursera (coursera.org)
Recommended book: The Elements of Statistical Learning (I also have some summary notes aimed at interviews; leave your email if you need them)


BQ
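Every company I interviewed with asked BQ, so prepare properly and don't neglect it. I'll just act as a courier here and share two summaries I used at the time: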
instant.1point3acres.cn
instant.1point3acres.cn


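Finally, a few words on my three job searches~~
Finding the first job is always the hardest. Back then I cast a wide net and sent out 300+ resumes with very few replies. On top of that I had no internship and was effectively switching fields, so it took nearly half a year to get my first offer. I had a misconception at the time: I was timid and didn't dare apply to big companies. In reality, big companies have more openings and more standardized interviews, so as long as you get past the resume screen, the interviews can be easier than at some small companies. I've heard that recruiters search by keyword, so they only notice your resume if it contains the right keywords. I suggest preparing one comprehensive resume and fine-tuning it against the keywords in each job description (adding or cutting tech skills and projects).
Moving from non-tech to tech, or from a small company to a big one, the scale of the data and the type of product both change, so pay more attention to business sense, and to negotiation skills once you have an offer. That is where I lost out on my first job change: I got lowballed.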

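DS has too many directions; you can easily be an expert in one and clueless in another. Think carefully about which direction interests you and which area you want to grow in. Because I resisted grinding coding problems, I tried interviewing for product & analytics roles. I did get an offer, but after learning more about the day-to-day work I realized I didn't like it and had to pass. There are real barriers between sub-fields: changing direction can mean starting over, which doesn't help long-term career development.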


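DS isn't easy. I hope everyone lands soon and gets the offer they want~~
(P.S. This isn't my own account, so replies may be slow. Thanks for understanding.)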
