  • #Job Info
  • #Job Search
  • #Machine Learning
  • #Guide

[New member asking for rice] Machine Learning Knowledge Checklist Before Interviews

MatthewGJFB
Books:

1. Dive into Deep Learning (动手学深度学习)

2. Statistical Learning Methods (统计学习方法, Li Hang)

3. Deep Learning for Recommender Systems (深度推荐学习)

4. Machine Learning (西瓜书, Zhou Zhihua)

Papers:

1. Attention Is All You Need

2. Positional Encoding

3. Tokenizer

4. BERT

5. RoBERTa

6. InfoXLM

7. Transformer-XL

8. XLNet

9. XLM

10. XLM-RoBERTa

11. ALBERT

12. DSSM

13. Adapter

14. Knowledge Distillation

15. TinyBERT

16. MiniLM

Algorithm:

1. Logistic Regression

2. K-means

3. SVM

4. KNN

5. Naive Bayes

6. Decision Tree

7. Conditional Random Field (CRF)

8. GBDT

9. XGBoost

10. LightGBM

11. Ensemble Learning

12. Optimization algorithms

13. Activation functions

14. Model initialization

15. Model evaluation (see the metrics sketch after this list)

a. F1, Macro-F1, Micro-F1

b. Precision, Recall

c. AUC

d. NDCG

16. Loss functions

17. LSTM

18. GRU

19. Word2vec

20. FastText

21. TextCNN

22. Seq2seq

23. TF-IDF

24. BM25

25. LambdaMART

26. Reinforcement Learning

27. SVD and PCA

28. KD-tree, ANN, locality-sensitive hashing (LSH)

29. FGM
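For item 15 (model evaluation), here is a minimal scikit-learn sketch; the labels, scores, and relevance values below are made-up toy data, just to show how each metric is computed:

```python
# A toy run of the metrics in item 15 using scikit-learn.
# y_true / y_pred / y_score / relevance values are made up for illustration.
import numpy as np
from sklearn.metrics import (f1_score, precision_score, recall_score,
                             roc_auc_score, ndcg_score)

y_true  = np.array([1, 0, 1, 1, 0, 1])                # ground-truth labels
y_pred  = np.array([1, 0, 0, 1, 0, 1])                # hard predictions
y_score = np.array([0.9, 0.2, 0.4, 0.8, 0.3, 0.7])    # predicted probabilities

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F1:       ", f1_score(y_true, y_pred))
print("Macro-F1: ", f1_score(y_true, y_pred, average="macro"))  # unweighted mean over classes
print("Micro-F1: ", f1_score(y_true, y_pred, average="micro"))  # pooled TP/FP/FN over classes
print("AUC:      ", roc_auc_score(y_true, y_score))

# NDCG is computed per query over a ranked list: one row of relevance
# labels per query, plus the model scores used to rank the documents.
rel_true  = np.array([[3, 2, 0, 1]])
rel_score = np.array([[0.6, 0.9, 0.1, 0.4]])
print("NDCG:     ", ndcg_score(rel_true, rel_score))
```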

Technique:

1. Overfitting

a. Dropout

b. Layer Normalization

c. Batch Normalization

d. More training data / data augmentation

e. ResNet (residual connections)

f. LSTM

g. Weight decay

h. Parameter initialization

i. Early Stopping

j. Gradient clipping

2. Underfitting

a. Increase model capacity (more parameters)

b. Reduce learning rate

c. Reduce the L2 regularization strength

3. Model compression

4. L1 vs. L2 regularization

5. Bias and variance

6. Imbalanced data

a. Focal Loss

7. Knowledge distillation

8. Distributed training

a. Single-node, multi-GPU

b. Multi-node, multi-GPU

9. Entropy, cross-entropy, and relative entropy (KL divergence)

QA:

1. How does TextCNN work?

2. In scaled dot-product attention, why does the Transformer divide QK^T by sqrt(d_k)? For large d_k the dot products grow large in magnitude and push the softmax into regions with extremely small gradients; scaling by 1/sqrt(d_k) leads to more stable gradients (see the attention sketch after this list).

3. What is the Transformer's positional encoding for? For the model to make use of the order of the sequence, we must inject some information about the relative or absolute position of the tokens in the sequence (see the positional-encoding sketch after this list).

4. Why multi-head attention? Multi-head attention allows the model to jointly attend to information from different representation subspaces at different positions (also illustrated in the attention sketch below).

5. The core idea of self-attention

6. The pre-training objectives of multilingual models, and the core idea behind multilingual models
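For questions 2 and 4, a minimal NumPy sketch of scaled dot-product attention wrapped in a multi-head split; the shapes, random weights, and helper names are toy choices for illustration, not a reference implementation:

```python
# Minimal NumPy sketch of scaled dot-product attention plus a multi-head split.
# Shapes, weights, and helper names are toy choices, not a reference implementation.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    # QA #2: divide by sqrt(d_k). Without scaling, the dot products grow with d_k
    # and push the softmax into saturated regions with very small gradients.
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    return softmax(scores, axis=-1) @ V

def multi_head_attention(X, num_heads, Wq, Wk, Wv, Wo):
    # QA #4: project once, then split the feature dimension into num_heads
    # subspaces so each head can attend to different positions/representations.
    seq_len, d_model = X.shape
    d_head = d_model // num_heads

    def split(M):  # (seq, d_model) -> (heads, seq, d_head)
        return M.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    Q, K, V = split(X @ Wq), split(X @ Wk), split(X @ Wv)
    heads = scaled_dot_product_attention(Q, K, V)              # (heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

# Toy usage
rng = np.random.default_rng(0)
seq_len, d_model, num_heads = 4, 8, 2
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) for _ in range(4))
print(multi_head_attention(X, num_heads, Wq, Wk, Wv, Wo).shape)  # (4, 8)
```

Dropping the 1/sqrt(d_k) factor and increasing d_model is an easy way to watch the attention weights collapse toward one-hot vectors, which is exactly the saturation problem the scaling avoids.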
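For question 3, a sketch of the sinusoidal positional encoding described in Attention Is All You Need (the seq_len and d_model values below are arbitrary):

```python
# Sinusoidal positional encoding from "Attention Is All You Need":
#   PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
#   PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
import numpy as np

def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]              # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]          # (1, d_model / 2)
    angles = pos / np.power(10000.0, i / d_model)  # (seq_len, d_model / 2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                   # even dimensions
    pe[:, 1::2] = np.cos(angles)                   # odd dimensions
    return pe

# The encoding is simply added to the token embeddings, giving the otherwise
# order-invariant self-attention access to absolute (and, via trig identities,
# relative) position information.
print(positional_encoding(seq_len=10, d_model=16).shape)  # (10, 16)
```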

Questions to ask the interviewer:

1. The team's business goals and its biggest challenges

2. The engineering/technical culture

3. Growth prospects for the role, promotion targets, and career development directions

4. Career planning

5. Work pace

6. Competitive advantages