Some forum users have compiled common questions on machine-learning theory; see the links:

  1. https://www.1point3acres.com/bbs/thread-713903-1-1.html
  2. https://www.1point3acres.com/bbs/thread-714090-1-1.html
  3. https://www.1point3acres.com/bbs/thread-714558-1-1.html

Below is an attempt at answers; criticism and corrections are welcome. This is the third of three parts, each part corresponding to one of the three links above.

Write code implementing a two-layer fully connected network
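
A minimal numpy sketch of one possible answer; the layer sizes, ReLU activation, and function interface here are my own choices, not part of the question:

```python
import numpy as np

def init(n_in, n_hidden, n_out, seed=0):
    """Randomly initialize the weights of a two-layer fully connected net."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0, 0.1, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.1, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, X):
    """Return the hidden activations and the output logits."""
    h = np.maximum(0, X @ params["W1"] + params["b1"])  # ReLU hidden layer
    return h, h @ params["W2"] + params["b2"]

def backward(params, X, h, dout):
    """Backpropagate dout (gradient w.r.t. the output) through both layers."""
    grads = {"W2": h.T @ dout, "b2": dout.sum(0)}
    dh = (dout @ params["W2"].T) * (h > 0)              # ReLU gradient mask
    grads["W1"] = X.T @ dh
    grads["b1"] = dh.sum(0)
    return grads
```

In an interview, pairing this with a softmax cross-entropy loss on top of the logits completes a trainable classifier.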

Implement a CNN by hand
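
The core of this question is usually the convolution itself. A naive single-channel sketch (the "valid", stride-1, cross-correlation convention used by most DL libraries; all interface choices are assumptions):

```python
import numpy as np

def conv2d(x, k):
    """Naive 'valid' 2D convolution (cross-correlation) of image x with kernel k."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # dot product of the kernel with the window at (i, j)
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out
```

A full CNN then stacks such layers (with multiple channels) with nonlinearities and pooling.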

Implement KNN by hand
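
A short sketch of a brute-force KNN classifier (Euclidean distance and majority vote; the interface is assumed):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Classify each test point by majority vote among its k nearest neighbors."""
    preds = []
    for x in X_test:
        d = np.linalg.norm(X_train - x, axis=1)  # Euclidean distances
        nearest = np.argsort(d)[:k]              # indices of the k nearest points
        votes = np.bincount(y_train[nearest])
        preds.append(np.argmax(votes))           # majority vote
    return np.array(preds)
```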

Implement k-means by hand
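
A sketch of Lloyd's algorithm, initialized by sampling k data points as centers (initialization and stopping rule are my own choices):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Alternate assignment and mean-update steps until centers stop moving."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # assign each point to its nearest center
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # move each center to the mean of its assigned points
        new = np.array([X[labels == j].mean(0) if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels
```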

Implement the backpropagation of softmax by hand
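
The key fact to state is that, combined with cross-entropy loss, the gradient of softmax w.r.t. the logits simplifies to `p - onehot(y)`. A sketch (batch interface assumed):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def softmax_ce_backward(z, y):
    """Gradient of the mean cross-entropy loss w.r.t. the logits z.
    Combined with softmax, the gradient simplifies to (p - onehot(y)) / N."""
    p = softmax(z)
    N = z.shape[0]
    p[np.arange(N), y] -= 1.0
    return p / N
```

A quick sanity check: each row of the gradient sums to zero, and the entry at the true class is negative.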

Given the structure of an LSTM, compute how many parameters it has
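
The standard counting argument: an LSTM has four gates (input, forget, cell candidate, output), and each gate has an input weight matrix, a recurrent weight matrix, and a bias. Assuming a single layer with input size `d_in` and hidden size `d_h`:

```python
def lstm_params(d_in, d_h):
    # Four gates, each with W (d_in x d_h), U (d_h x d_h), and a bias (d_h)
    return 4 * (d_in * d_h + d_h * d_h + d_h)
```

For example, `d_in = 10`, `d_h = 20` gives `4 * (200 + 400 + 20) = 2480` parameters.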

How do you compute the output size of a convolution layer? Give the formula.
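
The standard formula per spatial dimension, with input size W, kernel size K, padding P, and stride S, is floor((W - K + 2P) / S) + 1. As code:

```python
def conv_out_size(w, k, stride=1, pad=0):
    # floor((W - K + 2P) / S) + 1
    return (w - k + 2 * pad) // stride + 1
```

E.g. a 3x3 kernel with stride 1 and padding 1 preserves the input size ("same" convolution).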

A trained model does not work in the real world; what are the possible causes?

Possible causes of the loss going to inf or NaN

The data has shifted between development and production; how do you detect and remedy this?

How do you train a model when annotations are limited?

Suppose a model is about to go into production, but you discover that one important feature is missing online and the model cannot be retrained. What do you do?

What are the LSTM equations?
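
A sketch of the standard formulation, writing W for input weights, U for recurrent weights, and ⊙ for elementwise product:

```latex
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t = o_t \odot \tanh(c_t)
```

The additive update of the cell state c_t is what mitigates vanishing gradients relative to a vanilla RNN.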

Why use RNN/LSTM

What advantages does an LSTM have over a vanilla RNN?

Limitations of RNN

How to solve gradient vanishing in RNN?

What is attention, why attention?

The principle behind language models; the N-gram model.
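
As a concrete illustration, a maximum-likelihood bigram model estimates P(w2 | w1) = count(w1, w2) / count(w1). A minimal sketch (unsmoothed; the corpus format is assumed):

```python
from collections import Counter, defaultdict

def bigram_probs(corpus):
    """MLE bigram model: P(w2 | w1) = count(w1, w2) / count(w1)."""
    counts = defaultdict(Counter)
    for sent in corpus:
        for w1, w2 in zip(sent, sent[1:]):
            counts[w1][w2] += 1
    return {w1: {w2: c / sum(nxt.values()) for w2, c in nxt.items()}
            for w1, nxt in counts.items()}
```

Real N-gram models add smoothing (e.g. Kneser-Ney) to handle unseen N-grams.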

What is CBOW and skip-gram?

What is word2vec, what is its loss function, and what is negative sampling?
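
For the loss, the commonly cited skip-gram-with-negative-sampling objective for a center word c, an observed context word o, and k negatives drawn from a noise distribution P_n(w) is (in the usual notation, with u for context vectors and v for center vectors):

```latex
L = -\log \sigma(u_o^\top v_c)
    - \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)} \left[ \log \sigma(-u_{w_i}^\top v_c) \right]
```

Negative sampling replaces the full softmax over the vocabulary with k binary classifications, making training tractable.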

What are max pooling and convolution layers? Why do pooling, and why use convolution layers? What does it mean to be equivariant to translation vs. invariant to translation?
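
For the pooling part, a minimal sketch of 2D max pooling (single channel, non-overlapping windows by default; interface assumed):

```python
import numpy as np

def maxpool2d(x, size=2, stride=2):
    """Take the max over each size x size window, moving by stride."""
    H, W = x.shape
    oh = (H - size) // stride + 1
    ow = (W - size) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out
```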

1x1 filter

What is a skip connection?

What is BERT? Explain the model structure.

What is the Transformer model? Explain the model structure.

What advantages do Transformer/BERT have over LSTM?

Difference between self-attention and traditional attention mechanism.

Wide & Deep

DeepMask, UNet, etc.