Count 朱: Why Is There Only One Final Hypothesis?

Wednesday, July 26, 2017

While working through the Machine Learning Foundations online course, I hit a passage that left me thoroughly puzzled.

The following passage is excerpted from The Learning Problem - Component of Machine Learning:

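A bank that evaluates an applicant's profile to decide whether to issue a credit card clearly combines many hypotheses. Yet the passage seems to say that g is just the single best pick among h1, h2, and h3? In practice, the best hypothesis is often a combination of these basic hypotheses. For example: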

h4 = h1 and h2 and h3

Once I had written my question down, the answer presented itself.

Namely, the example given here simply leaves out the hypotheses formed by combining the others with logical operators, such as:

h4 = h1 and h2 and h3

h5 = h1 or h2 or h3

h6 = h1 and (h2 or h3)
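To make the point concrete, here is a minimal sketch in which each combined rule is itself just one more element of H. The three base rules, their thresholds, and the applicant fields (`annual_income`, `debt`, `years_in_job`) are hypothetical, chosen only for illustration and not taken from the course.

```python
from typing import Callable, Dict

Applicant = Dict[str, float]
Hypothesis = Callable[[Applicant], bool]

# Hypothetical base rules; the thresholds are assumptions for illustration.
def h1(x: Applicant) -> bool:
    return x["annual_income"] > 800_000

def h2(x: Applicant) -> bool:
    return x["debt"] < 100_000

def h3(x: Applicant) -> bool:
    return x["years_in_job"] > 2

# Combined rules: each one is still a single hypothesis mapping x to yes/no.
def h4(x: Applicant) -> bool:
    return h1(x) and h2(x) and h3(x)

def h5(x: Applicant) -> bool:
    return h1(x) or h2(x) or h3(x)

def h6(x: Applicant) -> bool:
    return h1(x) and (h2(x) or h3(x))

# The hypothesis set simply lists all candidates side by side.
H = [h1, h2, h3, h4, h5, h6]

applicant = {"annual_income": 900_000, "debt": 150_000, "years_in_job": 1}
decisions = {h.__name__: h(applicant) for h in H}
print(decisions)
```

Choosing g then still means choosing exactly one entry of `H`; the combinations only make the list longer.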

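So, judging from practical experience, the best hypothesis is quite likely to be one of these combined hypotheses, selected as g. Of course, this raises another question: how large is such a hypothesis set H?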

|H| = ?

Later in the course, a formula is introduced.
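Assuming the formula meant here is the course's union bound built from Hoeffding's inequality for a finite hypothesis set, it can be written as follows. The symbols N (sample size) and \epsilon (tolerance between in-sample and out-of-sample error) are my assumptions about the notation.

```latex
\mathbb{P}_{\mathcal{D}}\left[\text{BAD } \mathcal{D}\right] \le 2 M \exp\left(-2 \epsilon^{2} N\right)
```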

Here,

M = |H|

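which means that if H is too large, the probability of drawing a Bad Sample goes up. This kind of probability question is the same kind discussed in the previous post: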
Coin Game Example in Machine Learning

I will leave the explanation of this formula for a later post.

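In short, my point is this: the hypotheses that can be generated from h1, h2, and h3 through logical operations, once duplicates are removed, should be finite in number, and not very many.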
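This count can be checked by brute force. The sketch below assumes a hypothesis is fully described by its truth table over the 8 possible joint outputs of h1, h2, and h3, so two formulas with the same table count as duplicates. The helper names `table` and `closure` are illustrative.

```python
from itertools import product

# All 8 possible joint outputs (h1, h2, h3) on a single input.
ROWS = list(product([False, True], repeat=3))

def table(f):
    """Truth table of f over ROWS, used as a canonical key for deduplication."""
    return tuple(f(*row) for row in ROWS)

def closure(generators, use_not=False):
    """Combine truth tables with and/or (optionally not) until nothing new appears."""
    seen = set(generators)
    while True:
        new = set()
        for a in seen:
            if use_not:
                new.add(tuple(not x for x in a))
            for b in seen:
                new.add(tuple(x and y for x, y in zip(a, b)))
                new.add(tuple(x or y for x, y in zip(a, b)))
        if new <= seen:
            return seen
        seen |= new

base = [
    table(lambda a, b, c: a),  # h1
    table(lambda a, b, c: b),  # h2
    table(lambda a, b, c: c),  # h3
]

and_or = closure(base)
with_not = closure(base, use_not=True)

print(len(and_or))    # 18 distinct hypotheses using only and/or
print(len(with_not))  # 256 = 2**(2**3) once negation is allowed
```

Under this assumption, H has at most 18 members when only and/or are used, and at most 256 when negation is also allowed. If h1, h2, and h3 cannot produce all 8 joint outputs on real applicants, the number of distinct hypotheses is smaller still.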

-Count

