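A bin holds 100 balls: 40% are red and 60% are green. If we draw 10 balls from the bin, what is the probability of getting at most 1 red ball?
This is a Sampling with Replacement problem. In Peak Balls from a Bin, we worked it out with a simple probability formula: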
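P [# = 0] = 0.6^10 = 0.006
P [# = 1] = 0.6^9 * 0.4 * 10 = 0.040
P [v <= 0.1] = P [# = 0] + P [# = 1] = 0.046…
Here # is the number of red balls among the 10 drawn and v is the fraction of red balls in the sample, so v <= 0.1 means at most 1 red ball.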
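The formula above is the binomial probability of at most 1 red ball in 10 independent draws. As a quick check, here is a minimal sketch that evaluates it exactly. It assumes Python 3.8 or later (for math.comb), and the function name prob_at_most_k_red is only illustrative, not part of the original script:

```python
from math import comb

def prob_at_most_k_red(n_draws, p_red, k_max):
    """Binomial probability of at most k_max red balls in n_draws
    independent draws, each red with probability p_red."""
    return sum(
        comb(n_draws, k) * p_red**k * (1 - p_red)**(n_draws - k)
        for k in range(k_max + 1)
    )

p0 = prob_at_most_k_red(10, 0.4, 0)    # P [# = 0] = 0.6^10
p01 = prob_at_most_k_red(10, 0.4, 1)   # P [# = 0] + P [# = 1]
print("P [# = 0]  = %.4f" % p0)        # prints 0.0060
print("P [# <= 1] = %.4f" % p01)       # prints 0.0464
```

Raising k_max gives the probability of at most 2, 3, ... red balls with the same function.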
In this post, I use Python to simulate this Sampling with Replacement for several population sizes: 100, 1000, 10000, and 100000.
The code is as follows:
[SamplingWithReplacement.py]
#
# Use random numbers to emulate the Sampling with Replacement.
#
# We can specify the number of balls in the bin with 40% red balls and 60% green balls.
# We select 10 balls from the bin and calculate the probability of the red
# balls in the samples.
#
import random

def sampleingWithReplacement (total, replaceTimes):
    sampleSize = 10    # number of balls drawn in each trial

    # Build the bin: 1 = red ball, 0 = green ball.
    balls = []
    red = int (total * 40 / 100)
    green = total - red
    for i in range (red):
        balls.append (1)
    for i in range (green):
        balls.append (0)
    #print ("Display balls in the bin.")
    #print (balls)

    # numberOfRed counts the trials whose sample holds at most 1 red ball.
    numberOfRed = 0
    for i in range (replaceTimes):
        # Shuffle the bin and take the first 10 balls. Within one trial the
        # 10 balls are all distinct, so this only approximates drawing with
        # replacement; the approximation improves as total grows.
        random.shuffle (balls)
        sample = balls [0: sampleSize]
        s = sum (sample)
        if s == 0 or s == 1:
            numberOfRed += 1
    p = numberOfRed / replaceTimes

    print ("There are %d balls in the bin. (%d red balls, %d green balls)" % (total, red, green))
    # The original call had no format argument, so it printed a literal "%d".
    print ("Select %d balls from the bin with replacement." % sampleSize)
    print ("Times of replacement = %d" % replaceTimes)
    print ("Number of red balls in the sample = %d" % numberOfRed)
    print ("Probability = %f" % p)
    print ("")

if __name__ == '__main__':
    replaceTimes = 10000
    sampleingWithReplacement (100, replaceTimes)     # total = 100, probability = 0.034800
    sampleingWithReplacement (1000, replaceTimes)    # total = 1000, probability = 0.046900
    sampleingWithReplacement (10000, replaceTimes)   # total = 10000, probability = 0.047300
    sampleingWithReplacement (100000, replaceTimes)  # total = 100000, probability = 0.043800
[Result]
The output is as follows:
There are 100 balls in the bin. (40 red balls, 60 green balls)
Select 10 balls from the bin with replacement.
Times of replacement = 10000
Number of red balls in the sample = 348
Probability = 0.034800

There are 1000 balls in the bin. (400 red balls, 600 green balls)
Select 10 balls from the bin with replacement.
Times of replacement = 10000
Number of red balls in the sample = 469
Probability = 0.046900

There are 10000 balls in the bin. (4000 red balls, 6000 green balls)
Select 10 balls from the bin with replacement.
Times of replacement = 10000
Number of red balls in the sample = 473
Probability = 0.047300

There are 100000 balls in the bin. (40000 red balls, 60000 green balls)
Select 10 balls from the bin with replacement.
Times of replacement = 10000
Number of red balls in the sample = 438
Probability = 0.043800
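We find that as the population grows, the probability settles near a limit of about 0.046, which agrees with the 0.046… obtained from the formula. The run with 100 balls comes out lower (0.0348) because shuffling the bin and taking the first 10 balls never picks the same ball twice within a trial, and that departure from true replacement is noticeable only when the bin is small. The remaining spread between the larger runs (0.0438 to 0.0473) is what one would expect from the noise of 10000 trials.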
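One way to look at this trend without simulation noise is to compute the probability exactly for each population size. The sketch below assumes each trial draws 10 distinct balls (which is what shuffling the bin and slicing the first 10 does), so the number of red balls follows a hypergeometric distribution. The function names are illustrative and not part of the original script:

```python
from math import comb

def hypergeom_at_most_one_red(total, sample_size=10, red_percent=40):
    """Exact probability of at most 1 red ball when sample_size distinct
    balls are drawn from a bin of `total` balls (no ball drawn twice)."""
    red = total * red_percent // 100
    green = total - red
    all_samples = comb(total, sample_size)          # every possible sample
    zero_red = comb(green, sample_size)             # samples with 0 red balls
    one_red = red * comb(green, sample_size - 1)    # samples with exactly 1 red ball
    return (zero_red + one_red) / all_samples

def binomial_at_most_one_red(sample_size=10, p_red=0.4):
    """Probability of at most 1 red ball with true replacement
    (independent draws), as in the formula at the top of the post."""
    q = 1 - p_red
    return q**sample_size + sample_size * p_red * q**(sample_size - 1)

for total in (100, 1000, 10000, 100000):
    print("total = %6d  exact = %.4f" % (total, hypergeom_at_most_one_red(total)))
print("with replacement  = %.4f" % binomial_at_most_one_red())
```

For total = 100 the exact value is about 0.0385, below the with-replacement value of about 0.0464, and the gap narrows as total grows.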
-Count