Saturday, November 18, 2017

Use Python to Emulate Sampling with Replacement

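A bin holds 100 balls: 40% are red and 60% are green. If we draw 10 balls from the bin, what is the probability of getting at most 1 red ball?

This is a Sampling with Replacement problem. In Peak Balls from a Bin, we worked it out with a simple probability formula: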

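P [# = 0] = 0.6^10 ≈ 0.006
P [# = 1] = 0.6^9 * 0.4 * 10 ≈ 0.040
P [v <= 0.1] = P [# = 0] + P [# = 1] = 0.046…

Here # is the number of red balls among the 10 drawn, and v = # / 10 is the fraction of red balls in the sample, so v <= 0.1 means at most 1 red ball.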
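
As a quick check of the arithmetic above, here is a minimal sketch that evaluates the same binomial formula in Python. The helper name binomial_at_most is my own and is not part of the original program; math.comb requires Python 3.8 or later.

```python
from math import comb

def binomial_at_most(k_max, n, p):
    """Probability of at most k_max red balls in n independent draws,
    where each draw is red with probability p (sampling with replacement)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1))

# 10 draws, each red with probability 0.4
p_zero = binomial_at_most(0, 10, 0.4)
p_at_most_one = binomial_at_most(1, 10, 0.4)

print("P[# = 0]  = %.6f" % p_zero)
print("P[# = 1]  = %.6f" % (p_at_most_one - p_zero))
print("P[# <= 1] = %.6f" % p_at_most_one)
```

The last printed value is the number the simulation should approach.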

In this post, I use Python to simulate this Sampling with Replacement for different population sizes: 100, 1000, 10000, and 100000.

The code is as follows:
[SamplingWithReplacement.py]
#
# Use random numbers to emulate Sampling with Replacement.
#
# We can specify the number of balls in the bin with 40% red balls and 60% green balls.
# We select 10 balls from the bin and estimate the probability that the
# sample contains at most one red ball.
#
import random

def sampleingWithReplacement (total, replaceTimes):
    sampleSize = 10
    red = int (total * 40 / 100)
    green = total - red
    # 1 marks a red ball, 0 marks a green ball.
    balls = [1] * red + [0] * green
    #print ("Display balls in the bin.")
    #print (balls)
    # Despite its name, numberOfRed counts the samples that contain
    # at most one red ball.
    numberOfRed = 0
    for i in range (replaceTimes):
        # Shuffle the whole bin and take the first 10 balls as one sample.
        # Every ball is back in the bin before the next sample is drawn.
        random.shuffle (balls)
        sample = balls [0: sampleSize]
        s = sum (sample)
        if s == 0 or s == 1:
            numberOfRed += 1
    p = numberOfRed / replaceTimes
    print ("There are %d balls in the bin. (%d red balls, %d green balls)" % (total, red, green))
    print ("Select %d balls from the bin with replacement." % sampleSize)
    print ("Times of replacement = %d" % replaceTimes)
    print ("Number of red balls in the sample = %d" % numberOfRed)
    print ("Probability = %f" % p)
    print ("")

if __name__ == '__main__':
    replaceTimes = 10000
    sampleingWithReplacement (100, replaceTimes) # total = 100, probability = 0.034800
    sampleingWithReplacement (1000, replaceTimes) # total = 1000, probability = 0.046900
    sampleingWithReplacement (10000, replaceTimes) # total = 10000, probability = 0.047300
    sampleingWithReplacement (100000, replaceTimes) # total = 100000, probability = 0.043800



[Result]
The output is as follows:

There are 100 balls in the bin. (40 red balls, 60 green balls)
Select 10 balls from the bin with replacement.
Times of replacement               = 10000
Number of red balls in the sample  = 348
Probability                        = 0.034800

There are 1000 balls in the bin. (400 red balls, 600 green balls)
Select 10 balls from the bin with replacement.
Times of replacement               = 10000
Number of red balls in the sample  = 469
Probability                        = 0.046900

There are 10000 balls in the bin. (4000 red balls, 6000 green balls)
Select 10 balls from the bin with replacement.
Times of replacement               = 10000
Number of red balls in the sample  = 473
Probability                        = 0.047300

There are 100000 balls in the bin. (40000 red balls, 60000 green balls)
Select 10 balls from the bin with replacement.
Times of replacement               = 10000
Number of red balls in the sample  = 438
Probability                        = 0.043800
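
The simulated values above can be compared with exact numbers. Because the program takes the first 10 balls of a shuffled bin, the 10 balls inside one sample are drawn without replacement, so the exact probability for a finite bin follows the hypergeometric distribution rather than the binomial formula. The sketch below computes it for the same bin sizes; the function name hypergeom_at_most_one is my own and is not part of the original program.

```python
from math import comb

def hypergeom_at_most_one(total, sample_size=10, red_percent=40):
    """Exact P(at most one red ball) when sample_size balls are drawn
    from the bin without replacement (hypergeometric distribution)."""
    red = total * red_percent // 100
    green = total - red
    # Ways to pick 0 red balls, plus ways to pick exactly 1 red ball.
    ways = comb(green, sample_size) + red * comb(green, sample_size - 1)
    return ways / comb(total, sample_size)

for total in (100, 1000, 10000, 100000):
    print("total = %6d, exact probability = %.6f" % (total, hypergeom_at_most_one(total)))
```

For 100 balls the exact value comes out clearly below the binomial result, and it moves toward that result as the bin grows.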

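We can see that as the population grows, the Probability approaches a limit of about 0.046, which matches the value from the formula. With only 100 balls the estimate is noticeably lower (0.0348): the program takes the first 10 balls of a shuffled bin, so the balls within one sample are drawn without replacement, and this matters when the bin is small. The scatter among the larger bins (0.0438 to 0.0473) is consistent with the sampling error of 10000 trials, which is roughly ±0.002.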
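
For comparison, here is a minimal sketch of a variant in which every single ball is drawn with replacement, so the same ball can appear more than once in a sample. It is not the program used for the results above; the function name simulate_with_replacement and the seed parameter are my own additions. It relies on random.choices, which is available from Python 3.6.

```python
import random

def simulate_with_replacement(total, trials, sample_size=10, seed=None):
    """Estimate P(at most one red ball) when each ball is put back
    before the next ball is drawn."""
    rng = random.Random(seed)
    red = total * 40 // 100
    balls = [1] * red + [0] * (total - red)  # 1 = red, 0 = green
    hits = 0
    for _ in range(trials):
        # choices() picks with replacement: a ball may be chosen twice.
        sample = rng.choices(balls, k=sample_size)
        if sum(sample) <= 1:
            hits += 1
    return hits / trials

for total in (100, 1000, 10000, 100000):
    print("total = %6d, estimated probability = %f"
          % (total, simulate_with_replacement(total, 10000)))
```

With this variant the estimate should no longer depend on the bin size, apart from random fluctuation, because each draw is red with probability 0.4 regardless of how many balls the bin holds.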

-Count