### Tags: bias, example, excel, expectation, games, generator, istoo, microsoft, msdn, numbers, rand, random, software, statistical

# Bias in rand for excel 07

On Microsoft » Microsoft Excel

16,419 words with 9 Comments; publish: Fri, 30 May 2008 20:21:00 GMT; (30662.50, « »)

I'm testing random numbers for games using excel 07 generator, but there is

too much bias away from statistical expectation. For example, when two

integer numbers are generated between 1 and 6, and subsequently added, the

results differ from the easily predictable numbers. Sevens are consistently

in short supply by about 1.5%. That's a lot! Any suggestions for better

randomization in excel would be very much appreciated.

yttrias

*http://excel.questionfor.info/q_microsoft-excel_36149.html*

All Comments

Leave a comment...

- 9 Comments
- Here are my results from excel 2003
11652

21715

31689

41668

51661

61615

Sub countrand()

Dim numbers(6)

For i = 1 To 6

numbers(i) = 0

Next i

For i = 1 To 10000

myRand = Int(6 * Rnd()) + 1

numbers(myRand) = numbers(myRand) + 1

Next i

For i = 1 To 6

Range("A" & i) = i

Range("B" & i) = numbers(i)

Next i

End Sub

"yttrias" wrote:

> I'm testing random numbers for games using excel 07 generator, but there is

> too much bias away from statistical expectation. For example, when two

> integer numbers are generated between 1 and 6, and subsequently added, the

> results differ from the easily predictable numbers. Sevens are consistently

> in short supply by about 1.5%. That's a lot! Any suggestions for better

> randomization in excel would be very much appreciated.

> --

> yttrias

#1; Fri, 30 May 2008 20:22:00 GMT

- Here are my results from excel 2003
- On Dec 16, 2:22 pm, "Bernard Liengme" <blien....excel.questionfor.info.stfx.TRUENORTH.ca>
wrote:[vbcol=seagreen]

> And what was your sample size?

> --

> Bernard V Liengme

> Microsoft Excel MVPwww.stfx.ca/people/bliengme

> remove caps from email

> "yttrias" <yttr....excel.questionfor.info.discussions.microsoft.com> wrote in message

> news:09EA2A85-AE22-4CB1-9537-0EC3ACAF7DC1.excel.questionfor.info.microsoft.com...

The flaw in the MS random number generator has been known for some

time. A description is included in this MS info sheet:

http://support.microsoft.com/kb/829208

I use a simulation package that has it's own generator. But there are

probably others you can download via a web search.

You may want to run statistical tests on ones that you are considering

for validation.

SteveM

#2; Fri, 30 May 2008 20:23:00 GMT

- On Dec 16, 2:22 pm, "Bernard Liengme" <blien....excel.questionfor.info.stfx.TRUENORTH.ca>
- yttrias -
Which "excel 07 generator" are you using?

RAND() worksheet function, or

RND function in VBA, or

Excel 2007, Data ribbon, Data Analysis, Random Number Generation tool ?

- Mike Middleton

http://www.DecisionToolworks.com

Decision Analysis Add-ins for Excel

"yttrias" <yttrias.excel.questionfor.info.discussions.microsoft.com> wrote in message

news:09EA2A85-AE22-4CB1-9537-0EC3ACAF7DC1.excel.questionfor.info.microsoft.com...

> I'm testing random numbers for games using excel 07 generator, but there

> is

> too much bias away from statistical expectation. For example, when two

> integer numbers are generated between 1 and 6, and subsequently added, the

> results differ from the easily predictable numbers. Sevens are

> consistently

> in short supply by about 1.5%. That's a lot! Any suggestions for better

> randomization in excel would be very much appreciated.

> --

> yttrias

#3; Fri, 30 May 2008 20:24:00 GMT

- yttrias -
- I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000
consecutive values at 20 different locations throughout the range of 1M. The

average of those 20 windows in the 1M was 1587, where the expected average

should be 1667.

yttrias

"Bernard Liengme" wrote:

> And what was your sample size?

> --

> Bernard V Liengme

> Microsoft Excel MVP

> remove caps from email

> "yttrias" <yttrias.excel.questionfor.info.discussions.microsoft.com> wrote in message

> news:09EA2A85-AE22-4CB1-9537-0EC3ACAF7DC1.excel.questionfor.info.microsoft.com...

>

>

#4; Fri, 30 May 2008 20:25:00 GMT

- I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000
- I don't know about the OP, but Joel's posted code used the VBA Rnd()
function. Prior to 2007, there were 3 distinct random number generators, the

worksheet function RAND(), VBA Rnd(), and the ATP random number generator.

None were very good before 2003, but their problems would arise as high order

autocorrelations between "random" numbers not as the kind of problems the

Joel was testing for. It is unlikely to show up in the process that the OP

seems to describe. Therefore I would want to know how the OP generated

"integer numbers between 1 and 6" and what the OP was using as expected

values.

A decent, but not great algorithm was implemented for worksheet RAND()

(requires SP1 patch to work properly) and carried into 2007. I have seen no

evidence that VBA Rnd() has ever been upgraded. RANDBETWEEN() became a

worksheet function in 2007, and therefore probably switched from using ATP

random number generator to using the worksheet RAND() function, though I have

seen no confirmation of this.

Joel's simulation shows departures from expectation that are NOT

statistically significant. Generating 10000 sums of 2 values using worksheet

RAND() and expected values of (6-ABS(7-tot2))/36, I tried several reps in

2003 with no statistically significant departure from expectation. The

following VBA code does 10 reps of 10000 using VBA Rnd(), and only one rep

was statistically different from expectation (again in 2003). Therefore, I

am somewhat skeptical of the OP's claims until more information is provided.

Jerry

Sub tryit()

Application.ScreenUpdating = False ' to speed execution

For j = 1 To 10

For i = 1 To 10000

x = Fix(Rnd() * 6) + Fix(Rnd() * 6) + 2 ' sum of two random

integers between 1 and 6

Cells(x, j) = Cells(x, j) + 1

Next i

Next j

Application.ScreenUpdating = True

End Sub

"Mike Middleton" wrote:

> yttrias -

> Which "excel 07 generator" are you using?

> RAND() worksheet function, or

> RND function in VBA, or

> Excel 2007, Data ribbon, Data Analysis, Random Number Generation tool ?

> - Mike Middleton

> http://www.DecisionToolworks.com

> Decision Analysis Add-ins for Excel

>

> "yttrias" <yttrias.excel.questionfor.info.discussions.microsoft.com> wrote in message

> news:09EA2A85-AE22-4CB1-9537-0EC3ACAF7DC1.excel.questionfor.info.microsoft.com...

>

>

#5; Fri, 30 May 2008 20:26:00 GMT

- I don't know about the OP, but Joel's posted code used the VBA Rnd()
- You are still providing too little information for us to understand what you
are doing.

There is no RAND() function in VBA. In VBA, you either used the VBA Rnd()

function, or the worksheet RAND() function via Evaluate("RAND()").

As noted in my earlier post, in 2003 the worksheet RAND() function was far

superior to the VBA Rnd(). I have seen no evidence that either function

changed between 2003 and 2007.

Your original post discussed the sum of two (presumably independent) random

integers between 1 and 6 (such as the total from rolling two fair dice). The

expected value of a single such sum would be 7. It is unclear how that

relates to either your observed average of 1587 or expected average of 1667

for an "average of 20 windows of 40,000 consecutive values.

It sounds as though your VBA would involve only a few lines of code. If you

paste that code into your reply, we can see exactly what you did.

Jerry

"yttrias" wrote:

[vbcol=seagreen]

> I ran 1 million samples using RAND in VBA in excel 07. Then I sampled 40,000

> consecutive values at 20 different locations throughout the range of 1M. The

> average of those 20 windows in the 1M was 1587, where the expected average

> should be 1667.

> --

> yttrias

>

> "Bernard Liengme" wrote:

#6; Fri, 30 May 2008 20:27:00 GMT

- You are still providing too little information for us to understand what you
- >I ran 1 million samples using RAND in VBA in excel 07. Then I sampled
>40,000

> consecutive values at 20 different locations throughout the range of 1M.

> The

> average of those 20 windows in the 1M was 1587, where the expected average

> should be 1667.

How are you using th Randomize statement? If you are executing it more than

one time, that could be skewing your results. Consider this posting I have

posted in the past...

The Randomize statement should not be executed more than once for the

lifespan of the object that is executing your code. Doing so more often

actually makes the data less random than if Randomize is only run once. For

so few colors, and the probable use the OP wants to use the Rnd function

for, it will more than likely not matter here, but it is the concept that is

important to know. I can't demo it within Excel because I am not aware of an

available object where I can set the color of individual pixels (like a

PictureBox control in the compiled version of VB for those who have worked

with that language); however, I do have a VB program that demonstrates this

fact visually.

For those of you having access to the compiled versions of VB5 or VB6, here

is posting I have offered over in the compiled VB newsgroups in the past

that demonstrates this fact...

Running Randomize multiple times ends up producing a less random set of

numbers. To see the problem visually, use this code (which is a modification

of a routine Bob Butler once posted). Start a new project and put two

PictureBox'es on your Form (use the default names for everything and

placement of the PictureBox'es is not important). Paste the following code

into the Form's code window. The PictureBox on the left results from using

Randomize only once, the one on the right uses it repeatedly. Both

PictureBox displays are produced from the same looping code with the only

difference being the use of the Randomize statement Ignoring the pronounce

vertical areas (not sure what that is, probably some kind of boundary

rounding problem), for which one does the distribution of colors look more

"random"?

Rick

Const SCALESIZE = 3

Private Sub Form_Load()

Picture1.ScaleMode = 3

Picture2.ScaleMode = 3

Randomize

Picture1.Move 0, 0, _

128 * Screen.TwipsPerPixelX * SCALESIZE, _

128 * Screen.TwipsPerPixelY * SCALESIZE

Picture2.Move Picture1.Width, 0, _

128 * Screen.TwipsPerPixelX * SCALESIZE, _

128 * Screen.TwipsPerPixelY * SCALESIZE

Me.Width = 2.02 * Picture1.Width

Me.Height = 1.1 * Picture1.Height

End Sub

Private Sub Picture1_Paint()

Dim i As Long

Dim j As Long

Dim colr As Long

Dim bitmask As Long

For i = 0 To Picture1.ScaleHeight Step SCALESIZE

For j = 0 To Picture1.ScaleWidth Step SCALESIZE

colr = Rnd * 16711680

Picture1.Line (j, i)-Step(SCALESIZE, _

SCALESIZE), colr, BF

Next j

Next i

For i = 0 To Picture2.ScaleHeight Step SCALESIZE

For j = 0 To Picture2.ScaleWidth Step SCALESIZE

Randomize

colr = Rnd * 16711680

Picture2.Line (j, i)-Step(SCALESIZE, _

SCALESIZE), colr, BF

Next j

Next i

End Sub

#7; Fri, 30 May 2008 20:28:00 GMT

- >I ran 1 million samples using RAND in VBA in excel 07. Then I sampled
- "Rick Rothstein (MVP - VB)" wrote in message
> For those of you having access to the compiled versions of VB5 or VB6,

here

> is posting I have offered over in the compiled VB newsgroups in the past

> that demonstrates this fact...

> Running Randomize multiple times ends up producing a less random set of

> numbers. To see the problem visually, use this code (which is a

modification

> of a routine Bob Butler once posted). Start a new project and put two

> PictureBox'es on your Form (use the default names for everything and

> placement of the PictureBox'es is not important). Paste the following code

> into the Form's code window. The PictureBox on the left results from using

> Randomize only once, the one on the right uses it repeatedly. Both

> PictureBox displays are produced from the same looping code with the only

> difference being the use of the Randomize statement Ignoring the

pronounce

> vertical areas (not sure what that is, probably some kind of boundary

> rounding problem), for which one does the distribution of colors look more

> "random"?

> Rick

In particular this bit -

> Ignoring the pronounced vertical areas (not sure what that is,

> probably some kind of boundary rounding problem)

Is it some rounding problem or could it be related to the non random nature

of random, even with the single Randomize.

Actually I didn't notice those vertical areas you refer to at first in the

left box, however I amended a couple of values in your code, in particular

changed colr = Rnd * 16711680 to colr = Rnd * 16777215, also increased the

sizes by 1 pixel to exaggerate.

Now I get very distinct not entirely random vertical areas in the left box.

For the second box I changed Randomize in each loop to Randomize 1, just for

fun.

Amended code -

Const SCALESIZE = 3

Private Sub Form_Load()

Const Z As Long = 129 ' original 128

Me.Left = 10

Picture1.ScaleMode = 3

Picture2.ScaleMode = 3

Randomize

Picture1.Move 0, 0, _

Z * Screen.TwipsPerPixelX * SCALESIZE, _

Z * Screen.TwipsPerPixelY * SCALESIZE

Picture2.Move Picture1.Width, 0, _

Z * Screen.TwipsPerPixelX * SCALESIZE, _

Z * Screen.TwipsPerPixelY * SCALESIZE

Me.Width = 2.02 * Picture1.Width

Me.Height = 1.1 * Picture1.Height

End Sub

Private Sub Picture1_Paint()

Dim i As Long

Dim j As Long

Dim colr As Long

Dim bitmask As Long

Const C As Long = 16777215 ' original 16711680

For i = 0 To Picture1.ScaleHeight Step SCALESIZE

For j = 0 To Picture1.ScaleWidth Step SCALESIZE

colr = Rnd * C

Picture1.Line (j, i)-Step(SCALESIZE, _

SCALESIZE), colr, BF

Next j

Next i

For i = 0 To Picture2.ScaleHeight Step SCALESIZE

For j = 0 To Picture2.ScaleWidth Step SCALESIZE

Randomize ' original

Randomize 1

colr = Rnd * C

Picture2.Line (j, i)-Step(SCALESIZE, _

SCALESIZE), colr, BF

Next j

Next i

End Sub

Regards,

Peter T

#8; Fri, 30 May 2008 20:29:00 GMT

- "Rick Rothstein (MVP - VB)" wrote in message
- A similar demo for a VBA Userform -
Option Explicit

Private Declare Function FindWindow Lib "user32" _

Alias "FindWindowA" _

(ByVal lpClassName As String, _

ByVal lpWindowName As String) As Long

Private Declare Function SetPixel Lib "gdi32" _

(ByVal hdc As Long, ByVal x As Long, _

ByVal y As Long, ByVal crColor As Long) As Long

Private Declare Function GetPixel Lib "gdi32" _

(ByVal hdc As Long, ByVal x As Long, ByVal y As Long) As Long

Private Declare Function GetDC Lib "user32" ( _

ByVal hwnd As Long) As Long

Private Declare Function ReleaseDC Lib "user32" ( _

ByVal hwnd As Long, ByVal hdc As Long) As Long

Private Sub PaintPixels()

Dim hwnd As Long, hdc As Long

Dim tp As Long, lt As Long

Dim x As Long, y As Long

Dim colr As Long

Const Z As Long = 128 * 2 - 1

Const C As Long = 16777215

Me.Left = 10: Me.Top = 10

Me.Width = (Z * 2) * 0.75 + 45: Me.Height = Z * 0.75 + 60

' if form is too small, change 0.75 to 1 or 1.25

hwnd = FindWindow("ThunderDFrame", Me.Caption)

hdc = GetDC(hwnd)

tp = Me.Top + 15

lt = Me.Left + 10

Randomize

For y = tp To tp + Z

For x = lt To lt + Z

colr = Rnd * C

SetPixel hdc, x, y, colr

Next

Next

lt = lt + Z + 15

For y = tp To tp + Z

For x = lt To lt + Z

Randomize 1

colr = Rnd * C

SetPixel hdc, x, y, colr

Next

Next

ReleaseDC hwnd, hdc

End Sub

Private Sub UserForm_Activate()

Me.Caption = "Click me to (re-) PaintPixels"

PaintPixels

End Sub

Private Sub UserForm_Click()

' Me.Repaint

PaintPixels

End Sub

Peter T

#9; Fri, 30 May 2008 20:30:00 GMT

- A similar demo for a VBA Userform -