Home |
Search |
Today's Posts |
#1
|
|||
|
|||
Need help calculating probabilities
My actual dataset is much more complicated, but hopefully this simplification will do the trick.
Let's say cells A1 and A2 are =randbetween(1,10), cell A3 is =5, cells B1 and B2 are =randbetween(1,11), and cell B3 is =6. Cell C1 is =A1+A2+A3, and cell C2 is =B1+B2+B3. C3 is =(C1-C2)/((C1+C2)/2), a type of percentage difference calculation. Every time I hit F9 the values in column C will of course be a different number. I would like cell D1 to tell me how often C3.4, D2 to tell me how often .4C3=.2, D3 to tell me how often .2C3-.2, D4 to tell me how often -.2=C3-.4, and D5 to tell me how often C3<=-.4 There's a strong preference towards calculating the values of column D by only referencing column C, if at all possible. Update: actual sheet and more details in subsequent posts Last edited by Kiffar : November 17th 12 at 03:42 PM |
#2
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Need help calculating probabilities
"Kiffar" wrote:
Let's say cells A1 and A2 are =randbetween(1,10), cell A3 is =5, cells B1 and B2 are =randbetween(1,11), and cell B3 is =6. Cell C1 is =A1+A2+A3, and cell C2 is =B1+B2+B3. C3 is =(A3-B3)/((A3+B3)/2), a type of percentage difference calculation. Every time I hit F9 the values in column C will of course be a different number. I would like cell D1 to tell me how often C3.4, D2 to tell me how often 4C3=.2, D3 to tell me how often .2C3-.2, D4 to tell me how often -.2=C3-.4, and D5 to tell me how often C3<=-.4 First, in your effort to simplify the situation, you have defined a nonsensical problem. Since A3 and B3 are constants, C3 is the constant expression (5-6)*2/(5+6) = -2/11, which is about -0.18. Ergo, -0.2<C3<0.2 is always true. I presume you intended to write that C3 is =(C1-C2)*2/(C1+C2). Note that I made an algebraic simplification of (C1-C2)/((C1+C2)/2). ----- To do __exactly__ as you describe, I would create a Worksheet_Calculate event macro to update the counts in D1:D5. Private Sub Worksheet_Calculate() Application.EnableEvents = False Select Case Range("c3") Case Is = 0.4 Range("d1") = Range("d1") + 1 Case Is = 0.2 Range("d2") = Range("d2") + 1 Case Is -0.2 Range("d3") = Range("d3") + 1 Case Is -0.4 Range("d4") = Range("d4") + 1 Case Else Range("d5") = Range("d5") + 1 End Select Application.EnableEvents = True End Sub To create the event macro, right-click on the worksheet tab at the bottom, click on View Code, then copy the text above and paste into the VBA editing pane on the right. You can now close the VBA window. Note: To save the macro with the workbook in Excel 2007 and later, be sure to save as a macro-enabled file (xlsm). ----- But this is not how I would do it in the first place. If you are open to alternative methods, let us know. |
#3
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Need help calculating probabilities
PS.... I wrote:
To do __exactly__ as you describe, I would create a Worksheet_Calculate event macro to update the counts in D1:D5. Although that is what I would do based on your design and solution limitations, I can offer one alternative that does not require a macro. However, I do not recommend it. This solution requires the use of one extra cell. I use F1, which is initially empty. Then enter the following formulas: C3: =IF(F1="","",(C1-C2)*2/(C1+C2)) D1: =IF(C3="",0,D1+(C3=0.4)) D2: =IF(C3="",0,D2+(0.2<=C3)*(C3<0.4)) D3: =IF(C3="",0,D3+(-0.2<C3)*(C3<0.2)) D4: =IF(C3="",0,D4+(-0.4<C3)*(C3<=-0.2)) D5: =IF(C3="",0,D5+(C3<=-0.4)) Set the Iterative calculation option with Max Iterations set to 1. When you are ready to start your experiment, set F1 to 1. That is the first recalculations. Press F9 successively for subsequent recalculations. Clear F1 to start the counters at zero again. To set the Iterative calculation option: 1. In Excel 2003, click on Tools, Options, Calculation. 2. In Excel 2007, click on the Office Button, Excel Options, Formulas. 3. In Excel 2010, click on File, Options, Formulas. Caveat: I deprecate the use of the Iterative calculation option for two reasons. First, it is unreliable in some situations, although it is probably reliable in this limited usage. Second, it masks mistaken circular references; that is, you will not get an error message. But again, I would do this experiment very differently altogether, not relying on either a Calculate event macro or the iterative calculation option. |
#4
|
|||
|
|||
You were correct in ascertaining the nature of my typo in writing up the original post, I have edited it accordingly.
Is this because it would be inaccurate, or because there's a more efficient method? I have no preference toward any type of method, as long as it's accurate. I've never used macros or anything fancy like that before, though, so I might need handholding implementing such things. To give you an idea of my familiarity with Excel functions, I just learned how if statements work this week. Though I'd prefer to limit the number of simulations, so that each time I change one of the non-random variables, it doesn't take very long to update the calculations, as my actual datasets have billions of possibilities. The results are still based off of C1 and C2 effectively, there's just much more that goes into calculating those two cells in the real sheet I have going. Last edited by Kiffar : November 14th 12 at 01:22 AM |
#5
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Need help calculating probabilities
"Kiffar" wrote:
But again, I would do this experiment very differently altogether, not relying on either a Calculate event macro or the iterative calculation option. Is this because it would be inaccurate, or because there's a more efficient method? The latter. "Kiffar" wrote: I'd prefer to limit the number of simulations, so that each time I change one of the non-random variables, it doesn't take very long to update the calculations, as my actual datasets have billions of possibilities. The "best" implementation approach will depend on the exact nature of the simulation. I would encourage you to share all the details with us. You can upload an example Excel file (devoid of any private data) that demonstrates the problem to a file-sharing website. Then post the "shared", "public" or "view-only" link (aka URL; http://...) in a response here. The following is a list of some free file-sharing websites; or use your own. Box.Net: http://www.box.net/files Windows Live Skydrive: http://skydrive.live.com MediaFi http://www.mediafire.com FileFactory: http://www.filefactory.com FileSavr: http://www.filesavr.com RapidSha http://www.rapidshare.com Alternatively, send email to me at joeu2004 "at" hotmail.com, ideally with a full description and an Excel file attachment. (Some forums censor real email addresses.) "Kiffar" wrote: I've never used macros or anything fancy like that before, though, so I might need handholding implementing such things. To give you an idea of my familiarity with Excel functions, I just learned how if statements work this week. Generally, VBA (macros) is the best way to implement simulations, first because the logic is sometimes difficult to write in Excel, and second because we can avoid recalculations at unexpected times. The use of RANDBETWEEN and RAND usually causes recalculations every time any cell in any worksheet in the workbook is edited. Be that as it may, here is a quick-and-dirty way to do the simulation that you offered as an example. You will note that the cell usage is very different from what you asked for. Is that really a deal-breaker? In Sheet1: A1: =RANDBETWEEN(1,10) B1: =RANDBETWEEN(1,10) C1: =RANDBETWEEN(1,11) D1: =RANDBETWEEN(1,11) E1: =A1+B1+5 F1: =C1+D1+6 G1: =(C1-C2)*2/(C1+C2) Copy A1:G1 down through A10000:F10000. One way to do that easily: 1. Copy A1:G1. 2. Type A2:G10000 into the Name Box, then press Enter to select that range. 3. Paste by pressing ctrl+V. Note that columns E, F and G correspond to your C1, C2 and C3. There are just 10,000 of them -- 10,000 simulations. For now, copy G1:G10000 and paste-special-value into H1:H10000. That "freezes" the simulation. Note that G1:G10000 et al will continue to change. We don't care. I put the random data into a separate worksheet so that we can use a macro later, if you wish, to "freeze" the simulation instead of using copy-and-paste-special-value. In Sheet2: D1: =COUNTIF(Sheet1!H1:H10000,"=0.4") D2: =COUNTIF(Sheet1!H1:H10000,"=0.2") - D1 D3: =COUNTIF(Sheet1!H1:H10000,"-0.2") - D1 - D2 D4: =COUNTIF(Sheet1!H1:H10000,"-0.4") - SUM(D1:D3) D5: =COUNTIF(Sheet1!H1:H10000,"<=-0.4") D7: =SUM(D1:D5) ' should be 10000, of course For D2:D4, we could use COUNTIFS in Excel 2007 and later instead. For example: =COUNTIFS(Sheet1!H1:H10000,"=0.2",Sheet1!H1:H1000 0,"<0.4") E1: =D1/$D$7 Copy E1 and paste into E2:E5. Format E1:E5 as Percentage. Those are your probabilities. With your example data, RANDBETWEEN(1,10) and RANDBETWEEN(1,11), the probability distribution should be close to the following: 8.98% C3=0.4 10.96% 0.2<=C3<0.4 38.69% -0.2<C3<0.2 18.19% -0.4<C3<=-0.2 23.18% C3<=-0.4 For your example, the expected probability distribution can be computed using a macro. Let me know if you are interested. Is that an approach you can work with? |
#6
|
|||
|
|||
It looks like that would work in concept. Here's the actual sheet I'm working on (In Excel 2007): https://rapidshare.com/files/3231189247/Combat.xlsm
What I've been referring to as C1 and C2 are actually the pink and green numbers in columns K and W. Those numbers are simply autosums of K2: K26 and W2: W26, respectively. How each of those cells calculates its value is not currently decided with certainty (and in fact, has changed since I've uploaded the sample and began writing this, and likely will again). What is known, which hopefully is sufficient for the discussion at hand, is that all of K2: K26 and W2: W26 will be calculated the same way. Additionally, it is certain that the random number component for each row will always be either a number from 1-20 (columns I and Y), or a manually entered value in the adjacent column (J and X). Are there any changes I should be aware of that if made would throw of the probability calculations? One notable thing that could change is the addition of more columns and rows, would macros written for the sheet automatically change the cells they refer to in order to accommodate the same way cell functions do? For simplification I had cut out some of the range classifications, the full list is =.4, =.2, =.1, =.05, -.05, -.1, -.2, -.4, and =<-.4 You can see in the middle where each classification will go, under chance of victory. The chance of defeat section will simply be a mirror of the chance of victory section using linked cells. I'm leaning towards deleting it though, if it matters. The decimals will be displayed as percentages, I had intended to simply do that via the cell format menu. 5% through negative 5% is a stalemate, 5 through 9.99~ is marginal, etc. Note that the percentage difference calculation in the middle is showing the results for the current engagement is an =abs function to always turn the result into a positive number. Thus the probability calculations should use ($J$27-$W$27)/(($J$27+$W$27)/2) rather than referring to M34. I've switched off Excel's automatic formula recalculation for now, so that it only recalculates random numbers when I press F9. However, it seems this is a global setting that will apply to all sheets. It is desirable to have the randomly generated numbers and probabilities not change unless manually refreshed for this specific worksheet only, so would it be possible to replace =randbetween() with a macro? Would it be possible to make a refresh button on the sheet labeled "New Match"? Edit: updated to a more current file Last edited by Kiffar : November 17th 12 at 03:42 PM |
#7
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Need help calculating probabilities
"Kiffar" wrote:
It looks like that would work in concept. Here's the actual sheet I'm working on (In Excel 2007): https://rapidshare.com/files/1408769342/SampleZ.xlsm Sorry, but I cannot respond in a timely manner. Dealing with a family tragedy. |
#8
|
|||
|
|||
Quote:
|
#9
|
|||
|
|||
Joe is understandably busy and hopefully doing well. For anyone else still looking, there's been a notable change to the way the to numbers being compared are calculated. Ten more rows have been added, and so the two numbers now being compared are J37 and W37. Further, they are now calculated by summing the same columns, but then multiplied a static percentage, then further multiplied further by a random percentage, set by the sheet's user columns S and T at the top. Then a static number is added. The version of the sheet I have uploaded can increase or decrease the value of J37 and W37 by as much as 20%. Directly to the right of those settings, it shows the current result of the random generation in the current calculation.
Latest sheet he https://rapidshare.com/files/3231189247/Combat.xlsm |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Calculating probabilities | Excel Worksheet Functions | |||
Calculating Probabilities | Excel Worksheet Functions | |||
Question about use of Poisson probabilities | Excel Worksheet Functions | |||
Probabilities | Excel Programming | |||
Macro and probabilities | Excel Programming |