Suggested Errata
Introduction to Probability version dated 04/19/2023
Dear Marc,
While catching up this month, I came across a number of suggested errata; see the text below.
Thank you for putting together this course. It was a lot of fun as a fast-track probability review.
Cheers,
Dave
2 Combinatorial Analysis
Course Notes Sections: “Combinations” and “Multinomial Coefficient”
The first sentence of the section “Combinations” reads:
“Now consider you are choosing i out of n identical objects. What is the number of different groups of i objects?”
The text should be:
“Now consider you are choosing i out of n unique (distinguishable) objects. What is the number of different groups of i objects you can form?”
The change is relevant. If all n objects were identical, every group of i objects would look the same, so there would be nothing to count. Moreover, the binomial coefficient only gives the number of combinations when the n elements are distinguishable (unique), i.e. when there are no repeating elements.
We can see this for ourselves in the WL by creating permutations of the letters of the words “rest” and “reset”, which have no repeating elements and two repeating elements, respectively.
In[]:=
Length@Permutations[Characters["rest"],{2}]
Length@Permutations[Characters["reset"],{2}]
Out[]=
12
Out[]=
13
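The cell above counts ordered selections (Permutations). For the unordered groups that the “Combinations” section is about, here is a minimal sketch along the same lines. It assumes that Subsets, Sort and DeleteDuplicates are an acceptable way to enumerate groups, and the names restGroups and resetGroups are only illustrative. It suggests that Binomial matches the count only when all letters are distinct:

```wolfram
(* Unordered groups of 2 letters from "rest": all letters are distinct *)
restGroups = Subsets[Characters["rest"], {2}];
Length[restGroups] == Binomial[4, 2]   (* True: 6 groups *)

(* "reset" contains the letter e twice. Subsets selects by position, *)
(* so sort each group and drop the duplicates to count distinct groups *)
resetGroups = DeleteDuplicates[Sort /@ Subsets[Characters["reset"], {2}]];
{Length[resetGroups], Binomial[5, 2]}  (* {7, 10} *)
```

With the repeated letter, Binomial[5, 2] overcounts: it treats the two e's as different objects.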
The same applies to the section on the Multinomial Coefficient, but here the change is more subtle. If the n elements are not all identical, the multinomial can be used to count the number of arrangements when some elements repeat.
For example, the multinomial can be used to compute the number of permutations of all letters in the word “reset”, where one letter, “e”, occurs twice. In the WL we can see it for ourselves:
In[]:=
Length@Permutations@Characters["reset"]
Multinomial@@Values@Counts@Characters["reset"]
Out[]=
60
Out[]=
60
Compare the results with a word without any repeating letters, e.g. the word “rest”, where the number of permutations of the n letters is just Factorial[n].
In[]:=
Length@Permutations@Characters["rest"]
Out[]=
24
7 Random Variables
Exercise 2
The probability function in the question is not the same as the one used in the solution and should read:
p[x_]:=Piecewise[{{1/8,x==8},{(-2+Sqrt[4+x^2])/x^2*Sqrt[5+x^2],x!=8}}]
10 Properties of Random Variables
Exercise 3
The solution text should be: E[X₁+X₂+X₃+X₄] = E[X₁]+E[X₂]+E[X₃]+E[X₄] and Var[X₁+X₂+X₃+X₄] = Var[X₁]+Var[X₂]+Var[X₃]+Var[X₄]
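As a reminder of why the variance line holds, here is the general identity for four variables. The assumption that the four variables are independent (or at least uncorrelated) is mine and is not stated in this erratum:

```latex
\operatorname{Var}\!\Big(\sum_{i=1}^{4} X_i\Big)
  = \sum_{i=1}^{4} \operatorname{Var}(X_i)
  + 2 \sum_{1 \le i < j \le 4} \operatorname{Cov}(X_i, X_j)
% If the X_i are independent, every covariance term is zero, so
% Var(X_1 + X_2 + X_3 + X_4) = Var(X_1) + Var(X_2) + Var(X_3) + Var(X_4).
```

The expectation line needs no such assumption, since expectation is always linear.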
15 The Multinomial Distribution
Exercise 2
The cost functions in the question and in the solution differ. The cost function in the question is: y1+2y2+3y3.
18 The Normal Distribution
Exercise 4
The text should mention “Height” instead of “Control Mass”.
Exercise 5
The term “cube” usually means raising to the power 3; what is meant here is “cube root”.
19 The Multinormal Distribution
Exercise 4
The weight meant here is WholeWeight.
21 Mixture Distributions
Exercise 4
The question is unclear and needs an introduction to the data properties. The text could be:
We need the ResourceData “Sample Data: Myrtles” for which we select the “Points” properties to use for our analysis.
We want to create an estimated distribution of a mixture of 3 multi-normal distributions, also known as a Gaussian Mixture Model, or GMM. Next we will compare a contour plot of the GMM with a contour plot of the SmoothKernelDistribution. Can you guess what the data represents?
Exercise 5
The question is not very clear and could read:
We will look at the petal and sepal lengths and widths of Fisher’s Iris dataset. The Dataset can be found in the ResourceData.
◼ Create a mixture distribution of the sepal lengths and widths, where all three species are equally weighted. (Assume a multi-normal distribution.)
◼ Compute an estimated distribution of the sepal lengths and widths, aggregated over all species. (Assume a multi-normal distribution.)
◼ Finally, create an empirical distribution, aggregated over all species, using the SmoothKernelDistribution.
◼ Create a contour plot of all three distributions.
23 The Law of Large Numbers
Exercise 1
According to Wikipedia, the Chebyshev inequality is P(|X − μ| ≥ kσ) ⩽ 1/k², so the inequality in the course notes may not be correct. This agrees with the solution of Exercise 1, where |X − μ| ≥ kσ: if X − μ = 12 and σ = 6, then k should be 2.
Source: https://en.wikipedia.org/wiki/Chebyshev’s_inequality
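Written out with the numbers quoted for Exercise 1 (a deviation of 12 and σ = 6), the Wikipedia form of the inequality gives the following worked step, offered only as a check of the value of k:

```latex
k = \frac{|X - \mu|}{\sigma} = \frac{12}{6} = 2,
\qquad
P\big(|X - \mu| \ge 2\sigma\big) \le \frac{1}{k^{2}} = \frac{1}{4}
```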
Exercise 3
The text would be more accurate with this:
“Suppose the lifetime variable X of an appliance has an exponential distribution with an average lifetime of 10 years. Compute an asymptotic approximation of order 3 for the expectation of ..”
A hint about the value to which the parameter “a” (or alpha) is supposed to converge, e.g. “a->-1”, would make the concept clearer. The example in the course notes that computes the AsymptoticProbability of order 3 states that the parameter “b” converges to 1, so it is not clear that we need to use -1 in this exercise.
Exercise 4
The text of the exercise is not clear. Is the experience described by a score from 1 to 10 or from 3 to 10? The solution suggests it should be 3 to 10.
Exercise 5
The text does not state whether we should generate random variates from each individual distribution or from a composite distribution. The solution suggests the latter. The text could be:
Generate random variates from a combined distribution comprising a Normal, a Cauchy and a StudentT distribution. The distributions’ average is 0 and the standard deviation is random. The StudentTDistribution has 2 degrees of freedom.
Plot the progression of the average of 1000 simulated samples. Compare the convergence rate to that of Exercise 4.
24 Normal Approximations to the Binomial
Example in Lesson Notes
The text should follow the solution:
You roll a fair six-sided die 180 times. What is the probability you roll a one at least 20 times and at most 30 times?
The normal approximation in the notes does not replicate the binomial solution; it needs a continuity correction of 0.5 on both bounds:
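In[]:=
dist=BinomialDistribution[180,1/6];
(* the arguments of NormalDistribution were lost; mean 30 and standard deviation 5 are inferred from the output below *)
Probability[20-0.5<x<30+0.5,x\[Distributed]NormalDistribution[Mean[dist],StandardDeviation[dist]]]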
Out[]=
0.521963
Now the result is closer to the binomial one.
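A hedged side-by-side check of the continuity correction: the sketch below compares the exact binomial probability of rolling 20 to 30 ones with the normal approximation, with and without the 0.5 correction. The names approx, exact, corrected and uncorrected are illustrative and not from the course notes. Taking the normal parameters from Mean and StandardDeviation of the binomial is my assumption.

```wolfram
dist = BinomialDistribution[180, 1/6];

(* Normal distribution with the same mean (30) and standard deviation (5) *)
approx = NormalDistribution[Mean[dist], StandardDeviation[dist]];

(* Exact probability of 20 to 30 ones, both bounds included *)
exact = N@Probability[20 <= x <= 30, x \[Distributed] dist];

(* Normal approximation with the continuity correction of 0.5 *)
corrected = N@Probability[20 - 0.5 < x < 30 + 0.5, x \[Distributed] approx];

(* Normal approximation without the correction *)
uncorrected = N@Probability[20 < x < 30, x \[Distributed] approx];

{exact, corrected, uncorrected}
```

Only the corrected value, 0.521963, is quoted in this erratum; the other two are left to evaluate.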
Exercise 1
The question is:
What is the probability more than 100 and 300 or less adults sleep with a comfort object?
So we should exclude 100 and include 300:
In[]:=
comfortDist=NormalDistribution[1000*1/3,Sqrt[1000*2/3*1/3]];
Probability[100+0.5<x<=300+0.5,x\[Distributed]comfortDist]
Out[]=
0.0138141
Not a big deal, but there is a subtle difference (x ⩽ 300.5 instead of x < 300.5).
Quiz 6
Problem 6
The only choice I could compute is in there, but it’s not marked as correct. Is there (still) an issue with the correct answer in the framework?
Practice Exam
Problem 15
The normal approximation is appropriate because the variance is ≥ 10, so we expect the solution to use it, but it does not.
The solution should include:
In[]:=
n=500;p=0.049;var=n p(1-p);
Probability[20+0.5<x<30-0.5,x\[Distributed]NormalDistribution[n p,Sqrt[var]]]
Out[]=
0.646221
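For reference, here is the arithmetic behind the variance ≥ 10 check in Problem 15, written out with the values n = 500 and p = 0.049 from the cell above. The rounded standard deviation is my own calculation:

```latex
np = 500 \times 0.049 = 24.5,
\qquad
np(1-p) = 24.5 \times 0.951 = 23.2995 \ge 10,
\qquad
\sigma = \sqrt{23.2995} \approx 4.827
```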
Problem 16
The answer is incorrect if we follow the solution.
It should be 9.28606.