Variance
Avg sq dev from mean
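A minimal sketch of the definition above, using made-up numbers (the `returns` list is hypothetical, not from these notes). It divides by the number of observations, i.e. the population form of variance:

```python
from statistics import mean, pvariance

# Hypothetical observations, for illustration only
returns = [0.01, -0.02, 0.03, 0.00, 0.02]

mu = mean(returns)
# Average of the squared deviations from the mean
var_manual = sum((x - mu) ** 2 for x in returns) / len(returns)

# The stdlib population variance should agree with the manual calculation
var_stdlib = pvariance(returns)
print(var_manual, var_stdlib)
```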
Bayes
P(A|B) * P(B) = P(B|A) * P(A)
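A small sketch of how the identity is typically used: rearrange it to get P(A|B) from P(B|A). All probabilities below are hypothetical numbers chosen for illustration.

```python
# Hypothetical inputs, for illustration only
p_a = 0.3              # P(A)
p_b_given_a = 0.8      # P(B|A)
p_b_given_not_a = 0.2  # P(B|not A)

# Total probability of B
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Rearranged identity: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(p_a_given_b)
```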
Binomial distribution
Prob of x successes in n trials
Mean np
Var np (1-p)
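A quick numerical check of the mean and variance formulas, assuming n = 10 and p = 0.3 (arbitrary example values). The helper `binom_pmf` is a name introduced here for illustration:

```python
from math import comb

def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # arbitrary example values
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the distribution
dist_mean = sum(x * q for x, q in enumerate(pmf))
dist_var = sum((x - dist_mean) ** 2 * q for x, q in enumerate(pmf))

print(dist_mean, n * p)           # both should be close to np
print(dist_var, n * p * (1 - p))  # both should be close to np(1-p)
```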
Sampling distribution
Population of 1k bonds
Randomly pick 100 to get mean
Pick another 100 to get mean
Repeat x times
Now we have x means forming sampling distribution of the mean
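The procedure above can be sketched as a simulation. The population here is 1,000 made-up bond yields drawn from a normal distribution, and the 2,000 repetitions are an arbitrary choice; both are assumptions for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 bond yields (in %)
population = [random.gauss(5.0, 2.0) for _ in range(1000)]

repeats = 2000     # the "x times" in the notes; arbitrary choice
sample_size = 100

# Each entry is the mean of one random sample of 100 bonds
sample_means = [
    mean(random.sample(population, sample_size)) for _ in range(repeats)
]

# sample_means is the (simulated) sampling distribution of the mean
print(mean(population), mean(sample_means))
```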
Central limit theorem
Population mean = mean of sample means
Population variance = n * variance of sample means, with n = sample size
Std err of sample means
= Std dev of sample meanS
= population std dev / sqrt (n)
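Think about it: the more the observations vary, the more likely you are to get an inaccurate estimate
The opposite holds for sample size: the larger the sample, the more accurate the estimate
In practice we don't have the population std dev, so we use the std dev of a single sample (NOT of the sample means) as an estimate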
Putting the above together, the sample mean gives us a point estimate of the population mean.
We can build a confidence interval around the point estimate using the std err
If the population variance is unknown, and especially if the sample size is small, we use the t distribution instead of the normal, i.e. z.
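A rough simulation of the points above, under stated assumptions: a made-up normal population, sampling with replacement (so no finite-population correction is needed), and a z value of 1.96 for a 95% interval. None of these numbers come from the notes.

```python
import math
import random
from statistics import mean, pstdev, stdev

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population of 1,000 values
population = [random.gauss(5.0, 2.0) for _ in range(1000)]
n = 100  # sample size

# Many samples (with replacement) -> sampling distribution of the mean
sample_means = [mean(random.choices(population, k=n)) for _ in range(2000)]

# Std err predicted by the formula vs. std dev of the simulated sample means
predicted_se = pstdev(population) / math.sqrt(n)
observed_se = stdev(sample_means)
print(predicted_se, observed_se)

# In practice we only have ONE sample, so we estimate from it
sample = random.choices(population, k=n)
point_estimate = mean(sample)
std_err = stdev(sample) / math.sqrt(n)  # sample sd stands in for population sd

z = 1.96  # 95% two-sided interval under the normal approximation
ci = (point_estimate - z * std_err, point_estimate + z * std_err)
print(point_estimate, ci)
```

With a small sample, the 1.96 would be replaced by the corresponding t critical value.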
Hypothesis testing
Is daily option return = 0?
Sample size of 250 days
Mean return = .1%
Sample std dev of return =.25%
Null hypothesis: population daily option return = 0
If the difference between the sample mean and the hypothesized population mean is big enough, then we can reject the null hypothesis and say mean return != 0.
How to quantify whether it's big enough?
We have to look at how accurate the sample mean is, i.e. how close the sample mean is likely to be to the population mean.
Say the sample mean is 100% accurate, i.e. sample size = population size. Then ANY difference between the sample mean and the hypothesized population mean is sufficient for us to reject the null hypothesis.
We quantify the accuracy of the sample mean with the std err, i.e. std dev of sample meanS, = population sd / sqrt(n), using the sample sd as the estimate of the population sd
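= .25% / sqrt(250) = .000158 (i.e. .0158%)
.1% divided by the above gives about 6.32
This tells us the sample mean is about 6.32 std errs away from the hypothesized mean of 0, which would be very unlikely if the null hypothesis were true
At the 5% significance level, the two-tailed critical values are +-1.96 std errs under the z distribution, so we reject the null hypothesis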
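The arithmetic of this test can be reproduced in a few lines. The inputs are the ones given in the notes; the variable names are mine, and the two-tailed z test at 5% is assumed.

```python
import math
from statistics import NormalDist

n = 250                  # sample size in days
sample_mean = 0.001      # .1% mean daily return
sample_sd = 0.0025       # .25% sample std dev
hypothesized_mean = 0.0  # null hypothesis

# Std err of the sample mean, using the sample sd as the estimate
std_err = sample_sd / math.sqrt(n)

# How many std errs the sample mean is from the hypothesized mean
z_stat = (sample_mean - hypothesized_mean) / std_err

# Two-tailed critical value at the 5% significance level (about 1.96)
z_crit = NormalDist().inv_cdf(0.975)

reject_null = abs(z_stat) > z_crit
print(std_err, z_stat, z_crit, reject_null)
```

The test statistic comes out at roughly 6.3, well beyond the critical value.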
Regression
R^2 = coefficient of determination
= explained variation / total variation
= (total - unexplained)/ total
With
Total = sum of sq dev from mean
Unexplained = sum of sq dev from predicted
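A small sketch of the R^2 calculation for a one-variable least-squares fit. The `x` and `y` values are made up for illustration.

```python
from statistics import mean

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)

# Ordinary least squares fit of y = intercept + slope * x
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * xi for xi in x]

# Total = sum of squared deviations from the mean
total = sum((yi - y_bar) ** 2 for yi in y)
# Unexplained = sum of squared deviations from the predicted values
unexplained = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

r_squared = (total - unexplained) / total
print(r_squared)
```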
Testing a regression coefficient for significance
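t stat = regression coefficient / its std err. If the absolute t stat exceeds a critical t value, the coefficient is significantly different from 0. Equivalently, the interval coefficient +- critical t * std err does not contain 0.
The critical value is a function of the degrees of freedom, n-k-1 = sample size - number of independent variables - 1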
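A minimal sketch of this test for a single independent variable (k = 1), using made-up data. The critical value 3.182 is the two-tailed 5% value for 3 degrees of freedom, taken from a standard t table and hardcoded here as an assumption rather than computed.

```python
import math
from statistics import mean

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n, k = len(x), 1  # sample size, number of independent variables
x_bar, y_bar = mean(x), mean(y)

# Ordinary least squares fit
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
intercept = y_bar - slope * x_bar

# Residual variance uses n-k-1 degrees of freedom
df = n - k - 1
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
se_slope = math.sqrt(sse / df / s_xx)  # std err of the slope coefficient

t_stat = slope / se_slope
t_crit = 3.182  # two-tailed 5% critical t for df = 3 (from a t table)

significant = abs(t_stat) > t_crit
ci = (slope - t_crit * se_slope, slope + t_crit * se_slope)
print(t_stat, significant, ci)
```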