mirror of
https://github.com/redoules/redoules.github.io.git
synced 2025-12-12 15:59:34 +00:00
fixed bullet points
markdown bug fixed
parent 7b576c431c
commit 0e8df83c1e
@ -127,12 +127,14 @@
<div class='article_content'>
<h2>Least Square Regression Line</h2>
<h2>Problem</h2>
<p>A group of five students enrolls in Statistics immediately after taking a Math aptitude test. Each student's Math aptitude test score, <span class="math">\(x\)</span>, and Statistics course grade, <span class="math">\(y\)</span>, can be expressed as the following list <span class="math">\((x,y)\)</span> of points:
<em> <span class="math">\((95, 85)\)</span>
</em> <span class="math">\((85, 95)\)</span>
<em> <span class="math">\((80, 70)\)</span>
</em> <span class="math">\((70, 65)\)</span>
* <span class="math">\((60, 70)\)</span></p>
<p>A group of five students enrolls in Statistics immediately after taking a Math aptitude test. Each student's Math aptitude test score, <span class="math">\(x\)</span>, and Statistics course grade, <span class="math">\(y\)</span>, can be expressed as the following list <span class="math">\((x,y)\)</span> of points: </p>
<ul>
<li><span class="math">\((95, 85)\)</span></li>
<li><span class="math">\((85, 95)\)</span></li>
<li><span class="math">\((80, 70)\)</span></li>
<li><span class="math">\((70, 65)\)</span></li>
<li><span class="math">\((60, 70)\)</span></li>
</ul>
<p>If a student scored an 80 on the Math aptitude test, what grade would we expect them to achieve in Statistics? Determine the equation of the best-fit line using the least squares method, then compute and print the value of <span class="math">\(y\)</span> when <span class="math">\(x=80\)</span>.</p>
<div class="highlight"><pre><span></span><span class="n">X</span> <span class="o">=</span> <span class="p">[</span><span class="mi">95</span><span class="p">,</span> <span class="mi">85</span><span class="p">,</span> <span class="mi">80</span><span class="p">,</span> <span class="mi">70</span><span class="p">,</span> <span class="mi">60</span><span class="p">]</span>
<span class="n">Y</span> <span class="o">=</span> <span class="p">[</span><span class="mi">85</span><span class="p">,</span> <span class="mi">95</span><span class="p">,</span> <span class="mi">70</span><span class="p">,</span> <span class="mi">65</span><span class="p">,</span> <span class="mi">70</span><span class="p">]</span>
@ -186,10 +188,12 @@ $$</div>
\right.
$$</div>
<p>so <span class="math">\(b_1=-\frac{3}{4}\)</span> and <span class="math">\(b_2=-\frac{3}{4}\)</span></p>
<p>When we apply the Pearson's coefficient formula :
<em> let <span class="math">\(p\)</span> be the pearson coefficient
</em> let <span class="math">\(\sigma_X\)</span> be the standard deviation of <span class="math">\(x\)</span>
* let <span class="math">\(\sigma_Y\)</span> be the standard deviation of <span class="math">\(y\)</span></p>
<p>When we apply the Pearson coefficient formula:</p>
<ul>
<li>let <span class="math">\(p\)</span> be the Pearson coefficient</li>
<li>let <span class="math">\(\sigma_X\)</span> be the standard deviation of <span class="math">\(x\)</span></li>
<li>let <span class="math">\(\sigma_Y\)</span> be the standard deviation of <span class="math">\(y\)</span></li>
</ul>
<p>We hence have </p>
<div class="math">$$
\left\{\begin{array}{ r @{{}={}} r >{{}}c<{{}} r >{{}}c<{{}} r }
@ -128,18 +128,22 @@
<h2>Linear Regression</h2>
<p>If our data shows a linear relationship between <span class="math">\(X\)</span> and <span class="math">\(Y\)</span>, then the straight line which best describes the relationship is the regression line. The regression line is given by <span class="math">\(\hat{Y}=a+bX\)</span>.</p>
<h3>Finding the value of b</h3>
<p>The value of <span class="math">\(b\)</span> can be calculated using either of the following formulae:
<em> <span class="math">\(b=\frac{n\sum(x_iy_i)-(\sum x_i)(\sum y_i)}{n\sum(x_i^2)-(\sum x_i)^2}\)</span>
</em> <span class="math">\(b=\rho\frac{\sigma_Y}{\sigma_X}\)</span>, where <span class="math">\(\rho\)</span> is the Pearson correlation coefficient, <span class="math">\(\sigma_X\)</span></p>
<p>The value of <span class="math">\(b\)</span> can be calculated using either of the following formulae:</p>
<ul>
<li><span class="math">\(b=\frac{n\sum(x_iy_i)-(\sum x_i)(\sum y_i)}{n\sum(x_i^2)-(\sum x_i)^2}\)</span></li>
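<li><span class="math">\(b=\rho\frac{\sigma_Y}{\sigma_X}\)</span>, where <span class="math">\(\rho\)</span> is the Pearson correlation coefficient, <span class="math">\(\sigma_X\)</span> is the standard deviation of <span class="math">\(X\)</span> and <span class="math">\(\sigma_Y\)</span> is the standard deviation of <span class="math">\(Y\)</span></li>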
</ul>
<h3>Finding the value of a</h3>
<p><span class="math">\(a=\bar{y}-b\cdot\bar{x}\)</span>, where <span class="math">\(\bar{x}\)</span> is the mean of <span class="math">\(X\)</span> and <span class="math">\(\bar{y}\)</span> is the mean of <span class="math">\(Y\)</span>.</p>
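<p>As a minimal sketch of the two steps above, the following Python computes b with the summation formula and then a from the means, using only the standard library. The sample data are the five (x, y) points from the Least Square Regression Line problem; the function name fit_line is hypothetical and chosen only for illustration.</p>

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b*x.

    fit_line is an illustrative helper, not part of any library.
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # Slope from the summation formula.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    # Intercept from the means: a = y_bar - b * x_bar.
    a = mean(ys) - b * mean(xs)
    return a, b


X = [95, 85, 80, 70, 60]
Y = [85, 95, 70, 65, 70]
a, b = fit_line(X, Y)
prediction = a + b * 80
print(f"a = {a:.3f}, b = {b:.3f}, y(80) = {prediction:.3f}")
# a = 26.781, b = 0.644, y(80) = 78.288
```

<p>The Pearson form of the slope should give the same value of b; the summation form is used here only because it needs no intermediate statistics.</p>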
<h2>Coefficient of determination (<span class="math">\(R^2\)</span>)</h2>
<p>The coefficient of determination can be computed with:
<span class="math">\(R^2 = \frac{SSR}{SST}=1-\frac{SSE}{SST}\)</span>
Where :
<em> <span class="math">\(SST\)</span> is the total Sum of Squares : <span class="math">\(SST=\sum (y_i-\bar{y})^2\)</span>
</em> <span class="math">\(SSR\)</span> is the regression Sum of Squares : <span class="math">\(SSR=\sum (\hat{y_i}-\bar{y})^2\)</span>
* <span class="math">\(SSE\)</span> is the error Sum of Squares : <span class="math">\(SSE=\sum (\hat{y_i}-y)^2\)</span></p>
Where :</p>
<ul>
<li><span class="math">\(SST\)</span> is the total Sum of Squares : <span class="math">\(SST=\sum (y_i-\bar{y})^2\)</span></li>
<li><span class="math">\(SSR\)</span> is the regression Sum of Squares : <span class="math">\(SSR=\sum (\hat{y_i}-\bar{y})^2\)</span></li>
<li><span class="math">\(SSE\)</span> is the error Sum of Squares : <span class="math">\(SSE=\sum (\hat{y_i}-y_i)^2\)</span></li>
</ul>
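<p>If <span class="math">\(SSE\)</span> is small relative to <span class="math">\(SST\)</span>, so that <span class="math">\(R^2\)</span> is close to 1, we can assume that our fit is good.</p>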
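<p>The three sums of squares can be sketched in plain Python as follows. This is an illustration under stated assumptions: the helper name r_squared is hypothetical, the sample data are the five points from the Least Square Regression Line problem, and the slope is computed in its centered form, which is algebraically equivalent to the summation formula for b.</p>

```python
from statistics import mean


def r_squared(xs, ys):
    """Return (sst, ssr, sse, r2) for the least squares line through (xs, ys).

    r_squared is an illustrative helper, not part of any library.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    # Centered form of the slope, then the intercept from the means.
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    y_hat = [a + b * x for x in xs]

    sst = sum((y - y_bar) ** 2 for y in ys)               # total
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # regression
    sse = sum((yh - y) ** 2 for yh, y in zip(y_hat, ys))  # error
    return sst, ssr, sse, ssr / sst


X = [95, 85, 80, 70, 60]
Y = [85, 95, 70, 65, 70]
sst, ssr, sse, r2 = r_squared(X, Y)
print(f"SST = {sst:.1f}, R^2 = {r2:.3f}")
```

<p>For a least squares line with an intercept, SSR + SSE equals SST up to rounding, which is why both expressions for the coefficient of determination agree.</p>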
<h2>Linear Regression in Python</h2>
<p>We can use the fit method of the sklearn.linear_model.LinearRegression class.</p>
@ -5,7 +5,7 @@ xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>redoules.github.io/</loc>
<lastmod>2018-11-15T21:59:45-00:00</lastmod>
<lastmod>2018-11-15T22:07:32-00:00</lastmod>
<changefreq>daily</changefreq>
<priority>0.5</priority>
</url>
File diff suppressed because one or more lines are too long