Wednesday, February 10, 2016

A nice probability problem

Today's post temporarily breaks away from the pattern of analysis-related posts; however, you can expect some more analysis-related material on this blog within the next few days.

I came up with a probability question today with (at least what I think is) a nifty solution.

Here is a statement of the problem: 'You are taking part in an experiment with the following rules. You flip a fair two-sided coin. If you flip heads, you must roll a fair six-sided die: if you roll an even number, you get $2$ points, and if you roll an odd number, you get $1$ point. After a heads and its die roll, you are allowed to flip the coin again. However, if you ever flip tails, you receive $0$ additional points and must stop the experiment immediately. How many points should you expect to receive in total? (That is, what is the expected value of this experiment?)'

So, as an example, I could receive tails on my first flip, leaving me with a final score of $0$. I could also flip heads, get an even number, and then flip tails, leaving me with $2$ points at the end of my experiment.

To get a better feel for the experiment, here is a plot of its probability distribution (how this plot was obtained will be explained later in this post):



The event with the highest probability is 'receiving a score of $0$', as there is a $0.5$ likelihood of receiving tails on the first flip. The event 'receiving a $2$' is slightly more probable than the event 'receiving a $1$', precisely by $\frac{1}{32}$.
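
(Although not part of the original post, the distribution above can be sanity-checked with a short Monte Carlo simulation; the helper function `play` below is my own naming, and the numbers it prints are estimates, not exact values.)

```python
import random

def play(rng):
    # One run of the experiment: keep flipping until tails shows up;
    # each heads is followed by a die roll worth 2 (even) or 1 (odd) points.
    points = 0
    while rng.random() < 0.5:        # heads with probability 1/2
        roll = rng.randint(1, 6)     # fair six-sided die
        points += 2 if roll % 2 == 0 else 1
    return points

rng = random.Random(0)               # fixed seed for reproducibility
trials = 200_000
scores = [play(rng) for _ in range(trials)]

print(scores.count(0) / trials)      # estimate of P(X=0), close to 0.5
print(scores.count(2) / trials - scores.count(1) / trials)  # close to 1/32
```

With $200{,}000$ trials, both estimates land within a fraction of a percent of the exact values discussed above.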

Now, let's actually solve the problem. What is the expected value of this experiment?

We'll start by recalling the definition of 'expected value' :

$\textbf{Definition 1 (Expected Value)}$ : The expected value of an experiment with random variable $X$ is $\displaystyle E(X) = \sum_{x} P(X=x)\,x$, where the sum runs over all values $x$ that $X$ can take.

In our case, our random variable is $X = \text{number of points}$, and therefore $X$ can take on any of the values in $\{0,1,2,...\}$; in other words, the range of $X$ is $\mathbb {N} \cup \{0\}$.

We are currently faced with the task of trying to compute $E(X)$. Initially, I fell into a trap when faced with this problem, as I attempted to find a closed form expression for $P(X=x)$ in order to compute $E(X)$, a method that, while viable, leads to further messiness involving radicals and exponents.

While this was not the method I ultimately used in order to solve the problem, I'll go through it anyway for the sake of completeness.

I first noticed the following recurrence relation: $P(X=k) = \frac{1}{4} (P(X=k-1) + P(X = k-2) )$,

for all $k\geq 2$, $k\in \mathbb {N}$, with initial conditions $P(X = 0) = \frac{1}{2}$ and $P(X=1) = \frac{1}{8}$
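
(A quick check of the recurrence and its initial conditions, not in the original post: iterating it with exact rational arithmetic reproduces the probabilities discussed above.)

```python
from fractions import Fraction

# P(X=k) generated by the recurrence, using exact arithmetic.
p = [Fraction(1, 2), Fraction(1, 8)]              # P(X=0), P(X=1)
for k in range(2, 10):
    p.append(Fraction(1, 4) * (p[k - 1] + p[k - 2]))

print(p[2])          # 5/32
print(p[2] - p[1])   # 1/32, the gap between P(X=2) and P(X=1) noted earlier
```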

I'll briefly justify this recurrence relation. Note that the only way to gain points at any stage of the experiment is to flip heads; flipping tails adds no points and ends the experiment.

Therefore, if we have received $x+2$ points at the end of our experiment, we may have either flipped heads and rolled even after $x$ points or flipped heads and rolled odd after $x+1$ points.

Let's analyze this further. We'll denote sample points as sequences of the symbols $H$ (heads), $T$ (tails), $E$ (even roll), and $O$ (odd roll).

As an example, a sample point in the event $X = 4$ is $HOHOHET$

The $T$ at the end of any of our sequences is just to terminate the sequence, as it doesn't actually add or subtract any points.

Consider the event $X = x+2$. How many sample points does this event contain? Well, each sample point in $X = x+2$ is simply a sample point in $X=x+1$ with an additional $HO$, or a sample point in $X = x$ with an additional $HE$ (inserted before the terminating $T$).

To illustrate this, let's take $X=4$ as an example. I'll list out all of the sample points in $X=4$.

Here they are: $\{HEHET, HOHOHET, HOHEHOT, HEHOHOT, HOHOHOHOT\}$

Now, here are all the sample points for $X=3$

$\{HOHET, HEHOT, HOHOHOT\}$

and all the sample points for $X=2$

$\{HET, HOHOT\}$

First of all, we see that the number of sample points in the event $X=4$ is equal to the sum of the number of sample points in events $X=3$ and $X=2$.

Secondly, we see that as previously mentioned, each sample point in $X=4$ is either a sample point in $X=3$ with an additional $HO$ or a sample point in $X=2$ with an additional $HE$.

For example, take the sample point $HEHOHOT$. This is $HEHOT$ (a sample point of $X=3$) with an additional $HO$.

Now, what does this mean in terms of probability?

Well, an additional $HO$ translates to multiplication by $(\frac{1}{2})(\frac{1}{2}) = \frac{1}{4}$, and an additional $HE$ also translates to multiplication by $(\frac{1}{2})(\frac{1}{2}) = \frac{1}{4}$.

So, $P(X = k) = (\text{probability of } HO)\,P(X = k-1) + (\text{probability of } HE)\,P(X = k-2)$

Since $(\text{probability of } HO) = (\text{probability of } HE) = \frac{1}{4}$, we have

$P(X=k) = \frac{1}{4} (P(X=k-1) + P(X = k-2) )$

Great, so how can we use this recurrence relation to determine a closed form expression for $P(X=k)$ ?

The answer lies in the theorem below:

$\textbf{Theorem 1 (Solution of second order linear homogeneous recurrence relation with distinct roots)}$

A recurrence relation $u_n = au_{n-1} + bu_{n-2}$ for non-zero $a,b$ has closed form solution:

$u_n = c(\lambda_1)^n + d(\lambda_2)^n$.

Where $\lambda_1$ and $\lambda_2$ are distinct roots of the characteristic equation:

 $\lambda^2 - a\lambda - b = 0$.

and $c,d$ are constants determined by two initial conditions, i.e. two known values such as $u_0$ and $u_1$.

$\square$

One can show that Theorem 1 is indeed true by substituting this closed form solution back into the original recurrence relation, although this method of 'proving' Theorem 1 is quite unsatisfying and leaves much to be desired as it does not shed any light on what motivated Theorem 1 in the first place.

Unfortunately, I do not know much about what motivated Theorem 1, perhaps after some research I'll be able to write a post about this in future.

Anyway, with the help of Theorem 1, I was able to work out a closed form expression for $P(X = k)$.

Here it is:

$\displaystyle P(X = k) = (\frac{\sqrt{17} + 1}{4\sqrt{17}})(\frac{1 + \sqrt{17}}{8})^k + (\frac{\sqrt{17} - 1}{4\sqrt{17}})(\frac{1 - \sqrt{17}}{8})^k$

(This is also how I was able to obtain a plot of $P(X=k)$ against $k$)
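
(As a numerical check, not in the original post, the closed form above agrees with the recurrence to floating-point precision:)

```python
import math

s = math.sqrt(17)

def p_closed(k):
    # Closed form from Theorem 1; the roots of λ² - λ/4 - 1/4 = 0 are (1 ± √17)/8.
    return ((s + 1) / (4 * s)) * ((1 + s) / 8) ** k \
         + ((s - 1) / (4 * s)) * ((1 - s) / 8) ** k

# The same values generated by the recurrence, in floating point.
p = [0.5, 0.125]
for k in range(2, 12):
    p.append(0.25 * (p[k - 1] + p[k - 2]))

print(all(abs(p_closed(k) - p[k]) < 1e-12 for k in range(12)))  # True
```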

Using this, it is possible to just plug the expression we obtained for $P(X=k)$ into the $E(X)$ summation and do a bit of algebra relating to series of the form $\displaystyle \sum_{k} kr^k$ to work out the answer. However, this is quite messy, and so I tried to avoid it with an alternative route that gets rid of the need for a closed form expression of $P(X=k)$ altogether.

Here is the alternative method:

$\displaystyle E(X) = \sum \limits_{k=0}^{\infty} (k)P(X=k) = (0)P(X=0) + (1)P(X=1) + \sum \limits_{k=2}^{\infty} (k)P(X=k)$

Now for $k\geq 2$ we know that $P(X=k) = \frac{1}{4} (P(X=k-1) + P(X = k-2) )$

This means that $\displaystyle \sum \limits_{k=2}^{\infty} (k)P(X=k) = \frac{1}{4}(\sum \limits_{k=2}^{\infty} (k)P(X=k-1) + \sum \limits_{k=2}^{\infty} (k)P(X=k-2))$

Lets concentrate on each individual term in this expression:

$\displaystyle \sum \limits_{k=2}^{\infty} (k)P(X=k-1) = \sum \limits_{k=2}^{\infty} (k-1)P(X=k-1) + \sum \limits_{k=2}^{\infty} P(X=k-1)$

We know  that $\displaystyle \sum \limits_{k=2}^{\infty} P(X=k-1) = \sum \limits_{k=1}^{\infty} P(X=k-1) - P(X=0) = \sum \limits_{k=1}^{\infty} P(X=k-1) - \frac{1}{2} = 1 - \frac{1}{2} = \frac{1}{2}$.

Furthermore, $\displaystyle \sum \limits_{k=2}^{\infty} (k)P(X=k-2) = \sum \limits_{k=2}^{\infty} (k-2)P(X=k-2) + \sum \limits_{k=2}^{\infty} (2)P(X=k-2)$

We know that $\displaystyle \sum \limits_{k=2}^{\infty} (2)P(X=k-2) = 2$.

Lastly, we can also see that $\displaystyle \sum \limits_{k=2}^{\infty} (k-1)P(X=k-1) = E(X)$

and that $\displaystyle \sum \limits_{k=2}^{\infty} (k-2)P(X=k-2) = E(X)$

Therefore, $\displaystyle \sum \limits_{k=2}^{\infty} (k)P(X=k) = \frac{1}{4}(E(X) + \frac{1}{2} + E(X) + 2) = (\frac{1}{2})E(X) + \frac{5}{8}$

Now, we have $\displaystyle E(X) = (0)(P(X=0)) + (1)(P(X=1)) +  \sum \limits_{k=2}^{\infty} (k)P(X=k)  = \frac{1}{8} +  \sum \limits_{k=2}^{\infty} (k)P(X=k) = \frac{1}{8} + (\frac{1}{2})E(X) + \frac{5}{8}$

Re-arranging, we have $\displaystyle (\frac{1}{2})E(X) = \frac{3}{4}$, yielding $E(X) = \frac{3}{2} = 1.5$
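
(To double-check the answer numerically, not part of the original derivation, one can truncate the series $\sum_k k\,P(X=k)$; since $P(X=k)$ decays geometrically, a couple of hundred terms are more than enough.)

```python
from fractions import Fraction

p = [Fraction(1, 2), Fraction(1, 8)]              # P(X=0), P(X=1)
for k in range(2, 200):
    p.append(Fraction(1, 4) * (p[k - 1] + p[k - 2]))

partial = sum(k * pk for k, pk in enumerate(p))   # truncated E(X)
print(float(partial))                             # ≈ 1.5
```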

$\square$

The second method still takes advantage of the recurrence relation, but does not go the extra mile in that an actual closed form expression for $P(X=k)$ is not necessary.
Sunday, February 7, 2016

Hyper-cubes

Last year, I wrote a short 'internal investigation' paper on the 'tesseract' (a four-dimensional cube). (There is some material on linear transformations in the first half of the paper - I would skip this, as it's pretty dry.) I thought I'd upload the paper below (in .pdf format):

Hypercube paper

Thursday, February 4, 2016

Introduction to real analysis (1.0) - Preliminaries

(I am aware that I have not been posting anywhere near $\frac{3}{2}$ times per day; hopefully over the next week I will be able to pull my average up somewhere closer to $\frac{3}{2}$.)

This post will contain several definitions/useful concepts that will be frequently used in the series of posts titled 'Introduction to real analysis'.

Definition 1 (Limit of a function) - A real valued function $f$ defined on some domain $D \subset \mathbb {R}$ is said to have a 'limit as $x$ approaches $a$' for some $a \in D$ if the following statement is true:

'There exists a real number $L$ such that the following statement is true:

For all $\epsilon > 0 $ there exists a $\delta > 0 $ such that if $0 < |x - a| < \delta$, then $|f(x) - L| < \epsilon $.'

Intuitively, this means that provided we take points close enough to $a \in D$, $f(x)$ can be made arbitrarily close to $L$.

Definition 2 (Continuity of a function) - A real valued function $f$ defined on some domain $D \subset \mathbb {R}$ is said to be 'continuous at $a \in D$' if the following statement is true:

 For all $\epsilon > 0 $ there exists a $\delta > 0 $ such that if $ |x - a| < \delta$, then $|f(x) - f(a)| < \epsilon $

Definition 3 (Continuity on a set) - A real valued function $f$ with domain $D \subset \mathbb {R}$ is said to be continuous on a set $S \subset D$ if for all $x \in S$, $f$ is continuous at $x$.

Visually, a function that is continuous contains no 'holes' and no 'jumps'. Most of the familiar functions, such as $\sin(x)$, $\cos(x)$, and $a^x$, are continuous on $\mathbb {R}$.

Definition 4 (Norm in $\mathbb {R}$) - The 'distance' or 'absolute value' function is defined as follows:

$|x| = x$ if $x \geq 0$, $|x| = -x$ if $x < 0$.

Visually, $|x|$ is the distance of $x$ from $0$.

Theorem 1 (Triangle Inequality) - For any $a,b \in \mathbb {R}$ the following inequality holds:

$|a+b| \leq |a| + |b|$

A proof of Theorem 1 will not be provided at the moment (perhaps it will appear in a later post). A sketch of one possible proof is given below:

Proof Sketch (Theorem 1) - Consider the cases when $a \geq 0$, $a<0$, $b \geq 0$, $b<0$

Prove Theorem 1 for all combinations of aforementioned cases.

Theorem 2 (Reverse Triangle Inequality) - For any $a,b \in \mathbb {R}$ the following inequality holds:

$| |a| - |b| | \leq |a + b|$.

One can similarly prove Theorem 2.


THIS 'PRELIMINARY' LIST IS NOT CONCRETE, AND MAY BE SUBJECT TO UPDATE.








Introduction to real analysis (1.3) - Integration - The equivalence between Riemann and Darboux Integrability

The purpose of this post will be to prove that the notions of 'Darboux' and 'Riemann' integrability are equivalent.

In this post, we will continue to use the notation $M_i = \sup_{x \in [x_{i-1}, x_i]} f$ and $m_i = \inf_{x \in [x_{i-1}, x_i]} f$, although this notation will not be used throughout the entire post: later on, when we prove that 'Darboux integrability implies Riemann integrability', clearly identifying over which interval we are taking the supremum of $f$ will become necessary.

We start by presenting the theorem we will attempt to prove.

Theorem 1.1 - A bounded function $f$ is Riemann Integrable on $[a,b]$ if and only if $f$ is Darboux Integrable on $[a,b]$.

Proof (Theorem 1.1) - We will start by proving that if $f$ is Riemann Integrable on $[a,b]$, it is Darboux Integrable on $[a,b]$.

If $f$ is Riemann Integrable, then by definition $\displaystyle \lim_{||P|| \to 0} \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) $ exists.

Let $P = \{x_0 = a, x_1, x_2, ... , x_{n-1}, x_n=b\}$ be a partition of $[a,b]$.

Now, consider $\displaystyle \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) - \sum_{i=1}^n (M_i)(x_i - x_{i-1}) = \sum_{i=1}^n (f(\xi_i) - M_i)(x_i - x_{i-1})$

Let us choose $\xi_i$ so that $|f(\xi_i) - M_i| < \frac{\epsilon}{b-a}$ (by Lemma 1.0, this is always possible).

Then we have $ \displaystyle \left|\sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) - \sum_{i=1}^n M_i(x_i - x_{i-1})\right| = \left| \sum_{i=1}^n (f(\xi_i) - M_i)(x_i - x_{i-1})\right| \leq \sum_{i=1}^n |f(\xi_i) - M_i|\, |x_i - x_{i-1}|$

(This follows from the triangle inequality)

But we have chosen $\xi_i$ so that $|f(\xi_i) - M_i| < \frac{\epsilon}{b-a}$ for each $i$.

Therefore, $ \displaystyle \left| \sum_{i=1}^n (f(\xi_i) - M_i)(x_i - x_{i-1})\right| \leq \sum_{i=1}^n |f(\xi_i) - M_i|\, |x_i - x_{i-1}| < \frac{\epsilon}{b-a}\sum_{i=1}^n (x_i - x_{i-1}) = \epsilon $

Now recall that by definition, $\displaystyle \lim_{||P|| \to 0} \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) = \int_{a}^b f $ (as $f$ is Riemann integrable on $[a,b]$)

We will show that $\int_{a}^b f $ is a lower bound of $ U(f,P)$ for any partition $P$ .

In order to do so, we will proceed by contradiction.

Suppose for some $P$, $\int_{a}^b f  - U(f,P) > 0$.

As $\displaystyle \lim_{||P|| \to 0} \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) = \int_{a}^b f $, there exists a $\delta > 0$ such that whenever $||P|| < \delta$, $\displaystyle \left| \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) - \int_{a}^b f \right| < \int_{a}^b f  - U(f,P)$

Consider a refinement of $P$ , $P'$, where $||P'|| < \delta$.

Then for $P'$, we have $| \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) - \int_{a}^b f | < \int_{a}^b f  - U(f,P)$

This implies that $U(f,P) - \int_{a}^b f < \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) - \int_{a}^b f$ , which is true if and only if $ U(f,P) < \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) $

Furthermore, by the definition of $U(f,P')$, we know that $\sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) \leq U(f,P')$

Before we continue this proof, we need to establish a lemma about Darboux upper and lower sums and how they behave upon refinement.

Lemma 1.1 - If $P'$ is a refinement of $P$, then $U(f,P') \leq U(f,P)$ and $ L(f,P') \geq L(f,P)$.

Proof (Lemma 1.1) - Let $P = \{x_0 = a, x_1, x_2, ... , x_{n-1}, x_n=b\}$. Suppose $P'$ contains one more point (the cases where $P'$ contains two, three, or more additional points can be covered inductively - this is left to the reader).

Then $P' = \{x_0 = a, x_1, x_2, ...., x_k, y' , x_{k+1}, ... , x_{n-1}, x_n\}$.

$U(f,P') = U(f,P) + \sup_{x \in [x_k,y']} f \,(y' - x_k) + \sup_{x \in [y', x_{k+1}]} f\,(x_{k+1} - y') - \sup_{x \in [x_k, x_{k+1}]} f\,(x_{k+1} - x_k) $.

We will now show that $\sup_{x \in [x_k,y']} f\,(y' - x_k) + \sup_{x \in [y', x_{k+1}]} f\,(x_{k+1} - y') - \sup_{x \in [x_k, x_{k+1}]} f\,(x_{k+1} - x_k) \leq 0$.

Clearly, $\sup_{x \in [x_k,y']} f \leq \sup_{x \in [x_k, x_{k+1}]} f$ and $\sup_{x \in [y',x_{k+1}]} f \leq \sup_{x \in [x_k, x_{k+1}]} f$, as $[x_k, y'] \subset [x_k, x_{k+1}]$ and $[y', x_{k+1}] \subset [x_k, x_{k+1}]$

Therefore, $\sup_{x \in [x_k,y']} f\,(y' - x_k) + \sup_{x \in [y',x_{k+1}]} f\,(x_{k+1} - y') \leq \left(\sup_{x \in [x_k, x_{k+1}]} f\right)( y' - x_k + x_{k+1} - y') = \sup_{x \in [x_k, x_{k+1}]} f\,(x_{k+1} - x_k)$

Thus $U(f,P') \leq U(f,P)$

$\square$

The proof is essentially the same for Darboux lower sums.
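
(As a numerical illustration of Lemma 1.1, not part of the original proof: the sketch below approximates each supremum by sampling, which happens to be exact for the monotone function used here, and confirms that refining a partition does not increase the upper sum.)

```python
def sup_on(f, lo, hi, n=1000):
    # Approximate sup of f on [lo, hi] by sampling (exact for monotone f).
    return max(f(lo + (hi - lo) * i / n) for i in range(n + 1))

def upper_sum(f, P):
    # U(f, P) for a partition P given as an increasing list of points.
    return sum(sup_on(f, P[i - 1], P[i]) * (P[i] - P[i - 1])
               for i in range(1, len(P)))

f = lambda x: x * x
P  = [0.0, 0.5, 1.0]
P2 = [0.0, 0.25, 0.5, 0.75, 1.0]            # a refinement of P
print(upper_sum(f, P2) <= upper_sum(f, P))  # True
```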

Now, back to our proof, we had deduced that $ U(f,P) < \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) $.

Now, we also know that $U(f,P') \leq U(f,P)$ and so $U(f,P') < \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) $.

But we also know that by definition, we have $\sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) \leq U(f,P')$

This yields a contradiction, and therefore  $\int_{a}^b f $ is a lower bound of $ U(f,P)$ for any partition $P$.

Now, we know that we can also make $| U(f,P) -  \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1})  | < \frac{\epsilon}{2} $ given we choose $\xi_i$ carefully, and by choosing a $P$ with $||P||$ small enough, we can make $| \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) - \int_{a}^b f | < \frac{\epsilon}{2}$.

We know that $\displaystyle |U(f,P) - \int_{a}^b f | = \left|U(f,P) - \int_{a}^b f + \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) - \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1})\right| $

and, by the triangle inequality,

$\displaystyle \left|U(f,P) - \int_{a}^b f + \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) - \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1})\right| \leq \left|U(f,P) - \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1})\right| + \left| \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) - \int_{a}^b f \right| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$

Thus by Lemma 1.0, it is clear that $\inf \{U(f,P) \mid P \text{ is a partition of } [a,b] \} = \int_{a}^b f$

$\square$

We can similarly prove that $\sup \{L(f,P) \mid P \text{ is a partition of } [a,b] \} = \int_{a}^b f$ (this will be left as an exercise to the reader)

So we have successfully shown that ' Riemann Integrability implies Darboux Integrability ' .

We must now proceed to show that the converse is also true, namely, ' Darboux Integrability implies Riemann Integrability '.

In order to do this, we will establish another lemma.

Lemma 1.2 : Let $P = \{x_0 = a, x_1, .... , x_{n-1}, x_n = b\}$ be a partition of $[a,b]$.

Define $|P| = \min \{x_i - x_{i-1} \mid i \in \{1,2,...,n\} \}$; that is, $|P|$ is the minimum sub-interval width of the partition $P$.

Suppose $P_0 = \{y_0 = a, y_1, ... , y_{m-1}, y_m = b\}$ is a partition of $[a,b]$ such that $|| P_0 || < |P|$, that is, the maximum sub-interval width of $P_0$ is less than the minimum sub-interval width of $P$.

Then for all $i \in \{1,2,....,n-1\}$ there exists a unique $k \in \{1,2,...,m-1\}$ such that 

$y_k \leq x_i$ and $x_i < y_{k+1}$. Furthermore there are no points in $P$ that are between $y_k$ and $x_i$ or between $x_i$ and $y_{k+1}$. 

More specifically, there do not exist any $x_j$ such that $x_j \in (y_k, x_i)$ or $x_j \in (x_i, y_{k+1})$.

That is, visually speaking, for every $x_i$ there is one and only one point $y_k$ in $P_0$ such that $y_k$ is immediately to the left of $x_i$, and the next point of $P_0$, namely $y_{k+1}$, is immediately to the right of $x_i$.

Proof (Lemma 1.2) - 

Consider $X_i = \{ k \in \{1,2,...,m-1\} \mid y_k \leq x_i \}$. $X_i$ is nonempty: since $||P_0|| < |P|$, we have $y_1 < x_1 \leq x_i$ for all $i$.

$X_i$ is also a finite subset of $\mathbb {N}$, so it has a maximum element; that is, there exists $k_0 = \max (X_i) \in X_i$.

We claim that $y_{k_0 + 1} > x_i $. This claim can be proved by contradiction. Suppose $y_{k_0 + 1} \leq x_i$. But then $k_0 + 1 \in X_i$, this is a contradiction, as we said $k_0 = max (X_i)$.

Therefore, $y_{k_0 + 1} > x_i$. Furthermore, we can also prove that $x_{i-1}$ is not in $(y_{k_0}, x_i)$ and that $x_{i+1}$ is not in $(x_i, y_{k_0 + 1})$ by contradiction.

Suppose $x_{i-1} \in (y_{k_0}, x_i)$; then $y_{k_0} < x_{i-1} < x_i < y_{k_0+1}$, and so $x_i - x_{i-1} < y_{k_0+1} - y_{k_0}$. But $||P_0|| < |P|$. Therefore this yields a contradiction.

Similarly, we can prove that $x_{i+1}$ is not in $(x_i, y_{k_0 + 1})$.

Now, all that is left is to prove the uniqueness of $y_{k_0}$. This can also be done by contradiction.

Suppose there was another point $y_j$ in $P_0$ where $j \neq k_0$ such that $y_j \leq x_i$ and $y_{j+1} > x_i$.

If $j > k_0$ then $y_j > y_{k_0}$, and $y_j \leq x_i$. But we know that $k_0 = \max (X_i)$. Therefore this yields a contradiction.

If $j < k_0$ then $y_{j+1} \leq y_{k_0} \leq x_i$, so $y_{j+1} \ngtr x_i$; this also yields a contradiction.

So $j = k_0$. But we assumed $j \neq k_0$, yielding another contradiction.

Therefore there does not exist another point $y_j$ in $P_0$ where $j \neq k_0$ such that $y_j \leq x_i$ and $y_{j+1} > x_i$.

$\square$

Without further ado, let us prove that ' Darboux Integrability implies Riemann Integrability '.

 First we notice $L(f,P) \leq \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) \leq U(f,P)$ for all partitions $P = \{x_0 = a, x_1, x_2, ... , x_{n-1}, x_n = b\}$ of $[a,b]$.

Thus it suffices to show that $\lim_{||P|| \to 0} U(f,P) = \lim_{||P|| \to 0} L(f,P) = \int_{a}^b f$ in order to show that $\lim_{||P|| \to 0} \sum_{i=1}^n f(\xi_i)(x_i - x_{i-1}) = \int_{a}^b f$, as this is a consequence of the squeeze theorem.

As $\inf \{U(f,P)\}$ exists, by Lemma 1.0 we may choose a partition $P=\{x_0=a, x_1, ... , x_{n-1}, x_n = b\}$ such that $U(f,P) - \int_{a}^b f < \frac{\epsilon}{2}$.

Now consider a partition $P' = \{y_0 = a, y_1, ... , y_{m-1}, y_m = b\}$ of $[a,b]$ such that $|| P' || < \min \left(| P | , \frac{\frac{\epsilon}{2}}{3nM}\right)$, where $M>0$ is such that $|f(x)| < M$ for all $x \in [a,b]$ ($f$ is bounded), and $n+1$ is the number of points in the partition $P$ described in the previous paragraph.

From Lemma 1.2, we know that since $\displaystyle || P' || < min (| P | , \frac{\frac{\epsilon}{2}}{3nM}) \leq | P |$, the only difference between $P'$ and $P \cup P'$ is that for each unique  $y_{i*}$, such that $y_{i*} \leq x_i$ and $y_{i*+1}> x_i$, the sub-interval $[y_{i*},y_{i*+1}]$ does not belong to $P \cup P'$, rather these sub-intervals are replaced by $[y_{i*}, x_i]$ and $[x_i, y_{i*+1}]$ for each $i \in \{1,2,3,...,n-1\}$. 

Therefore, $U(f,P') = U(f, P \cup P') - (\sum_{i=1}^n (\sup f_{x \in [y_{i*}, x_i]})(x_i - y_{i*}) + \sum_{i=1}^n (\sup f_{x \in [x_i, y_{i*+1}]})(y_{i*+1}-x_i) ) + \sum_{i=1}^n (\sup f_{x \in [y_{i*}, y_{i*+1}]})(y_{i*+1} - y_{i*}) $

Furthermore, $ | U(f,P') - U(f, P \cup P') | = | \sum_{i=1}^n (\sup f_{x \in [y_{i*}, y_{i*+1}]})(y_{i*+1} - y_{i*}) - (\sum_{i=1}^n (\sup f_{x \in [y_{i*}, x_i]})(x_i - y_{i*}) + \sum_{i=1}^n (\sup f_{x \in [x_i, y_{i*+1}]})(y_{i*+1}-x_i) ) |$

Now,


$| \sum_{i=1}^n (\sup f_{x \in [y_{i*}, y_{i*+1}]})(y_{i*+1} - y_{i*}) - (\sum_{i=1}^n (\sup f_{x \in [y_{i*}, x_i]})(x_i - y_{i*}) + \sum_{i=1}^n (\sup f_{x \in [x_i, y_{i*+1}]})(y_{i*+1}-x_i) ) |$ 

is less than or equal to 

$ \sum_{i=1}^n |(\sup f_{x \in [y_{i*}, y_{i*+1}]})|(y_{i*+1} - y_{i*}) + \sum_{i=1}^n |(\sup f_{x \in [y_{i*}, x_i]})|(x_i - y_{i*}) + \sum_{i=1}^n |(\sup f_{x \in [x_i, y_{i*+1}]})|(y_{i*+1}-x_i) $ 

by the triangle inequality

As $|f(x)| < M$ for all $ x \in [a,b]$ we may also deduce that


$ \sum_{i=1}^n |(\sup f_{x \in [y_{i*}, y_{i*+1}]})|(y_{i*+1} - y_{i*}) + \sum_{i=1}^n |(\sup f_{x \in [y_{i*}, x_i]})|(x_i - y_{i*}) + \sum_{i=1}^n |(\sup f_{x \in [x_i, y_{i*+1}]})|(y_{i*+1}-x_i) $ 

is less than 

$ M(\sum_{i=1}^n (y_{i*+1} - y_{i*}) + \sum_{i=1}^n (x_i - y_{i*}) + \sum_{i=1}^n (y_{i*+1}-x_i) )$

Furthermore, as $\displaystyle \sum_{i=1}^n (y_{i*+1} - y_{i*}) + \sum_{i=1}^n (x_i - y_{i*}) + \sum_{i=1}^n (y_{i*+1}-x_i) < 3\sum_{i=1}^n ||P'|| < 3\sum_{i=1}^n  \frac{\frac{\epsilon}{2}}{3nM} = \frac{\frac{\epsilon}{2}}{M}$, it follows that $ \displaystyle M\left(\sum_{i=1}^n (y_{i*+1} - y_{i*}) + \sum_{i=1}^n (x_i - y_{i*}) + \sum_{i=1}^n (y_{i*+1}-x_i)\right) < \frac{\epsilon}{2}$.

So we have shown that $| U(f,P') - U(f, P \cup P') | < \frac{\epsilon}{2}$.

Now, we know that $|U(f,P') - \int_{a}^b f | = |U(f,P') - \int_{a}^b f + U(f,P \cup P') - U(f, P \cup P')|$

Now, by the triangle inequality, $|U(f,P') - \int_{a}^b f + U(f,P \cup P') - U(f, P \cup P')| \leq |U(f,P') - U(f, P \cup P')| + |U(f,P \cup P') - \int_{a}^b f|$.

$P \cup P'$ is a refinement of $P$ , as $ P \subset P \cup P'$.

Therefore, by Lemma 1.1, we know that $|U(f,P \cup P') - \int_{a}^b f| < \frac{\epsilon}{2}$, as we chose $P$ such that $|U(f,P) - \int_{a}^b f| < \frac{\epsilon}{2}$.

So, $|U(f,P') - U(f, P \cup P')| + |U(f,P \cup P') - \int_{a}^b f| <  \frac{\epsilon}{2} +  \frac{\epsilon}{2}$.

Thus, for an arbitrary partition $P'$ such that $ \displaystyle ||P'|| < min (| P | , \frac{\frac{\epsilon}{2}}{3nM})$, we know that 


$|U(f,P') - \int_{a}^b f | \leq |U(f,P') - U(f, P \cup P')| + |U(f,P \cup P') - \int_{a}^b f| <  \frac{\epsilon}{2} +  \frac{\epsilon}{2} = \epsilon$.

Thus, by definition, $\lim_{||P|| \to 0} U(f,P) = \int_{a}^b f$.

The proof that $\lim_{||P|| \to 0} L(f,P) = \int_{a}^b f$ uses exactly the same method.

Therefore, $\lim_{||P|| \to 0} U(f,P) = \lim_{||P|| \to 0} L(f,P) = \int_{a}^b f$

$\square$
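
(A concrete example of this limit, not in the original post: for $f(x) = x$ on $[0,1]$ with the uniform $n$-interval partition, the upper and lower sums are $\frac{n+1}{2n}$ and $\frac{n-1}{2n}$, both tending to $\int_0^1 x \, dx = \frac{1}{2}$ as the mesh $\frac{1}{n}$ shrinks.)

```python
from fractions import Fraction

def sums(n):
    # Exact Darboux sums of f(x) = x on [0, 1] for the uniform partition
    # {0, 1/n, 2/n, ..., 1}; f is increasing, so sup/inf sit at the endpoints.
    U = sum(Fraction(i, n) * Fraction(1, n) for i in range(1, n + 1))
    L = sum(Fraction(i - 1, n) * Fraction(1, n) for i in range(1, n + 1))
    return U, L

for n in (1, 10, 100, 1000):
    U, L = sums(n)
    print(n, float(U), float(L))   # both columns approach 1/2
```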




















An introduction to real analysis (1.2) - Darboux Integrability

This post will outline the definition of 'Darboux integrability'.

Let's get started.

Unlike the Riemann integral, interestingly, we will see that no notion of any limit, formally speaking, will be needed in defining the Darboux integral. Indeed, many people, myself included, find the definition of the Darboux integral simpler than the definition of the Riemann integral, although both notions of integrability (as we will see) are equivalent.

Before we define the Darboux integral, two preliminary concepts must be introduced.

Let's start with one of the properties of the Real Numbers, the so-called 'least upper bound' or 'completeness' property.

Definition 1.2 (Least Upper Bound / Completeness property of the Real numbers) -

For any non-empty subset $A$ of $\mathbb {R}$, if $A$ is bounded above, that is, if there exists a real number $b \in \mathbb {R}$ such that $b \geq x$ for all $ x \in A$, then there exists a real number $c$ such that $c \geq x$ for all $x \in A$ with the additional property that for any $d \in \mathbb {R}$ such that $d \geq x$ for all $x \in A$ , $c \leq d$ . We call $c$ the 'least upper bound' of $A$ or we say $c = \sup A$ or $c$ is the 'supremum' of $A$.

The property stated above is actually a consequence of how we construct the real numbers, in fact,  the goal of both of the most popular approaches to constructing the real numbers (Dedekind cuts, Cauchy sequences) is precisely to be able to equip the set of all real numbers with this very property. The completeness property of the Real numbers is absolutely fundamental to all of calculus and Real analysis, as it enables us to start working with sequences and their limits, which in turn enables us to start looking at the limits of real valued functions.

There are several different ways you can try to visualise what the completeness property is, one of these ways is to think about the completeness property as 'the real number line contains no holes'. The same cannot be said about the rationals. As an example, take the set $ S = \{x \in \mathbb {Q} | \   x^2 \leq 2\}$ . Try as you might, you will never be able to find a number $ c \in \mathbb {Q} $ such that $c$ is the least upper bound of $S$. Working in the real numbers, however, the answer is $\sqrt{2}$.

Now we will also define the 'infimum' or the 'greatest lower bound' of a subset of the real numbers that is bounded below.

Definition 1.3 (Greatest Lower Bound / Infimum) - If a nonempty subset of the real numbers, $A$ is bounded below, that is, there exists a real number $b$ such that for all $x \in A$, $ b \leq x$, then there exists a real number $c$ such that $c \leq x$ for all $x \in A$ which possesses the additional property that for any real number $d$ such that $d \leq x$ for all $x \in A$, $c \geq d$. We call $c$ the 'infimum' of $A$, or the 'greatest lower bound' of $A$. We may also write $ c = \inf A$.

The existence claimed in Definition 1.3 follows from the least upper bound property, and a sketch of the proof is presented below:

Proof (Definition 1.3) (Sketch) : Suppose $A \subset \mathbb {R}$ is bounded below, consider $ S = \{ x \in \mathbb {R} | x \leq y , \forall  y \in A \}$ . Prove that $S$ is bounded above, now apply the least upper bound property. $\sup S$ must exist. Prove that $\sup S$ is equal to the greatest lower bound of $A$ (Proof by contradiction may be useful here)

$\square$

We may have also started with the 'infimum' property and proceeded to prove the 'supremum' property of the Real numbers, that is, the 'infimum' and 'supremum' properties of the Real numbers are logically equivalent (one implies the other).

Two more definitions and we'll be able to define the Darboux integral ( I promise these definitions won't be lengthy)

Definition 1.4 (Boundedness) - A function $f$ is said to be 'bounded' on $[a,b]$ if there exists a real number $M > 0 $ such that for all $x$ in $[a,b]$ , $|f(x)| \leq M$.

Definition 1.5 (Darboux Upper and Lower Sums) - The Darboux upper sum of a function $f$ on $[a,b]$ (provided the function is defined over $[a,b]$) with respect to a partition $P=\{x_0=a,x_1,x_2,...,x_{n-1},x_n=b\}$ of $[a,b]$ is $ U(f,P) = \sum_{i=1}^n \left(\sup_{x \in [x_{i-1}, x_i]} f\right)(x_i - x_{i-1})$

The Darboux lower sum of a function $f$ on $[a,b]$ with respect to a partition $P=\{x_0=a,x_1,x_2,...,x_{n-1},x_n=b\}$ of $[a,b]$ is $ L(f,P) = \sum_{i=1}^n \left(\inf_{x \in [x_{i-1}, x_i]} f\right)(x_i - x_{i-1})$

It may seem a bit strange as to why we use $\inf$ and $\sup$ rather than $\min$ and $\max$. This is because it is possible that $f$ is not continuous, or not continuous everywhere, and therefore $f$ may not itself attain extreme values on some interval.



The diagram above depicts a Darboux upper sum for a function that is continuous on $[a,b]$ (therefore $\sup_{x \in [x_{i-1},x_i]} f = \max_{x \in [x_{i-1},x_i]} f$)

Rather than using the busy $\sup_{x \in [x_{i-1},x_i]} f$ and $\inf_{x \in [x_{i-1},x_i]} f$, we will write $M_i = \sup_{x \in [x_{i-1},x_i]} f$ and $m_i = \inf_{x \in [x_{i-1},x_i]} f$ later in this post.


Another way of thinking about these upper sums is as 'over-approximations' of the area under $f(x)$ on $[a,b]$; it follows that Darboux lower sums can be thought of as 'under-approximations'.

We will often be interchanging between the notation $U(f,P)$ for a Darboux upper sum of $f$ over $[a,b]$ with respect to the partition $P$ and $\sum_{i=1}^n (M_i)(x_i - x_{i-1}) = U(f,P)$. Likewise, we will also frequently use the notation $L(f,P)$ to denote the Darboux lower sum of $f$ over $[a,b]$ with respect to the partition $P$.
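
(A small computational sketch of Definition 1.5, not part of the original post: the suprema and infima are approximated by sampling, which is exact for the monotone example chosen, so this is purely illustrative.)

```python
def sup_on(f, lo, hi, n=1000):
    # Approximate sup of f on [lo, hi] by sampling (exact for monotone f).
    return max(f(lo + (hi - lo) * i / n) for i in range(n + 1))

def inf_on(f, lo, hi, n=1000):
    return min(f(lo + (hi - lo) * i / n) for i in range(n + 1))

def darboux_sums(f, P):
    # U(f, P) and L(f, P) for a partition given as an increasing list of points.
    U = sum(sup_on(f, P[i - 1], P[i]) * (P[i] - P[i - 1]) for i in range(1, len(P)))
    L = sum(inf_on(f, P[i - 1], P[i]) * (P[i] - P[i - 1]) for i in range(1, len(P)))
    return U, L

f = lambda x: x * x
P = [0.0, 0.25, 0.5, 1.0]          # a non-uniform partition of [0, 1]
U, L = darboux_sums(f, P)
print(L <= 1 / 3 <= U)             # True: the sums bracket ∫₀¹ x² dx = 1/3
```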

Without further ado, let us define the Darboux integral.

Definition 1.6 (Darboux Integral) - A bounded function $f$ is said to be Darboux integrable on $[a,b]$ if $\inf\{U(f,P) \mid P \text{ is a partition of } [a,b]\} = \sup\{L(f,P) \mid P \text{ is a partition of } [a,b]\}$. Furthermore, if $f$ is Darboux integrable on $[a,b]$ we write $\inf\{U(f,P) \mid P \text{ is a partition of } [a,b]\} = \sup\{L(f,P) \mid P \text{ is a partition of } [a,b]\} = \int_{a}^b f$

Before we go any further, how can we be so sure that $\inf\{U(f,P) \mid P \text{ is a partition of } [a,b]\}$ and $\sup\{L(f,P) \mid P \text{ is a partition of } [a,b]\}$ exist?

It turns out that since we assumed $f$ is bounded, both do indeed exist; the explicit proof of this fact is presented below:

Proof ( $\inf\{U(f,P)\}$ and $\sup\{L(f,P)\}$ exist) - Recall that $f$ is bounded, therefore for all $x \in [a,b]$ we have $|f(x)| \leq M$. Let $P = \{x_0 = a, x_1, ..., x_{n-1}, x_n = b\}$.

We have $L(f,P) =  \sum_{i=1}^n (m_i)(x_i - x_{i-1}) \leq M\left(\sum_{i=1}^n (x_i - x_{i-1})\right) = M(b-a)$, since $m_i \leq M$ for each $i$.

Therefore $L(f,P)$ is bounded above for any arbitrary partition $P$ of $[a,b]$.

 It thus follows by the least upper bound property of the Real numbers that $\sup\{L(f,P) | P \ is \ a \ partition \ of \ [a,b]\}$ exists. Similarly, we can prove that $\inf\{U(f,P) | P \ is \ a \ partition \ of \ [a,b]\}$ exists.

$\square$

Now, back to the definition of the Darboux integral: although it does not technically involve limits, it does seem very limit-esque. One can imagine the 'area' under $f(x)$ over $[a,b]$ being 'squeezed' by both the lower and upper Darboux sums and thus forced to converge to a certain value.

Before we entertain this concept of an area being 'squeezed' by upper and lower sums any further, let us prove several lemmas regarding the supremum and infimum that will be useful to us in the next blog post (Introduction to real analysis 1.3).

Lemma 1.0 (Property of Supremum/Infimum) - Suppose $c$ is an upper or lower bound of a nonempty set $A \subset \mathbb{R}$. Then $c = \sup A$ (if $c$ is an upper bound) or $c = \inf A$ (if $c$ is a lower bound) if and only if for all $\epsilon > 0$ , there exists $x \in A$ such that $| x - c | < \epsilon$.

Proof (Lemma 1.0) - We will prove one direction of Lemma 1.0 by contradiction. Suppose $c = \sup A$ for some nonempty set $A \subset \mathbb {R}$ that is bounded above, and suppose for contradiction that there exists $\epsilon > 0$ such that for all $x \in A$ we have $|x - c| = c - x \geq \epsilon$. Then $ x \leq c - \epsilon $ for all $x$ in $A$, so $c - \epsilon$ is an upper bound of $A$ that is strictly smaller than $c$. This contradicts $c$ being the least upper bound of $A$. Similarly, we can prove Lemma 1.0 for $c = \inf A$. (The proof is essentially the same).

To prove the other direction, we must show that $c$ is the least upper bound of $A$. First of all, we already know that $c$ is an upper bound of $A$ (as per the hypothesis of Lemma 1.0), therefore $x\leq c$ for all $x \in A$. Now consider any upper bound of $A$, call it $d$. We will show that $c \leq d$. Suppose $c > d$, then $c-d > 0$. As per hypothesis, there exists an $x \in A$ such that $|x - c| < c - d$. Since $c$ is an upper bound, $|x - c| = c - x$, so $c - x < c - d$, which gives $x > d$. But $d$ is an upper bound of $A$. This yields a contradiction, so $c = \sup A$.

$\square$
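As a concrete check of Lemma 1.0 (the set $A$ below is my own example, not from the post), take $A = \{1 - 1/n : n \in \mathbb{N}\}$, whose supremum is $1$. For every $\epsilon > 0$ we can exhibit an element of $A$ within $\epsilon$ of the supremum:

```python
import math

# Numerical illustration (the set A here is my own example, not from the
# post) of Lemma 1.0: for A = {1 - 1/n : n = 1, 2, ...} we have sup A = 1,
# and for every eps > 0 some element of A lies within eps of the supremum.

def witness_within(eps):
    """Return an element x of A = {1 - 1/n} satisfying |x - 1| < eps."""
    n = math.ceil(1.0 / eps) + 1   # ensures 1/n < eps
    return 1.0 - 1.0 / n

for eps in (0.5, 0.01, 1e-6):
    x = witness_within(eps)
    assert abs(x - 1.0) < eps      # the epsilon-characterisation of sup A
    print(eps, x)
```

Note that $1$ itself is never an element of $A$, so the supremum need not be attained; the lemma only demands elements arbitrarily close to it.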

An introduction to real analysis (1.1) - Riemann Integrability

This post will outline the definition of 'Riemann integrability'.

The notion of 'Riemann integrability' is one notion of integrability of a function. There exist several other notions, including Darboux integrability, and Lebesgue integrability.

Let's get started.

Definition 0.9 (Partition) - We call $P$ a 'partition' of $[a,b]$ if $P = \{x_0, x_1, x_2, ... , x_{n-1}, x_n\}$ where $x_i < x_{i+1}$ $\forall i \in \{0,1,2,...,n-1\}$ and $x_0 = a$, $x_n = b$

Visually, if you draw a horizontal line segment, mark its endpoints, and then draw several vertical lines through different places within the segment, you have essentially 'created' a partition of the line segment, where each $x_i$ records the location of one of your vertical lines.

In order to work with the Riemann integral, we will also need a way of taking some kind of limit (interestingly, a limit in the formal sense is not necessary in the definition of the Darboux integral). Therefore we will introduce the notion of a 'refinement' of a partition. Once again, an intuitive analogue will be easy to construct once the term has been defined.

Refinement - Suppose $P$ and $P'$ are partitions of $[a,b]$. We say $P'$ is a refinement of $P$ if $P \subset P'$. That is, $P'$ contains all the points of $P$ and possibly more.

In order to visualise what this looks like, draw some more vertical lines in several different places on your previously partitioned line segment (perhaps in a different colour). What you have created is a refinement of your previously created partition.
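The key property of refinements, which we will lean on later, is that refining a partition can only increase a lower sum and decrease an upper sum. A small sketch (my own example, with $f(x) = x^2$ on $[0,2]$; the endpoint shortcut below is valid only because $f$ is increasing there):

```python
# Sketch (example and names are my own, not from the post): refining a
# partition never decreases the lower sum and never increases the upper
# sum.  Here f(x) = x^2 on [0, 2]; f is increasing there, so the min/max
# on each piece sit at the endpoints.

def lower_upper(f, partition):
    """Return (lower sum, upper sum) of f over the given partition."""
    lower = upper = 0.0
    for x0, x1 in zip(partition, partition[1:]):
        width = x1 - x0
        lower += min(f(x0), f(x1)) * width
        upper += max(f(x0), f(x1)) * width
    return lower, upper

f = lambda x: x * x
P  = [0.0, 1.0, 2.0]               # a partition of [0, 2]
P2 = [0.0, 0.5, 1.0, 1.5, 2.0]     # a refinement: P is a subset of P2

L1, U1 = lower_upper(f, P)
L2, U2 = lower_upper(f, P2)
assert L1 <= L2 <= U2 <= U1        # refining squeezes the sums together
print(L1, L2, U2, U1)
```

The refinement squeezes the pair of sums toward each other, which is precisely why refinements play the role of "taking a limit" in this setting.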

We need one more definition before we define the Riemann integral:

Definition 1.0 (Mesh Size) - The 'mesh size' of a partition $P$ of $[a,b]$ is defined to be
 $|| P || = \max_{i \in \{1,2,...,n\}} |x_i - x_{i-1}|$

It is not hard to see that the 'mesh size' is simply the maximum width of any subinterval within a partition; in other words, it is a measure of how 'fine' or how 'coarse' a partition of $[a,b]$ is.
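As a quick sketch (the partition values below are hypothetical, chosen by me for illustration), the mesh size falls straight out of Definition 1.0:

```python
# Tiny sketch (hypothetical partition of my own choosing): the mesh size
# ||P|| of a partition is the width of its widest subinterval.

def mesh_size(partition):
    """Return ||P|| = max |x_i - x_{i-1}| over consecutive partition points."""
    return max(x1 - x0 for x0, x1 in zip(partition, partition[1:]))

P = [0.0, 0.1, 0.4, 1.0]
print(mesh_size(P))   # the widest piece is [0.4, 1.0], of width 0.6
```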

We are now ready to define the Riemann integral

Definition 1.1 (Riemann Integral) - A bounded function $f$ is said to be Riemann integrable on $[a,b]$ if  $\lim_{||P|| \to 0} \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) $ exists, where $\xi_i$ is any point in $[x_{i-1},x_i]$ and $P = \{x_0, x_1, ..., x_{n-1}, x_n\}$ is a partition of $[a,b]$

If $f$ is Riemann integrable on $[a,b]$, we write $\lim_{||P|| \to 0} \sum_{i=1}^n f(\xi_i)(x_i -x_{i-1}) = \int_{a}^b f $

The Riemann integral is sometimes handled slightly differently, via the concept of a 'tagged partition'. As it has been my experience that introducing this concept does more to confuse the reader than to clarify, it will be left out.

This definition is often supplemented with a picture, and keeping with this tradition, one is presented below (Courtesy : https://mathequality.files.wordpress.com/2012/04/x1-x2-x3-fx1-fx2-fx3.gif)


This is a visual example of one of the sums mentioned in Definition 1.1, where in this case $\xi_i = x_{i-1}$. By definition, if we continue to refine the partition shown in the diagram above, our sum (the sum of the areas of the rectangles) should get 'closer and closer' to a particular number; if this is the case, we call $f$ Riemann integrable over $[a,b]$. It should be noted, however, that this must hold not only for $\xi_i = x_{i-1}$ but for any arbitrary $\xi_i \in [x_{i-1},x_i]$. Different choices of $\xi_i$ produce different sums, but the limits of these sums as we refine our partition must all yield the same number.
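That last point, that every choice of $\xi_i$ must yield the same limit, can be sketched numerically (my own example, not from the post, using $f(x) = x^2$ on $[0,1]$): left-endpoint, right-endpoint, and midpoint tags all give sums near $\int_0^1 x^2 = 1/3$ on a fine uniform partition:

```python
# Sketch (example of my own): Riemann sums of f(x) = x^2 on [0, 1] for
# several choices of the tag xi_i all approach the same limit, 1/3, as
# the mesh size ||P|| -> 0.

def riemann_sum(f, partition, tag):
    """Sum f(xi_i)(x_i - x_{i-1}); tag maps a piece (x0, x1) to xi in [x0, x1]."""
    return sum(f(tag(x0, x1)) * (x1 - x0)
               for x0, x1 in zip(partition, partition[1:]))

f = lambda x: x * x
left  = lambda x0, x1: x0              # xi_i = x_{i-1}, as in the diagram
right = lambda x0, x1: x1              # xi_i = x_i
mid   = lambda x0, x1: (x0 + x1) / 2   # xi_i = midpoint

n = 1000
P = [i / n for i in range(n + 1)]      # uniform partition, ||P|| = 1/n
for tag in (left, right, mid):
    print(round(riemann_sum(f, P, tag), 4))
```

All three sums agree to within the mesh size, as Definition 1.1 demands; a function for which different tags gave different limits would simply not be Riemann integrable.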