Tuesday, May 29, 2018

The Pendulum Integral

This post, while strictly speaking not analysis related, involves calculus as well as a method used to prove that an improper integral diverges, which will be useful when comparing convergent/divergent series (which is what will be the topic of my next post).

The energy derivation of the pendulum equation will be used in this post: Please read the energy derivation of the pendulum equation here (scroll down to ' ''Energy'' derivation of (Eq. 1) ' )

As a small addition to the derivation on wikipedia:

When equating change in kinetic energy with change in gravitational potential energy, it is important to realize that the $\displaystyle v$ in $\displaystyle (\frac{1}{2})(mv^2)$ is $speed$, not velocity. 

This makes sense, because if $\displaystyle v$ were velocity, squaring it would not make any sense 
Now, with this in mind, $\displaystyle v$ is always $positive$. It is the $magnitude$, of velocity at any given time. Now, $\displaystyle v = l(| \frac{d\theta}{dt} |)$

Using $\displaystyle v =  l(| \frac{d\theta}{dt} |) $, we have $\displaystyle | \frac{d\theta}{dt} | = \sqrt{(\frac{2g}{l})(cos(\theta) - cos(\theta_0))}$

For $0\leq \theta \leq \theta_0$, (the first quarter period of the pendulums motion), it is clear that $\frac{d\theta}{dt} < 0 $ (the pendulum starts at $\theta_0$ and $DECREASES$ to $0$)

Therefore for the first quarter period, $\displaystyle \frac{d\theta}{dt} = (-1)(\sqrt{(\frac{2g}{l})(cos(\theta) - cos(\theta_0))}$


There are several assumptions that are being made during the energy derivation of the pendulum equation:

(1) All the mass in the system is concentrated in the pendulum bob, therefore the string connecting the pendulum to the stand is assumed to be mass-less.

(2) The string connecting the pendulum to the stand is assumed to be in-extensible and in-compressible. This is why $|\vec{r(t)}|$ does not change during motion, therefore we are assuming that the pendulum is travelling in a circular arc.

(3) The pendulum is in a completely uniform gravitational field (What this means is that the gravitational force acting on the pendulum is assumed to be constant regardless of the pendulums position during its motion - In reality this is not the case (the pendulum does experience a larger gravitational force at the bottom of its swing and a smaller gravitational force at the very top of its swing, but the difference is so small that we ignore it completely) 

(4) There are zero additional forces within the system


From the energy derivation of the pendulum equation, we know that for the first quarter time period:

$\displaystyle \frac{d\theta}{dt} = (-1)( \sqrt{(\frac{2g}{l})(cos(\theta) - cos(\theta_0))}) $

Since $\displaystyle \frac{d\theta}{dt} =(-1)( \sqrt{(\frac{2g}{l})(cos(\theta) - cos(\theta_0))})$ ,

 $\displaystyle \frac{dt}{d\theta} =(-1) (\sqrt{\frac{l}{2g}})(\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}})$

Therefore, effectively , time is a function of the angle the pendulum creates with the vertical.

Now, by the fundamental theorem of calculus , we know that $\displaystyle \int_{\theta_0}^0 (-1) (\sqrt{\frac{l}{2g}})(\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}}) = \int_{\theta_0}^0 \frac{dt}{d\theta} = t(0) - t(\theta_0)$

Where $\displaystyle t$ is time as a function of $\theta$.

It is thus clear that $t(0) - t(\theta_0)$ is the time taken for the pendulum to swing from $\theta = \theta_0$ to $\theta = 0$ , or $\frac{T}{4}$ where $T$ is the full time period of the pendulum.

Now, we also know that $\displaystyle \int_{\theta_0}^0 (-1) (\sqrt{\frac{l}{2g}})(\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}}) = \displaystyle \int_{0}^{\theta_0}  (\sqrt{\frac{l}{2g}})(\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}})$

This is because we can swap the order of integration by just multiplying the integrand by $-1$.

So, $ \displaystyle \int_{0}^{\theta_0}  (\sqrt{\frac{l}{2g}})(\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}}) = \frac{T}{4} $,

this implies that:

$\displaystyle T = 4(\sqrt{\frac{l}{2g}})\int_{0}^{\theta_0} (\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}})$

Lets consider the integral $\displaystyle \int_{0}^{\theta_0} (\frac{1}{\sqrt{cos(\theta) + 1}})$

First, we will show that $\displaystyle (\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}}) \geq  (\frac{1}{\sqrt{cos(\theta) + 1}})$

To do this we will prove something that is equivalent to this, namely that:

$\displaystyle \sqrt{cos(\theta) - cos(\theta_0)} \leq \sqrt{cos(\theta) + 1}$

Squaring both sides, we have $\displaystyle cos(\theta) - cos(\theta_0) \leq cos(\theta) + 1$

Subtracting both sides by $\displaystyle cos(\theta)$ , we have $\displaystyle -cos(\theta_0) \leq 1$, implying $\displaystyle cos(\theta_0) \geq -1$, which is always true for $\displaystyle cos(x)$.

Therefore, we have proved that $\displaystyle (\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}}) \geq  (\frac{1}{\sqrt{cos(\theta) + 1}})$

Now, lets try and evaluate $\displaystyle \int_{0}^{\theta_0} (\frac{1}{\sqrt{cos(\theta) + 1}})$

We should start by trying to evaluate the indefinite integral $ \displaystyle \int  (\frac{1}{\sqrt{cos(\theta) + 1}})$

This integral can be reduced to $\displaystyle \frac{1}{\sqrt2}\int sec(\frac{\theta}{2})$ by noticing that $\displaystyle cos(\theta) = cos(\frac{\theta}{2} + \frac{\theta}{2}) = cos^2(\frac{\theta}{2}) - sin^2(\frac{\theta}{2}) = 2cos^2(\frac{\theta}{2}) $

There are several different ways to evaluate integrals of the form $\displaystyle \int secx$, we will use one particular method in order to evaluate $\displaystyle \frac{1}{\sqrt2}\int sec(\frac{\theta}{2})$

I evaluated $\displaystyle \frac{1}{\sqrt2}\int sec(\frac{\theta}{2})$ using the fact that $\displaystyle cos(\theta) = \frac{1 - tan^2(\frac{\theta}{2})}{1 + tan^2(\frac{\theta}{2})}$

With this, I found that $\displaystyle \frac{1}{\sqrt2}\int sec(\frac{\theta}{2}) = (\sqrt2) ln(tan(\frac{\pi}{4} + \frac{\theta}{4}))$

Therefore, $\displaystyle \int_{0}^{\theta_0} (\frac{1}{\sqrt{cos(\theta) + 1}}) =  \int_{0}^{\theta_0} \frac{1}{\sqrt2}\int sec(\frac{\theta}{2}) = (\sqrt2) ln(tan(\frac{\pi}{4} + \frac{\theta_0}{4})) $

$\displaystyle ln(x)$ is a strictly increasing function that increases without bound, that is $\displaystyle \lim_{x \to \infty} ln(x) = \infty$

Now, the argument of $\displaystyle ln$ in $\displaystyle ln(tan(\frac{\pi}{4} + \frac{\theta_0}{4})) $ is $\displaystyle tan(\frac{\pi}{4} + \frac{\theta_0}{4})$ , and as $\displaystyle \theta_0 \to \pi$,  $\displaystyle tan(\frac{\pi}{4} + \frac{\theta_0}{4}) \to \infty$.

Therefore, as $\displaystyle \theta_0 \to \pi$, $\displaystyle tan(\frac{\pi}{4} + \frac{\theta_0}{4}) \to \infty$, which means that $\displaystyle ln(tan(\frac{\pi}{4} + \frac{\theta_0}{4})) \to \infty$.

But this means that

$\displaystyle \int_{0}^{\theta_0} (\frac{1}{\sqrt{cos(\theta) + 1}}) =

 \int_{0}^{\theta_0} \frac{1}{\sqrt2}\int sec(\frac{\theta}{2}) =

 (\sqrt2) ln(tan(\frac{\pi}{4} + \frac{\theta_0}{4}))$ $\to \infty$ as $\theta_0 \to \pi$

However, we know that $\displaystyle (\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}}) \geq  (\frac{1}{\sqrt{cos(\theta) + 1}}) $

Which means that $\displaystyle T = \int_{0}^{\theta_0} \displaystyle (\frac{1}{\sqrt{cos(\theta) - cos(\theta_0)}}) \geq \int_{0}^{\theta_0}  (\frac{1}{\sqrt{cos(\theta) + 1}}) $

As we know that $\displaystyle \int_{0}^{\theta_0}  (\frac{1}{\sqrt{cos(\theta) + 1}}) \to \infty$ as $\displaystyle \theta_0 \to \pi$, this suffices to show that $\displaystyle \lim_{\theta_0 \to \pi} T = \infty $

This is a nice result, as it implies that it can take arbitrarily long for a pendulum to complete a full swing given we incline the pendulum close enough to the vertical (under ideal conditions)

Tuesday, April 19, 2016

A number theory problem

Today I came across the following number theory problem:

Prove that if $x^p + y^p \equiv 0 (mod\ p)$, and $p$ is a prime number larger than $2$, then $x^p + y^p \equiv 0 (mod\ p^2)$.

The problem seemed innocuous enough at first glance, but proved to be quite challenging for me, as I could only solve the problem through arguably tedious means.

In any case, I thought I'd present my solution here; also for anyone who has stumbled onto this blog and has another solution, please don't hesitate to provide a sketch or full solution in the comments! 

Solution - 

By Fermat's little theorem, note that $x^p + y^p \equiv x+y \equiv 0 (mod\ p)$

Consider the following identity, where $a,b$ are integers, and $n$ is any natural number:

$\displaystyle a^n - b^n = (a-b)( \sum_{i=0}^{n-1} a^{i}b^{(n-1)-i})$

Lets attempt to use the above identity with $a=x$,$b=-y$ and $n=p$. 

Thus, we have:

$\displaystyle x^p - (-y)^p = x^p + y^p =(x+y)( \sum_{i=0}^{p-1} x^{i}(-y)^{(p-1)-i}) = (x+y)(\sum_{i=0}^{p-1} x^{i}y^{(p-1)-i}(-1)^i)$

Where $x^p - (-y)^p = x^p + y^p$ and $\displaystyle \sum_{i=0}^{p-1} x^{i}(-y)^{(p-1)-i} = \sum_{i=0}^{p-1} x^{i}y^{(p-1)-i}(-1)^i$ because $p>2$, so $p$ is an odd positive integer.

Now we note that since $x \equiv -y (mod\ p)$ by Fermat's little theorem,

$\displaystyle \sum_{i=0}^{p-1} x^{i}y^{(p-1)-i}(-1)^i \equiv \sum_{i=0}^{p-1} (-y)^{i}y^{(p-1)-i}(-1)^i = \sum_{i=0}^{p-1} (y)^{p-1} = py^{p-1} (mod\ p)$

and $py^{p-1} \equiv 0 (mod\ p)$

Thus we have shown that $p$ appears as a factor of $x^p + y^p$ twice, and thus $x^p + y^p \equiv 0 (mod\ p^2)$

Monday, March 14, 2016

Work Done - Part 1

Temporarily breaking away from the recent pattern of strictly analysis related posts, I thought I'd write about something physics related.

The concept of work done is an integral part of classical physics, particularly because of something called the 'work-energy principle'.

In this post, I will formally define work done, show how (in some situations, where the forces and paths we are considering are sufficiently 'nice'), it can be computed.

First lets define work done in the case of a constant force and motion in a straight line.

$\displaystyle Definition \ 1 \ (Work \ done \ (under \ constraints) ) - $ The work done by a constant force $\vec{F}$ acting on an object that is displaced $\vec{s}$ is $\vec{F} \cdot \vec{s}$

Describing the above definition using only English, we may write ' The work done by a constant force acting on an object is equal to the signed product of the magnitude of the objects displacement in the direction of the force multiplied by the magnitude of the force' , where the sign depends only on whether or not the component of the objects displacement along the line of the force vector points in line with the direction of the force or opposite the direction of the force.

Here is a diagram to illustrate this:


Now lets consider a more complicated example. What if an object is moving in some arbitrary force field (you can think of this as a region of space where at every single point within the specified region, a force is acting in some direction), where the force in question may be of variable magnitude. Furthermore, the object need not be moving in a straight line.

Lets draw another picture to illustrate this:

The picture I above, admittedly, is a bit overkill. Although drawing all those arrows in MS paint felt therapeutic, my artistic release didn't end up producing a vector field that appeared very continuous. Nonetheless, the point of the image was to show how far away we can deviate from the case where there is only a constant force acting on an object.

So the natural question now becomes, how do we compute the work done by a force on an object in this more general case?


Lets make a few assumptions in order to answer this question. First of all, suppose are force is $\displaystyle \vec{F} : E \subset \mathbb{R}^k \mapsto \mathbb{R}^k$. Lets also assume our force is continuous on $E$.

Lets say the path were moving in, from $t=a$ to $t=b$ is $\vec{\gamma} : [a,b] \mapsto E $. Furthermore, we'll assume this path is continuously differentiable.

Here's another (poorly drawn) picture to illustrate this:

Now, in the picture above, I've divided the curve $\displaystyle \vec{\gamma}$ into segments.

This division actually corresponds to a partition of time, $\{t_0 = a, t_1, t_2, ... , t_{n-1},t_n=b\}$.

Assuming our force is continuous, and the intervals of time we have created in our partition are small enough, it is reasonable to believe the force $\vec{F}$ acting on the object moving in $\vec{\gamma}$ is approximately constant in magnitude and direction throughout the entire interval $[t_{i-1},t_i]$ for any $i \in \{1,2,....,n\}$.

Therefore, within such an interval of time, we can approximate the total work done by this force on the object moving throughout $[t_{i-1}, t_i]$ as $\displaystyle \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1}))$, where $t_{i*}$ is any point in $[t_{i-1}, t_i]$.

Now, since were not only moving in the curve $\vec{\gamma}$ during a specific interval $[t_{i-1},t_i]$, but throughout the whole of $[a,b]$, we need to account for the work this force is doing on us during all other intervals of time as well. This is why we take a summation:

$\displaystyle \sum_{i=1}^n \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) $ and call this our first approximation of work done on the object moving in $\displaystyle \vec{\gamma}$ from $t=a$ to $t=b$. In the picture above, $\triangle \vec{\gamma}(t_i) = \vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})$ .


In mathematics, while first approximations are always appreciated, one is usually interested in formulating better, more accurate approximations in order to better capture what is really going on.

In the previous illustration, we partitioned the interval of time the object being considered was moving within, into $n$ sub-intervals: $[t_{i-1},t_i]$ from $i=1$ to $i=n$. Within each of these sub-intervals, we assumed that the particle was moving in a straight line from $\displaystyle \vec{\gamma}(t_{i-1})$ to $\displaystyle \vec{\gamma}(t_i)$ and that the force acting on the object during this interval of time was constant in direction and magnitude, and given by $\vec{F}(\vec{\gamma}(t_{i*}))$ where $t_{i*}$ is some point in $[t_{i-1},t_i]$.

This doesn't capture exactly what is going on though, as it could be that even during this very small interval of time, the object is moving like this:

We immediately see that our approximation of the objects path is a gross under-approximation, and therefore, we also see that it doesn't even appear that the force we are interested in is relatively constant throughout the interval of time we have chosen! This is an unhappy state of affairs, and in an effort to rectify this, we will further subdivide our interval of time $[t_{i-1},t_i]$ so as to better capture what is really going on:

Ahh.. much better.


So we have figured out a way to produce better approximations, all we need to do is simply subdivide our interval of time $[a,b]$ into finer and finer sub-intervals, as the finer we partition, the better we capture whats going on within these intervals of time.

It thus makes sense to define work done as $\displaystyle \lim_{||P|| \to 0} \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) $, where $\displaystyle ||P|| = max(t_i - t_{i-1})$ where $i = \{1,2,...,n\}$, granted this limit exists.

All this means is, that work done can be defined as the limit of the sum of approximations to work done over sub-intervals of our entire interval of time, $[a,b]$, as the length of these sub-intervals gets smaller and smaller.

Although we now have a definition of work done, at the present moment, it doesn't seem to be very illuminating, as we have no idea how we can actually compute such a thing!

As it turns out, given we make some reasonable assumptions about $\vec{F}$ and $\vec{\gamma}$, $\displaystyle \lim_{||P|| \to 0} \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) $ isn't very difficult to compute.

The assumptions we will make are, that

(1) $\vec{\gamma} : [a,b] \mapsto E \subset \mathbb{R}^m$ is continuously differentiable on $[a,b]$

(2) $\vec{F} : E \mapsto \mathbb{R}^m$ is continuous on $E$

(Note: The definition of the term 'continuously differentiable' is as follows: If $f$ is continuously differentiable on a set $A$, $f'$ is continuous on $A$)

The best way to think of a continuously differentiable function is as 'smooth'. The function must have no sharp corners, and no holes, furthermore the functions 'velocity', or rate of growth, may not become arbitrarily large in a finite interval of time/space.

This rules out some functions, for example, a particle moving with position
 $ \frac{-1}{t}$ has arbitrarily large speed at some point in any interval $(0,t]$, in fact, it can even have arbitrarily large average speed in an interval of the form $(0,t]$.

Although, as long as we only consider the function on intervals $[x,y]$ with $x>0$, the function is indeed continuously differentiable.


Now, using assumptions (1) and (2), we will prove that $\displaystyle \lim_{||P|| \to 0} \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) = \int_{a}^b \vec{F}(t) \cdot \vec{\gamma'}(t) dt $

I will present a rigorous proof first, and then a tl;dr 'proof' of this.

One thing to note when reading this proof is that when a letter is used in some context, for example, the letter $m$ in $\mathbb{R}^m$, if such a letter is then re-used further down the proof, the reader should immediately assume that the letter being used further down has exactly the same meaning as when it was first used.

Since $\vec{\gamma}'(t)$ is continuous on $[a,b]$, it's component functions must be continuous on $[a,b]$.

We also know that $\vec{\gamma}' : [a,b] \mapsto \mathbb{R}^m$

Therefore, with respect to the standard basis $\{\vec{u_1},\vec{u_2}, ... ,\vec{u_m}\}$, we know that $\displaystyle \vec{\gamma}'(t) = \sum_{k=1}^m {\gamma'_k}(t)\vec{u_k} $

Furthermore, each component function ${\gamma'_k}(t)$ is also uniformly continuous on $[a,b]$ (by continuity on a closed interval of $\mathbb{R}$ implies uniform continuity on the same interval)

Lets now consider the function $\vec{F}(\vec{\gamma}) : [a,b] \mapsto \mathbb{R}^m$. By assumption (2), this is continuous on $[a,b]$.

Therefore $|\vec{F}(\vec{\gamma})| : [a,b] \mapsto \mathbb{R}$ is a real valued function that is continuous on the interval $[a,b]$. As a consequence, by boundedness of real valued functions that are continuous on a closed interval, we know that there exists a real number $M > 0$ such that $|\vec{F}(\vec{\gamma}(t))| < M$ for all $t \in [a,b]$.

Therefore there must exist a $\delta >0$ such that for any partition $P = \{t_0=a,t_1,...,t_{n-1},t_n=b\}$ if $||P|| < \delta$, we can ensure that for any $x,y \in [t_{i-1},t_i]$, $\displaystyle |{\gamma'_k}(x) - {\gamma'_k}(y)| < \frac{\epsilon}{mM(b-a)}$

Lets assume we've chosen a partition $P$ of $[a,b]$ with $||P|| < \delta$ so that $\displaystyle |{\gamma'_k}(x) - {\gamma'_k}(y)| < \frac{\epsilon}{mM(b-a)}$ for all $k \in \{1,2,....,m\}$.

Now, consider $  (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1}))$

We know that for any sub-interval $[t_{i-1},t_i]$ of $P$, $(\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) = \sum_{k=1}^m ({\gamma_k}(t_i) - {\gamma_k}(t_{i-1}))\vec{u_k}$

By the mean-value theorem, for any interval $[t_{i-1},t_i]$, there exists a $p_k \in [t_{i-1},t_i]$ such that ${\gamma_k}(t_i) - {\gamma_k}(t_{i-1}) = {\gamma'_k}(p_k) (t_i - t_{i-1})$

Therefore, $\displaystyle (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) = \sum_{k=1}^m ({\gamma_k}(t_i) - {\gamma_k}(t_{i-1}))\vec{u_k} = \sum_{k=1}^m {\gamma'_k}(p_k) (t_i - t_{i-1}) \vec{u_k}$

Now, as we have chosen $||P|| < \delta$, and each $p_k \in [t_{i-1},t_i]$, we know that

$\displaystyle |\sum_{k=1}^m {\gamma'_k}(p_k) (t_i - t_{i-1}) \vec{u_k} - \sum_{k=1}^m {\gamma'_k}(t_i*) (t_i - t_{i-1}) \vec{u_k}| < \sum_{k=1}^m |{\gamma'_k}(p_k) - {\gamma'_k}(t_i*)| (t_i - t_{i-1}) < \frac{\epsilon}{M(b-a)} (t_i - t_{i-1})$

But $\displaystyle \sum_{k=1}^m {\gamma'_k}(p_k) (t_i - t_{i-1}) \vec{u_k} = (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1}))$ and $\displaystyle \sum_{k=1}^m {\gamma'_k}(t_i*) (t_i - t_{i-1}) \vec{u_k}| = \vec{\gamma'}(t_i*) (t_i - t_{i-1})$

Therefore, for any interval $[t_{i-1},t_i]$, $|(\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) - \vec{\gamma'}(t_i*) (t_i - t_{i-1})| <  \frac{\epsilon}{M(b-a)} (t_i - t_{i-1})$

Now, lets consider $\displaystyle  \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) $

Note that $\displaystyle | \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) - \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot \vec{\gamma'}(t_i*) (t_i - t_{i-1}) | \leq \sum_{i=1}^n  |\vec{F}(\vec{\gamma}(t_{i*}))| |(\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) - \vec{\gamma'}(t_i*) (t_i - t_{i-1})|$

by the Cauchy-Schwarz and Triangle inequalities.

But $\displaystyle \sum_{i=1}^n  |\vec{F}(\vec{\gamma}(t_{i*}))| |(\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) - \vec{\gamma'}(t_i*) (t_i - t_{i-1})|   < \sum_{i=1}^n M |(\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) - \vec{\gamma'}(t_i*) (t_i - t_{i-1})| < \sum_{i=1}^n \frac{\epsilon}{(b-a)} (t_i - t_{i-1}) < \epsilon $

This shows that given $||P||$ is sufficiently small, our 'approximation' for work done:

(3) $\displaystyle  \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) $

can be made arbitrarily close to :

(4) $\sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot \vec{\gamma'}(t_i*) (t_i - t_{i-1})$ 

where (4) is nothing but a Riemann sum for the integral of $\vec{F}(\vec{\gamma}(t)) \cdot \vec{\gamma'}(t)$ over the interval $[a,b]$.

Due to assumptions (1) and (2), namely that $\vec{\gamma} : [a,b] \mapsto E \subset \mathbb{R}^m$ is continuously differentiable on $[a,b]$ and $\vec{F} : E \mapsto \mathbb{R}^m$ is continuous on $E$ , it follows that $\vec{F}(\vec{\gamma}(t)) \cdot \vec{\gamma'}(t)$ is 
continuous on $[a,b]$, and is thus Riemann integrable.

Thus, given $||P||$ is sufficiently small, we can ensure that $\sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot \vec{\gamma'}(t_i*) (t_i - t_{i-1})$ 

is arbitrarily close to $\displaystyle \int_{a}^b \vec{F}(t) \cdot \vec{\gamma'}(t) dt $, and thus that our approximation:

$\displaystyle  \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) $

is also arbitrarily close to $\displaystyle \int_{a}^b \vec{F}(t) \cdot \vec{\gamma'}(t) dt $.

Thus we have proved that $\displaystyle \lim_{||P|| \to 0} \sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1})) = \int_{a}^b \vec{F}(t) \cdot \vec{\gamma'}(t) dt $


Now I will present the 'tl;dr' proof.

Most of this result actually stems from the Mean-Value-Theorem (or a slight variant of the Mean-Value-Theorem).

Essentially, the proof comes down to the fact that when the size of the sub-intervals $[t_{i-1},t_i]$ you have chosen are small, $\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1}) \approx \vec{\gamma}'(t_i*) (t_i - t_{i-1})$, where the $t_i*$ we are talking about is the same $t_i*$ in : $\sum_{i=1}^n  \vec{F}(\vec{\gamma}(t_{i*})) \cdot (\vec{\gamma}(t_i) - \vec{\gamma}(t_{i-1}))$

I will illustrate why this is true when we aren't dealing with curves in $\mathbb{R}^2$ or above. That is, when the object that is moving is moving in a completely straight line, so that we don't have to concern ourselves with any 'component functions' that make things look ugly. So our object can either move 'right' $+ve$ or 'left $-ve$.

If our object is moving smoothly in a line, it's displacement from the origin may look something like this: 

Now, in the figure above the point $p_i* \in (t_{i-1},t_i)$ is such that $\gamma(t_i) - \gamma(t_{i-1}) = \gamma'(p_i*) (t_i - t_{i-1})$.

Since $\gamma'$ is continuous, and the length of the interval $[t_{i-1},t_i]$ is small, we know that $\gamma'(p_i*) \approx \gamma'(t_i*)$ and therefore:

$\displaystyle \sum_{i=1}^n  \vec{F}({\gamma}(t_{i*})) \cdot ({\gamma}(t_i) - \vec{\gamma}(t_{i-1})) = \sum_{i=1}^n  \vec{F}({\gamma}(t_{i*})) \cdot (\gamma(p_i*)) (t_i - t_{i-1}) \approx \sum_{i=1}^n  \vec{F}({\gamma}(t_{i*})) \cdot (\gamma(t_i*)) (t_i - t_{i-1})$ since all sub-intervals are small.

But $\displaystyle \sum_{i=1}^n  \vec{F}({\gamma}(t_{i*})) \cdot (\gamma(t_i*)) (t_i - t_{i-1})$ is just a Riemann sum for $\displaystyle \int_{a}^b \vec{F}(t) \cdot {\gamma'}(t) dt$

The result now follows. This is also the general idea for how to go about proving this result when $\gamma$ is a vector valued function from $[a,b]$ to $\mathbb{R}^m$ when $m \geq 2$.


In the next part of this post, I will introduce the idea of a 'conservative' force, explore the concepts of work done by gravity, gravitational potential, and the so-called 'work - energy principle'.

Thanks for reading

Saturday, March 12, 2016

Introduction to Real Analysis (1.4.3) - Properties of the real numbers and the Monotone Convergence Theorem

In the previous post I defined what real numbers are. In this post I will introduce an important relation on the real numbers, namely $<$, introduce two operations on the real numbers, multiplication $*$ and addition $+$, as well as prove that the real numbers are 'complete'; that is, that the real numbers obey the 'least upper bound' property.

Using these definitions, it is also possible to show that the set of all real numbers is a field. All this means is that the set obeys certain axioms which relate to the two operations defined on the set ($*$ and $+$), although this will be left as an exercise to the reader. Much of this post will be left as an exercise to the reader, as many facts about the real numbers follow from the definitions that will be given in this post. Also, I would like to move on to proving completeness as quickly as possible as it will enable us to start talking about sequences and series of real numbers.

Lets start with defining the 'less than' relation:

$Definition \ 1 - (Less \ Than) - $ For two real numbers $[\alpha], [\beta]$, we say $[\alpha] < [\beta]$ if and only if there exists a positive rational number $\epsilon > 0$ such that for some constructive interval sequence $\{I_n\}=[a_n,b_n]$ in $[\alpha]$ and $\{K_n\}=[c_n,d_n]$ in $[\beta]$, $c_n - a_n > \epsilon$ for all $n>M$ where $M$ is some natural number.

Although  Definition 1 does not require that for any two constructive interval sequences $\{I_n\}$ and $\{K_n\}$ in $[\alpha]$ and $[\beta]$ respectively, there must exist a positive rational number $\epsilon > 0 $ such that for all $n>M$ where $M$ is some natural number, $c_n - a_n > \epsilon$, this is implied by Definition 1 as $[\alpha]$ and $[\beta]$ are both defined to be equivalence classes of constructive interval sequences. The actual proof of this will be left as an exercise to the reader, but essentially it is a consequence of the fact that any two sequences in an equivalence class of nested interval sequences can be made 'arbitrarily close together'.

The definition of 'less than or equal to' will literally be two numbers are 'less than or equal to' each other if and only if they are 'less than' each other xor 'equal to' each other.

We will now prove the law of trichotomy of the real numbers:

$Theorem \ 1 \ (Law \ of \ trichotomy) - $ For any two real numbers , $[\alpha]$ and $[\beta]$, one and only one of the following statements is true:

(i) $[\alpha] < [\beta]$
(ii) $[\beta] < [\alpha]$

(iii) $[\beta] = [\alpha]$

$Proof \ (Theorem \ 1)$ - 

Let $\{I_n\} = [a_n,b_n]$ and $\{K_n\} = [c_n,d_n]$ be two constructive interval sequences in $[\alpha]$ and $[\beta]$ respectively.

Suppose $[\beta] = [\alpha]$ were false. Then there exists some positive rational number $\epsilon$ such that for all natural numbers $M$, there exists an $n>M$ such that $|c_n - a_n| \geq \epsilon$

Now, as $[a_n,b_n]$ and $[c_n,d_n]$ are two constructive interval sequences, we can make $b_n - a_n < \frac{\epsilon}{2}$ and $d_n - c_n < \frac{\epsilon}{2}$ for all $n>M$.

But we know that for some $n>M$, $|c_n - a_n| \geq \epsilon$.

This implies that either $c_n - a_n \geq \epsilon$ or $a_n - c_n \geq \epsilon$.

Suppose $c_n - a_n \geq \epsilon$. However we know that $b_n - a_n < \frac{\epsilon}{2}$.

Therefore, $c_n - b_n > \frac{\epsilon}{2}$.

Now $c_{n+i} \geq c_n$ for all $i$, and $a_{n+i} < b_n$ for all $i$.

This implies that $c_{n+i} - a_{n+i} > c_n - b_n > \frac{\epsilon}{2}$ for all $i$.

Thus $[\alpha] < [\beta]$.

We can similarly prove that if $a_n - c_n \geq \epsilon$, $[\beta] < [\alpha]$.


In the hopes of defining multiplication successfully, I will now present another definition and a lemma about $0$ (and all rational numbers) in the 'real number' sense.

$Definition \ 2- $ The rational number $a$ is equivalent to the equivalence class the constructive interval sequence $[a,b_n]$ belongs to. We denote $a$ in the real number sense, $[a]$.

$Lemma \ 1 - $ If $[\alpha] < [0]$, there exists a constructive interval sequence $I_n =[a_n,b_n]$ in $[\alpha]$ such that $a_n<0$ and $b_n<0$ for all natural $n$.

$Proof \ (Lemma \ 1) - $ By definition, if $[\alpha] < [0]$, we must be able to find a natural number $M$ such that for all $n>M$ , $0 - a_n > \epsilon$ for some positive rational number $\epsilon$, where $[a_n,b_n]$ is in $[\alpha]$.

Therefore, for all $n>M$, $a_n < -\epsilon < 0$.

As $[a_n,b_n]$ is a constructive interval sequence, it is also true that $lim_{n \to \infty} b_n - a_n = 0$, which implies that for $n$ large enough, $b_n < -\epsilon < 0$.

Thus for $n$ large enough, say for $n$ larger or equal to than $M_1$, we can ensure $a_n < 0 $ and $b_n < 0$.

Take another constructive interval sequence $[a^{*}_n,b^{*}_n]$, where $a^{*}_n = a_{M_1 + n}$ and $b^{*}_n = b_{M_1 + n}$.

This is simply a sub-sequence of the aforementioned sequence, and also must belong to $[\alpha]$, as $|a_{M_1 + n} - a_n| < b_n - a_n$ and $\displaystyle lim_{n \to \infty} b_n - a_n = 0$.

Thus Lemma 1 has been proved.


$Lemma \ 2 - $ If $[0] < [\alpha]$, there exists a constructive interval sequence $I_n =[a_n,b_n]$ in $[\alpha]$ such that $0<a_n$ and $0<b_n$ for all natural $n$.

Lemma 2 can be proved similarly.


Now we will define the operations $+$ and $*$ on the real numbers.

$Definition \ 3 \ (Addition) - $ For any two real numbers $[\alpha]$, $[\beta]$, we define $[\alpha] + [\beta]$ as the equivalence class the constructive interval sequence $[a_n + c_n , b_n + d_n]$ belongs to, where $[a_n,b_n]$ is in $[\alpha]$ and $[c_n,d_n]$ is in $[\beta]$.

$Definition \ 4 \ (Multiplication) - $ Let $[\alpha] \geq 0$ and $[\beta] \geq 0$ , then by Lemma 2, there exist constructive interval sequences, $\{I_n\}=[a_n,b_n]$ and $\{K_n\}=[c_n,d_n]$ in $[\alpha]$ and $[\beta]$ respectively such that $a_n, b_n,c_n,d_n \geq 0$ for all natural $n$.

We define $[\alpha]*[\beta]$ as the equivalence class of the constructive interval sequence $[a_nc_n,b_nd_n]$. (It follows from this definition that (at least for $[\alpha] > 0$ and $[\beta] > 0$, at this point) multiplication is commutative.

Now suppose $[\alpha]<0$ and $[\beta]\geq0$, then by Lemma 1 and Lemma 2, we know there must exist constructive interval sequences, $\{I_n\}=[a_n,b_n]$ and $\{K_n\}=[c_n,d_n]$ in $[\alpha]$ and $[\beta]$ respectively such that $a_n, b_n < 0 $ and $c_n, d_n > 0 $ for all natural $n$.

We define $[\alpha]*[\beta]$ in this case as the equivalence class of the constructive interval sequence $[-|a_n|d_n,-|b_n|c_n]$.

For $[\alpha] \leq 0$ and $[\beta] \leq 0$ we know there must exist constructive interval sequences, $\{I_n\}=[a_n,b_n]$ and $\{K_n\}=[c_n,d_n]$ in $[\alpha]$ and $[\beta]$ respectively such that $a_n, b_n ,c_n,d_n \leq 0$for all natural $n$.

 We define $[\alpha]*[\beta]$ as the equivalence class of the constructive interval sequence $[|b_n||d_n|,|c_n||a_n|]$.

Using these definitions it is now possible to prove all of the field axioms of the real numbers: http://mathworld.wolfram.com/FieldAxioms.html


We will now prove that the real numbers are complete.

$Property \ 1 \ (Completeness) - $ Let $S$ be some non-empty subset of $\mathbb{R}$ that is bounded above. There exists a real number $\xi \in \mathbb{R}$ such that $\xi$ is the smallest upper bound of $S$.

$Proof \ (Property \ 1)$ - $S \subset \mathbb{R}$ and $S$ is bounded above. Therefore there must exist some $[\beta] \in \mathbb{R}$ such that for all $[\alpha] \in S$, $[\alpha] \leq [\beta]$.

Now let $\{I_n\} \in [\alpha] \in S$ and $\{K_n\} \in [\beta]$ where $I_n = [a_n,b_n]$ and $K_n = [c_n,d_n]$.

Suppose, for some $N \in \mathbb{N}$, $a_N > d_N$.

We will show that this is not possible.

If $a_N > d_N$, then $a_{N+i} \geq a_N > d_N \geq d_{N+i}> c_{N+i}$

This implies $a_{N+i} - c_{N+i} > a_N - c_N$ for all natural $i$.

Thus by definition, $[\alpha] > [\beta]$.

This is clearly a contradiction. (By the law of trichotomy)

Therefore, our initial supposition was false; and so for all natural $n$, $a_n <\leq d_n$.

Now this also means that the real number $[d_N]$ for some fixed natural $N$ (where $d_n$ is the same $d_n$ as mentioned above, that is, $d_n$ is in general a right endpoint of the constructive interval sequence $\{K_n\} \in [\beta]$ where $[\beta]$ is some upper bound of $S$) is an upper bound of $S$, as if $[\alpha] > [d_N]$, this implies that for all $n>M$, $a_n - d_N > \epsilon > 0$, which contradicts the fact we proved earlier. Thus the fact that $[\alpha] \leq [d_N]$ follows from the law of trichotomy.

Note that $d_N$ is also a rational number, as it is one of the right endpoints of $\{K_n\} \in [\beta]$

We will now go on to prove a useful lemma:

$Lemma \ 1 - $ For all $n \in \mathbb{N}$, there exists a left endpoint $a_n \in \{I_n\} \in [\alpha] \in S$ such that $a_n + \frac{1}{n}$ (which is a rational number) is an upper bound of $S$.

$Proof \ (Lemma \ 1) - $ Suppose there exists a natural number $m$ where for any left endpoints $a_m \in \{I_n\} \in [\alpha] \in S$ (where $[\alpha]$ is any element in $S$, $\{I_n\}$ is any constructive interval sequence in $[\alpha]$), $a_m + \frac{1}{m}$ is NOT an upper bound of $S$.

This means that there must exist some real number $[\alpha_1] \in S$ such that $[a_m + \frac{1}{m}] < [\alpha_1]$.

This means that for a constructive interval sequence $[y_i,t_i]$ in $[\alpha_1]$,

$y_i - (a_m + \frac{1}{m}) > 0$ for all $i \geq N$ where $N$ is some natural number.

Which means that in particular, $y_N - (a_m + \frac{1}{m}) > 0$.

Now $y_N$ is some left end point of $[y_i,t_i] \in [\alpha_1] \in S$, and therefore using the exact same argument as before, $y_N + \frac{1}{m}$ may not be an upper bound of $S$, which means that $a_m + \frac{2}{m}$ is not an upper bound of $S$ either.

Continuing this process, we realize that we will eventually conclude that $a_m + \frac{k}{m}$ can never be an upper bound of $S$ for any natural $k$, we know this must yield a contradiction, as at some point $a_m + \frac{k}{m} \geq d_N$ for some $k$, due to the fact that the natural numbers are unbounded in $\mathbb{Q}$

(This can be proven formally with induction)


Now with the help of Lemma 1, we will begin our construction of the 'least upper bound' or 'supremum' of $S$.

We will do so by first considering an increasing sequence $\{a_n\}$ where for each natural $n$ (we start indexing at n=1), $a_n + \frac{1}{n}$ is an upper bound of $S$. By Lemma 1, these $a_n$ exist as left endpoints of constructive interval sequences within numbers $[\alpha]$ in $S$.

The only thing we need to show is that we can find an INCREASING sequence of such numbers.

Suppose for the sake of contradiction, $a_{n+1} \leq a_n$. Then $a_{n+1} + \frac{1}{n+1} \leq a_n + \frac{1}{n+1}$. By definition, $a_{n+1} + \frac{1}{n+1}$ is an upper bound of $S$ and therefore $a_n + \frac{1}{n+1}$ is also an upper bound of $S$. Thus in order to fix this issue, we can just swap indexes. That is, rather than call $a_{n+1}$, $a_{n+1}$, we simply call it $a_n$ and call $a_n$, $a_{n+1}$. In this way, we have straightened out the issue, as $a_n < a_{n+1}$.

Thus we can create such a sequence.

We will now consider the opposite side, that is, we want to produce a decreasing sequence of upper bounds, $b_n$, such that $b_n - a_n \leq \frac{1}{n}$ for all $n$.

First we need to show this is indeed possible.

Suppose $a_{n+1} + \frac{1}{n+1} \geq a_n + \frac{1}{n}$.

If this is the case, choose some real number $[\beta_{n+1}] < [a_n + \frac{1}{n}]$ such that $[\beta_{n+1}]$ is an upper bound of $S$. If such a real number does not exist, then this implies that $[a_n + \frac{1}{n}]$ is the least upper bound of $S$ and we would be done.

Before we continue, let us prove one more useful lemma:

$Lemma \ 3 \ - $  $[\alpha] < [\beta]$ if and only if for some $\{I_n\} = [a_n,b_n]$ and $\{K_n\} = [c_n,d_n]$ in $[\alpha]$ and $[\beta]$ respectively, $d_n - b_n > \epsilon$ for all $n>M$ where $M$ is some natural number and  $\epsilon>0 \in \mathbb{Q}$

$Proof \ (Lemma \ 3) - $ Let $[\alpha] < [\beta]$. Then for $\{I_n\} = [a_n,b_n]$ and $\{K_n\} = [c_n,d_n]$ there exists a natural number $M$ such that for all $n>M$, $c_n - a_n > \epsilon_1 > 0$ for some $\epsilon_1 \in \mathbb{Q}$.

Now, as $[a_n,b_n]$ and $[c_n,d_n]$ are both constructive interval sequences, there exists a natural number $P$ such that for all $n>P$, $|b_n - a_n| < \frac{\epsilon_2}{2}$ and $|d_n - c_n| < \frac{\epsilon_2}{2}$, where $\epsilon_2 < \epsilon_1$. Therefore, $ -\frac{\epsilon_2}{2} < a_n - b_n < \frac{\epsilon_2}{2}$ and $-\frac{\epsilon_2}{2} < d_n - c_n < \frac{\epsilon_2}{2}$.

Adding together both inequalities, we have $ -\epsilon_2 <d_n - b_n + a_n - c_n < \epsilon_2$, which impliess $ \epsilon_1 - \epsilon_2 < d_n - b_n$ for all $n>P$.

Now the proof of the converse is symmetric.


Continuing on, we will choose some $\{M_i\} \in [\beta_{n+1}]$ where $\{M_i\} = [t_i,b_i]$. Note that the constructive interval sequence $[p_i,a_n + \frac{1}{n}]$ where $\displaystyle \lim_{i \to \infty} (a_n + \frac{1}{n}) - p_i = 0$, certainly belongs to $[a_n + \frac{1}{n}]$. Then by definition, for all $i > N$, $ a_n + \frac{1}{n} - b_i > \epsilon > 0$ for some rational $\epsilon$ (this is a consequence of Lemma 3). In particular, we can find a rational number, which we will denote $b_{n+1}$ such that $a_n + \frac{1}{n} - b_{n+1} > 0$ and $b_{n+1}$ is a rational upper bound of $S$. 

The fact that $b_{n+1}$ is an upper bound of $S$ follows from the discussion above pertaining to the fact that for any $[\alpha] \in S$ and any upper bound $[\beta]$, a left endpoint of a constructive interval sequence in $[\alpha]$ may never be greater than a right end-point of a constructive interval sequence of $[\beta]$.

It thus follows that if the event where $a_{n+1} + \frac{1}{n+1} \geq a_n + \frac{1}{n}$ occurs, we are always able to find a rational upper bound $b_{n+1} < a_n + \frac{1}{n}$, this also implies that $b_{n+1} - a_{n+1} < \frac{1}{n+1}$.

We can thus inductively build a constructive interval sequence $[a_n,b_n]$ where $b_n - a_n < \frac{1}{n}$ for all natural $n$ (we start indexing at $n=1$), where $b_n$ is a rational upper bound of $S$, and $a_n$ is some left end point of some constructive interval sequence of some $[\alpha]$ in $S$.

Clearly, this interval sequence fulfills the criterion for a constructive interval sequence, and thus belongs to an equivalence class.

Let us denote the equivalence class this sequence belongs to $[\xi]$.

We will show that $[\xi]$ is the least upper bound of $S$.

Consider some $[\alpha] \in S$. Take $[c_n,d_n] \in [\alpha] \in S$.

Case 1 - $d_n$ is an upper bound of $S$ for all $n$.

In this case we will show that $[\alpha] = [\xi]$.

Since $[c_n,d_n]$ is a constructive interval sequence , we know that $\displaystyle \lim_{n \to \infty} d_n - c_n = 0$.

Therefore, for $n>M$, we can ensure $d_n - c_n < \frac{1}{N}$ for some natural number $N$.

Now take $[a_n,b_n] \in [\xi]$. We know that $b_n - a_n < \frac{1}{n}$ and $b_n$ is an upper bound of $S$, $a_n$ is some left end point of some constructive interval sequence of some $[\alpha] \in S$.

For $n>M$, we know that $c_n + \frac{1}{N}$ is an upper bound of $S$.

We also know that for $n>N$, $a_n + \frac{1}{N}$ is an upper bound of $S$ by definition.

Take some $n> max(N,M)$.

Suppose $|c_n - a_n| > \frac{1}{N}$.

then $c_n - a_n > \frac{1}{N}$.

This implies that $c_n > a_n + \frac{1}{N}$

But, $[a_n + \frac{1}{N}]$ is an upper bound of $S$, and so the constructive interval sequence $[m_i,a_n + \frac{1}{N}]$ where $\displaystyle \lim_{i \to \infty} (a_P + \frac{1}{N}) - m_i = 0$, certainly belongs to $[a_P + \frac{1}{N}]$.

We discussed earlier that any left end point of any constructive interval sequence of some $[\alpha]$ in $S$ may not exceed a right end-point of a constructive interval sequence of an upper bound of $S$.

Thus we reach a contradiction.

Similarly, we reach a contradiction if $a_n - c_n > \frac{1}{N}$ for some $n> max(N,M)$.

Thus for all $n> max(N,M)$ , $|c_n - a_n| \leq \frac{1}{N}$.

Since $N$ was any natural number, we have shown that $[\xi] = [\alpha]$.

Case 2 - For some natural $n$, $d_n$ is not an upper bound of $S$.

This means that for all $n>M$ for some natural $M$, neither $c_n$ nor $d_n$ are upper bounds of $S$.

Now, as $b_n$ (a right endpoint of $[a_n,b_n] \in [\xi]$) is always an upper bound of $S$ by definition, we have $b_n - d_n > 0$ for all $n>M$.

By lemma 3, an equivalent definition for $[\alpha] < [\beta]$ was that if $\{I_n\} = [a_n,b_n] \in [\alpha]$ and if $\{K_n\} = [c_n,d_n] \in [\beta]$, for all $n>L$ for some natural $L$, $d_n - b_n > \epsilon$ for some positive rational $\epsilon$.

Since we know that in our case, $b_n - d_n > 0$ for all $n>M$, it is not possible for $[\alpha] > [\xi]$.

Thus by the law of trichotomy, $[\alpha] \leq [\xi]$.


Thus we have shown that $[\xi]$ is an upper bound of $S$. All that is left to do now is to show that it is indeed the least upper bound of $S$.

Suppose, for the sake of contradiction, for some upper bound $[\beta]$ of $S$, $[\beta] < [\xi]$.

Let $\{K_n\} = [c_n,d_n] \in [\beta]$. Then for all $n>M$, $a_n - c_n > \epsilon > 0$.

However, as $[c_n,d_n]$ is a constructive interval sequence, for all $n>T$ where $T$ is some natural number, we can ensure that $d_n - c_n < \epsilon$

Thus, for $n> max(T,M)$ , $d_n < a_n$.

But $d_n$ is a right end point of a constructive interval sequence of an upper bound of $S$, and $a_n$ is a left endpoint of some $[\alpha] \in S$. Therefore, $a_n \leq d_n$ for all $n$.

This yields a contradiction.

Therefore $[\beta] < [\xi]$ is false, so by the law of trichotomy, $[\xi] \leq [\beta]$ for any upper bound $[\beta]$ of $S$.

Thus $[\xi]$ is indeed the least upper bound of $S$, and $\mathbb{R}$ under this construction is complete.


For the sake of completeness (no pun intended), I will verify that all the field axioms hold and that the previously defined operations of $+$ and $*$ are well-defined in my next post. (Hopefully by the end of today or tomorrow). We will also stop denoting real numbers as '$[ ]$' and refer to them as normal.

Let's prove the Monotone Convergence Theorem:

$Theorem \ (Monotone \ Convergence) - $ Let $\{a_n\}$ be a sequence of non-strictly increasing real numbers, that is, $a_1 \leq a_2 \leq a_3 ...$.

Furthermore, suppose there exists a real number $b$ such that $a_n \leq b$ for all $n$.

Then $\displaystyle \lim_{n \to \infty} a_n$ exists.

$Proof \ (Monotone \ Convergence \ Theorem) - $ Consider the set non-empty subset of the real numbers, $\{a_n | n \in \mathbb{N} \}$. Clearly, this set is bounded above by $b$.

Therefore, by the least upper bound property of the real numbers, there must exist a number $\xi$ that is the least upper bound of the aforementioned set.

We will show that $lim_{n \to \infty} a_n = \xi$.

Consider $|a_n - \xi| = \xi - a_n$.

For the sake of contradiction, suppose there does not exist a natural number $m$, such that $\xi - a_m < \frac{1}{n}$ where $n$ is some arbitrary natural number.

Then for all $m$, $\xi - a_m \geq \frac{1}{n}$ and $a_m \leq \xi - \frac{1}{n}$ for all $m$.

This is a contradiction, as then $\xi - \frac{1}{n} < \xi$ is a smaller upper bound than $\xi$ of the aforementioned set.

Thus, there must exist a natural number $m$ such that $\xi - a_m < \frac{1}{n}$, which implies, as $a_n$ is non-strictly increasing, that $\xi - a_k < \frac{1}{n}$ for all $ k> m$.

As $n$ was arbitrary, we are finished.


In the next post we will also consider the case where $\{a_n\}$ is a non-strictly decreasing sequence of real numbers.

Thanks for reading.

Tuesday, March 8, 2016

Introduction to Real Analysis (1.4.2) - Sequences, Series and the Real Numbers (Nested Interval Theorem, Equivalence Classes, Definition of the real numbers)

At this point the title of this blog has become something of a running gag (for myself and the one odd viewer that occasionally encounters this website).

As much as I would like to actually maintain an average of $\displaystyle \frac{3}{2}$ posts per day, I haven't been able to because of high school work and other obligations.

In any case, I'll try and make a conscious effort to do so, and if it becomes glaringly obvious that such an average isn't, and never will be feasible to maintain, I'll change the title to something like:

$\displaystyle 0 < average \ posts \ per \ day \leq \frac{3}{2}$


This post will cover several things. Two important theorems will be proved, and then several definitions pertaining to the real numbers will be provided.

First, we will pretend that the real numbers already exist, and possess all the qualities we expect the real numbers to be endowed with. In other words, through whatever means we may have used in order to construct the set of real numbers, the set of all real numbers is a complete ordered field.

All this means is that non-empty subsets of the real numbers satisfy the 'least upper bound' or 'completeness' property, that all the laws of inequality hold, and that the real numbers are equipped with addition and multiplication and all the good stuff.

The 'completeness property' is formally stated below:

$\displaystyle (Completeness \ Property) - $ Let $ S \subset \mathbb{R}$ be non-empty. Furthermore, suppose there exists some number $b \in \mathbb{R}$ such that $ a \leq b$ for all $a \in S$. Then there must exist an element $\xi \in \mathbb{R}$ such that $\xi$ is a least upper bound of $S$, meaning that $\xi \geq a$ for all $a \in S$ and for any other upper bounds of $S$ (numbers $b \in \mathbb{R}$ such that $b \geq a$ for all $a \in S$), $\xi \leq b$.


In my previous post, I ended on:

 ''Imagine the 'Real Number' line was in front of you. Now mark a point on the line, and another point to the right of this point with a left and right closed bracket respectively. Continue to do this, but make sure that all new left closed brackets you mark are to the right of previously marked left closed brackets, and that all right closed brackets you mark are to the left of previously marked right closed brackets, and that there are no left closed brackets to the right of right closed brackets. What you should begin to notice after a while, is that both closed brackets begin to 'close in' to something.''


Lets take this and turn it into a theorem, as promised earlier.

$\displaystyle (Nested \ Interval \ Theorem)$ - Let $I_n = [a_n,b_n] $ (where $ a_n < b_n $ and $ a_n,b_n \in \mathbb{R}$) denote a closed interval. Suppose $I_{n+1} \subset I_n$ for all $n \in \mathbb{N}$. 

Then we may consider the sequence of such intervals $\{I_n\}_{n=0}^{\infty}$

If all of the criterion above are satisfied, the intersection of all of these intervals is non-empty. 

That is,

 $\bigcap_{i=0}^{\infty} I_n \neq \emptyset$


Lets translate what this theorem is saying by comparing it to the description in my previous post. 

Continuing to mark brackets so that left closed brackets are to the right of previously marked left closed brackets and right closed brackets are the the left of previously marked right closed brackets along with the fact that a left closed bracket may never be to the right of a right closed bracket does nothing but ensure that $I_{n+1} \subset I_n$ for all $n \in \mathbb{N}$ . 

Furthermore what these closed brackets (which is just the sequence of nested intervals $\{I_n\}_{n=0}^{\infty}$, mentioned in the aforementioned theorem) are 'closing in' on is simply an element in $\bigcap_{i=0}^{\infty} I_n \neq \emptyset$, which as the theorem asserts, is non-empty.

As it turns out, we can prove the Nested Interval Theorem with nothing but the Completeness property of the real numbers. 


$\displaystyle Proof \ (Nested \ Interval \ Theorem)$ - We will denote any sequence $\{k_n\}_{i=0}^{\infty}$ as $\{k_n\}$ from now on. 

Consider the sequence $\{I_n\}$ of nested intervals. As $I_{n+1} \subset I_n$ it is clear that the left endpoints of $\{I_n\}$ form a non-strictly increasing sequence, we will call this sequence $\{a_n\}$, and the right endpoints of $\{I_n\}$ form a non-strictly decreasing sequence, we will call this sequence $\{b_n\}$.

That is, $a_0 \leq a_1 \leq ...$ and $b_0 \geq b_1 \geq b_2 \geq ... $

We will show that $a_i \leq b_j$ for any natural numbers $i$ and $j$.

Suppose $a_i > b_j $, if $i>j$ we know that $b_i>a_i>b_j$ this implies $b_i > b_j$ , yielding a contradiction, as $\{b_n\}$ is non-strictly decreasing.

If $i<j$ then $a_j < b_j < a_i $, again yielding a contradiction, as $\{a_n\}$ is non-strictly increasing.

Therefore $i = j$. But if $i = j$ then $a_i < b_i$ as $I_i$ is a non-empty closed interval, this yields a final contradiction.

Thus $a_i \leq b_j$ for any natural $i$ and $j$.

This means that the set of all terms of the sequence $\{a_n\}$ is bounded above, which implies (by the completeness/least upper bound property) that there exists a least upper bound of $\{a_n\}$, we will call this least upper bound $\xi$.

So we know that $\xi \geq a_n$ for all natural $n$.

We will now show that $\xi \leq b_n$ for all natural $n$.

Suppose $\xi > b_j$ for some natural number $j$.

But $a_n$ is less than $b_j$ for all natural $n$, as we proved that $a_i \leq b_j$ for any natural numbers $i$ and $j$.

Therefore $b_j$ is an upper bound of $\{a_n\}$.

Then $\xi$ is not the least upper bound of $\{a_n\}$. This yields a contradiction.

Therefore $\xi \leq b_n$ for all natural $n$.

This implies that $\xi \in [a_n,b_n]$ for all natural $n$, and so $\bigcap_{i=0}^{\infty} I_n \neq \emptyset$


So we have proved that $\bigcap_{i=0}^{\infty} I_n \neq \emptyset$ by showing that the least upper bound of the sequence of left endpoints of our intervals is contained within the intersection of all nested intervals. We also know that this least upper bound must exist as per the completeness property of the real numbers.

This motivated me to pose the following question 'We know that working with numbers that possess the completeness property ensures the Nested Interval Theorem is true. Is it then possible to start with the Nested Interval Theorem and prove the completeness property?'

Read as stated above, the question is crude and isn't coherent. What I was precisely wondering was, since we can 'narrow down' on real numbers using the Nested Interval Theorem, can we just 'define' real numbers as collections of Nested Intervals that get smaller and smaller and thus seemingly 'narrow down' on the same \ fixed point?

The word 'collection' in the paragraph above is also crude, which is why we will now introduce the notion of 'equivalence classes' in order to make this precise.

Before we introduce equivalence classes, we will define what is called an 'equivalence relation' on a non-empty set $S$.

$Definition \ (Equivalence \ Relation ) - $ A relation $R$ on a set $S$ is defined as any non-empty subset of the set $ SXS = \{ (a,b) | a \in S , b \in S \} $

If $(a,b) \in R$ we write $aRb$. That is $aRb$ if and only if $(a,b) \in R$.

An equivalence relation on $S$ is a relation on $S$ that satisfies the following properties:

(Reflexive) $aRa$ for all $a \in S$

(Symmetric) If $aRb$ then $bRa$ for any $a , b \in S$

(Transitive) If $aRb$ and $bRc$ then $aRc$ for any $a, b, c \in S$


Now lets get back to defining equivalence classes:

$Definition \ (Equivalence \ Class) - $ An equivalence class for an equivalence relation $R$ on a non-empty set $S$ is the set $[a] = \{ x \in S | xRa \} $

We will also present an important theorem about equivalence relations and equivalence classes:

$Theorem (Equivalence \ relation \ creates \ equivalence \ class \ partition ) - $ An equivalence relation $R$ on the non-empty set $S$ partitions $S$ into disjoint equivalence classes

The proof is also presented below:

$Proof \  ( aforementioned \ theorem ) - $ Consider the equivalence classes $[a]$ and $[b]$ . Clearly these classes are non-empty, as $aRa$ and $bRb$.

Now, suppose $x \in [a]$ and $x \in [b]$. Then $xRa$ and $xRb$. This implies that $aRx$ (by symmetry), which implies that $aRb$ (by transitivity).

Therefore, for any $ y \in [a]$, that is, for any $yRa$, $yRb$ and $y \in [b]$. Thus given $[a] \cap [b] \neq \emptyset$ , $[a] \subset [b]$.

We can prove (symmetrically) that $[b] \subset [a]$. Therefore, if $[a] \cap [b] \neq \emptyset$, it follows that $[a] = [b]$.

The theorem thus follows.


Finally, we will end this post by presenting a definition of 'real number'.


Before we begin to define real numbers, a distinction between the 'limit' we are used to and the 'limit' we will be talking about in this post and the posts to come will need to be made.

In the context of this construction of the real numbers, $lim_{n \to \infty} a_n = L$ where $L \in \mathbb{Q}$ means that for any positive $\epsilon \in \mathbb{Q}$, there exists a $M \in \mathbb{N}$ such that if $n>M$, $|a_n - L| < \epsilon$. We also assume that $a_n \in \mathbb{Q}$ for all $n$, as we have not constructed the set of all real numbers yet.

From now on, we will pretend the real numbers do not exist. Any mention of 'numbers' will immediately be assumed to be rational.

Now we will move on to a definition:

$Definition \ ( Constructive \ Interval \ Sequence ) - $ We call the sequence of closed intervals with rational endpoints, $I_n = [a_n,b_n]$ for $n \in \mathbb{N}$ a constructive interval sequence if it satisfies the following conditions:

(1) $I_{n+1} \subset I_n$ for all $n \in \mathbb{N}$

(2) $\displaystyle lim_{n \to \infty} (b_n - a_n) = 0 $


We will also define a relation on the set of all constructive interval sequences. Instead of denoting this relation $R$, we will denote it $=$.

$Definition \ (Equality \ of \ Constructive \ Interval \ Sequences ) - $ Two constructive interval sequences, 
$\{I_n\}$ and $\{K_n\}$, where $I_n = [a_n,b_n]$ and $K_n = [c_n, d_n]$ are said to be 'equal' that is $\{I_n\} = \{K_n\}$ if and only if :

$\displaystyle lim_{n \to \infty} (c_n - a_n) = 0$.


We will show that this is indeed an equivalence relation.

First we will show that this relation is reflexive. Clearly, $a_n - a_n = 0$. Thus this follows.

Now, Suppose $\{I_n\} = \{K_n\}$. Then for all rational $\epsilon > 0 $ there exists a natural number $M$ such that if $n>M$, $|c_n - a_n| < \epsilon$

As $|a_n - c_n| = |c_n - a_n|$ this implies $|c_n - a_n| < \epsilon$.

Thus $\displaystyle lim_{n \to \infty} (a_n - c_n) = 0$, which implies symmetry.

Now, suppose $\{I_n\} = \{K_n\}$ and $\{K_n\} = \{P_n\}$ where $\{P_n\} = [e_n,f_n]$ is another constructive interval sequence.

Then choose $M$ such that $|c_n - a_n| < \frac{\epsilon}{2}$ and $|e_n - c_n| < \frac{\epsilon}{2}$

We know that $|e_n - c_n| = |e_n - c_n + a_n - a_n| \leq |c_n - a_n| + |e_n - c_n| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$

It follows that $\{I_n\} = \{P_n\}$ which implies transitivity.

Thus '$=$' is an equivalence relation on the set of all constructive interval sequences.

This allows us to take advantage of the fact that an equivalence relation on a set creates a partition of equivalence classes. Indeed it is these equivalence classes that we will define as 'real numbers'.


$Definition \ (Real \ Number) - $ A 'real number' $\alpha$ is an equivalence class of constructive interval sequences.


I will follow up on this definition in the next post, where the properties of the real numbers will be explored, and operations on the real numbers will be defined.