**1. The Monge Problem**

In 1781, Monge was interested in finding the most economical way of moving a pile of soil to a mound. Given points of masses $m_1, \dots, m_n$ at locations $x_1, \dots, x_n \in \mathbb{R}^3$ that needed to be moved to locations $y_1, \dots, y_n$, Monge was interested in finding a bijective map $T$ that minimized the weighted distance cost

$$\sum_{i=1}^n m_i \, |x_i - T(x_i)|. \qquad (1)$$
This minimizing map is known as the optimal transport map. In his memoir, he used geometric arguments to deduce that if an optimal map does exist then it must be determined by a potential $u$. The precise contribution given by Monge was that if the optimal map $T$ exists, then there is a potential $u$ with $|\nabla u| = 1$ such that

$$\frac{T(x) - x}{|T(x) - x|} = -\nabla u(x). \qquad (2)$$
A continuous, general version of the Monge problem can be restated as follows. Let $X$ and $Y$ be measure spaces with probability measures $\mu$ and $\nu$ respectively – these represent the pile of soil we wish to move and the mound we wish to create. We define the cost function as a measurable function $c : X \times Y \to [0, \infty]$. Note that in Monge's original consideration, this was simply the weighted distance in (1). A *transport map* is a measurable map $T : X \to Y$ such that $\nu$ is the push forward of $\mu$ by $T$. This means that for any measurable set $B \subseteq Y$, we should have that

$$\nu(B) = \mu(T^{-1}(B)), \qquad (3)$$
and we write $T_{\#}\mu = \nu$. With the above defined, the Monge optimisation problem is to minimise

$$\int_X c(x, T(x)) \, d\mu(x) \qquad (4)$$

over all measurable maps $T$ that satisfy $T_{\#}\mu = \nu$. The solution to the Monge problem (4) is called the optimal transport map and the cost associated with it is known as the optimal transportation cost. We will denote the optimal transportation cost between probability measures $\mu$ and $\nu$ by $\mathcal{T}_c(\mu, \nu)$; i.e.

$$\mathcal{T}_c(\mu, \nu) = \inf_{T_{\#}\mu = \nu} \int_X c(x, T(x)) \, d\mu(x). \qquad (5)$$
**2. The Kantorovich Problem**

In 1942, Kantorovich introduced the following relaxed version of the Monge optimisation problem. We again consider two measure spaces $X$ and $Y$ with probability measures $\mu$ and $\nu$ respectively. We define an admissible transport plan as a joint probability measure $\pi$ on $X \times Y$ that has marginals $\mu$ and $\nu$. This means that for all measurable sets $A \subseteq X$ and $B \subseteq Y$ we have

$$\pi(A \times Y) = \mu(A), \qquad \pi(X \times B) = \nu(B). \qquad (6)$$
The set of all admissible transport plans will be denoted $\Pi(\mu, \nu)$. Kantorovich's minimisation problem is to minimise

$$\int_{X \times Y} c(x, y) \, d\pi(x, y) \qquad (7)$$
over all $\pi \in \Pi(\mu, \nu)$. He showed that a solution to his problem exists and that it is indeed determined by a potential, as was argued by Monge. In 1948, Kantorovich related his problem to the Monge optimisation problem and showed that his earlier results could be applied. In fact, the Kantorovich problem is just a relaxed version of the Monge problem. The Monge problem is more stringent in that it does not allow a piece of mass at a point $x \in X$ to be split up and sent to multiple different locations in $Y$. We can write the transport plans for the Kantorovich problem in terms of the transport maps of the Monge problem as

$$d\pi(x, y) = d\mu(x) \, \delta_{T(x)}(y), \qquad (8)$$

where $\delta_a$ is the Dirac measure, defined to be $\delta_a(B) = 1$ if $a \in B$ and $0$ otherwise.
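To make the relaxation concrete, here is a small numerical sketch (my own illustration – the points and weights are made up). For discrete measures, the Kantorovich problem is just a linear program over the entries of the plan $\pi$, which we can hand to `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

# Discrete measures: mu puts mass on points x, nu on points y (made-up data).
x = np.array([0.0, 1.0, 2.0]); mu = np.array([0.3, 0.3, 0.4])
y = np.array([0.5, 2.5]);      nu = np.array([0.6, 0.4])

# Cost c(x_i, y_j) = |x_i - y_j|, Monge's original distance cost.
C = np.abs(x[:, None] - y[None, :])
n, m = C.shape

# Marginal constraints: row sums of the plan equal mu, column sums equal nu.
A_eq = []
for i in range(n):
    row = np.zeros((n, m)); row[i, :] = 1.0; A_eq.append(row.ravel())
for j in range(m):
    col = np.zeros((n, m)); col[:, j] = 1.0; A_eq.append(col.ravel())

res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=np.concatenate([mu, nu]),
              bounds=(0, None))
plan = res.x.reshape(n, m)       # the optimal transport plan pi
optimal_cost = res.fun           # the optimal transportation cost
```

Note that the solver is free to split the mass at a point across several targets – exactly the freedom Kantorovich added and Monge forbade.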

That brings us to the end of this introduction, stay tuned for all the details in the next few weeks!

Since then I moved to Switzerland to begin a Master's in Mathematics at the ETH.

This year has been rather eventful and stressful, and I have learnt a lot about life and what it takes to succeed. The most important lesson was having set times where I do nothing work related. I find that if I have these times, the moments I am working are exceptionally productive.

The purpose of this post, apart from shedding some hard earned wisdom, is to announce that I am back! It's been almost a year since I posted Analysis Tricks Part 1. To be honest, I have no idea what I had in mind for Part 2 while I was studying energy estimates of the wave equation in Hangzhou, China – it was so long ago! So I'm going to start fresh this year and write about some neat Functional Analysis. The next post will be in two weeks' time!

$$\sum_{|\alpha| \le m} |x^{\alpha}|^2 \asymp (1 + |x|^2)^m \qquad (1)$$

is obvious. To me, it was not obvious at all, and may not be so obvious to you either. But what my supervisor was referring to was a trick that saves a lot of time when it comes to analysis, and so today, I will be talking about it.

I will first prove (1) before I show you the trick. We first need to understand the notation in (1):

$\alpha = (\alpha_1, \dots, \alpha_n)$ is called a *multi-index* and it is used to raise vectors to powers. For some $x \in \mathbb{R}^n$ we define

$$x^{\alpha} = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}.$$
The degree of $\alpha$ is

$$|\alpha| = \alpha_1 + \alpha_2 + \cdots + \alpha_n.$$
With this in hand, we can re-write (1) as

$$\sum_{|\alpha| \le m} x^{2\alpha} \asymp (1 + |x|^2)^m. \qquad (2)$$

Now, for the equivalence symbol: $A \asymp B$ means we can find two constants $c_1$ and $c_2$ such that

$$c_1 B \le A \le c_2 B,$$

and we say that $A$ is equivalent to $B$.

We shall first prove that there exists a constant $c_1$ such that

$$c_1 (1 + |x|^2)^m \le \sum_{|\alpha| \le m} x^{2\alpha}.$$
To do this, we take a small subset of the multi-indices that satisfy $|\alpha| \le m$: the index which has zero in every entry, and also the indices for which all of the weight sits in a single entry,

$$\alpha = (0, \dots, 0, m, 0, \dots, 0). \qquad (3)$$

We call the set of these indices $I$ and we note that

$$(1 + |x|^2)^m = \Big(1 + \sum_{j=1}^n x_j^2\Big)^m \le (n+1)^{m-1} \Big(1 + \sum_{j=1}^n x_j^{2m}\Big) = (n+1)^{m-1} \sum_{\alpha \in I} x^{2\alpha} \le (n+1)^{m-1} \sum_{|\alpha| \le m} x^{2\alpha}. \qquad (4)$$

So taking $c_1 = (n+1)^{-(m-1)}$, we have proved the first direction. Now, we wish to prove that there exists a constant $c_2$ such that

$$\sum_{|\alpha| \le m} x^{2\alpha} \le c_2 (1 + |x|^2)^m.$$
Prepare yourself for trick number 1 that will make your life so much easier: because the left hand side is just a finite sum, we only need to control each term $x^{2\alpha}$ by $(1 + |x|^2)^m$. Then we can take $c_2 = N$, where $N$ is the number of elements in the set $\{\alpha : |\alpha| \le m\}$, and the result will follow.

We will assume, without loss of generality, that $|x_1| \le |x_2| \le \cdots \le |x_n|$ is an increasing sequence. We begin by bounding each factor by the largest coordinate:

$$x^{2\alpha} = \prod_{j=1}^n x_j^{2\alpha_j} \le |x_n|^{2\alpha_1} \cdots |x_n|^{2\alpha_n} = |x_n|^{2|\alpha|}. \qquad (5)$$

We have two cases. If $|x_n| \le 1$, then we have that

$$x^{2\alpha} \le |x_n|^{2|\alpha|} \le 1 \le (1 + |x|^2)^m, \qquad (6)$$

and the result follows for this case.

However, if $|x_n| > 1$, then knowing that $|\alpha| \le m$ we can write

$$x^{2\alpha} \le |x_n|^{2|\alpha|} \le |x_n|^{2m} \le (1 + |x|^2)^m, \qquad (7)$$

and the result follows.

So we have proved that

$$\sum_{|\alpha| \le m} x^{2\alpha} \asymp (1 + |x|^2)^m, \qquad (8)$$

but there is an intuitive way to look at this problem.

Firstly, consider the finite sum, for $t \ge 0$,

$$\sum_{k=0}^m t^k. \qquad (9)$$

If $t \le 1$, we can control the sum by $m + 1$, but if $t \ge 1$ we can control it by $(m+1)t^m$, and so in general,

$$\sum_{k=0}^m t^k \le (m+1)(1 + t^m). \qquad (10)$$

What we did was look at the extreme cases and see how all the cases in between could be controlled, provided we controlled the extreme ones. This is the trick! Now let's apply it to (2), which is

$$\sum_{|\alpha| \le m} x^{2\alpha} \asymp (1 + |x|^2)^m.$$
Again, we note that we only need to control the term being summed, and not the entire sum, as we saw from the proof. We now look at the extreme cases. If $\alpha$ is the index with zero in every entry, then

$$x^{2\alpha} = 1 \le (1 + |x|^2)^m,$$

and if $|\alpha| = m$ we can control these terms by

$$x^{2\alpha} \le |x|^{2m} \le (1 + |x|^2)^m,$$

since $x_j^2 \le |x|^2$ for all $j$. Hence, by an intuitive path, we have arrived at (2).

We note that in this trick, we only care about the order of these expressions, we look at the extreme cases of the order and note that we can control everything in between the extreme cases.
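As a sanity check on the equivalence, here is a quick numerical experiment of my own (the choice $m = 3$, $n = 2$ and the sampling range are arbitrary). For random points $x$, the ratio of the two sides stays between the explicit constants from the proof, $(n+1)^{-(m-1)}$ below and the number of multi-indices above:

```python
import itertools
import numpy as np

def weight_sum(x, m):
    """Sum of x^(2*alpha) over all multi-indices alpha with |alpha| <= m."""
    n = len(x)
    total = 0.0
    for alpha in itertools.product(range(m + 1), repeat=n):
        if sum(alpha) <= m:
            total += float(np.prod(x ** (2 * np.array(alpha))))
    return total

m, n = 3, 2
rng = np.random.default_rng(0)
ratios = []
for _ in range(200):
    x = rng.uniform(-10.0, 10.0, size=n)
    ratios.append(weight_sum(x, m) / (1.0 + x @ x) ** m)
ratios = np.array(ratios)

# Explicit constants from the proof: c1 = (n+1)^-(m-1), c2 = #{alpha : |alpha| <= m}.
c1 = (n + 1) ** (-(m - 1))
c2 = sum(1 for a in itertools.product(range(m + 1), repeat=n) if sum(a) <= m)
```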

Stay tuned for the next blog post to see this trick (and many more!) in action.

Today we will look at a Putnam problem. Here is the question (a screenshot from the problem sheet, which can be found here).

Now, first looking at this problem you probably think induction is a good place to start. It isn't. The reason that it is really hard to begin with induction is that dividing polynomials by other polynomials is hard – let alone ensuring that the resulting quotient has integer coefficients and no remainder terms. So, let's make life easier and re-write the recurrence relation so that it has no division in it (a good first step for solving problems!). We note that the recurrence relation can be re-arranged to

(1)

Now, you can try and pull induction on (1), and I encourage you to. That way you will find that it just reverts back to dealing with a quotient, and the simplification step we just did gets undone. So what to do next?

Well, a good next step is to put all our terms on one side.

(2)

Now from this step, it requires a lot of thinking; or rather, one little observation. It is remarkable that no matter which index you pick, the left hand side of (2) is identically zero. These are all different polynomials, but combining them as in (2) yields $0$. In particular, if we shift the index, the recurrence relation is now

(3)

(4)

and after collecting terms we get

(5)

Now (5) is telling us that both sides of the equation must equal a constant if this is to be true for all values of the variable; after all, they are not the same polynomials. So you can now go ahead and substitute in to find what this constant has to be. However, if you go ahead and do it, you'll hit a dead end, but I encourage you to go down this path to see exactly what the dead end is. There is one more thing to do with (5) before we find what the constant is, and that is divide by the common factors (once you see the dead end I talked about, you'll understand why it is necessary to divide). So, dividing through by these factors, which are non-zero polynomials,

(6)

Now substituting in and we can see that the constant factor is

And so now we get a new recurrence relation for our polynomial

(7)

which can be arranged to give

(8)

which is clearly a polynomial with integer coefficients. Hopefully I've shown you the problem solving process involved in attacking a question like this. The biggest piece of advice you can take from this problem is to first try something, and if it doesn't work, analyse why, and see if you can avoid it or change your game plan to fix it!

Before I begin: as stated in my articles section, I said I am going to start writing about all the neat maths I've learnt throughout my final year project in physics, in which I wrote an introduction to general relativity (available on the articles page). I'd like to point to a great post written by my friend Ivan on general relativity. You can find the post here, along with all of his other great content.

For those first learning differential geometry, you may be given the definition of a manifold without any reasoning as to why the definition is the way it is. We begin with the definition of a topological manifold; if you are unfamiliar with the definition of a topological space, look here.

A topological manifold is a topological space $X$ such that

- $X$ is Hausdorff
- $X$ is second countable
- $X$ is locally Euclidean

A Hausdorff space is a topological space $X$ such that for any two distinct points $x, y \in X$, we can find two open sets $U$ and $V$ where $x \in U$, $y \in V$ and $U \cap V = \emptyset$.

Furthermore, a topological space is called *second countable* if there exists a countable basis for the topology. This means there exists a countable collection of open subsets of $X$, $\mathcal{B} = \{B_i\}_{i \in \mathbb{N}}$, such that each open subset of $X$ can be written as a union of elements of some subfamily of $\mathcal{B}$.

Additionally, a topological space is called locally Euclidean if every point in $X$ has a neighbourhood that is homeomorphic to some open subset of $\mathbb{R}^n$.

We can see immediately why the locally Euclidean condition is useful: once we can say that a space is locally like Euclidean space, many of the properties that Euclidean space enjoys are transferred to our topological manifold in a local sense, though not necessarily globally. However, the other two conditions, the Hausdorff requirement and second countability, are not so obvious. To see why we need the Hausdorff requirement, let's look at what is called the trivial topology, where a topological space $X$ is given the topology $\tau = \{\emptyset, X\}$.

We often like to know what a sequence converges to, and so we will first look at how sequences behave in this topology. The first observation we make is that any sequence in $X$ converges to every point in $X$. Take a sequence $(x_n)$ in $X$ and some point $x \in X$. Now, in order for $x_n \to x$, every open neighbourhood of $x$ must eventually contain the points of the sequence. Since the only open neighbourhood of $x$ is $X$ itself, it follows that $x_n \to x$. But we chose $x$ to be any point in $X$, and so the sequence converges to every point in $X$.

The next thing to note is that any map $f : Y \to X$ is continuous, for any topological space $Y$. To check continuity, we look at the pre-images of the open sets of $X$. Clearly $f^{-1}(\emptyset) = \emptyset$, and so the pre-image is open. Now, $f^{-1}(X) = Y$, and so this pre-image is also open, and hence $f$ is continuous.

This rather pathological example of a topology illustrates the unwanted behaviours that a space can produce by itself. Note that these behaviours had nothing to do with the sequences or the functions we chose, but were a direct consequence of the trivial topology we had chosen. To avoid all this, we restrict ourselves to Hausdorff spaces, and we will now show two important properties of Hausdorff spaces that solve these problems.

For a topological space $X$ that is Hausdorff, each convergent sequence in $X$ has a unique limit.

Proof.

Take a sequence $(x_n)$ in $X$, and say it converges to two different points $x$ and $y$. Since $x_n \to x$, for every open neighbourhood $U$ of $x$ there exists an $N$ such that $x_n \in U$ for all $n \ge N$. By the Hausdorff property, since $x \ne y$ there exist disjoint open sets $U \ni x$ and $V \ni y$. Hence $x_n \notin V$ for all $n \ge N$, and so $(x_n)$ does not converge to $y$. A contradiction, and hence $x = y$ and the limit is unique.

In order to solve the continuity problem, our spaces need a sufficient number of open sets for the definition of continuity to be useful. A good way to ensure this is for finite subsets to be closed, so that their complements are open. Since we can construct a vast number of finite subsets, we will also have a vast number of open sets. We now show that each finite subset of a Hausdorff space is closed.

Proof.

We use the fact that if a set contains its limit points then it is closed.

Take $A$ to be finite, and a sequence $(x_n)$ in $A$ such that $x_n \to x$ for some $x$. Say that $x \notin A$, and take some $x_n$ with $n \ge N$, where $N$ is from the definition of convergence. There is some open set containing $x$ disjoint from $A$ by the Hausdorff property. However, since $x_n \to x$, every open set containing $x$ must eventually contain points of the sequence; yet we have found an open set that contains $x$ but does not contain any points of the sequence. A contradiction. Hence $x \in A$, and $A$ contains its limit points.

EDIT:

This proof is wrong for a few reasons:

- We didn't use the fact that the subset is finite anywhere in the proof, other than in the first line where we say it is finite – but this is not used later, and so the argument works for any subset. Our argument would then state that any subset of a Hausdorff space is closed.
- The second error is that we have misused the Hausdorff property. The property states that given two distinct points, we can find two disjoint open sets containing each point. What we have done in the proof is taken a point $x$ and a set $A$, then said we can find an open set containing $x$ that is disjoint from $A$. This is not the Hausdorff property, and so ultimately the argument breaks down here.

So, a revised proof follows the idea that since the set is finite, $A = \{a_1, \dots, a_k\}$, we can write

$$A = \bigcup_{i=1}^k \{a_i\},$$

and since each $\{a_i\}$ is a closed set, their finite union is closed. Well, the fact that a single point is a closed set is not so obvious, so let's prove it, and this is where the Hausdorff property comes in.

Proof.

Take $y \ne x$, and by the Hausdorff property we can find two disjoint open sets $U_y$ and $V_y$ such that $x \in U_y$ and $y \in V_y$. Now, we note that

$$X \setminus \{x\} = \bigcup_{y \ne x} V_y.$$

Since each $V_y$ is open, their union is open, and so $X \setminus \{x\}$ is open; hence $\{x\}$ is closed.

END EDIT

Now we can see why we need the Hausdorff property in the definition. The second countability condition is there to ensure the existence of partitions of unity. I will talk about this in detail in some later post; for now, you can think of a partition of unity as a way to "glue" different functions on a manifold together. These are required for integrating on the manifold and come up frequently in smooth manifold theory.

The Einstein equations are given, in tensor notation, by

$$R_{\mu\nu} - \frac{1}{2} R g_{\mu\nu} = 8\pi T_{\mu\nu}, \qquad (1)$$

where

- $R_{\mu\nu}$ is the Ricci curvature tensor of the metric $g_{\mu\nu}$
- $R$ is the Ricci scalar, defined as $R = g^{\mu\nu} R_{\mu\nu}$
- $T_{\mu\nu}$ is the energy momentum tensor of matter

This is a set of 10 partial differential equations which can be solved for the metric, which tells you how your spacetime is curved. The idea is that we want to be able to solve this set of partial differential equations and find what our metric is, given some initial data. We will limit ourselves to the vacuum Einstein equations, where we set $T_{\mu\nu} = 0$ and (1) reduces to

$$R_{\mu\nu} = 0. \qquad (2)$$

Now, the Cauchy problem in general relativity asks: given some initial data set, can we find a stable solution to (2)? And what do I mean by an initial data set?

- A metric, $\bar{g}$
- The induced second fundamental form of $\bar{g}$, $\bar{k}$
- Both defined on a 3-manifold (hypersurface), $\Sigma$

The initial values that we require to solve for a metric will be some triple $(\Sigma, \bar{g}, \bar{k})$.

Furthermore, these initial conditions must satisfy the vacuum constraint equations

$$\bar{R} + (\mathrm{tr}\,\bar{k})^2 - |\bar{k}|^2 = 0$$

and

$$\mathrm{div}\,\bar{k} - \nabla(\mathrm{tr}\,\bar{k}) = 0.$$

These constraint equations ensure that $\bar{g}$ and $\bar{k}$ form the induced metric and second fundamental form of $\Sigma$, and they come from the Gauss and Codazzi equations.

So what are the milestones reached in addressing this problem of solving these equations?

- 1920, de Donder discovered the harmonic gauge
- 1935, Schauder proved existence and uniqueness for the Cauchy problem in the harmonic gauge
- Stellmacher built on this to prove uniqueness of the Cauchy problem
- 1958, Choquet-Bruhat proved local existence
- 1969, Choquet-Bruhat and Geroch proved global existence

Now let’s look at each of these developments in detail.

The harmonic gauge, or a harmonic co-ordinate system, is one in which each of the co-ordinate functions must satisfy the wave equation,

$$\Box_g x^{\mu} = 0, \qquad (3)$$

where $\Box_g$ is the box (wave) operator of the metric. In this harmonic co-ordinate system, the Einstein vacuum equations reduce to

$$\Box_g g_{\mu\nu} = N_{\mu\nu}(g, \partial g),$$

where $N_{\mu\nu}$ is quadratic in the first derivatives of the metric. This is a system of quasilinear wave equations, which is an easier problem to look at than the partial differential equations present in the Einstein equations.

In fact, the existence and uniqueness of solutions for the Cauchy problem of this equation was proved by Schauder in 1935. Then, using only this, Stellmacher proved uniqueness for the Einstein vacuum equations. But to prove existence, it is not enough to know that solutions exist to the reduced Einstein equations, and this leads us to the work of Choquet-Bruhat in the 1950s and the theorems that she proved.

Now we ask, when do solutions of the reduced Einstein equations give solutions of the Einstein equations?

The answer is when the solution also satisfies the harmonic gauge condition.

However, Choquet-Bruhat found that the harmonic gauge condition is automatically satisfied, given that it is satisfied on the initial hypersurface and that the initial conditions satisfy the constraint equations. This means choosing a local co-ordinate system that satisfies the harmonic gauge will also give us solutions to the Einstein equations.

Now, this was just a local existence result; in general, though, we would like to be able to talk about a maximal solution, one that is valid over the largest possible domain.

And she did just that, along with Geroch, in 1969: she proved that for an initial data set there exists a unique solution that is maximal, in the sense that we mentioned before, i.e. any globally hyperbolic development of the same initial data can be extended to this maximal one. Global hyperbolicity is a geometric condition required to ensure that our solutions are domain independent, and so it is necessary in order to prove uniqueness of solutions. In terms of geometry, globally hyperbolic means that the spacetime admits a Cauchy surface.

Now, this maximal solution is called the maximal Cauchy development of the data set $(\Sigma, \bar{g}, \bar{k})$, and it is a central object in the study of general relativity. It allows us to formulate the fundamental problem of solving these Einstein equations as one of dynamics. Recall that in Newtonian physics the question is: given an initial position and velocity, what is the future? Now we ask: given the initial data set, what is the future? Where the future is manifested as this maximal Cauchy development.

Let's first begin with an example we all know: the dot product. Now, given any two vectors $u$ and $v$, we can calculate their dot product, $u \cdot v$. Let's do some fancy renaming and let

$$g(u, v) = u \cdot v.$$

Let's make some notes about $g$.

- It is a function of 2 vectors
- It returns a real number
- As we know, since $g$ is just the dot product, it is bilinear, meaning it is linear in both arguments

This is in fact an example of what we call a (0, 2) tensor, as it takes as its arguments 2 vectors and returns a real number. This tensor has a matrix representation, namely the 2 by 2 identity matrix. However, in polar co-ordinates the metric would have matrix representation with components $g_{rr} = 1$ and $g_{\theta\theta} = r^2$, with the rest of the entries zero. We can denote any element of the matrix, and hence the components of the tensor, as $g_{\mu\nu}$.

Now, you may be thinking, why is it a (**0,**2) tensor, what does that zero stand for? Well, for that we have to introduce the notion of a covector.

A covector is a function which maps vectors into the real numbers; for those of you who are snazzy with their linear algebra, these are elements of the dual space of a vector space. As an example, consider the rather useless function which maps every element in our vector space to $0$; this is a covector, it maps our vectors into the real numbers. Now, we will make an important distinction in notation, and this will come back in a huge way later; for now, memorise it. We represent the components of a vector with raised indices, $v^{\mu}$, and the components of a covector with lowered indices, $w_{\mu}$. It is important that you know this. Say goodbye to arrows and tildes indicating vectors; we now use the position of the index: up is a vector, down is a covector.

Now, an important relationship between vectors and covectors is that a vector can be mapped into a covector using the Minkowski metric $\eta_{\mu\nu}$ of Minkowski space, the metric that has components $\eta_{00} = -1$, $\eta_{11} = \eta_{22} = \eta_{33} = 1$, and all other entries zero. Using our notation this is

$$v_{\mu} = \eta_{\mu\nu} v^{\nu}.$$
What on earth does this mean? This is a summation over the repeated index $\nu$, known as the Einstein summation convention; we say that the indices contract. This happens whenever the same index appears once lowered and once raised. What is important to note is that the index on $v$ was lowered: it has become a covector. In much of what follows, we will use the metric to raise and lower indices as we please. We note that an index can be raised by using the inverse of the Minkowski metric, $\eta^{\mu\nu}$.

Now, back to the zero in (0,2) tensor; it states that our tensor takes 0 covectors as arguments. A (1,1) tensor, let's call it $T$, would take 1 covector and 1 vector as arguments and return a real number. We can represent the components of this tensor by $T^{\mu}{}_{\nu}$, where $\mu$ represents the index of the covector and $\nu$ represents the index of the vector. If $T$ was instead a (2,1) tensor, it would be indexed by $T^{\mu\sigma}{}_{\nu}$. In general, an (r,s) tensor will have $r$ upper indices, representing the number of covectors the tensor takes as input, and $s$ lower indices, representing the number of vector inputs. It is important to note that any real valued function is a (0,0) tensor. So really, tensors have been in front of your eyes this entire time, you just haven't realised it.

Now, I told you the notation was going to come back in a huge way, and here it is. Say we have a (1,1) tensor $T$, and say we wish to feed it a vector $v$. By our notation and the summation convention, we can simply write

$$T^{\mu}{}_{\nu} v^{\nu},$$

and note that our indices contract naturally. The lowered $\nu$ on $T$, which represents the vector input, is contracted with the raised $\nu$ on the vector. The same is true of covectors. With this notation, if you make a mistake you will know it, and almost always you can just guess how to put things together by looking at how the indices should contract! It's powerful stuff.
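Index contraction is exactly what `numpy.einsum` computes, so here is a small sketch of my own (the particular tensor components, and the 2D "Minkowski-style" metric $\mathrm{diag}(-1, 1)$, are made-up choices for illustration):

```python
import numpy as np

# A made-up (1,1) tensor T with components T^mu_nu, and a vector v with components v^nu.
T = np.array([[2.0, 0.0],
              [1.0, 3.0]])          # T[mu, nu] = T^mu_nu
v = np.array([1.0, 4.0])            # v[nu] = v^nu

# Feeding the vector into the tensor: w^mu = T^mu_nu v^nu (sum over the repeated nu).
w = np.einsum("mn,n->m", T, v)

# Lowering an index with the metric: v_mu = eta_{mu nu} v^nu.
eta = np.diag([-1.0, 1.0])
v_lower = np.einsum("mn,n->m", eta, v)
```

The einsum string mirrors the index notation directly: a letter appearing in both inputs but not in the output is summed over, just like a repeated raised/lowered index.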

Now let's see this stuff in action by finding the equation of motion for the following Lagrangian density

$$\mathcal{L} = \frac{1}{2} \partial_{\mu} \phi \, \partial^{\mu} \phi - \frac{1}{2} m^2 \phi^2.$$

Note that the summation convention is applied to the indices on the partial derivatives. Now, the Euler-Lagrange equation for fields is

$$\partial_{\mu} \frac{\partial \mathcal{L}}{\partial (\partial_{\mu} \phi)} - \frac{\partial \mathcal{L}}{\partial \phi} = 0,$$

where the summation convention occurs again over the index $\mu$: one is raised, and one is effectively lowered by being in the denominator of the fraction.

Clearly, $\frac{\partial \mathcal{L}}{\partial \phi} = -m^2 \phi$. Now for the fun part:

$$\frac{\partial \mathcal{L}}{\partial (\partial_{\mu} \phi)} = \partial^{\mu} \phi.$$

Now, we can see that in order for our Lagrange equation of motion to make sense, this is the only possible position of the index we could have gotten, because now the summation convention holds when we put in the final partial derivative from the Euler-Lagrange equation of motion, and we get

$$\partial_{\mu} \partial^{\mu} \phi + m^2 \phi = 0,$$

which can be re-written as

$$\Box \phi + m^2 \phi = 0.$$
Now, for the physics buffs: since $\phi$ is a function of $(t, x, y, z)$, we have that $\Box = \partial_{\mu}\partial^{\mu}$ is just the d'Alembert operator, which is the Laplace operator of Minkowski spacetime. So the Euler-Lagrange equation which we just derived should be recognised as the Klein-Gordon equation.
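As a check on the derivation (a sketch of my own, working in 1+1 dimensions with signature $(+,-)$ for simplicity), we can verify symbolically that a plane wave obeying the dispersion relation $\omega^2 = k^2 + m^2$ solves the Klein-Gordon equation:

```python
import sympy as sp

t, x, m, k = sp.symbols("t x m k", positive=True)
omega = sp.sqrt(k**2 + m**2)        # dispersion relation: omega^2 = k^2 + m^2

phi = sp.cos(omega * t - k * x)     # plane-wave candidate solution

# d'Alembertian in 1+1 dimensions with signature (+,-): box = d^2/dt^2 - d^2/dx^2
box_phi = sp.diff(phi, t, 2) - sp.diff(phi, x, 2)

# Klein-Gordon equation: box(phi) + m^2 phi should vanish identically
kg = sp.simplify(box_phi + m**2 * phi)
```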


I’m going to present 3 equivalent definitions of the n-dimensional real projective space using the language of quotient spaces, and then prove that they are indeed equivalent.

A. $\mathbb{RP}^n$ is the sphere $S^n$ with antipodal points identified. (Sphere Model)

B. $\mathbb{RP}^n$ is $\mathbb{R}^{n+1} \setminus \{0\}$ with two points identified if they lie on the same line through the origin. (Line Model)

C. $\mathbb{RP}^n$ is the closed unit ball $D^n$ with antipodal points on its boundary sphere identified. (Ball Model)

Now all three models are quotient spaces, and to prove that they are equivalent, i.e. that they are homeomorphic, we will need a bit of theory about identification maps. So let’s take a few moments to say a couple of things.

Say we have topological spaces $X$ and $Y$, and a continuous surjective map $f : X \to Y$ such that a set $U \subseteq Y$ is open exactly when $f^{-1}(U)$ is open in $X$; then we call $f$ an identification map. Why? Because $f$ provides a natural partition of the space $X$ into the subsets $f^{-1}(y)$, where $y \in Y$. That might boggle your mind, so let's take a look at an example.

Consider the spaces $X = \mathbb{R}^{n+1} \setminus \{0\}$ and $Y = S^n$. We can define the continuous surjective function

$$f(x) = \frac{x}{|x|}.$$

Now, clearly, any vector that points in the same direction in $X$ will get mapped to the same point on $S^n$. These sets of pre-images of a point provide the partition of $X$: the sets $f^{-1}(y)$. We can see now that we can describe $X$ as the disjoint union of the pre-images of $f$, i.e.

$$X = \bigsqcup_{y \in S^n} f^{-1}(y).$$

Since $f$ is surjective, none of these sets will be the empty set, and so this is well defined. Now, we can represent this quotient space as $X/f$, which reads as "$X$ partitioned into pre-images of $f$." In the example I chose, it is obviously true that $X/f$ is homeomorphic to $S^n$.

Now let's go back to the general case: topological spaces $X$ and $Y$ and the map $f : X \to Y$. Now, if $f$ is an identification map, it immediately follows that $X/f$ is homeomorphic to $Y$, i.e. that the map $\bar{f}$ in the picture below is a homeomorphism. Also note that $\pi$ is just the canonical projection from $X$ to its quotient space.

Now, there are two results that we can use to check if our given map is an identification map, and then simply state that and are homeomorphic, let’s state them so we can use them. The first result relates to the map and the second one relates to the spaces.

Theorem 1: If a surjective and continuous map $f : X \to Y$ maps open sets to open sets, or closed sets to closed sets, then $f$ is an identification map.

Theorem 2: Let $f$ be a surjective and continuous map from $X$ to $Y$. If $X$ is compact and $Y$ is Hausdorff, then $f$ is an identification map.

Now we have the necessary gadgets to prove that the three definitions of the real projective space are indeed homeomorphic. Let’s start by showing that the sphere model is homeomorphic to the line model, and it is always a good idea to draw out a little diagram of our spaces and what maps what to what.

Now, we want to show that the bottom two spaces are homeomorphic, so following the general model that we introduced above, we want $\bar{f}$ to be between them.

So this must mean we need to find an identification map between the diagonal spaces, shown in red below.

Now, we can try really hard to be smart and think of a single map straight to the line model, or we can just compose some simple functions. Namely, we can see that we can go from $S^n$ to $\mathbb{R}^{n+1} \setminus \{0\}$ using the inclusion map.

Now, since the inclusion and the projection are both continuous and the composition hits every line through the origin, the composition is continuous and surjective. $S^n$ is a closed and bounded subset of $\mathbb{R}^{n+1}$ and is therefore compact. We now just need to show that the line model is indeed Hausdorff, and we would be done by Theorem 2.

So now, let’s prove that this space is Hausdorff. This is no small feat. The idea behind this will be to use the fact that we can find disjoint open sets in because it is Hausdorff, and then use our quotient map to take these disjoint open sets to disjoint open sets in the quotient space.

So we start by taking two distinct elements $[\ell_1]$ and $[\ell_2]$ in our quotient space, which are just lines through the origin in $\mathbb{R}^{n+1}$. We can look at the intersections of these lines with $S^n$, which give 4 distinct points: $p$ and $-p$ are the intersections of $\ell_1$ with $S^n$, and $q$ and $-q$ are the intersections of $\ell_2$. Since $S^n$ is a subset of the Hausdorff space $\mathbb{R}^{n+1}$, we can find disjoint open sets that contain each of $p$, $-p$, $q$ and $-q$; call them $U_p$, $U_{-p}$, $U_q$ and $U_{-q}$. Furthermore, we can construct each of these sets so that if $x \in U_p$ then $-x \in U_{-p}$, which ensures that $U_p \cup U_{-p}$ is a union of antipodal pairs, and the same for $U_q \cup U_{-q}$. Now, for any $x \in U_p \cup U_{-p}$, it follows that $x \notin U_q \cup U_{-q}$, and hence the two unions are disjoint. Now, don't let this trip you out: we only need to have a little bit of each line near the sphere, so that in the quotient space the image of $U_p \cup U_{-p}$ is an open set containing the equivalence class $[\ell_1]$, and the same for $U_q \cup U_{-q}$ and $[\ell_2]$. So, in the quotient space, these two images are disjoint open sets containing $[\ell_1]$ and $[\ell_2]$ respectively. Hence, the quotient is Hausdorff!

Now, we want to show that Ball Model in C, is equivalent to the sphere model in A. Here is the map of our spaces and functions.

Here the map takes the point $x \in D^n$ to a point on the sphere $S^n$. Clearly it is continuous and surjective, and so the composition with the projection is continuous and surjective. Now, clearly $D^n$ is compact, and the quotient is Hausdorff – the proof of which is virtually the same as the one above; just ignore the lines-through-the-origin business and go straight to antipodal points. It follows that the composition is an identification map, and so the ball model is homeomorphic to the sphere model.

You're probably thinking: what the heck is the point of all this? Why on earth are we even doing this? Other than for the pure joy of it, which is of course reason enough, there is a really nice connection between the projective plane, $\mathbb{RP}^2$, and non-orientable surfaces. We will build up to understanding this, and then we will come back to our beloved projective plane $\mathbb{RP}^2$.

So, what the hell is a surface? You probably have an idea of what a surface is; you might think it's a shape in 2 or 3 dimensions that has some faces, edges and vertices. Well, this is what comes to my mind when I think of a "surface". But what isn't so natural is how we work mathematically with surfaces; however, you're in luck, because it is quite intuitive. We know that a surface is made up of vertices, edges and faces. So we would like to define these building blocks as *simplices*. Consider a rectangle: it has 4 vertices, 4 edges and 1 face, built from what are called 0-simplices, 1-simplices and 2-simplices respectively.

Now, here I'll stop and introduce something pretty neat, called the Euler characteristic of a surface, which is

$$\chi = V - E + F.$$

In terms of our everyday language, this is just vertices – edges + faces. Take note of this, it's gonna come back big time.

Finally, we note that we can give a surface an orientation, which is an *ordering* of the vertices. If the faces of the surface can be oriented so that a loop on a face going clockwise cannot be continuously deformed into a loop going anti-clockwise without overlapping itself, we say that the surface is orientable. As an example, consider the Möbius strip: it is non-orientable, as a loop around the strip comes back with its orientation reversed. In fact, if a surface contains a subset homeomorphic to the Möbius strip, it is not orientable.

Now, it turns out that any connected non-orientable surface is homeomorphic to the connected sum of $n$ projective planes, $\mathbb{RP}^2 \# \cdots \# \mathbb{RP}^2$. WHAAAAAT! Completely unexpected, right!?

But how do we know how many projective planes we need? Well, we look at the Euler characteristic of course!

The Euler characteristic of the connected sum of $n$ projective planes is

$$\chi = 2 - n.$$
Now all you need to do is figure out how many projective planes you need to get the same Euler characteristic as your surface, and bam, you're done.
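As a tiny worked example (my own, using cell structures I believe are standard: the projective plane with 1 vertex, 1 edge, 1 face, and the Klein bottle with 1 vertex, 2 edges, 1 face):

```python
# The Euler characteristic of the connected sum of n projective planes is 2 - n,
# so a non-orientable surface with characteristic chi needs n = 2 - chi of them.
def projective_planes_needed(V, E, F):
    """Number of projective planes whose connected sum has Euler characteristic V - E + F."""
    chi = V - E + F
    n = 2 - chi
    if n < 1:
        raise ValueError("chi is too large for a connected sum of projective planes")
    return n

# The projective plane itself: 1 vertex, 1 edge, 1 face, so chi = 1 and n = 1.
n_rp2 = projective_planes_needed(1, 1, 1)
# The Klein bottle: 1 vertex, 2 edges, 1 face, so chi = 0 and it is RP^2 # RP^2.
n_klein = projective_planes_needed(1, 2, 1)
```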

As a side note, if the surface is orientable, it is the connected sum of $n$ tori, or it is the sphere if $\chi = 2$, where $\chi = 2 - 2n$ for the connected sum of $n$ tori.

Now, why is this so useful? Because once you know what you’re homeomorphic to, you can easily compute the fundamental group of connected sums of the projective plane.

So there you go: any surface is exactly the same as either a sphere, the connected sum of $n$ tori, or the connected sum of $n$ projective planes. What a time to be alive.

The energy of a system will be the total work required to assemble it. For example, if you wanted to make a system of two protons separated by a distance of 5cm (assuming you bring them in from infinity), the energy of the system will be the amount of work you do to push the two protons together against their repulsive electrostatic force.

Now, for any electrostatic system, the work is simply

$$W = \frac{\epsilon_0}{2} \int E^2 \, d\tau.$$
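As a sanity check of this energy formula (my own numerical sketch – the charge and cut-off radius are arbitrary choices), we can integrate the energy density of a point charge's field outside a radius $R$ and compare with the closed form $q^2 / 8\pi\epsilon_0 R$ that the integral evaluates to:

```python
import numpy as np
from scipy.integrate import quad

eps0 = 8.8541878128e-12   # vacuum permittivity (F/m)
q = 1.602176634e-19       # a proton's charge (C); an arbitrary choice here
R = 1e-10                 # cut-off radius in metres, also arbitrary

def shell_energy(r):
    """Energy density (eps0/2) E^2 integrated over a spherical shell of radius r."""
    E = q / (4.0 * np.pi * eps0 * r**2)   # Coulomb field of a point charge
    return 0.5 * eps0 * E**2 * 4.0 * np.pi * r**2

W_numeric, _ = quad(shell_energy, R, np.inf)
W_closed = q**2 / (8.0 * np.pi * eps0 * R)   # what the integral evaluates to
```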

Now, this is paralleled for the energy in magnetic fields. The formula we know from electrodynamics is

$$W = \frac{1}{2\mu_0} \int B^2 \, d\tau,$$

where the field can also be expressed through the magnetic vector potential $\mathbf{A}$, analogous to the potential $V$ in electric fields; however, it is defined as $\mathbf{B} = \nabla \times \mathbf{A}$.

We note that this work is not done by the magnetostatic field itself, since magnetic fields do no work. But why does building up a magnetic field from scratch require work? This is because we start from no field and go to some field; this involves a *changing* magnetic field, which induces an electric field, which in turn does work, and so we need to battle against that to "assemble" our magnetic field (by maybe increasing a current or something).

Combining the previous two results, we get that the total energy in an electromagnetic field is given by

$$W = \frac{1}{2}\int \left( \varepsilon_0 E^2 + \frac{B^2}{\mu_0} \right) d\tau.$$
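To get a feel for the sizes involved, here is a quick numerical illustration in Python. The field values and the volume are made up, and the fields are taken uniform so the integral is just a multiplication:

```python
import math

# Vacuum field energy W = (1/2) * integral( eps0*E^2 + B^2/mu0 ) dV,
# evaluated for uniform E and B over a volume V (arbitrary values).

eps0 = 8.854e-12      # permittivity of free space, F/m
mu0 = 4e-7 * math.pi  # permeability of free space, T*m/A

E = 1.0e5   # V/m, uniform electric field
B = 0.01    # T, uniform magnetic field
V = 2.0e-3  # m^3, volume of the region

u_E = 0.5 * eps0 * E**2   # electric energy density, J/m^3
u_B = 0.5 * B**2 / mu0    # magnetic energy density, J/m^3
W = (u_E + u_B) * V       # total energy for uniform fields, J

print(f"u_E = {u_E:.4g} J/m^3, u_B = {u_B:.4g} J/m^3, W = {W:.4g} J")
```

Notice how, for these everyday-ish values, the magnetic term dominates the electric one by a few orders of magnitude.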

In a dielectric-filled system, as you build up the free charges around your dielectric, the bound charge in the dielectric exerts a force on them, and so we have to take this into account when calculating the work. In a dielectric system the work becomes

$$W = \frac{1}{2}\int \mathbf{D} \cdot \mathbf{E} \, d\tau.$$

In a magnetisable material, the energy is given by

$$W = \frac{1}{2}\int \mathbf{B} \cdot \mathbf{H} \, d\tau.$$

Now this is all well and good for finding energy, but there is an easier way to calculate these quantities without performing integrals, and that is through something called inductance. Let's first define what it is. Say we have two loops side by side, but not touching.

If there is a changing current flowing in loop 1, the changing magnetic field it produces will pass through loop 2, causing a change in flux through loop 2. By Faraday's law, the changing flux through loop 2 will create an electric field, and so a current will now flow through loop 2. But we said that the two loops are not connected, they don't even touch! How on earth does one affect the other? The energy stored in the magnetic field of loop 1 is transferred to loop 2, causing the current in loop 2. By Lenz's law, the current will flow so as to oppose the changing magnetic field that caused it. This is known as mutual inductance.

Now let's do some math with this. What is the magnetic flux through loop 2? Well, we first need to figure out the magnetic field from loop 1. This is given by the Biot-Savart law:

$$\mathbf{B}_1 = \frac{\mu_0 I_1}{4\pi} \oint \frac{d\mathbf{l}_1 \times \hat{\mathbf{r}}}{r^2},$$

where $I_1$ is the current through loop 1.

The flux through loop 2 will be given by

$$\Phi_2 = \int \mathbf{B}_1 \cdot d\mathbf{a}_2.$$

As you can see, the flux is going to be proportional to the current through loop 1, and so we can write

$$\Phi_2 = M I_1,$$

where

$$M = \frac{\mu_0}{4\pi} \oint \oint \frac{d\mathbf{l}_1 \cdot d\mathbf{l}_2}{r}$$

is the mutual inductance between loop 1 and loop 2 (this is the Neumann formula). We note that since it doesn't matter which loop we consider, $M_{21} = M_{12} = M$. Secondly, $M$ is purely geometric: it only relies on the shape and relative position of the loops.
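Since the Neumann double integral rarely has a closed form, a natural thing to do is evaluate it numerically. Here is a sketch of my own (function names and the choice of coaxial circular loops are just for illustration) that discretises each loop into short segments and sums $d\mathbf{l}_1 \cdot d\mathbf{l}_2 / r$ over all pairs:

```python
import numpy as np

# Numerical Neumann formula: M = (mu0/4pi) * sum over segment pairs of
# (dl1 . dl2)/r, for two coaxial circular loops (radii/separation arbitrary).

mu0 = 4e-7 * np.pi

def loop_points(radius, z, n):
    """Return n points and tangent segments dl around a circle at height z."""
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    pts = np.stack([radius * np.cos(t), radius * np.sin(t),
                    np.full(n, z)], axis=1)
    dt = 2.0 * np.pi / n
    dls = np.stack([-radius * np.sin(t), radius * np.cos(t),
                    np.zeros(n)], axis=1) * dt
    return pts, dls

def mutual_inductance(r1, r2, d, n=400):
    """Neumann double sum for coaxial loops of radii r1, r2, separation d."""
    p1, dl1 = loop_points(r1, 0.0, n)
    p2, dl2 = loop_points(r2, d, n)
    diff = p1[:, None, :] - p2[None, :, :]   # pairwise segment separations
    r = np.linalg.norm(diff, axis=2)
    dots = dl1 @ dl2.T                       # dl1 . dl2 for every pair
    return mu0 / (4.0 * np.pi) * np.sum(dots / r)

M = mutual_inductance(0.1, 0.1, 0.5)
print(f"M = {M:.4g} H")
```

By construction the result is symmetric in the two loops, which is exactly the $M_{21} = M_{12}$ statement above.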

However, more often than not, we won't be dealing with mutual inductance between two circuits or systems; we will usually just have the one loop. In this case, we note that the same phenomenon will occur: Lenz's law states that a current will flow to oppose the change in magnetic field that caused it. But where will this current flow if we don't have another loop? You guessed it, loop 1 itself, the source loop! This is called self inductance, and the current induced will oppose the changing current that caused it. You can think of this induced current as fighting the change you are trying to make to the current in loop 1, and so it is called the back EMF. Now, as before, the flux through loop 1 will be given by

$$\Phi_1 = L I_1,$$

where $L$ is the self inductance of the loop.

In this case of self inductance, we can think of the inductance $L$ as analogous to mass, in that it is what you have to fight against in order to cause a change. The relation is clear if you think about this: if you have a larger mass, it is harder to move it, and similarly, if you have a larger self-inductance $L$, it is harder to change the current. $L$ is said to measure the magnetic inertia of a system.

The energy stored in the magnetic field for a given current is then given by

$$W = \frac{1}{2} L I^2.$$

Now this formula works for the magnetic energy whether or not the system contains magnetisable material; all you need to do is find the inductance first!
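As a concrete worked example (the numbers are made up, and the solenoid inductance $L = \mu_0 N^2 A / l$ is the standard textbook result rather than something derived here):

```python
import math

# Energy stored in an inductor via W = (1/2) L I^2, using a long
# solenoid as the example, with L = mu0 * N^2 * A / l.

mu0 = 4e-7 * math.pi

N = 1000     # number of turns
A = 1.0e-4   # cross-sectional area, m^2
l = 0.5      # solenoid length, m
I = 2.0      # current, A

L = mu0 * N**2 * A / l   # self-inductance of the solenoid, H
W = 0.5 * L * I**2       # stored magnetic energy, J

print(f"L = {L:.4g} H, W = {W:.4g} J")
```

No field integrals anywhere, which is the whole point.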

You may be asked somewhere along the line of your life to find capacitance. What is capacitance? A better question could be, why doesn’t Csaba teach this if he’s going to ask questions about it? The answer to that question is probably because it’s so simple!

Capacitance tells you how the potential $V$ of a system changes with the charge $Q$; specifically, the free charge, as we are only interested in the charge held by the capacitor. So capacitance is given by

$$C = \frac{Q}{V}.$$

That’s pretty much all you need to know. This is true for self capacitance of some conductor or the mutual capacitance between two conductors where the net charge is zero, e.g. a parallel plate capacitor.

Now, the energy stored in the capacitor can be found with a new equation, namely

$$W = \frac{1}{2} C V^2 = \frac{Q^2}{2C}.$$
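For instance, in Python (plate area, separation and voltage are made-up numbers, and $C = \varepsilon_0 A / d$ is the standard ideal parallel-plate result):

```python
# Capacitance and stored energy for an ideal parallel-plate capacitor.

eps0 = 8.854e-12  # permittivity of free space, F/m

A = 0.01    # plate area, m^2
d = 1.0e-3  # plate separation, m
V = 12.0    # voltage across the plates, V

C = eps0 * A / d     # capacitance, F
Q = C * V            # plate charge, consistent with C = Q/V
W = 0.5 * C * V**2   # stored energy, J

print(f"C = {C:.4g} F, Q = {Q:.4g} C, W = {W:.4g} J")
```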

That’s pretty much all you need to know to find energy in the electrodynamic system Csaba throws at you.

May the force be with you!

]]>Now, the man is a bit of a legend and somewhat of a god, but be warned, passing his class will be one of the hardest things you will ever have to do. So I am here to elucidate to you all the key concepts and ideas you will need to get through the electrodynamics part of the course. I, am your new hope.

Most of the course deals with Maxwell's equations and then applying them to matter, so that's where we begin, after which we will move on to electromagnetic waves (a future post).

Now as we go through all this, it is important that you have a picture or mind map of all these things that we are going to talk about, and slowly add to it as we go on. But, before we can do any of that, we have to get through some preliminary equations.

The current density $\mathbf{J}$, the current per unit area perpendicular to the flow, is given by

$$\mathbf{J} = \frac{dI}{da_\perp},$$

and the Lorentz force is

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v} \times \mathbf{B}).$$

This current density is nothing to be afraid of; it is simply the amount of current that is moving per unit area. So all you need to do to find the actual current is perform an integral over the area where there is current density, $I = \int \mathbf{J} \cdot d\mathbf{a}$, and voilà, you have current ready to go.
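That integral is easy to do numerically too. Here is a small sketch of my own, for a made-up current density $J(s) = J_0 \, s/R$ in a wire of radius $R$, where the exact answer $I = 2\pi J_0 R^2/3$ follows from integrating in polar coordinates:

```python
import math

# Total current I = integral of J . da over the wire cross-section,
# for the axially symmetric profile J(s) = J0 * s / R (made up).

R = 1.0e-3   # wire radius, m
J0 = 1.0e6   # peak current density, A/m^2

n = 100000
ds = R / n
I = 0.0
for k in range(n):
    s = ds * (k + 0.5)                     # midpoint of the k-th ring
    I += J0 * (s / R) * 2 * math.pi * s * ds   # J(s) * ring area

exact = 2 * math.pi * J0 * R**2 / 3
print(f"numerical I = {I:.6g} A, exact = {exact:.6g} A")
```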

Gauss' Law: **Electric charges produce an electric field**

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}$$

Another Law: **Magnetic charges do not exist**

$$\nabla \cdot \mathbf{B} = 0$$

Faraday's Law: **A changing magnetic field produces an electric field**

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$$

Ampere-Maxwell's Law: **A changing electric field induces a magnetic field**

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}$$

One final thing to note about the meaning of these equations is that the electric field and the magnetic field are all due to some sort of charge or current, i.e. the sources of these fields are electric charges and currents. This is the picture which you should have in your mind: electric charges and currents are the sources of electric and magnetic fields. Similarly, the Lorentz force law can be re-written, per unit volume, as

$$\mathbf{f} = \rho \mathbf{E} + \mathbf{J} \times \mathbf{B},$$

and hence, these fields exert forces on their sources.

Here we need to add some detail to our picture, specifically, what happens to our fields when we hit matter and vice versa. We will first deal with electric fields.

If a material is placed in an electric field it becomes polarised which leads to a net charge on the surface, and in some cases where the polarisation is not uniform, there will be accumulations of charge inside the material itself. We call these accumulations of charge due to polarisation **bound charge**, the one inside the material is known as the volume bound charge and the one on the surface of the material is known as the surface bound charge. Take a moment to properly grasp this effect of polarisation, and add it to the picture you’re building up in your mind, because once you do, the rest is just book keeping.

*The book keeping.*

Polarisation actually occurs to each individual atom in the material, separating the negative and positive parts and creating a dipole; each one has its own dipole moment $\mathbf{p}$ (this is just details). Since this happens over the entire volume of the material, we define

$$\mathbf{P} \equiv \text{dipole moment per unit volume}$$

and call $\mathbf{P}$ the polarisation. Now, since the bound charge arises from this polarisation, you'd assume we would be able to relate the two, and you are correct:

$$\sigma_b = \mathbf{P} \cdot \hat{\mathbf{n}},$$

where $\hat{\mathbf{n}}$ is a unit vector perpendicular to the surface. Similarly, for the volume,

$$\rho_b = -\nabla \cdot \mathbf{P}.$$

How will this affect Gauss' law? We have to remember that these bound charges only came about due to some external electric field; the electric field must've come from some other charge configuration placed near the material. This charge configuration we call the *free charge*. It is the free charge that polarises the material, it *displaces* the electric charges in the material. For this reason we call the field due to the free charge the *electric displacement*, $\mathbf{D}$,

$$\mathbf{D} \equiv \varepsilon_0 \mathbf{E} + \mathbf{P}, \qquad \nabla \cdot \mathbf{D} = \rho_f.$$

Finally, if you actually want the electric field, it's pretty simple: all it is, is Gauss' law but taking into account all the charge present, free and bound,

$$\nabla \cdot \mathbf{E} = \frac{1}{\varepsilon_0}(\rho_f + \rho_b).$$
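A nice sanity check on the bound-charge picture is the textbook uniformly polarised sphere: there $\sigma_b = P\cos\theta$ on the surface, $\rho_b = 0$ inside, and since bound charge is only displaced charge, the total must integrate to zero. Here is a quick numerical check of that (my own sketch; $P$ and $R$ are arbitrary values):

```python
import math

# Total bound charge of a uniformly polarised sphere: integrate
# sigma_b = P*cos(theta) over the surface; the result should be ~0.

P = 1.0e-6   # polarisation magnitude, C/m^2
R = 0.05     # sphere radius, m

n = 100000
dtheta = math.pi / n
Q_b = 0.0
for k in range(n):
    theta = dtheta * (k + 0.5)
    sigma_b = P * math.cos(theta)                    # P . n_hat on the surface
    dA = 2 * math.pi * R**2 * math.sin(theta) * dtheta  # ring of surface area
    Q_b += sigma_b * dA

print(f"total bound charge = {Q_b:.3g} C")
```

The positive charge piled up on one hemisphere exactly cancels the negative charge on the other.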

So, the picture in your head should now contain what happens to electric fields in matter. Now we need to add in a little bit on linear dielectrics, but this too is really only a footnote in the mental picture you’re creating. The footnote is this:

In some materials, the polarisation is linearly proportional to the electric field $\mathbf{E}$,

$$\mathbf{P} = \varepsilon_0 \chi_e \mathbf{E},$$

where $\chi_e$ is the electric susceptibility of the material. Note that we define the permittivity of the dielectric as

$$\varepsilon \equiv \varepsilon_0(1 + \chi_e).$$

That is pretty much it. You can also derive, if you wish, the following relationships that hold in linear dielectrics,

$$\mathbf{D} = \varepsilon \mathbf{E}$$

and

$$\rho_b = -\left(\frac{\chi_e}{1 + \chi_e}\right)\rho_f.$$
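The linear-dielectric bookkeeping is easy to check numerically: $\mathbf{D} = \varepsilon_0\mathbf{E} + \mathbf{P}$ computed from the definition should agree with $\mathbf{D} = \varepsilon\mathbf{E}$ from the linear relation. A tiny sketch (susceptibility and field values are arbitrary):

```python
# Consistency check of the linear-dielectric relations:
# P = eps0*chi_e*E, eps = eps0*(1 + chi_e), and D = eps0*E + P = eps*E.

eps0 = 8.854e-12  # permittivity of free space, F/m

chi_e = 3.0   # electric susceptibility (made up)
E = 1.0e4     # applied field, V/m

P = eps0 * chi_e * E       # polarisation
eps = eps0 * (1 + chi_e)   # permittivity of the dielectric
D1 = eps0 * E + P          # D from its definition
D2 = eps * E               # D from the linear-medium relation

print(f"D1 = {D1:.6g}, D2 = {D2:.6g}")  # the two should agree
```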

Now to add magnetic fields in matter to this picture. The good news is that it is pretty much exactly the same as the above. In the presence of a magnetic field, a material becomes magnetised, which we quantify by the magnetisation $\mathbf{M}$,

$$\mathbf{M} \equiv \text{magnetic dipole moment per unit volume}.$$

This magnetisation is manifested through **bound currents**; as each magnetic dipole is really a small current loop, the current loops join up on the surface of the material to form the bound surface current,

$$\mathbf{K}_b = \mathbf{M} \times \hat{\mathbf{n}},$$

and in non-uniform magnetisation there is also a bound volume current within the material,

$$\mathbf{J}_b = \nabla \times \mathbf{M}.$$

There is also a free current present, which is there because someone has set up this current by applying a potential difference. This is opposed to the idea of the bound current, which is there due to magnetisation. Now, Maxwell's equations still hold applied to the entire current; specifically, Ampere's law is

$$\nabla \times \mathbf{B} = \mu_0 \mathbf{J} = \mu_0(\mathbf{J}_f + \mathbf{J}_b),$$

and from this we see that

$$\nabla \times \left(\frac{1}{\mu_0}\mathbf{B} - \mathbf{M}\right) = \mathbf{J}_f.$$

The book keeping component of this section is that we define the auxiliary magnetic field

$$\mathbf{H} \equiv \frac{1}{\mu_0}\mathbf{B} - \mathbf{M},$$

and so in a medium, Ampere's law becomes

$$\nabla \times \mathbf{H} = \mathbf{J}_f.$$

Note that $\mathbf{H}$ is analogous to the $\mathbf{D}$ field. Now to add in the linear medium: $\mathbf{M}$ is linearly proportional to $\mathbf{H}$,

$$\mathbf{M} = \chi_m \mathbf{H},$$

where $\chi_m$ is the magnetic susceptibility of the material. You can also easily derive that

$$\mathbf{B} = \mu \mathbf{H},$$

where $\mu \equiv \mu_0(1 + \chi_m)$ is the permeability of the material.
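Just as for dielectrics, the linear magnetic relations can be checked numerically: $\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M})$ from the definition of $\mathbf{H}$ should match $\mathbf{B} = \mu\mathbf{H}$. A small sketch ($\chi_m$ and $H$ are arbitrary, paramagnetic-sized values):

```python
import math

# Consistency check of the linear magnetic medium relations:
# M = chi_m*H, mu = mu0*(1 + chi_m), and B = mu0*(H + M) = mu*H.

mu0 = 4e-7 * math.pi  # permeability of free space, T*m/A

chi_m = 2.0e-4   # magnetic susceptibility (made up)
H = 500.0        # auxiliary field, A/m

M = chi_m * H            # magnetisation
mu = mu0 * (1 + chi_m)   # permeability of the material
B1 = mu0 * (H + M)       # B from the definition of H
B2 = mu * H              # B from the linear relation

print(f"B1 = {B1:.6g} T, B2 = {B2:.6g} T")  # the two should agree
```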

We are one step away from writing out Maxwell's equations in a medium; we just need to figure out what we are going to do with the non-static case of the Maxwell-Ampere law.

If there is any change in the electric polarisation, there will be an extra current term, the polarisation current, which will create a magnetic field:

$$\mathbf{J}_p = \frac{\partial \mathbf{P}}{\partial t}.$$

So all up, the current can be written as

$$\mathbf{J} = \mathbf{J}_f + \mathbf{J}_b + \mathbf{J}_p.$$

Substituting this into Maxwell's Ampere law we get

$$\nabla \times \mathbf{B} = \mu_0\left(\mathbf{J}_f + \nabla \times \mathbf{M} + \frac{\partial \mathbf{P}}{\partial t}\right) + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t},$$

which can be re-arranged to give

$$\nabla \times \mathbf{H} = \mathbf{J}_f + \frac{\partial \mathbf{D}}{\partial t}.$$

So now that we know this, we can write out our Maxwell equations in a medium:

$$\nabla \cdot \mathbf{D} = \rho_f, \qquad \nabla \cdot \mathbf{B} = 0, \qquad \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \nabla \times \mathbf{H} = \mathbf{J}_f + \frac{\partial \mathbf{D}}{\partial t}.$$

So, I’m going to stop here, stay tuned though, next time we will talk about energy and capacitors, and probably do some nice derivations, which, admittedly, this post lacked.
