One thing Maxwell undestood is that the divergence and curl are densities. The electrostatic equation that says the divergecne of the electric field is a multiple of the electric charge density should be thought of as a statement relating one density to another: The divergence is a "flux" density, amd Maxwell's equation relates the flux density of the electric field to the electric charge density.
Let's proceed carefully here, so that this really sinks in. We're familiar with many kinds of densities: mass density, charge density, population density, etc. Each of these refer to an extensive or additive quantity. That is, if you have a region A and you partition it into two subregions B and C, then the mass in A equals the sum of the mass in B and in C. Likewise, the charge in A equals the sum of the charges in B and in C, etc. In these circumstances, i.e., when we are considering an additive quantity X, it is always a good idea to define a density function f(x,y) by taking the ratio:
(amount of X in a small box about (x,y))/(area of the small box)
and then taking the limit as the size of the box is shrunk to zero. We call the limit f(x,y), the density of X at (x,y).
This is what you do, ideally at least, with charge, mass, population, etc. You should do it whenever you are dealing with an additve quantity.
It is therefore of fundamental significance that flux is additive. This means that there is going to be a flux density.
To see this, consider a region A and partition it into two subregions B and C. Consider a vector field F(x,y), and compute the flux of F out of B and C.

In this diagram, the greenish region is B, and the orangish region is C. The two regions together make up A. Some normal vectors on the boundaries of these regions are sketched in. When we are computing the flux out of B and the flux out of C, the boundary between B and C comes up twice, once for B and and once for C. But the normal vector points in opposite directions these two times, as you see in the diagram, so these contributions to the total flux cancel out. What is left are just the pieces of the boundaries of B and C that are also on boundary of A and so the flux out of A equals the sum of the fluxes out of B and C. Thus, flux has the additive property.
Having established that, we know there is a flux density. Let's compute it. To do this, we just work out the flux out of a small box of side length h, and divide by the area, which is h2. Any term of order h3 or higher in the flux will drop out therefore when we let h tend to zero, so we can discard such terms as they arise in our computation.
Here is the square:

As you see, there are four sides, and each must be parameterized separately. Oh, well -- we'll only have to do this once. On the first leg, we go from (x0,y0) to (x0+h,y0) which has the parameterization
(x(t),y(t)) = (x0+t*h,y0) for 0 < t < 1
and on this part, Nds = (0,-h)dt.
The side which is paralellel to this, side 3, goes from (x0,y0+h) to (x0+h,y0+h) -- recall that we don't care about direction of travel when we do flux integrals. This has the parameterization
(x(t),y(t)) = (x0+t*h,y0+h) for 0 < t < 1
and on this part, Nds = (0,h)dt.
Let's now compute the outward flux through this pair of sides. We get:

Notice that when you put these two parts together, which we can easily do since we have arranged the parameter ranges to be the same, we get a difference of Q values at different points in the integrand. But the difference is only in the y variable, and when h is very small, this difference in the Q values is well approximated by a partial derivative since there is a factor of h in the denomonator. This approximation becomes exact as we take the limit in which h tends to zero. And in this same limit, the partial derivative of Q that is under the integral sign stops depending on t, since t only enter in the form t*h, and this drops out in the limit in which h tends to zero. And of course if you integrate a constant in t from t=0 to t=1, you just get that constant. So in the limit, we get Qy(x0,y0) from the flux through these two sides.
The same argument shows that the flux through the other two sides contributes Px(x0,y0) to the density. We have just proved the following theorem:
div(F(x,y)) = Px(x,y) + Qy(x,y)
The condition about continuity is what guarantees that the limits we took
exist and behaved like we said. It is more than we need, but as we will see,
we need to be careful here. The continuity we are asuming is enough to make things right, and we won't worry about minimal conditions for validity of the theorem here.
An imediate corralary of this is the divergence theorem in the plane, also
sometimes called Green's theorem or Gauss's theorem:
Divergence and Flux Density
For any vector field F(x,y) = (P(x,y),Q(x,y))
such that P and Q have continuous first order
partial derivatives in x and y throughout the region being considered,
the outward flux of
F has the additive property, and the corresponding
flux density is the
divergence of F(x,y),
div(F(x,y)) , which is given by
The Divergence Theorem in the Plane
For any vector field F(x,y) = (P(x,y),Q(x,y))
such that P and Q have continuous first order
partial derivatives in x and y throughout the region being considered,
and any region A, the flux of F out
across C, the boundary of A, equals the area integral of div(F(x,y))
over A:

Indeed, for any additive quantity, you get the total
amount associated with a region A by integrating the
corresponding density over A. So once you know that the
divergence is the density of flux, the theorem is obvious.
Moreover, once
we saw that flux was additive, we knew we should compute the density. It
is like falling off a log once you see that you have the additive property.
So that was really the fundamental point. The insight that Gauss and Green
had, which is what led them to the theorem, was that they recognized
this point about the additivity. I'm not sure that many writers of calculus books
understand that this is the main point. Consequently, they make the
theorem look like magic or coincidence.
And again, the point of view is important not only for discoveing proofs of
new mathematical theorems, but also for applications. Once you think
of divergence as a density, looking for a relation between the divergence
of the electric field and the electric charge density is almost obvious...
O.K., so that's the divergence. What about the curl?
Let's introduce another quantity associated with a region A
and a vector field F , one that
involves tangential line integrals: namely,

where C is the boundary of A, oriented counter-clockwise.
We will call this the circulation of F around A. If F is a force field, then this is the amount of work done as you circulate around the boundary of A counter-clockwise. We will see another reason it is called circulation when we turn to the question of recognizing positve or negative curl by looking at a graph of F , but for now the main point is that circulation is additive, and therefore it too has a density, and therefore there will be an analog of the divergence theorem for work integrals.
The additivity is clear by the same argument as before: Consider a region A and partition it into two subregions B and C. Consider a vector field F(x,y), and compute the circulation of F around B and C.

As before, the greenish region is B, and the orangish region is C. The two together make up the region A.
The boundary between B and C comes up twice in the computation of the circulation around B and C, once for B and and once for C. But the tangent vector points in opposite directions these two times, so these contributions to the total circulation cancel out. What is left are just the pieces of the boundaries of B and C that are also on boundary of A and so the circulation around A equals the sum of the circulations B and C. Thus, circulation has the additive property.
Now the circulation density can be computed using the method we used for the flux density. The only difference is that P and Q enter in a different way because we work with Tds in place of Nds this time. Still, the same parameterizations can be used again, and the result is the following theorem:
curl(F(x,y)) = Qx(x,y) - Py(x,y)
The condition about continuity is what guarantees that the limits we took
exist and behaved like we said. It is more than we need, but as we will see,
we need to be careful here. The continuity we are asuming is enough to make things right, and we won't worry about minimal conditions for validity of the theorem here.
An imediate corralary of this is the Stoke's theorem in the plane, also
sometimes called Green's theorem:
Curl and Circulation Density
For any vector field F(x,y) = (P(x,y),Q(x,y))
such that P and Q have continuous first order
partial derivatives in x and y throughout the region being considered,
the circulation of
F has the additive property, and the corresponding
circulation density is the
curl of F(x,y),
curl(F(x,y)) , which is given by
Stoke's Theorem in the Plane
For any vector field F(x,y) = (P(x,y),Q(x,y))
such that P and Q have continuous first order
partial derivatives in x and y throughout the region being considered,
and any region A, the circulation of F
counter-clockwise
around C, the boundary ofA, equals the area integral of curl(F(x,y))
over A:
