In multivariable calculus courses, one usually first encounters the divergence as:
$$\displaystyle \nabla\cdot\vec{u} = \frac{\partial{u_x}}{\partial x} + \frac{\partial{u_y}}{\partial y} + \frac{\partial{u_z}}{\partial z} $$
where $\vec{u} = [u_x,\,u_y,\,u_z]$ in cartesian coordinates. Then one learns the divergence theorem:
$$\displaystyle \int_\Omega \nabla\cdot\vec{u}\, dx = \int_{\partial\Omega} \vec{u}\cdot\hat{n}\,dS $$
and then, since cylindrical coordinates have usually already been introduced, one is told or asked to prove that:
$$\displaystyle \nabla\cdot\vec{u} = \frac{\partial{u_x}}{\partial x} + \frac{\partial{u_y}}{\partial y} + \frac{\partial{u_z}}{\partial z} = \frac{1}{r}\frac{\partial}{\partial r}\left( r u_r \right) +\frac{1}{r}\frac{\partial u_{\theta}}{\partial\theta} + \frac{\partial u_z}{\partial z}$$
where $\vec{u} = [u_r,\,u_\theta,\,u_z]$ in cylindrical coordinates. This explanation exemplifies the typical approach based on the multivariate chain rule, with some extra material to help build intuition. As you can see from that page, there are several nuisances about the typical approach:
- The derivation hinges on how the divergence looks in cartesian
- There are lots of partial derivatives to keep track of
- One has to know some vector derivative identities
- It’s not really clear whether the resulting cylindrical formula will actually satisfy the divergence theorem (it’s common for students to get the impression that cartesian-divergence and cylindrical-divergence are two different things, and that the divergence theorem only applies to the former)
I want to expound further on the first point. Textbooks commonly present the cartesian formula as the definition of the divergence $\nabla\cdot$ because it gives students something concrete to calculate with. However, the concept of divergence does not involve any coordinate system – the number you get using the cartesian form is always exactly the same number you would get using the cylindrical form. This is why it is attractive to write $\nabla\cdot$, since it doesn’t mention any $x$ or $y$, $r$ or $\theta$ in accordance with the coordinate-system-independent nature of the divergence.
Well, if the cartesian formula isn’t the definition then what is? It is the divergence theorem. One should view the divergence theorem as defining an operator $\nabla\cdot$ in terms of how it interacts with a vector field $\vec{u}$ and a bounded region $\Omega$. This more abstract definition does not make its way into (undergraduate) textbooks because it does not lend itself to calculations, but it does make it vastly easier to derive the form of the divergence in a given coordinate system, as we shall now see.
The divergence theorem deals with integrated quantities, but we can extract the point value of the divergence by taking the limit of the average divergence over the domain $\Omega$ as the domain contracts to a point:
$$\displaystyle D = \nabla\cdot\vec{u}(x) = \lim_{\Omega \to \{x\}} \frac{1}{|\Omega|} \int_{\Omega} \nabla\cdot\vec{u}\,dx = \lim_{\Omega \to \{x\}} \frac{1}{|\Omega|} \int_{\partial\Omega} \vec{u}\cdot\hat{n}\,dS $$
Since $\Omega$ is contracting to a point anyway, we may as well choose it to be something convenient. We will make it be the region bounded by surfaces where one coordinate is constant. For the case of cylindrical coordinates, this means the annular sector:
$$\displaystyle\begin{aligned} r_1 &\le r \le r_2 = r_1+\Delta r \\ \theta_1 &\le \theta \le \theta_2 = \theta_1+\Delta\theta \\ z_1 &\le z \le z_2 = z_1 + \Delta z \end{aligned}$$
We will let $\Delta r, \Delta\theta, \Delta z \to 0$. Now the task is to rewrite the surface integral on the right-hand side of the limit as iterated integrals over the faces of our sector:
$$\displaystyle D = I_1+ I_2 + I_3 $$
$$\displaystyle \begin{aligned} I_1 &= \lim_{\Delta r, \Delta\theta, \Delta z \to 0}\frac{1}{r_1\Delta r\Delta\theta\Delta z}\int_{r_1}^{r_2} \int_{\theta_1}^{\theta_2} \hat{z}\cdot\vec{u}(r,\theta,z_2) – \hat{z}\cdot\vec{u}(r,\theta,z_1) \,rd\theta dr \\ &= \frac{\partial u_z}{\partial z} \end{aligned}$$
$$\displaystyle \begin{aligned} I_2 &= \lim_{\Delta r, \Delta\theta, \Delta z \to 0}\frac{1}{r_1\Delta r\Delta\theta\Delta z} \int_{r_1}^{r_2} \int_{z_1}^{z_2} \hat{\theta}(\theta_2)\cdot\vec{u}(r,\theta_2,z) – \hat{\theta}(\theta_1)\cdot\vec{u}(r,\theta_1,z) \,dz dr \\ &= \frac{1}{r_1}\left( \frac{\partial u_\theta}{\partial\theta} \right) \end{aligned} $$
$$\displaystyle \begin{aligned} I_2 &= \lim_{\Delta r, \Delta\theta, \Delta z \to 0}\frac{1}{r_1\Delta r\Delta\theta\Delta z} \int_{\theta_1}^{\theta_2} \int_{z_1}^{z_2} r_2\hat{r}(r_2)\cdot\vec{u}(r_2,\theta,z) – r_1\hat{r}(r_1)\cdot\vec{u}(r_1,\theta,z) \,dzd\theta \\ &= \frac{1}{r_1}\frac{\partial}{\partial r}\left( ru_r \right) \end{aligned}$$
So we very quickly get that:
$$\nabla\cdot\vec{u} = \frac{1}{r}\frac{\partial}{\partial r}\left( ru_r \right) + \frac{1}{r}\frac{\partial u_\theta}{\partial\theta} + \frac{\partial u_z}{\partial z} $$
and all we had to know was how to set up some surface integrals, and to recognize when the limits become partial derivatives. This method can also be used to get the gradient and curl, and with any coordinate system for which you know how to set up the surface integrals.