Gauss-Newton approximates the Hessian with $\hat{H}:= J^{T}J$, where $J:=\nabla r$ is the Jacobian of the residual, i.e. it first linearizes the residual, $\hat{r}(x+s):= r(x)+J(x)s\approx r(x+s)$, then computes the Hessian of $\hat{f}:= \frac{1}{2}\hat{r}^{T}\hat{r}$. The GN step is then $-\hat{H}^{-1}\nabla f(x)$.
- The error in approximating the Hessian is $H-\hat{H}=\sum_{j}r_{j}(x)\nabla^{2}r_{j}(x)$, which is nonzero in general unless $r(x)=\mathbf{0}$ (or each $r_{j}$ is affine). In particular, around the optimizer $x^{\ast}$, unless $r(x^{\ast})=\mathbf{0}$, the iterate's error has Taylor expansion $x_{k+1}-x^{\ast}=\psi_{GN}(x_{k})-\psi_{GN}(x^{\ast})=J_{GN}(x^{\ast})(x_{k}-x^{\ast})+O(\| x_{k}-x^{\ast} \|^{2})$, where $J_{GN}$ is the Jacobian of the GN update map $\psi_{GN}(x):= x-\hat{H}(x)^{-1}\nabla f(x)$, i.e. $J_{GN}(x^{\ast})=-\hat{H}(x^{\ast})^{-1}\sum_{j}r_{j}(x^{\ast})\nabla^{2}r_{j}(x^{\ast})$, so the iteration converges only linearly, at a rate governed by this matrix.

This is a further simplification of the Newton iterate's quadratic model $f(x)+s^{T}\nabla f(x)+\frac{1}{2}s^{T}(\nabla^{2}f(x))s$, in the sense that GN first linearizes, then applies Newton. The linearization makes the Hessian easier to compute (only first derivatives of $r$ are needed), but sacrifices quadratic convergence (unless $r(x^{\ast})=\mathbf{0}$).
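As a concrete illustration (not from the note), here is a minimal NumPy sketch of the GN iteration on a hypothetical exponential-fitting problem; `residual` and `jacobian` are made-up example functions. The data is chosen so that $r(x^{\ast})\neq\mathbf{0}$, so the printed gradient norms shrink only linearly rather than quadratically.

```python
# Minimal Gauss-Newton sketch on a toy nonzero-residual problem (assumed example).
import numpy as np

t = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 0.3, 0.2])   # no exact fit y = a*exp(-b*t), so r(x*) != 0

def residual(x):
    # r_j(x) = a*exp(-b*t_j) - y_j, with x = (a, b)
    return x[0] * np.exp(-x[1] * t) - y

def jacobian(x):
    # J = dr/dx, the 3x2 Jacobian of the residual
    e = np.exp(-x[1] * t)
    return np.column_stack([e, -x[0] * t * e])

def gauss_newton(x, iters=15):
    for _ in range(iters):
        r, J = residual(x), jacobian(x)
        grad = J.T @ r                         # gradient of f = 0.5 * r^T r
        H_hat = J.T @ J                        # GN approximation: drops sum_j r_j * Hess(r_j)
        x = x - np.linalg.solve(H_hat, grad)   # GN step: -H_hat^{-1} grad f
        print(np.linalg.norm(grad))            # shrinks by roughly a constant factor per step
    return x

x_star = gauss_newton(np.array([1.0, 1.0]))
```

Swapping `H_hat` for the full Hessian $J^{T}J+\sum_{j}r_{j}\nabla^{2}r_{j}$ would recover Newton's quadratic convergence, at the cost of computing second derivatives of $r$.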