Cf. [[Linear Regression Methods#^cd8b6d|ridge being equivalent to OLS with added noise features]], and [quantymacro's article on "ridge RF"](https://www.quantymacro.com/when-a-ridge-regression-maxi-meets-random-forest/): adding noise features to a random forest further de-correlates the trees and spreads out the weights in the [[Linear Smoothers#^182099|decision trees as linear smoothers]] view.

Idea: instead of hard-coding random features (which is space-intensive anyway), set a threshold at split time: if the best actual feature produces a worse fit than the best of a few splits on random noise, just do a random split instead of using any real feature or injected noise.
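A minimal sketch of that split rule for a regression tree, assuming a variance-reduction gain and extra-trees-style random thresholds (all function names, the number of noise features, and the thresholds-per-feature count are my own assumptions, not from the article):

```python
import numpy as np

def split_gain(x, y, threshold):
    """Variance reduction from splitting y by x <= threshold."""
    left, right = y[x <= threshold], y[x > threshold]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    n = len(y)
    return y.var() - (len(left) / n) * left.var() - (len(right) / n) * right.var()

def best_split_for_feature(x, y, rng, n_thresholds=8):
    """Best (gain, threshold) over a few random thresholds, extra-trees style."""
    thresholds = rng.uniform(x.min(), x.max(), size=n_thresholds)
    gains = [split_gain(x, y, t) for t in thresholds]
    i = int(np.argmax(gains))
    return gains[i], float(thresholds[i])

def choose_split(X, y, rng, n_noise=3):
    """Pick the best real-feature split, but fall back to a purely random
    split if it cannot beat the best of a few injected-noise features."""
    n, d = X.shape
    real = [best_split_for_feature(X[:, j], y, rng) for j in range(d)]
    j_best = int(np.argmax([g for g, _ in real]))
    best_gain, best_thr = real[j_best]
    # benchmark: best gain achievable by pure Gaussian noise features
    noise_gain = max(
        best_split_for_feature(rng.standard_normal(n), y, rng)[0]
        for _ in range(n_noise)
    )
    if best_gain > noise_gain:
        return j_best, best_thr  # the real feature carries signal: use it
    # otherwise do a fully random split (random feature, random threshold),
    # rather than storing noise columns or trusting a spurious feature
    j = int(rng.integers(d))
    x = X[:, j]
    return j, float(rng.uniform(x.min(), x.max()))
```

The noise columns are generated on the fly and discarded, so nothing extra is stored per tree; only the gain comparison survives into the split decision.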