You may have heard of Numerai, a decentralized hedge fund whose investment decisions and payouts are driven by model predictions that anyone (really anyone) can submit to the company. Even if you have not, it is worth knowing that the baseline model they provide to every user is a Gradient Boosted Decision Tree (GBDT). As it happens, this simple baseline consistently outperforms many of the individual user-built models (according to Richard Craib, founder of Numerai). This, along with other opportunities to use GBDTs effectively, motivated me to take a closer look at this modeling approach.
-
-
The original post was supposed to be about gradient boosted decision trees. But before delving into that, I realized a quick post on decision trees might be helpful. This post is a bit different from the others, as it mostly comments on a simplified version of the decision tree implementation in the numpy-ml package.
-
This post builds on Survival Analysis I - Introduction and Survival Analysis II - Kaplan-Meier Estimator to illustrate the Cox Proportional Hazards (Cox PH) model.
-
This post builds on the survival analysis introduction and delves deeper into the “how” and “why” of the Kaplan-Meier (KM) estimator.
-
I recently gave a short intro to survival models to the team. Here, the goal is to lay out the basics and motivate the use of survival analysis. Part II covers the Kaplan-Meier Estimator and Part III focuses on the Cox Proportional Hazards Model.