Ben Chuanlong Du's Blog

It is never too late to learn.

AI Learning

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

[Figure: Machine Learning Algorithms] The picture comes from Machine Learning Algorithms Mindmap.

Feature Engineering

Handling Categorical Variables in Machine Learning

Regularization in Machine Learning Models

Ensemble

Frameworks

Libraries for Gradient Boosting

Big-data (Spark) Friendly Frameworks

https://mmlspark.blob.core.windows.net/website/index.html

AutoML

Questions

Random Forest

  1. Are discrete variables easier to handle than continuous variables (in random forest)? Is there any advantage to discretizing variables? The essential question is how categorical variables are handled in RF. Does RF use categorical variables directly, or does it have to convert them to numerical values somehow?

  2. Random forest has a way to impute missing values. What if I treat missing values in categorical predictors as a new class? It sounds like a good ...
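On question 2, here is a minimal sketch of what "missing as a new class" could look like in practice. It is only an illustration, not how any particular random forest library does it internally. Some implementations (scikit-learn's, for example) expect numeric input, so the sketch also one-hot encodes the column. The label `__missing__` and the helper `one_hot_with_missing` are hypothetical names made up for this note.

```python
import numpy as np

MISSING = "__missing__"  # hypothetical label for the extra "missing" class


def one_hot_with_missing(values):
    """One-hot encode a categorical column, treating None/NaN as its own class."""
    cleaned = [
        MISSING if v is None or (isinstance(v, float) and np.isnan(v)) else str(v)
        for v in values
    ]
    # Sort the distinct labels so the column order is deterministic.
    categories = sorted(set(cleaned))
    index = {c: i for i, c in enumerate(categories)}
    encoded = np.zeros((len(cleaned), len(categories)), dtype=int)
    for row, c in enumerate(cleaned):
        encoded[row, index[c]] = 1
    return categories, encoded


# Toy column with two missing entries (None and NaN).
colors = ["red", None, "blue", "red", float("nan")]
categories, X = one_hot_with_missing(colors)
print(categories)
print(X)
```

The resulting matrix `X` is numeric and has one indicator column for the missing class, so a tree can split on "is missing" like on any other category.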

Imputation

  1. mean, median, etc.

  2. SVD imputation: use a low-dimensional (low-rank) approximation of the high-dimensional data to fill in the missing entries
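The two ideas above can be sketched with NumPy as follows. This is a rough illustration under my own assumptions: missing entries are encoded as NaN, the SVD variant is the simple iterative scheme (start from column means, then repeatedly overwrite the missing cells with a truncated-SVD reconstruction), and the names `simple_impute`, `svd_impute`, `rank` and `n_iter` are hypothetical. It is not the API of any library.

```python
import numpy as np


def simple_impute(X, stat=np.nanmean):
    """Fill NaNs with a per-column statistic (np.nanmean or np.nanmedian)."""
    X = np.asarray(X, dtype=float)
    col_stats = stat(X, axis=0)  # one value per column, ignoring NaNs
    return np.where(np.isnan(X), col_stats, X)


def svd_impute(X, rank=1, n_iter=50):
    """Iteratively fill NaNs using a rank-`rank` SVD approximation."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = simple_impute(X)  # initial guess: column means
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Only the missing cells are updated; observed values stay fixed.
        filled[missing] = low_rank[missing]
    return filled


# Toy data: a rank-1 matrix (outer product) with one entry removed.
truth = np.outer([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
X = truth.copy()
X[0, 2] = np.nan

mean_filled = simple_impute(X)
svd_filled = svd_impute(X, rank=1)
print(mean_filled[0, 2], svd_filled[0, 2], truth[0, 2])
```

On this toy matrix the data really is low rank, which is the situation where the SVD approach should help most. On real data the choice of `rank` matters and needs some validation.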

Tips on Kaggle

Machine Learning Resources

AI Tools

https://openai.com/blog/dall-e/

References
