The continuous nature of the basis function will give us a gentle transition from 0 to 1 rather than a hard step.

Learning is similar to regression: for an M-dimensional input, we learn M trainable parameters (plus a bias), so training stays fast even in high dimensions. As in regression, the parameters are typically fit with the gradient descent (GD) algorithm. Note that logistic regression also suffers from over-fitting if the training dataset is perfectly linearly separable: trained on such data, the logistic sigmoid becomes very steep, approaching a [Heaviside step function](https://en.wikipedia.org/wiki/Heaviside_step_function). We should therefore add a regularization term to the error function that GD minimizes, penalizing w for growing to very large values.
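Below is a minimal sketch of such a training loop, assuming plain NumPy and hypothetical names (`train_logistic_regression`, `lr`, `l2`) that do not come from any particular library: it fits w and b by gradient descent on the cross-entropy error, with an L2 penalty that keeps w from blowing up on perfectly separable data.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: maps any real value smoothly into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, lr=0.1, n_iters=2000, l2=0.01):
    """Fit weights w and bias b by gradient descent on the
    L2-regularized cross-entropy error (hypothetical helper, for illustration)."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        p = sigmoid(X @ w + b)                      # predicted probabilities
        error = p - y                               # gradient of cross-entropy w.r.t. the logit
        grad_w = X.T @ error / n_samples + l2 * w   # L2 term penalizes large w
        grad_b = error.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy, perfectly separable data: without the l2 penalty the weights would keep
# growing and the fitted sigmoid would approach a Heaviside step.
X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.array([0.0, 0.0, 1.0, 1.0])
w, b = train_logistic_regression(X, y)
print("w =", w, "b =", b)
print("p =", sigmoid(X @ w + b))
```

In practice a library implementation such as scikit-learn's `LogisticRegression` (which applies L2 regularization by default) is the usual choice; the sketch above only illustrates the mechanics.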
## Additional references

### Useful posts

[Logistic Regression](https://ml-cheatsheet.readthedocs.io/en/latest/logistic_regression.html)

[A Gentle Introduction to Cross-Entropy](https://machinelearningmastery.com/cross-entropy-for-machine-learning/)

[Loss functions](https://cs231n.github.io/neural-networks-2/#losses)

[The cross-entropy cost function](http://neuralnetworksanddeeplearning.com/chap3.html#the_cross-entropy_cost_function)

[Cross-validation: evaluating estimator performance](https://scikit-learn.org/stable/modules/cross_validation.html?highlight=repeatedkfold)

[L1 vs. L2 Loss function](http://rishy.github.io/ml/2015/07/28/l1-vs-l2-loss/)

[Entropy: How Decision Trees Make Decisions](https://towardsdatascience.com/entropy-how-decision-trees-make-decisions-2946b9c18c8)

[ROC Curves and Precision-Recall Curves](https://machinelearningmastery.com/roc-curves-and-precision-recall-curves-for-classification-in-python/)

[Information Gain and Mutual Information](https://machinelearningmastery.com/information-gain-and-mutual-information/)

[Gradient Boosting & XGBoost](https://www.shirin-glander.de/2018/11/ml_basics_gbm/)

[What’s considered a good Log Loss](https://medium.com/@fzammito/whats-considered-a-good-log-loss-in-machine-learning-a529d400632d)

[On log loss](https://stats.stackexchange.com/questions/276067/whats-considered-a-good-log-loss/395774)

[Decision Trees and Random Forests in Python](https://nickmccullum.com/python-machine-learning/decision-trees-random-forests-python/)

### Additional lecture notes

[Statistical Learning Theory -notes](https://ocw.mit.edu/courses/mathematics/18-657-mathematics-of-machine-learning-fall-2015/lecture-notes/MIT18_657F15_L2.pdf)

[Logistic Regression -notes](https://ocw.mit.edu/courses/sloan-school-of-management/15-097-prediction-machine-learning-and-statistics-spring-2012/lecture-notes/MIT15_097S12_lec09.pdf)

[Decision Trees -notes](https://ocw.mit.edu/courses/sloan-school-of-management/15-097-prediction-machine-learning-and-statistics-spring-2012/lecture-notes/MIT15_097S12_lec08.pdf)

[Boosting -notes](https://ocw.mit.edu/courses/sloan-school-of-management/15-097-prediction-machine-learning-and-statistics-spring-2012/lecture-notes/MIT15_097S12_lec10.pdf)

[Convex optimization -notes](https://ocw.mit.edu/courses/mathematics/18-657-mathematics-of-machine-learning-fall-2015/lecture-notes/MIT18_657F15_L11.pdf)
### Selected articles
[Classifying post-traumatic stress disorder](https://www.nature.com/articles/s41598-020-62713-5)