Reading
- balanced classification: the number of examples in each class is roughly equal -> receiver operating characteristic (ROC) curve
- class-imbalanced problems: the number of examples in one class is much larger than in the others (e.g., fraud detection, where only 1% of transactions are fraudulent), ranking problems, or multilabel classification -> precision and recall, as well as a weighted form of accuracy or ROC AUC; see the metric sketch below
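A minimal sketch of computing these metrics, assuming scikit-learn is available; `y_true` and `y_score` are hypothetical labels and predicted scores:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score, roc_auc_score

# Hypothetical imbalanced labels (1 = fraud) and model scores in [0, 1]
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_score = np.array([0.1, 0.2, 0.05, 0.3, 0.15, 0.4, 0.1, 0.6, 0.8, 0.35])
y_pred = (y_score >= 0.5).astype(int)  # threshold scores into hard labels

print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("ROC AUC:  ", roc_auc_score(y_true, y_score))   # threshold-independent
```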
Develop the Model
1. Vectorization (to tensors of float32); see the preprocessing sketch after this list
2. Normalization (scale values to roughly 0-1)
3. Validation
a. K-fold cross-validation -> when you have too few samples for a reliable held-out validation set (you reshuffle the data into K folds and retrain once per fold); see the K-fold sketch after this list
b. Iterated K-fold validation -> performing highly accurate model evaluation when little data is available
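A minimal preprocessing sketch for steps 1-2, assuming NumPy arrays of raw uint8 pixel data (`raw` and `y` are hypothetical names):

```python
import numpy as np

# Hypothetical raw data: uint8 pixel values in [0, 255], plus binary labels
raw = np.random.randint(0, 256, size=(1000, 28, 28), dtype=np.uint8)
y = np.random.randint(0, 2, size=(1000,))

x = raw.reshape((1000, 28 * 28))   # vectorize: flatten each sample into a vector
x = x.astype("float32") / 255.0    # cast to float32 and scale into [0, 1]
```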
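And a minimal K-fold validation loop, reusing `x`/`y` from the sketch above and assuming a hypothetical `build_model()` that returns a freshly compiled Keras model:

```python
import numpy as np

k = 4
fold_size = len(x) // k
val_scores = []

for fold in range(k):
    # Hold out one fold for validation, train on the remaining folds
    val_x = x[fold * fold_size:(fold + 1) * fold_size]
    val_y = y[fold * fold_size:(fold + 1) * fold_size]
    train_x = np.concatenate([x[:fold * fold_size], x[(fold + 1) * fold_size:]])
    train_y = np.concatenate([y[:fold * fold_size], y[(fold + 1) * fold_size:]])

    model = build_model()  # hypothetical: returns a fresh, compiled model
    model.fit(train_x, train_y, epochs=10, batch_size=32, verbose=0)
    val_scores.append(model.evaluate(val_x, val_y, verbose=0))

print("mean validation score:", np.mean(val_scores, axis=0))
```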
Note:
- Take small values (roughly 0-1)
- Homogeneous (all features in roughly the same range)
- Consider replacing a missing value in a feature with the median/average instead of 0 to avoid a discontinuity; see the imputation sketch after this list
- Overfit first to the training set, evaluate validation data, modify model, retrain, evaluate validation data, repeat
- Overfit by adding layers, making layers bigger, and training for more epochs
- Be mindful! Every time we tune based on a validation evaluation, it LEAKS info from the validation process into the model. Don’t do it too many times! Otherwise, it may cause overfitting to the validation process. If this happens, you may want to switch to K-fold validation
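A minimal imputation sketch for the missing-value note, assuming the feature matrix is a NumPy float array with NaN marking missing entries:

```python
import numpy as np

# Hypothetical feature matrix with missing entries marked as NaN
features = np.array([[1.0, np.nan], [2.0, 4.0], [3.0, 6.0]])

# Replace each missing value with its column's median
# (in practice, compute the medians on the training data only)
medians = np.nanmedian(features, axis=0)
rows, cols = np.where(np.isnan(features))
features[rows, cols] = medians[cols]
```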
Variables to take note of:
- Feature engineering -> feature selection
- Architecture -> what type of model are you using?
- Training configuration -> loss function? Batch size? Learning rate? (see the sketch below)
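A minimal sketch of where those training-configuration knobs live in Keras (the architecture itself is a hypothetical two-layer example):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Architecture: a hypothetical small stack of Dense layers
model = keras.Sequential([
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Training configuration: loss function and learning rate
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)

# Batch size is set at fit time (x/y are the hypothetical training arrays)
# model.fit(x, y, batch_size=32, epochs=10)
```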
Deploy the Model
False negative -> e.g., fraudulent transactions that are missed
False positive -> e.g., valid transactions marked as fraud
Methods:
- REST API: be mindful that the app doesn’t have a strict latency requirement (~500 ms is acceptable) and that the input data sent isn’t sensitive (it will be in decrypted form on the server for the model to see)
- Install TensorFlow on a server/cloud instance (build with Flask or another Python web framework, Django, or TF Serving)
- Query the model via the REST API; see the Flask sketch after this list
- On a device: CPU, microcontroller; for when the model needs to run in a low-connectivity environment, under strict latency constraints, with small size / memory constraints, doesn’t need to be highly accurate, or the input data is sensitive
- Deploy with TF Lite; see the conversion sketch after this list
- Browser / JS apps (directly): offload compute to the client, reduce server cost; for when the input data needs to stay on the client (sensitive), strict latency constraints, low connectivity, and the model is small (similar to on-device, but in the browser/JS app)
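A minimal Flask serving sketch for the REST API option, assuming a saved Keras model at a hypothetical path `model.keras`:

```python
import numpy as np
from flask import Flask, request, jsonify
from tensorflow import keras

app = Flask(__name__)
model = keras.models.load_model("model.keras")  # hypothetical saved model

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"inputs": [[...feature values...]]}
    inputs = np.array(request.get_json()["inputs"], dtype="float32")
    preds = model.predict(inputs)
    return jsonify({"predictions": preds.tolist()})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```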
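And the on-device path: converting a Keras model with the TF Lite converter (a sketch; `model` is the hypothetical Keras model from above):

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Write the flatbuffer to disk for deployment on the device
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```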
Inference Model Optimization
- Weight pruning: reduces memory footprint because you reduce the number of parameters in the layers of the model
- Weight quantization: convert the float32 weights to int8 (a quarter of the size, with little loss in accuracy!); see the sketch below
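A minimal post-training quantization sketch using the TF Lite converter (this enables dynamic-range weight quantization; full int8 quantization would additionally require a representative dataset):

```python
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)  # model as above
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize weights
quantized_tflite_model = converter.convert()
```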