HBOS (and probably others) model doesn't need decision_scores_ and labels_ attributes
Created by: Fed29
Hello,
I'm working on a scenario where I have to train an HBOS model on past data, save the model, and then use it to find anomalies in new (unseen) data.
So during HBOS training I don't need to store the `decision_scores_` and `labels_` attributes of the training data inside the model object.
My suggestion is to skip initializing the `decision_scores_` and `labels_` attributes. A user who needs them can still run `decision_function` and `predict` on the training data (and save the results in another structure outside the model).
This approach lets us save a very small model (in terms of memory). For example, with HBOS trained on 600k samples, `decision_scores_` and `labels_` each take about 5 MB, which gives a 10 MB pkl file. Removing them leaves a 2 KB pkl file.
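To illustrate why the per-sample attributes dominate the file size, here is a minimal sketch. It does not use pyod: `ToyHistogramDetector` and its `keep_training_scores` flag are hypothetical stand-ins I made up for this example, and the numbers it prints are not the ones from my HBOS run.

```python
import pickle
import random
from array import array


class ToyHistogramDetector:
    """Hypothetical 1-D histogram detector, only a stand-in for HBOS."""

    def __init__(self, n_bins=10, contamination=0.1, keep_training_scores=True):
        self.n_bins = n_bins
        self.contamination = contamination
        self.keep_training_scores = keep_training_scores

    def _bin(self, value):
        # Clip so unseen values outside the training range fall in an edge bin.
        index = int((value - self.lo_) / self.width_)
        return max(0, min(self.n_bins - 1, index))

    def fit(self, values):
        n = len(values)
        self.lo_ = min(values)
        self.width_ = (max(values) - self.lo_) / self.n_bins or 1.0
        counts = [0] * self.n_bins
        for value in values:
            counts[self._bin(value)] += 1
        # Rarer bins get higher outlier scores.
        self.bin_scores_ = [n / (count + 1) for count in counts]
        scores = self.decision_function(values)
        cut = min(n - 1, int(n * (1 - self.contamination)))
        self.threshold_ = sorted(scores)[cut]
        if self.keep_training_scores:
            # These two arrays grow linearly with the training set.
            self.decision_scores_ = array("d", scores)
            self.labels_ = array("q", (int(s > self.threshold_) for s in scores))
        return self

    def decision_function(self, values):
        return [self.bin_scores_[self._bin(value)] for value in values]

    def predict(self, values):
        return [int(s > self.threshold_) for s in self.decision_function(values)]


random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(20_000)]

full = ToyHistogramDetector(keep_training_scores=True).fit(train)
slim = ToyHistogramDetector(keep_training_scores=False).fit(train)

# Pickle the fitted state (the attribute dict); it dominates a pickled model.
full_size = len(pickle.dumps(vars(full)))
slim_size = len(pickle.dumps(vars(slim)))
print(f"with training scores: {full_size} bytes, without: {slim_size} bytes")

# Training labels can still be recomputed and kept outside the model.
train_labels = slim.predict(train)
```

The slim model keeps only the bin scores and the threshold, so its size is independent of the number of training samples, while the labels on the training data stay available through `predict`.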
Furthermore:
- `labels_` is used only in the `fit_predict` method, which is deprecated.
- In the `predict` method there's `check_is_fitted(self, ['decision_scores_', 'threshold_', 'labels_'])`, but neither `decision_scores_` nor `labels_` is used.
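As a rough sketch of the second point, the fitted check could list only what `predict` actually reads. The code below is plain Python, not pyod or scikit-learn: `require_fitted` is a hypothetical stand-in for `check_is_fitted`, and `ThresholdDetector` is a toy model invented for the example.

```python
class NotFittedError(ValueError):
    """Raised when predict is called before fit."""


def require_fitted(model, attributes):
    # Hypothetical stand-in for check_is_fitted: each listed attribute must exist.
    missing = [name for name in attributes if not hasattr(model, name)]
    if missing:
        raise NotFittedError(f"model is not fitted, missing: {missing}")


class ThresholdDetector:
    """Toy detector: flags values above a quantile learned during fit."""

    def fit(self, values, contamination=0.1):
        ranked = sorted(values)
        cut = min(len(ranked) - 1, int(len(ranked) * (1 - contamination)))
        self.threshold_ = ranked[cut]
        return self  # no decision_scores_ or labels_ are stored

    def predict(self, values):
        # Only the attribute that predict really uses is required here.
        require_fitted(self, ["threshold_"])
        return [int(value > self.threshold_) for value in values]


detector = ThresholdDetector().fit([1.0, 2.0, 3.0, 4.0, 100.0], contamination=0.2)
new_labels = detector.predict([2.5, 500.0])
print(new_labels)
```

With a check like this, a model fitted without the training-data attributes still passes the fitted test, and an unfitted model is still rejected.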