Two models are applied to a data set that has been partitioned into training and validation sets. Model A is considerably more accurate than Model B on the training data, but slightly less accurate than Model B on the validation data.

a. Which model are you more likely to consider for final deployment?
b. Explain, as concisely as possible, why Model A is considerably more accurate than Model B on the training data but slightly less accurate than Model B on the validation data.

Answer:

Model B should be deployed.

Explanation:

(a) I would deploy the model that generalizes best, that is, the one that performs better on the validation data while still performing reasonably well on the training data. Hence, Model B should be deployed: its small gap between training and validation accuracy indicates lower VARIANCE, so it avoids the overfitting problem that affects Model A.
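
The selection rule in (a) can be sketched in a few lines of Python. The accuracy figures are hypothetical, chosen only to mirror the pattern described in the question, and the helper names `pick_for_deployment` and `generalization_gap` are made up for this illustration.

```python
# Hypothetical accuracies (not from the question), chosen to match its pattern:
# Model A is far better on training data, slightly worse on validation data.
scores = {
    "Model A": {"train": 0.99, "validation": 0.86},
    "Model B": {"train": 0.90, "validation": 0.88},
}


def pick_for_deployment(scores):
    """Return the name of the model with the highest validation accuracy."""
    return max(scores, key=lambda name: scores[name]["validation"])


def generalization_gap(entry):
    """Training accuracy minus validation accuracy; a large gap hints at overfitting."""
    return entry["train"] - entry["validation"]


chosen = pick_for_deployment(scores)
print("Deploy:", chosen)
for name, entry in scores.items():
    print(name, "gap:", round(generalization_gap(entry), 2))
```

Training accuracy is deliberately ignored when choosing, because only the validation score estimates how the model will behave on data it has not seen.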

(b) Model A does well on the training data set but not as well on the validation data set because it has OVERFITTED the training set: it has learned noise and quirks specific to the training data that do not carry over to new data.

OVERFITTING is a scenario where the model has high variance and low bias; that is, it represents the training set almost perfectly but generalizes poorly to the validation set. For instance, this can happen when a neural network has too many hidden layers and no regularization or dropout.
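
To make this concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption for illustration: the data are synthetic, the true relationship is taken to be y = x, the "noise" is a deterministic alternating pattern so the run is reproducible, and polynomials stand in for the neural network mentioned above. It reports mean squared error rather than accuracy, so lower error corresponds to higher accuracy.

```python
import numpy as np

# Synthetic training data (assumption): true relationship y = x plus alternating noise.
x_train = np.linspace(-1.0, 1.0, 10)
noise = 0.5 * (-1.0) ** np.arange(10)  # +0.5, -0.5, +0.5, ... (deterministic)
y_train = x_train + noise

# Validation points lie midway between training points and follow the true relationship.
x_val = (x_train[:-1] + x_train[1:]) / 2
y_val = x_val


def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))


# Model A: degree-9 polynomial, flexible enough to pass through all 10 training points.
model_a = np.polyfit(x_train, y_train, deg=9)
# Model B: straight line, too simple to memorize the noise.
model_b = np.polyfit(x_train, y_train, deg=1)

results = {
    "A": {"train": mse(model_a, x_train, y_train), "val": mse(model_a, x_val, y_val)},
    "B": {"train": mse(model_b, x_train, y_train), "val": mse(model_b, x_val, y_val)},
}
for name, r in results.items():
    print(f"Model {name}: train MSE = {r['train']:.4f}, validation MSE = {r['val']:.4f}")
```

In this sketch Model A fits the training points almost exactly yet has a much larger validation error than Model B. The gap is exaggerated compared with the "slightly less accurate" case in the question, but the mechanism is the same: the flexible model has fitted the noise.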

OVERFITTING is a common scenario in model development.
