One way to evaluate a classifier is accuracy, i.e. the percentage of correctly classified observations.
Downside:
Say your test is to identify whether a patient suffers from a rare disease found in 1% of the population.
If your test is 99% accurate, is your test doing well?
Maybe, maybe not.
If my test simply classifies all patients as negative, it achieves 99% accuracy, yet I can hardly say it is a good test.
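This accuracy paradox is easy to demonstrate. A minimal sketch, using hypothetical numbers that match the 1%-prevalence example above:

```python
# An all-negative "classifier" on a population with 1% disease prevalence.
n_patients = 1000
actual = [1] * 10 + [0] * 990      # 10 positives = 1% prevalence
predicted = [0] * n_patients        # always predict "no disease"

correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / n_patients
print(accuracy)  # 0.99 -- yet the test catches zero sick patients
```

The classifier scores 99% accuracy while detecting none of the positives, which is exactly why accuracy alone can mislead.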
To get around this, we can use precision and recall.
From Wikipedia:
Precision is the probability that a (randomly selected) retrieved document is relevant.
Recall is the probability that a (randomly selected) relevant document is retrieved in a search.
The F1 score combines the two into a single number: the harmonic mean of precision and recall.
F1 = 2*prec*rec/(prec+rec)
Example
When a search engine returns 30 pages, only 20 of which are relevant, while failing to return 40 additional relevant pages, its precision is 20/30 = 2/3 while its recall is 20/60 = 1/3.
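The search-engine numbers above can be checked directly. A quick sketch using exact fractions:

```python
from fractions import Fraction

retrieved = 30            # pages the engine returned
relevant_retrieved = 20   # returned pages that were relevant
missed_relevant = 40      # relevant pages it failed to return
total_relevant = relevant_retrieved + missed_relevant  # 60

precision = Fraction(relevant_retrieved, retrieved)     # 2/3
recall = Fraction(relevant_retrieved, total_relevant)   # 1/3
f1 = 2 * precision * recall / (precision + recall)      # 4/9
print(precision, recall, f1)  # 2/3 1/3 4/9
```

Note that F1 = 4/9 ≈ 0.44 sits below the arithmetic mean of 2/3 and 1/3 (which is 1/2): the harmonic mean penalizes the imbalance between precision and recall.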