Recall
In machine learning and data analysis, recall is a metric used to evaluate the performance of a classification model or algorithm. It is defined as the proportion of actual positive cases in the data that the model correctly predicts as positive.
Recall is often used in combination with another metric called precision, which measures the proportion of the model's positive predictions that are actually positive. Together, precision and recall show how well a model identifies the positive class and which kind of error it tends to make: missed positives or false alarms.
Recall is particularly useful when the cost of making a false negative prediction is high, or when the goal of the model is to identify as many positive cases as possible among a large number of negative cases. For example, in a spam filter, it might be more important to have a high recall rate, even if it comes at the cost of a lower precision rate, since false negative predictions could result in spam emails being delivered to a user's inbox.
To calculate recall, you can use the following formula:
Recall = True Positives / (True Positives + False Negatives)
Here, True Positives is the number of actual positive cases that the model correctly predicts as positive, and False Negatives is the number of actual positive cases that the model incorrectly predicts as negative.
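The formula can be illustrated with a minimal Python sketch. The function name `recall`, the `positive` parameter, and the example labels are assumptions made for illustration; they do not come from any particular library.

```python
def recall(y_true, y_pred, positive=1):
    """Return TP / (TP + FN) for the label treated as positive."""
    pairs = list(zip(y_true, y_pred))
    # True positives: actual positives that were predicted positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False negatives: actual positives that were predicted as something else.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return tp / (tp + fn)


# Hypothetical labels: 4 actual positives, of which the model finds 3.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
example_recall = recall(y_true, y_pred)
print(example_recall)  # 3 / (3 + 1) = 0.75
```

Note that the false positive in the fifth position does not affect the result, because recall only considers the actual positive cases.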
Recall can be computed on a single dataset, or it can be averaged over multiple datasets to get a more comprehensive evaluation of the model's performance.
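Averaging over several datasets can be done in more than one way, and the choice affects the result. The sketch below, which uses hypothetical (True Positives, False Negatives) counts, compares the mean of per-dataset recall values with the recall obtained by pooling all counts first. Both variable names and the counts are assumptions made for illustration.

```python
# Hypothetical (true positives, false negatives) counts for three datasets.
datasets = [(8, 2), (1, 1), (45, 5)]

# Option 1: compute recall per dataset, then take the unweighted mean.
per_dataset = [tp / (tp + fn) for tp, fn in datasets]
mean_recall = sum(per_dataset) / len(per_dataset)

# Option 2: pool the counts across datasets, then compute recall once.
total_tp = sum(tp for tp, _ in datasets)
total_fn = sum(fn for _, fn in datasets)
pooled_recall = total_tp / (total_tp + total_fn)

print(f"per-dataset: {per_dataset}")
print(f"mean of recalls: {mean_recall:.3f}")
print(f"pooled recall:   {pooled_recall:.3f}")
```

The unweighted mean gives every dataset the same influence, so a small dataset with poor recall pulls the average down more than it does in the pooled figure, where large datasets dominate.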
Difference between recall and precision
In the context of machine learning, recall and precision are two measures of a model's performance in a binary classification problem.
Recall, also known as sensitivity or the true positive rate, is a measure of a model's ability to correctly identify all relevant instances. It is calculated as the number of true positive predictions divided by the number of true positive predictions plus the number of false negative predictions. A high recall value indicates that the model has a low false negative rate, meaning that it correctly identifies most of the positive instances.
Precision, on the other hand, is a measure of a model's ability to identify only relevant instances. It is calculated as the number of true positive predictions divided by the number of true positive predictions plus the number of false positive predictions. A high precision value indicates that few of the model's positive predictions are false positives, meaning that most of the instances it predicts to be positive are actually positive.
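The two definitions can be compared side by side in a short Python sketch. The helper names `confusion_counts` and `precision_recall` and the example labels are hypothetical, chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, and false negatives."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # predicted positive, actually positive
        elif p == positive and t != positive:
            fp += 1  # predicted positive, actually negative
        elif p != positive and t == positive:
            fn += 1  # predicted negative, actually positive
    return tp, fp, fn


def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall); assumes both denominators are nonzero."""
    tp, fp, fn = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fp), tp / (tp + fn)


# Hypothetical labels: 4 actual positives and 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
precision, recall_value = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall_value:.2f}")
```

In this example there are 3 true positives, 2 false positives, and 1 false negative, so precision is 3/5 and recall is 3/4. The same predictions score differently on the two metrics because each divides by a different total.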
In summary, recall is a measure of completeness and precision is a measure of correctness. A high recall and high precision are desirable, but sometimes a trade-off between them is needed, depending on the context of the problem and the costs of false positives and false negatives.
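One common way the trade-off appears is when a model outputs a score and a decision threshold turns that score into a positive or negative prediction. The sketch below assumes such a scoring model. The scores, labels, thresholds, and the function name `precision_recall_at` are all hypothetical.

```python
def precision_recall_at(scores, labels, threshold):
    """Predict positive when score >= threshold, then return (precision, recall)."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predicted) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predicted) if y == 1 and p == 0)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return precision, recall


# Hypothetical model scores and true labels (1 = positive, 0 = negative).
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

results = {t: precision_recall_at(scores, labels, t) for t in (0.25, 0.5, 0.8)}
for t, (p, r) in results.items():
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

With these numbers, lowering the threshold from 0.8 to 0.25 raises recall from 0.50 to 1.00 while precision falls from 1.00 to about 0.67. Which threshold is preferable depends on the relative costs of false positives and false negatives.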