Cyber Security and Confusion Matrix

Manas Srivastava
4 min read · Jun 7, 2021

What is a Confusion Matrix?

A confusion matrix is a performance measurement for a machine learning classification problem where the output can be two or more classes. For a binary classifier, it is a table with 4 different combinations of predicted and actual values.

For a binary classifier, the table has the following cases:

  • True Negative: The model has predicted No, and the actual value was also No.
  • True Positive: The model has predicted Yes, and the actual value was also Yes.
  • False Negative: The model has predicted No, but the actual value was Yes. It is also called a Type-II error.
  • False Positive: The model has predicted Yes, but the actual value was No. It is also called a Type-I error.
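
The four cases above can be counted directly from a list of actual and predicted labels. The following is a minimal sketch in plain Python; the function name `confusion_counts` and the sample labels are made up for illustration and are not from any particular library.

```python
def confusion_counts(actual, predicted, positive="Yes"):
    """Count the four confusion-matrix cells for a binary classifier."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted Yes, actually Yes
        elif p != positive and a != positive:
            tn += 1  # predicted No, actually No
        elif p == positive and a != positive:
            fp += 1  # predicted Yes, actually No (Type-I error)
        else:
            fn += 1  # predicted No, actually Yes (Type-II error)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Invented labels, for illustration only
actual    = ["Yes", "No", "Yes", "No", "No",  "Yes", "No", "Yes"]
predicted = ["Yes", "No", "No",  "No", "Yes", "Yes", "No", "Yes"]

counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```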

Need for a Confusion Matrix in Machine Learning

  • It evaluates the performance of a classification model when it makes predictions on test data, and tells us how good the model is.
  • It tells us not only how many errors the classifier made but also the type of each error, that is, whether it is a Type-I or a Type-II error.
  • With the help of the confusion matrix, we can calculate different metrics for the model, such as accuracy and precision.
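
To see why the type of error matters and not just the number of errors, here is a small sketch with two hypothetical models. Both get the same accuracy on the same test labels, but one only misses positives and the other only raises false alarms. The data and the helper names `accuracy` and `error_types` are assumptions made up for this example.

```python
def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def error_types(actual, predicted):
    """Return (false_positives, false_negatives), with 1 = positive and 0 = negative."""
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return fp, fn


# Invented test labels and predictions, for illustration only
actual  = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
model_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]  # misses two positives
model_b = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]  # raises two false alarms

for name, predicted in (("model_a", model_a), ("model_b", model_b)):
    fp, fn = error_types(actual, predicted)
    print(name, "accuracy:", accuracy(actual, predicted), "FP:", fp, "FN:", fn)
```

Accuracy alone cannot tell these two models apart; the confusion matrix can.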

Calculations using Confusion Matrix:

  1. Accuracy: The ratio of the number of correct predictions to the total number of predictions made by the classifier:

     Accuracy = (TP + TN) / (TP + TN + FP + FN)

  2. Error rate: The ratio of the number of incorrect predictions to the total number of predictions made by the classifier:

     Error rate = (FP + FN) / (TP + TN + FP + FN)

  3. Precision: Out of all the cases that the model predicted as positive, the fraction that were actually positive:

     Precision = TP / (TP + FP)

  4. F-score: This score helps us to evaluate the recall and precision at the same time, where recall is the fraction of actual positives that the model found, TP / (TP + FN). For a fixed sum of recall and precision, the F-score is maximum when the two are equal:

     F-score = 2 × Precision × Recall / (Precision + Recall)
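
The calculations above can be sketched in a few lines of Python. This is an illustrative helper, not a library function: the name `classification_metrics` and the sample counts are assumptions, and the formulas used are the standard definitions of accuracy, error rate, precision, recall, and F-score (F1).

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute common metrics from the four confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    error_rate = (fp + fn) / total
    # Guard against division by zero when a class is never predicted or never present
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "accuracy": accuracy,
        "error_rate": error_rate,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Invented counts, for illustration only
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that accuracy and error rate always add up to 1, so reporting one of them is enough.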

Cybersecurity

Cybersecurity is the protection of internet-connected systems, including hardware, software, and data, from cyber threats. The practice is used by individuals and enterprises to protect against unauthorized access to data centers and other computerized systems.

Cybercrime can take many forms, such as:

  • Stealing personal data
  • Identity theft
  • Stealing organizational data
  • Stealing bank card details
  • Hacking emails to gain information

The trade-off between Type-I and Type-II errors is critical in cybersecurity. Consider a face recognition system installed at the entrance of a data warehouse that holds critical data. Suppose the manager arrives and the recognition system fails to recognize him. He tries again and is allowed in.

This seems like a pretty normal scenario. But let’s consider another case. A new person comes and tries to get in. The recognition system makes an error and allows him in. Now, this is very dangerous: an unauthorized person has gained entry. This could be a threat to the whole company.

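In both cases, the security system made an error. If we treat "unauthorized person" as the positive class, that is, the threat the system is meant to detect, then turning the manager away is a False Positive and letting the stranger in is a False Negative. The tolerance for False Negatives here is 0, although we can still bear False Positives.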

This shows that how critical each type of error is varies from use case to use case, and the trade-off between the two types of error has to be chosen to suit the application.
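
The trade-off can be illustrated with a small sketch of the face recognition example. It assumes, purely for illustration, that the system produces a match score between 0 and 1 and opens the door when the score reaches a threshold, and that "intruder" is the positive class, so an intruder who gets in is a False Negative. The scores, thresholds, and the function name `errors_at` are all invented.

```python
# (match_score, is_intruder) pairs; all values are invented for illustration
attempts = [
    (0.95, False), (0.88, False), (0.72, False), (0.60, False),  # authorized staff
    (0.65, True),  (0.40, True),  (0.30, True),  (0.10, True),   # intruders
]


def errors_at(threshold, attempts):
    """Door opens when match_score >= threshold; 'intruder' is the positive class."""
    # False Positive: authorized person flagged as an intruder and locked out
    fp = sum(1 for score, intruder in attempts if not intruder and score < threshold)
    # False Negative: intruder not flagged, so the door opens
    fn = sum(1 for score, intruder in attempts if intruder and score >= threshold)
    return fp, fn


for threshold in (0.5, 0.7, 0.9):
    fp, fn = errors_at(threshold, attempts)
    print(f"threshold={threshold}: staff locked out (FP)={fp}, intruders admitted (FN)={fn}")
```

With these made-up scores, raising the threshold from 0.5 to 0.7 removes the one admitted intruder at the cost of locking out one staff member, and raising it further to 0.9 only adds more lockouts. A security team that cannot tolerate False Negatives would accept the stricter threshold and let staff retry.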

Thank You For Reading
