Why Is Cosine Similarity Used in Natural Language Processing?

Natural Language Processing (NLP) is the branch of artificial intelligence that gives machines the ability to read, understand, and derive meaning from human language. Robots such as Sophia and home assistants use NLP to sound human and ‘understand’ what you’re saying. NLP is often pictured as the intersection of computer science, artificial intelligence, and linguistics.

Natural language processing relies heavily on feature engineering, which in turn requires solid domain knowledge of the data. The raw data comes as text or strings, but a machine learning classifier needs numerical feature vectors; the “bag-of-words” model is a common way to produce them.
Bag of Words
The bag-of-words model extracts features from text by disregarding grammar and word order while keeping word multiplicity. A document is represented as a bag of its words, and the count of each word is used as a feature for training a classifier. The model is mainly used in document classification and has also been applied in computer vision.
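The idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production vectorizer; the function name `bag_of_words` and the example sentences are made up for this sketch.

```python
from collections import Counter

def bag_of_words(documents):
    """Build a shared vocabulary and per-document word-count vectors."""
    # Vocabulary: every unique word across all documents, sorted for a stable order
    vocab = sorted({word for doc in documents for word in doc.lower().split()})
    # Each document becomes a vector of word counts over that shared vocabulary
    vectors = []
    for doc in documents:
        counts = Counter(doc.lower().split())
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors

docs = ["the cat sat on the mat", "the dog sat on the log"]
vocab, vectors = bag_of_words(docs)
print(vocab)    # ['cat', 'dog', 'log', 'mat', 'on', 'sat', 'the']
print(vectors)  # [[1, 0, 0, 1, 1, 1, 2], [0, 1, 1, 0, 1, 1, 2]]
```

Note how grammar and word order disappear: only the counts survive, which is exactly the trade-off the bag-of-words model makes.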
Cosine Similarity Metrics
For document classification, the word counts can be represented as vectors. Cosine similarity measures the similarity between two such vectors as the cosine of the angle between them.
The cosine of the angle between two non-zero vectors A and B can be derived from the Euclidean dot product formula:

similarity = cos(θ) = (A · B) / (‖A‖ ‖B‖)
(1 − similarity) gives the cosine distance between the two vectors.
When the angle between two vectors is zero, cos(0) = 1, so the cosine distance is (1 − 1) = 0, indicating that the two vectors point in exactly the same direction.
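The formula and the zero-distance case can be checked with a short sketch. The vectors below are hypothetical word-count vectors of the kind a bag-of-words model would produce.

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (||a|| * ||b||) for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical count vectors for two short documents
a = [1, 0, 0, 1, 1, 1, 2]
b = [0, 1, 1, 0, 1, 1, 2]

similarity = cosine_similarity(a, b)
print(similarity)       # 0.75 -> cosine distance 1 - 0.75 = 0.25
# A vector compared with itself: angle 0, cos(0) = 1, distance 0
print(cosine_similarity(a, a))  # 1.0
```

Because the similarity depends only on the angle, two documents with the same word proportions score as identical even if one is much longer than the other, which is why cosine similarity is preferred over raw Euclidean distance for text.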
Thanks for reading!!!! If this article was helpful to you, feel free to clap, share and respond.