{"id":685,"date":"2023-11-09T13:37:21","date_gmt":"2023-11-09T05:37:21","guid":{"rendered":"https:\/\/www.wennroy.com\/?p=685"},"modified":"2023-11-09T13:37:21","modified_gmt":"2023-11-09T05:37:21","slug":"l2-normalization-vs-cosine-similarity","status":"publish","type":"post","link":"https:\/\/wennroy.com\/index.php\/2023\/11\/09\/l2-normalization-vs-cosine-similarity\/","title":{"rendered":"l2 normalization vs Cosine Similarity"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">One of the most commonly used metrics in text mining is cosine similarity, which is usually defined as follows.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In Euclidean space, for column vectors $x$ and $y$:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">$$\\cos(x, y) = \\frac{x^T y}{\\|{x}\\|_2\\|{y}\\|_2} = \\left(\\frac{x}{\\|x\\|_2}\\right)^T\\left(\\frac{y}{\\|y\\|_2}\\right)$$<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The cosine similarity distance is then usually defined as:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">$$cosine\\_similarity\\_dist(x,y) = 1 - \\cos(x,y)$$<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now consider the l2 (Euclidean) distance. Here we take the <em>squared<\/em> l2 norm of the difference $x - y$:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">$$\\begin{eqnarray}\\|x - y\\|_2^2 &amp;=&amp; (x-y)^T (x-y)\\\\&amp;=&amp;\\|x\\|_2^2+\\|y\\|_2^2-2x^Ty\\end{eqnarray}$$<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">If we assume that $x$ and $y$ are both normalized vectors, i.e. $\\|x\\|_2 = \\|y\\|_2 = 1$,<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">then we have<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">$$\\begin{eqnarray}\\|x - y\\|_2^2 &amp;=&amp;2 - 2x^Ty\\\\&amp;=&amp;2 - 2\\cos(x,y)\\end{eqnarray}$$<\/p>\n\n\n\n<p
class=\"wp-block-paragraph\">In other words, for normalized vectors the <em>squared<\/em> Euclidean distance is proportional to the cosine similarity distance: it is exactly twice that distance.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the most commonly used metrics in text mining &hellip;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_import_markdown_pro_load_document_selector":0,"_import_markdown_pro_submit_text_textarea":"","footnotes":""},"categories":[67,43,46],"tags":[69,71,68],"class_list":["post-685","post","type-post","status-publish","format-standard","hentry","category-metric","category-nlp","category-statistics","tag-cosine","tag-l2-norm","tag-metric"],"_links":{"self":[{"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/posts\/685","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/comments?post=685"}],"version-history":[{"count":19,"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/posts\/685\/revisions"}],"predecessor-version":[{"id":707,"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/posts\/685\/revisions\/707"}],"wp:attachment":[{"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/media?parent=685"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/categories?post=685"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wennroy.com\/index.php\/wp-json\/wp\/v2\/tags?post=685"}],"curies":[{"name":"wp","
href":"https:\/\/api.w.org\/{rel}","templated":true}]}}