word2vec animation

Image caption: Class and gender: dimensions in a vector space?

Title of a presentation by Chris Moody introducing word2vec, an algorithm that represents words as vectors in an abstract space of meaning. In fact, many text analyses rest on a mapping of text to an abstract multi-dimensional “meaning space” in which closeness corresponds to similarity.
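To make “closeness corresponds to similarity” concrete, here is a minimal sketch in Python. The three-dimensional vectors are invented for illustration and do not come from word2vec or from the presentation; real embeddings have hundreds of dimensions and are learned from text. Cosine similarity is used as the closeness measure, which is a common choice but an assumption here.

```python
import math

# Hypothetical 3-dimensional "meaning" vectors, invented for illustration.
# They are NOT output of word2vec; real embeddings are learned from a corpus.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.2, 0.1],
    "apple": [0.1, 0.5, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word):
    """Return the other word whose vector is closest to the given word's."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine_similarity(vectors[word], vectors[w]))

if __name__ == "__main__":
    for w in vectors:
        print(w, "->", most_similar(w))
```

With these made-up numbers, “king” lands nearer to “queen” than to “apple”, but that is purely a consequence of how the toy vectors were chosen; it says nothing about how well a trained model captures meaning.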

A description of a real-world example shows how customer feedback could be (and presumably is) used to determine thematic links for purposes such as producing targeted advertising (in this case related to pregnancy).


The article, whose stated purpose is to convince the reader of the efficacy of text analysis in general and of the strengths of the author’s company in particular, contains a scatter diagram of the kind very prevalent in “big data” work. Does such an image actually reveal something about the performance of the underlying algorithm, or does it simply produce a self-fulfilling feedback loop of assurance: the con-man’s “see, it works” while one stares at an unreadable but colorful and seemingly meaningful void? The underlying labels remind me of the “poverty of statistics” from Nicolas showing Flickr tags.