data science – Lucas M. Chang

Reconstructing geographic distance among cities using distributional semantics

Estimated map of the 100 most populous US cities constructed by predicting the distances between them based on their similarity in Word2Vec semantic space.

The way we use words in language encodes a lot of information about the real-world meaning of those words.
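To make the idea of building a map from pairwise distances concrete, here is a minimal sketch using classical multidimensional scaling (MDS). This is an assumption for illustration: the excerpt does not say which method produced the estimated map. The four points are made-up toy coordinates, not city data, and `classical_mds` is a hypothetical helper name.

```python
import numpy as np


def classical_mds(dist, n_components=2):
    """Recover coordinates (up to rotation, reflection, and translation)
    from a matrix of pairwise distances via classical MDS."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-center the squared distances to obtain a Gram (inner product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # The top eigenvectors, scaled by the square roots of their eigenvalues,
    # give the embedded coordinates.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


# Toy stand-in for "predicted distances": exact distances among four made-up points.
true_points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
dist = np.linalg.norm(true_points[:, None, :] - true_points[None, :, :], axis=-1)

coords = classical_mds(dist)
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

With exact Euclidean distances, the recovered layout reproduces the input distances, although the map may come out rotated or mirrored. With distances predicted from word-vector similarity, the layout would only approximate the true geography.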

What country is the happiest? Interpreting ordinal-scale data

In this post, I model the latent distribution of national happiness using self-report data, and discuss the difficulties involved in ranking countries by “average” happiness.

Often, when studying people’s subjective experiences, data are collected by asking survey participants to report on an ordinal scale how much they agree with a statement or question.
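One well-known difficulty with averaging ordinal responses is that the ranking can depend on the numbers assigned to the categories. The sketch below illustrates that general point and is not necessarily the argument the post makes. The two response distributions and both coding schemes are invented for illustration.

```python
def mean_score(proportions, codes):
    """Average score when each response category is assigned a numeric code."""
    return sum(p * c for p, c in zip(proportions, codes))


# Hypothetical share of respondents choosing each of five ordered categories
# (lowest to highest). These are made-up numbers, not survey data.
country_a = [0.05, 0.10, 0.50, 0.30, 0.05]
country_b = [0.20, 0.05, 0.15, 0.25, 0.35]

# Two codings that both preserve the order of the categories.
equal_spacing = [1, 2, 3, 4, 5]
stretched_low = [1, 5, 6, 7, 8]  # treats the lowest category as much further from the rest

a_equal = mean_score(country_a, equal_spacing)
b_equal = mean_score(country_b, equal_spacing)
a_stretched = mean_score(country_a, stretched_low)
b_stretched = mean_score(country_b, stretched_low)

print("equal spacing: A ranks higher" if a_equal > b_equal else "equal spacing: B ranks higher")
print("stretched low: A ranks higher" if a_stretched > b_stretched else "stretched low: B ranks higher")
```

Both codings respect the order of the categories, yet they rank the two hypothetical countries differently. An ordinal scale only fixes the order of the responses, not the spacing between them, so the data alone cannot say which coding is right.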