2.4 Predicting similarity judgments from embedding spaces

To predict similarity between two objects in an embedding space, we computed the cosine distance between the word vectors corresponding to each object. We used cosine distance as a metric for two main reasons. First, cosine distance is commonly reported in the literature, which allows for direct comparison to previous work (Baroni et al., 2014; Mikolov, Chen, et al., 2013; Mikolov, Sutskever, et al., 2013; Pennington et al., 2014; Pereira et al., 2016). Second, cosine distance disregards the length or magnitude of the two vectors being compared, taking into account only the angle between them. Vector magnitude tends to covary with word frequency; because this frequency relationship has no bearing on the semantic similarity of the two words, a distance metric such as cosine distance that ignores magnitude information is sensible.
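As a minimal sketch of the magnitude-invariance argument above, the snippet below computes cosine distance with NumPy and checks that rescaling one vector leaves the distance unchanged. It assumes that "cosine distance" means one minus the cosine of the angle between the vectors; the function name `cosine_distance` and the toy vectors are illustrative and are not taken from any trained embedding model.

```python
import numpy as np

def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v); assumed definition of cosine distance."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Toy 3-d vectors (hypothetical values, not from a real embedding space).
cat = np.array([1.0, 2.0, 0.5])
dog = np.array([0.9, 2.1, 0.4])

d = cosine_distance(cat, dog)
d_scaled = cosine_distance(cat, 10.0 * dog)  # same direction, ten times the length

# Only the angle matters, so both distances agree up to floating-point error.
print(np.isclose(d, d_scaled))
```

A Euclidean distance, by contrast, would change under the same rescaling, which is why it would mix magnitude information into the similarity estimate.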

2.5 Contextual projection: Defining feature vectors in the embedding space

To generate predictions for target feature ratings using embedding spaces, we adapted and extended a vector projection approach first used by Grand et al. (2018) and Richie et al. (2019). These prior approaches manually defined three separate adjectives for each extreme end of a specific feature (e.g., for the "size" feature, adjectives representing the low end are "small," "tiny," and "minuscule," and adjectives representing the high end are "large," "huge," and "giant"). Then, for each feature, nine vectors were defined in the embedding space as the vector differences between all possible pairs of an adjective word vector representing the low extreme of the feature and an adjective word vector representing the high extreme (e.g., the difference between word vectors "small" and "giant," word vectors "tiny" and "huge," etc.). The average of these nine vector differences represented a one-dimensional subspace of the original embedding space (a line) and was used as an approximation of the corresponding feature (e.g., the "size" feature vector). The authors originally dubbed this technique "semantic projection," but we will henceforth call it "adjective projection" to distinguish it from the variation of this approach that we adopted, which can also be considered a kind of semantic projection, as detailed below.
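The construction of a feature vector from two sets of three endpoint vectors can be sketched as follows. This is an illustration under stated assumptions rather than the original authors' code: the helper name `feature_vector` is hypothetical, each difference is taken as high minus low (the text does not specify the sign), and random vectors stand in for the real adjective word vectors.

```python
import itertools
import numpy as np

def feature_vector(low_vecs, high_vecs):
    """Average of all (high - low) differences between the two endpoint sets."""
    diffs = [h - l for l, h in itertools.product(low_vecs, high_vecs)]
    return np.mean(diffs, axis=0), len(diffs)

rng = np.random.default_rng(0)
dim = 5  # toy dimensionality; real embedding spaces are far larger

# Random stand-ins for the three low-end and three high-end adjective vectors.
low = [rng.normal(size=dim) for _ in range(3)]
high = [rng.normal(size=dim) for _ in range(3)]

size_axis, n_pairs = feature_vector(low, high)
print(n_pairs)  # 3 x 3 endpoint pairs

# The mean of all pairwise differences equals the difference of the set means.
print(np.allclose(size_axis, np.mean(high, axis=0) - np.mean(low, axis=0)))
```

Because every low vector is paired with every high vector, averaging the nine differences is algebraically the same as subtracting the mean low-end vector from the mean high-end vector, which is a cheaper way to obtain the same line.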

In contrast to adjective projection, whose feature vector endpoints were unconstrained by semantic context (e.g., "size" is defined as a vector from "small," "tiny," "minuscule" to "large," "huge," "giant," regardless of context), we hypothesized that the endpoints of a feature projection may be sensitive to semantic context constraints, much like the training process of the embedding models themselves. For example, the range of sizes for animals is different from that for cars. Therefore, we defined a new projection technique that we refer to as "contextual semantic projection," in which the extreme ends of a feature dimension were selected from relevant vectors corresponding to a specific context (e.g., for animals, word vectors "bird," "rabbit," and "rat" were used at the low end of the "size" feature and word vectors "lion," "giraffe," and "elephant" at the high end). As in adjective projection, for each feature, nine vectors were defined in the embedding space as the vector differences between all possible pairs of an object representing the low end and an object representing the high end of the feature for a given context (e.g., the vector difference between word "bird" and word "lion," etc.). Then, the average of these nine vector differences represented a one-dimensional subspace of the original embedding space (a line) for a given context and was used as an approximation of the corresponding feature for items in that context (e.g., the "size" feature vector for animals).
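A minimal sketch of contextual semantic projection for the animal "size" example is given below. It rests on several assumptions that are not stated in the text: the toy three-dimensional vectors are invented for illustration, the function names are hypothetical, the feature vector points from the low end to the high end, and an item's predicted feature value is taken to be its scalar projection onto the feature line. The items "mouse" and "horse" are hypothetical test words.

```python
import numpy as np

# Toy 3-d "embeddings" (hypothetical values; first coordinate loosely tracks size).
raw = {
    "bird": [0.10, 0.3, 0.2], "rabbit": [0.20, 0.1, 0.4], "rat": [0.10, 0.2, 0.1],
    "lion": [0.80, 0.2, 0.3], "giraffe": [0.90, 0.3, 0.1], "elephant": [1.00, 0.1, 0.2],
    "mouse": [0.15, 0.2, 0.3], "horse": [0.85, 0.2, 0.2],
}
emb = {word: np.array(vec) for word, vec in raw.items()}

def contextual_feature_vector(emb, low_words, high_words):
    """Average the nine (high - low) differences for context-specific endpoints."""
    low = np.stack([emb[w] for w in low_words])    # shape (3, d)
    high = np.stack([emb[w] for w in high_words])  # shape (3, d)
    diffs = high[None, :, :] - low[:, None, :]     # shape (3, 3, d), all pairs
    return diffs.reshape(-1, low.shape[1]).mean(axis=0)

def project(vec, axis):
    """Scalar projection of vec onto the line spanned by axis."""
    return float(np.dot(vec, axis) / np.linalg.norm(axis))

size_animals = contextual_feature_vector(
    emb, ["bird", "rabbit", "rat"], ["lion", "giraffe", "elephant"]
)
scores = {w: project(emb[w], size_animals) for w in ("mouse", "horse")}
print(scores["horse"] > scores["mouse"])  # larger animal projects further along the axis
```

Swapping in endpoints from another context (for example, small and large vehicles) would yield a different "size" line in the same embedding space, which is the point of making the endpoints context-dependent.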