WordNet Based Features for Predicting Brain Activity associated with meanings of nouns
Ahmad Babaeian Jelodar, Mehrdad Alizadeh, and Shahram Khadivi
Computer Engineering Department, Amirkabir University of Technology
424 Hafez Avenue, Tehran, Iran
{ahmadb_jelodar, mehr.alizadeh, khadivi}@aut.ac.ir

Proceedings of the 1st Workshop on Computational Neurolinguistics, pages 52-60, NAACL-HLT, Los Angeles, June 2010. (c) 2010 Association for Computational Linguistics

Abstract

Different studies have been conducted for predicting human brain activity associated with the semantics of nouns. Corpus-based approaches have been used for deriving feature vectors of concrete nouns, to model the brain activity associated with each noun. In this paper a computational model is proposed in which the feature vector for each concrete noun is computed from the WordNet similarity of that noun with the 25 sensory-motor verbs suggested by psychologists. The feature vectors are used for training a linear model to predict functional MRI images of the brain associated with nouns. The WordNet-extracted features are also combined with corpus-based semantic features of the nouns. The combined features give better results in predicting human brain activity related to concrete nouns.

1 Introduction

The study of human brain function has received great attention in recent years from the advent of functional Magnetic Resonance Imaging (fMRI). fMRI is a 3D imaging method that gives the ability to perceive brain activity in human subjects. A three-dimensional fMRI image contains approximately 15000 voxels (3D pixels). Since its advent, fMRI has been used to conduct hundreds of studies that identify specific regions of the brain that are activated on average when a human performs a particular cognitive function (e.g., reading, mental imagery). A great body of these publications shows that averaging together fMRI data collected over multiple time intervals, while the subject responds to some kind of repeated stimuli (reading words), can present descriptive statistics of brain activity (Mitchell et al., 2004).

Conceptual meanings of different words and pictures trigger different brain activity. The representation of conceptual knowledge in the human brain has been studied by different science communities such as psychologists, neuroscientists, linguists, and computational linguists. Some of these approaches focus on visual features of picture stimuli to analyze the fMRI activation associated with viewing the picture (O'Toole et al., 2005; Hardoon et al., 2007). Recent work (Kay et al., 2008) has shown that it is possible to predict aspects of fMRI activation based on visual features of arbitrary scenes, and to use this predicted activation to identify which of a set of candidate scenes an individual is viewing. Studies of neural representations in the brain have otherwise mostly focused on cataloging the patterns of fMRI activity associated with specific categories of words. Mitchell et al. present a machine learning approach that is able to predict the fMRI activity for arbitrary words (Mitchell et al., 2008).

In this paper a computational model similar to that of (Mitchell et al., 2008) is proposed for predicting the neural activation of a given stimulus word. Mitchell et al. predict the neural fMRI activation from a feature vector for each noun, extracted as the co-occurrences of each individual concrete noun with each of the 25 sensory-motor verbs, gathered from a huge Google corpus (Brants and Franz, 2006). The feature vector of each noun is used to predict the activity of each voxel in the brain, by assuming a weighted linear model (Figure 1).

Figure 1 - Structure of the model for predicting fMRI activation for an arbitrary stimulus word w

The activity of a voxel is defined as a continuous value that is assigned to it in the functional imaging procedure [1]. Mitchell et al. applied a linear model based on its high consistency with the widespread use of linear models in fMRI analysis. In this paper the focus is on using WordNet-based features (in comparison to co-occurrence based features); therefore the linear model proposed and justified by Mitchell et al. is used, and other models like SVM are not considered. Mitchell et al. suggest that the trained model is able to predict brain activity even for unseen concepts, and therefore note that a great step forward is taken in modeling brain activity, in comparison to the previous cataloging approaches. This model does not work well in case of ambiguity in meaning: for example, a word like saw has two meanings, as a noun and as a verb, making it difficult to construct a suitable feature vector for this word. We try to alleviate this problem in this paper, and achieve better models by combining different models in case of ambiguity.

[1] Functional images were acquired on a Siemens (Erlangen, Germany) Allegra 3.0T scanner at the Brain Imaging Research Center of Carnegie Mellon University and the University of Pittsburgh (supporting online material of Mitchell et al., 2008).

In our work, we use the sensory-motor verbs which are suggested by psychologists and are also used by (Mitchell et al., 2008) to extract the feature vectors. But instead of using a corpus to extract the co-occurrences of concrete nouns with these verbs, we use WordNet to find the similarities of each noun with the 25 sensory-motor verbs. We also combine the WordNet-extracted model with the corpus-based model, and achieve better results in matching predicted fMRI images (from the model) to their own observed images.

This paper is organized as follows: in section 2 a brief introduction to WordNet measures is given. In section 3, the WordNet approaches applied in the experiments and the Mitchell et al. linear model are explained. The results of the experiments are discussed in section 4, and finally in section 5 the results and experiments are concluded.

2 WordNet-based Similarity

2.1 WordNet

WordNet is a semantic lexicon database for the English language and is one of the most important and widely used lexical resources for natural language processing tasks (Fellbaum, 1998), such as word sense disambiguation, information retrieval, automatic text classification, and automatic text summarization.

WordNet is a network of concepts in the form of word nodes organized by semantic relations between words according to meaning. A semantic relation is a relation between concepts, and each node consists of a set of words (a synset) representing the real-world concept associated with that node. Semantic relations are like pointers between synsets. The synsets in WordNet are divided into four distinct categories, each corresponding to one of four parts of speech: nouns, verbs, adjectives, and adverbs (Patwardhan, 2003).

WordNet is a lexical inheritance system. The relation between two nodes shows the level of generality in an is-a hierarchy of concepts. For example, the relation between horse and mammal shows the inheritance horse is-a mammal.

2.2 Similarity

Many attempts have been made to approximate human judgment of similarity between objects. Measures of similarity use the information found in an is-a hierarchy of concepts (or synsets), and quantify how much concept A is like concept B (Pedersen et al., 2004). Such a measure might show that a horse is more like a cat than it is like a window, due to the fact that horse and cat share mammal as an ancestor in the WordNet noun hierarchy.

Similarity is a fundamental and widely used concept and refers to relatedness between two concepts in WordNet. Many WordNet-based measures of semantic similarity have been proposed, such as information content (Resnik, 1995), JCN (Jiang and Conrath, 1997), LCH (Leacock and Chodorow, 1998), and Lin (Lin, 1998). These measures are limited by the part of speech (POS) of words; for example, the similarity between the verb see and the noun eye is not defined. There is another set of measures which work beyond this POS boundary. These are called semantic relatedness measures, such as Lesk (Banerjee and Pedersen, 2003) and Vector (Patwardhan, 2003).

The simple idea behind the LCH method is to compute the shortest path between two concepts in a unified WordNet hierarchy tree. The LCH measure is defined as follows (Leacock and Chodorow, 1998):

    relatedness_lch(c1, c2) = -log( length(c1, c2) / 2D )    (1)

Similarity is measured between concepts c1 and c2; length(c1, c2) is the shortest path between them, and D is the maximum depth of the taxonomy, therefore the longest path is at most 2D.

Statistical information from large corpora is used to estimate the information content of concepts. The information content of a concept measures the specificity or generality of that concept:

    IC(c) = -log( freq(c) / freq(root) )    (2)

freq(c) is defined as the sum of the frequencies of all concepts in the subtree of concept c, where the frequency of each concept is counted in a large corpus. Therefore freq(root) includes the frequency count of all concepts.

The LCS (least common subsumer) of concepts A and B is the most specific concept that is an ancestor of both A and B. Resnik defined the similarity of two concepts as follows (Resnik, 1995):

    relatedness_res(c1, c2) = IC(lcs(c1, c2))    (3)

IC(lcs(c1, c2)) is the information content of the least common subsumer of concepts c1 and c2.

The Lin measure augments the information content of the LCS with the information content of concepts c1 and c2 themselves, scaling the information content of the LCS by their sum (Lin, 1998):

    relatedness_lin(c1, c2) = 2 * IC(lcs(c1, c2)) / ( IC(c1) + IC(c2) )    (4)

IC(c1) and IC(c2) are the information content of concepts c1 and c2, respectively.

Jiang and Conrath proposed another formula, named JCN, which is shown below (Jiang and Conrath, 1997):

    relatedness_jcn(c1, c2) = 1 / ( IC(c1) + IC(c2) - 2 * IC(lcs(c1, c2)) )    (5)

Lesk is a measure of semantic relatedness between concepts that is based on the number of shared words (overlaps) in their definitions (glosses). This measure extends the glosses of the concepts under consideration to include the glosses of other concepts to which they are related according to a given concept hierarchy (Banerjee and Pedersen, 2003). This method makes it possible to measure similarity between nouns and verbs. The Vector measure creates a co-occurrence matrix for each word used in the WordNet glosses from a given corpus, and then represents each gloss/concept with a vector that is the average of these co-occurrence vectors (Patwardhan, 2003).

3 Approaches

As mentioned in the previous section, different WordNet measures can be used to compute the similarity between two concepts. The WordNet similarity measures are used to compute the verb-concept similarities. The feature matrix comprises the similarities between 25 verbs (features) and 60 concrete nouns (instances). In this section the computational model proposed by (Mitchell et al., 2008), the WordNet-based models, and the combinatory models are briefly described.

3.1 Mitchell et al. Baseline Model

We used the Mitchell et al. regression model for predicting human brain activity as our baseline. In all of the experiments in this paper, we use the fMRI data gathered by Mitchell et al., collected from nine healthy, college-age participants who viewed 60 different word-picture pairs presented six times each (Mitchell et al., 2008). In Mitchell et al., for each concept a feature vector containing normalized co-occurrences with the 25 sensory-motor verbs, gathered from a huge Google corpus (Brants and Franz, 2006), is constructed. The computational model was evaluated using the collected fMRI data. Mean fMRI images were constructed from the primary fMRI images before training. For each subject, a linear regression model that maps the features to the corresponding brain image was trained on 58 (of 60) average brain images. For testing the model, the two left-out brain images were compared with their corresponding predicted images, obtained from the trained model. The Pearson correlation (Equation 6) was used for comparing whether each predicted image has more similarity with its own observed image (match1) or with the other left-out observed image (match2):

    match1(p1=i1 & p2=i2) = pearsonCorrelation(p1, i1) + pearsonCorrelation(p2, i2)    (6)

p1, p2 are the predicted images, and i1, i2 are the corresponding observed images. For calculating the accuracy we check whether each classification is done correctly or not. By selecting two arbitrary concepts (of sixty) as the test pair, there are 1770 different classifications. The overall percentage of correct classifications represents the accuracy of the model.

We tried to use the same implementation as (Mitchell et al., 2008) for our baseline. We implemented the training and test models as described in the supporting online material of Mitchell et al.'s paper, but due to some probable unseen differences, for example in the voxel selection, the classification accuracies achieved by our replicated baseline are on average less than the accuracies attained by (Mitchell et al., 2008). In the test phase we used 500 selected voxels for comparison. The training is done for all 9 participants. This procedure is used in all the other approaches mentioned in this section. We have contacted the authors of the paper and are trying to resolve the problem with our baseline.

3.2 WordNet-based Models

As mentioned in section 2, several WordNet-based similarity measures have been proposed (Pedersen et al., 2004). We apply some of the known measures to construct the feature matrix, and use them to train the models of the 9 participants. WordNet::Similarity is a utility program available on the web [2] that implements measures of similarity and relatedness, all in some way based on the structure and content of WordNet, together with the information content values they require (Pedersen et al., 2004).

[2] http://wn-similarity.sourceforge.net/

As mentioned in section 2, every concept in WordNet is represented by a set of words (synsets). The similarity between two concepts is computed from the similarities between the synsets of the first concept and the synsets of the second concept; in this paper the maximum similarity over the synset pairs of the two concepts is taken as the candidate similarity between the concepts.

In contrast to relatedness measures, similarity measures have the POS limitation: in our case the verb-noun pair similarity is not defined when using similarity measures. To work around this problem, the sense (POS) of the verb features is assumed to be free (verb, noun, adjective, and adverb). In most cases the meaning of the verb sense of a word is close to its non-verb senses; for example, the verb clean can also be seen as a noun, adjective, and adverb, which all have close meanings. Some problems arise from this assumption.
For example, the verb watch has a meaning far from the noun watch, and some verbs, like eat, do not have a non-verb sense at all. To handle these issues a combination of the relatedness measures and similarity measures is used; this approach is discussed in section 3.3 to make a more suitable feature matrix. The two-leave-out cross-validation accuracies of the regression models trained on the feature matrices computed from WordNet similarities helped us to select two measures for the final feature construction; the results are discussed and analyzed in the next section (Table 3).

3.3 Lin/Lesk and JCN/Lesk based features

The experiments show that the JCN similarity measure gives the best results for extracting the feature vectors for predicting brain activity. Unfortunately, some similarity feature matrices, like those of JCN and Lin, are to some extent sparse: in some cases a feature (sensory-motor verb) or even a concept is represented by a null vector. The null input data do not affect the linear regression training, but lead to less data for training the model. This anomaly originates from the fact that some verbs do not have related non-verb senses (POS). On the other hand, relatedness measures (like Lesk) do not limit the POS of words; in consequence, we have non-zero values for every element of the feature matrix. This motivates us to combine the Lesk measure with Lin to alleviate the defect mentioned above. The combination is based on finding a better feature matrix from the two Lin (JCN) and Lesk feature matrices; for this, a linear averaging of the Lin (JCN) and Lesk feature matrices is used.

3.4 Combinatory Schemes

In this paper, a new approach for extracting the feature matrix using WordNet is presented, and different similarity measures for representing this feature matrix are investigated. In this section, we propose new combinatory approaches for combining Mitchell et al.'s corpus-based approach with our WordNet-based approach. We assume that we have two feature matrices, one based on the corpus-based (baseline) method and the other based on a WordNet-based (Lin/Lesk similarity measure) method.

3.4.1 Linear combination

The first approach for combining the WordNet and baseline models is based on assigning weights (lambda, 1-lambda) to the models for the calculation of match1 and match2. match1 of the baseline model is assigned weight lambda, and match1 of the WordNet model is assigned weight (1-lambda), for calculating the final match1 of the system (Equation 7):

    match1 = lambda * match1_Baseline + (1-lambda) * match1_WordNet    (7)

match2 is calculated in the same way. The classification is assumed to be correct when match1 gets a greater value than match2. The parameter lambda needs to be tuned; different values of lambda were tested, and their output accuracies are depicted in Figure 2.

Figure 2 - accuracies of different lambda values

3.4.2 Concept based combination

The performance of the computational models can be analyzed from a different view: we are looking for a combination mechanism based on the models' accuracies in classifying a concept pair. This combination mechanism estimates weights for the WordNet and baseline models for testing a left-out pair. To have a system with the ability to work properly on unseen nouns, we leave out all the concept pairs that contain concept c1 or c2 (117 pairs). This guarantees that the trained model is blind to concepts c1 and c2. The remaining concept pairs are used for training (1653 pairs). The accuracies of the WordNet and baseline models on the training set are derived, and the weight of the baseline model is calculated as follows:

    lambda = Accuracy(Base) / ( Accuracy(Base) + Accuracy(WordNet) )    (8)

The weight of the WordNet model is calculated in a similar way. Relation 7 is used for calculating match1 and match2, and we check whether the classification is done correctly or not. This procedure is repeated for each arbitrary pair (1770 iterations) to calculate the overall accuracy of the combinatory system.

3.4.3 Voting based combination schemes

In many intelligent combinatory systems, majority voting is an approach for determining the final output. Mitchell et al. collected data for 9 participants. In this approach, a vote is taken over the models of 8 participants (participants j=1:9, j != i) for each concept pair (the two left-out concepts), to select the better model amongst the WordNet and baseline models. The better model is the one that leads to higher accuracy in classifying the left-out concepts of the 8 participants. The selected model is then used to test the model for participant i. The votes for selecting the better model for each participant are calculated as shown in Table 1.

Table 1 - Voting mechanism

Another approach is linear voting combination. This approach is based on calculating match1 and match2 for a model as a weighted linear combination (relation 9), where the weights for the combinatory model are calculated by the voting mechanism (Table 1). match1_Base and match1_WordNet represent match1 for the baseline and WordNet models:

    match1 = (voteBase / 8) * match1_Base + (voteWordNet / 8) * match1_WordNet    (9)

4 Results and Discussion

As mentioned in section 2, it is possible to construct the feature matrix based on WordNet similarity measures. Seven different measures were tested, and models for the 9 participants were trained using two-leave-out cross-validation. Four similarity measures (Lin, JCN, LCH, and Resnik), two relatedness measures (Lesk and Vector), a combination Lin/Lesk, and a combination JCN/Lesk are compared to the baseline. The results based on the accuracies of these tests are shown in Table 3. The accuracies are calculated from counts of match scores; the match score between the two predicted and the two observed fMRI images was determined by which match (match1 or match2) had a higher Pearson correlation, evaluated over the 500 voxels with the most stable responses across training presentations.

As described in section 2, the similarity measures have the POS limitation. The JCN measure has the best accuracy among all single similarity measures, with a better average accuracy (0.65) than the Lin measure (0.63). The relatedness measures do not have the POS limitation; in spite of this advantage, the Lesk and Vector measures do not provide a better accuracy than the JCN similarity measure. The Vector average accuracy (0.529) is worse than Lesk (0.622), and therefore just Lesk is considered as a candidate for combination with other similarity measures like JCN and Lin. In section 3 the idea of combining the Lin (JCN) and Lesk measures was mentioned. These combinatory schemes led to better accuracies than all single measures (Table 3). Despite the lower average accuracy of the Lin method, the Lin/Lesk combination achieved a better average accuracy than the JCN/Lesk combination. This is probably because of the lower correlation between the Lin and Lesk feature vectors in comparison to the JCN and Lesk feature vectors. The correlations between different pairs of feature matrices extracted by the WordNet-based similarity measures are shown in Table 2. The results show that the Lesk feature matrix has the minimum correlations with all other WordNet-based feature matrices. This is a good motivation to have the Lesk measure as a candidate to mix with other measures to extract a more informative feature matrix. The Lesk feature matrix has the least correlation with the Lin feature matrix among all WordNet-based feature matrices; therefore, as noted before, the results of Table 3 show better accuracy for Lin/Lesk in comparison to JCN/Lesk.

    Measure 1   Measure 2   Correlation
    Lesk        Lin         0.3929
    Lesk        Resnik      0.4528
    Lesk        JCN         0.5129
    Lesk        LCH         0.5556
    Lin         LCH         0.6182
    JCN         Resnik      0.6357
    JCN         Lin         0.7175
    JCN         LCH         0.7234
    Lin         Resnik      0.7400
    LCH         Resnik      0.7946

Table 2 - correlation between different pairs of WordNet-based similarity (relatedness) measures

    Measure/Participant   P1     P2     P3     P4     P5     P6     P7     P8     P9     Average
    Baseline              0.828  0.845  0.752  0.798  0.776  0.658  0.705  0.615  0.680  0.740
    Lin                   0.730  0.624  0.739  0.727  0.591  0.507  0.640  0.501  0.632  0.632
    Lesk                  0.725  0.629  0.668  0.688  0.601  0.519  0.604  0.584  0.580  0.622
    Vector                0.603  0.599  0.551  0.553  0.567  0.451  0.509  0.446  0.476  0.529
    LCH                   0.685  0.613  0.671  0.617  0.574  0.468  0.577  0.506  0.587  0.589
    Resnik                0.610  0.558  0.594  0.622  0.505  0.555  0.603  0.449  0.490  0.554
    JCN                   0.797  0.638  0.765  0.713  0.671  0.525  0.504  0.568  0.642  0.647
    Lin/Lesk              0.807  0.677  0.767  0.812  0.672  0.645  0.690  0.502  0.697  0.697
    JCN/Lesk              0.790  0.604  0.718  0.789  0.641  0.593  0.593  0.514  0.667  0.656

Table 3 - Results of different similarity measures compared to the baseline

But these accuracies are less than the accuracies attained by the base method proposed by Mitchell et al. One important reason for this shortfall can be the difference in sense (POS) between the concepts (noun POS) and the features (verb POS), which limits the WordNet-based measures in constructing better feature matrices. Investigating new features of the same POS as the concepts (associated with the sensory-motor verbs) might lead to even better results.

The Base and WordNet models use ultimately different approaches to compute the similarity of each pair of concepts. Several experiments, such as the union of the features and the combination of the system outputs, were designed. The union of the two feature matrices (baseline feature matrix and Lin/Lesk feature matrix) does not lead to a better result (0.646). In contrast to the united features, the combination of the system outputs gives a better performance. Three different schemes of combinatory systems were proposed in section 3.4. The first scheme (linear combination) uses a fixed ratio (lambda) for combining the output match of the two systems. As depicted in Figure 2, the lambda value is tuned, and an optimum value of lambda=0.64 achieved an average accuracy of 0.775 (Table 4).

The accuracies of participants P1 and P5 for our implemented baseline are almost the same as the accuracies of P1 and P5 in Mitchell et al. A comparison of the accuracies for P1 and P5 attained by the baseline model and by the linear combination scheme is illustrated in Figure 3. The results show considerable improvement in accuracy when the combinatory model is used.

Figure 3 - Improvement of the linear combinatory scheme

The improvement of this combinatory scheme can be viewed from another aspect. Concept accuracy, defined as the classification accuracy of a concept paired with each of the other 59 concepts, shows the performance of the system for each concept. The concept accuracies of the linear combinatory scheme are compared with the Baseline and WordNet systems, and the results are illustrated in Figure 4. The accuracy of some ambiguous concrete nouns like 'saw' is improved in the WordNet-based model, and this improvement is maintained by the linear combinatory model.

Figure 4 - Comparison of the linear combinatory scheme with the Baseline and WordNet models

The second scheme uses a cross-validation over the remaining 58 concepts to train the system for deciding on each pair of concepts. After training, each system (WordNet and Base) is assigned a weight according to its accuracy, and the decision on the test pair is based on a weighted combination of the systems. The results of this scheme are shown in Table 4; it gives an improvement of 3.4% in comparison to the baseline model.

The third scheme chooses another combinatory strategy to decide on each test pair of concepts for participant Pi: it gathers votes from the other 8 participants as described in section 3. The results are shown in Table 4. The improvement of the binary voting scheme over the baseline is almost as large as the improvement of the linear and concept-based schemes. The weighted voting uses a more flexible combination scheme, and led to an improvement of about 5.3% in comparison to the baseline.

    Approach/Participant  P1     P2     P3     P4     P5     P6     P7     P8     P9     Average
    Linear                0.877  0.847  0.827  0.862  0.798  0.696  0.734  0.605  0.728  0.775
    Concept-based         0.887  0.832  0.836  0.870  0.793  0.687  0.736  0.588  0.734  0.774
    Binary voting         0.894  0.837  0.829  0.858  0.796  0.684  0.758  0.612  0.736  0.778
    Weighted voting       0.905  0.840  0.861  0.882  0.808  0.710  0.761  0.614  0.755  0.793

Table 4 - Accuracies of different combinatory approaches

A result is called statistically significant if it is improbable to have occurred by chance. A t-test was used to show whether the improvement achieved in this paper is statistically significant or not. The t-test was applied to the output accuracies of the baseline (average accuracy 0.74) and the weighted voting combinatory scheme (average accuracy 0.793) for the 9 participants. The results are shown in Table 5. The weighted voting scheme does not improve on P2 and P8, whose results are almost similar to the baseline; therefore the null hypothesis of equal means is not rejected (H-value=0) at the 0.05 significance level. For all participants with improved results, the null hypothesis of equal means is rejected (H-value=1) at the 0.05 significance level. This rejection shows that the improvements are statistically significant for all participants with improvement. The t-test over all 9 participants rejected the null hypothesis with a P-value of almost zero. This experiment shows that the improvement achieved in this paper is statistically significant.

    Participant   H-value   P-value
    P1            1         7.73e-12
    P2            0         0.6610
    P3            1         5.55e-17
    P4            1         2.61e-12
    P5            1         0.0051
    P6            1         0.0004
    P7            1         8.28e-05
    P8            0         0.5275
    P9            1         3.95e-07

Table 5 - t-test of baseline and weighted voting output values for 9 participants

5 Conclusion

In this work, a new WordNet-based similarity approach for deriving the sensory-motor feature vectors associated with concrete nouns was introduced. A correlation-based combination of WordNet measures is used to attain more informative feature vectors. The computational model trained with these feature vectors is combined with the computational model trained with feature vectors extracted by a corpus-based method. The combinatory scheme achieves a better average accuracy in predicting the brain activity associated with the meaning of concrete nouns. Investigating new features of the same sense (POS) between concepts and non-verb features (associated with the sensory-motor verbs) might lead to even better results for the WordNet-based models.

Acknowledgements

The authors would like to thank the anonymous reviewers for their thorough review and their constructive comments.

References

Banerjee, S., and Pedersen, T. 2003. Extended gloss overlaps as a measure of semantic relatedness. In Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, 805-810.

Brants, T., and Franz, A. 2006. Web 1T 5-gram Version 1, www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T13. Linguistic Data Consortium, Philadelphia.

Fellbaum, C. 1998. WordNet: An Electronic Lexical Database. The MIT Press, Cambridge, MA.

Hardoon, D. R., Mourao-Miranda, J., Brammer, M., and Shawe-Taylor, J. 2007. Unsupervised analysis of fMRI data using kernel canonical correlation. NeuroImage, pp. 1250-1259.

Kay, K. N., Naselaris, T., Prenger, R. J., and Gallant, J. L. 2008. Identifying natural images from human brain activity. Nature, pp. 352-355.

Leacock, C., and Chodorow, M. 1998. Combining local context and WordNet similarity for word sense identification. In C. Fellbaum, editor, WordNet: An Electronic Lexical Database, pages 265-283. MIT Press.

Lin, D. 1998. An information-theoretic definition of similarity. In Proceedings of the International Conference on Machine Learning, Madison.

Mitchell, T. M., et al. 2008. Predicting human brain activity associated with the meanings of nouns. Science, American Association for the Advancement of Science.

Mitchell, T. M., Hutchinson, R. A., Niculescu, R. S., Pereira, F., and Wang, X. 2004. Learning to decode cognitive states from brain images. Machine Learning, pp. 145-175.

O'Toole, A. J., Jiang, F., Abdi, H., and Haxby, J. V. 2005. Partially distributed representations of objects and faces in ventral temporal cortex. Journal of Cognitive Neuroscience, pp. 580-590.

Patwardhan, S. 2003. Incorporating dictionary and corpus information into a context vector measure of semantic relatedness. Master's thesis, University of Minnesota, Duluth.

Pedersen, T., Patwardhan, S., and Michelizzi, J. 2004. WordNet::Similarity - Measuring the relatedness of concepts. In Proceedings of the Fifth Annual Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL-04), pp. 38-41.

Resnik, P. 1995. Using information content to evaluate semantic similarity in a taxonomy. In Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 448-453.