Rule Based Fuzzy Computing Approach on Self-Supervised Sentiment Polarity Classification with Word Sense Disambiguation in Machine Translation for Hindi Language

Shweta Chauhan, Department of Electronics and Communication Engineering, National Institute of Technology Hamirpur, Himachal Pradesh, India
Jayashree Premkumar Shet, Department of English Language and Translation, College of Science and Arts, An Nabhanya, Qassim University, Saudi Arabia
Shehab Mohamed Beram, Department of Computing and Information Systems, School of Engineering and Technology, Sunway University, Kuala Lumpur, Malaysia
Vishal Jagota, Department of Mechanical Engineering, Madanapalle Institute of Technology and Science, Madanapalle, Andhra Pradesh, India (Corresponding Author: [email protected])
Mohammed Dighriri, Department of MIS, University of Hafr Al Batin, Hafr Al Batin, Saudi Arabia
Mohd Wazih Ahmad, Adama Science and Technology University, Adama, Ethiopia
Md Shamim Hossain, Department of Marketing, Hajee Mohammad Danesh Science and Technology University, Dinajpur, Bangladesh
Ali Rizwan, Department of Industrial Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah 21589, Saudi Arabia

With increasing globalization, communication among people of diverse cultural backgrounds is taking place to a very large extent in the present era. Language diversity across different parts of the world can hinder this communication. The usage of social media and user-generated material has grown at an exponential rate, and existing supervised sentiment polarity classification techniques need labelled training data. In this study, two problems are analyzed: first, sentiment analysis of Twitter datasets, and second, sense disambiguation for the morphologically rich Hindi language. A rule-based fuzzy logic system for self-supervised sentiment classification is used to compute and analyze the self-supervised (fully unsupervised) sentiment categorization of social-media datasets using three types of lexicons. The combination of fuzzy logic with three different types of lexicons gives sentiment analysis a new direction. The unsupervised fuzzy rules integrate the fuzziness of both negative and positive scores, and fuzzy logic-based systems can cope with ambiguity and vagueness. The system uses an unsupervised/self-supervised fuzzy rule-based technique to classify text using natural language processing (NLP) and word sense. We compare the results of fuzzy rule-based self-supervised sentiment classification using three types of lexicons on five different datasets against both unsupervised and supervised sentiment classification techniques.
Second, using cross-lingual sense embedding rather than cross-lingual word embedding resolves the ambiguity issue. Word sense embeddings are produced for the source language to learn the multiple senses of words. Different evaluation metrics show improved performance for the English-Hindi language pair.

© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM. https://doi.org/10.1145/3574130
ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Keywords: Sentiment Analysis, Fuzzy sets, Lexicon, Unsupervised sentiment classification, Self-learning.

1 INTRODUCTION

In the last few years, social platform websites have had such a significant influence on daily life that they are now used to gather information about both major and minor occurrences, activities, and disasters. On social media, users express their sentiments and content regarding events [Yoo et al. 2018]. Online review sentiment categorization, or emotion artificial intelligence, has attracted a growing number of scholars and yielded a large number of research findings [Liu 2012; Cambria et al. 2017]. For online reviews, there are currently two types of sentiment classification methods: supervised methods [Oyebode et al. 2020; Bengio and LeCun 2015] and unsupervised methods [Agarwal et al. 2015; Jurek et al. 2015; Vashishtha 2019]. Cross-lingual word embedding and back translation are used in unsupervised Machine Translation (MT) techniques to train sequence-to-sequence machine translation; however, they are ineffective for morphologically rich languages [Artetxe et al. 2018; Kim et al. 2019]. A word is ambiguous if it can be understood in more than one way or has more than one possible meaning. A large number of words in a morphologically rich language have a wide range of meanings in various contexts. Current unsupervised neural machine translation (NMT) techniques do not take into account the complexity of homonyms that have numerous meanings.

Fig. 1. Applications of sentiment analysis (movies, social media, news, e-commerce, hotels).

Fig. 1 represents the applications of sentiment classification in different areas.
Movies, social media, news, e-commerce, and hotels are among these application areas. The sentiment analysis of a movie review is illustrated in Fig. 2. First, the raw data is collected and tokenized, converting the text into integers. Word embeddings of the text data are then generated, converting the integer tokens into real-valued vectors. Next, an unsupervised, semi-supervised, or supervised model is applied. Finally, the sentiment is detected or classified: if the value falls below the threshold value the review is considered negative, and if it falls above the threshold value it is considered positive. In this example the sentence "This is not a good movie" has been taken. This is a negative sentiment because it contains the word "not". The model is trained iteratively and its value falls below the threshold value.

In supervised sentence classification, one type is based on machine learning that represents text using the bag-of-words model and classifiers such as naive Bayes, the expectation-maximization algorithm, and support vector machines [Tripathy et al. 1997; Domingos et al. 1997; Shelke et al. 2017; Moraes et al. 2013]. The other type, based on deep learning, uses end-to-end patterns to build text representations and classifiers [Devlin et al. 2018; Akhtar et al. 2019; Abdi et al. 2019; Yang et al. 2020].

Fig. 2. Sentiment analysis of a movie review: raw text ("This is not a good movie.") → tokenizer (convert text to integer tokens, e.g. (11, 6, 21, 3, 49, 71)) → embedding (convert integer tokens to real-valued vectors) → bidirectional long short-term memory (process sequences of arbitrary length) → sigmoid dense output layer to predict the class (0.0 = negative, 1.0 = positive).

The necessity for a large amount of labelled data to train the sentiment classification model is a typical challenge with supervised sentiment classification approaches [Adrian et al. 2013; Yadav et al. 2020; Xu et al. 2020; Neviarouskaya et al. 2011]. Various unsupervised sentence classification techniques have been presented to tackle the challenge of data labelling [Shelke and Chakraborty 2021]. Existing sentiment classification approaches use only sentiment words and phrases, while sentence types and relationships between sentences are rarely considered [Agarwal et al. 2015; Jurek et al. 2015; Vashistha and Susan 2019; Neviarouskaya et al. 2011]. As a result, fuzzy logic has been employed in an unsupervised manner to provide a technique for computing with words; it deals with linguistic as well as cognitive difficulties and provides a deeper look into precise sentiment values [Godara et al. 2022]. For pattern identification and classification, rule-based or classification-based fuzzy systems are powerful and well-known techniques [Zadeh 2015]. Owing to their handling of fuzziness, fuzzy systems manage uncertainty, ambiguity, and vagueness relatively efficiently [Lopez et al. 2015]. In this article, we use a fuzzy rule-based unsupervised sentiment classification technique with three distinct types of lexicons on diverse public datasets to tackle the challenge of data labelling. For three-class sentiment datasets, the implemented fuzzy rule-based method can compute sentiment categorization [Baruah et al. 2021; Sun et al. 2021; Shi et al. 2022]. The second objective is unsupervised machine translation, which has been improved for the morphologically richer Hindi language by resolving word ambiguity through sense disambiguation.
The main contributions of this paper are as follows:
 Unsupervised/self-supervised sentiment categorization using rule-based fuzzy logic has been performed for three distinct types of lexicons on publicly available datasets.
 Three distinct lexicons have been employed in the implementation: AFINN [Nielsen 2011], SentiWordNet [Baccianella et al. 2010], and VADER [Hutto and Gilbert 2014].
 Experimental comparison with supervised machine learning using a naive Bayes classifier and with unsupervised sentiment classification demonstrates that the fuzzy rule-based technique consistently outperforms the supervised machine learning method in terms of F1-micro scores.
 To address the problem of polysemy and homonyms in morphologically rich languages, word sense disambiguation has been added on the source-language side. Different evaluation metrics show improved performance for English-to-Hindi translation.

The rest of the paper is organized as follows: Section 2 contains related work. The rule-based fuzzy computing method for fully unsupervised sentiment classification is defined in Section 3. Section 4 presents word sense disambiguation in machine translation. The experimental setup, acquired results, and analysis are addressed in Sections 5 and 6. Section 7 concludes with the future scope.

2 RELATED WORK

A large quantity of data in text form is gathered and analyzed on social media. This data is subjected to sentiment analysis in order to extract meaning and logical inferences from the vast and dispersed social media datasets [Torrens et al. 2021]. The majority of Sentiment Analysis (SA) techniques focus on polarity, i.e., categorization of opinions into positive, negative, or neutral classes [Wöllmer et al. 2013; Ortigosa et al. 2014; Altrabsheh et al. 2014]. Other categorization challenges include subjectivity vs.
objectivity [Barbosa and Feng 2010], predicting emotion categories [Perikos and Hatzilygeroudis 2013; Strapparava et al. 2007], and sentiment intensity [Thelwall et al. 2012; Shabaz and Garg 2020; Serrano et al. 2021]. SA of social media postings may also be categorized into supervised, semi-supervised, and unsupervised approaches. For sentiment analysis of social media postings, several authors have utilized machine learning approaches such as naive Bayes [Yan et al. 2017; Saleena 2018], the expectation-maximization algorithm, and support vector machines [Domingos et al. 1997; Shelke et al. 2017; Moraes 2013]. Furthermore, feature extraction approaches such as term frequency-inverse document frequency and n-grams are used with the support vector machine technique to categorize texts [Windasari et al. 2017]. To discover emergent themes and study public opinion differences among emerging subjects, another unsupervised method has been proposed [Tan et al. 2013]. Using the VADER lexicon together with the t-test and the Mann-Whitney test [Park et al. 2018], a recent study examined [Vashishtha and Susan 2019] the sentiments regarding various artificial intelligence assistants in order to determine which assistant is statistically superior to the others [Hutto and Gilbert 2014]. A list of affective norms for English words with emotion measures was utilized to create a classification model in another recent study [Montoro et al. 2018; Asghar et al. 2021]. Fuzzy logic converts words into numerical values, so the Tsukamoto fuzzy approach has been utilized for sentiment analysis [Vashishtha and Susan 2020; Sivakumar et al. 2021]. To transform numerical values into fuzzy linguistic terms, it employs the trapezoidal fuzzy membership function [Qin et al. 2022].
This procedure produces two outputs: a dual output that comprises the positive and negative classes, as well as an output that indicates varying levels of emotion intensity [Vashishtha and Susan 2020]. Researchers have used Mamdani [Mamdani and Assilian 1975], a fuzzy rule-based system, for a variety of application problems [Duţu et al. 2017]. Several researchers [Sanz et al. 2013; López et al. 2015] evaluated fuzzy rule-based structures adapted for various application areas. Large amounts of unbalanced data can be handled by the linguistic cost-sensitive fuzzy rule-based classification technique with high precision and quick execution. Rule-based fuzzy logic is capable of handling data with high precision and accuracy while reducing execution time.

Word sense disambiguation (WSD) is the process of determining the correct sense of a word in context. This task is crucial for all machine translation and natural language processing applications, since multiple senses of the same word are rendered differently in other languages (Pelevina et al., 2017). Back translation is the only technique that trains neural machine translation with monolingual data (Artetxe et al., 2018). Back translation is also effective in utilizing monolingual data for domain adaptation (Artetxe et al., 2017).

3 RULE BASED FUZZY COMPUTING APPROACH ON FULLY UNSUPERVISED SENTIMENT CLASSIFICATION

Fig. 3 presents the details of the rule-based fuzzy computing approach for fully unsupervised sentiment classification. First, the input dataset is passed through text preprocessing, which extracts useful information by removing noise; the text must be pre-processed to obtain meaningful and valuable information. The sentiment lexicons provide three different types of lexical features, labelled as positive, negative, and neutral.
Then we use a Mamdani fuzzy rule-based system, which comprises formulation of the input variables, followed by rule evaluation and aggregation of the rule outputs. In the last step, centroid defuzzification is used to obtain the output. Fig. 4 represents the fuzzy deep neural network. It is divided into four layers: an input layer, a fuzzification layer, an inference layer, and a defuzzification layer.

Fig. 3. Illustration of the rule-based fuzzy computing approach for unsupervised sentiment classification: input review data → preprocessing → sentiment lexicon types (SentiWordNet, AFINN, VADER) → fuzzy rule system (formulation of input variables; rule evaluation with max-min inference, among Mamdani, Sugeno, and Tsukamoto styles; aggregation of rule outputs) → centroid defuzzification → output data.

Fig. 4. Fuzzy deep neural network illustration: input layer → fuzzification layer → inference layer → defuzzification layer (weighted sum → output).

1. Preprocessing of Dataset

One of the most important tasks is text preprocessing, which involves extracting valuable information from noisy input. We delete symbols such as "@", "#", "*", and "!" since they do not convey any emotion. In addition, removal of stop words, removal of punctuation, and lemmatization are carried out. Moreover, in phrases like "This suitcase is very heavy!!!" and "I don't like this suitcase!!!", the word "don't" is expanded to "do not".

2. Sentiment Lexicon

Three different sentiment lexicons have been employed in this study. Sentiment lexicons are lists of lexical features. SentiWordNet [Baccianella et al. 2010], AFINN [Nielsen 2011], and VADER [Hutto and Gilbert 2014] each provide positive and negative scores.
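The preprocessing step described above can be sketched in Python. This is a minimal illustration: the regular expression, stop-word list, and contraction map are assumptions, not the authors' exact implementation, and lemmatization is omitted:

```python
import re

# Hypothetical, minimal preprocessing sketch: contraction expansion,
# symbol/punctuation removal, stop-word removal, lowercasing.
CONTRACTIONS = {"don't": "do not", "doesn't": "does not", "can't": "can not"}
STOP_WORDS = {"a", "an", "the", "is", "this", "very"}

def preprocess(text):
    text = text.lower()
    for short, full in CONTRACTIONS.items():   # "don't" -> "do not"
        text = text.replace(short, full)
    text = re.sub(r"[@#*!,.?']", " ", text)    # strip non-emotive symbols
    return [t for t in text.split() if t not in STOP_WORDS]

print(preprocess("I don't like this suitcase!!!"))  # ['i', 'do', 'not', 'like', 'suitcase']
```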
 SentiWordNet Lexicon

SentiWordNet is an extension of WordNet that provides positivity, negativity, and neutrality scores. In this method, the text is preprocessed by removing punctuation, followed by lemmatization, part-of-speech tagging, and word sense disambiguation (WSD). The goal of the word sense disambiguation step is to determine the best matching sense for each (word, tag) pair. Once the score is obtained from the lexicon, a positive or negative score is generated. Eq. 1 and Eq. 2 represent the positive and negative scores of each word, obtained using word sense disambiguation, and these are taken as fuzzy memberships defining the fuzzy sets Positive and Negative in Eq. 3 and Eq. 4. The words in a text whose positive score exceeds their negative score are added together to calculate the positive score of the text, called TextPositive in Eq. 6. Similarly, in Eq. 8, words with a greater negative score than positive score are added together to obtain TextNegative. These scores are calculated for all datasets.

m_positive(a) = positive.score(a)    (1)
m_negative(a) = negative.score(a)    (2)
Positive = {(a, m_positive(a))}, a ∈ X_i    (3)
Negative = {(a, m_negative(a))}, a ∈ X_i    (4)
if (m_positive(a) ≠ 0 and m_positive(a) > m_negative(a))    (5)
then TextPositive = Σ_{a=1}^{n} m_positive(a)    (6)
if (m_negative(a) ≠ 0 and m_negative(a) > m_positive(a))    (7)
then TextNegative = Σ_{a=1}^{n} m_negative(a)    (8)

Here, a = word in a sentence, X_i = set of total words, and n = number of selected words.

 AFINN Lexicon

This lexicon is also used to handle social media datasets. The AFINN lexicon provides the score of each word, as shown in Eq. 9. If the score is greater than 0 (zero), the word is considered positive, and if it is less than 0 (zero), the word is considered negative.
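The SentiWordNet-style scoring of Eqs. 1-8 can be sketched as follows. The per-word scores here are hypothetical stand-ins: a real implementation would query SentiWordNet after POS tagging and sense disambiguation:

```python
# Hypothetical per-word (positive, negative) scores standing in for
# SentiWordNet lookups after POS tagging and word sense disambiguation.
LEXICON = {"good": (0.8, 0.0), "movie": (0.0, 0.0), "boring": (0.0, 0.7)}

def text_scores(tokens):
    """Eqs. 1-8: sum positive scores of words whose positive score
    dominates, and negative scores of words whose negative score dominates."""
    text_positive = text_negative = 0.0
    for a in tokens:
        m_pos, m_neg = LEXICON.get(a, (0.0, 0.0))   # Eqs. 1-2
        if m_pos != 0 and m_pos > m_neg:            # Eq. 5
            text_positive += m_pos                  # Eq. 6
        if m_neg != 0 and m_neg > m_pos:            # Eq. 7
            text_negative += m_neg                  # Eq. 8
    return text_positive, text_negative

print(text_scores(["good", "movie", "boring"]))  # (0.8, 0.7)
```

The same skeleton works for AFINN, where each word has a single signed score that is split by sign into the positive and negative sums (Eqs. 9-13).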
The positive and negative scores of each word can be understood as fuzzy memberships in the fuzzy sets Positive and Negative of Eqs. 3 and 4. Eq. 12 gives the positive text score, obtained by summing the positive word scores; Eq. 13 similarly gives the negative text score by summing the negative word scores.

m(a) = afinn_score(a)    (9)
if (m(a) > 0) then m_positive(a) = m(a)    (10)
if (m(a) < 0) then m_negative(a) = m(a)    (11)
TextPositive = Σ_{a=1}^{n} m_positive(a)    (12)
TextNegative = Σ_{a=1}^{n} m_negative(a)    (13)

 Valence Aware Dictionary and Sentiment Reasoner (VADER) Lexicon

This rule-based sentiment analysis tool is fast, cost-effective, and accurate, and it works well on social network posts. The VADER technique uses the VADER lexicon's polarity-scores approach to compute the total text score and outputs a positive and a negative text score.

3. Fuzzy Rule Based System

In this work, the Mamdani system is used among the Sugeno and Tsukamoto models [Mamdani and Assilian 1975]. It takes dual inputs and a single output, and the linguistic variables are represented by fuzzy sets. The output is represented by C and the inputs by A and B, in a set of r Mamdani-style if-then rules, as in Eq. 14:

Rule R_j: if A is A_1j and B is B_1j then C is C_1j,  j = 1, 2, ..., r    (14)

Here, A_1j and B_1j are fuzzy sets representing the j-th antecedent (premise) pairs, and C_1j is the fuzzy set representing the j-th consequent.

There are four phases in Mamdani-style fuzzy inference, explained below.

Fuzzification of input variables

We use the triangular membership function (TMF) for the positive and negative scores of the text.
When a TMF is employed, three points d, e, f are associated with the change of the fuzzy membership pattern when a linguistic term is involved. A membership function (MF) for a fuzzy set S on the universe of discourse z is mf: z → [0, 1]; z is mapped to a value between 0 and 1, as stated in Eq. 15. A TMF with lower limit d, upper limit f, and intermediate value e satisfies d < e < f:

mf(z) = 0 for z ≤ d;  (z − d)/(e − d) for d ≤ z ≤ e;  (f − z)/(f − e) for e ≤ z ≤ f;  0 for z ≥ f    (15)

Fig. 5. Triangular fuzzy membership function.

In Fig. 5 the parameters are set to d = 0.3, e = 0.5, and f = 0.7. For each lexicon combination, the range is set to 0 to 10, and the range of the positive and negative inputs is computed. We create three fuzzy sets, Low, Medium, and High, for the positive and negative inputs. The mid value is computed as shown in Eq. 16 from the minimum and maximum of all positive and negative scores in the datasets:

Mid value = (minimum + maximum) / 2    (16)

The parameters for Low, Medium, and High are:
Low = {minimum, minimum, mid}
Medium = {minimum, mid, maximum}
High = {mid, maximum, maximum}

For the three output fuzzy sets negative, neutral, and positive, the minimum and maximum values are set to 0 and 10 respectively, so the overall range is 0 to 10. The sentiment sets are therefore:
Negative: {0, 0, 5};  Neutral: {0, 5, 10};  Positive: {5, 10, 10}.
output.negative, output.neutral, and output.positive are the TMFs of the consequent sections of the projected rules and are given in Fig. 6.

Fig. 6. Triangular fuzzy membership sets of the output variable.

Rule base Formulation

The main goal is to map a numerical real value, through a TMF degree, to a fuzzy set labelled with a linguistic term.
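The triangular membership function of Eq. 15 and the Low/Medium/High partition described above can be sketched as follows (a minimal sketch; function and variable names are illustrative):

```python
def triangular_mf(z, d, e, f):
    """Eq. 15: triangular membership with feet d, f and peak e."""
    if z == e:
        return 1.0
    if z <= d or z >= f:
        return 0.0
    if z < e:
        return (z - d) / (e - d)   # rising edge
    return (f - z) / (f - e)       # falling edge

def low_mid_high_sets(minimum, maximum):
    """Eq. 16: Low/Medium/High triangles from the observed score range."""
    mid = (minimum + maximum) / 2
    return {
        "low": (minimum, minimum, mid),
        "medium": (minimum, mid, maximum),
        "high": (mid, maximum, maximum),
    }

sets = low_mid_high_sets(0, 10)
print(sets["medium"])                       # (0, 5.0, 10)
print(triangular_mf(5.0, *sets["medium"]))  # 1.0
```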
Based on this formulation, nine rules were selected: the higher of the two input scores determines a positive or negative output, while comparable scores yield neutral. The goal of the fuzzification stage is to translate a numerical attribute's true value, via a TMF, to a fuzzy set labelled with a linguistic term. Nine rules are chosen for positive, negative, and neutral sentiment. The mathematical form of these rules, with the firing strength of each, is:

1. W1, W5, W9 = Neutral: (low.positive ^ low.negative), (medium.positive ^ medium.negative), (high.positive ^ high.negative)
2. W2, W3, W6 = Positive: (medium.positive ^ low.negative), (high.positive ^ low.negative), (high.positive ^ medium.negative)
3. W4, W7, W8 = Negative: (low.positive ^ medium.negative), (low.positive ^ high.negative), (medium.positive ^ high.negative)

Here ^ is the AND operator.

Aggregation of Rule Output

The overall firing strength, i.e., the degree to which the fuzzy rules relating to negative, neutral, and positive sentiment fire, is shown in Eq. 17. These total firing strengths show the extent to which the antecedent component of each fuzzy rule is met [Jang et al. 1997]. The TMFs of the consequent portions of the corresponding rules give the activations Output_Low_Activation, Output_Medium_Activation, and Output_High_Activation in Eqs. 18-20. Finally, the union operator in Eq. 21 produces the aggregated result.

W_negative = w4 ∨ w7 ∨ w8;  W_positive = w2 ∨ w3 ∨ w6;  W_neutral = w1 ∨ w5 ∨ w9    (17)
Output_Low_Activation = W_negative ∧ output.negative    (18)
Output_High_Activation = W_positive ∧ output.positive    (19)
Output_Medium_Activation = W_neutral ∧ output.neutral    (20)
Final_Aggregated_Output = Output_Low_Activation ∨ Output_High_Activation ∨ Output_Medium_Activation    (21)

Centroid Defuzzification

Defuzzification is the final stage in a fuzzy rule system.
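The nine rules and the aggregation of Eqs. 17-21 can be sketched as follows, using min for AND and max for OR over a sampled output universe (the 10-point grid z = 0..9 on the 0-10 range is an assumption; names are illustrative):

```python
def tmf(z, d, e, f):
    """Eq. 15 triangular membership (peak e, feet d and f)."""
    if z == e:
        return 1.0
    if z <= d or z >= f:
        return 0.0
    return (z - d) / (e - d) if z < e else (f - z) / (f - e)

Z = list(range(10))  # assumed sample grid over the 0-10 output range
OUT_NEG, OUT_NEU, OUT_POS = (0, 0, 5), (0, 5, 10), (5, 10, 10)

def aggregate(mu):
    """Eqs. 17-21. mu maps (positive_level, negative_level) pairs, e.g.
    ('medium', 'low'), to rule firing strengths w1..w9."""
    w_neutral = max(mu[("low", "low")], mu[("medium", "medium")], mu[("high", "high")])
    w_positive = max(mu[("medium", "low")], mu[("high", "low")], mu[("high", "medium")])
    w_negative = max(mu[("low", "medium")], mu[("low", "high")], mu[("medium", "high")])
    low_act = [min(w_negative, tmf(z, *OUT_NEG)) for z in Z]   # Eq. 18
    high_act = [min(w_positive, tmf(z, *OUT_POS)) for z in Z]  # Eq. 19
    med_act = [min(w_neutral, tmf(z, *OUT_NEU)) for z in Z]    # Eq. 20
    return [max(a, b, c) for a, b, c in zip(low_act, med_act, high_act)]  # Eq. 21

levels = ("low", "medium", "high")
mu = {(p, n): 0.0 for p in levels for n in levels}
mu[("low", "medium")] = 0.2     # a Negative rule (w4) fires at 0.2
mu[("medium", "medium")] = 0.8  # a Neutral rule (w5) fires at 0.8
print(aggregate(mu))
# [0.2, 0.2, 0.4, 0.6, 0.8, 0.8, 0.8, 0.6, 0.4, 0.2]
```

With these firing strengths the result reproduces the final aggregated output of the worked example in Section 5.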
The goal of the defuzzification step is to find the linguistic term with the highest membership degree for the value of the class attribute. We chose the centroid defuzzification technique, which returns the Centre of Area (COA), since it produces consistent results [Jang et al. 1997]. Based on the fuzzy set's centre of gravity, this function returns a crisp value. The total area of the TMF distribution depicting the total control action is made up of a number of sub-areas. The centre of gravity of each sub-area is computed, and the combination over the sub-areas defines the defuzzified value for a discrete fuzzy set. The defuzzified output is given by Eq. 22, calculated from the final aggregated output of Eq. 21, where z denotes the sample values of the output variable and μ(z) the aggregated membership:

COA = Σ z · μ(z) / Σ μ(z)    (22)

The defuzzified output is then examined against several ranges in order to categorize the text as negative, neutral, or positive in Eqs. 23-25. Because the output range has min = 0 and max = 10, we split it into three equal pieces: Negative: 0 to max/3; Neutral: max/3 to (2/3)·max; Positive: (2/3)·max to max. The output ranges are set as follows:

Negative output range: 0 < COA < 3.3    (23)
Neutral output range: 3.3 < COA < 6.7    (24)
Positive output range: 6.7 < COA < 10    (25)

This fuzzy approach is applicable to any of the three lexicons and to positive, negative, and neutral class polarity.

4 WORD SENSE DISAMBIGUATION IN MACHINE TRANSLATION

First, word embeddings are generated for the source and target languages. A linear mapping is learned between the source and target embedding spaces. Sense embedding occurs on the source side, which interprets the target-language word based on its context. An illustration of the suggested approach is shown in Fig. 7.
The source dataset is employed in the disambiguation procedure by first creating the word vectors and then producing the cluster semantic network. The senses and sense vectors are produced together with the clusters for each word, aiding the development of the disambiguated corpus. Denoising involves adding noise to an original sentence, by shuffling words, eliminating words, and other methods; an encoder-decoder architecture then attempts to recover the original sentence from the noised input. The model is trained to minimize a loss function that reduces the difference between the output and the original source sentence. The unsupervised MT system generates the final translated output after multiple model iterations alternating between denoising and back translation.

Fig. 7. Toy illustration of the machine translation pipeline: source and target datasets → cross-lingual word embedding with retrieval-technique mapping → word vectors of the source dataset → cluster semantic graphs → sensed vectors developed by word sense disambiguation → word-to-word translation → shared encoder → denoising → backtranslation → MT output.

4.1 Cross Lingual Word Embedding by Retrieval Techniques

Word embedding is a method of representing words numerically. Word vectors can be produced in a variety of ways, such as continuous bag of words, skip-gram, fastText, and GloVe (Joulin et al., 2016). Homonyms are words with numerous meanings, and it is very difficult for cross-lingual word embedding (CLWE) to capture these senses. This is because word embedding is designed to group together words that are distributionally similar, even if their meanings are incompatible. This can be understood from the ambiguous English terms shown in Fig. 8.
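The noising step used in denoising training (word shuffling and deletion) can be sketched as follows. This is a minimal sketch: the deletion probability and the local-shuffle distance are illustrative hyperparameters, not the authors' settings:

```python
import random

def add_noise(tokens, p_drop=0.1, k=3, rng=None):
    """Noise a sentence for denoising training: randomly delete words,
    then locally shuffle the rest (each word moves at most k positions)."""
    rng = rng or random.Random(0)
    kept = [t for t in tokens if rng.random() > p_drop]   # word deletion
    # local shuffle: sort by (index + uniform noise in [0, k))
    keys = [i + rng.uniform(0, k) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

sentence = "the model recovers the original sentence".split()
print(add_noise(sentence))
```

With k = 1 the sort keys stay strictly increasing, so no reordering occurs; larger k allows more aggressive shuffling. The encoder-decoder is then trained to map the noised sequence back to the original one.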
For example, in the first sentence, "You should turn right to reach the supreme court," there is a sense of direction, while in the second, "Hard decisions are often right," the word right refers to the correct action. The term right therefore has a different meaning in each phrase. In sentence 2, "Don't point your fingers at her for your error," to point is to draw someone's attention to something by extending a finger or whatever is in one's hand, while in the other sentence, "He made a valid point in the meeting today," the word point refers to a legitimate view. In sentence 3, "Do you have to fly to Delhi to attend the interview," fly refers to moving fast through the air, while in the other line, "There is a fly in my cup," fly refers to an insect. The problem of ambiguity is thus caused by homonyms or polysemous words that have many meanings. The word retrieval technique does not take the word's context into account. WordNet, a tool similar to a dictionary or thesaurus, was previously employed; however, such methods require a lot of effort and are not appropriate for low-resource languages with small datasets. An unsupervised method for learning the sense vector space links each word to a group of semantically similar words to create a semantic network (Pelevina et al., 2017).

4.2 Unsupervised Sense Embedding

A semantic network of word similarity is first built. The word retrievals from the retrieval technique are given a similarity score; the nearest neighbours are the words with the maximum cosine similarity of their respective word vectors. The second step, the sense induction procedure, entails building an ego network for each word in the vocabulary. In this ego network, words (nodes) with the same meaning are closely connected, while words with a different connotation are less connected.
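The nearest-neighbour step above can be sketched as follows, with toy vectors standing in for trained embeddings (a real system would use fastText vectors over the full vocabulary):

```python
import math

# Toy word vectors standing in for trained fastText embeddings.
VECS = {
    "right": [1.0, 0.2, 0.0],
    "left": [0.9, 0.3, 0.1],
    "correct": [0.2, 1.0, 0.0],
    "cup": [0.0, 0.1, 1.0],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nearest_neighbours(word, n=2):
    """Rank the rest of the vocabulary by cosine similarity to `word`."""
    scores = {w: cosine(VECS[word], v) for w, v in VECS.items() if w != word}
    return sorted(scores, key=scores.get, reverse=True)[:n]

print(nearest_neighbours("right"))  # ['left', 'correct']
```

Note that both senses of "right" (direction and correctness) land among its neighbours, which is exactly the conflation that the ego-network sense induction then separates into distinct sense vectors.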
The cross-lingual sense embedding is developed using the word embeddings of the source and target languages: the target word embedding is mapped to the source language's sense embedding. To do this, the two vector spaces are mapped using unsupervised learning (Artetxe et al., 2018a), under the assumption that an initial dictionary can be induced and that both embedding spaces are approximately isomorphic. Self-learning then improves the initial translation. To improve translation, the sense vector is transferred to the source language using word sense disambiguation. The undreamt model, which includes denoising and backtranslation procedures, is utilized for machine translation (Artetxe et al., 2018). Denoising involves adding noise to the original sentence by shuffling words, deleting words, etc. Then, using an encoder-decoder architecture, the original sentence is recovered from the noised sentences. The model is trained to minimize the loss function, reducing the difference between the output and the original source sentence. The final output is generated by iterating between denoising and backtranslation.

Fig. 8. Examples of sense disambiguation. Right, sentence 1: "You have to take a right turn to reach that supermarket." (right = side) vs. "Her decisions are often right." (right = correct). Point, sentence 2: "Don't point your fingers at her on your mistake." (point = sign) vs. "He made a valid point in the meeting today." (point = view). Fly, sentence 3: "You have to fly to Delhi to attend that interview." (fly = flight) vs. "There is a fly in my cup." (fly = insect).

5 EXPERIMENTAL SETUP, RESULTS AND DISCUSSION

Section 5.1 explains the experimental setup for the fuzzy rule-based system, whereas Section 5.2 covers the experimental setup for sense disambiguation in machine translation.
5.1 Fuzzy Rule Base System

The fuzzy rule-based system for fully unsupervised sentiment polarity classification is implemented in Python 3.6.5 on a Core i5 processor. We use five publicly available datasets: two Twitter datasets, SemEval2020 [Patwa et al. 2020] and SemEval2017 [Simionescu et al. 2019], along with IMDB [IMDB], Stock Market [Stock Market], and Amazon product reviews [Amazon dataset]. Three different types of sentiment lexicons are used: SentiWordNet [Baccianella et al. 2010], VADER [Hutto and Gilbert 2014], and AFINN [Nielsen 2011]. We compare the self-supervised sentiment model with a supervised approach using a naive Bayes classifier and with an unsupervised approach.

Dataset

The SemEval2020 dataset [Patwa et al. 2020] has a total of 13720 sentences and SemEval2017 [Simionescu et al. 2019] has 50333 sentences, whereas the IMDB dataset [IMDB] has 50000 sentences, Stock Market [Stock Market] has 5500, and Amazon [Amazon dataset] has 2950 sentences. The statistics are given in Table 1 and in Figs. 9-13.

Table 1. Statistics of the different datasets

            Twitter SemEval2020  Twitter SemEval2017  IMDB         Stock Market  Amazon
            (Dataset-1)          (Dataset-2)          (Dataset-3)  (Dataset-4)   (Dataset-5)
Positive    4364                 19902                25000        3720          1950
Negative    5264                 22591                25000        2000          750
Neutral     4102                 7840                 nil          nil           250
Total       13720                50333                50000        5500          2950

Fig. 9. Dataset-1: SemEval 2020. Fig. 10. Dataset-2: SemEval 2017. Fig. 11. Dataset-3: IMDB dataset. Fig. 12. Dataset-5: Amazon dataset. Fig. 13. Dataset-4: Stock Market dataset.

Processing of Text by Lexicons

In this section we show how a single text is processed by the rule-based fuzzy computing approach for fully unsupervised sentiment classification.
This can be understood by following one example of how a single text is processed. The text is preprocessed and a list of tokens is created. Since VADER gives the best results among the three lexicons, we process the data using the VADER lexicon. First, the VADER method fetches the score m(a) of each token using Eqs. (7-9), from which a positive and a negative score are calculated for each token (Eqs. 11-12). Universal variables are used to define the triangular fuzzy membership functions low, medium and high. The fuzzy rules shown in Eqs. (14-22) are then applied, and Eqs. (23-26) are used to assess the total firing strength of the text for the various sentiment classes. Fig. 14 depicts the firing strengths of the sentiment classes, shown in three colours: positive = red, neutral = blue, negative = green.

Tweet: "why are you showing onion oil benefits in your ad , when you ate not giving onion oil in your product. i bought this for redensyl content ...but not sure about the percentage of it bein added. secondly it only has onion extracts, which am sure not is effective switching back to my conventional method of makin onion juice and applyin on scalp"

{negative: 0.092, neutral: 0.867, positive: 0.04, compound: 0.3313}
Positive score = 0.0
Negative score = 0.1
Negative score firing strength = 0.2
Neutral score firing strength = 0.8
Positive score firing strength = 0.0
Resultant MFs:
Output low activation = [0.2 0.2 0.2 0.2 0.2 0. 0. 0. 0. 0.]
Output medium activation = [0. 0.2 0.4 0.6 0.8 0.8 0.8 0.6 0.4 0.2]
Output high activation = [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
Final aggregated output: [0.2 0.2 0.4 0.6 0.8 0.8 0.8 0.6 0.4 0.2]
Defuzzified score: 4.81
Defuzzification output: Neutral
Document sentiment is Neutral

Fig. 14. Different sentiment classes shown in colours for the text

Fig. 14 shows the final aggregated output using Eq. 21. For centroid defuzzification in Eq.
22, the area beneath the aggregated result is used, and the sentiment of the text is finally classified as neutral.

5.2 Sense Disambiguation in MT System

• Dataset and preprocessing: To produce the cross-lingual sense embeddings, the monolingual portions of the IIT Bombay corpus were used, and 1000 sentences per language were taken for the test dataset. Preprocessing is crucial for any machine translation effort: sentences are tokenized, and erroneous tokens or tokens from other languages are removed. The quality of the reference and translated sentences has a direct impact on the evaluation score.

• Word embeddings are produced with fastText, following its standard training parameters and architectural choices (Joulin et al., 2016; Artetxe et al., 2018). The highest source-to-target word similarity is determined using a variety of retrieval methods. The encoder and the decoder each consist of a stack of six layers (Vaswani et al., 2017); each decoder layer has three sublayers, and the sublayers are connected through residual connections. The Adam optimizer is used for training (Kingma and Ba, 2014). We use the UNdreaMT model for machine translation (Artetxe et al., 2017).

6 RESULTS AND DISCUSSION

6.1 Fuzzy Rule-based System

The proposed system is compared with one supervised approach involving a naive Bayes classifier [Go et al. 2009] and one unsupervised sentiment classification approach [Hutto and Gilbert 2014]. The unsupervised method uses the VADER lexicon with simple semantic analysis. The fuzzy rule-based method using the VADER lexicon was run on the different datasets, that is, the two Twitter datasets, IMDB, Amazon and Stock Market. The micro and macro F1-scores of all techniques for the various lexicon-dataset combinations are shown in Table 2.
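The clip-aggregate-defuzzify pipeline of the worked example can be sketched as follows. The triangular membership shapes here are plausible stand-ins (the paper's exact output sets are defined by Eqs. 14-22), but the centroid step reproduces the example's defuzzified score of roughly 4.8 from the printed aggregated output:

```python
import numpy as np

# Universe of discourse: 10 points, matching the length of the
# activation vectors printed in the worked example.
x = np.arange(10)

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    return np.maximum(np.minimum((x - a) / (b - a + 1e-9),
                                 (c - x) / (c - b + 1e-9)), 0.0)

# Illustrative low/medium/high output sets (stand-ins, not the paper's exact shapes).
low, med, high = tri(x, -4, 0, 4), tri(x, 0, 4.5, 9), tri(x, 5, 9, 13)

# Clip each output set at its rule firing strength, then aggregate with max.
neg_fs, neu_fs, pos_fs = 0.2, 0.8, 0.0
agg = np.maximum.reduce([np.minimum(low, neg_fs),
                         np.minimum(med, neu_fs),
                         np.minimum(high, pos_fs)])

def centroid(x, mu):
    """Centroid defuzzification: centre of gravity of the aggregated set."""
    return float((x * mu).sum() / mu.sum())

# The example's printed aggregated output defuzzifies to ~4.8,
# matching the reported 4.81 up to the discretization of the universe.
printed = np.array([0.2, 0.2, 0.4, 0.6, 0.8, 0.8, 0.8, 0.6, 0.4, 0.2])
score = centroid(x, printed)
```

A score falling in the middle of the universe maps to the "Neutral" output class, which is how the example tweet is finally labelled.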
Table 2 F1-Scores (VADER lexicon; F1micro / F1macro)

| Method | Dataset-1 | Dataset-2 | Dataset-3 | Dataset-4 | Dataset-5 |
| Baseline approach | 0.710 / 0.629 | 0.677 / 0.601 | 0.2823 / 0.2543 | 0.341 / 0.442 | 0.2401 / 0.2193 |
| Fuzzy rule-based approach | 0.714 / 0.630 | 0.678 / 0.600 | 0.2930 / 0.2491 | 0.392 / 0.440 | 0.2409 / 0.2190 |

Table 2 shows the F1-scores for each dataset. For all datasets the fuzzy rule-based approach consistently gives better results for F1-micro. For Dataset-1 the F1-micro score is 0.714, and the unsupervised fuzzy rule-based approach with the VADER lexicon performed best of all methods. We computed results for all lexicons, and VADER is the best of the three and the most suitable for social-media posts: it is equipped to tackle emojis, slang, emoticons and acronyms found in text, so VADER can yield large benefits on complicated text data. VADER is also the most suitable of the three lexicons for the unsupervised approach because no training dataset is required; the lexicon was created from a gold-standard sentiment lexicon and therefore analyses text more accurately.

Table 3 Precision and Recall Scores for Different Datasets (VADER lexicon; Precision / Recall)

| Method | Dataset-1 | Dataset-2 | Dataset-3 | Dataset-4 | Dataset-5 |
| Baseline approach | 0.528 / 0.589 | 0.376 / 0.444 | 0.6128 / 0.4253 | 0.6023 / 0.4047 | 0.4755 / 0.234 |
| Fuzzy rule-based approach | 0.621 / 0.508 | 0.398 / 0.410 | 0.618 / 0.449 | 0.601 / 0.402 | 0.423 / 0.235 |

In Table 3 we present and compare the scores of the baseline approach and the fuzzy-based approach. The fuzzy-based approach has higher precision and, on most datasets, higher recall than the baseline approach. Execution time for all dataset-lexicon combinations is shown in Table 4; larger datasets take more time than smaller ones.
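The micro- and macro-averaged F1 reported in Tables 2 and 3 can be computed as below; this is a generic sketch of the metrics, not the evaluation script used in the paper:

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Micro- and macro-averaged F1 for multi-class labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    # Micro: pool counts over all classes (equals accuracy for single-label tasks).
    micro_p = sum(tp.values()) / max(sum(tp.values()) + sum(fp.values()), 1)
    micro_r = sum(tp.values()) / max(sum(tp.values()) + sum(fn.values()), 1)
    micro = 2 * micro_p * micro_r / max(micro_p + micro_r, 1e-12)
    # Macro: unweighted mean of the per-class F1 scores.
    per_class = []
    for c in labels:
        p = tp[c] / max(tp[c] + fp[c], 1)
        r = tp[c] / max(tp[c] + fn[c], 1)
        per_class.append(2 * p * r / max(p + r, 1e-12))
    macro = sum(per_class) / len(labels)
    return micro, macro
```

Macro-F1 weights every class equally, which is why it drops sharply on the imbalanced datasets in Table 2 while micro-F1 tracks overall accuracy.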
The execution time depends upon various parameters such as the lexicon type, the corpus size and the calculation method used. The fuzzy-based approach with the VADER lexicon takes less time.

Table 4 Execution Time (in sec) on Different Datasets (VADER lexicon)

| Method | Dataset-1 | Dataset-2 | Dataset-3 | Dataset-4 | Dataset-5 |
| Baseline approach | 1.228 | 0.328 | 1.6344 | 0.90 | 0.838 |
| Fuzzy rule-based approach | 1.081 | 0.218 | 1.289 | 1.00 | 0.816 |

Table 5 Supervised Method's Performance on Five Different Datasets

| Dataset | F1-Micro | F1-Macro | Precision | Recall | Execution Time |
| Dataset-1 | 0.722 | 0.511 | 0.401 | 0.397 | 4300 |
| Dataset-2 | 0.613 | 0.402 | 0.397 | 0.395 | 58.13 |
| Dataset-3 | 0.603 | 0.509 | 0.498 | 0.504 | 7390 |
| Dataset-4 | 0.609 | 0.507 | 0.409 | 0.498 | 293 |
| Dataset-5 | 0.612 | 0.609 | 0.398 | 0.409 | 3029 |

We have also compared the results with a supervised machine learning method, namely naïve Bayes; no lexicon is used in the supervised approach. The F1-micro, F1-macro, precision, recall and execution time are shown in Table 5. The supervised approach takes far more execution time than the fuzzy-based approach. We can conclude that the self-supervised fuzzy rule-based approach with VADER performs better than the supervised approach: supervised learning requires a longer training period and a larger number of samples, whereas the fuzzy rule-based self-supervised technique requires no training time and is unaffected by dataset size. This is one of the method's advantages.

6.2 Word Sense Disambiguation-based MT System

Table 6 presents the evaluation scores for English-Hindi. We can observe that adding sense disambiguation on the source side improves the evaluation scores. For MT evaluation we have used BLEU [Papineni et al., 2002], METEOR [Banerjee and Lavie, 2005], TER [Snover et al., 2009], NIST [Doddington et al., 2002] and ROUGE [Lin et al., 2004].
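As a rough illustration of the supervised baseline, a multinomial naïve Bayes classifier with Laplace smoothing can be sketched as follows; this is a toy stand-in trained on made-up data, not the paper's actual baseline configuration:

```python
import math
from collections import Counter

class NaiveBayes:
    """Multinomial naive Bayes with Laplace (+1) smoothing, illustrating
    the kind of supervised classifier used as the baseline."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.priors = {c: labels.count(c) / len(labels) for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, c in zip(docs, labels):
            self.counts[c].update(doc.split())
        self.vocab = {w for cnt in self.counts.values() for w in cnt}
        return self

    def predict(self, doc):
        def score(c):
            total = sum(self.counts[c].values())
            s = math.log(self.priors[c])
            for w in doc.split():
                # Smoothed word likelihood over the joint vocabulary.
                s += math.log((self.counts[c][w] + 1) /
                              (total + len(self.vocab)))
            return s
        return max(self.classes, key=score)

nb = NaiveBayes().fit(
    ["good great good", "bad awful", "great nice", "bad terrible awful"],
    ["pos", "neg", "pos", "neg"])
```

Unlike the lexicon-driven fuzzy system, this model must first see labelled examples, which is exactly the training cost that Table 5 makes visible.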
Table 6 Evaluation Scores for English-Hindi

| | BLEU | METEOR | TER | NIST | ROUGE |
| Proposed system | 22.6 | 7.9 | 7.2 | 8.9 | 7.1 |
| Baseline work (Artetxe et al., 2018b) | 22.0 | 7.5 | 7.0 | 8.3 | 6.8 |

ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Fig. 15. Evaluation scores for English-Hindi on the different evaluation metrics (proposed work vs. baseline work).

The proposed approach is contrasted with the baseline paper in Fig. 15. The experimental results show a small increase in BLEU over the baseline after applying word sense disambiguation on the source side. Our methods outperform state-of-the-art models at determining the correct word sense because they leverage noise-free source-side context. The proposed approach achieves an improvement of almost +0.6 BLEU points, and improvements of between +0.3 and +0.6 points for METEOR, NIST and ROUGE. The results on the various evaluation metrics demonstrate that, by adding word sense disambiguation to the source-language context, it is possible to select a word's sense more accurately and to obtain results superior to those of the baseline technique.

7 CONCLUDING REMARKS AND FUTURE WORK

In the current study, a fuzzy rule-based approach for self-supervised sentiment polarity classification of social-platform datasets has been compared with unsupervised approaches and with a supervised approach involving a naïve Bayes classifier. The use of fuzzy logic provides a means of computing with words, which is a preferred way to deal with language issues and yields views that are closer to the precise sentiment values. Three different lexicons, SentiWordNet, AFINN and VADER, were applied in isolation to the different datasets. Among the three lexicons, VADER is the most economical without compromising the F1-score and works excellently across the different social-media datasets.
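For reference, a sentence-level BLEU in the spirit of Papineni et al. (2002) can be sketched as below. This simplified, add-one-smoothed version is for illustration of the metric only and is not the evaluation script behind Table 6:

```python
import math
from collections import Counter

def sentence_bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU: geometric mean of clipped n-gram precisions
    times a brevity penalty (add-one smoothing avoids zero precisions)."""
    ref, cand = reference.split(), candidate.split()
    log_prec = 0.0
    for n in range(1, max_n + 1):
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        log_prec += math.log((clipped + 1) / (total + 1)) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(ref) / max(len(cand), 1)))
    return bp * math.exp(log_prec)
```

A perfect match scores 1.0, while a candidate that keeps only a fragment of the reference is penalized both by lower n-gram precision and by the brevity penalty.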
Experimental results show that the fuzzy rule-based approach gives the best results with respect to F1-micro scores compared to the other models. In the majority of datasets, the fuzzy rule-based technique yields greater precision, recall and F1-micro scores. In future work, this approach can be used to incorporate fuzzy inferencing into deep neural network models. Additionally, we have demonstrated how sense embedding improves unsupervised neural machine translation performance. In this strategy, word sense disambiguation has been effectively included to address the issue of polysemous terms in the morphologically rich Hindi language. To eliminate the inaccurate noise brought on by the context of word translation, sense embedding is utilized on the source-language side.

REFERENCES

Abdi, A., Shamsuddin, S. M., Hasan, S., & Piran, J. (2019). Deep learning-based sentiment classification of evaluative text based on multi-feature fusion. Information Processing & Management, 56(4), 1245-1259.
Agarwal, B., Mittal, N., Bansal, P., & Garg, S. (2015). Sentiment analysis using common-sense and context information. Computational Intelligence and Neuroscience, 2015.
Akhtar, S., Ghosal, D., Ekbal, A., Bhattacharyya, P., & Kurohashi, S. (2019). All-in-one: Emotion, sentiment and intensity prediction using a multi-task ensemble framework. IEEE Transactions on Affective Computing.
Altrabsheh, N., Cocea, M., & Fallahkhair, S. (2014, November). Sentiment analysis: towards a tool for analysing real-time students feedback. In 2014 IEEE 26th International Conference on Tools with Artificial Intelligence (pp. 419-423). IEEE.
Amazon Dataset: https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews/data
Artetxe, M., Labaka, G., Agirre, E., & Cho, K. (2017). Unsupervised neural machine translation. arXiv preprint arXiv:1710.11041.
Artetxe, M., Labaka, G., & Agirre, E. (2018).
A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings. arXiv preprint arXiv:1805.06297.
Asghar, M. Z., et al. (2021). Senti-eSystem: A sentiment-based eSystem using hybridized fuzzy and deep neural network for measuring customer satisfaction. Software: Practice and Experience, 51(3), 571-594.
Baccianella, S., Esuli, A., & Sebastiani, F. (2010, May). SentiWordNet 3.0: an enhanced lexical resource for sentiment analysis and opinion mining. In LREC (Vol. 10, No. 2010, pp. 2200-2204).
Barbosa, L., & Feng, J. (2010, August). Robust sentiment detection on twitter from biased and noisy data. In Coling 2010: Posters (pp. 36-44).
Banerjee, S., & Lavie, A. (2005, June). METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization (pp. 65-72).
Bengio, Y., & LeCun, Y. (2007). Scaling learning algorithms towards AI. Large-Scale Kernel Machines, 34(5), 1-41.
Baruah, R., Mundotiya, R. K., & Singh, A. K. (2021). Low Resource Neural Machine Translation: Assamese to/from Other Indo-Aryan (Indic) Languages. Transactions on Asian and Low-Resource Language Information Processing, 21(1), 1-32.
Cambria, E., Poria, S., Hazarika, D., & Kwok, K. (2018, April). SenticNet 5: Discovering conceptual primitives for sentiment analysis by means of context embeddings. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1).
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Doddington, G. (2002, March). Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the Second International Conference on Human Language Technology Research (pp. 138-145).
Domingos, P., & Pazzani, M. (1997).
On the optimality of the simple Bayesian classifier under zero-one loss. Machine Learning, 29(2), 103-130.
Duţu, L. C., Mauris, G., & Bolon, P. (2017). A fast and accurate rule-base generation method for Mamdani fuzzy systems. IEEE Transactions on Fuzzy Systems, 26(2), 715-733.
Go, A., Bhayani, R., & Huang, L. (2009). Twitter sentiment classification using distant supervision. CS224N Project Report, Stanford, 1(12), 2009.
Godara, J., Aron, R., & Shabaz, M. (2022). Sentiment analysis and sarcasm detection from social network to train health-care professionals. World Journal of Engineering, 19(1), 124-133.
Hutto, C., & Gilbert, E. (2014, May). VADER: A parsimonious rule-based model for sentiment analysis of social media text. In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 8, No. 1).
IMDB dataset: https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews.
Jang, J. S. R., Sun, C. T., & Mizutani, E. (1997). Neuro-fuzzy and soft computing: a computational approach to learning and machine intelligence [Book Review]. IEEE Transactions on Automatic Control, 42(10), 1482-1484.
Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.
Jurek, A., Mulvenna, M. D., & Bi, Y. (2015). Improved lexicon-based sentiment analysis for social media analytics. Security Informatics, 4(1), 1-13.
Kim, Y., Geng, J., & Ney, H. (2019). Improving unsupervised word-by-word translation with language model and denoising autoencoder. arXiv preprint arXiv:1901.01590.
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Lin, C. Y. (2004, July). ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out (pp. 74-81).
Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1), 1-167.
López, V., Del Río, S., Benítez, J.
M., & Herrera, F. (2015). Cost-sensitive linguistic fuzzy rule based classification systems under the MapReduce framework for imbalanced big data. Fuzzy Sets and Systems, 258, 5-38.
Mamdani, E. H., & Assilian, S. (1975). An experiment in linguistic synthesis with a fuzzy logic controller. International Journal of Man-Machine Studies, 7(1), 1-13.
Montoro, A., Olivas, J. A., Peralta, A., Romero, F. P., & Serrano-Guerrero, J. (2018, July). An ANEW based fuzzy sentiment analysis model. In 2018 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) (pp. 1-7). IEEE.
Moraes, R., Valiati, J. F., & Gavião Neto, W. P. (2013). Document-level sentiment classification: An empirical comparison between SVM and ANN. Expert Systems with Applications, 40(2), 621-633.
Neviarouskaya, A., Prendinger, H., & Ishizuka, M. (2011). SentiFul: A lexicon for sentiment analysis. IEEE Transactions on Affective Computing, 2(1), 22-36.
Nielsen, F. (2011). AFINN. Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby.
Ortega, R., Fonseca, A., & Montoyo, A. (2013, June). SSA-UO: unsupervised Twitter sentiment analysis. In Second Joint Conference on Lexical and Computational Semantics (*SEM) (Vol. 2, pp. 501-507).
Ortigosa, A., Martín, J. M., & Carro, R. M. (2014). Sentiment analysis in Facebook and its application to e-learning. Computers in Human Behavior, 31, 527-541.
Oyebode, O., Alqahtani, F., & Orji, R. (2020). Using machine learning and thematic analysis methods to evaluate mental health apps based on user reviews. IEEE Access, 8, 111141-111158.
Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002, July). Bleu: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (pp.
311-318).
Park, C. W., & Seo, D. R. (2018, April). Sentiment analysis of Twitter corpus related to artificial intelligence assistants. In 2018 5th International Conference on Industrial Engineering and Applications (ICIEA) (pp. 495-498). IEEE.
Patwa, P., Aguilar, G., Kar, S., Pandey, S., PYKL, S., Gambäck, B., ... & Das, A. (2020). SemEval-2020 Task 9: Overview of sentiment analysis of code-mixed tweets. arXiv e-prints, arXiv-2008.
Pelevina, M., Arefyev, N., Biemann, C., & Panchenko, A. (2017). Making sense of word embeddings. arXiv preprint arXiv:1708.03390.
Perikos, I., & Hatzilygeroudis, I. (2013, September). Recognizing emotion presence in natural language sentences. In International Conference on Engineering Applications of Neural Networks (pp. 30-39). Springer, Berlin, Heidelberg.
Qin, Y., Wang, X., & Xu, Z. (2022). Ranking tourist attractions through online reviews: A novel method with intuitionistic and hesitant fuzzy information based on sentiment analysis. International Journal of Fuzzy Systems, 24(2), 755-777.
Saleena, N. (2018). An ensemble classification system for twitter sentiment analysis. Procedia Computer Science, 132, 937-946.
Sanz, J. A., Fernández, A., Bustince, H., & Herrera, F. (2013). IVTURS: A linguistic fuzzy rule-based classification system based on a new interval-valued fuzzy reasoning method with tuning and rule selection. IEEE Transactions on Fuzzy Systems, 21(3), 399-411.
Serrano-Guerrero, J., Romero, F. P., & Olivas, J. A. (2021). Fuzzy logic applied to opinion mining: a review. Knowledge-Based Systems, 222, 107018.
Shabaz, M., & Garg, U. (2020). Clustering Yelp's sentiment data through various approaches and estimating the error rate. Materials Today: Proceedings. https://doi.org/10.1016/j.matpr.2020.09.346.
Shelke, N. M., Deshpande, S., & Thakre, V. (2017, March). Exploiting expectation maximization algorithm for sentiment analysis of product reviews.
In 2017 International Conference on Inventive Communication and Computational Technologies (ICICCT) (pp. 390-396). IEEE.
Shelke, Y., & Chakraborty, C. (2021). An End-to-End Shape-Preserving Point Completion Network. IEEE Computer Graphics and Applications, 41(3), 124-138.
Shi, S., Wu, X., Su, R., & Huang, H. (2022). Low-Resource Neural Machine Translation: Methods and Trends. Transactions on Asian and Low-Resource Language Information Processing.
Simionescu, C., Stoleru, I., Lucaci, D., Balan, G., Bute, I., & Iftene, A. (2019, June). UAIC at SemEval-2019 Task 3: Extracting Much from Little. In Proceedings of the 13th International Workshop on Semantic Evaluation (pp. 355-359).
Sivakumar, M., & Srinivasulu, R. U. (2021). Aspect-based sentiment analysis of mobile phone reviews using LSTM and fuzzy logic. International Journal of Data Science and Analytics, 12(4), 355-367.
Snover, M., Madnani, N., Dorr, B., & Schwartz, R. (2009, March). Fluency, adequacy, or HTER? Exploring different human judgments with a tunable MT metric. In Proceedings of the Fourth Workshop on Statistical Machine Translation (pp. 259-268).
Stock Market: https://www.kaggle.com/yash612/stockmarket-sentiment-dataset.
Strapparava, C., & Mihalcea, R. (2007, June). SemEval-2007 Task 14: Affective text. In Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007) (pp. 70-74).
Sun, H., Wang, R., Utiyama, M., Marie, B., Chen, K., Sumita, E., & Zhao, T. (2021). Unsupervised neural machine translation for similar and distant language pairs: An empirical study. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 20(1), 1-17.
Tan, S., Li, Y., Sun, H., Guan, Z., Yan, X., Bu, J., ... & He, X. (2013). Interpreting the public sentiment variations on twitter. IEEE Transactions on Knowledge and Data Engineering, 26(5), 1158-1170.
Thelwall, M., Buckley, K., & Paltoglou, G. (2012). Sentiment strength detection for the social web.
Journal of the American Society for Information Science and Technology, 63(1), 163-173.
Tripathy, A., Agrawal, A., & Rath, S. K. (2016). Classification of sentiment reviews using n-gram machine learning approach. Expert Systems with Applications, 57, 117-126.
Torrens Urrutia, A., Jiménez-López, M. D., & Novák, V. (2021). Fuzzy Natural Logic for Sentiment Analysis: A Proposal. In International Symposium on Distributed Computing and Artificial Intelligence. Springer, Cham.
Vashishtha, S., & Susan, S. (2019). Fuzzy rule based unsupervised sentiment analysis from social media posts. Expert Systems with Applications, 138, 112834.
Vashishtha, S., & Susan, S. (2020, July). Fuzzy interpretation of word polarity scores for unsupervised sentiment analysis. In 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (pp. 1-6). IEEE.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
Windasari, I. P., Uzzi, F. N., & Satoto, K. I. (2017, October). Sentiment analysis on Twitter posts: An analysis of positive or negative opinion on GoJek. In 2017 4th International Conference on Information Technology, Computer, and Electrical Engineering (ICITACEE) (pp. 266-269). IEEE.
Wöllmer, M., Weninger, F., Knaup, T., Schuller, B., Sun, C., Sagae, K., & Morency, L. P. (2013). YouTube movie reviews: Sentiment analysis in an audio-visual context. IEEE Intelligent Systems, 28(3), 46-53.
Xu, F., Pan, Z., & Xia, R. (2020). E-commerce product review sentiment classification based on a naïve Bayes continuous learning framework. Information Processing & Management, 57(5), 102221.
Yadav, A., Jha, C. K., Sharan, A., & Vaish, V. (2020). Sentiment analysis of financial news using unsupervised approach. Procedia Computer Science, 167, 589-598.
Yan, Y., Yang, H., & Wang, H. M. (2017, July).
Two simple and effective ensemble classifiers for Twitter sentiment analysis. In 2017 Computing Conference (pp. 1386-1393). IEEE.
Yang, L., Li, Y., Wang, J., & Sherratt, R. S. (2020). Sentiment analysis for E-commerce product reviews in Chinese based on sentiment lexicon and deep learning. IEEE Access, 8, 23522-23530.
Yoo, S., Song, J., & Jeong, O. (2018). Social media contents based sentiment analysis and prediction system. Expert Systems with Applications, 105, 102-111.
Zadeh, L. A. (2015). Fuzzy logic—a personal perspective. Fuzzy Sets and Systems, 281, 4-20.