Is it possible to guess the mood of the user based on the structure of the text?

I assume a natural language processor would be needed to analyze the text itself, but what suggestions do you have for an algorithm that determines a user's mood based on text they have written? I doubt it would be very accurate, but I'm still interested.

EDIT: I'm by no means a specialist in linguistics or natural language processing, so I apologize if this question is too general or stupid.

+54
algorithm nlp
Jun 01 '09 at 0:43
11 answers

This is the basis of an area of natural language processing called sentiment analysis. Although your question is general, it is certainly not stupid - this kind of research is done by Amazon on the text of product reviews, for example.

If you are serious about this, then a simple version could be achieved as follows:

  • Acquire a corpus of positive/negative sentiment. If this were a professional project, you might spend some time manually annotating a corpus yourself, but if you are in a hurry or just want to experiment first, I would suggest looking at the sentiment polarity corpus from Bo Pang and Lillian Lee's research. The problem with using that corpus is that it is not tailored to your domain (specifically, the corpus uses movie reviews), but it should still be applicable.

  • Split your dataset into sentences labeled either Positive or Negative. For the sentiment polarity corpus, you could split each review into its component sentences and then apply the overall sentiment polarity tag (positive or negative) to all of those sentences. Split this corpus into two parts - 90% for training, 10% for testing. If you use Weka, it can handle the splitting of the corpus for you.

  • Apply a machine learning algorithm (e.g., SVM, Naive Bayes, Maximum Entropy) to the training corpus at the word level. This is called the bag-of-words model, which simply represents a sentence as the words it is composed of. This is the same model that many spam filters run on. For an introduction to machine learning algorithms, there is an application called Weka that implements a range of these algorithms and gives you a GUI to play with them. You can then test the performance of the machine-learned model by looking at the errors it makes when classifying your test corpus.

  • Apply the trained model to your user posts. For each user post, split the post into sentences and then classify them using the machine-learned model.
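
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a hand-written multinomial Naive Bayes with add-one smoothing in place of Weka, and a made-up four-sentence training set in place of a real annotated corpus. The names `tokenize`, `NaiveBayes` and `TRAINING` are hypothetical and chosen for this example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and return its word tokens (the bag of words)."""
    return re.findall(r"[a-z']+", sentence.lower())


class NaiveBayes:
    """Multinomial Naive Bayes over word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = {}             # label -> Counter of word frequencies
        self.sentence_counts = Counter()  # label -> number of training sentences
        self.vocabulary = set()           # every word seen during training

    def train(self, labeled_sentences):
        for sentence, label in labeled_sentences:
            self.sentence_counts[label] += 1
            counts = self.word_counts.setdefault(label, Counter())
            for word in tokenize(sentence):
                counts[word] += 1
                self.vocabulary.add(word)

    def classify(self, sentence):
        total = sum(self.sentence_counts.values())
        best_label, best_score = None, float("-inf")
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods per word.
            score = math.log(self.sentence_counts[label] / total)
            denominator = sum(counts.values()) + len(self.vocabulary)
            for word in tokenize(sentence):
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy data; a real experiment would use an annotated corpus.
TRAINING = [
    ("I love this, it is wonderful", "positive"),
    ("What a great and happy day", "positive"),
    ("I hate this, it is awful", "negative"),
    ("What a terrible and sad day", "negative"),
]

model = NaiveBayes()
model.train(TRAINING)
print(model.classify("this is great"))
print(model.classify("this is awful"))
```

With a real corpus you would hold out roughly 10% of the labeled sentences and compare `classify` against their known labels to estimate how well the model performs.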

So yes, if you are serious about this, then it is achievable - even without past experience in computational linguistics. It would be a fair amount of work, but even with word-based models, good results can be achieved.

If you need more help, feel free to contact me - I am always happy to help others interested in NLP =]




Small notes -

  • Simply splitting a segment of text into sentences is a field of NLP called sentence boundary detection. There are a number of tools, OSS or free, available to do this, but for your task a simple split on whitespace and punctuation should be fine.
  • SVMlight is another machine learner to consider, and in fact their inductive SVM example does a similar task to what we are looking at - trying to classify Reuters articles about "corporate acquisitions" with 1000 positive and 1000 negative examples.
  • Turning the sentences into features to classify over may take some work. In this model, each word is a feature - this requires tokenizing the sentence, which means separating words and punctuation from each other. Another tip is to lowercase all the separate word tokens so that "I HATE you" and "I hate YOU" both end up being considered the same. With more data, you could also try including whether capitalization helps in classifying whether someone is angry, but I believe words should be sufficient, at least for an initial effort.
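
As a rough sketch of the first and last notes, the snippet below splits text into sentences on end punctuation and then tokenizes and lowercases each sentence. It assumes plain regular expressions are good enough for an experiment; the function names `split_sentences` and `tokenize` are hypothetical, and a proper sentence boundary detector would handle abbreviations and similar cases that this does not.

```python
import re


def split_sentences(text):
    """Naively split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Separate words from punctuation marks and lowercase every token."""
    return re.findall(r"\w+|[^\w\s]", sentence.lower())


post = "I HATE you. Why? Go away!"
for sentence in split_sentences(post):
    print(tokenize(sentence))
```

Because every token is lowercased, "I HATE you" and "I hate YOU" produce the same feature list.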



Edit

I just discovered LingPipe, which in fact has a tutorial on sentiment analysis using the Bo Pang and Lillian Lee sentiment polarity corpus I was talking about. If you use Java, that may be an excellent tool to use, and even if you don't, the tutorial goes through all of the steps described above.

+63
Jun 06 '09 at 7:12

Undoubtedly, one can judge the user's mood based on the text they type, but this would not be a trivial task. Things I can think of:

  • Capitals tend to signify agitation, annoyance or frustration and are certainly an emotional response, but then again some newcomers do this because they do not realize the significance, so you cannot assume it without looking at what else they have written (to make sure that it is not all in caps);
  • Capitals are really just one form of emphasis. Others are the use of certain aggressive colors (like red) or the use of bold or larger fonts;
  • Some people make more spelling and grammar mistakes and typos when they are very emotional;
  • Scanning for emoticons can give you a very clear idea of how the user feels, but again something like :) could be interpreted as happy, as "I told you so", or even as sarcastic;
  • The use of expletives tends to have a clear meaning, but again it is not clear-cut. Many people's conversational speech will routinely contain certain four-letter words. Other people might not even say "hell" and will use a milder substitute instead, so for them any expletive, even a mild one, is significant;
  • Groups of punctuation marks (for example, @#$@$@) tend to be substituted for expletives in contexts where expletives are not necessarily appropriate, so they are less likely to be merely conversational;
  • Exclamation marks may indicate surprise, shock, or irritation.
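
Several of the cues in this list are easy to count, even if interpreting them is not. The following is a minimal sketch, assuming simple regular expressions and a hypothetical function name `surface_cues`; it only extracts raw counts and makes no claim about what mood they indicate.

```python
import re

# Illustrative patterns only; real text contains many more emoticon variants.
HAPPY = re.compile(r"[:;]-?\)")
SAD = re.compile(r":-?\(")
SYMBOL_RUN = re.compile(r"[@#$%&*]{3,}")


def surface_cues(text):
    """Count surface features that may hint at an emotional state."""
    words = re.findall(r"[A-Za-z]+", text)
    # Ignore one-letter words such as "I", which are capitalized anyway.
    shouted = [w for w in words if len(w) > 1 and w.isupper()]
    return {
        "caps_ratio": len(shouted) / len(words) if words else 0.0,
        "exclamations": text.count("!"),
        "happy_emoticons": len(HAPPY.findall(text)),
        "sad_emoticons": len(SAD.findall(text)),
        "symbol_runs": len(SYMBOL_RUN.findall(text)),
    }


print(surface_cues("WHY is this BROKEN!!! :("))
```

Counts like these could be fed to a classifier as extra features alongside the words themselves, rather than being read as a verdict on their own.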

You can look at Advances in Written Text Analysis, or even Determining Mood for a Blog by Combining Multiple Sources of Evidence.

Finally, it is worth noting that written text is generally perceived as more negative than it actually is. This is a common problem with email communication in companies, as one example.

+12
Jun 01 '09 at 0:56

I can't believe I'm taking this seriously ... assuming a one-dimensional space:

  • If the text contains a curse word, -10 mood.
  • I think that exclamations would tend to be negative, so -2 mood.
  • When I get frustrated, I type Very. Short. Sentences. -5 mood.
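
These three rules translate directly into a scoring function. The sketch below uses the weights from the list; everything else is an assumption made for illustration: the placeholder word list `CURSE_WORDS`, the function name `mood_score`, and the definition of "short sentences" as at least two sentences of at most two words each.

```python
import re

CURSE_WORDS = {"damn", "hell"}  # hypothetical placeholder list


def mood_score(text):
    """Return a one-dimensional mood score; lower means a worse mood."""
    score = 0
    words = re.findall(r"[a-z']+", text.lower())
    if any(word in CURSE_WORDS for word in words):
        score -= 10
    if "!" in text:
        score -= 2
    # Assumed threshold: two or more sentences, none longer than two words.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if len(sentences) >= 2 and all(len(s.split()) <= 2 for s in sentences):
        score -= 5
    return score


print(mood_score("Very. Short. Sentences."))
print(mood_score("Damn! This is broken again and I am tired of it."))
```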

The more I think about this, the clearer it is that many of these signifiers indicate an extreme mood in general, but it is not always clear which kind of mood.

+3
Jun 01 '09 at 0:51

If you support fonts, bold red text probably means an angry user. Regular green text with butterfly patterns means a happy one.

+3
Aug 24 '09 at 5:46

My memory is not very good in this matter, but I believe that I have seen some research on the structure of the grammar of the text and the general tone. It can be as simple as shorter words and words expressing emotions (well, swear words are pretty obvious).

Edit: I noticed that the first person to answer made a similar point. Perhaps there really is something to the idea about shorter sentences.

+2
Jun 01 '09 at 1:01

Analysis of mood and behavior is a very serious science. Although other answers mock the question, law enforcement agencies have been investigating the categorization of mood for many years. The computer applications I have heard about usually had more context (timing information, voice patterns, the rate of changing channels). I think you could, with some success, determine whether a user is in a particular mood by training a neural network with samples from two known groups: angry and not angry. Good luck with your efforts.

+1
Jun 01 '09 at 1:04

I agree with ojblass that this is a serious matter.

Classification of moods is currently a hot topic in the field of speech recognition. If you think about it, an Interactive Voice Response (IVR) application should treat angry customers much differently than calm ones: angry people should be routed quickly to human operators with the right experience and training. Vocal tone is a fairly reliable indicator of emotion, practical enough that companies are eager to make it work. Google "Speech Recognition," or read this article to find out more.

The situation should be no different in web interfaces. Returning to cletus's comments, the analogy between recognizing emotion in text and in speech is interesting. If a person types in CAPITALS, they are said to be "shouting", just as if their voice had risen in volume and pitch in a voice interface. Detecting typed profanity is analogous to "keyword spotting" of profanity in speech systems. If a person is upset, they will make more errors using either a graphical interface or a voice user interface (VUI) and can be routed to a human.

There is also a "multimodal" area of emotion research. Imagine a web interface that you can also talk to (along the lines of the IBM/Motorola/Opera XHTML+Voice Profile prototype). Emotion detection could be based on a combination of cues from the speech and visual input modalities.

+1
Jun 01 '09 at 2:36

I think my algorithm is rather simple, but why not count the emoticons in the text, :) vs :( ?

Obviously, the text ":) :) :) :) :)" resolves to a happy user, while ":( :( :(" will surely resolve to a sad one. Enjoy!
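
Taken literally, that comparison is a few lines of code. This sketch assumes only the two emoticons mentioned and a hypothetical function name `emoticon_mood`; ties and texts without emoticons are reported as "unknown".

```python
def emoticon_mood(text):
    """Compare the number of ':)' and ':(' occurrences in the text."""
    happy = text.count(":)")
    sad = text.count(":(")
    if happy > sad:
        return "happy"
    if sad > happy:
        return "sad"
    return "unknown"


print(emoticon_mood(":) :) :) :) :)"))
print(emoticon_mood(":( :( :("))
```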

+1
Jun 03 '09 at 14:24

Yes.

Whether it can actually be done is another story. At first glance the problem appears to be AI-complete.

Now, if you had keystroke timings, you should be able to figure it out.

0
Jun 01 '09 at 0:54

I think fuzzy logic would do. In any case, it would be quite easy to start with a few rules for determining the user's mood, and then extend and combine the "engine" with more accurate and sophisticated ones.

0
Nov 20 '09 at 10:15

If the user types the following characters, then he is very angry; first try to calm him down ...

`K` `k` 
0
Jul 28 '17 at 20:13


