Exclude retweets from the Twitter Streaming API using tweepy

When using the Python library tweepy to pull tweets from the Twitter Streaming API, can retweets be excluded?

For example, I want only tweets sent by a specific user, e.g. twitterStream.filter(follow=["20264932"]), but this also returns retweets, and I would like to exclude them. How can I do this?

Thanks in advance.

+4
2 answers

Just checking whether the tweet text starts with “RT” is not a reliable solution. You need to decide what you consider a retweet, as this is not entirely clear-cut. The Twitter API docs explain that tweets with “RT” in the tweet text are not official retweets.

Sometimes people type RT at the beginning of a Tweet to indicate that they are re-posting someone else's content. This is not an official Twitter command or feature, but signifies that they are quoting another user's Tweet.

If you follow the “official” definition, you want to filter out tweets that have a value of True for their retweeted attribute, for example:

if not tweet['retweeted']:
    # do something with standard tweets
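
One caveat worth checking against the API docs: as far as I know, the retweeted flag describes whether the authenticating user has retweeted the tweet, so it may be False for retweets arriving on a stream. A minimal sketch of another marker, assuming v1.1-style payloads where a retweet embeds the original tweet under retweeted_status. The helper name is_official_retweet and the sample dicts are hypothetical:

```python
def is_official_retweet(tweet):
    """Return True if the decoded tweet dict looks like an official retweet."""
    # Assumption: v1.1-style payloads embed the original tweet under
    # "retweeted_status" when the tweet is a retweet.
    return "retweeted_status" in tweet


# Hypothetical, heavily trimmed payloads for illustration only.
original = {"text": "Hello world", "retweeted": False}
retweet = {
    "text": "RT @someone: Hello world",
    "retweeted": False,
    "retweeted_status": {"text": "Hello world"},
}

print(is_official_retweet(original))  # False
print(is_official_retweet(retweet))   # True
```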

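If you also want to exclude the “unofficial” retweets, check whether the text contains "RT @" rather than whether it starts with "RT". A user may add their own text before the "RT", and a tweet that merely begins with the letters "RT" is not necessarily a retweet. For example: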

if not tweet['retweeted'] and 'RT @' not in tweet['text']:
    # do something with standard tweets

This skips official retweets as well as any tweet whose text contains "RT @".
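
To see how the combined condition behaves, here is a small sketch that wraps it in a function and runs it over a few made-up payloads. The name is_retweet and the sample tweets are hypothetical, and only the two fields used by the condition are included:

```python
def is_retweet(tweet):
    """Return True for flagged retweets and for texts containing 'RT @'."""
    text = tweet.get("text", "")
    return bool(tweet.get("retweeted")) or "RT @" in text


# Hypothetical sample payloads, trimmed to the two fields used above.
samples = [
    {"text": "Plain status update", "retweeted": False},
    {"text": "RT @someone: original content", "retweeted": False},
    {"text": "So true! RT @someone: original content", "retweeted": False},
    {"text": "RT is short for retweet", "retweeted": False},
    {"text": "Flagged by the API", "retweeted": True},
]

kept = [t["text"] for t in samples if not is_retweet(t)]
print(kept)  # ['Plain status update', 'RT is short for retweet']
```

Note that the fourth sample starts with "RT" but is kept, which a plain startswith('RT') test would have dropped.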

+11

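Retweets typically start with the text RT. You can check for this with .startswith() in the on_data() method of your stream listener, like this: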

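import json
import tweepy

# tweepy.StreamListener exists in tweepy 3.x; it was removed in tweepy 4.0
class TwitterStreamListener(tweepy.StreamListener):
    def on_data(self, data):
        # Twitter returns data in JSON format - we need to decode it first
        decoded = json.loads(data)
        # Not every stream message is a tweet, so 'text' may be missing
        text = decoded.get('text', '')
        if text and not text.startswith('RT'):
            # Do processing here
            print(text)
        return True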
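
The filtering step inside on_data() can be tried without a live stream. A minimal sketch, assuming the raw messages are JSON strings like the ones the listener receives. The function name original_text and the sample messages are hypothetical stand-ins:

```python
import json


def original_text(raw):
    """Return the tweet text from a raw JSON message, or None if skipped."""
    decoded = json.loads(raw)
    text = decoded.get("text")
    if text is None:
        return None  # not a tweet, e.g. some other kind of stream message
    if text.startswith("RT"):
        return None  # treated as a retweet by the startswith heuristic
    return text


# Hypothetical raw messages standing in for what on_data() receives.
messages = [
    json.dumps({"text": "Just a normal tweet"}),
    json.dumps({"text": "RT @someone: repeated content"}),
    json.dumps({"delete": {"status": {"id": 1}}}),
]

for raw in messages:
    text = original_text(raw)
    if text is not None:
        print(text)
```

Keeping the check in a plain function like this makes it easy to swap the heuristic later without touching the listener class.
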
+3
