Updating the tweets.spam column by joining one or two other tables

Consider the following three MySQL tables:

 tweets                           urls                     tweets_urls
 ------------------------------   ---------------------   ----------------
 tweet_id  text            spam   url_id  host     spam   tweet_id  url_id
 ------------------------------   ---------------------   ----------------
 1         I love cnn.com  0      16      cnn.com  0      1         16
 2         fox.com is fuk  0      17      fox.com  1      2         17
 3         love me!        0                              4         16
 4         blah cnn.com    0
 5         nice fox.com    0

I want to update tweets.spam according to tweets_urls and urls, which means the result of the query should be:

 tweets
 ------------------------------
 tweet_id  text            spam
 ------------------------------
 1         I love cnn.com  0      <-- tweets_urls tells me tweet_id 1 has url_id 16
 2         fox.com is fuk  1          in it, and the urls table tells me that url 16
 3         love me!        0          is not spam (spam = 0)
 4         blah cnn.com    0
 5         nice fox.com    1

I hope I made myself clear. I have been messing around with this and now have something like the query below. I know it is wrong, but I do not know how to start over. Do you?

 UPDATE tweets
 SET spam = (
     SELECT spam
     FROM urls
     LEFT JOIN tweets_urls
     WHERE urls.url_id = tweets_urls.url_id
 )

Any help would be appreciated :-)

2 answers

For your data, this query returns the result set you describe:

 SELECT t.tweet_id
      , t.text
      , IFNULL(s.spam, t.spam) AS spam
   FROM tweets t
   LEFT
   JOIN ( SELECT tu.tweet_id, MAX(u.spam) AS spam
            FROM tweets_urls tu
            JOIN urls u
              ON u.url_id = tu.url_id
           WHERE u.spam = 1
           GROUP BY tu.tweet_id
        ) s
     ON s.tweet_id = t.tweet_id

But this makes some assumptions about what should happen when there is more than one row in tweets_urls for a given tweet_id, when there is no matching URL, and so on.
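To see whether those edge cases actually occur in your data, a quick check along these lines might help. This is only a sketch against the three tables from the question; the alias url_count is a name I made up for illustration.

```sql
-- List tweets that have no row in tweets_urls, or more than one.
-- url_count is a hypothetical alias, not a column in your schema.
SELECT t.tweet_id,
       COUNT(tu.url_id) AS url_count
  FROM tweets t
  LEFT JOIN tweets_urls tu
    ON tu.tweet_id = t.tweet_id
 GROUP BY t.tweet_id
HAVING COUNT(tu.url_id) <> 1;
```

If this returns no rows, every tweet has exactly one URL and the assumptions do not matter for your data.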

Suppose you want a tweet to be marked "spam = 1" when it matches ANY URL that is marked "spam = 1", and "spam = 0" otherwise.

This will set the spam column for every row in tweets based on that rule:

 UPDATE tweets t
   LEFT
   JOIN ( SELECT tu.tweet_id, MAX(u.spam) AS spam
            FROM tweets_urls tu
            JOIN urls u
              ON u.url_id = tu.url_id
           WHERE u.spam = 1
           GROUP BY tu.tweet_id
        ) s
     ON s.tweet_id = t.tweet_id
    SET t.spam = IFNULL(s.spam, 0)

If you want to leave the spam column alone (keep whatever value it currently has), and ONLY want to update rows where the value is currently 0 and should be 1 according to the "matching url has spam = 1" rule, you can do this:

 UPDATE tweets t
   JOIN ( SELECT tu.tweet_id
            FROM tweets_urls tu
            JOIN urls u
              ON u.url_id = tu.url_id
           WHERE u.spam = 1
           GROUP BY tu.tweet_id
        ) s
     ON s.tweet_id = t.tweet_id
    SET t.spam = 1
  WHERE t.spam = 0

Note that the predicate on the tweets table means we ONLY update rows where spam is currently zero. Note also that we do not need to reference the value of the spam column from the urls table: we are already testing that it is 1, so we can use the literal 1 when assigning the value to the tweets.spam column. Finally, note that we are doing an INNER JOIN (not a LEFT OUTER JOIN), so again we only update the rows that will be assigned the value 1.
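Before running that last UPDATE, it may be worth previewing which rows it would touch. The following is a rough sketch, assuming the same three tables; it uses EXISTS rather than the derived table, which should select the same tweets.

```sql
-- Preview: tweets currently flagged 0 that are linked to
-- at least one URL flagged as spam.
SELECT t.tweet_id, t.text, t.spam
  FROM tweets t
 WHERE t.spam = 0
   AND EXISTS ( SELECT 1
                  FROM tweets_urls tu
                  JOIN urls u
                    ON u.url_id = tu.url_id
                 WHERE tu.tweet_id = t.tweet_id
                   AND u.spam = 1 );
```

The rows returned here are the ones the inner-join UPDATE would set to 1.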



You forgot to correlate the subquery with the tweets table, and you forgot the ON clause in your join:

 UPDATE tweets
 SET spam = (
     SELECT spam
     FROM urls
     LEFT JOIN tweets_urls
         ON urls.url_id = tweets_urls.url_id
     WHERE tweets_urls.tweet_id = tweets.tweet_id
 )

You also have not defined what should happen if:

  • tweets_urls has no entry for a tweet_id
  • tweets_urls has several entries for a tweet_id
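One possible way to make the correlated subquery cope with both cases is sketched below. It assumes a rule you have not confirmed: a tweet counts as spam if any of its URLs is spam, and a tweet with no URL is not spam. Under that assumption, MAX collapses several entries into one value and COALESCE supplies 0 when there are none.

```sql
-- Assumed rule: any spam URL => spam = 1; no URL at all => spam = 0.
-- Note this overwrites whatever tweets.spam held before.
UPDATE tweets
SET spam = COALESCE(
    ( SELECT MAX(urls.spam)
      FROM urls
      JOIN tweets_urls
          ON urls.url_id = tweets_urls.url_id
      WHERE tweets_urls.tweet_id = tweets.tweet_id ),
    0
);
```

Because the subquery is an aggregate without GROUP BY, it always yields exactly one row, so it cannot fail with a "subquery returns more than 1 row" error.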

Finally, as a side note, are you sure you want to UPDATE like this? This looks more like something you would produce with a view or a stored procedure, unless urls and tweets_urls are just tables you added to help populate the tweets table and intend to drop later.
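If you go the view route, it could look roughly like this. The view name tweets_with_spam is made up for illustration, and the sketch assumes the "any spam URL marks the tweet as spam, no URL means not spam" rule. It derives the flag on every read and ignores the stored tweets.spam column.

```sql
-- tweets_with_spam is a hypothetical name; pick whatever fits your schema.
CREATE VIEW tweets_with_spam AS
SELECT t.tweet_id,
       t.text,
       COALESCE(MAX(u.spam), 0) AS spam
  FROM tweets t
  LEFT JOIN tweets_urls tu
    ON tu.tweet_id = t.tweet_id
  LEFT JOIN urls u
    ON u.url_id = tu.url_id
 GROUP BY t.tweet_id, t.text;
```

With a view, a change to urls.spam is reflected immediately, at the cost of computing the join on each query instead of storing the result.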
