MySQL SELECT returns incorrect results

I am working with MySQL 5.7. I created a table with a virtual column (not saved) of type DATETIME with an index on it. While I was working on this, I noticed that the order did not return all the data (some of the data that I expected at the top was missing). Also, the results of MAX and MIN were incorrect. After launch

ANALYZE TABLE CHECK TABLE OPTIMIZE TABLE 

then the results were correct. I think there was a problem with the index data, so I have a few questions:

  • When and why can this happen?
  • Is there any way to prevent this?
  • among the running command 3, which is the right one to use?

I am worried that this may happen in the future, but I will not notice.

EDIT

as suggested in the comments, I added a table definition:

 CREATE TABLE `items` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `user_id` bigint(20) unsigned DEFAULT NULL, `image` json DEFAULT NULL, `status` json DEFAULT NULL, `status_expired` tinyint(1) GENERATED ALWAYS AS (ifnull(json_contains(`status`,'true','$.expired'),false)) VIRTUAL COMMENT 'used for index: it checks if status contains expired=true', `lifetime` tinyint(4) NOT NULL, `expiration` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL, `last_update` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `create_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (`id`), KEY `user_id` (`user_id`), KEY `expiration` (`status_expired`,`expiration`) USING BTREE, CONSTRAINT `ts_competition_item_ibfk_2` FOREIGN KEY (`user_id`) REFERENCES `ts_user_core` (`user_id`) ON DELETE CASCADE ON UPDATE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=1312459 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED 

Queries that return incorrect results:

 SELECT * FROM items ORDER BY expiration DESC; SELECT max(expiration),min(expiration) FROM items; 

thanks

+7
sql mysql select virtual-column
source share
1 answer

TL; DR;

The problem is that your data comes from virtual columns implemented through indexes. Verification, optimization, analysis of the operations performed makes the indexes synchronize and correct any errors. This gives you the correct results from now on. At least until the index goes out of sync again.

Why can this happen

Most of the problems are caused by problems with the design of the table. Let's start with

 `status_expired` tinyint(1) GENERATED ALWAYS AS (ifnull(json_contains(`status`,'true','$.expired'),false)) VIRTUAL 

Sure, this is designed to overcome the fact that you cannot directly index the JSON column in mysql. Instead, you created a virtual column and indexed it. All this is very good, but this column can contain only one of two values; true or false . This means that he has a very bad censorship. As a result, mysql is unlikely to use this index for anything.

But we see that when creating the index, you combined the status_expired column with the expired column. Perhaps with the idea of ​​overcoming this bad power mentioned above. But wait ...

 `expiration` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL, 

Expiration is another virtual column. This has some implications.

When a secondary index is created in the generated virtual column, the generated column values ​​are materialized in the index entries. If the index is a coverage index (which includes all columns retrieved on demand), the generated column values ​​are retrieved from the materialized values ​​in the index structure instead of the calculated on-fly.

Link: https://dev.mysql.com/doc/refman/5.7/en/create-table-secondary-indexes.html#json-column-indirect-index

It contradicts

VIRTUAL: Column values ​​are not saved, but are evaluated when the rows are read, immediately after any BEFORE triggers. The virtual column does not accept storage.

Link: https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html

We create virtual columns based on the sound principle, so that the values ​​generated by simple column operations should not be stored to avoid redundancy, but by creating an index on it, we introduce redundancy again.

Proposed Fixes

based on the information provided, you really don't need the status_expired column or even the expired column. An item whose expiration is expiring!

 CREATE TABLE `items` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `user_id` bigint(20) unsigned DEFAULT NULL, `image` json DEFAULT NULL, `status` json DEFAULT NULL, `expire_date` datetime GENERATED ALWAYS AS ((`create_date` + interval `lifetime` day)) VIRTUAL, `last_update` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP, `create_date` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY (`id`), KEY `user_id` (`user_id`), KEY `expiration` (`expired_date`) USING BTREE, CONSTRAINT `ts_competition_item_ibfk_2` FOREIGN KEY (`user_id`) REFERENCES `ts_user_core` (`user_id`) ON DELETE CASCADE ON UPDATE CASCADE ) ENGINE=InnoDB AUTO_INCREMENT=1312459 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED 

Just compare the current date with the expired_date column in the table above when you need to find out which items have expired. The difference here instead of expired is the calculated element in each query, you calculate expiry_date once when you create the record.

This makes your table a lot neater and faster than faster.

+9
source share

All Articles