MySQL: intersection of two sets of comma-separated values

It would be great if someone could give me a little help with MySQL.

I have a table with 1 billion records, in which one column holds comma-separated values.

I also have a string of comma-separated values to search for.

I want to select the rows whose comma-separated column contains any of the values from this string.

For example, table A contains a comma-separated column as follows:

(The sample table was attached as an image.)

And I have a string of comma-separated values: "79, 62, 70, 107".

The result should be rows 1, 2, 3, 5, 7, 8, 9 and 10 (as numbered in the image).

I did this with a regex, but it takes too much time, so I want to avoid it for performance reasons.

2 answers

You cannot optimize what you are doing. Basically, you can run a query like this:

 where find_in_set(79, comma_separated) > 0 or
       find_in_set(62, comma_separated) > 0 or
       find_in_set(70, comma_separated) > 0 or
       find_in_set(107, comma_separated) > 0

This requires a full table scan. Although performance may be slightly better than with a regular expression, it will still not be efficient.
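
If the search list arrives as a string such as "79, 62, 70, 107", the predicate above can be generated instead of typed by hand. This is a minimal Python sketch, not a definitive implementation: the helper name `build_find_in_set_where` is made up for illustration, and the `%s` placeholders assume a database driver that uses that parameter style. The values are stripped of whitespace because FIND_IN_SET compares list elements exactly, so " 62" would not match "62".

```python
def build_find_in_set_where(search_csv, column="comma_separated"):
    """Return (sql_fragment, params) matching rows that contain any value."""
    # Strip spaces: FIND_IN_SET does exact element matching.
    values = [v.strip() for v in search_csv.split(",") if v.strip()]
    if not values:
        raise ValueError("search list is empty")
    # The column name is interpolated, so it must come from trusted code,
    # never from user input; the values are passed as bound parameters.
    clauses = [f"FIND_IN_SET(%s, {column}) > 0" for _ in values]
    return "WHERE " + " OR ".join(clauses), values


where_sql, params = build_find_in_set_where("79, 62, 70, 107")
print(where_sql)
print(params)
```

This only saves typing. The query still has to read every row.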

The correct way to store this data is with a junction table. This multiplies the number of rows, so the first row in your data becomes three rows in the junction table (one for each value).
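
A rough sketch of the separate table described above. It uses Python's built-in sqlite3 module only so the example can be run as-is; the SQL itself is plain enough that it should carry over to MySQL with minor type changes. The table names `a` and `a_value`, the index name, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    -- One row per (row of a, value) pair instead of a comma-separated string.
    CREATE TABLE a_value (
        a_id  INTEGER NOT NULL REFERENCES a(id),
        value INTEGER NOT NULL,
        PRIMARY KEY (a_id, value)
    );
    -- Index led by value so the IN (...) lookup does not need a full scan.
    CREATE INDEX idx_a_value_value ON a_value (value, a_id);
""")

# Made-up sample data: each list is what one comma-separated string becomes.
rows = {1: [79, 62, 101], 2: [101, 5, 70], 3: [8, 9, 10]}
conn.executemany("INSERT INTO a (id) VALUES (?)", [(i,) for i in rows])
conn.executemany(
    "INSERT INTO a_value (a_id, value) VALUES (?, ?)",
    [(i, v) for i, vals in rows.items() for v in vals],
)

search = [79, 62, 70, 107]
placeholders = ", ".join("?" for _ in search)
sql = (
    "SELECT DISTINCT a_id FROM a_value "
    f"WHERE value IN ({placeholders}) ORDER BY a_id"
)
matching_ids = [r[0] for r in conn.execute(sql, search)]
print(matching_ids)  # [1, 2]
```

DISTINCT is there because one row can match several search values (row 1 contains both 79 and 62) and should still be returned once.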

There are many reasons why you do not want to store lists of things as comma-separated strings. Your values look like ids that refer to another table, which makes it even worse:

  • Values should be stored in their native type. Storing integers as strings is a bad idea.
  • The native structure for lists in SQL is a table, not a string.
  • Operations on tables are more powerful than operations on strings.
  • SQL cannot use indexes (with the exception of full-text indexes) for string operations.
  • If you have an id that refers to another table, you should have a foreign key constraint. You cannot do this with lists stored in a string.
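
To move from the string column to one typed value per row, each existing string has to be split and converted. Below is a small Python sketch of that step under some assumptions: the function name `split_to_pairs` is hypothetical, values are assumed to be integers, and empty or repeated elements are assumed to be safe to drop. For a table with a billion rows this would have to run in batches, which is not shown here.

```python
def split_to_pairs(row_id, comma_separated):
    """Yield (row_id, int_value) for each element of a comma-separated string."""
    seen = set()
    for piece in comma_separated.split(","):
        piece = piece.strip()
        if not piece:
            continue  # tolerate empty elements such as "79,,62"
        value = int(piece)  # raises ValueError on non-numeric data
        if value not in seen:  # skip duplicates within one row
            seen.add(value)
            yield (row_id, value)


pairs = list(split_to_pairs(1, "79,62,101"))
print(pairs)  # [(1, 79), (1, 62), (1, 101)]
```

Each resulting pair is one row for the separate table, where the value is a real integer that can be indexed and given a foreign key constraint.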

If you are interested in performance, you should consider changing the structure of your database. Numbers are not indexed well (if at all) in text column types.

It looks like you have a constant number of integers in the "comma_separated" column.

Consider creating a separate INT column for each of the three, that is:

 num1 | num2 | num3
 79   | 62   | 101
 101  | 5    | 70

Then you can run a proper select, for example:

 WHERE num1 IN (79, 62, 70, 107)
    OR num2 IN (79, 62, 70, 107)
    OR num3 IN (79, 62, 70, 107)
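
A runnable sketch of this layout, again using Python's sqlite3 module as a stand-in for MySQL. The table name `a`, the index names, and the third sample row are assumptions. Whether MySQL actually uses all three indexes for an OR across columns depends on its optimizer, so it is worth checking the plan with EXPLAIN on the real table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE a (id INTEGER PRIMARY KEY, num1 INT, num2 INT, num3 INT)"
)
# One index per column, since each column is searched independently.
for col in ("num1", "num2", "num3"):
    conn.execute(f"CREATE INDEX idx_a_{col} ON a ({col})")

conn.executemany(
    "INSERT INTO a (num1, num2, num3) VALUES (?, ?, ?)",
    [(79, 62, 101), (101, 5, 70), (1, 2, 3)],
)

search = [79, 62, 70, 107]
ph = ", ".join("?" for _ in search)
sql = (
    f"SELECT id FROM a WHERE num1 IN ({ph}) "
    f"OR num2 IN ({ph}) OR num3 IN ({ph}) ORDER BY id"
)
# The same search list is bound once per column.
ids = [r[0] for r in conn.execute(sql, search * 3)]
print(ids)  # [1, 2]
```

This only works while every row has the same small, fixed number of values. If the count varies, one row per value in a separate table is the more flexible design.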
