Can you split / split field in MySQL query?

Question

Can you split / split field in MySQL query?

I need to create a completion report. Each student belongs to one client. Here are the tables (simplified for this question).

CREATE TABLE `clients` ( `clientId` int(10) unsigned NOT NULL auto_increment, `clientName` varchar(100) NOT NULL default '', `courseNames` varchar(255) NOT NULL default '' )

The courseNames field contains a string of names denoted by commas, for example, "AB01, AB02, AB03"

 CREATE TABLE `clientenrols` ( `clientEnrolId` int(10) unsigned NOT NULL auto_increment, `studentId` int(10) unsigned NOT NULL default '0', `courseId` tinyint(3) unsigned NOT NULL default '0' )

The courseId field courseId the index of the course name in the clients.courseNames field . So, if the client of courseNames is “AB01, AB02, AB03” and the courseId for registration is 2 , then the student is in AB03.

Is there a way that I can make one choice in these tables that includes the course name? Keep in mind that there will be students from different clients (and therefore have different course names, not all of which are consecutive, for example: "NW01, NW03")

Basically, if I could split this field and return one element from the resulting array, that would be what I'm looking for. Here is what I mean in magic pseudo-code:

 SELECT e.`studentId`, SPLIT(",", c.`courseNames`)[e.`courseId`] FROM ...

+42

mysql

nickf Jan 23 '09 at 4:53

source share

17 answers

Melchior Blausand · Answer 1 · 2011-12-21 04:28

So far, I wanted to keep these comma-separated lists in my SQL-db - knowledgeable about all the warnings!

I kept thinking that they have advantages over lookup tables (which provide a way to a normalized database). After several days of failure, I saw the light :

Using lookup tables DOES NOT produce more code than these ugly string operations when using comma separated values in the same field.
The lookup table allows you to create your own number formats and, therefore, NOT more than those csv fields. It is LITTLE though.
Applied string operations are subtle in a high-level language (SQL and PHP), but expensive compared to using arrays of integers.
Databases are not meant to be read by humans, and in most cases it is foolish to try to stick to structures because of their readability / direct edibility, just like me.

In short, there is a reason why MySQL has no built-in SPLIT () function.

eithed · Answer 2 · 2013-06-11 11:33

Seeing that this is a fairly popular question - the answer is YES.

For the column column in the table table containing all your data, separated by a coma:

 CREATE TEMPORARY TABLE temp (val CHAR(255)); SET @S1 = CONCAT("INSERT INTO temp (val) VALUES ('",REPLACE((SELECT GROUP_CONCAT( DISTINCT `column`) AS data FROM `table`), ",", "'),('"),"');"); PREPARE stmt1 FROM @s1; EXECUTE stmt1; SELECT DISTINCT(val) FROM temp;

Please remember, however, that do not store CSV in your database

Per @Mark Amery - since this will convert values separated by a coma into an INSERT , be careful when running on non-minebound data

To repeat, do not store the CSV in your database; This function is designed to convert CSV into a reasonable database structure and not be used anywhere in your code. If you must use it in production, please rethink your database structure.

Josias Iquabius · Answer 3 · 2012-12-31 00:04

You can create a function for this:

 /** * Split a string by string (Similar to the php function explode()) * * @param VARCHAR(12) delim The boundary string (delimiter). * @param VARCHAR(255) str The input string. * @param INT pos The index of the string to return * @return VARCHAR(255) The (pos)th substring * @return VARCHAR(255) Returns the [pos]th string created by splitting the str parameter on boundaries formed by the delimiter. * @{@example * SELECT SPLIT_STRING('|', 'one|two|three|four', 1); * This query * } */ DROP FUNCTION IF EXISTS SPLIT_STRING; CREATE FUNCTION SPLIT_STRING(delim VARCHAR(12), str VARCHAR(255), pos INT) RETURNS VARCHAR(255) DETERMINISTIC RETURN REPLACE( SUBSTRING( SUBSTRING_INDEX(str, delim, pos), LENGTH(SUBSTRING_INDEX(str, delim, pos-1)) + 1 ), delim, '' );

Convert magic pseudo code to use it:

 SELECT e.`studentId`, SPLIT_STRING(',', c.`courseNames`, e.`courseId`) FROM...

DarkSide · Answer 4 · 2014-03-26 17:57

Based on Alex's answer above ( https://stackoverflow.com/a/3/8/2/10/ ), I came up with an even better solution. A solution that does not contain an exact record identifier.

Assuming the comma separated list is in the data.list table and contains a list of codes from another classification.code table, you can do something like:

 SELECT d.id, d.list, c.code FROM classification c JOIN data d ON d.list REGEXP CONCAT('[[:<:]]', c.code, '[[:>:]]');

So, if you have tables and data like this:

 CLASSIFICATION (code varchar(4) unique): ('A'), ('B'), ('C'), ('D') MY_DATA (id int, list varchar(255)): (100, 'C,A,B'), (150, 'B,A,D'), (200,'B')

above SELECT will return

 (100, 'C,A,B', 'A'), (100, 'C,A,B', 'B'), (100, 'C,A,B', 'C'), (150, 'B,A,D', 'A'), (150, 'B,A,D', 'B'), (150, 'B,A,D', 'D'), (200, 'B', 'B'),

Mark Amery · Answer 5 · 2017-04-01 11:54

SUBSTRING_INDEX(str, delim, count) split function only in MySQL SUBSTRING_INDEX(str, delim, count) . You can use this, for example:

Return the element before the first separator in the string:

 mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', 1); +--------------------------------------------+ | SUBSTRING_INDEX('foo#bar#baz#qux', '#', 1) | +--------------------------------------------+ | foo | +--------------------------------------------+ 1 row in set (0.00 sec)

Return the item after the last separator in the string:

 mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', -1); +---------------------------------------------+ | SUBSTRING_INDEX('foo#bar#baz#qux', '#', -1) | +---------------------------------------------+ | qux | +---------------------------------------------+ 1 row in set (0.00 sec)

Return everything to the third delimiter in the line:

 mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', 3); +--------------------------------------------+ | SUBSTRING_INDEX('foo#bar#baz#qux', '#', 3) | +--------------------------------------------+ | foo#bar#baz | +--------------------------------------------+ 1 row in set (0.00 sec)

Return the second element in the row by linking two calls:

 mysql> SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('foo#bar#baz#qux', '#', 2), '#', -1); +----------------------------------------------------------------------+ | SUBSTRING_INDEX(SUBSTRING_INDEX('foo#bar#baz#qux', '#', 2), '#', -1) | +----------------------------------------------------------------------+ | bar | +----------------------------------------------------------------------+ 1 row in set (0.00 sec)

In general, a simple way to get the nth element of a # -separated string (assuming you know for sure that it contains at least n elements) is this:

 SUBSTRING_INDEX(SUBSTRING_INDEX(your_string, '#', n), '#', -1);

The SUBSTRING_INDEX internal call discards the nth separator and everything after it, and then the SUBSTRING_INDEX external call discards everything except the last element that remains.

If you need a more robust solution that returns NULL , if you request an element that does not exist (for example, it requests the 5th element 'a#b#c#d' ), then you can count the delimiters using REPLACE and then conditionally return NULL using IF() :

 IF( LENGTH(your_string) - LENGTH(REPLACE(your_string, '#', '')) / LENGTH('#') < n - 1, NULL, SUBSTRING_INDEX(SUBSTRING_INDEX(your_string, '#', n), '#', -1) )

Of course, this is pretty ugly and hard to understand! So you might want to wrap it in a function:

 CREATE FUNCTION split(string TEXT, delimiter TEXT, n INT) RETURNS TEXT DETERMINISTIC RETURN IF( (LENGTH(string) - LENGTH(REPLACE(string, delimiter, ''))) / LENGTH(delimiter) < n - 1, NULL, SUBSTRING_INDEX(SUBSTRING_INDEX(string, delimiter, n), delimiter, -1) );

Then you can use the function as follows:

 mysql> SELECT SPLIT('foo,bar,baz,qux', ',', 3); +----------------------------------+ | SPLIT('foo,bar,baz,qux', ',', 3) | +----------------------------------+ | baz | +----------------------------------+ 1 row in set (0.00 sec) mysql> SELECT SPLIT('foo,bar,baz,qux', ',', 5); +----------------------------------+ | SPLIT('foo,bar,baz,qux', ',', 5) | +----------------------------------+ | NULL | +----------------------------------+ 1 row in set (0.00 sec) mysql> SELECT SPLIT('foo###bar###baz###qux', '###', 2); +------------------------------------------+ | SPLIT('foo###bar###baz###qux', '###', 2) | +------------------------------------------+ | bar | +------------------------------------------+ 1 row in set (0.00 sec)

Alwin Kesler · Answer 6 · 2012-06-08 16:59

I solved this problem with a regular expression pattern. They are usually slower than regular queries, but this is an easy way to get data in a comma-separated query column

 SELECT * FROM `TABLE` WHERE `field` REGEXP ',?[SEARCHED-VALUE],?';

a greedy question mark helps you search at the beginning or end of a line.

Hope this helps anyone in the future

Alex Stevenson · Answer 7 · 2012-06-13 19:59

Based on Alvin Kesler’s decision, here is a slightly more practical example in the real world.

Assuming the comma separated list is in my_table.list, and this is a list of identifiers for my_other_table.id, you can do something like:

 SELECT * FROM my_other_table WHERE (SELECT list FROM my_table WHERE id = '1234') REGEXP CONCAT(',?', my_other_table.id, ',?');

Kickstart · Answer 8 · 2015-09-15 12:34

You can explode a string in a MySQL SELECT statement.

First, create a series of numbers with the most delimited values that you want to explode. Either from a table of integers, or by combining numbers together. The following generates 100 rows, giving values from 1 to 100. It can be easily expanded to get large ranges (add another additional query giving values from 0 to 9 for hundreds - from 0 to 999, etc.).

 SELECT 1 + units.i + tens.i * 10 AS aNum FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens

It could be a cross connected to your table to give you values. Note that you use SUBSTRING_INDEX to get the delimited value to a specific value, and then use SUBSTRING_INDEX to get this value, excluding the previous ones.

 SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(clients.courseNames, ',', sub0.aNum), ',', -1) AS a_course_name FROM clients CROSS JOIN ( SELECT 1 + units.i + tens.i * 10 AS aNum, units.i + tens.i * 10 AS aSubscript FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens ) sub0

As you can see, there is a small problem here that the last shared value is repeated many times. To get rid of this, you need to limit the range of numbers depending on the number of delimiters. This can be done by taking the length of the delimited field and comparing it to the length of the delimited field when the delimiters are changed to "" (to remove them). From this you can get the number of delimiters: -

 SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(clients.courseNames, ',', sub0.aNum), ',', -1) AS a_course_name FROM clients INNER JOIN ( SELECT 1 + units.i + tens.i * 10 AS aNum FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens ) sub0 ON (1 + LENGTH(clients.courseNames) - LENGTH(REPLACE(clients.courseNames, ',', ''))) >= sub0.aNum

In the source field of the example, you can (for example) calculate the number of students in each course based on this. Note that I changed the sub query that gets a range of numbers to return 2 numbers, 1 is used to determine the course name (since they are based on starting at 1), and the other gets an index (since they are based on starting at 0).

 SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(clients.courseNames, ',', sub0.aNum), ',', -1) AS a_course_name, COUNT(clientenrols.studentId) FROM clients INNER JOIN ( SELECT 1 + units.i + tens.i * 10 AS aNum, units.i + tens.i * 10 AS aSubscript FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens ) sub0 ON (1 + LENGTH(clients.courseNames) - LENGTH(REPLACE(clients.courseNames, ',', ''))) >= sub0.aNum LEFT OUTER JOIN clientenrols ON clientenrols.courseId = sub0.aSubscript GROUP BY a_course_name

As you can see, this is possible, but rather dirty. And with little ability to use indexes, it will not be effective. Further, the range should cope with the largest number of separable values and work, excluding many duplicates; if the maximum number of shared values is very large, this will significantly slow down the work. In general, it is generally much better to normalize the database correctly.

Kurt · Answer 9 · 2009-01-23 04:57

It’s easier there, there is a table of links, i.e.:

Table 1: customers, customer information, blah blah blah

Table 2: courses, course information, blah blah

Table 3: clientid, courseid

Then do the JOIN, and you will go to the races.

suraj deep · Answer 10 · 2012-03-07 06:42

 SELECT tab1.std_name, tab1.stdCode, tab1.payment, SUBSTRING_INDEX(tab1.payment, '|', 1) as rupees, SUBSTRING(tab1.payment, LENGTH(SUBSTRING_INDEX(tab1.payment, '|', 1)) + 2,LENGTH(SUBSTRING_INDEX(tab1.payment, '|', 2))) as date FROM ( SELECT DISTINCT si.std_name, hfc.stdCode, if(isnull(hfc.payDate), concat(hfc.coutionMoneyIn,'|', year(hfc.startDtae), '-', monthname(hfc.startDtae)), concat(hfc.payMoney, '|', monthname(hfc.payDate), '-', year(hfc.payDate))) AS payment FROM hostelfeescollection hfc INNER JOIN hostelfeecollectmode hfm ON hfc.tranId = hfm.tranId INNER JOIN student_info_1 si ON si.std_code = hfc.stdCode WHERE hfc.tranId = 'TRAN-AZZZY69454' ) AS tab1

user1894169 · Answer 11 · 2015-04-14 16:53

If you need a table from a delimited row:

 SET @str = 'function1;function2;function3;function4;aaa;bbbb;nnnnn'; SET @delimeter = ';'; SET @sql_statement = CONCAT('SELECT ''' ,REPLACE(@str, @delimeter, ''' UNION ALL SELECT ''') ,''''); SELECT @sql_statement; SELECT 'function1' UNION ALL SELECT 'function2' UNION ALL SELECT 'function3' UNION ALL SELECT 'function4' UNION ALL SELECT 'aaa' UNION ALL SELECT 'bbbb' UNION ALL SELECT 'nnnnn'

sudip · Answer 12 · 2012-04-29 19:33

I used the above logic, but changed it a bit. My input has the format: "apple: 100 | pinapple: 200 | orange: 300" stored in the variable @updtAdvanceKeyVal

Here is the function block:

 set @res = ""; set @i = 1; set @updtAdvanceKeyVal = updtAdvanceKeyVal; REPEAT -- set r = replace(SUBSTRING(SUBSTRING_INDEX(@updtAdvanceKeyVal, "|", @i), -- LENGTH(SUBSTRING_INDEX(@updtAdvanceKeyVal, "|", @i -1)) + 1),"|",""); -- wrapping the function in "replace" function as above causes to cut off a character from -- the 2nd splitted value if the value is more than 3 characters. Writing it in 2 lines causes no such problem and the output is as expected -- sample output by executing the above function : -- orange:100 -- pi apple:200 !!!!!!!!strange output!!!!!!!! -- tomato:500 set @r = SUBSTRING(SUBSTRING_INDEX(@updtAdvanceKeyVal, "|", @i), LENGTH(SUBSTRING_INDEX(@updtAdvanceKeyVal, "|", @i -1)) + 1); set @r = replace(@r,"|",""); if @r <> "" then set @key = SUBSTRING_INDEX(@r, ":",1); set @val = SUBSTRING_INDEX(@r, ":",-1); select @key, @val; end if; set @i = @i + 1; until @r = "" END REPEAT;

brightball · Answer 13 · 2015-04-22 17:30

I had a similar problem with a similar field, which I solved differently. My use case needed to take these identifiers in a comma-separated list for use in a connection.

I was able to solve this using similar, but it became easier because besides the comma separator, the identifiers were also quoted as follows:

keys "1","2","6","12"

Because of this, I was able to do LIKE

SELECT twwf.id, jtwi.id joined_id FROM table_with_weird_field twwf INNER JOIN join_table_with_ids jtwi ON twwf.delimited_field LIKE CONCAT("%\"", jtwi.id, "\"%")

Basically, this simply means that the identifier from the table you are trying to join appears in the set, and at this point you can easily join it and return your records. You can also just create a view from something like this.

This worked for my use case when I was dealing with a Wordpress plugin that managed relationships as described. Quotation marks really help, because otherwise you risk partial matches (aka - id 1 for 18, etc.).

chx · Answer 14 · 2019-08-26 05:18

You can do this with JSON in later versions of MySQL. This is an explosion. We will have quick preparations for creating a table of numbers. Then first we create an intermediate table to convert comma separated strings to json array, then we will use json_extract to split them. I encapsulate strings in quotation marks, carefully avoiding existing quotes, because I had strings separated by semicolons containing commas.

Therefore, in order to create a table of numbers, I hope you have more customers than courses, select a large enough table, if not.

 CREATE TABLE numbers (n int PRIMARY KEY); INSERT INTO numbers SELECT @row := @row + 1 FROM clients JOIN (select @row:=0) t2;

Add a LIMIT of 50 if you know that you have only 50 courses. It was easy, right? Now for the real work, to be honest, these are quotes that make it more ugly, but at least more general:

 CREATE TABLE json_coursenames SELECT clientId,clientName,CONCAT('["', REPLACE(REPLACE(courseName,'"','\\"'), ',', '","'), '"]') AS a FROM clients; CREATE TABLE extracted SELECT clientId,clientName,REPLACE(TRIM(TRIM('"' FROM JSON_EXTRACT(a, concat('$[', n, ']')))), '\\"', '"') FROM json_coursenames INNER JOIN numbers ON n < JSON_LENGTH(a);

Wheee!

The meat here is these two: CONCAT('["', REPLACE(coursename, ',', '","'), '"]') (I omitted the second REPLACE to make it more noticeable) converts foo,bar,bar in "foo","bar","baz" . Another trick - JSON_EXTRACT(a, concat('$[', n, ']') will become JSON_EXTRACT(a, $[12]) , and this is the 13th element of the array, see JSON Path Syntax .

Eugene Yokota · Answer 15 · 2009-01-23 05:03

Here's how you do it for SQL Server. Someone else could translate it to MySQL. Analysis of CSV values in several lines .

 SELECT Author, NullIf(SubString(',' + Phrase + ',' , ID , CharIndex(',' , ',' + Phrase + ',' , ID) - ID) , '') AS Word FROM Tally, Quotes WHERE ID <= Len(',' + Phrase + ',') AND SubString(',' + Phrase + ',' , ID - 1, 1) = ',' AND CharIndex(',' , ',' + Phrase + ',' , ID) - ID > 0

The idea is to cross-join the predefined Tally table, which contains an integer from 1 to 8000 (or an arbitrarily large number) and run SubString to find the correct, word, position.

nickf · Answer 16 · 2009-01-23 05:17

Here is what I got so far (found this on the Ben Alpert page):

 SELECT REPLACE( SUBSTRING( SUBSTRING_INDEX(c.`courseNames`, ',', e.`courseId` + 1) , LENGTH(SUBSTRING_INDEX(c.`courseNames`, ',', e.`courseId`) ) + 1) , ',' , '' ) FROM `clients` c INNER JOIN `clientenrols` e USING (`clientId`)

Jonathan · Answer 17 · 2013-08-13 20:02

Well, I didn’t use anything, so I decided to create a real simple split function, hope this helps:

  DECLARE inipos INTEGER; DECLARE endpos INTEGER; DECLARE maxlen INTEGER; DECLARE item VARCHAR(100); DECLARE delim VARCHAR(1); SET delim = '|'; SET inipos = 1; SET fullstr = CONCAT(fullstr, delim); SET maxlen = LENGTH(fullstr); REPEAT SET endpos = LOCATE(delim, fullstr, inipos); SET item = SUBSTR(fullstr, inipos, endpos - inipos); IF item <> '' AND item IS NOT NULL THEN USE_THE_ITEM_STRING; END IF; SET inipos = endpos + 1; UNTIL inipos >= maxlen END REPEAT;

Can you split / split field in MySQL query?

More articles: