Lua - convert a string to a table

I want to convert a string into a table, split into individual characters, with each character stored as a separate table element. For example:

  • a = "text"
  • convert string (a) to table (b)
  • show table (b)
  • b = {'t', 'e', 'x', 't'}
+7
string lua-table lua
4 answers

You can use the string.gsub function:

 t = {}
 str = "text"
 str:gsub(".", function(c) table.insert(t, c) end)
+9

Just index each character and place it at the same position in the table.

 local str = "text"
 local t = {}
 for i = 1, #str do
   t[i] = str:sub(i, i)
 end
+7

The built-in string library treats Lua strings as arrays of bytes. An alternative that handles multibyte (Unicode) characters is the unicode library that originated in the Selene project (slnunicode). Its main selling point is that it can be used as a drop-in replacement for the string library, making most string operations "magically" Unicode-capable.
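To illustrate the byte-versus-character distinction without any library, here is a minimal sketch that splits on UTF-8 character boundaries using a plain Lua pattern. The helper name split_chars is hypothetical. Lua 5.3 and later provide the pattern as utf8.charpattern; the fallback string is a hand-written approximation (an assumption, and it does not match NUL bytes). Unlike a validating parser, this pattern assumes the input is already valid UTF-8.

```lua
-- Pattern matching one UTF-8 encoded character.
-- Lua 5.3+ ships it as utf8.charpattern; the fallback is an
-- approximation for older versions and does not match NUL bytes.
local charpattern = (utf8 and utf8.charpattern)
  or "[\1-\127\194-\244][\128-\191]*"

-- split_chars is a hypothetical helper name, not a library function.
-- It does NOT validate the input; it assumes valid UTF-8.
function split_chars(str)
  local t = {}
  for c in str:gmatch(charpattern) do
    t[#t + 1] = c
  end
  return t
end

-- "héllo", with é written as its two UTF-8 bytes (195, 169)
local chars = split_chars("h\195\169llo")
print(#chars)  -- 5 characters, although the string is 6 bytes long
```

A byte-wise split of the same string would yield six entries, with the two bytes of é in separate table slots.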

If you prefer not to add third-party dependencies, your task can easily be implemented using LPeg. Here is an example splitter:

 local lpeg      = require "lpeg"
 local C, Ct, R  = lpeg.C, lpeg.Ct, lpeg.R
 local lpegmatch = lpeg.match
 local split_utf8 do
   local utf8_x  = R"\128\191"
   local utf8_1  = R"\000\127"
   local utf8_2  = R"\194\223" * utf8_x
   local utf8_3  = R"\224\239" * utf8_x * utf8_x
   local utf8_4  = R"\240\244" * utf8_x * utf8_x * utf8_x
   local utf8    = utf8_1 + utf8_2 + utf8_3 + utf8_4
   local split   = Ct (C (utf8)^0) * -1
   split_utf8 = function (str)
     str = str and tostring (str)
     if not str then return end
     return lpegmatch (split, str)
   end
 end

This snippet defines a split_utf8() function that creates a table of UTF-8 characters (as Lua strings), but returns nil if the string is not a valid UTF-8 sequence. You can run this test code:

 tests = {
   en = [[Lua (/ˈluːə/ LOO-ə, from Portuguese: lua [ˈlu.(w)ɐ] meaning moon; ]]
     .. [[explicitly not "LUA"[1]) is a lightweight multi-paradigm programming ]]
     .. [[language designed as a scripting language with "extensible ]]
     .. [[semantics" as a primary goal.]],
   ru = [[Lua ([́], . «») —   , ]]
     .. [[  Tecgraf   ]]
     .. [[--.]],
   gr = [[Η Lua είναι μια ελαφρή προστακτική γλώσσα προγραμματισμού, που ]]
     .. [[σχεδιάστηκε σαν γλώσσα σεναρίων με κύριο σκοπό τη δυνατότητα ]]
     .. [[επέκτασης της σημασιολογίας της.]],
   XX = ">\255< invalid"
 }
 -------------------------------------------------------------------------------
 local limit = 14
 for lang, str in next, tests do
   io.write "\n"
   io.write (string.format ("<%s %3d> ->", lang, #str))
   local chars = split_utf8 (str)
   if not chars then
     io.write " INVALID!"
   else
     io.write (string.format (" <%3d>", #chars))
     for i = 1, #chars > limit and limit or #chars do
       io.write (string.format (" %q", chars [i]))
     end
   end
 end
 io.write "\n"

By the way, building a table with LPeg is significantly faster than calling table.insert() repeatedly. Here are the timings for splitting the whole of Gogol's Dead Souls (in Russian, 1023814 bytes raw, 571395 UTF-8 characters) on my machine:

 library     method              time in ms
 string      table.insert()      380
 string      t [#t + 1] = c      310
 string      gmatch & for loop   280
 slnunicode  table.insert()      220
 slnunicode  t [#t + 1] = c      200
 slnunicode  gmatch & for loop   170
 lpeg        Ct (C (...))         70
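The answer does not show its benchmark code, so the following is only a guess at what the three string-library variants named in the table might look like. The function names are hypothetical, and the use of a gsub callback for the first two variants is an assumption. The slnunicode variants would presumably be identical apart from calling the unicode library instead of string.

```lua
-- Variant 1: callback that appends with table.insert()
function split_insert(str)
  local t = {}
  str:gsub(".", function(c) table.insert(t, c) end)
  return t
end

-- Variant 2: callback that appends with t[#t + 1] = c
function split_append(str)
  local t = {}
  str:gsub(".", function(c) t[#t + 1] = c end)
  return t
end

-- Variant 3: gmatch iterator driven by a for loop
function split_gmatch(str)
  local t = {}
  for c in str:gmatch(".") do
    t[#t + 1] = c
  end
  return t
end

-- All three produce the same table of single-byte strings.
print(table.concat(split_insert("text"), ","))
print(table.concat(split_append("text"), ","))
print(table.concat(split_gmatch("text"), ","))
```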
+3

You can use the code below to achieve this easily:

 > t = {}
 > str = "text"
 > for i = 1, string.len(str) do
 >>   t[i] = string.sub(str, i, i)
 >> end
 > for k, v in pairs(t) do
 >>   print(k, v)
 >> end
 1 t
 2 e
 3 x
 4 t
 >

string.sub
string.sub(s, i [, j]) returns a substring of the string s. The substring begins at index i. If the third argument j is not specified, the substring runs to the end of the string. If j is specified, the substring ends at index j, inclusive.
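A short sketch of the behaviour described above, using the same "text" string as the question:

```lua
local s = "text"

print(string.sub(s, 2))     -- no j: runs to the end of the string -> ext
print(string.sub(s, 2, 3))  -- j given: ends at index 3, inclusive -> ex
print(s:sub(1, 1))          -- i == j: a single character          -> t
```

The last form, with i equal to j, is what the loop in this answer uses to pick out one character at a time.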

0