How to iterate over individual characters in a Lua string?

I have a string in Lua and want to iterate over individual characters in it. But no code I tried works, and the official guide only shows how to find and replace substrings :(

 str = "abcd"
 for char in str do -- error
     print( char )
 end
 for i = 1, str:len() do
     print( str[ i ] ) -- nil
 end
+78
lua
May 6 '09 at 10:55
5 answers

In Lua 5.1, you can iterate over the characters of a string in several ways.

The basic loop would be:

 for i = 1, #str do
     local c = str:sub(i, i)
     -- do something with c
 end

But it may be more efficient to use a pattern with string.gmatch() to get an iterator over the characters:

 for c in str:gmatch(".") do
     -- do something with c
 end

Or even to use string.gsub() to call a function for each character:

 str:gsub(".", function(c)
     -- do something with c
 end)

In all of the above, I took advantage of the fact that the string module is set as the metatable for all string values, so its functions can be called as methods using the : notation. I also used the # operator (new in 5.1, IIRC) to get the length of the string.

The best answer for your application depends on many factors, and benchmarks are your friend if performance matters.

You might want to evaluate why you need to iterate over the characters, and look at one of the regular expression modules that have been bound to Lua, or for a modern approach look at Roberto's lpeg module, which implements Parsing Expression Grammars for Lua.
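To make the metatable point concrete, here is a minimal sketch (assuming a stock Lua 5.1 or later interpreter where the string metatable has not been modified). It also shows why `str[i]` from the question yields nil: the index is looked up in the `string` library table, which has no numeric keys.

```lua
local s = "abcd"

-- All strings share one metatable whose __index field is the string library.
local mt = getmetatable(s)
print(mt.__index == string)                 -- true

-- Method-call syntax is sugar for calling the library function on s.
print(s:sub(2, 2) == string.sub(s, 2, 2))   -- true

-- s[1] is resolved as string[1], which does not exist.
print(s[1])                                 -- nil

-- The length operator and string.len agree.
print(#s == s:len())                        -- true
```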

+111
May 7, '09 at

If you are using Lua 5, try:

 for i = 1, string.len(str) do print( string.sub(str, i, i) ) end 
+11
May 6 '09 at 1:05 pm

Depending on the task, it might be easier to use string.byte. It is also the fastest way, because it avoids creating a new substring, which can be quite expensive in Lua: every new string is hashed and checked against the strings already known to the interpreter. You can pre-compute the code of the character you are looking for with the same string.byte to keep the code readable and portable.

 local str = "ab/cd/ef"
 local target = string.byte("/")
 for idx = 1, #str do
     if str:byte(idx) == target then
         print("Target found at:", idx)
     end
 end
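The same idea extends to ranges of characters. The following is only a sketch: `count_digits` is a hypothetical helper name, and it assumes single-byte (ASCII) digits. It compares byte codes against precomputed bounds, so no one-character substrings are created.

```lua
-- Precompute the byte codes once instead of comparing substrings.
local ZERO, NINE = string.byte("0"), string.byte("9")

-- count_digits is a hypothetical helper used only for this sketch.
local function count_digits(s)
    local n = 0
    for i = 1, #s do
        local b = s:byte(i)
        if b >= ZERO and b <= NINE then
            n = n + 1
        end
    end
    return n
end

print(count_digits("ab12/cd3"))  -- 3
```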
+5
Dec 24 '15 at 10:30

The answers given already offer many good approaches (here, here and here). If speed is your primary concern, you should definitely consider doing the job through Lua's C API, which is many times faster than raw Lua code. When working with preloaded chunks (e.g. via the load function), the difference is smaller, but still considerable.

As for pure Lua solutions, let me share this small benchmark I made. It covers every answer provided to date and adds a few optimizations. Still, the basic thing to consider is:

How many times will you need to iterate over characters in a string?

  • If the answer is "once", then you should look at the first part of the benchmark ("raw speed").
  • Otherwise, the second part will provide a more accurate estimate, because it parses the string into a table, which is much faster to iterate over. You should also consider writing a simple function for this, as @Jarriz suggested.

Here is the complete code:

 -- Setup locals
 local str = "Hello World!"
 local attempts = 5000000
 local reuses = 10 -- For the second part of benchmark: Table values are reused 10 times. Change this according to your needs.
 local x, c, elapsed, tbl

 -- "Localize" funcs to minimize lookup overhead
 local stringbyte, stringchar, stringsub, stringgsub, stringgmatch = string.byte, string.char, string.sub, string.gsub, string.gmatch

 print("-----------------------")
 print("Raw speed:")
 print("-----------------------")

 -- Version 1 - string.sub in loop
 x = os.clock()
 for j = 1, attempts do
     for i = 1, #str do
         c = stringsub(str, i, i)
     end
 end
 elapsed = os.clock() - x
 print(string.format("V1: elapsed time: %.3f", elapsed))

 -- Version 2 - string.gmatch loop
 x = os.clock()
 for j = 1, attempts do
     for c in stringgmatch(str, ".") do end
 end
 elapsed = os.clock() - x
 print(string.format("V2: elapsed time: %.3f", elapsed))

 -- Version 3 - string.gsub callback
 x = os.clock()
 for j = 1, attempts do
     stringgsub(str, ".", function(c) end)
 end
 elapsed = os.clock() - x
 print(string.format("V3: elapsed time: %.3f", elapsed))

 -- For version 4
 local str2table = function(str)
     local ret = {}
     for i = 1, #str do
         ret[i] = stringsub(str, i, i) -- Note: This is a lot faster than using table.insert
     end
     return ret
 end

 -- Version 4 - function str2table
 x = os.clock()
 for j = 1, attempts do
     tbl = str2table(str)
     for i = 1, #tbl do -- Note: This type of loop is a lot faster than "pairs" loop.
         c = tbl[i]
     end
 end
 elapsed = os.clock() - x
 print(string.format("V4: elapsed time: %.3f", elapsed))

 -- Version 5 - string.byte
 x = os.clock()
 for j = 1, attempts do
     tbl = {stringbyte(str, 1, #str)} -- Note: This is about 15% faster than calling string.byte for every character.
     for i = 1, #tbl do
         c = tbl[i] -- Note: produces char codes instead of chars.
     end
 end
 elapsed = os.clock() - x
 print(string.format("V5: elapsed time: %.3f", elapsed))

 -- Version 5b - string.byte + conversion back to chars
 x = os.clock()
 for j = 1, attempts do
     tbl = {stringbyte(str, 1, #str)} -- Note: This is about 15% faster than calling string.byte for every character.
     for i = 1, #tbl do
         c = stringchar(tbl[i])
     end
 end
 elapsed = os.clock() - x
 print(string.format("V5b: elapsed time: %.3f", elapsed))

 print("-----------------------")
 print("Creating cache table ("..reuses.." reuses):")
 print("-----------------------")

 -- Version 1 - string.sub in loop
 x = os.clock()
 for k = 1, attempts do
     tbl = {}
     for i = 1, #str do
         tbl[i] = stringsub(str, i, i) -- Note: This is a lot faster than using table.insert
     end
     for j = 1, reuses do
         for i = 1, #tbl do
             c = tbl[i]
         end
     end
 end
 elapsed = os.clock() - x
 print(string.format("V1: elapsed time: %.3f", elapsed))

 -- Version 2 - string.gmatch loop
 x = os.clock()
 for k = 1, attempts do
     tbl = {}
     local tblc = 1 -- Note: This is faster than table.insert
     for c in stringgmatch(str, ".") do
         tbl[tblc] = c
         tblc = tblc + 1
     end
     for j = 1, reuses do
         for i = 1, #tbl do
             c = tbl[i]
         end
     end
 end
 elapsed = os.clock() - x
 print(string.format("V2: elapsed time: %.3f", elapsed))

 -- Version 3 - string.gsub callback
 x = os.clock()
 for k = 1, attempts do
     tbl = {}
     local tblc = 1 -- Note: This is faster than table.insert
     stringgsub(str, ".", function(c)
         tbl[tblc] = c
         tblc = tblc + 1
     end)
     for j = 1, reuses do
         for i = 1, #tbl do
             c = tbl[i]
         end
     end
 end
 elapsed = os.clock() - x
 print(string.format("V3: elapsed time: %.3f", elapsed))

 -- Version 4 - str2table func before loop
 x = os.clock()
 for k = 1, attempts do
     tbl = str2table(str)
     for j = 1, reuses do
         for i = 1, #tbl do -- Note: This type of loop is a lot faster than "pairs" loop.
             c = tbl[i]
         end
     end
 end
 elapsed = os.clock() - x
 print(string.format("V4: elapsed time: %.3f", elapsed))

 -- Version 5 - string.byte to create table
 x = os.clock()
 for k = 1, attempts do
     tbl = {stringbyte(str, 1, #str)}
     for j = 1, reuses do
         for i = 1, #tbl do
             c = tbl[i]
         end
     end
 end
 elapsed = os.clock() - x
 print(string.format("V5: elapsed time: %.3f", elapsed))

 -- Version 5b - string.byte to create table + string.char loop to convert bytes to chars
 x = os.clock()
 for k = 1, attempts do
     tbl = {stringbyte(str, 1, #str)}
     for i = 1, #tbl do
         tbl[i] = stringchar(tbl[i])
     end
     for j = 1, reuses do
         for i = 1, #tbl do
             c = tbl[i]
         end
     end
 end
 elapsed = os.clock() - x
 print(string.format("V5b: elapsed time: %.3f", elapsed))

Sample output (Lua 5.3.4, Windows) :

 -----------------------
 Raw speed:
 -----------------------
 V1: elapsed time: 3.713
 V2: elapsed time: 5.089
 V3: elapsed time: 5.222
 V4: elapsed time: 4.066
 V5: elapsed time: 2.627
 V5b: elapsed time: 3.627
 -----------------------
 Creating cache table (10 reuses):
 -----------------------
 V1: elapsed time: 20.381
 V2: elapsed time: 23.913
 V3: elapsed time: 25.221
 V4: elapsed time: 20.551
 V5: elapsed time: 13.473
 V5b: elapsed time: 18.046

Result:

In my case, string.byte and string.sub were the fastest in terms of raw speed. When using the cache table and reusing it 10 times per cycle, the string.byte version was the fastest even when converting codes back to characters (which is not always necessary and depends on use).

As you probably noticed, I made some assumptions based on my previous tests and applied them to the code:

  1. Library functions should always be localized if used inside loops, because it is much faster.
  2. Inserting a new element into a Lua table is much faster with tbl[idx] = value than with table.insert(tbl, value).
  3. Looping through a table with for i = 1, #tbl is a little faster than with for k, v in pairs(tbl).
  4. Always prefer a version with fewer function calls, because the call itself slightly increases the execution time.
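Put together, the first three assumptions might look like the sketch below. The helper name `to_chars` is hypothetical and not part of the benchmark, and how much each choice actually gains depends on the Lua version and the workload.

```lua
-- Assumption 1: cache the library function in a local before the loop.
local stringsub = string.sub

-- to_chars is a hypothetical helper used only for this sketch.
local function to_chars(s)
    local out = {}
    for i = 1, #s do
        -- Assumption 2: direct index assignment instead of table.insert.
        out[i] = stringsub(s, i, i)
    end
    return out
end

local chars = to_chars("Lua")

-- Assumption 3: numeric for loop instead of pairs.
for i = 1, #chars do
    print(i, chars[i])
end
```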

Hope it helps.

+3
Mar 11 '18 at 17:06

Everyone above suggests a less optimal method.

This would be better:

 function chars(str)
     local strc = {}
     for i = 1, #str do
         table.insert(strc, string.sub(str, i, i))
     end
     return strc
 end

 str = "Hello world!"
 char = chars(str)

 print("Char 2: "..char[2]) -- prints the char 'e'
 print("-------------------\n")
 for i = 1, #str do -- testing printing all the chars
     if (char[i] == " ") then
         print("Char "..i..": [[space]]")
     else
         print("Char "..i..": "..char[i])
     end
 end
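If the table is not needed afterwards, a similar interface can be sketched as an iterator for the generic for loop, which is close to the `for char in str do` syntax attempted in the question. The names `char_iter` and `each_char` are hypothetical, and the sketch walks the string byte by byte, so it assumes single-byte characters.

```lua
-- Stateless iterator step: given the string and the previous index,
-- return the next index and character, or nothing at the end.
local function char_iter(s, i)
    i = i + 1
    if i <= #s then
        return i, s:sub(i, i)
    end
end

-- each_char is a hypothetical helper used only for this sketch.
local function each_char(s)
    return char_iter, s, 0
end

for i, c in each_char("abcd") do
    print(i, c)
end
```

This avoids allocating a table per string, at the cost of one `sub` call per character on every pass.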
0
Feb 22 '17 at 9:53


