
How to find a substring index?

I am looking for the Elixir equivalent of Ruby's String#index:

    "john.snow@domain.com".index("@")      # => 9
    "john.snow@domain.com".index("domain") # => 10
5 answers

TL;DR: String.index/2 is intentionally missing, because more sensible alternatives exist. Very often, String.split/2 solves the underlying problem, and it performs much better.
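As a minimal sketch of that claim, assume the index of "@" was only wanted in order to cut an e-mail address in two. The variable names below are illustrative, and the example assumes the address contains at least one "@":

```elixir
# Illustrative case: the index of "@" was only a means to split the address.
email = "john.snow@domain.com"

# parts: 2 stops after the first "@", so everything after it stays in one piece.
[local, domain] = String.split(email, "@", parts: 2)

IO.inspect(local)   # "john.snow"
IO.inspect(domain)  # "domain.com"
```

If the address might not contain "@", the match on a two-element list would fail, so a case expression would be the safer choice.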

  • I assume we are talking about UTF-8 strings here and expect non-ASCII characters to be handled correctly.

  • Elixir encourages fast code. It turns out that the problems we usually try to solve with String.index/2 can be solved in a much smarter way, significantly improving performance without hurting the readability of the code.

  • The more sensible solution is to use String.split/2 and/or other similar String functions. String.split/2 works at the byte level while still handling graphemes correctly. This cannot go wrong, because both arguments are strings! String.index/2 would have to work at the grapheme level, slowly scanning through the entire string.

  • For this reason, String.index/2 is unlikely to be added to the language unless very convincing use cases appear that cannot be solved with the existing functions.

  • See also the elixir-lang-core discussion on this subject: https://groups.google.com/forum/#!topic/elixir-lang-core/S0yrDxlJCss

  • As a side note, Elixir is quite unique in its mature support for Unicode. While most languages work at the codepoint level (colloquially "characters"), Elixir works with the higher-level concept of graphemes. Graphemes are what users perceive as one character (a more practical understanding of "character", so to speak). A grapheme may contain more than one codepoint (which, in turn, may consist of more than one byte).
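The three levels mentioned in the last point (graphemes, codepoints, bytes) can be seen side by side in a small sketch. The two spellings of "é" below are chosen purely for illustration:

```elixir
# "é" written two ways: precomposed (U+00E9) and "e" plus a combining acute accent (U+0301).
precomposed = "\u00E9"
decomposed = "e\u0301"

for s <- [precomposed, decomposed] do
  # {graphemes, codepoints, bytes}
  IO.inspect({String.length(s), length(String.codepoints(s)), byte_size(s)})
end
# prints {1, 1, 2} and then {1, 2, 3}
```

Both strings are a single grapheme to the user, yet they differ in codepoint and byte counts, which is why a byte offset and a "character index" are not interchangeable.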

Finally, if we really need an index:

    case String.split("john.snow@domain.com", "domain", parts: 2) do
      [left, _] -> String.length(left)
      [_] -> nil
    end
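The same expression can be wrapped in a small helper so it can be reused. This is only a sketch: the module and function names are hypothetical and not part of Elixir's String module.

```elixir
defmodule SubstringIndex do
  # Hypothetical helper, not part of the standard library.
  # Returns the 0-based grapheme index of the first occurrence of `pattern`,
  # or nil when `pattern` does not occur in `string`.
  def grapheme_index(string, pattern) do
    case String.split(string, pattern, parts: 2) do
      [left, _rest] -> String.length(left)
      [_whole] -> nil
    end
  end
end

IO.inspect(SubstringIndex.grapheme_index("john.snow@domain.com", "domain"))  # 10
IO.inspect(SubstringIndex.grapheme_index("aéiou", "o"))                      # 3
IO.inspect(SubstringIndex.grapheme_index("john.snow@domain.com", "xyz"))     # nil
```

Because the result comes from String.length/1, it counts graphemes rather than bytes, so the two-byte "é" is counted as one position.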

I don't think there are any Elixir wrappers for this; see #1119.

You can call :binary.match directly, though:

    iex(1)> :binary.match "john.snow@domain.com", "@"
    {9, 1}
    iex(2)> :binary.match "john.snow@domain.com", "domain"
    {10, 6}

The return value is a tuple containing the index and the length of the match. You can extract just the index by piping into |> elem(0) or by using pattern matching.

Note that :binary.match returns :nomatch if the substring is not found in the string.
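Since the result is either a tuple or the atom :nomatch, a case expression handles both shapes in one place. A minimal sketch follows; the ByteIndex module and the {:ok, index} / :error return convention are assumptions made for illustration:

```elixir
defmodule ByteIndex do
  # Hypothetical wrapper around :binary.match/2.
  # Returns {:ok, byte_index} on a match and :error otherwise.
  def find(string, pattern) do
    case :binary.match(string, pattern) do
      {index, _length} -> {:ok, index}
      :nomatch -> :error
    end
  end
end

IO.inspect(ByteIndex.find("john.snow@domain.com", "@"))  # {:ok, 9}
IO.inspect(ByteIndex.find("john.snow@domain.com", "#"))  # :error
```

Piping straight into elem(0) would raise on :nomatch, because an atom is not a tuple, which is the reason for matching on both cases here.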


You can use Regex.run/3 and pass it return: :index as an option:

    iex(5)> [{start, len}] = Regex.run(~r/abc/, " abc ", return: :index)
    [{1, 3}]
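Note that Regex.run/3 returns nil when nothing matches, so the pattern match on a one-element list above would raise in that case. A hedged sketch that tolerates a missing match is shown below; the RegexIndex module name is hypothetical, and the offsets returned with return: :index are byte offsets:

```elixir
defmodule RegexIndex do
  # Hypothetical helper: byte offset of the first regex match, or nil.
  def first_index(regex, string) do
    case Regex.run(regex, string, return: :index) do
      # The first tuple describes the whole match; any captures follow it.
      [{start, _len} | _captures] -> start
      nil -> nil
    end
  end
end

IO.inspect(RegexIndex.first_index(~r/abc/, " abc "))  # 1
IO.inspect(RegexIndex.first_index(~r/xyz/, " abc "))  # nil
IO.inspect(RegexIndex.first_index(~r/o/, "aéiou"))    # 4
```

The last call returns 4 rather than 3 because "é" occupies two bytes.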

You can get the byte index using :binary.match/2:

    {index, length} = :binary.match("aéiou", "o")
    # => {4, 1}

If you want the character position in the string instead, use:

    "aéiou" |> to_charlist() |> Enum.find_index(&(&1 == ?o))
    # => 3

The String module documentation explains the difference between byte length and string length.
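The charlist approach only finds a single codepoint. For a longer pattern, one possible sketch is to take the byte offset from :binary.match/2 and count the graphemes in front of it. The Position module below is hypothetical, and it assumes the match begins on a grapheme boundary:

```elixir
defmodule Position do
  # Hypothetical helper: 0-based grapheme position of `pattern` in `string`, or nil.
  def grapheme_position(string, pattern) do
    case :binary.match(string, pattern) do
      {byte_index, _len} ->
        # Count the graphemes in the bytes that precede the match.
        string |> binary_part(0, byte_index) |> String.length()

      :nomatch ->
        nil
    end
  end
end

IO.inspect(Position.grapheme_position("aéiou", "o"))    # 3
IO.inspect(Position.grapheme_position("aéiou", "iou"))  # 2
IO.inspect(Position.grapheme_position("aéiou", "x"))    # nil
```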

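    # index (as INSTR from BASIC...)
    ...
    import IO, except: [inspect: 1]

    puts index "algopara ver", "ver"

    # Returns the 0-based byte offset of searchstring in mainstring.
    # Caution: it returns 0 when nothing is found, so "no match" cannot be
    # told apart from a match at the very start of the string.
    def index(mainstring, searchstring) do
      case :binary.match(mainstring, searchstring) do
        :nomatch -> 0
        {position, _length} -> position
      end
    end
    ...
    9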