Is there a trick or algorithm for finding all possible substrings in O(n) time?

Question

I have a brute-force solution that computes all the substrings of the input string in O(n^2) time. It takes a long time when the input string is very long.

How can I find all substrings in O(n) time?

I am only looking for the number of substrings whose first and last characters are the same. As you can see in my code below, the function only returns the count. I want to do this in O(n) time.

My brute force solution:

    // Counts all substrings whose first and last characters are equal
    public class Solution {

        public static void main(String[] args) {
            String inputString = "ababaca";
            System.out.println(findSubstringByBruteForcce(inputString, inputString.length()));
        }

        private static long findSubstringByBruteForcce(String inputString, int length) {
            long count = 0;
            for (int i = 0; i < length; i++) {
                for (int j = 1; j <= length - i; j++) {
                    String str = inputString.substring(i, i + j);
                    if (str.length() == 1) {
                        count = count + 1;
                    } else {
                        if (str.substring(0, 1).equals(str.substring(str.length() - 1, str.length()))) {
                            count = count + 1;
                        }
                    }
                }
            }
            return count;
        }
    }

How can I optimize the solution above and find the answer in O(n) time? The input string can be very large (length of approximately 10^6), and the brute-force approach takes about 20 seconds. I want the maximum execution time to be less than 2 seconds.

+5

java string substring algorithm

Sandeep kulkarni Jun 21 '15 at 2:18

3 answers

This runs in O(n) time, but uses a lot of memory:

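    public static long findSubstringByCharacterMap(String s, int length) {
        long count = 0;
        // One counter per possible char value
        long[] map = new long[Character.MAX_VALUE + 1];
        for (int i = 0; i < length; ++i)
            // Each earlier occurrence of this character, plus the character itself,
            // starts one substring that ends here
            count += ++map[s.charAt(i)];
        return count;
    }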

If the string contains only single-byte characters, the size of the long[] map can be 256.

You can replace the long[] map with a Map<Character, Long> map, but that is slower.
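
As a rough illustration of the Map<Character, Long> variant mentioned above, here is a minimal sketch. The class name MapBasedCount and the method name countWithMap are made up for this example and do not come from the answer; the sketch assumes the same counting idea as the array version and only swaps the storage.

```java
import java.util.HashMap;
import java.util.Map;

class MapBasedCount {

    // Hypothetical helper: same counting idea as the array version,
    // but occurrences are stored in a HashMap instead of a long[].
    static long countWithMap(String s) {
        Map<Character, Long> seen = new HashMap<>();
        long count = 0;
        for (int i = 0; i < s.length(); i++) {
            // merge() returns the updated occurrence count for this character,
            // which is the number of qualifying substrings ending at index i.
            count += seen.merge(s.charAt(i), 1L, Long::sum);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWithMap("ababaca")); // prints 14
    }
}
```

The map only holds entries for characters that actually occur, so it saves memory on small alphabets, at the cost of boxing and hashing on every character.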

+3

saka1029 Jun 21 '15 at 2:46

I have a solution that runs in O(n) time and uses a constant amount of extra space: an array of size 256 (this assumes 8-bit characters, whose maximum value is 255).

Algorithm steps

Create a frequency array of size 256.
Traverse the whole string.
For each character, add its current frequency to ans, then increment its frequency.
After the traversal, add the string length to ans (for the single-character substrings).
Here is my Java implementation; tell me if I am wrong or if I have misunderstood the question.

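    class Solution {
        public static void main(String[] args) {
            String str = "aabbab#cd#e";
            // Frequency of each character seen so far (assumes character values below 256)
            int[] array = new int[256];
            // long rather than int: for a string of length 10^6 the count can exceed the int range
            long ans = 0;
            for (int i = 0; i < str.length(); i++) {
                // Every earlier occurrence of this character starts a substring that ends here
                ans += array[(int) str.charAt(i)];
                array[(int) str.charAt(i)]++;
            }
            // Each single character is itself a qualifying substring
            ans = ans + str.length();
            System.out.print(ans);
        }
    }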

This algorithm counts duplicates: substrings with the same content at different positions are each counted separately.

0

nikhil jain Jun 21 '15 at 8:54

David Eisenstat · Accepted Answer · 2015-06-21T02:38:34+0000

Since a substring's identity is determined by its bounding indices, not its contents, it is enough to compute the frequency of each letter and then sum the term (frequency + 1) * frequency div 2 over all letters. Each unordered pair of positions holding the same letter, where a position may also be paired with itself, yields exactly one counted substring.
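
The formula above could be sketched as follows. The class name FrequencyFormula and the method name countByFrequency are made up for this example, and the table size assumes that any Java char value may occur. For "ababaca" the frequencies are a = 4, b = 2, c = 1, so the sum is 4 * 5 / 2 + 2 * 3 / 2 + 1 * 2 / 2 = 10 + 3 + 1 = 14.

```java
class FrequencyFormula {

    // Hypothetical helper: counts substrings whose first and last characters match
    // by summing f * (f + 1) / 2 over the frequency f of every character.
    static long countByFrequency(String s) {
        // One slot per possible char value (assumption: any char may occur).
        long[] freq = new long[Character.MAX_VALUE + 1];
        for (int i = 0; i < s.length(); i++) {
            freq[s.charAt(i)]++;
        }
        long total = 0;
        for (long f : freq) {
            // f single positions plus f * (f - 1) / 2 pairs of distinct positions
            total += f * (f + 1) / 2;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(countByFrequency("ababaca")); // prints 14
    }
}
```

Using long for both the frequencies and the total avoids overflow: a string of 10^6 identical characters gives a count of about 5 * 10^11, which does not fit in an int.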
