Sort a string using a different sort order string

I saw this in an interview. For a given sort order line, you are asked to sort the input line based on a given sort order line.
for example, if the sort order line is dfbcae and the input line is abcdeeabc output should be dbbccaaee .

any ideas on how to do this in an efficient way?

+4
source share
9 answers

Here is a good algorithm that is easy to understand and has decent algorithmic complexity.

For each character in the sort order string

  • scan line to be sorted starting from the first non-stationary character (you can track this character with a pointer or pointer)
    • when you detect the appearance of the specified character, replace it with the first unordered character
    • increase index for first unordered character

This is O(n*m) , where n is the length of the string to sort, and m is the length of the string of the sort order. We can exceed the lower bound of sorting based on comparison, because this algorithm does not really use comparisons. Like Counting Sort , it relies on the fact that you have a predefined finite set of external order.

Here are some psuedocode:

 int head = 0; foreach(char c in sortOrder) { for(int i = head; i < sortTarget.length; i++) { if(sortTarget[i] == c) { // swap i with head char temp = sortTarget[head]; sortTarget[head] = sortTarget[i]; sortTarget[i] = temp; head++; } } } 
+3
source

The Counting Sort parameter is pretty cool and fast when the sortable string is long compared to the sort order string.

  • create an array in which each index corresponds to a letter in the alphabet, this is an array of count
  • for each letter in the sort task, increment the index in the count array that matches that letter
  • for each letter in the sort order line
    • add this letter to the end of the output line, the number of times equal to it in the count array

Algorithmic complexity O(n) , where n is the length of the string to be sorted. As the Wikipedia article explains, we can defeat the lower bound of standard sorting based on comparison, because it's not sorting based on comparison.

Here is some pseudo code.

 char[26] countArray; foreach(char c in sortTarget) { countArray[c - 'a']++; } int head = 0; foreach(char c in sortOrder) { while(countArray[c - 'a'] > 0) { sortTarget[head] = c; head++; countArray[c - 'a']--; } } 

Note. This implementation requires that both strings contain only lowercase characters.

+6
source

Use a binary search to find all the β€œsplit points” between different letters, and then use the length of each segment directly. This will be asymptotically faster than a naive sorting count, but it will be harder to implement:

  • Use an array of size 26 * 2 to store the beginning and end of each letter;

  • Examine the middle element, see if it is different from the element left to it. If so, then this is the beginning for the middle element and the end of the element in front of it;

  • Drop the segment with the same beginning and end (if any), recursively apply this algorithm.

Since there are no more than 25 "split", you will not need to search for more than 25 segments, and for each segment - O (logn). Since this is constant * O(logn) , the algorithm is O (nlogn).

And of course, just using counting sorting will be easier to implement:

  • Use an array of size 26 to record the number of different letters;

  • Scan input line;

  • Print a string in the specified sort order.

This is O (n), n is the length of the string.

0
source

In Python, you can simply create an index and use it in a comparison expression:

 order = 'dfbcae' input = 'abcdeeabc' index = dict([ (y,x) for (x,y) in enumerate(order) ]) output = sorted(input, cmp=lambda x,y: index[x] - index[y]) print 'input=',''.join(input) print 'output=',''.join(output) 

gives this result:

 input= abcdeeabc output= dbbccaaee 
0
source

Interview questions are usually about the thought process and usually don’t care too much about language functions, but I could not resist publishing the version of VB.Net 4.0 anyway.

Effective can mean two different things. The first is "the fastest way to get the computer to complete the task," and the second is "what is the fastest that we can complete the task." They may sound the same, but the former may mean micro-optimizations, such as int vs short , starting timers to compare runtimes, and performing weekly settings for every millisecond from the algorithm. The second definition is how long it will take human time to create code that performs the task (I hope, in a reasonable amount of time). If code A is 20 times faster than code B, but code B takes 1/20 time to write, depending on the timer details (1 ms vs 20 ms, 1 week vs 20 weeks), each version can be considered "effective" ,.

  Dim input = "abcdeeabc" Dim sort = "dfbcae" Dim SortChars = sort.ToList() Dim output = New String((From c In input.ToList() Select c Order By SortChars.IndexOf(c)).ToArray()) Trace.WriteLine(output) 
0
source

Here is my solution to the question

 import java.util.*; import java.io.*; class SortString { public static void main(String arg[])throws IOException { BufferedReader br=new BufferedReader(new InputStreamReader(System.in)); // System.out.println("Enter 1st String :"); // System.out.println("Enter 1st String :"); // String s1=br.readLine(); // System.out.println("Enter 2nd String :"); // String s2=br.readLine(); String s1="tracctor"; String s2="car"; String com=""; String uncom=""; for(int i=0;i<s2.length();i++) { if(s1.contains(""+s2.charAt(i))) { com=com+s2.charAt(i); } } System.out.println("Com :"+com); for(int i=0;i<s1.length();i++) if(!com.contains(""+s1.charAt(i))) uncom=uncom+s1.charAt(i); System.out.println("Uncom "+uncom); System.out.println("Combined "+(com+uncom)); HashMap<String,Integer> h1=new HashMap<String,Integer>(); for(int i=0;i<s1.length();i++) { String m=""+s1.charAt(i); if(h1.containsKey(m)) { int val=(int)h1.get(m); val=val+1; h1.put(m,val); } else { h1.put(m,new Integer(1)); } } StringBuilder x=new StringBuilder(); for(int i=0;i<com.length();i++) { if(h1.containsKey(""+com.charAt(i))) { int count=(int)h1.get(""+com.charAt(i)); while(count!=0) {x.append(""+com.charAt(i));count--;} } } x.append(uncom); System.out.println("Sort "+x); } 

}

0
source

Here is my version, which is O (n) in time. Instead of unordered_map, I could use a constant size char array. I., d. char char_count[256] (and done ++char_count[ch - 'a'] ), assuming that the input strings have all the small ASCII characters.

 string SortOrder(const string& input, const string& sort_order) { unordered_map<char, int> char_count; for (auto ch : input) { ++char_count[ch]; } string res = ""; for (auto ch : sort_order) { unordered_map<char, int>::iterator it = char_count.find(ch); if (it != char_count.end()) { string s(it->second, it->first); res += s; } } return res; } 
0
source
  private static String sort(String target, String reference) { final Map<Character, Integer> referencesMap = new HashMap<Character, Integer>(); for (int i = 0; i < reference.length(); i++) { char key = reference.charAt(i); if (!referencesMap.containsKey(key)) { referencesMap.put(key, i); } } List<Character> chars = new ArrayList<Character>(target.length()); for (int i = 0; i < target.length(); i++) { chars.add(target.charAt(i)); } Collections.sort(chars, new Comparator<Character>() { @Override public int compare(Character o1, Character o2) { return referencesMap.get(o1).compareTo(referencesMap.get(o2)); } }); StringBuilder sb = new StringBuilder(); for (Character c : chars) { sb.append(c); } return sb.toString(); } 
0
source

In C #, I would just use the IComparer interface and leave it in Array.Sort

 void Main() { // we defin the IComparer class to define Sort Order var sortOrder = new SortOrder("dfbcae"); var testOrder = "abcdeeabc".ToCharArray(); // sort the array using Array.Sort Array.Sort(testOrder, sortOrder); Console.WriteLine(testOrder.ToString()); } public class SortOrder : IComparer { string sortOrder; public SortOrder(string sortOrder) { this.sortOrder = sortOrder; } public int Compare(object obj1, object obj2) { var obj1Index = sortOrder.IndexOf((char)obj1); var obj2Index = sortOrder.IndexOf((char)obj2); if(obj1Index == -1 || obj2Index == -1) { throw new Exception("character not found"); } if(obj1Index > obj2Index) { return 1; } else if (obj1Index == obj2Index) { return 0; } else { return -1; } } } 
0
source

All Articles