Question about strings and characters for the guru

Here's the problem that scored me (the solution is wise):

Given str S , apply character mappings Cm = {a=(m,o,p),d=(q,u),...} and print all possible combinations using C or C ++.

The string can be of any length, and the number of character mappings is different, and there will be no mappings that are displayed on another card (thus avoiding circular dependencies).

As an example: a string abba with mappings a=(e,o), d=(g,h), b=(i) will print:

 abba,ebba,obba,abbe,abbo,ebbe,ebbo,obbe,obbo,aiba,aiia,abia,eiba,eiia,...... 
+6
c ++ c string mapping character
source share
6 answers

Definitely possible, not very difficult ... but it will surely create many lines.

The first thing to notice is that you know how many lines it is going to generate in advance, so it’s easy to perform some health checks :)

Second: it seems like a recursive solution would be simple (like so many workarounds).

 class CharacterMapper { public: CharacterMapper(): mGenerated(), mMapped() { for (int i = -128, max = 128; i != max; ++i) mMapped[i].push_back(i); // 'a' is mapped to 'a' by default } void addMapped(char origin, char target) { std::string& m = mMapped[origin]; if (m.find(target) == std::string::npos) m.push_back(target); } // addMapped void addMapped(char origin, const std::string& target) { for (size_t i = 0, max = target.size(); i != max; ++i) this->addMapped(origin, target[i]); } // addMapped void execute(const std::string& original) { mGenerated.clear(); this->next(original, 0); this->sanityCheck(original); this->print(original); } private: void next(std::string original, size_t index) { if (index == original.size()) { mGenerated.push_back(original); } else { const std::string& m = mMapped[original[index]]; for (size_t i = 0, max = m.size(); i != max; ++i) this->next( original.substr(0, index) + m[i] + original.substr(index+1), index+1 ); } } // next void sanityCheck(const std::string& original) { size_t total = 1; for (size_t i = 0, max = original.size(); i != max; ++i) total *= mMapped[original[i]].size(); if (total != mGenerated.size()) std::cout << "Failure: should have found " << total << " words, found " << mGenerated.size() << std::endl; } void print(const std::string& original) const { typedef std::map<char, std::string>::const_iterator map_iterator; typedef std::vector<std::string>::const_iterator vector_iterator; std::cout << "Original: " << original << "\n"; std::cout << "Mapped: {"; for (map_iterator it = mMapped.begin(), end = mMapped.end(); it != end; ++it) if (it->second.size() > 1) std::cout << "'" << it->first << "': '" << it->second.substr(1) << "'"; std::cout << "}\n"; std::cout << "Generated:\n"; for (vector_iterator it = mGenerated.begin(), end = mGenerated.end(); it != end; ++it) std::cout << " " << *it << "\n"; } std::vector<std::string> mGenerated; std::map<char, std::string> mMapped; }; // class CharacterMapper int main(int argc, char* argv[]) { CharacterMapper mapper; mapper.addMapped('a', "eo"); mapper.addMapped('d', "gh"); mapper.addMapped('b', "i"); mapper.execute("abba"); } 

And here is the conclusion:

 Original: abba Mapped: {'a': 'eo''b': 'i''d': 'gh'} Generated: abba abbe abbo abia abie abio aiba aibe aibo aiia aiie aiio ebba ebbe ebbo ebia ebie ebio eiba eibe eibo eiia eiie eiio obba obbe obbo obia obie obio oiba oibe oibo oiia oiie oiio 

Yes, quite long, but there are many that are not directly involved in the calculation (initialization, verification, printing). The main methods are next , which implement recursion.

+2
source share

EDIT: This should be the fastest and easiest to use algorithm. Some may argue with style or portability; I think this is ideal for the built-in type, and I have already spent a lot of time on this. I leave the original below.

An array is used for this. The bit sign is used to indicate the end of the display cycle, so the array type must be larger than the displayed type if you want to use the full unsigned range.

Generates 231M lines / second or ~ 9.5 cycles / lines at Core2 2.2GHz. Test conditions and use as shown below.

 #include <iostream> using namespace std; int const alphabet_size = CHAR_MAX+1; typedef int map_t; // may be char or short, small performance penalty int const sign_bit = 1<< CHAR_BIT*sizeof(map_t)-1; typedef map_t cmap[ alphabet_size ]; void CreateMap( char *str, cmap &m ) { fill( m, m+sizeof(m)/sizeof(*m), 0 ); char *str_end = strchr( str, 0 ) + 1; str_end[-1] = ' '; // space-terminated strings char prev = ' '; for ( char *pen = str; pen != str_end; ++ pen ) { if ( * pen == ' ' ) { m[ prev ] |= sign_bit; prev = 0; } m[ * pen ] = * pen; if ( prev != ' ' ) swap( m[prev], m[ *pen ] ); prev = *pen; } for ( int mx = 0; mx != sizeof(m)/sizeof(*m); ++ mx ) { if ( m[mx] == 0 ) m[mx] = mx | sign_bit; } } bool NextMapping( char *s, char *s_end, cmap &m ) { for ( char *pen = s; pen != s_end; ++ pen ) { map_t oldc = *pen, newc = m[ oldc ]; * pen = newc & sign_bit-1; if ( newc >= 0 ) return true; } return false; } int main( int argc, char **argv ) { uint64_t cnt = 0; cmap m; CreateMap( argv[1], m ); char *s = argv[2], *s_end = strchr( s, 0 ); do { ++ cnt; } while ( NextMapping( s, s_end, m ) ); cerr << cnt; return 0; } 

ORIGINAL:

Not as short or reliable as we would like, but here's something.

  • It is required that the input line always contain the first letter in alphabetical order in each set of substitutions
  • Run a la maptool 'aeo dgh bi' abbd
  • Exit in reverse lexicographical order
  • Productivity about 22 cycles / lines (100 M lines / second at 2.2 GHz Core2)
    • BUT my platform is trying to be smart with string s, slowing it down.
    • If I changed it to use char* strings instead, it would work at 142 M rows / sec (~ 15.5 cycles / row)
  • It should be possible to work faster with the mapping table char[256] , and another char[256] indicating which characters end the loop.

The map data structure is an array of nodes connected in circular lists.

 #include <iostream> #include <algorithm> using namespace std; enum { alphabet_size = UCHAR_MAX+1 }; struct MapNode { MapNode *next; char c; bool last; MapNode() : next( this ), c(0), last(false) {} }; void CreateMap( string s, MapNode (&m)[ alphabet_size ] ) { MapNode *mprev = 0; replace( s.begin(), s.end(), ' ', '\0' ); char *str = const_cast<char*>(s.c_str()), *str_end = str + s.size() + 1; for ( char *pen = str; pen != str_end; ++ pen ) { if ( mprev == 0 ) sort( pen, pen + strlen( pen ) ); if ( * pen == 0 ) { if ( mprev ) mprev->last = true; mprev = 0; continue; } MapNode &mnode = m[ * pen ]; if ( mprev ) swap( mprev->next, mnode.next ); // link node in mnode.c = * pen; // tell it what char it is mprev = &mnode; } // make it easier to tell that a node isn't in any map for ( MapNode *mptr = m; mptr != m + alphabet_size; ++ mptr ) { if ( mptr->next == mptr ) mptr->next = 0; } } bool NextMapping( string &s, MapNode (&m)[ alphabet_size ] ) { for ( string::iterator it = s.begin(); it != s.end(); ++ it ) { MapNode &mnode = m[ * it ]; if ( mnode.next ) { * it = mnode.next->c; if ( ! mnode.last ) return true; } } return false; } int main( int argc, char **argv ) { MapNode m[ alphabet_size ]; CreateMap( argv[1], m ); string s = argv[2]; do { cerr << s << endl; } while ( NextMapping( s, m ) ); return 0; } 
+2
source share

The way I'm going to do this is to create an array of indexes of the same length as the string, all initialized to zero. Then we treat this array of indices as a counter to list all possible comparisons of the original row. Index 0 maps this position in the string to the first match for this character, from 1 to second, etc. We can execute them through the order by simply increasing the last index in the array, moving it to the next position when we reach the maximum number of mappings for this position.

To use your example, we have mappings

 'a' => 'e', 'o' 'b' => 'i' 

With the input string "abba", we need an array of four elements for our indices:

 [0,0,0,0] => "abba" [0,0,0,1] => "abbe" [0,0,0,2] => "abbo" [0,0,1,0] => "abia" [0,0,1,1] => "abie" [0,0,1,2] => "abio" [0,1,0,0] => "aiba" [0,1,0,1] => "aibe" [0,1,0,2] => "aibo" [0,1,1,0] => "aiia" [0,1,1,1] => "aiie" [0,1,1,2] => "aiio" [1,0,0,0] => "ebba" [1,0,0,1] => "ebbe" [1,0,0,2] => "ebbo" [1,0,1,0] => "ebia" [1,0,1,1] => "ebie" [1,0,1,2] => "ebio" [1,1,0,0] => "eiba" [1,1,0,1] => "eibe" [1,1,0,2] => "eibo" [1,1,1,0] => "eiia" [1,1,1,1] => "eiie" [1,1,1,2] => "eiio" [2,0,0,0] => "obba" [2,0,0,1] => "obbe" [2,0,0,2] => "obbo" [2,0,1,0] => "obia" [2,0,1,1] => "obie" [2,0,1,2] => "obio" [2,1,0,0] => "oiba" [2,1,0,1] => "oibe" [2,1,0,2] => "oibo" [2,1,1,0] => "oiia" [2,1,1,1] => "oiie" [2,1,1,2] => "oiio" 

Before we start generating these lines, we need to store them somewhere, which in C means that we will have to allocate memory. Fortunately, we already know the length of these lines, and we can find out the number of lines that we are going to generate - this is just a product of the number of displays for each position.

While you can return them to an array , I prefer to use a callback to return them when I find them.

 #include <string.h> #include <stdlib.h> int each_combination( char const * source, char const * mappings[256], int (*callback)(char const *, void *), void * thunk ) { if (mappings == NULL || source == NULL || callback == NULL ) { return -1; } else { size_t i; int rv; size_t num_mappings[256] = {0}; size_t const source_len = strlen(source); size_t * const counter = calloc( source_len, sizeof(size_t) ); char * const scratch = strdup( source ); if ( scratch == NULL || counter == NULL ) { rv = -1; goto done; } /* cache the number of mappings for each char */ for (i = 0; i < 256; i++) num_mappings[i] = 1 + (mappings[i] ? strlen(mappings[i]) : 0); /* pass each combination to the callback */ do { rv = callback(scratch, thunk); if (rv != 0) goto done; /* increment the counter */ for (i = 0; i < source_len; i++) { counter[i]++; if (counter[i] == num_mappings[(unsigned char) source[i]]) { /* carry to the next position */ counter[i] = 0; scratch[i] = source[i]; continue; } /* use the next mapping for this character */ scratch[i] = mappings[(unsigned char) source[i]][counter[i]-1]; break; } } while(i < source_len); done: if (scratch) free(scratch); if (counter) free(counter); return rv; } } #include <stdio.h> int print_each( char const * s, void * name) { printf("%s:%s\n", (char const *) name, s); return 0; } int main(int argc, char ** argv) { char const * mappings[256] = { NULL }; mappings[(unsigned char) 'a'] = "eo"; mappings[(unsigned char) 'b'] = "i"; each_combination( "abba", mappings, print_each, (void *) "abba"); each_combination( "baobab", mappings, print_each, (void *) "baobab"); return 0; } 
+1
source share

Are you sure you want to do a depth search (DFS) or any other workaround down a directed acyclic word schedule (DAWG). I will send the code soon.

0
source share

There is a link to the fragment archive that does this, here is Permute2.c . There is another option for line swapping (I think you could filter out those that are not on the map!) See here snippets ' archive ...

Hope this helps, Regards, Tom.

0
source share

simple, recursive permutation using char map [256]

 char *map [256]; /* permute the ith char in s */ perm (char *s, int i) { if (!s) return; /* terminating condition */ if (s[i] == '\0') { /* add "s" to a string array if we want to store the permutations */ printf("%s\n", s); return; } char c = s[i]; char *m = map [c]; // printf ("permuting at [%c]: %s\n", c, m); int j=0; /* do for the first char, then use map chars */ do { perm (s, i+1); s[i] = m[j]; } while (m[j++] != '\0'); /* restore original char here, used for mapping */ s[i] = c; return; } int main () { /* map table initialization */ map['a'] = "eo\0"; map['b'] = "i\0"; map['d'] = "gh\0"; /* need modifyable sp, as we change chars in position, sp="abba" will not work! */ char *sp = malloc (10); strncpy (sp, "abba\0", 5); perm (sp, 0); return 0; } 
0
source share

All Articles