Any good surname databases?

I want to create some database test data, in particular table columns containing people's names. To get a good idea of how well indexing works for name-based searches, I want to get as close as possible to real-world names and their true frequency distribution, i.e. many different names with frequencies that follow a power law distribution.

Ideally, I am looking for a freely available data file with names, followed by a single frequency value (or equivalent probability) for the name.

Anglo-Saxon names would be good, although names from other cultures would also be useful.

+7
3 answers

I found some US census data that meet this requirement. The only caveat is that it lists only names that occur at least 100 times ...

I found it through this blog post, which also shows the power law distribution curve.

In addition, you can sample from the list using roulette wheel selection, for example (untested):

    using System;

    struct NameEntry
    {
        public string _name;
        public int _frequency;
    }

    class NameSampler
    {
        int _frequencyTotal; // Precalculate this: the sum of all _frequency values.

        public string SampleName(NameEntry[] nameEntryArr, Random rng)
        {
            // Throw the roulette ball: a uniform integer in [0, _frequencyTotal).
            int throwValue = rng.Next(_frequencyTotal);
            int accumulator = 0;
            for (int i = 0; i < nameEntryArr.Length; i++)
            {
                accumulator += nameEntryArr[i]._frequency;
                // Strict comparison, so zero-frequency entries are never selected.
                if (throwValue < accumulator)
                {
                    return nameEntryArr[i]._name;
                }
            }
            // If we get here then we have an array of zero frequencies.
            throw new ApplicationException("Invalid operation. No non-zero frequencies to select.");
        }
    }
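
The linear scan above costs O(n) per sample, which can add up when generating millions of rows from a long name list. A possible variation, sketched here in Java and not taken from the census data or the blog post, precomputes cumulative frequencies once and then binary-searches them for each throw. The class name `WeightedNameSampler` and its methods are hypothetical, and the sketch assumes the total of all frequencies fits in an `int`.

```java
import java.util.Random;

/** Hypothetical helper: samples names in proportion to their frequencies. */
final class WeightedNameSampler {
    private final String[] names;
    private final int[] cumulative; // cumulative[i] = sum of frequencies[0..i]
    private final int total;

    WeightedNameSampler(String[] names, int[] frequencies) {
        if (names.length != frequencies.length) {
            throw new IllegalArgumentException("names and frequencies differ in length");
        }
        this.names = names.clone();
        this.cumulative = new int[frequencies.length];
        int sum = 0;
        for (int i = 0; i < frequencies.length; i++) {
            if (frequencies[i] < 0) {
                throw new IllegalArgumentException("negative frequency at index " + i);
            }
            sum = Math.addExact(sum, frequencies[i]); // throws on int overflow
            cumulative[i] = sum;
        }
        if (sum == 0) {
            throw new IllegalArgumentException("No non-zero frequencies to select.");
        }
        this.total = sum;
    }

    /** Returns the name whose frequency slot contains throwValue, 0 <= throwValue < total. */
    String nameAt(int throwValue) {
        if (throwValue < 0 || throwValue >= total) {
            throw new IllegalArgumentException("throwValue out of range: " + throwValue);
        }
        // Binary search for the first index whose cumulative sum exceeds throwValue.
        int lo = 0;
        int hi = cumulative.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cumulative[mid] > throwValue) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return names[lo];
    }

    /** Throws the roulette ball: a uniform integer in [0, total). */
    String sample(Random rng) {
        return nameAt(rng.nextInt(total));
    }
}
```

Construction is O(n) and each sample is O(log n). Entries with a frequency of zero are never returned, because the search looks for the first cumulative sum strictly greater than the thrown value.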
+5

Oxford University provides word lists on its publicly accessible FTP site as compressed .gz files at ftp://ftp.ox.ac.uk/pub/wordlists/names/.
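
Once one of these files has been downloaded, it could be loaded along the following lines. This is only a sketch: it assumes each list is plain text with one name per line and decodes it as ISO-8859-1 (the actual layout and encoding of the files are not stated here), and the class name `GzipNameList` is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

/** Hypothetical helper: reads a gzip-compressed text file of names. */
final class GzipNameList {
    /** Returns the non-blank lines of the file, trimmed, in file order. */
    static List<String> readNames(Path gzFile) throws IOException {
        List<String> names = new ArrayList<>();
        // Assumption: one name per line, ISO-8859-1 encoded.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)),
                StandardCharsets.ISO_8859_1))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String name = line.trim();
                if (!name.isEmpty()) {
                    names.add(name);
                }
            }
        }
        return names;
    }
}
```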

+4

You can also check out the jFairy project. It is written in Java and generates fake data (for example, names): http://codearte.imtqy.com/jfairy/

    Fairy fairy = Fairy.create();
    Person person = fairy.person();

    System.out.println(person.firstName()); // Chloe
    System.out.println(person.lastName());  // Barker
    System.out.println(person.fullName());  // Chloe Barker
+3
