NSDictionary, NSArray, NSSet and Efficiency

I have a text file containing about 200,000 lines. Each line represents an object with several properties. I only search on one of those properties (the unique identifier). If the identifier I'm looking for matches the identifier of the current object, I read the rest of that object's values.

Right now, every time I search for an object, I read the entire text file line by line, create an object for each line, and check whether it is the object I'm looking for, which is basically the most inefficient way to search. I would like to read all these objects into memory so I can work with them more quickly.

The question is, what is the most efficient way to perform such a search? Is a 200,000-element NSArray a good way to do this (I doubt it)? What about NSSet? With NSSet, is it possible to search on only one property of the objects?

Thanks for any help!

- Ry

+6
cocoa nsarray nsdictionary nsset
3 answers

@yngvedh is correct that NSDictionary has O(1) lookup time (as expected for a map structure). However, some testing shows that NSSet also has O(1) lookup time. Here is the basic test I used to reach that conclusion: http://pastie.org/933070

Basically, I create 1,000,000 strings, then time how long it takes to retrieve 100,000 random ones from both the dictionary and the set. When I run this several times, the set actually appears to be slightly faster in most runs:

 dict lookup: 0.174897
 set lookup: 0.166058
 ---------------------
 dict lookup: 0.171486
 set lookup: 0.165325
 ---------------------
 dict lookup: 0.170934
 set lookup: 0.164638
 ---------------------
 dict lookup: 0.172619
 set lookup: 0.172966

In your particular case, I'm not sure either of them will be what you want. You say you want all these objects in memory, but do you really need all of them, or just a few? If it's the latter, then I would probably read through the file once and build a mapping from object identifier to file offset (i.e., remember where each object identifier is in the file). Then you can look up the ones you want, use the file offset to jump to the right place in the file, parse that line, and move on. This is a job for NSFileHandle.
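The offset-mapping idea could be sketched roughly as follows. This is only an illustration under assumptions that are not in the question: each line is taken to look like "identifier,field1,field2,..." with "\n" line endings, the file name records.txt and the function names are made up, and the code assumes ARC. The one-time scan uses a memory-mapped NSData for simplicity; the later lookups go through NSFileHandle.

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Hypothetical helper: scans the file once and returns a dictionary mapping
// identifier -> NSValue wrapping an NSRange {byte offset, byte length} of that line.
// Assumes each line starts with the identifier, followed by a comma.
static NSDictionary *BuildOffsetIndex(NSString *path) {
    NSData *data = [NSData dataWithContentsOfFile:path
                                          options:NSDataReadingMappedIfSafe
                                            error:NULL];
    if (data == nil) return nil;

    const char *bytes = [data bytes];
    NSUInteger length = [data length];
    NSMutableDictionary *offsets = [NSMutableDictionary dictionary];

    NSUInteger lineStart = 0;
    while (lineStart < length) {
        // Find the end of the current line (or the end of the file).
        const char *newline = memchr(bytes + lineStart, '\n', length - lineStart);
        NSUInteger lineEnd = newline ? (NSUInteger)(newline - bytes) : length;

        // The identifier is everything before the first comma on the line.
        const char *comma = memchr(bytes + lineStart, ',', lineEnd - lineStart);
        NSUInteger idLength = comma ? (NSUInteger)(comma - (bytes + lineStart))
                                    : lineEnd - lineStart;
        if (idLength > 0) {
            NSString *identifier = [[NSString alloc] initWithBytes:bytes + lineStart
                                                            length:idLength
                                                          encoding:NSUTF8StringEncoding];
            if (identifier != nil) {
                NSRange range = NSMakeRange(lineStart, lineEnd - lineStart);
                [offsets setObject:[NSValue valueWithRange:range] forKey:identifier];
            }
        }
        lineStart = lineEnd + 1;
    }
    return offsets;
}

// Hypothetical helper: jumps to the stored offset and reads just that one line.
// seekToFileOffset: and readDataOfLength: are the classic NSFileHandle calls;
// newer SDKs also offer error-returning variants.
static NSString *ReadRecord(NSString *path, NSDictionary *offsets, NSString *identifier) {
    NSValue *location = [offsets objectForKey:identifier];
    if (location == nil) return nil;  // unknown identifier
    NSRange range = [location rangeValue];

    NSFileHandle *handle = [NSFileHandle fileHandleForReadingAtPath:path];
    if (handle == nil) return nil;
    [handle seekToFileOffset:range.location];
    NSData *lineData = [handle readDataOfLength:range.length];
    [handle closeFile];

    return [[NSString alloc] initWithData:lineData encoding:NSUTF8StringEncoding];
}

int main(void) {
    @autoreleasepool {
        NSString *path = @"records.txt";  // hypothetical file name
        NSDictionary *offsets = BuildOffsetIndex(path);
        NSString *line = ReadRecord(path, offsets, @"12345");  // hypothetical identifier
        NSLog(@"%@", line ? line : @"(not found)");
    }
    return 0;
}
```

With this layout only the identifiers and their byte ranges stay in memory; the full line is parsed only when a record is actually requested.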

+13

Use NSDictionary to map identifiers to objects. That is, use the identifier as the key and the object as the value. NSDictionary is the only collection class that supports efficient lookup by key (or lookup by key at all).

Dictionaries are a different kind of collection from the other collection classes. A dictionary is an associative collection (it maps identifiers to objects in your case), while the others are just containers for multiple objects. NSSet holds unordered, unique objects, and NSArray holds ordered objects (and may contain duplicates).

UPDATE:

To avoid reallocations while reading the records, use the dictionaryWithCapacity: method. If you know the (approximate) number of records before reading them, you can use it to preallocate a sufficiently large dictionary.
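A minimal sketch of that approach, assuming (this is not stated in the question) that each line looks like "identifier,field1,field2,..." and that the file is UTF-8. The file name records.txt and the function name LoadRecords are made up for illustration, and the parsed fields are stored as a plain NSArray rather than a real model object.

```objc
#import <Foundation/Foundation.h>

// Hypothetical loader: reads the whole file and returns a dictionary that maps
// each identifier (first comma-separated field) to the fields of its line.
static NSDictionary *LoadRecords(NSString *path) {
    NSError *error = nil;
    NSString *contents = [NSString stringWithContentsOfFile:path
                                                   encoding:NSUTF8StringEncoding
                                                      error:&error];
    if (contents == nil) {
        NSLog(@"Could not read %@: %@", path, error);
        return nil;
    }

    // Preallocate for roughly the number of records mentioned in the question.
    NSMutableDictionary *records = [NSMutableDictionary dictionaryWithCapacity:200000];

    for (NSString *line in [contents componentsSeparatedByString:@"\n"]) {
        if ([line length] == 0) continue;  // skip blank lines
        NSArray *fields = [line componentsSeparatedByString:@","];
        // Key: the identifier. Value: all fields of the record.
        [records setObject:fields forKey:[fields objectAtIndex:0]];
    }
    return records;
}

int main(void) {
    @autoreleasepool {
        NSDictionary *records = LoadRecords(@"records.txt");  // hypothetical file name
        NSArray *match = [records objectForKey:@"12345"];     // hypothetical identifier
        NSLog(@"%@", match ? match : @"(not found)");
    }
    return 0;
}
```

After the one-time load, each search is a single objectForKey: call instead of a pass over the file.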

+5

200,000 objects sounds like you might run into memory limits, depending on the size of the objects and your target environment. Another option to consider is converting the data to a SQLite database and indexing the columns you want to search on. This gives a good compromise between efficiency and resource consumption, since you don't need to load the complete set into memory.
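As a rough sketch of the SQLite route using the sqlite3 C API (link with -lsqlite3): the table name records, the columns identifier and payload, the index name, and the database file name are all assumptions made for illustration, and the one-time import of the text file is reduced to a single sample row.

```objc
#import <Foundation/Foundation.h>
#import <sqlite3.h>

// Hypothetical schema: one row per record, with an index on the identifier column.
static BOOL CreateSchema(sqlite3 *db) {
    const char *sql =
        "CREATE TABLE IF NOT EXISTS records (identifier TEXT, payload TEXT);"
        "CREATE UNIQUE INDEX IF NOT EXISTS idx_records_identifier"
        "  ON records (identifier);";
    return sqlite3_exec(db, sql, NULL, NULL, NULL) == SQLITE_OK;
}

// Looks up a single record by identifier; the index lets SQLite avoid a full table scan.
static NSString *LookupPayload(sqlite3 *db, NSString *identifier) {
    sqlite3_stmt *stmt = NULL;
    const char *sql = "SELECT payload FROM records WHERE identifier = ?1;";
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK) return nil;

    // Bind the identifier as a parameter instead of building the SQL string by hand.
    sqlite3_bind_text(stmt, 1, [identifier UTF8String], -1, SQLITE_TRANSIENT);

    NSString *payload = nil;
    if (sqlite3_step(stmt) == SQLITE_ROW) {
        const unsigned char *text = sqlite3_column_text(stmt, 0);
        if (text != NULL) payload = [NSString stringWithUTF8String:(const char *)text];
    }
    sqlite3_finalize(stmt);
    return payload;
}

int main(void) {
    @autoreleasepool {
        sqlite3 *db = NULL;
        if (sqlite3_open("records.sqlite", &db) != SQLITE_OK) {  // hypothetical file name
            NSLog(@"Could not open database: %s", sqlite3_errmsg(db));
            sqlite3_close(db);
            return 1;
        }
        if (CreateSchema(db)) {
            // Stand-in for the one-time import of the text file.
            sqlite3_exec(db,
                         "INSERT OR REPLACE INTO records VALUES ('12345', 'example payload');",
                         NULL, NULL, NULL);
            NSString *payload = LookupPayload(db, @"12345");
            NSLog(@"%@", payload ? payload : @"(not found)");
        }
        sqlite3_close(db);
    }
    return 0;
}
```

A real import would wrap the inserts in one transaction and reuse a prepared INSERT statement, since 200,000 individual autocommitted inserts would be slow.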

+4
