Hiding vars in strings vs. using objects with properties?

I have a word-analysis program in Excel, and I hope to be able to import more than 30 million words into it.

First I created a separate object for each of these words so that each word has ...

    .value   '(string), the actual word itself
    .bool1   '(boolean)
    .bool2   '(boolean)
    .bool3   '(boolean)
    .isUsed  '(boolean)
    .cancel  '(boolean)

When I realized that I might end up with 30 million of these objects (all stored in a single collection), I thought it could be a monster to process. So I decided that all my words would be plain strings, and that I would store them in an array.

My idea for the array is to prepend 5 spaces (one for each of my 5 bools) to each of the 30 million strings, with each space representing a false bool value. E.g.,

    'Mid$ reads the single character at position 3 (InStr would return the position found, 3, never 1)
    If Mid$(arr(n), 3, 1) = " " Then
        'my 3rd bool val is false.
    ElseIf Mid$(arr(n), 3, 1) = "*" Then '(I'll insert a '*' to denote true)
        'my 3rd bool val is true.
    End If

Anyway, what do you guys think? Which way (collection or array) should I go about this, especially with regard to optimization?

1 answer

(I wanted to make this a comment, but it got too long)

The answer will depend on how you want to access and process the words after they are saved.

Each of the 3 candidates has significant benefits and distinct advantages:

  • Arrays are very efficient at populating and retrieving all items at once (from a range to an array, and from an array back to a range), but much slower at resizing and at inserting items in the middle. Each ReDim allocates a new memory block, and if Preserve is used, all existing values are copied over as well. This can translate into perceived slowness for every operation (in a potential application).

    • More details (arrays vs. collections) here (VB-specific, but it applies to VBA)
  • Collections are linked lists with hash tables: they are quite slow to populate, but after that you get instant access to any element in the collection, and reordering (sorting) and resizing are just as fast. This can translate into slow file opening, but all other operations are instant. Other aspects:

    • Get keys, as well as items associated with these keys
    • Manual keys
    • Elements can be other collections, arrays, objects.
    • Although keys must be unique, they are also optional.
    • An item can be returned by its key or by its index value.
    • Keys are always strings and always case insensitive
    • Items are accessible and retrievable, but their keys are not
    • It is not possible to remove all items at once (remove them one at a time, or destroy and then recreate the collection)
    • Enumerating with For...Each...Next lists all the items

    • More here and here

  • Dictionaries: the same as collections, but with the added benefit of the .Exists() method, which in some scenarios makes them much faster than collections. Other aspects:

    • Keys are required and always unique to this Dictionary.
    • An item can only be returned by its key.
    • A key can be of any data type; for string keys, a Dictionary is case sensitive by default
    • Exists() method to test for the existence of a particular key (and item)

      • Collections do not have a similar test; instead, you should try to extract the value from the collection and handle the resulting error if the key is not found.
    • Items and keys are always available and retrievable by the developer.
    • The Item property is read/write, so it allows you to change the item associated with a particular key.
    • Allows you to remove all items in one step (RemoveAll) without destroying the dictionary itself
    • Enumerating a Dictionary with For...Each...Next lists the keys
    • The dictionary supports implicit item additions using the Item property.

      • In collections, items must be explicitly added.
    • More here
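
The key-test and item-handling differences listed above can be sketched roughly as follows. This is only an illustration under some assumptions: `CollectionHasKey` and `DictionaryDemo` are hypothetical names, the sample keys are made up, and the dictionary is created late-bound with `CreateObject("Scripting.Dictionary")`, which assumes a Windows machine where the Microsoft Scripting Runtime is available.

```vba
Option Explicit

' Hypothetical helper: a Collection has no Exists method, so the
' lookup is attempted and the resulting error (if any) is trapped.
Public Function CollectionHasKey(ByVal col As Collection, ByVal key As String) As Boolean
    Dim probe As Boolean
    On Error Resume Next
    probe = IsObject(col.Item(key))     ' works for object and non-object items
    CollectionHasKey = (Err.Number = 0) ' no error means the key was found
    On Error GoTo 0
End Function

' Hypothetical demo procedure with made-up sample data.
Public Sub DictionaryDemo()
    Dim dict As Object
    Dim col As Collection
    Dim k As Variant

    ' Late binding, so no project reference is needed (assumes Windows).
    Set dict = CreateObject("Scripting.Dictionary")
    dict.CompareMode = vbTextCompare    ' optional; must be set while the dictionary is empty

    dict("apple") = True                ' implicit add through the Item property
    dict("apple") = False               ' Item is read/write: replaces the stored value
    Debug.Print dict.Exists("APPLE")    ' key test without any error handling

    For Each k In dict                  ' For Each on a Dictionary walks the keys
        Debug.Print k, dict(k)
    Next k

    dict.RemoveAll                      ' empties the dictionary in one step

    Set col = New Collection
    col.Add True, "apple"               ' items must be added explicitly: item first, then key
    Debug.Print CollectionHasKey(col, "apple")
    Debug.Print CollectionHasKey(col, "pear")
End Sub
```

Run `DictionaryDemo` from the Immediate window to see the results. The trapped-error lookup is the usual workaround for the missing Exists test on collections, and it is one reason a Dictionary can be the more convenient choice when membership checks dominate.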


Other links: optimizing loops and optimizing strings (same site)
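
For the array option, the two behaviours mentioned in the answer (bulk transfer to and from a range, and the cost of ReDim Preserve) might look like the sketch below. It is a minimal example under assumptions: `ArrayDemo`, the sheet name `Sheet1`, and the range `A1:A1000` are placeholders, the cells are assumed to hold text with no error values, and doubling the array size is just one common way to keep the number of ReDim Preserve calls low.

```vba
Option Explicit

' Hypothetical demo; sheet name and range address are placeholders.
Public Sub ArrayDemo()
    Dim data As Variant
    Dim words() As String
    Dim used As Long
    Dim i As Long

    ' One read pulls the whole range into a 1-based, 2-D Variant array.
    data = ThisWorkbook.Worksheets("Sheet1").Range("A1:A1000").Value

    ' Work on the values in memory instead of touching cells one by one.
    For i = LBound(data, 1) To UBound(data, 1)
        data(i, 1) = Trim$(CStr(data(i, 1)))
    Next i

    ' One write pushes the whole array back to the range.
    ThisWorkbook.Worksheets("Sheet1").Range("A1:A1000").Value = data

    ' Growing an array: ReDim Preserve copies every existing value,
    ' so grow by doubling rather than by one element at a time.
    ReDim words(0 To 15)
    For i = LBound(data, 1) To UBound(data, 1)
        If used > UBound(words) Then
            ReDim Preserve words(0 To 2 * UBound(words) + 1)
        End If
        words(used) = data(i, 1)
        used = used + 1
    Next i

    ' Trim the unused tail once, at the end.
    ReDim Preserve words(0 To used - 1)
    Debug.Print used, UBound(words)
End Sub
```

If the final size is known up front, a single ReDim to that size avoids the copying altogether.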
