You can use sys.getsizeof(object) to get the size of the Python object. However, you should be careful when calling sys.getsizeof in containers: it only displays the size of the container, not the content - see this recipe for an explanation of how to get the total size of the container, including the contents. In this case, we do not need to go so deep: we can simply manually add the size of the container and the size of its contents.
The sizes of the considered types:
# room type size >>> sys.getsizeof(RoomType("A",["Bottom"])) + sys.getsizeof("A") + sys.getsizeof(["Bottom"]) + sys.getsizeof("Bottom") 233 # position size >>> sys.getsizeof((0,0)) + 2*sys.getsizeof(0) 120 # One character size >>> sys.getsizeof("A") 38
Look at the different options if you have N numbers:
- Dictionary from
position -> room_type . This includes storing N*(size(position) + size(room_type)) = 353 N bytes in memory. - Dictionary from
position -> 1-character string . This includes storing N*158 bytes in memory. - Dictionary from
type -> set of positions . This includes storing N*120 bytes plus tiny overhead when storing dictionary keys.
In terms of memory usage, the third option is clearly better. However, as often happens, you have a trade-off between processor memory. Itโs worthwhile to dwell briefly on the computational complexity of the queries you are likely to make. To find the type of room according to its position, with each of the three options above you should:
- Look at the position in the dictionary. This is an O (1) search, so you will always have the same time (approximately), regardless of the number of rooms (for a large number of rooms).
- Same
- Look at each type and for each type, ask if this item is in the set of items for this type. This is a search for
O(ntypes) , that is, the time it takes is proportional to the number of types you have. Please note: if you sent a list instead of a set to store numbers of a certain type, it would increase to O(nrooms * ntypes) , which would kill your performance.
As always, when optimizing, it is important to consider the effect of optimization on both memory usage and processor time. The two often diverge.
Alternatively, you might consider storing types in a two-dimensional numpy character array if your map is fairly rectangular. I believe that it will be much more effective. Each character in the numpy array is a single byte, so the memory usage will be much less, and the CPU time will still be O (1) searched from the room position to print:
# Generate random 20 x 10 rectangular map >>> map = np.repeat('a', 100).reshape(20, 10) >>> map.nbytes 200 # ie. 1 byte per character.
Some additional small-scale optimizations:
Encode the room type as an int, not a string. Ints are 24 bytes in size, and single-character strings are 38 in size.
Encode a position as a unit, not a tuple. For instance:
# Random position xpos = 5 ypos = 92 # Encode the position as a single int, using high-order bits for x and low-order bits for y pos = 5*1000 + ypos # Recover the x and y values of the position. xpos = pos / 1000 ypos = pos % 1000
Note that this kills readability, so itโs only worth doing if you want to compress the last bit of performance. In practice, you can use power 2 rather than power 10 as a separator (but power 10 helps with debugging and readability). Note that this results in a number of bytes at each position from 120 to 24. If you go along this route, consider defining the Position class with __slots__ to tell Python how to allocate memory and add the xpos and ypos properties to the class. You do not want to clog your code with pos / 1000 and pos % 1000 .