Are databases more expensive than accessing a collection in Java?

I just implemented a design in which I cached some data in a HashMap and read it from the map, instead of requesting the same data from the database each time.

Am I thinking right?

+4
7 answers

You can answer this yourself if you think about what happens when you talk to the database:

  • Your program must send the query to the database. Depending on whether the database server runs in-process or somewhere else on the network, this can take anywhere from a few microseconds to several milliseconds.
  • The database server must parse your query and formulate an execution plan. Depending on the server, it may cache execution plans for frequently executed queries; if not, allow a few more microseconds to create the plan.
  • The database server must execute the plan, reading whatever disk blocks are needed to reach the data. Each disk access takes tens of milliseconds; depending on how large the table is and how well it is indexed, your query may take a few seconds.
  • The database server must package the data and send it back to the application. Again, depending on whether it is in-process or over the network, this takes microseconds to milliseconds, and also depends on how much data is sent back.
  • Your application must convert the retrieved data into a usable form. This probably takes a microsecond or less.

In comparison, a lookup in a hashed data structure requires a few memory accesses, each of which takes a few nanoseconds. The difference is several orders of magnitude.
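To make the in-memory side of this comparison concrete, here is a minimal sketch (the class and method names are mine, not from the question) that builds a HashMap cache and times a batch of lookups; each get() is just a few memory accesses:

```java
import java.util.HashMap;
import java.util.Map;

public class LookupSketch {
    // Build an illustrative cache of n entries keyed by integer id.
    static Map<Integer, String> buildCache(int n) {
        Map<Integer, String> cache = new HashMap<>(n * 2);
        for (int i = 0; i < n; i++) {
            cache.put(i, "value-" + i);
        }
        return cache;
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = buildCache(100_000);
        long start = System.nanoTime();
        int hits = 0;
        for (int i = 0; i < 100_000; i++) {
            if (cache.get(i) != null) hits++; // a few memory accesses per lookup
        }
        long elapsedUs = (System.nanoTime() - start) / 1_000;
        System.out.println(hits + " lookups in ~" + elapsedUs + " us");
    }
}
```

Even without careful benchmarking, the whole batch of 100 000 lookups typically finishes in a few milliseconds, i.e. less than a single remote database round trip.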

+3

Saving a copy of the data in memory will almost certainly be faster than fetching from the database.

However, you should consider the following:

  • Can the data in the database change while you keep a copy in memory? If so, how are you going to handle that?
  • Will memory consumption be a problem?
  • Are you sure you are optimizing a real bottleneck, not an imaginary one?
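One common way to handle the first concern, stale data, is to give each cached entry a time-to-live so the application periodically re-fetches from the database. A minimal sketch (class and parameter names such as TtlCache and ttlMillis are illustrative assumptions, not from the question):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal TTL cache sketch: entries expire after ttlMillis, after which
// get() returns null and the caller reloads the value from the database.
public class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    public void put(K key, V value) {
        map.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
    }

    // Returns null when the entry is missing or expired.
    public V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null || System.currentTimeMillis() > e.expiresAt) {
            map.remove(key);
            return null;
        }
        return e.value;
    }
}
```

A TTL trades freshness for simplicity: the cache can be up to ttlMillis out of date, which is acceptable for some data and not for others.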
+7

Reading from a collection will be several orders of magnitude faster than hitting the database, especially a database on another server (because of communication latency).

That said:

  • Databases can cache data themselves, so this optimization may be unnecessary.
  • If the data set is very large, you will have to deal with the memory consumption.
  • You need to keep the cached data up to date, for example by invalidating cache entries when the underlying data changes.
+4

The primary concern is the size of your cache: beyond a certain threshold you do more harm than good. For example, if the cache holds a million records and each record is 1 KB (not hard to reach, given per-object overhead), you have used a full gigabyte of heap, and major GC performance will suffer as well.
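A standard way to bound the cache in Java is to subclass LinkedHashMap and override its removeEldestEntry hook, which gives a simple LRU cache with a hard size limit. A sketch (the class name BoundedCache is mine):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Bounded LRU cache sketch: LinkedHashMap in access order evicts the
// least-recently-used entry whenever the size exceeds maxEntries,
// so the heap footprint stays bounded.
public class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true -> LRU iteration order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

For example, new BoundedCache<Long, String>(10_000) will never hold more than 10 000 entries, silently dropping the least-recently-used one on each insert beyond the limit. Note that this class is not thread-safe; wrap it with Collections.synchronizedMap or use a caching library if multiple threads share it.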

+2

Memory always wins: a database hit is more expensive than anything you do at the code level.

0

Look at it this way: to query the database, the bytes must still be copied into memory anyway. So accessing memory directly will always be faster than hitting the database.

0

It should be much faster, provided the cost of computing the hash code is low. It also depends on the number of entries, since more entries mean more collisions.

0