Creating a very, very large map in Java

Using Java, I would like to create a Map that can grow and grow and potentially become larger than the available memory. Obviously, with the standard in-memory HashMap we would run out of memory and the JVM would crash. So I was thinking along the lines of a Map that, when it notices memory is running low, writes its current contents to disk.

Has anyone implemented something similar, or does anyone know of an existing solution?

What I'm trying to do is read a very large ASCII file (say 50 GB) one line at a time. Each line contains a key and a value. Keys can be duplicated in the file. I then store each line in the Map, which maps each key to a list of values. This Map is the object that will just grow and grow.

Any advice would be greatly appreciated.

Phil

Update:

Thanks to everyone for all the comments and advice. For the problem as I described it, a database is the right, scalable solution. I should have stated that this is a temporary Map that needs to be created and used for a short period of time to help analyze the file. In this case, Michael's suggestion, "keep only line number instead of actual value", is the most appropriate. Marking Michael's answer(s) as the recommended solution.
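A minimal sketch of the "keep only line number" idea, for illustration only. It assumes each line holds a key and a value separated by the first tab character (the post does not specify the format); the class name `LineIndex` and the method `build` are hypothetical names, not part of any library. The map stores, per key, the line numbers on which that key occurs, so the values themselves never stay in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LineIndex {

    // Builds key -> list of 1-based line numbers where the key appears.
    // Assumption: key and value are separated by the first tab on each line.
    static Map<String, List<Long>> build(Reader source) throws IOException {
        Map<String, List<Long>> index = new HashMap<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            long lineNo = 0;
            while ((line = in.readLine()) != null) {
                lineNo++;
                int sep = line.indexOf('\t');
                if (sep < 0) {
                    continue; // no separator: skip the line
                }
                String key = line.substring(0, sep);
                // Only the line number is kept; the value is discarded.
                index.computeIfAbsent(key, k -> new ArrayList<>()).add(lineNo);
            }
        }
        return index;
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory sample standing in for the large file.
        String sample = "a\t1\nb\t2\na\t3\n";
        Map<String, List<Long>> index = build(new StringReader(sample));
        System.out.println("lines for key a: " + index.get("a"));
    }
}
```

For a real file, a reader such as `Files.newBufferedReader(path)` could be passed to `build` instead of the `StringReader`. Boxed `Long` lists still carry noticeable overhead per entry, so with very many keys a primitive `long[]` per key may be the tighter choice.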

+5

, .

+12

NoSQL, , , . BerkeleyDB Java, Oracle. ​​, , ,

+3

.

, . TXT, . , , (, JVM ). , .

: .

+2

, . ; JPA -, JDBC SQL. , Derby HSQL , , , .

" " , -, , , OutOfMemoryException 50 , 75... , .

+2

( ), MapReduce , , .

: , , MapReduce , .

0

? , , , . , , 1000 . 16-24 , , .

, , . String , ASCII "String" ( ). , .

0

BerkleyDB , , ( Map, , )

http://www.oracle.com/technetwork/database/berkeleydb/overview/index.html

Maven: http://www.oracle.com/technetwork/database/berkeleydb/downloads/maven-087630.html

  <dependencies>
    <dependency>
      <groupId>com.sleepycat</groupId>
      <artifactId>je</artifactId>
      <version>3.3.75</version>
    </dependency>
  </dependencies>

  <repositories>
    <repository>
      <id>oracleReleases</id>
      <name>Oracle Released Java Packages</name>
      <url>http://download.oracle.com/maven</url>
      <layout>default</layout>
    </repository>
  </repositories>

It also has the drawback of vendor lock-in (i.e. you are forced to use this tool, although there may be other Map wrappers over other databases).

Therefore, just choose according to your needs.

0

Most cache APIs work like maps and support overflow to disk. Ehcache, for example, supports this. Or follow this tutorial for Guava.
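To illustrate what "overflow to disk" means in principle, here is a rough sketch using only the JDK. It is not Ehcache's or Guava's API; the class `SpillingMap` and all of its methods are hypothetical. It keeps at most `maxInMemory` entries in an access-ordered `LinkedHashMap` and writes the least recently used entry to its own file when that limit is exceeded. One file per key is an assumption made for brevity and would be far too slow for millions of keys; real caches use a proper on-disk store.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

class SpillingMap {
    private final Path dir;
    private final LinkedHashMap<String, String> hot;

    SpillingMap(Path dir, int maxInMemory) {
        this.dir = dir;
        // accessOrder = true: the eldest entry is the least recently used one.
        this.hot = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                if (size() <= maxInMemory) {
                    return false;
                }
                spill(eldest.getKey(), eldest.getValue());
                return true; // drop it from memory after writing it to disk
            }
        };
    }

    void put(String key, String value) {
        hot.put(key, value);
    }

    // Looks in memory first, then on disk; a disk hit is moved back into memory.
    String get(String key) {
        String value = hot.get(key);
        if (value != null) {
            return value;
        }
        Path file = fileFor(key);
        if (!Files.exists(file)) {
            return null;
        }
        try {
            value = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            Files.delete(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        hot.put(key, value); // may in turn spill another entry
        return value;
    }

    int inMemorySize() {
        return hot.size();
    }

    private void spill(String key, String value) {
        try {
            Files.write(fileFor(key), value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hex-encodes the key so it is a safe file name (assumes short keys).
    private Path fileFor(String key) {
        StringBuilder name = new StringBuilder("k");
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            name.append(String.format("%02x", b));
        }
        return dir.resolve(name.toString());
    }

    public static void main(String[] args) throws IOException {
        SpillingMap map = new SpillingMap(Files.createTempDirectory("spill"), 2);
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3"); // exceeds the limit, so "a" is written to disk
        System.out.println("a = " + map.get("a") + ", in memory: " + map.inMemorySize());
    }
}
```

The limit here is a fixed entry count, not a measurement of free heap; reacting to actual memory pressure is harder to get right and is one reason to prefer an existing cache or an embedded database.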

0