If this is a large file, it was most likely written to disk in contiguous blocks, so "streaming" it sequentially will be faster than reading in parallel, because parallel reads would force the disk heads to seek back and forth. To find out which is actually fastest, you need intimate knowledge of your target production environment: on high-end storage the data is likely striped across multiple disks, and there parallel reads can win.
The best approach, I think, is to read the file into memory in large chunks and expose each chunk as a ByteArrayInputStream for parsing.
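A minimal sketch of that idea, assuming a hypothetical `parseAll` helper and an 8 MiB chunk size (both are my own choices, not anything from the question):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class ChunkedRead {
    // Assumed chunk size; tune it for your storage and heap.
    static final int CHUNK = 8 * 1024 * 1024;

    static long parseAll(Path file) throws IOException {
        long total = 0;
        try (InputStream raw = Files.newInputStream(file)) {
            byte[] buffer = new byte[CHUNK];
            int read;
            while ((read = raw.read(buffer)) != -1) {
                // One large sequential read per chunk; the parser then
                // works entirely in memory against the ByteArrayInputStream.
                ByteArrayInputStream chunk =
                        new ByteArrayInputStream(buffer, 0, read);
                total += chunk.available(); // stand-in for real parsing
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Small sample file so the sketch is self-contained.
        Path file = Files.createTempFile("sample", ".bin");
        Files.write(file, new byte[123_456]);
        System.out.println("Parsed " + parseAll(file) + " bytes");
        Files.delete(file);
    }
}
```

The point of the buffer is that disk I/O stays strictly sequential while parsing happens at memory speed.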
Most likely you will be CPU-bound during parsing and data processing. A parallel map/reduce over the parsed records could help here to distribute the load across all cores.
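As a sketch of the map/reduce step, here is a parallel stream over some hypothetical parsed records (the record list and the sum are placeholders for whatever your real processing does):

```java
import java.util.List;

public class ParallelReduce {
    public static void main(String[] args) {
        // Stand-in "records" representing parsed chunks of the file.
        List<String> records = List.of("10", "20", "30", "40");

        // Map each record to a value, then reduce in parallel;
        // the common fork/join pool spreads the work across all cores.
        long sum = records.parallelStream()
                          .mapToLong(Long::parseLong)
                          .sum();
        System.out.println(sum); // prints 100
    }
}
```

For small inputs like this the parallelism overhead dominates; it only pays off once the per-record work or the record count is large.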