Is it a good idea to collect a lot of text in a cache and save it to a file periodically?

I am trying to write messages from users on a messaging network to a file. I want to build this program with good Java practices and a suitable file I/O technique.

Currently, my program detects that someone has sent a message, receives the message, and immediately writes it to a file: it creates a file object, creates a writer object, appends the message, and then closes the file. This seems like good practice if few messages arrive, but in a fast-moving conversation it seems slow and does a lot of unnecessary work, because the file is reopened almost immediately.

Then I thought about just leaving the file open, writing messages to it as they arrive, and closing it periodically. Is this good practice? Keeping a file open for a long time, for example for an hour or until a certain amount of data has been written?

Now I think that I should accept messages, store them in the cache (for example, a string array), and then save the array of strings to a file when the "cache" is full. Is this the best practice?

I have two questions:

1) Is it good to leave the file open for a long period of time (from several minutes to several hours) if you are not using the file?

2) What is good practice for the "cache" that I'm talking about? Is a string array good? Is there anything better I should use? How would you store this information?

+6
3 answers

In my opinion, the best practice for logs (and similar output) in server applications is to decide on an acceptable time delay and stick to it. For example, if you set the delay to 5 seconds, write the code so that:

  • If you write something to the log, it is "really" written to disk within 5 seconds.
  • If something else is logged before the 5 seconds are up, it is simply added to the buffer (and written when the time runs out).

That way, you write to disk at most once every 5 seconds, but the data is definitely written. Compare this with other approaches:

  • If you flush the data to disk every time something is written, and the load rises to, say, 10,000 events per second, you will waste I/O time on 10,000 disk writes per second.
  • If you leave it to Java or the OS to decide when to flush the data, and the load is very low (for example, in the middle of the night), the log may be badly out of date. (This happens if there is one event that is not large enough to fill the buffer, and then nothing for several hours.)

I have not looked at the API recently to see if there is a built-in way to implement this strategy, but it is easy to code. By the way, there is no need to cache the output manually; you can just use a BufferedOutputStream and call flush() on it when you want the data written to disk. (The data will also be written automatically when the buffer fills up, but that is probably OK if you choose the buffer size wisely.)
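The delayed-flush strategy described above could be sketched roughly as follows. This is only an illustration under some assumptions: the class name DelayedFlushLog and its methods are made up for this example, and it uses a BufferedWriter rather than a BufferedOutputStream because the messages are text. Note also that flush() hands the data to the operating system; forcing it onto the physical disk would need something extra, such as FileChannel.force, which this sketch does not do.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical log that flushes at most once per delay period. */
class DelayedFlushLog implements AutoCloseable {
    private final BufferedWriter out;
    private final long delayMillis;
    // Daemon thread, so a forgotten close() does not keep the JVM alive.
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "log-flusher");
                t.setDaemon(true);
                return t;
            });
    private boolean flushScheduled = false; // guarded by "this"
    private boolean closed = false;         // guarded by "this"

    DelayedFlushLog(Path file, long delayMillis) throws IOException {
        this.out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        this.delayMillis = delayMillis;
    }

    /** Buffers the message; schedules a flush only if none is pending. */
    synchronized void write(String message) throws IOException {
        out.write(message);
        out.newLine();
        if (!flushScheduled) {
            flushScheduled = true;
            timer.schedule(this::flushNow, delayMillis, TimeUnit.MILLISECONDS);
        }
    }

    /** Runs on the timer thread once the delay has elapsed. */
    private synchronized void flushNow() {
        flushScheduled = false;
        if (closed) {
            return;
        }
        try {
            out.flush();
        } catch (IOException e) {
            // A real program would report this somewhere useful.
            e.printStackTrace();
        }
    }

    /** Closing the writer also flushes whatever is still buffered. */
    @Override
    public synchronized void close() throws IOException {
        closed = true;
        timer.shutdownNow();
        out.close();
    }
}
```

With a delay of 5000 ms, the first message after a quiet period starts the timer, and every message that arrives in the next 5 seconds rides along with the same flush.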

As for keeping the file open, you can leave a file open as long as you like (just close it when you are not going to write to it anymore). As long as you are not opening thousands of files and you do not need several applications writing to the same file, this does not cause any problems.

+3

It is absolutely fine to leave the file open for a long time. This is certainly much better than repeatedly reopening and closing it. The amount of resources consumed by one open file is negligible; your only concern would be if you had many open files (hundreds or thousands). I suggest you open the file when the program starts and close it when it finishes.

If you use the right tools to inspect the files held open by your program, or by other programs on your system, you will find that nearly all of them keep some number of files (from a few to a few dozen) open for their entire lifetime: any files containing program code (executables, shared libraries, and JAR files for Java programs), since these are opened and then mapped into memory, and often log files as well. This is normal and safe.

You do need to flush the stream (or writer, or RandomAccessFile, or whatever you use) during this time. Do so whenever you need to be sure that all data written up to that point has been safely written to disk; that can be after each message, or after a certain number of messages, amount of data, or period of time, as you see fit.
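As a rough illustration of this advice, the sketch below keeps one writer open for the life of the program and flushes after a fixed number of messages. The class name MessageLog and the flushEvery parameter are invented for this example; a count-based policy is just one of the options mentioned above, and flush() here only guarantees the data has been handed to the operating system.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical long-lived log: opened once, flushed every N messages. */
class MessageLog implements AutoCloseable {
    private final BufferedWriter out;
    private final int flushEvery;
    private int unflushed = 0;

    MessageLog(Path file, int flushEvery) throws IOException {
        // Opened once, typically at program start, in append mode.
        this.out = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        this.flushEvery = flushEvery;
    }

    /** Appends one message as a line; flushes once enough have piled up. */
    void append(String message) throws IOException {
        out.write(message);
        out.newLine();
        unflushed++;
        if (unflushed >= flushEvery) {
            out.flush();
            unflushed = 0;
        }
    }

    /** Called once at shutdown; close() flushes any remaining data. */
    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

Setting flushEvery to 1 gives the "flush after each message" behaviour; larger values trade a little durability for fewer writes.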

+3

1) Is it good to leave the file open for a long period of time (from several minutes to several hours) if you are not using the file?

I think it depends on how many messages come into your program and the size of each message. If your memory can accommodate that volume, you can consider it. But I would think about writing to a database as each message arrives (maybe as a BLOB). Also think about what happens if your program crashes while writing to the file: you may lose all the messages held in memory.

2) What is good practice for the "cache" that I'm talking about? Is a string array good? Is there anything better I should use? How would you store this information?

If you store the data temporarily in an in-memory array, that is fine when you know the size in advance. Otherwise, you can use an ArrayList.
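A minimal sketch of the ArrayList variant might look like this. The class name MessageCache, the capacity parameter, and the method names are assumptions made for illustration. Messages are held in a list and appended to the file in one call once the list reaches the chosen capacity.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory cache that is saved to a file when full. */
class MessageCache {
    private final Path file;
    private final int capacity;
    private final List<String> pending = new ArrayList<>();

    MessageCache(Path file, int capacity) {
        this.file = file;
        this.capacity = capacity;
    }

    /** Stores the message; saves everything once the cache is full. */
    void add(String message) throws IOException {
        pending.add(message);
        if (pending.size() >= capacity) {
            save();
        }
    }

    /** Appends all pending messages to the file, one per line. */
    void save() throws IOException {
        if (pending.isEmpty()) {
            return;
        }
        Files.write(file, pending, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        pending.clear();
    }

    /** Number of messages not yet written to the file. */
    int pendingCount() {
        return pending.size();
    }
}
```

As noted above, anything still in the list is lost if the program crashes, so save() should also be called at shutdown, and the capacity should be small enough that losing one batch is tolerable.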

+1
