Does Python's open() keep a file's contents in memory or in a temporary file?

For the following Python code:

 # Very big file; writes a lot of lines, and n is a very large number
 fp = open('output.txt', 'wb')
 for i in range(1, n):
     fp.write('something' * n)
 fp.close()

The write can take more than 30 minutes, and sometimes I get a MemoryError. Before the file is closed, is its content held in memory or written to a temporary file? If it goes to a temporary file, where does that file usually live on Linux?

Edit:

Added fp.write to the for loop

+7
python
6 answers

It is stored in memory, in the operating system's disk cache, until it is flushed to disk, either implicitly because of timing or space pressure, or explicitly via fp.flush().
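A minimal sketch of forcing the data out explicitly (the filename is illustrative; os.fsync goes one step further than fp.flush() and asks the kernel to commit its own cache to disk):

```python
import os

with open('output.txt', 'wb') as fp:
    for i in range(1000):
        fp.write(b'something\n')
    fp.flush()             # push Python's userspace buffer into the kernel
    os.fsync(fp.fileno())  # ask the kernel to commit its cache to disk
```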

+5

The writes go into a buffer in the Linux kernel, which is flushed to disk at (ir)regular intervals. Running out of buffer space should never cause a memory error at the application level; the kernel flushes the buffers before that happens, pausing the application if necessary.

+3

Based on ataylor's comment on the question:

You might want to nest your loop. Something like

 fp = open('output.txt', 'wb')
 for i in range(1, n):
     for j in range(n):
         fp.write('something')
 fp.close()

That way, the only thing held in memory is the string 'something', not 'something' * n.
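To see the difference, compare the one-off temporary string that 'something' * n creates with the constant-size string the nested loop reuses (the value of n here is illustrative):

```python
import sys

n = 100_000
small = 'something'
big = small * n  # one temporary string of n * 9 = 900,000 characters

print(len(big))              # 900000 characters held in memory at once
print(sys.getsizeof(small))  # a few dozen bytes, regardless of n
```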

+2

If you are writing a large file and the write may fail, you are better off flushing the file to disk yourself at regular intervals using fp.flush(). That way the file lives in a location of your choosing, where you can easily get at it, rather than being at the mercy of the OS:

 fp = open('output.txt', 'wb')
 counter = 0
 for line in many_lines:
     fp.write(line)
     counter += 1
     if counter > 999:
         fp.flush()
         counter = 0
 fp.close()

This flushes the file to disk every 1000 lines.

+1

If you write line by line, this should not be a problem. You should show the code that runs before the write. For a start, you can try deleting objects where they are no longer needed, calling fp.flush(), and so on.
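A sketch of the line-by-line pattern this answer suggests (write_lines and the generator are illustrative names, not from the question):

```python
def write_lines(path, lines):
    """Write lines one at a time so only the current line is in memory."""
    with open(path, 'w') as fp:
        for line in lines:
            fp.write(line + '\n')

# Feeding it a generator keeps the full data set out of memory as well.
write_lines('output.txt', ('row %d' % i for i in range(10000)))
```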

0

A file write should never raise a memory error; you probably have a bug elsewhere.

If you have a loop and a memory error, I would check whether you are "leaking" references to objects.
Something like:

 def do_something(a, b=[]):
     b.append(a)  # the default list is shared across calls, so it keeps growing
     return b

 fp = open('output.txt', 'wb')
 for i in range(1, n):
     something = do_something(i)
     fp.write(something)
 fp.close()

This is just a contrived example; in your actual code a reference leak can be much harder to find. Here, though, the memory leak is inside do_something, because of how Python handles default parameter values.

0
source share
