How to write (large) files using Ruby Eventmachine

I spent several days searching for EventMachine examples other than an echo server, but there don't seem to be any. Let's say I want to write a server that accepts a file and writes it to a Tempfile:

    require 'rubygems'
    require 'tempfile'
    require 'eventmachine'

    module ExampleServer
      def receive_data(data)
        f = Tempfile.new('random')
        f.write(data)
      ensure
        f.close
      end
    end

    EventMachine::run {
      EventMachine::start_server "127.0.0.1", 8081, ExampleServer
      puts 'running example server on 8081'
    }

Writing to a file blocks the reactor, but I don't understand how to do this "EventMachine style". Should I read the data in chunks and write each chunk to disk in an EM.next_tick block?

Thanks for any help,
Andreas

+6
ruby eventmachine
4 answers

Two answers:

Lazy answer: just use a blocking write. EM is already handing you discrete chunks of data, not one giant string, so your example implementation may be slightly off. Are you sure you want to create a new temporary file for every single chunk that EM hands you? However, I will continue on the assumption that your sample code works as intended.

Admittedly, the lazy approach depends on the device you are writing to, but writing several large streams to disk simultaneously is going to be a major bottleneck, and you will lose the benefits of an event-based server anyway. You will just end up juggling disk seeks all over the place, I/O performance will drop dramatically, and so will your server's performance. Handling many things at once is fine in RAM, but as soon as you start dealing with block devices and I/O scheduling, you will run into performance bottlenecks no matter what you do.
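If the intent is one file per upload rather than one file per chunk, the lazy blocking approach could be sketched roughly as follows. This is only an illustration: the module name `SingleFileServer` and the `file` reader are hypothetical. It relies on EM's `post_init`, `receive_data` and `unbind` connection callbacks, and assumes a connection corresponds to exactly one upload.

```ruby
require 'tempfile'

# Hypothetical handler module (the name is made up for this sketch).
# It opens one Tempfile per connection and appends every chunk to it.
module SingleFileServer
  attr_reader :file

  # Called by EM once, when the connection is set up.
  def post_init
    @file = Tempfile.new('upload')
    @file.binmode
  end

  # Called by EM for every chunk; this is a plain blocking write.
  def receive_data(data)
    @file.write(data)
  end

  # Called by EM when the connection closes.
  # close without arguments flushes the data and leaves the file on disk.
  def unbind
    @file.close
  end
end
```

The module is plain Ruby, so it can be passed to `EventMachine::start_server` in place of `ExampleServer` from the question.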

However, I guess you might want to do some long writes to disk while still giving low-latency responses to other requests that are not I/O-heavy. So, perhaps the good answer:

Use defer.

    require 'rubygems'
    require 'tempfile'
    require 'eventmachine'

    module ExampleServer
      def receive_data(data)
        operation = proc do
          begin
            f = Tempfile.new('random')
            f.write(data)
          ensure
            f.close
          end
        end

        callback = proc do
          puts "I wrote a file!"
        end

        EM.defer(operation, callback)
      end
    end

    EventMachine::run {
      EventMachine::start_server "127.0.0.1", 8081, ExampleServer
      puts 'running example server on 8081'
    }

Yes, it uses threads. In this case that is not so bad: you do not need to worry about synchronization between threads, because EM is nice enough to handle it for you. If you need a response, use the callback, which is executed on the main reactor thread when the worker thread completes. In addition, the GIL is something of a non-issue here, since you are dealing with blocking I/O and not trying to achieve CPU concurrency.

But if you intended to write everything to the same file, you will have to be careful with defer, because a synchronization problem arises: your threads will probably try to write to the same file at the same time.
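One way around the same-file problem is to funnel every chunk through a single writer thread, so that writes stay in arrival order. The sketch below uses a plain Ruby `Queue` and one dedicated thread instead of EM.defer's thread pool. The class `OrderedFileWriter` is hypothetical and not part of EventMachine.

```ruby
require 'tempfile'

# Hypothetical helper: serializes writes to one IO through a queue.
class OrderedFileWriter
  def initialize(io)
    @io = io
    @queue = Queue.new
    # A single consumer thread, so chunks are written in the order queued.
    @thread = Thread.new do
      while (chunk = @queue.pop) # a nil chunk signals shutdown
        @io.write(chunk)
      end
    end
  end

  # Returns immediately; the blocking write happens on the writer thread.
  def write(chunk)
    @queue << chunk
  end

  # Drains the queue, stops the thread, then closes the IO.
  def close
    @queue << nil
    @thread.join
    @io.close
  end
end
```

A connection handler could call `write` from `receive_data` and `close` from `unbind`. Note that the queue is unbounded, so a slow disk means the pending chunks pile up in memory.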

+3

From the docs, it seems you just need to attach the file and use send_data (although, as you point out, this may not be valid, and it seems you would need to use File.write, i.e. a blocking call...).

Although I thought you cannot mix blocking and non-blocking I/O with EM :(

Given that the source data comes from a socket, I think this will be handled by EventMachine.

Maybe this is a question for the Google group...

~ Chris

+1

Unfortunately, files do not respond very well to select-style interfaces. If you need something more efficient than IO#write (which is unlikely), you can use EIO.

EIO really only unblocks the reactor slightly and gives you a little buffering. If latency in particular is a problem, or you have really slow disks, it can be helpful. In most other cases, it is probably a lot of effort for a small benefit.

+1

This is very similar to "What is the best way to read files in an EventMachine based application?" (but there I wanted to know how to read files efficiently). It seems that there is no non-blocking file API, so the best options are to write in short bursts using next_tick, or to defer the write (using defer) so that it runs in a separate thread (but I don't know how well this works).
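The "short bursts" idea could look something like the sketch below. The helper `write_in_slices` is hypothetical and not part of EventMachine. It takes the scheduler as an argument, on the assumption that `EM.method(:next_tick)` would be passed in under EM, since next_tick accepts a block. Each slice is still a blocking write; the reactor merely gets a turn between slices.

```ruby
require 'stringio'

# Hypothetical helper: writes data to io in slices of slice_size bytes.
# The first slice is written immediately; each following slice is handed
# to `schedule`, which must accept a block and run it later.
def write_in_slices(io, data, slice_size, schedule, &done)
  step = lambda do |offset|
    if offset >= data.bytesize
      done.call if done # everything written
    else
      io.write(data.byteslice(offset, slice_size))
      schedule.call { step.call(offset + slice_size) }
    end
  end
  step.call(0)
end

# Stand-in scheduler for trying this outside a reactor: it collects the
# blocks, which are then run one at a time.
pending = []
schedule = lambda { |&blk| pending << blk }

out = StringIO.new
write_in_slices(out, "abcdefghij", 4, schedule) { puts "done" }
pending.shift.call until pending.empty?
```

Inside a running reactor, the stand-in scheduler would be replaced by `EM.method(:next_tick)` and the manual loop dropped.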

0