Reading a file in EventMachine asynchronously

I have been playing with Ruby's EventMachine for some time, and I think I understand its basics.

However, I am not sure how to read in a large file (120 MB). My goal is to read the file line by line and write each line to a Cassandra database (the same should apply to MySQL, PostgreSQL, MongoDB, etc., since the Cassandra client supports EM explicitly). A simple snippet like this one blocks the reactor, right?

require 'rubygems'
require 'cassandra'
require 'thrift_client/event_machine'

EM.run do
  Fiber.new do
    rm = Cassandra.new('RankMetrics', "127.0.0.1:9160", :transport => Thrift::EventMachineTransport, :transport_wrapper => nil)
    rm.clear_keyspace!
    begin
      file = File.new("us_100000.txt", "r")
      while (line = file.gets)
        rm.insert(:Domains, "#{line.downcase}", {'domain' => "#{line}"})
      end
      file.close
    rescue => err
      puts "Exception: #{err}"
      err
    end
    EM.stop
  end.resume
end

But what is the correct way to read the file asynchronously?

2 answers

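EventMachine has no asynchronous file IO, so the best you can do is read a few lines on each tick, send them to the database, and schedule the next chunk once those writes are done. Keep the chunks small, since a large read blocks the reactor.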

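EM.run do
  io = File.open('path/to/file')
  read_chunk = proc do
    # Read up to 10 lines; a short chunk means end of file was reached.
    lines = []
    10.times do
      line = io.gets
      break unless line
      lines << line
    end
    at_eof = lines.size < 10
    remaining = lines.size
    if lines.empty?
      io.close
      EM.stop
    end
    lines.each do |line|
      # send_to_db stands in for your asynchronous DB call
      send_to_db(line) do
        # when the DB call is done
        remaining -= 1
        next unless remaining == 0
        if at_eof
          # stop only after the last pending write has finished
          io.close
          EM.stop
        else
          EM.next_tick(read_chunk)
        end
      end
    end
  end
  EM.next_tick(read_chunk)
end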
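
The chunk-per-tick idea can also be tried without EventMachine or a database. This is only a minimal sketch using the standard library: `TickQueue` is a hypothetical stand-in for the reactor (it is not an EventMachine class), `read_lines` and `import_in_chunks` are names made up for this illustration, and the "database" is a plain block that is called synchronously.

```ruby
require 'stringio'

# Read up to `count` lines from `io`; returns fewer (possibly none) at end of file.
def read_lines(io, count)
  lines = []
  count.times do
    line = io.gets
    break unless line
    lines << line.chomp
  end
  lines
end

# Hypothetical stand-in for a reactor: a queue of procs, one run per "tick".
class TickQueue
  def initialize
    @queue = []
  end

  def next_tick(callable = nil, &block)
    @queue << (callable || block)
  end

  def run
    @queue.shift.call until @queue.empty?
  end
end

# Process `io` one small chunk per tick, handing each line to the `store` block.
def import_in_chunks(io, chunk_size, reactor, &store)
  step = proc do
    lines = read_lines(io, chunk_size)
    lines.each { |line| store.call(line) }
    # A short chunk means end of file, so nothing more is scheduled.
    reactor.next_tick(step) if lines.size == chunk_size
  end
  reactor.next_tick(step)
end

stored = []
reactor = TickQueue.new
import_in_chunks(StringIO.new("a.com\nb.com\nc.com\n"), 2, reactor) { |line| stored << line }
reactor.run
```

Between two chunks the queue is free to run other work, which is the whole point of keeping each read small. With a real asynchronous client, the next chunk would be scheduled from the write-completion callbacks rather than right after the loop.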

EventMachine?

, EM:: FileStreamer. -, FileStreamer " ++". / db , ?

db ThreadedResource, , ... Cassandra, , Cassandra .
