Trigger a river update in Elasticsearch

I have a setup that regenerates an Elasticsearch index using the JDBC river plugin, which simply performs a SELECT * FROM on a table and indexes the results.
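For reference, the river is registered by indexing a _meta document into the special _river index, roughly like this (a sketch using the Python client from that era; the connection details and table name are placeholders):

  # Roughly how the river is registered (all names and credentials are placeholders).
  from elasticsearch import Elasticsearch

  es = Elasticsearch(["http://localhost:9200"])
  es.index(index="_river", doc_type="my_river", id="_meta", body={
      "type": "jdbc",
      "jdbc": {
          "driver": "com.mysql.jdbc.Driver",
          "url": "jdbc:mysql://localhost:3306/mydb",
          "user": "user",
          "password": "password",
          "sql": "select * from events",
      },
  })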

But I would like to be able to trigger the river on demand through the API, in addition to the standard polling interval, so that a document is indexed as soon as it is inserted into the table.

Does anyone know if anything like this is possible?

i.e. /_river/my_river/_refresh

Thanks.

4 answers

I don't see a good way to get the JDBC river to index your specific updated document in real time, and I'm not sure it is intended to be used that way anyway.

Instead of triggering the JDBC river to index your document, why not index the document directly from your update code?

The JDBC river is a great way to bulk-load large amounts of data, and there is documentation on maintaining consistency with polling, but I don't think it will easily meet your real-time needs.
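If it helps, here is a minimal sketch of the "index from your update code" approach using the official Python client. The index, type, table, and field names are placeholders borrowed from the other answers, and db stands in for whatever database handle your update code already has:

  # Minimal sketch: push the record into Elasticsearch from your own
  # update code instead of waiting for the river to poll.
  from elasticsearch import Elasticsearch

  es = Elasticsearch(["http://localhost:9200"])

  def save_event(db, record):
      """Persist the record to SQL, then index it immediately."""
      db.execute(
          "insert into events (doc_id, field_1) values (?, ?)",
          (record["doc_id"], record["field_1"]),
      )
      db.commit()
      es.index(index="items", doc_type="doc_type",
               id=record["doc_id"], body=record)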


Thanks for the suggestion. Feedback is very welcome; please join the Elasticsearch community. I will open an issue for triggering a river run on demand at https://github.com/jprante/elasticsearch-river-jdbc/issues


It looks like you're struggling with the classic "push vs. pull" problem. Rivers are designed to pull data from the database at intervals. They are easy to set up, but like everything else in computer science they are a trade-off: in particular, you lose real-time indexing. A river you could trigger on demand might be the best of both worlds, or it might flood your server with a lot of unnecessary traffic (for example, why run "SELECT * ..." when you know exactly which document was updated?).

If you have a hard requirement for real-time indexing (as I did), you "push" your updates into Elasticsearch instead: you write an Elasticsearch client that delivers your updated records as they are saved. FWIW, I solved this by publishing messages on a service bus; a service waiting on the other end retrieved the entity from SQL and indexed it, as sketched below. Once you have that infrastructure in place, it is also little extra work to write a small application for the initial import of the SQL data, or to create a scheduled task that re-indexes the data.
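Roughly, the consumer end of that setup looks like the sketch below. This is under stated assumptions: queue.Queue stands in for the real service bus, sqlite3 for the real database, and all index, table, and field names are placeholders.

  # The message carries only the key of the updated row; the worker
  # re-reads the row from SQL and indexes it.
  import queue
  import sqlite3
  from elasticsearch import Elasticsearch

  es = Elasticsearch(["http://localhost:9200"])
  bus = queue.Queue()                    # replace with your bus client
  db = sqlite3.connect("app.db")

  def index_updated_row(doc_id):
      row = db.execute(
          "select doc_id, field_1 from events where doc_id = ?", (doc_id,)
      ).fetchone()
      if row is not None:
          es.index(index="items", doc_type="doc_type", id=row[0],
                   body={"doc_id": row[0], "field_1": row[1]})

  while True:                            # worker loop: drain the bus forever
      index_updated_row(bus.get())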


An alternative is to use Logstash with the JDBC input plugin:

https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html

  • download Logstash
  • install the JDBC input plugin (on recent versions: bin/logstash-plugin install logstash-input-jdbc)

Configuration Example:

  input {
    jdbc {
      jdbc_connection_string => "jdbc:oracle:thin:@localhost:1521:XE"
      jdbc_user => "user"
      jdbc_password => "password"
      jdbc_driver_library => "/home/logstash/lib/ojdbc6.jar"
      jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
      statement => "select * from events where update_date > :sql_last_value order by update_date"
      last_run_metadata_path => "run_metadata_event.log"
      schedule => "* * * * *"
    }
  }

  # The filter section is optional.
  filter {
    mutate {
      split => { "field_1" => ";" }
    }
  }

  output {
    elasticsearch {
      #protocol => "http"
      hosts => ["localhost:9200"]
      index => "items"
      document_type => "doc_type"
      document_id => "%{doc_id}"
    }
  }
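Save this as e.g. jdbc.conf and start Logstash with bin/logstash -f jdbc.conf. With the schedule above the query runs every minute, and :sql_last_value ensures each run only picks up rows updated since the previous one.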
