Elasticsearch clients for Python, no solution

I'm having a very bad week, having chosen Elasticsearch with Graylog2. I am trying to run queries against the data in ES using Python.

I have tried the following clients:

  • ESClient - very strange results; I think it is no longer supported. The query_body has no effect and it returns all results.
  • Pyes - unreadable and undocumented. I went through the sources and cannot figure out how to run a simple query; maybe I'm just not smart enough. I would happily write plain queries in JSON and then just use Python objects/iterators to walk the results, but Pyes doesn't even make that easy.
  • ElasticUtils - yet more documentation without a complete example. I get the error below with the code shown here. I don't even see how this S() knows which host to connect to.

    es = get_es(hosts=HOST, default_indexes=[INDEX])

    basic_s = S().indexes(INDEX).doctypes(DOCTYPE).values_dict()

The result is this traceback:

    print basic_s.query(message__text="login/delete")
      File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 223, in __repr__
        data = list(self)[:REPR_OUTPUT_SIZE + 1]
      File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 623, in __iter__
        return iter(self._do_search())
      File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 573, in _do_search
        hits = self.raw()
      File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 615, in raw
        hits = es.search(qs, self.get_indexes(), self.get_doctypes())
      File "/usr/lib/python2.7/site-packages/pyes/es.py", line 841, in search
        return self._query_call("_search", body, indexes, doc_types, **query_params)
      File "/usr/lib/python2.7/site-packages/pyes/es.py", line 251, in _query_call
        response = self._send_request('GET', path, body, querystring_args)
      File "/usr/lib/python2.7/site-packages/pyes/es.py", line 208, in _send_request
        response = self.connection.execute(request)
      File "/usr/lib/python2.7/site-packages/pyes/connection_http.py", line 167, in _client_call
        return getattr(conn.client, attr)(*args, **kwargs)
      File "/usr/lib/python2.7/site-packages/pyes/connection_http.py", line 59, in execute
        response = self.client.urlopen(Method._VALUES_TO_NAMES[request.method], uri, body=request.body, headers=request.headers)
      File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
        return self.urlopen(method, url, body, headers, retries-1, redirect)  # Try again
      File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
        return self.urlopen(method, url, body, headers, retries-1, redirect)  # Try again
      File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
        return self.urlopen(method, url, body, headers, retries-1, redirect)  # Try again
      File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
        return self.urlopen(method, url, body, headers, retries-1, redirect)  # Try again
      File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 255, in urlopen
        raise MaxRetryError("Max retries exceeded for url: %s" % url)
    pyes.urllib3.connectionpool.MaxRetryError: Max retries exceeded for url: /graylog2/message/_search

I wish the developers of these otherwise good projects would provide some complete examples. Even looking at the sources, I am completely lost.

Is there any solution? Please help me with Elasticsearch and Python, or should I just give it all up, pay for a hosted account, and end this misery?

For now I'm sticking with curl: fetch the entire JSON result and json.loads() it. I hope this holds up, though pulling 1 million messages out of Elasticsearch that way may not go so smoothly.
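
For reference, the workaround looks roughly like this (a minimal sketch of the raw-HTTP approach in Python 2.7; the host, index, type and field names are placeholders for my setup):

    # Sketch of the raw-HTTP workaround: POST a query body to _search and
    # parse the JSON response. Host/index/type/field names are placeholders.
    import json
    import urllib2

    query = {
        "query": {"match": {"message": "login/delete"}},
        "size": 100,
    }

    url = "http://localhost:9200/graylog2/message/_search"
    request = urllib2.Request(url, json.dumps(query))
    response = json.load(urllib2.urlopen(request))

    for hit in response["hits"]["hits"]:
        print hit["_source"]["message"]

For anything like a million messages this would presumably have to go through the scan/scroll API rather than one big _search, but for spot checks it works.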

+8
python elasticsearch pyes
6 answers

Honestly, I've had the most luck just cURLing everything. ES has so many different methods, filters, and queries that the various "wrappers" have a hard time recreating all of the functionality. In my opinion it's similar to using an ORM for databases: what you gain in ease of use you lose in flexibility/raw power.

Except that most of the ES wrappers don't even make things that much easier in the first place.

I would stick with cURL for a while and see how it treats you. You can use external JSON formatters to check your JSON, the mailing list to look for examples, and the docs are fine as long as you work in raw JSON.
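
For example, combining a query with a filter is straightforward in the raw DSL, while many wrappers make it awkward. A sketch (pre-2.0 "filtered" syntax; the index and field names here are invented):

    # Raw query DSL posted over HTTP: a query_string query wrapped in a
    # range filter. Index and field names are invented examples.
    import json
    import urllib2

    body = {
        "query": {
            "filtered": {
                "query": {"query_string": {"query": "login/delete"}},
                "filter": {"range": {"created_at": {"gte": "2013-01-01"}}},
            }
        }
    }

    url = "http://localhost:9200/graylog2/message/_search"
    req = urllib2.Request(url, json.dumps(body))
    print json.load(urllib2.urlopen(req))["hits"]["total"]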

+7

I've found rawes to be quite good: https://github.com/humangeo/rawes

It's a fairly low-level interface, but I find it much less awkward to work with than the high-level ones. It also supports the Thrift RPC if you're into that.
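
A basic search looks roughly like this (adapted from memory of the rawes README; treat the exact constructor and method signatures as assumptions if you are on a different version, and the index, type and field names as placeholders):

    # rawes sketch: open a connection and issue a search with a raw
    # query body. Index, type and field names are placeholders.
    import rawes

    es = rawes.Elastic('localhost:9200')

    result = es.get('graylog2/message/_search', data={
        'query': {'match': {'message': 'login/delete'}}
    })
    print result['hits']['total']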

+8

Setting the host explicitly fixed this error for me:

    basic_s = S().es(hosts=HOST, default_indexes=[INDEX])
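
Putting that together with the query from the question, something along these lines should work (a sketch against the same ElasticUtils version; HOST, INDEX and DOCTYPE are the same placeholders you already use):

    # Sketch: explicit host plus the original query, chained on one S().
    from elasticutils import S

    HOST = ['localhost:9200']   # placeholders, same as in the question
    INDEX = 'graylog2'
    DOCTYPE = 'message'

    basic_s = (S().es(hosts=HOST, default_indexes=[INDEX])
                  .indexes(INDEX)
                  .doctypes(DOCTYPE)
                  .values_dict())

    for hit in basic_s.query(message__text="login/delete"):
        print hit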

+7

Elasticsearch recently (September 2013) released an official Python client, elasticsearch-py (elasticsearch on PyPI, also on GitHub), which is supposed to be a direct mapping of the official Elasticsearch API. I haven't used it yet, but it looks promising, and at the very least it will match the official docs!

Edit: We have started using it, and I'm very happy with it. The Elasticsearch API is pretty clean, and elasticsearch-py sticks to it. It's easier to work with and reason about in general, and it has decent logging too.
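
A minimal search with it looks something like this (a sketch; the host, index and field names are placeholders):

    # elasticsearch-py sketch: connect, run a match query, iterate hits.
    # Host, index and field names are placeholders.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["localhost:9200"])

    result = es.search(
        index="graylog2",
        body={"query": {"match": {"message": "login/delete"}}},
    )

    for hit in result["hits"]["hits"]:
        print hit["_source"]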

+3

ElasticUtils has sample code: http://elasticutils.readthedocs.org/en/latest/sampleprogram1.html

If there are other things you'd like to see in the docs, just ask.

+2
