Zookeeper and SolrCloud on AWS EC2 Instances

I have used Solr for a while, but I am new to SolrCloud. I am investigating whether, in my context, it makes sense to deploy SolrCloud or to run multiple Solr instances (each with the corresponding indexed content) behind an ELB.

My deployment will be in AWS on EC2 instances. Our current AWS troubleshooting strategy is to terminate misbehaving instances and let the Auto Scaling group recreate them automatically (new instances are set up by scripts when they are created). In fact, we do not have access to log in to the instances once they are created. Everything stored in Solr can be reindexed, so data loss is not a concern.

However, trying to understand the SolrCloud infrastructure, I had a few questions:

  • Can ZooKeeper automatically pick up a new instance if I terminate one of them? Everything I have seen uses static IPs in the configuration, which would require updating the configs (and restarting ZooKeeper) whenever an instance is terminated and replaced.
  • Is there a ZooKeeper master instance that I have to call, or can I call any of them? If I can call any of them, we will most likely put an ELB in front of ZooKeeper.
  • If we come under heavy load and let the AWS Auto Scaling group create additional servers to be used as SolrCloud shards, will SolrCloud gracefully add instances and remove them without problems? (This seems to be the case, and the whole point of using SolrCloud.)
1 answer
  • Can ZooKeeper automatically pick up a new instance if I terminate one of them? Everything I have seen uses static IPs in the configuration, which would require updating the configs (and restarting ZooKeeper) whenever an instance is terminated and replaced.

AN: In each ZooKeeper's configuration, you just need to list the other ZooKeepers. This makes every ZooKeeper aware of the other running members of the ensemble. You do not need to change this configuration unless you plan to increase or decrease the number of ZooKeepers. Even if you do have to, it can be done without breaking the cluster by changing the servers one at a time. We also use hostnames rather than IPs in the config, so that IP changes do not affect it.
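
As a rough illustration of the hostname-based configuration described above, the following Python sketch generates the `server.N` lines of a `zoo.cfg`. The hostnames are hypothetical placeholders, and 2888/3888 are the ports conventionally used in ZooKeeper examples, not values taken from this setup. Each server would also need a `myid` file whose number matches its `server.N` entry.

```python
def zoo_cfg_server_lines(hostnames, peer_port=2888, election_port=3888):
    """Build the server.N lines of a zoo.cfg from a list of ensemble hostnames."""
    return [
        f"server.{index}={host}:{peer_port}:{election_port}"
        for index, host in enumerate(hostnames, start=1)
    ]


# Hypothetical DNS names; a replaced instance keeps its name, only the IP changes.
ENSEMBLE = [
    "zk1.example.internal",
    "zk2.example.internal",
    "zk3.example.internal",
]

print("\n".join(zoo_cfg_server_lines(ENSEMBLE)))
```

Because the entries name hosts rather than addresses, a replacement instance that comes up under the same DNS name needs no change to this file.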

  • Is there a ZooKeeper master instance that I have to call, or can I call any of them? If I can call any of them, we will most likely put an ELB in front of ZooKeeper.

AN: ZooKeeper has a leader and followers. We do not need to worry about which is which, because we do not communicate with ZooKeeper directly; the Solr nodes do.

  • If we come under heavy load and let the AWS Auto Scaling group create additional servers to be used as SolrCloud shards, will SolrCloud gracefully add instances and remove them without problems? (This seems to be the case, and the whole point of using SolrCloud.)

AN: When you create a new Solr node, you need to start it as part of the same cluster (pass it the same ZooKeepers). Once it has joined, you will have to split a shard and move it to the new node in order to balance the cluster. This is not automated at the moment.
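
Since the rebalancing is manual, one way to script it is through Solr's Collections API. The sketch below only builds the request URLs for the `SPLITSHARD` and `ADDREPLICA` actions; it does not send them. The base URL, collection name, shard names, and node name are all hypothetical, and the sub-shard name `shard1_0` assumes Solr's default naming after a split.

```python
from urllib.parse import urlencode


def collections_api_url(solr_base, action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params})
    return f"{solr_base}/admin/collections?{query}"


# Hypothetical values, for illustration only.
SOLR_BASE = "http://solr1.example.internal:8983/solr"

# Step 1: split an existing shard into two sub-shards.
split_url = collections_api_url(
    SOLR_BASE, "SPLITSHARD", collection="articles", shard="shard1"
)

# Step 2: place a replica of one sub-shard on the newly started node.
add_replica_url = collections_api_url(
    SOLR_BASE,
    "ADDREPLICA",
    collection="articles",
    shard="shard1_0",
    node="solr2.example.internal:8983_solr",
)

print(split_url)
print(add_replica_url)
```

After the new replica is active, the old copy on the original node could be removed with a further Collections API call, which completes the "transfer" part of the step.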

The Solr nodes are the ones you need to add to your ELB.

When you start a Solr node, you give it the list of ZooKeepers; from that, the node learns which cluster it is part of and which other nodes are serving the cluster.
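
A minimal sketch of how a provisioning script might assemble that ZooKeeper list for a new node. The hostnames and the optional chroot path are assumptions, 2181 is ZooKeeper's conventional client port, and the `-c` and `-z` flags are those of the `bin/solr` script in Solr 5 and later; check them against the version you deploy.

```python
def zk_host_string(hostnames, client_port=2181, chroot=""):
    """Join ZooKeeper hosts into the comma-separated string Solr expects."""
    hosts = ",".join(f"{host}:{client_port}" for host in hostnames)
    return hosts + chroot


def solr_start_command(hostnames, chroot=""):
    """Return the argument list for starting a Solr node in cloud mode."""
    return ["bin/solr", "start", "-c", "-z", zk_host_string(hostnames, chroot=chroot)]


# Hypothetical ensemble hostnames.
ZOOKEEPERS = ["zk1.example.internal", "zk2.example.internal", "zk3.example.internal"]

print(" ".join(solr_start_command(ZOOKEEPERS)))
```

Every Solr node launched by the Auto Scaling group would be started with the same string, which is what makes it join the same cluster.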