Magento reindexing loses Solr docs

This is driving me crazy. I am running Magento EE 1.11.1 with Solr. We have a cron job that runs every night and reindexes the entire site. Every time it runs, I check Solr's statistics, and the numDocs and maxDocs values are only a fraction of what should be indexed (27,000 vs. ~90,000). This means that when I search the site, the results are only part of what they should be.
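The indexed document count can also be checked from the shell. This is a minimal sketch, not something from the original post: it assumes Solr answers at http://localhost:8983/solr (a placeholder, the question does not give the URL), that curl is installed, and that the standard select handler with the JSON response writer is enabled. The function names are made up for illustration. The numFound value for a match-all query should agree with numDocs.

```shell
#!/bin/sh
# Assumption: base URL of the Solr core; replace with your own.
: "${SOLR_URL:=http://localhost:8983/solr}"

# Read a Solr JSON response on stdin and print the numFound value.
extract_num_found() {
    sed -n 's/.*"numFound": *\([0-9][0-9]*\).*/\1/p'
}

# Ask Solr for a match-all query with zero rows and print the total.
solr_doc_count() {
    curl -s "$SOLR_URL/select?q=*:*&rows=0&wt=json" | extract_num_found
}
```

Running solr_doc_count before and after the nightly reindex shows whether the cron run is the step that drops documents.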

The only way to get the search to work correctly is to stop Solr, delete and recreate the /apache-solr/site_name/solr/data folder, restart Solr, and reindex only the catalog search index through the shell. If I run that single reindex through the shell without deleting and recreating the data folder, I get only about half of the documents I should (~51,000).
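The manual recovery described above could be scripted roughly as follows. This is only a sketch: the data folder, the Magento root, and the commands that stop and start Solr are all placeholders (assumptions), since the post does not say how Solr is managed on this server. By default it runs in dry-run mode and only prints what it would do.

```shell
#!/bin/sh
# Every value below is a placeholder (assumption); set them for your server.
: "${SOLR_DATA_DIR:=/path/to/solr/data}"      # hypothetical Solr data folder
: "${MAGENTO_ROOT:=/path/to/magento}"         # hypothetical Magento root
: "${SOLR_STOP_CMD:=service solr stop}"       # hypothetical stop command
: "${SOLR_START_CMD:=service solr start}"     # hypothetical start command
: "${DRY_RUN:=1}"                             # 1 = print commands, 0 = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stop Solr, recreate the data folder, restart, rebuild the search index.
# The body runs in a subshell so that cd and set -e do not leak to the caller.
rebuild_solr_index() (
    set -e
    run $SOLR_STOP_CMD                         # unquoted on purpose: split into words
    run rm -rf "${SOLR_DATA_DIR:?}"            # :? aborts if the variable is empty
    run mkdir -p "$SOLR_DATA_DIR"
    run $SOLR_START_CMD
    run cd "$MAGENTO_ROOT"
    run php shell/indexer.php --reindex catalogsearch_fulltext
)
```

Call rebuild_solr_index once to review the printed commands, then again with DRY_RUN=0 to actually run them.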

All index files in the data folder are owned by root, and the Solr process runs as root. I have all the logs set to warning level, but nothing is being logged at the moment. I manage other sites that use Solr and have never run into this problem. This installation has many attributes (330) and many products (~100,000). Could this be part of the problem? Thanks!

5 answers

Upgrading to EE 1.12 may not be a solution either. We have a client on EE 1.12 who has problems with the Solr integration. In their case, all indexing attempts fail when the indexer reaches the custom product attributes.

Nexcess and Magento support have been working on this for over 6 weeks. The current status from Magento support is:

Unfortunately, the patch is still under development, and I cannot tell when our developer will complete it.

+3
source

Since the Enterprise_Search module adds a cronjob, which runs every day by default at 3am, I found a better solution than adding a line of code to the shell/abstract.php file.

All you have to do is create a small module that registers the relevant event observer in the global area instead of the adminhtml area:

    <?xml version="1.0"?>
    <config>
        <modules>
            <YourNamespace_YourModuleName>
                <version>0.0.1</version>
            </YourNamespace_YourModuleName>
        </modules>
        <global>
            <events>
                <!-- The misspelling (cat-e-logsearch) is correct, you can look it up in the config.xml of the Enterprise_Search module -->
                <catelogsearch_searchable_attributes_load_after>
                    <observers>
                        <enterprise_search>
                            <class>enterprise_search/observer</class>
                            <method>storeSearchableAttributes</method>
                        </enterprise_search>
                    </observers>
                </catelogsearch_searchable_attributes_load_after>
            </events>
        </global>
    </config>

Remember to activate your module by placing another configuration file in app/etc/modules/YourNamespace_YourModuleName.xml :

    <?xml version="1.0"?>
    <config>
        <modules>
            <YourNamespace_YourModuleName>
                <active>true</active>
                <codePool>local</codePool>
                <depends>
                    <Enterprise_Search/>
                </depends>
            </YourNamespace_YourModuleName>
        </modules>
    </config>

Now you can restore the Solr index from the command line by issuing the following command from the Magento root folder (provided that you have access to the shell, of course):

 php shell/indexer.php --reindex catalogsearch_fulltext 
+3
source

After working on this for several days (along with this question, by the way), I think I have a solution. I tested it and did not see any errors.

    # shell/abstract.php @ line 75
    public function __construct()
    {
        if ($this->_includeMage) {
            require_once $this->_getRootPath() . 'app' . DIRECTORY_SEPARATOR . 'Mage.php';
            Mage::app($this->_appCode, $this->_appType);
            Mage::app()->addEventArea('adminhtml'); # the magic line
        }
        $this->_applyPhpVariables();
        $this->_parseArgs();
        $this->_construct();
        $this->_validate();
        $this->_showHelp();
    }

The problem was that enterprise_search/observer was not loaded, so its storeSearchableAttributes method never ran. Without it, the additional attribute data is never registered for indexing.

The only side effect I can think of is that running a shell script will now load all the adminhtml observers. This may slow things down, which takes away part of the point of running from the shell. It will not be as slow as the browser, but it will probably be slower than before.

If you have any questions, or think I can help in another way, let me know!

+2
source

Have you spent any time watching the Solr logs while the indexer runs? We are currently running 1.12 and are finding several problems with Solr even there. We had to fix a problem that Solr reported to us as an error.

My comments are in my answer here: "Magento 1.12 and Solr 3.6. No correct results and spell suggestions."

I would expect this tip to apply to 1.11 as well, though you may have to adapt it a bit. Open ./app/code/core/Enterprise/Search/Model/Adapter/Abstract.php and find prepareDocsPerStore.

You can track and log the documents sent to Solr as a sanity check. For example, you could add something quick and dirty just below the line $docs[] = $doc; such as:

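    $solr_log_file = '/mnt/tmp/' . date('Y-m-d') . '/' . $storeId . '-' . $productId . '-solr.txt';
    // Create the dated directory if needed; file_put_contents() will not create it.
    if (!is_dir(dirname($solr_log_file))) {
        mkdir(dirname($solr_log_file), 0775, true);
    }
    file_put_contents($solr_log_file, var_export($doc, true));

Warning: this may contain syntax errors, as I just typed it up quickly.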

Running var_export on $productIndexData before and after this line was also revealing:

    $productIndexData = $this->_prepareIndexProductData($productIndexData, $productId, $storeId);
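To turn those dump files into a quick sanity check, they can be counted per store and compared with the number of documents Solr reports. A small sketch, assuming the storeId-productId-solr.txt naming used by the logging snippet above; the function name is made up for illustration, and -maxdepth is a common find extension rather than strict POSIX.

```shell
#!/bin/sh
# Count dump files named <storeId>-<productId>-solr.txt in a directory.
# Usage: count_dumped_docs DIR [STORE_ID]   (all stores if STORE_ID is omitted)
count_dumped_docs() {
    dir=$1
    store=${2:-*}
    find "$dir" -maxdepth 1 -type f -name "${store}-*-solr.txt" | wc -l | tr -d ' '
}
```

For example, count_dumped_docs /mnt/tmp/2013-01-15 1 would print the number of documents prepared for store 1 on that (hypothetical) date. If that number is much higher than what Solr holds, the documents are being lost after they are prepared.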

+1
source

Hi, I have another solution for this problem. In my case I wrote a small script with the following code:

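    <?php
    // Raise the memory limit; loading all searchable attributes can be heavy.
    ini_set("memory_limit", "1000M");
    require_once "app/Mage.php";
    umask(0);
    Mage::app();
    // Call the Enterprise_Search observer directly to store the searchable attributes.
    $observer = Mage::getModel('enterprise_search/observer');
    $observer->storeSearchableAttributes();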

I named it solrindex.php and ran it in a browser as mydomain/solrindex.php. After that I reindex the catalog search index from the admin panel, and this works for me.

0
source

Source: https://habr.com/ru/post/926416/

