Magento 2: Indexer stuck at processing

LATEST UPDATE (13/02/2021): Resolved? Fingers crossed!

This issue seems to have resolved itself after a few actions I took:

  • Updated to Magento 2.3.6 and updated all extensions to their latest versions (Mirasvit Elasticsearch Ultimate 1.0.151)
  • Updated to Elasticsearch 7
  • Reset the Elasticsearch indexes in the backend (Mirasvit settings) and changed the Elasticsearch Index Prefix, to be sure I was working on newly generated indexes.
  • Then I discovered two cron job settings in the table core_config_data that were still left over from the M1 migration:
    • crontab/default/jobs/log_clean/schedule/cron_expr
    • crontab/default/jobs/log_clean/run/model
  • Since these leftover cron entries were generating errors in cron_schedule, I removed those rows from core_config_data.
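The leftover rows can be inspected and removed with a couple of MySQL statements. A sketch, assuming shell access to the database (and a backup first!); the database name `magento2` is a placeholder:

```shell
# Inspect the leftover M1 cron settings before touching anything
# ("magento2" is a placeholder for your database name).
mysql magento2 -e "SELECT config_id, path, value FROM core_config_data \
  WHERE path LIKE 'crontab/default/jobs/log_clean/%';"

# Remove them once you are sure they really are M1 leftovers.
mysql magento2 -e "DELETE FROM core_config_data \
  WHERE path LIKE 'crontab/default/jobs/log_clean/%';"
```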

After upgrading, compiling and generating static content, I once again found my logs flooded with:

Environment emulation nesting is not allowed error

Since Mirasvit’s changelog mentioned the CMS page in relation to this error, I decided to:

  • Turn off the search index for CMS (Mirasvit -> Search -> Search Indexes), then reset and reindex all indexes from the CLI.
  • Turn the CMS search index back on and reindex that one index using the Mirasvit button. After that, reset and reindex all indexes from the CLI again.
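For reference, a sketch of the reset-and-reindex from the CLI (run from the Magento root):

```shell
# Reset all indexers to 'invalid' so the next reindex starts clean
bin/magento indexer:reset

# Rebuild all indexes (this can take a while on a large catalog)
bin/magento indexer:reindex

# Flush the cache afterwards
bin/magento cache:flush
```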

Cleared all caches and behold… no more errors, the catalogsearch_fulltext index on Ready, and server performance back where it should be.

So. I’m still not exactly sure what fixed the issue, but this worked for me. I hope it does for you too!

Grtz,

Matthijs

 

Below are the steps I tried before that.


This problem starts whenever the cron catalogrule_apply_all is launched. For some reason it locks indexer_reindex_all_invalid.

Whatever you try: ALWAYS BACK UP DB FIRST.

The problem

I ran into a problem on a Magento 2 website where one of the indexers got stuck in a loop of processing. It took me quite a while to find any solution.

In my case it was catalogsearch_fulltext – which resulted in heavy system load, because every reindex process took about 8 minutes and maxed out processors and memory. After finishing the reindex, the whole process would magically start again, and again, and again.

I was using Magento 2.3.5 on PHP 7.2, MySQL and Elasticsearch with the Mirasvit extension (which offered good support, by the way, but didn’t solve the problem). The product DB was migrated from 1.9.x, and I suspected this to be one of the causes, because… well… Magento migrations tend to bring a swarm of problems.

Searching everywhere on the net, this turned out not to be easy to debug. From SSH I ran:

tail -f var/log/debug.log

Early on, the log started with:

Environment emulation nesting is not allowed error

These turned out to be caused by the Mirasvit Elasticsearch extension. After I contacted support, they fixed it and released an update.

However, once that was fixed (or maybe it had been there all along), I noticed lots and lots of these errors:

main.WARNING: Could not acquire lock for cron job: indexer_reindex_all_invalid [] []

But no other really useful errors or warnings. After a while I was wondering:
was this error the cause or the consequence of the problem?
(At first I thought this was also Mirasvit related, but they could not reproduce it, so it might as well be my database.)

To consider

As far as I could tell, what Magento should do is scan all products/items that need processing, store that reindex info somewhere in the database and then launch the reindex (by cron) for only those products/items. In my case, catalogsearch_fulltext was in a loop of constantly indexing all products – so that was definitely not good.

Worth trying

Settings I tried changing:

  • Setting all indexers to ‘Update on Save’ and back to ‘Update by Schedule’
  • Changing the cron timing for the ‘index’ group (Generate Schedules Every / Schedule Ahead for) to 15 / 20 (Stores -> Configuration -> Advanced -> System).
  • Turning ‘Use Separate Process’ on/off for the cron job groups.
  • Changing the cron timing in /vendor/magento/module-indexer/etc/crontab.xml for indexer_reindex_all_invalid to 0/5 (run every 5 minutes instead of every minute).
  • Restarting MySQL and Elasticsearch, resetting indexes
  • Manually resetting and restarting indexes via the CLI
  • Truncating cron_schedule
  • Manually editing the status to ‘idle’ in the table mview_state
  • Upgrading the server (more processors/memory).
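Several of the steps above have CLI equivalents, which are easier to repeat than clicking through the backend. A sketch (run from the Magento root):

```shell
# Show the current mode of every indexer
bin/magento indexer:show-mode

# Switch all indexers to 'Update on Save' and back to 'Update by Schedule'
bin/magento indexer:set-mode realtime
bin/magento indexer:set-mode schedule

# Reset and rebuild everything, then check the result
bin/magento indexer:reset
bin/magento indexer:reindex
bin/magento indexer:status
```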

Most of these didn’t help at all… or maybe for one day…
The next day in debug.log: same issues, and catalogsearch_fulltext constantly processing again. Oh my.

Whatever you try: ALWAYS BACK UP DB FIRST.

Advice: clean up migration rubbish on attributes

Optional, but in my case necessary. Many things went wrong after the migration. My client’s store had 2000 products over 2 websites and 5 store views (as languages). I may write another blog post about this.

I kept having strange issues. Duplicating a product took forever, and the duplicate didn’t show up in the frontend. Other strange random things were happening too (images not showing, or duplicating).

I noticed that for a lot of these migrated products, the attribute values were saved individually per store view, without ‘Use Default’ checked. Thanks, Magento migration tool.
My client needed different values for at most the title, descriptions and meta info – and not even for all products.

So first, I cleaned up the entire database manually, removing all values from all store views except store 0 (admin). For products, that means these tables:

catalog_product_entity_datetime
catalog_product_entity_decimal
catalog_product_entity_gallery
catalog_product_entity_int
catalog_product_entity_media_gallery
catalog_product_entity_text
catalog_product_entity_varchar

And for categories the same catalog_category_entity_[etc].

Using the EAV attributes overview in another tab, and some MySQL queries, I was able to filter out and delete the attribute values that didn’t need to be saved per store view ("SELECT * / DELETE FROM catalog_product_entity_varchar WHERE attribute_id = 'x' AND store_id <> 0"). After that, a manual reset/reindex and a cache clear.
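As an illustration of those cleanup queries: always check with a SELECT first, and only DELETE when you are sure. The attribute_id 73 below is just an example – look your own IDs up in eav_attribute; `magento2` is again a placeholder database name:

```shell
# Count per-store-view values for one attribute before deleting
# (73 is an example attribute_id; look yours up in eav_attribute).
mysql magento2 -e "SELECT store_id, COUNT(*) FROM catalog_product_entity_varchar \
  WHERE attribute_id = 73 AND store_id <> 0 GROUP BY store_id;"

# Then remove the per-store-view values, keeping store 0 (admin) intact
mysql magento2 -e "DELETE FROM catalog_product_entity_varchar \
  WHERE attribute_id = 73 AND store_id <> 0;"
```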

URL_PATH attribute: I read that for products, url_path is no longer used in 2.3, so I removed all these values completely (I kept a backup) – including from store 0.
URL_KEY attribute: I deleted the excess url_key values for all stores other than store 0. I left url_path for categories intact.
BEWARE: Do not remove the url_key values from store 0! If you delete the url_keys for other store views (or all of them, by accident) and run into trouble, install a module that regenerates them (e.g. the one by Oleg Koval).

This cleanup seemed to fix the problem. Hurray! Indexing had also become faster because I had deleted all the nonsense information.

Next day: same error again. catalogsearch_fulltext stuck in processing, and the indexer_reindex_all_invalid errors once again flooding debug.log. Waaah!

Solution: reset index process completely, manually

So I tried to figure out why indexer_reindex_all_invalid was either not working properly, throwing errors, or resetting all products to be reindexed.

So, Magento keeps track of changelogs in tables like

catalogsearch_fulltext_cl

or any table ending in _cl. The ID in these tables auto-increments.
Also: if there is a row here (mostly one or two), it holds the entity_id of the product that is marked for updating. I noticed that although all indexers said ‘Ready’, a few of the _cl tables had the same entity_id marked as reindexable – and these rows never changed.

These _cl changelog tables should be automatically reset/cleaned by another cron job, indexer_clean_all_changelogs.
Lo and behold: a lot of these indexer_clean_all_changelogs crons were marked ‘missed’ in cron_schedule.
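A quick way to check how that cron job has been doing, straight from cron_schedule (the `magento2` database name is a placeholder):

```shell
# How often has the changelog-clean cron actually run vs. been missed?
mysql magento2 -e "SELECT status, COUNT(*) FROM cron_schedule \
  WHERE job_code = 'indexer_clean_all_changelogs' GROUP BY status;"
```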

Finally, there is this table:

mview_state

which has a column version_id that must be in sync with the changelog tables – or at least increment on each run. I noticed that for catalogsearch_fulltext it was not incrementing.
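To see whether a view is keeping up, you can compare mview_state with the highest version_id in the matching changelog table. A sketch for catalogsearch_fulltext (`magento2` is a placeholder):

```shell
# The pointer the materialized view has processed up to
mysql magento2 -e "SELECT view_id, mode, status, version_id FROM mview_state \
  WHERE view_id = 'catalogsearch_fulltext';"

# The latest entry written to the changelog; if this keeps growing while
# mview_state.version_id stands still, the view is stuck
mysql magento2 -e "SELECT MAX(version_id) FROM catalogsearch_fulltext_cl;"
```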

So this is what I did

  1. Turn off all cron jobs and enter maintenance mode.
  2. Make sure no job is running anymore (ps aux | grep bin/magento).
  3. Truncate the changelog tables (you may have more ending in _cl), reset the mview_state table and truncate the cron_schedule table to reset the crons.

TRUNCATE TABLE `cataloginventory_stock_cl`;
TRUNCATE TABLE `catalogrule_product_cl`;
TRUNCATE TABLE `catalogrule_rule_cl`;
TRUNCATE TABLE `catalogsearch_fulltext_cl`;
TRUNCATE TABLE `catalog_category_product_cl`;
TRUNCATE TABLE `catalog_product_attribute_cl`;
TRUNCATE TABLE `catalog_product_category_cl`;
TRUNCATE TABLE `catalog_product_price_cl`;
TRUNCATE TABLE `customer_dummy_cl`;
TRUNCATE TABLE `design_config_dummy_cl`;
TRUNCATE TABLE `salesrule_rule_cl`;
TRUNCATE TABLE `targetrule_product_rule_cl`;
TRUNCATE TABLE `targetrule_rule_product_cl`;
TRUNCATE TABLE `inventory_cl`;

TRUNCATE TABLE `mirasvit_search_score_rule_product_cl`;
TRUNCATE TABLE `mst_misspell_cl`;

TRUNCATE TABLE `scconnector_google_feed_cl`;
TRUNCATE TABLE `scconnector_google_remove_cl`;

/* reset view_state */
UPDATE `mview_state` SET `version_id` = '0';

/* reset cron jobs schedule */
TRUNCATE TABLE `cron_schedule`;

4. Manually reset the indexes and reindex all via the CLI.

5. Flush the cache, exit maintenance mode and re-enable the crons.

6. Monitor cron_schedule for statuses, monitor mview_state for updated version_ids, and monitor the _cl tables for updated rows when products are changed (and check that the tables are cleaned by cron).

Monitor results

After resetting the index system, check that things run smoothly. Monitor debug.log via the CLI using tail -f var/log/debug.log. In a second CLI session, check the index status via bin/magento indexer:status.

  1. Add or update a product, then check whether the index (backlog) status changes and auto-resolves when cron runs. Meanwhile, check debug.log for errors during the cron run.
  2. If that runs correctly, check in the table mview_state whether version_id increments.
  3. Check the _cl tables after a while to see if rows are auto-cleared by the cron indexer_clean_all_changelogs. By default, Magento sets this to run every hour at minute 0.
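The monitoring steps above can be combined into a few terminals, roughly like this (`magento2` is a placeholder database name):

```shell
# Terminal 1: follow the log for errors during cron runs
tail -f var/log/debug.log

# Terminal 2: watch the indexer backlog resolve itself
watch -n 30 bin/magento indexer:status

# Terminal 3: keep an eye on the cron outcomes per job
mysql magento2 -e "SELECT job_code, status, COUNT(*) FROM cron_schedule \
  GROUP BY job_code, status ORDER BY job_code;"
```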

Final thoughts

At the time of writing, all my indexing is running smoothly again. Or rather: resetting the indexation process worked – until midnight.

Anyway, this is a way to reset the indexing process.


UPDATE (in progress)

As I discovered in debug.log, each night at midnight the indexes were reset by the cron catalogrule_apply_all, which by default is launched daily at 0:00 (see crontab.xml in module-catalogrule).

After resetting the indexing, I tested it by manually adding catalogrule_apply_all to cron_schedule for the next minute. And there it was: ‘Could not acquire lock for cron job: indexer_reindex_all_invalid’ was back, and catalogsearch_fulltext was back in its processing loop.
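For anyone wanting to reproduce this, the manual scheduling is a plain INSERT into cron_schedule – cron picks up pending rows whose scheduled_at has passed (`magento2` is a placeholder):

```shell
# Schedule catalogrule_apply_all one minute from now; the next cron:run
# of the 'default' group should pick it up.
mysql magento2 -e "INSERT INTO cron_schedule (job_code, status, created_at, scheduled_at) \
  VALUES ('catalogrule_apply_all', 'pending', NOW(), NOW() + INTERVAL 1 MINUTE);"
```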

After catalogrule_apply_all runs, cron seems to hang:

[USER]+ 11321  0.0  0.0   4340   724 ?        Ss   10:09   0:00 /bin/sh -c /usr/bin/php /[PATH]/public_html/bin/magento cron:run 2>&1 | grep -v "Ran jobs by schedule" >> /[PATH]/public_html/var/log/magento.cron.log

[USER]+ 11325 60.2  4.8 590908 197924 ?       R    10:09  10:14 /usr/bin/php /[PATH]/public_html/bin/magento cron:run

[USER]+ 12542  0.0  0.0   4340   772 ?        Ss   10:26   0:00 /bin/sh -c /usr/bin/php /[PATH]/public_html/bin/magento setup:cron:run >> /[PATH]/public_html/var/log/setup.cron.log

[USER]+ 12544 57.0  2.1 484272 88712 ?        R    10:26   0:01 /usr/bin/php /[PATH]/public_html/bin/magento setup:cron:run

master_+ 12554  0.0  0.0  12752  2196 pts/0    S+   10:26   0:00 grep bin/magento

The warning is generated by (and the lock should be released by) these lines in
/vendor/magento/module-cron/Observer/ProcessCronQueueObserver.php:

    private function tryRunJob($scheduledTime, $currentTime, $jobConfig, $schedule, $groupId)
    {
        // use sha1 to limit length
        // phpcs:ignore Magento2.Security.InsecureFunction
        $lockName =  self::LOCK_PREFIX . md5($groupId . '_' . $schedule->getJobCode());


        try {
            for ($retries = self::MAX_RETRIES; $retries > 0; $retries--) {
                if ($this->lockManager->lock($lockName, 0) && $schedule->tryLockJob()) {
                    $this->_runJob($scheduledTime, $currentTime, $jobConfig, $schedule, $groupId);
                    break;
                }
                $this->logger->warning("Could not acquire lock for cron job: {$schedule->getJobCode()}");
            }
        } catch (\Exception $e) {
            $this->processError($schedule, $e);
        } finally {
            $this->lockManager->unlock($lockName);
        }
    }

However, indexer_reindex_all_invalid is never unlocked after the first error occurs.
I am still investigating how this unlocking process should work.
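From the code above, the lock name for this job can be reconstructed: LOCK_PREFIX is ‘CRON_GROUP_’ in that class, and the cron group here is ‘default’. With the MySQL lock backend you can then ask the database whether the lock is actually held – but note this is a sketch: Magento may prefix the name further (e.g. with the database name), so verify against your own setup.

```shell
# Rebuild the lock name used by tryRunJob():
# LOCK_PREFIX . md5(groupId . '_' . jobCode)
HASH=$(printf '%s' 'default_indexer_reindex_all_invalid' | md5sum | cut -d' ' -f1)
LOCK_NAME="CRON_GROUP_${HASH}"
echo "$LOCK_NAME"

# With the MySQL lock backend, check whether anything holds that lock
# (Magento may prepend its own prefix, e.g. the database name):
# mysql magento2 -e "SELECT IS_USED_LOCK('${LOCK_NAME}');"
```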

I hope this helps you fix your issue, or at least points you in the right direction. Remarks and questions are welcome in the comments or by e-mail!