[December 2019] Google has de-indexed a large number of pages from the SERPs: bug or algorithmic penalty?

Recently, a large number of website managers, both in Italy and worldwide, have reported a mass de-indexing of their web pages from Google Search, for no apparent reason.

Most of them say they were hit by the alleged algorithmic penalty (assuming that is what it is) between 16 and 17 December 2019, with further deterioration towards the end of the year.

Those hit by this hurricane report that the situation worsens day after day, with pages progressively disappearing from the SERPs, even for sites that ranked in the top positions for a large number of different keywords.

What happened to the SERPs?

The main events that characterize this situation are described in the following paragraphs.

1. Pages have lost indexing

Around 15, 16 and 17 December, a large number of pages lost their indexing and have since disappeared completely from Google Search.

At the same time, however, the pages still appear in Google Search Console, as can be verified with the “URL Inspection” function.

2. The “site:” operator does not return the correct pages

Using the “site:<domain>” operator on Google Search returns either no pages at all or far fewer pages than were previously indexed.


Reports from owners of different sites in different niches suggest that the number of pages returned by the “site:<domain>” operator decreases day by day, presumably as the Google crawler passes over the site.

With reference to the previous point, one wonders why the “come-registrare-lo-schermo-in-ubuntu-con-simplescreenrecorder” page is indexed but invisible to the “site:” operator. A thread on this topic (in English) has been opened on the Google support pages.

3. The “site: + keyword” operator does not return the correct pages

Using the “site:<domain> <keyword>” operator on Google Search returns some pages related to the given keyword, but the same pages are not displayed when the “site:<domain>” command is used on its own.

4. Pages are “Indexed, not submitted in sitemap”

The “Coverage” report in Google Search Console typically shows a roughly correct number of valid pages, but the details reveal that they are all classified as “Indexed, not submitted in sitemap”.

Sites that were not affected by this hit correctly show a “Submitted and indexed” entry.
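
One way to double-check this symptom is to confirm that the pages reported as indexed but not submitted really are listed in the sitemap. The following is only a minimal sketch: it assumes the list of flagged URLs has been exported by hand from the coverage report, and the sample sitemap, the example.com URLs and the function names are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> set[str]:
    """Return the set of <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def wrongly_flagged(sitemap_xml: str, flagged_urls: list[str]) -> list[str]:
    """URLs flagged as 'not submitted' that the sitemap actually lists."""
    listed = sitemap_urls(sitemap_xml)
    return sorted(url for url in flagged_urls if url in listed)


# Hypothetical sample data: a two-page sitemap and two flagged URLs.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/page-a</loc></url>
  <url><loc>https://example.com/page-b</loc></url>
</urlset>"""

FLAGGED = ["https://example.com/page-a", "https://example.com/page-c"]

# page-a is in the sitemap, so flagging it as "not submitted" is suspicious;
# page-c is genuinely absent from the sitemap.
print(wrongly_flagged(SAMPLE_SITEMAP, FLAGGED))
```

A non-empty result means the report and the sitemap disagree, which is consistent with the anomaly described here rather than with a sitemap that is simply incomplete.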

5. Submitting the sitemap shows anomalies

Submitting the sitemap via Google Search Console does not return errors (“Success” status), yet Google does not seem to take it into account: the option to view the index coverage remains disabled (light gray and not clickable).

A thread on this topic has been opened on Google’s support pages (content in English), but the experts once again either denied the problem or directed the user to further checks.

6. The “Mobile Usability” report shows a collapse in the number of valid pages

What preliminary checks have been made?

I have been able to verify these issues personally on several sites I own; the checks I carried out are described in the following paragraphs.

1. Webserver log

2. Robots.txt

There are no problems with the robots.txt file.
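
A check of this kind can be approximated offline with Python's standard-library robots.txt parser. This is a rough sketch under stated assumptions: the robots.txt content and the example.com URLs below are hypothetical, and the standard-library parser may not match Google's own parsing in every edge case.

```python
from urllib.robotparser import RobotFileParser


def blocked_for_googlebot(robots_txt: str, urls: list[str]) -> list[str]:
    """Return the URLs that the given robots.txt forbids Googlebot to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch("Googlebot", url)]


# Hypothetical robots.txt content, similar to what a WordPress site might use.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
"""

# Hypothetical URLs to test (example.com is a placeholder domain).
URLS = [
    "https://example.com/",
    "https://example.com/some-article/",
    "https://example.com/wp-admin/options.php",
]

# Only the admin URL should be reported as blocked.
print(blocked_for_googlebot(ROBOTS_TXT, URLS))
```

If any article URL shows up in the returned list, the robots.txt file would be a plausible cause of the de-indexing; an empty list for the content pages supports the conclusion that it is not.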

3. Manual actions and/or security problems

There are no manual actions or security issues in Google Search Console.

4. Incoming links

There are no particular anomalies regarding the incoming links, which are roughly the same before and after the hit.

5. Site speed

There are no particular anomalies regarding the speed of the site; a test carried out with PageSpeed Insights gives a score of 83/100.

6. Redirects, http vs https, www vs non-www

There are no anomalies in the redirects, in the handling of the http vs https protocols, or in the handling of the www vs non-www forms of the domain name.
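
The redirect check can be sketched as follows: build the four scheme and host combinations of the home page and verify that they all land on a single URL. The function names are hypothetical, example.com is a placeholder, and the demonstration uses a stand-in resolver so that it runs without network access; passing no resolver makes it follow real redirects with urllib.

```python
from typing import Callable
from urllib.request import urlopen


def url_variants(domain: str) -> list[str]:
    """The four scheme/host combinations that should collapse to one URL."""
    return [
        f"{scheme}://{host}/"
        for scheme in ("http", "https")
        for host in (domain, f"www.{domain}")
    ]


def final_url(url: str) -> str:
    """Follow redirects over the network and return the landing URL."""
    with urlopen(url, timeout=10) as response:
        return response.url


def landing_urls(domain: str, resolve: Callable[[str], str] = final_url) -> set[str]:
    """Distinct landing URLs across all variants; one element means consistency."""
    return {resolve(url) for url in url_variants(domain)}


# Offline demonstration with a stand-in resolver (an assumption for this
# example): it pretends every variant redirects to the https, non-www home page.
def fake_resolve(url: str) -> str:
    return "https://example.com/"


print(url_variants("example.com"))
print(landing_urls("example.com", fake_resolve))
```

A result containing more than one landing URL would indicate that the same content is reachable under several addresses, which is the kind of anomaly this check is meant to rule out.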

7. Duplicate or copied content

What sites have been affected and what features do they have?

Among the sites I own, all of them vertical sites dealing with a single information niche, 5 are currently affected by this problem.

To help identify any common factors, their main characteristics are listed below.

Site   | Niche      | Hosting                | WordPress theme | Advertising | Affiliate links
Site 1 | Technology | Hosting A (IP Italy)   | Theme A         | YES         | YES
Site 2 | Technology | Hosting B (IP USA)     | Theme A         | NO          | NO
Site 3 | Energy     | Hosting A (IP Italy)   | Theme A         | NO          | YES
Site 4 | Automotive | Hosting C (IP Germany) | Theme B         | NO          | NO
Site 5 | Home decor | Hosting B (IP USA)     | Theme C         | NO          | NO

As the table shows, there does not seem to be any correlation with a particular hosting provider, theme, advertising, or affiliate links.
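
The same comparison can be made mechanically. The sketch below transcribes the table into a list of records (the field names are my own shorthand, not anything from Google's tools) and looks for any attribute value shared by all five sites.

```python
from collections import Counter

# Data transcribed from the table above; field names are illustrative.
SITES = [
    {"niche": "Technology", "hosting": "A", "theme": "A", "ads": True, "affiliate": True},
    {"niche": "Technology", "hosting": "B", "theme": "A", "ads": False, "affiliate": False},
    {"niche": "Energy", "hosting": "A", "theme": "A", "ads": False, "affiliate": True},
    {"niche": "Automotive", "hosting": "C", "theme": "B", "ads": False, "affiliate": False},
    {"niche": "Home decor", "hosting": "B", "theme": "C", "ads": False, "affiliate": False},
]


def shared_values(sites: list[dict]) -> dict:
    """For each attribute, keep the value only if every site has that same value."""
    shared = {}
    for key in sites[0]:
        values = {site[key] for site in sites}
        if len(values) == 1:
            shared[key] = values.pop()
    return shared


def most_common_value(sites: list[dict], key: str) -> tuple:
    """Most frequent value of one attribute and how many sites have it."""
    return Counter(site[key] for site in sites).most_common(1)[0]


print(shared_values(SITES))               # {} : no attribute is common to all sites
print(most_common_value(SITES, "theme"))  # ('A', 3)
```

No attribute is shared by all five affected sites, and even the most frequent theme covers only three of them. With a sample this small that proves nothing statistically, but it is consistent with the observation that none of these factors explains the hit.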

Conclusion

From what I have observed, it is not possible to explain what happened either from a technical point of view or from a content point of view. Google, for its part, has not communicated any problem or anomaly on its systems, and attributes each case to specific problems on the individual site.

But how is it possible that, in the space of two days, a large number of different websites all over the world underwent this kind of total de-indexing of their pages?