TL;DR

What exactly is the situation with the leaked Yandex ranking factors, and what can search marketers actually learn from them?

The search marketing community is attempting to make sense of the leaked Yandex repository, which contains files that appear to list search ranking factors.

Some may be looking for actionable SEO tips, but that is unlikely to be the true value.

The general consensus is that its real value lies in building a broader understanding of how search engines work.

There Is A Lot To Discover

Ryan Jones (@RyanJones) thinks this leak is significant.

He's already tested some of the Yandex machine learning models on his own machine.

Ryan is convinced that there is much to learn, but that it will require much more than simply reviewing a list of ranking factors.

Ryan elaborates:

"While Yandex is not Google, there is a lot we can learn from this in terms of similarity.

Yandex makes extensive use of Google-developed technology. They specifically mention PageRank, as well as Map Reduce and BERT, among other things.

The factors will obviously differ, as will the weights assigned to them, but the computer science methods used to analyse text relevance and link text and perform calculations will be very similar across search engines.

I believe the ranking factors can provide valuable insight, but simply looking at the leaked list is insufficient.

When looking at the default weights (before ML), there are some negative weights that SEOs would assume are positive, and vice versa.

There are also a LOT more ranking factors calculated in the code than what has been listed in the lists of ranking factors that have circulated.

That list appears to be only static factors, with no mention of how they calculate query relevance or the many dynamic factors that relate to the resultset for that query."

Far More Than 200 Ranking Factors

According to the leaked information, Yandex employs 1,923 ranking factors (some say fewer).

According to Christoph Cemper (LinkedIn profile), founder of Link Research Tools, there are many more ranking factors.

Christoph mentioned:

"Friends have witnessed:

  • 275 personalization factors
  • 220 "web freshness" factors
  • 3,186 image search factors
  • 2,314 video search factors
  • And there is still a lot to be discovered.

The fact that Yandex has hundreds of factors for links is probably the most surprising for many."

The point is that this is far more than the 200+ ranking factors Google has claimed.

Even Google's John Mueller has said the company has moved away from citing the "200+ ranking factors" figure.

So perhaps this will assist the search industry in shifting away from thinking of Google's algorithm in those terms.

Nobody Is Aware of Google's Entire Algorithm?

What's striking about the data leak is how easily the ranking factors were gathered and organised.

The leak calls into question the idea that Google's algorithm is closely guarded and that no one, including Google employees, knows the entire algorithm.

Is it possible that Google has a spreadsheet with over a thousand ranking factors?

Christoph Cemper challenges the notion that no one knows Google's algorithm.

Christoph made the following comment to Search Engine Journal:

"On LinkedIn, someone commented that he couldn't imagine Google "documenting" ranking factors in such a way.

However, that is how a complex system like that must be built. This information was obtained through a reliable insider.

Google also has code that could be leaked.

For a techie like me, the frequently repeated statement that not even Google employees are aware of the ranking factors seemed absurd.

The number of people who have all of the information will be very small.

But it has to be in the code, because the code is what drives the search engine."

Which Yandex Features Are Similar To Google's?

The leaked Yandex files hint at how search engines work.

The data does not demonstrate how Google operates. However, it does allow you to see a portion of how a search engine (Yandex) ranks search results.

What is in the data should not be confused with what Google may do with it.

Nonetheless, there are some interesting parallels between the two search engines.

MatrixNet Is Not the Same as RankBrain

One of the intriguing discoveries is related to the Yandex neural network called MatrixNet.

MatrixNet is an older technology, first introduced in 2009 (archive.org link to announcement).

Contrary to popular belief, MatrixNet is not the Yandex equivalent of Google's RankBrain.

Google RankBrain is a constrained algorithm that focuses on understanding the 15% of search queries that Google has never seen before.

According to the Bloomberg article, RankBrain's purpose is limited:

"If RankBrain encounters a word or phrase it is unfamiliar with, the machine can guess which words or phrases may have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries."
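The word-similarity idea described in that quote can be sketched with word vectors and cosine similarity. Note this is only an illustration of the general technique, not RankBrain itself; the three words and their 3-dimensional vectors below are invented for the example.

```python
import math

# Toy word vectors (invented): real systems learn high-dimensional
# embeddings from text, where similar words end up with similar vectors.
VECTORS = {
    "car":        (0.90, 0.10, 0.00),
    "automobile": (0.85, 0.15, 0.05),
    "banana":     (0.00, 0.20, 0.90),
}

def cosine(a, b):
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_similar(word_vector):
    """Guess which known word an unseen word's vector most resembles."""
    return max(VECTORS, key=lambda w: cosine(VECTORS[w], word_vector))
```

An unseen query term whose vector lands near "car" and "automobile" would be treated like those words, which is the gist of handling never-before-seen queries.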

MatrixNet, on the other hand, is a multi-purpose machine learning algorithm.

It classifies search queries and then applies the appropriate ranking algorithms to those queries.

This is from the 2016 English-language announcement of the 2009 algorithm:

"MatrixNet can generate a very long and complex ranking formula that takes into account a plethora of different factors and their combinations.

Another useful feature of MatrixNet is the ability to tailor a ranking formula to a specific class of search queries.

In addition, tweaking the ranking algorithm for, say, music searches will not degrade the ranking quality for other types of queries.

A ranking algorithm is analogous to complicated machinery with dozens of buttons, switches, levers, and gauges. In most cases, turning a single switch in a mechanism causes a global change in the entire machine.

MatrixNet, on the other hand, allows for the adjustment of specific parameters for specific classes of queries without requiring a major overhaul of the entire system.

MatrixNet can also automatically select sensitivity for specific ranges of ranking factors."

MatrixNet does a lot more than RankBrain; they are clearly not the same.

What's cool about MatrixNet is that its ranking factors are dynamic in the sense that it classifies search queries and applies different factors to them.
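To make the "different factors per query class" idea concrete, here is a minimal sketch of that dispatch pattern. The query classes, feature names, and weights are all invented for illustration; the real system learns a far more complex formula per class rather than using hand-set weights.

```python
# Illustrative sketch of per-query-class ranking (not Yandex's code).

def classify_query(query: str) -> str:
    """Toy query classifier: route each query to a ranking-formula class."""
    q = query.lower()
    if any(word in q for word in ("song", "lyrics", "album")):
        return "music"
    if any(word in q for word in ("news", "today", "latest")):
        return "news"
    return "general"

# One weight set per class: tweaking "music" leaves "news" untouched,
# which is the property the MatrixNet announcement describes.
WEIGHTS = {
    "music":   {"text_relevance": 0.5, "freshness": 0.1, "link_score": 0.4},
    "news":    {"text_relevance": 0.3, "freshness": 0.6, "link_score": 0.1},
    "general": {"text_relevance": 0.4, "freshness": 0.2, "link_score": 0.4},
}

def score(query: str, doc_features: dict) -> float:
    """Combine document features using the formula for the query's class."""
    weights = WEIGHTS[classify_query(query)]
    return sum(w * doc_features.get(name, 0.0) for name, w in weights.items())
```

A "news" query here weights freshness heavily, while a "music" query leans on text relevance and links, so the same document can score differently depending on the query class.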

MatrixNet is mentioned in a few of the ranking factor documents, so it's important to put MatrixNet in the proper context so that the ranking factors make sense.

To make sense of the Yandex leak, it may be useful to learn more about the Yandex algorithm.

Some Yandex Factors Correspond To SEO Practices

Dominic Woodman (@dom_woodman) has some insightful comments about the leak.

Alex Buraks (@alex_buraks) has published a massive Twitter thread on the subject, which contains echoes of SEO practices.

One such factor that Alex mentions is optimising internal links in order to reduce crawl depth for important pages.

Google's John Mueller has long encouraged publishers to prominently link to important pages.

Mueller advises against burying critical pages deep within the site architecture.

In 2020, John Mueller stated:

"What will happen is that we will see that the home page is very important, and that things linked from the home page are also very important.

And then... as it moves away from the home page, we'll probably think this is less important."

It is critical to keep important pages close to the main pages that site visitors enter through.

So, because inbound links tend to point to the home page, the pages linked directly from it are considered more important.

Crawl depth was not mentioned by John Mueller as a ranking factor. He simply stated that it informs Google about which pages are important.

Alex cites a Yandex rule that uses crawl depth from the home page as a ranking factor.

It makes sense to treat the home page as the starting point of importance and to assign less importance the more clicks a page sits away from it.
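Crawl depth in this sense is just the minimum number of clicks from the home page, which can be sketched as a breadth-first search over the site's link graph. The site structure below is invented for illustration.

```python
from collections import deque

def crawl_depths(links: dict, home: str) -> dict:
    """Return the minimum click depth from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit via BFS = fewest clicks
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Toy site graph: each page maps to the pages it links to.
site = {
    "/": ["/products", "/blog"],
    "/products": ["/products/widget"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/archive/old-page"],
}
depths = crawl_depths(site, "/")
# "/products" sits one click from the home page, while "/archive/old-page"
# is buried three clicks deep and would read as less important.
```

Adding a home-page link to a buried page immediately drops its depth to 1, which is exactly the internal-linking advice above.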

Similar ideas can be found in Google research papers (Reasonable Surfer Model, Random Surfer Model), which calculated the likelihood that a random surfer would end up at a given webpage simply by following links.
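The random surfer model can be sketched as the power iteration behind PageRank: a surfer follows a random outlink with probability d, or jumps to a random page with probability 1 - d, and a page's score is its long-run visit rate. The link graph below is a toy example; d = 0.85 is the damping factor from the original PageRank paper.

```python
def pagerank(links: dict, d: float = 0.85, iterations: int = 50) -> dict:
    """Random surfer model: iterate until visit probabilities settle."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with a uniform distribution
    for _ in range(iterations):
        new_rank = {p: (1 - d) / n for p in pages}  # random-jump share
        for page, outlinks in links.items():
            if outlinks:
                share = d * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:  # dangling page: spread its rank over all pages
                for p in pages:
                    new_rank[p] += d * rank[page] / n
        rank = new_rank
    return rank

# Toy graph: two pages linking to each other end up with equal rank.
ranks = pagerank({"a": ["b"], "b": ["a"]})
```

The reasonable surfer refinement weights some links (prominent, likely-to-be-clicked ones) more heavily than others, but the underlying follow-the-links computation is the same.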

For many years, the rule of thumb for SEO has been to keep important content within a few clicks of the home page (or from inner pages that attract inbound links).

Yandex Vega Update... In Relation To Expertise And Authority?

Yandex's search engine received the Vega update in 2019, which included neural networks trained with the help of subject matter experts.

The goal of this 2019 update was to include expert and authoritative pages in search results. However, search marketers sifting through the documents have yet to discover anything that correlates with things like author bios, which some believe are related to the expertise and authority that Google seeks.
