Leaked Google Documents May Reveal How Google Indexing Works


I have just seen a leaked document believed to give insights into Google’s search ranking systems.

While the document emphasises that the leak is unconfirmed, it speculates on various potential ranking factors based on these leaks.

Here’s a breakdown of the key points from the document:

Domain Authority Likely Exists

The document suggests that Google might use a “Domain Authority”-like metric, referred to as “siteAuthority” and “authorityPromotion.” This could mean that Google considers the overall authority of a domain, not just individual pages, for ranking.

This is significant because Google has long denied using a domain-wide authority score, focusing instead on individual page metrics like PageRank.

Clicks Do Matter

According to the leak, click behavior plays a larger role in rankings than Google has publicly acknowledged. Systems like **NavBoost** reportedly adjust rankings based on user interactions, including good clicks, bad clicks, and even "last longest clicks" (how long users stay on a page before returning to the search results).

This contradicts Google’s earlier statements that clicks don’t influence rankings, suggesting that user behavior might indeed shape how Google ranks pages.
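To make the click categories a little more concrete, here is a minimal Python sketch of how a single search session could be labelled. It is purely illustrative: the leak only names the signal categories, so the `Click` class, the `summarize_session` function, the 10-second threshold, and the reading of "last longest click" as the click with the longest dwell time in the session are all my own assumptions, not anything Google is known to do.

```python
from dataclasses import dataclass

# Hypothetical threshold: the leak does not say what separates a good click from a bad one.
BAD_CLICK_MAX_SECONDS = 10.0


@dataclass
class Click:
    url: str
    dwell_seconds: float  # time spent on the page before returning to the results


def summarize_session(clicks):
    """Label each click in one search session and pick the longest-dwell click."""
    summary = {"good": [], "bad": [], "last_longest": None}
    longest = -1.0
    for click in clicks:
        bucket = "bad" if click.dwell_seconds < BAD_CLICK_MAX_SECONDS else "good"
        summary[bucket].append(click.url)
        # ">=" keeps the later click when two dwell times tie
        if click.dwell_seconds >= longest:
            longest = click.dwell_seconds
            summary["last_longest"] = click.url
    return summary


session = [
    Click("a.example", 4.0),
    Click("b.example", 95.0),
    Click("c.example", 30.0),
]
result = summarize_session(session)
```

In this toy session, `a.example` counts as a bad click because the visitor bounced back after four seconds, while `b.example` holds the longest dwell time.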

The Google Sandbox is Real

The “Sandbox” theory — that new websites are held back from ranking well for a period — is supported by the leaked information. The document mentions a metric called **hostAge**, suggesting that older, trusted websites rank faster, while new websites must build credibility over time before escaping the "Sandbox."

Google Might Whitelist Websites

During critical events, such as the COVID-19 pandemic, Google might have whitelisted certain authoritative sites to promote their content.

This applies to industries like travel, health, and politics, where Google could have given preferential treatment to ensure trustworthy content appeared in the top search results.

Chrome Data Might Be Used in Rankings

Although Google has historically denied using data from the Chrome browser to influence rankings, the leak hints that Google might track user interactions via Chrome. This could include site visits, time spent on pages, and other user behaviors that might impact rankings.

The mention of **uniqueChromeViews** and **chromeInTotal** implies that Chrome user data may play a more active role than previously admitted.

No Single Ranking Algorithm

Google doesn’t use a singular ranking algorithm. Instead, it employs a network of algorithms that perform different functions. These systems include:

- **Trawler** for crawling

- **Alexandria** and **SegIndexer** for indexing

- **Mustang** and **NavBoost** for ranking and re-ranking

- **Twiddler** to tweak final search results

Google “Twiddles” with Search Results

The **Twiddler** system appears to make final adjustments to search results before they’re presented.

Twiddlers adjust rankings based on factors like user behavior and real-time events, boosting or demoting certain sites.
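One way to picture a twiddler is as a small function that nudges a score produced by an earlier ranking stage. The Python sketch below is a guess at the general shape only: the `Result` fields, the two demotion functions, the multipliers, and the thresholds are hypothetical, loosely modelled on the SERP and exact match domain demotions mentioned in the leak.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Result:
    url: str
    score: float            # score from an earlier ranking stage
    ctr: float              # hypothetical click-through rate signal
    exact_match_spam: bool  # hypothetical flag for a spammy exact match domain


# A twiddler here is any function that returns a score multiplier for a result.
Twiddler = Callable[[Result], float]


def serp_demotion(result: Result) -> float:
    # Assumed rule: results with a very low click-through rate lose 20%.
    return 0.8 if result.ctr < 0.02 else 1.0


def emd_demotion(result: Result) -> float:
    # Assumed rule: spammy exact match domains lose half their score.
    return 0.5 if result.exact_match_spam else 1.0


def apply_twiddlers(results: List[Result], twiddlers: List[Twiddler]) -> List[Result]:
    """Multiply each score by every twiddler's factor, then re-sort by score."""
    adjusted = []
    for result in results:
        score = result.score
        for twiddler in twiddlers:
            score *= twiddler(result)
        adjusted.append(replace(result, score=score))
    return sorted(adjusted, key=lambda r: r.score, reverse=True)


initial = [
    Result("bestcheaplaptops.example", 0.90, 0.05, True),
    Result("laptop-reviews.example", 0.80, 0.10, False),
    Result("slow-clicks.example", 0.85, 0.01, False),
]
final = apply_twiddlers(initial, [serp_demotion, emd_demotion])
```

The site that started in first place ends up last once both adjustments are applied, which is the kind of late-stage reshuffle the leak describes.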

Authorship Still Matters

Google might still store author-related data, such as “isAuthor” and “authorName,” implying that the credibility of an article’s author could play a role in rankings.

This aligns with the SEO concept of E-E-A-T (Experience, Expertise, Authoritativeness, Trustworthiness), where author credibility is believed to boost search rankings.

Anchor Text Importance

Anchor text in backlinks remains crucial. The document emphasizes that irrelevant or mismatched anchor text can harm rankings.

Google analyzes both the link's source and the destination to ensure relevance, and misaligned anchor text might lead to demotion.

Exact Match Domains (EMDs)

Exact match domains (EMDs) like “BestCheapLaptops.com” were once a strong ranking factor but are now subject to possible demotions.

The leak suggests Google uses **exactMatchDomainDemotion** to prevent spammy or manipulative EMDs from ranking well, though such domains may have benefited from a boost in the past.

Panda Algorithm: Behavior & Links

The **Panda** update, previously thought to focus only on content quality, also considers user behavior and backlinks. The update assigns a **Site Quality Score** based on these factors, which can either boost or demote a site.

Google Demotions

Several types of demotions were mentioned, including:

- **Anchor Mismatch Demotion**: Mismatched anchor text and content can lead to penalties.

- **SERP Demotion**: Sites with poor click-through rates may be ranked lower.

- **Exact Match Domain Demotion**: Spammy EMDs are likely to face demotions.

Content Length and Originality Matter

Google’s token system tracks content originality and length. While longer content is often seen as better, the document suggests that the key information should be presented early in the content to ensure Google recognizes it.

Quality Raters’ Impact

Google employs human quality raters to assess content quality. The leak suggests that these evaluations might influence rankings, though the extent is unclear. Google’s **Search Quality Rater Guidelines** provide insight into how raters evaluate websites.

Backlinks Still Important

Backlinks remain a critical factor in Google’s ranking algorithm. The leak reinforces the significance of **PageRank** and how the overall authority of a website, combined with the quality and relevance of its backlinks, plays a major role in rankings.

Traffic Impacts Link Value

Google evaluates links based on the traffic of the referring page. A link from a high-traffic page is deemed more valuable than one from a low-traffic page.

Backlink Velocity

Google tracks the rate at which websites gain or lose backlinks over time, referred to as **backlink velocity**. Rapid link growth can raise red flags, potentially leading to penalties.
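If you want to keep an eye on this for your own site, the idea is easy to sketch in Python. The leak does not describe how the rate is calculated or what counts as "rapid", so the monthly snapshots, the function names, and the growth ratio of 2.0 below are assumptions for illustration only.

```python
def backlink_velocity(counts):
    """Net change in backlinks between consecutive snapshots."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def flag_spikes(counts, max_growth_ratio=2.0):
    """Return the indices of snapshots that grew faster than the assumed ratio."""
    flagged = []
    for i in range(1, len(counts)):
        previous = counts[i - 1]
        if previous > 0 and counts[i] / previous > max_growth_ratio:
            flagged.append(i)
    return flagged


# Hypothetical backlink totals, one per month
monthly_backlinks = [100, 110, 125, 400, 390]
velocity = backlink_velocity(monthly_backlinks)
spikes = flag_spikes(monthly_backlinks)
```

Here the jump from 125 to 400 links is the only month that crosses the assumed threshold, so it is the one you would want to be able to explain.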

Content Changes Are Tracked

Google maintains a history of changes to websites, keeping records of up to 20 versions. This makes it difficult to “start fresh” without Google recognizing previous iterations of a page.
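A capped history like this is simple to model. The Python sketch below assumes a fixed-size store that drops the oldest version once the limit is reached; only the figure of 20 comes from the leak, and the `PageHistory` class and its behaviour are hypothetical.

```python
from collections import deque

MAX_VERSIONS = 20  # the figure reported in the leak


class PageHistory:
    """Keeps only the most recent versions of a page (hypothetical model)."""

    def __init__(self):
        # A deque with maxlen silently discards the oldest entry when full.
        self._versions = deque(maxlen=MAX_VERSIONS)

    def record(self, content):
        # Only store a new version when the content actually changed.
        if not self._versions or self._versions[-1] != content:
            self._versions.append(content)

    def versions(self):
        return list(self._versions)


history = PageHistory()
for n in range(25):
    history.record(f"revision {n}")
kept = history.versions()
```

Under this model, a page would need more than 20 distinct revisions before its earliest version dropped out of the record.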

User Signals

User signals, such as click-through rates and dwell time, are crucial ranking factors. The **NavBoost** system reportedly rewards sites that receive positive user engagement while demoting those with poor user signals.

Content Freshness

Google tracks the freshness of content using metrics like the **bylineDate**, **syntacticDate**, and **semanticDate**, ensuring that regularly updated content ranks better.

Domain Registration

Google collects domain registration information, including ownership and history, though it's unclear how or if this influences rankings.

Small Personal Sites Flagged

The document mentions a metric that identifies “small personal sites,” though it’s not clear what impact this has on rankings.

Conclusion

While the leaked documents provide fascinating insights into Google’s potential ranking systems, it is worth stressing that none of this information is confirmed.

However, the findings align with common SEO practices such as creating quality content, building backlinks, and focusing on user experience.

The leak suggests that some long-held SEO theories, such as the existence of a sandbox or the importance of user signals, might be valid.

Would any of these points change your current SEO strategy?

Dave


Recent Comments


Wow, That's enlightening! Thanks for sharing, Dave!

Other methods to drive traffic will help us overcome the changing nature of SEO:

1. Social Media Marketing
Leverage platforms like Facebook, Instagram, Twitter, and LinkedIn to share your content, interact with your audience, and promote your website. Consistent, engaging posts can help build a following and drive traffic.
2. Content Marketing
Create valuable, shareable content such as blog posts, infographics, or videos. The key is to create something your audience finds useful and wants to share.
Use guest blogging on other sites to increase your exposure and backlink opportunities.
3. Email Marketing
Build an email list and send newsletters or updates to your subscribers. Engaging emails with links back to your website can help maintain traffic from existing audiences.
4. Paid Advertising (PPC)
Platforms like Google Ads or social media ads (Facebook, Instagram, etc.) can provide a targeted way to attract traffic. Paid campaigns can be set up based on specific demographics, interests, or search queries.
5. Influencer Marketing
Partnering with influencers in your niche can help expose your website to a larger audience. This works particularly well in industries where trust and personal recommendation matter.
6. Referral Programs
Implementing a referral or affiliate program encourages existing visitors or partners to refer others to your site in exchange for rewards or commissions.
7. Community Engagement
Participate in forums, online communities (like Reddit, Quora), or social groups where your target audience spends time. Answering questions, offering insights, and dropping relevant links (without spamming) can gradually build traffic.
8. Content Syndication
Republish or share your content on platforms like Medium, LinkedIn, or other relevant content networks to reach a broader audience.
9. Webinars or Online Events
Hosting webinars or live events with topics relevant to your audience can draw attendees who are likely to visit your website afterward for more resources.
10. Podcasting
Launching a podcast, or appearing as a guest on others, can build brand awareness and direct listeners to your site.
11. Collaborations and Partnerships
Partner with other websites, brands, or content creators to co-host events, run joint campaigns, or cross-promote each other's work.
12. Viral Giveaways or Contests
Hosting a contest or giveaway that encourages social sharing, follows, or website visits can generate significant traffic spikes.
13. Video Marketing (YouTube, TikTok)
Videos on platforms like YouTube or TikTok can drive traffic if they’re informative, entertaining, or educational, with clear calls to action leading viewers to your site.
14. Offline Promotion
Don’t overlook offline methods such as speaking at events, handing out business cards, or even promoting your site on printed materials (flyers, brochures, etc.).

This is a summary list I have from ChatGPT that I thought summarized options to go along with our SEO efforts.

Have a great day!

Great list Howard. This adds substantially to the thread, for those who might have got a little downhearted. ;-)

Dave

Glad I could contribute my findings!

Hi Dave

Yes, I've seen this document surface a few times on various sites.

Here's my take on Google, for what it's worth. Lol:

Although Google still promotes EEAT as one of its highest ranking factors, I totally disagree, and User-Generated Content (UGC) proves it. Google is, above all, a for-profit business. The cornerstone of all successful businesses is giving the customer what they want.

The general public is searching more and more for answers through UGC by typing queries into the search bar like "What's the best dog for kids reddit" or "How do I know if I have a problem with my thyroid gland quora."

There is nothing about the average UGC site that meets any of the EEAT criteria. Can you typically rely on basically "anyone's opinion" on sites like Reddit or Quora to provide Experience, Expertise, Authority, or Trust? Still, Google is giving UGC some of the highest rankings!

Between UGC and AI-generated search results, Google is losing its battle for information search dominance. To counter, Google has launched its own AI search with Search Generative Experience (SGE) and Gemini.

These days, non-UGC sites are much more about domain authority and backlinks than anything else. You'll see some of the thinnest content ranked on page one for sites that have a DA of about 60 and above. Sites with high DA typically have high numbers of relevant backlinks.

All those other ranking criteria play a much lesser role.

Everything changes, and search behavior, both how users search and which results they get, is no exception.

Just my opinion based on observation. 😎

Rock On! 🤘
Frank 🎸

Thanks for your thoughts mate. Whatever is true, I think the cat is out of the bag. 😂

Dave

You're welcome, Dave.

Have a great day! 😎
Frank 🎸

Whether the information from this leaked document is genuine is a tricky question, but based on the patterns we’ve seen and what’s in the leak, I’d say there’s some reason to think it might hold truth—but with important caveats. Here’s why:

1. It Fits with What SEOs Have Experienced

A lot of what’s in the leak seems to confirm long-held suspicions in the SEO world. For example:

• Domain Authority: SEOs have long speculated that Google uses some kind of domain-wide authority metric. Google denied it, but the presence of terms like “siteAuthority” and “authorityPromotion” in the leak sounds like what SEOs have suspected for years.

• Clicks Matter: The idea that click data influences rankings has been debated for ages, and this leak seems to support it. If Google collects click data like “Good Clicks” and “Bad Clicks” to gauge user engagement, this would align with how many in the SEO field have already been optimizing sites—aiming for high engagement and low bounce rates.

• The Sandbox: Google has denied the existence of a sandbox for new sites, but many SEOs have noticed that new domains often struggle to rank. The mention of the “hostAge” metric in the leak would explain this slow ranking process for newer websites.

2. The Amount of Detail

The leak has highly specific references to Google’s internal processes, including terms like NavBoost, Twiddler, and specific micro-services like Trawler (for crawling) and Mustang (for ranking).

If someone made this up, they’d have to know an awful lot about Google’s backend systems, which makes me think there could be some truth to the leak.

Also, the sheer complexity described—multiple algorithms working together, systems like NavBoost re-ranking results based on user behavior, etc.—feels very in line with what you’d expect from a company as large and data-driven as Google.

3. It’s Consistent with Google’s Behaviour

Google’s pattern of behaviour also adds weight to the idea that some of this leak could be true. For example, they’ve been caught doing things they publicly denied before, such as tracking users in Incognito Mode.

The leak suggests that Chrome data might be used for ranking, despite Google saying otherwise. This wouldn’t be the first time Google’s public statements were contradicted by internal data.

4. Multiple Sources Point to Similar Ideas

The document references reputable SEO sources like SparkToro and iPullRank, suggesting that the leak isn’t just coming from a single, questionable source.

These outlets have been in the SEO space for a long time, and the fact they’re covering the leak gives it more credibility. It also hints that the leak might not be entirely fabricated if seasoned SEO experts are discussing it seriously.

But Now, the Caveats…

1. Google’s Denial

Historically, Google has always denied anything that implies they’re doing something shady or anti-competitive, like favoring clicks or using Chrome data.

While the leak presents compelling evidence, Google still has the official stance that some of these practices aren’t in play. And, until something official or a reliable whistleblower confirms it, we have to treat this as speculative.

2. Could Be Outdated or Testing Data

There’s a chance that even if this leak is real, it might be outdated information, part of internal tests, or systems Google experimented with but didn’t fully implement.

Just because the data exists doesn’t mean it’s being used. Google runs thousands of experiments every year, and not all of them make it into their public-facing algorithms.

3. SEO Conspiracy Culture

The SEO community is sometimes prone to conspiracies about Google’s ranking factors. A lot of the “unconfirmed” nature of the leak might be playing into these conspiracies.

If the leak was intentionally designed to mislead, it would still “feel” true to a lot of SEOs who have long suspected that Google isn’t telling the whole story about its ranking systems.

This means that while the information sounds plausible, it might also just be tailored to what SEOs want to believe.

My Overall Take

I think there’s a strong chance that some, if not most, of this information is genuine. It fits with what many in the SEO industry have already observed, and it’s detailed enough to sound like actual insider knowledge.

However, without official confirmation, it’s important to approach this leak cautiously. While it may offer insights into Google’s ranking mechanisms, it’s wise not to overhaul SEO strategies based solely on this leak.

Stick to what’s tried and true—quality content, good user experience, and solid backlinks—while keeping an eye on these developments.

Do you think this leak changes the way you’ll approach SEO, or does it just confirm what you’ve already been doing?

Thanks for the information. Although I understand it’s not confirmed, it is probably real. In either case, it all seems to be viable information, and I plan to try to use some of these points in critiquing my own site. Many are above my pay grade, as I don’t have a clue what they are due to my own lack of knowledge of these areas. Very interesting, and thanks for sharing! M

I think the contents explain some of the things that have happened recently in Google-land very well. However, people now have alternative ways of finding info, and Google no longer provides what people are searching for in a consistent way.

I know for a fact that websites that answered questions satisfactorily a couple of years ago are no longer in the index, and that isn't good service for the searcher.

Google search is now offering a censored version of the web. It is warped by its own sense of self.

Remember, power corrupts, and absolute power corrupts absolutely.

Dave

As the owner of a newly launched website, I suppose I should be somewhat discouraged. Still, it is good to know all the factors that give you a good ranking, according to Google.

But we have to consider the present too. Very soon, SearchGPT will be launched, and it will have a huge impact on how sites get ranked.

So I remain open to learning this information, and at the same time, I know many new things are going to arrive very shortly!

That's exactly right. Google will not continue to dominate search like it has for the past 20 years. Other search sites will be more open about how they find and rank content and we'll find visitors coming from a more diverse range of index providers, all interlinked and cross-indexed. That is good for the industry.

Dave
