SEO Articles

Why 100% indexing isn’t possible, and why that’s OK

When it comes to topics like crawl budget, the historic rhetoric has always been that it’s a problem reserved for large websites (classified by Google as 1-million-plus webpages) and medium-sized websites with high content change frequency.

In recent months, however, crawling and indexing have become more common topics on the SEO forums and in questions posed to Googlers on Twitter.

From my own anecdotal experience, websites of varying sizes and change frequencies have, since November, seen greater fluctuations and reported changes in Google Search Console (in both the crawl stats and coverage reports) than they have historically.

A number of the major coverage changes I’ve witnessed have also correlated with unconfirmed Google updates and high volatility from the SERP sensors/watchers. Given that the websites have little in common in terms of stack, niche or even technical issues, is this an indication that 100% indexing (for most websites) isn’t now possible, and that’s OK?

This makes sense.

Google, in its own documentation, outlines that the web is expanding at a pace far outstripping its own capability and means to crawl (and index) every URL.


In the same documentation, Google outlines a number of factors that impact their crawl capacity, as well as crawl demand, including:

The popularity of your URLs (and content).
Its staleness.
How quickly the site responds.
Google’s knowledge (perceived inventory) of the URLs on your website.

From conversations with Google’s John Mueller on Twitter, the popularity of your URL isn’t necessarily impacted by the popularity of your brand and/or domain.

I’ve had first-hand experience of a major publisher not getting content indexed because it wasn’t unique enough compared to similar content already published online, as if it falls below the quality threshold and doesn’t have a high enough SERP inclusion value.

This is why, when working with all websites of a certain size or type (e.g., e-commerce), I lay down from day one that 100% indexing is not always a success metric.

Indexing tiers and shards

Google has been quite open in explaining how their indexing works.

They use tiered indexing (some content is stored on better servers for faster access), and they have a serving index, stored across a number of data centers, that essentially holds the data served in a SERP.

Oversimplifying this further:

The contents of the webpage (the HTML document) are tokenized and stored across shards, and the shards themselves are indexed (like a glossary) so that they can be queried more quickly and easily for specific keywords (when a user searches).
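To make the tokenize-and-shard idea concrete, here is a toy sketch of a sharded inverted index. This is a deliberate oversimplification for illustration only: Google’s actual tokenization, sharding and data structures are not public, and the documents, shard count and hashing scheme below are all assumptions.

```python
import re
from collections import defaultdict

def tokenize(html_text):
    """Naive tokenizer: lowercase and split on non-alphanumerics."""
    return [t for t in re.split(r"[^a-z0-9]+", html_text.lower()) if t]

def build_sharded_index(documents, num_shards=4):
    """Build a tiny inverted index split across shards.
    Each shard maps token -> set of doc IDs; a token's shard is
    chosen by hashing, so a lookup only touches one shard."""
    shards = [defaultdict(set) for _ in range(num_shards)]
    for doc_id, text in documents.items():
        for token in tokenize(text):
            shards[hash(token) % num_shards][token].add(doc_id)
    return shards

def query(shards, token):
    """Look up the single shard that could hold this token."""
    return sorted(shards[hash(token) % len(shards)].get(token, set()))

# Made-up documents standing in for crawled HTML.
docs = {
    1: "<p>Crawl budget for large sites</p>",
    2: "<p>Indexing tiers and shards</p>",
    3: "<p>Crawl stats in Search Console</p>",
}
shards = build_sharded_index(docs)
print(query(shards, "crawl"))  # doc IDs whose content contains "crawl"
```

The point of the glossary-style shard index is that a query for one keyword never has to scan every document, only the shard that owns that token.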

A lot of the time, indexing issues are blamed on technical SEO. If you have a noindex tag, or issues and inconsistencies preventing Google from indexing content, then it is technical. But more often than not, it’s a value proposition issue.

Beneficial purpose and SERP inclusion value

When I talk about value proposition, I’m referring to two concepts from Google’s quality rater guidelines (QRGs), these being:

Beneficial purpose
Page quality

And combined, these create something I reference as the SERP inclusion value. 

This is commonly the reason why webpages fall into the “Discovered – currently not indexed” category within Google Search Console’s coverage report.

In the QRGs, Google makes this statement:

Remember that if a page lacks a beneficial purpose, it should always be rated Lowest Page Quality regardless of the page’s Needs Met rating or how well-designed the page may be.

What does this mean? It means a page can target the right keywords and tick the right boxes, but if it’s broadly repetitive of other content and lacks additional value, Google may choose not to index it.

This is where we come across Google’s quality threshold, a concept for whether a page meets the necessary “quality” to be indexed. 

A key part of how this quality threshold works is that it’s almost real-time and fluid.

Google’s Gary Illyes confirmed this on Twitter: a URL may become indexed when first found and then dropped when new (better) URLs are found, or it may even be given a temporary “freshness” boost from manual submission in GSC.

Working out whether you have an issue

The first thing to identify is whether pages in Google Search Console’s coverage report are being moved from included to excluded.

This graph on its own and out of context is enough to cause concern amongst most marketing stakeholders.

But how many of these pages do you care about? How many of these pages drive value?

You’ll be able to identify this through your collective data. You’ll see if traffic and revenue/leads are decreasing in your analytics platform, and you’ll notice in third-party tools if you’re losing overall market visibility and rank.
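One way to operationalize this triage is to cross-reference an export of excluded URLs from Search Console against an analytics export, flagging only the excluded pages that historically drove traffic. This is a sketch under assumptions: the column names (`url`, `sessions`), the sample data and the session threshold are all hypothetical, not a fixed export format or API.

```python
def find_valuable_excluded_pages(excluded_urls, analytics_rows, min_sessions=50):
    """Flag excluded URLs that historically drove meaningful traffic.
    `excluded_urls`: URLs exported from GSC's coverage report.
    `analytics_rows`: dicts with hypothetical 'url' and 'sessions' keys."""
    sessions_by_url = {row["url"]: row["sessions"] for row in analytics_rows}
    return sorted(
        url for url in excluded_urls
        if sessions_by_url.get(url, 0) >= min_sessions
    )

# Made-up sample exports for illustration.
excluded = ["/blog/a", "/blog/b", "/tag/old"]
analytics = [
    {"url": "/blog/a", "sessions": 120},
    {"url": "/blog/b", "sessions": 3},
    {"url": "/pricing", "sessions": 900},
]
print(find_valuable_excluded_pages(excluded, analytics))  # pages worth investigating
```

Everything this filter drops (excluded pages with little or no traffic) is a candidate for acceptance rather than alarm; only the pages it keeps warrant a deeper look.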

Once you’ve identified that valuable pages are dropping out of Google’s index, the next step is to understand why. Search Console breaks down excluded pages into further categories. The main ones you need to be aware of and understand are:

Crawled – currently not indexed

This is something I’ve encountered more with e-commerce and real estate than any other vertical.

In 2021, the number of new business applications in the U.S. broke previous records, and with more businesses competing for users, there is a lot of new content being published – but likely not a lot of new and unique information or perspectives.

Discovered – currently not indexed

When debugging indexing issues, I find this a lot on e-commerce websites or websites that have deployed a considerable programmatic approach to content creation and published a large number of pages at once.

The main reason pages fall into this category is crawl budget: you’ve just published a large amount of content and new URLs, growing the number of crawlable and indexable pages on the site exponentially, and the crawl budget that Google has determined for your site isn’t geared to this many pages.

There’s not a lot you can do to influence this. However, you can help Google through XML sitemaps, HTML sitemaps and good internal linking to pass page rank from important (indexed) pages to these new pages.
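For the XML sitemap piece, a minimal generator following the sitemaps.org protocol looks like the sketch below. The URLs are placeholders; a real sitemap would be written to a file and referenced from robots.txt or submitted in Search Console.

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls):
    """Generate a minimal XML sitemap (sitemaps.org protocol):
    a <urlset> root with one <url><loc>...</loc></url> per page."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for url in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Placeholder URLs for the newly published pages.
sitemap = build_sitemap([
    "https://example.com/new-page-1",
    "https://example.com/new-page-2",
])
print(sitemap)
```

A sitemap doesn’t force indexing, but it gives Google a complete, explicit inventory of the new URLs rather than leaving discovery to crawling alone.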

The second reason why content may fall into this category is down to quality – and this is common in programmatic content or e-commerce sites with a large number of products and PDPs that are similar or variable products.

Google can identify patterns in URLs, and if it visits a percentage of these pages and finds no value, it can (and sometimes will) make an assumption that the HTML documents with similar URLs will be of equal (low) quality, and it will choose not to crawl them.
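You can approximate this kind of pattern detection on your own URL inventory to see which templates dominate it. The masking rules and the sample paths below are illustrative assumptions, not how Google actually groups URLs.

```python
import re
from collections import Counter

def url_pattern(url):
    """Reduce a URL path to a template by masking numeric and
    long slug-like segments, e.g. /product/12345 -> /product/{id}."""
    path = url.split("?")[0]
    segments = []
    for seg in path.strip("/").split("/"):
        if re.fullmatch(r"\d+", seg) or re.fullmatch(r"[a-z0-9-]{20,}", seg):
            segments.append("{id}")
        else:
            segments.append(seg)
    return "/" + "/".join(segments)

# Made-up crawl inventory.
urls = [
    "/product/1001", "/product/1002", "/product/1003",
    "/category/shoes", "/about",
]
counts = Counter(url_pattern(u) for u in urls)
print(counts.most_common())
```

If one template accounts for most of your inventory and its pages are thin or near-identical, that’s the group most at risk of being written off wholesale.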

A lot of these pages will have been created intentionally with a customer acquisition objective, such as programmatic location pages or comparison pages targeting niche users. But these queries are searched at low frequency, the pages will likely not get many eyes, and the content may not be unique enough versus the other programmatic pages, so Google will not index the low-value-proposition content when other alternatives are available.

If this is the case, you will need to assess and determine whether the objectives can be achieved within the project resource and parameters without the excessive pages that are clogging up crawl and not being seen as valuable.

Duplicate content

Duplicate content is one of the more straightforward issues and is common in e-commerce, publishing and programmatic content.

If the main content of the page, which holds the value proposition, is duplicated across other websites or internal pages, then Google won’t invest the resource in indexing the content.
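One rough way to measure this kind of duplication yourself is shingle-based Jaccard similarity between the main content of two pages. This is a common near-duplicate heuristic, not Google’s actual algorithm, and the two product descriptions below are made up.

```python
def shingles(text, k=3):
    """Set of k-word shingles (overlapping word n-grams) from the text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Two near-identical PDP descriptions differing in a single word.
page_a = "blue running shoes with breathable mesh upper and rubber sole"
page_b = "blue running shoes with breathable mesh upper and foam sole"
similarity = jaccard(shingles(page_a), shingles(page_b))
print(round(similarity, 2))
```

Scores near 1.0 across a batch of pages are a signal that, from a crawler’s point of view, most of them add nothing the others don’t already say.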

This also ties into the value proposition and the concept of beneficial purpose. I’ve encountered numerous examples where large, authoritative websites have had content not indexed because it is the same as other content available – not offering unique perspectives or unique value propositions.

Taking action

For most large websites and decent-sized medium websites, achieving 100% indexing is only going to get harder as Google has to process all existing and new content on the web.

If you find valuable content being deemed below the quality threshold, what actions should you take?

Improve internal linking from pages that are “high value”: This doesn’t necessarily mean the pages with the most backlinks, but those pages that rank for a large number of keywords and have good visibility can pass positive signals through descriptive anchors to other pages.
Prune low-quality, low-value content: If the pages being excluded from the index are low quality and not driving any value (e.g., pageviews, conversions), they should be pruned. Having them live just wastes Google’s crawl resource when it chooses to crawl them, and this can affect its assumptions of quality based on URL pattern matching and perceived inventory.
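The internal-linking point can be illustrated with a tiny PageRank pass over a hypothetical site graph: pages linked from a well-ranked hub accumulate more score than orphaned ones. The graph, damping factor and page paths are all assumptions for illustration; Google’s real ranking signals are far more complex.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Tiny iterative PageRank over an internal-link graph.
    `links`: dict mapping each page to the pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue  # dangling page: its score isn't redistributed here
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical site: the homepage links to a hub, which links to new pages.
links = {
    "/": ["/hub"],
    "/hub": ["/new-1", "/new-2"],
    "/new-1": [],
    "/new-2": [],
}
rank = pagerank(links)
print(sorted(rank, key=rank.get, reverse=True))
```

Even in this toy model, the new pages only receive score because the hub links to them, which is the mechanism the first recommendation above relies on.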

The post Why 100% indexing isn’t possible, and why that’s OK appeared first on Search Engine Land.

Read More

10 Best Live Chat Software For Websites (2022 Reviews)

A good live chat software makes it simple to enhance customer experience and manage support requests at speed. As consumers continue to search for more convenient ways to connect with companies, a website chat function is becoming a must-have for many brands. We reviewed some of the most popular live chat software options on the […]

The post 10 Best Live Chat Software For Websites (2022 Reviews) appeared first on

Read More

Yoast SEO 19.1 & Premium 18.7: More crawl settings

As we’ve said many times, crawlability is an important aspect of SEO. Your site has to do away with technical hurdles that can hinder crawling while keeping the number of crawlable URLs in check. Crawlers are eager to crawl everything they can find on your site — even if you have a small site with content that hardly changes. The overhead is enormous because there are loads of URLs that make no sense for them to crawl. We’re doing something about that. In Yoast SEO Premium 18.7, we’re expanding our brand-new crawl settings with a long list of additional options.

More crawl settings in Yoast SEO Premium

In Yoast SEO Premium 18.6, we introduced the first addition to a new tab called Crawl settings. Within this section, you can find many toggles that let you turn off various things that WordPress automatically adds to your site and that most sites won’t miss. These WordPress additions generate URLs that search engines can and will crawl. At Yoast, we’re on a mission to drastically reduce the number of URLs that any given site outputs, as crawling all of these is a tremendous waste.

Only available in Yoast SEO Premium, these crawling features give you more control over what you want Google to crawl. The crawl settings are in beta, and we welcome your feedback. Together, we can help take your site to another level!

Go Premium and get access to all our features!

Premium comes with lots of features and free access to our SEO courses!

Get Yoast SEO Premium » Only $99 USD per year (ex VAT) for 1 site

What can you expect from the crawl settings?

Here’s the basic crawling list with stuff added by WordPress that you can turn off to prevent search engines from crawling it. In our help documentation on the crawl settings, you’ll find all these options explained.

Short links
REST API links
RSD / WLW links
oEmbed links
Generator tags
Emoji scripts
Pingback HTTP header
Powered by HTTP header

You can also disable a huge amount of feeds generated by WordPress:

Global feed
Global comment feed
Post comment feeds
Post author feeds
Post type feeds
Category feeds
Tag feeds
Custom taxonomy feeds
Search results feeds
Atom/RDF feeds

These settings give you much more flexibility in deciding what you want Google to crawl. We don’t activate these settings for you automatically because it might be that your site benefits from or uses one of these things. We don’t want to break stuff!

Yoast SEO Premium comes with new options to manage crawling

Yoast SEO 19.1 enhancements

Yoast SEO 19.1 comes with several fixes and enhancements. For example, our content analyses have improved, with support for em dashes as punctuation marks in the focus keyphrase field and correct detection of caps in focus keyphrases with a period.

Among many other things, Yoast SEO recognizes keyphrases with hyphens in the slug. Our word forms feature and the filtering out of function words now work for hyphenated keyphrases. For example, the keyphrase ex-husband will be recognized in the slug ex-husband, and so will the keyphrase ex-husbands.

With the word forms feature, we recognize different grammatical forms of your focus keyphrase in various places in your content. You don’t have to write the same form of your keyphrase each time but can diversify your language by changing forms. This makes you write much more naturally!

WooCommerce SEO 14.9: Global identifiers for product variations + Schema

Working with product variations has always been a hassle. WooCommerce, for instance, doesn’t allow you to add global identifiers for product variations — making it hard to add valuable data for Google to discover. Moreover, there’s no simple way to add structured data to this correctly! In WooCommerce SEO 14.9, we’re fixing this.

WooCommerce SEO now outputs the necessary Schema markup for product variations on individual products. Of course, the plugin already outputs all the product variations Schema markup through the offers property. Now, it will also output the global identifiers for product variations under the offers property. In addition, we’ve also added the variations’ SKU and URL to a product’s Schema.
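The resulting markup looks roughly like the illustration below: a schema.org Product whose `offers` carry per-variation global identifiers (GTIN), SKU and URL. This is a hand-written sketch of Product/Offer structured data, not the plugin’s exact output, and all field values are made up.

```python
import json

# Illustrative Product schema with per-variation offers carrying
# global identifiers (gtin13), SKU and URL -- all values are made up.
product_schema = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example T-Shirt",
    "offers": [
        {
            "@type": "Offer",
            "sku": "TSHIRT-RED-M",
            "gtin13": "0123456789012",
            "url": "https://example.com/t-shirt?variation=red-m",
            "price": "19.99",
            "priceCurrency": "USD",
        },
        {
            "@type": "Offer",
            "sku": "TSHIRT-BLUE-M",
            "gtin13": "0123456789029",
            "url": "https://example.com/t-shirt?variation=blue-m",
            "price": "19.99",
            "priceCurrency": "USD",
        },
    ],
}
print(json.dumps(product_schema, indent=2))
```

Embedding this kind of JSON-LD per variation is what lets Google associate each identifier and price with the specific variant rather than the parent product alone.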

Open the product variations tab in WooCommerce and fill in the global identifier details for each variation in the new Yoast SEO section. The details will be automatically added to the product structured data generated by WooCommerce SEO.

You can now add global identifiers to each product variant in WooCommerce SEO

Removing dates from product Schema

Google sometimes shows dates in your search results, which can help users find the most relevant content. But on your WooCommerce product pages, this doesn’t make much sense — so we remove some metadata from your page to discourage them from showing dates here. So, we now remove the datePublished and dateModified attributes on the ItemPage Schema markup for a product.

Update now!

Yoast SEO 19.1, Yoast SEO Premium 18.7, and WooCommerce SEO 14.9 are out today. In Premium 18.7, we’ve vastly expanded the possibilities of the new crawl settings we introduced in 18.6. You now have many options to manage crawling on your WordPress site.

The post Yoast SEO 19.1 & Premium 18.7: More crawl settings appeared first on Yoast.

Read More

Introducing NPD Canada

As my enterprise-level agency, NP Digital, has grown, we’ve been able to expand across the globe. We’ve opened agencies in Brazil, the UK, Australia, India, and now: Canada! That’s right: we have opened a full-service digital marketing agency in Canada. 

And to kick things off with a bang, I’ll be heading to Canada to present at the Collision Conference. I’m so excited to talk about what I’ve learned you can do to ace your digital strategy in the years to come. 

It’s not just me who will be there: Ronnie Malewski, managing director of NPD Canada, as well as Ryan Douglas, VP of Strategy & Performance, will also be teaching a master class on CRO called “What Optimizing 500+ Sites Has Taught Us.”  

Why Did We Choose Canada as Our New NPD International Location?

It’s no secret that international expansion has been a big focus for us for a while now. So why Canada? And why now?

Well, this year, for the first time, digital ad spending will be more than double traditional ad spending, accounting for 68.3% of the total ad market in Canada, and that number is expected to reach 15.4 billion by 2024.

Since Canada is expected to become one of the world’s fastest-growing markets this year in terms of ad spending, it only made sense for us to bring our expertise there to meet the demands of the market.

Services Offered by NPDC

NPD Canada is a full-service digital marketing agency, focusing on strategies and solutions that accelerate growth for your brand. 

Our primary areas of focus are: 

Digital Intelligence:

Strategy and planning
Customer journey mapping
Data analytics and insights
Dashboard development
Email marketing automation
Conversion rate optimization

Earned Media

Our team excels in delivering across the full spectrum of earned media specializing in:

Technical SEO
On-site optimization
Content ideation
Content creation
Link building
Digital PR
Implementation

Paid Media:

Our team of specialists can help take your performance media strategy to the next level across a wide array of channels:

Search (Google Ads, Microsoft Ads)
Social (Meta, LinkedIn, TikTok, Snap, Pinterest, etc.)
Programmatic display & video
Ecommerce (Google Shopping, Amazon, marketplaces)

Tools and Tech:

We have access to the latest tools and technologies that give us the data and UX features you need to grow your brand.

Client Services:

Our client services ensure the cross-channel collaboration you need to drive customer success.

Why Choose NPD Canada? 

So, why should you choose to work with us? 

First, we are bringing serious talent to the table.

Ronnie Malewski, Managing Director, has more than 16 years of experience in digital marketing. In that time, he’s helped both SMB and enterprise brands grow their businesses. He’s worked with major brands like Adidas, Loblaws, and Microsoft. We’re lucky Ronnie is able to bring his years of proven success in the digital marketing space to provide strategic oversight to the clients of NPD Canada.

Other key players include:

Ryan Douglas, VP of Strategy and Performance: Ryan has over a decade of experience driving strategy and media activation for SMB and enterprise brands. He is a subject matter expert in SEM, SEO, display, video, social media, and email. Ryan brings years of experience driving holistic media strategies proven to deliver meaningful business results.
Nikki Lamb, SEO Director: Nikki has years of experience in SEO analytics and has excelled in working across channels to ensure consistency, drive innovation, and maintain operational excellence.
D Doan, Director of Data Analytics: D’s team develops advanced analytics strategies for some of the world’s most recognizable brands.

Our talent isn’t the only reason to choose NPD Canada for your digital marketing partner.

In a matter of three months, we’ve already onboarded seven clients, and we’re growing so quickly that we’re hiring at a rapid pace.

A few other things that make NPD Canada great:

We’re minority founded and minority owned.
We’re supported by NPD U.S., meaning we have access to even more of the brightest minds in digital marketing across multiple service areas.
Our agencies have won over ten awards, and we have 500 clients globally and over 600 employees worldwide.

Are you ready to take your business to the next level by partnering with NPD Canada? Let’s talk. 

Read More

Google releases several Ads Manager updates

Google Ads Manager has added eight new features and updates that advertisers should be aware of. Plus, Google has teased four additional updates that are coming soon.

What’s new. The newly released updates are:

PPID (publisher provided identifier) Time-to-Live (TTL) extension has increased from 90 to 180 days.
Optimized pricing to reflect inventory value. Can be disabled via your network settings.
Facebook’s rebrand as Meta. Publishers will now see “Meta” in all references in the ads manager.
Updated ad experiences video protections. Now called “Block non-instream video ads.”
Troubleshoot transparency files for MCM publishers. Google is aiming to have the SupplyChain Object update for MCM Manage Inventory publishers completed by the end of June.
GA4 integration for web data is now in open beta.
Updates to the “Bid rejection reason” in reporting and data transfer. Additional granularity is being added in both reporting and data transfer to be more specific about scenarios that result in bid losses.
The WebView API for Ads is now available to unlock monetization opportunities.

Coming soon. The following updates are in progress, though Google has not given us a release date.

Updates to Active View measurement.
The enforcement of app-ads.txt for CTV inventory.
Query migration for Ad Exchange Historical report type queries.
Header bidding in yield groups (beta).

Find out more. You can view Google’s release notes here.

Why we care. Google releases a lot of new ad features, updates, and adjustments – often without notice. Being aware of these changes allows us to decide which ones are relevant to us and how/when to implement them.

The post Google releases several Ads Manager updates appeared first on Search Engine Land.

Read More

Google offers to show ad rivals on YouTube

As part of another EU antitrust investigation, Google parent Alphabet offered to let rival ad intermediaries place ads on YouTube. 

European Commission probe. This development is likely to pave the way for Google to settle an EU probe, opened last year, without having to pay a fine. The EU initiated this most recent probe to examine whether the tech giant and largest provider of search and video was giving itself an unfair advantage by restricting rivals’ and advertisers’ access to user data.

Google singled out. An EU watchdog singled out Google’s requirement that advertisers use its Ad Manager to display ads on YouTube, as well as potential restrictions on the way rivals are allowed to serve ads. It is also inquiring into Google’s requirements that advertisers use Display & Video 360 and Google Ads to buy YouTube ads, which is how advertisers purchase an advertising spot on the popular video platform.

A different regulator, Britain’s Competition and Markets Authority (CMA), launched its second probe into Google’s advertising practices last month, saying that the company could be “distorting competition and may have illegally favored its own services.”

Last year Britain imposed a competition regime to prevent Google and Facebook from “using their dominance to push out smaller firms and disadvantage customers.”

More information. You can read the full Reuters article here.

There has been no response from Google on when it plans to allow rivals to place ads on YouTube, or in which countries this would take effect.

Why we care. Rival ads on YouTube would allow other advertisers to compete for market share, which could benefit both advertisers and consumers. There’s no indication yet on what effect this could have on Google, but we’re certainly watching this closely.  

The post Google offers to show ad rivals on YouTube appeared first on Search Engine Land.

Read More