Last Friday I had the pleasure of watching John Mueller of Google being interviewed on the BrightonSEO main stage by (Distilled alumna!) Hannah Smith. I found it hugely interesting how different it was from the previous similarly formatted sessions with John I’ve seen – by Aleyda at BrightonSEO previously, and more recently by my colleague Will Critchlow at SearchLove. In this post, I want to get into some of the interesting implications of what John did and, crucially, did not say.
I’m not going to attempt here to cover everything John said exhaustively – if that’s what you’re looking for, I recommend this post by Deepcrawl’s Sam Marsden, or this transcript via Glen Allsopp (from which I’ve extracted below). This will also not be a tactical post – I was listening to this Q&A from the perspective of wanting to learn more about Google, not necessarily what to change in my SEO campaigns on Monday morning.
Looking too closely?
I’m aware of the dangers of reading too much into the minutiae of what John Mueller, Gary Illyes, and crew come out with – especially when they’re talking live and unscripted on stage. Ultimately, as John said himself, it’s his job to establish a flow of information between webmasters and search engineers at Google. There are famously few people, or arguably no people at all, who know the ins and outs of the search algorithm itself, and it is not John’s job to get into it in this depth.
That said, he has been trained, and briefed, and socialised, to say certain things, to not say certain things, to focus on certain areas, and so on. This is where our takeaways can get a little more interesting than the typical, clichéd “Google says X” or “we think Google is lying about Y”. I’d recommend this presentation and deck from Will if you want to read more about that approach, and some past examples.
So, into the meat of it.
1. “We definitely use links to recognize new content”
Hannah: Like I said, this is top tier sites… Links are still a ranking factor though, right? You still use links as a ranking factor?
John: We still use links. I mean it’s not the only ranking factor, so like just focusing on links, I don’t think that makes sense at all… But we definitely use links to recognize new content.
Hannah: So if you then got effectively a hole, a very authoritative hole in your link graph… How is that going to affect how links are used as a ranking factor or will it?
John: I dunno, we’ll see. I mean it’s one of those things also where I see a lot of times the sites that big news sites write about are sites that already have links anyway. So it’s rare that we wouldn’t be able to find any of that new content. So I don’t think everything will fall apart. If that happens or when that happens, but it does make it a little bit harder for us. So it’s kind of tricky, but we also have lots of other signals that we look at. So trying to figure out how relevant a page is, is not just based on the links too.
The context here is that Hannah was interested in how much of a challenge it is for Google when large numbers of major editorial sites start adding the “nofollow” attribute to all their external links – which has been a trend of late in the UK, and I suspect elsewhere. If authoritative links are still an important trust factor, does this not weaken that data?
The interesting thing for me here was very much in what John did not say. Hannah asks him fairly directly whether links are a ranking factor, and he evades three times, by discussing the use of links for crawling & discovering content, rather than for establishing a link graph and therefore a trust signal:
“We still use links”
“We definitely use links to recognize new content”
“It’s rare we wouldn’t be able to find any of that new content”
There’s also a fourth example, earlier in the discussion – before the excerpt above – where he does the same:
“…being able to find useful content on the web, links kind of play a role in that.”
This is particularly odd as in general, Google is pretty comfortable still discussing links as a ranking factor. Evidently, though, something about this context caused this slightly evasive response. The “it’s not the only ranking factor” response feels like a bit of an evasion too, given that Google essentially refuses to discuss other ranking factors that might establish trust/authority, as opposed to just relevance and baseline quality – see my points below on user signals!
Personally, I also thought this comment was very interesting and somewhat vindicating of my critique of a lot of ranking factor studies:
“…a lot of the times the sites that big news sites write about are sites that already have links anyway”
Yeah, of course – links are correlated with just about any other metric you can imagine, whether it be branded search volume, social shares, click-through rate, whatever.
2. Limited spots on page 1 for transactional sites
Hannah: But thinking about like a more transactional query, for example. Let’s just say that you want to buy some contact lenses, how do you know if the results you’ve ranked first is the right one? If you’ve done a good job of ranking those results?
John: A lot of times we don’t know, because for a lot of these queries there is no objective, right or wrong. They’re essentially multiple answers that we could say this could make sense to show as the first result. And I think in particular for cases like that, it’s useful for us to have those 10 blue links or even 10 results in the search page, where it’s really something like we don’t completely know what you’re looking for. Are you looking for information on these contact lenses? Do you want to buy them? Do you want to compare them? Do you want to buy a specific brand maybe from this-
This is one of those things where I think I could have figured this out from the information I already had, but it clicked into place for me listening to this explanation from John. If John is saying there’s a need to show multiple intents on the first page for even a fairly commercial query, there is an implication that only so many transactional pages can appear.
Given that, in many verticals, there are far more than 10 viable transactional sites, this means that if you drop from being the 3rd best to the 4th best among those, you could drop from, for example, position 5 to position 11. This is particularly important to keep in mind when we’re analysing search results statistically – whether it be in ranking factor studies or forecasting the results of our SEO campaigns, the relationship between the levers we pull and the outputs we see can be highly non-linear. A small change might move you 6 ranking positions, past sites which have a different intent and totally different metrics when it comes to links, on-page optimisation, or whatever else.
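To make that non-linearity concrete, here is a minimal sketch in Python. The slot layout is entirely made up to match the example above (I have no idea how Google actually allocates positions by intent), and the names TRANSACTIONAL_SLOTS and overall_position are just mine for illustration.

```python
# Hypothetical layout: the overall positions that happen to be given to
# transactional results for one query. Only three make it onto page 1;
# the other page 1 positions go to informational, comparison, etc.
TRANSACTIONAL_SLOTS = [1, 3, 5, 11, 12, 14]


def overall_position(rank_among_transactional: int) -> int:
    """Map 'nth best transactional site' to its overall ranking position."""
    return TRANSACTIONAL_SLOTS[rank_among_transactional - 1]


for rank in range(1, 5):
    print(f"transactional rank {rank} -> overall position {overall_position(rank)}")

# Slipping one place among transactional sites costs six overall positions.
drop = overall_position(4) - overall_position(3)
print(f"3rd best to 4th best: down {drop} positions")
```

The point is only that a one-place change in the thing you are actually competing on can show up as a much bigger jump in the number you are tracking.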
3. User signals as a ranking factor
Hannah: Surely at that point, John, you would start using signals from users, right? You would start looking at which results are clicked through most frequently, would you start looking at stuff like that at that point?
John: I don’t think we would use that for direct ranking like that. We use signals like that to analyze the algorithms in general, because across a million different search queries we can figure out like which one tends to be more correct or not, depending on where people click. But for one specific query for like a handful of pages, it can go in so many different directions. It’s really-
So, the suggestion here is that user signals – presumably CTR (click-through rate), dwell time, etc. – are used to appraise the algorithm, but not as part of the algorithm. This has been the line from Google for a while, but I found this response far more explicit and clear than John M’s skirting round the subject in the past.
It’s difficult to square this with some past experiments from the likes of Rand Fishkin manipulating rankings with hundreds of people in a conference hall clicking results for specific queries, or real world results I’ve discussed here. In the latter case, we could maybe say that this is similar to Panda – Google has machine learned what on-site attributes go with users finding a site trustworthy, rather than measuring trust & quality directly. That doesn’t explain Rand’s results, though.
Here are a few explanations I think are possible:
Google just does not want to admit to this, because it’d look spammable (whether or not it actually is)
In fact, they use something like “site recent popularity” as part of the algorithm, so, on a technicality, don’t need to call it CTR or user signals
The algorithm is constantly appraising itself, and adjusts in response to a lot of clicks on a result that isn’t p1 – but the ranking factor that gets adjusted is some arbitrary attribute of that site, not the user signal itself
Just to explain what I mean by the third one a little further – imagine if there are three sites ranking for a query, which are sites A, B, & C. At the start, they rank in that order – A, B, C. It just so happens, by coincidence, that site C has the highest word count.
Lots of people suddenly search the query and click on result C. The algorithm is appraising itself based on user signals, for example, cases where people prefer the 3rd place result, so needs to adjust to make this site rank higher. Like any unsupervised machine learning, it finds a way, any way, to fit the desired outcome to the inputs for this query, which in this case is weighting word count more highly as a ranking factor. As such, result C ranks first, and we all claim CTR is the ranking factor. Google can correctly say CTR is not a ranking factor, but in practice, it might as well be.
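If it helps, the A, B, C thought experiment can be sketched as a toy in Python. To be very clear, this is my own speculation turned into code, not anything Google has described: the sites, feature values, weights, and the reweight_towards helper are all invented. The thing to notice is that clicks only decide which result the adjustment aims for; the scoring function itself never sees a click.

```python
# Toy features per site (all values invented for illustration).
SITES = {
    "A": {"links": 0.9, "word_count": 0.2},
    "B": {"links": 0.6, "word_count": 0.4},
    "C": {"links": 0.5, "word_count": 0.9},
}


def rank(weights):
    """Order sites by a weighted sum of their features. No click data here."""
    def score(site):
        return sum(weights[f] * value for f, value in SITES[site].items())
    return sorted(SITES, key=score, reverse=True)


def reweight_towards(preferred, weights, step=0.1, max_rounds=100):
    """Nudge feature weights until the result users preferred ranks first."""
    weights = dict(weights)  # work on a copy
    for _ in range(max_rounds):
        top = rank(weights)[0]
        if top == preferred:
            break
        # Boost whichever feature most favours the preferred site over the
        # current top result. For C versus A that happens to be word count.
        best = max(weights, key=lambda f: SITES[preferred][f] - SITES[top][f])
        weights[best] += step
    return weights


initial = {"links": 1.0, "word_count": 0.1}
before = rank(initial)
adjusted = reweight_towards("C", initial)  # users kept clicking result C
after = rank(adjusted)

print("before:", before)
print("after: ", after)
print("weights:", adjusted)
```

In this toy, the only thing that changes is the weight on word count, which is exactly the kind of arbitrary attribute I mean.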
For me, the third option is the most contrived, but it also fits most easily with my real world experience. That said, I think either of the other explanations, or even all 3, could be true.
I hope you’ve enjoyed my rampant speculation. It’s only fair that you get to join in too: tweet me at @THCapper, or get involved in the comments below.