CT No.16: How grammar and SEO are intertwined
My thoughts on the BERT update. Also: a snacking solution
This past week I visited my college town and saw old friends, all of whom are doing fabulously well. We talked about work, like we do. Many of my friends didn’t know what “SEO” meant, and I explained — once right as we were walking outside Google’s Chelsea east coast HQ. These friends are worldly, intelligent, very successful in their respective fields, and it was a good reminder: my media/marketing/digital content world isn’t some land of universal knowledge. I mean, I didn’t actually think that. But. I take base knowledge of SEO for granted, like how people generally know what carbs are. But SEO is not carbs.
So, my definition: SEO is the practice of ensuring people can find content through search algorithms, by making that content readable to computers while using the language people actually use. SEO is inextricable from user experience and content creation, and it’s not altogether different from the inverted pyramid format of newspaper writing.
A fair number of industry creators — generally writers and creatives who have worked for the crappy media companies alluded to in last week’s rant — think poorly of SEO as a practice. One day I will write extensively about all of these opinions, many of which are valid because hey! most SEO companies and operators are full of shit and don’t understand what good writing looks like. But I’m only permitting myself one rant monthly, and I spent that last week.
So for today: let’s look at the bright side of SEO updates. And that means looking closely at the unibrowed, finicky, turtlenecked openly gay straight man of Sesame Street: BERT.
On Google’s BERT algorithm update
The BERT update launched on October 24 and stands for the very SEO-friendly “Bidirectional Encoder Representations from Transformers.” (I’m kidding — that is not an SEO-friendly phrase at all unless we are optimizing for Optimus Prime’s diversity initiatives.)
BERT supposedly affects up to 10% of all search queries. It may have affected legacy news websites quite a bit. If you have a large content-driven website, you may have seen a change.
Changes from BERT are difficult for SEO professionals to track. That’s because SEO databases only encompass queries that are searched frequently — usually at least ten times monthly. SEO strategies generally select the highest-volume relevant keyword that applies to a business and optimize for that query, entity optimization and all that. Were this newsletter not on SEO-unfriendly Substack, I would want to optimize for terms like “martech” and “SEO software,” which are highly searched but specific to my subject matter. In SEO parlance we call these highly searched queries “head terms.” If a head term has modifiers attached, we call that a “long-tail” query, meaning literally that the query has more words.
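If you want to see that distinction in code form, here’s a minimal sketch in Python. The head terms, the sample queries, and the matching logic are all hypothetical stand-ins for whatever your keyword tool actually exports; the point is just that a long-tail query is a head term plus the modifiers that spell out intent.

```python
# Hypothetical illustration of head terms vs. long-tail queries.
# The head terms and sample queries below are made up for this sketch;
# a real list would come from your keyword research tool of choice.

HEAD_TERMS = {"seo software", "martech"}

sample_queries = [
    "seo software",
    "seo software for small publishers",
    "martech",
    "best martech stack for a newsletter",
]

def classify(query: str) -> dict:
    """Label a query as head or long-tail and pull out its modifiers."""
    head = next((h for h in HEAD_TERMS if h in query), None)
    if head and query.strip() == head:
        return {"query": query, "type": "head", "modifiers": ""}
    modifiers = " ".join(query.replace(head, "").split()) if head else query
    return {"query": query, "type": "long-tail", "modifiers": modifiers}

for q in sample_queries:
    print(classify(q))
# The long-tail modifiers ("for small publishers") are the intent
# a head term only implies.
```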
Longtime Google users (aka all of us) have been trained to search with basic head terms because for ages, if you searched for “what should I do on my birthday?” you probably received results about kids’ birthday parties. In Google’s legacy algorithms, adult users would receive more useful responses if they Googled “events on December 11.”
Over two decades of learning how to search, that approach required quite a bit of cognitive work on our end. We can’t really measure how search engines changed the way humans processed language and interacted with computers, but we did a lot of work over those years.
“Aaaaaaa—aaaahhhh”: How BERT makes sense of disorganized information
“Aaaaaaa—aaaahhhh” is intended to be the sound that Sesame Street’s Bert makes. Yeah, I hate that my chosen medium is single-sense, but I try to make it work.
Using the same search query as the general population ensures that everyone will get the same result. It’s not the monoculture of 20th-century mass media, which optimized for audiences of millions, but it doesn’t account for the extreme diversity of human thought and written language.
Because 15% of search queries are brand new. That’s a mind-bogglingly high statistic for an SEO like me to comprehend. Fifteen percent of queries have never been searched before! Most of these queries are likely:
Voice searches
Variations in dialect, written or spoken
Longer, more specific queries for whatever reason. The human mind is a wonderland and life is a rich tapestry.
Most people don’t take time to consider what they are saying while they are saying it; if we did that, we’d all go crazy and overanalyze everything. We say words because they feel right and eventually someone understands us. With the internet, spoken and written language have become this mishmash of communication and context. An alien with an English textbook would have no idea what we are saying on the internet or in conversation. (Because Internet by Gretchen McCulloch is a very good book on the shifting linguistics of the extremely online.)
BERT processes sentences in context. I would argue that a fair number of humans don’t know how to process sentences in context. (Insert joke about the current state of politics here.) When people want to clarify their context in a sentence, they add prepositional phrases and other modifiers. A single noun in a sentence may have many meanings, but prepositional phrases clarify intent — like the prepositional phrase “with oatmeal.”
For a long time, prepositional phrases were difficult for search engines to understand: the nouns inside them looked like any other keyword, even though they signal a different intent. With more advanced processing like the BERT update, Google ostensibly understands that “cooking with oatmeal” means Bert only wants to see recipes that list raw oatmeal as an ingredient.
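For the curious: here’s a minimal sketch of what “processing sentences in context” looks like outside Google’s black box, using the open-source bert-base-uncased model from Hugging Face. This is emphatically not Google’s ranking system, and the similarity numbers it prints are a blunt instrument, but it shows the general idea: the model encodes “with oatmeal” as part of the whole sentence rather than as a stray keyword.

```python
# A peek at "sentences in context" using the open-source bert-base-uncased
# model from Hugging Face (not Google's ranking system, just the same
# underlying idea). Requires: pip install torch transformers

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence: str) -> torch.Tensor:
    """Mean-pool BERT's last hidden layer into one sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze()

query = embed("cooking with oatmeal")
recipe = embed("recipes that use raw oatmeal as an ingredient")
cookie = embed("oatmeal cookies to buy near me")

cos = torch.nn.functional.cosine_similarity
print("recipe page:", cos(query, recipe, dim=0).item())
print("cookie shopping:", cos(query, cookie, dim=0).item())
```

The exact scores will vary, and raw embedding similarity is nowhere near a ranking signal on its own, but it’s a decent intuition pump for why “with oatmeal” stopped being invisible to the algorithm.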
Prep phrases are often bad writing: A note to anyone I’ve ever edited
As a writer and editor, I hate prepositional phrases, especially when they are stacked on top of each other (like that, ugh)! Other syntax signals can clarify sentences without additional prepositional phrases.
“A good writer only needs one prepositional phrase per sentence.” — something I’ve said to writers I’ve edited.
As I understand it, the BERT update means that when users add prepositional phrases to a search query, Google can translate those phrases into relevant search results, even when the results don’t contain the prepositional phrase verbatim.
So basically: if your editor has made you find other ways to write a sentence sans prep phrases, you can no longer argue that prepositional phrases are necessary for content to make sense to a search engine. You can use your fancy goodass edited writing and still show up in search results alongside the prep-phrase-ridden farmed content — hopefully higher — as long as the content is helpful and meets the needs of the audience.
That’s the hope anyway. I could be wrong.
The ERNIE method to optimize for BERT: I can’t believe no one has used this acronym yet
So what can you do about the BERT update?
Evaluate prepositional phrases in keyword and search query research for your content. (Long-tail queries tell you what your head terms only imply; there’s a rough sketch of this step below.)
Review your content in light of these prepositional phrases and their intent — are you meeting the needs of those queries?
Negate queries that don’t align with your core audiences or intent.
Identify opportunities to clarify context.
Ensure you’re meeting the needs of your users.
(Thank you for putting up with the bullshit acronym. Let’s just create content that helps users make decisions, take action, understand a concept, what have you.)
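If you like spreadsheets more than acronyms, here’s a back-of-the-napkin sketch of the Evaluate step: pull a query export (Search Console, or whatever tool you use), look for prepositions, and group queries by the prepositional phrase they contain. The file name, the “query” column, and the preposition list below are my assumptions, not anyone’s standard output.

```python
# Back-of-the-napkin version of the "Evaluate" step: scan a search query
# export for prepositions and group queries by the prepositional phrase
# they contain. The queries.csv file, its "query" column, and this
# preposition list are my own assumptions, not any tool's standard output.

import csv
from collections import defaultdict

PREPOSITIONS = {"for", "with", "without", "near", "to", "in", "on", "about"}

def group_by_prepositional_phrase(csv_path: str) -> dict:
    """Group queries by the first prepositional phrase they contain."""
    grouped = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):  # assumes a "query" column exists
            words = row["query"].lower().split()
            for i, word in enumerate(words):
                if word in PREPOSITIONS:
                    grouped[" ".join(words[i:])].append(row["query"])
                    break
    return grouped

# for phrase, queries in group_by_prepositional_phrase("queries.csv").items():
#     print(phrase, "->", len(queries), "queries")
```

From there, each group of modifiers becomes a question for the Review step: does your content actually address “without sugar” or “for beginners,” or does it just mention the head term?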
The joy of frivolous tech solutions to trivial problems: Nachos near me
One of the most common prepositional phrases in search is “near me,” which is what mobile users tack on when they’re not getting the localized results they want. When “near me” became a thing five years ago, I wrote a blog post for my former employer about how these searches were shaping results and behavior on mobile (they shifted the authorship to another employee after I left the company… but they’re not a publication, so that’s fine with me).
While writing, I canvassed my fellow SEOs for their favorite “near me” queries. Mine is “mozzarella sticks near me.” A coworker who had a knack for finding (and building) the weirdest internet phenomena said, “Dogs near me to buy.”
I love this query. I think about the person who searched for “dogs near me to buy” a lot. Because “dogs near me” wasn’t enough — if Google worked literally for “dogs near me,” you’d just get a map of all the dogs in your vicinity. That map would be great! I wish I had that “dogs near me” map. I support dog surveillance, Google, so I can give more good dogs my attention.
“Dogs near me to buy” implies that one of those dogs in that amazing dog proximity map is for sale and the user only wants to see those dogs.
Or maybe it just implies that you would like a “pet store” or “pet adoption agency.” (Again, stacking prepositional phrases generally indicates that there’s a more specific noun available.)
Anyway, “dogs near me to buy” remains one of my favorite queries of all time.
And with that I give you the Database of Nachos for my weekly review. You can find nachos near you, wherever you are.
Database of Nachos at a glance
I realize reviewing the Database of Nachos is a cheat. But I just wrapped a few projects, and I’m mostly reviewing enterprise-level tools right now… and those reviews aren’t ready yet. And I was focusing on my BERT post. So I encourage you to munch on some nachos near you and enjoy some pure digital play.
This week in content tech news
Nick Quah’s Hot Pod celebrated five years this week and published some amazing insights on frustrations within the podcast industry. Hosted on NiemanLab, but you can subscribe to Hot Pod on its own.
Clarification on the differences in definitions of user experience, user-centered design and design thinking, from CMSWire.
Do you read Deez Links? Honestly, half my media news links come from Deez Links. If you like media news, subscribe to Deez Links. If you want to know about the cool new Esquire subscription model or all the things that were published about magazines last week, subscribe to Deez Links.
One quick addendum: Both articles on Conde Nast were very good, but the New York magazine feature identifies a giant operational failure (in my opinion): Hearst centralized digital operations, while Conde is still stumbling through some weird ideas about competition and operations.
More on search intent and emotion from Think with Google.
The NYT, Twitter and Adobe are working on a Content Authenticity Initiative. I will try to trust them!
How to create successful enterprise search (aka internal search), via CMSWire. I love the heading “The M in Machine Learning Doesn’t Stand for ‘Magic.’”
Housekeeping | The Content Technologist is a weekly newsletter written by Deborah Carver. Follow The Content Technologist on Twitter. And if you’re not already a subscriber…