CT No.18: How algorithms work

I'm not an engineer or a mathematician, but I can try.

We’re going deep on content recommendation algorithms today. Algorithms like Google’s, Facebook’s, Amazon’s, etc. I’ve written about SEO and content recommendation before, but today I’m taking it back a step. Let’s get back to basics.


You are correct to be skeptical of algorithms. Proprietary algorithms control the content we see and have determined how we interact with the internet.

Our minds have a symbiotic relationship with algorithms, to the point that we’re near-cyborgs. You enter phrases into a search bar in a specific way because the algorithm has taught you that you will get better results by using specific nouns and adjectives. Algorithms change to meet your search behavior. You're right: it’s weird.

We use content recommendation algorithms because they can process an immense amount of information, and if you’re like me, you’re overwhelmed by too much choice. Algorithms organize that vast amount of information for you so you don’t have to exist in the paradox of choice. In the faux MBA jargon that all the Serious Businesspeople speak now, they scale quickly.

The algorithms are proprietary because, the theory goes, they’re less likely to be manipulated that way. They’re intellectual property that provides a service no one else can. We can probably argue that, but let’s just accept that intellectual property is also a weird thing. It’s all weird.

Intellectual property is super strange. Case in point: there is only one Snuggie.

What are organic vs. paid algorithms?

On Google search results pages, there are organic results from Google’s massive index of the entire web. They’re called “organic” or “natural” search, and they’re what we’re concerned with here. You cannot pay and you have never been able to pay to manipulate who appears on top of those search results. They’re based on the Google algorithm, which is the type of algorithm we’re discussing here. SEO describes the industry that optimizes for these types of organic results.

There are also ads on search results pages, which are paid placements that are clearly marked as ads. They have their own algorithm, the Google Ads algorithm. I have a preliminary understanding of how that paid algorithm works but that’s irrelevant here. We’re talking organic, non-paid content recommendation. More clarification can be found here.

How does a content recommendation algorithm, like Google’s SEO algorithm or Facebook’s or Spotify’s algorithm, work?

Why I am qualified to explain content algorithms: I’ve published content on the internet for 21 years and have had a deep obsession with figuring out how to build an audience. I’ve been working deeply in SEO for six years, half of those at a top-tier performance marketing agency that provided amazing resources, colleagues and training in digital analytics and measurement in all channels. I deeply understand and have a healthy but critical view of digital content measurement. I see software demos all the time and ask a lot of questions about proprietary algorithms to help me understand what I’m seeing. I have assisted in building recommendation algorithms for multiple clients. Knowing how content algorithms work is a core part of my strategic business, whether those algorithms are social, search or other sorts of computer-assisted decision-makers.

Why I am not qualified to explain algorithms: I haven’t taken a math class since high school. I have never taken a computer science course or statistics and couldn’t tell you how those types of classes would improve my understanding of algorithms anyway. I know the bare minimum of code and that’s only front-end appearance stuff.

So. There's your grain of salt. Now let’s do this.

How organic content recommendation algorithms are built

  1. Start with a problem that needs to be solved — more often than not, it’s “I can’t find the X that I want and I know it’s out there.”

  2. Break down the problem and make a list: “These criteria will help you find the best X.” In the analytics world, we call those criteria dimensions of the original problem.

    For example, if you want people to find music so people can dance to it, you’re going to create a dimension called tempo. Right now, we’re going to make a dimension called style, meant to highlight content that is particularly unique and has a strong voice.

  3. Devise some ways to assign numbers to that dimension. In this case, let’s put numbers and structure on the abstract human concept of style so that an algorithm can use it to evaluate a webpage:

    • The variety and breadth of vocabulary

    • The cadence or patterns of punctuation and sentence length

    • The amount of unique or original phrase combinations

    • The number of outside experts who would also say the webpage was “stylish”*

    The above are our ranking signals. (Are they called ranking signals in other non-Google algorithms? Not going down that rabbit hole right now, but that's what I call them because my launchpad is always search.)

    *Awwww yeah, this is the problematic one! Who's to say who the experts are and why are they important? More about this type of signal, known as Authority, within the next couple of weeks.

  4. Assign those ranking signals weight and order. Which ones are most important to the concept of style? If someone has a breadth of vocabulary or makes up their own words, do they have more style than someone who coins creative uses of the same words? Solely from the words they spoke on their early Food Network shows, did Emeril have more style than Rachael Ray? Your algorithm can now decide!

And then program it into a computer, using your if/then statements and logic and adding varying levels of complexity. I say this like it’s easy, but we all know it’s not.

You could feed your algorithm datasets like Food Network shows or the archives of Gawker, and it would come up with style, based on your very human considerations of how style operates and how you've weighed and structured them.
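The four steps above can be sketched in a few lines of Python. This is a toy under loud assumptions, not how Google or anyone else scores style: the function names, the way each signal is measured, the list of common phrases and every weight are made up for illustration.

```python
import re

# Step 4: hypothetical weights, chosen by a human. Changing them changes
# what the algorithm considers "stylish". They sum to 1.0.
WEIGHTS = {
    "vocabulary": 0.4,
    "cadence": 0.3,
    "originality": 0.2,
    "experts": 0.1,
}


def words(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def vocabulary_signal(text):
    """Variety of vocabulary: the share of words that are distinct."""
    w = words(text)
    return len(set(w)) / len(w) if w else 0.0


def cadence_signal(text):
    """Cadence: how much sentence length varies, capped at 1.0."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(words(s)) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    mean = sum(lengths) / len(lengths)
    if mean == 0:
        return 0.0
    spread = (sum((n - mean) ** 2 for n in lengths) / len(lengths)) ** 0.5
    return min(spread / mean, 1.0)


def originality_signal(text, common_phrases):
    """Originality: the share of two-word phrases not on a 'common' list."""
    w = words(text)
    pairs = [" ".join(p) for p in zip(w, w[1:])]
    if not pairs:
        return 0.0
    return sum(1 for p in pairs if p not in common_phrases) / len(pairs)


def expert_signal(votes, panel_size):
    """The problematic one: the share of a chosen panel who said 'stylish'."""
    return votes / panel_size if panel_size else 0.0


def style_score(text, common_phrases, votes, panel_size):
    """Combine the four ranking signals into one weighted score from 0 to 1."""
    signals = {
        "vocabulary": vocabulary_signal(text),
        "cadence": cadence_signal(text),
        "originality": originality_signal(text, common_phrases),
        "experts": expert_signal(votes, panel_size),
    }
    return sum(WEIGHTS[name] * value for name, value in signals.items())


# Made-up sample data, only to show the function being called.
common = {"the food", "food is", "is good"}
plain = "The food is good. The food is good. The food is good."
lively = "Bam! Kick it up a notch with garlic, essence and a whisper of cayenne. Oh yeah, babe."
print(style_score(plain, common, votes=0, panel_size=5))
print(style_score(lively, common, votes=4, panel_size=5))
```

Every choice in that sketch (what counts as a word, which phrases are "common," who sits on the expert panel, how much each signal weighs) is a human decision baked into the math.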

Congratulations! That's it! It’s oversimplified but that’s it. An algorithm is a mathematical model for solving a human problem.

If algorithms are so easy, how come SEO is a thing?

In mathematical terms, Google has a literal fuckton of ranking signals that continue to evolve. Many of those ranking signals are based on the words users input into Google — search queries — and the results they actively use in Google. How people use technology is constantly evolving, so those human-driven ranking signals evolve for Google as well. To add extra complication, the Google algorithm enables machines to learn from human communication and actions, so it’s a bit of a mess.

To add to that, there are all kinds of technical ranking signals SEO folks are trying to figure out: structure, site speed, security. Seriously, there are a fuckton of very complex ranking factors.

SEO professionals spend an inordinate amount of time trying to figure out what those ranking signals are and how to show a website off to the algorithm, like a pretty peacock shimmying for a mate.

Actual footage of an SEO professional and the Google algorithm.

Google gives clues about how they adjust the search algorithm based on how people are using it, but the algorithm itself is off limits. SEO is an industry built on solving that mystery. Oh yeah, it's weird. But hey! There's always another mystery to solve.

I don't care for a lot of what Google does as a company. But I do fundamentally trust organic search to sort the information that’s available out there. I couldn’t live without this algorithm, even if I didn’t work in the industry.

Are search algorithms manipulated?

Humans make algorithms; computers didn’t just birth them on their own. That would be terrifying.

Humans have biases. So yes, algorithms are manipulated and reflect existing human biases. Google tweaks search results only when they are considered harmful (for example, by placing a suicide hotline at the top of results for self-harm-related queries).

I’ve worked at and studied media companies. I’ve worked in and studied digital marketing. I have a strong understanding of how both industries work. I’ve never worked at Google, but I have spent a lot of time evaluating ranking signals and results. And I can safely say: There is no “black magic” or shady human manipulation in Google's organic search algorithms. You can’t call up Google and ask for favors the way you can just call up, like, Ukraine. If you have a Google rep, they are working with paid ads or factual errors in local business listings and not organic search.

Big companies can devote more resources to understanding the Google algorithm because they have more resources! But that doesn’t mean big companies will always win... one of the reasons I love the organic algorithm is that ancient but informative websites rank highly all the time because they have the best information!

Last week’s WSJ article on search caused an uproar in the search industry because it didn't make an honest attempt to understand how the algorithms worked. The reporters misquoted experts and forced algorithms into a narrative of conspiracy. They understood “black box” as “black magic,” even though it’s a vastly different metaphor. They let their personal and business biases determine the facts and framing of their story. But, like the Google algorithm, the Wall Street Journal is not required to disclose or admit those biases.

Why do we assign algorithms so much power?

We’ve chosen computers to make these decisions because people have trouble processing more than a few criteria at a time. Computers process immense amounts of information quickly, and, in the words of Cady Heron/Tina Fey, math is the same in every language. But that doesn’t mean that they’re free of bias or history.

Algorithms are intensely, fundamentally human. Humans decide the criteria of what algorithms consider “good” or “bad,” and they approve the results before sending them off into the world. The problem with algorithms is the human part: the brain is a crazy thing full of thoughts and emotions and omissions and justifications.

We assign algorithms an objectivity that simply doesn’t exist. If it doesn’t exist in people, with all their emotions and perceptions and biases, objectivity certainly doesn’t exist in an algorithm that was made by a person. Using proprietary algorithms to make recommendations in complex social and emotional situations like law enforcement and healthcare is immensely inadvisable for a number of reasons, but humans are excited to get difficult decisions off their plates?

Most content algorithms are not wholly different from the concept of news judgment in journalism or principles of art critique or a company’s declared values or heck, the fundamentals of global democracy. Algorithms are just a framework that evolves over time and operates within the culture where they’re established.

I haven’t addressed more complex factors like machine learning or authority, or how people try to “hack” the Google algorithm. There’s still plenty to talk about. But understanding how algorithms are made is part of today’s media literacy.


Grading the extreme complexity of search appearance: Morningscore

I have 71 SEO tools in my giant Content Technologist database, wherein I identify and review a slew of martech tools that don’t make it to the email pages of The Content Technologist. Many good SEO tools are on the market for different kinds of SEO operations, and Morningscore is one of them.

As stated above, SEO is complex and the barrier to understanding isn’t low. Advanced SEO pros want a lot of features, customization and options to build their measurement strategies and understand their performance. Enterprise-level tools like Brightedge, Conductor, Searchmetrics and SEMRush have many aspects and features to track website performance against many ranking factors.

But sometimes you just want to look at a number or a grade. Sometimes you don’t want to look through lists of links and charts. SEO tools aren’t known for aesthetics, and sometimes it’s nice to look at a pretty UI. Enter Morningscore, an SEO grading and analysis tool.

At a glance: Morningscore

Morningscore provides entry- to intermediate-level SEO analysis and tracking functionality, assigning websites a score based on organic search rankings and visibility. The constant challenge of SEO is proving economic value and ROI, so the Morningscore grade represents the approximate dollar value of SEO actions. It’s a proprietary-ish metric, calculated from an algorithm where the ranking signals are Google’s data on monthly search volume, traffic and PPC value.
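To make the idea of a dollar-value SEO metric concrete, here is a minimal Python sketch that turns search volume, ranking position and PPC value into one number. It is not Morningscore's actual formula, which isn't spelled out here. The click-through rates, function names and sample keywords are all hypothetical.

```python
# Hypothetical share of searchers who click a result at each position.
# Illustrative numbers only, not real Google data.
ASSUMED_CTR = {1: 0.30, 2: 0.15, 3: 0.10}
DEFAULT_CTR = 0.02  # assumed rate for any position not listed above


def keyword_value(monthly_searches, position, cost_per_click):
    """Estimate what one keyword's organic traffic would cost as paid clicks."""
    ctr = ASSUMED_CTR.get(position, DEFAULT_CTR)
    estimated_clicks = monthly_searches * ctr
    return estimated_clicks * cost_per_click


def site_value(keywords):
    """Add up the estimated monthly dollar value across tracked keywords."""
    return sum(keyword_value(**k) for k in keywords)


# Made-up keywords, only to show the function being called.
tracked = [
    {"monthly_searches": 1000, "position": 1, "cost_per_click": 2.00},
    {"monthly_searches": 500, "position": 8, "cost_per_click": 1.00},
]
print(site_value(tracked))
```

The single number is only as good as the assumptions behind it. Change the click-through table and the "value" of the same rankings changes with it.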

There are ups and downs to demonstrating SEO value with a single metric, but it’s a good start for users who are unfamiliar with organic search tactics and forced to prove their value. Morningscore’s gamification also makes it more fun than most of the other tools on the market. The tool provides the following widgets, among others:

  • Keyword research, grouping and tracking

  • Website health/audit and task-list construction

  • Performance forecasting

  • Backlink monitoring and suggestions

  • Competitive identification, change monitoring and gap analysis

Together, those are a fairly comprehensive SEO starter kit. I’d recommend Morningscore for businesses that are looking to launch an SEO initiative and want a tool that’s clear and concise.

Advanced SEO professionals will find themselves limited quickly, but they’ll also be jealous that their very advanced tools aren’t as pretty or fun as Morningscore.


Content tech news of the week