August 30


Related Posts – A New Approach?

Depending on the target market of your site, related posts and even related products might be a hit with your audience, or they might be a fail. There's only one way to find out, and that is to have a crack at implementing them, and see if your site visitors actually use the related posts. It's that simple.

Unfortunately, that is the only simple part of related posts, something I discovered as I looked into how they can be implemented and what it would take to make them actually useful.

Whilst I will try to keep talking mainly about related posts or articles, the same approaches apply to products also.

I've been interested in Machine Learning and Artificial Intelligence for a long, long time now, since I first got into programming I guess. And I may have finally come upon a use for it that is not just a technical exercise in search of a problem.

What are related posts?

Related posts are other articles, pages or even products linked at the bottom or in the sidebar of an article a visitor to your site is reading. The idea is that by showing other relevant content to the visitor they are more likely to stay on your site and read more of your great writing. They get to know you better. They buy products or click on ads you are showing.

Traditional related posts & their limitations

There are a few ways related posts can be added to your articles. Each has advantages & disadvantages, from high levels of manual input - bad if you have a lot of content - to fully automated - less work for you and your writers - and everything in between.

Manual related posts

The simplest method to implement is to have your content authors specify 2, 3, 5 or more related posts. This requires the author to have a good idea of what other articles are on your site and what they are about.

If done well this approach can yield very high quality recommendations. But keep in mind these linked articles are static. If a related article changes, or gets deleted, you get to maintain this link. Automated broken link checking can help with this, but you still get to check and update articles regularly.
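
The broken link checking mentioned above can be as simple as comparing each hand-picked list against the posts that still exist. Here is a minimal sketch, assuming posts are held in a dictionary keyed by a post id; the data layout and the function name are my own invention for illustration, not something from a particular CMS.

```python
def find_stale_related(posts):
    """Return {post_id: [missing related ids]} for hand-picked links that no longer resolve."""
    existing = set(posts)
    stale = {}
    for post_id, post in posts.items():
        missing = [rid for rid in post.get("related", []) if rid not in existing]
        if missing:
            stale[post_id] = missing
    return stale


# Hypothetical data: post ids mapped to their hand-picked related ids.
posts = {
    "intro-to-seo": {"related": ["internal-links", "old-deleted-post"]},
    "internal-links": {"related": ["intro-to-seo"]},
}
stale = find_stale_related(posts)
print(stale)
```

A check like this only tells you a link is dead. It can't tell you whether a linked article has changed enough to stop being relevant, so the regular manual review is still needed.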

Match By Title

The next level of related post creation is to look for related posts by scanning the title of the current post, and looking for keyword matches with the other posts on your site. Performing this operation programmatically means the related posts can be updated as often as needed by your site itself (sometimes literally as a visitor views an article).

Title keyword matching also yields what superficially appear to be high-quality matches. The selected articles will have an obvious relationship by title to the current article. But that is where the benefits end.
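
To make title matching concrete, here is a rough sketch of how it could work. It is not how any particular plugin does it. The stop-word list, the function names and the use of Jaccard overlap (shared keywords divided by total distinct keywords) are all assumptions made for the example.

```python
import re

# Assumed stop-word list; a real site would use a fuller one.
STOP_WORDS = {"a", "an", "the", "to", "of", "for", "and", "in", "on", "your", "how", "is"}


def title_keywords(title):
    """Lower-case the title and keep the words that are not stop words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return {w for w in words if w not in STOP_WORDS}


def related_by_title(current_title, other_titles, limit=3):
    """Rank other titles by Jaccard overlap of keywords with the current title."""
    current = title_keywords(current_title)
    scored = []
    for title in other_titles:
        candidate = title_keywords(title)
        union = current | candidate
        score = len(current & candidate) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, title))
    # Highest score first; ties broken alphabetically so results are stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]


# Hypothetical titles for illustration.
titles = [
    "Why Related Posts Hurt Your Site",
    "Related Posts Plugins Compared",
    "Baking Sourdough at Home",
]
matches = related_by_title("How to Add Related Posts to Your Site", titles)
print(matches)
```

With these made-up titles the top match is "Why Related Posts Hurt Your Site". It shares the most keywords with the current title, yet it argues the opposite point, and keyword overlap has no way to see that.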

Match by title and body

Selecting related posts by title matching can be extended to the author, and the body of the article. This gives a greater possibility of getting a quality match as you are no longer tied to just the title. But it is still keyword matching, and doesn't take into account the sentiment of the post itself.
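
Extending the match to more fields might look something like the sketch below. The field weights are assumptions chosen for illustration (a hit in the title counts three times as much as a hit in the body), and stop-word filtering is left out to keep it short.

```python
import re
from collections import Counter


def tokens(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def field_score(current, candidate):
    """Count the word occurrences two texts share (multiset overlap)."""
    overlap = Counter(tokens(current)) & Counter(tokens(candidate))
    return sum(overlap.values())


# Assumed weights: a title hit counts more than an author or body hit.
WEIGHTS = {"title": 3.0, "author": 1.0, "body": 1.0}


def match_score(current, candidate):
    """Weighted sum of per-field keyword overlap between two posts."""
    return sum(
        weight * field_score(current[field], candidate[field])
        for field, weight in WEIGHTS.items()
    )


# Hypothetical posts for illustration.
current = {"title": "Related posts", "author": "Sam",
           "body": "keyword matching for related posts"}
candidate = {"title": "Posts on SEO", "author": "Sam",
             "body": "matching links and keyword stuffing"}
score = match_score(current, candidate)
print(score)
```

The score goes up whenever words are shared, whatever the two posts actually say about them, which is the limitation described above.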

There are problems

Can we prevent suggesting related posts that may be contradictory or just not complementary? What if we want to suggest articles based on user rating (if your site allows visitors to rate articles)?

That is not easy to do, though there are ways to mitigate this problem. But before we talk about that, there is a possible drawback to having dynamic related articles that we need to discuss.

What about SEO?
Search engines don't really like dynamically generated links

But why? The answer is a little... umm... woolly and subjective.

The main reason given is that there is no way to guarantee that the related article is actually related. Remember how simpler plugins do keyword matching on article titles (& sometimes body too)? The problems we discussed are the reasons unrelated content can end up in an article's dynamically generated related content.

Search engines are far more advanced at content categorisation than simple keyword matching. They know much more accurately if a linked article is related or not.

Search engines don't like it when you link random content

This one is a little more obvious. Internal linking within your site is a good thing for SEO. Search engines want their users to locate information that is of use to them. Internal links that are logical and lead to quality content help this. Internal links that make no sense break this intent. Search engines will penalise you if your site consistently offers up links to unrelated content.

Sounds risky - can we mitigate it?

Many "related posts" plugins will tag the generated links "nofollow" for this reason by default. The developers have decided it's better not to take the risk, as they have no way to know how good the recommendations they are making truly are.
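
As a small illustration of what that default amounts to, here is a sketch of a link renderer that adds rel="nofollow" unless the link is explicitly marked as trusted. The function name and the `trusted` flag are hypothetical, not taken from any real plugin.

```python
from html import escape


def render_related_link(url, title, trusted=False):
    """Build an anchor tag; untrusted (auto-generated) links get rel="nofollow"."""
    rel = "" if trusted else ' rel="nofollow"'
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(title)}</a>'


# Hypothetical URL and title for illustration.
link = render_related_link("/posts/internal-links", "Internal Links & SEO")
print(link)
```

Hand-picked links could be rendered with trusted=True, so that only the automated suggestions carry the nofollow hint.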

A better way is to make sure we only show articles that are actually related. And if we can be sure they are well ranked as well, then that's a bonus.

What is machine learning?
From Wikipedia

Machine Learning (or "ML" for short) is a field of computer science that uses statistical techniques to give computer systems the ability to "learn" (e.g., progressively improve performance on a specific task) with data, without being explicitly programmed.

What does this mean?

We have a static resource and an expanding array of information on how this resource is used. This allows an ML system to make predictions about the future use of the static input.

The ML system will begin to see similarities within your inputs and use these to predict outputs. In our case, the inputs are articles and the way they are viewed and rated by visitors. The outputs are the suggestions for other users.

Many of the currently popular ML systems are built on frameworks such as "TensorFlow", as described here

How can ML help?

At its simplest, the ML system will begin to create visitor categorisations. It will do this by finding past visitors that have viewed and rated articles in a similar way to the current visitor.

In simplified terms, if the current visitor has viewed articles A, B, C & D, and the ML system finds a grouping of users that all viewed those 4 articles, plus articles E, F & G, then there is a high likelihood that the current visitor will want to view those articles too. The ML system has categorised the interests of all those visitors as similar, based on the recorded past behaviour.
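
The A, B, C & D example can be sketched in a few lines. To be clear, this is plain counting rather than a trained ML model, and the function name, the `min_shared` threshold and the visitor data are all assumptions made for illustration.

```python
from collections import Counter


def suggest_from_similar_visitors(current_views, history, min_shared=2, limit=3):
    """Suggest articles seen by past visitors whose views overlap the current visitor's."""
    current = set(current_views)
    votes = Counter()
    for viewed in history.values():
        viewed = set(viewed)
        # A past visitor counts as "similar" if enough articles are shared.
        if len(current & viewed) >= min_shared:
            votes.update(viewed - current)
    # Most votes first; ties broken alphabetically so results are stable.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [article for article, _ in ranked[:limit]]


# Hypothetical view history for illustration.
history = {
    "visitor1": ["A", "B", "C", "D", "E", "F"],
    "visitor2": ["A", "B", "C", "D", "E", "G"],
    "visitor3": ["A", "B", "C", "D", "E", "F", "G"],
    "visitor4": ["X", "Y"],
}
suggestions = suggest_from_similar_visitors(["A", "B", "C", "D"], history)
print(suggestions)
```

A real recommendation service learns these groupings from far more data and many more signals, but the intuition is the same: people who read what you read also read these.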

A more advanced implementation would take into account not only the articles viewed by the group, but also the rating that group had applied to them. So if the group had viewed E, F & G, but rated F poorly, then it wouldn't consider it to be a good suggestion.
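
Taking ratings into account could look roughly like this. Again it is a counting sketch under assumptions: ratings are 1 to 5 stars, and a candidate is dropped when the similar group's average rating falls below a threshold I picked arbitrarily.

```python
def suggest_with_ratings(current_views, history, min_shared=2, min_avg_rating=3.0):
    """Suggest articles similar visitors viewed, dropping ones they rated poorly."""
    current = set(current_views)
    ratings = {}
    for rated in history.values():
        # A past visitor counts as "similar" if enough articles are shared.
        if len(current & set(rated)) >= min_shared:
            for article, stars in rated.items():
                if article not in current:
                    ratings.setdefault(article, []).append(stars)
    averages = {a: sum(r) / len(r) for a, r in ratings.items()}
    good = [a for a, avg in averages.items() if avg >= min_avg_rating]
    # Best average first; ties broken alphabetically so results are stable.
    return sorted(good, key=lambda a: (-averages[a], a))


# Hypothetical ratings (article -> stars) for illustration.
history = {
    "visitor1": {"A": 5, "B": 4, "C": 4, "D": 5, "E": 5, "F": 1, "G": 4},
    "visitor2": {"A": 4, "B": 5, "C": 3, "D": 4, "E": 4, "F": 2, "G": 4},
}
suggestions = suggest_with_ratings(["A", "B", "C", "D"], history)
print(suggestions)
```

Article F was viewed by both similar visitors, but its poor average rating keeps it out of the suggestions.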

Some Example Implementations

Microsoft has one for their "Custom Decision Service" here that does "related article" recommendations based on visitor use and rating patterns. As this is the simplest to implement for testing, I'm going to try this within my own site, to see what the results are.

Google has their own implementation here that has a similar set of inputs and outputs. However, Google documents much more of what is going on, and how to build your own custom version, than Microsoft does.

There are also a lot of custom recommendation engines that can be reviewed and implemented, such as this one that ended up being the basis for the Twitter "who to follow" recommendation system:

All of these are concrete examples of what is pretty much still bleeding edge software technology. I do not recommend you build your own version of these though, unless you really know what you are doing, and have requirements that are outside the normal parameters of recommendations.

What about the future?

This is an area of software that is changing rapidly. Almost as fast as the hardware and systems needed to create a massively parallel computing grid are changing.

I can see these recommendation systems evolving to not only take into account past visitors' actions and ratings, but also the sentiment of the articles themselves. That change is not too difficult to predict.

Beyond that, recommendation engines will get to take into account the changing nature of people's interests - what I'm interested in this month won't necessarily be what I'm interested in 12 months from now.
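
One simple way to model changing interests is to let older views count for less. The sketch below uses exponential decay with a 90-day half-life. Both the decay scheme and the half-life are assumptions picked for the example, not a description of how any of the services above work.

```python
def recency_weight(age_days, half_life_days=90.0):
    """Exponential decay: a view loses half its weight every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def interest_scores(views, half_life_days=90.0):
    """Sum decayed weights per topic from (topic, age_days) pairs."""
    scores = {}
    for topic, age_days in views:
        scores[topic] = scores.get(topic, 0.0) + recency_weight(age_days, half_life_days)
    return scores


# Hypothetical view log: (topic, how many days ago it was viewed).
views = [("seo", 0), ("seo", 90), ("baking", 360)]
scores = interest_scores(views)
print(scores)
```

A view from a year ago still counts, just far less than one from today.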

The ability to accurately predict what someone may be interested in on a low traffic site is another area that will evolve rapidly. Article sentiment, topic categorisation and user ratings will become far more important on a site such as this.

