Metro UK’s Powerful Content Algorithm – Now With Full Transcript

At our first London WordPress Big Media & Enterprise WordPress Meetup,  Dave Jensen (@elgrom) from Metro UK (hosted right here on WordPress.com VIP) explained how Metro continually experiments with their content algorithm to promote and feature the most interesting content for their readers and increase engagement on their site and mobile apps.

Dave recently shared some insight on how Metro UK has grown 350% through some growth hack experiments, so he provides another great inside look on what Metro has been doing internally to tweak their site content.

Below is his slide deck and the video from his presentation which we’ve shared previously, and we’re publishing it again now with full transcript below. 

Welcome, I’m Dave Jensen, I’m the head of development for Metro UK. We’ve been a WordPress VIP client since about December 2012. I think we’re one of the larger enterprise, largest publishers in the UK that use WordPress VIP.

So we’ve been playing around with all sorts of different crazy ways of interacting with our content for the last while, when we first released, we had this whole swipe based interaction that we’d been using which was fun to build, a little bit too complicated. Over the last while, we’ve been playing around with how we could automate some of the placement of stuff on the page so, our algorithms.

We’re a pretty lean operation. Maybe some of you won’t see this as lean but for big publishers there’s 6 developers, 20 people writing content all the time. We have a 24/7 mindset though, so we have champagne aspirations on a beer budget. We’re always doing constant experimentations of how a development process works.

We’ve kind of gone through a process of building something around some kind of trending content and we’ve kind of stretched that out into a kind of newsfeed and this is what we’ll go through today.

How can we do something clever and combine those into something that we might be able to move up the page and take over?

So basically we started collecting a whole bunch of data sources just to begin with, it started as one of our guys needed a dissertation project. So we grabbed a lot of data from Facebook, shares likes, comments, grabbed some information from Twitter, from Omniture, which is our analytics and from WordPress, so we can kind of build everything together.

We took all this data and stuck it in MySQL, then we started doing some pretty basic calculations on that. I wanted to keep everything really simple, so the feedback loops that we could have from everyone involved, they could all feel part of this process and having them engaged with us, is gonna give us a lot of benefit.

Because you know, with running some crazy big data thing, and nobody understands what we’re doing, they can’t tell me when I’m doing something wrong – which is really often. So basically we took the views, we took the social interactions and we times it by a multiplier, we get a score.

We ran this every half an hour and every half an hour, we took this one from the previous one and it kind of gave us a rate of change, a number that is going on. When we released the new site, we were lucky enough to convince the editorial team to stick this at the top, second thing down on the homepage, underneath their thing.

It might sound funny, but editorial people are usually pretty protective around what they put on their homepage, this was a reasonably large step for them at the time.

We also kind of snuck in our sidebar program, so we were pretty interested to see the top bit here. It’s how the trending stuff does the bottom bit is kind of how the clicks on the top stuff work.

We were pretty interested to see that even without all the images, having it text based underneath that there was a real pretty similar level of interaction between the two modules. So we thought we’d probably stumbled across something which was interesting and seemed to jive with our users.

They also changed 24/7 without anybody from our team kind of having to touch it, which was quite nice, so on a Sunday morning, before anyone was in the office, that was still reasonably fresh.

It also, from a commercialization point of view, gave us a way of promoting native content and some kind of native display units to hopefully play around making some more money. ‘Cause it’s always nice to do that.

So from that, when we removed swipe from the site, it was far too complicated, we went down kind of a hole, we started playing with a stream of news. At the bottom of the homepage we just grabbed the latest posts and we put them in a kind of you know page, infinite scroll-type approach.

This got quite a high level, we were really surprised even that the bottom of the homepage, kind of screw it away, at the number of interactions that were getting within this.

There were a lot of people kind of scrolling, playing, clicking around, from that we thought well we kind of have this trending stuff at the top, which is doing quite interesting and we have all these people clicking on these kind of lists down here at the bottom.

How can we do something clever and combine those into something that we might be able to move up the page and take over?

So the timeline is pretty straightforward, it’s just kind of sets the thing. The interesting thing was that we took the information that we calculated and put that back into WordPress to be post meta data and then we used that information to style the front end.

So the big image over there was something that was within the trending, the second one down was just normal one and the one at the bottom was something that had been promoted by an editor.

The neat thing was there was a high-level of consistency across all the platforms. We have a responsive site and it worked quite well.

We were kind of like, even just within a normal time-based feed, we were using the data that we had to change the appearance to give the stuff that was popular a larger percentage of screen time, even in something which was time-based.

Then we started playing around with some advertising and things that looked less like advertising and more fit into the style of the site. The neat thing was there was a high-level of consistency across all the platforms, we have a responsive site, it worked quite well. We were playing around with that and we spent lots of time optimizing it and all the graphs went like that which was pretty neat, so I was enjoying myself on that.

The next kind of phase of this was you know, when we were playing around with the trending stuff, it was great but recency was a real problem that we had. Cause you know, you had to get all the data to the point and then it was what was the biggest one between them.

So we had to come up with a simple calculation to you know call that up and so we just introduced a coefficient to that, to kind of give it a shape and boost things up, which were very early on, to allow to give them a bit of airtime and get them closer to the top of the streams.

So add a coefficient to the end and just by taking that, and giving it a score with the coefficient, we built something which seems to get a, seems to be performing pretty well.

We’ve been optimizing that in quite a high-level gain and you have some graphs going up, so the scrolls and clicks. First of all we track a lot of information, so you can see down at the bottom, from going to timelines when we started to newsfeed version to infinite. Each one of those had a gradual increase.

Our statistics, some of the biggest learnings that we’ve had is because we have such variable traffic numbers.

In order to figure out actually what’ going on, we had to break everything down per daily active user which was a kind of,it was a bit one of those moments, like “why didn’t I do this before”?

Because it’s just a kind of number, we can say “hey, you like news or sports”, give them another 5,000 views and the things you read are much more likely to be closer to the top.

When we moved from a time-based feed to a news-based feed, clicks increased by 9 percent across the board, which we’re quite happy with. This has allowed us to kind of take over the homepage and it’s kind of the third thing down.

We’ve been A/B testing and content density, is one of the things we’re moving towards kind of increasing the content density, even more things on the page…more opportunities to click, more people click.

Infinite scroll had a pretty big impact as well because if you stop content then people leave and they don’t have the opportunity to click. Then we would get some good clicks on our native display and the content drivers to native content.

Some of the lessons learned…Content volume is a big problem and kind of a bit of a beast that you keep on feeding and if you don’t feed it or you give it too much of the same type of content, I get complaints, because then it kind of looks clustered within that.

Because we’re running on a scale, we’ve had some pretty fun times with MySQL and Cloudfront. Making sure we cut all of the caches at the highest level, so kind of cutting it at a category level and not playing around with it too much beneath that has allowed us to keep that going, keeping everything fast. The faster things load, the less people notice that and they click.

The common understanding has definitely helped us get feedback throughout that. so some of the things we learned from a WordPress point of view.

So we’ve been using this VIP caching thing to be able to grab the first page of information and make it kind of available and part of the page rather than having to go into it grab via ajax.

That’s the third thing that, down on the homepage. If something falls over, making it always there is good otherwise I get shouted at.

We have also with the API, we built, we actually mimicked the public API’s format,so we can kind of interchangeably use our API versus the kind of latest public API stuff, which is probably one of the quite nice hacks that we did. We were playing around with storing lots of stuff in large options for a while but that didn’t scale very well and we were using post-meta to store information.

We’ve also been playing around with CHEEZETEST to kind of give us the kind of A/B testing results but it can add quite a lot of complication if you’re trying to test too much with it.

So we took a kind of microservice architecture approach to this. We kind of have a service for data mining, a service for the newsfeed, a service for the commercial feed, which keeps all the services quite nicely separated.

We’ve been using Backbone for the templates and Cloudfront for our caching. We’ve also plugged it into an Android app, that we’ve built which has been quite fun which is just a top 10 stories on the site any one time. We are been able to pass in the channels people read and give them a boost up.

Because it’s just a kind of number, we can say hey, you like news or sports, give them another 5,000 views and the things you read are much more likely to be closer to the top.

Of the thing which would be fun, a few installs to that marketing still, the biggest fun challenge we’re having with that… and right on the money, thank you for listening.

To see the presentations from previous Big Media & Enterprise WordPress Meetups, click hereFor Big Media & Enterprise WordPress Meetup groups in other cities, see the full list on VIP Events and join your local group.