Andrew Nacin on How WordPress Evolves Without Breaking Everything – Now With Full Transcript

Andrew Nacin is one of the lead developers of WordPress. At the August 2013 Big Media & Enterprise Meetup, he gave a talk on how WordPress evolves while maintaining backwards compatibility — which we shared previously and we’re publishing it again now with the full transcript below. 

My name is Andrew Nacin, I’m a lead developer at WordPress. I live in Washington DC. I’m talking about, really quickly, how WordPress evolves without breaking absolutely everything. I’m going to give two case studies.

First, some general considerations and what I’m talking about for this is three in particular:

  • One, we don’t really rush to fix what isn’t broken. There are a lot of other platforms out there that might rewrite a lot of their API’s pretty much every version, every three years. Just to name a few, like Drupal. We’re not trying to do that; we are trying to evolve at maybe a slower pace. Our slower pace might still be over 3 years; it’s just over six or seven releases.
  • We’re also trying to really make it worth it. If we are going to rewrite something, we’re going to go all out. And that’s actually one of the case studies here.
  • And then the third one is backwards compatibility. As you probably know, we’re presently backwards compatible, or 99% backwards compatible from version to version. Great for users, great for the ecosystem, it’s actually not that great for us, but that’s okay, we accept that.No “breaking changes” means that users don’t really need to worry about this when they update it. At the same time, we have to absorb extra technical debt. So if you look at the new WordPress, you’ll find some things that you’re like “wow, that’s still there”? Yes because the plugin that worked five years ago that was perfect then, should still work now.  We don’t try and just deliberately break your code.

We’re also trying to really make it worth it. If we are going to rewrite something, we’re going to go all out.

First case study: WordPress 3.4 – Themes API

So the first case that I’m going to talk about is WP theme, which came in WordPress 3.4 so this is actually a bit of a comparison from 3.3 to 3.4. There were some really big problems with the existing software wp_get_themes(). It was actually terrible. It’s a function that was not an API, it was that bad. It had a huge memory footprint; we’re talking 10s of megabytes, every time you called it. Very slow, weak error handling, pretty much nothing was good about it. It couldn’t fit into a single cache bucket, that’s how big the data was. So if you tried saving it in the cache, and WordPress.com tried doing this, they had to chunk it into individual keys, and put it together when it was done. It really didn’t make any sense. You’re probably doing it wrong if you ever have to do this with your data. Getting page templates for one theme required loading everything.

WordPress.com, which had at the time, 170-something default themes and on top of that something like 600 VIP themes, which by the way aren’t used on almost all sites, they were loading 700 pieces of data looking up every single page template for whatever Duster’s page templates are. This really didn’t make any sense at all, was really slow, and that’s why they had to cache it into multiple cache buckets. It was also really painful, because it’s one giant multi-dimensional associative array. If you try adding a feature to this, all you’re doing is making it worse.

We also needed this for an absolute ton of things, some of these on here I didn’t even know existed. You can dig into this a little bit, like multiple theme roots, cross-root parent themes, things like that. You can actually nest themes inside directories, which is what WordPress.com does for some stuff. Really weird – we had to support all these things. And you can’t break anything. You have to be 100% backwards compatible.

So how do we do it? First is that we scrap the entire array, this giant mess of crap and try to do one object with WP_Theme. So you have a method like get: $theme ->get(‘Name’) or get (‘Description’), or get (‘Author’). And also getting a header for display, so we have a display method, which automatically translates it, which is a new feature in case you’re doing that kind of stuff and also dealing with HTML markup – that could each go into some of these pieces. And then a number of helpers that can mimic a lot of the regular functions that you’re already seeing so you’re very used to all the different pieces here. Dealing with page templates: “hey look now we can only fetch one theme’s page templates”.

So we were looking through this and the pages’ page, the list table, was 5-6 times slower to load than a post table just because we were loading the list of page templates for one theme, for quick edit, that you don’t even open unless you really need to. Really stupid, but that’s kind of sad right? And things like template files, which we were only really loading for the theme editor, which on WordPress.com is disabled, but they still had to on WordPress.com load this for every single theme. That’s 40% of all of its memory.

It’s not easy to build code that just works from version to version, and many of you might not even need to deal with this. At WordPress, we do. It makes your lives easier, so you can buy me a beer later.

Let’s use WordPress.com as an example, even though I don’t work there, because they’re obviously working on a pretty incredible scale, especially the number of themes they have.

So the next step that we did, step two, is a lot of magic. We have this problem where we have $theme passed into functions, passed into hooks. How are all of the old, all the existing plugins working with this going to be able to accept this data? So in PHP there’s a class called  ArrayObject, that implements a few interfaces, one of them is called ArrayAccess. What it enabled us to do is things like this, where $theme[‘Name’] we’re able to treat that like a function call and then we can wrap it in this case, with the get map: #theme->get(‘Name’). Sometimes, this array, for whatever reason, one of the functions, WordPress decided to convert to an object, so we had to handle that as well. Well, there are some magic methods in php including __get() and __isset().

So now, we’re able to take this giant, stupid array and convert it to actually, a really smart object. We’re doing this in WordPress as well with some other things, we’re also doing dumb objects like a standard class and converting those more specifically to proper objects like wp_post if you’ve been looking around. A lot of this is just for sanity reasons, not even for future reasons. So, function __get($property), we’re able to map exactly where we need to go. Caching is non-persistent by default, but it does exist, which is pretty cool. So, the problem is that if you had caching on persistent and you would maybe doing a deploy, if you’re not actually clearing that cache, well there’s a problem. You need to be able to do that. APC is buggy enough as it is when it comes to upcode caching, you don’t need to mess with it here as well. So it does support persistent caching if you know what you’re doing. So WordPress.com for example has this enabled. They wanted to be able to store in cache bucket, so they do. So if for some reason, you ever wanted to enable it, there is a filter:

add_filter(

‘wp_cache_themes_persistently’,

‘_return_true’ );

Overall, the class itself is somewhere around 2,000 lines long. The patch that landed, that had the bulk of this was somewhere around 14,000 lines long and we wrote it in about six days and it worked.

And you can turn it on and it will work. So if you’re dealing with a lot of themes, maybe not just one on a giant multi-site installed, this might be something for you. So you have this new API that deals with array( ‘allowed’ => true ) and array( ‘errors’ => true ) and all these these different pieces. array( ‘allowed’ => true ) being for multi-site which is again, something else that we were able to speed up quite a bit.

And then we also had to make sure it worked. So, on top of a lot of functional testing, this is a few years ago this stat (29 tests, 684 assertions), there are even more tests now. Existing tests had to demonstrate of course backwards compatibility, so those existing tests did not break when we did all of this. New tests ensured the WP_Theme returned what we expected and then we practiced TDD (test-driven development) specifically when we were dealing with any bugs that came in.

Overall, the class itself is somewhere around 2,000 lines long. The patch that landed, that had the bulk of this was somewhere around 14,000 lines long and we wrote it in about six days and it worked. Also doing profiling, you’re going to find bottlenecks in some cases where you had no idea you had them. So maybe we saw some pages that were slow and we didn’t really understand why, sometimes a post request is actually slow, you might not notice this because you might think “oh yeah, Chrome is just resolving the DNS, that always takes forever”.

Profiling is really important for these things. So, for example, this is a KCachegrind right here, we were able to take 28% in theme.php to 0.76% of the page load. Total time cost was reduced by a factor of almost 6 and then we’re also able to look at weird things like this.

For sanitize_titles_with_dashes, one particular thing, we were searching for a theme on the backend and for some reason it was taking 42% of the page load. We we’re like “what the heck is going on here?”. Turns out it was being run 3529 times and here’s the best thing: the function call shouldn’t have even been there. So we removed it and the entire page sped up like you wouldn’t believe. It actually went from 44% to basically nothing. So we were able to speed things up – we would never have found this because it’s just like “oh Chrome is being stupid, it’s not loading”. No, it was actually a really slow request.

 

Second case study: Taxonomy Meta and Post Relationships

You might have heard of these, you might be using them, you might be using post-to-post taxonomy meta plugins. Working on this, this is a roadmap that was posted to make.wordpress.org/core a few weeks ago, during WordCamp San Francisco actually, explaining all the different things that we’re working on to make this happen. Now, the problem if you’ve worked in depth with terms is that shared terms was just a bad idea, we shouldn’t have done it. It came in actually, and this is a really funny history, originally in 2.2, was removed and went into 2.3 with a new schema, based in part on Drupal’s schema at the time. So the one time we did copy them, we realized we made a bad mistake.

Term_id as you might be familiar with – let’s say the term_id is ‘apple’ and then that ‘apple’ term might be in multiple taxonomies. So you might have the ‘apple’ in the tag taxonomy, and ‘apple’, in the company taxonomy. The problem is when you, let’s say, rename the ‘apple’ to ‘applecomputer’, suddenly things begin to go wrong very quickly. Unfortunately shared terms have very limited practical benefit. It would be much better if they were separate. So we have these two tables: wp_term_taxonomy and wp_terms and these fields in them, and you can kind of see how these come together with term_tax_id being the primary key for one, and term_id being the primary key in the other table, things get related and we have a third table of relationships. The joins are a mess, slow things down, and are not really fun.

So we’re going to add a new table, like this and if you see, it’s the exact same fields as the term_tax_id table, except we’re going to add all the fields that were in the term_id table. So we’re going to drop one of these tables and move all of the fields into the other table. We’re going to reduce everything to one table. Now if you’ve ever written a direct query for this stuff, if you’ve ever dealt with this, “Oh crap, things are going to break” right? “I’m sorry, it was Nacin’s fault, blame him”, or whatever you want to do. We can actually fix this. In fact, we didn’t come up with just one way to fix this we came up with two.

The first one is that we’re just going to redefine what a table means in WordPress. So if you try and reference $wpdb->terms, it will simply think, “oh, you must mean the $wpdb->term_taxonomy table”. So we’re actually self-joining. So if you’re doing interjoined terms on “terms_t” on term taxonomy and you’re do all these different fields, it’s just going to join itself. And because these fields are a superset, it will work. You also can do something like this with a view. You can create a view in MySQL as of MySQL 5, which is the current version for WordPress. You’re able to do something as simple as this: we’re able to re-create our old table in place. So after we do all these crazy upgrades and everything else, we can make this kind of work. We tested this with WordPress, we dropped the table, we merged all of them, took a 20-line patch, without changing anything, all the direct queries and everything worked. So plugins that are trying to do anything special with terms, we can do this to the point where we’re really not going to break anything. Pretty cool.

We’re also doing this over the course quite a number of releases. So, we’re able to combine these term tables, let us have on ID, finally we have one real ID that represents what a term actually is that we can pass around. Term meta is finally within reach, maybe post-relationships isn’t far behind, because that might depend on term relationships and that becomes a whole other story. So we have this long-term road map, unfortunately this actually requires we integrate a half dozen different changes, each of which is dependant on the previous one, over at least 3, 4 or maybe 5 releases.

So, we’re not rushing this, we can’t rush this. We need to do it step by step, to make sure that we don’t break absolutely everything. Maybe we slow it down, speed it up depending on how things go. Ultimately backwards compatibility prevents a lot of challenges. It’s not easy to build code that just works from version to version, and many of you might not even need to deal with this. At WordPress, we do. Iit makes your lives easier, so you can buy me a beer later. And we continue to evolve at rapid speed, WordPress 3.7, if you don’t know about the plans, is being released in October, 3.8 which will be a little bit of a different release in December. And if you’d like to join us helping out, I would go to make.wordpress.org/core and check those things out. (For the latest WordPress version, go to www.wordpress.org)

 

Q&A

Q: For the major version changes, now that we’re speeding up the timeframe for releases, typically in the past it’s been every 6 months for a major release and it’s been documented that you go back and support the last major release version. How is that going to change now that we’re speeding up the major versions?

A: Don’t know yet. In this case we’ve always aimed for 4 months, and normally end up at 5 and it slips to 6 or 7 on occasion. Sometimes we’ve actually been really on target with those. So what we’re trying to do now is 3.7 is acting as a bit of a reset, I can talk a little about 3.7.

3.7 is a platform-focused release, we’re doing it in 2 months. It’s focused on a few different things, Scott was talking earlier about some of our developer tools stuff. We’re trying to improve a lot of our own processes, so whether it’s trying to make it easier to contribute, trying to make it so tickets don’t rot for a long period of time, or people aren’t getting feedback or whatever it is – that’s important. And then a lot of our developer tools as well.

So this is really cool: you might have seen this new develop repository on WordPress.org which replaces the old core.svn repository. This is pretty interesting because it pulls in all of our tests, all of our tools, our bill process now, everything is in one place and finally we’re trying to modernize here. We’ve been around for 10 years, we can start to do it at this point. And then we’re rebuilding a lot of our developer documentation. So if you go to developer.wordpress.org right now, you’ll get a “Coming Soon” message, but we’re working right now on fully automated code reference that is very smart and deals with documenting every hook, every function, all from inline documentation to what else it can need from code. (This feature is now live, check it out)

The actual focus of the release in part is security, stability, updates and fixing a lot of bugs. We’ve already closed around 300 tickets in the last 2-3 weeks and I expect that number to continue to drop successfully with each week. This won’t affect most of you, because you will be doing manual deployments anyway, but in 3.7, security minor release updates will happen automatically. They shouldn’t be nearly as painful as they are and we want to try and ship this to you – like 5 people just went “oh God, what are you doing?” – don’t worry, relax, you can turn it off and in most cases, this won’t affect you. If you have things like automatic updates turned off on your dashboard, then this obviously will not occur. Which you should, and if you don’t, and you’re trying to do deployment anyway, how one of your editors isn’t screwing it up by pushing a button, I’m really interested.

Any further questions on 3.7? We don’t know how yet we’ll work on that, but that said, because we’re going to start doing automatic updates for security releases, we’ll probably support security backup a few more versions as well. If only because we can be much more confidant shipping those and because our security vulnerabilities we’re dealing with now are really esoteric.

We’re talking about like safe HTTP requests and XML injection and things along those lines, we’re not really dealing with the run of the mill like XSS that we might have been dealing with 5-6 years ago. So, supporting further back, yes, that said, I don’t think we’ll always be doing too much with these cycles, I would like to settle between 3 and 4, but we really don’t know yet. 2,3,4, not sure. 3 releases a year would be great, 4 maybe, I don’t know.