Code Review: Blockers, Warnings, and Notices

Overview #

Every line of code that is committed to VIP Go is reviewed by the VIP Team. These in-depth reviews are not meant to add time to your launch schedule or delay it; they are meant to help you launch successfully.

The goal of our reviews is to make sure that on launch, your site will be:

  • Secure, because pushing a site live with insecure code presents a liability to you and your whole userbase;
  • Performant, because going live and finding out that your code can’t handle the traffic levels that your site expects wastes most of your launch effort.

We also review for development best practices to make sure that your site will continue to live on without significant maintenance costs or major issues when WordPress is upgraded.

Before submitting any code for review, please be sure to look through our All About Code Review documentation. The following is a checklist of items our VIP engineers look for when reviewing. Please note that this is a living list and we are adding and modifying it as we continue to refine our processes and platform.

On VIP Go, we bucket feedback into three categories:

  • VIP Blocker – Cannot be deployed, and must be fixed.
  • VIP Warning – We strongly recommend your team take care of these issues as soon as possible.
  • VIP Notice – Should be considered carefully before being included in your VIP theme or plugin.

↑ Top ↑

VIP Blockers #

Blockers are items that need to be fixed before being committed to VIP Go. Here’s a partial list of what can be a blocker:

Validation, Sanitization, and Escaping #

Your code works, but is it safe? When writing code for the VIP Go environment, you’ll need to be extra cautious of how you handle data coming into WordPress and how it’s presented to the end user. Please review our documentation on validating, sanitizing, and escaping.

$_GET, $_POST, $_REQUEST, $_SERVER and other data from untrusted sources (including values from the database such as post meta and options) need to be validated and sanitized as early as possible (for example when assigning a $_POST value to a local variable) and escaped as late as possible on output.

Nonces should be used to validate all form submissions.

Capability checks need to validate that users can take the requested actions.
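As a minimal sketch of the rules above (verify a nonce, check capabilities, sanitize early), a form handler might look like the following. The function name, the `myprefix_save_title` nonce action, and the field names are hypothetical. `wp_verify_nonce()`, `current_user_can()`, `sanitize_key()`, `sanitize_text_field()`, `wp_unslash()`, and `absint()` are WordPress core functions, so the snippet assumes a loaded WordPress environment.

```php
<?php
/**
 * Hypothetical handler: returns the sanitized title, or null if the request is rejected.
 *
 * @param array $post_data Submitted form data, typically $_POST.
 */
function myprefix_handle_title_form( array $post_data ) {
	// 1. Nonce: reject submissions that did not come from our own form.
	$nonce = isset( $post_data['myprefix_nonce'] ) ? sanitize_key( $post_data['myprefix_nonce'] ) : '';
	if ( ! wp_verify_nonce( $nonce, 'myprefix_save_title' ) ) {
		return null;
	}

	// 2. Capability: confirm the current user may edit this specific post.
	$post_id = isset( $post_data['post_id'] ) ? absint( $post_data['post_id'] ) : 0;
	if ( ! $post_id || ! current_user_can( 'edit_post', $post_id ) ) {
		return null;
	}

	// 3. Sanitize as early as possible, when the value is first assigned.
	$title = isset( $post_data['title'] ) ? sanitize_text_field( wp_unslash( $post_data['title'] ) ) : '';

	return '' === $title ? null : $title;
}
```

The returned value is sanitized but not escaped; it would still be passed through esc_html() or a similar function at the point of output.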

It’s best to do the output escaping as late as possible, ideally as it’s being outputted, as opposed to further up in your script. This way you can always be sure that your data is properly escaped and you don’t need to remember whether the variable has been previously escaped.

Here are two examples. To keep this straightforward, we’ve kept them simple. Imagine a scenario with much more code between the place where $title is defined and where it’s used. The first example makes it clearer that $title is escaped.

$title = $instance['title'];

// Logic that sets up the widget

echo $before_title . esc_html( $title ) . $after_title;

 

$title = esc_html( $instance['title'] );

// Logic that sets up the widget

echo $before_title . $title . $after_title;

↑ Top ↑

Inserting HTML directly into DOM with JavaScript #

To avoid XSS, inserting HTML directly into the document should be avoided. Instead, DOM nodes should be programmatically created and appended to the DOM. This means avoiding .html(), .innerHTML, and other related methods and properties, and instead using .append(), .prepend(), .before(), .after(), and so on. More information.


↑ Top ↑

VIP Warnings #

We strongly recommend your team take care of these issues as soon as possible. In most circumstances code that falls under this category should not be pushed to a production server unless a specific use makes it acceptable.

↑ Top ↑

Whitelisting values for input/output validation #

When working with user-submitted data, try where possible to accept data only from a finite list of known and trusted values. For example:

$possible_values = array( 'a', 1, 'good' );
if ( ! in_array( $untrusted, $possible_values, true ) ) {
    die( "Don't do that!" );
}

↑ Top ↑

Direct Database Queries #

Thanks to WordPress’ extensive API, you should almost never need to query database tables directly. Using WordPress APIs rather than rolling your own functions saves you time and assures compatibility with past and future versions of WordPress and PHP. It also makes code reviews go more smoothly because we know we can trust the APIs. More information.

Additionally, direct database queries bypass internal caching. If a direct query is absolutely necessary, you should evaluate its potential performance and add caching if needed. Any queries that modify database contents may also put the object cache out of sync with the data, causing problems.

↑ Top ↑

Filesystem writes #

Make sure that your code and plugins do not write to the filesystem. Since the VIP Go network is distributed across many servers in multiple data centers, file system writes won’t work the way they would in a single-server environment. The core WordPress upload functions can handle any uploads you need to do.

↑ Top ↑

Arbitrary JavaScript and CSS stored in options or meta #

To limit attack vectors via malicious users or compromised accounts, arbitrary JavaScript cannot be stored in options or meta and then output as-is.

CSS in options or meta should also generally be avoided, but if absolutely necessary, it’s a good idea to properly sanitize it. See art-direction-redux for an example.

↑ Top ↑

Encoding values used when creating a url or passed to add_query_arg() #

add_query_arg() is a really useful function, but it might not work as intended. The values passed to it are not encoded, meaning that after passing

$my_url = 'admin.php?action=delete&post_id=321';
$my_url = add_query_arg( 'my_arg', 'somevalue&post_id=123', $my_url );

you would expect the URL to be:
admin.php?action=delete&post_id=321&my_arg=somevalue%26post_id%3D123

But in fact it becomes:
admin.php?action=delete&post_id=321&my_arg=somevalue&post_id=123

Using rawurlencode() on the values passed to it prevents this.

Using rawurlencode() on any variable used as part of the query string, either by using add_query_arg() or directly by string concatenation, will also prevent parameter hijacking.
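Both approaches can be sketched as follows. The two helper names are hypothetical; the first assumes WordPress is loaded so that add_query_arg() is available, while the second is plain PHP and only illustrates the string concatenation case.

```php
<?php
/**
 * Hypothetical helper: add one argument to a URL, encoding the value first.
 * Assumes WordPress is loaded so that add_query_arg() exists.
 */
function myprefix_add_encoded_arg( $key, $value, $url ) {
	return add_query_arg( $key, rawurlencode( $value ), $url );
}

/**
 * Hypothetical helper: the same idea with plain string concatenation.
 */
function myprefix_concat_arg( $url, $key, $value ) {
	// Use "?" for the first argument and "&" for any later one.
	$separator = ( false === strpos( $url, '?' ) ) ? '?' : '&';

	return $url . $separator . rawurlencode( $key ) . '=' . rawurlencode( $value );
}

// The encoding step on its own turns "&" and "=" into harmless escapes.
$encoded = rawurlencode( 'somevalue&post_id=123' );
// $encoded is now 'somevalue%26post_id%3D123'
```

Because the "&" and "=" inside the value are encoded, the value can no longer introduce a second post_id parameter.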

↑ Top ↑

Prefixing functions, constants, classes, and slugs #

Per the well-known WordPress adage: prefix all the things.

This applies to obvious things such as names of functions, constants, and classes, and also to less obvious ones like post_type and taxonomy slugs, cron event names, etc.

↑ Top ↑

Not checking return values #

When defining a variable through a function call, you should always check the function’s return value before calling additional functions or methods using that variable.

function wpcom_vip_meta_desc() {
   $text = wpcom_vip_get_meta_desc();
   // Only output the tag when the function returned something usable.
   if ( ! empty( $text ) ) {
      echo "\n" . '<meta name="description" content="' . esc_attr( $text ) . '" />' . "\n";
   }
}

↑ Top ↑

Order By Rand #

MySQL queries that use ORDER BY RAND() can be pretty challenging and slow on large datasets. An alternate option can be to retrieve 100 posts and pick one at random.
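A minimal sketch of that alternative is shown below. The function name and the query arguments are illustrative assumptions; get_posts() is a WordPress core function, so the snippet assumes a loaded WordPress environment.

```php
<?php
/**
 * Hypothetical helper: pick one random post from a bounded result set.
 * Assumes WordPress is loaded so that get_posts() exists.
 */
function myprefix_get_random_post() {
	$posts = get_posts(
		array(
			'posts_per_page' => 100, // Bounded query, no ORDER BY RAND().
			'post_type'      => 'post',
			'post_status'    => 'publish',
		)
	);

	if ( empty( $posts ) ) {
		return null;
	}

	// The random choice happens in PHP, not in MySQL.
	return $posts[ array_rand( $posts ) ];
}
```

Since the query itself is deterministic, its result can be cached and only the array_rand() step needs to run on each request.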

↑ Top ↑

Manipulating the timezone server-side #

Using date_default_timezone_set() and similar isn’t allowed because it conflicts with stats and other systems. Developers instead should use WordPress’s internal timezone support. More information.

↑ Top ↑

Skipping Full Page Caching #

On VIP Go, Varnish is used to cache pages at the edges. This improves performance by serving end users a page that comes directly from the nearest datacenter. The behavior differs from WordPress.com VIP: requests with GET parameters are always cached, and each distinct set of GET parameters is cached individually, rather than the parameters being stripped and the same page being served for all requests as is done on WordPress.com VIP. This does mean that code relying on vary_cache_on_function() will not work as intended. Varnish on VIP Go will respect the Vary header for X-Country-Code and Accept but not Cookie.

↑ Top ↑

Ajax calls on every pageload #

Making POST requests to admin-ajax.php on every pageload, or on any pageload without user input, will cause performance issues and needs to be rethought. If you have questions, we would be happy to help work through an alternate implementation. GET requests to admin-ajax.php on VIP Go are cached just like any other GET request.

↑ Top ↑

Front-end db writes #

Functions used on the front end that write to the database are not allowed. Front-end writes raise scaling concerns and can easily bring down a site.

↑ Top ↑

*_meta as hit counters #

Please don’t use meta (post_meta, comment_meta, etc.) to track counts of things (e.g. votes, pageviews, etc.). First of all, it won’t work properly because of caching and due to race conditions on high-volume sites. It’s also just a recipe for disaster and an easy way to break your site. In general you should not try to count/track user events within WordPress; consider using a JavaScript-based solution paired with a dedicated analytics service (such as Google Analytics) instead.

↑ Top ↑

eval() and create_function() #

Both these functions can execute arbitrary code that’s constructed at run time, which can be created through difficult-to-follow execution flows. These methods can make your site fragile because unforeseen conditions can cause syntax errors in the executed code, which becomes dynamic. A much better alternative is an Anonymous Function, which is hardcoded into the file and can never change during execution.

If there is no other option than to use these constructs, pay special attention not to pass any user-provided data into them without properly validating it beforehand.

We strongly recommend using Anonymous Functions, which are much cleaner and more secure.

↑ Top ↑

No LIMIT queries #

Using posts_per_page (or numberposts) with the value set to -1 or an unreasonably high number or setting nopaging to true opens up the potential for scaling issues if the query ends up querying thousands of posts.

You should always fetch the lowest number possible that still gives you the number of results you find acceptable. Imagine that your site grows over time to include 10,000 posts. If you specify -1 for posts_per_page, you’ll query with no limit and fetch all 10,000 posts every time the query runs, which is going to destroy your site’s performance. If you know you’ll never have more than 15 posts, then set posts_per_page to 15. If you think you might have more than 15 that you’d want to display but doubt it’d hit more than 100 ever, set the limit to 100. If it gets much higher than that, you might need to rethink the page architecture a bit.

↑ Top ↑

Cron schedules less than 15 minutes or expensive events #

Overly frequent cron events (anything less than 15 minutes) can significantly impact the performance of the site, as can cron events that are expensive.

↑ Top ↑

Flash (.swf) files #

Flash (.swf) files are not advisable on VIP Go, as they often present a security threat (largely due to poor development practices or due to bugs in the Flash Player) and vulnerabilities are hard to find/detect/secure. Plus, who needs Flash?🙂

↑ Top ↑

Incorrect licenses #

Non-GPL compatible themes or plugins are not allowed on VIP Go. WordPress code is licensed under the GNU General Public License v2 (GPL2) and all theme and plugin code needs to be GPL compatible or custom code you’ve written in-house; split or proprietary licenses are not allowed. The reasoning for this is that you, and we, need to have the legal rights to modify the code if something is broken, insecure, or needs optimization.

↑ Top ↑

Ignore development only files #

If it’s feasible within your development workflow, we ask that you .gitignore any files that are used exclusively in local development of your theme. These include, but are not limited to, .svnignore, config.rb, sass-cache, grunt files, PHPUnit tests, etc.

↑ Top ↑

VIP Requirements #

Every theme must include a VIP attribution link as well as wp_head() and wp_footer() calls.

↑ Top ↑

Unprefixed Functions, Classes, Constants, Slugs #

This is a long-standing WordPress best practice. Always namespace things in code to avoid potential conflicts. See Prefix Everything.

↑ Top ↑

Commented out code, Debug code or output #

VIP themes should not contain debug code and should not output debugging information. That includes the use of functions that provide backtrace information, such as wp_debug_backtrace_summary() or debug_backtrace(). If you’re encountering an issue that can’t be debugged in your development environment, we’ll be glad to help troubleshoot it with you.

The use of commented out code should be avoided. Having code that is not ready for production on production is bad practice and could easily lead to mistakes while reviewing, since the commented out code might not have been reviewed and the removal of a comment marker might slip in accidentally.

↑ Top ↑

Generating email #

To prevent issues with spam, abuse or other unwanted communications, your code should not generate, or allow users to generate, email messages to site users or user-supplied email addresses. That includes mailing list functionality, invitations to view or share content, notifications of site activity, or other messages generated in bulk. Where needed, you can integrate third-party services that allow sharing of content by email, as long as they don’t depend on the VIP Go infrastructure for message delivery.

↑ Top ↑

Custom wp_mail headers #

PHPMailer properly escapes headers for you only if you use the appropriate filters inside WordPress. Whenever you want to create custom headers using user-supplied data (e.g. the “From” header), make sure you use the filters WordPress provides. See the wp_mail_from and wp_mail_from_name filters.

↑ Top ↑

Serializing data #

unserialize() has known object injection vulnerabilities. JSON is generally a better approach for serializing data.
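As a rough illustration, a settings array can be stored and restored as JSON like this. The function names and the sample data are hypothetical; the snippet uses PHP’s built-in json_encode() so it runs standalone, and in WordPress code wp_json_encode() could be swapped in for the encoding step.

```php
<?php
/**
 * Hypothetical helper: encode settings for storage as JSON instead of serialize().
 */
function myprefix_encode_settings( array $settings ) {
	return json_encode( $settings );
}

/**
 * Hypothetical helper: decode stored settings.
 * Decoding JSON to an associative array only yields arrays and scalar values,
 * so no arbitrary class is instantiated from stored data.
 */
function myprefix_decode_settings( $raw ) {
	$decoded = json_decode( $raw, true );

	// Fall back to an empty array on malformed or unexpected input.
	return is_array( $decoded ) ? $decoded : array();
}

$stored = myprefix_encode_settings(
	array(
		'layout' => 'grid',
		'count'  => 3,
	)
);
// $stored is now '{"layout":"grid","count":3}'
```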

↑ Top ↑

Including files with untrusted paths or filenames #

locate_template(), get_template_part(), and sometimes include() or require() are typically used to include templates. If your template name, file name or path contains any non-static data or can be filtered, you must validate it against directory traversal using validate_file() or by detecting the string “..”.
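A minimal sketch of such a check follows. The function name, the partials/ directory, and the list of allowed slugs are hypothetical; validate_file() and get_template_part() are WordPress core functions, so a loaded WordPress environment is assumed. Combining the traversal check with a whitelist is an extra precaution chosen for this sketch, not a stated requirement.

```php
<?php
/**
 * Hypothetical helper: load a template part whose slug comes from non-static data.
 * Assumes WordPress is loaded (validate_file(), get_template_part()).
 */
function myprefix_load_partial( $slug ) {
	$allowed = array( 'card', 'hero', 'teaser' ); // Hypothetical whitelist.

	// validate_file() returns 0 when the path passes its checks; a non-zero
	// value signals a problem such as a ".." directory traversal.
	if ( 0 !== validate_file( $slug ) || ! in_array( $slug, $allowed, true ) ) {
		return false;
	}

	get_template_part( 'partials/' . $slug );

	return true;
}
```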

↑ Top ↑

Settings alteration #

Using ini_set() to alter PHP settings, or other functions that can change configuration at runtime, such as error_reporting(), is prohibited on the VIP Go platform. Enabling error reporting in production can lead to full path disclosure.

↑ Top ↑

Minified JavaScript files #

Minified JavaScript files should always be committed together with their unminified counterparts, and any change to one should be reflected in the other. Minified files cannot be read for review, and are much harder to work with when debugging issues.

↑ Top ↑

reCaptcha for Share by Email #

To protect against abuse of Jetpack’s share by e-mail feature (aka Sharedaddy) it must be implemented along with reCaptcha. This helps protect against the risk of the WordPress.com network being seen as a source of e-mail spam, which would adversely affect VIP sites. This blog post explains how to implement reCaptcha.

↑ Top ↑

Removing the admin bar #

The admin bar is an integral part of the WordPress experience and should not be removed.

↑ Top ↑

Remote calls #

Remote calls such as fetching information from external APIs or resources should rely on the WordPress HTTP API (no cURL) and should be cached. Examples of remote calls that should be cached are wp_remote_get(), wp_safe_remote_get(), and wp_oembed_get(). More information.
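One way to combine the HTTP API with the object cache is sketched below. The function name, the myprefix cache group, the 3-second timeout, and the 5-minute cache lifetime are all illustrative assumptions; the wp_* functions are WordPress core functions, so a loaded WordPress environment is assumed.

```php
<?php
/**
 * Hypothetical helper: fetch a remote document and cache its body.
 * Assumes WordPress is loaded (HTTP API and object cache functions).
 */
function myprefix_get_remote_body( $url ) {
	$cache_key = 'remote_' . md5( $url );

	// Serve from the object cache when possible.
	$cached = wp_cache_get( $cache_key, 'myprefix' );
	if ( false !== $cached ) {
		return $cached;
	}

	// Keep the timeout short so a slow endpoint cannot stall the pageload.
	$response = wp_remote_get( $url, array( 'timeout' => 3 ) );
	if ( is_wp_error( $response ) || 200 !== wp_remote_retrieve_response_code( $response ) ) {
		return ''; // Failures are not cached in this sketch.
	}

	$body = wp_remote_retrieve_body( $response );
	wp_cache_set( $cache_key, $body, 'myprefix', 300 ); // Cache for 5 minutes.

	return $body;
}
```

A production version would probably also cache failures for a short time, so that a failing endpoint is not requested again on every pageload.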

↑ Top ↑

Using __FILE__ for page registration #

When adding menus or registering your plugins, make sure that you use a unique handle or slug other than __FILE__ to ensure that you are not revealing system paths.

↑ Top ↑

Functions that use JOINS, taxonomy relation queries, -cat, -tax queries, subselects or API calls #

Close evaluation of the queries is recommended as these can be expensive and lead to performance issues. Queries with known problems when working with large datasets:

  • category__and, tag__and, tax_query with AND
  • category__not_in, tag__not_in, and tax_query with NOT IN
  • tax_query with multiple taxonomies
  • meta_query with a large result set (e.g. looking for only posts with a thumbnail_id meta on a large site, looking for posts with a specific meta value on a key)

↑ Top ↑

Taxonomy queries that do not specify ‘include_children’ => false #

Almost all taxonomy queries include 'include_children' => true by default.  This can have a very significant performance impact on code, and in some cases queries will time out.  We recommend 'include_children' => false to be added to all taxonomy queries when possible.
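For illustration, query arguments with the flag set explicitly could be built like this. The function name, the category taxonomy, and the other argument values are hypothetical choices for the sketch; the array keys are standard WP_Query arguments.

```php
<?php
/**
 * Hypothetical helper: build WP_Query arguments for a single term
 * without pulling in posts from its child terms.
 */
function myprefix_term_query_args( $term_id ) {
	return array(
		'post_type'      => 'post',
		'post_status'    => 'publish',
		'posts_per_page' => 10,
		'tax_query'      => array(
			array(
				'taxonomy'         => 'category',
				'field'            => 'term_id',
				'terms'            => array( (int) $term_id ),
				'include_children' => false, // Skip the child-term lookup.
			),
		),
	);
}

// Usage inside WordPress: $query = new WP_Query( myprefix_term_query_args( 12 ) );
```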

In many instances where all posts in either a parent or child term are wanted, this can be replaced by querying only for the parent term and using a save_post hook to determine whether a child term has been added and, if so, to enforce that its parent term is also added. A one-time WP-CLI command might be needed to ensure the integrity of previous data.

↑ Top ↑

Custom roles #

For best compatibility between environments and for added security, custom user roles and capabilities need to be managed via our helper functions.

↑ Top ↑

Caching constraints #

As we’re running Varnish, server-side logic that depends on the individual client will not work. This includes things like logic based on $_SERVER['REMOTE_ADDR'] or similar. This should be switched to a JavaScript-based approach.

Because Varnish caches fully rendered pages, per-user interactions on the server-side can be problematic. This means usage of objects/functions like $_COOKIE, setcookie(), $_SERVER['HTTP_USER_AGENT'], and anything that’s unique to an individual user cannot be relied on as the values may be cached and cross-pollution can occur.

In most cases, any user-level interactions should be moved to the client side using JavaScript. More information.

↑ Top ↑

Using extract() #

extract() should never be used because it is too opaque and difficult to understand how it will behave under a variety of inputs. It makes it too easy to unknowingly introduce new variables into a function’s scope, potentially leading to unintended and difficult to debug conflicts.

↑ Top ↑

Using $_REQUEST #

$_REQUEST should never be used because it is hard to track where the data is coming from (was it POST, or GET, or a cookie?), which makes reviewing the code more difficult. Additionally, it makes it easy to introduce sneaky and hard to find bugs, as any of the aforementioned locations can supply the data, which is hard to predict. It is much better to be explicit and use either $_POST or $_GET instead.

↑ Top ↑

Not Using the Settings API #

Instead of handling the output of settings pages and storage yourself, use the WordPress Settings API as it handles a lot of the heavy lifting for you including added security.

Make sure to also validate and sanitize submitted values from users using the sanitize callback in the register_setting call.
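A small sketch of a setting registered with a sanitize callback is shown below. The option group, option name, callback names, and the 1 to 100 range are hypothetical; register_setting() and add_action() are WordPress core functions. The function_exists() guard is only there so the file can also be loaded outside WordPress.

```php
<?php
/**
 * Hypothetical sanitize callback: force the value into the range 1 to 100.
 */
function myprefix_sanitize_items_per_page( $value ) {
	$value = (int) $value;

	return max( 1, min( 100, $value ) );
}

/**
 * Hypothetical registration routine, meant to run on the admin_init hook.
 */
function myprefix_register_settings() {
	register_setting(
		'myprefix_options',        // Option group (hypothetical).
		'myprefix_items_per_page', // Option name (hypothetical).
		array(
			'type'              => 'integer',
			'sanitize_callback' => 'myprefix_sanitize_items_per_page',
			'default'           => 10,
		)
	);
}

// Only hook in when WordPress is loaded.
if ( function_exists( 'add_action' ) ) {
	add_action( 'admin_init', 'myprefix_register_settings' );
}
```

With this in place, every value saved through the Settings API for that option passes through the callback before it is stored.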

↑ Top ↑

Using Page Templates instead of Rewrites #

A common “hack” in the WordPress community when requiring a custom feature to live at a vanity URL (e.g. /lifestream/) is to use a Page + Page Template. This isn’t ideal for numerous reasons:

  • Requires WordPress to do multiple queries to handle the lookup for the Page and any additional loops you manually run through.
  • Impedes development workflow as it requires the Page to be manually created in each environment and new developer machines as well.

↑ Top ↑

Use wp_parse_url() instead of parse_url() #

In PHP versions lower than 5.4.7, schemeless and relative URLs are not parsed correctly by parse_url(). We therefore recommend that you use wp_parse_url() for backwards compatibility.

↑ Top ↑

Use wp_safe_redirect() instead of wp_redirect() #

Using wp_safe_redirect(), along with the allowed_redirect_hosts filter, can help avoid any chances of malicious redirects within code.  It’s also important to remember to call exit() after a redirect so that no other unwanted code is executed.

↑ Top ↑

Mobile Detection #

When targeting mobile visitors, jetpack_is_mobile() should be used instead of wp_is_mobile().  It is more robust and works better with full page caching.

↑ Top ↑

Using bloginfo() without escaping #

Keeping with the theme of Escaping All the Things, code that uses bloginfo() should use get_bloginfo() instead so that the data can be properly late escaped on output. Since get_bloginfo() can return multiple types of data, and it can be used in multiple places, it may need to be escaped with different functions depending on the context:

echo '<a href="' . esc_url( get_bloginfo( 'url' ) ) . '">' . esc_html( get_bloginfo( 'name' ) ) . '</a>';

echo '<meta property="og:description" content="' . esc_attr( get_bloginfo( 'description' ) ) . '">';

↑ Top ↑

VIP Notices #

These are items that should be addressed, but code that goes live to production with them will not cause a performance or security problem. They might be best practices that help with code maintenance, help keep the error logs clean, etc.

↑ Top ↑

Check for is_array(), !empty() or is_wp_error() #

Before using a function that depends on an array, always check to make sure the arguments you are passing are arrays. If not, PHP will emit a warning.

For example instead of

$tags = wp_list_pluck( get_the_terms( get_the_ID(), 'post_tag') , 'name');

do:

$tags_array = get_the_terms( get_the_ID(), 'post_tag' );
// get_the_terms() returns an array of term objects on success, false if there
// are no terms or the post does not exist, and WP_Error on failure, so
// is_array() is the check to use.
if ( is_array( $tags_array ) ) {
    $tags = wp_list_pluck( $tags_array, 'name' );
}

Here are some common functions / language constructs that are used without checking the parameters beforehand: foreach(), array_merge(), array_filter(), array_map(), array_unique(), wp_list_pluck().
Always check the values passed as parameters, or cast the value to an array, before using them.

↑ Top ↑

Using in_array() without strict parameter #

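PHP performs type juggling. This also applies to in_array(), meaning that in PHP versions before 8.0 this:

in_array( 0, ['safe_value', 'another string']);

will return true. PHP 8.0 changed how numbers are compared with non-numeric strings, but loose comparison can still match unexpectedly there. Unless this is the behavior you want, you should always set the strict parameter to true. See Using == instead of ===.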
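The difference the strict parameter makes can be sketched with a case that behaves the same way in PHP 7 and PHP 8. The whitelist contents are hypothetical; the point is that two numeric strings are compared as numbers when the check is loose.

```php
<?php
// Hypothetical whitelist of accepted string values.
$allowed = array( '10', 'safe_value' );

// Loose check: '1e1' and '10' are both numeric strings, so they are
// compared as numbers (10 == 10) and the untrusted value slips through.
$loose = in_array( '1e1', $allowed );

// Strict check: type and value must match exactly, so '1e1' is rejected.
$strict = in_array( '1e1', $allowed, true );

// $loose is true, $strict is false.
```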

↑ Top ↑

Inline resources #

Inlining images, scripts or styles has been a common workaround for performance problems related to HTTP 1.x. As more and more of the web is now served via newer protocols (SPDY, HTTP/2), these techniques are now detrimental, as inlined resources cannot be cached and must be sent every time with the parent resource. Read more about this here.

↑ Top ↑

Using == instead of === #

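PHP performs type juggling, meaning that this:

$var = 0;
if ( $var == 'safe_string' ){
    return true;
}

will return true in PHP versions before 8.0. Unless this is the behavior you want, you should always use === over ==.
Other interesting comparisons that evaluate to true (several of them only before PHP 8.0, which changed how numbers are compared with non-numeric strings) are: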

  • (bool) true == 'string'
  • null == 0
  • 0 == '0SQLinjection'
  • 1 == '1XSS'
  • 0123 == 83 (here 0123 is parsed as an octal representation)
  • 0xF == 15 (here 0xF is parsed as a hexadecimal representation of a number)
  • 01 == '1string'
  • 0 == 'test'
  • 0 == ''

↑ Top ↑

Using output buffering #

Output buffering should be used only when truly necessary and should never be used in a context where it is called conditionally or across multiple functions / classes. If used it should always be in the same scope and not with conditionals.

↑ Top ↑

Not defining post_status Or post_type #

By default the post_status of a query is set to publish for anonymous users on the front end. It is not set in any WP_ADMIN context including Ajax queries. Queries on the front end for logged in users will also contain an OR statement for private posts created by the logged in user, even if that user is not part of the site. This will reduce the effectiveness of MySQL indexes, specifically the type_status_date index.
The same is true for post_type. If you know that only a certain post_type will match the rest of the query (for example for a taxonomy, meta or just general query), adding the post_type as well as the post_status will help MySQL better utilize the indexes at its disposal.

↑ Top ↑

Using closing PHP tags #

All PHP files should omit the closing PHP tag to prevent accidental output of whitespace and other characters, which can cause issues such as ‘Headers already sent‘ errors. This is part of the WordPress Coding Standards.

↑ Top ↑

Use wp_json_encode() over json_encode() #

wp_json_encode() will take care of making sure the string is valid UTF-8 while the regular function will return false if it encounters invalid UTF-8. It also supports backwards compatibility for versions of PHP that do not accept all the parameters.

↑ Top ↑

Caching large values in options #

The options cache on VIP Go works the same as on core WordPress (different from how WordPress.com VIP works). This means that all the options are not autoloaded in a single cache key and that therefore the size of options is not as important as on WordPress.com VIP. That being said, memcache still has a 1MB cache key limit.

↑ Top ↑

switch_to_blog() #

For VIP Go Multisite instances, switch_to_blog() only switches the database context, not the code that would run for that site (for example different filters). It should only be used with extreme caution.


↑ Top ↑

Performance Considerations #

We want to make sure that your site runs smoothly and can handle any traffic load. As such, we often make recommendations related to performance, such as: are remote requests fast and cached? Does the site request more data than needed?

↑ Top ↑

Uncached Pageload #

Uncached pageloads should be optimized as much as possible. We will load different pages and templates on your theme uncached, looking for slow queries, slow or timed out remote requests, queries that are overly repeated, or function routines that are slow.

Documentation is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.