RawR (Raw Revisited) for WordPress

This plugin allows the embedding of raw text, such as raw HTML, XML, CSS, JavaScript, etc., using a shortcode. For example: [rawr]some <b>raw</b> <i>html</i> here.[/rawr]. It does no content filtering of any kind. There are no configuration options; anything wrapped in [rawr][/rawr] is passed through to the page unfiltered and delivered to the user as-is.

Features

  • Has no configuration options, no user interface, no access control “roles”, no files, no tables in the database.
    Just activate it and use [rawr][/rawr] in your pages and posts.
  • Just ~100 lines of code (and over half of that is comments and license).
  • This plugin may be installed in parallel with the other raw plugins, because they don’t use the [rawr] shortcode.
  • Unlike Raw HTML, this plugin does not unregister or override any built-in WordPress filters.
  • Unlike ML Raw HTML and Raw HTML Snippets, this plugin does not require you to provide unique IDs for each raw block on a page.
  • The implementation (as per the notes below) is also used in the Pygments for WordPress plugin.

Installation

Install this plugin like any other.

Download it from the WordPress.org Plugin repository:

http://downloads.wordpress.org/plugin/rawr-raw-revisited-for-wordpress.zip

Then go to Admin > Plugins > Add New > Upload.

If that’s too easy for you, try this:

1. Put it into the `/wordpress/wp-content/plugins/` directory
2. Activate it in the admin Plugin page.

Usage

Here are some examples:

[rawr]
<script>
SomeJavascript();
window.location = "http://remote-site.us";
</script>
[/rawr]

 

[rawr]<a href="http://derek.simkowiak.net/">No HTML &quot;characters&quot; are escaped.</a>[/rawr]

 

[rawr]
<style>
.some_class {
	font-weight: bold;
}
</style>
[/rawr]

 

Implementation Notes

Currently, there is a shortcoming in the shortcode API. With the shortcode API alone, it is impossible to write a shortcode plugin that gets the raw article content (as entered by the user), because WordPress processes shortcodes after the post has been run through wptexturize(), convert_chars(), convert_smilies(), and (starting in WP 2.5.1) wpautop(). The “$content” argument passed into your shortcode handler already contains a bunch of inserted <p> tags, plus HTML entities for things like quotes and symbols, which basically ruins everything.

This happens because WordPress registers the shortcode filter, called “do_shortcode()”, with a priority of 11. The other filters, like wpautop(), have a higher priority (priority 10; lower numbers run earlier), so they get executed first. If you move up from the small (easy) shortcode API into the larger filter API, then you can register your shortcode handler at a higher priority. But unfortunately, that causes a new problem: whatever your handler returns still passes through the built-ins at priority 10, which mangle the raw content on the way to the user.
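
The ordering problem can be seen in a small standalone model. This is not WordPress code: apply_in_priority_order(), fake_autop(), and fake_shortcode() are hypothetical stand-ins for the filter chain, wpautop(), and do_shortcode(). The only behavior carried over from WordPress is that callbacks with a lower priority number run first.

```php
<?php
// Standalone model of a priority-ordered filter chain.
// All function names here are hypothetical; none of them are WordPress APIs.

// Run callbacks in ascending priority order: lower numbers run first.
function apply_in_priority_order( array $filters, $content ) {
    ksort( $filters );
    foreach ( $filters as $callbacks ) {
        foreach ( $callbacks as $callback ) {
            $content = call_user_func( $callback, $content );
        }
    }
    return $content;
}

// Crude stand-in for wpautop(): wrap in <p> and turn newlines into <br />.
function fake_autop( $content ) {
    return '<p>' . str_replace( "\n", "<br />\n", trim( $content ) ) . '</p>';
}

// Crude stand-in for do_shortcode(): drop the [rawr] tags, keep the inner text.
function fake_shortcode( $content ) {
    return preg_replace( '/\[rawr\](.*?)\[\/rawr\]/s', '$1', $content );
}

$filters = array(
    11 => array( 'fake_shortcode' ), // listed first, but still runs last
    10 => array( 'fake_autop' ),
);

$result = apply_in_priority_order( $filters, "[rawr]line one\nline two[/rawr]" );
echo $result, "\n";
```

By the time the priority-11 handler sees the text between the tags, the priority-10 stand-in has already inserted a <br /> into it, which is the same kind of damage described above.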

To get around this, several techniques have been used:

The “Raw HTML” plugin lets you manually disable the built-in formatting filters for each post individually. It gives you a checkbox in the post editor to disable them on a per-post basis, and the plugin has a function called maybe_use_filter() that looks at your per-post setting before applying the filter. This can cause problems for other plugins, because it removes the built-in filters mentioned above and replaces them with these custom “maybe_” versions that behave differently. (Also, you may want your raw code to not be processed, but the rest of your shortcode blocks on the same page to be filtered normally.)

The “ML Raw HTML” plugin has a simpler implementation. It ignores the passed-in shortcode $content altogether, because it’s already been formatted, and instead refers directly to the global $post variable. It uses $post->post_content to get the entire raw, unformatted, unfiltered post, and then it parses the shortcode tags manually (without using the WordPress shortcode parsing functions). This is clean and efficient. However, it must find the shortcodes in the entire (unformatted) article, not just in the wrapped shortcode block, and that is hard to do correctly without the shortcode parser included with WordPress. The current “ML Raw HTML” implementation has a weak custom parser: it requires that you manually give each of your shortcode blocks a unique “id” attribute, just so the custom parser can deal with multiple shortcode blocks on a single page. (And if you put more than one space before your shortcode “id” attribute, it won’t work.)

The WordPress core source code also deals with this problem for the [embed] shortcode. It uses a very portable technique that does not have the limitations of the methods above. The code in media.php re-registers the do_shortcode() filter, but with a higher priority (priority 8) than the default filters like wpautop() (priority 10). But since running do_shortcode() twice (once with priority 8, and then again with priority 11) would cause problems, it registers a slightly tweaked version of do_shortcode() at priority 8. The tweaked version temporarily removes all the other shortcodes (except [embed]), then manually runs do_shortcode() with [embed] as the only registered shortcode (so that every other shortcode is ignored). This causes the [embed] tag to get correctly processed before the other filters are run. Finally, it re-registers all the original shortcodes, so that when do_shortcode() is called with priority 11, all the other shortcodes and the default filters work like one would expect.

Here is the special “do_shortcode” function used for [embed]:

<?php  // [from media.php in the core WordPress code:]

	/**
	 * Process the [embed] shortcode.
	 *
	 * Since the [embed] shortcode needs to be run earlier than other shortcodes,
	 * this function removes all existing shortcodes, registers the [embed] shortcode,
	 * calls {@link do_shortcode()}, and then re-registers the old shortcodes.
	 *
	 * @uses $shortcode_tags
	 * @uses remove_all_shortcodes()
	 * @uses add_shortcode()
	 * @uses do_shortcode()
	 *
	 * @param string $content Content to parse
	 * @return string Content with shortcode parsed
	 */
	function run_shortcode( $content ) {
		global $shortcode_tags;

		// Back up current registered shortcodes and clear them all out
		$orig_shortcode_tags = $shortcode_tags;
		remove_all_shortcodes();

		add_shortcode( 'embed', array(&$this, 'shortcode') );

		// Do the shortcode (only the [embed] one is registered)
		$content = do_shortcode( $content );

		// Put the original shortcodes back
		$shortcode_tags = $orig_shortcode_tags;

		return $content;
	}

The function above is registered as a global content filter, with add_filter(), when loaded:

<?php  /* [...] */

// Hack to get the [embed] shortcode to run before wpautop()
add_filter( 'the_content', array(&$this, 'run_shortcode'), 8 );

Note that the priority is 8, which is higher (runs earlier) than the 10 used by wpautop() and company.

This will give you access to the block of article text within the shortcode. But it has a new side effect: whatever you return as your output will later be filtered through wpautop() and company at priority 10. That is sufficient if you are okay with having new <p></p> tags inserted into your shortcode handler’s output. For the [embed] shortcode, which is just single-line snippets, this doesn’t matter. But for raw content with multiple lines (and also for syntax highlighting in the Pygments for WordPress plugin), this was not sufficient.

So I needed a way to run my shortcode handler on the raw post (before wpautop()), at filter priority 8, and I also needed a way to prevent the built-in filters like wpautop() from mangling my output when they run at priority 10.

My solution was to run the built-in shortcode parser twice. First, I run it with priority 8 (like the [embed] hack) so I get access to raw, unmangled content. Then I store that content into an array, and I return just the array index as my shortcode content from the run at priority 8. But I also register another, different shortcode handler at priority 11, to put the raw content back into the page (after wpautop() has already been called, and thus, bypassing the mangling of raw content). This second run at priority 11 allows me to return my formatted output, without that output being mangled by wpautop().
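
The two-pass idea can be modeled outside WordPress as follows. This is a minimal sketch under stated assumptions, not the plugin's source: the Rawr_Model class, its method names, and the [rawr]N[/rawr] placeholder format are hypothetical, the regular expressions stand in for the built-in shortcode parser that the plugin actually uses, and stand_in_autop() stands in for the priority-10 built-ins such as wpautop().

```php
<?php
// Standalone model of the two-pass placeholder technique.
// Class, method, and function names are hypothetical, not taken from the plugin.

class Rawr_Model {
    // Raw blocks saved by the first pass, keyed by array index.
    private $stash = array();

    // Pass 1 (would run at priority 8): save each raw block, leave only its index.
    public function stash_pass( $content ) {
        return preg_replace_callback(
            '/\[rawr\](.*?)\[\/rawr\]/s',
            function ( $m ) {
                $this->stash[] = $m[1];
                return '[rawr]' . ( count( $this->stash ) - 1 ) . '[/rawr]';
            },
            $content
        );
    }

    // Pass 2 (would run at priority 11): swap each index for the saved raw block.
    public function restore_pass( $content ) {
        return preg_replace_callback(
            '/\[rawr\](\d+)\[\/rawr\]/',
            function ( $m ) {
                $i = (int) $m[1];
                return isset( $this->stash[ $i ] ) ? $this->stash[ $i ] : '';
            },
            $content
        );
    }
}

// Crude stand-in for the priority-10 filters: wrap in <p>, newlines to <br />.
function stand_in_autop( $content ) {
    return '<p>' . str_replace( "\n", "<br />\n", trim( $content ) ) . '</p>';
}

$model  = new Rawr_Model();
$post   = "Intro\n[rawr]<pre>\nkeep \"this\"\n</pre>[/rawr]";
$output = $model->restore_pass( stand_in_autop( $model->stash_pass( $post ) ) );
echo $output, "\n";
```

In this model the text outside the shortcode is still reformatted by the middle step, but the raw block comes out byte-for-byte as it went in, because the middle step only ever saw the placeholder index.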
