I love information. Recently, I caught myself complaining because it took a friend more than 5 minutes to Google something we were arguing about while sitting in a hot tub. Five years ago, we would be lucky if we could even connect to the internet on our cell phones. Mmmm, information is delicious.
One of the things that makes a website useful is how open and accessible its information is. Whenever you provide interesting or unique information, anyone who wants it is going to get it. If you don't make it easy for them, they'll either go somewhere else or find their own way to get it. Imagine if everyone who wanted a stock ticker on their site had to write their own screen scraper. Do I even have to tell you why RSS/Atom feeds are a good thing?
The Problem
My new project will most definitely benefit from open data through a web service. Since I felt this was high on the prioritized features list, I wanted to make sure I had an architecturally sound mechanism to handle all types of requests.
In my case, I wanted almost any page which interacts with data to be accessible in RDF format. Additionally, I wanted it to be easy to add new formats at the view level, without having to touch the controller code (for a true MVC architecture). This means the controller itself should only be concerned with which set of views to push data to, not how to translate the data into any particular format.
My first attempt was rather complicated. I created a router which used one set of routes to strip values from a path, and then a second to do the actual routing. On top of that, the routes had to be confusing in order to get it to work. It wasn't very innovative, but it worked. You can see that attempt here: Zend Framework: Using Transparent Routes.
The Solution: Content Type Filtering
After sleeping on it, I decided to keep it simple. Instead of writing complicated regex routes to do all of the work for me, why not just have a router that knows how to detect the content type specification, and strip it? That's true transparency: don't pass any information to anything that doesn't need it.
To revisit, here are my needs:
- User-specified content type through the URI. Receiving the same data in a different format is as simple as changing the content-type part of the URI.
- The content type does not have to be specified. In that case, it defaults to html.
- Only content types I wish to support will be caught by the detection mechanism. For example, /foo/bar.html will work, but /foo/bar.bazml will not result in 'bazml' being detected as a content type.
- I do not want to break the default parameterization mechanism.
With those requirements in mind, I went to work. The first step is to create a router which is aware of the 'content type' concept. It allows a user to specify a list of content types to detect, and the regex patterns used to detect them. Once it has done its job of detecting the content type, it strips that portion of the URI and passes the cleaned path back to the request object.
The Routing Schema
After some deliberation, I decided to change my "route schema". At first, I wanted the content type to be specified using an extension:
/<controller>/<action>.<contentType>
For example, the following calls AuthorsController::browseAction() and displays the results in RDF format:
/authors/browse.rdf/page/1
The problem with this is simply the way the URIs look. I'm not sure of any implications other than it annoying me. So, I went with my original route schema:
/<contentType>/<controller>/<action>
However, pretend I never said anything. For the purposes of this post, we'll be using the first routing schema. That schema points out some extra efforts that are needed to make this work universally, and provides the best demonstration.
The configuration
<routing>
<contentTypes>html,xml</contentTypes>
<contentTypePattern>\w+\.(#)</contentTypePattern>
<contentTypeReplacePattern>\.(#)</contentTypeReplacePattern>
</routing>
This tells our router that we're only looking for xml and html. We'll use contentTypePattern to detect the content type, and contentTypeReplacePattern to strip it from the path before routing.
Whoa, what's up with the #? Simple: the # character will be replaced with the list of content types from our configuration, joined with |. This allows us to abstract the list of content types we want to detect from the pattern used to find them. Here's what the regex will look like after the str_replace is done to insert the content types:
\w+\.(html|xml)
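The placeholder expansion can be sketched on its own. This is a minimal illustration, not the router's actual code: the function name expandContentTypePattern is made up for the example, and the preg_quote() call is an extra safety step that the router itself does not perform.

```php
<?php
// Hypothetical helper: expand the '#' placeholder into a list of alternatives.
function expandContentTypePattern(string $sPattern, array $aContentTypes): string
{
    // preg_quote() is an addition here; it guards against a content type
    // that happens to contain regex metacharacters.
    $aQuoted = array_map(function ($sType) {
        return preg_quote($sType, '/');
    }, $aContentTypes);

    return str_replace('#', implode('|', $aQuoted), $sPattern);
}

// Types and pattern taken from the configuration above
echo expandContentTypePattern('\w+\.(#)', array('html', 'xml')), "\n";
// prints \w+\.(html|xml)
```

One thing to keep in mind: the pattern as configured has no boundary after the group, so a path such as /foo/bar.xmlfoo would still be detected as xml. Appending a lookahead like (?=\/|$) to both patterns is one possible way to tighten that.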
That makes sense, but why do we need a detection pattern and a stripping pattern? This is exactly why the .extension content type schema makes a better demonstration. Let's say we're trying to determine the content type of a request to `controller/action.xml`. What we want is the 'xml' part. If we write a regex to match the 'xml' part, and use that same pattern to strip it out, we'll end up passing 'controller/action.' as the request path. Obviously, we don't want that stray dot.
Enter the replacement pattern. If we tell our router the entire string that we want to remove, we can also match the dot, and pass 'controller/action' as the request path. This adds much more flexibility to our routes.
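To make the difference concrete, here is a small standalone sketch using plain preg_* calls on the example path from above. The variable names are only for illustration, and the patterns are the already-expanded versions of the two configured ones.

```php
<?php
$sPath  = 'controller/action.xml';
$sTypes = 'html|xml';

// Detection: capture group 1 holds the bare content type
$aMatch = array();
preg_match('/\w+\.(' . $sTypes . ')/', $sPath, $aMatch);
$sContentType = $aMatch[1];

// Stripping only the captured part leaves the dot behind
$sNaive = preg_replace('/(' . $sTypes . ')/', '', $sPath, 1);

// Stripping with the replacement pattern removes the dot as well
$sClean = preg_replace('/\.(' . $sTypes . ')/', '', $sPath);

echo $sContentType, "\n"; // xml
echo $sNaive, "\n";       // controller/action.
echo $sClean, "\n";       // controller/action
```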
The Router
class ContentTypeRouter extends Zend_Controller_Router_Rewrite
{
    protected $_aContentTypes = array();
    protected $_aContentTypeRegex = array();
    protected $_sDefaultContentType = 'html';

    public function addContentType( $sContentType )
    {
        assert( is_string( $sContentType ) );
        $sContentType = strtolower( $sContentType );
        if ( !in_array( $sContentType, $this->_aContentTypes ) )
        {
            $this->_aContentTypes[] = $sContentType;
        }
    }

    public function setDefaultContentType( $sContentType )
    {
        $this->_sDefaultContentType = $sContentType;
    }

    public function addFilterRegex( $sRegex, $sReplacePattern, array $aMap = array() )
    {
        $this->_aContentTypeRegex[] = array(
            'regex'   => $sRegex,
            'replace' => $sReplacePattern,
            'map'     => $aMap );
    }

    /**
     * @see Zend_Controller_Router_Rewrite
     *
     * @param Zend_Controller_Request_Abstract $oRequest
     * @return Zend_Controller_Request_Abstract
     */
    public function route( Zend_Controller_Request_Abstract $oRequest )
    {
        $sContentType = $this->_sDefaultContentType;
        $sContentTypes = join( '|', $this->_aContentTypes );
        $sPath = $this->_getPathInfo( $oRequest );

        foreach( $this->_aContentTypeRegex as $aRegex )
        {
            // insert the supported content types into the detection pattern
            $sRegex = str_replace(
                '#',
                $sContentTypes,
                $aRegex['regex'] );
            $aMatch = array();
            $sRegex = "/$sRegex/";

            if ( 1 === preg_match( $sRegex, $sPath, $aMatch ) )
            {
                //grab the content type from the match
                $sContentType = $aMatch[ 1 ];

                //---------------------------------------
                //- replace the content type in the route
                //---------------------------------------
                $sRegex = str_replace( '#', $sContentTypes, $aRegex['replace'] );
                $sPath = preg_replace( "/$sRegex/", '', $sPath );

                // the first filter to match wins
                break;
            }
        }

        $oRequest->setPathInfo( $sPath );
        $oRequest->setParam( 'requestedContentType', $sContentType );
        $oRequest = parent::route( $oRequest );
        return $oRequest;
    }

    private function _getPathInfo( Zend_Controller_Request_Abstract $oRequest )
    {
        // preg_match() needs a string, so always hand back the path itself
        return $oRequest->getPathInfo();
    }
}
One thing you will notice about the router is that you can stack filters. This means we can support more than one routing schema, and the first one to match will be applied. Now, when I decide to switch back to the `<controller>/<action>.<contentType>` schema, I can do so without breaking existing URLs. My users will love me.
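Here is a rough, framework-free sketch of how stacked filters behave. The function applyContentTypeFilters is a hypothetical stand-in for the loop inside route(), and the second pair of patterns (for the `/<contentType>/<controller>/<action>` schema) is my assumption of what such a filter might look like. Note that the slashes are escaped because the router wraps every pattern in "/" delimiters.

```php
<?php
// Hypothetical stand-in for the filter loop in ContentTypeRouter::route().
// Returns array( detected content type, path with the content type stripped ).
function applyContentTypeFilters($sPath, array $aFilters, array $aTypes, $sDefault = 'html')
{
    $sTypes = implode('|', $aTypes);

    foreach ($aFilters as $aFilter) {
        $sDetect = '/' . str_replace('#', $sTypes, $aFilter['regex']) . '/';
        $aMatch  = array();

        if (1 === preg_match($sDetect, $sPath, $aMatch)) {
            $sStrip = '/' . str_replace('#', $sTypes, $aFilter['replace']) . '/';
            // the first matching filter wins
            return array($aMatch[1], preg_replace($sStrip, '', $sPath));
        }
    }

    // nothing matched: fall back to the default and leave the path alone
    return array($sDefault, $sPath);
}

$aFilters = array(
    // /<controller>/<action>.<contentType>
    array('regex' => '\w+\.(#)', 'replace' => '\.(#)'),
    // /<contentType>/<controller>/<action> (assumed patterns)
    array('regex' => '^\/(#)\/', 'replace' => '^\/(#)(?=\/)'),
);
$aTypes = array('html', 'xml');

print_r(applyContentTypeFilters('/authors/browse.xml/page/1', $aFilters, $aTypes));
print_r(applyContentTypeFilters('/xml/authors/browse', $aFilters, $aTypes));
print_r(applyContentTypeFilters('/foo/bar.bazml', $aFilters, $aTypes));
```

The first two calls both come back as xml with a clean path, and the third falls through to html with the path untouched, since bazml is not a supported content type.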
Usage Example
$oRouter = new ContentTypeRouter();

// load supported content types from config
$aContentTypes = $this->getMainConfiguration()->routing->contentTypes;
$aContentTypes = explode( ',', $aContentTypes );

// add our regex filters
$oRouter->addFilterRegex(
    $this->getMainConfiguration()->routing->contentTypePattern,
    $this->getMainConfiguration()->routing->contentTypeReplacePattern );

// add each supported content type to the router
foreach ( $aContentTypes as $sContentType )
{
    $oRouter->addContentType( $sContentType );
}

// make use of the router before dispatch
$oController = Zend_Controller_Front::getInstance();
$oController->setRouter( $oRouter );
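Once the router has run, the detected type is available as the 'requestedContentType' request parameter. What to do with it is up to the application. As one possible sketch, the hypothetical helper below maps it to a view script suffix. Both the function name and the "<type>.phtml" naming convention are assumptions for illustration, not something the router requires.

```php
<?php
// Hypothetical helper: map the detected content type to a view script suffix.
// Anything outside the supported list falls back to the default.
function viewSuffixFor($sContentType, array $aSupported, $sDefault = 'html')
{
    $sContentType = strtolower((string) $sContentType);
    if (!in_array($sContentType, $aSupported, true)) {
        $sContentType = $sDefault;
    }
    return $sContentType . '.phtml';
}

echo viewSuffixFor('xml', array('html', 'xml')), "\n"; // xml.phtml
echo viewSuffixFor(null, array('html', 'xml')), "\n";  // html.phtml
```

Inside a Zend_Controller_Action, the value would come from $this->getRequest()->getParam('requestedContentType'), and a suffix like this could be handed to the ViewRenderer helper via setViewSuffix(), so the same action renders a different view script per format.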