mod_substitute provides a mechanism to perform both regular expression and fixed string substitutions on response bodies. - Apache Documentation
Cool! I put in the URL mappings and voilà! All was right in the world.
Fast forward a day. Another developer was testing some new code and found that his XML was getting munged. At first we blamed libxml, because a while back we had been through an ordeal with a bad combination of a libxml compile option and PHP. Maybe we missed that box when we fixed it. We recompiled everything on the dev box, but there was no change. So I started to think about what had recently changed on the dev boxes, and I turned off mod_substitute. Dang, that fixed it. I looked at my substitution strings and everything looked fine. After cursing and being depressed that such a cool tool was not working, I took a break to let it settle in my mind.
I came back to the computer and decided to try a virgin Apache 2.2 build. I downloaded the source from the web site instead of building from Gentoo's Portage. Sure enough, a simple test worked fine. No munging. So, I loaded up the dev box Apache configuration into the newly compiled Apache. Sure enough, munged XML. ARGH!!
Up until this point, I had configured the substitutions globally and not in a particular virtual host. So, I moved it all into one virtual host configuration. Still broken.
A little more background on our config. We use mod_proxy to emulate some features that we get in production with our F5 BIG-IP load balancers. So, all requests to a dev box hit a mod_proxy virtual host and are then directed to the appropriate virtual host via a proxied request.
So, I got the idea to hit the virtual host directly on its port and skip mod_proxy. Dang, what do you know, it worked fine. So something about the output of the backend request and mod_proxy was not playing nice. Next, I got the idea to move the mod_substitute directives into the mod_proxy virtual host's configuration. Tested and working fine. As far as I can tell, this ensures that the substitution filtering is done only after the proxied request and all other processing have completed. I am no Apache developer, so I have not dug any deeper. I have a working solution, and maybe this blog post will reach someone who can explain it. As for mod_substitute, here is the way my config looks.
In the VirtualHost that is our global proxy, I have this:
FilterDeclare DN_REPLACE_URLS
FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $text/
FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $/xml
FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $/json
FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $/javascript
FilterChain DN_REPLACE_URLS
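The FilterProvider lines above use the Apache 2.2 form of the directive, where a leading $ on the match string means a substring match against the named response header. In Apache 2.4, mod_filter replaced that form with a single expression argument. A rough, untested sketch of what the same filter might look like there follows; the regular expression is an assumed equivalent of the four substring matches, not something taken from the setup described in this post.

```apache
# Apache 2.4 sketch (untested): FilterProvider takes an expression
# instead of "resp=Content-Type $substring".
FilterDeclare  DN_REPLACE_URLS

# Assumed equivalent of the four substring matches:
#   text/   /xml   /json   /javascript
FilterProvider DN_REPLACE_URLS SUBSTITUTE "%{CONTENT_TYPE} =~ m#text/|/xml|/json|/javascript#"

FilterChain    DN_REPLACE_URLS
```

Folding the four matches into one expression keeps a single provider entry per filter, which is easier to read when the list of content types grows.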
Elsewhere, in a file that is local to each dev host, I keep the actual mappings for that particular host:
Substitute "s|http://dealnews.com|http://somedevbox.dealnews.com|in"
Substitute "s|http://dealmac.com|http://somedevbox.dealmac.com|in"
# etc....
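To make the layout easier to picture, here is a minimal sketch of how the pieces might fit together in Apache 2.2: one front-end virtual host that proxies everything and owns the substitution filter, and one backend virtual host that carries no mod_substitute directives. The ports, the loopback address, the DocumentRoot path, and the use of a `<Location />` wrapper are all assumptions made for illustration; the post does not show the real values.

```apache
# Hypothetical layout; ports, addresses, and paths are placeholders.
Listen 80
Listen 8080

# Front-end virtual host: every request arrives here and is proxied on.
<VirtualHost *:80>
    ServerName somedevbox.dealnews.com

    # Forward to the backend virtual host (assumed to listen on 8080).
    ProxyPass        / http://127.0.0.1:8080/
    ProxyPassReverse / http://127.0.0.1:8080/

    # The filter is declared here, so it sees the response on its way
    # back out through the proxy.
    FilterDeclare  DN_REPLACE_URLS
    FilterProvider DN_REPLACE_URLS SUBSTITUTE resp=Content-Type $text/
    FilterChain    DN_REPLACE_URLS

    # Wrapped in <Location /> as a conservative choice; the post does not
    # show which container the mappings actually live in.
    <Location />
        # i = case-insensitive, n = treat the pattern as a fixed string
        Substitute "s|http://dealnews.com|http://somedevbox.dealnews.com|in"
    </Location>
</VirtualHost>

# Backend virtual host: serves the application, no Substitute directives.
<VirtualHost *:8080>
    ServerName somedevbox.dealnews.com
    DocumentRoot /var/www/dealnews
</VirtualHost>
```

Hitting port 8080 directly in this sketch corresponds to the "skip mod_proxy" test described earlier, and should return the unmodified production URLs.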
I am trying to think of other really cool uses for this. Any ideas?
Olly Says:
The only clean approach to this problem is, imho, to set a constant/config variable (however your stuff works) to the correct host, based on the environment you're in (probably via an Apache/server-wide environment variable?).
This way, everything just works, without the use of any uncontrollable regexes.
But there might be a good reason why you've got the host/domain hardcoded?