Stupid PHP Tricks: Normalizing SimpleXML Data

Mon, Jun 2, 2008 09:59 PM
SimpleXML is neat.  Some people don't think it is so simple.  To them I say: boy, go use the old stuff.  The DOM-XML stuff.

Anyhow, one annoying thing about SimpleXML has to do with caching.  When using web services, we often cache the contents we get back.  We were having a problem where we would get an error about a SimpleXML node not existing.  We were caching the data in memcached which serializes the variable.  So, when it unserialized the variable, there were references in there to some SimpleXML nodes that we did not take care of.  Basically, a tag like:

<foo>bar</foo>

is a string.  But a tag like:

<foo></foo>

is an empty SimpleXML Object.  That is a little annoying, but I don't feel like digging into the C code and figuring out why.  So, we just work around it.  We made a recursive function to do the dirty work for us.
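A minimal sketch of the difference, assuming a made-up document with one filled tag and one empty tag (the names root, foo and baz are only for illustration). Casting the parent to an array shows the two cases side by side. On PHP 7 and 8, serializing the leftover object throws right away; older versions may behave differently, as described above.

```php
<?php
// Hypothetical sample document: one tag with text, one empty tag.
$xml = simplexml_load_string('<root><foo>bar</foo><baz></baz></root>');

// Casting the parent element to an array exposes the mismatch.
$arr = (array)$xml;

var_dump(is_string($arr['foo']));                  // the filled tag is a plain string
var_dump($arr['baz'] instanceof SimpleXMLElement); // the empty tag is still an object

// The leftover object is what breaks caching: it cannot be serialized.
$serializeFailed = false;
try {
    serialize($arr);
} catch (Exception $e) {
    $serializeFailed = true;
}
var_dump($serializeFailed);
```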

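function makeArray($obj) {
    // Cast objects (SimpleXML included) to a plain array.
    $arr = (array)$obj;
    if (empty($arr)) {
        // Empty tags become an empty string.
        $arr = "";
    } else {
        foreach ($arr as $key => $value) {
            // Recurse into anything that is not a scalar.
            if (!is_scalar($value)) {
                $arr[$key] = makeArray($value);
            }
        }
    }
    return $arr;
}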

That will turn whatever you pass it into an array or empty string if it is empty.
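Here is a rough usage sketch. The sample XML and its tag names (item, title, note, tags) are invented for the example, and the function is repeated so the snippet runs on its own. The result is nothing but arrays and strings, so it should survive a serialize round trip of the kind memcached does.

```php
<?php
function makeArray($obj) {
    $arr = (array)$obj;
    if (empty($arr)) {
        $arr = "";
    } else {
        foreach ($arr as $key => $value) {
            if (!is_scalar($value)) {
                $arr[$key] = makeArray($value);
            }
        }
    }
    return $arr;
}

// Hypothetical web service response with an empty tag and a nested list.
$xml = simplexml_load_string(
    '<item><title>Widget</title><note></note>' .
    '<tags><tag>a</tag><tag>b</tag></tags></item>'
);

$data = makeArray($xml);

// Simulate what the cache does: serialize on write, unserialize on read.
$fromCache = unserialize(serialize($data));

var_dump($fromCache);
```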

But, while I was hacking around tonight, I came up with another idea.  Check out this hackery:

$data = json_decode(json_encode($data));

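Yeah!  One liner.  That converts all the SimpleXML elements into stdClass objects.  Scalars and plain lists are left intact, but note that associative arrays also come back as stdClass objects unless you pass true as the second argument to json_decode().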
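A small sketch of the JSON round trip on an invented document (the item, title and note tags are assumptions for the example). One difference from the recursive function worth knowing: here the empty tag ends up as an empty stdClass object, or an empty array when you ask for arrays, not as an empty string.

```php
<?php
// Hypothetical sample document with one empty tag.
$xml = simplexml_load_string('<item><title>Widget</title><note></note></item>');

// Default: SimpleXML elements come back as stdClass objects.
$asObject = json_decode(json_encode($xml));

// Passing true as the second argument returns nested arrays instead.
$asArray = json_decode(json_encode($xml), true);

// Either form can be serialized, so it is safe to cache.
$copy = unserialize(serialize($asObject));

var_dump($asObject, $asArray);
```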

Ok, so this is where someone in the comments can tell me about the magic SimpleXML method or magic OOP function I have missed to take care of all this.  Go ahead, please make my code faster.  I dare you.
14 comments

Dave Marshall Says:

Can't you run asXML() on any node in a SimpleXML object?

Brian Moon Says:

What purpose would asXML serve for caching the data? I don't want to parse it again.

Dave Marshall Says:

I'm not sure I follow. Are you saying you cache a SimpleXML object?

Can't you just cache $simplexml->asXML()?

Brian Moon Says:

Well, we don't want to cache the whole XML document. We are only interested in part of it. I hate XML. The less I have to work with it the better.
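For reference, asXML() can also be called on a child node, so caching only part of the document as a string is possible. A sketch with made-up tag names (feed, meta, item, title); the catch raised above still applies, since the cached string has to be parsed again on every read.

```php
<?php
// Hypothetical response where only the <item> subtree is of interest.
$xml = simplexml_load_string(
    '<feed><meta>skip me</meta><item><title>Widget</title></item></feed>'
);

// asXML() on a child returns just that subtree as a string, which caches fine.
$fragment = $xml->item->asXML();

// Reading it back means parsing again.
$item  = simplexml_load_string($fragment);
$title = (string)$item->title;

var_dump($fragment, $title);
```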

ChieftainY2k Says:

"...We were caching the data in memcached which serializes the variable..."

Keep in mind that it is not possible to serialize PHP built-in object(s), which SimpleXML is :-)

Brian Moon Says:

Yeah, ChieftainY2k, that is kind of the point of my post. The serialization does not work and I get errors. I am not sure what your point is.

speedmax Says:

Wow, I was using exactly the same hack four months ago in a system I built, until I tried to deploy it onto a PHP 5.1 environment.

Then I found the Set::reverse(Object) method in CakePHP, which converts any object into an array. The magical Set class.

Brian Moon Says:

Hmm, one function, json hack or load a huge bloated framework. I think I will stick with my functions. You realize that the CakePHP solution is only doing the same thing right? It is not magical.

Antti Says:

No chance using option LIBXML_NOBLANKS when loading up the XML?

Dar Ksyte Says:

I started to play about with this and immediately fell into a hole.

I started with a few lines of XML which includes an empty tag,

and then made my object with simplexml_load_string.

Dumping the object back to string with XMLobj->asXML() gave me a mangled tag, just

Other tags (with values) survived OK. Something obvious I'm missing? PHP5.

Dar Ksyte Says:

Ah! I'd forgotten that there is a shortform notation for a null item, so asXML() is just tidying up my original.

SneakyWho_am_i Says:

Sadly I don't think there is a faster way to do what you're doing.
Not that I'm going to dig around in the source, I'm too lazy for that (I haven't struck a problem YET)...
Sure, CakePHP is just doing the exact same thing as what you are (not that I've pulled that to bits) but of course if SpeedMax is using Cake then it's right and proper for him to use its existing method (provided it suits his purpose) and not reinvent the wheel, right?

I was even tempted to print the value of the object's node out into a buffer and then capture the buffer (because buffers are strings) but that feels like a big, stupid waste of time and effort more than anything else.
Would probably work to stringify it though (then you can buy the "I've done that thing" tee shirt) because of course you can have nested buffers in PHP

Another thing (of course you've tried this) might be:
header ('Content-Type: text/plain');
$object = simplexml_load_string($data);
Reflection::export(new ReflectionClass($object)); //YAY!!

Anyway good luck finding the quicker way to get that data out (in?). Maybe my stupid rambling about nothing might give you some crazy idea about how to magically beat that object into the desired type.

SneakyWho_am_i Says:

Just to spam (sorry):
===================================================
Reflection::export(new ReflectionClass($filename->filename->filename->filename));
===================================================
It complained that I'd tried to do a reflection on something that wasn't an object. Then it went mental and threw a stack trace.
Interesting.

isa Says:

The simple and stupid solution I've found for this is to use the substr() function with no parameters: $load_array[$i]->date = substr($simplexml->xml_child, null);
For no reason, this converts the simplexml element into the string data type... wheeeee.
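The conversion happens because substr() expects a string, so PHP converts the element before the call. An explicit cast gets the same result more directly. A sketch, assuming a made-up row document; only the xml_child name comes from the comment above.

```php
<?php
// Hypothetical document; only the xml_child name is taken from the comment.
$simplexml = simplexml_load_string('<row><xml_child>2008-06-02</xml_child></row>');

// An explicit string cast turns the element into a plain PHP string.
$date = (string)$simplexml->xml_child;

var_dump($date);
```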
