PHP XPath Tutorial – Advanced XML Part 1
Monday, December 1st, 2008
A few years back, when I first read a book about XML, I could not understand what the hype was all about: just some markup language that lets you store data in a well-organized way. I also remember parsing XML files in PHP back then. The whole file was read using fread(), and then the programmer had to write some kind of parser to process it … completely ridiculous. Sure, ActionScript, probably Java, and some other languages already had classes for parsing XML files, but that was not the case for PHP, so I decided this was a technology that did not deserve a second look.
Fast forward to today: I read an article covering interesting XML features like XPath, XPointer, XInclude and XSLT. Today I want to introduce you to one of them: XPath. I guess its name reveals what it is all about. Basically, XPath is a language for locating and fetching information from an XML tree.
First we need a basic XML document, nothing too complicated but nothing too simple either, to show the power of XPath.
<?xml version="1.0" encoding="UTF-8"?>
<articles>
  <article id="1">
    <tags>
      <tag>php</tag>
      <tag>xpath</tag>
    </tags>
    <title>PHP XPath Example</title>
  </article>
  <article id="2">
    <tags>
      <tag>dom</tag>
      <tag>dodocument</tag>
    </tags>
    <title>DomDocument Tutorial</title>
  </article>
</articles>
Using XPath with DomDocument Object
Obviously, new technologies are introduced to make development easier, not harder, so we will use the DOMDocument object, because PHP already ships XPath support for it through the DOMXPath class.
<?php
$doc = new DOMDocument;
$doc->load('articles.xml');  // the sample document above

$xpath = new DOMXPath($doc);
$arts = $xpath->query("/articles/article/title");

foreach ($arts as $art) {
    echo $art->nodeValue . "\n";  // one title per line
}
This example outputs
PHP XPath Example
DomDocument Tutorial
I assume you already know the basics of object-oriented programming, so the most important line here is the call to $xpath->query(). Note how the path is given: it starts with the “/” sign, which means that we start searching from the root of the document. We can also use more complex paths, for example:
/articles/article[@id='1']/title
The @ sign refers to an attribute, so this path selects only the article whose ‘id’ attribute equals 1. If you replace the path in the PHP example with this one, the result will be
PHP XPath Example
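To try both paths side by side, here is a minimal sketch. It assumes the sample document is inlined with loadXML() instead of being read from articles.xml, and titlesFor() is a hypothetical helper introduced only for this illustration.

```php
<?php
// Sample document from the article, inlined so the sketch is self-contained.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<articles>
  <article id="1">
    <tags><tag>php</tag><tag>xpath</tag></tags>
    <title>PHP XPath Example</title>
  </article>
  <article id="2">
    <tags><tag>dom</tag><tag>dodocument</tag></tags>
    <title>DomDocument Tutorial</title>
  </article>
</articles>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Hypothetical helper (not from the article): collects the text of every node matched by $path.
function titlesFor(DOMXPath $xpath, string $path): array
{
    $titles = [];
    foreach ($xpath->query($path) as $node) {
        $titles[] = $node->nodeValue;
    }
    return $titles;
}

// Every title in the document.
$all = titlesFor($xpath, "/articles/article/title");

// Only the title of the article whose id attribute equals '1'.
$first = titlesFor($xpath, "/articles/article[@id='1']/title");

echo implode("\n", $first), "\n";
```

The predicate in square brackets filters the article elements before the path continues down to title, which is why only one title is left.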
But that’s not all. XPath allows for even more complicated searches. Replace the path in the PHP script with the following one.
/articles/article/tags[tag='php' or tag='xpath']/../title
Do you know what we will get in response? Analyze our XML document before reading further.
As a result we get on the screen
PHP XPath Example
The query first looks for a tags section that contains a “tag” element whose value equals “php” or “xpath”, then goes one level up (i.e. ‘..’) and fetches the article title. Queries can be very complicated and contain a lot of conditions and even “if” statements. However, my goal was to introduce you to XPath, not to explain everything, mainly because I am a beginner as well, but moreover because it would take a whole book to cover all the possibilities that XPath gives.
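The “one level up” step is not the only way to write this search. As a sketch (the second path is my own alternative, not something from the original example), the same condition can be put on the article element itself, so no ‘..’ is needed. The sample document is inlined again to keep the snippet self-contained.

```php
<?php
// Sample document from the article, inlined so the sketch is self-contained.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<articles>
  <article id="1">
    <tags><tag>php</tag><tag>xpath</tag></tags>
    <title>PHP XPath Example</title>
  </article>
  <article id="2">
    <tags><tag>dom</tag><tag>dodocument</tag></tags>
    <title>DomDocument Tutorial</title>
  </article>
</articles>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// The path from the article: filter on <tags>, then step back up with '..'.
$viaParent = $xpath->query("/articles/article/tags[tag='php' or tag='xpath']/../title");

// Alternative (an assumption of this sketch): filter the <article> directly.
$viaPredicate = $xpath->query("/articles/article[tags/tag='php' or tags/tag='xpath']/title");

foreach ([$viaParent, $viaPredicate] as $nodes) {
    foreach ($nodes as $node) {
        echo $node->nodeValue, "\n";
    }
}
```

Both queries select the same single title node. Even though article 1 carries both matching tags, the title is returned once, because the result of an XPath path is a set of nodes.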
If after this article you want more (and I hope you do), then visit the XPath Documentation at W3C. I also have to mention Lucas and his XPath, XPointer, XInclude article; if not for him, I would never know about this stuff.



Hey there Greg,
There’s a typo in the first line. Hype is spelled as hyip.
Other than that, stellar stuff.
Would you recommend any books for getting up to speed on Advanced XML?
A great resource
I must say… I searched for 2 days for a way to parse an XML document, and all I was able to find was tutorials using the built-in PHP parser (xml_parser_create()), which seemed too complex for me (me being a newbie with PHP and OOP). But now that I have found XPath, I wonder why it is not used more frequently. Is it not a default add-on in PHP/Apache? Or why?
PS: I forgot to thank you for the excellent tutorial
@Avenirer DomDocument and other Dom* classes are part of the PHP core so they are enabled by default. Why are people not using it? I guess they are too lazy to learn new stuff, even if it would speed up their development.
@Hoorain Jalali, sorry I did not notice your comment (although I was the one who approved it). Obviously the book I recommend is the one listed at the beginning of the article, “XSLT 2.0 and XPath 2.0 Programmers Reference”, although it’s a book for .NET and Java programmers, so take that into account before you decide to buy it.
Hello again…
I tried to apply this tutorial on what I needed, yet it didn’t work. Well, it did work, but only after I changed the .xml file to look something like yours. And that’s not good… as I try to parse an external file, and I don’t want to intervene on the xml file.
Here’s how it goes:
National Bank of Romania
2009-05-29
DR
Reference rates
RON
0.8081
…and the rest for closing the .xml…
I tried to find the rate by writing the path like this: “DataSet/Body/Cube/Rate”; it didn’t work. Then I tried doing it like: “//Rate”. No result.
It only worked when I changed the .xml by deleting up until “body” tag and after “/body” tag. I don’t get it
Oups…
sorry… the xml was something like…
xml version=”1.0″ encoding=”UTF-8″
>DataSet xmlns=”" xmlns:xsi=”" xsi:schemaLocation=”"
>>Header
>>>Sender /Sender
>>>SendingDate /SendingDate
>>>MessageType /MessageType
>>/Header
>>Body
>>>Subject /Subject
>>>OrigCurrency /OrigCurrency
>>>Cube date=”2009-05-29″
>>>>Rate currency /Rate
>>>/Cube
>>/Body
>DataSet
Thanks Greg for this wonderful tutorial…..i was using that old complicated method for parsing xml in php but now i’ll use this Xpath funda…………thanks once again………
greg im getting this error “Cannot instantiate non-existent class: domdocument”
do we need some additional packages to be installed for this????
@Avenirer you said you tried “DataSet/Body/Cube/Rate”, i suppose it should be “/DataSet/Body/Cube/Rate” with a slash at the beginning?
@Anna All libraries should be installed by default. If not then required library should be already on your disc (it should be named php-dom.dll or something like that). You just need to activate it in php.ini and restart apache.
Yes… sorry about mistype… it was “/DataSet/Body/Cube/Rate” and didn’t work
To be honest, I have no idea why that doesn’t work; the query seems OK. Maybe you need to call $xpath->registerNamespace() if there are some namespaces used in the document?
Perfect tutorial
Hello
Thanks for the good description.
Your sample works.
But I try to parse the result of the webservice
http://local.yahooapis.com/MapsService/V1/geocode?appid=YD-9G7bey8_JXxQP6rxl.fBFGgCdNjoDMACQA–&street=701%20First%20Ave&city=Sunnyvale&state=CA
This webservice returns the coordinates of an address.
The problem is that the returned XML starts with:
<ResultSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:yahoo:maps" xsi:schemaLocation="urn:yahoo:maps http://api.local.yahoo.com/MapsService/V1/GeocodeResponse.xsd">
etc…
And this stuff breaks the Xpath!!
I think this is a PHP bug.
If you take your example (as it is) and replace the first XML line
<articles>
with
<articles xmlns:xsi=”http://www…etc…>
the code will stop working.
What the hell can I do to make Xpath work with Google ?
Elmü
Yeahhh !
After hours I have got the solution:
Reading Google GeoMap:
PHP:
$doc = new DomDocument;
$doc->load('GeoCode.xml');
$XPath = new DOMXPath($doc);
$XPath->registerNamespace("ns", "urn:yahoo:maps");
echo "Latitude= " . $XPath->query("//ns:Result/ns:Latitude")->item(0)->nodeValue;
echo "Longitude= " . $XPath->query("//ns:Result/ns:Longitude")->item(0)->nodeValue;
_________________________________________________
The XML document “GeoCode.xml” which comes like this from the server:
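As a stand-in for the server response, here is a minimal sketch with a made-up document. Only the default namespace “urn:yahoo:maps” and the element names come from the code above; the coordinate values are placeholders, not real service output. It suggests that the culprit is most likely the default namespace (xmlns="…") rather than the xmlns:xsi declaration: an unprefixed name in an XPath 1.0 query means “no namespace”, so it finds nothing until a prefix is registered.

```php
<?php
// Made-up stand-in for the service response. Only the namespace URI and the
// element names are taken from the comment above; the values are placeholders.
$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<ResultSet xmlns="urn:yahoo:maps">
  <Result>
    <Latitude>1.5</Latitude>
    <Longitude>-2.5</Longitude>
  </Result>
</ResultSet>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Unprefixed names do not match elements that live in a default namespace.
$unprefixed = $xpath->query("//Result/Latitude");

// Bind a prefix of our own choosing ("ns") to the document's namespace URI.
$xpath->registerNamespace("ns", "urn:yahoo:maps");
$prefixed = $xpath->query("//ns:Result/ns:Latitude");

echo "Without prefix: ", $unprefixed->length, " match(es)\n";
echo "With prefix: ", $prefixed->length, " match(es)\n";
```

The prefix name is arbitrary; what has to match is the namespace URI passed to registerNamespace().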
Hey Greg
It would be great if you would add my last posting as a new chapter to your article.
It would be very helpful for thousands who are working with SOAP and who must invest hours searching for a WORKING sample like I did.
Elmü
Hey Elmue, you did great work with that one. If you agree, I would rather add it as a new post on the blog, as that would make it easier for people to find it using search engines.
Hello Greg
I dont think that spreading information to multiple places is a good idea.
I found this article because it has high ranking in Google.
If you put it into another place, related information will be spread to different pages.
Do what you want, but my opinion is that on this page here it would be perfect.
Additionally I have very valuable link, which you can also add:
http://www.w3schools.com/XPath/xpath_syntax.asp
This information is very difficult to find.
Elmü
Thanks this is a nice way to handle XML especially with PHP4. PHP5 handles XML much better than 4 ever did. thanks for the article, much appreciated.
Thanks , it was a great tutorial
A great tutorial, superb. I was struggling for 2 days to find a way of going through an XML document. Thanks again.
Great way to work on programming
Obviously this is a Helpful post…..Thanks
oh man….you did a great job….keep going…
Your English jammed my brain. I fixed it for you.
…
A few years back when I first read a book about XML I could not understand what the hype was all about. It was some markup language that stores data in a well organized way. I also remember parsing XML files in PHP a few years back. The whole file was read using fread() and then a programmer had to create some kind of parser to process the XML file. It was completely ridiculous. Sure ActionScript, probably Java and some other languages already had classes for parsing XML files, but that was not the case for PHP. So I thought that this was a technology that did not deserve a second look.
Fast forward to today. I read an article covering interesting XML features like XPath, XPointer, XInclude and XSLT. Today, I want to introduce you to one of them: XPath. I guess its name reveals what it is all about. Basically XPath is a language that allows you to locate and fetch information from an XML tree.
First we need a basic XML document, nothing too complicated but nothing too simple either, to show the power of XPath.
Hehehe, thanks @Charles, I will update the article when I have some time.
Awesome article! However, I was trying to run this piece of code against the following XML that is returned from eBay and I did not get any response. It seems like the SimpleXML xpath does not like some of the queries, or at least I couldn’t make it work. It does, however, work if you only run the XPath query in an XPath query test tool!
341296208834
http://product.half.ebay.com/GMAT-Quantitative-Review-Paperback-2005_W0QQprZ48636818QQtgZvidetailsQQitemZ341296208834
Camp Hill, Pennsylvania
1
24
80.0
1.99
http://shops.half.ebay.com/wutangxx3_W0QQsellerZwutangxx3
wutangxx3
US
VeryGood
Very good condition with no writing! Fast shipping!
// the php code
$xml = simplexml_load_file(the xml file);
$xml->registerXpathNamespace('ebay', 'urn:ebay:apis:eBLBaseComponents');
try{
//this should give me the StoreName that has the current price of 1.98
$resp = $xml->xpath("//ebay:CurrentPrice[.=1.98]/following-sibling::Storefront/StoreName");
} catch (Exception $e)
{
echo $e->getMessage();
}
var_dump($resp);//returns nothing
any idea?
Oops, sorry, the XML tags were stripped!