PHP XPath Tutorial – Advanced XML Part 1

Posted on December 1st, 2008 at 7:47 pm

A few years back, when I first read a book about XML, I could not understand what the hype was all about: some markup language that lets you store data in a well-organized way. I also remember parsing XML files in PHP back then. The whole file was read using fread(), and then the programmer had to write some kind of parser to process it … completely ridiculous. Sure, ActionScript, probably Java, and some other languages already had classes for parsing XML files, but that was not the case for PHP, so I thought this was a technology that did not deserve a second look.

Fast forward to today: I read an article covering interesting XML features like XPath, XPointer, XInclude and XSLT. Today I want to introduce you to one of them, XPath. I guess its name reveals what it is all about. Basically, XPath is a language that lets you locate and fetch information from an XML tree.

First we need a basic XML document, nothing too complicated but nothing too simple either, to show the power of XPath.

<?xml version="1.0" encoding="UTF-8"?>
<articles>
    <article id="1">
        <tags>
            <tag>php</tag>
            <tag>xpath</tag>
        </tags>
        <title>PHP XPath Example</title>
    </article>
    <article id="2">
        <tags>
            <tag>dom</tag>
            <tag>domdocument</tag>
        </tags>
        <title>DomDocument Tutorial</title>
    </article>
</articles>

Using XPath with DomDocument Object

New technologies are introduced to make development easier, not harder, so we will use the DomDocument object, which has built-in support for XPath.

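<?php
// Load the XML document shown above.
$doc = new DOMDocument;
$doc->load('articles.xml');

// Run an XPath query against the loaded document.
$xpath = new DOMXPath($doc);
$arts = $xpath->query("/articles/article/title");

// Print the text content of every matching <title> node, one per line.
foreach ($arts as $art)
{
    echo $art->nodeValue . "\n";
}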

This example outputs

PHP XPath Example
DomDocument Tutorial

I assume you already know the basics of object-oriented programming, so the most important line here is the one that calls $xpath->query(). Note how the path is given. It starts with the “/” sign, which means that we start searching from the root of the document. We can also use more complex paths, for example:

/articles/article[@id='1']/title

The @ sign refers to an attribute, so the part in square brackets means that we want only the article whose ‘id’ attribute equals 1. If you replace the path in the PHP example with this one, you will get

PHP XPath Example
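
The attribute predicate can be tried without a separate file. The sketch below is one way to do it, assuming a trimmed copy of the sample document is embedded as a string (the $xml variable and the use of loadXML() instead of load() are choices made for this example, not part of the script above). It also selects the id attributes themselves with @id.

```php
<?php
// Assumption: a trimmed copy of the sample document, inlined so the snippet runs on its own.
$xml = <<<XML
<articles>
    <article id="1">
        <title>PHP XPath Example</title>
    </article>
    <article id="2">
        <title>DomDocument Tutorial</title>
    </article>
</articles>
XML;

$doc = new DOMDocument;
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Title of the article whose id attribute equals '1'.
$titles = [];
foreach ($xpath->query("/articles/article[@id='1']/title") as $node) {
    $titles[] = $node->nodeValue;
}

// Attribute nodes can be selected directly as the last step of a path.
$ids = [];
foreach ($xpath->query("/articles/article/@id") as $attr) {
    $ids[] = $attr->nodeValue;
}

echo implode("\n", $titles), "\n";
echo implode(",", $ids), "\n";
```

The first query returns a single title, and the second returns one attribute node per article, so the script prints the title followed by the list of ids.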

But that’s not all. XPath allows for even more complicated searches. Replace the path in the PHP script with the following one.

/articles/article/tags[tag='php' or tag='xpath']/../title

Do you know what we will get in response? Analyze our XML document before reading further.

As a result we get on the screen

PHP XPath Example

The query first looks in the tags section and checks whether it contains a “tag” element with the value “php” or “xpath”. Then it goes one level up (i.e. “..”) and fetches the article title. Queries can be very complicated and contain a lot of conditions, and XPath 2.0 even adds “if” expressions (PHP’s DOMXPath implements XPath 1.0, which does not have them). However, my goal was to introduce you to XPath, not to explain everything, mainly because I am a beginner as well, but moreover because it would take a whole book to cover all the possibilities that XPath gives.
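
To see the predicate and the parent step at work, here is a small sketch. It assumes a trimmed copy of the sample document inlined as a string, and the helper name titlesFor() is made up for this example. It runs the query from above next to an equivalent form that puts the condition on the article element directly and so needs no “..”.

```php
<?php
// Assumption: a trimmed copy of the sample document, inlined so the snippet is self-contained.
$xml = <<<XML
<articles>
    <article id="1">
        <tags><tag>php</tag><tag>xpath</tag></tags>
        <title>PHP XPath Example</title>
    </article>
    <article id="2">
        <tags><tag>dom</tag></tags>
        <title>DomDocument Tutorial</title>
    </article>
</articles>
XML;

$doc = new DOMDocument;
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// Hypothetical helper: run a query and collect the text of each matching node.
function titlesFor(DOMXPath $xpath, $query)
{
    $out = [];
    foreach ($xpath->query($query) as $node) {
        $out[] = $node->nodeValue;
    }
    return $out;
}

// The query from the article: filter <tags>, then step up to <article> with "..".
$viaParent = titlesFor($xpath, "/articles/article/tags[tag='php' or tag='xpath']/../title");

// The same condition placed on <article> itself, so no parent step is needed.
$direct = titlesFor($xpath, "/articles/article[tags/tag='php' or tags/tag='xpath']/title");

// A tag that only the second article carries.
$dom = titlesFor($xpath, "/articles/article[tags/tag='dom']/title");

print_r($viaParent);
print_r($direct);
print_r($dom);
```

Both forms select the same title here. Which one reads better is mostly a matter of taste, although the second keeps the whole condition in one place.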

If after this article you want more (and I hope you do), then visit the XPath documentation at W3C. I also have to mention the Lucas – XPath, XPointer, XInclude article; if not for him, I would never have known about this stuff.

About this author

Greg Winiarski

Greg Winiarski is a freelance PHP and JavaScript programmer. He specializes in web applications and WordPress development.

33 Responses to “PHP XPath Tutorial – Advanced XML Part 1”

  1. Hey there Greg,

    There’s a typo in the first line. Hype is spelled as hyip.
    Other than that, stellar stuff.

    Would you recommend any books for getting up to speed on Advanced XML?

  2. mike says:

    A great resource

  3. Avenirer says:

    I must say… I searched for 2 days for a way to parse an XML document, and all I was able to find were tutorials using the built-in PHP parser (xml_parser_create()), which seemed too complex for me (being a newbie with PHP and OOP). But now that I have found XPath, I wonder why it is not used more frequently. Is it not a default add-on in PHP/Apache? Or why?

  4. Avenirer says:

    PS: I forgot to thank you for the excellent tutorial :)

  5. Greg says:

    @Avenirer DomDocument and the other Dom* classes are part of the PHP core, so they are enabled by default. Why are people not using them? I guess they are too lazy to learn new stuff :) even if it would speed up their development.

    @Hoorain Jalali, sorry, I did not notice your comment (although I was the one who approved it :P ). Obviously the book I recommend is the one listed at the beginning of the article, “XSLT 2.0 and XPath 2.0 Programmers Reference”, although it’s a book for .NET and Java programmers; take that into account before you decide to buy it.

  6. Avenirer says:

    Hello again…
    I tried to apply this tutorial on what I needed, yet it didn’t work. Well, it did work, but only after I changed the .xml file to look something like yours. And that’s not good… as I try to parse an external file, and I don’t want to intervene on the xml file.
    Here’s how it goes:

    National Bank of Romania
    2009-05-29
    DR

    Reference rates
    RON

    0.8081
    …and the rest for closing the .xml…
    I tried to find the rate by writing the path like this: “DataSet/Body/Cube/Rate”; it didn’t work. Then I tried doing it like: “//Rate”. No result.
    It only worked when I changed the .xml by deleting up until “body” tag and after “/body” tag. I don’t get it :-|

  7. Avenirer says:

    Oups…
    sorry… the xml was something like…
    xml version=”1.0″ encoding=”UTF-8″
    >DataSet xmlns=”" xmlns:xsi=”" xsi:schemaLocation=”"
    >>Header
    >>>Sender /Sender
    >>>SendingDate /SendingDate
    >>>MessageType /MessageType
    >>/Header
    >>Body
    >>>Subject /Subject
    >>>OrigCurrency /OrigCurrency
    >>>Cube date=”2009-05-29″
    >>>>Rate currency /Rate
    >>>/Cube
    >>/Body
    >DataSet

  8. Anna says:

    Thanks Greg for this wonderful tutorial…..i was using that old complicated method for parsing xml in php but now i’ll use this Xpath funda…………thanks once again………

  9. Anna says:

    greg im getting this error “Cannot instantiate non-existent class: domdocument”

    do we need some additional packages to be installed for this????

  10. Greg says:

    @Avenirer you said you tried “DataSet/Body/Cube/Rate”. I suppose it should be “/DataSet/Body/Cube/Rate”, with a slash at the beginning?

    @Anna All libraries should be installed by default. If not, the required library should already be on your disk (it should be named php-dom.dll or something like that). You just need to activate it in php.ini and restart Apache.

  11. Avenirer says:

    Yes… sorry about mistype… it was “/DataSet/Body/Cube/Rate” and didn’t work

  12. Greg says:

    To be honest, I have no idea why that doesn’t work. The query seems OK; maybe you need to call $xpath->registerNamespace() if there are namespaces used in the document?

  13. Rijas says:

    Perfect tutorial

  14. Elmue says:

    Hello

    Thanks for the good description.
    Your sample works.

    But I try to parse the result of the webservice
    http://local.yahooapis.com/MapsService/V1/geocode?appid=YD-9G7bey8_JXxQP6rxl.fBFGgCdNjoDMACQA–&street=701%20First%20Ave&city=Sunnyvale&state=CA

    This webservice returns the coordinates of an address.

    The problem is that the returned XML starts with:

    <ResultSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:yahoo:maps" xsi:schemaLocation="urn:yahoo:maps http://api.local.yahoo.com/MapsService/V1/GeocodeResponse.xsd">
    etc…

    And this stuff breaks the Xpath!!
    I think this is a PHP bug.

    If you take your example (as it is) and replace the first XML line
    <articles>
    with
    <articles xmlns:xsi=”http://www…etc…>

    the code will stop working.
    What the hell can I do to make XPath work with Yahoo?

    Elmü

  15. Elmue says:

    Yeahhh !
    After hours I have got the solution:

    Reading the Yahoo geocode response:

    PHP:
    $doc = new DomDocument;
    $doc->load('GeoCode.xml');

    $XPath = new DOMXPath($doc);
    $XPath->registerNamespace("ns", "urn:yahoo:maps");

    echo "Latitude= " . $XPath->query("//ns:Result/ns:Latitude") ->item(0)->nodeValue;
    echo "Longitude= " . $XPath->query("//ns:Result/ns:Longitude")->item(0)->nodeValue;
    _________________________________________________

    The XML document “GeoCode.xml” which comes like this from the server:

    <?xml version="1.0"?>
    
    <ResultSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:yahoo:maps" xsi:schemaLocation="urn:yahoo:maps http://api.local.yahoo.com/MapsService/V1/GeocodeResponse.xsd">
    	<Result precision="address">
    		<Latitude>37.416397</Latitude>
    		<Longitude>-122.025055</Longitude>
    		<Address>701 1st Ave</Address>
    		<City>Sunnyvale</City>
    		<State>CA</State>
    		<Zip>94089-1019</Zip>
    		<Country>US</Country>
    	</Result>
    </ResultSet>
    
    <!-- ws06.search.re2.yahoo.com uncompressed Thu Jun 25 16:10:14 PDT 2009 -->
    
  16. Elmue says:

    Hey Greg

    It would be great if you would add my last posting as a new chapter to your article.
    It would be very helpful for the thousands who are working with SOAP and who must invest hours searching for a WORKING sample like I did.

    Elmü

  17. Greg says:

    Hey Elmue, you did great work with that one. If you agree, I would rather add it as a new post on the blog, as that would make it easier for people to find it using search engines.

  18. Elmue says:

    Hello Greg

    I don’t think that spreading information to multiple places is a good idea.
    I found this article because it has high ranking in Google.
    If you put it into another place, related information will be spread to different pages.
    Do what you want, but my opinion is that on this page here it would be perfect.

    Additionally I have very valuable link, which you can also add:
    http://www.w3schools.com/XPath/xpath_syntax.asp
    This information is very difficult to find.

    Elmü

  19. Thanks this is a nice way to handle XML especially with PHP4. PHP5 handles XML much better than 4 ever did. thanks for the article, much appreciated.

  20. pinak says:

    Thanks , it was a great tutorial

  21. dharshana says:

    a great tutorial..superb..I was struggling for 2 days to find a way of going through an xml document..thanks again

  22. desi music says:

    Great way to work on programming

  23. Obviously this is a Helpful post…..Thanks

  24. jay says:

    oh man….you did a great job….keep going…

  25. Charles says:

    Your English jammed my brain. I fixed it for you.

    A few years back when I first read a book about XML I could not understand what the hype was all about. It was some markup language that stores data in a well-organized way. I also remember parsing XML files in PHP a few years back. The whole file was read using fread() and then a programmer had to create some kind of parser to process the XML file. It was completely ridiculous. Sure ActionScript, probably Java and some other languages already had classes for parsing XML files, but that was not the case for PHP. So I thought that this was a technology that did not deserve a second look.

    Fast forward to today. I read an article covering interesting XML features like XPath, XPointer, XInclude and XSLT. Today, I want to introduce you to one of them: XPath. I guess its name reveals what it is all about. Basically XPath is a language that lets you locate and fetch information from an XML tree.

    First we need a basic XML document, nothing too complicated but nothing too simple either, to show the power of XPath.

  26. Greg says:

    Hehehe, thanks @Charles, I will update the article when I have some time.

  27. Arash M. says:

    Awesome article! However, I was trying to run this piece of code against the following XML returned from eBay and I did not get any response. It seems like the SimpleXML xpath does not like some of the queries, or at least I couldn’t make it work. It does, however, work if you run the same query in an XPath query test tool!

    341296208834

    http://product.half.ebay.com/GMAT-Quantitative-Review-Paperback-2005_W0QQprZ48636818QQtgZvidetailsQQitemZ341296208834

    Camp Hill, Pennsylvania
    1

    24
    80.0

    1.99

    http://shops.half.ebay.com/wutangxx3_W0QQsellerZwutangxx3

    wutangxx3

    US
    VeryGood

    Very good condition with no writing! Fast shipping!

    // the php code

    $xml = simplexml_load_file(the xml file);
    $xml->registerXpathNamespace('ebay', 'urn:ebay:apis:eBLBaseComponents');
    try{

    //this should give me the store name that has the current price of 1.98
    $resp = $xml->xpath("//ebay:CurrentPrice[.=1.98]/following-sibling::Storefront/StoreName");

    } catch (Exception $e)
    {
    echo $e->getMessage();
    }

    var_dump($resp);//returns nothing

    any idea?

  28. Arash M. says:

    Oops, sorry, the XML tags were stripped!

  30. paul says:

    great resource will implement on my new project.

  31. Serg says:

    Very nice! I always have some headache with xml. Hope it will be finished with this tool.

  32. nice work, thanks for sharing, great tutorial

  33. buzzknow says:

    How to create XML like an example with createElement function in PHP?

    thanks
