PHP XPath Tutorial – Advanced XML Part 1

Written by Greg Winiarski on December 1, 2008 in XML - 43 Comments

A few years back, when I first read a book about XML, I could not understand what the hype was all about: it seemed to be just another markup language for storing data in a well-organized way. I also remember parsing XML files in PHP back then. The whole file was read using fread(), and then the programmer had to write some kind of parser to process it… completely ridiculous. Sure, ActionScript, probably Java, and some other languages already had classes for parsing XML files, but that was not the case for PHP, so I decided that this was a technology that did not deserve a second look.

Fast forward to today: I read an article covering interesting XML features such as XPath, XPointer, XInclude, and XSLT. Today I want to introduce you to one of them, XPath. I guess its name reveals what it is all about: XPath is a language for locating and fetching information from an XML tree.

First we need a basic XML document, nothing too complicated but nothing too simple either, to show the power of XPath.

<?xml version="1.0" encoding="UTF-8"?>
<articles>
    <article id="1">
        <tags>
            <tag>php</tag>
            <tag>xpath</tag>
        </tags>
        <title>PHP XPath Example</title>
    </article>
    <article id="2">
        <tags>
            <tag>dom</tag>
            <tag>domdocument</tag>
        </tags>
        <title>DomDocument Tutorial</title>
    </article>
</articles>

Using XPath with DomDocument Object

New technologies are introduced to make the development process easier, not harder, so we will use the DOMDocument object, which supports XPath through the companion DOMXPath class.

<?php
$doc = new DOMDocument;
$doc->load('articles.xml');

$xpath = new DOMXPath($doc);
$arts = $xpath->query("/articles/article/title");

// Print each matched title on its own line.
foreach ($arts as $art)
{
    echo $art->nodeValue . "\n";
}

This example outputs

PHP XPath Example
DomDocument Tutorial

I assume you already know the basics of object-oriented programming, so the most important line here is the one that calls $xpath->query(). Note how the path is given: it starts with “/”, which means that the search starts from the root of the document. We can also use more complex paths, for example:

/articles/article[@id='1']/title

The @ sign refers to an attribute: the predicate [@id='1'] keeps only the article elements whose ‘id’ attribute equals 1. If you replace the path in the PHP example with this one, the result will be:

PHP XPath Example
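
To try the attribute predicate without a file on disk, here is a minimal sketch that loads an inline document modeled on articles.xml with loadXML(). The $articleId variable and the $titles array are illustrative names, not part of the original example.

```php
<?php
// Inline document modeled on the tutorial's articles.xml.
$xml = <<<XML
<articles>
    <article id="1">
        <tags><tag>php</tag><tag>xpath</tag></tags>
        <title>PHP XPath Example</title>
    </article>
    <article id="2">
        <tags><tag>dom</tag><tag>domdocument</tag></tags>
        <title>DomDocument Tutorial</title>
    </article>
</articles>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

// $articleId is a hypothetical variable. It is hard-coded here, so
// concatenating it into the expression is harmless.
$articleId = '1';
$nodes = $xpath->query("/articles/article[@id='" . $articleId . "']/title");

// Collect the text of every matched <title> element.
$titles = [];
foreach ($nodes as $node) {
    $titles[] = $node->nodeValue;
}

echo implode("\n", $titles), "\n";
```

Only the article whose id attribute is 1 passes the predicate, so this prints the single title PHP XPath Example. If the id ever comes from user input, it should be validated before being concatenated into the expression.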

But that’s not all. XPath allows for even more complicated searches. Replace the path in the PHP script with the following one:

/articles/article/tags[tag='php' or tag='xpath']/../title

Do you know what we will get in response? Analyze our XML document before reading further.

As a result, we get on the screen:

PHP XPath Example

The query first looks in each tags section for a “tag” element with the value “php” or “xpath”; then it goes “one level up” (i.e. ‘..’) to the enclosing article and fetches that article’s title. Queries can get very complicated and contain a lot of conditions, and XPath 2.0 even has “if” expressions (PHP’s DOMXPath implements XPath 1.0, which does not have them). My goal, however, was to introduce you to XPath, not to explain everything, mainly because I am a beginner as well, but also because it would take a whole book to cover all the possibilities that XPath offers.
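
The parent step is not the only way to write this query. As a sketch, the same article can be selected by moving the condition into a predicate on article itself, which avoids going down to tags and back up. The helper titlesFor() is a hypothetical name used only for this illustration.

```php
<?php
// Inline document modeled on the tutorial's articles.xml.
$doc = new DOMDocument();
$doc->loadXML(
    '<articles>'
    . '<article id="1"><tags><tag>php</tag><tag>xpath</tag></tags>'
    . '<title>PHP XPath Example</title></article>'
    . '<article id="2"><tags><tag>dom</tag><tag>domdocument</tag></tags>'
    . '<title>DomDocument Tutorial</title></article>'
    . '</articles>'
);
$xpath = new DOMXPath($doc);

// Hypothetical helper: run an expression and return the matched text values.
function titlesFor(DOMXPath $xpath, $expression)
{
    $titles = [];
    foreach ($xpath->query($expression) as $node) {
        $titles[] = $node->nodeValue;
    }
    return $titles;
}

// The query from the article: match <tags>, step up with "..", take the title.
$viaParent = titlesFor(
    $xpath,
    "/articles/article/tags[tag='php' or tag='xpath']/../title"
);

// Same selection with the condition attached to <article> directly.
$viaPredicate = titlesFor(
    $xpath,
    "/articles/article[tags/tag='php' or tags/tag='xpath']/title"
);

print_r($viaParent);
print_r($viaPredicate);
```

Both arrays hold the single title PHP XPath Example. Which form reads better is a matter of taste; the predicate form keeps the whole path pointing downward.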

If after this article you want more (and I hope you do), visit the XPath documentation at W3C. I also have to mention Lucas’s article on XPath, XPointer, and XInclude; if not for him, I would never have known about this stuff.

About the Author

Greg Winiarski is a freelance PHP and JavaScript programmer. He specializes in web applications and WordPress development.

43 Comments on "PHP XPath Tutorial – Advanced XML Part 1"

  1. Hoorain Jalali February 21, 2009 at 11:37 pm ·

    Hey there Greg,

    There’s a typo in the first line. Hype is spelled as hyip.
    Other than that, stellar stuff.

    Would you recommend any books for getting up to speed on Advanced XML?

  2. mike May 11, 2009 at 11:44 pm ·

    A great resource

  3. Avenirer May 31, 2009 at 10:40 am ·

    I must say… I searched for 2 days for a way to parse an XML document, and all I was able to find were tutorials using the built-in PHP parser (xml_parser_create()), which seemed too complex for me (being a newbie with PHP and OOP). But now that I have found XPath, I wonder why it is not used more frequently. Is it not a default add-on in PHP/Apache? Or why?

  4. Avenirer May 31, 2009 at 10:42 am ·

    PS: I forgot to thank you for the excellent tutorial :)

  5. Greg May 31, 2009 at 12:23 pm ·

    @Avenirer DomDocument and the other Dom* classes are part of the PHP core, so they are enabled by default. Why are people not using them? I guess they are too lazy to learn new stuff :) even if it would speed up their development.

    @Hoorain Jalali, sorry I did not notice your comment (although I was the one who approved it :P ). Obviously the book I recommend is the one listed at the beginning of the article, “XSLT 2.0 and XPath 2.0 Programmers Reference”, although it’s a book for .NET and Java programmers; take that into account before you decide to buy it.

  6. Avenirer May 31, 2009 at 7:21 pm ·

    Hello again…
    I tried to apply this tutorial to what I needed, yet it didn’t work. Well, it did work, but only after I changed the .xml file to look something like yours. And that’s not good… as I am trying to parse an external file, and I don’t want to modify the XML file.
    Here’s how it goes:

    National Bank of Romania
    2009-05-29
    DR

    Reference rates
    RON

    0.8081
    …and the rest for closing the .xml…
    I tried to find the rate by writing the path like this: “DataSet/Body/Cube/Rate”; it didn’t work. Then I tried doing it like: “//Rate”. No result.
    It only worked when I changed the .xml by deleting up until “body” tag and after “/body” tag. I don’t get it :-|

  7. Avenirer May 31, 2009 at 7:25 pm ·

    Oups…
    sorry… the xml was something like…
    xml version="1.0" encoding="UTF-8"
    >DataSet xmlns="" xmlns:xsi="" xsi:schemaLocation=""
    >>Header
    >>>Sender /Sender
    >>>SendingDate /SendingDate
    >>>MessageType /MessageType
    >>/Header
    >>Body
    >>>Subject /Subject
    >>>OrigCurrency /OrigCurrency
    >>>Cube date="2009-05-29"
    >>>>Rate currency /Rate
    >>>/Cube
    >>/Body
    >/DataSet

  8. Anna June 1, 2009 at 7:56 am ·

    Thanks Greg for this wonderful tutorial…..i was using that old complicated method for parsing xml in php but now i’ll use this Xpath funda…………thanks once again………

  9. Anna June 1, 2009 at 8:09 am ·

    greg im getting this error “Cannot instantiate non-existent class: domdocument”

    do we need some additional packages to be installed for this????

  10. Greg June 1, 2009 at 6:20 pm ·

    @Avenirer you said you tried “DataSet/Body/Cube/Rate”, i suppose it should be “/DataSet/Body/Cube/Rate” with a slash at the beginning?

    @Anna All libraries should be installed by default. If not, the required library should already be on your disk (it should be named php-dom.dll or something like that). You just need to activate it in php.ini and restart Apache.
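
One way to narrow down an error like “Cannot instantiate non-existent class: domdocument” is to check what the running PHP actually provides. The sketch below only reports the PHP version and whether the DOM extension and its classes are present; the $status array is an illustrative name. As far as I know, DOMDocument belongs to the DOM extension that arrived with PHP 5, while PHP 4 shipped a different extension (DOM XML), so an older interpreter is a likely cause.

```php
<?php
// Report what the current PHP installation offers for DOM and XPath work.
$status = [
    'php_version'  => PHP_VERSION,
    'dom_loaded'   => extension_loaded('dom'),     // the DOM extension itself
    'has_document' => class_exists('DOMDocument'), // needed for load()/loadXML()
    'has_xpath'    => class_exists('DOMXPath'),    // needed for query()
];

foreach ($status as $label => $value) {
    echo $label, ': ', var_export($value, true), "\n";
}
```

If dom_loaded comes back false on PHP 5 or later, the extension was probably left out of the build or the distribution package, and installing or enabling it is the next step.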

  11. Avenirer June 1, 2009 at 7:09 pm ·

    Yes… sorry about the typo… it was “/DataSet/Body/Cube/Rate” and it didn’t work

  12. Greg June 6, 2009 at 4:27 pm ·

    To be honest, I have no idea why that doesn’t work; the query seems OK. Maybe you need to call $xpath->registerNamespace() if there are some namespaces used in the document?
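
The namespace guess can be checked with a small sketch. When the root element declares a default namespace (xmlns="…"), every element in the document belongs to it, and in XPath 1.0 an unprefixed name such as Rate only matches elements in no namespace. The namespace URI and the currency value below are made-up placeholders, since the real ones were stripped from the comment; the prefix r is arbitrary too.

```php
<?php
// Placeholder namespace URI: use the exact xmlns value from the real file.
$ns = 'http://example.com/rates';

$doc = new DOMDocument();
$doc->loadXML(
    '<DataSet xmlns="' . $ns . '">'
    . '<Body><Cube date="2009-05-29">'
    . '<Rate currency="XXX">0.8081</Rate>'
    . '</Cube></Body>'
    . '</DataSet>'
);
$xpath = new DOMXPath($doc);

// Unprefixed names look for elements in NO namespace, so nothing matches.
$unprefixed = $xpath->query('/DataSet/Body/Cube/Rate');

// Bind a prefix to the document's namespace and use it on every step.
$xpath->registerNamespace('r', $ns);
$prefixed = $xpath->query('/r:DataSet/r:Body/r:Cube/r:Rate');

echo 'Without prefix: ', $unprefixed->length, " match(es)\n";
echo 'With prefix: ', $prefixed->item(0)->nodeValue, "\n";
```

The first query finds nothing and the second returns the rate, which would also explain why deleting the outer elements (and with them the xmlns declaration) made the original query work.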

  13. Rijas June 19, 2009 at 2:02 pm ·

    Perfect tutorial

  14. Elmue June 26, 2009 at 12:24 am ·

    Hello

    Thanks for the good description.
    Your sample works.

    But I try to parse the result of the webservice
    http://local.yahooapis.com/MapsService/V1/geocode?appid=YD-9G7bey8_JXxQP6rxl.fBFGgCdNjoDMACQA–&street=701%20First%20Ave&city=Sunnyvale&state=CA

    This webservice returns the coordinates of an address.

    The problem is that the returned XML starts with:

    <ResultSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:yahoo:maps" xsi:schemaLocation="urn:yahoo:maps http://api.local.yahoo.com/MapsService/V1/GeocodeResponse.xsd">
    etc…

    And this stuff breaks the Xpath!!
    I think this is a PHP bug.

    If you take your example (as it is) and replace the first XML line
    <articles>
    with
    <articles xmlns:xsi=”http://www…etc…>

    the code will stop working.
    What the hell can I do to make XPath work with Yahoo?

    Elmü

  15. Elmue June 26, 2009 at 1:17 am ·

    Yeahhh !
    After hours I have got the solution:

    Reading the Yahoo geocoding response:

    PHP:
    $doc = new DomDocument;
    $doc->load('GeoCode.xml');

    // Bind the prefix "ns" to the document's default namespace.
    $XPath = new DOMXPath($doc);
    $XPath->registerNamespace("ns", "urn:yahoo:maps");

    echo "Latitude= " . $XPath->query("//ns:Result/ns:Latitude")->item(0)->nodeValue;
    echo "Longitude= " . $XPath->query("//ns:Result/ns:Longitude")->item(0)->nodeValue;
    _________________________________________________

    The XML document “GeoCode.xml” which comes like this from the server:

    <?xml version="1.0"?>
    
    <ResultSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:yahoo:maps" xsi:schemaLocation="urn:yahoo:maps http://api.local.yahoo.com/MapsService/V1/GeocodeResponse.xsd">
    	<Result precision="address">
    		<Latitude>37.416397</Latitude>
    		<Longitude>-122.025055</Longitude>
    		<Address>701 1st Ave</Address>
    		<City>Sunnyvale</City>
    		<State>CA</State>
    		<Zip>94089-1019</Zip>
    		<Country>US</Country>
    	</Result>
    </ResultSet>
    
    <!-- ws06.search.re2.yahoo.com uncompressed Thu Jun 25 16:10:14 PDT 2009 -->
    
  16. Elmue June 26, 2009 at 1:21 am ·

    Hey Greg

    It would be great if you would add my last posting as a new chapter to your article.
    It would be very helpful for the thousands who are working with SOAP and who must invest hours searching for a WORKING sample like I did.

    Elmü

  17. Greg June 27, 2009 at 1:21 pm ·

    Hey Elmue, you did great work with that one. If you agree, I would rather add it as a new post on the blog, as that would make it easier for people to find using search engines.

  18. Elmue June 30, 2009 at 10:49 pm ·

    Hello Greg

    I don’t think that spreading information to multiple places is a good idea.
    I found this article because it has high ranking in Google.
    If you put it into another place, related information will be spread to different pages.
    Do what you want, but my opinion is that on this page here it would be perfect.

    Additionally, I have a very valuable link, which you can also add:
    http://www.w3schools.com/XPath/xpath_syntax.asp
    This information is very difficult to find.

    Elmü

  19. Elemental Web and Mobile Solutions August 3, 2009 at 3:34 pm ·

    Thanks this is a nice way to handle XML especially with PHP4. PHP5 handles XML much better than 4 ever did. thanks for the article, much appreciated.

  20. pinak August 21, 2009 at 12:44 pm ·

    Thanks , it was a great tutorial

  21. dharshana August 26, 2009 at 1:00 pm ·

    a great tutorial..superb..I was struggling for 2 days to find a way of going through an XML document..thanks again

  22. desi music December 9, 2009 at 4:57 pm ·

    Great way to work on programming

  23. Md. Kausar Alam January 4, 2010 at 8:19 am ·

    Obviously this is a Helpful post…..Thanks

  24. jay January 6, 2010 at 2:57 pm ·

    oh man….you did a great job….keep going…

  25. Charles January 11, 2010 at 6:57 pm ·

    Your English jammed my brain. I fixed it for you.

    A few years back when I first read a book about XML I could not understand what the hype was all about. It was some markup language that stores data in a well organized way. I also remember parsing XML files in PHP a few years back. The whole file was read using fread() and then a programmer had to create some kind of parser to process the XML file. It was completely ridiculous. Sure ActionScript, probably Java and some other languages already had classes for parsing XML files, but that was not the case for PHP. So I thought that this was a technology that did not deserve a second look.

    Fast Forward to today. I read an article covering interesting XML features like XPath, XPointer, XInclude and XSLT. Today, I want to introduce you to one of them: XPath. I guess it’s name reveals what it is all about. Basically XPath is a language that allows to localize and fetch information from an XML tree.

    First we need a basic XML document, nothing too complicated but nothing too simple either, to show the power of XPath.

  26. Greg January 11, 2010 at 7:27 pm ·

    Hehehe, thanks @Charles, i will update the article when i will have some time.

  27. Arash M. February 9, 2010 at 6:46 am ·

    Awesome article! However, I was trying to run this piece of code against the following XML that is returned from eBay and I did not get any response. It seems like SimpleXML’s xpath() does not like some of the queries, or at least I couldn’t make it work. It does, however, work if you run the XPath query in an XPath query test tool!

    341296208834

    http://product.half.ebay.com/GMAT-Quantitative-Review-Paperback-2005_W0QQprZ48636818QQtgZvidetailsQQitemZ341296208834

    Camp Hill, Pennsylvania
    1

    24
    80.0

    1.99

    http://shops.half.ebay.com/wutangxx3_W0QQsellerZwutangxx3

    wutangxx3

    US
    VeryGood

    Very good condition with no writing! Fast shipping!

    // the php code

    $xml = simplexml_load_file($xmlFile); // $xmlFile: path to the XML file
    $xml->registerXPathNamespace('ebay', 'urn:ebay:apis:eBLBaseComponents');
    try {

    // this should give me the store name that has the current price of 1.98
    $resp = $xml->xpath("//ebay:CurrentPrice[.=1.98]/following-sibling::Storefront/StoreName");

    } catch (Exception $e)
    {
    echo $e->getMessage();
    }

    var_dump($resp); // returns nothing

    any idea?

  28. Arash M. February 9, 2010 at 6:47 am ·

    Oops, sorry, the XML tags were stripped!
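
A likely cause of the empty result in the SimpleXML question above: the query uses the ebay: prefix on CurrentPrice but not on Storefront and StoreName, and in a document with a default namespace every step needs the prefix. The sketch below illustrates this on a hypothetical fragment; its element layout is guessed from the query, because the real XML was stripped, and the variable names are mine.

```php
<?php
// Hypothetical fragment: the layout is inferred from the query in the
// comment, not copied from a real eBay response.
$xml = simplexml_load_string(
    '<Items xmlns="urn:ebay:apis:eBLBaseComponents">'
    . '<Item>'
    . '<CurrentPrice>1.99</CurrentPrice>'
    . '<Storefront><StoreName>wutangxx3</StoreName></Storefront>'
    . '</Item>'
    . '</Items>'
);
$xml->registerXPathNamespace('ebay', 'urn:ebay:apis:eBLBaseComponents');

// Prefix only on the first step: Storefront and StoreName are looked up
// in no namespace, so nothing is found.
$withoutPrefix = $xml->xpath(
    '//ebay:CurrentPrice[. = 1.99]/following-sibling::Storefront/StoreName'
);

// Prefix on every step: the store name is found.
$withPrefix = $xml->xpath(
    '//ebay:CurrentPrice[. = 1.99]/following-sibling::ebay:Storefront/ebay:StoreName'
);

$missed = is_array($withoutPrefix) ? count($withoutPrefix) : 0;
echo 'Without prefixes: ', $missed, " match(es)\n";
echo 'With prefixes: ', (string) $withPrefix[0], "\n";
```

It is also worth double-checking the price in the predicate: the stripped listing shows 1.99, while the query tests for 1.98. If 1.99 is the current price, the predicate alone would already filter the item out.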

  30. paul March 15, 2010 at 12:29 pm ·

    great resource will implement on my new project.

  31. Serg April 13, 2010 at 1:24 am ·

    Very nice! I always have some headache with xml. Hope it will be finished with this tool.

  32. limo in baltimore July 22, 2010 at 4:11 am ·

    nice work, thanks for sharing, great tutorial

  33. buzzknow August 3, 2010 at 11:03 am ·

    How do I create XML like the example using the createElement function in PHP?

    thanks
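
As a sketch of one possible answer to the createElement question, the document from the article can be built with DOMDocument::createElement(), setAttribute(), and appendChild(). The $data array is an illustrative input structure, not something from the article.

```php
<?php
// Illustrative input; any array with the same shape would do.
$data = [
    ['id' => '1', 'title' => 'PHP XPath Example', 'tags' => ['php', 'xpath']],
    ['id' => '2', 'title' => 'DomDocument Tutorial', 'tags' => ['dom', 'domdocument']],
];

$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true; // indent the output for readability

// Root element <articles>.
$articles = $doc->createElement('articles');
$doc->appendChild($articles);

foreach ($data as $row) {
    // <article id="...">
    $article = $doc->createElement('article');
    $article->setAttribute('id', $row['id']);

    // <tags> with one <tag> child per entry.
    $tags = $doc->createElement('tags');
    foreach ($row['tags'] as $name) {
        $tags->appendChild($doc->createElement('tag', $name));
    }
    $article->appendChild($tags);

    // <title>...</title>
    $article->appendChild($doc->createElement('title', $row['title']));

    $articles->appendChild($article);
}

echo $doc->saveXML();
```

One caveat: the value passed as the second argument of createElement() is not escaped, so text containing characters such as & is safer to add with createTextNode() and appendChild().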

  34. Alvina Kara November 23, 2010 at 7:34 pm ·

    Very interesting topic , thanks for posting .

  35. Adam Piper January 10, 2011 at 11:49 am ·

    @Greg, @Charles:

    A further English correction:

    “I guess it’s name reveals what it is all about.”

    “I guess its name reveals what it is all about.”

  36. Andrew March 3, 2011 at 7:31 pm ·

    Thanks, last time I searched how to parse these XML files :)
    and I used DOMDocument too

  37. Shahzad March 4, 2011 at 10:35 pm ·

    Interesting man..!

  38. Miald March 18, 2011 at 9:09 pm ·

    Very nice and simple explanation of XPATH. gREAT WORK

  39. chrism4111 May 9, 2011 at 1:34 pm ·

    Great Work… Thank you very much.
    I have a question.

    How can I sort the data extracted from the XML alphabetically or numerically, and group it?

    Basically, how can I do the equivalent of SQL’s ORDER BY and GROUP BY on the data?

    Thanks very much.
    and Sorry for my english…
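
XPath 1.0 itself has no ordering or grouping construct, so one common approach is to fetch the nodes with XPath and do the rest in PHP. The sketch below sorts the article titles and groups them by tag; it uses an inline document modeled on articles.xml, and the names $titles and $byTag are illustrative.

```php
<?php
// Inline document modeled on the tutorial's articles.xml.
$doc = new DOMDocument();
$doc->loadXML(
    '<articles>'
    . '<article id="1"><tags><tag>php</tag><tag>xpath</tag></tags>'
    . '<title>PHP XPath Example</title></article>'
    . '<article id="2"><tags><tag>dom</tag><tag>domdocument</tag></tags>'
    . '<title>DomDocument Tutorial</title></article>'
    . '</articles>'
);
$xpath = new DOMXPath($doc);

// "ORDER BY title": collect the values, then sort them in PHP.
$titles = [];
foreach ($xpath->query('/articles/article/title') as $node) {
    $titles[] = $node->nodeValue;
}
sort($titles, SORT_STRING);

// "GROUP BY tag": build a map of tag name => list of article titles.
$byTag = [];
foreach ($xpath->query('/articles/article') as $article) {
    // Relative expressions are evaluated against the current <article>.
    $title = $xpath->evaluate('string(title)', $article);
    foreach ($xpath->query('tags/tag', $article) as $tag) {
        $byTag[$tag->nodeValue][] = $title;
    }
}
ksort($byTag); // order the groups by tag name

print_r($titles);
print_r($byTag);
```

For numeric ordering, sort() with SORT_NUMERIC or a custom usort() callback can replace the string sort. XSLT's xsl:sort is another option when the result should stay XML.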

  40. emchinee May 16, 2011 at 3:44 pm ·

    Greg,

    Why are you describing yourself in 3rd person?

  41. SaM July 29, 2011 at 7:32 am ·

    Hey can some one help me on this…
    I need to fetch the first link for a dynamic search query from google & display the link on a page…
    I need to do all this in php…
    Please Help me out Guys…!!!

  42. Peter Torres September 1, 2011 at 2:17 pm ·

    Hi Greg Winiarski, Thanks for explaining a complex example. It is very useful for me. thanks again

  43. Joshua Jackson December 7, 2011 at 7:15 pm ·

    Hello…
    Are these examples supposed to work with php4?
    I tried the published example and the GeoCode.xml example.
    Both pages ended with a blank result page.
    Maybe there is a mystery step that wasn’t in the code, but was assumed that everyone would know?
    JJ
