SAFOX - A Simple API FOR XML Handling

The SAFOX package is copyright by Christian Hansel of cpi-service, Leipzig, Germany. It is licensed under the GPL. Copyright notices must be retained.

This file is only a demo. To view the API documentation, go to www.cpi-service.com/safox/ or download the zip file.

This document serves only as an introduction. You can do a lot with SAFOX, but this file gives just a short overview; I don't have the time to write sample applications or to provide more than this. If you wish to contribute, please contact me. I'd be happy to have more developers and testers for SAFOX and would gladly post all sample scripts etc. here.

Content

How to include SAFOX

The Basics

Working with the SAFOX Objects

Parsing XML Files

Working with SAFOX (more complex)

Advanced: Event Driven Parsing

Handling, Generating and Parsing RSS 2.0

 

Including SAFOX

As of version 0.5 the SAFOX library has been reorganised. The newest version ships with a wrapper file, which is the only file you have to include in your script:

include("safox/safox.cls.php");

To access the SAFOX libraries, simply create the SAFOX wrapper object:

$safox = new SAFOX();

With this object you can create every object the SAFOX libraries provide. The neat thing about the wrapper class is that a SAFOX library is included only once it is needed, and only the libraries that are actually needed are loaded. This makes SAFOX even faster and lighter.

To create the xmlDoc object, write one more line of code. It makes all the necessary calls and returns the OOP API for XML generation:

$xmlDoc = & $safox->createXMLDoc();  // For XML Generation

Similarly, you may write any of the following if needed:

$xmlParser = & $safox->createXMLParser();  // For XML Parsing
$rssDoc = & $safox->createRSSDoc();  // For RSS Generation
$rssParser = & $safox->createRSSParser();  // For RSS Parsing


The Basics

The safox_g.cls.php sub-package contains two classes: xmlDoc and xmlNode.

To generate a new XML-based document, instantiate an xmlDoc object and add nodes like this:

 
$safox = new SAFOX();
$xmlDoc = & $safox->createXMLDoc();  // For XML Generation

$xmlDoc->setEncoding("utf-8");

$newNode = & $xmlDoc->addNode("newNode");
$newNode->setAttribute("added",date("YmdHis"));
$newNode->setCData("Some Content");
$xmlDoc->writeXML();

The & is necessary to get a true reference to the object; otherwise child nodes added later will be lost.

To get the XML document as a string, or to output it, call the writeXML() method of the xmlDoc object or of any sub-node:

print $xmlDoc->writeXML();
and you get something like ...

<?xml version="1.0" encoding="utf-8"?>
<newNode added="20100731212758">Some Content</newNode>

The addNode() method of the xmlDoc and xmlNode classes adds a child node to that object and places it after all of the object's existing children.

To add a child node at a specific position, before or after a given node, call the addNodeBefore() or addNodeAfter() method instead. See the API documentation for details:

 

$xmlDoc = & $safox->createXMLDoc();  // For XML Generation
$xmlDoc->setEncoding("utf-8");


$rootNode = & $xmlDoc->addNode("RootNode");
$newNode1 = & $rootNode->addNode("Node1");
$newNode2 = & $rootNode->addNode("Node2");
$newNode3 = & $rootNode->addNode("Node3");
$newNode4 = & $rootNode->addNodeAfter($newNode2,"newNodeAfterNode2");
$newNode5 = & $rootNode->addNodeBefore($newNode1,"newNodeBeforeNode1");
print $xmlDoc->writeXML();

and you get something like ...

<?xml version="1.0" encoding="utf-8"?>
<RootNode>
   <newNodeBeforeNode1 />
   <Node1 />
   <Node2 />
   <newNodeAfterNode2 />
   <Node3 />
</RootNode>

To save the XML document to a file, call the writeToFile method of the xmlDoc object:

if (! $xmlDoc->writeToFile("testwritten.xml")) {
	print "File could not be written";
} else {
	print 'File written: <a href="testwritten.xml">testwritten.xml</a>';
}
and you get something like ...
File could not be written
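
The source does not say why the write failed here. One common cause, assuming writeToFile() relies on PHP's ordinary file functions (an assumption), is a missing write permission. The following is a minimal sketch of a check you could run beforehand; canWriteFile() is a hypothetical helper and not part of SAFOX:

```php
<?php
// Hypothetical helper, not part of SAFOX: reports whether PHP may create
// or overwrite the file at $path.
function canWriteFile($path) {
    if (file_exists($path)) {
        // An existing file must itself be writable.
        return is_writable($path);
    }
    // A new file needs an existing, writable parent directory.
    $dir = dirname($path);
    return is_dir($dir) && is_writable($dir);
}

// Possible use before saving (assumes $xmlDoc from the example above):
// if (canWriteFile("testwritten.xml")) { $xmlDoc->writeToFile("testwritten.xml"); }
```

If the check returns false, adjusting the permissions of the target directory is the first thing to try.
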
If you wish to delete a node from the XML document, call the destroy() or delete() method of that node. When deleting more than one node, it is advisable to call the method with the clean parameter set to false and to call the cleanUp method of the xmlDoc object manually afterwards:


$xmlDoc = & $safox->createXMLDoc();  // For XML Generation
$xmlDoc->setEncoding("utf-8");

$newNode1 = & $xmlDoc->addNode("Node1",1);
$newNode2 = & $xmlDoc->addNode("Node2",2);
$newNode3 = & $xmlDoc->addNode("Node3","test");
$newNode4 = & $xmlDoc->addNodeAfter($newNode2,"newNodeAfterNode2");
$newNode5 = & $xmlDoc->addNodeBefore($newNode1,"newNodeBeforeNode1");

// Example deleting one node 
$newNode1->delete();

// Recommended when deleting more than one node
$newNode2->delete(false);
$newNode3->delete(false);
$newNode4->delete(false);
$xmlDoc->cleanUp();
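
The "delete with clean set to false, then clean up once" pattern can be wrapped in a small function. This is only a sketch: deleteNodes() is a hypothetical helper, not part of SAFOX, and it assumes PHP 5 or later, where objects in an array are handles rather than copies.

```php
<?php
// Hypothetical helper, not part of SAFOX. It relies only on the two calls
// shown above: delete(false) on each node and cleanUp() on the document.
function deleteNodes($xmlDoc, array $nodes) {
    foreach ($nodes as $node) {
        $node->delete(false);  // defer the clean-up for every node
    }
    $xmlDoc->cleanUp();        // clean up once, after all deletions
    return count($nodes);      // number of nodes that were deleted
}

// Possible use (assumes the nodes from the example above):
// deleteNodes($xmlDoc, array($newNode2, $newNode3, $newNode4));
```
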


Working with the SAFOX Objects

Imagine you have parsed or created an XML document and hold an xmlDoc object. Adding nodes in hierarchies is easy:

// Now add a new node and 3 generations of nodes

$newNode = & $xmlDoc->addNode("newNode");
for ($i = 0; $i < 3; $i++) {
	$newNode = & $newNode->addNode("newNode");
	$newNode->setAttribute("added", date("YmdHis"));
	$newNode->setAttribute("generation", $i + 1);
	$newNode->setId($i);
}
print $xmlDoc->writeXML();

will give you (the document from the previous section plus the new nodes):
<?xml version="1.0" encoding="utf-8"?>
<RootNode>
   <newNodeBeforeNode1 />
   <Node1 />
   <Node2 />
   <newNodeAfterNode2 />
   <Node3 />
</RootNode>
<newNode>
   <newNode id="0" added="20100731212758" generation="1">
      <newNode id="1" added="20100731212758" generation="2">
         <newNode id="2" added="20100731212758" generation="3" />
      </newNode>
   </newNode>
</newNode>


Parsing XML Files

If you have an XML file, use the xmlParser class to parse it; the parser returns an xmlDoc object.

Take the example XML document here:

$xmlParser = & $safox->createXMLParser();
$xmlParser->loadFile("example.xml");
$xmlParser->parse();
$xmlDoc = & $xmlParser->getXmlDoc();
print $xmlDoc->writeXML();

<?xml version="1.0"?>
<catalogue date="Herbst/Winter 2001">
   <product stock="20 Stück">
      <productname>ST360020A</productname>
      <vendor>Seagate</vendor>
      <description>IDE-Festplatte, Speicherkapazität 60,0GB</description>
      <price>227,--Euro</price>
   </product>
   <product id="MyID" stock="50 Stück">
      <productname>MOS331E</productname>
      <vendor record="78">Olympus</vendor>
      <description>MO-Laufwerk, 230 MB, SCSI, intern, bulk</description>
      <price id="14">72,--Euro</price>
   </product>
</catalogue>

Working with SAFOX

Using this, we can load an xmlDoc and iterate through it:

$level = 0;  // current nesting depth
$cnt = 0;    // running node counter

// Recursively print the type of every node below $object
function iterate(&$object) {
	global $cnt, $level;
	$p = str_repeat("   ", $level);  // indentation for this depth
	while ($node = & $object->getNextNode()) {
		$cnt++;
		print "<br> $p [Node $cnt] ".$node->getType();
		if ($node->hasChildren()) {
			$level++;
			iterate($node);
			$level--;
		}
	}
}
iterate($xmlDoc);

[Node 1] catalogue
    [Node 2] product
       [Node 3] productname
       [Node 4] vendor
       [Node 5] description
       [Node 6] price
    [Node 7] product
       [Node 8] productname
       [Node 9] vendor
       [Node 10] description
       [Node 11] price
See the id attribute in the example.xml file? To get only one node (including its subnodes), use the $xmlDoc->getNodeById($id) method:

$node = & $xmlDoc->getNodeById("MyID");
print "\n\nNode with Id=".$node->getId()." has the following content :\n".$node->writeXML();

$node = & $xmlDoc->getNodeById(14);
print "\n\nNode with Id=".$node->getId()." has the following content :\n".$node->writeXML();

which will give you:

Node with Id=MyID has the following content :
<product id="MyID" stock="50 Stück">
   <productname>MOS331E</productname>
   <vendor record="78">Olympus</vendor>
   <description>MO-Laufwerk, 230 MB, SCSI, intern, bulk</description>
   <price id="14">72,--Euro</price>
</product>

Node with Id=14 has the following content :
<price id="14">72,--Euro</price>

By default, getId() and setId() work with the standard "id" attribute. In some XML files, however, a different attribute identifies a node uniquely.

You can define which attribute is your uida, the "Unique ID Attribute".

Example: we need the specific tag/node with the attribute record="78" in order to add a simple node to it:

$xmlParser = &  $safox->createXMLParser();  
$xmlParser->setUida("record"); // "Listen" to record-Attribute
$xmlParser->loadFile("example.xml");
$xmlParser->parse();
$xmlDoc = & $xmlParser->getXmlDoc();

$Node = & $xmlDoc->getNodeById(78);
$newNode = & $Node->addNode("anewNode");
$newNode->setAttribute("date",date("YmdHis"));
$newNode->setCData("Some Content");

print $xmlDoc->writeXML();

which will give you:

<?xml version="1.0"?>
<catalogue date="Herbst/Winter 2001">
   <product stock="20 Stück">
      <productname>ST360020A</productname>
      <vendor>Seagate</vendor>
      <description>IDE-Festplatte, Speicherkapazität 60,0GB</description>
      <price>227,--Euro</price>
   </product>
   <product stock="50 Stück" id="MyID">
      <productname>MOS331E</productname>
      <vendor record="78">Olympus
         <anewNode date="20100731212758">Some Content</anewNode>
      </vendor>
      <description>MO-Laufwerk, 230 MB, SCSI, intern, bulk</description>
      <price id="14">72,--Euro</price>
   </product>
</catalogue>

Getting Nodes by TagName

Cristiano suggested a function for retrieving nodes by tag name, so here is an example.

The getChildNodeByTagName(string tagname) method searches only the direct descendants (children) of a node. It returns an array of references to xmlNode objects if more than one node with that tag name is found, a reference to a single xmlNode object if exactly one is found, or boolean false if none is found.

Consider this example with the document from above:

$xmlParser = &  $safox->createXMLParser();  
$xmlParser->loadFile("example.xml");
$xmlParser->parse();
$xmlDoc = & $xmlParser->getXmlDoc();

$catalogue = $xmlDoc->getChildNodeByTagName("catalogue"); // getting the catalogue node;
$special = $catalogue->getChildNodeByTagName("product");
if (is_array($special)) {
	foreach($special as  $node) {
		print $node->getType() . " found ";
		print "Searching for Productname ";
		if ($pn = & $node->getChildNodeByTagName("productname")) {
			print " Productname : " . $pn->getCData();
		} else {
			print " not found";
		}
		
	}
} elseif ($special) {
	print " only one node found ";
	print $special->writeXML();
} else {
	print " No Node found";
}

product found
Searching for Productname
Productname : ST360020A
product found
Searching for Productname
Productname : MOS331E
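
Because getChildNodeByTagName() can return an array, a single node, or false, every caller needs the three-way branch shown above. A minimal sketch of a way around that is to normalise the result to an array first. asNodeList() is a hypothetical helper, not part of SAFOX, and the sketch assumes PHP 5 or later.

```php
<?php
// Hypothetical helper, not part of SAFOX: turns the three possible return
// shapes of getChildNodeByTagName() into a plain array.
function asNodeList($result) {
    if ($result === false) {
        return array();          // nothing found
    }
    if (is_array($result)) {
        return $result;          // several nodes found
    }
    return array($result);       // exactly one node found
}

// Possible use (assumes $catalogue from the example above):
// foreach (asNodeList($catalogue->getChildNodeByTagName("product")) as $node) {
//     print $node->getType() . " found ";
// }
```

With this, one foreach loop covers all three cases, and the loop body simply does not run when nothing was found.
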


Advanced: Event Driven Parsing

SAFOX also offers some limited ways to add custom event handlers to the parsing process.

Currently supported events include:

XPE_ON_PARSE_DEF (0); // Event On Parsing <? xml .. String
XPE_ON_PARSE_DTD (1); // On Parsing DTD
XPE_ON_PARSE_TAG (2); // On Parsing TagName
XPE_ON_PARSE_ATT (3); // On Parsing Attribute String
XPE_ON_PARSE_CDT (4); // On Parsing CDATA
XPE_ON_PARSE_NOD (5); // On Node Parsing Completed Object
XPE_ON_PARSE_CMT (6); // On Parsing a Comment
XPE_ON_PARSE_XSP (7); // On Parsing XML Specific Tags <?;

User-defined event handlers are called with three parameters. Judging from the handlers shown further down, these are the relevant string data, a node object (or null), and a reference to the parser.

Event handlers should be considered for "PreProcessing Data" if the events are handled by the XMLPARSER itself.

FullProcessing - The following events are currently not handled by the XMLPARSER

Post-Processing Events :

All other events are to be used for preprocessing purposes and are given the relevant string data, a null variable, and the parser reference.

To register an event with the parser, call

$parser->registerEvent(EVENTTYPE,"NameOfYourFunction");

for example:

function handler($d,$n,$p) { 
	print "Event was fired";
}

$parser = new xmlParser();
$parser->registerEvent(XPE_ON_PARSE_NOD,"handler");

The idea is to add some functionality to the Parser without having to reprogram the package itself. Consider these examples:

 


// Adding an attribute to each node
function NODChandler($d,&$n,$p) {
	if (is_object($n)) {
		$n->setAttribute("updated",date("Ymd"));
	}
}

// Transforms a comment into a node and adds it to the Doc
function CMThandler($c,&$n,&$p) {
	// Bail out unless both the node and the parser are objects
	if (! (is_object($n) && is_object($p))) {
		return false;
	}
	// Add a node to the last opened Node
	$x = & $n->addNode("UnparsedComment");
	// Remove every "--", which strips the dashes of the comment delimiters
	$x->setCData(str_replace("--","",$c));
}

$xmlParser = &  $safox->createXMLParser();  

$xmlParser->registerEvent(XPE_ON_PARSE_NOD,"NODChandler");
$xmlParser->registerEvent(XPE_ON_PARSE_CMT,"CMThandler");

$xmlParser->loadFile("example.xml");

$xmlParser->parse();
$xmlDoc = & $xmlParser->getXmlDoc();
print $xmlDoc->writeXML();
Now suppose this is used with the following XML:
<?xml version= "1.0" encoding="ISO-8859-1" standalone="yes" ?>
<!DOCTYPE catalogue [
    <!ELEMENT catalogue (product*)>
    <!ELEMENT product (productname,vendor,description,price)>
    <!ELEMENT productname       (#PCDATA)>
    <!ELEMENT vendor        (#PCDATA)>
    <!ELEMENT description      (#PCDATA)>
    <!ELEMENT price             (#PCDATA)>
    <!ATTLIST catalogue date     CDATA #REQUIRED>
    <!ATTLIST product stock   CDATA #REQUIRED>  
]>
<catalogue date="Herbst/Winter 2001">

  <product stock="20 Stück">
    <productname>ST360020A</productname>
    <vendor>Seagate</vendor>
	<!-- Just a comment -->
    <description>
      IDE-Festplatte, Speicherkapazität 60,0GB
    </description>
<!--
  <product stock="20 Stück">
    <productname>ST360020A</productname>
    <vendor>Seagate</vendor>
    <description>
      IDE-Festplatte, Speicherkapazität 60,0GB
    </description>
    <price>227,--Euro</price>
  </product>
-->  
    <price>227,--Euro</price>
	<!-- Just another comment -->
  </product>
  <product stock="50 Stück" id="MyID">
    <productname>MOS331E</productname>
    <vendor record="78">Olympus</vendor>
    <description>
      MO-Laufwerk, 230 MB, SCSI, intern, bulk
    </description>
    <price id="14">72,--Euro</price>
  </product>
</catalogue>
This would give you:
<?xml version="1.0"?>
<catalogue date="Herbst/Winter 2001">
   <product stock="20 Stück">
      <productname updated="20100731">ST360020A</productname>
      <vendor updated="20100731">Seagate</vendor>
      <UnparsedComment>&lt;! Just a comment &gt;</UnparsedComment>
      <description updated="20100731">IDE-Festplatte, Speicherkapazität 60,0GB</description>
      <UnparsedComment>&lt;!
  &lt;product stock=&quot;20 Stück&quot;&gt;
    &lt;productname&gt;ST360020A&lt;/productname&gt;
    &lt;vendor&gt;Seagate&lt;/vendor&gt;
    &lt;description&gt;
      IDE-Festplatte, Speicherkapazität 60,0GB
    &lt;/description&gt;
    &lt;price&gt;227,Euro&lt;/price&gt;
  &lt;/product&gt;
&gt;</UnparsedComment>
      <price updated="20100731">227,--Euro</price>
      <UnparsedComment>&lt;! Just another comment &gt;</UnparsedComment>
   </product>
   <product id="MyID" stock="50 Stück">
      <productname updated="20100731">MOS331E</productname>
      <vendor record="78" updated="20100731">Olympus</vendor>
      <description updated="20100731">MO-Laufwerk, 230 MB, SCSI, intern, bulk</description>
      <price id="14" updated="20100731">72,--Euro</price>
   </product>
</catalogue>
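
Note that the commented-out price in the output above reads 227,Euro. The call str_replace("--","",$c) removes every double dash, not only those in the comment delimiters. The following is a minimal sketch of an alternative that strips only the leading <!-- and the trailing -->. It assumes, as the output suggests, that the handler receives the raw comment including its delimiters. stripCommentDelimiters() is a hypothetical helper, not part of SAFOX.

```php
<?php
// Hypothetical helper, not part of SAFOX: removes only the opening "<!--"
// and the closing "-->" of a raw comment string, then trims whitespace.
// Any "--" inside the comment text is left untouched.
function stripCommentDelimiters($raw) {
    return trim(preg_replace('/^\s*<!--|-->\s*$/', '', $raw));
}

// Possible use inside CMThandler, replacing the str_replace() call:
// $x->setCData(stripCommentDelimiters($c));
```
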

Handling, Generating and Parsing RSS

As of SAFOX version 0.43, a new subpackage, SAFOX_RSS, has been added.

With it you can generate and parse RSS files as easily as you generate XML. The RSSDOC and RSSParser classes were developed specifically for dealing with RSS 2.0.

This short manual cannot fully introduce the RSS capabilities, so a simple example must suffice.

To create an RSS document, call the createRSSDoc method of SAFOX:

$safox = new SAFOX();
$rssDoc = & $safox->createRSSDoc("My First RSS","http://somesite.com/rss","This is my very first Document");
$rssDoc->addItem("This my Item 1","http://somesite.com/rss","Item 1 has no specific description");
$rssDoc->addItem("This my Item 2","http://somesite.com/rss","Item 2 has no specific description either");
$item = & $rssDoc->addItem("This my Item 3","http://somesite.com/rss","Item 3 has no specific description but some optional information");
$item->setCategory("Test Category");
$item->setAuthor("CVH");
$item->setPubDate(time());  // current timestamp; mktime() without arguments is not allowed in PHP 8

$item = & $rssDoc->addItem("This my Item 4","http://somesite.com/rss","Item 4's title has been reset after being initiated");

// Now get the RSS Doc for later parsing
$strRSS = $rssDoc->writeRSS();
// You could write it to a file too
$rssDoc->writeToFile("test.xml");

$rssDoc->printRSS();

This would give you:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
   <channel>
      <title>My First RSS</title>
      <link>http://somesite.com/rss</link>
      <description>This is my very first Document</description>
      <generator>SAFOX_RSS 0.1</generator>
      <item>
         <title>This my Item 1</title>
         <link>http://somesite.com/rss</link>
         <description>Item 1 has no specific description</description>
      </item>
      <item>
         <title>This my Item 2</title>
         <link>http://somesite.com/rss</link>
         <description>Item 2 has no specific description either</description>
      </item>
      <item>
         <title>This my Item 3</title>
         <link>http://somesite.com/rss</link>
         <description>Item 3 has no specific description but some optional information</description>
         <category>Test Category</category>
         <author>CVH</author>
         <pubDate>Sat, 31 Jul 2010 21:27:59 +0200</pubDate>
      </item>
      <item>
         <title>This my Item 4</title>
         <link>http://somesite.com/rss</link>
         <description>Item 4&apos;s title has been reset after being initiated</description>
      </item>
   </channel>
</rss>
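
When the items come from a database or another list, the repeated addItem() calls can be driven by a loop. The following is a minimal sketch; addItems() and the array keys title, link and description are hypothetical names chosen for illustration, and the sketch relies only on addItem(title, link, description) as used above.

```php
<?php
// Hypothetical helper, not part of SAFOX: adds one RSS item per row.
// Each row is an array with the keys "title", "link" and "description".
function addItems($rssDoc, array $rows) {
    $count = 0;
    foreach ($rows as $row) {
        $rssDoc->addItem($row["title"], $row["link"], $row["description"]);
        $count++;
    }
    return $count;  // number of items added
}

// Possible use (assumes $rssDoc from the example above):
// addItems($rssDoc, array(
//     array("title" => "Item A", "link" => "http://somesite.com/rss", "description" => "First"),
// ));
```
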

Now that we have valid RSS, we can also try the parser. Let's assume we want to print only the titles of all items, edit the last item's title, and change the third item's author to Galileo:

	$rssParser = & $safox->createRSSParser();
	$rssParser->setString($strRSS);
	$rssParser->parse();
	$rssDoc2 = &$rssParser->getRSSDoc();
	$i=0;
	while ($item = &$rssDoc2->getNextItem()) {
		$i++;
		print "<br /> ITEM $i with heading: ".$item->getTitle();
	}
	// now changing the last item's title a bit:
	$item = &$rssDoc2->getLastItem();
	$item->setTitle("This Title has been edited after having been parsed...");
	$item = &$rssDoc2->getPrevItem();
	$item->setAuthor("Galileo");
	
	print "the fully parsed and rebuild RSS file looks like this:<br /> ";
	print htmlspecialchars($rssDoc2->writeRSS());

ITEM 1 with heading: This my Item 1
ITEM 2 with heading: This my Item 2
ITEM 3 with heading: This my Item 3
ITEM 4 with heading: This my Item 4
the fully parsed and rebuild RSS file looks like this:
<?xml version="1.0"?>
<rss version="2.0">
   <channel>
      <title>My First RSS</title>
      <link>http://somesite.com/rss</link>
      <description>This is my very first Document</description>
      <generator>SAFOX_RSS 0.1</generator>
      <item>
         <title>This my Item 1</title>
         <link>http://somesite.com/rss</link>
         <description>Item 1 has no specific description</description>
      </item>
      <item>
         <title>This my Item 2</title>
         <link>http://somesite.com/rss</link>
         <description>Item 2 has no specific description either</description>
      </item>
      <item>
         <title>This my Item 3</title>
         <link>http://somesite.com/rss</link>
         <description>Item 3 has no specific description but some optional information</description>
         <category>Test Category</category>
         <author>Galileo</author>
         <pubDate>Sat, 31 Jul 2010 21:27:59 +0200</pubDate>
      </item>
      <item>
         <title>This Title has been edited after having been parsed...</title>
         <link>http://somesite.com/rss</link>
         <description>Item 4&apos;s title has been reset after being initiated</description>
      </item>
   </channel>
</rss>


These are just some of the many possible uses of SAFOX. You can do a lot more with it, but this document is only meant as an explanatory introduction.

To learn more, consult the API documentation.