I found it a little hard to parse nested elements, so I wrote a function that simplifies it (based on http://www.thescripts.com/forum/thread627281.html):
function read_mixed_xml($filename, $arrayBeginElem, $arrayEndElem)
{
$output = "";
$arrayBeginKeys = array_keys($arrayBeginElem);
$lengthBegin = count($arrayBeginElem); // Length of the begin array
$arrayEndKeys = array_keys($arrayEndElem);
$lengthEnd = count($arrayEndElem); // Length of end element array
$xmlReader = new XMLReader();
$xmlReader->open($filename);
$xmlReader->read(); // Skip root node
/* Go through the nodes */
while($xmlReader->read())
{
/* We're only parsing begin and #text nodes right now */
if($xmlReader->nodeType != XMLReader::END_ELEMENT)
{
switch($xmlReader->nodeType)
{
/* If the current node is a begin element, look up its name among the keys of $arrayBeginElem. If found, append the mapped value to $output. (Simulates: case "paragraph": $output .= "<p>"; break;) */
case XMLReader::ELEMENT:
for($i = 0; $i < $lengthBegin; $i++)
{
$key = $arrayBeginKeys[$i]; // was [i], which PHP treats as an undefined constant
if($key==$xmlReader->name)
{
$output .= $arrayBeginElem[$key];
}
}
break;
/* If the current node is a #text node, append the node's value to $output */
case XMLReader::TEXT:
$output .= $xmlReader->value;
break;
}
}
/* If the current node is an end element, go through the array of end elements, and search for the current node's name. If found, append $arrayEndElem's value for the current node's name to the output */
else if($xmlReader->nodeType == XMLReader::END_ELEMENT)
{
for($i = 0; $i < $lengthEnd; $i++)
{
$key = $arrayEndKeys[$i]; // was [i], which PHP treats as an undefined constant
if($key==$xmlReader->name)
{
$output .= $arrayEndElem[$key];
}
}
}
}
$xmlReader->close();
return $output;
}
Example input:
$begin = array("title" => " <h1>", "paragraph" => " <p>", "italicized" => "<i>");
$end = array("title" => "</h1>", "paragraph" => "</p>", "italicized" => "</i>");
$content = read_mixed_xml("index.xml", $begin, $end);
echo $content;
index.xml:
<?xml version="1.0"?>
<body>
<title>Introduction</title>
<paragraph>
Lorem <italicized>ipsum dolor sit amet</italicized>, consectetuer adipiscing elit. Donec neque augue, nonummy sit amet, interdum vitae, egestas a, nulla. Aenean sed turpis eget lacus venenatis tincidunt. Integer in leo vitae est euismod congue. Curabitur quis tellus ut nulla pharetra fringilla. Phasellus id risus sagittis turpis lobortis pretium.
</paragraph>
<paragraph>
Curabitur ultrices pulvinar massa. Nullam ac massa. Morbi adipiscing pharetra est. In non neque vitae massa adipiscing vestibulum. Integer congue, lacus non sagittis consectetuer, magna nisl eleifend nisl, id fringilla justo justo et arcu.
</paragraph>
</body>
Example output:
<h1>Introduction</h1> <p>
Lorem <i>ipsum dolor sit amet</i>, consectetuer adipiscing elit. Donec neque augue, nonummy sit amet, interdum vitae, egestas a, nulla. Aenean sed turpis eget lacus venenatis tincidunt. Integer in leo vitae est euismod congue. Curabitur quis tellus ut nulla pharetra fringilla. Phasellus id risus sagittis turpis lobortis pretium.
</p> <p>
Curabitur ultrices pulvinar massa. Nullam ac massa. Morbi adipiscing pharetra est. In non neque vitae massa adipiscing vestibulum. Integer congue, lacus non sagittis consectetuer, magna nisl eleifend nisl, id fringilla justo justo et arcu.
</p>
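The two inner loops above only search for a key, so a direct array lookup does the same job. The following is a minimal sketch of that idea, not a replacement for the function above: the name convert_mixed_xml() and the short sample string are assumptions made for illustration, and it reads from a string via XMLReader::XML() instead of a file.

```php
<?php
// Hypothetical variant of read_mixed_xml() that takes an XML string
// and uses isset() lookups instead of looping over the key arrays.
function convert_mixed_xml($xmlString, array $beginMap, array $endMap)
{
    $reader = new XMLReader();
    $reader->XML($xmlString);
    $output = "";
    $reader->read(); // Position on the root node, which is skipped as before
    while ($reader->read()) {
        switch ($reader->nodeType) {
            case XMLReader::ELEMENT:
                if (isset($beginMap[$reader->name])) {
                    $output .= $beginMap[$reader->name];
                }
                break;
            case XMLReader::TEXT:
                $output .= $reader->value;
                break;
            case XMLReader::END_ELEMENT:
                if (isset($endMap[$reader->name])) {
                    $output .= $endMap[$reader->name];
                }
                break;
        }
    }
    $reader->close();
    return $output;
}

$begin = array("title" => "<h1>", "paragraph" => "<p>", "italicized" => "<i>");
$end = array("title" => "</h1>", "paragraph" => "</p>", "italicized" => "</i>");
// Made-up sample input, kept on one line so no whitespace nodes are involved.
$sample = '<body><title>Hi</title><paragraph>Some <italicized>text</italicized>.</paragraph></body>';
$result = convert_mixed_xml($sample, $begin, $end);
echo $result, "\n"; // prints <h1>Hi</h1><p>Some <i>text</i>.</p>
```

Element names missing from either map (such as the closing root tag) are simply ignored, which matches the behaviour of the loops in the original function.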
CLXXX. XMLReader functions
Introduction
The XMLReader extension is an XML Pull parser. The reader acts as a cursor going forward on the document stream and stopping at each node on the way.
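As a rough illustration of the cursor model, the sketch below walks a small in-memory document and records each start element with its depth. The sample XML and the variable names are assumptions chosen for the example.

```php
<?php
// Hypothetical sample document, used only for illustration.
$xmlString = '<book><title>Example</title><author>Someone</author></book>';

$reader = new XMLReader();
$reader->XML($xmlString);

$visited = array();
// read() advances the cursor one node at a time and returns false at the end.
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT) {
        // Indent each element name by two spaces per depth level.
        $visited[] = str_repeat('  ', $reader->depth) . $reader->name;
    }
}
$reader->close();

echo implode("\n", $visited), "\n";
```

The cursor only moves forward, so anything needed later has to be copied out (here into $visited) while the reader is positioned on the node.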
Installation
The XMLReader extension is available in PECL as of PHP 5.0.0 and is included and enabled as of PHP 5.1.0 by default. It can be enabled by adding the argument --enable-xmlreader (or --with-xmlreader before 5.1.0) to your configure line. The libxml extension is required.
Predefined Classes
XMLReader::close - Close the XMLReader input
XMLReader::expand - Export current node to a DOM node
XMLReader::getAttribute - Get value of attribute by name
XMLReader::getAttributeNo - Get value of attribute by position
XMLReader::getAttributeNs - Get value of attribute by name and URI
XMLReader::getParserProperty - Indicates if parser property is set or not
XMLReader::isValid - Indicates if document is valid
XMLReader::lookupNamespace - Get URI for prefix in scope of node
XMLReader::moveToAttribute - Positions reader on named attribute
XMLReader::moveToAttributeNo - Positions reader on attribute by index
XMLReader::moveToAttributeNs - Position reader on attribute by name and URI
XMLReader::moveToElement - Move to parent element of current attribute node
XMLReader::moveToFirstAttribute - Move to first attribute of node
XMLReader::moveToNextAttribute - Move to next attribute of node
XMLReader::next - Move to next element skipping children
XMLReader::open - Set URI to be parsed
XMLReader::read - Move to next node in stream
XMLReader::setParserProperty - Set parser property
XMLReader::setRelaxNGSchema - Set URI of RelaxNG schema to validate against
XMLReader::setRelaxNGSchemaSource - Set string containing RelaxNG schema to validate against
XMLReader::XML - Set string of data to be parsed
Table 307.
| Name | Type | Read-only | Description |
|---|---|---|---|
| attributeCount | int | yes | The number of attributes on the node |
| baseURI | string | yes | The base URI of the node |
| depth | int | yes | Depth of the node in the tree starting at 0 |
| hasAttributes | bool | yes | Indicates if node has attributes |
| hasValue | bool | yes | Indicates if node has a text value |
| isDefault | bool | yes | Indicates if attribute is defaulted from DTD |
| isEmptyElement | bool | yes | Indicates if node is an empty element tag |
| localName | string | yes | The local name of the node |
| name | string | yes | The qualified name of the node |
| namespaceURI | string | yes | The URI of the namespace associated with the node |
| nodeType | int | yes | The node type for the node |
| prefix | string | yes | The prefix of the namespace associated with the node |
| value | string | yes | The text value of the node |
| xmlLang | string | yes | The xml:lang scope within which the node resides |
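A short sketch of how the attribute-related properties and methods might be combined: when the cursor is on an element with hasAttributes set, moveToNextAttribute() steps through its attributes, and name and value then describe the attribute rather than the element. The sample element is made up for the example.

```php
<?php
$reader = new XMLReader();
// Hypothetical one-element document with two attributes.
$reader->XML('<item id="42" lang="en">text</item>');

$attributes = array();
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->hasAttributes) {
        // From an element, the first call moves to the first attribute.
        while ($reader->moveToNextAttribute()) {
            $attributes[$reader->name] = $reader->value;
        }
        // Return the cursor to the element that owns the attributes.
        $reader->moveToElement();
    }
}
$reader->close();

print_r($attributes);
```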
Predefined Constants
The constants below are defined by this extension and will only be available when the extension has either been compiled into PHP or dynamically loaded at runtime.
XMLReader uses class constants since PHP 5.1. Prior releases use global constants in the form XMLREADER_ELEMENT.
Table 308. XMLReader Node Types
| Constant | Value | Description |
|---|---|---|
| XMLReader::NONE (integer) | 0 | No node type |
| XMLReader::ELEMENT (integer) | 1 | Start element |
| XMLReader::ATTRIBUTE (integer) | 2 | Attribute node |
| XMLReader::TEXT (integer) | 3 | Text node |
| XMLReader::CDATA (integer) | 4 | CDATA node |
| XMLReader::ENTITY_REF (integer) | 5 | Entity Reference node |
| XMLReader::ENTITY (integer) | 6 | Entity Declaration node |
| XMLReader::PI (integer) | 7 | Processing Instruction node |
| XMLReader::COMMENT (integer) | 8 | Comment node |
| XMLReader::DOC (integer) | 9 | Document node |
| XMLReader::DOC_TYPE (integer) | 10 | Document Type node |
| XMLReader::DOC_FRAGMENT (integer) | 11 | Document Fragment node |
| XMLReader::NOTATION (integer) | 12 | Notation node |
| XMLReader::WHITESPACE (integer) | 13 | Whitespace node |
| XMLReader::SIGNIFICANT_WHITESPACE (integer) | 14 | Significant Whitespace node |
| XMLReader::END_ELEMENT (integer) | 15 | End Element |
| XMLReader::END_ENTITY (integer) | 16 | End Entity |
| XMLReader::XML_DECLARATION (integer) | 17 | XML Declaration node |
Table 309. XMLReader Parser Options
| Constant | Value | Description |
|---|---|---|
| XMLReader::LOADDTD (integer) | 1 | Load DTD but do not validate |
| XMLReader::DEFAULTATTRS (integer) | 2 | Load DTD and default attributes but do not validate |
| XMLReader::VALIDATE (integer) | 3 | Load DTD and validate while parsing |
| XMLReader::SUBST_ENTITIES (integer) | 4 | Substitute entities and expand references |
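The parser options are passed to setParserProperty() after the input has been set and before the first read(). The sketch below assumes a small made-up document with an internal DTD subset, so that validation can be tried without an external file.

```php
<?php
// Hypothetical document carrying its own DTD in the internal subset.
$xmlString = '<?xml version="1.0"?>
<!DOCTYPE note [
<!ELEMENT note (#PCDATA)>
]>
<note>Remember the milk</note>';

$reader = new XMLReader();
$reader->XML($xmlString);
// Enable DTD validation while parsing.
$reader->setParserProperty(XMLReader::VALIDATE, true);

$valid = true;
while ($reader->read()) {
    // isValid() reflects the document as parsed up to the current node.
    $valid = $valid && $reader->isValid();
}
$reader->close();

var_dump($valid);
```

Because the reader streams the document, validity is only known for the part already read, so the check is repeated on every node here.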
Table of Contents
- XMLReader::close — Close the XMLReader input
- XMLReader::expand — Returns a copy of the current node as a DOM object
- XMLReader::getAttribute — Get the value of a named attribute
- XMLReader::getAttributeNo — Get the value of an attribute by index
- XMLReader::getAttributeNs — Get the value of an attribute by localname and URI
- XMLReader::getParserProperty — Indicates if specified property has been set
- XMLReader::isValid — Indicates if the parsed document is valid
- XMLReader::lookupNamespace — Lookup namespace for a prefix
- XMLReader::moveToAttribute — Move cursor to a named attribute
- XMLReader::moveToAttributeNo — Move cursor to an attribute by index
- XMLReader::moveToAttributeNs — Move cursor to a named attribute
- XMLReader::moveToElement — Position cursor on the parent Element of current Attribute
- XMLReader::moveToFirstAttribute — Position cursor on the first Attribute
- XMLReader::moveToNextAttribute — Position cursor on the next Attribute
- XMLReader::next — Move cursor to next node skipping all subtrees
- XMLReader::open — Set the URI containing the XML to parse
- XMLReader::read — Move to next node in document
- XMLReader::setParserProperty — Set or Unset parser options
- XMLReader::setRelaxNGSchema — Set the filename or URI for a RelaxNG Schema
- XMLReader::setRelaxNGSchemaSource — Set the data containing a RelaxNG Schema
- XMLReader::XML — Set the data containing the XML to parse
XMLReader functions
14-Jul-2007 04:19
14-Nov-2006 08:09
Example, as requested, with nested nodes.
<?php
ob_start();
?>
<root>
<folder>
<name>folder A</name>
<files>
<file>
<name>Afile 1</name>
</file>
<file>
<name>Afile 2</name>
</file>
</files>
</folder>
<folder>
<name>folder B</name>
<files>
<file>
<name>Bfile 1</name>
</file>
<file>
<name>Bfile 2</name>
</file>
</files>
</folder>
</root>
<?php
$xmldata = ob_get_contents();
ob_end_clean();
$xml = new XMLReader();
$xml->XML($xmldata);
$data = array();
while ($xml->read())
{
while($xml->depth<=2 && $xml->nodeType==1)
$xml->read();
if ($xml->nodeType==3 && $xml->depth==3) // NodeType 3 : Text Element
{
$strFolderName = $xml->value;
$data[$strFolderName]=array();
while($xml->depth<=3)
$xml->read();
while($xml->depth>=3)
{
//xdump();
if ($xml->nodeType==3)
$data[$strFolderName][] = $xml->value;
$xml->read();
}
}
}
print_r($data);
echo "\n";
?>
Output :
Array
(
[folder A] => Array
(
[0] => Afile 1
[1] => Afile 2
)
[folder B] => Array
(
[0] => Bfile 1
[1] => Bfile 2
)
)
20-Mar-2006 11:52
DTD Validation
Parser properties can be set using:
$xml_reader->setParserProperty(XMLReader::CONSTANT_NAME, BooleanValue);
The constant used in the xmlreader_validatedtd.php example that comes
with the xmlreader package results in an error.
Here is how I got it to work...
<?php
$indent = 5; /* Number of spaces to indent per level */
$xml = new XMLReader();
$xml->open("dtdexample.xml");
// CHANGED NEXT TWO LINES TO REMOVE ERROR
// FROM: $xml->setParserProperty(XMLREADER_LOADDTD, TRUE);
$xml->setParserProperty(XMLReader::LOADDTD, TRUE);
$xml->setParserProperty(XMLReader::VALIDATE, TRUE);
while($xml->read()) {
/* Print node name indenting it based on depth and $indent var */
print str_repeat(" ", $xml->depth * $indent).$xml->name."\n";
if ($xml->hasAttributes) {
$attCount = $xml->attributeCount;
print str_repeat(" ", $xml->depth * $indent)." Number of Attributes: ".$xml->attributeCount."\n";
}
}
print "\n\nValid:\n";
var_dump($xml->isValid());
?>
15-Feb-2006 04:50
Some more documentation (i.e. examples) would be nice :-)
This is how I read some mysql parameters in an xml file:
<?php
$xml = new XMLReader();
$xml->open("config.xml");
$xml->setParserProperty(XMLReader::DEFAULTATTRS, true); // 2 == XMLReader::DEFAULTATTRS: load the DTD and default attributes, without validating
while ($xml->read()) {
switch ($xml->name) {
case "mysql_host":
$xml->read();
$conf["mysql_host"] = $xml->value;
$xml->read();
break;
case "mysql_username":
$xml->read();
$conf["mysql_user"] = $xml->value;
$xml->read();
break;
case "mysql_password":
$xml->read();
$conf["mysql_pass"] = $xml->value;
$xml->read();
break;
case "mysql_database":
$xml->read();
$conf["mysql_db"] = $xml->value;
$xml->read();
break;
}
}
$xml->close();
?>
The XML file used:
<?xml version='1.0'?>
<MySQL_INIT>
<mysql_host>localhost</mysql_host>
<mysql_database>db_database</mysql_database>
<mysql_username>root</mysql_username>
<mysql_password>password</mysql_password>
</MySQL_INIT>
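The four near-identical case branches can also be driven by a lookup table. This is only a sketch under a few assumptions: the $keyMap array and the variable names are invented for the example, and the XML is read from a string so the snippet runs without config.xml.

```php
<?php
// Same structure as the config file above, inlined for the example.
$xmlString = "<?xml version='1.0'?>
<MySQL_INIT>
<mysql_host>localhost</mysql_host>
<mysql_database>db_database</mysql_database>
<mysql_username>root</mysql_username>
<mysql_password>password</mysql_password>
</MySQL_INIT>";

// Hypothetical map from element name to the key used in $conf.
$keyMap = array(
    "mysql_host"     => "mysql_host",
    "mysql_username" => "mysql_user",
    "mysql_password" => "mysql_pass",
    "mysql_database" => "mysql_db",
);

$conf = array();
$current = null; // $conf key for the element being read, if any

$xml = new XMLReader();
$xml->XML($xmlString);
while ($xml->read()) {
    switch ($xml->nodeType) {
        case XMLReader::ELEMENT:
            $current = isset($keyMap[$xml->name]) ? $keyMap[$xml->name] : null;
            break;
        case XMLReader::TEXT:
            if ($current !== null) {
                $conf[$current] = $xml->value;
            }
            break;
        case XMLReader::END_ELEMENT:
            $current = null;
            break;
    }
}
$xml->close();

print_r($conf);
```

Adding another setting then only requires a new entry in $keyMap.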
11-Feb-2006 02:09
Simple function I used while playing around with XMLReader.
<?php
function dump_xmlreader($o) {
$node_types = array (
0=>"No node type",
1=>"Start element",
2=>"Attribute node",
3=>"Text node",
4=>"CDATA node",
5=>"Entity Reference node",
6=>"Entity Declaration node",
7=>"Processing Instruction node",
8=>"Comment node",
9=>"Document node",
10=>"Document Type node",
11=>"Document Fragment node",
12=>"Notation node",
13=>"Whitespace node",
14=>"Significant Whitespace node",
15=>"End Element",
16=>"End Entity",
17=>"XML Declaration node"
);
echo "attributeCount = " . $o->attributeCount . "\n";
echo "baseURI = " . $o->baseURI . "\n";
echo "depth = " . $o->depth . "\n";
echo "hasAttributes = " . ( $o->hasAttributes ? 'TRUE' : 'FALSE' ) . "\n";
echo "hasValue = " . ( $o->hasValue ? 'TRUE' : 'FALSE' ) . "\n";
echo "isDefault = " . ( $o->isDefault ? 'TRUE' : 'FALSE' ) . "\n";
echo "isEmptyElement = " . ( @$o->isEmptyElement ? 'TRUE' : 'FALSE' ) . "\n";
echo "localName = " . $o->localName . "\n";
echo "name = " . $o->name . "\n";
echo "namespaceURI = " . $o->namespaceURI . "\n";
echo "nodeType = " . $o->nodeType . ' - ' . $node_types[$o->nodeType] . "\n";
echo "prefix = " . $o->prefix . "\n";
echo "value = " . $o->value . "\n";
echo "xmlLang = " . $o->xmlLang . "\n";
}
?>
