| |
| Some applications are simply impossible in an event-driven model with no access |
| to a tree. Of course you could build some sort of tree yourself in SAX events, |
| but the DOM allows you to avoid writing that code. The DOM is a standard tree |
| representation for XML data. |
| |
| The Document Object Model is being defined by the W3C in stages, or "levels" in |
| their terminology. The Python mapping of the API is substantially based on the |
| DOM Level 2 recommendation. |
| |
| .. XXX PyXML is dead... |
| DOM Level 2 recommendation. The mapping of the Level 3 specification, currently |
| .. The mapping of the Level 3 specification, currently |
| only available in draft form, is being developed by the `Python XML Special |
| only available in draft form, is being developed by the `Python XML Special |
| Interest Group <http://www.python.org/sigs/xml-sig/>`_ as part of the `PyXML |
| Interest Group <http://www.python.org/sigs/xml-sig/>`_ as part of the `PyXML |
| package <http://pyxml.sourceforge.net/>`_. Refer to the documentation bundled |
| package <http://pyxml.sourceforge.net/>`_. Refer to the documentation bundled |
| with that package for information on the current state of DOM Level 3 support. |
| with that package for information on the current state of DOM Level 3 support. |
| |
| .. % What if your needs are somewhere between SAX and the DOM? Perhaps |
| .. What if your needs are somewhere between SAX and the DOM? Perhaps |
| .. % you cannot afford to load the entire tree in memory but you find the |
| you cannot afford to load the entire tree in memory but you find the |
| .. % SAX model somewhat cumbersome and low-level. There is also a module |
| SAX model somewhat cumbersome and low-level. There is also a module |
| .. % called xml.dom.pulldom that allows you to build trees of only the |
| called xml.dom.pulldom that allows you to build trees of only the |
| .. % parts of a document that you need structured access to. It also has |
| parts of a document that you need structured access to. It also has |
| .. % features that allow you to find your way around the DOM. |
| features that allow you to find your way around the DOM. |
| .. % See http://www.prescod.net/python/pulldom |
| See http://www.prescod.net/python/pulldom |
| |
| DOM applications typically start by parsing some XML into a DOM. How this is |
| accomplished is not covered at all by DOM Level 1, and Level 2 provides only |
| limited improvements: There is a :class:`DOMImplementation` object class which |
| provides access to :class:`Document` creation methods, but no way to access an |
| XML reader/parser/Document builder in an implementation-independent way. There |
| is also no well-defined way to access these methods without an existing |
| :class:`Document` object. In Python, each DOM implementation will provide a |
| document through its properties and methods. These properties are defined in |
| the DOM specification; this portion of the reference manual describes the |
| interpretation of the specification in Python. |
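As a minimal sketch of the two entry points discussed above, the following uses the standard library's :mod:`xml.dom.minidom` as one concrete implementation. The choice of minidom and the sample markup are assumptions made for illustration; other implementations may expose parsing differently.

```python
import xml.dom
from xml.dom import minidom

# Parsing is implementation-specific: minidom supplies its own parseString().
doc = minidom.parseString("<greeting lang='en'>hello</greeting>")
root = doc.documentElement

# Creating an empty document goes through a DOMImplementation object.
impl = xml.dom.getDOMImplementation()
new_doc = impl.createDocument(None, "greeting", None)
new_root = new_doc.documentElement
new_root.appendChild(new_doc.createTextNode("hello"))
```

Once a :class:`Document` exists, the rest of the work is done through the properties and methods that the DOM specification defines.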
| |
| The specification provided by the W3C defines the DOM API for Java, ECMAScript, |
| and OMG IDL. The Python mapping defined here is based in large part on the IDL |
| version of the specification, but strict compliance is not required (though |
| implementations are free to support the strict mapping from IDL). See section |
| :ref:`dom-conformance`, "Conformance," for a detailed discussion of mapping |
| :ref:`dom-conformance` for a detailed discussion of mapping requirements. |
| requirements. |
| |
| |
| .. seealso:: |
| |
| `Document Object Model (DOM) Level 2 Specification <http://www.w3.org/TR/DOM-Level-2-Core/>`_ |
| The W3C recommendation upon which the Python DOM API is based. |
| |
| `Document Object Model (DOM) Level 1 Specification <http://www.w3.org/TR/REC-DOM-Level-1/>`_ |
| The W3C recommendation for the DOM supported by :mod:`xml.dom.minidom`. |
| |
| `PyXML <http://pyxml.sourceforge.net>`_ |
| Users that require a full-featured implementation of DOM should use the PyXML |
| package. |
| |
| `Python Language Mapping Specification <http://www.omg.org/docs/formal/02-11-05.pdf>`_ |
| This specifies the mapping from OMG IDL to Python. |
| |
| |
| Module Contents |
| --------------- |
| |
| exception classes. The :class:`Node` class provided by this module does not |
| implement any of the methods or attributes defined by the DOM specification; |
| concrete DOM implementations must provide those. The :class:`Node` class |
| provided as part of this module does provide the constants used for the |
| :attr:`nodeType` attribute on concrete :class:`Node` objects; they are located |
| within the class rather than at the module level to conform with the DOM |
| specifications. |
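A short sketch of how the class-level constants might be used to classify nodes. It assumes :mod:`xml.dom.minidom` as the concrete implementation, and the ``kinds`` lookup table is a hypothetical helper introduced only for this example.

```python
import xml.dom
from xml.dom import minidom

doc = minidom.parseString("<root><!-- note -->text</root>")
root = doc.documentElement

# Node type constants live on the Node class, not at module level.
kinds = {
    xml.dom.Node.ELEMENT_NODE: "element",
    xml.dom.Node.TEXT_NODE: "text",
    xml.dom.Node.COMMENT_NODE: "comment",
}

# Compare each child's nodeType against the constants.
child_kinds = [kinds[child.nodeType] for child in root.childNodes]
```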
| |
| .. % Should the Node documentation go here? |
| .. Should the Node documentation go here? |
| |
| |
| .. _dom-objects: |
| |
| Objects in the DOM |
| ------------------ |
| |
| The definitive documentation for the DOM is the DOM specification from the W3C. |
| |
| Note that DOM attributes may also be manipulated as nodes instead of as simple |
| strings. It is fairly rare that you must do this, however, so this usage is not |
| yet documented. |
| |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | Interface | Section | Purpose | |
| | Interface | Section | Purpose | |
| +================================+=================================+=================================+ |
| +================================+===================================+=================================+ |
| | :class:`DOMImplementation` | :ref:`dom-implementation- | Interface to the underlying | |
| | :class:`DOMImplementation` | :ref:`dom-implementation-objects` | Interface to the underlying | |
| | | objects` | implementation. | |
| | | | implementation. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`Node` | :ref:`dom-node-objects` | Base interface for most objects | |
| | :class:`Node` | :ref:`dom-node-objects` | Base interface for most objects | |
| | | | in a document. | |
| | | | in a document. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`NodeList` | :ref:`dom-nodelist-objects` | Interface for a sequence of | |
| | :class:`NodeList` | :ref:`dom-nodelist-objects` | Interface for a sequence of | |
| | | | nodes. | |
| | | | nodes. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`DocumentType` | :ref:`dom-documenttype-objects` | Information about the | |
| | :class:`DocumentType` | :ref:`dom-documenttype-objects` | Information about the | |
| | | | declarations needed to process | |
| | | | declarations needed to process | |
| | | | a document. | |
| | | | a document. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`Document` | :ref:`dom-document-objects` | Object which represents an | |
| | :class:`Document` | :ref:`dom-document-objects` | Object which represents an | |
| | | | entire document. | |
| | | | entire document. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`Element` | :ref:`dom-element-objects` | Element nodes in the document | |
| | :class:`Element` | :ref:`dom-element-objects` | Element nodes in the document | |
| | | | hierarchy. | |
| | | | hierarchy. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`Attr` | :ref:`dom-attr-objects` | Attribute value nodes on | |
| | :class:`Attr` | :ref:`dom-attr-objects` | Attribute value nodes on | |
| | | | element nodes. | |
| | | | element nodes. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`Comment` | :ref:`dom-comment-objects` | Representation of comments in | |
| | :class:`Comment` | :ref:`dom-comment-objects` | Representation of comments in | |
| | | | the source document. | |
| | | | the source document. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`Text` | :ref:`dom-text-objects` | Nodes containing textual | |
| | :class:`Text` | :ref:`dom-text-objects` | Nodes containing textual | |
| | | | content from the document. | |
| | | | content from the document. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| | :class:`ProcessingInstruction` | :ref:`dom-pi-objects` | Processing instruction | |
| | :class:`ProcessingInstruction` | :ref:`dom-pi-objects` | Processing instruction | |
| | | | representation. | |
| | | | representation. | |
| +--------------------------------+---------------------------------+---------------------------------+ |
| +--------------------------------+-----------------------------------+---------------------------------+ |
| |
| An additional section describes the exceptions defined for working with the DOM |
| in Python. |
| |
| |
| .. _dom-implementation-objects: |
| |
| DOMImplementation Objects |
| |
| A :class:`NamedNodeMap` of attribute objects. Only elements have actual values |
| for this; others provide ``None`` for this attribute. This is a read-only |
| attribute. |
| |
| |
| .. attribute:: Node.previousSibling |
| |
| The node that immediately precedes this one with the same parent. For instance |
| The node that immediately precedes this one with the same parent. For |
| the element with an end-tag that comes just before the *self* element's start- |
| instance the element with an end-tag that comes just before the *self* |
| tag. Of course, XML documents are made up of more than just elements so the |
| element's start-tag. Of course, XML documents are made up of more than just |
| previous sibling could be text, a comment, or something else. If this node is |
| elements so the previous sibling could be text, a comment, or something else. |
| the first child of the parent, this attribute will be ``None``. This is a read- |
| If this node is the first child of the parent, this attribute will be |
| only attribute. |
| ``None``. This is a read-only attribute. |
| |
| |
| .. attribute:: Node.nextSibling |
| |
| The node that immediately follows this one with the same parent. See also |
| :attr:`previousSibling`. If this is the last child of the parent, this |
| attribute will be ``None``. This is a read-only attribute. |
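The sibling attributes can be illustrated with a small sketch, assuming :mod:`xml.dom.minidom` as the implementation and a made-up three-child document. Note that the middle sibling is a text node rather than an element.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><a/>text<b/></list>")
first, middle, last = doc.documentElement.childNodes

# Siblings need not be elements: the middle child here is a text node.
sibling_names = [first.nodeName, middle.nodeName, last.nodeName]

# Walk the children by following nextSibling until it becomes None.
walked = []
node = first
while node is not None:
    walked.append(node.nodeName)
    node = node.nextSibling
```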
| |
| This is based on a proposed DOM Level 3 API which is still in the "working |
| draft" stage, but this particular interface appears uncontroversial. Changes |
| from the W3C will not necessarily affect this method in the Python DOM interface |
| (though any new W3C API for this would also be supported). |
| |
| |
| .. method:: Node.appendChild(newChild) |
| |
| Add a new child node to this node at the end of the list of children, returning |
| Add a new child node to this node at the end of the list of |
| *newChild*. |
| children, returning *newChild*. If the node was already in |
| the tree, it is removed first. |
| |
| |
| .. method:: Node.insertBefore(newChild, refChild) |
| |
| Insert a new child node before an existing child. It must be the case that |
| *refChild* is a child of this node; if not, :exc:`ValueError` is raised. |
| *newChild* is returned. If *refChild* is ``None``, it inserts *newChild* at the |
| end of the children's list. |
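The behaviour of the two insertion methods can be sketched as follows, again assuming :mod:`xml.dom.minidom`. The element names are arbitrary, and the error case for a *refChild* that is not a child of the node is deliberately left out, since the exception actually raised may differ between implementations.

```python
from xml.dom import minidom

doc = minidom.parseString("<list><a/><b/></list>")
root = doc.documentElement
a, b = root.childNodes

# appendChild() returns the node it was given; because `a` is already in
# the tree, it is detached from its old position before being re-added.
returned = root.appendChild(a)
order_after_append = [n.nodeName for n in root.childNodes]

# insertBefore() places the new node in front of an existing child...
c = doc.createElement("c")
root.insertBefore(c, a)
# ...and with refChild set to None it appends at the end.
d = doc.createElement("d")
root.insertBefore(d, None)
final_order = [n.nodeName for n in root.childNodes]
```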
| +------------------+-------------------------------------------+ |
| | ``unsigned int`` | ``IntegerType`` | |
| +------------------+-------------------------------------------+ |
| |
| Additionally, the :class:`DOMString` defined in the recommendation is mapped to |
| a Python string or Unicode string. Applications should be able to handle |
| Unicode whenever a string is returned from the DOM. |
| |
| The IDL :keyword:`null` value is mapped to ``None``, which may be accepted or |
| The IDL ``null`` value is mapped to ``None``, which may be accepted or |
| provided by the implementation whenever :keyword:`null` is allowed by the API. |
| provided by the implementation whenever ``null`` is allowed by the API. |
| |
| |
| .. _dom-accessor-methods: |
| |
| Accessor Methods |
| ^^^^^^^^^^^^^^^^ |
| |
| The mapping from OMG IDL to Python defines accessor functions for IDL |
| :keyword:`attribute` declarations in much the way the Java mapping does. |
| ``attribute`` declarations in much the way the Java mapping does. |
| Mapping the IDL declarations :: |
| |
| readonly attribute string someValue; |
| attribute string anotherValue; |
| |
| yields three accessor functions: a "get" method for :attr:`someValue` |
| (:meth:`_get_someValue`), and "get" and "set" methods for :attr:`anotherValue` |
| (:meth:`_get_anotherValue` and :meth:`_set_anotherValue`). The mapping, in |
| raise an :exc:`AttributeError`. |
| |
| The Python DOM API, however, *does* require that normal attribute access work. |
| This means that the typical surrogates generated by Python IDL compilers are not |
| likely to work, and wrapper objects may be needed on the client if the DOM |
| objects are accessed via CORBA. While this does require some additional |
| consideration for CORBA DOM clients, the implementers with experience using DOM |
| over CORBA from Python do not consider this a problem. Attributes that are |
| declared :keyword:`readonly` may not restrict write access in all DOM |
| declared ``readonly`` may not restrict write access in all DOM |
| implementations. |
| |
| In the Python DOM API, accessor functions are not required. If provided, they |
| should take the form defined by the Python IDL mapping, but these methods are |
| considered unnecessary since the attributes are accessible directly from Python. |
| "Set" accessors should never be provided for :keyword:`readonly` attributes. |
| "Set" accessors should never be provided for ``readonly`` attributes. |
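To make the accessor convention concrete, here is a rough sketch of an object that supports plain attribute access and also offers the optional IDL-style accessors. ``ExampleNode`` is a hypothetical class written for this illustration and is not part of any real DOM implementation; the attribute names are taken from the IDL fragment above.

```python
class ExampleNode:
    """Hypothetical object, not part of any real DOM implementation."""

    def __init__(self, some_value, another_value):
        # Normal attribute access is what the Python DOM API requires.
        self.someValue = some_value
        self.anotherValue = another_value

    # Optional accessors in the form defined by the Python IDL mapping.
    def _get_someValue(self):
        return self.someValue

    def _get_anotherValue(self):
        return self.anotherValue

    def _set_anotherValue(self, value):
        self.anotherValue = value

    # No _set_someValue(): someValue is declared readonly in the IDL.


node = ExampleNode("fixed", "initial")
node.anotherValue = "direct"            # plain attribute access
node._set_anotherValue("via accessor")  # optional accessor form
```

Client code that relies only on attribute access works whether or not an implementation provides the accessor methods.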
| |
| The IDL definitions do not fully embody the requirements of the W3C DOM API. |
| For example, they do not capture the notion that certain objects, such as the |
| return value of :meth:`getElementsByTagName`, are "live". The Python DOM API |
| does not require implementations to enforce such requirements. |
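One way to check whether a given implementation returns a "live" list is sketched below, assuming :mod:`xml.dom.minidom` and a made-up document. An implementation that does not enforce liveness leaves the earlier result unchanged after the tree is modified, so portable code may prefer to query again.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><item/></root>")
items = doc.getElementsByTagName("item")
count_before = len(items)

# Modify the tree after the list has been obtained.
doc.documentElement.appendChild(doc.createElement("item"))

# A live list would now report the new element; a static one would not.
count_same_list = len(items)
# Querying again always reflects the current tree.
count_fresh_query = len(doc.getElementsByTagName("item"))
```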
| |