Sunday, January 11, 2009

The difference between DOM and SAX


1.

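SAX is one kind of parser, and it is event driven. It does not parse the whole XML document in one go; it parses a little at a time and invokes a callback function whenever it reaches the start or end of an element. The program has no way to get back to an earlier element unless it saved that information itself in some form.

DOM stores the document as a tree data structure, which is very convenient to work with, but it clearly uses more memory.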

DOM and SAX parsers work in different ways. A SAX parser processes the XML document as it parses the XML input stream, passing SAX events to a programmer-defined handler method. A DOM parser, on the other hand, parses the entire input XML stream and returns a Document object. Document is the programmatic, language-neutral interface that represents a document. The Document returned by the DOM parser has an API that lets you manipulate a (virtual) tree of Node objects; this tree represents the structure of the input XML.*
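As a rough illustration of the DOM side of the quote above, here is a minimal sketch using the JAXP classes that ship with the JDK. The class name DomSketch, the helper methods, and the sample books/book/title document are made up for this example; only the javax.xml.parsers and org.w3c.dom calls are standard.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomSketch {
    // Parses the entire input first; only then can the tree be queried.
    static Document load(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Hypothetical helper: counts <book> elements anywhere in the tree.
    static int countBooks(String xml) throws Exception {
        return load(xml).getElementsByTagName("book").getLength();
    }

    // Hypothetical helper: text of the first <title>, or null if there is none.
    static String firstTitle(String xml) throws Exception {
        NodeList titles = load(xml).getElementsByTagName("title");
        return titles.getLength() == 0 ? null : titles.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample document invented for this sketch.
        String xml = "<books><book><title>A</title></book>"
                   + "<book><title>B</title></book></books>";
        System.out.println(countBooks(xml));
        System.out.println(firstTitle(xml));
    }
}
```

Because the whole tree is already in memory, the client can ask for any node in any order, which is the convenience (and the memory cost) described above.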

http://www.javaworld.com/jw-07-2000/jw-0707-xmldom.html?page=1

2.

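SAX concepts
SAX stands for Simple API for XML. It is not a standard published by the W3C; it is a de facto "grassroots" standard that grew out of community discussion. Even so, SAX is used in XML processing no less than DOM, and almost every XML parser supports it.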

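Compared with DOM, SAX is a lightweight approach. With DOM we have to read in the entire XML document, build the DOM tree in memory, and create every Node object in that tree. This causes no problems while the document is small, but once the document grows, DOM processing becomes costly in both time and effort. Its memory requirements in particular grow many times over, to the point where using DOM is a poor trade in some applications (in an applet, for example). In those cases SAX is a better alternative.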

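SAX is conceptually quite different from DOM. First, where DOM is document driven, SAX is event driven: it does not need to read in the whole document, and reading the document is itself the SAX parsing process. "Event driven" here means a style of program execution based on a callback mechanism. (If you are familiar with Java's newer delegation event model, this mechanism will be easy to understand.)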


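The XMLReader accepts the XML document and parses it as it reads, so reading and parsing happen at the same time, which is very different from DOM. Before parsing starts, a ContentHandler must be registered with the XMLReader; it acts as an event listener. ContentHandler defines many methods, such as startDocument(), which specifies what should be done when the start of the document is encountered during parsing. When the XMLReader reads the relevant content, it fires the corresponding event, delegates handling of that event to the ContentHandler, and calls the matching method in response.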
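The registration step described above might look roughly like the following. This is only a sketch: the class names SaxSketch and LoggingHandler and the event strings are invented for illustration, and DefaultHandler is used as a convenient do-nothing ContentHandler to extend. The XMLReader is obtained through JAXP's SAXParserFactory, which is one common way to get one.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSketch {
    // A ContentHandler that just records which callbacks were invoked.
    static class LoggingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override public void startDocument() { events.add("startDocument"); }

        @Override public void startElement(String uri, String localName,
                                           String qName, Attributes atts) {
            events.add("start:" + qName);
        }

        @Override public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override public void endDocument() { events.add("endDocument"); }
    }

    static List<String> parse(String xml) throws Exception {
        XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        LoggingHandler handler = new LoggingHandler();
        reader.setContentHandler(handler);   // register the listener before parsing
        reader.parse(new InputSource(new StringReader(xml)));  // callbacks fire during this call
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : parse("<a><b/></a>")) {
            System.out.println(event);
        }
    }
}
```

Note that the client code never calls startElement() itself; the reader invokes it while parse() is still running.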



What is the difference between a DOMParser and a SAXParser?

DOM parsers and SAX parsers work in different ways.

* A DOM parser creates a tree structure in memory from the input document and then waits for requests from the client. A SAX parser does not create any internal structure. Instead, it treats the occurrences of components of an input document as events, and tells the client what it reads as it reads through the input document.
* A DOM parser always serves the client application with the entire document, no matter how much of it the client actually needs. A SAX parser only ever serves the client application with pieces of the document at any given time.
* With a DOM parser, method calls in the client application have to be explicit and form a kind of chain. With SAX, certain methods (usually overridden by the client) are invoked automatically (implicitly), in a way called a "callback", when certain events occur. These methods do not have to be called explicitly by the client, though we could call them explicitly.
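Since a SAX parser only hands over one piece at a time, any context the client needs later has to be kept by the client. The sketch below shows one way that might be done: the handler keeps its own stack of open elements so that, when a title ends, it still knows where in the document it is. The names SaxStateSketch and PathHandler, the path=text output format, and the sample books/book/title document are all assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStateSketch {
    static class PathHandler extends DefaultHandler {
        // State the parser does not keep for us: open elements and current text.
        private final Deque<String> open = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();
        final List<String> results = new ArrayList<>();

        @Override public void startElement(String uri, String localName,
                                           String qName, Attributes atts) {
            open.addLast(qName);
            text.setLength(0);
        }

        @Override public void characters(char[] ch, int start, int length) {
            // May be called several times for one run of text, so accumulate.
            text.append(ch, start, length);
        }

        @Override public void endElement(String uri, String localName, String qName) {
            if (qName.equals("title")) {
                results.add(String.join("/", open) + "=" + text.toString().trim());
            }
            open.removeLast();
            text.setLength(0);
        }
    }

    // Hypothetical helper: returns "path=text" for every <title> element.
    static List<String> titles(String xml) throws Exception {
        PathHandler handler = new PathHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.results;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book><title>A</title></book>"
                   + "<book><title>B</title></book></books>";
        titles(xml).forEach(System.out::println);
    }
}
```

With DOM the same information would be available by walking up from a node to its parents; with SAX the handler has to remember it as the events go by.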
