org.apache.xerces.impl
Class XMLDocumentScannerImpl

java.lang.Object
  |
  +--org.apache.xerces.impl.XMLScanner
        |
        +--org.apache.xerces.impl.XMLDocumentScannerImpl

public class XMLDocumentScannerImpl
extends XMLScanner
implements org.apache.xerces.xni.parser.XMLDocumentScanner, org.apache.xerces.xni.parser.XMLComponent, XMLEntityHandler

This class is responsible for scanning XML document structure and content. The scanner acts as the source for the document information which is communicated to the document handler.

This component requires the following features and properties from the component manager that uses it:

Version:
$Id: XMLDocumentScannerImpl.java,v 1.1.2.1 2001/08/06 05:17:54 andyc Exp $
Author:
Glenn Marcy, IBM, Stubs generated by DesignDoc on Mon Sep 11 11:10:57 PDT 2000, Andy Clark, IBM, Arnaud Le Hors, IBM, Eric Ye, IBM

Inner Class Summary
protected  class XMLDocumentScannerImpl.ContentDispatcher
          Dispatcher to handle content scanning.
protected static interface XMLDocumentScannerImpl.Dispatcher
          This interface defines an XML "event" dispatching model.
protected static class XMLDocumentScannerImpl.ElementStack
          Element stack.
protected  class XMLDocumentScannerImpl.PrologDispatcher
          Dispatcher to handle prolog scanning.
protected  class XMLDocumentScannerImpl.TrailingMiscDispatcher
          Dispatcher to handle trailing miscellaneous section scanning.
protected  class XMLDocumentScannerImpl.XMLDeclDispatcher
          Dispatcher to handle XMLDecl scanning.
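The dispatcher classes listed above suggest a phase-by-phase state machine, in which the active dispatcher handles one part of the document and then hands over to the next. The following is a minimal sketch of that idea only. The `Dispatcher` interface shown here, its `dispatch` signature, and the phase order are assumptions made for illustration; this page does not show the real interface.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of phase-based dispatching; not the Xerces implementation. */
class DispatcherSketch {

    /** Hypothetical dispatcher: handles one phase and returns the next one (null when done). */
    interface Dispatcher {
        Dispatcher dispatch(List<String> log);
    }

    // One dispatcher per document phase, named after the classes in the summary above.
    static final Dispatcher TRAILING_MISC = log -> { log.add("trailing-misc"); return null; };
    static final Dispatcher CONTENT = log -> { log.add("content"); return TRAILING_MISC; };
    static final Dispatcher PROLOG = log -> { log.add("prolog"); return CONTENT; };
    static final Dispatcher XML_DECL = log -> { log.add("xml-decl"); return PROLOG; };

    /** Runs dispatchers until none remains and returns the visited phases in order. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Dispatcher active = XML_DECL;        // plays the role of fDispatcher
        while (active != null) {
            active = active.dispatch(log);   // plays the role of setDispatcher(...)
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Keeping one object per phase means each phase only needs to know which markup is legal at that point in the document.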
 
Inner classes inherited from class org.apache.xerces.impl.XMLScanner
XMLScanner.AttrEntityStack
 
Field Summary
protected static java.lang.String DTD_SCANNER
          Property identifier: DTD scanner.
protected  XMLDocumentScannerImpl.Dispatcher fContentDispatcher
          Content dispatcher.
protected  org.apache.xerces.xni.QName fCurrentElement
          Current element.
protected  XMLDocumentScannerImpl.Dispatcher fDispatcher
          Active dispatcher.
protected  org.apache.xerces.xni.XMLDocumentHandler fDocumentHandler
          Document handler.
protected  org.apache.xerces.xni.parser.XMLDTDScanner fDTDScanner
          DTD scanner.
protected  XMLDocumentScannerImpl.ElementStack fElementStack
          Element stack.
protected  int[] fEntityStack
          Entity stack.
protected  boolean fHasExternalDTD
          Has external DTD.
protected  boolean fLoadExternalDTD
          Load external DTD.
protected  int fMarkupDepth
          Markup depth.
protected  boolean fNamespaces
          Namespaces.
protected  boolean fNotifyBuiltInRefs
          Notify built-in references.
protected  XMLDocumentScannerImpl.Dispatcher fPrologDispatcher
          Prolog dispatcher.
protected  int fScannerState
          Scanner state.
protected  boolean fScanningDTD
          Scanning DTD.
protected  boolean fSeenDoctypeDecl
          Seen doctype declaration.
protected  boolean fStandalone
          Standalone.
protected  XMLDocumentScannerImpl.Dispatcher fTrailingMiscDispatcher
          Trailing miscellaneous section dispatcher.
protected  XMLDocumentScannerImpl.Dispatcher fXMLDeclDispatcher
          XML declaration dispatcher.
protected static java.lang.String LOAD_EXTERNAL_DTD
          Feature identifier: load external DTD.
protected static java.lang.String NAMESPACES
          Feature identifier: namespaces.
protected static java.lang.String NOTIFY_BUILTIN_REFS
          Feature identifier: notify built-in references.
protected static int SCANNER_STATE_CDATA
          Scanner state: CDATA section.
protected static int SCANNER_STATE_COMMENT
          Scanner state: comment.
protected static int SCANNER_STATE_CONTENT
          Scanner state: content.
protected static int SCANNER_STATE_DOCTYPE
          Scanner state: DOCTYPE.
protected static int SCANNER_STATE_END_OF_INPUT
          Scanner state: end of input.
protected static int SCANNER_STATE_PI
          Scanner state: processing instruction.
protected static int SCANNER_STATE_PROLOG
          Scanner state: prolog.
protected static int SCANNER_STATE_REFERENCE
          Scanner state: reference.
protected static int SCANNER_STATE_ROOT_ELEMENT
          Scanner state: root element.
protected static int SCANNER_STATE_START_OF_MARKUP
          Scanner state: start of markup.
protected static int SCANNER_STATE_TERMINATED
          Scanner state: terminated.
protected static int SCANNER_STATE_TEXT_DECL
          Scanner state: Text declaration.
protected static int SCANNER_STATE_TRAILING_MISC
          Scanner state: trailing misc.
protected static int SCANNER_STATE_XML_DECL
          Scanner state: XML declaration.
 
Fields inherited from class org.apache.xerces.impl.XMLScanner
DEBUG_ATTR_ENTITIES, DEBUG_ATTR_NORMALIZATION, fAmpSymbol, fAposSymbol, fAttributeEntityStack, fAttributeOffset, fCharRefLiteral, fEncodingSymbol, fEntityDepth, fEntityManager, fEntityScanner, fErrorReporter, fGtSymbol, fLtSymbol, fNotifyCharRefs, fQuotSymbol, fScanningAttribute, fStandaloneSymbol, fString, fStringBuffer, fStringBuffer2, fStrings, fSymbolTable, fValidation, fVersionSymbol, NOTIFY_CHAR_REFS, VALIDATION
 
Constructor Summary
XMLDocumentScannerImpl()
          Default constructor.
 
Method Summary
 void endEntity(java.lang.String name)
          This method notifies of the end of an entity.
 java.lang.String getDispatcherName(XMLDocumentScannerImpl.Dispatcher dispatcher)
          Returns the dispatcher name.
 java.lang.String[] getRecognizedFeatures()
          Returns a list of feature identifiers that are recognized by this component.
 java.lang.String[] getRecognizedProperties()
          Returns a list of property identifiers that are recognized by this component.
protected  int handleEndElement(org.apache.xerces.xni.QName element, boolean isEmpty)
          Handles the end element.
 void reset(org.apache.xerces.xni.parser.XMLComponentManager componentManager)
          Resets the component.
protected  void scanAttribute(org.apache.xerces.xni.XMLAttributes attributes)
          Scans an attribute.
protected  boolean scanCDATASection(boolean complete)
          Scans a CDATA section.
protected  void scanCharReference()
          Scans a character reference.
protected  void scanComment()
          Scans a comment.
protected  int scanContent()
          Scans element content.
protected  void scanDoctypeDecl()
          Scans a doctype declaration.
 boolean scanDocument(boolean complete)
          Scans a document.
protected  int scanEndElement()
          Scans an end element.
protected  void scanEntityReference()
          Scans an entity reference.
protected  void scanPIData(java.lang.String target, org.apache.xerces.xni.XMLString data)
          Scans processing instruction data.
protected  boolean scanStartElement()
          Scans a start element.
protected  void scanXMLDeclOrTextDecl(boolean scanningTextDecl)
          Scans an XML or text declaration.
protected  void setDispatcher(XMLDocumentScannerImpl.Dispatcher dispatcher)
          Sets the dispatcher.
 void setDocumentHandler(org.apache.xerces.xni.XMLDocumentHandler documentHandler)
          Sets the document handler.
 void setFeature(java.lang.String featureId, boolean state)
          Sets the state of a feature.
 void setInputSource(org.apache.xerces.xni.parser.XMLInputSource inputSource)
          Sets the input source.
 void setProperty(java.lang.String propertyId, java.lang.Object value)
          Sets the value of a property.
protected  void setScannerState(int state)
          Sets the scanner state.
 void startEntity(java.lang.String name, java.lang.String publicId, java.lang.String systemId, java.lang.String baseSystemId, java.lang.String encoding)
          This method notifies of the start of an entity.
 
Methods inherited from class org.apache.xerces.impl.XMLScanner
normalizeWhitespace, reportFatalError, scanAttributeValue, scanCharReferenceValue, scanComment, scanExternalID, scanPI, scanPseudoAttribute, scanPubidLiteral, scanSurrogates, scanXMLDeclOrTextDecl
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SCANNER_STATE_XML_DECL

protected static final int SCANNER_STATE_XML_DECL
Scanner state: XML declaration.

SCANNER_STATE_START_OF_MARKUP

protected static final int SCANNER_STATE_START_OF_MARKUP
Scanner state: start of markup.

SCANNER_STATE_COMMENT

protected static final int SCANNER_STATE_COMMENT
Scanner state: comment.

SCANNER_STATE_PI

protected static final int SCANNER_STATE_PI
Scanner state: processing instruction.

SCANNER_STATE_DOCTYPE

protected static final int SCANNER_STATE_DOCTYPE
Scanner state: DOCTYPE.

SCANNER_STATE_PROLOG

protected static final int SCANNER_STATE_PROLOG
Scanner state: prolog.

SCANNER_STATE_ROOT_ELEMENT

protected static final int SCANNER_STATE_ROOT_ELEMENT
Scanner state: root element.

SCANNER_STATE_CONTENT

protected static final int SCANNER_STATE_CONTENT
Scanner state: content.

SCANNER_STATE_REFERENCE

protected static final int SCANNER_STATE_REFERENCE
Scanner state: reference.

SCANNER_STATE_TRAILING_MISC

protected static final int SCANNER_STATE_TRAILING_MISC
Scanner state: trailing misc.

SCANNER_STATE_END_OF_INPUT

protected static final int SCANNER_STATE_END_OF_INPUT
Scanner state: end of input.

SCANNER_STATE_TERMINATED

protected static final int SCANNER_STATE_TERMINATED
Scanner state: terminated.

SCANNER_STATE_CDATA

protected static final int SCANNER_STATE_CDATA
Scanner state: CDATA section.

SCANNER_STATE_TEXT_DECL

protected static final int SCANNER_STATE_TEXT_DECL
Scanner state: Text declaration.

NAMESPACES

protected static final java.lang.String NAMESPACES
Feature identifier: namespaces.

LOAD_EXTERNAL_DTD

protected static final java.lang.String LOAD_EXTERNAL_DTD
Feature identifier: load external DTD.

NOTIFY_BUILTIN_REFS

protected static final java.lang.String NOTIFY_BUILTIN_REFS
Feature identifier: notify built-in references.

DTD_SCANNER

protected static final java.lang.String DTD_SCANNER
Property identifier: DTD scanner.

fDTDScanner

protected org.apache.xerces.xni.parser.XMLDTDScanner fDTDScanner
DTD scanner.

fDocumentHandler

protected org.apache.xerces.xni.XMLDocumentHandler fDocumentHandler
Document handler.

fEntityStack

protected int[] fEntityStack
Entity stack.

fMarkupDepth

protected int fMarkupDepth
Markup depth.

fScannerState

protected int fScannerState
Scanner state.

fSeenDoctypeDecl

protected boolean fSeenDoctypeDecl
Seen doctype declaration.

fHasExternalDTD

protected boolean fHasExternalDTD
Has external DTD.

fStandalone

protected boolean fStandalone
Standalone.

fScanningDTD

protected boolean fScanningDTD
Scanning DTD.

fCurrentElement

protected org.apache.xerces.xni.QName fCurrentElement
Current element.

fElementStack

protected XMLDocumentScannerImpl.ElementStack fElementStack
Element stack.

fNamespaces

protected boolean fNamespaces
Namespaces.

fLoadExternalDTD

protected boolean fLoadExternalDTD
Load external DTD.

fNotifyBuiltInRefs

protected boolean fNotifyBuiltInRefs
Notify built-in references.

fDispatcher

protected XMLDocumentScannerImpl.Dispatcher fDispatcher
Active dispatcher.

fXMLDeclDispatcher

protected XMLDocumentScannerImpl.Dispatcher fXMLDeclDispatcher
XML declaration dispatcher.

fPrologDispatcher

protected XMLDocumentScannerImpl.Dispatcher fPrologDispatcher
Prolog dispatcher.

fContentDispatcher

protected XMLDocumentScannerImpl.Dispatcher fContentDispatcher
Content dispatcher.

fTrailingMiscDispatcher

protected XMLDocumentScannerImpl.Dispatcher fTrailingMiscDispatcher
Trailing miscellaneous section dispatcher.

Constructor Detail

XMLDocumentScannerImpl

public XMLDocumentScannerImpl()
Default constructor.

Method Detail

setInputSource

public void setInputSource(org.apache.xerces.xni.parser.XMLInputSource inputSource)
                    throws java.io.IOException
Sets the input source.
Specified by:
setInputSource in interface org.apache.xerces.xni.parser.XMLDocumentScanner
Parameters:
inputSource - The input source.
Throws:
java.io.IOException - Thrown on i/o error.

scanDocument

public boolean scanDocument(boolean complete)
                     throws java.io.IOException,
                            org.apache.xerces.xni.XNIException
Scans a document.
Specified by:
scanDocument in interface org.apache.xerces.xni.parser.XMLDocumentScanner
Parameters:
complete - True if the scanner should scan the document completely, pushing all events to the registered document handler. A value of false indicates that the scanner should only scan the next portion of the document and return. A scanner instance is permitted to completely scan a document if it does not support this "pull" scanning model.
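
The difference between complete ("push") and incremental ("pull") scanning can be sketched with a toy scanner. Everything in this example is hypothetical: the class name, the pre-built event list standing in for the input, and in particular the meaning of the return value, which is assumed here to be true while more input remains because this page does not document it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy scanner illustrating the "complete" flag; not the Xerces implementation. */
class PullScanSketch {
    private final Iterator<String> pending;             // stand-in for the unscanned input
    final List<String> delivered = new ArrayList<>();   // stand-in for the document handler

    PullScanSketch(List<String> events) {
        this.pending = events.iterator();
    }

    /**
     * Delivers all remaining events when complete is true, otherwise a single event.
     * Assumption: the result is true while more input remains to be scanned.
     */
    boolean scanDocument(boolean complete) {
        do {
            if (!pending.hasNext()) {
                return false;
            }
            delivered.add(pending.next());
        } while (complete);
        return pending.hasNext();
    }

    public static void main(String[] args) {
        PullScanSketch scanner = new PullScanSketch(
                Arrays.asList("startDocument", "startElement", "endElement", "endDocument"));
        // Pull model: the caller decides when the next portion is scanned.
        while (scanner.scanDocument(false)) {
            System.out.println("delivered so far: " + scanner.delivered);
        }
    }
}
```

A caller that passes true gets the whole document in one call; a caller that passes false keeps control between portions and can stop early.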

reset

public void reset(org.apache.xerces.xni.parser.XMLComponentManager componentManager)
           throws org.apache.xerces.xni.parser.XMLConfigurationException
Resets the component. The component can query the component manager about any features and properties that affect the operation of the component.
Specified by:
reset in interface org.apache.xerces.xni.parser.XMLComponent
Overrides:
reset in class XMLScanner
Parameters:
componentManager - The component manager.
Throws:
org.apache.xerces.xni.parser.XMLConfigurationException - Thrown by component on initialization error. For example, if a feature or property is required for the operation of the component, the component manager may throw this exception.

getRecognizedFeatures

public java.lang.String[] getRecognizedFeatures()
Returns a list of feature identifiers that are recognized by this component. This method may return null if no features are recognized by this component.
Specified by:
getRecognizedFeatures in interface org.apache.xerces.xni.parser.XMLComponent

setFeature

public void setFeature(java.lang.String featureId,
                       boolean state)
                throws org.apache.xerces.xni.parser.XMLConfigurationException
Sets the state of a feature. This method is called by the component manager any time after reset when a feature changes state.

Note: Components should silently ignore features that do not affect the operation of the component.

Specified by:
setFeature in interface org.apache.xerces.xni.parser.XMLComponent
Overrides:
setFeature in class XMLScanner
Parameters:
featureId - The feature identifier.
state - The state of the feature.
Throws:
org.apache.xerces.xni.parser.XMLConfigurationException - The component should not throw this exception.
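
The note about silently ignoring unrelated features can be illustrated with a small stand-alone component. This is a sketch under stated assumptions: the class is hypothetical, and the value given to `NAMESPACES` is the standard SAX2 namespaces feature identifier, which is assumed (not confirmed by this page) to be what the constant of the same name holds.

```java
/** Toy component showing the "silently ignore" rule; not the Xerces implementation. */
class FeatureSketch {
    // Standard SAX2 feature identifier; assumed to match the NAMESPACES constant.
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    boolean namespaces;   // plays the role of fNamespaces

    /** Records the features this component cares about and ignores everything else. */
    void setFeature(String featureId, boolean state) {
        if (NAMESPACES.equals(featureId)) {
            namespaces = state;
        }
        // Any other identifier is ignored without raising an error.
    }

    public static void main(String[] args) {
        FeatureSketch component = new FeatureSketch();
        component.setFeature(NAMESPACES, true);
        component.setFeature("http://example.org/features/unrelated", true); // hypothetical id
        System.out.println("namespaces = " + component.namespaces);
    }
}
```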

getRecognizedProperties

public java.lang.String[] getRecognizedProperties()
Returns a list of property identifiers that are recognized by this component. This method may return null if no properties are recognized by this component.
Specified by:
getRecognizedProperties in interface org.apache.xerces.xni.parser.XMLComponent

setProperty

public void setProperty(java.lang.String propertyId,
                        java.lang.Object value)
                 throws org.apache.xerces.xni.parser.XMLConfigurationException
Sets the value of a property. This method is called by the component manager any time after reset when a property changes value.

Note: Components should silently ignore properties that do not affect the operation of the component.

Specified by:
setProperty in interface org.apache.xerces.xni.parser.XMLComponent
Overrides:
setProperty in class XMLScanner
Parameters:
propertyId - The property identifier.
value - The value of the property.
Throws:
org.apache.xerces.xni.parser.XMLConfigurationException - The component should not throw this exception.

setDocumentHandler

public void setDocumentHandler(org.apache.xerces.xni.XMLDocumentHandler documentHandler)
Sets the document handler.
Parameters:
documentHandler - The document handler.

startEntity

public void startEntity(java.lang.String name,
                        java.lang.String publicId,
                        java.lang.String systemId,
                        java.lang.String baseSystemId,
                        java.lang.String encoding)
                 throws org.apache.xerces.xni.XNIException
This method notifies of the start of an entity. The DTD has the pseudo-name of "[dtd]"; parameter entity names start with '%'; and general entities are just specified by their name.
Specified by:
startEntity in interface XMLEntityHandler
Overrides:
startEntity in class XMLScanner
Parameters:
name - The name of the entity.
publicId - The public identifier of the entity if the entity is external, null otherwise.
systemId - The system identifier of the entity if the entity is external, null otherwise.
baseSystemId - The base system identifier of the entity if the entity is external, null otherwise.
encoding - The auto-detected IANA encoding name of the entity stream. This value will be null in those situations where the entity encoding is not auto-detected (e.g. internal entities or a document entity that is parsed from a java.io.Reader).
Throws:
org.apache.xerces.xni.XNIException - Thrown by handler to signal an error.
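
The naming convention described above can be turned into a small classifier. This is an illustrative helper only; the class and method names are hypothetical and are not part of this API.

```java
/** Classifies entity names by the convention described for startEntity; illustration only. */
class EntityNameSketch {

    /** Returns "dtd", "parameter" or "general" for the given entity name. */
    static String classify(String name) {
        if ("[dtd]".equals(name)) {
            return "dtd";            // pseudo-name of the DTD
        }
        if (name.startsWith("%")) {
            return "parameter";      // parameter entity names start with '%'
        }
        return "general";            // general entities are specified by their plain name
    }

    public static void main(String[] args) {
        for (String name : new String[] {"[dtd]", "%common", "copyright"}) {
            System.out.println(name + " -> " + classify(name));
        }
    }
}
```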

endEntity

public void endEntity(java.lang.String name)
               throws org.apache.xerces.xni.XNIException
This method notifies of the end of an entity. The DTD has the pseudo-name of "[dtd]"; parameter entity names start with '%'; and general entities are just specified by their name.
Specified by:
endEntity in interface XMLEntityHandler
Overrides:
endEntity in class XMLScanner
Parameters:
name - The name of the entity.
Throws:
org.apache.xerces.xni.XNIException - Thrown by handler to signal an error.

scanXMLDeclOrTextDecl

protected void scanXMLDeclOrTextDecl(boolean scanningTextDecl)
                              throws java.io.IOException,
                                     org.apache.xerces.xni.XNIException
Scans an XML or text declaration.

 [23] XMLDecl ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
 [24] VersionInfo ::= S 'version' Eq ("'" VersionNum "'" | '"' VersionNum '"')
 [80] EncodingDecl ::= S 'encoding' Eq ('"' EncName '"' |  "'" EncName "'" )
 [81] EncName ::= [A-Za-z] ([A-Za-z0-9._] | '-')*
 [32] SDDecl ::= S 'standalone' Eq (("'" ('yes' | 'no') "'")
                 | ('"' ('yes' | 'no') '"'))

 [77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
 
Parameters:
scanningTextDecl - True if a text declaration is to be scanned instead of an XML declaration.
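
As a small worked example of the grammar above, production [81] (EncName) translates directly into a regular expression. The helper below is a hypothetical sketch for checking a candidate encoding name; it is not how the scanner itself is implemented.

```java
/** Checks a string against production [81] EncName; illustration only. */
class EncNameSketch {

    /** True if the name is a letter followed by letters, digits, '.', '_' or '-'. */
    static boolean isEncName(String name) {
        return name.matches("[A-Za-z][A-Za-z0-9._-]*");
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "ISO-8859-1", "8bit", ""}) {
            System.out.println("\"" + name + "\" -> " + isEncName(name));
        }
    }
}
```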

scanPIData

protected void scanPIData(java.lang.String target,
                          org.apache.xerces.xni.XMLString data)
                   throws java.io.IOException,
                          org.apache.xerces.xni.XNIException
Scans processing instruction data. This is needed to handle the situation where a document starts with a processing instruction whose target name starts with "xml" (e.g. xmlfoo).
Overrides:
scanPIData in class XMLScanner
Parameters:
target - The PI target
data - The string to fill in with the data

scanComment

protected void scanComment()
                    throws java.io.IOException,
                           org.apache.xerces.xni.XNIException
Scans a comment.

 [15] Comment ::= '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'
 

Note: Called after scanning past '<!--'
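
Production [15] amounts to two constraints on the text between '<!--' and '-->': it may not contain "--", and it may not end with '-'. The hypothetical helper below checks exactly those two constraints; it does not check that each character is a legal XML Char.

```java
/** Checks comment text against the shape required by production [15]; illustration only. */
class CommentSketch {

    /**
     * True if the text between the comment delimiters satisfies production [15].
     * Every '-' must be followed by a non-'-' character, so "--" is forbidden
     * and the body cannot end with '-'.
     */
    static boolean isValidCommentBody(String body) {
        return !body.contains("--") && !body.endsWith("-");
    }

    public static void main(String[] args) {
        for (String body : new String[] {" ok - fine ", "a--b", "trailing-"}) {
            System.out.println("\"" + body + "\" -> " + isValidCommentBody(body));
        }
    }
}
```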


scanDoctypeDecl

protected void scanDoctypeDecl()
                        throws java.io.IOException,
                               org.apache.xerces.xni.XNIException
Scans a doctype declaration.

scanStartElement

protected boolean scanStartElement()
                            throws java.io.IOException,
                                   org.apache.xerces.xni.XNIException
Scans a start element. This method will handle the binding of namespace information and notifying the handler of the start of the element.

 [44] EmptyElemTag ::= '<' Name (S Attribute)* S? '/>'
 [40] STag ::= '<' Name (S Attribute)* S? '>'
 

Note: This method assumes that the leading '<' character has been consumed.

Note: This method uses the fElementQName and fAttributes variables. The contents of these variables will be destroyed. The caller should copy important information out of these variables before calling this method.


scanAttribute

protected void scanAttribute(org.apache.xerces.xni.XMLAttributes attributes)
                      throws java.io.IOException,
                             org.apache.xerces.xni.XNIException
Scans an attribute.

 [41] Attribute ::= Name Eq AttValue
 

Note: This method assumes that the next character on the stream is the first character of the attribute name.

Note: This method uses the fAttributeQName and fQName variables. The contents of these variables will be destroyed.

Parameters:
attributes - The attributes list for the scanned attribute.

scanContent

protected int scanContent()
                   throws java.io.IOException,
                          org.apache.xerces.xni.XNIException
Scans element content.

scanCDATASection

protected boolean scanCDATASection(boolean complete)
                            throws java.io.IOException,
                                   org.apache.xerces.xni.XNIException
Scans a CDATA section.

Note: This method uses the fString and fStringBuffer variables.

Parameters:
complete - True if the CDATA section is to be scanned completely.
Returns:
True if CDATA is completely scanned.

scanEndElement

protected int scanEndElement()
                      throws java.io.IOException,
                             org.apache.xerces.xni.XNIException
Scans an end element.

 [42] ETag ::= '</' Name S? '>'
 

Note: This method uses the fElementQName variable. The contents of this variable will be destroyed. The caller should copy the needed information out of this variable before calling this method.


scanCharReference

protected void scanCharReference()
                          throws java.io.IOException,
                                 org.apache.xerces.xni.XNIException
Scans a character reference.

 [66] CharRef ::= '&#' [0-9]+ ';' | '&#x' [0-9a-fA-F]+ ';'
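
Production [66] can be decoded with two patterns, one per alternative. The helper below is a hypothetical sketch and not the scanner's own code: it returns the referenced code point or -1, and it does not check that the code point is a legal XML character.

```java
/** Decodes character references matching production [66]; illustration only. */
class CharRefSketch {

    /** Returns the code point named by the reference, or -1 if it is malformed or too large. */
    static int decode(String ref) {
        if (ref.matches("&#[0-9]+;")) {
            return parse(ref.substring(2, ref.length() - 1), 10);   // decimal form
        }
        if (ref.matches("&#x[0-9a-fA-F]+;")) {
            return parse(ref.substring(3, ref.length() - 1), 16);   // hexadecimal form
        }
        return -1;
    }

    private static int parse(String digits, int radix) {
        try {
            return Integer.parseInt(digits, radix);
        } catch (NumberFormatException e) {
            return -1;   // value does not fit in an int
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("&#65;"));    // prints 65
        System.out.println(decode("&#x41;"));   // prints 65
        System.out.println(decode("&#X41;"));   // prints -1 (the 'x' must be lower case)
    }
}
```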
 

scanEntityReference

protected void scanEntityReference()
                            throws java.io.IOException,
                                   org.apache.xerces.xni.XNIException
Scans an entity reference.
Throws:
java.io.IOException - Thrown if i/o error occurs.
org.apache.xerces.xni.XNIException - Thrown if handler throws exception upon notification.

handleEndElement

protected int handleEndElement(org.apache.xerces.xni.QName element,
                               boolean isEmpty)
                        throws org.apache.xerces.xni.XNIException
Handles the end element. This method will make sure that the end element name matches the current element and notify the handler about the end of the element and the end of any relevant prefix mappings.

Note: This method uses the fQName variable. The contents of this variable will be destroyed.

Parameters:
element - The element.
Throws:
org.apache.xerces.xni.XNIException - Thrown if the handler throws an exception upon notification.
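
The name-matching check described for this method can be sketched with a simple stack of open element names. The class below is hypothetical and uses plain strings instead of QName. The returned value is assumed to be the remaining element depth, which this page does not document.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy element stack that verifies end tags match start tags; not the Xerces implementation. */
class ElementStackSketch {
    private final Deque<String> open = new ArrayDeque<>();

    /** Records a newly opened element. */
    void startElement(String rawName) {
        open.push(rawName);
    }

    /**
     * Closes the current element after checking that the names match.
     * Assumption: the result is the number of elements still open.
     */
    int endElement(String rawName) {
        String current = open.peek();
        if (current == null || !current.equals(rawName)) {
            throw new IllegalStateException(
                    "end tag </" + rawName + "> does not match <" + current + ">");
        }
        open.pop();
        return open.size();
    }

    public static void main(String[] args) {
        ElementStackSketch stack = new ElementStackSketch();
        stack.startElement("doc");
        stack.startElement("para");
        System.out.println(stack.endElement("para"));   // prints 1
        System.out.println(stack.endElement("doc"));    // prints 0
    }
}
```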

setScannerState

protected final void setScannerState(int state)
Sets the scanner state.
Parameters:
state - The new scanner state.

setDispatcher

protected final void setDispatcher(XMLDocumentScannerImpl.Dispatcher dispatcher)
Sets the dispatcher.
Parameters:
dispatcher - The new dispatcher.

getDispatcherName

public java.lang.String getDispatcherName(XMLDocumentScannerImpl.Dispatcher dispatcher)
Returns the dispatcher name.


Copyright © 1999-2001 Apache XML Project. All Rights Reserved.