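When DOM scans an XML document, it generally produces a tree of nodes that represents the document. Every element, entity, PCDATA section and attribute in the XML becomes a node, and each node is an instance of a class that implements the Node interface. Reference code follows: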
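import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

public static void getScanner(String address) throws Exception {
    // create a DocumentBuilderFactory
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    // create a DocumentBuilder
    DocumentBuilder db = null;
    try {
        db = dbf.newDocumentBuilder();
    } catch (ParserConfigurationException pce) {
        System.err.println(pce);
        System.exit(1);
    }
    // parse the input file
    Document doc = null;
    try {
        doc = db.parse(new File(address));
    } catch (SAXException se) {
        System.err.println(se.getMessage());
        // to be replaced by the log method
        System.exit(1);
    } catch (IOException ioe) {
        System.err.println(ioe);
        // to be replaced by the log method
        System.exit(1);
    }
    // print the parsed tree, starting at the document node
    printxml(doc);
}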
public static void printxml(Node n) {
    // recursive routine to print out DOM tree nodes
    int type = n.getNodeType();
    switch (type) {
        case Node.DOCUMENT_NODE:
            System.out.print("DOC:");
            break;
        case Node.DOCUMENT_TYPE_NODE:
            System.out.print("DOC_TYPE:");
            break;
        case Node.ELEMENT_NODE:
            System.out.print("ELEM:");
            break;
        case Node.TEXT_NODE:
            System.out.print("TEXT:");
            break;
        default:
            System.out.print("Other Node:" + type);
            break;
    }
    System.out.print(" nodeName=\"" + n.getNodeName() + "\"");
    // print the value only when it is not null and not pure whitespace
    String val = n.getNodeValue();
    if (val != null && !val.trim().isEmpty()) {
        System.out.print(" nodeValue=\"" + val + "\"");
    }
    System.out.println();
    // print children, if any
    for (Node child = n.getFirstChild(); child != null; child = child.getNextSibling()) {
        printxml(child);
    }
}
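
One detail worth noting: in DOM, attributes are nodes, but they are not children of their element, so a walk over getFirstChild()/getNextSibling() like the one above never visits them. They are reached through getAttributes() instead. The following is only a minimal sketch of that idea; the class AttributeDemo, the helpers describeAttributes and parse, and the sample XML string are all made up for illustration and are not part of the original code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

public class AttributeDemo {
    // Hypothetical helper: formats each attribute of an element as ATTR: name="value".
    static String describeAttributes(Element e) {
        StringBuilder sb = new StringBuilder();
        NamedNodeMap attrs = e.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr a = (Attr) attrs.item(i);
            sb.append(" ATTR: ").append(a.getName())
              .append("=\"").append(a.getValue()).append("\"");
        }
        return sb.toString();
    }

    // Hypothetical helper: parses an XML string held in memory instead of a file.
    static Document parse(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse sample XML", ex);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document with a single attribute on the root element.
        Document doc = parse("<Profile id=\"7\"><Name>demo</Name></Profile>");
        Element root = doc.getDocumentElement();
        System.out.println(root.getTagName() + describeAttributes(root));
    }
}
```

A call like describeAttributes could be added to the ELEM branch of printxml if attribute output is wanted. The sample uses a single attribute because the order of entries in a NamedNodeMap is not guaranteed.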
For example, suppose the XML document I load looks like this:
<?xml version="1.0"?>
<GenericProfileOfVideoCodecSettings xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Name>1P-Goodquality</Name>
<Test>test</Test>
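</GenericProfileOfVideoCodecSettings>
The generated document tree should then be the document node, whose root element GenericProfileOfVideoCodecSettings has two child elements, Name and Test, each holding a single text node.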

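In practice, however, every child of GenericProfileOfVideoCodecSettings comes with a blank sibling node. This "blank" is simply the spaces and line breaks between two elements in the XML source, which the parser reports as text nodes. As a result, the output above contains many lines like this: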
TEXT: nodeName="#text"
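
One common way to deal with these nodes is to skip any text node whose value is only whitespace while walking the tree. The sketch below assumes that approach; the class WhitespaceDemo, its helper methods, and the shortened sample document (root element renamed to Root) are illustrative only and not taken from the original code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

public class WhitespaceDemo {
    // True for text nodes that contain nothing but spaces, tabs or line breaks.
    static boolean isBlankText(Node n) {
        return n.getNodeType() == Node.TEXT_NODE
                && n.getNodeValue().trim().isEmpty();
    }

    // Counts the direct children of n, optionally skipping blank text nodes.
    static int countChildren(Node n, boolean skipBlank) {
        int count = 0;
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (skipBlank && isBlankText(c)) {
                continue;
            }
            count++;
        }
        return count;
    }

    // Hypothetical helper: parses an XML string held in memory instead of a file.
    static Document parse(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return db.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse sample XML", ex);
        }
    }

    public static void main(String[] args) {
        // Same layout as the example document: one line break between elements.
        String xml = "<Root>\n<Name>1P-Goodquality</Name>\n<Test>test</Test>\n</Root>";
        Node root = parse(xml).getDocumentElement();
        // All children: blank text, Name, blank text, Test, blank text.
        System.out.println("all children: " + countChildren(root, false));
        // Only the two elements remain once blank text nodes are skipped.
        System.out.println("without blanks: " + countChildren(root, true));
    }
}
```

The same isBlankText check could be placed at the top of the loop in printxml so that the empty TEXT lines disappear from the output.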

