XPath parser in Java



Author: chris_freedream (Shenzhen)   Posted: 2009-05-13   Last modified: 2009-05-13

1. Overview  2. Key points  3. Implementation  4. Test code  5. Test result  6. Summary

1. Overview

This post shows how to evaluate XPath expressions in Java without depending on third-party libraries.

2. Key points

1) How to convert an XML file or XML character stream into a DOM Node/Element.
2) How to specify the encoding of the XML file or character stream.
3) How to provide a namespace resolver for the XML.
4) How to call the XPath API to evaluate a user-supplied expression and return the result.

3. Implementation

    package com.chris.xpath;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.PrintStream;
    import java.io.StringReader;

    import javax.xml.namespace.NamespaceContext;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import javax.xml.parsers.ParserConfigurationException;
    import javax.xml.xpath.XPath;
    import javax.xml.xpath.XPathConstants;
    import javax.xml.xpath.XPathExpression;
    import javax.xml.xpath.XPathExpressionException;
    import javax.xml.xpath.XPathFactory;

    import org.w3c.dom.Node;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.SAXParseException;

    /**
     * <code>XPathUtil</code>.
     * @author chris.wang
     */
    public class XPathUtil {

        private PrintStream error = System.err;

        /**
         * Evaluate an XPath expression.
         * @param query the XPath expression
         * @param xmlDocument the DOM node to evaluate against
         * @param defaultNamespaceContext resolves the prefixes used in the query
         * @return value of element or attribute, or null on failure.
         */
        public String evaluateXPath( String query, Node xmlDocument,
                NamespaceContext defaultNamespaceContext ) {
            String result = null;
            XPathFactory factory = XPathFactory.newInstance();
            XPath xpath = factory.newXPath();
            xpath.setNamespaceContext(defaultNamespaceContext);

            XPathExpression expr = null;
            try {
                expr = xpath.compile(query);
            } catch ( XPathExpressionException xpee ) {
                Throwable x = xpee;
                if ( null != xpee.getCause() ) {
                    x = xpee.getCause();
                    if ( "javax.xml.transform.TransformerException".equals(x.getClass().getName()) ) {
                        error.println("Error compiling xpath expression [" + query
                                + "]. Could all the required namespaces be resolved from the previous response?");
                    } else {
                        error.println("Error compiling xpath expression [" + query
                                + "]. Is the expression well-formed in XML Spy?");
                    }
                }
                x.printStackTrace(error);
                return null;
            }

            try {
                result = (String) expr.evaluate(xmlDocument, XPathConstants.STRING);
            } catch ( XPathExpressionException e ) {
                e.printStackTrace(error);
            }
            return result;
        }

        /**
         * Convert xmlString into Node.
         * @param xmlString the XML text
         * @param encodingHint encoding type.
         * @return Node, or null if the XML could not be parsed
         */
        public Node loadXMLResource( String xmlString, String encodingHint ) {
            Node document = null;
            // Strip a leading byte order mark (guarding against an empty string).
            if ( xmlString.length() > 0 && 0xFEFF == xmlString.charAt(0) ) {
                xmlString = xmlString.substring(1);
            }

            InputSource source = new InputSource(new BufferedReader(new StringReader(xmlString)));
            if ( null != encodingHint ) {
                // Note: per the SAX documentation, setEncoding has no effect when the
                // InputSource wraps a character stream, as it does here. It only
                // matters for byte streams and system IDs.
                source.setEncoding(encodingHint);
            }

            // Parse the XML purely as well-formed XML and get a DOM tree representation.
            try {
                document = loadDocument(source);
            } catch ( SAXParseException spe ) {
                // Error generated by the parser
                if ( null != spe.getSystemId() ) {
                    error.println("\n* XML Parsing error , line " + spe.getLineNumber()
                            + " character " + spe.getColumnNumber() + ", uri=" + spe.getSystemId());
                    error.println(" " + spe.getMessage() + "\n"); // NOPMD
                } else {
                    error.println("\n XML Parsing error : " + spe.getMessage() + "\n");
                }
                // Use the contained exception, if any
                Exception x = spe;
                if ( null != spe.getException() ) {
                    x = spe.getException();
                }
                x.printStackTrace(error);
            } catch ( SAXException se ) {
                document = null;
                error.println("Error parsing the XML resource - is the XML valid and well-formed?");
                // Use the contained exception, if any
                Exception x = se;
                if ( null != se.getException() ) {
                    x = se.getException();
                }
                x.printStackTrace(error);
            } catch ( IOException ioe ) {
                document = null;
                error.println("Error loading the XML resource - can the xmlString be read?");
                ioe.printStackTrace(error);
            }
            return document;
        }

        /**
         * Load document from InputSource( i.e. XML file or other stream ).
         * @param source
         * @return Node of DOM.
         * @throws SAXException
         * @throws IOException
         */
        public Node loadDocument( InputSource source ) throws SAXException, IOException {
            // Create the DOM factory
            DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
            domFactory.setNamespaceAware(true);
            domFactory.setValidating(false); // only for DocType (dtd) validation

            // Get a parser capable of parsing XML into a DOM tree
            DocumentBuilder parser;
            try {
                parser = domFactory.newDocumentBuilder();
            } catch ( ParserConfigurationException pce ) {
                // Fail here instead of hitting a NullPointerException on the null parser below.
                throw new SAXException("Could not create a DOM parser", pce);
            }

            // Parse the XML purely as well-formed XML and get a DOM tree representation.
            // A newly created builder does not need reset().
            return parser.parse(source);
        }
    }

4. Test code

    package com.chris.xpath;

    import java.util.Iterator;
    import java.util.Map;

    import javax.xml.XMLConstants;
    import javax.xml.namespace.NamespaceContext;

    import org.w3c.dom.Node;

    public class XPathUtilDriver {

        private static NamespaceContext defaultNamespaceContext;

        static {
            defaultNamespaceContext = new NamespaceContext() {
                // No extra prefixes are registered in this demo.
                private Map<String, String> m_prefixMap = null; // NOPMD

                public String getNamespaceURI( String prefix ) {
                    if ( null == prefix ) {
                        // The NamespaceContext contract specifies IllegalArgumentException here.
                        throw new IllegalArgumentException("Null prefix");
                    }
                    if ( "xml".equals(prefix) ) {
                        return XMLConstants.XML_NS_URI;
                    }
                    if ( null != m_prefixMap && m_prefixMap.containsKey(prefix) ) {
                        return m_prefixMap.get(prefix);
                    }
                    return XMLConstants.NULL_NS_URI;
                }

                // This method isn't necessary for XPath processing.
                public String getPrefix( String uri ) {
                    throw new UnsupportedOperationException();
                }

                // This method isn't necessary for XPath processing either.
                public Iterator<String> getPrefixes( String uri ) {
                    throw new UnsupportedOperationException();
                }
            };
        }

        public static void main( String[] args ) {
            String xmlString = "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                    + "<books><book><name>JUnit in Action</name>"
                    + "</book>"
                    + "</books>";
            XPathUtil util = new XPathUtil();
            Node node = util.loadXMLResource(xmlString, "utf-8");
            String value = util.evaluateXPath("//books/book[1]/name/text()", node, defaultNamespaceContext);
            System.out.println("execute result:" + value);
        }
    }

5. Test result

    execute result:JUnit in Action

6. Summary

1) For namespaces to take effect you must call domFactory.setNamespaceAware(true); otherwise the parsed Node may not work with namespace-qualified queries.
2) XML can arrive in different encodings, so encoding is another point to watch.
3) Third-party libraries such as JDOM and DOM4j support XPath well, and I personally find them convenient to use. If your project already uses one of them, you can skip some of the parsing work shown here.
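
The test above never uses a namespace, so the point about setNamespaceAware(true) and the namespace resolver is not actually exercised. Below is a minimal sketch of how a prefix map might be wired in. It is not the original author's code: the class NamespaceXPathSketch, the helpers contextOf and evaluate, the prefix "b" and the URI urn:example:books are all made up for illustration. It relies only on the standard javax.xml APIs.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical sketch, not part of the original post. */
final class NamespaceXPathSketch {

    /** Minimal NamespaceContext backed by a prefix -> URI map (assumed helper). */
    static NamespaceContext contextOf(final Map<String, String> prefixes) {
        return new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                if (prefix == null) {
                    throw new IllegalArgumentException("Null prefix");
                }
                if (XMLConstants.XML_NS_PREFIX.equals(prefix)) {
                    return XMLConstants.XML_NS_URI;
                }
                String uri = prefixes.get(prefix);
                return uri != null ? uri : XMLConstants.NULL_NS_URI;
            }

            // Not needed for XPath evaluation in this sketch.
            public String getPrefix(String uri) {
                throw new UnsupportedOperationException();
            }

            public Iterator<String> getPrefixes(String uri) {
                throw new UnsupportedOperationException();
            }
        };
    }

    /** Parses xml with a namespace-aware DOM parser and returns the string value of query. */
    static String evaluate(String xml, String query, Map<String, String> prefixes) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for prefixed node tests to match
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(contextOf(prefixes));
            return xpath.evaluate(query, doc);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + query, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books xmlns=\"urn:example:books\">"
                + "<book><name>JUnit in Action</name></book></books>";
        Map<String, String> prefixes = new HashMap<String, String>();
        prefixes.put("b", "urn:example:books");

        // Prefixed steps resolve through the NamespaceContext.
        System.out.println(evaluate(xml, "/b:books/b:book[1]/b:name", prefixes));
        // Unprefixed steps only match elements in no namespace, so this prints an empty line.
        System.out.println(evaluate(xml, "/books/book[1]/name", prefixes));
    }
}
```

Note that the prefix used in the query ("b") does not have to appear in the document. XPath 1.0 has no notion of a default namespace for element names, so a document that declares xmlns="..." still needs some prefix bound in the NamespaceContext before its elements can be selected.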