Erlang XML Processing Solutions
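XML and its companion tools (XSLT, XPath, XSD) give us a great deal of flexibility and convenience at the data level. The code generation for our game protocol works this way: we first design the protocol Schema with XSD tooling, then use the .NET xsd tool to generate entity classes directly, and from then on we simply manipulate objects inside our tool. The protocol XML files can also be validated against the Schema up front. The Erlang libraries do support XML; you may not have found that support in STDLIB because it lives in a separate application: http://www.erlang.org/doc/apps/xmerl/index.html
If you have forgotten the common XML concepts, it is best to review them on Wikipedia first:
XML, XHTML, DTD (Document Type Definition), XML Schema, XLink, SVG, XSLT, X3D, SAX. W3C XML tutorials: http://www.w3school.com.cn/x.asp
In the header file "\erl5.9.1\lib\xmerl-1.3.1\include\xmerl.hrl" we can see how the XML concepts above are represented in Erlang: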
%% XML Element
%% content = [#xmlElement()|#xmlText()|#xmlPI()|#xmlComment()|#xmlDecl()]
-record(xmlElement,{
    name,                   % atom()
    expanded_name = [],     % string() | {URI,Local} | {"xmlns",Local}
    nsinfo = [],            % {Prefix, Local} | []
    namespace=#xmlNamespace{},
    parents = [],           % [{atom(),integer()}]
    pos,                    % integer()
    attributes = [],        % [#xmlAttribute()]
    content = [],
    language = "",          % string()
    xmlbase="",             % string() XML Base path, for relative URI:s
    elementdef=undeclared   % atom(), one of [undeclared | prolog | external | element]
}).
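Judging by its module layout, the official Erlang solution has everything it needs: xmerl_scan, xmerl, xmerl_xs, xmerl_eventp, xmerl_xpath, xmerl_xsd, xmerl_sax_parser. However, the official documentation does not offer demo code with a low enough barrier to entry, and the only two pieces of sample code are not easy to find, possibly because of how search engines index them. They are actually at: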
http://erlang.org/doc/apps/xmerl/xmerl_xs_examples.html
http://www.erlang.org/doc/apps/xmerl/xmerl_xs_examples.html
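If you have already installed Erlang, you can find them under this path: erl5.9.1\lib\xmerl-1.3.1\doc\html. Let's look at how to use xmerl through two of the simplest possible pieces of code.
Parsing & Creating XML
Parsing XML
First we design a simple XML file for this demo, test.xml, for example: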
<shopping>
  <item name="bread" quantity="3" price="2.50" />
  <item name="milk" quantity="2" price="3.50" />
</shopping>
We want to parse the XML above and compute the total amount of the shopping list. With xmerl it can be done like this:
-module(shopping).  % module name added here so the file compiles; any name works
-include_lib("xmerl/include/xmerl.hrl").
-export([get_total/1]).

%% ShoppingList is the XML document as a string.
get_total(ShoppingList) ->
    {XmlElt, _} = xmerl_scan:string(ShoppingList),
    Items = xmerl_xpath:string("/shopping/item", XmlElt),
    Total = lists:foldl(fun(Item, Tot) ->
                [#xmlAttribute{value = PriceString}] = xmerl_xpath:string("/item/@price", Item),
                {Price, _} = string:to_float(PriceString),
                [#xmlAttribute{value = QuantityString}] = xmerl_xpath:string("/item/@quantity", Item),
                {Quantity, _} = string:to_integer(QuantityString),
                Tot + Price*Quantity
        end,
        0, Items),
    io:format("$~.2f~n", [Total]).
Running the code above prints the result: $14.50
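The function above takes the XML as a string. As a minimal sketch of the same calculation reading straight from a file, the module below uses xmerl_scan:file/1 and looks the attributes up in the #xmlElement record instead of running one XPath query per attribute. The module name shopping_file and the helper attr_value/2 are made up for this illustration, and the sketch assumes every price is written with a decimal point (list_to_float/1 rejects "3").

```erlang
-module(shopping_file).
-include_lib("xmerl/include/xmerl.hrl").
-export([total_from_file/1]).

%% Parse the file at Path and return the total as a float.
total_from_file(Path) ->
    {Doc, _Rest} = xmerl_scan:file(Path),
    Items = xmerl_xpath:string("/shopping/item", Doc),
    lists:foldl(fun(#xmlElement{attributes = Attrs}, Acc) ->
                        Price = list_to_float(attr_value(price, Attrs)),
                        Quantity = list_to_integer(attr_value(quantity, Attrs)),
                        Acc + Price * Quantity
                end, 0.0, Items).

%% Hypothetical helper: xmerl stores attribute names as atoms,
%% so we can key on the name field of #xmlAttribute{}.
attr_value(Name, Attrs) ->
    #xmlAttribute{value = Value} = lists:keyfind(Name, #xmlAttribute.name, Attrs),
    Value.
```

Calling shopping_file:total_from_file("test.xml") returns the number rather than printing it, which leaves formatting to the caller.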
Creating XML Dynamically
Next we create an XML document dynamically from a CSV data source. The CSV content is:
bread,3,2.50
milk,2,3.50
The XML we want to create is shown below; it is simply the shopping list from above:
<shopping>
  <item name="bread" quantity="3" price="2.50" />
  <item name="milk" quantity="2" price="3.50" />
</shopping>
Implementation:
to_xml(ShoppingList) ->
    Items = lists:map(fun(L) ->
                [Name, Quantity, Price] = string:tokens(L, ","),
                {item, [{name, Name}, {quantity, Quantity}, {price, Price}], []}
        end, string:tokens(ShoppingList, "\n")),
    xmerl:export_simple([{shopping, [], Items}], xmerl_xml).

The official solution is frankly underwhelming, and some people have even been driven to anger by it; see, for example, "[erlang-questions] Rant: I hate parsing XML with Erlang". We do have other options, though, such as erlsom.
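Before moving on, here is a rough sketch of how to_xml might be driven from an actual CSV file. xmerl:export_simple/2 returns a deep character list rather than a flat string, so the sketch converts it with unicode:characters_to_binary/1 before writing. The module name csv_to_xml and the function convert_file/2 are invented for this example; to_xml/1 is repeated (as a list comprehension) only so the module compiles on its own, and it also splits on "\r" so that Windows line endings do not end up inside the price.

```erlang
-module(csv_to_xml).
-export([to_xml/1, convert_file/2]).

%% Same conversion as to_xml above; lines may end in \n or \r\n.
to_xml(ShoppingList) ->
    Items = [begin
                 [Name, Quantity, Price] = string:tokens(Line, ","),
                 {item, [{name, Name}, {quantity, Quantity}, {price, Price}], []}
             end || Line <- string:tokens(ShoppingList, "\r\n")],
    xmerl:export_simple([{shopping, [], Items}], xmerl_xml).

%% Hypothetical driver: read the CSV at CsvPath, write the XML to XmlPath.
convert_file(CsvPath, XmlPath) ->
    {ok, Csv} = file:read_file(CsvPath),
    Xml = to_xml(binary_to_list(Csv)),
    %% export_simple/2 yields nested character data; flatten it into a binary.
    file:write_file(XmlPath, unicode:characters_to_binary(Xml)).
```

In the shell, lists:flatten(csv_to_xml:to_xml("bread,3,2.50\nmilk,2,3.50\n")) gives a readable string to inspect.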
erlsom
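erlsom project page: http://sourceforge.net/projects/erlsom/
erlsom supports three usage models:
As a SAX parser. Note: SAX (Simple API for XML) is a parser API that reads XML sequentially.
As a simple sort of DOM parser. Note: DOM (Document Object Model) is the standard programming interface recommended by the W3C for processing XML.
As a 'data binder' that parses directly into Erlang records, similar in concept to a strongly typed DataSet.
Let's try out all three modes in practice. We use the XML below, saved as test2.xml, and the goal is again to compute the total amount of the shopping list.
<?xml version="1.0"?>
<shopping>
  <item name="bread" quantity="3" price="2.50" />
  <item name="milk" quantity="2" price="3.50" />
</shopping>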
SAX parser
2> {ok, Xml} = file:read_file("test.xml").
{ok,<<"<shopping> \r\n <item name=\"bread\" quantity=\"3\" price=\"2.50\" /> \r\n <item name=\"milk\" quantity=\"2\" price=\"3.50"...>>}
3> erlsom:parse_sax(Xml, [], fun(Event, Acc) -> io:format("~p~n", [Event]), Acc end).
startDocument
{startElement,[],"shopping",[],[]}
{ignorableWhitespace," \r\n "}
{startElement,[],"item",[],
              [{attribute,"price",[],[],"2.50"},
               {attribute,"quantity",[],[],"3"},
               {attribute,"name",[],[],"bread"}]}
{endElement,[],"item",[]}
{ignorableWhitespace," \r\n "}
{startElement,[],"item",[],
              [{attribute,"price",[],[],"3.50"},
               {attribute,"quantity",[],[],"2"},
               {attribute,"name",[],[],"milk"}]}
{endElement,[],"item",[]}
{ignorableWhitespace," \r\n "}
{endElement,[],"shopping",[]}
endDocument
{ok,[],""}
4> Sum = fun(Event, Acc) ->
       case Event of
           {startElement, _, "item", _, [{_,_,_,_,P},{_,_,_,_,C},_]} ->
               Acc + list_to_float(P)*list_to_integer(C);
           _ -> Acc
       end
   end.
#Fun<erl_eval.12.82930912>
5> erlsom:parse_sax(Xml, 0, Sum).
{ok,14.5,""}
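The Sum fun above relies on the attributes arriving in the order price, quantity, name. As a sketch of a handler that does not depend on that order, the module below looks each attribute up by name, using only the {attribute, Name, _, _, Value} tuple shape visible in the trace above. The module name sax_total and the function handle_event/2 are made up for this example.

```erlang
-module(sax_total).
-export([handle_event/2]).

%% SAX callback: add Price * Quantity for every <item> start tag.
%% Attribute tuples look like {attribute, Name, _, _, Value},
%% so the attribute name is element 2 of each tuple.
handle_event({startElement, _, "item", _, Attrs}, Acc) ->
    {attribute, _, _, _, Price} = lists:keyfind("price", 2, Attrs),
    {attribute, _, _, _, Quantity} = lists:keyfind("quantity", 2, Attrs),
    Acc + list_to_float(Price) * list_to_integer(Quantity);
%% Every other event leaves the accumulator untouched.
handle_event(_Other, Acc) ->
    Acc.
```

It would be passed to the parser as erlsom:parse_sax(Xml, 0, fun sax_total:handle_event/2).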
DOM parser
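The result produced by the code below is much cleaner and simpler, because the XML structural information has been stripped away; the follow-up calculation is omitted.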
9> erlsom:simple_form(Xml).
{ok,{"shopping",[],
     [{"item",
       [{"price","2.50"},{"quantity","3"},{"name","bread"}],
       []},
      {"item",
       [{"price","3.50"},{"quantity","2"},{"name","milk"}],
       []}]},
    ""}
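For completeness, the omitted calculation could look roughly like the sketch below. It works on the {Tag, Attributes, Content} tuple shown in the shell output above and needs nothing from erlsom itself; the module name simple_total and its functions are invented for this illustration.

```erlang
-module(simple_total).
-export([total/1]).

%% Element is the tuple found inside {ok, Element, _} in the output above.
%% The generator pattern skips any child that is not an "item" tuple.
total({"shopping", _Attrs, Children}) ->
    lists:sum([item_cost(Attrs) || {"item", Attrs, _Content} <- Children]).

%% Attributes are {Name, Value} string pairs, so proplists can look them up.
item_cost(Attrs) ->
    Price = list_to_float(proplists:get_value("price", Attrs)),
    Quantity = list_to_integer(proplists:get_value("quantity", Attrs)),
    Price * Quantity.
```

In the shell this would be used as {ok, Element, _} = erlsom:simple_form(Xml), simple_total:total(Element).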
Data Binder
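First design the XSD for the XML, then use the XSD to connect every stage in which the data model is used, for example generating C# code and getting strongly typed objects directly; this approach is very common in .NET. The data binder mode provided by erlsom implements exactly this design approach. The starting point is still an XSD file, so let's design one for test2.xml above: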
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="shopping" type="shoppingType"/>
  <xsd:complexType name="shoppingType">
    <xsd:sequence>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:attribute name="name" type="xsd:string" use="required"/>
          <xsd:attribute name="quantity" type="xsd:positiveInteger" use="required"/>
          <xsd:attribute name="price" type="xsd:decimal" use="required"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

Then we generate the corresponding records from the XSD; erlsom already provides a tool for this:
28> erlsom:write_xsd_hrl_file("test.xsd", "test.hrl").
ok
Open test.hrl and the corresponding records have been generated:
%% HRL file generated by ERLSOM
%%
%% It is possible to change the name of the record fields.
%%
%% It is possible to add default values, but be aware that these will
%% only be used when *writing* an xml document.

-record('shoppingType', {anyAttribs, 'item'}).
-record('shoppingType/item', {anyAttribs, 'name', 'quantity', 'price'}).
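So that all the tests can be done inside the Erlang shell, we use the rd() command to create the record definitions in the shell whenever records are needed later.
Next comes parsing and mapping to records: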
Eshell V5.9.1  (abort with ^G)
1> {ok, X} = erlsom:compile_xsd_file("test.xsd").

=ERROR REPORT==== 20-Jul-2012::06:53:09 ===
Call to tuple fun {erlsom_parse,xml2StructCallback}.

Tuple funs are deprecated and will be removed in R16. Use "fun M:F/A" instead, for example "fun erlsom_parse:xml2StructCallback/2".

(This warning will only be shown the first time a tuple fun is called.)

{ok,{model,[{type,'_document',sequence,
                  [{el,[{alt,shopping,shoppingType,[],1,1,true,undefined}],
                       1,1,1}],
                  [],undefined,undefined,1,1,1,false,undefined},
            {type,shoppingType,sequence,
                  [{el,[{alt,item,'shoppingType/item',[],1,1,true,undefined}],
                       0,unbound,1}],
                  [],undefined,undefined,2,1,1,undefined,undefined},
            {type,'shoppingType/item',sequence,[],
                  [{att,name,1,false,char},
                   {att,quantity,2,false,char},
                   {att,price,3,false,char}],
                  undefined,undefined,4,1,1,undefined,undefined}],
           [{ns,"http://www.w3.org/2001/XMLSchema","xsd"}],
           undefined,[]}}
2> {ok, Xml} = file:read_file("test2.xml").
{ok,<<"ï»¿<?xml version=\"1.0\"?>\r\n<shopping> \r\n <item name=\"bread\" quantity=\"3\" price=\"2.50\" /> \r\n <item name=\"milk"...>>}
3> {ok, Result, _} = erlsom:scan(Xml, X).
{ok,{shoppingType,[],
                  [{'shoppingType/item',[],"bread","3","2.50"},
                   {'shoppingType/item',[],"milk","2","3.50"}]},
    ""}

For XML that is not too complex, a result parsed to this level is already very convenient to work with, and you could stop here and finish the final calculation. For particularly complex XML, however, working with records is more flexible and more intuitive, so let's walk through the rest of the process:
5> rd('shoppingType', {anyAttribs, 'item'}).
shoppingType
6> rd('shoppingType/item', {anyAttribs, 'name', 'quantity', 'price'}).
'shoppingType/item'
7> Result#shoppingType.'item'.
[#'shoppingType/item'{anyAttribs = [],name = "bread",
                      quantity = "3",price = "2.50"},
 #'shoppingType/item'{anyAttribs = [],name = "milk",
                      quantity = "2",price = "3.50"}]
8> hd(Result#shoppingType.'item').
#'shoppingType/item'{anyAttribs = [],name = "bread",
                     quantity = "3",price = "2.50"}
9> #'shoppingType/item'.quantity.
4

Other Options
[1] JSON: as a lightweight data-interchange format, JSON has huge advantages, and there are many Erlang solutions for it, such as ejson; mochiweb also has a relevant module.
[2] Solutions represented by Google's Protocol Buffers and Facebook's Thrift.
[3] Piqi includes a data serialization system for Erlang. It can be used for serializing Erlang values in 4 different formats: Google Protocol Buffers, JSON, XML and Piq.
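Good night!
To finish, here is a screenshot from the '96 Xinghe edition of <笑傲江湖>; this version absolutely delights me.
The '83 射雕, '94 射雕, '95 神雕, '96 笑傲 and '97 天龙八部: I never tire of watching them.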
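Category: Erlang
Tags: erlang, xml, xmerl, erlsom
Author: Leo_wl
Source: http://www.cnblogs.com/Leo_wl/
Copyright of this article is shared by the author and 博客园 (cnblogs). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be given in a prominent place on the page; otherwise the right to pursue legal liability is reserved.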