
Erlang XML Processing Solutions


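XML and its related tools (XSLT, XPath, XSD) give us a great deal of flexibility and convenience at the data level. The code generation for our game protocol works this way: we first designed the protocol schema with XSD, then used the .NET xsd tool to generate entity classes directly, and from then on our tools simply manipulate objects. The protocol XML files can also be checked against the schema up front to validate the data. The Erlang libraries support XML as well. You may not have found it in STDLIB, because this part lives in a separate application, xmerl: http://www.erlang.org/doc/apps/xmerl/index.html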

  

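  If you have forgotten the common XML concepts, it is worth reviewing them on Wikipedia first:

XML, XHTML, DTD (Document Type Definition), XML Schema, XLink, SVG, XSLT, X3D, SAX; W3C XML tutorials: http://www.w3school.com.cn/x.asp

  In the header file "\erl5.9.1\lib\xmerl-1.3.1\include\xmerl.hrl" we can see how these XML concepts are represented in Erlang: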

%% XML Element
%% content = [#xmlElement()|#xmlText()|#xmlPI()|#xmlComment()|#xmlDecl()]
-record(xmlElement,{
       name,                   % atom()
       expanded_name = [],     % string() | {URI,Local} | {"xmlns",Local}
       nsinfo = [],            % {Prefix, Local} | []
       namespace=#xmlNamespace{},
       parents = [],           % [{atom(),integer()}]
       pos,                    % integer()
       attributes = [],        % [#xmlAttribute()]
       content = [],
       language = "",          % string()
       xmlbase="",             % string() XML Base path, for relative URI:s
       elementdef=undeclared   % atom(), one of [undeclared | prolog | external | element]
     }).
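Because elements are plain records, you can pattern match on them directly. The following is a minimal sketch, not part of xmerl itself: the module name xmerl_attr and the helper attr/2 are made up for illustration. It looks up an attribute by name in the attributes field of an #xmlElement{}.

```erlang
-module(xmerl_attr).
-include_lib("xmerl/include/xmerl.hrl").
-export([attr/2]).

%% Return {ok, Value} for the attribute called Name (an atom), or 'error'
%% if the element has no such attribute.
attr(Name, #xmlElement{attributes = Attrs}) ->
    %% #xmlAttribute.name is the tuple position of the 'name' field,
    %% so lists:keyfind/3 can search the record list directly.
    case lists:keyfind(Name, #xmlAttribute.name, Attrs) of
        #xmlAttribute{value = Value} -> {ok, Value};
        false -> error
    end.
```

For an element obtained with xmerl_scan:string/1, a call such as xmerl_attr:attr(name, Element) should then give you the attribute value without going through XPath.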

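  Judging by its module layout, the official Erlang solution is fully equipped: xmerl_scan, xmerl, xmerl_xs, xmerl_eventp, xmerl_xpath, xmerl_xsd, xmerl_sax_parser. The official documentation, however, does not offer demo code with a low enough barrier to entry. The only two examples are hard to find, possibly because of how search engines index them. They are here: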

     http://erlang.org/doc/apps/xmerl/xmerl_xs_examples.html  

     http://www.erlang.org/doc/apps/xmerl/xmerl_xs_examples.html   

 If you have Erlang installed, you can also find them under erl5.9.1\lib\xmerl-1.3.1\doc\html. Let's look at how to use xmerl through two very simple pieces of code.

 

Parsing & Creating XML

Parsing XML

 First we design a simple XML file for this demo, test.xml:

<shopping>
  <item name="bread" quantity="3" price="2.50"/>
  <item name="milk" quantity="2" price="3.50"/>
</shopping>

We want to parse the XML above and compute the total amount of the shopping list. With xmerl it can be done like this:

-module(shopping).   % module name chosen for this example
-include_lib("xmerl/include/xmerl.hrl").
-export([get_total/1]).

%% ShoppingList is the XML document as a string.
get_total(ShoppingList) ->
    {XmlElt, _} = xmerl_scan:string(ShoppingList),
    Items = xmerl_xpath:string("/shopping/item", XmlElt),
    Total = lists:foldl(fun(Item, Tot) ->
                                [#xmlAttribute{value = PriceString}] = xmerl_xpath:string("/item/@price", Item),
                                {Price, _} = string:to_float(PriceString),
                                [#xmlAttribute{value = QuantityString}] = xmerl_xpath:string("/item/@quantity", Item),
                                {Quantity, _} = string:to_integer(QuantityString),
                                Tot + Price * Quantity
                        end,
                        0, Items),
    io:format("$~.2f~n", [Total]).

Running the code above prints: $14.50

 

Creating XML dynamically

 Next we build an XML document dynamically from a CSV data source. The CSV content is:

bread,3,2.50 
milk,2,3.50 

 

 The XML to create is shown below; it is simply the shopping list from above:

<shopping>
  <item name="bread" quantity="3" price="2.50"/>
  <item name="milk" quantity="2" price="3.50"/>
</shopping>

Implementation:

%% ShoppingList is the CSV content as a string, one item per line.
to_xml(ShoppingList) ->
    Items = lists:map(fun(L) ->
                              [Name, Quantity, Price] = string:tokens(L, ","),
                              {item, [{name, Name}, {quantity, Quantity}, {price, Price}], []}
                      end, string:tokens(ShoppingList, "\n")),
    xmerl:export_simple([{shopping, [], Items}], xmerl_xml).
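xmerl:export_simple/2 returns a deep list of characters rather than a flat string, so it usually needs one more step before it can be written out. The sketch below is one way to do that. The module name csv_to_xml and the helper to_binary/1 are assumptions made for this example. It also splits lines on both "\r" and "\n", on the assumption that the CSV may come from a Windows editor.

```erlang
-module(csv_to_xml).
-export([to_xml/1, to_binary/1]).

%% Same conversion as in the article, but treating both "\r" and "\n" as line
%% separators so that Windows line endings do not leak into the price field.
to_xml(Csv) ->
    Items = [begin
                 [Name, Quantity, Price] = string:tokens(Line, ","),
                 {item, [{name, Name}, {quantity, Quantity}, {price, Price}], []}
             end || Line <- string:tokens(Csv, "\r\n")],
    xmerl:export_simple([{shopping, [], Items}], xmerl_xml).

%% Flatten the exported character data into a UTF-8 binary, which is
%% convenient for file:write_file/2.
to_binary(Csv) ->
    unicode:characters_to_binary(to_xml(Csv)).
```

A call such as file:write_file("out.xml", csv_to_xml:to_binary(Csv)) would then store the document; the file name out.xml is only an example.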

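  The official solution is honestly not very satisfying, and some people have even been annoyed by it; see for example the mailing list thread "[erlang-questions] Rant: I hate parsing XML with Erlang". Fortunately there are other options, such as erlsom.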

 

erlsom

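  erlsom project page: http://sourceforge.net/projects/erlsom/ erlsom supports three usage models:

[1] As a SAX parser. Note: SAX (Simple API for XML) is a parser API that reads XML sequentially.

[2] As a simple sort of DOM parser. Note: DOM (Document Object Model) is the standard programming interface recommended by the W3C for processing XML.

[3] As a 'data binder' that parses directly into Erlang records, similar in concept to a strongly typed DataSet.


Let's try all three modes in practice. We use the XML below, saved as test2.xml, and the goal is again to compute the total amount of the shopping list.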

<?xml version="1.0"?>
<shopping>
  <item name="bread" quantity="3" price="2.50"/>
  <item name="milk" quantity="2" price="3.50"/>
</shopping>

 

SAX parser

2> {ok, Xml} = file:read_file("test.xml").
{ok,<<"<shopping> \r\n  <item name=\"bread\" quantity=\"3\" price=\"2.50\"/> \r\n  <item name=\"milk\" quantity=\"2\" price=\"3.50"...>>}
3> erlsom:parse_sax(Xml, [], fun(Event, Acc) -> io:format("~p~n", [Event]), Acc end).
startDocument
{startElement,[],"shopping",[],[]}
{ignorableWhitespace," \r\n  "}
{startElement,[],"item",[],
              [{attribute,"price",[],[],"2.50"},
               {attribute,"quantity",[],[],"3"},
               {attribute,"name",[],[],"bread"}]}
{endElement,[],"item",[]}
{ignorableWhitespace," \r\n  "}
{startElement,[],"item",[],
              [{attribute,"price",[],[],"3.50"},
               {attribute,"quantity",[],[],"2"},
               {attribute,"name",[],[],"milk"}]}
{endElement,[],"item",[]}
{ignorableWhitespace," \r\n"}
{endElement,[],"shopping",[]}
endDocument
{ok,[]," "}
4> Sum = fun(Event, Acc) -> case Event of {startElement, _, "item", _, [{_,_,_,_,P},{_,_,_,_,C},_]} -> Acc + list_to_float(P)*list_to_integer(C); _ -> Acc end end.
#Fun<erl_eval.12.82930912>
5> erlsom:parse_sax(Xml, 0, Sum).
{ok,14.5," "}
6>
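The Sum fun above depends on the attributes arriving in the order price, quantity, name. A slightly more defensive variant looks the attributes up by name instead. This is only a sketch: the module name sax_total and its helpers are made up for this example, and the event shapes are taken from the output printed above rather than from the erlsom documentation.

```erlang
-module(sax_total).
-export([handle_event/2, total/1]).

%% SAX-style callback: add price * quantity for every <item> start tag.
%% Attributes are found by name, so their order does not matter.
handle_event({startElement, _Uri, "item", _Prefix, Attrs}, Acc) ->
    Price = to_number(attr("price", Attrs)),
    Quantity = list_to_integer(attr("quantity", Attrs)),
    Acc + Price * Quantity;
handle_event(_Event, Acc) ->
    Acc.

%% Fold the callback over a list of events, which makes it possible to try
%% the callback without running a parser.
total(Events) ->
    lists:foldl(fun handle_event/2, 0, Events).

%% Attribute tuples look like {attribute, Name, Prefix, Uri, Value}.
attr(Name, Attrs) ->
    {attribute, Name, _, _, Value} = lists:keyfind(Name, 2, Attrs),
    Value.

%% Accept both "2.50" and "2" as a price.
to_number(S) ->
    case string:to_float(S) of
        {F, []} -> F;
        _ -> float(list_to_integer(S))
    end.
```

With erlsom loaded, the callback can be passed in the same way as Sum: erlsom:parse_sax(Xml, 0, fun sax_total:handle_event/2).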

 

DOM parser

 The result of parsing with the call below is much cleaner and simpler, because the XML structural details have been stripped away. The follow-up calculation is omitted here.

9> erlsom:simple_form(Xml).
{ok,{"shopping",[],
     [{"item",
       [{"price","2.50"},{"quantity","3"},{"name","bread"}],
       []},
      {"item",
       [{"price","3.50"},{"quantity","2"},{"name","milk"}],
       []}]},
    " "}
10>
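For completeness, here is one way the omitted calculation could look. It is a minimal sketch that works on the {Tag, Attributes, Children} tuple shown above; the module name simple_total is an assumption made for this example.

```erlang
-module(simple_total).
-export([total/1]).

%% Tree is the second element of the {ok, Tree, Rest} tuple returned by
%% erlsom:simple_form/1, i.e. {Tag, Attributes, Children}.
total({"shopping", _Attrs, Items}) ->
    lists:sum([item_cost(ItemAttrs) || {"item", ItemAttrs, _Children} <- Items]).

%% Attributes are {Name, Value} pairs with string values.
item_cost(Attrs) ->
    {_, Price} = lists:keyfind("price", 1, Attrs),
    {_, Quantity} = lists:keyfind("quantity", 1, Attrs),
    list_to_float(Price) * list_to_integer(Quantity).
```

In the shell this would be used as {ok, Tree, _} = erlsom:simple_form(Xml), simple_total:total(Tree).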

Data Binder

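    The idea is to design the XSD for the XML first and then use that XSD to connect every stage that touches the data model, for example by generating C# code and getting strongly typed objects directly. This approach is very common in .NET, and erlsom's data binder mode implements exactly this design method. The starting point is still an XSD file, so let's design one for the test2.xml above: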

 

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="shopping" type="shoppingType"/>
  <xsd:complexType name="shoppingType">
    <xsd:sequence>
      <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
        <xsd:complexType>
          <xsd:attribute name="name" type="xsd:string" use="required"/>
          <xsd:attribute name="quantity" type="xsd:positiveInteger" use="required"/>
          <xsd:attribute name="price" type="xsd:decimal" use="required"/>
        </xsd:complexType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

Then we generate the corresponding records from the XSD (saved as test.xsd). erlsom already provides a tool for this:

28> erlsom:write_xsd_hrl_file("test.xsd", "test.hrl").
ok

Open test.hrl and you will see that the corresponding records have been generated:

%% HRL file generated by ERLSOM
%%
%% It is possible to change the name of the record fields.
%%
%% It is possible to add default values, but be aware that these will
%% only be used when *writing* an xml document.

-record('shoppingType', {anyAttribs, 'item'}).
-record('shoppingType/item', {anyAttribs, 'name', 'quantity', 'price'}).

So that all the tests can be done inside the Erlang shell, whenever we need a record later on we use the rd() command to create the record definition in the shell.

The next step is to parse the document and map it to records:

Eshell V5.9.1  (abort with ^G)
1> {ok, X} = erlsom:compile_xsd_file("test.xsd").

=ERROR REPORT==== 20-Jul-2012::06:53:09 ===
Call to tuple fun {erlsom_parse,xml2StructCallback}.

Tuple funs are deprecated and will be removed in R16. Use "fun M:F/A" instead, for example "fun erlsom_parse:xml2StructCallback/2".

(This warning will only be shown the first time a tuple fun is called.)

{ok,{model,[{type,'_document',sequence,
                  [{el,[{alt,shopping,shoppingType,[],1,1,true,undefined}],
                       1,1,1}],
                  [],undefined,undefined,1,1,1,false,undefined},
            {type,shoppingType,sequence,
                  [{el,[{alt,item,'shoppingType/item',[],1,1,true,undefined}],
                       0,unbound,1}],
                  [],undefined,undefined,2,1,1,undefined,undefined},
            {type,'shoppingType/item',sequence,[],
                  [{att,name,1,false,char},
                   {att,quantity,2,false,char},
                   {att,price,3,false,char}],
                  undefined,undefined,4,1,1,undefined,undefined}],
           [{ns,"http://www.w3.org/2001/XMLSchema","xsd"}],
           undefined,[]}}
2> {ok, Xml} = file:read_file("test2.xml").
{ok,<<"ï»¿<?xml version=\"1.0\"?>\r\n<shopping> \r\n  <item name=\"bread\" quantity=\"3\" price=\"2.50\"/> \r\n  <item name=\"milk"...>>}
3> {ok, Result, _} = erlsom:scan(Xml, X).
{ok,{shoppingType,[],
                  [{'shoppingType/item',[],"bread","3","2.50"},
                   {'shoppingType/item',[],"milk","2","3.50"}]},
    " "}
4>

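    For XML that is not too complex, a result in this form is already very convenient to work with, and you could stop here and finish the calculation. For particularly complex XML, however, working with records is more flexible and easier to read, so let's walk through the rest of the flow: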

 

5> rd('shoppingType', {anyAttribs, 'item'}).
shoppingType
6> rd('shoppingType/item', {anyAttribs, 'name', 'quantity', 'price'}).
'shoppingType/item'
7> Result#shoppingType.'item'.
[#'shoppingType/item'{anyAttribs = [],name = "bread",
                      quantity = "3",price = "2.50"},
 #'shoppingType/item'{anyAttribs = [],name = "milk",
                      quantity = "2",price = "3.50"}]
8> hd(Result#shoppingType.'item').
#'shoppingType/item'{anyAttribs = [],name = "bread",
                     quantity = "3",price = "2.50"}
9> #'shoppingType/item'.quantity.
4
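To finish the calculation with records, something like the following could be used. It is a sketch under two assumptions: the record definitions are copied inline from the generated test.hrl so the module stands alone, and the module name record_total is made up for this example. The field values are strings, as in the scan result above, so they are converted before multiplying.

```erlang
-module(record_total).
-export([total/1]).

%% Same definitions as in the generated test.hrl, inlined to keep the
%% sketch self-contained.
-record('shoppingType', {anyAttribs, 'item'}).
-record('shoppingType/item', {anyAttribs, 'name', 'quantity', 'price'}).

%% Sum price * quantity over all items of a #'shoppingType'{} record.
total(#'shoppingType'{item = Items}) ->
    lists:sum([list_to_float(Price) * list_to_integer(Quantity)
               || #'shoppingType/item'{quantity = Quantity, price = Price} <- Items]).
```

With the value bound by erlsom:scan/2 in the session above, the call would be record_total:total(Result).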

Other options

[1] JSON. As a lightweight data-interchange format JSON has great advantages, and there are many Erlang solutions for it, such as ejson; mochiweb also ships a related module.

[2] Solutions typified by Google's Protocol Buffers and Facebook's Thrift.

[3] Piqi includes a data serialization system for Erlang. It can be used for serializing Erlang values in 4 different formats: Google Protocol Buffers, JSON, XML and Piq.

     

 

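Good night!

Finally, a screenshot from the 1996 Sing Ho (星河) version of <笑傲江湖>, a version that absolutely delighted me.

The 1983 射雕, the 1994 射雕, the 1995 神雕, the 1996 笑傲, the 1997 天龙八部: I never get tired of watching them.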


 


 

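Category: Erlang

Tags: erlang, xml, xmerl, erlsom

Author: Leo_wl

Source: http://www.cnblogs.com/Leo_wl/

Copyright of this article is shared by the author and cnblogs (博客园). Reposting is welcome, but unless the author agrees otherwise this statement must be kept and a link to the original article must be shown prominently on the page; otherwise the right to pursue legal liability is reserved.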
