好得很程序员自学网

<tfoot draggable='sEl'></tfoot>

elasticsearch索引index之Mapping实现关系结构示例

Mapping的实现关系结构

Lucene索引的一个特点就filed,索引以field组合。这一特点为索引和搜索提供了很大的灵活性。elasticsearch则在Lucene的基础上更近一步,它可以是 no scheme。实现这一功能的秘密就Mapping。Mapping是对索引各个字段的一种预设,包括索引与分词方式,是否存储等,数据根据字段名在Mapping中找到对应的配置,建立索引。这里将对Mapping的实现结构简单分析,Mapping的放置、更新、应用会在后面的索引fenx中进行说明。

这只是Mapping中的一部分内容。Mapping扩展了lucene的filed,定义了更多的field类型既有Lucene所拥有的string,number等字段又有date,IP,byte及geo的相关字段,这也是es的强大之处。如上图所示,可以分为两类,mapper与documentmapper,前者是所有mapper的父接口。而DocumentMapper则是Mapper的集合,它代表了一个索引的mapper定义。

Mapper的三类

第一类就是核心field结构FileMapper—>AbstractFieldMapper—>StringField这种核心数据类型,它代表了一类数据类型,如字符串类型,int类型这种;

第二类是Mapper—>ObjectMapper—>RootObjectMapper,object类型的Mapper,这也是elasticsearch对lucene的一大改进,不想lucene之支持基本数据类型;

最后一类是Mapper—>RootMapper—>IndexFieldMapper这种类型,只存在于根Mapper中的一种Mapper,如IdFieldMapper及图上的IndexFieldMapper,它们类似于index的元数据,只可能存在于某个index内部。

parse方法

Mapper中一个比较重要的方法就是parse(ParseContext context),Mapper的子类对这个方法都有各自的实现。它的主要功能是通过解析ParseContext获取到对应的field,这个方法主要用于建立索引时。索引数据被继续成parsecontext,每个field解析parseContext构建对应的lucene Field。它在AbstractFieldMapper中的实现如下所示:

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

public void parse(ParseContext context) throws IOException {

         final List&lt;Field&gt; fields = new ArrayList&lt;&gt;( 2 );

         try {

             parseCreateField(context, fields); //实际Filed解析方法

             for (Field field : fields) {

                 if (!customBoost()) { //设置boost

                     field.setBoost(boost);

                 }

                 if (context.listener().beforeFieldAdded( this , field, context)) {

                     context.doc().add(field); //将解析完成的Field加入到context中

                 }

             }

         } catch (Exception e) {

             throw new MapperParsingException( "failed to parse [" + names.fullName() + "]" , e);

         }

         multiFields.parse( this , context); //进行mutiFields解析,MultiFields作用是对同一个field做不同的定义,如可以进行不同分词方式的索引这样便于通过各种方式查询

         if (copyTo != null ) {

             copyTo.parse(context);

         }

     }

这里的parseCreateField是一个抽象方法,每种数据类型都有自己的实现,如string的实现方式如下所示:

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

protected void parseCreateField(ParseContext context, List&lt;Field&gt; fields) throws IOException {

         ValueAndBoost valueAndBoost = parseCreateFieldForString(context, nullValue, boost); //解析成值和boost

         if (valueAndBoost.value() == null ) {

             return ;

         }

         if (ignoreAbove &gt; 0 &amp;&amp; valueAndBoost.value().length() &gt; ignoreAbove) {

             return ;

         }

         if (context.includeInAll(includeInAll, this )) {

             context.allEntries().addText(names.fullName(), valueAndBoost.value(), valueAndBoost.boost());

         }

         if (fieldType.indexed() || fieldType.stored()) { //构建LuceneField

             Field field = new Field(names.indexName(), valueAndBoost.value(), fieldType);

             field.setBoost(valueAndBoost.boost());

             fields.add(field);

         }

         if (hasDocValues()) {

             fields.add( new SortedSetDocValuesField(names.indexName(), new BytesRef(valueAndBoost.value())));

         }

         if (fields.isEmpty()) {

             context.ignoredValue(names.indexName(), valueAndBoost.value());

         }

     }

//解析出字段的值和boost

     public static ValueAndBoost parseCreateFieldForString(ParseContext context, String nullValue, float defaultBoost) throws IOException {

         if (context.externalValueSet()) {

             return new ValueAndBoost((String) context.externalValue(), defaultBoost);

         }

         XContentParser parser = context.parser();

         if (parser.currentToken() == XContentParser.Token.VALUE_NULL) {

             return new ValueAndBoost(nullValue, defaultBoost);

         }

         if (parser.currentToken() == XContentParser.Token.START_OBJECT) {

             XContentParser.Token token;

             String currentFieldName = null ;

             String value = nullValue;

             float boost = defaultBoost;

             while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {

                 if (token == XContentParser.Token.FIELD_NAME) {

                     currentFieldName = parser.currentName();

                 } else {

                     if ( "value" .equals(currentFieldName) || "_value" .equals(currentFieldName)) {

                         value = parser.textOrNull();

                     } else if ( "boost" .equals(currentFieldName) || "_boost" .equals(currentFieldName)) {

                         boost = parser.floatValue();

                     } else {

                         throw new ElasticsearchIllegalArgumentException( "unknown property [" + currentFieldName + "]" );

                     }

                 }

             }

             return new ValueAndBoost(value, boost);

         }

         return new ValueAndBoost(parser.textOrNull(), defaultBoost);

     }

以上就是Mapper如何将一个值解析成对应的Field的过程,这里只是简单介绍,后面会有详细分析。

部分Field

DocumentMapper是一个索引所有Mapper的集合,它表述了一个索引所有field的定义,可以说是lucene的Document的定义,同时它还包含以下index的默认值,如index和search时默认分词器。它的部分Field如下所示:

?

1

2

3

4

5

6

7

8

9

10

private final DocumentMapperParser docMapperParser;

     private volatile ImmutableMap&lt;String, Object&gt; meta;

     private volatile CompressedString mappingSource;

     private final RootObjectMapper rootObjectMapper;

     private final ImmutableMap&lt;Class&lt;? extends RootMapper&gt;, RootMapper&gt; rootMappers;

     private final RootMapper[] rootMappersOrdered;

     private final RootMapper[] rootMappersNotIncludedInObject;

     private final NamedAnalyzer indexAnalyzer;

     private final NamedAnalyzer searchAnalyzer;

     private final NamedAnalyzer searchQuoteAnalyzer;

DocumentMapper的功能也体现在parse方法上,它的作用是解析整条数据。之前在Mapper中看到了Field是如何解析出来的,那其实是在DocumentMapper解析之后。index请求发过来的整条数据在这里被解析出Field,查找Mapping中对应的Field设置,交给它去解析。如果没有且运行动态添加,es则会根据值自动创建一个Field同时更新Mapping。方法代码如下所示:

?

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

public ParsedDocument parse(SourceToParse source, @Nullable ParseListener listener) throws MapperParsingException {

         ParseContext.InternalParseContext context = cache.get();

         if (source.type() != null &amp;&amp; !source.type().equals( this .type)) {

             throw new MapperParsingException( "Type mismatch, provide type [" + source.type() + "] but mapper is of type [" + this .type + "]" );

         }

         source.type( this .type);

         XContentParser parser = source.parser();

         try {

             if (parser == null ) {

                 parser = XContentHelper.createParser(source.source());

             }

             if (sourceTransforms != null ) {

                 parser = transform(parser);

             }

             context.reset(parser, new ParseContext.Document(), source, listener);

             // will result in START_OBJECT

             int countDownTokens = 0 ;

             XContentParser.Token token = parser.nextToken();

             if (token != XContentParser.Token.START_OBJECT) {

                 throw new MapperParsingException( "Malformed content, must start with an object" );

             }

             boolean emptyDoc = false ;

             token = parser.nextToken();

             if (token == XContentParser.Token.END_OBJECT) {

                 // empty doc, we can handle it...

                 emptyDoc = true ;

             } else if (token != XContentParser.Token.FIELD_NAME) {

                 throw new MapperParsingException( "Malformed content, after first object, either the type field or the actual properties should exist" );

             }

             // first field is the same as the type, this might be because the

             // type is provided, and the object exists within it or because

             // there is a valid field that by chance is named as the type.

             // Because of this, by default wrapping a document in a type is

             // disabled, but can be enabled by setting

             // index.mapping.allow_type_wrapper to true

             if (type.equals(parser.currentName()) &amp;&amp; indexSettings.getAsBoolean(ALLOW_TYPE_WRAPPER, false )) {

                 parser.nextToken();

                 countDownTokens++;

             }

             for (RootMapper rootMapper : rootMappersOrdered) {

                 rootMapper.preParse(context);

             }

             if (!emptyDoc) {

                 rootObjectMapper.parse(context);

             }

             for ( int i = 0 ; i &lt; countDownTokens; i++) {

                 parser.nextToken();

             }

             for (RootMapper rootMapper : rootMappersOrdered) {

                 rootMapper.postParse(context);

             }

         } catch (Throwable e) {

             // if its already a mapper parsing exception, no need to wrap it...

             if (e instanceof MapperParsingException) {

                 throw (MapperParsingException) e;

             }

             // Throw a more meaningful message if the document is empty.

             if (source.source() != null &amp;&amp; source.source().length() == 0 ) {

                 throw new MapperParsingException( "failed to parse, document is empty" );

             }

             throw new MapperParsingException( "failed to parse" , e);

         } finally {

             // only close the parser when its not provided externally

             if (source.parser() == null &amp;&amp; parser != null ) {

                 parser.close();

             }

         }

         // reverse the order of docs for nested docs support, parent should be last

         if (context.docs().size() &gt; 1 ) {

             Collections.reverse(context.docs());

         }

         // apply doc boost

         if (context.docBoost() != 1 .0f) {

             Set&lt;String&gt; encounteredFields = Sets.newHashSet();

             for (ParseContext.Document doc : context.docs()) {

                 encounteredFields.clear();

                 for (IndexableField field : doc) {

                     if (field.fieldType().indexed() &amp;&amp; !field.fieldType().omitNorms()) {

                         if (!encounteredFields.contains(field.name())) {

                             ((Field) field).setBoost(context.docBoost() * field.boost());

                             encounteredFields.add(field.name());

                         }

                     }

                 }

             }

         }

         ParsedDocument doc = new ParsedDocument(context.uid(), context.version(), context.id(), context.type(), source.routing(), source.timestamp(), source.ttl(), context.docs(), context.analyzer(),

                 context.source(), context.mappingsModified()).parent(source.parent());

         // reset the context to free up memory

         context.reset( null , null , null , null );

         return doc;

     }

将整条数据解析成ParsedDocument,解析后的数据才能进行后面的Field解析建立索引。

总结

以上就是Mapping的结构和相关功能概括,Mapper赋予了elasticsearch索引的更强大功能,使得索引和搜索可以支持更多数据类型,灵活性更高,更多关于elasticsearch索引index Mapping关系结构的资料请关注其它相关文章!

原文链接:https://www.cnblogs.com/zziawanblog/p/6828490.html

查看更多关于elasticsearch索引index之Mapping实现关系结构示例的详细内容...

  阅读:16次