Getting Started with Elasticsearch

October 1, 2017

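Elasticsearch exposes a REST API, which makes it very convenient to use. In this setup Elasticsearch is installed on CentOS and configured for intranet access only (it is best not to expose Elasticsearch to the public internet), so all requests are made with curl. This article does not cover Elasticsearch clusters.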

Basic Concepts

Concepts such as index, type, and document can be confusing when met on their own; they are easier to understand when compared with MySQL:

Elasticsearch    MySQL
Index            Database
Type             Table
Document         Row
Field            Column
Mapping          Schema

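Types will be removed in a future version; see "Indices, types, and parent / child: current status and upcoming changes in Elasticsearch". For that reason it is more apt to think of an Index as a Table, and each index should ideally contain only one type.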

Creating an Index

Create an index named specs with a type named spec, and define its mapping (index names must be lowercase).

curl -X PUT 'localhost:9200/specs' -d '
{
  "mappings": {
    "spec": {
      "properties": {
        "name": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "factory": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        }
      }
    }
  }
}'

Result:

{"acknowledged":true,"shards_acknowledged":true,"index":"specs"}

View the index mapping:

curl 'localhost:9200/_mapping?pretty=true'

Result:

{
    "specs": {
        "mappings": {
            "spec": {
                "properties": {
                    "factory": {
                        "type": "text",
                        "analyzer": "ik_max_word"
                    },
                    "name": {
                        "type": "text",
                        "analyzer": "ik_max_word"
                    }
                }
            }
        }
    }
}

Delete the index:

curl -X DELETE 'localhost:9200/specs'
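
Before creating an index, or after deleting one, it can be useful to check whether it exists. A minimal sketch, assuming a HEAD request on the index name (which returns 200 when the index exists and 404 when it does not); ES_HOST and index_exists are illustrative names, not part of the original commands:

```shell
# ES_HOST is an assumed variable; it defaults to the address used in this article.
ES_HOST="${ES_HOST:-localhost:9200}"

# index_exists NAME: exit status 0 if the index exists, non-zero otherwise.
# -I sends a HEAD request, -o /dev/null discards the response headers,
# and -w '%{http_code}' prints only the HTTP status code.
index_exists() {
  status=$(curl -s -o /dev/null -w '%{http_code}' -I "$ES_HOST/$1")
  [ "$status" = "200" ]
}

# Example call (requires a running Elasticsearch node):
#   index_exists specs && echo "specs exists" || echo "specs is missing"
```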

Working with Data

Indexing a Document

curl -X PUT 'localhost:9200/specs/spec/1?pretty=true' -d '
{
    "factory": "一汽-大众奥迪",
    "name": "奥迪A6L 2018款 30周年年型 TFSI 进取型"
}'

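Here specs is the index, spec is the type, and 1 is the document ID. The ID does not have to be a number; it can be any string. If no ID is specified, Elasticsearch generates one automatically; in that case the request is sent with POST to specs/spec instead of PUT.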

Result:

{
    "_index": "specs",
    "_type": "spec",
    "_id": "1",
    "_version": 1,
    "result": "created",
    "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
    },
    "created": true
}
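
As noted above, the ID is optional. A minimal sketch of letting Elasticsearch generate it, assuming the request is sent with POST to specs/spec (the form the index API documents for auto-generated IDs); ES_HOST and index_spec_auto_id are illustrative names:

```shell
# ES_HOST is an assumed variable; it defaults to the address used in this article.
ES_HOST="${ES_HOST:-localhost:9200}"

# index_spec_auto_id JSON_BODY: POST the document without an ID.
# Elasticsearch generates the ID and reports it in the "_id" field of the response.
# The Content-Type header is required from Elasticsearch 6.0 on and harmless before.
index_spec_auto_id() {
  curl -X POST "$ES_HOST/specs/spec?pretty=true" \
    -H 'Content-Type: application/json' \
    -d "$1"
}

# Example call (requires a running Elasticsearch node; <factory> and <name>
# are placeholders to fill in):
#   index_spec_auto_id '{"factory": "<factory>", "name": "<name>"}'
```

Documents added this way are extra to the three used in the rest of this article, so the hit counts shown later would change accordingly.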

Insert two more documents (factory, name):

  • ("上汽大众", "途观L 2017款 330TSI 自动两驱豪华版")
  • ("一汽-大众奥迪", "奥迪Q5 2017款 Plus 40 TFSI 豪华型")
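
The commands for these two documents are not shown. A minimal sketch, assuming they were indexed under IDs 2 and 3 (the IDs that appear in the search results further down); ES_HOST, spec_body and index_spec are illustrative helper names:

```shell
# ES_HOST is an assumed variable; it defaults to the address used in this article.
ES_HOST="${ES_HOST:-localhost:9200}"

# spec_body FACTORY NAME: print the JSON document for one car model.
# Assumes the values contain no double quotes or backslashes.
spec_body() {
  printf '{"factory": "%s", "name": "%s"}' "$1" "$2"
}

# index_spec ID FACTORY NAME: PUT the document into specs/spec under the given ID.
# The Content-Type header is required from Elasticsearch 6.0 on and harmless before.
index_spec() {
  curl -X PUT "$ES_HOST/specs/spec/$1?pretty=true" \
    -H 'Content-Type: application/json' \
    -d "$(spec_body "$2" "$3")"
}

# Example calls (require a running Elasticsearch node):
#   index_spec 2 "上汽大众" "途观L 2017款 330TSI 自动两驱豪华版"
#   index_spec 3 "一汽-大众奥迪" "奥迪Q5 2017款 Plus 40 TFSI 豪华型"
```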

Retrieving a Document by ID

curl 'localhost:9200/specs/spec/1?pretty=true'

Result:

{
  "_index" : "specs",
  "_type" : "spec",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "factory" : "一汽-大众奥迪",
    "name" : "奥迪A6L 2018款 30周年年型 TFSI 进取型"
  }
}

Search

Query all car models:

curl 'localhost:9200/specs/spec/_search?pretty=true'

Result:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "factory" : "上汽大众",
          "name" : "途观L 2017款 330TSI 自动两驱豪华版"
        }
      },
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "factory" : "一汽-大众奥迪",
          "name" : "奥迪A6L 2018款 30周年年型 TFSI 进取型"
        }
      },
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "factory" : "一汽-大众奥迪",
          "name" : "奥迪Q5 2017款 Plus 40 TFSI 豪华型"
        }
      }
    ]
  }
}

Query Expressions

curl -XGET 'localhost:9200/specs/spec/_search?pretty=true' -d '
{
    "query" : {
        "match" : {
            "factory" : "奥迪"
        }
    }
}'

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.2824934,
    "hits" : [
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "1",
        "_score" : 0.2824934,
        "_source" : {
          "factory" : "一汽-大众奥迪",
          "name" : "奥迪A6L 2018款 30周年年型 TFSI 进取型"
        }
      },
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "3",
        "_score" : 0.2824934,
        "_source" : {
          "factory" : "一汽-大众奥迪",
          "name" : "奥迪Q5 2017款 Plus 40 TFSI 豪华型"
        }
      }
    ]
  }
}

Full-Text Search

curl -XGET 'localhost:9200/specs/spec/_search?pretty=true' -d '
{
    "query" : {
        "match" : {
            "factory" : "上汽大众"
        }
    }
}'

Result:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.51623213,
    "hits" : [
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "2",
        "_score" : 0.51623213,
        "_source" : {
          "factory" : "上汽大众",
          "name" : "途观L 2017款 330TSI 自动两驱豪华版"
        }
      },
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "1",
        "_score" : 0.2824934,
        "_source" : {
          "factory" : "一汽-大众奥迪",
          "name" : "奥迪A6L 2018款 30周年年型 TFSI 进取型"
        }
      },
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "3",
        "_score" : 0.2824934,
        "_source" : {
          "factory" : "一汽-大众奥迪",
          "name" : "奥迪Q5 2017款 Plus 40 TFSI 豪华型"
        }
      }
    ]
  }
}

As the results show, searching for a factory of "上汽大众" returns 3 matching documents.

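By default Elasticsearch sorts results by how well each document matches, and _score reflects that relevance. The first document's factory is exactly "上汽大众", so it scores highest; the other two documents match only the term "大众" and therefore score lower.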
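
To see why the last two documents match at all, it can help to inspect how the analyzer splits the query text. A minimal sketch using the _analyze API, assuming the IK plugin's ik_max_word analyzer from the mapping defined earlier; ES_HOST, analyze_body and analyze_text are illustrative names. The token list depends on the IK dictionary in use, so no output is shown here:

```shell
# ES_HOST is an assumed variable; it defaults to the address used in this article.
ES_HOST="${ES_HOST:-localhost:9200}"

# analyze_body TEXT: request body asking ik_max_word to tokenize TEXT.
# Assumes TEXT contains no double quotes or backslashes.
analyze_body() {
  printf '{"analyzer": "ik_max_word", "text": "%s"}' "$1"
}

# analyze_text TEXT: send the body to the _analyze endpoint of the specs index.
# The response lists each token with its position and offsets.
analyze_text() {
  curl -X GET "$ES_HOST/specs/_analyze?pretty=true" \
    -H 'Content-Type: application/json' \
    -d "$(analyze_body "$1")"
}

# Example calls (require a running Elasticsearch node with the IK plugin):
#   analyze_text "上汽大众"
#   analyze_text "一汽-大众奥迪"
```

Any token that appears in both outputs is a term on which a match query for the first text can hit a document containing the second.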

Phrase Matching

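How can we match only documents that contain "上汽大众" as a whole phrase, rather than documents that contain any one of its analyzed terms? Use match_phrase: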

curl -XGET 'localhost:9200/specs/spec/_search?pretty=true' -d '
{
    "query" : {
        "match_phrase" : {
            "factory" : "上汽大众"
        }
    }
}'

Result:

{
  "took" : 8,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.51623213,
    "hits" : [
      {
        "_index" : "specs",
        "_type" : "spec",
        "_id" : "2",
        "_score" : 0.51623213,
        "_source" : {
          "factory" : "上汽大众",
          "name" : "途观L 2017款 330TSI 自动两驱豪华版"
        }
      }
    ]
  }
}

The result contains only the document whose factory is "上汽大众".
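
The two queries differ only in the query type, so they are easy to compare side by side. A minimal sketch, with ES_HOST, factory_query_body and search_factory as illustrative helper names:

```shell
# ES_HOST is an assumed variable; it defaults to the address used in this article.
ES_HOST="${ES_HOST:-localhost:9200}"

# factory_query_body QUERY_TYPE TEXT: build a query on the factory field.
# QUERY_TYPE is expected to be match or match_phrase.
# Assumes TEXT contains no double quotes or backslashes.
factory_query_body() {
  printf '{"query": {"%s": {"factory": "%s"}}}' "$1" "$2"
}

# search_factory QUERY_TYPE TEXT: run the query against specs/spec.
# The Content-Type header is required from Elasticsearch 6.0 on and harmless before.
search_factory() {
  curl -X GET "$ES_HOST/specs/spec/_search?pretty=true" \
    -H 'Content-Type: application/json' \
    -d "$(factory_query_body "$1" "$2")"
}

# Example calls (require a running Elasticsearch node):
#   search_factory match "上汽大众"          # 3 hits with this article's data
#   search_factory match_phrase "上汽大众"   # 1 hit with this article's data
```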
