
Using elasticsearch as a search engine in Python

2018-07-20 Source: open-open

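I have long wanted a fast full-text search tool. The candidates I have found so far are Sphinx, xapian, Lucene, solr, elasticsearch, whoosh, and hyper estraier. I was never keen on the Java-based ones, because they are memory hogs I can hardly afford. I tried sphinx, xapian, and hyper estraier. xapian has too little documentation; hyper estraier is fairly simple, but its documentation is thin as well. sphinx does have a Chinese-localized fork, coreseek, and its documentation mentions that sphinx supports unigram (single-character) segmentation, but the results I got by following the query examples were not what I wanted, and I am not sure whether my query syntax was wrong. On top of that, I was testing on Windows with Python 2.7, which cannot be used with coreseek directly; it would probably need to be recompiled.

Then I came across elasticsearch, and it blew me away. It can be driven directly with RESTful JSON, and there are ready-made client libraries such as pyes and pyelasticsearch. elasticsearch also supports distributed deployment, so scaling out is easy. Because it is written in Java, cross-platform support is no problem either. For a default single-node trial no configuration changes are needed; just run bin/elasticsearch.bat.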
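To make the "RESTful JSON" point concrete, here is a minimal sketch of what a search request looks like without any client library. It only builds the request and does not send it, so it runs without a server. It assumes Python 3's urllib.request (the rest of this post targets Python 2.7), the default single-node address 127.0.0.1:9200, and the index and field names used in the example further down; build_search_request is a hypothetical helper name, not part of any library.

```python
import json
from urllib.request import Request

ES_URL = "http://127.0.0.1:9200"  # assumption: default local single-node address


def build_search_request(index, field, term):
    """Build (but do not send) a POST to the _search endpoint with a term query."""
    body = {"query": {"term": {field: term}}}
    return Request(
        url="{0}/{1}/_search".format(ES_URL, index),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request("test-index", "name", "bill")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would send it to a running node; client libraries such as pyes wrap this kind of call.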

Installing pyes

pip install pyes

Usage example

# -*- coding: utf-8 -*-

import pyes

conn = pyes.ES(['127.0.0.1:9200'])  # connect to Elasticsearch

conn.create_index('test-index')  # create a new index

# Define the field mapping (how each field is indexed and stored)
mapping = {
    u'parsedtext': {'boost': 1.0,
                    'index': 'analyzed',
                    'store': 'yes',
                    'type': u'string',
                    'term_vector': 'with_positions_offsets'},
    u'name': {'boost': 1.0,
              'index': 'analyzed',
              'store': 'yes',
              'type': u'string',
              'term_vector': 'with_positions_offsets'},
    u'title': {'boost': 1.0,
               'index': 'analyzed',
               'store': 'yes',
               'type': u'string',
               'term_vector': 'with_positions_offsets'},
    u'position': {'store': 'yes',
                  'type': u'integer'},
    u'uuid': {'boost': 1.0,
              'index': 'not_analyzed',
              'store': 'yes',
              'type': u'string'},
}

# Define the document type test-type with the mapping above
conn.put_mapping("test-type", {'properties': mapping}, ["test-index"])
# Define test-type2 as a child type whose _parent is test-type
conn.put_mapping("test-type2", {"_parent": {"type": "test-type"}}, ["test-index"])
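The three analyzed text fields in the mapping share identical settings, so the same dictionary could also be built with a small helper. This is only a sketch in plain Python; analyzed_text_field and build_mapping are hypothetical names introduced here, not part of pyes, and the field settings are copied from the mapping in this post.

```python
def analyzed_text_field():
    """Settings shared by the analyzed, stored string fields."""
    return {
        "boost": 1.0,
        "index": "analyzed",
        "store": "yes",
        "type": "string",
        "term_vector": "with_positions_offsets",
    }


def build_mapping():
    """Return the same properties dictionary as the hand-written mapping."""
    # Each field gets its own copy, so later edits to one do not leak into another.
    mapping = {name: analyzed_text_field()
               for name in ("parsedtext", "name", "title")}
    mapping["position"] = {"store": "yes", "type": "integer"}
    mapping["uuid"] = {"boost": 1.0, "index": "not_analyzed",
                       "store": "yes", "type": "string"}
    return mapping


mapping = build_mapping()
print(sorted(mapping))  # ['name', 'parsedtext', 'position', 'title', 'uuid']
```

The result can be passed to put_mapping in the same way as the literal dictionary, as {'properties': mapping}.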

# Index some documents. The arguments to conn.index() are:
#   {"name": "Joe Tester", ...}: the document body
#   "test-index": the index name
#   "test-type": the document type
#   1: the document id (optional; one is generated automatically if omitted)
conn.index({"name":"Joe Tester", "parsedtext":"Joe Testere nice guy", "uuid":"11111", "position":1}, "test-index", "test-type", 1)

conn.index({"name":"data1", "value":"value1"}, "test-index", "test-type2", 1, parent=1)
conn.index({"name":"Bill Baloney", "parsedtext":"Bill Testere nice guy", "uuid":"22222", "position":2}, "test-index", "test-type", 2)
conn.index({"name":"data2", "value":"value2"}, "test-index", "test-type2", 2, parent=2)
# Separating the characters with spaces is roughly a unigram segmentation for Chinese, I suppose
conn.index({"name":u"百 度 中 國"}, "test-index", "test-type")
conn.index({"name":u"百 中 度"}, "test-index", "test-type")

conn.default_indices = ["test-index"]  # set the default index for searches
conn.refresh()  # refresh so the newly indexed documents become searchable

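# Find documents whose name contains the term "bill".
# A term query is not analyzed, so the term is given in lowercase,
# which is how the analyzed name field stores "Bill".
q = pyes.TermQuery("name", "bill")
results = conn.search(q)

for r in results:
    print(r)

# Find documents whose name contains 百度.
# The terms of a query string are combined with OR by default,
# so this matches documents containing either character.
q = pyes.StringQuery(u"百 度", 'name')
results = conn.search(q)

for r in results:
    print(r)

# Find documents whose name contains 百度 or 中度
q = pyes.StringQuery(u"百 度 OR 中 度", 'name')
results = conn.search(q)

for r in results:
    print(r)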
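In the example above the spaces between Chinese characters were typed by hand, both in the indexed documents and in the query strings. A small helper can do the same thing automatically. This is a minimal sketch, assuming that "Chinese character" means the CJK Unified Ideographs block (U+4E00 to U+9FFF) only; is_cjk, unigram_split, and or_query are hypothetical helper names, not part of pyes.

```python
# -*- coding: utf-8 -*-


def is_cjk(ch):
    """True for characters in the CJK Unified Ideographs block (assumed range)."""
    return u"\u4e00" <= ch <= u"\u9fff"


def unigram_split(text):
    """Separate every CJK character with spaces; whitespace is normalized."""
    pieces = []
    for ch in text:
        if is_cjk(ch):
            pieces.append(u" " + ch + u" ")
        else:
            pieces.append(ch)
    # split() and join() collapse the doubled spaces between adjacent CJK characters.
    return u" ".join(u"".join(pieces).split())


def or_query(phrases):
    """Join several unigram-split phrases with OR for use in a query string."""
    return u" OR ".join(unigram_split(p) for p in phrases)


print(unigram_split(u"百度中國"))    # 百 度 中 國
print(or_query([u"百度", u"中度"]))  # 百 度 OR 中 度
```

The output of unigram_split can be used as the name value passed to conn.index, and the output of or_query as the first argument of pyes.StringQuery.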

