annotate MoinMoin/search/builtin.py @ 6101:316986758258

remove MoinMoin.support.difflib
author Thomas Waldmann <tw AT waldmann-edv DOT de>
date Tue, 06 Sep 2016 00:21:08 +0200
parents fc11712e0df0
children
rev   line source
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
1 # -*- coding: iso-8859-1 -*-
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
2 """
1497
ed3845759431 update comments/docstrings
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1496
diff changeset
3 MoinMoin - search engine internals
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
4
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
5 @copyright: 2005 MoinMoin:FlorianFesti,
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
6 2005 MoinMoin:NirSoffer,
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
7 2005 MoinMoin:AlexanderSchremmer,
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
8 2006-2009 MoinMoin:ThomasWaldmann,
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
9 2006 MoinMoin:FranzPletz
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
10 @license: GNU GPL, see COPYING for details
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
11 """
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
12
3443
4ee64d58e801 move platform dependent filesystem routines to util.filesys
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 3442
diff changeset
13 import sys, os, time, errno, codecs
3162
153681321f8c logging: use module-level logger for MoinMoin.search.*
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 3104
diff changeset
14
153681321f8c logging: use module-level logger for MoinMoin.search.*
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 3104
diff changeset
15 from MoinMoin import log
153681321f8c logging: use module-level logger for MoinMoin.search.*
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 3104
diff changeset
16 logging = log.getLogger(__name__)
1792
c907c2942372 Eclipse PyDev Check: fixed lots of its errors and warnings
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1791
diff changeset
17
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
18 from MoinMoin import wikiutil, config, caching
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
19 from MoinMoin.Page import Page
5021
fb0ddee3f5ff Xapian2009: queryparser was split to queryparser.__init__ and quryparser.expressions.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 5019
diff changeset
20 from MoinMoin.search.results import getSearchResults, Match, TextMatch, TitleMatch
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
21
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
22 ##############################################################################
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
23 # Search Engine Abstraction
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
24 ##############################################################################
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
25
4981
f1d1d8105d52 Xapian2009: pep8 fixes.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4972
diff changeset
26
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
27 class IndexerQueue(object):
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
28 """
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
29 Represents a locked on-disk queue with jobs for the xapian indexer
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
30
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
31 Each job is a tuple like: (PAGENAME, ATTACHMENTNAME, REVNO)
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
32 PAGENAME: page name (unicode)
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
33 ATTACHMENTNAME: attachment name (unicode) or None (for pages)
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
34 REVNO: revision number (int) - meaning "look at that revision",
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
35 or None - meaning "look at all revisions"
1979
79189058f117 search: add comment about possible refactoring
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1920
diff changeset
36 """
1497
ed3845759431 update comments/docstrings
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1496
diff changeset
37
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
38 def __init__(self, request, xapian_dir, queuename, timeout=10.0):
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
39 """
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
40 @param request: request object
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
41 @param xapian_dir: the xapian main directory
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
42 @param queuename: name of the queue (used for caching key)
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
43 @param timeout: lock acquire timeout
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
44 """
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
45 self.request = request
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
46 self.xapian_dir = xapian_dir
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
47 self.queuename = queuename
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
48 self.timeout = timeout
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
49
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
50 def get_cache(self, locking):
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
51 return caching.CacheEntry(self.request, self.xapian_dir, self.queuename,
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
52 scope='dir', use_pickle=True, do_locking=locking)
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
53
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
54 def _queue(self, cache):
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
55 try:
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
56 queue = cache.content()
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
57 except caching.CacheError:
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
58 # likely nothing there yet
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
59 queue = []
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
60 return queue
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
61
5750
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
62 def mput(self, entries):
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
63 """ Put multiple entries into the queue (append at end)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
64
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
65 @param entries: list of tuples (pagename, attachmentname, revno)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
66 pagename: page name [unicode]
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
67 attachmentname: attachment name [unicode or None]
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
68 revno: revision number (int) or None (all revs)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
69 """
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
70 cache = self.get_cache(locking=False) # we lock manually
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
71 cache.lock('w', 60.0)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
72 try:
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
73 queue = self._queue(cache)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
74 queue.extend(entries)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
75 cache.update(queue)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
76 finally:
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
77 cache.unlock()
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
78
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
79 def put(self, pagename, attachmentname=None, revno=None):
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
80 """ Put an entry into the queue (append at end)
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
81
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
82 @param pagename: page name [unicode]
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
83 @param attachmentname: attachment name [unicode or None]
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
84 @param revno: revision number (int) or None (all revs)
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
85 """
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
86 cache = self.get_cache(locking=False) # we lock manually
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
87 cache.lock('w', 60.0)
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
88 try:
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
89 queue = self._queue(cache)
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
90 entry = (pagename, attachmentname, revno)
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
91 queue.append(entry)
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
92 cache.update(queue)
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
93 finally:
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
94 cache.unlock()
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
95
5750
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
96 def mget(self, count):
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
97 """ Get (and remove) first <count> entries from the queue
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
98
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
99 Returns fewer than <count> entries (possibly an empty list) if the queue holds fewer; unlike get(), this does not raise IndexError.
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
100 """
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
101 cache = self.get_cache(locking=False) # we lock manually
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
102 cache.lock('w', 60.0)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
103 try:
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
104 queue = self._queue(cache)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
105 entries = queue[:count]
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
106 queue = queue[count:]
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
107 cache.update(queue)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
108 finally:
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
109 cache.unlock()
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
110 return entries
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
111
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
112 def get(self):
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
113 """ Get (and remove) first entry from the queue
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
114
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
115 Raises IndexError if queue was empty when calling get().
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
116 """
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
117 cache = self.get_cache(locking=False) # we lock manually
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
118 cache.lock('w', 60.0)
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
119 try:
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
120 queue = self._queue(cache)
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
121 entry = queue.pop(0)
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
122 cache.update(queue)
5053
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
123 finally:
508135789e41 UpdateQueue: use caching for on-disk storage of the page queue
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5044
diff changeset
124 cache.unlock()
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
125 return entry
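
The queue behaviour described above (append at the end, remove from the front, all under a write lock) can be sketched with an in-memory stand-in. This is only an illustration: `FakeCache` and `SketchQueue` are hypothetical names, and the real class persists its list through `caching.CacheEntry` rather than holding it in memory.

```python
import threading


class FakeCache(object):
    """Hypothetical in-memory stand-in for the on-disk cache entry."""

    def __init__(self):
        self._items = None  # None means "nothing stored yet"
        self._lock = threading.Lock()

    def lock(self):
        self._lock.acquire()

    def unlock(self):
        self._lock.release()

    def content(self):
        if self._items is None:
            raise KeyError("nothing stored yet")
        return list(self._items)

    def update(self, items):
        self._items = list(items)


class SketchQueue(object):
    """FIFO queue of (pagename, attachmentname, revno) jobs."""

    def __init__(self):
        self.cache = FakeCache()

    def _queue(self):
        try:
            return self.cache.content()
        except KeyError:
            return []  # nothing stored yet: treat as an empty queue

    def mput(self, entries):
        self.cache.lock()
        try:
            queue = self._queue()
            queue.extend(entries)
            self.cache.update(queue)
        finally:
            self.cache.unlock()

    def put(self, pagename, attachmentname=None, revno=None):
        self.mput([(pagename, attachmentname, revno)])

    def mget(self, count):
        self.cache.lock()
        try:
            queue = self._queue()
            # Slicing never raises, so a short or empty list is possible.
            entries, rest = queue[:count], queue[count:]
            self.cache.update(rest)
        finally:
            self.cache.unlock()
        return entries

    def get(self):
        self.cache.lock()
        try:
            queue = self._queue()
            entry = queue.pop(0)  # raises IndexError on an empty queue
            self.cache.update(queue)
        finally:
            self.cache.unlock()
        return entry
```

The `try`/`finally` around every read-modify-write keeps the lock from leaking when `get()` fails on an empty queue.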
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
126
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
127
5055
ce6ae8b5d9bd MoinMoin.search: use new style classes
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5054
diff changeset
128 class BaseIndex(object):
1466
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
129 """ Represents a search engine index """
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
130
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
131 def __init__(self, request):
1466
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
132 """
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
133 @param request: current request
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
134 """
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
135 self.request = request
5282
bee5567d7084 xapian: remove assumption that xapian db is a directory
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5281
diff changeset
136 self.main_dir = self._main_dir()
bee5567d7084 xapian: remove assumption that xapian db is a directory
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5281
diff changeset
137 if not os.path.exists(self.main_dir):
bee5567d7084 xapian: remove assumption that xapian db is a directory
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5281
diff changeset
138 os.makedirs(self.main_dir)
bee5567d7084 xapian: remove assumption that xapian db is a directory
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5281
diff changeset
139 self.update_queue = IndexerQueue(request, self.main_dir, 'indexer-queue')
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
140
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
141 def _main_dir(self):
1211
d028d37e7105 raise NotImplemented instance
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1206
diff changeset
142 raise NotImplementedError('...')
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
143
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
144 def exists(self):
1496
70e94a679c47 cleanup whitespace, add/fix comments
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1494
diff changeset
145 """ Check if index exists """
5282
bee5567d7084 xapian: remove assumption that xapian db is a directory
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5281
diff changeset
146 raise NotImplementedError('...')
1496
70e94a679c47 cleanup whitespace, add/fix comments
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1494
diff changeset
147
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
148 def mtime(self):
1466
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
149 """ Modification time of the index """
5282
bee5567d7084 xapian: remove assumption that xapian db is a directory
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5281
diff changeset
150 raise NotImplementedError('...')
1205
73f576c4bca3 fix multiconfig merge and more informative SystemInfo macro
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1197
diff changeset
151
73f576c4bca3 fix multiconfig merge and more informative SystemInfo macro
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1197
diff changeset
152 def touch(self):
1466
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
153 """ Touch the index """
5282
bee5567d7084 xapian: remove assumption that xapian db is a directory
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5281
diff changeset
154 raise NotImplementedError('...')
1496
70e94a679c47 cleanup whitespace, add/fix comments
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1494
diff changeset
155
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
156 def _search(self, query, **kw):
5272
a728d059c78e search package: docstring cleanup, src code formatting fixes
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5166
diff changeset
157 """ Actually perform the search
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
158
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
159 @param query: the search query objects tree
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
160 """
1211
d028d37e7105 raise NotImplemented instance
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1206
diff changeset
161 raise NotImplementedError('...')
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
162
1466
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
163 def search(self, query, **kw):
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
164 """ Search for items in the index
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
165
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
166 @param query: the search query objects to pass to the index
1466
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
167 """
5166
d80478608f48 MoinMoin/search/builtin.py searching does not require a lock, xapian allows several concurrent search connections.
Dmitrii Miliaev <dimazest@gmail.com>
parents: 5055
diff changeset
168 return self._search(query, **kw)
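
The split between the public `search()` and the backend hook `_search()` can be sketched as follows. The class names and the `limit` keyword are hypothetical and only serve the illustration. The sketch raises `NotImplementedError`, the exception class; the `NotImplemented` constant is not callable, so calling it would fail with a `TypeError` instead.

```python
class SketchBaseIndex(object):
    """Hypothetical base class: subclasses must supply _search()."""

    def _search(self, query, **kw):
        raise NotImplementedError("subclass must implement _search()")

    def search(self, query, **kw):
        # The public entry point simply delegates to the backend hook.
        return self._search(query, **kw)


class ListIndex(SketchBaseIndex):
    """Hypothetical backend: substring search over an in-memory list."""

    def __init__(self, titles):
        self.titles = list(titles)

    def _search(self, query, **kw):
        limit = kw.get("limit")  # illustrative keyword, not a MoinMoin option
        hits = [t for t in self.titles if query in t]
        return hits if limit is None else hits[:limit]
```

Because `search()` forwards `**kw`, the hook has to accept keyword arguments as well, otherwise a caller passing options would get a `TypeError` before the backend runs.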
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
169
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
170 def update_item(self, pagename, attachmentname=None, revno=None, now=True):
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
171 """ Update a single item (page or attachment) in the index
1466
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
172
1473
b5864c9492fb ensure new attachments trigger an index update, doc update for MoinMoin.search.Xapian
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1467
diff changeset
173 @param pagename: the name of the page to update
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
174 @param attachmentname: the name of the attachment to update
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
175 @param revno: a specific revision number (int) or None (all revs)
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
176 @param now: do all updates now (default: True)
1466
500e043cf7cd code documentation update
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1465
diff changeset
177 """
5276
195db0fdbb80 Fixed and cleaned up Xapian based search (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5272
diff changeset
178 self.update_queue.put(pagename, attachmentname, revno)
1480
c222d149e93f renaming and deleting pages works for all revisions
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1478
diff changeset
179 if now:
5281
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
180 self.do_queued_updates()
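
The enqueue-then-optionally-process flow of `update_item` can be sketched like this. The body of `do_queued_updates` is not part of this listing, so the drain loop below is an assumption made for illustration, as are the `SketchIndex` name and its `pending` and `indexed` attributes.

```python
from collections import deque


class SketchIndex(object):
    """Hypothetical index that records which jobs it has processed."""

    def __init__(self):
        self.pending = deque()  # jobs waiting to be indexed
        self.indexed = []       # jobs already processed, in order

    def do_queued_updates(self):
        # Assumed behaviour: drain the queue front to back.
        while self.pending:
            self.indexed.append(self.pending.popleft())

    def update_item(self, pagename, attachmentname=None, revno=None, now=True):
        self.pending.append((pagename, attachmentname, revno))
        if now:
            self.do_queued_updates()
```

With `now=False` a job only sits in the queue; the next call with `now=True` processes it together with everything queued before it.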
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
181
5750
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
182 def queuePages(self, files=None, pages=None):
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
183 """ Put pages (and files, if given) into indexer queue
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
184
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
185 @param files: iterator or list of files to index additionally
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
186 @param mode: set the mode of indexing the pages, either 'update' or 'add'
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
187 @param pages: list of pages to index, if not given, all pages are indexed
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
188 """
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
189 start = time.time()
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
190 request = self._indexingRequest(self.request)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
191 self._queue_pages(request, files, pages)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
192 logging.info("queuing completed successfully in %0.2f seconds." %
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
193 (time.time() - start))
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
194
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
195 def indexPagesQueued(self, count=-1):
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
196 """ Index <count> queued pages (and/or files)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
197 """
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
198 start = time.time()
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
199 done_count = self.do_queued_updates(count)
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
200 logging.info("indexing %d items completed successfully in %0.2f seconds." %
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
201 (done_count, time.time() - start))
021c1f6d3272 experimental queued indexing support to work around memory leak
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5734
diff changeset
202
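The queue-then-process flow of update_item, queuePages and indexPagesQueued can be sketched roughly as follows. This is a minimal illustration only: SketchQueue and the standalone do_queued_updates function are hypothetical stand-ins, not the MoinMoin implementation, and the only behaviour taken from the source is that an amount of -1 means "process everything" and that a count of processed items is returned.

```python
class SketchQueue(object):
    """Hypothetical in-memory stand-in for the update queue."""

    def __init__(self):
        self._items = []

    def put(self, pagename, attachmentname=None, revno=None):
        # Remember what has to be (re)indexed later.
        self._items.append((pagename, attachmentname, revno))

    def get(self):
        # Return the oldest entry, or None when the queue is empty.
        if self._items:
            return self._items.pop(0)
        return None


def do_queued_updates(queue, amount=-1):
    """Process up to `amount` entries (-1 means all); return how many were done."""
    done = 0
    while amount:
        item = queue.get()
        if item is None:
            break
        # A real indexer would (re)index the item here.
        done += 1
        amount -= 1
    return done
```

Because -1 never counts down to zero, the loop ends only when the queue is empty; a positive amount stops after that many items.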
4991
d39bdb239da4 Xapian2009: py.test.importorskip in tests was removed, tests try import Xapian, and on ImportError skip a test. Index.indexPages now takes a pages parameter - list of pages which must be indexed.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4982
diff changeset
203 def indexPages(self, files=None, mode='update', pages=None):
d39bdb239da4 Xapian2009: py.test.importorskip in tests was removed, tests try import Xapian, and on ImportError skip a test. Index.indexPages now takes a pages parameter - list of pages which must be indexed.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4982
diff changeset
204 """ Index pages (and files, if given)
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
205
4991
d39bdb239da4 Xapian2009: py.test.importorskip in tests was removed, tests try import Xapian, and on ImportError skip a test. Index.indexPages now takes a pages parameter - list of pages which must be indexed.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4982
diff changeset
206 @param files: iterator or list of files to index additionally
5314
e005834bbf85 less disruptive xapian index rebuild (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5282
diff changeset
207 @param mode: set the mode of indexing the pages, either 'update' or 'add'
4991
d39bdb239da4 Xapian2009: py.test.importorskip in tests was removed, tests try import Xapian, and on ImportError skip a test. Index.indexPages now takes a pages parameter - list of pages which must be indexed.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4982
diff changeset
208 @param pages: list of pages to index, if not given, all pages are indexed
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
209 """
5281
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
210 start = time.time()
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
211 request = self._indexingRequest(self.request)
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
212 self._index_pages(request, files, mode, pages=pages)
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
213 logging.info("indexing completed successfully in %0.2f seconds." %
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
214 (time.time() - start))
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
215
4991
d39bdb239da4 Xapian2009: py.test.importorskip in tests was removed, tests try import Xapian, and on ImportError skip a test. Index.indexPages now takes a pages parameter - list of pages which must be indexed.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4982
diff changeset
216 def _index_pages(self, request, files=None, mode='update', pages=None):
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
217 """ Index all pages (and all given files)
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
218
5281
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
219 This should be called from indexPages only!
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
220
1467
26c8ab85dc86 completed code documentation for MoinMoin.search.builtin
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1466
diff changeset
221 @param request: current request
4991
d39bdb239da4 Xapian2009: py.test.importorskip in tests was removed, tests try import Xapian, and on ImportError skip a test. Index.indexPages now takes a pages parameter - list of pages which must be indexed.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4982
diff changeset
222 @param files: iterator or list of files to index additionally
5314
e005834bbf85 less disruptive xapian index rebuild (details below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5282
diff changeset
223 @param mode: set the mode of indexing the pages, either 'update' or 'add'
4991
d39bdb239da4 Xapian2009: py.test.importorskip in tests was removed, tests try import Xapian, and on ImportError skip a test. Index.indexPages now takes a pages parameter - list of pages which must be indexed.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4982
diff changeset
224 @param pages: list of pages to index, if not given, all pages are indexed
d39bdb239da4 Xapian2009: py.test.importorskip in tests was removed, tests try import Xapian, and on ImportError skip a test. Index.indexPages now takes a pages parameter - list of pages which must be indexed.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4982
diff changeset
225
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
226 """
1211
d028d37e7105 raise NotImplemented instance
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1206
diff changeset
227 raise NotImplementedError('...')
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
228
5281
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
229 def do_queued_updates(self, amount=-1):
5272
a728d059c78e search package: docstring cleanup, src code formatting fixes
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5166
diff changeset
230 """ Perform updates in the queues
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
231
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
232 @param request: the current request
5281
5f0ec1f315bc xapian: new locking, removed threading and signing (details see below)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5278
diff changeset
233 @keyword amount: how many updates to perform at once (default: -1 == all)
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
234 """
1211
d028d37e7105 raise NotImplemented instance
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1206
diff changeset
235 raise NotImplementedError('...')
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
236
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
237 def optimize(self):
1496
70e94a679c47 cleanup whitespace, add/fix comments
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1494
diff changeset
238 """ Optimize the index if possible """
1211
d028d37e7105 raise NotImplemented instance
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1206
diff changeset
239 raise NotImplementedError('...')
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
240
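Abstract stubs such as _index_pages, do_queued_updates and optimize are meant to fail loudly until a subclass overrides them. The sketch below, with hypothetical names, shows why the exception class for that is NotImplementedError: NotImplemented is a sentinel value for comparison methods, and calling it raises TypeError instead of the intended error.

```python
class BaseIndexSketch(object):
    """Hypothetical base class with one abstract stub."""

    def optimize(self):
        # NotImplementedError is the exception class meant for abstract stubs.
        raise NotImplementedError('optimize must be provided by a subclass')


def call_sentinel():
    # NotImplemented is a singleton value, not a class, so calling it fails
    # with TypeError before any exception object can be raised.
    raise NotImplemented('...')
```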
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
241 def contentfilter(self, filename):
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
242 """ Get a filter for content of filename and return unicode content.
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
243
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
244 @param filename: name of the file
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
245 """
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
246 request = self.request
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
247 mt = wikiutil.MimeType(filename=filename)
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
248 for modulename in mt.module_name():
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
249 try:
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
250 execute = wikiutil.importPlugin(request.cfg, 'filter', modulename)
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
251 break
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
252 except wikiutil.PluginMissingError:
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
253 pass
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
254 else:
3162
153681321f8c logging: use module-level logger for MoinMoin.search.*
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 3104
diff changeset
255 logging.info("Cannot load filter for mimetype %s" % modulename)
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
256 try:
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
257 data = execute(self, filename)
3162
153681321f8c logging: use module-level logger for MoinMoin.search.*
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 3104
diff changeset
258 logging.debug("Filter %s returned %d characters for file %s" % (modulename, len(data), filename))
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
259 except (OSError, IOError), err:
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
260 data = ''
5734
a0c4450dce2c filter exception logging: fix wrong method name
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5733
diff changeset
261 logging.exception("Filter %s threw error '%s' for file %s" % (modulename, str(err), filename))
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
262 return mt.mime_type(), data
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
263
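The plugin lookup in contentfilter relies on Python's for/else: the else branch runs only when the loop ends without a break, that is, when no candidate filter could be loaded. A minimal sketch of that pattern follows. PluginMissing, find_filter and the registry dict are hypothetical stand-ins for wikiutil.PluginMissingError and wikiutil.importPlugin, and returning None in the else branch is an assumption of this sketch (the method above only logs there).

```python
class PluginMissing(Exception):
    """Hypothetical stand-in for wikiutil.PluginMissingError."""


def find_filter(candidates, registry):
    """Return the first registered filter among candidates, else None."""
    for name in candidates:
        try:
            if name not in registry:
                raise PluginMissing(name)
            execute = registry[name]
            break  # found one, so the else branch is skipped
        except PluginMissing:
            pass  # try the next, more general candidate
    else:
        # Runs only when the loop finished without break: nothing matched.
        execute = None
    return execute
```

Binding a value in the else branch keeps the name defined on every path, so a caller can test for None before invoking the filter.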
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
264 def _indexingRequest(self, request):
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
265 """ Return a new request that can be used for index building.
2286
01f05e74aa9c Big PEP8 and whitespace cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2222
diff changeset
266
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
267 This request uses a security policy that lets the current user
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
268 read any page. Without this policy some pages will not render,
1467
26c8ab85dc86 completed code documentation for MoinMoin.search.builtin
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1466
diff changeset
269 which will create a broken pagelinks index.
26c8ab85dc86 completed code documentation for MoinMoin.search.builtin
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1466
diff changeset
270
26c8ab85dc86 completed code documentation for MoinMoin.search.builtin
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1466
diff changeset
271 @param request: current request
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
272 """
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
273 from MoinMoin.security import Permissions
5863
5f5faedac588 remove copy.copy() that crashed on windows/iis/isapi-wsgi after page save, replace it with minimal FakeRequest instance
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5750
diff changeset
274 from MoinMoin.user import User
4981
f1d1d8105d52 Xapian2009: pep8 fixes.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4972
diff changeset
275
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
276 class SecurityPolicy(Permissions):
1791
6dd2e29acffe Eclipse PyDev Check: fixed lots of its errors and warnings
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1524
diff changeset
277 def read(self, *args, **kw):
1496
70e94a679c47 cleanup whitespace, add/fix comments
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1494
diff changeset
278 return True
4981
f1d1d8105d52 Xapian2009: pep8 fixes.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4972
diff changeset
279
5868
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
280 user = User(request)
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
281 user.may = SecurityPolicy(user)
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
282
5863
5f5faedac588 remove copy.copy() that crashed on windows/iis/isapi-wsgi after page save, replace it with minimal FakeRequest instance
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5750
diff changeset
283 class FakeRequest(object):
5f5faedac588 remove copy.copy() that crashed on windows/iis/isapi-wsgi after page save, replace it with minimal FakeRequest instance
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5750
diff changeset
284 """ minimal request object for indexing code """
5868
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
285 def __init__(self, request, user):
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
286 NAMES = """action cfg clock current_lang dicts form
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
287 getPragma getText href html_formatter
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
288 isSpiderAgent mode_getpagelinks page
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
289 parsePageLinks_running redirect redirectedOutput
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
290 rev rootpage script_root session setContentLanguage
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
291 setPragma theme uid_generator values write""".split()
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
292 for name in NAMES:
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
293 value = getattr(request, name, None)
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
294 setattr(self, name, value)
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
295 self.user = user
5863
5f5faedac588 remove copy.copy() that crashed on windows/iis/isapi-wsgi after page save, replace it with minimal FakeRequest instance
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5750
diff changeset
296
5868
fc11712e0df0 simplify FakeRequest setup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5867
diff changeset
297 return FakeRequest(request, user)
921
45e286183872 abstraction work on search engine index & cleanups
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 920
diff changeset
298
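The idea behind _indexingRequest, copying a whitelist of attributes onto a minimal request object and attaching a user whose policy allows every read, can be illustrated on its own. All class names and the shortened attribute list below are hypothetical; only the getattr/setattr copying with a None default and the always-True read check mirror the code above.

```python
class ReadAll(object):
    """Hypothetical permissive policy: every read check succeeds."""

    def read(self, *args, **kw):
        return True


class SketchUser(object):
    def __init__(self):
        self.may = ReadAll()


class SketchRequest(object):
    """Copies a whitelist of attributes from a source request."""

    NAMES = "cfg page rev".split()

    def __init__(self, request, user):
        for name in self.NAMES:
            # Missing attributes default to None instead of raising.
            setattr(self, name, getattr(request, name, None))
        self.user = user
```

Anything not named in the whitelist is simply not carried over, which keeps the indexing request small and independent of the original request object.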
1496
70e94a679c47 cleanup whitespace, add/fix comments
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1494
diff changeset
299
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
300 ##############################################################################
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
301 ### Searching
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
302 ##############################################################################
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
303
4981
f1d1d8105d52 Xapian2009: pep8 fixes.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4972
diff changeset
304
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
305 class BaseSearch(object):
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
306 """ A search run """
1496
70e94a679c47 cleanup whitespace, add/fix comments
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1494
diff changeset
307
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
308 def __init__(self, request, query, sort='weight', mtime=None, historysearch=0):
1499
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
309 """
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
310 @param request: current request
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
311 @param query: search query objects tree
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
312 @keyword sort: the sorting of the results (default: 'weight')
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
313 @keyword mtime: only show items newer than this timestamp (default: None)
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
314 @keyword historysearch: whether to show old revisions of a page (default: 0)
ffa0d1f81059 final polishing round adding docstrings, comments and fixing small issues
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1497
diff changeset
315 """
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
316 self.request = request
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
317 self.query = query
1237
0a947454dec7 use xapian for sorting search results
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1234
diff changeset
318 self.sort = sort
1433
6b0ea72d7665 mtime search works, added MoinMoin.support.parsedatetime, small fixes
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1241
diff changeset
319 self.mtime = mtime
1441
05482b439f89 optional history indexing and search is working
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1433
diff changeset
320 self.historysearch = historysearch
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
321 self.filtered = False
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
322 self.fs_rootpage = "FS" # XXX FS hardcoded
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
323
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
324 def run(self):
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
325 """ Perform search and return results object """
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
326
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
327 start = time.time()
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
328 hits, estimated_hits = self._search()
1496
70e94a679c47 cleanup whitespace, add/fix comments
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1494
diff changeset
329
919
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
330 # important - filter deleted pages or pages the user may not read!
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
331 if not self.filtered:
5469c8b911a4 Splitting out MoinMoin/search.py to MoinMoin/search/*.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents:
diff changeset
332 hits = self._filter(hits)
3162
153681321f8c logging: use module-level logger for MoinMoin.search.*
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 3104
diff changeset
333 logging.debug("after filtering: %d hits" % len(hits))
920
a2498260eca5 do result processing in results.py
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 919
diff changeset
334
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
335 return self._get_search_results(hits, start, estimated_hits)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
336
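BaseSearch.run follows a template-method shape: a backend-specific _search produces raw hits plus an optional estimate, and the base class filters them unless the backend has already done so. The sketch below shows that control flow with hypothetical classes; the readable check stands in for the deleted-page and ACL tests, and the result is returned as a plain tuple rather than a results object.

```python
class SearchSketch(object):
    """Hypothetical base class mirroring the run/_search/_filter split."""

    filtered = False  # a backend sets this to True if _search already filters

    def run(self):
        hits, estimated = self._search()
        if not self.filtered:
            hits = self._filter(hits)
        return hits, estimated

    def _search(self):
        raise NotImplementedError()

    def _filter(self, hits):
        # Keep only hits the current user is allowed to see.
        return [hit for hit in hits if self.readable(hit)]


class ListSearch(SearchSketch):
    """Hypothetical backend that 'finds' every page in a fixed list."""

    def __init__(self, pages, allowed):
        self.pages = pages
        self.allowed = allowed

    def _search(self):
        # No estimate is available, so None is returned for it.
        return list(self.pages), None

    def readable(self, hit):
        return hit in self.allowed
```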
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
337 def _search(self):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
338 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
339 Search pages.
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
340
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
341 Return list of tuples (wikiname, page object, attachment,
5272
a728d059c78e search package: docstring cleanup, src code formatting fixes
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 5166
diff changeset
342 matches, revision) and estimated number of search results (if
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
343 there is no estimate, None should be returned).
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
344
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
345 The list may contain deleted pages or pages the user may not read.
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
346 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
347 raise NotImplementedError()
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
348
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
349 def _filter(self, hits):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
350 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
351 Filter out deleted or ACL-protected pages
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
352
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
353 @param hits: list of hits
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
354 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
355 userMayRead = self.request.user.may.read
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
356 fs_rootpage = self.fs_rootpage + "/"
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
357 thiswiki = (self.request.cfg.interwikiname, 'Self')
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
358 filtered = [(wikiname, page, attachment, match, rev)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
359 for wikiname, page, attachment, match, rev in hits
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
360 if (wikiname not in thiswiki or
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
361 page.exists() and userMayRead(page.page_name) or
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
362 page.page_name.startswith(fs_rootpage)) and
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
363 (not self.mtime or self.mtime <= page.mtime_usecs()/1000000)]
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
364 return filtered
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
365
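The condition in _filter mixes `or` and `and` without inner parentheses, so it relies on `and` binding tighter than `or`. The following is a minimal standalone sketch of the same predicate for a single hit, not the MoinMoin implementation: FakePage, keep_hit and may_read are hypothetical stand-ins, and mtime_secs stands for page.mtime_usecs()/1000000.

```python
from collections import namedtuple

# Hypothetical stand-in for MoinMoin's Page; only what the predicate touches.
FakePage = namedtuple("FakePage", "page_name exists_flag mtime_secs")


def keep_hit(wikiname, page, thiswiki, may_read, fs_rootpage, mtime=None):
    """Mirror the condition used in _filter for one hit (illustrative only)."""
    visible = (
        wikiname not in thiswiki                            # hit from another wiki
        or (page.exists_flag and may_read(page.page_name))  # existing, readable local page
        or page.page_name.startswith(fs_rootpage)           # page below the fs root page
    )
    # No mtime limit given, or the page was modified at or after the limit.
    recent_enough = not mtime or mtime <= page.mtime_secs
    return visible and recent_enough
```

With the grouping written out, it is easier to see that a deleted or unreadable local page is only kept when its name starts with the fs_rootpage prefix.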
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
366 def _get_search_results(self, hits, start, estimated_hits):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
367 return getSearchResults(self.request, self.query, hits, start, self.sort, estimated_hits)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
368
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
369 def _get_match(self, page=None, uid=None):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
370 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
371 Get all matches
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
372
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
373 @param page: the current page instance
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
374 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
375 if page:
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
376 return self.query.search(page)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
377
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
378 def _getHits(self, pages):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
379 """ Get the hit tuples in pages through _get_match """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
380 logging.debug("_getHits searching in %d pages ..." % len(pages))
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
381 hits = []
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
382 revisionCache = {}
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
383 fs_rootpage = self.fs_rootpage
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
384 for hit in pages:
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
385
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
386 uid = hit.get('uid')
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
387 wikiname = hit['wikiname']
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
388 pagename = hit['pagename']
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
389 attachment = hit['attachment']
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
390 revision = int(hit.get('revision', 0))
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
391
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
392 logging.debug("_getHits processing %r %r %d %r" % (wikiname, pagename, revision, attachment))
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
393
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
394 if wikiname in (self.request.cfg.interwikiname, 'Self'): # THIS wiki
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
395 page = Page(self.request, pagename, rev=revision)
5035
93becb451375 Xapian2009: BaseTextFieldSearch.xapian_term() refactoring. Tests for a search with stemming.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 5021
diff changeset
396
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
397 if not self.historysearch and revision:
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
398 revlist = page.getRevList()
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
399 # revlist can be empty if page was nuked/renamed since it was included in xapian index
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
400 if not revlist or revlist[0] != revision:
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
401 # nothing there at all or not the current revision
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
402 logging.debug("no history search, skipping non-current revision...")
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
403 continue
5035
93becb451375 Xapian2009: BaseTextFieldSearch.xapian_term() refactoring. Tests for a search with stemming.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 5021
diff changeset
404
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
405 if attachment:
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
406 # revision is currently always 0
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
407 if pagename == fs_rootpage: # not really an attachment
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
408 page = Page(self.request, "%s/%s" % (fs_rootpage, attachment))
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
409 hits.append((wikiname, page, None, None, revision))
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
410 else:
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
411 matches = self._get_match(page=None, uid=uid)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
412 hits.append((wikiname, page, attachment, matches, revision))
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
413 else:
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
414 matches = self._get_match(page=page, uid=uid)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
415 logging.debug("self._get_match %r" % matches)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
416 if matches:
5036
4b2ef153ad4f Xapian2009: pep8 and typo fixes.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 5035
diff changeset
417 if not self.historysearch and pagename in revisionCache and revisionCache[pagename][0] < revision:
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
418 hits.remove(revisionCache[pagename][1])
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
419 del revisionCache[pagename]
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
420 hits.append((wikiname, page, attachment, matches, revision))
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
421 revisionCache[pagename] = (revision, hits[-1])
5035
93becb451375 Xapian2009: BaseTextFieldSearch.xapian_term() refactoring. Tests for a search with stemming.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 5021
diff changeset
422
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
423 else: # other wiki
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
424 hits.append((wikiname, pagename, attachment, None, revision))
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
425 logging.debug("_getHits returning %r." % hits)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
426 return hits
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
427
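The revisionCache bookkeeping in _getHits can be hard to follow inline. The sketch below isolates it under simplifying assumptions: hits are reduced to (pagename, revision) pairs, every candidate is treated as a match, and collect_hits is a hypothetical name rather than part of MoinMoin.

```python
def collect_hits(candidates, historysearch=False):
    """candidates: iterable of (pagename, revision) pairs that matched."""
    hits = []
    revision_cache = {}  # pagename -> (revision, hit tuple stored in hits)
    for pagename, revision in candidates:
        cached = revision_cache.get(pagename)
        if not historysearch and cached is not None and cached[0] < revision:
            # An older revision of this page was recorded earlier: drop it.
            hits.remove(cached[1])
            del revision_cache[pagename]
        hit = (pagename, revision)
        hits.append(hit)
        revision_cache[pagename] = (revision, hit)
    return hits
```

As in the original loop, an earlier hit is replaced only when a newer revision arrives after it. If revisions of the same page arrive in descending order, both hits stay in the list.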
4981
f1d1d8105d52 Xapian2009: pep8 fixes.
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4972
diff changeset
428
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
429 class MoinSearch(BaseSearch):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
430
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
431 def __init__(self, request, query, sort='weight', mtime=None, historysearch=0, pages=None):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
432 super(MoinSearch, self).__init__(request, query, sort, mtime, historysearch)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
433
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
434 self.pages = pages
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
435
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
436 def _search(self):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
437 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
438 Search pages using moin's built-in full text search
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
439
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
440 The list may contain deleted pages or pages the user may not
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
441 read.
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
442
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
443 If self.pages is not None, search only in those pages.
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
444 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
445 self.request.clock.start('_moinSearch')
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
446
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
447 # if self.pages is None or empty, we make a full pagelist, but don't
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
448 # search attachments (thus attachment name = '')
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
449 pages = self.pages or [{'pagename': p, 'attachment': '', 'wikiname': 'Self', } for p in self._getPageList()]
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
450
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
451 hits = self._getHits(pages)
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
452 self.request.clock.stop('_moinSearch')
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
453
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
454 return hits, None
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
455
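The page selection in MoinSearch._search uses `self.pages or [...]`, so an empty list falls back to the full page list just as None does. A minimal sketch of that fallback follows; pages_to_search and all_page_names are hypothetical names, with all_page_names standing in for the result of self._getPageList().

```python
def pages_to_search(pages, all_page_names):
    """Return the explicit page dicts, or build one dict per known page name."""
    # "or" treats both None and an empty list as "no explicit selection".
    return pages or [
        {'pagename': name, 'attachment': '', 'wikiname': 'Self'}
        for name in all_page_names
    ]
```

A caller that needs "search nothing" for an empty selection would have to test `pages is not None` explicitly instead of relying on truthiness.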
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
456 def _getPageList(self):
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
457 """ Get list of pages to search in
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
458
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
459 If the query has a page filter, use it to filter pages before
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
460 searching. If not, get an unfiltered page list. The filtering
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
461 will happen later on the hits, which is faster with current
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
462 slow storage.
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
463 """
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
464 filter_ = self.query.pageFilter()
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
465 if filter_:
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
466 # There is no need to filter the results again.
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
467 self.filtered = True
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
468 return self.request.rootpage.getPageList(filter=filter_)
1241
cba856bc0c05 estimate numer of hits correctly
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1237
diff changeset
469 else:
4970
43e7b40912ac Xapian2009: The MoinMoin.search.builtin.Search class was split to BaseSearch, MoinSearch and XapianSearch. Search using moin should work, xapian search is broken!
Dmitrijs Milajevs <dimazest@gmail.com>
parents: 4516
diff changeset
470 return self.request.rootpage.getPageList(user='', exists=0)
1237
0a947454dec7 use xapian for sorting search results
Franz Pletz <fpletz AT franz-pletz DOT org>
parents: 1234
diff changeset
471