annotate MoinMoin/script/migration/_conv160_wiki.py @ 2602:b601db2e4d34

1.6 converter: improve content conversion, add test for it
author Thomas Waldmann <tw AT waldmann-edv DOT de>
date Sat, 04 Aug 2007 21:28:22 +0200
parents 13f0331f3a42
children c61c10e3fcde
rev   line source
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
1 # -*- coding: iso-8859-1 -*-
  2 """
  3     MoinMoin - convert content in wiki markup
  4
  5     Assuming we have this "renames" map:
  6     -------------------------------------------------------
  7     'PAGE', 'some_page'        -> 'some page'
  8     'FILE', 'with%20blank.txt' -> 'with blank.txt'
  9
 10     Markup transformations needed:
 11     -------------------------------------------------------
 12     ["some_page"]          -> ["some page"] # renamed
 13     [:some_page:some text] -> ["some page" some text] # NEW: free link with link text
 14     [:page:text]           -> ["page" text] # NEW: free link with link text
 15                               (with a page not being renamed)
 16
 17     attachment:with%20blank.txt -> attachment:"with blank.txt"
 18     attachment:some_page/with%20blank.txt -> attachment:"some page/with blank.txt"
 19     The attachment processing should also urllib.unquote the filename (or at
 20     least replace %20 by space) and put it into "quotes" if it contains spaces.
 21
 22     @copyright: 2007 MoinMoin:JohannesBerg,
 23                 2007 MoinMoin:ThomasWaldmann
 24     @license: GNU GPL, see COPYING for details.
 25 """
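The transformations listed in the docstring can be sketched in isolation. The following is a minimal, standalone illustration and not the converter's actual code: the `RENAMES` contents and the helpers `lookup`, `quote_if_needed`, `convert_free_link` and `convert_attachment` are hypothetical names introduced here. It assumes Python 3 (`urllib.parse.unquote`), keys the map the way `Converter._replace` reads its keys further down (`('PAGE', pagename)` or `('FILE', pagename, filename)`), and always uses plain double quotes as a simplified stand-in for `wikiutil.quoteName`.

```python
from urllib.parse import unquote

# Hypothetical renames map. Keys follow the shapes read by Converter._replace:
# ('PAGE', pagename) or ('FILE', pagename, filename).
RENAMES = {
    ('PAGE', 'some_page'): 'some page',
    ('FILE', 'some_page', 'with%20blank.txt'): 'with blank.txt',
}


def lookup(key, renames=RENAMES):
    """Return the new name for key, or the original name if it was not renamed."""
    # The last tuple element is the item's own name (page name or file name).
    return renames.get(key, key[-1])


def quote_if_needed(name):
    """Wrap name in double quotes only if it contains a space (simplified quoting)."""
    return '"%s"' % name if ' ' in name else name


def convert_free_link(pagename, text=None, renames=RENAMES):
    """[:page:text] -> ["page" text], and ["page"] -> ["new page"]."""
    new_name = lookup(('PAGE', pagename), renames)
    if text:
        return '["%s" %s]' % (new_name, text)
    return '["%s"]' % new_name


def convert_attachment(fname, pagename, current_page, renames=RENAMES):
    """Rename, unquote and (if needed) quote an attachment reference."""
    new_fname = unquote(lookup(('FILE', pagename, fname), renames))
    new_page = lookup(('PAGE', pagename), renames)
    if pagename == current_page:
        name = new_fname                      # attachment of the page being converted
    else:
        name = '%s/%s' % (new_page, new_fname)  # attachment of some other page
    return 'attachment:' + quote_if_needed(name)
```

Under these assumptions, `convert_attachment('with%20blank.txt', 'some_page', 'OtherPage')` reproduces the `attachment:"some page/with blank.txt"` case from the docstring, and a name without spaces is left unquoted.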
 26
 27 import re, codecs
 28 from MoinMoin import i18n
 29 i18n.wikiLanguages = lambda : []
 30 from MoinMoin import config, wikiutil
 31 from MoinMoin.parser.text_moin_wiki import Parser
2602
 32 from MoinMoin.action import AttachFile
2599
 33
 34 class Converter(Parser):
2602
 35     def __init__(self, request, pagename, raw, renames):
2599
 36         self.request = request
2602
 37         self.pagename = pagename
2599
 38         self.raw = raw
 39         self.renames = renames
 40         self.in_pre = False
 41         self._ = None
 42
2602
 43     def _replace(self, key):
2599
 44         """ replace an item_name if it is in the renames dict """
2602
 45         if key[0] == 'PAGE':
 46             item_name = key[1] # pagename
 47         elif key[0] == 'FILE':
 48             item_name = key[2] # filename, key[1] is pagename
2599
 49         try:
2602
 50             return self.renames[key] # new pagename or new filename
2599
 51         except KeyError:
 52             return item_name
 53
 54     def return_word(self, word):
 55         return word
 56     _remark_repl = return_word
 57     _table_repl = return_word
 58     _tableZ_repl = return_word
 59     _emph_repl = return_word
 60     _emph_ibb_repl = return_word
 61     _emph_ibi_repl = return_word
 62     _emph_ib_or_bi_repl = return_word
 63     _u_repl = return_word
 64     _strike_repl = return_word
 65     _sup_repl = return_word
 66     _sub_repl = return_word
 67     _small_repl = return_word
 68     _big_repl = return_word
 69     _tt_repl = return_word
 70     _tt_bt_repl = return_word
 71     _notword_repl = return_word
 72     _rule_repl = return_word
 73     _smiley_repl = return_word
 74     _smileyA_repl = return_word
 75     _ent_repl = return_word
 76     _ent_numeric_repl = return_word
 77     _ent_symbolic_repl = return_word
 78     _heading_repl = return_word
 79     _email_repl = return_word
 80     _macro_repl = return_word
 81     _interwiki_repl = return_word
 82     _word_repl = return_word
 83     _indent_repl = return_word
 84     _li_none_repl = return_word
 85     _li_repl = return_word
 86     _ol_repl = return_word
 87     _dl_repl = return_word
 88     _comment_repl = return_word
 89
 90     # PRE SECTION HANDLING ---------------------------------------------------
 91
 92     def _pre_repl(self, word):
 93         origw = word
 94         word = word.strip()
 95         if word == '{{{' and not self.in_pre:
 96             self.in_pre = True
 97             return origw
 98         elif word == '}}}' and self.in_pre:
 99             self.in_pre = False
100             return origw
101         return word
102
103     def _parser_repl(self, word):
104         origw = word
105         if word.startswith('{{{'):
106             word = word[3:]
107
108         s_word = word.strip()
109         self.in_pre = True
110         return origw
111
112     # LINKS ------------------------------------------------------------------
113
114     def _replace_target(self, target):
115         target_and_anchor = target.split('#', 1)
116         if len(target_and_anchor) > 1:
117             target, anchor = target_and_anchor
2602
118             target = self._replace(('PAGE', target))
2599
119             return '%s#%s' % (target, anchor)
120         else:
2602
121             target = self._replace(('PAGE', target))
2599
122             return target
123
124     def interwiki(self, target_and_text, **kw):
125         # TODO: maybe support [wiki:Page http://wherever/image.png] ?
126         scheme, rest = target_and_text.split(':', 1)
127         wikiname, pagename, text = wikiutil.split_wiki(rest)
128         if not text:
129             text = pagename
130         #self.request.log("interwiki: split_wiki -> %s.%s.%s" % (wikiname,pagename,text))
131
132         if wikiname.lower() == 'self': # [wiki:Self:LocalPage text] or [:LocalPage:text]
2602
133             return '[%s %s]' % (wikiutil.quoteName(pagename), text) # ["LocalPage" text]
2599
134
135         # check for image URL, and possibly return IMG tag
136         if not kw.get('pretty_url', 0) and wikiutil.isPicture(pagename):
137             dummy, wikiurl, dummy, wikitag_bad = wikiutil.resolve_wiki(self.request, rest)
138             href = wikiutil.join_wiki(wikiurl, pagename)
139             #self.request.log("interwiki: join_wiki -> %s.%s.%s" % (wikiurl,pagename,href))
140             return target_and_text # self.formatter.image(src=href)
141
142         return target_and_text # wikiname, pagename, text
143
144     def attachment(self, target_and_text, **kw):
145         """ This gets called on attachment URLs """
146         _ = self._
147         #self.request.log("attachment: target_and_text %s" % target_and_text)
148         scheme, fname, text = wikiutil.split_wiki(target_and_text)
2602
149         pagename, fname = AttachFile.absoluteName(fname, self.pagename)
150         from_this_page = pagename == self.pagename
151         fname = self._replace(('FILE', pagename, fname))
152         if '%20' in fname:
153             fname = fname.replace('%20', ' ')
154         fname = self._replace(('FILE', pagename, fname))
155         pagename = self._replace(('PAGE', pagename))
156         if from_this_page:
157             name = fname
158         else:
159             name = "%s/%s" % (pagename, fname)
160         if ' ' in name:
161             qname = wikiutil.quoteName(name)
162         else:
163             qname = name
2599
164
2602
165         if text:
166             text = ' ' + text
167         return "%s:%s%s" % (scheme, qname, text)
2599
168
169     def _url_repl(self, word):
170         """Handle literal URLs including inline images."""
171         scheme = word.split(":", 1)[0]
172
173         if scheme == "wiki":
174             return word # self.interwiki(word)
175
176         if scheme in self.attachment_schemas:
177             return self.attachment(word)
178
179         if wikiutil.isPicture(word):
180             # Get image name http://here.com/dir/image.gif -> image
181             name = word.split('/')[-1]
182             name = ''.join(name.split('.')[:-1])
183             return word # self.formatter.image(src=word, alt=name)
184         else:
185             return word # word, scheme
186
    def _wikiname_bracket_repl(self, text):
        """Handle special-char wikinames with link text, like:
           ["Jim O'Brian" Jim's home page] or ['Hello "world"!' a page with doublequotes]
        """
        word = text[1:-1] # strip brackets
        first_char = word[0]
        if first_char in wikiutil.QUOTE_CHARS:
            # split on closing quote
            target, linktext = word[1:].split(first_char, 1)
            target = self._replace_target(target)
            target = wikiutil.quoteName(target)
        else: # not quoted
            # split on whitespace
            target, linktext = word.split(None, 1)
            target = self._replace_target(target)
            if ' ' in target:
                target = wikiutil.quoteName(target)
        if linktext:
            linktext = ' ' + linktext
        return '[%s%s]' % (target, linktext)

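The quoted/unquoted split used for bracketed wikinames can be tried in isolation. The following is a minimal sketch, not the converter's code: `QUOTE_CHARS` and `quote_name` are simplified stand-ins assumed here for `wikiutil.QUOTE_CHARS` and `wikiutil.quoteName`, and `split_bracket_link` is a hypothetical helper name. It leaves out the rename lookup done by `_replace_target`.

```python
QUOTE_CHARS = "'\""  # assumed stand-in for wikiutil.QUOTE_CHARS


def quote_name(name):
    """Hypothetical stand-in for wikiutil.quoteName.

    Wraps name in the first quote character that does not occur in it.
    """
    for quote_char in QUOTE_CHARS:
        if quote_char not in name:
            return quote_char + name + quote_char
    return name  # no usable quote char; left unquoted in this sketch


def split_bracket_link(text):
    """Split '[target link text]' into (target, linktext)."""
    word = text[1:-1]  # strip brackets
    first_char = word[0]
    if first_char in QUOTE_CHARS:
        # quoted target: everything up to the closing quote
        target, linktext = word[1:].split(first_char, 1)
    else:
        # unquoted target: everything up to the first whitespace
        target, linktext = word.split(None, 1)
    return target, linktext.strip()


target, linktext = split_bracket_link('["Jim O\'Brian" Jim\'s home page]')
```

As in the method above, an unquoted link without any link text would raise a `ValueError` on unpacking, so the sketch assumes the matching rule only hands over links that carry text.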
    def _url_bracket_repl(self, word):
        """Handle bracketed URLs."""
        word = word[1:-1] # strip brackets

        # Local extended link? [:page name:link text]
        if word[0] == ':':
            words = word[1:].split(':', 1)
            words[0] = self._replace_target(words[0])
            if len(words) == 1:
                link = words[0]
                link = wikiutil.quoteName(link)
                return '[%s]' % link # use freelink
            else:
                link, text = words
                link = wikiutil.quoteName(link)
                if text:
                    text = ' ' + text
                return '[%s%s]' % (link, text) # use freelink with text

        return '[%s]' % word

    # SCANNING ---------------------------------------------------------------
    def scan(self, scan_re, line):
        """ Scans one line

        Append text before match, invoke replace() with match, and add text after match.
        """
        result = []
        lastpos = 0

        for match in scan_re.finditer(line):
            # Add text before the match
            if lastpos < match.start():
                result.append(line[lastpos:match.start()])
            # Replace match with markup
            result.append(self.replace(match))
            lastpos = match.end()

        # Add remainder of the line
        result.append(line[lastpos:])
        return u''.join(result)

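The scan loop is a hand-written form of "substitute every match through a callback". A standalone sketch follows, assuming a hypothetical CamelCase rule `word_re` and a free function in place of the method. For this simple case it gives the same result as `scan_re.sub(replace, line)`.

```python
import re


def scan(scan_re, line, replace):
    """Rebuild line, passing each regex match through replace()."""
    result = []
    lastpos = 0
    for match in scan_re.finditer(line):
        # keep the text between the previous match and this one
        result.append(line[lastpos:match.start()])
        result.append(replace(match))
        lastpos = match.end()
    result.append(line[lastpos:])  # remainder after the last match
    return "".join(result)


def bracket(match):
    """Illustrative replacement: wrap the matched word in brackets."""
    return "[" + match.group(0) + "]"


word_re = re.compile(r"[A-Z][a-z]+[A-Z][a-z]+")  # hypothetical CamelCase rule
out = scan(word_re, "see SomePage and OtherPage now", bracket)
```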
    def replace(self, match):
        """ Replace match using type name """
        result = []
        for _type, hit in match.groupdict().items():
            if hit is not None and _type not in ["hmarker", ]:
                # Get replace method and replace hit
                replace = getattr(self, '_' + _type + '_repl')
                result.append(replace(hit))
                return ''.join(result)
        else:
            # We should never get here
            import pprint
            raise Exception("Can't handle match %r\n%s\n%s" % (
                match,
                pprint.pformat(match.groupdict()),
                pprint.pformat(match.groups()),
            ))

        return ""

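The dispatch in `replace()` relies on a naming convention: each named group `foo` in the combined regex is handled by a method `_foo_repl`. A minimal sketch of that convention is shown below. The class `MiniConverter`, its two rules, and their output markup are invented for illustration and are not part of the converter.

```python
import re


class MiniConverter(object):
    """Hypothetical two-rule converter; rule names are illustrative only."""

    # one alternative per rule, each wrapped in a named group
    rules = r"(?P<bold>'''.+?''')|(?P<url>https?://\S+)"

    def _bold_repl(self, hit):
        return "**" + hit[3:-3] + "**"

    def _url_repl(self, hit):
        return "<" + hit + ">"

    def replace(self, match):
        # with a top-level alternation, one named group holds the hit
        for _type, hit in match.groupdict().items():
            if hit is not None:
                return getattr(self, "_" + _type + "_repl")(hit)
        raise ValueError("no named group matched: %r" % (match.group(0),))

    def convert_line(self, line):
        return re.sub(self.rules, self.replace, line)


converted = MiniConverter().convert_line("'''hi''' at http://example.com")
```

Adding a rule then only needs a new named alternative and a matching `_<name>_repl` method; a group without a method fails with an `AttributeError` from `getattr`.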
    def convert(self):
        """ For each line, scan through looking for magic
            strings, outputting verbatim any intervening text.
        """
        # prepare regex patterns
        rules = self.formatting_rules.replace('\n', '|')
        if 1: # self.cfg.bang_meta:
            rules = ur'(?P<notword>!%(word_rule)s)|%(rules)s' % {
                'word_rule': self.word_rule,
                'rules': rules,
            }
        pre_rules = self.pre_formatting_rules.replace('\n', '|')
        scan_re = re.compile(rules, re.UNICODE)
        pre_scan_re = re.compile(pre_rules, re.UNICODE)
        eol_re = re.compile(r'\r?\n', re.UNICODE)

        rawtext = self.raw

        # remove last item because it's guaranteed to be empty
        self.lines = eol_re.split(rawtext)[:-1]
        self.in_processing_instructions = 1

        # Main loop
        for line in self.lines:
            # ignore processing instructions
            if self.in_processing_instructions:
                found = False
                for pi in ("##", "#format", "#refresh", "#redirect", "#deprecated",
                           "#pragma", "#form", "#acl", "#language"):
                    if line.lower().startswith(pi):
                        self.request.write(line + '\r\n')
                        found = True
                        break
                if not found:
                    self.in_processing_instructions = 0
                else:
                    continue # do not parse this line
            if self.in_pre:
                # still looking for processing instructions
                if self.in_pre == 'search_parser':
                    if line.strip().startswith("#!"):
                        self.in_pre = True
                        self.request.write(line + '\r\n')
                        continue
                    else:
                        self.in_pre = True
            else:
                # Paragraph break on empty lines
                if not line.strip():
                    self.request.write(line + '\r\n')
                    continue

            # Scan line, format and write
            scanning_re = self.in_pre and pre_scan_re or scan_re
            formatted_line = self.scan(scanning_re, line)
            self.request.write(formatted_line + '\r\n')

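The main loop copies the leading block of processing instructions through unchanged and only starts scanning at the first line that is not one. That header detection can be sketched on its own as below. `split_header` is a hypothetical helper, not a function of the converter, and the prefix tuple is copied from the loop above.

```python
PI_PREFIXES = ("##", "#format", "#refresh", "#redirect", "#deprecated",
               "#pragma", "#form", "#acl", "#language")


def split_header(lines):
    """Return (pi_lines, body_lines).

    pi_lines is the leading run of processing instructions; a '#acl' line
    further down in the body is treated as ordinary content.
    """
    for i, line in enumerate(lines):
        # str.startswith accepts a tuple of prefixes
        if not line.lower().startswith(PI_PREFIXES):
            return lines[:i], lines[i:]
    return lines, []


header, body = split_header(["#FORMAT wiki", "#acl All:read", "Text", "#acl later"])
```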
def convert_wiki(pagename, intext, renames):
    """ Convert content written in wiki markup """
    import StringIO
    request = StringIO.StringIO()
    noeol = False
    if not intext.endswith('\r\n'):
        intext += '\r\n'
        noeol = True
    p = Converter(request, pagename, intext, renames)
    p.convert()
    res = request.getvalue()
    if noeol:
        res = res[:-2]
    return res

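`convert_wiki` pads the input with a trailing CRLF so that the line splitter sees a terminated last line, then removes that CRLF from the result if it was not in the input. The sketch below shows the same round trip with a trivial per-line callback. `convert_lines` and its `convert_line` parameter are assumptions made for illustration, and `io.StringIO` is used in place of the Python 2 `StringIO` module.

```python
import io


def convert_lines(intext, convert_line):
    """Apply convert_line to each line, preserving whether intext ended with CRLF."""
    out = io.StringIO()
    noeol = not intext.endswith("\r\n")
    if noeol:
        intext += "\r\n"  # so that every line, including the last, is terminated
    for line in intext.split("\r\n")[:-1]:  # last item is empty after the final CRLF
        out.write(convert_line(line) + "\r\n")
    res = out.getvalue()
    if noeol:
        res = res[:-2]  # drop the CRLF that was added above
    return res


without_eol = convert_lines("a\r\nb", str.upper)
with_eol = convert_lines("a\r\nb\r\n", str.upper)
```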