annotate MoinMoin/script/migration/_conv160_wiki.py @ 2603:c61c10e3fcde

1.6 converter: improve content conversion, more tests
author Thomas Waldmann <tw AT waldmann-edv DOT de>
date Sun, 05 Aug 2007 00:05:34 +0200
parents b601db2e4d34
children 27f06531a91b
rev   line source
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
1 # -*- coding: iso-8859-1 -*-
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
2 """
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
3 MoinMoin - convert content in wiki markup
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
4
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
5 Assuming we have this "renames" map:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
6 -------------------------------------------------------
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
7 'PAGE', 'some_page' -> 'some page'
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
8 'FILE', 'with%20blank.txt' -> 'with blank.txt'
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
9
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
10 Markup transformations needed:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
11 -------------------------------------------------------
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
12 ["some_page"] -> ["some page"] # renamed
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
13 [:some_page:some text] -> ["some page" some text] # NEW: free link with link text
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
14 [:page:text] -> ["page" text] # NEW: free link with link text
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
15 (with a page not being renamed)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
16
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
17 attachment:with%20blank.txt -> attachment:"with blank.txt"
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
18 attachment:some_page/with%20blank.txt -> attachment:"some page/with blank.txt"
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
19 The attachment processing should also urllib.unquote the filename (or at
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
20 least replace %20 by space) and put it into "quotes" if it contains spaces.
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
21
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
22 @copyright: 2007 MoinMoin:JohannesBerg,
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
23 2007 MoinMoin:ThomasWaldmann
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
24 @license: GNU GPL, see COPYING for details.
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
25 """
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
26
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
27 import re, codecs
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
28 from MoinMoin import i18n
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
29 i18n.wikiLanguages = lambda : []
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
30 from MoinMoin import config, wikiutil
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
31 from MoinMoin.parser.text_moin_wiki import Parser
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
32 from MoinMoin.action import AttachFile
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
33
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
34 class Converter(Parser):
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
35 def __init__(self, request, pagename, raw, renames):
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
36 self.request = request
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
37 self.pagename = pagename
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
38 self.raw = raw
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
39 self.renames = renames
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
40 self.in_pre = False
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
41 self._ = None
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
42
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
43 def _replace(self, key):
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
44 """ replace an item_name if it is in the renames dict """
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
45 if key[0] == 'PAGE':
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
46 item_name = key[1] # pagename
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
47 elif key[0] == 'FILE':
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
48 item_name = key[2] # filename, key[1] is pagename
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
49 try:
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
50 return self.renames[key] # new pagename or new filename
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
51 except KeyError:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
52 return item_name
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
53
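The lookup done by _replace can be tried outside MoinMoin with a plain dict. This is a minimal sketch, not the converter's code: lookup_rename and the sample renames entries are hypothetical, and the key shapes ('PAGE', pagename) and ('FILE', pagename, filename) are taken from the calls made in this file. Unlike the method above, the sketch raises ValueError for an unknown key kind instead of leaving item_name unbound.

```python
def lookup_rename(renames, key):
    """Return the new name for key, or the old name if key was not renamed."""
    kind = key[0]
    if kind == 'PAGE':
        item_name = key[1]   # ('PAGE', pagename)
    elif kind == 'FILE':
        item_name = key[2]   # ('FILE', pagename, filename)
    else:
        raise ValueError('unknown key kind: %r' % (kind,))
    # dict.get falls back to the unchanged name, like the try/except KeyError above
    return renames.get(key, item_name)


# Hypothetical sample data, modelled on the module docstring.
renames = {
    ('PAGE', 'some_page'): 'some page',
    ('FILE', 'some_page', 'with%20blank.txt'): 'with blank.txt',
}

print(lookup_rename(renames, ('PAGE', 'some_page')))
print(lookup_rename(renames, ('PAGE', 'FrontPage')))
```

A page that is not in the map comes back unchanged, which is what lets the converter call the lookup on every link target it sees.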
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
54 def return_word(self, word):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
55 return word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
56 _remark_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
57 _table_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
58 _tableZ_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
59 _emph_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
60 _emph_ibb_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
61 _emph_ibi_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
62 _emph_ib_or_bi_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
63 _u_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
64 _strike_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
65 _sup_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
66 _sub_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
67 _small_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
68 _big_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
69 _tt_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
70 _tt_bt_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
71 _notword_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
72 _rule_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
73 _smiley_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
74 _smileyA_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
75 _ent_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
76 _ent_numeric_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
77 _ent_symbolic_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
78 _heading_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
79 _email_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
80 _macro_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
81 _word_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
82 _indent_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
83 _li_none_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
84 _li_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
85 _ol_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
86 _dl_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
87 _comment_repl = return_word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
88
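The run of _..._repl = return_word aliases above makes every markup rule the converter does not care about hand back its matched text unchanged. The sketch below illustrates that pattern under one assumption: that the base parser chooses the _<group>_repl method from the name of the regex group that matched. The class, its two rules, and the upper-casing handler are hypothetical and are not MoinMoin's rules.

```python
import re


class PassThroughDemo:
    # Two made-up rules: emphasis markers and CamelCase words.
    rules = re.compile(r"(?P<emph>''+)|(?P<word>[A-Z][a-z]+[A-Z]\w+)")

    def return_word(self, word):
        """Default handler: leave the matched markup as it is."""
        return word

    # Alias the default handler for a rule that needs no conversion.
    _emph_repl = return_word

    def _word_repl(self, word):
        """A rule that does convert something (here: just upper-case it)."""
        return word.upper()

    def replace(self, match):
        # Dispatch on the name of the group that matched.
        for name, hit in match.groupdict().items():
            if hit is not None:
                return getattr(self, '_%s_repl' % name)(hit)
        return match.group(0)

    def convert(self, text):
        return self.rules.sub(self.replace, text)


print(PassThroughDemo().convert("''see'' FrontPage"))
```

Only the rules with a real handler change the text; everything aliased to return_word survives the round trip byte for byte.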
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
89 # PRE SECTION HANDLING ---------------------------------------------------
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
90
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
91 def _pre_repl(self, word):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
92 origw = word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
93 word = word.strip()
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
94 if word == '{{{' and not self.in_pre:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
95 self.in_pre = True
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
96 return origw
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
97 elif word == '}}}' and self.in_pre:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
98 self.in_pre = False
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
99 return origw
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
100 return origw
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
101
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
102 def _parser_repl(self, word):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
103 origw = word
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
104 if word.startswith('{{{'):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
105 word = word[3:]
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
106
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
107 s_word = word.strip()
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
108 self.in_pre = True
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
109 return origw
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
110
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
111 # LINKS ------------------------------------------------------------------
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
112
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
113 def _replace_target(self, target):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
114 target_and_anchor = target.split('#', 1)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
115 if len(target_and_anchor) > 1:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
116 target, anchor = target_and_anchor
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
117 target = self._replace(('PAGE', target))
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
118 return '%s#%s' % (target, anchor)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
119 else:
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
120 target = self._replace(('PAGE', target))
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
121 return target
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
122
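The anchor handling in _replace_target can be shown as a standalone function. This is a sketch under stated assumptions: replace_target and the sample map are hypothetical, and str.partition is used here in place of the split('#', 1) call above, with the same result for targets with and without an anchor.

```python
def replace_target(renames, target):
    """Rename the page part of 'Page#anchor' and keep the anchor untouched."""
    page, sep, anchor = target.partition('#')
    page = renames.get(('PAGE', page), page)
    # sep and anchor are empty strings when the target has no '#'
    return page + sep + anchor


# Hypothetical sample data, modelled on the module docstring.
renames = {('PAGE', 'some_page'): 'some page'}

print(replace_target(renames, 'some_page#top'))
print(replace_target(renames, 'some_page'))
```

Only the text before the first '#' is looked up, so an anchor is never mistaken for part of a page name.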
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
123 def interwiki(self, target_and_text, **kw):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
124 # TODO: maybe support [wiki:Page http://wherever/image.png] ?
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
125 scheme, rest = target_and_text.split(':', 1)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
126 wikiname, pagename, text = wikiutil.split_wiki(rest)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
127 #self.request.log("interwiki: split_wiki -> %s.%s.%s" % (wikiname,pagename,text))
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
128
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
129 if wikiname.lower() == 'self': # [wiki:Self:LocalPage text] or [:LocalPage:text]
2603
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
130 pagename = self._replace(('PAGE', pagename))
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
131 if not text:
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
132 return '[%s]' % wikiutil.quoteName(pagename) # ["LocalPage"]
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
133 else:
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
134 return '[%s %s]' % (wikiutil.quoteName(pagename), text) # ["LocalPage" text]
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
135
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
136 # check for image URL, and possibly return IMG tag
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
137 if not kw.get('pretty_url', 0) and wikiutil.isPicture(pagename):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
138 dummy, wikiurl, dummy, wikitag_bad = wikiutil.resolve_wiki(self.request, rest)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
139 href = wikiutil.join_wiki(wikiurl, pagename)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
140 #self.request.log("interwiki: join_wiki -> %s.%s.%s" % (wikiurl,pagename,href))
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
141 return target_and_text # self.formatter.image(src=href)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
142
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
143 return target_and_text # wikiname, pagename, text
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
144
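The "Self" branch of interwiki above produces the new free-link syntax listed in the module docstring. A minimal sketch of just that step follows; convert_self_link is a hypothetical name, and wrapping the page name in double quotes is an assumed stand-in for wikiutil.quoteName, whose exact escaping rules are not reproduced here.

```python
def convert_self_link(renames, pagename, text=''):
    """Build ["page"] or ["page" text] for a link to a local page."""
    pagename = renames.get(('PAGE', pagename), pagename)
    quoted = '"%s"' % pagename  # assumed stand-in for wikiutil.quoteName
    if text:
        return '[%s %s]' % (quoted, text)
    return '[%s]' % quoted


# Hypothetical sample data, modelled on the module docstring.
renames = {('PAGE', 'some_page'): 'some page'}

print(convert_self_link(renames, 'some_page', 'some text'))
print(convert_self_link(renames, 'page', 'text'))
```

The second call shows the case from the docstring where the page was not renamed: the link is still rewritten to the new syntax, only the name stays the same.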
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
145 def attachment(self, target_and_text, **kw):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
146 """ This gets called on attachment URLs """
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
147 _ = self._
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
148 #self.request.log("attachment: target_and_text %s" % target_and_text)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
149 scheme, fname, text = wikiutil.split_wiki(target_and_text)
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
150 pagename, fname = AttachFile.absoluteName(fname, self.pagename)
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
151 from_this_page = pagename == self.pagename
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
152 fname = self._replace(('FILE', pagename, fname))
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
153 if '%20' in fname:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
154 fname = fname.replace('%20', ' ')
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
155 fname = self._replace(('FILE', pagename, fname))
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
156 pagename = self._replace(('PAGE', pagename))
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
157 if from_this_page:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
158 name = fname
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
159 else:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
160 name = "%s/%s" % (pagename, fname)
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
161 if ' ' in name:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
162 qname = wikiutil.quoteName(name)
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
163 else:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
164 qname = name
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
165
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
166 if text:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
167 text = ' ' + text
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
168 return "%s:%s%s" % (scheme, qname, text)
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
169
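The order of the steps in attachment above matters: the file rename is looked up under the old page name, %20 is replaced, the file is looked up again, and only then is the page renamed. The sketch below follows that order without MoinMoin. Its helpers are assumptions: split_attachment stands in for AttachFile.absoluteName and simply splits at the last slash, plain double quotes stand in for wikiutil.quoteName, and link text after the target is ignored.

```python
def split_attachment(fname, current_page):
    """Assumed stand-in for AttachFile.absoluteName: split 'Page/file' at the last slash."""
    if '/' in fname:
        pagename, fname = fname.rsplit('/', 1)
        return pagename, fname
    return current_page, fname


def convert_attachment(renames, current_page, target):
    scheme, fname = target.split(':', 1)
    pagename, fname = split_attachment(fname, current_page)
    from_this_page = pagename == current_page
    # 1. look up the file under its old page name
    fname = renames.get(('FILE', pagename, fname), fname)
    # 2. replace %20 and look up once more
    if '%20' in fname:
        fname = fname.replace('%20', ' ')
        fname = renames.get(('FILE', pagename, fname), fname)
    # 3. rename the page last
    pagename = renames.get(('PAGE', pagename), pagename)
    name = fname if from_this_page else '%s/%s' % (pagename, fname)
    # 4. quote only when the result contains a space
    if ' ' in name:
        name = '"%s"' % name
    return '%s:%s' % (scheme, name)


# Hypothetical sample data, modelled on the module docstring.
renames = {
    ('PAGE', 'some_page'): 'some page',
    ('FILE', 'some_page', 'with%20blank.txt'): 'with blank.txt',
}

print(convert_attachment(renames, 'FrontPage', 'attachment:some_page/with%20blank.txt'))
print(convert_attachment(renames, 'FrontPage', 'attachment:with%20blank.txt'))
```

Both docstring examples come out as expected, and a name without spaces is left unquoted.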
2603
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
170 def _interwiki_repl(self, word):
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
171 """Handle InterWiki links."""
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
172 # XXX if we have access to the cfg, we can limit this to actually existing interwiki identifiers
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
173 return self.interwiki("wiki:" + word)
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
174
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
175 def _url_repl(self, word):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
176 """Handle literal URLs including inline images."""
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
177 scheme = word.split(":", 1)[0]
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
178
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
179 if scheme == "wiki":
2603
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
180 return self.interwiki(word)
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
181
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
182 if scheme in self.attachment_schemas:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
183 return self.attachment(word)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
184
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
185 if wikiutil.isPicture(word):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
186 # Get image name http://here.com/dir/image.gif -> image
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
187 name = word.split('/')[-1]
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
188 name = ''.join(name.split('.')[:-1])
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
189 return word # self.formatter.image(src=word, alt=name)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
190 else:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
191 return word # word, scheme
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
192
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
193 def _wikiname_bracket_repl(self, text):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
194 """Handle special-char wikinames with link text, like:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
195 ["Jim O'Brian" Jim's home page] or ['Hello "world"!' a page with doublequotes]
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
196 """
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
197 word = text[1:-1] # strip brackets
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
198 first_char = word[0]
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
199 if first_char in wikiutil.QUOTE_CHARS:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
200 # split on closing quote
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
201 target, linktext = word[1:].split(first_char, 1)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
202 target = self._replace_target(target)
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
203 target = wikiutil.quoteName(target)
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
204 else: # not quoted
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
205 # split on whitespace
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
206 target, linktext = word.split(None, 1)
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
207 target = self._replace_target(target)
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
208 if ' ' in target:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
209 target = wikiutil.quoteName(target)
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
210 if linktext:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
211 linktext = ' ' + linktext
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
212 return '[%s%s]' % (target, linktext)
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
213
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
214
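The quote handling in `_wikiname_bracket_repl` above can be illustrated in isolation. This is a minimal sketch, not the converter's code: `QUOTE_CHARS` is an assumed stand-in for `wikiutil.QUOTE_CHARS`, `split_bracket_link` is a hypothetical helper name, and the rename lookup and re-quoting steps are left out. It is written for Python 3, whereas the annotated file targets Python 2.

```python
# Assumed stand-in for wikiutil.QUOTE_CHARS (single and double quote).
QUOTE_CHARS = "'\""


def split_bracket_link(text):
    """Split '[target link text]' into (target, linktext).

    Hypothetical helper: mirrors only the splitting step of the
    method above, without renaming or re-quoting the target.
    """
    word = text[1:-1]  # strip the surrounding brackets
    first_char = word[0]
    if first_char in QUOTE_CHARS:
        # Quoted target: everything up to the matching closing quote.
        target, linktext = word[1:].split(first_char, 1)
    else:
        # Unquoted target: everything up to the first whitespace run.
        target, linktext = word.split(None, 1)
    return target, linktext.strip()
```

Because the target is split on the same quote character that opened it, a target quoted with `"` may contain `'` and vice versa. Like the method above, the sketch assumes that link text is present.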
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
215 def _url_bracket_repl(self, word):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
216 """Handle bracketed URLs."""
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
217 word = word[1:-1] # strip brackets
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
218
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
219 # Local extended link? [:page name:link text]
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
220 if word[0] == ':':
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
221 words = word[1:].split(':', 1)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
222 words[0] = self._replace_target(words[0])
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
223 if len(words) == 1:
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
224 link = words[0]
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
225 link = wikiutil.quoteName(link)
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
226 return '[%s]' % link # use freelink
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
227 else:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
228 link, text = words
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
229 link = wikiutil.quoteName(link)
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
230 if text:
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
231 text = ' ' + text
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
232 return '[%s%s]' % (link, text) # use freelink with text
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
233
2603
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
234 scheme_and_rest = word.split(":", 1)
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
235 if len(scheme_and_rest) == 2: # scheme given
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
236 scheme, rest = scheme_and_rest
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
237 if scheme == "wiki":
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
238 return self.interwiki(word, pretty_url=1)
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
239 if scheme in self.attachment_schemas:
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
240 return self.attachment(word, pretty_url=1)
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
241
2603
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
242 words = word.split(None, 1)
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
243 if len(words) == 1:
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
244 link, text = words[0], ''
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
245 else:
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
246 link, text = words
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
247 if text:
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
248 text = ' ' + text
c61c10e3fcde 1.6 converter: improve content conversion, more tests
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2602
diff changeset
249 return '[%s%s]' % (link, text)
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
250
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
251 # SCANNING ---------------------------------------------------------------
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
252 def scan(self, scan_re, line):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
253 """ Scan one line.
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
254
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
255 Append text before match, invoke replace() with match, and add text after match.
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
256 """
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
257 result = []
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
258 lastpos = 0
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
259
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
260 for match in scan_re.finditer(line):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
261 # Add text before the match
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
262 if lastpos < match.start():
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
263 result.append(line[lastpos:match.start()])
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
264 # Replace match with markup
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
265 result.append(self.replace(match))
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
266 lastpos = match.end()
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
267
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
268 # Add remainder of the line
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
269 result.append(line[lastpos:])
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
270 return u''.join(result)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
271
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
272 def replace(self, match):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
273 """ Replace match using type name """
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
274 result = []
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
275 for _type, hit in match.groupdict().items():
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
276 if hit is not None and _type not in ["hmarker", ]:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
277 # Get replace method and replace hit
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
278 replace = getattr(self, '_' + _type + '_repl')
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
279 result.append(replace(hit))
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
280 return ''.join(result)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
281 else:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
282 # We should never get here
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
283 import pprint
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
284 raise Exception("Can't handle match %r\n%s\n%s" % (
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
285 match,
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
286 pprint.pformat(match.groupdict()),
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
287 pprint.pformat(match.groups()),
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
288 ))
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
289
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
290 return ""
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
291
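The `scan()` / `replace()` pair above dispatches on the name of the regex group that matched: a group called `url` is handled by `_url_repl`, and so on. The following is a minimal sketch of that pattern under stated assumptions. `MiniConverter`, its two toy rules and their handlers are hypothetical and not part of MoinMoin. Unlike the listing, it returns after the first non-empty group rather than collecting all of them. It is written for Python 3.

```python
import re


class MiniConverter:
    # Two toy rules joined with '|'; each named group selects a handler.
    rules = re.compile(r"(?P<upper>[A-Z]{2,})|(?P<digits>\d+)")

    def _upper_repl(self, hit):
        return hit.lower()

    def _digits_repl(self, hit):
        return "<%s>" % hit

    def replace(self, match):
        # Look up the handler by group name: '_' + name + '_repl'.
        for name, hit in match.groupdict().items():
            if hit is not None:
                return getattr(self, "_" + name + "_repl")(hit)
        raise ValueError("unhandled match: %r" % (match.groupdict(),))

    def scan(self, line):
        result = []
        lastpos = 0
        for match in self.rules.finditer(line):
            # Copy the text between the previous match and this one.
            result.append(line[lastpos:match.start()])
            result.append(self.replace(match))
            lastpos = match.end()
        # Copy whatever follows the last match.
        result.append(line[lastpos:])
        return "".join(result)
```

With this structure, adding a markup rule means adding one named group to the pattern and one `_<name>_repl` method; the scanning loop itself does not change.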
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
292 def convert(self):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
293 """ For each line, scan through looking for magic
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
294 strings, outputting verbatim any intervening text.
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
295 """
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
296 # prepare regex patterns
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
297 rules = self.formatting_rules.replace('\n', '|')
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
298 if 1: # self.cfg.bang_meta:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
299 rules = ur'(?P<notword>!%(word_rule)s)|%(rules)s' % {
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
300 'word_rule': self.word_rule,
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
301 'rules': rules,
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
302 }
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
303 pre_rules = self.pre_formatting_rules.replace('\n', '|')
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
304 scan_re = re.compile(rules, re.UNICODE)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
305 pre_scan_re = re.compile(pre_rules, re.UNICODE)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
306 eol_re = re.compile(r'\r?\n', re.UNICODE)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
307
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
308 rawtext = self.raw
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
309
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
310 # remove last item because it's guaranteed to be empty
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
311 self.lines = eol_re.split(rawtext)[:-1]
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
312 self.in_processing_instructions = 1
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
313
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
314 # Main loop
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
315 for line in self.lines:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
316 # copy leading processing instructions to the output unchanged
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
317 if self.in_processing_instructions:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
318 found = False
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
319 for pi in ("##", "#format", "#refresh", "#redirect", "#deprecated",
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
320 "#pragma", "#form", "#acl", "#language"):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
321 if line.lower().startswith(pi):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
322 self.request.write(line + '\r\n')
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
323 found = True
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
324 break
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
325 if not found:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
326 self.in_processing_instructions = 0
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
327 else:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
328 continue # do not parse this line
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
329 if self.in_pre:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
330 # still looking for a parser line ("#!") at the start of the pre section
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
331 if self.in_pre == 'search_parser':
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
332 if line.strip().startswith("#!"):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
333 self.in_pre = True
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
334 self.request.write(line + '\r\n')
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
335 continue
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
336 else:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
337 self.in_pre = True
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
338 else:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
339 # Paragraph break on empty lines
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
340 if not line.strip():
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
341 self.request.write(line + '\r\n')
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
342 continue
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
343
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
344 # Scan line, format and write
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
345 scanning_re = self.in_pre and pre_scan_re or scan_re
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
346 formatted_line = self.scan(scanning_re, line)
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
347 self.request.write(formatted_line + '\r\n')
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
348
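The main loop of `convert()` above treats the leading run of processing-instruction lines as a header that is copied through verbatim; the first line that is not a processing instruction ends the header for good. A minimal sketch of just that step follows. `split_header` is a hypothetical helper name and the prefix tuple is copied from the listing. It is written for Python 3.

```python
# Prefixes copied from the loop in convert() above.
PI_PREFIXES = ("##", "#format", "#refresh", "#redirect", "#deprecated",
               "#pragma", "#form", "#acl", "#language")


def split_header(lines):
    """Return (header_lines, body_lines).

    The header is the leading run of lines that start with a
    processing-instruction prefix (case-insensitive). A later line
    that looks like a PI belongs to the body.
    """
    for i, line in enumerate(lines):
        # str.startswith accepts a tuple of candidate prefixes.
        if not line.lower().startswith(PI_PREFIXES):
            return lines[:i], lines[i:]
    return lines, []
```

In the listing the same effect is achieved with the `in_processing_instructions` flag, which is cleared once and never set again.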
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
349 def convert_wiki(pagename, intext, renames):
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
350 """ Convert content written in wiki markup """
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
351 import StringIO
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
352 request = StringIO.StringIO()
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
353 noeol = False
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
354 if not intext.endswith('\r\n'):
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
355 intext += '\r\n'
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
356 noeol = True
2602
b601db2e4d34 1.6 converter: improve content conversion, add test for it
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 2599
diff changeset
357 p = Converter(request, pagename, intext, renames)
2599
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
358 p.convert()
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
359 res = request.getvalue()
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
360 if noeol:
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
361 res = res[:-2]
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
362 return res
13f0331f3a42 1.6 converter: add content conversion (unfinished), cleanup
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
363
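The end-of-line bookkeeping in `convert_wiki` and `convert()` above can be checked in isolation. This is a sketch under stated assumptions: `roundtrip` is a hypothetical function, and `convert_line` stands in for the per-line scanning. The input is padded with `\r\n` if needed, split on `\r?\n`, each line is written back with `\r\n`, and the padding is removed again at the end. It is written for Python 3.

```python
import re

EOL_RE = re.compile(r"\r?\n")


def roundtrip(intext, convert_line=lambda line: line):
    """Apply convert_line to every line, preserving a missing final EOL."""
    noeol = not intext.endswith("\r\n")
    if noeol:
        intext += "\r\n"
    # The text now ends with an EOL, so the last split item is empty.
    lines = EOL_RE.split(intext)[:-1]
    out = "".join(convert_line(line) + "\r\n" for line in lines)
    if noeol:
        out = out[:-2]  # drop the EOL that was added above
    return out
```

One consequence of this scheme is that bare `\n` line endings in the input come out as `\r\n`.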