annotate MoinMoin/action/sitemap.py @ 5910:7e7e1cbb9d3f

security: fix remote code execution vulnerability in twikidraw/anywikidraw actions

We have wikiutil.taintfilename() to make user supplied filenames safe, so that they can't contain any "special" characters like path separators, etc. It is used at many places in moin, but wasn't used here. :|
author Thomas Waldmann <tw AT waldmann-edv DOT de>
date Sat, 29 Dec 2012 15:05:29 +0100
parents 87d97510de79
children
rev   line source
871
8ad0dad8c515 added google sitemap action, missing file (ported from 1.5)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents:
diff changeset
1 # -*- coding: iso-8859-1 -*-
2 """
3 MoinMoin - "sitemap" action
4
5 Generate a URL list of all your pages (using google's sitemap XML format).
6
4122
25902b15fcce fixing urls given by sitemap action, if the wiki does not run in the root url of the site (forward port of forgotten 1.5 fix)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 3234
diff changeset
7 @copyright: 2006-2008 MoinMoin:ThomasWaldmann
871
8 @license: GNU GPL, see COPYING for details.
9 """
873
5019723cb7d4 improved google sitemap action (ported from 1.5)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 871
diff changeset
10 import time
11 from MoinMoin import wikiutil
871
12
873
13 datetime_fmt = "%Y-%m-%dT%H:%M:%S+00:00"
14
15 def now():
16     return time.strftime(datetime_fmt, time.gmtime())
17
4122
18 def make_url_xml(request, vars):
873
19     """ assemble a single <url> xml fragment """
4122
20     # add protocol:server - url must be complete path starting with/from /
21     vars['url'] = request.getQualifiedURL(vars['url'])
873
22     return """\
23 <url>
4122
24 <loc>%(url)s</loc>
873
25 <lastmod>%(lastmod)s</lastmod>
26 <changefreq>%(changefreq)s</changefreq>
27 <priority>%(priority)s</priority>
28 </url>
29 """ % vars
949
cbbde07e00c4 whitespace-only cleanup, small style changes
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 873
diff changeset
30
4122
31 def sitemap_url(request, page):
873
32     """ return a sitemap <url>..</url> fragment for page object <page> """
3234
a739558ca3dc Page.url() default changed to relative=False
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 1918
diff changeset
33     url = page.url(request)
873
34     pagename = page.page_name
35     lastmod = page.mtime_printable(request)
36     if lastmod == "0": # can happen in case of errors
37         lastmod = now()
38
39     # page's changefreq, priority and lastmod depends on page type / name
40     if pagename in [u"RecentChanges", u"TitleIndex", ]:
41         # important dynamic pages with macros
42         changefreq = "hourly"
43         priority = "0.9"
44         lastmod = now() # the page text mtime never changes, but the macro output DOES
45
46     elif pagename in [request.cfg.page_front_page, ]:
47         # important user edited pages
48         changefreq = "hourly"
49         priority = "1.0"
50
51     elif wikiutil.isSystemPage(request, pagename):
52         # other system pages are rather boring
53         changefreq = "yearly"
54         priority = "0.1"
55
56     else:
57         # these are the content pages:
58         changefreq = "daily"
59         priority = "0.5"
60
4122
61     return make_url_xml(request, locals())
871
62
63 def execute(pagename, request):
64     _ = request.getText
873
65     request.user.datetime_fmt = datetime_fmt
871
66
4579
87d97510de79 getScriptname() -> script_root, getBaseURL() -> url_root (for werkzeug API)
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 4571
diff changeset
67     request.mimetype = 'text/xml'
871
68
873
69     # we emit a piece of data so other side doesn't get bored:
70     request.write("""<?xml version="1.0" encoding="UTF-8"?>\r\n""")
871
71
873
72     result = []
4122
73     result.append("""<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">\n""")
949
74
4122
75     # we include the root url as an important and often changed URL
4579
76     rooturl = request.script_root + '/'
4122
77     result.append(make_url_xml(request, {
78         'url': rooturl,
873
79         'lastmod': now(), # fake
80         'changefreq': 'hourly',
81         'priority': '1.0',
82     }))
871
83
84     # Get page dict readable by current user
1847
c935a4f09f90 allow excluding underlay from sitemap by adding underlay=0 as a parameter
Johannes Berg <johannes AT sipsolutions DOT net>
parents: 1664
diff changeset
85     try:
4424
5ad5753ae311 pre-1.9: request.form has qs args and post data, 1.9: .form only post data, .args only qs args, .values both
Thomas Waldmann <tw AT waldmann-edv DOT de>
parents: 4327
diff changeset
86         underlay = int(request.values.get('underlay', 1))
1847
87     except ValueError:
88         underlay = 1
89     pages = request.rootpage.getPageDict(include_underlay=underlay)
871
90     pagelist = pages.keys()
91     pagelist.sort()
92     for name in pagelist:
4122
93         result.append(sitemap_url(request, pages[name]))
871
94
873
95     result.append("""</urlset>\n""")
871
96
873
97     result = "".join(result)
98     result = result.replace("\n", "\r\n") # text/* requires CR/LF
871
99
873
100     # emit all real data
101     request.write(result)
102
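The tiering in sitemap_url() and the fragment assembly in make_url_xml() can be tried outside a running wiki. The following is a minimal standalone sketch in Python 3, not part of the annotated file. The names classify() and url_fragment(), the front_page default, and the system_pages set (a stand-in for wikiutil.isSystemPage) are assumptions made for illustration, and the call to escape() is an addition that the listing above does not perform.

```python
import time
from xml.sax.saxutils import escape

# Same timestamp layout as datetime_fmt in the listing (UTC, fixed +00:00 offset).
DATETIME_FMT = "%Y-%m-%dT%H:%M:%S+00:00"


def now():
    """Return the current UTC time formatted for a <lastmod> element."""
    return time.strftime(DATETIME_FMT, time.gmtime())


def classify(pagename, front_page="FrontPage", system_pages=frozenset()):
    """Return (changefreq, priority) using the same four tiers as sitemap_url().

    front_page and system_pages are hypothetical parameters; the real action
    reads request.cfg.page_front_page and calls wikiutil.isSystemPage().
    """
    if pagename in ("RecentChanges", "TitleIndex"):
        return "hourly", "0.9"  # dynamic pages driven by macros
    if pagename == front_page:
        return "hourly", "1.0"  # the most important user-edited page
    if pagename in system_pages:
        return "yearly", "0.1"  # other system pages rarely change
    return "daily", "0.5"       # ordinary content pages


def url_fragment(url, lastmod, changefreq, priority):
    """Assemble one <url> element; the URL is XML-escaped (an assumption)."""
    return (
        "<url>\n"
        "<loc>%s</loc>\n"
        "<lastmod>%s</lastmod>\n"
        "<changefreq>%s</changefreq>\n"
        "<priority>%s</priority>\n"
        "</url>\n"
    ) % (escape(url), lastmod, changefreq, priority)


if __name__ == "__main__":
    freq, prio = classify("SomePage")
    print(url_fragment("http://example.org/wiki/SomePage", now(), freq, prio))
```

The order of the checks matters in the same way as in the listing: RecentChanges and TitleIndex are tested before the system-page check, so they keep their high priority even though they would also count as system pages.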