sitemap_generator

Description

A Python script that creates a simple sitemap: it walks a source directory, translates each file path into a URL under the site URL, and applies filters to include or exclude files and folders. The output follows the sitemaps.org schema.
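The directory-to-URL translation described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names collect_urls and build_sitemap, their parameters, and the example.com values are assumptions made for the example, and lastmod handling is left out.

```python
import os
import xml.etree.ElementTree as ET


def collect_urls(path_site, site_url, extensions=("php",),
                 ignore_files=("error.php", ".htaccess", "config.php"),
                 ignore_folders=(".svn", "imgs", "src")):
    """Walk path_site and return the site URLs of the accepted files.

    Hypothetical helper: the real script may filter differently.
    """
    urls = []
    for root, dirs, files in os.walk(path_site):
        # Prune ignored folders in place so os.walk does not descend into them.
        dirs[:] = sorted(d for d in dirs if d not in ignore_folders)
        for name in sorted(files):
            if name in ignore_files:
                continue
            # Compare the extension without its leading dot.
            ext = os.path.splitext(name)[1].lstrip(".")
            if ext not in extensions:
                continue
            # Path relative to the site root, with URL-style separators.
            rel = os.path.relpath(os.path.join(root, name), path_site)
            urls.append(site_url + rel.replace(os.sep, "/"))
    return urls


def build_sitemap(urls, priority=0.5, frequency="monthly"):
    """Render the URLs as a sitemaps.org <urlset> document (as a string)."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "changefreq").text = frequency
        ET.SubElement(entry, "priority").text = str(priority)
    return ET.tostring(urlset, encoding="unicode")
```

Calling build_sitemap(collect_urls("/var/www/example/", "http://example.com/")) would return the XML as a string, which could then be written to sitemap.xml.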

Most configuration takes place inside the script file itself, but the script also accepts some basic command-line arguments if you prefer.

Usage

If you use the configuration from the file, just run sitemap_generator.py. To use command-line arguments instead, append them to that command; the settings and their flags are listed below.

PATH_SITE
source path, must include / at the end; also accessible from the CLI with -f
SITE_URL
site URL, must include the final /; also accessible from the CLI with -s
EXTENSIONS_ACCEPTED
a tuple, like ('ext1', 'ext2'), default ('php')
SITEMAP_FILE
path and file name with extension (default sitemap.xml, saved in the directory where the script runs); also accessible from the CLI with -xml
IGNORE_FILES
a tuple, like ('error.php', '.htaccess', 'config.php') (default values)
IGNORE_FOLDERS
a tuple, like ('.svn', 'imgs', 'src') (default values)
PRIORITY
a number from 0.0 to 1.0; also accessible from the CLI with -p. Default value: 0.5
LASTMOD
boolean; whether to check the files' modification times; also accessible from the CLI with -m. Default False.
FREQUENCY
a string; also accessible from the CLI with -freq. Default 'monthly'. Accepted values: 'always', 'hourly', 'daily', 'weekly', 'monthly', 'yearly', 'never'.
ROBOTS
boolean. Set to True to create a new robots.txt file with the sitemap value; also accessible from the CLI with -r. Default False.
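As an illustration, the settings above could look like this at the top of the script, together with one way the listed flags might override them. This is only a sketch: the path and URL are placeholder values, parse_args is a hypothetical helper, and treating -m and -r as value-less switches is an assumption, so check the script itself for the exact behaviour.

```python
import argparse

# Placeholder values: replace with your own path and URL (both must end in /).
PATH_SITE = "/var/www/example/"
SITE_URL = "http://example.com/"

# Documented defaults. Note the trailing comma: in Python, ('php') without it
# is a plain string, not a one-element tuple.
EXTENSIONS_ACCEPTED = ("php",)
SITEMAP_FILE = "sitemap.xml"
IGNORE_FILES = ("error.php", ".htaccess", "config.php")
IGNORE_FOLDERS = (".svn", "imgs", "src")
PRIORITY = 0.5
LASTMOD = False
FREQUENCY = "monthly"
ROBOTS = False

# Values accepted for FREQUENCY, as listed above.
FREQUENCIES = ("always", "hourly", "daily", "weekly",
               "monthly", "yearly", "never")


def parse_args(argv=None):
    """Hypothetical parser: each flag falls back to the setting in the file."""
    parser = argparse.ArgumentParser(
        description="Override sitemap settings from the command line.")
    parser.add_argument("-f", dest="path_site", default=PATH_SITE)
    parser.add_argument("-s", dest="site_url", default=SITE_URL)
    parser.add_argument("-xml", dest="sitemap_file", default=SITEMAP_FILE)
    parser.add_argument("-p", dest="priority", type=float, default=PRIORITY)
    # Assumption: -m and -r are switches that take no value.
    parser.add_argument("-m", dest="lastmod", action="store_true",
                        default=LASTMOD)
    parser.add_argument("-freq", dest="frequency", choices=FREQUENCIES,
                        default=FREQUENCY)
    parser.add_argument("-r", dest="robots", action="store_true",
                        default=ROBOTS)
    return parser.parse_args(argv)
```

For example, parse_args(["-s", "http://example.org/", "-freq", "daily", "-m"]) would override the site URL and frequency, enable modification times, and leave every other setting at its default from the file.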

Download