reimeika

introduction

koi is a content management system (CMS) written in python (3.8.2+) using the bottle microframework.

The koi CMS uses JSON files to store information. It can be fully curated and managed from the back-end command-line, but for convenience it also features a simple web-based markdown article editor which can additionally manipulate image galleries (an anti-CSRF mechanism is built in, as are anti-XSS measures). Search functionality is included, as is multi-user support with optional email- and TOTP-based multi-factor authentication. koi also features per-file access control, meaning that access can be restricted not only to web pages but also to any and all files, such as JPG images and PDF documents, even when a file is downloaded via a direct link to it.

To use koi to its fullest extent from the start the following should be run (as root, tested on Ubuntu 20.04/22.04):

apt install python3-markdown python3-whoosh python3-passlib python3-bleach python3-pyqrcode \
python3-pil python3-natsort python3-ldap

local installation

koi can be downloaded from reimeika.ca. Unzip the file koi.zip and type cd koi. The file config.py contains detailed explanations of all configuration options and should be reviewed. In particular session_cookie_sig must be set. Once this is done, running ./koi.py will make this tutorial accessible at http://localhost:8080.

serving files

By default web pages are in the pages directory. To create a new web page first make a sub-directory test (a page in koi terminology, also traditionally called a slug) and copy a file into it, e.g. from inside the koi directory:

mkdir pages/test
cp logo.txt pages/test

The file is now located at http://localhost:8080/pages/test/logo.txt but clicking the link will complain about a missing .koi file and return a 403 error. In order to serve requests every page must have an associated template in the dir_templates directory. Which template is used is determined by the name of a JSON .koi file inside the web page directory so that, for example, article.koi will use the template file article.tpl. The most basic template is files.tpl and the simplest document possible is storing the string {} into a file called files.koi within test/ as so:

echo '{}' > pages/test/files.koi

Once this is done the files.tpl template will be used to serve the logo.txt file which can now be retrieved by clicking the link above. Note that, for extra security, the directive force_acl can be set to True in config.py, in which case all pages (other than login.koi) must have a properly-defined ACL if they are to be served. If that's the case the above command should be replaced by:

echo '{"acl": {"files.koi": {"users": "*", "groups": [], "ips": "*", "time": 0}}}' > pages/test/files.koi
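
The same file can be produced from Python, which avoids shell-quoting mistakes when the ACL grows. This is a minimal sketch: the helper name write_files_koi is hypothetical (it is not part of koi), and the ACL layout is copied from the shell command above.

```python
import json
from pathlib import Path


def write_files_koi(page_dir, open_acl=False):
    """Create files.koi inside page_dir and return its path.

    Hypothetical helper, not part of koi. With open_acl=True it writes the
    same wide-open ACL as the shell example; otherwise an empty JSON object.
    """
    data = {}
    if open_acl:
        data["acl"] = {
            "files.koi": {"users": "*", "groups": [], "ips": "*", "time": 0}
        }
    page = Path(page_dir)
    page.mkdir(parents=True, exist_ok=True)
    koi_fp = page / "files.koi"
    koi_fp.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")
    return koi_fp
```

Calling write_files_koi("pages/test", open_acl=True) from inside the koi directory would be the equivalent of the echo command above.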

creating a web page

Although the files inside test can now be served, the web page http://localhost:8080/pages/test itself is empty. The files.tpl template can display the contents of a key called body, so replacing the empty JSON object with {"body": "Hello world!"} in files.koi will now show a "Hello world!" web page at the above link:

echo '{"body": "Hello world!"}' >| pages/test/files.koi

Editing JSON files is not very practical and so koi includes a simple web-based markdown article editor. koi articles can also be created/edited from the command-line using the koiedit.py script by simply creating a directory (i.e. page/slug) with the appropriate apache server access permissions and running (if in the koi top-level directory):

mkdir pages/<slug>
./koiedit.py pages/<slug>/article.koi
./koiacl.py pages/<slug>/article.koi

Also included is the html2koi.py script, which converts existing HTML files into .koi format.

searching

The default search engine is powered by whoosh, a non-standard but readily-available module which might have to be installed (as root), along with markdown in order to parse most pages:

apt install python3-markdown python3-whoosh

Main features of the search function are:

query words can be separated by AND (the default), OR, NOT, ANDNOT and ANDMAYBE
fuzzy queries e.g. grafiti~ will find graffiti
phrase search using double quotes
field searches on title, body, keywords, author, and creator

A complex search could be crafted as follows:

author:yuma OR grafiti~ ANDNOT title:lavender

The index is dynamically updated any time a new article or page is added, deleted, or modified. Note that both this and the simple search engine (see below) filter their results based on access control lists (ACLs) so that matches of restricted pages are not shown unless the user (or visitor) performing the search has access to them.

Only the title, body, keywords, author, and creator fields of .koi files are indexed. Thus, a text file upload paper.txt will not be found in a search even if a keyword matches within. Furthermore, the index will only be updated if a .koi file timestamp field has been properly set. The search index can be regenerated at any time by deleting the cache (stored in search/.index by default).

adding users

The basic functionality for creating websites through the back-end is to simply write templates and add content via their corresponding .koi JSON file. A simpler approach is to use the built-in article editor. However, in order to do so users must be added to the system. Note that this requires passlib so running apt install python3-passlib may be necessary, and if using LDAP then apt install python3-ldap.

For increased security all user management is deliberately done from the back-end, although a template could in principle be written to accomplish this through a web page. Running the script koiaccts.py offers a simple menu-driven command-line interface which allows, amongst other things, searching, adding, deleting, listing, and modifying user accounts:

# ./koiaccts.py

koi account manager (use "-h" for non-interactive usage)
========================================================

0) Search
1) Add account
2) Remove account
3) Modify account
4) List users/account
5) Batch import
6) Exit

Of note is the fact that the hashing algorithm is compatible with the Linux /etc/shadow file, and thus offers the possibility of importing existing user accounts (though only SHA512 and yescrypt are allowed). User accounts are stored in JSON files inside the directory specified by dir_accounts_fp in config.py so it's important to review this setting before proceeding. The following settings should also be revised in config.py: wwwuid and wwwgid, corresponding to the user the web server runs as, and wwwfperms and wwwdperms, corresponding to the permissions files and directories should have so that the web server user (and preferably no one else) can access them (note that changing ownership will only work if running the script as the superuser). The koiaccts.py script can also be run non-interactively as shown below.

# ./koiaccts.py -h
usage: koiaccts.py [-h] [-a] [-d] [-e] [-xe] [-fn FIRSTNAME]
                   [-g [GROUP [GROUP ...]]] [-G [AGROUP [AGROUP ...]]]
                   [-ln LASTNAME] [-L] [-m TWOF_EMAIL] [-mn MIDDLENAME]
                   [-n NAME] [-o] [-p PASSWORD] [-pa PRIMARY_ADDRESS]
                   [-pe PRIMARY_EMAIL] [-pp PRIMARY_PHONE] [-q]
                   [-s [SHOW [SHOW ...]]] [-t] [-xt] [-u USER] [-U] [-w]

optional arguments:
  -h, --help            show this help message and exit
  -a, --add             add new user
  -d, --delete          delete user (will back-up account)
  -e, --editor_on       enable editor role
  -xe, --editor_off     disable editor role
  -fn FIRSTNAME, --firstname FIRSTNAME
                        user's first name
  -g [GROUP [GROUP ...]], --group [GROUP [GROUP ...]]
                        set groups (space-separated list, none to empty)
  -G [AGROUP [AGROUP ...]], --agroup [AGROUP [AGROUP ...]]
                        add groups (space-separated list)
  -ln LASTNAME, --lastname LASTNAME
                        user's last name
  -L, --lock            lock account
  -m TWOF_EMAIL, --twof_email TWOF_EMAIL
                        set two-factor email address
  -mn MIDDLENAME, --middlename MIDDLENAME
                        user's middle name(s)
  -n NAME, --name NAME  set name (construct using "fn"/"mn"/"ln" otherwise)
  -o, --overview        overview of all accounts
  -p PASSWORD, --password PASSWORD
                        set password (using hash)
  -pa PRIMARY_ADDRESS, --primary_address PRIMARY_ADDRESS
                        set user's primary address
  -pe PRIMARY_EMAIL, --primary_email PRIMARY_EMAIL
                        set user's primary email
  -pp PRIMARY_PHONE, --primary_phone PRIMARY_PHONE
                        set user's primary phone
  -q, --quiet           quiet mode
  -s [SHOW [SHOW ...]], --show [SHOW [SHOW ...]]
                        show profile keys (list, none for all)
  -t, --twof_on         enable multi-factor
  -xt, --twof_off       disable multi-factor
  -u USER, --user USER  user name
  -U, --unlock          unlock account
  -w, --wipe            wipe out user account
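
Regarding the /etc/shadow compatibility mentioned earlier, it can help to check beforehand which entries use an allowed scheme. The sketch below relies only on the standard crypt prefixes ($6$ for SHA512, $y$ for yescrypt). The function name is hypothetical, and the input format koiaccts.py expects for batch imports is not described here, so this only filters candidates.

```python
# Standard crypt(5) prefixes for the two schemes koi is said to accept.
ALLOWED_PREFIXES = {"$6$": "sha512crypt", "$y$": "yescrypt"}


def importable_accounts(shadow_text):
    """Yield (user, hash, scheme) for shadow entries with an allowed scheme.

    Hypothetical helper, not part of koi. Locked or password-less entries
    (hash field "!", "*", or empty) are skipped because they match no prefix.
    """
    for line in shadow_text.splitlines():
        fields = line.split(":")
        if len(fields) < 2:
            continue
        user, hashed = fields[0], fields[1]
        for prefix, scheme in ALLOWED_PREFIXES.items():
            if hashed.startswith(prefix):
                yield user, hashed, scheme
                break
```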

Adding an account is straightforward and can be done following the steps set by the script. User names are case-insensitive and restricted by the user_re configuration setting, which by default allows alphanumeric strings and email addresses in any language (so that "AIKA", "sora@remeika.ca", and "ゆま" are all valid, albeit not necessarily a prudent mix). If desired an email for two-factor authentication may be set, but care must be taken to configure the format of the message (twoF_msg) and SMTP settings in config.py. Finally, starting with koi 0.40 LDAP authentication is supported via config.py, bypassing the local password database.

Listing a user's profile via koiaccts.py shows the basic structure of the account and some information about their last login, logout, IP used, "locked" status, roles, and groups they belong to. Note that "roles" consists of lists that can be assigned on a per-template basis, and so having the "editor" role for the "edit" template allows authoring articles via the web browser as explained later on.

multi-factor authentication

koi supports two second-factor authentication mechanisms: one via email and one using a time-based one-time password (TOTP) authenticator (such as Google Authenticator or Authy). Both work the same way: the user obtains a numeric token (or nonce, six digits by default), either by email or from the authenticator, which must be typed in after login credentials have been submitted. The methods can be used separately or in tandem (in which case both tokens must be supplied). Since koi has no support for emailing password resets, email may be a satisfactory second factor, but if such functionality is added (by writing a template) then TOTP is the only sensible choice. Creating a TOTP key can be done by running koiaccts.py interactively and selecting:

3) Modify account -> 6) Set/render two-factor TOTP (time-based one-time password)

at which point a QR image is generated (on the terminal), together with a non-graphical code in case the user does not have a camera.
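
For reference, the six-digit code an authenticator app displays can be reproduced with the standard library alone. This is a sketch of the generic RFC 6238 algorithm with the common defaults (HMAC-SHA1, 30-second step); it is not koi's own code, and koi's actual parameters are whatever its TOTP dictionary specifies.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, at=None, step=30, digits=6):
    """Return the RFC 6238 TOTP code for a Base32 secret (HMAC-SHA1)."""
    # Authenticator secrets are Base32; restore any stripped padding.
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded)
    # The moving factor is the number of time steps since the epoch.
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): take 31 bits at a digest-chosen offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Because the code depends only on the shared secret and the clock, the server and the authenticator must keep reasonably synchronized time.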

Other mitigation measures against brute-force login attempts can be implemented through both rate-limiting and attempt-limiting throttlers. For details refer to the settings throttle_delay, throttle_attempts, and throttle_lockout detailed in config.py.

editing articles

By default koi has no users at all. Assuming a user has been created as described above they need to have an "editor" role to edit articles (this can be toggled via koiaccts.py). Note that the editor requires the bleach module to operate so apt install python3-bleach may be necessary.

Once a user has been created they can log in at http://localhost:8080/pages/login. After authenticating, users can click on the editor button (which is only shown to editors) to create a new page or see a listing of the pages they can edit and delete (hovering over an entry provides details about the web page). Only pages using the article.tpl or gallery.tpl templates are supported.

The editor is mostly self-explanatory. Articles are written in markdown (help for which is available from the collapsed section at the bottom of the editing page) and consist of a title, a slug (see below), a space-separated list of keywords, and a body (most elements will provide a helpful pop-up if hovered on). Controls for editing an article are:

webpage
opens the current web page in a new tab (must be refreshed after saving edits)
files
file manager to review and delete files, and also manage ACLs (see below)
uploader
for uploading files onto the web page
articles
to return to the article listing
save
to save the current article

Files (including images) can be linked and embedded in web pages using markdown syntax (the file manager provides the link/embed code for each file which can be copied and pasted into the article).

By default web pages are stored in directories named after the creation time-stamp e.g. /pages/1593798867, but can be renamed using the slug field after first saved. All article revisions are stored in hidden files as .article.koi-rev# (where rev# is the revision number) within the page directory, so it's feasible to recover prior versions (though only from the back-end). Deleted articles and their files are also backed-up as hidden directories (e.g. /pages/.name-rev#-timestamp), although individually-deleted files are permanently removed.

access control lists (acls)

Articles and galleries have two types of restrictions: who can edit/curate them and who can view them. Newly-created articles and galleries can only be modified and viewed by their original author, but other editors can be added by clicking on the ▸ Access control list for <name> (or ▸ gallery ACL in grid view) expandable section and modifying the list of users who can edit the page (note that it's impossible to remove oneself). Adding a wildcard * allows any logged-in user to edit the article/gallery. ACL controls for articles and galleries are equivalent, so any reference to articles below also applies to galleries.

The access control list for viewing articles can restrict access to web pages and files on a per-user, per-IP, and date-time basis (per-group is also supported but only through the back-end). This limits who can view a web page or download a file when clicking on a link. By default new articles can only be accessed by the user who originally wrote them, from any IP, starting from the creation date-time. To make an article universally available it suffices to set the user ACL to *, the IP ACL to *, and leave the timestamp as-is. Other ACL features are:

Adding a * to the user list will give access permissions to all logged-in users
Subnets (possibly in combination with an IP address list) can be specified as xxx.yyy
Setting a future release date will only allow access from that date-time onward
To block, prepend ! to a user or IP/subnet (overrides any conflicting allow directive)

Access control lists can be manipulated in exactly the same way from the file manager on a per-file basis (by default uploaded files inherit the ACL of the web page). Galleries only offer a per-image ACL via the back-end.

trusted articles, making forms

To thwart XSS attacks user input is sanitized using bleach and context-based allowlists, and further escaped upon display unless used in an HTML context. This, however, strips most HTML code, which is sometimes undesirable. koi supports the concept of trusted articles which allow full HTML editing using the web editor. This setting can be toggled via the back-end by changing trusted to True in the .koi file, which in turn will add the tag "(trusted)" next to the ACL section near the top of the article editor. It is strongly encouraged that only select editors be allowed to modify such articles (since in this mode JavaScript injections become possible).

Forms can then be included in templates as well as trusted .koi files (or .html files converted using html2koi.py). koi provides an anti-CSRF measure via a token which is tied to the user session and which must be used when making a submission (in fact, koi forbids logged-in users to submit any data, including uploads, unless a valid anti-CSRF token is present).

For example, the code to add a search field to an article is:

<form action="/pages/search" method="post">
    {{!×CSRF}}
    <input name="search_query" size="12" type="text">
    <button class="button" type="submit">search</button>
</form>

depending on how various variables have been defined in config.py i.e. search_query is actually the value of CONFIG['search_var'] (note that "x" has been replaced with "×" for the anti-CSRF variable name, otherwise it'd be replaced in the snippet above). The anti-CSRF measure must be coded in the template, which in the case of article.tpl is done as so:

% if user := PROFILE.get('user', ''):
%   xCSRF = f'<input name="xCSRF" type="hidden" value="{PROFILE["xCSRF"]}">'
% else:
%   xCSRF = ''
% end
% content = PAGE['body'].replace('{{!×CSRF}}', xCSRF)
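
The token mechanism behind this can be sketched in plain Python: a random token is stored in the session profile, rendered as a hidden input, and compared in constant time on submission. The function names are hypothetical and koi's core may generate and verify its token differently; only the PROFILE keys user and xCSRF and the input name are taken from the snippet above.

```python
import hmac
import secrets


def new_csrf_token():
    """Return a fresh, unguessable token to store in the session profile."""
    return secrets.token_hex(32)


def csrf_field(profile):
    """Render the hidden input for logged-in users, else an empty string."""
    token = profile.get("xCSRF")
    if not profile.get("user") or not token:
        return ""
    return f'<input name="xCSRF" type="hidden" value="{token}">'


def csrf_ok(profile, submitted):
    """Check a submitted token against the session's, in constant time."""
    expected = profile.get("xCSRF", "")
    if not expected or not submitted:
        return False
    # Compare as bytes so non-ASCII input cannot raise a TypeError.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

Comparing with hmac.compare_digest rather than == avoids leaking how many leading characters of a guess were correct.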

templates

koi includes a few templates in the templates directory (configurable via dir_templates) which can be studied for reference. Other than the special login.tpl, all templates are provided the following dictionaries:

BOTTLE
the WSGI environment
CONFIG
all parameters defined in config.py
FORMS
all form values, decoded
INDEX
the website index, each key being the page name and the corresponding .koi data
ME
properties of the current page
PAGE
the .koi JSON dictionary
PROFILE
the current user's JSON profile (an empty dictionary if no session is ongoing)
QUERY
the combined GET and POST dictionary, decoded
TREE
the website tree, each key being the page name with the equivalent ME dictionary
UPLOAD
a dictionary of uploaded files (with number keys 0 up to upload_max_files)
USERS
overview of all users, each key being a user name and its PROFILE dictionary

The main difference between FORMS and QUERY is that the latter returns a dictionary with one key per form input while the former retrieves a FormsDict which can contain repeated keys with different values as a list (resulting, for example, from a selector element, or multiple checkboxes). A form with an input named animal can have its value retrieved as so:

QUERY['animal'] (returns lion)

while the input from a selector element named colours can be obtained as so:

FORMS.getall('colours') (returns ['red', 'green', 'yellow'])
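
The difference between one value per key and a list of values per key can be seen with the standard library alone. This is only a stand-in for illustration: koi's QUERY and FORMS are built by koi and bottle, not by urllib, and which of several repeated values QUERY keeps is not stated here.

```python
from urllib.parse import parse_qs, parse_qsl

# A submission with one text input and three selected colours.
raw = "animal=lion&colours=red&colours=green&colours=yellow"

# Multi-valued view: every key maps to the list of all submitted values.
multi = parse_qs(raw)

# Flat view: a plain dict keeps a single value per key (here the last one).
single = dict(parse_qsl(raw))

print(single["animal"])   # lion
print(multi["colours"])   # ['red', 'green', 'yellow']
```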

Detailed dictionary keys:

ME contains the following (example shown):
- page: main
- path: /www/koi/pages/main
- koi_fp: /www/koi/pages/main/article.koi
- template: article
- uri: /pages/main
- files: ['article.koi', 'image.jpg']
PROFILE contains the following (example shown, extensible entries may have further keys added):
- id-related:
  - user: yuma
  - uid: 1c3859d413c83abcc2bc4f4c9a6d5a12
  - groups: ['s1', 'alice']
  - full_name: {'first': 'Yuma', 'last': 'Asami', 'middle': ''} (extensible)
  - name: Yuma Asami
  - email: {'primary': 'yuma@ebisumuscats.org'} (extensible)
  - phone: {'primary': '555-555-5555'} (extensible)
  - address: {'primary': '8-909 Somewhere in Tokyo'} (extensible)
  - gecos: Yuma Asami,,,
  - roles: {'edit': ['editor']} (extensible)
- authentication-related:
  - hash: $6$... (hash string)
  - multiF_check: {"enabled": true,"twoF_email": "yuma@ebisumuscats.net", "twoF_totp": {TOTP dict}}
- session-related:
  - ip: (IP address)
  - token: (token string)
  - nonce: (integer value)
  - xCSRF: (anti-CSRF string)
  - login_ts: (epoch timestamp)
  - logout_ts: (epoch timestamp)
  - knock_tss: (epoch timestamp list)
  - created: (epoch timestamp)
  - locked: (boolean)
- misc:
  - koi_version: 0.34
- unused:
  - cache: (dictionary)
  - data: (dictionary)
TREE[page] contains the following keys (which match ME):
- path
- template
- uri
- files
- serve: (boolean specifying whether page can be served)
UPLOAD[N] contains the following keys:
- OK: (boolean depending on whether the upload was successful)
- status: (upload status e.g. "upload successful")
- content_type: (e.g. "application/zip")
- raw_filename: (original filename, potentially dangerous)
- safe_name: (safe filename adequate for local storage)
- file_data: (the upload data)
USERS[user]
same as PROFILE

Note that for performance reasons INDEX and TREE are only provided if get_index is set to True in the .koi file, and similarly USERS is only provided if get_users is also True. Otherwise the dictionaries are present but empty (as is UPLOAD if no uploads are found). User names have their case preserved, but should be lower-cased for internal usage in templates if trying to recreate the account's JSON record name (uid) i.e. use user.lower().

uploads

The following code adds an upload form/button in a template (note that the name of the upload input must be koi_file_upload, while the operation/save_upload hidden input is an example of the action the template needs to perform in order to save the upload):

<form id="upload" action="{{ME['uri']}}" enctype="multipart/form-data" method="post">
    <input type="hidden" name="operation" value="save_upload">
    {{!×CSRF}}
</form>
<input form="upload"  name="koi_file_upload" type="file" multiple>
<button form="upload" type="submit">upload</button>

Uploads are stored in an UPLOAD dictionary with numerical entries UPLOAD[0], UPLOAD[1] ... UPLOAD[config.upload_max_files-1]. They can be stored via a template using the following code (i.e. what the template should do if QUERY['operation'] == 'save_upload' in the above example):

% for key, up in UPLOAD.items():
%    if up['OK']:
%       with open(os.path.join(ME['path'], up['safe_name']), 'wb') as fd:
%          fd.write(up['file_data'])
%       end
%    end
% end

Change ME['path'] to save the file in a different directory, and check for I/O errors as needed. Note that files with a .koi extension will have a safe name .koi.up extension instead.
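
Outside of template syntax, the same loop with basic I/O error handling might look as follows. It is a sketch under two assumptions: the function name save_uploads is hypothetical, and file_data is assumed to hold the upload as bytes.

```python
import os


def save_uploads(upload, dest_dir):
    """Write every successful upload into dest_dir.

    Hypothetical helper mirroring the template loop. Returns two lists:
    the paths saved, and (key, reason) pairs for entries that were skipped
    or could not be written.
    """
    saved, failed = [], []
    for key, up in sorted(upload.items()):
        if not up.get("OK"):
            failed.append((key, up.get("status", "upload failed")))
            continue
        # Always use the sanitized name, never raw_filename.
        target = os.path.join(dest_dir, up["safe_name"])
        try:
            with open(target, "wb") as fd:
                fd.write(up["file_data"])
        except OSError as err:
            failed.append((key, str(err)))
        else:
            saved.append(target)
    return saved, failed
```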

scripts and back-end usage

As mentioned earlier in this guide, koi can be readily used and managed from the back-end (which offers much more control and security at the expense of convenience). No users or editors are really necessary, and all GUI components can be disabled via ACL restrictions. For this purpose the following scripts are provided inside the koi directory:

koiaccts.py
manage user accounts (use "-h" for non-interactive usage)
koiacl.py
manipulate the access control list of a page
html2koi.py
convert an HTML file into an article.koi file
koiedit.py
edit the contents of a .koi file using a text editor
koiport.py
create a gallery.koi or showcase.koi file

apache configuration

To run koi on an apache web server via mod_wsgi it may be necessary to run:

apt install libapache2-mod-wsgi-py3

Assuming a working SSL-enabled web server is already running, the first step is to move the koi directory into a suitable location, say /www/wsgi, and modify the file koi.wsgi to set sys.path accordingly e.g.

sys.path = ['/www/wsgi/koi/'] + sys.path

Ownership (perhaps www-data) and permissions of files (600) and directories (700) should be reviewed, as should the settings in config.py, particularly dir_accounts_fp (say, /usr/local/etc) and a new session_cookie_sig. Setting force_ssl to True is highly recommended, and care should be taken not to run in DEBUG mode.
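
Reviewing permissions by hand is error-prone on a large tree, so a small audit script can help. The following is a sketch, not a koi tool: the function name is hypothetical, and the modes default to the 600/700 values suggested above.

```python
import os
import stat


def audit_permissions(root, file_mode=0o600, dir_mode=0o700):
    """Return (path, mode) pairs under root whose mode differs from the target.

    Hypothetical helper. Directories (including root) are compared against
    dir_mode and regular files against file_mode; symlinks are skipped.
    """
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        entries = [(dirpath, dir_mode)]
        entries += [(os.path.join(dirpath, name), file_mode) for name in filenames]
        for path, wanted in entries:
            if os.path.islink(path):
                continue
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            if mode != wanted:
                problems.append((path, oct(mode)))
    return problems
```

Running audit_permissions("/www/wsgi/koi") would then list anything that still needs a chmod; ownership has to be checked separately.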

Adding the following code to an ssl VirtualHost may then suffice:

DocumentRoot /www/wsgi/koi
<Directory /www/wsgi/koi>
    Options None
    AllowOverride None
    Require all granted
</Directory>
WSGIProcessGroup koi
WSGIDaemonProcess koi user=www-data group=www-data
WSGIScriptAlias / /www/wsgi/koi/koi.wsgi

Remember to touch koi.wsgi after adding a new template or making changes to existing ones in order to update the cache.

odds and ends

— Since koi can be fully managed from the back-end it may be desirable (for enhanced security, say) to disable login access. This can be achieved by hiding the login page (or whatever page_login is set to in config.py):

mv pages/login pages/.login

and then editing config.py appropriately e.g. page_login='.login'. Editing and searching can also be disabled in a similar manner (note that login, edit, and search can each be toggled this way independently of each other).

— Attempting to access a page which requires a session will redirect a non-authenticated user to the login page. This redirect includes the original queries (e.g. ?foo=bar&ham=spam) except for the following which are reserved for internal usage: koi_wants, koi_stay, user, password, koi_session, nonce, and xCSRF (in other words, the query ?user=jane will be removed from the redirect).
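
The filtering described above can be sketched with the standard library. The reserved names are the ones listed in this note; the function name is hypothetical and koi's own redirect code may differ in detail.

```python
from urllib.parse import parse_qsl, urlencode

# Query names reserved for internal usage, as listed above.
RESERVED = {"koi_wants", "koi_stay", "user", "password",
            "koi_session", "nonce", "xCSRF"}


def redirect_query(query_string):
    """Return the query string with all reserved keys removed."""
    pairs = parse_qsl(query_string.lstrip("?"), keep_blank_values=True)
    return urlencode([(key, value) for key, value in pairs if key not in RESERVED])


print(redirect_query("?foo=bar&user=jane&ham=spam"))  # foo=bar&ham=spam
```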

— URL slugs in the article editor are, by default, restricted to the following regular expression in config.py: r'^[a-zA-Z0-9][a-zA-Z0-9_-]{0,75}$'. This allows for a mix of up to seventy-six underscores, dashes, and alphanumeric ASCII characters. This rule is not enforced by the koi core (which doesn't rely on the editor) but by the template edit.tpl. Another example which disallows some misleading page names could be r'(?i)^(?!home|download|admin)[a-z0-9][a-z0-9_-]{0,75}$'.
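
Both expressions can be tried out directly in the python interpreter before changing config.py. The variable names below are illustrative only; the patterns are the two quoted above.

```python
import re

# Default slug rule and the stricter example, both quoted above.
SLUG_RE = re.compile(r'^[a-zA-Z0-9][a-zA-Z0-9_-]{0,75}$')
STRICT_SLUG_RE = re.compile(r'(?i)^(?!home|download|admin)[a-z0-9][a-z0-9_-]{0,75}$')

for slug in ["my-page_2", "_draft", "a" * 77, "Homepage", "my-home"]:
    print(repr(slug[:12]), bool(SLUG_RE.match(slug)), bool(STRICT_SLUG_RE.match(slug)))
```

Note that the negative lookahead in the stricter rule rejects any slug that starts with one of the listed words (so "Homepage" fails as well as "home"), while "my-home" is still accepted.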

— koi includes two search engines, a simple one in template ssearch.tpl and the more advanced wsearch.tpl. Which one is used is a simple matter of making a symlink of the preferred template file to search.tpl. The simple search engine has no external dependencies and requires no index, but does no more than a case-insensitive AND search of the submitted words with no concept of query analysis or scoring. The whoosh-based engine wsearch.tpl is the default.
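
The behaviour of the simple engine, a case-insensitive AND over the submitted words, amounts to something like the sketch below. It is not the code in ssearch.tpl: the function name and the page-name-to-text mapping are assumptions made for illustration, and ACL filtering is left out.

```python
def simple_search(query, pages):
    """Return the names of pages whose text contains every query word.

    Hypothetical sketch: pages maps a page name to its searchable text.
    Matching is case-insensitive substring matching, with no scoring.
    """
    words = query.lower().split()
    if not words:
        return []
    return [name for name, text in pages.items()
            if all(word in text.lower() for word in words)]
```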

— If error messages are deemed too informative they can be tweaked in the error.tpl template, in particular the following snippet:

% if CODE in [400, 403, 404, 413, 500]:
    <p><font class="error">[{{CODE}}] {{DETAILS}}</font></p>
% end

can be customized as desired (by, say, getting rid of {{DETAILS}} in extreme cases).

— .koi files can always be (carefully) edited directly using a text editor, or from within the python interpreter (but care must be taken not to delete standard fields which may be required by the editor):

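import json

# Load the article data from its .koi file.
with open("pages/article/article.koi", "r", encoding="utf-8") as fd:
    artdata = json.load(fd)

# ... modify "artdata" here ...

# Write it back, keeping non-ASCII text readable.
with open("pages/article/article.koi", "w", encoding="utf-8") as fd:
    json.dump(artdata, fd, ensure_ascii=False)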

— For advanced users who prefer using their text editor of choice, the best way to edit .koi files (article files in particular) is to use the included koiedit.py script. The editor of choice can be set via the EDITOR variable at the top of the script. The default value is emacs -nw, which combined with the "Save Place" directive ((save-place-mode 1) in the .emacs file) allows to conveniently cycle between editing, saving/quitting, and reloading the file in a browser. The koiedit.py command can take the .koi file as an argument.

— Symlinked files will only be served within the page directory (and sub-directories).

— While multi-lingual web page content should be fine, creating non-ASCII slugs and file names from the back-end will likely cause issues and should be avoided.

— An example of a complex ACL:

{"article.koi": {"users": "*", "groups": [], "ips": "*", "time": 0},
 "araara.doc":  {"users": ["クレア"], "groups": [], "ips": ["8.8.8.8"], "time": 0},
 "kira.tex":    {"users": ["AIKA"], "groups": ["av"], "ips": ["128.100"], "time": 0},
 "SAbday.pdf":  {"users": ["yuma", "!sora"], "groups": ["sod"], "ips": "*", "time": 0},
 "ntr.ppt":     {"users": [], "groups": ["tissue"], "ips": "*", "time": 0},
 "shibu.jpg":   {"users": ["*"], "groups": ["!moodyz"], "ips": "*", "time": 1604973574}}

The contents of article.koi are available to all.
The file araara.doc is available to クレア as long as she accesses it from the IP address 8.8.8.8.
The file kira.tex is available to AIKA and everyone in the av group as long as they do so from the 128.100 subnet.
The file SAbday.pdf is available to user "yuma" and the members of group "sod", but not to "sora".
The file ntr.ppt is only available to members of group "tissue".
The file shibu.jpg can be downloaded by any logged-in user who does not belong to the moodyz group after the epoch time 1604973574 (2020-11-10T01:59:34+00:00).
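
The rules illustrated above can be approximated in a few lines of Python. This is a sketch inferred from the examples, not koi's actual ACL code, and it makes several assumptions: access requires a user or group match combined with an IP match and a reached release time, any matching ! entry blocks, the string "*" for users admits anonymous visitors while the list entry "*" requires a login, and subnets are matched as dotted prefixes. The function names are hypothetical.

```python
import time


def _ip_matches(pattern, ip):
    """True if ip equals pattern or falls under the dotted-prefix subnet."""
    return pattern == "*" or ip == pattern or ip.startswith(pattern.rstrip(".") + ".")


def _check(entries, is_hit):
    """Scan an allow/deny list and return (allowed, blocked)."""
    allowed = blocked = False
    for entry in entries:
        deny = entry.startswith("!")
        if is_hit(entry[1:] if deny else entry):
            blocked = blocked or deny
            allowed = allowed or not deny
    return allowed, blocked


def may_access(rule, user, groups, ip, now=None):
    """Evaluate one ACL rule; user is None for anonymous visitors."""
    now = time.time() if now is None else now
    if now < rule.get("time", 0):
        return False  # release date-time not reached yet
    ips = rule.get("ips", "*")
    if ips != "*":
        ip_ok, ip_blocked = _check(ips, lambda p: _ip_matches(p, ip))
        if ip_blocked or not ip_ok:
            return False
    users = rule.get("users", [])
    if users == "*":
        user_ok, user_blocked = True, False
    else:
        # User names are case-insensitive, so compare lower-cased.
        name = user.lower() if user else None
        user_ok, user_blocked = _check(
            users, lambda u: name is not None and (u == "*" or u.lower() == name))
    group_ok, group_blocked = _check(rule.get("groups", []), lambda g: g in groups)
    if user_blocked or group_blocked:
        return False  # a block overrides any conflicting allow
    return user_ok or group_ok
```

For instance, with the SAbday.pdf rule above, may_access(rule, "sora", ["sod"], "1.2.3.4") is False under these assumptions even though sora belongs to an allowed group, because the !sora entry takes precedence.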

These ACLs are not exclusive to articles and can be applied to any web page regardless of the template (other than the login page which has no ACL enforcement). It's a good idea, however, to make the location corresponding to page_assets (which contains css files and the such) available to all. The lack of an ACL entry in the .koi file is equivalent to making the web page and its files available to anyone unless force_acl is set to True.

colophon

koi and its documentation are released under the 3-clause BSD license

bottle is distributed under the MIT license

CSS used is a customized version of skeleton, distributed under the MIT license

The koi logo is public domain