Giving a Talk at PyCon Canada 2015

I am super excited to announce that I will be doing a talk at PyCon Canada this year! I will be talking about migrating from using Fabric to deploy my WSGI app (Rookeries) to using a combination of Invoke and Ansible. PyCon Canada will be happening in Toronto at the University of Toronto campus from Saturday, November 7 to Sunday, November 8, 2015. My talk will be on the Sunday at 3:45-4:15 PM. Videos of the talks should be available about a day or two after the talk. I look forward to seeing everyone there!

More info on my talk.

I also plan on being at the Sprints the following Monday.

Fixing SPDX Expression Warning in package.json

If you ever run into the following warning when installing your NPM package:

npm WARN package.json rookeries-api-client-wrapper@0.4.9 license should be a valid SPDX license expression

That means the license name specified in your package.json is not a valid SPDX identifier. So what are valid values for licenses? Well… here is the SPDX list of licenses
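
For example, a valid license field in package.json uses one of those SPDX identifiers. (The identifier below is just an illustration; pick the one that actually matches your project's license.)

{
  "name": "rookeries-api-client-wrapper",
  "version": "0.4.9",
  "license": "MIT"
}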

Command-line JSON Formatting with jq

About 2 or 3 months ago, when testing a deployment of a microservice at work
with Eric, our head Ops admin, we were looking at the JSON output of one of
the REST endpoints. Rather than looking at the raw output from curl, I
piped the output through the JSON tool in the Python standard library:

$ curl -X GET -s http://rookeries.org/status | python -m json.tool
{
    "app": "rookeries",
    "version": "0.4.9"
}

(I will give you a couple of examples based on calls to Rookeries, rather than
the actual service call, since that API isn’t available publicly yet.)

Eric, seeing that, suggested I try out a utility that he uses: jq, which is
great for formatting and querying JSON.

Installing jq

jq is a utility written in C, and fortunately there are binary packages
available for it in recent Ubuntu releases (including the 14.04 LTS). Install
it via aptitude (or apt-get):

$ sudo aptitude install jq

Pretty Printing JSON

The simplest use case for jq is to pretty-print JSON output. This is
done by piping the result of a curl command to jq with the filter '.':

$ curl -X GET -s http://status.bitbucket.org/api/v2/status.json | jq .
{
  "status": {
    "description": "All Systems Operational",
    "indicator": "none"
  },
  "page": {
    "updated_at": "2015-09-03T08:03:55.275Z",
    "url": "http://status.bitbucket.org",
    "name": "Atlassian Bitbucket",
    "id": "bqlf8qjztdtr"
  }
}

Formatting JSON in jq

It is also possible to format the output of the JSON to display only relevant
information. For instance if I want to find out the status of the components
that make up Bitbucket I can do the following:

$ curl -X GET -s http://status.bitbucket.org/api/v2/components.json | jq .

That, however, will give me a whole lot of extra data that I might not want. So
instead I can narrow down and re-format the JSON data into something more
manageable with some jq magic:

$ curl -X GET -s http://status.bitbucket.org/api/v2/components.json | \
    jq '{src: .page, components: [.components[] |
        {id: .id, name: .name, status: .status}]}'

{
  "components": [
    {
      "status": "operational",
      "name": "Website",
      "id": "g0lfj4sv2fhf"
    },
    {
      "status": "operational",
      "name": "API",
      "id": "k0x2yw1435v7"
    },
    {
      "status": "operational",
      "name": "SSH",
      "id": "qmh4tj8h5kbn"
    },
    {
      "status": "operational",
      "name": "Git via HTTPS",
      "id": "c1qmcrcbc5zy"
    },
    {
      "status": "operational",
      "name": "Mercurial via HTTPS",
      "id": "vmbzxbbjz05j"
    },
    {
      "status": "operational",
      "name": "Webhooks",
      "id": "rfzky0v13fbp"
    },
    {
      "status": "operational",
      "name": "Source downloads",
      "id": "28h8dvv2qfzw"
    }
  ],
  "src": {
    "updated_at": "2015-09-03T08:03:55.275Z",
    "url": "http://status.bitbucket.org",
    "name": "Atlassian Bitbucket",
    "id": "bqlf8qjztdtr"
  }
}

I won’t explain that particular filter string in detail, but it basically
crafts a new JSON object, generating new filtered objects as it iterates over
the original components array. Overall jq is pretty neat and extremely fast to
work with.
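
As a smaller illustration of the same idea, this pulls out just the name and
status of each Bitbucket component:

$ curl -X GET -s http://status.bitbucket.org/api/v2/components.json | \
    jq '[.components[] | {name: .name, status: .status}]'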

There is also a Python bindings library for jq. I am considering using it to
help with mapping JSON into Python objects. However I have not played around
with it long enough to know whether the extra dependencies are worthwhile or
whether it will bring much benefit to Rookeries.
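
As a sketch of the kind of mapping I have in mind, here is a stdlib-only
stand-in that shells out to the jq binary and loads the result into Python
objects. (This is not the bindings library itself; the helper name and the
sample data are made up purely for illustration.)

import json
import subprocess

def jq_filter(json_text, jq_program):
    """Run a jq program over some JSON text and return the result as Python objects."""
    output = subprocess.check_output(['jq', jq_program], input=json_text.encode())
    return json.loads(output.decode())

# e.g. collect the name and status of each component
data = '{"components": [{"name": "API", "status": "operational"}]}'
print(jq_filter(data, '[.components[] | {name: .name, status: .status}]'))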

Using CouchDB in Rookeries – Part 3 – Configuring a Remote CouchDB Server

In the previous instalment of this series I wrote about installing and
managing CouchDB on a remote server. Now let's talk about configuring CouchDB
so that it can run as a production server. This will not cover CouchDB’s
configuration extensively; rather, I will touch on the parts relevant to
Rookeries.

Configuring CouchDB

CouchDB can be configured in two ways: either by modifying the INI settings
files located in /etc/couchdb/, or by visiting the Configuration UI in Futon
(e.g. http://localhost:5984/_utils/config.html). I went with the route of
editing the local.ini file via Ansible.

Configuring Users Authentication on CouchDB

The default installation of CouchDB does not force you to declare and secure
users. Users and user authentication are totally optional. However, I did not
want to open up my production database to the world, so I set up
authentication.

Adding an Admin User

I first added an admin user to the CouchDB configuration. This user is the
admin for the entire CouchDB server, rather than for an individual database.
The change consisted of adding a value for the user and password under the
admins section:

[admins]
admin = password

If you’re worried about leaving your password in plain text in the
configuration, don’t be: after restarting CouchDB (via the Upstart service)
this password gets hashed.
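
Since I manage local.ini with Ansible, this change boils down to an ini_file
task, roughly like the sketch below (the variable and handler names are
placeholders, not my actual playbook):

- name: Set the CouchDB server admin password
  ini_file:
    dest: /etc/couchdb/local.ini
    section: admins
    option: admin
    value: "{{ couchdb_admin_password }}"  # placeholder variable
  notify: restart couchdb  # placeholder handler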

Admin Party No More – Enforcing User Authentication

By default CouchDB runs in what is called "admin party" mode, meaning you do not need to log in to make admin or user changes on the server. Naturally that is not something you want on a production server. So you have to require a valid user login:

[couch_httpd_auth]
require_valid_user = true
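
A quick way to check that this took effect after restarting CouchDB:

$ curl -s http://localhost:5984/_all_dbs
$ curl -s -u admin:password http://localhost:5984/_all_dbs

The first, anonymous request should now be rejected as unauthorized, while the
second, authenticated one should list the databases.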

Getting CouchDB to Talk with the Rest of the World

This part is optional; however, if you want CouchDB accessible from more than
localhost, you have to configure it to accept connections from other hosts.
I needed this since I use Codeship as a continuous integration (CI) service for Rookeries, and I wanted to run integration and end-to-end tests using my production database server. (In an ideal world I would have a separate CouchDB server just for testing or have a CI that has a local instance of CouchDB.)

Binding Addresses

The trick to allowing this is to set the right bind address for CouchDB.
This can be done by changing the bind_address value in the httpd section of the configuration like so:

[httpd]
bind_address = 0.0.0.0

By default this is just localhost, i.e. 127.0.0.1; setting it to 0.0.0.0 makes
CouchDB listen on all interfaces. One thing that I am not sure of is whether
you can pass a range or a list of bind addresses; based on the documentation I
have seen, it does not appear to be possible.

What about HTTPS?

CouchDB has options to handle HTTPS and SSL natively. I personally have not
configured my site to use HTTPS, since none of my sites do so currently.
Getting certificates and everything set up for all my sites is a bit involved,
so I have avoided the issue for the time being. I plan on getting around to it
in the future.

However if you have the time and option to set up HTTPS, please do so! Putting
up another layer of security around a production CouchDB will help. More
importantly, HTTPS gives you and your end users a degree of privacy that is
rare in these post-Snowden times.
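
For reference, the CouchDB 1.x documentation enables the native HTTPS listener
with a configuration along these lines (I have not run this myself, and the
certificate paths are just the documentation's examples):

[daemons]
httpsd = {couch_httpd, start_link, [https]}

[ssl]
cert_file = /etc/couchdb/cert/couchdb.pem
key_file = /etc/couchdb/cert/privkey.pem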

Markdown Documentation with Sphinx

Let's take a break from setting up CouchDB in Rookeries, and discuss
documentation.

I recently made the switch to using Markdown for the majority of the prose
style documentation for Rookeries. Originally I wanted to support both
reStructuredText and Markdown. However for reasons I’ll write about, I will
concentrate on supporting Markdown in Rookeries.

Requirements

What do I expect from documentation for Rookeries? I want an automated setup
that allows for easily writing prose documentation and API-level documentation,
and that lets the documentation act as a test fixture as well. There is no
need to duplicate effort maintaining two sets of documentation: one for the
code and one as a test sample.

Avoiding Duplication – Unifying Documentation and Test Fixtures

Some of the tests for Rookeries require actual content living inside a
database. This is an excellent way to dogfood Rookeries, by forcing it to
handle some of the content it will have to support. Currently the test
fixtures live separately from the documentation. However the actual fixture
refers to the path of the sample files as part of its setup. Whether this
path points to the test fixture folder or any other folder in the Rookeries
source tree is arbitrary. So why not have the same files serve as both project
documentation and sample test data?
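
To give a flavour of what I mean, a test fixture could point straight at the
documentation folder instead of a separate fixtures folder. A rough sketch
(the paths and names here are made up, not the actual Rookeries code):

import io
import os

import pytest

# Re-use the project documentation as sample content for the tests.
DOCS_DIR = os.path.join(os.path.dirname(__file__), '..', 'docs')

@pytest.fixture
def sample_markdown_page():
    with io.open(os.path.join(DOCS_DIR, 'index.md'), encoding='utf-8') as doc:
        return doc.read()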

Keeping API Documentation

However I still want API documentation, not only for my sake but to allow
future contributors to extend Rookeries or build plugins for it. Prose
documentation alone is not enough. So I wanted to keep my current Sphinx
autodoc setup, or have something similar parse the docstrings in my code and
generate gorgeous API documentation.

Result

After some trials, I hit upon a way to support both Markdown and
reStructuredText in my Sphinx-powered documentation. The Markdown files are
also the ones referenced in the tests, so all my requirements were met. Huge
thanks to Eric Holscher (a developer on Read the Docs) for figuring this out
originally.

Alternatives

The alternatives were to either use a Markdown-only documentation generator
(mkdocs) or to support reStructuredText in Rookeries.

Why not Use mkdocs?

mkdocs is an awesome project that generates documentation from Markdown, as an
alternative to Sphinx. mkdocs works remarkably well, has a few nice themes,
abstracts away a lot of the configuration that Sphinx requires, and has a nice
workflow for writing documentation. However its API documentation story is
lacking.

State of Autodoc API Documentation

The initial impression I got from mkdocs was excellent. However, when I tried
to use mkdocs for generating API documentation I ran into problems. Judging by
the roadmap of mkdocs, there is not much desire to support API documentation.
There is an experimental project to hook mkdocs up to Sphinx's autodoc, but it
did not work for me.

Why not Support reStructuredText in Rookeries?

Alternatively, I considered going in the other extreme of only supporting
reStructuredText. Aside from the fact that RST syntax is not always the
easiest to remember, I ran into some more technical challenges.

State of RST Frontend Clients

The first major issue is that there are no reStructuredText renderers for the
frontend. There are lots of Markdown libraries for JavaScript, but none for
RST. While I do plan on rendering content on the server side, I would prefer
to have the option of doing some of the rendering on the client side.

Working with Docutils in Python

Less of an issue, but more of an encumbrance, is working with reStructuredText
in Python. The docutils library is the standard way to convert RST into a
number of formats. The documentation for docutils is horrible. I wish I could
be more charitable, but it took me a good 30-45 minutes of poking around the
docs to figure out how to programmatically render reStructuredText into HTML:

import io

from docutils import core as docutils_core

# Read in the reStructuredText source...
with io.open('my_rst_sample.rst') as src:
    beta = src.read()

# ...and render it into a complete HTML document.
html = docutils_core.publish_string(beta, writer_name='html')

Mind you, this only gets you a complete HTML document that you then need to
muck around with. I did get partial HTML rendering working earlier in
Rookeries' history, but it was not obvious or simple to get to. I don't need
docutils' grand, book-publishing-ready setup nor its command-line tools. I
just need something to render marked-up text into HTML for a blog.
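
For the record, the partial rendering can be had through docutils' publish_parts,
which returns the individual pieces of the rendered document. Something along
these lines gives you just the body fragment:

import io

from docutils.core import publish_parts

with io.open('my_rst_sample.rst') as src:
    parts = publish_parts(src.read(), writer_name='html')

# 'html_body' is the rendered body of the document, without the <head> boilerplate.
body_fragment = parts['html_body']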

In the future I might wade into dealing with reStructuredText, but I will pass
on that for now. There are more important open issues with Rookeries than
which exact format to use.

Use pandoc

When the topic of converting between markup formats comes up, so does pandoc.
Pandoc is a great tool for converting between various markup and document
formats, and it does a decent job of translating RST to Markdown and back
again:

pandoc -f rst -t markdown sample.rst -o sample.md

Now I don’t plan on relying on Pandoc for Rookeries. But I might require it
when I need to import and export data from non-Rookeries blogs and sources.

Markdown in Sphinx

The solution I finally settled on was making Sphinx render Markdown. I found a great article on configuring Sphinx to handle both reStructuredText and Markdown.

Setting up Sphinx to Render Markdown and reStructuredText

The setup is fairly straightforward. Install the recommonmark library, which
adds CommonMark (Markdown) support to docutils, which Sphinx will then use:

pip install recommonmark

Next add the following to the Sphinx conf.py configuration file:

from recommonmark.parser import CommonMarkParser

# The suffixes of source filenames.
source_suffix = ['.rst', '.md']

# Use the CommonMark parser for Markdown sources.
source_parsers = {
    '.md': CommonMarkParser,
}

Voilà! You can now mix and match Markdown and reStructuredText in your
Sphinx documentation. I would stick to RST when dealing with more complicated
Sphinx directives (like the releases changelog add-on I use).

Markdown in WordPress

A final note: I want to work toward transitioning this site to Rookeries, so I
started playing around with what it would be like to write all my blog posts
in Markdown. I am using Sublime Text as my editor. On the WordPress side, I
found that WP-Markdown is a nice WordPress plugin for writing content in
Markdown and then rendering it to HTML.