Using CouchDB in Rookeries – Part 3 – Configuring a Remote CouchDB Server

In the previous instalment of this series I wrote about installing and
managing CouchDB on a remote server. Now let’s talk about configuring CouchDB so that
it can run as a production server. This will not cover CouchDB’s
configuration extensively, rather I will touch on the parts relevant to
Rookeries.

Configuring CouchDB

CouchDB can be configured in two ways: either by modifying the INI settings files
located in /etc/couchdb/ or by visiting the Configuration UI in Futon,
e.g. http://localhost:5984/_utils/config.html. I went with the route of editing the local.ini file via Ansible.

Configuring User Authentication on CouchDB

The default installation of CouchDB does not force you to declare and secure
users. Users and user authentication are entirely optional. However, I
did not want to open up my production database to the world, so I set up both.

Adding an Admin User

I first added an admin user to the CouchDB configuration. This user is the
admin user for the entire CouchDB server, rather than for an individual database.
The change consisted of adding a user name and password under the
admins section:

[admins]
admin = password

If you’re worried about leaving your password in plain-text in the
configuration, then don’t. After restarting CouchDB (via the Upstart service)
this password gets hashed.
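
One way to confirm that the restart really did hash the password is to read the admins section back and look at the stored value. The following is a minimal Python sketch, not something taken from Rookeries. It assumes that a hashed value starts with -pbkdf2- (CouchDB 1.3 and later) or -hashed- (older releases); the function name and the sample configuration are made up for illustration.

```python
import configparser

# Prefixes CouchDB 1.x is assumed to write in front of a hashed admin password:
# "-pbkdf2-" for 1.3 and later, "-hashed-" for older releases.
HASH_PREFIXES = ("-pbkdf2-", "-hashed-")


def plaintext_admins(ini_text):
    """Return the admin names whose password is still stored in plain text."""
    # Disable interpolation so a "%" in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    # Keep user names case-sensitive instead of lower-casing them.
    parser.optionxform = str
    parser.read_string(ini_text)
    if not parser.has_section("admins"):
        return []
    return [
        name
        for name, value in parser.items("admins")
        if not value.startswith(HASH_PREFIXES)
    ]


# Illustrative input: the state of local.ini before CouchDB is restarted.
print(plaintext_admins("[admins]\nadmin = password\n"))
```

An empty list after the restart would suggest that every admin password has been hashed.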

Admin Party No More – Enforcing User Authentication

By default CouchDB runs in what is called “admin party” mode, meaning you do not need to log in to make admin or user changes on the server. Naturally that is not something you want on a production server, so you have to require a valid user login:

[couch_httpd_auth]
require_valid_user = true
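
A quick way to check that the party is really over is to send a request without credentials and see whether the server refuses it. The sketch below is a hedged example in Python. It assumes that CouchDB answers anonymous requests with HTTP 401 once require_valid_user is on; the function name is hypothetical and the /_all_dbs endpoint is just one convenient URL to probe.

```python
import urllib.error
import urllib.request


def anonymous_access_rejected(base_url, timeout=5.0):
    """Return True if the server answers an anonymous request with HTTP 401."""
    url = base_url.rstrip("/") + "/_all_dbs"
    try:
        # No credentials are sent on purpose.
        with urllib.request.urlopen(url, timeout=timeout):
            return False  # the server answered without asking for a login
    except urllib.error.HTTPError as err:
        return err.code == 401
```

Calling anonymous_access_rejected("http://localhost:5984") on the server itself should return True after the restart, and False while the admin party is still going.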

Getting CouchDB to Talk with the Rest of the World

This part is optional, however if you want CouchDB accessible from more than
the localhost you have to configure it to allow connections from other hosts.
I needed this since I use Codeship as a continuous integration (CI) service for Rookeries, and I wanted to run integration and end-to-end tests using my production database server. (In an ideal world I would have a separate CouchDB server just for testing or have a CI that has a local instance of CouchDB.)

Binding Addresses

The trick to allowing this is to set the right binding address for CouchDB.
This can be done by changing the bind_address value in the httpd section of the configuration as such:

[httpd]
bind_address = 0.0.0.0

By default this is just localhost or 127.0.0.1. You can also bind CouchDB to
a different, specific address. One thing that I am not sure of is passing a
range or a list of different bind addresses. I am not sure this is possible
based on the documentation that I have seen.
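
To make the difference between these values concrete, here is a small Python sketch that classifies a bind address using the standard ipaddress module. It is only an illustration of what the addresses mean. The function name is made up, 192.0.2.10 is a documentation-only example address, and the sketch accepts only literal IP addresses, not the name localhost.

```python
import ipaddress


def describe_bind_address(value):
    """Say in plain words who can reach a server bound to `value`."""
    address = ipaddress.ip_address(value)
    if address.is_unspecified:
        # 0.0.0.0 (or :: for IPv6) means every interface on this machine.
        return "all interfaces"
    if address.is_loopback:
        # 127.0.0.1 is only reachable from the machine itself.
        return "loopback only"
    return "one specific interface"


for candidate in ("127.0.0.1", "0.0.0.0", "192.0.2.10"):
    print(candidate, "->", describe_bind_address(candidate))
```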

What about HTTPS?

CouchDB has options to handle HTTPS and SSL natively. I personally have not configured my site to use HTTPS, since none of my sites
do so currently. Getting certificates and everything else set up for all my sites
is a bit involved, so I have avoided the issue for the time being. I plan on
getting around to doing so in the future.

However if you have the time and option to set up HTTPS, please do so! Putting
up another layer of security around a production CouchDB will help. More
importantly HTTPS gives you and your end users a degree of privacy that is
rare in these post-Snowden times.

Using CouchDB in Rookeries – Part 2 – Setting Up a Remote CouchDB Server

Overview

In the second instalment of my series on adding CouchDB support to
Rookeries, I’ll be talking about how I provisioned CouchDB on my remote
server.

It may sound counter-intuitive that I wrote about creating and
populating CouchDB databases before writing about installing
CouchDB. The reason for this backwards step is that I already had
CouchDB installed locally on my workstation. At my day job at Points
we use CouchDB extensively, and I have also worked with the
Operations team to provision CouchDB servers. However it is a different
story when trying to provision and configure CouchDB yourself on your
own servers. This blog post details some of the things I learned along the way.

Since the setup of Couch is a bit involved, I will divide this up over two blog
posts.

Provisioning Rookeries with Ansible

One of the stated goals of Rookeries is to create a developer-friendly
blogging platform that is easier to install and set up than WordPress.
That is a tall order for a Python WSGI app, since there is some more
setup involved than just installing Apache and mod_php and unzipping
WordPress into a folder. (Even with WordPress there is more involved
when doing a proper and maintainable setup.)

So while putting up a production-ready Python WSGI app is more involved
technically, this does not mean the end user needs to experience this.
That is where the Rookeries Ansible role comes into
play. I created that Ansible role to encapsulate the complexity of
installing Rookeries. (This role uses the nginx-uwsgi-supervisord Ansible
role, available at https://bitbucket.org/dorianpula/ansible-nginx-uwsgi-supervisor,
which I wrote to handle the actual setup of a WSGI app on a bare-bones
Ubuntu server.) All of the
details concerning the setup and configuration of a CouchDB server for a
Rookeries installation are included in the Rookeries Ansible role.

Installing Latest CouchDB on Ubuntu Linux

I use the latest Ubuntu LTS (14.04) for both my development and
deployment environments. Having the same environment reduces the effort for me
to take Rookeries from development to production. However the
latest version of CouchDB packaged for Ubuntu 14.04 is 1.5.0 and I wanted to use
the latest stable version of CouchDB. While upgrading between CouchDB
versions is straightforward, I know that I am less likely to upgrade to the
latest version of CouchDB once Rookeries stabilizes. And there is no
point in starting off with an older version of your database right from
the start of a project.

Fortunately the CouchDB devs distribute the latest stable version of
CouchDB via a convenient PPA. The
instructions on how to install CouchDB via the PPA are right on the
Launchpad page.

Installing via Console

# add the ppa
sudo add-apt-repository ppa:couchdb/stable -y
# update cached list of packages    
sudo aptitude update -y
# remove any existing couchdb binaries
sudo aptitude remove couchdb couchdb-bin couchdb-common -yf
# install the latest
sudo aptitude install couchdb

Provisioning via Ansible

The Rookeries Ansible role translates those instructions (minus the
removal of existing packages) to:

- name: add the couchdb ppa repository
  apt_repository: repo="ppa:couchdb/stable" state=present

- name: install couchdb
  apt: pkg={{ item }} state=present
  with_items:
    - couchdb
    - couchdb-bin
    - couchdb-common
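
Once the packages are installed, it is worth confirming that the server is actually running the newer release and not the stock 1.5.0. CouchDB reports its version in the JSON welcome document served at the root URL (e.g. http://localhost:5984/). The following Python sketch parses such a document. The function name is hypothetical and the version number in the sample body is made up for illustration.

```python
import json
import re


def couchdb_version(welcome_body):
    """Extract the version from CouchDB's welcome document as a tuple of ints."""
    document = json.loads(welcome_body)
    # Keep only the leading dotted digits, e.g. "1.6.1" becomes (1, 6, 1).
    match = re.match(r"\d+(?:\.\d+)*", document["version"])
    if match is None:
        raise ValueError("unrecognised version: %r" % document["version"])
    return tuple(int(part) for part in match.group(0).split("."))


# Sample body shaped like the response to GET http://localhost:5984/
sample = '{"couchdb": "Welcome", "version": "1.6.1"}'
print(couchdb_version(sample) > (1, 5, 0))
```

Comparing tuples rather than strings avoids surprises such as "1.10.0" sorting before "1.5.0".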

Running CouchDB

Now that we have CouchDB installed, we need to control it like we would any
other service on a Linux server. Surprisingly enough, when I tried to find the
packaged CouchDB service scripts (using the service command), I did not find
anything!

> sudo service --status-all
# ... A lot of entries but no couchdb ...

It turns out that the CouchDB package comes with an Upstart script rather than
a traditional System V init script. (That itself is probably not a bad
thing.)

> sudo status couchdb
couchdb start/running, process 5311
# There it is.

Starting and stopping a service through Upstart is done via the ‘start’ and
‘stop’ commands. There are also ‘reload’ and ‘restart’ commands.

> sudo restart couchdb
couchdb start/running, process 15987
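
If you want to script around these commands, the status line is easy to pick apart. Here is a minimal Python sketch that splits a line like the ones above into job name, goal, state and process id. The function name is made up, and the sketch assumes the output keeps the format shown in this post, with a stopped job reporting something like "couchdb stop/waiting" and no process id.

```python
import re

# Matches lines such as "couchdb start/running, process 5311".
STATUS_LINE = re.compile(
    r"^(?P<job>\S+) (?P<goal>\w+)/(?P<state>[\w-]+)(?:, process (?P<pid>\d+))?$"
)


def parse_upstart_status(line):
    """Split one line of Upstart status output into its parts."""
    match = STATUS_LINE.match(line.strip())
    if match is None:
        raise ValueError("not an Upstart status line: %r" % line)
    parts = match.groupdict()
    # A stopped job has no process id.
    parts["pid"] = int(parts["pid"]) if parts["pid"] else None
    return parts


print(parse_upstart_status("couchdb start/running, process 5311"))
```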

Side Note About Upstart vs Services vs Systemd

Update: I found an article that explains the evolution and the current situation of Linux service management. It explains things much better than I do and in much more detail. I learned quite a bit from it.

If you follow Linux developments and news, you might have heard about the development and controversy around new init systems. I will try to explain these developments briefly here since we are on the topic of service scripts.

The old System V style for service scripts (in /etc/init.d/ or /etc/rc.d/) is not flexible when it comes to managing dependencies and running outside of the prescribed run-levels that happen during boot and shutdown.
However there is disagreement about what would be a better alternative. Upstart was Canonical/Ubuntu’s attempt to create a more flexible system for managing services. However Debian and many other Linux distributions have recently switched over to another such system called systemd. Part of the controversy about systemd stems from the architectural design of systemd (which seems monolithic at first glance as it tries to solve service management, logging and a few other seemingly unrelated system-level issues).

Another part of the controversy stems from how the project lead handled his previous project: PulseAudio. I will admit that my first experiences with PulseAudio were pretty rocky, and I missed how well plain old ALSA worked. However these issues have since gone away, and I cannot think of any PulseAudio or other audio issues I’ve encountered in Linux recently. (Ironically Windows 7 gives me more grief with sound issues than Linux nowadays.)

I personally don’t know enough about systemd to form an opinion. Sure I am a bit anxious to see how this all plays out. However this is a case of wait and see. In the meantime be aware that the exact semantics on how you interact with services will change in the near future.

Update #2: An interview with Lennart Poettering about systemd, its design and intentions

Provisioning with Ansible

Fortunately Ansible does not care which underlying
service system is in use. The Ansible service module works with System V init
scripts, Upstart and systemd services without complaint.

In the Rookeries Ansible role, restarting the CouchDB service becomes a single
task.

- name: restart couchdb server
  service: name=couchdb state=restarted

Next Up

In the next blog post I’ll write about configuring and securing
CouchDB.