Speaking At PyCon Canada 2019 – Rust Accelerated Pythons

I am happy to announce that I will be giving a talk on Rust and Python at PyCon Canada 2019: Rust Accelerated Pythons!


I will be talking about how to write Python bindings for Rust. With Rust you can create and call fast native code without worrying about having to use C/C++. And creating Python bindings is pretty easy in Rust. This talk will give an introduction to the Rust language, the challenges of writing cross-language bindings, and an example of working with the excellent PyO3 library. Hope to see you there if you are in the Toronto area for PyCon Canada!

If you can’t make it out to PyCon Canada this year, I also plan on giving this talk at an upcoming Rust Toronto meetup. More details on that will follow in a future post.

Embedded Rust Library Experiment for Python and Web Assembly

With my ever-growing list of things that I need to catch up on (like wiring my home network and managing Rookeries), I needed a small fun project that I can work on. Ever since I learned enough Rust to be able to convert Rookeries, I wanted to play around with being able to speed up my code with a Rust library. I am especially interested in figuring out how to call Rust code from Python or from JS with Web Assembly.

As a test bed (and a reason) for me to learn this, I created a small library for converting between different measurement units like Celsius and Fahrenheit: embedded-unit-converter. (My original idea was a library for getting the uptime of a local server, Linux only: embedded-uptime.) If you’d like to follow along, feel free to check it out. I will be posting updates on the blog, and on the Rookeries mailing list.

Updated on 2019 February 4: When I set up the project, I forgot that server uptimes that rely on accessing a server’s /proc/uptime cannot possibly work in Web Assembly in a browser environment. After some consideration I decided to go with something simple that is accessible from any platform, namely conversion between different units of measure.
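For reference, here is a plain Python sketch of the kind of conversion I have in mind. The function names are my own placeholders and are not taken from embedded-unit-converter itself; a pure Python version like this would simply be the baseline that a Rust-backed version gets compared against.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def fahrenheit_to_celsius(fahrenheit: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    # Boiling point of water, converted one way and back again.
    boiling_f = celsius_to_fahrenheit(100.0)
    print(boiling_f, fahrenheit_to_celsius(boiling_f))
```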

See you at PyCon Canada 2016!

I’m looking forward to PyCon Canada 2016 that will be happening November 12-13 in Toronto. I submitted two talk proposals and I’m hoping that one of them gets accepted. But regardless I am looking forward to the conference. If you are at the conference, and you want to meet up just message me via Twitter @dorianpula. Also I plan on coming out to the sprints that I’m hoping will be happening afterwards. See you there!

PyCon US 2016 Talk – Pythons in a Container

At the end of May, I presented a talk at PyCon 2016 on using Docker with Python microservices. You can imagine the rush I felt getting to present on such a popular topic at such a large and important conference as PyCon! It took me a while to recuperate after PyCon and Portland, both of which were amazing, but I would definitely do another talk at PyCon given the opportunity. Anyways I hope you enjoy watching the video of the talk! Below the video I also wrote about preparing for the talk, its reception, and a bit of the controversy that it stirred up. 🙂 (And I apologize for the lateness of this post, it’s been sitting in my backlog waiting to get finished for a few weeks now. 🙁 )

Video

Links

Abstract

Microservices and Docker are all the rage for developing scalable systems. But what challenges will you face when developing and deploying Python apps using Docker to production? This talk goes into the real-life lessons learned from creating, deploying and scaling Dockerized Python applications.

About the talk

Preparation for the Talk

PyCon talks definitely take quite a bit of time and effort to prepare. In my case, the talk took 3 major revisions before becoming the talk that I actually presented at PyCon. What started off as an intro to some of the concepts of Docker with some minor Python points, became more of a lessons learned talk targeted at intermediate to advanced developers. One of the things I wished I had done (and I planned to but didn’t pull off) was to mention and thank my team for helping me prepare my talk. So thank you Kevin Qiu, Biniam Bekele, Yele Bonilla, and Gavin D’Mello for all your support, sitting through three versions of my talk, and all the amazing feedback! I’ll make sure to include a slide with thanks next time.

Also I am very thankful for Jared Kerim from Mozilla, who presented at the local Python Toronto meetup about his team’s Docker setup. He was kind enough to let me use his example docker-django-template project. His example was also the inspiration for my own docker-compose example using my CMS Rookeries, an example that I crafted and tested on the plane trip over to Portland. (After talking with other speakers, finishing up your presentation, notes and examples on the plane trip over is a proud tradition of PyCon and other conferences. :D)

Reception

Overall the reception of the talk was amazing! The talk drew quite a crowd, in fact filling up most of the room. (I’m not sure of the capacity of the room but I estimate over 300 people attended). I was pretty nervous, but with the exception of a few stumbles, I think I pulled off the talk quite well. I really enjoyed some of the questions that were fielded during the Q&A session, and also privately afterwards. I wish I could have answered some of the Docker Machine and Amazon ECS questions better, but I simply have not worked with either technology long enough to give proper advice.

Controversy

The most surprising aspect of the talk was the controversy it stirred up. At the end of the Q&A you can hear some comments from a young lady about where I supposedly went horribly wrong, and how there were tweets flying back and forth about it. I had turned off the notifications on my phone when I got up on stage, to avoid getting distracted. She persisted with telling (or trying to explain) what was wrong in the private gathering afterwards. Unfortunately she did not do a wonderful job of communicating, and I felt it took away time from others to ask their questions. It didn’t help her case that she admitted to being a novice at Docker. Please don’t do that as an attendee; there are better ways to disagree and communicate that.

I was later approached by a gentleman (thank you whoever you are), who mentioned I should go talk to the OpenShift guys since they had some concerns about my talk. News of the Twitter controversy worried me, because I hated the notion that I had gotten on stage and told people to go and do the wrong thing. Especially when apparently I’m telling the opposite of what Glyph from Twisted said to do. After a brief chat with the OpenShift guys (and a nice demo of their cool Kubernetes suite), I found out that Graham Dumpleton, the creator of mod_wsgi who works on OpenShift, had done a live tweeting commentary during my talk, where he disagreed with a few of my points. Long story short, eventually I was able to chat with Graham. He was a great sport and explained his points. Interestingly enough I had also talked with the folks at Docker, and they agreed with the points in my talk and the logic behind them. Essentially most of my points were based off the best practises they proposed.

Anyways I listed a few of Graham’s points with links to his blog posts (thanks again Graham!), and some of my quick thoughts on each one. A quick disclaimer about some of my points: the advice I gave worked for us in our datacentre, and it might not work for others in other environments. It should work well, it might not be perfect, but it worked for us, and some of the folks at Mozilla. I gave a disclaimer at my other talk on an Ansible setup for WSGI apps at PyCon Canada, and I thought it was superfluous. But it turns out it is a useful thing to mention, and to be explicit about.

Errata

So the slide that caused a good portion of the controversy was the base image one. There I had provided an example Dockerfile on half the slide and discussed base images and good Dockerfile practises, with points on the lower half. Now the example was meant as a toy and not necessarily complete. It is difficult, even impossible, to present a well-formatted, perfect Dockerfile in that context. There is only so much room on a slide to fit both an illustrative example and some explanatory points. That is why I included links to some samples that hopefully did a better job of it.

Virtualenvs

Ah yes, the “enfant terrible” of my talk. 🙂 If you want to be controversial in your talk, mentioning something like this will get people’s attention. (Ironically, it was not my desire to stir up a controversy). Graham posted a while back about why you might want to use virtualenvs in your Dockerized app. It is a longish post, so I’ll give a shortened version. Basically when you base your image off some distro (say Ubuntu, Fedora or what not), there is a good chance of bringing in more Python packages in your system site packages than you expected. For example, you’re building a Flask app, and the package maintainer included a version of Werkzeug in the base Python install, so now when you pip install Flask as part of your requirements you get the wrong version of Werkzeug.

And that is a valid point (with my example)… except if you use something like the official Python 2.7 base image… which installs just Python. I would argue that you would catch and resolve this issue, if you are auditing your Docker images. (And you should always be doing your due diligence and checking your base and resulting images.) So yes… you don’t really need virtualenvs, but you can also use them if you are concerned that you might be getting conflicting packages.
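One way to do that kind of audit from inside a container is to list what the interpreter can actually see. This is only a rough sketch, assuming Python 3.8 or newer (for importlib.metadata); the helper names are hypothetical and it is not something from the talk itself.

```python
from importlib import metadata


def installed_packages():
    """Return {name: version} for every distribution this interpreter can see."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing or broken metadata
            packages[name.lower()] = dist.version
    return packages


def unexpected_packages(expected):
    """Return the installed packages whose names are not in `expected`."""
    expected_names = {name.lower() for name in expected}
    return {
        name: version
        for name, version in installed_packages().items()
        if name not in expected_names
    }


if __name__ == "__main__":
    # Anything printed here came from the base image or a dependency
    # you did not list yourself (e.g. a distro-provided Werkzeug).
    for name, version in sorted(unexpected_packages(["flask"]).items()):
        print(name, version)
```

Running something like this in both the base image and the final image gives you a quick diff of what each layer brought in.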

Volume maps

Graham was right about adding volume mappings in the Dockerfile being problematic. You should not define volume mounts in your Dockerfile, since they create extra files with sudo-like permissions on the host (see /var). In your own datacentre that isn’t a problem. A multi-tenant cloud provider like OpenShift would not allow you to create those files. The documentation argument I provided is not all that useful, since you can document the mountpoints in the README that you would provide with the Docker image.

Base Images

Base images are hard to get right. And there is a lot of debate whether or not to use tooling instead of base images. Graham says his warpdrive tool will do that sort of a thing. At work we built out our own tooling for building “standard” service Dockerfiles, and that just adds another level of abstraction. I prefer base images since, while not ideal, they provide fewer levels of abstraction that can get in the way when you’re debugging your Dockerfile setup. But your mileage may vary here.

So yes, good base images are hard. Try not to build your own unless you find it really useful and you have a great base to work from.

Installing GCC/Build Tools

In an ideal world one ought not have to include GCC, Python dev headers and so on. Yes, one can pip install using wheels, but that doesn’t always work out.

Dockerfiles

Formatting of the RUN command. This is not one of Graham’s points, but it did come up. Yes, you should format the RUN commands, with a line for each command and using a \ line continuation for readability. My slide didn’t have enough physical space to do so. My Rookeries example does a better job of this.

Running as Root

Graham is right, you should not run containerized apps as root. That is a bad security practise that can lead to an attacker compromising your Docker host via a privileged account on your Docker container. Again a bad example on my part, I should have added a USER command and dropped the VOLUME line, or maybe rethought the use of an example.
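As a belt-and-braces measure on top of the USER command, the app itself can refuse to start when it finds itself running as root. A minimal sketch, assuming a Unix-like container (os.geteuid is not available on Windows); the function names are hypothetical.

```python
import os


def is_root(euid=None):
    """Return True if the given (or current) effective user id is root."""
    if euid is None:
        euid = os.geteuid()  # Unix-only call
    return euid == 0


def refuse_root(euid=None):
    """Raise an error instead of letting the app start as root."""
    if is_root(euid):
        raise RuntimeError(
            "Refusing to run as root; add a USER line to the Dockerfile."
        )


if __name__ == "__main__":
    refuse_root()
    print("Running as an unprivileged user, carrying on.")
```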

UWSGI and the HTTP flag

No, you don’t need it and you should use the UWSGI protocol if you put an NGINX container before your WSGI container. I left the flag in to make sure the example Dockerfile was runnable. My bad on trying to get a good illustrative example, but it wouldn’t be a good idea in production unless you feel comfortable exposing UWSGI directly to HTTP traffic.

Personally I’m not a fan of mod_wsgi + Apache, but Graham did point out he created mod_wsgi-express to simplify your life. If we continue to use Apache + mod_wsgi at work, then I’ll try to get us to use mod_wsgi-express too.

Final Thoughts

Anyways, I hope I got everything right. Thank you for reading all the way to the end! 🙂

See You at PyCon US 2016!

If you’re wondering why I’ve been so quiet these past few weeks, it is because I’ve been busy preparing to go to PyCon US in Portland this year!

I am very excited not only to be attending, but also to be giving a talk at PyCon US this year! I will be talking about Dockerizing Python microservices, and some of the lessons we’ve learned along the way at work. My talk will be on the first day (Monday May 30th) at 3:15-3:45 PM (PST). Videos of all the PyCon talks should be available a few days after the talks.

Huge thanks to everyone at my workplace, Points, for making this possible for me!

Finally I will be around in Portland for a few days after the sprints as well. I have never been to Portland, so I want to check out some of the sights around there. Let me know via Twitter or email if you want to meet up with me while I’m there. 🙂

Adding Functional (End-to-End) Test to Rookeries

Testing the client side of Rookeries has proven to be quite a challenge. Not necessarily because testing well-written React JS components is hard. Rather I found it hard to set up a proper and consistent unit test infrastructure to do so. Rather than going through the pain of writing and maintaining functional tests in Javascript, I decided to take a different path.

BDD + Web Testing – Theory and Practise

I wanted to write my tests in a behaviour-driven development (BDD) style. While most developers find it awkward to use at first, it is a great way to write out the business functionality and features of an app. It also forces you to think about your app in a non-technical manner, and prove to yourself (and others) that it does what you claim it does.

At work my team has been burned by slow and awkward web tests. Namely we worked with the Robot Framework, which uses its own DSL and is very difficult to get working just as we wanted. Debugging tests was also quite unpleasant.

Fortunately one of our newest team mates, Kevin Qiu, introduced us to lettuce (a BDD framework, http://lettuce.it/) and splinter (a Pythonic Selenium interface). And we’ve had a fair bit of success describing scenarios using these tools. I won’t lie, Selenium is always temperamental. However the current batch of web tests has been very stable, and has very much convinced me that this setup can work effectively.

The Rookeries Take on BDD-style Web Testing

Rookeries uses a similar setup to what we’ve done at work: namely I use splinter to interface with Selenium. However unlike work, since Rookeries uses pytest instead of nose, I ended up using pytest-bdd to provide the BDD framework for Rookeries. Furthermore, I am using the pytest integration for splinter, to provide some of the browser fixtures needed for the tests.

Feel free to check out the functional tests in Rookeries to see examples of the tests. I also highly recommend watching Dylan Lacey’s talk about using Splinter for web testing at PyCon Australia 2013.

A Few Wrinkles I Found So Far

  • Dylan suggests using names for inputs when working with web tests. Unfortunately the react-bootstrap components I rely on don’t include support for names, something I plan on submitting a patch/pull request for.
  • One needs to run a localized server as part of the functional tests, which makes for a slightly complicated task setup. This is something I need to simplify.
  • I found that using the element.text of a container React component works around the issue of text phrases being broken up over a few components.
  • Using ids is the simplest way to find elements, even though that isn’t what a user would actually use to navigate the site.
  • ipdb does not work very well when debugging tests. Plain old pdb works wonders though. I am considering switching over to using pdb++.

Using CouchDB in Rookeries – Part 1 – Creating CouchDB Test Fixtures Using Bulk Updates

Back Story

I’ve been working on adding database persistence support to Rookeries. Instead of writing down my findings and losing them somewhere, I plan on documenting my findings and thoughts in a series of blog posts.

In the case of Rookeries that means connecting to and storing all of the journal, blog and page content as CouchDB documents. Since I want to implement this properly, I intend on adding tests to make sure I can manage CouchDB documents and databases properly. Rather than writing a number of tests that mock out CouchDB, I want to use a test database along with known test data fixtures for my tests.

Python CouchDB Integration for Rookeries

When looking at different CouchDB-Python binding libraries for Rookeries, I settled on py-couchdb. Manipulating CouchDB essentially means communicating with its REST API, so it is important that a Python binding library uses a sane approach to communicate with the HTTP REST API. Unfortunately the more popular CouchDB-Python library uses only the Python standard library and implements its HTTP mechanism using the standard library’s unintuitive modules. In contrast py-couchdb uses requests for querying the CouchDB server, making it a much more maintainable library.

Also py-couchdb offers Python query views, which I very much enjoy using at work. I still need to verify how well the library’s Python query server works in practise, but I will write a future blog post about my findings. py-couchdb lacks CouchDB-Python’s mapping functionality, which behaves similarly to sqlalchemy’s ORM. However I am still debating how I want to map between CouchDB documents and Pythonic domain objects.

Creating and Deleting CouchDB

Creating and deleting a database in a CouchDB server amounts to issuing an HTTP PUT or DELETE request against the server. This REST API provides no safety net or confirmation when deleting a database, so one needs to be careful. py-couchdb provides a nice and simple API to create or delete a database as well.

Using cURL

# Create a CouchDB database
curl -X PUT http://admin:password@localhost:5984/my_database/

# DELETE a CouchDB database
curl -X DELETE http://admin:password@localhost:5984/my_database/

Using py-couchdb

import pycouchdb

# Create a CouchDB database
server = pycouchdb.client.Server('http://admin:password@localhost:5984')
server.create('my_database')

# DELETE a CouchDB database
server.delete('my_database')
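Since the goal is a throwaway test database, the create and delete calls above can be paired up so that the clean-up always happens, even when a test fails. The following is only a sketch of that idea: temporary_database is a hypothetical helper of mine, and FakeServer is an in-memory stand-in (not part of py-couchdb) so the example runs without a CouchDB server. It only assumes that the server object has the create() and delete() methods shown above.

```python
from contextlib import contextmanager


@contextmanager
def temporary_database(server, name):
    """Create `name` on `server`, yield it, and always delete it afterwards."""
    database = server.create(name)
    try:
        yield database
    finally:
        server.delete(name)


class FakeServer:
    """Hypothetical in-memory stand-in so the sketch runs without CouchDB."""

    def __init__(self):
        self.databases = {}

    def create(self, name):
        self.databases[name] = {}
        return self.databases[name]

    def delete(self, name):
        del self.databases[name]


fake = FakeServer()
with temporary_database(fake, "my_test_database") as db:
    db["1"] = {"a_key": "a_value"}
    print("inside the block:", list(fake.databases))
print("after the block:", list(fake.databases))
```

In a real test suite the FakeServer would be replaced by the pycouchdb server object, and the context manager could be wrapped in a pytest fixture.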

Inserting Fixture Data

Now that I can create a temporary test database, I need to populate it with some test data. Fortunately it turns out that CouchDB has a neat and fast way to insert data in bulk using its _bulk_docs API. With this API I can easily come up with a number of documents that I want to input as test data.

Fixture Data Format

The format for inserting a mass of documents is:

{
  "docs": [
    {"_id": "1", "a_key": "a_value", "b_key": [1, 2, 3]},
    {"_id": "2", "a_key": "_random", "b_key": [5, 6, 7]},
    {"_id": "5", "a_key": "__etc__", "b_key": [1, 5, 5]}
 ]
}

Note that adding an _id specifies the CouchDB ID for the document.
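If you would rather generate the fixture file than write it by hand, the envelope is easy to build from a list of dictionaries. A small sketch using only the standard library; build_bulk_payload is a hypothetical helper name, and the documents are the same sample values as above.

```python
import json


def build_bulk_payload(documents):
    """Wrap a list of document dicts in the {"docs": [...]} envelope."""
    return json.dumps({"docs": list(documents)}, indent=2)


fixtures = [
    {"_id": "1", "a_key": "a_value", "b_key": [1, 2, 3]},
    {"_id": "2", "a_key": "_random", "b_key": [5, 6, 7]},
    {"_id": "5", "a_key": "__etc__", "b_key": [1, 5, 5]},
]

payload = build_bulk_payload(fixtures)
print(payload)
```

Writing payload out to a file such as sample_data.json gives you the data file used in the examples that follow.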

Using cURL

# Bulk doc insert/update using the JSON data file.  One can also do this manually with a string.
curl -d @sample_data.json -X POST -H 'Content-Type: application/json' \
   http://admin:password@localhost:5984/my_database/_bulk_docs

Using py-couchdb

UPDATED: 2015-Aug-22 I was totally wrong about the format of doing bulk updates to py-couchdb. Rather than the JSON format needed for CURL, a simple list of Python dictionaries works with the save_bulk() method. I’ve updated the code example.

import io
import json

# Best practice for writing unified Python 2 and 3 compatible code is 
# to use io.open as a context manager. 
with io.open('sample_data.json') as json_file:
    my_docs = json.load(json_file)
# `server` is the pycouchdb Server instance created in the earlier example.
database = server.database('my_database')
# See my update note above, about the format save_bulk expects.
database.save_bulk(my_docs['docs'])

Conclusion

And with that, I have what I need to have repeatable tests. Hopefully this will land in Rookeries in the next couple of days.

Other Resources

Ansible Role for NGINX, UWSGI and Supervisor Released!

What better way to start 2015 than to release new software?

As part of my efforts to create Rookeries, a modern Python-based CMS, as a replacement for my WordPress sites, I am releasing an Ansible role to make it easier to set up WSGI apps on a private server.

The nginx-uwsgi-supervisor role is available on Ansible Galaxy.  This role sets up NGINX, UWSGI (the WSGI app server) and the supervisord infrastructure to make installing Rookeries or another WSGI app a breeze.  The goal is to make a Rookeries site as easy or easier to install and maintain than a WordPress site.

All the code for the role is hosted on Bitbucket, and mirrored on Github.

I am especially excited since this is my first ever, fully functional, open source release.  I hope others enjoy using it and that it makes their lives easier when building webapps in Python.

Now a Professional Pythonista at Points!

I have been working for the past month as a Software Development Engineer at Points International.  While my role is not officially as a Python developer, a large portion of my work is building Python applications, services and libraries.  I also get to develop in Java and maintain some very well engineered systems, so I get to deal with both worlds.  Even after a month, I am super excited to work at such a cool company and with awesome people.  It really feels like a bit of a dream job, in terms of what technology I get to use (Python, Linux desktops and distributed version control systems, w00t!) and the processes (yes Agile and proper software engineering totally works when done right).

But it is the people within the company that really make it shine.  I get to be surrounded by smart, savvy, and welcoming coworkers, including a number of important and active Pythonistas that I look up to.  My team is just amazing and supportive, and I feel that in this short time span I’ve become a much better developer thanks to them.  Even on stressful days I feel motivated and excited to come to work and give it my all.  I feel incredibly lucky and fortunate to be at Points. 🙂

Spring Cleaning for 2013

With Easter just around the corner and possibly spring coming shortly after (Canadians have to wait a bit longer for spring to properly arrive and winter to make her final exit), I figured that it would make sense to update my blog.  Many things have changed in the past few weeks.  Like we have a new pope, Pope Francis, just in time for Easter.  (I’m not going to weigh in on my opinions of the decision of the Conclave, other than I have mixed feelings.  And each passing day does not ease my general feeling of unease.)  Some things have not changed.  Like most things in the world I guess.

With the slow coming of warmer weather, I have a good excuse for a bit of spring cleaning and growing myself.  In terms of spring cleaning, I have meant to really organize my activities and my surroundings.  Unfortunately since I had to make do without my laptop for a few weeks, that has not helped me get more things done.  Especially when it comes to dealing with my overflowing inbox.  Apologies to everyone expecting me to get back to them.  I’m getting there slowly.

I did get to play around with setting up Python on my hosting environment and with Clojure.  Clojure, while definitely useful, still feels more like an academic exercise than industrial programming.  (Still one can write a full implementation of Snake/Nibbles in Clojure in under 100 lines of code?  Madness!)  Python on the other hand is too much fun to feel like work.  I considered using a static website generator like Nikola or benjen to port some of my websites.  But I think for kicks, I will go the route of using Flask and craft my own mini-site just because working with Python is such a joy.

One unfortunately necessary bit of spring cleaning will be changing Linux distros again.  It seems that Canonical is doing a fair bit of wild experimentation nowadays.  Too wild, and it smells like they are suffering from NIH (not invented here).  The idea to chuck out everyone’s hard work on replacing X with Wayland in favour of their own thing was just too much.  So it looks like I’m going back to openSUSE for good.  It is just a matter of when I get around to migrating all my systems over.  I have no real issue with Canonical doing what they want with their own distro Ubuntu.  I just don’t agree with the philosophy, and the needless experimentation, especially since I am quite happy with using a relatively standard KDE 4 desktop.

Hopefully once I finish all the spring cleaning I’ll get to finish up and show off some of the projects I’ve been working on.