PyCon US 2016 Talk – Pythons in a Container

At the end of May, I presented a talk at PyCon 2016, on using Docker with Python microservices. You can imagine the rush I felt getting to present on such a popular topic at such a large and important conference as PyCon! While it took me a while to recuperate after PyCon and Portland both of which were amazing. I would definitely do another talk at PyCon given the opportunity. Anyways I hope you enjoy watching the video of the talk! Below the video I also wrote about preparing for the talk, its reception, and a bit of the controversy that it stirred up below the fold. 🙂 (And I apologize for the lateness of this post, its been sitting in my backlog waiting to get finished for a few weeks now. 🙁 )

Video

Links

Abstract

Microservices and Docker are all the rage for developing scalable systems. But what challenges will you face when developing and deploying Python apps using Docker to production? This talk goes into the real-life lessons learned from creating, deploying and scaling Dockerized Python applications.

About the talk

Preparation for the Talk

PyCon talks definitely take quite a bit of time and effort to prepare. In my case, the talk took 3 major revisions before becoming the talk that I actually presented at PyCon. What started off as a intro to some of the concepts of Docker with some minor Python points, became more of a lessons learned targetted at intermediate to advanced developers. One of the things I wished I had (and I planned to but didn’t pull of) was to mention and thank my team for helping me preparing my talk. So thank you Kevin Qiu, Biniam Bekele, Yele Bonilla, and Gavin D’Mello for all your support, sitting me through three versions of my talk, and all the amazing feedback! I’ll make sure to include a slide with thanks next time.

Also I am very thankful for Jared Kerim from Mozilla, who presented at the local Python Toronto meetup about his team’s Docker setup. He was kind enough to let me use his example docker-django-template project. His example was also inspirational for my own docker-compose example using my CMS Rookeries. An example that I crafted and tested on the plane trip over to Portland. (After talking with other speakers, finishing up your presentation, notes and examples on the plane trip over is a proud traditional of PyCon and other conferences. :D)

Reception

Overall the reception of the talk was amazing! The talk turned out quite a crowd, in fact filling up most of the room. (I’m not sure of the capacity of the room but I estimate over 300 people attended). I was pretty nervous, but with the exception of a few stumbles, I think I pulled off the talk quite well. I really enjoyed some of the questions that were fielded during the Q&A session, and also privately afterwards. I wish could of answered some of the Docker Machine and Amazon ECS questions better, but I simply have not worked with both technologies long enough to give proper advice.

Controversy

The most surprising aspect of the talk was the controversy it stirred up. At the end of the Q&A you can hear some comments from a young lady about where I supposedly went horrbily wrong, and how there were tweets flying back and forth about it. I had turned off the notifications on my phone when I got up on stage, to avoid getting distracted. She persisted with telling (or trying to explain) what was wrong in the private gathering afterwards. Unfortunately she did not do a wonderful job of communicating, and I felt it took away time from others to ask their questions. It didn’t help her case that she admitted to being a novice at Docker. Please don’t that as an attendee, there are better ways to disagree and communicate that.

I later approached by a gentleman (thank you whoever you are), who mentioned I should go talk to the OpenShift guys since they had some concerns about my talk. News of the Twitter controversy worried me, because I hated the notion that I had gotten on stage and toled people to go and do the wrong thing. Especially when apparently I’m telling the opposite of what Glyph from Twisted said to do. After a brief chat (and a nice demo about their cool Kubernetes suite) from the OpenShift guys, I found out that Graham Dumpleton, the creator of mod_wsgi and who works on OpenShift had done a live tweeting commentary during my talk, where he disagreed with a few of my points. Long story short, eventually I was able to chat with Graham. He was a great sport and explained his points. Interestingly enough I had also talked with the folks at Docker. And they agreed with the points in my talk, and the logic behind my points. Essentially most of my points were based off the best practises they proposed.

Anyways I listed a few of Graham’s points with links to his blog posts (thanks again Graham!), and some of my quick thoughts on each one. A quick disclaimer about some of my points: the advice I gave worked for us in our datacentre, and that it might not work for others in other environments. It should work well, it might not be perfect, but it worked for us, and some of the folks at Mozilla. I gave a disclaimer at my other talk on a Ansible setup for WSGI apps at PyCon Canada, and I thought it was superfluous. But it turns out it is a useful thing to mention, and be explicit.

Errata

So the slide that caused a good portion of the controversy was the base image one. There I had provided an example Dockerfile on half the slide and discussed about base images and good Dockerfile practises, with points on the lower half. Now the example was meant as a toy and not necessarily complete. It is difficult, even impossible to present a well formated, perfect Dockerfile in that context. There is only so much room on a slide to fit both an illustrative example and some explanatory points. That is why I included links to some samples, that hopefuly did a better job of it.

Virtualenvs

Ah yes, the “enfant terrible” of my talk. 🙂 If you want to be controversial in your talk, mentioning something like this will get people’s attention. (Ironically, it was not my desire to stir up a controversy). Graham post a while back why you might want to use virtualenvs in your Dockerized app. It is a longish post, so I’ll give a shortened version. Basically when you base your image off some distro (say Ubuntu, Fedora or what not), there is a good chance of bringing in more Python packages in your system site packages than you expected. e.g. You’re building a Flask app, and the package maintainer included a version of Werkzeug in the base Python install, so now when you pip install Flask as part of your requirements you get the wrong version of Werkzeug.

And that is a valid point (with my example)… except if you use something like the official Python 2.7 base image… which installs just Python. I would argue that you would catch and resolve this issue, if you are auditing your Docker images. (And you should be always doing your due diligence and checking your base and resulting images. ) So yes… you don’t really need virtualenvs, but you can also use them if you are concerned that you might be getting conflicting packages.

Volume maps

Graham was right about the adding volume mapping in the Dockerfile being problematic. You should not define volume mounts in your Dockerfile, since they create extra files with sudo-like permissions on the host (see /var). In your own datacentre that isn’t a problem. A multi-tenant cloud provider like OpenShift, would disallow you to create those files. The documentation argument I provided is not all that useful, since you can document the mountpoints in the README that you would provide with the Docker image.

Base Images

Base images are hard to get right. And there is a lot of debate whether or not to use tooling instead of base images. Graham says his warpdrive tool will do that sort of a thing. At work we build out our own tooling for building “standard” service Dockerfiles, and that just add another level of abstraction. I prefer base images since it while not ideal, provides less levels of abstractions that can get in the way when you’re debugging your Dockerfile setup. But your mileage may vary here.

So yes, good base images are hard. Try not to build your own unless you find it really useful and you have a great base to work from.

Installing GCC/Build Tools

In an ideal world one ought not have to include GCC, Python dev headers and so on. Yes, one can pip install using wheels, but that doesn’t always work out.

Dockerfiles

Formatting of the RUN command. This is not one of Graham’s points, but it did come up. Yes, you should format the RUN commands, with a line for each command and using a \ line continuation for readability. My slide didn’t have enough physical space to do so. My Rookeries example does a better job of this.

Running as Root

Graham is right, you should not run containerized apps as root. That is a bad security practise that can lead to an attacker compromising your Docker host via a privileged account on your Docker container. Again a bad example on my part, I should of added a USER command and dropped the VOLUME line, or maybe rethought the use of an example.

UWSGI and the HTTP flag

No, you don’t need it and you should use the UWSGI protocol if you put an NGINX container before your WSGI container. I left the flag in to make sure the example Dockerfile was runnable. My bad on trying to get a good illustrative example, but it wouldn’t be a good idea in production unless you feel comfortable exposing UWSGI to the direct HTTP traffic.

Personally I’m not a fan of mod_wsgi + Apache, but Graham did point out he created mod_wsgi-express to simplify your life. If we continue to use Apache + mod_wsgi at work, then I’ll try to get us to use mod_wsgi-express too.

Final Thoughts

Anyways, I hope got everything right. Thank you for reading all the way to the end! 🙂

See You at PyCon US 2016!

If you’re wondering why I’ve been so quiet these past few weeks, it is because I’ve been busy preparing to go to PyCon US in Portland this year!

I am very excited not only to be attending, but I will be giving a talk at PyCon US this year! I will be talking about Dockerizing Python microservices, and some of the lessons we’ve learned along the way at work. My talk will be on the first day (Monday May 30th) at 3:15-3:45 PM (PST). Videos of the all PyCon talks should be available a few days after the talk.

Huge thanks to everyone at my workplace, Points, for making this possible for me!

Finally I will be around in Portland for a few days after the sprints as well. I have never been to Portland, so I want to check out some of the sights around there. Let me know via Twitter or email if you want to meetup with me while I’m there. 🙂

PyCon and Beyond

Last weekend I went to the first ever PyCon Canada.  What an incredible event!  I met so many friendly, amazing, smart and talented people.  I learned so many new things, that essentially my knowledge of Python, and web technologies practically jumped to the next level over the course of 2 days.  The entire event left so inspired, that I’ve been hacking on a Python web application that I hope to release into the wild sooner than later.  Another fun Django Toronto night followed, and I learned so much there too.  I really can not wait to try out all these new technologies.  I have not gotten in touch with everyone from PyCon and Django Toronto, that I would like to.  Just been so swamped.  But I promise to do so shortly.

In the meantime, I hinted at a Python based web application that I am working on.  I won’t go into the details of the site in this post, since I’d rather show it off than talk about it. 🙂  I plan on building it out using Flask, SQLAlchemy and Jinja2.  Currently I am working through those technologies to build a particular website, and hopefully mastering them as I work out the details.  More details will follow soon…

For those who missed out on PyCon CA 2012: the videos of the talk are already up!  Check them out!

The Python Scene, Accepting Javascript, and Lots of REST

Life continues at a breakneck pace.  Despite my best laid plans, the surest way of getting things done at the moment is in a sort of out-order manner.  However I did want to share a brief update on my journey into the Python and web application development world.  Instead of separate long form articles, I will briefly touch upon various topics.  (Note that this post itself has been in my work queue for over 3 weeks.  Mea culpae.)

The Python Scene in Toronto: Django Toronto and PyCon

I’ve now gone to two Django Toronto meetups and really enjoyed meeting with the Python community in Toronto.  The majority of web application developers in Toronto, deal with Java, Ruby or PHP (and of course Javascript).  However the Python community is there and growing.  They are very welcoming, and I have enjoyed each of the Django Toronto meetups I have gone to.  Also I feel like everytime I go there, I learn a lot and come back a better (or at least better informed) developer.  At the last meetup, I gained a deeper insight into crafting REST web services.  Something that I immediately applied to my day job the next day.  The one funny thing about Django Toronto, is that while the Django framework is an amazing platform, many Python developers move away from it once they scale up.  That is reflected in the meetups, since the topic of different platforms such as Flask come up.

The most exciting thing that is happening in the Toronto area for Python developers is PyCon.  This will be the first year that PyCon comes to Toronto.  I will definitely be there and I have been preparing by working on some real world Python applications.  If you will be there, I’d love to meet up with you!

How I Stopped Worrying and Started to Like Javascript

Javascript has been a language and platform I have avoided as much as possible.  If I can I try to stick to fancy tricks that one can pull off HTML5 and CSS3.  However one can only go so far without adding scripting to a web application.  Plain old Javascript still is a pain in the neck in a browser environment.  In other environments, it might be different as a scripting language.  However in such cases, I usually reach for Python rather than Javascript.  In a web app environment most of my experience has been with Sencha ExtJS.  While ExtJS is a remarkably powerful and flexible platform, it is problematic.  Aside from the licensing confusion of the original versions of ExtJS, I found that ExtJS contains a lot of automagic.  Building a ExtJS application consists more of configuring a number of components, and getting them to do more or less what you want to do.  Finding the right configuration has resulted in many stressful hours for myself and many other developers that I know.  Given all that developing with Javascript has left a bad taste in my mouth, until recently.

Two main things have changed my mind about Javascript.  Aside from the fact any sort of more advanced web application requires a full-on Javascript powered client, Javascript is not an entirely evil language for the web.  Doug Crawford (of JSLint fame) makes a good claim in his book Javascript: The Good Parts, that Javascript has some nice parts.  Yes, there are some ugly parts and needs a lot of love in the coming version.  But it is quite good for what is essentially a Schema-like functional/prototype language with Java-ish syntax.  Especially considering how quickly it was created at Netscape, it is not a terrible language.  If you are not convinced, I highly recommend watching the video below:

 

 

The other thing that changed my mind about Javascript is jQuery.  After fighting with ExtJS, the minimalist approach to Javascript development espoused in jQuery is refreshing.  I essentially learned the basics of jQuery overnight, trying to firefight a work project where office politics had gone amok.  (Aside: My take on politics in general is avoid it.  Get stuff done, not argue about it.)  Yes you have to know what you are doing with jQuery, and you don’t get a fancy MVC/MVT/etc. framework with jQuery.  But if you don’t fear writing Javascript by hand, and you want a library that abstracts the cross-browser stupidity, then jQuery is the answer.  For the next big web application that I am currently writing I will definitely be using jQuery.

Lots of REST and JSON

As mentioned earlier in the post, I have been dealing with a lot of REST service and JSON.  It is an infinitely nicer and simpler manner to talk with a server application, than via SOAP and XML.  (XML has its place, but it has been overused and abused in the Java world.)  When working with Java I’ve worked with Jersey and SpringMVC for building REST services.  Spring in general just works better, aside from its crazy arcane configuration.  In Python I’ve started working with Flask to handle building REST services, which I find a lot lighter than Django that sort of thing.  Also JSON is an awesome idea.  More people should use it for more of their data interchange needs. 🙂

IntelliJ IDEA Makes It All Better

Not to sound like a promotional campaign (since I work in what essentially amounts to the advertising industry, it happens more often I’d like) but one of the best decisions I’ve made recently is to switch IDEs.  I used to swear by Eclipse as the be-all-end-all of development environments.  Then I discovered PyCharm for Python.  Soon after that I got to meet Jessamyn Smith at a Django Toronto meetup.  While were talking about the joys of switching away from Subversion to Git–Jessamyn wrote a great article about her own experiences of migrating to Git–she convinced me to look into IntelliJ IDEA as it had a better interface for managing Git operations.  She was pretty convincing, as that is my primary IDE nowadays.  No more mucking around and wasting time with Eclipse’s temperamental setup.  Things.  Just.  Work.  Meaning I can do work.

Hitting the Flask

Somethings die hard.  One of those things is my own insistence on having lots of control over my computing environment and development platforms.  This led me to using Linux late in high school.  After playing around with Django, and wanting to build my own applications I found myself hunting down various odd ways to get around Django’s defaults.  Do not get me wrong, Django has a ton of nice pre-build features and default that just work.  Unfortunately being a web application developer, I have my own experiences, expectations and assumptions.  They are not always right.  However I prefer frameworks that I can plug-and-play and give me a finer grain of control.  (Hence I prefer using Spring in my Java web apps.)  So I’ve discovered Flask, a great micro-framework for Python.  I like how it makes web programming easy, without making a whole wack-load of design assumptions.  It very much reminds of me of the best parts of Spring, and apparently it is very “Ruby on Rails” like.