<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Coffee on the Keyboard]]></title><description><![CDATA[James writes stuff about programming, mostly]]></description><link>https://coffeeonthekeyboard.com/</link><image><url>https://coffeeonthekeyboard.com/favicon.png</url><title>Coffee on the Keyboard</title><link>https://coffeeonthekeyboard.com/</link></image><generator>Ghost 2.31</generator><lastBuildDate>Wed, 25 Sep 2019 15:37:49 GMT</lastBuildDate><atom:link href="https://coffeeonthekeyboard.com/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Unclear Asks are Killing your Team]]></title><description><![CDATA[<p>We've all been in a meeting, wondering what the goal was, and why we were invited. Some of us have even gotten up, said something, and left. The truly bold may have suggested ending the entire meeting and reconvening with a plan.</p><p>We've also all gotten an email addressed to</p>]]></description><link>https://coffeeonthekeyboard.com/unclear-asks-are-killing-your-team/</link><guid isPermaLink="false">5c48fc42f95a6b00c082942b</guid><category><![CDATA[teams]]></category><category><![CDATA[distributed]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Thu, 24 Jan 2019 13:55:25 GMT</pubDate><content:encoded><![CDATA[<p>We've all been in a meeting, wondering what the goal was, and why we were invited. Some of us have even gotten up, said something, and left. 
The truly bold may have suggested ending the entire meeting and reconvening with a plan.</p><p>We've also all gotten an email addressed to six or seven people, assumed someone would handle it, and moved on.</p><p>I've seen a few people circulating <a href="https://medium.com/on-human-centric-systems/that-quick-email-you-just-sent-its-slowly-killing-your-company-73f9276b3c86">this Medium post</a> that blames asynchronous communication like email for wasting time. But the problem is <strong>not</strong> asynchronous communication; the <strong>problem is unclear responsibilities</strong>. And it manifests in all sorts of ways beyond email.</p><p>There are two facets of this: vague requests and diffuse responsibility.</p><h3 id="vague-requests">Vague Requests</h3><p>To start with the given example, an email to five people on a design team:</p><blockquote>Hey there! Wanted to get some quick feedback on this presentation; esp. slide 7. See attached. Thanks so much!</blockquote><p>Of these five people, no one knows what the author wants them to do.</p><ul><li>What is "quick feedback"? Is this 30% feedback where the whole structure is up for revision, or 90% feedback where they want a proofreader?</li><li>When do they need this?</li><li>Of the five people, how should they divide this? Should the lead designer look at the content while the junior designer punches up the visuals? Do all five need to look at everything?</li><li>Is this a collaboration, a sign-off, or an FYI?</li><li>What's the right way to provide feedback? Is it to edit the deck and send it back, or to provide notes? (NB: using attachments instead of collaborative tools is killing you, too.)</li></ul><p>The suggested solution is to lean toward synchronous meetings, in a case like this, preferring to ask one person for a few minutes of sync time. But that's a bandaid over the problem. 
<strong>Synchronous meetings won't save you from deciding and communicating what you need.</strong></p><h3 id="diffuse-responsibility">Diffuse Responsibility</h3><p>Asking something of too many people is the same thing as asking no one.</p><p>I've seen this happen over and over:</p><ul><li><em>Unmanaged mailing lists for inbound communication.</em> Who should answer? What happens if two people answer at the same time, or differently? If there's an action to be taken, who gets assigned to it? (If it's a full-time job to answer these, as in a customer service team, the list is probably managed as a queue and routed to different members somehow.)</li><li><em>Undirected requests for code review.</em> Who are the appropriate reviewers? When will they get to it? (Often this means a pull request or patch sits around until someone begs on Slack/IRC, at which point a dogpile of review happens, some of which take too long to be meaningful.)</li><li><em>Meetings with a lot of invitees.</em> Why is each person in the room? Can or should I delegate this, and get a summary? What are the agenda and goals?</li><li><em>Emails with too many people copied.</em> As in the original example. This one is pernicious because it often comes from upset stakeholders who think <em>someone</em> will answer them, or who are trying to escalate an issue but aren't using the "proper" channel for some reason—e.g. they don't know or don't trust it.</li></ul><p>Again, you need to <strong>decide and communicate what you need</strong>.</p><p>Each of these scenarios can be addressed in different ways. The mailing list (I've been on a <em>security@</em> mailing list with this issue) may need some kind of rotation or schedule. Code reviews should be assigned—not to too many people, each assigned reviewer should <em>need</em> to do the review or ask someone to replace them (GitHub's "suggested reviewer" and "requested reviewer" tools can help with this). Meetings <em>absolutely need</em> agendas. 
<a href="https://en.wikipedia.org/wiki/Responsibility_assignment_matrix">RACI matrices</a> are incredibly powerful tools for making clear why each person is included.</p><p>Panic emails are tricky if they originate outside your team, though you can still communicate expectations and areas of ownership internally, which can help.</p><h3 id="asynchronous-communication-is-critical">Asynchronous Communication is Critical</h3><p>It is worth your time to learn to do asynchronous communication well.</p><p>Increasingly, the nature of work is distributed. While sometimes this means across state lines and time zones, it may simply mean the ability to be productive from home with a flexible schedule sometimes. This is an <strong>important inclusion issue.</strong> If you want to hire senior team members (particularly women, who are often disproportionately responsible for childcare) who are more likely to have family obligations than younger people, an hour-long doctor's appointment or parent-teacher conference shouldn't be the difference between a productive day and a wasted one.</p><p>Interrupting flow also costs disproportionate amounts of productivity. This is one of the chief complaints about Slack and why I impress on my teams that anyone should feel OK muting and quitting it for a while to get uninterrupted work done.</p><p>And if you need a few minutes of someone's time, synchronously or not, there's no reason to be coy about the details. "Ping" doesn't get you anything but a "pong," while "hey, who's the right person to talk to about event details?" can get you an answer. Not only does excluding details <a href="https://theoutline.com/post/4225/if-you-just-message-hi-and-nothing-else-i-assume-im-getting-fired?zd=1&amp;zi=2bujlx3l">make people anxious</a>, it prevents them from preparing, even if it's just "ok putting on my feedback hat." Sending an email that says "hey can we chat for 15 minutes about some structural issues I'm having with this presentation?" 
lets the recipient know what they'll be doing, and compared to "hey can we chat for 15 minutes?" is far less terrifying.</p><p>The original post recommends using short, synchronous conversations, but these are effective only insomuch as they are forcing factors to actually do the work. It also recommends, well, doing the work: "Bring clarity to asynchronous asks", "Give every (t)ask a time limit", "Put structure around collaborative editing tools".</p><p>If you ask five people to get in a room or video call for 5 minutes (which will take 20) with no clearer question than "hey look at this plz," you'll get no better result. If you clarify your ask, you can still put it in an email.</p>]]></content:encoded></item><item><title><![CDATA[Remembering TodaysMeet]]></title><description><![CDATA[<p>Nearly 10 years ago, I launched a <a href="https://todaysmeet.com">small chat tool</a>.</p>
<p>Somehow, through a lot of luck, and a bunch of hard work, it managed to become popular among classroom teachers as a way to enable new kinds of conversations.</p>
<p>Now it's time to say goodbye and move on.</p>
<h3 id="thebeginning">The Beginning</h3>]]></description><link>https://coffeeonthekeyboard.com/remembering-todaysmeet/</link><guid isPermaLink="false">5ae4b4c49a4bc400226c267f</guid><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Sat, 28 Apr 2018 19:17:53 GMT</pubDate><content:encoded><![CDATA[<p>Nearly 10 years ago, I launched a <a href="https://todaysmeet.com">small chat tool</a>.</p>
<p>Somehow, through a lot of luck, and a bunch of hard work, it managed to become popular among classroom teachers as a way to enable new kinds of conversations.</p>
<p>Now it's time to say goodbye and move on.</p>
<h3 id="thebeginning">The Beginning</h3>
<p>In early 2008, social media was a vastly different place.</p>
<p>Twitter was barely over a year old—and lacked many of the features we take for granted today, like real search, and built-in hashtag support—but already making an impact on education. <a href="http://speedchange.blogspot.com/">My father</a> had seen the news about audiences &quot;live tweeting&quot; talks at South by Southwest, and asked me one day if I knew a good way to have a live stream of tweets with a given hashtag he could project on a wall during a class he was teaching.</p>
<p>Twitter's main search was still an external tool called <a href="https://www.crunchbase.com/organization/summize#section-overview">Summize</a> and it tended to lag behind real-time by a bit (Summize would be acquired by Twitter in mid-2008 and integrated shortly after that). &quot;But,&quot; I said, &quot;I've been playing around with real-time comments for blogs. I could get rid of the blog post and put it on a website for you.&quot; And that's the origin of TodaysMeet.</p>
<p>&quot;TodaysMeet&quot; was the first name we thought of that sounded reasonably close and whose <code>.com</code> domain was available. I wrapped the prototype I'd been building in my best attempt at a user interface, and put it on the web for the fall of 2008.</p>
<p>There were a few early experiments. At one point you could import tweets with a given hashtag into your room (at the time, Twitter was much more open about how tweets were displayed). But for the most part, TodaysMeet ran on a single server hosting a number of other projects, and I didn't think about it too much.</p>
<h3 id="themiddle">The Middle</h3>
<p>During the next few years, without me really noticing, TodaysMeet was gaining some traction among teachers. Those students in my dad's class all went into education. They brought TodaysMeet to conferences and into their classrooms. They shared it on their blogs and even in a couple of books. TodaysMeet owes all of its success—and I personally owe so much—to those early, eager, curious users.</p>
<p>Until, in 2012, I went to a Presidential debate party at a friend-of-a-friend's apartment. By the time I'd gotten to the party, my phone had already blown up with alerts that TodaysMeet was struggling under a <a href="http://blog.todaysmeet.com/presidential-debate-post-mortem-62/">huge increase in use</a>! I spent most of the debate in the kitchen, on a laptop borrowed from the host, editing TodaysMeet live, hoping to keep it <em>mostly</em> available.</p>
<p>Over the next few months, I <a href="http://blog.todaysmeet.com/want-to-help-test-todaysmeet-118/">rebuilt TodaysMeet from the ground up</a> and <a href="http://blog.todaysmeet.com/welcome-to-todaysmeet-2-0-140/">relaunched it in May 2013</a>.</p>
<p>The new iteration allowed me to add a few features teachers had been clamoring for, and in response, more and more teachers used and shared it. A year later, by spring 2014, TodaysMeet had grown significantly—and was costing me a decent amount to run—so I <a href="https://coffeeonthekeyboard.com/whats-next-1113/">left my job</a> to concentrate on TodaysMeet full-time.</p>
<p>2014 saw a series of huge changes to TodaysMeet: it became possible to <a href="http://blog.todaysmeet.com/sign-up-for-todaysmeet-and-take-control-228/">sign in</a> to manage your own rooms and <a href="http://blog.todaysmeet.com/moderate-your-rooms-273/">moderate the comments</a>. <a href="http://blog.todaysmeet.com/embed-todaysmeet-rooms-into-your-class-websites-and-lmses-309/">Rooms became embeddable</a>. And the biggest: a premium subscription option called <a href="http://blog.todaysmeet.com/teacher-tools-is-now-available-358/">Teacher Tools</a>. For the first time, TodaysMeet could sustain itself financially.</p>
<p>TodaysMeet has always been <a href="https://jamessocol.com/">just me</a>. I've done everything from answer the emails and tweets to design and build new site features to answer a pager when it goes down. While I cannot even begin to express how much I appreciate the opportunity the users have given me to build a product that meant so much to so many—I am humbled to be part of hundreds of thousands of classrooms and impact the lives of literally millions of students—it has been tough, at times, to keep up with it. TodaysMeet makes enough money to keep itself going and pay for the occasional vacation, but not enough to make it my full-time job. And having no one else has meant I have been on-call essentially 24/7 since 2012. (For example, I haven't traveled without a laptop since then, and even so, there has been downtime while I was on a plane.)</p>
<p>So, in 2015, I went back to work full-time and TodaysMeet became a nights-and-weekends project again. The pace of changes definitely slowed down. I have a huge backlog of projects, small and large, that I'd hoped to build.</p>
<h3 id="theend">The End</h3>
<p>Protecting the privacy of the students who use TodaysMeet has always been one of my highest principles. At the same time, teachers need both accountability and ease of use, which has required an incredibly difficult balancing act. And it must be done while adhering both to the letter and the spirit of the law, whether it be <a href="https://www.ftc.gov/enforcement/rules/rulemaking-regulatory-reform-proceedings/childrens-online-privacy-protection-rule">COPPA</a> or state laws in the US or, now, the <a href="https://www.eugdpr.org/">GDPR</a> in Europe. I have been an advocate for privacy and user control of data in general, but when dealing with the youngest users, it takes on a whole new urgency.</p>
<p>The GDPR is an important step in the evolution of our rights of privacy and over our data on the internet—even if there are issues with some of the details. However, for TodaysMeet, it has put me in an impossible position.</p>
<p>If TodaysMeet could ever be made compliant with the GDPR, it would require completely re-imagining how it works and how participants interact with it. Compromises that were possible to comply with COPPA would no longer suffice. The ease of use that has made TodaysMeet so popular would almost certainly have to change. And doing all of this would require more time and effort than I have to give right now.</p>
<p>TodaysMeet is also an anonymous chat tool in an era of the internet I never anticipated. Periodically I've been called upon by schools to help with incidents of amazingly hurtful, abusive, and downright violent language. I have removed some really unbelievable material and I know that there's more out there that hasn't been found or reported. Every time it's a reminder how critical that accountability is, and it has made me wonder if TodaysMeet is actually a force of good in the world. I'm still not sure. I think it certainly was, for a while.</p>
<h3 id="thankyou">Thank You</h3>
<p>TodaysMeet has taught me more than anything else I have ever done. I have done product work, design, development, operations, database administration, security, business ops, customer service, social media, marketing, and more (with varying levels of success, of course). It is an incredible privilege to have had this opportunity, even if it has sometimes felt like an obligation.</p>
<p>I'm incredibly proud of a lot of the work I've done within TodaysMeet and hope to be able to leverage or open source a lot of the technology, and blog about parts that are more practice than code.</p>
]]></content:encoded></item><item><title><![CDATA[PIEfection Slides Up]]></title><description><![CDATA[<p>I put <a href="https://github.com/jsocol/talks/tree/master/2016-01-13-manhattanjs-pie">the slides for my ManhattanJS talk, &quot;PIEfection&quot;</a> up on GitHub the other day (sans images, but there are links in the source for all of those).</p>
<p>I completely neglected to talk about the <a href="https://en.wikipedia.org/wiki/Maillard_reaction">Maillard reaction</a>, which is responsible for food tasting good, and specifically for browning</p>]]></description><link>https://coffeeonthekeyboard.com/piefection-slides-up/</link><guid isPermaLink="false">5ab304f15edea3001882e651</guid><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Tue, 19 Jan 2016 20:45:34 GMT</pubDate><content:encoded><![CDATA[<p>I put <a href="https://github.com/jsocol/talks/tree/master/2016-01-13-manhattanjs-pie">the slides for my ManhattanJS talk, &quot;PIEfection&quot;</a> up on GitHub the other day (sans images, but there are links in the source for all of those).</p>
<p>I completely neglected to talk about the <a href="https://en.wikipedia.org/wiki/Maillard_reaction">Maillard reaction</a>, which is responsible for food tasting good, and specifically for browning pie crusts. tl;dr: Amino acid (protein) + sugar + ~300°F (~150°C) = delicious. There are innumerable and poorly understood combinations of amino acids and sugars, but this class of reaction is responsible for everything from searing steaks to browning crusts to toasting marshmallows.</p>
<p>Above ~330°F, you get caramelization, which is also a delicious part of the pie and crust, but you don't want to overdo it. Starting around 400°F, you get pyrolysis (burning, charring, carbonization), and below 285°F the Maillard reaction won't occur (at least not quickly), so you won't get the delicious compounds.</p>
<p>(All of these are, of course, temperatures measured in the material, not in the air of the oven.)</p>
<p>So, instead of an egg wash on your top crust, try whole milk, which has more sugar to react with the gluten in the crust.</p>
<p>I also didn't get a chance to mention a rolling technique I use, that I learned from a <a href="https://www.facebook.com/ellenspirerstaffing">cousin of mine</a>, in whose baking shadow I happily live.</p>
<p>When rolling out a crust after it's been in the fridge, first roll it out in a long stretch, then fold it in thirds; do it again; then start rolling it out into a round. Not only do you add more layer structure (mmm, flaky, delicious layers) but it'll fill in the cracks that often form if you try to roll it out directly, resulting in a stronger crust.</p>
<p>Those <a href="http://www.amazon.com/Cheese-Shaker-Pepper-Perforated-Stainless/dp/B007T40P28/ref=sr_1_1?ie=UTF8&amp;qid=1453236391&amp;sr=8-1&amp;keywords=pizza+shaker">pepper flake shakers</a>, filled with flour, are a great way to keep adding flour to the workspace without worrying about your buttery hands.</p>
<p>For transferring the crust to the pie plate, try rolling it up onto your rolling pin and unrolling it on the plate. <a href="http://www.amazon.com/Ateco-20-Inch-Length-French-Rolling/dp/B000KESQ1G">Tapered (or &quot;French&quot;) rolling pins</a> (or wine bottles) are particularly good at this since they don't have moving parts.</p>
<p>Finally, thanks again to <a href="https://twitter.com/renrutnnej">Jenn</a> for helping me get pies from one island to another. It would not have been possible without her!</p>
]]></content:encoded></item><item><title><![CDATA[New Year, New Role]]></title><description><![CDATA[<p>Yesterday was my last day at the job I’ve been in for most of 2015. Later this month <strong>I’ll be joining the digital team at Condé Nast!</strong></p>
<p>I’ve been interested in media, publishing, and journalism since I was a kid, growing up in the office of a</p>]]></description><link>https://coffeeonthekeyboard.com/new-year-new-role-1388/</link><guid isPermaLink="false">5ab304f15edea3001882e650</guid><category><![CDATA[conde-nast]]></category><category><![CDATA[javascript]]></category><category><![CDATA[job]]></category><category><![CDATA[Life]]></category><category><![CDATA[node.js]]></category><category><![CDATA[work-life]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Thu, 07 Jan 2016 11:44:50 GMT</pubDate><content:encoded><![CDATA[<p>Yesterday was my last day at the job I’ve been in for most of 2015. Later this month <strong>I’ll be joining the digital team at Condé Nast!</strong></p>
<p>I’ve been interested in media, publishing, and journalism since I was a kid, growing up in the office of a local weekly newspaper. Condé is an institution in the media world, and I’m excited to bring my experience from the tech industry to a new set of challenges. Media is in a fascinating, turbulent place right now, and Condé is in a unique position to explore and chart a path for the industry.</p>
<p>There are a number of things that clicked with this opportunity. I’m excited about the specific role and the kind of work I and the team will be doing; about working with JavaScript, and getting more involved with <a href="http://borojs.com/">the community</a>; about having more opportunity to talk about the work we’re doing; about diving into a whole new industry. I’ll talk more about the details once I start.</p>
<p>For the next couple of weeks, I’m going to read, <a href="http://manhattanjs.com/">give a talk</a> and try to hit some other meetups, play a little catch-up on open source work and email, wrap up a TodaysMeet project I’ve been stalled on, and maybe even finish another blog post (let’s not go crazy though).</p>
<p>Thanks to <a href="https://twitter.com/scherrymomin">Scherry</a> and everyone else who helped make this connection. Onward into 2016!</p>
]]></content:encoded></item><item><title><![CDATA[Open Source Update: Waffle]]></title><description><![CDATA[<p>I just pushed <a href="https://pypi.python.org/pypi/django-waffle/0.11">version 0.11 of Waffle</a>, the feature flipper for <a href="https://www.djangoproject.com/">Django</a>. It contains <a href="https://github.com/jsocol/django-waffle/compare/v0.10.1...v0.11">a number of code and documentation fixes</a> which you can also see in the changelog and <a href="https://github.com/jsocol/django-waffle/issues?q=milestone%3A0.11+is%3Aclosed">the milestone</a>.</p>
<p>I had to shuffle the milestones around a bit, though I haven’t yet updated the <a href="http://waffle.readthedocs.org/en/v0.11/about/roadmap.html">roadmap</a></p>]]></description><link>https://coffeeonthekeyboard.com/open-source-update-waffle-1358/</link><guid isPermaLink="false">5ab304f15edea3001882e64f</guid><category><![CDATA[Code]]></category><category><![CDATA[django]]></category><category><![CDATA[open source]]></category><category><![CDATA[python]]></category><category><![CDATA[waffle]]></category><category><![CDATA[webdev]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Wed, 21 Oct 2015 08:47:54 GMT</pubDate><content:encoded><![CDATA[<p>I just pushed <a href="https://pypi.python.org/pypi/django-waffle/0.11">version 0.11 of Waffle</a>, the feature flipper for <a href="https://www.djangoproject.com/">Django</a>. It contains <a href="https://github.com/jsocol/django-waffle/compare/v0.10.1...v0.11">a number of code and documentation fixes</a> which you can also see in the changelog and <a href="https://github.com/jsocol/django-waffle/issues?q=milestone%3A0.11+is%3Aclosed">the milestone</a>.</p>
<p>I had to shuffle the milestones around a bit, though I haven’t updated the <a href="http://waffle.readthedocs.org/en/v0.11/about/roadmap.html">roadmap</a> yet. Basically, I wanted to release the fixes that were already in master, but because master also contains breaking changes, it became 0.11, and everything else moves down one. Unfinished fixes in 0.10.2 are now in 0.11.1 (and potentially a few other issues). Finished things in 0.10.2 are in the new 0.11. The old 0.11 is now 0.12, 0.12 is now 0.13, etc.</p>
<p>The biggest change is how Waffle supports <a href="http://jinja.pocoo.org/">Jinja</a> templates. Instead of relying specifically on <a href="http://jingo.readthedocs.org/">jingo</a>, Waffle now ships with a Jinja extension, that can be used by jingo, <a href="https://pypi.python.org/pypi/django-jinja/">django-jinja</a>, or—theoretically—any other library for using Jinja with Django.</p>
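<p>As a rough illustration of what that looks like with django-jinja on Django 1.8, here is a minimal settings sketch. The dotted path <code>waffle.jinja.WaffleExtension</code> is an assumption on my part, as is the rest of the fragment; check the Waffle and django-jinja docs for the exact names before copying it.</p>

```python
# Hypothetical settings.py fragment for Django 1.8 with django-jinja.
# Assumption: the Waffle Jinja extension lives at "waffle.jinja.WaffleExtension".
# Note: listing "extensions" explicitly may replace django-jinja's defaults,
# so a real project would probably want to include those as well.
TEMPLATES = [
    {
        "BACKEND": "django_jinja.backend.Jinja2",
        "DIRS": [],
        "APP_DIRS": True,
        "OPTIONS": {
            "extensions": [
                "waffle.jinja.WaffleExtension",
            ],
        },
    },
]
```

<p>The point is only that the extension is registered with whichever Jinja integration you use, rather than being tied to jingo.</p>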
<p>Changing Jinja support was the only thing we had to do to get the tests running—and passing!—on Django 1.8, so that’s (finally) supported. Thanks a ton to <a href="https://github.com/rlr">Ricky</a> for writing a lot of the multiple-Django-version testing code, and <a href="http://www.bluesock.org/~willkg/blog/">WillKG</a> as always for help and feedback on all of those changes.</p>
<h3 id="waffleneedshelp">Waffle Needs Help</h3>
<p>It’s easy to pretend Waffle is in a great place when a new version has just come out, but it’s not true. Will has stepped up and been helping out—for which I’m unspeakably grateful—but Waffle in particular could use another set of hands and eyes.</p>
<p>Ideally, I’d love to find one or two people who</p>
<ol>
<li>rely on Waffle in production,</li>
<li>on up-to-date versions of Django (1.7, 1.8), and</li>
<li>have time to write some new code and, mostly, to help wrangle pull requests.</li>
</ol>
<p>There are a bunch of open PRs and issues that need triage, and possibly updates and merges. There are also a bunch of open issues that don’t have PRs yet, especially in <a href="https://github.com/jsocol/django-waffle/milestones/0.11.1">0.11.1</a>, which really need to be addressed.</p>
<p>Waffle has a fairly healthy community. Probably the least healthy bit right now is its leadership. I hope there’s someone in it eager to step up and help us fix that. If you are, email me: <a href="mailto:me@jamessocol.com">me@jamessocol.com</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Open Source Update: Jingo]]></title><description><![CDATA[<p><em>Update: Jingo is <strong>no longer supported</strong> and was deprecated in favor of <code>django-jinja</code> at the end of May 2016.</em></p>
<p>As part of <a href="http://coffeeonthekeyboard.com/open-source-update-bleach-1359/">shuffling around open source responsibilities</a>, I’m going back to being the sole maintainer of <a href="https://github.com/jbalogh/jingo">Jingo</a> for now.</p>
<p>I just pushed <a href="https://pypi.python.org/pypi/jingo/0.8">Jingo 0.8</a> to PyPI. This executes</p>]]></description><link>https://coffeeonthekeyboard.com/open-source-update-jingo-1373/</link><guid isPermaLink="false">5ab304f15edea3001882e64e</guid><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Tue, 20 Oct 2015 08:10:07 GMT</pubDate><content:encoded><![CDATA[<p><em>Update: Jingo is <strong>no longer supported</strong> and was deprecated in favor of <code>django-jinja</code> at the end of May 2016.</em></p>
<p>As part of <a href="http://coffeeonthekeyboard.com/open-source-update-bleach-1359/">shuffling around open source responsibilities</a>, I’m going back to being the sole maintainer of <a href="https://github.com/jbalogh/jingo">Jingo</a> for now.</p>
<p>I just pushed <a href="https://pypi.python.org/pypi/jingo/0.8">Jingo 0.8</a> to PyPI. This executes on a plan I developed over the summer:</p>
<ul>
<li>0.7 supports Django &lt;1.7</li>
<li>0.8 supports Django &gt;=1.7</li>
</ul>
<p>If you’re using 1.4ESR, or haven’t moved off 1.6 yet, stick with Jingo 0.7. (But this is a good time to move off 1.6.) There is a <code>v0.7.x</code> branch, in case there are fixes to backport to it, but I don’t really anticipate doing so.</p>
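<p>The version split above can be sketched as a tiny helper. This is only an illustration: the function name and the returned strings are made up for this post and are not part of Jingo.</p>

```python
# Hypothetical helper (not part of Jingo): pick the Jingo release series
# for a given Django version, following the split listed above.

def jingo_series_for(django_version):
    """Return "0.8" for Django 1.7 and later, otherwise "0.7".

    django_version is a tuple such as (1, 6) or (1, 8, 4); only the
    major and minor parts are compared.
    """
    if tuple(django_version[:2]) >= (1, 7):
        return "0.8"
    return "0.7"


if __name__ == "__main__":
    for version in [(1, 4), (1, 6), (1, 7), (1, 8)]:
        print(version, "uses Jingo", jingo_series_for(version))
```

<p>With a real Django install you could pass <code>django.VERSION</code> to the helper, since it is also a tuple that starts with the major and minor version.</p>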
<p>I don’t know what the future of this project is at this point. <a href="https://pypi.python.org/pypi/django-jinja/1.4.1">django-jinja</a> actually follows the Django template loader changes supported in 1.8, and has a more active community and more mindshare. Like <code>import this</code> says, there should be one way to do it. I’m going to take a hard look at moving my dependencies over to django-jinja. <a href="http://bluesock.org/~willkg/blog/mozilla/input_django_1_8_upgrade.html#switching-from-jingo-to-django-jinja">Will wrote about the switch</a>, if you’re interested in moving.</p>
<p>But for right now, <em>Jingo is supported</em>, even if there’s not a lot of active development going on. If that changes, I’ll make a reasonably large stink about it.</p>
]]></content:encoded></item><item><title><![CDATA[Open Source Update: Bleach]]></title><description><![CDATA[<p>As of today, I transferred ownership of <a href="https://github.com/mozilla/bleach">Bleach</a> to the Mozilla organization, and the Mozilla WebDev team, in particular <a href="http://bluesock.org/~willkg/">Will Kahn-Greene</a> and <a href="https://twitter.com/jezdez/">Jannis Leidel</a>, are taking over maintenance of it.</p>
<p><strong>Huge</strong> thanks to Will, Jannis and <a href="https://twitter.com/Lonnen">Chris Lonnen</a>. Bleach needs more attentive stewardship, and I believe the project will be</p>]]></description><link>https://coffeeonthekeyboard.com/open-source-update-bleach-1359/</link><guid isPermaLink="false">5ab304f15edea3001882e64d</guid><category><![CDATA[bleach]]></category><category><![CDATA[Code]]></category><category><![CDATA[open source]]></category><category><![CDATA[python]]></category><category><![CDATA[security]]></category><category><![CDATA[webdev]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Fri, 16 Oct 2015 15:59:28 GMT</pubDate><content:encoded><![CDATA[<p>As of today, I transferred ownership of <a href="https://github.com/mozilla/bleach">Bleach</a> to the Mozilla organization, and the Mozilla WebDev team, in particular <a href="http://bluesock.org/~willkg/">Will Kahn-Greene</a> and <a href="https://twitter.com/jezdez/">Jannis Leidel</a>, are taking over maintenance of it.</p>
<p><strong>Huge</strong> thanks to Will, Jannis and <a href="https://twitter.com/Lonnen">Chris Lonnen</a>. Bleach needs more attentive stewardship, and I believe the project will be much better off with them at the helm. I wish them the best of luck in leading Bleach forward.</p>
<p>I haven’t forked the Mozilla repo, so if you go to <a href="https://github.com/jsocol/bleach">the old URL</a> you’ll be redirected to the new one. If I ever do fork it, I’ll put a note at the top of the README.</p>
<p>That’s the news, here’s the op-ed.</p>
<p>Giving over the reins of an open source project is bittersweet. I’m thrilled that Bleach has enough users that it needs attention. It <del>is</del> was my most-starred repo (though not my most-downloaded Python project, an honor which goes to <a href="https://github.com/jsocol/pystatsd">pystatsd</a>). It’s one of my first Python libraries—of which I’m both proud and horrified. It’s difficult to accept that it’s time to hand it over.</p>
<p>But it is.</p>
<p>The truth is that I don’t use Bleach anymore. None of the projects I work on need it. I’m not suffering from its problems, and it’s hard to prioritize Bleach over other work.</p>
<p>And if that didn’t make me a poor-enough choice as maintainer, I seem to disagree with the direction people who <em>actually use</em> Bleach want and need it to take. If I had my druthers, a lot of its current features would be dialed back, and it would focus on the “as a commenter, I want to use a small set of HTML to mark up this snippet” use case. But maybe that’s a different tool.</p>
<p>And even then, nothing I’m doing today would need it.</p>
<p>So it’s time to realize I was only guarding Bleach so closely because of ego, put that ego aside and let the project live on without me.</p>
]]></content:encoded></item><item><title><![CDATA[The Fallacy of "Microservice Infrastructures"]]></title><description><![CDATA[<p><a href="http://microservices-infrastructure.readthedocs.org/en/latest/"><img src="http://coffeeonthekeyboard.com/wp-content/uploads/2015/08/Pasted-image-at-2015_08_20-12_32-PM1.png" alt="is THAT all"></a></p>
<p>“Awesome, now, did someone remember to build a product?”</p>
<p>The problem with these advanced, complex <a href="http://microservices-infrastructure.readthedocs.org/en/latest/">microservice infrastructure</a> <a href="http://mantl.io/">ideas</a> is that they presume an impossible starting point: you have a large, senior operations team, big enough to operationalize and maintain all of these moving parts, and no product or existing infrastructure</p>]]></description><link>https://coffeeonthekeyboard.com/the-fallacy-of-microservice-infrastructures-1344/</link><guid isPermaLink="false">5ab304f15edea3001882e64b</guid><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Thu, 20 Aug 2015 12:31:40 GMT</pubDate><content:encoded><![CDATA[<p><a href="http://microservices-infrastructure.readthedocs.org/en/latest/"><img src="http://coffeeonthekeyboard.com/wp-content/uploads/2015/08/Pasted-image-at-2015_08_20-12_32-PM1.png" alt="is THAT all"></a></p>
<p>“Awesome, now, did someone remember to build a product?”</p>
<p>The problem with these advanced, complex <a href="http://microservices-infrastructure.readthedocs.org/en/latest/">microservice infrastructure</a> <a href="http://mantl.io/">ideas</a> is that they presume an impossible starting point: you have a large, senior operations team, big enough to operationalize and maintain all of these moving parts, and no product or existing infrastructure whatsoever.</p>
<p>If you’re starting something completely new, don’t spend your time on this. Use <a href="https://www.heroku.com/">Heroku</a> or whatever you can to minimize your ops time, and focus on your product and customers. Yes, it’s important to <a href="http://coffeeonthekeyboard.com/dont-do-write-1079/">prefer repeatable methods</a> of getting stuff done, but if you already know how to write a Python or Bash script or a Makefile? Go ahead and use it.</p>
<p><em>If</em> you are successful, <a href="https://github.com/jsocol/talks/tree/master/2015-01-07-queensjs-soa">more complexity will (probably) come</a>, and that’s a great problem to have. You’ll already have some requirements and some experience and, at this rate, there will be an entirely different set of brand new tools to help manage it. Pick and choose components that fit your needs.</p>
<p>If you already have a big ops team, you probably already have some infrastructure—I <em>guess</em> you could do it the other way but… Even spinning up a brand new project, if you’re in a company with existing infra, there’s probably a solid set of knowledge on how to do things. <a href="http://mcfunley.com/choose-boring-technology">Be boring</a> and use what you’ve got.</p>
<p>There’s always room for innovation, but while you were trying to figure out Mesos and keep your Consul cluster running, a <a href="https://twitter.com/_chriswhitten_/status/215873099499966466">bunch of people shipped a Rails config file</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Storing date, time, and timezone for future events]]></title><description><![CDATA[<p><a href="https://twitter.com/t">Tantek</a> recently published a blog post encouraging developers to <a href="http://tantek.com/2015/218/b1/use-timezone-offsets">use UTC with timezone offsets</a> for storing dates and times.</p>
<p>There is, however, an unfortunate nuance his post doesn’t include, and so I’m pointing it out here.</p>
<p>For “when <em>did</em> this happen?” and “what time is it <em>now</em>?”, UTC,</p>]]></description><link>https://coffeeonthekeyboard.com/storing-date-time-and-timezone-for-future-events-1339/</link><guid isPermaLink="false">5ab304f15edea3001882e649</guid><category><![CDATA[programming]]></category><category><![CDATA[time]]></category><category><![CDATA[timezones]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Tue, 11 Aug 2015 20:43:00 GMT</pubDate><content:encoded><![CDATA[<p><a href="https://twitter.com/t">Tantek</a> recently published a blog post encouraging developers to <a href="http://tantek.com/2015/218/b1/use-timezone-offsets">use UTC with timezone offsets</a> for storing dates and times.</p>
<p>There is, however, an unfortunate nuance his post doesn’t include, and so I’m pointing it out here.</p>
<p>For “when <em>did</em> this happen?” and “what time is it <em>now</em>?”, UTC, or UTC+offset, is perfect. Past events can be ordered easily.</p>
<p>However, for future events, “when <em>will</em> this happen?” it is, strictly speaking, <a href="http://www.creativedeletion.com/2015/03/19/persisting_future_datetimes.html">incorrect</a>. The (<a href="http://infiniteundo.com/post/25326999628/falsehoods-programmers-believe-about-time">main</a>) reason is that timezone offsets are not fixed, and do change. Not only do Daylight Saving Time (a.k.a. Summer Time, DST) policies impact the timezone offset (not to mention the <em>existence</em> of certain times, e.g. <code>2016-03-13T07:15:00-05:00</code> exists, but <code>2016-03-13T02:15:00 (America/New_York)</code> does not) but those policies can and do change.</p>
<ul>
<li>Chile <a href="http://www.timeanddate.com/news/time/chile-abolishes-dst-2015.html">abolished DST</a> in 2015 (and apparently changed their plan as late as January).</li>
<li>Russia <a href="http://www.bbc.com/news/blogs-news-from-elsewhere-28423647">did, too</a> last year.</li>
<li>Haiti <a href="http://www.timeanddate.com/news/time/haiti-new-dst-schedule-2013.html">started observing DST</a> in 2013.</li>
<li><a href="http://www.washingtonpost.com/news/the-fix/wp/2015/03/12/a-bunch-of-states-want-to-get-rid-of-daylight-saving-time-is-your-state-one-of-them/">A number of US states</a> are considering abolishing it. (How serious they are is, of course, up for debate.)</li>
</ul>
<p>This is why the W3C explicitly says <a href="http://www.w3.org/TR/timezone/#pastfuture">“at a minimum you will need the time zone, not merely an offset from UTC”</a> for future events.</p>
<p>Admittedly, as Tantek points out, these changes don’t happen <em>that</em> often. We disagree, though, whether that infrequency makes it worth knowing about.</p>
<p>Time is <a href="http://infiniteundo.com/post/25509354022/more-falsehoods-programmers-believe-about-time">incredibly hard to do right</a>. You have limited engineering time. Maybe given the requirements of your application or users, you want—or need—to choose not to worry about future changes to timezone offsets. But it’s important to be aware and make an informed decision, especially because if you only store the offset, you won’t have enough information to make the necessary updates when a change like this does happen.</p>
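<p>As a rough sketch of why the offset alone isn’t enough, compare the two storage options when the rules change after you save the event. Everything named here is hypothetical: <code>Example/Zone</code>, the two rule tables, and the <code>resolve()</code> helper are made up for illustration, and a real application would get its rules from the tz database rather than a dict.</p>
<pre><code class="language-python">from datetime import datetime, timedelta, timezone

# Hypothetical rule tables for a made-up zone, before and after a
# government cancels a planned switch back to standard time.
RULES_OLD = {"Example/Zone": timedelta(hours=-4)}
RULES_NEW = {"Example/Zone": timedelta(hours=-3)}


def resolve(wall_time, zone, rules):
    """Turn a stored wall-clock time plus zone name into a UTC instant."""
    local = datetime.fromisoformat(wall_time)
    return (local - rules[zone]).replace(tzinfo=timezone.utc)


# Option A: store only a UTC offset. The instant is frozen at save time.
frozen = datetime.fromisoformat("2015-06-01T09:00:00-04:00")

# Option B: store the wall-clock time and the zone name.
stored = ("2015-06-01T09:00:00", "Example/Zone")

before = resolve(stored[0], stored[1], RULES_OLD)
after = resolve(stored[0], stored[1], RULES_NEW)

print(frozen.astimezone(timezone.utc).isoformat())  # 2015-06-01T13:00:00+00:00
print(before.isoformat())                           # 2015-06-01T13:00:00+00:00
print(after.isoformat())                            # 2015-06-01T12:00:00+00:00
</code></pre>
<p>Under the old rules both options agree. Once the rules change, only option B can be re-resolved to the instant the attendees will actually experience as 9:00 local time.</p>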
<p>It’s complicated, but <a href="http://www.creativedeletion.com/2015/03/19/persisting_future_datetimes.html#rule-of-thumb">@laut’s rule of thumb</a> is a decent place to start.</p>
]]></content:encoded></item><item><title><![CDATA[An Object Caching Pattern for Django]]></title><description><![CDATA[<p>Increasingly I’ve been treating even RDBMSes like structured key-value stores. There are still foreign keys and relationships in there, but the access patterns are most commonly by some kind of “primary” key (not always the primary key on the table, but a natural one).</p>
<p><em>Normally when I do something</em></p>]]></description><link>https://coffeeonthekeyboard.com/an-object-caching-pattern-for-django-1294/</link><guid isPermaLink="false">5ab304f15edea3001882e646</guid><category><![CDATA[Back-end]]></category><category><![CDATA[cache]]></category><category><![CDATA[Database]]></category><category><![CDATA[django]]></category><category><![CDATA[programming]]></category><category><![CDATA[python]]></category><category><![CDATA[technical]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Thu, 07 May 2015 08:51:39 GMT</pubDate><content:encoded><![CDATA[<p>Increasingly I’ve been treating even RDBMSes like structured key-value stores. There are still foreign keys and relationships in there, but the access patterns are most commonly by some kind of “primary” key (not always the primary key on the table, but a natural one).</p>
<p><em>Normally when I do something in more than two projects I’ll put it into a library, but for once this honestly feels too small, so instead, here’s a blog post and a gist.</em></p>
<p>This makes object caching quick to implement and very effective. Here’s a pattern I’ve been using in Django models:</p>
<script src="https://gist.github.com/jsocol/dab48c9c4ab6388190ab.js?file=mymodel.py"></script>
<p>Looking up an object looks like:</p>
<p><code>obj = MyModel.get(some_key)</code></p>
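<p>If the embedded gist doesn’t load, here is a framework-free sketch of the same idea. It is not the gist itself: the <code>CACHE</code> and <code>TABLE</code> dicts, the <code>MISSING</code> sentinel, and the method bodies are stand-ins I’m assuming for illustration. In a Django model the dict operations would become calls to the cache framework and the ORM.</p>
<pre><code class="language-python"># Stand-ins (assumptions): one dict plays the cache, another plays the table.
CACHE = {}
TABLE = {"alice": {"name": "alice", "plan": "pro"}}
MISSING = "__missing__"  # sentinel so that misses can be cached too


class MyModel:
    @classmethod
    def cache_key(cls, key):
        return "mymodel:" + key

    @classmethod
    def get(cls, key):
        ck = cls.cache_key(key)
        if ck in CACHE:
            hit = CACHE[ck]
            return None if hit == MISSING else hit
        row = TABLE.get(key)  # the "database" read
        CACHE[ck] = MISSING if row is None else row
        return row

    @classmethod
    def save(cls, row):
        TABLE[row["name"]] = row
        cls.flush(row["name"])  # invalidate rather than write through

    @classmethod
    def flush(cls, key):
        CACHE.pop(cls.cache_key(key), None)


print(MyModel.get("bob"))  # None, and the miss is now cached
MyModel.save({"name": "bob", "plan": "free"})
print(MyModel.get("bob"))  # {'name': 'bob', 'plan': 'free'}
</code></pre>
<p>Invalidating on write, instead of writing the new value into the cache, is what keeps the risk of stale data low: the next read always goes back to the source.</p>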
<p>Advantages of this pattern:</p>
<ul>
<li>Straightforward to implement, can be factored into a mixin without much work.</li>
<li>Caches non-existent entries (“misses”).</li>
<li>Very high hit rate in many common cases.</li>
<li>Low risk of caching stale data.</li>
<li>No signals or other spooky action at a distance.</li>
<li>Easy to <a href="http://coffeeonthekeyboard.com/testing-with-djangos-cache-1229/">mock <code>get()</code> in tests</a>.</li>
</ul>
<p>Disadvantages:</p>
<ul>
<li>Subject to <a href="http://en.wikipedia.org/wiki/Thundering_herd_problem">thundering herd</a> when the read rate is very high or there are hot spots. This can be partially alleviated by updating <code>save()</code> and <code>delete()</code> to write to the cache, too, but that increases the probability of caching stale data.</li>
<li>No support for querysets or lists (intentional, as these are notoriously difficult to cache and invalidate correctly).</li>
<li>Can’t use queryset <code>update()</code> or <code>delete()</code> methods.</li>
</ul>
<p>This works well when <em>most</em> read access is by the same natural key. You could extend it to support multiple keys—e.g. a name and an integer ID, by defining methods <code>get_by_name(cls, name)</code> and <code>get_by_id(cls, pk)</code>, or similar, and then in <code>flush</code>, generating all the keys and using <code>cache.delete_many</code>. It works badly when most access is via related managers, e.g. <code>my_obj.something_set.all()</code>.</p>
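<p>The multiple-key extension could look roughly like this. Again a sketch under assumptions: the key formats, <code>keys_for()</code>, and the dict-backed <code>delete_many()</code> are hypothetical stand-ins, with <code>delete_many()</code> playing the role of <code>cache.delete_many</code>.</p>
<pre><code class="language-python">CACHE = {}  # stand-in for a cache backend (assumption)


def keys_for(obj):
    """Every cache key under which this object may have been stored."""
    return ["mymodel:id:%d" % obj["id"], "mymodel:name:%s" % obj["name"]]


def delete_many(keys):
    # Plays the role of cache.delete_many in this sketch.
    for key in keys:
        CACHE.pop(key, None)


def flush(obj):
    """Invalidate the object under all of its lookup keys at once."""
    delete_many(keys_for(obj))


obj = {"id": 7, "name": "alice"}
for key in keys_for(obj):
    CACHE[key] = obj  # pretend both lookups have been cached

flush(obj)
print(CACHE)  # {}
</code></pre>
<p>The important part is that <code>flush</code> knows every key, so a write can never leave one lookup path serving an old copy.</p>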
<p>The same pattern absolutely works outside of the Django ORM, but the specifics depend on how you’re accessing your DB. Personally, I like accessor functions that return dictionaries (e.g. <code>get_some_object(key)</code>).</p>
<p><strong>Update, 7 May</strong></p>
<p><a href="https://twitter.com/jezdez/status/596314761706479617">Jannis pointed out</a>, correctly, that this does introduce another call that can fail and thus it has implications for database transactions.</p>
<p>When I use this pattern, I typically enable <a href="https://docs.djangoproject.com/en/1.8/topics/db/transactions/#tying-transactions-to-http-requests">atomic requests</a>. Writes often cause side effects that need to propagate through various systems, so there’s usually more than one call that can fail. For the use cases I have today, atomic requests is enough. For others, more fine-grained transaction management is necessary.</p>
]]></content:encoded></item><item><title><![CDATA[For the love of Email]]></title><description><![CDATA[<p>Email is the whipping technology of communications. Everyone wants to <a href="https://www.google.com/search?q=replace+email">kill email</a>.</p>
<p>Email is also, for all its problems, fundamental to most modern business communication. While many teams rely more on real-time platforms like Slack or HipChat*, or even IRC, which definitely still has a posse, that only takes <em>back</em></p>]]></description><link>https://coffeeonthekeyboard.com/for-the-love-of-email-1267/</link><guid isPermaLink="false">5ab304f15edea3001882e645</guid><category><![CDATA[email]]></category><category><![CDATA[rant]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Fri, 20 Feb 2015 19:47:45 GMT</pubDate><content:encoded><![CDATA[<p>Email is the whipping technology of communications. Everyone wants to <a href="https://www.google.com/search?q=replace+email">kill email</a>.</p>
<p>Email is also, for all its problems, fundamental to most modern business communication. While many teams rely more on real-time platforms like Slack or HipChat*, or even IRC, which definitely still has a posse, that only takes <em>back</em> a function that email had begun to coöpt—mostly due to UI improvements in Gmail that made it reasonably good at near-real-time, short messages.</p>
<p>* <em>Let’s call these “real-time plus,” because unlike IRC, there’s a searchable history by default. An IRC client could happily expand gifs or emoji, but persistence isn’t part of the protocol. By opting to use a more complete tool, and not just a protocol, you get persistence, but less customization.</em></p>
<h3 id="emailsemantics">Email semantics</h3>
<p>Not all email clients are created equal, and not only because some are web-based, some are native applications, and some are apps. But there’s a fairly common baseline set of semantics:</p>
<ul>
<li>Read/unread. Very obvious.</li>
<li>“Star” of some sort.</li>
<li>“Archive”, which sometimes is part of</li>
<li>Move to “Folder”.</li>
<li>“Tag”, often, but not always.</li>
<li>“Delete”.</li>
</ul>
<p>These are the things an email can be: Read or Unread, Starred or Unstarred, Archived or Not Archived, Filed or Unfiled, Tagged (with 𝓍) or Untagged, Deleted or Not Deleted.</p>
<p>Usually, not all of these are present. Sometimes you can Archive, other times you “archive” by Filing. Sometimes you can create Folders, other times you can create Tags.</p>
<p>Let’s break those down even further, combining some of them.</p>
<ul>
<li>Read/unread.</li>
<li>Starred/unstarred.</li>
<li>Inbox/Not-Inbox.</li>
<li>Tagged/Not-Tagged.</li>
<li>Deleted.</li>
</ul>
<p>This set of semantics has been incredibly powerful. So powerful that if you ask 3 people how they organize their email, you’ll probably get 7 answers. Everyone can take this baseline and adapt it to work, personal, side-business, as needed.</p>
<h3 id="apersonalexample">A personal example</h3>
<p>What do I do? Most of you will probably think I’m insane, but here it is. (Or just skip to the next section.)</p>
<p>My inbox is my TODO list. (I’ve tried other TODO lists and always end up just copying back and forth from email.) If there is a TODO for me in an email—even if that’s just answering it—it stays in the inbox. If there’s something I will need later, it gets starred (tickets are the big thing). If there’s nothing left for me to do, it gets archived. If it’s something automated, it gets trashed (with some exceptions).</p>
<p>I’ve gotten more aggressive about deleting things, particularly marketing emails, since I used to just “archive everything and forget about it.”</p>
<p>I used to be less aggressive about archiving, but I found that things I still needed to do got pushed off the front page by things I didn’t need to do, that were just sitting there, waiting to “age out” to page 2. Now I archive everything. “Send &amp; Archive” is my default button.</p>
<p>I’ve also been trying to use Mailbox more to push things into the future. It works up to a point: I don’t always know how far to push something. I should combine that with a calendar.</p>
<h3 id="sowhat">So what?</h3>
<p>But I’m not here to decry real-time-plus team communication—I think it’s great, especially for distributed or remote-friendly teams.</p>
<p>And I’m not here to explain how you should use email. My point is that it is so prolific because it’s not very opinionated at all. Everyone bends it to their will. <strong>I don’t want to kill that.</strong></p>
<p>On the contrary, I want everything to go through my email.</p>
<p>For various reasons, I don’t have Facebook Messenger on my phone. I don’t know if it would matter if I did, because I’m pretty aggressive about turning off badges. (You know what gets to have badges? Email.)</p>
<p>When people send me DMs on Twitter (please don’t, I never pay attention to them) I get emailed, and I can see the contents of the message, which is great. But then the chain is broken, because I can’t reply.</p>
<p>Facebook notifications are even worse—or at least the last time I managed to get one—because all they told me was that I <em>had</em> a message.</p>
<p>GitHub is my favorite. They notify you, and let you respond via email, even if some parts of GitHub-flavored Markdown don’t work that way. You could have a real discussion about an issue without ever leaving your email client.</p>
<p>I’m not trying to break out of anyone’s precious walled garden. I don’t want anyone else to have to adopt my technology. I just want two things.</p>
<ol>
<li>I want my personal, weird, custom, honed-over-years method of managing communication threads. But I don’t really want Facebook to have to—or even try to—build the flexible, just-opinionated-enough playground that email already is. And</li>
<li>I want my communication todo list all in one place. If it’s spread out, I miss things. Or I check off the task of “reading” before realizing I should leave the task for “replying”, a two-sided job that email has baked in.</li>
</ol>
<h3 id="inmyday">In <em>my</em> day…</h3>
<p>One of the few things I’ll ever say that BlackBerry got unequivocally right was messaging. Way back when, on my BlackBerry Pearl, they had a combined view of all your email, text, and BBM messages. You could see, sort, act on everything with the same set of management semantics, regardless of the source. Some messages were shorter, some longer, some had primitive emojis. The medium <em>may</em> have been the message, but what I did <em>after</em> I got the message was medium-agnostic. It was great.</p>
<p>I’ve been looking for that level of integration for years. I’ll happily accept getting FB messages, DMs, WhatsApp, what-have-you, to work with me through email.</p>
<p>OK, I think that’s enough of a curmudgeonly Friday-night rant.</p>
]]></content:encoded></item><item><title><![CDATA[Digging into "that" Python error]]></title><description><![CDATA[<p>I think this is my most popular tweet ever:</p>
<blockquote class="twitter-tweet" lang="en"><p>&gt;&gt;&gt; foo = ([],)&#10;&gt;&gt;&gt; foo[0] += [1]&#10;TypeError: &#39;tuple&#39; object does not support item assignment&#10;&gt;&gt;&gt; foo&#10;&lt;&lt;&lt; ([1],)</p>&mdash; James Socol (@jamessocol) <a href="https://twitter.com/jamessocol/status/565906946780581889">February 12, 2015</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>I’ve</p>]]></description><link>https://coffeeonthekeyboard.com/digging-into-that-python-error-1268/</link><guid isPermaLink="false">5ab304f15edea3001882e644</guid><category><![CDATA[deep-dive]]></category><category><![CDATA[python]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Sun, 15 Feb 2015 17:47:37 GMT</pubDate><content:encoded><![CDATA[<p>I think this is my most popular tweet ever:</p>
<blockquote class="twitter-tweet" lang="en"><p>&gt;&gt;&gt; foo = ([],)&#10;&gt;&gt;&gt; foo[0] += [1]&#10;TypeError: &#39;tuple&#39; object does not support item assignment&#10;&gt;&gt;&gt; foo&#10;&lt;&lt;&lt; ([1],)</p>&mdash; James Socol (@jamessocol) <a href="https://twitter.com/jamessocol/status/565906946780581889">February 12, 2015</a></blockquote>
<script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
<p>I’ve known about this little quirk for a while, but I shared it because it still amuses me. It shocks and confuses people. Some people tried to make sense of it</p>
<blockquote class="twitter-tweet" data-conversation="none" lang="en"><p><a href="https://twitter.com/jamessocol">@jamessocol</a> Like this:&#10;i=([],)&#10;f=i[0]&#10;f+=[1]&#10;i&#10;([1],)&#10;Surely that&#39;s equivalent, &amp; it works. Like you said, corner case.</p>&mdash; Ramon Ferrandis (@ramoncreager) <a href="https://twitter.com/ramoncreager/status/566692839598592001">February 14, 2015</a></blockquote>
<p>And some were just appalled</p>
<blockquote class="twitter-tweet" data-conversation="none" lang="en"><p><a href="https://twitter.com/jamessocol">@jamessocol</a> Thanks for sharing. This is the first Python quirk which makes me think of PHP/JavaScript.</p>&mdash; Eason Chen (@easoncxz) <a href="https://twitter.com/easoncxz/status/566719670414503936">February 14, 2015</a></blockquote>
<p>So what actually <em>is</em> going on here? To answer, we need to take a dive into the world of CPython <em>opcodes</em> and spend some time with the <a href="https://docs.python.org/2/library/dis.html"><code>dis</code> module</a>.</p>
<p>Let’s start by disassembling and digging into a very simple set of statements:</p>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=1f.py"></script>
<p>And the output:</p>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=1o.dis"></script>
<p>The left-most number is the line number (line 1 is <code>def f1():</code>) then an offset we’ll ignore, then the opcode, then some information about the arguments to the opcode.</p>
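<p>If you want to produce a listing like this yourself, something along these lines should work. The body of <code>f1</code> is my reconstruction from the description in this post, not a copy of the gist, and it is written for Python 3, where opcode names and the layout of the listing differ from the 2.7 output discussed here.</p>
<pre><code class="language-python">import dis


def f1():
    a = 0
    a += 1


# Print the human-readable listing for whichever interpreter runs this.
dis.dis(f1)

# Or collect just the opcode names, in execution order.
names = [instruction.opname for instruction in dis.get_instructions(f1)]
print(names)
</code></pre>
<p>Expect the exact names to vary between CPython releases; the overall load, operate, store shape is what carries over.</p>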
<p>All of these opcodes operate on “the stack,” and most often on the value on the top of the stack, <code>TOS</code>. We load values onto the stack from constants, functions, and in-scope variables, and we can store values into in-scope variables. Sometimes we’ll deal with a couple of values at once, which we’ll call <code>TOSn</code>, where <code>n</code> is the depth in the stack. Since the stack is, after all, a stack, we can’t operate on <code>TOS1</code> unless we’re also operating on <code>TOS</code>.</p>
<p>Line 2 has two operations: <code>LOAD_CONST</code>, which pushes the constant <code>0</code> onto the stack. Then <code>STORE_FAST</code> takes the value off the top of the stack and stores it in the local variable <code>a</code>, which is a human name for variable number <code>0</code>.</p>
<p>Line 3 is a little more interesting: first we load the value from <code>a</code> onto the stack (<code>LOAD_FAST</code>), then load the constant <code>1</code> on top of it (<code>LOAD_CONST</code> again). The <code>INPLACE_ADD</code> operation pops the top two stack values, executes <code>__iadd__</code> (falling back to <code>__add__</code> for types, like integers, that don’t define it), and pushes the result back onto the stack. Then <code>STORE_FAST</code> takes the value off the top of the stack and stores it back into <code>a</code>.</p>
<p>The last two opcodes are the implicit <code>return None</code> from this function, so we’ll ignore those.</p>
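<p>To make the stack mechanics concrete, here is a toy interpreter for just the four opcodes used so far. It is a teaching sketch, not how CPython is implemented: <code>run()</code> and the tuple-based program format are invented for this example.</p>
<pre><code class="language-python">import operator


def run(program):
    """Execute a list of (opcode, argument) pairs on a toy stack machine."""
    stack, local_vars = [], {}
    for opcode, arg in program:
        if opcode == "LOAD_CONST":
            stack.append(arg)
        elif opcode == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif opcode == "STORE_FAST":
            local_vars[arg] = stack.pop()
        elif opcode == "INPLACE_ADD":
            tos = stack.pop()
            tos1 = stack.pop()
            stack.append(operator.iadd(tos1, tos))
        else:
            raise ValueError("unknown opcode: " + opcode)
    return local_vars


# The sequence described above for: a = 0; a += 1
program = [
    ("LOAD_CONST", 0), ("STORE_FAST", "a"),
    ("LOAD_FAST", "a"), ("LOAD_CONST", 1),
    ("INPLACE_ADD", None), ("STORE_FAST", "a"),
]
print(run(program))  # {'a': 1}
</code></pre>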
<p>There are a couple of interesting things to note here: one is that <code>INPLACE_ADD</code> is actually less “in-place” than we were led to believe. It still loads the value onto the stack and then stores a new value. It is not a single atomic operation, but rather 4 operations. Another is to recall that number types, like strings and tuples, are <em>immutable</em>.</p>
<p>Immutability is going to be very important, so keep that in mind. There is no integer method that changes the value of an integer. Any time we operate on an integer value, we need to store the result again. The in-place add returns a new value that <em>replaces</em> the current one; it doesn’t <em>change</em> the current one. There is no unary increment (<code>++</code>) operator in Python.</p>
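<p>A quick way to see this from the prompt, without any disassembly. This is plain Python 3 and only uses built-ins plus the standard <code>operator</code> module.</p>
<pre><code class="language-python">import operator

# int defines __add__ but not __iadd__, so += falls back to ordinary addition.
print(hasattr(int, "__iadd__"))   # False
print(hasattr(list, "__iadd__"))  # True

a = 0
b = a        # b refers to the same integer object as a
a += 1       # builds a new integer and rebinds the name a
print(a, b)  # 1 0

# operator.iadd is the function form of +=; for ints it returns a new value.
print(operator.iadd(0, 1))  # 1
</code></pre>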
<p>Let’s change <code>f1</code> slightly and see what happens:</p>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=2f.py"></script>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=2o.dis"></script>
<p>Line 2 is the same: push <code>0</code> onto the stack, then pop it and store it in local variable 0/<code>a</code>.</p>
<p>Line 3 starts off the same: we still load <code>a</code> and push it onto the stack, then load the constant 1 and push it onto the stack. But this time we call <code>BINARY_ADD</code>, which pops the top two values, adds them with <code>__add__</code>, and pushes the result. Then <code>STORE_FAST</code> pops and stores it in <code>a</code>. Finally the implicit return we’ll ignore.</p>
<p>OK, integers are fun, but what about a more interesting type, like a list?</p>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=3f.py"></script>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=3o.dis"></script>
<p>Start with the RHS of line 2: <code>BUILD_LIST 0</code> takes zero objects off the top of the stack, builds a list object, and pushes that list onto the stack. Then we <code>STORE_FAST</code> which puts that list into <code>a</code>.</p>
<p>On line 3, because we’re doing an “inplace” add again, we follow the same road map: <code>LOAD_FAST</code> pushes the value of the variable on the LHS back onto the stack (the list). Then we evaluate the RHS: <code>LOAD_CONST</code> to push <code>1</code> onto the stack, call <code>BUILD_LIST 1</code> to create a list containing the top value from the stack, and push the new list onto the stack. Then call <code>INPLACE_ADD</code> which, for lists, executes the extend operation and pushes the extended list onto the stack. Finally, pop that list and store it in our local variable.</p>
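<p>Unlike integers, lists really are changed in place by <code>+=</code>. A small check with a second name for the same list makes the difference from plain <code>+</code> visible (the variable names are just for this example):</p>
<pre><code class="language-python">a = []
alias = a
a += [1]  # list.__iadd__ extends the existing list and returns it
print(alias, alias is a)  # [1] True

b = []
alias_b = b
b = b + [1]  # list.__add__ builds a brand new list
print(alias_b, alias_b is b)  # [] False
</code></pre>
<p>In the first case the final store just rebinds <code>a</code> to the object it already pointed to.</p>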
<p>Now let’s compare that with calling <code>.extend()</code>:</p>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=4f.py"></script>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=4o.dis"></script>
<p>On line 3, we push the list from <code>a</code> onto the stack, then call <code>LOAD_ATTR</code> to pop it off the stack and push a reference to its <code>extend</code> attribute onto the stack. <code>LOAD_CONST</code> pushes <code>1</code> onto the stack, <code>BUILD_LIST</code> replaces it with a list <code>[1]</code>, then <code>CALL_FUNCTION 1</code> pops 1 value from the stack to use as a function argument, pops the function, and finally calls the function, which, in this case, extends the list with that value and pushes the return value onto the stack.</p>
<p>What happens next is interesting. We do <em>not</em> call <code>STORE_FAST</code>. Lists are <em>mutable</em>, and the value of the list has already changed. We just call <code>POP_TOP</code> to throw away the top stack value.</p>
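<p>The value being thrown away is simply the return value of <code>extend()</code>, which you can confirm directly:</p>
<pre><code class="language-python">a = []
result = a.extend([1])
print(result)  # None
print(a)       # [1]
</code></pre>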
<p>At this point, we can basically understand how this bizarre error both works and fails at the “same time,” but there’s one more type of operation we should cover.</p>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=5f.py"></script>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=5o.dis"></script>
<p>Line 2 builds a list just like we’ve been doing: <code>LOAD_CONST</code> to put <code>5</code> on the stack, <code>BUILD_LIST 1</code> to build a list with 1 value from the stack, and <code>STORE_FAST</code> to pop the list from the top of the stack and store it into <code>a</code>.</p>
<p>Line 3 <code>LOAD_FAST</code>s the list onto the stack, then <code>LOAD_CONST</code> the index (0) from the LHS onto the stack. <code>DUP_TOPX 2</code> takes the top two values from stack and duplicates them, in order, which makes the stack <code>a 0 a 0</code>.</p>
<p><code>BINARY_SUBSCR</code> takes the top two values from the stack, evaluates <code>TOS1[TOS]</code>, and pushes it back onto the stack, so the stack is now <code>a 0 5</code>. <code>LOAD_CONST</code> puts the value <code>1</code> onto the stack, and <code>INPLACE_ADD</code> pops the two integers, adds them, and pushes <code>6</code> back (<code>a 0 6</code>).</p>
<p><code>ROT_THREE</code> rotates the top three stack objects, so we reorder the stack to <code>6 a 0</code>. We need to do this to call <code>STORE_SUBSCR</code>, which implements <code>TOS1[TOS] = TOS2</code>, or <code>a[0] = 6</code>. <code>STORE_SUBSCR</code> replaces <code>STORE_FAST</code> in the other examples, and it changes the stored value of <code>TOS1</code>, so it only makes sense for <em>mutable</em> types.</p>
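<p>Spelled out as ordinary Python, <code>a[0] += 1</code> behaves roughly like the three calls below. This is a hand-written expansion using the standard <code>operator</code> module, with the 2.7 opcode each step corresponds to noted in comments; it is an approximation of the semantics, not generated output.</p>
<pre><code class="language-python">import operator

a = [5]

# a[0] += 1, one step at a time
container, index = a, 0                     # kept around for the final store
value = operator.getitem(container, index)  # BINARY_SUBSCR: 5
value = operator.iadd(value, 1)             # INPLACE_ADD: 6
operator.setitem(container, index, value)   # STORE_SUBSCR: a[0] = 6

print(a)  # [6]
</code></pre>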
<p>OK! Finally, we’ve got everything we need to look at our favorite bizarre behavior:</p>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=6f.py"></script>
<script src="https://gist.github.com/jsocol/1c1912c755512c41fc61.js?file=6o.dis"></script>
<p>Line 2: build a list and push it. (<code>[]</code>) Pop it and use it to build a tuple, then push the tuple (<code>([],)</code>) Pop the tuple and store it.</p>
<p>Line 3:</p>
<ul>
<li><code>LOAD_FAST</code> Push <code>a</code> onto the stack (<code>([],)</code>)</li>
<li><code>LOAD_CONST</code> Push <code>0</code> onto the stack (<code>([],) 0</code>)</li>
<li><code>DUP_TOPX 2</code> Duplicate the two stack values (<code>([],) 0 ([],) 0</code>)</li>
<li><code>BINARY_SUBSCR</code> Pop 2, calculate <code>TOS1[TOS]</code>, push the result (<code>([],) 0 []</code> Remember that the list inside the tuple and the list are the <em>same object</em>.)</li>
<li><code>LOAD_CONST</code> Push <code>1</code> onto the stack (<code>([],) 0 [] 1</code>)</li>
<li><code>BUILD_LIST 1</code> Pop the top, build <code>[TOS]</code>, and push the result (<code>([],) 0 [] [1]</code>)</li>
<li><code>INPLACE_ADD</code> Pop the top two, execute <code>TOS1.__iadd__(TOS)</code>, extending the list (<code>([1],) 0 [1]</code>) At this point <em>the extend operation has already happened</em>.</li>
<li><code>ROT_THREE</code> (<code>[1] ([1],) 0</code>)</li>
<li><code>STORE_SUBSCR</code> Attempt to pop the top three and store <code>TOS1[TOS] = TOS2</code>.</li>
</ul>
<p><code>STORE_SUBSCR</code> fails because tuples are immutable, which causes the <code>TypeError</code>. But, as we learned, <code>+=</code> is <em>not</em> an atomic operation. Part of it (the actual addition/extend) can succeed while another part (store) fails.</p>
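<p>Here is the whole thing reproduced in a script, first as written and then expanded by hand with the <code>operator</code> module so you can see which step fails. The names <code>bar</code>, <code>item</code>, and <code>store_failed</code> are only for this example.</p>
<pre><code class="language-python">import operator

foo = ([],)
try:
    foo[0] += [1]
except TypeError as exc:
    print("TypeError:", exc)
print(foo)  # ([1],)

# The same statement, one step at a time.
bar = ([],)
item = operator.getitem(bar, 0)  # the list inside the tuple
item = operator.iadd(item, [1])  # succeeds: that list is now [1]
store_failed = False
try:
    operator.setitem(bar, 0, item)  # fails: tuples have no item assignment
except TypeError:
    store_failed = True
print(bar, store_failed)  # ([1],) True
</code></pre>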
<p>Without using <code>dis.dis</code>, can we tell how Ramon’s example is different?</p>
<ul>
<li><code>i = ([],)</code> builds a list, pushes it onto the stack, pops it to build a tuple, pushes the tuple, and finally pops the tuple and stores it in the local variable <code>i</code>.</li>
<li><code>f = i[0]</code> is going to do something like <code>LOAD_FAST</code> to get the tuple onto the stack, <code>LOAD_CONST</code> to get <code>0</code> onto the stack, <code>BINARY_SUBSCR</code> to pop them and push <code>TOS1[TOS]</code>, a reference to the list, onto the stack, then <code>STORE_FAST</code> to pop it and store the reference in local variable <code>f</code>.</li>
<li><code>f += [1]</code> will <code>LOAD_FAST</code> the list from the LHS, <code>LOAD_CONST</code> to push <code>1</code> onto the stack, <code>BUILD_LIST 1</code> to pop <code>1</code> off the stack, build a list, and push it onto the stack. <code>INPLACE_ADD</code> to pop the top two, add (extend) them, and push the result onto the stack, and finally <code>STORE_FAST</code> to pop the extended list into <code>f</code>.</li>
</ul>
<p>Since <code>f</code> and <code>i[0]</code> point to the same list object, <code>i</code> is now <code>([1],)</code>. <code>i</code> hasn’t mutated but its mutable member has, and nothing ever tried to do a <code>STORE_SUBSCR</code> on an immutable object.</p>
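<p>Ramon’s version, run as a script, confirms that the two names share one list and that no error is raised:</p>
<pre><code class="language-python">i = ([],)
f = i[0]  # f and i[0] are two names for one list
f += [1]  # extends that list, then rebinds the local name f
print(i)          # ([1],)
print(f is i[0])  # True
</code></pre>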
<p>This is specifically CPython (I used 2.7.5) and the opcode representation is not guaranteed or part of the spec. It’s internal and subject to change, but it’s also an interesting introduction to an important aspect of how Python works.</p>
<p>The original, specific example is contrived and bad code, specifically constructed to fail. But it’s not <em>so</em> bad or <em>so</em> contrived that no one would ever do it, which is why it’s a particularly interesting one.</p>
]]></content:encoded></item><item><title><![CDATA[Visualizing the 2015 SotU on TodaysMeet]]></title><description><![CDATA[<p><em>Now that you know <a href="http://coffeeonthekeyboard.com/how-todaysmeet-works-1237/">how TodaysMeet works</a>, here’s part 2: using the message queue architecture to build the SotU visualizations.</em></p>
<p><a href="https://todaysmeet.com">TodaysMeet</a> has a <a href="http://blog.todaysmeet.com/case-study-meeting-during-debates-88/">long history</a> with <a href="http://blog.todaysmeet.com/2014-state-of-the-union-room-205/">political events</a>. During the 2012 Presidential debates, I was at a stranger’s apartment—guest of a guest—half-listening (binders full of <em>what</em></p>]]></description><link>https://coffeeonthekeyboard.com/visualizing-the-2015-sotu-on-todaysmeet-1246/</link><guid isPermaLink="false">5ab304f15edea3001882e643</guid><category><![CDATA[Back-end]]></category><category><![CDATA[d3]]></category><category><![CDATA[distributed]]></category><category><![CDATA[django]]></category><category><![CDATA[dq]]></category><category><![CDATA[javascript]]></category><category><![CDATA[leveldb]]></category><category><![CDATA[node.js]]></category><category><![CDATA[python]]></category><category><![CDATA[queues]]></category><category><![CDATA[streams]]></category><category><![CDATA[todaysmeet]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Thu, 05 Feb 2015 15:32:47 GMT</pubDate><content:encoded><![CDATA[<p><em>Now that you know <a href="http://coffeeonthekeyboard.com/how-todaysmeet-works-1237/">how TodaysMeet works</a>, here’s part 2: using the message queue architecture to build the SotU visualizations.</em></p>
<p><a href="https://todaysmeet.com">TodaysMeet</a> has a <a href="http://blog.todaysmeet.com/case-study-meeting-during-debates-88/">long history</a> with <a href="http://blog.todaysmeet.com/2014-state-of-the-union-room-205/">political events</a>. During the 2012 Presidential debates, I was at a stranger’s apartment—guest of a guest—half-listening (binders full of <em>what did he say</em>?) and using a hastily borrowed laptop to try to <a href="http://blog.todaysmeet.com/presidential-debate-post-mortem-62/">keep it running</a>. That was the event that convinced me I needed a new platform.</p>
<p>After the 2014 State of the Union, I’d had an idea to do a running word cloud synced to the video: look at what people in “sotu” rooms were saying, and create a new cloud every minute or so. I never got around to it, and after a while it didn’t feel very topical.</p>
<p>But on Sunday, January 18th, two days before the 2015 State of the Union, when I realized it was coming up, I decided to <a href="http://blog.todaysmeet.com/sotu2015-on-todaysmeet-426/">do it live</a>.</p>
<p><a href="http://knowyourmeme.com/memes/bill-oreilly-rant"><img src="http://i1.kym-cdn.com/entries/icons/original/000/000/792/do-it-live.jpg" alt="we'll do it live!"></a></p>
<p>To do something real-time with the SotU data, I would need to dip into the comment stream and get the text of the messages in rooms related to the speech. Then I’d need to do some analysis on that text and eventually push the data to clients, which would render some sort of visualization.</p>
<p>This ended up as 3 or 5 parts, depending how you count.</p>
<p><img src="http://coffeeonthekeyboard.com/wp-content/uploads/2015/02/How-SOTU-Works.png" alt="How SOTU Works"></p>
<p>A quick review from <a href="http://coffeeonthekeyboard.com/how-todaysmeet-works-1237/">yesterday’s post</a>:</p>
<ul>
<li>We’ll call the Django app <code>tm</code> for now. It is responsible for web, API, authentication, and talking to the database. (This is the old monolith I’m <a href="https://github.com/jsocol/talks/tree/master/queensjs-soa">slowly breaking up</a>.)</li>
<li><code>ekg</code> is a websocket service. Data flows in via the API and out via websockets. It’s JavaScript on NodeJS and uses <a href="http://primus.io">Primus</a>.</li>
<li><code>reflektor</code> uses an in-process queue to proxy and replay HTTP messages from <code>tm</code> to <em>n</em> <code>ekg</code> processes.</li>
</ul>
<h3 id="flatdb">flatdb</h3>
<p><a href="https://pypi.python.org/pypi/flatdb">flatdb</a> is a simple, <a href="http://flask.pocoo.org/">Flask</a>-based HTTP interface to <a href="https://github.com/google/leveldb">LevelDB</a>.</p>
<p>I built it as part of a PoC a while ago, and kept the code around. LevelDB has the ability to get a range of lexically sorted keys, so it’s a great way to keep data by timestamp (at least until September 33,658).</p>
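<p>To make the sorted-keys point concrete, here is a rough sketch of timestamp keys and a range scan. It does not use LevelDB or flatdb at all: a sorted Python list stands in for the ordered key space, and the 12-digit zero-padded key format is only my guess at what the “September 33,658” limit implies, not something taken from flatdb’s code.</p>
<pre><code class="language-python">import bisect

# Assumption: 12 decimal digits of Unix seconds, which run out in the year 33,658.
KEY_WIDTH = 12


def make_key(timestamp):
    """Zero-pad a Unix timestamp so that string order matches numeric order."""
    return str(int(timestamp)).zfill(KEY_WIDTH)


def key_range(sorted_keys, start, stop):
    """Return the keys in [start, stop), the way an ordered store scans a range."""
    lo = bisect.bisect_left(sorted_keys, make_key(start))
    hi = bisect.bisect_left(sorted_keys, make_key(stop))
    return sorted_keys[lo:hi]


keys = sorted(make_key(t) for t in (1421805600, 1421805605, 1421805610, 999999999))
recent = key_range(keys, 1421805600, 1421805610)
print(recent)  # ['001421805600', '001421805605']
</code></pre>
<p>Without the padding, a 9-digit timestamp would sort after a 10-digit one as a string, which is why the fixed width matters.</p>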
<p>Selfishly, I put flatdb on PyPI just so it was easier for my deploy infrastructure. I don’t know if I’ll have a good reason to use it again, but maybe it’ll be useful for someone. After all, I still get pull reqs to bizarre old PHP libraries I threw on GitHub.</p>
<p>There were two instances of flatdb running, one that I called <code>clouddb</code> which stored JSON blobs of word frequency data, and one called <code>tickdb</code> which included per-second counts.</p>
<h3 id="sotucollector">sotu-collector</h3>
<p>The original plan was to <a href="http://www.jasondavies.com/wordcloud/">generate word clouds</a>—hence <code>clouddb</code>—every few seconds. I picked 10 second buckets as a balance of frequent and interesting. (Spoiler: the word cloud visualization didn’t work out.)</p>
<p><code>sotu-collector</code> exposes a subset of the same internal API as <code>ekg</code>: it pays attention to new comments and changes in room data that cause it to flush the in-process cache. As far as <code>reflektor</code> is concerned, it is just another target.</p>
<p>Lots of things go from the API through <code>reflektor</code> to <code>ekg</code>: new comments, deletes, room topics, state changes like pause or close. <code>ekg</code> cares about all of them, but <code>sotu-collector</code> does not. The queue task does not consider 404 responses a failure, so it’s possible to implement only part of the API.</p>
<p><code>sotu-collector</code> handles almost all the heavy lifting (it could have been broken up into 3 or more services with different scaling and CPU requirements). It</p>
<ul>
<li>decides whether the comment is to a SotU-related room (users could opt any room in or out, but rooms with “sotu” or “stateoftheunion” in the name defaulted to “in”, so look up the room name and the opt-in state),</li>
<li>splits comments up into words,</li>
<li>runs them through a stemmer, a stopword list, and a profanity list,</li>
<li>builds a map of word → frequency,</li>
<li>serializes the map to JSON and flushes it to <code>clouddb</code> every 10 seconds,</li>
<li>counts the number of relevant messages,</li>
<li>runs a quick AFINN-111 <a href="https://www.npmjs.com/package/sentiment">sentiment analyzer</a> on each message,</li>
<li>flushes the number of messages and average sentiment (<code>sum(sentiment)/num_messages</code>) to <code>tickdb</code> every second.</li>
</ul>
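<p>The steps above can be sketched roughly like this. It is in Python rather than the JavaScript the real service uses, it skips stemming, and the word lists and sentiment scores are tiny made-up stand-ins for the real stopword list, profanity list, and AFINN-111 lexicon. <code>Bucket</code> and <code>is_relevant</code> are names I invented for the illustration.</p>
<pre><code class="language-python">import json
import re
from collections import Counter

# Hypothetical stand-ins for the real (much larger) lists and lexicon.
STOPWORDS = {"the", "a", "is", "and", "to"}
BLOCKED = {"darn"}
SENTIMENT = {"great": 3, "good": 3, "bad": -3, "terrible": -3}


def is_relevant(room_name, opt_in=None):
    """An explicit opt-in or opt-out wins; otherwise fall back to the room name."""
    if opt_in is not None:
        return opt_in
    name = room_name.lower()
    return "sotu" in name or "stateoftheunion" in name


class Bucket(object):
    """Accumulates word counts (flushed every 10s) and tick stats (flushed every 1s)."""

    def __init__(self):
        self.words = Counter()
        self.messages = 0
        self.sentiment_total = 0

    def add(self, text):
        tokens = re.findall(r"[a-z']+", text.lower())
        self.words.update(t for t in tokens if t not in STOPWORDS and t not in BLOCKED)
        self.sentiment_total += sum(SENTIMENT.get(t, 0) for t in tokens)
        self.messages += 1

    def flush_cloud(self):
        """Serialize the word map to JSON and start a fresh one."""
        blob = json.dumps(self.words, sort_keys=True)
        self.words = Counter()
        return blob

    def flush_tick(self):
        """Return the message count and average sentiment, then reset both."""
        average = self.sentiment_total / self.messages if self.messages else 0.0
        tick = {"messages": self.messages, "sentiment": average}
        self.messages = 0
        self.sentiment_total = 0
        return tick


bucket = Bucket()
comments = [
    ("sotu2015", "The economy is great"),
    ("lunch", "pizza is good"),
    ("sotu-class", "taxes and bad jokes"),
]
for room, text in comments:
    if is_relevant(room):
        bucket.add(text)

tick = bucket.flush_tick()
cloud = json.loads(bucket.flush_cloud())
</code></pre>
<p>Keeping the cloud counters and the tick counters separate is what lets the two flush on different schedules.</p>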
<p>That’s all together because of the time pressure. If it were a more permanent fixture, multiple copies of a collector service could filter relevant messages before sending them to one or more dedicated analysis services.</p>
<h3 id="sotupulse">sotu-pulse</h3>
<p>The other half of the server-side component was pushing the data out to clients. Since TodaysMeet already uses Primus to manage websocket connections, it was a natural choice.</p>
<p><code>sotu-pulse</code> is a simpler service than <code>sotu-collector</code>: it maintains streaming connections and some generic stats (e.g. the number of currently connected clients), periodically pulls new data from <code>clouddb</code> (one blob every 10 seconds) and <code>tickdb</code> (five ticks every 5 seconds), and pushes that data to the clients.</p>
<p>When it became clear I wouldn’t be able to get the word cloud to work in time, I decided to do a simple bar graph of the relative frequencies of the top 10 words, so the only clean-up <code>sotu-pulse</code> had to do was sorting the words by frequency and picking the top 10.</p>
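<p>That clean-up step is small enough to sketch in a few lines. This is Python, not the service’s actual JavaScript, and scaling each count against the most frequent word is my assumption about what “relative” meant for the bar widths.</p>
<pre><code class="language-python">from collections import Counter


def top_words(frequencies, limit=10):
    """Sort words by frequency, keep the top few, and scale counts to the largest."""
    ranked = Counter(frequencies).most_common(limit)
    if not ranked:
        return []
    largest = ranked[0][1]
    return [(word, count / largest) for word, count in ranked]


bars = top_words({"economy": 8, "jobs": 4, "taxes": 2}, limit=2)
print(bars)  # [('economy', 1.0), ('jobs', 0.5)]
</code></pre>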
<h4 id="theclient">The client</h4>
<p>The other moving part was the actual visualization.</p>
<p><a href="http://coffeeonthekeyboard.com/wp-content/uploads/2015/02/sotu.png"><img src="http://coffeeonthekeyboard.com/wp-content/uploads/2015/02/small-sotu.png" alt="the sotu page"></a></p>
<p>I wanted a place people could watch the speech, participate in their own discussions (especially if a teacher had set up a room for their class) and see the overall trends across all the rooms.</p>
<p>The White House has made huge strides in making the speech accessible: they streamed it live on YouTube. That made watching it easy (and of course you could pause the stream if it was on your TV).</p>
<p>TodaysMeet already supports <a href="http://blog.todaysmeet.com/embed-todaysmeet-rooms-into-your-class-websites-and-lmses-309/">embedding a room</a>, but I needed to write a small embeddable page to help users join their <em>own</em> rooms. (I’m going to use what I built here to make it easier to join rooms from the home page or mobile devices, so if nothing else, this whole thing was worthwhile for that!)</p>
<p>I built the frame for all of this very quickly, to get anything at all on the web to start promoting it. TodaysMeet technically supports IE8—or <a href="http://blog.todaysmeet.com/browser-support-update-no-longer-supporting-internet-explorer-8-440/">did, at the time</a>—but by skipping it for the SotU pages, I could use <code>calc()</code> and that was honestly the most helpful thing I could’ve done for myself.</p>
<p>I hadn’t directly used <a href="http://d3js.org/">d3</a> until approximately 10 hours before the SotU, so what I could do was pretty limited. I tried to make the word cloud work for a while, then gave up and decided that I could figure out a bar graph—but couldn’t manage animating the bars in time—and a couple of line graphs (that’s when I decided I had time to add frequency and sentiment graphs).</p>
<p>Fortunately there are some great examples of building <a href="http://bost.ocks.org/mike/bar/">bar</a> and line graphs, even <a href="http://bost.ocks.org/mike/path/">smoothly scrolling line graphs</a>, so I was just able to pull that off under the wire.</p>
<h3 id="howitwent">How it went</h3>
<p>Really well! My biggest concerns were that common words would be profanity—TodaysMeet has a lot of teenage users—and that it would fall over. In that order.</p>
<p>I opened up the “official” room and the viewer about an hour before the speech started. At first, there were plenty of moments when the lines were flat and the most common word was “hey”.</p>
<p>But once the speech started, the common words were on topic, the post volume kept moving, and the sentiment graph was interesting! Either people were behaving alright or the profanity filter worked well enough—it’s one of many things I would’ve instrumented if I’d had more time.</p>
<p>And the SotU processes kept up without too much trouble. With Primus, I’ve noticed that most relevant load numbers seem to scale linearly with the number of connections, but with a nice low slope. <code>sotu-collector</code> was handling the same volume of messages as <code>ekg</code>, and doing a comparable amount of work per message—though more of it was loop-blocking, CPU-bound text processing.</p>
<h3 id="lessons">Lessons</h3>
<p>I’m so happy I decided to do this: it was a ton of fun, and I took away a few things.</p>
<ul>
<li>Going through the process of adding a few new services was valuable—I haven’t done it since August, so it was a nice reminder and sanity check.</li>
<li>So was provisioning a new box; that’s not something I do all that often these days.</li>
<li>I haven’t actually asked reflektor to double up production messages before, so this was the first real validation that it works as designed and in tests.</li>
<li><code>calc()</code> works, unprefixed, in all modern browsers including IE9. This is going to be incredibly helpful (I can’t rely on flexbox yet, but <code>calc()</code> is going to solve some very real layout problems especially on embedded and mobile rooms).</li>
<li>Working through the UI to join a room by name, instead of by URL, helped me understand the parts I’ll need. Not being able to do this is probably the biggest single problem people run into, and limits how useful TodaysMeet is on mobile devices.</li>
<li>If you want a line graph that shows deviation from an axis that isn’t on the edge: draw the axis.</li>
<li>None of the graphs had scales, and that was fine. This wasn’t hard science, but it was interesting to see what got people talking, about what, and roughly how they felt.</li>
<li>Sentiment analysis is hard. The readily available tool was a bag-of-words analysis, and I really didn’t trust the scores for any of the individual messages I ran through it. But in aggregate it seems to have done a reasonably good job.</li>
</ul>
<p>Designing and building a product in two days is exhilarating and challenging. When you start any part you have no idea how long it will take. I found it very helpful to have a vision of where I wanted to go that was a little bit blurry. Video, conversation, analysis. What analysis? Well, I’ll see what I can do in time.</p>
<h3 id="goservices">Go services!</h3>
<p>This deserves its own post, but… Service-oriented systems are amazing, and using message queues to push data around asynchronously means it’s very easy to dip into a stream of data. It’s much easier to shut off a service when it’s isolated and not built in to other systems, which makes prototyping, or short-lived projects, very cheap.</p>
<p>Pushing data in only one direction helps reason about it. It’s the door you need to walk through to get to a queue-based architecture.</p>
<p>It’s critical to understand what guarantees your product actually <em>needs</em>. We tend to assume that what we’re doing is absolutely, life-and-death critical and any delay or disruption is completely unacceptable—because to <em>us</em>, it is. But our users may be perfectly happy with half a second delay, if they even notice it! (When you post a comment on TodaysMeet, the UI responds immediately, and backtracks if it runs into an error later. So even if a post took over a second, odds are the user wouldn’t know.)</p>
<p>The difference between 5 and 500 milliseconds is an entire universe, especially for 90-99%iles. So is the difference between “it’s bad” and “someone will die” if a message is dropped. Relax, be honest, and embrace asynchronicity.</p>
]]></content:encoded></item><item><title><![CDATA[How TodaysMeet Works]]></title><description><![CDATA[<p><em>I want to write about TodaysMeet’s 2015 <a href="http://blog.todaysmeet.com/sotu2015-on-todaysmeet-426/">State of the Union</a> site, but I realized I spent half the time on the existing architecture. So, this is part 1, and <a href="http://coffeeonthekeyboard.com/visualizing-the-2015-sotu-on-todaysmeet-1246/">here is part 2</a>!</em></p>
<p>A little over two years ago, I set out to <a href="http://blog.todaysmeet.com/welcome-to-todaysmeet-2-0-140/">completely replace TodaysMeet’s platform</a></p>]]></description><link>https://coffeeonthekeyboard.com/how-todaysmeet-works-1237/</link><guid isPermaLink="false">5ab304f15edea3001882e642</guid><category><![CDATA[django]]></category><category><![CDATA[dq]]></category><category><![CDATA[ekg]]></category><category><![CDATA[javascript]]></category><category><![CDATA[node.js]]></category><category><![CDATA[python]]></category><category><![CDATA[reflektor]]></category><category><![CDATA[todaysmeet]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Wed, 04 Feb 2015 14:39:24 GMT</pubDate><content:encoded><![CDATA[<p><em>I want to write about TodaysMeet’s 2015 <a href="http://blog.todaysmeet.com/sotu2015-on-todaysmeet-426/">State of the Union</a> site, but I realized I spent half the time on the existing architecture. So, this is part 1, and <a href="http://coffeeonthekeyboard.com/visualizing-the-2015-sotu-on-todaysmeet-1246/">here is part 2</a>!</em></p>
<p>A little over two years ago, I set out to <a href="http://blog.todaysmeet.com/welcome-to-todaysmeet-2-0-140/">completely replace TodaysMeet’s platform</a>. Then, <a href="http://coffeeonthekeyboard.com/irregular-update-04oct2014-1148/">over the past year</a>, I’ve taken the new platform and built it into a distributed, service-oriented—buzzword-compliant—application.</p>
<p>So here it is: how TodaysMeet actually works today. In <a href="http://coffeeonthekeyboard.com/visualizing-the-2015-sotu-on-todaysmeet-1246/">part 2</a>, I’ll give a very concrete example of how great this can be.</p>
<p><img src="http://coffeeonthekeyboard.com/wp-content/uploads/2015/02/How-TodaysMeet-Works.png" alt="a diagram with most of the moving parts"></p>
<p>The two big components are the TodaysMeet Django app, let’s call it <code>tm</code> for now, and the websocket connection service, called <code>ekg</code>.</p>
<p><code>tm</code> is written in Python, using Django. It currently serves as both the web app and API, including authenticating for both, though I’ve been trying to separate those as much as possible. It is solely responsible for talking to the database (and uses memcached where appropriate). Almost all non-static traffic goes through <code>tm</code>.</p>
<p><code>ekg</code> is written in JavaScript, using NodeJS (<a href="https://iojs.org/">for now</a>) and <a href="http://primus.io/">Primus</a> for websocket+fallback connections. The only bi-directional communication is connecting and joining a room; otherwise, everything comes in via the API and goes out via streams. It talks to <code>tm</code> via internal API endpoints when it needs information from the database and relies on in-process caching where appropriate.</p>
<p>Let’s look at the most common case, posting a new comment to a room. Lots of actions (closing or pausing a room, deleting a comment) follow a similar flow.</p>
<p>The <code>POST</code> request goes to <code>tm</code>. It starts a transaction, writes the comment to the database, then posts the comment to <code>ekg</code> before finally committing the transaction. <code>ekg</code> pushes the message to all the clients in that room.</p>
<p>Except, it doesn’t post <em>directly</em> to <code>ekg</code>. During deploys, <code>ekg</code> gets restarted, which can take over a second as it stops accepting new connections, tells the current clients to reconnect in a few seconds, shuts down and gets running again. Some posts would fail, causing users to see an error. It’s also important to scale to multiple <code>ekg</code> hosts without making the API slower, meaning it shouldn’t post to each <code>ekg</code> host.</p>
<p>There is an intermediate service called <a href="https://www.youtube.com/watch?v=7E0fVfectDo"><code>reflektor</code></a> to solve these problems. It runs on the same boxes as <code>tm</code>. <code>reflektor</code> accepts any HTTP request and responds immediately, then replays it against a list of downstream servers.</p>
<p>Because it responds immediately, the median time for <code>tm</code> to post the comment is just over 3ms. The 90%ile time is 4ms. It doesn’t matter how many downstream systems <code>reflektor</code> will eventually talk to, the <code>tm</code> “new message” API endpoint is extremely fast.</p>
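<p>The accept-now, replay-later idea can be sketched without any networking. This is an illustration in Python, not <code>reflektor</code>’s actual code: the <code>Reflektor</code> class, the <code>202</code> status, and the plain callables standing in for downstream <code>ekg</code> hosts are all assumptions of mine.</p>
<pre><code class="language-python">from collections import deque


class Reflektor(object):
    def __init__(self, downstreams):
        self.downstreams = list(downstreams)
        self.pending = deque()

    def accept(self, request):
        """Queue the request and answer right away, before any downstream is contacted."""
        self.pending.append(request)
        return 202

    def replay_pending(self):
        """Replay queued requests, in order, against every downstream target."""
        while self.pending:
            request = self.pending.popleft()
            for send in self.downstreams:
                send(request)


ekg_1, ekg_2 = [], []
proxy = Reflektor([ekg_1.append, ekg_2.append])

status = proxy.accept({"room": "sotu", "comment": "hello"})
queued_before_replay = len(proxy.pending)
proxy.replay_pending()
</code></pre>
<p>The caller’s cost is only the <code>accept</code> call, which is why the number of downstream targets doesn’t show up in the API’s response time. In a real service the replay would run in the background rather than being called by hand.</p>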
<p><code>reflektor</code> queues these requests in-process using a library I call <code>dq</code> for “dumb queue”—that I swear I will rename and open source or replace at some point. <code>dq</code>:</p>
<ul>
<li>stores objects in memory with no persistence options: if it crashes, or is hard-stopped, they are lost;</li>
<li>is a FIFO queue, objects get processed in order;</li>
<li>has a configurable task to run on each object that can be synchronous or asynchronous and decides what counts as success or failure;</li>
<li>uses <code>setImmediate</code> or <code>setTimeout</code> to avoid blocking the event loop;</li>
<li>automatically retries with back off on errors;</li>
<li>can be paused or put in “drain” mode;</li>
<li>goes to “sleep” when empty to limit CPU time and automatically wakes up on push;</li>
<li>and emits events like <code>'drained'</code> so I can do graceful restarts.</li>
</ul>
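<p>A few of those properties (FIFO order, a task that decides success or failure, retry with back off, and a “drained” notification) fit in a short sketch. This is a synchronous Python toy, not <code>dq</code> itself: the real library is JavaScript and yields to the event loop, and the class name, parameters, and doubling back-off schedule here are assumptions. Pausing, drain mode, and sleeping are left out.</p>
<pre><code class="language-python">import time
from collections import deque


class DumbQueue(object):
    def __init__(self, task, max_retries=3, base_delay=0.01, sleep=time.sleep):
        self.task = task  # called with each item; returns True on success
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.sleep = sleep  # injectable so tests do not have to wait
        self.items = deque()
        self.listeners = []

    def push(self, item):
        self.items.append(item)

    def on_drained(self, callback):
        self.listeners.append(callback)

    def run(self):
        """Process items in FIFO order, retrying failures with a doubling delay."""
        while self.items:
            item = self.items.popleft()
            attempt = 0
            while not self.task(item) and attempt != self.max_retries:
                self.sleep(self.base_delay * 2 ** attempt)
                attempt += 1
        for callback in self.listeners:
            callback()


failures = {"b": 1}  # "b" fails once, then succeeds
done, delays, events = [], [], []


def task(item):
    if failures.get(item, 0):
        failures[item] -= 1
        return False
    done.append(item)
    return True


queue = DumbQueue(task, sleep=delays.append)
queue.on_drained(lambda: events.append("drained"))
for item in ("a", "b", "c"):
    queue.push(item)
queue.run()
</code></pre>
<p>Note that an item that keeps failing is dropped after the last retry, which matches the “no guarantees” spirit of an in-memory queue.</p>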
<p>(There are better, more robust, COTS and FLOSS tools. I built <code>dq</code> because I didn’t want the operational overhead of running something like <a href="http://bitly.github.io/nsq/">nsq</a> on small VPSes—I really wanted the queue to run on <code>localhost</code> for each <code>tm</code> server to limit network time—I wanted to stick to HTTP when possible, and I did not really need its guarantees. But the idea is similar.)</p>
<p><code>tm</code> knows about its local <code>reflektor</code> and about those on its neighbors, so if the local <code>reflektor</code> is restarting—which happens in serial—<code>tm</code> will try a neighbor. Since adding <code>reflektor</code> and neighbor awareness, deploys are error-free and unnoticeable.</p>
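<p>The neighbor fallback amounts to trying a list of targets in order. Here is a small Python sketch of that idea; <code>post_with_failover</code> and the callables standing in for HTTP posts are hypothetical, not <code>tm</code>’s actual code.</p>
<pre><code class="language-python">def post_with_failover(request, reflektors):
    """Try the local reflektor first, then each neighbor; return the name that accepted."""
    for name, send in reflektors:
        try:
            send(request)
        except IOError:
            # Connection refused or similar: this one is probably restarting.
            continue
        return name
    raise RuntimeError("no reflektor accepted the request")


received = []


def restarting(request):
    raise IOError("connection refused")


used = post_with_failover(
    {"comment": "hi"},
    [("local", restarting), ("neighbor", received.append)],
)
print(used)  # neighbor
</code></pre>
<p>Because restarts happen one box at a time, at least one entry in the list should always be up.</p>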
<p>The biggest requirement for me has been speed—TodaysMeet is a real-time communication tool. So is it fast?</p>
<p><strong>Yes.</strong> Because <code>tm</code> posts the message to <code>reflektor</code> <em>before</em> it commits the transaction and responds to the request, I actually had to work around a problem when the new message arrives in the browser before the <code>POST</code> completes!</p>
<p>I said I’d talk about how great this is. <a href="https://twitter.com/theseanoc">Sean</a> has talked about <a href="https://www.hakkalabs.co/articles/bitlys-practical-strategies-building-distributed-systems">streams at Bitly</a> a bunch, and nearly everything he said applies here. I’ll get into it more in <a href="http://coffeeonthekeyboard.com/visualizing-the-2015-sotu-on-todaysmeet-1246/">part two</a>, but this architecture makes it incredibly easy to build up new systems or features without interfering with what’s already running.</p>
]]></content:encoded></item><item><title><![CDATA[Testing with Django's Cache]]></title><description><![CDATA[<p>I don’t love my solution to this problem, so I’m writing about it in hopes that someone has something better.</p>
<p>When you run tests with Django, you get an isolated test database. This can be wiped out and the consistency makes life a lot easier when you are</p>]]></description><link>https://coffeeonthekeyboard.com/testing-with-djangos-cache-1229/</link><guid isPermaLink="false">5ab304f15edea3001882e641</guid><category><![CDATA[django]]></category><category><![CDATA[python]]></category><category><![CDATA[testing]]></category><dc:creator><![CDATA[James  Socol]]></dc:creator><pubDate>Sat, 06 Dec 2014 15:11:14 GMT</pubDate><content:encoded><![CDATA[<p>I don’t love my solution to this problem, so I’m writing about it in hopes that someone has something better.</p>
<p>When you run tests with Django, you get an isolated test database. This can be wiped out and the consistency makes life a lot easier when you are running tests locally in a dev environment.</p>
<p>However, <a href="https://docs.djangoproject.com/en/1.6/topics/testing/overview/#other-test-conditions">caches are not cleared</a> or isolated. Since the database <em>can be</em> cleared, that means that any cache keys based on model primary keys are likely to recur.</p>
<p>That sucks.</p>
<p>Here are some of the compounding challenges:</p>
<ul>
<li>The application code needs to reach into the internals of the cache backend, for various reasons, so just using the <code>LocMem</code> backend doesn’t work. I don’t want to maintain a second custom cache backend.</li>
<li>My dev environment, where I’m also running tests, is a VM that is very close to production, on purpose. I’d rather not run a second <code>memcached</code> process if possible, at least not all the time.</li>
<li>The main <code>memcached</code> process also supports the locally running services that I use for manual and integration testing. It would be acceptable to flush this before or after tests but not ideal.</li>
</ul>
<p>I don’t want to maintain a new test runner but I’m happy to wrap the test run commands in a shell script (I already use an alias for my common options). I’m pretty much deciding between:</p>
<ul>
<li>Just flush the whole cache before and after running the tests. It’s a blunt but effective hammer. It does not require an extra settings module. But, anything I do while the tests are running changes the environment.</li>
<li>Add a test settings module with a random (at import time) <code>KEY_PREFIX</code>. This seems effective so far. It isolates each test run, and anything I do while the tests are running. It does require a new settings module. It leads to garbage in the cache, but the VM is usually OK on memory headroom. I can always manually flush it, too.</li>
</ul>
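<p>For the second option, the settings module could look something like this sketch. The file name, the backend, and the address are placeholders (my real backend is custom, as noted above); the only part that matters is a <code>KEY_PREFIX</code> computed once at import time.</p>
<pre><code class="language-python"># test_settings.py (hypothetical file name)
import uuid

# In a real project this module would start by importing everything
# from the normal settings module, then override CACHES.
CACHES = {
    "default": {
        # Assumption: the stock memcached backend and a local address.
        "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
        "LOCATION": "127.0.0.1:11211",
        # Evaluated once per import, so every test run gets its own key namespace.
        "KEY_PREFIX": "test-" + uuid.uuid4().hex,
    }
}
</code></pre>
<p>The test run would then point at it with Django’s <code>--settings</code> option, which is easy to hide inside the shell alias.</p>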
<p>Of these, I’ll probably do the latter. Real isolation from running processes seems worth the other maintenance overhead.</p>
<p>What do you do? It seems like nearly all roads lead to separate test settings, but which road do you take?</p>
]]></content:encoded></item></channel></rss>