• 201407.16

    Nanomsg as the messaging layer for turtl-core

    I recently embarked on a project to rebuild the main functionality of Turtl in common lisp. This requires embedding lisp (using ECL) into node-webkit (or soon, Firefox, as node-webkit is probably getting dumped).

    To allow lisp and javascript to communicate, I made a simple messaging layer in C that both sides could easily hook into. While this worked, I stumbled on nanomsg and figured it couldn't hurt to give it a shot.

    So I wrote up some quick bindings for nanomsg in lisp and wired everything up. So far, it works really well. I can't tell if it's faster than my previous messaging layer, but one really nice thing about it is that it exposes file descriptors that can be easily monitored by an event loop (such as cl-async), making polling and strange thread<-->thread event-loop locking schemes a thing of the past (although cl-async handles all this fairly well).
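
    Here's roughly what the hookup looks like. This is just a sketch: the nn: functions and dispatch-message are hypothetical stand-ins for my bindings and app code, and I'm writing as:poll's signature from memory of the cl-async docs, so treat the details as approximate.

    (as:start-event-loop
      (lambda ()
        (let ((sock (nn:socket nn:+af-sp+ nn:+pair+)))
          (nn:bind sock "inproc://turtl")
          ;; NN_RCVFD hands back an fd that becomes readable when a message arrives
          (let ((fd (nn:getsockopt sock nn:+sol-socket+ nn:+rcvfd+)))
            (as:poll fd
                     (lambda (event)
                       (declare (ignore event))
                       ;; drain everything queued; :dontwait keeps the recv non-blocking
                       (loop for msg = (nn:recv sock :dontwait t)
                             while msg
                             do (dispatch-message msg)))  ; dispatch-message = app code
                     :poll-for '(:readable))))))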

    This simplified a lot of the Turtl code, and although right now it's only using the nanomsg "pair" layout type, it could easily be expanded in the future to allow different pieces of the app to communicate. In other words, it's a lot more future-proof than the old messaging system and probably a lot more resilient (a dedicated messaging library authored by the 0MQ mastermind beats hand-rolled, hard-coded simple messaging built by a non-C expert).

  • 201406.19

    Windows GUI apps: Bad file descriptor. (or how to convert a GUI app into a console app for easy debugging)

    Lately I've been neck-deep in embedding. Currently, I'm building a portable (hopefully) version of Turtl's core features in ECL.

    Problem is, when embedding turtl-core into Node-webkit or Firefox, any output that ECL writes to STDOUT triggers:

    C operation (write) signaled an error. C library explanation: Bad file descriptor.
    

    Well, it turns out Windows doesn't let you write to STDOUT unless a console is available, and even if you're using msys, Windows doesn't create a console for GUI apps. So here's a tool (in lisp, of course) that will let you convert an executable between GUI and console.
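
    The trick, by the way, is just flipping the "Subsystem" field in the executable's PE header between GUI (2) and console (3). Here's a minimal sketch of the idea (this is not death's actual tool; the offsets come from the PE spec):

    ;; e_lfanew (the offset of the "PE\0\0" signature) lives at 0x3C in the DOS
    ;; header; Subsystem sits 92 bytes past the signature (4-byte sig + 20-byte
    ;; COFF header + offset 68 into the optional header).
    (defun set-subsystem (path subsystem)  ; subsystem: 2 = GUI, 3 = console
      (with-open-file (s path :direction :io
                              :element-type '(unsigned-byte 8)
                              :if-exists :overwrite)
        (file-position s #x3c)
        ;; read the 4-byte little-endian e_lfanew pointer
        (let ((e-lfanew (loop for i from 0 below 4
                              sum (ash (read-byte s) (* i 8)))))
          (file-position s (+ e-lfanew 92))
          (write-byte (ldb (byte 8 0) subsystem) s)
          (write-byte (ldb (byte 8 8) subsystem) s))))

    ;; (set-subsystem "turtl.exe" 3)  ; now ECL can write to STDOUT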

    Seems to work great. Special thanks to death.

  • 201402.02

    Access your Firefox extension/add-on variables from the browser console

    It can be nice to access your FF extension's variables/functions from the browser console (ctrl+shift+j) if you need some insight into its state.

    It took me a while to figure this out, so I'm sharing it. Somewhere in your extension, do:

    // win_util comes from the addon SDK's window utils module
    var win_util = require('sdk/window/utils');
    var chromewin = win_util.getMostRecentBrowserWindow();
    chromewin.my_extension_state = ...;
    

    Now in the browser console, you can access whatever variables you set in the global variable my_extension_state. In my case, I used it to assign a function that lets me evaluate code in the addon's background page. This lets me gain insight into the background page's variables and state straight from the browser console.

    Note! This is a security hole. Only enable this when debugging your extension/addon. Disable it when you release it.

  • 201401.22

    Debugging javascript in the default Android browser

    Note that this may or may not work on your device. If you're running into an app that works in a real browser but not in your Android's stock browser, do this:

    1. Navigate to your app in the browser.
    2. In the same tab go to about:debug
    3. Reload (it may reload for you).
    4. Profit.

    This will show you errors that even window.onerror doesn't catch, which should help you narrow down your problem(s).

    Source: This stackoverflow answer.

  • 201310.03

    Debugging Firefox addons

    As you all know, I'm building Turtl, a browser extension for client-side encrypted note/file storage.

    Well, once in a while I need to debug the release version. There are docs scattered around detailing how to do this, but as usual with this type of thing you really need to do some digging.

    By default, Firefox's Browser Console only logs error events. You can change this to log any and all console.log() calls from your addon (and any other addon) by doing this:

    1. Go to about:config
    2. Search for the key extensions.sdk.console.logLevel
    3. If it exists, set it to "info", otherwise add a new string with the key extensions.sdk.console.logLevel and the value "info"

    Boom, all your addon's log calls now show up in the browser console.

  • 201310.01

    Curl CLI not sending full file data when using --data-binary

    I use curl to test out my HTTP libraries all the time. Recently, I ran into an issue where, when uploading a file (25mb) from curl on the command line to my common lisp app server, only about half the data showed up (12.5mb). I was doing this:

    curl -H 'Authorization: ...' -H 'Transfer-Encoding: chunked' --data-binary @/media/large_vid.mov
    

    Naturally, I assumed the issue was with my libraries. It could be the cl-async library dropping packets, it could be the HTTP parser having issues, and it could be the app server itself. I mean, it has to be one of those. Curl has been around for ages, and there's no way it would just drop data. So I spent days tearing my hair out.

    Finally, I ran curl with the --trace option and looked at the data. It provides a hex dump of everything it sends. It's not formatted perfectly, but with vim's block select and a few handy macros, I was able to get the length of the data being sent: 12.5mb. That's right, curl was defying me. There was no error in my code at all.

    I did a search online for curl not sending the full file data when using --data-binary. Nothing. So I looked over my options and found -T which looks surprisingly similar to --data-binary with the @ modifier. I tried:

    curl -H 'Authorization: ...' -H 'Transfer-Encoding: chunked' -T /media/large_vid.mov
    

    All 25mb came through (every byte accounted for).

    Conclusion

    If you're uploading files, use -T /path/to/file instead of --data-binary @/path/to/file. Note that -d/-D were also "broken."

  • 201309.22

    Turtl: an encrypted Evernote alternative

    Hi FORKS. Tuesday I announced my new app, Turtl for Chrome (and soon Firefox). Turtl is a private Evernote alternative. It uses AES (256-bit) encryption to obscure your notes/bookmarks before they leave the browser. What this means is that even if your data is intercepted on the way to the server or if the server itself is compromised, your data remains private.

    Even with all of Turtl's privacy, it's still easy to share boards with friends and colleagues: idea boards, todo lists, youtube playlists, etc. With Turtl, only you and the people you share with can see your data. Not even the guys running the servers can see it...it's just gibberish without the key that you hold.

    One more thing: Turtl's server and clients are open-source under the GPLv3 license, meaning anyone can review the code or use it for themselves. This means that Turtl can never be secretly compromised by the prying hands of hackers or government gag orders. The world sees everything we do.

    So check out Turtl and start keeping track of your life's data. If you want to keep up to date, follow Turtl on Twitter.

  • 201301.14

    Keyboard/mouse not working in Xorg on FreeBSD

    I recently installed FreeBSD 9.1-RELEASE on a VirtualBox VM to do some cl-async testing. I wanted to get Xorg running so I could edit code at a more "comfortable" resolution. I was able to get Xorg running fairly easily just by installing Xfce from /usr/ports.

    However, upon starting Xorg, my keyboard and mouse would not work. I tried many things: following the steps in the handbook, enabling/disabling hald, reconfiguring Xorg, etc. No luck. My Xorg.0.log was telling me that it couldn't load the kbd/mouse drivers. After snooping around some forums, I found the solution:

    • Install the x11-drivers/xf86-input-keyboard port
    • Install the x11-drivers/xf86-input-mouse port

    After doing this, all was right with the world. Just to clarify, I am using dbus/hald and more or less using the default configuration that Xorg -configure gave me.

  • 201211.26

    cl-async: Non-blocking, asynchronous programming for Common Lisp

    A while ago, I released cl-async, a library for non-blocking programming in Common Lisp. I've been updating it a lot over the past month or two to add features and fix bugs, and it's really coming along.

    My goal for this project is to create a general-purpose library for asynchronous programming in lisp, and I think I have achieved this. With the futures implementation finished, not only is the library stable, but there is now a platform to build drivers on top of. This will be my next focal point over the next few months.

    There are a few reasons I decided to build something new. Here's an overview of the non-blocking libraries I could find:

    • IOLib - An IO library for lisp that has a built-in event loop, only works on *nix.
    • Hinge - General purpose, non-blocking library. Only works on *nix, requires libev and ZeroMQ.
    • Conserv - A nice layer on top of IOLib (so once again, only runs on *nix). Includes TCP client/server and HTTP server implementations. Very nice.
    • teepeedee2 - A non-blocking, performant HTTP server written on top of IOLib.

    I created cl-async because the available libraries are either non-portable, not general enough, or have too many dependencies (or some combination of all three). I wanted a library that worked on Linux and Windows. I wanted a portable base to start from, and I also wanted tools to help make drivers.

    Keeping all this in mind, I created bindings for libevent2 and built cl-async on top of them. There were many good reasons for choosing libevent2 over other libraries, such as libev and libuv (the backend for Node.js). Libuv would have been my first choice because it supports IOCP in Windows (libevent does not), however wrapping it in CFFI was like getting a screaming toddler to see the logic behind your decision to put them to bed. It could have maybe happened if I'd written a compatibility layer in C, but I wanted to have a maximum of 1 (one) dependency. Libevent2 won. It's fast, portable, easy to wrap in CFFI, and on top of that, has a lot of really nice features like an HTTP client/server, TCP buffering, DNS, etc etc etc. The list goes on. That means less programming for me.

    Like I mentioned, my next goal is to build drivers. I've already built a few, but I don't consider them stable enough to release yet. Drivers are the elephant in the room. Anybody can implement non-blocking IO for lisp, but the real challenge is converting everything that talks over TCP/HTTP to be async. If lisp supported coroutines, this would be trivial, but alas, we're stuck with futures and the lovely syntax they afford.

    I'm going to start with drivers I use every day: beanstalk, redis, cl-mongo, drakma, zs3, and cl-smtp. These are the packages we use at work in our queue processing system (now threaded, soon to be evented + threaded). Once a few of these are done, I'll update the cl-async drivers page with best practices for building drivers (aka wrapping async into futures). Then I will take over the world.

    Another goal I have is to build a real HTTP server on top of the bare http-server implementation provided by cl-async. This will include nice syntax around routing (allowing REST interfaces), static file serving, etc.

    Cl-async is still a work in progress, but it's starting to become stabilized (both in lack of bugs and the API itself), so check out the docs or the github project and give it a shot. All you need is a lisp and libevent =].
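
    To give a taste of the API, here's a minimal TCP echo server sketch. I'm writing the calls (as:start-event-loop, as:tcp-server, as:write-socket-data) from memory of the docs, so double-check the signatures against the actual documentation:

    (ql:quickload :cl-async)

    (defun echo-server ()
      (as:start-event-loop
        (lambda ()
          (as:tcp-server "127.0.0.1" 9001
            ;; read-cb: called with the socket and whatever data arrived on it
            (lambda (socket data)
              (as:write-socket-data socket data))  ; echo it right back
            ;; event-cb: connection errors, EOF, etc
            (lambda (event)
              (format t "ev: ~a~%" event)))
          (format t "echo server listening on 9001~%"))))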

  • 201207.07

    cl-mongo and multithreading

    We're building a queuing system for Musio written in common lisp. To be accurate, we already built a queuing system in common lisp, and I recently needed to add a worker to it that communicates with MongoDB via cl-mongo. Each worker spawns four worker threads, each thread grabbing jobs from beanstalkd via cl-beanstalk. During my testing, each worker was updating a Mongo collection with some values scraped from our site. However, after a few seconds of processing jobs, the worker threads began to spit out USOCKET errors and eventually Clozure CL entered its debugger of death (ie, lisp's version of a segfault). SBCL didn't fare much better, either.

    The way cl-mongo's connections work is that it has a global hash table that holds connections: cl-mongo::*mongo-registry*. When the threads are all running and communicating with MongoDB, they are all using the same hash table without any locking or synchronization. There are a few options to fix this. You can implement a connection pool that supports access from multiple threads (complicated), you can give each thread its own connection and force each thread to use that connection when communicating, or you can take advantage of special variables in lisp (the easiest, simplest, and most elegant IMO). Let's check out the last option.

    Although it's not in the CL spec, just about all implementations allow you to have global thread-local variables by using (defparameter) or (defvar), both of which create special variables (read: dynamic variables, as opposed to lexical). Luckily, cl-mongo uses defvar to create *mongo-registry*. This means in our worker, we can re-bind this variable above the top-level loop using (let), and all subsequent calls to MongoDB will use our new thread-local version of *mongo-registry* instead of the global one that all the threads were bumping into each other over:

    ;; Main worker loop, using global *mongo-registry* (broken)
    (defun start-worker ()
      (loop
        (let ((job (get-job)))
          (let ((results (process-job job)))
            ;; this uses the global registry. not good if running multiple threads.
            (with-mongo-connection (:db "musio")
              (db.save "scraped" results))))))

    New version:

    ;; Replace *mongo-registry* above worker loop, creating a local version of the
    ;; registry for this thread.
    (defun start-worker ()
      ;; setting to any value via let will re-create the variable as a local thread
      ;; variable. nil will do just fine.
      (let ((cl-mongo::*mongo-registry* nil))
        (loop
          (let ((job (get-job)))
            (let ((results (process-job job)))
              ;; with-mongo-connection now uses the local registry, which stops the
              ;; threads from touching each other.
              (with-mongo-connection (:db "musio")
                (db.save "scraped" results)))))))
    

    BOOM, everything works great after this change, and it was only a one-line change. It may not be as efficient as connection pooling, but pooling is a lot more prone to strange errors and synchronization issues than just segregating the connections from each other and calling it a day. One issue: *mongo-registry* is not exported by cl-mongo, which is why we access it via cl-mongo::*mongo-registry* (notice the double colon instead of single). This means the variable name may change in future versions, breaking our above code. So, don't update cl-mongo without testing. Not hard.

    Hopefully this helps a few people out, let me know if you have better solutions to this issue!

  • 201204.15

    Email is not broken: It's a framework, not an application

    I've been seeing a lot of posts on the webz lately about how we can fix email. I have to say, I think it's a bit short-sighted.

    People are saying it has outgrown its original usage, or it has bad error messages, or it's not smart about the messages received.

    These are very smart people, with real observations. The problem is, their observations are misplaced.

    What email is

    Email is a distributed, asynchronous messaging protocol. It does this well. It does this very well. So well, I'm getting a boner thinking about it. You send a message and it either goes where it's supposed to, or you get an error message back. That's it, that's email. It's simple. It works.

    There's no company controlling all messages and imposing their will on the ecosystem as a whole. There's no single point of failure. It's beautifully distributed and functions near-perfectly.

    The problem

    So why does it suck so much? It doesn't. It's awesome. The problem is the way people view it. Most of the perceived suckiness comes from its simplicity. It doesn't manage your TODOs. It doesn't have built-in calendaring. It doesn't give you oral pleasure (personally I think this should be built into the spec though). So why don't we build all these great things into it if they don't exist? We could add TODOs and calendaring and dick-sucking to email!!

    Because that's a terrible idea. People are viewing email as an application; one that has limited features and needs to be extended so it supports more than just stupid messages.

    This is wrong.

    We need to view email as a framework, not an application. It is used for sending messages. That's it. It does this reliably and predictably.

    Replacing email with "smarter" features will inevitably leave people out. I understand the desire to have email just be one huge TODO list. But sometimes I just want to send a fucking message, not "make a TODO." Boom, I just "broke" the new email.

    Email works because it does nothing but messaging.

    How do we fix it then?

    We fix it by building smart clients. Let's take a look at some of our email-smart friends.

    Outlook has built-in calendaring. BUT WAIT!!!!! Calendaring isn't part of email!!1 No, it's not.

    Gmail has labels. You can categorize your messages by using tags essentially. Also, based on usage patterns, Gmail can give weight to certain messages. That's not part of email either!! No, my friend, it's not.

    Xobni also has built incredible contact-management and intelligence features on top of email. How do they know it's time to take your daily shit before you do? Defecation scheduling is NOT part of the email spec!!

    How have these companies made so much fucking money off of adding features to email that are not part of email?

    It's all in the client

    They do it by building smart clients! As I said, you can send any message your heart desires using email. You can send JSON messages with a TODO payload and attach a plaintext fallback. If both clients understand it, then BAM! Instant TODO list protocol. There, you just fixed email. Easy, no? Why, with the right client, you could fly a fucking space shuttle with email. That's right, dude, a fucking space shuttle.

    If your client can create a message and send it, and the receiving client can decode it, you can build any protocol you want on top of email.
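
    To make that concrete, here's a quick sketch (plain lisp, and the payload format is something I just made up) that builds a multipart/alternative message carrying a JSON TODO with a plaintext fallback. A TODO-aware client renders the JSON part; every other client just shows the text:

    (defun todo-message ()
      "Raw message body: a JSON todo plus a plaintext fallback. (Real MIME
      wants CRLF line endings; plain newlines are fine for a sketch.)"
      (with-output-to-string (s)
        (format s "MIME-Version: 1.0~%")
        (format s "Content-Type: multipart/alternative; boundary=b1~%~%")
        (format s "--b1~%Content-Type: text/plain~%~%")
        (format s "Buy milk (due friday)~%~%")
        (format s "--b1~%Content-Type: application/json~%~%")
        (format s "{\"type\": \"todo\", \"task\": \"buy milk\", \"due\": \"friday\"}~%~%")
        (format s "--b1--~%")))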

    That's it. Use your imaginations. I'll say it one more time:

    There's nothing to fix

    Repeat after me: "There's nothing to fix!" If you have a problem with email, fork a client or build your own! Nobody's stopping you from "fixing" email. Many people have made a lot of cash by "fixing" email.

    We don't have to sit in fluorescent-lit university buildings deliberating for hours on end about how to change the spec to fit everyone's new needs. We don't need 100 stupid startups "disrupting" the "broken" email system with their new protocols that will inevitably end up being a proprietary, non-distributed, "ad hoc, informally-specified, bug-ridden, slow implementation of half of" the current email system.

    Please don't try to fix email, you're just going to fuck it up!! Trust me, you can't do any better. Instead, let's build all of our awesome new features on top of an already beautifully-working system by making smarter clients.

  • 201203.21

    Vim syntax highlighting (void.vim)

    A while ago I created a vim highlighting script called void.vim. I've been using it for over a year now and just updated some things that have been bothering me recently, so feel free to check it out. This is the main color scheme I use for everything, and I created it to be easy on the eyes but to actually look nice too. A lot of the color schemes I've used seem to be really loud or have a bad choice of colors. Void is my favorite.

    Here's a sample (created with vim's :TOhtml command):

    /**
     * Trigger an event for this object, which in turn runs all callbacks for that
     * event WITH all parameters passed in to this function.
     *
     * For instance, you could do:
     * mymodel.bind("destroy", this.removeFromView.bind(this));
     * mymodel.trigger("destroy", "omg", "lol", "wtf");
     *
     * this.removeFromView will be called with the arguments "omg", "lol", "wtf".
     *
     * Note that any trigger event will also trigger the "all" event. the idea
     * being that you can subscribe to anything happening on an object.
     */
    trigger: function(ev)
    {
        var args   =   shallow_array_clone(Array.from(arguments));
        [ev, 'all'].each(function(type) {
            if(!this._events[type]) return;
            Array.clone(this._events[type]).each(function(callback) {
                callback.apply(this, (type == 'all') ? args : args.slice(1));
            }, this);
        }, this);

        return this;
    }
    

    This is javascript, but I also use it for HTML, CSS, php, and lisp. Note that all code highlighting on this blog is done via vim with this color scheme.

  • 201203.09

    TMUX/screen and root shells: a new trick I just learned (TMOUT)

    I'm currently doing some server management. My current favorite tool is TMUX, which, among many other things, allows you to save your session even if you are disconnected, split your screen into panes, etc etc. If it sounds great, that's because it is. Every sysadmin would benefit from using TMUX (or its cousin, GNU screen).

    There's a security flaw though. Let's say I log in as user "andrew" and attach to my previous TMUX session: tmux attach. Now I have to run a number of commands as root. Well, prefixing every command with sudo and manually typing in all the /sbin/ paths to each executable is a pain in the ass. I know this is a bad idea, but I'll often spawn a root shell. Let's say I spawn a root shell in a TMUX session, then go do something else, fully intending to log out later, but I forget. My computer disconnects, and I forget there's a root shell sitting there.

    If someone manages to compromise the machine, and gain access to my user account, getting a root shell is as easy as doing tmux attach. Oops.

    Well, I just found out you can time out a shell after X seconds of inactivity, which is perfect for this case. As root:

    echo -e "\n# logout after 5 minutes of inactivity\nexport TMOUT=300\n" >> /root/.bash_profile
    

    Now I can open root shells until my ass bleeds, and after 5 minutes of inactivity, each one logs out, dropping back into my normal user account.

    A good sysadmin won't make mistakes. A great sysadmin will make their mistakes self-correct ;-].

  • 201111.21

    Composer.js - a new Javascript MVC framework for Mootools

    So my brother Jeff and I are building two Javascript-heavy applications at the moment (heavy as in all-js front-end). We needed a framework that provides loose coupling between the pieces, event/message-based invoking, and maps well to our data structures. A few choices came up, most notably Backbone.js and Spine. These are excellent frameworks. It took a while to wrap my head around the paradigms because I was so used to writing five layers deep of embedded events. Now that I have the hang of it, I can't think of how I ever lived without it. There's just one large problem...these libraries are for jQuery.

    jQuery isn't bad. We've always gravitated towards Mootools though. Mootools is a framework that makes javascript more usable; jQuery is nearly a completely new language in itself, written on top of javascript (and mainly for DOM manipulation). Both have their benefits, but we were always good at javascript before the frameworks came along, so something that made that knowledge more useful was an obvious choice for us.

    I'll also say that after spending some time with these frameworks and being sold (I especially liked Backbone.js) I gave jQuery another shot. I ported all of our common libraries to jQuery and I spent a few days getting used to it and learning how to do certain things. I couldn't stand it. The thing that got me most was that there is no distinction between a DOM node and a collection of DOM nodes. Maybe I'm just too used to Moo (4+ years).

    Composer

    So we decided to roll our own. Composer.js was born. It merges aspects of Spine and Backbone.js into a Mootools-based MVC framework. It's still in progress, but we're solidifying a lot of the API so developers won't have to worry about switching their code when v1 comes around.

    Read the docs, give it a shot, and let us know if you have any problems or questions.

    Also, yes, we blatantly ripped off Backbone.js in a lot of places. We're pretty open about it, and also pretty open about attributing everything we took. They did some really awesome things. We didn't necessarily want to do it differently more than we wanted a supported Mootools MVC framework that works like Backbone.

  • 201111.16

    Rekon - a simple Riak GUI

    I was looking around for Riak information when I stumbled (not via stumble, but actually doing my own blundering) across a blog post that mentioned a Riak GUI. I checked it out. Install is simple and oddly enough, the tool uses only javascript and Riak (no web server needed). I have to say I'm thoroughly impressed by it. Currently the tool doesn't do a ton besides listing buckets, keys, and stats, but you can edit your data inline and delete objects. It also supports Luwak, which I have no first-hand experience with and was unable to try out.

    One thing I thought that was missing was a way to run a map-reduce on the cluster via text input boxes for the functions. It would make writing and testing them a bit simpler I think, but then again it would be easy enough to write this myself in PHP or even JS, so maybe I'll add it in. Search integration would be nice too, although going to 127.0.0.1:8098/solr/[bucket]search?... is pretty stupid easy.

    All in all, a great tool.

  • 201109.14

    Mono, C# for a large backend system

    I just did a writeup about MongoDB's performance in the last big app we did. Now it's time to rip Mono a new one.

    Mono has been great. It's .NET for linux. We originally adopted it because it's noted for being a fast, robust compiled language. I didn't know C# before starting the project, but afterwards I feel I have a fairly good grasp on it (10 months of using it constantly will do that). I have to say I like it. Coming from a background in C++, C# is very similar, except the biggest draw is that you don't separate your definitions from your code. Your code is your definition. No header files. I understand this is a requirement if you're going to link C/C++ code to other C/C++ code, but I hate doing it.

    Back to the point, mono is great in many ways. It is fast, compiles from source fairly easily (although libgdiplus is another story, if you want to do image processing), and easy to program in.

    We built out a large queuing system with C#. You enter jobs into a queue table in MongoDB, and they get processed based on priority/time entered (more or less) by C#. Jobs can be anything from gathering information from third parties to generating images and layering them all together (I actually learned first-hand how some of those Photoshop filters work). The P/Invoke system allowed us to integrate with third-party libraries where the language failed (simple web requests with timeouts or loading custom fonts, for instance).

    As with any project, it started off great. Small is good. Once we started processing large numbers of items in parallel, we'd get horrible crashes with native stacktraces. At first glance, it looked like problems with the Boehm garbage collector. We recompiled Mono with --enable-big-arrays and --with-large-heap. No luck. We contacted the Mono guys and, probably in light of all the political shenanigans happening with Mono at the moment, they didn't really have a good response for us. Any time the memory footprint got greater than 3.5G, it would crash. It didn't happen immediately though; it seemed random. Keep in mind Mono and the machines running it were 64bit...4G is not the limit!

    Our solution was twofold:

    • Put crash-prone code into separate binaries and call them via shell. If the process crashes, oh well, try again. The entire queue doesn't crash though. This is especially handy with the image libraries, which seem to have really nasty crashes every once in a while (not related to the garbage collection).
    • Make sure Monit is watching at all times.

    We also gave the new sgen GC a try, but it was much too slow to even compare to the Boehm. It's supposed to be faster, but pitting the two against each other in a highly concurrent setting crowned Boehm the clear winner.

    All in all, I like C# the language and Mono seemed very well put together at a small to medium scale. The garbage collector shits out at a high memory/concurrency level. I wouldn't put Mono in a server again until the GC stuff gets fixed, which seems low priority from my dealings with the devs. Still better than Java though.

  • 201109.12

    MongoDB for a large queuing system

    Let me set the background by saying that I currently (until the end of the week anyway) work for a large tech company. We recently launched a reader app for iPad. On the backend we have a thin layer of PHP, and behind that a lot of processing via C# with Mono. I, along with my brother Jeff, wrote most of the backend (PHP and C#). The C# side is mainly a queuing system driven off of MongoDB.

    Our queuing system is different from others in that it supports dependencies. For instance, before one job completes, its four children have to complete first. This allows us to create jobs that are actually trees of items all processing in parallel.

    On a small scale, things went fairly well. We built the entire system out, and tested and built onto it over the period of a few months. Then came time for production testing. The nice thing about this app was that most of it could be tested via fake users and batch processing. We loaded up a few hundred thousand fake users and went to town. What did we find?

    Without a doubt, MongoDB was the biggest bottleneck. What we really needed was a ton of write throughput. What did we do? Shard, of course. Problem was that we needed even distribution on insert, which would give us near-perfect balance for insert/update throughput. From what we found, there's only one way to do this: give each queue item a randomly assigned "bucket" and shard based on that bucket value. In other words, do your own sharding manually, for the most part.
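
    To illustrate, here's the bucket idea sketched in lisp with cl-mongo (the real system was C#; enqueue and *num-buckets* are made-up names, and the kv usage is from memory of the cl-mongo README). The shardCollection command on the bucket field gets run from the mongo shell separately:

    (defparameter *num-buckets* 16)

    (defun enqueue (priority payload)
      ;; every item gets a random bucket; since the collection is sharded on
      ;; "bucket", inserts spread near-evenly across the shards
      (cl-mongo:db.insert "queue"
                          (cl-mongo:kv (cl-mongo:kv "bucket" (random *num-buckets*))
                                       (cl-mongo:kv "priority" priority)
                                       (cl-mongo:kv "payload" payload))))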

    This was pretty disappointing. One of the whole reasons for going with Mongo is that it's fast and scales easily. It really wasn't as painless as everyone led us to believe. If I could do it all over again, I'd say screw dependencies, and put everything into Redis, but the dependencies required more advanced queries than any key-value system could do. I'm also convinced a single MySQL instance could have easily handled what four MongoDB shards could barely keep up with...but at this point, that's just speculation.

    So there's my advice: don't use MongoDB for evenly-distributed high-write applications. One of the hugest problems is that there is a global write lock on the database. Yes, the database...not the record, not the collection. You cannot write to MongoDB while another write is happening anywhere. Bad news bears.

    On a more positive note, for everything BUT the queuing system (which we did get working GREAT after throwing enough servers at it, by the way) MongoDB has worked flawlessly. The schemaless design has cut development time in half AT LEAST, and replica sets really do work insanely well. After all's said and done, I would use MongoDB again, but for read-mostly data. Anything that's high-write, I'd go Redis (w/client key-hash sharding, like most memcached clients) or Riak (which I have zero experience in but sounds very promising).

    TL;DR: MongoDB is awesome. I recommend it for most usages. We happened to pick one of the few things it's not good at and ended up wasting a lot of time trying to patch it together. This could have been avoided if we had picked something built for high write throughput, or dropped our application's "queue dependency" requirements early on. I would like it if MongoDB advertised the global write lock a bit more prominently, because I felt gypped when one of their devs mentioned it in passing months after we'd started. I do have a few other projects in the pipeline and plan on using MongoDB for them.

  • 201104.15

    PHP finally has anonymous functions??

    Wow, I can't believe I missed this...nobody seems to be talking about it at all. Ever since PHP 5.3, I can finally do non-generic callbacks.

    UPDATE: Check out this description of PHP lambdas (much better than what I've done in the following).

    function do_something($value)
    {
        // used >= 2 times, but only in this function, so no need for a global
        $local_function = function($value) { ... };

        // use our wonderful anonymous function
        $result = $local_function($value);
        ...
        // and again
        $result = $local_function($result);
        return $result;
    }
    

    There's also some other great stuff you can do:

    $favorite_songs = array(
        array('name' => 'hit me baby one more time', 'artist' => 'britney'),
        array('name' => 'genie in a bottle', 'artist' => 'xtina'),
        array('name' => 'last resort', 'artist' => 'papa roach')
    );
    $song_names = array_map(function($item) { return $item['name']; }, $favorite_songs);
    

    GnArLy bra. If PHP was 20 miles behind Lisp, it just caught up by about 30 feet. This has wonderful implications because there are a lot of functions that take a callback, and the only way to use them was to define a global function and send in an array() callback. Terrible. Inexcusable. Vomit-inducing.

    Not only can you now use anonymous functions for things like array_map() and preg_replace_callback(), you can define your own functions that take functions as arguments:

    function do_something_binary($fn_success, $fn_failed)
    {
        $success = ...
        if($success)
        {
            return $fn_success();
        }
        return $fn_failed();
    }

    do_something_binary(
        function() { echo "I successfully fucked a goat!"; },
        function() { echo "The goat got away..."; }
    );
    

    Sure, you could just return $success and call whichever function you need after that, but this is just a simple example. It can be very useful to encapsulate code and send it somewhere; this is just a demonstration of the beautiful new world that just opened up for PHP.

    So drop your crap shared host (unless it has >= 5.3.0), get a VPS, and start using this wonderful new feature.

  • 201011.21

    Vim: Cursor at beginning of tab in normal mode

    One thing that annoys me in Vim is that in normal mode, the cursor defaults to being at the end of a tab character. When I hit "Home" I expect the cursor to go all the way to the left, but instead it hovers 4 spaces to the right of where I expect it to. I stumbled across the answer after reading a mailing list thread for vim.

    set list lcs=tab:\ \ 
    " Note the extra space after the second \

    You can put this in your .vimrc to automatically set this behavior. Very useful.

  • 201011.16

    Vim: I can't believe I ignored you all these years

    All these years, since the day I first turned on a linux distribution, I've ignored vi/vim. Sure, there are swarms of geeks covering you with saliva as they spew fact after fact about how superior vim is to everything else, but to me it's always been "that editor that is on every system that I eventually replace with pico anyway."

    Not anymore. Starting a few years back, I've done all of my development in Eclipse. It has wonderful plugins for PHP, C++, Javascript, etc. The past week or so I've been weaning myself off of it and diving into vim. What actually got me started is that I bought a Droid 2 off ebay for various hacking projects (I'm planning on reviewing it soon). Well, it was really easy to get vim working on it (sorry, lost the link already). I thought, well, shit, I've got vim, what the hell can I do with it? First things first, let's get a plugin for syntax coloring/indentation for a few of my favorite languages. What?! It has all of them already.

    Ok, now I'm interested. I installed vim for Windows (gvim), which was followed by a slow-but-steady growing period of "well, how do I do this" and "HA...I bet vim can't do THI...oh, it can." There are "marks" for saving your place in code, you can open the same file in multiple views (aka "windows"), you can bind just about any key combination to run any command or set of commands, etc. I even discovered tonight there's a "windows" mode for vim that mimics how any normal editor works. I hate to admit it, but I'll be using that a lot. One feature that blew my mind is the undo tree. Not stack, tree. Make a change, undo, make a new change, and the first change you did before your undo is still accessible (:undolist)!

    The nice thing about vim is that it saves none of its settings. Every change you make to it while inside the editor is lost after a restart. This sounds aggravating, but it actually makes playing with the editor really fun and easy. If I open 30 windows and don't know how to close them, just restart the editor. There are literally hundreds of trillions of instances when I was like "oh, shit" *restart*.

    Once you have a good idea of what you want your environment to be like, you put all your startup commands in .vimrc (_vimrc on Windows) and vim runs it before it loads. Your settings file uses the same syntax as the commands you run inline in the editor, which is awesome and makes it easy to remember how to actually use vim.

    So far I'm extremely impressed. The makers of vim have literally thought of everything you could possibly want to do when coding. And if they haven't thought of it, someone else has and has written a plugin you can drop into your plugins directory and it "just works." Speaking of plugins, vim.org's plugin list seems neverending. I was half expecting to see most plugins have a final mod date of 2002 or something, but a good portion have newer versions released within the past two weeks. It seems the ones that are from 2002 never get updated because they're mostly perfect. Excellent.

    I do miss a few things though. First off, the project file list most editors have on the left side. I installed NERDTree to alleviate that pain, but honestly it's not the same as having my right click menus and pretty icons. I'm slowly getting used to it though. The nice thing about a text-only file tree is that in those instances where you only have shell access and need to do some coding, there isn't a dependency on a GUI.

    Tabs are another thing I miss. Gvim has tabs, but they aren't one tab == one file (aka "buffer") like most editors. You can hack it to do this, sort of, but it's really janky. Instead I'm using MiniBufExplorer, which takes away some of the pain. I actually hacked it a bit because I didn't like the way it displays the tabs, which gave me a chance to look at some real vim script. It's mostly readable to someone who's never touched it before.

    That about does it for my rant. Vim is fast, free, customizable, extendable, scriptable, portable, wonderful, etc...and I've barely scratched the surface.

