Create custom commands in Git

June 22nd, 2007

Let me explain a bit about Git before I get into the subject of this post. Click here if you don’t want to read my opinion on Git.

Git is a wonderful thing. It’s fast, provides a great speed boost, and it’s quick. It does merging like I want an SCM to do.

Git is also a terrible thing (for me). It discards most of the command knowledge I have accumulated with SVN, or even Mercurial (my previous DSCM of choice), and sometimes replaces that knowledge with poorly exposed commands or completely different concepts. An example of the latter would be the git-revert command and how it compares to the svn revert command. That isn’t so much of a big deal, since Git doesn’t (and shouldn’t) let the decisions of a separate project dictate its interface; that said, it doesn’t change the fact that 95% of new Git adopters will be coming from an environment where revert undoes your currently uncommitted changes.

The former is more troubling, but thankfully being worked upon. The analog to svn revert is git-checkout -f. This is mostly confusing when perusing the man page: I thought to myself “Hey, there is git-reset, git-revert, there has to be some other command here like that that undoes my uncommitted changes”. The point is that common operations are not so easily learned in the git workflow. There are some good docs such as Everyday Git, but in general the learning process is somewhat painful for Git. Luckily…

Extending git

Git is made up of a hundred or so tiny applications that are easily chained together to compose custom functionality. These applications expose functionality at a high (end-user) level, and more interestingly at a very low level. An example would be git-knacks that I hacked together yesterday evening in 20 minutes. They provide an easy way to shelve and unshelve code that isn’t ready for a real commit. More on that later…

What I didn’t know, though, was that the git command uses a simple naming convention for sub commands that you can use to inject functionality into the git command for others to use. git reset executes git-reset and git checkout-index runs git-checkout-index. All you have to do is write your own command in whatever language you choose, name it properly, and presto… git hello-world is now in your toolchain.

I had thought the git- commands were simply for tab-completion support, and were just aliases into the “git blah” style commands. I was exactly oppositely wrong.
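
To make the naming convention concrete, here is a minimal sketch of what a git hello-world command could look like in Ruby. The file name git-hello-world follows the convention described above; the hello_world method and its greeting are made up for illustration. It assumes you save the file somewhere on your PATH and make it executable.

```ruby
#!/usr/bin/env ruby
# Hypothetical example: save as "git-hello-world" in a directory on your PATH
# and mark it executable (chmod +x), then run it as "git hello-world [name]".

# Builds the greeting; kept as a method so it is easy to exercise on its own.
def hello_world(args)
  name = args.first || "world"
  "Hello, #{name}!"
end

# Only print when run as a script, not when loaded from another file.
puts hello_world(ARGV) if __FILE__ == $PROGRAM_NAME
```

Any arguments after the sub command name are handed to the script as ordinary command line arguments, so git hello-world Scott would greet Scott.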

The unix-way got me to switch from Mercurial. Sure, I can write scripts that call to the hg command, but there isn’t a way, that I can see, to get something similar to git-merge-base. You would probably have to hack through and extract out some of the internals of Hg.


link_to as a block helper

June 19th, 2007

When it comes to creating helpers for the view portion of satisfaction, I’ve started taking the approach that once I see something “ugly” twice in the rhtml that it should be extracted out into a helper. In many cases, before that.

Today’s wart? link_to calls that have html for the text content. Take, for example:

<%= link_to "<strong>#{product.name}</strong><span>#{pluralize(product.topic_count(company), 'topic')}</span>", href, :class => "product_label" %>

ewww… and that doesn’t even get into having href defined above in a <% %> block. So, I extended link_to with the help of alias_method_chain such that it will take a block argument instead of its normal first parameter.

<% link_to browse_url(product), :class => "product_label" do %>
  <strong><%= product.name %></strong>
  <span>(<%= pluralize(product.topic_count(company), 'topic') %>)</span>
<% end %> 

Much more readable. See the code block_link_to.rb
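
block_link_to.rb isn’t reproduced here, so the following is only a rough sketch of the aliasing idea in plain Ruby, not the actual helper. The View class and its simplified link_to are stand-ins I made up so the example runs without Rails, and the two alias_method calls are written out by hand (which is roughly what alias_method_chain :link_to, :block does for you). A real Rails helper would also need to capture the ERB block output, which this sketch skips.

```ruby
# Stand-in for a view with a very simplified link_to (hypothetical, not Rails).
class View
  def link_to(name, url, html_options = {})
    attrs = html_options.map { |key, value| %( #{key}="#{value}") }.join
    %(<a href="#{url}"#{attrs}>#{name}</a>)
  end

  # When a block is given, its result becomes the link text and the
  # remaining arguments shift over by one.
  def link_to_with_block(*args, &block)
    return link_to_without_block(*args) unless block
    link_to_without_block(block.call, *args)
  end

  # Hand-rolled equivalent of alias_method_chain :link_to, :block
  alias_method :link_to_without_block, :link_to
  alias_method :link_to, :link_to_with_block
end

view = View.new
puts view.link_to("Plain", "/plain")
puts view.link_to("/products/1", :class => "product_label") { "<strong>Widget</strong>" }
```

The old calling style keeps working because the wrapper falls straight through to the original method when no block is passed.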


15 minute hack: Poor man’s function composition

June 14th, 2007

I loves me some functional programming and I loves me some Symbol#to_proc (which lets you do .each(&:upcase)).

One of those fp features I wish I had in ruby-land was function composition. Specifically, I find myself writing things like:

initials = ["Foo", "Bar", "Baz"].map{|str| str.first.upcase}

That is fine, but just isn’t as cool as leveraging to_proc in some fashion. In the more “stupid, stupid, stupid” case, I’ve also seen things like:

initials = ["Foo", "Bar", "Baz"].map(&:first).map(&:upcase)  # Eww, unnecessary iterations...

Anyways, Ruby does have function composition (Proc#compose), thanks to the Ruby facets project, but working directly with lambdas isn’t nearly as friendly as I desire. So, I spent 15 minutes to hack together a means that I like. Here’s what it looks like:

["foo", "bar", "baz"].map(&:first >> :upcase)

Which reads, map, using :first then :upcase. >> is left associative, like "foo".first.upcase, opposite of Haskell composition where foobar = foo . bar reads foo follows bar.

Also, the snippet linked below lets you do something like:

  class String
    compose(:initial, &:first >> :upcase)
  end
  "foo".initial   # => "F"
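
For the curious, the two pieces above can be sketched in a few lines. This is an approximation written for this post, not the contents of compose.rb. It uses String#chr in place of ActiveSupport’s String#first so that it runs on plain Ruby, and it only covers composing two symbols.

```ruby
class Symbol
  # :a >> :b builds a lambda that sends :a to its argument,
  # then sends :b to the result (left to right).
  def >>(other)
    first = to_proc
    second = other.to_proc
    lambda { |object| second.call(first.call(object)) }
  end
end

class Module
  # Defines an instance method that runs the composed function on self.
  def compose(name, &function)
    define_method(name) { function.call(self) }
  end
end

class String
  compose(:initial, &:chr >> :upcase)
end

p ["foo", "bar", "baz"].map(&:chr >> :upcase)
p "foo".initial
```

The & binds looser than >>, so &:chr >> :upcase hands the whole composed lambda to map as its block.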

Now a re-implementation of ActiveSupport’s Module#delegate:

class Module
  def delegate(*methods)
    options = methods.pop
    unless options.is_a?(Hash) && to = options[:to]
      raise ArgumentError, "Delegation needs a target. Supply an options hash with a :to key as the last argument (e.g. delegate :hello, :to => :greeter)."
    end

    methods.each do |method|
      compose(method, &to >> method)
    end
  end
end

Perhaps not necessary, but I found it interesting and fun to write. Thoughts?

compose.rb


Caching time_ago_in_words

June 2nd, 2007

I’m a big fan of the “english” timestamps that so often get used in apps these days; “Sat Jun 02 17:55:56 -0700 2007” means almost nothing to me without active thought. It is much nicer to read “about a minute ago”; it requires no processing on my part. As many of you know, rails bundles this functionality into a helper called time_ago_in_words, and it works wonderfully.

The only issue is that you can’t page cache these timestamps. If you do, “one minute ago” won’t be right in short order. So, I worked up a little remix for time_ago_in_words that solves this problem.

The secret sauce is Javascript, as usual. cachable_time_ago_in_words outputs a javascript call that does the translation:

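  module CachedDateHelper
    def cachable_time_ago_in_words(from)
      # Emit the timestamp as milliseconds since the epoch for the JS side.
      js_call = javascript_tag "document.write(time_ago_in_words(#{(from.to_i * 1000).to_json}) + ' ago');"
      # Fall back to the absolute timestamp when javascript is unavailable.
      <<-EOS
        #{js_call}
        <noscript>#{from}</noscript>
      EOS
    end
  end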

And then, on the javascript side:

  function time_ago_in_words(from) {
    return distance_of_time_in_words(new Date().getTime(), from);
  }

  // Both arguments are timestamps in milliseconds.
  function distance_of_time_in_words(to, from) {
    var seconds_ago = (to - from) / 1000;
    var minutes_ago = Math.floor(seconds_ago / 60);

    if (minutes_ago == 0) { return "less than a minute"; }
    if (minutes_ago == 1) { return "a minute"; }
    if (minutes_ago < 45) { return minutes_ago + " minutes"; }
    if (minutes_ago < 90) { return "about 1 hour"; }
    var hours_ago = Math.round(minutes_ago / 60);
    if (minutes_ago < 1440) { return "about " + hours_ago + " hours"; }
    if (minutes_ago < 2880) { return "1 day"; }
    var days_ago = Math.round(minutes_ago / 1440);
    if (minutes_ago < 43200) { return days_ago + " days"; }
    if (minutes_ago < 86400) { return "about 1 month"; }
    var months_ago = Math.round(minutes_ago / 43200);
    if (minutes_ago < 525960) { return months_ago + " months"; }
    if (minutes_ago < 1051920) { return "about 1 year"; }
    var years_ago = Math.round(minutes_ago / 525960);
    return "over " + years_ago + " years";
  }

You’ll notice there is no support for the rails-like include_seconds option. That is left as an exercise for the reader (i.e. I don’t need to use it). Using this helper will let you page cache your views, greatly improving performance. It even has a fallback to absolute timestamps for those out there with a javascript disability.


On Erlang and Strings…

May 11th, 2007

I said in a recent comment that the way Erlang handles strings is wonky, and I would like to clarify. Take [97, 97] and "aa". My brain says these are two different values, but in erlang they are the same. Type them into an erl shell and the result of both will be “aa”.

This works because it is internally consistent, and as the developer you can easily reason whether a given list of bytes is actually a string of characters. It works because in a closed erlang system, none of the code cares whether it is a string of characters or a list of bytes, it just works.

The problem occurs when you break out of erlang and communicate with a language that distinguishes between the two. In ruby, [?a, ?a] is different from "aa". Now here is the tricky part: Erlang can encode a list in one of 3 ways:

  1. NIL_EXT: a single byte denoting an empty list []
  2. STRING_EXT: lists are encoded to this when they are less than 65535 elements in length and only contain SMALL_INT values (0-255).
  3. LIST_EXT: any other list of elements

This info can be found in the OTP source distribution in the file “erts/internal_doc/erl_ext_dist.txt”. With these semantics, [1,2,3] would be encoded as a string, and it would be totally surprising to a Ruby programmer to have that value pop out as '\001\002\003' on the ruby side. So, it’s a bit of a juggling act, and with Erlectricity I decided to take a consistent, simple road. Any erlang list, regardless of how it is encoded by erlang, gets spit out as an Array on the ruby side. Likewise, any binary value in erlang will come out as a string on the ruby side. This makes for easy use of unpack if you need to extract values, and if all you want is a string, nothing else is needed.
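
To see what that rule means at the byte level, here is a small decoding sketch. It is not Erlectricity’s decoder, just an illustration written for this post: decode_term is a made-up name, and it only handles the NIL_EXT, STRING_EXT and BINARY_EXT tags (106, 107 and 109 in the external format, after the leading version byte 131).

```ruby
# Illustrative decoder for a tiny subset of the Erlang external term format.
def decode_term(bytes)
  version, tag = bytes.unpack("CC")
  raise ArgumentError, "unexpected version byte #{version}" unless version == 131

  case tag
  when 106                                # NIL_EXT: the empty list
    []
  when 107                                # STRING_EXT: 2-byte length, then bytes
    length = bytes[2, 2].unpack1("n")
    bytes[4, length].unpack("C*")         # always hand back an Array
  when 109                                # BINARY_EXT: 4-byte length, then bytes
    length = bytes[2, 4].unpack1("N")
    bytes[6, length]                      # binaries come out as a String
  else
    raise ArgumentError, "tag #{tag} is not handled in this sketch"
  end
end

list_bytes   = [131, 107, 0, 3, 1, 2, 3].pack("C*")                    # [1,2,3]
binary_bytes = [131, 109, 0, 0, 0, 4, 104, 97, 104, 97].pack("C*")     # <<"haha">>

p decode_term(list_bytes)
p decode_term(binary_bytes)
```

The point is the STRING_EXT branch: even though erlang chose the compact string encoding for [1,2,3], the ruby side still gets an Array back.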

This situation is less than ideal, and is one of the confusing pieces that I would like to eliminate. To do this, a framework to ease conversion and guide understanding on the Erlang side is needed. Good thing that such a framework was already planned! By putting a thin veneer over the Erlang ports, we can build upon the erlang distribution format (or build away from if needed) to find a happy equilibrium between the two differing type systems. Stay tuned for more discussion on how I approach a solution to the problem!


Erlectricity: Hi Ruby, I’m Erlang.

May 8th, 2007

update: this pre-release is up and running on rubyforge:

sudo gem install erlectricity

As I said in my previous article, I’m very stoked about working with Erlang. Usually most forays into new languages mean new projects and new explorations: Erlang has been no different. I usually end up in some middle ground where I try to improve the current language I’m working in with the new language: With C# -> Ruby it was a code generator written in Ruby that produced the cookie cutter DALs that I was plagued with, and this time around I immediately started to gravitate towards interoperation between erlang and ruby.

I’m not the only one with this idea (I’ve never claimed to be original), with several others working on using JSON to bridge the gap. That is all well and good, but considering how good erlang is at expressing itself across networks, I’d rather see something closer to the erlang-way. And that is why I set about creating what I call Erlectricity.

Before I go further, a code sample is in order. The following code is a passthrough daemon to Tinder, (the Campfire API gem) that lets you post to campfire from erlang. Even more before I show you the code, here is the google code site: http://code.google.com/p/erlectricity/, with the source being: http://erlectricity.googlecode.com/svn/trunk/. And now…

The Ruby:

require 'rubygems'
require 'erlectricity'
require 'tinder'

domain, email, password, room_name = *ARGV
campfire = Tinder::Campfire.new domain
campfire.login email, password
room = campfire.find_room_by_name room_name

receive do
  match(:speak, string(:comment)) do
    room.speak comment
    receive_loop
  end

  match(:paste, string(:comment)) do
    room.paste comment
    receive_loop
  end
end

room.leave if room

And some erlang to call it (I’m omitting some code, please see the tinderl.erl example in the source to see the code that enables the calls below):

tinder:start("thedomain", "myemail@gmail.com", "mypassword", "theroom"),
tinder:speak("hey, hey kids!"),
tinder:paste("Im Krusty the Klown!").

At first, this was going to be a JRuby layer over JInterface, the Java -> Erlang bridge. That was all well and good, and worked fine, but it had a major flaw: It was tied to JRuby, leaving out the majority of Rubyists. JRuby is super exciting, and I was super stoked the first time I got ruby running inside a prototype game using The JMonkey Engine, but I didn’t want to be the person to exclude. I decided at this initial stage to use the Erlang port system, a means of starting and communicating with a separate process. As a consequence, I implemented most of the Erlang distribution format. Why re-invent the wheel when erlang has term_to_binary and binary_to_term? Plus, it’s fun investigating the Erlang distribution format, even though only a subset is used for term_to_binary.

As an aside, one reason to re-implement term_to_binary and its sibling is because of the limited nature of Erlang’s type system.

You’ll also notice in the example directory ‘gruff’: 2 hours of hacking invested into rendering Gruff graphs through Erlectricity. Below is a graph generated using the stat_writer example to graph memory usage while running ab against webtool:

This was a response to Yariv, and his desire for a graphing library in erlang like gruff. The example isn’t by any means a real solution, merely a demonstration that this could be a viable solution.

The goal of Erlectricity is two-fold: expose ruby code to erlang, such that erlang applications can take advantage of the breadth of ruby libraries, and secondly, expose the OTP to ruby such that fault-tolerant distributed systems will be easier to develop. Neither of those goals is realized in Erlectricity’s current form, but it is a good base to work from towards those goals. Also, the two examples included demonstrate the former, and I’m playing around with something meaningful to demonstrate the latter.

This post is already about 3 times too long, but let me quickly describe the structure of an erlectric app on the ruby side. The programming model follows the concurrency model of erlang, so read up on how erlang processes messages if you get confused. I’ll also describe it in further detail in an additional post. So, the core of an app using erlectricity is the receive loop, which looks like:

receive do
  match(:hello, string(:yourname)) do
    send! :helloreceived, "You stink, #{yourname}"
    receive_loop
  end
  match(:goodbye){ }
end

The receive block says “get a message, blocking if there are none on the wire”. When a message is received in this state it is compared with each of the match blocks in turn. The match block in the example says “If the message looks like the symbol :hello, and a string, bind the string object to the variable yourname and run the block.” Normally, when receiving a message, the code would then jump out of the receive block, but since the return value of the match block is receive_loop, the program tries to process another message against the same supplied match blocks. You’ll also notice that if the process receives a message that is just the symbol :goodbye, the app will simply fall out of the receive loop.
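
If the control flow is hard to picture, here is a toy model of it in plain Ruby. It is emphatically not how Erlectricity is implemented: toy_receive, RECEIVE_LOOP and the hash of handlers are names I invented for this illustration, the inbox is just an Array instead of a port, and messages with no matching handler are simply dropped.

```ruby
# Sentinel a handler returns to ask for another trip around the loop.
RECEIVE_LOOP = :receive_loop

# Toy stand-in for a receive block: take messages off the inbox one at a time,
# dispatch on the leading symbol, and keep going only while the handler
# returns RECEIVE_LOOP.
def toy_receive(inbox, handlers)
  until inbox.empty?
    tag, *payload = inbox.shift
    handler = handlers[tag]
    next unless handler                 # assumption: unmatched messages are dropped
    result = handler.call(*payload)
    break unless result == RECEIVE_LOOP
  end
end

greeted = []
handlers = {
  :hello   => lambda { |name| greeted << name; RECEIVE_LOOP },
  :goodbye => lambda { nil }            # falls out of the loop
}

inbox = [[:hello, "Bob"], [:goodbye], [:hello, "never seen"]]
toy_receive(inbox, handlers)
p greeted
```

The :goodbye handler returns something other than the sentinel, so the third message is never processed, which mirrors falling out of the receive block.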

Here are some tips if you would like to play around with Erlectricity:

  1. Email me if you have any trouble: nullstyle@gmail.com
  2. Pass strings as binaries to ruby. "haha" will come out as [104, 97, 104, 97] in ruby, but <<"haha">> will come out as "haha" in ruby. See my comment above about the limited type system in erlang. I’ll make a separate post about this issue, because I want to talk more fully about why I made this choice.
  3. Make sure your port is created with the {packet, 4} and binary options. Creating a port with {packet, 2} will have your ruby process never receiving any messages. Not having binary responses enabled will cause trouble when you try binary_to_term to get the response. This is an area for further improvement.
  4. It’s a good idea to include catch-all clauses while you are working out the interaction issues. It greatly sped up my debugging time when I had the program yell at me when I sent an improper message.
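
On the {packet, 4} tip: that option tells erlang to put a 4-byte big-endian length in front of every message on the port, so the ruby side has to frame its reads and writes the same way. The sketch below shows that framing against an in-memory StringIO. write_packet and read_packet are illustrative names of my own, not Erlectricity’s API.

```ruby
require "stringio"

# Write one message with a 4-byte big-endian length header, as {packet, 4} expects.
def write_packet(io, payload)
  io.write([payload.bytesize].pack("N"))
  io.write(payload)
end

# Read one length-prefixed message; returns nil when the stream has ended.
def read_packet(io)
  header = io.read(4)
  return nil if header.nil? || header.bytesize < 4
  length = header.unpack1("N")
  io.read(length)
end

io = StringIO.new(String.new)   # binary in-memory buffer standing in for the port
write_packet(io, "hello")
write_packet(io, "world!")
io.rewind

p read_packet(io)
p read_packet(io)
p read_packet(io)
```

With {packet, 2} the header is only two bytes wide, so a reader expecting four would misread the length and sit waiting for data that never arrives.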

I would like to do several things with Erlectricity. First of all, I would like to remove the possibility of programmer error in as many places as possible. There isn’t much work to do in this regard on the ruby side (as far as I can tell, feedback is always welcome), but the hoops you need to jump through on the erlang side should go away. I’m thinking about abstracting away the framework bits and creating a behavior that you can implement in a callback module, similar to how the OTP does things. Putting a layer on the erlang side between user code and the port will allow me to smooth out some of the type translation issues too.

You can install erlectricity with:

sudo gem install erlectricity --source http://gems.nullstyle.com/

I’m in the process of opening up the rubyforge project, so you’ll be able to drop the --source in a couple days. Stay tuned for more info, examples, and improvements as time marches forward.


Fear and Loathing in Software Development

May 6th, 2007

A couple weeks ago I came across a video on dzone called How To Design A Good API and Why it Matters, by a very smart man named Joshua Bloch. I am always reading and watching anything that I can find programming related, especially when the material comes from the most distinguished among our industry. This video was overall very good, and I would recommend it to everyone interested in software design, but a note came out of the talk that I wanted to comment on.

Basically, Joshua asserts that a public API is permanent and as such you have one chance to get it right. He gives an example that if you write some code to solve a problem, pretty soon your buddy is going to want to solve that same problem and he will use your solution, and then he is going to pass your code onto ten friends, and so forth until your API is entrenched. Additionally, he asserts that once a person has learned an API they won’t want to relearn it as it changes. I don’t necessarily disagree with either of those points, and as he says “This is scary”.

It reminds me of a project that I was involved in at my previous position at the Anchorage School District. I was charged with improving the ability of our systems to sync student enrollments and departures with our Active Directory. You see, we had a 20-or-so year old student information system that the school registrars used to enter student data, which should automatically be imported into our AD such that the students would have computer accounts to log in to the schools’ labs. Seems simple enough, but for various reasons the project ultimately failed. One of the major reasons was that I didn’t prototype properly, but that is beside the point. The main reason that this project failed was because of fear.

You see, we did not have a test environment for programming against our Active Directory; it was deemed too expensive to set up a parallel forest that could be over-written and rebuilt from our production environment. Try as I might at the time, the simple fact was that I would be developing a large application against a live data source that 55,000 students and teachers would be depending on day in, day out. If that wasn’t enough, the administrator in charge of maintaining the AD at the time was someone who I had very little confidence in: If I busted this thing, I gave it about a 35-50% chance that the administrator could bring everything back to working order. He was a nice enough guy, just not very good at his job. All of this added up to an overwhelming amount of fear. Paralyzing would be an understatement. Every line of code had to be double and triple checked by me. I spent extra care on unit testing what pieces I could, but the fact was that the system was completely devoid of true functional testing. I was unable to test actually pulling data from the student system and writing data to the active directory.

This article isn’t about that project though; suffice it to say that fear is the mind-killer. Directly following the failure of that project I wrote a similar system of reduced scale to populate Open Directory servers in much the same way. The difference was that I could hack and destroy to my heart’s content, because I had a stupid-simple way to fix what I broke, and no one was depending on the systems I was developing against. The result: it rolled out to 100 elementary schools just fine. There were some hiccups, but not nearly on the scale of the previous project.

Maybe it is just me. Maybe it’s because I equate my programming style to that of a sketch or comic book artist. The point is that if you work through prototype and iteration, gradually morphing your initial failed tries into the final working API, having pressure to get it right the first time is killer.

So what can we do to alleviate that pressure when it comes to API design? Let’s look at some of the points Joshua used to back up the notion that you have one try to get it right.

First of all, let’s look at his assertion that programmers won’t want to relearn an API. I believe this is a side-effect of the cultural environment that a programmer works within. Perhaps this is just the Java community, with which he is so intimate; I certainly don’t see this hesitance to adapt in, for example, the Ruby community. Take for example REXML and Hpricot. Before Hpricot was released by my personal hero, _why, most everyone who was working with parsing XML in Ruby was using REXML. REXML works fine, has a reasonably simple API to interact with, but isn’t that fast. And then comes along Hpricot, which is super-fast comparatively, can work on malformed XML (is XML that is malformed still XML?), but has a different API. Some would say that the API for Hpricot is easier and more powerful (I certainly do), but the point is that I had to relearn the method in which I interact with XML. Did I want to? I sure as fuck did, and on a further note I’ve used Hpricot in just about every project I’ve worked on since it was initially released. This can further be evidenced by various people around the net, and several personal acquaintances.

If your new API is compelling, rather than change just for change’s sake, people will relearn it. Free of bureaucratic constraints, I think about which API would be best to accomplish the task at hand, rather than how I can fit an API to a problem. I don’t think “I need to use Hpricot to solve this problem”, but rather “I need to parse XML here, Hpricot is the best in this situation, I’ll use it”. Given that attitude, and exposure to new APIs, a programmer will adapt and evolve along with the code he is relying upon.

Let’s discuss the entrenchment of an API that Joshua initially talked about. He is absolutely right: the more popular an API becomes the less likely it will be that you can change it. This is why we have versioning; it really is a non-issue given that. I don’t care that come Rails 2.0 my thousands of invocations of image_path that look like:

image_path( 'some_file_without_an_extension' )

will no longer work; that code is bound to the version that allows invocations like that. Sure, a changing API means I pay tolls to keep my code that consumes the API up to date, but that is part of doing business. A well created test suite, and working to isolate coupling between an API and your code, will reduce this toll. As a teaser into an entirely different discussion, work of this nature is so incredibly easy in a dynamic language such as Ruby. I can almost effortlessly produce an adapter that behaves as the old API while consuming the new API, and then gradually morph that adapter to expose the new features and changes as my application needs them.
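
As a rough illustration of that adapter idea, here is a minimal sketch. NewAssetApi and LegacyImagePath are hypothetical modules made up for this example, standing in for a new API that insists on a file extension and an adapter that keeps the old extension-less calls alive. It assumes the old behavior was to default to ".png".

```ruby
# Hypothetical "new" API: refuses sources that lack a file extension.
module NewAssetApi
  def self.image_path(source)
    raise ArgumentError, "extension required: #{source}" if File.extname(source).empty?
    "/images/#{source}"
  end
end

# Adapter that accepts the old style of call and translates it for the new API.
module LegacyImagePath
  def self.image_path(source)
    source += ".png" if File.extname(source).empty?   # assumed old default
    NewAssetApi.image_path(source)
  end
end

puts LegacyImagePath.image_path("some_file_without_an_extension")
puts LegacyImagePath.image_path("logo.gif")
```

Once callers are migrated to pass explicit extensions, the adapter can shrink to a plain pass-through and eventually disappear.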

Anyways, this rant has been going on long enough. I would like to make clear the theme I am trying to convey. Fear is the death of agility. If you have to second guess yourself, or feel restricted in how you can approach building a solution, the solution will suffer. If you are afraid of the consequences your software will have, you will fail unless you possess an inhuman capacity for mental anguish. Sketch, re-conceive, evolve all of your code. Make it better, even if others are relying upon it. Communicate with them how things are changing and work with them to reconcile those changes with their applications. Revolution is hard, evolution is easy, but use both to grow your code and make it more beautiful.


This would be great to simulate zombie infestations…

May 6th, 2007

Erlang is getting a lot of press lately: from the new Pragmatic Programmers book to the SlideAware guys and their “From Python to RoR to Erlang” series of blog posts, Erlang definitely appears to be picking up steam. I myself have just recently come around to its beauty, though it is not quite as enjoyable as ruby, but that is a subject for a different post.

Joe Armstrong’s new book on erlang is masterful; it is actually the first programming related book I have read cover to cover (well, in this case the entire pre-print pdf) since “AI for Game Developers”, and that was like 3 years ago. Something wasn’t clicking for me when Erlang first piqued my interest around the start of the year. It has documentation, and it has tutorials, and the best I could find at the time was the “Getting Started with Erlang”. It does a good job of introducing the language, but I now realize why it failed at hooking me on Erlang: The OTP was only mentioned so far as to say that no info on it would be included in the tutorial. For those of you who don’t know, the OTP is a set of libraries and patterns (called behaviors) that work to enable all of the amazing capabilities of erlang such as cool-as-shit fault tolerance and seamless code upgrade with no downtime. When reading about the OTP in the new book I was hooked: The section that explains in plain detail the non-magic behind live code upgrade sent chills down my spine. I had one of those very desirable “a-ha!” moments when I found that by following this simply explained convention I could upgrade a server with no down time. That was what sold me on Erlang, and it’s a shame that those moments aren’t earlier in the pipeline.

Yet, I am having trouble exploring the Erlang landscape. There is so much to learn, and it seems like some of the interesting bits of Erlang are inaccessible to me. I think that it is mostly a presentation issue, and I would like to help out. Besides, this gives me a good excuse to really keep a blog going. As a challenge to myself, I am going to try and make one Erlang related post to this blog each week.