June 18, 2008

Weaning myself off Tara

As far back as I can remember, I wanted a computer of my own.

The first computer I was given wide latitude upon was my mother’s Apple IIc. Arguably, my experiences on it fundamentally broke me for all future computing. After it died, the first time, I got time on an XT.

The first machine I had exclusive control over was a 386. By that point, I had stolen time on computers wherever and whenever I could. And, I had accessed the Internet thanks to the lax network security at the University of Washington. I found Linux. I named the 386: “fuzzy toilet”.

I’ve since standardized my naming scheme: women I had crushes on and got nowhere with.

Years and women passed. The last desktop I purchased was in 1999. She was dubbed Tara. And, with her, I learned that data is more important than the hardware containing it.

The originally purchased hardware for Tara doesn’t exist. The motherboards, CPUs, hard drives, video cards, sound cards, network cards, keyboards, mice, and monitors have all worn out and been replaced. Many times. But, the original installation of Linux on Tara still exists.

scott@tara:~$ [0] ls -al .bash_logout
-rw-r--r-- 1 scott scott 24 1999-07-20 19:09 .bash_logout

That’s a heartwarming story of a boy and his computer.

But, Scott got older and finally started outgrowing Tara. My friend William pressured me into purchasing a laptop. The day I installed Ubuntu on Geneva was the last day of my preferred use of desktops. It was a matter of time before I transitioned completely:

scott@tara:~/.gaim/logs$ [0] find ./ -name '????-??-??.*.txt' -printf "%f\n" | sort | tail -1 | cut -c -10
2005-05-31

Which left Tara as a server. E-mail, web, storage, shell and long running tasks. Damn, girl!

But, for the last three years I’ve been neglectful. Yes, there are backups. And monitoring. However, I don’t exactly feel comfortable with a large part of my life sitting on a machine with no eyes on it and hardware older than children that can speak.

Which is a very long way of saying I’ve been transitioning my services off Tara. To other members of my increasing harem. And, this gives me an excuse to talk about virtualization.

Stay tuned.

June 17, 2008

bzrshelve, a punchline to a bad joke

The joke has been long coming.

Back when I was still on reddit, a short meme hit where someone wrote a little hack that made the front page. The title is what must have sold it, as there wasn’t any there - there.

“Using Git as a versioned data store in Python” aka gitshelve.

A few days later, of course, hgshelve came into existence.

It’s telling that the Bazaar community never got in on the action. I can imagine good arguments for that scene being either too small or too busy getting work done.

Fortunately, I have no such issue. Behold: bzrshelve.

And the only DVCS that can get the source is svk.

June 17, 2008

Happy Key Revocation Tuesday

Almost one month ago, Florian Weimer on behalf of the Debian Security Team announced one of the worst security vulnerabilities in recent history. I won’t go into a technical description of the problem itself. But, it’s interesting to note how Debian both succeeded and failed, how this vulnerability broke the “patch to stay secure” model, and how it personally impacted me.

On Debian…

First, Debian is an all volunteer organization that created and maintains the largest integrated body of code. Ever. The Debian “operating system” is far larger than Microsoft Windows or Mac OS X - they can barely be compared. That a security vulnerability could lie in any package undiscovered for years is unsurprising.

But, once discovered, Debian’s security team promptly released an update of the affected packages fixing the flaw. In the same announcement for the update, there was an included link to a page that promised to have instructions on how to actually close the holes. That page wasn’t filled in until over a day later.

Of course, the wiki page had helpful information within 30 minutes.

Are you saying getting the security update didn’t fix my computer?

Yes. The problem wasn’t a matter of fixing the user’s software but fixing their data. The keys they thought were secure weren’t. The software to make new keys was provided; but, any Debian user that wasn’t subscribed to the right mailing list wouldn’t have known about the further action necessary. (Though, to be fair, the OpenSSH package at least warns about vulnerable keys on update.)

In fact, the average Debian user would be hard pressed to find any mention of the vulnerability. It wasn’t a front page news item. OpenSSL, and all dependent packages, fail to provide any alert on upgrade. Worse, the Certificate Authorities still haven’t revoked certificates for compromised keys. That means the SSL aura of trust has been devalued even more.

It would be an interesting, and expensive, experiment to see how many CAs will EV sign one of the compromised keys.

On me…

Meanwhile, tonight, I finally finished with “key rollover” on all my affected services.

  • tara: No services affected. (Too old.)
  • steak: No services affected. (Too old.)
  • megan: SSH, SMTP / IMAP, XMPP
  • resa: SSH
  • Personal keys: EECS, wsunix, Planet EECS, tara, megan, nearlyfreespeech

Gosh, I hope I got everything. Each of those only took about five hours apiece.

Of course, some people did make it easier. I already shouted out to the wiki page earlier. But, of everything and everyone who should have been doing their jobs, one group stood out and another one embarrassed itself:

From: “NearlyFreeSpeech.NET Member Support”
Subject: [NearlyFreeSpeech.NET] Potentially weak ssh key detected
Date: Wed, 14 May 2008 12:30:00 -0400

Hello

You are being contacted because an ssh key vulnerability in Debian-
derived Linux systems has been detected that may affect you.

Wow. Thanks!

From: “XMPP CertMaster”
Subject: XMPP SSL Certificate revoked, 09:12 pm 13 Jun 2008
Date: Fri, 13 Jun 2008 21:12:48 +0300

This mail is intended for the person who owns a SSL Certificate from the XMPP Intermediate Certification Authority (http://www.xmpp.net).

Your certificate with serial number 890 has been revoked for the following reason(s):

- The holder / owner of the certificate requested revocation.

You can’t blame the XMPP Federation. They don’t actually run a CA, they subcontract. I hope Peter isn’t paying much… as I’d say him having to write a notice of the vulnerability was not his money’s worth.

May 8, 2008

I will never be a software architect

Disclaimer: this may be a Seattle area phenomenon.

I have “software architect” on my resume, and it pains me. Wikipedia has a great article on what a software architect may or may not be. But, in my world, a software architect has the knowledge, insight and responsibility to make educated decisions about the scope and direction of a team-developed software project.

That was a mouthful.

Software architects pick frameworks. They find previously existing packages for functionality just before the rest of the team realizes they need it. And, they plan and communicate how all the moving parts will come together. They’re really-really smart.

Everyone wants to be a software architect. At Seattle’s Startup Weekend, no less than a third of the developers signed up as architects. And why not?! The act of creation - from art to programming - is egotistical. If you’ve ever referred to yourself as a “software engineer” with a straight face, then you’re advertising the capability to plan non-trivial projects.

You’re a liar.

Software is big. You just won’t believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it’s a long way down the road to the chemist’s, but that’s just peanuts to software.

With all apologies to Douglas Adams. Software projects are the most complex machines created in the history of invention. You’re telling me that you can do better than Leonardo da Vinci, Thomas Edison, or the Wright Brothers? Because each of those iconic figures was a genius driven to create simpler machines than a web application. And each was wrong up front.

This isn’t a fair comparison. We have Photoshop, Digi-Key, and kit airplanes. Also, Rails!

Those inventors were forging into unknown territory. Customizing a CMS or integrating SAP ERP into a SOA are known quantities. It could be argued the architect exists for the partially ambiguous problems.

My response is a question oft heard in agile circles. I learned it from working in open source projects, corporate giants, startups and contracting. It’s a kōan:

“What features will you be adding in six months?”

The job of software architect is an answer. Is it the right one?

  • There is value in understanding a problem domain.
    But, the stakeholders in a project tautologically have that.

  • There is value in making the hard decisions.
    But, that is why we have team leaders.

  • There is value in planning your design.
    But, software structure inevitably resembles its team’s structure.

… and so on.

The software architect exists because of the cultural need to have someone be responsible for these aspects. But it isn’t possible to satisfy these responsibilities and simultaneously attend to the details that inform future decisions. Architecture astronauts just don’t have the time to be any more grounded!

Instead? Go slow. Let the programmers make the decisions. Feed them knowledge and constraints. Try to develop a consensus among the actual stakeholders. And accept everyone’s input. That quiet intern? They go home and spend all their spare time playing with tools that handle 80% of the job.

I’m not arguing for agile development practices.

I’m arguing for considered diligence. Plan a little. Work a little. Rinse and repeat. Never let yourself slip into the tunnel-vision that comes with long cycles.

Because if your team cannot make responsible architectural decisions, then no one can save your project.

April 13, 2008

How Scott hosts e-mail

I’ve been on the Internet a long time.

> ;$network.MOO_Name
=> "LambdaMOO"
[used 2 ticks, 0 seconds.]

> @age me
Quad first connected on Tue Oct 31 17:07:28 1995 PST
Which makes us 12 years, 5 months, and 10 days old.
However, for official purposes our age is 12 years, 3 months, and 27 days.

And, in that time, I have accumulated a few e-mail addresses. I’m proud to say that, with a few exceptions due to legal complications, every one of them still reaches me. But, this means I invest quite a bit of effort into my infrastructure.

I have a VPS running Postfix / Fetchmail + Procmail + SpamAssassin + Dovecot. I use mutt and (increasingly) Thunderbird to read and write. It’s a well oiled machine pushing a 6 gigabyte spool.

How Stuff Gets In

The Postfix configuration is bog standard. megan.quadhome.com is the authoritative name for the server. My domains are all virtually aliased to UNIX accounts.

For relaying my mail, the settings are straight-forward. No relaying without authentication. No authentication without TLS.

For the addresses whose domains I don’t directly control, that’s where Fetchmail steps in. I have a .fetchmailrc listing my accumulated servers, accounts and passwords. A crontab entry on @reboot starts the daemon.

How Stuff Gets Munged

I used to use virtual addresses. scott_BLAH@scott.tranzoa.net for anything sketchy. But, I found the effort made no difference in my inbox.

Now, when an e-mail comes in, it goes through a Procmail filter that separates mailing list traffic into their own dedicated boxes. After that, everything remaining is fed into SpamAssassin. I use spamc / spamd with bayes_learn_journal enabled to keep things fast.

As incredible as it sounds, occasionally SpamAssassin is wrong. Two folders named “Ham” and “Spam” exist for those situations. I appropriately file the miscategorized mail and the following script, run @hourly, solves the problem:

#!/bin/sh
#
# learn-mbox
#
# A fancy wrapper around SpamAssassin's sa-learn.
#
# Learn a mailbox and then empty it.
#
# Lock to ensure we don't clobber anything.
#

MBOX="$1"
MODE="$2"

if [ -z "$MBOX" ]; then
  echo "Usage: $0 [MAILBOX] [ham | spam]" >&2
  exit 1
elif [ ! -f "$MBOX" ]; then
  echo "$0: '$MBOX' does not exist." >&2
  exit 1
elif [ ! -s "$MBOX" ]; then
#  echo "$0: '$MBOX' is empty." >&2
  exit 1
fi

# Plain POSIX tests; [[ ]] is a bashism and breaks under a strict /bin/sh.
if [ "$MODE" != "ham" ] && [ "$MODE" != "spam" ]; then
  echo "$0: '$MODE' is not a learning mode. ('ham' or 'spam')" >&2
  exit 2
fi

# Take the lock and keep it fresh in the background while sa-learn runs.
lockfile-create "$MBOX" || exit 3
lockfile-touch "$MBOX" &
TOUCH_PID=$!

sa-learn --mbox "--$MODE" "$MBOX" > /dev/null
# Truncate the mailbox in place.
: > "$MBOX"

# Stop the background toucher, then release the lock.
kill "$TOUCH_PID"
lockfile-remove "$MBOX"

How Stuff Gets To Me

No Hotmail, Eudora, or Squirrelmail for me. I used Pine for the first years of my online life. After the licensing dispute, I switched to mutt and never looked back. It had all the features I needed.

Time marched on, and different features became more important.

Now, I use a combination of Thunderbird and mutt. The former provides a richer experience. The latter is a safety net for when I’m on random computers.

mutt is on the server, so it accesses my mail directly. But, Thunderbird is an IMAP client. And, Dovecot provides those necessary IMAP services.

Dovecot is also configured with out-of-box defaults with one exception. My IMAP passwords are different from my UNIX passwords. Dovecot provides TLS-only SASL authentication with hashed passwords. Postfix also works with Dovecot to share the same authentication method.
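
On the Postfix side, the two relaying rules and the shared authentication could be expressed roughly as below. This is a sketch, not my actual configuration: it assumes Dovecot already exposes its auth socket at Postfix's conventional private/auth path, and that TLS certificates are configured separately. The parameters are stock Postfix settings applied with postconf.

```shell
#!/bin/sh
# Sketch only: run as root on the mail server.

# Hand SMTP authentication to Dovecot's SASL socket.
# The socket path is an assumption; it must match Dovecot's auth listener.
postconf -e 'smtpd_sasl_auth_enable = yes'
postconf -e 'smtpd_sasl_type = dovecot'
postconf -e 'smtpd_sasl_path = private/auth'

# No authentication without TLS.
postconf -e 'smtpd_tls_auth_only = yes'

# No relaying without authentication.
postconf -e 'smtpd_recipient_restrictions = permit_sasl_authenticated, reject_unauth_destination'

# Pick up the new settings.
postfix reload
```

Leaving permit_mynetworks out of the restriction list is the strict reading of "no relaying without authentication"; a server that relays for local daemons would probably want it back.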

The practical upside is when Mallory finds my mail passwords, she can’t destroy my server and backups.

April 13, 2008

The challenge from Denver.

My friend Mike drunk-dials me one evening and leaves a voicemail. He’s out in Boulder for TechStars 2007. Apparently, some friendly harassment over drinks between companies was pushed to the next level. EventVue’s team bet Mike a dinner and some cash that a hack couldn’t be slipped in on their website.

~ Who ya gonna call? ~

I get started Thursday afternoon with a whois/ping of the server, and basically do my homework to make sure all the registration information is what it should be. What can I say - even though I’m being given an account on their server, I still like to feel comfortable before I (possibly) break the law.

Rules of the contest are to find a site modification hack. This has been defined as:

  • XSS
  • SQL Injection
  • Remote Root

I plan on focusing on XSS attacks as they’re easy and have the least potential to cause long-term damage. SQL injection investigation can result in inconsistent database states, and a remote root means a painful security audit for someone who isn’t me.

Their development web server is protected using HTTP authorization - plaintext. I haven’t been given a username and password yet. Therefore, I send Mike a text message and wait to get some permissions.

In the meantime, I refresh my memory on various PHP artifacts. It was mentioned that magic quotes are enabled as a security precaution. A mental echo tells me that the feature offers a false sense of security and that most deployments have it turned off. I read the documentation to be sure. For the uninformed, it’s a mechanism where incoming GET and POST data is unconditionally escaped. It’s generally disabled on servers because of the headaches caused by already-escaped data being escaped again as it passes from page to page. It also offers limited protection against SQL injection, as it’s often easy to bypass in cases of alternate delimiters.
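
The repeated-escaping headache is easy to reproduce. Below is a minimal sketch in shell, not PHP: add_slashes is a made-up helper that imitates what PHP's addslashes does to quotes and backslashes (it ignores NUL bytes), and piping a value through it twice stands in for the same data crossing two pages.

```shell
#!/bin/sh
# add_slashes: hypothetical stand-in for PHP's addslashes().
# Backslashes are escaped first so the ones added for quotes are not doubled.
add_slashes() {
  sed -e 's/\\/\\\\/g' -e "s/'/\\\\'/g" -e 's/"/\\"/g'
}

# One trip through a page: the quote gains a backslash.
printf '%s\n' "O'Reilly" | add_slashes

# A second trip escapes the already-escaped value again.
printf '%s\n' "O'Reilly" | add_slashes | add_slashes
```

The first command prints the name with one backslash before the quote and the second prints it with three, which is why every consumer has to know exactly how many times a value has been escaped.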

30 minutes pass.

Mike sends me a username and password via text message. It isn’t the most secure password, but whatever - I don’t plan on running a dictionary attack or anything.

I logged into the development site and it’s a slightly more broken version of their normal front page. And, I apparently still need an invite. Another phone call to Mike…

30 more minutes pass.

I receive further login details and immediately am greeted with an inauspicious beginning. In their login page, the authentication fields are pre-filled with the incorrect credentials I had supplied earlier. I don’t have Javascript enabled yet (NoScript) and I planned on taking a look at the cookies later but… I decided to look then.

There were only session IDs. Their server is storing the username and password cleartexts keyed to the session ID and then pushing them back to the client in the HTML. If I find an XSS, then I can steal anyone’s username and password by requesting their login page.

Also, my username and password still don’t work.

While I wait for further details from Mike, I suss out the beginnings of a POC. The login page is XSS’able via its authentication fields. I can cull passwords via an XSS against it and then XMLHTTP’ing the password scraped from the DOM back.

Though, it is destructive on the username, but I think that can be worked around.

20 minutes pass.

I’m finally in the site. It was a matter of a “beta.” vs. “dev.” URL. I take a look at “Account Settings” and they’re kicking back the username and password there too in cleartext. So, the login page XSS doesn’t need any trickery to work around.

Their search page uses some odd search-and-replace mechanism on the query quoting. I can’t figure it out too much, but a simple XSS of:

/search?q=%22%3E%3Cscript%3Ealert(1)%3C/script%3E

Works just fine. But, I still want to find an injection hole in order to make something self-replicating.

The profile page is where they spent their lock-down time. Every field has aggressive HTML stripping and magic quotes applied. This makes for some ugly formatting bugs, but I can’t immediately push an XSS through there. The HTML filter is something along the lines of:

regexp_replace("\<^\w*>", "")

I feel that there should be some trick to using magic quotes and their inconsistent use of stripslashes to bypass it all. Specifically, they strip on some output (profile page) and not on others (profile edit page). I’m surprised they don’t just use htmlspecialchars and be done with it.
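
For a sense of what that escaping buys, here is a rough shell analogue of the idea. It is a sketch under assumptions, not PHP's implementation: html_escape is a made-up name, and it replaces the five characters HTML treats specially, with the ampersand handled first so the entities it produces are not escaped again.

```shell
#!/bin/sh
# html_escape: hypothetical shell analogue of escaping output for HTML.
# '&' must be replaced first, or the later entities would be re-escaped.
html_escape() {
  sed -e 's/&/\&amp;/g' \
      -e 's/</\&lt;/g' \
      -e 's/>/\&gt;/g' \
      -e 's/"/\&quot;/g' \
      -e "s/'/\&#039;/g"
}

# The search-page payload from earlier, decoded, comes out as inert text.
printf '%s\n' '"><script>alert(1)</script>' | html_escape
```

Escaping on output, once, at the point the value lands in HTML, sidesteps the whole question of which pages strip and which do not.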

An hour passes.

I called Mike to let him know I win. While I think my earlier XSS attacks were enough, I finally found an on-site modification. Changing the user’s name to a quote injected with an onload event worked. It triggers on all other users when they visit the Community Page too.

Does this mean I win a free trip to Boulder, and Munchy’z tomorrow? Sweet deal.

This was first posted 2007-07-06 but taken down because EventVue was nascent. It’s back now, for keepsies.

November 5, 2007

Wide Finder: Analysis?

Tim Bray’s response to the suggestion of analysis for the Wide Finder Results is “Are you kidding me!?!? Getouttahere. Maybe someday.”

I’m only barely braver.

People hours are more expensive than computer hours. Tim includes the lines of code metric, and the average elapsed wall-clock for each implementation. Let’s use division!

Name          Language  Elapsed  LoC  LoC per Elapsed  Model
clv5          Gawk        46.73   24             0.51  Serial
wf_p          Ruby        50.16   39             0.78  Map-Reduce
wf-2          Python      41.04   38             0.93  Map-Reduce
wf-Heikkinen  OCaml       49.69  110             2.21  Serial
wf-Fernandez  OCaml       39.17  124             3.17  Serial
tbray5        Erlang      20.74   76             3.66  Message Passing
tbray9(128)   Erlang      21.58  119             5.51  Message Passing
wf-block      OCaml       18.99  144             7.58  Serial
wf-6(2)       Python      16.91  137             8.10  Scatter-Gather
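
The division is a one-liner in awk. A minimal sketch, assuming each input line holds a name, the elapsed time, and the line count in that order (the function name loc_per_elapsed and the input layout are made up for illustration):

```shell
#!/bin/sh
# loc_per_elapsed: read "name elapsed loc" lines and print the name
# followed by lines of code divided by elapsed time, to two decimals.
loc_per_elapsed() {
  awk '{ printf "%s %.2f\n", $1, $3 / $2 }'
}

# Three rows from the table, fed through the function.
printf '%s\n' \
  'clv5 46.73 24' \
  'wf-2 41.04 38' \
  'tbray5 20.74 76' | loc_per_elapsed
```

Piping the output through sort -k2 -n would reproduce the table's ordering, lowest ratio first.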

Let’s assume fewer lines of code = easier to understand. Let’s also assume that parallel processing concepts are hard to learn.

Then it seems Map-Reduce models are maturing well. Thank Google for popularizing that.

Odd, though. Erlang’s model of message passing is older. But, I hear there are weaknesses in its standard library?

October 12, 2007

NSA Hookups

Last week, while bored in morning class, I had a brainstorm for a humorous personal ad. I wrote it up and posted it on the only worthwhile classified service: Craigslist. Then, I started surfing around the rest of the Pullman section.

I had never visited before.

Unsurprisingly, there were a few postings in the women seeking men. And barely a handful of postings were in the men seeking women personals. But, surprisingly, the casual encounters sections were full of lonely hearts. Or, more accurately, lonely beds.

I have read articles about websites where people hook up. But, I had assumed this was a matter of statistics - there are always a few crazies on the Internet. But, if the tiny Pullman section of Craigslist was so depraved… I checked Seattle and Los Angeles.

Thus my mind was blown.

Every metropolitan area had far more “no strings attached” sex postings than anything of a romantic quality. And, every advertisement was so straight to the point. “I’m blah blah blah. You be somewhere in the range of blah blah blah. Respond with a picture and you’ll get mine.” No beating around the bush.

At this point, I wondered why the heck are people posting these to Craigslist? I googled for the obvious terms of “hookup”, “nsa sex” and “booty call.” Several Google Adwords campaigns later, I realized there were only a few websites catering to this sort of thing. And I also realized they are totally messing up their market.

Every one of these websites wanted a person to sign up, put in a large number of details, and in general put themselves out there before ever having the opportunity to reach out and touch somebody. I didn’t even check the ones that required a credit card. This explained why all these posts were on Craigslist - it’s free and simple.

I know I can beat that.

So, imagine a website that opens up to a sign-up page. It asks for an e-mail address, zip code, and a few body characteristics. After confirming your address, it asks you for what body characteristics you’re looking - an age range, race check-boxes, and height/weight seem good enough for version one. Then, around lunch-time, you receive an e-mail saying “50 matches found, cutie.”

It needs to be “cutie” to let you know the website is hardcore.

A list of profiles, maybe with one sentence taglines, appears. You then can click “OK” or “No Way!” for each one. If two people’s “OK”s match, they’re connected and able to send text and pictures to each other. This gives six hours to arrange a hookup.

At midnight, the coach turns back into a Pumpkin - all profiles you didn’t match up with are cleared. Any matches with communications are then able to be rated: “Call for a Good Time” or “SEXUAL PREDATOR.” Clearly, the votes exist to bias these people in future rounds.

Web 2.0 wins again.

October 10, 2007

Torrents for Friends

BitTorrent is the most popular method of file-sharing in the very short history of file-sharing.

A lot of noise is made about the average user’s inability to learn new things. But, every method of file-sharing has been embarrassingly technical. It’s clear that a free copy of the latest Britney Spears single is motivation enough to:

  • Buy faster Internet connections
  • Download a client
  • Find a torrent file
  • Wait an undetermined amount of time
  • Load the MP3s into iTunes
  • Sync to an iPod
  • Burn to a CD-R

What an odd collection of acts that meant nothing 20 years ago.

Even stranger is the concept of a “private tracker.” Websites like the Pirate Bay, Demonoid and the now defunct Suprnova have been written about in the New York Times and mentioned on both CNN and MTV. But, with the ascent of BitTorrent, there has been a similar descent in sharing. It’s difficult to host a torrent.

  • Create a torrent file
  • Find hosting for it
  • Share the link
  • Wait an undetermined amount of time

Where is the service providing a private tracker with my friends?

July 17, 2007

Leveraging piracy for dollars, pt. 1

The magic of the iTunes Store is its consistency. All the music you want is there. You know how much it costs. You know what the quality will be. The shopping experience is measurably more consistent and enjoyable than purchasing a CD or vinyl.

I’d argue building an experience is Steve Jobs’s magic.

The problem with the iTunes Store is its restrictiveness. I’m not referring to DRM. I mean that you are binding yourself to an entire business ecosystem. If I want to buy a song:

  1. I’m patronizing Microsoft or Apple.
  2. (EMI excepted) I’m supporting the use of DRM.
  3. iTunes encourages the use of an iPod.

Each of these is a significant market limitation. For example, Microsoft’s URGE differentiates itself only on the third point. And, as far as I know, there has only been one service that did away with those three requirements…

AllOfMp3 was incredibly popular, and incredibly illegal. It took the US Government lobbying Russia to close it down. Clearly, the model of a website with a consistent experience and none of the above restrictions is demanded by the market.

So, piracy continues in its vacuum.