About Tom Evslin


Amazon S3 – Backstory for Nerds - Part 1

It’s practical to use Amazon’s Simple Storage Service (S3) as the backend for a web application WITHOUT any intervening application server. I have the code to prove it and I’m sharing it. If you’re a nerd, read on; otherwise you may want to skip this post.

S3 clearly was designed for use with an application server between browser and desktop applications and the actual data store. To Amazon’s credit, that server doesn’t have to be on Amazon’s Elastic Compute Cloud (EC2), although pricing somewhat favors this. The Getting Started Guide, for example, gives examples only in PHP, C#, Java, Perl, Ruby and Python because they assume that you’ll be using one of those languages on a server. The implication is that you can’t do it in JavaScript/AJAX – but you can.

The Developers Guide also gives short shrift to JavaScript. It does explain that you can let “your users” upload certain data directly (bypassing your server) through HTML forms, although it assumes that access keys will be delivered to the browser by a server just prior to upload for all non-trivial cases. There is also a description of a way to prepackage a very specific authorized retrieval request so that the browser can get data directly from S3 – but, again, the assumption is that the browser first asks “mother may I?” and receives a specific and time-limited key from the server before making an XMLHttpRequest GET.

My application – broadbandwiki – lets users put pins on a map in order to indicate what type of Internet access they have available at their locations. S3 is the place where the locations, access types, and access providers are remembered. Google’s examples assume that, if you want to save the locations you gather with an application like this, you’ll have a LAMP stack operating on a server somewhere. They give good examples of the PHP and MySQL you’d use to do this.

My problems were that I don’t have a server on which to run said LAMP stack and that I would have to learn a lot about Linux, Apache, MySQL, and PHP. The first problem could easily be solved by any of a number of hosting services including EC2, but all the learning seemed a lot to wade into when I just wanted to store a little data and have access to it later. It probably ended up being harder to get S3 to do what I wanted than if I’d just swallowed my medicine and learned the server-side stuff, but I got stubborn.

The good news is that, once you know the tricks, this is easy.

Here’s the main thing you need to know even though the documentation doesn’t say it: XMLHttpRequest works not only for GET but also for PUT and DELETE (those are all I’ve tested). I didn’t need PUT and DELETE in the user application but I did need them in the batch-like administrator application. Note that the administrator application does all sorts of privileged stuff but it can still be run safely in a browser because the Amazon Secret Access Key is supplied by the administrator at run time. It is used to compute a time-limited signature (a keyed hash of the request) on the administrator’s computer, but the secret key itself is NEVER transmitted.
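The local signing step described above can be sketched as follows. This is not the code from the administrator app (which is JavaScript and is not reproduced in this post); it is a minimal Python illustration that assumes S3’s legacy query string authentication (Signature Version 2, HMAC-SHA1), which newer S3 regions no longer accept. The function name `presign_s3_url` and the bucket, key, and credential values in the usage note are all hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def presign_s3_url(method, bucket, key, access_key_id, secret_key,
                   expires, content_type=""):
    """Build a time-limited S3 URL (legacy Signature Version 2 sketch).

    `expires` is an absolute Unix timestamp after which S3 rejects the URL.
    The secret key is used only to compute the HMAC; it never appears
    in the returned URL.
    """
    resource = "/" + bucket + "/" + quote(key)
    # String to sign: verb, Content-MD5 (left empty here), Content-Type,
    # expiry time, then the resource path.
    string_to_sign = "\n".join(
        [method, "", content_type, str(expires)]
    ) + "\n" + resource
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return (
        "https://s3.amazonaws.com" + resource
        + "?AWSAccessKeyId=" + quote(access_key_id, safe="")
        + "&Expires=" + str(expires)
        + "&Signature=" + quote(signature, safe="")
    )
```

For example, `presign_s3_url("PUT", "example-bucket", "pins/1.xml", "AKIDEXAMPLE", "not-a-real-secret", 1200000000, "text/xml")` returns a URL that authorizes exactly one kind of request until the expiry time. The signature covers the verb and the Content-Type, so the request that uses the URL has to send the same values or S3 will reject it.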

Note that XMLHttpRequest is usually limited to making requests to the same domain that the web page was served from, and this is no exception. The solution is to host the web page on your Amazon S3 account (it’s got to live somewhere, anyway) so that there is no cross-domain violation. However, the page hosted on Amazon can be in a frame of a page hosted elsewhere if that’s important to the look and feel of what you’re doing.
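To make the same-domain restriction concrete, here is a small Python sketch of the comparison a browser effectively performs: two URLs share an origin only if scheme, host, and port all match. It is a simplification of the real browser rule, and the helper names and example URLs are hypothetical.

```python
from urllib.parse import urlsplit

# Ports implied when a URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(page_url, request_url):
    """True when a page at page_url could make this request without
    crossing origins."""
    return origin(page_url) == origin(request_url)
```

With a page served from `http://s3.amazonaws.com/example-bucket/app.html`, a request to `http://s3.amazonaws.com/example-bucket/pins.xml` passes this check, while the same request from a page at `http://blog.example.com/app.html` does not. That is why the page itself is hosted on S3.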

Sample code for the administrator app (which borrows liberally from every other bit of sample code I could find) is here. It won’t run for you until you have an Amazon S3 account and fix the places where I hard-coded the name of my own S3 bucket, but it is meant to provide an example of how to use XMLHttpRequests with S3. It is NOT a sample of clean, well-documented code, however; it’s a work in progress. It also needs more functionality and more error checking, so I will post later versions as I clean it up.

Next nerd post on this subject will be on how to let browser users write data, some of which is meant to be seen by other browser users and some of which isn’t – all without a server of your own.

Once I have done some cleaning up and packaging, I’ll also post the user application although all you good hackers know you can already get it just by downloading it from its web address or viewing source in a browser.


Comments

Tom Evslin

Dave:

Thanks for both your comments. Am looking forward to looking at and playing with the Google capability. In the meantime have learned php for my next project so am moving to the server side.

Don't want to be specific (or ever complacent) on security stuff. Don't think there's a significant man-in-the-middle danger with the admin program even when the connection is not secure because requests are signed in many bits with the secret key that doesn't leave the computer and a very short time limit. Repeating the request would usually fail.

The html for the app is, of course, visible to anyone who wants to look at it and hackable. But the non-secret key embedded in it only allows what users are allowed to do. Could be put in another program to do the same thing but this is write-only and not in a form that can be reread by the writer.

Dave L.

Google's new AppEngine service will give you just what you're looking for: the ability to build web applications without having to learn the whole LAMP stack. If and when AppEngine supports JavaScript, you won't even have to learn a new language.

Dave L.

This is very clever, but it reminds me of the bad old days when people didn't appreciate that the HTML forms and cookies that they sent to a browser could be tampered with. There used to be online stores that kept product prices in the HTML form, and you could give yourself a hacker discount by sending an altered form back to the server.
In this case it's the JavaScript sent to the browser that can be hacked to wreak havoc on your datastore.
The admin app is interesting though if it can be used over an encrypted connection, otherwise it's subject to man-in-the-middle attacks.

Tom Evslin

Tom:

The PUT request is presigned allowing a very specific kind of write resulting in a record which belongs to the owner of the bucket.

One problem in generalizing this is that a different bucket would require a different signature which can't be generated by the code without having the secret key of the bucket owner which it isn't allowed to have.

Tom

I understand how the admin portion of the application functions, but how are users able to submit their pins to the map if they don't have PUT access?

Craig Plunkett

Thank you Tom, thank you thank you.

samwyse

Thanks for the explanation of how you're using S3; I've been waiting to see how you did this since I noticed your application's URL. Two things occur to me that you may not have thought of in your haste to get things up and running: First, if you create a bucket named broadbandwiki.tomevslin.com and set up a DNS server to point it to S3's IP address, then you can use http://broadbandwiki.tomevslin.com/broadbandwiki.html as your application's URL and only packet-sniffing geeks will ever know where it's actually hosted.

Second, as I tried to figure out how you wrote your application before today's post, I devised a clever way to gather data into an S3 bucket. You can tell S3 to store a bucket's web logs in another bucket. So, your application just needs to access a page that says "Data recorded" and provide the user's "pin data" in a custom HTTP header. At regular intervals, you run a batch process to extract that header field from the log files and update your main database. (I suppose I could get a "business process" patent on this idea, but I'm not a big believer in those things.)
