<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ryanpark.org &#187; Technology</title>
	<atom:link href="http://www.ryanpark.org/category/technology/feed" rel="self" type="application/rss+xml" />
	<link>http://www.ryanpark.org</link>
	<description>The personal home page of Ryan Park of San Francisco, California, USA.</description>
	<lastBuildDate>Thu, 10 Jun 2010 08:36:13 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Well, I tried Quicken Online&#8230;</title>
		<link>http://www.ryanpark.org/2008/12/quicken-online.html</link>
		<comments>http://www.ryanpark.org/2008/12/quicken-online.html#comments</comments>
		<pubDate>Sun, 28 Dec 2008 10:48:53 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/?p=434</guid>
		<description><![CDATA[And I was sorely disappointed. Earlier this month I reviewed Mint, an online personal finance website. At the time I promised I would review some of Mint&#8217;s competitors. Today I took a look at Quicken Online. Quicken Online is designed to compete with websites like Mint, but is not a replacement for desktop finance apps. [...]]]></description>
			<content:encoded><![CDATA[<p>And I was sorely disappointed.<br />
<span id="more-434"></span><br />
Earlier this month <a href="http://www.ryanpark.org/2008/12/mint.html">I reviewed Mint</a>, an online personal finance website.  At the time I promised I would review some of Mint&#8217;s competitors.  Today I took a look at Quicken Online.</p>
<p><a href="https://www.quickenonline.intuit.com">Quicken Online</a> is designed to compete with websites like <a href="http://www.mint.com">Mint</a>, but is not a replacement for desktop finance apps.  It&#8217;s a single portal to view all of your bank transactions&#8230; and that&#8217;s about it.  I don&#8217;t see any significant features that Mint doesn&#8217;t already have.</p>
<p>When you sign up for a Quicken Online account, it asks you to enter the usernames and passwords for your bank accounts.  Intuit claims that the sign-on data is &#8220;encrypted and stored on our firewall-protected servers,&#8221; but as a software developer, I don&#8217;t find that particularly reassuring. I&#8217;d rather avoid giving my sign-on information to third parties altogether. That said, Intuit is a large company with many years of experience in storing financial data, so I do have some faith that they know how to handle it safely.</p>
<p>Once you enter your bank accounts, you can view transactions in those accounts from the Quicken Online home page.  One nice feature is that you can enter upcoming transactions before they&#8217;re posted by your bank, so you can get an estimate of your upcoming cash flow.  But you can&#8217;t enter any accounts that don&#8217;t have sync capabilities with Intuit.</p>
<h3>Reconciliation</h3>
<p>Unlike many people, I still keep track of my transactions separately from my banks, and reconcile bills and statements when they arrive.  It&#8217;s a good way to avoid unauthorized charges and keep track of my spending.  I&#8217;d like to &#8220;bring that to the 21st century&#8221; by checking off transactions as they post from my bank, rather than having to enter every transaction.  Well, Quicken Online doesn&#8217;t have any such features. Here&#8217;s how they justify this:</p>
<blockquote><p>Reconciling is useful for matching up your paper checkbook register with the transactions on your bank statement. But who keeps a paper checkbook register these days?</p></blockquote>
<p>I find this justification highly suspect, coming from a company that sells millions of copies of an electronic check register that does just that.  Of course I don&#8217;t want to reconcile a paper checkbook register.  I do want to reconcile credit card receipts and (gasp) paper checks.  I want to differentiate transactions I&#8217;ve already seen from those that are brand new.  Reconciliation is a critical process of staying within a budget and avoiding fraudulent charges.  I can&#8217;t tell whether Intuit&#8217;s online division&#8217;s management truly believes that reconciliation has gone the way of the dodo, or whether they&#8217;re just trying to avoid cannibalizing Quicken software sales.</p>
<p>At any rate, I can&#8217;t find any significant reasons to recommend Quicken Online.  Intuit is a more established company than Mint and Wesabe, so I trust them a little more to store my bank account login safely.  But Mint has a much &#8220;fresher&#8221; design, a nice iPhone app, and a more active user community.</p>
<p>In a few days I&#8217;ll review Wesabe as well and see how it matches up to these two.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2008/12/quicken-online.html/feed</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Weekend website updates</title>
		<link>http://www.ryanpark.org/2008/12/weekend-website-updates.html</link>
		<comments>http://www.ryanpark.org/2008/12/weekend-website-updates.html#comments</comments>
		<pubDate>Mon, 22 Dec 2008 09:08:15 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Professional]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/?p=432</guid>
		<description><![CDATA[I spent much of this weekend inside, out of the snow, working on a couple of geeky/cool website updates. First, I sent out about 150 online Christmas cards. For the last few years I used Yahoo Greetings, but this time I decided I wanted something a little more personal. I included a Flickr slideshow and [...]]]></description>
			<content:encoded><![CDATA[<p>I spent much of this weekend inside, out of the snow, working on a couple of geeky/cool website updates.</p>
<p>First, I sent out about 150 online Christmas cards. For the last few years I used Yahoo Greetings, but this time I decided I wanted something a little more personal. I included a <a href="http://www.flickrshow.com">Flickr slideshow</a> and a short holiday letter to the people I haven&#8217;t talked to recently. Plus I got to play around with some cool new technology like <a href="http://novemberborn.net/sifr3/alpha">sIFR</a> and the <a href="http://aws.amazon.com/cloudfront/">Amazon CloudFront</a> CDN. If you didn&#8217;t receive a card but you&#8217;d like one, let me know in the comments.</p>
<p>Second, I developed a <a href="http://www.ryanpark.org/dashboard/">dashboard</a> to monitor the health of my web server. I wrote a small script which captures statistics every minute, and then built the dashboard to go along with it. I&#8217;m using <a href="http://code.google.com/p/flot/">Flot</a>, a JavaScript graphing library, to generate the graphs. It probably would have been wise to use an existing component to capture the data, but this works fine for now.</p>
<p>Geeky/cool indeed.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2008/12/weekend-website-updates.html/feed</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>PureVPN</title>
		<link>http://www.ryanpark.org/2008/12/purevpn.html</link>
		<comments>http://www.ryanpark.org/2008/12/purevpn.html#comments</comments>
		<pubDate>Mon, 15 Dec 2008 13:58:53 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/?p=425</guid>
		<description><![CDATA[Home users often use &#8220;virtual private networks&#8221; (VPNs) to establish a secure Internet channel to an office network. Recently some home users have found another reason to use a VPN. Many wireless networks are not configured to encrypt any of their traffic, especially those in public such as hotels and coffee shops. Some people have [...]]]></description>
			<content:encoded><![CDATA[<p>Home users often use &#8220;virtual private networks&#8221; (VPNs) to establish a secure Internet channel to an office network. Recently some home users have found another reason to use a VPN.  Many wireless networks are not configured to encrypt any of their traffic, especially those in public such as hotels and coffee shops.  Some people have begun using VPNs when connected, simply to encrypt the information that&#8217;s sent over the wireless network.  Google even offers a free VPN service for anyone connected to Google&#8217;s citywide wifi network in Mountain View, California.</p>
<p>I&#8217;ve been worried about Internet security myself recently, so I started trying to set up my own VPN using the free OpenVPN software.  My goal was simply to encrypt the traffic between my laptop and a computer I run on a wired network.  This wouldn&#8217;t encrypt all my communications on the Internet backbone, but at least it would prevent snooping on wireless networks.</p>
<p>OpenVPN is designed to handle an incredibly wide variety of networks, and as a result it&#8217;s very difficult to configure to do something &#8220;simple&#8221; like this. I spent an hour reading instructions and generating encryption keys, but when I first tried to run the OpenVPN software on my MacBook, it crashed the computer.  I quickly decided this wasn&#8217;t for me.</p>
<p>Instead I tried PureVPN, a low-cost VPN service open to the public.  PureVPN is a pay-as-you-go service and offers a variety of inexpensive service plans.  I paid $2.50 and received ten hours of VPN use.  This would be a great deal&#8212;if the service worked as promised.</p>
<p>PureVPN doesn&#8217;t require any software beyond what&#8217;s built into Mac OS X or Windows.  It was very easy to set up and when I tested it from home, it seemed to work fine.  I confirmed that all of my Internet traffic was sent over the encrypted VPN, which ensured that I&#8217;d be protected from nosy neighbors. I tried it from a coffee shop once and it worked fine from there as well.</p>
<p>However, the real test occurred when I went on vacation in Las Vegas, Nevada.  Away from home for a week, I wanted to use PureVPN over many insecure wireless networks&#8212;at hotels, at cafés, and at my sister&#8217;s house.  But when I got to Las Vegas, I found that PureVPN was down!  It was down the entire week that I was gone, and only came back a few days after I returned home.</p>
<p>I hadn&#8217;t invested much money in my PureVPN subscription, so I haven&#8217;t contacted them about the downtime.  At $2.50 I figure &#8220;you get what you pay for.&#8221;  But unfortunately I can&#8217;t recommend PureVPN to anyone else, simply because I don&#8217;t trust I can rely on them when I need them.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2008/12/purevpn.html/feed</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Using Mint for Personal Finance</title>
		<link>http://www.ryanpark.org/2008/12/mint.html</link>
		<comments>http://www.ryanpark.org/2008/12/mint.html#comments</comments>
		<pubDate>Mon, 15 Dec 2008 12:36:36 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/?p=417</guid>
		<description><![CDATA[I finally entered the 21st century this week when I began using an online personal finance application called <a href="http://www.mint.com">Mint</a>. Read on for my review of the site.]]></description>
			<content:encoded><![CDATA[<p>I finally entered the 21st century this week when I began using an online personal finance application called <a href="http://www.mint.com">Mint</a>.</p>
<p>I’ve been a long-time user of <a href="http://www.microsoft.com/money">Microsoft Money</a> to keep track of all my bank accounts: checking, credit cards, loans, and investments. I enjoy the comfort of manually entering all my transactions, and reconciling the bank statement at the end of the month. This process lets me keep an eye on my accounts and watch out for suspicious activity. But in ten years of banking, I don’t think I’ve ever found a truly unauthorized transaction.</p>
<p>Microsoft Money made data entry painless and provided many simple reports about where my money is going. But I’ve had trouble finding a similar solution after I switched to a Mac in 2007. Both <a href="http://www.amazon.com/gp/product/B000GI0HR2?ie=UTF8&#038;tag=ryanparkorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=B000GI0HR2">Quicken Personal Finance for Mac</a><img src="http://www.assoc-amazon.com/e/ir?t=ryanparkorg-20&#038;l=as2&#038;o=1&#038;a=B000GI0HR2" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> and <a href="http://www.amazon.com/gp/product/B000Q1OTTG?ie=UTF8&#038;tag=ryanparkorg-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=B000Q1OTTG">iBank</a><img src="http://www.assoc-amazon.com/e/ir?t=ryanparkorg-20&#038;l=as2&#038;o=1&#038;a=B000Q1OTTG" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /> have atrocious data entry processes that require constant movement between the keyboard and the mouse. It’s also very difficult to generate simple reports in either program. I’ve been using iBank for a year and I still can’t figure out how to see a list of all transactions in a category.</p>
<p>Frustrated with the lack of Mac options, I finally decided to try some online finance apps instead. <a href="http://www.mint.com">Mint</a> has developed a very good reputation online so it was the first site I tried. Mint is a web application that actually connects to your banks every night and downloads all of your transactions. It presents you with a summary of your recent transactions and your current balances. You can easily drill down and find more information about your accounts and transactions. The site is beautiful, and reports are easy to generate and customize.</p>
<p>Security was my overwhelming concern&#8212;they have your online banking password, so they could access all of the money in your accounts! But they say that they’ve developed reasonable safeguards, and they work with a third-party intermediary to secure your data.</p>
<p>Some more things I like about Mint:</p>
<ol>
<li>It was incredibly easy to add most of my bank and investment accounts to the site. Mint was pre-configured to work with nearly all of my financial institutions.</li>
<li>The site automatically imports new transactions every time I log in. This works well and it’s easy to customize the way different transactions are handled. Mint will send me a text message when large transactions clear or when my balances are low, which is a great service.</li>
<li>The forums are fantastic and there’s already a strong user community sharing tips and tricks about saving money with Mint.</li>
<li>Mint is free, and they make money by suggesting new financial services. But they only recommend services when it’s likely in your best interest. For example, right now Mint says I could save money by moving my checking account to HSBC. This is an unusual business model because most referral programs are difficult to set up and don’t pay well. But financial referrals usually do pay well&#8212;think of all the gimmicks to entice you to sign up for a credit card. I suspect Mint will become profitable quickly.</li>
</ol>
<p>What I don’t like:</p>
<ol>
<li>I’m still not eager to give my bank account credentials to Mint. It’s impossible to know exactly how Mint stores this data or how secure their service providers are. I’m not sure that there is a good solution to this problem in any environment where the data is outside my immediate control.</li>
<li>If your bank isn’t supported by Mint, there’s no way to track an account manually. This is a significant problem for me. I have a health savings account (HSA) which Mint can’t recognize. Either I need to keep using Quicken for this one account&#8212;which is silly&#8212;or I need to stop recording transactions from my HSA. Neither option is satisfactory. I will not be able to rely on Mint alone unless I can enter data manually and have my HSA treated the same as all my other accounts.</li>
</ol>
<p>Next week I’m going to try <a href="http://www.wesabe.com">Wesabe</a>, another site similar to Mint, and see how it stacks up. I also might try <a href="http://quicken.intuit.com/online-banking-finances.jsp">Quicken Online</a>, which Intuit is now offering for free. I’ll post reviews of those applications once I’ve had a chance to try them.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2008/12/mint.html/feed</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Top 10 Reasons to Avoid the SimpleDB Hype</title>
		<link>http://www.ryanpark.org/2008/04/top-10-avoid-the-simpledb-hype.html</link>
		<comments>http://www.ryanpark.org/2008/04/top-10-avoid-the-simpledb-hype.html#comments</comments>
		<pubDate>Mon, 21 Apr 2008 23:42:18 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Professional]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/?p=392</guid>
		<description><![CDATA[There is a ton of chatter on the Internet about Amazon SimpleDB, Apache CouchDB, Google App Engine&#8217;s Datastore API, and other distributed key-value data stores. Their biggest perceived advantage is scalability: they can help eliminate the bottleneck imposed by single-server databases. But the hype around these new databases is growing frantic. This morning I read [...]]]></description>
			<content:encoded><![CDATA[<p>There is a ton of chatter on the Internet about <a href="http://www.amazon.com/SimpleDB-AWS-Service-Pricing/b?ie=UTF8&#038;node=342335011">Amazon SimpleDB</a>, <a href="http://incubator.apache.org/couchdb/">Apache CouchDB</a>, Google App Engine&#8217;s <a href="http://code.google.com/appengine/docs/datastore/">Datastore API</a>, and other distributed key-value data stores. Their biggest perceived advantage is scalability: they can help eliminate the bottleneck imposed by single-server databases.</p>
<p>But the hype around these new databases is growing frantic. This morning I read <a href="http://highscalability.com/search-source-data-how-simpledb-differs-rdbms">an article by Todd Hoff</a> which fawned over SimpleDB&#8217;s unconventional rules to such an extent that I thought it might be satire. There are some significant drawbacks to developing in this new database paradigm. In fact, many of Mr. Hoff&#8217;s supposed advantages are actually serious disadvantages to the paradigm. Before designing your architecture around a database engine like SimpleDB, it&#8217;s important to consider the reasons <i>not</i> to do so.</p>
<p><span id="more-392"></span></p>
<p>Most of my points are directed at the Amazon SimpleDB service, but many also apply to other databases like CouchDB and the Google Datastore.</p>
<h3>1. Data integrity is not guaranteed.</h3>
<p>Data stores like SimpleDB don&#8217;t support the same rigorous constraints that RDBMSes do. Some of these databases support single-row constraints, like requiring data in certain fields, but it is nearly impossible for these systems to enforce <tt>UNIQUE</tt> constraints and foreign keys.</p>
<p>Programmers can work around this by issuing extra queries to confirm an update is valid, but this requires a lot of extra work. This will never be perfectly accurate: it may be impossible to avoid race conditions when two clients simultaneously attempt conflicting updates. And it&#8217;s especially difficult with SimpleDB because SimpleDB doesn&#8217;t guarantee that a client sees all the recent updates to the data.</p>
<h3>2. Inconsistency will provide a terrible user experience.</h3>
<p>Speaking of inconsistency, it&#8217;s critical to shield users from this property of SimpleDB.</p>
<p>SimpleDB is optimized for fast writes. Your API calls return as soon as the data is written to the SimpleDB service, but before it&#8217;s replicated across all of the SimpleDB servers. If you issue any queries before the data is propagated, you won&#8217;t necessarily see your most recent change.</p>
<p>When I save my changes in your web application, I expect that your system will show me a consistent view of those changes. If you show me the data that&#8217;s in SimpleDB, my changes might not appear, and I&#8217;ll probably get confused. In fact, I will <i>probably</i> freak out, thinking that you lost my data. You can try to inform me about how this works (&#8220;It will take a few minutes for your changes to be visible&#8230;&#8221;) but that&#8217;s <i>not</i> easy for users to grasp.</p>
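<p>The stale-read problem can be illustrated with a toy model. This is only a sketch in Python, not SimpleDB&#8217;s actual replication protocol: the <tt>ToyReplicatedStore</tt> class, its three replicas, and the explicit <tt>propagate</tt> step are all assumptions made up for the illustration.</p>
<pre>
```python
class ToyReplicatedStore:
    """Toy model of a store that acknowledges writes before replicating them."""

    def __init__(self, replica_count):
        # Each replica is an independent dict; replica 0 accepts all writes.
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # writes not yet copied to the other replicas

    def put(self, key, value):
        # The call returns as soon as one replica has the data.
        self.replicas[0][key] = value
        self.pending.append((key, value))

    def get(self, key, replica_index):
        # A real client cannot pick the replica; the index keeps the demo repeatable.
        return self.replicas[replica_index].get(key)

    def propagate(self):
        # Copy every pending write to the remaining replicas.
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica[key] = value
        self.pending.clear()


store = ToyReplicatedStore(replica_count=3)
store.put("profile:42", "new email address")
stale_read = store.get("profile:42", replica_index=2)  # None: not replicated yet
store.propagate()
fresh_read = store.get("profile:42", replica_index=2)  # the write is now visible
```
</pre>
<p>Between <tt>put</tt> and <tt>propagate</tt>, a read that lands on another replica returns nothing, which is exactly the window in which a user would conclude that the change was lost.</p>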
<h3>3. Aggregate operations will require more coding.</h3>
<p>SimpleDB does not support aggregate operations like joins, <tt>GROUP BY</tt>, <tt>SUM</tt>/<tt>AVG</tt> functions, and sorting. You will need to implement these yourself.</p>
<p>Todd Hoff argues that <a href="http://highscalability.com/search-source-data-how-simpledb-differs-rdbms">this &#8220;suckiness&#8221; is a fair tradeoff</a>:</p>
<blockquote><p>SimpleDB shifts work out of the database and onto programmers which is why the SimpleDB programming model sucks: it requires a lot more programming to do simple things. I&#8217;ll argue however that this is the kind of suckiness programmers like. Programmers like problems they can solve with more programming. We don&#8217;t even care how twisted and inelegant the code is because we can make it work. And as long as we can make it work we are happy.</p>
</blockquote>
<p>I disagree. More boilerplate code distracts you from actually solving real users&#8217; needs. Why reinvent the <tt>GROUP BY</tt> wheel when MySQL, PostgreSQL and Oracle have already perfected it?</p>
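<p>To make the tradeoff concrete, here is a minimal sketch in Python that uses the standard-library <tt>sqlite3</tt> module as a stand-in for an RDBMS. The <tt>purchases</tt> table, its columns, and the sample rows are hypothetical. The first function leaves grouping and summing to the database; the second does the same work in application code, roughly what a key-value store would force on you.</p>
<pre>
```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (category, amount) pairs.
ROWS = [("books", 12.5), ("food", 30.0), ("books", 7.5), ("food", 10.0)]


def totals_with_sql(rows):
    """Let the database do the grouping and summing."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (category TEXT, amount REAL)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT category, SUM(amount) FROM purchases "
        "GROUP BY category ORDER BY category"
    ).fetchall()
    conn.close()
    return result


def totals_by_hand(rows):
    """Do the same work in application code, after fetching every row."""
    totals = defaultdict(float)
    for category, amount in rows:
        totals[category] += amount
    return sorted(totals.items())
```
</pre>
<p>Both functions return the same totals for this data. The hand-written version is short here, but it needs every row in application memory first, and each additional grouping column, filter, or join has to be coded and tested by hand.</p>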
<h3>4. Complicated reports, and ad hoc queries, will require <i>a lot</i> more coding.</h3>
<p>In my experience, database use falls into three broad patterns: (1) standard queries and updates performed by your application&#8217;s users; (2) more complicated reports for users and internal staff; and (3) ad hoc queries for troubleshooting and system monitoring. SimpleDB may be optimized for category 1, but categories 2 and 3 will be much more difficult without SQL.</p>
<p>Complicated reports are probably the best application of the SQL language. Because SQL is a declarative language, it&#8217;s incredibly easy to generate aggregate information about your data. In my previous jobs, our reports often required hundreds of lines of SQL to get the right information out of the database. This is a lot of code, but it was required to generate the data for our customers. Without access to SQL, your programmers will need to implement reports through imperative statements, which will greatly increase the development time.</p>
<p>Ad hoc queries are even worse: they&#8217;re usually simpler, but they&#8217;re always changing. An RDBMS expert can often write an ad hoc SQL query as fast as the marketing department can explain what they need. Using an imperative programming language to write these queries would destroy your developers&#8217; productivity.</p>
<h3>5. Aggregate operations will be much slower if you don&#8217;t use an RDBMS.</h3>
<p>RDBMSes are highly optimized for performing aggregate operations across huge volumes of data. Fast algorithms like the <a href="http://en.wikipedia.org/wiki/Hash_join">hash join</a>, <a href="http://en.wikipedia.org/wiki/Sort-merge_join">merge join</a>, and indexed <a href="http://en.wikipedia.org/wiki/Binary_search_algorithm">binary search</a> have been around for 20 years or more. SimpleDB and the Google Datastore return datasets which are more like objects than traditional database rows. It&#8217;s unlikely that you&#8217;ll be able to process this data with anything other than <a href="http://en.wikipedia.org/wiki/Nested_loop_join">nested loops</a>, especially if your programmers aren&#8217;t database algorithm experts. Nested loop algorithms are considerably slower than the others.</p>
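<p>The difference between these algorithms is easy to see in a small sketch. The Python below joins two hypothetical record lists, <tt>USERS</tt> and <tt>ORDERS</tt>, first with nested loops and then with a simplified hash join. It illustrates the two techniques and is not how any particular database implements them.</p>
<pre>
```python
from collections import defaultdict

# Hypothetical records, shaped like the attribute maps a key-value store returns.
USERS = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
ORDERS = [
    {"user_id": 1, "item": "book"},
    {"user_id": 2, "item": "lamp"},
    {"user_id": 1, "item": "pen"},
]


def nested_loop_join(users, orders):
    """Compare every user with every order: len(users) * len(orders) steps."""
    pairs = []
    for user in users:
        for order in orders:
            if order["user_id"] == user["id"]:
                pairs.append((user["name"], order["item"]))
    return pairs


def hash_join(users, orders):
    """Index orders by user_id once, then probe: about len(users) + len(orders) steps."""
    orders_by_user = defaultdict(list)
    for order in orders:
        orders_by_user[order["user_id"]].append(order["item"])
    pairs = []
    for user in users:
        for item in orders_by_user[user["id"]]:
            pairs.append((user["name"], item))
    return pairs
```
</pre>
<p>With 2 users and 3 orders the difference is invisible, but with a million rows on each side the nested loop performs on the order of a trillion comparisons, while the hash join touches each row about once.</p>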
<p>Even if you&#8217;re the 31337est database expert and enjoy writing these operations in your business objects, there&#8217;s another performance factor to consider. In order for your application server to handle aggregate operations, you will need a copy of all the relevant data on the application server. Rather than downloading a single <tt>SUM</tt> function result from the database, your application server will need to download all the data required to calculate the sum. This extra data transfer will add considerable latency when you&#8217;re dealing with thousands or millions of records.</p>
<h3>6. Data import, export, and backup will be slow and difficult.</h3>
<p>Oracle, MySQL and other RDBMSes include advanced tools to perform large-scale data import and export operations. These tools have also been refined for 20 years or so, and can process millions of rows per minute. There are no such tools for key-value data stores, because these products are so new.</p>
<p>When you&#8217;re processing millions of records, network latency makes a big impact. Most of these services perform a remote procedure call for each record inserted; some even limit you to <i>querying</i> one record per remote call. On the Internet, round-trip latency is usually 20-40ms, which caps a single sequential client at roughly 1,500 to 3,000 rows per minute. (You can process more quickly via multi-threading, but again, that requires you to write a lot more infrastructure code.)</p>
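<p>The arithmetic behind the latency estimate is simple enough to check. The Python sketch below assumes a single client that issues one call at a time and waits a full round trip for each, and it ignores server-side processing time. The function names and the <tt>rows_per_call</tt> parameter are made up for the illustration.</p>
<pre>
```python
def rows_per_minute(round_trip_ms, rows_per_call=1):
    """Upper bound on sequential throughput when each call waits one round trip."""
    calls_per_minute = 60_000 / round_trip_ms
    return calls_per_minute * rows_per_call


def minutes_to_load(total_rows, round_trip_ms, rows_per_call=1):
    """Time to push total_rows through one connection, ignoring server-side work."""
    return total_rows / rows_per_minute(round_trip_ms, rows_per_call)
```
</pre>
<p>At a 30ms round trip this gives 2,000 rows per minute, so loading one million records one at a time would take about 500 minutes, or more than eight hours.</p>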
<h3>7. SimpleDB isn&#8217;t <i>that</i> fast.</h3>
<p>Todd Hoff&#8217;s article referenced a SimpleDB performance test which found that 10 record IDs could be retrieved in 141ms from a 1,000-record table; in 266ms from a 100,000-record table; and in 433ms from a 1,000,000-record table.</p>
<p>Compared to relational databases, this is pretty slow.</p>
<p>If you want your web application to be responsive, you need your database queries to operate much faster than this. 20ms responses would be more in line with conventional databases. If you perform 3 SimpleDB queries in series against the million-record table, your web app will take about 1.3 seconds for that operation, and users will notice when the app is that slow. Many web applications actually make dozens of queries per request.</p>
<p>Further, tables with a million records aren&#8217;t large enough to <i>need</i> significant scalability. A million-record table is probably small enough to fit entirely in RAM; surely its indexes could fit in RAM. The real test of SimpleDB scalability is its performance on a table with 100 million or 1 billion records.</p>
<h3>8. Relational databases <i>are</i> scalable, even with massive data sets.</h3>
<p>The world&#8217;s largest companies all use giant relational databases, and they&#8217;ve been able to make this work. The world&#8217;s largest websites use relational databases, and they&#8217;ve also been able to scale successfully. <a href="http://mysql.com/customers/customer.php?id=287">Facebook</a> and <a href="http://jeremy.zawodny.com/blog/archives/001866.html">LiveJournal</a> use MySQL; <a href="http://www.sqljunkies.com/WebLog/mrys/archive/2005/11/16/17408.aspx">MySpace</a> uses Microsoft SQL Server; <a href="http://www.oracle.com/corporate/press/2005_jul/salesforceonoraclegrid2.html">Salesforce.com</a> uses Oracle. When websites like Friendster have scalability issues, it&#8217;s not usually because of the RDBMS.</p>
<p>We all expect Oracle to scale if we pay them enough money, but even free databases have made significant advances to prevent the database server from becoming a bottleneck. The first line of defense is <a href="http://danga.com/memcached/">caching</a>: eliminating repetitive queries can offload a massive amount of processing. Beyond caching, there are <a href="http://dev.mysql.com/doc/refman/5.0/en/mysql-cluster.html">free clustering engines</a> which let you balance your database requests across a few servers in a cluster.</p>
<p>Without a complicated clustering setup, your data can usually be partitioned across multiple servers to eliminate the single-server bottleneck. Lest you think I&#8217;m ragging on Todd Hoff, he&#8217;s written a <a href="http://highscalability.com/unorthodox-approach-database-design-coming-shard">nice overview of sharding</a>, one way of designing a federated database to get around the bottleneck.</p>
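<p>One common way to partition data is to route each record by a hash of its key. The Python sketch below shows that idea only. It is not necessarily the scheme described in the linked article, and the shard names and the <tt>shard_for</tt> function are hypothetical.</p>
<pre>
```python
import hashlib

# Hypothetical shard names; a real deployment would map these to database hosts.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_for(user_id, shards=SHARDS):
    """Pick a shard from a stable hash of the key, so a key always maps to the same server."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]
```
</pre>
<p>Every query for a given user then goes to one server. The catch with plain modulo hashing is that changing the number of shards remaps most keys, so production systems often use a lookup table or consistent hashing instead.</p>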
<h3>9. Super-scalability is overrated. Slowing the pace of your product development is even worse.</h3>
<p>Time-to-market is a critical factor for most software products. If you&#8217;re writing internal software for a business, budgetary concerns are equally critical. You can work around most of the drawbacks I&#8217;ve identified above, but it will cost you time and money.</p>
<p>More importantly, all these technical workarounds distract you from addressing the real needs of your customers. If you don&#8217;t focus on <a href="http://fareed.wordpress.com/2007/04/03/mspw-make-something-people-want/">making something people want</a>, it doesn&#8217;t matter how scalable your database is, because you won&#8217;t have any customers to fill up the database.</p>
<p>The hype around the new data stores seems to be a case of premature optimization, yet we all know Donald Knuth&#8217;s famous quote, &#8220;Premature optimization is the root of all evil.&#8221; Why not wait and address super-scalability once you&#8217;ve created a super product and have generated super cash flow?</p>
<h3>10. SimpleDB <i>is</i> useful, but only in certain contexts.</h3>
<p>Everyone&#8217;s assuming that SimpleDB was designed to be a general-purpose replacement for OLTP database servers. I don&#8217;t think it was ever intended for that purpose. SimpleDB&#8217;s architecture is similar to <a href="http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html">Dynamo</a>, Amazon&#8217;s internal &#8220;highly-available key-value store.&#8221; One of its main distinguishing features is the flexible schema: the ability to add custom fields to individual records, and to store multiple values in each field.</p>
<p>If you&#8217;re working with &#8220;semi-structured&#8221; data, then this is actually incredibly useful. For example, it&#8217;s an awesome way to persist web application sessions. You can avoid the overhead of marshaling the object-oriented session data into columns and rows, and many of the drawbacks above don&#8217;t apply because you don&#8217;t generally query sessions like you query more typical relational data.</p>
<p>Amazon SimpleDB, Apache CouchDB, and the Google Datastore API aren&#8217;t bad products. But we do them a disservice when we construe them to be replacements for general-purpose databases. Used carefully, they can help your organization. But used indiscriminately, you&#8217;ll create a lot more work for your programmers and you&#8217;ll make your application perform even worse.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2008/04/top-10-avoid-the-simpledb-hype.html/feed</wfw:commentRss>
		<slash:comments>65</slash:comments>
		</item>
		<item>
		<title>My calendar</title>
		<link>http://www.ryanpark.org/2008/04/my-calendar.html</link>
		<comments>http://www.ryanpark.org/2008/04/my-calendar.html#comments</comments>
		<pubDate>Wed, 16 Apr 2008 23:55:34 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/?p=387</guid>
		<description><![CDATA[I&#8217;ve started to publish my calendar on my website so that everyone can see when I&#8217;m busy or free. It may not be terribly useful, but it was pretty easy to do, and a fun little programming exercise. Read on to learn how I did it. I use Apple&#8217;s iCal to manage my schedule, and [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve started to publish <a href="http://www.ryanpark.org/calendar">my calendar</a> on my website so that everyone can see when I&#8217;m busy or free. It may not be terribly useful, but it was pretty easy to do, and a fun little programming exercise. Read on to learn how I did it.<br />
<span id="more-387"></span><br />
I use Apple&#8217;s <a href="http://www.apple.com/ical/">iCal</a> to manage my schedule, and iCal can automatically publish updates to an iCalendar data file hosted on a WebDAV server (which is basically a file server over the Internet). iCal even scrubs all of the private information &#8212; titles, locations and notes &#8212; so I don&#8217;t have to worry about any of that being stolen by hackers. I already had WebDAV running, so it was easy to get the calendar data up to my server.</p>
<p>With that out of the way, I just needed a way to display the calendar on my website. I considered writing my own iCalendar display engine, but then remembered that <a href="http://phpicalendar.net/">PHP iCalendar</a> would do this for me. PHP iCalendar gives me a lot of flexibility with the display, so I was able to remove all of the material except for the calendar itself.</p>
<p>I&#8217;ve embedded it into my website using an iframe. This isn&#8217;t the most technologically advanced approach, but it was a lot easier than doing anything fancier. Even if I did use a more advanced approach to integrate the calendar into my website template, it wouldn&#8217;t improve the user experience at all. So, no need to bother.</p>
<p>What do you think of <a href="http://www.ryanpark.org/calendar">the calendar</a>? Make sure to send me those meeting requests and party invitations!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2008/04/my-calendar.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Google Goes Globe-Trotting</title>
		<link>http://www.ryanpark.org/2007/11/google-goes-globe-trotting.html</link>
		<comments>http://www.ryanpark.org/2007/11/google-goes-globe-trotting.html#comments</comments>
		<pubDate>Mon, 05 Nov 2007 07:24:13 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Links]]></category>
		<category><![CDATA[Professional]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/2007/11/google-goes-globe-trotting.html</guid>
		<description><![CDATA[Here&#8217;s a Newsweek profile on Google&#8217;s Associate Product Manager program.]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a Newsweek profile on Google&#8217;s Associate Product Manager program.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2007/11/google-goes-globe-trotting.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Ars Technica reviews Mac OS X Leopard</title>
		<link>http://www.ryanpark.org/2007/10/ars-technica-reviews-mac-os-x-leopard.html</link>
		<comments>http://www.ryanpark.org/2007/10/ars-technica-reviews-mac-os-x-leopard.html#comments</comments>
		<pubDate>Mon, 29 Oct 2007 10:08:50 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Links]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/2007/10/ars-technica-reviews-mac-os-x-leopard.html</guid>
		<description><![CDATA[Yet another exceptionally well-written review from John Siracusa at Ars Technica.]]></description>
			<content:encoded><![CDATA[<p>Yet another exceptionally well-written review from John Siracusa at <a href="http://www.arstechnica.com">Ars Technica</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2007/10/ars-technica-reviews-mac-os-x-leopard.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Happy Leopard Day!</title>
		<link>http://www.ryanpark.org/2007/10/happy-leopard-day.html</link>
		<comments>http://www.ryanpark.org/2007/10/happy-leopard-day.html#comments</comments>
		<pubDate>Fri, 26 Oct 2007 22:54:49 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Links]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/2007/10/happy-leopard-day.html</guid>
		<description><![CDATA[Mac OS X 10.5 &#8220;Leopard&#8221; is being released today at 6:00 P.M.  My copy arrived via FedEx this morning.  If you&#8217;re a Mac user, have you upgraded yet?]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.apple.com/macosx/"><img src="http://www.ryanpark.org/images/leopard.jpg" class="left" alt="Leopard" /></a> Mac OS X 10.5 &#8220;Leopard&#8221; is being released today at 6:00 P.M.  My copy arrived via FedEx this morning.  If you&#8217;re a Mac user, have you upgraded yet?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2007/10/happy-leopard-day.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>WordCamp</title>
		<link>http://www.ryanpark.org/2007/07/wordcamp.html</link>
		<comments>http://www.ryanpark.org/2007/07/wordcamp.html#comments</comments>
		<pubDate>Sat, 14 Jul 2007 01:07:08 +0000</pubDate>
		<dc:creator>Ryan</dc:creator>
				<category><![CDATA[Personal]]></category>
		<category><![CDATA[Professional]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://www.ryanpark.org/2007/07/wordcamp.html</guid>
		<description><![CDATA[Yesterday I registered for WordCamp 2007, a conference dedicated to the software that runs this blog. It sounds pretty interesting: the first day is going to feature presentations about blogging with WordPress, and the second day focuses on WordPress development. It&#8217;s coming up next weekend here in San Francisco. If you see me there, say [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://wordcamp.org/"><img class="left frame" src="http://2007.wordcamp.org/attendee.gif" border="0" alt="I'm going to WordCamp" /></a></p>
<p>Yesterday I registered for WordCamp 2007, a conference dedicated to the software that runs this blog.  It sounds pretty interesting: the first day is going to feature presentations about blogging with WordPress, and the second day focuses on WordPress development. It&#8217;s coming up next weekend here in San Francisco. If you see me there, say hi!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ryanpark.org/2007/07/wordcamp.html/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
