How does kdb compare to cloud computing?

Cloud-based solutions (such as Hadoop) are great if

  • you have developers who know how to program them
  • you don’t mind putting your data on a cloud
  • the data you need for even a single query can’t fit in memory
  • you can afford to wait minutes for your query results

For some applications, then, clouds are an excellent approach. kdb+, however, is typically used in scenarios where

  • you can’t (e.g., due to licensing) or don’t want to move your data outside your firm
  • you don’t have an in-house cloud infrastructure
  • you don’t want each query to be an IT project in and of itself
  • you want the answer to your query in milliseconds (or seconds at worst)
  • the data each query needs fits in memory (the full dataset need not fit all at once)

2 Comments

  1. May 9, 2011 / 10:33 pm

    Can anything that kdb can do be done faster by cloud + kdb, assuming the algorithm can be parallelized? For example, using peach?

  2. kdbfaq
    May 12, 2011 / 11:24 pm

    rohan,

    peach parallelizes work within a q process, whereas cloud frameworks parallelize work across machines. I'm not aware of any cloud framework that is easy to use with q.

    Even if such a cloud were available, though, many things that a lone kdb process can do would be slower in a cloud. It takes a long time for a cloud-based map-reduce job to start up: about 2 minutes in my (admittedly limited) experience, although I've heard of cases as fast as 30 seconds. A single q process can do a lot in 30 seconds.
