Entries in Gary Dusbabek (3)

Thursday, Nov 04, 2010

New Features in Cassandra 0.7

Gary Dusbabek of Rackspace has a post discussing the upcoming features in the 0.7 release of Cassandra. There are some major changes and improvements in store for 0.7.

Here are the big ones:

  • Memory Efficient Compactions (CASSANDRA-16) - Compactions will now happen incrementally
  • Online Schema Changes (CASSANDRA-44)
    • Cassandra now exposes the ability to create and drop column families and keyspaces through its client API.
    • Using the same methods there is limited support for updating existing column families and keyspaces (e.g., increasing the replication factor for a particular keyspace)
  • Secondary Indexes (CASSANDRA-749)
    • Secondary indexes are now maintained by Cassandra, so there is no more need for hand-built secondary indexes.
    • Using the API for online schema changes, secondary indexes can be added to existing or brand-new column families.
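
To make the "hand-built secondary index" point concrete, here is a minimal sketch of the bookkeeping an application had to do itself before indexes were maintained by the server. It is plain Python with dictionaries standing in for column families, not the Cassandra client API; the names `users`, `users_by_state`, `put_user`, and `users_in_state` are hypothetical and chosen only for illustration.

```python
# Plain-Python model, NOT the Cassandra client API (all names are hypothetical).
# A column family is modeled as {row_key: {column_name: value}}.
users = {}           # primary column family, keyed by user id
users_by_state = {}  # hand-built index: {state: {user_id: ""}}


def put_user(key, columns):
    """Write a user row and keep the hand-built index in sync."""
    old_state = users.get(key, {}).get("state")
    new_state = columns.get("state", old_state)
    if old_state is not None and old_state != new_state:
        # The indexed value changed, so the stale index entry must be removed.
        users_by_state[old_state].pop(key, None)
    users.setdefault(key, {}).update(columns)
    if new_state is not None:
        users_by_state.setdefault(new_state, {})[key] = ""


def users_in_state(state):
    """Answer 'which users live in this state?' from the index row."""
    return sorted(users_by_state.get(state, {}))


put_user("alice", {"name": "Alice", "state": "TX"})
put_user("bob", {"name": "Bob", "state": "TX"})
put_user("alice", {"state": "UT"})  # alice moves; the index must follow
```

Every write path has to remember to update both structures, which is exactly the kind of work a server-maintained index takes off the application.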

In addition to the major changes, there are several improvements to the streaming code, improved read performance, and support for Thrift 0.5, among many other items. As with every release, things can change, so be sure to read the changelog for the most up-to-date information about the release.

Read the full post: New Features in Cassandra 0.7

Thursday, Sep 30, 2010

Adapting your data model for Cassandra

This presentation is the first of several we will cover in upcoming posts from the ICOODB 2010 conference. In this particular presentation, Gary Dusbabek from Rackspace discusses how to adapt your data model for Cassandra.

The presentation contains probably one of the best descriptions and visualizations of Cassandra's data model that I've ever seen. In addition to the refresher course on Cassandra's data model, we are reminded of Cassandra's limitations: no transactions, no ad hoc queries, and no joins, for example.

But how do you go about designing your data model for Cassandra? Dusbabek first describes the traditional workflow for relational data models with the following steps:

  1. Define entities
  2. Normalize
  3. Identify Many-to-many
  4. Query any way you want

How does this compare to working with Cassandra?

  1. Know your application
  2. Design your queries first
  3. Denormalize
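
As a rough illustration of "queries first, then denormalize", the sketch below starts from two queries a hypothetical blog application needs ("latest posts by an author" and "latest posts for a tag") and writes each post into one row per query. It uses plain Python dictionaries to stand in for column families and is not taken from the presentation or from any Cassandra client library; all names are assumptions made for the example.

```python
# Plain-Python model of query-first, denormalized storage (hypothetical names).
# Each "column family" is {row_key: {column_name: value}}, one per query.
posts_by_author = {}  # query 1: latest posts by a given author
posts_by_tag = {}     # query 2: latest posts carrying a given tag


def add_post(author, timestamp, title, tags):
    """Write the same post into every row that a query will read from."""
    column = (timestamp, author)  # column name sorts by time, then author
    posts_by_author.setdefault(author, {})[column] = title
    for tag in tags:
        posts_by_tag.setdefault(tag, {})[column] = title


def latest(row, limit=10):
    """Return the newest titles from a single row, newest first."""
    return [row[col] for col in sorted(row, reverse=True)[:limit]]


add_post("gary", 1, "Intro to Cassandra", ["cassandra", "nosql"])
add_post("gary", 2, "Adapting your data model", ["cassandra"])
add_post("eve", 3, "Scaling out", ["nosql"])

by_gary = latest(posts_by_author.get("gary", {}))
by_nosql = latest(posts_by_tag.get("nosql", {}))
```

Each read touches exactly one row and needs no join; the cost is that every post is stored several times and the write path must keep the copies consistent.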

Gary summarizes the presentation with the following points:

  • The goal is to scale
  • ColumnFamilies != Relational tables
  • Trade-offs: you win some, you lose some
  • Know your application
  • Queries first
  • Denormalization is OK
  • Cassandra was built for this

Tuesday, Jul 13, 2010

Introduction to Apache Cassandra

In case you missed it, Cassandra has been in the news over the last couple of days, so I thought this would be a good opportunity to provide an introduction to Cassandra via Gary Dusbabek from Rackspace. This presentation was given at the Silicon Valley Cloud Computing Group back in June of this year.

A couple of key points about Cassandra (not from the presentation):

  • Initially created by Facebook to power search over users' inbox messages on the site.
  • The source code was open sourced and released to the Apache Software Foundation.
  • Its design was inspired by both Google's BigTable and Amazon's Dynamo.
  • It's considered to be a column-oriented data store, similar to Google's BigTable or Apache HBase.

So why Cassandra at all? As Dusbabek puts it in his presentation, "vertical scaling is hard". As the amount of data we create and analyze increases, our strategies for dealing with that data have to change. Dusbabek walks us through a number of topics in his discussion, including scaling, the replication model, the data model, and practical considerations.

So without any further interruptions...