Search
Follow Us

Follow nosqldatabases on Twitter Follow nosqldatabases on Facebook Follow nosqldatabases on Google Buzz Follow nosqldatabases on LinkedIn Follow nosqldatabases on FeedBurner NoSQL presentations on slideshare

Sponsors

Become a sponsor of NoSQLDatabases.com. Contact us to find out how.

Featured Jobs

 

Follow On Facebook
Recent NoSQL News

Advertisments

Entries in Indexing (4)

Wednesday
Nov102010

Indexing Matters: A MongoDB Optimization Primer

In a recent O'Reilly webcast Kyle Banker, of 10Gen, discussed indexing in MongoDB. In the webcast, Banker discusses how indexes can be designed to optimize the performance of MongoDB. In addition, Banker discusses tips for diagnosing and fixing performance issues should they arise. We have the video and slides to the webcast.

Click to read more ...

Friday
Aug202010

Managing Indexes in Cassandra using Async Triggers

So yesterday we spoke about an extension to Cassandra that provides asynchronous triggers. Now we will see a use case in action, managing a secondary index with triggers. In this post, Maxim Grinev and Martin Hentschel are at it again, they describe the use case here:

Cassandra does not support secondary indexes at first, but storing redundant data (in a different layout) will give you the same effect. The main drawback is that your application (the code that writes to the DB) needs to take care of managing the index. Every time you write to the DB, you also need to maintain your index.

So by using the asynchronous triggers you can maintain the secondary index without the performance impact.

Check it out: Managing Indexes in Cassandra using Async Triggers

Follow Up: Cassandra is planning native support for secondary indexes in Cassandra. Here is the JIRA (CASSANDRA-749) full of the discussion among the committers.

Friday
Jul232010

Indexing documents in MongoDB

Whether you are working with a traditional relational database like MySQL or a NoSQL database like MongoDB, indexing plays a big part performing efficient lookups. Alberto Lerner, from 10gen, discusses how MongoDB indexes documents for fast retrieval.

MongoDB Indexing Basics:

  • MongoDB can use separate tree structures to index a collection
  • MongoDB is going to try an use an index to avoid going through a collection

Our job as developers is to provide MongoDB with indices on important queries. What makes an important query? Frequently used queries or queries where response time is critical. MongoDB provides us with an automatic index on _id. Others can be created with 'ensureIndex'.

When searching in MongoDB the search criteria is either the index key or the prefix of an index key. Searching can yield both exact matches or several results. There are a couple of special cases with indexing. For instance, null fields. In the case of null fields a value of null is stored in the index. In the case of fields that contain arrays, there will be one entry per element in that array.

Whenever, MongoDB evaluates search criteria only one index will be used. What if you have several indices, how does MongoDB decide? It uses previous executions, so it's a performance based selection process.

In the end, almost everything we do related to building applications revolves around trade-offs, and MongoDB is no different. Lerner reminds us that there is a trade-off that exists between search criteria performance and the insertion/updating/deletion of keys. So Lerner suggests as a concluding point to be "mindful of number of indexes and choice of keys".

Wednesday
Jun232010

Cassandra Basics: Indexing

This presentation, by Benjamin Black, discusses indexing with Cassandra. Specifically, Black starts out discussing the differences with indexes and query creation between traditional relational data stores and column stores.

Relational stores are schema driven. Start with your schema and work forward

Column stores are query driven. Start from your queries and work backwards

In addition to these differences he discusses how things change when you are "at scale".

Denormalization is the norm

Everything depends on the indices

Black, proceeds to discuss two types of indices, super column and composite key, and provides some very good examples.

From a deployment perspective, there are two common strategies, that I know of, to deal with your indices. First and most simple is simply deploying them along with your application data in the same keyspace. A second strategy is to deploy a separate physical cluster of machines that handle the indices. Complicates the deployment and the code, but I'm sure there are some good reasons to do so.