
NoSQL Databases - An Overview

In his presentation, Marin Dimitrov provides an introduction to both NoSQL as a concept and various examples of NoSQL data stores. I find it worthwhile to revisit introductory presentations like this one because they reinforce the basics of the technology. Plus, there are often one or two things you overlooked the first time.

A couple of points that Marin makes about NoSQL as a concept:

NoSQL use cases:

  • Massive Data Volumes
  • Extreme Query Volume
  • Schema Evolution

Advantages/Disadvantages to NoSQL:

  • Advantages
    • Massive Scalability
    • High Availability
    • Lower Cost (than competitive solutions at that scale)
    • (Usually) Predictable elasticity
    • Schema flexibility, sparse and semi-structured data
  • Disadvantages
    • Limited query capabilities (so far)
    • Eventual consistency is not intuitive to program for (clients become more complicated)
    • No standardization  (portability is an issue)
    • Insufficient access control

Marin covers a wide variety of topics in this presentation, such as consistency models, architecture, partitioning, operations, data model and workflow for various NoSQL data stores. What data stores does Marin discuss? Quite a few, actually. Each data store gets about 3-5 slides, and there are some nice comparison slides for data stores in the same "family".

Here is the list of data stores discussed: PNUTS, Dynamo, Voldemort, BigTable, HBase, Cassandra and CouchDB. Obvious omissions include MongoDB and graph data stores altogether, although Marin notes in his presentation that graph databases are "Not exactly NoSQL... can't satisfy the requirements for high availability and scalability/elasticity very well". Do any of the graph data store folks care to dispute this point?

Definitely worth a read.


Links of the Day - 2010/07/09

Links of the day for July 09, 2010


Learning to Relax - CouchDB for beginners

In his presentation, Alan Hoffman, Co-Founder of Cloudant (hosted CouchDB), gives us an introduction to CouchDB. So what is CouchDB?

  • Apache Project
  • Written in Erlang
  • Schema-free document database management system (like MongoDB)
  • Robust, Concurrent and Fault-Tolerant
  • Custom Persistent Views using MapReduce
  • Bi-directional Incremental Replication
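
To make the "views using MapReduce" bullet a little more concrete, here is a minimal sketch of the idea in plain Python. It is only an illustration: real CouchDB views are normally written as JavaScript functions stored in design documents and are updated incrementally, and the documents, field names and helper functions below are invented for this example.

```python
from collections import defaultdict

# Hypothetical documents; the field names are invented for illustration.
docs = [
    {"_id": "a", "type": "post", "author": "alice"},
    {"_id": "b", "type": "post", "author": "bob"},
    {"_id": "c", "type": "comment", "author": "alice"},
    {"_id": "d", "type": "post", "author": "alice"},
]


def map_posts_by_author(doc):
    """Map step: emit (key, value) pairs for the documents we care about."""
    if doc.get("type") == "post":
        yield doc["author"], 1


def reduce_count(values):
    """Reduce step: combine all values emitted for a single key."""
    return sum(values)


def build_view(documents, map_fn, reduce_fn):
    """Run map over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    # Sort by key so the result is ordered like a view index.
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


posts_per_author = build_view(docs, map_posts_by_author, reduce_count)
```

Because the query is fixed when the view is defined rather than when it is asked, this style trades ad-hoc flexibility for precomputed results.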

So a couple of tidbits about CouchDB that I found interesting.

  • Each document has a revision and under certain replication environments conflicts can occur
  • Documents can have binary attachments
  • Unlike some RDBMSs, there is nothing to repair if the server crashes; just restart the server and you are back online
  • Replication can be performed with one click
  • Multiple replication setups are available: Master-Slave, Master-Master and Robust Multi-Master
  • Does not support ad-hoc queries; this is by design
  • 1.0 is just around the corner
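
The point about document revisions can be sketched with a toy in-memory store. This is a simplified model and not CouchDB's implementation: `TinyDocStore` and `ConflictError` are hypothetical names, and revisions here are plain integers. It shows the single-server case, where a write based on a stale revision is rejected. Conflicts that arise during replication are handled differently, since both versions have already been accepted somewhere.

```python
class ConflictError(Exception):
    """Raised when a write is based on a stale revision."""


class TinyDocStore:
    """Toy document store that tracks one revision number per document."""

    def __init__(self):
        self._docs = {}  # doc_id -> (revision number, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        return rev, dict(body)  # return a copy so callers cannot mutate the store

    def put(self, doc_id, body, rev=None):
        current = self._docs.get(doc_id)
        current_rev = current[0] if current else None
        # The caller must name the revision it read; otherwise the write is stale.
        if rev != current_rev:
            raise ConflictError(f"expected rev {current_rev}, got {rev}")
        new_rev = (current_rev or 0) + 1
        self._docs[doc_id] = (new_rev, dict(body))
        return new_rev


store = TinyDocStore()
first_rev = store.put("user:1", {"name": "Ann"})  # new document, no rev needed

# Two clients read the same revision of the document.
rev_a, doc_a = store.get("user:1")
rev_b, doc_b = store.get("user:1")

doc_a["name"] = "Anne"
store.put("user:1", doc_a, rev=rev_a)  # first writer wins

doc_b["name"] = "Anna"
try:
    store.put("user:1", doc_b, rev=rev_b)  # second writer holds a stale rev
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```

The second client has to re-read the document and decide how to merge its change, which is exactly the extra client-side work that revisions make explicit.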

CouchDB Lounge is an open source proxy-based partitioning and clustering solution. It uses a combination of a smart and a dumb proxy to add partitioning and clustering to CouchDB. Another option, which is being developed by Cloudant, is Open Cloudant, which essentially tries to duplicate ring clustering like that found in Amazon's Dynamo. According to the slides, this solution is "Coming soon to github near you", so keep a lookout.
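
For readers who have not seen Dynamo-style ring clustering, the following is a rough sketch of the underlying idea, consistent hashing with virtual nodes. It is a generic illustration and says nothing about how Cloudant or CouchDB Lounge actually implement partitioning. The `HashRing` class, the node names and the choice of SHA-256 are all assumptions made for the example.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Map a string to a point on the ring (hash choice is arbitrary here)."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node name)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node owns several points to spread load more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, name) for pos, name in self._ring if name != node]

    def node_for(self, key):
        # Walk clockwise to the first point at or after the key's position,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._ring, (_position(key),))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"doc-{i}" for i in range(200)]

before = {key: ring.node_for(key) for key in keys}
ring.remove("node-c")
after = {key: ring.node_for(key) for key in keys}

# Only keys that lived on the removed node should have moved.
moved = [key for key in keys if before[key] != after[key]]
```

The appeal of the ring is visible in `moved`: when a node leaves, only the documents it owned are reassigned, while everything else stays where it was.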


Links of the Day - 2010/07/08

Links of the day for July 08, 2010


From MySQL to MongoDB at

Brendan W. McAdams from Sluggy Freelance and Evil Monkey Labs has an excellent presentation about their migration from MySQL to MongoDB. started using MySQL along with PHP for their publishing platform in 2002. Migrated to MongoDB (and MongoKit) in August of 2009.

Key points about the migration:

  • Completely eliminated the need for physical hardware; as a result, migrated to virtual hosting.
  • Average system load is 0.05 on a 2GB slice
  • MongoDB uses 1% CPU on average
  • Switchover took 2 minutes (ran data conversion script, deployed new code tag, bounced webserver / pylons app)
  • No downtime in any way attributable to MongoDB since go live (Can’t say the same for MySQL)

Lessons learned while using MongoDB:

  • MongoDB’s MMAP system gives you a “free” MRU cache. Done right and simple; caching on MongoDB is durable, light and fast.
  • Flexible schemas are good.
  • Wasting your time mapping data back and forth between your presentation layer & RDBMS is not just tedious - it’s error prone.
  • The more you can put in memory, the less you beat on your disks. Especially important on virtual hosting: Be a Good Neighbor
  • MongoDB is very good at automatically memory caching frequently used data, reducing the amount of code you need to write 

Reasons for using MongoDB:

  • Dynamic Querying
  • Flexibility
  • CouchDB’s approach appeared obtuse and rather unPythonic
  • Tools like MongoKit allowed for easy replacement of existing MySQL ORM code with something almost identical
  • FAST
  • Great Support & Community Available
  • Easy access - talking to developers NOT support staff. One official development company behind MongoDB.


Links of the Day - 2010/07/07

Links of the day for July 07, 2010


Scalable Event Analytics with Ruby on Rails and MongoDB

In his presentation at Ruby Conf China 2010, Jared Rosoff of Yottaa discusses using MongoDB and Ruby on Rails for Yottaa's event analytics platform. What I particularly like about this presentation is how Rosoff walks us through the decisions and reasoning that led him to choose MongoDB. I'm not here to say MongoDB was right or wrong, but I am grateful for the experimenting and work that went into the decision.

A couple of notes about working with MongoDB:

  • MongoMapper makes it look like ActiveRecord
  • Documents are more natural than rows in many cases
  • Map-Reduce rocks (but needs better support in rails)


Links of the Day - 2010/07/06

Links of the day for July 06, 2010


Need a graph database? Look no farther than Neo4j 

Robert Scoble and Emil Eifrem discuss Neo4j, graph databases and BBQ restaurants.

A couple of notes taken from the interview:

  • There are a lot of interesting integrations of social and geographic graphs that have yet to be explored.
  • Neo4j is more general purpose and supports infinite depth. This is in contrast to a specialized solution like Twitter's FlockDB, which is geared towards Twitter's unique needs.
  • Graph databases excel whenever the value is in the connections.
  • How scalable is Neo4j?
    • It's really challenging to scale horizontally when dealing with a graph database.
    • Neo4j scales to 12 billion nodes and relationships.
    • Currently relies on replication of the entire graph across machines.
  • In Neo4j 2.0 they are implementing transparent partitioning. Transparent partitioning essentially attempts to place each isolated cluster of a graph on its own machine. It's a difficult problem to solve simply because two isolated clusters can become connected with just one link.
  • Neo4j uses the AGPL license. Essentially, if your service is open source, Neo4j is open source. If your service is going to be closed source, you need to purchase a commercial license.
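
The partitioning difficulty mentioned above can be shown with a small sketch. It finds the isolated clusters (connected components) of a tiny graph using union-find, then adds a single relationship and finds them again. This illustrates the problem only and is not how Neo4j approaches it. The node names and the `connected_components` helper are made up for the example.

```python
def connected_components(nodes, edges):
    """Group nodes into isolated clusters using a simple union-find."""
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the cluster's root, shortening the path as we go.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b in edges:
        parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for node in nodes:
        groups.setdefault(find(node), set()).add(node)
    # Sort for a stable, readable result.
    return sorted(groups.values(), key=lambda group: sorted(group))


nodes = ["ann", "bob", "cat", "dan", "eve"]
edges = [("ann", "bob"), ("bob", "cat"), ("dan", "eve")]

# Two isolated clusters that could each live on their own machine.
clusters_before = connected_components(nodes, edges)

# One new relationship is enough to join them into a single cluster.
clusters_after = connected_components(nodes, edges + [("cat", "dan")])
```

Any placement chosen from `clusters_before` is invalidated by that one extra edge, which is why keeping clusters on separate machines is hard to do transparently.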


Links of the Day - 2010/07/05

Links of the Day for July 5th, 2010


News and notes - 2010/07/04

I'm very excited about this week's updates. In my opinion, we've added a lot of cool things.

  • It's finally here, user registration. Now you can register with NoSQLDatabases to make it easier to receive email updates, comment on stories and to be a part of our community. To register simply click the Register link in the right hand column, or go here.
  • We finally have a job board dedicated to NoSQL positions.
    • Looking for work? Visit the Jobs link to see all positions available.
    • Looking for help? To post a job, visit the Jobs link and click on the "Post a Job" button. From now until the end of the month we are offering 50% off on all job postings. When you check out, use the discount code: NEWJOB
  • NoSQLDatabases now has a Google Buzz Profile and a Facebook Page; check them both out.

Polyglot Persistence, is it the future of application persistence?

In yesterday's post, John Nunemaker discussed the future of persistence as it relates to NoSQL. His thoughts were that application persistence would be hosted and would employ polyglot persistence. In today's post we are going to explore that last piece, polyglot persistence.

In his presentation at WindyCityDB, John P. Wood discusses polyglot persistence: what it is and how it helps.

Key points from the presentation:

  • RDBMS is no longer the default choice, but it's not dead either
  • We now have several choices in the NoSQL arena. Having choices is great. However, it means we must do the work to validate our tool of choice as the right one for the job. 

So what exactly is polyglot persistence?

The continued or prolonged existence of something using several databases.

Scott Leberknight is quoted in the presentation to reinforce this point:

Polyglot Persistence, like polyglot programming, is all about choosing the right persistence option for the task at hand.

One could assume this would be like using Grails for the UI portion of a web application and perhaps Java for other backend processes. So what does this mean? It means that the default mode of large organizations will be to support multiple data stores.

Some organizations like Facebook and Twitter are already doing this. Specifically, in both cases you see these organizations using MySQL, Cassandra and HBase for various aspects of their applications.

John Wood beautifully summarizes why we do this:

Right tool for the job


Links of the Day - 2010/07/02

Links of the Day for July 2nd, 2010


Why is NoSQL so popular?

John Nunemaker, from Ordered List, gave the following presentation at WindyCityDB in Chicago a couple of days ago. After reminding us of the history of databases, he dives into NoSQL. A couple of highlights from that section:

  • NoSQL technologies are development and operations friendly
  • Moving from "How do we store?" to "How do we use?"

What about the future? Well John sees two things:

  • Hosted
  • Polyglot Persistence

Unfortunately, we don't get to see John detail his thoughts on polyglot persistence. However, the idea is simple: applications will no longer use a single persistence layer. In my opinion, this is what should be done, right? Use the right tool for the job. It will be interesting to see what issues this strategy poses as more applications adopt it. We will explore polyglot persistence in more depth tomorrow.

Update: Here is the actual presentation on Vimeo (thanks to Ryan Briones for the link).


Links of the Day - 2010/07/01

Links of the Day for July 1st, 2010

  • Notes from a production MongoDB deployment - David Mytton describes his experiences with MongoDB in a production environment. He discusses namespace limits, replication, durability and support from 10gen. David also says whether he would choose MongoDB again.
  • Reflections on MongoDB - Brandon Keepers of Collective Idea presents some thoughts about their usage of MongoDB. My favorite quote "MongoDB is not all leprechauns and unicorns. It’s the bleeding edge, and you will bleed."
  • Sones releases the first open source edition of their GraphDB
  • Migrating to CouchDB with a Focus on Views - John P. Wood provides a case study of his attempt to migrate to CouchDB. John puts it best "We faced several challenges migrating to CouchDB and learned some important lessons along the way. All of those challenges will be addressed here." We will be coming back to this post, in the future, to discuss it in more detail.