The first set of videos from MongoSV is now available online. The first presentation I'll be breaking down is the Shutterfly case study, given by Kenny Gorman, a data architect at Shutterfly. It focuses on Shutterfly's use of MongoDB for photo metadata: the existing infrastructure, what MongoDB is used for, and how they migrated from Oracle to MongoDB.
- Current metadata was persisted in an Oracle RDBMS (20TB of data)
- Existing infrastructure was beginning to show its age, slowing down development. In addition, it was teetering on the edge of needing additional spending on licensing and specialized hardware to handle increased load.
- Requirements for new infrastructure called for commodity hardware, open source software, horizontal scaling, data locality and low cost
- Several NoSQL databases were compared; MongoDB obviously won
- Key features that MongoDB provided: BSON/JSON data formats, replica sets, sharding, commercial support, good community, etc.
- Project would need to be phased, using dual writes and adding more features over time
- Phased migration of data: XML without MongoDB, then MongoDB with XML, then MongoDB/BSON without XML
- Hardware configuration is just a standard Intel box (CentOS, 48GB RAM, 3TB RAID 10, Dual Quad Core)
- Four servers per replica set, backups are done from slaves
- Data migration was planned in two phases. The first phase was to treat MongoDB as a cache: upon a cache miss, migrate the data and populate MongoDB. The second phase would move the remaining data to MongoDB
- Results: 500% improvement in cost, 900% improvement in performance, latency from 400ms to 2ms
- Lessons learned: protect your writers, good developers make a world of difference, data modeling is still challenging
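
The dual-write and cache-miss migration ideas from the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Shutterfly's implementation: plain dictionaries stand in for the Oracle store and the MongoDB collection, and the class name `LazyMigratingStore` and its methods are hypothetical names invented for this example.

```python
class LazyMigratingStore:
    """Hypothetical sketch: the new store acts as a cache in front of the legacy one."""

    def __init__(self, legacy, new):
        self.legacy = legacy          # dict standing in for the Oracle metadata store
        self.new = new                # dict standing in for a MongoDB collection
        self.migrated_on_read = 0     # counts records moved by cache misses

    def get(self, photo_id):
        # Phase one read path: try the new store first.
        doc = self.new.get(photo_id)
        if doc is not None:
            return doc
        # Cache miss: fall back to the legacy store.
        doc = self.legacy.get(photo_id)
        if doc is None:
            return None
        # Migrate the record so the next read is served by the new store.
        self.new[photo_id] = dict(doc)
        self.migrated_on_read += 1
        return self.new[photo_id]

    def put(self, photo_id, doc):
        # Dual write: keep both stores current while the migration is in flight.
        self.legacy[photo_id] = dict(doc)
        self.new[photo_id] = dict(doc)

    def backfill(self):
        # Phase two: move whatever was never read and so never migrated.
        moved = 0
        for photo_id, doc in self.legacy.items():
            if photo_id not in self.new:
                self.new[photo_id] = dict(doc)
                moved += 1
        return moved


legacy = {"p1": {"title": "beach"}, "p2": {"title": "cake"}}
store = LazyMigratingStore(legacy, {})
store.get("p1")                      # miss in the new store, migrated on read
store.put("p3", {"title": "dog"})    # written to both stores
remaining = store.backfill()         # only "p2" is left to move
```

The appeal of this shape is that frequently read records migrate themselves under normal traffic, so the bulk backfill in the second phase only has to handle the cold remainder.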
Kenny has posted the slides to his blog; I've embedded them here: