Jeremy Geelan

Cloudant Merges BigCouch into CouchDB

DBaaS pioneer contributes database scalability and fault-tolerance framework to Apache CouchDB

"There are a lot of reasons people love CouchDB," said Adam Kocoloski, co-founder and CTO at Cloudant, "like its elegant programming model, data durability, flexible indexing, and, most of all, its unique way of replicating and synching data across data centers or devices." Kocoloski was speaking last week as he announced that Cloudant had delivered on its promise to integrate core capabilities of its distributed database service into the open source Apache CouchDB project.

CouchDB serves as the foundation of Cloudant's technology stack in the form of BigCouch, an open source variant of CouchDB that the company developed to support large-scale, globally distributed applications. After four years of operating BigCouch in production, Cloudant has merged the BigCouch code into the CouchDB codebase, making it possible to manage and replicate data with CouchDB at much larger scale.


Kocoloski continued:

"We're merging the horizontal scaling and fault-tolerance framework we built for BigCouch into CouchDB so people can more easily scale all that CouchDB goodness across multiple servers and keep it running nonstop. It's our way of saying thanks and helping to grow the community of CouchDB developers and users."

The open source BigCouch database project was developed in 2008 by the Cloudant co-founders, who had previously been using CouchDB for managing and distributing the petabytes of data generated every second by CERN's Large Hadron Collider. They developed a horizontal clustering and fault-tolerance framework for BigCouch that was inspired by the Amazon Dynamo research paper.
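To make the Dynamo-style idea concrete, here is a minimal sketch of how a clustered database might spread documents across servers and decide when a write is safe. It is an illustration only, not BigCouch's actual algorithm: the function names (`shard_for`, `replicas_for`, `has_quorum`), the MD5-based hashing, and the simple ring walk are all assumptions made for this example.

```python
import hashlib
from typing import List


def shard_for(doc_id: str, q: int) -> int:
    """Map a document ID to one of q shards using a stable hash (assumed scheme)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % q


def replicas_for(shard: int, nodes: List[str], n: int) -> List[str]:
    """Pick up to n distinct nodes for a shard by walking the node list as a ring."""
    n = min(n, len(nodes))
    return [nodes[(shard + i) % len(nodes)] for i in range(n)]


def has_quorum(acks: int, n: int) -> bool:
    """A write counts as successful once a majority of the n replicas acknowledge it."""
    return acks >= n // 2 + 1


# Hypothetical three-node cluster with 8 shards and 3 copies of each shard.
nodes = ["node-a", "node-b", "node-c"]
shard = shard_for("invoice-1001", q=8)
owners = replicas_for(shard, nodes, n=3)
```

Because every shard lives on several nodes, the loss of a single server leaves a majority of copies reachable, which is the fault-tolerance property the paragraph above describes.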

For the code merger, Cloudant engineers imported sections of BigCouch code into the Apache CouchDB repositories, adapting the database to run in a clustered environment and to better replicate databases across clusters and between data centers. Going forward, Cloudant will cease development of BigCouch, in order to participate in the CouchDB community and keep CouchDB and Cloudant clustering functionality in sync. Cloudant engineers will continue to make cluster-scaling and fault-tolerance enhancements within the CouchDB project and will reuse that code in Cloudant's database service.

"The code merger of BigCouch and Apache CouchDB is good for the open source community and developers that require a scalable Web-aware database," said Travell Perkins, CTO at Fidelity Investments. "As a classically trained computer scientist, I'm interested in the inner workings of my database solutions as much as the practical utility they provide dynamic data and use cases. I've tried a lot of NoSQL solutions over the years with varying degrees of success. After working with the distributed clustering capabilities being built into CouchDB, I think we are approaching the ideal JSON-centric database for enterprise workloads at scale."

"We're continuing work within the Apache project to integrate the clustering technology of BigCouch, but now we've set the stage and are welcoming more project committers to get involved," said Jan Lehnardt, Project Management Committee chair of the Apache CouchDB project. "Cloudant's work fine-tuning BigCouch database replication at large scale now gives Apache CouchDB a complete strategy for replicating data across distributed systems, whether nodes are Erlang clusters in the same data center or on the other side of the world. Developers have more options for moving data closer to their users and a simpler strategy for synchronizing that data throughout a larger system."
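As a rough illustration of the replication Lehnardt describes, the sketch below builds a request for CouchDB's `_replicate` HTTP endpoint, which accepts a JSON body naming a `source` and a `target`. The helper name `build_replication_request`, the server address, and the database names are assumptions for this example, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_replication_request(server, source, target, continuous=False):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source, "target": target}
    if continuous:
        # Ask CouchDB to keep replicating as new changes arrive.
        body["continuous"] = True
    return urllib.request.Request(
        url=server.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical hosts and database names, for illustration only.
req = build_replication_request(
    "http://localhost:5984/",
    source="orders",
    target="http://remote.example.com:5984/orders",
    continuous=True,
)
```

Sending the request with `urllib.request.urlopen(req)` against a running server would start the replication. Real deployments would typically also need credentials, which are omitted here.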

The key accomplishment of the merged code, according to Cloudant, is the BigCouch clustering capability. Among other improvements to Apache CouchDB, Cloudant has contributed a new compactor process that creates smaller and better-organized post-compaction databases. CouchDB users can now expect significant improvements in compaction and replication speed, as well as better performance under high-concurrency access. Additional improvements include faster index updates, updated aggregate reduce functions, smooth hot-code updates, improved logging, and streamlined libraries. Cloudant engineers also refactored internal code, removing complicated sections and boosting overall performance.

A preview of the merged software is available now. A general release of CouchDB with the merged BigCouch functionality is planned to follow once the Apache community release process is complete.

Jeremy Geelan is Chairman & CEO of the 21st Century Internet Group, Inc. and an Executive Academy Member of the International Academy of Digital Arts & Sciences. He was formerly President & COO at Cloud Expo, Inc. and Conference Chair of the worldwide Cloud Expo series. He appears regularly at conferences and trade shows, speaking to technology audiences across six continents. You can follow him on Twitter: @jg21.