
New Apache Cassandra 5.0 gives scalability and performance boost to open source NoSQL database

After years of development effort and community discussion, the open source Apache Cassandra 5.0 database is finally generally available. The update promises enterprises better performance, AI capabilities and improved data efficiency.

The new release marks the first major version number change since Apache Cassandra 4.0 was released in 2021. An Apache Cassandra 4.1 update followed in 2022 and added scalability features; since then, the focus has been on 5.0. Apache Cassandra is one of the most widely used database technologies and is used by big-name organizations including Apple, Netflix and Meta, as well as all kinds of enterprises. Cassandra has been developed as a multi-stakeholder open source technology. Multiple commercial vendors support Cassandra, including DataStax, and managed database offerings are available on Amazon Web Services, Microsoft Azure and Google Cloud.

The main advantage Cassandra has always had is that it is a large-scale distributed NoSQL database that enables organizations to have multiple nodes in different locations, all kept in synchronization. With 5.0 that distributed nature gets a big boost with a new indexing approach that also improves overall performance.

Apache Cassandra 5.0 also marks the official debut of vector search support in the generally available open source version of Cassandra. Some commercial Cassandra vendors, notably DataStax, integrated vector support long before the technology was part of an official stable 5.0 release.

“We’ve changed how indexing works in Cassandra, which is a big change,” Patrick McFadin, VP of developer relations at DataStax and Apache Cassandra committer, told VentureBeat. “Not only are they vectors, but they’re also the way we do normal indexes.”
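To make the idea of vector search concrete, here is a minimal sketch in plain Python. It ranks stored rows by cosine similarity to a query vector using a full scan. This is only an illustration of the concept: a production database would typically rely on an approximate index rather than comparing every row, and the row ids and three-dimensional embeddings below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, rows, k=2):
    """Return the ids of the k rows whose embeddings are most similar to query."""
    ranked = sorted(
        rows,
        key=lambda row: cosine_similarity(query, row[1]),
        reverse=True,
    )
    return [row_id for row_id, _ in ranked[:k]]


# Hypothetical data: (row id, embedding). Real embeddings have far more dimensions.
ROWS = [
    ("doc-a", [1.0, 0.0, 0.0]),
    ("doc-b", [0.9, 0.1, 0.0]),
    ("doc-c", [0.0, 1.0, 0.0]),
]

if __name__ == "__main__":
    print(nearest([0.8, 0.2, 0.0], ROWS))
```

The brute-force scan keeps the example short; the point of a vector index is to avoid exactly this per-row comparison at scale.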

Why Cassandra’s new data index matters to enterprise users

The new data indexing approach will provide all sorts of benefits to enterprise users.

McFadin said that developers now have a much simpler way to work with Cassandra and are no longer constrained by very tight data models. He noted that previously, in data modeling exercises, organizations had to be very specific about how the data model was built.

“Now we are relaxing the requirements,” he said. “You can create a data model, modify it, and then just add an index to use that data model in a different way.”
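As a rough illustration of "just add an index", the sketch below builds the kind of CQL statement involved. It assumes the `CREATE INDEX ... USING 'sai'` form described in the Cassandra 5.0 documentation for storage-attached indexes; the keyspace, table and column names are hypothetical, and the helper functions are not part of any Cassandra driver.

```python
def create_sai_index(keyspace, table, column):
    """Build a CQL statement that adds a storage-attached index to one column.

    Assumption: the USING 'sai' syntax from the Cassandra 5.0 documentation.
    """
    index_name = f"{table}_{column}_idx"
    return (
        f"CREATE INDEX IF NOT EXISTS {index_name} "
        f"ON {keyspace}.{table} ({column}) USING 'sai';"
    )


def select_by_column(keyspace, table, column):
    """Build a parameterized query that filters on the newly indexed column."""
    return f"SELECT * FROM {keyspace}.{table} WHERE {column} = ?;"


if __name__ == "__main__":
    # Hypothetical table that was originally modeled around order_id only.
    print(create_sai_index("shop", "orders", "customer_email"))
    print(select_by_column("shop", "orders", "customer_email"))
```

The table keeps its original primary key; the index is what lets the same data be queried by a column that the data model did not originally anticipate.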

What makes the new indexing approach with Apache Cassandra particularly remarkable is that it works in a very distributed manner.

“We have users who have five data centers around the world that are synchronized, in a cluster that spans the globe,” McFadin said.

How Cassandra 5.0 improves data density and performance

Beyond the new indexing approach, Cassandra 5.0 introduces the Unified Compaction Strategy, which significantly increases data density per node.

“Instead of having four terabytes per node, now you can have 10 or more terabytes per node,” McFadin said.

The ability to hold more data per node will help enterprise users by reducing hardware requirements for large-scale deployments. It will also reduce operational costs, since there are fewer nodes to operate.
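A back-of-the-envelope calculation shows what the density figures quoted above could mean for cluster size. The 4 TB and 10 TB per-node values come from the quote; the dataset size and replication factor are hypothetical, and the sketch ignores headroom for compaction, growth and uneven data distribution.

```python
import math


def nodes_needed(dataset_tb, replication_factor, tb_per_node):
    """Smallest node count that can hold every replica of the dataset."""
    total_tb = dataset_tb * replication_factor
    return math.ceil(total_tb / tb_per_node)


# Hypothetical workload: 120 TB of unique data, three replicas of each row.
DATASET_TB = 120
REPLICATION_FACTOR = 3

if __name__ == "__main__":
    before = nodes_needed(DATASET_TB, REPLICATION_FACTOR, tb_per_node=4)
    after = nodes_needed(DATASET_TB, REPLICATION_FACTOR, tb_per_node=10)
    print(f"At 4 TB per node:  {before} nodes")
    print(f"At 10 TB per node: {after} nodes")
```

Under these assumptions the same data fits on 36 nodes instead of 90, which is where the hardware and operational savings would come from.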

Cassandra 5.0 also introduces a pair of new data structures known as Trie Memtables and Trie SSTables. McFadin explained that these features align data structures for faster processing and improve overall performance in the database. He noted that by aligning data structures from the user down to disk, the database spends less time doing unnecessary work, leading to a significant performance increase.

“In short, when you’re looking for data in memory or on disk or something like that, databases have to go through this huge conversion process,” McFadin explained. “What the trie features do is align everything, so there are no conversions that need to happen.”
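For readers unfamiliar with the underlying data structure, a trie stores keys byte by byte so that keys sharing a prefix share a path. The sketch below is a generic, in-memory trie written for illustration only; it is not Cassandra's implementation, and the class names and sample keys are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}   # next byte (int) -> child TrieNode
        self.value = None    # payload stored at the node where a key ends
        self.is_key = False  # True if a complete key ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def put(self, key: bytes, value):
        """Insert or overwrite a key, creating one node per new byte."""
        node = self.root
        for byte in key:
            node = node.children.setdefault(byte, TrieNode())
        node.is_key = True
        node.value = value

    def get(self, key: bytes):
        """Return the stored value, or None if the key is absent."""
        node = self.root
        for byte in key:
            node = node.children.get(byte)
            if node is None:
                return None
        return node.value if node.is_key else None

    def keys_with_prefix(self, prefix: bytes):
        """Return all stored keys that start with prefix, in sorted order."""
        node = self.root
        for byte in prefix:
            node = node.children.get(byte)
            if node is None:
                return []
        found = []
        stack = [(node, prefix)]
        while stack:
            current, key = stack.pop()
            if current.is_key:
                found.append(key)
            for byte, child in current.children.items():
                stack.append((child, key + bytes([byte])))
        return sorted(found)


if __name__ == "__main__":
    trie = Trie()
    trie.put(b"user:1", "alice")
    trie.put(b"user:2", "bob")
    trie.put(b"order:9", "widget")
    print(trie.get(b"user:2"))
    print(trie.keys_with_prefix(b"user:"))
```

Lookups walk the key one byte at a time, so the cost depends on key length rather than on how many keys are stored, and shared prefixes are stored only once.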

The future of Apache Cassandra is ACID transactions

With Apache Cassandra 5.0 now generally available, the open-source community can turn its full attention to what comes next.

McFadin noted that work on Cassandra 5.1 has actually been ongoing since November 2023, after the feature freeze for the 5.0 release went into effect. Looking ahead, the Cassandra project is working on implementing full ACID (Atomicity, Consistency, Isolation, Durability) transactions.

“It’s probably the most exciting thing to come to the Cassandra database in 15 years,” he said.

The post New Apache Cassandra 5.0 gives scalability and performance boost to open source NoSQL database appeared first on VentureBeat.
