Posts

Showing posts with the label NoSQL

Towards a Standard for JSON Document Databases

Despite the ubiquity of the MongoDB aggregation framework, it has lacked a formal mathematical specification. This paper aims to close that gap by providing a theoretical foundation, and proposes MQuery. The formalization in MQuery is largely based on a paper published at ICDT 2018 (in which the first author was involved), extending it to include more pipeline operators, relax the assumption that the JSON documents stored in the database comply with a predefined schema, and allow objects that are either ordered or unordered sets of key-value pairs.

Motivation

For decades, SQL proponents have flaunted the rigorous mathematical foundation of relational algebra (courtesy of Edgar Codd). The world of JSON document databases, however, has remained a bit of a Wild West in comparison. The analogy is apt because, like the frontier, there is immense opportunity here. JSON is the undisputed king of data exchange, and the MongoDB aggregation framework has emerged as the wide...

Spanner: Becoming a SQL system

This is a SIGMOD 2017 paper. Last week we reviewed the F1 paper from 2012. It seems like F1 was an experiment and a sort of preview of adding serious SQL support to Spanner. The original Spanner paper, published in 2012, had little discussion of or support for SQL. It was mostly a "transactional NoSQL core". In the intervening years, though, Spanner has evolved into a relational database system, and many of the SQL features in F1 were incorporated directly into Spanner. Spanner got a strongly-typed schema system and a SQL query processor, among other features. This paper describes Spanner's evolution into a full-featured SQL system. It focuses mostly on distributed query execution (in the presence of resharding of the underlying Spanner record space), query restarts upon transient failures, range extraction (which drives query routing and index seeks), and the improved blockwise-columnar storage format. I wish there was discussion on the evolution of data manipulation/modi...

NoSQL: The Hangover of Dropping ACID

Created by Stable Diffusion.

The 70s were a time of excess and rebellion, a carefree era where people embraced their wildest impulses and let their hair down. But as the years went by, people realized that true freedom lies not in the absence of rules, but in the discipline to follow them. And so, they put away their acid-washed jeans and neon-colored hair, and embraced a more structured, disciplined way of life. As we hit 2010, the 70s spirit of rebellion spawned a new breed of databases known as "NoSQL". The rise of NoSQL databases was a rebellion against the strict rules and constraints of SQL databases. NoSQL databases promised to free users from the rigid, structured world of SQL, with their strict schemas and complex query languages. Instead, NoSQL databases offered a simple, flexible approach to data storage and retrieval, allowing users to store and access data in whatever way they saw fit. In contrast to SQL databases, which provide strong consistency guarantees t...
