Schema-less data structures are not well understood, and it is important to weigh their pros and cons when using them in NoSQL databases. At a recent corporate event, Martin Fowler gave talks on schema-less data structures and on NoSQL and consistency.
Data structures without schema:
The lack of schema is often seen as a big advantage with NoSQL databases. Martin thinks the area is not well understood and describes various aspects of schema-lessness as well as the pros and cons of using schema-less data structures.
The main point is that even a structure without a schema still has a schema. To query the data and find information, you need to understand the data, and that understanding is an implicit schema: a definition of the data that lives, for example, in the code. In contrast, the schema of a relational database, where only conforming data is accepted, is an explicit schema.
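A minimal sketch of what an implicit schema can look like in practice. The records, field names, and helper functions below are hypothetical and are not taken from the talk; they only illustrate how application code ends up encoding assumptions that the store itself never checks.

```python
# Hypothetical schema-less records, e.g. documents read from a document store.
orders = [
    {"id": 1, "customer": "Ann", "total": 30.0},
    {"id": 2, "customer": "Bob", "amount": 12.5},  # same idea, different field name
]


def revenue(records):
    # The implicit schema lives here: this code assumes every record has a
    # numeric "total" field. Nothing in the store enforces that assumption.
    return sum(r["total"] for r in records)


def try_revenue(records):
    # Report a violated assumption instead of crashing.
    try:
        return revenue(records)
    except KeyError as err:
        return f"missing field: {err}"


print(revenue(orders[:1]))
print(try_revenue(orders))
```

The store accepts both records without complaint; the mismatch only surfaces when the querying code runs.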
Martin ends the discussion by saying that most of the time “Implicit schema == Bad thing”, preferring an explicit schema to get a clear statement of what the data looks like, although there are a few cases where the absence of a schema is useful. He also points out that a schema does not have to be a fixed storage schema; it can be more of a contract, for example a data access layer or an XML schema.
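One way to read "schema as a contract" is a small data access layer that validates raw documents before the rest of the application sees them. The sketch below assumes a hypothetical `Order` type and `load_order` function; it is one possible illustration, not the approach presented in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """Explicit statement of what an order looks like (hypothetical contract)."""
    id: int
    customer: str
    total: float


REQUIRED_FIELDS = {"id", "customer", "total"}


def load_order(doc: dict) -> Order:
    """Validate a raw schema-less document against the Order contract."""
    missing = REQUIRED_FIELDS - set(doc)
    if missing:
        raise ValueError(f"document is missing fields: {sorted(missing)}")
    return Order(
        id=int(doc["id"]),
        customer=str(doc["customer"]),
        total=float(doc["total"]),
    )


order = load_order({"id": 1, "customer": "Ann", "total": "30"})
print(order)
```

The storage layer stays schema-less, while every consumer of `load_order` gets a clear, checked description of the data.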
NoSQL and consistency:
In this talk, Martin examines two aspects of consistency in NoSQL databases.
Logical consistency deals with keeping data consistent while working within a database. For most NoSQL databases (graph databases being an exception), using aggregates (a concept from Domain-Driven Design in which a group of related objects is stored together as a unit) is an obvious way to avoid inconsistencies.
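The aggregate idea can be sketched with a toy in-memory store in which an order and its line items are read and written as a single unit. `AggregateStore` and `add_line` are hypothetical names for illustration; a real NoSQL database would offer its own API and atomicity guarantees.

```python
import copy


class AggregateStore:
    """Toy in-memory key-value store; each value is one whole aggregate."""

    def __init__(self):
        self._data = {}

    def put(self, key, aggregate):
        # Store a private copy so callers cannot mutate stored state directly.
        self._data[key] = copy.deepcopy(aggregate)

    def get(self, key):
        return copy.deepcopy(self._data[key])


def add_line(store, order_id, sku, qty, price):
    # Load the whole aggregate, change it, and write it back in one put.
    # The line items and the total can never be saved separately, so they
    # cannot drift apart in the store.
    order = store.get(order_id)
    order["lines"].append({"sku": sku, "qty": qty, "price": price})
    order["total"] = sum(line["qty"] * line["price"] for line in order["lines"])
    store.put(order_id, order)


store = AggregateStore()
store.put(1, {"customer": "Ann", "lines": [], "total": 0.0})
add_line(store, 1, "A-100", 2, 5.0)
print(store.get(1)["total"])
```

Because the unit of storage matches the unit of change, there is no intermediate state in which the lines are updated but the total is not.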
While describing replication consistency, where copies of the same data are kept in multiple places, Martin introduces the CAP theorem. Since the data is already replicated across the network, he simplifies the theorem into a choice between consistency and availability. He stresses that this is not a technical problem but a business choice: whether consistency or availability is the top priority.
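The trade-off can be illustrated with a deliberately simplified two-replica model. The `Cluster` class and its `prefer` setting are hypothetical and stand in for whatever policy a real system exposes; the sketch only shows how the same network partition leads to different behavior depending on the priority chosen.

```python
class Cluster:
    """Toy model of two replicas; writes arrive at replica A."""

    def __init__(self, prefer):
        if prefer not in ("consistency", "availability"):
            raise ValueError("prefer must be 'consistency' or 'availability'")
        self.prefer = prefer
        self.replica_a = {}
        self.replica_b = {}
        self.partitioned = False  # True when A cannot reach B

    def write(self, key, value):
        """Return True if the write was accepted."""
        if not self.partitioned:
            self.replica_a[key] = value
            self.replica_b[key] = value
            return True
        if self.prefer == "consistency":
            # Refuse the write rather than let the replicas diverge.
            return False
        # Stay available: accept locally, leaving replica B stale.
        self.replica_a[key] = value
        return True


cp = Cluster(prefer="consistency")
ap = Cluster(prefer="availability")
for cluster in (cp, ap):
    cluster.write("room-42", "free")
    cluster.partitioned = True

print(cp.write("room-42", "booked"), cp.replica_a == cp.replica_b)
print(ap.write("room-42", "booked"), ap.replica_a == ap.replica_b)
```

Neither outcome is a bug: rejecting the write or tolerating stale replicas is exactly the decision that has to be made by the business, not by the technology.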
Martin closed with a talk on the value of software design and technical debt.