OrioleDB: The next-generation storage engine for PostgreSQL

OrioleDB is a storage extension for PostgreSQL which uses PostgreSQL's pluggable storage system.

It is designed to be a drop-in replacement for PostgreSQL's existing storage engine. OrioleDB is built to take advantage of modern hardware and cloud infrastructure, providing better performance and scalability for PostgreSQL workloads.

Example

OrioleDB uses Postgres Table Access Method (TAM) to provide a pluggable storage engine for PostgreSQL. Here is an example of how you can create a table using OrioleDB:

-- Enable the OrioleDB extension
CREATE EXTENSION orioledb;

CREATE TABLE blog_post
(
  id int8 NOT NULL,
  title text NOT NULL,
  body text NOT NULL,
  PRIMARY KEY(id)
) USING orioledb; -- Use the OrioleDB storage engine

Pluggable Storage in PostgreSQL

Pluggable Storage gives developers the ability to use different storage engines for different tables within the same database. Developers will be able to choose a storage method that is optimized for their specific needs: some tables could be configured for high transactional loads, others for analytics workloads, and still others for archiving.

See examples

create table analytics_data
(
  id int8,
  created_at timestamptz,
  event text
) using parquet; -- Store data in a analytical optimized storage engine

create table timeseries_data
(
  id int8,
  created_at timestamptz,
  event text
) using timeseries; -- Store data in a time-series optimized storage engine

Something like this is already available in MySQL, which uses the InnoDB as the default storage engine since MySQL 5.5 (replacing MyISAM). Read more about the history of pluggable storage here.

Using OrioleDB with existing PostgreSQL installations

OrioleDB currently requires a set of patches to PostgreSQL to enhance the pluggable storage API and other PostgreSQL subsystems. All of these patches have been submitted to the PostgreSQL community and are under review.

The important property of this set of patches is keeping the binary compatibility. That is, you can switch to the patched PostgreSQL binary while keeping the same data directory. The existing tables will continue working with the default heap engine until you switch them to use orioledb. Moreover, it's possible to switch back to using unpatched PostgreSQL binaries. You would just need to convert your orioledb tables back to heap before.

The goal is to upstream everything: once these patches are accepted, OrioleDB will be able to run on any PostgreSQL installation without any modifications. This will also enable the entire PostgreSQL community to create their own pluggable storage engines.

Until then, you can use our pre-built Docker image to try out OrioleDB. The Docker image includes a patched version of PostgreSQL with OrioleDB pre-installed. Follow the Getting started guide to get started.

Patch set

You can get the full set of patches here. The following patches have been submitted to the PostgreSQL community to enhance the TAM interface and other subsystems.

	Name	Link	Version
✅	Add missing inequality searches to rbtree	Link	PostgreSQL 16
✅	Document the ability to specify TableAM for pgbench	Link	PostgreSQL 16
✅	Remove Tuplesortstate.copytup function	Link	PostgreSQL 16
✅	Add new Tuplesortstate.removeabbrev function	Link	PostgreSQL 16
✅	Put abbreviation logic into puttuple_common()	Link	PostgreSQL 16
✅	Move memory management away from writetup() and tuplesort_put*()	Link	PostgreSQL 16
✅	Split TuplesortPublic from Tuplesortstate	Link	PostgreSQL 16
✅	Split tuplesortvariants.c from tuplesort.c	Link	PostgreSQL 16
✅	Fix typo in comment for writetuple() function	Link	PostgreSQL 16
✅	Support for custom slots in the custom executor nodes	Link	PostgreSQL 16
✉️	Allow table AM to store complex data structures in rd_amcache	Link	PostgreSQL 18
✉️	Allow table AM tuple_insert() method to return the different slot	Link	PostgreSQL 18
✉️	Add TupleTableSlotOps.is_current_xact_tuple() method	Link	PostgreSQL 18
✉️	Allow locking updated tuples in tuple_update() and tuple_delete()	Link	PostgreSQL 18
✉️	Add EvalPlanQual delete returning isolation test	Link	PostgreSQL 18
✉️	Generalize relation analyze in table AM interface	Link	PostgreSQL 18
✉️	Custom reloptions for table AM	Link	PostgreSQL 18
✉️	Let table AM insertion methods control index insertion	Link	PostgreSQL 18

Legend

✅ - Patch has been accepted.
✉️ - Patch is sumbitted and under review by the PostgreSQL community.
✏️ - Patch is being worked on.

Features

OrioleDB opens the door to a future of more powerful storage models that are optimized for cloud and modern hardware architectures.

Open source

OrioleDB is distributed under the standard PostgreSQL license. The goal is to upstream all the patches required to run OrioleDB on any PostgreSQL installation without any modifications.

Designed for modern hardware

OrioleDB design avoids legacy CPU bottlenecks on modern servers containing dozens and hundreds CPU cores, providing optimized usage of modern storage technologies such as SSD and NVRAM.

Reduced maintenance needs

OrioleDB implements the concepts of undo log and page-mergins, eliminating the need for dedicated garbage collection processes. Additionally, OrioleDB implements default 64-bit transaction identifiers, thus eliminating the well-known and painful wraparound problem.

Designed to be distributed

OrioleDB implements a row-level write-ahead log with support for parallel apply. This log architecture is optimized for raft consensus-based replication allowing the implementation of active-active multimaster.

Differentiators

The key technical differentiations of OrioleDB are as follows:

No buffer mapping and lock-less page reading

In-memory pages in OrioleDB are connected with direct links to the storage pages. This eliminates the need for in-buffer mapping along with its related bottlenecks. Additionally, in OrioleDB in-memory page reading doesn't involve atomic operations. Together, these design decisions bring vertical scalability for Postgres to the whole new level.

MVCC is based on the UNDO log concept

In OrioleDB, old versions of tuples do not cause bloat in the main storage system, but eviction into the undo log comprising undo chains. Page-level undo records allow the system to easily reclaim space occupied by deleted tuples as soon as possible. Together with page-mergins, these mechanisms eliminate bloat in the majority of cases. Dedicated VACUUMing of tables is not needed as well, removing a significant and common cause of system performance deterioration and database outages.

Copy-on-write checkpoints and row-level WAL

OrioleDB utilizes copy-on-write checkpoints, which provides a structurally consistent snapshot of data every moment of time. This is friendly for modern SSDs and allows row-level WAL logging. In turn, row-level WAL logging is easy to parallelize (done), compact and suitable for active-active multimaster (planned).

Resources

Hacker news

Solving PostgreSQL Wicked Problems

This talk covers how the new engine is integrated with PostgreSQL Core and solves the wicked PostgreSQL problems.

OrioleDB: The next-generation storage engine for PostgreSQL

Example​

Pluggable Storage in PostgreSQL​

Using OrioleDB with existing PostgreSQL installations​

Patch set​

Legend​

Features​

Open source​

Designed for modern hardware​

Reduced maintenance needs​

Designed to be distributed​

Differentiators​

No buffer mapping and lock-less page reading​

MVCC is based on the UNDO log concept​

Copy-on-write checkpoints and row-level WAL​

Resources​

Hacker news​

Solving PostgreSQL Wicked Problems​