How Does Database In-Memory Improve Performance?
The IM column store includes several optimizations for accelerated query processing. These are described in detail in the
Database In-Memory Technical Brief, so only a brief overview is provided here.
There are five basic architectural elements of the column store that enable orders of magnitude faster analytic query
processing:
1. Compressed columnar storage: Storing data contiguously in compressed column units allows an analytic query to scan
only the required columns, instead of skipping past unneeded data in other columns as a row-major format would require.
Columnar storage therefore lets a query perform highly efficient sequential memory references, while compression lets the
query optimize its use of the available system (processor-to-memory) bandwidth.
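The row-major versus columnar contrast can be sketched in a few lines of Python. This is an illustrative analogy only, not Oracle's internal format: it shows why a single-column scan over column-contiguous storage touches less data than the same scan over row tuples.

```python
# Illustrative sketch (not Oracle internals): the same table held
# row-major vs. column-major, and a scan that touches one column.
rows = [(i, f"name{i}", i * 10) for i in range(1000)]  # row-major: list of tuples

# Column-major: one contiguous list per column.
columns = {
    "id":     [r[0] for r in rows],
    "name":   [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}

# A row-major scan of 'amount' must walk every full row tuple.
total_row_major = sum(r[2] for r in rows)

# A columnar scan reads only the 'amount' column, sequentially.
total_columnar = sum(columns["amount"])

assert total_row_major == total_columnar
```

In the row-major case the scan drags the `id` and `name` data through memory even though the query never uses them; in the columnar case every byte read contributes to the result.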
2. Vector Processing: In addition to being able to process data sequentially, column-organized storage also enables the use of
vector processing. Modern CPUs feature highly parallel instructions known as vector instructions. These instructions can
process multiple values in one instruction – for instance, they allow multiple values to be compared with a given value
(e.g. find sales with State = “California”) in one instruction. Vector processing of compressed columnar data further
multiplies the native scan speed obtained via columnar storage, resulting in scan speeds exceeding tens of billions of rows
per second, per CPU core.
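The idea of one instruction comparing a whole lane of values can be mimicked in plain Python. In this sketch (an analogy, not real SIMD), one call to the hypothetical `vector_equals` helper stands in for a single vector compare instruction operating on a fixed-width lane:

```python
# Illustrative sketch: a "vector instruction" analogy in pure Python.
# A SIMD compare processes a fixed-width lane of values in one step;
# here one helper call stands in for one vector instruction.
LANE_WIDTH = 8  # e.g. 8 values per compare, as with 256-bit SIMD on 32-bit data

def vector_equals(lane, target):
    # One "instruction": compare a whole lane against a constant,
    # producing a mask of matches.
    return [value == target for value in lane]

states = ["California", "Texas", "Nevada", "California"] * 4  # 16 column values
matches = 0
for i in range(0, len(states), LANE_WIDTH):
    lane = states[i:i + LANE_WIDTH]
    mask = vector_equals(lane, "California")
    matches += sum(mask)

print(matches)  # 8 matching rows, found in only 2 "instructions"
```

Real hardware evaluates each lane in a single clock-level operation, which is where the multiplied scan speed comes from; the loop structure here only shows the shape of the computation.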
3. In-Memory Storage Indexes: The IM column store for a given table is divided into units known as In-Memory
Compression Units (IMCUs) that each represent a large number of rows (typically around half a million). Each
IMCU automatically records the min and max values for the data within each column in the IMCU, as well as other
summary information regarding the data. This metadata serves as an In-Memory Storage Index: for instance, it allows an
entire IMCU to be skipped during a scan when it is known from the scan predicates that no matching value will be found
within the IMCU.
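The min/max pruning behavior can be sketched as follows. The chunk structure here is hypothetical and much simpler than a real IMCU; it only demonstrates how per-chunk metadata lets a scan skip whole units:

```python
# Illustrative sketch: min/max metadata per IMCU-like chunk lets a scan
# skip chunks that cannot contain matches (hypothetical structure, not
# Oracle's actual format).
data = list(range(1_000_000))          # one column's values
CHUNK = 500_000                        # roughly "half a million rows" per unit

# Build per-chunk metadata: min, max, and the values themselves.
chunks = []
for i in range(0, len(data), CHUNK):
    values = data[i:i + CHUNK]
    chunks.append({"min": min(values), "max": max(values), "values": values})

def count_equal(target):
    scanned_chunks = 0
    hits = 0
    for c in chunks:
        if target < c["min"] or target > c["max"]:
            continue                   # "storage index" prunes the whole chunk
        scanned_chunks += 1
        hits += sum(1 for v in c["values"] if v == target)
    return hits, scanned_chunks

hits, scanned = count_equal(750_000)   # only the second chunk is scanned
```

For the predicate `= 750_000`, the first chunk's max of 499,999 rules it out without reading a single value, so only one of the two chunks is actually scanned.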
4. In-Memory Optimized Joins and Reporting: As a result of massive increases in scan speeds, the Bloom Filter operator
(introduced earlier in Oracle Database 10g) is commonly selected by the optimizer. With the Bloom Filter
optimization, the scan of the outer (dimension) table generates a compact Bloom filter which can then be used to greatly
reduce the amount of data processed by the join from the scan of the inner (fact) table. And with In-Memory, Bloom filter
evaluation can run an order of magnitude faster thanks to vector processing. Similarly, an optimization known as Vector
Group By can be used to reduce a complex aggregation query on a typical star schema to a series of filtered scans against
the dimension and fact tables.
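The Bloom-filter join pattern described above can be sketched in a simplified form. This is not Oracle's implementation; the `BloomFilter` class and table data are invented for illustration, but the flow is the one the text describes: the dimension-table scan builds a compact filter on the join key, and the fact-table scan probes it to discard non-joining rows early:

```python
# Illustrative sketch of a Bloom-filter join (simplified, not Oracle's
# implementation): the dimension scan builds a compact filter, and the
# fact scan uses it to drop non-joining rows before the exact join.
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        # May return a false positive, never a false negative.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))

# Dimension scan with a filter predicate (state = 'California') builds
# the Bloom filter on the join key.
dim = [(1, "California"), (2, "Texas"), (3, "California")]
keys = {k for k, state in dim if state == "California"}
bf = BloomFilter()
for k in keys:
    bf.add(k)

# Fact scan probes the filter; most non-joining rows are dropped cheaply.
fact = [(1, 100), (2, 200), (3, 300), (4, 400)]
survivors = [row for row in fact if bf.might_contain(row[0])]
# Exact join runs only on the surviving rows.
joined = [row for row in survivors if row[0] in keys]
```

Because a Bloom filter can report false positives but never false negatives, the filter may let a few extra fact rows through, but the final exact join stays correct while processing far fewer rows.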
5. Unique Features on Exadata: Database In-Memory on Exadata can automatically encode data in the Smart Flash Cache
of the Storage Servers using the same in-memory columnar formats as those used in the database tier. Exadata also makes
it possible to duplicate objects in the Database In-Memory column store, a feature known as In-Memory Fault Tolerance:
in the event of a node failure, duplicated objects remain available in a surviving node's column store, preserving analytic
query performance. Further, when either the primary or the Active Data Guard standby database runs on an Exadata
Database Machine, Database In-Memory can be used to further accelerate reporting queries on the Active Data Guard
standby. Together, these three Exadata-only Database In-Memory features, along with the ability to exploit Exadata's
unique hardware, make Exadata the best platform for running Database In-Memory.
Apart from accelerating queries, Database In-Memory can also speed up DML operations (writes) by making analytic
indexes unnecessary: since the IM column store enables very fast analytics, conventional indexes used only to accelerate
analytic queries can simply be dropped. Avoiding costly index maintenance allows update/insert/delete operations to run
an order of magnitude faster. As stated earlier, the IM column store is a purely in-memory structure, and maintaining it
has very low overhead.
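The per-row cost of index maintenance can be sketched with a toy stand-in. Here a sorted list maintained with `bisect` plays the role of a conventional analytic index (real B-tree maintenance is more involved, and the tables are invented for illustration):

```python
# Illustrative sketch: DML that must maintain a sorted analytic index
# (here a plain sorted list via bisect) does extra work on every insert;
# dropping that index leaves only the base-table append.
import bisect

table = []
analytic_index = []   # stands in for a conventional index on one column

def insert_with_index(row):
    table.append(row)
    bisect.insort(analytic_index, row[1])   # index maintenance on every DML

def insert_without_index(row):
    table.append(row)                       # index dropped: no maintenance

for i in range(1000):
    insert_with_index((i, i % 97))

# The index stays sorted only because every insert paid to keep it so.
assert analytic_index == sorted(analytic_index)
```

Once the IM column store answers the analytic queries, the index, and with it the ordered-insert work on every write, can be removed, which is where the DML speedup comes from.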