Database Research Group
Research topics (index structures)
Concurrency Control and Recovery
The classical theory of concurrency control and recovery defines transactions as strings of abstract read and write operations. In our model, the logical database is assumed to consist of records with unique keys taken from a totally ordered domain. This implies that insertions and deletions of records, and thus dynamic index structures, can be treated in the theory of transactions. We have shown how to implement effcient concurrency control and recovery algorithms within this model, when the underlying index structure is a Blink-tree. We present a new technique to implement a small structure modification such as a page split or a page merge as an atomic action, so that interrupted structure modifications are never rolled back (undone). This is in contrast to the previous algorithms, in which an interrupted tree-structure modification always has to be rolled back during restart recovery or when transaction aborts during normal processing. Our transaction model has also been extended to page-shipping client-server database systems, in which the physical database is organized as a sparse B-tree index; and to the multiversion B+-tree, an asymptotically optimal multiversion index structure.
Publications:
- Seppo Sippu, Eljas Soisalon-Soininen: A Theory of Transactions on Recoverable Search Trees. ICDT 2001: 83-98.
- Ibrahim Jaluta, Seppo Sippu, Eljas Soisalon-Soininen: Concurrency control and recovery for balanced B-link trees. VLDB J. 14(2): 257-277 (2005).
- Ibrahim Jaluta, Seppo Sippu, Eljas Soisalon-Soininen: B-tree concurrency control and recovery in page-server database systems. ACM Trans. Database Syst. 31(1): 82-132 (2006).
- Ibrahim Jaluta: Index Cache Consistency in Client-Server Database Systems. CIT 2006: 1-6.
Multiversion indexes
Current database systems require access to past database states, in addition to the current state. Efficient indexing of such historical data requires a multiversion index structure. We have based our work on the multiversion B+-tree (MVBT) of Becker et al (Bruno Becker , Stephan Gschwind , Thomas Ohler , Bernhard Seeger , Peter Widmayer: An asymptotically optimal multiversion B-tree. VLDB J. 5(4): 264-275 (1996)). The MVBT is asymptotically optimal, so that queries targeting any version of the database are as efficient as queries in a single-version B+-tree that only indexes the entries that are part of the targeted version; but the MVBT follows a single-update model, and the update cannot be rolled back. We have extended the MVBT by redesigning the structure-modification algorithms so that a single updating transaction can execute on the index concurrently with multiple read-only transactions, and we have further developed a combined index structure, the concurrent multiversion B+-tree (CMVBT), that is composed of a transactional MVBT and a separate main-memory-resident versioned B+-tree structure used to hold the updates of active transactions.
Our CMVBT index is an efficient multiversion index structure that allows the indexing of a single linear history of data evolution. We have analytically shown that the underlying transactional multiversion B+-tree index is asymptotically optimal, and thus remains optimal even in the presence of key deletions. Even though deletions in multiversion indexes must not physically remove the data history from the index, searches and range scans in later versions (after deletions) can be more efficient, if the index structure merges pages.
Publications:
- Tuukka Haapasalo, Ibrahim Jaluta, Seppo Sippu, Eljas Soisalon-Soininen: Concurrency control and recovery for multiversion database structures. PIKM 2008: 73-80.
- Tuukka Haapasalo, Ibrahim Jaluta, Bernhard Seeger, Seppo Sippu, Eljas Soisalon-Soininen: Transactions on the multiversion B+-tree. EDBT 2009: 1064-1075.
- Tuukka Haapasalo, Ibrahim Jaluta, Seppo Sippu, Eljas Soisalon-Soininen: Concurrent updating transactions on versioned data. IDEAS 2009: 77-87.
Bulk operations and adaptive sorting
Another interesting question would be to allow a richer set of basic operations for the indexes. For example, bulk insertion and bulk deletion are obvious candidates for additional operations. The efficient implementation of these operations for main memory search trees have been studied in the Ph.D. Thesis by Riku Saikkonen with the application of adaptive tree sorting . We have shown how to treat online bulk deletes from large warehouse tables in the context of general transactions. The new features are a new multiple-granularity locking protocol and the use of our general concurrency control and recovery technique.
Publications:
- Timo Lilja, Riku Saikkonen, Seppo Sippu, Eljas Soisalon-Soininen: Online Bulk Deletion. ICDE 2007: 956-965.
- Riku Saikkonen, Eljas Soisalon-Soininen: Cache-sensitive Memory Layout for Binary Trees. IFIP TCS 2008: 241-255.
- Riku Saikkonen, Eljas Soisalon-Soininen: Bulk-Insertion Sort: Towards Composite Measures of Presortedness. SEA 2009: 269-280.
- Riku Saikkonen: Bulk Updates and Cache Sensitivity in Search Trees. Thesis for a Doctor of Science in Technology, Helsinki University of Technology, 2009.