Riki Otaki Blog

Content Management System written in bash

このエントリーをはてなブックマークに追加

Siloを実装した

TODO

Implemented SILO

Silo is a scalable multi-core concurrency control protocol proposed in the paper: "Speedy Transactions in Multicore In-Memory Databases". I implemented it in my tpcc-runner repository (https://github.com/wattlebirdaz/tpcc-runner) to verify its scalability.

The performance was the following in 16 core Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz with 2 threads per core. The workload is Full-TPC-C with Phantom protection (Node Verify).

Single Warehouse

In single warehouse, the performance scales until 5 threads. (The maxima of throughput could be in any number between 2 to 9 threads.) The performance degrades rapidly after that because of high contention. The abort rate is calculated as num_aborts/num_trials = num_aborts/(num_aborts + num_commits). The abort rate approaches to 1 as the number threads increases.

Thread Count = Num Warehouse

In thread count = num warehouse, the performance seems to scale well. The abort rate is much smaller than that of a single warehouse. The reason why silo scales is that it is cache-friendly (= it has fewer updates to shared memory). The OCC style read does not modify the shared memory when accessing data. The group commit (epoch) allows fewer updates to shared counter.

Tips for implementing silo

I will list up some tips and questions that I needed to solve for implementing silo.

1. Understand silo's commit protocol

2. Understand masstree structure

3. Decide readset writeset semantics (state change)

  • What operations to support? In TPC-C, read, insert, update, delete, read_scan, update_scan is needed but for YCSB upsert(unconditional write) is also needed.
  • What is the actual format of TidWord?
  • What to store in readset and writeset?
  • How to efficiently find record (in index) that local sets point to in pre-commit phase?
  • Do you copy data when reading value from index? Or just use pointers.
  • Do you copy on write? Or modify record directly in index.
  • How to use absent and latest bits for inserting and deleting records?
  • How to determine TID of commit?

4. Understand epoch and garbage collection

  • When can you free deleted data?
  • Compare the global epoch and epoch of garbage and if the difference is more than 1, you can free them.
  • When can you increment the global epoch?
  • Check workers' epochs before incrementing to prevent them from having epochs more than global epoch + 1.
  • When can you free record inserted by transaction that aborts?

5. Understand phantom protection by node verify

  • Which operation and when does it need to collect node info?

6. Connect it with some workload

  • TPC-C is in tpcc-runner.
  • How to implement secondary index?

Article Info

created: Thu Aug 26 17:02:37 JST 2021
modified:
views: 157
keywords: DBMS, OCC, SILO
prev:OCC and SILO next:振り返り (2021 3-9)

Social Links