distributed transaction: sharding, A do a, B do b, C do c, (a,b,c) all or none
raft: A,B,C replication do a, then most of the node agree
what if the user just want SQL and whole transcational database experience? it is a semi-relational model on top of a kv store, which a SQL on top.
the kv store consist of spans managed by the spanservers. each span is 1k tablet, which have replication and placement. tablet is a bag(duplicatable bag) of (key, ts) -> value, each managed by independent paxos at SQL leveel, app delcare hierachy between SQL tables, to co-live two tables in one tablet. spanner provide distributed transacation with 2PC, it is slower in multi-region config. so it is better the app dev model the schema that don’t require distributed transaction.
TrueTime allows the system to produce monotonic timestamps efficiently, the need for custom hardware has gone away with the invention or Hybrid Logical Clocks
最大的创新是 true time 的引入。这个东西让分布式事务从原来最传统的读写锁方案 进化为 可扩展的分布式MVCC的方案了。MVCC方案的优势是写读、读写不相互阻塞。 并发度更好。
在分布式事务实现中,之前之所以使用读写锁方案,而不用MVCC方案,主要原因是传统mvcc扩展方案里面的自增id分配是个单点瓶颈。spanner解决问题的方法是,放弃逻辑时间,改用真实时间但是最核心的分布式事务的性能问题其实解决的也并不是很好,尤其在热点更新的情况下,锁延迟太高的问题也没能解决。
to read: https://spanner.fyi/