How to make a DB? [Draft]

It will built in C as most the db in the world, and using b-tree or b+ tree to store. the keypoint here are:

db store index and data seperately on filesystem
then ISAM GET/PUT to manipulate a record’s field
SQL statement will translate to a set of GET/PUT
a log of statment for transaction

Let first simply the question by store all data is a file. Assume a record takes 100 byte (which is defined by scheme type)

FirstName(30)	LastName(20)	Address(50)
foo1	bar1	foobar1
foo2	bar2	foobar2
foo3	bar3	foobar3

Q: How can I get the second record of firstname A: easy , lseek(file, sizeof(firstname),100)

Q: How can I get record of “bar1”? A: sequential search? ofc not. that is how INDEX comes in. If I already I have the POS of the record, then fetching it will be no brain.

so let create a index on “LastName”, so search the record ==> search the index in b-tree, and lseek the data file

                     bar2
                      130
                   /     \                   
                 bar1    bar3
                 30       230

then a table will have a data file and many index files(I am going to index every column)

Read

after have stored data and index files, using a ISAM http://en.wikipedia.org/wiki/ISAM, then I could GET/PUT
then I need a SQL parser that translate SQL statement into a sequence of GETs.
If two table is joined, then a optimizer is need to decide which table to read first.

Write

Insert parse into PUT statement
multiple INSERTs => transacation manager to make sure atomic
then a log of command if recored.
backup/restore….replicaion tools to read transcation log and replicate the transcation
it will need some connection manager the handle connections, authenticate, then run the request
commerical db direct write to raw disk because of the cacheing issue.

B-tree

one node can have multi value to reduce the layer#,
the size of one node depends on page size.
parent size is a, then it got a+1 children

Distributed

If all data is in one machine, the problem of network partition didn’t show up. But if now we are can’t handle the request, what should I do?

bought a power machine (do you smell expensive?)
distribute database.

If I distribute data in multi database.

first we have a Consistency issue, read after write may have wrong result
=> lock down and sync
then we have Availability issue, if all db are waiting for sync
=>message offline node about updates or hire a run around boy
then what if the db network is partitioned?
- if I block read before sync is secured, I sacrificed A to C
- if I let read happen before sync is secured, I sacrificed C to A
  - conflict? vector-clock or app-specifi logic

REF:

http://www.sqlite.org/arch.html
http://sqlite.org/atomiccommit.html
http://www.sqlite.org/fileformat.html

Written on February 1, 2015