Btree Test

Copying Btree

The programs in the src directory allow you to initialize a btree database and populate it with random data. Several test programs give you different views of the database.

Test Program Description
initbt Initialize the database in batch
bldbt Populate the database
delbt Delete random records from the database
tmpbt Populate a temporary database
btsize Count the number of records
deletes Count the number of deleted index nodes
dbck Check all database pointers
reorg Reorganize the database
depth Calculate how many levels deep the database is
lstbt List the database in sequence
lstbt_r List the database in reverse sequence
lstphys List the database in physical sequence
dmpbt Dump the database in hex
updtbt Update the database in batch
holdlock Lock the database while running tstlock
tstlock Test the Database Lock
optimbt Calculate optimal records-per-node

Test Procedure

Initializing a Btree Database

The initbt.c program is a batch wrapper program for the btinit.c subroutine in the library.

Read Initialize a Btree Database for initializing a new database.

Lstbt Report

The lstbt report may be used for doing mass updates to a large database in physical sequence. The lstphys report may be used more directly for doing mass updates.

In the test database, the non-key fields are in lower case, and the key is in upper case.

The first column is the sorted sequence number of each record.

The second column is the offset in the database of each block.

The third column is the record number of the record in the block.

The fourth column is the record. The key is embedded in columns 5-12 of the test database record.

Sort the report by columns two and three for doing mass updates to a large database. This allows you to update in physical sequence by block number and record number within block. This is equivalent to using the lstphys report to do mass updates.

Dmpbt Listing

The dmpbt listing may be used for analyzing the pointers in the database.

The first 256 bytes are schema information about the database. See the structure btreefmt in the bthash.h header file for a description of this schema. The fields from hndl to the end are temporary. They aren't stored on disk.

The root block starts in byte 256 of the database. See the structure blkfmt in the bthash.h header file for a description of the block. The first 4 bytes are the right pointer. The left pointers are at the beginning of each record. Bytes 5-8 of each block are the record count in the block. The first record starts in byte 9. The right and left pointers and the record count are stored in little endian format on Intel machines.

The first record starts in byte 9 of the each block. See the structure bt_nodefmt in the bthash.h header file for a description of the record node. The first 4 bytes are the left pointer. The record data starts in byte 5 of the node.

With this information, you can follow the structure of the database and repair the database if damaged.

Updating the database

The updtbt program reads each record created by bldbt in sequence using the btgetnxt routine. After each record is read, updtbt places 4 z's in the beginning of the record and replaces the record in the database using the btupdt routine.

After all records have been updated, the database is printed in sequence to show the result of the update.

Testing the database lock

Testing the database lock requires two logical terminals and four steps.

First Terminal Second Terminal
Run holdlock.sh
Run tstlock.sh
Terminate holdlock
Re-run tstlock.sh

Optimal Records Per Node

The optimbt program illustrates ways of calculating records per node based on depth of search.

The first part shows three estimations based on the n'th root of the total number of records in the database. The n'th root is calculated based on the desired number of levels in the database.

The second part shows the smallest free space solution based on the first part of the report.

The third part is a more exhaustive calculation of space optimization. In some cases there is a trade-off between the number of records per node and the number of levels in the database.