How the DBMS represents the database in files on disk.

<aside> 💡 Disk-based Architecture

The DBMS assumes that the primary storage location of the database is on non-volatile disk.

The DBMS's components manage the movement of data between non-volatile and volatile storage.

</aside>

<aside> 💡 Sequential vs. Random Access

Random access on non-volatile storage is usually much slower than sequential access.

The DBMS wants to maximize sequential access.

</aside>

<aside> 💡 Design Goals

</aside>

<aside> 💡 Why not the OS

The OS can do the low-level jobs for the DBMS, but the DBMS (almost) always wants to control things itself and can do a better job than the OS.

The OS is not your friend.

</aside>

1. File Storage


The DBMS stores a database as one or more files on disk, typically in a proprietary format.

<aside> 💡 Storage manager → responsible for maintaining a database's files.

It organizes the files as a collection of pages.

</aside>

<aside> 💡 Database Pages

There are three different notions of "pages" in a DBMS: hardware pages, OS pages, and database pages.

A hardware page is the largest block of data for which the storage device can guarantee failsafe writes.

</aside>
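A minimal sketch of what "a collection of pages" can mean in practice, assuming fixed-size pages stored back to back in a single file so that a page ID maps directly to a byte offset. The 4096-byte `PAGE_SIZE` and the helper names `page_offset`, `write_page`, and `read_page` are illustrative assumptions, not something these notes prescribe.

```python
import tempfile

PAGE_SIZE = 4096  # assumed database page size in bytes; real systems vary


def page_offset(page_id: int, page_size: int = PAGE_SIZE) -> int:
    """Byte offset of a page within a single database file."""
    return page_id * page_size


def write_page(f, page_id: int, data: bytes, page_size: int = PAGE_SIZE) -> None:
    """Overwrite one whole page; short payloads are zero-padded."""
    if len(data) > page_size:
        raise ValueError("payload larger than a page")
    f.seek(page_offset(page_id, page_size))
    f.write(data.ljust(page_size, b"\x00"))


def read_page(f, page_id: int, page_size: int = PAGE_SIZE) -> bytes:
    """Read one whole page by its ID."""
    f.seek(page_offset(page_id, page_size))
    return f.read(page_size)


def demo() -> bytes:
    """Write two pages to a temporary file and read the second one back."""
    with tempfile.TemporaryFile() as f:
        write_page(f, 0, b"first")
        write_page(f, 2, b"third")
        return read_page(f, 2).rstrip(b"\x00")
```

Always reading and writing whole pages keeps every I/O aligned to the page size, which is the unit the rest of the storage manager reasons about.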

1.1 Heap File Organization

<aside> 💡 Heap File → an unordered collection of pages with tuples stored in random order.

The DBMS needs metadata to keep track of which pages exist in multiple files and which ones have free space.

</aside>

Approaches:

  1. Linked List
  2. Page Directory

<aside> 💡 Page Directory

The DBMS maintains special pages that track the location of data pages in the database files.

The directory also records the number of free slots per page.

The DBMS must make sure that the directory pages are in sync with the data pages.

</aside>
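The page directory can be sketched as a map from page ID to the page's location plus its free-slot count. This is a hedged, in-memory illustration only: a real DBMS keeps the directory in special pages on disk, and the names `DirectoryEntry`, `PageDirectory`, `register_page`, `locate`, `find_page_with_free_slot`, and `record_insert` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DirectoryEntry:
    file_name: str   # which database file holds the page
    offset: int      # byte offset of the page within that file
    free_slots: int  # number of free slots the page still has


class PageDirectory:
    """In-memory stand-in for the directory pages (hypothetical API)."""

    def __init__(self) -> None:
        self._entries: Dict[int, DirectoryEntry] = {}

    def register_page(self, page_id: int, file_name: str,
                      offset: int, free_slots: int) -> None:
        """Record where a data page lives and how many slots it has free."""
        self._entries[page_id] = DirectoryEntry(file_name, offset, free_slots)

    def locate(self, page_id: int) -> DirectoryEntry:
        """Look up the location of a data page."""
        return self._entries[page_id]

    def find_page_with_free_slot(self) -> Optional[int]:
        """Return the first registered page with room for a tuple, if any."""
        for page_id, entry in self._entries.items():
            if entry.free_slots > 0:
                return page_id
        return None

    def record_insert(self, page_id: int) -> None:
        """Keep the directory in sync after a tuple is added to a page."""
        entry = self._entries[page_id]
        if entry.free_slots == 0:
            raise ValueError("page is full")
        entry.free_slots -= 1
```

Calling `record_insert` on every tuple insert is the "in sync" requirement in miniature: if the data page changes and the directory does not, `find_page_with_free_slot` starts returning pages that are already full.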

2. Page Layout


<aside> 💡 Page Header

Every page contains a header of metadata about the page's contents.