SDB is designed for small online databases, has a JSON/PHP interface, and is implemented in PHP. This article presents the rationale for SDB, provides a link to the SDB files (SDB.zip), and a link to a simple demo. Also presented are the SDB design and an introduction to the SDB implementation.
Small-business websites typically have very simple databases that are used to perform authentication, maintain webmaster-supplied content, and tailor user experiences. A very simple database language is desirable for such sites in order to support easy creation and maintenance. A simple design may also allow more efficient implementations. Yet another advantage is that a simple database language can provide an easy introduction to databases for the beginner. The SDB language is designed with these possibilities in mind.
Compared to traditional SQL-like languages, SDB can be much simpler as a result of increased generality and the elimination of unnecessary requirements. Specifically,
- The SDB design is general in that database objects may be arbitrary JSON values, not just SQL-like tables. Consequently, in a given database application, the chosen structure of database objects can grow organically out of the programming problem at hand. But in order to facilitate comparison with SQL-like languages, SDB does include a design for table‑like objects.
- There is no requirement to handle objects that are so large they can only be manipulated piecewise. This allows for simpler object-handling methods than is the case with SQL.
- Unlike SQL, there is no command line interface. The databases live on a web server and interact exclusively with the server's web-integration language (so far, just PHP). Consequently, there is no need for the programming interface to prepare and pass command-line instructions. Hence, there is nothing analogous to SQL injection attacks.
- As with SQLite, there is no explicit security interface. Placement of database files is at the discretion of the programmer. As a result, site-dependent details of how to access databases can be eliminated.
The above simplifications not withstanding, SDB does draw inspiration from SQL-like languages via an analogy with JSON objects: An SQL-like database is analogous a JSON object that maps identifiers to tables, where each table is characterized by its table rows. For our purposes:
A database is a JSON object implementation residing in permanent storage.
Relationship to Other Languages
There are by now more than a hundred noSQL database systems, and like SDB, several are inspired by JSON. The MongoDB database language replaces JSON objects with the more general and more efficient binary JSON (BSON) objects; this adaptation suggests an evolutionary path for SDB. MongoDB successfully gets beyond the one-size-fits-all mentality of SQL, but is far more ambitious than SDB. Tokyo Cabinet also enjoys some simplicity by not dealing with the internal structure of database objects. CouchDB handles key-value pairs where the values are required to be JSON objects; this is less general than SDB where arbitrary JSON values are allowed.
The Really Simple Database language places a high priority on simplicity. But unlike SDB, its primary interface is with humans via a command line user interface, its data objects are SQL-like tables, and as with SQL, there is not a strong separation between databases and the objects they store.
In contrast to SQL-like languages, SDB has no database schemas, at least not yet. The unmet challenge here is to provide useful schemas without sacrificing simplicity. Currently, there is a proposed notation and semantics for JSON schemas, but it is far from simple.
The SDB Design (PHP version)
$db = new SDatabase($fileName);
If the designated file contains JSON text, it is decoded to create the associative array
$fileName is assigned to
$db->$fileName. If the database file doesn't already exist,
$db->data will be an empty array. For a database that exists only in memory, specify $db = new SDatabase("").
$db->data to the file
$db->data to the named file. Savepoint files are structurally identical to SDatabase files and may be used to open a database via the
new SDatabase method.
Deletes or unlinks the named savepoint file. This operation may also be used to delete the main database file, so it is advisable to use a naming convention that avoids confusion of the main database with its savepoint files.
Overwrites the current database by reading in the named savepoint file. Does not change the file name associated with the database. The savepoint and rollback methods may be used to back up and restore memory-only databases. A rollback to an empty file name is equivalent to starting over with a new memory-only database.
Closes the database, does not automatically write out the database. This is needed only in large scripts, as all variables are automatically unset when a PHP script terminates. This statement has a different structure, as it cannot be implemented as a database method.
Analogous to the SQL DROP TABLE statement. The statement succeeds whether
$db->data[$name] existed or not.
$db->data[$name] = $value;
Generalizes the SQL CREATE TABLE coupled with INSERT OR REPLACE statements.
Extracts the database object with key $k.
$fn is a function that maps objects to objects.
Generalizes CREATE VIEW, except that the view is not explicitly part of the database. As a consequence, there is no corresponding notion of DROP VIEW. Also generalizes CREATE VIRTUAL TABLE.
There are a few SQL database constructs that we have deliberately omitted for various reasons. The ANALYZE and PRAGMA statements aren't needed because they address efficiency concerns that are only relevant to large databases. The ATTACH DATABASE statement isn't needed, as multiple databases may easily be opened and worked on in parallel without conflict. The BEGIN TRANSACTION and END TRANSACTION statements aren't supported due to the lack of a safe and effective file locking mechanism for PHP. The CREATE TRIGGER and DROP TRIGGER statements aren't supported, as these statements rely on a tight integration of table and database constructs. Finally, as already mentioned, there is no provision for database schemas due to the lack of a simple database schema language.
This concludes the presentation of the actual database language. But in order to get a valid comparison with SQL, we will explore the case where
$db->data[$k] is an SQL-like table. The challenge is to find a concise table abstraction that is easy to describe and implement, and still allows us to do the kinds of things SQL does with tables.
Virtually all SQL table manipulation is row-at-a-time. So we begin by defining a row as an associative array that maps identifiers (i.e., column names) to values. Rows include the kinds of things that might be created in MySQL as follows:
$table = mysqli_query($link, $query);
$row = mysqli_fetch_array($table, MYSQL_ASSOC);
A table is largely defined by its table rows, so our first table operations are:
rows($tbl), the rows of the table
table($rows), the table created from a nonempty
emptyTable($keys), the empty table with column names given by the
columnNames($tbl), column names of the supplied table.
table are inverses of each other, except for the fact that an empty table will still have column names, unlike an empty array of table rows. There is no enforced data typing on columns.
$rows are appended to the rows of
$tbl. This mimics the effect of repeated SQL INSERT OR REPLACE INTO statements, one for each row. The SQL UNION construct can also be mimicked as
append($tbl1, rows(tbl2)). Row ordering isn't preserved and may need to be restored using the
orderBy function. Duplicate rows may need to be removed using the
removeDuplicates function. These functions are presented below.
select($tbl, $filter), where $
filter is a Boolean function on
Rows that don't satisfy the filter are removed from
$tbl. Mimics the SQL statement, SELECT * from $tbl WHERE $filter.
sanitize($a, $filter), where
$filter is a Boolean function.
Rows that do satisfy the filter are removed from
$tbl. Mimics the SQL statement DELETE FROM $tbl WHERE $filter.
orderBy($tbl, $clns) – where
$tbl is a table and
$clns is either a column name or an array of column names.
The rows in
$tbl are sorted into lexicographic order using the column names of
$clns. Mimics SQL features such as ORDER BY and COLLATE.
Eliminates duplicate rows. Mimics the effect of the UNIQUE keyword in SELECT statements.
selectColumns($tbl, $clns), where
$clns is a column name or an array of column names from
Removes table columns whose column names are not in the array
$clns, and reorders the columns according to the order in
$clns. Mimics the effect of SELECT $clns FROM $tbl.
columnMap($tbl, $re), where
$re is an associative array that maps names to expressions.
columnMap function produces a new table,
$nt, whose column names are the keys in
$re. For each row,
$tbl, a corresponding row of
$nt is obtained by evaluating the expression values in
$re in an expanded context that treats column names in
$row as variables with values as specified in
$row. This function mimics the functionality of the SQL AS construct.
columnMap function may be used in several ways:
- Reorder selected columns by including pairs of the form
'id' => '$id'in
idnames the existing column to be included. In this use,
columnMapmimics the simpler
- Rename selected columns by including pairs of the form '
id1' => '
- Add a new blank column by including a pair of the form
$id => NULLin
- Combine columns. For example, if
$tblcontains numeric columns named '
min' and '
max', one can construct a difference column by including the pair
'diff' => '$max - $min'in
Aggregate functions, e.g.,
sum($cln) is the function that, when applied to a table
$tbl with a numeric column named
$cln, returns the sum of all numbers in that column. In other words,
sum($cln)($tbl) is the sum of all numbers in the
$cln column of
In PHP, '
count' is a reserved word, so we define
sCount() to be the function such that
sCount()($tbl) is the number of rows in
regroup($tbl, $cln, $p), where
$cln is a column name or array of column names, and
$p is an associative array that maps identifiers to aggregate functions such as sum() or sCount().
$tbl is sorted and split into an array of sub-tables:
$tn, where within each
$cln has the same value in every row. For each
$ti, a new table row,
$ri, is constructed: The elements of
$ri are pairs of the form
$id => $val, where
$id is a key in
$val = $p[$id]($ti). Mimics the effect of the SQL GROUP BY construct.
The result is a table whose rows are built from pairs of rows in
$tbl2. For each row
$tbl1 and each
$tbl2, if there are duplicate column names for which
$row2 provides a different value than
$row one, the pair is discarded; otherwise duplicate values are deleted from
$row2 and the remaining values are appended to
$row1. Mimics the SQL NATURAL JOIN construct.
The above list intentionally omits a few table operations. Specifically, no effort has been made to support the ALTER TABLE construct. In addition, we only looked at a couple of aggregate functions and totally omitted any discussion of date-time functions, the latter being readily available in the underlying PHP programming language. For sake of simplicity, there are no methods for maintaining multiple orderings of the same table, and thus nothing directly mimics the SQL REINDEX, INDEX BY, or CREATE INDEX.
Introduction to the Implementation
The SDB database language is implemented as a library consisting of two PHP classes, an SDatabase class for manipulating databases, and an STable class of static functions for manipulating tables. The PHP classes are available in files
STable.inc.php. These and corresponding test files maybe downloaded in SDB.zip.
SDB errors are communicated by generating exceptions. For example, the construct
new SDatabase can generate the following exceptions:
new SDatabase: JSON error – $json_error in file '$fileName'. // 5 cases
new SDatabase: Corrupted database in file '$fileName'.
new SDatabase: Cannot write to file '$fileName'.
The SDB design takes some advantage of PHP 5.4's support for atomic file operations. It is not possible for two scripts to write a database file at the same time, so database file integrity is preserved. However, file reads are not fully atomic, with the result that it is possible for an in-memory database to be corrupted by a write while in the process of being read in, but such an error is immediately detected due to the fact that most strings are not JSON texts.
json_encode functions allow JSON objects to be implemented as associative arrays rather than as PHP objects. The only sticky point in doing this is that an empty array is then both a JSON object and a JSON array.
A Simple Example
Consider the problem of saving a grocery list onto the web for future use at a grocery store. First, try the grocery list demo, then take a look at how it's implemented. Here is how the saved grocery list is displayed in a web page:
The above PHP code opens the database, extracts the grocery list if present, and dumps it into the web page. The doAjax function isn't shown, but its purpose is simply to send a revised grocery list to the server script,
saveGroceries.ajax.php, whose content is as follows:
The first two lines bring in support for SDB and for really simple AJAX, an easy AJAX interface based on post requests. The grocery list from the client is examined for errors. If all is well, the database containing the grocery list is opened, a backup is made in case of unexpected errors, the new list is placed into the database, the database is saved, and a success message is returned to the client.