RowID

1. Objective

IvorySQL provides Oracle-compatible RowID functionality. RowID is a pseudo-column automatically generated by the database when a table is created, returning the address of each row in the database.

RowID should have the following characteristics:

1. Logically identifies each row with a unique value

2. Allows quick querying and modification of other columns in the table via ROWID, but cannot be inserted or modified itself

3. Users can control whether this feature is enabled

2. Implementation Principles

In IvorySQL, the system column ctid represents the physical location of a data row in a table, also known as the tuple identifier, which consists of a pair of values (block number and row index). The ctid allows for quick lookup of data rows in a table, behaving similarly to Oracle’s RowID. However, the ctid value may change (e.g., during UPDATE or VACUUM FULL), making it unsuitable as a long-term row identifier.

We chose a composite type consisting of the table’s OID and a sequence value as the RowID value, where the sequence is a system column. If the RowID functionality is enabled, a sequence named table-id_rowid_seq is created simultaneously with the table. Additionally, in the heap_form_tuple constructor, the length of HeapTupleHeaderData is increased by 8 bytes, and the td→t_infomask = HEAP_HASROWID bit is set to indicate the presence of the RowID.

When the RowID functionality is enabled via the GUC parameter, or by the WITH ROWID option in table creation, or by executing ALTER TABLE … SET WITH ROWID on an existing table, a sequence is created by adding a sequence creation command.

/*
		 * Build a CREATE SEQUENCE command to create the sequence object,
		 * and add it to the list of things to be done before this CREATE/ALTER TABLE
		 */
		seqstmt = makeNode(CreateSeqStmt);
		seqstmt->with_rowid = true;
		seqstmt->sequence = makeRangeVar(snamespace, sname, -1);
		seqstmt->options = lcons(makeDefElem("as",
							 (Node *) makeTypeNameFromOid(INT8OID, -1),
							 -1),
						 seqstmt->options);
		seqstmt->options = lcons(makeDefElem("nocache",
							 NULL,
							 -1),
						 seqstmt->options);

To enable fast querying of a row using the RowID pseudo-column, a UNIQUE index is automatically created on the RowID column of the table by default to provide efficient query performance.

The implementation of the RowID column as a system attribute column is achieved by adding a new system column in heap.c.

/*
 * Compatible Oracle ROWID pseudo column.
 */
static const FormData_pg_attribute a7 = {
	.attname = {"rowid"},
	.atttypid = ROWIDOID,
	.attlen = -1,
	.attnum = RowIdAttributeNumber,
	.attcacheoff = -1,
	.atttypmod = -1,
	.attbyval = false,
	.attalign = TYPALIGN_SHORT,
	.attstorage = TYPSTORAGE_PLAIN,
	.attnotnull = true,
	.attislocal = true,
};

A boolean field relhasrowid has been added to the pg_class system table to indicate whether the WITH ROWID option was specified during table creation. If the WITH ROWID option is included when creating a table, relhasrowid is set to t; otherwise, it is set to f. This value is also updated when a user executes the ALTER TABLE … SET WITH ROWID or ALTER TABLE … SET WITHOUT ROWID command.

       /* T if we generate ROWIDs for rows of rel */
	bool		relhasrowid BKI_DEFAULT(f);

Regarding RowID storage, if the RowID pseudo-column functionality is enabled, the heap_form_tuple function will add 8 bytes to the HeapTupleHeaderData to store the sequence value, depending on whether tdhasrowid in the TupleDesc parameter is true. In the heap_prepare_insert function, the nextval of the sequence is obtained and stored in the corresponding position in the HeapTupleHeader.

       if (relation->rd_rel->relhasrowid)
	{
		//  Get the sequence next value
		seqnum = nextval_internal(relation->rd_rowdSeqid, true);
		//  Set the HeapTupleHeader
		HeapTupleSetRowId(tup, seqnum);
	}