Data type limits and constraints

This section outlines the basic data types that are supported by Engine. It also provides a list of the practical and theoretical limits of each data type.

Maximum Width

Expressions

When calculating expressions, the width of the generated expression must not exceed 1023.

NSQL - Distinct

When performing a multiple distinct using the NSQL DISTINCT command, the combined width of the fields listed in the distinct clause must not exceed 1023.

Null Values

Null values are stored in Engine as -2130706432. When storing numeric data in Integer fields, if the number stored is -2130706432, the data will be stored as a null. By default, the null value for Real fields is the same as for integers. However, this could potentially cause a problem where the Real number value is a negative. To overcome this, for Engine build 3.1.4416.5 onwards, the NewNullRepresentation setting should be added to the SYSTEM section of the Cerberus.INI file; the null value for Real numbers will then be stored as -1.797693134862315807e308. It is recommended that this setting be added and used as default on all new databases. The setting should not be altered once data has been loaded.

Cardinality

Cardinality is a measure of how many discrete values are stored for a field. A field is stored as a high cardinality (generic) field if it has many values, and a low cardinality (discrete) field if it has few values.

Engine automatically determines whether to store a field as high or low cardinality based on the characteristics of the data and various configuration settings.

Discrete Threshold: This Cerberus.ini setting controls the number of discrete values that must be exceeded in order for a field to be stored as high-cardinality. By default this value is 32768. The maximum length it can be is 65535 and the minimum is 255. The setting is effective across all text and Unicode fields in the system.

It takes more memory to store a high-cardinality field than a low cardinality field.

Care should be taken when modifying this setting as it can adversely affect system performance if not modified correctly.

Index Status

Indexed fields require more storage than non-indexed fields. However, Engine has several levels of indexing that it applies as needed.

When a field is indexed, a basic level of index is applied. If the field is then used in operations that require further indexing (for example, in a Crosstab or a query) then further indexes will automatically be added by the processing of that function. So an indexed field's storage can grow still further based on its usage.

WideTextIndexMode:This Cerberus.ini setting specifies the type of storage used for variable length text and Unicode fields. The index for the field can be stored as either 32 or 64 bit, depending on the volume of the data. The default storage mechanism is 64 bit (variable 64). However, on systems where it is guaranteed that the size of the raw data file for the field will not exceed 4Gb, the index can be stored using the 32 bit (variable 32) system. This uses half as much space as the 64 bit system.

MemoTextLimit: This Cerberus.ini setting specifies the threshold at which text fields will be stored as a memo field.

TEXT_FIELD

Text fields can be stored using either fixed or variable format. The type of format used is determined by the MemoTextLimit Cerberus Configuration setting.

Fixed Width

For fixed width data, each field value occupies the same amount of storage space.

Variable Width (Memo Fields)

Storing data as a memo field can provide significant storage savings where the length of data stored in the field varies from record to record. If all the data is the same length, there is no benefit from storing the data as variable length.

Memo fields are particularly useful when storing webdata, or sparsely populated address fields.

If the width of the data is small (less than 10), there is no benefit to storing the data as a memo field.

Memo fields can be treated exactly the same as fixed width fields. The only difference between memo and fixed text fields is the storage method. The indexes for the two types of field are identical.

Array Fields

Array fields are based on text fields and can be stored as either fixed or variable width. To create an array field, create a text field, but specify the array field details using the format parameter.

The examples below show the same array field being created using a fixed width and then Variable width format.

An array field can contain up to 127 discrete pieces of information or segments.

Array fields can be used in Crosstabs, but cannot be inverted (i.e. array fields from the MANY side cannot be used in a Crosstab on the ONE side).

It is not possible to sort by array fields. When a data set is sorted by an array field, the sort is done using the underlying value, rather than the decoded values. This is not generally of any use.

For small widths, FARRAY takes less space than VARRAY as memo fields have to store the actual field length for each record.

In tests, the optimal width for using FARRAY was <10. For widths greater than 10, VARRAY was optimal.

FARRAY

The memory required to store an FARRAY field is the same as for a fixed width text field.

VARRAY

VARRAY format has an overhead on FARRAY format as it takes a minimum of an extra 4 bytes per record to store data in Memo format.

The memory required to store a VARRAY field is the same as for a variable width field and depends on the MemoTextLimit.

INTEGER

This is because integer values lower than the NULL value (-2130706432) are not displayed correctly.

bigint

Due to the increased storage BIGINT should only be used when the column data goes beyond the scope of an INTEGER

REAL

Maximum precision is 6 digits. Default precision is determined by the DefaultPrecision setting in Cerberus.INI.

DATE

Default Format for Queries and Expressions

The default date format for use in a query or expression is DDMMCCYY (Stored format = 1). This format is controlled via the SystemDateFormat setting in the Cerberus.INI file.

Default Format for Display and Data Output

The default date format for display is SystemDateFormat (DDMMCCYY). This determines the format in which date fields will be output. Each user of the system can specify their own date format by modifying the DateFormat setting in the USER section of Cerberus.ini.

TIME

Default Format for Queries and Expressions

The default time format for use in a query or expression is HHMMSS (stored format = 0). This format is controlled via the SystemTimeFormat setting in the Cerberus.INI file.

Default Format for Display and Data Output

The default time format for display is SystemTimeFormat (HHMMSS). This determines the format in which Time fields will be output. Each user of the system can specify his/her own time format by modifying the TimeFormat setting in the USER section of Cerberus.ini.

DATETIME

The default value for a DateTime field is 20000101 0000 (Midnight, 01 January 2000)

The earliest DateTime value that can be stored is 01011101 00000 (Midnight 01 January 1101)

The latest DateTime value that can be stored is 31129999 235959 (23:59:59on 31 Dec 9999)

Default Format for Queries and Expressions

The default DateTime format for use in a query or expression is CCYYMMDD HH:MM:SS (Stored format = 2). This format is controlled via the SystemDateTimeFormat setting in the Cerberus.INI file.

Default Format for Display and Data Output

The default DateTime format for display is SystemDateTimeFormat (CCYYMMDD HH:MM:SS). This determines the format in which DateTime fields will be output. Each user of the system can specify his/her own DateTime Format by modifying the DateTimeFormat setting in the USER section of Cerberus.ini.

	Online & Instructor-Led Courses \| Training Videos \| Webinar Recordings
	© Alterian. All Rights Reserved. \| Privacy Policy \| Legal Notice

Database reference guide

Data type limits and constraints

Maximum Width

Expressions

NSQL - Distinct

Null Values

Cardinality

Index Status

TEXT_FIELD

Fixed Width

Variable Width (Memo Fields)

Array Fields

FARRAY

VARRAY

INTEGER

bigint

REAL

DATE

Default Format for Queries and Expressions

Default Format for Display and Data Output

TIME

Default Format for Queries and Expressions

Default Format for Display and Data Output

DATETIME

Default Format for Queries and Expressions

Default Format for Display and Data Output