
— Introduction —

Aim & composition of this manual

This manual explains the functions of GridDB.

This manual is intended for administrators responsible for the operational management of GridDB, and for designers and developers who design and develop systems using GridDB.

The contents of this manual are as follows.

— What is GridDB? —

GridDB is a distributed NoSQL database for managing groups of data (known as rows), each made up of a key and multiple values. Besides an in-memory configuration that places all data in memory, it can also adopt a hybrid configuration that combines memory with disk (including SSD) storage. Thanks to the hybrid configuration, GridDB can also be used in small-scale systems with little memory.

In addition to the 3 Vs (volume, variety, velocity) required in big data solutions, data reliability/availability is also assured in GridDB. Using the autonomous node monitoring and load balancing functions, labor-saving can also be realized in cluster applications.

Features of GridDB

Big data (volume)

As the scale of a system expands, the data volume handled increases and thus the system needs to be expanded so as to quickly process the big data.

System expansion can be broadly divided into 2 approaches - scale-up (vertical scalability) and scale-out (horizontal scalability).

GridDB supports both approaches: in addition to scaling up, that is, reinforcing each operating node, the system can be expanded by scaling out, that is, by adding new nodes and incorporating them into an operating cluster.

As an in-memory processing database, GridDB can handle a large volume of data with its scale-out model. Data is distributed across the nodes of a cluster composed of multiple nodes; GridDB thus provides a large-scale in-memory database by treating the memories of multiple nodes as one big memory space.

Moreover, since GridDB manages data both in memory and on disk, even a single operating node can retain and access data larger than its memory size, realizing a large capacity that is not limited by memory size.

Combined use of in-memory/disk

System expansion can be carried out online with a scale-out approach. That is, without stopping the system in operation, the system can be expanded when the volume of data increases.

In the scale-out approach, data is relocated into the new nodes added to the system in accordance with the load of each existing node in the system. As GridDB will optimize the load balance, the application administrator does not need to worry about the data arrangement. Operation is also easy because a structure to automate such operations has been built into the system.

Scale-out model

Various data types (variety)

GridDB adopts a Key-Container data model that extends the Key-Value model. Data is stored in a container, a structure equivalent to an RDB table. (For easier understanding, a container can be thought of as an RDB table.)

When accessing data in GridDB, the Key-Value database structure allows data to be narrowed down by key, so processing can be carried out at high speed. The design task is to prepare containers, keyed appropriately, for each entity under management.
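As an illustration of the idea (a toy sketch, not the GridDB API), the Key-Container model can be pictured as a map of container names to containers, each of which maps a row key to its column values; the container names and values below are invented:

```python
# Toy illustration of the Key-Container model (not the GridDB API):
# a database maps container names to containers, and each container
# maps a row key to the remaining column values.

class Container:
    def __init__(self, columns):
        self.columns = columns   # schema: list of column names
        self.rows = {}           # row key -> tuple of the other column values

    def put(self, key, values):
        self.rows[key] = values

    def get(self, key):
        # A single hash lookup by key -- the source of the fast access
        # described above.
        return self.rows.get(key)

db = {}   # container name -> Container
db["sensor01"] = Container(["time", "temperature"])
db["sensor01"].put("2024-01-01T00:00:00Z", (21.5,))
print(db["sensor01"].get("2024-01-01T00:00:00Z"))   # (21.5,)
```

Because a row is located by one key lookup, access cost does not grow with the number of rows the way a full scan would.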

Data model

Besides being suitable for handling a large volume of time series data paired with its time of occurrence, such as data generated by sensors (TimeSeries container), GridDB can also register spatial data such as position information in a container and carry out space-specific operations (e.g., spatial intersection) on it. The system also supports non-standard data such as arrays and BLOBs, so a wide variety of data can be handled.

High-speed processing (velocity)

A variety of architectural features are embedded in GridDB to achieve high-speed processing.

Processing is carried out in the memory space as much as possible

When a system operates fully in-memory, with all data arranged in memory, there is little need to worry about disk access overhead. However, to process a volume of data too large to fit in memory, the data accessed by the application must be localized, and access to data arranged on disk must be reduced as much as possible.

To localize data access from an application, GridDB provides a function to arrange related data in the same block as far as possible. Since data in a block can be consolidated according to hints provided with the data, the memory hit rate during data access rises, increasing processing speed. By setting hints for memory consolidation according to the application's access frequency and access pattern, limited memory space can be used effectively (affinity function).
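The effect of affinity hints can be sketched as follows: rows sharing a hint are packed into the same fixed-size block, so related data is read together. The block capacity and hint names are invented for illustration, not taken from GridDB:

```python
# Sketch of data affinity: rows that share an affinity hint are packed
# into the same fixed-size block, so a single block read serves many
# related accesses. Block capacity and hint names are invented.

from collections import defaultdict

BLOCK_CAPACITY = 4   # rows per block (illustrative only)

def place_rows(rows):
    """rows: iterable of (affinity_hint, row). Returns (hint, rows) blocks."""
    groups = defaultdict(list)
    for hint, row in rows:
        groups[hint].append(row)
    blocks = []
    for hint, group in groups.items():
        for i in range(0, len(group), BLOCK_CAPACITY):
            blocks.append((hint, group[i:i + BLOCK_CAPACITY]))
    return blocks

rows = [("sensorA", i) for i in range(6)] + [("sensorB", i) for i in range(3)]
for hint, block in place_rows(rows):
    print(hint, block)
```

Here the six "sensorA" rows fill one block and spill into a second, while the "sensorB" rows end up together in their own block, so an application reading one sensor's data touches few blocks.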

Reduces the overhead

To minimize waiting time caused by locks or latches during simultaneous access to the database, GridDB allocates exclusive memory and DB files to each CPU core/thread, eliminating waits for exclusion and synchronization processing.

Architecture

In addition, direct access between the client and a node is possible: the client library caches the data arrangement when accessing the database for the first time. Since the client can then access the target data directly, without going through the master node that manages the cluster's operating status and data arrangement, access concentration on the master node is avoided and communication costs are reduced substantially.
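A minimal sketch of this client-side routing follows. The hash function, partition count, and node addresses are all invented for illustration; the point is that once the partition table has been cached, a container name resolves to an owner address locally, without another master round-trip:

```python
# Sketch of client-side routing: after fetching the partition table once
# from the master node, the client resolves the owner node locally.
# The hash function, partition count, and addresses are all invented.

import zlib

PARTITION_COUNT = 128

# partition number -> owner node address, cached on the client
partition_table = {p: "192.168.0.%d" % (1 + p % 4) for p in range(PARTITION_COUNT)}

def route(container_name):
    """Map a container name to (partition, owner address) without
    contacting the master node again."""
    partition = zlib.crc32(container_name.encode()) % PARTITION_COUNT
    return partition, partition_table[partition]

print(route("sensor01"))
```

The cache must of course be refreshed when the arrangement changes; the client failover behavior described later handles the stale-cache case.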

Access from a client

Processing in parallel

GridDB achieves high-speed processing through parallelism: a request is divided into processing units that can be executed in parallel by the database engine's threads, both within a node and across nodes, and a single large set of data is dispersed across multiple nodes (partitioning) so that processing is carried out in parallel between nodes.
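This scatter-gather pattern can be sketched as follows: an aggregation is split into per-partition tasks that run on a thread pool and are merged at the end. The data and partition split are illustrative, not GridDB internals:

```python
# Sketch of scatter-gather: one aggregation request is divided into
# per-partition tasks executed on a thread pool, and the partial
# results are merged. The data and partition split are illustrative.

from concurrent.futures import ThreadPoolExecutor

# 100 rows dispersed over 4 data partitions
partitions = [list(range(i, 100, 4)) for i in range(4)]

def partial_sum(rows):
    return sum(rows)   # work done independently per partition

with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, partitions))

print(total)   # 4950, the same as summing all 100 rows serially
```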

Reliability/availability

Data is duplicated within a cluster, and the duplicated data (replicas) are placed on multiple nodes. A replica set consists of the master data, called the owner replica, and duplicated data called backups. By using these replicas, processing can continue on any of the nodes constituting the cluster even when a failure occurs. No special operating procedure is necessary, because the system automatically re-arranges the data after a node failure (autonomous data arrangement): data placed on a failed node is restored from a replica, and the data is then re-arranged so that the configured number of replicas is reached automatically.
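A simplified sketch of owner/backup placement and the automatic re-arrangement after a node failure is shown below. The round-robin placement rule is a stand-in for illustration, not GridDB's actual algorithm:

```python
# Sketch of owner/backup placement and autonomous re-arrangement.
# The round-robin placement rule is a stand-in, not GridDB's algorithm.

def place(partition, nodes, replica_count):
    """Choose an owner and (replica_count - 1) backups on distinct nodes."""
    start = partition % len(nodes)
    chosen = [nodes[(start + i) % len(nodes)] for i in range(replica_count)]
    return {"owner": chosen[0], "backups": chosen[1:]}

def recover(placement, failed, nodes):
    """Promote a backup if the owner failed, then restore the replica count."""
    if placement["owner"] == failed:
        placement["owner"] = placement["backups"].pop(0)
    placement["backups"] = [b for b in placement["backups"] if b != failed]
    for n in nodes:
        if n != failed and n != placement["owner"] and n not in placement["backups"]:
            placement["backups"].append(n)   # new backup to restore the count
            break
    return placement

nodes = ["node1", "node2", "node3", "node4"]
p = place(0, nodes, 2)
print(p)                           # {'owner': 'node1', 'backups': ['node2']}
print(recover(p, "node1", nodes))  # owner promoted, a new backup added
```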

Replicas can be multiplexed (duplex, triplex, or more) according to the availability requirements.

Each node persists data update information to disk, so even if a failure occurs in the cluster system, all data registered and updated up to the failure can be restored without loss.

In addition, since the client also possesses cache information on the data arrangement and management, upon detecting a node failure, it will automatically perform a failover and data access can be continued using a replica.
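Client failover can be sketched as a retry loop over the replicas of a partition; the node names and error type below are invented:

```python
# Sketch of client failover: on a connection failure the client retries
# the same request against a backup replica. Names are invented.

class NodeDown(Exception):
    pass

def query(node, live_nodes):
    """Stand-in for a network request to one node."""
    if node not in live_nodes:
        raise NodeDown(node)
    return "result from " + node

def query_with_failover(owner, backups, live_nodes):
    for node in [owner] + backups:
        try:
            return query(node, live_nodes)
        except NodeDown:
            continue   # fail over to the next replica
    raise RuntimeError("all replicas unavailable")

# node1 (the owner) is down; the request succeeds via the backup
print(query_with_failover("node1", ["node2"], live_nodes={"node2"}))
```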

High availability

GridDB Editions

GridDB has the following product.

In addition to the features described in Features of GridDB above, GridDB Enterprise Edition has the following two features:

GridDB editions

The features of each interface are as follows.

When using GridDB, both NoSQL I/F and NewSQL I/F can be used depending on the use case.

Use case

The GridDB database and NoSQL/NewSQL interface of GridDB are compatible within the same major version (e.g., a minor version upgrade). The version notation is as follows:

When using both the NoSQL I/F and the NewSQL I/F in GridDB AE, check the following specifications in advance.

— Terminology —

This section lists and describes the terms used in GridDB.

Term Description
Node Refers to the individual server process to perform data management in GridDB.
Cluster Single or a set of nodes that perform data management together in an integrated manner.
Master node Node to perform a cluster management process.
Follower node All other nodes in the cluster other than the master node.
number of nodes constituting a cluster Refers to the number of nodes constituting a GridDB cluster. When starting GridDB for the first time, the number is used as a threshold value for the cluster to be valid. (Cluster service is started when the number of nodes constituting a cluster joins the cluster.)
number of nodes already participating in a cluster Number of nodes currently in operation that have been incorporated into the cluster among the nodes constituting the GridDB cluster.
Block A block is the smallest physical unit of data management in GridDB and the unit of data persistence to disk (this persistence processing is hereinafter referred to as a checkpoint). Multiple container data are arranged in a block. Block size is set in a definition file (cluster definition file) before the initial startup of GridDB.
Partition A partition is a unit of data management for placing a container and is equivalent to a data file on the file system when persisting data to a disk. One partition corresponds to one data file. It is also the smallest unit of data placement between clusters, as well as a unit of data movement and copy for adjusting the load balance between nodes (rebalancing) and for managing data multiplexing (replicas) in the event of a failure.
Row Refers to one row of data registered in a container or table. Multiple rows are registered in a container or table. A row consists of values of columns corresponding to the schema definition of the container (table).
Container (Table) A container manages a set of rows. It is called a container when operated through the NoSQL I/F and a table when operated through the NewSQL I/F; the two names refer to the same object. There are two container types: collection and timeseries container.
Collection (table) One type of container (table) to manage rows having a general key.
Timeseries container (timeseries table) One type of container (table) to manage rows having a timeseries key. Possesses a special function to handle timeseries data.
Database file A database file is a group of files where the data retained by nodes configuring a cluster is written to disks or SSDs and is persisted. A database file is a collective term for data files, checkpoint log files, and transaction log files.
Data file A file to which partition data is written. Updated information located on the memory is reflected at the interval (/checkpoint/checkpointInterval) specified in the node definition file.
Checkpoint log file This is a file for storing block management information for a partition. Block management information is written in smaller batches at the interval (/checkpoint/checkpointInterval) specified in the node definition file.
Transaction log file Update information of the transaction is saved sequentially as a log.
LSN (Log Sequence Number) Shows the update log sequence number, which is assigned to each partition during the update in a transaction. The master node of a cluster configuration maintains the maximum number of LSN (MAXLSN) of all the partitions maintained by each node.
Replica A replica is an exact copy of partition data. One or more replicas are created and stored on multiple nodes, so that partitions are multiplexed across the nodes. There are two forms of replica: the owner, which is the original (master) data, and the backup, which is referenced in case of failure.
Owner node A node that can update a container in a partition. A node that records the container serving as a master among the replicated containers.
Backup node A node that records the container for backup data among the replicated containers.
Definition file Definition file includes two types of parameter files: gs_cluster.json, hereinafter referred to as a cluster definition file, used when composing a cluster; gs_node.json, hereinafter referred to as a node definition file, used to set the operations and resources of the node in a cluster. It also includes a user definition file for GridDB administrator users.
Event log file Event logs of the GridDB server, including messages such as errors and warnings, are saved in this file.
OS user (gsadm) An OS user has the right to execute operating functions in GridDB. An OS user named gsadm is created during the GridDB installation.
Administrator user An administrator user is a GridDB user prepared to perform operations in GridDB.
General user A user used in the application system.
user definition file File in which an administrator user is registered. During initial installation, 2 administrators, system and admin, are registered.
Cluster database General term for all databases that can be accessed in a GridDB cluster system.
Database Theoretical data management unit created in a cluster database. A public database is created in a cluster database by default. Data separation can be realized for each user by creating a new database and giving a general user the right to use it.
Full backup A backup of the cluster database currently in use is stored online in the backup directory specified in the node definition file.
Differential/incremental backup A backup of the cluster database currently in use is stored online in the backup directory specified in the node definition file. In subsequent backups, only the difference in the update block after the backup is backed up.
Automatic log backup In addition to backing up the cluster database currently in use in the specified directory online, the transaction log is also automatically picked up at the same timing as the transaction log file writing. The write timing of the transaction log file follows the value of /dataStore/logWriteMode in the node definition file.
Failover When a failure occurs in a cluster currently in operation, the structure allows the backup node to automatically take over the function and continue with the processing.
Client failover When a failure occurs in a cluster in operation, the client API automatically re-connects to a backup node as a retry process, so that processing can continue.
Table partitioning Function to access a huge table quickly by distributing the placement of a large amount of table data across multiple nodes, which allows concurrent execution by the processors of multiple nodes and effective use of the memory of multiple nodes.
Data partition General name for the units of data storage created by table partitioning. Multiple data partitions are created for a table by table partitioning and are distributed to the nodes like normal containers. The number of data partitions and the range of data stored in each data partition depend on the type of table partitioning (hash, interval, or interval-hash).
Data Affinity A function to raise the memory hit rate by placing highly correlated data in a container in the same block and localizing data access.
Placement of container/table based on node affinity A function to reduce the network load during data access by placing highly correlated containers in the same node.

— Structure of GridDB —

Describes the cluster operating structure in GridDB.

Composition of a cluster

GridDB operates as a cluster composed of multiple nodes. Before the database can be accessed from an application system, the nodes must be started and the cluster must be composed, that is, the cluster service must be running.

A cluster is formed and the cluster service starts when the number of nodes specified by the user have joined the cluster. The cluster service does not start, and access from applications is not possible, until all the nodes constituting the cluster have joined.

A cluster needs to be constituted even when operating GridDB with a single node. In this case, the number of nodes constituting a cluster is 1. A composition that operates a single node is known as a single composition.

Cluster name and number of nodes constituting a cluster

A cluster name is used to distinguish a cluster from other clusters so as to compose a cluster using the right nodes selected from multiple GridDB nodes on a network. Using cluster names, multiple GridDB clusters can be composed in the same network. A cluster is composed of nodes with the following features in common: cluster name, the number of nodes constituting a cluster, and the connection method setting. A cluster name needs to be set in the cluster definition file for each node constituting a cluster, and needs to be specified as a parameter when composing a cluster as well.

The method of constituting a cluster using multicast is called multicast method. See Cluster configuration methods for details.

The operation of a cluster composition is shown below.

Operation of a cluster composition

To start a node and compose a cluster, the operation commands gs_startnode and gs_joincluster, or gs_sh, are used. In addition, a service control function can start the nodes at OS startup and compose the cluster.

To compose a cluster, the number of nodes joining a cluster (number of nodes constituting a cluster) and the cluster name must be the same for all the nodes joining the cluster.

Even if a node fails and is separated from the cluster after cluster operation has started, the cluster service continues as long as a majority of the constituting nodes remain joined to the cluster.

Since cluster operation continues as long as a majority of the nodes are in operation, a node can be separated from the cluster for maintenance while the cluster stays in service. The node can be brought back into the cluster over the network after maintenance, and nodes can also be added over the network to reinforce the system.

The network for communication within the cluster and the network dedicated to client communication can be separated. For details, refer to the GridDB Administrator Guide.

Status of node

Nodes have several statuses. A node's status changes through user command execution or internal processing of the node, and the status of a cluster is determined by the statuses of the nodes composing it.

This section explains types of node status, status transition, and how to check the node status.

Status of cluster

The cluster's operating status is determined by the states of its nodes and is one of three states: IN OPERATION, INTERRUPTED, or STOPPED.

During the initial system construction, cluster service starts after all the nodes, the number of which was specified by the user as the number of nodes constituting a cluster, have joined the cluster.

During initial cluster construction, the state in which the cluster is waiting to be composed, because not all of its constituting nodes have joined yet, is known as [INIT_WAIT]. When the specified number of nodes has joined the cluster, the state automatically changes to the operating state.

Operation status includes two states, [STABLE] and [UNSTABLE].

A cluster can be operated in an [UNSTABLE] state as long as a majority of the nodes are in operation even if some nodes are detached from a cluster due to maintenance and for other reasons.

The cluster service is interrupted automatically, in order to avoid a split brain, when the number of nodes joining the cluster falls below half the number of nodes constituting the cluster. The cluster status then becomes [WAIT].

To resume the cluster service from a [WAIT] state, add the node, which recovered from the abnormal state, or add a new node, by using a node addition operation. After the cluster is joined by all the nodes, the number of which is the same as the one specified in “the number of nodes constituting a cluster”, the status will be [STABLE], and the service will be resumed.

Even after the cluster service has been interrupted because node failures reduced the number of joined nodes to less than half of the constituting nodes, the cluster service restarts automatically once a majority of the nodes join the cluster again, through the addition of new nodes and/or nodes recovered from errors.
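The status determination described above can be summarized as a small function of the designated and active node counts (a sketch of the rule, not GridDB code):

```python
# STABLE: all designated nodes joined; UNSTABLE: a strict majority joined;
# WAIT: half or more lost, so the cluster service is interrupted.

def cluster_status(designated, active):
    if active == designated:
        return "STABLE"
    if active * 2 > designated:   # strict majority still joined
        return "UNSTABLE"
    return "WAIT"

print(cluster_status(4, 4))   # STABLE
print(cluster_status(4, 3))   # UNSTABLE
print(cluster_status(4, 2))   # WAIT: exactly half is not a majority
```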

Cluster status

A STABLE state is a state in which the value of the json parameter shown in gs_stat, /cluster/activeCount, is equal to the value of /cluster/designatedCount. (Output content varies depending on the version.)

$ gs_stat -u admin/admin
{
    "checkpoint": {
          :
          :
    },
    "cluster": {
        "activeCount":4,                       // Nodes in operation within the cluster
        "clusterName": "test-cluster",
        "clusterStatus": "MASTER",
        "designatedCount": 4,                  // Number of nodes constituting a cluster
        "loadBalancer": "ACTIVE",
        "master": {
            "address": "192.168.0.1",
            "port": 10040
        },
        "nodeList": [                          // Node list constituting a cluster
            {
                "address": "192.168.0.1",
                "port": 10040
            },
            {
                "address": "192.168.0.2",
                "port": 10040
            },
            {
                "address": "192.168.0.3",
                "port": 10040
            },
            {
                "address": "192.168.0.4",
                "port": 10040
            },

        ],
        :
        :

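The STABLE check can be automated by parsing the gs_stat output; here the JSON is embedded as a string for illustration, but in practice it would be the stdout of `gs_stat -u admin/admin`:

```python
# Sketch: checking the STABLE condition from gs_stat output. Here the
# JSON is embedded as a string; in practice it would be the stdout of
# `gs_stat -u admin/admin`.

import json

stat = json.loads("""
{
  "cluster": {
    "activeCount": 4,
    "designatedCount": 4,
    "clusterName": "test-cluster"
  }
}
""")

cluster = stat["cluster"]
stable = cluster["activeCount"] == cluster["designatedCount"]
print("STABLE" if stable else "NOT STABLE")   # STABLE
```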
The status of the cluster can be checked with gs_sh or gs_admin. An example on checking the cluster status with gs_sh is shown below.

$ gs_sh
gs> setuser admin admin gsadm                  //Setting a connecting user
gs> setnode node1 192.168.0.1 10040            //Definition of a node constituting the cluster
gs> setnode node2 192.168.0.2 10040
gs> setnode node3 192.168.0.3 10040
gs> setnode node4 192.168.0.4 10040
gs> setcluster cluster1 test150 239.0.0.5 31999 $node1 $node2 $node3 $node4   //Cluster definition
gs> startnode $cluster1                        //Start-up of all nodes making up the cluster
gs> startcluster $cluster1                     //Instructing cluster composition
Waiting for cluster to start.
The GridDB cluster has been started.
gs> configcluster  $cluster1                      // Checking status of cluster
Name                  : cluster1
ClusterName           : test150
Designated Node Count : 4
Active Node Count     : 4
ClusterStatus         : SERVICE_STABLE            // Stable state

Nodes:
  Name    Role Host:Port              Status
-------------------------------------------------
  node1     M  192.168.0.1:10040    SERVICING
  node2     F  192.168.0.2:10040    SERVICING
  node3     F  192.168.0.3:10040    SERVICING
  node4     F  192.168.0.4:10040    SERVICING

gs> leavecluster $node2
Waiting for a node to separate from cluster.
The GridDB node has leaved the GridDB cluster.
gs> configcluster  $cluster1
Name                  : cluster1
ClusterName           : test150
Designated Node Count : 4
Active Node Count     : 3
ClusterStatus         : SERVICE_UNSTABLE          // Unstable state

Nodes:
  Name    Role Host:Port              Status
-------------------------------------------------
  node1     M  192.168.0.1:10040    SERVICING        // Master node
  node2     -  192.168.0.2:10040    STARTED          
  node3     F  192.168.0.3:10040    SERVICING        // Follower node
  node4     F  192.168.0.4:10040    SERVICING        // Follower node

Status of partition

The partition status represents the status of the partitions across the whole cluster, showing whether the partitions of an operating cluster are accessible and whether they are balanced.

Partition status Description
NORMAL All the partitions are in normal states where all of them are placed as planned.
NOT_BALANCE With no replica_loss, no owner_loss but partition placement is unbalanced.
REPLICA_LOSS Replica data is missing in some partitions
(Availability of the partition is reduced, that is, the node cannot be detached from the cluster.)
OWNER_LOSS Owner data is missing in some partitions.
(The data of the partition are not accessible.)
INITIAL The initial state, in which no partition has joined the cluster

Partition status can be checked by executing the gs_stat command against the master node. (The state appears as the value of /cluster/partitionStatus.)

$ gs_stat -u admin/admin
{
    :
    :
"cluster": {
    :
    "nodeStatus": "ACTIVE",
    "notificationMode": "MULTICAST",
    "partitionStatus": "NORMAL",
    :


Cluster configuration methods

A cluster consists of one or more nodes connected in a network. Each node maintains a list of the other nodes’ addresses for communication purposes.

GridDB supports three cluster configuration methods for configuring the address list. The method to use can be chosen depending on the environment and use case; the connection method of clients and operational tools may also differ depending on the configuration method.

Three cluster configuration methods are available: Multicast method, Fixed list method and Provider method. Multicast method is recommended.

Fixed list or provider method can be used in the environment where multicast is not supported.

The table below compares the three cluster configuration methods.

Parameters: Multicast method - multicast address and port / Fixed list method - list of the IP addresses and ports of all nodes / Provider method - URL of the address provider
Use case: Multicast method - when multicast is supported / Fixed list method - when multicast is not supported and the system scale can be estimated accurately / Provider method - when multicast is not supported and the system scale cannot be estimated
Cluster operation: Multicast method - performs automatic discovery of nodes at a specified time interval / Fixed list method - sets a common address list for all nodes and reads that list only once at node startup / Provider method - obtains the address list from the address provider at a specified time interval
Pros: Multicast method - no need to restart the cluster when adding nodes / Fixed list method - configuration mistakes are prevented by a consistency check of the list / Provider method - no need to restart the cluster when adding nodes
Cons: Multicast method - multicast is required for client connection / Fixed list method - the cluster must be restarted and the client connection settings updated when adding nodes / Provider method - the availability of the address provider must be ensured

Setting up cluster configuration files

Fixed list method or provider method can be used in the environment where multicast is not supported. Network setting of fixed list method and provider method is as follows.

Fixed list method

When a fixed address list is given to start a node, the list is used to compose the cluster.

When composing a cluster using the fixed list method, configure the parameters in the cluster definition file.

cluster definition file

Parameter JSON Data type Description
/cluster/notificationMember string Specify the address list when using the fixed list method as the cluster configuration method.

A configuration example of a cluster definition file is shown below.

{
                             :
                             :
    "cluster":{
        "clusterName":"yourClusterName",
        "replicationNum":2,
        "heartbeatInterval":"5s",
        "loadbalanceCheckInterval":"180s",
        "notificationMember": [
            {
                "cluster": {"address":"172.17.0.44", "port":10010},
                "sync": {"address":"172.17.0.44", "port":10020},
                "system": {"address":"172.17.0.44", "port":10040},
                "transaction": {"address":"172.17.0.44", "port":10001},
                "sql": {"address":"172.17.0.44", "port":20001}
            },
            {
                "cluster": {"address":"172.17.0.45", "port":10010},
                "sync": {"address":"172.17.0.45", "port":10020},
                "system": {"address":"172.17.0.45", "port":10040},
                "transaction": {"address":"172.17.0.45", "port":10001},
                "sql": {"address":"172.17.0.45", "port":20001}
            },
            {
                "cluster": {"address":"172.17.0.46", "port":10010},
                "sync": {"address":"172.17.0.46", "port":10020},
                "system": {"address":"172.17.0.46", "port":10040},
                "transaction": {"address":"172.17.0.46", "port":10001},
                "sql": {"address":"172.17.0.46", "port":20001}
            }
        ]
    },
                             :
                             :
}

Provider method

Get the address list supplied by the address provider to perform cluster configuration.

When composing a cluster using the provider method, configure the parameters in the cluster definition file.

cluster definition file

Parameter JSON Data type Description
/cluster/notificationProvider/url string Specify the URL of the address provider when using the provider method as the cluster configuration method.
/cluster/notificationProvider/updateInterval string Specify the interval for obtaining the list from the address provider. Specify a value of at least 1 second and less than 2^31 seconds.

A configuration example of a cluster definition file is shown below.

{
                             :
                             :
    "cluster":{
        "clusterName":"yourClusterName",
        "replicationNum":2,
        "heartbeatInterval":"5s",
        "loadbalanceCheckInterval":"180s",
        "notificationProvider":{
            "url":"http://example.com/notification/provider",
            "updateInterval":"30s"
        }
    },
                             :
                             :
}

The address provider can be configured as a web service or as static content. The address provider needs to satisfy the following specifications.

An example of a response sent from the address provider is as follows.

$ curl http://example.com/notification/provider
[
    {
        "cluster": {"address":"172.17.0.44", "port":10010},
        "sync": {"address":"172.17.0.44", "port":10020},
        "system": {"address":"172.17.0.44", "port":10040},
        "transaction": {"address":"172.17.0.44", "port":10001},
        "sql": {"address":"172.17.0.44", "port":20001}
    },
    {
        "cluster": {"address":"172.17.0.45", "port":10010},
        "sync": {"address":"172.17.0.45", "port":10020},
        "system": {"address":"172.17.0.45", "port":10040},
        "transaction": {"address":"172.17.0.45", "port":10001},
        "sql": {"address":"172.17.0.45", "port":20001}
    },
    {
        "cluster": {"address":"172.17.0.46", "port":10010},
        "sync": {"address":"172.17.0.46", "port":10020},
        "system": {"address":"172.17.0.46", "port":10040},
        "transaction": {"address":"172.17.0.46", "port":10001},
        "sql": {"address":"172.17.0.46", "port":20001}
    }
]
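A client-side sanity check of a provider response might look like the following sketch; the required structure mirrors the example above, while the validation rules themselves are an assumption:

```python
# Sketch of a client-side sanity check for an address-provider response:
# every entry must carry an address and port for each of the five
# services shown in the example above. The rules are an assumption.

SERVICES = ("cluster", "sync", "system", "transaction", "sql")

def validate(entries):
    for entry in entries:
        for service in SERVICES:
            info = entry[service]   # KeyError if a service is missing
            assert isinstance(info["address"], str)
            assert 0 < info["port"] < 65536
    return True

entries = [{
    "cluster": {"address": "172.17.0.44", "port": 10010},
    "sync": {"address": "172.17.0.44", "port": 10020},
    "system": {"address": "172.17.0.44", "port": 10040},
    "transaction": {"address": "172.17.0.44", "port": 10001},
    "sql": {"address": "172.17.0.44", "port": 20001},
}]
print(validate(entries))   # True
```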


— Data model —

GridDB adopts a unique Key-Container data model that resembles Key-Value. It has the following features.

Data model

GridDB manages data in blocks, containers, tables, rows, and partitions.

Data management unit

Container

To register and search for data in GridDB, a container (table) needs to be created to store the data. A container is the data structure serving as the interface with the user and manages a set of rows. It is called a container when operated through the NoSQL I/F and a table when operated through the NewSQL I/F.

The naming rules for containers (tables) are the same as those for databases.

[Notes]

Type

There are 2 container (table) data types. A timeseries container (timeseries table) is a data type suited to managing data together with its occurrence time, such as periodically sampled sensor data, while a collection (table) is suited to managing a variety of general data.

Data type

The schema can be set in a container (table). The data types that can be registered in a container (table) are basic data types and array data types.

Basic data types

The basic data types that can be registered in a container (table) are described below. A basic data type cannot be expressed as a combination of other data types.

| Data type | Description |
|-----------|-------------|
| BOOL | True or false |
| STRING | Composed of an arbitrary number of characters using Unicode code points |
| BYTE | Integer value from -2^7 to 2^7-1 (8 bits) |
| SHORT | Integer value from -2^15 to 2^15-1 (16 bits) |
| INTEGER | Integer value from -2^31 to 2^31-1 (32 bits) |
| LONG | Integer value from -2^63 to 2^63-1 (64 bits) |
| FLOAT | Single-precision (32-bit) floating point number defined in IEEE 754 |
| DOUBLE | Double-precision (64-bit) floating point number defined in IEEE 754 |
| TIMESTAMP | Data type expressing date and time; stored in UTC with millisecond accuracy |
| GEOMETRY | Data type to represent a spatial structure |
| BLOB | Data type for binary data such as images and audio |

The following restrictions apply to the size of data that can be managed as STRING, GEOMETRY, or BLOB. The limit varies with the block size, the input/output unit of the database, set in the GridDB definition file (gs_node.json).

| Data type | Block size (64KB) | Block size (1MB to 32MB) |
|-----------|-------------------|--------------------------|
| STRING | Maximum 31KB (as UTF-8 encoded) | Maximum 128KB (as UTF-8 encoded) |
| GEOMETRY | Maximum 31KB (in the internal storage format) | Maximum 128KB (in the internal storage format) |
| BLOB | Maximum 1GB - 1Byte | Maximum 1GB - 1Byte |

GEOMETRY-type (Spatial-type)

GEOMETRY-type (spatial) data is often used in map information systems. It is available only through the NoSQL interface and is not supported by the NewSQL interface.

GEOMETRY type data is described using WKT (Well-known text). WKT is formulated by the Open Geospatial Consortium (OGC), a nonprofit organization promoting the standardization of geospatial information. In GridDB, spatial information described in WKT can be stored in a column by setting the column of a container to the GEOMETRY type.

GEOMETRY type supports the following WKT forms.

A spatial structure written with QUADRATICSURFACE cannot be stored in a container; it can only be specified as a search condition.

Operations using GEOMETRY can be executed with API or TQL.

With TQL, two- and three-dimensional spatial structures can be managed. Generation and judgment functions are also provided.

 SELECT * WHERE ST_MBRIntersects(geom, ST_GeomFromText('POLYGON((0 0,10 0,10 10,0 10,0 0))'))

See “GridDB TQL Reference” for details of the functions of TQL.

HYBRID

A data type composed of a combination of basic data types that can be registered in a container. The only hybrid data type in the current version is an array.

[Note]

The following restrictions apply to TQL operations in an array column.

Primary key

A primary key can be set in a container (table). The uniqueness of rows is guaranteed for the column set as ROWKEY. NULL is not allowed in the column for which ROWKEY is set.

In the NewSQL I/F, ROWKEY is called PRIMARY KEY.

A default index, determined in advance by the column data type, can be set on the column specified as ROWKEY (PRIMARY KEY).

In the current version of GridDB, the default index for a STRING, INTEGER, LONG, or TIMESTAMP column that can be specified as ROWKEY (PRIMARY KEY) is the TREE index.

[Notes]

View

A view provides a reference to the data in a container.

Define a reference (SELECT statement) to a container when creating a view. A view is an object similar to a container, but it does not have real data. When executing a query containing a view, the SELECT statement, which was defined when the view was created, is evaluated, and a result is returned.

Views can only be referenced (SELECT); adding data (INSERT), updating (UPDATE), and deleting data (DELETE) are not accepted.

[Notes]

— Database function —

Resource management

Besides the database residing in memory, the resources constituting a GridDB cluster are persisted to disk. The persisted resources are listed below.

A group of data files

The placement of these resources is defined by GridDB home (the path specified in the environment variable GS_HOME). In the initial installation state, the /var/lib/gridstore directory is GridDB home, and the initial data of each resource is placed under this directory.

The directories are placed initially as follows.

/var/lib/gridstore/        # GridDB home directory path
     admin/                # gs_admin home directory
     backup/               # Backup directory
     conf/                 # Definition files directory
          gs_cluster.json  # Cluster definition file
          gs_node.json     # Node definition file
          password         # User definition file
     data/                 # data files and checkpoint log directory
     txnlog/               # Transaction log directory
     expimp/               # Export/Import directory
     log/                  # Log directory

The location of GridDB home can be changed by setting the .bash_profile file of the OS user gsadm. If you change the location, please also move resources in the above directory accordingly.

The .bash_profile file contains two environment variables, GS_HOME and GS_LOG.

vi .bash_profile

# GridStore specific environment variables
GS_LOG=/var/lib/gridstore/log
export GS_LOG
GS_HOME=/var/lib/gridstore                    // GridDB home directory path
export GS_HOME

The database directory, backup directory and server event log directory can be changed by changing the settings of the node definition file as well.

See Parameters for the contents that can be set in the cluster definition file and node definition file.

Data access function

To access GridDB data, an application needs to be developed using the NoSQL I/F or NewSQL I/F. Data can be accessed simply by connecting to the GridDB cluster database, without knowing where in the cluster database the container or table is located. The application system does not need to consider which of the nodes constituting the cluster holds the container.

In the GridDB API, when first connecting to a cluster database, placement hint information of the container is retained (cached) on the client side together with the node information (partition).

Communication overhead is kept to a minimum because the client connects directly to the node holding the container, without querying the cluster for its location every time the application switches the container it uses.

Although container placement changes dynamically due to GridDB's rebalancing process, the client cache is updated regularly, so the client learns the container's new position. Even when a client access misses the correct node, due to a failure or a gap between the cache update timing and the rebalancing timing, the relocated information is automatically acquired and processing continues.

TQL and SQL

TQL and SQL-92 compliant SQL are supported as database access languages.

Batch-processing function to multiple containers

The NoSQL I/F provides an interface for quickly processing event information that occurs from moment to moment.

When a large volume of events is sent to the database server one at a time as each event occurs, the network load increases and system throughput does not rise; the impact is especially significant when the communication bandwidth is narrow. The NoSQL I/F therefore offers multi-processing: multiple row registrations to multiple containers, and multiple queries (TQL) to multiple containers, can be handled in a single request. Because the database server is accessed less frequently, overall system throughput rises.

An example is given below.

Multi-put
fetchAll
multi-get
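The effect of this batching can be sketched in Python. Here `multi_put`, the container names, and the row tuples are hypothetical stand-ins, not the actual GridDB API; the sketch merely counts network round trips to show why fewer server accesses raise throughput.

```python
def multi_put(batch):
    """Hypothetical batch call: accepts {container_name: [row, ...]} for
    several containers at once and costs a single network round trip."""
    for container, rows in batch.items():
        pass  # a real server would register every row of each container here
    return 1  # round trips consumed

def put_rows_individually(rows_by_container):
    """Row-at-a-time registration: round trips grow with the row count."""
    trips = 0
    for container, rows in rows_by_container.items():
        trips += len(rows)  # one request per row
    return trips

rows = {
    "sensor01": [(1, 20.5), (2, 20.7)],
    "sensor02": [(1, 18.2), (2, 18.4), (3, 18.3)],
}
print(put_rows_individually(rows))  # 5 round trips
print(multi_put(rows))              # 1 round trip
```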

Index function

A condition-based search can be processed quickly by creating an index for the columns of a container (table).

Two types of indexes are available: tree indexes (TREE) and spatial indexes (SPATIAL). The index that can be set differs depending on the container (table) type and column data type.

Although there is no restriction on the number of indexes that can be created in a container, index creation needs to be designed carefully. An index is updated whenever rows of the container are inserted, updated, or deleted; therefore, creating many indexes on columns of frequently updated rows degrades insert, update, and delete performance.

An index is created in a column as shown below.

[Note]

Function specific to time series data

To manage data produced frequently from sensors, data is placed according to the Time Series Data Placement Algorithm (TDPA), which makes the best use of memory. In a timeseries container (timeseries table), memory is allocated while internal data is classified by its periodicity. When hint information is given via the affinity function, placement efficiency rises further. Moreover, a timeseries container moves data out to disk when necessary and releases expired data at almost zero cost.

A timeseries container (timeseries table) has a TIMESTAMP ROWKEY (PRIMARY KEY).

Operation function of TQL

Aggregate operations

In a timeseries container (timeseries table), aggregation is performed with each data point weighted by the time interval of the sampled data. In other words, if the time interval is long, the calculation assumes the value persisted for that extended time.

The functions of the aggregate operation are as follows:

Aggregation of weighted values (TIME_AVG)
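The weighting idea behind TIME_AVG can be sketched as follows. This is an illustrative Python approximation, weighting each value by half the gap to its neighboring samples; it is not GridDB's exact internal formula.

```python
def time_avg(samples):
    """Time-weighted average over (timestamp, value) pairs sorted by time:
    each value is weighted by half the gap to its neighboring samples,
    so a value that persists for a long interval counts for more."""
    if len(samples) == 1:
        return samples[0][1]
    total = weight_sum = 0.0
    for i, (t, v) in enumerate(samples):
        left = t - samples[i - 1][0] if i > 0 else 0
        right = samples[i + 1][0] - t if i < len(samples) - 1 else 0
        w = (left + right) / 2
        total += v * w
        weight_sum += w
    return total / weight_sum

# the value 10 held for 100 time units dominates the short-lived value 100,
# so the result stays close to 10 while a plain average would give 40
print(time_avg([(0, 10), (100, 10), (101, 100)]))
```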

Selection/interpolation operation

Time data may deviate slightly from the expected time due to the timing of the collection and the contents of the data to be collected. Therefore when conducting a search using time data as a key, a function that allows data around the specified time to be acquired is also required.

The functions for searching the timeseries container (timeseries table) and acquiring the specified row are as follows:

In addition, the functions for interpolating the values of the columns are as follows:

Expiry release function

Expiry release is a function that physically deletes expired row data from GridDB. Before being physically deleted, expired data is made unavailable by being excluded as a target of search and delete operations. Deleting old, unused data keeps the database size down and prevents the performance degradation caused by database bloat.

Expiry release settings

The retention period is set in container units. A row outside the retention period is called "expired data." Expired data becomes unavailable, so APIs can no longer operate on it and applications cannot access the row. After a certain period, expired data becomes a target for physical deletion from GridDB; such cold data is removed from the database automatically.
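The visibility rule can be sketched as follows. The retention period and row timestamps are hypothetical, and in GridDB the expiry check is performed inside the database, not by the application.

```python
from datetime import datetime, timedelta

def is_expired(row_time, retention, now):
    """A row whose timestamp falls outside the retention period is
    'expired data': it is hidden from search and delete operations
    before being physically removed."""
    return now - row_time > retention

now = datetime(2024, 1, 31)
retention = timedelta(days=30)           # hypothetical 30-day retention
row_times = [datetime(2023, 12, 1),      # already expired (hidden)
             datetime(2024, 1, 15),
             datetime(2024, 1, 30)]
visible = [t for t in row_times if not is_expired(t, retention, now)]
print(len(visible))  # 2: the December row is no longer accessible
```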

Expiry release settings

Expiry release can be used in a partitioned container (table).

Partition expiry release

[Note]

Automatic deletion of cold data or save as an archive

The management information of the database files is scanned periodically, every second, and rows that have become cold data at the time of scanning are physically deleted. The scan covers 2000 blocks per execution; this amount can be set with /dataStore/batchScanNum in the node definition file (gs_node.json). In systems where data is registered frequently, the database size may keep increasing because automatic deletion cannot keep pace with registration; increase the scan amount to avoid this.

Table partitioning function

In order to improve the operation speed of applications connected to multiple nodes of the GridDB cluster, it is important to arrange the data to be processed in memory as much as possible. For a huge container (table) with a large number of registered data, the CPU and memory resources in multiple nodes can be effectively used by splitting data from the table and distributing the data across nodes. Distributed rows are stored in the internal containers called “data partition”. The allocation of each row to the data partition is determined by a “partitioning key” column specified at the time of the table creation.

GridDB supports hash partitioning, interval partitioning and interval-hash partitioning as table partitioning methods.

Creating and Deleting tables can be performed only through the NewSQL interface. Data registration, update and search can be performed through the NewSQL/NoSQL interface. (There are some restrictions. See TQL and SQL for the details.)

Table partitioning

Benefits of table partitioning

Dividing a large amount of data through table partitioning is effective for using memory efficiently and for improving the performance of data searches that can narrow down the target data.

The following describes the behavior for both cases: without table partitioning and with table partitioning.

When a large amount of data is stored in a single, non-partitioned table, all the required data may not fit in main memory, and performance may be degraded by frequent swap-in and swap-out between the database files and memory. The degradation is particularly significant when the amount of data is much larger than the memory size of a GridDB node. In addition, data access to that table concentrates on a single node, reducing the parallelism of database processing.

When not using table partitioning

By using a table partitioning, the large amount of data is divided into data partitions and those partitions are distributed on multiple nodes.

In data registration and search, only the data partitions necessary for the processing are loaded into memory; data partitions not targeted by the processing are not loaded. Therefore, in many cases the data size required by the processing is smaller than for a large, non-partitioned table, and the frequency of swap-in and swap-out decreases. By dividing the data equally into data partitions, the CPU and memory resources on each node can be used more effectively.

In addition, since data partitions are distributed across nodes, parallel data access becomes possible.

When using table partitioning

Hash partitioning

The rows are evenly distributed among the data partitions based on a hash value.

Hash partitioning is also effective for application systems that register data frequently, where access would otherwise concentrate at the end of the table and become a bottleneck. A hash function that returns an integer from 1 to N is defined by specifying the partitioning key column and the division count N, and rows are divided based on the returned value.
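The division rule can be sketched as follows; `zlib.crc32` is an illustrative stand-in for GridDB's internal hash function, and the key values are hypothetical.

```python
import zlib

def hash_partition(key, n):
    """Maps a partitioning-key value to a data partition numbered 1..N;
    zlib.crc32 stands in for the internal hash function."""
    return zlib.crc32(str(key).encode()) % n + 1

# successive keys spread over the 4 data partitions instead of
# concentrating at the end of the table
parts = {hash_partition(k, 4) for k in range(100)}
print(sorted(parts))
```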

Hash partitioning

Interval partitioning

In interval partitioning, the rows of a table are divided by the specified interval value and stored in data partitions. The range of each data partition (from the lower limit to the upper limit) is determined automatically by the interval value.

Since data in the same range is stored in the same data partition, operations on continuous data and range searches can be performed with less memory.
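The automatic range determination can be sketched as follows, assuming for simplicity that the partitioning key is numeric and the lower bounds are multiples of the interval value (an illustrative simplification).

```python
def interval_partition_range(value, interval):
    """Returns the (lower, upper) bounds of the data partition storing
    `value`; bounds are determined automatically from the interval value."""
    lower = (value // interval) * interval
    return lower, lower + interval

# with a hypothetical interval of 24 (e.g. hours per day),
# hour 30 lands in the partition covering [24, 48)
print(interval_partition_range(30, 24))   # (24, 48)
print(interval_partition_range(100, 24))  # (96, 120)
```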

Interval partitioning
Examples of interval partitioned table creation and deletion

Interval-hash partitioning

Interval-hash partitioning is a combination of interval partitioning and hash partitioning. The rows are first divided by interval partitioning, and each division is further divided by hash partitioning. The total number of data partitions is the interval division count multiplied by the hash division count.

Interval-hash partitioning

The rows are distributed appropriately across multiple nodes by applying hash partitioning on top of the interval partitioning. On the other hand, the number of data partitions increases, and with it the overhead of searching the whole table. Decide whether to use this partitioning by weighing data distribution against search overhead.
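The two-stage division can be sketched as follows. The hash function and the use of a separate key component for the hash stage are illustrative assumptions, not GridDB's exact scheme.

```python
import zlib

def interval_hash_partition(key_value, hash_component, interval, hash_count):
    """Two-stage division: an interval index derived from the partitioning
    key, then a hash subdivision; total partitions = intervals x hash_count."""
    interval_index = key_value // interval
    hash_index = zlib.crc32(str(hash_component).encode()) % hash_count
    return interval_index, hash_index

# 3 intervals x 2 hash divisions -> at most 6 data partitions
parts = {interval_hash_partition(t, device, 10, 2)
         for t in range(30) for device in range(10)}
print(len(parts))
```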

The basic functions of the interval-hash partitioning are the same as the functions of interval partitioning and the hash partitioning. The items specific for the interval-hash partitioning are as follows.

Selection criteria of table partitioning type

Hash, interval and interval-hash are supported as a type of table partitioning by GridDB.

A column used in search conditions or data access must be specified as the partitioning key for dividing the table. If a range width that divides the data equally can be determined for the values of the partitioning key, interval or interval-hash partitioning is suitable; otherwise, choose hash partitioning.

Data range

Transaction function

GridDB supports transaction processing on a container basis, with the ACID properties generally expected of transactions. The functions supported in transaction processing are explained in detail below.

Starting and ending a transaction

When a row search, update, or similar operation is carried out on a container, a new transaction starts; the transaction ends when the updated data is committed or aborted.

[Note]

The initial behavior of a transaction is autocommit.

In autocommit, a new transaction is started every time the application updates a container (adds, deletes, or revises data), and the transaction is committed automatically at the end of the operation. By turning autocommit off, a transaction can be committed or aborted at the timing the application requests.

A transaction may also terminate in an error due to a timeout, besides completing through a commit or abort. A transaction that times out is aborted. The transaction timeout is the elapsed time from the start of the transaction. Although its initial value is set in the definition file (gs_node.json), it can also be specified as a connection parameter per application.

Transaction consistency level

There are 2 types of transaction consistency levels, immediate consistency and eventual consistency. This can also be specified as a parameter when connecting to GridDB for each application. The default setting is immediate consistency.

Immediate consistency is valid in update operations and read operations. Eventual consistency is valid in read operations only. For applications which do not require the latest results to be read all the time, the reading performance improves when eventual consistency is specified.

Transaction isolation level

Consistency of the database contents needs to be maintained at all times. When multiple transactions execute simultaneously, the following events generally become issues.

In GridDB, "READ_COMMITTED" is supported as the transaction isolation level. In READ_COMMITTED, only the latest committed data is read.

When executing a transaction, care must be taken so that its results are not affected by other transactions. The isolation level is an indicator, from 1 to 4, of how isolated the executed transaction is from other transactions (the extent to which consistency can be maintained).

The 4 isolation levels and the corresponding possibility of an event raised as an issue occurring during simultaneous execution are as follows.

| Isolation level | Dirty read | Non-repeatable read | Phantom read |
|-----------------|------------|---------------------|--------------|
| READ_UNCOMMITTED | Possibility of occurrence | Possibility of occurrence | Possibility of occurrence |
| READ_COMMITTED | Safe | Possibility of occurrence | Possibility of occurrence |
| REPEATABLE_READ | Safe | Safe | Possibility of occurrence |
| SERIALIZABLE | Safe | Safe | Safe |

In READ_COMMITTED, if data read previously is read again, data different from the previous read may be acquired, and re-executing a query with the same search condition may return different results. This is because the data was updated and committed by another transaction after the previous read.

In GridDB, data that is being updated is isolated by MVCC.

MVCC

In order to realize READ_COMMITTED, GridDB has adopted “MVCC (Multi-Version Concurrency Control)”.

MVCC is a processing method in which a transaction querying the database refers to the data prior to the update, instead of the latest data being updated by another transaction. Since transactions can execute concurrently by referring to the pre-update data, system throughput improves.

When the transaction process under execution is committed, other transactions can also refer to the latest data.
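The read behavior can be sketched as follows. This minimal Python model keeps one committed version and one pending version per record, which is a deliberate simplification of real multi-version concurrency control.

```python
class MvccRecord:
    """Minimal MVCC sketch: readers keep seeing the last committed
    version while another transaction holds an uncommitted update."""
    def __init__(self, value):
        self.committed = value
        self.pending = None

    def begin_update(self, value):
        self.pending = value            # updater stages a new version

    def read(self):
        return self.committed           # concurrent readers: old version

    def commit(self):
        self.committed = self.pending   # update becomes visible to others
        self.pending = None

rec = MvccRecord("v1")
rec.begin_update("v2")
print(rec.read())  # v1: the update is not yet committed
rec.commit()
print(rec.read())  # v2: other transactions now see the latest data
```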

MVCC

Lock

A data lock mechanism maintains consistency when container update requests from multiple transactions conflict.

The lock granularity differs depending on the type of container. In addition, the lock range changes depending on the type of operation in the database.

Lock granularity

The lock granularity for each container type is as follows.

These lock granularities were determined based on a use-case analysis of each container type.

Lock range by database operations

Container operations are not limited to just data registration and deletion but also include schema changes accompanying a change in data structure, index creation to improve speed of access, and other operations. The lock range depends on either operations on the entire container or operations on specific rows in a container.

If there is competition in securing the lock, the subsequent transaction will be put in standby for securing the lock until the earlier transaction has been completed by a commit or rollback process and the lock is released.

A standby for securing a lock can also be cancelled by a timeout, besides ending when the transaction completes.
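The standby-with-timeout behavior can be sketched with a plain lock. Python's `threading.Lock` stands in for GridDB's row/container lock, and the timeout mirrors the timeout that cancels a waiting transaction.

```python
import threading

# a second "transaction" waits (stands by) for the lock held by the
# first one and gives up when its timeout elapses
row_lock = threading.Lock()

row_lock.acquire()                       # transaction 1 secures the lock
waited = row_lock.acquire(timeout=0.1)   # transaction 2 stands by
print(waited)                            # False: standby cancelled by timeout
row_lock.release()                       # transaction 1 commits/rolls back
print(row_lock.acquire(timeout=0.1))     # True: lock secured immediately
row_lock.release()
```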

Data persistence

Data registered or updated in a container or table is persisted to disk or SSD, protecting it from loss when a node failure occurs. There are two persistence processes: a checkpoint process that periodically saves updated in-memory data to data files and checkpoint log files block by block, and a transaction log process that sequentially writes updated data to transaction log files in sync with data updates.

To write to a transaction log, either one of the following settings can be made in the node definition file.

In "SYNC" mode, log writing is carried out synchronously every time an update transaction is committed or aborted. In "DELAYED_SYNC" mode, log writing for updates is carried out with a specified delay of several seconds, regardless of the update timing. The default value is "1 (DELAYED_SYNC 1 sec)".

When “SYNC” is specified, although the possibility of losing the latest update details when a node failure occurs is lower, the performance is affected in systems that are updated frequently.

On the other hand, if “DELAYED_SYNC” is specified, although the update performance improves, any update details that have not been written in the disk when a node failure occurs will be lost.

If there are 2 or more replicas in a cluster configuration, the possibility of losing the latest update details when a node failure occurs is low even in "DELAYED_SYNC" mode, since the other nodes hold replicas. Consider setting "DELAYED_SYNC" if the update frequency is high and performance is required.

In a checkpoint, updated blocks are written to the database files. The checkpoint process operates at a cycle set on a per-node basis; the checkpoint cycle is set by parameters in the node definition file. The initial value is 60 sec (1 minute).

By lengthening the checkpoint cycle, data persistence can be scheduled for time bands with relatively more spare capacity, for example by persisting data to disk at night. On the other hand, a longer cycle increases the number of transaction log files that must be rolled forward when a node is restarted after an abnormal stop, thereby increasing recovery time.

Checkpoint

Timeout process

The NoSQL I/F and the NewSQL I/F have different setting items for timeout processing.

NoSQL I/F timeout process

In the NoSQL I/F, 2 types of timeout can be notified to the application developer: the transaction timeout and the failover timeout. The former limits the processing time of a transaction; the latter limits the retry time of recovery processing when a failure occurs.

Both the transaction timeout and failover timeout can be set when connecting to a cluster using a GridDB object in the Java API or C API. See “GridDB Java API Reference” and “GridDB C API Reference” for details.

NewSQL I/F timeout process

There are 3 types of timeout as follows:

Replication function

Data replicas are created on a partition basis in accordance with the number of replications set by the user among multiple nodes constituting a cluster.

A process can be continued non-stop even when a node failure occurs by maintaining replicas of the data among scattered nodes. In the client API, when a node failure is detected, the client automatically switches access to another node where the replica is maintained.

The default number of replicas is 2, meaning data is held in duplicate when operating in a cluster configuration with multiple nodes.

When there is an update in a container, the owner node (the node having the master replica) among the replicated partitions is updated.

There are 2 ways of subsequently reflecting the updated details from the owner node in the backup node.

If performance is more important than availability, set the mode to asynchronous replication; if availability is more important, set it to quasi-synchronous replication.

[Note]

Affinity function

The affinity function is a function that links related data. GridDB provides two types of affinity functions, data affinity and node affinity.

Data Affinity

Data affinity has two functionalities: one is for grouping multiple containers (tables) together and placing them in a separate block, and the other is for placing each container (table) in a separate block.

Grouping multiple containers together and placing them in a separate block

This function groups containers (tables) placed in the same partition based on hint information and places each group in separate blocks. By storing only highly related data in each block, data access is localized, increasing the memory hit rate.

Hint information is provided as a property when creating a container (table). The characters that can be used in hint information are restricted, just as container (table) names follow naming rules.

Data with the same hint information is placed in the same block as much as possible. Hint information is set depending on rate of data updates and data reference. For example, consider the data structure when system data is registered, referenced or updated by the following operating method in a system that samples and refers to the data on a daily, monthly or annual basis in a monitoring system.

  1. Data in minutes is sent from the monitoring device and saved in the container created on a monitoring device basis.
  2. Since data reports are created daily, one day’s worth of data is aggregated from the data in minutes and saved in the daily container
  3. Since data reports are created monthly, daily container (table) data is aggregated and saved in the monthly container
  4. Since data reports are created annually, monthly container (table) data is aggregated and saved in the annual container
  5. The current space used (in minutes and days) is constantly updated and displayed in the display panel.

In GridDB, instead of a container occupying blocks exclusively, data registered at close points in time is placed in the same block. Therefore, the monthly aggregation data of step 3, produced by referring to the daily container (table) of step 2 and using the aggregation time as ROWKEY (PRIMARY KEY), may be saved in the same block as the per-minute data of step 1.

When the yearly aggregation of step 4 processes a large amount of data, the constantly monitored data of step 1 may be swapped out. This happens because reading the step 4 data, which is stored in different blocks, into a memory that is not large enough for all the monitoring data displaces it.

In this case, hints are provided according to the access frequency of containers (tables), for example, per minute, per day, and per month, to separate infrequently accessed data and frequently accessed data from each other to place each in a separate block during data placement.

In this way, data can be placed to suit the usage scene of the application by the data affinity function.
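The grouping effect of hints can be sketched as follows. The container names and hint strings are hypothetical, and real block placement is handled inside the GridDB node, not by the application.

```python
from collections import defaultdict

def place_by_hint(containers):
    """Groups containers that share a data-affinity hint so their rows
    end up in the same blocks (conceptual sketch only)."""
    blocks = defaultdict(list)
    for name, hint in containers:
        blocks[hint].append(name)
    return dict(blocks)

# hypothetical hints separating frequently accessed per-minute data
# from the rarely accessed daily/monthly aggregates
placement = place_by_hint([
    ("sensor_minutely", "minute"),
    ("panel_current",   "minute"),
    ("report_daily",    "day"),
    ("report_monthly",  "month"),
])
print(placement)
```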

Data Affinity
Grouping multiple containers together and placing them in a separate block

[Note]

Placing each container (table) in a separate block

This is a function to occupy blocks on a per container (table) basis. Allocating a unique block to a container enables faster scanning and deletion on a per container basis.

As hint information, set a special string “#unique” to property information when creating a container. Data in a container with this property information is placed in a completely separate block from data in other containers.

Placing each container (table) in a separate block

[Note] There is a possibility that the memory hit rate when accessing multiple containers is reduced.

Node affinity

Node affinity is a function that places closely related containers or tables in the same node, reducing the network load when accessing the data. In GridDB SQL, table JOIN operations can be described; placing the joined tables in the same node avoids the network load of accessing tables spread across separate nodes of the cluster. While node affinity does not reduce turnaround time, since concurrent processing across multiple nodes is no longer possible, it may improve throughput thanks to the reduced network load.

Placement of container/table based on node affinity

To use the node affinity function, hint information is given in the container (table) name when the container (table) is created. A container (table) with the same hint information is placed in the same partition. Specify the container name as shown below.

The naming rules for node affinity hint information are the same as the naming rules for the container (table) name.

[Note]

Change the definition of a container (table)

You can change container definition including that of column addition after creating a container. Operations that can be changed and the interfaces to be used are given below.

| Operation | NoSQL API | SQL |
|-----------|-----------|-----|
| Add column (tail) | ✓ | ✓ |
| Add column (except for tail) | ✓ (1) | ✗ |
| Delete column | ✓ (1) | ✗ |
| Rename column | ✗ | ✓ |

Add column

Add a new column to a container.

If you retrieve existing rows after adding columns, the “empty value” defined for the data type of each added column is returned. See Container<K,R> in the “GridDB Java API Reference” for details about empty values. (In V4.1, there was a limitation: “Getting existing rows after addition of a column results in NULL return from columns without NOT NULL constraint.”) See the limitations page of the “GridDB Release Notes” for details.

Example of adding a column

Delete column

Delete a column. This operation is available only through the NoSQL API.

Rename column

Rename a column of a container. This operation is available only through SQL.
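A minimal sketch of these operations through SQL, using a hypothetical table named orders (the exact ALTER TABLE syntax should be verified against the GridDB SQL reference):

```sql
ALTER TABLE orders ADD note STRING;              -- add a column at the tail
ALTER TABLE orders RENAME COLUMN note TO memo;   -- rename the added column
```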

Database compression/release function

Block data compression

When GridDB writes in-memory data to database files residing on the disk, a database whose capacity is independent of the memory size can be obtained. However, as the database size increases, so does the storage cost. The block data compression function helps reduce the storage cost, which grows with the amount of data, by compressing database files (data files). This is particularly effective when using flash memory, whose price per unit of capacity is higher than that of HDD.

Compression method

When data in memory is written out to database files (data files), compression is performed for each block, which is the write unit in GridDB. The file space freed by compression is deallocated through Linux file block deallocation, thereby reducing disk usage.

Supported environment

Since block data compression uses the Linux function, it depends on the Linux kernel version and file system. Block data compression is supported in the following environment.

If block data compression is enabled in other environments, the GridDB node will fail to start.

Configuration method

The compression function needs to be configured on every node.
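For illustration, compression is typically switched on through the data store settings in gs_node.json. The excerpt below assumes a /dataStore/storeCompressionMode parameter; check the parameter list for your version.

```
"dataStore":{
    "storeCompressionMode":"COMPRESSION"    # enable block data compression
}
```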

[Note]

Deallocation of unused data blocks

The deallocation of unused data block space is a function that reduces the size of database files (the actual disk space allocated) by deallocating the Linux file blocks corresponding to the unused block space in database files (data files).

Use this function in the following cases.

The processing for the deallocation of unused blocks, the support environment and the execution method are explained below.

Processing for deallocation

Unused space in database files (data files) is deallocated during the startup of a GridDB node. The space remains deallocated until data is updated on it.

Supported environment

The supported environment is the same as that for block data compression.

Execution method

Specify the deallocation option --releaseUnusedFileBlocks of the gs_startnode command when starting GridDB nodes.

Check the size of unused space in database files (data files) and the size of allocated disk space using the following command.

It is advisable to use this function when there is a large amount of allocated but unused block space (storeTotalUse << dataFileAllocateSize).
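As a hedged example, the sequence below starts a node with deallocation enabled and then inspects the relevant statistics (option and statistic names should be verified against the operation tools reference):

```
$ gs_startnode -u admin/admin --releaseUnusedFileBlocks
$ gs_stat -u admin/admin      # check storeTotalUse and dataFileAllocateSize
```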

[Note]

— Operating function —

Service

GridDB service is automatically performed during OS start-up to start a node or cluster.

The GridDB service is enabled immediately after the packages are installed: the GridDB server starts when the OS starts up and stops when the OS shuts down.

When using an interface that integrates middleware and application operation, including OS monitoring and database software operation, consider the dependencies on other middleware, such as whether GridDB should be operated through the service or through operating commands.


The following can be carried out by the service.

The service operation procedures for a cluster of three nodes are as follows.
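For example, on an init-script based system the service might be driven as follows (subcommand availability depends on your environment):

```
# service gridstore start     # start the node via the service
# service gridstore status    # check the service status
# service gridstore stop      # stop the node via the service
```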

[Note]

If you do not use service control, disable the service at all runlevels as follows.

# /sbin/chkconfig gridstore off

User management function

There are two types of users in GridDB: an OS user, which is created during installation, and a GridDB user, which performs operations and development in GridDB (hereinafter referred to as a GridDB user).

OS user

An OS user has the right to execute operating functions in GridDB; the user gsadm is created during GridDB installation. This OS user is hereinafter referred to as gsadm.

All GridDB resources are the property of gsadm. In addition, all operating commands in GridDB are executed by gsadm.

Authentication is performed to check whether the user has the right to connect to the GridDB server and execute the operating commands. This authentication is performed using a GridDB user.

GridDB user

GridDB users

Usable function

The operations available to an administrator and a general user are as follows. Commands that can be executed by gsadm without using a GridDB user are marked with “✓” in the gsadm column.

| Operation | Operating details | Operating tools used | gsadm | Administrator user | General user |
|---|---|---|---|---|---|
| Node operations | Start node | gs_startnode/gs_sh | | ✓ | |
| | Stop node | gs_stopnode/gs_sh | | ✓ | |
| Cluster operations | Building a cluster | gs_joincluster/gs_sh | | ✓ | |
| | Adding a node to a cluster | gs_appendcluster/gs_sh | | ✓ | |
| | Detaching a node from a cluster | gs_leavecluster/gs_sh | | ✓ | |
| | Stopping a cluster | gs_stopcluster/gs_sh | | ✓ | |
| User management | Registering an administrator user | gs_adduser | ✓ | | |
| | Deletion of administrator user | gs_deluser | ✓ | | |
| | Changing the password of an administrator user | gs_passwd | ✓ | | |
| | Creating a general user | gs_sh | | ✓ | |
| | Deleting a general user | gs_sh | | ✓ | |
| | Changing the password of a general user | gs_sh | | ✓ | ✓: Individual only |
| Database management | Creating/deleting a database | gs_sh | | ✓ | |
| | Assigning/cancelling a user in the database | gs_sh | | ✓ | |
| Data operation | Creating/deleting a container or table | gs_sh | | ✓ | ✓: Only when update operation is possible in the user's DB |
| | Registering data in a container or table | gs_sh | | ✓ | ✓: Only when update operation is possible in the user's DB |
| | Searching for a container or table | gs_sh | | ✓ | ✓: Only in the DB of the individual |
| | Creating index to a container or table | gs_sh | | ✓ | ✓: Only when update operation is possible in the user's DB |
| Backup management | Creating a backup | gs_backup | | ✓ | |
| | Restoring a backup | gs_restore | ✓ | | |
| | Displaying a backup list | gs_backuplist | | ✓ | |
| System status management | Acquiring system information | gs_stat | | ✓ | |
| | Changing system parameter | gs_paramconf | | ✓ | |
| Data import/export | Importing data | gs_import | | ✓ | ✓: Only in accessible object |
| | Exporting data | gs_export | | ✓ | ✓: Only in accessible object |

Database and users

Access to a cluster database in GridDB can be separated on a user basis. The separation unit is known as a database. The following is a cluster database in the initial state.

Multiple databases can be created in a cluster database. Creation of databases and assignment to users are carried out by an administrator user.

The rules for creating a database are as shown below.

When assigning general users to a database, specify permissions as follows :

Only assigned general users and administrator users can access the database. Administrator users can access all databases. The following rules apply when assigning a general user to a database.

Database and users

Authentication method

GridDB offers the following two authentication methods:

The following explains each method.

Internal authentication

GridDB manages the user name, password, and privilege of administrative and general GridDB users. If the authentication method is not specified, internal authentication is used by default.

The administrative user is managed using the operation commands gs_adduser, gs_deluser, and gs_passwd.

General users are managed using the SQL statements CREATE USER, DROP USER, and SET PASSWORD, whereas their access rights are managed using the SQL statements GRANT and REVOKE.
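A minimal sketch of these statements, using a hypothetical user user01 and database database1 (confirm the exact syntax in the GridDB SQL reference):

```sql
CREATE USER user01 IDENTIFIED BY 'pass01';
GRANT ALL ON database1 TO user01;        -- grant access rights on a database
SET PASSWORD FOR user01 = 'newpass01';   -- change the password
DROP USER user01;
```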

User cache settings

To set cache for general user information, edit the following node definition file (gs_node.json):

[Note]

| Parameter | Default | Value |
|---|---|---|
| /security/userCacheSize | 1000 | Specify the number of entries for general and LDAP users to be cached. |
| /security/userCacheUpdateInterval | 60 | Specify the refresh interval for the cache in seconds. |
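As an illustrative excerpt from gs_node.json using the default values:

```
"security":{
    "userCacheSize":1000,            # number of cached user entries
    "userCacheUpdateInterval":60     # cache refresh interval in seconds
}
```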

LDAP authentication

GridDB manages general GridDB users by LDAP. It also manages LDAP users' access rights by allowing the user to create roles with the same names as the user names and group names within LDAP and by manipulating the access rights of those roles. Moreover, it provides caching of the user information managed by LDAP for faster authentication.

[Note]

Settings common to internal and LDAP authentication

To use LDAP authentication, edit the cluster definition file (gs_cluster.json) as described below.

| Parameter | Default | Value |
|---|---|---|
| /security/authentication | INTERNAL | Specify either INTERNAL (internal authentication) or LDAP (LDAP authentication) as the authentication method to be used. |
| /security/ldapRoleManagement | USER | Specify either USER (mapping using the LDAP user name) or GROUP (mapping using the LDAP group name) as the target to which the GridDB role is mapped. |
| /security/ldapUrl | | Specify the LDAP server using the format: ldap[s]://host[:port] |

[Note]

Role management

Roles are managed by the SQL statements CREATE ROLE and DROP ROLE. If “USER” is specified for /security/ldapRoleManagement, the role is created using the LDAP user name, whereas if “GROUP” is specified, the role is created using the LDAP group name. The access authority granted to the role created is managed using the SQL statements GRANT and REVOKE.

Settings for LDAP authentication mode

Specify simple mode (directly binding with a user account) or search mode (searching for and authenticating users after binding with an LDAP administrative user). Then edit the cluster definition file (gs_cluster.json) as described below:

[Note]

■Simple mode

| Parameter | Default | Value |
|---|---|---|
| /security/ldapUserDNPrefix | | To generate the user's DN (identifier), specify the string to be concatenated in front of the user name. |
| /security/ldapUserDNSuffix | | To generate the user's DN (identifier), specify the string to be concatenated after the user name. |

■Search mode

| Parameter | Default | Value |
|---|---|---|
| /security/ldapBindDn | | Specify the LDAP administrative user's DN. |
| /security/ldapBindPassword | | Specify the password for the LDAP administrative user. |
| /security/ldapBaseDn | | Specify the root DN from which to start searching. |
| /security/ldapSearchAttribute | uid | Specify the attributes to search for. |
| /security/ldapMemberOfAttribute | memberof | Specify the attribute where the group DN to which the user belongs is set (valid if ldapRoleManagement=GROUP). |
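For illustration, a search-mode configuration might look like the hypothetical excerpt below (the host name, DNs, and password are placeholders):

```
"security":{
    "authentication":"LDAP",
    "ldapRoleManagement":"GROUP",
    "ldapUrl":"ldaps://ldap.example.com:636",
    "ldapBindDn":"CN=Manager,DC=example,DC=com",
    "ldapBindPassword":"manager_password",
    "ldapBaseDn":"DC=example,DC=com",
    "ldapSearchAttribute":"uid",
    "ldapMemberOfAttribute":"memberof"
},
```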

User cache settings

To set cache for LDAP user information, edit the following node definition file (gs_node.json):

[Note]

| Parameter | Default | Value |
|---|---|---|
| /security/userCacheSize | 1000 | Specify the number of entries for general and LDAP users to be cached. |
| /security/userCacheUpdateInterval | 60 | Specify the refresh interval for the cache in seconds. |

Setup examples

The following example shows sample settings for the conditions below:

■Sample role settings (SQL statements)

 CREATE ROLE TEST
 GRANT SELECT ON sampleDB to TEST

■Sample server settings (excerpt from gs_cluster.json)

            :
"security":{
    "authentication":"LDAP",
    "ldapRoleManagement":"USER",
    "ldapUrl":"ldaps://192.168.1.100:636",
    "ldapUserDnPrefix":"CN=",
    "ldapUserDnSuffix":",ou=d1,ou=dev,dc=example,dc=com",
    "ldapSearchAttribute":"",
    "ldapMemberOfAttribute": ""
},
            :

Security features

Communication encryption

GridDB supports SSL connection between the GridDB cluster and the client.

[Note]

Settings

To enable SSL connection, edit the cluster definition file (gs_cluster.json) and the node definition file (gs_node.json) as illustrated below. Then place the server certificate and the private key file in the appropriate directory.

[Note]

*cluster definition file (gs_cluster.json)

| Parameter | Default | Value |
|---|---|---|
| /system/serverSslMode | DISABLED | For SSL connection settings, specify DISABLED (SSL disabled), PREFERRED (SSL enabled, but non-SSL connections are also allowed), or REQUIRED (SSL enabled; non-SSL connections are not allowed). |
| /system/sslProtocolMaxVersion | TLSv1.2 | Specify either TLSv1.2 or TLSv1.3 as the TLS protocol version. |
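For illustration, an excerpt from gs_cluster.json that enforces SSL might look as follows (a sketch; the values are examples):

```
"system":{
    "serverSslMode":"REQUIRED",           # reject non-SSL connections
    "sslProtocolMaxVersion":"TLSv1.2"
}
```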

*Node definition file (gs_node.json)

| Parameter | Default | Value |
|---|---|---|
| /system/securityPath | security | Specify the full or relative path to the directory where the server certificate and the private key are placed. |
| /system/serviceSslPort | 10045 | SSL listen port for operation commands. |

*Server certificate and private key

To enable SSL, place the server certificate and the private key in the directory where /system/securityPath is set with the following file names:

Client settings

SSL connection and server certificate verification can be specified on the client side. For details, see each tool and the API reference.

Failure process function

In GridDB, recovery for a single point failure is not necessary as replicas of the data are maintained in each node constituting the cluster. The following action is carried out when a failure occurs in GridDB.

  1. When a failure occurs, the failure node is automatically isolated from the cluster.
  2. Failover is carried out in the backup node in place of the isolated failure node.
  3. Partitions are rearranged autonomously as the number of nodes decreases as a result of the failure (replicas are also arranged).

A node that has been recovered from a failure can be incorporated online into a cluster operation. A node can be incorporated into a cluster which has become unstable due to a failure using the gs_joincluster command. As a result of the node incorporation, the partitions will be rearranged autonomously and the node data and load balance will be adjusted.

In this way, while advance preparations for recovery are not necessary for a single failure, recovery operations are necessary when operating in a single-node configuration or when multiple failures overlap in a cluster configuration.

When operating in a cloud environment, multiple failures, such as failures of several nodes constituting a cluster or database failures on multiple nodes, may occur even when no physical disk failure or processor failure is anticipated.

Type and treatment of failures

An overview of the failures which occur and the treatment method is shown in the table below.

A node failure refers to a situation in which a node has stopped due to a processor failure or an error in a GridDB server process, while a database failure refers to a situation in which an error has occurred in accessing a database placed in a disk.

| Configuration of GridDB | Type of failure | Action and treatment |
|---|---|---|
| Single configuration | Node failure | Although access from the application is no longer possible, data in a transaction which has completed processing can be recovered simply by restarting the node, except when caused by a database failure. Recovery by another node is considered when the node failure is prolonged. |
| Single configuration | Database failure | After the error is detected in the application, the database file is recovered from the backup data. Data is restored to the backup point. |
| Cluster configuration | Single node failure | The error is hidden from the application, and the process can continue on nodes with replicas. Recovery operation is not necessary on the node where the failure occurred. |
| Cluster configuration | Multiple node failure | If both the owner and backup partitions of a replica exist on failed nodes, the affected partitions cannot be accessed, but the cluster itself continues to operate normally. Except when caused by a database failure, data in a transaction which has completed processing can be recovered simply by restarting the node. Recovery by another node is considered when the node failure is prolonged. |
| Cluster configuration | Single database failure | Since data access continues through the other nodes constituting the cluster when a database failure occurs on a single node, the data can be recovered simply by changing the database deployment location to a different disk and starting the node again. |
| Cluster configuration | Multiple database failure | A partition that cannot be recovered from a replica needs to be restored from the latest backup data, at the point when that backup was taken. |

Client failover

If a node failure occurs when operating in a cluster configuration, the partitions (containers) placed on the failed node cannot be accessed. At this point, a client failover function, which automatically reconnects to the backup node and continues processing, is activated in the client API. Since failover is performed automatically by the client API, the application developer does not need to be aware of node error handling.

However, due to a network failure or simultaneous failure of multiple nodes, an error may also occur and access to the target application operations may not be possible.

Depending on the data to be accessed, the following points need to be considered in the recovery process after an error occurs.

[Note]

Automatic restarting function

If a GridDB node terminates abnormally or the node process is forcibly terminated, the node is automatically restarted and rejoined to the cluster. The operation manager does not need to take action to restore the cluster status to normal operation.

Automatic recovery function

[Note]

Automatic restart is not performed in the following cases:

Settings

The parameters of the automatic recovery function are as follows.

| Parameter | Default | Value |
|---|---|---|
| SVC_ENABLE_AUTO_RESTART | true | true (enabled) / false (disabled) |
| GS_USER | admin | Set as appropriate |
| GS_PASSWORD | admin | Set as appropriate |

When changing the parameters, edit the start configuration file: /etc/sysconfig/gridstore/gridstore.conf .
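For illustration, the file uses simple KEY=VALUE lines; the values below are the defaults listed above:

```
SVC_ENABLE_AUTO_RESTART=true
GS_USER=admin
GS_PASSWORD=admin
```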

[Note]

Export/import function

The GridDB export/import tools provide save and recovery functions at the database and container level, for recovering a database from local damage and for database migration.

In a GridDB cluster, container data is automatically arranged on the nodes within the cluster. The user does not need to know on which node the data is arranged (data position transparency). There is also no need to be aware of the arrangement position when extracting and registering data during export/import. The export/import configuration is as follows.

Export/import configuration

[Export]

(1) Save the container and row data of a GridDB cluster in the file below. A specific container can also be exported by specifying its name.

* See “GridDB Operation Tools Reference” for details.

[Import]

(2) Import the container and row data from the exported data files to restore them in GridDB. Data of a specific container can also be imported.

(3) Import container data files created by the user, and register the container and row data.
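As a hedged sketch of typical invocations (the output directory /tmp/dump and the container names are placeholders; see the operation tools reference for the full option list):

```
$ gs_export -u admin/admin --all -d /tmp/dump          # export all containers
$ gs_export -u admin/admin -c c001 c002 -d /tmp/dump   # export specific containers
$ gs_import -u admin/admin --all -d /tmp/dump          # import the exported data
```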

[Note]

Backup/restoration function

Regular data backup needs to be performed in case of data corruption caused by database failures and malfunctions of the application. The backup operation method should be selected according to the service level requirements and system resources.

This section explains the types of backup and their features.

Backup method

The type and interval of backup operations need to be determined based on the available disk capacity, the time taken for backup, and the recovery requirements in case of failure (e.g., the recovery point). The backup methods of GridDB are shown below.

| Backup method | Recovery point | Features |
|---|---|---|
| Offline backup | Time the cluster was stopped | The cluster must remain stopped until copying of the backup completes. The recovery point does not differ from node to node. |
| Online backup (baseline with differential/incremental) | Completion of the backup | Uses the GridDB backup command. The recovery point may differ from node to node depending on when each backup completes. |
| Online backup (automatic log) | Immediately before the failure | Uses the GridDB backup command. Start-up may take longer because data is recovered to the latest state using transaction logs. |
| File system level online backup (snapshot, etc.) | Time the snapshot was taken | The backup is obtained in cooperation with OS or storage snapshots. Even if snapshots for all nodes are executed simultaneously, the recovery point may differ by about one second from node to node if the log write mode is the default (DELAYED_SYNC, 1 sec). |
| File system level online backup (OS command, etc., with automatic log) | Immediately before the failure | The backup is obtained in cooperation with backup solutions, etc. Start-up may take longer because data is recovered to the latest state using transaction logs. |

To know about the GridDB online backup functions, please refer to Online backup.

[Note]

To perform an online backup of file system level instead of using the GridDB online backup functions, please refer to File system level backup.

To perform an offline backup, stop the cluster by using gs_stopcluster command first, and stop all the nodes constituting the cluster. Next, backup the data under the database file directory of each node (directory indicated by /dataStore/dbPath, /dataStore/transactionLogPath in gs_node.json).
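The offline backup steps above can be sketched as follows (the copy destination /mnt/backup is a placeholder; run the node-level steps on every node):

```
$ gs_stopcluster -u admin/admin                        # stop the cluster
$ gs_stopnode -u admin/admin                           # stop each node
$ cp -rp /var/lib/gridstore/data   /mnt/backup/data    # copy data files
$ cp -rp /var/lib/gridstore/txnlog /mnt/backup/txnlog  # copy transaction logs
```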

Backup definition files

In backup operation, in addition to a regular backup of the database files, backup of the definition files is also needed.

Use an OS command to perform a backup of the node definition file (gs_node.json), cluster definition file (gs_cluster.json), user definition file (password) in the $GS_HOME/conf directory (/var/lib/gridstore/conf by default) in addition to a regular backup of the database files.

Be sure to back up the definition files whenever the configuration is changed or a user is registered or changed.
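For example, the conf directory could be archived with an OS command such as the following (the archive path is a placeholder):

```
$ tar cf /mnt/backup/conf_backup.tar -C /var/lib/gridstore conf
```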

Online backup and recovery operations

Backup operations

This section explains the GridDB backup operations in the event of failure.

Types of backup

In GridDB, backup of node units can be carried out online. A backup of the entire cluster can be carried out online while maintaining the services by performing a backup of all the nodes constituting the GridDB cluster in sequence. The types of online backup provided by GridDB are as follows.

| Backup type | Backup actions | Recovery point |
|---|---|---|
| Full backup | A backup of the cluster database currently in use is stored online in node units in the backup directory specified in the node definition file. | Full backup collection point |
| Differential/incremental backup | A backup of the cluster database currently in use is stored online in node units in the backup directory specified in the node definition file. In subsequent backups, only the blocks updated after the previous backup are backed up. | Differential/incremental backup collection point |
| Automatic log backup | In addition to a backup of the cluster database currently in use, stored online in node units in the backup directory specified in the node definition file, the transaction log is also automatically collected at the same timing as the transaction log file writing. The write timing of the transaction log file follows the value of /dataStore/logWriteMode in the node definition file. | Latest transaction update point |

The recovery point differs depending on the type of backup used.

The various backup operations provided by GridDB and the systems recommended for each are shown below.

Full backup
Differential/incremental backup
Automatic log backup

[Note]

The type of backup is specified in the command option.

Specify /dataStore/backupPath in the node definition file as the backup destination. To guard against physical disk failure, be sure to set up the backup destination and the database files (/dataStore/dbPath, /dataStore/transactionLogPath) on different physical disks.

There are two log persistency modes for transactions. The default is NORMAL.

KEEP_ALL_LOG is specified only for special operations, e.g., when issuing instructions to delete log files in conjunction with third-party backup software; normally it is not used.

A specified example of a node definition file is shown below.

$ cat /var/lib/gridstore/conf/gs_node.json         # The example of checking a setting
{
    "dataStore":{
        "dbPath":"/var/lib/gridstore/data",
        "transactionLogPath":"/var/lib/gridstore/txnlog",
        "backupPath":"/mnt/gridstore/backup",      # Backup directory
        "storeMemoryLimit":"1024",
        "concurrency":2,
        "logWriteMode":1,
        "persistencyMode":"NORMAL"                 #Perpetuation mode
            :
            :
}
Backup execution

This section explains how to use a full backup, differential/incremental backup, and automatic log backup.

Specify the backup name (BACKUPNAME) when executing any type of backup. For the data created by backup, a directory with the same name as the backup name (BACKUPNAME) is created and placed under the directory specified in backupPath in the node definition file.

Up to 12 alphanumeric characters can be specified in the BACKUPNAME.

Full backup

When a failure occurs, the system can be recovered up to the point where the full backup was completed. Implement a full backup of all the nodes constituting the cluster. Backup data is stored in the directory indicated by the BACKUPNAME of the command. It is recommended to specify the date in the BACKUPNAME in order to make it easier to understand and manage the backup data gathered.

Execute the following command on all the nodes inside the cluster.

$ gs_backup -u admin/admin 20141025

In this example,

  1. “20141025” is specified as a backup name (BACKUPNAME), and a directory “20141025” will be created under the backup directory.
  2. In the “20141025” directory, backup information files (gs_backup_info.json and gs_backup_info_digest.json) and an LSN information file (gs_lsn_info.json) are created. In the “data” directory, data files and checkpoint log files are created, while in the “txnlog” directory, transaction log files are created.
/var/lib/gridstore/backup/
        20141025/                           # backup directory
                gs_backup_info.json         # backup information file
                gs_backup_info_digest.json  # backup information file
                gs_lsn_info.json            # LSN information file
                data/
                    0/                      # partition number 0
                        0_part_0.dat        # data file backup
                        0_117.cplog         # checkpoint log backup
                        0_118.cplog
                        ...
                    1/
                    2/
                    ...
                txnlog/
                    0/                      # partition number 0
                        0_120.xlog          # transaction log backup
                        0_121.xlog
                    1/
                    2/
                    ...

A backup command will only notify the server of the backup instructions and will not wait for the process to end.

Check the completion of the backup process by the status of the gs_stat command.


$ gs_backup -u admin/admin 20141025
$ gs_stat -u admin/admin --type backup
BackupStatus: Processing

The status of the gs_backuplist command shows whether the backup has been performed properly.

$ gs_backuplist -u admin/admin

BackupName   Status   StartTime                EndTime
------------------------------------------------------------------------
 20141025NO2     P   2014-10-25T06:37:10+0900 -
 20141025        NG  2014-10-25T02:13:34+0900 -
 20140925        OK  2014-09-25T05:30:02+0900 2014-09-25T05:59:03+0900
 20140825        OK  2014-08-25T04:35:02+0900 2014-08-25T04:55:03+0900

The status symbol of the backup list indicates the following.

Differential/incremental block backup

When a failure occurs, data can be recovered up to the point of the last differential/incremental backup by using the full backup serving as the baseline (reference point) together with the differential/incremental backups taken after the baseline. Take a full backup as the baseline for differential/incremental backups, and run differential/incremental backups thereafter.

The backup interval needs to be studied in accordance with the service targets for the data update capacity and the time taken for recovery, but use the following as a guide.

Creation of baseline for full backup is specified below. In this example, BACKUPNAME is “201504.”

$ gs_backup  -u admin/admin --mode baseline 201504
$ gs_stat -u admin/admin --type backup
BackupStatus: Processing(Baseline)

The database files in the data directory are copied under the backup directory as the baseline for the backup.

Specify incremental or since as the mode of the backup command (gs_backup) when performing a regular backup of the differential/incremental block after creating a baseline (backup of data block updated after a full backup of the baseline). Specify the same BACKUPNAME as when the baseline was created. In this example, BACKUPNAME is “201504.”

*****  For incremental backup
$ gs_backup  -u admin/admin --mode incremental 201504
$ gs_stat  -u admin/admin --type backup
BackupStatus: Processing(Incremental)

*****  For differential backup
$ gs_backup  -u admin/admin --mode since 201504
$ gs_stat  -u admin/admin --type backup
BackupStatus: Processing(Since)

The status of the gs_backuplist command shows whether the backup has been performed properly. Since a differential/incremental backup consists of multiple backups that form a single recovery unit, it is treated as a single backup in the list of BACKUPNAME. Therefore, specify the backup name and check the details to see the detailed status.

A differential/incremental backup can be confirmed by checking that an asterisk “*” is appended at the beginning of the BACKUPNAME. The status of a differential/incremental backup is always “–”.

The status of differential/incremental backup can be checked by specifying the BACKUPNAME in the argument of the gs_backuplist command.

*****  Display a list of BACKUPNAME
$ gs_backuplist -u admin/admin

BackupName   Status   StartTime                EndTime
------------------------------------------------------------------------
*201504          --  2015-04-01T05:20:00+0900 2015-04-24T06:10:55+0900
*201503          --  2015-03-01T05:20:00+0900 2015-04-24T06:05:32+0900
  :
 20141025NO2     OK   2014-10-25T06:37:10+0900 2014-10-25T06:37:10+0900

*****  Specify the individual BACKUPNAME and display the detailed information
$ gs_backuplist -u admin/admin 201504

BackupName : 201504

BackupData            Status   StartTime                EndTime
--------------------------------------------------------------------------------
201504_lv0                OK  2015-04-01T05:20:00+0900 2015-04-01T06:10:55+0900
201504_lv1_000_001        OK  2015-04-02T05:20:00+0900 2015-04-01T05:20:52+0900
201504_lv1_000_002        OK  2015-04-03T05:20:00+0900 2015-04-01T05:20:25+0900
201504_lv1_000_003        OK  2015-04-04T05:20:00+0900 2015-04-01T05:20:33+0900
201504_lv1_000_004        OK  2015-04-05T05:20:00+0900 2015-04-01T05:21:15+0900
201504_lv1_000_005        OK  2015-04-06T05:20:00+0900 2015-04-01T05:21:05+0900
201504_lv1_001_000        OK  2015-04-07T05:20:00+0900 2015-04-01T05:22:11+0900
201504_lv1_001_001        OK  2015-04-07T05:20:00+0900 2015-04-01T05:20:55+0900

A directory will be created in the backup directory according to the following rules to store the differential/incremental backup data.

The status symbol of the backup list indicates the following.

In a differential/incremental backup, a log of the updated blocks named _n_incremental.cplog (where "n" denotes a numerical value) is produced under the data directory in the BackupData directory.

Compared to a full backup, a differential/incremental backup takes less time. However, recovery after a failure may take longer, because the updated blocks must be rolled forward over the full backup data. Therefore, take a new baseline regularly, or execute a differential backup from the baseline by specifying since.
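The baseline-plus-differential cycle described above can be scripted. The sketch below assumes it is run daily (for example from cron): it derives the YYYYMM backup name used in these examples and picks --mode baseline on the first day of the month and --mode since otherwise. The gs_backup call itself is only echoed here so the name logic stands alone; the exact schedule is an assumption, not a prescription.

```shell
# Sketch: choose the backup mode by day of month; BACKUPNAME follows the
# YYYYMM convention used in the examples above.
BACKUPNAME=$(date +%Y%m)
if [ "$(date +%d)" = "01" ]; then
    MODE="baseline"     # take a new full baseline at the start of the month
else
    MODE="since"        # differential backup from the baseline
fi
# Echoed for illustration; drop the echo to actually run the backup,
# then poll with: gs_stat -u admin/admin --type backup
echo gs_backup -u admin/admin --mode "$MODE" "$BACKUPNAME"
```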

[Note]

Automatic log backup

GridDB automatically outputs a transaction log to the backup directory, so the system can always be recovered to the latest state. Because the backup is carried out automatically, it is not possible to schedule backups around the system operating state, such as a “backup process scheduled in advance during low-peak periods”. In addition, automatic log backup imposes some load on the system during normal operation as well. Therefore, use of this function is recommended only when there are surplus system resources.

Specify as follows when using an automatic log backup. In this example, BACKUPNAME is “201411252100.”

$ gs_backup -u admin/admin --mode auto 201411252100
$ gs_stat -u admin/admin --type backup

Executing the command collects the backup data in the directory indicated by BACKUPNAME.

In this example,

  1. A directory with the name “201411252100” will be created under the backup directory.
  2. In the “201411252100” directory, backup information files (gs_backup_info.json and gs_backup_info_digest.json) and an LSN information file (gs_lsn_info.json) are created. In the “data” directory, data files and checkpoint log files are created, while in the “txnlog” directory, transaction log files are created.
    1. Under the “201411252100”/”txnlog” directory, transaction log files are created when execution of the transaction is completed.

When operating with an automatic log backup, the transaction log files are rolled forward over the full backup data during recovery after a failure. Recovery time increases with the number of log files used during recovery, so when specifying --mode auto, also perform a full backup regularly.

Checking backup operation

The mode of the backup currently being executed and its detailed execution status can be checked in the data obtained from the gs_stat command.

$ gs_stat -u admin/admin

    "checkpoint": {
        "backupOperation": 3,
        "duplicateLog": 0,
        "endTime": 0,
        "mode": "INCREMENTAL_BACKUP_LEVEL_0",
        "normalCheckpointOperation": 139,
        "pendingPartition": 1,
        "requestedCheckpointOperation": 0,
        "startTime": 1429756253260
    },
        :
        :

The meaning of each parameter related to the backup output in gs_stat is as follows.

Collecting container data

When a database failure occurs, it is necessary to understand which container needs to be recovered and how to contact the user of the container. To detect a container subject to recovery, the following data needs to be collected regularly.

Operating efforts can be cut down by creating a gs_sh command script to output the container list in advance.

In the example below, a gs_sh sub-command is created with the file name listContainer.gsh.

setnode node1 198.2.2.1  10040
setnode node2 198.2.2.2  10040
setnode node3 198.2.2.3  10040
setcluster cl1 clusterSeller 239.0.0.20 31999 $node1 $node2 $node3
setuser admin admin gstore
connect $cl1
showcontainer
connect $cl1 db0
showcontainer
 :   Repeat for each database
quit

Change the node variables (node1, node2, node3) that constitute the cluster, and adjust the cluster variable (cl1), user settings, and database data as appropriate for the environment. See “GridDB Operation Tools Reference” for details about gs_sh.

Execute the gs_sh script file as shown below to collect a list of containers and partitions.

$ gs_sh listContainer.gsh > `date +%Y%m%d`Container.txt

The information saved in 20141001Container.txt is as follows.

Database : public
Name                  Type         PartitionId
------------------------------------------------
container_7           TIME_SERIES            0
container_9           TIME_SERIES            7
container_2           TIME_SERIES           15
container_8           TIME_SERIES           17
container_6           TIME_SERIES           22
container_3           TIME_SERIES           25
container_0           TIME_SERIES           35
container_5           TIME_SERIES           44
container_1           TIME_SERIES           53
:
 Total Count: 20

Database : db0
Name                  Type         PartitionId
---------------------------------------------
CO_ALL1              COLLECTION           32
COL1                 COLLECTION          125
 Total Count: 2
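When a specific partition later needs to be recovered, the saved list can be searched mechanically for the affected containers. A minimal sketch with awk, using an inlined sample in place of the saved Container.txt file (point awk at the real file in practice):

```shell
# Sketch: list the containers stored in partition 22, using the
# Name/Type/PartitionId layout shown above (sample data inlined here).
cat > /tmp/container_list.txt <<'EOF'
Name                  Type         PartitionId
------------------------------------------------
container_7           TIME_SERIES            0
container_6           TIME_SERIES           22
container_3           TIME_SERIES           25
EOF
# Print the name of every container whose PartitionId matches pid.
awk -v pid=22 '$3 == pid { print $1 }' /tmp/container_list.txt
```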

Recovery operation

An overview of the recovery operation when a failure occurs is given below.

  1. Failure recognition and checking of recovery range
  2. Recovery operation and node startup
  3. Incorporation of node in cluster
  4. Confirmation of recovery results and operation
Failure recognition and checking of recovery range

When a failure occurs in GridDB, in addition to the cause of the failure being output to the event log file of the node in which the error occurred, if it is deemed that node operation cannot continue, the node status will become ABNORMAL and the node will be detached from the cluster service.

Cluster service will not stop even if the node status becomes ABNORMAL as operations are carried out with multiple replicas in a cluster configuration. Data recovery is necessary when all partitions including the replicas were to fail.

Use gs_stat to check the status of the master node to see whether data recovery is necessary or not. Recovery is necessary if the value of /cluster/partitionStatus is “OWNER_LOSS”.

$ gs_stat -u admin/admin -p 10041
{
    "checkpoint": {
        :
    },
    "cluster": {
        "activeCount": 2,
        "clusterName": "clusterSeller",
        "clusterStatus": "MASTER",
        "designatedCount": 3,
        "loadBalancer": "ACTIVE",
        "master": {
            "address": "192.168.0.1",
            "port": 10011
        },
        "nodeList": [
            {
                "address": "192.168.0.2",
                "port": 10011
            },
            {
                "address": "192.168.0.3",
                "port": 10010
            }
        ],
        "nodeStatus": "ACTIVE",
        "partitionStatus": "OWNER_LOSS",     ★
        "startupTime": "2014-10-07T15:22:59+0900",
        "syncCount": 4
          :

Use the gs_partition command to check for data to be recovered. Partitions with problems can be identified by executing the command with the --loss option.

In the example below, an error has occurred in Partition 68 due to a problem with node 192.168.0.3.

$ gs_partition -u admin/admin -p 10041 --loss

[
 {
        "all": [
            {
                "address": "192.168.0.1",
                "lsn": 0,
                "port": 10011,
                "status": "ACTIVE"
            },
            :
            :
            ,
            {
                "address": "192.168.0.3",
                "lsn": 2004,
                "port": 10012,
                "status": "INACTIVE"   <---  The status of this node is not ACTIVE.
            }
        ],
        "backup": [],
        "catchup": [],
        "maxLsn": 2004,
        "owner": null,           //Partition owner is not present in the cluster.
        "pId": "68",             //ID of partition which needs to be recovered     
        "status": "OFF"
   },
   {
     :

   }
  ]
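The pId values flagged by gs_partition --loss can also be extracted mechanically. A sketch assuming python3 is available, run against the saved command output (a trimmed sample is inlined here in place of the real output file):

```shell
# Sketch: print the ID of every partition whose owner is missing, from
# gs_partition --loss output saved to a file (sample inlined below).
cat > /tmp/loss.json <<'EOF'
[
  {"all": [], "backup": [], "catchup": [],
   "maxLsn": 2004, "owner": null, "pId": "68", "status": "OFF"}
]
EOF
python3 - <<'EOF'
import json
with open("/tmp/loss.json") as f:
    partitions = json.load(f)
for p in partitions:
    if p["owner"] is None:   # partition owner is not present in the cluster
        print(p["pId"])
EOF
```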
Recovery operation and node startup
Recovery from backup data

When a problem in the system, such as a disk failure, corrupts the database, recover the data from the backup. Note the following points during recovery.

[Note]

Restore backup data to a GridDB node.

Follow the procedure below to restore a node from backup data.

  1. Check that no node has been started.
    • Check that the cluster definition file is the same as the other nodes in the cluster that the node is joining.
  2. Check the backup name used in the recovery. This operation is executed on a node.
    • Check the backup status and select one that has been backed up correctly.
  3. Check that past data files, checkpoint log files, and transaction log files are not left behind in the database file directories (/var/lib/gridstore/data and /var/lib/gridstore/txnlog by default) of the node.
    • Delete if unnecessary and move to another directory if required.
  4. Execute the restore command on the machine starting the node.
  5. Start node.
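Steps 3 to 5 above can be sketched as a short script. The paths below are demo stand-ins (the real GS_HOME defaults to /var/lib/gridstore), and the gs_restore and gs_startnode lines are shown commented out so the file handling stands alone:

```shell
# Sketch of steps 3-5: move leftover database files aside rather than
# deleting them, then restore and start (demo path used so this is safe
# to dry-run; set GS_HOME to the real GridDB home in practice).
GS_HOME="${GS_HOME:-/tmp/gs_restore_demo}"
mkdir -p "${GS_HOME}/data" "${GS_HOME}/txnlog" \
         "${GS_HOME}/temp/data" "${GS_HOME}/temp/txnlog"
# Step 3: clear old data files, checkpoint log files, transaction log files.
mv "${GS_HOME}"/data/*   "${GS_HOME}/temp/data"   2>/dev/null || true
mv "${GS_HOME}"/txnlog/* "${GS_HOME}/temp/txnlog" 2>/dev/null || true
# Step 4: gs_restore 201912
# Step 5: gs_startnode -u admin/admin -w
```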

Use the command below to check the backup data.

A specific example to display a list of the backup names is shown below. A list of the backup names can be displayed regardless of the startup status of the nodes. The status appears as “P” (abbreviation for Processing) if the backup process is in progress with the nodes started.

A list of the backups is displayed in sequence starting from the latest one. In the example below, the backup named 201912 is the latest.

$ gs_backuplist -u admin/admin
 BackupName   Status  StartTime                 EndTime
-------------------------------------------------------------------------
*201912           --  2019-12-01T05:20:00+09:00 2019-12-01T06:10:55+09:00
*201911           --  2019-11-01T05:20:00+09:00 2019-11-01T06:10:55+09:00
  :
 20191025NO2      OK  2019-10-25T06:37:10+09:00 2019-10-25T06:38:20+09:00
 20191025         NG  2019-10-25T02:13:34+09:00 -
 20191018         OK  2019-10-18T02:10:00+09:00 2019-10-18T02:12:15+09:00

$ gs_backuplist -u admin/admin 201912

BackupName : 201912

BackupData            Status StartTime                 EndTime
--------------------------------------------------------------------------------
201912_lv0                OK 2019-12-01T05:20:00+09:00 2019-12-01T06:10:55+09:00
201912_lv1_000_001        OK 2019-12-02T05:20:00+09:00 2019-12-02T05:20:52+09:00
201912_lv1_000_002        OK 2019-12-03T05:20:00+09:00 2019-12-03T05:20:25+09:00
201912_lv1_000_003        OK 2019-12-04T05:20:00+09:00 2019-12-04T05:20:33+09:00
201912_lv1_000_004        OK 2019-12-05T05:20:00+09:00 2019-12-05T05:21:25+09:00
201912_lv1_000_005        OK 2019-12-06T05:20:00+09:00 2019-12-06T05:21:05+09:00
201912_lv1_001_000        OK 2019-12-07T05:20:00+09:00 2019-12-07T05:22:11+09:00
201912_lv1_001_001        OK 2019-12-08T05:20:00+09:00 2019-12-08T05:20:55+09:00

[Note]

Check which of the 201912 backup data will be used in the recovery. The differential/incremental backup data used for recovery can be checked with the --test option of gs_restore. With the --test option, only the data used for recovery is displayed; no data is actually restored. Use it for preliminary checks.

The example above shows the use of the baseline data in the 201912_lv0 directory, differential data (Since) in the 201912_lv1_001_000 directory, and incremental data in the 201912_lv1_001_001 directory for recovery purposes in a recovery with the 201912 BACKUPNAME output.


-bash-4.2$ gs_restore --test 201912

BackupName : 201912
BackupFolder : /var/lib/gridstore/backup

RestoreData           Status StartTime                 EndTime
--------------------------------------------------------------------------------
201912_lv0                OK 2019-09-06T11:39:28+09:00 2019-09-06T11:39:28+09:00
201912_lv1_001_000        OK 2019-09-06T20:01:00+09:00 2019-09-06T20:01:00+09:00
201912_lv1_001_001        OK 2019-09-06T20:04:42+09:00 2019-09-06T20:04:43+09:00

When a specific partition fails, there is a need to check where the latest data of the partition is being maintained.

Execute the gs_backuplist command on each of the nodes constituting the cluster, specifying the ID of the partition you wish to check with the --partitionId option. Use the node backup that contains the largest LSN for recovery.

Perform this on every node constituting the cluster.

$ gs_backuplist -u admin/admin --partitionId=68
 BackupName    ID   LSN
----------------------------------------------------------
 20191018      68   1534
*201911        68   2349
*201912        68   11512

An asterisk “*” is assigned to the BACKUPNAME of a differential/incremental backup.
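Comparing the LSN values across nodes can be scripted. The sketch below assumes the gs_backuplist --partitionId output of each node has been saved to one file per node (node1.txt, node2.txt are hypothetical names; inlined samples are used here so the parsing stands alone):

```shell
# Sketch: pick the file/backup whose LSN column is largest. Header and
# separator lines are skipped; the "*" prefix is stripped from the name.
cat > /tmp/node1.txt <<'EOF'
 BackupName    ID   LSN
----------------------------------------------------------
 20191018      68   1534
*201912        68   11512
EOF
cat > /tmp/node2.txt <<'EOF'
 BackupName    ID   LSN
----------------------------------------------------------
*201912        68   9200
EOF
awk 'FNR > 2 { gsub(/^\*/, "", $1)
               if ($3 + 0 > max) { max = $3 + 0; best = FILENAME ": " $1 } }
     END { print best " (LSN " max ")" }' /tmp/node1.txt /tmp/node2.txt
```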

An execution example to restore backup data is shown below. Restoration is executed with the nodes stopped.

$ mv ${GS_HOME}/data/* ${GS_HOME}/temp/data         # Move data files and checkpoint log files.
$ mv ${GS_HOME}/txnlog/* ${GS_HOME}/temp/txnlog     # Move transaction log files.
$ gs_restore 201912                                 # restoration

The process below is performed by executing a gs_restore command.

Start the node after restoration. See Operations after node startup for the processing after startup.

$ gs_startnode -u admin/admin -w
Recovery from a node failure

When the status of a node becomes ABNORMAL due to a node failure, or a node terminates due to an error, the cause of the error needs to be identified from the event log file.

If there is no failure in the database file, the data in the database file can be recovered simply by removing the cause of the node failure and starting the node.

When the node status becomes ABNORMAL, force the node to terminate once and then investigate the cause of the error first before restarting the node.

Stop a node by force.

$ gs_stopnode -f -u admin/admin -w

Identify the cause of the error, and start the node if it is deemed not to be a database failure. Starting the node rolls forward the transaction log, and the data is recovered to the latest state.

$ gs_startnode -u admin/admin -w

See Operations after node startup for the processing after startup.

Operations after node startup

Perform the following operation after starting a node.

  1. Join node into the cluster
  2. Data consistency check and failover operations
Join node into the cluster

After starting the node, execute the gs_joincluster command with the wait option (-w) to join the recovered node to the cluster.

$ gs_joincluster -u admin/admin -c clusterSeller -n 5 -w
Data consistency check and failover operations

After incorporating a node into a cluster, check the recovery status of the partitions. When a database file is recovered from a backup while the cluster is operating online, the LSN of the partitions maintained online may not match. The command below can be used to investigate the partition details and, by comparing them with the data gathered when collecting container data, to identify the containers included in the lost data.

Use the gs_partition command to get the missing data of a partition. Only partitions with missing data are displayed; if nothing is displayed, no data is missing and there is no problem with data consistency.

$ gs_partition  -u admin/admin --loss
 [
      {
        "all": [
            {
                "address": "192.168.0.1",
                "lsn": 0,
                "port": 10040,
                "status": "ACTIVE"
            },
            {
                "address": "192.168.0.2",
                "lsn": 1207,
                "port": 10040,
                "status": "ACTIVE"
            },
            {
                "address": "192.168.0.3",
                "lsn": 0,
                "port": 10040,
                "status": "ACTIVE"
            }
        ],
        "backup": [],
        "catchup": [],
        "maxLsn": 1408,
        "owner": null,
        "pId": "1",
        "status": "OFF"
    },
:
]

Partition data is deemed to be missing if its LSN differs from the MAXLSN maintained by the master node. The status of the nodes constituting the cluster is ACTIVE, but the status of the partition is OFF. Execute the gs_failovercluster command to incorporate the data directly into the cluster.

$ gs_failovercluster -u admin/admin --repair

At the end of the failover, check that the /cluster/partitionStatus is NORMAL by executing a gs_stat command to the master node, and that there is no missing data in the partition by executing a gs_partition command.

Operations after completion of recovery

After recovery ends, perform a full backup of all the nodes constituting the cluster.

File system level backup and recovery operations

As an alternative to the online backup function of GridDB, an online backup can be performed at the file system level, by backing up the database directory using the snapshot function of LVM or a storage device, or by copying the files directly.

The backup data is saved as a baseline, and the data can then be recovered to the latest version using the automatic log backup function of GridDB.

Online backup by snapshot

It is possible to back up online using the snapshot function of LVM or a storage device. This can significantly reduce backup time and makes it possible to align the recovery points of the nodes in a cluster accurately.

The procedure is as follows.

  1. Disable the periodic checkpoint function.
  2. (Optional) If you use the automatic log backup simultaneously, start the automatic log backup.
    • Use the --skipBaseline option to omit the baseline backup process.
  3. Execute the manual checkpoint and wait to complete.
  4. Take the snapshot including a database file directory.
    • Perform this operation on all nodes at the same time if you want to align the recovery points of all nodes of a cluster accurately.
  5. Copy the database file directory from the snapshot.
  6. (Optional) Remove the unnecessary snapshot.
  7. Re-enable the periodic checkpoint function.

The recovery point of the backup is almost the same as the point at which the snapshot was taken.

[Note]

[Note]

Online backup by file copying

It is possible to perform online backup by file copying using OS commands or backup solutions.

The procedure is as follows.

  1. Disable the periodic checkpoint function.
  2. (Optional) If you use the automatic log backup simultaneously, start the automatic log backup.
    • Use the --skipBaseline option to omit the baseline backup process.
  3. Execute the manual checkpoint and wait to complete.
  4. Copy the transaction log files. Then, copy the data files and the checkpoint log files.
    • Perform this operation on all nodes at the same time if you want to align the recovery points of all nodes of a cluster accurately.
  5. Re-enable the periodic checkpoint function.

The concrete procedure is as follows.

Execute the checkpoint control command to disable the periodic checkpoint function temporarily.

$ gs_checkpoint -u admin/admin --off

If you use the log backup simultaneously, execute the backup command to start the automatic log backup. Specify the “--skipBaseline” option to omit a baseline backup.

$ gs_backup -u admin/admin --mode auto --skipBaseline 201808010300

Execute the manual checkpoint with the wait option (-w).

$ gs_checkpoint -u admin/admin --manual -w

Copy the transaction log files. Then, copy the data files and the checkpoint log files.

$ mkdir -p /mnt/backup/201808010300/txnlog /mnt/backup/201808010300/data
$ cp -p ${GS_HOME}/txnlog/* /mnt/backup/201808010300/txnlog
$ cp -p ${GS_HOME}/data/* /mnt/backup/201808010300/data
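Before re-enabling the periodic checkpoint, the copy can be verified, for example by comparing checksums between the database file directories and the copy. A sketch using demo directories (SRC/DST stand in for ${GS_HOME} and /mnt/backup/201808010300; run such a check before update activity resumes):

```shell
# Sketch: verify the copied files match the originals via md5 checksums.
# SRC/DST are demo placeholders; point them at the real directories.
SRC="${SRC:-/tmp/gs_copy_src}"; DST="${DST:-/tmp/gs_copy_dst}"
mkdir -p "$SRC/data" "$SRC/txnlog" "$DST/data" "$DST/txnlog"
( cd "$SRC" && find data txnlog -type f -exec md5sum {} \; | sort ) > /tmp/src.md5
( cd "$DST" && find data txnlog -type f -exec md5sum {} \; | sort ) > /tmp/dst.md5
diff -q /tmp/src.md5 /tmp/dst.md5 && echo "backup verified"
```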

After the file copying, re-enable the periodic checkpoint function.

$ gs_checkpoint -u admin/admin --on

The recovery point of this backup is almost the same as the latest transaction update point.

When combined with the log backup, the recovery point is almost the same as the last update time of the backup directory.

[Note]

[Note]

Recovery operation and node startup

To restore from backup data taken by a snapshot or by file copying, follow the procedure below.

  1. Check that no node has been started.
    • Check that the cluster definition file is the same as the other nodes in the cluster that the node is joining.
  2. Check that past data files, checkpoint log files, and transaction log files are not left behind in the database file directories (/var/lib/gridstore/data and /var/lib/gridstore/txnlog by default) of the node.
    • Delete if unnecessary and move to another directory if required.
  3. Copy the backup data for restoration to the database file directory.
    • If you want to recover the database to the latest update point using the log backup as well, restore the corresponding log backup data with the restore command, specifying the --updateLogs option.
  4. Start node.

The concrete procedure from step 3 onward is as follows.

Copy the backup data for restoration to the database file directory.

$ cp -p /mnt/backup/201808010300/data/* ${GS_HOME}/data
$ cp -p /mnt/backup/201808010300/txnlog/* ${GS_HOME}/txnlog

If you recover the database to the updated point using the log backup simultaneously, restore the corresponding log backup data with the restore command specifying the updateLogs option.

The recovery point of the log backup is almost the same as the last update time of the backup directory.

$ ls -l /mnt/backup | grep 201808010300
drwx------ 2 gsadm gridstore 4096 Aug  4 14:06 201808010300

After confirming that there are no errors, execute the gs_restore command with the --updateLogs option.

$ gs_restore --updateLogs 201808010300

Start the node after restoration. See Operations after node startup for the processing after startup.

Backup file

Installed directories and files

A directory with the name specified in BACKUPNAME of the backup command will be created under the directory indicated by /dataStore/backupPath in the node definition file to store the following files. In the case of a differential/incremental backup, the BACKUPNAME_lv0 (baseline) and BACKUPNAME_lv1_NNN_MMM (differential/incremental backup) directories are created under the backup directory to store the same set of files.

  1. Backup data file (gs_backup_info.json,gs_backup_info_digest.json)
    • Data such as the backup start time, end time and backup file size, etc., is maintained in gs_backup_info.json as backup time data while digest data is maintained in gs_backup_info_digest.json. Data is output to gs_backuplist based on this file.
  2. Sequence number (gs_lsn_info.json)
    • LSN (Log Sequence Number) indicating the sequence number of the partition update is output. The LSN maintained by the partition at the point the backup is performed is output.
  3. Data file (_part_n.dat), where "n" in "_n." denotes a numerical value.
    • Data files are placed under the data directory/ directory.
    • If data files are set to be split, data files as many as the number of splits (/dataStore/dbFileSplitCount) are created.
  4. Checkpoint log file (_n.cplog), where "n" in "_n." denotes a numerical value.
    • Checkpoint log files are placed under the data directory/ directory.
  5. Transaction log file (_n.xlog), where "n" in "_n." denotes a numerical value.
    • Transaction log files are placed under the txnlog directory/ directory.
    • A new transaction log file is added according to the operation during a full backup or an automatic log backup.
  6. Differential/incremental block log file (gs_log_n_incremental.cplog), where “n” denotes a numerical value.
    • Maintain a checkpoint log file of the update block in the differential/incremental backup.
    • Checkpoint log files are placed under the data directory/ directory.

Deleting unnecessary backup files

Unnecessary backup data can be deleted in BACKUPNAME units by deleting the directories that are no longer required. Since all the management information of the backup data is located under the BACKUPNAME directory, there is no need to delete any other registry data. For a differential/incremental backup, delete the whole group of BACKUPNAME_lv0 and BACKUPNAME_lv1_NNN_MMM directories.
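Deletion by BACKUPNAME can be scripted. The sketch below uses a hypothetical retention count of 3 and a demo directory (point BACKUPDIR at /var/lib/gridstore/backup in practice); note that if differential/incremental backups are present, the whole BACKUPNAME_lv* group must be matched together, which this simple sort-by-name sketch does not account for:

```shell
# Sketch: keep the newest 3 backup directories (by name), delete the rest.
# Demo directories are created here so the sketch is safe to dry-run.
BACKUPDIR="${BACKUPDIR:-/tmp/gs_backup_demo}"
mkdir -p "$BACKUPDIR"/201910 "$BACKUPDIR"/201911 "$BACKUPDIR"/201912 "$BACKUPDIR"/202001
ls -1d "$BACKUPDIR"/*/ | sort | head -n -3 | xargs -r rm -rf --
ls "$BACKUPDIR"
```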

Rolling upgrade

A rolling upgrade makes it possible to upgrade nodes while the cluster is running. By working on one node at a time, detaching it from the cluster, upgrading GridDB on it, and joining it to the cluster again, GridDB on all the nodes is replaced with a newer version.

Follow the procedures below to perform upgrade using the rolling upgrade function.

  1. Make a plan for the operations of rolling upgrade in advance
    • Estimate the time required for the operations. The operations for one node are listed below; estimate the time for each and multiply by the number of nodes. As a guide, the operations other than node start-up (recovery) take about 5 minutes in total.
      • Leave cluster
      • Stop node
      • Installation of GridDB
      • Start-up node (recovery)
      • Join cluster
    • When there are many data updates before leaving the cluster or during the rolling upgrade, recovery may take longer than usual.
  2. Disable the automatic data arrangement setting in a cluster.
    • In a rolling upgrade, the cluster configuration changes repeatedly, so disable autonomous data redistribution while upgrading the nodes. Avoiding redundant data redistribution reduces the processing load and network communication.
    • Executing the gs_goalconf command with the --cluster option disables autonomous data redistribution on all the nodes of the cluster.
    • Example:
      $ gs_goalconf -u admin/admin --off --cluster
      
  3. Confirm the cluster configuration
    • In the rolling upgrade procedure, all the follower nodes are upgraded first and the master node last. Therefore, confirm the cluster configuration before upgrading to decide the order in which to upgrade the nodes.
    • Identify the master node using the gs_config command. All the other nodes are follower nodes.
    • Example:
      $ gs_config -u admin/admin
      
  4. Upgrade all follower nodes one by one
    • Perform the following operations on each follower node, logging in to the node to do so. From the start of these operations until step 5 is finished, SQL operations will result in an error. See “Points to note” for details.
      • a. Acquire the present data distribution setting from the master node. (gs_goalconf)
        • Example:
          $ gs_goalconf -u admin/admin -s MASTER_IP --manual > last_goal.json
          
      • b. Set the data distribution settings to all the nodes so as to detach the target node from a cluster. (gs_goalconf)
        • To detach the node safely, set the data distribution so that the target node does not own any replica. This operation takes approximately (number of partitions × number of nodes / 10) seconds.
        • Since a backup and an owner are switched in some partitions, a client failover may occur. Processing that does not support client failover will result in an error.
        • Example:
          $ gs_goalconf -u admin/admin --manual --leaveNode NODE_IP --cluster
          
      • c. Wait until the partition state of a master node becomes NORMAL. (gs_stat)
        • Example:
          $ gs_stat -u admin/admin -s MASTER_IP | grep partitionStatus
          
      • d. Disable the autonomous data distribution function of all the nodes. (gs_loadbalance)
        • Example:
          $ gs_loadbalance -u admin/admin --off --cluster
          
      • e. Detach a node from a cluster. (gs_leavecluster)
        • Example:
          $ gs_leavecluster -u admin/admin --force -w
          
      • f. Stop the node normally (gs_stopnode)
        • Example:
          $ gs_stopnode -u admin/admin -w
          
      • g. Upgrade GridDB.
      • h. Start the node. (gs_startnode)
        • Example:
          $ gs_startnode -u admin/admin -w
          
      • i. Disable the autonomous data distribution function. (gs_loadbalance)
        • The --cluster option is not needed because the operation is on a single node.
        • Example:
          $ gs_loadbalance -u admin/admin --off
          
      • j. Disable the autonomous data redistribution. (gs_goalconf)
        • The --cluster option is not needed because the operation is on a single node.
        • Example:
          $ gs_goalconf -u admin/admin --off
          
      • k. Join the node to the cluster (gs_joincluster)
        • Example) Cluster name: mycluster, The number of nodes in the cluster: 5
          $ gs_joincluster -u admin/admin -c mycluster -n 5 -w
          
      • l. Wait until the partition state of a master node becomes REPLICA_LOSS. (gs_stat)
        • Example:
          $ gs_stat -u admin/admin -s MASTER_IP | grep partitionStatus
          
      • m. Set the data redistribution setting to the original. (gs_goalconf)
        • This operation takes around the following seconds: number of partitions * number of nodes / 10.
        • Example:
          $ gs_goalconf -u admin/admin --manual --set last_goal.json --cluster
          
      • n. Enable the autonomous data distribution function of all the nodes. (gs_loadbalance)
        • Since a backup and an owner are switched in some partitions, a client failover may occur. Processing that does not support client failover will result in an error.
        • Example:
          $ gs_loadbalance -u admin/admin --on --cluster
          
      • o. Wait until the partition state of the master node becomes NORMAL. (gs_stat)
        • Example:
          $ gs_stat -u admin/admin -s MASTER_IP | grep partitionStatus
          
  5. Upgrade the master node
    • Upgrade the master node identified in step 3. The upgrade procedure is the same as in step 4.
  6. Check that all nodes are the new version (gs_stat)

  7. Enable autonomous data redistribution.
    • Example:
      $ gs_goalconf -u admin/admin --on --cluster
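The wait steps above (c, l, and o) poll the master node until a wanted partition state appears. A sketch of a generic polling helper, with the status command passed in as a parameter so it can be the gs_stat pipeline shown in those steps (the demo call at the bottom uses a stub echo in place of gs_stat):

```shell
# Sketch: poll a status command until the wanted value appears, with a
# bounded number of tries. In real use the command would be
#   gs_stat -u admin/admin -s MASTER_IP | grep partitionStatus
wait_for_status() {
    cmd="$1"; want="$2"; tries="${3:-60}"
    n=0
    while [ "$n" -lt "$tries" ]; do
        sh -c "$cmd" | grep -q "$want" && return 0
        n=$((n + 1))
        sleep 1
    done
    echo "timed out waiting for $want" >&2
    return 1
}
# Demo with a stub in place of gs_stat:
wait_for_status 'echo partitionStatus: NORMAL' 'NORMAL' 3 && echo ok
```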
      

A sample script is available that performs procedures a. to o. for upgrading a node. After installing the server package, the script can be found in the following directories.

$ ls /usr/griddb/sample/ja/rolling_upgrade
Readme.txt  rolling_upgrade_sample.sh

$ ls /usr/griddb/sample/en/rolling_upgrade
Readme.txt  rolling_upgrade_sample.sh

[Note]

[Note]

Event log function

An event log records system operating information and messages related to events, such as exceptions, that occur inside a GridDB node.

An event log is created with the file name gridstore-%Y%m%d-n.log in the directory shown in the environment variable GS_LOG (example: gridstore-20150328-5.log). The file switches at the following timing:

The default maximum number of event log files is 30. When this number is exceeded, files are deleted starting from the oldest. The maximum number can be changed in the node definition file.

The output format of the event log is as follows.


2014-11-12T10:35:29.746+0900 TSOL1234 8456 ERROR TRANSACTION_SERVICE [10008:TXN_CLUSTER_NOT_SERVICING] (nd={clientId=2, address=127.0.0.1:52719}, pId=0, eventType=CONNECT, stmtId=1) <Z3JpZF9zdG9yZS9zZXJ2ZXIvdHJhbnNhY3Rpb25fc2VydmljZS5jcHAgQ29ubmVjdEhhbmRsZXI6OmhhbmRsZUVycm9yIGxpbmU9MTg2MSA6IGJ5IERlbnlFeGNlcHRpb24gZ3JpZF9zdG9yZS9zZXJ2ZXIvdHJhbnNhY3Rpb25fc2VydmljZS5jcHAgU3RhdGVtZW50SGFuZGxlcjo6Y2hlY2tFeGVjdXRhYmxlIGxpbmU9NjExIGNvZGU9MTAwMDg=>
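Given this format (timestamp, host, process ID, severity, category, ...), a day's log can be summarized mechanically. A sketch that tallies entries per severity level, with a trimmed sample inlined in place of the real file (point the path at $GS_LOG in practice):

```shell
# Sketch: count event-log lines per severity (4th whitespace field).
cat > /tmp/gridstore-20150328-5.log <<'EOF'
2014-11-12T10:35:29.746+0900 TSOL1234 8456 ERROR TRANSACTION_SERVICE [10008:TXN_CLUSTER_NOT_SERVICING] ...
2014-11-12T10:36:01.120+0900 TSOL1234 8456 WARNING IO_MONITOR ...
2014-11-12T10:36:02.000+0900 TSOL1234 8456 ERROR CHUNK_MANAGER ...
EOF
awk '{ count[$4]++ } END { for (lvl in count) print lvl, count[lvl] }' \
    /tmp/gridstore-20150328-5.log | sort
```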

The event log output level can be changed online using the gs_logconf command. Change it online when analyzing the details of a problem. However, an online change is temporary and held only in memory. To make a setting permanent, so that it remains valid after the node restarts, change the trace item in the node definition file of each node constituting the cluster.

The current setting can be displayed with the gs_logconf command. Output content varies depending on the version.

$ gs_logconf -u admin/admin
{
    "levels": {
        "CHECKPOINT_FILE": "ERROR",
        "CHECKPOINT_SERVICE": "INFO",
        "CHUNK_MANAGER": "ERROR",
        "CHUNK_MANAGER_IODETAIL": "ERROR",
        "CLUSTER_OPERATION": "INFO",
        "CLUSTER_SERVICE": "ERROR",
        "COLLECTION": "ERROR",
        "DATA_STORE": "ERROR",
        "DEFAULT": "ERROR",
        "EVENT_ENGINE": "WARNING",
        "IO_MONITOR": "WARNING",
        "LOG_MANAGER": "WARNING",
        "MAIN": "WARNING",
        "MESSAGE_LOG_TEST": "ERROR",
        "OBJECT_MANAGER": "ERROR",
        "RECOVERY_MANAGER": "INFO",
        "REPLICATION_TIMEOUT": "WARNING",
        "SESSION_TIMEOUT": "WARNING",
        "SYNC_SERVICE": "ERROR",
        "SYSTEM": "UNKNOWN",
        "SYSTEM_SERVICE": "INFO",
        "TIME_SERIES": "ERROR",
        "TRANSACTION_MANAGER": "ERROR",
        "TRANSACTION_SERVICE": "ERROR",
        "TRANSACTION_TIMEOUT": "WARNING"
    }
}
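
Output like the above can be post-processed with standard JSON tools. The following sketch lists the categories whose output level is set more verbose than ERROR; the abridged sample is taken from the output above, and the level ranking used here is an assumption for illustration.

```python
import json

# Abridged sample of the gs_logconf output shown above.
output = '''{
    "levels": {
        "CHECKPOINT_SERVICE": "INFO",
        "CLUSTER_SERVICE": "ERROR",
        "EVENT_ENGINE": "WARNING",
        "SYSTEM_SERVICE": "INFO"
    }
}'''

# Assumed verbosity ranking, from most to least verbose.
VERBOSITY = {"DEBUG": 0, "INFO": 1, "WARNING": 2, "ERROR": 3}

levels = json.loads(output)["levels"]
verbose = sorted(cat for cat, lv in levels.items()
                 if VERBOSITY.get(lv, 3) < VERBOSITY["ERROR"])
print(verbose)  # ['CHECKPOINT_SERVICE', 'EVENT_ENGINE', 'SYSTEM_SERVICE']
```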

Checking operation state

Performance and statistical information

GridDB performance and statistical information can be checked with the operating command gs_stat. gs_stat displays information common to the cluster as well as performance and statistical information unique to each node.

Among the outputs of the gs_stat command, the performance structure contains the performance and statistical information.

An example of output is shown below. The output contents vary depending on the version.

-bash-4.1$ gs_stat -u admin/admin -s 192.168.0.1:10040
{
    :
    "performance": {
        "batchFree": 0,
        "dataFileSize": 65536,
        "dataFileUsageRate": 0,
        "checkpointWriteSize": 0,
        "checkpointWriteTime": 0,
        "currentTime": 1428024628904,
        "numConnection": 0,
        "numTxn": 0,
        "peakProcessMemory": 42270720,
        "processMemory": 42270720,
        "recoveryReadSize": 65536,
        "recoveryReadTime": 0,
        "sqlStoreSwapRead": 0,
        "sqlStoreSwapReadSize": 0,
        "sqlStoreSwapReadTime": 0,
        "sqlStoreSwapWrite": 0,
        "sqlStoreSwapWriteSize": 0,
        "sqlStoreSwapWriteTime": 0,
        "storeDetail": {
            "batchFreeMapData": {
                "storeMemory": 0,
                "storeUse": 0,
                "swapRead": 0,
                "swapWrite": 0
            },
            "batchFreeRowData": {
                "storeMemory": 0,
                "storeUse": 0,
                "swapRead": 0,
                "swapWrite": 0
            },
            "mapData": {
                "storeMemory": 0,
                "storeUse": 0,
                "swapRead": 0,
                "swapWrite": 0
            },
            "metaData": {
                "storeMemory": 0,
                "storeUse": 0,
                "swapRead": 0,
                "swapWrite": 0
            },
            "rowData": {
                "storeMemory": 0,
                "storeUse": 0,
                "swapRead": 0,
                "swapWrite": 0
            }
        },
        "storeMemory": 0,
        "storeMemoryLimit": 1073741824,
        "storeTotalUse": 0,
        "swapRead": 0,
        "swapReadSize": 0,
        "swapReadTime": 0,
        "swapWrite": 0,
        "swapWriteSize": 0,
        "swapWriteTime": 0,
        "syncReadSize": 0,
        "syncReadTime": 0,
        "totalLockConflictCount": 0,
        "totalReadOperation": 0,
        "totalRowRead": 0,
        "totalRowWrite": 0,
        "totalWriteOperation": 0
    },
    :
}

Information related to performance and statistical information is explained below. The description of the storeDetail structure is omitted as this is internal debugging information.

Output parameters Type Description Event to be monitored
dataFileSize c Data file size (in bytes)  
dataFileUsageRate c Data file usage rate  
checkpointWrite s Write count to the data file by checkpoint processing  
checkpointWriteSize s Write size to the data file by checkpoint processing (byte)  
checkpointWriteTime s Write time to the data file by checkpoint processing (ms)  
checkpointWriteCompressTime s Compression time of write data to the data file by checkpointing process (ms)  
dataFileAllocateSize c The total size of blocks allocated to data files (in bytes)  
currentTime c Current time  
numConnection c Current no. of connections. The number of connections used in transaction processing, not including those used in cluster processing. The value equals (no. of clients) + (no. of replicas × no. of partitions retained). If the number of connections is found to be insufficient when monitoring the log, review the connectionLimit value in the node configuration.
numSession c Current no. of sessions  
numTxn c Current no. of transactions  
peakProcessMemory p Peak value of the memory used in the GridDB server, including the storeMemory value, i.e. the maximum memory size (byte) used in the process. If peakProcessMemory or processMemory is larger than the installed memory of the node and an OS swap occurs, consider adding memory or temporarily lowering the value of storeMemoryLimit.
processMemory c Memory space used by a process (byte)  
recoveryReadSize s Size read from data files in the recovery process (in bytes)  
recoveryReadTime s Time taken to read data files in the recovery process (in milliseconds)  
sqlStoreSwapRead s Read count from the file by SQL store swap processing  
sqlStoreSwapReadSize s Read size from the file by SQL store swap processing (byte)  
sqlStoreSwapReadTime s Read time from the file by SQL store swap processing (ms)  
sqlStoreSwapWrite s Write count to the file by SQL store swap processing  
sqlStoreSwapWriteSize s Write size to the file by SQL store swap processing (byte)  
sqlStoreSwapWriteTime s Write time to the file by SQL store swap processing (ms)  
storeMemory c Memory space used in an in-memory database (byte)  
storeMemoryLimit c Memory space limit used in an in-memory database (byte)  
storeTotalUse c Full data capacity (byte) retained by the nodes, including the data capacity in the database file  
swapRead s Read count from the file by swap processing  
swapReadSize s Read size from the file by swap processing (byte)  
swapReadTime s Read time from the file by swap processing (ms)  
swapWrite s Write count to the file by swap processing  
swapWriteSize s Write size to the file by swap processing (byte)  
swapWriteTime s Write time to the file by swap processing (ms)  
swapWriteCompressTime s Compression time of write data to the file by swap process (ms)  
syncReadSize s Size of files to read from sync data files (in bytes)  
syncReadTime s Time taken to read files from sync data files (in milliseconds)  
totalLockConflictCount s Row lock conflict count  
totalReadOperation s Search process count  
totalRowRead s Row reading count  
totalRowWrite s Row writing count  
totalWriteOperation s Insert and update process count  
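
A simple monitoring check can be built on the fields described above, for example the store memory usage rate (storeMemory against storeMemoryLimit) together with the swap counters. The following is a sketch with hypothetical numbers; the field names follow the table above.

```python
import json

# Hypothetical excerpt of gs_stat output; field names follow the table above.
stat_json = '''{
    "performance": {
        "storeMemory": 858993459,
        "storeMemoryLimit": 1073741824,
        "swapRead": 120,
        "swapWrite": 45
    }
}'''

perf = json.loads(stat_json)["performance"]

# A usage rate staying near 100% together with growing swapRead/swapWrite
# counts suggests storeMemoryLimit may be too small for the workload.
usage_rate = perf["storeMemory"] / perf["storeMemoryLimit"]
swap_activity = perf["swapRead"] + perf["swapWrite"]

print(f"store memory usage: {usage_rate:.0%}")  # store memory usage: 80%
print(f"swap operations:    {swap_activity}")   # swap operations:    165
```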

Container placement information

Containers (tables) and partitioned tables in a GridDB cluster are automatically distributed to each node. By using operation management tools or SQL, it is possible to check which container (table) is placed on each node.

This function is used to:

[Note]

The placement information of containers (tables) is checked by the following methods.

Getting container (table) list of node

To get the container (table) list of a node, use the “Container list screen” of the integrated operation control GUI (gs_admin).

  1. Log in to gs_admin.

  2. After selecting the “ClusterTree” tab in the left tree view and selecting a node, click the “Container” tab in the right frame.

  3. The list of containers placed on the node is displayed.

[Note]

Checking owner node of container (table)

To check the node where a specified container is placed, use gs_sh and the operating command gs_partition.

  1. Execute the gs_sh sub-command “showcontainer” to check the ID of the partition that stores the specified container. The partition ID is displayed as “Partition ID”.

  2. Execute the gs_sh sub-command “configcluster” to identify the master node. “M” is displayed as the “Role” of the master node.

  3. Specify the partition ID identified in step 1 as the argument of -n, and execute gs_partition on the master node. The “/owner/address” field in the displayed JSON shows the owner node of the container (table).

[Example]

$ gs_partition -u admin/admin -n 5
[
    {
        "backup": [],
        "catchup": [],
        "maxLsn": 300008,
        "owner": {
            "address": "192.168.11.10",    -> The IP address of the owner node is 192.168.11.10.
            "lsn": 300008,
            "port": 10010
        },
        "pId": "5",
        "status": "ON"
    }
]
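
The owner node can also be picked out of the gs_partition output programmatically. A sketch, using the JSON array shown above abridged to the fields used here:

```python
import json

# The gs_partition output shown above, abridged to the fields used here.
partition_json = '''[
    {
        "owner": {"address": "192.168.11.10", "lsn": 300008, "port": 10010},
        "pId": "5",
        "status": "ON"
    }
]'''

# Pick out the owner node of each partition in the JSON array.
for part in json.loads(partition_json):
    owner = part["owner"]
    print(f"partition {part['pId']}: owner {owner['address']}:{owner['port']}")
# partition 5: owner 192.168.11.10:10010
```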

[Note]

Checking node of data partition

A partitioned container (table) divides and stores data in multiple internal containers (data partitions). The data distribution of a partitioned container (table) can be obtained by checking the nodes to which these data partitions are allocated.

Check the partition ID of the data partition in the container (table) and search the node to which the data partition is distributed. The procedure is as follows.

  1. Check the ID of the partition which has the data partition of the specified container (table).
  2. Use the partition ID to search for the node to which the data partition is allocated.

[Example]

select DATABASE_NAME, TABLE_NAME, CLUSTER_PARTITION_INDEX from "#table_partitions" where TABLE_NAME='hashTable1';

DATABASE_NAME,TABLE_NAME,CLUSTER_PARTITION_INDEX
public,hashTable1,1
public,hashTable1,93
public,hashTable1,51
public,hashTable1,18
public,hashTable1,32  -> The number of data partitions of 'hashTable1' is 5, and the partition IDs storing them are 1, 93, 51, 18, and 32.
$ gs_partition -u admin/admin -n 1
[
    {
        "backup": [],
        "catchup": [],
        "maxLsn": 200328,
        "owner": {
            "address": "192.168.11.15",    -> The IP address of the owner node is 192.168.11.15.
            "lsn": 200328,
            "port": 10010
        },
        "pId": "1",
        "status": "ON"
    }
]
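
Repeating step 2 for each partition ID from step 1 yields an owner node per data partition, which can then be grouped per node to see the data distribution. A sketch with hypothetical lookup results:

```python
# Hypothetical (partition ID -> owner address) lookup results, as would be
# obtained by running gs_partition for each ID from "#table_partitions".
owners = {1: "192.168.11.15", 93: "192.168.11.10", 51: "192.168.11.15",
          18: "192.168.11.12", 32: "192.168.11.10"}

# Group the data partitions of the table by the node that owns them.
by_node = {}
for pid, address in owners.items():
    by_node.setdefault(address, []).append(pid)

for address in sorted(by_node):
    print(address, sorted(by_node[address]))
# 192.168.11.10 [32, 93]
# 192.168.11.12 [18]
# 192.168.11.15 [1, 51]
```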

[Note]

Operating tools for the system

GridDB provides the following tools for operating clusters and nodes and for manipulating data, such as creating containers and exporting/importing data.

List of operation tools
Name Displayed information
Service Linux service management tools to start and/or stop GridDB nodes.
Integrated operation control GUI (gs_admin) Web-based integrated operation control GUI (gs_admin) for the operating functions of GridDB clusters.
Cluster operation control command interpreter (gs_sh) CUI tool for operation management and data manipulation of GridDB clusters.
Operating commands Commands to perform the operating functions of GridDB clusters.
Exporting/importing tool Export/import data.

Operating commands

The following commands are available in GridDB. All the operating command names of GridDB start with gs_.

Type Command Functions
Node operations gs_startnode start node
  gs_stopnode stop node
Cluster operations gs_joincluster Join a node to a cluster. Join to cluster configuration
  gs_leavecluster Cause a particular node to leave a cluster. Used when detaching a particular node from a cluster for maintenance. The partitions placed on the leaving node are rearranged (rebalanced).
  gs_stopcluster Cause all the nodes constituting a cluster to leave the cluster. Used to stop all the nodes. The partitions are not rebalanced when the nodes leave the cluster.
  gs_config Get cluster configuration data
  gs_stat Get cluster data
  gs_appendcluster Add a node to the cluster in a STABLE state.
  gs_failovercluster Perform a manual failover of a cluster. Also used to start a service while accepting data loss.
  gs_partition Get partition data
  gs_loadbalance Set autonomous data redistribution
User management gs_adduser Registration of administrator user
  gs_deluser Deletion of administrator user
  gs_passwd Change a password of an administrator user
Log data gs_logs Display recent event logs
  gs_logconf Display and change the event log output level
Restoring a backup gs_backup Collect backup data
  gs_backuplist Display backup data list
  gs_restore Restore a backup data
Import/export gs_import Import exported containers and database on the disk
  gs_export Export containers and database as CSV or ZIP format to the disk
Maintenance gs_paramconf Display and change parameters
  gs_authcache List and delete the cached user information used for faster authentication of general users and LDAP users.

Integrated operation control GUI (gs_admin)

The integrated operation control GUI (hereinafter referred to as gs_admin) is a Web application that integrates GridDB cluster operation functions. gs_admin provides an intuitive interface: cluster operation information is presented in one screen (dashboard screen), individual nodes constituting the cluster can be started and stopped, performance information can be checked, and so on.

Gs_admin dashboard screen

gs_admin also supports the following development-support functions, so it can be used effectively during the system development stage.

Cluster operation control command interpreter (gs_sh)

The cluster operation control command interpreter (hereinafter referred to as gs_sh) is a command line interface tool to manage GridDB cluster operations and data operations. While the operating commands provide operations on a per-node basis, gs_sh provides interfaces for processing on a per-cluster basis. In addition to user management operations, it also provides data manipulation such as creating databases, containers, and tables, and searching by TQL or SQL.

There are two start modes in gs_sh. In interactive mode, sub-commands are specified interactively to execute processing; in batch mode, a script file containing a series of operations written as sub-commands is executed. Using batch scripts enables automation of operation verification during development and labor saving in system construction.

// Interactive mode
$ gs_sh
// start gs_sh and execute sub-command "version"
gs> version

// Batch mode: execute a script file specified as an argument
$ gs_sh test.gsh

gs_sh provides cluster operations, such as starting a node and starting a cluster, as well as data manipulation, such as creating containers.

See “GridDB Operation Tools Reference” for the details about gs_sh operations.

— Parameter —

This chapter describes the parameters that control the operations of GridDB. GridDB has two definition files: a node definition file, which configures settings such as the resources usable by a node, and a cluster definition file, which configures the operational settings of a cluster. This chapter explains the meanings of the item names in the definition files, their settings, and the parameters in the initial state.

The units of the settings are as shown below.

Cluster definition file (gs_cluster.json)

The same setting in the cluster definition file needs to be made in all the nodes constituting the cluster. As the partitionNum and storeBlockSize parameters are important parameters to determine the database structure, they cannot be changed after GridDB is started for the first time.

The meanings of the various settings in the cluster definition file are explained below.

An item not included in the initial state can be recognized by the system by adding its name as a property. The change field indicates whether the value of the parameter can be changed and the change timing.

Configuration of GridDB Default Meaning of parameters and limitation values Change
/notificationAddress 239.0.0.1 Standard setting of a multicast address. This setting becomes valid when the cluster or transaction parameter of the same name is omitted. If a different value is set there, the individually set address is valid. Restart
/dataStore/partitionNum 128 Specify the number of partitions as a common multiple of the expected numbers of nodes constituting the cluster, so that partitions can be divided and placed evenly. Integer: Specify an integer that is 1 or higher and 10000 or lower. Disallowed
/dataStore/storeBlockSize 64KB Specify the disk I/O size from 64KB,1MB,4MB,8MB,16MB,32MB. Larger block size enables more records to be stored in one block, suitable for full scans of large tables, but also increases the possibility of conflict. Select the size suitable for the system. Cannot be changed after server is started. Disallowed
/cluster/clusterName - Specify the name for identifying a cluster. Mandatory input parameter. Restart
/cluster/replicationNum 2 Specify the number of replicas. Each partition is duplicated when the number of replicas is 2. Restart
/cluster/notificationAddress 239.0.0.1 Specify the multicast address for cluster configuration Restart
/cluster/notificationPort 20000 Specify the multicast port for cluster configuration. Specify a value within a specifiable range as a multi-cast port no. Restart
/cluster/notificationInterval 5s Multicast period for cluster configuration. Specify the value more than 1 second and less than 2^31 seconds. Restart
/cluster/heartbeatInterval 5s Specify a check period (heart beat period) to check the node survival among clusters. Specify the value more than 1 second and less than 2^31 seconds. Restart
/cluster/loadbalanceCheckInterval 180s To adjust the load balance among the nodes constituting the cluster, specify a data sampling period, as a criteria whether to implement the balancing process or not. Specify the value more than 1 second and less than 2^31 seconds. Restart
/cluster/notificationMember - Specify the address list when using the fixed list method as the cluster configuration method. Restart
/cluster/notificationProvider/url - Specify the URL of the address provider when using the provider method as the cluster configuration method. Restart
/cluster/notificationProvider/updateInterval 5s Specify the interval to get the list from the address provider. Specify the value more than 1 second and less than 2^31 seconds. Restart
/sync/timeoutInterval 30s Specify the timeout time during data synchronization among clusters. If a timeout occurs, the system load may be high, or a failure may have occurred. Specify the value more than 1 second and less than 2^31 seconds. Restart
/transaction/notificationAddress 239.0.0.1 Multi-cast address that a client connects to initially. Master node is notified in the client. Restart
/transaction/notificationPort 31999 Multi-cast port that a client connects to initially. Specify a value within a specifiable range as a multi-cast port no. Restart
/transaction/notificationInterval 5s Multi-cast period for a master to notify its clients. Specify the value more than 1 second and less than 2^31 seconds. Restart
/transaction/replicationMode 0 Specify the data synchronization (replication) method when updating the data in a transaction. Specify a string or integer, “ASYNC”or 0 (non-synchronous), “SEMISYNC”or 1 (quasi-synchronous). Restart
/transaction/replicationTimeoutInterval 10s Specify the timeout time for communications among nodes when synchronizing data in a quasi-synchronous replication transaction. Specify the value more than 1 second and less than 2^31 seconds. Restart
/transaction/authenticationTimeoutInterval 5s Specify the authentication timeout time. Restart
/sql/notificationAddress 239.0.0.1 Multi-cast address when the JDBC/ODBC client is connected initially. Master node is notified in the client. Restart
/sql/notificationPort 41999 Multi-cast port when the JDBC/ODBC client is connected initially. Specify a value within a specifiable range as a multi-cast port no. Restart
/sql/notificationInterval 5s Multi-cast period for a master to notify its JDBC/ODBC clients. Specify the value more than 1 second and less than 2^31 seconds. Restart
/security/authentication INTERNAL Specify either INTERNAL (internal authentication) or LDAP (LDAP authentication) as an authentication method to be used. Restart
/security/ldapRoleManagement USER Specify either USER (mapping using the LDAP user name) or GROUP (mapping using the LDAP group name) as to which one the GridDB role is mapped to. Restart
/security/ldapUrl   Specify the LDAP server with the format: ldaps://host[:port] Restart
/security/ldapUserDNPrefix   To generate the user’s DN (identifier), specify the string to be concatenated in front of the user name. Restart
/security/ldapUserDNSuffix   To generate the user’s DN (identifier), specify the string to be concatenated after the user name. Restart
/security/ldapBindDn   Specify the LDAP administrative user’s DN. Restart
/security/ldapBindPassword   Specify the password for the LDAP administrative user. Restart
/security/ldapBaseDn   Specify the root DN from which to start searching. Restart
/security/ldapSearchAttribute uid Specify the attributes to search for. Restart
/security/ldapMemberOfAttribute memberof Specify the attributes where the group DN to which the user belongs is set (valid if ldapRoleManagement=GROUP).  
/system/serverSslMode DISABLED For SSL connection settings, specify DISABLED (SSL invalid), PREFERRED (SSL valid, but non-SSL connection is allowed as well), or REQUIRED (SSL valid; non-SSL connection is not allowed ). Restart
/system/sslProtocolMaxVersion TLSv1.2 As a TLS protocol version, specify either TLSv1.2 or TLSv1.3. Restart
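
Putting the main parameters above together, a minimal gs_cluster.json fragment might look like the following. The cluster name "myCluster" is a placeholder; the other values shown are the defaults from the table, and only the mandatory clusterName must be filled in.

```json
{
    "cluster": {
        "clusterName": "myCluster",
        "replicationNum": 2,
        "notificationAddress": "239.0.0.1",
        "notificationPort": 20000,
        "heartbeatInterval": "5s"
    },
    "sync": {
        "timeoutInterval": "30s"
    }
}
```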

Node definition file (gs_node.json)

A node definition file defines the default settings of the resources in the nodes constituting a cluster. Some parameter values can also be changed online during operation, based on the deployed resources, access frequency, and so on.

The meanings of the various settings in the node definition file are explained below.

An item not included in the initial state can be recognized by the system by adding its name as a property. The change field indicates whether the value of the parameter can be changed and the change timing.

Specify the directory by specifying the full path or a relative path from the GS_HOME environmental variable. For relative path, the initial directory of GS_HOME serves as a reference point. Initial configuration directory of GS_HOME is /var/lib/gridstore.

Configuration of GridDB Default Meaning of parameters and limitation values Change
/serviceAddress - Set the initial value of each cluster, transaction, sync service address. The initial value of each service address can be set by setting this address only without having to set the addresses of the 3 items. Restart
/dataStore/dbPath data The placement directory of data files and checkpoint log files is specified with the full or relative path. Restart
/dataStore/transactionLogPath txnlog The placement directory of transaction files is specified with the full or relative path. Restart
/dataStore/dbFileSplitCount 0 (no split) Number of data file splits Disallowed
/dataStore/backupPath backup Specify the backup file deployment directory path. Restart
/dataStore/syncTempPath sync Specify the path of the Data sync temporary file directory. Restart
/dataStore/storeMemoryLimit 1024MB Upper memory limit for data management Online
/dataStore/concurrency 4 Specify the concurrency of processing. Restart
/dataStore/logWriteMode 1 Specify the log writing mode and cycle. If the log writing mode value is -1 or 0, log writing is performed at the end of the transaction. If it is 1 or more and less than 2^31, log writing is performed at the period specified in seconds. Restart
/dataStore/persistencyMode 1(NORMAL) In the persistence mode, specify the retention period of an update log file during a data update. Specify either 1 (NORMAL) or 2 (RETAINING_ALL_LOG). In “NORMAL”, a transaction log file which is no longer required is deleted by a checkpoint. In “RETAINING_ALL_LOG”, all transaction log files are retained. Restart
/dataStore/affinityGroupSize 4 Number of affinity groups Restart
/dataStore/storeCompressionMode NO_COMPRESSION Data block compression mode Restart
/checkpoint/checkpointInterval 60s Checkpoint process execution period to perpetuate a data update block in the memory Restart
/checkpoint/partialCheckpointInterval 10 The number of split processes that write block management information to checkpoint log files during a checkpoint. Restart
/cluster/serviceAddress Conforms to the upper serviceAddress Standby address for cluster configuration Restart
/cluster/servicePort 10010 Standby port for cluster configuration Restart
/cluster/notificationInterfaceAddress ”” Specify the address of the interface which sends multicasting packets. Restart
/sync/serviceAddress Conforms to the upper serviceAddress Reception address for data synchronization among the clusters Restart
/sync/servicePort 10020 Standby port for data synchronization Restart
/system/serviceAddress Conforms to the upper serviceAddress Standby address for operation commands Restart
/system/servicePort 10040 Standby port for operation commands Restart
/system/eventLogPath log Event log file deployment directory path Restart
/system/securityPath security Specify the full path or relative path to the directory where the server certificate and the private key are placed. Restart
/system/serviceSslPort 10045 SSL listen port for operation commands Restart
/transaction/serviceAddress Conforms to the upper serviceAddress Standby address for transaction processing for client communication; also used for cluster internal communication when /transaction/localServiceAddress is not specified. Restart
/transaction/localServiceAddress Conforms to the upper serviceAddress Standby address for transaction processing for cluster internal communication Restart
/transaction/servicePort 10001 Standby port for transaction process Restart
/transaction/connectionLimit 5000 Upper limit of the no. of transaction process connections Restart
/transaction/totalMemoryLimit 1024 MB The maximum size of the memory area for transaction processing. Restart
/transaction/transactionTimeoutLimit 300s Transaction timeout upper limit Restart
/transaction/reauthenticationInterval 0s (disabled) Re-authentication interval. (After the specified time has passed, authentication process runs again and updates permissions of the general users who have already been connected.) The default value, 0 sec, indicates that re-authentication is disabled. Online
/transaction/workMemoryLimit 128MB Maximum memory size for data reference (get, TQL) in transaction processing (for each concurrent processing) Online
/transaction/notificationInterfaceAddress ”” Specify the address of the interface which sends multicasting packets. Restart
/sql/serviceAddress Conforms to the upper serviceAddress Standby address for NewSQL I/F access processing for client communication; also used for cluster internal communication when /sql/localServiceAddress is not specified. Restart
/sql/localServiceAddress Conforms to the upper serviceAddress Standby address for NewSQL I/F access processing for cluster internal communication Restart
/sql/servicePort 20001 Standby port for New SQL access process Restart
/sql/storeSwapFilePath swap SQL intermediate store swap file directory Restart
/sql/storeSwapSyncSize 1024MB SQL intermediate store swap file and cache size Restart
/sql/storeMemoryLimit 1024MB Upper memory limit for intermediate data held in memory by SQL processing. Restart
/sql/workMemoryLimit 32MB Upper memory limit for operators in SQL processing Restart
/sql/workCacheMemory 128MB Upper size limit for cache without being released after use of work memory. Restart
/sql/connectionLimit 5000 Upper limit of the no. of connections processed for New SQL access Restart
/sql/concurrency 4 No. of simultaneous execution threads Restart
/sql/traceLimitExecutionTime 300s The lower limit of execution time of a query to write in an event log Online
/sql/traceLimitQuerySize 1000 The upper size limit of character strings in a slow query (byte) Online
/sql/notificationInterfaceAddress ”” Specify the address of the interface which sends multicasting packets. Restart
/trace/fileCount 30 Upper file count limit for event log files. Restart
/security/userCacheSize 1000 Specify the number of entries for general and LDAP users to be cached. Restart
/security/userCacheUpdateInterval 60 Specify the refresh interval for cache in seconds. Restart
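
As with the cluster definition file, a minimal gs_node.json fragment can be assembled from the table above. All the values shown here are the defaults from the table; in practice only the items that deviate from the defaults need to be written.

```json
{
    "dataStore": {
        "dbPath": "data",
        "storeMemoryLimit": "1024MB",
        "concurrency": 4
    },
    "system": {
        "servicePort": 10040,
        "eventLogPath": "log"
    },
    "trace": {
        "fileCount": 30
    }
}
```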

— System limiting values —

Limitations on numerical value

Item Block size 64KB Block size 1MB - 32MB
Block size 64KB 1MB - 32MB
STRING/GEOMETRY data size 31KB 128KB
BLOB data size 1GB - 1Byte 1GB - 1Byte
Array length 4000 65000
No. of columns 1024 Approx. 7K - 32000 (*1)
No. of indexes (Per container) 1024 16000
No. of users 128 128
No. of databases 128 128
Number of affinity groups 10000 10000
No. of divisions in a timeseries container with a cancellation deadline 160 160
Size of communication buffer managed by a GridDB node Approx. 2GB Approx. 2GB

Block size 64KB 1MB 4MB 8MB 16MB 32MB
Partition size Approx. 4TB Approx. 64TB Approx. 256TB Approx. 512TB Approx. 1PB Approx. 2PB
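
The two rows above imply a fixed ratio: each column satisfies partition size ≈ block size × 2^26, so the partition size limit scales linearly with the block size. Since the table lists approximate values, this is a consistency check of those figures, not an exact internal formula:

```python
# Each column of the table above satisfies: partition size = block size * 2**26.
KB, MB, TB, PB = 2**10, 2**20, 2**40, 2**50

table = {64 * KB: 4 * TB, 1 * MB: 64 * TB, 4 * MB: 256 * TB,
         8 * MB: 512 * TB, 16 * MB: 1024 * TB, 32 * MB: 2 * PB}

for block, partition in table.items():
    assert partition == block * 2**26
print("all table entries consistent with 2**26 blocks per partition")
```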

Limitations on naming

Field Allowed characters Maximum length  
Administrator user The head of name is “gs#” and the following characters are either alphanumeric or ‘_’ 64 characters  
General user Alphanumeric, ‘_’, ‘-‘, ‘.’, ‘/’, and ‘=’ 64 characters  
Role Alphanumeric, ‘_’, ‘-‘, ‘.’, ‘/’, and ‘=’ 64 characters  
<Password> Composed of an arbitrary number of characters using Unicode code points 64 bytes (by UTF-8 encoding)  
cluster name Alphanumeric, ‘_’, ‘-‘, ‘.’, ‘/’, and ‘=’ 64 characters  
<Database name> Alphanumeric, ‘_’, ‘-‘, ‘.’, ‘/’, and ‘=’ 64 characters  
Container name
Table name
View name
Alphanumeric, ‘_’, ‘-‘, ‘.’, ‘/’, and ‘=’
(and ‘@’ only for specifying a node affinity)
16384 characters (for 64KB block)
131072 characters (for 1MB - 32MB block)
Column name Alphanumeric, ‘_’, ‘-‘, ‘.’, ‘/’, and ‘=’ 256 characters  
Index name Alphanumeric, ‘_’, ‘-‘, ‘.’, ‘/’, and ‘=’ 16384 characters (for 64KB block)
131072 characters (for 1MB - 32MB block)
 
<Backup name> Alphanumeric and ‘_’ 12 characters  
Data Affinity Alphanumeric, ‘_’, ‘-‘, ‘.’, ‘/’, and ‘=’ 8 characters  
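
A client-side check of a container name against the table above can be sketched as follows. This is a hypothetical validator, not part of the GridDB API; it covers only the character set and length rules listed for container names (alphanumerics plus ‘_’, ‘-’, ‘.’, ‘/’, ‘=’, and ‘@’ only for node affinity), assuming a 64KB block.

```python
import re

# Allowed characters for container names per the table above
# (excluding '@', which is handled separately for node affinity).
NAME_CHARS = re.compile(r"^[0-9A-Za-z_\-./=]+$")

def valid_container_name(name, max_len=16384):
    # At most one '@' is allowed, and it must be followed by an affinity group.
    base, sep, affinity = name.partition("@")
    if sep and ("@" in affinity or not affinity):
        return False
    parts = [base] + ([affinity] if sep else [])
    return 0 < len(name) <= max_len and all(NAME_CHARS.match(p) for p in parts)

print(valid_container_name("sensor_data/2024"))  # True
print(valid_container_name("sensors@group1"))    # True
print(valid_container_name("bad name!"))         # False
```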

— Appendix —

Directory structure

The directory configuration after the GridDB server and client are installed is shown below. X.X.X indicates the GridDB version.

(Machine installed with a server/client)
/usr/griddb-ee-X.X.X/                                    GridDB installation directory
                     Readme.txt
                     bin/
                         gs_xxx                          various commands
                         gsserver                        server module
                         gssvc                           server module
                     conf/
                     etc/
                     lib/
                         gridstore-tools-X.X.X.jar
                         XXX.jar                         Freeware
                     license/
                     misc/
                     prop/
                     sample/

/usr/share/java/gridstore-tools.jar -> /usr/griddb-ee-X.X.X/lib/gridstore-tools-X.X.X.jar

/usr/griddb-ee-webui-X.X.X/                              integrated operation control GUI directory
                           conf/
                           etc/
                           griddb-webui-ee-X.X.X.jar

/usr/griddb-ee-webui/griddb-webui.jar -> /usr/griddb-ee-webui-X.X.X/griddb-webui-ee-X.X.X.jar

/var/lib/gridstore/                                      GridDB home directory (working directory)
                   admin/                                integrated operation control GUI home directory (adminHome)
                   backup/                               backup file directory
                   conf/                                 definition file directory
                        gs_cluster.json                  Cluster definition file
                        gs_node.json                     Node definition file
                        password                         User definition file
                   data/                                 database file directory
                   txnlog/                               transaction log storage directory
                   expimp/                               Export/Import tool directory
                   log/                                  event log directory
                   webapi/                               Web API directory

/usr/bin/
         gs_xxx -> /usr/griddb-ee-X.X.X/bin/gs_xxx                       link to various commands
         gsserver -> /usr/griddb-ee-X.X.X/bin/gsserver                   link to server module
         gssvc -> /usr/griddb-ee-X.X.X/bin/gssvc                         link to server module

/usr/lib/systemd/system
       gridstore.service                            systemd unit file

/usr/griddb-ee-X.X.X/bin
       gridstore                                    rc script

(Machine installed with the library)
/usr/griddb-ee-X.X.X/                                    installation directory
                     lib/
                         gridstore-X.X.X.jar
                         gridstore-advanced-X.X.X.jar
                         gridstore-call-logging-X.X.X.jar
                         gridstore-conf-X.X.X.jar
                         gridstore-jdbc-X.X.X.jar
                         gridstore-jdbc-call-logging-X.X.X.jar
                         gridstore.h
                         libgridstore.so.0.0.0
                         libgridstore_advanced.so.0.0.0
                         python/                         Python library directory
                         nodejs/                         Node.js library directory
                             sample/
                             griddb_client.node
                             griddb_node.js
                         go/                             Go library directory
                             sample/
                             pkg/linux_amd64/griddb/go_client.a
                             src/griddb/go_client/       The source directory of Go library
                         conf/                           
                         javadoc/                           

/usr/griddb-ee-webapi-X.X.X/                             Web API directory
                     conf/
                     etc/
                     griddb-webapi-ee-X.X.X.jar

/usr/griddb-webapi/griddb-webapi.jar -> /usr/griddb-ee-webapi-X.X.X/griddb-webapi-ee-X.X.X.jar

/usr/share/java/gridstore.jar -> /usr/griddb-ee-X.X.X/lib/gridstore-X.X.X.jar
/usr/share/java/gridstore-advanced.jar -> /usr/griddb-ee-X.X.X/lib/gridstore-advanced-X.X.X.jar
/usr/share/java/gridstore-call-logging.jar -> /usr/griddb-ee-X.X.X/lib/gridstore-call-logging-X.X.X.jar
/usr/share/java/gridstore-conf.jar -> /usr/griddb-ee-X.X.X/lib/gridstore-conf-X.X.X.jar
/usr/share/java/gridstore-jdbc.jar -> /usr/griddb-ee-X.X.X/lib/gridstore-jdbc-X.X.X.jar
/usr/share/java/gridstore-jdbc-call-logging.jar -> /usr/griddb-ee-X.X.X/lib/gridstore-jdbc-call-logging-X.X.X.jar


/usr/include/gridstore.h -> /usr/griddb-ee-X.X.X/lib/gridstore.h

/usr/lib64/                                            * For CentOS, /usr/lib64; for Ubuntu Server, /usr/lib/x86_64-linux-gnu.
           libgridstore.so -> libgridstore.so.0
           libgridstore.so.0 -> libgridstore.so.0.0.0
           libgridstore.so.0.0.0 -> /usr/griddb-ee-X.X.X/lib/libgridstore.so.0.0.0
           libgridstore_advanced.so -> libgridstore_advanced.so.0
           libgridstore_advanced.so.0 -> libgridstore_advanced.so.0.0.0
           libgridstore_advanced.so.0.0.0 -> /usr/griddb-ee-X.X.X/lib/libgridstore_advanced.so.0.0.0
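
The shared-library links above form a chain (libgridstore.so → libgridstore.so.0 → libgridstore.so.0.0.0 → the installed file) that the dynamic linker follows at load time. The sketch below reproduces an analogous chain in a temporary directory and resolves it with Python; the paths are illustrative stand-ins, not the actual install layout.

```python
import os
import tempfile

# Build a symlink chain analogous to the libgridstore.so layout above,
# inside a temporary directory (illustrative paths, not a real install).
tmp = tempfile.mkdtemp()
target = os.path.join(tmp, "libgridstore.so.0.0.0")
with open(target, "w") as f:
    f.write("")  # empty stand-in for the real shared object

os.symlink("libgridstore.so.0.0.0", os.path.join(tmp, "libgridstore.so.0"))
os.symlink("libgridstore.so.0", os.path.join(tmp, "libgridstore.so"))

# os.path.realpath follows the whole chain to the final target.
resolved = os.path.realpath(os.path.join(tmp, "libgridstore.so"))
print(resolved.endswith("libgridstore.so.0.0.0"))  # True
```

Resolving through `os.path.realpath` is how you can confirm, for instance, which versioned file a generic `libgridstore.so` link ultimately points at after an upgrade.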