The quasardb daemon is a highly scalable data repository that handles requests from multiple clients. The data is cached in memory and persisted on disk. It can be distributed on several servers to form a cluster.
The persistence layer is based on LevelDB (c) LevelDB authors. All rights reserved. The network distribution uses the Chord protocol.
The quasardb daemon does not require privileges (unless listening on a port under 1024) and can be launched from the command line. From this command line it can safely be stopped with CTRL-C. On UNIX, CTRL-Z will also result in the daemon being suspended.
Important
A valid license is required to run the daemon (see License). The path to the license file is specified by the --license-file option.
| Option | Usage | Default | Global | Req. Version |
|---|---|---|---|---|
| -h | display help | | No | |
| --gen-config | generate default config file | | No | >=1.1.3 |
| -c, --config-file | specify config file | | No | >=1.1.3 |
| --license-file | specify license | qdb_license.txt | No | |
| -d | daemonize | | No | |
| -a | address to listen on | 127.0.0.1:2836 | No | |
| -s | max client sessions | 2000 | No | |
| --partitions | number of partitions | Variable | No | |
| -r | persistence directory | ./db | Yes | |
| --id | set the node id | generated | No | |
| --replication | sets the replication factor | 1 | Yes | |
| --peer | one peer to form a cluster | | No | |
| --transient | disable persistence | | Yes | |
| --sync | sync every disk write | | Yes | |
| --limiter-max-entries-count | max entries in cache | 1000000 | Yes | |
| --limiter-max-bytes | max bytes in cache | Automatic | Yes | |
| --max-depot-size | max db size on node | 0 (disabled) | Yes | >=1.1.3 |
| -o | log on console | | No | |
| -l | log on given file | | No | |
| --log-dump | dump file location | qdb_error_dump.txt | No | |
| --log-syslog | log on syslog | | No | |
| --log-level | change log level | info | No | |
| --log-flush-interval | change log flush | 3 | No | |
When a node connects to a ring, it will first download the configuration of this ring and overwrite its parameters with the ring’s parameters.
This way, you can be sure that parameters are consistent over all the nodes. This is especially important for parameters such as replication where you need all nodes to agree on a single replication factor.
This is also important for persistence, as a mix of transient and non-transient nodes will result in undefined behaviour and unwanted data loss.
However, not all options are taken from the ring. It makes sense to have a heterogeneous logging threshold, for example, as you may want to analyze the behaviour of a specific part of your cluster.
In addition, some parameters are node specific, such as the listening address or the node ID.
An option that applies cluster-wide is said to be global, whereas other options are said to be local. The value of a global option is set by the first node that creates the ring; all other nodes copy it. Local options, on the other hand, are read from the command line as you run the daemon.
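The split between global and local options can be pictured as a merge over two groups of settings. The sketch below only illustrates the behaviour described above; it is not the daemon's implementation, and the function name `join_ring` and the sample dictionaries are hypothetical.

```python
import copy


def join_ring(node_config, ring_global):
    """Return the settings a node would run with after joining a ring.

    Hypothetical helper: global options are copied from the ring,
    local options are kept from the node's own settings.
    """
    effective = copy.deepcopy(node_config)
    effective["global"] = copy.deepcopy(ring_global)
    return effective


# Global settings chosen by the node that created the ring.
ring_global = {"replication": 3, "transient": False}

# A second node started with different global settings and its own local ones.
node_config = {
    "global": {"replication": 1, "transient": True},
    "local": {"log_level": "debug", "address": "192.168.1.2:2836"},
}

effective = join_ring(node_config, ring_global)
print(effective["global"]["replication"])  # 3, taken from the ring
print(effective["local"]["log_level"])     # debug, kept from the node
```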
qdbd distribution is peer-to-peer. This means:
Each server within one cluster needs:
Note
It’s counter-productive to run several instances on the same node. qdbd is hyper-scalar and will use all the memory and processors of your server. The same applies to virtual machines: running quasardb multiple times in multiple virtual machines on a single physical server will not increase performance.
The daemon will automatically launch an appropriate number of threads to handle connection accepts and requests, depending on the actual hardware configuration of your server.
The replication factor (--replication) is the number of copies for any given entry within the cluster. Each copy is made on a different node; this implies that a replication factor greater than the number of nodes will be lowered to the actual number of nodes.
The purpose of replication is to increase fault tolerance at the cost of decreased write performance.
For example a cluster of three nodes with a replication factor of four (4) will have an effective replication factor of three (3). If a fourth node is added, effective replication will be increased to four automatically.
By default the replication factor is one (1) which is equivalent to no replication. A replication factor of two (2) means that each entry has got a backup copy. A replication factor of three (3) means that each entry has got two (2) backup copies. The maximum replication factor is four (4).
When adding an entry to a node, the call returns only when the add and all replications have been successful. If a node leaves or joins the ring, replication and migration occur automatically as soon as possible.
Replication is a cluster-wide parameter.
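The effective replication factor described above comes down to a simple minimum. The following is a minimal sketch of that arithmetic, with hypothetical function names; it does not reflect how the daemon computes it internally.

```python
MAX_REPLICATION_FACTOR = 4  # maximum stated in the text above


def effective_replication(requested, node_count):
    """Number of copies actually kept for each entry (illustrative only)."""
    if not 1 <= requested <= MAX_REPLICATION_FACTOR:
        raise ValueError("replication factor must be between 1 and 4")
    if node_count < 1:
        raise ValueError("a cluster has at least one node")
    # Each copy lives on a different node, so the node count caps the factor.
    return min(requested, node_count)


def backup_copies(requested, node_count):
    """Copies in addition to the original entry."""
    return effective_replication(requested, node_count) - 1


print(effective_replication(4, 3))  # 3
print(effective_replication(4, 4))  # 4
print(backup_copies(2, 5))          # 1
```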
By default, all logging is disabled.
The daemon can log to the console (-o), to a file (-l) or to the syslog (--log-syslog) on Unix.
There are six different log levels: detailed, debug, info, warning, error and panic. You can change the log level with --log-level; it defaults to info.
You can also change the log flush interval (--log-flush-interval), which defaults to three (3) seconds.
Note
Persistence options are global for any given ring.
Data is persisted on disk, by default in a db directory under the current working directory. You can change this to any directory you want using the -r option. All nodes will use the same directory as this is a global parameter.
Caution
Never operate directly on files in the persistence directory; use the provided tools (see quasardb database tool). Never save any other file in this directory, as it might be deleted or modified by the daemon.
Data persistence on disk is buffered: when a user request ends, the data may or may not be persisted on the disk yet. Still, the persistence layer guarantees the data is consistent at all times, even in case of hardware or software failure.
Should you need every write to be synced to disk, you can do so with the --sync option. Syncing every write to disk negatively impacts performance while slightly increasing reliability.
You can also disable the persistence altogether (--transient), making quasardb a pure in-memory repository.
Caution
If you disable the persistence, evicted entries are lost.
A partition can be seen as a worker thread. The more partitions, the more work can be done in parallel. However if the number of partitions is too high relative to your server capabilities to actually do parallel work, performance will decrease.
quasardb is highly scalable and partitions do not interfere with each other. The daemon’s scheduler will assign incoming requests to the partition with the least workload.
The ideal number of partitions is close to the number of physical cores your server has. By default the daemon chooses the best compromise it can. If this value is not satisfactory, you can use the --partitions option to set the value manually.
Note
Unless a performance issue is identified, it is best to let the daemon compute the partition count.
In order to achieve high performance, the daemon keeps as much data as possible in memory. However, the physical memory available for a node may not suffice.
Therefore, entries are evicted from the cache when the entry count or the size of data in memory exceeds a configurable threshold. Use the --limiter-max-entries-count (defaults to 1,000,000) and --limiter-max-bytes (defaults to half the available physical memory) options to configure these thresholds.
Note
The memory usage (bytes) limit includes the alias and content for each entry, but doesn’t include bookkeeping, temporary copies or internal structures. Thus, the daemon memory usage may slightly exceed the specified maximum memory usage.
The quasardb daemon uses a proprietary fast Monte Carlo eviction heuristic. This algorithm is currently not configurable.
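As an illustration of the two thresholds, the sketch below checks whether a set of entries would trigger eviction. It only models the accounting rule stated above (alias plus content); it does not model which entries get evicted. The function names and the UTF-8 measurement of aliases are assumptions.

```python
def entry_size(alias, content):
    """Bytes counted toward the limit: alias plus content only."""
    # Assumption: the alias is measured as UTF-8 bytes.
    return len(alias.encode("utf-8")) + len(content)


def over_limit(entries, max_entries=1_000_000, max_bytes=None):
    """Return True when either configured threshold is exceeded.

    entries maps an alias (str) to its content (bytes).
    max_bytes=None skips the byte check in this sketch.
    """
    if len(entries) > max_entries:
        return True
    if max_bytes is None:
        return False
    used = sum(entry_size(alias, content) for alias, content in entries.items())
    return used > max_bytes


cache = {"a": b"x" * 10, "bb": b"y" * 20}  # 11 + 22 = 33 bytes counted
print(over_limit(cache, max_entries=2, max_bytes=33))  # False
print(over_limit(cache, max_entries=1))                # True
print(over_limit(cache, max_bytes=32))                 # True
```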
Parameters can be supplied in any order and are prefixed with --. The arguments format is parameter dependent.
Instance specific parameters only apply to the instance, while global parameters are for the whole ring. Global parameters are applied when the first instance of a ring is launched.
Displays basic usage information.
To display the online help, type:
qdbd --help
Generates a JSON configuration file with default values and prints it to STDOUT.
To create a new config file with the name “qdbd_default_config.json”, type:
qdbd --gen-config > qdbd_default_config.json
Note
The --gen-config argument is only available with QuasarDB 1.1.3 or higher.
Specifies a configuration file to use. See Config File Reference.
- Any other command-line options will be ignored.
- If an option is omitted in the config file, the default will be used.
- If an option is malformed in the config file, it will be ignored.
To use a configuration file named “qdbd_default_config.json”, type:
qdbd --config-file=qdbd_default_config.json
Note
The --config-file argument is only available with QuasarDB 1.1.3 or higher.
Specifies the location of the license file. A valid license is required to run the daemon (see License).
Load the license from license.txt:
qdbd --license-file=license.txt
Runs the server as a daemon (UNIX only). In this mode, the process will fork and prevent console interactions. This is the recommended running mode for UNIX environments.
To run as a daemon:
qdbd -d
Note
Logging to the console is not allowed when running as a daemon.
Specifies the address and port on which the server will listen.
Listen on localhost and the port 5910:
qdbd --address=localhost:5910
Note
The unspecified address (0.0.0.0 for IPv4, :: for IPv6) is not allowed.
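A deployment script can check an address before passing it to the daemon. The sketch below is a hypothetical pre-flight check, not part of quasardb; the bracket notation for IPv6 literals is an assumption about the accepted syntax.

```python
import ipaddress


def parse_listen_address(value):
    """Split 'host:port' and reject the unspecified address."""
    host, sep, port_text = value.rpartition(":")
    if not sep or not host:
        raise ValueError("expected host:port")
    port = int(port_text)  # raises ValueError if not a number
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    try:
        # Assumption: IPv6 literals are written in brackets, e.g. [::1]:2836.
        ip = ipaddress.ip_address(host.strip("[]"))
    except ValueError:
        ip = None  # not an IP literal, treat it as a host name
    if ip is not None and ip.is_unspecified:
        raise ValueError("the unspecified address is not allowed")
    return host, port


print(parse_listen_address("localhost:5910"))  # ('localhost', 5910)
```

Calling `parse_listen_address("0.0.0.0:2836")` or `parse_listen_address("[::]:2836")` raises `ValueError` in this sketch.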
Specifies the number of simultaneous sessions per partition.
Allow 10,000 simultaneous sessions:
qdbd --sessions=10000
Note
The sessions count determines the number of simultaneous clients the server may handle at any given time. Increasing the value increases the memory load. This value may be limited by your license.
Specifies the number of partitions.
Have 10 partitions:
qdbd --partitions=10
Note
This value should be changed only in case of performance problems.
Sets the timeout after which inactive sessions will be considered for termination.
Set the timeout to one minute:
qdbd --idle-duration=60
Sets the timeout after which a request from the server to another server must be considered to have timed out.
Set the timeout to two minutes:
qdbd --request-timeout=120
Sets the node ID.
Set the node ID to 1-a-2-b:
qdbd --id=1-a-2-b
Warning
Having two nodes with the same ID on the ring leads to undefined behaviour. By default the daemon generates an ID that is guaranteed to be unique on any given ring. Only modify the node ID if the topology of the ring is unsatisfactory and you are certain no two node IDs are the same.
The address and port of a peer to which to connect within the cluster. It can be any server belonging to the cluster.
Join a cluster where the machine 192.168.1.1 listening on the port 2836 is already connected:
qdbd --peer=192.168.1.1:2836
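To show how the options combine when forming a cluster, the sketch below builds one command line per node: the first node creates the ring and sets the global replication factor, and every other node joins through it. The helper `cluster_commands` is hypothetical; it only uses the --address, --peer and --replication options documented here, and it prints the commands without running them.

```python
def cluster_commands(addresses, replication=1):
    """Build one qdbd argument list per node (illustrative only)."""
    if not addresses:
        raise ValueError("at least one address is required")
    first = addresses[0]
    # The first node creates the ring, so it carries the global option.
    commands = [["qdbd", f"--address={first}", f"--replication={replication}"]]
    for address in addresses[1:]:
        # Any member of the cluster can serve as peer; the first node is used here.
        commands.append(["qdbd", f"--address={address}", f"--peer={first}"])
    return commands


nodes = ["192.168.1.1:2836", "192.168.1.2:2836", "192.168.1.3:2836"]
for argv in cluster_commands(nodes, replication=2):
    print(" ".join(argv))
```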
Specifies the dump file location. The dump file is a text file that is written to when quasardb detects a critical error.
Dump to /var/log/qdb_error_dump.log:
qdbd --log-dump=/var/log/qdb_error_dump.log
Activates logging on the console.
Activates logging to one or several files.
Log in /var/log/qdbd.log:
qdbd --log-file=/var/log/qdbd.log
UNIX only, activates logging to syslog.
Specifies the log verbosity.
A string representing the amount of logging required. Must be one of: detailed, debug, info, warning, error, panic.
Request a debug level logging:
qdbd --log-level=debug
How frequently log messages are flushed to output, in seconds.
Flush the log every minute:
qdbd --log-flush-interval=60
Specifies the replication factor (global parameter).
Have one backup copy of every entry in the cluster (two copies in total):
qdbd --replication=2
Specifies the directory where data will be persisted for the node where the process has been launched.
Persist data in /var/quasardb/db:
qdbd --root=/var/quasardb/db
Note
Although this parameter is global, the directory refers to the local node of each instance.
Sync every disk write. By default, disk writes are buffered. This option disables the buffering and makes sure every write is synced to disk. (global parameter)
Note
This option increases reliability at the cost of performance.
The maximum memory usable by entries, in bytes. Entries will be evicted as needed to enforce this limit. Both the alias length and the content size count toward the size of an entry in memory. The server may use more than the specified amount of memory because of internal data structures and temporary copies. (global parameter)
To allow only 100 KiB of entries:
qdbd --limiter-max-bytes=102400
To allow up to 8 GiB:
qdbd --limiter-max-bytes=8589934592
Note
Setting this value too high may lead to thrashing.
The maximum number of entries allowed in memory. Entries will be evicted as needed to enforce this limit.
To keep the number of entries in memory below 101:
qdbd --limiter-max-entries-count=100
Note
Setting this value too low may cause the server to spend more time evicting entries than processing requests.
Sets the maximum amount of disk usage for each node’s database in bytes. Any write operations that would overflow the database will return a qdb_e_system error stating “disk full”.
Due to excessive meta-data or uncompressed db entries, the actual database size may exceed this set value by up to 20%.
To limit the database size on each node to 12 Terabytes, convert the size to bytes: 12 × 1024^4 = 13,194,139,533,312.
And thus the command:
qdbd --max-depot-size=13194139533312
This database may expand out to approximately 14.4 Terabytes due to meta-data and uncompressed db entries.
This example will limit the database size to ensure it fits within 1 Terabyte of free space. Since limiting to a specific overhead is important in this example, the filesystem cluster size is also taken into account; the default for most filesystems is 4096 bytes. Keeping 80% of the space and subtracting one cluster gives (1024^4 × 0.8) - 4096 = 879,609,298,124.8.
And thus the command, truncating down to an integer:
qdbd --max-depot-size=879609298124
This database should not exceed 1 Terabyte.
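The two values above can be reproduced with a few lines of integer arithmetic. In this sketch the function name and parameters are hypothetical, and the 80% factor is inferred from the second command-line value rather than stated by the documentation.

```python
TIB = 1024 ** 4  # bytes in one Terabyte as used in the examples


def max_depot_size(budget_bytes, reserve_overhead=False, fs_cluster_size=4096):
    """Value to pass to --max-depot-size for a given disk budget.

    With reserve_overhead=True, keep 80% of the budget and subtract one
    filesystem cluster, so the possible 20% overshoot still fits.
    """
    if not reserve_overhead:
        return budget_bytes
    # Integer arithmetic truncates down, as in the second example.
    return (budget_bytes * 8) // 10 - fs_cluster_size


print(max_depot_size(12 * TIB))                    # 13194139533312
print(max_depot_size(TIB, reserve_overhead=True))  # 879609298124
```

In the first case the database may still grow to about 1.2 times the limit, which is where the 14.4 Terabytes figure comes from.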
Note
The --max-depot-size argument is only available with QuasarDB 1.1.2 or higher.
Note
Using a max depot size may cause a slight performance penalty on writes.
As of QuasarDB version 1.1.3, the qdbd daemon can read its parameters from a JSON configuration file provided by the -c command-line argument. Using a configuration file is recommended.
Some things to note when working with a configuration file:
- If a configuration file is specified, all other command-line options will be ignored. Only values from the configuration file will be used.
- The configuration file must be valid JSON in ASCII format.
- If a key or value is missing from the configuration file or malformed, the default value will be used.
- If a key or value is unknown, it will be ignored.
The default configuration file is shown below:
{
    "global":
    {
        "depot":
        {
            "history": false,
            "max_bytes": 0,
            "max_transaction_duration": 300,
            "max_versions": 7,
            "replication_factor": 1,
            "root": "db",
            "storage_warning_interval": 3600,
            "storage_warning_level": 90,
            "sync": false,
            "transient": false
        },
        "limiter":
        {
            "max_bytes": 0,
            "max_in_entries_count": 1000000
        }
    },
    "local":
    {
        "chord":
        {
            "bootstrapping_peers": [ ],
            "no_stabilization": false,
            "node_id": "0-0-0-0"
        },
        "logger":
        {
            "dump_file": "qdb_error_dump.txt",
            "flush_interval": 3,
            "log_files": [ ],
            "log_level": 2,
            "log_to_console": false,
            "log_to_syslog": false
        },
        "network":
        {
            "client_timeout": 60,
            "idle_timeout": 600,
            "listen_on": "127.0.0.1:2836",
            "partitions_count": 13,
            "server_sessions": 2000
        },
        "user":
        {
            "daemon": false,
            "license_file": "qdb_license.txt"
        }
    }
}
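The rules for missing, malformed and unknown keys can be pictured as an overlay of the user's file on the defaults. The sketch below is one possible reading of those rules, not the daemon's parser: the function `resolve` is hypothetical, and treating "malformed" as "wrong JSON type" is an assumption. Only a small subset of the default file is used.

```python
import json


def resolve(user, defaults):
    """Overlay user settings on defaults, key by key (illustrative only)."""
    resolved = {}
    for key, default in defaults.items():  # unknown user keys are never visited
        value = user.get(key, default)     # missing key: use the default
        if isinstance(default, dict):
            nested = value if isinstance(value, dict) else {}
            resolved[key] = resolve(nested, default)
        elif type(value) is type(default):
            resolved[key] = value
        else:
            resolved[key] = default        # malformed value: use the default
    return resolved


defaults = {
    "local": {
        "logger": {"log_level": 2, "log_to_console": False},
        "network": {"listen_on": "127.0.0.1:2836"},
    }
}

user = json.loads("""
{
    "local": {
        "logger": {"log_level": "debug", "log_to_console": true, "colour": "red"}
    }
}
""")

print(json.dumps(resolve(user, defaults), indent=4))
```

Here `log_level` falls back to 2 because a string was given where an integer is expected, `log_to_console` is taken from the user's file, `colour` is dropped, and the missing `network` section comes from the defaults.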
An integer representing the maximum amount of disk usage for each node’s database in bytes. Any write operations that would overflow the database will return a qdb_e_system error stating “disk full”.
Due to excessive meta-data or uncompressed db entries, the actual database size may exceed this set value by up to 20%.
See --max-depot-size for more details and examples to calculate the max_bytes value.
An integer representing the maximum guaranteed duration of a transaction, in seconds.
An integer between 1 and 4 (inclusive) specifying the replication factor for the cluster. A higher value means more copies of each entry, each stored on a different node.
A string representing the relative or absolute path to the directory where data will be stored.
An integer representing how often quasardb will emit a warning about depleting disk space, in seconds. See also global::depot::storage_warning_level.
An integer between 50 and 100 (inclusive) specifying the percentage of disk usage at which a warning about depleting disk space will be emitted. See also global::depot::storage_warning_interval.
A boolean representing whether or not the node should sync to the underlying filesystem for each write command.
A boolean representing whether or not persistence is disabled. If true, data is kept in memory only and nothing is persisted on the hard drive.
An integer representing the maximum amount of memory usage in bytes for each node’s cache. Once this value is reached, the quasardb daemon will evict entries from memory to ensure it stays below the byte limit.
An integer representing the maximum number of entries that can be stored in memory. Once this value is reached, the quasardb daemon will evict entries from memory to ensure it stays below the entry limit.
An array of strings representing other nodes in the cluster which will bootstrap this node upon startup. The string can be a host name or an IP address. Must have name or IP separated from port with a colon.
A read-only boolean value representing whether or not this node should stabilize upon startup. Even if set to true, stabilization will still occur.
A string in the form hex-hex-hex-hex, where hex is a hexadecimal number lower than 2^64, representing the 256-bit ID to use. If left at the default of 0-0-0-0, the daemon will assign a random node ID at startup. Contact a quasardb representative before changing this from the default value.
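The hex-hex-hex-hex format can be checked before a value is written to the configuration file. This is a minimal sketch with hypothetical function names; it validates the format described above and nothing more.

```python
import re

# One to sixteen hexadecimal digits, which keeps each part below 2^64.
_HEX_PART = re.compile(r"[0-9a-fA-F]{1,16}")


def parse_node_id(text):
    """Return the four 64-bit parts of a node ID as integers."""
    parts = text.split("-")
    if len(parts) != 4:
        raise ValueError("expected four parts separated by '-'")
    for part in parts:
        if not _HEX_PART.fullmatch(part):
            raise ValueError(f"not a hexadecimal number below 2^64: {part!r}")
    return tuple(int(part, 16) for part in parts)


def is_default_node_id(text):
    """True for 0-0-0-0, which lets the daemon pick a random ID."""
    return parse_node_id(text) == (0, 0, 0, 0)


print(parse_node_id("1-a-2-b"))       # (1, 10, 2, 11)
print(is_default_node_id("0-0-0-0"))  # True
```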
A string representing the relative or absolute path to the system error dump file.
An integer representing how frequently quasardb log messages should be flushed to the log locations, in seconds.
An array of strings representing the relative or absolute paths to the quasardb log files.
An integer representing the verbosity of the log output. Acceptable values are:
0 = detailed (most output)
1 = debug
2 = info (default)
3 = warning
4 = error
5 = panic (least output)
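The command line takes the level as a name while the configuration file takes it as an integer. The sketch below maps one to the other using the list above; the helper names are hypothetical, and the filtering rule in `is_logged` (messages at or above the configured level are kept) is an assumption based on "most output" and "least output".

```python
# Position in the tuple is the integer used in the configuration file.
LOG_LEVELS = ("detailed", "debug", "info", "warning", "error", "panic")


def log_level_number(name):
    """Integer for a level name, e.g. for the log_level key."""
    return LOG_LEVELS.index(name)  # raises ValueError for unknown names


def is_logged(message_level, configured_level="info"):
    """Assumed rule: keep messages at or above the configured level."""
    return log_level_number(message_level) >= log_level_number(configured_level)


print(log_level_number("info"))      # 2
print(is_logged("debug"))            # False
print(is_logged("error", "warning")) # True
```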
A boolean value representing whether or not the quasardb daemon should log to the console it was spawned from.
A boolean value representing whether or not the quasardb daemon should log to the syslog.
An integer representing the number of seconds after which a client session will be considered for termination.
An integer representing the number of seconds after which an inactive session will be considered for termination.
A string representing the address and port the daemon should listen on. The string can be a host name or an IP address. Must have name or IP separated from port with a colon.
An integer representing the number of partitions, or worker threads, quasardb can spawn to perform operations. The ideal number of partitions is close to the number of physical cores your server has. If left to its default value of 0, the daemon will choose the best compromise it can.
An integer representing the number of server sessions the quasardb daemon can provide.
A boolean value representing whether or not the quasardb daemon should daemonize on launch.
A string representing the relative or absolute path to the license file.