3.12. Query Servers

3.12.1. Query Servers Definition

Changed in version 2.3: Changed configuration method for Query Servers and Native Query Servers.

CouchDB delegates the computation of design document functions to external query servers. An external query server is a separate OS process that communicates with CouchDB over standard input/output using a simple line-based protocol with JSON messages.
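The line-based exchange described above can be sketched as a small read-dispatch-reply loop. This is a toy illustration, not CouchDB's implementation: the command names reset, add_fun and map_doc follow the commonly documented protocol, while the KNOWN_FUNS registry, the by_type function, and the shape of the error reply are assumptions made to keep the sketch short.

```python
import json
import sys

# Toy registry of map functions. A real query server would compile the source
# text sent with "add_fun"; here each source string is simply looked up by
# name (an assumption made to keep the sketch short).
KNOWN_FUNS = {
    "by_type": lambda doc: [[doc.get("type"), 1]],
}

state = {"funs": []}


def handle(message):
    """Return the reply for one decoded protocol message."""
    command, *args = message
    if command == "reset":
        state["funs"] = []
        return True
    if command == "add_fun":
        state["funs"].append(KNOWN_FUNS[args[0]])
        return True
    if command == "map_doc":
        # One list of [key, value] pairs per registered function.
        return [fun(args[0]) for fun in state["funs"]]
    # Assumed error shape for commands this sketch does not know.
    return {"error": "unknown_command", "reason": str(command)}


def serve(stdin=sys.stdin, stdout=sys.stdout):
    """Read one JSON message per line and write one JSON reply per line."""
    for line in stdin:
        if not line.strip():
            continue
        stdout.write(json.dumps(handle(json.loads(line))) + "\n")
        stdout.flush()
```

Calling serve() would start the loop on the process's own standard input and output, which is how such a program would be wired to CouchDB.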

An external query server may be defined with environment variables following this pattern:

COUCHDB_QUERY_SERVER_LANGUAGE="PATH ARGS"

Where:

  • LANGUAGE: the programming language whose code this query server executes. For instance, there are PYTHON, RUBY, CLOJURE and other query servers in the wild. The lowercase form of this value is also used in the language field of a ddoc to determine which query server processes its functions.

    Note that you may set up multiple query servers for the same programming language, but you have to name them differently (like PYTHONDEV etc.).

  • PATH: is a system path to the executable binary program that runs the query server.

  • ARGS: optionally, you may specify additional command line arguments for the executable PATH.
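To make the naming pattern concrete, the following sketch shows how such variables could be turned into a mapping from ddoc language name to command. It is an illustration only: the function name, the executable paths and the --verbose flag are hypothetical, and splitting PATH from ARGS on plain whitespace is an assumption about how the value is parsed.

```python
PREFIX = "COUCHDB_QUERY_SERVER_"


def query_servers_from_env(environ):
    """Map each ddoc language name to its command as [PATH, *ARGS]."""
    servers = {}
    for name, value in environ.items():
        if name.startswith(PREFIX) and value.strip():
            # The ddoc "language" field uses the lowercase form of LANGUAGE.
            language = name[len(PREFIX):].lower()
            # Assumption: PATH and ARGS are separated by plain whitespace.
            servers[language] = value.split()
    return servers


# Hypothetical environment with two query servers for the same language.
example_env = {
    "COUCHDB_QUERY_SERVER_PYTHON": "/usr/local/bin/my-python-qs",
    "COUCHDB_QUERY_SERVER_PYTHONDEV": "/opt/dev/bin/my-python-qs --verbose",
    "HOME": "/home/couchdb",
}
print(query_servers_from_env(example_env))
```

A design document with "language": "pythondev" would then be handled by the second command.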

The default query server is written in JavaScript, running via Mozilla SpiderMonkey. It requires no special environment settings to enable, but is the equivalent of these two variables:

COUCHDB_QUERY_SERVER_JAVASCRIPT="/opt/couchdb/bin/couchjs /opt/couchdb/share/server/main.js"
COUCHDB_QUERY_SERVER_COFFEESCRIPT="/opt/couchdb/bin/couchjs /opt/couchdb/share/server/main-coffee.js"

By default, couchjs limits the maximum runtime allocation to 64 MiB. If you run into out-of-memory issues in your ddoc functions, you can raise the memory limit (here, to 512 MiB):

COUCHDB_QUERY_SERVER_JAVASCRIPT="/usr/bin/couchjs -S 536870912 /usr/share/server/main.js"

For more info about the available options, please consult couchjs -h.
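The value passed to -S in the example above is a byte count. A quick way to derive it for another limit is shown below; the helper name is made up for illustration.

```python
def mib_to_bytes(mib):
    """Convert mebibytes to bytes (1 MiB = 1024 * 1024 bytes)."""
    return mib * 1024 * 1024


print(mib_to_bytes(64))   # 67108864, the default limit
print(mib_to_bytes(512))  # 536870912, the value used in the example
```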

Note

CouchDB versions 3.0.0 to 3.2.2 included a performance regression for custom reduce functions. CouchDB 3.3.0 and later come with an experimental fix to this issue that is included in a separate .js file.

To enable the fix, you need to define a custom COUCHDB_QUERY_SERVER_JAVASCRIPT environment variable as outlined above. The path to couchjs needs to remain the same as in your existing CouchDB installation, and the path to main.js needs to be set to /path/to/couchdb/share/server/main-ast-bypass.js.

With a default installation on Linux systems, this is:

COUCHDB_QUERY_SERVER_JAVASCRIPT="/opt/couchdb/bin/couchjs /opt/couchdb/share/server/main-ast-bypass.js"

See also

The Mango Query Server provides a declarative query language that requires no programming, allowing for easier indexing and finding of data in documents.

The Native Erlang Query Server allows running ddocs written in Erlang natively, bypassing stdio communication and JSON serialization/deserialization round trip overhead.

3.12.2. Query Servers Configuration

[query_server_config]
commit_freq

Specifies the delay in seconds before view index changes are committed to disk. The default value is 5:

[query_server_config]
commit_freq = 5

os_process_limit

Hard limit on the number of OS processes usable by Query Servers. The default value is 100:

[query_server_config]
os_process_limit = 100

Setting os_process_limit too low can result in starvation of Query Servers, and manifest in os_process_timeout errors, while setting it too high can potentially use too many system resources. Production settings are typically 10-20 times the default value.

os_process_soft_limit

Soft limit on the number of OS processes usable by Query Servers. The default value is 100:

[query_server_config]
os_process_soft_limit = 100

Idle OS processes are closed until the total reaches the soft limit.

For example, if the hard limit is 200 and the soft limit is 100, the total number of OS processes will never exceed 200, and CouchDB will close all idle OS processes until it reaches 100, at which point it will leave the rest intact, even if some are idle.
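The interplay of the two limits can be sketched as follows. This models only the behaviour described in the text; the function names are hypothetical and the sketch says nothing about when or how CouchDB actually performs the cleanup.

```python
def idle_processes_to_close(busy, idle, soft_limit):
    """Number of idle processes closed to bring the total toward the soft limit."""
    total = busy + idle
    excess = max(0, total - soft_limit)
    # Only idle processes can be closed; busy ones are left alone.
    return min(idle, excess)


def can_spawn(busy, idle, hard_limit):
    """A new process is allowed only while the total is below the hard limit."""
    return busy + idle < hard_limit


# Hard limit 200, soft limit 100, as in the example above.
print(idle_processes_to_close(busy=80, idle=70, soft_limit=100))   # 50
print(idle_processes_to_close(busy=120, idle=30, soft_limit=100))  # 30
print(can_spawn(busy=150, idle=50, hard_limit=200))                # False
```

In the first case the total drops from 150 to 100 and 20 idle processes are kept. In the second case all idle processes are closed, but the total stays above the soft limit because the remaining 120 are busy.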

reduce_limit

Controls the reduce overflow error, which is raised when the output of a reduce function is too big:

[query_server_config]
reduce_limit = true

Normally, you don’t have to disable this option (by setting it to false), since the main purpose of reduce functions is to reduce the input.
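The idea behind such a check can be illustrated by comparing the serialized size of a reduce function's output with that of its input. The sketch below is hypothetical: the function name, the min_bytes and max_ratio thresholds, and the comparison itself are assumptions chosen for illustration, not the values or logic CouchDB uses.

```python
import json


def reduce_overflows(values, reduced, min_bytes=200, max_ratio=0.5):
    """Flag a reduce output that is large and barely smaller than its input.

    min_bytes and max_ratio are illustrative thresholds, not CouchDB's.
    """
    input_size = len(json.dumps(values))
    output_size = len(json.dumps(reduced))
    return output_size > min_bytes and output_size > input_size * max_ratio


values = list(range(1000))

# A reduce that sums its input shrinks it to a single number.
print(reduce_overflows(values, sum(values)))  # False

# A reduce that returns its input unchanged does not reduce anything.
print(reduce_overflows(values, values))       # True
```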

3.12.3. Native Erlang Query Server

[native_query_servers]

Warning

Due to security restrictions, the Erlang query server is disabled by default.

Unlike the JavaScript query server, the Erlang one does not run in a sandbox mode. This means that Erlang code has full access to your OS, file system and network, which may lead to security issues. While Erlang functions are faster than JavaScript ones, you need to be careful about running them, especially if they were written by someone else.

CouchDB has a native Erlang query server, allowing you to write your map/reduce functions in Erlang.

First, you’ll need to edit your local.ini to include a [native_query_servers] section:

[native_query_servers]
enable_erlang_query_server = true

You will also need to restart the server for this change to take effect.

Let’s try an example of map/reduce functions that count the documents at each revision number (there are x documents at revision “1”, y documents at revision “2”, etc.). Add a few documents to the database, then enter the following functions as a view:

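%% Map Function
%% Emits the first byte of the revision (its leading digit) as the key
%% and the document ID as the value.
fun({Doc}) ->
    <<K,_/binary>> = proplists:get_value(<<"_rev">>, Doc, null),
    V = proplists:get_value(<<"_id">>, Doc, null),
    Emit(<<K>>, V)
end.

%% Reduce Function
%% Counts the emitted values; on rereduce, sums the partial counts instead.
fun(_Keys, Values, ReReduce) ->
    case ReReduce of
        true -> lists:sum(Values);
        false -> length(Values)
    end
end.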

If all has gone well, after running the view you should see a list of the total number of documents at each revision number.
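To show what result to expect, the same counting logic can be simulated outside CouchDB. The sketch below mirrors the intent of the view in plain Python; the sample documents and their revision strings are made up, and the function names are hypothetical.

```python
from collections import defaultdict

# Made-up sample documents; only the leading digit of _rev matters here.
docs = [
    {"_id": "a", "_rev": "1-aaa"},
    {"_id": "b", "_rev": "1-bbb"},
    {"_id": "c", "_rev": "2-ccc"},
]


def map_doc(doc):
    """Emit (first character of the revision, document ID)."""
    return [(doc["_rev"][0], doc["_id"])]


def count_by_key(documents):
    """Group the emitted rows by key and count the rows in each group."""
    counts = defaultdict(int)
    for doc in documents:
        for key, _value in map_doc(doc):
            counts[key] += 1
    return dict(counts)


print(count_by_key(docs))  # {'1': 2, '2': 1}
```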

Additional examples are on the users@couchdb.apache.org mailing list.

3.12.5. Mango

Mango is the query engine that services the _find endpoint.

[mango]
index_all_disabled

Set to true to disable the “index all fields” text index, which can lead to out-of-memory issues when there are documents with nested array fields. Defaults to false:

[mango]
index_all_disabled = false

default_limit

Sets the default number of results that will be returned in a _find response. Individual requests can override this by setting limit directly in the query parameters. Defaults to 25:

[mango]
default_limit = 25

index_scan_warning_threshold

This sets the ratio between documents scanned and results matched that will generate a warning in the _find response. For example, if a query requires reading 100 documents to return 10 rows, a warning will be generated if this value is 10.

Defaults to 10. Setting the value to 0 disables the warning:

[mango]
index_scan_warning_threshold = 10
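The ratio check can be written out as a short sketch. Following the example in the text, a ratio equal to the threshold triggers the warning. Whether the real comparison is strict, and how a query with zero results is treated, are assumptions here, and the function name is hypothetical.

```python
def should_warn(docs_examined, results_returned, threshold=10):
    """Return True when the scanned-to-returned ratio reaches the threshold."""
    if threshold == 0:
        # A threshold of 0 disables the warning.
        return False
    # Assumption: zero results are treated as one to avoid dividing by zero.
    ratio = docs_examined / max(results_returned, 1)
    return ratio >= threshold


print(should_warn(100, 10, threshold=10))  # True: ratio is 10
print(should_warn(50, 10, threshold=10))   # False: ratio is 5
print(should_warn(100, 10, threshold=0))   # False: warning disabled
```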