.. Licensed under the Apache License, Version 2.0 (the "License"); you may not .. use this file except in compliance with the License. You may obtain a copy of .. the License at .. .. http://www.apache.org/licenses/LICENSE-2.0 .. .. Unless required by applicable law or agreed to in writing, software .. distributed under the License is distributed on an "AS IS" BASIS, WITHOUT .. WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the .. License for the specific language governing permissions and limitations under .. the License. .. _partitioned-dbs: ===================== Partitioned Databases ===================== A partitioned database forms documents into logical partitions by using a partition key. All documents are assigned to a partition, and many documents are typically given the same partition key. The benefit of partitioned databases is that secondary indices can be significantly more efficient when locating matching documents since their entries are contained within their partition. This means a given secondary index read will only scan a single partition range instead of having to read from a copy of every shard. As a means to introducing partitioned databases, we'll consider a motivating use case to describe the benefits of this feature. For this example, we'll consider a database that stores readings from a large network of soil moisture sensors. .. note:: Before reading this document you should be familiar with the :ref:`theory ` of :ref:`sharding ` in CouchDB. Traditionally, a document in this database may have something like the following structure: .. code-block:: javascript { "_id": "sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "_rev":"1-14e8f3262b42498dbd5c672c9d461ff0", "sensor_id": "sensor-260", "location": [41.6171031, -93.7705674], "field_name": "Bob's Corn Field #5", "readings": [ ["2019-01-21T00:00:00", 0.15], ["2019-01-21T06:00:00", 0.14], ["2019-01-21T12:00:00", 0.16], ["2019-01-21T18:00:00", 0.11] ] } .. note:: While this example uses IoT sensors, the main thing to consider is that there is a logical grouping of documents. Similar use cases might be documents grouped by user or scientific data grouped by experiment. So we've got a bunch of sensors, all grouped by the field they monitor along with their readouts for a given day (or other appropriate time period). Along with our documents, we might expect to have two secondary indexes for querying our database that might look something like: .. code-block:: javascript function(doc) { if(doc._id.indexOf("sensor-reading-") != 0) { return; } for(var r in doc.readings) { emit([doc.sensor_id, r[0]], r[1]) } } and: .. code-block:: javascript function(doc) { if(doc._id.indexOf("sensor-reading-") != 0) { return; } emit(doc.field_name, doc.sensor_id) } With these two indexes defined, we can easily find all readings for a given sensor, or list all sensors in a given field. Unfortunately, in CouchDB, when we read from either of these indexes, it requires finding a copy of every shard and asking for any documents related to the particular sensor or field. This means that as our database scales up the number of shards, every index request must perform more work, which is unnecessary since we are only interested in a small number of documents. Fortunately for you, dear reader, partitioned databases were created to solve this precise problem. What is a partition? ==================== In the previous section, we introduced a hypothetical database that contains sensor readings from an IoT field monitoring service. In this particular use case, it's quite logical to group all documents by their ``sensor_id`` field. In this case, we would call the ``sensor_id`` the partition key. A good partition has two basic properties. First, it should have a high cardinality. That is, a large partitioned database should have many more partitions than documents in any single partition. A database that has a single partition would be an anti-pattern for this feature. Secondly, the amount of data per partition should be "small". The general recommendation is to limit individual partitions to less than ten gigabytes (10 GB) of data. Which, for the example of sensor documents, equates to roughly 60,000 years of data. .. note:: The ``max_partition_size`` under CouchDB dictates the partition limit. The default value for this option is 10GiB but can be changed accordingly. Setting the value for this option to 0 disables the partition limit. Why use partitions? =================== The primary benefit of using partitioned databases is for the performance of partitioned queries. Large databases with lots of documents often have a similar pattern where there are groups of related documents that are queried together. By using partitions, we can execute queries against these individual groups of documents more efficiently by placing the entire group within a specific shard on disk. Thus, the view engine only has to consult one copy of the given shard range when executing a query instead of executing the query across all ``q`` shards in the database. This mean that you do not have to wait for all ``q`` shards to respond, which is both efficient and faster. Partitions By Example ===================== To create a partitioned database, we simply need to pass a query string parameter: .. code-block:: bash shell> curl -X PUT 'http://adm:pass@127.0.0.1:5984/my_new_db?partitioned=true' {"ok":true} To see that our database is partitioned, we can look at the database information: .. code-block:: bash shell> curl http://adm:pass@127.0.0.1:5984/my_new_db { "cluster": { "n": 3, "q": 8, "r": 2, "w": 2 }, "compact_running": false, "db_name": "my_new_db", "disk_format_version": 7, "doc_count": 0, "doc_del_count": 0, "instance_start_time": "0", "props": { "partitioned": true }, "purge_seq": "0-g1AAAAFDeJzLYWBg4M...", "sizes": { "active": 0, "external": 0, "file": 66784 }, "update_seq": "0-g1AAAAFDeJzLYWBg4M..." } You'll now see that the ``"props"`` member contains ``"partitioned": true``. .. note:: Every document in a partitioned database (except _design and _local documents) must have the format “partition:docid”. More specifically, the partition for a given document is everything before the first colon. The document id is everything after the first colon, which may include more colons. .. note:: System databases (such as _users) are *not* allowed to be partitioned. This is due to system databases already having their own incompatible requirements on document ids. Now that we've created a partitioned database, it's time to add some documents. Using our earlier example, we could do this as such: .. code-block:: bash shell> cat doc.json { "_id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "sensor_id": "sensor-260", "location": [41.6171031, -93.7705674], "field_name": "Bob's Corn Field #5", "readings": [ ["2019-01-21T00:00:00", 0.15], ["2019-01-21T06:00:00", 0.14], ["2019-01-21T12:00:00", 0.16], ["2019-01-21T18:00:00", 0.11] ] } shell> $ curl -X POST -H "Content-Type: application/json" \ http://adm:pass@127.0.0.1:5984/my_new_db -d @doc.json { "ok": true, "id": "sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "rev": "1-05ed6f7abf84250e213fcb847387f6f5" } The only change required to the first example document is that we are now including the partition name in the document id by prepending it to the old id separated by a colon. .. note:: The partition name in the document id is not magical. Internally, the database is simply using only the partition for hashing the document to a given shard, instead of the entire document id. Working with documents in a partitioned database is no different than a non-partitioned database. All APIs are available, and existing client code will all work seamlessly. Now that we have created a document, we can get some info about the partition containing the document: .. code-block:: bash shell> curl http://adm:pass@127.0.0.1:5984/my_new_db/_partition/sensor-260 { "db_name": "my_new_db", "doc_count": 1, "doc_del_count": 0, "partition": "sensor-260", "sizes": { "active": 244, "external": 347 } } And we can also list all documents in a partition: .. code-block:: bash shell> curl http://adm:pass@127.0.0.1:5984/my_new_db/_partition/sensor-260/_all_docs {"total_rows": 1, "offset": 0, "rows":[ { "id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "key":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf", "value": {"rev": "1-05ed6f7abf84250e213fcb847387f6f5"} } ]} Note that we can use all of the normal bells and whistles available to ``_all_docs`` requests. Accessing ``_all_docs`` through the ``/dbname/_partition/name/_all_docs`` endpoint is mostly a convenience so that requests are guaranteed to be scoped to a given partition. Users are free to use the normal ``/dbname/_all_docs`` to read documents from multiple partitions. Both query styles have the same performance. Next, we'll create a design document containing our index for getting all readings from a given sensor. The map function is similar to our earlier example except we've accounted for the change in the document id. .. code-block:: javascript function(doc) { if(doc._id.indexOf(":sensor-reading-") < 0) { return; } for(var r in doc.readings) { emit([doc.sensor_id, r[0]], r[1]) } } After uploading our design document, we can try out a partitioned query: .. code-block:: bash shell> cat ddoc.json { "_id": "_design/sensor-readings", "views": { "by_sensor": { "map": "function(doc) { ... }" } } } shell> $ curl -X POST -H "Content-Type: application/json" http://adm:pass@127.0.0.1:5984/my_new_db -d @ddoc.json { "ok": true, "id": "_design/sensor-readings", "rev": "1-13859808da293bd72fde3b31be97372a" } shell> curl http://adm:pass@127.0.0.1:5984/my_new_db/_partition/sensor-260/_design/sensor-readings/_view/by_sensor {"total_rows":4,"offset":0,"rows":[ {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","0"],"value":null}, {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","1"],"value":null}, {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","2"],"value":null}, {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":["sensor-260","3"],"value":null} ]} Hooray! Our first partitioned query. For experienced users, that may not be the most exciting development, given that the only things that have changed are a slight tweak to the document id, and accessing views with a slightly different path. However, for anyone who likes performance improvements, it's actually a big deal. By knowing that the view results are all located within the provided partition name, our partitioned queries now perform nearly as fast as document lookups! The last thing we'll look at is how to query data across multiple partitions. For that, we'll implement the example sensors by field query from our initial example. The map function will use the same update to account for the new document id format, but is otherwise identical to the previous version: .. code-block:: javascript function(doc) { if(doc._id.indexOf(":sensor-reading-") < 0) { return; } emit(doc.field_name, doc.sensor_id) } Next, we'll create a new design doc with this function. Be sure to notice that the ``"options"`` member contains ``"partitioned": false``. .. code-block:: bash shell> cat ddoc2.json { "_id": "_design/all_sensors", "options": { "partitioned": false }, "views": { "by_field": { "map": "function(doc) { ... }" } } } shell> $ curl -X POST -H "Content-Type: application/json" http://adm:pass@127.0.0.1:5984/my_new_db -d @ddoc2.json { "ok": true, "id": "_design/all_sensors", "rev": "1-4a8188d80fab277fccf57bdd7154dec1" } .. note:: Design documents in a partitioned database default to being partitioned. Design documents that contain views for queries across multiple partitions must contain the ``"partitioned": false`` member in the ``"options"`` object. .. note:: Design documents are either partitioned or global. They cannot contain a mix of partitioned and global indexes. And to see a request showing us all sensors in a field, we would use a request like: .. code-block:: bash shell> curl -u adm:pass http://adm:pass@127.0.0.1:15984/my_new_db/_design/all_sensors/_view/by_field {"total_rows":1,"offset":0,"rows":[ {"id":"sensor-260:sensor-reading-ca33c748-2d2c-4ed1-8abf-1bca4d9d03cf","key":"Bob's Corn Field #5","value":"sensor-260"} ]} Notice that we're not using the ``/dbname/_partition/...`` path for global queries. This is because global queries, by definition, do not cover individual partitions. Other than having the ``"partitioned": false`` parameter in the design document, global design documents and queries are identical in behavior to design documents on non-partitioned databases. .. warning:: To be clear, this means that global queries perform identically to queries on non-partitioned databases. Only partitioned queries on a partitioned database benefit from the performance improvements.