Crate SQL HTTP Endpoint — Crate documentation
- ️Fri Oct 12 1979
Crate provides a HTTP Endpoint that can be used to submit SQL queries. The endpoint is accessible under <servername:port>/_sql
SQL statements are sent to the _sql endpoint in json
format,
whereby the statement is sent as value associated to the key stmt
.
A simple SELECT statement can be submitted like this:
sh$ curl -sSXPOST '127.0.0.1:4200/_sql?pretty' -d '{ ... "stmt":"select name, position from locations order by id limit 2" ... }' { "cols" : [ "name", "position" ], "duration" : ..., "rows" : [ [ "North West Ripple", 1 ], [ "Arkintoofle Minor", 3 ] ], "rowcount" : 2 }
Parameter Substitution¶
In addition to the stmt
key the request body may also contain an args
key
which can be used for SQL parameter substitution.
The SQL statement has to be changed to use placeholders where the values should
be inserted. Placeholders can either bei numbered (in the form of $1
, $2
,
etc.) or unnumbered using a question mark ?
.
The placeholders will then be substituted with values from an array that is
expected under the args
key:
sh$ curl -sSXPOST '127.0.0.1:4200/_sql?pretty' -d@- <<- EOF ... {"stmt": ... "select date,position from locations ... where date <= \$1 and position < \$2 order by position", ... "args": ["1979-10-12", 3] ... } ... EOF { "cols" : [ "date", "position" ], "duration" : ..., "rows" : [ [ 308534400000, 1 ], [ 308534400000, 2 ] ], "rowcount" : 2 }
Note
In this example the placeholders start with an backslash due to shell escaping.
Warning
Parameter substition must not be used within subscript notation. For example, column[?] is not allowed.
The same query using question marks as placeholders looks like this:
sh$ curl -sSXPOST '127.0.0.1:4200/_sql?pretty' -d@- <<- EOF ... {"stmt": ... "select date,position from locations ... where date <= ? and position < ? order by position", ... "args": ["1979-10-12", 3] ... } ... EOF { "cols" : [ "date", "position" ], "duration" : ..., "rows" : [ [ 308534400000, 1 ], [ 308534400000, 2 ] ], "rowcount" : 2 }
Note
With some queries the row count is not ascertainable. In this cases rowcount is -1.
Column Types¶
Crate can respond a list col_types
with the data type id of every
responded column. This way one can know what exact data type a column
is holding.
In order to get the list of column data types, a types
query
parameter must be passed to the request:
sh$ curl -sSXPOST '127.0.0.1:4200/_sql?types&pretty' -d@- <<- EOF ... {"stmt": ... "select date, position from locations ... where date <= \$1 and position < \$2 order by position", ... "args": ["1979-10-12", 3] ... } ... EOF { "cols" : [ "date", "position" ], "col_types" : [ 11, 9 ], "duration" : ..., "rows" : [ [ 308534400000, 1 ], [ 308534400000, 2 ] ], "rowcount" : 2 }
Collection data types like Set
or Array
are displayed as
a list where the first value is the collection type and the second is
the inner type. Of course the inner type could also be a collection.
Example of JSON representation of a column list of (String, Set<Integer[]>):
"column_types": [ 4, [ 101, [ 100, 9 ] ] ]
Id’s of all currently available data types:
Id Data Type 0 Null 1 Not Supported 2 Byte 3 Boolean 4 String 5 Ip 6 Double 7 Float 8 Short 9 Integer 10 Long 11 Timestamp 12 Object 13 GeoPoint (Double[]) 100 Array 101 Set
Bulk Operations¶
The REST endpoint allows to issue bulk operations which are executed as single calls on the back-end site. It can be compared to prepared statement.
A bulk operation can be expressed simply as an SQL statement. Supported statements are:
- Insert
- Update
- Delete
Instead of the args
(Parameter Substitution) key, use the
key bulk_args
. This allows to specify a list of lists, containing all
the records which shall be processed. The inner lists need to match the
specified columns.
The bulk response contains a results array, with a rowcount for each bulk operation. Those results are in the same order as the issued operations of the bulk operation.
The following example describes how to issue an insert bulk operation and insert three records at once:
sh$ curl -sSXPOST '127.0.0.1:4200/_sql?pretty' -d@- <<- EOF ... {"stmt": "INSERT INTO locations (id, name, kind, description) ... VALUES (?, ?, ?, ?)", ... "bulk_args": [ ... [1337, "Earth", "Planet", "An awesome place to spend some time on."], ... [1338, "Sun", "Star", "An extraordinarily hot place."], ... [1339, "Titan", "Moon", "Titan, where it rains fossil fuels."] ... ] ... } ... EOF { "cols" : [ ], "duration" : ..., "results" : [ { "rowcount" : 1 }, { "rowcount" : 1 }, { "rowcount" : 1 } ] <BLANKLINE> }
Error Handling¶
Queries that are invalid or cannot be satisfied will result in an error response. The response will contain an error code, an error message and in some cases additional arguments that are specific to the error code.
Client libraries should use the error code to translate the error into an appropriate exception:
sh$ curl -sSXPOST '127.0.0.1:4200/_sql?pretty' -d '{ ... "stmt":"select name, position" ... }' { "error" : { "message" : "SQLActionException[FROM clause is missing.]", "code" : 4000 } }
To get more insight into what exactly went wrong an additional error_trace GET parameter can be specified to return the stack trace:
sh$ curl -sSXPOST '127.0.0.1:4200/_sql?pretty&error_trace=True' -d '{ ... "stmt":"select name, position" ... }' { "error" : { "message" : "SQLActionException[FROM clause is missing.]", "code" : 4000 }, "error_trace" : "..." }
Note
This parameter is intended for crate developers or for users requesting support for Crate. Client libraries shouldn’t make use of this option and not include the stacktrace.
Currently the defined error codes are:
Code Error 4000 The statement contains an invalid syntax or unsupported SQL statement 4001 The statement contains an invalid analyzer definition. 4002 The name of the table is invalid. 4003 Field type validation failed 4004 Possible feature not supported (yet) 4005 Alter table using a table alias is not supported. 4006 The used column alias is ambiguous. 4041 Unknown table. 4042 Unknown analyzer. 4043 Unknown column. 4044 Unknown type. 4045 Unknown schema. 4046 Unknown Partition. 4091 A document with the same primary key exists already. 4092 A VersionConflict. Might be thrown if an attempt was made to update the same document concurrently. 4093 A table with the same name exists already. 4094 The used table alias contains tables with different schema. 5000 Unhandled server error. 5001 The execution of one or more tasks failed. 5002 one or more shards are not available. 5003 the query failed on one or more shards
Bulk Errors¶
If a bulk operation fails, the resulting rowcount will be -2 and the resulting object may contain an error_message depending on the resulting error:
sh$ curl -sSXPOST '127.0.0.1:4200/_sql?pretty' -d@- <<- EOF ... {"stmt": "INSERT into locations (name, id) values (?,?)", ... "bulk_args": [ ... ["Mars", 1341], ... ["Sun", 1341] ... ] ... } ... EOF { "cols" : [ ], "duration" : ..., "results" : [ { "rowcount" : 1 }, { "rowcount" : -2 } ] }
Note
Every bulk operation will be executed, independent if one of the operation fails.