SELECT Statements
A SELECT statement can consist of the following basic clauses.
SELECT
INTO
FROM
JOIN
WHERE
GROUP BY
HAVING
UNION
ORDER BY
LIMIT
SELECT Syntax
The following syntax diagram outlines the syntax supported by the Elasticsearch adapter:
SELECT {
[ TOP <numeric_literal> | DISTINCT ]
{
*
| {
<expression> [ [ AS ] <column_reference> ]
| { <table_name> | <correlation_name> } .*
} [ , ... ]
}
[ INTO csv:// [ filename= ] <file_path> [ ;delimiter=tab ] ]
{
FROM <table_reference> [ [ AS ] <identifier> ]
}
[ WHERE <search_condition> ]
[ GROUP BY <column_reference> [ , ... ] ]
[
ORDER BY
{ <column_reference> [ ASC | DESC ] } [ , ... ]
]
[
LIMIT <expression>
[
{ OFFSET | , }
<expression>
]
]
} | SCOPE_IDENTITY()
<expression> ::=
| <column_reference>
| @ <parameter>
| ?
| COUNT( * | { [ DISTINCT ] <expression> } )
| { AVG | MAX | MIN | SUM | COUNT } ( <expression> )
| <literal>
| <sql_function>
<search_condition> ::=
{
<expression> { = | > | < | >= | <= | <> | != | LIKE | NOT LIKE | IS NULL | IS NOT NULL | IN | NOT IN | AND | OR | BETWEEN | CONTAINS | NOT CONTAINS } [ <expression> ]
} [ { AND | OR } ... ]
Examples
1. Return all columns:
SELECT * FROM Account
2. Rename a column:
SELECT "Name" AS MY_Name FROM Account
3. Search data:
SELECT * FROM Account WHERE Industry = 'Floppy Disks';
4. Cast a column's data as a different data type:
SELECT CAST(AnnualRevenue AS VARCHAR) AS Str_AnnualRevenue FROM Account
5. Filter rows using WHERE clause operators. The Elasticsearch APIs support the following operators in the WHERE clause: =, >, <, >=, <=, <>, !=, LIKE, NOT LIKE, IS NULL, IS NOT NULL, IN, NOT IN, AND, OR, BETWEEN, CONTAINS, NOT CONTAINS.
SELECT * FROM Account WHERE Industry = 'Floppy Disks';
6. Return the number of items matching the query criteria:
SELECT COUNT(*) AS MyCount FROM Account
7. Return the number of unique items matching the query criteria:
SELECT COUNT(DISTINCT Name) FROM Account
8. Return the unique items matching the query criteria:
SELECT DISTINCT Name FROM Account
9. Summarize data:
SELECT Name, MAX(AnnualRevenue) FROM Account GROUP BY Name
10. Sort a result set in ascending order:
SELECT Id, Name FROM Account ORDER BY Name ASC
Aggregate Functions
Examples of Aggregate Functions
Below are several examples of SQL aggregate functions. You can use them with a GROUP BY clause to aggregate rows based on the specified GROUP BY criterion, which is useful for reporting.
COUNT
Returns the number of rows matching the query criteria.
SELECT COUNT(*) FROM Account WHERE Industry = 'Floppy Disks'
COUNT_DISTINCT
Returns the number of distinct, non-null field values matching the query criteria.
SELECT COUNT_DISTINCT(Id) AS DistinctValues FROM Account WHERE Industry = 'Floppy Disks'
COUNT(DISTINCT)
Returns the number of distinct, non-null field values matching the query criteria.
SELECT COUNT(DISTINCT Id) AS DistinctValues FROM Account WHERE Industry = 'Floppy Disks'
AVG
Returns the average of the column values.
SELECT Name, AVG(AnnualRevenue) FROM Account WHERE Industry = 'Floppy Disks' GROUP BY Name
MIN
Returns the minimum column value.
SELECT MIN(AnnualRevenue), Name FROM Account WHERE Industry = 'Floppy Disks' GROUP BY Name
MAX
Returns the maximum column value.
SELECT Name, MAX(AnnualRevenue) FROM Account WHERE Industry = 'Floppy Disks' GROUP BY Name
SUM
Returns the total sum of the column values.
SELECT SUM(AnnualRevenue) FROM Account WHERE Industry = 'Floppy Disks'
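The aggregate semantics described above follow standard SQL, so they can be tried out without the adapter. The sketch below uses Python's built-in sqlite3 module purely as a stand-in engine; the Account rows are made-up sample data, and COUNT_DISTINCT is left out because it is specific to the adapter and is not a SQLite function.

```python
import sqlite3

# In-memory stand-in database; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (Id INTEGER, Name TEXT, Industry TEXT, AnnualRevenue REAL)"
)
conn.executemany(
    "INSERT INTO Account VALUES (?, ?, ?, ?)",
    [
        (1, "Acme", "Floppy Disks", 100.0),
        (2, "Acme", "Floppy Disks", 300.0),
        (3, "Globex", "Floppy Disks", 50.0),
        (4, "Initech", "Software", 900.0),
    ],
)

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

where = "WHERE Industry = 'Floppy Disks'"
row_count = scalar(f"SELECT COUNT(*) FROM Account {where}")
distinct_names = scalar(f"SELECT COUNT(DISTINCT Name) FROM Account {where}")
total_revenue = scalar(f"SELECT SUM(AnnualRevenue) FROM Account {where}")

# Per-group aggregates: one output row per distinct Name.
per_name = conn.execute(
    f"SELECT Name, AVG(AnnualRevenue), MIN(AnnualRevenue), MAX(AnnualRevenue) "
    f"FROM Account {where} GROUP BY Name ORDER BY Name"
).fetchall()
```

Without GROUP BY, each aggregate collapses all matching rows into one value; with GROUP BY Name, the same functions are evaluated once per name.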
Predicate Functions
COMMON(expression, cutoff_frequency)
Explicitly specifies the query type to send: 'expression' is sent in a common terms query.
Example SQL Query:
SELECT * FROM employee WHERE COMMON(about) = 'like to build'
Elasticsearch Query:
{"common":{"about":{"query":"like to build"}}}
expression: The expression to search for.
cutoff_frequency: The cutoff frequency value used to allocate terms to the high or low frequency group. Can be an absolute frequency (>=1) or a relative frequency (0.0 .. 1.0).
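The example above omits the optional cutoff_frequency argument. As a rough sketch, the request body with and without it could be assembled as follows. The helper name common_terms_query is hypothetical, and the placement of "cutoff_frequency" follows the Elasticsearch common terms query format; it is an assumption that the adapter emits it in the same way.

```python
def common_terms_query(field, text, cutoff_frequency=None):
    """Build a common terms query body for one field (hypothetical helper)."""
    body = {"query": text}
    if cutoff_frequency is not None:
        # Values >= 1 are absolute frequencies; values in 0.0..1.0 are relative.
        body["cutoff_frequency"] = cutoff_frequency
    return {"common": {field: body}}

# Matches the Elasticsearch Query shown for COMMON(about) = 'like to build'.
without_cutoff = common_terms_query("about", "like to build")

# Same search with a relative cutoff frequency of 0.1%.
with_cutoff = common_terms_query("about", "like to build", cutoff_frequency=0.001)
```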
FILTER(expression)
Explicitly specifies the filter context: 'expression' is sent in a filter context rather than a query context. A filter context does not affect the calculated scores. This is useful when part of the search criteria should be used to calculate scores while additional criteria only restrict the results returned, without affecting the score.
Example SQL Query:
SELECT * FROM employee WHERE FILTER(TERM(first_name)) = 'john'
Elasticsearch Query:
{"filter":{"bool":{"must":{"term":{"first_name":"john"}}}}}
expression: Either a column or another function.
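To make the distinction between query context and filter context concrete, here is a minimal sketch of a standard Elasticsearch bool query that combines a scoring clause with a non-scoring filter. The helper bool_query and the chosen clauses are illustrative assumptions; the exact body the adapter generates for a combined WHERE clause may differ.

```python
def bool_query(scoring_clauses, filter_clauses):
    """Combine scoring clauses (query context) with non-scoring filters."""
    return {
        "bool": {
            "must": list(scoring_clauses),   # contribute to the relevance score
            "filter": list(filter_clauses),  # only include or exclude documents
        }
    }

query = bool_query(
    scoring_clauses=[{"match": {"about": "rides motorbikes"}}],
    filter_clauses=[{"term": {"first_name": "john"}}],
)
```

Here the match clause determines how hits are ranked, while the term clause only decides which documents are eligible to appear at all.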
GEO_BOUNDING_BOX(column, top_left, bottom_right)
Used to specify a query to filter hits based on a point location using a bounding box.
Example SQL Query:
SELECT * FROM cities WHERE GEO_BOUNDING_BOX(location, '[-74.1,40.73]', '[-71.12,40.01]')
Elasticsearch Query:
{"bool":{"filter":{"geo_bounding_box":{"location":{"top_left":[-74.1,40.73],"bottom_right":[-71.12,40.01]}}},"must":[{"match_all":{}}]}}
column: A Geo-point column to perform the GEO_BOUNDING_BOX filter on.
top_left: The top-left coordinates of the bounding box. This value can be an array [shown in example], object of lat and lon values, comma-separated list, or a geohash of a latitude and longitude value.
bottom_right: The bottom-right coordinates of the bounding box. This value can be an array [shown in example], object of lat and lon values, comma-separated list, or a geohash of a latitude and longitude value.
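As a sketch of how the SQL arguments map onto the request body shown above, the following hypothetical helper parses the two JSON array strings and wraps them in the same bool query shape. In Elasticsearch, a geo point given as an array is ordered [lon, lat], so '[-74.1,40.73]' is longitude -74.1, latitude 40.73.

```python
import json

def geo_bounding_box_query(column, top_left, bottom_right):
    """Wrap a geo_bounding_box filter in the bool query shape shown above."""
    return {
        "bool": {
            "filter": {
                "geo_bounding_box": {
                    column: {"top_left": top_left, "bottom_right": bottom_right}
                }
            },
            "must": [{"match_all": {}}],
        }
    }

# The SQL arguments are JSON strings, so parse them before building the body.
query = geo_bounding_box_query(
    "location",
    json.loads("[-74.1,40.73]"),   # [lon, lat] of the top-left corner
    json.loads("[-71.12,40.01]"),  # [lon, lat] of the bottom-right corner
)
```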
GEO_BOUNDING_BOX(column, top, left, bottom, right)
Used to specify a query to filter hits based on a point location using a bounding box.
Example SQL Query:
SELECT * FROM cities WHERE GEO_BOUNDING_BOX(location, 40.73, -74.1, 40.01, -71.12)
Elasticsearch Query:
{"bool":{"filter":{"geo_bounding_box":{"location":{"top":40.73,"left":-74.1,"bottom":40.01,"right":-71.12}}},"must":[{"match_all":{}}]}}
column: A Geo-point column to perform the GEO_BOUNDING_BOX filter on.
top: The top coordinate (latitude) of the bounding box.
left: The left coordinate (longitude) of the bounding box.
bottom: The bottom coordinate (latitude) of the bounding box.
right: The right coordinate (longitude) of the bounding box.
GEO_DISTANCE(column, point_lat_lon, distance)
Used to specify a query that returns only the hits located within a specific distance of a geo point.
Example SQL Query:
SELECT * FROM cities WHERE GEO_DISTANCE(location, '40,-70', '12mi')
Elasticsearch Query:
{"bool":{"filter":{"geo_distance":{"location":"40,-70","distance":"12mi"}},"must":[{"match_all":{}}]}}
column: A Geo-point column to perform the GEO_DISTANCE filter on.
point_lat_lon: The coordinates of a geo point that will be used to measure the distance from. This value can be an array, object of lat and lon values, comma-separated list [shown in example], or a geohash of a latitude and longitude value.
distance: The distance to search within from the specified geo point. This value takes a numeric value along with a distance unit. Common distance units are: mi (miles), yd (yards), ft (feet), in (inches), km (kilometers), m (meters). Please see the Elastic documentation for the complete list of distance units.
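A distance value is a number immediately followed by a unit, such as '12mi'. The sketch below shows one way to validate such a string on the client side and convert it to meters. The helper distance_to_meters is hypothetical and covers only the units listed above; Elasticsearch itself accepts further units and spellings.

```python
import re

# Meters per unit for the units listed above (exact conversion factors).
METERS_PER_UNIT = {
    "mi": 1609.344,
    "yd": 0.9144,
    "ft": 0.3048,
    "in": 0.0254,
    "km": 1000.0,
    "m": 1.0,
}

# A non-negative number, optional whitespace, then a lowercase unit.
_DISTANCE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([a-z]+)\s*$")

def distance_to_meters(value):
    """Convert a string such as '12mi' to meters; raise ValueError if malformed."""
    match = _DISTANCE_RE.match(value)
    if match is None or match.group(2) not in METERS_PER_UNIT:
        raise ValueError(f"unsupported distance: {value!r}")
    return float(match.group(1)) * METERS_PER_UNIT[match.group(2)]
```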
GEO_DISTANCE_RANGE(column, point_lat_lon, from_distance, to_distance)
Used to specify a query that returns only the hits located within a range of distances from a specific geo point.
Example SQL Query:
SELECT * FROM cities WHERE GEO_DISTANCE_RANGE(location, 'drn5x1g8cu2y', '10mi', '20mi')
Elasticsearch Query:
{"bool":{"filter":{"geo_distance_range":{"location":"drn5x1g8cu2y","from":"10mi","to":"20mi"}},"must":[{"match_all":{}}]}}
column: A Geo-point column to perform the GEO_DISTANCE_RANGE filter on.
point_lat_lon: The coordinates of a geo point that will be used to measure the range from. This value can be an array, object of lat and lon values, comma-separated list, or a geohash [shown in example] of a latitude and longitude value.
from_distance: The starting distance to calculate the range from the specified geo point. This value takes a numeric value along with a distance unit. Common distance units are: mi (miles), yd (yards), ft (feet), in (inches), km (kilometers), m (meters). Please see the Elastic documentation for the complete list of distance units.
to_distance: The end distance to calculate the range from the specified geo point. This value takes a numeric value along with a distance unit. Common distance units are: mi (miles), yd (yards), ft (feet), in (inches), km (kilometers), m (meters). Please see the Elastic documentation for the complete list of distance units.
GEO_POLYGON(column, points)
Used to specify a query that returns only the hits that fall within a polygon of points.
Example SQL Query:
SELECT * FROM cities WHERE GEO_POLYGON(location, '[{"lat":40,"lon":-70},{"lat":30,"lon":-80},{"lat":20,"lon":-90}]')
Elasticsearch Query:
{"bool":{"filter":{"geo_polygon":{"location":{"points":[{"lat":40,"lon":-70},{"lat":30,"lon":-80},{"lat":20,"lon":-90}]}}},"must":[{"match_all":{}}]}}
column: A Geo-point column to perform the GEO_POLYGON filter on.
points: A JSON array of points that make up a polygon. This value can be an array of arrays, object of lat and lon values [shown in example], comma-separated lists, or geohashes of a latitude and longitude value.
GEO_SHAPE(column, type, points [, relation])
Used to specify an inline shape query to filter documents using the geo_shape type to find documents that have a shape that intersects with the query shape.
Example SQL Query:
SELECT * FROM shapes WHERE GEO_SHAPE(my_shape, 'envelope', '[[13.0, 53.0], [14.0, 52.0]]')
Elasticsearch Query:
{"bool":{"filter":{"geo_shape":{"my_shape":{"shape":{"type":"envelope","coordinates":[[13.0, 53.0], [14.0, 52.0]]}}}},"must":[{"match_all":{}}]}}
column: A Geo-shape column to perform the GEO_SHAPE filter on.
type: The type of shape to search for. Valid values: point, linestring, polygon, multipoint, multilinestring, multipolygon, geometrycollection, envelope, and circle. Please see Elastic documentation for further information regarding these shapes.
points: The coordinates for the shape type specified. These coordinates and their structure will vary depending upon the shape type desired. Please see the Elastic documentation for further details.
relation: The name of the spatial relation operator to use at search time. Valid values: intersects (default), disjoint, within, and contains. Please see Elastic documentation for further information regarding spatial relations.
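The example above does not use the optional relation argument. As a hedged sketch, the filter with an explicit relation could look like the following. The helper geo_shape_filter is hypothetical; placing "relation" next to "shape" follows the Elasticsearch geo_shape query format, and it is an assumption that the adapter generates the same layout.

```python
import json

VALID_RELATIONS = ("intersects", "disjoint", "within", "contains")

def geo_shape_filter(column, shape_type, coordinates_json, relation="intersects"):
    """Build a geo_shape filter for an inline shape (hypothetical helper)."""
    if relation not in VALID_RELATIONS:
        raise ValueError(f"unknown spatial relation: {relation!r}")
    return {
        "geo_shape": {
            column: {
                "shape": {
                    "type": shape_type,
                    # The SQL argument is a JSON string, so parse it first.
                    "coordinates": json.loads(coordinates_json),
                },
                "relation": relation,
            }
        }
    }

# Documents whose shape lies entirely within the envelope.
within_filter = geo_shape_filter(
    "my_shape", "envelope", "[[13.0, 53.0], [14.0, 52.0]]", relation="within"
)
```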
INARRAY(column)
Used to search for values contained within a primitive array. Supports comparison operators based on the data type contained within the array, including the LIKE operator.
Example SQL Query:
SELECT * FROM employee WHERE INARRAY(skills) = 'coding'
column: A primitive array column to filter on.
MATCH(column)
Explicitly specifies the query type to send: 'column' is sent in a match query.
Example SQL Query:
SELECT * FROM employee WHERE MATCH(last_name) = 'SMITH'
Elasticsearch Query:
{"match":{"last_name":"SMITH"}}
column: A column to perform the match query on.
MATCH_PHRASE(column)
Explicitly specifies the query type to send: 'column' is sent in a match phrase query.
Example SQL Query:
SELECT * FROM employee WHERE MATCH_PHRASE(about) = 'rides motorbikes'
Elasticsearch Query:
{"match_phrase":{"about":"rides motorbikes"}}
column: A column to perform the match phrase query on.
MATCH_PHRASE_PREFIX(column)
Explicitly specifies the query type to send: 'column' is sent in a match phrase prefix query. The match phrase prefix query is the same as a match phrase query, except that it allows prefix matches on the last term in the text.
Example SQL Query:
SELECT * FROM employee WHERE MATCH_PHRASE_PREFIX(about) = 'quick brown f'
Elasticsearch Query:
{"match_phrase_prefix":{"about":"quick brown f"}}
column: A column to perform the match phrase prefix query on.
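To illustrate what "prefix matches on the last term" means, here is a toy check over whitespace-separated, lowercased tokens. It is only an approximation of the idea: the function phrase_prefix_matches is hypothetical, and it ignores Elasticsearch analyzers and options such as slop or the limit on prefix expansions.

```python
def phrase_prefix_matches(query, text):
    """Toy check: all query terms but the last must match consecutive tokens
    exactly, and the last query term must be a prefix of the next token."""
    terms = query.lower().split()
    tokens = text.lower().split()
    if not terms:
        return False
    *exact, last = terms
    # Slide a window of len(terms) tokens across the text.
    for start in range(len(tokens) - len(terms) + 1):
        window = tokens[start:start + len(terms)]
        if window[:-1] == exact and window[-1].startswith(last):
            return True
    return False
```

With this check, 'quick brown f' matches "The quick brown fox jumps" because "f" is a prefix of "fox", but it does not match "quick fox brown" because the terms are not in phrase order.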
TERM(column)
Explicitly specifies the query type to send: 'column' is sent in a term query.
Example SQL Query:
SELECT * FROM employee WHERE TERM(last_name) = 'jacobs'
Elasticsearch Query:
{"term":{"last_name":"jacobs"}}
column: A column to perform the term query on.
DSLQuery(dsl_json)
Used to explicitly specify the Elasticsearch DSL query to send in the request. Can be used along with other filters and the AND and OR operators.
The DSL query JSON can contain a full 'bool' query object; a 'must', 'should', 'must_not', or 'filter' occurrence type; or just a clause object, which is appended to a 'must' (default) or 'should' occurrence type depending on whether the AND or OR operator is used.
Example SQL Query (These examples generate the same query using a 'bool' object, 'must' occurrence type, and query object):
SELECT * FROM employee WHERE DSLQuery('{\"bool\":{\"must\":[{\"query_string\":{\"default_field\":\"last_name\",\"query\":\"\\\\\"Smith\\\\\"\"}}]}}')
SELECT * FROM employee WHERE DSLQuery('{\"must\":[{\"query_string\":{\"default_field\":\"last_name\",\"query\":\"\\\\\"Smith\\\\\"\"}}]}')
SELECT * FROM employee WHERE DSLQuery('{\"query_string\":{\"default_field\":\"last_name\",\"query\":\"\\\\\"Smith\\\\\"\"}}')
Elasticsearch Query:
{"bool":{"must":[{"query_string":{"default_field":"last_name","query":"\"Smith\""}}]}}
Example SQL Query (with OR operator):
SELECT * FROM employee WHERE Age < 10 OR DSLQuery('{\"should\":[{\"query_string\":{\"default_field\":\"last_name\",\"query\":\"\\\\\"Smith\\\\\"\"}}]}')
Elasticsearch Query:
{"bool":{"should":[{"range":{"age":{"lt":10}}},{"query_string":{"default_field":"last_name","query":"\"Smith\""}}]}}
dsl_json: The Elasticsearch DSL query, as a JSON string, to send in the request.
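The three accepted shapes of the DSL JSON, and the effect of AND versus OR, can be sketched as a small merge routine. This is an illustration of the rules described above, not the adapter's actual code; the function merge_dsl and its arguments are hypothetical.

```python
import json

OCCURRENCE_TYPES = {"must", "should", "must_not", "filter"}

def merge_dsl(bool_body, dsl_json, operator="AND"):
    """Merge a DSLQuery argument into the body of a 'bool' query (a dict
    mapping occurrence types to clause lists) and return that body."""
    dsl = json.loads(dsl_json)
    keys = set(dsl)
    if keys == {"bool"}:
        parts = dsl["bool"]                  # full 'bool' query object
    elif keys and keys <= OCCURRENCE_TYPES:
        parts = dsl                          # bare occurrence type(s)
    else:
        # A single clause object goes to 'must' for AND, 'should' for OR.
        default = "must" if operator.upper() == "AND" else "should"
        parts = {default: [dsl]}
    for occurrence, clauses in parts.items():
        if isinstance(clauses, dict):        # allow a lone clause object
            clauses = [clauses]
        bool_body.setdefault(occurrence, []).extend(clauses)
    return bool_body

clause = r'{"query_string":{"default_field":"last_name","query":"\"Smith\""}}'
forms = [
    '{"bool":{"must":[' + clause + ']}}',    # full 'bool' object
    '{"must":[' + clause + ']}',             # 'must' occurrence type
    clause,                                  # clause object only
]
# All three forms produce the same query when combined with AND.
and_queries = [{"bool": merge_dsl({}, form, "AND")} for form in forms]

# With OR, the clause joins an existing 'should' list (here: Age < 10).
existing = {"should": [{"range": {"age": {"lt": 10}}}]}
or_query = {"bool": merge_dsl(existing, '{"should":[' + clause + ']}', "OR")}
```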
ORDER BY Functions
MAPFIELD(column, data_type)
Used to explicitly specify a mapping (by sending the 'unmapped_type' sort option) for a column that does not have a mapping associated with it, which will enable sorting on the column. By default, if a column does not have a mapping, an exception will be thrown containing an error message similar to: "No mapping found for [column] in order to sort on".
Example SQL Query:
SELECT * FROM employee ORDER BY MAPFIELD(start_date, 'long') DESC
Elasticsearch Sort:
{"start_date":{"order":"desc", "unmapped_type": "long"}}
column: The column to perform the order by on.
data_type: The Elasticsearch data type to map the column to.
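As a small sketch of the sort entry shown above, the following hypothetical helper builds one Elasticsearch sort clause and adds 'unmapped_type' only when a data type is supplied, which corresponds to wrapping the column in MAPFIELD.

```python
def sort_clause(column, descending=False, unmapped_type=None):
    """Build one Elasticsearch sort entry, optionally with 'unmapped_type'."""
    options = {"order": "desc" if descending else "asc"}
    if unmapped_type is not None:
        # Tells Elasticsearch how to treat the field where no mapping exists.
        options["unmapped_type"] = unmapped_type
    return {column: options}

# Equivalent of: ORDER BY MAPFIELD(start_date, 'long') DESC
mapped_sort = sort_clause("start_date", descending=True, unmapped_type="long")

# Equivalent of: ORDER BY start_date DESC (fails if the column has no mapping)
plain_sort = sort_clause("start_date", descending=True)
```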