Groovy’s groovy-ginq
module provides a higher-level abstraction over collections.
It could perform queries against in-memory collections of objects in a SQL-like style.
Also, querying XML, JSON, YAML, etc. could also be supported because they can be parsed into collections.
As GORM and jOOQ are powerful enough to support querying DB, we will cover collections first.
1. GINQ a.k.a. Groovy-Integrated Query
GINQ is a DSL for querying with SQL-like syntax, which consists of the following structure:
GQ, i.e. abbreviation for GINQ
|__ from
| |__ <data_source_alias> in <data_source>
|__ [innerjoin/leftjoin/rightjoin/fulljoin/crossjoin]*
| |__ <data_source_alias> in <data_source>
| |__ on <condition> ((&& | ||) <condition>)* (NOTE: `crossjoin` does not need `on` clause)
|__ [where]
| |__ <condition> ((&& | ||) <condition>)*
|__ [groupby]
| |__ <expression> [as <alias>] (, <expression> [as <alias>])*
| |__ [having]
| |__ <condition> ((&& | ||) <condition>)*
|__ [orderby]
| |__ <expression> [in (asc|desc)] (, <expression> [in (asc|desc)])*
|__ [limit]
| |__ [<offset>,] <size>
|__ select
|__ <expression> [as <alias>] (, <expression> [as <alias>])*
[] means the related clause is optional, * means zero or more times, and + means one or more times. Also, the clauses of GINQ are order sensitive,
so the order of clauses should be kept as the above structure
|
As we could see, the simplest GINQ consists of a from
clause and a select
clause, which looks like:
from n in [0, 1, 2]
select n
ONLY ONE from clause is required in GINQ. Also, GINQ supports multiple data sources through from and the related joins.
|
As a DSL, GINQ should be wrapped with the following block to be executed:
GQ { /* GINQ CODE */ }
For example,
def numbers = [0, 1, 2]
assert [0, 1, 2] == GQ {
from n in numbers
select n
}.toList()
import java.util.stream.Collectors
def numbers = [0, 1, 2]
assert '0#1#2' == GQ {
from n in numbers
select n
}.stream()
.map(e -> String.valueOf(e))
.collect(Collectors.joining('#'))
And it is strongly recommended to use def
to define the variable for the result of GINQ execution,
which is a Queryable
instance that is lazy.
def result = GQ {
/* GINQ CODE */
}
def stream = result.stream() // get the stream from GINQ result
def list = result.toList() // get the list from GINQ result
Currently GINQ can not work well when STC is enabled. |
1.1. GINQ Syntax
1.1.1. Data Source
The data source for GINQ could be specified by from
clause.
Currently GINQ supports Iterable
, Stream
, array and GINQ result set as its data source:
Iterable
Data Source
from n in [1, 2, 3] select n
Stream
Data Source
from n in [1, 2, 3].stream() select n
Array Data Source
from n in new int[] {1, 2, 3} select n
GINQ Result Set Data Source
def vt = GQ {from m in [1, 2, 3] select m}
assert [1, 2, 3] == GQ {
from n in vt select n
}.toList()
1.1.2. Projection
The column names could be renamed with as
clause:
def result = GQ {
from n in [1, 2, 3]
select Math.pow(n, 2) as powerOfN
}
assert [[1, 1], [4, 4], [9, 9]] == result.stream().map(r -> [r[0], r.powerOfN]).toList()
The renamed column could be referenced by its new name, e.g. r.powerOfN .
Also, it could be referenced by its index, e.g. r[0]
|
assert [[1, 1], [2, 4], [3, 9]] == GQ {
from v in (
from n in [1, 2, 3]
select n, Math.pow(n, 2) as powerOfN
)
select v.n, v.powerOfN
}.toList()
select P1, P2, …, Pn is a simplifed syntax of select new NamedRecord(P1, P2, …, Pn) when and only when n >= 2.
Also, NamedRecord instance will be created if as clause is used.
The values stored in the NamedRecord could be referenced by their names.
|
Construct new objects as column values:
@groovy.transform.EqualsAndHashCode
class Person {
String name
Person(String name) {
this.name = name
}
}
def persons = [new Person('Daniel'), new Person('Paul'), new Person('Eric')]
assert persons == GQ {
from n in ['Daniel', 'Paul', 'Eric']
select new Person(n)
}.toList()
1.1.3. Filtering
from n in [0, 1, 2, 3, 4, 5]
where n > 0 && n <= 3
select n * 2
Exists
from n in [1, 2, 3]
where (
from m in [2, 3]
where m == n
select m
).exists()
select n
Not Exists
from n in [1, 2, 3]
where !(
from m in [2, 3]
where m == n
select m
).exists()
select n
1.1.4. Joining
More data sources for GINQ could be specified by join clauses.
from n1 in [1, 2, 3]
innerjoin n2 in [1, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
leftjoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
from n1 in [2, 3, 4]
rightjoin n2 in [1, 2, 3] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
fulljoin n2 in [2, 3, 4] on n1 == n2
select n1, n2
from n1 in [1, 2, 3]
crossjoin n2 in [3, 4, 5]
select n1, n2
1.1.5. Grouping
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, count(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
having n >= 3
select n, count(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
having count() < 3
select n, count()
The group columns could be renamed with as
clause:
from s in ['ab', 'ac', 'bd', 'acd', 'bcd', 'bef']
groupby s.size() as length, s[0] as firstChar
select length, firstChar, max(s)
from s in ['ab', 'ac', 'bd', 'acd', 'bcd', 'bef']
groupby s.size() as length, s[0] as firstChar
having length == 3 && firstChar == 'b'
select length, firstChar, max(s)
Aggregate Functions
GINQ supports some built-in aggregate functions, e.g.
count
, min
, max
, sum
, avg
and the most powerful function agg
.
count(…) , min(…) and max(…) just operate on non-null values,
and count() is similar to count(*) in SQL.
|
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, count()
from s in ['a', 'b', 'cd', 'ef']
groupby s.size() as length
select length, min(s)
from s in ['a', 'b', 'cd', 'ef']
groupby s.size() as length
select length, max(s)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, sum(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, avg(n)
from n in [1, 1, 3, 3, 6, 6, 6]
groupby n
select n, agg(_g.stream().map(r -> r.n).reduce(BigDecimal.ZERO, BigDecimal::add))
_g is an implicit variable for agg aggregate function,
it represents the grouped Queryable object and its record(e.g. r ) could reference the data source by alias(e.g. n )
|
Also, we could apply the aggregate functions for the whole GINQ result, i.e. no groupby
clause is needed:
assert [3] == GQ {
from n in [1, 2, 3]
select max(n)
}.toList()
assert [[1, 3, 2, 6, 3, 3, 6]] == GQ {
from n in [1, 2, 3]
select min(n), max(n), avg(n), sum(n), count(n), count(),
agg(_g.stream().map(r -> r.n).reduce(BigDecimal.ZERO, BigDecimal::add))
}.toList()
1.1.6. Sorting
from n in [1, 5, 2, 6]
orderby n
select n
in asc is optional when sorting in ascending order
|
from n in [1, 5, 2, 6]
orderby n in asc
select n
from n in [1, 5, 2, 6]
orderby n in desc
select n
from s in ['a', 'b', 'ef', 'cd']
orderby s.length() in desc, s in asc
select s
from s in ['a', 'b', 'ef', 'cd']
orderby s.length() in desc, s
select s
1.1.7. Pagination
limit
is similar to the limit
clause of MySQL, which could specify the offset
(first argument) and size
(second argument) for paginating,
or just specify the only one argument as size
from n in [1, 2, 3, 4, 5]
limit 3
select n
from n in [1, 2, 3, 4, 5]
limit 1, 3
select n
1.1.8. Nested GINQ
Nested GINQ in from
clause
from v in (
from n in [1, 2, 3]
select n
)
select v
Nested GINQ in where
clause
from n in [0, 1, 2]
where n in (
from m in [1, 2]
select m
)
select n
from n in [0, 1, 2]
where (
from m in [1, 2]
where m == n
select m
).exists()
select n
Nested GINQ in select
clause
assert [null, 2, 3] == GQ {
from n in [1, 2, 3]
select (
from m in [2, 3, 4]
where m == n
limit 1
select m
)
}.toList()
It’s recommended to use limit 1 to restrict the count of sub-query result
because TooManyValuesException will be thrown if more than one values returned
|
We could use as
clause to name the sub-query result
assert [[1, null], [2, 2], [3, 3]] == GQ {
from n in [1, 2, 3]
select n, (
from m in [2, 3, 4]
where m == n
select m
) as sqr
}.toList()
1.2. GINQ Tips
1.2.1. Row Number
_rn
is the implicit variable representing row number for each record in the result set. It starts with 0
from n in [1, 2, 3]
select _rn, n
1.2.2. List Comprehension
List comprehension is an elegant way to define and create lists based on existing lists:
assert [4, 16, 36, 64, 100] == GQ {from n in 1..<11 where n % 2 == 0 select n ** 2}.toList()
assert [4, 16, 36, 64, 100] == GQ {from n in 1..<11 where n % 2 == 0 select n ** 2} as List
assert [4, 16, 36, 64, 100] == GQL {from n in 1..<11 where n % 2 == 0 select n ** 2}
GQL {…} is the abbreviation of GQ {…} as List
|
GINQ could be used as list comprehension in the loops directly:
def result = []
for (def x : GQ {from n in 1..<11 where n % 2 == 0 select n ** 2}) {
result << x
}
assert [4, 16, 36, 64, 100] == result
1.2.3. Query JSON
import groovy.json.JsonSlurper
def json = new JsonSlurper().parseText('''
{
"fruits": [
{"name": "Orange", "price": 11},
{"name": "Apple", "price": 6},
{"name": "Banana", "price": 4},
{"name": "Mongo", "price": 29},
{"name": "Durian", "price": 32}
]
}
''')
def expected = [['Mongo', 29], ['Orange', 11], ['Apple', 6], ['Banana', 4]]
assert expected == GQ {
from f in json.fruits
where f.price < 32
orderby f.price in desc
select f.name, f.price
}.toList()
1.2.4. Customize GINQ
For advanced users, you could customize GINQ behaviour by specifying your own target code generator.
For example, we could specify the qualified class name org.apache.groovy.ginq.provider.collection.GinqAstWalker
as the target code generator to generate GINQ method calls for querying collections,
which is the default behaviour of GINQ:
assert [0, 1, 2] == GQ(astWalker: 'org.apache.groovy.ginq.provider.collection.GinqAstWalker') {
from n in [0, 1, 2]
select n
}.toList()
1.3. GINQ Examples
1.3.1. Generate Multiplication Table
from v in (
from a in 1..9
innerjoin b in 1..9 on a <= b
select a as f, b as s, "$a * $b = ${a * b}".toString() as r
)
groupby v.s
select max(v.f == 1 ? v.r : '') as v1,
max(v.f == 2 ? v.r : '') as v2,
max(v.f == 3 ? v.r : '') as v3,
max(v.f == 4 ? v.r : '') as v4,
max(v.f == 5 ? v.r : '') as v5,
max(v.f == 6 ? v.r : '') as v6,
max(v.f == 7 ? v.r : '') as v7,
max(v.f == 8 ? v.r : '') as v8,
max(v.f == 9 ? v.r : '') as v9