Next: , Previous: , Up: Top  


14 kSQL

kSQL is a powerful query language for tables. The benchmark chapter has shown how quickly k9 can process big tables.

select [count first last min max sum avg ..]A by B from T where C
delete from T where C               *update A by B from T where C
x,y      / union, insert, upsert, outerjoin, leftjoin, asofjoin ..
x+y      / equi and asof outer joins: combine markets through time
x#y x_y  / take/drop, intersect/except, inner join

14.1 Queries

14.1.1 Trivial Query

There a number of ways to return a complete table with kSQL. You can use the table name, a kSQL query without columns, or a fully specified query with columns.

 n:5;t:[[]x:!n;y:sin !n]
 t
x y         
- ----------
0 0.        
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

 select from t
x y         
- ----------
0 0.        
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

 select x,y from t
x y         
- ----------
0 0.        
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

14.1.2 Query with Where

kSQL makes it easy to build up a where clause to filter down the table.

 n:5;t:[[]x:!n;y:sin !n]
 select from t where x>0
x y         
- ----------
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

select from t where (x>0)&y within 0 .9
x y       
- --------
1 0.841471
3 0.14112 

14.1.3 Query with By

kSQL also has a way to group rows using by. The result is a ktable where the key is the now unique field in the by.

n:5;t:[[]x:!n;y:sin !n]
 
 select sum y by x>2 from t
x|y         
-|----------
0|1.750768  
1|-0.6156825

14.1.4 Query with By and Where

 n:5;t:[[]x:!n;y:sin !n]
 select sum y by x>2 from t where y>0
x|y       
-|--------
0|1.750768
1|0.14112 

14.2 Joins

k9 has a number of methods to join tables together which are described below. In this section t, t1 and t2 represent tables and k, k1 and k2 represent ktables.

joinsyntax
uniont1,t2
leftt,k
outerk1,k2
asoft,k
asof+k1+k2

14.2.1 union join ⇒ t1,t2

Union join table t1 with table t2. The tables should have the same columns and the join results in a table with t2 appended to t1. If the tables do not have the same columns then return t1.

 t1:[[]s:`a`b;p:1 2;q:3 4]
 t2:[[]s:`b`c;p:11 12;q:21 22]
 t1
s p q
- - -
a 1 3
b 2 4

 t2
s p  q 
- -- --
b 11 21
c 12 22

 t1,t2
s p  q 
- -- --
a  1  3
b  2  4
b 11 21
c 12 22

14.2.2 left join ⇒ t,k

Left join table t with ktable k. Result includes all rows from t and values from t where there is no k value.

 t:[[]s:`a`b`c;p:1 2 3;q:7 8 9]
 k:[[s:`a`b`x`y`z]q:101 102 103 104 105;r:51 52 53 54 55]
 t
s p q
- - -
a 1 7
b 2 8
c 3 9

 k
s|q   r 
-|--- --
a|101 51
b|102 52
x|103 53
y|104 54
z|105 55

 t,k
s p q   r 
- - --- --
a 1 101 51
b 2 102 52
c 3   9  0

14.2.3 outer join ⇒ k1,k2

Outer join ktable k1 with key ktable k2.

 k1:[[s:`a`b]p:1 2;q:3 4]
 k2:[[s:`b`c]p:9 8;q:7 6]
 k1
s|p q
-|- -
a|1 3
b|2 4

 k2
s|p q
-|- -
b|9 7
c|8 6

 k1,k2
s|p q
-|- -
a|1 3
b|9 7
c|8 6

14.2.4 asof join ⇒ t,k

Asof joins a table t to a ktable k (key by time) such that the t values show the preceeding or equal time value of k.

 t:[[]t:09:30+5*!5;p:100+!5];t
t     p  
----- ---
09:30 100
09:35 101
09:40 102
09:45 103
09:50 104

 k:[[t:09:32 09:41 09:45]q:50 51 52];k
t    |q 
-----|--
09:32|50
09:41|51
09:45|52

 t,k
t     p   q 
----- --- --
09:30 100  0
09:35 101 50
09:40 102 50
09:45 103 52
09:50 104 52

Scaling this up to a bigger set of tables one can see the performance of k9 on joins.

 N:_1e8;T:[[]t:N?`t 0;q:N?100];5#T
t            q 
------------ --
00:00:00.001 44
00:00:00.002 46
00:00:00.002 48
00:00:00.003 35
00:00:00.003 43

 n:_1e5;K:[[t:n?`t 0]p:n?100];5#K
t           |p 
------------|--
00:00:00.481|54
00:00:00.961|63
00:00:01.094|67
00:00:01.479|16
00:00:01.917|58

 \t T,K
222

14.2.5 asof+ join ⇒ k1+k2

Asof+ joins ktables adding the non-key fields to represent the sum asof the key field usually time. This join allows one to aggregate over markets to find the total available at a given time. The ktables need to be specified with `a.

 k1:`a [[t:09:30+5*!5]bs:100*1 2 3 2 1];k1
t    |bs 
-----|---
09:30|100
09:35|200
09:40|300
09:45|200
09:50|100

 k2:`a [[t:09:32 09:41 09:45]bs:1 2 3];k2
t    |bs
-----|--
09:32| 1
09:41| 2
09:45| 3

 k1+k2
t    |bs 
-----|---
09:30|100
09:32|101
09:35|201
09:40|301
09:41|302
09:45|203
09:50|103

14.3 Insert and Upsert

One can add data to tables via insert or upsert. The difference between the two is that insert adds data to a table while upsert will add or replace data to a keyed table. Upsert adds when the key isn’t present and replaces when the key is.

14.3.1 insert ⇒ t,d

Insert dictionary d into table t.

 t:[[]c1:`a`b`a;c2:1 2 7];t
c1 c2
-- --
a   1
b   2
a   7

 t,`c1`c2!(`a;12)
c1 c2
-- --
a   1
b   2
a   7
a  12

 t,`c1`c2!(`c;12)
c1 c2
-- --
a   1
b   2
a   7
c  12

14.3.2 upsert ⇒ k,d

Insert dictionary d into ktable k.

 k:[[c1:`a`b`c]c2:1 2 7];k
c1|c2
--|--
a | 1
b | 2
c | 7

 k,`c1`c2!(`a;12)
c1|c2
--|--
a |12
b | 2
c | 7

 k,`c1`c2!(`b;12)
c1|c2
--|--
a | 1
b |12
c | 7

Next: , Previous: , Up: Top