Next: , Previous: , Up: k9: Manual  


14 kSQL

kSQL is a powerful query language for tables. It has similarities to ANSI SQL but additional features to make it easier to work with ordered data, such as time series data.

Database
{select|update} A by B from T where C; delete from T where C

Joins
x,y      / insert, upsert, union, equi-and-asof leftjoin
x+y      / equi-and-asof outerjoin (e.g. combine markets through time)
x#y      / take/intersect innerjoin
x_y      / drop/difference

14.1 Queries

Queries can be done either via the kSQL or functional forms. The kSQL form is general simplier to write while the functional form allows one to more easily run multithreaded or build queries programmatically.

14.1.1 select

There a number of ways to return a complete table with kSQL. You can use the table name, a kSQL query without columns, or a fully specified query with columns.

 n:5;t:[[]x:!n;y:sin !n]
 t
x y         
- ----------
0 0.        
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

 select x,y from t
x y         
- ----------
0 0.        
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

 `x`y#t
x y         
- ----------
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

14.1.2 A

A is the list of fields to return from the select or update query. A can create new column names.

 n:5;t:[[]x:!n;y:sin !n];t
x y         
- ----------
0 0.        
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

 select x from t
x
-
0
1
2
3
4

`x#t
x
-
0
1
2
3
4


 select y,x from t    / reorder columns
y          x
---------- -
0.         0
0.841471   1
0.9092974  2
0.14112    3
-0.7568025 4

 `y`x#t
y          x
---------- -
0          0
0.841471   1
0.9092974  2
0.14112    3
-0.7568025 4

 select x,z:y from t  / create new column z
x z         
- ----------
0 0.        
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

14.1.3 by B

kSQL also has a way to group rows using by B. The result is a utable where the key is determined by the grouping clause.

 n:5;t:[[]x:!n;y:sin !n]
 select sum y by x>2 from t
x|y         
-|----------
0|1.750768  
1|-0.6156825

 t :+/'y@=x>2
0|1.750768  
1|-0.6156825

14.1.4 where C

kSQL makes it easy to build up a where clause to filter down the table. C is the list of constraints.

 n:5;t:[[]x:!n;y:sin !n]
 select from t where x>0
x y         
- ----------
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

 (:x>0)#t
x y         
- ----------
1 0.841471  
2 0.9092974 
3 0.14112   
4 -0.7568025

select from t where (x>0), y within 0 .9
x y       
- --------
1 0.841471
3 0.14112

 (:x>0;:y within .0 .9)#t
x y       
- --------
1 0.841471
3 0.14112 

14.1.5 Query with By and Where

 n:5;t:[[]x:!n;y:sin !n]
 select sum y by x>2 from t where y>0
x|y       
-|--------
0|1.750768
1|0.14112

 ((:y>0)#t) :+/'y@=x>2
0|1.750768
1|0.14112 

14.1.6 update

update allows one to modify values without specifying all the fields that pass through. update is also used to add new columns in a table. update does not, by itself, save the modifications to the table. If you want to preserve those modifications, use assignment.

 t:[[]x:`a`b`c;y:1 2 3];t
x y
- -
a 1
b 2
c 3

 update y+18 from t
x y 
- --
a 19
b 20
c 21

 update z:y+18 from t
x y z 
- - --
a 1 19
b 2 20
c 3 21

 t:update z:y+18 from t; / save the updates into table t
 t                       / Now t has the updated values.
x y z 
- - --
a 1 19
b 2 20
c 3 21

14.1.7 delete

Delete rows from a table that satisfy one or more conditions. Currently not working.

 t:[[]x:`a`b`c;y:1 2 1];t
x y
- -
a 1
b 2
c 1

 delete from t where y>1
!value

 delete from t where x=`c,y=1
!nyi

14.2 Joins

k9 has a number of methods to join tables together which are described below. In this section t, t1 and t2 represent tables and k, k1 and k2 represent utables.

joinsyntax
uniont1,t2
leftjoint,k
outerk1,k2
asoft,k
asof+k1+k2

14.2.1 union join ⇒ t1,t2

Union join table t1 with table t2. The tables should have the same columns and the join results in a table with t2 appended to t1. If the tables do not have the same columns then return t1.

 t1:[[]s:`a`b;p:1 2;q:3 4]
 t2:[[]s:`b`c;p:11 12;q:21 22]
 t1
s p q
- - -
a 1 3
b 2 4

 t2
s p  q 
- -- --
b 11 21
c 12 22

 t1,t2
s p  q 
- -- --
a  1  3
b  2  4
b 11 21
c 12 22

14.2.2 leftjoin ⇒ t,k

leftjoin table t with utable k. Result includes all rows from t and the values from table k having the same key values. If a row of t has key values not found in k, then the t values are shown and 0 for the columns coming from k.

 t:[[]s:`a`b`c;p:1 2 3;q:7 8 9]
 k:[[s:`a`b`x`y`z]q:101 102 103 104 105;r:51 52 53 54 55]
 t
s p q
- - -
a 1 7
b 2 8
c 3 9

 k
s|q   r 
-|--- --
a|101 51
b|102 52
x|103 53
y|104 54
z|105 55

 / t.s includes the value `c. Because k.s does not include c,
 / the last row shows a 0 under the r column (which comes from k).
 t,k       
s p q   r 
- - --- --
a 1 101 51
b 2 102 52
c 3   9  0

14.2.3 outer join ⇒ k1,k2

Outer join utable k1 and k2.

 k1:[[s:`a`b]p:1 2;q:3 4]
 k2:[[s:`b`c]p:9 8;q:7 6]
 k1
s|p q
-|- -
a|1 3
b|2 4

 k2
s|p q
-|- -
b|9 7
c|8 6

 k1,k2
s|p q
-|- -
a|1 3
b|9 7
c|8 6

 k1:[[s:`a`b]p:1 1;q:10 10]
 k2:[[s:`b`c]p:2 2;q:20 20]
 k3:[[s:`c`d]p:3 3;q:30 30]

k1,k2,k3       / joining a 3rd
s|p q 
-|- --
a|1 10
b|2 20
c|3 30
d|3 30

 ,/(k1;k2;k3)  / join over
s|p q 
-|- --
a|1 10
b|2 20
c|3 30
d|3 30

14.2.4 asof join ⇒ t,k

Asof joins each row rt of table t to a row rk in utable k (keyed by time) provided rk has the maximum time value of any row in k while obeying the constraint that the time value in rt >= the time value of rk. Intuitively, rk should be the row in k that is most up-to-date with respect to rt.

 t:[[]t:09:30+5*!5;p:100+!5];t
t     p  
----- ---
09:30 100
09:35 101
09:40 102
09:45 103
09:50 104

 k:[[t:09:32 09:41 09:45]q:50 51 52];k
t    |q 
-----|--
09:32|50
09:41|51
09:45|52

 / Notice below the t row at 09:45 is linked with the k row at 09:45.
 / The k row at 09:41 is not linked with any t row.
 / By contrast, both the 09:35 and the 09:40 rows of t
 / are linked to the 09:32 row of k.
 t,k      
          
t     p   q 
----- --- --
09:30 100  0
09:35 101 50
09:40 102 50
09:45 103 52
09:50 104 52

Scaling this up to a bigger set of tables one can see the performance of k9 on joins.

 N:_1e8;T:[[]t:N?`t 0;q:N?100];5#T
t            q 
------------ --
00:00:00.001 44
00:00:00.002 46
00:00:00.002 48
00:00:00.003 35
00:00:00.003 43

 n:_1e5;K:[[t:n?`t 0]p:n?100];5#K
t           |p 
------------|--
00:00:00.481|54
00:00:00.961|63
00:00:01.094|67
00:00:01.479|16
00:00:01.917|58

 \t T,K
222

14.2.5 asof+ join ⇒ k1+k2

Asof+ joins allows one to aggregate over markets to find the total available at a given time. The utables need to be specified with `a. The effect is to merge the two key fields (the field t in this case) and for each row rk1 from table k1, add the non-key field (bs in this case) from rk1 to the bs field of the most recent row in k2 whose t value is less than or equal to the t value in rk1. And symmetrically for each row of table k2.

 k1:`a [[t:09:30+5*!5]bs:100*1 2 3 2 1];k1
t    |bs 
-----|---
09:30|100
09:35|200
09:40|300
09:45|200
09:50|100

 k2:`a [[t:09:32 09:41 09:45]bs:1 2 3];k2
t    |bs
-----|--
09:32| 1
09:41| 2
09:45| 3

 k1+k2
t    |bs 
-----|---
09:30|100
09:32|101
09:35|201
09:40|301
09:41|302
09:45|203
09:50|103

14.3 Insert and Upsert

One can add data to tables via insert or upsert. The difference between the two is that insert adds data to a table while upsert on some key x will replace the values if x is present in the target table or insert x with its associated value otherwise.

14.3.1 insert ⇒ t,d

Insert dictionary d into table t.

 t:[[]c1:`a`b`a;c2:1 2 7];t
c1 c2
-- --
a   1
b   2
a   7

 t,`c1`c2!(`a;12)
c1 c2
-- --
a   1
b   2
a   7
a  12

 t,`c1`c2!(`c;12)
c1 c2
-- --
a   1
b   2
a   7
c  12

14.3.2 upsert ⇒ k,d

Insert dictionary d into utable k.

 k:[[c1:`a`b`c]c2:1 2 7];k
c1|c2
--|--
a | 1
b | 2
c | 7

 k,`c1`c2!(`a;12)
c1|c2
--|--
a |12
b | 2
c | 7

 k,`c1`c2!(`b;12)
c1|c2
--|--
a | 1
b |12
c | 7

 k,`c1`c2!(`d;12)
c1|c2
--|--
a | 1
b | 2
c | 7
d |12

Next: System, Previous: Tables, Up: k9: Manual