Next: System, Previous: Tables, Up: k9: Manual
kSQL is a powerful query language for tables. It has similarities to ANSI SQL but additional features to make it easier to work with ordered data, such as time series data.
Database {select|update} A by B from T where C; delete from T where C Joins x,y / insert, upsert, union, equi-and-asof leftjoin x+y / equi-and-asof outerjoin (e.g. combine markets through time) x#y / take/intersect innerjoin x_y / drop/difference
Queries can be done either via the kSQL or functional forms. The kSQL form is general simplier to write while the functional form allows one to more easily run multithreaded or build queries programmatically.
There a number of ways to return a complete table with kSQL. You can use the table name, a kSQL query without columns, or a fully specified query with columns.
n:5;t:[[]x:!n;y:sin !n] t x y - ---------- 0 0. 1 0.841471 2 0.9092974 3 0.14112 4 -0.7568025 select x,y from t x y - ---------- 0 0. 1 0.841471 2 0.9092974 3 0.14112 4 -0.7568025 `x`y#t x y - ---------- 1 0.841471 2 0.9092974 3 0.14112 4 -0.7568025
A is the list of fields to return from the select or update query. A can create new column names.
n:5;t:[[]x:!n;y:sin !n];t x y - ---------- 0 0. 1 0.841471 2 0.9092974 3 0.14112 4 -0.7568025 select x from t x - 0 1 2 3 4 `x#t x - 0 1 2 3 4 select y,x from t / reorder columns y x ---------- - 0. 0 0.841471 1 0.9092974 2 0.14112 3 -0.7568025 4 `y`x#t y x ---------- - 0 0 0.841471 1 0.9092974 2 0.14112 3 -0.7568025 4 select x,z:y from t / create new column z x z - ---------- 0 0. 1 0.841471 2 0.9092974 3 0.14112 4 -0.7568025
kSQL also has a way to group rows using by B
. The result is a utable where the key is determined by the grouping clause.
n:5;t:[[]x:!n;y:sin !n] select sum y by x>2 from t x|y -|---------- 0|1.750768 1|-0.6156825 t :+/'y@=x>2 0|1.750768 1|-0.6156825
kSQL makes it easy to build up a where clause to filter down the table. C is the list of constraints.
n:5;t:[[]x:!n;y:sin !n] select from t where x>0 x y - ---------- 1 0.841471 2 0.9092974 3 0.14112 4 -0.7568025 (:x>0)#t x y - ---------- 1 0.841471 2 0.9092974 3 0.14112 4 -0.7568025 select from t where (x>0), y within 0 .9 x y - -------- 1 0.841471 3 0.14112 (:x>0;:y within .0 .9)#t x y - -------- 1 0.841471 3 0.14112
n:5;t:[[]x:!n;y:sin !n] select sum y by x>2 from t where y>0 x|y -|-------- 0|1.750768 1|0.14112 ((:y>0)#t) :+/'y@=x>2 0|1.750768 1|0.14112
update allows one to modify values without specifying all the fields that pass through. update is also used to add new columns in a table. update does not, by itself, save the modifications to the table. If you want to preserve those modifications, use assignment.
t:[[]x:`a`b`c;y:1 2 3];t x y - - a 1 b 2 c 3 update y+18 from t x y - -- a 19 b 20 c 21 update z:y+18 from t x y z - - -- a 1 19 b 2 20 c 3 21 t:update z:y+18 from t; / save the updates into table t t / Now t has the updated values. x y z - - -- a 1 19 b 2 20 c 3 21
Delete rows from a table that satisfy one or more conditions. Currently not working.
t:[[]x:`a`b`c;y:1 2 1];t x y - - a 1 b 2 c 1 delete from t where y>1 !value delete from t where x=`c,y=1 !nyi
k9 has a number of methods to join tables together which are described below. In this section t, t1 and t2 represent tables and k, k1 and k2 represent utables.
join | syntax |
union | t1,t2 |
leftjoin | t,k |
outer | k1,k2 |
asof | t,k |
asof+ | k1+k2 |
Union join table t1 with table t2. The tables should have the same columns and the join results in a table with t2 appended to t1. If the tables do not have the same columns then return t1.
t1:[[]s:`a`b;p:1 2;q:3 4] t2:[[]s:`b`c;p:11 12;q:21 22] t1 s p q - - - a 1 3 b 2 4 t2 s p q - -- -- b 11 21 c 12 22 t1,t2 s p q - -- -- a 1 3 b 2 4 b 11 21 c 12 22
leftjoin table t with utable k. Result includes all rows from t and the values from table k having the same key values. If a row of t has key values not found in k, then the t values are shown and 0 for the columns coming from k.
t:[[]s:`a`b`c;p:1 2 3;q:7 8 9] k:[[s:`a`b`x`y`z]q:101 102 103 104 105;r:51 52 53 54 55] t s p q - - - a 1 7 b 2 8 c 3 9 k s|q r -|--- -- a|101 51 b|102 52 x|103 53 y|104 54 z|105 55 / t.s includes the value `c. Because k.s does not include c, / the last row shows a 0 under the r column (which comes from k). t,k s p q r - - --- -- a 1 101 51 b 2 102 52 c 3 9 0
Outer join utable k1 and k2.
k1:[[s:`a`b]p:1 2;q:3 4] k2:[[s:`b`c]p:9 8;q:7 6] k1 s|p q -|- - a|1 3 b|2 4 k2 s|p q -|- - b|9 7 c|8 6 k1,k2 s|p q -|- - a|1 3 b|9 7 c|8 6 k1:[[s:`a`b]p:1 1;q:10 10] k2:[[s:`b`c]p:2 2;q:20 20] k3:[[s:`c`d]p:3 3;q:30 30] k1,k2,k3 / joining a 3rd s|p q -|- -- a|1 10 b|2 20 c|3 30 d|3 30 ,/(k1;k2;k3) / join over s|p q -|- -- a|1 10 b|2 20 c|3 30 d|3 30
Asof joins each row rt of table t to a row rk in utable k (keyed by time) provided rk has the maximum time value of any row in k while obeying the constraint that the time value in rt >= the time value of rk. Intuitively, rk should be the row in k that is most up-to-date with respect to rt.
t:[[]t:09:30+5*!5;p:100+!5];t t p ----- --- 09:30 100 09:35 101 09:40 102 09:45 103 09:50 104 k:[[t:09:32 09:41 09:45]q:50 51 52];k t |q -----|-- 09:32|50 09:41|51 09:45|52 / Notice below the t row at 09:45 is linked with the k row at 09:45. / The k row at 09:41 is not linked with any t row. / By contrast, both the 09:35 and the 09:40 rows of t / are linked to the 09:32 row of k. t,k t p q ----- --- -- 09:30 100 0 09:35 101 50 09:40 102 50 09:45 103 52 09:50 104 52
Scaling this up to a bigger set of tables one can see the performance of k9 on joins.
N:_1e8;T:[[]t:N?`t 0;q:N?100];5#T t q ------------ -- 00:00:00.001 44 00:00:00.002 46 00:00:00.002 48 00:00:00.003 35 00:00:00.003 43 n:_1e5;K:[[t:n?`t 0]p:n?100];5#K t |p ------------|-- 00:00:00.481|54 00:00:00.961|63 00:00:01.094|67 00:00:01.479|16 00:00:01.917|58 \t T,K 222
Asof+ joins allows one to aggregate over markets to find the total available at a given time. The utables need to be specified with `a
. The effect is to merge the two key fields (the field t in this case) and for each row rk1 from table k1, add the non-key field (bs in this case) from rk1 to the bs field of the most recent row in k2 whose t value is less than or equal to the t value in rk1. And symmetrically for each row of table k2.
k1:`a [[t:09:30+5*!5]bs:100*1 2 3 2 1];k1 t |bs -----|--- 09:30|100 09:35|200 09:40|300 09:45|200 09:50|100 k2:`a [[t:09:32 09:41 09:45]bs:1 2 3];k2 t |bs -----|-- 09:32| 1 09:41| 2 09:45| 3 k1+k2 t |bs -----|--- 09:30|100 09:32|101 09:35|201 09:40|301 09:41|302 09:45|203 09:50|103
One can add data to tables via insert or upsert. The difference between the two is that insert adds data to a table while upsert on some key x will replace the values if x is present in the target table or insert x with its associated value otherwise.
Insert dictionary d into table t.
t:[[]c1:`a`b`a;c2:1 2 7];t c1 c2 -- -- a 1 b 2 a 7 t,`c1`c2!(`a;12) c1 c2 -- -- a 1 b 2 a 7 a 12 t,`c1`c2!(`c;12) c1 c2 -- -- a 1 b 2 a 7 c 12
Insert dictionary d into utable k.
k:[[c1:`a`b`c]c2:1 2 7];k c1|c2 --|-- a | 1 b | 2 c | 7 k,`c1`c2!(`a;12) c1|c2 --|-- a |12 b | 2 c | 7 k,`c1`c2!(`b;12) c1|c2 --|-- a | 1 b |12 c | 7 k,`c1`c2!(`d;12) c1|c2 --|-- a | 1 b | 2 c | 7 d |12
Next: System, Previous: Tables, Up: k9: Manual