

2 Examples

Before jumping into syntax, let’s look at some example problems to get a sense of the speed of k9 at processing data. Given both the historic use of languages similar to k9 in finance and the author’s background, many of the examples will be based on financial markets. For those not familiar with this field, a short introduction will likely be needed.

2.1 A Tiny Introduction to Financial Market Data

Financial market data are generally stored as prices (often called quotes) and trades. At a minimum, prices will include time (t), price to buy (b for bid) and price to sell (a for ask). Trades will include at a minimum time (t) and trade price (p). In normal markets there are many more prices than trades. (Additionally, the data normally includes the name of the security (s) and the exchange (x) from which the data comes.)

Let’s use k9 to generate a set of random prices.

 n:10
 T:10:00+`t n?36e5
 B:100++\-1+n?3
 A:B+1+n?2
 q:+`t`b`a!(T;B;A);q
t            b   a  
------------ --- ---
10:01:48.464 100 102
10:23:12.033 100 102
10:30:00.432 101 102
10:34:00.383 101 103
10:34:36.839 101 102
10:42:59.230 100 102
10:46:50.478 100 102
10:52:42.189  99 100
10:55:52.208  99 101
10:59:06.262  98  99

Here you see that at 10:42:59.230 the prices update to 100 and 102: the price at which one could sell is 100 and the price at which one could buy is 102. You might think that 100 seems a bit high and sell there. Later, at 10:59:06.262, you might think the prices look low and buy at 99. Here’s the trade table for those two transactions.

 t:+`t`p!(10:43:00.230 10:59:07.262;100 99);t
t            p  
------------ ---
10:43:00.230 100
10:59:07.262  99

You’ll note that the trade times don’t line up with the quote times; that’s because it apparently took you a second to decide to trade. Because of this delay, you’ll often have to look back at the most recent earlier prices to join trade (t) and quote (q) data.
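
The look-back join described above can be sketched outside k9 as well. The following is a minimal Python illustration, not k9 code and not necessarily how k9 implements its joins: for each trade it finds the last quote at or before the trade time with a binary search. The function name `asof_join` is hypothetical; the quotes and trades are the ones shown in the tables above, with times kept as fixed-width strings so that plain string comparison orders them correctly.

```python
from bisect import bisect_right

# Quotes (time, bid, ask) and trades (time, price) copied from the tables above.
quotes = [
    ("10:01:48.464", 100, 102), ("10:23:12.033", 100, 102),
    ("10:30:00.432", 101, 102), ("10:34:00.383", 101, 103),
    ("10:34:36.839", 101, 102), ("10:42:59.230", 100, 102),
    ("10:46:50.478", 100, 102), ("10:52:42.189", 99, 100),
    ("10:55:52.208", 99, 101), ("10:59:06.262", 98, 99),
]
trades = [("10:43:00.230", 100), ("10:59:07.262", 99)]

def asof_join(trades, quotes):
    """Attach to each trade the most recent quote at or before its time.

    Assumes quotes are sorted by time. Hypothetical helper for illustration.
    """
    quote_times = [q[0] for q in quotes]
    joined = []
    for t, p in trades:
        i = bisect_right(quote_times, t) - 1  # index of last quote time <= t
        bid, ask = (quotes[i][1], quotes[i][2]) if i >= 0 else (None, None)
        joined.append((t, p, bid, ask))
    return joined

for row in asof_join(trades, quotes):
    print(row)
```

Each trade picks up the quote that was in force one second earlier, which is why the sale at 100 pairs with the 100/102 quote and the purchase at 99 pairs with the 98/99 quote.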

Now that you’ve learned enough finance to understand the data, let’s scale up to larger problems to see the power of k9.

2.2 Data Manipulation

Next, generate a large table of random data and compute some basic statistics on it. The data includes time (t), security (s), and price delta (d). The table occupies about 4 GB and takes about 3.4 seconds to generate on a relatively new consumer laptop.

 n:_100e6                         / 100 million rows
 t:{09:00:00.000+x?10:00:00.000}  /  random times
 s:{x?`a`b`c`d`e}                 / random symbols
 m:0,(|m),365378984,m:271810244 42800467 2636454 62769 572 2;
 d:{(-6+!13)@(+\m)bin x?_1e9}
 \t q:+`t`s`d!(t[n];s[n];d[n])    / time data generation in ms
3391
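
The delta generator d is the densest line in this block, so here is a hedged Python reading of it rather than k9 code. It assumes that `bin` returns the index of the last cumulative weight that is less than or equal to the draw, which is how the expression appears to behave. The weights in m are symmetric and sum to one billion, so a uniform draw in [0, 1e9) maps to a delta between -6 and 6 with bell-shaped probabilities. The names `weights`, `cumulative` and `delta_for` are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate
import random

half = [271810244, 42800467, 2636454, 62769, 572, 2]
# Mirrors m:0,(|m),365378984,m  (leading 0, reversed half, centre weight, half).
weights = [0] + half[::-1] + [365378984] + half
cumulative = list(accumulate(weights))  # mirrors +\m

def delta_for(draw):
    """Map a uniform integer draw in [0, 1e9) to a delta in -6..6."""
    index = bisect_right(cumulative, draw) - 1  # assumed meaning of bin
    return index - 6                            # mirrors (-6+!13)@index

# A few random deltas; most land on -1, 0 or 1.
sample = [delta_for(random.randrange(10**9)) for _ in range(10)]
print(sample)
```

Under this reading a delta of 0 has probability of roughly 0.365 and a delta of -6 or 6 has probability 2e-9 each, which is in line with the counts in the distribution check further down.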

At this point one might want to check the start and stop times, see if the symbol distribution is actually random, and look at the distribution of the price deltas.

 select ti:min t, tf:max t from q / min and max time values
ti|09:00:00.000
tf|18:59:59.999

 select c:#s by s from q          / count each symbol
s|c       
-|--------
a|20003490
b|19997344
c|19998874
d|20000640
e|19999652

 select c:#d by d from q          / check the normal distribution (2s to run)
d |c       
--|--------
-6|1       
-5|55      
-4|6226    
-3|263801  
-2|4280721 
-1|27179734
 0|36531595
 1|27188092
 2|4279872 
 3|263610  
 4|6245    
 5|48

 select gain:sum d by s from q    / profit (or loss) over each symbol
s|gain 
-|-----
a|  872
b| 2765
c| 2668
d| 2171
e|-2354

 select loss:min +\d by s from q  / worst loss over the period
s|loss 
-|-----
a|-1803
b| -846
c|-2732
d|-2101
e|-2903
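
The last two queries differ only in the aggregate: `sum d` is the final profit, while `min +\d` is the lowest point of the running profit. A small Python sketch of the same idea follows; it is not k9 code, the rows are hypothetical toy data rather than values from the table above, and the helper names are made up for illustration.

```python
from collections import defaultdict
from itertools import accumulate

# Hypothetical toy rows of (symbol, delta); not taken from the table above.
rows = [("a", 1), ("b", -2), ("a", -3), ("b", 5), ("a", 4)]

def by_symbol(rows):
    """Group deltas by symbol, preserving row order within each group."""
    groups = defaultdict(list)
    for symbol, delta in rows:
        groups[symbol].append(delta)
    return groups

def gain(rows):
    """Total of the deltas per symbol (like sum d by s)."""
    return {s: sum(ds) for s, ds in by_symbol(rows).items()}

def worst_loss(rows):
    """Minimum of the running total per symbol (like min +\\d by s)."""
    return {s: min(accumulate(ds)) for s, ds in by_symbol(rows).items()}

print(gain(rows))
print(worst_loss(rows))
```

Symbol a ends with a gain of 2 even though its running total dipped to -2 along the way, which is the distinction the two queries draw.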

2.3 Understanding Code Examples

On the shakti mailing list there are a number of code examples that can be used to learn best practice. In order to make sense of others’ code one needs to be able to efficiently parse the typically dense k9 language. Here, an example of how one goes about this process is presented.

ss:{*{
      o:o@&(-1+(#y)+*x@1)<o:1_x@1;
      $[0<#x@1;((x@0),*x@1;o);x]}[;y]/:(();&(x@(!#x)+\!#y)~\y)
      }

This function finds a substring in a string.

 000000000011111111112222222222333333
 012345678901234567890123456789012345
"Find the +++ needle in + the ++ text"

Here one would expect to find “++” at 9 and 29.

 ss["Find the +++ needle in + the ++ text";"++"]
9 29

In order to determine how this function works, let’s spread the code out and annotate each part...

ss:{
    *{
      o:o@&(-1+(#y)+*x@1)<o:1_x@1; / set o 
      $[0<#x@1;((x@0),*x@1;o);x]   / if x then y else z
      }
  [;y]/:(();&(x@(!#x)+\!#y)~\y)    / use value for inner function
  }
 

Given that k9 evaluates right to left, let’s start with the rightmost code fragment.

 (();&(x@(!#x)+\!#y)~\y)          / a list (null;value)

And now let’s focus on the value in the list.

 &(x@(!#x)+\!#y)~\y

In order to easily check our understanding we can wrap this in a function and call the function with the parameters shown above. To step through it, we can start with the innermost parentheses and build up the code until it is complete.

 {!#x}["Find the +++ needle in + the ++ text";"++"]
{!#x}["Find the +++ needle in + the ++ text";"++"]
^
:rank

This won’t work: the function body refers only to x, so it takes one argument, and calling it with two raises a rank error. To get around this we will mention the second argument (y) in the body without using its value.

 {y;#x}["Find the +++ needle in + the ++ text";"++"]
36
 {y;!#x}["Find the +++ needle in + the ++ text";"++"]
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 ..

As might have been guessed, #x counts the number of characters in the first argument (n) and then !#x generates a list of integers from 0 to n-1.

 {(!#x)+\!#y}["Find the +++ needle in + the ++ text";"++"]
 0  1
 1  2
 2  3
 3  4
 4  5
 5  6
 6  7
 7  8
 8  9
 9 10
10 11
11 12
12 13
13 14
14 15
15 16
16 17
17 18
18 19
19 20
20 21
..

Here the code takes each integer from the previous calculation and adds to it an integer list as long as the second argument. To make sure this is clear, one could write something similar and check that the output can be predicted.

 {(!x)+\!y}[6;4]
0 1 2 3
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
5 6 7 8
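
For readers more familiar with other languages, the same index matrix can be written as a nested comprehension. This is a Python analogue for illustration only, with the hypothetical name `index_matrix`: row i holds i plus each offset from 0 to y-1.

```python
def index_matrix(x, y):
    """Row i is [i+0, i+1, ..., i+(y-1)], for i in 0..x-1."""
    return [[i + j for j in range(y)] for i in range(x)]

for row in index_matrix(6, 4):
    print(*row)
```

Calling `index_matrix(6, 4)` produces the same six rows as the k9 expression above.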

Now, using the matrix above, the code indexes the first argument and pulls out substrings whose length matches that of the search string.

 {x@(!#x)+\!#y}["Find the +++ needle in + the ++ text";"++"]
Fi
in
nd
d 
 t
th
he
e 
 +
++
++
+ 
 n
ne
ee
ed
dl
le
e 
 i
in
..

At this point one can compare the search string against each entry in this list of substrings to find the matches.

 {(x@(!#x)+\!#y)~\y}["Find the +++ needle in + the ++ text";"++"]
000000000110000000000000000001000000b

And then one can use the where function, &, to determine the index of the matches.

 {&(x@(!#x)+\!#y)~\y}["Find the +++ needle in + the ++ text";"++"]
9 10 29
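
Note that this list still contains 10, which overlaps the match at 9, while the full function returns 9 29. Reading the inner function, it appears to walk the candidates from left to right and drop any candidate that starts before the previously kept match has ended. The whole pipeline can be sketched in Python as follows; this is an interpretation of the k9 code under that assumption, not a translation verified against k9, and the function name is hypothetical.

```python
def find_non_overlapping(text, pattern):
    """Return (all candidate starts, non-overlapping starts) for pattern in text."""
    width = len(pattern)
    # One window per start index; slicing clips at the end of the text,
    # so the short windows at the tail can never equal the pattern.
    windows = [text[i:i + width] for i in range(len(text))]
    # Indices where the window equals the pattern (the role of & above).
    candidates = [i for i, w in enumerate(windows) if w == pattern]
    # Greedy pass: keep a candidate only if it starts after the
    # previously kept match has ended.
    kept = []
    for i in candidates:
        if not kept or i > kept[-1] + width - 1:
            kept.append(i)
    return candidates, kept

candidates, kept = find_non_overlapping("Find the +++ needle in + the ++ text", "++")
print(candidates)
print(kept)
```

The candidate list matches the k9 output above (9 10 29) and the kept list matches the result of ss (9 29).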
