

1 Introduction

The k9 programming language is designed primarily for the analysis of data. It may surprise new users with two rather different features: (1) fast data analysis and (2) concise syntax. After you become familiar with the language, both will seem normal and intuitive, and going back to slow and verbose programming will be surprisingly difficult.

1.1 Going fast

Imagine you have a small, on-disk, 100 million row database containing a time series with two float values at each time. Additionally, this data could be split across three tables covering different measurements. Here’s how fast k9 can read the data from disk and compute a statistic, the average difference between the two values in each table, which uses each and every row.

This section requires 2: (read/write data), a feature available only in the enterprise version of Shakti.

 bash-3.2$ ./k
 2020.xx.xx 17GB 4core (c) shakti
 \t q:2:`q;{select s:avg a-b from x}'q[!q]
495

That’s 495 ms to read the data from disk and compute over all 100 million values. Reading the data is the largest part of that time. If the data were already in memory, the computation would be faster:

 \t {select s:avg a-b from x}'q[!q]
230

230 ms, not bad for 100 million calculations.
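
For readers who do not yet read k9, the statistic being timed can be sketched in Python. This is only an illustration of what "avg a-b for each table" means, not how k9 implements it: the three tiny tables and their numbers are made up, and the helper name avg_spread is hypothetical.

```python
# Hypothetical miniature stand-in for the database q: each table is a dict of
# equal-length columns. The numbers are made up purely for illustration.
q = {
    "eurusd": {"b": [100.0, 100.5, 101.0], "a": [100.25, 100.75, 101.25]},
    "usdjpy": {"b": [100.0, 99.5], "a": [100.5, 100.0]},
    "usdchf": {"b": [100.0], "a": [101.0]},
}

def avg_spread(table):
    """Mean of a-b over every row, the role played by `avg a-b` in the query."""
    diffs = [a - b for a, b in zip(table["a"], table["b"])]
    return sum(diffs) / len(diffs)

# Apply the same computation to each table, as the each adverb (') does in k9.
result = {name: avg_spread(table) for name, table in q.items()}
print(result)  # {'eurusd': 0.25, 'usdjpy': 0.5, 'usdchf': 1.0}
```

The k9 version does the same work in a single short expression and over 100 million rows rather than six.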

The code to generate the on-disk database is presented below. Speed of course will depend on your hardware so times will vary.

 nf:d+*|d:(|-d),d:683 954 997 1000;
 T:^`t rand[;_8.64e7]@
 B:100++\1e-2*-3+nf bin/:rand[;*|nf]@
 S:rand[;1e-2*2,2,8#1]@
 L:{select t,b,a:b+s from +`t`b`s!(T;B;S)@'x}
 q:`eurusd`usdjpy`usdchf!L'_60e6 20e6 20e6
 `q 2:q

1.2 Going concise

The k9 language is more closely related to mathematical notation than most programming languages. It requires the developer to learn to “speak” k9, but once that happens most find they can express themselves more quickly and more accurately in k9 than in other languages. At this point an example might help.

In mathematics, “3+2” is read as “3 plus 2” because you learn at an early age that “+” is the “plus” sign. For trivial operations like arithmetic, most programming languages use symbols as well. When moving beyond arithmetic, most programming languages switch to words while k9 stays with symbols. As an example, to determine the distinct values of a list most programming languages might use a syntax like distinct() while k9 uses ?. This requires the developer to learn how to say a number of symbols, but once that happens it results in much shorter code that is quicker to write, easier to inspect, and easier to maintain.

This should not be surprising. In arithmetic, which do you find easier to understand?

/ math with text
Three plus two times open parenthesis six plus fourteen close parenthesis

/ math with symbols
3+2*(6+14)

In code, if you’re new to k9 then it’s unlikely you can understand the second example.

/ code with text
x = (0,12,3,11,3);y=5;
distinct_x = list(set(x));
sorted(i for i in distinct_x if i > y)

/ code with symbols
x:0 12 3 11 3;y:5;
z@&y<z:?x
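
The symbolic line is read right to left, and each symbol can be spelled out as one step. The Python sketch below walks through those steps using the verb names from the Help/Info card (? unique, < less, & where, @ at). It assumes that ? keeps values in first-seen order, so unlike the sorted() version above the result is not sorted; the intermediate names mask and idx are hypothetical.

```python
x = [0, 12, 3, 11, 3]
y = 5

# z:?x   unique values of x (assumed here to keep first-seen order)
z = list(dict.fromkeys(x))

# y<z    elementwise comparison, one boolean per element of z
mask = [y < v for v in z]

# &      where: the indices at which the mask is true
idx = [i for i, m in enumerate(mask) if m]

# z@     at: index z with those positions
result = [z[i] for i in idx]
print(result)  # [12, 11]
```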

When you first learned arithmetic you likely didn’t have a choice. Now you have a choice whether or not you want to learn k9. If you give it a try, then you’ll likely get it quickly and move onto the power phase fast enough that you’ll be happy you gave it a chance.

1.3 Get k9

https://shakti.com

k9 is available in two versions, standard (under download) and enterprise. The enterprise version has additional features indicated on the k9 help page and also indicated in this tutorial.

Once downloaded, you will need to change the file mode with the following command:

 chmod +x k

On macOS, if you then attempt to run this file you likely won’t succeed due to macOS security. You’ll need to go to “System Preferences...” and then “Security and Privacy” and choose to allow this binary to run. (You must have tried and failed to run it once for it to appear there automatically.)

On Linux (and macOS if you have installed npm) one can download k from the command line via

 npm i @kparc/k -g

1.4 Help / Info Card

Typing \ in a terminal window gives you a concise overview of the language. This document aims to provide details to beginning users where the help screen is a bit too terse. Some commands are not available in the basic version and are thus marked with an asterisk, e.g. *4: https get.

python:from k import k;k('+',2,3);nodejs:k=require('k').k;k('+',2,3)
*ffi:"./a.so"5:`a!"fi" //double a(int x){return 2.3+x;}

$k [-p 1024] a.k
verb                   adverb                  noun
: x         y          f' each                 char " ab"              \l a.k
+ flip      plus    [x]f/ over      c/ join    name ``ab               \t:n x
- minus     minus   [x]f\ scan      c\ splt    int  2 3                \u:n x
* first     times   [y]f':eachprior            flt  2 3.4 0w 0n        \v
%           divide     f/:eachright g/:over    date 2021.06.28   .z.d
& where     min/and    f\:eachleft  g\:scan    time 12:34:56.789 .z.t
| reverse   max/or     
< asc       less       i/o (*enterprise)       class                   \f
> desc      more       0: r/w line (N;C)0:     list (2;3.4;`c)         \fl x
= group     equal      1: r/w char             dict [n:`b;i:2]         \fc x   
~ not       match     *2: r/w data             func {[a;b]a+b}         \fs x
! key       key       *3: k-ipc set            expr :a+b               \cd [d]
, enlist    cat       *4: https get            
^ sort   [f]cut       *5: ffi:`f!"ifsIF"
# count  [f]take                              
_ floor  [f]drop                      
$ string    parse      $[b;t;f] cond
? unique    find                               limit {[p8]l8;g32;c128}
@ type   [f]at         @[x;i;f[;y]] amend      table [[]n:`b`c;i:2 3]
. value  [f]dot        .[x;i;f[;y]] dmend     utable  [[n:`b`c]i:2 3]

math: sqrt sqr exp log sin cos div mod bar in bin
aggr: count first last min max sum dot avg var [dev med mode ..]
sql: select A by B from T where C; update A from T; delete from T where C

/comment \trace [:return 'signal if do while] \\exit

1.5 rlwrap

Although you only need the k binary to run k9, most users will also install rlwrap, if not already installed, in order to get command history in a terminal window. rlwrap is “Readline wrapper: adds readline support to tools that lack it” and allows one to arrow up to go through the command buffer history.

To start k9, run either k or rlwrap k. Here I will show both options, but one should run whichever is preferred. In this document lines with input are shown with a leading space and output is shown without. In the examples below the user starts a terminal window in the directory with the k binary file. Then the user enters rlwrap ./k RET. k9 starts, displays the date of the build, (c), and shakti, and then listens for user input. In this example I have entered the command to exit k9, \\. Then I start k9 again without rlwrap and again exit the session.

 rlwrap ./k
Sep 13 2020 16GB (c) shakti
 \\

 ./k
Sep 13 2020 16GB (c) shakti
 \\

1.6 Simple example

Here I will start up k9, perform some trivial calculations, and then close the session. After this example it will be assumed the user has a k9 session running in REPL mode. Comments (/) will be added to the end of lines as needed. One can review plus, enum, floor and timing as needed.

 1+2  / add 1 and 2
3

 !5   / generate a list of 5 integers from 0 to 4
0 1 2 3 4

 1+!5 / add one to each element of the list
1 2 3 4 5

 !_100e6;     / generate a list of 100 million integers (suppress output with ;)
 1+!_100e6;   / do 100 million sums
 \t 1+!_100e6 / time the operations in milliseconds
82

Now let’s exit the session.

 \\
bash-3.2$ 

1.7 Document formatting for code examples

This document uses a number of examples to familiarize the reader with k9. The syntax is to show input with a leading space and output without a leading space. This follows the terminal, where REPL input has a leading space and output is printed without one.

 3+2 / this is input
5    / this is output

1.8 k9 idiosyncrasies

One will need to understand some basic rules of k9 in order to progress. These may seem strange at first, but the faster you learn them, the faster you’ll move forward. Also, some of them, like overloading based on the number of arguments, add a lot of expressiveness to the language.

1.8.1 Colon (:) is used to set a variable to a value

a:3 is used to set the variable, a, to the value, 3. a=3 is an equality test to determine if a is equal to 3.

1.8.2 Percent (%) is used to divide numbers

Yeah, 2 divided by 5 is written as 2%5, not 2/5. This choice is because % is similar to ÷, and the \ and / symbols are used elsewhere.

1.8.3 Evaluation is done right to left

2+5*3 is 17 and 2*5+3 is 16. 2+5*3 is first evaluated on the rightmost portion, 5*3, and once that is computed it proceeds with 2+15. 2*5+3 goes to 2*8, which becomes 16.

1.8.4 There is no arithmetic order

+ has the same precedence as *. Evaluation proceeds right to left unless parentheses are used. (2+5)*3 = 21, as the 2+5 in parentheses is done before being multiplied by 3.
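
The last three rules (% for division, right-to-left evaluation, no precedence) can be made concrete with a toy evaluator. The Python sketch below is an illustration only and not how k9 parses anything: it handles non-negative integers, parentheses, and the two-argument forms of + - * %, and the function names are hypothetical. The grammar is simply "operand, optionally followed by an operator and the rest of the expression", which is what makes everything to the right of an operator evaluate first.

```python
import operator

# Two-argument verbs only; % is division, as in k9.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "%": operator.truediv}

def tokenize(src):
    """Split the source into integers, operators and parentheses."""
    tokens, i = [], 0
    while i < len(src):
        ch = src[i]
        if ch.isdigit():
            j = i
            while j < len(src) and src[j].isdigit():
                j += 1
            tokens.append(int(src[i:j]))
            i = j
        elif ch in OPS or ch in "()":
            tokens.append(ch)
            i += 1
        elif ch.isspace():
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r}")
    return tokens

def _operand(tokens, pos):
    """Read a number or a parenthesised sub-expression."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        value, pos = _expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("missing )")
        return value, pos + 1
    if isinstance(tok, int):
        return tok, pos + 1
    raise ValueError(f"unexpected token {tok!r}")

def _expr(tokens, pos):
    """expr := operand [op expr]; the whole right side is evaluated first."""
    left, pos = _operand(tokens, pos)
    if pos < len(tokens) and tokens[pos] in OPS:
        op = OPS[tokens[pos]]
        right, pos = _expr(tokens, pos + 1)
        return op(left, right), pos
    return left, pos

def eval_rtl(src):
    """Evaluate src right to left with no operator precedence."""
    tokens = tokenize(src)
    value, pos = _expr(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing input")
    return value

print(eval_rtl("2+5*3"))    # 17
print(eval_rtl("2*5+3"))    # 16
print(eval_rtl("(2+5)*3"))  # 21
print(eval_rtl("2%5"))      # 0.4
```

Because there is only one rule, there is no precedence table to remember: an operator always takes everything to its right as its right argument.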

1.8.5 Operators are overloaded depending on the number of arguments.

 *(13;6;9)    / single argument: * returns the first element
13
 2*(13;6;9)   / two arguments: * is multiplication
26 12 18
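
The same dispatch-on-argument-count idea can be sketched in Python for readers who find it unfamiliar. The function star below is a hypothetical stand-in for the * verb, not part of any k9 interface, and its two-argument form only covers the scalar-times-list case shown above.

```python
def star(*args):
    """Hypothetical stand-in for k9's * verb, dispatching on argument count."""
    if len(args) == 1:
        # One argument: return the first element of the list.
        (x,) = args
        return x[0]
    if len(args) == 2:
        # Two arguments: multiply a scalar by each element of a list.
        x, y = args
        return [x * v for v in y]
    raise TypeError("star takes one or two arguments")

print(star([13, 6, 9]))     # 13
print(star(2, [13, 6, 9]))  # [26, 12, 18]
```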

1.8.6 Lists and functions are very similar.

k9 syntax encourages you to treat lists and functions in a similar fashion. Both should be thought of as a mapping from one value to another, or from a domain to a range. Lists and functions do not, however, have the same type.

 l:3 4 7 12
 f:{3+x*x}
 l@2
7
 f@2
7
 @l
`I
 @f
`.
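
The session above can be mirrored in Python to show the idea of one "apply" operation that works on both kinds of mapping. The helper at is hypothetical and only imitates what @ does in this example; Python itself keeps indexing and calling as separate syntax.

```python
def at(m, i):
    """Apply mapping m to i: call it if it is a function, index it otherwise."""
    return m(i) if callable(m) else m[i]

l = [3, 4, 7, 12]
f = lambda x: 3 + x * x

print(at(l, 2), at(f, 2))                  # 7 7
print(type(l).__name__, type(f).__name__)  # list function
```

As in k9, the two objects give the same answer when applied to 2 but remain different types.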

1.8.7 k9 notions of Noun, Verb, and Adverb

k9 uses an analogy with grammar to describe language syntax. The k9 grammar consists of nouns (data), verbs (functions) and adverbs (function modifiers).

In k9, as the Help/Info card shows, data are nouns, functions and lists are verbs, and modifiers are adverbs.

