The k9 programming language is designed primarily for the analysis of data. It may surprise new users with two rather different paradigms, (1) fast data analysis and (2) concise syntax. After you become familiar with the language, these features will both seem normal and intuitive. Also, going back to slow and verbose programming will be surprisingly difficult.
Imagine you have a small, on-disk, 100 million row database containing a time-series with two float values at each time. Additionally this data could be split in three different tables covering different measurements. Here’s how fast k9 can read in the data from disk and compute a statistic, average difference over each table, which uses each and every row.
This section requires 2: a feature only in the enterprise version of Shakti. If that is not available then use the section below with 1:
bash-3.2$ ./k
2021.xx.xx 17GB 4core (c) shakti
 \t q:2:`q;{select s:avg a-b from x}'q[!q]
884
That’s 884 ms to read the data in from disk and compute over all the 100 million values. The data read is the biggest bit. If the data were already in memory then the computation would be faster:
 \t {select s:avg a-b from x}'q[!q]
217
217 ms, not bad for 100 million calculations.
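For readers more at home in Python, here is a rough sketch of what the query computes, not of how k9 computes it. The miniature tables and their values are invented for illustration; only the column names t, b, a and the table names come from this tutorial.

```python
from statistics import mean

# Hypothetical miniature of q: table name -> rows of (t, b, a).
# The values are invented; the real tables hold millions of rows.
q = {
    "eurusd": [(0, 100.0, 100.5), (1, 101.0, 101.25)],
    "usdjpy": [(0, 200.0, 200.25), (1, 201.0, 201.25)],
}

def avg_diff(rows):
    """Rough mirror of `select s:avg a-b from x`: mean of a-b over every row."""
    return mean(a - b for _, b, a in rows)

# Rough mirror of applying the query to each table in q.
s = {name: avg_diff(rows) for name, rows in q.items()}
print(s)
```

Every row of every table is touched once, which is why the timing above is a fair measure of raw throughput.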
The code to generate the on-disk database is presented below. Speed of course will depend on your hardware so times will vary.
 nf:d+*|d:(|-d),d:683 954 997 1000;
 T:^`t ?[;_8.64e7]@
 B:100++\1e-2*-3+nf bin/:?[;*|nf]@
 S:?[;1e-2*2,2,8#1]@
 L:{select t,b,a:b+s from +`t`b`s!(T;B;S)@'x}
 q:`eurusd`usdjpy`usdchf!L'_60e6 20e6 20e6
 `q 2:q
This section requires 1: a feature in all versions of Shakti.
bash-3.2$ ./k
2021.xx.xx 17GB 4core (c) shakti
 \t select s:avg a-b from q:`csv?1:"q.csv"
832
That’s 832 ms to read the data in from disk and compute over all the 10 million values. The data read and csv conversion process are the biggest bits.
Here is the code to generate the q.csv on-disk file. Note in this example only 10 million lines are generated versus the 100 million lines in the previous example using 2:
 nf:d+*|d:(|-d),d:683 954 997 1000;
 T:^`t ?[;_8.64e7]@
 B:100++\1e-2*-3+nf bin/:?[;*|nf]@
 S:?[;1e-2*2,2,8#1]@
 L:{select t,b,a:b+s from +`t`b`s!(T;B;S)@'x}
 q:L[_10e6]
 "q.csv"1:`csv@q
"q.csv"
The k9 language is more closely related to mathematical notation than most programming languages. It requires the developer to learn to “speak” k9, but once that happens most find they can speak more quickly and more accurately in k9 than in other languages. At this point an example might help.
In mathematics, “3+2” is read as “3 plus 2”, as you learn at an early age that “+” is the “plus” sign. For trivial operations like arithmetic, most programming languages also use symbols. When moving beyond arithmetic, most programming languages switch to words while k9 stays with symbols. As an example, to determine the distinct values of a list most programming languages might use a syntax like distinct() while k9 uses ?. This requires the developer to learn how to say a number of symbols, but once that happens it results in much shorter code that is quicker to write, easier to inspect, and easier to maintain.
This should not be surprising. In arithmetic, which do you find easier to understand?
/ math with text
Three plus two times open parenthesis six plus fourteen close parenthesis
/ math with symbols
3+2*(6+14)
In code, if you’re new to k9 then it’s unlikely you can understand the second example.
/ code with text
x = (0,12,3,11,3);y=5;
distinct_x = list(set(x));
sorted(i for i in distinct_x if i >= y)
/ code with symbols
x:0 12 3 11 3;y:5;
z@&y<z:?x
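To help decode the symbolic version, here is a step-by-step Python reading of it. This is a sketch under one assumption: that ? returns the distinct values in order of first appearance. It follows the symbolic line literally, so it compares strictly (y<z) and does not sort, unlike the text version above.

```python
x = (0, 12, 3, 11, 3)
y = 5

# z:?x  -> distinct values of x (dict.fromkeys keeps first-occurrence order)
z = list(dict.fromkeys(x))

# y<z   -> one boolean per element of z
mask = [y < v for v in z]

# &mask -> positions where the boolean is true
where = [i for i, flag in enumerate(mask) if flag]

# z@... -> index z at those positions
result = [z[i] for i in where]
print(result)  # [12, 11]
```

Reading right to left, the one-liner is four small steps: distinct, compare, where, index.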
When you first learned arithmetic you likely didn’t have a choice. Now you have a choice whether or not you want to learn k9. If you give it a try, then you’ll likely get it quickly and move on to the power phase fast enough that you’ll be happy you gave it a chance.
k9 is available in two versions, standard (under download) and enterprise. The enterprise version has additional features indicated on the k9 help page and also indicated in this tutorial.
Once downloaded you will need to change the file mode with the following command:
chmod +x k
On a Mac, if you then attempt to run this file you likely won’t succeed due to macOS security. You’ll need to go to “System Preferences...” and then “Security and Privacy” and select to allow this binary to run. (The binary only appears there after you have tried to run it and failed.)
On Linux (and macOS if you have installed npm) one can download k from the command line via
npm i @kparc/k -g
Typing \ in a terminal window gives you a concise overview of the language. This document aims to provide details to beginning users where the help screen is a bit too terse. Some commands are not available in the basic version and are thus marked with an asterisk, e.g. *4: https get.
select count first last min max sum avg var dev .. by .. in n_(rand) n@(multiply) n?(divide) n@n?(bar)

Verb      monad    Adverb           Type
+ +                '  each          char " ab"
- -                /  over          sym ``ab
* *                \  scan          bool 011b
% div                               int 2 3 4
! mod     where    System           float 2 3e4
& &       flip     \l load          -fixed 2.0 3.4
| |       reverse  \t time          -locus -74::40.7
< <       asc      \v vars          z.d date 2001.02.03
> >       desc     \w work          z.t time 12:34:56.789
= =       freq                      z.T datetime
~ ~       ~
, ,       ,
# take    count    I/O              Class
_ drop    first    0' line          expr :2+a
^ cut     sort     1' char/stdout   func f[a] 2+a
@ @       type     2' data/stderr
? find    unique   *3' set          list (2;3.4)
$ parse   str      *4' get          dict {a:2 3}
. dict    value    *5' ffi          table [a:2 3]
Although you only need the k binary to run k9, most users also install rlwrap (if it is not already installed) in order to get command history in a terminal window. rlwrap is a “Readline wrapper: adds readline support to tools that lack it” and allows one to arrow up through the command history.
To start k9, run either k or rlwrap k. Here I will show both options, but you should run whichever you prefer. In this document lines with input are shown with a leading space and output is shown without. In the examples below the user starts a terminal window in the directory with the k binary file. Then the user enters rlwrap ./k RET. k9 starts, displays the date of the build, (c), and shakti, and then listens for user input. In this example I have entered the command to exit k9, \\. Then I start k9 again without rlwrap and again exit the session.
rlwrap ./k
Sep 13 2020 16GB (c) shakti
 \\
./k
Sep 13 2020 16GB (c) shakti
 \\
k9 runs as a read-eval-print loop (REPL). This means that one programs either in an interactive environment (e.g. a shell/terminal window) or by running a script. There is no need to compile code into an executable.
Here I will start up k9, perform some trivial calculations, and then close the session. After this example it will be assumed the user has a k9 session running and is working in REPL mode. Comments (/) will be added to the end of lines as needed. One can review plus, where, floor and timing as needed.
 1+2 / add 1 and 2
3
 !5 / generate a list of 5 integers from 0 to 4
0 1 2 3 4
 1+!5 / add one to each element of the list
1 2 3 4 5
 !_100e6; / generate a list of 100 million integers (suppress output with ;)
 1+!_100e6; / do 100 million sums
 \t 1+!_100e6 / time the operations in milliseconds
82
Now let’s exit the session.
 \\
bash-3.2$
This document uses a number of examples to familiarize the reader with k9. The convention is to show input with a leading space and output without one. This follows the terminal, where REPL input has a leading space and output is printed without one.
 3+2 / this is input
5 / this is output
One will need to understand some basic rules of k9 in order to progress. These may seem strange at first but the faster you learn them, the faster you’ll move forward. Also, some of them, like overloading based on number of arguments, add a lot of expressiveness to the language.
Colon (:) is used to set a variable to a value
a:3 is used to set the variable, a, to the value, 3. a=3 is an equality test to determine if a is equal to 3.
Percent (%) is used to divide numbers
Yeah, 2 divided by 5 is written as 2%5, not 2/5. This choice is because % is similar to ÷, and the \ and / symbols are used elsewhere.
2+5*3 is 17 and 2*5+3 is 16. 2+5*3 is first evaluated on the right most portion, 5*3, and once that is computed then it proceeds with 2+15. 2*5+3 goes to 2*8 which becomes 16.
+ has the same precedence as *. Evaluation proceeds right to left unless parentheses are used. (2+5)*3 is 21, as the 2+5 in parentheses is done before being multiplied by 3.
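If the rule feels odd, a toy evaluator may help. The following Python sketch is an illustration only, not how k9 is implemented: the function name and the token-list input are made up here, and it handles just flat expressions of alternating numbers and operators with no parentheses.

```python
import operator

# Operators in the k9 spelling used by this tutorial: % is division.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "%": operator.truediv,
}

def eval_right_to_left(tokens):
    """Evaluate [number, op, number, op, ...] right to left, ignoring precedence."""
    result = tokens[-1]
    # Step leftwards over (left operand, operator) pairs.
    for i in range(len(tokens) - 2, 0, -2):
        result = OPS[tokens[i]](tokens[i - 1], result)
    return result

print(eval_right_to_left([2, "+", 5, "*", 3]))  # 17
print(eval_right_to_left([2, "*", 5, "+", 3]))  # 16
print(eval_right_to_left([2, "%", 5]))          # 0.4
```

The rightmost operation is always computed first, and every operator simply takes everything to its right as its right argument.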
 *(13;6;9) / single argument: * returns the first element
13
 2*(13;6;9) / two arguments: * is multiplication
26 12 18
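The same idea, one name with a meaning that depends on the number of arguments, can be imitated in Python. This is only an analogy: star is a hypothetical name chosen here, and the two-argument case covers just the number-times-list form shown above.

```python
def star(*args):
    """Imitate k9's * : first element with one argument, multiplication with two."""
    if len(args) == 1:
        (lst,) = args
        return lst[0]               # one argument: first element
    x, y = args
    return [x * v for v in y]       # two arguments: multiply each element by x

print(star([13, 6, 9]))     # 13
print(star(2, [13, 6, 9]))  # [26, 12, 18]
```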
k9 syntax encourages you to treat lists and functions in a similar fashion. Both should be thought of as a mapping from a value to another value, or from a domain to a range. Lists and functions do not, however, have the same type.
 l:3 4 7 12
 f:{3+x*x}
 l@2
7
 f@2
7
 @l
`I
 @f
`.
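For comparison, here is the same pair in Python, where indexing and calling use different syntax. The helper at is hypothetical, introduced here only to stand in for the single @ that k9 uses for both.

```python
l = [3, 4, 7, 12]

def f(x):
    return 3 + x * x

def at(m, i):
    """Hypothetical stand-in for @: call m if it is a function, else index it."""
    return m(i) if callable(m) else m[i]

print(at(l, 2))  # 7
print(at(f, 2))  # 7

# As in k9, the two mappings still have different types.
print(type(l).__name__, type(f).__name__)  # list function
```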
k9 uses an analogy with grammar to describe language syntax. The k9 grammar consists of nouns (data), verbs (functions) and adverbs (function modifiers).
In k9, as the Help/Info card shows, data are nouns, functions/lists are verbs and modifiers are adverbs.