
1 Introduction

The k9 programming language is designed primarily for the analysis of data. It may surprise new users with two rather unusual features: (1) fast data analysis and (2) concise syntax. After you become familiar with the language, both features will seem normal, and going back to slow and verbose programming will be surprisingly difficult.

1.1 Going fast

Imagine you have a small, on-disk, 100 million row database containing a time series with two float values at each time. Suppose this data is split across three tables, each covering a different measurement. Here’s how fast k9 can read the data from disk and compute a statistic, the average difference over each table, which uses each and every row.

 bash-3.2$ ./k
 2020.xx.xx 17GB 4core (c) shakti
 \t q:2:`q;{select s:avg a-b from x}'q[!q]
495

That’s 495 ms to read the data in from disk and compute over all 100 million rows. Reading the data takes most of that time. If the data were already in memory, the computation would be faster:

 \t {select s:avg a-b from x}'q[!q]
230

230 ms, not bad for 100 million calculations.
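
For readers coming from another language, the query {select s:avg a-b from x}'q[!q] can be read as "for each table in q, take the mean of a-b over all rows". The Python sketch below spells out that reading only. The two miniature tables and their values are made up for illustration, avg_spread is a hypothetical helper name, and the sketch says nothing about how k9 achieves its speed.

```python
from statistics import mean

# Hypothetical miniature stand-in for the on-disk database q:
# each table maps column names to equal-length lists of values.
q = {
    "eurusd": {"b": [1.10, 1.11, 1.12], "a": [1.12, 1.13, 1.14]},
    "usdjpy": {"b": [104.0, 104.5], "a": [104.25, 104.75]},
}

def avg_spread(table):
    """Mean of a-b over every row of one table."""
    return mean(a - b for a, b in zip(table["a"], table["b"]))

# Apply the same statistic to each table, much as the each (') adverb does.
result = {name: avg_spread(table) for name, table in q.items()}
```

The dictionary comprehension plays the role of the each adverb: one function, applied to every table in turn.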

The code to generate the on-disk database is presented below. Speed of course will depend on your hardware so times will vary.

 nf:d+*|d:(|-d),d:683 954 997 1000;
 T:^`t rand[;_8.64e7]@
 B:100++\1e-2*-3+nf bin/:rand[;*|nf]@
 L:{select t,b,a:b+s from +`t`b`s!(T;B;S)@'x}
 q:`eurusd`usdjpy`usdchf!L'_60e6 20e6 20e6
 `q 2:q

1.2 Going concise

The k9 language is more closely related to mathematical notation than most programming languages. It requires the developer to learn to “speak” k9, but once that happens most find they can speak more quickly and more accurately in k9 than in other languages. At this point an example might help.

In mathematics, “3+2” is read as “3 plus 2” because you learn at an early age that “+” is the “plus” sign. For trivial operations like arithmetic most programming languages use symbols too. When moving beyond arithmetic, most programming languages switch to words while k9 stays with symbols. As an example, to determine the distinct values of a list most programming languages might use a syntax like distinct() while k9 uses ?. This requires the developer to learn how to say a number of symbols, but once that happens the result is much shorter code that is quicker to write, easier to inspect, and easier to maintain.

This should not be surprising. In arithmetic, which do you find easier to answer?

Math with text
Three plus two times open parenthesis six plus fourteen close parenthesis

Math with symbols
3+2*(6+14)

In code, which will you eventually find easier to understand?

Code with text
x = (0, 12, 3, 4, 1, 17, -5, 0, 3, 11); y = 5
distinct_x = list(dict.fromkeys(x))  # distinct values, in first-seen order
gt_distinct_x = [i for i in distinct_x if i > y]  # keep values greater than y

Code with symbols

If you’re new to k9 then you’ll appreciate that symbols are shorter but look like line noise. That’s true, but so did arithmetic until you learned the basics.

When you first learned arithmetic you likely didn’t have a choice. Now you have a choice whether or not you want to learn k9. If you give it a try, you’ll get it quickly and move on to the power phase fast enough that you’ll be happy you gave it a chance.

1.3 Get k9.

You will find the Linux version in the linux directory and the MacOS version under macos. Once you download the MacOS version you’ll have to change its file permissions to allow it to execute.

 chmod u+x k

On the Mac, if you then attempt to run this file you likely won’t succeed because of MacOS security. You’ll need to go to “System Preferences...” and then “Security and Privacy” and choose to allow this binary to run. (The binary only appears there after you have tried to run it and failed.)

1.4 Help / Info Card

Typing \ at the k9 prompt gives you a concise overview of the language. This document aims to provide details to beginning users where the help screen is a bit too terse. Some commands are not yet complete and are thus marked with an asterisk, e.g. *update A by B from T where C.

*ffi: a:"./"5:`f!"ii";a.f[2;3] / int f(int i,int j){return i+j;}
*k/c: b:"./"5:`f!2   ;b.f[2;3] / K f(K x,K y){return ki(xi+yi);}
`csv?`csv t:,`js?`js d:[d:.z.d;t:.z.t;n:`ab;i:23;f:4.5]
python: import k;k.k('+',2,3); nodejs: require('k').k('+',2,3)

verb                   adverb                  noun
: x         y          f' each      g' each    char " ab"              \l a.k
+ flip      plus    [x]f/ over      c/ join    name ``ab               \t:n x
- minus     minus   [x]f\ scan      c\ splt    int  2 3                \u:n x
* first     times   [y]f':eachprior            flt  2 3.4 0w 0n        \v
%           divide     f/:eachright g/:over    date 2021.06.28   .z.d
& where     min/and    f\:eachleft  g\:scan    time 12:34:56.789 .z.t
| reverse   max/or     
< asc       less       i/o (*enterprise)       class                   \f
> desc      more       0: r/w line (N;C)0:     list (2;3.4;`c)         \fl x
= group     equal      1: r/w char             dict [n:`b;i:2]         \fc x   
~ not       match     *2: r/w data             func {[a;b]a+b}         \fs x
! key       key       *3: k-ipc set            expr :a+b               \cd [d]
, enlist    cat       *4: https get
^ sort   [f]cut       *5: ffi/iff[py/js/..]    table [[]n:`b`c;i:2 3]
# count  [f]take                              utable  [[n:`b`c]i:2 3]
_ floor  [f]drop                      
$ string    parse      $[b;t;f] cond
? unique [n]find                               limit
@ type   [n]at         @[x;i;f[;y]] amend      name8(*256)
. value     dot        .[x;i;f[;y]] dmend      code p8 l8 g32 c128

select A by B from T where C; update A from T; delete from T where C
count first last min max sum dot avg var [dev med mode ..]
sqrt sqr exp log sin cos div mod bar in bin
/comment \trace [:return 'signal if do while] \\exit

1.5 rlwrap

Although you only need the k binary to run k9 most will also install rlwrap, if not already installed, in order to get command history in a terminal window. rlwrap is “Readline wrapper: adds readline support to tools that lack it” and allows one to arrow up to go through the command buffer history.

To start k9, run either k or rlwrap k. Here I will show both options, but you should run whichever you prefer. In this document lines with input are shown with a leading space and output is shown without. In the example below the user starts a terminal window in the directory with the k binary file. Then the user enters rlwrap ./k RET. k9 starts, displays the date of the build, (c), and shakti, and then listens for user input. In this example I have entered the command to exit k9, \\. Then I start k9 again without rlwrap and again exit the session.

 rlwrap ./k
Sep 13 2020 16GB (c) shakti
 \\
 ./k
Sep 13 2020 16GB (c) shakti
 \\

1.6 Simple example

Here I will start up k9, perform some trivial calculations, and then close the session. After this example it will be assumed that the user has a k9 session running in REPL mode. Comments (/) will be added to the end of lines as needed. One can review plus, enum, floor and timing as needed.

 1+2  / add 1 and 2
3

 !5   / generate a list of 5 integers from 0 to 4
0 1 2 3 4

 1+!5 / add one to each element of the list
1 2 3 4 5

 !_100e6;     / generate a list of 100 million integers (suppress output with ;)
 1+!_100e6;   / do 100 million sums
 \t 1+!_100e6 / time the operations in milliseconds

Now let’s exit the session.

 \\


1.7 Document formatting for code examples

This document uses a number of examples to familiarize the reader with k9. The convention is to show input with a leading space and output without a leading space. This follows the terminal, where the REPL indents input with a space but prints output without one.

 3+2 / this is input
5    / this is output

1.8 k9 Idiosyncrasies

One will need to understand some basic rules of k9 in order to progress. These may seem strange at first, but the faster you learn them, the faster you’ll move forward. Also, some of them, like overloading based on the number of arguments, add a lot of expressiveness to the language.

1.8.1 The language changes often (for now).

There may be examples in this document which work on the version indicated but do not with the version currently available to download. If so, then feel free to drop the author a note. Items which currently cause an error but are likely to come back ’soon’ will be left in the document.

1.8.2 Colon (:) is used to set a variable to a value

a:3 is used to set the variable, a, to the value, 3. a=3 is an equality test to determine if a is equal to 3.

1.8.3 Percent (%) is used to divide numbers

Yeah, 2 divided by 5 is written as 2%5, not 2/5.

1.8.4 Evaluation is done right to left

2+5*3 is 17 and 2*5+3 is 16. In 2+5*3 the rightmost portion, 5*3, is evaluated first, and once that is computed evaluation proceeds with 2+15. 2*5+3 goes to 2*8, which becomes 16.

1.8.5 There is no arithmetic order

+ has the same precedence as *. Evaluation proceeds right to left unless parentheses are used. (2+5)*3 is 21, as the 2+5 in parentheses is computed before being multiplied by 3.
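
The two rules above (right-to-left evaluation, no operator precedence) can be mimicked in a few lines of Python. This is only a sketch of the rule, not of how k9 parses anything: eval_right_to_left is a hypothetical helper, it handles only non-negative numbers joined by single-character verbs, and it does not support parentheses.

```python
import operator

# Dyadic verbs used in this section; '%' is division, as in k9.
VERBS = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "%": operator.truediv}

def eval_right_to_left(expr):
    """Evaluate 'number verb number verb ...' right to left, no precedence."""
    parts, number = [], ""
    # Split the text into alternating numbers and verbs.
    for ch in expr.replace(" ", ""):
        if ch in VERBS:
            parts.extend([float(number), ch])
            number = ""
        else:
            number += ch
    parts.append(float(number))
    # Start from the rightmost number and fold towards the left.
    result = parts.pop()
    while parts:
        verb = parts.pop()
        left = parts.pop()
        result = VERBS[verb](left, result)
    return result
```

With this sketch, eval_right_to_left("2+5*3") returns 17.0 and eval_right_to_left("2*5+3") returns 16.0, matching the values above. The same rule makes "10-2-3" come out as 11.0, because 2-3 is computed first.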

1.8.6 Operators are overloaded depending on the number of arguments.

 *(13;6;9)    / single argument: * returns the first element
13
 2*(13;6;9)   / two arguments: * is multiplication
26 12 18
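
If overloading on argument count is unfamiliar, the following Python sketch imitates the two meanings of * shown above. The function name star is hypothetical, and the sketch covers only the two cases from the example (first element of a list, and a number times each element of a list).

```python
def star(*args):
    """Hypothetical stand-in for the two uses of * shown above."""
    if len(args) == 1:
        # One argument: return the first element of the list.
        return args[0][0]
    if len(args) == 2:
        # Two arguments: multiply a number into each element of a list.
        x, y = args
        return [x * item for item in y]
    raise TypeError("star takes one or two arguments")
```

Here star([13, 6, 9]) returns 13 and star(2, [13, 6, 9]) returns [26, 12, 18]. In k9 the choice is made by the position of the symbol in the expression rather than by an explicit length check.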

1.8.7 Lists and functions are very similar.

k9 syntax encourages you to treat lists and functions in a similar fashion. Both should be thought of as a mapping from one value to another, or from a domain to a range. Lists and functions do not, however, have the same type.

 l:3 4 7 12
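
The "both are mappings" view can be illustrated in Python, where the two normally use different syntax. In this sketch the list l mirrors the one above, while the function f and the helper lookup are hypothetical names introduced only for illustration.

```python
l = [3, 4, 7, 12]  # a list maps an index (0..3) to a value

def f(x):
    """Hypothetical function: maps a number to 3 plus its square."""
    return 3 + x * x

def lookup(mapping, key):
    """Use a list or a function through one interface."""
    return mapping(key) if callable(mapping) else mapping[key]
```

Both lookup(l, 2) and lookup(f, 2) return 7: the list finds the value at index 2, and the function computes 3+2*2. The caller does not need to know which kind of mapping it was given.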

1.8.8 k9 notions of Noun, Verb, and Adverb

k9 uses an analogy with grammar to describe language syntax. The k9 grammar consists of nouns (data), verbs (functions) and adverbs (function modifiers).

As the Help/Info card shows, in k9 data are nouns, functions/lists are verbs, and modifiers are adverbs.
