I want to keep my sym file small. How can I find out whether I will be introducing a new symbol before splaying the data out to disk?

You can detect whether using a symbol increases the total number of interned symbols in the pool by checking .Q.w before and after:

$ rlwrap q
KDB+ 2.7 2011.02.16 Copyright (C) 1993-2011 Kx Systems
q).Q.w[]
used| 108432
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
syms| 537
symw| 15616
q)`1              // force kdb to internalize a new symbol
`1
q).Q.w[]
used| 108432
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
syms| 538         // increased by 1
symw| 15634       // increase of +18 bytes
q)`12             // force kdb to internalize another
`12
q).Q.w[]
used| 108432
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
syms| 539         // increase by 1 
symw| 15653       // increase of 19 bytes 
q)`123
`123
q).Q.w[]
used| 108432
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
syms| 540         // increase by 1
symw| 15673       // increase of 20 bytes
q)
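
Note that the check above interns the symbol as a side effect. If what you care about is the sym file itself, a minimal sketch is to compare the incoming symbols against the enumeration domain before enumerating. This assumes the domain is loaded in a list named sym (the usual convention); newSyms is a hypothetical helper name, and the data is made up:

```q
/ assumed: sym is the enumeration domain loaded from the sym file
sym: `ibm`msft`goog

/ hypothetical helper: the distinct symbols in x that are not yet in sym
newSyms: {[x] distinct[x] except sym}

/ only `aapl is missing from the domain
newSyms `ibm`aapl`msft`aapl
```

If count newSyms col is zero, enumerating col against sym should not grow the file.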

See this solution to compacting a bloated sym file

How can I define my own infix functions?

Put them in the .q workspace, e.g.,

q).q.f: {x + y}
q)5 f 6
11
q)

This is generally considered bad style.

Why doesn’t my query 'select from reallybigtable' ever return?

It's a valid query, but q will run out of memory before returning. You'll have to add some constraints to the query to reduce the size of the result set. Something like

select from reallybigtable where date = YYYY.MM.DD

would be a good start.
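
As a rough illustration, the sketch below builds a tiny in-memory stand-in for reallybigtable (the column names and values are invented) and shows a constrained query, plus a cheap per-date row count you might run first to see how much data a constraint would pull back:

```q
/ toy stand-in for reallybigtable; columns and values are illustrative only
reallybigtable: ([] date: 2011.03.24 2011.03.25 2011.03.25; sym: `ibm`ibm`msft; price: 1.0 2.0 3.0)

/ constrain on date first, then narrow further
r: select from reallybigtable where date = 2011.03.25, sym = `ibm

/ row counts per date, to gauge result sizes before selecting everything
c: select count i by date from reallybigtable
```

On a date-partitioned database, putting the date constraint first matters, since it limits which partitions q has to read.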

How do I store mixed data types in a single column of a table?

The first time we needed a column that can hold values of multiple types, we tried something similar to the following:

q)t: ([] thing: ())
q)meta t
c    | t f a
-----| -----
thing|
q)

True, when you define a table column as an untyped list, it can contain anything. However, the first time you put something in it, the column's type is set to the type of that first inserted item:

q)`t insert `aunt
,0
q)`t insert 33.33
'type
q)meta t
c    | t f a
-----| -----
thing| s

In our example, the type of the thing column has been promoted to symbol; we can't insert numbers into it.

When you want to store elements of different types in the same column, you have to prevent q from fixing the column's type. You do this by seeding the table with a sentinel row (i.e., a row of dummy values) whose value for the column in question is itself a list:

q)t: ([] thing: enlist "sentinel")
q)meta t
c    | t f a
-----| -----
thing| C

Don't believe the hype. The meta function is trying to help out by actually inspecting the first value in the column when reporting the thing column's type. However, the true type of the column is untyped, i.e., zero:

q)type t `thing
0h
q)

You can, therefore, put whatever you like into t's thing column:

q)`t insert/: (`gave; 42)
1
2
q)t
thing
----------
"sentinel"
`gave
42
q)meta t
c    | t f a
-----| -----
thing| C
q)

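Beware of cleaning out such a table too thoroughly (in, say, an end-of-day function). Deleting every row removes the sentinel too, so the column reverts to an empty untyped list and the next insert will fix its type again: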

q)delete from `t
`t
q)meta t
c    | t f a
-----| -----
thing|
q)
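
One way to avoid this, sketched here under the assumption that the sentinel is always row 0, is to delete everything except the first row:

```q
t: ([] thing: enlist "sentinel")
`t insert/: (`gave; 42)

/ keep row 0 (the sentinel), drop everything after it
delete from `t where i > 0

/ the column is still a general list, so mixed types are still accepted
type t `thing
`t insert 33.33
```

After the delete the column still holds the sentinel string, so it stays untyped and the final insert of a float succeeds.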

See this related faq on untyped lists.

How does kdb compare to cloud computing?

Cloud-based solutions (such as Hadoop) are great if

  • you have developers who know how to program them
  • you don’t mind putting your data on a cloud
  • the data you need for even a single query can't fit in memory
  • you can afford to wait minutes for your query results

For some applications, then, clouds are an excellent approach. kdb+, however, is typically used in scenarios where

  • you can't (e.g., due to licensing) or don't want your data outside your firm
  • you don't have an in-house cloud infrastructure
  • you don't want each query to be an IT project in and of itself
  • you want the answer to your query in milliseconds (or seconds at worst)
  • you can fit your data in memory (not necessarily all at once)

How do I extract the milliseconds from a time?

Use mod 1000:

q)now: .z.T
q)now
00:15:00.812
q)now mod 1000
00:00:00.812

If you want the milliseconds as an integer, you can simply follow the mod with a conversion to int:

q)`int$ now mod 1000
812

Don't pass a datetime to mod directly. A datetime is stored as a floating-point count of days, so the result you'll get is not what you had in mind. Convert it to a time first:

q)now: .z.Z
q)now
2011.03.26T09:51:26.624
q)now mod 1000
102.4107
q)`int$ (`time$ now) mod 1000
624
q)

Can I simply invoke garbage collection and clear the root namespace instead of restarting the process to reinitialize my database?

Almost. The symbol pool, however, will remain uncleared:

$ rlwrap q
KDB+ 2.7 2011.02.16 Copyright (C) 1993-2011 Kx Systems
q).Q.gc[]
0j     / as expected, gc is a no-op
q).Q.w[]
used| 108432
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
syms| 538
symw| 15638
q)-1000000?`6    / 1 million symbols of length 6
`milgli`igfbag`kaodhb`bafclb`kfhogj`jecpae`kfmohp`lk..
q).Q.w[]
used| 108592
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
syms| 1000539 / bumped by a million
symw| 31400168
q).Q.gc[]
0j      / no-op again!
q).Q.w[]
used| 108592
heap| 67108864
peak| 67108864
wmax| 0
mmap| 0
syms| 1000539 / remains unchanged
symw| 31400168
q)

How portable is q code across operating systems?

kdb is available on Windows, Linux, Solaris and Mac OS X. Like most functional languages, q is a very high-level language. As long as you stick with forward slashes in your file paths, you will rarely notice the difference between operating systems. The only significant issue we've come across is that multi-threading is not supported on Macs.

If you do need to distinguish different operating systems, use the variable .z.o to detect the operating system in which your code is running.
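
A minimal sketch of branching on .z.o follows. It assumes the usual convention that .z.o is a symbol such as `l64 or `w32, whose first letter identifies the OS family; osNames and osFamily are hypothetical names, and anything outside the four letters listed comes back as the null symbol:

```q
/ assumed mapping from the first letter of .z.o to an OS family
osNames: `l`w`m`s ! `linux`windows`mac`solaris

/ hypothetical helper: look up the family from a .z.o-style symbol
osFamily: {[o] osNames[`$ 1# string o]}

osFamily .z.o      / family of the current process
osFamily `w32      / `windows
```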

Given a value, how do I get the null of that value's type?

We take advantage of a property of q's indexing: If you request an item from a list (or a dictionary, for that matter) using an index that is out of range, q returns the null item for the list's type.

q)x: 1 2 3
q)x 3
0N
q)

Thus, we can take the item whose corresponding null we want, make a list from it, and then access that list with an out-of-range index:

q)NullOf: {[item] enlist[item] 1}
q)NullOf 1
0N
q)(::) ~ NullOf {x*x}
1b
q)null NullOf (.z.D; 1e; " "; type "foo")
1111b
q)

Remember, there is no null list. If you pass a list to NullOf, you get the empty list of the corresponding type:

q)NullOf 1 2 3
`int$()
q)

See also: .Q.ty

What are 0!, 1! and 2!?

When !'s left argument is a positive integer n, it makes the first n columns of a table its key fields:

q)flip `a`b`c ! (1 2 3; 1 2 3; 1 2 3)
a b c
-----
1 1 1
2 2 2
3 3 3
q)1! flip `a`b`c ! (1 2 3; 1 2 3; 1 2 3)
a| b c
-| ---
1| 1 1
2| 2 2
3| 3 3
q)2! flip`a`b`c ! (1 2 3; 1 2 3; 1 2 3)
a b| c
---| -
1 1| 1
2 2| 2
3 3| 3

When !'s left argument is zero, it returns an unkeyed table from a keyed table:

q)0! ([a: 1 2 3; b: 1 2 3]; c: 1 2 3)
a b c
-----
1 1 1
2 2 2
3 3 3
q)

An arguably more readable equivalent is the xkey function:

q)`a xkey flip `a`b`c ! (1 2 3; 1 2 3; 1 2 3)
a| b c
-| ---
1| 1 1
2| 2 2
3| 3 3
q)`a`b xkey flip `a`b`c ! (1 2 3; 1 2 3; 1 2 3)
a b| c
---| -
1 1| 1
2 2| 2
3 3| 3
q)() xkey ([a: 1 2 3; b: 1 2 3]; c: 1 2 3)
a b c
-----
1 1 1
2 2 2
3 3 3
q)

It's easy to confuse function xkey with key, keys, xcol and xcols.
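
To tell them apart, here is a short side-by-side sketch on a small made-up keyed table (kt and the column names are illustrative):

```q
kt: ([a: 1 2 3] b: 4 5 6; c: 7 8 9)

keys kt            / names of the key columns
key kt             / the key part itself, as a table
`x`y xcol 0!kt     / xcol renames the leading columns
`c xcols 0!kt      / xcols reorders, moving the named columns to the front
```

In short: xkey sets the keys, keys names them, key returns them, xcol renames columns and xcols reorders them.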
