Short answer: Yes, put the most restrictive constraint first.

Although expressions in q are usually evaluated from right to left, there is one exception: the constraints in a where clause are evaluated from left to right. Consider the following example:

q)t: ([] x: `a`b`c; y: 1 2 3)
q)select from t where x < `c, y > 1
x y
---
b 2

The comma separating x < `c from y > 1 is not the join operator; instead, it delimits the constraints of the where clause. If you need to use the join operator in a where clause, use parentheses:

q)select from t where x in (`a, `b), y > 1
x y
---
b 2

Each constraint is evaluated in the usual order, i.e., right-to-left:

q)select from t where x < first -1 # `p`c, y > 1
x y
---
b 2

Because constraints are evaluated from left to right, and each constraint after the first is applied only to the rows that passed the preceding constraints, putting the most restrictive constraint first will speed up queries on large tables. On a date-partitioned database, for example, the placement of a constraint on the virtual date column makes a marked performance difference:

select from large_table where date = 2011.02.25, name = `JOE

is significantly faster than

select from large_table where name = `JOE, date = 2011.02.25

The latter query will attempt to scan the entire contents of the table for rows matching name `JOE before narrowing the result set down to a single date. kdb does not provide behind-the-scenes optimization in cases like these.
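The difference is easy to observe with \t, which reports elapsed milliseconds. The table and values are the ones from the queries above; actual timings depend on data volume and hardware:

```q
q)\t select from large_table where date = 2011.02.25, name = `JOE
q)\t select from large_table where name = `JOE, date = 2011.02.25
```

On a date-partitioned database, the first form reads files from a single partition, while the second touches every partition before the date constraint is applied.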

Column-oriented means that all the values in a column are stored together. This is in contrast to a row-oriented organization, in which all the values for a row are stored together. Which organization is better depends on the access pattern of your application.

Many operations are more efficient with a column-oriented approach. In particular, operations that need to access a sequence of values from a particular column are much faster. If all the values in a column have the same size (which is true, by design, in kdb), things get even better. This type of access pattern is typical of the applications for which q and kdb are used.
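The column-at-a-time access pattern described above looks like this in q (the table and column names here are illustrative):

```q
q)trade: ([] sym: `a`b`a; price: 10.0 11.0 12.0)
q)trade `price        / the whole column as one contiguous vector
10 11 12f
q)avg trade `price    / vector operations run directly over it
11f
```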

To make this concrete, let’s examine a column of 64-bit, floating point numbers:

q).Q.w[] `used
q)t: ([] f: 1000000 ? 1.0)
q).Q.w[] `used

Comparing the two figures reported by .Q.w[], the memory needed to hold one million 8-byte values is only a little over 8MB. That’s because the data are stored sequentially in an array. To clarify, let’s create another table:

q)u: update g: 1000000 ? 5.0 from t
q).Q.w[] `used

Both t and u share the column f. If q organized its data in rows, u would need its own copy of f, and the memory usage would have gone up by another 8MB. Another way to confirm this is to take a look at k.h.
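One way to confirm the sharing from inside q is the internal function -16!, which returns an object’s reference count. It is undocumented and subject to change, so treat this purely as a diagnostic:

```q
q)t: ([] f: 1000000 ? 1.0)
q)u: update g: 1000000 ? 5.0 from t
q)-16! t `f    / a count greater than 1 shows t and u share the same vector
```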

Now let’s see what happens when we write the table to disk:

q)`:t/ set t
q)\ls -l t
"total 15632"
"-rw-r--r--  1 kdbfaq  staff  8000016 May 29 19:57 f"

Only 16 bytes of overhead: 8,000,016 bytes for one million 8-byte values. Clearly, all of the numbers are being stored sequentially on disk. Efficiency is about avoiding unnecessary work, and here we see that q does exactly what needs to be done when reading and writing a column – no more, no less.

OK, so this approach is space efficient. How does this data layout translate into speed?

If we ask q to sum all 1 million numbers, having the entire list packed tightly together in memory is a tremendous advantage over a row-oriented organization, because we’ll encounter fewer misses at every stage of the memory hierarchy. Avoiding cache misses and page faults is essential to getting performance out of your machine.
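You can get a feel for this with \t; the snippet below is a sketch, and the numbers you see will depend entirely on your hardware:

```q
q)t: ([] f: 1000000 ? 1.0)
q)\t do[100; sum t `f]    / 100 sums over one contiguous 8MB vector
```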

Moreover, doing math on a long list of numbers that are all together in memory is a problem that modern CPU instruction sets have special features to handle, including instructions to prefetch array elements that will be needed in the near future. Although those features were originally created to improve PC multimedia performance, they turned out to be great for statistics as well. In addition, the same synergy of locality and CPU features enables column-oriented systems to perform linear searches (e.g., in where clauses on unindexed columns) faster than indexed searches (with their attendant branch prediction failures) up to astonishing row counts.

A thorough introduction to these topics is given in Ulrich Drepper’s ‘What Every Programmer Should Know About Memory’ and Scott Meyers’s 2011 talk ‘CPU Caches and Why You Care’.

Short answer: Yes, multiple processes can write to a database as long as the processes avoid concurrent write operations to the same table on the same partition. To see a significant speedup, you’ll need to use segments.

With the exception of the ? operator, which can be used to provide concurrency protection for sym files, q processes do not coordinate file I/O operations in any way. Thus, kdb splayed table write operations require the same degree of care as ordinary file writes.

The mechanism to protect sym files, however, is used throughout the functions in the .Q namespace that pertain to writing splayed tables. So, you can use those functions to build a database faster by running multiple loader processes in parallel.
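A minimal sketch of the pattern, assuming each loader process is handed a disjoint set of dates (the schema and the name load1 are hypothetical):

```q
/ run in each loader process with its own list of dates
load1: {[dt]
  t:: ([] sym: 1000 ? `ibm`msft`goog; px: 1000 ? 100.0);  / build this date's data
  .Q.dpft[`:/vol/db; dt; `sym; `t] }                      / write one date partition
load1 each 2011.04.07 2011.04.08
```

Because .Q.dpft enumerates against the shared sym file using the protected mechanism, several such processes can run in parallel as long as their date lists do not overlap.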

Example 1: Concurrent writes to the same table on the same partition. This means two processes are writing to the same table-column files simultaneously, and that is not safe! The potential for corruption is high.

kdb_proc1: .Q.dpft[`:/vol/db; 2011.04.07; `sym; `t]
kdb_proc2: .Q.dpft[`:/vol/db; 2011.04.07; `sym; `t]

Example 2: Concurrent writes to two different tables on the same partition. This will not cause corruption.

kdb_proc1: .Q.dpft[`:/vol/db; 2011.04.07; `sym; `t]
kdb_proc2: .Q.dpft[`:/vol/db; 2011.04.07; `sym; `q]

Example 3: Concurrent writes to the same table on different partitions. This is also safe, with no potential corruption.

kdb_proc1: .Q.dpft[`:/vol/db; 2011.04.07; `sym; `t]
kdb_proc2: .Q.dpft[`:/vol/db; 2011.04.08; `sym; `t]

If the two partitions are on different I/O channels using segments, this last approach can be much faster than performing the writes serially.