Nov 21, 2012

How does fby work?

According to code.kx.com, fby is short for filter-by, and it is commonly used as the q equivalent to SQL’s HAVING clause (though, like where, fby is a q function, and its use is not limited to where clauses).

What fby does is aggregate values from one list based on groups defined in another, parallel, list. For example, suppose we have one list of cities and another list with a few temperature samples for each. We can use fby to calculate the minimum temperature sample for each city, and then replicate those values at each position for each corresponding city:

q)city:`NY`NY`LA`SF`LA`SF`NY
q)temp:32 31 75 69 70 68 12
q)(min;temp) fby city
12 12 70 68 70 68 12
q)

Thus, NY’s minimum temperature, 12, appears at every index in fby‘s output where `NY appears in city.

At the core of fby (like by) is group, which organizes the distinct values in a list into a dictionary mapping those values to their indices:

q)city:`NY`NY`LA`SF`LA`SF`NY
q)group city
NY| 0 1 6
LA| 2 4
SF| 3 5
q)

We can group the temperatures for each city together by indexing temp with the value of the grouped city dictionary:

q)grouped: value group city
q)temp[grouped]
32 31 12
75 70
69 68
q)

Note that the result of indexing temp with grouped is a nested list with the same shape as grouped. This is a general principle: the result of an indexing operation has the shape of the index.

Now we can apply an aggregation function to each of the temperature groups:

q)min each temp[grouped]
12 70 68
q)

We’re almost there. The real trick of fby is placing each aggregation result into a new list so that each element has the correct value per the grouping list. We can use @ (functional amend) to get the job done (see also the functional apply/amend faq):

q)@[temp; grouped; :; min each temp[grouped] ]
12 12 70 68 70 68 12
q)

The real fby is just slightly more complicated to ensure that the first argument to @ has the correct type.

See Also