This document provides a high-level overview of how the DensityServer works.
The data is downsampled by factors of 1/2, 1/4, 1/8, ... depending on the size of the input.
To enable efficient access to the 3D data, the density values are stored in a "block level" format. This means that the data is split into NxNxN blocks (by default N=96, which corresponds to a disk read of 96^3 * 4 bytes = 3.375MB per block access and provides a good size/performance ratio). This data layout makes it possible to read the data from a hard drive using a bounded number of disk seeks/reads, which greatly reduces the server latency.
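
As a rough illustration of why the block layout bounds the number of reads, the sketch below computes the size of one block and lists the blocks that a query box overlaps. The function names and the half-open box convention are assumptions made for this example, not the server's actual API.

```python
BLOCK_SIZE = 96  # default N from the text


def block_bytes(n=BLOCK_SIZE, bytes_per_value=4):
    """Size in bytes of one full NxNxN block of 32-bit values."""
    return n ** 3 * bytes_per_value


def blocks_for_box(lo, hi, n=BLOCK_SIZE):
    """Block coordinates overlapping the half-open voxel box [lo, hi).

    lo and hi are (x, y, z) voxel indices; this convention is an assumption.
    """
    ranges = [range(lo[a] // n, (hi[a] - 1) // n + 1) for a in range(3)]
    return [(bx, by, bz) for bx in ranges[0] for by in ranges[1] for bz in ranges[2]]


print(block_bytes() / 2 ** 20)  # 3.375 (MiB per block)
print(len(blocks_for_box((0, 0, 0), (100, 50, 200))))  # 2 * 1 * 3 = 6 blocks
```

The number of blocks touched depends only on the size of the query box relative to N, so the number of reads stays bounded regardless of how large the whole map is.
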
The input has [H,K,L] samples along each axis (i.e. the extent field in the CCP4 header).
To downsample, the kernel C = [1,4,6,4,1] (customizable at the source code level) is applied along each axis; this is possible because the kernel is "separable":
downsampled[i] = C[0] * source[2 * i - 2] + ... + C[4] * source[2 * i + 2]
The downsampling step is applied in 3 passes, one per axis:
[H,K,L] => [H/2, K, L] => [H/2, K/2, L] => [H/2, K/2, L/2]
(if a dimension D is odd, the value (D+1)/2 is used instead).
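
The three-pass scheme can be sketched in plain Python as follows. This is a minimal illustration rather than the server's implementation: dividing the weights by their sum (16) and clamping indices at the edges are assumptions, since the text does not say how normalization or boundaries are handled.

```python
KERNEL = [1, 4, 6, 4, 1]  # kernel C from the text


def downsample_1d(source, kernel=KERNEL):
    """Halve a 1D sequence of length D to (D + 1) // 2 samples.

    Assumptions (not stated in the text): weights are divided by their
    sum (16), and out-of-range indices are clamped to the nearest sample.
    """
    n = len(source)
    norm = sum(kernel)
    out = []
    for i in range((n + 1) // 2):
        acc = 0.0
        for k, c in enumerate(kernel):
            j = min(max(2 * i + k - 2, 0), n - 1)  # clamp to a valid index
            acc += c * source[j]
        out.append(acc / norm)
    return out


def downsample_3d(grid):
    """Apply downsample_1d along H, then K, then L of grid[h][k][l]."""
    K, L = len(grid[0]), len(grid[0][0])
    # Pass 1 (H): filter each column that runs through the planes.
    cols = [[downsample_1d([plane[k][l] for plane in grid]) for l in range(L)]
            for k in range(K)]
    grid = [[[cols[k][l][h] for l in range(L)] for k in range(K)]
            for h in range(len(cols[0][0]))]
    # Pass 2 (K): filter each column within a plane.
    grid = [[list(row) for row in zip(*[downsample_1d(col) for col in zip(*plane)])]
            for plane in grid]
    # Pass 3 (L): filter each row.
    return [[downsample_1d(row) for row in plane] for plane in grid]


def shape(grid):
    return [len(grid), len(grid[0]), len(grid[0][0])]


ones = [[[1.0] * 3 for _ in range(4)] for _ in range(5)]  # [H, K, L] = [5, 4, 3]
print(shape(downsample_3d(ones)))  # [3, 2, 2]
```

Because the kernel is separable, each pass is a cheap 1D filter, and the odd dimension 5 becomes (5+1)/2 = 3 as described above.
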
Apply the downsampling step iteratively until the number of samples along the largest dimension is smaller than the "block size" (or until the smallest dimension no longer has more than 2 samples).
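
The iteration can be sketched as a small function that lists the sample counts at every level. The stopping rule here is one reading of the text (keep halving while the largest dimension is at least the block size and the smallest dimension has more than 2 samples); the function name and that interpretation are assumptions.

```python
def sampling_counts(extent, block_size=96):
    """Return [H, K, L] for each downsampling level, input level first.

    Interpretation of the text: continue while the largest dimension is
    not yet smaller than block_size and the smallest has more than 2 samples.
    """
    levels = [list(extent)]
    while max(levels[-1]) >= block_size and min(levels[-1]) > 2:
        # Odd dimensions use (D + 1) / 2, as described for the single step.
        levels.append([(d + 1) // 2 for d in levels[-1]])
    return levels


print(sampling_counts([400, 300, 250]))
# [[400, 300, 250], [200, 150, 125], [100, 75, 63], [50, 38, 32]]
```
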
When the server receives a query for a 3D region, it chooses the appropriate downsampling level based on the required level of detail so that the number of voxels in the response is small enough. This enables sub-second response times even for the largest entries.
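
One way to picture the level selection is as a search for the least-downsampled level that still fits a voxel budget. The sketch below is hypothetical: the `max_voxels` budget, the function name and the fallback to the coarsest level are assumptions, and the real server may weigh the requested detail differently.

```python
def choose_level(query_extent, max_voxels, num_levels):
    """Pick the least-downsampled level whose voxel count fits the budget.

    query_extent: (nx, ny, nz) size of the query region at full resolution.
    Level i is assumed to have about 1/2**i of the samples along each axis.
    """
    for level in range(num_levels):
        dims = list(query_extent)
        for _ in range(level):
            dims = [(d + 1) // 2 for d in dims]
        if dims[0] * dims[1] * dims[2] <= max_voxels:
            return level
    return num_levels - 1  # nothing fits: fall back to the coarsest level


print(choose_level((400, 400, 400), 2_000_000, 4))  # 2, since 100^3 fits
print(choose_level((50, 50, 50), 2_000_000, 4))     # 0, full resolution fits
```
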
The BinaryCIF format is used to encode the response. Floating point data are quantized into 1 byte values (256 levels) before being sent back to the client. This quantization is performed by splitting the data interval into uniform pieces.
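
The uniform quantization can be illustrated with a few lines of Python. This is a sketch of the idea only; it does not produce BinaryCIF output, and the function names and the choice to return the minimum and step alongside the codes are assumptions.

```python
def quantize(values, levels=256):
    """Map floats to integer codes 0..levels-1 on a uniform grid.

    Returns (codes, lo, step) so that value ~= lo + code * step.
    levels must not exceed 256 for the codes to fit in one byte each.
    """
    lo, hi = min(values), max(values)
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    codes = bytes(min(levels - 1, round((v - lo) / step)) for v in values)
    return codes, lo, step


def dequantize(codes, lo, step):
    """Recover approximate floats from the one-byte codes."""
    return [lo + c * step for c in codes]


values = [-0.4, 0.0, 3.0, 6.0]
codes, lo, step = quantize(values)
restored = dequantize(codes, lo, step)
print(list(codes))
print(max(abs(a - b) for a, b in zip(values, restored)) <= step / 2 + 1e-9)  # True
```

Each value is off by at most half a step after the round trip, which is the price paid for storing 1 byte instead of 4 per value.
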
Downsampling the data changes the absolute contour levels. To mitigate this effect, relative values are always used when displaying the data.
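
To make the effect concrete, the sketch below downsamples a small series by keeping every other value and converts a relative level (X sigmas from the mean) back to an absolute one using Y = mean + sigma * X. The helper names are hypothetical, and using the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def to_sigma(values):
    """Express each value as a number of sigmas from the mean."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def absolute_level(values, x):
    """Absolute contour level Y = mean + sigma * X for this data set."""
    return mean(values) + pstdev(values) * x


full = [-0.3, 2, 0.1, 6, 3, -0.4]
half = full[::2]  # naive downsampling: keep every other value

print(min(full), max(full))  # -0.4 6
print(min(half), max(half))  # -0.3 3
# The same relative level maps to a different absolute level in each data set.
print(absolute_level(full, 1.0), absolute_level(half, 1.0))
```
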
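For example, take the input data A = [-0.3, 2, 0.1, 6, 3, -0.4]: keeping only every other value gives B = [-0.3, 0.1, 3].
The effect is similar when the values are averaged instead (e.g. with the [1 4 6 4 1] kernel), only not as severe.
Relative values (i.e. the number of sigmas X from the mean in Y = mean + sigma * X) are preserved.
The i-th level (starting from zero) reduces the size by an approximate factor of 1/[(2^i)^3] (i.e. the "cube" of the sampling frequency).
Start with 3.5GB compressed density data in the CCP4 mode 2 format (32-bit float for each value)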
=> ~4GB uncompressed CCP4
=> Downsample by 1/4 => 4GB * (1/4)^3 = 62MB
=> Convert to BinaryCIF => 62MB / 4 = ~16MB
=> Gzip: 2 - 8 MB depending on the "density" of the data
(e.g. the data for a viral shell will be smaller because it is "empty" inside)
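
The arithmetic above can be reproduced with a short helper. The function name is hypothetical, and 4GB is taken as 4e9 bytes for simplicity; the Gzip step is left out because its ratio depends on the data.

```python
def estimate_sizes(uncompressed_bytes, level, bytes_per_value=4):
    """Estimate sizes after downsampling to `level` and 1-byte quantization.

    Level i (starting from zero) scales the size by 1 / (2**i)**3.
    Returns (downsampled_bytes, binarycif_bytes).
    """
    factor = 1 / (2 ** level) ** 3
    downsampled = uncompressed_bytes * factor
    binarycif = downsampled / bytes_per_value  # 32-bit float -> 1 byte
    return downsampled, binarycif


# "Downsample by 1/4" corresponds to level 2.
downsampled, binarycif = estimate_sizes(4e9, 2)
print(downsampled / 1e6, binarycif / 1e6)  # 62.5 15.625
```
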