querybuilder¶
Prerequisites for derived products are typically described
using a query function.
This function describes which relations should exist between components of a
new (i.e. to-be-built) derived product in an abstract way.
These relations do not describe a fact that must hold for a single database object,
as is usual for ordinary database queries. Instead, they describe facts that must
hold for a combination of several such objects taken together as one group.
The query builder is responsible for the transformation of a query function into
multiple single-object queries for the metaStorage.
What does a query function look like?¶
The query function for a calibrated image looks as follows:
minimalDarkFrameCount = 15

def query(rawImage, darkBefore, darkAfter, calibrationData):
    rawImage.productType == 'raw_image_data'
    rawImage.data_type == 'scan'
    darkBefore.productType == 'averaged_dark_data'
    darkBefore.frameCount >= minimalDarkFrameCount
    darkAfter.productType == 'averaged_dark_data'
    darkAfter.frameCount >= minimalDarkFrameCount
    calibrationData.productType == 'image_calibration_data'

    rawImage.integrationTime == darkBefore.integrationTime
    rawImage.sensorId == darkBefore.sensorId
    rawImage.integrationTime == darkAfter.integrationTime
    rawImage.sensorId == darkAfter.sensorId
    rawImage.sensorId == calibrationData.sensorId
    rawImage.startTime >= calibrationData.validFrom
    rawImage.stopTime <= calibrationData.validUntil
    rawImage.userMeta.optics == calibrationData.optics

    rawImage.dataSize.spatial == calibrationData.dataSize.spatial
    rawImage.dataSize.spectral == calibrationData.dataSize.spectral
    rawImage.dataSize.spatial == darkBefore.dataSize.spatial
    rawImage.dataSize.spectral == darkBefore.dataSize.spectral
    rawImage.dataSize.spatial == darkAfter.dataSize.spatial
    rawImage.dataSize.spectral == darkAfter.dataSize.spectral

    rawImage.date.enclosedBy(darkBefore.date, darkAfter.date)
This clearly looks odd. There is no return statement, so the function does nothing obvious. Also, almost all statements are comparisons, and nothing is done with their results.
And how can it be understood?¶
The arguments of the query function determine how many components are needed. Their names can be chosen freely, but each of them is interpreted as representing a product. In the above case, the arguments state:
- There are 4 components needed.
- Their names are: rawImage, darkBefore, darkAfter, calibrationData.
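How the component names can be read off the argument list may be sketched with the standard inspect module. This is only an illustration, not necessarily the way the query builder does it internally; the helper name component_names is hypothetical, and the query function is shortened to a single statement.

```python
import inspect


def query(rawImage, darkBefore, darkAfter, calibrationData):
    # Shortened query function; only the argument list matters here.
    rawImage.productType == 'raw_image_data'


def component_names(query_function):
    """Return the argument names of a query function, in declaration order.

    Hypothetical helper for illustration only.
    """
    return list(inspect.signature(query_function).parameters)


names = component_names(query)
print(len(names), names)
```

The function is never called here, so its body is irrelevant for finding the names.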
The body of the query function states relations which must hold between the components or between a component and some external value (which is assumed to be constant). For example, lines 4, 7 and 17 of the listing state the following:
- Line 4: The productType of rawImage must be “raw_image_data”.
- Line 7: The frameCount of darkBefore must be at least 15.
- Line 17: The startTime of rawImage must be equal to or later than the start of the validity period of the calibrationData.
All these constraints are interpreted such that every comparison or statement must
be True for a combination of components to be valid.
The most interesting statements are of course the relations between two or more products.
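The "every statement must be True" reading can be illustrated with an ordinary predicate over concrete objects. This sketch is not part of the query builder; the function is_valid_combination and the sample values are made up for illustration, and only a few of the statements from the listing are used.

```python
from types import SimpleNamespace

minimalDarkFrameCount = 15


def is_valid_combination(rawImage, darkBefore):
    """Plain predicate: the combination is valid only if every statement holds."""
    return all([
        rawImage.productType == 'raw_image_data',
        darkBefore.productType == 'averaged_dark_data',
        darkBefore.frameCount >= minimalDarkFrameCount,
        rawImage.sensorId == darkBefore.sensorId,
    ])


# Made-up sample products.
raw = SimpleNamespace(productType='raw_image_data', sensorId=10)
good_dark = SimpleNamespace(productType='averaged_dark_data', frameCount=20, sensorId=10)
bad_dark = SimpleNamespace(productType='averaged_dark_data', frameCount=20, sensorId=11)

print(is_valid_combination(raw, good_dark))  # True
print(is_valid_combination(raw, bad_dark))   # False: the sensorId values differ
```

Such a predicate can only test a combination that has already been fetched. The query function, in contrast, is used to derive database queries that find the candidates in the first place.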
How does it work?¶
The querybuilder can be called like this:
import runmacs.processor.querybuilder as qb
components, pg = qb.processQuery(query)
processQuery uses introspection to find the names of all components (returned as
the components list) and runs the query function with some dummy arguments, collecting
all the information contained in the specified relations.
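The idea of running the query function with dummy arguments can be sketched as follows. This is a minimal illustration under assumptions, not the actual implementation of processQuery: the names Attribute, Ref and collect_constraints are hypothetical, only ==, >= and <= are recorded, and method-style relations such as enclosedBy are left out.

```python
import inspect
from collections import namedtuple

# Hypothetical record type: which component, and which dotted attribute path.
Ref = namedtuple('Ref', 'component path')


class Attribute:
    """Dummy argument that records comparisons instead of evaluating them."""

    def __init__(self, component, path, log):
        self._component = component
        self._path = path
        self._log = log

    def __getattr__(self, name):
        # Only reached for names that are not real attributes of the dummy.
        if name.startswith('_'):
            raise AttributeError(name)
        return Attribute(self._component, self._path + (name,), self._log)

    def _ref(self):
        return Ref(self._component, '.'.join(self._path))

    def _record(self, op, other):
        right = other._ref() if isinstance(other, Attribute) else other
        self._log.append((self._ref(), op, right))
        return True

    def __eq__(self, other):
        return self._record('==', other)

    def __ge__(self, other):
        return self._record('>=', other)

    def __le__(self, other):
        return self._record('<=', other)


def collect_constraints(query_function):
    """Run the query function on dummies and return (names, recorded relations)."""
    names = list(inspect.signature(query_function).parameters)
    log = []
    query_function(*(Attribute(name, (), log) for name in names))
    return names, log


minimalDarkFrameCount = 15


def query(rawImage, darkBefore, calibrationData):
    rawImage.productType == 'raw_image_data'
    darkBefore.frameCount >= minimalDarkFrameCount
    rawImage.sensorId == calibrationData.sensorId
    rawImage.dataSize.spatial == calibrationData.dataSize.spatial


names, constraints = collect_constraints(query)
for constraint in constraints:
    print(constraint)
```

Because the comparison operators are overloaded on the dummy objects, each statement in the body leaves a record behind even though its result is discarded, which is why the query function needs no return statement.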
Afterwards, the pg object returned by processQuery can be used to generate the most
specific metaStorage-query for any given component name, using all currently
available information:
>>> generated_query = pg.getQuery('rawImage')
>>> generated_query
{'data_type': 'scan', 'productType': 'raw_image_data'}
>>> import datetime
>>> generated_query = pg.getQuery('rawImage', {"calibrationData": {
...     "validFrom": datetime.datetime(2014, 1, 1),
...     "validUntil": datetime.datetime(2015, 1, 1),
...     "sensorId": 10,
...     "optics": "baffle",
...     "dataSize": {"spatial": 50, "spectral": 30}}})
>>> generated_query
{'dataSize.spatial': 50,
'dataSize.spectral': 30,
'data_type': 'scan',
'productType': 'raw_image_data',
'sensorId': 10,
'startTime': {'$gte': datetime.datetime(2014, 1, 1, 0, 0)},
'stopTime': {'$lte': datetime.datetime(2015, 1, 1, 0, 0)},
'userMeta.optics': 'baffle'}
So if nothing is known about the other products, only the constant parameters can be used for a query. However, if raw data matching some given calibration data is needed, more restrictions can be stated. Running this mechanism iteratively makes it possible to quickly find matching combinations for a new derived product.
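How recorded relations might be turned into a single-object query can be sketched as below. This is an assumption-laden illustration, not the code behind getQuery: the names Ref, lookup and build_query are hypothetical, the relations are written out by hand, and only ==, >= and <= are handled, using the same operator keys ($gte, $lte) that appear in the output above.

```python
import datetime
from collections import namedtuple

# Hypothetical record type: which component, and which dotted attribute path.
Ref = namedtuple('Ref', 'component path')

RANGE_OPS = {'>=': '$gte', '<=': '$lte'}
FLIPPED = {'==': '==', '>=': '<=', '<=': '>='}


def lookup(document, dotted_path):
    """Follow a dotted path such as 'userMeta.optics' through nested dicts."""
    for key in dotted_path.split('.'):
        document = document[key]
    return document


def build_query(constraints, target, known=None):
    """Build a query for `target` from all relations that can currently be resolved."""
    known = known or {}
    result = {}
    for left, op, right in constraints:
        # Normalise the relation so that the target component is on the left.
        if isinstance(right, Ref) and right.component == target:
            left, op, right = right, FLIPPED[op], left
        if not (isinstance(left, Ref) and left.component == target):
            continue
        if isinstance(right, Ref):
            # A relation to another component is usable only if that one is known.
            if right.component not in known:
                continue
            try:
                value = lookup(known[right.component], right.path)
            except (KeyError, TypeError):
                continue
        else:
            value = right
        if op == '==':
            result[left.path] = value
        else:
            bounds = result.setdefault(left.path, {})
            if isinstance(bounds, dict):  # skip if already pinned by an equality
                bounds[RANGE_OPS[op]] = value
    return result


# Hand-written subset of the relations from the query function above.
constraints = [
    (Ref('rawImage', 'productType'), '==', 'raw_image_data'),
    (Ref('rawImage', 'data_type'), '==', 'scan'),
    (Ref('rawImage', 'sensorId'), '==', Ref('calibrationData', 'sensorId')),
    (Ref('rawImage', 'startTime'), '>=', Ref('calibrationData', 'validFrom')),
    (Ref('rawImage', 'stopTime'), '<=', Ref('calibrationData', 'validUntil')),
    (Ref('rawImage', 'userMeta.optics'), '==', Ref('calibrationData', 'optics')),
]

calibration = {
    'validFrom': datetime.datetime(2014, 1, 1),
    'validUntil': datetime.datetime(2015, 1, 1),
    'sensorId': 10,
    'optics': 'baffle',
}

print(build_query(constraints, 'rawImage'))
print(build_query(constraints, 'rawImage', {'calibrationData': calibration}))
```

The same relations also work in the other direction: asking for calibrationData while a rawImage is known flips the inequalities, so that validFrom receives a $lte bound. An iterative search could therefore start with the least constrained component and pass each match into the known argument of the next query.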