querybuilder
============

Prerequisites for :ref:`derived products ` are typically described using a ``query`` function. This function describes, in an abstract way, which relations must exist between the components of a new (i.e. to-be-built) derived product. Unlike a normal database query, these relations do not describe a fact which must hold for a single database object; they describe facts which must hold for a combination of several such objects taken together as one group.

The query builder is responsible for transforming a query function into multiple single-object queries for the metaStorage.

What does a query function look like?
-------------------------------------

The query function for a calibrated image looks as follows:

.. code-block:: python
    :emphasize-lines: 3,4,7,17
    :linenos:

    minimalDarkFrameCount = 15

    def query(rawImage, darkBefore, darkAfter, calibrationData):
        rawImage.productType == 'raw_image_data'
        rawImage.data_type == 'scan'
        darkBefore.productType == 'averaged_dark_data'
        darkBefore.frameCount >= minimalDarkFrameCount
        darkAfter.productType == 'averaged_dark_data'
        darkAfter.frameCount >= minimalDarkFrameCount
        calibrationData.productType == 'image_calibration_data'

        rawImage.integrationTime == darkBefore.integrationTime
        rawImage.sensorId == darkBefore.sensorId
        rawImage.integrationTime == darkAfter.integrationTime
        rawImage.sensorId == darkAfter.sensorId
        rawImage.sensorId == calibrationData.sensorId
        rawImage.startTime >= calibrationData.validFrom
        rawImage.stopTime <= calibrationData.validUntil
        rawImage.userMeta.optics == calibrationData.optics
        rawImage.dataSize.spatial == calibrationData.dataSize.spatial
        rawImage.dataSize.spectral == calibrationData.dataSize.spectral
        rawImage.dataSize.spatial == darkBefore.dataSize.spatial
        rawImage.dataSize.spectral == darkBefore.dataSize.spectral
        rawImage.dataSize.spatial == darkAfter.dataSize.spatial
        rawImage.dataSize.spectral == darkAfter.dataSize.spectral
        rawImage.date.enclosedBy(darkBefore.date, darkAfter.date)

This clearly looks odd. There is no ``return`` statement, so the function does nothing obvious. Also, almost all statements are comparisons, and nothing is done with their results.

And how can it be understood?
-----------------------------

The arguments of the query function determine how many components are needed. Their names can be chosen freely, but each one is interpreted as something representing a product. In the above case, the arguments state:

* There are 4 components needed.
* Their names are: ``rawImage``, ``darkBefore``, ``darkAfter``, ``calibrationData``.

The body of the query function states relations which must hold between the components, or between a component and some external value (which is assumed to be constant). In this example the highlighted lines state the following:

* Line 4: The ``productType`` of ``rawImage`` must be "raw_image_data".
* Line 7: The ``frameCount`` of ``darkBefore`` must be at least 15.
* Line 17: The ``startTime`` of ``rawImage`` must be equal to or later than the start of the validity period of the ``calibrationData``.

All these constraints are interpreted such that every comparison or statement must be ``True`` for a combination of components to be valid. The most interesting statements are of course the relations between two or more products.

How does it work?
-----------------

The querybuilder can be called like this:

.. code-block:: python

    import runmacs.processor.querybuilder as qb
    components, pg = qb.processQuery(query)

``processQuery`` uses introspection to find the names of all components (returned as the ``components`` list) and runs the ``query`` function with dummy arguments, collecting all the information contained in the specified relations.

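
The idea of running a query function with dummy arguments can be illustrated with a minimal sketch. It is not the actual runmacs implementation: ``Ref``, ``Recorder`` and ``collect_relations`` are hypothetical names chosen for this illustration, and only ``==``, ``>=`` and ``<=`` are handled (method-style relations such as ``enclosedBy`` are left out). The argument names are read with ``inspect.signature``, and each dummy object logs the comparisons made on its attributes instead of evaluating them.

.. code-block:: python

    import inspect
    from collections import namedtuple

    # Plain-data description of "component.some.path" (hypothetical helper).
    Ref = namedtuple('Ref', 'component path')


    class Recorder:
        """Dummy argument that logs every comparison made on its attributes."""

        def __init__(self, component, path, log):
            self._component = component
            self._path = path
            self._log = log

        def __getattr__(self, name):
            # Reached only for unknown attributes, i.e. the metadata fields.
            if name.startswith('_'):
                raise AttributeError(name)
            return Recorder(self._component, self._path + (name,), self._log)

        def _ref(self):
            return Ref(self._component, '.'.join(self._path))

        def _record(self, op, other):
            if isinstance(other, Recorder):
                other = other._ref()
            self._log.append((self._ref(), op, other))
            return True  # every stated relation is assumed to hold

        def __eq__(self, other):
            return self._record('==', other)

        def __ge__(self, other):
            return self._record('>=', other)

        def __le__(self, other):
            return self._record('<=', other)


    def collect_relations(query):
        """Return the component names and the relations stated by ``query``."""
        names = list(inspect.signature(query).parameters)
        log = []
        query(*(Recorder(name, (), log) for name in names))
        return names, log


    # A shortened query function in the style of the example above.
    def query(rawImage, dark):
        rawImage.productType == 'raw_image_data'
        dark.frameCount >= 15
        rawImage.dataSize.spatial == dark.dataSize.spatial


    names, relations = collect_relations(query)

After this call, ``names`` holds the argument names in order and ``relations`` holds one ``(left, operator, right)`` entry per comparison, where ``right`` is either a constant or another ``Ref``.
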
Once ``processQuery`` has returned, ``pg`` can be used to generate the most specific metaStorage query for any given component name, using all currently available information::

    >>> import datetime
    >>> generated_query = pg.getQuery('rawImage')
    >>> generated_query
    {'data_type': 'scan', 'productType': 'raw_image_data'}
    >>> generated_query = pg.getQuery('rawImage', {"calibrationData": {
    ...     "validFrom": datetime.datetime(2014,1,1),
    ...     "validUntil": datetime.datetime(2015,1,1),
    ...     "sensorId": 10,
    ...     "optics": "baffle",
    ...     "dataSize": {"spatial": 50, "spectral": 30}}})
    >>> generated_query
    {'dataSize.spatial': 50,
     'dataSize.spectral': 30,
     'data_type': 'scan',
     'productType': 'raw_image_data',
     'sensorId': 10,
     'startTime': {'$gte': datetime.datetime(2014, 1, 1, 0, 0)},
     'stopTime': {'$lte': datetime.datetime(2015, 1, 1, 0, 0)},
     'userMeta.optics': 'baffle'}

So if nothing is known about the other products, only the constant parameters can be used for a query. However, if raw data matching some given calibration data is needed, more restrictions can be stated. Running this mechanism iteratively makes it possible to quickly find matching combinations for a new derived product.
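
How recorded relations could be turned into a single-object query is sketched below. This is an illustration under stated assumptions, not the runmacs code: ``Ref``, ``lookup`` and ``build_query`` are hypothetical names, relations are assumed to be available as ``(left, operator, right)`` tuples, and the output merely imitates the dotted keys and ``$gte``/``$lte`` operators visible in the generated queries above. Relations whose other side is not yet known are skipped, which is why the query becomes more specific as more components are fixed.

.. code-block:: python

    import datetime
    from collections import namedtuple

    # Hypothetical plain-data form of "component.some.path".
    Ref = namedtuple('Ref', 'component path')

    MONGO_OPS = {'>=': '$gte', '<=': '$lte'}
    FLIPPED = {'==': '==', '>=': '<=', '<=': '>='}


    def lookup(metadata, path):
        """Follow a dotted path through nested dicts."""
        for key in path.split('.'):
            metadata = metadata[key]
        return metadata


    def build_query(relations, target, known=None):
        """Build a query dict for ``target`` from everything known so far."""
        known = known or {}
        query = {}
        for left, op, right in relations:
            # Put the target component on the left-hand side if necessary.
            if (isinstance(right, Ref) and right.component == target
                    and left.component != target):
                left, op, right = right, FLIPPED[op], left
            if left.component != target:
                continue
            if isinstance(right, Ref):
                if right.component not in known:
                    continue  # no information about the other side yet
                value = lookup(known[right.component], right.path)
            else:
                value = right  # constant parameter
            if op == '==':
                query[left.path] = value
            else:
                constraint = query.setdefault(left.path, {})
                if isinstance(constraint, dict):
                    constraint[MONGO_OPS[op]] = value
        return query


    RELATIONS = [
        (Ref('rawImage', 'productType'), '==', 'raw_image_data'),
        (Ref('rawImage', 'sensorId'), '==', Ref('calibrationData', 'sensorId')),
        (Ref('rawImage', 'startTime'), '>=', Ref('calibrationData', 'validFrom')),
        (Ref('rawImage', 'dataSize.spatial'), '==',
         Ref('calibrationData', 'dataSize.spatial')),
    ]

    calibration = {
        'validFrom': datetime.datetime(2014, 1, 1),
        'sensorId': 10,
        'dataSize': {'spatial': 50},
    }

    constant_only = build_query(RELATIONS, 'rawImage')
    with_calibration = build_query(RELATIONS, 'rawImage',
                                   {'calibrationData': calibration})

Here ``constant_only`` contains just the ``productType`` constraint, while ``with_calibration`` additionally pins ``sensorId``, ``dataSize.spatial`` and a lower bound on ``startTime``. Because relations are flipped when the target appears on the right-hand side, the same relation list can also be used to search for calibration data matching an already selected raw image.
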