<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Home on Kolibri Documentation</title><link>http://awagen.github.io/</link><description>Recent content in Home on Kolibri Documentation</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Thu, 07 Sep 2023 10:00:00 +0200</lastBuildDate><atom:link href="http://awagen.github.io/index.xml" rel="self" type="application/rss+xml"/><item><title>Overview</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/0-overview/</link><pubDate>Mon, 03 Jan 2022 18:10:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/0-overview/</guid><description>Overview The sketch below shows the high-level flow of processing within Kolibri.
Define the samples to operate on either via OrderedMultiValues (composed of OrderedValues), using the provided implicit conversions to create an IndexedGenerator, or by creating an IndexedGenerator directly. Use a batching strategy to split the samples into single batches that are processed through the computations defined in a RunnableGraph. Each data sample results in a tagged ProcessingMessage[T], which is handled by an AggregatingActor (created by the RunnableExecutionActor that runs the RunnableGraph).</description></item><item><title>Values</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/1-values/</link><pubDate>Wed, 08 Dec 2021 08:30:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/1-values/</guid><description>Values In the following let&amp;rsquo;s look into which structures Kolibri provides to simplify definitions of values.
OrderedValues trait OrderedValues[+T] extends KolibriSerializable { val name: String val totalValueCount: Int def getNthZeroBased(n: Int): Option[T] def getNFromPositionZeroBased(position: Int, n: Int): Seq[T] def getAll: Seq[T] } Two distinct implementations:
Using explicitly passed values: case class DistinctValues[+T](name: String, values: Seq[T]) extends OrderedValues[T] Range with defined start, end and stepSize: case class RangeValues[T:Fractional](name: String, start:T, end:T, stepSize:T)(implicit v: Numeric[T]) extends OrderedValues[T] OrderedMultiValues Container for multiple OrderedValues with methods to edit (remove, add) values and methods to retrieve the n-th element out of the permutation over all contained OrderedValues.</description></item><item><title>Judgement Files</title><link>http://awagen.github.io/kolibri/2-config-details/4-file-formats/1-judgements/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/4-file-formats/1-judgements/</guid><description>COMING SOON</description></item><item><title>Resource Directives</title><link>http://awagen.github.io/kolibri/2-config-details/3-config-options/1-resourcedirectives/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/3-config-options/1-resourcedirectives/</guid><description>Resource directives describe a resource, provide instructions on how to load it, and assign it an identifier so that it can be referenced and loaded. In job definitions they are used to load large or repeatedly accessed resources, such as judgement lists, upfront.
Note: the supplier option VALUES_FROM_NODE_STORAGE is usually not valid since it assumes that a resource with the defined identifier has already been loaded when this resource is accessed, which might not be the case.</description></item><item><title>Generators</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/2-generators/</link><pubDate>Sun, 05 Dec 2021 20:22:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/2-generators/</guid><description>Generators Definition of the data is done via instances of type IndexedGenerator. Possible values for those can be found in the package de.awagen.kolibri.datatypes.collections.generators (GitHub Link). These can hold single collections of values or combinations of multiple collections. By using the right combination of those, all kinds of permutations of values can be composed.
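The index-based composition of value collections described above can be sketched as follows. This is a minimal sketch in plain Scala and does not use the Kolibri API: the object NthCombinationSketch and its methods totalCount and nthZeroBased are hypothetical names chosen for illustration. It assumes the combinations are enumerated like a mixed-radix number, with the first collection varying fastest.

```scala
// Hypothetical helper, not part of Kolibri: retrieves the n-th combination
// (zero-based) from the cartesian product of several value collections
// without materializing the full product.
object NthCombinationSketch {

  // The total number of combinations is the product of the collection sizes.
  def totalCount[T](values: Seq[Seq[T]]): Int =
    values.map(_.size).product

  // Decompose the index like a mixed-radix number: the first collection
  // varies fastest, the last one slowest. Returns None if n is out of range.
  def nthZeroBased[T](values: Seq[Seq[T]], n: Int): Option[Seq[T]] = {
    val total = totalCount(values)
    if (n >= total || 0 > n) None
    else {
      var remaining = n
      val picked = values.map { vs =>
        val element = vs(remaining % vs.size)
        remaining = remaining / vs.size
        element
      }
      Some(picked)
    }
  }
}

// Example values (assumed for illustration): two store types, three queries.
val storeTypes = Seq("small", "large")
val queries = Seq("q1", "q2", "q3")

println(NthCombinationSketch.totalCount(Seq(storeTypes, queries)))      // 6
println(NthCombinationSketch.nthZeroBased(Seq(storeTypes, queries), 5)) // Some(List(large, q3))
```

Computing a combination from its index keeps memory use independent of the number of combinations, which is the property that makes index-based generators easy to split into batches.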
Say you had multiple stores (identified by storeIds) that are classified into certain types, and you wanted to request certain queries per store type.</description></item><item><title>Overview</title><link>http://awagen.github.io/kolibri_archive/kolibri-watch/1-ui/1-overview/</link><pubDate>Thu, 02 Dec 2021 00:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-watch/1-ui/1-overview/</guid><description>Overview The following gives an overview of the current screens provided by Kolibri Watch. Additional screens for analysing the results of executions are planned to follow soon.
Current Main Screens
Status overview of cluster Creating job definitions from templates and starting jobs Finished job history</description></item><item><title>Basics</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/2-mechanisms/1-overview/</link><pubDate>Thu, 09 Sep 2021 20:08:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/2-mechanisms/1-overview/</guid><description>Basics First, all batches are represented by an ActorRunnable object that, on execution, provides a tuple of a KillSwitch and a Future[Done], which completes when the execution completes. The KillSwitch allows killing the execution if needed. The criteria for stopping an execution are represented by an ExecutionExpectation instance, which can contain multiple failedWhenMetExpectations, e.g. a certain rate or number of failed data sample processings, or a timeout being exceeded.</description></item><item><title>DataType Categories</title><link>http://awagen.github.io/kolibri_archive/kolibri-datatypes/1-types/1-categories/</link><pubDate>Sun, 22 Aug 2021 22:20:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-datatypes/1-types/1-categories/</guid><description>Let&amp;rsquo;s have a look at the distinct categories of data structures provided by the kolibri-datatypes project (might not be fully exhaustive).
Indexed Generators Indexed Generators allow generating elements on demand by index. An IndexedGenerator provides its size without having to iterate over the elements, an Iterator over its contained elements, methods to retrieve generators for subparts of the original generator, and a mapping that transforms each generated element by the specified mapping function.</description></item><item><title>Getting it started</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/1-basics/1-runningit/</link><pubDate>Sat, 21 Aug 2021 19:36:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/1-basics/1-runningit/</guid><description>Let&amp;rsquo;s first dive into how to get the project started on your machine. There are multiple configuration options available, which are detailed in the following.
Starting locally with docker-compose Find the docker-compose file in the project root. If you&amp;rsquo;re referencing an existing image, you don&amp;rsquo;t need to build anything beforehand. In case you want to start a local version, make sure to package the jar, create the properly tagged docker image and reference it in the docker-compose file.</description></item><item><title>Request Parameters</title><link>http://awagen.github.io/kolibri/2-config-details/3-config-options/2-requestparameters/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/3-config-options/2-requestparameters/</guid><description>Request parameters can be configured in different ways and types. This page describes how single parameters are defined and composed to create permutations (e.g. needed in extensive offline evaluations where a wide range of parameters is permuted for use in requests to the target system).
There are two general types of parameters: STANDALONE and MAPPING. STANDALONE stands for a single parameter (with one or more values), while MAPPING allows specifying relationships between different parameters.</description></item><item><title>Result Files</title><link>http://awagen.github.io/kolibri/2-config-details/4-file-formats/2-resultformat/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/4-file-formats/2-resultformat/</guid><description>COMING SOON</description></item><item><title>Tagging</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/3-tags/</link><pubDate>Sun, 05 Dec 2021 22:47:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/3-tags/</guid><description>Tagging Coming shortly :)</description></item><item><title>Monitoring</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/1-basics/2-monitoring/</link><pubDate>Sat, 21 Aug 2021 22:20:08 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/1-basics/2-monitoring/</guid><description>Metrics with Kamon You&amp;rsquo;ll find the kamon configuration file within the resources/metrics folder (kamon.conf). It contains the instrumentation configuration, including filters that define for which elements metrics are collected, as well as the configuration for the exposed server providing the status page mentioned above.
An example dashboard can be found in the grafana/dashboards folder. It provides general metrics regarding system performance. The following describes the distinct displays in the example dashboard; for a screenshot of the dashboard, see below.</description></item><item><title>Parsing Configuration</title><link>http://awagen.github.io/kolibri/2-config-details/3-config-options/3-parsingconfig/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/3-config-options/3-parsingconfig/</guid><description>A parsing configuration consists of the following parts:
selector: defines which fields to pick from a JSON; name: the name under which the extracted data is stored; castType: defines what the value is cast to (Note: for a recursive selector that extracts a sequence of single-value fields, use the single-value cast type, i.e. if you use a recursive selector and each single extracted element is a string, use castType &amp;lsquo;STRING&amp;rsquo;, not &amp;lsquo;SEQ[STRING]&amp;rsquo;.</description></item><item><title>Summary Format</title><link>http://awagen.github.io/kolibri/2-config-details/4-file-formats/3-summaryformat/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/4-file-formats/3-summaryformat/</guid><description>COMING SOON</description></item><item><title>Processing Messages / Aggregation States</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/4-processing-messages/</link><pubDate>Sun, 05 Dec 2021 22:47:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/4-processing-messages/</guid><description>Processing Messages And AggregationStates Single processing units are represented by instances of type ProcessingMessage. This allows enriching of values with tags, which can be used for selective result handling, such as result writing, aggregations and selective handling of values.
Completion of a single batch is signalled by a message of type AggregationState. This can be of two types:
AggregationStateWithoutData: provides info about the completed batch without the generated data AggregationStateWithData: provides info about the completed batch with the generated data trait AggregationState[+T] extends KolibriSerializable with TaggedWithType { val jobID: String val batchNr: Int val executionExpectation: ExecutionExpectation } case class AggregationStateWithoutData[+V](containedElementCount: Int, jobID: String, batchNr: Int, executionExpectation: ExecutionExpectation) extends AggregationState[V] case class AggregationStateWithData[+V](data: V, jobID: String, batchNr: Int, executionExpectation: ExecutionExpectation) extends AggregationState[V]</description></item><item><title>Executing examples</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/1-basics/3-executeexamplejob/</link><pubDate>Sat, 21 Aug 2021 20:20:35 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/1-basics/3-executeexamplejob/</guid><description>An example job definition for a parameter grid search evaluating search metrics against a given endpoint is provided within the scripts folder. The definition is contained in the file testSearchEval.json, which can be sent to the respective Kolibri endpoints (see start_searcheval.sh). Where the response is written is configured via properties/env variables (see the respective part of the documentation). A simpler way is to start up the app along with the UI (Kolibri Watch, see the respective section of this doc), navigate to the CREATE menu, select the search evaluation type and choose a job execution definition template.</description></item><item><title>Calculations</title><link>http://awagen.github.io/kolibri/2-config-details/3-config-options/4-calculations/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/3-config-options/4-calculations/</guid><description>Let&amp;rsquo;s look at the available calculation types:
Calculation Types IR_METRICS IDENTITY FIRST_TRUE FIRST_FALSE TRUE_COUNT FALSE_COUNT BINARY_PRECISION_TRUE_AS_YES BINARY_PRECISION_FALSE_AS_YES STRING_SEQUENCE_VALUE_OCCURRENCE_HISTOGRAM COMPLETION COMING SOON</description></item><item><title>Parameters</title><link>http://awagen.github.io/kolibri/2-config-details/4-file-formats/4-parameters/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/4-file-formats/4-parameters/</guid><description>COMING SOON</description></item><item><title>Formats &amp; Writers</title><link>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/5-writers/</link><pubDate>Mon, 03 Jan 2022 09:20:00 +0200</pubDate><guid>http://awagen.github.io/kolibri_archive/kolibri-base/3-step-by-step/5-writers/</guid><description>Formats &amp;amp; Writers Coming shortly :)</description></item><item><title>Aggregation Type Mappings</title><link>http://awagen.github.io/kolibri/2-config-details/3-config-options/5-aggregationtypemappings/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/3-config-options/5-aggregationtypemappings/</guid><description>Aggregation type mappings just specify for each defined metric name the appropriate type of aggregation. If no aggregation type is specified in the job definitions, the fallback will be DOUBLE_AVG (which will lead to problems if the calculated value is not a number and you did not specify the right aggregation type). The aggregation types are:
Aggregation Types DOUBLE_AVG SEQUENCE_KEEP_FIRST MAP_UNWEIGHTED_SUM_VALUE MAP_WEIGHTED_SUM_VALUE NESTED_MAP_UNWEIGHTED_SUM_VALUE NESTED_MAP_WEIGHTED_SUM_VALUE COMPLETION COMING SOON</description></item><item><title>Tagging Configuration</title><link>http://awagen.github.io/kolibri/2-config-details/3-config-options/6-taggingconfiguration/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/3-config-options/6-taggingconfiguration/</guid><description>In Kolibri, tagging is used on results to establish groupings in the result sets. This allows mechanisms such as aggregation of different results based on equal tags. Thus tags effectively define the granularity of your results. Let&amp;rsquo;s say you tag by the query-parameter and you run the evaluation on a range of parameters, all over a set of 1000 queries. You will then have 1000 single results. In contrast, if you have a parameter that can only assume two values and you tag based on this parameter, you will only have two results.</description></item><item><title>Tasks</title><link>http://awagen.github.io/kolibri/2-config-details/3-config-options/7-tasks/</link><pubDate>Thu, 07 Sep 2023 10:00:00 +0200</pubDate><guid>http://awagen.github.io/kolibri/2-config-details/3-config-options/7-tasks/</guid><description>Tasks are modular descriptions of computations. A job consists of batches of elements that undergo a sequence of tasks. These are the currently available task types:
Task Type REQUEST_PARSE Define target systems along with http method, fixed parameters, contextPath, which fields to parse, on which criteria to tag and which keys are used for storage of successful results and failures. METRIC_CALCULATION Based on parsed results (such as provided by the REQUEST_PARSE task), define which metrics to calculate.</description></item></channel></rss>