Test configuration files
Test configuration files include information needed to test an API that is not present in the OAS document, namely authentication data, the operations under test, parameter weights and test data generators.
The following example shows a test configuration file with these four settings commented:
auth: # Authentication data
  required: true
  queryParams:
    apikey: adsf1234
testConfiguration: # Operations under test. For each, you can set...
  testPaths:
    - testPath: /endpoint/path
      operations:
        - operationId: getSomething
          method: get
          testParameters:
            - name: location # Parameter options:
              weight: 0.5 # Weight
              generator: # Test data generator
                type: RandomInputValue
                genParameters:
                  - name: csv
                    values:
                      - src/main/resources/TestData/IATACodes.csv
Run the CreateTestConf class to create a default test configuration file. Before running it, replace the path to the OAS specification with the path to your own.
If you want to test only some operations of the API, you can modify the CreateTestConf class to select the desired subset of operations. In the following code snippet, only the GET and POST operations of the /example/url path and the DELETE operation of the /delete/example path will be considered when generating the test configuration file:
// Keep only the GET and POST operations of /example/url
TestConfigurationFilter filter1 = new TestConfigurationFilter();
filter1.setPath("/example/url");
filter1.addGetMethod();
filter1.addPostMethod();

// Keep only the DELETE operation of /delete/example
TestConfigurationFilter filter2 = new TestConfigurationFilter();
filter2.setPath("/delete/example");
filter2.addDeleteMethod();
RESTest supports several authentication schemes, namely API keys, Bearer tokens and HTTP Basic.
API keys can be set directly in the test configuration file as query parameters:
auth:
  required: true
  queryParams:
    <parameter_name>: <YOUR_API_KEY>
Alternatively, you can create an auth file, place it under src/test/resources/auth/<API>/<apikeys>.json and specify it in the testConf as follows:
auth:
  required: true
  apiKeysPath: <API>/<apikeys>.json
The format of this file should be as follows:
{
  "apikey_param_name_1": [
    "apikey_param_value_11",
    "apikey_param_value_12"
  ],
  "apikey_param_name_2": [
    "apikey_param_value_21",
    "apikey_param_value_22"
  ]
}
Bearer tokens are sent in the Authorization header:
auth:
  required: true
  headerParams:
    Authorization: Bearer <YOUR_TOKEN>
Alternatively, you can create an auth file, place it under src/test/resources/auth/<API>/<headers>.json and specify it in the testConf as follows:
auth:
  required: true
  headersPath: <API>/<headers>.json
The format of this file should be as follows:
{
  "header_param_name_1": [
    "header_param_value_11",
    "header_param_value_12"
  ],
  "header_param_name_2": [
    "header_param_value_21",
    "header_param_value_22"
  ]
}
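For HTTP Basic authentication, set the Authorization header to the Base64 encoding of the credentials:
auth:
  required: true
  headerParams:
    Authorization: Basic <Base64 encoding of <username>:<password>>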
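As a concrete instance of the HTTP Basic template, the following sketch assumes a hypothetical account with username user and password password. The value dXNlcjpwYXNzd29yZA== is the Base64 encoding of the string user:password; replace it with the encoding of your own credentials.

```yaml
auth:
  required: true
  headerParams:
    # Hypothetical credentials: Base64 encoding of "user:password"
    Authorization: Basic dXNlcjpwYXNzd29yZA==
```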
Testers might be interested in testing some parameters more thoroughly than others, for example, those most used in practice. Weights make this possible. A weight is a real number in the range [0,1]. The higher the weight of a parameter, the more frequently it will be used in test cases. By default, all parameters have a weight of 0.5.
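As an illustration, the fragment below sketches how weights might be set for two parameters of the same operation. The parameter names location and radius, and the values given to their generators, are hypothetical. With these weights, location would be expected to appear in test cases considerably more often than radius.

```yaml
testParameters:
  - name: location # hypothetical parameter, used in most test cases
    weight: 0.9
    generator:
      type: RandomInputValue
      genParameters:
        - name: values
          values:
            - Seville
            - Madrid
  - name: radius # hypothetical parameter, used only occasionally
    weight: 0.1
    generator:
      type: RandomNumber
      genParameters:
        - name: type
          values:
            - integer
        - name: min
          values:
            - 1
        - name: max
          values:
            - 50
```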
RESTest offers a comprehensive set of generators that create parameter values for each API request. While the test configuration file generator assigns a default generator for each parameter, we highly recommend customizing the setup of these generators to align with your specific requirements. The following generators are provided:
The class InputValue is a data generator that iterates through a list of strings and provides them sequentially. Each time a value is requested, it returns the next string in the list; when it reaches the end of the list, it starts again from the beginning.
This generator is useful when you need predefined string values in a specific, repeatable order, which keeps the test data controlled and the results consistent across runs.
- values: the list of strings that will be iterated.
...
testParameters:
  - name: cityCode
    weight: 0.5
    generator:
      type: InputValue
      genParameters:
        - name: values
          values:
            - PAR
            - OPO
            - MAD
            - BER
            - NYC
            - MEL
...
The Random Input Value Iterator is a data generator that iterates through a list of strings and provides them in a random order. You have the flexibility to specify the strings either as a list of values or by providing the path to a CSV file where the possible values are stored. This generator supports both strings and files as values, but you must provide the paths to the files you want to include in the generator.
When using the Random Input Value Iterator, it's essential to provide either values or csv. In the case of providing both parameters, the generator will only consider the last one you supply. The parameters are defined as follows:
- values: The list of strings or the paths to the files that will be iterated.
- csv: The path to the CSV file where you store the values to be iterated.
The Random Input Value Iterator offers the following optional parameters to further customize its behavior:
- minValues: If the parameter accepts a list of values, you can specify the minimum number of values to be included in each generated value. By default, it is set to 1.
- maxValues: If the parameter accepts a list of values, you can specify the maximum number of values to be included in each generated value. The default value is also 1.
- separator: In case the parameter accepts a list of values, you can specify the separator that delimits the values. The default separator is ','.
By leveraging the Random Input Value Iterator with the appropriate parameters, you can easily generate randomized test data from a list of strings or CSV files, enhancing the variety and coverage of your API testing scenarios.
...
testParameters:
  - name: cityCode
    weight: 0.5
    generator:
      type: RandomInputValue
      genParameters:
        - name: values
          values:
            - PAR
            - OPO
            - MAD
            - BER
            - NYC
            - MEL
...
...
testParameters:
  - name: cityCodes
    weight: 0.5
    generator:
      type: RandomInputValue
      genParameters:
        - name: csv
          values:
            - 'src/resources/city-codes.csv'
        - name: minValues
          values:
            - 2
        - name: maxValues
          values:
            - 4
        - name: separator
          values:
            - '-'
...
...
testParameters:
  - name: image
    weight: 0.5
    generator:
      type: RandomInputValue
      genParameters:
        - name: values
          values:
            - 'src/resources/image1.png'
            - 'src/resources/image2.jpg'
...
The RandomEnglishWordGenerator is a powerful tool for generating random words or sentences in English. Whether you need a single word or a complete sentence, this generator provides the flexibility to meet your requirements. By utilizing the extJWNL library and WordNet database, the generator ensures that the generated words are valid English words, adhering to proper linguistic rules.
- minWords: The minimum number of words to be included in the generated sentence. Defaults to 1.
- maxWords: The maximum number of words to be included in the generated sentence. Defaults to 3.
- generateCompounds: Set to true to generate words with compound forms, or false to generate single words only. Defaults to true.
- ignoreLinkingWords: Set to true to exclude linking words from the generated output, or false to include them. Defaults to true.
- category: The part of speech (NOUN, VERB, ADJECTIVE, ADVERB) to restrict the word generation. If not specified, it will be randomly selected.
...
testParameters:
  - name: sentence
    weight: 1
    generator:
      type: RandomEnglishWord
      genParameters:
        - name: minWords
          values:
            - 4
        - name: maxWords
          values:
            - 15
...
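The remaining optional parameters can be combined in the same way. The fragment below is a sketch that assumes a hypothetical keyword parameter and restricts generation to a single, non-compound noun. It uses only the parameters documented in this section.

```yaml
testParameters:
  - name: keyword # hypothetical parameter name
    weight: 0.5
    generator:
      type: RandomEnglishWord
      genParameters:
        - name: minWords
          values:
            - 1
        - name: maxWords
          values:
            - 1
        - name: generateCompounds
          values:
            - false
        - name: category
          values:
            - NOUN
```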
The RandomNumberGenerator is a versatile tool that generates random numbers, which can be either integers or floating-point numbers. This generator supports various number types, such as int32, int64, double, long, float, and number.
- type: Specifies the type of the number to be generated. The possible values are integer, int32, int64, double, number, long or float.
- min: the minimum value that can be generated. Defaults to the minimum possible value of the number type.
- max: the maximum value that can be generated. Defaults to the maximum possible value of the number type.
...
testParameters:
  - name: amount
    weight: 1
    generator:
      type: RandomNumber
      genParameters:
        - name: type
          values:
            - integer
        - name: min
          values:
            - 1
        - name: max
          values:
            - 20
...
The RandomDateGenerator is a powerful tool that generates random dates within specified date ranges. It offers flexibility to set the start and end dates, either as specific dates or as relative days from today using the startDays and endDays parameters. The generator provides dates in the format specified, which defaults to yyyy-MM-dd HH:mm:ss.
You must not provide both startDate and fromToday parameters. If you provide both, the generator will only take into account the last one you supply.
- startDate: specifies the minimum date that can be generated. The date format must be yyyy-MM-dd. Defaults to ten years ago from today.
- endDate: specifies the maximum date that can be generated. The date format must be yyyy-MM-dd. Defaults to two years from today.
- fromToday: a boolean that sets the start date (that is, the minimum date to be generated) to today's date. Defaults to false.
- format: the format of the dates to be generated. Defaults to yyyy-MM-dd HH:mm:ss.
- startDays: An integer parameter specifying the number of days from today to set the minimum date for generating random dates.
- endDays: An integer parameter specifying the number of days from today to set the maximum date for generating random dates.
...
testParameters:
  - name: birthDate
    weight: 1
    generator:
      type: RandomDate
      genParameters:
        - name: fromToday
          values:
            - true
        - name: endDate
          values:
            - 2025-03-01
        - name: format
          values:
            - yyyy-MM-dd'T'HH:mm:ss'Z'
...
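The relative parameters startDays and endDays are not used in the example above. The fragment below sketches how they might be set for a hypothetical checkInDate parameter. It assumes that positive values are offsets into the future, which should be checked against the generator's actual behavior.

```yaml
testParameters:
  - name: checkInDate # hypothetical parameter name
    weight: 0.5
    generator:
      type: RandomDate
      genParameters:
        - name: startDays # assumed: earliest date is 1 day from today
          values:
            - 1
        - name: endDays # assumed: latest date is 30 days from today
          values:
            - 30
        - name: format
          values:
            - yyyy-MM-dd
```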
The RandomRegExp is a tool that generates random strings that comply with a specific regular expression. It allows you to create test data that matches string patterns defined by regular expressions.
- regExp: the regular expression that the generated values will match.
- minLength: Specifies the minimum length of the generated string. By default, there is no minimum length set.
- maxLength: Specifies the maximum length of the generated string. By default, there is no maximum length set.
...
testParameters:
  - name: latlng
    weight: 0.5
    generator:
      type: RandomRegExp
      genParameters:
        - name: regExp
          values:
            - ([-+])?([1-8]?\d(\.\d+)?|90(\.0+)?),\s*[-+]?(180(\.0+)?|((1[0-7]\d)|([1-9]?\d))(\.\d+)?)
...
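The length limits can be added alongside the regular expression. The fragment below is a sketch for a hypothetical couponCode parameter. The expression is quoted because it starts with a bracket, which YAML would otherwise read as the start of a list.

```yaml
testParameters:
  - name: couponCode # hypothetical parameter name
    weight: 0.5
    generator:
      type: RandomRegExp
      genParameters:
        - name: regExp
          values:
            - '[A-Z0-9]+'
        - name: minLength
          values:
            - 6
        - name: maxLength
          values:
            - 10
```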
The RandomBooleanGenerator is a tool that generates random boolean values, representing true or false, based on a specified probability. It allows you to control the likelihood of getting true or false outcomes, making it useful for creating test scenarios with different probabilities.
- trueProbability: Specifies the probability of generating a true value. The value should be between 0 and 1, where 0 means always false, and 1 means always true. By default, the probability is set to 0.5, resulting in an equal chance of getting true or false.
...
testParameters:
  - name: onlyMyPlaylists
    weight: 0.5
    generator:
      type: RandomBoolean
      genParameters: []
...
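The example above leaves genParameters empty, so the default probability of 0.5 applies. As a sketch, a bias towards true could be configured as follows, assuming trueProbability is passed like any other generator parameter:

```yaml
testParameters:
  - name: onlyMyPlaylists
    weight: 0.5
    generator:
      type: RandomBoolean
      genParameters:
        - name: trueProbability # about 80% of generated values would be true
          values:
            - 0.8
```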
The RandomObject is a powerful tool that allows you to obtain a random JSON object from a given list of JSON objects. It provides the flexibility to select a JSON object randomly, making it ideal for scenarios where you need to choose a random JSON representation from a predefined set.
You must provide either values or files, but not both. If you provide both parameters, the generator will only consider the last one you supply.
- values: the list of JSON objects that will be iterated. You must specify the objects in the objectValues parameter in YAML format (see the example).
- files: a list of paths to the JSON files that will be iterated.
...
testParameters:
  - name: body
    weight: 0.5
    generator:
      type: RandomObject
      genParameters:
        - name: values
          values: null
          objectValues:
            - id: 1
              userName: user1
              text: I want more features!
              date: 2020-02-14T14:13:21.827+0000
              type: Request
...
...
testParameters:
  - name: body
    weight: 0.5
    generator:
      type: RandomObject
      genParameters:
        - name: files
          values:
            - 'src/resources/file1.json'
            - 'src/resources/file2.json'
...
The ObjectPerturbator is a versatile tool designed to transform JSON objects, commonly used as inputs for API operations, into new JSON objects. The transformed objects may be invalid, although this is not guaranteed, which lets developers test diverse data scenarios.
You must provide only one of object, stringObject or file. If you provide more than one of these parameters, the generator will only take into account the last one you supply.
- object: the JSON object that will be perturbed. You must specify the object in the objectValues parameter in YAML format (see the example).
- stringObject: the JSON object that will be perturbed, given as a string.
- file: the path to the JSON file that will be perturbed.
- singleOrder: a boolean that specifies whether the generator must only apply single-order perturbations. A single-order perturbation implies, for example, that only one mutation is applied at a time. Defaults to true.
...
testParameters:
  - name: body
    weight: 0.5
    generator:
      type: ObjectPerturbator
      genParameters:
        - name: object
          values: null
          objectValues:
            - id: 1
              userName: user1
              text: I want more features!
              date: 2020-02-14T14:13:21.827+0000
              type: Request
...
...
testParameters:
  - name: body
    weight: 0.5
    generator:
      type: ObjectPerturbator
      genParameters:
        - name: stringObject
          values:
            - '{"id":1,"userName":"user1","text":"I want more features!","date":"2020-02-14T14:13:21.827+0000","type":"Review"}'
...
...
testParameters:
  - name: body
    weight: 0.5
    generator:
      type: ObjectPerturbator
      genParameters:
        - name: file
          values:
            - 'src/resources/file1.json'
...
The RandomStringGenerator is a versatile tool that allows you to generate random strings based on specified parameters. It provides the flexibility to customize the length and character set of the generated strings.
- minLength: Specifies the minimum length of the generated string. The default value is 0.
- maxLength: Specifies the maximum length of the generated string. The default value is 10.
- includeAlphabetic: A boolean parameter that determines whether to include alphabetic characters in the generated string. The default value is true. Set to false if you want to exclude alphabetic characters from the generated string.
- includeNumbers: A boolean parameter that determines whether to include numeric characters in the generated string. The default value is false. Set to true if you want to include numeric characters in the generated string.
- includeSpecialCharacters: A boolean parameter that determines whether to include non-alphanumeric ASCII characters in the generated string. The default value is false. Set to true if you want to include special characters in the generated string.
...
testParameters:
  - name: user_id
    weight: 0.5
    generator:
      type: RandomString
      genParameters:
        - name: minLength
          values:
            - 15
        - name: maxLength
          values:
            - 20
        - name: includeNumbers
          values:
            - true
...
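The boolean flags can be combined to restrict the character set. The fragment below is a sketch for a hypothetical pin parameter that should contain digits only: it disables alphabetic characters, enables numbers and fixes the length at four.

```yaml
testParameters:
  - name: pin # hypothetical parameter name
    weight: 0.5
    generator:
      type: RandomString
      genParameters:
        - name: minLength
          values:
            - 4
        - name: maxLength
          values:
            - 4
        - name: includeAlphabetic
          values:
            - false
        - name: includeNumbers
          values:
            - true
```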
The BoundaryString generator tests the range boundaries of a string parameter. It generates strings with the following lengths: minLength, maxLength, minLength + delta, maxLength + delta, minLength - delta (if positive), maxLength - delta (if positive), and the midpoint (minLength + maxLength) / 2. These strings are stored in a list, which is then iterated by an InputValue generator. This lets developers examine edge cases and boundary conditions of APIs that accept string inputs.
- minLength: The lower boundary of the string length. Strings generated will have lengths equal to minLength, minLength + delta, and minLength - delta (if positive). Defaults to 0.
- maxLength: The upper boundary of the string length. Strings generated will have lengths equal to maxLength, maxLength + delta, and maxLength - delta (if positive). Defaults to 1024.
- delta: The value added to and subtracted from minLength and maxLength. This allows generating strings with lengths close to the specified boundaries. Defaults to 2.
- includeEmptyString: Specifies whether to include an empty string ("") in the generated list of strings. Defaults to true.
- includeNullCharacter: Specifies whether to include the null character ("\0") in the generated list of strings. Defaults to true.
...
testParameters:
  - name: tweet
    weight: 0.5
    generator:
      type: BoundaryString
      genParameters:
        - name: minLength
          values:
            - 1
        - name: maxLength
          values:
            - 240
        - name: delta
          values:
            - 1
        - name: includeNullCharacter
          values:
            - false
...
Same as BoundaryString, but it uses a RandomInputValue generator to iterate the generated strings.
...
testParameters:
  - name: tweet
    weight: 0.5
    generator:
      type: RandomBoundaryString
      genParameters:
        - name: minLength
          values:
            - 1
        - name: maxLength
          values:
            - 240
        - name: delta
          values:
            - 1
        - name: includeEmptyString
          values:
            - false
...
This generator is used to test the range boundaries of a number parameter. It generates the following numbers: min, max, min + delta, max + delta, min - delta, max - delta and (min + max) / 2. These numbers are stored in a list which is provided to an InputValue generator. It supports all number types.
- type: the type of the number. Possible values are integer, int32, int64, double, number, long or float.
- min: the lower boundary of the parameter. Defaults to -2^31.
- max: the upper boundary of the parameter. Defaults to 2^31 - 1.
- delta: the value added to and subtracted from min and max. Defaults to 1.
...
testParameters:
  - name: phoneNumber
    weight: 0.5
    generator:
      type: BoundaryNumber
      genParameters:
        - name: type
          values:
            - int32
        - name: min
          values:
            - 600000000
        - name: max
          values:
            - 749999999
...
Same as BoundaryNumber, but it uses a RandomInputValue generator to iterate the generated numbers.
...
testParameters:
  - name: probability
    weight: 0.5
    generator:
      type: RandomBoundaryNumber
      genParameters:
        - name: type
          values:
            - double
        - name: min
          values:
            - 0.0
        - name: max
          values:
            - 1.0
        - name: delta
          values:
            - 0.1
...
The FuzzingDictionary is a utility generator designed to provide fuzzing values for API testing. It supplies a variety of values for different data types that are especially useful for boundary testing and edge-case scenarios in APIs that accept diverse data types.
This class has no direct configuration parameters, since it uses a predefined dictionary to obtain the fuzzing values for each data type.
The BodyGenerator is a utility generator designed to provide request body data for stateful API testing scenarios. It generates JsonNode nodes that represent valid request bodies for specific API operations.
- operationMethod: The method of the API operation for which the request body will be generated.
- operationPath: The path of the API operation for which the request body will be generated.
- defaultValue: The default value in JSON format for the request body. If a valid request body cannot be generated, this default value will be used.
- mutate: A flag that allows enabling or disabling schema mutation of the request body during data generation. If mutation is enabled, schema changes will be applied to generate additional and diverse test data.
- dataDirPath: The directory path where generated state data will be stored for later use in tests.
...
testParameters:
  - name: body
    in: body
    weight: null
    generators:
      - valid: true
        type: BodyGenerator
        genParameters:
          - name: defaultValue
            values:
              - "{\"id\":\"c1\",\"userName\":\"josedelpino\",\"text\":\"I love Spotify\"\
                ,\"date\":\"2013-04-16T20:44:53.950\",\"type\":\"Review\"}"
            objectValues: null
...
The ParameterGenerator is a utility generator designed to provide parameter values used in stateful API testing scenarios. It generates JsonNode nodes that represent valid parameter values for specific API operations.
- parameterName: The name of the parameter for which the value will be generated.
- parameterType: The data type of the parameter for which the value will be generated.
- dataDirPath: The directory path where generated state data will be stored for later use in tests.
- operationMethod: The method of the API operation for which the parameter value will be generated.
- operationPath: The path of the API operation for which the parameter value will be generated.
- altParameterName: An alternate name for the parameter, used if the primary parameter name is not found in the stateful data.
- altOperationPath: An alternate path for the operation, used if the primary operation path is not found in the stateful data.
- spec: The OpenAPI specification that contains the definition of the API operation for which the parameter value will be generated.
- defaultValue: The default value for the parameter. If a valid value cannot be generated, this default value will be used.
...
testParameters:
  - name: id
    in: path
    weight: null
    generators:
      - valid: true
        type: ParameterGenerator
        genParameters: []
        # - name: defaultValue
        #   values:
        #     - c1
...
RESTest supports the automatic generation of realistic input values for the parameters of an API by using ARTE (Automated generation of Realistic Test inputs), an approach that generates these realistic values by applying Natural Language Processing (NLP), search-based and knowledge extraction techniques.
More specifically, ARTE analyzes both the name and description of the parameters of an OpenAPI specification (along with other constraints, such as regular expressions or minimum and maximum values) to query a knowledge base. By default, ARTE relies on DBpedia version 2016-10.
The main method for executing ARTE can be found at src/main/java/es.us.isa.restest/inputs/semantic/ARTEInputGenerator.java. This method must be provided with a .properties file containing both the path to the OpenAPI specification file and the path to a configuration file in which the user specifies the parameters for which ARTE should generate values. For example, to generate inputs for the isbn parameter, change the generator type of this parameter to SemanticParameter as shown below:
...
testParameters:
  - name: isbn
    in: query
    weight: 0.5
    generators:
      - type: SemanticParameter
        genParameters: []
        valid: true
...
After executing ARTE, a new file named testConfSemantic.yaml will be generated in the directory containing the original test configuration file. In this configuration file, the parameters previously specified as SemanticParameter will now contain the path to a CSV file with the values generated by ARTE and the predicates from the knowledge base that were used to extract them. These CSV files can be modified by the user and are available at src/main/resources/TestData/Generated/{apiName}.
...
testParameters:
  - name: isbn
    in: query
    weight: 0.5
    generators:
      - type: RandomInputValue
        genParameters:
          - name: csv
            values:
              - src/main/resources/TestData/Generated/sampleSemanticAPI/sampleEndpoint_isbn.csv
            objectValues: null
          - name: predicates
            values:
              - http://dbpedia.org/property/isbn
            objectValues: null
          - name: numberOfTriesToGenerateRegex
            values:
              - 0
            objectValues: null
        valid: true
...
Before executing RESTest with the values generated by ARTE, the value of the property "conf.path" in the .properties file must be replaced by the path of the testConfSemantic.yaml file.
ARTE extracts these values by automatically generating SPARQL queries. For instance, for the parameters title, pages and isbn, ARTE would generate the following query:
SELECT DISTINCT ?title ?pages ?isbn WHERE {
    ?subject <http://dbpedia.org/property/title> ?title ;
             <http://dbpedia.org/ontology/numberOfPages> ?pages ;
             <http://dbpedia.org/ontology/isbn> ?isbn .
    FILTER (?pages >= 100)
    FILTER regex(str(?isbn), '^([0-9]*[-| ]){4}[0-9]*$')
}
Here, http://dbpedia.org/property/title, http://dbpedia.org/ontology/numberOfPages and http://dbpedia.org/ontology/isbn are predicates representative of the parameters. By leveraging the information provided by the OAS document, ARTE applies additional filters to restrict the minimum number of pages and obtain only those titles whose ISBN matches the regular expression. The table below shows an example of the values returned by this query:
Title | Pages | Isbn |
---|---|---|
The Last Wish | 288 | 978-0-575-08244-1 |
Darwin et la science de l'évolution | 160 | 978-2-0705-3520-0 |
Alan Moore: Portrait of an Extraordinary Gentleman | 352 | 978-0-946790-06-7 |
Dark Matter | 342 | 978-1-101-90422-0 |
The Martian | 369 | 978-0-8041-3902-1 |
The generation of initial values with ARTE receives the following configuration parameters:
- propertiesFilePath: Of type string. Path of the .properties file containing the paths to the OpenAPI specification and the configuration file in which the user specifies the parameters for which ARTE must generate values.
- maxNumberOfCandidatePredicates: Of type integer. When exploring the knowledge base, ARTE uses multiple keywords extracted from the OpenAPI specification to search for candidate predicates. This parameter establishes the maximum number of candidate predicates to extract per keyword. Default: 5.
- minSupport: Of type integer. When exploring the target knowledge base, ARTE analyzes several candidate predicates, keeping the first one whose support is equal to or greater than the value of this parameter. The support of a predicate is the number of unique RDF triples containing it. Default: 20.
- threshold: Of type integer. Minimum number of unique input values per parameter necessary to consider the parameter as satisfied. If, during the execution of an automatically generated query, this threshold is not achieved for a subset of parameters, a new query is generated containing exclusively the parameters that have not been satisfied. For instance, if for the previous SPARQL query example the threshold is not achieved for the parameters title and pages, a new query is generated containing exclusively these two parameters. On the other hand, if the threshold is not achieved for any of the parameters, ARTE will execute a number of queries equal to the number of parameters, each containing all predicates but one, isolating the parameter whose discarding would result in a greater number of results for the remaining ones. This query decomposition process continues until all parameters have been satisfied or until all predicates are executed in isolation. Default: 100.
- limit: Of type integer. Maximum number of values to extract per parameter. If its value is null, ARTE will ignore this parameter. Default: null.
- szEndpoint: Of type string. Endpoint of the knowledge base to leverage. By default, ARTE relies on DBpedia version 2016-10.
The default values for these parameters are shown in the table below:
Parameter name | Default value |
---|---|
maxNumberOfCandidatePredicates | 5 |
minSupport | 20 |
threshold | 100 |
limit | null |
While executing RESTest, ARTE automatically obtains feedback from API responses and classifies the obtained input values as valid or invalid. Optionally, ARTE can automatically generate a regular expression that matches only the valid values and use it to filter out the invalid ones. Furthermore, this regular expression can be used by ARTE to extract additional input values for each parameter.
This step is useful when ARTE generates input values in different formats, such as country codes of length 2 and 3, or URLs starting with https or www, with the API under test accepting only one of them.
If this option is enabled, ARTE will try to automatically generate a regular expression at the end of each iteration of RESTest (whose length in terms of API requests is indicated by the parameter testperoperation of the .properties file).
During this step, ARTE receives the following parameters, which must be specified in the .properties file:
- learnRegex: Of type boolean. This parameter indicates whether ARTE should try to automatically infer a regular expression for the semantic parameters at the end of every RESTest iteration. Default: false.
- secondPredicateSearch: Of type boolean. It indicates whether ARTE should use the automatically generated regular expression to search for additional input values in the knowledge base. This parameter has no effect if learnRegex has been set to false. Default: false.
- maxNumberOfPredicates: Of type integer. Maximum number of predicates of the knowledge base to leverage per parameter when extracting new input values. Default: 3.
- minimumValidAndInvalidValues: Of type integer. To automatically generate a regular expression, ARTE requires a set of both valid and invalid values. This parameter establishes the minimum number of these values necessary for ARTE to try to generate the regular expression. Default: 5.
- minimumValueOfMetric: Of type double (between 0 and 1). This parameter establishes the minimum performance of the generated regular expression on the set of valid and invalid values in terms of recall. If this value is achieved, ARTE uses the regex to filter the list of generated values and, optionally (if secondPredicateSearch=true), to perform an additional search for input values. Default: 0.9.
- maxNumberOfTriesToGenerateRegularExpression: Of type integer. Automatically trying to generate a regular expression for each parameter at the end of every execution of RESTest can result in an important (and in some cases unnecessary) performance overhead. To avoid this issue, this parameter establishes the maximum number of tries to automatically generate a regular expression per parameter. If this value is reached for a parameter, ARTE will not try to generate a regular expression for it in the subsequent operations. However, if ARTE successfully generates a regular expression that achieves the minimum performance specified by the user (set by minimumValueOfMetric), the current number of tries to generate a regular expression for the parameter will be reset to 0. Default: 2.
The default values for these parameters are shown in the table below:
Parameter name | Default value |
---|---|
learnRegex | false |
secondPredicateSearch | false |
maxNumberOfPredicates | 3 |
minimumValidAndInvalidValues | 5 |
minimumValueOfMetric | 0.9 |
maxNumberOfTriesToGenerateRegularExpression | 2 |
At the end of the execution of RESTest, the list of valid and invalid values generated by ARTE that match the automatically generated regular expression can be found inside the folder target/test-data/{experimentName}/validAndInvalidValues.
An extensive evaluation on a dataset of 47 real-world APIs showed the effectiveness of the default values of ARTE for successfully generating realistic input values. Nevertheless, this section analyzes the influence of each parameter on the performance of ARTE:
- maxNumberOfPredicatesSelected: This parameter is designed to prevent ARTE from selecting predicates that are not related to the keyword used for conducting the search for predicates. A low value for this parameter would result in a faster execution (since fewer predicates are analysed) with the risk of not finding a suitable predicate, whereas a high value could make ARTE consider predicates that are not related to the provided keyword. For instance, if a high number is set for this parameter when using the keyword url, ARTE could consider predicates such as http://dbpedia.org/property/curlie (this is the 8th predicate obtained when using this keyword).
- Minimum support of predicate (minSupport): This parameter is designed to prevent ARTE from selecting predicates that do not appear in a sufficient number of unique triples, but that are found first according to the priority defined by the matching rules used for extracting predicates. As an example, when using federalState as keyword, ARTE obtains two predicates (http://dbpedia.org/property/federalState and http://dbpedia.org/ontology/federalState). The first predicate is only present in one triple, so only one value for the parameter federalState would be obtained. On the other hand, using the second predicate would return a greater number of results. Setting a low value for this parameter would allow ARTE to find a predicate slightly faster (with the risk of not finding enough values for a parameter or of obtaining irrelevant data), whereas setting a greater value would result in a slower execution, with the risk of not finding any predicates.
- Minimum number of unique parameter values (threshold): Using a low value for this parameter would result in a faster execution, since the threshold would be achieved with fewer queries, but the number of generated input values for each parameter may be insufficient. On the other hand, a greater value would result in a slower execution, with the worst case being that all the predicates would be isolated.
- Minimum recall for accepting regex (minimumValueOfMetric): If a low value is set for this parameter, ARTE would accept almost any automatically generated regular expression, whereas a recall too close to 100% could be too restrictive.
- Minimum number of valid and invalid values for generating regex (minimumValidAndInvalidValues): In order to generate a regular expression, it is necessary to provide ARTE with a set of valid and invalid values. Using a small set as input for ARTE could result in the generation of regular expressions that overfit the training set and therefore do not generalize well. On the other hand, collecting a big set of valid and invalid values would require a vast number of API requests.
- Maximum number of predicates to leverage per parameter (maxNumberOfPredicates): When a regular expression with a good recall is successfully generated, the user can configure ARTE to search for more input values using this regular expression as a filter, leveraging new predicates. However, not restricting the maximum number of predicates to leverage per parameter could cause ARTE to extract values that are not related to the parameter.
- maxNumberOfTriesToGenerateRegularExpression: Setting a high value for this parameter could result in an important performance overhead. Additionally, it could result in ARTE eventually generating a regular expression that does not represent the target parameter, resulting in the addition of noisy data to the list of input values.