
Test configuration files

José Luis García Marín edited this page Dec 12, 2023 · 77 revisions

Table of contents

  1. Test configuration files
    1. Generating a default test configuration file
      1. Filters
  2. Authentication data
    1. API key
    2. Bearer token
    3. HTTP Basic
  3. Parameter weights
  4. Test data generators
    1. InputValue
    2. RandomInputValue
    3. RandomEnglishWord
    4. RandomNumber
    5. RandomDate
    6. RandomRegExp
    7. RandomBoolean
    8. RandomObject
    9. ObjectPerturbator
    10. RandomString
    11. BoundaryString
    12. RandomBoundaryString
    13. BoundaryNumber
    14. RandomBoundaryNumber
    15. FuzzingDictionary
    16. Stateful Generator
    17. SemanticGenerator (ARTE)

Test configuration files

Test configuration files include information needed to test an API that is not present in the OAS document, namely authentication data, the operations under test, parameter weights and test data generators.

The following example of a test configuration file marks each of these four settings with a comment:

auth: # Authentication data
  required: true
  queryParams:
    apikey: adsf1234
testConfiguration: # Operations under test. For each, you can set...
  testPaths:
  - testPath: /endpoint/path
    operations:
    - operationId: getSomething
      method: get
      testParameters:
      - name: location # Parameter options:
        weight: 0.5 # Weight
        generator: # Test data generator
          type: RandomInputValue
          genParameters:
          - name: csv
            values:
            - src/main/resources/TestData/IATACodes.csv

Generating a default test configuration file

Run the CreateTestConf class to create a default test configuration file. Before running it, replace the path to the OAS specification in the class with the path to your own specification.

Filters

If you want to test only some operations of the API, you can modify the CreateTestConf class to select the desired subset of operations. In the following code snippet, only the GET and POST operations of the /example/url path and the DELETE operation of the /delete/example path will be considered when generating the test configuration file:

TestConfigurationFilter filter1 = new TestConfigurationFilter();
filter1.setPath("/example/url");
filter1.addGetMethod();
filter1.addPostMethod();
TestConfigurationFilter filter2 = new TestConfigurationFilter();
filter2.setPath("/delete/example");
filter2.addDeleteMethod();

Authentication data

RESTest supports several authentication schemes, namely: API Keys, Bearer tokens and HTTP Basic.

API key

auth:
  required: true
  queryParams: 
    <parameter_name>: <YOUR_API_KEY>

Alternatively, you can create an auth file, place it under src/test/resources/auth/<API>/<apikeys>.json and reference it in the testConf as follows:

auth:
  required: true
  apiKeysPath: <API>/<apikeys>.json

The format of this file should be as follows:

{
	"apikey_param_name_1": [
		"apikey_param_value_11",
		"apikey_param_value_12"
	],
	"apikey_param_name_2": [
		"apikey_param_value_21",
		"apikey_param_value_22"
	]
}

Bearer token

auth:
  required: true
  headerParams: 
    Authorization: Bearer <YOUR_TOKEN>

Alternatively, you can create an auth file, place it under src/test/resources/auth/<API>/<headers>.json and reference it in the testConf as follows:

auth:
  required: true
  headersPath: <API>/<headers>.json

The format of this file should be as follows:

{
	"header_param_name_1": [
		"header_param_value_11",
		"header_param_value_12"
	],
	"header_param_name_2": [
		"header_param_value_21",
		"header_param_value_22"
	]
}

HTTP Basic

auth:
  required: true
  headerParams: 
    Authorization: Basic <Base64 encoding of <username>:<password>>
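
The header value above can be computed with the Java standard library. The following is a minimal sketch; the class name BasicAuthHeader and the placeholder credentials are hypothetical and not part of RESTest.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; not part of RESTest.
final class BasicAuthHeader {

    // Builds the value of the Authorization header for HTTP Basic:
    // "Basic " followed by the Base64 encoding of "<username>:<password>".
    static String of(String username, String password) {
        String credentials = username + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // Placeholder credentials; replace them with your own.
        System.out.println(of("user", "secret"));
    }
}
```

Paste the printed string as the value of the Authorization entry under headerParams.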

Parameter weights

Testers might be interested in testing some parameters more thoroughly than others, for example, those used more often in practice. Weights make this possible. A weight is a real number in the range [0,1]. The higher the weight of a parameter, the more frequently it will be used in test cases. By default, all parameters have a weight of 0.5.

Test data generators

RESTest offers a comprehensive set of generators that create parameter values for each API request. While the test configuration file generator assigns a default generator for each parameter, we highly recommend customizing the setup of these generators to align with your specific requirements. The following generators are provided:

InputValue

InputValue is a data generator that iterates through a list of strings and returns them sequentially. Each time a value is requested, it returns the next string in the list. When it reaches the end of the list, it starts again from the beginning.

This generator is useful when you need predefined string values delivered in a fixed order for systematic, repeatable testing. Because InputValue presents the test data in a controlled order, the results are easier to reproduce and compare.

Required parameters

  • values: the list of strings that will be iterated.

Example

...

      testParameters:
      - name: cityCode
        weight: 0.5
        generator:
          type: InputValue
          genParameters:
            - name: values
              values:
                - PAR
                - OPO
                - MAD
                - BER
                - NYC
                - MEL

...
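
The cycling behavior described above can be sketched in a few lines of Java. CyclingValues is a hypothetical stand-in written for illustration; it is not the RESTest class and makes no claim about how RESTest implements the generator.

```java
import java.util.List;

// Hypothetical stand-in for illustration; not the RESTest implementation.
final class CyclingValues {
    private final List<String> values;
    private int next = 0;

    CyclingValues(List<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("values must not be empty");
        }
        this.values = List.copyOf(values);
    }

    // Returns the values in order and wraps around after the last one.
    String nextValue() {
        String value = values.get(next);
        next = (next + 1) % values.size();
        return value;
    }

    public static void main(String[] args) {
        CyclingValues generator = new CyclingValues(List.of("PAR", "OPO", "MAD"));
        for (int i = 0; i < 5; i++) {
            System.out.println(generator.nextValue());
        }
    }
}
```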

RandomInputValue

The Random Input Value Iterator is a data generator that iterates through a list of strings and provides them in a random order. You have the flexibility to specify the strings either as a list of values or by providing the path to a CSV file where the possible values are stored. This generator supports both strings and files as values, but you must provide the paths to the files you want to include in the generator.

Required parameters

When using the Random Input Value Iterator, it's essential to provide either values or csv. In the case of providing both parameters, the generator will only consider the last one you supply. The parameters are defined as follows:

  • values: The list of strings or the paths to the files that will be iterated.
  • csv: The path to the CSV file where you store the values to be iterated.

Optional parameters

The Random Input Value Iterator offers the following optional parameters to further customize its behavior:

  • minValues: If the parameter accepts a list of values, the minimum number of values to include in each generated value. Defaults to 1.
  • maxValues: If the parameter accepts a list of values, the maximum number of values to include in each generated value. Defaults to 1.
  • separator: If the parameter accepts a list of values, the separator that delimits the values. Defaults to ','.

By leveraging the Random Input Value Iterator with the appropriate parameters, you can easily generate randomized test data from a list of strings or CSV files, enhancing the variety and coverage of your API testing scenarios.

Example 1 - List of strings

...

      testParameters:
      - name: cityCode
        weight: 0.5
        generator:
          type: RandomInputValue
          genParameters:
            - name: values
              values:
                - PAR
                - OPO
                - MAD
                - BER
                - NYC
                - MEL

...

Example 2 - Parameter that accepts several values - CSV file

...

      testParameters:
      - name: cityCodes
        weight: 0.5
        generator:
          type: RandomInputValue
          genParameters:
            - name: csv
              values:
                - 'src/resources/city-codes.csv'
            - name: minValues
              values:
                - 2
            - name: maxValues
              values:
                - 4
            - name: separator
              values:
                - '-'

...

Example 3 - List of files

...

      testParameters:
      - name: image
        weight: 0.5
        generator:
          type: RandomInputValue
          genParameters:
            - name: values
              values:
                - 'src/resources/image1.png'
                - 'src/resources/image2.jpg'

...
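
To make the interplay of minValues, maxValues and separator concrete, here is a minimal Java sketch. RandomValues is a hypothetical class, and the choice to allow the same value to be picked more than once is an assumption; the actual generator may behave differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch for illustration; not the RESTest implementation.
final class RandomValues {
    private final List<String> values;
    private final int minValues;
    private final int maxValues;
    private final String separator;
    private final Random random;

    RandomValues(List<String> values, int minValues, int maxValues,
                 String separator, Random random) {
        if (values.isEmpty() || minValues < 0 || maxValues < minValues) {
            throw new IllegalArgumentException("invalid configuration");
        }
        this.values = List.copyOf(values);
        this.minValues = minValues;
        this.maxValues = maxValues;
        this.separator = separator;
        this.random = random;
    }

    // Picks between minValues and maxValues random entries (repeats allowed,
    // which is an assumption) and joins them with the separator.
    String nextValue() {
        int count = minValues + random.nextInt(maxValues - minValues + 1);
        List<String> picked = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            picked.add(values.get(random.nextInt(values.size())));
        }
        return String.join(separator, picked);
    }

    public static void main(String[] args) {
        RandomValues generator = new RandomValues(
                List.of("PAR", "OPO", "MAD", "BER"), 2, 4, "-", new Random());
        System.out.println(generator.nextValue());
    }
}
```

With the defaults (minValues = 1, maxValues = 1) the sketch returns a single value, so the separator never appears.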

RandomEnglishWord

The RandomEnglishWordGenerator is a powerful tool for generating random words or sentences in English. Whether you need a single word or a complete sentence, this generator provides the flexibility to meet your requirements. By utilizing the extJWNL library and WordNet database, the generator ensures that the generated words are valid English words, adhering to proper linguistic rules.

Optional parameters

  • minWords: The minimum number of words to be included in the generated sentence. Defaults to 1.
  • maxWords: The maximum number of words to be included in the generated sentence. Defaults to 3.
  • generateCompounds: Set to true to generate words with compound forms, or false to generate single words only. Defaults to true.
  • ignoreLinkingWords: Set to true to exclude linking words from the generated output, or false to include them. Defaults to true.
  • category: The part of speech (NOUN, VERB, ADJECTIVE, ADVERB) to restrict the word generation. If not specified, it will be randomly selected.

Example

...

      testParameters:
      - name: sentence
        weight: 1
        generator:
          type: RandomEnglishWord
          genParameters:
            - name: minWords
              values:
                - 4
            - name: maxWords
              values:
                - 15

...

RandomNumber

The RandomNumberGenerator is a versatile tool that generates random numbers, which can be either integers or floating-point numbers. This generator supports various number types, such as int32, int64, double, long, float, and number.

Required parameters

  • type: Specifies the type of the number to be generated. The possible values are: integer, int32, int64, double, number, long or float.

Optional parameters

  • min: the minimum value that can be generated. Defaults to the minimum possible value of the number type.
  • max: the maximum value that can be generated. Defaults to the maximum possible value of the number type.

Example

...

      testParameters:
      - name: amount
        weight: 1
        generator:
          type: RandomNumber
          genParameters:
            - name: type
              values:
                - integer
            - name: min
              values:
                - 1
            - name: max
              values:
                - 20

...

RandomDate

The RandomDateGenerator generates random dates within a specified date range. You can set the start and end of the range either as specific dates or as a number of days relative to today, using the startDays and endDays parameters. Dates are returned in the specified format, which defaults to "yyyy-MM-dd HH:mm:ss".

Optional parameters

You must not provide both startDate and fromToday parameters. If you provide both, the generator will only take into account the last one you supply.

  • startDate: specifies the minimum date that can be generated. The date format must be yyyy-MM-dd. Defaults to ten years ago from today.
  • endDate: specifies the maximum date that can be generated. The date format must be yyyy-MM-dd. Defaults to two years from today.
  • fromToday: a boolean to specify the start date - that is, the minimum date to be generated - as today's date. Defaults to false.
  • format: the format of the dates to be generated. Defaults to yyyy-MM-dd HH:mm:ss.
  • startDays: An integer parameter specifying the number of days from today to set the minimum date for generating random dates.
  • endDays: An integer parameter specifying the number of days from today to set the maximum date for generating random dates.

Example

...

      testParameters:
      - name: birthDate
        weight: 1
        generator:
          type: RandomDate
          genParameters:
            - name: fromToday
              values:
                - true
            - name: endDate
              values:
                - 2025-03-01
            - name: format
              values:
                - yyyy-MM-dd'T'HH:mm:ss'Z'

...
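
The idea of drawing a date between two bounds and rendering it with a format pattern can be sketched with java.time. RandomDateSketch and its method are hypothetical names, and the 30-day range in main is a placeholder; this is not the RESTest implementation.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Random;

// Hypothetical sketch for illustration; not the RESTest implementation.
final class RandomDateSketch {

    // Returns a random date-time in [start, end], rendered with the given pattern.
    static String between(LocalDateTime start, LocalDateTime end,
                          String format, Random random) {
        long seconds = ChronoUnit.SECONDS.between(start, end);
        if (seconds < 0) {
            throw new IllegalArgumentException("end must not be before start");
        }
        // floorMod keeps the offset in [0, seconds], even for negative longs.
        long offset = Math.floorMod(random.nextLong(), seconds + 1);
        return start.plusSeconds(offset).format(DateTimeFormatter.ofPattern(format));
    }

    public static void main(String[] args) {
        // Mirrors "fromToday: true": the range starts today.
        LocalDateTime start = LocalDate.now().atStartOfDay();
        LocalDateTime end = start.plusDays(30); // placeholder upper bound
        System.out.println(between(start, end, "yyyy-MM-dd'T'HH:mm:ss'Z'", new Random()));
    }
}
```

The quoted 'T' and 'Z' in the pattern are literals, as in the format value of the example above.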

RandomRegExp

The RandomRegExp is a tool that generates random strings that comply with a specific regular expression. It allows you to create test data that matches string patterns defined by regular expressions.

Required parameters

  • regExp: the regular expression that the generated values will match.

Optional parameters

  • minLength: Specifies the minimum length of the generated string. By default, there is no minimum length set.
  • maxLength: Specifies the maximum length of the generated string. By default, there is no maximum length set.

Example

...

      testParameters:
      - name: latlng
        weight: 0.5
        generator:
          type: RandomRegExp
          genParameters:
            - name: regExp
              values:
                - ([-+])?([1-8]?\d(\.\d+)?|90(\.0+)?),\s*[-+]?(180(\.0+)?|((1[0-7]\d)|([1-9]?\d))(\.\d+)?)

...

RandomBoolean

The RandomBooleanGenerator is a tool that generates random boolean values, representing true or false, based on a specified probability. It allows you to control the likelihood of getting true or false outcomes, making it useful for creating test scenarios with different probabilities.

Optional parameters

  • trueProbability: Specifies the probability of generating a true value. The value should be between 0 and 1, where 0 means always false, and 1 means always true. By default, the probability is set to 0.5, resulting in an equal chance of getting true or false.

Example

...

      testParameters:
      - name: onlyMyPlaylists
        weight: 0.5
        generator:
          type: RandomBoolean
          genParameters: []

...
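
The effect of trueProbability can be illustrated with a one-line comparison against a uniform random number. WeightedBoolean is a hypothetical name used only for this sketch.

```java
import java.util.Random;

// Hypothetical sketch for illustration; not the RESTest implementation.
final class WeightedBoolean {

    // Returns true with the given probability: 0 means never, 1 means always.
    static boolean next(double trueProbability, Random random) {
        if (trueProbability < 0.0 || trueProbability > 1.0) {
            throw new IllegalArgumentException("trueProbability must be in [0, 1]");
        }
        // nextDouble() is uniform in [0, 1), so the comparison holds
        // with probability trueProbability.
        return random.nextDouble() < trueProbability;
    }

    public static void main(String[] args) {
        Random random = new Random();
        int trues = 0;
        for (int i = 0; i < 1000; i++) {
            if (next(0.8, random)) {
                trues++;
            }
        }
        System.out.println("true values out of 1000: " + trues);
    }
}
```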

RandomObject

The RandomObject is a powerful tool that allows you to obtain a random JSON object from a given list of JSON objects. It provides the flexibility to select a JSON object randomly, making it ideal for scenarios where you need to choose a random JSON representation from a predefined set.

Required parameters

You must provide either values or files, but not both. If you provide both parameters, the generator will only consider the last one you supply.

  • values: the list of JSON objects that will be iterated. You must specify the objects in the objectValues parameter in YAML format (see Example 1).
  • files: a list of paths to the JSON files that will be iterated.

Example 1 - List of values

...

      testParameters:
      - name: body
        weight: 0.5
        generator:
          type: RandomObject
          genParameters:
            - name: values
              values: null
              objectValues:
              - id: 1
                userName: user1
                text: I want more features!
                date: 2020-02-14T14:13:21.827+0000
                type: Request

...

Example 2 - List of file paths

...

      testParameters:
      - name: body
        weight: 0.5
        generator:
          type: RandomObject
          genParameters:
            - name: files
              values:
              - 'src/resources/file1.json'
              - 'src/resources/file2.json'

...

ObjectPerturbator

The ObjectPerturbator transforms JSON objects, commonly used as inputs for API operations, into new JSON objects. The transformed objects may be invalid, although this is not guaranteed, which lets developers test diverse data scenarios.

Required parameters

You must provide exactly one of object, stringObject or file. If you provide more than one of these parameters, the generator only takes into account the last one you supply.

  • object: the JSON object that will be perturbed. You must specify the object in the objectValues parameter in YAML format (see Example 1).
  • stringObject: the JSON object that will be perturbed as a string.
  • file: the path to the JSON file that will be perturbed.

Optional parameters

  • singleOrder: a boolean that specifies if the generator must only apply single-order perturbations. A single-order perturbation implies, for example, that only one mutation is applied at a time. Defaults to true.

Example 1 - Object

...

      testParameters:
      - name: body
        weight: 0.5
        generator:
          type: ObjectPerturbator
          genParameters:
            - name: object
              values: null
              objectValues:
              - id: 1
                userName: user1
                text: I want more features!
                date: 2020-02-14T14:13:21.827+0000
                type: Request

...

Example 2 - String object

...

      testParameters:
      - name: body
        weight: 0.5
        generator:
          type: ObjectPerturbator
          genParameters:
            - name: stringObject
              values:
              - '{"id":1,"userName":"user1","text":"I want more features!","date":"2020-02-14T14:13:21.827+0000","type":"Review"}'
...

Example 3 - JSON file path

...

      testParameters:
      - name: body
        weight: 0.5
        generator:
          type: ObjectPerturbator
          genParameters:
            - name: file
              values:
              - 'src/resources/file1.json'

...

RandomString

The RandomStringGenerator is a versatile tool that allows you to generate random strings based on specified parameters. It provides the flexibility to customize the length and character set of the generated strings.

Optional parameters

  • minLength: Specifies the minimum length of the generated string. The default value is 0.
  • maxLength: Specifies the maximum length of the generated string. The default value is 10.
  • includeAlphabetic: A boolean parameter that determines whether to include alphabetic characters in the generated string. The default value is true. Set to false if you want to exclude alphabetic characters from the generated string.
  • includeNumbers: A boolean parameter that determines whether to include numeric characters in the generated string. The default value is false. Set to true if you want to include numeric characters in the generated string.
  • includeSpecialCharacters: A boolean parameter that determines whether to include non-alphanumeric ASCII characters in the generated string. The default value is false. Set to true if you want to include special characters in the generated string.

Example

...

      testParameters:
      - name: user_id
        weight: 0.5
        generator:
          type: RandomString
          genParameters:
          - name: minLength
            values:
            - 15
          - name: maxLength
            values:
            - 20
          - name: includeNumbers
            values:
            - true

...
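
The way the length bounds and the three include flags shape the output can be sketched as follows. RandomStringSketch is a hypothetical class, and the exact set of special characters is an assumption chosen for the example.

```java
import java.util.Random;

// Hypothetical sketch for illustration; not the RESTest implementation.
final class RandomStringSketch {
    private static final String LETTERS =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    private static final String DIGITS = "0123456789";
    // Assumed subset of non-alphanumeric ASCII characters.
    private static final String SPECIAL = "!#$%&*+-=?@_";

    static String next(int minLength, int maxLength, boolean includeAlphabetic,
                       boolean includeNumbers, boolean includeSpecialCharacters,
                       Random random) {
        StringBuilder alphabet = new StringBuilder();
        if (includeAlphabetic) alphabet.append(LETTERS);
        if (includeNumbers) alphabet.append(DIGITS);
        if (includeSpecialCharacters) alphabet.append(SPECIAL);
        if (alphabet.length() == 0 || minLength < 0 || maxLength < minLength) {
            throw new IllegalArgumentException("invalid configuration");
        }
        // Length is uniform in [minLength, maxLength].
        int length = minLength + random.nextInt(maxLength - minLength + 1);
        StringBuilder result = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            result.append(alphabet.charAt(random.nextInt(alphabet.length())));
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Same settings as the user_id example: 15 to 20 letters and digits.
        System.out.println(next(15, 20, true, true, false, new Random()));
    }
}
```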

BoundaryString

The BoundaryString generator tests the range boundaries of a string parameter. From its configuration parameters, it creates a list of strings with the following lengths: minLength, maxLength, minLength + delta, maxLength + delta, minLength - delta (if positive), maxLength - delta (if positive) and the mean length (minLength + maxLength) / 2. The strings are stored in a list that is then iterated by an InputValue generator. This lets developers examine edge cases and boundary conditions of APIs that accept string inputs.

Optional parameters

  • minLength: The lower boundary of the string length. Strings generated will have lengths equal to minLength, minLength + delta, and minLength - delta (if positive). Defaults to 0.
  • maxLength: The upper boundary of the string length. Strings generated will have lengths equal to maxLength, maxLength + delta, and maxLength - delta (if positive). Defaults to 1024.
  • delta: The value added to and subtracted from minLength and maxLength. This allows generating strings with lengths close to the specified boundaries. Defaults to 2.
  • includeEmptyString: Specifies whether to include an empty string ("") in the generated list of strings. Defaults to true.
  • includeNullCharacter: Specifies whether to include the null character ("\0") in the generated list of strings. Defaults to true.

Example

...

      testParameters:
      - name: tweet
        weight: 0.5
        generator:
          type: BoundaryString
          genParameters:
          - name: minLength
            values:
            - 1
          - name: maxLength
            values:
            - 240
          - name: delta
            values:
            - 1
          - name: includeNullCharacter
            values:
            - false

...
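
The set of string lengths described for this generator can be computed as in the sketch below. BoundaryLengths is a hypothetical name; the use of integer division for the mean length and the removal of duplicate lengths are assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch for illustration; not the RESTest implementation.
final class BoundaryLengths {

    // Lengths around both boundaries plus the mean; the "minus delta"
    // lengths are kept only if they are positive.
    static List<Integer> of(int minLength, int maxLength, int delta) {
        Set<Integer> lengths = new LinkedHashSet<>(); // drops duplicates, keeps order
        if (minLength - delta > 0) lengths.add(minLength - delta);
        lengths.add(minLength);
        lengths.add(minLength + delta);
        lengths.add((minLength + maxLength) / 2); // integer division (assumption)
        if (maxLength - delta > 0) lengths.add(maxLength - delta);
        lengths.add(maxLength);
        lengths.add(maxLength + delta);
        return new ArrayList<>(lengths);
    }

    public static void main(String[] args) {
        // Settings from the tweet example: minLength 1, maxLength 240, delta 1.
        System.out.println(of(1, 240, 1)); // [1, 2, 120, 239, 240, 241]
    }
}
```

For the tweet example, the length 0 (minLength - delta) is skipped because it is not positive; an empty string is still produced when includeEmptyString is true.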

RandomBoundaryString

Same as BoundaryString, but it uses a RandomInputValue generator to iterate the generated strings.

Example

...

      testParameters:
      - name: tweet
        weight: 0.5
        generator:
          type: RandomBoundaryString
          genParameters:
          - name: minLength
            values:
            - 1
          - name: maxLength
            values:
            - 240
          - name: delta
            values:
            - 1
          - name: includeEmptyString
            values:
            - false

...

BoundaryNumber

This generator is used to test the range boundaries of a number parameter. It generates the following numbers: min, max, min + delta, max + delta, min - delta, max - delta and (min + max) / 2. These numbers are stored in a list which is provided to an InputValue generator. It supports all number types.

Required parameters

  • type: the type of the number. Possible values are integer, int32, int64, double, number, long or float.

Optional parameters

  • min: the lower boundary of the parameter. Defaults to -2^31.
  • max: the upper boundary of the parameter. Defaults to 2^31 - 1.
  • delta: the value added to and subtracted from min and max. Defaults to 1.

Example

...

      testParameters:
      - name: phoneNumber
        weight: 0.5
        generator:
          type: BoundaryNumber
          genParameters:
          - name: type
            values:
            - int32
          - name: min
            values:
            - 600000000
          - name: max
            values:
            - 749999999

...
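
The seven values listed at the start of this section can be computed as in this sketch. BoundaryNumbers is a hypothetical name, the sketch covers integer types only, and long arithmetic is used so that min - delta and max + delta do not overflow near the int32 limits.

```java
import java.util.List;

// Hypothetical sketch for illustration; not the RESTest implementation.
final class BoundaryNumbers {

    // min, max, min + delta, max + delta, min - delta, max - delta, (min + max) / 2
    static List<Long> of(long min, long max, long delta) {
        return List.of(
                min, max,
                min + delta, max + delta,
                min - delta, max - delta,
                (min + max) / 2); // integer division for integer types
    }

    public static void main(String[] args) {
        // Settings from the phoneNumber example, with the default delta of 1.
        System.out.println(of(600000000L, 749999999L, 1L));
    }
}
```

Note that max + delta and min - delta deliberately fall outside the declared range, which is what makes them useful for testing how the API handles out-of-range input.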

RandomBoundaryNumber

Same as BoundaryNumber, but it uses a RandomInputValue generator to iterate the generated numbers.

Example

...

      testParameters:
      - name: probability
        weight: 0.5
        generator:
          type: RandomBoundaryNumber
          genParameters:
          - name: type
            values:
            - double
          - name: min
            values:
            - 0.0
          - name: max
            values:
            - 1.0
          - name: delta
            values:
            - 0.1

...

FuzzingDictionary

The FuzzingDictionary is a utility generator designed to provide fuzzing values used in API testing. It offers a range of values for different data types, particularly useful for conducting boundary tests and scenarios involving edge cases in APIs that accept various data types.

There are no direct configuration parameters for this class, as it utilizes a predefined dictionary to retrieve fuzzing values for each data type.

Stateful Generator

These data generators are primarily designed to make calls to an API to retrieve dynamic information and selectively collect relevant fields from the obtained responses. In this process, four main classes are involved: DataMatching, BodyGenerator, ParameterGenerator, and TestDataGeneratorFactory, with the latter encompassing all data generators.

The process can be described in several steps:

First, the generators make API calls. Upon receiving the responses, they identify and select the fields they consider relevant. For example, if a response contains a "name" field, that field is collected along with other data, such as surname and address, that could be useful as parameters in future calls.

The collected information is stored in a structured manner in a JSON file, which acts as a data repository that can be queried later. Each key in the file corresponds to a response field, and its value holds the information collected for that field. For example, if a response includes a "customer" field along with details like address and street, all of this data is stored.

When a subsequent API call requires specific parameters, the generators use the previously stored information. For instance, if a "name" parameter is needed in a call, the previously collected "name" field is used to provide dynamic data for that call.

Once the data is collected, the DataMatching class comes into play to effectively leverage this information. Its main function is to match a parameter name and extract relevant information from the generated JSON file.

The DataMatching class implements several heuristics to extract information accurately.

The first heuristic is an exact match. If a parameter is called "Name and Surname" and a response contains a field called "name and surname," the information is extracted directly from the file.

Besides exact matches, other heuristics are used to maximize effectiveness. The first option is to look for the same parameter name in the same operation. For example, if there is an operation to get books and it returns a field called "title," that title is searched for specifically in the responses of that operation.

The second option is to look for the same parameter name in different operations. For instance, if there is an operation called "search movies" with a parameter named "title," relevant information could be extracted from that operation.

The third option is to search for a similar parameter name but within the same operation. This is useful when parameter names are not identical but share similarities, like "titles" and "title."

The fourth option is to search for a similar parameter name but in a different operation.

The fifth option involves searching for information within nested objects in JSON responses. For example, if a response returns an object that includes "data.comments.id," then "comments.id" is searched for as a parameter name to extract information. The next step would be to search only with "id."
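
The progressive shortening of a nested field path, from "data.comments.id" to "comments.id" and then "id", can be sketched as follows. NestedKeyCandidates is a hypothetical helper written for this page; it does not reproduce the DataMatching code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the DataMatching implementation.
final class NestedKeyCandidates {

    // Strips the leading segment of a dotted path one step at a time:
    // "data.comments.id" -> ["comments.id", "id"]
    static List<String> of(String fieldPath) {
        List<String> candidates = new ArrayList<>();
        int dot = fieldPath.indexOf('.');
        while (dot >= 0) {
            fieldPath = fieldPath.substring(dot + 1);
            candidates.add(fieldPath);
            dot = fieldPath.indexOf('.');
        }
        return candidates;
    }

    public static void main(String[] args) {
        System.out.println(of("data.comments.id")); // [comments.id, id]
    }
}
```

Each candidate would then be compared against the parameter name, from the most specific to the least specific.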

Therefore, the DataMatching class is used both to obtain parameters and to build complete JSONs. If you want to find a single parameter, you only need to perform this process once. However, if you want to build a complete JSON, such as for creating an HTTP request, you would have to apply this process to each property of the JSON.

The ParameterGenerator and BodyGenerator classes play specific roles in this context. ParameterGenerator searches the JSON for a valid parameter name to use in constructing requests. You have the option to specify some parameters, like an alternative parameter name or an alternative operation name. This option is especially useful when the web API uses nondescriptive parameter names, such as just the letter 't.' In this case, you can manually specify an alternative name, like 'title,' for the generator to search using that name instead of the letter 't.' This allows finding data that would not otherwise be obtained using only the letter 't,' and the same applies to operations.

Applying heuristics to a single parameter is quite straightforward. However, when building a JSON, this process must be carried out for each property of the JSON, making it more complex. Some properties of the JSON may not have data in the collected information, and this is where the BodyGenerator class comes into play.

In cases where no data is found for certain properties of the JSON, this class manages that situation. It first attempts to extract data from existing sources. If it doesn't find data, depending on the type of property, it employs different strategies. For example, it may extract information from examples defined in the OpenAPI, which allows specifying examples for properties. It also tries to obtain default values.

If none of these methods work, the BodyGenerator class generates random data to fill in those properties. This is where it becomes clever, generating random strings, values like random numbers, random dates, random names, etc. Essentially, it dynamically and flexibly fills the properties of the JSON without specific data.
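
The order of fallbacks described above (collected data, then OpenAPI examples, then default values, then random data) can be sketched as a chain of optional sources. PropertyValueResolver and the source lambdas are hypothetical; the real BodyGenerator may organize this differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical sketch for illustration; not the BodyGenerator implementation.
final class PropertyValueResolver {

    // Tries each source in order and returns the first value found;
    // if every source is empty, falls back to randomly generated data.
    static String resolve(List<Supplier<Optional<String>>> sources,
                          Supplier<String> randomFallback) {
        for (Supplier<Optional<String>> source : sources) {
            Optional<String> value = source.get();
            if (value.isPresent()) {
                return value.get();
            }
        }
        return randomFallback.get();
    }

    public static void main(String[] args) {
        List<Supplier<Optional<String>>> sources = new ArrayList<>();
        sources.add(Optional::empty);                   // no collected data
        sources.add(() -> Optional.of("Don Quixote")); // example from the OpenAPI document
        sources.add(Optional::empty);                   // no default value
        System.out.println(resolve(sources, () -> "random-string"));
    }
}
```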

To use one of these generators, ParameterGenerator or BodyGenerator, include it in the configuration file. For instance, for an operation in the Spotify API:

  - testPath: "/playlists/{playlist_id}/images"
    operationId: endpoint-get-playlist-cover
    method: get
    testParameters:
    - name: playlist_id
      in: path
      weight: null
      generators:
        - type: ParameterGenerator
          genParameters:
            - name: altParamName
              values:
                - playlists.items.id
            - name: altOperationPath
              values:
                - /search
          valid: true
    expectedResponse: 200

The generator will have gathered a set of identifiers for playlists that it can use to conduct tests on this operation.

"playlists.items.id":["3jWDSsEMaUgT7K01Dy1QyM","2kHYfIaiQGooI56As7l944","3PFFCA28AuP4JAQlqVWvz3","6TzGtw9xGfTJUtcfPkx7Tk","1NqgQkGjdf7ZjrHkFeGFVX","7k96QQHhW9zq6ODuO4Ct8J","4Eionm84gYgPceOTPhoVr8","6P4T6rWUTYdpdqMqOx17qG","7vWyzynFn9p0SE2ecLRADP","3SycXhaNehrcEQfXpia6GU","42jWb0YUZIL8pQcWmss4rQ","6wQLrKs05DmbAnj7DXVvR4","3hQyzpcmxjvbyLfcH8RoMq","22SVwIfcZcwxmvZlbwlRSq","4RQjglgHCCtP29b13W5btq","7pX1EBSYsHeKDjg6X59LGa","3Z3Ra7k4ZEDFLENNHGJOAX","37i9dQZF1DWXtn4mSM1Su9","2j1scUancgopc8qJnlazzU","7nGvOp1wSzt3kaHpTNok29","6ul83gIgwGdPERbyQesmLh","3CvJ0YpOASDhNMfuO1WueL","6OyWlWooBxAXeTSrDTlFEj","6rjgZRvblUPscX7S7utzaE","1esymgxzpJkVHUaoHHnDUo","0xCZoO23OPe5J86f2190KF","3zeEzhqmdlgNbDLEinRtNe","7sya9DpzsSBpvs1r459jXo","6v95e9rx9XsNmkJWHUUMKr","1BTQIQzrBRpodBBLVJl1mm","1IgmB2vvKUqpLxDV01rQwn","7iaGdSUOyqVwLQ6xg23APf","5O9xqQ9RxkYJjiHQZRegbg","5AUcpSBu07Zo1HvyH43kMZ","6VBMZYNcBpXC8yZ1M1xzyx","3W4sI001lnGBqQOqxR8EKN","6t56VTgWHP7spkZ5tdNWSv","6pb4KOOmQ3lIC559nJzbR1","69XcyrSB1lMvTbmeMxSBhT","6fnpbiKatyuUX3pMIt8cnc","4TDLin4UdDnZRgXHpYd7Hn","12kOjDWmeZj6qzBC6XvJKL","02qFUHWlx7qreMsrrSqNot","2b9fLC8gUIjX6en3MEzpUM","3rusLGSx45XygFv9mgt6SU","2TrgpCqD1ICvgRheqRohmx","0E890F3zCCGFba7jMPemfu"]

SemanticGenerator (ARTE)

RESTest supports the automatic generation of realistic input values for the parameters of an API by using ARTE (Automated generation of Realistic TEst inputs), an approach that generates these values by applying Natural Language Processing (NLP), search-based and knowledge extraction techniques.

More specifically, ARTE analyzes both the name and description of the parameters of an OpenAPI specification (along with other constraints, such as regular expressions or minimum and maximum values) to query a knowledge base. By default, ARTE relies on DBpedia version 2016-10.

The main method for executing ARTE can be found at src/main/java/es.us.isa.restest/inputs/semantic/ARTEInputGenerator.java. This method must be provided with a .properties file containing both the path to the OpenAPI specification file and the path to a configuration file in which the user specifies the parameters for which ARTE should generate values. For example, if we want to generate inputs for the isbn parameter, we must change the generator type of this parameter to SemanticParameter as shown below:

...
      testParameters:
       - name: isbn
         in: query
         weight: 0.5
         generators:
          - type: SemanticParameter
            genParameters: []
            valid: true
...

After executing ARTE, a new file named testConfSemantic.yaml is generated in the directory containing the original test configuration file. In this new configuration file, the parameters previously specified as SemanticParameter contain the path to a CSV file with the values generated by ARTE, along with the predicates from the knowledge base that were used to extract them. These CSV files can be modified by the user and are available at src/main/resources/TestData/Generated/{apiName}.

...
      testParameters:
       - name: isbn
         in: query
         weight: 0.5
         generators:
          - type: RandomInputValue
            genParameters:
             - name: csv
               values:
                - src/main/resources/TestData/Generated/sampleSemanticAPI/sampleEndpoint_isbn.csv
               objectValues: null
             - name: predicates
               values:
                - http://dbpedia.org/property/isbn
               objectValues: null
             - name: numberOfTriesToGenerateRegex
               values:
                - 0
               objectValues: null
             valid: true
...

Before executing RESTest with the values generated by ARTE, the conf.path property of the .properties file must be set to the path of the testConfSemantic.yaml file.

ARTE extracts these values by automatically generating SPARQL queries. For instance, for the parameters title, pages and isbn, ARTE would generate the following query:

SELECT DISTINCT ?title  ?pages ?isbn WHERE {
 ?subject <http://dbpedia.org/property/title>         ?title ;
          <http://dbpedia.org/ontology/numberOfPages> ?pages ;
          <http://dbpedia.org/ontology/isbn>          ?isbn .
 FILTER (?pages >= 100)
 FILTER regex(str(?isbn), '^([0-9]*[-| ]){4}[0-9]*$')
}

Here, http://dbpedia.org/property/title, http://dbpedia.org/ontology/numberOfPages and http://dbpedia.org/ontology/isbn are the predicates that represent the parameters. By leveraging the information provided by the OAS specification, ARTE applies additional filters to restrict the minimum number of pages and to obtain only those books whose ISBN matches the regular expression. The table below shows an example of the values returned by this query:

Title | Pages | ISBN
The Last Wish | 288 | 978-0-575-08244-1
Darwin et la science de l'évolution | 160 | 978-2-0705-3520-0
Alan Moore: Portrait of an Extraordinary Gentleman | 352 | 978-0-946790-06-7
Dark Matter | 342 | 978-1-101-90422-0
The Martian | 369 | 978-0-8041-3902-1
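
The way such a query is assembled from the parameters, their predicates and their OAS constraints can be illustrated with a short Python sketch. This is not ARTE's actual implementation (ARTE is written in Java); the function build_query and the dictionary keys name, predicate, minimum and pattern are hypothetical names chosen for this example, and the sketch emits one triple pattern per line instead of the ";" shorthand used above.

```python
def build_query(params):
    """Build a SPARQL SELECT query from a list of parameter descriptions.

    Each item is a dict with the (hypothetical) keys 'name' and 'predicate',
    plus the optional constraints 'minimum' and 'pattern'.
    """
    names = [p["name"] for p in params]
    lines = ["SELECT DISTINCT " + " ".join("?" + n for n in names) + " WHERE {"]
    # One triple pattern per parameter, all sharing the same subject.
    for p in params:
        lines.append(f" ?subject <{p['predicate']}> ?{p['name']} .")
    # Constraints taken from the specification become FILTER clauses.
    for p in params:
        if "minimum" in p:
            lines.append(f" FILTER (?{p['name']} >= {p['minimum']})")
        if "pattern" in p:
            lines.append(f" FILTER regex(str(?{p['name']}), '{p['pattern']}')")
    lines.append("}")
    return "\n".join(lines)


book_params = [
    {"name": "title", "predicate": "http://dbpedia.org/property/title"},
    {"name": "pages", "predicate": "http://dbpedia.org/ontology/numberOfPages",
     "minimum": 100},
    {"name": "isbn", "predicate": "http://dbpedia.org/ontology/isbn",
     "pattern": "^([0-9]*[-| ]){4}[0-9]*$"},
]
query = build_query(book_params)
print(query)
```

The sketch does not escape quotes or backslashes inside the pattern, so it is only suitable for simple regular expressions like the one shown.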

Configuration parameters

When generating the initial values, ARTE accepts the following configuration parameters:

  • propertiesFilePath: (String) The file path of the .properties file that contains the paths to the OpenAPI specification and the configuration file where the user specifies the parameters for which ARTE should generate values.

  • maxNumberOfCandidatePredicates: (Integer) When exploring the knowledge base, ARTE uses multiple keywords extracted from the OpenAPI specification to search for candidate predicates. This parameter establishes the maximum number of candidate predicates to extract per keyword. Default: 5.

  • minSupport: (Integer) During exploration of the target knowledge base, ARTE analyzes several candidate predicates and keeps the first one whose support is equal to or greater than the value of this parameter. The support of a predicate is the number of unique RDF triples containing it. Default: 20.

  • threshold: (Integer) The minimum number of unique input values per parameter required to consider the parameter as satisfied. If, during the execution of an automatically generated query, this threshold is not reached for a subset of the parameters, ARTE generates a new query containing only the unsatisfied parameters. For example, if the threshold is not reached for the parameters "title" and "pages" in a SPARQL query, ARTE generates a new query with only these two parameters. If, instead, none of the parameters reaches the threshold, ARTE executes as many queries as there are parameters, each containing all predicates except one, and isolates the parameter whose exclusion yields the highest number of results for the remaining ones. This query decomposition process continues until all parameters are satisfied or until all predicates have been executed in isolation. Default: 100.

  • limit: (Integer) The maximum number of values to extract per parameter. If the value is null, ARTE will ignore this parameter. Default: null.

  • szEndpoint: (String) The endpoint of the knowledge base to leverage. By default, ARTE uses DBpedia version 2016-10.
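
The interplay between maxNumberOfCandidatePredicates and minSupport can be sketched as follows. This is a minimal Python illustration rather than ARTE's code: the function select_predicate is hypothetical, the candidates are assumed to be already ordered by the priority of the matching rules, and the support value of 250 is made up for the example.

```python
def select_predicate(candidates, min_support=20, max_candidates=5):
    """Return the first candidate whose support reaches min_support.

    candidates: list of (predicate_uri, support) pairs for one keyword,
    ordered by matching priority. Only the first max_candidates entries
    are inspected. Returns None if no candidate qualifies.
    """
    for uri, support in candidates[:max_candidates]:
        if support >= min_support:
            return uri
    return None


candidates = [
    ("http://dbpedia.org/property/federalState", 1),
    ("http://dbpedia.org/ontology/federalState", 250),  # hypothetical support
]
chosen = select_predicate(candidates)
print(chosen)
```

With the default minSupport of 20, the first candidate is skipped because it appears in a single triple, and the second one is selected.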

The default values for these parameters are shown in the table below:

Parameter name Default value
maxNumberOfCandidatePredicates 5
minSupport 20
threshold 100
limit null
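
The role of threshold in deciding what ARTE does after a query can be sketched in Python. The helper names unsatisfied_parameters and next_step, and the step labels "done", "requery" and "decompose", are assumptions made for this illustration; they only mirror the three cases described for the threshold parameter.

```python
def unsatisfied_parameters(results, threshold=100):
    """Return the parameters with fewer than `threshold` unique values.

    results: dict mapping each parameter name to the values a query returned.
    """
    return [name for name, values in results.items()
            if len(set(values)) < threshold]


def next_step(results, threshold=100):
    """Classify the outcome of a query according to the threshold."""
    pending = unsatisfied_parameters(results, threshold)
    if not pending:
        return ("done", [])            # every parameter is satisfied
    if len(pending) < len(results):
        return ("requery", pending)    # new query with the unsatisfied ones
    return ("decompose", pending)      # no parameter reached the threshold


# Tiny example with threshold=3 so the lists stay readable.
results = {
    "title": ["The Martian"],
    "pages": [369, 369, 342],
    "isbn": ["978-0-8041-3902-1", "978-1-101-90422-0", "978-0-575-08244-1"],
}
step = next_step(results, threshold=3)
print(step)
```

In this example only isbn has three unique values, so a new query would be generated for title and pages.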

Automated Generation of regular expressions

While executing RESTest, ARTE automatically obtains feedback from the API responses and classifies the obtained input values as valid or invalid. Optionally, ARTE can automatically generate a regular expression that matches only the valid values and use it to filter out the invalid ones. Furthermore, this regular expression can be used by ARTE to extract additional input values for each parameter.

This step is useful when ARTE generates input values in different formats, such as country codes of length 2 and 3, or urls starting with https or www, with the API under test accepting only one of them.

If this option is enabled, ARTE tries to automatically generate a regular expression at the end of each iteration of RESTest (the length of an iteration, in terms of API requests, is set by the testperoperation parameter of the .properties file).

During this step, ARTE receives the following parameters, which must be specified in the .properties file:

  • learnRegex: Of type boolean. Indicates whether ARTE should try to automatically infer a regular expression for the semantic parameters at the end of every RESTest iteration. Default: false.
  • secondPredicateSearch: Of type boolean. Indicates whether ARTE should use the automatically generated regular expression to search for additional input values in the knowledge base. This parameter has no effect if learnRegex is set to false. Default: false.
  • maxNumberOfPredicates: Of type integer. Maximum number of predicates of the knowledge base to leverage per parameter when extracting new input values. Default: 3.
  • minimumValidAndInvalidValues: Of type integer. To automatically generate a regular expression, ARTE requires a set of both valid and invalid values. This parameter establishes the minimum number of valid and invalid values required for ARTE to try to generate the regular expression. Default: 5.
  • minimumValueOfMetric: Of type double (between 0 and 1). This parameter establishes the minimum performance, in terms of recall, of the generated regular expression on the set of valid and invalid values. If this value is achieved, ARTE uses the regex to filter the list of generated values and, optionally (if secondPredicateSearch=true), to perform an additional search for input values. Default: 0.9.
  • maxNumberOfTriesToGenerateRegularExpression: Of type integer. Trying to generate a regular expression for each parameter at the end of every iteration of RESTest can result in a significant (and in some cases unnecessary) performance overhead. To avoid this issue, this parameter establishes the maximum number of tries to automatically generate a regular expression per parameter. Once this value is reached for a parameter, ARTE will not try to generate a regular expression for it in subsequent iterations. However, if ARTE successfully generates a regular expression that achieves the minimum performance specified by the user (set by minimumValueOfMetric), the current number of tries for the parameter is reset to 0. Default: 2.
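
How minimumValidAndInvalidValues and minimumValueOfMetric gate the use of a learned regular expression can be sketched in Python. Several points here are assumptions and not taken from ARTE's code: recall is computed as the fraction of valid values matched by the regex, values must match the regex entirely, the minimum count is required separately for valid and for invalid values, and the function names are hypothetical.

```python
import re


def recall(pattern, valid_values):
    """Fraction of valid values fully matched by the pattern (assumed metric)."""
    regex = re.compile(pattern)
    matched = sum(1 for value in valid_values if regex.fullmatch(value))
    return matched / len(valid_values)


def filter_with_regex(pattern, valid, invalid, candidates,
                      min_values=5, min_recall=0.9):
    """Filter candidates with the regex only if it is considered good enough."""
    # Not enough feedback collected yet: leave the candidates untouched.
    if len(valid) < min_values or len(invalid) < min_values:
        return list(candidates)
    # The regex misses too many valid values: reject it.
    if recall(pattern, valid) < min_recall:
        return list(candidates)
    regex = re.compile(pattern)
    return [c for c in candidates if regex.fullmatch(c)]


# Country codes: the API under test is assumed to accept only 2-letter codes.
valid = ["ES", "FR", "DE", "IT", "PT"]
invalid = ["ESP", "FRA", "DEU", "ITA", "PRT"]
kept = filter_with_regex("[A-Z]{2}", valid, invalid, ["US", "USA", "GB", "GBR"])
print(kept)
```

The example corresponds to the country-code case mentioned above: once enough valid and invalid values are known and the regex matches all valid ones, the 3-letter codes are filtered out.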

The default values for these parameters are shown in the table below:

Parameter name Default value
learnRegex false
secondPredicateSearch false
maxNumberOfPredicates 3
minimumValidAndInvalidValues 5
minimumValueOfMetric 0.9
maxNumberOfTriesToGenerateRegularExpression 2
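
The counter behind maxNumberOfTriesToGenerateRegularExpression can be illustrated with a small Python simulation. This is a sketch under assumptions: the function simulate_tries is hypothetical, each list element stands for one RESTest iteration, and True means that the regex generated in that iteration would reach minimumValueOfMetric.

```python
def simulate_tries(outcomes, max_tries=2):
    """Return, per iteration, whether a regex generation attempt is made.

    outcomes: list of booleans, one per iteration, telling whether an
    attempt in that iteration would produce an accepted regex.
    """
    tries = 0
    attempted = []
    for accepted in outcomes:
        if tries >= max_tries:
            # Limit reached: no more attempts for this parameter.
            attempted.append(False)
            continue
        attempted.append(True)
        # A successful attempt resets the counter; a failed one increments it.
        tries = 0 if accepted else tries + 1
    return attempted


attempts = simulate_tries([False, True, False, False, False])
print(attempts)
```

In this run the success in the second iteration resets the counter, so two further failed attempts are allowed before ARTE would stop trying for that parameter.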

At the end of the execution of RESTest, the lists of valid and invalid values generated by ARTE that match the automatically generated regular expression can be found in the folder target/test-data/{experimentName}/validAndInvalidValues.

Parameter values discussion

An extensive evaluation on a dataset of 47 real-world APIs showed that the default values of ARTE are effective for generating realistic input values. Nevertheless, this section analyzes the influence of each parameter on the performance of ARTE:

  • Maximum number of candidate predicates per keyword (maxNumberOfCandidatePredicates): This parameter is designed to prevent ARTE from selecting predicates that are not related to the keyword used for conducting the search for predicates. A low value for this parameter would result in a faster execution (since fewer predicates are analysed) with the risk of not finding a suitable predicate, whereas a high value could make ARTE consider predicates that are not related to the provided keyword. For instance, if a high number is set for this parameter when using the keyword url, ARTE could consider predicates such as http://dbpedia.org/property/curlie (this is the 8th predicate obtained when using this keyword).
  • Minimum support of predicate (minSupport): This parameter is designed to prevent ARTE from selecting predicates that do not appear in a sufficient number of unique triples, but that are found first according to the priority defined by the matching rules used for extracting predicates. As an example, when using federalState as keyword, ARTE obtains two predicates (http://dbpedia.org/property/federalState and http://dbpedia.org/ontology/federalState). The first predicate is only present in one triple, so only one value for the parameter federalState would be obtained. On the other hand, using the second predicate would return a greater number of results. Setting a low value for this parameter would allow ARTE to find a predicate slightly faster (with the risk of not finding enough values for a parameter, or of retrieving irrelevant data), whereas setting a greater value would result in a slower execution, with the risk of not finding any predicate.
  • Minimum number of unique parameter values (threshold): Using a low value for this parameter would result in a faster execution, since the threshold would be reached with fewer queries, but the number of generated input values for each parameter may be insufficient. On the other hand, a greater value would result in a slower execution, with the worst case being that all the predicates end up being executed in isolation.
  • Minimum recall for accepting regex (minimumValueOfMetric): If a low value is set for this parameter, ARTE would accept almost any automatically generated regular expression, whereas a value too close to 1 (100% recall) could be too restrictive.
  • Minimum number of valid and invalid values for generating regex (minimumValidAndInvalidValues): In order to generate a regular expression, it is necessary to provide ARTE with a set of valid and invalid values. Using a small set as input could result in regular expressions that overfit the training set and therefore do not generalize well. On the other hand, collecting a large set of valid and invalid values would require a vast number of API requests.
  • Maximum number of predicates to leverage per parameter (maxNumberOfPredicates): When a regular expression with a good recall is successfully generated, the user can configure ARTE to search for more input values using this regular expression as a filter, leveraging new predicates. However, not restricting the maximum number of predicates to leverage per parameter could cause ARTE to extract values that are not related to the parameter.
  • Maximum number of tries to generate a regex (maxNumberOfTriesToGenerateRegularExpression): Setting a high value for this parameter could result in a significant performance overhead. Additionally, it could result in ARTE eventually generating a regular expression that does not represent the target parameter, which would add noisy data to the list of input values.