Documentation

Ealyn edited this page Jun 27, 2024 · 9 revisions

What's PistolMagazine?


PistolMagazine is a flexible and extensible tool for generating realistic data, suitable for testing, creating sample data for demonstrations, and populating databases. It offers various ways to customize and extend the data generation process to meet different needs.

Key Features✨:

  • Highly Extensible: Easily extend PistolMagazine by defining custom data providers and hooks, allowing you to generate data sets tailored to your specific requirements.
  • Custom Providers: Create your own provider classes to generate specific types of data, making mock data more realistic and relevant.
  • Hook System: Use hooks to execute operations at various stages of data generation, such as preprocessing, data validation, and modification.
  • Diverse Data Models: Construct complex data models using classes like Dict, List, and Timestamp, enabling you to represent and generate a wide range of data structures.
  • Multiple Export Options: Support for exporting generated data in CSV, JSON, and XML formats, or directly importing it into a database to meet various usage needs.

PistolMagazine, with its flexible architecture and diverse functionality, helps developers, testers, and data scientists generate the mock data they need.


Supported Data Types

Str Class

The Str class generates strings of a given data type; the default data type is "word". The data_type parameter is used much as it is in fake and supports various common types.

Initialization Method

Str(data_type="word")
  • data_type (optional): Specifies the type of data to generate, default is "word". The supported types are similar to those used in fake.

mock Method

mock()

Returns a random string of the specified data type.

match Class Method

match(value: str)

Returns the appropriate Str class instance based on the input string value. If value is a digit string, it returns a StrInt instance; if value is a float string, it returns a StrFloat instance; otherwise, it returns a Str instance.
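
The dispatch described above can be illustrated with a small standalone sketch. It is not PistolMagazine's implementation: the function name classify_string is hypothetical, and the rules (str.isdigit for digit strings, a successful float() parse for float strings) are assumptions chosen to mirror the description.

```python
def classify_string(value: str) -> str:
    """Return the name of the Str variant a value would map to (illustrative only)."""
    # A string made only of digits is treated as an integer string.
    if value.isdigit():
        return "StrInt"
    # Anything else that parses as a float is treated as a float string.
    try:
        float(value)
    except ValueError:
        return "Str"
    return "StrFloat"


print(classify_string("123"))    # StrInt
print(classify_string("12.5"))   # StrFloat
print(classify_string("hello"))  # Str
```

Under these assumed rules a signed value such as "-5" is classified as a float string, because str.isdigit rejects the minus sign; the library may handle that case differently.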

Examples of Supported data_type

  • word
  • name
  • address
  • email
  • text

The usage of these types is similar to their usage in fake.

Example

from pistol_magazine import Str

# Generate a "word" type string by default
s = Str()
print(s.mock())

# Generate a "name" type string, e.g. Michelle Mendez
s = Str(data_type="name")
print(s.mock())

# Match and generate the corresponding type instance based on the input value
s = Str.match("123")  # "123" is a digit string, so match returns a StrInt instance
print(s.mock())  # mock() then returns a random digit string, e.g. "8907407153424553311"

Int Class

The Int class is used to generate random integers. You can specify the integer's width and whether it is unsigned.

Initialization Method

Int(byte_nums=64, unsigned=False)
  • byte_nums (optional): Specifies the width of the integer, default is 64. Despite the parameter name, the examples on this page treat the value as a number of bits: the default of 64 yields a 64-bit integer.
  • unsigned (optional): Specifies whether the integer is unsigned, default is False.

mock Method

mock()

Returns a random integer within the specified range.
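
To make "the specified range" concrete, here is a minimal sketch of the range implied by a width and a signedness flag. It assumes the usual two's-complement bounds and that the width is counted in bits; the helper names int_bounds and random_int are hypothetical and are not part of PistolMagazine.

```python
import random
from typing import Tuple


def int_bounds(bits: int, unsigned: bool) -> Tuple[int, int]:
    """Inclusive (low, high) bounds for an integer of the given bit width."""
    if unsigned:
        return 0, 2 ** bits - 1
    # Signed two's-complement range.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def random_int(bits: int = 64, unsigned: bool = False) -> int:
    """Draw a uniformly random integer inside the bounds."""
    low, high = int_bounds(bits, unsigned)
    return random.randint(low, high)


print(int_bounds(8, unsigned=True))    # (0, 255)
print(int_bounds(8, unsigned=False))   # (-128, 127)
print(random_int(bits=6, unsigned=True))
```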

UInt and Int Subclasses

These subclasses are used to generate integers of specific byte sizes, either unsigned or signed.

  • UInt8: Generates an 8-bit unsigned integer
  • Int8: Generates an 8-bit signed integer
  • UInt16: Generates a 16-bit unsigned integer
  • Int16: Generates a 16-bit signed integer
  • UInt32: Generates a 32-bit unsigned integer
  • Int32: Generates a 32-bit signed integer
  • UInt: Generates a 64-bit unsigned integer

Initialization Method

UInt8()
Int8()
UInt16()
Int16()
UInt32()
Int32()
UInt()

mock Method

mock()

Returns a random integer of the specified type and byte size.

Example

from pistol_magazine import Int, UInt, Int8, UInt8, Int16, UInt16, Int32, UInt32

# Generate a 64-bit signed integer
i = Int()
print(i.mock())

# Generate a 64-bit unsigned integer
ui = UInt()
print(ui.mock())

# Generate an 8-bit unsigned integer
ui8 = UInt8()
print(ui8.mock())

# Generate an 8-bit signed integer
i8 = Int8()
print(i8.mock())

# Generate a 16-bit unsigned integer
ui16 = UInt16()
print(ui16.mock())

# Generate a 16-bit signed integer
i16 = Int16()
print(i16.mock())

# Generate a 32-bit unsigned integer
ui32 = UInt32()
print(ui32.mock())

# Generate a 32-bit signed integer
i32 = Int32()
print(i32.mock())

Float Class

The Float class is used to generate random floating-point numbers. You can specify the maximum number of digits to the left and right of the decimal point, and whether the number is unsigned.

Initialization Method

Float(left=2, right=2, unsigned=False)
  • left (optional): Specifies the maximum number of digits to the left of the decimal point, default is 2.
  • right (optional): Specifies the maximum number of digits to the right of the decimal point, default is 2.
  • unsigned (optional): Specifies whether the floating-point number is unsigned, default is False.
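
The left and right limits bound the magnitude of the generated number. The sketch below shows one way to reason about that bound; max_magnitude and random_float are hypothetical helpers written for illustration, and the sampling strategy is an assumption rather than the library's own.

```python
import random


def max_magnitude(left: int, right: int) -> float:
    """Largest value with at most `left` integer digits and `right` decimals."""
    return (10 ** (left + right) - 1) / 10 ** right


def random_float(left: int = 2, right: int = 2, unsigned: bool = False) -> float:
    """Pick a random value whose digits fit inside the left/right limits."""
    top = 10 ** (left + right) - 1
    low = 0 if unsigned else -top
    # Sample an integer and shift the decimal point `right` places.
    return random.randint(low, top) / 10 ** right


print(max_magnitude(2, 2))   # 99.99
print(max_magnitude(3, 4))   # 999.9999
print(random_float(left=3, right=4, unsigned=True))
```

Because the limits are maximums, a generated value can have fewer digits than left or right allow.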

mock Method

mock()

Returns a random floating-point number within the specified range. If unsigned is True, the returned value is never negative; otherwise it may be positive or negative.

get_datatype Method

get_datatype()

Returns the data type name.

Example

from pistol_magazine import Float

# Generate a floating-point number with default range
f = Float()
print(f.mock())  # e.g., 12.34

# Generate an unsigned floating-point number with specified range
f_unsigned = Float(left=3, right=4, unsigned=True)
print(f_unsigned.mock())  # e.g., 123.4567

# Get the data type name
print(f.get_datatype())  # Output: "Float"

Bool Class

The Bool class is used to generate random boolean values. It can also check if a given value is of boolean type.

mock Method

mock()

Returns a random boolean value (True or False).

match Method

match(value)

Checks if the given value is of boolean type.

Example

from pistol_magazine import Bool

# Generate a random boolean value
b = Bool()
print(b.mock())  # e.g., True or False

# Check if a value is of boolean type
print(Bool.match(True))  # Output: True
print(Bool.match(1))     # Output: False

Datetime Class

The Datetime class is used to generate random dates and times. You can specify the date format and time delta.

Initialization Method

Datetime(date_format="%Y-%m-%d %H:%M:%S", **kwargs)
  • date_format (optional): Specifies the format of the date and time, default is "%Y-%m-%d %H:%M:%S".
  • kwargs (optional): Specifies time deltas like days, seconds, microseconds, milliseconds, minutes, hours, weeks.

mock Method

mock()

Returns a date and time around the current time, formatted according to date_format. If a time delta is specified, the value is a random time between the current time minus the delta and the current time plus the delta.
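
The "current time minus and plus the delta" window can be sketched with the standard library alone. This is an illustration of the idea, not the library's code; the function name random_datetime_around_now is hypothetical, and its keyword arguments are simply passed on to datetime.timedelta.

```python
import random
from datetime import datetime, timedelta


def random_datetime_around_now(date_format="%Y-%m-%d %H:%M:%S", **delta_kwargs):
    """Return a formatted time drawn uniformly from [now - delta, now + delta]."""
    now = datetime.now()
    delta = timedelta(**delta_kwargs)
    # Pick a random offset, in seconds, on either side of the current time.
    span = delta.total_seconds()
    offset = random.uniform(-span, span)
    return (now + timedelta(seconds=offset)).strftime(date_format)


print(random_datetime_around_now(date_format="%Y-%m-%d %H:%M", days=2, hours=5))
```

With no keyword arguments the delta is zero, so this sketch returns the current time.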

match Method

match(value)

Checks if the given string value matches any of the defined date formats.
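
Format matching of this kind is commonly done by trying each candidate format with datetime.strptime. The sketch below assumes a list of three formats; the first two appear in the example on this page and the third is an assumption. The name match_date_format is hypothetical.

```python
from datetime import datetime
from typing import Optional

# Candidate formats; the real library may define a different set.
CANDIDATE_FORMATS = [
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d",
]


def match_date_format(value: str) -> Optional[str]:
    """Return the first format that parses `value`, or None if none does."""
    for fmt in CANDIDATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            continue
        return fmt
    return None


print(match_date_format("2024-06-26 15:32:45"))  # %Y-%m-%d %H:%M:%S
print(match_date_format("2024-06-26T15:32:45"))  # %Y-%m-%dT%H:%M:%S
print(match_date_format("not a date"))           # None
```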

get_datatype Method

get_datatype()

Returns the data type name and date format.

Example

from pistol_magazine import Datetime

# Generate current date and time with default format
dt = Datetime()
print(dt.mock())  # e.g., "2024-06-26 15:32:45"

# Generate a random date and time with specified format and time delta
dt_with_delta = Datetime(date_format="%Y-%m-%d %H:%M", days=2, hours=5)
print(dt_with_delta.mock())  # e.g., "2024-06-24 10:27"

# Check if a date string matches any of the defined date formats
print(Datetime.match("2024-06-26 15:32:45"))  # Output: "%Y-%m-%d %H:%M:%S"
print(Datetime.match("2024-06-26T15:32:45"))  # Output: "%Y-%m-%dT%H:%M:%S"

# Get the data type name and date format
print(dt.get_datatype())  # Output: "Datetime_%Y-%m-%d %H:%M:%S"

Timestamp Class

The Timestamp class is used to generate random timestamps with specified precision and time delta.

Initialization Method

Timestamp(times=13, **kwargs)
  • times (optional): Specifies the precision of the timestamp, can be 10 or 13, default is 13. 13 indicates millisecond precision, 10 indicates second precision.
  • kwargs (optional): Specifies time deltas like days, seconds, microseconds, milliseconds, minutes, hours, weeks.

mock Method

mock()

Returns a timestamp around the current time, with the configured precision. If a time delta is specified, the value is a random time between the current time minus the delta and the current time plus the delta.

match Method

match(value)

Checks if the given value is a valid timestamp and returns its precision.
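
The two precisions correspond to the digit count of a present-day Unix timestamp: 10 digits for seconds, 13 for milliseconds. The following sketch infers precision from digit count; timestamp_precision is a hypothetical helper and the validity rule (a positive integer with 10 or 13 digits) is an assumption.

```python
import time
from typing import Optional


def timestamp_precision(value) -> Optional[int]:
    """Return 10 or 13 for second/millisecond timestamps, otherwise None."""
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return None
    digits = len(str(value))
    return digits if value > 0 and digits in (10, 13) else None


now = time.time()
seconds = int(now)         # 10-digit timestamp (second precision)
millis = int(now * 1000)   # 13-digit timestamp (millisecond precision)

print(timestamp_precision(1656089173123))  # 13
print(timestamp_precision(1656089173))     # 10
print(timestamp_precision(42))             # None
```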

get_datatype Method

get_datatype()

Returns the data type name and timestamp precision.

Example

from pistol_magazine import Timestamp

# Generate a random timestamp with default precision
ts = Timestamp()
print(ts.mock())  # e.g., 1656089173123

# Generate a random timestamp with specified time delta
ts_with_delta = Timestamp(days=2, hours=5)
print(ts_with_delta.mock())  # e.g., 1656005173123

# Check if a value is a valid timestamp and return its precision
print(Timestamp.match(1656089173123))  # Output: 13

# Get the data type name and timestamp precision
print(ts.get_datatype())  # Output: "Timestamp_13"

List Class

The List class is used to generate random lists containing different types of field objects.

Initialization Method

List(list_fields=None)
  • list_fields (optional): Specifies a list of field objects for the list. If not specified, it defaults to including a string, an integer, and a float field object.

mock Method

mock(to_json=False)

Returns a list of randomly generated data from the list of field objects. If the to_json parameter is True, the returned data will be serialized into JSON format.

get_datatype Method

get_datatype()

Returns a list of data type names for each field object in the list.

Example

from pistol_magazine import List, Datetime, Timestamp, Str, Int, Float

# Generate a random list with default fields
lst = List()
print(lst.mock())  # e.g. ["involve", 42, 3.14]

# Generate a random list with custom fields
custom_list_format = [
    Datetime(Datetime.D_FORMAT_YMD, days=2),
    Timestamp(Timestamp.D_TIMEE10, days=2),
    Float(left=2, right=4, unsigned=True),
    Str(data_type="file_name"),
    Int(byte_nums=6, unsigned=True),
]
lst_custom = List(list_fields=custom_list_format)
print(lst_custom.mock())  # e.g., ["2024-06-25 21:45:16", 1719483880, 76.4993, "coach.csv", 62]

# Convert the generated list into JSON format
print(lst.mock(to_json=True))  # e.g., '["coach", 42, 3.14]' (values are random)

# Get the data type names for each field object in the list
print(lst.get_datatype())  # Output: ["Str", "Int", "Float"]

Dict Class

The Dict class is used to generate random dictionaries containing different types of field objects.

Initialization Method

Dict(dict_fields=None)
  • dict_fields (optional): Specifies the field objects in the dictionary. If not specified, it defaults to including an integer, a string, and a timestamp field object.

mock Method

mock(to_json=False)

Returns a dictionary of randomly generated data from the field objects. If the to_json parameter is True, the returned data will be serialized into JSON format.
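
The way container fields delegate to their children, including nesting and optional JSON serialization, can be sketched with a few toy classes. IntField, ListField, and DictField are hypothetical stand-ins written for this illustration; they are not the library's Int, List, and Dict classes.

```python
import json
import random


class IntField:
    """Toy leaf field: produces a small random integer."""

    def mock(self):
        return random.randint(0, 100)


class ListField:
    """Toy container: mocks each child field in order."""

    def __init__(self, fields):
        self.fields = fields

    def mock(self, to_json=False):
        data = [field.mock() for field in self.fields]
        return json.dumps(data) if to_json else data


class DictField:
    """Toy container: mocks each child field under its key."""

    def __init__(self, fields):
        self.fields = fields

    def mock(self, to_json=False):
        data = {key: field.mock() for key, field in self.fields.items()}
        return json.dumps(data) if to_json else data


shape = DictField({"a": IntField(), "c": ListField([IntField(), IntField()])})
print(shape.mock())
print(shape.mock(to_json=True))
```

Each container calls mock() on its children, so nesting a list inside a dictionary needs no special handling.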

get_datatype Method

get_datatype()

Returns a dictionary of data type names for each field object in the dictionary.

Example

from pistol_magazine import Dict, Int, Str, Timestamp, Float, List, Datetime, StrInt

# Generate a random dictionary with default fields
d = Dict()
print(d.mock())  # e.g., {"a": 42, "b": "random_string", "c": 1656089173123}

# Generate a random dictionary with custom fields
custom_dict_format = {
    "a": Float(left=2, right=4, unsigned=True),
    "b": Timestamp(Timestamp.D_TIMEE10, days=2),
    "C": List(
        [
            Datetime(Datetime.D_FORMAT_YMD_T, weeks=2),
            StrInt(byte_nums=6, unsigned=True),
        ]
    ),
}
d_custom = Dict(dict_fields=custom_dict_format)
print(d_custom.mock())  # e.g., {"a": 25.5595, "b": 1719450850, "C": ["2024-07-05T03:00:27", "11"]}

# Convert the generated dictionary into JSON format
print(d.mock(to_json=True))  # e.g., '{"a": 42, "b": "land", "c": 1656089173123}' (values are random)

# Get the data type names for each field object in the dictionary
print(d.get_datatype())  # Output: {"a": "Int", "b": "Str", "c": "Timestamp"}

Custom Providers, Hooks To Mock Data

Provider

To define custom data providers, use the @provider decorator to designate a class as a data provider. Below is an example of defining a MyProvider class with a method user_status that returns either "ACTIVE" or "INACTIVE":

from pistol_magazine import provider
from random import choice

@provider
class MyProvider:
    def user_status(self):
        return choice(["ACTIVE", "INACTIVE"])

Hook

Hooks are functions executed at different stages of data generation. Use the @hook decorator to define hooks. Specify the hook_type, order, and hook_set parameters to control when and how hooks are triggered. For example:

from pistol_magazine.hooks.hooks import hook

@hook('pre_generate', order=1, hook_set='SET1')
def pre_generate_first_hook():
    print("Start Mocking User Data")

@hook('after_generate', order=1, hook_set="SET1")
def after_generate_first_hook(data):
    data['user_status'] = 'ACTIVE' if data['user_age'] >= 18 else 'INACTIVE'
    return data

@hook('final_generate', order=1, hook_set="SET1")
def final_generate_second_hook(data):
    # Suppose there is a function send_to_message_queue(data) to send data to the message queue
    pass

Hook Function Parameters

  • hook_type: Type of hook ('pre_generate', 'after_generate', 'final_generate').
    • pre_generate: Executes operations before generating all data. Suitable for tasks like logging or starting external services.
    • after_generate: Executes operations after generating each data entry but before final processing. Suitable for tasks like data validation or conditional modifications.
    • final_generate: Executes operations after generating and processing all data entries. Suitable for final data processing, sending data to message queues, or performing statistical analysis.
  • order: Execution order of the hook (lower values execute earlier).
  • hook_set: Name of the hook set to group related hooks.
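
How hook_type, order, and hook_set interact can be illustrated with a toy registry. This is a simplified model under stated assumptions, not PistolMagazine's hook machinery: toy_hook and run_hooks are hypothetical names, and the rule that a hook returning None leaves the data unchanged is an assumption.

```python
from collections import defaultdict

# Maps (hook_type, hook_set) to a list of (order, function) pairs.
_registry = defaultdict(list)


def toy_hook(hook_type, order=0, hook_set="default"):
    """Decorator that registers a function under a hook type and hook set."""
    def decorator(func):
        _registry[(hook_type, hook_set)].append((order, func))
        return func
    return decorator


def run_hooks(hook_type, hook_set, data):
    """Run the matching hooks in ascending order, threading data through."""
    for _, func in sorted(_registry[(hook_type, hook_set)], key=lambda pair: pair[0]):
        result = func(data)
        if result is not None:
            data = result
    return data


@toy_hook("after_generate", order=2, hook_set="DEMO")
def add_status(data):
    data["user_status"] = "ACTIVE" if data["user_age"] >= 18 else "INACTIVE"
    return data


@toy_hook("after_generate", order=1, hook_set="DEMO")
def default_age(data):
    data.setdefault("user_age", 18)
    return data


print(run_hooks("after_generate", "DEMO", {}))
# {'user_age': 18, 'user_status': 'ACTIVE'}
```

default_age is registered second but runs first because of its lower order value; if the order were reversed, add_status would fail on the missing user_age key.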

Mock

Utilize the mock method provided by your custom data model class (e.g., UserInfo) to generate mock data. Customize generation options such as JSON serialization, number of entries, key generation, output format, and hook set.

Using Custom Data Models

Define a class that inherits from DataMocker, such as UserInfo, to generate structured data. Customize fields using various field types (e.g., Int, Str, Timestamp, ProviderField, Dict, List) provided by the mock data framework.

Example: UserInfo Class

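from pistol_magazine import DataMocker, Str, Int, Timestamp, Bool, ProviderField, Dict, List, StrInt, Datetime, Float

# MyProvider is the custom provider class defined in the Provider section above;
# it is user code and is not imported from pistol_magazine.
class UserInfo(DataMocker):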
    create_time: Timestamp = Timestamp(Timestamp.D_TIMEE10, days=2)
    user_name: Str = Str(data_type="name")
    user_email: Str = Str(data_type="email")
    user_age: Int = Int(byte_nums=6, unsigned=True)
    user_status: ProviderField = ProviderField(MyProvider().user_status)
    user_marriage: Bool = Bool()
    user_dict: Dict = Dict({
        "a": Float(left=2, right=4, unsigned=True),
        "b": Timestamp(Timestamp.D_TIMEE10, days=2)
    })
    user_list: List = List([
        Datetime(Datetime.D_FORMAT_YMD_T, weeks=2),
        StrInt(byte_nums=6, unsigned=True)
    ])

mock Method

mock(
        to_json: bool = False,
        num_entries: Optional[int] = None,
        key_generator: Optional[Callable[[], str]] = None,
        as_list: bool = False,
        hook_set: Optional[str] = 'default'
    )
  • to_json (bool): Serialize generated data as JSON (default False).
  • num_entries (int): Number of data entries to generate (default None for single entry).
  • key_generator (Callable[[], str]): Function to generate dictionary keys (default lambda: str(uuid.uuid4())).
  • as_list (bool): Return generated data as a list (default False).
  • hook_set (str): Name of the hook set to use (default 'default').
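
The interplay of num_entries, key_generator, as_list, and to_json can be sketched as follows. The behaviour is inferred from the examples on this page (a keyed dictionary when as_list is False, a list when it is True) and is an assumption about the library; collect_entries and generate_one are hypothetical names.

```python
import json
import uuid


def collect_entries(generate_one, to_json=False, num_entries=None,
                    key_generator=None, as_list=False):
    """Assemble generated entries the way the options above suggest."""
    if num_entries is None:
        # Single entry: return it directly.
        result = generate_one()
    elif as_list:
        result = [generate_one() for _ in range(num_entries)]
    else:
        # Keyed output: one generated key per entry.
        make_key = key_generator or (lambda: str(uuid.uuid4()))
        result = {make_key(): generate_one() for _ in range(num_entries)}
    return json.dumps(result) if to_json else result


def one_user():
    return {"user_name": "example", "user_age": 30}


print(collect_entries(one_user))
print(collect_entries(one_user, num_entries=2, as_list=True))
print(collect_entries(one_user, num_entries=2))
```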

Example

from pprint import pprint
import uuid

pprint(
    UserInfo().mock(
        num_entries=2, 
        as_list=False, 
        to_json=False, 
        hook_set='SET1', 
        key_generator=lambda: str(uuid.uuid4())
    )
)

"""
e.g.
Start Mocking User Data
{'7f79875c-fe79-401b-9386-e68417dda747': {'create_time': 1719541419,
                                          'user_age': 33,
                                          'user_dict': {'a': 0.616,
                                                        'b': 1719549151},
                                          'user_email': 'fullercheryl@example.com',
                                          'user_list': ['2024-06-22T23:38:48',
                                                        '48'],
                                          'user_marriage': True,
                                          'user_name': 'Tiffany Blankenship',
                                          'user_status': 'ACTIVE'},
 '97a13622-0db4-4a1e-b176-bad1cd23777d': {'create_time': 1719458525,
                                          'user_age': 14,
                                          'user_dict': {'a': 79.333,
                                                        'b': 1719410667},
                                          'user_email': 'qnicholson@example.net',
                                          'user_list': ['2024-06-28T20:37:57',
                                                        '17'],
                                          'user_marriage': True,
                                          'user_name': 'James Nichols',
                                          'user_status': 'INACTIVE'}}
"""

Support Data Export

PistolMagazine supports exporting generated data to CSV, JSON, and XML files, and to a MySQL database. Exporters can also be used in conjunction with hook functions.


Exporter Classes

The following examples demonstrate how to export data to CSV, JSON, and XML files, as well as how to export data to a MySQL database.

Export to CSV

To export data to a CSV file, use the CSVExporter class:

from pistol_magazine import CSVExporter

data = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"}
]

csv_exporter = CSVExporter()
csv_exporter.export(data, 'output.csv')
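
For readers curious about what exporting a list of dictionaries to CSV involves, here is a standard-library sketch. It is not CSVExporter's implementation: rows_to_csv is a hypothetical helper, it writes to a string instead of a file, and it assumes a non-empty list whose rows all share the keys of the first row.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize a non-empty list of same-shaped dicts to CSV text."""
    buffer = io.StringIO()
    # The header comes from the first row's keys.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "Los Angeles"},
]
print(rows_to_csv(rows))
# name,age,city
# Alice,25,New York
# Bob,30,Los Angeles
```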

Export to JSON

To export data to a JSON file, use the JSONExporter class:

from pistol_magazine import JSONExporter

data = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"}
]

json_exporter = JSONExporter()
json_exporter.export(data, 'output.json')

Export to XML

To export data to an XML file, use the XMLExporter class:

from pistol_magazine import XMLExporter

data_xml = {
    "users": [
        {
            "id": 1,
            "name": "Alice",
            "email": "alice@example.com",
            "profile": {
                "age": 30,
                "city": "New York"
            }
        },
        {
            "id": 2,
            "name": "Bob",
            "email": "bob@example.com",
            "profile": {
                "age": 25,
                "city": "Los Angeles"
            }
        }
    ]
}

xml_exporter = XMLExporter()
xml_exporter.export(data_xml, 'output.xml')
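
A nested dictionary like data_xml maps naturally onto an XML tree. The sketch below shows one possible mapping with xml.etree.ElementTree; the element names "root" and "item" are assumptions made for this illustration, and the layout XMLExporter actually produces is not documented here. build_element is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def build_element(tag, value):
    """Recursively convert dicts, lists, and scalars into an XML element."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        # Each key becomes a child element.
        for key, child in value.items():
            element.append(build_element(key, child))
    elif isinstance(value, list):
        # List entries are wrapped in generic <item> elements.
        for item in value:
            element.append(build_element("item", item))
    else:
        element.text = str(value)
    return element


sample = {"users": [{"id": 1, "name": "Alice", "profile": {"age": 30}}]}
root = build_element("root", sample)
print(ET.tostring(root, encoding="unicode"))
# <root><users><item><id>1</id><name>Alice</name><profile><age>30</age></profile></item></users></root>
```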

Export to MySQL Database

To export data to a MySQL database, use the DBExporter class:

from pistol_magazine import DBExporter

data = [
    {"name": "Alice", "age": 25, "city": "New York"},
    {"name": "Bob", "age": 30, "city": "Los Angeles"},
    {"name": "Charlie", "age": 35, "city": "Chicago"}
]

db_config = {
    "user": "User",
    "password": "Password",
    "host": "Localhost",
    "port": 3306,
    "database": "DB"
}

db_exporter = DBExporter(table_name='TableName', db_config=db_config)
db_exporter.export(data)

Built-in Provider

PistolMagazine provides several built-in providers for common use cases:

CyclicParameterProvider

This class provides parameters in a cyclic manner from the given list. If no list is provided, it uses a default list of parameters.

Usage

from pistol_magazine import DataMocker, ProviderField, CyclicParameterProvider

class Param(DataMocker):
    param: ProviderField = ProviderField(
        CyclicParameterProvider(parameter_list=[10, 11, 12]).get_next_param
    )
    def param_info(self):
        return self.mock(num_entries=6, as_list=True)

param = Param()
print(param.param_info())

Example Output

[{'param': 10}, {'param': 11}, {'param': 12}, {'param': 10}, {'param': 11}, {'param': 12}]
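
The wrap-around behaviour in this output can be reproduced with itertools.cycle. The class below is a minimal stand-in written for illustration; CyclicValues and get_next are hypothetical names, not the library's implementation.

```python
from itertools import cycle


class CyclicValues:
    """Hand out values from a list in order, restarting at the end."""

    def __init__(self, values):
        self._iterator = cycle(values)

    def get_next(self):
        return next(self._iterator)


provider = CyclicValues([10, 11, 12])
print([provider.get_next() for _ in range(6)])  # [10, 11, 12, 10, 11, 12]
```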

FixedValueProvider

This class always returns a fixed value. If no value is provided, it uses a default fixed value.

Usage

from pistol_magazine import DataMocker, ProviderField, FixedValueProvider

class Param(DataMocker):
    param: ProviderField = ProviderField(
        FixedValueProvider(fixed_value="STATIC").get_fixed_value
    )
    def param_info(self):
        return self.mock(num_entries=2, as_list=True)

param = Param()
print(param.param_info())

Example Output

[{'param': 'STATIC'}, {'param': 'STATIC'}]

IncrementalValueProvider

This class provides incrementing values starting from a given value.

Usage

from pistol_magazine import DataMocker, ProviderField, IncrementalValueProvider

class Param(DataMocker):
    param: ProviderField = ProviderField(
        IncrementalValueProvider(start=0, step=2).get_next_value
    )
    def param_info(self):
        return self.mock(num_entries=3, as_list=True)

param = Param()
print(param.param_info())

Example Output

[{'param': 0}, {'param': 2}, {'param': 4}]
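
The same start and step semantics can be sketched with itertools.count. IncrementingValues and get_next are hypothetical names used only for this illustration.

```python
from itertools import count


class IncrementingValues:
    """Return start, start + step, start + 2 * step, and so on."""

    def __init__(self, start=0, step=1):
        self._counter = count(start, step)

    def get_next(self):
        return next(self._counter)


provider = IncrementingValues(start=0, step=2)
print([provider.get_next() for _ in range(3)])  # [0, 2, 4]
```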

RandomChoiceFromListProvider

This class provides random values from the given list. If no list is provided, it uses a default list of values.

Usage

from pistol_magazine import DataMocker, ProviderField, RandomChoiceFromListProvider

class Param(DataMocker):
    param: ProviderField = ProviderField(
        RandomChoiceFromListProvider(value_list=["value1", "value2", "value3"]).get_random_value
    )
    def param_info(self):
        return self.mock(num_entries=4, as_list=True)

param = Param()
print(param.param_info())

Example Output

[{'param': 'value3'}, {'param': 'value1'}, {'param': 'value2'}, {'param': 'value1'}]

RandomFloatInRangeProvider

This class provides random float values within a specified range and precision.

Usage

from pistol_magazine import DataMocker, ProviderField, RandomFloatInRangeProvider

class Param(DataMocker):
    param: ProviderField = ProviderField(
        RandomFloatInRangeProvider(start=0.00, end=4.00, precision=4).get_random_float
    )
    def param_info(self):
        return self.mock(num_entries=3, as_list=True)

param = Param()
print(param.param_info())

Example Output

[{'param': 3.8797}, {'param': 3.4613}, {'param': 2.193}]
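
A range-and-precision generator of this kind can be approximated with random.uniform and round. This is a sketch under that assumption, not the provider's actual code; random_float_in_range is a hypothetical name.

```python
import random


def random_float_in_range(start, end, precision):
    """Draw a float between start and end, rounded to `precision` decimals."""
    return round(random.uniform(start, end), precision)


samples = [random_float_in_range(0.00, 4.00, 4) for _ in range(3)]
print(samples)
```

Rounding does not pad with zeros, which is why a value such as 2.193 can appear with fewer than four decimal places.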