PharmaForest/sas_faker

sas_faker

SAS package to create dummy data in CDISC format for clinical trials.

Purpose: A macro to generate dummy clinical trial data. It creates datasets in SDTM (DM, AE, SV, VS) and ADaM (ADSL, ADAE, ADVS, ADTTE) formats, and generates pseudo subject data, vital signs, study visits, and adverse events based on the user-specified number of groups and sample size per group.

/*example*/
%sas_faker(n_groups=2,
                 n_per_group=50, 
                 output_lib=WORK)

dm domain

The dummy data simulates a randomized, parallel-group study in which discontinuation and death occur with low probability.

ae domain

For licensing reasons, the MedDRA-related variables use names that differ from the standard CDISC variable names, the event names are generated dummies, and the coding dictionary has the same structure as MedDRA but is a separate dictionary specific to this package. Toxicity-related variables such as severity are generated so that higher values occur less often.

vs domain

Visits are synchronized with the VISIT information in the SV domain. Values are stable from participant to participant and fluctuate only by random error. No systematic differences between groups or over time are built into the values.
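The visit synchronization described above can be spot-checked after generation. The following is a minimal sketch, not part of the package: it assumes the datasets were written to WORK and that VS and SV carry the standard SDTM keys USUBJID and VISITNUM, which this README does not confirm.

```sas
/* Assumption: VS and SV are in WORK and both contain USUBJID and VISITNUM */
proc sql;
  /* Count VS records whose subject/visit has no matching SV record */
  select count(*) as n_unmatched_vs
  from vs as a
  where not exists
        (select 1
         from sv as b
         where a.USUBJID = b.USUBJID
           and a.VISITNUM = b.VISITNUM);
quit;
```

If the two domains are synchronized as described, the count should be zero.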

sv domain

Visits are synchronized with the VISIT information in the Findings-class domain (here, VS).

adsl dataset

ADSL is derived from the SDTM data, mainly the DM domain. Its values are intended to be consistent with the SDTM data; for example, WEIGHTBL agrees with the VS domain information.

adae dataset

Created from the AE domain and ADSL.

advs dataset

Created from the VS domain and ADSL.

adtte dataset

Event times are adjusted so that the Kaplan-Meier curves look different for each treatment group (TRTP). If there are many groups, the same distribution will appear in more than one group.

proc lifetest data=adtte
  plots=survival(atrisk=1 7 14 21 28 35 42 49 56 63 70 77);
  time AVAL * CNSR(1);
  strata TRTPN ;
run;
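
Before plotting, it can help to see how events and censored records are spread across the groups. A minimal sketch, assuming ADTTE was written to WORK and that CNSR=1 marks a censored record, as the PROC LIFETEST call above implies:

```sas
/* Cross-tabulate treatment group by censoring flag (counts and row percents) */
proc freq data=adtte;
  tables TRTPN * CNSR / nocol nopercent;
run;
```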

%sas_faker

Purpose: A macro to generate dummy clinical trial data. It creates datasets in SDTM (DM, AE, SV, VS) and ADaM (ADSL, ADAE, ADVS, ADTTE) formats, and generates pseudo subject data, vital signs, study visits, and adverse events based on the user-specified number of groups and sample size per group.

Author: Yutaka Morioka
Date: July 2, 2025
Version: 0.1

Input Parameters:

  • n_groups: Number of groups (default=2)
  • n_per_group: Number of subjects per group (default=50)
  • output_lib: Output library (default=WORK)
  • seed: Random seed (default=123456)
  • create_dm: Flag to generate DM dataset (Y/N, default=Y)
  • create_ae: Flag to generate AE dataset (Y/N, default=Y)
  • create_sv: Flag to generate SV dataset (Y/N, default=Y)
  • create_vs: Flag to generate VS dataset (Y/N, default=Y)
  • create_adsl: Flag to generate ADSL dataset (Y/N, default=Y)
  • create_adae: Flag to generate ADAE dataset (Y/N, default=Y)
  • create_adtte: Flag to generate ADTTE dataset (Y/N, default=Y)

The flags only control which datasets are written to the output library. All datasets are first generated together, consistently, in the background, so if create_dm is set to N, for example, no DM domain is created but ADSL is not affected.
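
As a minimal sketch of the flags, the call below uses only parameters listed above and skips the AE and ADAE outputs. The choice of flags is purely illustrative.

```sas
/* Illustrative only: write every dataset except AE and ADAE to WORK */
%sas_faker(n_groups=2,
           n_per_group=50,
           output_lib=WORK,
           create_ae=N,
           create_adae=N)

/* List the datasets now present in the output library */
proc contents data=WORK._all_ nods;
run;
```

The PROC CONTENTS listing is a quick way to confirm which datasets were actually written.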

Output:

  • SDTM datasets: DM, AE, SV, VS (if specified)
  • ADaM datasets: ADSL, ADAE, ADVS, ADTTE (if specified)

Notes:

  • Uses a random seed for reproducible data generation.
  • Utilizes the minimize_charlen macro to optimize character variable lengths.
  • Generated data mimics the structure of clinical trial data but is not real.
  • Variable names related to the MedDRA dictionary (e.g., F_AELLTCD, F_AEPTCD) are prefixed with "F_" to avoid infringing intellectual property rights.
  • Adverse event terms and codes (AETERM, AEDECOD, AEBODSYS, etc.) are structured systematically but are fictitious dictionary coding data, unrelated to the actual MedDRA dictionary.
/*example*/
/* Generate data with 3 treatment groups, 100 subjects per group. */
%sas_faker(
  n_groups=3,
  n_per_group=100,
  seed=789012)
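
The note on reproducibility can be checked by running the macro twice with the same seed and comparing the results. This is only a sketch: it assumes the macro writes DM to WORK and leaves other WORK datasets untouched, and the name dm_run1 is a hypothetical scratch dataset.

```sas
/* First run: keep a copy of DM under a scratch name (dm_run1 is hypothetical) */
%sas_faker(n_groups=2, n_per_group=50, output_lib=WORK, seed=789012)
data dm_run1;
  set dm;
run;

/* Second run with the same seed overwrites WORK.DM */
%sas_faker(n_groups=2, n_per_group=50, output_lib=WORK, seed=789012)

/* With a fixed seed, PROC COMPARE is expected to report no unequal values */
proc compare base=dm_run1 compare=dm;
run;
```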

version history

0.1.1(01December2025): bug-fix(ADAE)
0.1.0(03July2025): Initial version

What is SAS Packages?

This package is built on top of the SAS Packages Framework (SPF), developed by Bartosz Jablonski.

For more information about the framework, see SAS Packages Framework.

You can also find more SAS Packages (SASPacs) in the SAS Packages Archive(SASPAC).

How to use SAS Packages? (quick start)

1. Set-up SAS Packages Framework

First, create a directory for your packages and assign a packages fileref to it.

filename packages "\path\to\your\packages";

Second, enable the SAS Packages Framework. (If you don't have the SAS Packages Framework installed, follow the instructions in the SPF documentation to install it.)

%include packages(SPFinit.sas);

2. Install SAS package

Install the SAS package you want to use with SPF's %installPackage() macro.

  • For packages located in SAS Packages Archive(SASPAC) run:

    %installPackage(packageName)
  • For packages located in PharmaForest run:

    %installPackage(packageName, mirror=PharmaForest)
  • For packages located at some network location run:

    %installPackage(packageName, sourcePath=https://some/internet/location/for/packages)

    (e.g. %installPackage(ABC, sourcePath=https://github.com/SomeRepo/ABC/raw/main/))

3. Load SAS package

Load the SAS package you want to use with SPF's %loadPackage() macro.

%loadPackage(packageName)
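
Put together for this package, the three steps might look like the sketch below. It assumes the package is registered on the PharmaForest mirror under the name sas_faker, and the directory path is a placeholder to replace with your own.

```sas
/* Placeholder path: point this at your own packages directory */
filename packages "\path\to\your\packages";

/* Enable the SAS Packages Framework */
%include packages(SPFinit.sas);

/* Install from the PharmaForest mirror, then load the package */
%installPackage(sas_faker, mirror=PharmaForest)
%loadPackage(sas_faker)

/* Generate the dummy datasets in WORK */
%sas_faker(n_groups=2, n_per_group=50, output_lib=WORK)
```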

Enjoy!

About

mirror of sas_faker

License

MIT (see LICENSE and license.sas)
