i#7216: noise generator basic structure #7283

Open
wants to merge 43 commits into
base: master
Changes from 22 commits
d426b35
Save work.
edeiana Feb 6, 2025
f445f78
Save work.
edeiana Feb 9, 2025
f70f5f5
Save work.
edeiana Feb 12, 2025
3814809
Minor improvements.
edeiana Feb 12, 2025
cbe2176
Save work.
edeiana Feb 18, 2025
9f1a73c
Cleanup.
edeiana Feb 18, 2025
d99c991
Fixed result aggregation in scheduler issue.
edeiana Feb 18, 2025
e143e2f
Do not count tid, pid, and tid exit when generating synthetic records.
edeiana Feb 18, 2025
db8aa84
Moved noise_generator into scheduler dir.
edeiana Feb 18, 2025
03d79a2
Added noise_generator_num_records flag and noise_generator unit test.
edeiana Feb 21, 2025
2e70cd4
Minor improvements.
edeiana Feb 21, 2025
ae2d44f
Added end-to-end schedule stats test with noise_generator.
edeiana Feb 21, 2025
c906bf7
Added changes to doc.
edeiana Feb 21, 2025
d7a4365
Removed references from doc, as doxygen refuses to generate a link.
edeiana Feb 21, 2025
6e21191
Cannot rely on thread numbers or order for schedule_stats output.
edeiana Feb 21, 2025
d1a9047
Added get_noise_generator_end().
edeiana Feb 22, 2025
ddc8e49
Separate logic to generate noise records from the logic
edeiana Feb 22, 2025
d823067
Add multiple noise generators with -noise_generator_add.
edeiana Feb 24, 2025
09e2a0d
Fixed dr_option type of -noise_generator_add from bool to uint64_t.
edeiana Feb 24, 2025
04fabbc
End-to-end noise generator test now using 3 noise generators instead of
edeiana Feb 24, 2025
fddaecd
Updated doc.
edeiana Feb 24, 2025
9fcf581
Add default value of 0 to pid and tid of noise_generator_t.
edeiana Feb 24, 2025
ca8a792
Removed noise_generator global flags, except -noise_generator_enable.
edeiana Feb 27, 2025
382230e
Added noise_generator_info_t and fixed doc.
edeiana Feb 27, 2025
21c8216
Removed relative path to trace_entry.h include.
edeiana Feb 27, 2025
863a3be
Merge branch 'master' into i7216-noise-generator-initial
edeiana Feb 27, 2025
212cefb
noise_generator_info is now a unique ptr in scheduler options.
edeiana Feb 27, 2025
e62c3c2
clang format.
edeiana Feb 27, 2025
5a0a4a6
Missing include added.
edeiana Feb 27, 2025
9a22912
Fixed typo in comment.
edeiana Feb 27, 2025
91c0b2e
Added noise_generator_factory_t.
edeiana Mar 6, 2025
ec872b0
Ensure schedule_stats_noise_generator always simulates on 4 cores.
edeiana Mar 6, 2025
410c108
Report error if adding noise generators fails.
edeiana Mar 6, 2025
45a6841
Improved comment.
edeiana Mar 6, 2025
b3dc6de
Improved comment.
edeiana Mar 6, 2025
1bf889c
noise_generator_factory_t only generates a single noise generator
edeiana Mar 8, 2025
6a9a4d3
Trying to schedule always on 4 cores in schedule_stats_noise_generator
edeiana Mar 8, 2025
2223afc
Ignore core count in schedule_stats_noise_generator test,
edeiana Mar 8, 2025
4509a31
Minor improvements.
edeiana Mar 9, 2025
a344710
Improved noise generator scheduler unit test.
edeiana Mar 9, 2025
1a35d61
Minor improvements.
edeiana Mar 9, 2025
4c746e0
Typo in comment fixed.
edeiana Mar 9, 2025
3c4968f
Clarification comment on why we need a timestamp
edeiana Mar 9, 2025
6 changes: 6 additions & 0 deletions api/docs/release.dox
@@ -141,6 +141,12 @@ Further non-compatibility-affecting changes include:
instructions in the trace. This works for traces that have embedded instruction
encodings in them, and also for legacy traces without embedded encodings where the
encodings are obtained from the application binaries instead.
- Added noise_generator_t scaffolding as a reader_t to produce synthetic trace records
in the drmemtrace framework.
- Added -noise_generator_add and -noise_generator_num_records as flags and scheduler
options in the drmemtrace framework. -noise_generator_add determines how many noise
generators to add to the target trace(s), while -noise_generator_num_records determines
how many synthetic records each noise generator will produce.

**************************************************
<hr>
2 changes: 2 additions & 0 deletions clients/drcachesim/CMakeLists.txt
@@ -279,6 +279,7 @@ set(drcachesim_srcs
scheduler/scheduler_replay.cpp
scheduler/scheduler_fixed.cpp
scheduler/speculator.cpp
scheduler/noise_generator.cpp
analyzer.cpp
analyzer_multi.cpp
${client_and_sim_srcs}
@@ -346,6 +347,7 @@ add_exported_library(drmemtrace_analyzer STATIC
scheduler/scheduler_replay.cpp
scheduler/scheduler_fixed.cpp
scheduler/speculator.cpp
scheduler/noise_generator.cpp
common/trace_entry.cpp
reader/reader.cpp
reader/config_reader.cpp
5 changes: 5 additions & 0 deletions clients/drcachesim/analyzer_multi.cpp
@@ -571,6 +571,11 @@ analyzer_multi_tmpl_t<RecordType, ReaderType>::analyzer_multi_tmpl_t()
#endif
}

// Add the noise generator before init_scheduler(), where we eventually add
// the noise generator as another input workload.
sched_ops.noise_generator_add = op_noise_generator_add.get_value();
sched_ops.noise_generator_num_records = op_noise_generator_num_records.get_value();

if (!indirs.empty()) {
std::vector<std::string> tracedirs;
for (const std::string &indir : indirs)
12 changes: 12 additions & 0 deletions clients/drcachesim/common/options.cpp
@@ -714,6 +714,18 @@ droption_t<std::string>
"Cache hierarchy configuration file",
"The full path to the cache hierarchy configuration file.");

droption_t<uint64_t> op_noise_generator_add(
DROPTION_SCOPE_FRONTEND, "noise_generator_add", 0, "Enables noise generators.",
"Adds the specified number of noise generators, which enables the scheduler to "
"interleave trace records with synthetic records. Each noise generator behaves like "
"a thread in its own process.");

droption_t<uint64_t> op_noise_generator_num_records(
DROPTION_SCOPE_FRONTEND, "noise_generator_num_records", 0,
"Number of synthetic records the noise generator produces.",
"Determines the number of synthetic records produced by the noise generator "
"excluding TRACE_TYPE_THREAD and TRACE_TYPE_PID.");

// XXX: if we separate histogram + reuse_distance we should move this with them.
droption_t<unsigned int>
op_report_top(DROPTION_SCOPE_FRONTEND, "report_top", 10,
2 changes: 2 additions & 0 deletions clients/drcachesim/common/options.h
@@ -178,6 +178,8 @@ extern dynamorio::droption::droption_t<dynamorio::droption::bytesize_t> op_warmu
extern dynamorio::droption::droption_t<double> op_warmup_fraction;
extern dynamorio::droption::droption_t<dynamorio::droption::bytesize_t> op_sim_refs;
extern dynamorio::droption::droption_t<std::string> op_config_file;
extern dynamorio::droption::droption_t<uint64_t> op_noise_generator_add;
extern dynamorio::droption::droption_t<uint64_t> op_noise_generator_num_records;
extern dynamorio::droption::droption_t<unsigned int> op_report_top;
extern dynamorio::droption::droption_t<unsigned int> op_reuse_distance_threshold;
extern dynamorio::droption::droption_t<bool> op_reuse_distance_histogram;
112 changes: 112 additions & 0 deletions clients/drcachesim/scheduler/noise_generator.cpp
@@ -0,0 +1,112 @@
/* **********************************************************
* Copyright (c) 2025 Google, Inc. All rights reserved.
* **********************************************************/

/*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* * Neither the name of Google, Inc. nor the names of its contributors may be
* used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL VMWARE, INC. OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
* DAMAGE.
*/

#include <assert.h>
#include "noise_generator.h"
#include "trace_entry.h"
#include "utils.h"

namespace dynamorio {
namespace drmemtrace {

noise_generator_t::noise_generator_t()
{
}
Contributor

List as = default; in .h instead?


noise_generator_t::noise_generator_t(addr_t pid, addr_t tid,
uint64_t num_records_to_generate)
: num_records_to_generate_(num_records_to_generate)
, pid_(pid)
, tid_(tid)
{
}

noise_generator_t::~noise_generator_t()
{
}
Contributor

Also = default;?


bool
noise_generator_t::init()
{
at_eof_ = false;
++*this;
return true;
}

std::string
noise_generator_t::get_stream_name() const
{
return "noise_generator";
Contributor

Append the pid and tid? Or those are available through other means: probably what we'd want is to append the noise parameters like "30_percent_loads". Add XXX comment?

}

trace_entry_t
noise_generator_t::generate_trace_entry()
{
// TODO i#7216: this is a temporary trace record that we use as a placeholder until
// the logic to generate noise records is in place.
trace_entry_t generated_entry = { TRACE_TYPE_READ, 4, { 0xdeadbeef } };
return generated_entry;
}

trace_entry_t *
noise_generator_t::read_next_entry()
{
if (num_records_to_generate_ == 0) {
at_eof_ = true;
return nullptr;
}

// Do not change the order for generating TRACE_TYPE_THREAD and TRACE_TYPE_PID.
// The scheduler expects a tid first and then a pid.
if (!marker_tid_generated_) {
entry_ = { TRACE_TYPE_THREAD, sizeof(int), { tid_ } };
marker_tid_generated_ = true;
return &entry_;
}
if (!marker_pid_generated_) {
entry_ = { TRACE_TYPE_PID, sizeof(int), { pid_ } };
marker_pid_generated_ = true;
return &entry_;
}

entry_ = generate_trace_entry();

if (num_records_to_generate_ == 1) {
Contributor

Do this before line 112 to avoid wasteful work?

entry_ = { TRACE_TYPE_THREAD_EXIT, sizeof(int), { tid_ } };
}
--num_records_to_generate_;

return &entry_;
}

} // namespace drmemtrace
} // namespace dynamorio
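
The record sequence produced by read_next_entry() above can be sketched in isolation. This is a minimal mock, not the PR's code: `rec_type_t`, `rec_t`, `mock_noise_generator_t` and `collect_types` are hypothetical stand-ins for the drmemtrace types, and the exit-record check is moved ahead of record generation, as the review comment suggests, so no placeholder record is built only to be overwritten.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the drmemtrace record types; the real
// definitions live in trace_entry.h and are not reproduced here.
enum class rec_type_t { THREAD, PID, READ, THREAD_EXIT };

struct rec_t {
    rec_type_t type;
    uint64_t value;
};

// Mock generator that mirrors the control flow of read_next_entry().
class mock_noise_generator_t {
public:
    mock_noise_generator_t(uint64_t pid, uint64_t tid, uint64_t num_records)
        : num_records_(num_records)
        , pid_(pid)
        , tid_(tid)
    {
    }

    // Returns false at end of stream; otherwise fills in `out`.
    bool
    next(rec_t &out)
    {
        if (num_records_ == 0)
            return false;
        // The tid marker must come before the pid marker.
        if (!tid_emitted_) {
            tid_emitted_ = true;
            out = { rec_type_t::THREAD, tid_ };
            return true;
        }
        if (!pid_emitted_) {
            pid_emitted_ = true;
            out = { rec_type_t::PID, pid_ };
            return true;
        }
        // Decide on the exit record first so that no noise record is
        // generated and then discarded.
        if (num_records_ == 1)
            out = { rec_type_t::THREAD_EXIT, tid_ };
        else
            out = { rec_type_t::READ, 0xdeadbeef };
        --num_records_;
        return true;
    }

private:
    uint64_t num_records_;
    uint64_t pid_;
    uint64_t tid_;
    bool tid_emitted_ = false;
    bool pid_emitted_ = false;
};

// Drains a mock generator and returns the types of the records it emitted.
std::vector<rec_type_t>
collect_types(uint64_t num_records)
{
    mock_noise_generator_t gen(/*pid=*/1, /*tid=*/1, num_records);
    std::vector<rec_type_t> types;
    rec_t rec = { rec_type_t::READ, 0 };
    while (gen.next(rec))
        types.push_back(rec.type);
    return types;
}
```

With a count of 3, the sketch yields THREAD, PID, two READ records and THREAD_EXIT: the exit record is counted against the total, while the tid and pid markers are not. A count of 0 yields nothing at all, which matches the default-constructed end sentinel.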
86 changes: 86 additions & 0 deletions clients/drcachesim/scheduler/noise_generator.h
@@ -0,0 +1,86 @@
/* **********************************************************
* Copyright (c) 2025 Google, Inc. All rights reserved.
* **********************************************************/

/*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* * Neither the name of Google, Inc. nor the names of its contributors may be
* used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL VMWARE, INC. OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
* DAMAGE.
*/

#ifndef _NOISE_GENERATOR_H_
#define _NOISE_GENERATOR_H_ 1

#include "reader.h"
#include "../common/trace_entry.h"

namespace dynamorio {
namespace drmemtrace {

/**
* Generates synthetic #dynamorio::drmemtrace::memref_t trace entries in a single thread
* and presents them via an iterator interface to the scheduler.
*/
class noise_generator_t : public reader_t {
public:
noise_generator_t();

noise_generator_t(addr_t pid, addr_t tid, uint64_t num_records_to_generate);

virtual ~noise_generator_t();

bool
init() override;

std::string
get_stream_name() const override;

protected:
// Wraps the noise records generated by generate_trace_entry() between
// TRACE_TYPE_THREAD, TRACE_TYPE_PID and TRACE_TYPE_THREAD_EXIT.
virtual trace_entry_t *
read_next_entry() override;

// Has the main logic to generate noise records.
virtual trace_entry_t
generate_trace_entry();

// This counter does not count TRACE_TYPE_THREAD or TRACE_TYPE_PID.
// The idea is that when we want to generate at least 1 record, tid and pid have to be
// there as well, otherwise the scheduler will report an error.
uint64_t num_records_to_generate_ = 0;
addr_t pid_ = 0;
Contributor

Type should be memref_pid_t.

addr_t tid_ = 0;
Contributor

Type should be memref_tid_t.

Contributor

Seems better to hold an instance of noise_generator_info_t instead of duplicating all the fields. We'll be adding more fields and instead of duplicating them yet again let's just encapsulate the info struct here. (But be sure a default instance used for end() has a 0 num_records.)


private:
trace_entry_t entry_ = {};
bool marker_pid_generated_ = false;
bool marker_tid_generated_ = false;
};

} // namespace drmemtrace
} // namespace dynamorio

#endif /* _NOISE_GENERATOR_H_ */
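
The review comments on this header ask for defaulted special members, pid/tid types of memref_pid_t and memref_tid_t, and a single encapsulated info struct. One possible shape is sketched below. Everything in it is an assumption for illustration: the field names of `noise_generator_info_t`, the 64-bit aliases for `memref_pid_t` and `memref_tid_t`, and the class name `noise_generator_sketch_t` are not taken from the PR, and the `reader_t` base class is omitted so the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstdint>

// Assumed aliases for this sketch only; the real definitions are in the
// drmemtrace headers.
using memref_pid_t = int64_t;
using memref_tid_t = int64_t;

// Hypothetical layout: the field names are illustrative.
struct noise_generator_info_t {
    memref_pid_t pid = 0;
    memref_tid_t tid = 0;
    // Zero by default so a default-constructed generator acts as end().
    uint64_t num_records_to_generate = 0;
};

// Holds one info instance instead of duplicating each field as a member.
class noise_generator_sketch_t {
public:
    noise_generator_sketch_t() = default;

    explicit noise_generator_sketch_t(const noise_generator_info_t &info)
        : info_(info)
    {
    }

    virtual ~noise_generator_sketch_t() = default;

    // True when there is nothing left to generate.
    bool
    at_end() const
    {
        return info_.num_records_to_generate == 0;
    }

    const noise_generator_info_t &
    info() const
    {
        return info_;
    }

private:
    noise_generator_info_t info_;
};
```

Adding a parameter later then touches only the struct, and the default instance keeps a record count of 0, which is the property the reviewer asks to preserve for the end sentinel.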
10 changes: 10 additions & 0 deletions clients/drcachesim/scheduler/scheduler.h
@@ -829,6 +829,16 @@ template <typename RecordType, typename ReaderType> class scheduler_tmpl_t {
* when raising this value on uneven inputs.
*/
double exit_if_fraction_inputs_left = 0.1;
/**
* Adds noise generators to the scheduler. A noise generator creates synthetic
* trace records that will be scheduled alongside records of one or more real
* traces. Each noise generator behaves like a thread in its own process.
*/
uint64_t noise_generator_add = 0;
/**
* Number of synthetic trace records produced by the noise generator.
*/
uint64_t noise_generator_num_records = 0;
// When adding new options, also add to print_configuration().
};

64 changes: 63 additions & 1 deletion clients/drcachesim/scheduler/scheduler_impl.cpp
@@ -62,6 +62,7 @@
#include "mutex_dbg_owned.h"
#include "reader.h"
#include "record_file_reader.h"
#include "noise_generator.h"
#include "trace_entry.h"
#ifdef HAS_LZ4
# include "lz4_file_reader.h"
@@ -117,6 +118,22 @@ replay_file_checker_t::check(archive_istream_t *infile)
* Specializations for scheduler_tmpl_impl_t<reader_t>, aka scheduler_impl_t.
*/

template <>
std::unique_ptr<reader_t>
scheduler_impl_tmpl_t<memref_t, reader_t>::get_noise_generator(addr_t pid, addr_t tid,
uint64_t num_records)
{
return std::unique_ptr<noise_generator_t>(
new noise_generator_t(pid, tid, num_records));
}

template <>
std::unique_ptr<reader_t>
scheduler_impl_tmpl_t<memref_t, reader_t>::get_noise_generator_end()
{
return std::unique_ptr<noise_generator_t>(new noise_generator_t());
}

template <>
std::unique_ptr<reader_t>
scheduler_impl_tmpl_t<memref_t, reader_t>::get_default_reader()
@@ -353,6 +370,23 @@ scheduler_impl_tmpl_t<memref_t, reader_t>::insert_switch_tid_pid(input_info_t &i
* record_scheduler_impl_t.
*/

template <>
std::unique_ptr<dynamorio::drmemtrace::record_reader_t>
scheduler_impl_tmpl_t<trace_entry_t, record_reader_t>::get_noise_generator(
addr_t pid, addr_t tid, uint64_t num_records)
{
error_string_ = "Noise generator is not supported for record_reader_t";
return std::unique_ptr<dynamorio::drmemtrace::record_reader_t>();
}

template <>
std::unique_ptr<dynamorio::drmemtrace::record_reader_t>
scheduler_impl_tmpl_t<trace_entry_t, record_reader_t>::get_noise_generator_end()
{
error_string_ = "Noise generator is not supported for record_reader_t";
return std::unique_ptr<dynamorio::drmemtrace::record_reader_t>();
}

template <>
std::unique_ptr<dynamorio::drmemtrace::record_reader_t>
scheduler_impl_tmpl_t<trace_entry_t, record_reader_t>::get_default_reader()
@@ -629,6 +663,10 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::print_configuration()
options_.honor_infinite_timeouts);
VPRINT(this, 1, " %-25s : %f\n", "exit_if_fraction_inputs_left",
options_.exit_if_fraction_inputs_left);
VPRINT(this, 1, " %-25s : %" PRIu64 "\n", "noise_generator_add",
options_.noise_generator_add);
VPRINT(this, 1, " %-25s : %" PRIu64 "\n", "noise_generator_num_records",
options_.noise_generator_num_records);
}

template <typename RecordType, typename ReaderType>
@@ -705,6 +743,25 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::init(
{
options_ = std::move(options);
verbosity_ = options_.verbosity;

// Add noise generator reader to workload_inputs.
if (options_.noise_generator_add > 0) {
for (uint64_t noise_generator_idx = 0;
noise_generator_idx < options_.noise_generator_add; ++noise_generator_idx) {
auto noise_generator =
get_noise_generator(static_cast<addr_t>(noise_generator_idx + 1),
static_cast<addr_t>(noise_generator_idx + 1),
options_.noise_generator_num_records);
auto noise_generator_end = get_noise_generator_end();
std::vector<typename sched_type_t::input_reader_t> readers;
// Use a sentinel for the tid so the scheduler will use the memref record tid.
readers.emplace_back(std::move(noise_generator),
std::move(noise_generator_end),
/* tid = */ INVALID_THREAD_ID);
workload_inputs.emplace_back(std::move(readers));
}
}

// workload_inputs is not const so we can std::move readers out of it.
for (int workload_idx = 0; workload_idx < static_cast<int>(workload_inputs.size());
++workload_idx) {
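
The loop added to init() in this hunk gives each noise generator pid and tid equal to its index plus one and places it in a workload of its own. A self-contained sketch of that bookkeeping follows; `noise_input_t`, `workload_t` and `make_noise_workloads` are hypothetical names that stand in for the scheduler's input_reader_t and workload types, which are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one reader entry in a workload.
struct noise_input_t {
    uint64_t pid;
    uint64_t tid;
    uint64_t num_records;
};

using workload_t = std::vector<noise_input_t>;

// Builds one single-reader workload per requested noise generator.
std::vector<workload_t>
make_noise_workloads(uint64_t count, uint64_t num_records)
{
    std::vector<workload_t> workloads;
    for (uint64_t idx = 0; idx < count; ++idx) {
        workload_t readers;
        // pid and tid start at 1, so 0 stays free as the default value.
        readers.push_back({ idx + 1, idx + 1, num_records });
        // Each generator is its own workload, i.e. behaves like a thread
        // in its own process.
        workloads.push_back(std::move(readers));
    }
    return workloads;
}
```

Note that in this sketch, as in the hunk, the ids are derived purely from the loop index; nothing here guards against collisions with pids or tids that already appear in the real input traces.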
@@ -820,7 +877,7 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::init(
}
}

// Legacy field support.
// Legacy field support.
scheduler_status_t res = legacy_field_support();
if (res != sched_type_t::STATUS_SUCCESS)
return res;
@@ -1514,6 +1571,11 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::get_initial_input_content(
// output stream(s).
for (size_t i = 0; i < inputs_.size(); ++i) {
input_info_t &input = inputs_[i];
// If the input is a noise generator, we don't want to read ahead to find
// timestamp records, since we don't have any.
if (dynamic_cast<noise_generator_t *>(input.reader.get()) != nullptr)
continue;

std::lock_guard<mutex_dbg_owned> lock(*input.lock);

// If the input jumps to the middle immediately, do that now so we'll have
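
The hunk above skips the timestamp readahead for noise generators by testing the reader's dynamic type. The pattern can be sketched on its own as follows; `reader_base_t`, `file_reader_sketch_t`, `noise_reader_sketch_t` and both helper functions are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical reader hierarchy; a polymorphic base is required for
// dynamic_cast to work.
struct reader_base_t {
    virtual ~reader_base_t() = default;
};
struct file_reader_sketch_t : reader_base_t {};
struct noise_reader_sketch_t : reader_base_t {};

// True if the reader's dynamic type is the noise reader (or derives from it).
bool
is_noise_reader(const reader_base_t *reader)
{
    return dynamic_cast<const noise_reader_sketch_t *>(reader) != nullptr;
}

// Counts the inputs that would still go through the readahead step.
size_t
count_inputs_needing_readahead(
    const std::vector<std::unique_ptr<reader_base_t>> &inputs)
{
    size_t count = 0;
    for (const auto &input : inputs) {
        // Synthetic inputs carry no timestamp records, so skip them.
        if (is_noise_reader(input.get()))
            continue;
        ++count;
    }
    return count;
}
```

A type test keeps the change local to one loop, at the cost of tying the scheduler to a concrete reader class; a flag on the input would be an alternative design, though nothing on this page says which the authors prefer.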