i#7216: noise generator basic structure #7283

Open
wants to merge 43 commits into
base: master
Changes from 15 commits
Commits
d426b35
Save work.
edeiana Feb 6, 2025
f445f78
Save work.
edeiana Feb 9, 2025
f70f5f5
Save work.
edeiana Feb 12, 2025
3814809
Minor improvements.
edeiana Feb 12, 2025
cbe2176
Save work.
edeiana Feb 18, 2025
9f1a73c
Cleanup.
edeiana Feb 18, 2025
d99c991
Fixed result aggregation in scheduler issue.
edeiana Feb 18, 2025
e143e2f
Do not count tid, pid, and tid exit when generating synthetic records.
edeiana Feb 18, 2025
db8aa84
Moved noise_generator into scheduler dir.
edeiana Feb 18, 2025
03d79a2
Added noise_generator_num_records flag and noise_generator unit test.
edeiana Feb 21, 2025
2e70cd4
Minor improvements.
edeiana Feb 21, 2025
ae2d44f
Added end-to-end schedule stats test with noise_generator.
edeiana Feb 21, 2025
c906bf7
Added changes to doc.
edeiana Feb 21, 2025
d7a4365
Removed references from doc, as doxygen refuses to generate a link.
edeiana Feb 21, 2025
6e21191
Cannot rely on thread numbers or order for schedule_stats output.
edeiana Feb 21, 2025
d1a9047
Added get_noise_generator_end().
edeiana Feb 22, 2025
ddc8e49
Separate logic to generate noise records from the logic
edeiana Feb 22, 2025
d823067
Add multiple noise generators with -noise_generator_add.
edeiana Feb 24, 2025
09e2a0d
Fixed dr_option type of -noise_generator_add from bool to uint64_t.
edeiana Feb 24, 2025
04fabbc
End-to-end noise generator test now using 3 noise generators instead of
edeiana Feb 24, 2025
fddaecd
Updated doc.
edeiana Feb 24, 2025
9fcf581
Add default value of 0 to pid and tid of noise_generator_t.
edeiana Feb 24, 2025
ca8a792
Removed noise_generator global flags, except -noise_generator_enable.
edeiana Feb 27, 2025
382230e
Added noise_generator_info_t and fixed doc.
edeiana Feb 27, 2025
21c8216
Removed relative path to trace_entry.h include.
edeiana Feb 27, 2025
863a3be
Merge branch 'master' into i7216-noise-generator-initial
edeiana Feb 27, 2025
212cefb
noise_generator_info is now a unique ptr in scheduler options.
edeiana Feb 27, 2025
e62c3c2
clang format.
edeiana Feb 27, 2025
5a0a4a6
Missing include added.
edeiana Feb 27, 2025
9a22912
Fixed typo in comment.
edeiana Feb 27, 2025
91c0b2e
Added noise_generator_factory_t.
edeiana Mar 6, 2025
ec872b0
Ensure schedule_stats_noise_generator always simulates on 4 cores.
edeiana Mar 6, 2025
410c108
Report error if adding noise generators fails.
edeiana Mar 6, 2025
45a6841
Improved comment.
edeiana Mar 6, 2025
b3dc6de
Improved comment.
edeiana Mar 6, 2025
1bf889c
noise_generator_factory_t only generates a single noise generator
edeiana Mar 8, 2025
6a9a4d3
Trying to schedule always on 4 cores in schedule_stats_noise_generator
edeiana Mar 8, 2025
2223afc
Ignore core count in schedule_stats_noise_generator test,
edeiana Mar 8, 2025
4509a31
Minor improvements.
edeiana Mar 9, 2025
a344710
Improved noise generator scheduler unit test.
edeiana Mar 9, 2025
1a35d61
Minor improvements.
edeiana Mar 9, 2025
4c746e0
Typo in comment fixed.
edeiana Mar 9, 2025
3c4968f
Clarification comment on why we need a timestamp
edeiana Mar 9, 2025
3 changes: 3 additions & 0 deletions api/docs/release.dox
@@ -157,6 +157,9 @@ Further non-compatibility-affecting changes include:
- Allow v2p.textproto file in a trace directory. This file is present in public traces.
- Allow v2p.textproto file to have one missing virtual_address field, which indicates
virtual_address == 0x0. Necessary in case a trace accesses virtual address 0x0.
- Added noise_generator_t scaffolding as a reader_t to produce synthetic trace records.
- Added -enable_noise_generator and -noise_generator_num_records as flags and scheduler
options.

**************************************************
<hr>
2 changes: 2 additions & 0 deletions clients/drcachesim/CMakeLists.txt
@@ -279,6 +279,7 @@ set(drcachesim_srcs
scheduler/scheduler_replay.cpp
scheduler/scheduler_fixed.cpp
scheduler/speculator.cpp
scheduler/noise_generator.cpp
analyzer.cpp
analyzer_multi.cpp
${client_and_sim_srcs}
@@ -346,6 +347,7 @@ add_exported_library(drmemtrace_analyzer STATIC
scheduler/scheduler_replay.cpp
scheduler/scheduler_fixed.cpp
scheduler/speculator.cpp
scheduler/noise_generator.cpp
common/trace_entry.cpp
reader/reader.cpp
reader/config_reader.cpp
5 changes: 5 additions & 0 deletions clients/drcachesim/analyzer_multi.cpp
@@ -571,6 +571,11 @@ analyzer_multi_tmpl_t<RecordType, ReaderType>::analyzer_multi_tmpl_t()
#endif
}

// Add the noise generator before init_scheduler(), where we eventually add
// the noise generator as another input workload.
sched_ops.enable_noise_generator = op_enable_noise_generator.get_value();
sched_ops.noise_generator_num_records = op_noise_generator_num_records.get_value();

if (!indirs.empty()) {
std::vector<std::string> tracedirs;
for (const std::string &indir : indirs)
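The ordering noted in the comment above matters because the scheduler turns the noise generator into one more input workload during initialization, so both option fields have to be set first. A minimal sketch of that flow follows; `flags_sketch_t`, `sched_options_sketch_t`, and `count_workloads_at_init` are hypothetical stand-ins, not the real drcachesim types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the parsed -enable_noise_generator and
// -noise_generator_num_records flag values.
struct flags_sketch_t {
    bool enable_noise_generator = false;
    uint64_t noise_generator_num_records = 0;
};

// Hypothetical stand-in for the two new scheduler option fields.
struct sched_options_sketch_t {
    bool enable_noise_generator = false;
    uint64_t noise_generator_num_records = 0;
};

// Copies the flag values into the scheduler options before initialization.
sched_options_sketch_t
make_sched_options(const flags_sketch_t &flags)
{
    sched_options_sketch_t ops;
    ops.enable_noise_generator = flags.enable_noise_generator;
    ops.noise_generator_num_records = flags.noise_generator_num_records;
    return ops;
}

// Stand-in for scheduler initialization: when enabled, the noise generator is
// appended as one extra workload next to the real trace workloads.
std::size_t
count_workloads_at_init(std::size_t num_trace_workloads,
                        const sched_options_sketch_t &ops)
{
    return num_trace_workloads + (ops.enable_noise_generator ? 1 : 0);
}

// Usage: one trace workload plus an enabled noise generator gives two workloads.
void
usage_example()
{
    flags_sketch_t flags;
    flags.enable_noise_generator = true;
    flags.noise_generator_num_records = 100;
    assert(count_workloads_at_init(1, make_sched_options(flags)) == 2);
}
```

With the flag left at its default of false, the sketch leaves the workload count unchanged.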
12 changes: 12 additions & 0 deletions clients/drcachesim/common/options.cpp
@@ -714,6 +714,18 @@ droption_t<std::string>
"Cache hierarchy configuration file",
"The full path to the cache hierarchy configuration file.");

droption_t<bool>
op_enable_noise_generator(DROPTION_SCOPE_FRONTEND, "enable_noise_generator", false,
"Enables the noise generator.",
"Enables the scheduler to interleave trace records "
"with synthetic records.");

droption_t<uint64_t> op_noise_generator_num_records(
DROPTION_SCOPE_FRONTEND, "noise_generator_num_records", 0,
"Number of synthetic records the noise generator produces.",
"Determines the number of synthetic records produced by the noise generator "
"excluding TRACE_TYPE_THREAD and TRACE_TYPE_PID.");

// XXX: if we separate histogram + reuse_distance we should move this with them.
droption_t<unsigned int>
op_report_top(DROPTION_SCOPE_FRONTEND, "report_top", 10,
2 changes: 2 additions & 0 deletions clients/drcachesim/common/options.h
@@ -178,6 +178,8 @@ extern dynamorio::droption::droption_t<dynamorio::droption::bytesize_t> op_warmu
extern dynamorio::droption::droption_t<double> op_warmup_fraction;
extern dynamorio::droption::droption_t<dynamorio::droption::bytesize_t> op_sim_refs;
extern dynamorio::droption::droption_t<std::string> op_config_file;
extern dynamorio::droption::droption_t<bool> op_enable_noise_generator;
extern dynamorio::droption::droption_t<uint64_t> op_noise_generator_num_records;
extern dynamorio::droption::droption_t<unsigned int> op_report_top;
extern dynamorio::droption::droption_t<unsigned int> op_reuse_distance_threshold;
extern dynamorio::droption::droption_t<bool> op_reuse_distance_histogram;
107 changes: 107 additions & 0 deletions clients/drcachesim/scheduler/noise_generator.cpp
@@ -0,0 +1,107 @@
/* **********************************************************
* Copyright (c) 2025 Google, Inc. All rights reserved.
* **********************************************************/

/*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* * Neither the name of Google, Inc. nor the names of its contributors may be
* used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL VMWARE, INC. OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
* DAMAGE.
*/

#include <assert.h>
#include "noise_generator.h"
#include "trace_entry.h"
#include "utils.h"

namespace dynamorio {
namespace drmemtrace {

noise_generator_t::noise_generator_t()
{
}
Review comment (Contributor): List as = default; in .h instead?


noise_generator_t::noise_generator_t(uint64_t num_records_to_generate)
: num_records_to_generate_(num_records_to_generate)
{
}

noise_generator_t::~noise_generator_t()
{
}
Review comment (Contributor): Also = default;?


bool
noise_generator_t::init()
{
at_eof_ = false;
++*this;
return true;
}

std::string
noise_generator_t::get_stream_name() const
{
return "noise_generator";
Review comment (Contributor): Append the pid and tid? Or, since those are available through other means, what we'd probably want is to append the noise parameters, like "30_percent_loads". Add an XXX comment?

}

trace_entry_t *
noise_generator_t::read_next_entry()
{
if (num_records_to_generate_ == 0) {
at_eof_ = true;
return nullptr;
}

// Do not change the order for generating TRACE_TYPE_THREAD and TRACE_TYPE_PID.
// The scheduler expects a tid first and then a pid.
if (!marker_tid_generated_) {
entry_ = { TRACE_TYPE_THREAD,
sizeof(int),
{ static_cast<addr_t>(IDLE_THREAD_ID) } };
marker_tid_generated_ = true;
return &entry_;
}
if (!marker_pid_generated_) {
entry_ = { TRACE_TYPE_PID,
sizeof(int),
{ static_cast<addr_t>(INVALID_CPU_MARKER_VALUE) } };
marker_pid_generated_ = true;
return &entry_;
}

// XXX i#7216: this is a temporary trace record that we use as a placeholder until the
// logic to generate noise records is in place.
entry_ = { TRACE_TYPE_READ, 4, { 0xdeadbeef } };
if (num_records_to_generate_ == 1) {
Review comment (Contributor): Do this before line 112 to avoid wasteful work?

entry_ = { TRACE_TYPE_THREAD_EXIT,
sizeof(int),
{ static_cast<addr_t>(IDLE_THREAD_ID) } };
}
--num_records_to_generate_;

return &entry_;
}

} // namespace drmemtrace
} // namespace dynamorio
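The record order produced by read_next_entry() in this revision can be modeled in isolation. The sketch below is not the PR's code: `sketch_type_t`, `noise_sketch_t`, and `drain` are hypothetical names, and it models only the sequence of record types, not the reader_t iterator machinery. It picks the final record's type in a single step, in the spirit of the review comment about avoiding the overwritten placeholder.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins for the trace record types; not the drmemtrace enums.
enum class sketch_type_t { THREAD, PID, READ, THREAD_EXIT };

class noise_sketch_t {
public:
    explicit noise_sketch_t(uint64_t num_records)
        : remaining_(num_records)
    {
    }

    // Writes the next record type to *out and returns true, or returns false
    // once the stream is exhausted.
    bool
    next(sketch_type_t *out)
    {
        assert(out != nullptr);
        if (remaining_ == 0)
            return false;
        // The tid record comes first, then the pid; neither is counted.
        if (!tid_done_) {
            tid_done_ = true;
            *out = sketch_type_t::THREAD;
            return true;
        }
        if (!pid_done_) {
            pid_done_ = true;
            *out = sketch_type_t::PID;
            return true;
        }
        // The last counted record is the thread exit; the others are
        // placeholder reads.
        *out = (remaining_ == 1) ? sketch_type_t::THREAD_EXIT : sketch_type_t::READ;
        --remaining_;
        return true;
    }

private:
    uint64_t remaining_;
    bool tid_done_ = false;
    bool pid_done_ = false;
};

// Collects the whole stream as a compact string, one letter per record:
// T = thread id, P = process id, R = read, X = thread exit.
std::string
drain(uint64_t num_records)
{
    noise_sketch_t gen(num_records);
    std::string out;
    sketch_type_t type;
    while (gen.next(&type)) {
        switch (type) {
        case sketch_type_t::THREAD: out += 'T'; break;
        case sketch_type_t::PID: out += 'P'; break;
        case sketch_type_t::READ: out += 'R'; break;
        case sketch_type_t::THREAD_EXIT: out += 'X'; break;
        }
    }
    return out;
}
```

Under these assumptions, a count of 3 yields tid, pid, two reads, and a thread exit, while a count of 0 yields an empty stream, which is how a zero-record generator can serve as an end sentinel.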
77 changes: 77 additions & 0 deletions clients/drcachesim/scheduler/noise_generator.h
@@ -0,0 +1,77 @@
/* **********************************************************
* Copyright (c) 2025 Google, Inc. All rights reserved.
* **********************************************************/

/*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
*
* * Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* * Neither the name of Google, Inc. nor the names of its contributors may be
* used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL VMWARE, INC. OR CONTRIBUTORS BE LIABLE
* FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
* DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
* SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
* CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
* LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
* OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
* DAMAGE.
*/

#ifndef _NOISE_GENERATOR_H_
#define _NOISE_GENERATOR_H_ 1

#include "reader.h"
#include "../common/trace_entry.h"

namespace dynamorio {
namespace drmemtrace {

/**
* Generates synthetic #dynamorio::drmemtrace::memref_t trace entries in a single thread
* and presents them via an iterator interface to the scheduler.
*/
class noise_generator_t : public reader_t {
public:
noise_generator_t();

noise_generator_t(uint64_t num_records_to_generate);

virtual ~noise_generator_t();

bool
init() override;

std::string
get_stream_name() const override;

protected:
virtual trace_entry_t *
read_next_entry() override;

private:
trace_entry_t entry_ = {};
bool marker_pid_generated_ = false;
bool marker_tid_generated_ = false;
// This counter does not count TRACE_TYPE_THREAD or TRACE_TYPE_PID.
// The idea is that when we want to generate at least 1 record, tid and pid have to be
// there as well, otherwise the scheduler will report an error.
uint64_t num_records_to_generate_ = 0;
};

} // namespace drmemtrace
} // namespace dynamorio

#endif /* _NOISE_GENERATOR_H_ */
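The review comments on the .cpp file suggest defaulting the empty constructor and destructor in the header. A minimal sketch of what that could look like is given here; `reader_sketch_t` and `noise_generator_sketch_t` are hypothetical stand-ins, and the `explicit` keyword and the `remaining()` accessor are additions for illustration that do not appear in the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical base standing in for reader_t.
class reader_sketch_t {
public:
    virtual ~reader_sketch_t() = default;
};

class noise_generator_sketch_t : public reader_sketch_t {
public:
    // Defaulted in the header, so no empty bodies are needed in the .cpp file.
    noise_generator_sketch_t() = default;
    ~noise_generator_sketch_t() override = default;

    explicit noise_generator_sketch_t(uint64_t num_records_to_generate)
        : num_records_to_generate_(num_records_to_generate)
    {
    }

    // Illustration-only accessor; not part of the PR.
    uint64_t
    remaining() const
    {
        return num_records_to_generate_;
    }

private:
    // The in-class initializer keeps the defaulted constructor well defined.
    uint64_t num_records_to_generate_ = 0;
};

static_assert(std::is_default_constructible<noise_generator_sketch_t>::value,
              "the defaulted constructor must remain usable");

// Usage: a default-constructed generator has nothing to emit.
void
usage_example()
{
    noise_generator_sketch_t gen;
    assert(gen.remaining() == 0);
}
```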
9 changes: 9 additions & 0 deletions clients/drcachesim/scheduler/scheduler.h
@@ -829,6 +829,15 @@ template <typename RecordType, typename ReaderType> class scheduler_tmpl_t {
* when raising this value on uneven inputs.
*/
double exit_if_fraction_inputs_left = 0.1;
/**
* Enables the noise generator to create synthetic trace records that will be
* scheduled alongside records of one or more real traces.
*/
bool enable_noise_generator = false;
/**
* Number of synthetic trace records produced by the noise generator.
*/
uint64_t noise_generator_num_records = 0;
// When adding new options, also add to print_configuration().
};

38 changes: 38 additions & 0 deletions clients/drcachesim/scheduler/scheduler_impl.cpp
@@ -62,6 +62,7 @@
#include "mutex_dbg_owned.h"
#include "reader.h"
#include "record_file_reader.h"
#include "noise_generator.h"
#include "trace_entry.h"
#ifdef HAS_LZ4
# include "lz4_file_reader.h"
@@ -117,6 +118,13 @@ replay_file_checker_t::check(archive_istream_t *infile)
* Specializations for scheduler_tmpl_impl_t<reader_t>, aka scheduler_impl_t.
*/

template <>
std::unique_ptr<reader_t>
scheduler_impl_tmpl_t<memref_t, reader_t>::get_noise_generator(uint64_t num_records)
{
return std::unique_ptr<noise_generator_t>(new noise_generator_t(num_records));
}

template <>
std::unique_ptr<reader_t>
scheduler_impl_tmpl_t<memref_t, reader_t>::get_default_reader()
@@ -353,6 +361,15 @@ scheduler_impl_tmpl_t<memref_t, reader_t>::insert_switch_tid_pid(input_info_t &i
* record_scheduler_impl_t.
*/

template <>
std::unique_ptr<dynamorio::drmemtrace::record_reader_t>
scheduler_impl_tmpl_t<trace_entry_t, record_reader_t>::get_noise_generator(
uint64_t num_records)
{
error_string_ = "Noise generator is not supported for record_reader_t";
return std::unique_ptr<dynamorio::drmemtrace::record_reader_t>();
}

template <>
std::unique_ptr<dynamorio::drmemtrace::record_reader_t>
scheduler_impl_tmpl_t<trace_entry_t, record_reader_t>::get_default_reader()
@@ -629,6 +646,10 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::print_configuration()
options_.honor_infinite_timeouts);
VPRINT(this, 1, " %-25s : %f\n", "exit_if_fraction_inputs_left",
options_.exit_if_fraction_inputs_left);
VPRINT(this, 1, " %-25s : %d\n", "enable_noise_generator",
options_.enable_noise_generator);
VPRINT(this, 1, " %-25s : %" PRIu64 "\n", "noise_generator_num_records",
options_.noise_generator_num_records);
}

template <typename RecordType, typename ReaderType>
@@ -705,6 +726,18 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::init(
{
options_ = std::move(options);
verbosity_ = options_.verbosity;

// Add noise generator reader to workload_inputs.
if (options_.enable_noise_generator) {
auto noise_generator = get_noise_generator(options_.noise_generator_num_records);
auto noise_generator_end = get_noise_generator(0);
std::vector<typename sched_type_t::input_reader_t> readers;
// Use a sentinel for the tid so the scheduler will use the memref record tid.
readers.emplace_back(std::move(noise_generator), std::move(noise_generator_end),
/* tid = */ INVALID_THREAD_ID);
workload_inputs.emplace_back(std::move(readers));
}

// workload_inputs is not const so we can std::move readers out of it.
for (int workload_idx = 0; workload_idx < static_cast<int>(workload_inputs.size());
++workload_idx) {
@@ -1514,6 +1547,11 @@ scheduler_impl_tmpl_t<RecordType, ReaderType>::get_initial_input_content(
// output stream(s).
for (size_t i = 0; i < inputs_.size(); ++i) {
input_info_t &input = inputs_[i];
// If the input is a noise generator, we don't want to read ahead to find
// timestamp records, since we don't have any.
if (dynamic_cast<noise_generator_t *>(input.reader.get()) != nullptr)
continue;

std::lock_guard<mutex_dbg_owned> lock(*input.lock);

// If the input jumps to the middle immediately, do that now so we'll have
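Two ideas in the scheduler_impl.cpp hunks above can be sketched together: the noise generator is appended as an extra workload holding a reader, a zero-record end sentinel, and a sentinel tid; later, inputs are filtered with a dynamic_cast so the noise generator is skipped during the timestamp read-ahead. All types and the `SKETCH_INVALID_TID` value below are hypothetical stand-ins, not the scheduler's real data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for reader_t and two of its subclasses.
struct reader_sketch_t {
    virtual ~reader_sketch_t() = default;
};
struct file_reader_sketch_t : reader_sketch_t {};
struct noise_reader_sketch_t : reader_sketch_t {
    explicit noise_reader_sketch_t(uint64_t num_records)
        : num_records_to_generate(num_records)
    {
    }
    uint64_t num_records_to_generate;
};

// One input: a reader, its end sentinel, and a thread id.
struct input_sketch_t {
    std::unique_ptr<reader_sketch_t> reader;
    std::unique_ptr<reader_sketch_t> end;
    int64_t tid;
};
using workload_sketch_t = std::vector<input_sketch_t>;

constexpr int64_t SKETCH_INVALID_TID = -1; // Assumed sentinel value.

void
add_file_workload(std::vector<workload_sketch_t> &workloads, int64_t tid)
{
    workload_sketch_t readers;
    readers.push_back(
        input_sketch_t{ std::unique_ptr<reader_sketch_t>(new file_reader_sketch_t()),
                        std::unique_ptr<reader_sketch_t>(new file_reader_sketch_t()),
                        tid });
    workloads.push_back(std::move(readers));
}

// Appends the noise generator as its own workload. The end sentinel is a
// generator with zero records, and the tid is a sentinel so that the tid
// carried by the records themselves is used instead.
void
add_noise_workload(std::vector<workload_sketch_t> &workloads, uint64_t num_records)
{
    workload_sketch_t readers;
    readers.push_back(input_sketch_t{
        std::unique_ptr<reader_sketch_t>(new noise_reader_sketch_t(num_records)),
        std::unique_ptr<reader_sketch_t>(new noise_reader_sketch_t(0)),
        SKETCH_INVALID_TID });
    workloads.push_back(std::move(readers));
}

// Counts the inputs that would get the timestamp read-ahead; noise generator
// inputs are skipped because they carry no timestamp records.
std::size_t
count_read_ahead_inputs(const std::vector<workload_sketch_t> &workloads)
{
    std::size_t count = 0;
    for (const auto &workload : workloads) {
        for (const auto &input : workload) {
            assert(input.reader != nullptr);
            if (dynamic_cast<noise_reader_sketch_t *>(input.reader.get()) != nullptr)
                continue;
            ++count;
        }
    }
    return count;
}

// Builds one file workload plus one noise workload for demonstration.
std::vector<workload_sketch_t>
build_example_workloads(uint64_t num_noise_records)
{
    std::vector<workload_sketch_t> workloads;
    add_file_workload(workloads, /*tid=*/42);
    add_noise_workload(workloads, num_noise_records);
    return workloads;
}
```

The dynamic_cast check ties the generic input loop to one concrete reader type. A virtual query on the reader would be an alternative design, though the PR as shown uses the cast.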
4 changes: 4 additions & 0 deletions clients/drcachesim/scheduler/scheduler_impl.h
@@ -667,6 +667,10 @@ template <typename RecordType, typename ReaderType> class scheduler_impl_tmpl_t
std::unique_ptr<ReaderType>
get_default_reader();

// Creates a noise generator as a reader.
std::unique_ptr<ReaderType>
get_noise_generator(uint64_t num_records);

// Creates a reader for the specific file type at (non-directory) 'path'.
std::unique_ptr<ReaderType>
get_reader(const std::string &path, int verbosity);
@@ -0,0 +1,5 @@
Schedule stats tool results:
Total counts:
4 cores
2 threads: W.*, W.*
.*
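The expected-output template above uses wildcard patterns for the thread names, consistent with the commit note that the test cannot rely on thread numbers or order. As a rough illustration of how such a line tolerates different ids and orderings, the sketch below matches it with std::regex. This is not the test harness's actual matching code; `line_matches` is a hypothetical helper and the sample thread names are made up.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if the whole line matches the pattern (ECMAScript syntax).
bool
line_matches(const std::string &pattern, const std::string &line)
{
    return std::regex_match(line, std::regex(pattern));
}

// Usage: the pattern accepts any thread ids, in either order.
void
usage_example()
{
    const std::string pattern = "2 threads: W.*, W.*";
    assert(line_matches(pattern, "2 threads: W0.T111, W1.T222"));
    assert(line_matches(pattern, "2 threads: W1.T222, W0.T111"));
    assert(!line_matches(pattern, "3 threads: W0.T111, W1.T222, W2.T333"));
}
```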