Skip to content

Commit

Permalink
Enable processing defines from included headers
Browse files Browse the repository at this point in the history
This allows properly analyzing code which uses config headers for setting
various defines which influence the code. For example for platform specific
behavior.
Furthermore, we fix a bug in passing defines to our pre processor.
  • Loading branch information
martis42 committed Sep 17, 2023
1 parent 11be4fa commit 556b278
Show file tree
Hide file tree
Showing 26 changed files with 188 additions and 94 deletions.
16 changes: 9 additions & 7 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,8 @@
- [Implementation_deps](#Implementation_deps)
- [Known limitations](#known-limitations)
- [Applying automatic fixes](#applying-automatic-fixes)
- [Assumptions of use](#assumptions-of-use)
- [Supported Platforms](#supported-platforms)
- [Preconditions](#preconditions)
- [Alternatives to DWYU](#alternatives-to-dwyu)
- [Versioning](#versioning)
- [Contributing](#contributing)
Expand Down Expand Up @@ -169,11 +169,13 @@ Examples for this can be seen at the [implementation_deps test cases](test/aspec

## Known limitations

- Includes which are added through a preprocessor token are not recognized.
- Fundamental support for processing preprocessor defines is present.
However, if header _A_ specifies a define _X_ and is included in header _B_, header _B_ is not aware of _X_ from header _A_.
Right now only defines specified through Bazel (e.g. toolchain or `cc_*` target attributes) or defines specified inside a file itself are used to process a file and discover include statements.
We aim to resolve this limitation in a future release.
Includes which are added through a preprocessor token are not recognized.
For example the following won't be analyzed properly:

```cpp
#define INCLUDE_PATH "some/header.h"
#include INCLUDE_PATH
```
## Applying automatic fixes
Expand Down Expand Up @@ -212,7 +214,7 @@ Unfortunately, the tool cannot promise perfect results due to various constraint
But a downstream user of library _A_ might transitively depend on _X_.
Removing the unused dependency will break the build as the downstream dependency no longer finds dependency _X_.
# Preconditions
# Assumptions of use
##### The code has to be compilable
Expand Down
10 changes: 8 additions & 2 deletions src/analyze_includes/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -76,10 +76,16 @@ def main(args: Namespace) -> int:
implementation_deps=args.implementation_deps,
)
all_includes_from_public = get_relevant_includes_from_files(
files=args.public_files, ignored_includes=ignored_includes, defines=system_under_inspection.defines
files=args.public_files,
ignored_includes=ignored_includes,
defines=system_under_inspection.defines,
include_paths=system_under_inspection.include_paths,
)
all_includes_from_private = get_relevant_includes_from_files(
files=args.private_files, ignored_includes=ignored_includes, defines=system_under_inspection.defines
files=args.private_files,
ignored_includes=ignored_includes,
defines=system_under_inspection.defines,
include_paths=system_under_inspection.include_paths,
)

result = evaluate_includes(
Expand Down
47 changes: 30 additions & 17 deletions src/analyze_includes/parse_source.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,27 +4,38 @@
from pathlib import Path
from typing import List, Optional

from pcpp import Preprocessor
from pcpp.preprocessor import Action, OutputDirective, Preprocessor


class SimpleParsingPreprocessor(Preprocessor):
"""
This preprocessor configuration is used to prune commented code and to resolve preprocessor statements for defines
which are injected through Bazel. We do not resolve include statements. Meaning each file is analyzed only for
itself.
This preprocessor configuration is used to prune commented code and to resolve preprocessor statements. The main
points for us is resolving branching statements (e.g. '#ifdef') to analyze the correct code parts.
"""

def on_file_open(self, _, __):
def on_include_not_found(self, is_malformed, is_system_include, curdir, includepath):
"""
Raising here prevents include statements being resolved
We ignore missing include statements.
We have many tests regarding aggregating and processing include paths which are defined through Bazel in the
core DWYU logic for analyzing the include statements. Thus, we are confident all relevant include paths are
provided to the preprocessor.
We expect to fail finding system headers, as the standard library headers (aka the C++ Bazel toolchain) are
ignored by DWYU.
If a non toolchain header is missing we assume this is due to the header missing completely in the dependencies.
In other words the code does not even compile and thus is violating our assumptions of use, see the README.md.
"""
raise OSError("Do not open file")
raise OutputDirective(Action.IgnoreAndPassThrough)

def on_error(self, _, __, ___):
"""
Since unresolved include statements cause errors we silence error reporting
"""
pass

def make_pre_processor() -> SimpleParsingPreprocessor:
"""
We can't overwrite member values from the base class in the ctor of our derived class and thus set them after
construction.
"""
processor = SimpleParsingPreprocessor()
processor.passthru_includes = re.compile(".*")
return processor


@dataclass
Expand Down Expand Up @@ -65,16 +76,18 @@ def is_ignored(self, include: str) -> bool:
return is_ignored_path or is_ignored_pattern


def get_includes_from_file(file: Path, defines: List[str]) -> List[Include]:
def get_includes_from_file(file: Path, defines: List[str], include_paths: List[str]) -> List[Include]:
"""
Parse a C/C++ file and extract include statements which are neither commented nor disabled through a define.
"""
with open(file, encoding="utf-8") as fin:
pre_processor = SimpleParsingPreprocessor()
pre_processor = make_pre_processor()
for define in defines:
pre_processor.define(define)
pre_processor.parse(fin.read())
for path in include_paths:
pre_processor.add_path(path)

pre_processor.parse(fin.read())
output_sink = StringIO()
pre_processor.write(output_sink)

Expand All @@ -94,11 +107,11 @@ def filter_includes(includes: List[Include], ignored_includes: IgnoredIncludes)


def get_relevant_includes_from_files(
files: Optional[List[str]], ignored_includes: IgnoredIncludes, defines: List[str]
files: Optional[List[str]], ignored_includes: IgnoredIncludes, defines: List[str], include_paths: List[str]
) -> List[Include]:
all_includes = []
if files:
for file in files:
includes = get_includes_from_file(file=Path(file), defines=defines)
includes = get_includes_from_file(file=Path(file), defines=defines, include_paths=include_paths)
all_includes.extend(includes)
return filter_includes(includes=all_includes, ignored_includes=ignored_includes)
10 changes: 9 additions & 1 deletion src/analyze_includes/system_under_inspection.py
Original file line number Diff line number Diff line change
Expand Up @@ -107,6 +107,14 @@ def replace_dot(paths: List[str]) -> List[str]:
)


def _get_defines(target_info: Dict[str, Any]) -> List[str]:
"""
Defines with values in BUILD files or the compiler CLI can be specified via '<DEFINE_TOKEN>=<VALUE>'. However, this
syntax is not valid for the preprocessor, which expects '<DEFINE_TOKEN> <VALUE>'.
"""
return [define.replace("=", " ") for define in target_info["defines"]]


def get_system_under_inspection(
target_under_inspection: Path, deps: List[Path], implementation_deps: List[Path]
) -> SystemUnderInspection:
Expand All @@ -117,5 +125,5 @@ def get_system_under_inspection(
deps=_cc_targets_from_deps(deps),
implementation_deps=_cc_targets_from_deps(implementation_deps),
include_paths=_get_include_paths(target_info),
defines=target_info["defines"],
defines=_get_defines(target_info),
)
2 changes: 2 additions & 0 deletions src/analyze_includes/test/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,9 @@ py_test(
"data/commented_includes/single_line_comments.h",
"data/empty_header.h",
"data/header_with_defines.h",
"data/some_defines.h",
"data/some_header.h",
"data/use_defines.h",
],
deps = ["//src/analyze_includes:lib"],
)
Expand Down
2 changes: 2 additions & 0 deletions src/analyze_includes/test/data/some_defines.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
#define MY_DEFINE
#define THE_ANSWER 42
13 changes: 13 additions & 0 deletions src/analyze_includes/test/data/use_defines.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
#include "src/analyze_includes/test/data/some_defines.h"

#ifdef MY_DEFINE
#include "expected/include_a.h"
#else
#include "bad/include_a.h"
#endif

#if THE_ANSWER > 40
#include "expected/include_b.h"
#else
#include "bad/include_b.h"
#endif
24 changes: 17 additions & 7 deletions src/analyze_includes/test/parse_source_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -100,19 +100,19 @@ def test_filter_includes_for_patterns(self):
class TestGetIncludesFromFile(unittest.TestCase):
def test_empty_header(self):
test_file = Path("src/analyze_includes/test/data/empty_header.h")
result = get_includes_from_file(test_file, defines=[])
result = get_includes_from_file(test_file, defines=[], include_paths=[])

self.assertEqual(result, [])

def test_single_include(self):
test_file = Path("src/analyze_includes/test/data/another_header.h")
result = get_includes_from_file(test_file, defines=[])
result = get_includes_from_file(test_file, defines=[], include_paths=[])

self.assertEqual(result, [Include(file=test_file, include="foo/bar.h")])

def test_multiple_includes(self):
test_file = Path("src/analyze_includes/test/data/some_header.h")
result = get_includes_from_file(test_file, defines=[])
result = get_includes_from_file(test_file, defines=[], include_paths=[])

self.assertEqual(len(result), 4)
self.assertTrue(Include(file=test_file, include="bar.h") in result)
Expand All @@ -121,15 +121,15 @@ def test_multiple_includes(self):

def test_commented_includes_single_line_comments(self):
test_file = Path("src/analyze_includes/test/data/commented_includes/single_line_comments.h")
result = get_includes_from_file(test_file, defines=[])
result = get_includes_from_file(test_file, defines=[], include_paths=[])

self.assertEqual(len(result), 2)
self.assertTrue(Include(file=test_file, include="active_a.h") in result)
self.assertTrue(Include(file=test_file, include="active_b.h") in result)

def test_commented_includes_block_comments(self):
test_file = Path("src/analyze_includes/test/data/commented_includes/block_comments.h")
result = get_includes_from_file(test_file, defines=[])
result = get_includes_from_file(test_file, defines=[], include_paths=[])

self.assertEqual(len(result), 8)
self.assertTrue(Include(file=test_file, include="active_a.h") in result)
Expand All @@ -143,27 +143,37 @@ def test_commented_includes_block_comments(self):

def test_commented_includes_mixed_style(self):
test_file = Path("src/analyze_includes/test/data/commented_includes/mixed_style.h")
result = get_includes_from_file(test_file, defines=[])
result = get_includes_from_file(test_file, defines=[], include_paths=[])

self.assertEqual(result, [Include(file=test_file, include="active.h")])

def test_includes_selected_through_defines(self):
test_file = Path("src/analyze_includes/test/data/header_with_defines.h")
result = get_includes_from_file(test_file, defines=["FOO", "BAZ 42"])
result = get_includes_from_file(test_file, defines=["FOO", "BAZ 42"], include_paths=[])

self.assertEqual(len(result), 4)
self.assertTrue(Include(file=test_file, include="has_internal.h") in result)
self.assertTrue(Include(file=test_file, include="has_foo.h") in result)
self.assertTrue(Include(file=test_file, include="no_bar.h") in result)
self.assertTrue(Include(file=test_file, include="baz_greater_40.h") in result)

def test_includes_selected_through_defines_from_header(self):
test_file = Path("src/analyze_includes/test/data/use_defines.h")
result = get_includes_from_file(test_file, defines=[], include_paths=[])

self.assertEqual(len(result), 3)
self.assertTrue(Include(file=test_file, include="src/analyze_includes/test/data/some_defines.h") in result)
self.assertTrue(Include(file=test_file, include="expected/include_a.h") in result)
self.assertTrue(Include(file=test_file, include="expected/include_b.h") in result)


class TestGetRelevantIncludesFromFiles(unittest.TestCase):
def test_get_relevant_includes_from_files(self):
result = get_relevant_includes_from_files(
files=["src/analyze_includes/test/data/some_header.h", "src/analyze_includes/test/data/another_header.h"],
ignored_includes=IgnoredIncludes(paths=["vector"], patterns=[]),
defines=[],
include_paths=[],
)

self.assertEqual(len(result), 4)
Expand Down
2 changes: 1 addition & 1 deletion src/analyze_includes/test/system_under_inspection_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -113,7 +113,7 @@ def test_load_full_file(self):
self.assertEqual(sui.implementation_deps[1].usage.usage, UsageStatus.NONE)

self.assertEqual(sui.include_paths, ["", "some/dir", "another/dir"])
self.assertEqual(sui.defines, ["SOME_DEFINE", "ANOTHER_DEFINE=42"])
self.assertEqual(sui.defines, ["SOME_DEFINE", "ANOTHER_DEFINE 42"])

def test_load_empty_file(self):
sui = get_system_under_inspection(
Expand Down
4 changes: 3 additions & 1 deletion src/aspect/dwyu.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -187,9 +187,11 @@ def dwyu_aspect_impl(target, ctx):
if _do_ensure_private_deps(ctx):
args.add("--implementation_deps_available")

all_hdrs = target[CcInfo].compilation_context.headers.to_list()
analysis_inputs = [ctx.file._config, processed_target] + processed_deps + processed_implementation_deps + private_files + all_hdrs
ctx.actions.run(
executable = ctx.executable._dwyu_binary,
inputs = [ctx.file._config, processed_target] + public_files + private_files + processed_deps + processed_implementation_deps,
inputs = analysis_inputs,
outputs = [report_file],
mnemonic = "CompareIncludesToDependencies",
progress_message = "Analyze dependencies of {}".format(target.label),
Expand Down
15 changes: 12 additions & 3 deletions test/aspect/defines/BUILD
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
cc_library(
name = "in_file_defines",
hdrs = ["in_file_defines.h"],
deps = ["//test/aspect/defines/lib:a"],
deps = ["//test/aspect/defines/support:lib_a"],
)

cc_library(
Expand All @@ -10,14 +10,23 @@ cc_library(
copts = ["-DSOME_COPT 42"],
defines = ["SOME_DEFINE"],
local_defines = ["LOCAL_DEFINE"],
deps = ["//test/aspect/defines/lib:a"],
deps = ["//test/aspect/defines/support:lib_a"],
)

cc_library(
name = "transitive_defines_from_bazel_target",
hdrs = ["transitive_defines_from_bazel_target.h"],
deps = [
"//test/aspect/defines/lib:a",
"//test/aspect/defines/support:lib_a",
"//test/aspect/defines/support:transitive_define",
],
)

cc_library(
name = "use_defines_from_dependency_header",
hdrs = ["use_defines_from_dependency_header.h"],
deps = [
"//test/aspect/defines/support:conditional_defines",
"//test/aspect/defines/support:lib_b",
],
)
11 changes: 1 addition & 10 deletions test/aspect/defines/README.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,2 @@
Defines can influence which include statements are relevant.

These tests concentrate on parsing single files based on defines:

- specified in the parsed file itself
- coming from the C/C++ toolchain
- defined by the Bazel target attributes `defines`, `local_defines` or `cops`

Defines can also be imported into a file via an included header which specifies a define.
This use case is not yet supported.
We might add it at a later stage.
In these test we ensure our preprocessor behaves as expected and chooses the desired code parts for analysis.
12 changes: 6 additions & 6 deletions test/aspect/defines/defines_from_bazel_target.h
Original file line number Diff line number Diff line change
@@ -1,17 +1,17 @@
#ifdef SOME_DEFINE
#include "test/aspect/defines/lib/a.h"
#include "test/aspect/defines/support/a.h"
#else
#include "test/aspect/defines/lib/b.h"
#include "test/aspect/defines/support/b.h"
#endif

#ifdef LOCAL_DEFINE
#include "test/aspect/defines/lib/a.h"
#include "test/aspect/defines/support/a.h"
#else
#include "test/aspect/defines/lib/b.h"
#include "test/aspect/defines/support/b.h"
#endif

#if SOME_COPT > 40
#include "test/aspect/defines/lib/a.h"
#include "test/aspect/defines/support/a.h"
#else
#include "test/aspect/defines/lib/b.h"
#include "test/aspect/defines/support/b.h"
#endif
10 changes: 5 additions & 5 deletions test/aspect/defines/in_file_defines.h
Original file line number Diff line number Diff line change
@@ -1,19 +1,19 @@
#define USE_A

#ifdef USE_A
#include "test/aspect/defines/lib/a.h"
#include "test/aspect/defines/support/a.h"
#else
#include "test/aspect/defines/lib/b.h"
#include "test/aspect/defines/support/b.h"
#endif

#ifdef NON_EXISTING_DEFINE
#include "test/aspect/defines/lib/b.h"
#include "test/aspect/defines/support/b.h"
#endif

#define SOME_VALUE 42

#if SOME_VALUE > 40
#include "test/aspect/defines/lib/a.h"
#include "test/aspect/defines/support/a.h"
#else
#include "test/aspect/defines/lib/b.h"
#include "test/aspect/defines/support/b.h"
#endif
Loading

0 comments on commit 556b278

Please sign in to comment.