I created this personal Semgrep server to learn Rust. It is suitable for local deployment for folks who cannot use the Semgrep SaaS App because of custom Semgrep rules and proprietary code.
- Unlimited local policies: A policy is a collection of rules.
- Serve rules and policies to the Semgrep CLI app over HTTP.
It was inspired by wahyuhadi/semgrep-server-rules.
I will try to keep the main branch usable. The dev branch is used for development.
$ git clone https://github.com/parsiya/personal-semgrep-server
$ cd personal-semgrep-server
$ git submodule update --init --recursive
$ cargo build
$ ./target/debug/personal-semgrep-server -r tests/rules/ -p tests/policies/
# run all rules against your code
$ semgrep --config http://localhost:9090/c/p/all path/to/code
Note: Passing a policy path with -p is optional. The only mandatory option is -r, which points to the location of the rules. Without -p, the server will only serve individual rules and the all policy/rule.
Run the server like this:
./personal-semgrep-server -r path/to/rules/ -p path/to/policies/
Then navigate to http://localhost:9090. The landing page has a link to every rule and policy indexed by the server. Clicking on each link will show you the complete YAML file. This server uses the same path structure as the Semgrep App.
- Policy URL: /c/p/{policyid}
- Rule URL: /c/r/{ruleid}
Pass these URLs directly to the Semgrep CLI app.
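As a rough illustration of the path structure above, here is a minimal Rust sketch that builds these URLs and the matching Semgrep CLI arguments. The helper names (`policy_url`, `rule_url`, `semgrep_args`) and the `BASE_URL` constant are made up for this example and are not part of the server or of semgrep-rs.

```rust
/// Address used in the examples in this README (assumed default for the sketch).
const BASE_URL: &str = "http://localhost:9090";

/// Builds a policy URL of the form {base}/c/p/{policyid}.
fn policy_url(base: &str, policy_id: &str) -> String {
    format!("{}/c/p/{}", base.trim_end_matches('/'), policy_id)
}

/// Builds a rule URL of the form {base}/c/r/{ruleid}.
fn rule_url(base: &str, rule_id: &str) -> String {
    format!("{}/c/r/{}", base.trim_end_matches('/'), rule_id)
}

/// Arguments for `semgrep --config <url> <target>`.
fn semgrep_args(config_url: &str, target: &str) -> Vec<String> {
    vec![
        "--config".to_string(),
        config_url.to_string(),
        target.to_string(),
    ]
}
```

For instance, `semgrep_args(&policy_url(BASE_URL, "all"), "path/to/code")` yields the same arguments as the `semgrep --config http://localhost:9090/c/p/all path/to/code` command shown earlier.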
Policies are collections of rules. A local policy is a YAML file like this:
name: policy-name # this should be unique
rules:
- ruleID-1
- ruleID-2
- arrays-out-of-bounds-access
- potentially-uninitialized-pointer
- snprintf-insecure-use
Create as many as you want. After passing the path to the server, it will search for all .yaml and .yml files in that path recursively. This allows you to store your policies in subdirectories for better organization:
tests
└── policies
    ├── cpp
    |   ├── cpp-policy1.yaml
    |   └── cpp-policy2.yaml
    └── rust
        ├── rust-policy1.yaml
        └── rust-policy2.yaml
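The recursive search for .yaml and .yml files could look something like the sketch below. It only uses the standard library and is an assumption about the general approach, not the actual code in the server or semgrep-rs; `find_policy_files` and `has_yaml_extension` are hypothetical names.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every .yaml and .yml file under `dir`.
fn find_policy_files(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories such as cpp/ and rust/.
            found.extend(find_policy_files(&path)?);
        } else if has_yaml_extension(&path) {
            found.push(path);
        }
    }
    // Sort so the result does not depend on directory iteration order.
    found.sort();
    Ok(found)
}

/// Returns true when the file extension is exactly "yaml" or "yml".
fn has_yaml_extension(path: &Path) -> bool {
    matches!(
        path.extension().and_then(|ext| ext.to_str()),
        Some("yaml") | Some("yml")
    )
}
```

Calling `find_policy_files(Path::new("tests/policies"))` on the layout above would return the four policy files, regardless of which subdirectory they live in.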
Note: Policy names must be unique. If you have duplicate policy names, one will be overwritten by another.
The semgrep-rs library creates a built-in policy and rule named all, even if you do not pass a policy path. The all rule/policy contains every rule indexed by the server. It's useful when you want to run all rules against a code base.
If you have a custom rule or policy named all, it will be overwritten.
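To make the overwrite behavior concrete, here is a small sketch of a policy index keyed by name. It assumes a plain `HashMap` where the last policy loaded wins and the built-in all entry is inserted last; the real semgrep-rs data structures and load order may differ, and `build_policy_index` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Name of the built-in policy that contains every indexed rule.
const ALL: &str = "all";

/// Builds a name -> rule IDs index from (policy name, rule IDs) pairs.
fn build_policy_index(
    policies: Vec<(String, Vec<String>)>,
    all_rule_ids: &[String],
) -> HashMap<String, Vec<String>> {
    let mut index = HashMap::new();
    for (name, rule_ids) in policies {
        // `insert` replaces any earlier policy that used the same name.
        index.insert(name, rule_ids);
    }
    // The built-in policy goes in last, so a custom "all" is replaced.
    index.insert(ALL.to_string(), all_rule_ids.to_vec());
    index
}
```

With this layout, two policy files that both declare the same name leave a single entry in the index, which is why unique names matter.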
The Semgrep CLI app only runs specific rules against a file based on its extension, so don't shy away from throwing the kitchen sink at your code with all. See Language extensions and tags in the Semgrep documentation.
Similar to policy names, rule IDs must also be unique. The Semgrep SaaS App uses complete rule IDs that are based on the path to avoid collisions. To create a complete rule ID, replace the path separator (/ or \) with a period (.), then append the rule's internal ID (the value of the id key in the rule file).
For example, the complete rule ID for a rule with id: double-free in the rules/c/lang/security/double-free.c file is: rules.c.lang.security.double-free.double-free.
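The construction described above can be sketched in a few lines of Rust. This is not the semgrep-rs implementation; `complete_rule_id` is a hypothetical helper, and dropping the file extension is an assumption inferred from the double-free example.

```rust
/// Builds a complete rule ID from a rule file path and the rule's internal ID.
fn complete_rule_id(rule_path: &str, internal_id: &str) -> String {
    // Find where the file name starts (after the last / or \).
    let file_start = rule_path
        .rfind(|c: char| c == '/' || c == '\\')
        .map_or(0, |i| i + 1);

    // Assumption: the file extension is not part of the ID, so cut it off.
    let stem_end = match rule_path[file_start..].rfind('.') {
        Some(i) if i > 0 => file_start + i,
        _ => rule_path.len(),
    };

    // Replace path separators with periods.
    let dotted = rule_path[..stem_end]
        .split(|c: char| c == '/' || c == '\\')
        .filter(|part| !part.is_empty())
        .collect::<Vec<_>>()
        .join(".");

    // Append the value of the `id` key from the rule file.
    format!("{}.{}", dotted, internal_id)
}
```

Note that the result changes with the path you pass in: the same rule reached through a different root directory gets a different complete ID.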
My underlying library, semgrep-rs, supports creating complete rule IDs, but I have not added this to the current iteration of the server because:
- You have to include the complete rule ID in the policy file.
- The rule ID will be dependent on the path passed to the server.
I can change this if we can come up with a solution to get consistent rule IDs and a way to write policies automatically.
lol wut?! Only run it on localhost and don't expose this to the internet.
The Semgrep SaaS App is awesome and you should use it (and buy it) if you can. But my custom rules and code had to stay local, so I had to create a directory structure for rules to simulate policies.
Another issue was the lack of local policies. For example, to run all C++ rules against a target, your only realistic option is to store all rules in a directory named cpp and pass it to the Semgrep CLI with --config path/to/cpp/.
If you want to run a specific set of C++ rules, you can either copy/paste the rule files somewhere else or use the --exclude-rule command line switch a bajillion times. Now if you modify a rule (it's a good idea to keep them in a git repository), you have to manually update all copies.
Another issue is the directory structure of the Semgrep Rules on GitHub. It doesn't work for me. See "Audit Shouldn't be Under Security in the Semgrep Rules Repository".
This server and local policies solve all of these problems for me. I can keep a single copy of my custom rules in a git repository with the Semgrep Rules repository as a git submodule and have custom policies to mix and match rules. The policies are also in the same repository.
I'd like to keep this server as simple as possible. I don't want to create a Semgrep App competitor. The only things I would add are a simple UI similar to the Semgrep Playground, so people can run it locally for proprietary rules/code, and maybe some simple commands for filtering rules and creating policies (e.g., create a policy from specific keys in the rules' metadata).
Rust likes dual-licensing so here we go.
Licensed under either of Apache License, Version 2.0 or MIT license.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this repository by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.