What we look at

Licensee works by detecting a project's license file and comparing its contents to a short list of known licenses.

Detecting the license file

Licensee uses a series of regular expressions to score files in the project's root as potential license files. Here are a few examples of files that would be detected:

  • LICENSE
  • LICENCE
  • license.md
  • COPYING.txt
  • LICENSE-MIT
  • COPYRIGHT
  • UNLICENSE
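
As a rough illustration of regex-based scoring, here is a minimal Python sketch that happens to accept the filenames listed above. The patterns, the scores, and the function names (`score_filename`, `best_license_file`) are hypothetical and chosen only for this example; they are not Licensee's actual expressions or weights.

```python
import re

# Hypothetical (pattern, score) pairs, ordered from most to least specific.
# Illustrative only: these are not Licensee's real expressions or weights.
CANDIDATE_PATTERNS = [
    (re.compile(r"^(un)?licen[sc]e$", re.IGNORECASE), 1.0),               # LICENSE, LICENCE, UNLICENSE
    (re.compile(r"^(un)?licen[sc]e\.(md|txt)$", re.IGNORECASE), 0.9),     # license.md
    (re.compile(r"^copy(ing|right)(\.(md|txt))?$", re.IGNORECASE), 0.8),  # COPYING.txt, COPYRIGHT
    (re.compile(r"^licen[sc]e[-_.]\w+$", re.IGNORECASE), 0.7),            # LICENSE-MIT
]


def score_filename(name: str) -> float:
    """Return the score of the first matching pattern, or 0.0 if none match."""
    for pattern, score in CANDIDATE_PATTERNS:
        if pattern.match(name):
            return score
    return 0.0


def best_license_file(filenames):
    """Pick the highest-scoring candidate from a directory listing, if any."""
    scored = [(score_filename(name), name) for name in filenames]
    score, name = max(scored, default=(0.0, None))
    return name if score > 0 else None
```

Scoring, rather than a plain yes/no match, lets a bare `LICENSE` win over a less conventional name such as `LICENSE-MIT` when a project's root contains both.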

Known licenses

Licensee relies on the crowdsourced license content and metadata from choosealicense.com.
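
To make "comparing the contents to a short list of known licenses" concrete, here is a minimal sketch of one possible approach: reduce each text to a set of words and pick the known license with the highest overlap. The similarity measure (a Sørensen-Dice coefficient over word sets), the 0.9 threshold, and the function names are assumptions made for this example; this document does not describe how Licensee actually performs the comparison.

```python
import re


def normalize(text: str) -> set:
    """Lowercase the text and reduce it to a set of words, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def dice_similarity(a: set, b: set) -> float:
    """Sørensen-Dice coefficient of two word sets, between 0.0 and 1.0."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def match_license(file_text, known_licenses, threshold=0.9):
    """Return the key of the most similar known license, or None below the threshold.

    known_licenses maps a license key to its full text (hypothetical structure).
    """
    words = normalize(file_text)
    best_key, best_score = None, 0.0
    for key, template in known_licenses.items():
        score = dice_similarity(words, normalize(template))
        if score > best_score:
            best_key, best_score = key, score
    return best_key if best_score >= threshold else None
```

A threshold below 1.0 tolerates small differences, such as a filled-in copyright line, while still refusing to guess when the file resembles none of the known texts.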

What it doesn't look at

  • The licensing of a project's dependencies
  • References to licenses in README, README.md, etc.
  • Every single possible license (just the most popular ones)
  • Compliance - If you're looking for this, take a look at LicenseFinder.

Huh? Why don't you look at X?

Because reasons.

Why not just look at the "license" field of [insert package manager here]?

Because it's not legally binding. A license is a legal contract. You give up certain rights (e.g., the right to sue the author) in exchange for the right to use the software.

Most popular licenses today require that the license itself be distributed alongside the software. Simply putting the letters "MIT" or "GPL" in a configuration file doesn't really meet that requirement. Those files are designed to be read by computers (which can't enter into contracts), not humans (who can). It's great metadata, but that's about it.

From a practical standpoint, every language has its own package manager (some even have multiple). That means that if you want to detect the license of an arbitrary project, you'll have to implement hundreds of package-manager-specific detection strategies. The LICENSE file is a platform-agnostic and unambiguous way to communicate license intention.

What about looking to see if the author said something in the readme?

You could make an argument that, when linked or sufficiently identified, the terms of the license are incorporated by reference, or at least that the author's intent is there. There are a handful of reasons why this isn't ideal. For one, if you're using the MIT or BSD (ISC) license, along with a few others, there's templated language, like the copyright notice, which would go unfilled.
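
To illustrate the unfilled copyright notice, here is a small sketch that reports template fields still left in a license text. It assumes bracketed placeholders such as `[year]` and `[fullname]`; the field names and the `unfilled_fields` helper are assumptions for this example, not part of Licensee.

```python
import re

# Bracketed template fields; the exact field names are an assumption.
PLACEHOLDER = re.compile(r"\[(year|fullname)\]", re.IGNORECASE)


def unfilled_fields(license_text: str) -> list:
    """Return the sorted, de-duplicated placeholder names still present in the text."""
    return sorted({name.lower() for name in PLACEHOLDER.findall(license_text)})
```

A license that is only named in a README never has these fields filled in, whereas a checked-in LICENSE file normally carries the real year and copyright holder.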

What about checking every single file for a copyright header?

Because that's silly in the context of how software is developed today. You wouldn't put a copyright notice on each page of a book. Besides, it's a lot of work, as there's no standardized, cross-platform way to describe a project's license within a comment.

Checking the actual text into version control is definitive, so that's what this project looks at.