Re: Standalone license tools for scanning debian/ubuntu apps?
On Tue, Feb 5, 2019 at 1:30 PM Jeremiah C. Foster <firstname.lastname@example.org> wrote:
> If I'm not mistaken, copyright has to be a string because it has to be legible by humans. This means you can likely grep through source code as scancode does with a fair degree of confidence and use 'strings' on binaries.

Yes, absolutely.
SPDX's set of standard licenses and ids (and scancode's somewhat
expanded similar set) are great for stating license info succinctly.
scancode is great at collecting the info that should go into the
debian copyright file.
My goal for this iteration of our licensing process was to automate
collection of license info for the shared libraries our binary uses.
Here's the pipeline I set up to do that:
1) https://github.com/Oblong/obs/blob/master/ob-filter-licenses reads
a DEP-5 (aka Debian copyright) file and filters out any clauses that
(most likely) do not propagate to shared library artifacts
2) https://github.com/Oblong/obs/blob/master/ob-parse-licenses reads a
Debian copyright file, filters it through ob-filter-licenses, and
outputs spdx ids. (For non-DEP-5 copyright files, it uses scancode to
3) https://github.com/Oblong/obs/blob/master/ob-list-licenses uses ldd
to look up shared libraries used by a binary, uses dpkg-query to look
up the containing packages, and runs ob-parse-licenses on them.
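
For readers who want to see the shape of step 3 without reading the scripts, here is a minimal Python sketch of the same idea. It is not the code behind the ob-* tools. The function names (parse_ldd_output, parse_dpkg_search, dep5_licenses, list_licenses) are hypothetical. It assumes the usual "name => /path (0x...)" layout of ldd output, the "package:arch: /path" layout of "dpkg-query -S" output, and a copyright file at /usr/share/doc/<package>/copyright. It skips the filtering from step 1 and the mapping to SPDX ids from step 2, and simply reports the raw License: short names.

```python
import re
import subprocess


def parse_ldd_output(text):
    """Return the resolved library paths found in `ldd` output."""
    paths = []
    for line in text.splitlines():
        # Assumed layout: "libpam.so.0 => /lib/.../libpam.so.0 (0x...)".
        # Lines without "=> /path" (vdso, the loader, "not found") are skipped.
        match = re.search(r"=>\s+(/\S+)", line)
        if match:
            paths.append(match.group(1))
    return paths


def parse_dpkg_search(line):
    """Return the package name from one `dpkg-query -S` output line."""
    # Assumed layout: "libpam0g:amd64: /lib/x86_64-linux-gnu/libpam.so.0".
    owner = line.split(": ", 1)[0]
    return owner.split(":", 1)[0]  # drop the ":amd64" style qualifier


def dep5_licenses(copyright_text):
    """Collect the short names on `License:` lines of a DEP-5 file."""
    found = set()
    for line in copyright_text.splitlines():
        if line.startswith("License:"):
            name = line[len("License:"):].strip()
            if name:
                found.add(name)
    return sorted(found)


def list_licenses(binary):
    """Map each package providing a shared library of `binary` to its licenses."""
    ldd = subprocess.run(["ldd", binary], capture_output=True, text=True, check=True)
    result = {}
    for lib in parse_ldd_output(ldd.stdout):
        query = subprocess.run(["dpkg-query", "-S", lib],
                               capture_output=True, text=True)
        if query.returncode != 0 or not query.stdout.strip():
            continue  # path not known to dpkg (e.g. a /lib vs /usr/lib alias)
        pkg = parse_dpkg_search(query.stdout.splitlines()[0])
        try:
            with open(f"/usr/share/doc/{pkg}/copyright",
                      encoding="utf-8", errors="replace") as handle:
                result[pkg] = dep5_licenses(handle.read())
        except OSError:
            result[pkg] = []  # no readable copyright file for this package
    return result
```

Calling list_licenses("/bin/login") on a Debian or Ubuntu system would return a dict keyed by package name. A package whose copyright file is not in DEP-5 format will typically come back with an empty list, which is the case where the real pipeline falls back to scancode.
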
For instance, running "ob-list-licenses /bin/login" outputs:
This of course only solves a small part of the license / copyright
problem, and only approximately, but it found interesting things for