Frequently Asked Questions
General Project FAQs
What is FOSSology?
Q: What is the FOSSology Project all about?
A: The FOSSology Project is a Free and Open Source Software (FOSS) project built around an open, modular architecture for analyzing software. Existing modules include license analysis, metadata extraction, and MIME type identification. The tool analyzes a given set of software packages and reports items such as the software licenses used by those packages.
Rather than simply reporting "Package X uses license Y," the FOSSology tool attempts to analyze every file within the package to determine its license. The license report is thus an aggregate of all the licenses found in a package. A single package may be labeled as "GPL" but contain files that use other licenses (BSD, OSL, or any of hundreds of others). Even if the exact license is unknown, it may still be identifiable by common license phrases.
Digging deeper, the FOSSology project is intended as a general-purpose data mining tool. It can be extended by adding new Agents to analyze all sorts of meta information about Free and Open Source Software -- not just licenses, but code re-use, security alerts, bug fixes and patches, project information, usage statistics -- just about anything you could imagine!
All of the software packages analyzed by FOSSology are maintained in its internal Software Repository, and the information collected by FOSSology (such as the license analyses) is maintained in its internal Database.
Where did FOSSology come from?
Q: Where did the FOSSology tool come from? Why would somebody create this tool? Who are you and what do you get out of this?
A: The FOSSology Project started as an internal software development effort within Hewlett Packard's Open Source and Linux Organization. The tool evolved over several years at HP from a few simple shell scripts to the much more comprehensive tool you see today.
HP needed a way to quickly and accurately evaluate open source software that was being proposed for use within the company as well as software that was being considered for distribution on its own or as part of an HP product or service. These tools were developed to meet this need, alerting developers and project managers to conflicts in licensing terms, potential pitfalls in the combination of various software packages, or problems with integrating "home grown" code with existing open source software.
In time, HP came to realize that this tool would be of far greater value, both to itself and to the community, if it were made available on a broader basis. Thus the decision was made to open source the tool and promote its use within open source communities to help ease the confusion and uncertainty around licensing questions.
Is FOSSology free?
Q: Is FOSSology free? Is it open source? How is it licensed?
A: The FOSSology Project is free and open source software. It is available under the terms of the GNU General Public License (GPL) version 2. The documentation for the project is available under the terms of the GNU Free Documentation License (FDL). For more information, see our License page.
Where can I get the FOSSology Project source code?
Q: Where can I get the FOSSology Project source code?
A: The FOSSology project source code is available from our project's Subversion repository at SourceForge.net, and tarballs of all released versions of the project are available from our Releases area. You can find links to the source code and packages at our Download page.
How can I troubleshoot my FOSSology system?
Q: I'm having trouble with my FOSSology system. Where do I turn for help?
A: There are many resources to help you out.
- First, take a look through the FOSSology Troubleshooting guide for some common problems and how to fix them.
- Next, you can send an email to the FOSSology mailing list, join our live public IRC channel, or both. Information on accessing either is available at Contact Us.
- If you want to dig much deeper, the best place to start is the FOSSology Developer Documentation which will walk you through the system architecture, components, low-level operations, and gory details.
What platforms are supported?
Q: What platforms are supported for running FOSSology?
A: Currently we support FOSSology on most GNU/Linux platforms. Most of our development and testing right now occurs on Debian 5.0 (Lenny) but the tools should build and run just fine on any Linux system as long as the dependencies are met. Refer to the System Administration Documentation for more details.
What software is required?
Q: What software is required?
A: FOSSology consists of three components: user interface, database, and agents (used to analyze the code).
- The user interface is managed by a web server, so Apache 2.x with PHP5 support is required.
- The database stores information about packages, files, jobs, and everything else. Postgres 8.3 or higher is required.
- The agents consist of a scheduler and all of the analysis agents. The scheduler and agents are provided by FOSSology. However, many of the agents have external dependencies on other software packages:
- Libraries: The libraries used by FOSSology and its agents include:
  * libmagic - for determining file types, from the "file" software: ftp://ftp.astron.com/pub/file/file-4.02.tar.gz
  * libxml2 - GNOME XML library: http://gnome.org
  * libextractor - GNU file meta-data extractor: http://www.gnunet.org/libextractor/
- External Commands: FOSSology also requires several external tools, primarily for unpacking a variety of compression and archive file formats. These include:
  * ar - for extracting archives, from the binutils software: http://www.gnu.org/software/binutils/
  * bzcat - bz2 decompressor, from the bzip2 software: http://www.bzip.org/
  * cabextract - extractor for Microsoft Cabinet files: http://www.kyz.uklinux.net/cabextract.php
  * cpio - for extracting cpio archives: http://www.gnu.org/software/cpio/
  * icat and fls - forensics tools from the sleuthkit software: http://sourceforge.net/projects/sleuthkit/
  * isoinfo - reads metadata from ISO9660 images, from the mkisofs/cdrtools/cdrkit implementations (cdrkit implementation: http://debburn.alioth.debian.org/)
  * pdftotext - from the xpdf software: http://www.foolabs.com/xpdf
  * rpm and rpm2cpio - for extracting software and metadata from rpm packages: http://www.rpm.org/
  * tar - for extracting tar (tape archive) files: http://www.gnu.org/software/tar/
  * upx-ucl - an executable compressor/decompressor: http://upx.sourceforge.net
  * unrar-free - unarchiver for .rar files: https://gna.org/projects/unrar/
  * unzip - de-archiver for .zip files: ftp://ftp.info-zip.org/pub/infozip/src/
  * wget - version 1.10 or later (should be installed by default on newer Linux systems)
  * zcat - for uncompressing .gz and .Z files, from the gzip software: http://www.gzip.org/
For more details on FOSSology dependencies and installation, please refer to System Administration Documentation.
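As a quick sanity check before installing, the external commands listed above can be looked up on the current PATH. The following is a minimal sketch, not part of FOSSology itself: the helper name find_missing is hypothetical, and the command names are copied from the list above on the assumption that the binaries carry the same names on your distribution (this is worth verifying, since package and binary names sometimes differ).

```python
import shutil

# Command names copied from the dependency list above. Assumption: the
# installed binaries use these exact names on your distribution.
REQUIRED_COMMANDS = [
    "ar", "bzcat", "cabextract", "cpio", "icat", "fls", "isoinfo",
    "pdftotext", "rpm", "rpm2cpio", "tar", "upx-ucl", "unrar-free",
    "unzip", "wget", "zcat",
]


def find_missing(commands, which=shutil.which):
    """Return the commands that cannot be found on the current PATH."""
    return [name for name in commands if which(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_COMMANDS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed commands were found on PATH.")
```

This only checks that a command exists; it does not check versions (for example, the wget 1.10 requirement) or the libraries.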
How accurate are the results?
Q: How accurate are the license analysis, metadata extraction, and results from other agents?
A: Each agent uses heuristics to identify file properties and determine analysis results. Because tasks such as "license identification" are complex problems, the analysis is usually correct but is not guaranteed to be correct.
For example, if a file contains the GPLv2 license, then it will likely be identified as containing the correct license. However, if someone creates a derivative license, such as by modifying the GPLv2 text, then the license should be identified as a GPLv2 license with a lower percentage of match. In open source projects, it is common to see custom licenses that are derived from other licenses. (There are also Frankenstein licenses, where a project's license is made by combining parts from different licenses.)
Similarly, if the unpack agent does not know how to unpack a file, then licenses inside the file may not be analyzed.
In general, the analysis results are very good guesses, but they should not be considered authoritative. (Or to say it simply: we're not lawyers. The code tries its best, but leave the legal decisions up to your own attorneys.)
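To make the idea of a "lower percentage of match" concrete, here is a small illustration using Python's standard difflib module. This is not the algorithm FOSSology's license agent uses; it is only a sketch of how a modified text scores below 100% against a reference. The function name match_percentage and the short sample texts are made up for the example.

```python
from difflib import SequenceMatcher


def match_percentage(reference: str, candidate: str) -> float:
    """Word-level similarity between two texts, as a percentage."""
    ref_words = reference.lower().split()
    cand_words = candidate.lower().split()
    return 100.0 * SequenceMatcher(None, ref_words, cand_words).ratio()


if __name__ == "__main__":
    # Short made-up texts stand in for real license texts.
    reference = "permission is granted to copy modify and distribute this software"
    exact = "Permission is granted to copy modify and distribute this software"
    derived = "permission is granted to copy and distribute this software for research"
    print(f"exact copy:   {match_percentage(reference, exact):.0f}%")
    print(f"derived text: {match_percentage(reference, derived):.0f}%")
```

An unmodified copy scores 100%, while the derived text scores lower because some words were removed and others added. A real license scanner has to cope with much longer texts, reformatting, and combined licenses, which is why its results remain heuristic.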
Using FOSSology FAQs
Why is the output unreadable?
Q: What am I doing wrong that my output is unreadable? (Thanks to SpottedOtter for this question posed on the fossology-devel mailing list.)
A: You are seeing garbage because the upload you analyzed is a binary file (an exe). FOSSology only scans printable text when looking for licenses and copyrights. Switching the view to "text" will show you those characters.
Although we scan the text in binary files, FOSSology is best used for looking at source code. If you upload a source rpm or a gzipped source archive and you see it unpacked into its component files, you should be able to read them.
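To see why a binary file yields only scattered fragments of readable text, consider this sketch of pulling printable runs out of raw bytes, similar in spirit to the Unix strings utility. It is an illustration only, not FOSSology's actual scanning code; the function name printable_runs and the minimum run length of 4 are arbitrary choices for the example.

```python
import re


def printable_runs(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII (and tabs) at least min_length long."""
    # \x20-\x7e covers space through tilde; everything else breaks a run.
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Made-up bytes standing in for the contents of an executable.
    blob = b"\x00\x01GPL\x02Copyright 2009\xff\xfeok\x00Licensed under GPL\x00"
    for run in printable_runs(blob):
        print(run)
```

Everything between the printable runs is what shows up as garbage when a binary is viewed directly.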
How can I safely delete uploads in the repository?
Q: I would like to clean up the repository by deleting uploads older than 1 year. How can I do this? (Thanks to Ray Westphal for posting this question on the fossology mailing list.)
A: To find the uploads over a year old:
select upload_pk from upload where upload_ts < (now() - interval '1 year') order by upload_ts;
You can then take that list and schedule the delete agent on each upload_pk with the fossjobs command (see man fossjobs).
Unfortunately, you can't simply delete the upload from the database because that won't delete the files from the repository, and because "cascade" isn't turned on for most of the foreign keys. So, at least for now, you have to use the delete agent.
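The "one delete job per upload_pk" step can be scripted. The sketch below only builds and prints the command lines (a dry run). It assumes you have saved the query result as plain text with one upload_pk per line. The command template is a deliberate placeholder, not real fossjobs syntax; take the actual options from man fossjobs. The helper names parse_upload_pks and build_commands, and the sample IDs, are made up for the example.

```python
import shlex


def parse_upload_pks(query_output: str) -> list:
    """Parse one upload_pk per line, skipping blank lines."""
    return [int(line) for line in query_output.splitlines() if line.strip()]


def build_commands(upload_pks, template):
    """Fill the {upload_pk} placeholder of template once per upload."""
    return [shlex.split(template.format(upload_pk=pk)) for pk in upload_pks]


if __name__ == "__main__":
    # Made-up sample of the query output, one upload_pk per line.
    sample_output = "12\n15\n\n27\n"
    # Placeholder only: replace with the real invocation from `man fossjobs`.
    template = "echo schedule-delete-for-upload {upload_pk}"
    for argv in build_commands(parse_upload_pks(sample_output), template):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review exactly which uploads would be deleted before running anything.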