Uploading Files for Analysis
To upload to the FOSSology database you must be logged in to the FOSSology UI.
In the default user interface, clicking the main Upload menu item brings up the upload screen.
FOSSology offers four options for uploading files for analysis. They differ in where the data originates and in whether the upload and its analysis results are saved to your FOSSology database. The data may be located:
- On your local system. Select the option Upload -> From File to select and upload the file. While this can be very convenient (particularly if the file is not readily accessible online), uploading via your web browser can be slow for large files, and files larger than 650 megabytes may fail to upload.
- On a remote server. Use the Upload -> From URL option to specify a remote server. This is the most flexible option, but the URL must denote a publicly accessible HTTP, HTTPS, or FTP location. URLs that require authentication or human interaction cannot be downloaded through this automated system.
Either of the above two options will save the upload, perform the default analysis (determined on a per-user basis), and store the results in the database.
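As a rough illustration of the constraints above, the following Python sketch checks whether a URL or a local file is likely to be accepted. The function and constant names are hypothetical and not part of FOSSology, reading 650 megabytes as 650 * 1024 * 1024 bytes is an assumption, and the check cannot detect servers that ask for a login after the request is made.

```python
from urllib.parse import urlparse

# Limits taken from the text above. The names are illustrative only and are
# not FOSSology settings; treating a megabyte as 1024 * 1024 bytes is an assumption.
MAX_BROWSER_UPLOAD_BYTES = 650 * 1024 * 1024
ALLOWED_URL_SCHEMES = {"http", "https", "ftp"}


def url_looks_uploadable(url: str) -> bool:
    """Return True if the URL uses a supported scheme and embeds no credentials."""
    parts = urlparse(url)
    if parts.scheme not in ALLOWED_URL_SCHEMES:
        return False
    if not parts.hostname:
        return False
    # Embedded user or password implies authentication, which is not supported.
    return parts.username is None and parts.password is None


def fits_browser_upload(size_bytes: int) -> bool:
    """Return True if the file is within the stated browser upload limit."""
    return 0 <= size_bytes <= MAX_BROWSER_UPLOAD_BYTES


if __name__ == "__main__":
    print(url_looks_uploadable("https://example.com/pkg.tar.gz"))
    print(fits_browser_upload(700 * 1024 * 1024))
```

A check like this only catches the obvious cases before an upload is attempted; the server remains the final authority on what it accepts.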
For advanced usage, the Upload -> From URL option can upload multiple files or directories. Enter comma-separated lists of name suffixes or patterns to select or exclude entries. These options are intended for users who are comfortable with wildcards and patterns. For example, to select all the iso files in the directory NewISOs, the select field would contain: 'iso'. If there is no exclusion list, all files matching the selection criteria are uploaded. The default recursion depth is 1, which uploads all files referenced on the linked page or all the files in a directory. To include sub-directories, increase the recursion depth. Use this option with care: a recursion depth greater than five can produce very large uploads that consume all available disk space and slow the network considerably. Make sure the FOSSology server has enough room to hold the uploaded data and the analysis results.
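To make the select and exclude semantics concrete, here is a minimal Python sketch of suffix and pattern filtering. It assumes that an entry without wildcard characters is treated as a name suffix and that an entry with wildcards is matched against the whole file name; this is an assumption about how the fields are interpreted, not a description of FOSSology's implementation, and all function names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_list(field: str) -> list:
    """Split a comma-separated field into trimmed, non-empty entries."""
    return [item.strip() for item in field.split(",") if item.strip()]


def matches(name: str, entry: str) -> bool:
    """Treat entries containing wildcard characters as patterns, others as suffixes."""
    if any(ch in entry for ch in "*?["):
        return fnmatchcase(name, entry)
    return name.endswith(entry)


def select_files(names, select_field: str, exclude_field: str = "") -> list:
    """Keep names that match the select list and do not match the exclude list."""
    selected = parse_list(select_field)
    excluded = parse_list(exclude_field)
    result = []
    for name in names:
        # An empty select list accepts every name.
        if selected and not any(matches(name, e) for e in selected):
            continue
        if any(matches(name, e) for e in excluded):
            continue
        result.append(name)
    return result


if __name__ == "__main__":
    listing = ["a.iso", "b.iso", "README.txt", "beta.iso"]
    print(select_files(listing, "iso"))
    print(select_files(listing, "iso", "beta*"))
```

With only 'iso' in the select field and no exclusion list, every name ending in iso is kept, which matches the NewISOs example in the text.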
There is also a command line utility, cp2foss, that can upload one or more files. cp2foss is best suited to loading large amounts of material or to breaking a large data set into smaller chunks that are easier to load.
The other two options allow real-time analysis of a single file.
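One way to break a large data set into smaller loads is to group the files into batches and issue one cp2foss call per batch. The Python sketch below only builds the command lines and does not run them. It assumes the bare form of cp2foss followed by file paths; any options your installation needs (for example, a target folder) are deliberately left out, and the helper names are hypothetical.

```python
import shlex


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_commands(paths, batch_size):
    """Build one argument list per batch.

    Assumption: cp2foss is invoked with file paths only. Consult the cp2foss
    help output for the options your installation requires.
    """
    return [["cp2foss"] + batch for batch in chunked(list(paths), batch_size)]


if __name__ == "__main__":
    for command in build_commands(["a.tar", "b.tar", "c.tar"], 2):
        # shlex.join quotes each argument safely for display in a shell.
        print(shlex.join(command))
```

Keeping each command as an argument list rather than a single string makes it straightforward to pass to subprocess.run later without shell quoting problems.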
- Select the Upload -> One-Shot Analysis option to perform a license scan.
- Select the Upload -> One-Shot Copyright/Email/URL option to scan for copyrights, email addresses, and URLs.
There are some important limitations to using either of these options:
- The analysis is done in real time. Large files may take a while. This method is not recommended for files larger than a few hundred kilobytes.
- Files that contain files are not unpacked. If you upload a 'zip', 'deb' or any other compressed file, the raw binary file is scanned for licenses and most likely nothing will be found.
- Results are not stored. As soon as you get your results, your uploaded file is removed from the system.
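The limitations above can be turned into a quick pre-check before choosing a one-shot option. The following Python sketch is illustrative only: the 300 KB threshold is an assumed reading of "a few hundred kilobytes", the suffix list is an assumed, incomplete set of container formats, and none of the names come from FOSSology.

```python
# Assumed reading of "a few hundred kilobytes"; adjust to taste.
ONE_SHOT_MAX_BYTES = 300 * 1024

# Assumed, incomplete list of suffixes for files that contain other files.
CONTAINER_SUFFIXES = (".zip", ".deb", ".rpm", ".tar", ".gz", ".bz2", ".xz", ".jar", ".iso")


def one_shot_warnings(filename: str, size_bytes: int) -> list:
    """Return reasons why a file may be a poor fit for one-shot analysis."""
    warnings = []
    if size_bytes > ONE_SHOT_MAX_BYTES:
        warnings.append("file is large; real-time analysis may be slow")
    if filename.lower().endswith(CONTAINER_SUFFIXES):
        warnings.append("container files are not unpacked; contents will not be scanned")
    return warnings


if __name__ == "__main__":
    print(one_shot_warnings("LICENSE.txt", 2048))
    print(one_shot_warnings("pkg.deb", 5 * 1024 * 1024))
```

An empty list means none of the listed limitations apply; the third limitation (results are not stored) holds for every one-shot analysis and so is not checked.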
Upload Time Is Variable
Upload times vary depending on the size of the upload, the speed of the server, the network speed, and the speed of your FOSSology system. After the file is transferred to the web server, it is unpacked. Unpacking can take many hours, depending on a number of factors. Next, the default agents are scheduled to analyze each of the unpacked files. Each agent must run in turn to perform its analysis.
Reuse Speeds Up Analysis
Another factor that affects analysis time is whether the same data has already been loaded into the database. FOSSology keeps track of identical data through the use of checksums. Many open source projects use common parts from other open source projects. When FOSSology starts to upload a file, it first checks whether an identical file has already been uploaded. If so, the file is still uploaded, but the analysis data comes from the previous analysis. This can greatly reduce the overall upload time. There is no way to predict how many files will be reused in a given upload.
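The idea of checksum-based reuse can be sketched in a few lines of Python. This is a simplified model, not FOSSology's code: the choice of SHA-256, the in-memory dictionary standing in for the database, and the class and method names are all assumptions made for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a content checksum. SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(data).hexdigest()


class AnalysisCache:
    """Reuse earlier analysis results for byte-identical content."""

    def __init__(self, analyze):
        self._analyze = analyze  # callable that performs the expensive analysis
        self._results = {}       # checksum -> stored result (stands in for the database)
        self.reused = 0          # how many lookups were served from earlier results

    def result_for(self, data: bytes):
        key = checksum(data)
        if key in self._results:
            self.reused += 1
        else:
            self._results[key] = self._analyze(data)
        return self._results[key]


if __name__ == "__main__":
    cache = AnalysisCache(lambda content: len(content))
    cache.result_for(b"GPL-2.0 text")
    cache.result_for(b"GPL-2.0 text")  # identical content, so no second analysis
    print(cache.reused)
```

Because the key depends only on file content, a file shared by two different projects is analyzed once, regardless of its name or location in either upload.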
Delete an Upload
Deleting Uploaded Files is performed by an Agent, the Delete Upload Agent, or simply, delagent. delagent is Scheduled like any other Agent. The Scheduler appends it as a Task at the end of the Job queue.
First note that delagent does not delete Tasks, so if you wish to abort a Job in progress, you should first delete all of the Tasks for that Job manually. To do so, click on Jobs, Queue, By Upload, and select your Uploaded Project there.
When you are ready to Schedule delagent, click on Organize, Uploads, Delete Uploaded File, and select your Uploaded Project there. The Scheduler will append a delagent task to the end of the Job queue for the selected Upload.
Depending upon how your Scheduler is configured, the actual deletion of the Uploaded Files by delagent will be queued up and scheduled sometime in the future (perhaps after other jobs on other Uploaded Projects are completed). In addition, the duration of the actual file deletion will depend upon how large the Uploaded Project is. That is, it could take quite a while for delagent to begin deleting your files, and once started, the deletion itself could also take quite a while.
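Why deletion may not start immediately can be seen in a toy model of the queue. The Python sketch below assumes a strict first-in, first-out queue, which is a simplification: the real Scheduler's ordering depends on its configuration. The task names other than delagent and the function name are hypothetical.

```python
from collections import deque


def schedule_delete(queue: deque, upload: str) -> int:
    """Append a delagent task for `upload` and return how many tasks are ahead of it.

    Assumption: tasks run strictly in queue order.
    """
    ahead = len(queue)
    queue.append(("delagent", upload))
    return ahead


if __name__ == "__main__":
    # Two hypothetical tasks for another upload are already waiting.
    jobs = deque([("unpack", "ProjectA"), ("license_scan", "ProjectA")])
    print(schedule_delete(jobs, "ProjectB"))
    print(list(jobs))
```

In this model the delete task waits behind everything already queued, which is the behavior the text describes when other jobs are pending.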