FOSSology Advancing open source analysis and development |
This shows you the differences between the selected revision and the current version of the page.
| task_list 2010/03/01 13:52 | task_list 2010/07/29 17:36 current | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ~~NOTOC~~ | ||
| - | {{page>pm-template}} | ||
| + | ===== V 1.3 ===== | ||
| + | //This list represents what we would like to do for v 1.3. We will be working on v 2.0 at the same time as 1.3. Our goal is to get all the high priority items done and probably all the "easy" items. The low priorities will get done if we have time (we are switching to time based releases). It's possible that we will get to none of the low priority items in 1.3. Whatever we don't do will become high priority in 1.4// | ||
| - | ===== For v 1.2 ===== | + | ==== 1.3 high ==== |
| - | This list is in priority order. (Owner of the task is listed in parenthesis, responsible for task breakdown and estimate). We don't have a schedule but are hoping to have this out by April. Without a schedule, this is just a guess and a hope. If you want to know why this takes as long as it does, just ask and have some time or money available to contribute. | + | |
| - | - (Mark) New heuristic based license analyzer (based on small phrases and phrases relative to other phrases). We are calling this analyzer [[task:nomos]]. | + | |
| - | - Preliminary results show nomos to be 20x faster than bsam. | + | |
| - | - Need a way to add licenses to nomos without recompiling | + | |
| - | - (Bob) Implement [[task:Buckets]]. This allows one to create categories used in license reports. For example, you could define categories like "good licenses", "bad licenses", "commercial licenses", "files with no license", ... [[http://fossology.org/~bobg/1.2reqs/buckets.html | Buckets mockup]] | + | |
| - | - Notes from 2/25 team meeting: 3-4 weeks (start date 2/26, est completion date 3/24) | + | |
| - | - Finish cascading buckets - 1 day; code checked in 2/26 | + | |
| - | - Talk with Glen to understand env in the exec part of buckets - 1 day | + | |
| - | - Do we use a simple rule engine or rewrite for FO? - 5-10 days? | + | |
| - | - New UI for bucket browsing (http://fossology.org/~bobg/1.2reqs/buckets.html) with add'l file browsing - 2 days | + | |
| - | - Modifications to include buckets in existing UI screen? (could Vincent do this task?) - 3 days (depends on if OSRB want modifications) | + | |
| - | - (Mary) Cleanup license naming in the new [[licenseref|license_ref table]]. | + | |
| - | - First pass makes a note of all the names that have to be changed in f1 and nomos. | + | |
| - | - Second pass verifies that the nomos changes are acceptable to OSRB. | + | |
| - | - Third pass is changing names in nomos and f1. | + | |
| - | - (Vincent) Package agents | + | |
| - | - [[1.2pkgagent | pkg meta data]] (add debian to complement or replace the existing spec file agent), identify packages, bin/src pkgs, and stated licenses | + | |
| - | - [[http://fossology.org/~bobg/1.2reqs/specmeta.html| package meta report]] | + | |
| - | - [[http://fossology.org/~bobg/1.2reqs/pkgsonly.html| browse by package]] | + | |
| - | - [[http://fossology.org/~bobg/1.2reqs/pkghistory.html| package history]] | + | |
| - | - Notes from 2/25 team meeting: 4 days (start date 2/26, est completion date 3/3) | + | |
| - | - need to detect if dpkg is not installed. If so, disable the pkgagent. | + | |
| - | - make sure it can't be run via command line or the UI. Print an informative message! | + | |
| - | - test: run fossjobs | + | |
| - | - test: try it on fedora installs, both with & without dpkg installed | + | |
| - | - (DONE)<del>default agents per user (Mark) Done!</del> | + | |
| - | - Admin UI changes done. Testing & debugging - should be done by 2/19 | + | |
| - | - (Mark) Create tests to validate buckets against nomos - 5 days (start date 2/26, est completion date 3/5) | + | |
| - | - Test will compare file, license, bucket from g-nomos to the results of f-nomos (Mark will try the verbose switch and if that doesn't work he will talk with Paul W and/or Glen) | + | |
| - | - need to verify the result at the package level | + | |
| - | - Reprioritized to post-1.3 <del>(Bob) [[tagging | File tags]]</del> | + | |
| - | - (DONE) <del>Report files with no license. This should be part of standard license reports, not just in [[task:Buckets]]. A mockup of the new license browser UI is [[http://fossology.org/~bobg/1.2reqs/licbrows.html| here]]</del> | + | |
| - | - (Bob) [[Distro reports]] | + | |
| - | - (Bob) License Browser color coding | + | |
| - | - (Bob) Search within any part of a file tree | + | |
| - | - (Bob) Search for packages | + | |
| - | - (Bob, mark?) Browse packages | + | |
| - | - (Bob) [[http://fossology.org/~bobg/1.2reqs/browsefile.html|Display license changes by package version]] | + | |
| - | - (Bob) Display license differences on a per file basis between versions of any archive (rpm, tar, etc) | + | |
| - | - (Bob/Adam) [[task:copyright/author detection agent| Report Copyrights, URL's, and email addresses]] [[http://fossology.org/~bobg/1.2reqs/copyrights.html|mockup]] | + | |
| - | - Notes from 2/25 team meeting: 3 days (start date 2/26, est completion date 3/3) | + | |
| - | - NOTE: weather related power outage on 2/26 affected productivity. | + | |
| - | - where did he put the training files? put them in $DATADIR | + | |
| - | - locates copyright statements, URL's & email addresses but only copyright is currently reported in the UI | + | |
| - | - data is currently being placed in copyright_test. Need to change the name. | + | |
| - | - UI is complete; need to move from Adam's project directory to the main trunk | + | |
| - | - (DONE) <del>(Vincent) [[task:define_test_implement_full_backup_recovery_process|Define, test & implement full backup & recovery process]]</del> | + | |
| - | - (Mary) [[task:Testing_1.2|Test changes checked into subversion]] | + | |
| - | - Notes from 2/25 team meeting: | + | |
| - | - test all new FO1.2 features listed above | + | |
| - | - need to run Adam's copyright code | + | |
| - | - Creative Commons test files | + | |
| - | - (Mary) Address [[http://bugs.linux-foundation.org/buglist.cgi?cmdtype=runnamed&namedcmd=All%20FO%20Bugs | bugs ]] targeted for this release. | + | |
| - | ==== 1.2 UI Impacts ==== | + | - Diff packages (and distros, archives, ...) [[task:Integrate DistroDB (distro package reports)]] shows some of this. |
| - | This list is to keep track and summarize changes needed in the UI that are the results of changing to the nomos agent as our primary license agent for 1.2. | + | - Tagging. For example, tag individual files or containers. If a container is tagged, the tag might or might not cascade to all its files. A bucket could tag files. For example, it could tag all the GPLv3 packages as "Need Review". Then as they are reviewed, the reviewer would change that tag. This means tags are editable. |
| - | - (Mark) agent_license_once_compare should be deprecated and replaced by a nomos version. | + | - To implement tags like this (editable) we need user groups. So write access could be restricted to the group that created the tag, for example. Other ideas can be found here [[tagging | File tags]] (moved from 1.2) |
| - | - (Mark) in the browse menu/page the Schedule license analysis link should be changed to reschedule nomos not bsam. | + | - [[task:copyright_author_detection_agent|Improve (replace) the copyright agent]]. A quick experiment showed that we can get better results with simple heuristics rather than the current naive Bayes approach. Development is on branches/new_copyright. |
| - | - (Bob) <del>include a warning message in license groups page indicating it will be deprecated and replaced with buckets.</del> | + | - [[unit_testing_with_cunit|C code Unit Test]] and Coverage suite. Initial proposal, ideas or framework for how to use C code unit tests and C code coverage to improve our code quality. |
| - | ==== UI Mockups ==== | + | ==== 1.3 easy ==== |
| + | - Scheduler.conf does not get created correctly for a cluster. fosscp_agent and fo_notify should only be run on the scheduler host (localhost). Don't forget to include an entry for selftest, too. | ||
| + | - Investigate and improve [[nfs_performance_investigation|NFS I/O performance]]. NFS file I/O is the largest bottleneck for agents, so what can we do to make it faster? Besides speed, making repo access more robust, easier to administer, and less demanding of disk space would be big improvements. | ||
| + | |||
| + | ==== 1.3 low ==== | ||
| + | - [[task:Add/Modify licenses on-the-fly]]. This would include a new permission level so that users can correct license and bucket data (and other data as well). Perhaps this would update the real data record, but an audit trail could be kept of any changes. (reqt from sutula) | ||
| + | - Improve the unpack agent. The unpack agent used by fossology extracts files from containers. A container is any kind of file that stores other files. For example, a ZIP file contains an archive of different files. Other types of containers include tar, ar, ISO, and rpm files. Look [[1.0.0:agents#unpack | here]] for a full description of the unpack agent. What’s wrong with the current unpack agent: | ||
| + | - The agent is SLOW. It can take days to unpack a Linux distro. Since Linux distros are of primary interest to the OSRB, fossology needs to be able to unpack distros in hours or minutes, not days. How can we take advantage of multiple CPUs (with the -m switch?) and agent systems to improve performance? __Larry is investigating this issue and has run some jobs; performance has improved.__ [[Unpack performance]]. | ||
| + | - The current unpack agent cannot process some files in proprietary Microsoft formats (for example, .msi files). There are Windows-based commands/utilities capable of doing this. Do any exist for use on Linux? __Larry will investigate this issue after the unpack performance job.__ [[Unpack Microsoft proprietary formatted files]]. | ||
| + | - Information/error messages are unhelpful, non-existent and difficult to find in the log file. Log meaningful messages with names of file being processed (if applicable) to a log file for a specific upload – NOT the general fossology.log file. | ||
| + | - Deprecate (bSAM) license analyzer and licterms. Either remove entirely or move to its own, unsupported, package. | ||
| + | - UI for bucket definition and management (new, change, delete) Not sure where this goes in the priorities. | ||
| + | - UI cleanup. Work on inconsistencies and ease of use. Some problems are: | ||
| + | - The way you queue a job that has already been unpacked is different depending on whether it is a new scan or a rescan. Of course, most rescans don't work, but that's an issue that needs to be handled by the new modular agent/plugin design. | ||
| + | - Micromenu can get very cluttered. | ||
| + | - Search should be an option at any browse level. | ||
| + | - [[http://fossology.org/~bobg/1.2reqs/browsefile.html|Display license changes by package version]] (moved from 1.2) | ||
| + | - Display license differences on a per file basis between versions of any archive (rpm, tar, etc) (moved from 1.2). This includes [[Distro reports]] | ||
| + | - Browse by folders. Do union of sql query of all uploads in a folder. | ||
| + | - Identify binary packages and the source package they came from (Scott Lamons). The issue here is that the source may not be in the same upload as the binary. So when looking at a binary we need to have an option to choose a source and look at its scans. | ||
| + | |||
| + | ===== 1.3 cutline ===== | ||
| + | - From slamons: "We need a way to allow users to easily set up new accounts. It would be especially nice if they could log in using their HP email and NT password (or better yet, SiteMinder [[task:implement_single_sign-on|single sign-on session]]). As it is, it is not at all obvious that you need to set yourself up a new account before you start running analysis." | ||
| + | - Integrated error information. Our current method of logging EVERYTHING to fossology.log makes it difficult to debug issues and view log messages/errors for a particular upload or file. | ||
| + | - Add capability for reanalysis without breaking data persistence, i.e., do new analysis without removing previous analysis results. This can be used to compare new and old analysis results and to ensure that report URLs are persistent. 1.2 implemented data collection for this for nomos and buckets. The UI needs to catch up and allow one to select the data set they want to see. The code is already in ui-buckets.php and ui-nomos-license.php (search for FUTURE). But we need to decide if this is the interface we want. | ||
| + | - How can one tell who, when and from where an upload came? Add to ui-browse | ||
| + | - Modify code to support the db server on a separate system. This has always been a design goal but has not been tested. | ||
| + | - Remove pfile.pfile_liccount from schema and code (common/common-license.php, plugins/agent-license.php, plugins/agent-license-reanalyze.php, plugins/ui-license.php). This was an experimental feature that mistakenly had code checked in around it. | ||
| + | - delagent needs to be more robust. Many of the delagent db updates should probably be done with cascading deletes on the upload. Perhaps cleaning the filesystem should be a separate agent that could run on a periodic basis? Because of the concurrency problem of deleting unused files from the repo while another agent is adding them, delagent should never run concurrently with unpack or probably any other agent. | ||
| + | - Add license from kernel object modules (license from modsym) to license_file | ||
| + | - New "Compare" checkboxes to compare different files/directories/packages/... | ||
| + | - Create a user interface to create bucket pools, bucket definitions, scripts and anything else needed, along with a prompt & screen to rerun analysis with your newly defined bucket pool. [[ buckets#rerunning_buckets |Current method is too ugly.]] | ||
| + | |||
| + | |||
| + | == Notes for 1.3 Planning == | ||
| + | * Spend more upfront time planning new features, estimating time to implement/test and identifying impacts. | ||
| + | * Develop new "disruptive" code on a branch so as not to cripple top-of-tree builds, install and testing. | ||
| + | |||
| + | ===== v 2.0 ===== | ||
| + | - New scheduler | ||
| + | - Modular plugins and agents. There are many advantages to [[separate_package | separate optional package]]. Support for optional plugin installs. | ||
| + | - Improved multihost configuration and installation. | ||
| + | ===== UI Mockups ===== | ||
| All the mockups can be found [[http://fossology.org/~bobg/1.2reqs/|here]]. Or just click on the individual mockups below: | All the mockups can be found [[http://fossology.org/~bobg/1.2reqs/|here]]. Or just click on the individual mockups below: | ||
| - [[http://fossology.org/~bobg/1.2reqs/attachments.html|Attachments]] | - [[http://fossology.org/~bobg/1.2reqs/attachments.html|Attachments]] | ||
| Line 82: | Line 71: | ||
| - [[http://fossology.org/~bobg/1.2reqs/licbrowsediff.html | License Browsing with license differences]] | - [[http://fossology.org/~bobg/1.2reqs/licbrowsediff.html | License Browsing with license differences]] | ||
| - | **1.2 Notes** | ||
| - | - Additional [[1.2 notes | 1.2 Useful SQL]] | ||
| - | - [[ 1.2tasknotes | Scratchpaper for 1.2 tasks]] | ||
| - | |||
| - | ===== V 1.3 requests ===== | ||
| - | - (Bob) Add capability for reanalysis without breaking data persistence, i.e., do new analysis without removing previous analysis results. This can be used to compare new and old analysis results and to ensure that report URLs are persistent. | ||
| - | - (Bob) New scheduler | ||
| - | - Modular plugins and agents. Supports optional plugin installs. | ||
| - | - Improved multihost configuration and installation | ||
| - | - (Scott Lamons) Identify binaries and where they came from. | ||
| - | - Modify code to support the db server on a separate system. (This has always been a design goal but has not been implemented correctly OR tested.) | ||
| - | - (Bob) [[tagging | File tags]] | ||
| ===== Everything Else ===== | ===== Everything Else ===== | ||
| Line 101: | Line 78: | ||
| **High priority** - //within the next two releases// | **High priority** - //within the next two releases// | ||
| - | |||
| * (Mary/Bob) Improve postgres error checking and error reporting | * (Mary/Bob) Improve postgres error checking and error reporting | ||
| * (Mark) improve [[reporting of UI test results]]. | * (Mark) improve [[reporting of UI test results]]. | ||
| * (Bob) [[task:write comment plugin | Attachments]] [[http://fossology.org/~bobg/1.2reqs/attachments.html|mockup]] | * (Bob) [[task:write comment plugin | Attachments]] [[http://fossology.org/~bobg/1.2reqs/attachments.html|mockup]] | ||
| - | * Deprecate (bSAM) license analyzer and licterms. They will still be available but not supported. To facilitate this, they could be separated into their own package. | + | * (Alex) New machine learning license analyzer (based on sentence clustering). Currently we are calling this [[F1]]. If there are results from both this analysis and fo_nomos, the results will be combined for reporting. Part of this is to compare with Ninka. |
| - | * Split Pkgmetagetta into a [[separate_package | separate optional package]] | + | * [[task:Rest| REST API]] - also look at the [[http://fossology.org/pipermail/fossology/2009-December/001489.html | thread started by Phil Martin]]; the info in this thread needs to be added to [[task:Rest| REST API]] |
| - | * (Bob) New machine learning license analyzer (based on sentence clustering). Currently we are calling this [[F1]]. If there are results from both this analysis and fo_nomos, the results will be combined for reporting. | + | |
| - | * [[task:Rest| REST API]] - follow [[http://fossology.org/pipermail/fossology/2009-December/001489.html | thread started by Phil Martin]] | + | |
| - | * [[task:Integrate DistroDB (distro package reports)]] | + | |
| - | * [[task:Add/Modify licenses on-the-fly]] | + | |
| * [[task:binary_analysis | Binary analysis for open source discovery]] | * [[task:binary_analysis | Binary analysis for open source discovery]] | ||
| * [[task:Document FOSSology recommended hardware configs]] | * [[task:Document FOSSology recommended hardware configs]] | ||
| Line 126: | Line 98: | ||
| * [[task:Get distros interested in using fossology to scan their codebase]] This is already happening with FreeBSD | * [[task:Get distros interested in using fossology to scan their codebase]] This is already happening with FreeBSD | ||
| * [[task:Generate meaningful URLs]] | * [[task:Generate meaningful URLs]] | ||
| + | * [[task:FO permission scheme needs improvement]] | ||
| * [[task:Integrate fossology data with other open source data providers]] | * [[task:Integrate fossology data with other open source data providers]] | ||
| * [[task:Provide analysis to OSI License Proliferation sub-group]] | * [[task:Provide analysis to OSI License Proliferation sub-group]] | ||
| Line 134: | Line 107: | ||
| * [[task:Improve text search]] Requires postgres >= 8.3 | * [[task:Improve text search]] Requires postgres >= 8.3 | ||
| * [[task:Archived Reports]] - simple text file dump, PDF reports, eventually full web archive of all analysis reports | * [[task:Archived Reports]] - simple text file dump, PDF reports, eventually full web archive of all analysis reports | ||
| - | * [[task:group authentication]] - don't have an owner/customer to support this | ||
| * [[task:Write tar agent - re-tar arbitrary parts of repo]] - don't have an owner/customer to support | * [[task:Write tar agent - re-tar arbitrary parts of repo]] - don't have an owner/customer to support | ||
| * [[task:Add pie/bar charts to license analysis]] - fun to have, pretty pictures, not that useful | * [[task:Add pie/bar charts to license analysis]] - fun to have, pretty pictures, not that useful | ||
| - | * [[task:Change scheduler to assign resources to hosts and agents (rather than specifying number of agents per host)]] | + | * [[task:Willebrand|Automated Collection and Presentation of FOSS Notices]]. Many of these suggestions are incorporated into v 1.3 tasks. |
| - | * [[task:Change scheduler to manage jobs by resource rather than jobs by jobqueue item]] | + | * RPM spec file analysis - we have pkgagent and pkgmetagetta that determine info about RPMs, but maybe there would be some value in analyzing the .spec files we find. ununpack will unpack source RPMs that it finds, so we'll already have some info from them, but maybe we'd find loose .spec files in source trees too. We need more specifics. |
| - | * [[task:write agent to reconfigure & rebalance the repo]] | + | * Easy way to install buttons/links on micro menu to [[task:run_util_on_file|run system utilities/scripts on a file]]. For example, .ko files could have an nm link, and a modinfo link. The script may need a way to determine if it should be added to menu or not. For example, nm applies to object files, but modinfo only applies to .ko. |
| - | * [[task:Stop running filter_clean]] | + | |
| - | * [[task:Willebrand|Automated Collection and Presentation of FOSS Notices]] | + | |
| ===== Completed ===== | ===== Completed ===== | ||
| * See [[task:Archive]] for a list of completed tasks | * See [[task:Archive]] for a list of completed tasks | ||
| - | |||
| - | |||
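
The delagent item in the 1.3 cutline list suggests doing much of the database cleanup with cascading deletes on the upload. As a rough illustration only, the sketch below shows the general idea using Python's built-in sqlite3 module as a stand-in for the real database. The two-table schema (`upload`, `uploadtree`) and its column names are simplified assumptions made for this example, not the actual FOSSology schema, and the helper function names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys (and cascades) when this is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE upload (
            upload_pk INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE uploadtree (
            uploadtree_pk INTEGER PRIMARY KEY,
            upload_fk     INTEGER NOT NULL
                          REFERENCES upload(upload_pk) ON DELETE CASCADE,
            path          TEXT NOT NULL
        );
        """
    )
    return conn


def delete_upload(conn, upload_pk):
    """Delete one upload; dependent uploadtree rows are removed by the cascade."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM upload WHERE upload_pk = ?", (upload_pk,))


def count_tree_rows(conn, upload_pk):
    """Count the uploadtree rows that still reference the given upload."""
    row = conn.execute(
        "SELECT COUNT(*) FROM uploadtree WHERE upload_fk = ?", (upload_pk,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    with conn:
        conn.execute("INSERT INTO upload VALUES (1, 'distro-a.iso')")
        conn.execute("INSERT INTO upload VALUES (2, 'pkg-b.rpm')")
        conn.executemany(
            "INSERT INTO uploadtree (upload_fk, path) VALUES (?, ?)",
            [(1, "a/README"), (1, "a/COPYING"), (2, "b/spec")],
        )
    delete_upload(conn, 1)
    print(count_tree_rows(conn, 1), count_tree_rows(conn, 2))
```

With a constraint like this, the agent issues a single delete on the upload row and the database removes the dependent rows in the same transaction. Cleaning unused files out of the repository would still be a separate step, as the task item notes.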