Practice of AI dataset metadata and license compliance

lauson-oo/OpenDataology

OpenDataology


CII Best Practices License: MIT

Overview


OpenDataology is a project for AI model training with trusted dataset compliance. It helps users of publicly available datasets, and users who curate a dataset from multiple data sources (particularly for use in machine learning models), identify potential license compliance risks. The project comprises three key components.

  • A dataset license compliance analysis workflow that ascertains the final allowed rights and the required obligations associated with using, for any purpose, a publicly available dataset or a dataset curated from multiple data sources. Please refer to the paper "Can I use this publicly available dataset to build commercial AI software? - A Case Study on Publicly Available Image Datasets" for more details.
  • A growing database and a web portal that document the final rights and obligations (after the license compliance analysis is conducted) associated with the datasets and data sources analyzed in our project. The database also documents the metadata collected and used to conduct the compliance workflow.
  • An online license generation toolkit that lets creators of datasets generate custom licenses based on the exact rights and obligations they want to allow (instead of having to rely on the limited set of existing dataset-specific licenses).
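
To make the idea of "final rights and obligations" for a dataset curated from multiple sources more concrete, here is a minimal Python sketch. It is not the project's workflow: the `SourceTerms` schema, the `combine_terms` helper, and the rights and obligation labels are all hypothetical. It assumes a deliberately simplified rule, namely that a combined dataset keeps only the rights granted by every source and inherits the obligations of all of them. Real license analysis involves far more nuance than set operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceTerms:
    """License terms recorded for one data source (hypothetical schema)."""
    name: str
    rights: frozenset
    obligations: frozenset = frozenset()


def combine_terms(sources):
    """Return (rights, obligations) for a dataset built from all sources.

    Simplifying assumption: a right survives only if every source grants it,
    and every obligation from any source carries over to the combined dataset.
    """
    if not sources:
        raise ValueError("at least one data source is required")
    rights = frozenset.intersection(*(s.rights for s in sources))
    obligations = frozenset().union(*(s.obligations for s in sources))
    return rights, obligations


# Illustrative labels only; they do not come from any real license text.
photos = SourceTerms(
    "photos",
    frozenset({"research", "commercial", "redistribute"}),
    frozenset({"attribution"}),
)
labels = SourceTerms(
    "labels",
    frozenset({"research", "commercial"}),
    frozenset({"share-alike"}),
)

rights, obligations = combine_terms([photos, labels])
print(sorted(rights))       # ['commercial', 'research']
print(sorted(obligations))  # ['attribution', 'share-alike']
```

In this sketch the "redistribute" right is dropped because only one of the two sources grants it, while both obligations apply to the combined dataset.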

OpenDataology's recommendations do not constitute legal advice.

Getting Involved


Contributing


We love contributions in various forms. To contribute to OpenDataology, please see CONTRIBUTING.md.

Governance


OpenDataology is a project hosted by the LF AI and Data Foundation. The project governance details can be found at GOVERNANCE.md.

Reporting a Problem


To report a problem, open an issue in the repository against a specific workflow. If the issue is sensitive or security-related, please do not report it in the issue tracker; instead, email main@opendataology.com.

Meeting Schedule and Minutes


Here are the records of the meeting notes and progress.

License


OpenDataology is licensed under the MIT License.

Copyright 2022 OpenDatalogy

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
