
ThomasHeliase/PatentData



Patent Public Bulk Files

A toolkit for downloading, reading, and using open patent data made available to the public.

Notice

This source code is a work in progress and has not been fully vetted for a production environment.

Two main modules

  • Bulk Downloader automates downloading of public bulk patent data.
  • Patent Document iterates over and reads patents directly from the large bulk download files. It supports patent documents from 1976 to the present (formats: Greenbook, SGML, PAP, Redbook XML) and reads them into a normalized Patent Object Model.
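
To illustrate what reading directly from a bulk file involves, here is a minimal Python sketch. It is not this project's API. It assumes a Redbook-style bulk file in which many XML documents are concatenated, each beginning with its own `<?xml ...?>` declaration, and it uses a heavily simplified sample record. The names `iter_documents` and `doc_numbers` are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from typing import Iterator, List, TextIO

XML_DECLARATION = "<?xml"


def iter_documents(stream: TextIO) -> Iterator[str]:
    """Yield each XML document from a stream of concatenated documents.

    Assumption: every document starts with its own XML declaration on a new line.
    """
    buffer: List[str] = []
    for line in stream:
        # A new declaration marks the start of the next document.
        if line.lstrip().startswith(XML_DECLARATION) and buffer:
            yield "".join(buffer)
            buffer = []
        buffer.append(line)
    if buffer:
        yield "".join(buffer)


def doc_numbers(stream: TextIO) -> List[str]:
    """Parse each document separately and collect its first doc-number."""
    return [
        ET.fromstring(doc).findtext(".//doc-number")
        for doc in iter_documents(stream)
    ]


# Simplified stand-in for a bulk file; real records are far larger.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<us-patent-grant><doc-number>0000001</doc-number></us-patent-grant>\n"
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<us-patent-grant><doc-number>0000002</doc-number></us-patent-grant>\n"
)

print(doc_numbers(io.StringIO(SAMPLE)))
```

Streaming line by line keeps only one document in memory at a time, which matters because the bulk files are large. The older formats (Greenbook, SGML, PAP) would need their own splitting and parsing rules.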

Features

  • Download Bulk Patent Grants and Applications, as well as additional resources
  • View individual Patent Documents directly from the large bulk files
  • Read Patent Documents directly from the large bulk files into a normalized Patent Object Model; supported formats cover 1976 to the present (Greenbook, SGML, PAP, Redbook XML)
  • Extract Patent Documents from bulk files
  • Normalize and transform Patent data before loading into a data resource
  • Patent Claim Tree to facilitate analysis
  • Update Classifications from the Master CPC File (contains the current CPC classification for patents starting at patent number 1)
  • Include classification definitions from CPC Scheme
  • Build a corpus with Corpus Builder, which downloads and extracts patents and applications matching specified classifications, one bulk file at a time, over a date range.
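
As an illustration of the Patent Claim Tree idea listed above, the following Python sketch links dependent claims to the claims they reference. It is an assumption about how such a tree could be built, not the project's implementation. `ClaimNode`, `parent_of`, `build_claim_tree` and `depth` are hypothetical names, and the claim texts are invented.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Matches the first "claim N" or "claims N" reference in a claim's text.
CLAIM_REF = re.compile(r"\bclaims?\s+(\d+)", re.IGNORECASE)


@dataclass
class ClaimNode:
    number: int
    text: str
    children: List["ClaimNode"] = field(default_factory=list)


def parent_of(text: str) -> Optional[int]:
    """Return the first referenced claim number, or None for an independent claim."""
    match = CLAIM_REF.search(text)
    return int(match.group(1)) if match else None


def build_claim_tree(claims: Dict[int, str]) -> List[ClaimNode]:
    """Attach each dependent claim to its parent; return the independent claims."""
    nodes = {number: ClaimNode(number, text) for number, text in claims.items()}
    roots: List[ClaimNode] = []
    for number in sorted(nodes):
        parent = parent_of(claims[number])
        if parent is not None and parent in nodes and parent != number:
            nodes[parent].children.append(nodes[number])
        else:
            roots.append(nodes[number])
    return roots


def depth(node: ClaimNode) -> int:
    """Length of the longest dependency chain starting at this claim."""
    return 1 + max((depth(child) for child in node.children), default=0)


claims = {
    1: "A widget comprising a frame.",
    2: "The widget of claim 1, wherein the frame is steel.",
    3: "The widget of claim 2, further comprising a handle.",
    4: "A method of making a widget.",
}
roots = build_claim_tree(claims)
print([root.number for root in roots], depth(roots[0]))
```

The sketch follows only the first claim reference it finds. Multiple-dependent claims (for example "any of claims 1 to 3") would need a richer parser and a graph rather than a tree.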

Changes after Patents are published

Bulk files are not updated once published. Updates can be obtained by indexing the supplemental files that are also made available to the public. The following fields may change after publication.

Field            Description
assignee         Update available daily within Patent Assignment XML Dump files
Classifications  Updates available monthly within Master Classification File Dump files
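
One way to picture applying such updates is the Python sketch below. It overlays field changes onto records that were already indexed from the bulk files. The record layout, the `apply_updates` helper and all sample values are hypothetical, and parsing of the actual supplemental files is left out.

```python
from typing import Any, Dict, Iterable, Tuple

Record = Dict[str, Any]
Update = Tuple[str, str, Any]  # (document id, field name, new value)


def apply_updates(index: Dict[str, Record], updates: Iterable[Update]) -> int:
    """Overwrite fields on already-indexed records; return how many were applied."""
    applied = 0
    for doc_id, field_name, value in updates:
        record = index.get(doc_id)
        if record is None:
            # The update refers to a document that was never indexed; skip it.
            continue
        record[field_name] = value
        applied += 1
    return applied


# Made-up index built earlier from bulk files.
index = {
    "US1234567": {"assignee": "Acme Corp", "classifications": ["A01B 1/00"]},
}

# Made-up updates, as they might be read from the supplemental files.
updates = [
    ("US1234567", "assignee", "Beta LLC"),
    ("US1234567", "classifications", ["A01B 1/02"]),
    ("US7654321", "assignee", "Gamma Inc"),
]

applied = apply_updates(index, updates)
print(applied, index["US1234567"])
```

Keeping updates separate from the bulk-file records means the original published data never has to be rewritten; only the index changes.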

Other Information

The United States Department of Commerce (DOC) and the United States Patent and Trademark Office (USPTO) GitHub project code is provided on an ‘as is’ basis without any warranty of any kind, either expressed, implied or statutory, including but not limited to any warranty that the subject software will conform to specifications, any implied warranties of merchantability, fitness for a particular purpose, or freedom from infringement, or any warranty that the documentation, if provided, will conform to the subject software. DOC and USPTO disclaim all warranties and liabilities regarding third party software, if present in the original software, and distribute it as is. The user or recipient assumes responsibility for its use. DOC and USPTO have relinquished control of the information and no longer have responsibility to protect the integrity, confidentiality, or availability of the information.

User and recipient agree to waive any and all claims against the United States Government, its contractors and subcontractors as well as any prior recipient, if any. If user or recipient’s use of the subject software results in any liabilities, demands, damages, expenses or losses arising from such use, including any damages from products based on, or resulting from recipient’s use of the subject software, user or recipient shall indemnify and hold harmless the United States government, its contractors and subcontractors as well as any prior recipient, if any, to the extent permitted by law. User or recipient’s sole remedy for any such matter shall be immediate termination of the agreement. This agreement shall be subject to United States federal law for all purposes including but not limited to the validity of the readme or license files, the meaning of the provisions and rights and the obligations and remedies of the parties. Any claims against DOC or USPTO stemming from the use of its GitHub project will be governed by all applicable Federal law. “User” or “Recipient” means anyone who acquires or utilizes the subject code, including all contributors. “Contributors” means any entity that makes a modification.

This agreement or any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not in any manner constitute or imply their endorsement, recommendation or favoring by DOC or the USPTO, nor does it constitute an endorsement by DOC or USPTO or any prior recipient of any results, resulting designs, hardware, software products or any other applications resulting from the use of the subject software. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, including USPTO, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC, USPTO or the United States Government.



CC0
To the extent possible under law, https://github.com/USPTO/PatentPublicData has waived all copyright and related or neighboring rights to Patent Public Data. This work is published from: United States.
