This repository has been archived by the owner on Feb 3, 2024. It is now read-only.
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #336 from maarten-boot/master
update to whoisdomain 1.20231115.1
Showing 17 changed files with 541 additions and 150 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,141 +1,106 @@ | ||
# whois | ||
* A Python package for retrieving WHOIS information of domains. | ||
* This package will not support querying ip CIDR ranges or AS information | ||
* It requires the whois cli component of your os to be installed: e.g. `/usr/bin/whois` on Linux | ||
|
||
## NOTE | ||
* 2023-04-25: mboot | ||
* when DannyCork returns he can decide on the future of this repo. | ||
* in his absence future development will take place in: https://github.com/mboot-github/WhoisDomain | ||
* and new pypi releases will come from: https://pypi.org/project/whoisdomain/ | ||
* efforts will be made to keep the v1.x.y version of whoisdomain compatible with this repo | ||
* changes will be verified and back copied also here for the time being | ||
* starting 2024-02, this repo will be abandon-ware | ||
|
||
## Support | ||
* Python 3.x is supported for x >= 9 | ||
* Python 2.x IS NOT supported. | ||
# whoisdomain | ||
|
||
* A Python package for retrieving WHOIS information of DOMAINS ONLY. | ||
* Python 2.x IS NOT supported. | ||
* Currently no additional python packages need to be installed. | ||
|
||
## Features | ||
* Python wrapper for the "whois" cli command of your operating system. | ||
* Simple interface to access parsed WHOIS data for a given domain. | ||
* Able to extract data for all the popular TLDs (com, org, net, biz, info, pl, jp, uk, nz, ...). | ||
* Query a WHOIS server directly instead of going through an intermediate web service like many others do. | ||
* Works with Python >= 3.9 | ||
* All dates as datetime objects. | ||
* Possibility to cache results. | ||
* Verbose output on stderr during debugging to see how the internal functions are doing their work | ||
* raise an exception on quota-exceeded type responses | ||
* raise an exception on PrivateRegistry tld's where we know the tld and know we don't know anything | ||
* allow optional cleaning of the whois response before extracting information | ||
* optionally allow IDN's to be translated to Punycode | ||
* optionally specify the whois command on query(...,cmd="whois") as in: https://github.com/gen1us2k/python-whois/ | ||
|
||
## Dependencies | ||
* please install also the command line "whois" of your distribution | ||
* this library parses the output of the "whois" cli command of your operating system | ||
--- | ||
|
||
## Docker | ||
* docker pull mbootgithub/whoisdomain:latest | ||
## Notes | ||
|
||
## Help Wanted | ||
Your contributions are welcome, look for the Help wanted tag https://github.com/DannyCork/python-whois/labels/help%20wanted | ||
* This package will not support querying ip CIDR ranges or AS information | ||
* This was a copy of the original DannyCork 'whois'. | ||
* Significantly refactored in 2023. | ||
* The output is still compatible with DannyCork 'whois' | ||
|
||
## Usage example | ||
## Versioning | ||
|
||
Install the cli `whois` of your operating system if it is not present already | ||
* I will start versioning at 1.x.x | ||
* the second item will be YYYYMMDD, | ||
* the third item will start from 1 and be only used if more than one update will have to be done in one day. | ||
|
||
Install `whois` package from your distribution (e.g apt install whois) | ||
Versions `1.x.x` will keep the output compatible with Danny Cork until 2024-02-03 (February 2024) | ||
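
The `1.YYYYMMDD.N` scheme described above can be split back into its parts when you need to compare releases. The following is a minimal sketch; `ReleaseVersion` and `parse_release_version` are hypothetical helper names for illustration and are not part of whoisdomain.

```python
from datetime import date, datetime
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    """The three items of a version string such as '1.20231115.1'."""

    major: int
    released: date
    sequence: int


def parse_release_version(version: str) -> ReleaseVersion:
    """Split '1.YYYYMMDD.N' into major, release date and same-day sequence."""
    major, day, sequence = version.split(".")
    return ReleaseVersion(
        major=int(major),
        released=datetime.strptime(day, "%Y%m%d").date(),
        sequence=int(sequence),
    )


v = parse_release_version("1.20231115.1")
print(v.major, v.released.isoformat(), v.sequence)
```

Because the result is a tuple, two parsed versions compare in release order, first by date and then by the same-day sequence number.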
|
||
## Releases | ||
|
||
$pip install whois | ||
* Releases are available at: [PyPI](https://pypi.org/project/whoisdomain/) | ||
|
||
>>> import whois | ||
>>> domain = whois.query('google.com') | ||
PyPI releases can be installed with: | ||
|
||
>>> print(domain.__dict__) | ||
{ | ||
'expiration_date': datetime.datetime(2020, 9, 14, 0, 0), | ||
'last_updated': datetime.datetime(2011, 7, 20, 0, 0), | ||
'registrar': 'MARKMONITOR INC.', | ||
'name': 'google.com', | ||
'creation_date': datetime.datetime(1997, 9, 15, 0, 0) | ||
} | ||
* `pip install whoisdomain` | ||
|
||
>>> print(domain.name) | ||
google.com | ||
## Features | ||
* See: [Features](docs/Features.md) | ||
|
||
>>> print(domain.expiration_date) | ||
2020-09-14 00:00:00 | ||
## Dependencies | ||
* please install also the command line "whois" of your distribution as this library parses the output of the "whois" cli command of your operating system | ||
|
||
## ccTLD & TLD support | ||
see the file: ./whois/tld_regexpr.py | ||
or call whois.validTlds() | ||
### Notes for Mac users | ||
* the default cli whois on Mac has been observed to show each forwarding step in its output, which makes parsing the result very unreliable. | ||
* installing whois with Homebrew (`brew install whois`) generally gives better results. | ||
|
||
## Issues | ||
* Raise an issue https://github.com/DannyCork/python-whois/issues/new | ||
## Docker release | ||
* See [Docker](docs/Docker.md) | ||
|
||
## Changes: 2022-06-09: maarten_boot: | ||
* the returned list of name_servers is now a sorted unique list and not a set | ||
* the help function whois.validTlds() now outputs the true tld with dots | ||
## Usage example | ||
* See [Usage](docs/Usage.md) | ||
|
||
## 2022-09-27: maarten_boot | ||
* add test2.py to replace test.py | ||
* ./test2.py -h will show the possible usage | ||
* all tests from the original program are now files in the ./tests directory | ||
* test can be done on all supported tld's with -a or --all and limited by regex with -r <pattern> or --reg=<pattern> | ||
## whoisdomain | ||
* the cli `whoisdomain` is documented in [whoisdomain-cli](docs/whoisdomain-cli.md) | ||
|
||
## 2022-11-04: maarten_boot | ||
* add support for Iana example.com, example.net | ||
## ccTLD & TLD support | ||
|
||
## 2022-11-07: maarten_boot | ||
* add testing against static known data in dir: ./testdata/<domain>/output | ||
* test.sh will test all domains in testdata without actually calling whois, the input data is instead read from testdata/<domain>/input | ||
Most `tld's` are now autodetected via IANA root db, see the Analizer directory | ||
and `make suggest`. | ||
|
||
## 2022-11-11: maarten_boot | ||
* see the file: [tld_regexpr](./whoisdomain/tldDb/tld_regexpr.py) | ||
* for python use: `whoisdomain.validTlds()` | ||
* for cli use `whoisdomain -S` | ||
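
As a sketch of how the list of supported tld's might be used from Python: the helper below finds the longest supported suffix of a domain name. `match_supported_tld` is a hypothetical helper, not part of the package, and it assumes that `whoisdomain.validTlds()` returns an iterable of dotted tld strings such as `co.uk`. A hand-written sample list is used so the example runs without the package installed.

```python
from typing import Iterable, Optional


def match_supported_tld(domain: str, valid_tlds: Iterable[str]) -> Optional[str]:
    """Return the longest supported suffix of `domain`, or None if there is none."""
    known = {t.lower().strip(".") for t in valid_tlds}
    labels = domain.lower().strip(".").split(".")
    # Try the longest candidate first ('co.uk' before 'uk'); skip the full name itself.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in known:
            return candidate
    return None


# Hand-written sample; in real use pass whoisdomain.validTlds() instead (assumed shape).
sample = ["com", "uk", "co.uk"]
print(match_supported_tld("www.example.co.uk", sample))
```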
|
||
* add support for returning the raw data from the whois command: flag include_raw_whois_text | ||
* add support for handling unsupported domains via whois raw text only: flag return_raw_text_for_unsupported_tld | ||
--- | ||
|
||
## 2023-01-18: sorrowless | ||
## Support | ||
* Python 3.x is supported for x >= 9 | ||
* Python 2.x IS NOT supported. | ||
|
||
* add an opportunity to specify maximum cache age | ||
## Authors | ||
* See: [Authors](docs/Authors.md) | ||
|
||
## 2023-01-25: maarten_boot | ||
--- | ||
|
||
* convert the tld file to a Dict, we now no longer need a mapper for python keywords or second level domains. | ||
* utf8 level domains also need no mapper anymore and can be added as is with a translation to xn--<something> | ||
* added xn-- tlds for all known utf-8 domains we currently have | ||
* we can now add new domains on the fly or change them: whois.mergeExternalDictWithRegex(aDictToOverride) see example exampleExtend.py | ||
## Updates | ||
* see [Updates](docs/Updates.md) for a full history of changes. | ||
* Only the latest update is mentioned here | ||
|
||
## 2023-01-27: maarten_boot | ||
### 1.20230906.1 | ||
* introduce parsing based on functions | ||
* allow contextual search in split data and plain data | ||
* allow contextual search based on earlier result | ||
* fix a few tld to return the proper registrant string (not nic handle) | ||
|
||
* add autodetect via iana tld file (this has only tld's) | ||
* add a central collection of all compiled regexes and reuse them: REG_COLLECTION_BY_KEY in _0_init_tld.py | ||
* refresh testdata now that tld has dot instead of _ if more than one level | ||
* add additional strings meaning domain does not exist | ||
### 1.20230913.1 | ||
* if you have installed `tld` (pip install tld) you can enable withPublicSuffix=True to process until you reach the pseudo tld. | ||
* the public_suffix info is added if available (and if requested) | ||
* example case is: ./test2.py -d www.dublin.airport.aero --withPublicSuffix | ||
|
||
## 2023-02-02: maarten_boot | ||
### 1.20230913.3 | ||
* fix use of `re.NOFLAG`: it was added in Python 3.11 and is not available in 3.9 | ||
|
||
* whois.QuotaStringsAdd(str) to add additional strings for over quota detection. whois.QuotaStrings() lists the current configured strings | ||
* whois.NoneStringsAdd(str) to add additional strings for NoSuchDomainExists detection (whois.query() returning None). whois.NoneStrings() lists the current configured strings | ||
* suppress messages to stderr if not verbose=True | ||
## 1.20230917.1 | ||
* prepare work on pylint | ||
* switch to logging: all verbose is currently log.debug(); to show set LOGLEVEL=DEBUG before calling, see Makefile: make test | ||
* experimental: add extractServers: bool default False; when true we will try to extract the "redirect info chain" on rfc1036/whois and jwhois for linux/darwin | ||
* add missing option to query(), test in production environment done | ||
|
||
## 2023-07-20: maarten_boot | ||
## 1.20231102.1 | ||
* fix from kazet for .pl tld. | ||
|
||
* sync with https://github.com/mboot-github/WhoisDomain; 1.20230720.1; (gov.tr), (com.ru, msk.ru, spb.ru), (option to preserve partial output after timeout) | ||
* sync with https://github.com/mboot-github/WhoisDomain; 1.20230720.2; add t_test hint support; fix some server hints | ||
## 1.20231115.1 | ||
New tld's and removal of a few tlds no longer supported at iana | ||
|
||
## 2023-08-21: mboot-github (maarten_boot) | ||
* abb, bw, bn, crown, crs, fj (does not work), gp (does not work), weir, realtor, post, mw, pf (a strange one), iq (gives timeout), mm, int, hm (does not work) | ||
|
||
* abandon any python below 3.9 (mypy compatibilities) | ||
* major refactor into more object based approach and parameterContext | ||
* allow custom caching backends (e.g. redis, dbm, ...) | ||
--- | ||
|
||
## 2023-09-22 see new parameters in whois/context/parameterContext.oy | ||
## in progress | ||
|
||
* Sync with latest whoisdomain | ||
* Allow cleaning up the http(s) info in the status response. | ||
* Allow correlation with tld (pip install tld) public_suffix. | ||
* Allow display of what whois-servers were used until we reach the final item. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# Authors | ||
|
||
* this is a renamed copy of original work done in: https://github.com/DannyCork/python-whois | ||
* the project is also related to the project: https://github.com/gen1us2k/python-whois | ||
* both seem derived from an older google.code site: https://code.google.com/archive/p/python-whois | ||
* aside from the original authors, many others already contributed to these repositories | ||
* if authors/contributors prefer to be named explicitly, they can add a line in [Historical.txt](Historical.txt) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
# Docker | ||
|
||
[Docker](https://hub.docker.com/r/mbootgithub/whoisdomain) | ||
|
||
* docker pull mbootgithub/whoisdomain:latest | ||
* docker run mbootgithub/whoisdomain -V # show version | ||
* docker run mbootgithub/whoisdomain -d google.com # run one domain | ||
* docker run mbootgithub/whoisdomain -a # run all tld | ||
* docker run mbootgithub/whoisdomain -d google.com -j | jq -r . # run one domain, output in json and reformat with jq | ||
* docker run mbootgithub/whoisdomain -d google.com -j | jq -r '.expiration_date' # output only the expiration date | ||
* docker run mbootgithub/whoisdomain -d google.com -j | jq -r '[ .expiration_date, .creation_date ] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
# Features | ||
* Python wrapper for the "whois" cli command of your operating system. | ||
* Simple interface to access parsed WHOIS data for a given domain. | ||
* Able to extract data for all the popular TLDs (com, org, net, biz, info, pl, jp, uk, nz, ...). | ||
* Query a WHOIS server directly instead of going through an intermediate web service like many others do. | ||
* Works with Python >= 3.9 | ||
* All dates as datetime objects. | ||
* Possibility to cache results. | ||
* Verbose output on stderr during debugging to see how the internal functions are doing their work | ||
* raise an exception on quota-exceeded type responses | ||
* raise an exception on PrivateRegistry tld's where we know the tld and know we don't know anything | ||
* allow optional cleaning of the response before extracting information | ||
* optionally allow IDN's to be translated to Punycode | ||
* optionally specify the whois command on query(...,cmd="whois") as in: https://github.com/gen1us2k/python-whois/ | ||
* the module is now 'mypy --strict' clean | ||
* the module now also exports a cli command whoisdomain | ||
* both the module and the cli now support showing the version with lib:whois.getVersion() or cli:whoisdomain -V | ||
* the whoisdomain can now output json data (one per domain: e.g 'whoisdomain -d google.com -j' ) | ||
* withRedacted: bool = False has been added to query(), if set to True any redacted fields will now be shown also (also supported in the cli whoisdomain as --withRedacted) | ||
* an analizer directory is present in the github repo; it is used to look for new IANA tld's that are currently unsupported but match known whois servers | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
# Updates | ||
## 1.20230627.2 | ||
* add Kenya proper whois server and known second level domains | ||
## 1.20230627.3 | ||
* add rw tld proper whois server and second level ; restore mistakenly deleted .toml file | ||
## 1.20230627.3 | ||
* additional Kenya second level domains | ||
## 1.20230712.2 | ||
* tld .edu now can have up to 10 nameservers; remove action on pull request | ||
## 1.20230717.1 | ||
* add tld: com.ru, msk.ru, spb.ru (all have a test documented), also update the tld: ru, the newlines are not needed. | ||
## 1.20230717.2 | ||
* add option to parse partial result after timeout has occurred (parse_partial_response:bool default False); this will need `stdbuf` installed otherwise it will fail | ||
## 1.20230718.3 | ||
* fix typo in whois server hint for tld: ru | ||
## 1.20230720.1 | ||
* add gov.tr; switch off status:available and status:free as None response, we should not interpret the result by default (we can add an option later) | ||
## 1.20230720.2 | ||
* fix server hints for derived second level "xxx.tr", add processing "_test" hints during 'test2.py -a' | ||
* add external caching framework that can be overridden for use of your own caching implementation | ||
* renaming various vars to make them more verbose | ||
* preparing for capturing all parameters in one object and passing that object around instead of many arguments in methods/functions | ||
* switch to json so we don't need an additional dependency in ParamContext | ||
* finish rework args to ParameterContext, split off domain as file | ||
## 1.20230803.1 | ||
* frenzy refactor-release | ||
## 1.20230804.1 | ||
* testing | ||
## 1.20230804.2 | ||
* testing after remove of leading dot in rw second level domains | ||
## 1.20230804.3 | ||
* simplify cache implementation after feedback from `baderdean` | ||
* "more lembas bread", refactor parse and query | ||
* remove option to typecheck CACHE_STUB, use try/catch/exit instead, does not work when timeout happens, removed ;-( | ||
* refactor doQuery create processWhoisDomainRequest, split off lastWhois | ||
## 1.20230806.1 | ||
* testing done, prep new release: "more lembas bread" | ||
* bug found with the default timeout: if no timeout is specified the program fails: all pypi releases before 2023-07-17 yanked | ||
## 1.20230807.1 | ||
* fix default timeout | ||
* add DummyCache, DBMCache, RedisCache with simple test in testCache.py, testing custom cache options | ||
## 1.20230811.1 | ||
* replace type hint | with Union for py3.9 compat; switch off experimental redis tools | ||
* switch off 3.[6-8] minimal is 3.9 we test against | ||
* start working on dataContext; | ||
* add more \_test items; reorder parts of tld_regexpr; | ||
* propagate all meta domains servers as they are not inherited, testing , some domains have been retracted mboot; 2023-08-23; | ||
* add suggestion from baderdean to parse fr domains with more focus on ORGANISATION | ||
* 2023-08-24: mboot: more \_test added to tld | ||
* verify all \_test on whois.nic.<tld> \_test: nic.<tld> fix where needed; remove some abandoned tld's | ||
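
The `|` to `Union` change listed above matters because on Python 3.9 an annotation such as `int | None` is evaluated when the function is defined and raises a `TypeError`. A minimal sketch of the 3.9-compatible spelling follows; `normalize_timeout` is a hypothetical function used only for illustration and is not taken from the whoisdomain sources.

```python
from typing import Optional, Union


# On Python 3.9, writing `timeout: int | float | None` here would fail at
# definition time; Union/Optional from typing works on 3.9 and later.
def normalize_timeout(timeout: Union[int, float, None] = None) -> Optional[float]:
    """Return the timeout as a float, or None when no timeout was given."""
    if timeout is None:
        return None
    return float(timeout)


print(normalize_timeout(5))
print(normalize_timeout())
```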
## 1.20230824.1 | ||
* mboot; to combine all new tests and changes, "the galloping Chutzpah release" | ||
## 1.20230824.5 | ||
* mboot; fix missing module in whl | ||
* restore python 3.6 test as I still use it on one remaining app with python 3.6 (make testP36) | ||
* finalize verification of all tld's in iana, add test where this can be auto generated from whois.nic.<tld> 2023-08-28; mboot | ||
## 1.20230829.1 | ||
* mboot; all \_test now work, using analizer tool to verify that iana tld db web site and tld_regexpr match | ||
* add DEBUG to all verbose strings | ||
* remove tldString and dList and domain , all go via dc (dataContext) now | ||
* run tests and add new TODO | ||
* moving all TLD_RE activities to tldInfo.py, and all exported helper funcs to helpers.py | ||
* thinking about adding more complicated nested regex extractors to target contact info | ||
* start with dependency inject: parser is passed as arg | ||
* add cli interface to dependency inject, rightsize after test | ||
* finish dependency inject move Domain create outside | ||
* prep for other types or regex; all simple regex strings in tld_regexpr.py now need R() around them | ||
* use currying to make all regex strings into function calls in whoisParser.py; all regexes in tld_regexpr.py are now converted on import to function calls via R() | ||
* update tld: sk to use contextual extract, test with google.sk | ||
* add findFromToAndLookForWithFindFirst contextual search based on a previous findFirst, used in "fr" tld, example google.fr, {} is used to add to fromStr | ||
|
||
## 1.20230904.1 | ||
* only on pypi-test | ||
|
||
--- | ||
|
||
## 1.20230906.1 | ||
* introduce parsing based on functions | ||
* allow contextual search in split data and plain data | ||
* allow contextual search based on earlier result | ||
* fix a few tld to return the proper registrant string (not nic handle) | ||
* introduce parsing based on functions, allow contextual search in split data and plain data, allow contextual search based on earlier result; fix a few tld to return the proper registrant string (not nic handle) | ||
|
||
### 1.20230906.1 | ||
* introduce parsing based on functions | ||
* allow contextual search in split data and plain data | ||
* allow contextual search based on earlier result | ||
* fix a few tld to return the proper registrant string (not nic handle) | ||
|
||
### 1.20230913.1 | ||
* if you have installed `tld` (pip install tld) you can enable withPublicSuffix=True to process until you reach the pseudo tld. | ||
* the public_suffix info is added if available (and if requested) | ||
* example case is: ./test2.py -d www.dublin.airport.aero --withPublicSuffix | ||
|
||
### 1.20230913.3 | ||
* fix use of `re.NOFLAG`: it was added in Python 3.11 and is not available in 3.9 | ||
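
The 3.9 incompatibility noted above comes from a flag constant in the standard `re` module: `re.NOFLAG` only exists from Python 3.11 on. The sketch below shows one way to stay compatible with 3.9; it is not necessarily how the package fixed it, and `NO_FLAG` and `compile_pattern` are hypothetical names.

```python
import re

# re.NOFLAG exists only on Python 3.11+; its value is 0, so 0 is a safe fallback.
NO_FLAG = getattr(re, "NOFLAG", 0)


def compile_pattern(pattern: str, ignore_case: bool = False) -> "re.Pattern[str]":
    """Compile a regex with either IGNORECASE or no flags at all."""
    flags = re.IGNORECASE if ignore_case else NO_FLAG
    return re.compile(pattern, flags)


match = compile_pattern(r"Domain Name:\s*(.+)", ignore_case=True).search(
    "domain name: google.com"
)
print(match.group(1) if match else None)
```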
|
||
--- | ||
|
||
## in progress | ||
|
||
* prepare work on pylint | ||
* switch to logging: all verbose is currently log.debug(); to show set LOGLEVEL=DEBUG before calling, see Makefile: make test | ||
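
Since verbose output now goes through `log.debug()`, a calling script only sees it when its logging level is DEBUG. The following is a minimal sketch of honouring a `LOGLEVEL` environment variable on the application side; `level_from_env` and the logger name `whoisdomain-demo` are assumptions for illustration, and the package's own handling of `LOGLEVEL` may differ.

```python
import logging
import os
from typing import Mapping


def level_from_env(env: Mapping[str, str], default: int = logging.WARNING) -> int:
    """Translate LOGLEVEL (e.g. 'DEBUG') into a numeric logging level."""
    name = env.get("LOGLEVEL", "").upper()
    level = getattr(logging, name, None)
    # Unknown or missing names fall back to the default level.
    return level if isinstance(level, int) else default


logging.basicConfig(level=level_from_env(os.environ))
log = logging.getLogger("whoisdomain-demo")  # hypothetical logger name
log.debug("only visible when LOGLEVEL=DEBUG is set")
```

Run as `LOGLEVEL=DEBUG python yourscript.py` to see the debug line; without the variable the sketch stays quiet.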
|
||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# Usage | ||
|
||
## Requirements | ||
|
||
* Install the cli `whois` of your operating system if it is not present already, | ||
* Debian / Ubuntu: | ||
* `sudo apt install whois` | ||
* Fedora/Centos/Rocky: | ||
* `sudo yum install whois` | ||
|
||
## whois used in python (compatible with the Danny Cork version) | ||
|
||
### example for fedora 37 | ||
|
||
|
||
sudo yum install whois | ||
pip install whoisdomain | ||
|
||
python | ||
# to make it compatible with Danny_Cork whois | ||
>>> import whoisdomain as whois | ||
>>> d = whois.query('google.com') | ||
>>> print(d.__dict__) | ||
|
||
{'name': 'google.com', 'tld': 'com', 'registrar': 'MarkMonitor, Inc.', 'registrant_country': 'US', 'creation_date': datetime.datetime(1997, 9, 15, 9, 0), 'expiration_date': datetime.datetime(2028, 9, 13, 9, 0), 'last_updated': datetime.datetime(2019, 9, 9, 17, 39, 4), 'status': 'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)', 'statuses': ['clientDeleteProhibited (https://www.icann.org/epp#clientDeleteProhibited)', 'clientTransferProhibited (https://www.icann.org/epp#clientTransferProhibited)', 'clientUpdateProhibited (https://www.icann.org/epp#clientUpdateProhibited)', 'serverDeleteProhibited (https://www.icann.org/epp#serverDeleteProhibited)', 'serverTransferProhibited (https://www.icann.org/epp#serverTransferProhibited)', 'serverUpdateProhibited (https://www.icann.org/epp#serverUpdateProhibited)'], 'dnssec': False, 'name_servers': ['ns1.google.com', 'ns2.google.com', 'ns3.google.com', 'ns4.google.com'], 'registrant': 'Google LLC', 'emails': ['abusecomplaints@markmonitor.com', 'whoisrequest@markmonitor.com']} | ||
|
||
>>> print (d.expiration_date) | ||
2028-09-13 09:00:00 | ||
|
||
>>> print(d.name) | ||
google.com | ||
|
||
>>> print (d.creation_date) | ||
1997-09-15 09:00:00 |
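
Building on the session above, a script will usually want to guard against a missing result before reading attributes, since `query()` can return None when a domain does not exist. The sketch below computes the days left until expiration; `days_until_expiry` is a hypothetical helper, and a stand-in object carrying the values printed above is used so the example runs without network access.

```python
from datetime import datetime
from types import SimpleNamespace
from typing import Any, Optional


def days_until_expiry(d: Any, now: datetime) -> Optional[int]:
    """Whole days until d.expiration_date, or None when it is unknown."""
    # Guard against a None result and against a missing expiration date.
    if d is None or getattr(d, "expiration_date", None) is None:
        return None
    return (d.expiration_date - now).days


# Stand-in for the object returned by whois.query('google.com').
sample = SimpleNamespace(
    name="google.com",
    expiration_date=datetime(2028, 9, 13, 9, 0),
)
print(days_until_expiry(sample, datetime(2028, 9, 1, 9, 0)))
```

With the package installed, the same helper can be called as `days_until_expiry(whois.query('google.com'), datetime.now())`.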