The maintainers of NewsHound have developed and tested predefined extraction modules for various news sources around the world. The table below lists these news sources. This table will be periodically updated to include additional sources.

Last Updated: 11.20.2021

Last Tested: 11.20.2021


| News Source | URL | Language |
| --- | --- | --- |
| ABC (Australian Broadcasting Corporation) | www.abc.net.au | English |
| ABC News | www.abcnews.go.com | English |
| Alarabiya | www.alarabiya.net | Arabic |
| Amar Ujala | www.amarujala.com | Hindi |
| Asahi Shimbun | www.asahi.com | Japanese |
| Associated Press | www.apnews.com | English |
| Atlantic Council | www.atlanticcouncil.org | English |
| BBC | www.bbc.com | English |
| Business Standard | www.business-standard.com | English |
| Canadian Broadcasting Corporation | www.cbc.ca | English |
| CBS News | www.cbsnews.com | English |
| Chicago Tribune | www.chicagotribune.com | English |
| China Daily | www.chinadaily.com.cn | English |
| Christian Science Monitor | www.csmonitor.com | English |
| CNBC | www.cnbc.com | English |
| CNN | www.cnn.com | English |
| Corriere della Sera | www.corriere.it | Italian |
| Dainik Bhaskar | www.bhaskar.com | Hindi |
| Dainik Jagran | www.jagran.com | Hindi |
| Daily Mail | www.dailymail.co.uk | English |
| Der Spiegel | www.spiegel.de | German |
| Des Moines Register | www.desmoinesregister.com | English |
| El Nuevo Dia | www.elnuevodia.com | Spanish |
| El Mundo | www.elmundo.es | Spanish |
| El Pais | elpais.com | Spanish |
| Forbes | www.forbes.com | English |
| Fox Business | www.foxbusiness.com | English |
| Fox News | www.foxnews.com | English |
| Guangzhou Daily | gzdaily.dayoo.com | Chinese |
| Jahan News | www.jahannews.com | Persian |
| Japan Times | www.japantimes.co.jp | English |
| Jerusalem Post | www.jpost.com | English |
| Khaosod | www.khaosod.co.th | Thai |
| Korea Times | www.koreatimes.com | Korean |
| Korea Times | www.koreatimes.co.kr | English |
| La Opinión | laopinion.com | Spanish |
| La Repubblica | www.repubblica.it | Italian |
| Le Monde | www.lemonde.fr | French |
| Le Parisien | www.leparisien.fr | French |
| Los Angeles Times | www.latimes.com | English |
| Mainichi Shimbun | mainichi.jp | English |
| Mainichi Shimbun | mainichi.jp | Japanese |
| Malayala Manorama | www.manoramaonline.com | Malayalam |
| MarketWatch | www.marketwatch.com | English |
| Mercury News | www.mercurynews.com | English |
| Miami Herald | www.miamiherald.com | English |
| National Review | www.nationalreview.com | English |
| NBC News | www.nbcnews.com | English |
| New Delhi Television Ltd (NDTV) | www.ndtv.com | English |
| Newsweek | www.newsweek.com | English |
| Nikkei | asia.nikkei.com | English |
| People's Daily | www.people.com.cn | Chinese |
| People's Daily | en.people.cn | English |
| Politico | www.politico.com | English |
| Pravda | www.pravda.ru | Russian |
| Rajasthan Patrika | www.patrika.com | Hindi |
| Reference News | www.cankaoxiaoxi.com | Chinese |
| Reuters | www.reuters.com | English |
| Star Advertiser | www.staradvertiser.com | English |
| Tampa Bay Times | www.tampabay.com | English |
| Times of Israel | www.timesofisrael.com | English |
| The Asahi Shimbun | www.asahi.com/ajw | English |
| The Asahi Shimbun | www.asahi.com | Japanese |
| The Atlantic | www.theatlantic.com | English |
| The Daily Beast | www.thedailybeast.com | English |
| The Guardian | www.theguardian.com | English |
| The Hill | www.thehill.com | English |
| The San Diego Union Tribune | www.sandiegouniontribune.com | English |
| The Times of India | www.timesofindia.indiatimes.com | English |
| USA Today | www.usatoday.com | English |
| Washington Times | www.washingtontimes.com | English |
| World Economic Forum | www.weforum.org | English |
| World Economic Forum - Chinese | cn.weforum.org | Chinese |
| World Economic Forum - French | fr.weforum.org | French |
| World Economic Forum - Spanish | es.weforum.org | Spanish |
| World Economic Forum - Japanese | jp.weforum.org | Japanese |
| Yahoo News | news.yahoo.com | English |
| Yomiuri Shimbun | www.yomiuri.co.jp | Japanese |

As of 11.20.2021, each of the news sources listed above can be queried for the following data elements related to an individual article:

  • Title
  • Description/Summary
  • Keywords
  • Author(s)
  • Text/Content
  • Language Type
  • Published Date
  • Modified Date
  • Top Image
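
The nine data elements above map naturally onto a small record type. The sketch below is illustrative only: `ArticleRecord` and `missing_elements` are hypothetical names, not part of NewsHound's API, and the field types are assumptions. Every field defaults to `None` so that a record can still be built when a source does not expose a given element.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import List, Optional


@dataclass
class ArticleRecord:
    """Hypothetical container for the nine data elements listed above."""
    title: Optional[str] = None
    description: Optional[str] = None
    keywords: Optional[List[str]] = None
    authors: Optional[List[str]] = None
    text: Optional[str] = None
    language: Optional[str] = None
    published_date: Optional[datetime] = None
    modified_date: Optional[datetime] = None
    top_image: Optional[str] = None  # assumed to hold the image URL


def missing_elements(record: ArticleRecord) -> List[str]:
    """Return the names of elements that are None or empty, in field order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]
```

Calling `missing_elements(ArticleRecord(title="T", text="body"))` lists every field except `title` and `text`, which gives a quick way to see what a particular source did not provide.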

SPECIAL NOTE: Some news sources will be missing certain data elements because those elements are not available in the source's navigational structure. The most commonly missing element is Keywords, followed by Modified Date. The maintainers of NewsHound are exploring methods to extract keywords from an article's title and description when a news source does not provide them.
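
One simple way to derive keywords from a title and description is to count the non-stopword terms and keep the most frequent ones. The sketch below illustrates that idea only; it is not the method the NewsHound maintainers use or plan to use. The function name `fallback_keywords`, the `limit` parameter, and the tiny English-only stopword set are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import List

# Deliberately small, English-only stopword set (an assumption for this sketch).
STOPWORDS = frozenset({
    "a", "an", "and", "are", "as", "at", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "to", "was", "with",
})


def fallback_keywords(title: str, description: str, limit: int = 5) -> List[str]:
    """Rank non-stopword terms from the title and description by frequency."""
    # Lowercase and keep alphabetic tokens of at least two characters.
    words = re.findall(r"[a-z][a-z'-]+", f"{title} {description}".lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    # Sort by count (descending), then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:limit]]
```

Because the tokenizer and stopword set only cover English, this approach would need language-specific handling for the non-English sources in the table.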