This Python code retrieves an article from a provided URL, extracts its title and text, and then utilizes the ChatOpenAI library (assuming access) to generate a bulleted summary using the GPT-4 model.
pip install requests newspaper dotenv langchain chat-openai
Update article_url
with the actual URL of the article you want to summarize.
Run the Script:
Execute the Python script using a command like:
python your_script_name.py
- json: For potential JSON data handling (not explicitly used in this code).
- requests: To fetch web content from the provided URL.
- newspaper: To extract title and text from the downloaded article.
- dotenv: To manage environment variables (not used in the current version).
- langchain.schema.HumanMessage: Defines the message format for ChatOpenAI.
- langchain_community.chat_models.ChatOpenAI: Enables interaction with ChatOpenAI.
- Defines a user agent string for the HTTP request headers.
- Sets the article_url variable to hold the article's URL.
- Creates a requests.Session object for managing HTTP requests.
- Attempts to retrieve the article using requests.get.
- If successful (status code 200), creates a newspaper.Article object, downloads and parses the article.
- Extracts title and text using article.title and article.text.
- Prints an error message if fetching fails.
- Constructs a template string using f-strings for formatted text insertion.
- Inserts the retrieved article_title and article_text into the template.
- (Assuming Access) Creates a ChatOpenAI object with the gpt-4 model (if available) and sets the temperature parameter (controls randomness).
- (Assuming Access) Creates a list of HumanMessage objects containing the prompt.
- (Assuming Access) Generates a summary using chat(messages).
- (Assuming Access) Prints the generated summary content.
- Consider exploring alternative summarization libraries or techniques (e.g., Gensim, spaCy) for broader applicability if access to ChatOpenAI's GPT-4 model is limited.
- The code can be further enhanced with additional features like:
- User input for the article URL.
- Support for different summarization models.
- Output formatting options (bullet points, text, etc.).