- Definition: Systems designed to mimic human intelligence through data processing and pattern recognition
- Real-world Example: Tesla's self-driving cars that process visual data, make decisions, and navigate in real-time
- Components:
- Perception: Understanding input data
- Learning: Improving from experience
- Problem-solving: Making decisions based on learned patterns
- Reasoning: Drawing conclusions from data
- Historical Context: Evolution from rule-based systems to modern machine learning
- Definition: Systems that improve performance through exposure to data without explicit programming
- Types:
- Supervised Learning: Learning from labeled data (e.g., spam detection)
- Unsupervised Learning: Finding patterns in unlabeled data (e.g., customer segmentation)
- Reinforcement Learning: Learning through trial and error (e.g., game AI)
- Real-world Example: Netflix's recommendation system learning from viewing habits
- Relation to GenAI: Forms the foundation for modern generative models
- Definition: AI systems that create new content based on training data
- Capabilities:
- Text Generation: Creating articles, stories, code
- Image Creation: Generating, editing, and modifying images
- Audio Synthesis: Creating music, speech, sound effects
- Code Generation: Creating and completing code
- Current Limitations:
- Hallucinations: Generating false information
- Context window: Limited input/output length
- Biases: Reflecting training data biases
- Real-world Example: GitHub Copilot generating code suggestions based on context
- Definition: Neural network architecture that processes sequential data using attention mechanisms
- Key Components:
- Attention Mechanism: Weighs importance of different parts of input
- Multi-head Attention: Processes multiple attention patterns simultaneously
- Feed-forward Networks: Processes transformed representations
- Real-world Example: Google Translate using transformers for more accurate translations
- Definition: Family of language models trained on vast text data using transformer architecture
- Versions:
- GPT-3: 175 billion parameters
- GPT-4: Enhanced capabilities and multimodal features
- Capabilities:
- Text Completion
- Translation
- Code Generation
- Reasoning
- Real-world Example: ChatGPT answering questions and generating content
- Definition: Dense vector representations of text/images in high-dimensional space
- Types:
- Word Embeddings: Representing individual words (Word2Vec, GloVe)
- Sentence Embeddings: Representing entire sentences (BERT, USE)
- Image Embeddings: Representing images (ResNet, VIT)
- Use Cases:
- Semantic Search: Finding similar content
- Classification: Categorizing content
- Recommendation Systems: Finding related items
- Real-world Example: Spotify using audio embeddings to recommend similar songs
- Definition: Maximum amount of text/tokens an LLM can process at once
- Importance:
- Determines model's ability to understand long documents
- Affects cost of API calls
- Influences response quality
- Real-world Example: Claude-3 processing a 150,000-token technical document
- Definition: Smallest unit of text processing in LLMs
- Types:
- Word-based: Split on word boundaries
- Subword-based: Common patterns in text
- Character-based: Individual characters
- Examples:
Input: "Let's learn AI!" Tokens: ["Let", "'s", "learn", "A", "I", "!"]
- Importance:
- Context window limits
- Cost calculations
- Processing efficiency
- Definition: Art of crafting inputs to get desired outputs from LLMs
- Components:
- System Instructions: Setting behavior and constraints
- Few-shot Examples: Providing examples for learning
- Task Description: Clear explanation of requirements
- Techniques:
- Chain-of-Thought: Breaking down complex reasoning
- Zero-shot: No examples needed
- Few-shot: Learning from examples
- Real-world Example:
System: You are a professional email writer. Few-shot Example: Input: "Schedule meeting with John" Output: "Subject: Meeting Request Dear John, I hope this email finds you well. I would like to schedule a meeting with you. Please let me know your availability. Best regards"
- Definition: Combining LLMs with external knowledge retrieval
- Components:
- Document Processing: Converting documents to vectors
- Knowledge Base: Stored document embeddings
- Retrieval System: Finding relevant information
- Generation: Creating responses using retrieved context
- Benefits:
- Improved accuracy
- Reduced hallucinations
- Current information access
- Real-world Example: A legal AI system that:
- Stores law documents as embeddings
- Retrieves relevant laws for a query
- Generates accurate legal advice
- Definition: System for weighing importance of different parts of input
- Types:
- Self-attention: Relating different positions of single sequence
- Cross-attention: Relating positions from two sequences
- Real-world Example: Translation system focusing on relevant words:
Input: "The bank is by the river" Attention: Focuses on context to understand "bank" means riverbank
- Definition: Model generating false or inconsistent information
- Types:
- Factual: Incorrect facts
- Logical: Inconsistent reasoning
- Contextual: Misunderstanding context
- Prevention:
- Using RAG for factual grounding
- Proper prompt engineering
- Output validation
- Real-world Example: A medical AI system generating non-existent medical conditions
- Definition: Interface that allows different software to communicate
- Types:
- REST: Standard web APIs
- GraphQL: Flexible data querying
- WebSockets: Real-time communication
- Usage in GenAI:
- Interacting with LLM providers
- Rate limiting and quotas
- Authentication and security
- Example Implementation:
import requests def call_ai_api(prompt): response = requests.post( 'https://api.example.com/v1/completions', headers={'Authorization': f'Bearer {API_KEY}'}, json={'prompt': prompt} ) return response.json()
- Definition: Lightweight web framework for Python
- Key Components:
- Routes: URL endpoint handling
- Templates: Dynamic HTML generation
- Request Processing: Handling user inputs
- Example Structure:
from flask import Flask, request, jsonify app = Flask(__name__) @app.route('/generate', methods=['POST']) def generate_text(): prompt = request.json.get('prompt') # AI processing logic return jsonify({"response": result})
- Definition: Framework for building LLM-powered applications
- Features:
- Document Loading: PDF, HTML, Text files
- Index Creation: Vector, List, Tree indexes
- Query Processing: Semantic, Hybrid search
- Real-world Example: Building a customer service bot that:
- Indexes company documentation
- Retrieves relevant information
- Generates accurate responses
- Definition: High-performance AI inference platform
- Features:
- Fast processing speeds
- Cost-effective inference
- Simple API integration
- Example Integration:
from groq import Groq client = Groq(api_key="your-key") response = client.chat.completions.create( messages=[{"role": "user", "content": "Hello!"}], model="mixtral-8x7b-32768" )
- Virtual Environment:
python -m venv myenv source myenv/bin/activate # Unix myenv\Scripts\activate # Windows
- Dependencies:
# requirements.txt flask==3.0.0 llama-index==0.9.7 python-dotenv==1.0.0 groq==0.4.0
- Environment Variables:
# .env file GROQ_API_KEY=your_key_here FLASK_ENV=development
[Previous theoretical sections remain exactly as before through "Best Practices and Security"]
Task: Understand how different texts are tokenized
from transformers import AutoTokenizer
def analyze_tokens(text):
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokens = tokenizer.encode(text)
tokens_decoded = [tokenizer.decode([token]) for token in tokens]
print(f"Original text: {text}")
print(f"Number of tokens: {len(tokens)}")
print("Tokens breakdown:", tokens_decoded)
print("-" * 50)
# Test cases
texts = [
"Hello world!",
"Artificial Intelligence",
"https://www.example.com",
"The quick brown fox jumps over the lazy dog."
]
for text in texts:
analyze_tokens(text)
Task: Write prompts for different scenarios
import os
from groq import Groq
client = Groq(api_key=os.getenv("GROQ_API_KEY"))
# Example prompts and their purposes
prompts = {
"Classification": """
Task: Classify the sentiment of the following text as positive, negative, or neutral.
Text: "The movie was amazing! I loved every minute of it."
Format your response as a single word: positive, negative, or neutral.
""",
"Structured Output": """
Extract the following information from the text and format it as JSON:
Text: "John Smith, age 35, works as a software engineer at Tech Corp since 2020."
Required fields: name, age, occupation, company, start_year
""",
"Step by Step": """
Explain how to make a peanut butter sandwich.
Format your response as numbered steps.
Keep each step brief and clear.
"""
}
def test_prompt(prompt_name, prompt_text):
print(f"Testing: {prompt_name}")
print("Prompt:", prompt_text)
response = client.chat.completions.create(
messages=[{"role": "user", "content": prompt_text}],
model="mixtral-8x7b-32768"
)
print("Response:", response.choices[0].message.content)
print("-" * 50)
for name, prompt in prompts.items():
test_prompt(name, prompt)
Task: Create a simple Flask app with AI integration
from flask import Flask, request, jsonify
from groq import Groq
import os
app = Flask(__name__)
client = Groq(api_key=os.getenv("GROQ_API_KEY"))
@app.route('/health')
def health_check():
return jsonify({"status": "healthy"})
@app.route('/analyze', methods=['POST'])
def analyze_text():
data = request.json
text = data.get('text', '')
prompt = f"""
Analyze the following text and provide:
1. Main topic
2. Sentiment
3. Key words (maximum 3)
Text: {text}
Format response as JSON.
"""
response = client.chat.completions.create(
messages=[{"role": "user", "content": prompt}],
model="mixtral-8x7b-32768"
)
return jsonify({
"analysis": response.choices[0].message.content
})
if __name__ == '__main__':
app.run(debug=True)
Task: Implement basic document retrieval and querying
from llama_index import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms import Groq
import os
def setup_document_qa():
# Initialize LLM
llm = Groq(
api_key=os.getenv("GROQ_API_KEY"),
model_name="mixtral-8x7b-32768"
)
# Load documents
documents = SimpleDirectoryReader('data').load_data()
# Create index
index = VectorStoreIndex.from_documents(documents)
# Create query engine
query_engine = index.as_query_engine()
return query_engine
def test_questions(query_engine):
questions = [
"What are the main topics covered in the documents?",
"Can you summarize the key points?",
"What are the conclusions drawn?"
]
for question in questions:
print(f"Question: {question}")
response = query_engine.query(question)
print(f"Response: {response}\n")
Task: Create a complete project setup script
# setup_project.py
import subprocess
import os
import sys
from pathlib import Path
def setup_project():
# Create project structure
directories = ['data', 'templates', 'static', 'logs']
for directory in directories:
Path(directory).mkdir(exist_ok=True)
# Create virtual environment
subprocess.run([sys.executable, '-m', 'venv', 'venv'])
# Create requirements.txt
requirements = """
flask==3.0.0
llama-index==0.9.7
python-dotenv==1.0.0
groq==0.4.0
transformers==4.37.2
""".strip()
with open('requirements.txt', 'w') as f:
f.write(requirements)
# Create .env template
env_template = """
GROQ_API_KEY=your_api_key_here
FLASK_ENV=development
""".strip()
with open('.env.template', 'w') as f:
f.write(env_template)
print("Project structure created successfully!")
print("\nNext steps:")
print("1. Activate virtual environment:")
print(" Windows: .\\venv\\Scripts\\activate")
print(" Unix/MacOS: source venv/bin/activate")
print("2. Install requirements: pip install -r requirements.txt")
print("3. Copy .env.template to .env and add your API keys")
if __name__ == '__main__':
setup_project()
- What happens when the text contains special characters or emojis? (Exercise 1)
- How does changing the prompt format affect the LLM's response? (Exercise 2)
- What happens if the API key is invalid? How should we handle this? (Exercise 3)
- How does the RAG system perform with different types of documents? (Exercise 4)
- What additional environment variables might be needed for production? (Exercise 5)
- Modify the token analyzer to compare different tokenizer models
- Create a prompt template system with variable substitution
- Add rate limiting and caching to the Flask application
- Implement document chunking and metadata in the RAG system
- Add Docker support to the project setup