Update README.md

tensorlakeai · Jun 18, 2024 · commit 882ece5
1 parent fb29bd2
Showing 1 changed file with 34 additions and 34 deletions.
diff --git a/README.md b/README.md
@@ -85,6 +85,40 @@ context = client.search_index(name="sportsknowledgebase.minilml6.embedding", que
 
 > The method `wait_for_extraction` blocks the client until Indexify runs the extraction on the ingested content. In production applications you most likely won't block your application, and will instead let extraction run asynchronously.
 
+###  PDF Extraction and Retrieval
+This example shows how to create a pipeline that extracts from PDF documents.
+More information here - https://docs.getindexify.ai/usecases/pdf_extraction/
+
+#### Create an Extraction Graph
+```python
+from indexify import IndexifyClient, ExtractionGraph
+import requests
+client = IndexifyClient()
+
+extraction_graph_spec = """
+name: 'pdfqa'
+extraction_policies:
+   - extractor: 'tensorlake/pdfextractor'
+     name: 'docextractor'
+"""
+
+extraction_graph = ExtractionGraph.from_yaml(extraction_graph_spec)
+client.create_extraction_graph(extraction_graph)
+```
+
+#### Upload a Document
+```python
+with open("sample.pdf", 'wb') as file:
+  file.write((requests.get("https://extractor-files.diptanu-6d5.workers.dev/scientific-paper-example.pdf")).content)
+content_id = client.upload_file("pdfqa", "sample.pdf")
+```
+
+#### Get Text, Image and Tables
+```python
+client.wait_for_extraction(content_id)
+print(client.get_extracted_content(content_id, "pdfqa", "docextractor"))
+```
+
 
 ### Podcast Summarization and Embedding
 This example shows how to transcribe audio, and create a pipeline that embeds the transcription 
@@ -182,40 +216,6 @@ client.get_extracted_content(content_id, "imageknowledgebase", "object_detection
 print(client.sql_query("select * from imageknowledgebase where object_name='person';"))
 ```
 
-###  PDF Extraction and Retrieval
-This example shows how to create a pipeline that extracts from PDF documents.
-More information here - https://docs.getindexify.ai/usecases/pdf_extraction/
-
-#### Create an Extraction Graph
-```python
-from indexify import IndexifyClient, ExtractionGraph
-import requests
-client = IndexifyClient()
-
-extraction_graph_spec = """
-name: 'pdfqa'
-extraction_policies:
-   - extractor: 'tensorlake/pdfextractor'
-     name: 'docextractor'
-"""
-
-extraction_graph = ExtractionGraph.from_yaml(extraction_graph_spec)
-client.create_extraction_graph(extraction_graph)
-```
-
-#### Upload a Document
-```python
-with open("sample.pdf", 'wb') as file:
-  file.write((requests.get("https://extractor-files.diptanu-6d5.workers.dev/scientific-paper-example.pdf")).content)
-content_id = client.upload_file("pdfqa", "sample.pdf")
-```
-
-#### Get Text, Image and Tables
-```python
-client.wait_for_extraction(content_id)
-print(client.get_extracted_content(content_id, "pdfqa", "docextractor"))
-```
-
 ### LLM Framework Integration 
 Indexify can work with any LLM framework, or with your applications directly. We have an example of a Langchain application [here](https://docs.getindexify.ai/integrations/langchain/python_langchain/) and DSPy [here](https://docs.getindexify.ai/integrations/dspy/python_dspy/).