---
page_type: sample
languages:
- python
products:
- azure
- azure-cognitive-search
- azure-container-registry
- azure-functions
name: PID
urlFragment: azure-pid-drawing-sample
description: This custom skill extracts specific product/equipment names from P&ID drawings.
---

Manually [deploy the container image as an Azure Function](#deployment).

# PID Skill

A piping and instrumentation diagram (P&ID) is a detailed diagram used in the process industry that shows the piping and process equipment together with the instrumentation and control devices. Superordinate to the P&ID is the process flow diagram (PFD), which indicates the more general flow of plant processes and the relationships between the major equipment of a plant facility.

This skill is designed to extract equipment information from specific instrument symbols in engineering diagrams. The skill uses the X, Y coordinates of text extracted by OCR to generate groupings of text based on proximity, vertical and horizontal separation, and alignment.

For best results, set the normalized images to the highest resolution. You can also edit the parameters within the skill to change the sensitivity of how the tags are grouped. Additional logic is applied to product tags to determine tag boundaries and hyphenated text. The skill returns two JSON elements, a tag array and a text block array.


## Requirements

This skill requires Docker to build the container image that will be deployed as an Azure Function.

## Settings

The default configuration of the skill identifies tags or equipment and associated text blocks. Tuning the following parameters lets you set the sensitivity with which individual text spans are grouped into a block (a sketch of the grouping heuristic follows the list):

1. `maxSegment` defines the maximum length of a valid text segment
2. `leftAlignSensitivity` defines the sensitivity of the algorithm when matching text blocks that are left aligned
3. `rightAlignSensitivity` defines the sensitivity of the algorithm when matching text blocks that are right aligned
4. `centerAlignSensitivty` defines the sensitivity of the algorithm when matching text blocks that are center aligned
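
The actual grouping logic lives in the skill code; the following minimal Python sketch only illustrates the idea of merging OCR lines by proximity and alignment using the parameter names above. The thresholds, bounding-box shape, and helper names are illustrative assumptions, not the skill's implementation.

```python
# Illustrative sketch only -- not the skill's actual implementation.
# Assumes each OCR line is a dict with "text" and a "boundingBox" of
# [x0, y0, x1, y1] pixel coordinates (shape assumed for illustration).

MAX_SEGMENT = 20               # maxSegment: max characters in a valid segment
LEFT_ALIGN_SENSITIVITY = 10    # px tolerance for left-edge alignment
RIGHT_ALIGN_SENSITIVITY = 10   # px tolerance for right-edge alignment
CENTER_ALIGN_SENSITIVITY = 10  # px tolerance for center alignment
MAX_VERTICAL_GAP = 25          # px gap allowed between stacked lines

def aligned(a, b):
    """Two boxes belong to the same block if their edges or centers line up."""
    left = abs(a[0] - b[0]) <= LEFT_ALIGN_SENSITIVITY
    right = abs(a[2] - b[2]) <= RIGHT_ALIGN_SENSITIVITY
    center = abs((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) <= CENTER_ALIGN_SENSITIVITY
    return left or right or center

def group_lines(lines):
    """Greedily merge vertically adjacent, aligned OCR lines into text blocks."""
    lines = sorted(lines, key=lambda l: l["boundingBox"][1])  # top to bottom
    blocks = []
    for line in lines:
        if len(line["text"]) > MAX_SEGMENT:
            continue  # too long to be a tag-style segment
        box = line["boundingBox"]
        for block in blocks:
            last = block[-1]["boundingBox"]
            close = 0 <= box[1] - last[3] <= MAX_VERTICAL_GAP
            if close and aligned(box, last):
                block.append(line)
                break
        else:
            blocks.append([line])
    return [" ".join(l["text"] for l in b) for b in blocks]
```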

## Deployment

Follow these steps to build the container and deploy the skill as an Azure Function.

1. Navigate to the diagramskill folder and build the Docker image: `docker build -t pidskill .`
2. Run the container: `docker run -p 8080:80 -it pidskill:latest`
3. Save the running container as an image: `docker commit {container id from previous step} pidskill`
4. Push the image to the container registry: `docker push {containerregistry}.azurecr.io/pidskill` (if the local image is not yet named with the registry prefix, tag it first with `docker tag pidskill {containerregistry}.azurecr.io/pidskill`)
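
With the container running locally (step 2), you can sanity-check the endpoint before pushing. The sketch below posts a request in the custom Web API skill envelope (`values`/`recordId`/`data`); the function route, the sample image path, and the `layoutText` shape are placeholders you would adapt to the deployed function.

```python
import base64
import requests

# The route name is an assumption -- use the HTTP trigger name defined in the skill code.
url = "http://localhost:8080/api/pidskill"

with open("sample_pid.png", "rb") as f:  # any sample P&ID image
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "values": [
        {
            "recordId": "1",
            "data": {
                # file_data mirrors the file reference object the indexer sends
                # for normalized images when allowSkillsetToReadFileData is true.
                "file_data": {"$type": "file", "data": image_b64},
                # layoutText normally comes from the OCR skill; shape is illustrative.
                "layoutText": {"lines": []},
            },
        }
    ]
}

response = requests.post(url, json=payload, timeout=30)
response.raise_for_status()
print(response.json())  # expect "tags" and "textBlocks" in each record's "data"
```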

Once the image is in the container registry, create an Azure Function App that deploys that image.

1. In the portal, create a new Azure Function App
2. Select the Docker Container option and provide a valid name for the app
3. Once the deployment is complete, navigate to the resource and select Container settings
4. Select Azure Container Registry as the Image Source
5. Select the registry, image, and tag
6. Set continuous deployment to On to ensure that the skill is updated when a new image is pushed
7. Save your changes

Your skill should now be configured. Navigate to the Functions menu, select the function, and get the function URL.

## Sample Skillset Integration

In order to use this skill in a cognitive search pipeline, you'll need to add a skill definition to your skillset. Here's a sample skill definition for this example (inputs and outputs should be updated to reflect your particular scenario and skillset environment):

```json
{
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "name": "PIDSkill",
    "description": "Extracts tags and text blocks from PID drawings",
    "uri": "[Azure Functions URL]",
    "httpMethod": "POST",
    "timeout": "PT30S",
    "context": "/document/normalized_images/*",
    "batchSize": 1,
    "inputs": [
        {
            "name": "file_data",
            "source": "/document/normalized_images/*"
        },
        {
            "name": "layoutText",
            "source": "/document/normalized_images/*/layoutText"
        }
    ],
    "outputs": [
        {
            "name": "tags",
            "targetName": "tags"
        },
        {
            "name": "textBlocks",
            "targetName": "textBlocks"
        }
    ]
}
```
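
If you manage the skillset through the Azure Cognitive Search REST API rather than the portal, adding the definition might look like the sketch below. The service name, skillset name, API version, admin key, and function URL are placeholders; the skill body is the definition shown above.

```python
import requests

# Placeholders -- substitute your own search service, skillset name, API version, and admin key.
service = "https://<your-search-service>.search.windows.net"
skillset_name = "<your-skillset>"
api_version = "2020-06-30"
headers = {"Content-Type": "application/json", "api-key": "<admin-key>"}

# Fetch the existing skillset, append the PID skill, and write it back.
url = f"{service}/skillsets/{skillset_name}?api-version={api_version}"
skillset = requests.get(url, headers=headers).json()

pid_skill = {
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "name": "PIDSkill",
    "description": "Extracts tags and text blocks from PID drawings",
    "uri": "<Azure Functions URL>",
    "httpMethod": "POST",
    "timeout": "PT30S",
    "context": "/document/normalized_images/*",
    "batchSize": 1,
    "inputs": [
        {"name": "file_data", "source": "/document/normalized_images/*"},
        {"name": "layoutText", "source": "/document/normalized_images/*/layoutText"},
    ],
    "outputs": [
        {"name": "tags", "targetName": "tags"},
        {"name": "textBlocks", "targetName": "textBlocks"},
    ],
}

skillset["skills"].append(pid_skill)
requests.put(url, headers=headers, json=skillset).raise_for_status()
```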

## Indexer Configuration

To ensure that the skill gets the highest quality image as an input, set the following configuration object in the indexer parameters.

"configuration": {
    "dataToExtract": "contentAndMetadata",
    "imageAction": "generateNormalizedImages",
    "allowSkillsetToReadFileData": true,
    "normalizedImageMaxWidth": 4200,
    "normalizedImageMaxHeight": 4200
}
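
This configuration block nests under the indexer's `parameters` property. A hedged sketch of setting it through the REST API, with placeholder service, indexer name, API version, and key:

```python
import requests

# Placeholders -- substitute your search service, indexer name, API version, and admin key.
service = "https://<your-search-service>.search.windows.net"
indexer_name = "<your-indexer>"
api_version = "2020-06-30"
headers = {"Content-Type": "application/json", "api-key": "<admin-key>"}

url = f"{service}/indexers/{indexer_name}?api-version={api_version}"
indexer = requests.get(url, headers=headers).json()

# The configuration object from above sits under the indexer's "parameters".
# Note this overwrites any existing configuration settings in the sketch.
parameters = indexer.get("parameters") or {}
parameters["configuration"] = {
    "dataToExtract": "contentAndMetadata",
    "imageAction": "generateNormalizedImages",
    "allowSkillsetToReadFileData": True,
    "normalizedImageMaxWidth": 4200,
    "normalizedImageMaxHeight": 4200,
}
indexer["parameters"] = parameters

requests.put(url, headers=headers, json=indexer).raise_for_status()
```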