Vision App is a sample iOS application to automatically tag images and detect faces by using IBM visual recognition technologies.
Take a photo or select an existing picture, then let the application generate a list of tags and detect people, buildings, and objects in the picture. Share the results with your network.
Built using IBM Cloud, the application uses:
- Watson Visual Recognition
- Cloud Functions
- Cloudant
Architecture diagram (vision_analysis): the app stores the image in Cloudant and invokes OpenWhisk; OpenWhisk reads the image from Cloudant, passes it to Watson Visual Recognition, and returns the result to the app.
The application sends the picture to a Cloudant database. Then it calls an OpenWhisk action that will analyze the picture and send back the results of the analysis.
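As a rough illustration of the first step, a picture can travel to Cloudant as an inline attachment on a JSON document. The sketch below is an assumption about the shape of such a document, not the app's actual upload code: `buildImageDocument` and the `type` field are hypothetical, while `_attachments`, `content_type`, and `data` are the standard Cloudant/CouchDB inline-attachment fields.

```javascript
// Hypothetical helper: builds a Cloudant/CouchDB document body that carries
// the picture as an inline, base64-encoded attachment named "image.jpg".
function buildImageDocument(imageBytes) {
  return {
    type: "image", // hypothetical field, not taken from the app
    _attachments: {
      "image.jpg": {
        content_type: "image/jpeg",
        data: Buffer.from(imageBytes).toString("base64"),
      },
    },
  };
}

// The first three bytes of a JPEG file stand in for a real picture here.
const imageDocument = buildImageDocument([0xff, 0xd8, 0xff]);
console.log(JSON.stringify(imageDocument, null, 2));
```

The iOS app may well upload the attachment differently (for example with a separate request per attachment); the point is only that the action later finds the picture under the attachment name "image.jpg".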
This application is one example use case. With the OpenWhisk action implemented in this example, another use case could be to automatically classify images in a library to improve search capabilities: the same OpenWhisk action, used in a different context. In effect, this action is a microservice for image analysis in the cloud, created without deploying or managing a single server.
- IBM Cloud account. Sign up for IBM Cloud, or use an existing account.
- IBM Cloud Functions
- Xcode 8.1, iOS 10, Swift 3.0
- Clone the app to your local environment from your terminal using the following command:
  git clone https://github.com/IBM-Bluemix/openwhisk-visionapp.git
- Or download and extract the source code from this archive
- Open the IBM Cloud console
- Create a Cloudant NoSQL DB service instance named cloudant-for-vision
- Create a new set of credentials for the Cloudant NoSQL DB service
- Open the Cloudant service dashboard and create a new database named openwhisk-vision
- Create a Watson Visual Recognition service instance named visualrecognition-for-vision
- Create a new set of credentials for the Watson Visual Recognition service
Note: if you have existing instances of these services, you don't need to create new instances. You can simply reuse the existing ones.
- Ensure your Cloud Functions command line interface is properly configured with:
ibmcloud cloud-functions list
- Create the action using the following command, replacing the placeholders with the credentials obtained from the respective service dashboards in IBM Cloud:
ibmcloud cloud-functions action create -p cloudantUrl [URL] -p cloudantDbName openwhisk-vision -p watsonApiKey [123] vision-analysis analysis.js
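Parameters bound with `-p` become default parameters of the action, and OpenWhisk merges them with any invocation parameters into the single object passed to `main`. The following is a minimal sketch of how an action could check that the three parameters from the command above are present. It is not the code of analysis.js, and `missingParams` is a hypothetical helper.

```javascript
// Parameter names taken from the `action create` command above.
const REQUIRED_PARAMS = ["cloudantUrl", "cloudantDbName", "watsonApiKey"];

// Hypothetical helper: lists the required parameters that are absent or empty.
function missingParams(params) {
  return REQUIRED_PARAMS.filter((name) => !params[name]);
}

// OpenWhisk calls main() with default and invocation parameters merged.
function main(params) {
  const missing = missingParams(params);
  if (missing.length > 0) {
    // Returning an object with an "error" key marks the activation as failed.
    return { error: "Missing parameters: " + missing.join(", ") };
  }
  return { ok: true };
}

console.log(main({ cloudantDbName: "openwhisk-vision" }));
```

A check like this makes a misconfigured action fail fast with a readable message in the activation record.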
To configure the iOS application, you need the credentials of the Cloudant service created above and your Cloud Functions authorization key.
- Open vision.xcworkspace with Xcode
- Open the file vision/vision/model/ServerlessAPI.swift
- Set the value of the constant CloudantUrl to the URL from the Cloudant service credentials.
- Set the values of the constants WhiskAppKey and WhiskAppSecret to your OpenWhisk credentials. You can retrieve them from the iOS SDK configuration page, or you can retrieve the key and secret with the following CLI command:
ibmcloud cloud-functions property get --auth
whisk auth kkkkkkkk-kkkk-kkkk-kkkk-kkkkkkkkkkkk:tttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt
The strings before and after the colon are your key and secret, respectively.
- Save the file
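The key and secret in the CLI output described above are separated by the first colon. As a small sketch, assuming the output line has the `whisk auth <key>:<secret>` form shown earlier, the two values can be extracted and turned into the HTTP Basic authorization header that the OpenWhisk REST API accepts. The function names are hypothetical.

```javascript
// Hypothetical helper: extracts key and secret from a line such as
// "whisk auth <key>:<secret>" (the credential is the last token on the line).
function parseAuthLine(line) {
  const credential = line.trim().split(/\s+/).pop();
  const separator = credential.indexOf(":");
  if (separator < 0) {
    throw new Error("expected a credential of the form key:secret");
  }
  return {
    key: credential.slice(0, separator),
    secret: credential.slice(separator + 1),
  };
}

// Builds an HTTP Basic authorization header from the key and secret.
function basicAuthHeader({ key, secret }) {
  return "Basic " + Buffer.from(key + ":" + secret).toString("base64");
}

const credentials = parseAuthLine("whisk auth  my-key:my-secret");
console.log(credentials.key, credentials.secret);
console.log(basicAuthHeader(credentials));
```

In the iOS project the same two strings go into WhiskAppKey and WhiskAppSecret respectively.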
- Start the application from Xcode with iPhone 6s as the target
- Select an existing picture
Note: To add pictures to the simulator, go to the home screen (Cmd+Shift+H). Drag and drop images from the Finder to the simulator window. This will open the Photos app and you should see your images.
- The picture is sent for analysis and results are returned:
Results are made of the faces detected in the picture and of tags returned by Watson. The tags with the highest confidence score are pre-highlighted. The highlighted tags will be used when sharing the picture. You can tap tags to toggle their state.
- Press the Share button. This opens the standard iOS sharing screen.
Note: To configure a Twitter account, go to the Settings app on the simulator. Under Twitter, add your account (no need for the Twitter app to be installed). You can go back to the home screen with Cmd+Shift+H.
- Pick Twitter, for example.
The picture and the highlighted tags are included in the message. The message can be edited before posting.
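The pre-highlighting of high-confidence tags mentioned above can be pictured as a simple filter over the scores returned by the analysis. The sketch below is an assumption, not the app's logic: the `{ name, score }` shape, the `highlightTags` function, and the 0.7 threshold are all hypothetical.

```javascript
// Hypothetical sketch: sorts tags by descending confidence and marks those
// at or above a threshold as highlighted. The 0.7 default is an assumption.
function highlightTags(tags, threshold = 0.7) {
  return tags
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .map((tag) => ({ ...tag, highlighted: tag.score >= threshold }));
}

const highlighted = highlightTags([
  { name: "cat", score: 0.9 },
  { name: "tree", score: 0.4 },
  { name: "dog", score: 0.75 },
]);
console.log(highlighted);
```

Tapping a tag in the app would then amount to flipping its `highlighted` flag before sharing.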
analysis.js holds the JavaScript code to perform the image analysis:
- It retrieves the image data from the Cloudant document. The data has been attached by the iOS app as an attachment named "image.jpg".
- It saves the image file locally.
- If needed, it resizes the image so that it matches the requirements of the Watson service.
- It calls Watson.
- It returns the results of the analysis.
The action runs asynchronously.
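The steps above can be sketched as a chain of asynchronous calls. This is only an outline of the flow, not the contents of analysis.js: every function on `steps` is a hypothetical stand-in supplied by the caller, replaced here by stubs so the example runs on its own. A Node.js OpenWhisk action typically signals asynchronous work by returning a Promise from `main`, which is the pattern mirrored here.

```javascript
// Sketch of the analysis flow. The functions on `steps` are hypothetical
// stand-ins; none of these names come from analysis.js.
async function analyzeImage(documentId, steps) {
  const imageData = await steps.fetchAttachment(documentId, "image.jpg");
  const localPath = await steps.saveLocally(imageData);
  const preparedPath = await steps.resizeIfNeeded(localPath);
  return steps.callWatson(preparedPath);
}

// Stubs that record the order in which the steps run.
const calls = [];
const stubSteps = {
  fetchAttachment: async () => { calls.push("fetch"); return Buffer.from("fake image"); },
  saveLocally: async () => { calls.push("save"); return "/tmp/image.jpg"; },
  resizeIfNeeded: async (path) => { calls.push("resize"); return path; },
  callWatson: async () => { calls.push("watson"); return { tags: [], faces: [] }; },
};

const pendingAnalysis = analyzeImage("doc-1", stubSteps).then((result) => {
  console.log(calls.join(" -> "), result);
  return result;
});
```

Passing the steps in as arguments keeps the flow testable without a Cloudant database or Watson credentials.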
File | Description |
---|---|
ServerlessAPI.swift | Stores the image in Cloudant and executes the analysis OpenWhisk action, waiting for the result. |
Result.swift | Encapsulates the JSON result |
HomeController.swift | Manages the selection of an existing picture and taking a picture from the camera |
ResultController.swift | Uses ServerlessAPI to send the image for processing and then display the results of the analysis |
FacesController.swift | Embedded in ResultController; it handles the face collection view |
FaceCellRenderer.swift | Renders a face in the FacesController |
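ServerlessAPI.swift is described above as executing the action and waiting for the result, which corresponds to a blocking invocation. As a language-neutral illustration (written in JavaScript rather than Swift, and not taken from the app), the sketch below builds the request for the OpenWhisk REST API's blocking action invocation. The host name, the `imageDocumentId` parameter, and the `buildInvokeRequest` function are hypothetical.

```javascript
// Hypothetical helper: describes a blocking invocation of an OpenWhisk action
// through the REST API (POST .../namespaces/{ns}/actions/{name}?blocking=true).
function buildInvokeRequest({ apiHost, namespace, action, authHeader, params }) {
  const path = ["api", "v1", "namespaces", namespace, "actions", action]
    .map((segment) => encodeURIComponent(segment))
    .join("/");
  return {
    method: "POST",
    url: `https://${apiHost}/${path}?blocking=true`,
    headers: { Authorization: authHeader, "Content-Type": "application/json" },
    body: JSON.stringify(params),
  };
}

const invokeRequest = buildInvokeRequest({
  apiHost: "openwhisk.example.com", // placeholder host, an assumption
  namespace: "_", // "_" refers to the default namespace of the credentials
  action: "vision-analysis",
  authHeader: "Basic <base64 of key:secret>", // placeholder value
  params: { imageDocumentId: "doc-1" }, // hypothetical parameter name
});
console.log(invokeRequest.url);
```

The resulting object can be handed to any HTTP client; in the app that role is played by the Swift networking code.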
Please create a pull request with your desired changes.
Polling activations is a good start to debug the OpenWhisk action execution. Run
ibmcloud cloud-functions activation poll
and submit a picture for analysis.
A typical activation log when everything goes fine will look like:
Activation: vision-analysis (123fb4230902822202029fff436a94be745)
2016-02-23T16:17:53.955350233Z stdout: [ 49382920fdb022039403934b3bd33d00 ] Processing image.jpg from document
2016-02-23T16:17:59.847872226Z stdout: [ 49382920fdb022039403934b3bd33d00 ] OK
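When comparing runs, it can help to turn log lines like these into structured entries. The sketch below assumes the `<timestamp> <stream>: <message>` layout shown above and trims the timestamp to milliseconds before parsing. `parseLogLine` is a hypothetical helper, not part of the CLI.

```javascript
// Matches "<ISO timestamp with extra fractional digits>Z <stream>: <message>".
const LOG_LINE = /^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})\d*Z\s+(stdout|stderr):\s+(.*)$/;

// Hypothetical helper: returns null for lines that do not match the layout.
function parseLogLine(line) {
  const match = LOG_LINE.exec(line);
  if (!match) return null;
  return {
    time: Date.parse(match[1] + "Z"), // milliseconds since the epoch
    stream: match[2],
    message: match[3],
  };
}

const logLines = [
  "2016-02-23T16:17:53.955350233Z stdout: [ 49382920fdb022039403934b3bd33d00 ] Processing image.jpg from document",
  "2016-02-23T16:17:59.847872226Z stdout: [ 49382920fdb022039403934b3bd33d00 ] OK",
];
const logEntries = logLines.map(parseLogLine);
const elapsedMs = logEntries[1].time - logEntries[0].time;
console.log(`analysis took ${elapsedMs} ms`); // prints "analysis took 5892 ms"
```

For the sample log above, the time between the first statement and the final OK is a little under six seconds.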
The application prints several statements to the console as it uploads, analyzes and updates the user interface. Make sure you correctly updated the constants in ServerlessAPI.swift.
The application uses:
- Alamofire (License)
- AlamofireImage (License)
- SwiftyJSON (License)
- TagListView (License)
- JGProgressHUD (License)
- RDHCollectionViewGridLayout (License)
See License.txt for license information.