A couple of years ago, I tinkered with a solution that used a webcam to capture images of receipts, convert the images to raw text, and store the text in a database. My scrappy solution worked okay, but it lacked the accuracy to make it viable for anything real-world.
Since AWS Rekognition has launched in the meantime, I figured I’d try it out and see how it compares. I used a fake receipt as the test image.
Like every other AWS product I’ve used, it was incredibly easy to work with. I’ll share the simple script I used at the bottom of this post but, suffice it to say, there’s not much to it.
While use was a breeze, the results were disappointing. The main problem is that Rekognition is limited to ONLY 50 words in an image. So clearly it’s not a full-on OCR tool.
Somewhat more disappointing was the narrow range of confidence scores Rekognition returned (it provides a confidence score for each text detection). The overall output was pretty accurate, but not accurate enough for me to consider it “wow” worthy. Despite this, all of the confidence scores were above 93%, so the scores did little to flag which detections were wrong.
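To check both complaints (the word cap and the narrow confidence range) against your own images, something like the following rough sketch could help. It assumes the result has the same shape the script at the bottom of this post reads from: a TextDetections list whose entries carry Type and Confidence. The function name summarizeDetections and the sample data are made up for illustration; they are not part of the AWS SDK or of my actual output.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): counts WORD and LINE
// detections and reports the lowest and highest confidence score.
function summarizeDetections(array $textDetections): array
{
    $words = 0;
    $lines = 0;
    $confidences = [];

    foreach ($textDetections as $detection) {
        $type = $detection['Type'] ?? '';
        if ($type === 'WORD') {
            $words++;
        } elseif ($type === 'LINE') {
            $lines++;
        }
        if (isset($detection['Confidence'])) {
            $confidences[] = (float) $detection['Confidence'];
        }
    }

    return [
        'words' => $words,
        'lines' => $lines,
        // null when there are no scores to compare
        'minConfidence' => $confidences ? min($confidences) : null,
        'maxConfidence' => $confidences ? max($confidences) : null,
    ];
}

// Made-up sample shaped like $result['TextDetections']; not real output.
$sample = [
    ['DetectedText' => 'TOTAL 12.50', 'Type' => 'LINE', 'Id' => 0, 'Confidence' => 97.2],
    ['DetectedText' => 'TOTAL', 'Type' => 'WORD', 'Id' => 1, 'ParentId' => 0, 'Confidence' => 98.1],
    ['DetectedText' => '12.50', 'Type' => 'WORD', 'Id' => 2, 'ParentId' => 0, 'Confidence' => 93.4],
];

$summary = summarizeDetections($sample);
printf(
    "%d words, %d lines, confidence %.1f%% to %.1f%%\n",
    $summary['words'],
    $summary['lines'],
    $summary['minConfidence'],
    $summary['maxConfidence']
);
```

With a real response you would pass $result['TextDetections'] instead of $sample. A word count sitting right at the cap is a hint that the image had more text than Rekognition returned.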
AWS Rekognition has a long way to go before it’s competitive as an OCR service. Its performance in object detection/facial recognition (which is the heart and primary use case of Rekognition) may be better, but I haven’t tested that at this point.
You can view the full analysis and output of the receipt image here.
Below is the code used to generate the output linked above:
<?php
// Load the AWS SDK for PHP through Composer's autoloader (adjust the path to your install).
require '/home/vendor/autoload.php';

use Aws\Rekognition\RekognitionClient;

$client = new RekognitionClient([
    'version' => 'latest',
    'region' => 'us-west-2',
    'credentials' => [
        'key' => 'IAM KEY',
        'secret' => 'IAM SECRET',
    ],
]);

// Run text detection on an image stored in S3.
$result = $client->detectText([
    'Image' => [
        'S3Object' => [
            'Bucket' => 'S3 BUCKET CONTAINING IMAGE',
            'Name' => 'receipt_preview.jpg',
        ],
    ],
]);

// Print one table row per detection.
echo "<h1>Rekognition</h1>";
echo "<table border=1 cellspacing=0><tr><td>#</td><td>DetectedText</td><td>Type</td><td>ID</td><td>ParentId</td><td>Confidence</td></tr>";
$i = 0;
foreach ($result['TextDetections'] as $phrase) {
    $i++;
    // LINE detections have no ParentId, so fall back to an empty string.
    $parentId = $phrase['ParentId'] ?? '';
    echo "<tr><td>$i</td><td>" . htmlspecialchars($phrase['DetectedText']) . "</td><td>" . $phrase['Type'] . "</td><td>" . $phrase['Id'] . "</td><td>" . $parentId . "</td><td>" . round($phrase['Confidence']) . "%</td></tr>";
}
echo "</table>";

// Dump the full response for inspection.
echo "<h1>Raw Output</h1><pre>";
print_r($result);
echo "</pre>";
?>
I tried it and it gives me a 500 error. Which region should I use?
Any region where Rekognition is available will do, as long as your S3 bucket is in that same region.
What’s the detailed 500 error? I’m guessing your paths are different from what I used in the example above.
Sorry, I’ve tried the code but it only works for me using IdentityPoolId.
I do not understand how it works with the credentials:
'credentials' => [
'key' => 'IAM KEY',
'secret' => 'IAM SECRET'
]
I can’t get it to work with my credentials that way.
It only validates with IdentityPoolId.