Month: May 2019

Comparing AWS Textract and AWS Rekognition to extract text from images using PHP

A few months ago I tried using AWS Rekognition to detect text in images. The results were okay for casual use cases, but overall the quality was pretty poor (primarily because Rekognition isn’t intended to be used as an OCR product).

A few days ago (May 29), AWS announced the general availability of Textract, an actual OCR product. Out of curiosity, I ran the same image I had run through Rekognition through Textract to compare the results. While Textract isn’t 100% accurate, it’s a huge improvement over Rekognition (as should be expected, since OCR is what it’s intended for).

View full results of Rekognition
View full results of Textract

Below is a side-by-side comparison of the results from the two services:

| Textract DetectedText | Textract Confidence | Rekognition DetectedText | Rekognition Confidence |
| --- | --- | --- | --- |
| DANS PUMP AND GO | 98% | DANS PLMP AND GO | 98% |
| 15238 MAIN ST | 99% | 15238 MAIN ST | 100% |
| NEWTOWN | 100% | NEWTOWN | 100% |
| CAROLINA 93812 | 96% | CAROLINA 93812 | 97% |
| ST-TX: 11089984 | 99% | ST-TX: 11089987 | |
| (555) 708-2224 | 98% | (555) 708-2224 | 100% |
| 2014-02-25 IW424534:9338300 07:09 | 99% | 2014-02-25 TW420534: 34:9338300 07:09 | 94% |
| TERMINAL: 509338300 OPER: A | 89% | TERMINAL: 509338300 OPER: A | 99% |
| Fuel | 99% | Fuel | |
| (G) ($/G) | 99% | (G) ($/G) | 98% |
| ($) | 95% | ($) | 99% |

Pump 9…
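For anyone who wants to build a similar side-by-side comparison in PHP, here is a minimal sketch rather than the exact code used for this post. It assumes the documented response shapes of the two APIs: Textract's DetectDocumentText returns a `Blocks` list with `BlockType`, `Text` and `Confidence`, while Rekognition's DetectText returns a `TextDetections` list with `Type`, `DetectedText` and `Confidence`. The helper names (`textractLines`, `rekognitionLines`, `compareLines`) are hypothetical, the sample responses are hand-written stand-ins trimmed to two lines (the WORD entries carry placeholder confidences), and pairing lines by position is a simplifying assumption that breaks as soon as one service splits or merges a line differently.

```php
<?php
declare(strict_types=1);

// Keep only line-level text from a Textract DetectDocumentText response.
// Textract also returns PAGE and WORD blocks, which are skipped here.
function textractLines(array $response): array
{
    $lines = [];
    foreach ($response['Blocks'] ?? [] as $block) {
        if (($block['BlockType'] ?? '') === 'LINE') {
            $lines[] = [
                'text' => $block['Text'],
                'confidence' => (float) $block['Confidence'],
            ];
        }
    }
    return $lines;
}

// Keep only line-level text from a Rekognition DetectText response.
// Rekognition reports each piece of text as both LINE and WORD detections.
function rekognitionLines(array $response): array
{
    $lines = [];
    foreach ($response['TextDetections'] ?? [] as $detection) {
        if (($detection['Type'] ?? '') === 'LINE') {
            $lines[] = [
                'text' => $detection['DetectedText'],
                'confidence' => (float) $detection['Confidence'],
            ];
        }
    }
    return $lines;
}

// Pair the two line lists by position (an assumption, not a real alignment)
// and flag whether both services returned exactly the same text.
function compareLines(array $textract, array $rekognition): array
{
    $rows = [];
    $count = max(count($textract), count($rekognition));
    for ($i = 0; $i < $count; $i++) {
        $t = $textract[$i] ?? null;
        $r = $rekognition[$i] ?? null;
        $rows[] = [
            'textract' => $t['text'] ?? null,
            'textract_confidence' => $t['confidence'] ?? null,
            'rekognition' => $r['text'] ?? null,
            'rekognition_confidence' => $r['confidence'] ?? null,
            'match' => $t !== null && $r !== null && $t['text'] === $r['text'],
        ];
    }
    return $rows;
}

// Hand-written stand-ins for the two API responses (not real API output).
$textractResponse = ['Blocks' => [
    ['BlockType' => 'PAGE'],
    ['BlockType' => 'LINE', 'Text' => 'DANS PUMP AND GO', 'Confidence' => 98.0],
    ['BlockType' => 'WORD', 'Text' => 'DANS', 'Confidence' => 98.0],
    ['BlockType' => 'LINE', 'Text' => '15238 MAIN ST', 'Confidence' => 99.0],
]];
$rekognitionResponse = ['TextDetections' => [
    ['Type' => 'LINE', 'DetectedText' => 'DANS PLMP AND GO', 'Confidence' => 98.0],
    ['Type' => 'LINE', 'DetectedText' => '15238 MAIN ST', 'Confidence' => 100.0],
    ['Type' => 'WORD', 'DetectedText' => 'DANS', 'Confidence' => 98.0],
]];

$rows = compareLines(
    textractLines($textractResponse),
    rekognitionLines($rekognitionResponse)
);

foreach ($rows as $row) {
    printf(
        "%-20s %-20s %s\n",
        $row['textract'] ?? '',
        $row['rekognition'] ?? '',
        $row['match'] ? 'same' : 'differs'
    );
}
```

With the AWS SDK for PHP, the two response arrays would come from `TextractClient::detectDocumentText()` (image bytes passed under `Document.Bytes`) and `RekognitionClient::detectText()` (image bytes passed under `Image.Bytes`), converted with `toArray()` on the returned result. Those calls are left out here so the sketch runs without credentials.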