
16.2.6+103

📅 2025-05-16

N/A

The possibility of specifying a text direction for OCR of Asian languages has been reintroduced to improve recognition accuracy in specific scenarios.

Accordingly, the value TextDirection.Any has been added to the TextDirection enum to represent automatic detection, which remains the default behavior.

Default vs. Explicit behavior

  • Default behavior: The OCR engine automatically detects the text direction (TextDirection.Any). This gives the appropriate output in the vast majority of use cases.
  • Explicit behavior: In certain scenarios, especially zonal OCR, automatic detection may misinterpret the direction because the zone provides limited context. In these cases, explicitly setting the text direction, when it is already known, will improve accuracy.

The following snippet demonstrates how to configure OCR with an explicit text direction:

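// evLanguage, objIdrs and objImage are assumed to be initialized elsewhere.
// Create an OCR context for the target language and set an explicit text direction.
COcrContext objContext = COcrContext::Create(evLanguage);
objContext.SetTextDirection(TextDirection::TopToBottom);
// Pass the context to the text recognition engine through the page parameters.
CTextRecognition objTextRecognition = CTextRecognition::Create(objIdrs);
objTextRecognition.SetOcrParams(COcrPageParams::Create(objContext));
// Run OCR on the image with the configured text direction.
objTextRecognition.RecognizeText(objImage);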

A new property, CImageLoadOptionsPdf.GreyscaleDetection, has been introduced to enable or disable the rasterization of PDF pages as greyscale CImage objects.

This property is enabled by default, maintaining behavior consistent with previous iDRS releases.

Disabling greyscale detection can improve loading speed, but at the cost of increased memory usage, because pages are then rasterized in color rather than greyscale.
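To give a rough sense of the memory trade-off described above, the following sketch compares the uncompressed size of one page rasterized in greyscale and in color. It does not use the iDRS API. The page size (A4 at 300 dpi, about 2480 x 3508 pixels), the pixel formats (1 byte per pixel for greyscale, 3 for RGB color) and the helper names RasterBytes and PrintFootprints are all assumptions made for illustration. The actual in-memory layout of a CImage may differ.

```cpp
#include <cstddef>
#include <cstdio>

// Uncompressed size, in bytes, of a raster image.
// Assumes tightly packed rows (no padding), which real image buffers may not guarantee.
constexpr std::size_t RasterBytes(std::size_t widthPx, std::size_t heightPx,
                                  std::size_t bytesPerPixel)
{
    return widthPx * heightPx * bytesPerPixel;
}

// Assumed page size: A4 at 300 dpi, roughly 2480 x 3508 pixels.
constexpr std::size_t kWidthPx = 2480;
constexpr std::size_t kHeightPx = 3508;

// Assumed pixel formats: 1 byte per pixel for greyscale, 3 for RGB color.
constexpr std::size_t kGreyBytes = RasterBytes(kWidthPx, kHeightPx, 1);
constexpr std::size_t kColorBytes = RasterBytes(kWidthPx, kHeightPx, 3);

// Prints both footprints so the difference can be compared directly.
inline void PrintFootprints()
{
    std::printf("greyscale: %zu bytes\n", kGreyBytes);
    std::printf("color:     %zu bytes\n", kColorBytes);
}
```

Under these assumptions, one page occupies about 8.7 MB in greyscale and about 26.1 MB in color, so a page that could have been loaded as greyscale takes roughly three times the memory when detection is disabled.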

N/A

| Internal ID | Description | Service desk IDs |
|---|---|---|
| IDRSRD-9934 | PDF/UA generated by the iDRS doesn’t succeed compliance validation when document contains complex tables | |
| IDRSRD-9927 | The iDRS encounters a crash when rotating a specific image | |
| IDRSRD-9920 | The iDRS requires extra unexpected OCR resources to run auto-orientation only | |
| IDRSRD-9908 | The iDRS should expose a flag to enable or disable greyscale image detection during loading of a PDF page | |
| IDRSRD-9903 | Confidence values of language and orientation detection feature are unusable with 16.2.6+82 | ISD-36788 |
| IDRSRD-9901 | The default character set is missing some supported characters for Japanese language | |
| IDRSRD-9900 | .NET samples cannot be compiled on Linux | |
| IDRSRD-9884 | Graphic shapes detected by iDRS are incorrectly scaled on output document if input image resolution is different than 300 dpi | |
| IDRSRD-9883 | The OCR engine library is linked with WS2_32.dll for no reason | ISD-36751 |
| IDRSRD-9882 | API reference main pages are different between C++, .NET and C APIs | |
| IDRSRD-9856 | The iDRS can return OCR results outside of input zones when running zonal OCR | |
| IDRSRD-9839 | The iDRS merges lines from different text columns on a specific image | |
| IDRSRD-9796 | Orientation detection gives unexpected answer on a border-case scenario | |
| IDRSRD-9795 | An integrator should be able to hint the OCR engine for the text direction to detect, when processing Asian documents | |
| IDRSRD-9754 | The iDRS is not compatible with VirtualBox VMs running on Windows Hosts | ISD-36479 |
| IDRSRD-9703 | The new segmentation filters isolated punctuations or characters | |
| IDRSRD-5619 | DOCX output created by the iDRS is poor for specific images | |