What I want to achieve

I am working on detecting items on scanned paper, but I get roughly 10 or more false detections for every 100 sheets.
Is there a standard way to scrutinize the processing steps leading up to detection? If there is a better approach, I would appreciate being taught it.

I am not very familiar with image processing, so I may ask follow-up questions.

Thank you in advance.


Windows 10
C# (Visual Studio 2015, .NET Framework 4.5.2), OpenCvSharp3, Ghostscript.NET, QR Code Decode Library

Current processing

* Note: this is long.

1. About the paper
The paper is A3 landscape, with QR codes printed at the bottom left and the top right, and is provided as a JPEG or PDF file.

2. Image extraction and correction

2-1 Saving as JPEG
The JPEG and PDF files are produced by the scanner. * The tilt has already been corrected to some extent by the scanner.
JPEG files (200 dpi) are processed as-is; PDF files (200 dpi) are converted to JPEG at 96 dpi using Ghostscript. Outputting at 96 dpi is a trade-off against the response time of the whole process.

2-2 QR code detection
OpenCvSharp and the "QR Code Decode Library" are used to detect the QR codes.
The coordinates of the four corners of each QR code can be obtained at this point, but when connected they may form a general quadrilateral rather than a rectangle.

A rectangle (the "QR frame") is formed from the bottom-left corner of the bottom-left QR code and the top-right corner of the top-right QR code.

With the top-left coordinate of the QR frame as the reference point, and after the rotation correction described in 2-3, the items inside the QR frame are read using separately defined information (a template).
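The construction of the QR frame from the two corner points can be sketched as follows (Python for illustration, since only coordinate arithmetic is involved; the function name and the `(x, y, width, height)` representation are hypothetical, and image coordinates with y pointing down are assumed):

```python
def qr_frame(bl_corner, tr_corner):
    """Build the QR frame rectangle as (x, y, width, height) from the
    bottom-left corner of the bottom-left QR code and the top-right
    corner of the top-right QR code (image coordinates, y down)."""
    (x1, y1), (x2, y2) = bl_corner, tr_corner
    # the frame's top-left is (x1, y2); width/height follow from the corners
    return (x1, y2, x2 - x1, y1 - y2)
```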

2-3 Rotation correction
Even after the scanner's tilt correction, a slight tilt may remain. * The figure exaggerates the tilt.

The aspect ratio of the QR frame is computed for both the template (red) and the scanned image (blue), and the difference between the two is taken.

If the difference is 0.007 or more, the result is very likely to be misaligned, so rotation correction is performed.
* If the difference is less than 0.007, no correction is applied.
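The threshold check above can be sketched like this (Python for illustration; the function name and the `(width, height)` frame representation are hypothetical, while the 0.007 threshold comes from the text):

```python
def needs_rotation_correction(template_frame, scanned_frame, threshold=0.007):
    """template_frame / scanned_frame: (width, height) of the QR frame.
    Returns True when the aspect-ratio difference suggests misalignment."""
    t_ratio = template_frame[0] / template_frame[1]
    s_ratio = scanned_frame[0] / scanned_frame[1]
    return abs(t_ratio - s_ratio) >= threshold
```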
Straight lines are detected with OpenCV's HoughLinesP:

Mat edge = new Mat(src.Size(), MatType.CV_8U);
Cv2.CvtColor(src, edge, ColorConversionCodes.BGR2GRAY);
Mat edge2 = new Mat();
Cv2.Canny(edge, edge2, 25, 50);
LineSegmentPoint[] lines = Cv2.HoughLinesP(edge2, 1, Math.PI / 180, 100, 150, 15);
// for each detected segment ln, the angle in degrees:
double degrees = Math.Atan2(ln.P2.Y - ln.P1.Y, ln.P2.X - ln.P1.X) * (180 / Math.PI);

This yields a group of lines as in the figure below, but since characters and the like are also picked up, lines tilted 1 degree or more are discarded, and the average of the remaining angles is used as the tilt angle. * Thanks to the scanner's automatic correction, most residual tilts are less than 1 degree.
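The filter-and-average step can be sketched as follows (Python for illustration; the function name and the segment representation are hypothetical, while the 1-degree cutoff is the one described above):

```python
import math

def estimate_tilt(lines, max_abs_degrees=1.0):
    """lines: list of ((x1, y1), (x2, y2)) segments from HoughLinesP.
    Discards segments tilted 1 degree or more (likely characters) and
    returns the average angle of the rest, in degrees."""
    angles = []
    for (x1, y1), (x2, y2) in lines:
        deg = math.degrees(math.atan2(y2 - y1, x2 - x1))
        if abs(deg) < max_abs_degrees:
            angles.append(deg)
    return sum(angles) / len(angles) if angles else 0.0
```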

A Graphics object (System.Drawing) is used for the rotation:

g.RotateTransform(angle * -1F);

The rotated image is then fed back to the beginning of 2-3, and the process is repeated to obtain the final image.

3. Item detection
Starting from the template (red frame), the target item (dotted frame) is detected using the horizontal and vertical ratios between the template and the target image (green frame).

When the aspect ratios are close, there is no problem; but once the deviation exceeds a certain amount, the detected items shift.
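One plausible reading of the template-to-image mapping is sketched below (Python for illustration; the function name, the `(x, y, w, h)` rectangles, and the per-axis scaling scheme are my assumptions, not necessarily the questioner's exact implementation):

```python
def map_item_rect(template_frame, scanned_frame, item_rect):
    """Map an item rectangle defined against the template's QR frame onto
    the scanned image's QR frame, scaling each axis independently.
    All rectangles are (x, y, w, h)."""
    tx, ty, tw, th = template_frame
    sx, sy, sw, sh = scanned_frame
    rx, ry, rw, rh = item_rect
    fx, fy = sw / tw, sh / th  # per-axis scale factors
    return (sx + (rx - tx) * fx, sy + (ry - ty) * fy, rw * fx, rh * fy)
```

When the two frames' aspect ratios differ, fx and fy diverge, which is exactly the situation where mapped items start to shift.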

  • Answer #1

    First of all, how accurate is the detection of the two points, "top right" and "bottom left", by the QR detection performed at the beginning?
    You should check that first.
    If the error in this initial data is large, problems arise:

    the correction in the later stages is not very meaningful, and

    there is no point in repeating the process (if each iteration introduces a non-negligible independent deviation, repetition gains nothing).

    If the detection accuracy of these two points is sufficient ...
    The indicator used to judge "whether the image is tilted" is expressed as an "aspect ratio", but in essence it looks at the tangent of the angle of the line connecting the two points.
    If so, it may be possible to estimate the tilt directly from those two points.
    (That said, this is rather idealized, and from a robustness standpoint I am not sure it should actually be used as the basis for the correction amount.)
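The direct estimate suggested here could look like the following (Python for illustration; the function name and parameters are hypothetical): compare the angle of the diagonal connecting the two points in the scanned image against the same diagonal's angle in the template.

```python
import math

def tilt_from_qr_points(bottom_left, top_right, template_bl, template_tr):
    """Estimate page tilt (in degrees) as the difference between the angle
    of the bottom-left -> top-right diagonal in the scanned image and the
    same diagonal's angle in the template. A sketch, not robust to
    per-point detection noise."""
    def diag_angle(p, q):
        return math.atan2(q[1] - p[1], q[0] - p[0])
    return math.degrees(diag_angle(bottom_left, top_right)
                        - diag_angle(template_bl, template_tr))
```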

    # Also, can the amount of tilt be obtained from the QR detection process itself?

    This may be beside the point, but:

    Looking at the figure in the question, the reference position (the top left of the frame) used when determining the final region is not a position where some image feature actually exists; its coordinates appear to be computed (extrapolated, so to speak) from the two QR points above.
    That seems like a rather "weak" approach (though this is an intuitive impression and I cannot justify it rigorously).
    Personally, I would choose reference coordinates at places where image features exist, and detect them directly.

    * Presumably, this top-left reference point is estimated as (x, y) from the y of the top-right point and the x of the bottom-left point, both detected on an image that is assumed to have had its rotation completely removed.
    If so, and if the rotation removal is not perfect, a strange offset will appear in the top-left coordinates themselves.
    Given that "the rotation may not be completely removed even in the final corrected image", it would be OK if the directions of the two axes were also properly estimated and used.
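A sketch of what "using the two axis directions" could mean (Python for illustration; the function name and the assumption of image coordinates with y pointing down are mine): given the residual tilt, the top-left corner follows from solving for the frame's width and height along the tilted axes instead of taking x and y from the two points directly.

```python
import math

def top_left_from_axes(bl, tr, tilt_degrees):
    """Estimate the QR frame's top-left corner from its bottom-left and
    top-right points, accounting for residual tilt (image coords, y down).
    Solves W*u + H*v = tr - bl for width W and height H along the tilted
    axes u ("right") and v ("up"), then extrapolates from bl."""
    th = math.radians(tilt_degrees)
    ux, uy = math.cos(th), math.sin(th)    # frame's horizontal axis
    vx, vy = math.sin(th), -math.cos(th)   # frame's "up" axis
    dx, dy = tr[0] - bl[0], tr[1] - bl[1]
    det = ux * vy - uy * vx
    H = (ux * dy - uy * dx) / det          # frame height along v
    return (bl[0] + H * vx, bl[1] + H * vy)
```

With tilt_degrees = 0 this reduces to the axis-aligned extrapolation (bottom-left x, top-right y); with a non-zero residual tilt it removes the offset described above.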