Next is a 300 dpi Line art sample with text that's a little light at the default 128 threshold. The original was printed on a dot matrix printer that was borderline on needing a new ribbon, and the document was old and faded too. Notice in the histogram that the usual large spike of black text is conspicuous by its absence. Instead, we have text in many shades of gray ranging all over the spectrum.
But the background is good clear white, see, all the way to the right end. So raising the threshold to about 180 (still well to the left of the white background peak) darkens ALL OF THE TEXT without hurting the background, and a good dark scan is obtained. All tonal values from 128 to 180 were changed from white to black. You can overdo it and get smudged characters, the same as those we try to repair next below. Again, see the A Simple Way pages later for a discussion of the histogram tool.
Threshold 128
Threshold 180
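For anyone who wants to see the arithmetic, here is a minimal sketch of what the Threshold does to gray tones. The tone values are invented for illustration (they are not measured from the scan above), the function name is hypothetical, and I am assuming that a tone exactly equal to the Threshold goes to white; a given scanner driver may treat that edge case the other way.

```python
BLACK, WHITE = 0, 255

def to_line_art(tones, threshold=128):
    """Map 8-bit gray tones (0-255) to pure black or pure white.

    Assumption: tones below the threshold become black, and tones at
    or above it become white.
    """
    return [BLACK if tone < threshold else WHITE for tone in tones]

# Invented tones: faded text scattered over the grays, plus a clean
# white background far to the right of the histogram.
faded_text = [60, 110, 140, 170]
background = [235, 250]
sample = faded_text + background

at_128 = to_line_art(sample, 128)
at_180 = to_line_art(sample, 180)

print(at_128)  # [0, 0, 255, 255, 255, 255]  lighter text tones drop out
print(at_180)  # [0, 0, 0, 0, 255, 255]      all text black, background still white
```

The two text tones between 128 and 180 are the ones that flip from white to black, while the background tones stay well above either setting.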
But generally, it's more common to need to LOWER the threshold to make the text more clear, or to make the background lighter. In magazines and newspapers especially, the type is rather smudged if you look closely. Lowering the threshold does NOT make the text lighter; black is still black in Line art. But it may seem to, because it reduces the smudges and makes the text finer and more distinct. This can definitely help OCR, did I mention that? In Line art mode, there are only two choices, and if any dot prints at all, it will be jet black. But if the original has a background or spots or objects of gray, say tone 105, you can easily change value 105 to either White or Black by positioning the Threshold relative to it. The Preview screen shows the real time result of moving the Threshold (if you have selected Real Time).
One suggestion for minimizing print-through from text on the back side of the page is to put a sheet of black or dark craft paper behind the page. This will hide any dark areas on the back side. However, in Line art mode, it's easier to just adjust the Threshold so those light tones go to white.
This next image is from a 300 dpi Line art scan (from a printed travel catalog of only moderate print quality), using Threshold = 155 (raised to aggravate things; the default is 128).
Below is the same scan, but with Threshold reduced from 155 to 95. Notice the background, but also the text: the more distinct white space in the e's in Versailles or Geneva, the A in Glacier, the G in TGV, the M in St. Moritz. This is not an OCR document, but the point is the same as if it were. OCR likes that white space to be correct, because the shape is how it recognizes the character within the pixel bits. And notice that the surrounding area is blank, clear of extraneous background dots that could confuse the OCR. So hopefully this minimal little demo will give a hint about the control possible in Line art mode, how you can set a limit to accept or reject gray tones, and why you'd want to do it.
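The same effect can be sketched on a toy glyph. The grid below is a made-up 5x5 patch of gray tones, not data from the catalog scan: a dark ring of ink (tones around 40), smudge inside the ring (tones around 125), one clean white pixel in the middle, and dingy background in the corners (tone 140). The `render` helper is hypothetical, and it assumes tones below the Threshold print black.

```python
def render(grid, threshold):
    """Return the grid as text rows: '#' for black, '.' for white.

    Assumption: tones below the threshold become black.
    """
    return ["".join("#" if tone < threshold else "." for tone in row)
            for row in grid]

# Invented tones: ink ring, smudged interior, dingy gray corners.
glyph = [
    [140,  40,  35,  45, 140],
    [ 40, 120, 130, 125,  38],
    [ 35, 130, 200, 128,  42],
    [ 42, 125, 127, 122,  40],
    [140,  38,  44,  36, 140],
]

for threshold in (155, 95):
    print(f"Threshold {threshold}")
    print("\n".join(render(glyph, threshold)))
    print()
```

At 155 the smudge and the gray corners all print black, leaving a one-pixel hole in a solid block. At 95 only the ink ring survives, so the white space inside the character opens up and the corners clear to white.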
Threshold is the only control possible in Line art mode.