Table of Contents

  • 1 Introduction
  • 2 Monochrome fans
  • 3 Give me some more color
  • 4 Other options

1 Introduction

My recommended practice is to use NetPBM for the dithering, which gives a huge
reduction in file size at a minor cost in image quality.

✎︎
Note

There are some custom programs I wrote to complement the dithering algorithms
from NetPBM. Some day I will add an explanation of them.

If in doubt, consult the NetPBM man pages for more options.

2 Monochrome fans

For high-resolution scanned monochrome images (or ones scanned at less than
the original size), convert them to PBM format as follows:

$ jpegtopnm foo.jpg | ppmtopgm | pgmtopbm -threshold > out.pbm

You need ppmtopgm to normalize the input to greyscale, and pgmtopbm -threshold
should be good enough. If the resolution is not high enough, you might want to
pipe the greyscale image through pgmenhance at an appropriate level; the
default of 9 might be a little high. Experiment a few times and you'll find
the optimal setting for your scans.
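
As a rough sketch of the enhancement step, the pipeline above can be wrapped in
a small shell function. The function name and the fallback level of 5 are made
up for illustration; pgmenhance itself takes the level as a digit option from
-1 to -9.

```shell
# Hypothetical helper, not part of NetPBM.
# Usage: scan_to_pbm_enhanced INPUT.jpg OUTPUT.pbm [LEVEL]
scan_to_pbm_enhanced() {
    # LEVEL is the pgmenhance strength: 1 is the weakest, 9 the strongest
    # (and pgmenhance's own default). 5 is only a starting point to tune from.
    jpegtopnm "$1" | ppmtopgm | pgmenhance "-${3:-5}" | pgmtopbm -threshold > "$2"
}
```

For example, scan_to_pbm_enhanced foo.jpg out.pbm 3 applies a mild
enhancement before thresholding.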

☞︎
When the resolution is too low

If the file is greyscale, or has been compressed so much that the PBM format
cannot represent the image well enough, use pnmquant instead of pgmtopbm.
Specify -fs 8 (Floyd-Steinberg dithering down to 8 grey levels), which is
usually good enough; drop to 5 if you prefer a smaller file. Don't go below 5
unless you are willing to lose a lot of detail.
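
A minimal sketch of that variant, with pnmquant in place of pgmtopbm. The
function name is hypothetical, and the output is given a generic .pnm name
because it is greyscale rather than bilevel.

```shell
# Hypothetical helper, not part of NetPBM.
# Usage: scan_to_grey INPUT.jpg OUTPUT.pnm [LEVELS]
scan_to_grey() {
    # -fs turns on Floyd-Steinberg dithering; the number that follows is the
    # count of colours (here: grey levels) to keep. Falls back to 8.
    jpegtopnm "$1" | ppmtopgm | pnmquant -fs "${3:-8}" > "$2"
}
```

Calling scan_to_grey foo.jpg out.pnm 5 trades some detail for a smaller file.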

In most scenarios, the Atkinson algorithm can still produce satisfactory
results.
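
One way to try Atkinson dithering, assuming your NetPBM is recent enough that
pamditherbw (the successor of pgmtopbm) accepts -atkinson, is sketched below.
This is an assumption about tooling and may not be the exact program behind
the results described above; the function name is hypothetical.

```shell
# Hypothetical helper, not part of NetPBM.
# Usage: scan_to_pbm_atkinson INPUT.jpg OUTPUT.pbm
scan_to_pbm_atkinson() {
    # pamditherbw writes PAM, so pamtopnm converts the result to plain PBM.
    jpegtopnm "$1" | ppmtopgm | pamditherbw -atkinson | pamtopnm > "$2"
}
```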

That's not all. To reduce the file size you need to convert the raw
PPM/PGM/PBM output to a losslessly compressed image format. I choose TIFF with
LZW here because it works for both greyscale and colored images. For
high-entropy images such as comics, choosing CCITT for monochrome may not give
a smaller file than plain LZW.

On macOS you can use:

$ sips -s format tiff -s formatOptions lzw file
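
Away from macOS, a roughly equivalent step can be done with NetPBM's own TIFF
writer. The sketch below assumes pnmtotiff is installed (newer releases call
it pamtotiff) and uses its -lzw option; the function name is hypothetical.

```shell
# Hypothetical helper, not part of NetPBM.
# Usage: pnm_to_lzw_tiff INPUT.pnm OUTPUT.tif
pnm_to_lzw_tiff() {
    # The TIFF writer needs seekable output, so redirect to a regular file
    # rather than piping into another program.
    pnmtotiff -lzw "$1" > "$2"
}
```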

You can then convert to other, more sophisticated TIFF variants with libtiff's
tiffcp or tiff2bw.
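
To check whether CCITT actually beats LZW for a given page, something like the
following sketch can help. It assumes a 1-bit input TIFF and libtiff's tiffcp
with its -c lzw and -c g4 compression options; the function name and output
file names are made up for illustration.

```shell
# Hypothetical helper, not part of libtiff.
# Usage: compare_tiff_sizes INPUT.tif
compare_tiff_sizes() {
    base=${1%.tif}
    tiffcp -c lzw "$1" "$base-lzw.tif"   # LZW works at any bit depth
    tiffcp -c g4  "$1" "$base-g4.tif"    # CCITT Group 4 needs a 1-bit image
    # Print the byte count of each copy so the smaller one can be kept.
    wc -c "$base-lzw.tif" "$base-g4.tif"
}
```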

3 Give me some more color

For scanned high-resolution colored images, ppmdither can get rid of
unnecessary details such as paper grain. The lossless compression format to
use afterwards can be PNG.
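
Put together, the color workflow might look like the sketch below. It assumes
a JPEG scan as input, ppmdither at its default settings, and pnmtopng for the
PNG output; the function name is hypothetical.

```shell
# Hypothetical helper, not part of NetPBM.
# Usage: scan_to_png INPUT.jpg OUTPUT.png
scan_to_png() {
    # ppmdither with no options applies its default ordered dither; see its
    # man page for the options that control the number of shades per channel.
    jpegtopnm "$1" | ppmdither | pnmtopng > "$2"
}
```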

Riemersma dither might also work, but it takes a long time to compute the
result.

4 Other options

JXL is a less popular format with a high compression ratio for colored images,
but there is currently little support for it on macOS.

JBIG2 can reuse the common patterns among multiple binary images, which can
give surprisingly small output. Sometimes I use that for scanned PDF files.