GSCAN2PDF(1)	      User Contributed Perl Documentation	  GSCAN2PDF(1)

NAME
       gscan2pdf - A GUI to produce PDFs or DjVus from scanned documents

USAGE
       1. Scan one or several pages in with File/Scan
       2. Create PDF of	selected pages with File/Save

REQUIRED ARGUMENTS
       None

OPTIONS
       gscan2pdf has the following command-line	options:

       --device=device
           Specifies the device to use, instead of getting the list of devices
           via the SANE API.  This can be useful if the scanner is on a
           remote computer which is not broadcasting its existence.

       --help
	   Displays this help page and exits.

       --log=log-file
	   Specifies a file to store logging messages.

       --debug,	--info,	--warn,	--error, --fatal
	   Defines the log level.  If a	log file is specified, this defaults
	   to --debug, otherwise --error.

       --import=PDF|DjVu|images
	   Imports the specified file(s). If the document has more than	one
	   page, a window is displayed to select the required pages.

       --import-all=PDF|DjVu|images
           Imports all pages of the specified file(s).

       --version
	   Displays the	program	version	and exits.

       Scanning	is handled with	SANE via scanimage.  PDF conversion is done by
       PDF::Builder.  TIFF export is handled by	libtiff	(faster	and smaller
       memory footprint	for multipage files).

DIAGNOSTICS
       To diagnose a possible error, start gscan2pdf from the command line
       with logging enabled:

       "gscan2pdf --log=file.log"

       and check file.log.
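
       As described under OPTIONS, the log level defaults to --debug when a
       log file is given and to --error otherwise. The sketch below restates
       that rule in shell for illustration only; it is not gscan2pdf's own
       code, and the function name default_log_level is hypothetical.

```shell
# Hypothetical helper: report which log level applies when none of
# --debug, --info, --warn, --error or --fatal is given.
# $1 is the value passed to --log (empty if no log file was requested).
default_log_level() {
    if [ -n "$1" ]; then
        echo debug   # a log file was specified
    else
        echo error   # no log file
    fi
}

default_log_level file.log   # prints "debug"
default_log_level ""         # prints "error"
```

       To log less verbosely, name the level explicitly, e.g.
       "gscan2pdf --log=file.log --warn".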

EXIT STATUS
       None

CONFIGURATION
       gscan2pdf creates a text	resource file in ~/.config/gscan2pdfrc.	The
       directory can be	changed	by setting the $XDG_CONFIG_HOME	variable.
       Generally, however, preferences should be changed via the
       Edit/Preferences	menu, or are captured automatically during normal
       usage of	the program.
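
       A minimal sketch of how the resource file location follows from the
       paragraph above, assuming the usual XDG convention that ~/.config is
       used when $XDG_CONFIG_HOME is unset or empty. The function name
       gscan2pdf_rc_path is hypothetical.

```shell
# Sketch: resolve the resource file location, falling back to ~/.config
# when XDG_CONFIG_HOME is unset or empty (assumed XDG behaviour).
gscan2pdf_rc_path() {
    printf '%s/gscan2pdfrc\n' "${XDG_CONFIG_HOME:-$HOME/.config}"
}

gscan2pdf_rc_path
```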

INCOMPATIBILITIES
       None known.

BUGS AND LIMITATIONS
       Whilst it is possible to import PDFs, the import is intended for
       round-tripping files created by gscan2pdf, not for arbitrary PDFs.

Download
       gscan2pdf is available on Sourceforge
       (<https://sourceforge.net/projects/gscan2pdf/files/gscan2pdf/>).

   Debian-based
       If you are using	Debian,	you should find	that sid
       <https://www.debian.org/releases/sid/> has the latest version already
       packaged.

       If you are using an Ubuntu-based system, you can automatically keep up
       to date with the latest version via the PPA:

       "sudo apt-add-repository	ppa:jeffreyratcliffe/ppa"

       If you are using Synaptic, then use menu Edit/Reload Package
       Information, search for gscan2pdf in the package list, and lo and
       behold, you can install the nice shiny new version.

       From the	command	line:

       "sudo apt update"

       "sudo apt install gscan2pdf"

   From	source
       The source is hosted in the files section of the	gscan2pdf project on
       Sourceforge (<https://sourceforge.net/projects/gscan2pdf/files/>).

   From	the repository
       gscan2pdf uses Git for its Revision Control System. You can browse the
       tree at <https://sourceforge.net/p/gscan2pdf/code/>.

       Git users can clone the complete	tree with "git clone
       git://git.code.sf.net/p/gscan2pdf/code"

Building gscan2pdf from	source
       Having downloaded the source either from a Sourceforge file release, or
       from the Git repository, unpack it if necessary with

       "tar xvfz gscan2pdf-x.x.x.tar.gz"

       "cd gscan2pdf-x.x.x"

       "perl Makefile.PL" will create the Makefile.

       "make test" should run several hundred tests to confirm that things
       will work properly on your system.

       You can install directly	from the source	with "make install", but
       building	the appropriate	package	for your distribution should be	as
       straightforward as "make	debdist" or "make rpmdist". However, you will
       additionally need the rpm, devscripts, fakeroot,	debhelper and gettext
       packages.

Dependencies
       The list	below looks daunting, but all packages are available from any
       reasonably up-to-date distribution. If you are using Synaptic, having
       installed gscan2pdf, locate the gscan2pdf entry in Synaptic, right-
       click it	and you	can install them under Recommends. Note	also that the
       library names given below are the Debian/Ubuntu ones. Those
       distributions using RPM typically use perl(module) where	Debian has
       libmodule-perl.

       Required
	   libgtk3-perl	>= 0.028
               There is a bug in versions of libgtk3-perl before 0.028 that
	       causes gscan2pdf	to crash when saving. Whilst I could prevent
	       gscan2pdf from crashing,	it would still be impossible to	save
	       anything, rendering gscan2pdf rather useless.

	   libgtk3-simplelist-perl
	       A simple	interface to Gtk3's complex MVC	list widget

	   liblocale-gettext-perl (>= 1.05)
	       Using libc functions for	internationalisation in	Perl

	   libpdf-builder-perl
	       provides	the functions for creating PDF documents in Perl

	   libsane
	       API library for scanners

	   libimage-sane-perl
	       Perl bindings for libsane.

	   libset-intspan-perl
	       manages sets of integers

	   libtiff-tools
	       TIFF manipulation and conversion	tools

           imagemagick
	       Image manipulation programs

	   perlmagick
	       A perl interface	to the libMagick graphics routines

	   sane-utils
	       API library for scanners	-- utilities.

       Optional
	   sane
	       scanner graphical frontends. Only required for the scanadf
	       frontend.

	   unpaper
	       post-processing tool for	scanned	pages. See
	       <https://www.flameeyes.eu/projects/unpaper>.

	   xdg-utils
	       Desktop integration utilities from freedesktop.org. Required
	       for Email as PDF.  See
	       <https://www.freedesktop.org/wiki/Software/xdg-utils/>

	   djvulibre-bin
	       Utilities for the DjVu image format. See
	       <http://djvu.sourceforge.net/>

	   gocr
	       A command line OCR. See <http://jocr.sourceforge.net/>.

	   tesseract
	       A command line OCR. See
	       <https://github.com/tesseract-ocr/tesseract>

	   cuneiform
	       A command line OCR. See <http://launchpad.net/cuneiform-linux>

Support
       There are two mailing lists for gscan2pdf:

       gscan2pdf-announce
	   A low-traffic list for announcements, mostly	of new releases. You
	   can subscribe at
	   <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-announce>

       gscan2pdf-help
           General support, questions, etc. You can subscribe at
	   <https://lists.sourceforge.net/lists/listinfo/gscan2pdf-help>

Reporting bugs
       Before reporting	bugs, please read the "FAQs" section.

       Please report any bugs found, preferably	against	the Debian
       package[1][2].  You do not need to be a Debian user, or set up an
       account to do this.  The	Debian tool "reportbug"	provides a convenient
       GUI for doing so.

       1. https://packages.debian.org/sid/gscan2pdf
       2. https://www.debian.org/Bugs/

       Alternatively, there is a bug tracker for the gscan2pdf project on
       Sourceforge
       (<https://sourceforge.net/p/gscan2pdf/_list/tickets?source=navbar>).

       Please include the log file created by "gscan2pdf --log=log" with any
       new bug report.

Translations
       gscan2pdf has already been partly translated into several languages.
       If you would like to contribute to an existing or new translation,
       please check out	Rosetta:
       <https://translations.launchpad.net/gscan2pdf>

       Note that the translations for the scanner options are taken directly
       from sane-backends. If you would like to contribute to these, you can
       do so by contacting the sane-devel mailing list
       (sane-devel@lists.alioth.debian.org) and having a look at the po/
       directory in the source code <http://www.sane-project.org/cvs.html>.

       Alternatively, Ubuntu has its own translation project. For the 9.04
       release,	the translations are available at
       <https://translations.launchpad.net/ubuntu/jaunty/+source/sane-backends/+pots/sane-backends>

       If you have updated a ".po" file in the "po" directory of the
       gscan2pdf source	tree and would like to test it,	pick a test directory
       for the compiled	locales, e.g.  "./locale", and create the ".mo"	files
       with:

       "perl Makefile.PL LOCALEDIR=./locale"

       If the updated locale is	your standard one, then	the following will
       find the	updated	file:

       "perl -I	lib bin/gscan2pdf --log=log --locale=locale"

       If it is	not your standard locale, you will need	something like (for
       Russian):

       "LC_ALL=ru_RU.utf8 LC_MESSAGES=ru_RU.utf8 LC_CTYPE=ru_RU.utf8
       LANG=ru_RU.utf8 LANGUAGE=ru_RU.utf8 perl	-I lib bin/gscan2pdf --log=log
       --locale=locale"

       or German:

       "LC_ALL=de_DE LC_MESSAGES=de_DE LC_CTYPE=de_DE LANG=de_DE
       LANGUAGE=de_DE perl -I lib bin/gscan2pdf	--log=log --locale=locale"

       If the above doesn't work, make sure it is in the list produced by
       "locale -a", including any ".utf8" suffix. If necessary,	generate new
       locales with "sudo dpkg-reconfigure locales"
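
       The check against "locale -a" described above can be scripted. This is
       a sketch only; the function name has_locale is hypothetical, and
       ru_RU.utf8 is just the example locale used earlier in this section.

```shell
# Sketch: succeed if the locale named in $1 appears, as an exact line,
# in a list of locale names read from stdin (normally "locale -a").
has_locale() {
    grep -Fqx -- "$1"
}

if locale -a 2>/dev/null | has_locale ru_RU.utf8; then
    echo "ru_RU.utf8 is available"
else
    echo "ru_RU.utf8 is missing; generate it before testing"
fi
```

       The match is exact, so a locale listed as ru_RU.utf8 is not found when
       asked for as ru_RU.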

DESCRIPTION
   File
       New

       Clears the page list.

       Open

       Opens any format	that imagemagick supports. PDFs	will have their
       embedded	images extracted and imported one per page.

       Note that files can also	be imported by dragging	them into the
       thumbnail list from a program like nautilus or konqueror.

       Scan

       Sets options before scanning via	SANE.

       Device

       Chooses between available scanners.

       # Pages

       Selects the number of pages, or all pages to scan.

       Source document

       Selects between single-sided and double-sided pages.

       This affects the page numbering.  Single-sided scans are numbered
       consecutively.  Double-sided scans are incremented (or decremented, see
       below) by 2, i.e. 1, 3, 5, etc.

       Side to scan

       If double sided is selected above, assuming a non-duplex	scanner, i.e.
       a scanner that cannot automatically scan	both sides of a	page, this
       determines whether the page number is incremented or decremented	by 2.

       To scan both sides of three pages, i.e. 6 sides:

       1. Select:
	   # Pages = 3 (or "all" if your scanner can detect when it is out of
	   paper)

	   Double sided

	   Facing side

       2. Scans	sides 1, 3 & 5.
       3. Put pile back	with scanner ready to scan back	of last	page.
       4. Select:
	   # Pages = 3 (or "all" if your scanner can detect when it is out of
	   paper)

	   Double sided

	   Reverse side

       5. Scans	sides 6, 4 & 2.
       6. gscan2pdf automatically sorts	the pages so that they appear in the
       correct order.
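
       The numbering used in the steps above can be sketched in shell. This
       only illustrates the arithmetic and is not gscan2pdf's own code; the
       function name page_numbers and its arguments are hypothetical.

```shell
# Sketch: page numbers assigned to each scan in a double-sided job on a
# non-duplex scanner. side is "facing" or "reverse"; n is the sheet count.
page_numbers() {
    side=$1
    n=$2
    i=0
    out=
    while [ "$i" -lt "$n" ]; do
        if [ "$side" = facing ]; then
            p=$((2 * i + 1))        # 1, 3, 5, ... counting up by 2
        else
            p=$((2 * n - 2 * i))    # 2n, 2n-2, ... counting down by 2
        fi
        out="${out:+$out }$p"
        i=$((i + 1))
    done
    echo "$out"
}

page_numbers facing 3    # prints "1 3 5"
page_numbers reverse 3   # prints "6 4 2"
```

       Sorting the two sequences together yields the sides in reading order,
       which is what step 6 relies on.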

       Device-dependent	options

       These, naturally, depend	on your	scanner.  They can include

       Page size.
       Mode (colour/black & white/greyscale)
       Resolution (in PPI)
       Batch-scan
	   Guarantees that a "no documents" condition will be returned after
	   the last scanned page, to prevent endless flatbed scans after a
	   batch scan.

       Wait-for-button/Button-wait
	   After sending the scan command, wait	until the button on the
	   scanner is pressed before actually starting the scan	process.

       Source
	   Selects the document	source.	 Possible options can include Flatbed
	   or ADF.  On some scanners, this is the only way of generating an
	   out-of-documents signal.

       Save

       Saves the selected or all pages as a PDF, DjVu, TIFF, PNG, JPEG,	PNM or
       GIF.

       Metadata

       Metadata is information that is not visible when viewing the
       PDF/DjVu, but is embedded in the file and so is searchable and can be
       examined, typically with the "Properties" option of the document
       viewer.

       The metadata is completely optional, but can also be used to generate
       the filename; see Preferences for details.

       The date	can be selected	with use of the	calendar widget. The displayed
       date can	be incremented or decremented with use of the '+' and '-'
       keys.

       DjVu

       Both black-and-white and colour images compress better as DjVu than as
       PDF. See <http://www.djvuzone.org/> for more details.

       Email as	PDF

       Attaches	the selected or	all pages as a PDF to a	blank email.  This
       requires	xdg-email, which is in the xdg-utils package.  If this is not
       present,	the option is ghosted out.

       Print

       Prints the selected or all pages.

       Compress	temporary files

       If your temporary ($TMPDIR) directory is getting full, this function
       can be useful: it compresses all images as LZW-compressed TIFFs. These
       require much less space than the PNM files that are typically produced
       by SANE or by importing a PDF.

   Edit
       Delete

       Deletes the selected page.

       Renumber

       Renumbers the pages from	1..n.

       Note that the page order	can also be changed by drag and	drop in	the
       thumbnail view.

       Select

       The select menus can be used to select all, even, odd, blank, dark or
       modified pages. Selecting blank or dark pages runs imagemagick to make
       the decision.  Selecting modified pages selects those which have been
       modified by threshold, unsharp, etc., since the last OCR run was made.

       Properties

       When an image is	scanned, gscan2pdf attempts to extract the resolution
       from the	scan options. This nearly always works without problem.

       Importing an image can be trickier, however. Some image formats such as
       PNM do not encode metadata for resolution. In other cases, the data is
       incorrect.  Edit/Properties allows the user to manually correct the
       metadata for a particular page, thus correcting the size of the final PDF
       or DjVu.	The image itself is otherwise not changed - it is not down- or
       upscaled.

       Preferences

       The preferences menu item allows	the control of the default behaviour
       of various functions. Most of these are self-explanatory.

       Frontends

       gscan2pdf initially supported two frontends, scanimage and scanadf.
       scanadf support was added when it was realised that scanadf works
       better than scanimage with some scanners. On Debian-based systems,
       scanadf is in the sane package, not, like scanimage, in sane-utils. If
       scanadf is not present, the option is obviously ghosted out.

       In 0.9.27, Perl bindings	for SANE were introduced. These	are called
       libsane-perl.

       Before 1.2.0, options available through CLI frontends like scanimage
       were made visible as users asked	for them. In 1.2.0, all	options	can be
       shown or	hidden via Edit/Preferences, along with	the ability to specify
       which options trigger a reload.

       In 1.8.3, new Perl bindings for SANE were introduced. These are called
       libimage-sane-perl and are the preferred	frontend.

       In 1.8.5, support for libsane-perl was removed.

       Device blacklist

       Ignore listed devices.

       Note that this is a device name regular expression, e.g.	/dev/video,
       and not the name	as listed in the scan window, e.g. Noname
       Integrated_Webcam_HD.
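
       The effect of a blacklist expression can be tried out on a list of
       device names. The sketch below uses grep's extended regular
       expressions, which may differ in detail from the expression syntax
       gscan2pdf itself accepts; the function name filter_devices and the
       device names are made-up examples.

```shell
# Sketch: filter a list of SANE device names (one per line on stdin)
# through a blacklist regular expression, keeping those that do not match.
# The device names used below are made-up examples.
filter_devices() {
    grep -Ev -- "$1"
}

printf '%s\n' 'v4l:/dev/video0' 'epson2:libusb:001:004' |
    filter_devices '/dev/video'
# prints "epson2:libusb:001:004"
```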

       Default filename	for PDF	or DjVu	files

       All strftime codes (e.g.	%Y for the current year) are available as
       variables, with the following additions:

       %Da author

       %De filename extension

       %Dt title

       All document date codes use strftime codes with a leading D, e.g.:

       %DY document year

       %Dm document month

       %Dd document day
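
       A rough sketch of how such a template might expand, for illustration
       only. It assumes the document codes are substituted first and the
       remaining strftime codes are then expanded for the current time; the
       function name expand_filename and its argument order are hypothetical,
       and only the six document codes listed above are handled.

```shell
# Sketch: expand a filename template. Document-specific codes (%Da, %Dt,
# %De, %DY, %Dm, %Dd) are substituted first; whatever strftime codes remain
# are then expanded by date(1) for the current time.
# Arguments: template author title extension document-date (YYYY-MM-DD).
# Limitation: values containing "|", "&" or "\" would confuse sed.
expand_filename() {
    doc_year=${5%%-*}                 # YYYY
    rest=${5#*-}                      # MM-DD
    doc_month=${rest%%-*}             # MM
    doc_day=${rest#*-}                # DD
    expanded=$(printf '%s\n' "$1" | sed \
        -e "s|%Da|$2|g" -e "s|%Dt|$3|g" -e "s|%De|$4|g" \
        -e "s|%DY|$doc_year|g" -e "s|%Dm|$doc_month|g" \
        -e "s|%Dd|$doc_day|g")
    date +"$expanded"
}

expand_filename '%DY-%Dm-%Dd %Dt.%De' 'A. Author' 'Invoice' 'pdf' '2021-03-04'
# prints "2021-03-04 Invoice.pdf"
```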

   View
       Zoom 100%

       Zooms to	1:1. How this appears depends on the desktop resolution.

       Zoom to fit

       Scales the view such that all the page is visible.

       Zoom in

       Zoom out

       Rotate 90° clockwise

       The rotate options require the package imagemagick and, if this is not
       present, are ghosted out.

       Rotate 180°

       Rotate 90° anticlockwise

   Tools
       Threshold

       Changes all pixels darker than the given	value to black;	all others
       become white.
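
       The rule can be illustrated on a plain list of grey values. This is a
       sketch of the idea only, assuming 8-bit values where 0 is black and
       255 is white; it says nothing about how gscan2pdf applies the
       threshold internally, and the function name threshold is hypothetical.

```shell
# Sketch: apply a threshold to grey values (0 = black, 255 = white),
# one value per line on stdin. Values below the cut-off become 0,
# all others 255.
threshold() {
    awk -v t="$1" '{ if ($1 < t) print 0; else print 255 }'
}

printf '%s\n' 12 130 200 | threshold 128
# prints 0, 255 and 255 on separate lines
```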

       Unsharp mask

       The unsharp option sharpens an image. The image is convolved with a
       Gaussian	operator of the	given radius and standard deviation (sigma).
       For reasonable results, radius should be	larger than sigma. Use a
       radius of 0 to have the method select a suitable	radius.

       Crop

       unpaper

       unpaper (see <https://www.flameeyes.eu/projects/unpaper>) is a utility
       for cleaning up a scan.

       OCR (Optical Character Recognition)

       The gocr, tesseract or cuneiform	utilities are used to produce text
       from an image.

       There is an OCR output buffer for each page, which is embedded as plain
       text behind the scanned image in the PDF produced. This way, Beagle can
       index (i.e. search) the plain text.

       In DjVu files, the OCR output buffer is embedded	in the hidden text
       layer.  Thus these can also be indexed by Beagle.

       There is	an interesting review of OCR software at
       <https://web.archive.org/web/20080529012847/http://groundstate.ca/ocr>.
       An important conclusion was that	400ppi is necessary for	decent
       results.

       Up to v2.04, the	only way to tell which languages were available	to
       tesseract was to	look for the language files. Therefore,	gscan2pdf
       checks the path returned	by:

       "tesseract '' ''	-l ''"

       If there	are no language	files in the above location, then gscan2pdf
       assumes that tesseract v1.0 is installed, which had no language files.

       Variables for user-defined tools

       The following variables are available:

       %i  input filename

       %o  output filename

       %r  resolution

       An image	can be modified	in-place by just specifying %i.
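
       The substitution can be pictured as follows. This is a sketch, not
       gscan2pdf's own code: the function name expand_tool, the file names
       and the ImageMagick commands used as templates are only examples.

```shell
# Sketch: substitute %i, %o and %r into a user-defined tool command line.
# Arguments: template input-file output-file resolution.
# Limitation: filenames containing "|", "&" or "\" would confuse sed.
expand_tool() {
    printf '%s\n' "$1" |
        sed -e "s|%i|$2|g" -e "s|%o|$3|g" -e "s|%r|$4|g"
}

expand_tool 'convert %i -negate %o' /tmp/in.pnm /tmp/out.pnm 300
# prints "convert /tmp/in.pnm -negate /tmp/out.pnm"
expand_tool 'mogrify -negate %i' /tmp/in.pnm /tmp/out.pnm 300
# prints "mogrify -negate /tmp/in.pnm" (in-place: only %i is used)
```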

FAQs
   Why isn't option xyz	available in the scan window?
       Possibly	because	SANE or	your scanner doesn't support it.

       If an option listed in the output of "scanimage --help" that you	would
       like to use isn't available, send me the	output and I will look at
       implementing it.

   I've	only got an old	flatbed	scanner	with no	automatic sheetfeeder. How do
       I scan a	multipage document?
       In Edit/Preferences, tick the box "Allow	batch scanning from flatbed".

       Some Brother scanners report "out of documents",	despite	scanning from
       flatbed.	 This can be worked around by ticking the box "Force new scan
       job between pages".

       If you are lucky, you have an option like Wait-for-button or Button-
       wait, where the scanner will wait for you to press the scan button on
       the device before it starts the scan, allowing you to scan multiple
       pages without touching the computer.

       If you are quick, you might be able to change the document on the
       flatbed whilst the scan head is returning.

       Otherwise, you have to set the number of	pages to scan to 1 and hit the
       scan button on the scan window for each page.

   Why is option xyz ghosted out?
       Probably	because	the package required for that option is	not installed.
       Email as	PDF requires xdg-email (xdg-utils), unpaper and	the rotate
       options require imagemagick.

   Why can I not scan from the flatbed of my HP	scanner?
       Generally for HP	scanners with an ADF, to scan from the flatbed,	you
       should set "# Pages" to "1", and	possibly "Batch	scan" to "No".

   When	I update gscan2pdf using the Update Manager in Ubuntu, why is the list
       of changes never	displayed?
       As far as I can tell, this is pulled from changelogs.ubuntu.com,	and
       therefore only the changelogs from official Ubuntu builds are
       displayed.

   Why can gscan2pdf not find my scanner?
       If your scanner is not connected directly to the machine on which you
       are running gscan2pdf and you have not installed the SANE daemon,
       saned, gscan2pdf cannot automatically find it. In this case, you can
       specify the scanner device on the command line:

       "gscan2pdf --device <device>"

   How can I search for	text in	the OCR	layer of the finished PDF or DJVU
       file?
       pdftotext or djvutxt can	extract	the text layer from PDF	or DJVU	files.
       See the respective man pages for	details.
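
       As a sketch, the two extractors can be wrapped so that the right one
       is chosen from the file extension. The function names text_tool and
       text_layer, and the file name scan.pdf, are hypothetical.

```shell
# Sketch: print the text layer of a PDF or DjVu file on stdout, choosing
# the extractor from the file extension.
text_tool() {
    case $1 in
        *.pdf|*.PDF)           echo pdftotext ;;
        *.djvu|*.DJVU|*.djv)   echo djvutxt ;;
        *)                     return 1 ;;
    esac
}

text_layer() {
    case $(text_tool "$1") in
        pdftotext) pdftotext "$1" - ;;   # "-" sends the text to stdout
        djvutxt)   djvutxt "$1" ;;       # prints to stdout by default
        *)         echo "unsupported file: $1" >&2; return 1 ;;
    esac
}

# Example (scan.pdf is a placeholder): list lines mentioning "invoice".
# text_layer scan.pdf | grep -in invoice
```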

       Having opened a PDF or DJVU file	in evince or Acrobat Reader, the
       search function will typically find the page with the requested text
       and highlight it.

       There are various tools for searching or	indexing files,	including PDF
       and DJVU:

       o   (meta) Tracker (<https://projects.gnome.org/tracker/>)

       o   plone (<http://plone.org/>)

       o   pdfgrep (<http://pdfgrep.sourceforge.net/>)

       o   swish-e (<http://www.swish-e.org/>)

       o   recoll (<http://www.lesbonscomptes.com/recoll/>)

       o   terrier (<http://www.lesbonscomptes.com/recoll/>)

   How can I change the	colour of the selection	box in the image viewer?
       Create a	file called "~/.config/gtk-3.0/gtk.css"	with the following
       content:

	.rubberband,
	rubberband,
	flowbox	rubberband,
	treeview.view rubberband,
	.content-view rubberband,
	.content-view .rubberband {
	  border: 1px solid #2a76c6;
	  background-color: rgba(42, 118, 198, 0.2); }

   How can I change the colour of the OCR output?
       Create a	file called "~/.config/gtk-3.0/gtk.css"	with the following
       content:

	#gscan2pdf-ocr-output {
	  color: black;
	}

See Also
       XSane (<http://xsane.org/>)

       Scan Tailor (<http://scantailor.org/>)

Author
       Jeffrey Ratcliffe (jffry	at posteo dot net)

Thanks to
       o   all the people who have sent	patches, translations, bugs and
	   feedback.

       o   the gtk+ project for	a most excellent graphics toolkit.

       o   the Gtk3-Perl project for their superb Perl bindings	for GTK3.

       o   The SANE project for	scanner	access

       o   Björn Lindqvist for the gtkimageview widget

       o   Sourceforge for hosting the project.

LICENSE	AND COPYRIGHT
       Copyright (C) 2006--2021	Jeffrey	Ratcliffe <jffry@posteo.net>

       This program is free software: you can redistribute it and/or modify it
       under the terms of the version 3	GNU General Public License as
       published by the	Free Software Foundation.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A	PARTICULAR PURPOSE.  See the GNU
       General Public License for more details.

       You should have received	a copy of the GNU General Public License along
       with this program.  If not, see <https://www.gnu.org/licenses/>.

perl v5.32.1			  2022-04-04			  GSCAN2PDF(1)
