filter

The filter sample program demonstrates the advanced functionality of the Filter API. It is composed of the following files:

  • filter.c: command-line interface
  • filtersupport.c: core functionality, such as file filtering, stream filtering, metadata extraction, and format detection
  • filtersupport.h: structure and variable definitions

To run filter, type the following at the command line:

filter [options] input_file output_file

where:

options is one or more of the options listed in Options for the Filter Sample Program.

input_file is the full path and file name of the source document.

output_file is the full path and file name of the output file.

Options for the Filter Sample Program

Option Description
-m Extract metadata instead of filtering text. See Use the Metadata API.
-c Run Filter in the same process as the calling application (in process). See Run Filter In Process.
-e Run Filter in stream mode.
-h Extract headers and footers, as well as the body text.
-d Detect the file format instead of filtering text. This option displays information obtained through fpGetDocInfo(), combined with additional information from formats_description.tsv.
-L Enable error logging for the legacy out-of-process server. See Enable or Disable Legacy Out-of-Process Error Logging. Error logs are not generated when using the default out-of-process method, or when in-process filtering is enabled.
-LN Disable error logging for the legacy out-of-process server. See Enable or Disable Legacy Out-of-Process Error Logging. Error logs are not generated when using the default out-of-process method, or when in-process filtering is enabled.
-AF Include the input file name in an error log. See Report the File Name in Stream Mode.
-rm Extract text that was deleted from a document with revision tracking enabled, and include it in the filtered output. See Deleted Text.
-x xmlconfigfile Filter an XML file by using customized extraction settings defined in the kvxconfig.ini file. If you do not enter the full path to the INI file, the program looks for the file in the current working directory. See Filter XML Files for more information.

-z tempdirectory Specify a temporary directory where temporary files generated by the filtering process are stored. The default is the current working directory.
-ps password Specify a password to open a password-protected file. See Filter Password Protected Files.
-pdfauto Specify that PDF files are output in a logical reading order. The PDF filter determines the paragraph direction (left-to-right or right-to-left) for each PDF page, and then sets the direction accordingly. See Filter PDF Files.
-pdfltr Specify that PDF files are output in a logical reading order, and that the paragraph direction is left to right. See Filter PDF Files.
-pdfrtl Specify that PDF files are output in a logical reading order, and that the paragraph direction is right to left. See Filter PDF Files.
-pdfraw Specify that PDF files are output in an unstructured paragraph flow. This is the default option. If logical reading order is enabled, and you want to return to an unstructured paragraph flow, set this flag. See Filter PDF Files.
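
The options in the table can be illustrated with a minimal C sketch that maps a command line of the form filter [options] input_file output_file onto a settings structure. This is not the parsing code in filter.c. The FilterArgs structure, the PdfOrder enumeration, and the parse_filter_args() function are hypothetical names introduced only for this example. The sketch assumes that all options precede the two file names, as the syntax line shows, and that a later PDF option overrides an earlier one; the sample program may behave differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not the definitions in filtersupport.h. */
typedef enum { PDF_RAW = 0, PDF_AUTO, PDF_LTR, PDF_RTL } PdfOrder;

typedef struct {
    int metadata;           /* -m   */
    int in_process;         /* -c   */
    int stream_mode;        /* -e   */
    int headers_footers;    /* -h   */
    int detect_format;      /* -d   */
    int legacy_logging;     /* -L sets 1, -LN sets 0, -1 means not specified */
    int log_file_name;      /* -AF  */
    int deleted_text;       /* -rm  */
    PdfOrder pdf_order;     /* -pdfauto, -pdfltr, -pdfrtl, -pdfraw (default) */
    const char *xml_config; /* -x xmlconfigfile  */
    const char *temp_dir;   /* -z tempdirectory  */
    const char *password;   /* -ps password      */
    const char *input_file;
    const char *output_file;
} FilterArgs;

/* Returns 0 on success, or -1 if the command line is malformed. */
int parse_filter_args(int argc, char **argv, FilterArgs *out)
{
    static const FilterArgs defaults = {0};
    int i, last_opt;

    assert(argv != NULL && out != NULL);
    *out = defaults;
    out->legacy_logging = -1;
    out->pdf_order = PDF_RAW;

    if (argc < 3) return -1;   /* input_file and output_file are required */
    last_opt = argc - 2;       /* assumption: the last two arguments are the files */

    for (i = 1; i < last_opt; i++) {
        const char *a = argv[i];
        if      (strcmp(a, "-m") == 0)       out->metadata = 1;
        else if (strcmp(a, "-c") == 0)       out->in_process = 1;
        else if (strcmp(a, "-e") == 0)       out->stream_mode = 1;
        else if (strcmp(a, "-h") == 0)       out->headers_footers = 1;
        else if (strcmp(a, "-d") == 0)       out->detect_format = 1;
        else if (strcmp(a, "-L") == 0)       out->legacy_logging = 1;
        else if (strcmp(a, "-LN") == 0)      out->legacy_logging = 0;
        else if (strcmp(a, "-AF") == 0)      out->log_file_name = 1;
        else if (strcmp(a, "-rm") == 0)      out->deleted_text = 1;
        else if (strcmp(a, "-pdfauto") == 0) out->pdf_order = PDF_AUTO;
        else if (strcmp(a, "-pdfltr") == 0)  out->pdf_order = PDF_LTR;
        else if (strcmp(a, "-pdfrtl") == 0)  out->pdf_order = PDF_RTL;
        else if (strcmp(a, "-pdfraw") == 0)  out->pdf_order = PDF_RAW;
        else if (strcmp(a, "-x") == 0 || strcmp(a, "-z") == 0 ||
                 strcmp(a, "-ps") == 0) {
            /* These options consume the next argument as their value. */
            if (i + 1 >= last_opt) return -1;
            if (a[1] == 'x')      out->xml_config = argv[++i];
            else if (a[1] == 'z') out->temp_dir = argv[++i];
            else                  out->password = argv[++i];
        }
        else return -1;        /* unrecognized option */
    }

    out->input_file = argv[argc - 2];
    out->output_file = argv[argc - 1];
    return 0;
}
```

A caller would typically pass argc and argv from main() directly to parse_filter_args() and print the usage line when it returns -1. Tracking -L and -LN in a single three-state field keeps "not specified" distinct from "explicitly disabled", which matters because the table describes them as separate switches.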