ppdup.exe reference

 

Applies to: FAST Search Server 2010 for SharePoint

Use the ppdup tool to start a duplicate server, which provides centralized duplicate detection for the node scheduler/postprocess hosts.

Note

To use a command-line tool, verify that you meet the following minimum requirements: You are a member of the FASTSearchAdministrators local group on the computer where FAST Search Server 2010 for SharePoint is installed.

Syntax

<FASTSearchFolder>\bin\ppdup [options]

Parameters

Parameter Description

<FASTSearchFolder>

The path of the folder where you have installed FAST Search Server 2010 for SharePoint, for example C:\FASTSearch.

All options are optional.

ppdup options

Option Value Description

-h

Displays help information.

-v

Displays version information.

-I

<identifier>

Specifies a symbolic duplicate server identifier.

Use this option to assign a symbolic name to the duplicate server. This name is used when the state of the duplicate server is copied to a replica.

By default, this is read from the crawler generated node identifier in <FASTSearchFolder>\data\crawler\node_id.dat.

When running multiple duplicate servers on the same node, specify a different identifier for at least one of the duplicate servers.
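For example, to run two duplicate servers on the same node, you might start each one with a distinct identifier and port (the identifier names and port numbers below are illustrative):

<FASTSearchFolder>\bin\ppdup -I dupserver1 -P 14000

<FASTSearchFolder>\bin\ppdup -I dupserver2 -P 14001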

-P

[<address>:]<port number>

Specifies the port number, and optionally the interface address, on which the duplicate server listens for postprocess connections in a multiple node setup.

-r

<port number>

Specifies a replication service port number to enable replica mode for the duplicate server.

The duplicate server listens for incoming replication requests on the specified port number.

-R

<host name>:<port number>

Specifies the address of a replica duplicate server to which this server's state is replicated.

The host name must correspond to a server that is running a duplicate server started with the -r option and the specified port number.
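For example, if a replica is started on the host <host name> with:

<FASTSearchFolder>\bin\ppdup -r 14100

then a primary duplicate server can replicate its state to that replica as follows (the port numbers are illustrative):

<FASTSearchFolder>\bin\ppdup -P 14000 -R <host name>:14100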

-d

<path>

Specifies the current working data directory for the duplicate server.

Default: If the FASTSEARCH environment variable is set, the default path is <FASTSearchFolder>\data\crawler\ppdup; otherwise the default path is data.

-c

<cache size>

Specifies the database cache size or hash size, in megabytes.

With a storage format of hashlog (see the -S option), this value determines the size of the allocated memory hash. If the number of items stored in the hash exceeds the available capacity, the hash is automatically converted into a disk hash and resized, doubling in size at each overflow.

With a storage format of diskhashlog, <cache size> determines the initial size of the hash on disk. Each time the capacity is exceeded, the hash is resized in the same way.

When the storage format is gigabase, <cache size> specifies how much memory to reserve for database caches.

Note

This value is per crawl collection. If you use multiple crawl collections, each crawl collection allocates the specified amount of memory or disk space. If the duplicate server runs as both a primary and a replica, twice the resources are consumed.

Default: 64

-s

<stripes>

Specifies the number of stripes (separate files) used by the duplicate server databases.

Default: 1

-D

Enables direct I/O for the duplicate server's databases.

Use only if supported by your operating system.

-S

<database format>

Specifies a database storage format:

  • hashlog - This format first allocates a memory-based hash structure with a data log on disk. The size of the memory hash is specified by the -c option. If the hash overflows, it is automatically converted into a diskhashlog.

  • diskhashlog - This format resembles hashlog, but uses a disk-based hash structure.

  • gigabase - This format is a database structure on disk.

Default: hashlog
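For example, to start a duplicate server that uses the gigabase on-disk database with a larger cache and striped database files, you might combine these options (the port number and values are illustrative):

<FASTSearchFolder>\bin\ppdup -P 14000 -S gigabase -c 256 -s 4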

-N

Disables nightly compression of duplicate server databases.

-F

<file>

Specifies an XML file that contains the crawler global configuration. The file may contain default values for all command-line options.

Default: <FASTSearchFolder>\etc\CrawlerGlobalDefaults.xml

-T

Enables profiling.

Use for debugging only.

-t

Enables profiling using the hotshot module.

Use for debugging only.

-l

<log level>

Specifies the kind of information to log:

  • debug

  • verbose

  • info

  • warning

  • error
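For example, to start a duplicate server with verbose logging (the port number is illustrative):

<FASTSearchFolder>\bin\ppdup -P 14000 -l verbose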

Examples

The following example starts a duplicate server at port 14000:

<FASTSearchFolder>\bin\ppdup -P 14000

The following example starts a duplicate server acting as a replica at port 14100:

<FASTSearchFolder>\bin\ppdup -r 14100

To start a duplicate server at port 14000 that copies its databases to a replica listening at port 14100 on the host <host name>, follow this example:

<FASTSearchFolder>\bin\ppdup -P 14000 -R <host name>:14100