<body>
<h3 id="fsadaptor">Deployment of File System Adaptor</h3>
<p>A single instance of the File System Adaptor lets the GSA index
a single UNC share. DFS is supported.
<h4>Requirements</h4>
<ul>
<li>GSA 7.2 or higher
<li>Java JRE 1.7 or higher installed on the computer that runs the adaptor
<li>File System Adaptor JAR executable
<li>Credentials for File System share to be indexed by GSA
<li>Microsoft Windows on the computer that runs the adaptor
</ul>
<h4>Configure GSA for Adaptor</h4>
<ol>
<li>Add the IP address of the computer that hosts the adaptor to the <b>List
of Trusted IP Addresses</b> on the GSA.
<p>In the GSA's Admin Console, go to <b>Content Sources &gt; Feeds</b>,
and scroll down to <b>List of Trusted IP Addresses</b>. Add the IP address
for the adaptor to the list.
<li>Add the URLs provided by the adaptor to the <b>Follow Patterns</b>
on the GSA.
<p>In the GSA's Admin Console, go to <b>Content Sources &gt; Web Crawl
&gt; Start and Block URLs</b>, and
scroll down to <b>Follow Patterns</b>.
Add an entry like <code>http://adaptor.example.com:5678/doc/</code>,
where <code>adaptor.example.com</code> is the hostname of the
machine that hosts the adaptor. By default, the adaptor runs on port 5678.
</ol>
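<p>As a minimal sketch of how the Follow Patterns entry is formed, the
following Java snippet assembles it from the adaptor's hostname and port.
The class and method names (<code>FollowPattern</code>,
<code>forAdaptor</code>) are hypothetical and are not part of the adaptor;
only the <code>/doc/</code> path and the default port 5678 come from the
steps above.

```java
// Hypothetical helper, not part of the adaptor: builds the Follow Patterns
// entry described above from the adaptor host and port.
final class FollowPattern {
  /** Default adaptor port, as stated in this document. */
  static final int DEFAULT_PORT = 5678;

  /** Returns the URL prefix to add to Follow Patterns on the GSA. */
  static String forAdaptor(String hostname, int port) {
    if (hostname == null || hostname.isEmpty()) {
      throw new IllegalArgumentException("hostname is required");
    }
    if (1 > port || port > 65535) {
      throw new IllegalArgumentException("port out of range: " + port);
    }
    return "http://" + hostname + ":" + port + "/doc/";
  }

  public static void main(String[] args) {
    System.out.println(forAdaptor("adaptor.example.com", DEFAULT_PORT));
  }
}
```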
<h4>Configure Adaptor</h4>
<ol>
<li>Create a file named <code>adaptor-config.properties</code> in the
directory that contains the adaptor binary.
<p>
Here is an example configuration (bold items are example values to be
replaced):
<pre>
gsa.hostname=<b>yourgsa.hostname.com</b>
filesystemadaptor.src=<b>\\\\host\\share</b>
</pre>
<p> Note: Backslashes must be escaped in this file. To represent a single
<code>\</code>, enter <code>\\</code>.
<p> Note: A DFS link can be given as
<code>filesystemadaptor.src=<b>\\\\host\\dfsnamespace\\link</b></code>
<br>
<li> Create a file named <code>logging.properties</code> in the same directory
that contains the adaptor binary:
<pre>
.level=INFO
handlers=java.util.logging.FileHandler,java.util.logging.ConsoleHandler
java.util.logging.FileHandler.formatter=com.google.enterprise.adaptor.CustomFormatter
java.util.logging.FileHandler.pattern=logs/adaptor.%g.log
java.util.logging.FileHandler.limit=10485760
java.util.logging.FileHandler.count=20
java.util.logging.ConsoleHandler.formatter=com.google.enterprise.adaptor.CustomFormatter
</pre>
<li><p>Create a directory named <code>logs</code> in the directory that contains <code>logging.properties</code>.
<li><p>Run the adaptor using:
<pre>java -Djava.util.logging.config.file=logging.properties -jar adaptor-fs-YYYYMMDD-withlib.jar</pre>
</ol>
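<p>The backslash note in the configuration steps can be checked with a short
Java sketch. It assumes the adaptor reads
<code>adaptor-config.properties</code> using standard
<code>java.util.Properties</code> escaping rules, which this document does
not state explicitly. The class name <code>BackslashEscapingDemo</code> is
hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustration only: shows how java.util.Properties unescapes backslashes.
// Assumes the adaptor config follows standard properties-file rules.
final class BackslashEscapingDemo {
  /** Parses properties-file text and returns the value stored under key. */
  static String parse(String fileText, String key) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(fileText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props.getProperty(key);
  }

  public static void main(String[] args) {
    // The file content is: filesystemadaptor.src=\\\\host\\share
    // Each backslash is doubled once more here because this is a Java
    // string literal.
    String fileText = "filesystemadaptor.src=\\\\\\\\host\\\\share\n";
    // The parsed value is the UNC path with single-escaped backslashes.
    System.out.println(parse(fileText, "filesystemadaptor.src"));
  }
}
```

<p>Under that assumption, the four leading backslashes in the file become
the two backslashes that begin a UNC path, and each doubled separator
becomes a single backslash.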
<!--
<h4>Running as service on Windows</h4>
<p>Example execution with jsvc:
<pre>jsvc -pidfile adaptor.pid -cp adaptor-fs-YYYYMMDD-withlib.jar com.google.enterprise.adaptor.Daemon com.google.enterprise.adaptor.fs.FsAdaptor</pre>
-->
<h4>Optional <code>adaptor-config.properties</code> fields</h4>
<dl>
<dt>
<code>filesystemadaptor.supportedAccounts</code>
</dt>
<dd>
Accounts listed in <code>supportedAccounts</code> are
included in ACLs regardless of whether they are builtin
accounts.
By default the value is:
<pre>
BUILTIN\\Administrators,\\Everyone,BUILTIN\\Users,
BUILTIN\\Guest,NT AUTHORITY\\INTERACTIVE,
NT AUTHORITY\\Authenticated Users
</pre>
</dd>
<dt>
<code>filesystemadaptor.builtinGroupPrefix</code>
</dt>
<dd>
Builtin accounts are excluded from the ACLs
that are pushed to the GSA. An account that starts with
this prefix is considered a builtin account and is
excluded from the ACLs, unless it is listed in
<code>filesystemadaptor.supportedAccounts</code>.
By default the value is:
<pre>
BUILTIN\\
</pre>
</dd>
<dt>
<code>adaptor.incrementalPollPeriodSecs</code>
</dt>
<dd>
Time between incremental crawls. Default value is 300 seconds.
</dd>
<dt>
<code>adaptor.namespace</code>
</dt>
<dd>
Namespace used for ACLs sent to the GSA. Defaults to "Default".
</dd>
</dl>
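<p>The interaction between <code>filesystemadaptor.supportedAccounts</code>
and <code>filesystemadaptor.builtinGroupPrefix</code> can be sketched as
follows. This is an illustration of the rule as described above, not the
adaptor's actual code: the class <code>AccountFilter</code> and method
<code>isIncluded</code> are hypothetical, and the exact, case-sensitive
string matching is an assumption.

```java
// Hypothetical sketch of the filtering rule described above. The adaptor's
// real implementation may differ, for example in how it handles case.
final class AccountFilter {
  private final String[] supportedAccounts;
  private final String builtinGroupPrefix;

  AccountFilter(String[] supportedAccounts, String builtinGroupPrefix) {
    this.supportedAccounts = supportedAccounts.clone();
    this.builtinGroupPrefix = builtinGroupPrefix;
  }

  /** Returns true if the account should appear in ACLs sent to the GSA. */
  boolean isIncluded(String account) {
    // Supported accounts are always kept, builtin or not.
    for (String supported : supportedAccounts) {
      if (supported.equals(account)) {
        return true;
      }
    }
    // Everything else is kept only if it is not a builtin account.
    return !account.startsWith(builtinGroupPrefix);
  }

  public static void main(String[] args) {
    // Values are shown after properties unescaping (single backslashes).
    AccountFilter filter = new AccountFilter(
        new String[] {"BUILTIN\\Administrators", "BUILTIN\\Users"},
        "BUILTIN\\");
    System.out.println(filter.isIncluded("BUILTIN\\Administrators"));
    System.out.println(filter.isIncluded("BUILTIN\\Backup Operators"));
    System.out.println(filter.isIncluded("EXAMPLE\\alice"));
  }
}
```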
<h3>Document Last Access Dates</h3>
<p>The adaptor attempts to restore the last access date of each document after
it reads the document content during a crawl. To restore the last access
date to its original value, the user account that the adaptor runs under
needs write permission for the documents. If the account has only read
permission, the last access date of each document changes as the adaptor
reads its content during a crawl.
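<p>A minimal sketch of this read-then-restore pattern, using the standard
<code>java.nio.file</code> API, is shown below. It is not the adaptor's
implementation: the class <code>LastAccessPreservingReader</code> and its
method are hypothetical, and ignoring a failed restore is an assumption
made to mirror the read-only behavior described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributeView;
import java.nio.file.attribute.FileTime;

// Hypothetical sketch, not the adaptor's code: reads a file and then tries
// to put the last access time back to what it was before the read.
final class LastAccessPreservingReader {
  static byte[] readPreservingLastAccess(Path file) {
    try {
      BasicFileAttributeView view =
          Files.getFileAttributeView(file, BasicFileAttributeView.class);
      // Remember the last access time before touching the content.
      FileTime originalAccess = view.readAttributes().lastAccessTime();
      byte[] content = Files.readAllBytes(file);
      try {
        // null leaves the last modified and creation times unchanged.
        view.setTimes(null, originalAccess, null);
      } catch (IOException e) {
        // Without write permission the restore fails, and the last access
        // time stays at the time of this read.
      }
      return content;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

<p>Calling <code>readPreservingLastAccess</code> with an account that has
write permission leaves the last access time unchanged. With a read-only
account, the content is still returned but the time is not restored.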
<h3>File System Adaptor ACL Overview</h3>
<p>ACLs for documents and folders are read, preserved and pushed to the Google
Search Appliance by the File System Adaptor for UNC and DFS UNC paths.
</p>
<p>The following images show the ACL inheritance used by the File System Adaptor.
The green and pink arrows signify inheritance. The dotted arrows show
optional inheritance, which depends on whether the item inherits permissions from
its parent or breaks inheritance and defines its own set of permissions.
</p>
<h4>Non-DFS ACL inheritance</h4>
<img src="non_dfs_acls.jpg" />
<h4>DFS ACL inheritance</h4>
<img src="dfs_acls.jpg" />
</body>