| <body> |
| <h3 id="fsadaptor">Deployment of File System Adaptor</h3> |
| |
| <p>A single instance of the File System Adaptor lets the |
| GSA index a single UNC share. DFS is supported. |
| |
| <h4>Requirements</h4> |
| <ul> |
| <li>GSA 7.2 or higher |
| <li>Java JRE 1.7 update 6 or higher, installed on the computer that runs the adaptor |
| <li>File System Adaptor JAR executable |
| <li>Microsoft Windows on the computer that runs the adaptor |
| </ul> |
| |
| <h4>Configure GSA for Adaptor</h4> |
| <ol> |
| <li>Add the IP address of the computer that hosts the adaptor to the <b>List |
| of Trusted IP Addresses</b> on the GSA. |
| <p>In the GSA's Admin Console, go to <b>Content Sources > Feeds</b>, |
| and scroll down to <b>List of Trusted IP Addresses</b>. Add the IP address |
| for the adaptor to the list. |
| |
| <li>Add the URLs provided by the adaptor to the <b>Follow Patterns</b> |
| on the GSA. |
| <p>In the Admin Console, go to <b>Content Sources > Web Crawl |
| > Start and Block URLs</b>, and |
| scroll down to <b>Follow Patterns</b>. |
| Add an entry like <code>http://adaptor.example.com:5678/doc/ |
| </code> where <code>adaptor.example.com</code> is the hostname of the |
| machine that hosts the adaptor. By default the adaptor runs on port 5678. |
| </ol> |
| |
| <h4>Configure Adaptor</h4> |
| <ol> |
| <li>Create a file named <code>adaptor-config.properties</code> in the |
| directory that contains the adaptor binary. |
| <p> |
| Here is an example configuration (bold items are example values to be |
| replaced): |
| <pre> |
| gsa.hostname=<b>yourgsa.hostname.com</b> |
| filesystemadaptor.src=<b>\\\\host\\share</b> |
| </pre> |
| <p> Note: Backslashes must be entered as double backslashes. To |
| represent a single '\', enter '\\'. |
| <p> Note: DFS links can be specified as |
| <code>filesystemadaptor.src=<b>\\\\host\\dfsnamespace\\link</b></code> |
| <br> |
| |
| <li> Create a file named <code>logging.properties</code> in the directory |
| that contains the adaptor binary: |
| <pre> |
| .level=INFO |
| handlers=java.util.logging.FileHandler,java.util.logging.ConsoleHandler |
| java.util.logging.FileHandler.formatter=com.google.enterprise.adaptor.CustomFormatter |
| java.util.logging.FileHandler.pattern=logs/adaptor.%g.log |
| java.util.logging.FileHandler.limit=10485760 |
| java.util.logging.FileHandler.count=20 |
| java.util.logging.ConsoleHandler.formatter=com.google.enterprise.adaptor.CustomFormatter |
| </pre> |
| |
| <li><p>Create a directory named <code>logs</code> inside the directory that contains |
| the adaptor binary. |
| |
| <li><p>Run the adaptor using a command line like: |
| <pre>java -Djava.util.logging.config.file=logging.properties -jar adaptor-fs-YYYYMMDD-withlib.jar</pre> |
| </ol> |
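| <p>The backslash note in the configuration step above can be checked with a small |
| Java sketch. It assumes the adaptor parses <code>adaptor-config.properties</code> with |
| the standard Java properties rules (<code>java.util.Properties</code>); the class name |
| <code>BackslashEscapeDemo</code> and its <code>load</code> helper are hypothetical and |
| are not part of the adaptor. |

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the adaptor: shows how the standard
// Java properties parser treats doubled backslashes.
class BackslashEscapeDemo {
  /** Parses properties text and returns the value stored under the key. */
  static String load(String propertiesText, String key) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return props.getProperty(key);
  }

  public static void main(String[] args) {
    // Java source needs its own escaping. This literal is the file line:
    //   filesystemadaptor.src=\\\\host\\share
    String fileLine = "filesystemadaptor.src=\\\\\\\\host\\\\share";
    // The parser collapses each doubled backslash into a single one.
    System.out.println(load(fileLine, "filesystemadaptor.src"));
  }
}
```

| <p>Running the sketch prints <code>\\host\share</code>, the UNC path that the |
| example entry <code>\\\\host\\share</code> represents. |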
| |
| <h4>Running as a service on Windows</h4> |
| <p>Example service creation on Windows with <code>prunsrv</code> (Apache Commons Daemon): |
| <pre>prunsrv install adaptor-fs --StartPath="%CD%" ^ |
| --Classpath=adaptor-fs-YYYYMMDD-withlib.jar ^ |
| --StartMode=jvm --StartClass=com.google.enterprise.adaptor.Daemon ^ |
| --StartMethod=serviceStart --StartParams=com.google.enterprise.adaptor.fs.FsAdaptor ^ |
| --StopMode=jvm --StopClass=com.google.enterprise.adaptor.Daemon ^ |
| --StopMethod=serviceStop --StdOutput=stdout.log --StdError=stderr.log ^ |
| ++JvmOptions=-Djava.util.logging.config.file=logging.properties</pre> |
| |
| <p> Note: By default, the File System Adaptor service runs under the Windows Local System account. |
| This works in most cases, but it can cause problems when access to documents is |
| restricted through ACLs. |
| If the service cannot crawl documents because of ACL restrictions, use the |
| Service Control Manager to run the service as a user that has sufficient |
| access to crawl the documents. |
| |
| <h4>Optional <code>adaptor-config.properties</code> fields</h4> |
| <dl> |
| <dt> |
| <code>server.dashboardPort</code> |
| </dt> |
| <dd> |
| Port on which the adaptor serves a web page showing information |
| and diagnostics. Defaults to "5679". |
| </dd> |
| <br> |
| <dt> |
| <code>filesystemadaptor.supportedAccounts</code> |
| </dt> |
| <dd> |
| Accounts listed in <code>supportedAccounts</code> are |
| included in ACLs regardless of whether they are builtin |
| accounts. |
| By default the value is: |
| <pre> |
| BUILTIN\\Administrators,\\Everyone,BUILTIN\\Users, |
| BUILTIN\\Guest,NT AUTHORITY\\INTERACTIVE, |
| NT AUTHORITY\\Authenticated Users |
| </pre> |
| </dd> |
| <dt> |
| <code>filesystemadaptor.builtinGroupPrefix</code> |
| </dt> |
| <dd> |
| Builtin accounts are excluded from the ACLs |
| that are pushed to the GSA. An account whose name starts with |
| this prefix is considered a builtin account and is |
| excluded from the ACLs. |
| By default the value is: |
| <pre> |
| BUILTIN\\ |
| </pre> |
| </dd> |
| <dt> |
| <code>adaptor.incrementalPollPeriodSecs</code> |
| </dt> |
| <dd> |
| Time between incremental crawls. Default value is 300 seconds. |
| </dd> |
| <br> |
| <dt> |
| <code>adaptor.namespace</code> |
| </dt> |
| <dd> |
| Namespace used for ACLs sent to GSA. Defaults to "Default". |
| </dd> |
| </dl> |
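| <p>The interaction between <code>filesystemadaptor.supportedAccounts</code> and |
| <code>filesystemadaptor.builtinGroupPrefix</code> described above can be sketched in |
| Java as follows. This is an illustration of the documented rule only, not the adaptor's |
| actual code: the class <code>BuiltinAccountFilter</code> and its method names are |
| hypothetical, and exact string comparison of account names is an assumption. |

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the documented rule, not the adaptor's code.
class BuiltinAccountFilter {
  private final Set<String> supportedAccounts = new HashSet<>();
  private final String builtinPrefix;

  /**
   * @param supportedAccountsCsv comma-separated account names, as they read
   *     after the properties file has been parsed (single backslashes)
   * @param builtinPrefix prefix that marks an account as builtin
   */
  BuiltinAccountFilter(String supportedAccountsCsv, String builtinPrefix) {
    for (String account : supportedAccountsCsv.split(",")) {
      String trimmed = account.trim();
      if (!trimmed.isEmpty()) {
        supportedAccounts.add(trimmed);
      }
    }
    this.builtinPrefix = builtinPrefix;
  }

  /** Returns true if the account would be kept in the ACL sent to the GSA. */
  boolean isIncluded(String account) {
    // Supported accounts are always kept, builtin or not.
    if (supportedAccounts.contains(account)) {
      return true;
    }
    // Assumption: names are compared exactly, including case.
    return !account.startsWith(builtinPrefix);
  }

  public static void main(String[] args) {
    BuiltinAccountFilter filter = new BuiltinAccountFilter(
        "BUILTIN\\Administrators,BUILTIN\\Users", "BUILTIN\\");
    System.out.println(filter.isIncluded("BUILTIN\\Administrators"));
    System.out.println(filter.isIncluded("BUILTIN\\Backup Operators"));
    System.out.println(filter.isIncluded("EXAMPLE\\alice"));
  }
}
```

| <p>In this sketch, <code>BUILTIN\Administrators</code> is kept because it is listed as |
| supported, <code>BUILTIN\Backup Operators</code> is dropped because it matches the |
| prefix without being listed, and the hypothetical domain account |
| <code>EXAMPLE\alice</code> is kept because it does not match the prefix. |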
| |
| <br> |
| <br> |
| |
| <h3> Advanced Topics </h3> |
| |
| <h4>Not changing 'last access' of the documents on the share</h4> |
| <p>The adaptor attempts to restore each document's last access date after |
| it reads the document content during a crawl. To restore the last access |
| date to its original value, the user account that the adaptor runs under |
| needs write permission for the document. |
| If the account has only read permission, the last access date changes |
| each time the adaptor reads document content during a crawl. |
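| <p>The general technique of reading a file and then restoring its last access |
| time can be sketched with the standard <code>java.nio.file</code> API. This is a minimal |
| illustration under assumptions, not the adaptor's implementation: the class |
| <code>LastAccessPreservingReader</code> and its method are hypothetical names. |

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributeView;
import java.nio.file.attribute.FileTime;

// Hypothetical sketch, not the adaptor's code: read a file, then put its
// last access time back to the value it had before the read.
class LastAccessPreservingReader {
  static byte[] readPreservingLastAccess(Path file) throws IOException {
    BasicFileAttributeView view =
        Files.getFileAttributeView(file, BasicFileAttributeView.class);
    // Remember the last access time before touching the content.
    FileTime originalAccess = view.readAttributes().lastAccessTime();
    try {
      return Files.readAllBytes(file);
    } finally {
      // Arguments are (lastModified, lastAccess, create); null leaves a
      // timestamp unchanged. This call needs write permission on the file.
      view.setTimes(null, originalAccess, null);
    }
  }
}
```

| <p>If the account running this code lacks write permission, the |
| <code>setTimes</code> call fails and the last access time stays at whatever the read |
| left it, which mirrors the behavior described above. |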
| |
| <br> |
| <br> |
| |
| |
| <h3> Developer Topics </h3> |
| |
| <h4>File System Adaptor ACL Overview</h4> |
| |
| <p>ACLs for documents and folders are read, preserved and pushed to the Google |
| Search Appliance by the File System Adaptor for UNC and DFS UNC paths. |
| </p> |
| |
| <p>The following images show the ACL inheritance used by the File System Adaptor. |
| The green and pink arrows signify inheritance. The dotted arrows show |
| optional inheritance, which depends on whether the item inherits permissions from |
| its parent or breaks inheritance and defines its own set of permissions. |
| </p> |
| |
| <h4>Non-DFS ACL inheritance</h4> |
| <img src="non_dfs_acls.jpg" /> |
| |
| <h4>DFS ACL inheritance</h4> |
| <img src="dfs_acls.jpg" /> |
| |
| </body> |