| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <!-- NewPage --> |
| <html lang="en"> |
| <head> |
| <title>Overview</title> |
| <link rel="stylesheet" type="text/css" href="stylesheet.css" title="Style"> |
| </head> |
| <body> |
| <script type="text/javascript"><!-- |
| if (location.href.indexOf('is-external=true') == -1) { |
| parent.document.title="Overview"; |
| } |
| //--> |
| </script> |
| <noscript> |
| <div>JavaScript is disabled on your browser.</div> |
| </noscript> |
| <!-- ========= START OF TOP NAVBAR ======= --> |
| <div class="topNav"><a name="navbar_top"> |
| <!-- --> |
| </a><a href="#skip-navbar_top" title="Skip navigation links"></a><a name="navbar_top_firstrow"> |
| <!-- --> |
| </a> |
| <ul class="navList" title="Navigation"> |
| <li class="navBarCell1Rev">Overview</li> |
| <li>Package</li> |
| <li>Class</li> |
| <li><a href="overview-tree.html">Tree</a></li> |
| <li><a href="deprecated-list.html">Deprecated</a></li> |
| <li><a href="index-all.html">Index</a></li> |
| <li><a href="help-doc.html">Help</a></li> |
| </ul> |
| </div> |
| <div class="subNav"> |
| <ul class="navList"> |
| <li>PREV</li> |
| <li>NEXT</li> |
| </ul> |
| <ul class="navList"> |
| <li><a href="index.html?overview-summary.html" target="_top">FRAMES</a></li> |
| <li><a href="overview-summary.html" target="_top">NO FRAMES</a></li> |
| </ul> |
| <ul class="navList" id="allclasses_navbar_top"> |
| <li><a href="allclasses-noframe.html">All Classes</a></li> |
| </ul> |
| <div> |
| <script type="text/javascript"><!-- |
| allClassesLink = document.getElementById("allclasses_navbar_top"); |
| if(window==top) { |
| allClassesLink.style.display = "block"; |
| } |
| else { |
| allClassesLink.style.display = "none"; |
| } |
| //--> |
| </script> |
| </div> |
| <a name="skip-navbar_top"> |
| <!-- --> |
| </a></div> |
| <!-- ========= END OF TOP NAVBAR ========= --> |
| <div class="header"> |
| <p class="subTitle"> |
| <div class="block">Easily provide repository data to a Google Search Appliance (GSA).</div> |
| </p> |
| <p>See: <a href="#overview_description">Description</a></p> |
| </div> |
| <div class="contentContainer"> |
| <table class="overviewSummary" border="0" cellpadding="3" cellspacing="0" summary="Packages table, listing packages, and an explanation"> |
| <caption><span>Packages</span><span class="tabEnd"> </span></caption> |
| <tr> |
| <th class="colFirst" scope="col">Package</th> |
| <th class="colLast" scope="col">Description</th> |
| </tr> |
| <tbody> |
| <tr class="altColor"> |
| <td class="colFirst"><a href="com/google/enterprise/adaptor/package-summary.html">com.google.enterprise.adaptor</a></td> |
| <td class="colLast"> |
| <div class="block">Adaptor interfaces and implementation.</div> |
| </td> |
| </tr> |
| <tr class="rowColor"> |
| <td class="colFirst"><a href="com/google/enterprise/adaptor/examples/package-summary.html">com.google.enterprise.adaptor.examples</a></td> |
| <td class="colLast"> </td> |
| </tr> |
| <tr class="altColor"> |
| <td class="colFirst"><a href="com/google/enterprise/adaptor/experimental/package-summary.html">com.google.enterprise.adaptor.experimental</a></td> |
| <td class="colLast"> </td> |
| </tr> |
| <tr class="rowColor"> |
| <td class="colFirst"><a href="com/google/enterprise/adaptor/prebuilt/package-summary.html">com.google.enterprise.adaptor.prebuilt</a></td> |
| <td class="colLast"> </td> |
| </tr> |
| </tbody> |
| </table> |
| </div> |
| <div class="footer"><a name="overview_description"> |
| <!-- --> |
| </a> |
| <p class="subTitle"> |
| <div class="block"><p>Easily provide repository data to a Google Search Appliance (GSA). |
| |
| <p> Note: If you would rather use a language other than Java, take a look |
| at <a href="com/google/enterprise/adaptor/prebuilt/CommandLineAdaptor.html" title="class in com.google.enterprise.adaptor.prebuilt"><code>CommandLineAdaptor</code></a>. |
| </p> |
| |
| <h1>Table Of Contents</h1> |
| <ul> |
| <li> <a href=#gsasetup>Basic GSA Setup </a></li> |
| <li> <a href=#runtempl>Running the Adaptor Template, as an initial test </a></li> |
| <li> <a href=#createown>Creating your own Adaptor </a></li> |
| <li> <a href=#testtip>Testing Tip </a></li> |
| <li> <a href=#admintip>Admin Tip </a></li> |
| <li> <a href=#service>Running as a Windows Service </a></li> |
| <li> <a href=#secure>Enabling Security </a></li> |
| </ul> |
| |
| <h1><a name=gsasetup>Basic GSA Setup </a></h1> |
| <ol> |
| <li>Add the IP address of the computer that hosts the adaptor to the <b>List |
| of Trusted IP Addresses</b> on the GSA. |
| <p>In the GSA's Admin Console, go to <b>Content Sources > Feeds</b>, |
| and scroll down to <b>List of Trusted IP Addresses</b>. Add the IP address |
| for the adaptor to the list.</p> |
| <li>Add the URLs provided by the adaptor to the <b>Follow Patterns</b> on the GSA. |
| <p>In the Admin console, go to <b>Content Sources > Web Crawl > Start and |
| Block URLs </b>, and scroll down to <b>Follow Patterns</b>. Add an entry like |
| <code>hostname:port/</code> where <code>hostname</code> is the hostname of the machine |
| that hosts the adaptor and <code>port</code> defaults to 5678 (read on to |
| change port number).</p> |
| </ol> |
| |
| <h1><a name=runtempl>Running the Adaptor Template, as an initial test </a></h1> |
| <ol> |
| <li>You should already have JDK 6 or higher installed and have downloaded a plexi |
| release (from https://code.google.com/p/plexi/). From the |
| downloaded release zip file, use the extracted adaptor jar |
| (e.g., <code>adaptor-20130612-withlib.jar</code>) and the extracted adaptor |
| examples jar (e.g., <code>examples/adaptor-20130612-examples.jar</code>). |
| If you are working from source code instead of a release, |
| you can build the required jars by running: |
| <pre>ant dist |
| cd dist</pre> |
| <p>The needed jars will be in a zip file |
| within the current directory (e.g., <code>adaptor-20130612-bin.zip</code> will contain |
| <code>adaptor-20130612-withlib.jar</code> and <code>examples/adaptor-20130612-examples.jar</code>). |
| </p> |
| <li>Create an <code>adaptor-config.properties</code> text file in the |
| current directory that looks like: |
| <pre>gsa.hostname=mygsahostname</pre> |
| <p>You should replace <code>mygsahostname</code> with the hostname or IP |
| address of your GSA. This file also lets you configure other aspects of the |
| adaptor library, such as the server port and feed name: |
| <pre>gsa.hostname=mygsahostname |
| server.port=6677 |
| feed.name=mydocfeedtogsa</pre> |
| <p>Later, if you have trouble with the adaptor library incorrectly |
| auto-detecting your computer's hostname, then you may need to add a line |
| like: |
| <pre>server.hostname=yourcomputershostname</pre> |
| <p>For a list and explanation of the available configuration options, see |
| <a href="com/google/enterprise/adaptor/Config.html" title="class in com.google.enterprise.adaptor"><code>Config</code></a>. |
| <li>Start the Adaptor Template. Note that the jar files you have may have a |
| different date in their names. For Windows: |
| <pre>java -cp adaptor-20130612-withlib.jar;examples/adaptor-20130612-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate</pre> |
| For all other OSes: |
| <pre>java -cp adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate</pre> |
| <li> Ensure crawling is enabled on your GSA. |
| <p> |
| Go to <b>Content Sources > Diagnostics > Crawl Status </b> |
| and click <b>Resume Crawl</b> if the crawling system is currently paused. |
| |
| <li> Confirm things ran successfully. |
| <p> |
| In the GSA's Admin Console, go to <b>Content Sources > Feeds</b>. |
| In the <b>Current Feeds</b> section, you should see an entry for an |
| "adaptor_HOSTNAME_PORT" feed (the name can be changed by setting the |
| <code>feed.name</code> configuration variable). |
| <p> |
| In the adaptor log, look for document ids being pushed and |
| requests for document contents being served. |
| </ol> |
| |
| <h1><a name=createown>Creating your own Adaptor </a></h1> |
| <ol> |
| <li>Review JavaDoc for <a href="com/google/enterprise/adaptor/Adaptor.html" title="interface in com.google.enterprise.adaptor"><code>Adaptor</code></a> |
| and <a href="com/google/enterprise/adaptor/AbstractAdaptor.html" title="class in com.google.enterprise.adaptor"><code>AbstractAdaptor</code></a>. |
| <li>From the zip file (e.g., <code>adaptor-20130612-src.zip</code>), |
| copy <code>src/com/google/enterprise/adaptor/examples/AdaptorTemplate.java</code> |
| into your own package under a new class name. You will need to update the |
| package declaration and class name in the copy to match. |
| <li>Compile, run, and verify the copied adaptor using your favorite IDE. You |
| only need <code>adaptor-20130612-withlib.jar</code> on your classpath. |
| Note that the date in the jar name may be different. |
| <li>Modify it further for your own repository. |
| <li>Declare success for getting content from your custom repository to the |
| GSA. |
| </ol> |
| |
| <h1><a name=testtip>Testing Tip </a></h1> |
| <p>An adaptor, by default, will deny all document accesses, except from the |
| GSA. To allow debugging and testing an adaptor without a GSA, you can add a |
| hostname to the <code>server.fullAccessHosts</code> config key to allow that |
| computer full access to all adaptor content. In addition, this setting |
| allows that computer to see metadata and other GSA-specific information as |
| HTTP headers. This can be very useful when combined with Firebug or the Web |
| Inspector in your browser to observe an Adaptor's behavior. |
| |
| <h1><a name=admintip>Admin Tip </a></h1> |
| <p>You can set configuration variables on the command line instead of in |
| <code>adaptor-config.properties</code>. You are allowed multiple arguments |
| of the form "-Dconfigkey=configvalue". A value provided on the command |
| line overrides both the default value and the value (if any) in the |
| configuration file. For example: |
| <pre>java -cp adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar |
| com.google.enterprise.adaptor.examples.AdaptorTemplate -Dgsa.hostname=mygsahostname |
| -Dserver.port=6677</pre> |
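<p>The precedence described above can be illustrated with a minimal sketch.
It is not the adaptor library's own parsing code: the class
<code>ConfigPrecedenceSketch</code> and its <code>resolve</code> method are
hypothetical names, and the sketch assumes only the standard
<code>java.util.Properties</code> class, whose chain of defaults gives the same
ordering (command line over configuration file over defaults).</p>

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical illustration of config precedence; not the adaptor library's code. */
public class ConfigPrecedenceSketch {
  /** Layers values: defaults, then file contents, then -Dkey=value arguments. */
  public static Properties resolve(Properties defaults, String fileContents,
      String[] args) throws IOException {
    // Values read from the configuration file fall back to the defaults.
    Properties fromFile = new Properties(defaults);
    fromFile.load(new StringReader(fileContents));
    // Command-line values fall back to the file values.
    Properties effective = new Properties(fromFile);
    for (String arg : args) {
      if (arg.startsWith("-D")) {
        int eq = arg.indexOf('=');
        if (eq > 2) {
          effective.setProperty(arg.substring(2, eq), arg.substring(eq + 1));
        }
      }
    }
    return effective;
  }

  public static void main(String[] args) throws IOException {
    Properties defaults = new Properties();
    defaults.setProperty("server.port", "5678");
    String file = "gsa.hostname=mygsahostname\nserver.port=6677\n";
    Properties p = resolve(defaults, file, new String[] {"-Dserver.port=7000"});
    System.out.println(p.getProperty("server.port"));   // prints 7000
    System.out.println(p.getProperty("gsa.hostname"));  // prints mygsahostname
  }
}
```

<p>In this sketch the command-line value 7000 wins over both the file value 6677
and the default 5678, while <code>gsa.hostname</code> still comes from the file
because no argument overrides it.</p>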
| |
| <h1><a name=service>Running as a Windows Service </a></h1> |
| <p>Download and extract prunsrv.exe from the |
| <a href="http://www.us.apache.org/dist/commons/daemon/binaries/windows/"> |
| latest Windows binary download</a> of Apache Commons Daemon. If you are |
| running on 64-bit Windows and will use a 64-bit JVM, then you should use the |
| prunsrv.exe in the amd64/ directory. Place prunsrv.exe in the same directory |
| as the adaptor you would like to run as a service. |
| |
| <p>You can then register the service: |
| <pre>prunsrv install <b>someadaptor</b> --StartPath="%CD%" ^ |
| --Classpath=<b>someadaptor-withlib.jar</b> ^ |
| --StartMode=jvm --StartClass=com.google.enterprise.adaptor.Daemon ^ |
| --StartMethod=serviceStart --StartParams=<b>package.SomeAdaptor</b> ^ |
| --StopMode=jvm --StopClass=com.google.enterprise.adaptor.Daemon ^ |
| --StopMethod=serviceStop --StdOutput=stdout.log --StdError=stderr.log ^ |
| ++JvmOptions=-Djava.util.logging.config.file=logging.properties ^ |
| --Startup=auto</pre> |
| |
| <p>Where <code>someadaptor</code> is a unique, arbitrary service name. |
| |
| <p>To start the service, use the Windows service management tool or run: |
| <pre>prunsrv start <b>someadaptor</b></pre> |
| |
| <p>Where <code>someadaptor</code> is the same service name used during |
| registration.</p> |
| |
| <h1><a name=secure>Enabling Security </a></h1> |
| <p>Security is not enabled by default because it requires a reasonable amount |
| of setup, on both the GSA and adaptor. The GSA needs a valid certificate for |
| the hostname you use to access it (<code>gsa.hostname</code>). The default |
| certificate it ships with cannot be valid for that hostname, so you need to |
| generate a new one. Setting up security is required before users can access non-public |
| documents directly from the adaptor. |
| |
| <h3>Creating Self-Signed Certificates</h3> |
| <p>In the GSA's Admin Console, go to <b>Administration > SSL Settings</b>. |
| Under the <b>Create a New SSL Certificate</b> heading, change <b>Host |
| Name</b> to the GSA's hostname, written exactly as the adaptor will use it. |
| Then click <b>Create |
| Self-Signed Certificate</b> and wait for the operation to complete. |
| Then click <b>Install SSL Certificate</b> and wait for that operation |
| to complete (about 1 minute). |
| You now have a valid self-signed certificate, but the adaptor does not yet |
| trust it. |
| |
| <p>You need to obtain the GSA's freshly created certificate so that the |
| adaptor can be configured to trust it: |
| <ul> |
| <li><b>Using Firefox:</b> Navigate to the GSA's secure search: |
| https://gsahostname/. You should see a warning page that says, "This |
| Connection is Untrusted." This message is because the certificate is |
| self-signed and not signed by a trusted Certificate Authority. Click "I |
| Understand the Risks" and then "Add Exception." Wait until the "View..." |
| button is clickable, then click it. Change to the "Details" tab and |
| click "Export...". Save the certificate in your adaptor's directory with |
| the name "gsa.crt". You can then hit "Close" and "Cancel" to close the |
| dialog windows. |
| <li><b>Using Chrome:</b> Navigate to the GSA's secure search: |
| https://gsahostname/. You should see a warning page that says, "The |
| site's security certificate is not trusted!" In the location bar, there |
| should be a padlock with a red 'x' on it. Click the padlock and then |
| click "Certificate Information." Change to the "Details" tab and click |
| "Export...". Save the certificate in your adaptor's directory with the |
| name "gsa.crt". You can then hit "Close" and "Cancel" to close the |
| dialog windows. |
| <li><b>Using OpenSSL:</b> Execute: |
| <pre>openssl s_client -connect gsahostname:443 < /dev/null</pre> |
| Copy the section that begins with <code>-----BEGIN CERTIFICATE-----</code> |
| and ends with <code>-----END CERTIFICATE-----</code> (including the BEGIN |
| and END CERTIFICATE portions) into a new file. Save the file in your |
| adaptor's directory with the name "gsa.crt". |
| </ul> |
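<p>For the OpenSSL route, the manual copy step can be sketched in code. This is
a minimal illustration under stated assumptions, not part of the adaptor
library: the class <code>PemExtractSketch</code> and its methods are
hypothetical names, and the sketch assumes the first certificate block in the
input is the GSA's own certificate.</p>

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper that pulls the first PEM certificate out of text. */
public class PemExtractSketch {
  static final String BEGIN = "-----BEGIN CERTIFICATE-----";
  static final String END = "-----END CERTIFICATE-----";

  /** Returns the first certificate block, markers included, or null if none. */
  public static String firstCertificate(String text) {
    int start = text.indexOf(BEGIN);
    if (start == -1) {
      return null;
    }
    int end = text.indexOf(END, start);
    if (end == -1) {
      return null;
    }
    return text.substring(start, end + END.length()) + "\n";
  }

  /** Reads the whole stream as ASCII text. */
  static String readAll(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buf = new byte[4096];
    int n;
    while ((n = in.read(buf)) != -1) {
      out.write(buf, 0, n);
    }
    return out.toString("US-ASCII");
  }

  public static void main(String[] args) throws IOException {
    // Expects the openssl s_client output on standard input.
    String cert = firstCertificate(readAll(System.in));
    if (cert == null) {
      System.err.println("No certificate found in input");
    } else {
      System.out.print(cert);
    }
  }
}
```

<p>Under these assumptions, piping the <code>openssl s_client</code> output into
<code>java PemExtractSketch</code> and redirecting standard output to
<code>gsa.crt</code> would save the certificate without manual copying.</p>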
| |
| <p>Now you should generate a self-signed certificate for the adaptor and |
| export the newly created certificate. Within the adaptor's directory, you |
| should run: |
| <pre>keytool -genkeypair -keystore keys.jks -storepass changeit -keypass changeit -alias adaptor -keyalg RSA -validity 365</pre> |
| <p>For "What is your first and last name?", you should enter the hostname of |
| the adaptor's computer. You are free to answer the other questions however |
| you wish (including not answering them). When you are happy with your |
| answers, answer "yes" to "Is CN=yourcomputershostname, OU=... correct?" |
| <p>Then, still in the adaptor's directory, you should run: |
| <pre>keytool -exportcert -alias adaptor -keystore keys.jks -storepass changeit -keypass changeit -rfc -file adaptor.crt</pre> |
| |
| <p>Copy cacerts from Java to the adaptor's directory. For Windows: |
| <pre>copy PATH\TO\JRE\lib\security\cacerts cacerts.jks</pre> |
| <p>For all other OSes: |
| <pre>cp PATH/TO/JRE/lib/security/cacerts cacerts.jks</pre> |
| |
| <p>To allow the adaptor to trust itself, execute: |
| <pre>keytool -importcert -keystore cacerts.jks -storepass changeit -file adaptor.crt -alias adaptor</pre> |
| <p>Answer "yes" to "Trust this certificate?" |
| |
| <h3>Exchanging Certificates</h3> |
| <p>To allow the adaptor to trust the GSA, execute: |
| <pre>keytool -importcert -keystore cacerts.jks -storepass changeit -file gsa.crt -alias gsa</pre> |
| <p>Answer "yes" to "Trust this certificate?" |
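<p>After both imports, it can be worth confirming that <code>cacerts.jks</code>
really holds the <code>adaptor</code> and <code>gsa</code> aliases. The
following is a minimal sketch using the standard
<code>java.security.KeyStore</code> API; the class
<code>TrustStoreCheckSketch</code> and its <code>hasAliases</code> method are
hypothetical names, and the file name, password, and aliases are the ones used
in the commands above.</p>

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;

/** Hypothetical check that a JKS trust store contains the expected aliases. */
public class TrustStoreCheckSketch {
  /** Returns true if the keystore at path has an entry for every given alias. */
  public static boolean hasAliases(String path, char[] password, String... aliases)
      throws Exception {
    KeyStore store = KeyStore.getInstance("JKS");
    InputStream in = new FileInputStream(path);
    try {
      store.load(in, password);
    } finally {
      in.close();
    }
    for (String alias : aliases) {
      if (!store.containsAlias(alias)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) throws Exception {
    // File name, password, and aliases match the keytool commands above.
    boolean ok = hasAliases("cacerts.jks", "changeit".toCharArray(), "adaptor", "gsa");
    System.out.println(ok ? "trust store has both certificates"
        : "a certificate is missing");
  }
}
```

<p>Running <code>keytool -list -keystore cacerts.jks -storepass changeit</code>
shows the same information from the command line.</p>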
| |
| <p>To allow the GSA to trust the adaptor, within the GSA's Admin Console, go |
| to <b>Administration > Certificate Authorities</b>. Click the <b>Choose |
| File</b> button (this button could be called "Browse...") under the |
| <b>Add more Certificate Authorities</b> heading. |
| Choose "adaptor.crt" in the adaptor's directory and click <b>Save |
| Settings</b>. |
| |
| <h3>Flipping the Switch</h3> |
| <p>Now that everything is prepared, you can flip the security switch with the |
| adaptor by adding a line to your <code>adaptor-config.properties</code>: |
| <pre>server.secure=true</pre> |
| <p>The adaptor can now use the GSA's authentication configuration and will use |
| HTTPS for all communication.</p> |
| <p> Example command line to run secure: |
| <pre> |
| java \ |
| -Djava.util.logging.config.file=src/logging.properties \ |
| -Djavax.net.ssl.keyStore=keys.jks \ |
| -Djavax.net.ssl.keyStoreType=jks \ |
| -Djavax.net.ssl.keyStorePassword=changeit \ |
| -Djavax.net.ssl.trustStore=cacerts.jks \ |
| -Djavax.net.ssl.trustStoreType=jks \ |
| -Djavax.net.ssl.trustStorePassword=changeit \ |
| -classpath 'adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar' \ |
| com.google.enterprise.adaptor.examples.AdaptorWithCrawlTimeMetadataTemplate |
| </pre> |
| |
| <h3>Enable Stricter Security (optional)</h3> |
| <p>There are additional security options you can control on the GSA. |
| You may want to try running an adaptor with <code>server.secure=true</code> before |
| enabling these stricter features. |
| Within the GSA's Admin Console, go to <b>Administration > SSL |
| Settings</b>. There you can:<ul> |
| <li> uncheck <b>Enable HTTP (non-SSL) access for Feedergate</b>. With this |
| field unchecked only HTTPS communications will be accepted by feedergate. |
| Adaptors send document ids to feedergate. |
| <li> check <b>Enable Client Certificate Authentication for Feedergate</b>. |
| <li> check <b>Enable Server Certificate Authentication</b>. Note: Does not |
| work at this time (Oct 4 2011). |
| </ul> |
| <p> |
| Click <b>Save Setup</b> to save your changes. |
| <p> |
| Note: By using these settings you improve security, but also require |
| all adaptors to be configured for security and have |
| <code>server.secure=true</code> in their configuration.</div> |
| </p> |
| </div> |
| </body> |
| </html> |