<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- NewPage -->
<html lang="en">
<head>
<title>Overview</title>
<link rel="stylesheet" type="text/css" href="stylesheet.css" title="Style">
</head>
<body>
<script type="text/javascript"><!--
    if (location.href.indexOf('is-external=true') == -1) {
        parent.document.title="Overview";
    }
//-->
</script>
<noscript>
<div>JavaScript is disabled on your browser.</div>
</noscript>
<!-- ========= START OF TOP NAVBAR ======= -->
<div class="topNav"><a name="navbar_top">
<!--   -->
</a><a href="#skip-navbar_top" title="Skip navigation links"></a><a name="navbar_top_firstrow">
<!--   -->
</a>
<ul class="navList" title="Navigation">
<li class="navBarCell1Rev">Overview</li>
<li>Package</li>
<li>Class</li>
<li><a href="overview-tree.html">Tree</a></li>
<li><a href="deprecated-list.html">Deprecated</a></li>
<li><a href="index-all.html">Index</a></li>
<li><a href="help-doc.html">Help</a></li>
</ul>
</div>
<div class="subNav">
<ul class="navList">
<li>PREV</li>
<li>NEXT</li>
</ul>
<ul class="navList">
<li><a href="index.html?overview-summary.html" target="_top">FRAMES</a></li>
<li><a href="overview-summary.html" target="_top">NO FRAMES</a></li>
</ul>
<ul class="navList" id="allclasses_navbar_top">
<li><a href="allclasses-noframe.html">All Classes</a></li>
</ul>
<div>
<script type="text/javascript"><!--
  allClassesLink = document.getElementById("allclasses_navbar_top");
  if(window==top) {
    allClassesLink.style.display = "block";
  }
  else {
    allClassesLink.style.display = "none";
  }
  //-->
</script>
</div>
<a name="skip-navbar_top">
<!--   -->
</a></div>
<!-- ========= END OF TOP NAVBAR ========= -->
<div class="header">
<p class="subTitle">
<div class="block">Easily provide repository data to a Google Search Appliance (GSA).</div>
</p>
<p>See: <a href="#overview_description">Description</a></p>
</div>
<div class="contentContainer">
<table class="overviewSummary" border="0" cellpadding="3" cellspacing="0" summary="Packages table, listing packages, and an explanation">
<caption><span>Packages</span><span class="tabEnd">&nbsp;</span></caption>
<tr>
<th class="colFirst" scope="col">Package</th>
<th class="colLast" scope="col">Description</th>
</tr>
<tbody>
<tr class="altColor">
<td class="colFirst"><a href="com/google/enterprise/adaptor/package-summary.html">com.google.enterprise.adaptor</a></td>
<td class="colLast">
<div class="block">Adaptor interfaces and implementation.</div>
</td>
</tr>
<tr class="rowColor">
<td class="colFirst"><a href="com/google/enterprise/adaptor/examples/package-summary.html">com.google.enterprise.adaptor.examples</a></td>
<td class="colLast">&nbsp;</td>
</tr>
<tr class="altColor">
<td class="colFirst"><a href="com/google/enterprise/adaptor/experimental/package-summary.html">com.google.enterprise.adaptor.experimental</a></td>
<td class="colLast">&nbsp;</td>
</tr>
<tr class="rowColor">
<td class="colFirst"><a href="com/google/enterprise/adaptor/prebuilt/package-summary.html">com.google.enterprise.adaptor.prebuilt</a></td>
<td class="colLast">&nbsp;</td>
</tr>
</tbody>
</table>
</div>
<div class="footer"><a name="overview_description">
<!--   -->
</a>
<p class="subTitle">
<div class="block"><p>Easily provide repository data to a Google Search Appliance (GSA).

  <p> Note: If you would rather use a language other than Java, take a look
  at <a href="com/google/enterprise/adaptor/prebuilt/CommandLineAdaptor.html" title="class in com.google.enterprise.adaptor.prebuilt"><code>CommandLineAdaptor</code></a>.
  </p>

  <h1>Table Of Contents</h1>
  <ul>
  <li> <a href=#gsasetup>Basic GSA Setup </a></li>
  <li> <a href=#runtempl>Running the Adaptor Template, as an initial test </a></li>
  <li> <a href=#createown>Creating your own Adaptor </a></li>
  <li> <a href=#testtip>Testing Tip </a></li>
  <li> <a href=#admintip>Admin Tip </a></li>
  <li> <a href=#service>Running as a Windows Service </a></li>
  <li> <a href=#secure>Enabling Security </a></li>
  </ul>

  <h1><a name=gsasetup>Basic GSA Setup </a></h1>
  <ol>
    <li>Add the IP address of the computer that hosts the adaptor to the <b>List
      of Trusted IP Addresses</b> on the GSA.
      <p>In the GSA's Admin Console, go to <b>Content Sources &gt; Feeds</b>,
      and scroll down to <b>List of Trusted IP Addresses</b>. Add the IP address
      for the adaptor to the list.</p>
    <li>Add the URLs provided by the adaptor to the <b>Follow Patterns</b> on the GSA.
      <p>In the Admin Console, go to <b>Content Sources &gt; Web Crawl &gt; Start and
      Block URLs</b>, and scroll down to <b>Follow Patterns</b>. Add an entry like
      <code>hostname:port/</code>, where <code>hostname</code> is the hostname of the machine
      that hosts the adaptor and <code>port</code> is the adaptor's port, which defaults to
      5678 (see below for how to change the port number).</p>
  </ol>

  <h1><a name=runtempl>Running the Adaptor Template, as an initial test </a></h1>
  <ol>
    <li>You should already have JDK 6 or higher installed and a plexi
      release downloaded (from https://code.google.com/p/plexi/). From the
      release zip file, extract the adaptor jar
      (e.g. <code>adaptor-20130612-withlib.jar</code>) and the adaptor
      examples jar (e.g. <code>examples/adaptor-20130612-examples.jar</code>).
      If you are working from source code instead of a release, you can
      build the required jars by running:
      <pre>ant dist
cd dist</pre>
      <p>The needed jars will be in a zip file
      within the current directory (e.g. <code>adaptor-20130612-bin.zip</code> contains
      <code>adaptor-20130612-withlib.jar</code> and <code>examples/adaptor-20130612-examples.jar</code>).
      </p>
    <li>Create an <code>adaptor-config.properties</code> text file in the
      current directory that looks like:
      <pre>gsa.hostname=mygsahostname</pre>
      <p>Replace <code>mygsahostname</code> with the hostname or IP address
      of your GSA. The same file can hold other adaptor library settings,
      such as the server port and feed name:
      <pre>gsa.hostname=mygsahostname
server.port=6677
feed.name=mydocfeedtogsa</pre>
      <p>Later, if you have trouble with the adaptor library incorrectly
      auto-detecting your computer's hostname, then you may need to add a line
      like:
      <pre>server.hostname=yourcomputershostname</pre>
      <p>For a list and explanation of the available configuration options, see
        <a href="com/google/enterprise/adaptor/Config.html" title="class in com.google.enterprise.adaptor"><code>Config</code></a>.
    <li>Start the Adaptor Template. Note that the jar files you have may have a
      different date in their names. For Windows:
      <pre>java -cp adaptor-20130612-withlib.jar;examples/adaptor-20130612-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate</pre>
      For all other OSes:
      <pre>java -cp adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate</pre>
    <li> Ensure crawling is enabled on your GSA.
      <p>
      Go to <b>Content Sources &gt; Diagnostics &gt; Crawl Status</b>
      and click <b>Resume Crawl</b> if the crawling system is currently paused.
      
    <li> Confirm things ran successfully.
      <p>
      In the GSA, go to <b>Content Sources &gt; Feeds</b>.
      In the <b>Current Feeds</b> section, you should see an entry named
      "adaptor_HOSTNAME_PORT" (the name can be changed by setting the
      <code>feed.name</code> configuration variable).
      <p>
      In the adaptor log, look for document ids being pushed and
      requests for document contents being served.
  </ol>
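  <p>The configuration file from step 2 uses the standard Java properties
    format. As a minimal sketch (not the adaptor library's own loading code,
    which is documented under <code>Config</code>), the following reads such
    content with <code>java.util.Properties</code> and falls back to the
    default port of 5678 when <code>server.port</code> is absent. The class
    name <code>ConfigPreview</code> and its helper methods are hypothetical
    and exist only for illustration.</p>

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the adaptor library.
class ConfigPreview {
    // Parses properties-format text, such as the contents of
    // adaptor-config.properties.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns server.port, or 5678 (the default named in this guide) when unset.
    static int portOf(Properties props) {
        return Integer.parseInt(props.getProperty("server.port", "5678").trim());
    }

    public static void main(String[] args) {
        Properties props = parse("gsa.hostname=mygsahostname\nfeed.name=mydocfeedtogsa\n");
        System.out.println("GSA host: " + props.getProperty("gsa.hostname"));
        System.out.println("Port: " + portOf(props));
    }
}
```

  <p>Previewing a file this way can help catch a mistyped key before starting
    the adaptor, since a misspelled key is simply absent and the default
    applies.</p>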

  <h1><a name=createown>Creating your own Adaptor </a></h1>
  <ol>
    <li>Review JavaDoc for <a href="com/google/enterprise/adaptor/Adaptor.html" title="interface in com.google.enterprise.adaptor"><code>Adaptor</code></a>
      and <a href="com/google/enterprise/adaptor/AbstractAdaptor.html" title="class in com.google.enterprise.adaptor"><code>AbstractAdaptor</code></a>.
    <li>From the source zip file (e.g. <code>adaptor-20130612-src.zip</code>),
      copy <code>src/com/google/enterprise/adaptor/examples/AdaptorTemplate.java</code>
      into your own package under a new class name. Update the package
      declaration and class name in the copied file to match.
    <li>Compile, run, and verify the copied adaptor using your favorite IDE. You
      will only need <code>adaptor-20130612-withlib.jar</code> in your classpath.
      Note that the date may be different.
    <li>Modify it further for your own repository.
    <li>Declare success for getting content from your custom repository to the
      GSA.
  </ol>

  <h1><a name=testtip>Testing Tip </a></h1>
  <p>An adaptor, by default, will deny all document accesses, except from the
    GSA. To allow debugging and testing an adaptor without a GSA, you can add a
    hostname to the <code>server.fullAccessHosts</code> config key to allow that
    computer full access to all adaptor content. In addition, this setting
    allows that computer to see metadata and other GSA-specific information as
    HTTP headers. This can be very useful when combined with Firebug or the Web
    Inspector in your browser to observe an Adaptor's behavior.

  <h1><a name=admintip>Admin Tip </a></h1>
  <p>You can set configuration variables on the command line instead of in
    <code>adaptor-config.properties</code>. You are allowed multiple arguments
    of the form "-Dconfigkey=configvalue". When providing a value on the command
    line, it overrides the default value and the value (if any) in the
    configuration file. For example:
    <pre>java -cp adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar \
    com.google.enterprise.adaptor.examples.AdaptorTemplate -Dgsa.hostname=mygsahostname \
    -Dserver.port=6677</pre>
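  <p>The precedence described above (command line over configuration file
    over default) can be pictured with a short sketch. This is an illustration
    under stated assumptions, not the library's actual parsing code: the class
    <code>ConfigPrecedence</code> and method <code>resolve</code> are
    hypothetical, and the sample values are made up.</p>

```java
import java.util.Properties;

// Hypothetical illustration of the override order; not the library's code.
class ConfigPrecedence {
    // Layers values: defaults first, then the config file, then any
    // -Dconfigkey=configvalue arguments, so later layers win.
    static Properties resolve(Properties defaults, Properties fromFile, String[] args) {
        Properties effective = new Properties();
        effective.putAll(defaults);
        effective.putAll(fromFile);
        for (String arg : args) {
            if (!arg.startsWith("-D")) {
                continue; // not a configuration override
            }
            String[] keyValue = arg.substring(2).split("=", 2);
            if (keyValue.length == 2 && !keyValue[0].isEmpty()) {
                effective.setProperty(keyValue[0], keyValue[1]);
            }
        }
        return effective;
    }

    public static void main(String[] args) {
        Properties defaults = new Properties();
        defaults.setProperty("server.port", "5678");

        Properties fromFile = new Properties();
        fromFile.setProperty("gsa.hostname", "filehost");
        fromFile.setProperty("server.port", "6677");

        Properties effective = resolve(defaults, fromFile,
                new String[] {"-Dgsa.hostname=mygsahostname"});
        System.out.println(effective.getProperty("gsa.hostname")); // mygsahostname
        System.out.println(effective.getProperty("server.port"));  // 6677
    }
}
```

  <p>In this sketch the command-line hostname replaces the one from the file,
    while the port from the file replaces the default because no command-line
    value was given for it.</p>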

  <h1><a name=service>Running as a Windows Service </a></h1>
  <p>Download and extract prunsrv.exe from the
    <a href="http://www.us.apache.org/dist/commons/daemon/binaries/windows/">
    latest Windows binary download</a> of Apache Commons Daemon. If you are
    running on 64-bit Windows and will use a 64-bit JVM, then you should use the
    prunsrv.exe in the amd64/ directory. Place prunsrv.exe in the same directory
    as the adaptor you would like to run as a service.

  <p>You can then register the service:
  <pre>prunsrv install <b>someadaptor</b> --StartPath="%CD%" ^
  --Classpath=<b>someadaptor-withlib.jar</b> ^
  --StartMode=jvm --StartClass=com.google.enterprise.adaptor.Daemon ^
  --StartMethod=serviceStart --StartParams=<b>package.SomeAdaptor</b> ^
  --StopMode=jvm --StopClass=com.google.enterprise.adaptor.Daemon ^
  --StopMethod=serviceStop --StdOutput=stdout.log --StdError=stderr.log ^
  ++JvmOptions=-Djava.util.logging.config.file=logging.properties ^
  --Startup=auto</pre>

  <p>Where <code>someadaptor</code> is a unique, arbitrary service name.

  <p>To start the service, use the Windows service management tool or run:
  <pre>prunsrv start <b>someadaptor</b></pre>

  <p>Where <code>someadaptor</code> is the same service name used during
    registration.</p>

  <h1><a name=secure>Enabling Security </a></h1>
  <p>Security is not enabled by default because it requires a reasonable amount
    of setup, on both the GSA and adaptor. The GSA needs a valid certificate for
    the hostname you are accessing it with (<code>gsa.hostname</code>). The
    default certificate it ships with is therefore not valid, and you need to
    generate a new one. Setting up security is required before users can access non-public
    documents directly from the adaptor.

  <h3>Creating Self-Signed Certificates</h3>
  <p>In the GSA's Admin Console, go to <b>Administration &gt; SSL Settings</b>.
    Under the <b>Create a New SSL Certificate</b> heading, change <b>Host
    Name</b> to the GSA's hostname, written exactly as the adaptor will use it.
    Then click <b>Create
    Self-Signed Certificate</b> and wait for the operation to complete.
    Then click <b>Install SSL Certificate</b> and wait for that operation
    to complete (about 1 minute).
    You now have a valid self-signed certificate, but it is not available to be
    trusted by the adaptor.

  <p>You need to get the GSA's freshly-created certificate to add it as a
    trusted host for the adaptor:
  <ul>
    <li><b>Using Firefox:</b> Navigate to the GSA's secure search:
      https://gsahostname/. You should see a warning page that says, "This
      Connection is Untrusted." This message is because the certificate is
      self-signed and not signed by a trusted Certificate Authority. Click, "I
      Understand the Risks" and "Add Exception." Wait until the "View..."
      button is clickable, then click it. Change to the "Details" tab and
      click "Export...". Save the certificate in your adaptor's directory with
      the name "gsa.crt". You can then hit "Close" and "Cancel" to close the
      dialog windows.
    <li><b>Using Chrome:</b> Navigate to the GSA's secure search:
      https://gsahostname/. You should see a warning page that says, "The
      site's security certificate is not trusted!" In the location bar, there
      should be a padlock with a red 'x' on it. Click the padlock and then
      click "Certificate Information." Change to the "Details" tab and click
      "Export...". Save the certificate in your adaptor's directory with the
      name "gsa.crt". You can then hit "Close" and "Cancel" to close the
      dialog windows.
    <li><b>Using OpenSSL:</b> Execute:
      <pre>openssl s_client -connect gsahostname:443 &lt; /dev/null</pre>
      Copy the section that begins with <code>-----BEGIN CERTIFICATE-----</code>
      and ends with <code>-----END CERTIFICATE-----</code> (including the BEGIN
      and END CERTIFICATE portions) into a new file. Save the file in your
      adaptor's directory with the name "gsa.crt".
  </ul>

  <p>Now you should generate a self-signed certificate for the adaptor and
    export the newly created certificate. Within the adaptor's directory, you
    should run:
  <pre>keytool -genkeypair -keystore keys.jks -storepass changeit -keypass changeit -alias adaptor -keyalg RSA -validity 365</pre>
  <p>For "What is your first and last name?", you should enter the hostname of
    the adaptor's computer. You are free to answer the other questions however
    you wish (including not answering them). When you are happy with your
    answers, answer "yes" to "Is CN=yourcomputershostname, OU=... correct?"
  <p>Then, still in the adaptor's directory, you should run:
  <pre>keytool -exportcert -alias adaptor -keystore keys.jks -storepass changeit -keypass changeit -rfc -file adaptor.crt</pre>

  <p>Copy cacerts from Java to the adaptor's directory. For Windows:
  <pre>copy PATH\TO\JRE\lib\security\cacerts cacerts.jks</pre>
  <p>For all other OSes:
  <pre>cp PATH/TO/JRE/lib/security/cacerts cacerts.jks</pre>

  <p>To allow the adaptor to trust itself, execute:
  <pre>keytool -importcert -keystore cacerts.jks -storepass changeit -file adaptor.crt -alias adaptor</pre>
  <p>Answer "yes" to "Trust this certificate?"

  <h3>Exchanging Certificates</h3>
  <p>To allow the adaptor to trust the GSA, execute:
  <pre>keytool -importcert -keystore cacerts.jks -storepass changeit -file gsa.crt -alias gsa</pre>
  <p>Answer "yes" to "Trust this certificate?"
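  <p>After both imports, <code>cacerts.jks</code> should contain the aliases
    <code>adaptor</code> and <code>gsa</code> used in the keytool commands
    above. As a rough check, the following sketch opens the keystore with the
    standard <code>java.security.KeyStore</code> API and reports any alias
    that is missing. The class <code>TrustStoreCheck</code> and method
    <code>missingAliases</code> are hypothetical names, and the sketch assumes
    the file sits in the current directory with the password
    <code>changeit</code>, as in the commands above.</p>

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.security.KeyStoreException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the adaptor library.
class TrustStoreCheck {
    // Returns the aliases from 'expected' that the keystore does not contain.
    static List<String> missingAliases(KeyStore store, String... expected) {
        List<String> missing = new ArrayList<>();
        for (String alias : expected) {
            try {
                if (!store.containsAlias(alias)) {
                    missing.add(alias);
                }
            } catch (KeyStoreException e) {
                // Thrown when the keystore has not been loaded yet.
                throw new IllegalStateException("keystore not loaded", e);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        KeyStore store = KeyStore.getInstance("JKS");
        // Assumes cacerts.jks is in the current directory (see the steps above).
        try (InputStream in = new FileInputStream("cacerts.jks")) {
            store.load(in, "changeit".toCharArray());
        }
        System.out.println("Missing aliases: " + missingAliases(store, "adaptor", "gsa"));
    }
}
```

  <p>An empty list means both certificates were imported; otherwise, repeat
    the corresponding <code>keytool -importcert</code> step.</p>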

  <p>To allow the GSA to trust the adaptor, within the GSA's Admin Console, go
    to <b>Administration &gt; Certificate Authorities</b>. Click the <b>Choose
    File</b> button (this button could be called "Browse...") under the 
    <b>Add more Certificate Authorities</b> heading.
    Choose "adaptor.crt" in the adaptor's directory and click <b>Save
    Settings</b>.

  <h3>Flipping the Switch</h3>
  <p>Now that everything is prepared, you can flip the security switch with the
    adaptor by adding a line to your <code>adaptor-config.properties</code>:
  <pre>server.secure=true</pre>
  <p>The adaptor can now use the GSA's authentication configuration and will use
    HTTPS for all communication.</p>
  <p>Example command line for running the adaptor securely:
  <pre>
    java \
    -Djava.util.logging.config.file=src/logging.properties \
    -Djavax.net.ssl.keyStore=keys.jks \
    -Djavax.net.ssl.keyStoreType=jks \
    -Djavax.net.ssl.keyStorePassword=changeit \
    -Djavax.net.ssl.trustStore=cacerts.jks \
    -Djavax.net.ssl.trustStoreType=jks \
    -Djavax.net.ssl.trustStorePassword=changeit \
    -classpath 'adaptor-20130612-withlib.jar:examples/adaptor-20130612-examples.jar' \
    com.google.enterprise.adaptor.examples.AdaptorWithCrawlTimeMetadataTemplate
  </pre>
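  <p>The command line above relies on several <code>javax.net.ssl</code>
    system properties being passed to the JVM. As a small sanity check, the
    sketch below reports the first of those properties that is unset. The
    class <code>SslPropertyCheck</code> and method <code>firstMissing</code>
    are hypothetical; only the property names come from the example
    above.</p>

```java
import java.util.Properties;

// Hypothetical helper, not part of the adaptor library.
class SslPropertyCheck {
    // Keystore and truststore properties set by the example command line.
    static final String[] REQUIRED = {
        "javax.net.ssl.keyStore",
        "javax.net.ssl.keyStorePassword",
        "javax.net.ssl.trustStore",
        "javax.net.ssl.trustStorePassword",
    };

    // Returns the first required property that is unset or empty,
    // or null when all of them are present.
    static String firstMissing(Properties props) {
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.isEmpty()) {
                return key;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String missing = firstMissing(System.getProperties());
        System.out.println(missing == null
                ? "All SSL properties are set"
                : "Missing: " + missing);
    }
}
```

  <p>Running this class with the same <code>-D</code> options as the adaptor
    shows whether any of them was dropped, for example by shell quoting.</p>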

  <h3>Enable Stricter Security (optional)</h3>
  <p>There are additional security options you can control on the GSA.
    You may want to try running an adaptor with <code>server.secure</code> set before
    enabling these stricter features.
    Within the GSA's Admin Console, go to <b>Administration &gt; SSL
    Settings</b>.  There you can:<ul>
    <li> uncheck <b>Enable HTTP (non-SSL) access for Feedergate</b>.  With this
      field unchecked only HTTPS communications will be accepted by feedergate.
      Adaptors send document ids to feedergate.
    <li> check <b>Enable Client Certificate Authentication for Feedergate</b>.
    <li> check <b>Enable Server Certificate Authentication</b>. Note: Does not
      work at this time (Oct 4 2011).
    </ul>
    <p>
    Click <b>Save Setup</b> to save your changes.
    <p>
    Note: By using these settings you improve security, but also require
    all adaptors to be configured for security and have 
    <code>server.secure=true</code> in their configuration.</div>
</p>
</div>
<!-- ======= START OF BOTTOM NAVBAR ====== -->
<div class="bottomNav"><a name="navbar_bottom">
<!--   -->
</a><a href="#skip-navbar_bottom" title="Skip navigation links"></a><a name="navbar_bottom_firstrow">
<!--   -->
</a>
<ul class="navList" title="Navigation">
<li class="navBarCell1Rev">Overview</li>
<li>Package</li>
<li>Class</li>
<li><a href="overview-tree.html">Tree</a></li>
<li><a href="deprecated-list.html">Deprecated</a></li>
<li><a href="index-all.html">Index</a></li>
<li><a href="help-doc.html">Help</a></li>
</ul>
</div>
<div class="subNav">
<ul class="navList">
<li>PREV</li>
<li>NEXT</li>
</ul>
<ul class="navList">
<li><a href="index.html?overview-summary.html" target="_top">FRAMES</a></li>
<li><a href="overview-summary.html" target="_top">NO FRAMES</a></li>
</ul>
<ul class="navList" id="allclasses_navbar_bottom">
<li><a href="allclasses-noframe.html">All Classes</a></li>
</ul>
<div>
<script type="text/javascript"><!--
  allClassesLink = document.getElementById("allclasses_navbar_bottom");
  if(window==top) {
    allClassesLink.style.display = "block";
  }
  else {
    allClassesLink.style.display = "none";
  }
  //-->
</script>
</div>
<a name="skip-navbar_bottom">
<!--   -->
</a></div>
<!-- ======== END OF BOTTOM NAVBAR ======= -->
</body>
</html>
