<body>
  <p>Easily provide repository data to a Google Search Appliance (GSA).

  <p> If you'd like to use a language other than Java or if you have command
  line programs that can provide repository access, see if {@link
  com.google.enterprise.adaptor.prebuilt.CommandLineAdaptor} fits your needs.
  </p>

  <h3>Basic GSA Setup</h3>
  <ol>
    <li>Add the IP address of the computer that hosts the adaptor to the <b>List
      of Trusted IP Addresses</b> on the GSA.
      <p>In the GSA's Admin Console, go to <b>Crawl and Index &gt; Feeds</b>,
      and scroll down to <b>List of Trusted IP Addresses</b>. Add the IP address
      for the adaptor to the list.</p>
    <li>Add the URLs provided by the adaptor to the <b>Follow and Crawl Only
      URLs with the Following Patterns</b> on the GSA.
      <p>In the Admin Console, go to <b>Crawl and Index &gt; Crawl URLs</b>, and
      scroll down to <b>Follow and Crawl Only URLs with the Following
      Patterns</b>. Add an entry like {@code hostname:port/} where {@code
      hostname} is the hostname of the machine that hosts the adaptor and {@code
      port} defaults to 5678 (read on to change port number).</p>
  </ol>

  <h3>Running the Adaptor Template, as an initial test</h3>
  <ol>
    <li>You should already have JDK 6 or higher installed and have obtained a
      plexi release. Specifically, you will need {@code adaptor.jar} and {@code
      adaptor-examples.jar}. If you are working from source code instead of a
      release, you can build the required jars by running:
      <pre>ant dist
cd dist</pre>
      The needed {@code adaptor.jar} and {@code adaptor-examples.jar} will be
      within the current directory.
    <li>Create an <code>adaptor-config.properties</code> text file in the
      current directory that looks like:
      <pre>gsa.hostname=mygsahostname</pre>
      <p>You should replace <code>mygsahostname</code> with the hostname or IP
      address of your GSA. This file can also hold other adaptor library
      settings, such as the server port and feed name:
      <pre>gsa.hostname=mygsahostname
server.port=6677
feed.name=mydocfeedtogsa</pre>
      <p>Later, if you have trouble with the adaptor library incorrectly
      auto-detecting your computer's hostname, then you may need to add a line
      like:
      <pre>server.hostname=yourcomputershostname</pre>
      <p>For a list and explanation of the available configuration options, see
        {@link com.google.enterprise.adaptor.Config}.
    <li>Start the Adaptor Template. For Windows:
      <pre>java -cp adaptor.jar;adaptor-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate</pre>
      For all other OSes:
      <pre>java -cp adaptor.jar:adaptor-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate</pre>
    <li> Ensure crawling is enabled on your GSA.
      <p>
      Go to <b>Status and Reports</b> and click <b>Resume Crawl</b> if the
      crawling system is currently paused.
      
    <li> Confirm things ran successfully.
      <p>
      In the GSA, go to <b>Crawl and Index &gt; Feeds</b>.
      In the <b>Current Feeds</b> section, you should see an entry named
      "adaptor_HOSTNAME_PORT" (the name can be changed by setting the
      <code>feed.name</code> configuration variable).
      <p>
      In the adaptor log, look for document ids being pushed and
      requests for document contents being served.
  </ol>

  <h3>Creating your own Adaptor</h3>
  <ol>
    <li>Review JavaDoc for {@link com.google.enterprise.adaptor.Adaptor}
      and {@link com.google.enterprise.adaptor.AbstractAdaptor}.
    <li>From {@code adaptor-src.zip}, copy {@code
      src/com/google/enterprise/adaptor/examples/AdaptorTemplate.java}
      into your own package under a new class name. You will need to update the
      package declaration and class name in the copy to match.
    <li>Compile, run, and verify the copied adaptor using your favorite IDE. You
      will only need {@code adaptor.jar} in your classpath.
    <li>Modify it further for your own repository.
    <li>Declare success for getting content from your custom repository to the
      GSA.
  </ol>

  <h3>Advanced</h3>
  <p>You can set configuration variables on the command line instead of in
    <code>adaptor-config.properties</code>. You can pass multiple arguments
    of the form "-Dconfigkey=configvalue". A value provided on the command
    line overrides both the default value and the value (if any) in the
    configuration file. For example (on Windows, separate the classpath
    entries with <code>;</code> instead of <code>:</code>):
    <pre>java -cp adaptor.jar:adaptor-examples.jar com.google.enterprise.adaptor.examples.AdaptorTemplate -Dgsa.hostname=mygsahostname -Dserver.port=6677</pre>
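  <p>The override order described above can be pictured with a small sketch.
    This is not the adaptor library's implementation; the class
    <code>ConfigPrecedenceSketch</code> and its <code>resolve</code> method are
    hypothetical names used only to illustrate the layering of defaults,
    configuration file values, and "-Dconfigkey=configvalue" arguments.</p>

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

/** Illustrative only: not part of the adaptor library. */
public class ConfigPrecedenceSketch {

  /** Merges three sources; each later source overrides the earlier ones. */
  public static Properties resolve(
      Properties defaults, Properties fromFile, String[] args) {
    Properties merged = new Properties();
    merged.putAll(defaults);
    // Values from the configuration file override the defaults.
    merged.putAll(fromFile);
    // Arguments of the form -Dkey=value override everything else.
    for (String arg : args) {
      int eq = arg.indexOf('=');
      if (arg.startsWith("-D") && eq != -1) {
        String key = arg.substring(2, eq);
        if (!key.isEmpty()) {
          merged.setProperty(key, arg.substring(eq + 1));
        }
      }
    }
    return merged;
  }

  public static void main(String[] unused) throws IOException {
    Properties defaults = new Properties();
    defaults.setProperty("server.port", "5678");

    // Stands in for the contents of adaptor-config.properties.
    Properties fromFile = new Properties();
    fromFile.load(new StringReader(
        "gsa.hostname=mygsahostname\nserver.port=6677\n"));

    Properties config =
        resolve(defaults, fromFile, new String[] {"-Dserver.port=7788"});
    System.out.println(config.getProperty("gsa.hostname")); // mygsahostname
    System.out.println(config.getProperty("server.port"));  // 7788
  }
}
```

  <p>In this sketch the file value 6677 replaces the default 5678, and the
    command-line value 7788 then replaces both.</p>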

  <h3>Enabling Security</h3>
  <p>Security is not enabled by default because it requires a fair amount
    of setup on both the GSA and the adaptor. The GSA needs a valid certificate
    for the hostname you use to access it (<code>gsa.hostname</code>). The
    default certificate it ships with was not issued for that hostname, so it
    cannot be valid and you need to generate a new one. Setting up security is
    required before users can access non-public documents directly from the
    adaptor.

  <h4>Creating Self-Signed Certificates</h4>
  <p>In the GSA's Admin Console, go to <b>Administration &gt; SSL Settings</b>.
    Under the <b>Create a New SSL Certificate</b> heading, change <b>Host
    Name</b> to the GSA's hostname, written exactly as the adaptor will use it.
    Then click <b>Create
    Self-Signed Certificate</b> and wait for the operation to complete.
    Then click <b>Install SSL Certificate</b> and wait for that operation
    to complete (about 1 minute).
    You now have a valid self-signed certificate, but it is not available to be
    trusted by the adaptor.

  <p>You need to get the GSA's freshly-created certificate to add it as a
    trusted host for the adaptor:
  <ul>
    <li><b>Using Firefox:</b> Navigate to the GSA's secure search:
      https://gsahostname/. You should see a warning page that says, "This
      Connection is Untrusted." This message is because the certificate is
      self-signed and not signed by a trusted Certificate Authority. Click "I
      Understand the Risks" and then "Add Exception." Wait until the "View..."
      button is clickable, then click it. Change to the "Details" tab and
      click "Export...". Save the certificate in your adaptor's directory with
      the name "gsa.crt". You can then hit "Close" and "Cancel" to close the
      dialog windows.
    <li><b>Using Chrome:</b> Navigate to the GSA's secure search:
      https://gsahostname/. You should see a warning page that says, "The
      site's security certificate is not trusted!" In the location bar, there
      should be a padlock with a red 'x' on it. Click the padlock and then
      click "Certificate Information." Change to the "Details" tab and click
      "Export...". Save the certificate in your adaptor's directory with the
      name "gsa.crt". You can then hit "Close" and "Cancel" to close the
      dialog windows.
    <li><b>Using OpenSSL:</b> Execute:
      <pre>openssl s_client -connect gsahostname:443 &lt; /dev/null</pre>
      Copy the section that begins with <code>-----BEGIN CERTIFICATE-----</code>
      and ends with <code>-----END CERTIFICATE-----</code> (including the BEGIN
      and END CERTIFICATE portions) into a new file. Save the file in your
      adaptor's directory with the name "gsa.crt".
  </ul>
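  <p>For the OpenSSL route, the copy step can be automated. The following is a
    minimal sketch, assuming the <code>openssl s_client</code> output is piped
    to standard input; the class name <code>PemExtractSketch</code> is
    hypothetical and not part of the adaptor library. It prints the first
    block delimited by the BEGIN and END CERTIFICATE lines, including those
    lines.</p>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

/** Illustrative only: extracts the first PEM certificate from text. */
public class PemExtractSketch {
  private static final String BEGIN = "-----BEGIN CERTIFICATE-----";
  private static final String END = "-----END CERTIFICATE-----";

  /** Returns the first certificate block, or null if none is found. */
  public static String extractFirstCertificate(String text) {
    int start = text.indexOf(BEGIN);
    if (start == -1) {
      return null;
    }
    int end = text.indexOf(END, start);
    if (end == -1) {
      return null;
    }
    // Keep the BEGIN and END marker lines, as the instructions require.
    return text.substring(start, end + END.length()) + "\n";
  }

  public static void main(String[] args) throws IOException {
    BufferedReader reader =
        new BufferedReader(new InputStreamReader(System.in, "US-ASCII"));
    StringBuilder text = new StringBuilder();
    String line;
    while ((line = reader.readLine()) != null) {
      text.append(line).append('\n');
    }
    String pem = extractFirstCertificate(text.toString());
    if (pem == null) {
      System.err.println("No certificate found in input");
      System.exit(1);
    }
    System.out.print(pem);
  }
}
```

  <p>After compiling the class, one possible invocation is:
  <pre>openssl s_client -connect gsahostname:443 &lt; /dev/null | java PemExtractSketch &gt; gsa.crt</pre>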

  <p>Now you should generate a self-signed certificate for the adaptor and
    export the newly created certificate. Within the adaptor's directory, you
    should run:
  <pre>keytool -genkeypair -keystore keys.jks -storepass changeit -keypass changeit -alias adaptor -keyalg RSA -validity 365</pre>
  <p>For "What is your first and last name?", you should enter the hostname of
    the adaptor's computer. You are free to answer the other questions however
    you wish (including not answering them). When you are happy with your
    answers, answer "yes" to "Is CN=yourcomputershostname, OU=... correct?"
  <p>Then, still in the adaptor's directory, you should run:
  <pre>keytool -exportcert -alias adaptor -keystore keys.jks -storepass changeit -keypass changeit -rfc -file adaptor.crt</pre>

  <p>Copy cacerts from Java to the adaptor's directory. For Windows:
  <pre>copy PATH\TO\JRE\lib\security\cacerts cacerts.jks</pre>
  <p>For all other OSes:
  <pre>cp PATH/TO/JRE/lib/security/cacerts cacerts.jks</pre>

  <p>To allow the adaptor to trust itself, execute:
  <pre>keytool -importcert -keystore cacerts.jks -storepass changeit -file adaptor.crt -alias adaptor</pre>
  <p>Answer "yes" to "Trust this certificate?"

  <h4>Exchanging Certificates</h4>
  <p>To allow the adaptor to trust the GSA, execute:
  <pre>keytool -importcert -keystore cacerts.jks -storepass changeit -file gsa.crt -alias gsa</pre>
  <p>Answer "yes" to "Trust this certificate?"

  <p>To allow the GSA to trust the adaptor, within the GSA's Admin Console, go
    to <b>Administration &gt; Certificate Authorities</b>. Click the <b>Choose
    File</b> button (this button could be called "Browse...") under the 
    <b>Add more Certificate Authorities</b> heading.
    Choose "adaptor.crt" in the adaptor's directory and click <b>Save
    Settings</b>.
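  <p>Before turning security on, it can help to confirm that both imports into
    <code>cacerts.jks</code> succeeded. The following is a minimal sketch using
    the standard <code>java.security.KeyStore</code> API; the class name
    <code>TrustStoreCheckSketch</code> is hypothetical, and the file name,
    password, and aliases are the ones used in the keytool commands above. It
    assumes the keystore can be read as type JKS.</p>

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.security.KeyStore;

/** Illustrative only: checks for trusted certificate entries by alias. */
public class TrustStoreCheckSketch {

  /** Returns true if the keystore holds a trusted certificate under alias. */
  public static boolean hasTrustedCert(
      String path, char[] password, String alias) throws Exception {
    KeyStore store = KeyStore.getInstance("JKS");
    InputStream in = new FileInputStream(path);
    try {
      store.load(in, password);
    } finally {
      in.close();
    }
    // Entries created by "keytool -importcert" are trusted certificate entries.
    return store.isCertificateEntry(alias);
  }

  public static void main(String[] args) throws Exception {
    char[] password = "changeit".toCharArray();
    for (String alias : new String[] {"adaptor", "gsa"}) {
      boolean present = hasTrustedCert("cacerts.jks", password, alias);
      System.out.println(alias + ": " + (present ? "trusted" : "missing"));
    }
  }
}
```

  <p>Run it from the adaptor's directory. An alias reported as missing
    suggests that the corresponding <code>keytool -importcert</code> step
    should be repeated.</p>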

  <h4>Flipping the Switch</h4>
  <p>Now that everything is prepared, you can flip the security switch with the
    adaptor by adding a line to your <code>adaptor-config.properties</code>:
  <pre>server.secure=true</pre>
  <p>The adaptor can now use the GSA's authentication configuration and will use
    HTTPS for all communication.</p>
  <p>Example command line for running in secure mode:
  <pre>
    java \
    -Djava.util.logging.config.file=src/logging.properties \
    -Djavax.net.ssl.keyStore=keys.jks \
    -Djavax.net.ssl.keyStoreType=jks \
    -Djavax.net.ssl.keyStorePassword=changeit \
    -Djavax.net.ssl.trustStore=cacerts.jks \
    -Djavax.net.ssl.trustStoreType=jks \
    -Djavax.net.ssl.trustStorePassword=changeit \
    -classpath 'adaptor.jar:adaptor-examples.jar' \
    com.google.enterprise.adaptor.examples.AdaptorWithCrawlTimeMetadataTemplate
  </pre>

  <h4>Enable Stricter Security (optional)</h4>
  <p>There are additional security options you can control on the GSA.
    You may want to try running an adaptor with
    <code>server.secure=true</code> before enabling these stricter features.
    Within the GSA's Admin Console, go to <b>Administration &gt; SSL
    Settings</b>.  There you can:<ul>
    <li> uncheck <b>Enable HTTP (non-SSL) access for Feedergate</b>.  With this
      field unchecked, Feedergate accepts only HTTPS communication.
      Adaptors send document ids to Feedergate.
    <li> check <b>Enable Client Certificate Authentication for Feedergate</b>.
    <li> check <b>Enable Server Certificate Authentication</b>. Note: Does not
      work at this time (Oct 4 2011).
    </ul>
    <p>
    Click <b>Save Setup</b> to save your changes.
    <p>
    Note: By using these settings you improve security, but also require
    all adaptors to be configured for security and have 
    <code>server.secure=true</code> in their configuration.


</body>
