According to our requirements, we need to cover a large range of deployment scenarios, from single journal deployments of OJS (S1) all the way up to large system landscapes including integration of OJS search with arbitrary search applications (S4). Fortunately the large majority of the configuration described in this document is independent of the deployment scenario. This means that only very few parameters will differ for the recommended configuration of different deployment scenarios. More specifically we recommend two deployment options:
All OJS installation requirements apply unchanged. The following additional installation requirements must be met by the OJS server (embedded deployment) or the solr server (network deployment):
Both deployment options have in common that the solr client and configuration will be integrated into OJS as a generic plug-in. While the plug-in is disabled, the current OJS search function will work unchanged. Enabling the plug-in will switch to solr as a search back-end.
OJS plug-in code will be maintained within PKP’s official github repository. The already existing SWORD plug-in creates a precedent for the integration of 3rd-party software libraries through PKP’s plug-in mechanism. No Java software has been integrated into OJS by way of plug-ins so far. We therefore expect that a few additional integration techniques need to be developed.
Our proposed integration approach is described in the README.txt provided with the plug-in and will be summarized here.
For several reasons Java binaries for jetty or solr/Lucene should not be part of PKP’s default distribution:
An integration as a git subproject as in the case of SWORD also does not seem appropriate as jetty/solr do not use git for their projects and maintaining our own jetty/solr binary git release server would be relatively costly.
We rather recommend that users download jetty/solr binaries from their original sources unchanged and extract them into well documented destinations within the solr OJS plug-in. A preconfigured installation script can then take care of copying or linking binaries to their required locations.
We cannot define a precise prescription for installing solr in a network deployment as this will largely depend on the provider’s installation policy. Most providers will probably already have a preferred servlet container and may want to install and configure container and solr through OS-specific installation mechanisms.
Solr’s example server does not come preconfigured with security in mind. Solr itself does not provide any authentication or authorization mechanisms. Securing solr must mostly be done through the servlet container and by properly protecting the server solr runs on. The following recommendations should be followed:
As most providers operate in an Open Access scenario, we do not recommend access limitations to the search handler by default (except for the firewalling as described above). The default recommended configuration will expose the query interface to all users on the provider’s network who have HTTP access to the solr endpoint.
In order to limit access in a subscription based environment and reduce the amount of data to be transferred over the network, our custom search handler was configured with mandatory (“invariant”) query parameters limiting – among other things – the returned fields to the article ID field and search score. Further recommendations for subscription-based journals have been given in this document where appropriate.
The default solr deployment descriptor has been provided in plugins/generic/solr/embedded/solr/conf/solrconfig.xml. This descriptor is recommended for both, embedded and network deployments.
A default jetty configuration has been developed for the embedded scenario, see plugins/generic/solr/embedded/etc/*.*
Details of recommended solr and servlet container configurations for both scenarios will be given in the following sections.
The embedded deployment option will work for the large majority of OJS users. With a few easy and well-documented additional installation steps it is possible to transform every OJS server into a solr server that should be reasonably secure for the majority OJS users. We have laid out these steps in the README.txt that comes with the plug-in and will be displayed on the plug-in home page as long as no working solr server has been configured for the plug-in.
The embedded deployment works with a preconfigured Jetty server and solr binaries directly deployed to the special plug-in directory “plugins/generic/solr/lib”. It is sufficient to download and extract the binaries and execute an installation script to get up and running, both on Linux and Windows operating systems. We pre-package all solr configuration required for embedded deployment inside the plug-in. No additional manual configuration is usually required. Transitive data, i.e. the index and the spellchecking dictionary, will by default be saved to the files_dir configured in config.inc.php. We’ll create a “solr” sub-directory there for our purposes.
See plugins/generic/solr/embedded/bin/start.sh for further details of the configuration of the embedded scenario.
The configuration of the embedded scenario follows a “secure by default” approach. While we do recommend proper firewalling of the OJS server even in the embedded scenario, the default configuration will provide basic protection even with no firewall in place. We do this by binding the embedded Jetty server to the loopback device (127.0.0.1) which should prohibit external access to the server on most operating systems. The above comments about CSRF vulnerability of solr apply to the embedded deployment if users log into the OJS server and open a browser (or other client software with network access) there.
Even in the embedded scenario, jetty and solr will need to be upgraded from time to time, e.g. in case of security or performance updates. In this case the new versions can simply be extracted into “plugins/generic/solr/lib” following the instructions in README.txt.
In the embedded scenario, solr can be started from within OJS with a background exec() call of a start script running a daemonized version of jetty with proper start parameters. On Windows this will probably not work without additional installation steps to create a system service. We may alternatively try to work around this restriction by running Jetty within a “permanent” PHP background process (e.g.http://stackoverflow.com/questions/45953/php-execute-a-background-process). Whether this works has to be tested in practice. It doesn’t seem to be a very scalable and reliable option, though.
Alternatively the Linux or Windows shell solr start script provided in plugins/generic/solr/embedded/bin can always be executed directly on the OJS server.
In the embedded scenario, the privileges of the web server / PHP user are probably appropriate for the solr server too. This will be the default case when starting solr from within OJS. Users are free, though, to execute the start script manually with any other user as long as they make sure that that user has write permissions to the solr index files.
Analyzing search query logs is a great tool to optimize search. We do not recommend enabling query logging by default in the embedded scenario, though. Most users opting for the embedded scenario will not be able to interpret query logs, so these logs will just unnecessarily occupy disk space. The default configuration sets logging levels to obtain enough information on the console when users need support through the forum or other remote communication means.
The network deployment option enables large service providers to connect any OJS installation to a solr server running in an arbitrary servlet container (e.g. Jetty or Tomcat) deployed somewhere on the local network. We do not give specific installation instructions for solr servers deployed like this as these instructions depend on the provider’s individual OS and (in the case of Linux) OS distribution. We make sure, though, that providers can copy the solr configuration directory provided with the solr plug-in unchanged and plug it into their servlet container. This enables providers to get up-and-running with an OJS-compatible solr installation in very little time.
We’ll also recommend providing full step-by-step installation instructions for a well-known Linux distribution e.g. Debian/Ubuntu. This can then usually easily be adapted to other distributions as well.