Drupal Hosting with Apache Solr Search

The Search module in Drupal core provides a general framework for indexing and searching, but no search functionality of its own. Unfortunately, the core search module has several drawbacks that limit its utility:
  • Content search matches exact keywords only. This means that if someone searches for "quake", and a node contains "quakes", "quaking", or "earthquake", it will not be matched. In contrast, user name search always looks for substrings, so a search for "jo" also finds users called "mojo" and "josephine".
  • All of the Node content on your site will be indexed, whether you want it to be or not. For some sites, this may not be appropriate, as some content types are not ever meant to be displayed (or searched) on their own pages, but only on composite pages such as Views.
  • Only Node content on the site will be indexed. So if you have a module that produces content that is not Nodes, or pages that are composites of multiple nodes (such as Views), they will not be included in search results. Again, this may be a problem for some sites.
  • Profile fields are not searched in the User search, just the user names.
  • There is no faceted search capability.
  • If you have modified how the Node content on your site is displayed in the Theme, these modifications will not be indexed correctly. The content that is indexed is the default view of the node, not the view as rendered by the Theme.
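
The difference between exact keyword matching and substring matching described in the first point can be illustrated with a small Python sketch. This is a toy model, not Drupal's actual implementation; the function names and sample data are hypothetical and exist only for illustration.

```python
def exact_keyword_match(query, text):
    """Return True only if the query equals a whole word in the text.

    Toy stand-in for the content search behaviour described above.
    """
    words = text.lower().split()
    return query.lower() in words


def substring_match(query, text):
    """Return True if the query occurs anywhere inside the text.

    Toy stand-in for the user name search behaviour described above.
    """
    return query.lower() in text.lower()


# Hypothetical node body and user names, for illustration only.
node_body = "Reports of quakes after the earthquake"
usernames = ["mojo", "josephine", "alice"]

# "quake" is not a whole word in the body, so exact matching misses it.
print(exact_keyword_match("quake", node_body))
# "quakes" is a whole word, so it matches.
print(exact_keyword_match("quakes", node_body))
# Substring matching finds "jo" inside both "mojo" and "josephine".
print([name for name in usernames if substring_match("jo", name)])
```

A search engine with stemming, such as Solr, can be configured to narrow this gap by reducing "quakes" and "quaking" to a common root before matching.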

For big sites, with tens of thousands of nodes or more, the Drupal core search module is no longer suitable. To handle high search volumes, look to an external third-party search solution; the most popular open source option is Apache Solr.

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
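
Because Solr is queried over plain HTTP, a client in any language only needs to build a URL and read the response. The Python sketch below shows what a faceted keyword query might look like. The host, port, core name ("drupal"), and facet field ("bundle") are assumptions for illustration and will differ per installation. The parameters q, wt, rows, facet, and facet.field are standard Solr query parameters. The facet helper assumes Solr's flat JSON facet layout, where values and counts alternate in one list.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, keywords, facet_field=None, rows=10):
    """Build a Solr select URL for a keyword search with optional faceting.

    base_url and core are installation-specific (assumed values below).
    """
    params = [("q", keywords), ("wt", "json"), ("rows", str(rows))]
    if facet_field:
        # Ask Solr to count matching documents per value of facet_field.
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url}/{core}/select?{urlencode(params)}"


def parse_flat_facets(flat):
    """Turn a flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat[0::2], flat[1::2]))


# Assumed endpoint, core, and field names; adjust to the real server.
url = build_solr_query_url(
    "http://localhost:8983/solr", "drupal", "earthquake", facet_field="bundle"
)
print(url)

# Hypothetical facet data in the flat layout, not output from a real server.
sample_facets = ["article", 12, "page", 3]
print(parse_flat_facets(sample_facets))
```

The sketch only constructs the request and does not contact a server; sending it with any HTTP client and decoding the JSON body would complete the round trip.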

Solr search for Drupal installations is maintained on all our Drupal-optimized servers. You can therefore disable core search indexing in the Apache Solr Search module settings, which also reduces the load on MySQL.

Project webpage: http://lucene.apache.org/solr