2012-09-12

SearchBox is an add-on for providing full-text hosted search functionality powered by ElasticSearch.

SearchBox offers real-time search, bulk indexing, faceting, geo tagging, and many more features without the operational headache.

Simply put, SearchBox is “The easiest way to have a searchbox in your application.”

Installing the add-on

SearchBox can be installed to a Heroku application via the CLI:

Once SearchBox has been added, a SEARCHBOX_URL setting will be available in the app configuration. It contains the account name and API key needed to access the SearchBox indices service. This can be confirmed using the heroku config command.

After installing SearchBox, the application should be configured to fully integrate with the add-on.
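As a rough illustration of reading this setting from application code, the sketch below splits a connection URL into its parts with the Python standard library. The function name describe_searchbox_url and the sample URL used in the comment are hypothetical; the exact layout of the real SEARCHBOX_URL value (where the account name and API key appear) is not specified here, so inspect the value shown by heroku config before relying on any particular field.

```python
from urllib.parse import urlsplit


def describe_searchbox_url(url):
    """Split a connection URL into the parts a client library typically needs.

    Hypothetical helper: it makes no assumption about where SearchBox places
    the account name or API key, it only exposes every URL component.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,          # None when the URL has no explicit port
        "username": parts.username,  # None when the URL has no credentials
        "password": parts.password,
        "path": parts.path,
    }


# Typical use inside an application (the variable is set by the add-on):
#   import os
#   info = describe_searchbox_url(os.environ["SEARCHBOX_URL"])
```

Avoid logging the returned dictionary as-is, since whichever component carries the API key would end up in the logs.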

Using Tire with Rails 3.x

Tire is a Ruby client for the ElasticSearch search engine. It provides a Ruby-like API for fluent communication with the ElasticSearch server and blends with the ActiveModel class for convenient usage in Rails applications. It lets you delete and create indices, define mappings for them, and use the bulk API, and it presents an easy-to-use DSL for constructing queries. It has full ActiveRecord/ActiveModel compatibility, allowing you to index your models (incrementally upon saving, or in bulk) and to search and paginate the results.

A sample Rails application using the Tire library can be found on GitHub at https://github.com/searchbox-io/rails-sample.

Configuration

Ruby on Rails applications will need to add the following entry into their Gemfile.

Update application dependencies with bundler.

Configure Tire in config/application.rb or config/environments/production.rb

Search

Make your model searchable:

When you now save a record:

The included callbacks automatically add the document to a documents index, making the record searchable:

Tire has very detailed documentation on its GitHub page.

Using Haystack with Django

Haystack provides modular search for Django. It features a unified, familiar API that allows you to plug in different search backends without having to modify your code. Currently, Haystack 2.0.0-beta can be integrated with SearchBox.io ElasticSearch.

A sample Django application using Haystack can be found on GitHub at https://github.com/searchbox-io/django-haystack-sample.

Configuration

Under the hood, Haystack uses a fork of pyelasticsearch (a lightweight ElasticSearch client) to integrate with ElasticSearch.

Django applications will need to add the following entries to their requirements.txt:

or install via pip:

As with most Django applications, you should add Haystack to INSTALLED_APPS within your settings.py.

Add the Haystack connection settings to settings.py to integrate with SearchBox, and set a default index name.
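A minimal sketch of such a connection entry is shown below. It assumes the Haystack 2.x HAYSTACK_CONNECTIONS setting with its ElasticSearch engine and reads the URL from the SEARCHBOX_URL config variable. The helper name haystack_connections and the index name "documents" are illustrative choices, not values required by SearchBox.

```python
import os


def haystack_connections(environ=os.environ, index_name="documents"):
    """Build a HAYSTACK_CONNECTIONS dictionary that points at SearchBox.

    `index_name` is an illustrative default; use any name valid for your plan.
    Raises KeyError when SEARCHBOX_URL is not set, which makes a missing
    add-on configuration visible at startup.
    """
    return {
        "default": {
            "ENGINE": "haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine",
            "URL": environ["SEARCHBOX_URL"],
            "INDEX_NAME": index_name,
        }
    }


# In settings.py one would then write:
#   HAYSTACK_CONNECTIONS = haystack_connections()
```

Reading the URL from the environment keeps the API key out of version control and lets the same settings file work across Heroku apps.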

Creating SearchIndexes

SearchIndex objects are the way Haystack determines what data should be placed in the search index and handles the flow of data in. You can think of them as being similar to Django Models or Forms in that they are field-based and manipulate/store data.

To build a SearchIndex, subclass both indexes.RealTimeSearchIndex and indexes.Indexable, define the fields you want to store data with, and define a get_model method. We’ll create the following DocumentIndex to correspond to our Document model. This code generally goes in a search_indexes.py file within the app it applies to, which allows Haystack to pick it up automatically, though that location is not required. The DocumentIndex should look like:

Additionally, we’re providing use_template=True on the text field. This allows us to use a data template (rather than error-prone concatenation) to build the document the search engine will use in searching. You’ll need to create a new template inside your template directory called search/indexes/myapp/document_text.txt and place the following inside:

Also, to integrate Haystack with the Django admin, create search_sites.py inside your application:

Setup views

Add the SearchView To Your URLconf

Search template sample

With the default URL configuration, your search template should be placed under your template directory and named search/search.html.

Searching

With the default URL configuration, you search by making a GET request to /search with a parameter named q.

The Haystack home page is a great resource for additional documentation.
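The request described above can be sketched with the Python standard library as follows. The helper name build_search_url and the host in the comment are hypothetical, and the trailing slash after /search is an assumption about how the URLconf is mounted; adjust it to match your own routes.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query):
    """Return the GET URL for a search, with the term in the `q` parameter.

    Assumes the search view is mounted at /search/ (adjust if your URLconf
    differs). urlencode takes care of escaping spaces and special characters.
    """
    return base_url.rstrip("/") + "/search/?" + urlencode({"q": query})


# Example: fetch results with any HTTP client, e.g.
#   urllib.request.urlopen(build_search_url("http://localhost:8000", "heroku"))
```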

Using ElasticSearchClient with Node.js

elasticsearchclient is a lightweight ElasticSearch client for Node.js. It is actively developed and covers core modules of ElasticSearch.

A sample Node.js application can be found on GitHub at https://github.com/searchbox-io/node.js-sample.

Configuration

Add the elasticsearchclient dependency to your package.json file and use npm to install your dependencies.

Search

Create a search client:

Index a document:

Create a query and run the search:

SearchBox ElasticSearch dashboard

The SearchBox dashboard allows you to create, delete and edit access configurations of your indices and also gives basic statistical information.

The dashboard can be accessed via the CLI:

or by visiting the Heroku apps web interface and selecting the application in question. Select SearchBox from the Add-ons menu.

Migrating between plans

Application owners should carefully manage the migration timing to ensure proper application function during the migration process.

Use the heroku addons:upgrade command to migrate to a new plan.

Removing the add-on

SearchBox can be removed via the CLI.

This will destroy all associated data and cannot be undone!

Troubleshooting

SearchBox.io returns errors as JSON objects with a message property.

400 - {"message":"You have reached your maximum index count, upgrade your plan to add more documents!"}

400 - {"message":"You have reached your maximum storage size, upgrade your plan for more storage!"}

403 - {"message":"At least one of given indices does not exist!"}

403 - {"message":"Given api key is invalid!"}

409 - {"message":"Index can not be deleted via api."}

409 - {"message":"Index name is invalid. Index name should be between 3-16 characters and only letters, numbers and hyphens are allowed"}

409 - {"message":"An index with given name already exists"}

409 - {"message":"You have reached maximum index count for your current plan."}
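The sketch below shows one way client code might handle these responses: extracting the message property from an error body, and checking an index name against the rule quoted in the 409 message before sending a request. Both function names are hypothetical. The pattern accepts upper- and lowercase letters because the message only says "letters"; the service may be stricter.

```python
import json
import re

# Rule quoted in the 409 message: 3-16 characters, letters, numbers, hyphens.
# Accepting both cases is an assumption; the service may enforce more.
INDEX_NAME_RE = re.compile(r"[A-Za-z0-9-]{3,16}")


def is_valid_index_name(name):
    """Return True when `name` satisfies the documented naming rule."""
    return INDEX_NAME_RE.fullmatch(name) is not None


def error_message(body):
    """Return the `message` property of an error body, or None.

    None is returned when the body is not valid JSON, is not a JSON object,
    or has no `message` property.
    """
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return None
    if isinstance(payload, dict):
        return payload.get("message")
    return None
```

Validating the name locally avoids a round trip that would end in a 409, while error_message gives a single place to turn any failed response into a log line or user-facing text.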

API limitations

The index refresh interval is fixed at 1 second, and a refresh cannot be invoked via the API.

When creating an index, the following parameters are ignored:

store

translog

cache

refresh_interval

compound_format

term_index_interval

term_index_divisor

Additionally, all administrative features of ElasticSearch are restricted from the API. The following ElasticSearch resources cannot be called:

_cluster

_shutdown

_local

_primary

_aliases

_refresh

_gateway

_settings

_template

_nodes

_segments

_cache

Please take into account that SearchBox ElasticSearch is under heavy development and this list may evolve.
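A client can check a request against the list above before sending it. The sketch below is only an approximation: how SearchBox actually matches restricted resources is not documented here, so it assumes a restricted name counts when it appears as a path segment or as a query parameter value. The name restricted_tokens is hypothetical, and the set must be kept in sync by hand as the list evolves.

```python
from urllib.parse import parse_qsl, urlsplit

# Copied from the list of restricted resources above.
RESTRICTED = frozenset([
    "_cluster", "_shutdown", "_local", "_primary", "_aliases", "_refresh",
    "_gateway", "_settings", "_template", "_nodes", "_segments", "_cache",
])


def restricted_tokens(url_or_path):
    """Return the restricted names found in a request URL or path.

    Assumption: a name is considered used when it is a whole path segment
    or a whole query parameter value. An empty list means nothing matched.
    """
    parts = urlsplit(url_or_path)
    tokens = [segment for segment in parts.path.split("/") if segment]
    tokens += [value for _, value in parse_qsl(parts.query)]
    return [token for token in tokens if token in RESTRICTED]
```

For example, a call to /documents/_refresh would be flagged, while an ordinary /documents/_search request would not.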

Support

All SearchBox support and runtime issues should be submitted via one of the Heroku Support channels. Any non-support related issues or product feedback are welcome at SearchBox Support.

Additional resources

Official ElasticSearch page

ElasticSearch Q&A on Stack Overflow

Handy Slides at SlideShare

SearchBox Dev Center

SearchBox Support
