Tony Dillon's Blog: Comparison Between Solr And Sphinx Search Servers (Solr Vs Sphinx – Fight!)

In the past few weeks I've been implementing advanced search at Plaxo, working quite closely with the Solr enterprise search server. Today I saw this relatively detailed comparison between Solr and its main competitor, Sphinx (full credit goes to StackOverflow user mausch, who had been using Solr for the past 2 years). For those still confused, Solr and Sphinx are similar to MySQL FULLTEXT search or, for those even more confused, think Google (yeah, this is a bit of a stretch, I know).

Similarities

Both Solr and Sphinx cover the core requirements of a full-text search server. They're fast and designed to index and search large bodies of data efficiently.

Both have a long list of high-traffic sites using them. (Solr, Sphinx)

Both offer commercial support. (Solr, Sphinx)

Both offer client API bindings for several platforms/languages (Sphinx, Solr)

Both can be distributed to increase speed and capacity (Sphinx, Solr)

Here are some differences

Solr, being an Apache project, is Apache2-licensed. Sphinx is GPLv2. This means that if you ever need to embed or extend (not just "use") Sphinx in a commercial application, you'll have to buy a commercial license.

Solr is easily embeddable in Java applications.

Solr is built on top of Lucene, which is a proven technology over 7 years old with a huge user base (this is only a small part). Whenever Lucene gets a new feature or speedup, Solr gets it too. Many of the devs committing to Solr are also Lucene committers.

Sphinx integrates more tightly with RDBMSs, especially MySQL.

Solr can be integrated with Hadoop to build distributed applications.

Solr can be integrated with Nutch to quickly build a fully-fledged web search engine with a crawler.

Solr can index proprietary formats like Microsoft Word, PDF, etc. Sphinx can't.

Solr comes with a spell-checker out of the box.

Solr comes with facet support out of the box. Faceting in Sphinx takes more work.
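For readers new to the term, faceting means counting how many matching documents fall under each value of a field (for example, results per category). The sketch below computes those counts by hand in Python and then builds the request parameters that ask Solr to do the same work server-side. The documents, field names, and query are made up for illustration; `facet` and `facet.field` are the standard Solr request parameters.

```python
from collections import Counter
from urllib.parse import urlencode

# Hypothetical documents; the field names are assumptions for illustration.
docs = [
    {"id": 1, "title": "Solr in Action", "category": "books"},
    {"id": 2, "title": "Sphinx search guide", "category": "books"},
    {"id": 3, "title": "Search server poster", "category": "posters"},
]


def facet_counts(matches, field):
    """Count how many matching documents carry each value of `field`."""
    return Counter(doc[field] for doc in matches if field in doc)


# Naive keyword match standing in for a real full-text query.
matches = [d for d in docs if "search" in d["title"].lower()]
counts = facet_counts(matches, "category")

# Query-string parameters asking Solr to compute facet counts itself.
solr_params = urlencode(
    {"q": "title:search", "facet": "true", "facet.field": "category"}
)
```

With Solr the counting happens inside the search server, so the client only sends the extra parameters and reads the counts from the response.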

Sphinx doesn't allow partial index updates for field data.

In Sphinx, all document ids must be unique unsigned non-zero integer numbers. Solr doesn't even require a unique key for many operations, and unique keys can be either integers or strings.
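If your data is keyed by strings, the Sphinx constraint means you need some mapping from those keys to non-zero unsigned integers. A minimal sketch of one way to do that is below; the `DocIdRegistry` class is hypothetical, keeps its state in memory only, and in a real setup an auto-increment column in the database would more likely play this role.

```python
class DocIdRegistry:
    """Hypothetical helper: assigns each string key a unique integer id >= 1."""

    def __init__(self):
        self._by_key = {}
        self._by_id = {}

    def id_for(self, key):
        """Return the integer id for `key`, assigning the next free one if new."""
        if key not in self._by_key:
            doc_id = len(self._by_key) + 1  # starts at 1, so never zero
            self._by_key[key] = doc_id
            self._by_id[doc_id] = key
        return self._by_key[key]

    def key_for(self, doc_id):
        """Translate an integer id from search results back to the original key."""
        return self._by_id[doc_id]


registry = DocIdRegistry()
alice_id = registry.id_for("user:alice")
bob_id = registry.id_for("user:bob")
```

With Solr this translation step is unnecessary, since the string key can be used as the unique key directly.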

Solr supports field collapsing to avoid duplicating similar results. Sphinx doesn't seem to provide any feature like this.
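To make field collapsing concrete: given ranked results, only the best hit per distinct value of some field is kept, so one site or one product does not flood the result page. The Python sketch below illustrates the idea on the client side; it is not how Solr implements it, and the result records and the `site` and `score` field names are assumptions.

```python
def collapse(results, field):
    """Keep only the highest-scoring result for each distinct value of `field`."""
    best = {}
    for doc in results:
        key = doc[field]
        if key not in best or doc["score"] > best[key]["score"]:
            best[key] = doc
    # Return the survivors in descending score order.
    return sorted(best.values(), key=lambda d: d["score"], reverse=True)


# Hypothetical search results: two hits from the same site, one from another.
results = [
    {"id": "a1", "site": "example.com", "score": 9.1},
    {"id": "a2", "site": "example.com", "score": 8.7},
    {"id": "b1", "site": "example.org", "score": 7.5},
]
collapsed = collapse(results, "site")
```

Doing this in the search server rather than in the client matters because the client would otherwise have to over-fetch results to fill a page after collapsing.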

Conclusion

In my experience, Solr is very, very fast on the query side. It is also very powerful. The indexing side is very CPU- and memory-intensive, an unfortunate side effect of having such a feature-rich, fast application. Nevertheless, I highly recommend Solr.

For disclaimer purposes, I have not had much experience with Sphinx and, again, all credit for this comparison goes to mausch.

Tony Dillon's Blog

Sunday, 13 September 2009
