Need advice on Lucene query optimization

I am working on a web search application using Lucene. Users of my site can search for jobs within a 100 mile radius of Boston, MA, or of any other location. In addition, I need to show search results sorted by "relevance" (i.e., the score returned by Lucene) in descending order.

I use a third-party API to retrieve all cities within a given radius of a city. This API returns about 864 cities within a 100 mile radius of Boston, Massachusetts.

I build the city/state Lucene query using the following logic, which is part of my BuildNearestCitiesQuery method. Here nearestCities is the hash table returned by the above API. It contains 864 cities, keyed by city name, with the state code stored in the value. finalQuery is a Lucene BooleanQuery object that contains the other search criteria entered by the user, such as skills, keywords, etc.

foreach (string city in nearestCities.Keys)
{
    BooleanQuery tempFinalQuery = finalQuery;
    cityStateQuery = new BooleanQuery();
    queryCity = queryParserCity.Parse(city);
    queryState = queryParserState.Parse(((string[])nearestCities[city])[1]);
    cityStateQuery.Add(queryCity, BooleanClause.Occur.MUST); // MUST is like an AND
    cityStateQuery.Add(queryState, BooleanClause.Occur.MUST);
}

nearestCityQuery.Add(cityStateQuery, BooleanClause.Occur.SHOULD); // SHOULD is like an OR

finalQuery.Add(nearestCityQuery, BooleanClause.Occur.MUST);

Then I pass the finalQuery object to Lucene's Search method to get all jobs within the 100 mile radius:

searcher.Search(finalQuery, collector);

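The problem is that the BuildNearestCitiesQuery method takes about 29 seconds on average to execute, which is obviously unacceptable for a website. I also found that the statements involving "Parse" take considerably longer to execute than the others.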

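The jobs for a given location change over time: a city could have 2 jobs (matching a particular search) today and none for the same search 3 days later. So I cannot use any "caching" here.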

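Is there any way I can optimize this logic, or for that matter my whole approach/algorithm for finding all jobs within 100 miles using Lucene?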

FYI, here is how I index the documents in Lucene:

doc.Add(new Field("jobId", job.JobID.ToString().Trim(), Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.Add(new Field("title", job.JobTitle.Trim(), Field.Store.YES, Field.Index.TOKENIZED));
doc.Add(new Field("description", job.JobDescription.Trim(), Field.Store.NO, Field.Index.TOKENIZED));
doc.Add(new Field("city", job.City.Trim(), Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
doc.Add(new Field("state", job.StateCode.Trim(), Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES));
doc.Add(new Field("citystate", job.City.Trim() + ", " + job.StateCode.Trim(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.YES));
doc.Add(new Field("datePosted", jobPostedDateTime, Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.Add(new Field("company", job.HiringCoName.Trim(), Field.Store.YES, Field.Index.TOKENIZED));
doc.Add(new Field("jobType", job.JobTypeID.ToString(), Field.Store.NO, Field.Index.UN_TOKENIZED, Field.TermVector.YES));
doc.Add(new Field("sector", job.SectorID.ToString(), Field.Store.NO, Field.Index.UN_TOKENIZED, Field.TermVector.YES));
doc.Add(new Field("showAllJobs", "yy", Field.Store.NO, Field.Index.UN_TOKENIZED));

! .

Janis

, tempFinalQuery , , , , , . ...

Parse, .
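On the cost of Parse: because the index in the question already has an un-tokenized "citystate" field, one possible way to avoid running the query parser twice per city is to build exact-match TermQuery clauses directly. The following is only a sketch under assumptions: the class and method names are hypothetical, it assumes the Lucene.Net 2.x API used in the question, and it assumes the city and state strings returned by the third-party API match the indexed "City, ST" values exactly (same casing, spacing and spelling).

```csharp
using System.Collections;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical helper, not part of the original code.
public static class NearestCitiesQueryBuilder
{
    // Builds one SHOULD clause per city against the un-tokenized "citystate"
    // field, so no QueryParser.Parse call is needed.
    public static BooleanQuery Build(Hashtable nearestCities)
    {
        BooleanQuery nearestCityQuery = new BooleanQuery();

        foreach (DictionaryEntry entry in nearestCities)
        {
            // Assumption: key is the city name and the state code sits at
            // index 1 of the value array, as in the question's code.
            string city = ((string)entry.Key).Trim();
            string stateCode = ((string[])entry.Value)[1].Trim();

            // Must match the indexed value: City + ", " + StateCode.
            Term term = new Term("citystate", city + ", " + stateCode);

            // SHOULD acts like an OR across all cities. The clause is added
            // inside the loop so that every city ends up in the query.
            nearestCityQuery.Add(new TermQuery(term), BooleanClause.Occur.SHOULD);
        }

        return nearestCityQuery;
    }
}
```

The result could then be attached to the user's criteria with finalQuery.Add(NearestCitiesQueryBuilder.Build(nearestCities), BooleanClause.Occur.MUST). With about 864 cities the query stays under Lucene's default limit of 1024 clauses per BooleanQuery; a larger radius might exceed it.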

, , ? , , .

, - . , ; , + , .

:

  • , lat/lon
  • , , /lon

, Perl Geo::Distance. closest source, SQL.
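A lat/lon based approach of the kind referred to above can be sketched roughly as follows: compute a bounding box around the search center as a cheap prefilter, then use the haversine formula for the exact great-circle distance. This is a minimal illustration, not a port of Geo::Distance; the class and method names are hypothetical, the Earth is treated as a sphere with a mean radius of 3958.8 miles, and the bounding box ignores the poles and the 180° meridian.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class GeoDistance
{
    private const double EarthRadiusMiles = 3958.8; // mean radius, spherical model

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }

    // Great-circle distance in miles between two lat/lon points (haversine).
    public static double HaversineMiles(double lat1, double lon1, double lat2, double lon2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLon = ToRadians(lon2 - lon1);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
        // Clamp to 1 to guard against rounding slightly above 1.
        return 2 * EarthRadiusMiles * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));
    }

    // Returns { minLat, maxLat, minLon, maxLon } in degrees for a box that
    // contains every point within radiusMiles of the center.
    // Assumption: the center is not near a pole or the 180° meridian.
    public static double[] BoundingBox(double lat, double lon, double radiusMiles)
    {
        double milesPerDegreeLat = EarthRadiusMiles * Math.PI / 180.0;
        double latDelta = radiusMiles / milesPerDegreeLat;
        double lonDelta = radiusMiles / (milesPerDegreeLat * Math.Cos(ToRadians(lat)));
        return new double[] { lat - latDelta, lat + latDelta, lon - lonDelta, lon + lonDelta };
    }
}
```

If each job document carried its own latitude and longitude, the box could serve as a range condition in Lucene or SQL, with HaversineMiles applied only to the candidates that fall inside it.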

, . , . ( , ). ​​

- . , . , Fluent NHibernate SQL Server 2008, . . - Lucene.

, " " SQL Server, Lucene?

, , .
