What is the most efficient way to make many-to-many comparisons with LINQ in EF 4.1?

My database has the following tables:

  • Person
  • Post
  • InterestTag

There are many-to-many relationships between Person and InterestTag and between Post and InterestTag.

I need to execute a LINQ query in EF 4.1 that returns any post containing at least one interest tag that matches at least one interest tag associated with the given person.

Example

A person has the following interests:

  • Cars
  • Sport
  • Fitness

I need to return any post that is related to cars, sport or fitness.

What is the most efficient way to write this query in terms of performance?

Edit

I get an error when I use the approach from the answer below.

This compiles fine, but throws an error at runtime:

var matchingPosts = posts.Where(post => post.Topics.Any(postTopic => person.Interests.Contains(postTopic))); 

Error:

 Unable to create a constant value of type 'System.Collections.Generic.ICollection`1'. Only primitive types ('such as Int32, String, and Guid') are supported in this context. 

Any ideas how to fix this?

EDIT 2

So my classes are structured as such:

    public class Person
    {
        public int PersonID { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        // other properties of types string, int, DateTime, etc.
        public ICollection<InterestTag> InterestTags { get; set; }
    }

    public class Post
    {
        public int PostID { get; set; }
        public string Title { get; set; }
        public string Content { get; set; }
        // other properties of types string, int, DateTime, etc.
        public ICollection<InterestTag> InterestTags { get; set; }
    }

    public class InterestTag
    {
        public int InterestTagID { get; set; }
        public string InterestDescription { get; set; }
        public bool Active { get; set; }
        public ICollection<Person> Persons { get; set; }
        public ICollection<Post> Posts { get; set; }
    }

In my context class, I override OnModelCreating to set the names of my join tables:

    modelBuilder.Entity<Person>()
        .HasMany(u => u.InterestTags)
        .WithMany(t => t.Persons)
        .Map(m =>
        {
            m.MapLeftKey("PersonID");
            m.MapRightKey("InterestTagID");
            m.ToTable("PersonInterestTags");
        });

    modelBuilder.Entity<Post>()
        .HasMany(u => u.InterestTags)
        .WithMany(t => t.Posts)
        .Map(m =>
        {
            m.MapLeftKey("PostID");
            m.MapRightKey("InterestTagID");
            m.ToTable("PostInterestTags");
        });

In my query method, I return an IQueryable of Post and apply some filters, including the clause that this question is about.

    var person = personRepository.Get(x => x.PersonID == 5);
    var posts = postRepository.GetQueryable();

    // I have tried this and get the error above
    posts = posts.Where(x => x.InterestTags.Any(tag => person.InterestTags.Contains(tag)));
+7
3 answers

If you start with only a given personId (or userId), you can run this query in a single round trip, for example:

    var posts = context.Posts
        .Intersect(context.People
            .Where(p => p.Id == givenPersonId)
            .SelectMany(p => p.InterestTags.SelectMany(t => t.Posts)))
        .ToList();

This translates into an INTERSECT statement in SQL.

You can also do this in two round trips:

    var interestTagsOfPerson = context.People
        .Where(p => p.Id == givenPersonId)
        .Select(p => p.InterestTags.Select(t => t.Id))
        .SingleOrDefault();
    // Result is an IEnumerable<int> which contains the Id of the tags of this person

    var posts = context.Posts
        .Where(p => p.InterestTags.Any(t => interestTagsOfPerson.Contains(t.Id)))
        .ToList();
    // Contains translates into an IN clause in SQL

Using a list of primitive types in the second query (interestTagsOfPerson is a collection of int) also fixes the error mentioned in the Edit section of your question. You cannot use object references with Contains in LINQ to Entities, because EF does not know how to translate them into SQL.
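To make the key-based comparison concrete, here is a minimal, self-contained sketch. It reuses the class and property names from the question (`Post`, `InterestTag`, `InterestTagID`), but it runs against in-memory data through `AsQueryable()`, and the `PostFilter.WithAnyTag` helper is a hypothetical name introduced only for illustration. It shows the shape of the predicate; how EF 4.1 translates it against a real context is something to confirm with a profiler.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class InterestTag
{
    public int InterestTagID { get; set; }
}

public class Post
{
    public int PostID { get; set; }
    public ICollection<InterestTag> InterestTags { get; set; }
}

public static class PostFilter
{
    // Hypothetical helper: keeps posts that share at least one tag ID with tagIds.
    // Only ints are compared, never InterestTag object references.
    public static IQueryable<Post> WithAnyTag(IQueryable<Post> posts, IEnumerable<int> tagIds)
    {
        return posts.Where(p => p.InterestTags.Any(t => tagIds.Contains(t.InterestTagID)));
    }
}

public static class Program
{
    public static void Main()
    {
        var cars = new InterestTag { InterestTagID = 1 };
        var cooking = new InterestTag { InterestTagID = 4 };

        // In-memory stand-in for the Posts set (assumption: no database involved here).
        var posts = new List<Post>
        {
            new Post { PostID = 10, InterestTags = new List<InterestTag> { cars } },
            new Post { PostID = 11, InterestTags = new List<InterestTag> { cooking } },
        }.AsQueryable();

        // Tag IDs of the person, as they would come back from the first query.
        var personTagIds = new List<int> { 1, 2, 3 };

        foreach (var post in PostFilter.WithAnyTag(posts, personTagIds))
        {
            Console.WriteLine(post.PostID); // prints 10
        }
    }
}
```

Passing the IDs in as a plain collection of int keeps the predicate independent of how the person was loaded, so the same filter can be composed with the other filters applied to the IQueryable of Post.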

I don't know which of the two approaches is faster (SQL experts may have a better idea), but I would probably start by testing the first option. (I tested it a bit and it seemed to return the correct results, but this is the first time I've used INTERSECT.)

Edit

To give an idea of the generated SQL (taken from SQL Profiler):

The first query (with INTERSECT) creates this SQL query:

    SELECT
        [Intersect1].[Id] AS [C1],
        [Intersect1].[Name] AS [C2]
    FROM (
        SELECT
            [Extent1].[Id] AS [Id],
            [Extent1].[Name] AS [Name]
        FROM [dbo].[Posts] AS [Extent1]
        INTERSECT
        SELECT
            [Join1].[Id] AS [Id],
            [Join1].[Name] AS [Name]
        FROM [dbo].[PersonInterestTags] AS [Extent2]
        INNER JOIN (
            SELECT
                [Extent3].[TagId] AS [TagId],
                [Extent4].[Id] AS [Id],
                [Extent4].[Name] AS [Name]
            FROM [dbo].[PostInterestTags] AS [Extent3]
            INNER JOIN [dbo].[Posts] AS [Extent4]
                ON [Extent3].[PostId] = [Extent4].[Id]
        ) AS [Join1]
            ON [Extent2].[TagId] = [Join1].[TagId]
        WHERE 1 = [Extent2].[PersonId]
    ) AS [Intersect1]

The second option:

Query 1 (list of the person's tag IDs):

    SELECT
        [Project1].[Id] AS [Id],
        [Project1].[C1] AS [C1],
        [Project1].[TagId] AS [TagId]
    FROM (
        SELECT
            [Limit1].[Id] AS [Id],
            [Extent2].[TagId] AS [TagId],
            CASE WHEN ([Extent2].[PersonId] IS NULL)
                THEN CAST(NULL AS int)
                ELSE 1
            END AS [C1]
        FROM (
            SELECT TOP (2) [Extent1].[Id] AS [Id]
            FROM [dbo].[People] AS [Extent1]
            WHERE 1 = [Extent1].[Id]
        ) AS [Limit1]
        LEFT OUTER JOIN [dbo].[PersonInterestTags] AS [Extent2]
            ON [Limit1].[Id] = [Extent2].[PersonId]
    ) AS [Project1]
    ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC

Query 2 (the final posts):

    SELECT
        [Extent1].[Id] AS [Id],
        [Extent1].[Name] AS [Name]
    FROM [dbo].[Posts] AS [Extent1]
    WHERE EXISTS (
        SELECT 1 AS [C1]
        FROM [dbo].[PostInterestTags] AS [Extent2]
        WHERE ([Extent1].[Id] = [Extent2].[PostId])
          AND ([Extent2].[TagId] IN (1,2,3))
    )

In this example, query 1 returned (1,2,3), hence the (1,2,3) in the IN clause of query 2.

+3

Edit: I had to edit my post because I forgot about the many-to-many relationship between post and topic. Now it should work.

I can't tell you whether this is the most efficient way, but it is a way to do it with LINQ queries, so it should be very efficient:

 var matchingPosts = posts.Where(post => post.Topics.Any(postTopic => person.Interests.Contains(postTopic))); 

If you want to use parallel execution, you can change it as follows:

 var matchingPosts = posts.AsParallel().Where(post => post.Topics.Any(postTopic => person.Interests.Contains(postTopic))); 

Since you are using EF, you will need this query:

    var matchingPosts = from post in posts
                        where post.Topics.Any(topic => person.Interests.Contains(topic))
                        select post;
+3

How about this:

    context.Persons
        .Where(p => p.Name == "x")
        .SelectMany(p => p.Interests.SelectMany(i => i.Posts))
        .Distinct()
        .Take(10)
        .ToList();

Take() is there for performance and paging. You should never select all records: first, nobody will read a list of thousands of records, and second, the result set may grow in the future and the query will not scale.
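If paging beyond the first page is needed, a small sketch along these lines may help. The `Paging.Page` helper is a hypothetical name, and the example pages over in-memory integers rather than a real context. One point worth knowing: LINQ to Entities only supports Skip on sorted input, so the query has to be ordered (after Distinct) before Skip is applied, which is why the helper takes an `IOrderedQueryable<T>`.

```csharp
using System;
using System.Linq;

public static class Paging
{
    // Hypothetical helper: returns one zero-based page of an already ordered query.
    // Requiring IOrderedQueryable<T> forces the caller to call OrderBy first.
    public static IQueryable<T> Page<T>(IOrderedQueryable<T> source, int pageIndex, int pageSize)
    {
        return source.Skip(pageIndex * pageSize).Take(pageSize);
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for a query of post IDs (assumption: no database involved).
        var ids = Enumerable.Range(1, 25).AsQueryable().OrderBy(id => id);

        // Second page (index 1) with 10 items per page.
        Console.WriteLine(string.Join(",", Paging.Page(ids, 1, 10)));
        // prints 11,12,13,14,15,16,17,18,19,20
    }
}
```

A stable sort key, such as the primary key, keeps the pages consistent between requests.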

+3