I'm running code like this:
var somethings = db.Somethings.Select(s => new SomethingViewModel
{
Id = s.Id,
Name = s.Name,
IsActive = s.IsActive
SubSomethings = s.SubSomethings.Select(ss => new SubSomethingViewModel
{
Id = ss.Id,
Name = ss.Name,
IsActive = ss.IsActive
}).Where(wss => wss.IsActive)
}).Where(ws => ws.IsActive && (ws.SubSomethings.Any())) //remove elements if no SubSomethings
.ToList();
As you can see, this is a one to many relationship. There are a list of SubSomethings in a Something. If I take out the && (ws.SubSomethings.Any()), I get a very fast list returned.
But, I want to only include in the list the Somethings that have at least one SubSomething. I also tried the following and got the same horrible efficiency:
var somethings = db.Somethings.Select(s => new SomethingViewModel
{
Id = s.Id,
Name = s.Name,
IsActive = s.IsActive
SubSomethings = s.SubSomethings.Select(ss => new SubSomethingViewModel
{
Id = ss.Id,
Name = ss.Name,
IsActive = ss.IsActive
}).Where(wss => wss.IsActive)
}).Where(ws => ws.IsActive)
.ToList(); //this finishes very quickly
var somethings2 = somethings.Where(s => s.SubSomethings.Any()).ToList(); //This is where the code bogged down
How can I re-write my query to get the bogging down code to be much faster? One thing to note: This works just fine with one or two records. When I hit >8000 records, it takes at least four minutes.
Here is the Index I created on the SubSomething table for the foreign Key of SomethingId, which corresponds to Something.Id
CREATE NONCLUSTERED INDEX [IX_SubSomething_SomethingId] ON [dbo].[SubSomething]
(
[SomethingId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Here is the foreign key creation of SubSomething.SomethingId:
ALTER TABLE [dbo].[SubSomething] WITH CHECK ADD CONSTRAINT [FK_SubSomething_Something_SomethingId] FOREIGN KEY([SomethingId])
REFERENCES [dbo].[Something] ([Id])
GO
ALTER TABLE [dbo].[SubSomething] CHECK CONSTRAINT [FK_SubSomething_Something_SomethingId]
GO
EF Core is your problem. Currently it's known to execute N + 1 subqueries when the query contains sub collection projection.
The only way to workaround it and limit the whole thing to 2 SQL queries is to filter as much as you can, then load the whole entity set with sub entities in memory using eager loading, then switch to LINQ to Objects and do the final projection/filtering:
var somethings = db.Somethings
.Include(s => s.SubSomethings)
.Where(s => s.IsActive)
.AsEnumerable()
.Select(s => new SomethingViewModel
{
Id = s.Id,
Name = s.Name,
IsActive = s.IsActive,
SubSomethings = s.SubSomethings.Select(ss => new SubSomethingViewModel
{
Id = ss.Id,
Name = ss.Name,
IsActive = ss.IsActive
}).Where(wss => wss.IsActive).ToList()
})
.Where(s => s.SubSomethings.Any())
.ToList();
As Ivan Stoev said, EF Core is your problem, i share a alternative not using LINQ to SQL but that ill mutch better performance:
Use db.Database.SqlQuery("Query") to query the view and return the data:
var results = db.Database.SqlQuery< SomethingViewModel >("Select ColA,ColB,ColC from ViewSomething")
The object SomethingViewModel must be a strongly type with propreties matching to cols of the view.
I know that is not pretty and have the hardcoded query for the view, but Generally is the fastest performance you get using EF since you pass compiler.