.netcore EF linq - this is a BUG? Very strange behavior

c# entity-framework entity-framework-core

Question

enter image description hereI have two table in sql. Document and User. Document have relation to User and I want to get users that I sent document recently.

I need to sort by the date document was sent and get unique (distinct) user with relation to this document

This is my linq queries

var recentClients = documentCaseRepository.Entities
            .Where(docCase => docCase.AssignedByAgentId == WC.UserContext.UserId)
            .OrderByDescending(userWithDate => userWithDate.LastUpdateDate)
            .Take(1000) // I need this because if I comment this line then EF generate completely different sql query. 
            .Select(doc => new { doc.AssignedToClient.Id, doc.AssignedToClient.FirstName, doc.AssignedToClient.LastName })                
            .Distinct()
            .Take(configuration.MaxRecentClientsResults)                
            .ToList();

and generated sql query is:

SELECT DISTINCT TOP(5) [t].*
FROM (
     SELECT TOP(1000) [docCase.AssignedToClient].[Id]
     FROM [DocumentCase] AS [docCase]
     INNER JOIN [User] AS [docCase.AssignedToClient] 
       ON ([docCase].[AssignedToClientId] = [docCase.AssignedToClient].[Id])
     WHERE [docCase].[AssignedByAgentId] = 3
     ORDER BY [docCase].[LastUpdateDate] DESC
) 
AS [t]

Every thing is correct for now. But if I delete this line

.Take(1000) // I need this because...

EF generated completely different query such as:

SELECT DISTINCT TOP(5) 
    [docCase.AssignedToClient].[Id]
FROM [DocumentCase] AS [docCase]
INNER JOIN [User] AS [docCase.AssignedToClient] 
    ON ([docCase].[AssignedToClientId] = [docCase.AssignedToClient].[Id])
WHERE [docCase].[AssignedByAgentId] = 3

My question is: why EF not generated orderby clause and subquery with distinct? This is a BUG EF or I'm doing something wrong? And what I must do to generate in linq this sql query ()

SELECT DISTINCT TOP 5 [t].*
FROM ( SELECT [docCase.AssignedToClient].[Id]
 FROM [DocumentCase] AS [docCase]
 INNER JOIN [User] AS [docCase.AssignedToClient] 
   ON [docCase].[AssignedToClientId] = [docCase.AssignedToClient].[Id]
 WHERE [docCase].[AssignedByAgentId] = 1
 ORDER BY [docCase].[LastUpdateDate] DESC
) AS [t]
1
2
9/27/2018 2:45:10 PM

Popular Answer

The reason for the difference in queries is that Distinct messes up result order. So when you first execute OrderBy and then Distinct, you can just es well not execute OrderBy, because this order is lost anyway. So EF can just optimize it away.

Calling Take in between causes the result set to be semantically different: You first order the items, take the first 1000 items of that order and then call Distinct on them.

What you can change in your query depends mainly on the result you want to achieve. Maybe you want to first make the result set distinct then order by date and finally take the amount of items. Other options are also thinkable based on your requirements.

2
9/27/2018 12:21:31 PM


Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow