Efficient joining the most recent record from another table in Entity Framework Core

asp.net asp.net-core entity-framework entity-framework-core linq

Question

I am comming to ASP .NET Core from PHP w/ MySQL.


The problem:

For the illustration, suppose the following two tables: T: {ID, Description, FK} and States: {ID, ID_T, Time, State}. There is 1:n relationship between them (ID_T references T.ID).

I need all the records from T with some specific value of FK (lets say 1) along with the related newest record in States (if any).


In terms of SQL it can be written as:

SELECT T.ID, T.Description, COALESCE(s.State, 0) AS 'State' FROM T
LEFT JOIN (
    SELECT ID_T, MAX(Time) AS 'Time'
    FROM States
    GROUP BY ID_T
) AS sub ON T.ID = sub.ID_T
LEFT JOIN States AS s ON T.ID = s.ID_T AND sub.Time = s.Time
WHERE FK = 1

I am struggling to write an efficient equivalent query in LINQ (or the fluent API). The best working solution I've got so far is:

from t in _context.T
where t.FK == 1
join s in _context.States on t.ID equals o.ID_T into _s
from s in _s.DefaultIfEmpty()
let x = new
{
    id = t.ID,
    time = s == null ? null : (DateTime?)s.Time,
    state = s == null ? false : s.State
}
group x by x.id into x
select x.OrderByDescending(g => g.time).First();

When I check the resulting SQL query in the output window when executed it is just like:

SELECT [t].[ID], [t].[Description], [t].[FK], [s].[ID], [s].[ID_T], [s].[Time], [s].[State]
FROM [T] AS [t]
LEFT JOIN [States] AS [s] ON [T].[ID] = [s].[ID_T]
WHERE [t].[FK] = 1
ORDER BY [t].[ID]

Not only it selects more columns than I need (in the real scheme there are more of them). There is no grouping in the query so I suppose it selects everything from the DB (and States is going to be huge) and the grouping/filtering is happening outside the DB.


The questions:

What would you do?

  • Is there an efficient query in LINQ / Fluent API?
  • If not, what workarounds can be used?
    • Raw SQL ruins the concept of abstracting from a specific DB technology and its use is very clunky in current Entity Framework Core (but maybe its the best solution).
    • To me, this looks like a good example for using a database view - again, not really supported by Entity Framework Core (but maybe its the best solution).
1
4
8/3/2017 2:23:17 PM

Accepted Answer

OK. Reading this article (almost a year old now), Smit's comment to the original question and other sources, it seems that EF Core is not really production ready yet. It is not able to translate grouping to SQL and therefore it is performed on the client side, which may be (and in my case would be) a serious problem. It corresponds to the observed behavior (the generated SQL query does no grouping and selects everything in all groups). Trying the LINQ queries out in Linqpad it always translates to a single SQL query.

I have downgraded to EF6 following this article. It required some changes in my model's code and some queries. After changing .First() to .FirstOrDefault() in my original LINQ query it works fine and translates to a single SQL query selecting only the needed columns. The generated query is much more complex than it is needed, though.

Using a query from NetMage's answer (after small fixes), it results in a SQL query almost identical to my own original SQL query (there's only a more complex construct than COALESCE).

var latestState = from s in _context.States
                  group s by s.ID_T into sg
                  select new { ID = sg.Key, Time = sg.Time.Max() };

var ans = from t in _context.T
          where t.FK == 1
          join sub in latestState on t.ID equals sub.ID into subj
          from sub in subj.DefaultIfEmpty()
          join s in _context.States
              on new { ID_T = t.ID, sub.Time } equals new { s.ID_T, s.Time }
              into sj
          from s in sj.DefaultIfEmpty()
          select new { t.ID, t.Description, State = (s == null ? false : s.State) };

In LINQ it's not as elegant as my original SQL query but semantically it's the same and it does more or less the same thing on the DB side.

In EF6 it is also much more convenient to use arbitrary raw SQL queries and AFAIK also the database views.

The biggest downside of this approach is that full .NET framework has to be targeted, EF6 is not compatible with .NET Core.

2
8/4/2017 1:00:58 PM

Popular Answer

What happens if you try to do a more straight forward translation to LINQ?

var latestState = from s in _context.States
                  group s by s.ID_T into sg
                  select new { ID_T = sg.Key, Time = sg.Time.Max() };

var ans = from t in _context.T
          where t.FK == 1
          join sub in latestState on t.ID equals sub.ID_T into subj
          from sub in subj.DefaultIfEmpty()
          join s in _context.States on new { t.ID, sub.Time } equals new { s.ID, s.Time } into sj
          from s in sj.DefaultIfEmpty()
          select new { t.ID, t.Description, State = (s == null ? 0 : s.State) };

Apparently the ?? operator will translate to COALESCE and may handle an empty table properly, so you could replace the select with:

          select new { t.ID, t.Description, State = s.State ?? 0 };


Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow