In my project I receive some data from an SPS all x seconds. Every y minutes I archive the current Data in a database so I'm able to show statistics.

The data I receive gets put in a model. Something like this but much more complex:

public class Data
    public DateTime ArchiveTime { get; set; }
    public float TempC { get; set; }
    public float CO2Percent { get; set; }

I have a repository for the database that returns all entries in a certain time span. See this code:

// Context is my DbContext for a SQLite db and Data is the DbSet<Data> on that
IQueryable<Data> GetDataBetween(DateTime from, DateTime to) => Context.Data.Where(d => (d.ArchiveTime >= from && d.ArchiveTime <= to));

As you can see this returns an IQueryable so I want to make use of the linq to entities functionality.
I believe it's called linq to entities but in case it isn't, I mean the functionality that converts expression trees to sql or whatever instead of just executing it in C#.

Since there is an indeterminable amount of entries per hour in the database I want to only get one entry per hour (the first one) so I can display it in a graph.

Here's an example of some datetimes that maybe show my intent a bit better:
NOTE: these are only the datetimes contained in the object, I want the whole object - not just the times.

// say this is all the data I get between two times
2019-07-06 10:30:01 // I want
2019-07-06 10:40:09
2019-07-06 10:50:10
2019-07-06 11:00:13 // I want
2019-07-06 11:10:20
2019-07-06 11:20:22
2019-07-06 11:30:24
2019-07-06 11:40:32
2019-07-06 11:50:33
2019-07-06 12:00:35 // I want
2019-07-06 12:10:43
2019-07-06 12:20:45
2019-07-06 12:40:54
2019-07-06 12:50:56
2019-07-06 13:00:58 // I want
2019-07-06 13:11:06
2019-07-06 13:21:08
2019-07-06 13:31:09

The current way I do this is via a IEnumerable and GroupBy. See this code:

var now = DateTime.Now;
IQueryable<Data> dataLastWeek = repos.GetDataBetween(now.AddDays(-7), now);

IEnumerable<Data> onePerHour = dataLastWeek.AsEnumerable()
    .GroupBy(d => new DateTime(d.ArchiveTime.Year, d.ArchiveTime.Month, d.ArchiveTime.Day, d.ArchiveTime.Hour, 0, 0))
    .Select(g => g.First());

This works fine but since it uses IEnumerable and creates objects, I don't get the advantages of linq to entities and I think it must a lot slower this way.

Is there any way to rewrite this query to work with IQueryable on a SQLite database?

EDIT: I'm working with the .net core 3 preview6 (newest preview) version of EF Core. Maybe there is a new feature that allowes for what I want :)

7/15/2019 11:35:08 AM

Accepted Answer

The key part of the GroupBy can easily be made translatable by avoiding new DateTime(...) and using either anonymous type

.GroupBy(d => new { d.ArchiveTime.Date, d.ArchiveTime.Hour })

or Date property + AddHours:

.GroupBy(d => d.ArchiveTime.Date.AddHours(d.ArchiveTime.Hour))

Unfortunately currently (EF Core 2.2) does not translate nested First / FirstOrDefault / Take(1) to SQL and uses client evaluation. For First() it is forced in order to emulate the LINQ to Objects throwing behavior, but for the other two patterns it's caused by the lack of proper translation.

The only server side solution I see for your concrete query is to not use GroupBy at all, but correlated self antijoin, something like this:

var onePerHour = dataLastWeek.Where(d => !dataLastWeek.Any(d2 =>
    d2.ArchiveTime.Date == d.ArchiveTime.Date &&
    d2.ArchiveTime.Hour == d.ArchiveTime.Hour &&
    d2.ArchiveTime < d.ArchiveTime));
7/15/2019 12:25:25 PM

Related Questions


Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow