Create custom objects from IQueryable without loading everything into memory

c# entity-framework-core iqueryable linq performance

Question

This is a follow up question to this question. You should read that first.

I have now, thanks to this answer, created a query that will return the correct entries. See:

IQueryable<Data> onePerHour = dataLastWeek
    .Where(d => 
        !dataLastWeek
        .Any(d2 =>
            d2.ArchiveTime.Date == d.ArchiveTime.Date &&
            d2.ArchiveTime.Hour == d.ArchiveTime.Hour &&
            d2.ArchiveTime < d.ArchiveTime));

Now for processing the entries and displaying them on a chart, I only need one or two properties of the model class Data. The use case is something like this:

List<Data> actualData = onePerHour.ToList();

var tempCTupels = new List<TimeTupel<float>>();
tempCTupels.AddRange(actualData.Select(d => new TimeTupel<float>(d.ArchiveTime, d.TempC)));

var co2Tupels = new List<TimeTupel<float>>();
tempCTupels.AddRange(actualData.Select(d => new TimeTupel<float>(d.ArchiveTime, d.CO2Percent)));

TimeTupel is very simple and defined like this:

public class TimeTupel<TData>
{
    public TimeTupel(DateTime time, TData yValue)
    {
        Time = time;
        YValue = yValue;
    }

    public DateTime Time { get; set; }
    public TData YValue { get; set; }
}

Question

Currently actualdata is a List<Data> which means it's fully loaded in memory.
Since I only use two properties I wouldn't need to retrieve the whole object to create the TimeTupels.

Now my question is how would I achieve a performance increase? Is it the correct approach to remove the ToList?

Things I've tried

  • Just using the IQueryable<Data> to create the TimeTupel:
    IQueryable<Data> actualData = onePerHour; yields a runtime error ("System.InvalidOperationException: 'Null TypeMapping in Sql Tree'")

  • Using AsEnumerable:
    IEnumerable<Data> actualData = onePerHour.AsEnumerable(); is slow, takes around 22 seconds for 10 days worth of data

  • Using ToList as seen in the above code (ToArray is almost equal):
    List<Data> actualData = onePerHour.ToList(); is faster, takes around 5 seconds for the same amount of data

1
0
7/15/2019 1:38:13 PM

Accepted Answer

You can use an anonymous type in a Select statement to retrieve only the needed columns of data into memory, and then convert that in-memory data into the TimeTupel<> class from there. It would look like this:

var actualData = dataLastWeek
    .Where(d => 
        !dataLastWeek
        .Any(d2 =>
            d2.ArchiveTime.Date == d.ArchiveTime.Date &&
            d2.ArchiveTime.Hour == d.ArchiveTime.Hour &&
            d2.ArchiveTime < d.ArchiveTime))
    .Select(d => new { d.ArchiveTime, d.TempC, d.CO2Percent})
    .ToList();

var tempCTupels = actualData.Select(d => new TimeTupel<float>(d.ArchiveTime, d.TempC)).ToList();

var co2Tupels = actualData.Select(d => new TimeTupel<float>(d.ArchiveTime, d.CO2Percent)).ToList();
2
7/15/2019 1:52:51 PM

Popular Answer

you can select only the properties you need BEFORE you're actually loading the objects from your IQueryable. use Select after your Where statement to only load what you need.

An example:

assume you have a class which looks like this:

public class Person {
        public string Name { get; set; }
        public int Age { get; set; }
    }

I can initialize a list of items for testing:

var people = new List<Person> { new Person { Name = "John", Age = 10 }, new Person { Name = "Archie", Age = 40 } };

then we apply a filter:

var filterred = people.Where(p => p.Age > 15).Select(p => p.Name).ToList();

If I want to create a new object with the selection, to select more then just one property, I could do something like this:

var objFilterred = people.Where(p => p.Age > 15).Select(p => new { FullName = p.Name  }).ToList();

You don't have to use an anonymous object, you can also create a new class holding only the properties you require and simply populate that.

You can't "remove" ToList, as this is what actually executes your query. IQueryable is not data, it's a query that hasn't run yet and you can chain as many things as you want to it. The final step is to execute it, running something like ToList, to actually load the objects. As long as you build your IQueryable and when you're done you execute it, then you should see an improvement in the execution speed



Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow