IEnumerable from DbContext vs IEnumerable from code: memory usage

.net .net-core c# entity-framework entity-framework-core

Question

I'm confused about IEnumerable memory usage, specifically when comparing an IEnumerable backed by a database query against an IEnumerable produced in code by yield returning constant values.

I have a Memory helper function for checking memory usage:

        static string Memory()
        {
            // Working set of the current process, in megabytes
            return (Process.GetCurrentProcess().WorkingSet64 / (1024 * 1024)).ToString();
        }
  1. Here I'm using EF Core 3.0, querying a table with a total of 150,000 records:
            using DataContext context = new DataContext();

            Console.WriteLine(Memory()); //21

            IEnumerable<User> users = context.Users;
            foreach (var i in users) {}

            Console.WriteLine(Memory());//101
            Console.WriteLine(GC.GetTotalMemory(true));//46620032

For some reason I can't upload pictures, so I've typed the results as comments in the code; sorry about that.

  2. The next example uses yield return to generate the IEnumerable data:
        static IEnumerable<User> Generator(int max)
        {
            // Lazily yields `max` User instances; no collection is built up
            for (int i = 0; i < max; i++)
            {
                yield return new User { Id = 1, Name = "test" };
            }
        }

Here is the result:

            Console.WriteLine(Memory());// 21

            IEnumerable<User> users = Generator(150000);
            foreach (var i in users){}

            Console.WriteLine(Memory());// 24
            Console.WriteLine(GC.GetTotalMemory(true)); // 658040

Now I'm confused by examples 1 and 2. My understanding is that an IEnumerable data source is read one item at a time rather than loading the whole collection, which should keep memory usage low, just as in example 2. However, with EF Core (I know this isn't specific to EF Core, but I need a concrete example), I believe the records are still being pulled one by one, so why does it use so much more memory than the second example? Is it pulling each record one by one, and do I end up with all the records from the DB in memory? And why does the second example use so much less memory when I'm yielding the same number of records? If someone could explain this, it would be much appreciated. Thanks!


Accepted Answer

It's indeed EF (Core) specific behavior called (change) tracking, explained in Tracking vs. No-Tracking Queries. Note that tracking is the default behavior unless you change it explicitly:

context.ChangeTracker.QueryTrackingBehavior = QueryTrackingBehavior.NoTracking;

or use AsNoTracking() on the query source.
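For instance, a minimal sketch of the second option, reusing the DataContext and Users set from the question:

using DataContext context = new DataContext();

// AsNoTracking() tells EF Core not to register the materialized
// entities with the change tracker, so each User becomes eligible
// for garbage collection as soon as the loop moves past it.
IEnumerable<User> users = context.Users.AsNoTracking();
foreach (var i in users) { }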

The essential point is that even though the query result is evaluated one by one, the DbContext instance adds each created entity instance, plus some additional info such as its state and a snapshot of the original values, to an internal list. So even leaving out the key, state, and original-values snapshot, the equivalent code for the generator would be something like this:

IEnumerable<User> users = Generator(150000);
var trackedUsers = new List<User>();
foreach (var i in users)
{
    // the "context" keeps a live reference to every instance it materializes
    trackedUsers.Add(i);
}

So at the end of the loop, every instance created during iteration is still referenced and therefore stays in memory.
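You can observe this directly via the change tracker, which exposes its entries (a quick check, assuming the question's DataContext and a using System.Linq directive for Count()):

using DataContext context = new DataContext();

foreach (var i in context.Users) { }

// With tracking enabled, every materialized entity has an entry here,
// so for the question's table this prints 150000.
Console.WriteLine(context.ChangeTracker.Entries().Count());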

That's why you might consider using the AsNoTracking option when all you need is to execute an entity query and iterate it once. Note that non-entity (projection) queries and keyless entities do not track their results, so this is really entity-query-specific behavior.
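For example, a projection to a non-entity type, sketched against the question's Users set, is never tracked, so iterating it behaves like the generator example memory-wise:

// The anonymous type is not an entity, so the change tracker
// never sees these results and memory stays flat during iteration.
var names = context.Users.Select(u => new { u.Id, u.Name });
foreach (var n in names) { }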



Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow