Entity Framework Include directive not getting all expected related rows

.net c# entity-framework entity-framework-6

Question

Entity framework was loading a lot of data through lazy loading, which I noticed when troubleshooting some performance problems (900 additional query calls ain't quick!). yet I was certain that I had the right include. The real use case is more complicated, so I don't have much room to rework the signature of what I'm doing, but I've managed to condense this down to a fairly small test case to illustrate the confusion I'm seeing. Hopefully, this is a clear illustration of the problem I'm experiencing.

Many MetaInfo rows relating to documents. I want all MetaInfo rows to be included so I won't have to send out a separate request for all the Documents MetaInfo, but I also want to have all documents categorized by MetaInfo rows with a specified value.

I have the following question, then.

ctx.Configuration.LazyLoadingEnabled = false;

var DocsByCreator = ctx.Documents
    .Include(d => d.MetaInfo) // Load all the metaInfo for each object
    .SelectMany(d => d.MetaInfo.Where(m => m.Name == "Author") // For each Author
        .Select(m => new { Doc = d, Creator = m })) // Create an object with the Author and the Document they authored.
    .ToList(); // Actualize the collection

I anticipated that this would have all of the Document / Author pairings and be fully populated with Document MetatInfo properties.

However, the Documents MetaInfo field ONLY contains MetaInfo objects with Name == "Author," so that's not what occurs. I still receive the Document objects and the Authors perfectly fine.

The same thing happens if I move the where clause out of the select many, until I move it to after the actualization. While this may not seem like a significant concern, it is in the real application since it implies we are receiving much more data than we wish to handle.

After experimenting with several methods, I believe the problem seems to be when you do a select(...new...) along with the where and the include. After actualization, performing the select or where clause causes the data to show as I had anticipated.

I modified it as follows to test the idea since I thought there could be a problem with the MetaInfo property of Document being filtered. I was startled to see that this also produces the same (and, in my opinion, incorrect) outcome.

ctx.Configuration.LazyLoadingEnabled = false;

var DocsByCreator = ctx.Meta
    .Where(m => m.Name == "Author")
    .Include(m => m.Document.MetaInfo) // Load all the metaInfo for Document
    .Select(m => new { Doc = m.Document, Creator = m })
    .ToList(); // Actualize the collection

I anticipated that because we're not putting the location on the Document.MetaInfo property, this would get over the issue, but oddly enough, the papers still only seem to contain a "Author" MetaInfo object.

As far as I can tell, all of the test cases in the straightforward test project I developed and posted to github passed; only the ones with premature actualisation did.

https://github.com/Robert-Laverick/EFIncludeIssue

Anyone have a theory? Is there any way I'm misusing EF or SQL? Is there anything I can do to obtain the same outcomes organization? Is this an EF problem that has just gone unnoticed since LazyLoad is enabled by default and this is a bit of an unusual group type operation?

1
2
10/10/2018 5:59:33 PM

Accepted Answer

A drawback of EF is that includes won't be taken into account if the scope of the entities returned changes from where they were introduced.

For EF6, I couldn't locate a reference to this, although EF Core has documentation on it. (https://docs.microsoft.com/en-us/ef/core/querying/related-data) (see "ignore includes") (see "ignore includes") It may be a restriction put in place to prevent EF's SQL creation from going entirely MIA in some circumstances.

the whilevar docs = context.Documents.Include(d => d.Metas) would provide the eagerly loaded metas for the document; the moment you.SelectMany() The Include statement is disregarded since you are altering the data that EF is meant to return.

If you would want to return all of the documents together with a property identifying their author:

var DocsByCreator = ctx.Documents
    .Include(d => d.MetaInfo)
    .ToList() // Materialize the documents and their Metas.
    .SelectMany(d => d.MetaInfo.Where(m => m.Name == "Author") // For each Author
        .Select(m => new { Doc = d, Creator = m })) // Create an object with the Author and the Document they authored.
    .ToList(); // grab your collection of Doc and Author.

If you're just interested in documents with authors:

var DocsByCreator = ctx.Documents
    .Include(d => d.MetaInfo)
    .Where(d => d.MetaInfo.Any(m => m.Name == "Author")
    .ToList() // Materialize the documents and their Metas.
    .SelectMany(d => d.MetaInfo.Where(m => m.Name == "Author") // For each Author
        .Select(m => new { Doc = d, Creator = m })) // Create an object with the Author and the Document they authored.
    .ToList(); // grab your collection of Doc and Author.

As a result, you must make sure that all of your filtering logic is applied above that.'ToList() call. As an alternative, you may think about resolving the Author meta after the query, say, when view models are filled, or by using an unmapped "Author" attribute on Document. Despite the fact that if their usage sneaks into an EF query, you receive a nasty runtime error, I normally steer clear of unmapped properties.

Edit: Considering the need to skip and take, I advise using view models to deliver data rather than entities. By using a view model, you can tell EF to just deliver the raw data you want. View models may be composed using either straightforward filler code or Automapper, which works well with both EF and IQueryable and can handle the majority of delayed scenarios like this.

For instance:

public class DocumentViewModel
{
    public int DocumentId { get; set; }
    public string Name { get; set; }
    public ICollection<MetaViewModel> Metas { get; set; } = new List<MetaViewModel>();
    [NotMapped]
    public string Author // This could be update to be a Meta, or specialized view model.
    {
        get { return Metas.SingleOrDefault(x => x.Name == "Author")?.Value; }
    }
}

public class MetaViewModel
{
    public int MetaId { get; set; }
    public string Name { get; set; }
    public string Value { get; set; }
}

Next, the issue:

var viewModels = context.Documents
    .Select(x => new DocumentViewModel
    {
        DocumentId = x.DocumentId,
        Name = x.Name,
        Metas = x.Metas.Select(m => new MetaViewModel
        {
            MetaId = m.MetaId,
            Name = m.Name,
            Value = m.Value
         }).ToList()
    }).Skip(pageNumber*pageSize)
    .Take(PageSize)
    .ToList();

At the data level, the connection between a "author" and a document is assumed rather than enforced. With this approach, the entity models remain "pure" to the data representation, and the code is given control over changing the inferred connection into the author of a document.

The .Select() Automapper can manage population using.ProjectTo<TViewModel>() .

You can prevent problems like these by returning view models rather than entities. Include() operations are invalidated, problems resulting from the temptation to detach and reattach entities between contexts are avoided, performance and resource usage are improved by only selecting and sending the necessary data, lazy load serialization problems are avoided if you forget to disable lazy-load, and unexpected #null data problems are avoided with it.

2
10/11/2018 9:39:08 PM


Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow