Let's take a look at simple class examples:
public class Book
{
[Key]
public string BookId { get; set; }
public List<BookPage> Pages { get; set; }
public string Text { get; set; }
}
public class BookPage
{
[Key]
public string BookPageId { get; set; }
public PageTitle PageTitle { get; set; }
public int Number { get; set; }
}
public class PageTitle
{
[Key]
public string PageTitleId { get; set; }
public string Title { get; set; }
}
So, if I want to get all PageTitiles, if I knew only the BookId, I need to write a few includes, like this:
using (var dbContext = new BookContext())
{
var bookPages = dbContext
.Book
.Include(x => x.Pages)
.ThenInclude(x => x.PageTitle)//.ThenInclude(x => x.Select(y => y.PageTitle)) Shouldn't use in EF Core
.SingleOrDefault(x => x.BookId == "some example id")
.Pages
.Select(x => x.PageTitle);
}
And if i want to get PageTitles connected with other book, I need to rewrite this method again, and nothing changed except the BookId! This is very inefficient way to work with database, in this example I have 3 classes, but if I had hundreds of classes, nested to the very deep level, it would be very slow and uncomfortable to work.
How exactly should I organize working with my database, to avoid many Includes, and redundant queries?
Problem 1: I have to add a bunch of Includes
each time.
Well, there's not a way around that as you have to explicitly included related data in EF, but you can easily create an extension method to make it cleaner:
public static IQueryable<Book> GetBooksAndPages(this BookContext db)
{
return db.Book.Include(x => x.Pages);
}
public static IQueryable<Book> GetBooksAndPagesAndTitles(this BookContext db)
{
return GetBooksAndPages(db).ThenInclude(p => p.PageTitle)
}
Then you can just do:
var bookPages = dbContext
.GetBooksAndPagesAndTitles()
.SingleOrDefault(x => x.BookId == "some example id")
.Pages
.Select(x => x.PageTitle);
Problem 2: I have to write this query multiple times for different IDs.
Why not just refactor that into a method with a bookId
parameter?
public IEnumerable<PageTitle> GetPageTitlesForBook(BookContext dbContext, int bookId)
{
return dbContext
.GetBooksAndPagesAndTitles()
.SingleOrDefault(x => x.BookId == bookId)
.Pages
.Select(x => x.PageTitle);
}
Bottom line - if you find yourself writing the same thing many times, that's a perfect opportunity to refactor your code into smaller methods that can be re-used.
None of the examples given require any Include statements at all. If you are using a select at the end of your query and you are still operating on an IQueryable such as a DbSet, Entity Framework will perform what is known as a 'projection' and will run the query including all of the required fields for you automatically.
For example, your original code:
using (var dbContext = new BookContext())
{
var bookPages = dbContext
.Book
.Include(x => x.Pages)
.ThenInclude(x => x.PageTitle)//.ThenInclude(x => x.Select(y => y.PageTitle)) Shouldn't use in EF Core
.SingleOrDefault(x => x.BookId == "some example id")
.Pages
.Select(x => x.PageTitle);
}
You can rewrite this like so:
using (var dbContext = new BookContext())
{
var bookPages = dbContext
.Book
.Where(x => x.BookId == "some example id")
.SelectMany(x => x.Pages.Select(y => y.PageTitle))
.ToList();
}
Here's what Entity Framework will do to resolve this:
The last step is the crucial one if you want to understand how Entity Framework does what it does. In your example when you call SingleOrDefault
you are instructing Entity Framework to execute the query, which is why you need the includes. In your example you haven't actually told Entity Framework that you need the pages when you run the query, so you have to manually request them using Include
.
In the example I've posted, you can see that by the time you run the query (ToList
is what triggers the query execution) Entity Framework knows from your Select expression that it is going to need the pages, and their titles. Even better - this means Entity Framework will not even include unused columns in the SELECT
statement that it generates.
I highly recommend investigating projections, they're probably the best way I know of to remove the requirement to continuously manually include stuff.