Need to speed up writing large object graph to database with Entity Framework

asp.net-core-webapi c# ef-core-2.1 entity-framework-core

Question

I'm using EntityFramework Core on my current project. In this project, I have an API endpoint that accepts a large (4,000K) text file. The endpoint reads and parses the file and converting the data into an object graph.

I then need to write the entire graph to a SQL database. After parsing the text file, I wind up with about 20,000 objects in this object graph.

The graph typically has one Transaction. The Transaction has about 5000 Subscribers and each Subscriber has an average of 4 Benefits. Each Dates collection will have 1 or 2 DateRanges. Rejections are usually empty.

My object graph looks basically like this:

public class Transaction {
   public int Id {get; set;}
   ...  // Other properties
   public ICollection<Subscriber> Subscribers {get; private set;}
   public ICollection<TranRejection> Rejections {get; private set;}
}

public class Subscriber {
   public int Id {get; set;}
   public int TransactionId {get; set;}  //Foreign Key
   ... // Other properties
   public ICollection<Benefit> Benefits {get; private set;}
   public ICollection<SubscriberRejection> Rejections {get; private set;}
   public ICollection<SubscriberDateRange> Dates {get; private set;}
}

public class Benefit {
   public int Id {get; set;}
   public int SubscriberId {get; set;}  //Foreign Key
   ... // Other properties
   public ICollection<BenefitRejection> Rejections {get; private set;}
   public ICollection<BenefitDateRange> Dates {get; private set;}
}

//This abstract class w/ empty subclasses is done to take advantage of TPH
//so that all dates get stored in a single table
public abstract class DateRange {
   public int Id {get; set;}
   public int ParentId {get; set;}
   public string DateCode {get; set;}
   public DateTime BeginRange {get; set;}
   public DateTime? EndRange {get; set;}
}

public class BenefitDateRange : DateRange {}
public class SubscriberDateRange : DateRange {}

//Rejection class is handled very similar to DateRange

My EF Mappings look something like this. (Only including important bits to help see relationships).

builder.Entity<DateRange>().ToTable("dateranges")
  .HasDiscriminator<string>("rangetype")
  .HasValue<BenefitDateRange>("benefit")
  .HasValue<SubscriberDateRange>("subscriber");
builder.Entity<DateRange>().HasKey(r => r.Id);

builder.Entity<Transaction>().HasMany(t => t.Subscribers).WithOne()
   .HasForeignKey(s => s.TransactionId);

builder.Entity<Subscriber>().HasMany(s => s.Benefits).WithOne()
   .HasForeignKey(b => b.SubscriberId);

builder.Entity<Subscriber>().HasMany(s => s.Dates).WithOne()
   .HasForeignKey(d => d.ParentId);

//Similar mappings for Benefit.Dates
//Rejections are using TPH just like DateRanges

I tried saving to the database by saving pieces individually -- i.e saving transaction with no subscribers and then saving each subscriber, etc. That takes at least 30 minutes.

I then switched to saving the entire graph at once like this:

_dbContext.AddRange(transactions);
_dbContext.SaveChanges();

That takes about 5 minutes. However, this is part of an API call and I'd like to speed this up. Is there a faster way to save this whole graph to the database? Should I not be using EF for this?

1
3
12/20/2018 3:09:48 PM

Accepted Answer

Using the demo version of Entity Framework Extensions, I was able to get the 5 minute Insert time down to approx. 30 seconds! Worked great -- of course, there is a $$ cost to going with that solution. I literally added a using clause and one line of code and voila, it worked.

_context.AddRange(history);
//_context.SaveChanges(); <-- Previous Code
_context.BulkSavechanges();  //New Entity Framework Extensions Code

I tried EFCore.BulkExtensions. I was unable to get this to work. It didn't seem to like the Conversion mapping that I had created in my Fluent API Entity Map.

builder.Entity<Transaction>()
  .Property(t => t.Receiver)
  .HasColumnName("receiverdata")
  .HasConversion(v => JsonConvert.SerializeObject(v), v => JsonConvert.DeserializeObject<ReceiverEntity>(v));

EFCore.BulkExtensions states that they support Conversion so I'm not sure what the issue is here. I posted the issue on GitHub so we'll see if there is a way to get this to work or not.

2
12/20/2018 8:28:28 PM

Popular Answer

We had a similar problem, but with one less level. The solution that worked best for us was to use BulkExtensions and wrap each level in a try-catch block and roll back all changes if there was an error saving.

https://github.com/borisdj/EFCore.BulkExtensions

The native option without external libraries was to turn off AutoDetectChangesEnabled and ValidateOnSaveEnabled on the DBContext. But it was still a bit slower than using BuilExtensions.

context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;

Our use case was to always insert new rows and not to update existing. So, I can't speak for the performance of InsertOrUpdate method of BulkExtensions. But, worth giving a try.



Related Questions





Related

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow
Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow