I'm using EntityFramework Core on my current project. In this project, I have an API endpoint that accepts a large (4,000K) text file. The endpoint reads and parses the file and converting the data into an object graph.
I then need to write the entire graph to a SQL database. After parsing the text file, I wind up with about 20,000 objects in this object graph.
The graph typically has one Transaction. The Transaction has about 5000 Subscribers and each Subscriber has an average of 4 Benefits. Each Dates collection will have 1 or 2 DateRanges. Rejections are usually empty.
My object graph looks basically like this:
public class Transaction {
public int Id {get; set;}
... // Other properties
public ICollection<Subscriber> Subscribers {get; private set;}
public ICollection<TranRejection> Rejections {get; private set;}
}
public class Subscriber {
public int Id {get; set;}
public int TransactionId {get; set;} //Foreign Key
... // Other properties
public ICollection<Benefit> Benefits {get; private set;}
public ICollection<SubscriberRejection> Rejections {get; private set;}
public ICollection<SubscriberDateRange> Dates {get; private set;}
}
public class Benefit {
public int Id {get; set;}
public int SubscriberId {get; set;} //Foreign Key
... // Other properties
public ICollection<BenefitRejection> Rejections {get; private set;}
public ICollection<BenefitDateRange> Dates {get; private set;}
}
//This abstract class w/ empty subclasses is done to take advantage of TPH
//so that all dates get stored in a single table
public abstract class DateRange {
public int Id {get; set;}
public int ParentId {get; set;}
public string DateCode {get; set;}
public DateTime BeginRange {get; set;}
public DateTime? EndRange {get; set;}
}
public class BenefitDateRange : DateRange {}
public class SubscriberDateRange : DateRange {}
//Rejection class is handled very similar to DateRange
My EF Mappings look something like this. (Only including important bits to help see relationships).
builder.Entity<DateRange>().ToTable("dateranges")
.HasDiscriminator<string>("rangetype")
.HasValue<BenefitDateRange>("benefit")
.HasValue<SubscriberDateRange>("subscriber");
builder.Entity<DateRange>().HasKey(r => r.Id);
builder.Entity<Transaction>().HasMany(t => t.Subscribers).WithOne()
.HasForeignKey(s => s.TransactionId);
builder.Entity<Subscriber>().HasMany(s => s.Benefits).WithOne()
.HasForeignKey(b => b.SubscriberId);
builder.Entity<Subscriber>().HasMany(s => s.Dates).WithOne()
.HasForeignKey(d => d.ParentId);
//Similar mappings for Benefit.Dates
//Rejections are using TPH just like DateRanges
I tried saving to the database by saving pieces individually -- i.e saving transaction with no subscribers and then saving each subscriber, etc. That takes at least 30 minutes.
I then switched to saving the entire graph at once like this:
_dbContext.AddRange(transactions);
_dbContext.SaveChanges();
That takes about 5 minutes. However, this is part of an API call and I'd like to speed this up. Is there a faster way to save this whole graph to the database? Should I not be using EF for this?
Using the demo version of Entity Framework Extensions, I was able to get the 5 minute Insert time down to approx. 30 seconds! Worked great -- of course, there is a $$ cost to going with that solution. I literally added a using clause and one line of code and voila, it worked.
_context.AddRange(history);
//_context.SaveChanges(); <-- Previous Code
_context.BulkSavechanges(); //New Entity Framework Extensions Code
I tried EFCore.BulkExtensions. I was unable to get this to work. It didn't seem to like the Conversion mapping that I had created in my Fluent API Entity Map.
builder.Entity<Transaction>()
.Property(t => t.Receiver)
.HasColumnName("receiverdata")
.HasConversion(v => JsonConvert.SerializeObject(v), v => JsonConvert.DeserializeObject<ReceiverEntity>(v));
EFCore.BulkExtensions states that they support Conversion so I'm not sure what the issue is here. I posted the issue on GitHub so we'll see if there is a way to get this to work or not.
We had a similar problem, but with one less level. The solution that worked best for us was to use BulkExtensions and wrap each level in a try-catch block and roll back all changes if there was an error saving.
https://github.com/borisdj/EFCore.BulkExtensions
The native option without external libraries was to turn off AutoDetectChangesEnabled and ValidateOnSaveEnabled on the DBContext. But it was still a bit slower than using BuilExtensions.
context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;
Our use case was to always insert new rows and not to update existing. So, I can't speak for the performance of InsertOrUpdate
method of BulkExtensions. But, worth giving a try.