I have a batch job which parses a CSV file and creates and processes records. At each row I have to commit, because I need to create entities and then use the results of those created entities.
As there are thousands of records, performance is slow and I am trying to improve it.
I have code which looks something like this:
var data = ParseExcel(filePath);
Setup();

foreach (var batch in data.Split(20))
{
    foreach (var row in batch)
    {
        try
        {
            ParseRow(row);
        }
        catch (Exception e)
        {
            JobLogger.Error(e, "Failed to parse row. Exception: " + e.Message);
            throw;
        }
    }

    // Commit the batch, then dispose the unit of work and resolve fresh instances.
    _unitOfWork.Commit();
    _unitOfWork.Dispose();
    _unitOfWork = LifetimeScope.Resolve<Owned<IUnitOfWork>>().Value;
    ClientRepository = LifetimeScope.Resolve<Owned<IEntityBaseRepository<Client>>>().Value;
}
My Dispose method looks like this:
public void Dispose()
{
    _dbContext.Dispose();
    _dbContext = null;
    _dbFactory.Dispose();
    _dbFactory = null;
    GC.SuppressFinalize(this);
}
The intention here is that after each batch of records is processed, I refresh the unit of work by disposing of it and asking Autofac to generate a new instance.
However, when I add an item to my ClientRepository, it falls over with this error:
The operation cannot be completed because the DbContext has been disposed.
My ClientRepository is using a generic repository class which looks like this:
public class EntityBaseRepository<T> : IEntityBaseRepository<T> where T : class, IEntityBase, new()
{
    private DataContext _dataContext;

    #region Properties
    protected IDbFactory DbFactory { get; }

    protected DataContext DbContext => _dataContext ?? (_dataContext = DbFactory.Initialise());

    public EntityBaseRepository(IDbFactory dbFactory)
    {
        DbFactory = dbFactory;
    }
    #endregion
Here is part of my UnitOfWork:
public class UnitOfWork : IUnitOfWork, IDisposable
{
    private IDbFactory _dbFactory;
    private DataContext _dbContext;

    public UnitOfWork(IDbFactory dbFactory)
    {
        _dbFactory = dbFactory;
    }

    public DataContext DbContext => _dbContext ?? (_dbContext = _dbFactory.Initialise());

    public void Commit()
    {
        DbContext.Commit();
    }
Any thoughts on why I am still getting this error?
To ensure I get a fresh instance of my UnitOfWork after each batch of operations, I take a reference to my current Autofac lifetime scope, begin a new nested lifetime scope from it inside a using statement, and register and resolve the dependencies within that nested scope. Some of my services also depend on the UnitOfWork, so it was important to get fresh instances of those dependencies too.
Here's a cut-down snippet:
foreach (var batch in data.Split(10))
{
    using (var scope = LifetimeScope.BeginLifetimeScope("UnitOfWork", b =>
    {
        b.RegisterType<UnitOfWork>().AsImplementedInterfaces().InstancePerLifetimeScope();
        b.RegisterType<MyService>().AsImplementedInterfaces().PropertiesAutowired().InstancePerLifetimeScope();
        b.RegisterGeneric(typeof(EntityBaseRepository<>)).As(typeof(IEntityBaseRepository<>)).InstancePerLifetimeScope();
    }))
    {
        // Everything resolved from this scope shares the same UnitOfWork,
        // and it is all disposed together when the scope ends.
        UnitOfWork = scope.Resolve<IUnitOfWork>();
        MyService = scope.Resolve<IMyService>();

        foreach (var row in batch)
        {
            try
            {
                ParseRow(row);
            }
            catch (Exception e)
            {
                JobLogger.Error(e, "Failed to parse row. Exception: " + e.Message);
                throw;
            }
        }
    }
}
In the above code, I tagged the nested lifetime scope 'UnitOfWork'.
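As an aside, the tag is what lets Autofac match registrations to a scope. If you would rather not repeat the registrations on every BeginLifetimeScope call, Autofac's InstancePerMatchingLifetimeScope can wire them up once at container build time instead; this is only a sketch of that alternative (the builder variable here is hypothetical), not what my job currently does:

// At container build time (builder is the root ContainerBuilder, hypothetical here):
builder.RegisterType<UnitOfWork>()
    .AsImplementedInterfaces()
    .InstancePerMatchingLifetimeScope("UnitOfWork");

// Each batch then only needs to begin a scope with the matching tag:
using (var scope = LifetimeScope.BeginLifetimeScope("UnitOfWork"))
{
    var unitOfWork = scope.Resolve<IUnitOfWork>();
    // ... process the batch and commit ...
}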
With this code in place, the performance of the job improved dramatically, because I was no longer reusing a single UnitOfWork instance that was tracking tens of thousands of changes as it processed the file.
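If you want to confirm that the change tracker is the culprit, and assuming DataContext derives from Entity Framework's DbContext, a quick diagnostic inside Commit shows how many entities are being tracked (Debug.WriteLine needs System.Diagnostics, Count() needs System.Linq):

public void Commit()
{
    // Diagnostic sketch only: report how many entities this context is tracking.
    // ChangeTracker.Entries() is standard EF; remove once you've confirmed the growth.
    Debug.WriteLine("Tracked entries before commit: " + DbContext.ChangeTracker.Entries().Count());
    DbContext.Commit();
}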
Additionally, I stopped splitting the data into batches of 10; I decided instead to take a fresh UnitOfWork after processing each row, since each row already involved inserting data into at least 10 different tables.
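For completeness, the per-row version is the same pattern with the batching removed, roughly like this (a sketch; ConfigureScope stands in for the registration lambda shown above, and where exactly Commit is called depends on your row processing):

foreach (var row in data)
{
    // One nested scope, and therefore one fresh UnitOfWork, per row.
    using (var scope = LifetimeScope.BeginLifetimeScope("UnitOfWork", ConfigureScope))
    {
        UnitOfWork = scope.Resolve<IUnitOfWork>();
        MyService = scope.Resolve<IMyService>();
        ParseRow(row);
        UnitOfWork.Commit(); // assuming one commit per row
    }
}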