I'm running a .NET Core container on OpenShift Enterprise v3, pointing to a SQL Server database.
I have a .NET Core REST API with a PUT method which adds or updates a record in the database.
The table I am adding to/updating has 3000 records and is indexed.
This works fine locally using the same database, and it also works with the container under light use. However, when I start to put load through the container (approximately 50 concurrent HTTP connections with JMeter) I get random timeouts with the error message below. I don't get the problem running locally.
The local machine is a lot more powerful than the container. I have increased the CPU allocation on the container, but this doesn't appear to have made any difference.
Any suggestions on things to try would be appreciated.
[10:45:36 ERR] Failed executing DbCommand (35,001ms) [Parameters=[@__get_Item_0='?' (Size = 255) (DbType = AnsiString)], CommandType='Text', CommandTimeout='30']
SELECT [e].[host_name], [e].[data_centre], [e].[Environment], [e].[is_physical_machine], [e].[mac_address], [e].[number_of_cores], [e].[number_of_sockets], [e].[number_of_v_cores], [e].[number_ofcpus], [e].[operating_system], [e].[operating_system_version], [e].[processor], [e].[uuid]
FROM [EntitlementServer].[host] AS [e]
WHERE [e].[host_name] = @__get_Item_0
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (258): Unknown error 258
   at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
   at System.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
   at System.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
   at System.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
   at System.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlDataReader.TryConsumeMetaData()
   at System.Data.SqlClient.SqlDataReader.get_MetaData()
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean asyncWrite, String method)
   at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior)
   at System.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
   at System.Data.Common.DbCommand.ExecuteReader()
   at Microsoft.EntityFrameworkCore.Storage.Internal.RelationalCommand.Execute(IRelationalConnection connection, DbCommandMethod executeMethod, IReadOnlyDictionary`2 parameterValues)
ClientConnectionId:6ca037fc-9671-4d43-bebe-35879203c682
Error Number:-2,State:0,Class:11
I found that scaling up the CPU and memory on the container did not help with performance. However, when I increased the number of running containers, throughput increased and the timeout issue went away. I am not sure why this would be the case.
The problem actually turned out to be a threading issue; the database timeout was only a symptom. This is why increasing the number of containers stopped the problem: the number of HTTP threads increased along with it.
By making the code asynchronous, so that threads are released while a database call is in flight, I have been able to fix the problem without having to increase the number of containers.
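For anyone hitting the same thing, here is a minimal sketch of what the asynchronous version of a PUT action might look like. This is not my exact code: the `Host` entity, the `EntitlementContext` class, the route and the single updated property are all hypothetical names used for illustration, and the table/column mapping is left out. The stack trace above shows a synchronous `ExecuteReader` call; the only real change is replacing the blocking EF Core calls with their `await`ed `FindAsync` and `SaveChangesAsync` counterparts.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity: only two of the columns from the logged query are shown.
public class Host
{
    public string HostName { get; set; }
    public string OperatingSystem { get; set; }
}

// Hypothetical DbContext: the name is an assumption and the schema,
// table and column mapping is omitted for brevity.
public class EntitlementContext : DbContext
{
    public EntitlementContext(DbContextOptions<EntitlementContext> options)
        : base(options) { }

    public DbSet<Host> Hosts { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // host_name is the lookup key in the logged query, so use it as the key here.
        modelBuilder.Entity<Host>().HasKey(h => h.HostName);
    }
}

[Route("api/hosts")]
public class HostsController : Controller
{
    private readonly EntitlementContext _db;

    public HostsController(EntitlementContext db)
    {
        _db = db;
    }

    // Add-or-update. Each await hands the request thread back to the pool
    // while SQL Server is working, instead of blocking it until the reply arrives.
    [HttpPut("{hostName}")]
    public async Task<IActionResult> Put(string hostName, [FromBody] Host body)
    {
        if (body == null)
        {
            return BadRequest();
        }

        // Asynchronous counterpart of Find(): looks the row up by primary key.
        var existing = await _db.Hosts.FindAsync(hostName);

        if (existing == null)
        {
            body.HostName = hostName;
            _db.Hosts.Add(body);
        }
        else
        {
            existing.OperatingSystem = body.OperatingSystem;
        }

        // Asynchronous counterpart of SaveChanges().
        await _db.SaveChangesAsync();
        return NoContent();
    }
}
```

The point of the change is that the whole call chain is awaited. Calling `.Result` or `.Wait()` on the returned tasks anywhere in the chain would block the thread again and bring the original problem back.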