Question on how to scale SQL Server from a monolithic setup to high availability for a high-demand multi-/microservice system


#1

Hi all,

I’m not 100% sure this is the correct place to ask this question, so please forgive/enlighten me if you have seen this question asked somewhere before.

Let me start by giving clear context on where our system currently is and what the issues are:

  1. The system purpose: We are developing a multi-tenant platform that users use to manage their business processes (mostly doing ERP tasks…).

  2. The system design: We have a monolithic web site/app with:

  • One SQL Server database (storing all system-wide data)
  • One web server (providing web APIs)
  • One website (light front-end)
  • Several workers/services implemented in C#/.NET
  3. The implementation:
  • All back-end APIs and workers access the database via Entity Framework (6.0), database-first.
  • Each worker service runs on a 1-minute cycle, and each run can process a long transaction that requires saving multiple times across multiple entities.
  4. The issues:
  • Poor query performance (nearly 10 seconds for a listing query joining two tables, each with several thousand records). I know Entity Framework is slow, but is it really this bad? :frowning:
  • We constantly run into timeout issues.
  • We get concurrency exceptions (although, logically, we can assure that no changes are made while a worker service is processing an entity object).

Given these issues, I would like to ask:

  1. Given our system design, what is the main cause of the poor performance: SQL Server or Entity Framework?
  2. I don’t have experience in DevOps and SQL scaling. Given our situation, what is the “TODAY” strategy to scale SQL Server (horizontally) to handle much more query demand, eliminate outages, and achieve the highest availability for our system?
  3. In the meantime, we plan to improve performance first by indexing and by reviewing and removing logic that holds transactions open for a long time. But given the dynamics of Entity Framework, we’re not sure whether we should take this path or move completely to another ORM (like Dapper)?
  4. I’m also researching and planning a redesign of our system toward a microservice architecture, but one thing I still don’t completely understand: even if we break the database into multiple domain-driven instances, we can run into the same problem again once a domain grows big enough. I know some systems use NoSQL, in-memory DBs like Redis, and other databases like MongoDB when it comes to highly scalable systems/microservices. This makes me wonder: is SQL Server really suitable for microservices? Will I run into the same issue again when the domain for each service is big enough? Or are there solutions/strategies that modern systems apply to make SQL Server truly suitable for a microservice architecture?

TL;DR: We have a monolithic system that recently ran into poor performance issues. We have tried to fix it with indexing, logic changes, etc., and one thing we know for sure is that we have to apply a microservice architecture. But one question remains: the bottleneck is still our database server (SQL Server). How can we actually scale SQL Server so our system truly becomes highly available and suitable for a high-demand, high-availability workload?


#2

Hi,

As always, the answer is: it depends.

A couple of things:

  • I currently work in a micro services environment using dotnet and MS SQL Server, so it can work - but it depends on what sort of throughput you want and how much money you’re willing to throw at SQL Server licenses and the hardware to run it. The client I work for at the moment is stingy, so all databases sit on a single production server. Read that line again. We’ve advised them multiple times that this is a very bad idea.

  • To my knowledge, horizontally scaling Microsoft SQL Server such that each instance accepts both reads and writes is not possible.

  • When you go the multi-database route on MS SQL Server using MS Windows Server, beware of MS DTC (Microsoft Distributed Transaction Coordinator)! Your transactions (wrapped by EF) will automatically be promoted to distributed transactions when a query spans multiple databases. To alleviate this problem you can implement something known as a process manager, but it can get quite complex. You will also need a process manager if you’re trying to communicate between different persistence technologies (MS SQL Server and PostgreSQL, for instance).

  • Entity Framework has immense overhead when it comes to writes. To put it into perspective, EF was taking 5 seconds to insert 100 rows into one of my applications. We then switched over to SqlBulkCopy and ADO.NET, and the performance went through the roof. We were able to process 2500 - 7000 records per second without much effort.
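
To illustrate the SqlBulkCopy approach just mentioned, here is a minimal sketch; the table name, column layout, and batch size are hypothetical, not taken from the poster’s system:

using System.Data;
using System.Data.SqlClient;

public static class OrderBulkLoader
{
	// Streams all rows to the server in one bulk operation instead of
	// issuing row-by-row INSERTs; this is where the throughput gain comes from.
	public static void BulkInsert(string connectionString, DataTable rows)
	{
		using (var bulk = new SqlBulkCopy(connectionString))
		{
			bulk.DestinationTableName = "dbo.Orders"; // hypothetical target table
			bulk.BatchSize = 5000;                    // tune for your workload
			bulk.BulkCopyTimeout = 60;                // seconds
			// Assumes the DataTable's column names match the target table;
			// otherwise add ColumnMappings entries.
			bulk.WriteToServer(rows);
		}
	}
}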

  • It also doesn’t help that, for reads, Entity Framework has change tracking on by default. You can disable it per query; see the sketch below.
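
A minimal example, assuming the same dbContext as in the snippets further down (the filter is just a placeholder):

// AsNoTracking (System.Data.Entity) tells EF to skip change-tracker
// bookkeeping for this read-only query.
var employees = dbContext.Employee
	.AsNoTracking()
	.Where(e => e.Id > 0) // placeholder filter
	.ToList();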

  • Because you have multiple services accessing a single database, you’re going to run into issues with restrictive transaction isolation levels. If you wrap your EF work in a TransactionScope, the default isolation level is the most restrictive/protective level that SQL Server offers, known as SERIALIZABLE. (Without a TransactionScope, EF falls back to SQL Server’s default, READ COMMITTED.) Have a look at the link below on the different isolation levels MS SQL Server offers, and make sure you understand the drawbacks of each. If you scroll down to SERIALIZABLE you’ll note this:

  • Statements cannot read data that has been modified but not yet committed by other transactions.
  • No other transactions can modify data that has been read by the current transaction until the current transaction completes.
  • Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.

https://docs.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-2017
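
For illustration, here is one way to opt out of TransactionScope’s Serializable default when wrapping EF work; a sketch, with the isolation level chosen purely as an example:

using System.Transactions;

// An explicit isolation level overrides TransactionScope's
// Serializable default.
var options = new TransactionOptions
{
	IsolationLevel = IsolationLevel.ReadCommitted
};

using (var scope = new TransactionScope(TransactionScopeOption.Required, options))
{
	// ... dbContext queries and SaveChanges() here ...
	scope.Complete(); // commit; omitting this rolls the work back
}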

  • You mentioned indexing, but have you looked at your query plans? Just because Index A solves the problem now does not mean SQL Server will not choose a different query plan later on. Also, are you following the index pattern of filter columns first (when applicable), join columns second, and ordering columns last, with included columns if necessary? (A hypothetical example follows.)
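
Here is what that column ordering might look like; the table, columns, and index name are invented for the example, and the statement is issued through EF6’s raw SQL API but could equally live in a migration:

// Hypothetical index following the filter -> join -> order pattern.
dbContext.Database.ExecuteSqlCommand(@"
	CREATE NONCLUSTERED INDEX IX_Orders_TenantId_CustomerId_OrderDate
	ON dbo.Orders (TenantId, CustomerId, OrderDate) -- filter, join, order
	INCLUDE (TotalAmount);                          -- covering column
");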

  • Speaking of which, what database maintenance jobs are you running? (Index rebuilds/reorgs, etc)

  • What does your EF query code look like? Are you materializing all the results in memory and then filtering, or are you filtering and then materializing? Are you using Contains? (The IN operator it produces in SQL is an expensive operation; see the snippet below.) If you’re trying to do bulk data manipulation, you’re probably going to have to use the raw SQL classes.
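
A hypothetical example of the Contains pitfall:

// Contains over an in-memory list translates to a SQL IN (...) clause,
// which gets expensive as the list grows.
var ids = new List<int> { 1, 2, 3 /* ... hundreds more ... */ };
var employees = dbContext.Employee
	.Where(e => ids.Contains(e.Id)) // -> WHERE Id IN (1, 2, 3, ...)
	.ToList();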

Hope this helps.


#3

Apologies for the late reply, but before we dig into the complex topics, you need to ask yourself whether it really makes sense to split your monolithic app into separate microservices, knowing that the current performance issues occur mostly because of the database.

At first, I’d try to tune EF, e.g. by disabling lazy loading of the object graph and a few other settings that have a rather significant influence on performance. I’d also run a profiler to check the SQL queries and look for the ones that are really complex and take a lot of time to process. Eventually, you could try Dapper, or simply execute plain SQL queries using EF within the parts of your application that seem to be a bottleneck (e.g. complex transactions, or queries over a lot of data with multiple joins, etc.).
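
A sketch of those EF6 context settings; the context and connection names are hypothetical, and which flags are safe to flip depends on your workload:

using System.Data.Entity;

public class AppDbContext : DbContext
{
	public AppDbContext() : base("name=DefaultConnection")
	{
		Configuration.LazyLoadingEnabled = false;       // load related data explicitly
		Configuration.ProxyCreationEnabled = false;     // no dynamic proxies
		Configuration.AutoDetectChangesEnabled = false; // call DetectChanges() manually
	}
}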


#4

Hi dmh,
Thanks a lot for your thoughtful comment! Knowledge from an experienced person like you helps me a lot. I really appreciate it.
Those are definitely things I look forward to applying to our system.

What do you think about using caching/replicating SQL Server on your system to boost read performance?


#5

Yes, we definitely need to migrate our system to microservices, because our system already consists of many back-end worker services, and they use a simple queue (in the database) to assign and process jobs that are long-running or back-end processing tasks.

So transitioning to microservices, at least so services can communicate via a service bus and an event mechanism, is definitely the direction.

So performance here is certainly an issue, but it’s not the main reason we’re moving to a microservice architecture.

Given that, since I mostly see microservice implementations using MongoDB, NoSQL, or in-memory databases like Redis, I wonder if SQL Server is really suitable for microservices (in the .NET context). Of course I know that we must shard/break the database into separate domains, but I’m still not sure that SQL Server is a good database for this purpose, or what the main reason is that others use different database technologies. Like dmh said, cost is definitely an issue, but what about the performance of SQL Server itself?


#6

What do you think about using caching/replicating SQL Server on your system to boost read performance?

As soon as you use SQL Server replication, you open up an enormous can of worms.

  1. You need more SQL and OS licenses.
  2. You possibly need more hardware.
  3. You introduce eventual consistency into your architecture.
  4. Sometimes the delay in replication can get quite severe. Again, my current client uses SQL replication and the worst I’ve seen was a delay of 30 minutes.

You’ll need to elaborate a bit more, with regards to what you mean by “caching” - because the caching I’m thinking of is one that comes with the challenge of cache (in)validation.

Yes, we definitely need to migrate our system to microservices, because our system already consists of many back-end worker services, and they use a simple queue (in the database) to assign and process jobs that are long-running or back-end processing tasks.

Without knowing more about the complexities of your environment, I would hesitate to say that going the micro-services route is a panacea for your problems. Micro-services bring their own set of challenges.

Given that, since I mostly see microservice implementations using MongoDB, NoSQL, or in-memory databases like Redis, I wonder if SQL Server is really suitable for microservices (in the .NET context). Of course I know that we must shard/break the database into separate domains, but I’m still not sure that SQL Server is a good database for this purpose, or what the main reason is that others use different database technologies. Like dmh said, cost is definitely an issue, but what about the performance of SQL Server itself?

SQL Server is fine for micro-services. If you’re worried about cost, you can always transition over to PostgreSQL; there are Entity Framework providers for PostgreSQL that make the change almost seamless. SQL Server has features such as in-memory tables and XML columns, and since SQL Server 2016 it finally has JSON functions.
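
For instance, a hypothetical ADO.NET query pulling a scalar out of a JSON column (the table, column, and connectionString are invented; requires SQL Server 2016 or later):

using System.Data.SqlClient;

// JSON_VALUE extracts a scalar from JSON text stored in an NVARCHAR column.
using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
	"SELECT JSON_VALUE(Payload, '$.order_id') FROM dbo.Orders WHERE Id = @id", conn))
{
	cmd.Parameters.AddWithValue("@id", 42);
	conn.Open();
	var orderId = (string)cmd.ExecuteScalar();
}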

My advice to you is: profile your app, look at the entity framework code you are writing, monitor SQL - look at what the query plans are doing, how the indexes are becoming fragmented, etc.

As for entity framework code, let’s assume you have some simple models like this:

public class Employee {
	public int Id { get; set; }
	// stuff
	public virtual Manager Manager { get; set; }
}

public class Manager {
	// stuff
}

Are you writing code that looks like this:

public Employee GetEmployee(int employeeId) {
	// ToList() materializes the whole Employee table (plus Managers)
	// into memory first, then filters on the client side.
	return dbContext.Employee
		.Include("Manager")
		.ToList()
		.FirstOrDefault(employee => employee.Id == employeeId);
}

or like this:

public Employee GetEmployee(int employeeId) {
	// The filter is translated into the SQL WHERE clause, so only the
	// matching row (and its Manager) is fetched.
	return dbContext.Employee
		.Include("Manager")
		.FirstOrDefault(employee => employee.Id == employeeId);
}

Also remember this: NoSQL has its own set of challenges. For the most part you have to throw relational algebra out the window, you will need to embrace eventual consistency, you have to make do with the fact that your NoSQL persistence might not be able to do aggregations (sum, count, etc.), and you’re going to have to put data governance strategies in place to deal with the inevitable data bloat and data inconsistencies [1].

[1] For example, suppose you chose to use a document database (MongoDB, Amazon DynamoDB, Google Cloud Datastore, etc.) to retain order information, and you have a document that looks like this:

[{
	"order_id": 12345,
	"order_date": "2019-01-04T13:31:00Z",
	"order_items": [{
		"stock_control_unit": "ABC123",
		"item_name": "Some passing fad (blue)",
		"item_quantity": 3
	}]
}]

Now, let’s suppose someone decides to change the value of item_name to “No longer a passing fad (blue)”, another sale comes through, and you then have the following in your document database:

[{
	"order_id": 12345,
	"order_date": "2019-01-04T13:31:00Z",
	"order_items": [{
		"stock_control_unit": "ABC123",
		"name": "Some passing fad (blue)",
		"item_quantity": 3
	}]
},
{
	"order_id": 98765,
	"order_date": "2019-01-04T16:22:10Z",
	"order_items": [{
		"stock_control_unit": "ABC123",
		"name": "No longer a passing fad (blue)",
		"item_quantity": 21
	}]
}]

You are going to make the lives of your data people (data engineers, data stewards, data analysts, business intelligence developers, etc) hell.


#7

Thank you!
I will definitely take all of this into consideration at every step we take.