95% of DB Performance Issues are Design-Based.

By Charles

June 15, 2017

application design, Database Services, sql server


Performance issues are like a plague.

Everyone seems determined to treat the symptom when they should be focused on the cause.

There is a sense that any performance issue can be resolved by increasing the spec of the hardware the application runs on.

I believe this is a fallacy, and an expensive one at that. By scaling up the hardware, you are most likely scaling up the problem and obfuscating it.

Prevention is better than cure

With a bit of planning and by following application design best practices, you can avoid spending excessive time troubleshooting performance issues – issues that would not exist in a well-designed application.

There are two types of application performance issues that you will find.

One is an issue in a third-party piece of software which you have no control over. It should, however, be logged with that third party, and if you are lucky, they will rewrite part of the app, which will then be pushed out in the next set of patches.

The second type is an internally written app you have complete control over. You can redesign and rewrite the app to remove the problem.

This will no doubt require a lot of testing and a rollout to whoever uses the app, which makes simply adding more resources sound like the easier option.

What is most likely to happen is that you get past one hump and land straight in the next.

Weeks of design can save you months of troubleshooting

At some point, someone is going to blame the database. In some cases, I agree.

The DB will only ever do what you have designed it to do. It just might not be doing it as efficiently as you thought.

If you tell the DB to do something inefficient, it will do it badly and make it look like the DB itself is the problem.

You control what you tell it to do, not how it will do it.

So tell it to do something more efficiently, and you will find the problem evaporates.

Other considerations are as follows:

Data Types

Make sure you choose suitable data types.

This will save disk space and physical memory, and wasting either of those can cause performance issues.

Do NOT use the default datatype that may be selected for you.

There are millions of tables out there with NVARCHAR(255) columns storing nothing more than single-character or single-digit data.

The N prefix lets you store Unicode data, and it costs an extra byte for every character in every single row of data.

You add that up for millions of rows and across thousands of tables, and that’s a lot of wasted space.

Think about what type of data you are storing and what would best suit that data.
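As a rough sketch of the difference (the table and column names here are made up), compare a lazy default with a considered choice for a column that only ever holds a single status character:

-- Hypothetical example: CustomerStatus only ever holds a single character such as 'A' or 'I'.

-- Lazy default: two bytes per character plus variable-length overhead.
CREATE TABLE dbo.Customer_Lazy
(
    CustomerId     INT IDENTITY(1,1) PRIMARY KEY,
    CustomerStatus NVARCHAR(255) NOT NULL
);

-- Considered choice: a single byte per row for the status.
CREATE TABLE dbo.Customer_Designed
(
    CustomerId     INT IDENTITY(1,1) PRIMARY KEY,
    CustomerStatus CHAR(1) NOT NULL
);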

Normalisation

There are thousands of wide tables (500-plus columns) out there that should really be multiple narrower tables.

Is normalisation a complicated process? Not really; it is a logical process that helps you design an efficient way to store your data in the least amount of space whilst removing duplicates.

It will also allow you to maintain accurate data by not storing different values for the same customer or order in multiple locations.

Most applications will not even need to go beyond the 3rd normal form.
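As a small, hypothetical sketch of what that means in practice (the names are illustrative only), here is customer data duplicated on every order, and the same data pulled out into its own table:

-- Before: customer details repeated on every order row.
CREATE TABLE dbo.OrderDenormalised
(
    OrderId       INT IDENTITY(1,1) PRIMARY KEY,
    OrderDate     DATE          NOT NULL,
    CustomerName  NVARCHAR(100) NOT NULL,   -- duplicated on every order
    CustomerEmail NVARCHAR(100) NOT NULL    -- duplicated on every order
);

-- After: customer details stored once and referenced by key.
CREATE TABLE dbo.Customer
(
    CustomerId    INT IDENTITY(1,1) PRIMARY KEY,
    CustomerName  NVARCHAR(100) NOT NULL,
    CustomerEmail NVARCHAR(100) NOT NULL
);

CREATE TABLE dbo.[Order]
(
    OrderId    INT IDENTITY(1,1) PRIMARY KEY,
    OrderDate  DATE NOT NULL,
    CustomerId INT  NOT NULL   -- links back to dbo.Customer
);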

The problem does come about when applications get added over time.

An extra column here and there starts to add up until someone thinks maybe the table is a bit large and splits it into two.

This is not how to do normalisation.

You need to look at the relationship between the data you are modelling.

You need to think of situations that you might end up in. For example, should you store the customer’s address on the invoice or keep it as part of the customer table?

You will be frustrated if you do not store it on the order and later want to go back and find out what address was used, because the only address available will be the customer's current one.

If that address has been updated after a company move, the old order will appear with the new address.

To maintain the order address history, you will either need to store the address against the order itself, or keep an address history table with from and to dates so you can re-generate the order with the address that was correct for the order date.
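A minimal sketch of that second approach (assuming the hypothetical Customer and Order tables from the earlier sketch, with illustrative column names) might look like this:

-- One row per address per customer, with the period it was valid for.
CREATE TABLE dbo.CustomerAddressHistory
(
    AddressId    INT IDENTITY(1,1) PRIMARY KEY,
    CustomerId   INT           NOT NULL,
    AddressLine1 NVARCHAR(200) NOT NULL,
    PostCode     NVARCHAR(10)  NOT NULL,
    ValidFrom    DATE          NOT NULL,
    ValidTo      DATE          NULL      -- NULL means the current address
);

-- Re-generate an order's address by picking the one valid on the order date.
SELECT a.AddressLine1, a.PostCode
FROM dbo.[Order] AS o
JOIN dbo.CustomerAddressHistory AS a
      ON  a.CustomerId = o.CustomerId
      AND o.OrderDate >= a.ValidFrom
      AND (a.ValidTo IS NULL OR o.OrderDate < a.ValidTo)
WHERE o.OrderId = 12345;   -- example order number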

These things need to be thought out (designed) before you write code.

Primary and Foreign keys

Some frameworks like to keep control of referential integrity.

This may be a bit old school, but I prefer SQL Server to handle it.

If it is handled at the SQL Server end, there is no getting around it.

If an abstraction layer handles it, people may ignore it and bypass it by hitting the tables directly.

It might just be the DBA in me, but I like access to the data locked down and the data itself protected.
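For what it's worth, handling it at the SQL Server end (again using the made-up tables from the earlier sketch) is just a constraint, and nothing that hits the tables directly can bypass it:

-- Any insert or update that breaks the relationship is rejected by the engine,
-- regardless of which application or ad hoc query issued it.
ALTER TABLE dbo.[Order]
    ADD CONSTRAINT FK_Order_Customer
    FOREIGN KEY (CustomerId) REFERENCES dbo.Customer (CustomerId);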

Data Size

The more data you have, the longer it will take to sort, search and return the data you seek.

We can speed up data access using indexes, but there are still limits such as memory size, network speed and so on.

Sooner or later, if you have a massive DB as your OLTP system, you will need to look at archiving data or offloading it into another reporting-focused system.

The best thing about having a reporting system is that you can change the index plan.

When optimising indexes on a reporting DB, data retrieval is prioritised over inserting and updating data.

Your OLTP system, by contrast, needs to be able to update, insert and perhaps delete data efficiently.

Your reporting system is 100% focused on retrieving data for your reports.

Both systems require different index plans, and by separating these two functions into two separate DBs, you can have the best of both worlds.
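As a hypothetical illustration (OrderTotal is an assumed column, and the index names are made up), the reporting copy of a table can carry a wide covering index that would be too expensive to maintain on the busy OLTP side:

-- Reporting DB: a covering index tuned for a heavy date-range sales report.
-- Every INCLUDEd column makes writes more expensive, which matters far less here.
CREATE NONCLUSTERED INDEX IX_Order_Reporting
    ON dbo.[Order] (OrderDate)
    INCLUDE (CustomerId, OrderTotal);

-- OLTP DB: keep indexes narrow so inserts and updates stay fast.
CREATE NONCLUSTERED INDEX IX_Order_CustomerId
    ON dbo.[Order] (CustomerId);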

Suppose you need to be able to offer reporting and OLTP use in the front-end app. You have two choices.

You can use two different connection strings, one for the reports to be shown and the other for the live system.

The second option is to split it into two databases.

Then all you need to worry about is the accuracy of your reports. You will need a separate process to sync the data between the OLTP and reporting systems, but this can either be built in or called out as a known caveat.

Performance issues

Performance issues are a pain, but they happen in all systems eventually.

The first step in minimising these issues is to have a decent application design from the start.

The second is to choose the correct datatypes to minimise memory and storage requirements.

Finally, the size of the system can be a problem, and in most cases splitting out OLTP and reporting workloads can reduce the strain on the live system.

One person rarely solves performance issues. Teamwork is required to track down the root cause, which can be design, storage tech, network issues, application servers etc.

Starting with a well-designed app will help reduce the number of places to look to resolve it.

Charles

About the author

Microsoft Certified SQL Server DBA with over a decade's experience, including work for large FTSE 250 companies amongst others. The SQL Server stack has been the focus of almost all of my career in IT. I have experience designing, supporting and troubleshooting large Data Platform deployments.
