06 Dec

|IBM Gold Partner

The best way to skin the SQL: The difference between a good developer and an awesome developer

Performance tuning is essentially using various techniques that make SQL code run in the quickest possible time and is the difference between a good developer and an awesome developer. After the SQL is written, it gets tested for syntax or logical errors and the query output is then scrutinised for correctness.

Job done?

Not quite!

Even if the code compiles and the query results are 100% accurate, it doesn’t mean the task is completed. The final jigsaw piece is making the SQL code performant by processing the data swiftly. There are several ways to skin a cat and numerous ways to skin the SQL.

Query structure tips and tricks

A) Keep it simple. If there are several CTEs, table joins and nested subqueries in your SQL then that is a sign the code is too complicated which means the SQL Optimiser won’t generate the most efficient execution plan.

B) In the SELECT clause, only use the columns that you need, don’t use SELECT *.

C) Remove unnecessary sorting operations like ORDER BY, UNION or DISTINCT.

D) Be careful when nesting queries, too many nested levels cause performance issues.

E) Watch out for cartesian or cross joins, they could potentially return a multitude of superfluous and erroneous rows.

F) Use the TOP command to restrict the number of rows returned when testing.

G) Use SQL Join, Table or Query hints as a last resort as you are effectively forcing the SQL Server Optimiser on what execution path take; the Optimiser knows better than a developer!

H) Use Set-Based queries whenever possible to process the data in, ‘chunks’ as opposed to row by row operations like cursors or while loops.

I) Row by Agonising Rows (rbar) operations like Cursors are only acceptable if the data set to process is very small and takes a limited time to process.

Indexing in eight easy steps:

J) Indexes can be applied to temp tables, so experiment with replacing CTEs with indexed temp tables.

K) Tables should have primary keys to create a clustered index, thus improving query performance automatically.

L) Non clustered indexes should be on foreign keys in joins.

M) Watch out for fragmentation in indexes, rebuild if necessary.

N) Use Explain Plans on each section of your SQL code to see if the optimiser recommends any additional indexes, experiment with the index to see if it runs faster.

O) Don’t have too many indexes! This could confuse the optimiser and not generate an efficient query plan as well as increasing disk space usage.

P) Remove unnecessary non-clustered indexes if there are performance issues with UPDATES and INSERTS (the more indexes, the more writes to data pages).

Q) If you are working with a well-defined subset of data, consider applying a Filtered-Index on the table to access that subset of data quickly.

Advanced Techniques:

R) Experiment with memory optimised tables for smaller dimension tables, they are read from memory as opposed to disk so can return faster query times.

S) For faster table read operations in large fact tables, experiment with Columnstore indexes, they also have higher compression rates so should save on storage.

T) The SQL Server database engine usually outperforms a client-side ETL transformation (i.e. within SSIS). Consider replacing SSIS ETL transformations with SQL stored procedures as the code is already parsed and benefits from using indexes and histograms on the database.

U) When looking at Explain Plans, if Full-Table or Index Scans are returned it means there are opportunities to tune the code by introducing an index to get an Index Seek instead (an Index Seek returns a specific number or records whereas a Full Table or Index Scan looks at the whole table.

V) If the database uses table partitioning, tailor your SQL to utilise the partitions you need and eliminate the partitions you don’t!

W) Examine the statistics generated from a SQL Server Trace or Extended Events session to diagnose if a query is suffering from waits, deadlocks, cpu, memory or disk i/o issues.

Ask yourself the following questions when existing SQL suddenly slows down

X) Has anyone made a change to the code? Check source control.

Y) Has the source data volume significantly increased that the SQL code processes?

Z) Is there any index fragmentation on the tables that the SQL accesses? Rebuild indexes if necessary.

AA) Have Database Statistics become stale due to data changes?

AB) Have any database schema changes been applied (i.e. new constraints, new or dropped index)?

AC) Is there contention or locking on the database when the SQL code runs? Has a new batch job been introduced that clashes with your SQL code or an increase in concurrent users on the system?

AD) Has there been a recent architectural change? (i.e. windows update, server upgrade, database migration, Azure Tier change and so on).

AE) Are any DML triggers firing under the covers in the database that are causing contention?

Discover how Simpson Associates can help you to achieve success

If you are looking to outsource the delivery of your data analytics platform or are simply looking for maintenance and support of your IT services, we can help. Simpson Associates is a Data Analytics company with over 25 years’ experience in the field. We can help your organisation to achieve success with our managed IT support services, freeing up time for you to manage and grow your business.

Discover how we can help your business achieve success.

Blog Author

David Mills, Senior Consultant, Simpson Associates