I’ve been banging my head against this problem for months now, and maybe I just don’t know the magic words to search for, but Google hasn’t been useful. I’ve found all sorts of ways to optimize queries and find and reduce resource locks so that a locked table or row doesn’t hold up transactions waiting on that resource, but I haven’t found anything discussing the problem I am having. So, if I am simply looking in the wrong place, help in refining my search would also be a suitable answer here (such as, as I mostly expect, this is a duplicate question of another I couldn’t find due to using the wrong search terms).
Anyway, my problem is this: if I have a long running transaction running in one database (completely contained within that database), the whole server freezes up. Queries in other databases are blocked until the first query finishes – which could be minutes. Even the simplest queries, like the TOP X of a table, or right clicking a table/database in SSMS and waiting for a context menu. I can’t even get the Activity Monitor to work while it is frozen.
And there is nothing particularly special about the query that is blocking everything. It isn’t a specific query, but pretty much any long running query, usually SELECT queries (as this is supposed to be a data warehouse). One or two joins, 3 or 4 WHERE clauses, an ORDER BY – it is usually only long running because it is iterating over a lot of rows. Not using transactions, no triggers defined, etc. The query is often, but not always, a view, if that makes a difference.
I can’t rule out that the SELECT query might be consuming resources in one or more system databases (tempdb) and that this shared resource is responsible for the system wide blocking, but the system is largely set up using defaults, so it seems like poor design if a relatively simple SELECT query could monopolize system databases by default, and I have trouble visualizing how an SSMS context menu TOP 10 query against a single table could depend on tempdb. This question seems partly relevant, but we have plenty of RAM and the resources used by the queries are completely contained in their respective databases (the query might join across databases, but the two databases used by the query are unrelated to the third database used by the blocked query), so only option 3 seems likely, but I don’t know either how to check that or how to correct it if it turns out to be the case. The databases are stored on a SAN over a fiber channel connection, with partitions for each of the following: system DBs, user DBs, logs, and the OS on its own local disk. Tempdb is broken across 8 files (on the same partition) to provide some parallelism. Most other settings are stock.
Load on the server is also rather light – most processing is done during off hours and so when these queries are run they are often the only ones being run, except of course when they aren’t and we have this problem, but even in this case there are usually only two or three queries running simultaneously.
My question is: considering the above, what can I do to allow SQL Server to perform some sort of time slicing so that other queries can run simultaneously with the currently blocking query? I understand that they might suffer some impact as they will be sharing compute cycles, but a half second query to get the TOP 10 rows in a table shouldn’t have to wait 10 minutes for a longer running query to complete.
Let me know if there is some specific information that you may need to provide an answer (assuming, again, that this isn’t already answered under a different set of query terms). I am not looking for query or index optimization advice as again, I have considered such advice already, and many times the queries are ad hoc or generated by a query tool (Ppwer BI) and so I have little or no control over their optimization. I suspect that the solution lies in server configuration, possibly settings or even arrangement of the file structure.