I am looking for some advise here on one of our SQL database server with below behavior-
Queries running for this databases are generally considered good with avg run time of 20 ms.
Suddenly on some weird days it will go from 20 ms to 80 ms and its very hard for our monitoring process to capture which time exactly it shifted its run time and why-
Our current monitoring method include below 2:-
DMV’s from cached TOP SQL queries which does not help much as metrics are cumulative so its hard to find the point in time where issue happened and plan or something changed.
Extended events for RPC COMPLETED, SP statement completed, SQL batch completed and SQL statement completed, but only for queries running over 5 secs.
We have not looked into QUER STORE yet due to some issues i am reading online because this DB is SQL2017 with AG setup in 2 DC configuration. Though there are trace flags for some but our Eng team is hesitant still to use QS on this server which is high OLTP with batch req/sec on avg 30-40K/ secs/
Please suggest or advise if i am missing anything except third party monitoring tool?