This is a repost of my response to a question over on Stack Overflow. The user had posed a question about how to best convert and store large dynamic queries for Microsoft SQL Server which were being generated as inline code in a C# project.
If you’re not using Entity Framework and LINQ for your queries, then I recommend that you put the queries into SQL Stored Procedures in your database, and use ADODB.Command or SqlClient.SqlCommand objects to run them.
Depending on the complexity of how you are building the query, it might make sense to create multiple queries to replace your inline calls. In particular, if you have a block of code that filters by product OR by store OR by chain (for example), and those are all handled by one block of dynamic C# code to build the command, it would be better to have 3 separate stored procedures in SQL to handle each case than to try to duplicate that dynamic behavior in one procedure.
An added benefit to this approach is that you will probably find better overall performance and index tuning opportunities with the stored procedures due to the way SQL builds query plans.
Some general comments about how SQL manages query plans
If you build your stored procedure in your database, SQL will generate a query plan the very first time each procedure is run, using whatever parameters and flow that call requires to optimize the query. If you load your queries dynamically – whether with generated dynamic SQL or by loading SQL scripts saved as files – Then SQL runs this analysis every time the call is run.
Generating the query plan every run is a performance hit. Depending on your database and your query, this hit can be very minor – a few milliseconds on a query that runs once a day – or very major – a second or two for a query that runs thousands or millions of times a day.
Splitting your calls in to 3 separate procedures is a good idea because SQL Server builds the plan on the first run example. If you have a procedure that takes an optional ID value, and returns one row if you pass the value, or all rows if you don’t… then depending on which one gets called first, SQL is going to try to either do an index lookup or table scan every time you call it, neither of which is optimal for the other operation. Splitting this in to two separate calls allows SQL to generate an optimal query plan for each operation.
Another aspect of this is more for logging and analysis. Many SQL performance tools, including those built in to SQL, are able to look at many calls of the same stored procedure an aggregate the results to identify long term performance trends. Some tools even do a really good job of pinpointing the exact parts of the procedure which run poorly. If you’re using dynamically generated SQL, however, those calls all become a sea of separate events. So your 3 second call made millions of times a day would get lost if you have a long running stored procedure that bubbles up once or twice a day, but if the 3 second call is a stored procedure then you can see that it collectively becomes 90% of your server workload, and a good candidate for refactoring and query tuning.
So, while it feels a little bit like you’re violating the DRY principal to generate multiple similar queries as separate stored procedures, you want to adjust your mindset when working with SQL to get the best performance.