a companion discussion area for blog.codinghorror.com

Who Needs Stored Procedures, Anyways?


#121

Paul:

“The issue is not how many pieces of code will be affected, the issue is how do I find all of the affected pieces of code?”

Try Edit-Find And Replace-Find In Files

Jeff:

This has been a good read for me. I enjoy hearing both sides of the argument and really haven’t had a strong opinion either way. Until now.

I believe the security and abstraction that comes with using stored procedures make them well worth implementing. I know the maintenance aspects have really saved my team on a number of occasions.

Security wise, preventing direct access to the tables make sense. It keeps them from being abused by the general public (or casual developer). Stored procedures create a standard interface that can be enforced, keeping an application developer from unknowingly putting your data into an invalid state.

Although it’s bad practice, stored procedures can be modified to add functionality to a deployed application without re-release. This makes my boss happy because he doesn’t like the time or the process involved in certifying and releasing builds. A stored procedure change can often take less than a few minutes and get an application back up and running. It can also add new functionality to a well designed application.

As far as being able to swap out databases, I think you can still do this with a small amount of effort. If your database supports stored procedures, you’ll have to re-code them in the new database. One plus to note here is that your code should require NO changes to support this. I should say “in theory” but this has been the case for me on almost every occasion so far. If the new database doesn’t support stored procedures (mySql) then you can override your standard datalayer (that interfaces to the stored procedures) and put the logic in there. A word of experience here: it takes a lot more effort to implement logic from a stored procedure in your data layer. One plus of database script languages is that they’re designed specifically to manipulate data. They’re very compact and focus primarily on that function.

Think about accessing the database through stored procedures (only) as a means to using your database like a service. It provides whatever (and only) the functionality implemented by your stored procedures, regardless of what application is accessing it. It you have one application and 4 import (applications) all using the database, stored procedures will help ensure that they all do it in a consistent way. New applications will already have a good baseline of database logic to choose from. You database becomes more like an entity, or single object that has specific functionality available.

If you’ve never tried stored procedures, you owe it to yourself to do so and formulate an opinion based on your experience. Do some more reading on the subject too. Quite often, you’ll find that a development method/practice doesn’t make sense because you haven’t used it in the same way that other people have successfully used it.


#122

You guys are clueless if you think that stored procedures have no real world benefit. When you have worked on a database that has over 2 TB of data, and literally more than 3k tables/views you will understand not only the power, but NECESSITY of them. You use stored procedures to present an API for a schema, and get the benefit of faster execution, security, etc.

Dynamic SQL queries are an absolute killer in terms of database performance. They do not scale, at all. When your users are screaming because your query is taking 20+ seconds to generate output, and the plan analyzer shows that it’s taking 19+ seconds trying to figure out what to do with it, you know it’s time for an SP. Actually, up-front would have been the time. They are also a killer when it comes to maintaining consistency for client applcations, which may not always be your simple compiled application. When you have 70+ projects sitting on one giant shared database + schema you simply cannot have developers mucking about directly with tables or views unless ABSOLUTELY necessary. Otherwise when you change the schema under them, they are screwed. And so are all your users using that dirty code.

It’s nice to bitch about how much of a pain in the ass they are (and I agree - they are), but if you’re that vehement about how useless they are it would seem you don’t really know pain yet.


#123

I have to ask…why is everyone saying that the alternative to stored procedures is SQL? Has anyone heard of an ORM? I mean, seriously, this is 2006.

For thos .NET people, check out nHibernate, or DataSets directly from MS. For all you open source people (re: java), I am sure you have heard of Hibernate and are probably using it because you don’t have you head up your butt!

It seems that this stored procedure debate only happens with .NET people. Why is that? This issue has been settled for years on every other platform. it would be nice if .NET people would join the rest of hte world.


#124

Sorry , i respect your right to your point of view but…

your mad.

I can not think of one VALID reason for using messy inline code, Not one, using SQL in the db gives you Scalability for one, Security and a lot easier to use.


#125

If your supporting more than 1 database environment, don’t go the stored procedure route because you will need 2 sets of procedures, one for database A and one for database B. This will be hard to maintain and a pain in general. Plus right now just two, what happens when another database is added, now you need 3 sets.

Create a DAL (Interface) that each database layer will implement

How you implement the DAL is up to you.


#126

I have a stored procedure that validates user logins and was wondering whether this was the best approach. The sproc is called from my middle tier and performs some of the following functions depending on whether the user is valid or invalid:

  1. checks to see if the user exists in the user table
  2. logs the login attempt to a log table
  3. logs the user IP to a log table
  4. checks to see if the user IP or username is banned
  5. checks to see if the user status is valid i.e active user vs not verified etc…
  6. updates the user record setting last login date to current datetime
  7. returns invalid user or a valid user record including role, status etc…

The sproc performs multiple functions and I only need to make one call from the middle tier into the database. The application is subject to heavy load, 1000’s of concurrent users and is transaction intensive.

How could this best be achieved using non sproc methods without incurring multiple calls into the database from the application layer ?

Thanks in advance

f00


#127

Anyone who has worked on enterprise level software would understand the benefits of sps over ad-hoc queries.

  1. Using f00’s user validation example, would you rather call
    "uspValidateUser ‘userId’" or would you rather send the whole query each time you need to validate a user? It could a difference of couple of bytes sent to the server or a couple of hundred Kilobytes PER validation.

  2. Do you want to expose the query and its inner workings over the network or would you rather send “uspValidateUser ‘userId’”.

  3. You can deny all permissions to a login and just allow the logins to execute certain sps. This simplifies security immensely. If you allow ad-hoc queries in the validation example, you must give that user access to the underlying tables. There may be columns that you do not want the user/developers to see. How would you prevent this? Column level permissions? Give me a break. Using views? This requires additional performance or space penalties depending on whether it’s an indexed view or not.

  4. If you were to send ad-hoc queries for multi-step transaction processes, such as f00’s validation sp, you’d need to use a transaction for the duration of the connection as you process each step with ad-hoc queries. If it is a 5 step process, you are making at least 5 trips. With the sp, you only need a transaction within the sp, which is for the during of the session. This is orders of magnitude better in terms of blocking, performance, security and simplicity.

  5. How can people say that managing schema changes are easier with ad-hoc queries than with sps. Let’s say there is a change to a table that is accessed by uspValidateUser. If there are many groups( web group, java group, .Net group ) that use uspValidateUser, just changing uspValidateUser would suffice. If all these groups issued ad-hoc queries, each would have to search through their code and update/test.
    Also, many schema changes, such as table or column name changes would not affect the developer if sps are used because the changes can be managed at the sp level. If you use ad-hoc queries, it would involve having to update the ad-hoc queries and compiling the app if it is an app.

  6. As for migrating to different servers, it’s infinitely simpler having the DBAs update the sps and developers update the DAL. This way the developers only have to worry about accessing the new servers because the sps should be the same on the new servers. You don’t have to worry about how the underlying objects( tables/columns/functions/views/etc) are implemented.

Having been through stints as a C++ developer of enterprise software and a DBA of enterprise level rdbms, there is no doubt in my mind that having sps as an interface for raw data access provides more flexibility, performance and security than using ad-hoc queries. Also, sps fit nicely into the N-tier development model. Don’t say that SQL is “archaic”, it makes you sound ignorant. Is assembly language archaic, what about interpretive languages? Is Scheme and other procedural languages archaic? Are markup-language archaic? SQL is a specialized language for a specialized task. The only reason we are having this discussion is solely due to incompetance and ignorance. If you are unable or unwilling to spend the time to learn SQL properly, you shouldn’t be a developer ( especially of data-driven software ). If you are like the 75% of “developers” out there who finished an online PHP tutorial and fancy himself an HTML “hacker”, then go ahead and keep using those ad-hoc queries to your hearts content.


#128

It seems to me that instead of fighting over one or the other we should be able to come to a common agreement: there are pros and cons to either side and there isn’t a simple answer. Each one of us has different goals, viewpoints, experiences, skill sets, likes/dislikes, deadlines, ROI, etc. which can lean it one way or the other. If all the pros were on one side, we wouldn’t be having this discussion. Having said that, I think these discussions are good so that we can make informed decisions.

I think the closest we can ever come to a common conclusion is a decision tree, but even then there will be differences of opinion. Here are some ideas to prove my point and get you started. I flattened the tree for this post. I am by no means suggesting I know it all, so bear with me.

  1. Are you creating simple throw-away software? Are you sure? Use whatever is fastest to develop (RAD, DataSet, no n-tier/n-layer whatever).
    1a. Does the throw-away software run for a long time (batch) and needs to perform better? Consider optimizing your code, using faster classes (no DataSet but rather DataReader, custom class…), less recursive functions, etc.
    1b. Will you be working with lots of data where memory could be a problem? Process one piece at a time.

  2. Are you creating a LOB app for a small amount of users? Are you sure it won’t need to scale soon? Focus more on coding/maintenance and not so much on performance. An ORM, dynamic SQL, would probably work fine.

  3. Are you creating a website for a large amount of users? Consider using stored procedures to reduce roundtrips and the amount of data that goes back and forth.

  4. Are you creating a website for a VERY large amount of users?
    4a. Is it mostly used as read-only (e.g. news site)? Consider using application caching to reduce database load. Consider having one write-only database and replicate to read-only databases.
    4b. Is it mostly used for writing? It’s easier to scale a web farm than a database, so consider not using stored procedures, triggers, etc.

  5. Is there a real chance you will want to switch databases?
    5a. Do you need it to perform well? Consider using stored procedures to tune the queries for each database type. This will require more work.
    5b. Otherwise, keep everything in a DAL/BLL to reduce dependency on database type.

  6. Will you be accessing the business logic from different languages (C#, Java, etc.)?
    6a. Do you want to go through the trouble of creating a web service with the pros/cons it has? Would all systems go through the web service? Keep your business logic outside of the DB and in your code.
    6b. Do you want to flatten some layers and keep it simple? Put your business logic in the DB.
    6c. Will it need to scale a lot (#4)? Reconsider the web service. You may need a web service farm as well.

  7. Will the business logic always be accessed from the same language, but several applications, websites, etc.? Keep it in the BLL.

  8. Do you have expert DBAs, able/willing to make changes as needed when needed?
    8a. Are the DBAs not that expensive and/or you need performance and you’re not scaling to extremes (#4)? Consider using the database more.

  9. Do you only have programmers who don’t understand SQL? Consider getting DBAs or keep more out of the database, use an ORM, etc.

  10. Did you decide to not use stored procedures? Do you have a process that requires a lot of back/forth to the DB that could really be optimized with a stored procedure? Consider using a stored procedure anyway in this case.

Note that some questions can go back/override some of the previous “conclusions”. I know it’s not accurate or complete (could it ever be?) and I’m sure many of you could add and improve on this, but that’s not the point. The point is there are a zillion variables and there is no “silver bullet” or “one size fits all”. Yes, sometimes one way of doing things is usually better, but there is generally at least one extreme/edge circumstance that overrides “usually better”. Other things are closer to middle ground and generally start “religious” wars such as this.

Choose the right tool for the task and good luck with deciding what the right tool is!


#129

Just to clarify, it’s a very complex decision and I completely skipped over some parts such as security considerations. :slight_smile:


#130

Anti-SPROC developers must be writing the equivalent of “See spot run” programs. It’s a rare developer indeed that knows what they are talking about when it comes to databases. 100 million record+ systems that have dozens of applications written against them better wall off “programmers” from the database with SPROCS or face a world of woe. Every idiot thinks he can write SQL or worse yet relies on ORM to do the idiot work for them. Do your work and let the people who know do theirs.


#131

This is a great article (at least if you read through all the comments), and as mentioned before, it all depends on the requirements. All in favor of SP’s seem to only be on 1 DBMS, and in that case I would propably agree, but what about a retail package which must support more DBMSses because the client already has Oracle or MS SQL Server and doesn’t want another DBMS on their network, or isn’t willing to pay for a license and wants to resort to MySQL? What if you don’t have the ‘direct’ control of the database and everything has to go through an updateapplication.
I’ve been reading this article because we are starting up a new projec which must at least be able to support MS-SQL Server, Oracle and MySQL, as it all depends on the client. After reading this whole article I get the impression that SP’s aren’t the way to go for this kind of project, unless we are willing to invest a lot of time on having to spend updating/maintaining all those SP’s.

But as said before, it all depends on what your needs are, and budget/time available…


#132

Hate to bump this topic, however this was 4 years ago and I want to see what people think about stored procedures in Microsoft Sql Server with the advent of SSDT and Visual Studio SP debugging support?

With SSDT you can manage your entire MSSQL database in a Visual Studio project with publishing support (create profiles for deploying to different servers and it handles the deploy/update for you). Which means you can check the database into source control.

VS2013+ also supports stepping through MSSQL Stored Procedures with the debugger in Visual Studio.

And now MSSQL 2016 supports JSON so we can query json objects, which means creating stored procedures with fewer parameters and using the new JSON support to process them as JSON Strings.

I haven’t done anything with JSON support yet, but a recent DB I am working on is 100% SSDT and stored procedures for most insert/update operations, I even validate credentials against Active Directory and do AD queries in stored procedures.

I’ve found that it’s actually faster for me to build procs in SSDT then write logic in my Service layer in c#. It keeps my Service Layer pretty clean, Instead of:

var employee = new EmployeeDto() { id = "1", name = "Bob", etc};
if (uow.Db.Exists<EmployeeDto>(employee))
    uow.Db.Update(employee);
else
    uow.Insert(employee);

I have something like
> uow.Db.ExecuteScalar(“EXEC [xyz].[addUpdateEmployee] @@id = @id, @@name = @name”, employee);

Should note that all my SP’s have a parameter called ignoreNulls so that if I set it to 1, null parameters are ignored and don’t overwrite existing values.

I am using PetaPoco and it’s T4 Text Templates to generate poco’s from the database I built with SSDT.


#133

One thing I did not see you mention as that stored procedures can cause duplication of implementation.

In a reporting site we were making we needed to get many sums and counts grouped by either month, area, country and/or some other values. When joining the team there were about 150 stored procedures all with (for example):

JOIN [Calendar] on Entity.Year = Calendar.Year and Entity.Month = Calendar.Month

And something like:

SELECT ..., [Calendar].[Month] ...[the JOIN] ... GROUP BY [Calendar].[Month]

Or when filtering (filtering does not need to select the group):

SELECT ..., [the JOIN] ... WHERE [Calendar].[Month] IN (....)

When adding the option to filter and group by quarterly value another 150 stored procedures needed to be created or 150 stored procedures needed to be changed since the join and group were copy pasted in all the 150 procedures.

I added a SQL builder class and DataLayerHelper types to build a SQL statement that inherits from BaseDataLayerHelper so adding grouping by quaterly value would require me to only change one function; the BaseDataLayerHelper.prepareGroup (pseudo code):

prepareGroup(Query q, dataType, groupType){
    .....existing code
    if(groupType == BaseDataLayerHelper.GROUP_BY_QUARTER){
        IField dateField = this.joinCalendar(q, fromTable, groupType);
        ((Field)dateField).group = true;
        ((Field)dateField).sort = "ASC";
        q.addField(dateField);
    }
}

This would never be possible using stored procedures. Any new grouping or filtering feature that needed to be applied to all or many entities can now be implemented in the BaseDataLayerHelper.

Some people would still argue stored procedures is better because of performance (even though performance is not a problem) and because things get done quicker when people can use stored procedures.

Maybe sprocs for new things (new entities) get done quicker because some people in the team do not know how to use the helper and SQL builder but implemening application wide new filtering and grouping options would definetely take a lot longer with stored procedures.

It is funny though that even though in this particular case; using strored procedures was such an obvious Maintainability and extensibility problem, some people would still stubbornly vote to use them.


#134

Such a non-sense you claim. I had to login just to comment on this article.

First, you say “We should consider SP as DB Assembly”, then you close “We should use it for situations where the performance is critical”.

Every single system on earth would be greatly benefited from ANY performance improvement that can be done, no matter the size of the project.

You have a quite poor philosophy by believing it’s OK to ignore “critical” performance to your subjective understanding of the problem by the time the system is developed.


#135

After so many years have passed, you should do a follow up entry on the same subject. In all honesty, I disagree with just about everything you said, but it would be good to hear your thoughts after all this time.
How’s that for a challenge :wink: ?


#136

I would also be curious to see if the original author has changed his position on this topic. In case anyone cares (you probably don’t) here is my spin on this.

Stored procedures insulate applications from having to know or care about the specifics of where in the schema a particular data item resides. Application software in most instances should not need to concern itself with the pedantic details of the data storage model. If a particular application data object contains properties that were gathered from several different tables or even different schemas why should the application programmer need to care about this? To us application devs the business objects served up by the data access layer are widgets and I could not care less about the details of which cubbyhole the database keeps them in. Since us devs generally have to write the data access layers as well as the actual software my preference every time is to call a stored procedure to store or retrieve data rather than write some OE Linq gobeldygook that has virtually no protection from schema design changes in the database. As soon as you step on that OE train then a side-effect that you cannot get around is that it introduces a requirement of schema awareness into your application which in my opinion is highly undesirable. Is the SP model 100% decoupled from the schema? No it is not because as others have pointed out the parameters are definitely part of the schema too but they are relatively easy to maintain compared to table relationships and data storage locations.

Does it cost more to implement a SP interface up front? Yes of course it does. Does it save money in the long run? Like buying insurance, that entirely depends on what happens in the long run. If you are the type of developer who parachutes into a project, does your thing, and then leaves never to be seen again you probably do not care about this very much. If you care about your clients (or your employer, whatever the case may be) then perhaps your thinking will be more long-term.

As many other posters have already said, there are cases when one approach to data access is clearly superior to another so that one should be used, however in most cases I will take the schema independence advantages over the expediency of an OE type of design.

Given the age of the original post I am going to ignore the author’s obvious lack of understanding of what web services are and reserve comment on that aspect of the post.


#137

Link for the “Daily WTF entry on Stored Procedures” is broken, new url:
http://thedailywtf.com/articles/Who_Needs_Stored_Procedures,_anyways_0x3f_


#138

I have a React front-end a Golang server cluster behind an NGINX proxy with MySQL as the DB. I use Stored Procedures, but I don’t do any business logic in them. I like to keep them simple and strictly related to fetching/storing data and if there is some dynamism ensuring that it always returns the same columns. But I generally do most of the rest from Golang. On the front end I actually have a datatypes.js file that defines the shape of all my data types with the default values that works with a library I wrote to do various manipulation based on it. The fields in these data types are actually the same as the names of the database fields so the Golang JSON field annotations can be the same as the ones in the data type JS declarations.