Posted by Josh | Posted in SQL Server | Posted on 12-05-2009
Tags: performance, SQL Server
From Linchi Shea’s blog:
In SQL Server, it is rather handy to retrieve data from a different SQL Server instance and use the result locally in another SQL statement for further processing. In theory and in the set purists’ fantasy land, it shouldn’t matter where you get your data or even how you get the data as long as you can use them to further compose a set-based solution.
That is all fine. After all, in a distributed environment the data you want may not be on the same instance where the processing takes place. Well, that is fine until it comes to performance and troubleshooting. If you use linked servers indiscriminately, sooner or later you’ll run into these issues, and they are not pretty.
We have used linked server functionality at various points at work. As Linchi points out, it can be quite useful when you have data distributed across multiple systems. But I have seen cases where it’s caused issues, such as excess network traffic, authentication problems (anyone actually been able to get Kerberos auth to work for a two-hop Windows auth’d linked server?), and general performance malaise.
So what are some alternatives?
- Replication – true, it may not solve the network issue, and depending upon the type used, it might cause a delay in data transfer. But it would allow for queries to execute against a single server.
- Static data transfer using SSIS – I would say this would be fine for offline use of data such as reporting. There have been very few cases in my experience where true real-time data has been a hard business requirement, though at times the point has been, shall we say, vigorously argued.