Three hours of this process were taken up by transferring data across the network between two servers. There were only 8 million rows of simple text and numbers, so the time it was taking was clearly excessive. And as the number of rows was increasing daily, this needed attention: the running time was only going to grow.
The Problem

It was simple: the data was stored in the source and destination tables as nvarchar(max). Every column, whether it held an 8-digit date or a 3-character code, was being stored as nvarchar(max).
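As a hypothetical illustration (the table and column names here are invented; the actual schema was not given), the source table looked something like this:

```sql
-- Hypothetical sketch of the problematic schema:
-- every column is nvarchar(max), regardless of what it actually holds.
CREATE TABLE dbo.SourceData
(
    TradeDate    nvarchar(max),  -- actually an 8-digit date, e.g. '20240131'
    CurrencyCode nvarchar(max),  -- actually a 3-character code, e.g. 'GBP'
    Amount       nvarchar(max)   -- actually a number
);
```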
Some might ask: what difference does that make? This data type will hold string data of any length, it's a variable-length type, and SQL Server will only use as much storage as it needs. True, but SQL Server also compresses data where it can, and unfortunately nvarchar(max) columns are excluded (Unicode compression, for example, does not apply to them). And even if storage space is not an issue for you, once you start to transfer the data you will definitely feel the effects.
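One quick way to see the per-value cost is DATALENGTH: nvarchar stores two bytes per character, so even a short code doubles in size compared with a single-byte char column. A minimal sketch:

```sql
-- nvarchar stores 2 bytes per character, so a 3-character code
-- occupies 6 bytes as nvarchar versus 3 bytes as char(3).
SELECT DATALENGTH(N'GBP') AS nvarchar_bytes,  -- 6
       DATALENGTH('GBP')  AS char_bytes;      -- 3
```

Multiply that difference across every column of 8 million rows and the network cost adds up quickly.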
The Solution

As the source table was not under my control, I did not want to affect any other processes that might write to it, so I created a new table in the source database using the most appropriate data types: int, varchar(20), char(3), and so on. As part of the daily process I copied the data into my new table and then kicked off the transfer to the destination table, which also now had the correct structure.
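A hedged sketch of that step, using the same invented names as above (the real column types and names would come from inspecting the actual data):

```sql
-- Hypothetical staging table with appropriate data types.
CREATE TABLE dbo.SourceData_Typed
(
    TradeDate    int,       -- 8-digit date such as 20240131
    CurrencyCode char(3),
    Amount       decimal(18, 2)
);

-- Daily step: copy from the nvarchar(max) source, converting as we go.
-- TRY_CONVERT returns NULL instead of raising an error on bad values,
-- which keeps the load running if a stray row doesn't parse.
INSERT INTO dbo.SourceData_Typed (TradeDate, CurrencyCode, Amount)
SELECT  TRY_CONVERT(int, TradeDate),
        TRY_CONVERT(char(3), CurrencyCode),
        TRY_CONVERT(decimal(18, 2), Amount)
FROM    dbo.SourceData;
```

Whether to fail loudly (CONVERT) or tolerate bad rows (TRY_CONVERT) is a design choice; here the tolerant version is shown so one malformed value cannot stop the daily load.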
The Result

55 seconds. I did not believe it at first and had to run the process a number of times to satisfy myself that it was doing everything it needed to. It was true. The data transfer time was reduced from three hours to 55 seconds!
Whether this was an oversight by a developer, a lack of time to go back and review, or simply inexperience, it was costly. The network was being tied up by this data transfer on a daily basis. In fact, once the problem was fixed, the whole process had to be rescheduled because it was now finishing too quickly and other parts weren't ready for the next steps to start running.
Sometimes going back to the very beginning can be extremely beneficial in terms of productivity and efficiency.
The fact that the same unchanged data was being transferred every day is not addressed here. That deserves a blog post of its own.