Skip to content

SQL Server PDW Table Storage Basics

2010 July 1
by Brian Mitchell

Within a SQL Server 2008 R2 Parallel Data Warehouse appliance, Compute Node storage is designed to boost IO during data reads.  Disk IO is notorious for being a the place where you find server bottlenecks.  SQL PDW is designed to alleviate those problems by greatly increasing the number of spindles answer any particular query.  Each compute node has its own storage array that includes ten disks that are set up in RAID 10 pairs.  Each RAID 10 pair uses multipath so that the compute node can read from both drives in the RAID pair for improved performance.  Four of those RAID groups are assigned for user data storage and the fifth RAID group is set aside for transaction logs.  During setup of the appliance, nine LUNS are set up for each compute node.  Additionally, there is a hot spare in each array in order to quickly replace any disk that goes down during usage.

A single compute node storage array:


Replicated Tables

Replicated tables  have a file group that includes eight files, spread across the first four RAID Groups.    The replicated files all are created the same size and have the same growth rate.  This allows us to take advantage of the proportional fill algorithm employed by SQL Server.  What this means is that the data in the replicated tables will be spread evenly across all disks in a compute node.  By spreading the tables across all disks, we can have all eight hard drive head’s access data in parallel at the same time to speed up operations.

Distributed Tables

Distributed tables have their rows dispersed across several storage locations.  Each storage location is called a distribution.  Each compute node has eight distributions which are physically implemented within SQL Server as separate file groups.  If an appliance has eight compute nodes, that appliance will have 64 distributions.  Each file group is assigned a different LUN within the SAN, effectively distributing the data on that compute node.  Upon creation of a distributed table, a column is chosen that a deterministic function is applied against.  This function will determine one and only one storage location within the appliance for that data. 

Digging Deeper

Looking at my virtual appliance in a Hyper-V environment, we can see that the modified for PDW AdventureWorks database has 17 files.  Also, we see that there are 8 file groups for distributed tables.  Each File group has its own file.  There is only one file group for Replicated tables, but this file group has 8 files in it spread across all the different LUNS.  If you are looking closely you may protest that all files are on drive G!  That was my first impression also, but a closer look shows that the appliance uses mount points.  Thus the folders data_01, data_02, etc all point to different LUNS as described above.




You may notice that TempDB doesn’t show up in this discussion.   That’s because TempDB is located on the local storage to the compute node.   I’ll discuss this design in another post.  The thing that I’m still wrapping my mind around is that this describes just one compute node.  When we distribute a table, it will have a minimum of eight of these babies and thus every distributed table will be broken up into at least 64 physical pieces spread across 64 LUNS.  This is minimum.  At some point someone will have a 40 node appliance, thus a distributed table will have 320 distributions across the appliance…now that’s spreading out some disk IO.

No comments yet

Leave a Reply

Note: You can use basic XHTML in your comments. Your email address will never be published.

Subscribe to this comment feed via RSS