1. Apply partitioning on big and fat tables
    Table partitioning means splitting a large table into multiple smaller tables so that queries have to scan less data when retrieving rows. This is very useful when you have a very large table (possibly with millions of records) in your database and, even after all other optimization steps, queries on this table still execute slowly. The following two options are available for partitioning a table.

    • Horizontal partitioning
      Suppose we have a table containing 10 million rows. We can divide the table's data into 10 separate partitions of 1 million records each, where the partitioning is based on, for example, the value of the ID field. The first partition will contain the rows with ID values in the range 1-1000000, the second partition will contain the rows with ID values in the range 1000001-2000000, and so on.
      This way we partition the table by grouping the rows based on a criterion (the ID range). It's like having a stack of books in a box and horizontally splitting the stack by taking a group of books from the top and putting them into smaller boxes.
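
      As an illustration, a minimal T-SQL sketch of this idea is shown below. The table name (Customers), the partition function and scheme names, and the use of a single [PRIMARY] file group for every partition are all hypothetical, chosen only to keep the example self-contained; the detailed, file-group-based procedure is described under "How to partition?" below.

      -- Minimal sketch of horizontal (range) partitioning on an INT ID column.
      -- All object names are hypothetical; every partition is placed on [PRIMARY]
      -- only to keep the example self-contained.
      CREATE PARTITION FUNCTION FNCustomerIDRange (INT)
      AS RANGE LEFT FOR VALUES (1000000, 2000000, 3000000)

      CREATE PARTITION SCHEME CustomerIDPScheme
      AS PARTITION FNCustomerIDRange ALL TO ([PRIMARY])

      -- Creating the table on the partition scheme splits its rows by ID range
      CREATE TABLE Customers
      (
          ID INT NOT NULL,
          Name NVARCHAR(100)
      ) ON CustomerIDPScheme (ID)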

    • Vertical partitioning
      Suppose we have a table with many columns and millions of records. Some of the columns are accessed very frequently by some queries, while most of the other columns are accessed far less frequently by other queries. Because the table is huge in terms of both columns and rows, any data retrieval query on it performs slowly. So, based on how frequently the columns are accessed, we can split the table into two or more tables, each containing a subset of the columns of the original table. In our case, one partition of the table should contain the columns that are frequently accessed by queries and another partition should contain the columns that are accessed less frequently. Splitting the columns vertically and putting them into separate, thinner tables is called vertical partitioning.
      Another good criterion for vertical partitioning is to put indexed columns and non-indexed columns into separate tables. Vertical partitioning can also be done by moving LOB or VARCHAR(MAX) columns into a separate table.
      This type of partitioning has to be done very carefully, because if any query involves columns from both partitions, the query processing engine has to join the two partitions to retrieve the data, which in turn degrades performance.
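
      For illustration, here is a minimal sketch of vertical partitioning, assuming a hypothetical Products table whose rarely accessed, wide columns are moved into a separate ProductDetails table that shares the same key; all table and column names are invented for this example.

      -- Frequently accessed ("hot") columns stay in the main table
      CREATE TABLE Products
      (
          ProductID INT NOT NULL PRIMARY KEY,
          Name      NVARCHAR(100) NOT NULL,
          Price     DECIMAL(10, 2) NOT NULL
      )

      -- Rarely accessed, wide columns (LOB/VARCHAR(MAX)) live in a separate
      -- table sharing the same key, so the two can be joined when needed
      CREATE TABLE ProductDetails
      (
          ProductID   INT NOT NULL PRIMARY KEY REFERENCES Products (ProductID),
          Description VARCHAR(MAX) NULL,
          Photo       VARBINARY(MAX) NULL
      )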

    • Partitioning best practice
      1. Consider partitioning big fat tables into different file groups, where each file in a file group resides on a separate physical disk. This enables the database engine to read/write data faster; a small sketch follows after this list.
      2. For history data, consider partitioning based on "age". For example, suppose a table holds order data. To partition this table, use the OrderDate column to split it so that a separate partition contains each year's order data.
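
      As a small sketch of the first point, each file group can be backed by a data file on a different physical disk; the database name, file names, paths, and drive letters below are hypothetical.

      -- Hypothetical sketch: each file group's data file sits on a different physical disk
      ALTER DATABASE OrderDB ADD FILEGROUP [FG_Disk1]
      ALTER DATABASE OrderDB ADD FILE (NAME = N'OrderData1', FILENAME = N'D:\Data\OrderData1.ndf') TO FILEGROUP [FG_Disk1]

      ALTER DATABASE OrderDB ADD FILEGROUP [FG_Disk2]
      ALTER DATABASE OrderDB ADD FILE (NAME = N'OrderData2', FILENAME = N'E:\Data\OrderData2.ndf') TO FILEGROUP [FG_Disk2]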


    • How to partition?
      Suppose we have an Orders table in our database that contains order data for 4 years (1999, 2000, 2001, and 2002) and this table contains millions of rows. We would like to apply partitioning to this table. To do that, the following tasks need to be performed:

      1. Add user defined file groups to the database
        To create a file group and add a data file to it, run the following:
        ALTER DATABASE OrderDB ADD FILEGROUP [1999]
        ALTER DATABASE OrderDB ADD FILE (NAME = N'1999', FILENAME = N'C:\OrderDB\1999.ndf',
        SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB) TO FILEGROUP [1999]

        Here we are adding a file group "1999" and adding a secondary data file "C:\OrderDB\1999.ndf" to it. We do this because we want to put each table partition into its own file in a separate file group. Using the same SQL commands you can create the other three file groups, "2000", "2001", and "2002"; each of these file groups will store one year's order data in its corresponding data file, as sketched below.
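
        For completeness, the commands for the remaining file groups are sketched below; the sizes and file paths simply mirror the values used above and can be adjusted as needed.

        ALTER DATABASE OrderDB ADD FILEGROUP [2000]
        ALTER DATABASE OrderDB ADD FILE (NAME = N'2000', FILENAME = N'C:\OrderDB\2000.ndf',
        SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB) TO FILEGROUP [2000]

        ALTER DATABASE OrderDB ADD FILEGROUP [2001]
        ALTER DATABASE OrderDB ADD FILE (NAME = N'2001', FILENAME = N'C:\OrderDB\2001.ndf',
        SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB) TO FILEGROUP [2001]

        ALTER DATABASE OrderDB ADD FILEGROUP [2002]
        ALTER DATABASE OrderDB ADD FILE (NAME = N'2002', FILENAME = N'C:\OrderDB\2002.ndf',
        SIZE = 5MB, MAXSIZE = 100MB, FILEGROWTH = 5MB) TO FILEGROUP [2002]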
      2. Create a partition function
        A partition function is an object that defines the boundary points for partitioning data.
        CREATE PARTITION FUNCTION FNOrderDateRange (DateTime) AS
        RANGE LEFT FOR VALUES ('19991231', '20001231', '20011231')

        The above partition function specifies that the OrderDate column values will be distributed among the partitions as follows:

        DateTime <= 1999/12/31 falls into the 1st partition.
        DateTime > 1999/12/31 and <= 2000/12/31 falls into the 2nd partition.
        DateTime > 2000/12/31 and <= 2001/12/31 falls into the 3rd partition.
        DateTime > 2001/12/31 falls into the 4th partition.

        RANGE LEFT specifies that each boundary value falls into the partition on its left. For example, here the boundary value 1999/12/31 falls into the 1st partition (together with all dates less than this value) and the next greater value falls into the next partition. If we specified RANGE RIGHT instead, each boundary value would fall into the partition on its right; in that case the boundary value 1999/12/31 would fall into the 2nd partition and any date less than this value would fall into the 1st partition. A sketch of a RANGE RIGHT variant is shown below for contrast.
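
        For contrast, the following is a sketch of the same idea using RANGE RIGHT; with RANGE RIGHT the first day of each year is the more natural boundary, so that each partition holds exactly one year of data (the function name is hypothetical).

        -- RANGE RIGHT: each boundary value falls into the partition on its right
        CREATE PARTITION FUNCTION FNOrderDateRangeRight (DateTime) AS
        RANGE RIGHT FOR VALUES ('20000101', '20010101', '20020101')

        -- DateTime <  2000/01/01                  -> 1st partition (all of 1999)
        -- DateTime >= 2000/01/01 and < 2001/01/01 -> 2nd partition (all of 2000)
        -- DateTime >= 2001/01/01 and < 2002/01/01 -> 3rd partition (all of 2001)
        -- DateTime >= 2002/01/01                  -> 4th partition (2002 onwards)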
      3. Create a partition scheme
        The partition scheme maps the partitions of a partitioned table/index to the file groups that will be used to store them. To create a partition scheme, use the following command:
        CREATE PARTITION SCHEME OrderDatePScheme AS PARTITION FNOrderDateRange
        TO ([1999], [2000], [2001], [2002])
        Here, we are specifying that:

        The 1st partition should go into the "1999" file group
        The 2nd partition should go into the "2000" file group
        The 3rd partition should go into the "2001" file group
        The 4th partition should go into the "2002" file group
      4. Apply partitioning on the table
        So far we have defined the necessary partitioning objects, so all that remains is to partition the table itself.
        To do this, follow these steps:
        First, drop the existing clustered index from the table; it was most likely created along with the primary key. A clustered index can normally be dropped with the DROP INDEX statement, but an index that backs a primary key must be removed by dropping the constraint. Assuming that PK_Orders is the primary key of the table, use the following command to drop it, which will also drop the clustered index:
        ALTER TABLE Orders DROP CONSTRAINT PK_Orders
        Then recreate the clustered index on the partition scheme. The index can be created on the partition scheme as follows:
        CREATE UNIQUE CLUSTERED INDEX PK_Orders ON Orders(OrderDate) ON
        OrderDatePScheme (OrderDate)

        Assuming that the OrderDate column values are unique in the table, the table will be partitioned based on the specified partition scheme (OrderDatePScheme), which internally uses the partition function to partition the table into 4 smaller parts in 4 different file groups.
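
        Once the index is rebuilt, you can verify how the rows were distributed with a query along the following lines; it uses the built-in $PARTITION function together with the object names from this example.

        -- Count the rows that landed in each partition of the Orders table
        SELECT $PARTITION.FNOrderDateRange(OrderDate) AS PartitionNumber,
               COUNT(*) AS RowsInPartition
        FROM Orders
        GROUP BY $PARTITION.FNOrderDateRange(OrderDate)
        ORDER BY PartitionNumber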



For more on partitioning, check Partitioned Tables and Indexes in SQL Server 2005.