
MS SQL Server Technology Discussions



[quote name='Kaarthikeya' timestamp='1360030825' post='1303226520']
MTKR mama, you're writing the same query again... doesn't the query you wrote pull the max date across the entire table?

Each record has a different max date... I wrote that kind of query but it didn't work out for me... maybe because I'm joining one more table apart from the two tables above, which could be why the results differ... I could solve the issue today, but the View is taking nearly 2 minutes to run and I can't figure out how to tune it.


Anyways, thank you for actively replying :)
[/quote]


If you say I'm writing the same thing again and again, it's because that is the answer, so I keep writing it...

In my previous posts I showed it on a table with 10 records... there the max date is not picked across the entire table; it is taken only within each set of group ids...


Don't execute the whole query in one go... step through it bit by bit and you'll see where your results are going missing...


Anyway, all the best....


[quote name='ghazni' timestamp='1360031972' post='1303226571']
Put MAX(enddate) and GROUP BY only the id.......
[/quote]
[quote name='loveindia' timestamp='1360038025' post='1303227084']
Hey man... he already told you to write MAX(date) with GROUP BY id.. if you write that, it picks the max date for every id.. you must be missing that in your query.. look carefully..
[/quote]
[quote name='mtkr' timestamp='1360068940' post='1303228334']


If you say I'm writing the same thing again and again, it's because that is the answer, so I keep writing it...

In my previous posts I showed it on a table with 10 records... there the max date is not picked across the entire table; it is taken only within each set of group ids...


Don't execute the whole query in one go... step through it bit by bit and you'll see where your results are going missing...


Anyway, all the best....
[/quote]

I did write it that way, but it didn't work out, which is why I asked here, mama. Anyways, I could figure out a query for this yesterday.

Thanks all for your help :)
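
For anyone reading the thread later, the per-group maximum pattern discussed above is usually written along these lines (a sketch only; the table and column names are hypothetical, not the actual tables being debugged):

[CODE]
-- Max end date per id (illustrative names)
SELECT id, MAX(end_date) AS max_end_date
FROM dbo.SomeTable
GROUP BY id;

-- Join it back to pull the full row carrying each id's latest date
SELECT t.*
FROM dbo.SomeTable t
JOIN ( SELECT id, MAX(end_date) AS max_end_date
       FROM dbo.SomeTable
       GROUP BY id ) m
  ON m.id = t.id
 AND m.max_end_date = t.end_date;
[/CODE]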


[b] Fix Error: 8111 – Cannot define PRIMARY KEY constraint on nullable column in table – Error: 1750 – Could not create constraint. See previous errors[/b]


This is a very common error new developers receive when they begin with SQL Server and start playing with keys. Let us first run the following code, which will generate error 8111.

[CODE]
-- Create Table
CREATE TABLE test (ID INT, Col1 INT,
Col2 VARCHAR(100))
GO
-- Now create PK on ID Col
ALTER TABLE test
ADD CONSTRAINT [PK_test]
PRIMARY KEY CLUSTERED
([ID] ASC)
GO
[/CODE]

When you run the above code, it will give the following error:
[color=#ff0000]Msg 8111, Level 16, State 1, Line 2[/color]
[color=#ff0000]Cannot define PRIMARY KEY constraint on nullable column in table ‘test’.[/color]
[color=#ff0000]Msg 1750, Level 16, State 0, Line 2[/color]
[color=#ff0000]Could not create constraint. See previous errors.[/color]

Now let us see how we can fix the error.

Fix / Workaround / Solution:

The error message is very clear and explains the root cause of the error: any column used in a primary key must not be nullable.

There are two ways to fix this error.
[b] Method 1: Alter Column as NOT NULL (preferred)[/b]

This method is the most popular because, most of the time, the database design is already in place, so altering the column to NOT NULL is the easiest thing to do.

[CODE]
-- Alter Column to not null
ALTER TABLE test
ALTER COLUMN ID INT NOT NULL
GO
[/CODE]
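
With the column now NOT NULL, re-running the original constraint definition from the top of this post should succeed. A minimal sketch repeating that statement:

[CODE]
-- Re-run the PK creation; it now succeeds
ALTER TABLE test
ADD CONSTRAINT [PK_test]
PRIMARY KEY CLUSTERED
([ID] ASC)
GO
[/CODE]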
[b] Method 2: Create a table with the Column as NOT NULL[/b]

This method is only useful if you already know that the Primary Key column needs to be NOT NULL. When you design or model your database, keep this in mind and create the table with the column as NOT NULL. Additionally, you can create the Primary Key constraint at the same time you create the original table.

[CODE]
-- Create Table
CREATE TABLE test (ID INT NOT NULL,
Col1 INT, Col2 VARCHAR(100)
CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED)
GO
[/CODE]


[b] Fix: Error: 1505 The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name and the index name[/b]


Here is another example where the error messages are very clear but developers often get confused by them. I think the reason for the confusion is the word [i][b]“Key”[/b][/i] used in the error message. After I explained this to a developer who sent me the error, he realized that it is all about how we each interpret the same statement.

The following code will generate error 1505.

[CODE]
-- Create Table
CREATE TABLE test (ID INT NOT NULL,
Col1 INT, Col2 VARCHAR(100))
GO
-- Populate Table
INSERT INTO test (ID, Col1, Col2)
SELECT 1, 1, 'First'
UNION ALL
SELECT 1, 2, 'Second' -- Duplicate ID col
UNION ALL
SELECT 3, 3, 'Third'
UNION ALL
SELECT 4, 4, 'Fourth'
GO
[/CODE]

After creating the table, I insert the same key value into the first column multiple times.

[CODE]
-- Now create PK on ID Col
ALTER TABLE test
ADD CONSTRAINT [PK_test]
PRIMARY KEY CLUSTERED
([ID] ASC)
GO
[/CODE]

Now when I attempt to create a Primary Key on the column, it gives the following error.

[color=#ff0000]Msg 1505, Level 16, State 1, Line 2[/color]
[color=#ff0000]The CREATE UNIQUE INDEX statement terminated because a duplicate key was found for the object name ‘dbo.test’ and the index name ‘PK_test’. The duplicate key value is (1).[/color]
[color=#ff0000]Msg 1750, Level 16, State 0, Line 2[/color]
[color=#ff0000]Could not create constraint. See previous errors.[/color]

The statement has been terminated.
[b] Fix / Workaround / Solution:[/b]

In SQL Server, a Primary Key can't contain duplicate values, nor can it be built on a nullable column. In our case, you can see that we are creating a primary key on the column ID, which contains the duplicate value 1. The only way to create a primary key on this column is to delete the duplicate row. If your business logic does not allow deleting the duplicate row, your column is not a good candidate for the Primary Key and you will have to either select another column or use a composite Primary Key (one spanning multiple columns).
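
Before choosing either path, it can help to see exactly which key values are duplicated. A quick check against the sample table used above:

[CODE]
-- List the ID values that appear more than once
SELECT ID, COUNT(*) AS DuplicateCount
FROM test
GROUP BY ID
HAVING COUNT(*) > 1;
GO
[/CODE]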


A View I wrote is taking more than 2 minutes to run... Is that good or not?

PS: that View pulls 7 million records in one go and GROUPs them


[quote name='Kaarthikeya' timestamp='1360075635' post='1303228601']
A View I wrote is taking more than 2 minutes to run... Is that good or not?

PS: that View pulls 7 million records in one go and GROUPs them
[/quote]

7 million records in 2 mins is quite acceptable.. just displaying that many records takes that long..


Can anyone answer these?
1. Triggers can be used to ensure referential integrity, enforce business rules, create audit trails, and replicate tables, but cannot call other triggers - yes/no?
2. Joining tables or using a subquery may produce the same result - yes/no? I think yes, am I right?


[quote name='SonyVaio' timestamp='1360097061' post='1303231298']
Can anyone answer these?
1. Triggers can be used to ensure referential integrity, enforce business rules, create audit trails, and replicate tables, but cannot call other triggers - yes/no?
2. Joining tables or using a subquery may produce the same result - yes/no? I think yes, am I right?
[/quote]

1) YES
2) YES, but a join is preferred because of performance.


[quote name='loveindia' timestamp='1360094203' post='1303230983']

7 million records in 2 mins is quite acceptable.. just displaying that many records takes that long..
[/quote]
I mean, I group those 7 million records... so roughly only about 200K records come out, that's all


[b] Capturing Baselines on SQL Server: Wait Statistics[/b]

In order to work through this article, you'll need SQL Server 2005 or higher and a BaselineData database in which to store baseline information (see the Part 2 link above, where you'll find the scripts to create this database).

[b] Clearing Wait Stats[/b]

By default, SQL Server clears the cumulative wait statistics for an instance from the sys.dm_os_wait_stats DMV upon instance restart. In addition, a DBA can clear the statistics manually using DBCC SQLPERF ([url="http://msdn.microsoft.com/en-us/library/ms189768.aspx"]http://msdn.microsof...y/ms189768.aspx[/url]).
It is not required to clear out wait statistics on a regular basis. However, in order to analyze this data in a meaningful way, both to understand what "normal" behavior is and also quickly spot abnormalities when comparing data from a previous period to current data, it is important that DBAs adopt a clear policy on the timing and frequency of clearing wait statistics.
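
For reference, manually clearing the waits DMV is a single statement; run it only in line with whatever policy the team agrees on:

[CODE]
-- Clears the cumulative wait statistics for the instance
DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);
[/CODE]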

If one DBA is clearing wait statistics at 6 AM daily, and another DBA is capturing the information at 7 AM, the data only represents the waits accumulated during an hour's workload, which may not represent the normal workload.

Ideally, we need to collect wait statistics for a period that represents normal activity, without clearing them. At the same time, however, the DBAs will want to understand how significant changes, such as adding a new index, or altering a configuration setting, affect the pattern of waits for the instance. Clearing the wait statistics immediately after making the change can help us understand the impact of the modification.
There are other reasons that may compel a DBA to clear waits statistics. For example, some companies use a single third-party utility for all backups, regardless of the application, resulting in an extended time to complete SQL Server database backups. While there are alternate options, which can perform very fast SQL Server backups (e.g. native SQL backups, dedicated third-party SQL backup applications), the DBA is unable to use them. In such cases, the DBA knows backup performance to be poor, but is unable to make improvements and instead may opt to clear out wait statistics after each backup job completes, to prevent any waits from the actual backup from influencing the interpretation of the wait statistics as a whole. Alternatively, the DBA can filter BACKUP* waits from the output.

The queries in Listing 1 will reveal when wait statistics were last cleared by an instance restart, as well as if, and when, someone last cleared them manually. Simply compare the two values to see if wait statistics have been manually cleared since the last restart.


[CODE]
SELECT [wait_type] ,
[wait_time_ms] ,
DATEADD(SS, -[wait_time_ms] / 1000, GETDATE()) AS "Date/TimeCleared" ,
CASE WHEN [wait_time_ms] < 1000
THEN CAST([wait_time_ms] AS VARCHAR(15)) + ' ms'
WHEN [wait_time_ms] BETWEEN 1000 AND 60000
THEN CAST(( [wait_time_ms] / 1000 ) AS VARCHAR(15)) + ' seconds'
WHEN [wait_time_ms] BETWEEN 60001 AND 3600000
THEN CAST(( [wait_time_ms] / 60000 ) AS VARCHAR(15)) + ' minutes'
WHEN [wait_time_ms] BETWEEN 3600001 AND 86400000
THEN CAST(( [wait_time_ms] / 3600000 ) AS VARCHAR(15)) + ' hours'
WHEN [wait_time_ms] > 86400000
THEN CAST(( [wait_time_ms] / 86400000 ) AS VARCHAR(15)) + ' days'
END AS "TimeSinceCleared"
FROM [sys].[dm_os_wait_stats]
WHERE [wait_type] = 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP';
/*
check SQL Server start time - 2008 and higher
*/
SELECT [sqlserver_start_time]
FROM [sys].[dm_os_sys_info];

/*
check SQL Server start time - 2005 and higher
*/
SELECT [create_date]
FROM [sys].[databases]
WHERE [database_id] = 2
[/CODE]
[b]Listing 1: When were waits stats last cleared, either manually or by a restart?[/b]

Ultimately, it is at the discretion of the DBAs to decide when to clear out wait statistics, if at all. If in doubt, collect your wait statistics (see the next section) for a period that will capture a representative workload (for example one month). The next month, collect the stats on a more regular schedule, such as every Sunday, after scheduled code changes, and then immediately clear out the sys.dm_os_wait_stats DMV. Compare each of the four 1-week data sets to the 1-month set: do different wait patterns exist (for example, perhaps the last week of the month, when various business reports run, has different waits), or are they consistent across all five sets? If you see differences then you may want to consider clearing out the stats on a regular (e.g. weekly) basis.

[b] Collecting Wait Statistics for Analysis[/b]

In order to collect wait statistics on a regular schedule, the first step is to create a table to hold the information, as shown in Listing 2 (as described previously, this script assumes the BaselineData database exists).

[CODE]
USE [BaselineData];
GO
IF NOT EXISTS ( SELECT *
FROM [sys].[tables]
WHERE [name] = N'WaitStats'
AND [type] = N'U' )
CREATE TABLE [dbo].[WaitStats]
(
[RowNum] [BIGINT] IDENTITY(1, 1) ,
[CaptureDate] [DATETIME] ,
[WaitType] [NVARCHAR](120) ,
[Wait_S] [DECIMAL](14, 2) ,
[Resource_S] [DECIMAL](14, 2) ,
[Signal_S] [DECIMAL](14, 2) ,
[WaitCount] [BIGINT] ,
[Percentage] [DECIMAL](4, 2) ,
[AvgWait_S] [DECIMAL](14, 2) ,
[AvgRes_S] [DECIMAL](14, 2) ,
[AvgSig_S] [DECIMAL](14, 2)
);
GO
CREATE CLUSTERED INDEX CI_WaitStats ON [dbo].[WaitStats] ([RowNum], [CaptureDate]);
[/CODE]
[b]Listing 2: Creating the dbo.WaitStats table[/b]

The second step is simply to schedule a query to run on a regular basis, which captures the wait information to this table. Listing 3 derives from a query presented in Paul Randal's [url="http://www.sqlskills.com/blogs/paul/post/Wait-statistics-or-please-tell-me-where-it-hurts.aspx"]wait statistics post[/url]. This query uses a CTE to capture the raw wait statistics data, and then manipulates the output to include averages, for example, average wait (AvgWait_S), and average signal wait (AvgSig_S). I included an additional, optional INSERT as it helps to separate each set of data collected when reviewing the output.

[CODE]
USE [BaselineData];
GO
INSERT INTO dbo.WaitStats
( [WaitType]
)
VALUES ( 'Wait Statistics for ' + CAST(GETDATE() AS NVARCHAR(19))
);
INSERT INTO dbo.WaitStats
( [CaptureDate] ,
[WaitType] ,
[Wait_S] ,
[Resource_S] ,
[Signal_S] ,
[WaitCount] ,
[Percentage] ,
[AvgWait_S] ,
[AvgRes_S] ,
[AvgSig_S]
)
EXEC
( '
WITH [Waits] AS
(SELECT
[wait_type],
[wait_time_ms] / 1000.0 AS [WaitS],
([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS],
[signal_wait_time_ms] / 1000.0 AS [SignalS],
[waiting_tasks_count] AS [WaitCount],
100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER() AS [Percentage],
ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]
FROM sys.dm_os_wait_stats
WHERE [wait_type] NOT IN (
N''CLR_SEMAPHORE'', N''LAZYWRITER_SLEEP'',
N''RESOURCE_QUEUE'', N''SQLTRACE_BUFFER_FLUSH'',
N''SLEEP_TASK'', N''SLEEP_SYSTEMTASK'',
N''WAITFOR'', N''HADR_FILESTREAM_IOMGR_IOCOMPLETION'',
N''CHECKPOINT_QUEUE'', N''REQUEST_FOR_DEADLOCK_SEARCH'',
N''XE_TIMER_EVENT'', N''XE_DISPATCHER_JOIN'',
N''LOGMGR_QUEUE'', N''FT_IFTS_SCHEDULER_IDLE_WAIT'',
N''BROKER_TASK_STOP'', N''CLR_MANUAL_EVENT'',
N''CLR_AUTO_EVENT'', N''DISPATCHER_QUEUE_SEMAPHORE'',
N''TRACEWRITE'', N''XE_DISPATCHER_WAIT'',
N''BROKER_TO_FLUSH'', N''BROKER_EVENTHANDLER'',
N''FT_IFTSHC_MUTEX'', N''SQLTRACE_INCREMENTAL_FLUSH_SLEEP'',
N''DIRTY_PAGE_POLL'')
)
SELECT
GETDATE(),
[W1].[wait_type] AS [WaitType],
CAST ([W1].[WaitS] AS DECIMAL(14, 2)) AS [Wait_S],
CAST ([W1].[ResourceS] AS DECIMAL(14, 2)) AS [Resource_S],
CAST ([W1].[SignalS] AS DECIMAL(14, 2)) AS [Signal_S],
[W1].[WaitCount] AS [WaitCount],
CAST ([W1].[Percentage] AS DECIMAL(4, 2)) AS [Percentage],
CAST (([W1].[WaitS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) AS [AvgWait_S],
CAST (([W1].[ResourceS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) AS [AvgRes_S],
CAST (([W1].[SignalS] / [W1].[WaitCount]) AS DECIMAL (14, 4)) AS [AvgSig_S]
FROM [Waits] AS [W1]
INNER JOIN [Waits] AS [W2]
ON [W2].[RowNum] <= [W1].[RowNum]
GROUP BY [W1].[RowNum], [W1].[wait_type], [W1].[WaitS],
[W1].[ResourceS], [W1].[SignalS], [W1].[WaitCount], [W1].[Percentage]
HAVING SUM ([W2].[Percentage]) - [W1].[Percentage] < 95;'
);
GO
[/CODE]
[b]Listing 3: Capturing wait stats data for analysis[/b]

We should capture wait statistics regularly, at least once a week or once a month. We could do so more frequently, perhaps daily, but remember that unless we are clearing the data regularly, they represent an aggregation of waits since the last restart. The longer the waits have been accumulating, the harder it may be to spot smaller changes in wait percentages. For example, let's say the system suffers a short period (one hour) of poor performance, during which the number and duration of a certain wait type increase significantly. This 'spike' may be hard to spot if you're analyzing waits accumulated over a long period (e.g. a month), since the spike might not significantly affect the overall wait percentage for that period.
[b] Reviewing Wait Statistics Data[/b]

You'll want to review the waits for each SQL Server instance regularly. If you capture them once a week, then check on the trending once a week. The simple SELECT in Listing 4 retrieves all the data captured for the last 30 days from the dbo.WaitStats table.

[CODE]
SELECT *
FROM [dbo].[WaitStats]
WHERE [CaptureDate] > GETDATE() - 30
ORDER BY [RowNum];
[/CODE]
[b]Listing 4: Reviewing the last 30 days' data[/b]

If you need to view older data, adjust the number of days as necessary, or remove the predicate entirely. In some cases, it might be ideal to look at only the top wait for each set of data collected, as shown in Listing 5 (again, alter the number of days as needed).

[CODE]
SELECT [w].[CaptureDate] ,
[w].[WaitType] ,
[w].[Percentage] ,
[w].[Wait_S] ,
[w].[WaitCount] ,
[w].[AvgWait_S]
FROM [dbo].[WaitStats] w
JOIN ( SELECT MIN([RowNum]) AS [RowNumber] ,
[CaptureDate]
FROM [dbo].[WaitStats]
WHERE [CaptureDate] IS NOT NULL
AND [CaptureDate] > GETDATE() - 30
GROUP BY [CaptureDate]
) m ON [w].[RowNum] = [m].[RowNumber]
ORDER BY [w].[CaptureDate];
[/CODE]
[b]Listing 5: Reviewing the top wait for each collected data set[/b]
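
Another useful cut, sketched below against the same dbo.WaitStats table, is to trend one wait type of interest across capture dates (CXPACKET is used here purely as an example):

[CODE]
SELECT [CaptureDate] ,
[WaitType] ,
[Percentage] ,
[Wait_S] ,
[WaitCount] ,
[AvgWait_S]
FROM [dbo].[WaitStats]
WHERE [WaitType] = N'CXPACKET'
AND [CaptureDate] > GETDATE() - 90
ORDER BY [CaptureDate];
[/CODE]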

There are many ways in which to examine this data, but your focus, initially, should be to understand the top waits in your system and ensure they're consistent over time. Expect to tweak your capture and monitoring process in the first few weeks of implementation.
As discussed earlier, it's up to the DBAs to decide on a clear, consistent policy on how often to collect and analyze this data, and when to clear out the sys.dm_os_wait_stats DMV. Here, by way of a starting point, I offer three possible options for clearing, capturing and reviewing this data:

[b]Option 1[/b][b]:[/b][list]
[*]Never clear wait statistics
[*]Capture weekly (at the end of any business day)
[*]Review weekly
[/list]
[b]Option 2[/b][b]:[/b][list]
[*]Clear wait statistics on Sunday nights (or after a full weekly backup)
[*]Capture daily at the end of the business day
[*]Review daily, checking to see if the percentages for wait types vary throughout the week
[/list]
[b]Option 3[/b][b]:[/b][list]
[*]Clear wait statistics nightly (after full or differential backups complete)
[*]Capture daily, at the end of the business day (optional: capture after any evening or overnight processing)
[*]Review daily, checking to see how the waits and their percentages vary throughout the week (and throughout the day if capturing more than once a day)
[/list]
The critical point to remember is that you are capturing this data to achieve a baseline, and to understand "normal" wait patterns on your systems. However, it's common, as you're reviewing this information, to identify existing or potential bottlenecks. This is a good thing. Having this information allows you to investigate an unexpected or high wait type, and determine the possible source of the bottleneck that caused the wait, before it becomes a production problem.
[b] Managing Historical Data[/b]

As with all baseline data, it will cease to be relevant after a certain point, and you can remove it from the BaselineData database. The query in Listing 6 removes data older than 90 days, but you can adjust this value as appropriate for your environment. The overall size of the dbo.WaitStats table will depend on the choices you make about how often to capture the data and how long to retain it.
[CODE]DELETE FROM [dbo].[WaitStats]
WHERE [CaptureDate] < GETDATE() - 90;[/CODE]
[b]Listing 6: Purging data over 90 days old[/b]
Alternatively, if you've been following the series you may have already implemented the dbo.usp_PurgeOldData stored procedure (see [url="http://www.sqlservercentral.com/articles/baselines/94656/"]Part 2[/url]), in which case you can modify it to incorporate this DELETE statement, as shown in Listing 7.

[CODE]
IF OBJECTPROPERTY(OBJECT_ID(N'usp_PurgeOldData'), 'IsProcedure') = 1
DROP PROCEDURE usp_PurgeOldData;
GO
CREATE PROCEDURE dbo.usp_PurgeOldData
(
@PurgeConfig SMALLINT ,
@PurgeCounters SMALLINT ,
@PurgeWaits SMALLINT

)
AS
BEGIN;
IF @PurgeConfig IS NULL
OR @PurgeCounters IS NULL
BEGIN;
RAISERROR(N'Input parameters cannot be NULL', 16, 1);
RETURN;
END;
DELETE FROM [dbo].[ConfigData]
WHERE [CaptureDate] < GETDATE() - @PurgeConfig;
DELETE FROM [dbo].[ServerConfig]
WHERE [CaptureDate] < GETDATE() - @PurgeConfig;
DELETE FROM [dbo].[PerfMonData]
WHERE [CaptureDate] < GETDATE() - @PurgeCounters;
DELETE FROM [dbo].[WaitStats]
WHERE [CaptureDate] < GETDATE() - @PurgeWaits;
END;
[/CODE]
[b]Listing 7: The dbo.usp_PurgeOldData stored procedure[/b]

To delete configuration data older than 30 days, performance counter data older than 7 days, and wait statistics older than 90 days, respectively, execute the following statement:

[CODE]EXEC dbo.usp_PurgeOldData 30, 7, 90[/CODE]


[b] Final Notes[/b]

Wait statistics are one of the best places for a Database Administrator to start when tuning a SQL Server environment or troubleshooting a performance issue. While wait statistics alone will not solve a problem, they are a key piece of information that will point you in the right direction, particularly when you have baseline values that you can reference.

The queries provided in this article should serve as a good starting point for any DBA to capture and review the wait statistics. As always, remember to add the dbo.WaitStats table to your optimization tasks, and add indexes as needed to support new queries and reports.


[b] Using the Data Profiling SQL Server Integration Services SSIS task[/b]


[b] Problem[/b]

Have you ever had to write a bunch of T-SQL to do some data analysis on the table data in your database? If you have, you'll know that this can become a fairly time-consuming and tedious task. SQL Server 2012 Integration Services has a feature called the Data Profiling task that can help you perform this analysis much more easily and quickly (this feature is also available in SQL Server 2008). This task can help you find potential issues with your existing data as well as help you become more familiar with the data in a database that you have just started managing.
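
For context, the sort of hand-written analysis the task automates might look like the following sketch; the dbo.Customer table and Email column are hypothetical names used only for illustration:

[CODE]
-- Rough manual profile of a single column: row count, NULL count and distinct values
SELECT COUNT(*) AS TotalRows ,
SUM(CASE WHEN [Email] IS NULL THEN 1 ELSE 0 END) AS NullEmails ,
COUNT(DISTINCT [Email]) AS DistinctEmails
FROM [dbo].[Customer];
[/CODE]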
[b] Solution[/b]

To use the Data Profiling SSIS task in SQL Server 2012 we first need to create a new Integration Services project in [url="http://www.mssqltips.com/sqlservertip/2804/introduction-to-sql-server-data-tools/"]SQL Server Data Tools[/url] (formerly [url="http://www.mssqltips.com/sqlservertutorial/204/business-intelligence-development-studio-bids/"]Business Intelligence Development Studio[/url]). Once you've created the project and opened it, your screen should look as follows with an empty package design tab.

[center][img]http://www.mssqltips.com/tipimages2/2854_DataToolsInitialScreen.gif[/img][/center]

Now that we have a package open, we need to add a connection manager object for the database we want to run the profile against. To do this we right-click "Connection Managers" in the Solution Explorer and select "New Connection Manager". Then select "ADO.NET" and click "Add...", then click "New..." and fill in the dialog box as shown below. Once everything is filled in, I like to click "Test Connection" just to make sure it's configured correctly. After this is successful you can click "OK" to complete the setup.

[center][img]http://www.mssqltips.com/tipimages2/2854_DBConnectionManager.gif[/img][/center]

Next we can add the Data Profiling task to the package by clicking and dragging it from the SSIS Toolbox onto the Control Flow panel.

[center][img]http://www.mssqltips.com/tipimages2/2854_PackageWithDataProfilingTask.gif[/img][/center]

We can start to configure the task by double-clicking on the task in the Control Flow panel. A dialog box opens to the General tab, where we can set up our Profiler Output File connection. Click on the Destination property, select "<New File connection...>" and complete the dialog box as shown below. Once complete, we can click "OK".

[center][img]http://www.mssqltips.com/tipimages2/2854_FileConnectionManager.gif[/img][/center]

The complete General tab should now look like this.

[center][img]http://www.mssqltips.com/tipimages2/2854_GeneralTab.gif[/img][/center]

From this panel we have two different methods we can use to configure what we are going to profile. First, we can click on "Quick Profile" and we are presented with a form we can use to configure one or more tables to be profiled. For this example I am just going to select a single table and check all the profiles to be computed. A description of each profile can be found [url="http://technet.microsoft.com/en-us/library/bb895263.aspx"]here[/url]. Once completed, the form should look like the one below.

[center][img]http://www.mssqltips.com/tipimages2/2854_QuickProfile.gif[/img][/center]

After clicking "OK" on the form above we are taken to the Profile Requests panel which is the other method of configuring this task and could have been selected when the initial form was opened. Looking at the screenshot below we can see all of the Profile Types we selected in the last form are configured here. Looking under the Data subsection of the Request Properties section we can see that we also have the ability to profile one or more columns within the table. As an example, if we were looking for a candidate key we may exclude some columns from the profile that we know for sure would not be part of the key.

[center][img]http://www.mssqltips.com/tipimages2/2854_ProfileRequestsPanelFromQuickProfile.gif[/img][/center]

Let's add one more table/column by clicking on the next empty space in the "Profile Type" column. After selecting the profile type we can fill in the "Request Properties" section as shown below by selecting a connection manager, table/view and columns. In this case we are just going to select a single column. After clicking "OK" to complete the configuration we are taken back to the Control Flow tab of our package.

[center][img]http://www.mssqltips.com/tipimages2/2854_ProfileRequestsPanel.gif[/img][/center]

The configuration of our data profiling task is now complete and we can run it by selecting "Start Debugging" from the Debug menu or by using the F5 shortcut. Once it has completed successfully we have to select "Stop Debugging" from the Debug menu or use the Shift+F5 shortcut to be able to view the output.

The output file is just a regular XML file ([url="http://www.mssqltips.com/tipimages2/2854_AdventureWorksDataProfiler.xml"]here is a sample[/url]), and Microsoft provides two options for launching the [url="http://technet.microsoft.com/en-us/library/bb895313.aspx"]Data Profile Viewer[/url], which displays this XML file in a nicely formatted output. The first way is to double-click on the Data Profiling task on the Control Flow tab, which will open the General tab as it did when you were configuring the task. Once here, we can click "Open Profile Viewer". Alternatively, we can start the [url="http://technet.microsoft.com/en-us/library/bb895313.aspx"]Data Profile Viewer[/url] as a stand-alone application. It can be found under Start Menu -> Microsoft SQL Server 2012 -> Integration Services -> Data Profile Viewer. Once there, simply open the output file that was created by the Data Profiling task. Regardless of the method you choose, you will get the following window to display the contents of the XML file.

[center][img]http://www.mssqltips.com/tipimages2/2854_DataProfileViewer.gif[/img][/center]


[b] T-SQL insert using SSIS Data Pump[/b]

This article implements an automation of the SSIS Data Pump method. The SSIS Data Pump is the most effective way to load a large amount of data into a SQL Server database, because it makes optimal use of the transaction log, parallelism, locking, and pipelining. A more detailed explanation is in the referenced article.
Imagine the following scenario: I have a large filegroup containing 20 tables, and the filegroup has a lot of unused space. If I want to recover the unused space, the best way is to create a new filegroup, rename the old tables, create the new tables on the new filegroup, and then copy the rows from the old tables into the new tables. To copy 20 tables, I could manually create 20 SSIS packages and run them, but it would be nicer to have a T-SQL stored procedure that, given a T-SQL SELECT statement, executes it and loads the resulting rows into a given table using the SSIS Data Pump.
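As a rough illustration of that scenario (the database, filegroup, file and table names below are assumptions for the sketch, not part of the article's solution):

[CODE]
-- Add a new filegroup and a data file to it
ALTER DATABASE MyDb ADD FILEGROUP FG_NEW;
ALTER DATABASE MyDb ADD FILE
( NAME = 'MyDb_FGNew1', FILENAME = 'N:\Data1\MyDb_FGNew1.ndf' )
TO FILEGROUP FG_NEW;
GO
-- Rename the old table and recreate it on the new filegroup
EXEC sp_rename 'dbo.BigTable', 'BigTable_old';
CREATE TABLE dbo.BigTable (ID INT NOT NULL, Col1 INT, Col2 VARCHAR(100)) ON FG_NEW;
GO
-- The rows are then copied across, which is what Direct_Path_Insert automates
[/CODE]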
This article presents an implemented and tested solution that loads data using a dynamically generated SSIS package. The solution is made up of two components:[list=1]
[*]An SSIS package named BULK_TRANSFER_DATA.dtsx
[*]the stored procedure Direct_Path_Insert
[/list]
Here is a snapshot of how BULK_TRANSFER_DATA looks:

[img]http://www.sqlservercentral.com/Images/17245.jpg[/img]

The core of the solution is the VB script Create_DTSX, which creates an SSIS package using values contained in package variables that act as input parameters. This code is similar to that shown at these two links:[list]
[*][url="http://msdn.microsoft.com/en-us/library/ms345167.aspx"]http://msdn.microsoft.com/en-us/library/ms345167.aspx[/url]
[*][url="http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/thread/3f513155-6b09-45bc-b913-a24906d1b580"]http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/thread/3f513155-6b09-45bc-b913-a24906d1b580[/url]
[/list]
The code is shown below:

[CODE]
Public Sub Main()
Dim packageName As String = Dts.Variables("DEST_TABLE_NAME").Value.ToString

Dim threads As Integer = 1
orderNumber = 1
'--------------------------------------------------------------------------------------------
app = New Application()
Dts.Events.FireInformation(0, "Bulk Transfer Package", packageName, "", 0, True)
Dim srcDbConnectionString As String = Dts.Variables("SOURCE_CONNECT").Value.ToString
Dim destDbConnectionString As String = Dts.Variables("DEST_CONNECT").Value.ToString
'--------------------------------------------------------------------------------------------
'Create package
dataTransferPackage = New Package()
dataTransferPackage.LocaleID = 1029
'Create SourceConection
Dim srcConn As Microsoft.SqlServer.Dts.Runtime.ConnectionManager = dataTransferPackage.Connections.Add("OLEDB")
srcConn.Name = "OLEDB Source"
srcConn.ConnectionString = srcDbConnectionString
'Create DestConection
Dim destConn As Microsoft.SqlServer.Dts.Runtime.ConnectionManager = dataTransferPackage.Connections.Add("OLEDB")
destConn.Name = "OLEDB Destination"
destConn.ConnectionString = destDbConnectionString

Dim SQLCommand As String = Dts.Variables("SOURCE_SELECT").Value.ToString
Dim schemaTableName As String = Dts.Variables("DEST_TABLE_NAME").Value.ToString
'------------------------------------------------------
'Create DataFlow task
Dim ePipeline As Executable = dataTransferPackage.Executables.Add("STOCK:PipelineTask")
Dim thMainPipe As Microsoft.SqlServer.Dts.Runtime.TaskHost = CType(ePipeline, Microsoft.SqlServer.Dts.Runtime.TaskHost)
thMainPipe.Name = Format(orderNumber, "000000") + "_copy_Table_Task_" + schemaTableName
orderNumber += 1
Dim dataFlowTask As MainPipe = CType(thMainPipe.InnerObject, MainPipe)
dataFlowTask.DefaultBufferSize = 50 * 1024 * 1024
' FastLoadMaxInsertCommitSize will have the same value of DefaultBufferMaxRows
' It represents the size in rows of a commited batch
' If I want to have all in a single transaction, must choose 0
dataFlowTask.DefaultBufferMaxRows = 20000

' dataFlowTask.BLOBTempStoragePath = dataTransferCache
' dataFlowTask.BufferTempStoragePath = dataTransferCache
Dts.Events.FireInformation(0, "DataFlow task created for table", schemaTableName, "", 0, True)
'--------------------------------------------------------------------------------------------
'Create OLEDB Source
Dts.Events.FireInformation(0, "Creating OLEDB source - for table", schemaTableName, "", 0, True)
Dim compSrc As IDTSComponentMetaData100 = dataFlowTask.ComponentMetaDataCollection.New()
compSrc.ComponentClassID = "DTSAdapter.OleDbSource.2"
compSrc.Name = "OLEDBSource_" + schemaTableName

' Initialize the component.
Dim instSrc As CManagedComponentWrapper = compSrc.Instantiate()
instSrc.ProvideComponentProperties()
'Dts.Events.FireInformation(0, "Set Connection OLEDB source - for table", "[" + schemaName + "].[" + tableName + "]", "", 0, True)
compSrc.RuntimeConnectionCollection(0).ConnectionManagerID = dataTransferPackage.Connections("OLEDB Source").ID
compSrc.RuntimeConnectionCollection(0).ConnectionManager = DtsConvert.GetExtendedInterface(dataTransferPackage.Connections("OLEDB Source"))
compSrc.CustomPropertyCollection.Item("DefaultCodePage").Value = 1250
Dts.Events.FireInformation(0, "OLEDB source query", SQLCommand, "", 0, True)
instSrc.SetComponentProperty("AccessMode", 2) ' Table or view ... 0, SQL Command ... 2, Table or view - fast 3
instSrc.SetComponentProperty("SqlCommand", SQLCommand)
' Reinitialize the metadata.
instSrc.AcquireConnections(vbNull)
instSrc.ReinitializeMetaData()
instSrc.ReleaseConnections()
instSrc = Nothing
compSrc.Name = "OLEDBSource_" + schemaTableName
Dim srcOutput As IDTSOutput100 = compSrc.OutputCollection(0)
Dts.Events.FireInformation(0, "SSIS Tasks created for table", schemaTableName, "", 0, True)
'--------------------------------------------------------------------------------------------
'Create OLEDB Destination
Dts.Events.FireInformation(0, "Creating OLEDB dest - for table", schemaTableName, "", 0, True)
Dim compDest As IDTSComponentMetaData100 = dataFlowTask.ComponentMetaDataCollection.New()
compDest.ComponentClassID = "DTSAdapter.OleDbDestination.2"
compDest.Name = "OLEDBDest_" + schemaTableName
' Initialize the component.
Dim instDest As CManagedComponentWrapper = compDest.Instantiate()
instDest.ProvideComponentProperties()
compDest.RuntimeConnectionCollection(0).ConnectionManagerID = dataTransferPackage.Connections("OLEDB Destination").ID
compDest.RuntimeConnectionCollection(0).ConnectionManager = DtsConvert.GetExtendedInterface(dataTransferPackage.Connections("OLEDB Destination"))
compDest.CustomPropertyCollection.Item("DefaultCodePage").Value = 1250
' Set the custom properties
instDest.SetComponentProperty("AccessMode", 3)
instDest.SetComponentProperty("OpenRowset", "DBO." + schemaTableName)
Dts.Events.FireInformation(0, "instDest", schemaTableName, "", 0, True)
instDest.SetComponentProperty("FastLoadKeepIdentity", True)
instDest.SetComponentProperty("FastLoadKeepNulls", True)
instDest.SetComponentProperty("FastLoadMaxInsertCommitSize", dataFlowTask.DefaultBufferMaxRows)
instDest.SetComponentProperty("FastLoadOptions", "CHECK_CONSTRAINTS ")
',ROWS_PER_BATCH = " + dataFlowTask.DefaultBufferMaxRows.ToString())
Dim destInput As IDTSInput100 = compDest.InputCollection(0)
Dim path As IDTSPath100 = dataFlowTask.PathCollection.New()
path.AttachPathAndPropagateNotifications(srcOutput, destInput)
' Reinitialize the metadata.
instDest.AcquireConnections(vbNull)
instDest.ReinitializeMetaData()
compDest.Name = "OLEDBDest_" + schemaTableName
Dts.Events.FireInformation(0, "instDest_ioriuno", schemaTableName, "", 0, True)
Dim vdestInput As IDTSVirtualInput100 = destInput.GetVirtualInput()
For Each vColumn As IDTSVirtualInputColumn100 In vdestInput.VirtualInputColumnCollection
Dim vCol As IDTSInputColumn100 = instDest.SetUsageType(destInput.ID, vdestInput, vColumn.LineageID, DTSUsageType.UT_READWRITE)
instDest.MapInputColumn(destInput.ID, vCol.ID, destInput.ExternalMetadataColumnCollection(vColumn.Name).ID)
Next
instDest.ReleaseConnections()
instDest = Nothing
'--------------------------------------------------------------------------------------------
dataTransferPackage.MaxConcurrentExecutables = threads
dataTransferPackage.MaximumErrorCount = 100
app.SaveToXml(packageName + ".dtsx", dataTransferPackage, Nothing)
Dts.TaskResult = ScriptResults.Success
End Sub
[/CODE]

BULK_TRANSFER_DATA first creates an SSIS package named after the destination table. It then executes that package using the built-in SSIS ExecutePackage component and finally deletes the package. The dynamically created package contains a DataFlow that contains a simple OleDb Source - OleDb Destination mapping.
The OleDb Source is created from the value of the package variable SOURCE_SELECT, which is the concatenation of the variables SELECT_CLAUSE + WHERE_CONDITION. The OleDb Destination inserts data into the table named by the variable DEST_TABLE_NAME.
Let's now look at the stored procedure. The procedure first creates a Windows batch file named after the destination table and then launches it. The batch file sets the working directory to the value of @work_dir and then launches BULK_TRANSFER_DATA.
The prerequisites for Direct_Path_Insert to run are:[list=1]
[*]The destination table exists
[*]The columns generated by the source SELECT must have identical names (case sensitive), data types, and lengths to the columns of the destination table.
[/list]
The procedure code is shown here:

[CODE]
/*
Created by F Iori 20110427
Performs a bulk insert by dynamically generating a dtsx package, running it, and then deleting it
Setup: copy BULK_TRANSFER_DATA.dtsx into @work_dir on the server, and that's all
@dest_tab must be given without a schema name; dbo is assumed
@select_clause column names are case sensitive; they must match the destination table's columns in name, type and number,
otherwise the proc will fail
*/
/*
declare
@dest_tab varchar(200) = 'drop_GSM_CDR_PTC_TRAFFIC_ACTUAL' -- Schema is always dbo
, @src_db varchar(200) = 'DWHSK_TARGET_PROD'
, @dest_db varchar(200) = 'DWHSK_STAGE_PROD'
, @instance_name varchar(50) = 'NTSQLDWHSKT01\I03'
-- select clause column names are case sensitive
,@select_clause varchar(2000) =
'select top 350 CDR_ID, File_ID, Exchange_ID , Charging_Start_Time , Create_Date from GSM_CDR_PTC_TRAFFIC_ACTUAL where 1=1 '
, @wherecond varchar(2000) = ' and Charging_Start_Time between ''2011-03-01'' and ''2011-03-02'' ' -- or empty string
exec dwhsk_warehouse.dbo.Direct_Path_Insert @dest_tab , @src_db , @dest_db ,
@instance_name , @select_clause , @wherecond
*/
CREATE proc [dbo].[Direct_Path_Insert] (
@dest_tab varchar(200) , @src_db varchar(50), @dest_db varchar(50),
@instance_name varchar(50), @select_clause varchar(2000) ,
@wherecond varchar(2000) ) as
declare
@ret int , @cmd varchar(8000) = '' , @cmds varchar(8000) = ''
, @work_dir varchar(100)= 'N:\Data1\SS_Direct_Path_Insert\'
begin
set @cmd =
' DTexec.exe /File "BULK_TRANSFER_DATA.dtsx" '
+ ' /SET "\package.Variables[User::DEST_TABLE_NAME].Value";"'+@dest_tab+'" '
+ ' /SET "\package.Variables[User::SOURCE_DB].Value";"'+@src_db+'" '
+ ' /SET "\package.Variables[User::DEST_DB].Value";"'+@dest_db+'" '
+ ' /SET "\package.Variables[User::INSTANCE_NAME].Value";"'+@instance_name+'" '
+ ' /SET "\package.Variables[User::SELECT_CLAUSE].Value";"'+@select_clause+'" '
+ ' /SET "\package.Variables[User::WHERE_CONDITION].Value";"'+@wherecond+'" '

print @cmd
set @cmds= ' del '+@work_dir+@dest_tab+'.* '
exec @ret= master.sys.xp_cmdshell @cmds , no_output
set @cmds= 'echo '+LEFT(@work_dir,1)+': > '+@work_dir+@dest_tab+'.bat '
exec @ret= master.sys.xp_cmdshell @cmds , no_output
set @cmds= 'echo cd '+@work_dir+' >> '+@work_dir+@dest_tab+'.bat '
exec @ret= master.sys.xp_cmdshell @cmds , no_output
set @cmds=' echo '+@cmd+' >> '+@work_dir+@dest_tab+'.bat '
exec @ret= master.sys.xp_cmdshell @cmds , no_output
set @cmds = ' '+@work_dir+@dest_tab+'.bat '
exec @ret= master.sys.xp_cmdshell @cmds , no_output
if @ret<>0
BEGIN
declare @errmsg varchar(300) = @dest_tab+'.bat failed '
raiserror (@errmsg, 12, 1)
RETURN
END
set @cmds= ' del '+@work_dir+@dest_tab+'.bat '
exec @ret= master.sys.xp_cmdshell @cmds , no_output
end
[/CODE]

Installation of this system is very easy:[list=1]
[*]On the database server, create a folder named the same as the value of @work_dir defined in the stored procedure Direct_Path_Insert.
[*]Copy the SSIS package BULK_TRANSFER_DATA.dtsx into the @work_dir folder.
[*]Create the stored procedure Direct_Path_Insert in any database on the instance.
[/list]
Now the stored procedure can be called from a T-SQL script as:

[CODE]
exec Direct_Path_Insert <destination_table> , <source_db_name>, <destination_db_name>, @@servername, <source_select>, <where_condition for source select>
[/CODE]

Example:

[CODE]
use DWHSK_STAGE_PROD
go
-- Create destination table
select top 250 [CDR_ID]
,[File_ID]
,[Exchange_ID]
, Charging_Start_Time
, Create_Date
into dwhsk_stage_prod.dbo.drop_GSM_CDR_PTC_TRAFFIC_ACTUAL
from dwhsk_target_prod..GSM_CDR_PTC_TRAFFIC_ACTUAL where 1=2
-- Loads some rows into destination table
declare
@dest_tab varchar(200) = 'drop_GSM_CDR_PTC_TRAFFIC_ACTUAL' -- Schema is always dbo
, @src_db varchar(200) = 'DWHSK_TARGET_PROD'
, @dest_db varchar(200) = 'DWHSK_STAGE_PROD'
, @instance_name varchar(50) = 'NTSQLDWHSKT01\I03'
,@select_clause varchar(2000) =
' select CDR_ID, File_ID, Exchange_ID , Charging_Start_Time , Create_Date from GSM_CDR_PTC_TRAFFIC_ACTUAL where 1=1 '
, @wherecond varchar(2000) = ' and Charging_Start_Time between ''2011-03-01'' and ''2011-03-02'' ' -- or empty string
exec dwhsk_warehouse.dbo.Direct_Path_Insert @dest_tab , @src_db , @dest_db ,
@instance_name , @select_clause , @wherecond
select * from dwhsk_stage_prod.dbo.drop_GSM_CDR_PTC_TRAFFIC_ACTUAL
drop table dwhsk_stage_prod.dbo.drop_GSM_CDR_PTC_TRAFFIC_ACTUAL
[/CODE]

At the end, the dynamically generated SSIS packages and Windows batch files are deleted from the working directory on the database server.
Please note that the insert operation performed by Direct_Path_Insert is not atomic; it commits inserted rows in batches of 20,000 rows, which is the value of dataFlowTask.DefaultBufferMaxRows in the VB script. If an error occurs, only the last batch is rolled back.
To have atomic behaviour, just set FastLoadMaxInsertCommitSize to 0 and all the inserted rows will be committed or rolled back in a single batch.

The name Direct_Path_Insert was chosen because the procedure offers similar functionality to Oracle's direct-path insert (an INSERT with the APPEND hint).

[b] Resources:[/b]

[url="http://www.sqlservercentral.com/Files/Direct_Path_Insert.sql/9388.sql"]Direct_Path_Insert.sql[/url] | [url="http://www.sqlservercentral.com/Files/BULK_TRANSFER_DATA.dtsx/9389.dtsx"]BULK_TRANSFER_DATA.dtsx[/url]


You run the following code in an SSMS query window in SQL Server 2008, 2008 R2, or 2012. The SQL Server instance and SSMS are default installations (language English (United States), collation Latin1_General_CI_AS) with default configuration. The database PLAYPEN exists and its default collation is the server default.
[CODE]declare @k int;
select
case when @k=0 OR NOT (@k = 0)
then cast(0 as datetime)
else cast('2012-04-01T10:00:00' as date)
end as [when] ;[/CODE]
What value does the select statement deliver for column [when]?


Does anyone know the answers to these? [b]True/False[/b]
1. XML uses bcode, long commands embedded in quotes, to characterize data
2. Storing XML data is becoming a big question as XML data becomes more prevalent
3. Application Programming Interfaces are more efficient than CGI scripts and are implemented as shared code or dynamic link libraries
4. A client/server configuration that has three layers (one client and two servers) has a three-tier architecture

