FAQ

SQLDBA
This is a generic question often asked by many interviewers. Explain what are the different SQL Server Versions you have worked on, what kind of administration of those instances has been done by you. Your role and responsibilities carried out in your earlier projects that would be of significance to the potential employer. This is the answer that lets the interviewer know how suitable are you for the position to which you are being interviewed.

2. What are the different SQL Server Versions you have worked on?

The answer would be depending on the versions you have worked on, I would say I have experience working in SQL Server 7, SQL Server 2000, 2005 and 2008. If you have worked only the some version be honest in saying that, remember, no one would be working on all versions, it varies from individual to individual.

3. What are the different types of Indexes available in SQL Server?

The simplest answer to this is “Clustered and Non-Clustered Indexes”. There are other types of Indexes what can be mentioned such as Unique, XML, Spatial and Filtered Indexes. More on these Indexes later.

4. What is the difference between Clustered and Non-Clustered Index?

=> In a clustered index, the leaf level pages are the actual data pages of the table. When a clustered index is created on a table, the data pages are arranged accordingly based on the clustered index key. There can only be one Clustered index on a table.
=> In a Non-Clustered index, the leaf level pages does not contain data pages instread it contains pointers to the data pages. There can multiple non-clustered indexes on a single table.

There are quite a lot of changes and enhancements in SQL Server 2005. Few of them are listed here :
=> Database Partitioning
=> Dynamic Management Views
=> System Catalog Views
=> Resource Database
=> Database Snapshots
=> SQL Server Integration Services
=> Support for Analysis Services on a  Fail over Cluster.
=> Profiler being able to trace the MDX queries of the Analysis Server.
=> Peer-toPeer Replication
=> Database Mirroring

 Failover Clustering, Database Mirroring, Log Shipping and Replication are the High-Availability features available in SQL Server. I would recommend reading this blog of mine which explains the differences between these 4 features. Comparing the High Availability Features in SQL Server 2005
Inside SSMS, in Object explorer under SQL Server Agent look for Job Activity Monitor. The job activity monitor displays the current status of all the jobs on the instance. Choose the particular job which GRANT, DENY and REVOKE.displayed and you may choose the execution time (if the job failed multiple times during the same day).
There would information such as the time it took to execute that Job and details about the error occurred.
1433
A Database can contain a maximum of 32,767 files.
· There are Primarily 2 types of data files Primary data file and Secondary data file(s).
There can be only one Primary data file and multiple secondary data files.
DCL stands for Data Control Language.
GRANT, DENY and REVOKE.
Fill Factor is a setting that is applicable to Indexes in SQL Server. The fill factor value determines how much data is written to an index page when it is created /rebuilt.
By default the fill factor value is set to 0.
Check out the article Q100139 from Microsoft knowledge base and of course, there's much more
information available in the net. It'll be a good idea to get a hold of any RDBMS fundamentals text
book, especially the one by C. J. Date. Most of the times, it will be okay if you can explain till third
normal form.
As the name indicates, denationalization is the reverse process of normalization. It's the controlled introduction of redundancy in to the database design. It helps improve the query performance as the number of joins could be reduced.

=> One-to-One relationship can be implemented as a single table and rarely as two tables with primary and foreign key relationships.
=> One-to-Many relationships are implemented by splitting the data into two tables with primary key and foreign key relationships.
=> Many-to-Many relationships are implemented using a junction table with the keys from both the tables forming the composite primary key of the junction table.
It will be a good idea to read up a database designing fundamentals text book.
Both primary key and unique enforce uniqueness of the column on which they are defined. But by default primary key creates a clustered index on the column, where are unique creates a non clustered index by default. Another major difference is that, primary key doesn't allow NULLs, but unique key allows one NULL only.
User defined datatypes let you extend the base SQL Server datatypes by providing a descriptive name, and format to the database. Take for example, in your database, there is a column called Flight_Num which appears in many tables. In all these tables it should be varchar(8). In this case you could create a user defined datatype called Flight_num_type of varchar(8) and use it across all your tables.
See sp_addtype, sp_droptype in books online.
Bit datatype is used to store boolean information like 1 or 0 (true or false). Untill SQL Server 6.5 bit datatype could hold either a 1 or 0 and there was no support for NULL. But from SQL Server 7.0 onwards, bit datatype can represent a third state, which is NULL.
A default is a value that will be used by a column, if no value is supplied to that column while inserting data. IDENTITY columns and timestamp columns can't have defaults bound to them. See CREATE DEFUALT in books online.
SQL Server architecture (top)
21.
What is a transaction and what are ACID properties?
A transaction is a logical unit of work in which, all the steps must be performed or none. ACID stands for Atomicity, Consistency, Isolation, Durability. These are the properties of a transaction.
22.
What type of Index will get created after executing the above statement?
Non-clustered index. Important thing to note: By default a clustered index gets created on the primary key, unless specified otherwise.
23.
What's the maximum size of a row?
8060 bytes. Don't be surprised with questions like 'what is the maximum number of columns per table'. Check out SQL Server books online for the page titled: "Maximum Capacity Specifications".
24.
What are constraints? Explain different types of constraints.
Constraints enable the RDBMS enforce the integrity of the database automatically, without needing you to create triggers, rule or defaults.
Types of constraints: NOT NULL, CHECK, UNIQUE, PRIMARY KEY, FOREIGN KEY
For an explanation of these constraints see books online for the pages titled: "Constraints" and "CREATE TABLE", "ALTER TABLE"
25.
What is RAID and what are different types of RAID configurations?
RAID stands for Redundant Array of Inexpensive Disks, used to provide fault tolerance to database servers. There are six RAID levels 0 through 5 offering different levels of performance, fault tolerance. MSDN has some information about RAID levels and for detailed information, check out the RAID advisory board's homepage
26.
What are the steps you will take to improve performance of a poor performing query?
This is a very open ended question and there could be a lot of reasons behind the poor performance of a query. But some general issues that you could talk about would be: No indexes, table scans, missing or out of date statistics, blocking, excess recompilations of stored procedures, procedures and triggers without SET NOCOUNT ON, poorly written query with unnecessarily complicated joins, too much normalization, excess usage of cursors and temporary tables.
Some of the tools/ways that help you troubleshooting performance problems are: SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET STATISTICS IO ON, SQL Server Profiler, Windows NT /2000 Performance monitor, Graphical execution plan in Query Analyzer.
Again this is another open ended question. Here are some things you could talk about: Preferring NT authentication, using server, databse and application roles to control access to the data, securing the physical database files using NTFS permissions, using an unguessable SA password, restricting physical access to the SQL Server, renaming the Administrator account on the SQL Server computer, disabling the Guest account, enabling auditing, using multiprotocol encryption, setting up SSL, setting up firewalls, isolating SQL Server from the web server etc.
Read the white paper on SQL Server security from Microsoft website. Also check out My SQL Server security best practices
28.
What is a deadlock and what is a live lock? How will you go about resolving deadlocks?
Deadlock is a situation when two processes, each having a lock on one piece of data, attempt to acquire a lock on the other's piece. Each process would wait indefinitely for the other to release the lock, unless one of the user processes is terminated. SQL Server detects deadlocks and terminates one user's process.
A live lock is one, where a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A livelock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely.
29.
What is blocking and how would you troubleshoot it?
Blocking happens when one connection from an application holds a lock and a second connection requires a conflicting lock type. This forces the second connection to wait, blocked on the first.
30.
How to restart SQL Server in single user mode? How to start SQL Server in minimal configuration mode?
SQL Server can be started from command line, using the SQLSERVR.EXE. This EXE has some very important parameters with which a DBA should be familiar with. -m is used for starting SQL Server in single user mode and -f is used to start the SQL Server in minimal confuguration mode.
31.
What are statistics, under what circumstances they go out of date, how do you update them?
Statistics determine the selectivity of the indexes. If an indexed column has unique values then the selectivity of that index is more, as opposed to an index with non-unique values. Query optimizer uses these indexes in determining whether to choose an index or not while executing a query.
Some situations under which you should update statistics:
1) If there is significant change in the key values in the index
2) If a large amount of data in an indexed column has been added, changed, or removed (that is, if the distribution of key values has changed), or the table has been truncated using the TRUNCATE TABLE statement and then repopulated
3) Database is upgraded from a previous version.
There are lots of options available, you have to choose your option depending upon your requirements. Some of the options you have are: BACKUP/RESTORE, dettaching and attaching databases, replication, DTS, BCP, logshipping, INSERT...SELECT, SELECT...INTO, creating INSERT scripts to generate data.
33.
Explain different types of BACKUPs avaialabe in SQL Server? Given a particular scenario, how would you go about choosing a backup plan?
Types of backups you can create in SQL Sever 7.0+ are Full database backup, differential database backup, transaction log backup, filegroup backup. Check out the BACKUP and RESTORE commands in SQL Server books online. Be prepared to write the commands in your interview. Books online also has information on detailed backup/restore architecture and when one should go for a particular kind of backup.
34.
How to determine the service pack currently installed on SQL Server?
The global variable @@Version stores the build number of the sqlservr.exe, which is used to determine the service pack installed. To know more about this process visit SQL Server service packs and versions.
Database programming (top)
35.
What are cursors? Explain different types of cursors. What are the disadvantages of cursors? How can you avoid cursors?
Cursors allow row-by-row prcessing of the resultsets.
Types of cursors: Static, Dynamic, Forward-only, Keyset-driven. See books online for more information.
Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network roundtrip, where as a normal SELECT query makes only one rowundtrip, however large the result set is. Cursors are also costly because they require more resources and temporary storage (results in more IO operations). Furthere, there are restrictions on the SELECT statements that can be used with some types of cursors.
Most of the times, set based operations can be used instead of cursors.
Here is an example:
If you have to give a flat hike to your employees using the following criteria:
Salary between 30000 and 40000 -- 5000 hike
Salary between 40000 and 55000 -- 7000 hike
Salary between 55000 and 65000 -- 9000 hike
In this situation many developers tend to use a cursor, determine each employee's salary and update his salary according to the above formula. But the same can be achieved by multiple update statements or can be combined in a single UPDATE statement as shown below:
UPDATE tbl_emp SET salary = CASE WHEN salary BETWEEN 30000 AND 40000 THEN salary + 5000 WHEN salary BETWEEN 40000 AND 55000 THEN salary + 7000 WHEN salary BETWEEN 55000 AND 65000 THEN salary + 10000
END
Another situation in which developers tend to use cursors: You need to call a stored procedure when a column in a particular row meets certain condition. You don't have to use cursors for this.
This can be achieved using WHILE loop, as long as there is a unique key to identify each row. For examples of using WHILE loop for row by row processing, check out the 'My code library' section of my site or search for WHILE.
36.
Write down the general syntax for a SELECT statements covering all the options.
Here's the basic syntax: 
SELECT select_list
[INTO new_table_]
FROM table_source
[WHERE search_condition]
[GROUP BY group_by_expression]
[HAVING search_condition]
[ORDER BY order_expression [ASC | DESC] ]
Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table.
Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.
38.
What is an extended stored procedure? Can you instantiate a COM object by using T-SQL?
An extended stored procedure is a function within a DLL (written in a programming language like C, C++ using Open Data Services (ODS) API) that can be called from T-SQL, just the way we call normal stored procedures using the EXEC statement. See books online to learn how to create extended stored procedures and how to add them to SQL Server.
Yes, you can instantiate a COM (written in languages like VB, VC++) object from T-SQL by using sp_OACreate stored procedure. Also see books online for sp_OAMethod, sp_OAGetProperty,
sp_OASetProperty, sp_OADestroy.
39.
What is the system function to get the current user's user id?
USER_ID(). Also check out other system functions like USER_NAME(), SYSTEM_USER, SESSION_USER, CURRENT_USER, USER, SUSER_SID(), HOST_NAME().
40.
What are triggers? How many triggers you can have on a table? How to invoke a trigger on demand?
Triggers are special kind of stored procedures that get executed automatically when an INSERT, UPDATE or DELETE operation takes place on a table.
In SQL Server 6.5 you could define only 3 triggers per table, one for INSERT, one for UPDATE and one for DELETE. From SQL Server 7.0 onwards, this restriction is gone, and you could create multiple triggers per each action. But in 7.0 there's no way to control the order in which the triggers fire. In SQL Server 2000 you could specify which trigger fires first or fires last using sp_settriggerorder
Triggers can't be invoked on demand. They get triggered only when an associated action (INSERT, UPDATE, DELETE) happens on the table on which they are defined.
Triggers are generally used to implement business rules, auditing. Triggers can also be used to extend the referential integrity checks, but wherever possible, use constraints for this purpose, instead of triggers, as constraints are much faster.
Till SQL Server 7.0, triggers fire only after the data modification operation happens. So in a way, they are called post triggers. But in SQL Server 2000 you could create pre triggers also.

Instantiating COM objects is a time consuming process and since you are doing it from within a trigger, it slows down the data insertion process. Same is the case with sending emails from triggers. This scenario can be better implemented by logging all the necessary data into a separate table, and have a job which periodically checks this table and does the needful.
Self join is just like any other join, except that two instances of the same table will be joined in the query.
Here is an example:
Employees table which contains rows for normal employees as well as managers. So, to find out the managers of all the employees, you need a self join.
CREATE TABLE emp
(
empid int,
mgrid int,
empname char(10)
)
INSERT emp SELECT 1,2,'Vyas'
INSERT emp SELECT 2,3,'Mohan'
INSERT emp SELECT 3,NULL,'Shobha'
INSERT emp SELECT 4,2,'Shridhar'
INSERT emp SELECT 5,2,'Sourabh'
SELECT t1.empname [Employee], t2.empname [Manager]
FROM emp t1, emp t2
WHERE t1.mgrid = t2.empid
Here's an advanced query using a LEFT OUTER JOIN that even returns the employees without managers (super bosses)
SELECT t1.empname [Employee], COALESCE(t2.empname, 'No manager') [Manager]
FROM emp t1
LEFT OUTER JOIN
emp t2
ON
t1.mgrid = t2.empid
43.
What are the steps to improve the performance of a query?
=> Number of joins and use of complex views/cursors have to be reduced.
=> The use of the stored procedures and indexes have to be maximized.
=> The optimized use of the complex conditional checks and computer columns have to be in place.
=> Tracking of performance analysis for the query helps us in identifying the right aspects to optimize.
44.
How would you use the SP_ functions to identify the blocking problems?
Blocking is the deadlock situation when two SQL connections race to obtain the control over the same set of rows in conflicting terms. This can be tracked by the status of WAIT present in the SP_LOCK procedure’s output. All the active LOCKS and the different rows that are being involved are shown in this output. The identification of the connections involved in the specific row contention lock can be identified with sp_who and sp_who2 procedures. This way the causal agents of the blocking is identifies. KILL command issued against the specific SQL connection causing the BLOCK can resolve the issue. But the permanent solution lies in the proper design of the application code to execute in concurrence across different connections.
The SQL server offers 4 types of backups to suit the need of the administrator.
? Complete backup- The complete back up is just zipping the content of the entire database in terms of the different tables and procedures etc. This back up can server as an independent entity that can be restored in different systems with just the base SQL server installed.
? Transaction log backup: This is the mechanism of backing up the transaction logs that have been maintained in the server. This way the details of the database getting updated is obtained. This cannot be a stand-alone back up mechanism. But can save a lot of time if we already have the file system related to the DB backed up on the new deployment server.
? Differential backup: This is a subset of the complete backup, where only the modified datasets are backed up. This can save the time when we are just trying to maintain a backup server to main server.
? File backup: This is the quickest way to take the backup of entire database. Instead of taking in the data actually stored in DB, the files are backed up and the file system thus obtained when combined with the transaction logs of the original system will render the database that we are trying to back up.
46.
What are the different levels of isolation?
The isolation represents the way of separating the database from the effects of network accesses, thereby maintaining the consistency.
The different levels of isolation are:
? read committed: This level of isolation uses the shared locks and the reads to the database give the constant and consistent values.
? read uncommitted: No locks implemented. This is the least effective isolation level.
? repeatable read: There are lock over the rows and values but the updates are maintained as a separate phantom row which is the next set of values for the specific record. Values can change within a specific transaction of a SQL function.
? SERIALIZABLE reads: This is the implementation of pure lock mechanism where one specific transaction is not allowed access to specific record before another one completes.
The SQL Server.exe is the executable which can be called in the command prompt with the parameters -m and -f. These are the options that will start the SQL server in the user mode and minimal configuration mode respectively.
Statistics represent the uniqueness for the indexes that are being used for selecting the records. This can make the query execution pretty efficient. The tables that we are dealing with if truncated and repopulated, there is a good chance that the indexes and statistics are out of sync and this is when we have to update the statistics. There are also other situations like when the table has been modified and lot of rows have been added recently or like when a server has been updated with different version of software. These also give us the reason to use the UPDATE_STATISTICS, DBCC SHOW_STATISTICS etc to update it accordingly.
Replication refers to the moving or copying of the database elements from one system to another. This can be done in the SQL Server in one of the following methods:
? Transactional.
? Snapshop.
? Merge replication.
50.
Can we initiate a external COM object from within SQL?
Yes we can use the stored procedure sp_OACreate to initiate the external COM object from the T-SQL
51.
What is a schema? How is it useful in SQL Serers?
The Schema refers to the overall structure of the database with all related information like users, access privileges, interaction information between the different tables etc. The CREATE SCHEMA is the command that can be used to create the schema in the SQL Server. This when done can be used to re deploy the same database in another system for demonstrative or test purposes. This holds intact the underlying framework over which the database has been built.
52.
What is a write-ahead log?
The write-ahead log is the logging system that just updates the buffer cache of the database for the transactions and updates the logs and only then the actual changes are incorporated in the actual database. This is the reason why it is called “write ahead”. This helps in maintaining the consistency in the database. This can also be useful in getting the actual database values even in case of failures
53.
What is the use of check points in the transaction logs?
The check points are restoration points that indicate the specific state of the database. When there is some failure int he database that is occurring before the next check point, the database can be reverted back to the previous check point and thus the database would still be consistent
54.
What is a column with identity?
The column with a defined identity in turn means that there is an unique value that the system assigns to the specific column. This is similar to the AUTO Number property of the Access backend.
? The Logical component consists of the overall database structure definitions, users, groups, access permissions, collations etc.
? The Physical component is the one that contains the physically represented form of the database i.e the actual files.
There are three types of files:
1. Primary data files: These contain the links to the other files. This file is the representation of the higher level of the database. (.mdf)
2. Secondary data files:
These contain the files that contain the values and actual data as in database.
3. Log files: These represent the different types of the logging output of the SQL Server, including the transaction logs.
56.
How does one create a new database?
One can create and modify Oracle databases using the Oracle "dbca" (Database Configuration Assistant) utility. The dbca utility is located in the $ORACLE_HOME/bin directory. The Oracle Universal Installer (oui) normally starts it after installing the database server software.
One can also create databases manually using scripts. This option, however, is falling out of fashion, as it is quite involved and error prone. Look at this example for creating and Oracle 9i database:
CONNECT SYS AS SYSDBA
ALTER SYSTEM SET DB_CREATE_FILE_DEST='/u01/oradata/';
ALTER SYSTEM SET DB_CREATE_ONLINE_LOG_DEST_1='/u02/oradata/';
ALTER SYSTEM SET DB_CREATE_ONLINE_LOG_DEST_2='/u03/oradata/';
CREATE DATABASE;
57.
What database block size should I use?
Oracle recommends that your database block size match, or be multiples of your operating system block size. One can use smaller block sizes, but the performance cost is significant. Your choice should depend on the type of application you are running. If you have many small transactions as with OLTP, use a smaller block size. With fewer but larger transactions, as with a DSS application, use a larger block size. If you are using a volume manager, consider your "operating system block size" to be 8K. This is because volume manager products use 8K blocks (and this is not configurable).
58.
Where can one find the high water mark for a table?
There is no single system table, which contains the high water mark (HWM) for a table. A table's HWM can be calculated using the results from the following SQL statements:
SELECT BLOCKS
FROM DBA_SEGMENTS
WHERE OWNER=UPPER(owner) AND SEGMENT_NAME = UPPER(table);
ANALYZE TABLE owner.table ESTIMATE STATISTICS;
SELECT EMPTY_BLOCKS
FROM DBA_TABLES
WHERE OWNER=UPPER(owner) AND SEGMENT_NAME = UPPER(table);
Thus, the tables' HWM = (query result 1) - (query result 2) - 1
NOTE: You can also use the DBMS_SPACE package and calculate the HWM = TOTAL_BLOCKS - UNUSED_BLOCKS - 1.
59. 
Where are my TEMPFILES, I don't see them in V$DATAFILE or DBA_DATA_FILE?
Tempfiles, unlike normal datafiles, are not listed in v$datafile or dba_data_files. Instead query
v$tempfile or dba_temp_files:
SELECT * FROM v$tempfile;
SELECT * FROM dba_temp_files;
Unlike normal tablespaces, true temporary tablespace information is not listed in DBA_FREE_SPACE. Instead use the V$TEMP_SPACE_HEADER view:
SELECT tablespace_name, SUM (bytes used), SUM (bytes free)
FROM V$temp_space_header
GROUP BY tablespace_name;
61.
What are the different windows events activated at runtimes?
When_window_activated
When_window_closed
When_window_deactivated
When_window_resized
Within this triggers, you can examine the built in system variable system. event_window to determine the name of the window for which the trigger fired.
One can backup archived log files using RMAN or any operating system backup utility.
Remember to delete files after backing them up to prevent the archive log directory from filling up.
If the archive log directory becomes full, your database will hang! Look at this simple RMAN
backup script:
RMAN> run {
2> allocate channel dev1 type disk;
3> backup
4> format '/app/oracle/arch_backup/log_t%t_s%s_p%p'
5> (archivelog all delete input);
6> release channel dev1;
7> }
63.
How does one connect to an administrative user?
If an administrative user belongs to the "dba" group on Unix, or the "ORA_DBA" (ORA_sid_DBA)
group on NT, he/she can connect like this:
connect / as sysdba
No password is required. This is equivalent to the desupported "connect internal" method.
A password is required for "non-secure" administrative access. These passwords are stored in password files. Remote connections via Net8 are classified as non-secure.
Look at this example:
connect sys/password as sysdb
64.
How do you reference a Parameter?
In Pl/Sql, You can reference and set the values of form parameters using bind variables syntax.
Ex. PARAMETER name = '' or :block.item = PARAMETER Parameter name
65.
What are the built-ins used for processing rows?
Get_group_row_count(function)
Get_group_selection_count(function)
Get_group_selection(function)
Reset_group_selection(procedure)
Set_group_selection(procedure)
Unset_group_selection(procedure)
66.
What are the built-in used for getting cell values?
Get_group_char_cell(function)
Get_groupcell(function)
Get_group_number_cell(function)
67.
How will you monitor rollback segment status ?
Querying the DBA_ROLLBACK_SEGS view
IN USE - Rollback Segment is on-line.
AVAILABLE - Rollback Segment available but not on-line.
OFF-LINE - Rollback Segment off-line
INVALID - Rollback Segment Dropped.
NEEDS RECOVERY - Contains data but need recovery or corupted.
PARTLY AVAILABLE - Contains data from an unresolved transaction involving a distributed database.

PRE-QUERY fires once.
POST-QUERY fires 10 times.