Monday, March 28, 2011

Best Practice for PK in SQL Server

I have been wondering what the best practices or ramifications are in setting up the PK in a M2M table in SQL Server. For instance:

I have 2 tables

  • Users
  • Roles

I am making a new table

  • UserRole

Which has 2 fields RoleId & UserID

now should I

  1. create a UserRoleID as the PK and make UserID and RoleID the FKs
  2. make the PK UserID AND RoleID and set them as FKs
  3. something else

I would like to know the performance issues with each of the options and what the recommended best practices are.

From stackoverflow
  • It depends whether you expect to hang any other meaning to the fact that a specific user has a specific role. If not then just create a clustered PK to span across the two fields.

    Add FKs for both and add an index to the second field. Give some consideration to which order the fields should appear in. Are you more likely to be retrieving a set of roles a user belongs to or the set of users in a specific role?

  • Standard procedure for these cases is to have two indexes. The unique PK is the two fields composite, with the field with the greater cardinality first, i.e. UserID; and the second index with just the secondary field (i.e. RoleID).

    Then cluster on whichever is likely to be involved in more multirecord result sets (i.e. if querying for multiple roles per user, or multiple users per role).

  • It depends on how you are using them. Most of the time I make the primary key as a UserId and a RoleId to make sure they are unique. Meaning the same user cannot have the same role.

    Now this is where the "depends" comes into play. If you are going to link the UserRole table to another table, that is where I create the UserRoleId primary key. And make the UserId and RoleId into a unique constraint.

    The reason for this is not to have on the table that references the UserRole have both a UserId and a RoleId when it is not needed because you are linking to the UserRoleId and not the User table and the role table respectively.

    Tom H. : So you would rather join across two tables every time you want to get at any useful information rather than add an extra 4 bytes to another table row?
  • Declare the PK as (UserID, RoleID). (Note: the order is important)

    Declare UserID as an FK with the reference to the Users table. Declare RoleID as an FK with the reference to the Roles table.

    With any luck, your DBMS will give you a composite index on (UserID, RoleID) in that order.

    With any luck this will speed up joins between users and roles. A good DBMS will give you a merge join for a join with no restrictions other than the join condition. A three way join should run pretty fast as well, assuming the number of roles is small.

    When you join UserRoles and Roles, without joining in Users, you may find it's disappointingly slow. How often do you do that, and how important is speed in this case? If it is important, you can create an index on just RoleID.

  • Avoid the composite PK and put a unique index on the two FKs (Seems appropriate in this instance.). Not an issue in this case, but be consistent. Having to remember to address the multiple fields to join on when writing queries is a pain. If your composite key has to be made up of datetime, char or other types of fields, performance takes a hit.

0 comments:

Post a Comment