I have a web application that syncs Outlook contacts to a database (and back) via CDO. The DB contains every contact only once (at least theoretically, of course doublets happen), providing a single point of change for a contact, regardless of how many users have that particular contact in Outlook (like Interaction or similar products).
The sync process is not automatic, but user-initialized. An arbitrary timespan can pass before users decide to sync their contacts. A subset of these contacts may have been updated by other users in the meantime.
Generally, this runs fine, but I have never been able to solve this fundamental problem:
How do I doubtlessly identify a contact object in a mailbox?
- I can't rely on
PR_ENTRYID
, this property changes on contact move or mailbox move. - I can't rely on my own IDs (e.g. DB table ID), because these get copied with the contact.
- I absolutely can't rely on fields like name or e-mail address, they are subject to changes and updates.
Currently I use a combination of 1 (preferred) and 2 (fall-back). But inevitably, sometimes users run into the problem of synching to the wrong contact because there is none with a given PR_ENTRYID
, but two with the same DB ID, of which the wrong one is chosen.
There are a bunch of Outlook-synching products out there, so I guess the problem must be solvable.
-
I had a similar problem to overcome with an internal outlook plugin that does contact syncing. I ended up sticking a database id in the Outlook object and referring to that when doing syncs.
The difference here is that our system has a bunch of duplicates that get resolved later by the users. When they get merged I'll remove the old records and update outlook with all of the new information along with a new id.
You could do fuzzy matching to identify duplicates, but duplicate resolution is a funny problem that's mostly trial and error. We've been successful at implementing "fuzzy" matching logic using the levenshtein distance algorithm for names and addresses cleaned down to a hash code.
Good luck, my syncing experiences have been somewhat painful.
Tomalak : Oh man, the age of the question alone tells me that this is not a trivial problem. I stick the Database ID into the Contact as well, but that does not help when the contact is reused (same position, different guy), or when it gets copied to make a template for a new contact in the same firm.Tomalak : Up voted anyway. At least I have your sympathy. :-D
0 comments:
Post a Comment