A requirement we get quite often these days, and one that CMSImport doesn't support out of the box, is updating existing content that was not previously imported by CMSImport. The short answer is no: CMSImport can only update content that it imported itself. This is because CMSImport can use any primary key and primary key value, and during import it creates a relation between the imported id and the id of the imported document. The next time the imported id is found, CMSImport knows the Umbraco document id and can update the document.
When you take a look at the CMSImportRelation table you can see how CMSImport stores the relation. The UmbracoID column holds the node id in Umbraco and the DataSourceKey column holds the related id from the data source.
The DataSourceKey is based on three items to ensure it's unique:
It is of course possible to create relations manually in the database, but when an editor creates a new document no relation exists for it, so it will still be imported as a new document on the next import. Luckily we can hook into events in Umbraco, so after a document is published we can check whether a relation exists and create one if it doesn't. Below is an example of an Excel file we are going to import. The UmbracoID column holds the node id in Umbraco. The first two records exist in Umbraco, so their titles must be updated, and the other record is new.
If you take a look at the code below, which doesn't do much, you see that it takes the current Umbraco id and combines it with the configured CMSImport.UmbracoRelationPrefix value from web.config. In my case that is:
<add key="CMSImport.UmbracoRelationPrefix" value="Excel fileUmbracoID" />
During import the key for the first record will be Excel fileUmbracoID1408 in this case, which is the same key CMSImport would use during import. The 1408 is also stored as the UmbracoID, and a relation is made.
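To make the key composition concrete, here is a minimal sketch in Python. It is only an illustration and not CMSImport's own code (CMSImport is a .NET package): the function name `build_data_source_key` is hypothetical, and the prefix is simply the value from my web.config above.

```python
# Illustrative sketch only, not CMSImport's implementation.
# build_data_source_key is a hypothetical helper name.

# Value of the CMSImport.UmbracoRelationPrefix key in web.config.
RELATION_PREFIX = "Excel fileUmbracoID"


def build_data_source_key(prefix: str, umbraco_id: int) -> str:
    """Concatenate the configured prefix and the Umbraco node id."""
    return f"{prefix}{umbraco_id}"


key = build_data_source_key(RELATION_PREFIX, 1408)
print(key)
```

The point is that the key is nothing more than the prefix followed by the node id, so a relation created after a publish matches the key built during import.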
So now whenever you publish a document, a relation is stored in the CMSImportRelation table the same way CMSImport would normally store it. When you run the import again, only the non-existing document gets inserted and the other two records update existing nodes.
Please note this is a workaround; the issue will be addressed in CMSImport V4.
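The update-or-insert behaviour can be simulated with a small Python sketch. This is an assumption-laden illustration rather than how CMSImport is implemented: a dictionary stands in for the CMSImportRelation table, the function names `on_publish` and `run_import` are made up, and every id except 1408 is invented for the example.

```python
# Illustrative simulation only: a dict stands in for the CMSImportRelation table.
# Function names and every id except 1408 are made up for this example.

PREFIX = "Excel fileUmbracoID"

relations = {}  # DataSourceKey -> UmbracoID


def on_publish(node_id):
    """Store a relation for a published node unless one already exists."""
    relations.setdefault(f"{PREFIX}{node_id}", node_id)


def run_import(rows, next_node_id):
    """Return the action per row: update when a relation exists, else insert."""
    actions = []
    for row in rows:
        key = f"{PREFIX}{row['UmbracoID']}"
        if key in relations:
            # Known key: the import updates the related Umbraco node.
            actions.append(("update", relations[key]))
        else:
            # Unknown key: a new node is created and a relation is stored.
            relations[key] = next_node_id
            actions.append(("insert", next_node_id))
            next_node_id += 1
    return actions


# Two documents that an editor created and published in Umbraco.
on_publish(1408)
on_publish(1409)

rows = [
    {"UmbracoID": 1408, "Title": "Updated title 1"},
    {"UmbracoID": 1409, "Title": "Updated title 2"},
    {"UmbracoID": 5000, "Title": "New document"},
]
actions = run_import(rows, next_node_id=1500)
print(actions)
```

The first two rows find their relation and become updates, while the third has no relation yet and becomes an insert.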