In episode 23 of Build and Analyze, a 5by5 podcast with Dan Benjamin and Marco Arment, Marco answers a question about syncing and hosting for a mobile app with a web/hosted back-end.

Around minute 23 Marco reads a question from Ryan and starts talking about syncing issues.

“The reason why most apps that could use sync don’t have it, isn’t because hosting is expensive, it’s because syncing is hard. Syncing is really hard, to do syncing properly…”

Marco knows what he’s talking about; his experience creating Instapaper adds great insight.

Syncing is the feature I struggled with while building Jottpad. Jottpad is a simple iPhone app with only 5 screens - Login/Register, Lists, Items, Settings, and Sharing. Without syncing this app would have taken around a month; with syncing, it took about 6 months. I was building the API on the web side & consuming the API from the iPhone app. This was also my first foray into iPhone development & the first time I’d built this level of syncing logic.

Now would be a good time to point out that while composing this post I discovered writing about why Syncing is Hard… is hard, so grab a beer & keep reading.

Why is Syncing so Hard?

  1. 3 Step process
  2. Smallest amount of data possible
  3. Multiple devices
  4. Delete is not as simple as you think
  5. Conflicts

3 Step Process
Any sync action requires at least 3 steps. I’ll illustrate with a simple Add List action from the iPhone to Jottpad.com.

  1. Add List “payload” sent from iPhone to Jottpad.com. Jottpad.com creates a List record to match what was added on the iPhone. It also creates a DeviceSync record to store the current status of the List for all Device/User combinations.
  2. Response from Jottpad.com to iPhone, including additional data from Jottpad.com such as the Id of the List record created on Jottpad.com. The iPhone updates its List with the ListId from Jottpad.com & marks a SyncDateTime timestamp on the List.
  3. Acknowledge the response (send a SyncAck) to Jottpad.com. This SyncAck closes the loop and ensures Jottpad.com is in sync with the iPhone. Jottpad.com marks a SyncDateTime timestamp on the DeviceSync record.

The SyncAck step might seem like overkill, but I found it necessary to be sure Jottpad.com & the iPhone were really in sync. It also sets up the next topic by keeping a snapshot of the current Sync state on both the iPhone & Jottpad.com.
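Here is a minimal Python sketch of that three-step handshake, with Jottpad.com and the iPhone reduced to in-memory objects. It is not Jottpad's actual code: the class names, method names, and payload fields are hypothetical, and only List, DeviceSync, SyncDateTime, and SyncAck come from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count
from typing import Optional


def now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class DeviceSync:
    """Server-side sync status of one List for one device/user combination."""
    list_id: int
    device_id: str
    sync_date_time: Optional[datetime] = None  # stays None until the SyncAck arrives


@dataclass
class Server:
    """Stand-in for Jottpad.com (hypothetical in-memory version)."""
    lists: dict = field(default_factory=dict)
    device_syncs: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def add_list(self, device_id: str, name: str) -> dict:
        # Step 1: create the List record plus a DeviceSync record for this device.
        list_id = next(self._ids)
        self.lists[list_id] = name
        self.device_syncs[(list_id, device_id)] = DeviceSync(list_id, device_id)
        # Step 2: respond with the server-generated Id.
        return {"list_id": list_id}

    def sync_ack(self, device_id: str, list_id: int) -> None:
        # Step 3: the device confirmed it stored the Id, so close the loop.
        self.device_syncs[(list_id, device_id)].sync_date_time = now()


@dataclass
class PhoneList:
    """A List as stored on the iPhone."""
    name: str
    server_id: Optional[int] = None
    sync_date_time: Optional[datetime] = None


def sync_new_list(server: Server, device_id: str, local: PhoneList) -> None:
    response = server.add_list(device_id, local.name)  # steps 1 and 2
    local.server_id = response["list_id"]
    local.sync_date_time = now()
    server.sync_ack(device_id, local.server_id)        # step 3


server = Server()
groceries = PhoneList("Groceries")
sync_new_list(server, "my-iphone", groceries)
```

If the response in step 2 never reaches the phone, the SyncAck is never sent and the DeviceSync record keeps an empty SyncDateTime, so the server side can tell that this device still needs the data.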

Smallest Amount of Data
Speed is a feature. Sync the smallest amount of data & only when needed. This is common sense, but in practice requires a lot more work than syncing everything, every time. Only sending the smallest subset of data on each sync requires tracking the status of every data element that could potentially need to be synced. In the case of Jottpad this means each List & Item has its own Sync status per device/user combination, stored in the DeviceSync record.

My wife walks into the grocery store and opens Jottpad on her iPhone. I’ll call the initial syncing that happens when you launch Jottpad “Munging” - basically figuring out everything that has changed since the last time you launched Jottpad on the iPhone. The idea is to send from the iPhone to Jottpad.com only the data which has not been synced, & to send from Jottpad.com to the iPhone only the data which has not been synced. This initial Munge, each time you launch Jottpad, is by far the most logic intensive (lines of code) & data intensive (number of data elements to compare) operation, so it needs to be fast, reliable, & most importantly just work.

  • In this case everything on the iPhone has already been synced to Jottpad.com, so the initial Munge is just asking for anything that might have changed on Jottpad.com since the last sync.
  • Jottpad.com gets the Munge request.
    • Check the payload from the iPhone for any changes - in this case nothing.
    • Determine if anything has changed on Jottpad.com that needs to be synced to this iPhone. Yes - we have a new List - this needs to be sent back down.
  • iPhone gets the response from Jottpad.com with info about the new List. Do the add and send the SyncAck back to Jottpad.com.
  • Jottpad.com receives the SyncAck & marks the new List as synced.

This is a simple Munging scenario but the underlying logic is the same regardless - only send the data elements which have been modified, added, or deleted since the last sync.
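The Munge scenario above can be sketched roughly like this in Python. This is a simplification under loud assumptions: the `Record` class, the `synced` flag, and the `munge` function are made up for illustration, it only handles Adds (Modify and Delete need more bookkeeping), and the response and SyncAck are collapsed into a single function call.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One List or Item plus a flag saying whether the other side has it yet."""
    key: str
    value: str
    synced: bool = False


def pending(records):
    """Return only the records that still need to be sent."""
    return [r for r in records if not r.synced]


def munge(phone_records, server_records_for_device):
    """One launch-time Munge: exchange only the unsynced records in each direction."""
    upload = pending(phone_records)                # iPhone -> Jottpad.com
    download = pending(server_records_for_device)  # Jottpad.com -> iPhone

    # Apply each side's changes on the other side, already marked as synced there.
    server_records_for_device.extend(Record(r.key, r.value, synced=True) for r in upload)
    phone_records.extend(Record(r.key, r.value, synced=True) for r in download)

    # Stands in for the response and the SyncAck: both sides now agree.
    for r in upload + download:
        r.synced = True
    return upload, download


phone = [Record("list-1", "Groceries", synced=True)]
server = [
    Record("list-1", "Groceries", synced=True),
    Record("list-2", "Things We Need"),  # added from another device
]
upload, download = munge(phone, server)
print(len(upload), [r.value for r in download])  # 0 ['Things We Need']
```

Running `munge` a second time sends nothing in either direction, which is the whole point: the payload is proportional to what changed, not to how much data exists.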

Multiple Devices
Syncing with one device is easy - you can hack in a couple extra columns on the List & Item records for LastSyncDate and call it a day. When you start syncing with multiple devices the complexity grows quickly. Along with that complexity comes a bunch of potential bugs & edge cases. Again, this seems like common sense, but it bit me in the ass when my first design for syncing fell apart.
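A small Python sketch of why the single-column hack breaks down with a second device. The dictionaries, device names, and timestamps are invented for illustration; the per-device table plays the role of the DeviceSync record described earlier.

```python
from datetime import datetime, timedelta

t0 = datetime(2011, 1, 1, 12, 0)  # arbitrary timestamp for illustration
minute = timedelta(minutes=1)

# Design 1: a single LastSyncDate column on the record itself.
item = {"name": "coffee grinder", "modified": t0, "last_sync": None}


def needs_sync_single(item):
    return item["last_sync"] is None or item["modified"] > item["last_sync"]


item["last_sync"] = t0 + minute         # my iPhone syncs and stamps the record
wife_sees_it = needs_sync_single(item)  # False: her iPhone is never sent the item

# Design 2: one sync timestamp per device, like a DeviceSync record.
item2 = {"name": "coffee grinder", "modified": t0}
device_sync = {"my-iphone": None, "wife-iphone": None}


def needs_sync_per_device(item, device_sync, device_id):
    last = device_sync[device_id]
    return last is None or item["modified"] > last


device_sync["my-iphone"] = t0 + minute  # only my device's entry is stamped
print(wife_sees_it, needs_sync_per_device(item2, device_sync, "wife-iphone"))  # False True
```

With one shared column, the first device to sync hides the change from every other device. With one entry per device, each device's view of "what have I already seen" is tracked separately, at the cost of a lot more rows to maintain.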

Delete
Delete, it seems so easy. Swipe right on the iPhone, tap Delete. I found Delete Sync logic to be the most convoluted of the three Sync actions - Add, Modify, Delete.

Syncing a Delete action is backwards. The problem is that once you Delete an Item, it’s gone. How can you sync something that is gone? What about all the other devices that have already synced the Item you just deleted?

Let’s look at another example to try and clarify…

I share a List with my wife - Things We Need. I go to the store, check the list & see we need a new coffee grinder. I buy the coffee grinder and instead of just checking it off the list, I delete it. Why? Well, I’m pretty sure we won’t be buying another coffee grinder anytime soon, so there is no reason to keep it on the list. My iPhone Syncs with Jottpad.com, and the coffee grinder is deleted: gone on my iPhone and gone on Jottpad.com.

Meanwhile my wife is also on her way home from work and stops at the store. She opens up Jottpad, sees we need a coffee grinder, buys the coffee grinder and also deletes it. Uh-oh, she gets an error trying to Sync - something something “object reference not set to an instance of an object”. She makes a mental note to tell me about Jottpad having issues, gets in the car and goes home.

Not only do we now have 2 coffee grinders but Jottpad has a serious issue - deleting from my iPhone is not getting synced to her iPhone - what happened & how do we fix this?

Well, I started by not actually deleting the Item - instead it goes into a MarkedForDelete status. Think of MarkedForDelete as putting something into the Trash on your computer. It’s still there but not really. Jottpad.com needs to be the master record - I can’t delete the coffee grinder Item from Jottpad.com until every device that has synced it has also synced the delete. At that point the MarkedForDelete Item can be safely deleted.
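A minimal Python sketch of the MarkedForDelete idea, assuming the server keeps a set of devices that still hold a copy of each Item. The class and method names, the return strings, and the "repeat delete is a no-op" behavior are my assumptions for illustration, not a description of Jottpad's real implementation.

```python
class ServerItem:
    """An Item on the server, with the set of devices that hold a copy of it."""

    def __init__(self, name, devices):
        self.name = name
        self.marked_for_delete = False
        self.devices_holding_copy = set(devices)


class Server:
    def __init__(self):
        self.items = {}

    def delete(self, item_id, device_id):
        """Handle a Delete from one device. Safe to call for an already-removed Item."""
        item = self.items.get(item_id)
        if item is None:
            return "already deleted"   # a repeat delete is a no-op, not an error
        item.marked_for_delete = True  # into the Trash, not gone yet
        return self._device_done(item_id, device_id)

    def ack_delete(self, item_id, device_id):
        """Another device confirms it removed its local copy."""
        if item_id not in self.items:
            return "already deleted"
        return self._device_done(item_id, device_id)

    def _device_done(self, item_id, device_id):
        item = self.items[item_id]
        item.devices_holding_copy.discard(device_id)
        if not item.devices_holding_copy:
            del self.items[item_id]    # every device has caught up: really delete
            return "deleted"
        return "marked for delete"


server = Server()
server.items[7] = ServerItem("coffee grinder", ["my-iphone", "wife-iphone"])
print(server.delete(7, "my-iphone"))    # marked for delete
print(server.delete(7, "wife-iphone"))  # deleted
print(server.delete(7, "wife-iphone"))  # already deleted
```

The record only disappears from the server once the last device holding a copy has checked in, and a delete for something that is already gone is answered quietly instead of blowing up.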

Conflicts
Conflicts happen when two people Modify the same item before the other person has synced the changes. In other words, syncing a modified item from the iPhone with a modified item on Jottpad.com.

The simple fix is to create a ModifiedDateTime timestamp on each record, compare ModifiedDateTime whenever you munge, and let the newest, most recently modified item win. Essentially this works, although it can be a little weird if you modify an item, sync, and the item comes back with a different value than what you just modified and synced. This is definitely an edge case, and in an ideal world I would have built some sort of merge dialog which would ask the user which value they want to use - but I copped out for Version 1 because at some point you have to ship the product.
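The "newest wins" rule boils down to a single comparison. A minimal Python sketch, where the `Item` class, the `resolve` function, and the sample values are hypothetical; the tie-break in favor of the Jottpad.com copy is also my assumption, since the rule above doesn't say what happens on an exact tie.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Item:
    name: str
    modified_date_time: datetime


def resolve(local: Item, remote: Item) -> Item:
    """Last write wins: keep whichever copy has the newer ModifiedDateTime."""
    # On an exact tie the remote (Jottpad.com) copy wins; that tie-break is an assumption.
    return local if local.modified_date_time > remote.modified_date_time else remote


mine = Item("coffee grinder (burr)", datetime(2011, 1, 1, 17, 5))
hers = Item("coffee grinder (blade)", datetime(2011, 1, 1, 17, 9))
print(resolve(mine, hers).name)  # coffee grinder (blade)
```

Note that this quietly assumes the clocks producing the two timestamps agree with each other. If one phone's clock is off, the "newest" edit may not be the one that was actually made last.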

Wrap it up
Syncing is hard, writing about it is hard, and if you’re still reading this you are either:

a. Writing syncing logic and looking for somebody else to share in your pain
b. My wife, graciously proofreading this Opus

Author: Richard Hochstetler