
At Egnyte, one of our primary objectives is to provide a unified user experience in the file synchronization space. That, by definition, includes supporting Egnyte Storage Sync and other Egnyte clients on a wide variety of operating systems and file systems, ranging from Windows and Mac to VMs and NAS devices running different flavors of Unix.

Providing a seamless synchronization experience requires supporting the following types of file systems:

  1. Case Sensitive: Examples include the file systems used by Linux. In this type of file system, two names that differ only by case refer to two different files, which are stored separately.
  2. Case Insensitive, Case Preserving: Examples include Windows. A file can be referenced using any case and the case it was created with is preserved, but two files whose names differ only by case cannot co-exist.
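
To make the difference concrete, here is a minimal Python sketch that finds names a case-sensitive file system keeps apart but a case-insensitive one could not store side by side. The function name `case_collisions` and the sample directory listing are hypothetical, and comparing by simple lowercasing is an assumption made for illustration, not a description of Egnyte's code.

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that differ only by case (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        # Names that lowercase to the same string land in the same group.
        groups[name.lower()].append(name)
    # Only groups with more than one member are collisions.
    return [group for group in groups.values() if len(group) > 1]


# A listing that is legal on a case-sensitive file system.
linux_dir = ["Report.txt", "report.TXT", "notes.md"]
print(case_collisions(linux_dir))  # [['Report.txt', 'report.TXT']]
```

Both names in the reported group can live in the same Linux directory, but only one of them could exist in a Windows folder.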

In addition to that, Egnyte also supports different Unicode normalization forms for file paths.

Synchronizing such diverse file systems involves reconciling a user’s view of the data set with that stored on the Egnyte Cloud File Server (CFS), which itself is a case insensitive and case preserving system.

To achieve that objective, we first had to come up with a definition of canonical equivalence that would be adopted by Egnyte CFS and by clients running in different environments. We decided to use the lowercase, NFC form of a path as the basis for canonical equivalence. NFC is the preferred form because of its compatibility with strings converted from legacy encodings. It is also slightly more compact, is generally fast to compute, and has the additional advantage that most online content is already in this form. Using this canonical form, the different Egnyte subsystems decide whether two path entries have the same canonical representation and then determine how the objects at those paths should be synchronized with each other.
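
As a rough illustration of the rule, the Python sketch below computes a lowercase, NFC form of a path with the standard `unicodedata` module and uses it to compare two paths. The function names `canonical_path` and `canonically_equivalent` and the sample paths are hypothetical, and the use of `str.lower()` for lowercasing is an assumption; the actual Egnyte implementation may differ in its details.

```python
import unicodedata


def canonical_path(path: str) -> str:
    """Lowercase the path, then normalize it to NFC (illustrative only)."""
    return unicodedata.normalize("NFC", path.lower())


def canonically_equivalent(a: str, b: str) -> bool:
    """Two paths are equivalent if their canonical forms are identical."""
    return canonical_path(a) == canonical_path(b)


# Same folder name written with a precomposed "é" (NFC) ...
composed = "/Shared/Caf\u00e9/Menu.txt"
# ... and with "e" followed by a combining acute accent (NFD), in another case.
decomposed = "/shared/cafe\u0301/MENU.TXT"

assert composed != decomposed
assert canonically_equivalent(composed, decomposed)
```

The two sample strings differ both in case and in normalization form, yet they map to the same canonical path and would therefore be treated as the same entry.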

Let's take one of these subsystems, Egnyte CFS, to see how it brings synchronization on the cloud side to a steady state:

1. It stores two versions of the file path for every file system entry created by clients:

    • Actual path supplied by client used to create the file system entry
    • Canonical path computed and stored for each file system object for fast lookup

2. For future requests coming from any client, CFS computes the canonical path from the path in the request. It then searches the file system for an existing entry with the same canonical path and, if one is found, stores the incoming object against the real path recorded when that entry was first created.
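
The two steps above can be sketched as a small in-memory index. This is a simplified model under stated assumptions, not the CFS implementation: the class `PathIndex`, its methods, and the sample paths are hypothetical, and a plain dictionary stands in for whatever lookup structure CFS actually uses.

```python
import unicodedata


def canonical_path(path: str) -> str:
    """Lowercase, NFC form of a path (illustrative only)."""
    return unicodedata.normalize("NFC", path.lower())


class PathIndex:
    """Hypothetical in-memory stand-in for the CFS path lookup."""

    def __init__(self):
        # canonical path -> [real path from the first request, latest data]
        self._entries = {}

    def store(self, requested_path: str, data: bytes) -> str:
        """Store data and return the real path it was stored against."""
        key = canonical_path(requested_path)
        entry = self._entries.get(key)
        if entry is None:
            # First request for this entry: remember the path as supplied.
            self._entries[key] = [requested_path, data]
            return requested_path
        # Canonically equivalent entry exists: keep its original real path.
        entry[1] = data
        return entry[0]

    def read(self, requested_path: str):
        """Return (real path, data) for any canonically equivalent path."""
        real_path, data = self._entries[canonical_path(requested_path)]
        return real_path, data


index = PathIndex()
index.store("/Shared/Engineering/Design.doc", b"v1")
real = index.store("/shared/ENGINEERING/design.doc", b"v2")
assert real == "/Shared/Engineering/Design.doc"
assert index.read("/SHARED/engineering/DESIGN.DOC") == (real, b"v2")
```

In this sketch the second request, which differs from the first only by case, updates the existing entry instead of creating a duplicate, and the case supplied by the first client is preserved.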


Egnyte CFS uses a service-oriented architecture within its infrastructure to power features such as tagging, searching, sharing, and permission management. These components communicate with each other using the real path supplied by the client, but internally they all apply the same canonical equivalence rule. Egnyte's clients running on different file systems use the same approach when synchronizing with each other and with the Egnyte Cloud File Server, so that all of them reach a steady state.