Detangling the verbiage of eDiscovery, hold, retention, and archiving

Not a week goes by where I don’t get asked a question about eDiscovery or retention of data in the Office 365 (or Microsoft 365) platform.  We’ll mainly focus on mailbox questions here, though a lot of the concepts apply to SharePoint content as well.

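The most common questions, comments, and misconceptions I encounter, in no particular order:

  • eDiscovery only works for mailboxes that have Office 365 E3 and E5 subscriptions
  • Litigation hold is (better than | worse than | the same as) Security & Compliance Retention Policies
  • Are Security & Compliance Retention Policies the same as Exchange Online Retention Policy Tags?
  • Why should I use SCC retention policies instead of MRM?
  • I don’t need retention because I have archiving.  Aren’t they the same?
  • I don’t need retention because I do Journaling.

Let’s dig into these fun topics!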

eDiscovery only works for mailboxes that have Office 365 E3 and E5 subscriptions

No, no, no.  Chances are, if someone has told or is asking you this, they have conflated the concepts of discovery (finding stuff) and retention (keeping and preserving stuff).  Office 365 (or Microsoft 365) eDiscovery works against virtually all content stored in the platform.

I think these two items frequently get confused because most people associate the legal discovery process with a custodian being required to preserve their data during litigation.  The vernacular is tricky, so it’s critical to understand what it is you (or your customer) are trying to achieve.  The differences and nuances can be significant, since inadvertently preserving or deleting data, as well as failing to protect it for a certain period of time, can have grave legal ramifications for your organization.

Zooming back out, the E5 and E3 (and corresponding G5 and G3 for government or A-series for Education) subscriptions have additional features surrounding retention and content preservation. However, everything in the platform is discoverable and searchable, regardless of the license applied (F1, E1, E3, E5, etc). If you discover content, you may not be able to put it on hold or protect it from being deleted (depending on your license), but you’ll certainly be able to export it at the point in time of discovery and keep it offline.
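If it helps to see the split written down, here’s a minimal Python sketch.  It’s a toy model, not a Microsoft API: `HOLD_CAPABLE` and `capabilities` are hypothetical names of mine, and the set only lists the subscriptions named above as having extra preservation features.  Check the current licensing documentation before relying on it.

```python
# Toy model only: HOLD_CAPABLE and capabilities() are made-up names,
# not a Microsoft API. Real entitlements live in the licensing docs.
HOLD_CAPABLE = {"E3", "E5", "G3", "G5"}  # assumption drawn from the paragraph above


def capabilities(license_name: str) -> dict:
    """Return which actions this toy model allows for a given license."""
    return {
        "search": True,  # discovery works regardless of license
        "export": True,  # so does exporting what you found
        "hold": license_name in HOLD_CAPABLE,  # preservation depends on license
    }


for name in ("F1", "E1", "E3", "E5"):
    print(name, capabilities(name))
```

Search and export come back true for every license; only the hold flag changes.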

Just to reiterate:

Discovery finds whatever is currently in the mailbox, including content located in Deleted Items and Recoverable Items.  The type of Exchange Online mailbox license has no impact on the finding and exporting of data in a mailbox.  eDiscovery is not the same as retention.

We good?  Cool, cool.

Litigation hold is (better than | worse than | the same as) Security & Compliance Retention Policies

First, let’s talk about the core ways that data can be preserved in a mailbox.

Litigation Hold (since Exchange 2010)

Litigation Hold is a Boolean value applied to the mailbox that preserves all content in the mailbox.  This per-mailbox attribute must be applied to every mailbox individually (either through clicking the Litigation Hold option in the Exchange Admin Center or via PowerShell). Litigation Hold preserves user-deleted content in the Recoverable Items hidden folder.  This feature is still alive and kicking.

In-Place Hold

This has also been around a long time (since the 2010 days).  It was also the primary hold feature available when Exchange Online became generally available (for you veterans, this was the Business Productivity Online Service or BPOS offering).  We made moves to deprecate this option, as noted here: In-Place Hold and Litigation Hold | Microsoft Docs.  As such, you can no longer create in-place holds in the Exchange Online admin center.

I put together a script to help migrate legacy In-Place holds to the new Security & Compliance Center platform.

Pertinent data:

Yes, like all good engineers, I put the instructions after the tool.  When all else fails, read the documentation.

Security & Compliance Retention Policy

This is the modern way of doing things.  Retention policies are part of your organization’s overall information governance strategy (the process of creating, classifying, storing, preserving, and destroying content).  Retention policies can be applied at an organizational level and can specify certain data locations to include or exclude.  Modern retention policies can cover more than just mailboxes–including Office 365 (modern) groups, Teams conversations, Teams data, as well as content stored in SharePoint Online and OneDrive for Business.

These modern retention policies are generally designed to supersede features like litigation hold and in-place hold.

Fun fact: Mailbox-level retention policies do not include Teams chat conversations.  While Teams 1:1 chat is stored in the participating users’ mailboxes, if you want to preserve all email and all IMs, you’ll need to create two policies (one for email, one for Teams chats).  They can’t be included in the same policy.

Security & Compliance Label with Retention

While the end result functions similarly to a modern retention policy (managing an item’s preservation through the mailbox’s Recoverable Items folder), labels with retention are applied at an item level, as opposed to a retention policy being applied at a container level.

Labels with retention can be applied manually (as part of the E3/G3 license capability) or automatically (as part of the E5/G5 license capability).  Since labels are applied in the user’s context, if they’re not marked as records, they can be removed by a user as well.  Modern retention policies happen at the mailbox level, outside of the control of the user.

Bonus: I put together a script to provide an example of how to programmatically work with Security & Compliance labels in Outlook.

eDiscovery Case Hold

eDiscovery case holds, like other content hold mechanisms in the Microsoft 365 platform, focus on preserving content in the user’s mailbox directly.  An eDiscovery case is a logical boundary that is used to group content searches related to a particular need (such as a FOIA request or legal proceeding).  eDiscovery cases can be used to just search, or they can be used to preserve content as well.

When content is placed on hold through an eDiscovery case, it’s essentially an open-ended preservation that supersedes any information governance policies for the included custodians.

Need an example?  I thought you might.

Let’s say you have configured an organization-wide retention policy that preserves mailbox content for 3 years and then purges it.  At some point, your organization is party to a legal proceeding involving the mailbox Finance1.  You create an eDiscovery case, add the Finance1 mailbox, and then place it on hold.  The covered content in Finance1’s mailbox cannot be purged until both the retention period expires AND the eDiscovery case hold is lifted.
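The Finance1 scenario boils down to a two-part condition, which can be sketched in a few lines of Python.  This is only an illustration of the logic, not how the service evaluates holds: `can_purge` and `THREE_YEARS` are hypothetical names, and three years is approximated as 3 × 365 days.

```python
from datetime import date, timedelta


def can_purge(created: date, today: date, retention_days: int, on_case_hold: bool) -> bool:
    """Content is purgeable only if retention has expired AND no case hold applies."""
    retention_expired = today >= created + timedelta(days=retention_days)
    return retention_expired and not on_case_hold


THREE_YEARS = 3 * 365  # rough approximation; ignores leap days
created = date(2020, 1, 1)

# Still inside the retention period, no hold.
print(can_purge(created, date(2021, 1, 1), THREE_YEARS, on_case_hold=False))
# Retention expired, but the eDiscovery case hold is still in place.
print(can_purge(created, date(2024, 1, 1), THREE_YEARS, on_case_hold=True))
# Retention expired and the hold has been lifted.
print(can_purge(created, date(2024, 1, 1), THREE_YEARS, on_case_hold=False))
```

Only the last call returns True, because both conditions have to be satisfied before anything can be purged.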

bUt WhAt AbOuT mRm PoLiCiEs?

I’m about to blow your mind.

Messaging Records Management (MRM) policies don’t actually protect data.  They only have two real options:

  • Move to archive
  • Delete

So, if you have a need to protect or preserve mailbox data, MRM ain’t gonna do it.  I’ll say it again for people still reeling from shock:

Despite MRM using terms like “Retention Policy Tags,” MRM policies DO NOT RETAIN DATA.

Back to the original question–which is better?  I’m gonna pull a Socrates* and answer your question with a question–which solution meets your business requirement?

Are Security & Compliance Retention Policies the same as Exchange Online Retention Policy Tags?

Using our (potentially newfound) knowledge above, it’s safe to say “no, they are not the same.”

Modern Security & Compliance retention policies manage the lifecycle of data through the Recoverable Items hidden folder in a user’s mailbox.  MRM policies just move and delete data–there’s nothing about an MRM policy that prevents data from being deleted prior to its expiration date.
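Here’s a toy Python model of that difference.  It is deliberately simplified and every name in it (`ToyMailbox`, `preserved`, `user_purge`, `discoverable`) is hypothetical; it only illustrates that preservation keeps a copy in Recoverable Items when a user purges an item, while a mailbox with nothing but MRM has no such safety net.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMailbox:
    """Hypothetical stand-in for a mailbox; not how Exchange stores data."""
    preserved: bool = False  # True if a hold or modern retention policy applies
    items: set = field(default_factory=set)
    recoverable_items: set = field(default_factory=set)

    def user_purge(self, item: str) -> None:
        """Simulate a user deleting an item and then purging it."""
        if item not in self.items:
            return
        self.items.remove(item)
        if self.preserved:
            # Preservation keeps a copy in the hidden Recoverable Items folder.
            self.recoverable_items.add(item)

    def discoverable(self) -> set:
        """Everything a search could still find in this toy mailbox."""
        return self.items | self.recoverable_items


mrm_only = ToyMailbox(items={"invoice"})
retained = ToyMailbox(preserved=True, items={"invoice"})
for mailbox in (mrm_only, retained):
    mailbox.user_purge("invoice")

print(mrm_only.discoverable())  # nothing left to find
print(retained.discoverable())  # the preserved copy is still there
```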

Why should I use SCC retention policies instead of MRM?

Sometimes this is phrased as “I don’t need SCC retention policies because I have MRM policies already.”  It’s really the same as the last two points I made, just phrased differently.  If your business rules require you to preserve content against accidental or intentional destruction, MRM does not meet that business requirement.  If you’re only concerned with offloading data from a primary mailbox, then talk to your doctor to see if MRM is right for you.

I don’t need retention because I have archiving.  Aren’t they the same?

Archiving has different meanings, depending on your context. We need to share a common vocabulary and understanding to help determine the similarities and differences between “archiving” solutions.

Old-fashioned archives

In the offline world, archiving generally means some sort of long-term, immutable storage (like a banker’s box of printed data stored in a vault or a set of backup tapes stored in a data maintenance facility).  Organizations may have a records manager or records management framework that dictates how people save, classify, store, and dispose of paper or other media.

Exchange archives

Archiving in the Exchange sense is a totally different creature.  Archiving is really “just another place to put data for content organization purposes.”  Exchange Personal Archives were introduced in Exchange 2010 primarily as a way to manage costly storage–the idea being that you could have fast (read: expensive) primary storage for the most recently created mailbox data while older data could be offloaded to slower, less costly storage.  This feels like it was, in technology terms, several lifetimes ago–15k RPM fibre-channel and SAS disks were thousands of dollars each and SATA disks were a dime a dozen.

From the Exchange architecture perspective, this was implemented as a secondary mailbox.  Archiving, in Exchange parlance, is a secondary mailbox that’s connected or related to the user’s primary mailbox.  MRM policies (mentioned above) can be used to move content from the user’s primary mailbox to their archive mailbox.  The archive mailbox has the same folder structure as the primary mailbox, with the intent being to help users locate data based on where it was last stored.  Because it’s essentially a regular mailbox, a user has the same rights and permissions over their archive mailbox (the special account SELF is granted Full Control) as they do their primary mailbox.

Exchange Personal Archives or Online Archives are not protected storage.

From a capability perspective, archive mailboxes enabled organizations to ingest legacy PST data or mail data from foreign systems and store it on the slower, less costly media.  One of the advantages that archive mailboxes had over PSTs was the centralized management of corporate data, reducing the risks of losing data if a personal PC failed and a user’s PST file was unrecoverable.  Another advantage was that because archive mailboxes were stored in the mail system, they were also available to Outlook Web Access users directly (as opposed to PSTs, which were not available when users connected to OWA).

Just a reminder: Exchange-based archives are not a retention mechanism.  They’re a storage mechanism.

Third-party archives

Other solutions use the term archiving as well.  Platforms that come to mind include Symantec Enterprise Vault, Archive 360, and Barracuda.

For example, one of the capabilities of Symantec Enterprise Vault is removing data from a user’s primary mailbox and storing the content in an immutable form in an external database, and then replacing the content in the user’s mailbox with a shortcut to the Enterprise Vault-stored item.  This replicates the Exchange personal archive function of moving data from costly to less expensive storage, but also introduces the concept of immutability–content in the archive typically can’t be deleted until its retention period expires.

I don’t need retention because I do Journaling.

Journaling is yet another way to preserve a copy of data.  When a system (either Exchange On-premises, Exchange Online, or any other mail system) is configured for journaling, the mail system is responsible for sending a copy (think of it like a BCC) of every message to a special location called a Journal mailbox.  From there, an application imports that data and stores an immutable copy in its database for the period of time determined by the system administrator and business rules.

It doesn’t matter what the user does in their own mailbox, as the mail transport system already delivered a copy of any messages to the journal mailbox for processing.
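A tiny Python sketch of the idea, under heavy simplification: `ToyTransport` is a hypothetical class, not a real mail transport, and the journal here is just a list.  The point is only that the copy is made at delivery time, so later user actions can’t touch it.

```python
class ToyTransport:
    """Hypothetical mail transport that journals every message it delivers."""

    def __init__(self):
        self.mailboxes = {}  # recipient -> list of messages
        self.journal = []    # stand-in for the journal mailbox

    def deliver(self, recipient, message):
        self.mailboxes.setdefault(recipient, []).append(message)
        self.journal.append(message)  # BCC-style copy made at delivery time


transport = ToyTransport()
transport.deliver("Finance1", "Q3 forecast")

# The user deletes the message from their own mailbox...
transport.mailboxes["Finance1"].remove("Q3 forecast")

# ...but the journal copy was already made and is unaffected.
print(transport.mailboxes["Finance1"])
print(transport.journal)
```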

Both Exchange on-premises and Exchange Online support journaling (although it is strictly against the Exchange Online Terms of Service to use an Exchange Online mailbox as the journal destination).  If you want to continue to journal, you’ll need to use a solution such as Enterprise Vault, EV Cloud, or Smarsh as your journal destination.

For most customers, I recommend using the native protection capability of their Exchange Online P2 mailbox (which allows for unlimited archiving and retention).  This enables them to provision mailboxes, retain, and search for data all in a single platform.  It’s both a cost reduction or avoidance strategy (no longer paying a third party to do something that’s included in a single product) and a simplicity strategy–allowing you to create, manage, discover, and preserve all your stuff in one place.

Summary

Hopefully, this content will allow you to have more specific and nuanced discussions about retention, compliance, and discovery capabilities of the Microsoft 365 platform.

If you’re interested in diving deeper, check out my series on using Content Search and eDiscovery in the Security & Compliance Center.

Footnotes

* Yes, I know that technically, the Socratic method is question-based discourse to uncover deeper truth.  The actual act of using questioning to cause logical reasoning employed by Socrates is maieutics.  But, you knew what I meant.

Published by Aaron Guilmette

Helping companies conquer inferior technology since 1997. I spend my time developing and implementing technology solutions so people can spend less time with technology. Specialties: Active Directory and Exchange consulting and deployment, Virtualization, Disaster Recovery, Office 365, datacenter migration/consolidation, cheese.
