Part of my role over the last few years has been educating customers about how Office 365 eDiscovery works. While we have lots of documentation on how things work, we don’t have a whole lot actually showing what it looks like in action (to know if you’re doing it right).
So, to help you with that last part, I’m going to run through a bunch of things I help customers with frequently to improve the quality of their searches while reducing the quantity of results they get.
Part of any discovery process is finding documents that are responsive. Responsive (if you didn’t click the link and want the TL;DR version) is legalese for “does it match my search?” If you’re going to YouTube to find a quick how-to video on changing out an HMI radio module for a Chevrolet Tahoe, you probably don’t care about videos for a Ford Explorer’s radio or Lake Tahoe. And if you’re searching Know Your Meme for 2019’s Baby Yoda, you likely don’t want to see original Yoda or 2020’s Baby Nut.
When searching the internet or other types of search, you’ve probably become accustomed to using quotes (” “) to denote phrases, + to require terms to be included, and – to exclude terms. Office 365 eDiscovery and content search also have qualifiers and operators, and in order to get the best possible (most responsive) results, you need to know how to use them.
- The Security & Compliance Center
- Content Search vs eDiscovery
- Order of Operations
- Best Practices and Recommendations
Which brings us to this blog post series: Office 365 Security & Compliance Center Content Search and eDiscovery. Over the posts in this series, I’m going to go over the following concepts:
- Condition Cards: Sender, Recipients, & Participants and Content Types
- Phrases and Grouping AND OR’ing (Oh, my!)
- Learning NEAR and ONEAR
- Discovering Microsoft Teams Content
So, buckle up and prepare to hopefully get slightly better with the Security & Compliance Center.
The Security & Compliance Center
If you’ve been using Office 365 for a while, you’ll likely have seen the transformation from the SharePoint eDiscovery Search Center and the Exchange Online In-Place eDiscovery. Over the last few years, we’ve transformed the search aspect and brought together the capabilities for searching Exchange mailboxes, Office 365 Groups, Teams, Skype for Business conversations, SharePoint sites, and OneDrive for Business sites.
You can see the experience by logging into https://protection.office.com.
One of the things you’ll notice in this current iteration of the portal (current as of February 2020) is that we have two places on the navigation bar to search for content: the Search menu (which exposes the Content Search option) and the eDiscovery menu (which gives you access to eDiscovery and Advanced eDiscovery menu options). The things we’re going to go through in this blog series will cover both Content Search and core eDiscovery, as the search functionality is identical.
But, there are some differences. What are they? Glad you asked.
Content Search vs eDiscovery
While the search capabilities of Content Search and eDiscovery are identical, there are a couple of key differences:
- Searches created in Content Search are visible to anyone who has access to Content Search. Each eDiscovery case, by contrast, is its own security boundary, meaning that unless a user is granted access to a case, they’re not able to see anything listed. The notable exception to this is the eDiscovery Administrator role, as they can see every eDiscovery case.
- eDiscovery allows you to put content on hold, as well as scope searches to content on hold for the case.
In either scenario, you can search for content across a wide variety of sources, including:
|Exchange mailboxes|Office 365 group messages|Skype for Business messages|
|Microsoft Forms|Yammer conversations|SharePoint sites|
|OneDrive for Business sites|Office 365 Groups sites|Teams sites|
|Yammer networks|Exchange public folders||
Yes. Exchange public folders AND Yammer. You right that read.
Order of Operations
Conceptually, each search goes through a lifecycle not unlike the following:
- Figure out what you’re going to search for. This answers the question what content am I looking for?
- Figure out where you’re going to search. You saw the potential search targets above. With any search (content or eDiscovery), you need to have, at a minimum, a content source (Exchange, Teams, SharePoint, etc). It can even be everything. You just have to select something.
- Once you’ve identified the what and where for your search, you can begin. Enter any qualifying parameters or conditions (locations, sender, keywords, etc). The goal of any search is to be as complete as possible while being as precise as possible. If you’re looking for all the instances of penguins falling down or goats screaming like humans, be specific. There’s no sense returning every document and message in an Office 365 tenant and having to click each one individually to see if it’s responsive to your query if you know going in that you just needed Bob’s messages to the candy cane vendor between November 30 and December 24.
- Preview your search results to make sure you’re getting the data you expect based on your search terms, conditions, and content sources.
- Export the content. In terms of the Security & Compliance Center, export means “get it ready for download.”
- Download the export.
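To make the "be specific" advice from the list above a bit more concrete, here is a minimal Python sketch that assembles a scoped keyword query string for the candy cane example. The `from:`, `to:`, and `sent` properties are ones the Content Search keyword syntax understands; the `build_query` helper, both email addresses, the year, and the ISO date format are assumptions made purely for illustration. The same scoping can be done entirely with condition cards in the UI.

```python
from datetime import date


def build_query(keywords, sender=None, recipient=None, sent_from=None, sent_to=None):
    """Combine keywords and optional conditions into one AND-ed query string."""
    parts = []
    if keywords:
        # Quote multi-word phrases so they are treated as a phrase.
        terms = [f'"{k}"' if " " in k else k for k in keywords]
        joined = " OR ".join(terms)
        parts.append(f"({joined})" if len(terms) > 1 else joined)
    if sender:
        parts.append(f"from:{sender}")
    if recipient:
        parts.append(f"to:{recipient}")
    if sent_from:
        parts.append(f"sent>={sent_from.isoformat()}")
    if sent_to:
        parts.append(f"sent<={sent_to.isoformat()}")
    return " AND ".join(parts)


query = build_query(
    keywords=["candy cane"],
    sender="bob@contoso.com",              # hypothetical address
    recipient="sales@candycanes.example",  # hypothetical vendor address
    sent_from=date(2019, 11, 30),          # the year is an assumption
    sent_to=date(2019, 12, 24),
)
print(query)
```

Every condition narrows the result set, which is the whole point: fewer items to preview, export, and review.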
That’s the nitty gritty. But, I’ll show you a few options just to drive it home.
Here, we’re going to go through some examples:
- Basic Content Search
- Guided Content Search
- Working with eDiscovery Cases
Here we go!
Basic Content Search
First up is the basic content search. We’re going to look at the search interface and do the most basic of searches–one for a few keywords. There are a few different search paths that we can follow:
- General search (using condition cards)
- Guided search (a wizard)
- ID search (a more advanced search that we’ll get to later in the series)
We’ll examine the first two searches in this post. First, the General search.
- Log into the Security & Compliance Center (https://protection.office.com).
- Select Search | Content Search.
- Select the New search button.
- The search pane is automatically populated with the Keywords condition card. Enter the keywords relating to the content you wish to find. In this case, I’m going to query for penguins.
- By default, the Specific locations radio button is selected, but no locations have been identified. You can select the All locations radio button to search the entirety of the Office 365 tenant, or you can scope it to particular data sets. In this case, I’m going to select three individual places, so you can see how it works. Click the Modify… link next to the Specific locations radio button.
- In the Modify locations fly-out, select Choose users, groups, or teams to identify the messaging targets you wish to search. In my case, I’m going to search the admin mailbox (since that’s where I have my test content).
- Click the Choose users, groups or teams button.
- From this window, you’ll be able to add mailboxes. This UI wording is a little bit quirky. You have to perform three steps: enter a value to search for, select the checkbox for each matching item you wish to include, and then select the Choose button.
- You’ll be taken back to the Edit locations panel. Click Done when you’re finished.
- You’ll see that under the “messaging” section, I now have 1 user, group, or team listed under the Selected locations column. I also want to specify the user’s OneDrive for Business site and the main SharePoint site. Click the Choose sites link.
- On the SharePoint sites fly-out panel, click Choose sites.
- You’ll notice in this box that you have to manually enter a site or site collection. There is no site browser. For each SharePoint site, site collection, or OneDrive site, you have to enter the URLs individually and click the + button.
- Then, you have to select the checkbox next to the URL. The UI is a little quirky here, in my opinion: after you click the checkbox next to the URL, you can then enter a new URL in the URL text box and click +. It will show up under the sites list where you can click the checkbox next to it. No wildcards are allowed here. Each site you add will show up under the Added section. When you are done, click Choose.
- Review and click Save.
- Click Save & run.
- On the Save search fly-out panel, enter a name for the search and click Save. The name for the search must be unique in your tenant. After you click Save, the search will begin.
- If the search matched any content, it will be displayed in the preview window (if you have the correct permissions). In this case, we’ve returned a bunch of stuff: items from a SharePoint page, a document stored in OneDrive, and some test emails. You can select each item in the results column and see a preview of it in the preview window. You can also download individual items here in their native format (as opposed to going through with an export). The Preview is limited to 1,000 items.
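Since the site picker in the steps above has no browser and accepts no wildcards, it can help to sanity-check your list of URLs before you start pasting them in one at a time. The following is a rough Python sketch, not anything the portal provides: the `validate_site_urls` helper and the sample URLs are hypothetical, and requiring `https` is my own assumption.

```python
from urllib.parse import urlparse


def validate_site_urls(urls):
    """Split candidate site URLs into (accepted, rejected) lists."""
    accepted, rejected = [], []
    for raw in urls:
        url = raw.strip()
        parsed = urlparse(url)
        # Wildcards are not allowed, so anything containing * is rejected.
        has_wildcard = "*" in url
        if parsed.scheme == "https" and parsed.netloc and not has_wildcard:
            if url not in accepted:  # skip exact duplicates
                accepted.append(url)
        else:
            rejected.append(raw)
    return accepted, rejected


# Hypothetical sample URLs for illustration only.
sites = [
    "https://contoso.sharepoint.com/sites/marketing",
    "https://contoso-my.sharepoint.com/personal/admin_contoso_com",
    "https://contoso.sharepoint.com/sites/*",
    "contoso.sharepoint.com/sites/hr",
]
accepted, rejected = validate_site_urls(sites)
print("Enter these one at a time:", accepted)
print("Fix these first:", rejected)
```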
We’re going to pause here for a moment and take a look at accomplishing the same thing with a Guided Search. If you can’t wait to get your stuff downloaded, skip to the Export section.
Guided Content Search
We’re going to look for the same content in this example, but we’ll take a different route to get there. This time, instead of clicking on the + New Search button, we’re going to select the + Guided Search button to see the wizard interface.
- Log into the Security & Compliance Center (https://protection.office.com).
- Select Search | Content Search.
- Select the + Guided search button.
- What’s interesting here is you can see the basic search in the background–the wizard (the guiding part of the guided search) is going to make sure we populate the required fields. The first required field is Name, so fill that out and click Next.
- The next page of the wizard is Choose locations. Just like the basic or general search, you need to indicate the search targets. The UI is exactly the same, so I won’t bore you with repetitive screenshots at this point. I’m going to go through the exact same steps as before to select a target mailbox, SharePoint, and OneDrive sites. Click Next when finished configuring items on this page.
- The Create query or Condition card page has a Keywords condition card by default. I’m going to enter penguins like I did before. Click Finish when complete. This will save and run the search.
Now, you can just sit back and wait for the search to complete.
Now we’re at the exact same point from both of our searches (the general or basic search and the guided search). Once you’re set on the content, the next step is to Export it. As I mentioned previously, Export isn’t what most of us traditionally think of as exporting (where the result is a file). In this case, Export is an intermediate step that prepares content to be downloaded.
There are actually two exports that can be created–one to prepare files and data to download (also known as Export results) and one to prepare a report (brilliantly named Export report). We’ll examine what each of these options does.
Located under the More menu, Export report generates a list of items that matched (or were responsive, in legalese) your search conditions.
The report contains some metadata about the responsive items, such as what type of content it was and where it was found. This can be important in helping you understand what has been found in your search–especially if you come back with a very large data set.
Once you’ve selected Export Report, you’ll have the option of scoping the report to one of three data sets under Output options:
- All items, excluding ones that have unrecognized format, are encrypted, or weren’t indexed for other reasons
- All items, including ones that have unrecognized format, are encrypted, or weren’t indexed for other reasons
- Only items that have an unrecognized format, are encrypted, or weren’t indexed for other reasons
The number and type of results your query generates will inform how you make this decision. If all the items returned were indexed and searchable, it means they were responsive to (match) your query, so either of the first two options is fine. If you have unindexed or unsearchable results, you may want to produce two exports: one with only the indexed items and one with only the unindexed items–that way, you can separate what is “definitely responsive” from what is “unknown.” This really depends on a question of volume–if it’s only a few unsearchable items, it may not be worth your time to generate two exports. However, if there are hundreds or thousands to go through, you may want to split it up.
Clicking the Generate report button will … generate the report. The report data will be exported to a storage blob in Azure. Once it’s ready, select the Exports tab of the search, and then select the appropriate report. If the report generation has completed, you’ll have a button that says Download report at the top of the screen, and a link that says Copy to clipboard somewhere in the middle of the page, under the heading Export key:
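The decision described above boils down to a small rule of thumb, which could be sketched in Python like this. The `OutputOption` names, the `plan_exports` helper, and especially the threshold of 36 items are assumptions for illustration; pick whatever cutoff matches how much manual review you are willing to do.

```python
from enum import Enum


class OutputOption(Enum):
    INDEXED_ONLY = "All items, excluding unindexed ones"
    ALL_ITEMS = "All items, including unindexed ones"
    UNINDEXED_ONLY = "Only unindexed items"


def plan_exports(unindexed_count, split_threshold=36):
    """Return the list of exports to generate for a search.

    split_threshold is an arbitrary cutoff (an assumption, not a product limit).
    """
    if unindexed_count == 0:
        # Everything was indexed, so one export of the indexed items is enough.
        return [OutputOption.INDEXED_ONLY]
    if unindexed_count < split_threshold:
        # Only a handful of unknowns: one combined export is less work.
        return [OutputOption.ALL_ITEMS]
    # Lots of unknowns: keep "definitely responsive" apart from "unknown".
    return [OutputOption.INDEXED_ONLY, OutputOption.UNINDEXED_ONLY]


for count in (0, 5, 400):
    print(count, [option.value for option in plan_exports(count)])
```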
You’ll need that export key to be able to connect to the Azure storage blob where your report data is stored (you’ll notice it looks a lot like a SAS key).
Click the Download report button to launch the downloader. Note: You’ll need to be running Microsoft Edge or Internet Explorer to launch the downloader application.
Once you acknowledge the app launch, you’ll have a box in which to drop that handy Export key. Paste it in, select a path for the data, and click Start. Off you go!
After the download finishes, you can open the target folder. The two files with the most importance are the “Export Summary” and “Results” CSVs:
The summary is just that–raw stats about the number of items anticipated/processed, the export options selected, and the number of errors and warnings generated. The Results.csv file, however, contains an inventory of every item that matched your output criteria:
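If the results inventory is large, a few lines of Python can give you a quick breakdown before you open it in Excel. This is only a sketch: the `summarize_results` helper is hypothetical, and the column names in the sample data (`Location`, `Type`) are made up, so check the header row of your own file and pass in whichever column you care about.

```python
import csv
import io
from collections import Counter


def summarize_results(csv_file, column):
    """Count rows per distinct value of `column` in an export inventory."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found; available: {reader.fieldnames}")
    # Blank cells are grouped under a placeholder label.
    return Counter(row[column] or "(blank)" for row in reader)


# Stand-in data with made-up column names; a real run would open the
# downloaded CSV instead, e.g. open(path, newline="", encoding="utf-8-sig").
sample = io.StringIO(
    "Location,Type\n"
    "OneDrive,Document\n"
    "Mailbox,Email\n"
    "Mailbox,Email\n"
)
counts = summarize_results(sample, "Type")
print(counts.most_common())
```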
Alrighty, on to Export Results!
The Export Results option runs you through the same basic steps as the report, only this time, you get all the files, messages, and other content that was discovered in the process of the search. Instead of selecting Export Report, you’ll select Export Results.
For your efforts, you’ll be presented with a screen not unlike the following:
The default option is to select one PST for each mailbox, thereby allowing you to easily distinguish a particular custodian (more legalese: owner) of the data. You do, however, have additional options, such as a single PST with everything in it (organized in their source folders), a single PST with all of the messages in a single folder (seems like an OCD nightmare), and Individual Messages. When exporting individual messages, they are exported as .eml files, so if someone needs to actually read them, they will either need Outlook or a converter.
After you’ve selected your options, click the Export button. In this case, since I don’t have any unindexed files, I’m just going to select “all items” and a single PST.
You also have options available for Exchange de-dupe (which I generally don’t recommend, based on our list of exceptions and known issues), as well as SharePoint versions (if you have versioning turned on in your library).
The steps from here on out are the same as exporting the report (clicking the download button, copying the SAS Export key, etc.), so I needn’t (feeling all Victorian with that word) replicate those steps here.
Working with eDiscovery Cases
I wanted to touch on this topic briefly: eDiscovery capabilities are a superset of Content Search capabilities that add two additional features:
- An eDiscovery case is a security boundary. Only the eDiscovery manager who created the case (or individuals who have been specifically delegated access) can see the case. It’s security-trimmed to prevent awkward situations like someone discovering they’re under investigation and to preserve the privacy of sensitive information.
- eDiscovery cases also include the concept of a hold. It’s constructed identically to a search, but instead of just locating the data, you have an additional capability to preserve the data.
Let’s look at those differences here.
eDiscovery as a Security Boundary
We’ll take a quick look at how an eDiscovery container looks. eDiscovery cases are located in the Security & Compliance Center under the menu node eDiscovery (if you’re looking at a commercial tenant) or under Search & Investigation if you’re exploring this in a GCC tenant:
The feature set for what we’re going to look at here is exactly the same between the two clouds, however. The UI for GCC has not yet been updated (as of this writing).
Once you launch eDiscovery, you’ll see a list of cases that you currently have access to (either cases you’ve created or cases to which you’ve been delegated access), unless of course you are the eDiscovery administrator, in which case you’ll have access to all the cases across the organization.
Click the + Create a case button to create a new eDiscovery case, which will act as the container for all of the searches, holds, and exports you perform related to a particular case.
The case container creation doesn’t have a whole lot of options initially–really, just a name and a description.
Once you’ve created the case, though, you can click on it and then define additional properties:
- Members – These are the individuals that have access to the case. By default, only the creator of the case has access to it.
- Role Groups – In addition to assigning individual members, you can assign everyone who has a particular role group assignment. The role groups can be found under the Permissions node of the navigation menu.
- Status – Under this section, you can update the name of a case, its description, and whether it is Active (open) or Closed.
After you’ve made any updates, click Save.
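If it helps to think about the security boundary as logic, the access rules described in this section can be modeled roughly as follows. This is a conceptual Python sketch only; the `Case`, `User`, and `can_see` names and the sample people are hypothetical and have nothing to do with how the service is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    name: str
    creator: str
    members: set = field(default_factory=set)
    role_groups: set = field(default_factory=set)


@dataclass
class User:
    name: str
    role_groups: set = field(default_factory=set)
    is_ediscovery_admin: bool = False


def can_see(user, case):
    """Apply the visibility rules: admin, creator, member, or shared role group."""
    if user.is_ediscovery_admin:
        return True  # eDiscovery Administrators see every case
    if user.name == case.creator or user.name in case.members:
        return True
    # Otherwise, access comes only from a role group assigned to the case.
    return bool(user.role_groups & case.role_groups)


case = Case(name="Candy Cane Dispute", creator="alice", role_groups={"Legal Reviewers"})
print(can_see(User("alice"), case))                            # creator
print(can_see(User("bob"), case))                              # no relationship
print(can_see(User("carol", {"Legal Reviewers"}), case))       # via role group
print(can_see(User("dave", is_ediscovery_admin=True), case))   # administrator
```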
eDiscovery Content Holds
The other core difference between the feature set of Content Search and eDiscovery is the ability to place content on hold. Usually, when conducting a case that is intended to go to litigation, content is put on hold and preserved for the duration of the case.
Depending on how your organization handles retention, this may or may not be necessary. Generally, when working with my customers, I’ve found that when it comes to document and email retention and management, lawyers fall into two buckets: more data protects you more, and less data protects you more. Your organization may have indefinite retention policies, which means that you likely don’t need to place separate holds on content to preserve it. However, if your organization has short (or no) retention policies, you may want to preserve content–especially if there is a risk that litigation or proceedings might last longer than your retention policy.
For everyone reading this:
Ok, with that part out of the way, let’s talk about how to actually create a hold. It’s actually very simple. It uses the same process as a content search, only at the end of it, the button says Create this hold. Watch.
- Open the eDiscovery case by clicking the word Open on the case.
- Select the Hold tab and then click + Create.
- As promised, the UI is nearly identical to the guided search wizard. Fill out the details, including the name, description, location, and query (just like the guided search wizard). Click Next at the end of each step.
- At the end, click Create this hold to preserve the data.
A few notes about holds:
- Your tenant can have 10,000 eDiscovery case holds active at any time. This value includes the holds of all cases in your tenant.
- eDiscovery case holds are only active as long as the hold is active (on the holds tab) or while the case is open. You can release an individual hold in a case by deleting it, or you can release all holds by closing a case.
- You can see more information about case holds here: https://docs.microsoft.com/en-us/microsoft-365/compliance/ediscovery-cases?view=o365-worldwide#step-4-place-content-locations-on-hold.
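The notes above can be summarized as a tiny model: a hold preserves data only while it exists and its case is open, and the tenant-wide count is capped at 10,000. Here is a Python sketch of that bookkeeping. The class and function names are hypothetical, and treating holds in closed cases as not counting toward the limit is my assumption, not something the product documentation here confirms.

```python
from dataclasses import dataclass, field

TENANT_HOLD_LIMIT = 10_000  # limit quoted in the notes above


@dataclass
class CaseHolds:
    case_name: str
    is_open: bool = True
    holds: list = field(default_factory=list)

    def release_hold(self, hold_name):
        """Deleting a hold releases just that hold."""
        self.holds.remove(hold_name)

    def close(self):
        """Closing the case releases every hold in it."""
        self.is_open = False


def active_hold_count(cases):
    # Assumption: only holds in open cases count as active.
    return sum(len(c.holds) for c in cases if c.is_open)


def can_create_hold(cases):
    return active_hold_count(cases) < TENANT_HOLD_LIMIT


cases = [
    CaseHolds("Candy Cane Dispute", holds=["Bob mailbox", "Vendor site"]),
    CaseHolds("Penguin Incident", holds=["Zoo team"]),
]
print(active_hold_count(cases))
cases[0].release_hold("Vendor site")
cases[1].close()
print(active_hold_count(cases), can_create_hold(cases))
```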
Best Practices and Recommendations
- Use eDiscovery cases for things that require privacy. Remember, an eDiscovery case is a security boundary, so unless someone is added to a case, they can’t see the searches, holds, or results.
- When exporting (either results or reports), you’ll notice that there are three options (all items excluding unindexed ones, all items including unindexed ones, and only unindexed items). If you have more than a few dozen unindexed results, I typically recommend doing two exports: one that contains only the indexed items (that is, the ones that are responsive to your search) and one that contains only unindexed items, as those are results that you’re not sure of. That way, you can focus on evaluating whether or not the unindexed results are actually responsive as opposed to sifting out the results.
That’s it for the basic tour of the Security & Compliance Center Content Search and eDiscovery Overview!