This is the fourth in a series of posts focusing on helping you get the most out of Office 365 Content Search and eDiscovery.
Over the posts in this series, I’m going to go over the following concepts:
- Condition Cards: Sender, Recipients, & Participants and Content Types
- Phrases and Grouping AND OR’ing (Oh, my!)
- Learning NEAR and ONEAR
- Discovering Microsoft Teams Content
In this edition of “finally catching up on all those eDiscovery cases that are piling up,” we’re going to look at unleashing the refining power of the NEAR and ONEAR operators.
What are NEAR and ONEAR?
Put simply, NEAR and ONEAR are search operators that help you locate results where one word is within a certain proximity or distance of another word. Quite plainly, when something is near something else.
There are a lot of times when this could be helpful. For example, maybe you’re looking for documents related to “Sky Blue Construction” or “Construction Project”. So, you plug those words into a query, but as you pore through your results, you realize that those are really common words. If you had just searched for keywords, you could get any number of unrelated results.
Enter the NEAR and ONEAR operators. They can help you winnow down the results to things that are more likely to be responsive based on how close words are to each other–especially if the common words you’re looking for aren’t necessarily an exact phrase.
The NEAR operator helps you locate words that are within a certain distance of each other. In this case, distance is used to mean “word distance.” Words are tokens separated by spaces, and the distance is the number of words that sit between the two terms. For example, in Project Sky Construction, project and construction have a distance of one (since one word separates them).
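To make the counting concrete, here is a minimal Python sketch of “word distance” as described above. The function name `word_distance` is my own invention for illustration, and splitting on whitespace is an assumption; the real search index tokenizes content in its own way.

```python
def word_distance(text, first, second):
    """Return the fewest words separating `first` and `second`, or None.

    Hypothetical helper: models "word distance" as the number of words
    that sit between the two terms (adjacent words have a distance of 0).
    """
    tokens = text.lower().split()  # assumption: words are space-separated tokens
    first_hits = [i for i, tok in enumerate(tokens) if tok == first.lower()]
    second_hits = [i for i, tok in enumerate(tokens) if tok == second.lower()]
    gaps = [abs(i - j) - 1 for i in first_hits for j in second_hits if i != j]
    return min(gaps) if gaps else None


print(word_distance("Project Sky Construction", "project", "construction"))  # 1
print(word_distance("Project Construction", "project", "construction"))      # 0
```

A result of None simply means one of the two words never appears in the text.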
The NEAR operator is invoked using the keyword1 NEAR(n=#) keyword2 syntax.
A search like project NEAR(n=1) construction will locate phrases like project blue construction and project construction, as well as construction project. NEAR looks for the supporting element keyword in either direction of the anchor element keyword.
If you were comparing NEAR to a regular expression lookaround, you could express it like this:
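As a rough sketch in Python (one reasonable pattern that I'm assuming for illustration, not how the search engine actually evaluates NEAR), the hypothetical helper `near_pattern` below allows up to n whole words between the two keywords, in either order. It uses plain alternation for the two directions rather than a true lookaround.

```python
import re


def near_pattern(first, second, n=8):
    """Build a regex approximating `first NEAR(n=#) second` (hypothetical helper)."""
    a, b = re.escape(first), re.escape(second)
    # Separator, then at most n whole words, each followed by a separator.
    gap = r"\W+(?:\w+\W+){0,%d}" % n
    # NEAR is unordered, so accept the keywords in either direction.
    return re.compile(rf"\b(?:{a}{gap}{b}|{b}{gap}{a})\b", re.IGNORECASE)


pattern = near_pattern("project", "construction", n=1)
print(bool(pattern.search("project blue construction")))      # True
print(bool(pattern.search("construction project")))           # True
print(bool(pattern.search("project big blue construction")))  # False
```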
If no distance is specified (n=#), the default distance of 8 is used.
Note: Our documentation uses NEAR(#) interchangeably in parts, but my results have been mixed. The KQL documentation says that the (n=#) parameter can be shortened to just (#), but I have not experienced a 100% success rate with that: sometimes it works, sometimes it doesn’t. During the times that it doesn’t, NEAR (and ONEAR, for that matter) simply appears to function as OR.
ONEAR is ordered near. This operator helps you locate words that are within a certain distance of each other AND also appear in a certain order. This can be important if one word always precedes another in your search, or if you need to ignore words unless they appear in a certain order. If you’re familiar with regular expressions, you can think of this as working similarly to lookaround behavior.
ONEAR definitely has some fantastic power. Unfortunately, ONEAR’s key capability of finding words within a certain distance in a certain order only works in SharePoint locations. If the ONEAR operator is used in Exchange locations, it functions like the NEAR operator. This affects all content types in the Exchange locations, including Yammer Groups.
The ONEAR operator is invoked using the keyword1 ONEAR(n=#) keyword2 syntax. While the above project NEAR(n=1) construction would return both project construction and construction project, project ONEAR(n=1) construction will only return instances like Project Sky Construction and Project Construction. If you were comparing ONEAR to a regular expression lookahead, you could use something like this:
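Here is a comparable Python sketch for the ordered case. Again, `onear_pattern` is a hypothetical helper and only an approximation of the idea; it matches the first keyword followed by up to n whole words and then the second keyword, and never the reverse order.

```python
import re


def onear_pattern(first, second, n=8):
    """Build a regex approximating `first ONEAR(n=#) second` (hypothetical helper)."""
    # Separator, then at most n whole words, each followed by a separator.
    gap = r"\W+(?:\w+\W+){0,%d}" % n
    # Ordered: `first` must come before `second`.
    return re.compile(
        rf"\b{re.escape(first)}{gap}{re.escape(second)}\b", re.IGNORECASE
    )


pattern = onear_pattern("project", "construction", n=1)
print(bool(pattern.search("Project Sky Construction")))  # True
print(bool(pattern.search("Project Construction")))      # True
print(bool(pattern.search("Construction Project")))      # False
```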
If no distance is specified (n=#), the default distance of 8 is used.
Now that you’re up to speed with the syntax of how these operators work, let’s see some examples!
In this set of examples, we’re going to show how NEAR can be used to locate content in Exchange and SharePoint locations (which will include mailboxes, conversations, and documents).
First up, using NEAR in a mailbox. The query I’m executing is cat NEAR(n=2) duck. As you can see, the n=2 distance returns two single messages that match. The highlighted text shows two words separating the search terms.
If I change the distance to n=1 or less, my search results change. Results whose search terms are two or more words apart are no longer displayed:
Next, we’ll look at NEAR searches in a SharePoint location.
NEAR searches in SharePoint work exactly the same way; you just have to select a SharePoint location as opposed to an Exchange location. SharePoint searches will cover site pages and assets as well as posted or uploaded files. In this example, we’re going to be using the search query penguins NEAR(n=3) "ice sheets" to display a document.
When I download and open the document, you can see the search terms that caused this document to be responsive (I highlighted them so you could see them easily).
The keyword phrase ice sheets is no more than 3 words away from the first keyword penguins.
If I adjust the distance to n=2, this document will no longer be responsive.
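If you want to reason about why tightening the distance drops a result, a small token-based model can help. This is only a sketch: the `near` and `find_positions` helpers are hypothetical, the sample sentence is made up (it is not the text of the document in my tenant), and I'm assuming distance to a phrase is counted from the edge of the phrase.

```python
def find_positions(tokens, phrase_tokens):
    """Return (start, end) index pairs where the phrase occurs in tokens."""
    size = len(phrase_tokens)
    return [
        (i, i + size - 1)
        for i in range(len(tokens) - size + 1)
        if tokens[i:i + size] == phrase_tokens
    ]


def near(text, first, second, n=8):
    """Toy model of `first NEAR(n=#) second`; terms may be multi-word phrases."""
    tokens = text.lower().split()  # assumption: no punctuation handling
    for a_start, a_end in find_positions(tokens, first.lower().split()):
        for b_start, b_end in find_positions(tokens, second.lower().split()):
            # Count the words strictly between the terms, whichever comes first.
            gap = b_start - a_end - 1 if a_end < b_start else a_start - b_end - 1
            if 0 <= gap <= n:
                return True
    return False


sample = "penguins live on the ice sheets"  # made-up sentence: three words between
print(near(sample, "penguins", "ice sheets", n=3))  # True
print(near(sample, "penguins", "ice sheets", n=2))  # False
```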
If NEAR is cool, ONEAR is its much cooler cousin (although, a bit of a one-trick pony, as we’ll see).
As I previously stated, ONEAR won’t work as intended in Exchange locations, but we can go ahead and try it anyway. We’ll go back to a search that I used earlier (cat NEAR(n=2) duck) and switch it around:
duck ONEAR(n=2) cat
As you can see, the result ignores the ordering part of ONEAR and just returns keyword1 NEAR keyword2. The distance parameter is honored, however.
And, as I previously mentioned, this “limited” ONEAR behavior also affects Yammer group messages, as you can see:
So, just be aware of how ONEAR functions when using it in conjunction with Exchange-based search locations.
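One way to keep that difference straight is to model it. The sketch below is a toy model of the behavior I observed, not anything from the service itself: `proximity_match`, its `location` parameter, and the sample sentence are all hypothetical. Ordering is only enforced for SharePoint, while the distance is honored everywhere.

```python
def proximity_match(text, first, second, n=8, ordered=False, location="sharepoint"):
    """Toy model: ONEAR ordering applies in SharePoint only (observed behavior)."""
    # Assumption: Exchange locations ignore the ordering and treat ONEAR as NEAR.
    enforce_order = ordered and location == "sharepoint"
    tokens = text.lower().split()
    first_hits = [i for i, tok in enumerate(tokens) if tok == first.lower()]
    second_hits = [i for i, tok in enumerate(tokens) if tok == second.lower()]
    for i in first_hits:
        for j in second_hits:
            if i == j:
                continue
            if enforce_order and j < i:
                continue  # second keyword appears before the first
            if abs(i - j) - 1 <= n:  # words between the two keywords
                return True
    return False


message = "the cat chased a duck"  # made-up sample text
# duck ONEAR(n=2) cat
print(proximity_match(message, "duck", "cat", n=2, ordered=True, location="exchange"))    # True
print(proximity_match(message, "duck", "cat", n=2, ordered=True, location="sharepoint"))  # False
```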
ONEAR works as expected in SharePoint locations. We’re going to demonstrate this by re-using our above search query and modifying it for ONEAR:
penguins ONEAR(n=3) "ice sheets"
As expected, the same document that was responsive earlier showed up in these results. Now, if I switch the keyword order, this document should no longer be responsive:
"ice sheets" ONEAR(n=3) "penguins"
That’s all I’ve got for NEAR and ONEAR. It’s strange to count success as not finding anything, but this is the world we live in today.
One thing that I noticed is that our documentation has two different syntaxes listed for how NEAR and ONEAR work.
NEAR/ONEAR(n) is listed as the syntax on https://docs.microsoft.com/en-us/microsoft-365/compliance/keyword-queries-and-search-conditions?view=o365-worldwide. In my experience, this does not work reliably with Content Search or eDiscovery search. It sometimes works, but it also sometimes appears to function as an OR operator (returning items matching ANY of the keywords or phrases, so literally the opposite of what you’re hoping for). It’s listed as a shorthand for n=# in our documentation, but I’ve not had consistent success with it. I would recommend not using it.
NEAR/ONEAR(n=#) is listed as the syntax on https://docs.microsoft.com/en-us/sharepoint/dev/general-development/keyword-query-language-kql-syntax-reference. This is the way that works.
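If you build queries in scripts, one way to stick to the syntax that works is to generate the query text with a small helper that always writes the explicit n= form. This is just a sketch; `proximity_query` is a hypothetical name, and it only assembles a string that you would then paste or pass into your search.

```python
def proximity_query(first, second, n=8, ordered=False):
    """Build `first NEAR(n=#) second` or `first ONEAR(n=#) second` as text."""
    if n < 0:
        raise ValueError("distance must be zero or greater")

    def quote(term):
        # Wrap multi-word phrases in double quotes unless already quoted.
        if " " in term and not term.startswith('"'):
            return f'"{term}"'
        return term

    operator = "ONEAR" if ordered else "NEAR"
    # Always emit the explicit n= form rather than the (#) shorthand.
    return f"{quote(first)} {operator}(n={n}) {quote(second)}"


print(proximity_query("penguins", "ice sheets", n=3))
# penguins NEAR(n=3) "ice sheets"
print(proximity_query("duck", "cat", n=2, ordered=True))
# duck ONEAR(n=2) cat
```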