Office 365 Security & Compliance Center eDiscovery – Part 4: Learning NEAR and ONEAR

This entry is part 4 of 4 in the series Getting the most out of Office 365 Search

This is the fourth in a series of posts focusing on helping you get the most out of Office 365 Content Search and eDiscovery.

Intro

Over the posts in this series, I’m going to go over the following concepts:

In this edition of “finally catching up on all those eDiscovery cases that are piling up,” we’re going to look at unleashing the refining power of the NEAR and ONEAR operators.

What are NEAR and ONEAR?

Put simply, NEAR and ONEAR are search operators that help you locate results where one word is within a certain proximity or distance of another word.  Quite plainly, when something is near something else.

There are a lot of times when this could be helpful.  For example, maybe you’re looking for documents related to “Sky Blue Construction” or “Construction Project”.  So, you plug those words into a query, but as you pore through your results, you realize that those are really common words.  If you had just searched for keywords, you could get any number of unrelated results.

Enter the NEAR and ONEAR operators.  They can help you winnow down the results to things that are more likely to be responsive based on how close words are to each other–especially if the common words you’re looking for aren’t necessarily an exact phrase.

NEAR

This operator helps you locate words that are within a certain distance of each other.  In this case, distance is used to mean “word distance.”  Words are tokens separated by spaces.  For example, in the phrase Project Sky Construction, the words project and construction have a distance of one (since one word separates them).
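To make “word distance” concrete, here is a minimal Python sketch that counts the words separating two keywords.  The function name word_distance and the simple \w+ tokenization are assumptions for illustration only; the actual search index may break words differently.

```python
import re

def word_distance(text, first, second):
    """Return the smallest number of words separating the two keywords, or None."""
    # Assumption: a "word" is any run of letters, digits, or underscores.
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    pos_a = [i for i, t in enumerate(tokens) if t == first.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == second.lower()]
    # Adjacent words have a gap of 0; one word in between is a gap of 1.
    gaps = [abs(a - b) - 1 for a in pos_a for b in pos_b if a != b]
    return min(gaps) if gaps else None

print(word_distance("Project Sky Construction", "project", "construction"))  # 1
print(word_distance("Project Construction", "project", "construction"))      # 0
```

The result is the same regardless of which keyword comes first, which mirrors how NEAR treats direction.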

The NEAR operator is invoked using the keyword1 NEAR(n=#) keyword2 syntax.

A search like project NEAR(n=1) construction will locate phrases like project blue construction and project construction, as well as construction project.  NEAR looks for the supporting element keyword in either direction of the anchor element keyword.

If you were comparing NEAR to a regular expression look around, you could express it like this:

\b(?:project\W+(?:\w+\W+){0,1}?construction|construction\W+(?:\w+\W+){0,1}?project)\b
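If you want to see that expression in action, the following Python sketch runs it against a few sample phrases.  The sample strings and the near_hits helper are made up for illustration, and the regular expression is only an analogy for NEAR(n=1), not how the search service evaluates queries.

```python
import re

# The same pattern as above, split across two lines for readability.
NEAR_1 = re.compile(
    r"\b(?:project\W+(?:\w+\W+){0,1}?construction"
    r"|construction\W+(?:\w+\W+){0,1}?project)\b",
    re.IGNORECASE,
)

samples = [
    "project blue construction",      # one word between, forward order
    "project construction",           # adjacent, forward order
    "construction project",           # adjacent, reverse order
    "project sky blue construction",  # two words between: too far for n=1
]

def near_hits(phrases):
    """Return the phrases that the NEAR(n=1) analogy would match."""
    return [p for p in phrases if NEAR_1.search(p)]

for phrase in near_hits(samples):
    print(phrase)
```

Only the first three phrases are printed; the last one has two words between the keywords, so it falls outside a distance of 1.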

If no distance is specified (n=#), the default distance of 8 is used.

Note: Our documentation uses NEAR(n=#) and NEAR(#) interchangeably in parts, but my results have been mixed.  The KQL documentation says that the (n=#) parameter can be shortened to just (#), but I have not experienced a 100% success rate with that (sometimes it works, sometimes it doesn’t).  During the times that it doesn’t, NEAR (and ONEAR, for that matter) simply appear to function as OR.

ONEAR

ONEAR is ordered near.  This operator helps you locate words that are within a certain distance of each other AND also appear in a certain order.  This can be important if one word always precedes another in your search, or if you need to ignore words unless they appear in a certain order.  If you’re familiar with regular expressions, you can think of this as working similarly to look-around behavior.

ONEAR definitely has some fantastic power.  Unfortunately, ONEAR’s key capability of finding words within a certain distance in a certain order only works in SharePoint locations.  If the ONEAR operator is used in Exchange locations, it functions like the NEAR operator.  This affects all content types in the Exchange locations, including Yammer Groups.

The ONEAR operator is invoked using the keyword1 ONEAR(n=#) keyword2 syntax.  While the above project NEAR(n=1) construction would return both project construction and construction project, project ONEAR(n=1) construction will only return instances like Project Sky Construction and Project Construction.  If you were comparing ONEAR to a regular expression look ahead, you could use something like this:

\bproject\W+(?:\w+\W+){0,1}?construction\b
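Here is the same kind of quick check for the ordered expression, again as a Python sketch.  The sample phrases and the onear_hits helper are hypothetical, and the pattern is an analogy for ONEAR(n=1) rather than the service’s actual matching logic.

```python
import re

# Ordered variant: "project" must come first, with at most one word in between.
ONEAR_1 = re.compile(r"\bproject\W+(?:\w+\W+){0,1}?construction\b", re.IGNORECASE)

samples = [
    "Project Sky Construction",  # one word between, correct order
    "Project Construction",      # adjacent, correct order
    "construction project",      # reverse order: should not match
]

def onear_hits(phrases):
    """Return the phrases that the ONEAR(n=1) analogy would match."""
    return [p for p in phrases if ONEAR_1.search(p)]

for phrase in onear_hits(samples):
    print(phrase)
```

The reversed phrase is dropped because the pattern has no alternative branch for construction appearing before project.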

If no distance is specified (n=#), the default distance of 8 is used.

Examples

Now that you’re up to speed with the syntax of how these operators work, let’s see some examples!

NEAR

In this set of examples, we’re going to show how NEAR can be used to locate content in Exchange and SharePoint locations (which will include mailboxes, conversations, and documents).

Exchange Locations

First up, using NEAR in a mailbox.  The query I’m executing is cat NEAR(n=2) duck.  As you can see, the n=2 distance returns two single messages that match.  The highlighted text shows two words separating the search terms.

If I change the distance to n=1 or less, my search results change.  Search results that previously had two words distance or greater are no longer displayed:

Next, we’ll look at NEAR searches in a SharePoint location.

SharePoint Locations

NEAR searches in SharePoint work exactly the same way; you just have to select a SharePoint location as opposed to an Exchange location.  SharePoint searches will return site pages and assets as well as posted or uploaded files.  In this example, we’re going to be using the search query penguins NEAR(n=3) "ice sheets" to display a document.

When I download and open the document, you can see the search terms that caused this document to be responsive (I highlighted them so you could see them easily).

The keyword phrase ice sheets is no more than 3 words away from the first keyword penguins.

If I adjust the distance to n=2, this document will no longer be responsive.

ONEAR

If NEAR is cool, ONEAR is its much cooler cousin (although, a bit of a one-trick pony, as we’ll see).

Exchange Locations

As I previously stated, ONEAR’s ordering won’t work in Exchange locations, but we can go ahead and try it anyway.  We’ll go back to a search that I used earlier (cat NEAR(n=2) duck) and switch it around:

duck ONEAR(n=2) cat

As you can see, the result ignores the ordering part of ONEAR and just returns keyword1 NEAR keyword2.  The distance parameter is honored, however.

And, as I previously mentioned, this “limited” ONEAR behavior also affects Yammer group messages, as you can see:

So, just be aware of how ONEAR functions when using it in conjunction with Exchange-based search locations.

SharePoint Locations

ONEAR works as expected in SharePoint locations. We’re going to demonstrate this by re-using our above search query and modifying it for ONEAR: penguins ONEAR(n=3) "ice sheets"

As expected, the same document that was responsive earlier showed up in these results.  Now, if I switch the keyword order, this document should no longer be responsive:

"ice sheets" ONEAR(n=3) "penguins"

That’s all I’ve got for NEAR and ONEAR.  It’s strange to count success as not finding anything, but this is the world we live in today.

Syntax Gotchas

One thing that I noticed is that our documentation has two different syntaxes listed for how NEAR and ONEAR work.

NEAR/ONEAR(n) is listed as the syntax on https://docs.microsoft.com/en-us/microsoft-365/compliance/keyword-queries-and-search-conditions?view=o365-worldwide.  This does not work reliably with Content Search or eDiscovery search, in my experience.  My experience has been that it sometimes works, but also sometimes appears to function as an OR operator (returning items matching ANY of the keywords or phrases–so, literally the opposite of what you’re hoping for).  It’s listed as a shorthand for n=# in our documentation, but I’ve not had consistent success with it.  I would recommend not using it.

NEAR/ONEAR(n=#) is listed as the syntax on https://docs.microsoft.com/en-us/sharepoint/dev/general-development/keyword-query-language-kql-syntax-reference.  This is the way that works.
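If you build queries in a script, one way to stay on the explicit n=# form is to generate the string with a small helper.  This is a minimal Python sketch; proximity_query and _quote are hypothetical names, not part of any Microsoft tooling, and the default of 8 simply mirrors the default distance mentioned earlier.

```python
def _quote(term):
    """Wrap multi-word phrases in double quotes unless they are already quoted."""
    term = term.strip()
    if " " in term and not (term.startswith('"') and term.endswith('"')):
        return f'"{term}"'
    return term

def proximity_query(left, right, distance=8, ordered=False):
    """Build a NEAR or ONEAR query string that always uses the explicit n=# form."""
    if distance < 0:
        raise ValueError("distance must be zero or greater")
    operator = "ONEAR" if ordered else "NEAR"
    return f"{_quote(left)} {operator}(n={distance}) {_quote(right)}"

print(proximity_query("penguins", "ice sheets", distance=3))
print(proximity_query("project", "construction", distance=1, ordered=True))
```

The two calls produce penguins NEAR(n=3) "ice sheets" and project ONEAR(n=1) construction, which you can paste into the keyword box of a search.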

Series Navigation: << Office 365 Security & Compliance Center eDiscovery – Part 3: Phrases and Grouping AND OR’ing (Oh, my!)

Published by Aaron Guilmette

Helping companies conquer inferior technology since 1997. I spend my time developing and implementing technology solutions so people can spend less time with technology. Specialties: Active Directory and Exchange consulting and deployment, Virtualization, Disaster Recovery, Office 365, datacenter migration/consolidation, cheese.