Why Enterprise Search Got Stuck and How It’s Moving Forward
There’s no shortage of productivity apps in the workplace. In fact, the average knowledge worker now uses over two dozen apps every day. A presentation starts on Google Slides, gets exported as a PDF, which is then Slacked to coworkers, and finally lands in someone’s Dropbox.
Ironically, all of these apps designed to make you more efficient are just creating more places for you to search when you need a document or quick answer. How do you find the thing you need when things are literally everywhere?
Many companies have tried to build an enterprise search solution, with little success. As the latest entrant to tackle this problem, we at Searchable want to share why a problem that has existed since the invention of portable computing is so hard to solve, and how we’re overcoming those obstacles.
Parsing is the process of analyzing various forms of data to assess and take stock of what exists. Once information is parsed, you can query for what you need through a search engine, which will then retrieve it from the parsed data (typically stored in an index).
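To make the parse-then-query flow concrete, here is a minimal sketch of an inverted index, a common structure for storing parsed text. It is an illustration only, not Searchable’s implementation; the function names and sample documents are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical parsed documents (IDs and text are invented for illustration).
documents = {
    "slides.pdf": "Q3 roadmap presentation for the design team",
    "notes.txt": "Meeting notes about the Q3 budget",
}
index = build_index(documents)
print(search(index, "q3 roadmap"))
```

The index is built once, when documents are parsed, so each query only has to intersect a few small sets instead of rereading every file.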
Any web-based search engine (like Google) parses website information that’s written in a single markup language: HTML. HTML makes it straightforward for search engines to identify the various components of a webpage. It delineates text from images, calls out structural elements of a page (e.g., columns), and can even associate captions with corresponding multimedia.
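As a small illustration of why HTML is convenient to parse, the sketch below uses Python’s standard `html.parser` module to separate a page’s text from its images. The `PageParser` class and the sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collect visible text and image descriptions separately."""

    def __init__(self):
        super().__init__()
        self.text_chunks = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Tags announce what each element is, so images are easy to spot.
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append(attr_map.get("alt") or attr_map.get("src") or "")

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_chunks.append(stripped)


# Made-up page content for illustration.
page = (
    "<h1>Team update</h1>"
    "<p>Launch is on track.</p>"
    '<img src="chart.png" alt="Burndown chart">'
)
parser = PageParser()
parser.feed(page)
parser.close()
print(parser.text_chunks)
print(parser.images)
```

Because the markup itself labels each element, the parser never has to guess where the text ends and an image begins.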
Parsing HTML is comparatively simple. Sorting through enterprise data across a variety of file formats is complex. Word docs, spreadsheets, presentations, videos, and audio files all require more advanced parsing technology: the parser must first recognize which format it is dealing with, then interpret that format’s data correctly before adding it to an index.
A simple example: PDF documents can arrive in at least three different layouts and formats: 1) a single-column text document, 2) a double-column text document, and 3) a scanned version of the second document. The parser must discern the layout of each file, and whether it contains extractable text or only a page image, to decipher the content accordingly.
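One simple way to think about telling these three cases apart is a heuristic on where text lines sit on the page. The sketch below is a deliberately naive stand-in for a trained model. It assumes a hypothetical upstream extraction step that yields the left and right x-coordinates of each text line, and it treats a page with no extractable text as a scan.

```python
def classify_layout(text_boxes, page_width):
    """Guess a page layout from the horizontal extents of its text lines.

    text_boxes: list of (x_left, x_right) tuples, one per extracted text line
    (a hypothetical input format). An empty list is treated as a scanned page
    that would need OCR before its content can be read.
    """
    if not text_boxes:
        return "scanned"
    middle = page_width / 2
    # Lines that cross the page's vertical midline suggest a single column.
    crossing = sum(1 for left, right in text_boxes if left < middle < right)
    if crossing / len(text_boxes) > 0.5:
        return "single-column"
    return "double-column"


# Invented coordinates on a 600-unit-wide page.
single = [(50, 550)] * 10
double = [(50, 290), (310, 550)] * 5
print(classify_layout(single, 600))
print(classify_layout(double, 600))
print(classify_layout([], 600))
```

A rule this simple breaks down on mixed layouts, such as a full-width title above two columns, which is one reason learned models are attractive for this step.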
Consider another common workplace document: a presentation with images, captions, and text. The parser needs to recognize each distinct block of information in order to interpret how the various components of a page correspond to each other to tell a story.
Using machine learning models, we at Searchable train our parser to interpret a wide variety of document formats so that our search engine returns the most accurate and relevant document or information.
User expectations for search results
Many people don’t realize that their search expectations change in a workplace context. Enterprise users expect more precise results than they do from a web-based search engine like Google.
Say you’re googling “Best office chairs of 2021.” You’ll likely find the first page of results helpful, as long as the results are relevant to what you’re looking for. You’re just doing research and don’t have an exact idea of what you want to find.
On the other hand, when you’re searching for a specific document (e.g., a presentation shared by a colleague), you have an exact document in mind. That’s why enterprise search engines must retrieve precise file matches to be effective. If, after a few tries, the search engine doesn’t bring back the document you want, you’ll naturally revert to tab-surfing and app-switching.
Limitations of keyword search
For most people, “search” means a web-based search engine, like Google, that uses semantic search to find information related to a question or keyword. Unlike traditional keyword search, semantic search considers “searcher intent, query context, and the relationship between words” when delivering search page results.
For example, a keyword search of “blue marble” will only return results related to blue-colored marbles. A semantic search will also return results related to planet Earth because the term “blue marble” is a colloquial phrase for it. A modern search engine considers both semantic meanings and dictionary definitions.
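The “blue marble” contrast can be sketched with a toy comparison. Keyword search below is an exact phrase match. Semantic search is approximated by cosine similarity over small hand-written vectors, which stand in for the learned embeddings a real system would use. All documents, vectors, and the threshold are invented for illustration.

```python
import math


def keyword_search(query, documents):
    """Return IDs of documents whose text contains the exact query phrase."""
    q = query.lower()
    return [doc_id for doc_id, text in documents.items() if q in text.lower()]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_search(query_vec, doc_vecs, threshold=0.5):
    """Return IDs of documents whose vector is close enough to the query's."""
    scored = [(cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    return [doc_id for score, doc_id in sorted(scored, reverse=True) if score >= threshold]


documents = {
    "toys": "A bag of blue marble toys for kids",
    "earth": "Photograph of planet Earth taken from space",
    "chairs": "Best office chairs for long workdays",
}
# Hand-made vectors with dimensions (glass marbles, planet Earth, furniture).
# A real engine would learn these from data; these values are assumptions.
doc_vecs = {
    "toys": (1.0, 0.1, 0.0),
    "earth": (0.1, 1.0, 0.0),
    "chairs": (0.0, 0.0, 1.0),
}
blue_marble_vec = (0.7, 0.7, 0.0)  # the phrase carries both meanings

print(keyword_search("blue marble", documents))
print(semantic_search(blue_marble_vec, doc_vecs))
```

The keyword match finds only the document that literally contains the phrase, while the vector comparison also surfaces the photograph of Earth, which never mentions “blue marble” at all.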
Enterprise search still relies primarily on keyword retrieval. As a result, it often falls short of the expectations of users who have been primed by web search engines to anticipate more meaningful and connotative results.
Here’s another example: Across industries, we all use shorthand or acronyms to communicate ideas quickly. A designer might use “illos” to mean “illustrations.” A keyword search for “illos” may not return any results, but a capable semantic search engine would retrieve files associated with “illustrations.”
Because words take on different meanings across different domains and functions within an organization, an effective search engine must incorporate knowledge that’s personalized to individual users and their roles.
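One lightweight way to picture role-aware search is query expansion from a per-role glossary. The sketch below is an assumption-laden illustration, not a description of our system: the `ROLE_GLOSSARY` table, the role names, and the `expand_query` helper are all hypothetical, and a production engine would more likely learn these associations than hard-code them.

```python
# Hypothetical glossaries: the same shorthand can mean different things
# depending on who is searching.
ROLE_GLOSSARY = {
    "designer": {"illo": "illustration", "illos": "illustrations"},
    "engineer": {"pr": "pull request"},
    "marketer": {"pr": "press release"},
}


def expand_query(query, role):
    """Append role-specific expansions after any shorthand in the query."""
    glossary = ROLE_GLOSSARY.get(role, {})
    expanded = []
    for word in query.lower().split():
        expanded.append(word)
        if word in glossary:
            expanded.append(glossary[word])
    return " ".join(expanded)


print(expand_query("new illos", "designer"))
print(expand_query("PR review", "engineer"))
print(expand_query("PR review", "marketer"))
```

The original shorthand stays in the expanded query, so documents that use the abbreviation itself still match.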
We’re training our search engine in semantic search so that it can retrieve the right documents, and even answers or insights within documents, from both keywords and natural language queries.
Sensitivity around enterprise users’ data
Web-based search engines accumulate data from billions of people searching every day. This massive dataset helps train search engines, enabling companies like Google and others to improve their results over time.
Meanwhile, enterprise search engines can only analyze data from individual users. Because of the sensitivity around corporate information, your search data can’t be used to inform other users’ search results across different organizations. The limited use of your search data makes it difficult for enterprise search engines to optimize results.
Our machine learning models analyze how each individual searches, and based on repeat usage, we’ll be able to customize your experience and personalize your results. For example, you might often search for files you created recently, while another user might mostly dig around for documents that have been archived.
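The recent-versus-archived example can be sketched as a simple re-ranking step. This is a hand-written heuristic standing in for a learned model. The `Hit` type, the 30-day cutoff, and the boost weight are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    doc_id: str
    relevance: float  # base score from the search engine, between 0 and 1
    age_days: int     # days since the document was created


def recency_preference(clicked_ages, recent_cutoff_days=30):
    """Fraction of a user's past clicks that landed on recent documents."""
    if not clicked_ages:
        return 0.5  # no history yet: stay neutral
    recent = sum(1 for age in clicked_ages if age <= recent_cutoff_days)
    return recent / len(clicked_ages)


def personalize(hits, clicked_ages, recent_cutoff_days=30, weight=0.3):
    """Re-rank hits, boosting the age range this user tends to open."""
    pref = recency_preference(clicked_ages, recent_cutoff_days)

    def score(hit):
        is_recent = hit.age_days <= recent_cutoff_days
        boost = pref if is_recent else 1 - pref
        return hit.relevance + weight * boost

    return sorted(hits, key=score, reverse=True)


hits = [Hit("new_plan", 0.8, 3), Hit("old_plan", 0.8, 400)]

# A user who mostly opens fresh files versus one who digs through archives.
print([h.doc_id for h in personalize(hits, clicked_ages=[1, 2, 5, 10])])
print([h.doc_id for h in personalize(hits, clicked_ages=[300, 500, 700])])
```

Only the individual user’s own click history feeds the boost, so nothing has to be shared across users or organizations.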
The future of search
Our team is working toward a solution that personalizes the search for critical information. People already use more than a handful of productivity apps for both work and personal projects. If we want our information to be readily accessible, without tab-surfing, clicking around, or app-switching, we’ll need a system that can properly analyze and interpret this data for fast retrieval. That’s what we’re building at Searchable.ai.
Sign up for our beta here.