Audio and Video Search

A story in motion

At the heart of Docxonomy is powerful search across virtually any file format, including audio and video.


A video file has two components: image frames and an audio track. When you let Docxonomy index one of your video files, we use this fact to drive our analysis. Docxonomy compares the image frames in a video, selects those that are visually distinct from one another, and applies our image analysis capabilities to these unique frames. The output of that analysis is added to your search index, so you can search for videos based on what Docxonomy has identified in their frames. Looking for the beach volleyball segments in your promotional travel videos? Just search for “beach volleyball”. It’s that simple.
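To make the idea of "visually unique frames" concrete, here is a minimal sketch in Python. It is not Docxonomy's actual method, which this page does not describe. It assumes each frame has already been reduced to a small grayscale grid, and it swaps in a simple average hash with a Hamming-distance threshold to decide whether two frames look alike. The names average_hash, select_unique_frames, and min_distance are all hypothetical.

```python
from typing import List, Sequence

# Assumption: a frame is a small grayscale grid, rows of pixel values 0-255.
Frame = Sequence[Sequence[int]]


def average_hash(frame: Frame) -> int:
    """Build a bit fingerprint: 1 where a pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def select_unique_frames(frames: List[Frame], min_distance: int = 2) -> List[int]:
    """Return indices of frames that differ enough from every frame kept so far."""
    kept_hashes: List[int] = []
    kept_indices: List[int] = []
    for i, frame in enumerate(frames):
        h = average_hash(frame)
        if all(hamming(h, k) >= min_distance for k in kept_hashes):
            kept_hashes.append(h)
            kept_indices.append(i)
    return kept_indices
```

Only the frames returned by select_unique_frames would then go on to image analysis, which keeps near-duplicate frames from being analyzed over and over.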


So what about the audio? During our video (and audio) analysis process, we take the audio track and use an artificial intelligence algorithm to transcribe what is said into text. And guess what? Docxonomy is great at analyzing text! So again, the output of our text analysis is added to the search index and associated with the video file.
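The step of adding analysis output to a search index and associating it with a file can be pictured as a small inverted index. The sketch below is illustrative only: the SearchIndex class, its add and search methods, and the sample file name are assumptions, and the transcript and image labels are typed in by hand rather than produced by any real transcription or image analysis service.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class SearchIndex:
    """Hypothetical inverted index: each term maps to the files that contain it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, file_name: str, text: str) -> None:
        """Associate every term in the text with the given file."""
        for term in tokenize(text):
            self._postings[term].add(file_name)

    def search(self, query: str) -> Set[str]:
        """Return the files that contain every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        matches = [self._postings.get(term, set()) for term in terms]
        return set.intersection(*matches)


index = SearchIndex()
# Assumed output of image analysis on the unique frames.
index.add("travel_promo.mp4", "beach volleyball sand ocean")
# Assumed output of transcribing the audio track.
index.add("travel_promo.mp4", "Book your summer getaway today")
```

Because both kinds of output are filed under the same file name, a search for either "beach volleyball" or "summer getaway" leads back to the same video.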


All of this leads to incredible video and audio search capability! Suppose your organization has a large library of videos promoting its products or services, and you’ve been tasked with tracking down every video that mentions product Y. Or say you are a media organization looking for all interviews with John Smith in which a particular topic was discussed.


Under ordinary circumstances, either of these situations would mean watching and listening to hundreds of hours of audio and video. But with Docxonomy, you have the power of search at your side: you can search by imagery in a video or by words spoken in a video or audio file. That is powerful search.


On top of that, once you’ve found your video, Docxonomy opens a special interface that displays just the segments of the video that met your search criteria. For example, if you were searching for that John Smith interview where he mentioned the word “teeth”, Docxonomy will show that segment of the video and highlight the transcribed word for you (even if what he actually said was “tooth”!). From this interface, you can select specific segments and jump around the video without having to comb through the whole thing.
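The “teeth” and “tooth” example suggests that search terms and transcribed words are reduced to a common form before they are compared. Here is one way that could work, as a rough sketch rather than a description of Docxonomy's internals. It assumes the transcript arrives as timestamped segments, and it uses a tiny hand-written table of irregular plurals plus a naive plural rule in place of real language analysis. Segment, IRREGULAR, normalize, and find_segments are hypothetical names.

```python
import re
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    start: float  # seconds from the beginning of the video
    end: float
    text: str     # transcribed speech for this stretch of audio


# Assumption: a tiny lookup table stands in for real lemmatization.
IRREGULAR = {"teeth": "tooth", "feet": "foot", "mice": "mouse"}


def normalize(word: str) -> str:
    """Reduce a word to a rough base form so related words compare equal."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]  # naive plural stripping
    return w


def find_segments(segments: List[Segment], query: str) -> List[Tuple[Segment, str]]:
    """Return each matching segment with the spoken word to highlight."""
    target = normalize(query)
    hits: List[Tuple[Segment, str]] = []
    for seg in segments:
        for word in re.findall(r"[A-Za-z']+", seg.text):
            if normalize(word) == target:
                hits.append((seg, word))
                break  # one highlight per segment is enough for this sketch
    return hits


interview = [
    Segment(0, 5, "Welcome to the show"),
    Segment(5, 12, "I chipped a tooth last year"),
    Segment(12, 20, "Brushing your teeth matters"),
]
hits = find_segments(interview, "teeth")
```

Each hit carries a start time, which is what would let an interface jump straight to that point in the video instead of playing it from the beginning.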


July 22, 2018


Artificial Intelligence, Audio, Image Analysis, Machine Learning, Natural Language Processing, Speech Analysis, Text Analysis, Video