An application that uses IBM Watson AI services and Cloud Functions to analyze videos, extracting visual and audio insights for search and categorization.
Dark Vision is an application that processes videos to discover their content using IBM Watson AI services and IBM Cloud Functions. It extracts frames and audio from videos, analyzes them for visual elements and spoken concepts, and builds a searchable summary to help users categorize and find specific content within large video libraries.
Developers and organizations in media & entertainment or any sector with large video archives who want to implement AI-powered video search and content discovery without building the entire pipeline from scratch.
It provides a ready-to-deploy reference architecture that integrates multiple Watson services (Visual Recognition, Speech to Text, and Natural Language Understanding, or NLU) with a serverless backend, demonstrating best practices for building scalable video analysis applications on IBM Cloud.
Discover dark data in videos with IBM Watson and IBM Cloud Functions
Orchestrates multiple Watson services (Visual Recognition, Speech to Text, and NLU) into a single workflow for comprehensive video and audio analysis, as detailed in the architecture diagrams.
Built on IBM Cloud Functions with event-driven triggers, enabling automatic, scalable processing of video uploads without managing servers.
Includes a web app for uploads and browsing, plus an optional iOS client, offering flexibility for different user access points.
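The event-driven flow described above can be pictured with a small in-memory sketch. This is not Dark Vision's code and does not call the IBM Cloud Functions API. The `EventBus` class, the event names (`video.uploaded`, `frame.created`, `audio.created`), and the payload fields are all hypothetical, chosen only to show how one upload event can fan out into frame and audio analysis steps that fill a summary.

```python
from collections import defaultdict


class EventBus:
    """Hypothetical stand-in for a trigger/rule system (not a real IBM API)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event_type, handler):
        # Register a handler to run whenever event_type is emitted.
        self._handlers[event_type].append(handler)

    def emit(self, event_type, payload):
        # Deliver the payload to every handler registered for this event.
        for handler in list(self._handlers[event_type]):
            handler(payload)


def build_pipeline(bus, summary):
    def extract(video):
        # Placeholder for frame and audio extraction: one event per artifact.
        for index in range(video["frame_count"]):
            bus.emit("frame.created", {"video_id": video["id"], "frame": index})
        bus.emit("audio.created", {"video_id": video["id"]})

    def analyze_frame(frame):
        # Placeholder for visual analysis: record which frames were processed.
        entry = summary.setdefault(frame["video_id"], {"frames": [], "audio": False})
        entry["frames"].append(frame["frame"])

    def analyze_audio(audio):
        # Placeholder for speech analysis: mark the audio track as processed.
        entry = summary.setdefault(audio["video_id"], {"frames": [], "audio": False})
        entry["audio"] = True

    bus.on("video.uploaded", extract)
    bus.on("frame.created", analyze_frame)
    bus.on("audio.created", analyze_audio)


summary = {}
bus = EventBus()
build_pipeline(bus, summary)

# A single upload event drives the whole chain.
bus.emit("video.uploaded", {"id": "v1", "frame_count": 3})
print(summary)
```

Each handler only reacts to its own event type, so in this sketch a new analysis step can be added by registering one more handler without touching the upload code.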
Explicitly labeled as a technology demonstration in the README, with no official support and a redirect to IBM's commercial product for production use.
Deeply tied to IBM Cloud services (US South region only), requiring setup of multiple Watson instances, Cloudant, and Docker, which limits portability and adds deployment complexity.
Relies on Watson's asynchronous API to work around the 5-minute timeout on Cloud Functions. The README's troubleshooting section still notes frequent audio processing failures, which makes the pipeline unreliable for long videos.
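The general idea behind using an asynchronous API to stay under a function timeout is to split the work into two short-lived functions: one submits the job and returns at once, and the other runs later when the service reports completion. The sketch below only simulates that pattern in memory. `FakeAsyncService`, `submit_action`, `on_job_done`, and the audio reference are hypothetical names, and nothing here uses the real Watson SDK or reflects how Dark Vision implements it.

```python
import uuid


class FakeAsyncService:
    """Hypothetical stand-in for a long-running recognition service."""

    def __init__(self):
        self.pending = {}

    def create_job(self, audio_ref, callback):
        # Register the job and return immediately; no work happens here.
        job_id = str(uuid.uuid4())
        self.pending[job_id] = (audio_ref, callback)
        return job_id

    def complete(self, job_id, transcript):
        # Simulates the service finishing later and notifying the callback.
        audio_ref, callback = self.pending.pop(job_id)
        callback({"job_id": job_id, "audio_ref": audio_ref, "transcript": transcript})


results = {}


def on_job_done(event):
    """Second short-lived function: stores the result when notified."""
    results[event["audio_ref"]] = event["transcript"]


def submit_action(service, audio_ref):
    """First short-lived function: submits the job and exits right away."""
    return service.create_job(audio_ref, on_job_done)


service = FakeAsyncService()
job_id = submit_action(service, "video-1/audio.ogg")

# Nothing is stored yet: the submitting function has already returned.
assert "video-1/audio.ogg" not in results

# Later, the service reports completion and the callback runs.
service.complete(job_id, "hello world")
print(results)
```

Neither function waits on the long-running work, so each stays well inside a short execution limit. The trade-off is that a lost or failed completion notification leaves the result missing, so a real pipeline would need some form of retry or status check.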