Lightricks boosts search retrieval rates by 40% with vector support in Cloud SQL

June 12, 2024

David Michael Gang

Tech Lead, Lightricks

Editor’s note: Lightricks builds innovative photo and video creation apps based on the latest computer vision and AI technology, enabling content creators and brands to produce engaging, top-performing, and scalable content. When its powerful video editor, Videoleap, needed enhanced search functionality, Lightricks leveraged the pgvector extension in Cloud SQL for PostgreSQL, which improved its search capabilities and boosted retrieval rates by 40%.

At Lightricks, our mission is to bridge the gap between imagination and creation. Our video editing app, Videoleap, lets pros and beginners alike have fun cutting and combining clips with ease wherever they are.

We want to make video editing accessible to all through an intuitive editor, AI tools, and templates created from user-generated content (UGC). In particular, Videoleap’s template search function is crucial for enabling our users to efficiently explore this vast and diverse collection of video templates. To enhance search, we needed a solution that would help us transition to a more dynamic search model. When Cloud SQL for PostgreSQL announced vector search support, we knew it was the right choice for Videoleap.

Searching for a solution a cut above the rest

Prior to our exploration of vector database options, we were already leveraging Cloud SQL for PostgreSQL as our managed relational database. This allowed us to dedicate less time to database administration and focus on enhancing our applications. However, we needed additional support for Videoleap’s search functions to keep up with the growing trends in video editing. We wanted to give users greater control over their browsing experience and the ability to quickly find templates that aligned with their creative vision.

Initially, our platform’s search functionality relied on exact keyword matching based on predefined annotations, which meant users had to use specific keywords to bring up relevant results, leaving little room for error. This approach often failed to capture the broad expanse of user queries, which could contain complex phrases or terms not directly present in our annotations. Addressing these discrepancies individually would have been time-consuming, so we decided to explore the potential of vector search, which relies on vector embeddings to retrieve relevant information from databases. We quickly realized that implementing this method in Videoleap would yield more context-aware results and enhance the speed and quality of search.

We evaluated several options. First, we tried Pinecone and Vespa, but both fell short because of slower integration and added complexity. Introducing a separate vector database means changing local environments, continuous integration (CI) pipelines, and deployment processes, which significantly increases development overhead. It also gives developers a steeper learning curve than pgvector, the popular PostgreSQL extension for vector search, where the main thing to learn is a new data type. Keeping data consistent between PostgreSQL and an external vector database is complex and error-prone, and it requires careful management of ongoing data synchronization. It also becomes hard to use PostgreSQL's transactional capabilities for atomic updates across both vector and relational data, and separate systems can hinder efficient joins between vector data and other tables in PostgreSQL.

We also considered Chroma DB, but it lacked reliable hosting options and deployment availability. So when Cloud SQL for PostgreSQL rolled out vector support via pgvector, we knew it was the right choice. Not only did it perfectly align with our needs, but it also integrated with the PostgreSQL infrastructure that we already had in place. Its streamlined approach reduces development overhead and minimizes the risk of data inconsistencies, making it a more efficient and reliable solution for many use cases.
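To make the point about atomic updates concrete, here is a minimal sketch of writing a template's metadata and its embedding in one PostgreSQL transaction. It is an illustration under stated assumptions, not our production code: the table names `templates` and `template_embeddings`, their columns, and the helper names are hypothetical, and `conn` is assumed to be a DB-API 2.0 connection from a psycopg-style driver that uses `%s` placeholders.

```python
from typing import Sequence


def to_vector_literal(values: Sequence[float]) -> str:
    """Format floats as a pgvector text literal such as '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def save_template(conn, template_id: int, title: str,
                  embedding: Sequence[float]) -> None:
    """Insert relational metadata and its embedding in a single transaction.

    Table and column names are hypothetical. `conn` is assumed to be a
    DB-API 2.0 connection to PostgreSQL with the pgvector extension enabled.
    """
    cur = conn.cursor()
    try:
        cur.execute(
            "INSERT INTO templates (id, title) VALUES (%s, %s)",
            (template_id, title),
        )
        cur.execute(
            "INSERT INTO template_embeddings (template_id, embedding) "
            "VALUES (%s, %s::vector)",
            (template_id, to_vector_literal(embedding)),
        )
        conn.commit()    # both rows become visible together
    except Exception:
        conn.rollback()  # neither row is kept if either insert fails
        raise
    finally:
        cur.close()
```

With an external vector database, the second write would go to a different system, so a failure between the two writes would have to be detected and repaired by application code.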

Leveraging Cloud SQL with pgvector for enhanced functionality

Using pgvector with Cloud SQL allows us to easily join data, handle transactions, and enable semantic search. We can also tune it to our needs by choosing among indexing strategies that trade off speed against accuracy.
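As a rough sketch of what this looks like in practice, the statements below create an HNSW index and run a nearest-neighbor query that joins embeddings with ordinary relational data. The schema, the 512-dimension embedding size, and the function names are assumptions made for illustration. The `vector` type, the `hnsw` index method, the `vector_cosine_ops` operator class, the `<=>` cosine-distance operator, and the `hnsw.ef_search` setting are provided by pgvector. `cur` is assumed to be a cursor from a psycopg-style driver.

```python
from typing import Sequence

# Hypothetical schema. The index parameters shown are pgvector's defaults:
# larger m / ef_construction improve recall but slow builds and use more memory.
SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    "CREATE TABLE IF NOT EXISTS templates ("
    "id bigint PRIMARY KEY, title text, is_public boolean)",
    "CREATE TABLE IF NOT EXISTS template_embeddings ("
    "template_id bigint REFERENCES templates (id), embedding vector(512))",
    "CREATE INDEX IF NOT EXISTS template_embeddings_hnsw "
    "ON template_embeddings USING hnsw (embedding vector_cosine_ops) "
    "WITH (m = 16, ef_construction = 64)",
]

# Nearest neighbors by cosine distance, joined with relational metadata.
# Ordering by the distance expression itself lets the planner use the index.
SEARCH_SQL = (
    "SELECT t.id, t.title, e.embedding <=> %s::vector AS cosine_distance "
    "FROM template_embeddings AS e "
    "JOIN templates AS t ON t.id = e.template_id "
    "WHERE t.is_public "
    "ORDER BY e.embedding <=> %s::vector "
    "LIMIT %s"
)


def create_schema(cur) -> None:
    """Run the setup statements once, for example in a migration."""
    for statement in SETUP_STATEMENTS:
        cur.execute(statement)


def search_templates(cur, query_embedding: Sequence[float],
                     limit: int = 10, ef_search: int = 40):
    """Return the `limit` public templates closest to the query embedding."""
    literal = "[" + ",".join(str(float(v)) for v in query_embedding) + "]"
    # Higher ef_search scans more candidates: better recall, slower queries.
    # int() keeps the interpolated value safe; SET applies to the session.
    cur.execute(f"SET hnsw.ef_search = {int(ef_search)}")
    cur.execute(SEARCH_SQL, (literal, literal, limit))
    return cur.fetchall()
```

The speed and accuracy trade-off lives in three knobs here: `m` and `ef_construction` at index build time, and `hnsw.ef_search` at query time.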

The application we created for Videoleap’s search uses a microservices architecture. To enable search while also ensuring scalability and high availability, it stores UGC template metadata and various embeddings of the templates on Cloud SQL. This approach adheres to microservices principles and enhances system modularity, flexibility, and scalability.

http://storage.googleapis.com/gweb-cloudblog-publish/images/image2_2XveMFD.max-1700x1700.png

Fig 1. Lightricks architecture diagram using Cloud SQL for PostgreSQL with pgvector support

Visualizing speedier results with dynamic search capabilities

Transitioning to a semantic search model using vector embeddings revolutionized our ability to deliver relevant results for a broad spectrum of queries. The new model captures the semantic intent of a search query rather than relying solely on exact keyword matching, so we can retrieve relevant results even when the query contains variations such as misspellings, synonyms, or related concepts. For example, a search for "labrador plays with frisbee" may now return a video of a golden retriever playing with a ball, because the search matches the underlying intent rather than the specific words used.
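The difference between the two approaches can be shown with a toy example. This is a simplified sketch, not our search service: the three-dimensional vectors are hand-made stand-ins for real embeddings (which come from a model and have hundreds of dimensions), and the template titles and function names are hypothetical. Ranking uses cosine similarity; pgvector's `<=>` operator returns the corresponding cosine distance, which is one minus this value.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made vectors; the axes loosely mean (dog, toy/play, cooking).
TEMPLATES = {
    "golden retriever playing with a ball": [0.9, 0.8, 0.0],
    "puppy sleeping on a sofa": [0.9, 0.1, 0.0],
    "pasta recipe in 30 seconds": [0.0, 0.1, 0.9],
}

QUERY = "labrador plays with frisbee"
QUERY_VEC = [0.8, 0.9, 0.0]  # hypothetical embedding of QUERY


def keyword_search(query, templates):
    """Exact matching: every query word must appear in the title."""
    words = query.lower().split()
    return [title for title in templates
            if all(word in title.lower().split() for word in words)]


def semantic_search(query_vec, templates, k=2):
    """Return the k titles whose vectors are most similar to the query."""
    ranked = sorted(
        templates,
        key=lambda title: cosine_similarity(query_vec, templates[title]),
        reverse=True,
    )
    return ranked[:k]


print(keyword_search(QUERY, TEMPLATES))          # no title contains all four words
print(semantic_search(QUERY_VEC, TEMPLATES, 1))  # closest template by meaning
```

Keyword matching returns nothing because no title contains "labrador" or "frisbee", while the vector ranking puts the golden retriever template first.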

We saw a significant increase in our export rates with the new system, indicating that the change in our search features clearly provided more value for creators. With pgvector, both the number of retrievals and the template usage from retrieved results increased by 40%. pgvector's support for the Hierarchical Navigable Small World (HNSW) algorithm also enabled us to query millions of embeddings with high accuracy. Our P90 response time, the time within which 90% of requests complete, previously ranged from one to four seconds and plummeted to under 100 milliseconds. This unlocks the magic of creativity, empowering users to effortlessly discover relevant results so the editing process is as fulfilling as the initial creation.

http://storage.googleapis.com/gweb-cloudblog-publish/images/image1_7AxEfYT.max-1300x1300.png

Fig 2. Hierarchical Navigable Small World (HNSW) support enables high-accuracy querying of millions of embeddings. Response times (P90) plummeted from 1 to 4 seconds to under 100 milliseconds.
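For readers less familiar with the metric, here is a small sketch of how a P90 value can be computed from raw request latencies using the nearest-rank method. The sample latencies are made up for illustration and are not our measurements, and the function name is hypothetical.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds.
latencies_ms = [42, 38, 55, 61, 47, 95, 40, 52, 300, 44]

p90 = percentile_nearest_rank(latencies_ms, 90)
print(p90)  # 95: nine of the ten requests finished within 95 ms
```

A single slow outlier (the 300 ms request) does not move the P90, which is why the metric is a useful summary of what most users experience.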

Bringing search options into sharper focus with AI

A cornerstone of our enhanced search capabilities is the integration of a visual content-based search feature. We use neural networks to create vector embeddings of the templates, and Cloud SQL with pgvector lets us store and search those embeddings. This has revolutionized our ability to understand and match visual content with user intent, ensuring more accurate and relevant search results.

Being on the cutting edge of video editing tools means recognizing the growing trend of AI-assisted editing, where creations are often based on textual prompts. That’s why we introduced a Videoleap search feature that leverages AI prompts, enabling creators to find content that not only matches visually but also aligns with more nuanced or specific themes. An example would be differentiating between content related to the Barbie movie and generic Barbie doll model imagery.

Looking forward, our vision is to build heavily on AI-powered search, and Google Cloud enables us to do this with minimal operational overhead. We’re really excited to see what comes next.