Built with BigQuery: How to supercharge your product data with Google Cloud and Harmonya
Co-Founder & CTO at Harmonya
Dr. Ali Arsanjani
Director, AI/ML Partner Engineering, Head of AI Center of Excellence, Google Cloud
“CPG manufacturers and retailers are dependent on product data to understand their markets, inspire innovation, and serve customers, but this is a challenge with the common data sources across the industry,” says Cem Kent, CEO of Harmonya. “Data sets are siloed, products are categorized differently across sources, and the descriptive attributes and characteristics about products are not evolving to reflect industry or consumer perspectives. That’s where Harmonya comes in.”
Harmonya is an all-in-one, AI-powered, product data enrichment, categorization, and insights platform. The company enriches its customers’ product data with deeper attributes and characteristics to power more impactful analytics and decision-making. Harmonya is committed to empowering its customers with greater control over their product analysis and categorization while maintaining a fresh, consistent view of the categories in which they operate. With Harmonya, customers can unlock a wide range of use cases, including:
e-commerce content, search and recommendation applications
Harmonya’s proprietary technology enriches product data by ingesting information from millions of online product listings and tags products with unique concepts informed by titles, descriptions, structured attributes, consumer reviews, and more. This harmonized data asset empowers brand and retail teams that use product data to unlock new opportunities for their business through a better understanding of what matters most to consumers. We’ll discuss several use cases of this enrichment in detail later in this article.
On top of this enrichment, Harmonya builds robust analytical tools to help uncover insights about the consumer and marketing drivers of in-market performance, improve assortment and merchandising, guide product innovation, engage target audiences more effectively, and categorize products. Fortune 500 companies and other industry-leading CPG manufacturers and retailers rely on Harmonya to enrich their product data and help them compete in a fast-changing marketplace.
Harmonya builds and maintains data pipelines that process massive amounts of data, training and serving machine learning models on top of the BigQuery data warehouse used throughout the organization. BigQuery’s integration with other Google Cloud components and its pay-per-use model enable near-limitless scalability for data processing, which allows Harmonya to focus on bringing value to its customers. The diagram below illustrates the data access and deployment model spanning Harmonya’s internal environment and the customer-facing multi-tenant environment.
The above diagram shows that Harmonya’s stack is split into two separate environments. The first is an internal environment (left side, yellow background) independent of Harmonya’s customers and their data. There, the Harmonya Product Language is created, starting (from left to right) with scheduling data acquisition tasks, querying the current state of the normalized product data vs. the scrape-state DB and deciding which new scrape tasks should be performed.
Then, Cloud Functions are triggered to gather the relevant data from the web and store the raw results in Cloud Storage. From there, the Harmonya Graph is created: products are clustered into a consistent view, and relations between products are discovered. Following that process, a set of NLP models extracts meaningful concepts related to the products, forming a detailed taxonomy.
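The scheduling step at the start of this pipeline, where the current product data is compared against the scrape state to decide which new scrape tasks to run, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ScrapeTask` type, the `plan_scrape_tasks` function, the seven-day freshness window, and the in-memory dictionaries standing in for the normalized product data and the scrape-state DB. It is not Harmonya's actual implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List


@dataclass(frozen=True)
class ScrapeTask:
    """Hypothetical unit of work handed to a scraping function."""
    product_id: str
    reason: str


def plan_scrape_tasks(
    known_products: List[str],
    last_scraped: Dict[str, datetime],
    now: datetime,
    max_age: timedelta = timedelta(days=7),  # assumed freshness window
) -> List[ScrapeTask]:
    """Compare known products with the scrape state and list what to fetch."""
    tasks: List[ScrapeTask] = []
    for product_id in known_products:
        scraped_at = last_scraped.get(product_id)
        if scraped_at is None:
            # No record in the scrape state: the product was never fetched.
            tasks.append(ScrapeTask(product_id, "never_scraped"))
        elif now - scraped_at > max_age:
            # The last fetch is older than the freshness window.
            tasks.append(ScrapeTask(product_id, "stale"))
    return tasks


if __name__ == "__main__":
    state = {"a": datetime(2023, 1, 9), "b": datetime(2023, 1, 1)}
    for task in plan_scrape_tasks(["a", "b", "c"], state, datetime(2023, 1, 10)):
        print(task)
```

In a deployment like the one described, each resulting task would presumably be dispatched to a Cloud Function; that dispatch is omitted here.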
The second environment (right side, red background) is a multi-tenant environment where each customer's resources are completely separated, ensuring nothing is shared between any two Harmonya customers.
The processing starts with a customer sharing raw point-of-sale data with Harmonya. This data is processed using BigQuery in a streamlined and scalable way and merged with a snapshot of the Harmonya Language, relying on BigQuery’s capability to join data between separate projects. The merged dataset is then processed in Harmonya’s data pipelines, which run ML processing to generate customer-specific insights. These insights are stored in Cloud SQL for real-time serving in Harmonya’s SaaS-based application, which runs on Node.js and is accessed by customers online at http://app.harmonya.com.
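The cross-project join mentioned above relies on BigQuery accepting fully qualified table names of the form `project.dataset.table` inside backticks. The sketch below builds such a query as a string. The project, dataset, table, and column names (`upc`, `concepts`) are hypothetical placeholders, and the function `build_enrichment_query` is an illustration rather than Harmonya's pipeline code.

```python
def build_enrichment_query(
    customer_table: str,
    language_table: str,
    join_key: str = "upc",  # assumed shared product identifier column
) -> str:
    """Build a BigQuery SQL join across two tables in different projects.

    Both table names must be fully qualified as project.dataset.table.
    """
    for name in (customer_table, language_table):
        if name.count(".") != 2 or "`" in name:
            raise ValueError(f"expected project.dataset.table, got {name!r}")
    if not join_key.isidentifier():
        raise ValueError(f"invalid join column: {join_key!r}")
    return (
        "SELECT sales.*, lang.concepts\n"
        f"FROM `{customer_table}` AS sales\n"
        f"LEFT JOIN `{language_table}` AS lang\n"
        f"  ON sales.{join_key} = lang.{join_key}"
    )


if __name__ == "__main__":
    # Hypothetical tenant project joined with a hypothetical internal project.
    print(build_enrichment_query(
        "tenant-a.sales.pos_raw",
        "harmonya-internal.language.snapshot",
    ))
```

The resulting string could be passed to the `query` method of a `google.cloud.bigquery.Client`, provided the credentials in use have read access to both projects. A `LEFT JOIN` is used so that sales rows without a match in the snapshot are kept; whether that is the desired behavior depends on the pipeline.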
BigQuery is an essential tool for Harmonya when working with product data for several reasons:
Scalability: BigQuery is a cloud-based data warehouse that can scale automatically to handle large and complex data sets. This makes it an ideal solution for Harmonya, which needs to manage growing amounts of data without the need for expensive infrastructure investments.
Cost-effective: BigQuery operates on a pay-as-you-go model, which means Harmonya only pays for the resources it uses. This makes it a cost-effective solution for startups with limited budgets.
Speed: BigQuery’s high-speed processing of large data sets enables Harmonya to analyze data and make decisions in real time. This provides a competitive advantage to customers that need to react quickly to market changes.
Accessibility: BigQuery is accessible through a web-based interface, as well as through a range of programming languages, including SQL and Python. This means that Harmonya’s team members, who have different levels of technical expertise, can all use the tool to analyze and visualize their data.
Integration: BigQuery can integrate with a range of other tools, including data visualization and business intelligence tools, as well as with other Google services. This makes it a versatile tool for Harmonya, which needs to work with data from multiple sources.
Simple data ingestion: BigQuery can ingest data from a variety of sources, including Cloud Storage, Cloud Pub/Sub, Cloud SQL, and more. Harmonya uses these integrations to seamlessly move data from their existing data sources into BigQuery.
On top of that, BigQuery’s flexible schema allows it to store various data types and query them in a dynamic fashion. Harmonya stores a mixture of structured and semi-structured JSON data within the same tables in BigQuery, simplifying data ingestion and allowing for a wide variety of use cases with less data duplication.
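To make the idea of mixing structured and semi-structured data in one table concrete, here is a small Python sketch. It models rows that share fixed columns (`upc`, `title`) plus one JSON payload column (`attributes`) whose shape varies by product. The column names and helper functions are assumptions for illustration only; in BigQuery itself, the analogous lookup on a JSON column would typically use a function such as `JSON_VALUE`.

```python
import json
from typing import Any, Dict, Optional


def to_row(upc: str, title: str, attributes: Dict[str, Any]) -> Dict[str, str]:
    """Build a row with fixed columns plus one free-form JSON column."""
    return {
        "upc": upc,
        "title": title,
        # The payload is serialized, so every row has the same column set.
        "attributes": json.dumps(attributes, sort_keys=True),
    }


def attribute_value(row: Dict[str, str], key: str) -> Optional[Any]:
    """Read one top-level key from the JSON column, or None if absent."""
    return json.loads(row["attributes"]).get(key)


if __name__ == "__main__":
    rows = [
        to_row("001", "Oat Milk", {"vegan": True, "size_ml": 946}),
        to_row("002", "Shampoo", {"scent": "citrus"}),
    ]
    for row in rows:
        # Products with different attribute shapes live side by side.
        print(row["upc"], attribute_value(row, "size_ml"))
```

Because the varying attributes stay inside a single column, new attribute types can be added without changing the table's column layout.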
Creating meaningful selling stories and trends
Enriching product data unlocks a wide variety of commercial and operational use cases on the brand and retail sides of the commerce chain. A popular use for Harmonya’s enrichment is in creating more impactful and dynamic selling stories.
Manufacturers rely on retailers to sell their products, so it's crucial for manufacturers to create unique selling stories that resonate with retailers to stand out in the highly competitive marketplace. Enriching product data with unique attributes and characteristics with Harmonya can help manufacturers tell better selling stories to retailers in several ways:
Deeper understanding of performance drivers: When product data is enriched with unique attributes and characteristics, brands and retailers have a differentiated understanding of in-market dynamics. This helps them make better decisions, identify the true drivers of brand and category performance, and develop more successful strategies to drive growth.
Improved product descriptions: Manufacturers can provide more detailed and accurate product descriptions to retailers when they have a more holistic understanding of how owned and competitive portfolios resonate with consumers. This helps brands and retailers create more compelling product descriptions and marketing materials that drive sales.
Better targeting: Enriched product data can help manufacturers target specific customer segments more effectively based on the combination of first party data and enriched transactional data. By understanding the unique attributes and characteristics of a product and the demographics and behaviors of purchasers, manufacturers and retailers can tailor their outreach and marketing messages to specific customer needs and preferences with unprecedented precision.
Differentiation: Retailers carry many products from various manufacturers, and it's important for manufacturers to create a unique selling story that sets their product apart from the competition. A unique selling story can make the difference between a single or multiple facings and preferential shelf placement, especially when both the brand and the retailer understand the unique attributes that set those products apart.
Harmonya's collaborative approach to data enrichment is brought to life via its suite of applications that allow customers to explore and analyze their enhanced datasets.
Detecting trends in product data is challenging for brands and retailers because of the vast amount of information generated by multiple sources, such as sales data, customer feedback, social media, and industry reports. Extracting insights from this data requires powerful analytics tools, expertise in data analysis, and a deep understanding of the market.
This is where Harmonya comes in. Their proprietary algorithms can analyze sales data at the attribute and characteristic level of products, providing granular insights into consumer preferences and trends. Harmonya's technology can also identify emerging trends and changes in consumer behavior, allowing brands and retailers to adapt their product strategies in real time.
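As a rough illustration of what analyzing sales at the attribute level can mean, the sketch below computes each attribute's share of revenue per period, so that a rising or falling share hints at a trend. This is a simple aggregation chosen for illustration, not Harmonya's proprietary algorithm; the function name, the attribute tags, and the sample figures are all made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Sale = Tuple[str, str, float]  # (product_id, period, revenue)


def attribute_share_by_period(
    sales: Iterable[Sale],
    product_attributes: Dict[str, List[str]],
) -> Dict[str, Dict[str, float]]:
    """Return each attribute's share of total revenue for every period."""
    totals: Dict[str, float] = defaultdict(float)
    by_attr: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for product_id, period, revenue in sales:
        totals[period] += revenue
        # A product contributes its revenue to every attribute it is tagged with.
        for attr in product_attributes.get(product_id, ()):
            by_attr[attr][period] += revenue
    return {
        attr: {p: rev / totals[p] for p, rev in periods.items() if totals[p]}
        for attr, periods in by_attr.items()
    }


if __name__ == "__main__":
    sales = [
        ("p1", "2023-Q1", 100.0), ("p2", "2023-Q1", 300.0),
        ("p1", "2023-Q2", 300.0), ("p2", "2023-Q2", 100.0),
    ]
    tags = {"p1": ["plant-based"], "p2": ["high-protein"]}
    print(attribute_share_by_period(sales, tags))
```

Since a product can carry several attributes, the shares for one period can sum to more than 1; they describe how much revenue each attribute touches rather than a partition of revenue.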
By leveraging Harmonya's technology, brands and retailers can gain a competitive edge by staying ahead of the curve in product innovation and positioning. They can also optimize their product portfolios and pricing strategies, improve customer engagement, and ultimately drive revenue growth.
In addition to analyzing sales data at the attribute and characteristic level of products, Harmonya also provides an intuitive, user-friendly interface that allows brands and retailers to visualize their data and trends in a way they haven't been able to before. Their platform displays data in interactive dashboards and charts, making it simple for users to identify patterns and correlations that may be difficult to spot with traditional analysis methods.
Furthermore, Harmonya's app simplifies the process of detecting trends by automating the analysis and reporting process, eliminating the need for manual data processing and freeing up valuable time for teams to focus on other strategic initiatives. By leveraging machine learning algorithms, Harmonya's platform can quickly identify and report on trends, providing brands and retailers with timely insights that enable them to make informed decisions about product development and marketing campaigns.
Overall, Harmonya's technology and app enable brands and retailers to gain a deeper understanding of their customers' preferences and behaviors, leading to better product development, pricing strategies, and customer engagement. By providing powerful insights in an easy-to-use interface, Harmonya is helping companies stay ahead of the curve in a constantly evolving market.
According to a Fortune 50 multi-category manufacturer, “Harmonya achieved 98% accuracy of UPC coding and classification during their engagement. This has enabled us to enrich and automate core data processes around how we manage our product catalog and harmonize external data structures. We are really impressed with the accuracy and quality of their outputs, and we are accelerating the expansion of our partnership to take full advantage of Harmonya's strategic capabilities more broadly.”
Google’s data cloud provides a complete platform for building data-driven applications from simplified data ingestion, processing, and storage to powerful analytics, AI, ML, and data sharing capabilities — all integrated with the open, secure, and sustainable Google Cloud platform. With a diverse partner ecosystem, open-source tools, and APIs, Google Cloud can provide technology companies the portability and differentiators they need to serve the next generation of customers.
We thank the Google Cloud team members who co-authored the blog: Banruo Yu, Technical Account Manager, Google Cloud, and Christian Williams, Principal Architect, Google Cloud