Latest Artificial Intelligence (AI) Research From Czech Republic Proposes 'GLAMI-1M,' A Multilingual Image-Text Classification Dataset And Benchmark – MarkTechPost

December 6, 2022 by AVA

Public datasets are among the most important building blocks of machine learning research. They let anyone train and evaluate models on personal devices or cloud services. Because they come with predefined training and test splits, they also serve as public benchmarks on which different methods can be tested and compared.
Image classification is one of the best-known problems in computer vision, and image classification models are already quite good. Even so, when an ALIGN model predecessor was trained on a proprietary WebImageText dataset for classification, it achieved state-of-the-art performance on the Fashion-Gen dataset. This suggests that image classification can be improved further with image-text models.
However, public large-scale image-text classification datasets are limited in size and language diversity (see Table 1). In this paper, the authors therefore introduce GLAMI-1M, a public multilingual image-text classification benchmark of fashion products. The dataset contains 1.1M images of fashion products, each with a description in one of 13 languages. The product descriptions are taken from e-commerce websites. The images are categorized into 191 classes (see Figure 2) with high-quality labels. The complete test set and 75% of the 1M training set images are human-labeled.
Because the data is collected from e-commerce websites, it poses several challenges, including imbalanced long-tailed class distributions, noisy labels, multimodal inputs, and multilingual texts.
There are several other fashion datasets (see Tables 2 and 3), but only one bilingual image-text dataset, Fashion-MMT, and it is ten times smaller than GLAMI-1M.
How was the data collected and cleaned?
The fashion items in the dataset were selected from the GLAMI catalog in two phases.
In addition, there is no overlap between training and test set images and texts, as checked via MD5 hashes and cosine similarity.
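The overlap check described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the article does not say which embeddings or similarity threshold were used, so the `find_overlaps` helper, the `(item_id, raw_bytes, embedding)` tuple layout, and the `threshold` value of 0.95 are all assumptions made for this example.

```python
import hashlib
import math


def md5_hex(data: bytes) -> str:
    """Return the MD5 hex digest of raw bytes (image file contents or UTF-8 text)."""
    return hashlib.md5(data).hexdigest()


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_overlaps(train, test, threshold=0.95):
    """Report test items that duplicate a training item.

    `train` and `test` are lists of (item_id, raw_bytes, embedding) tuples.
    This layout and the threshold are hypothetical choices for illustration.
    """
    train_hashes = {md5_hex(raw): item_id for item_id, raw, _ in train}
    overlaps = []
    for test_id, raw, emb in test:
        digest = md5_hex(raw)
        if digest in train_hashes:
            # Exact duplicate: byte-identical content.
            overlaps.append((train_hashes[digest], test_id, "md5"))
            continue
        for train_id, _, train_emb in train:
            # Near duplicate: embeddings point in almost the same direction.
            if cosine_similarity(emb, train_emb) >= threshold:
                overlaps.append((train_id, test_id, "cosine"))
                break
    return overlaps


train = [("a", b"red dress", [1.0, 0.0]), ("b", b"blue jeans", [0.0, 1.0])]
test = [
    ("x", b"red dress", [0.5, 0.5]),   # same bytes as "a"
    ("y", b"green hat", [0.0, 2.0]),   # same direction as "b"
    ("z", b"shoes", [1.0, -1.0]),      # no match
]
print(find_overlaps(train, test))
```

The hash comparison catches byte-identical images or texts cheaply, while the cosine comparison catches near duplicates that differ in their raw bytes. An empty result would correspond to the "no overlap" property the article describes.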
Table 4 gives some more information about the dataset.
The researchers also produced baselines for multimodal classification and text-conditional image generation on GLAMI-1M.
Let’s look at classification first.
In multimodal classification, the inputs come from different modalities. Here they are textual (title + description), visual (image), and categorical (label source). For the baseline, the authors used EmbraceNet because it can take encoded inputs from any modality and combine them into a single representation.
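To make the fusion idea concrete, here is a simplified EmbraceNet-style sketch. It reflects our understanding of the general EmbraceNet scheme (per-modality "docking" layers project each encoded input to a shared size, then each output position is filled from one randomly chosen modality) and is not the authors' baseline. The `dock` and `embrace` helpers, the toy vector values, and the selection probabilities are assumptions; in a real model the docking weights are learned and the inputs come from trained encoders.

```python
import random


def dock(features, weights, bias):
    """Docking layer: linear projection to the shared size, followed by ReLU."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(weights, bias)
    ]


def embrace(docked, probs, rng):
    """Fuse docked vectors by sampling one source modality per output index.

    `docked` maps modality name -> vector (all of the same length).
    `probs` maps modality name -> selection weight; a weight of 0 drops
    that modality, which is one way to handle a missing input.
    """
    names = list(docked)
    size = len(docked[names[0]])
    weights = [probs[name] for name in names]
    chosen = rng.choices(names, weights=weights, k=size)
    return [docked[name][i] for i, name in enumerate(chosen)]


# Toy docked vectors standing in for encoder outputs (values are made up).
docked = {
    "text": [0.1, 0.2, 0.3, 0.4],
    "image": [1.0, 2.0, 3.0, 4.0],
    "label_source": [9.0, 9.0, 9.0, 9.0],
}
rng = random.Random(0)

fused = embrace(docked, {"text": 1.0, "image": 1.0, "label_source": 1.0}, rng)
print(fused)  # each position is copied from one of the three modalities

text_only = embrace(docked, {"text": 1.0, "image": 0.0, "label_source": 0.0}, rng)
print(text_only)  # only the text vector contributes
```

The fused vector would then be passed to a classification head. Sampling a source per position, rather than concatenating, keeps the fused size fixed regardless of how many modalities are available.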
For text-conditional image generation, they trained a small Imagen-like model on a subset of the dataset.
Results for both baselines can be seen in Table 6 and Figures 5, 6, and 7.
In conclusion, GLAMI-1M is the largest publicly available multilingual image-text classification dataset. It has the potential to help accelerate research in text-conditional image generation, image-text classification, and multilingual machine translation. Moreover, it can also be helpful in the detailed listing of fashion products on e-commerce websites.
Check out the Paper and GitHub link. All credit for this research goes to the researchers on this project.
Vineet Kumar is a consulting intern at MarkTechPost. He is currently pursuing his BS at the Indian Institute of Technology (IIT), Kanpur. He is a machine learning enthusiast who is passionate about research and the latest advancements in deep learning, computer vision, and related fields.

