imSanko/Image_Caption_Generator_With_Transformers

This repository contains code for generating captions for images using a Transformer-based model. The model used is the `VisionEncoderDecoderModel` from the Hugging Face Transformers library, specifically the `nlpconnect/vit-gpt2-image-captioning` model.

Jupyter NotebookStars 12Forks 1Watchers 12Open issues 2License MIT License

Details

仓库信息

OwnerimSanko

Homepagehttps://imsanko.github.io/Image_Caption_Generator_With_Transformers/

GitHubhttps://github.com/imSanko/Image_Caption_Generator_With_Transformers

Last pushed2024-09-02

Last updated2025-12-14

Issues fetched at—

imSanko/Image_Caption_Generator_With_Transformers

Community at a glance