Skip to content

SamirYMeshram/ocr-with-their-applications

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Here’s a fully detailed README.md for your project:


Ultimate Feature Launcher

Welcome to the Ultimate Feature Launcher, a Python-based graphical user interface (GUI) application designed for launching powerful OCR and PDF-related tools with ease. Built using tkinter, this application offers a clean, user-friendly interface to help you efficiently execute tasks like advanced OCR, simple OCR, PDF-to-Audiobook conversion, and PDF translation.


🚀 Features

  1. Enhanced OCR (Gemini AI):

    • Leverage a robust OCR powered by Gemini AI for advanced and accurate text extraction.
  2. Simple OCR (EasyOCR):

    • A lightweight and efficient OCR utility for quick and straightforward text recognition.
  3. PDF to Audiobook:

    • Converts PDF documents into audiobooks, making content more accessible.
  4. PDF Translation:

    • Translate PDF content into different languages for multilingual support.
  5. Dynamic UI:

    • Hover effects and responsive UI to enhance the user experience.
  6. Fullscreen Mode:

    • The app launches in fullscreen for better focus, but can easily exit to a windowed mode using the Esc key.

🛠️ Prerequisites

Ensure you have the following installed on your system:

  1. Python 3.7 or later
  2. Required Python libraries:
    • tkinter (for GUI creation)
    • subprocess (for running external scripts)
    • tkinter.messagebox (for popup notifications)

To check if these are installed, run:

python --version

If Python is installed, you can also install required libraries using pip:

pip install tk

📂 Folder Structure

Here’s the structure of the project files:

Ultimate-Feature-Launcher/
├── interface.py               # Main application script
├── Gemini_OCR_SDK.py     # Script for Enhanced OCR (ensure this file exists)
├── simpleocr.py          # Script for Simple OCR
├── de.py                 # Script for PDF to Audiobook
├── pdftrans.py           # Script for PDF Translation
├── README.md             # Project documentation

Place the required scripts (Gemini_OCR_SDK.py, simpleocr.py, de.py, pdftrans.py) in the same directory as main.py for seamless integration.


💻 Getting Started

Follow these steps to set up and run the application:

  1. Clone the Repository: Clone this repository to your local machine:

    git clone https://github.com/SamirYMeshram/ocr-with-their-applications.git
    cd ocr-with-their-applications
  2. Add the Required Scripts: Ensure the following files are in the same directory:

    • Gemini_OCR_SDK.py
    • simpleocr.py
    • de.py
    • pdftrans.py
  3. Run the Application: Launch the GUI application by running the following command:

    python main.py

🎮 How to Use

  1. Launch the application (main.py).
  2. Upon startup, the app will open in fullscreen mode.
  3. Choose one of the available features by clicking the corresponding button:
    • Enhanced OCR (Gemini AI): Launches the Gemini_OCR_SDK.py script.
    • Simple OCR (EasyOCR): Launches the simpleocr.py script.
    • PDF to Audiobook: Starts the de.py script for audiobook conversion.
    • PDF Translation: Executes the pdftrans.py script for translation.
  4. A popup will confirm the feature has started.
  5. Press the Esc key to exit fullscreen mode.

📸 Screenshots

Screenshot 2024-12-10 131842


📝 Customization and Extensions

You can extend or customize the application by:

  • Adding more buttons for additional features.
  • Modifying the color scheme and fonts in the Font and configure sections.
  • Implementing error handling for missing scripts or invalid input.

🤝 Contributing

We welcome contributions! To contribute:

  1. Fork the repository.
  2. Create a new branch:
    git checkout -b feature-name
  3. Commit your changes:
    git commit -m "Add feature"
  4. Push your changes:
    git push origin feature-name
  5. Open a Pull Request.

🐛 Known Issues and Troubleshooting

Common Issues:

  1. Scripts Not Found:

    • Ensure the external scripts (Gemini_OCR_SDK.py, simpleocr.py, etc.) exist in the same directory as main.py.
  2. Dependencies Missing:

    • Install missing dependencies using pip install.
  3. Fullscreen Not Exiting:

    • Press Esc to toggle out of fullscreen mode.

📜 License

This project is licensed under the MIT License. See the LICENSE file for details.


🧑‍💻 Author

Created by [Samir meshram]. For support, suggestions, or bug reports, feel free to reach out or open an issue.

Happy coding! 🎉

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors